As developers, we have to think about performance for almost every new feature we build. But it matters even more when dealing with a large set of data: when we process a large number of items, every millisecond suddenly counts, because it can add up to seconds, minutes, or even hours.
On my current project at ekino, we needed to develop a batch job updating all the items in our database with data coming from an external web service. The external service had to compute many things and therefore took some time to respond to each call.
Since it was impossible for us to improve the performance of the web service itself, as it was developed by another team, we decided to see what was possible on our side. This is how I found out about asynchronous processors in Spring Batch.
In this article, we are going to build a small sample project with three jobs to compare their performance:
- first, a regular job with a regular item processor
- then, a job with an asynchronous processor using platform threads
- and finally, a job with an asynchronous processor using virtual threads
The code is available on GitHub, feel free to clone it to check specificities or try new things.
But first, let’s implement the common components that will be useful to the three jobs.
Common components
Reader
The reader is a custom component that returns 50 integers (0 to 49 inclusive).
@Component
public class NumberItemReader implements ItemReader<Integer> {

    private static final Integer UPPER_BOUND = 50;

    private int currentIndex = 0;

    @Override
    public Integer read() {
        return (currentIndex < UPPER_BOUND) ? currentIndex++ : null;
    }
}
It is purposefully simplified, as it is not very important for the topic of the day – the processor will be the interesting part. But if you would like to implement a custom reader that streams data, you can check the Spring documentation, especially the section titled “Making the ItemReader Restartable”, to get a more “production ready” version.
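To illustrate the restartability idea without pulling in the full Spring Batch API, here is a plain-Java sketch of my own (a Map stands in for Spring’s ExecutionContext, and all names are illustrative): the reader checkpoints its position through update so that a restarted job can resume where it left off via open.

```java
import java.util.HashMap;
import java.util.Map;

// Plain-Java sketch of the restartability idea: the reader saves its position
// in a context map (standing in for Spring Batch's ExecutionContext) so a
// restarted job can resume where it left off. Names are illustrative.
class RestartableNumberReader {
    private static final String INDEX_KEY = "reader.currentIndex";
    private static final int UPPER_BOUND = 50;
    private int currentIndex = 0;

    // Called on (re)start: restore the saved position, if any
    void open(Map<String, Object> executionContext) {
        currentIndex = (int) executionContext.getOrDefault(INDEX_KEY, 0);
    }

    // Called periodically by the framework: persist the current position
    void update(Map<String, Object> executionContext) {
        executionContext.put(INDEX_KEY, currentIndex);
    }

    Integer read() {
        return (currentIndex < UPPER_BOUND) ? currentIndex++ : null;
    }
}

public class Demo {
    public static void main(String[] args) {
        Map<String, Object> context = new HashMap<>();
        RestartableNumberReader reader = new RestartableNumberReader();
        reader.open(context);
        for (int i = 0; i < 10; i++) reader.read(); // read the first 10 items
        reader.update(context);                      // checkpoint at item 10

        // Simulate a restart: a fresh reader resumes from the checkpoint
        RestartableNumberReader restarted = new RestartableNumberReader();
        restarted.open(context);
        System.out.println(restarted.read()); // prints 10
    }
}
```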
Processor
The processor is defined as follows:
@Component
public class SyncItemProcessor implements ItemProcessor<Integer, Integer> {

    private static final Logger LOGGER = LoggerFactory.getLogger(SyncItemProcessor.class);

    @Override
    public Integer process(@NonNull Integer item) throws Exception {
        LOGGER.info("Processing item {}", item);
        Thread.sleep(1000);
        return item;
    }
}
It takes an Integer as input (coming from the reader) and also returns an Integer (for the writer explained in the next section). The process function does three things:
- first, it logs the current item: this will be useful once we add asynchronism, to check which thread is processing the data
- then, it simulates a long-running process taking 1 second with Thread.sleep. In a real-life application, this is where you would call other web services or perform other lengthy operations
- finally, it returns the item so the process can continue and pass it to the writer
Writer
The writer will simply log all the items of the processed chunk:
@Component
public class LogItemWriter implements ItemWriter<Integer> {

    private static final Logger LOGGER = LoggerFactory.getLogger(LogItemWriter.class);

    @Override
    public void write(Chunk<? extends Integer> chunk) {
        for (var item : chunk) {
            LOGGER.info("Writing item {}", item);
        }
    }
}
Synchronous processing
Synchronous configuration
Now that all our components are ready, let’s configure our job the usual way:
@Configuration
public class BatchConfiguration {

    public static final String SYNC_JOB_NAME = "syncJob";

    @Bean
    public Step syncStep(JobRepository jobRepository,
                         PlatformTransactionManager transactionManager,
                         NumberItemReader numberItemReader,
                         SyncItemProcessor syncItemProcessor,
                         LogItemWriter logItemWriter) {
        return new StepBuilder("sync-step", jobRepository)
                .<Integer, Integer>chunk(100, transactionManager)
                .reader(numberItemReader)
                .processor(syncItemProcessor)
                .writer(logItemWriter)
                .build();
    }

    @Bean(SYNC_JOB_NAME)
    public Job syncJob(JobRepository jobRepository,
                       Step syncStep) {
        return new JobBuilder(SYNC_JOB_NAME, jobRepository)
                .incrementer(new RunIdIncrementer())
                .start(syncStep)
                .build();
    }
}
As you can see, we end up with a job called syncJob consisting of a single step. This step uses the reader, processor, and writer defined in the previous section of this article.
It uses chunk-oriented processing to read and process one item at a time, then write them all as a group (or chunk) of a given size (defined as 100 in our case).
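To make the chunk mechanism concrete, here is a minimal plain-Java sketch of the chunk-oriented loop (illustrative only, not Spring Batch’s actual implementation): items are read and processed one by one, then written together once the chunk is full or the reader is exhausted.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;
import java.util.function.Supplier;

// Minimal sketch of the chunk-oriented loop (illustrative, not Spring Batch's
// actual code): items are read and processed one at a time, then written
// together once the chunk is full or the reader returns null.
public class ChunkLoop {
    static <I, O> List<List<O>> run(Supplier<I> reader, Function<I, O> processor, int chunkSize) {
        List<List<O>> writtenChunks = new ArrayList<>();
        List<O> chunk = new ArrayList<>();
        I item;
        while ((item = reader.get()) != null) { // read one item
            chunk.add(processor.apply(item));   // process it immediately
            if (chunk.size() == chunkSize) {    // chunk full: "write" it
                writtenChunks.add(chunk);
                chunk = new ArrayList<>();
            }
        }
        if (!chunk.isEmpty()) writtenChunks.add(chunk); // write the last partial chunk
        return writtenChunks;
    }

    public static void main(String[] args) {
        int[] counter = {0};
        // Reader returning 0..49 then null, mirroring NumberItemReader
        Supplier<Integer> reader = () -> counter[0] < 50 ? counter[0]++ : null;
        List<List<Integer>> chunks = run(reader, i -> i, 100);
        System.out.println(chunks.size()); // prints 1 (a single chunk, since 50 < 100)
    }
}
```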
Synchronous run
Let’s see what happens in the logs when we run this job in an integration test.

In the red box, you can see that each computation takes one second (as defined in the processor with Thread.sleep) and that the computations run synchronously, one after the other.
In the green box, we can observe that all the operations are processed on the same thread (the Test worker thread).
Then, a few lines below, we can see that the writer also uses the main thread and writes all the items in a row thanks to Spring Batch chunk process (the chunk size was defined as 100 in the previous configuration):

And since we are processing 50 items, each taking 1 second to process, the whole step takes around 50 seconds to finish as expected:

Now let’s try to define exactly the same job, but with asynchronous processing.
Asynchronous processing with platform threads
Asynchronous configuration
Compared to the synchronous version, we will need to add a new layer on top of our existing beans to configure the asynchronous mechanism.
Luckily, Spring Batch has out-of-the-box support for this. The configuration will be quite straightforward, and all the magic will happen in the configuration file.
First, let’s configure our new processor:
@Bean
public AsyncItemProcessor<Integer, Integer> asyncItemProcessor(SyncItemProcessor itemProcessor,
                                                               TaskExecutor threadPoolTaskExecutor) {
    var asyncItemProcessor = new AsyncItemProcessor<Integer, Integer>();
    asyncItemProcessor.setDelegate(itemProcessor);
    asyncItemProcessor.setTaskExecutor(threadPoolTaskExecutor);
    return asyncItemProcessor;
}

@Bean
public TaskExecutor threadPoolTaskExecutor() {
    var taskExecutor = new ThreadPoolTaskExecutor();
    taskExecutor.setCorePoolSize(5);
    taskExecutor.setMaxPoolSize(10);
    taskExecutor.setThreadNamePrefix("platform-");
    return taskExecutor;
}
Spring Batch provides the AsyncItemProcessor class that encapsulates our previous processor and gives us the ability to attach a custom TaskExecutor to it.
Our newly defined task executor is a thread pool executor configured with a core pool of 5 platform threads, growing to a maximum of 10 if necessary. The threads will be named “platform-1”, “platform-2”, and so on.
Platform threads are what we used to simply call “threads” before virtual threads were released. They run Java code directly on an underlying operating system thread.
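To see the bounded-pool behavior outside of Spring, here is a self-contained sketch using only java.util.concurrent: ten short tasks submitted to a fixed pool of 5 run on at most 5 distinct platform threads, so they complete in waves.

```java
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.stream.IntStream;

// Standalone sketch (plain java.util.concurrent, no Spring) showing why a
// 5-thread pool processes items in waves of 5: ten 100 ms tasks run on at
// most 5 distinct platform threads.
public class PlatformPoolDemo {
    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(5);
        Set<String> threadNames = ConcurrentHashMap.newKeySet();
        List<Future<Integer>> futures = IntStream.range(0, 10)
                .mapToObj(i -> pool.submit(() -> {
                    threadNames.add(Thread.currentThread().getName());
                    Thread.sleep(100); // simulated work
                    return i;
                }))
                .toList();
        for (Future<Integer> f : futures) f.get(); // wait for completion
        pool.shutdown();
        System.out.println(threadNames.size()); // at most 5 distinct threads
    }
}
```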
One specificity of the AsyncItemProcessor, as specified in the documentation, is that it returns a Future object:
public class AsyncItemProcessor<I,O> implements ItemProcessor<I,Future<O>>
Since our current writer takes an Integer as input and not a Future<Integer>, we will also have to make some changes to the writer part:
@Bean
public AsyncItemWriter<Integer> asyncWriter(LogItemWriter itemWriter) {
    var asyncItemWriter = new AsyncItemWriter<Integer>();
    asyncItemWriter.setDelegate(itemWriter);
    return asyncItemWriter;
}
The AsyncItemWriter wraps our previous LogItemWriter and handles the asynchronous output from the processor.
And that’s all we need for now. These three new beans (the AsyncItemProcessor, TaskExecutor, and AsyncItemWriter) are enough to handle the asynchronous processing and give us the ability to process multiple items in parallel.
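To build some intuition for what these wrappers do, here is a plain-Java sketch of the underlying idea (illustrative, not Spring Batch’s actual implementation): the “processor” hands the delegate off to an executor and returns a Future immediately, and the “writer” side blocks on that Future to get the result.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Function;

// Sketch of the idea behind AsyncItemProcessor / AsyncItemWriter (illustrative,
// not Spring Batch's actual code): the processor submits the delegate to an
// executor and returns a future immediately; the writer side blocks until
// each result is available.
public class AsyncWrappingDemo {
    // "Processor": wraps a synchronous function so it returns a future
    static <I, O> Function<I, CompletableFuture<O>> async(Function<I, O> delegate, Executor executor) {
        return item -> CompletableFuture.supplyAsync(() -> delegate.apply(item), executor);
    }

    public static void main(String[] args) {
        ExecutorService executor = Executors.newFixedThreadPool(5);
        Function<Integer, CompletableFuture<Integer>> processor = async(i -> i * 2, executor);

        CompletableFuture<Integer> future = processor.apply(21); // returns immediately
        System.out.println(future.join()); // prints 42 - the "writer" side blocks for the result
        executor.shutdown();
    }
}
```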
We can now use them in our step and job as we would normally do:
@Configuration
public class AsyncBatchConfiguration {

    public static final String ASYNC_JOB_NAME = "asyncJob";

    @Bean
    public AsyncItemProcessor<Integer, Integer> asyncItemProcessor(SyncItemProcessor itemProcessor,
                                                                   TaskExecutor threadPoolTaskExecutor) {...}

    @Bean
    public TaskExecutor threadPoolTaskExecutor() {...}

    @Bean
    public AsyncItemWriter<Integer> asyncWriter(LogItemWriter itemWriter) {...}

    @Bean
    public Step asyncStep(JobRepository jobRepository,
                          PlatformTransactionManager transactionManager,
                          NumberItemReader numberItemReader,
                          AsyncItemProcessor<Integer, Integer> asyncItemProcessor,
                          AsyncItemWriter<Integer> asyncWriter) {
        return new StepBuilder("async-step", jobRepository)
                .<Integer, Future<Integer>>chunk(100, transactionManager)
                .reader(numberItemReader)
                .processor(asyncItemProcessor)
                .writer(asyncWriter)
                .build();
    }

    @Bean(ASYNC_JOB_NAME)
    public Job asyncJob(JobRepository jobRepository,
                        Step asyncStep) {
        return new JobBuilder(ASYNC_JOB_NAME, jobRepository)
                .incrementer(new RunIdIncrementer())
                .start(asyncStep)
                .build();
    }
}
Asynchronous run
After running the asyncJob in a test, let’s look at the differences from the first job in the logs.

By looking at the timestamps (red box) and thread names (green box), we can clearly see that the items are processed in groups of 5 simultaneously. The first 5 items are processed in parallel at 17:58:32 on the 5 platform threads defined in the thread pool. As soon as a thread is released, it picks up a new item one second later.
The writer logs on the other hand will be exactly the same as before:

Since the thread pool is used only on our processor and not on the writer, the thread will be the main one: Test worker.
And finally, if we check the execution time, it will be 5 times faster than before:

The time saved is already very interesting in my opinion. But since Java 21 introduced virtual threads as a more lightweight and performant option, let’s see if we can use them in this type of setting. As a side note, I won’t go into the details of virtual threads, but you can check this article, for example, which explains many of the concepts.
Asynchronous processing with virtual threads
Asynchronous with virtual threads configuration
The configuration will be very similar to the previous one. We only have to redefine a processor that will use a “virtual thread” task executor instead of a regular one.
@Bean
public AsyncItemProcessor<Integer, Integer> virtualAsyncItemProcessor(SyncItemProcessor itemProcessor) {
    var asyncItemProcessor = new AsyncItemProcessor<Integer, Integer>();
    asyncItemProcessor.setDelegate(itemProcessor);
    asyncItemProcessor.setTaskExecutor(new VirtualThreadTaskExecutor("virtual-"));
    return asyncItemProcessor;
}
The only difference from the previous configuration is that we use a VirtualThreadTaskExecutor instead of the ThreadPoolTaskExecutor.
The parameter in the constructor is the thread name prefix, so our threads will be named “virtual-1”, “virtual-2”, etc.
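Outside of Spring, the effect of virtual threads on this kind of blocking workload can be sketched with plain Java 21: 50 tasks that each sleep for 1 second all run concurrently on their own virtual threads, so the total time stays close to 1 second.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.stream.IntStream;

// Standalone sketch (Java 21+, no Spring) of why the virtual-thread job
// finishes in about one second: 50 blocking tasks each get their own virtual
// thread, so all 50 sleeps overlap instead of queueing on a small pool.
public class VirtualThreadDemo {
    public static void main(String[] args) throws Exception {
        Instant start = Instant.now();
        try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
            List<Future<Integer>> futures = IntStream.range(0, 50)
                    .mapToObj(i -> executor.submit(() -> {
                        Thread.sleep(1000); // simulated 1 s processing
                        return i;
                    }))
                    .toList();
            for (Future<Integer> f : futures) f.get();
        }
        // All 50 tasks ran concurrently, so this stays close to 1000 ms
        System.out.println(Duration.between(start, Instant.now()).toMillis());
    }
}
```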
Now we can configure our job and step, using this new processor and the same async writer as before:
@Configuration
public class VirtualAsyncBatchConfiguration {

    public static final String VIRTUAL_ASYNC_JOB_NAME = "virtualAsyncJob";

    @Bean
    public AsyncItemProcessor<Integer, Integer> virtualAsyncItemProcessor(SyncItemProcessor itemProcessor) {...}

    @Bean
    public Step virtualAsyncStep(JobRepository jobRepository,
                                 PlatformTransactionManager transactionManager,
                                 NumberItemReader numberItemReader,
                                 AsyncItemProcessor<Integer, Integer> virtualAsyncItemProcessor,
                                 AsyncItemWriter<Integer> asyncWriter) {
        return new StepBuilder("virtual-async-step", jobRepository)
                .<Integer, Future<Integer>>chunk(100, transactionManager)
                .reader(numberItemReader)
                .processor(virtualAsyncItemProcessor)
                .writer(asyncWriter)
                .build();
    }

    @Bean(VIRTUAL_ASYNC_JOB_NAME)
    public Job virtualAsyncJob(JobRepository jobRepository,
                               Step virtualAsyncStep) {
        return new JobBuilder(VIRTUAL_ASYNC_JOB_NAME, jobRepository)
                .incrementer(new RunIdIncrementer())
                .start(virtualAsyncStep)
                .build();
    }
}
Asynchronous with virtual threads run
If we run the new job and check the logs for the processing part, we can see that the parallelism works as expected:

I’m only showing the first items in the picture for each job, but the complete logs show that 50 virtual threads are created, allowing all 50 items to be processed in parallel at the same time.
Since we only have 50 items to process and little overhead in the reader and writer, the total execution time is about 1 second:

Comparisons and takeaways
In this example the performance was greatly improved, but let’s not forget two important points:
- it was a very small sample of items
- the processing itself was simulated with Thread.sleep
1. Number of items
Usually, people use Spring Batch when they have a lot of data to process, so it is interesting to see how the application behaves when we increase the UPPER_BOUND constant to process more items.
I launched the three previous jobs (synchronous, asynchronous, asynchronous with virtual threads) with 50, 1 500, 10 000, and 50 000 items. In the following table, the highlighted values are estimates (waiting for the real runs would have taken too long), while the others were measured by launching the jobs on my computer.

Many parameters can affect these execution times: the machine running the job, the number of threads in the thread pool (5), the chunk size (100), and so on.
But it is still interesting to see how dramatically the performance changes depending on the chosen mode. With 50 000 items, a batch that would take almost 14 hours to run synchronously could finish in less than 9 minutes with virtual threads and very minimal changes to the code base.
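These orders of magnitude can be recovered with back-of-the-envelope arithmetic from the parameters used so far (1 second per item, chunk size 100). The numbers below are my own estimates, not measurements:

```java
// Back-of-the-envelope arithmetic for the 50,000-item comparison (my own
// estimate from the job's parameters, not measured values).
public class EstimateDemo {
    public static void main(String[] args) {
        int items = 50_000;
        int secondsPerItem = 1;
        int chunkSize = 100;

        // Synchronous: items are processed one after another
        int syncSeconds = items * secondsPerItem;               // 50,000 s
        System.out.println(syncSeconds / 3600.0 + " hours");    // ~13.9 hours

        // Virtual threads: each chunk's 100 items run in parallel, so each
        // chunk costs roughly 1 s, and the chunks run sequentially
        int chunks = items / chunkSize;                          // 500 chunks
        int virtualSeconds = chunks * secondsPerItem;            // ~500 s
        System.out.println(virtualSeconds / 60.0 + " minutes");  // ~8.3 minutes
    }
}
```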
2. Real-life processing
Now let’s review the second “non-realistic” aspect of the previous demonstration.
In a real-life application, the processor would have to handle real operations such as:
- calling web services
- getting data from files or databases
- computing some values
- …
Some of these operations might not benefit from asynchronism or virtual threads. Also, some of the tools needed for these actions might not be entirely ready to handle virtual threads.
On my current project, we had already implemented the second option in one of our batch jobs: asynchronism with platform threads. Since it was working just fine, and the switch to virtual threads only requires changing one line of code, it was a good opportunity to run some tests on a real-life application. For context, it is a batch handling around 50 000 items, and the main operation of the processing part is a web service call using Feign.
I have to say it didn’t go as smoothly as in the previous example. First, I had some issues with Feign. Some articles recommended using the latest version of our HTTP client (OkHttpClient in our case), as previous versions don’t handle virtual threads that well. Other articles recommended waiting for Java 24 and this JEP. After a few fixes, it started to work correctly on some runs, but not on all of them.
I also had to wonder whether, in such cases, high performance could itself become an issue. Since we are calling external web services, sending more simultaneous calls than they can handle might become a problem.
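If that becomes a concern, one possible mitigation (a sketch of my own, not something from the project) is to cap the number of in-flight calls with a semaphore, regardless of how many virtual threads exist:

```java
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.Semaphore;
import java.util.stream.IntStream;

// Sketch of one way to keep virtual threads from overwhelming a downstream
// service: a semaphore caps the number of concurrent calls regardless of how
// many virtual threads exist. Names and limits are illustrative.
public class ThrottledCallsDemo {
    private static final Semaphore PERMITS = new Semaphore(10); // max 10 concurrent calls

    static int callExternalService(int item) {
        PERMITS.acquireUninterruptibly();
        try {
            // Stand-in for the remote call
            try {
                Thread.sleep(50);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            return item;
        } finally {
            PERMITS.release();
        }
    }

    public static void main(String[] args) throws Exception {
        try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
            List<Future<Integer>> futures = IntStream.range(0, 100)
                    .mapToObj(i -> executor.submit(() -> callExternalService(i)))
                    .toList();
            for (Future<Integer> f : futures) f.get(); // at most 10 calls in flight at once
        }
    }
}
```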
This is why it is important to weigh all the pros and cons of your specific use case before making a decision. Beyond the technical challenges, asynchronism as a whole might simply not be a viable candidate for your functionality.
With this small demonstration project we saw that asynchronism can be a very powerful mechanism in Spring Batch, and even more powerful when combined with virtual threads.
Spring Batch provides all the tools to ease the configuration process and makes it effortless to transform a synchronous process into an asynchronous one.
Of course, it’s not a “one size fits all”. Depending on your use case and tools, it won’t always be the right solution. But in the right context it can significantly improve the performance of a job with very little effort.
Async and virtual threads in Spring Batch was originally published in ekino-france on Medium.