Spring Batch documentation about chunk-oriented step versus reality?

In the Spring Batch documentation, the section on configuring a step shows a clear picture of how reading, processing, and writing are performed:

    read
    process
    ...
    read
    process
    // until #amountOfReadsAndProcesses = commit interval
    write

The corresponding code (according to the documentation):

    List items = new ArrayList();
    for (int i = 0; i < commitInterval; i++) {
        Object item = itemReader.read();
        Object processedItem = itemProcessor.process(item);
        items.add(processedItem);
    }
    itemWriter.write(items);

However, when I debug with one breakpoint in the reader's read method and another in the processor's process method, I see the following behavior:

    read
    ...
    read
    // until #amountOfReads = commit interval
    process
    ...
    process
    // until #amountOfProcesses = commit interval
    write
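To make the difference concrete, here is a minimal sketch that records the call order under both models. It is plain Java with no Spring Batch classes; `ChunkOrderDemo`, `documentedOrder`, and `observedOrder` are hypothetical names used only for illustration, and the "processor" simply passes items through.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

class ChunkOrderDemo {

    // Model from the documentation: each item is processed right after it is read.
    static List<String> documentedOrder(List<Integer> source, int commitInterval) {
        List<String> trace = new ArrayList<>();
        Iterator<Integer> reader = source.iterator();
        while (reader.hasNext()) {
            List<Integer> outputs = new ArrayList<>();
            for (int i = 0; i < commitInterval && reader.hasNext(); i++) {
                Integer item = reader.next();
                trace.add("read " + item);
                trace.add("process " + item);
                outputs.add(item);
            }
            trace.add("write " + outputs);
        }
        return trace;
    }

    // Order seen in the debugger: the whole chunk is read first, then processed.
    static List<String> observedOrder(List<Integer> source, int commitInterval) {
        List<String> trace = new ArrayList<>();
        Iterator<Integer> reader = source.iterator();
        while (reader.hasNext()) {
            List<Integer> inputs = new ArrayList<>();
            for (int i = 0; i < commitInterval && reader.hasNext(); i++) {
                Integer item = reader.next();
                trace.add("read " + item);
                inputs.add(item);
            }
            List<Integer> outputs = new ArrayList<>();
            for (Integer item : inputs) {
                trace.add("process " + item);
                outputs.add(item);
            }
            trace.add("write " + outputs);
        }
        return trace;
    }

    public static void main(String[] args) {
        List<Integer> source = List.of(1, 2, 3);
        System.out.println(documentedOrder(source, 2));
        System.out.println(observedOrder(source, 2));
    }
}
```

With a commit interval of 1 the two traces are identical, which is why that setting hides the difference.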

So is the documentation wrong? Or am I missing some configuration that makes it behave as documented? (I can't find anything about it there.)

The problem I have is that each subsequent read now depends on state held by the processor. The reader is a composite that reads two sources in parallel; depending on the items read from one of the sources, a single read operation reads only the first source, only the second, or both. But the state that determines which sources to read is set in the processor. Currently the only solution is a commit-interval of 1, which is not ideal for performance.
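A heavily simplified sketch of why this breaks: it assumes two in-memory sources and a single boolean flag as the shared state, which is far simpler than the composite reader described above. `StaleStateDemo`, `readFromA`, and `run` are hypothetical names, and the loop mimics the read-all-then-process order seen in the debugger.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

class StaleStateDemo {

    // Shared state (hypothetical): written in the process phase, consulted in the read phase.
    static boolean readFromA;

    static List<String> run(int commitInterval) {
        Iterator<String> a = List.of("a1", "a2").iterator();
        Iterator<String> b = List.of("b1", "b2").iterator();
        readFromA = true;
        List<String> written = new ArrayList<>();
        while (a.hasNext() || b.hasNext()) {
            List<String> chunk = new ArrayList<>();
            // Read phase: every read in this chunk sees the flag left by the previous chunk.
            for (int i = 0; i < commitInterval && (a.hasNext() || b.hasNext()); i++) {
                boolean useA = (readFromA && a.hasNext()) || !b.hasNext();
                chunk.add(useA ? a.next() : b.next());
            }
            // Process phase: after an "a" item the next read should come from b, and vice versa.
            for (String item : chunk) {
                readFromA = !item.startsWith("a");
                written.add(item);
            }
        }
        return written;
    }

    public static void main(String[] args) {
        System.out.println(run(1)); // intended alternation between the sources
        System.out.println(run(2)); // the second read of each chunk uses a stale flag
    }
}
```

In this toy model `run(1)` alternates between the sources, while `run(2)` reads both `a` items before any `b` item because the flag is only updated after the whole chunk has been read.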

1 answer

Short answer: you are right, our documentation is not accurate about the chunking model. It is something that needs to be updated. There are reasons why it works the way it does (they mainly have to do with how fault tolerance is handled), but that doesn't address your problem. There are a few options for your use case:

  • Configure your job using JSR-352 configuration. The JSR-352 processing model is what our documentation describes (they took it as gospel instead of what Spring Batch really does). Since Spring Batch supports JSR-352, just changing your configuration and how you launch your jobs should give you the same results. JSR-352 has limitations that are out of scope for this discussion, but it is one option.
  • Another option is to do what Michael Pralow suggests. While I understand your concern about separation of concerns, it sounds like you are already breaking that rule, given that your processor generates output that the reader needs (or are you sharing that state in some other way?).
  • Other options. Without knowing more about your job, there may be other ways to structure it that work well (for example, moving logic into several steps) and still achieve the separation of concerns that Spring Batch tries to provide, but I would need to see more of your configuration to help there.
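As one illustration of restructuring, here is a minimal sketch in which the reader updates its own source-selection state instead of waiting for the processor. This is an assumption about what such a restructuring could look like, not necessarily what Michael Pralow proposed. It uses no Spring Batch classes: the `Reader` interface is a hypothetical stand-in that returns null when exhausted, and `AlternatingReader` and `run` are illustrative names. Restart and fault-tolerance behavior is not modelled.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

class ProcessingReaderDemo {

    // Hypothetical stand-in for a reader contract: returns null when no items remain.
    interface Reader<T> {
        T read();
    }

    // Chooses a source from its own state and updates that state on every read.
    static class AlternatingReader implements Reader<String> {
        private final Iterator<String> a;
        private final Iterator<String> b;
        private boolean readFromA = true;

        AlternatingReader(List<String> sourceA, List<String> sourceB) {
            this.a = sourceA.iterator();
            this.b = sourceB.iterator();
        }

        @Override
        public String read() {
            if (!a.hasNext() && !b.hasNext()) {
                return null;
            }
            boolean useA = (readFromA && a.hasNext()) || !b.hasNext();
            String item = useA ? a.next() : b.next();
            // The decision that previously lived in the processor is made here.
            readFromA = !item.startsWith("a");
            return item;
        }
    }

    // Reads a full chunk before "writing" it, mimicking the read-all-then-process order.
    static List<String> run(int commitInterval) {
        Reader<String> reader = new AlternatingReader(List.of("a1", "a2"), List.of("b1", "b2"));
        List<String> written = new ArrayList<>();
        boolean exhausted = false;
        while (!exhausted) {
            List<String> chunk = new ArrayList<>();
            for (int i = 0; i < commitInterval; i++) {
                String item = reader.read();
                if (item == null) {
                    exhausted = true;
                    break;
                }
                chunk.add(item);
            }
            written.addAll(chunk);
        }
        return written;
    }

    public static void main(String[] args) {
        System.out.println(run(1));
        System.out.println(run(3));
    }
}
```

Because the state changes inside `read()`, the item order in this sketch no longer depends on the commit interval. The trade-off is the one raised above: the reader takes on logic that would otherwise belong to the processor.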

Source: https://habr.com/ru/post/1205870/

