Spring Batch的事务-Part 1:基础
原文 https://blog.codecentric.de/en/2012/03/transactions-in-spring-batch-part-1-the-basics/
This is the first post in a series about transactions in Spring Batch, you find the second one here, it’s about restarting a batch, cursor based reading and listeners, and the third one here, it’s about skip and retry.
Transactions are important in almost any application, but handling transactions in batch applications is something a little more tricky. In standard online applications you usually have one transaction for one user action, and as a developer you normally just have to assure that your code picks up an existing transaction or creates a new one when there’s none (propagation typeREQUIRED). That’s it. Developers of batch applications have much more headaches with transactions. Of course you cannot have just one transaction for the whole batch, the database couldn’t cope with that, so there have to be commits somewhere in between. A failed batch then doesn’t mean you get the unchanged data back, and when you throw in features like restarting a failed batch, retrying or skipping failing items, you automatically get a complicated transaction behaviour. Spring Batch offers the functionality just mentioned, but how does it do that?
Spring Batch is a great framework, and there is a lot of documentation and some good books, but after reading a lot about Spring Batch I still wasn’t sure about everything regarding transactions, so in the end all that helped to understand everything was looking into the code and a lot of debugging. So, this is no introduction to Spring Batch, I’m gonna focus just on transactions, and I assume that you’re familiar with transactions in Spring (transaction managers, transaction attributes). And since I have to restrict myself a little bit, I will just talk about one-threaded chunk oriented processing.
Chunk oriented steps
Let’s start with a picture that will follow us throughout this and the following blog posts, only changed in little details every now and then to focus on a certain subject.
It’s already telling a lot about Spring Batch and its transactional behaviour. In chunk-oriented processing we have ItemReaders reading items, one after the other, always delivering the next one item. When there are no more items, the reader delivers null. Then we have optional ItemProcessors taking one item and delivering one item, that may be of another type. Finally we haveItemWriters taking a list of items and writing them somewhere.
The batch is separated in chunks, and each chunk is running in its own transaction. The chunk size actually is determined by aCompletionPolicy, as you can see in the illustration at (1): when the CompletionPolicy is fulfilled, Spring Batch stops reading items and starts with the processing. By default, if you use the commit-interval attribute on chunk, you get aSimpleCompletionPolicy that is completed when the number of items you specified in the attribute is read. If you want something more sophisticated you can specify your own CompletionPolicy in the attribute chunk-completion-policy.
This is all quite straight forward, if there’s a RuntimeException being thrown in one of the participating components, the transaction for the chunk is rolled back and the batch fails. Every already committed chunk of course stays in the processed state.
Business data and batch job data
As you might know already, Spring Batch brings a set of database table definitions. These tables are used to store data about the jobs and steps and the different job and step execution contexts. This persistence layer is useful for some kind of history on the one hand, and for restarting jobs on the other hand. If you’re thinking of putting these tables in a different database than your business data: don’t. The data stored there is about the state of the job and the steps, with numbers of processed items, start time, end time, a state identifier (COMPLETED, FAILED and so on) and much more. In addition there is a map for each step (the step execution context) and job (the job execution context) which can be filled by any batch programmer. Changes in this data have to be in line with the transaction running on our business data, so if we have two databases we’ll need for sure aJtaTransactionManager handling different DataSources, suffering in performance as well. So, if you have a choice, put those tables near to your business data. In the following diagram you can see where in the processing step and job data is persisted. As you can see, it doesn’t happen only inside the chunk transaction, for good reasons: we want to have step and job data persisted in the case of a failure, too.
Note that I use little numbers for indicating items that are explained in a text box. The numbers stay in following versions of the diagram while the text box may disappear due to readability. It’s always possible to look up the explanation in a previous version of the diagram.
A failed batch
Until now, the diagram just includes successful processing. Let’s take a look at the diagram including a possible failure.
If you didn’t configure skip or retry functionality (we’ll get to that in the next blog posts) and there’s an uncaughtRuntimeException somewhere in an element executed inside the chunk, the transaction is rolled back, the step is marked asFAILED and the whole job will fail. Persisting step data in a separate transaction at (5) makes sure that the failure state gets into the database.
When I say that an uncaught RuntimeException causes the rollback, then it’s not quite true for every case. We have the option to set no-rollback-exceptions:
<batch:tasklet>
<batch:chunk ... />
<batch:no-rollback-exception-classes>
<batch:include class="de.codecentric.MyRuntimeException"/>
</batch:no-rollback-exception-classes>
</batch:tasklet>
Transaction attributes
One more thing for today: if you don’t configure transaction attributes explicitly, you get the defaults. Transaction attributes are propagation type, isolation level and timeout, for example. You may specify those attributes as shown here:
<batch:tasklet>
<batch:transaction-attributes isolation="READ_COMMITTED" propagation="REQUIRES_NEW" timeout="200"/>
<batch:chunk reader="myItemReader" writer="myItemReader" commit-interval="20"/>
</batch:tasklet>
If you don’t specify them, you’ll get the propagation type REQUIRED and the isolation level DEFAULT, which means that the default of the actual database is used. Normally you don’t want to change the propagation type, but it makes sense to think about the isolation level and check the batch job: am I fine with non-repeatable reads? Am I fine with phantom reads? And: what other applications are accessing and changing the database, do they corrupt the data I’m working on in a way that causes trouble? Is there a possibility to get locks? For more information on the different isolation levels check this wikipedia article.
Conclusion
In this first article on transactions in Spring Batch I explained the basic reader-processor-writer cycle in chunk oriented steps and where the transactions come into play. We saw what happens when a step fails, how to set transaction attributes and no-rollback-exception-classes and how job and step metadata is updated.
Next on the list will be restart, retry and skip functionality: what are the preconditions? How does the transaction management work with these features? Click here for the next blog post in this series about restart, cursor based reading and listeners, andhere for the third post about skip and retry.
Spring Batch的事务-Part 1:基础的更多相关文章
- Spring Batch的事务– Part 3: 略过和重试
原文:https://blog.codecentric.de/en/2012/03/transactions-in-spring-batch-part-3-skip-and-retry/ This i ...
- Spring Batch 中文参考文档 V3.0.6 - 1 Spring Batch介绍
1 Spring Batch介绍 企业领域中许多应用系统需要采用批处理的方式在特定环境中运行业务操作任务.这种业务作业包括自动化,大量信息的复杂操作,他们不需要人工干预,并能高效运行.这些典型作业包括 ...
- spring batch(一):基础部分
spring batch(一):基础部分 博客分类: Spring java spring batch 官网: http://www.springsource.org/spring-batch 下 ...
- 【spring基础】spring声明式事务详解
一.spring声明式事务 1.1 spring的事务管理器 spring没有直接管理事务,而是将管理事务的责任委托给JTA或相应的持久性机制所提供的某个特定平台的事务实现.spring容器负责事物的 ...
- Web基础之Spring AOP与事务
Spring之AOP AOP 全程Aspect Oriented Programming,直译就是面向切面编程.和POP.OOP相似,它也是一种编程思想.OOP强调的是封装.继承.多态,也就是功能的模 ...
- 初探Spring Batch
此系列博客皆为学习Spring Batch时的一些笔记: 为什么我们需要批处理? 我们不会总是想要立即得到需要的信息,批处理允许我们在请求处理之前就一个既定的流程开始搜集信息:比如说一个银行对账单,我 ...
- Spring Batch 专题
如今微服务架构讨论的如火如荼.但在企业架构里除了大量的OLTP交易外,还存在海量的批处理交易.在诸如银行的金融机构中,每天有3-4万笔的批处理作业需要处理.针对OLTP,业界有大量的开源框架.优秀的架 ...
- 陪你解读Spring Batch(一)Spring Batch介绍
前言 整个章节由浅入深了解Spring Batch,让你掌握批处理利器.面对大批量数据毫无惧色.本章只做介绍,后面章节有代码示例.好了,接下来是我们的主角Spring Batch. 1.1 背景介绍 ...
- spring batch (一) 常见的基本的概念介绍
SpringBatch的基本概念介绍 内容来自<Spring Batch 批处理框架>,作者:刘相. 一.配置文件 在项目中使用spring batch 需要在配置文件中声明: 事务管理器 ...
随机推荐
- delphi常用函数
直接引用了 http://www.cnblogs.com/doit8791/archive/2012/05/17/2507073.html.
- dojo 十二 rest
从今年8月份开始一直在做以HTML5+CSS3+Dojo实现前端设计,以REST风格实现后台数据请求的项目研发.实践出真知,现在对研发中用到的技术和遇到的问题做一个总结. 后台服务没有采用那些主流的框 ...
- Linux守护进程详解(init.d和xinetd) [转]
一 Linux守护进程 Linux 服务器在启动时需要启动很多系统服务,它们向本地和网络用户提供了Linux的系统功能接口,直接面向应用程序和用户.提供这些服务的程序是由运行在后台 的守护进程来执行的 ...
- linux系统的 suid/guid简单介绍 linux suid guid
我们在前面曾经提到过s u i d和g u i d.这种权限位近年来成为一个棘手的问题.很多系统供应商不允许实现这一位,或者即使它被置位,也完全忽略它的存在,因为它会带来安全性风险.那么人们为何如此大 ...
- 《c程序设计语言》读书笔记-字符型0-9转为数字0-9
#include <stdio.h> #define Num 10 int atoi(char s[]); int main() { int c,i = 0; char s[Num]; i ...
- MongoDB 学习笔记(四)C# 操作MongoDB
C#驱动对mongodb的操作,目前驱动有两种:官方驱动和samus驱动,不过我个人还是喜欢后者, 因为提供了丰富的linq操作,相当方便. 官方驱动:https://github.com/mongo ...
- UVa 1626 (输出方案) Brackets sequence
正规括号序列定义为: 空序列是正规括号序列 如果S是正规括号序列,那么[S]和(S)也是正规括号序列 如果A和B都是正规括号序列,则AB也是正规括号序列 输入一个括号序列,添加尽量少的括号使之成为正规 ...
- HDU 2059 龟兔赛跑
受上一道题影响,我本来想着开一个二维数组来表示充电和不充电的状态. 可这样就有一个问题,如果没有充电,那么在下一个阶段就有剩余的电量. 这样问题貌似就不可解了,难道是因为不满足动态规划的无后效性这一条 ...
- 51nod1346 递归
我终于知道我有多么蠢了...推规律根本不带我这么推的...跟51nod那场比赛的傻逼B题一样,想都不想想就打表找规律...智障啊找规律也要按照基本法! //f[1][2]=a[1][2] f[2][1 ...
- struts2拦截器配置;拦截器栈;配置默认拦截器;拦截方法的拦截器MethodFilterInterceptor;完成登录验证
struts2.xml 内容 <?xml version="1.0" encoding="UTF-8" ?> <!DOCTYPE struts ...