Spring Batch is a framework for processing large volumes of data: it reads large amounts of data, applies some processing, and writes the result out in a specified form.

  Spring Batch is mainly composed of the following parts:

  • JobRepository        the container in which Jobs (and their execution metadata) are registered
  • JobLauncher          the interface used to launch a Job
  • Job                  the actual task to be executed, made up of one or more Steps
  • Step                 a step, which contains an ItemReader, an ItemProcessor and an ItemWriter
  • ItemReader           the interface for reading data
  • ItemProcessor        the interface for processing data
  • ItemWriter           the interface for writing data out

The main components listed above only need to be registered as Spring beans. To enable batch-processing support you also have to annotate a configuration class with @EnableBatchProcessing. Spring Batch ships with a large number of ItemReader and ItemWriter implementations for reading different kinds of data sources; data processing and validation are done through implementations of the ItemProcessor interface.
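
  As a minimal sketch of that wiring (assuming the Spring Batch 3.x/4.x line, where @EnableBatchProcessing also registers a JobRepository, JobLauncher, JobBuilderFactory and StepBuilderFactory automatically; the class and bean names here are invented for illustration), a single-tasklet job can be declared with nothing more than two @Bean methods:

import org.springframework.batch.core.Job;
import org.springframework.batch.core.Step;
import org.springframework.batch.core.configuration.annotation.EnableBatchProcessing;
import org.springframework.batch.core.configuration.annotation.JobBuilderFactory;
import org.springframework.batch.core.configuration.annotation.StepBuilderFactory;
import org.springframework.batch.repeat.RepeatStatus;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
@EnableBatchProcessing // enables batch support and provides the builder factories
public class MinimalBatchConfig {

    @Bean
    public Step helloStep(StepBuilderFactory steps) {
        // A single tasklet step that does nothing and finishes immediately
        return steps.get("helloStep")
                .tasklet((contribution, chunkContext) -> RepeatStatus.FINISHED)
                .build();
    }

    @Bean
    public Job helloJob(JobBuilderFactory jobs, Step helloStep) {
        // A Job is just another Spring bean, built from one or more Steps
        return jobs.get("helloJob").start(helloStep).build();
    }
}

  The full CSV-import example below follows this same pattern, only with a chunk-oriented step (reader, processor, writer) instead of a tasklet.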

Spring Boot support

  Spring Boot's auto-configuration support for Spring Batch lives under the org.springframework.boot.autoconfigure.batch package.

  Spring Boot automatically initializes the database tables that Spring Batch uses to store batch execution records.
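
  (For reference, these are the standard Spring Batch metadata tables, such as BATCH_JOB_INSTANCE, BATCH_JOB_EXECUTION and BATCH_STEP_EXECUTION, along with their related context tables and sequences.)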

  Spring Batch also pulls in the HSQLDB driver automatically; keep it or remove it depending on your needs.

Below is an example of using Spring Batch with Spring Boot:

  1. Entity class

public class Person {

    @Size(max = 4, min = 2) // JSR-303 constraint: the name must be 2 to 4 characters long
    private String name;

    private int age;

    private String nation;

    private String address;

    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }

    public int getAge() {
        return age;
    }

    public void setAge(int age) {
        this.age = age;
    }

    public String getNation() {
        return nation;
    }

    public void setNation(String nation) {
        this.nation = nation;
    }

    public String getAddress() {
        return address;
    }

    public void setAddress(String address) {
        this.address = address;
    }
}

  2. Validator

public class CsvBeanValidator<T> implements Validator<T>, InitializingBean {

    private javax.validation.Validator validator;

    @Override
    public void afterPropertiesSet() throws Exception {
        // Initialize the JSR-303 Validator that will be used to validate the data
        ValidatorFactory validatorFactory = Validation.buildDefaultValidatorFactory();
        validator = validatorFactory.usingContext().getValidator();
    }

    @Override
    public void validate(T value) throws ValidationException {
        // Validate the item with the JSR-303 Validator's validate method
        Set<ConstraintViolation<T>> constraintViolations = validator.validate(value);
        if (constraintViolations.size() > 0) {
            StringBuilder message = new StringBuilder();
            for (ConstraintViolation<T> constraintViolation : constraintViolations) {
                message.append(constraintViolation.getMessage() + "\n");
            }
            throw new ValidationException(message.toString());
        }
    }
}

  3. ItemProcessor  

public class CsvItemProcessor extends ValidatingItemProcessor<Person> {

    @Override
    public Person process(Person item) throws ValidationException {
        super.process(item); // super.process(item) must be called, otherwise the custom validator is never invoked
        // Simple transformation: if the nation is "汉族" (Han), store the code 01, otherwise 02
        if (item.getNation().equals("汉族")) {
            item.setNation("01");
        } else {
            item.setNation("02");
        }
        return item;
    }
}

  4. Job listener (the listener implements the JobExecutionListener interface and overrides its beforeJob and afterJob methods)

public class CsvJobListener implements JobExecutionListener {

    long startTime;
    long endTime;

    @Override
    public void beforeJob(JobExecution jobExecution) {
        startTime = System.currentTimeMillis();
        System.out.println("任务处理开始");
    }

    @Override
    public void afterJob(JobExecution jobExecution) {
        endTime = System.currentTimeMillis();
        System.out.println("任务处理结束");
        System.out.println("耗时:" + (endTime - startTime) + "ms");
    }
}

  5. Configuration

@Configuration
@EnableBatchProcessing
public class CsvBatchConfig {

    @Bean
    public ItemReader<Person> reader() throws Exception {
        FlatFileItemReader<Person> reader = new FlatFileItemReader<Person>(); // read the file with a FlatFileItemReader
        reader.setResource(new ClassPathResource("people.csv")); // point the reader at the CSV file via setResource
        reader.setLineMapper(new DefaultLineMapper<Person>() {{ // map the data in the CSV file onto the domain class
            setLineTokenizer(new DelimitedLineTokenizer() {{
                setNames(new String[] { "name", "age", "nation", "address" });
            }});
            setFieldSetMapper(new BeanWrapperFieldSetMapper<Person>() {{
                setTargetType(Person.class);
            }});
        }});
        return reader;
    }

    @Bean
    public ItemProcessor<Person, Person> processor() {
        CsvItemProcessor processor = new CsvItemProcessor(); // use the custom ItemProcessor implementation
        processor.setValidator(csvBeanValidator()); // attach the validator to the processor
        return processor;
    }

    @Bean
    public ItemWriter<Person> writer(DataSource dataSource) { // Spring injects beans already in the container as parameters; Spring Boot has already defined a DataSource
        JdbcBatchItemWriter<Person> writer = new JdbcBatchItemWriter<Person>(); // write to the database with the JDBC-based JdbcBatchItemWriter
        writer.setItemSqlParameterSourceProvider(new BeanPropertyItemSqlParameterSourceProvider<Person>());
        String sql = "insert into person " + "(id,name,age,nation,address) "
                + "values(hibernate_sequence.nextval, :name, :age, :nation, :address)";
        writer.setSql(sql); // the SQL statement the batch writer executes
        writer.setDataSource(dataSource);
        return writer;
    }

    @Bean
    public JobRepository jobRepository(DataSource dataSource, PlatformTransactionManager transactionManager)
            throws Exception {
        JobRepositoryFactoryBean jobRepositoryFactoryBean = new JobRepositoryFactoryBean();
        jobRepositoryFactoryBean.setDataSource(dataSource);
        jobRepositoryFactoryBean.setTransactionManager(transactionManager);
        jobRepositoryFactoryBean.setDatabaseType("oracle");
        return jobRepositoryFactoryBean.getObject();
    }

    @Bean
    public SimpleJobLauncher jobLauncher(DataSource dataSource, PlatformTransactionManager transactionManager)
            throws Exception {
        SimpleJobLauncher jobLauncher = new SimpleJobLauncher();
        jobLauncher.setJobRepository(jobRepository(dataSource, transactionManager));
        return jobLauncher;
    }

    @Bean
    public Job importJob(JobBuilderFactory jobs, Step s1) {
        return jobs.get("importJob")
                .incrementer(new RunIdIncrementer())
                .flow(s1) // the Step to execute
                .end()
                .listener(csvJobListener()) // bind the listener
                .build();
    }

    @Bean
    public Step step1(StepBuilderFactory stepBuilderFactory, ItemReader<Person> reader, ItemWriter<Person> writer,
            ItemProcessor<Person, Person> processor) {
        return stepBuilderFactory
                .get("step1")
                .<Person, Person>chunk(65000) // commit every 65000 records
                .reader(reader) // bind the reader to the step
                .processor(processor) // bind the processor to the step
                .writer(writer) // bind the writer to the step
                .build();
    }

    @Bean
    public CsvJobListener csvJobListener() {
        return new CsvJobListener();
    }

    @Bean
    public Validator<Person> csvBeanValidator() {
        return new CsvBeanValidator<Person>();
    }
}
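
  Because spring.batch.job.enabled is set to true in application.properties below (which is also its default), Spring Boot launches importJob automatically when the application starts. Given the column order configured in the reader, each line of people.csv is expected to hold four comma-separated values in the order name, age, nation, address; a line such as 张三,30,汉族,北京 (hypothetical sample values) would be mapped onto a Person, validated, converted by CsvItemProcessor to nation code 01, and finally inserted by the JdbcBatchItemWriter.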

  6. application.properties

spring.datasource.driverClassName=oracle.jdbc.OracleDriver
spring.datasource.url=jdbc\:oracle\:thin\:@localhost\:1521\:xe
spring.datasource.username=boot
spring.datasource.password=boot
spring.batch.job.enabled=true
logging.level.org.springframework.web=DEBUG

The example above triggers the batch job automatically. To trigger it manually instead, comment out the @Configuration annotation on CsvBatchConfig so that this configuration class no longer takes effect, and create a new TriggerBatchConfig configuration class whose content is identical to CsvBatchConfig except for the definition of the ItemReader bean; in addition, set spring.batch.job.enabled=false in application.properties so the job is no longer started automatically.

@Configuration
@EnableBatchProcessing
public class TriggerBatchConfig {

    @Bean
    @StepScope // the reader is created only when the step runs, so job parameters are available for injection
    public FlatFileItemReader<Person> reader(@Value("#{jobParameters['input.file.name']}") String pathToFile) throws Exception {
        FlatFileItemReader<Person> reader = new FlatFileItemReader<Person>();
        reader.setResource(new ClassPathResource(pathToFile)); // the CSV path now comes from the input.file.name job parameter
        reader.setLineMapper(new DefaultLineMapper<Person>() {{
            setLineTokenizer(new DelimitedLineTokenizer() {{
                setNames(new String[] { "name", "age", "nation", "address" });
            }});
            setFieldSetMapper(new BeanWrapperFieldSetMapper<Person>() {{
                setTargetType(Person.class);
            }});
        }});
        return reader;
    }

    @Bean
    public ItemProcessor<Person, Person> processor() {
        CsvItemProcessor processor = new CsvItemProcessor();
        processor.setValidator(csvBeanValidator());
        return processor;
    }

    @Bean
    public ItemWriter<Person> writer(DataSource dataSource) {
        JdbcBatchItemWriter<Person> writer = new JdbcBatchItemWriter<Person>();
        writer.setItemSqlParameterSourceProvider(new BeanPropertyItemSqlParameterSourceProvider<Person>());
        String sql = "insert into person " + "(id,name,age,nation,address) "
                + "values(hibernate_sequence.nextval, :name, :age, :nation, :address)";
        writer.setSql(sql);
        writer.setDataSource(dataSource);
        return writer;
    }

    @Bean
    public JobRepository jobRepository(DataSource dataSource, PlatformTransactionManager transactionManager)
            throws Exception {
        JobRepositoryFactoryBean jobRepositoryFactoryBean = new JobRepositoryFactoryBean();
        jobRepositoryFactoryBean.setDataSource(dataSource);
        jobRepositoryFactoryBean.setTransactionManager(transactionManager);
        jobRepositoryFactoryBean.setDatabaseType("oracle");
        return jobRepositoryFactoryBean.getObject();
    }

    @Bean
    public SimpleJobLauncher jobLauncher(DataSource dataSource, PlatformTransactionManager transactionManager)
            throws Exception {
        SimpleJobLauncher jobLauncher = new SimpleJobLauncher();
        jobLauncher.setJobRepository(jobRepository(dataSource, transactionManager));
        return jobLauncher;
    }

    @Bean
    public Job importJob(JobBuilderFactory jobs, Step s1) {
        return jobs.get("importJob")
                .incrementer(new RunIdIncrementer())
                .flow(s1)
                .end()
                .listener(csvJobListener())
                .build();
    }

    @Bean
    public Step step1(StepBuilderFactory stepBuilderFactory, ItemReader<Person> reader, ItemWriter<Person> writer,
            ItemProcessor<Person, Person> processor) {
        return stepBuilderFactory
                .get("step1")
                .<Person, Person>chunk(65000)
                .reader(reader)
                .processor(processor)
                .writer(writer)
                .build();
    }

    @Bean
    public CsvJobListener csvJobListener() {
        return new CsvJobListener();
    }

    @Bean
    public Validator<Person> csvBeanValidator() {
        return new CsvBeanValidator<Person>();
    }
}

  Controller

@RestController
public class DemoController {

    @Autowired
    JobLauncher jobLauncher;

    @Autowired
    Job importJob;

    public JobParameters jobParameters;

    @RequestMapping("/read")
    public String imp(String fileName) throws Exception {
        String path = fileName + ".csv";
        jobParameters = new JobParametersBuilder()
                .addLong("time", System.currentTimeMillis())
                .addString("input.file.name", path)
                .toJobParameters();
        jobLauncher.run(importJob, jobParameters);
        return "ok";
    }
}
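
  With spring.batch.job.enabled=false, a request such as /read?fileName=people builds job parameters containing input.file.name=people.csv plus a time parameter and launches importJob through the injected JobLauncher. Because the timestamp makes every parameter set unique, the same file can be imported again as a fresh job instance.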
