Open source Java projects: Spring Batch

Reading and writing CSV files with Spring Batch and MySQL

Page 2 of 4

Wiring it together in the application context file

Thus far we have built a Product domain object, a ProductFieldSetMapper that converts a line in the CSV file into an object, and a ProductItemWriter that writes objects to the database. Now we need to configure Spring Batch to wire all of these together. Listing 4 shows the source code for the applicationContext.xml file, which defines our beans.

Listing 4. applicationContext.xml


<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xmlns:context="http://www.springframework.org/schema/context"
       xmlns:batch="http://www.springframework.org/schema/batch"
       xmlns:jdbc="http://www.springframework.org/schema/jdbc"
       xsi:schemaLocation="http://www.springframework.org/schema/beans http://www.springframework.org/schema/beans/spring-beans.xsd
                http://www.springframework.org/schema/context http://www.springframework.org/schema/context/spring-context.xsd
                http://www.springframework.org/schema/batch http://www.springframework.org/schema/batch/spring-batch.xsd
                http://www.springframework.org/schema/jdbc http://www.springframework.org/schema/jdbc/spring-jdbc.xsd">


    <context:annotation-config />

    <!-- Component scan to find all Spring components -->
    <context:component-scan base-package="com.geekcap.javaworld.springbatchexample" />


    <!-- Data source - connect to a MySQL instance running on the local machine -->
    <bean id="dataSource" class="org.apache.commons.dbcp.BasicDataSource" destroy-method="close">
        <property name="driverClassName" value="com.mysql.jdbc.Driver"/>
        <property name="url" value="jdbc:mysql://localhost/spring_batch_example"/>
        <property name="username" value="sbe"/>
        <property name="password" value="sbe"/>
    </bean>

    <bean id="transactionManager" class="org.springframework.jdbc.datasource.DataSourceTransactionManager">
        <property name="dataSource" ref="dataSource" />
    </bean>

    <bean id="jdbcTemplate" class="org.springframework.jdbc.core.JdbcTemplate">
        <property name="dataSource" ref="dataSource" />
    </bean>

    <!-- Create job-meta tables automatically -->
    <jdbc:initialize-database data-source="dataSource">
        <jdbc:script location="org/springframework/batch/core/schema-drop-mysql.sql" />
        <jdbc:script location="org/springframework/batch/core/schema-mysql.sql" />
    </jdbc:initialize-database>


    <!-- Job Repository: used to persist the state of the batch job -->
    <bean id="jobRepository" class="org.springframework.batch.core.repository.support.MapJobRepositoryFactoryBean">
        <property name="transactionManager" ref="transactionManager" />
    </bean>


    <!-- Job Launcher: creates the job and the job state before launching it -->
    <bean id="jobLauncher" class="org.springframework.batch.core.launch.support.SimpleJobLauncher">
        <property name="jobRepository" ref="jobRepository" />
    </bean>


    <!-- Reader bean for our simple CSV example -->
    <bean id="productReader" class="org.springframework.batch.item.file.FlatFileItemReader" scope="step">

        <!-- <property name="resource" value="file:./sample.csv" /> -->
        <property name="resource" value="file:#{jobParameters['inputFile']}" />


        <!-- Skip the first line of the file because this is the header that defines the fields -->
        <property name="linesToSkip" value="1" />

        <!-- Defines how we map lines to objects -->
        <property name="lineMapper">
            <bean class="org.springframework.batch.item.file.mapping.DefaultLineMapper">

                <!-- The lineTokenizer divides individual lines up into units of work -->
                <property name="lineTokenizer">
                    <bean class="org.springframework.batch.item.file.transform.DelimitedLineTokenizer">

                        <!-- Names of the CSV columns -->
                        <property name="names" value="id,name,description,quantity" />
                    </bean>
                </property>

                <!-- The fieldSetMapper maps a line in the file to a Product object -->
                <property name="fieldSetMapper">
                    <bean class="com.geekcap.javaworld.springbatchexample.simple.reader.ProductFieldSetMapper" />
                </property>
            </bean>
        </property>
    </bean>

    <bean id="productWriter" class="com.geekcap.javaworld.springbatchexample.simple.writer.ProductItemWriter" />

</beans>

Note that separating our job configuration from our application/environment configuration enables us to move a job from one environment to another without redefining the job. The following beans are defined in Listing 4:

  • dataSource: The sample application connects to MySQL, so the data source is configured to connect to a MySQL database named spring_batch_example running on the localhost (see below for setup instructions).
  • transactionManager: The Spring transaction manager is used to manage MySQL transactions.
  • jdbcTemplate: This class provides an implementation of the template design pattern for interacting with JDBC connections. It's a helper class to simplify our database integration. In a production application we would probably opt to use an ORM tool like Hibernate behind a service layer, but I want to keep the example as simple as possible.
  • jobRepository: The MapJobRepositoryFactoryBean is a Spring Batch component that manages the state of a job. Note that, as its name suggests, it keeps job metadata in an in-memory map rather than in the database; to persist job state to the MySQL tables created above you would swap in JobRepositoryFactoryBean and wire in the dataSource.
  • jobLauncher: This is the component that launches and manages the workflow of a Spring Batch job.
  • productReader: This bean performs the read operation in our job. It is declared with scope="step" so that the #{jobParameters['inputFile']} expression can be resolved via late binding when the step actually runs.
  • productWriter: This bean writes the Product instances to the database.
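Conceptually, the lineMapper pipeline configured on productReader does two things per CSV line: the DelimitedLineTokenizer splits the line into fields, and the field-set mapper turns those fields into a Product. Stripped of the Spring classes, the logic looks roughly like this (a plain-Java sketch for illustration only; the Product record here is a stand-in for the domain object built earlier):

```java
import java.util.Arrays;
import java.util.List;

public class LineMapperSketch {

    // Stand-in for the Product domain object built earlier in the article
    record Product(int id, String name, String description, int quantity) {}

    // Roughly what DelimitedLineTokenizer does: split a line on commas
    static List<String> tokenize(String line) {
        return Arrays.asList(line.split(","));
    }

    // Roughly what ProductFieldSetMapper does: map the named columns
    // (id, name, description, quantity) to a Product
    static Product mapFieldSet(List<String> fields) {
        return new Product(
                Integer.parseInt(fields.get(0)),   // id
                fields.get(1),                     // name
                fields.get(2),                     // description
                Integer.parseInt(fields.get(3)));  // quantity
    }

    public static void main(String[] args) {
        // One data line from the CSV; the header row is skipped via linesToSkip=1
        Product p = mapFieldSet(tokenize("1,Widget,A simple widget,5"));
        System.out.println(p.name() + ": " + p.quantity());
    }
}
```

The real FlatFileItemReader adds error handling, restartability, and FieldSet conveniences on top of this, but the data flow is the same: line in, tokens, object out.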

Note that the jdbc:initialize-database element points to two Spring Batch scripts that create the metadata tables the running job needs. The scripts are packaged in the Spring Batch core JAR file (which Maven imports automatically) at the specified classpath locations, and the JAR includes versions for various database vendors, including MySQL, Oracle, and SQL Server. As configured here, the tables are dropped and recreated on every run, which is fine for experimentation. In a production environment you would instead extract the SQL files and create the tables yourself, so the job metadata survives across runs.

Defining the job

Listing 5 shows the file-import-job.xml file, which defines the actual job.

Listing 5. file-import-job.xml


<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xmlns:context="http://www.springframework.org/schema/context"
       xmlns:batch="http://www.springframework.org/schema/batch"
       xsi:schemaLocation="http://www.springframework.org/schema/beans http://www.springframework.org/schema/beans/spring-beans.xsd
                http://www.springframework.org/schema/context http://www.springframework.org/schema/context/spring-context.xsd
                http://www.springframework.org/schema/batch http://www.springframework.org/schema/batch/spring-batch.xsd">


    <!-- Import our beans -->
    <import resource="classpath:/applicationContext.xml" />

    <job id="simpleFileImportJob" xmlns="http://www.springframework.org/schema/batch">
        <step id="importFileStep">
            <tasklet>
                <chunk reader="productReader" writer="productWriter" commit-interval="5" />
            </tasklet>
        </step>
    </job>

</beans>

Note that a job contains one or more steps; each step executes a tasklet; and a tasklet may be chunk-oriented, pairing a reader and a writer as shown graphically in Figure 3.
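The commit-interval="5" attribute on the chunk means the step reads items one at a time, collects five of them, and then hands the whole chunk to the writer inside a single transaction. Stripped of the framework classes, the read/process/write loop behaves roughly like this (a conceptual sketch, not Spring Batch's actual implementation):

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

public class ChunkLoopSketch {

    /**
     * Collects items from a reader into chunks of commitInterval items and
     * "writes" each chunk as a unit, mirroring a chunk-oriented step with
     * commit-interval=5. Returns the chunks that were written.
     */
    static <T> List<List<T>> processInChunks(Iterator<T> reader, int commitInterval) {
        List<List<T>> writtenChunks = new ArrayList<>();
        List<T> chunk = new ArrayList<>();
        while (reader.hasNext()) {
            chunk.add(reader.next());          // reader: one item at a time
            if (chunk.size() == commitInterval) {
                writtenChunks.add(chunk);      // writer: whole chunk, then commit
                chunk = new ArrayList<>();
            }
        }
        if (!chunk.isEmpty()) {
            writtenChunks.add(chunk);          // final partial chunk at end of input
        }
        return writtenChunks;
    }

    public static void main(String[] args) {
        List<Integer> items = List.of(1, 2, 3, 4, 5, 6, 7);
        // With commit-interval 5, seven items yield a chunk of five and a chunk of two
        System.out.println(processInChunks(items.iterator(), 5));
    }
}
```

Choosing the commit interval is a throughput/recovery trade-off: larger chunks mean fewer transactions, but a failure rolls back the whole in-flight chunk.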
