Apache Sqoop Job Command With Example

posted on Nov 20th, 2016

Apache Sqoop

Apache Sqoop is a command-line application for transferring data between relational databases and Hadoop. It supports incremental loads of a single table or a free-form SQL query, as well as saved jobs that can be run repeatedly to import the updates made to a database since the last import. Imports can also populate tables in Hive or HBase, and exports move data from Hadoop into a relational database. The name Sqoop comes from SQL + Hadoop. Sqoop became a top-level Apache project in March 2012.

Prerequisites

1) A machine running the Ubuntu 14.04 LTS operating system.

2) Apache Hadoop pre-installed (How to install Hadoop on Ubuntu 14.04)

3) MySQL Database pre-installed (How to install MySQL Database on Ubuntu 14.04)

4) Apache Sqoop pre-installed (How to install Sqoop on Ubuntu 14.04)

Apache Sqoop Job Command With Example

This post describes how to create and maintain Sqoop jobs. A Sqoop job saves an import or export command, together with its parameters, under a name so that the command can be recalled and re-executed later. Re-executing a saved job is what makes incremental imports practical: each run imports into HDFS only the rows added or updated in the RDBMS table since the previous run.

Step 1 - Change the directory to /usr/local/hadoop/sbin

$ cd /usr/local/hadoop/sbin

Step 2 - Start all Hadoop daemons.

$ start-all.sh

Step 3 - Use jps (the Java Virtual Machine Process Status Tool) to check that the Hadoop daemons are running. Note that jps reports only the JVMs for which it has access permissions.

$ jps
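
The jps check can also be done in a script. The following is a minimal sketch and not part of the original setup: the function name missing_daemons and the daemon list (a single-node Hadoop 2.x layout) are assumptions, so adjust the list to match your cluster.

```shell
# Daemons expected on a single-node Hadoop 2.x setup (an assumption; edit as needed).
REQUIRED_DAEMONS="NameNode DataNode SecondaryNameNode ResourceManager NodeManager"

# Reads `jps` output on stdin and prints each required daemon that is absent.
missing_daemons() {
    jps_output=$(cat)
    for daemon in $REQUIRED_DAEMONS; do
        # Compare against the whole second field (e.g. "4321 NameNode"),
        # so "NameNode" does not match "SecondaryNameNode".
        if ! printf '%s\n' "$jps_output" |
            awk -v d="$daemon" '$2 == d { found = 1 } END { exit !found }'; then
            printf '%s\n' "$daemon"
        fi
    done
}

# Usage: jps | missing_daemons
```

If the function prints nothing, every daemon in the list was found in the jps output.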

Step 4 - Change the directory to /usr/local/sqoop/bin

$ cd /usr/local/sqoop/bin

Create Job (--create) (see How to use Import command)

Here we create a job named myjob that imports table data from an RDBMS into HDFS. The following command creates a job that imports the employee table of the userdb database into an HDFS directory.

$ sqoop job --create myjob \
-- import \
--connect jdbc:mysql://localhost/userdb \
--username root \
--password root \
--table employee \
-m 1 \
--target-dir /user/hduser/targetfolder
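
The same command can be wrapped in a small shell function so that the values are not hard-coded. This is only a sketch: create_import_job, the SQOOP variable and the password file path are hypothetical names chosen for illustration. It uses Sqoop's --password-file option in place of --password, which keeps the password off the command line.

```shell
# SQOOP can be overridden (for example SQOOP=echo) to preview the
# arguments without actually running Sqoop.
SQOOP="${SQOOP:-sqoop}"

# create_import_job NAME JDBC_URL USER PASSWORD_FILE TABLE TARGET_DIR
create_import_job() {
    "$SQOOP" job --create "$1" \
        -- import \
        --connect "$2" \
        --username "$3" \
        --password-file "$4" \
        --table "$5" \
        -m 1 \
        --target-dir "$6"
}

# Usage (the password file path is a hypothetical example):
# create_import_job myjob jdbc:mysql://localhost/userdb root \
#     /user/hduser/.mysql.pw employee /user/hduser/targetfolder
```

Setting SQOOP=echo before calling the function prints the arguments that would be passed, which is a cheap way to check them first.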

Verify Job (--list)

The --list argument lists the saved jobs. Use the following command to verify that myjob was saved.

$ sqoop job --list
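
In a script, the list output can be searched for a job name. The sketch below assumes that sqoop job --list prints one job name per line, indented under an "Available jobs:" header. job_exists is a hypothetical helper name.

```shell
# job_exists NAME: reads `sqoop job --list` output on stdin and
# succeeds (exit status 0) if NAME appears as a whole line.
job_exists() {
    awk -v name="$1" '
        { line = $0; gsub(/^[ \t]+|[ \t]+$/, "", line) }  # trim surrounding whitespace
        line == name { found = 1 }
        END { exit !found }
    '
}

# Usage: sqoop job --list | job_exists myjob && echo "myjob is saved"
```

Matching the whole trimmed line, not a substring, keeps a job called my from matching myjob.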

Inspect Job (--show)

The --show argument displays the details of a particular saved job. The following command inspects the job called myjob.

$ sqoop job --show myjob

Execute Job (--exec)

The --exec option executes a saved job. The following command executes the saved job called myjob.

$ sqoop job --exec myjob
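
Sqoop also accepts extra arguments after a further -- when executing a saved job, and these override the saved values for that run. A minimal sketch, where exec_job and the SQOOP variable are hypothetical names:

```shell
# SQOOP can be overridden (for example SQOOP=echo) to preview the command.
SQOOP="${SQOOP:-sqoop}"

# exec_job NAME [OVERRIDE_ARGS...]
exec_job() {
    name=$1
    shift
    if [ "$#" -gt 0 ]; then
        # Arguments after "--" override the saved job's settings for this run only.
        "$SQOOP" job --exec "$name" -- "$@"
    else
        "$SQOOP" job --exec "$name"
    fi
}

# Usage: exec_job myjob --username someuser -P
```

The usage line would run myjob as a different database user and prompt for that user's password on the console.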

Verify the imported data after executing the job.

$ hdfs dfs -cat /user/hduser/targetfolder/part-m-00000
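
Saved jobs are most useful for the incremental imports mentioned at the start of this post. The sketch below creates such a job with Sqoop's --incremental append mode. It assumes the table has an ever-increasing numeric column such as id, which is not stated above. create_incremental_job, the SQOOP variable and the paths in the usage line are hypothetical.

```shell
# SQOOP can be overridden (for example SQOOP=echo) to preview the command.
SQOOP="${SQOOP:-sqoop}"

# create_incremental_job NAME JDBC_URL USER PASSWORD_FILE TABLE CHECK_COLUMN TARGET_DIR
create_incremental_job() {
    "$SQOOP" job --create "$1" \
        -- import \
        --connect "$2" \
        --username "$3" \
        --password-file "$4" \
        --table "$5" \
        --incremental append \
        --check-column "$6" \
        --last-value 0 \
        -m 1 \
        --target-dir "$7"
}

# Usage (job name, column and paths are hypothetical examples):
# create_incremental_job incjob jdbc:mysql://localhost/userdb root \
#     /user/hduser/.mysql.pw employee id /user/hduser/incremental
```

After each --exec, Sqoop updates the last imported value stored in the saved job, so the next run picks up only the rows whose check column is greater than that value.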
