Apache Sqoop Import Command Usage
Apache Sqoop is a command-line interface application for transferring data between relational databases and Hadoop. It supports incremental loads of a single table or a free form SQL query as well as saved jobs which can be run multiple times to import updates made to a database since the last import. Imports can also be used to populate tables in Hive or HBase. Exports can be used to put data from Hadoop into a relational database. Sqoop got the name from sql+hadoop. Sqoop became a top-level Apache project in March 2012.
1) A machine with Ubuntu 14.04 LTS operating system.
2) Apache Hadoop pre installed (How to install Hadoop on Ubuntu 14.04)
3) MySQL Database pre installed (How to install MySQL Database on Ubuntu 14.04)
4) Apache Sqoop pre installed (How to install Sqoop on Ubuntu 14.04)
Sqoop Import From MySQL Database data to HDFS
The 'Import tool' imports individual tables from RDBMS to HDFS. Each row in a table is treated as a record in HDFS. All records are stored as text data in the text files or as binary data in Avro and Sequence files.
Step 1 - Change the directory to /usr/local/hadoop/sbin
Step 2 - Start all hadoop daemons.
Step 3 - The JPS (Java Virtual Machine Process Status Tool) tool is limited to reporting information on JVMs for which it has the access permissions.
Step 4 - Enter into MySQL command line inteface(CLI). Open a terminal (CTRL + ALT + T) and type the following command.
Step 5 - Create a new database 'userdb'
Step 6 - Use database 'userdb'
Step 7 - Create a new table 'employee'
Step 8 - Insert data into employee table.
Step 9 - Verify
Step 10 - Create a new table 'employee_address'
Step 11 - Insert data into employee_address table.
Step 12 - Verify
Step 13 - Create a new table 'employee_contact'
Step 14 - Insert data into employee_contact table.
Importing a Table
Sqoop tool 'import' is used to import table data from the table to the Hadoop file system as a text file or a binary file.
Step 15 - Change the directory to /usr/local/sqoop/bin
Step 16 - The following command is used to import the employee table from MySQL database server to HDFS.
Step 17 - Verify
Importing into Target Directory
We can specify the target directory while importing table data into HDFS using the Sqoop import tool.
Step 18 - The following command is used to import employee_address table data into '/targetflodername' directory.
Step 19 - Verify
Import Subset of Table Data
We can import a subset of a table using the 'where' clause in Sqoop import tool. It executes the corresponding SQL query in the respective database server and stores the result in a target directory in HDFS.
Step 20 - The following command is used to import a subset of employee_address table data. The subset query is to retrieve the employee id and address, who lives in Secunderabad city.
Step 21 - Verify
Incremental import is a technique that imports only the newly added rows in a table. It is required to add 'incremental', 'check-column', and 'last-value' options to perform the incremental import.
Step 22 - Insert some more data into employee table.
Before going to next step please remove old employee folder from HDFS.
Step 23 - The following command is used to perform the incremental import in the employee table.
Step 24 - The following command is used to verify the imported data from employee table to HDFS employee/ directory.
Step 25 - The following command is used to see the modified or newly added rows from the employee table.
Import all tables
Import all the tables from the RDBMS database server to the HDFS. Each table data is stored in a separate directory and the directory name is same as the table name.
Let us take an example of importing all tables from the userdb database. The list of tables that the database userdb contains is as follows.
Step 26 - The following command is used to import all the tables from the userdb database.
Step 27 - Verify
Please share this blog post and follow me for latest updates on
Labels : Apache Sqoop Installation on Ubuntu Apache Sqoop Export Command Example Apache Sqoop Job Command Example Apache Sqoop Codegen Command Example Apache Sqoop Eval Command Example Apache Sqoop List-tables Command Example Apache Sqoop List-tables Command Example