Wednesday, October 7, 2020

Sqoop Compression: Snappy and Avro Format

Run locate core-site.xml to find the Hadoop configuration file, which lists the available codec types.
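
For example, a quick way to see which codecs the cluster has enabled (a sketch, assuming the file lives under /etc/hadoop/conf, a common but not universal location):

$ locate core-site.xml
$ grep -A 2 "io.compression.codecs" /etc/hadoop/conf/core-site.xml   # lists the enabled codec classes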

Example of loading data from MySQL to HDFS (compression: Snappy and Avro format)

$ sqoop import \
 --connect jdbc:mysql://localhost/myddbb \
 --username=root -P \
 --table=mytable \
 --driver=com.mysql.jdbc.Driver \
 --target-dir=/ej_snappy_avro \
 --compress \
 --compression-codec org.apache.hadoop.io.compress.SnappyCodec \
 --as-avrodatafile
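
To confirm that the output really is Snappy-compressed Avro, the file metadata can be inspected; a sketch, assuming the avro-tools jar is available on the local machine and the default part-m-00000.avro file name:

$ hdfs dfs -ls /ej_snappy_avro
$ hdfs dfs -get /ej_snappy_avro/part-m-00000.avro .
$ avro-tools getmeta part-m-00000.avro   # the avro.codec entry should report snappy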

Example of loading data from MySQL to HDFS (compression: gzip and Avro format)

$ sqoop import \
 --connect jdbc:mysql://localhost/myddbb \
 --username=root -P \
 --table=mytable \
 --driver=com.mysql.jdbc.Driver \
 --target-dir=/ej_gzip_avro \
 --compress \
 --compression-codec org.apache.hadoop.io.compress.GzipCodec \
 --as-avrodatafile

 

Example of loading data from MySQL to HDFS (compression: BZIP2 and Sequence format)

$ sqoop import \
 --connect jdbc:mysql://localhost/myddbb \
 --username=root -P \
 --table=mytable \
 --driver=com.mysql.jdbc.Driver \
 --target-dir=/ej_bzip2_sequence \
 --compress \
 --compression-codec org.apache.hadoop.io.compress.BZip2Codec \
 --as-sequencefile
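
A compressed SequenceFile can be checked directly, since hdfs dfs -text decompresses and decodes it on the fly (assuming the default part-m-00000 file name):

$ hdfs dfs -text /ej_bzip2_sequence/part-m-00000 | head   # prints the records as plain text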

 

Example of loading data from MySQL to HDFS (restricting data with columns)

$ sqoop import \
 --connect jdbc:mysql://localhost/myddbb \
 --username=root -P \
 --table=mytable \
 --driver=com.mysql.jdbc.Driver \
 --target-dir=/ej_2_columns \
 --columns name,age

 

Example of loading data from MySQL to HDFS (restricting data with WHERE)

$ sqoop import \
 --connect jdbc:mysql://localhost/myddbb \
 --username=root -P \
 --table=mytable \
 --driver=com.mysql.jdbc.Driver \
 --target-dir=/ej_age_over_40 \
 --where "age > 40"

 

Example of loading data from MySQL to HDFS (incremental load)

In order to perform an incremental load we first need to add new data to the table "mytable"; to do so, execute the following statement in MySQL:

mysql> 
INSERT INTO mytable (name, age, salary) VALUES
        ("Diego", 24, 21000), ("Rosa", 26, 24000), ("Javier", 28, 25000), ("Lorena", 35, 28000), ("Miriam", 42, 30000), ("Patricia", 43, 25000), ("Natalia", 45, 39000);

Note: the insert must be executed in the database "myddbb".

Once the insert is done, we can run the incremental import with --last-value 7, the highest id already present in HDFS; Sqoop then fetches only rows with an id greater than 7, i.e. the new rows starting at id 8.

$ sqoop import \
 --connect jdbc:mysql://localhost/myddbb \
 --username=root -P \
 --table=mytable \
 --driver=com.mysql.jdbc.Driver \
 --target-dir=/my_table_hdfs \
 --incremental append \
 --check-column id \
 --last-value 7
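
For recurring incremental loads, Sqoop can also persist the import state in a saved job, so the last value does not have to be tracked by hand. A minimal sketch, assuming the illustrative job name incremental_mytable:

$ sqoop job --create incremental_mytable \
 -- import \
 --connect jdbc:mysql://localhost/myddbb \
 --username=root -P \
 --table=mytable \
 --driver=com.mysql.jdbc.Driver \
 --target-dir=/my_table_hdfs \
 --incremental append \
 --check-column id \
 --last-value 0

$ sqoop job --exec incremental_mytable   # each run resumes from the stored last-value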

 

Example of loading data from MySQL to HDFS, queryable from Hive

In order to import the table into Hive, we must first create the database where it will be stored, to avoid problems:

hive> CREATE DATABASE mybbddhive;

Once the database is created, you are ready to run the import:

$ sqoop import \
 --connect jdbc:mysql://localhost/myddbb \
 --username=root -P \
 --table=mytable \
 --driver=com.mysql.jdbc.Driver \
 --target-dir=/ej_hive \
 --compress \
 --compression-codec org.apache.hadoop.io.compress.SnappyCodec \
 --hive-import \
 --hive-database mybbddhive \
 --create-hive-table \
 --hive-table ej_hive_table
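
Once the job finishes, the table can be queried from Hive to confirm the import:

hive> USE mybbddhive;
hive> SELECT * FROM ej_hive_table LIMIT 5;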

Creating a MySQL database table

A database with a test table is created to run the examples against, using the following commands.

Accessing MySQL

$ mysql

You may get "ERROR 1045 (28000): Access denied for user 'root'@'localhost'", which can be resolved by logging in with explicit credentials:

$ mysql -u root -p

Note: Keep in mind that the MySQL password and the system password can be different.

List databases

mysql> show databases;

Create Database

mysql> create database myddbb;

Use database

mysql> use myddbb;

Create base table

mysql> CREATE TABLE mytable (
         id MEDIUMINT NOT NULL AUTO_INCREMENT,
         name CHAR(30) NOT NULL,
         age INTEGER(30),
         salary INTEGER(30),
         PRIMARY KEY (id));

Create second table

mysql> CREATE TABLE mytable2 (
         id MEDIUMINT NOT NULL AUTO_INCREMENT,
         name CHAR(30) NOT NULL,
         age INTEGER(30),
         salary INTEGER(30),
         PRIMARY KEY (id));

Insert Data

mysql> INSERT INTO mytable (name, age, salary) VALUES
        ("Peter", 24, 21000), ("Maria", 26, 24000), ("John", 28, 25000), ("Louis", 35, 28000), ("Monica", 42, 30000), ("Rose", 43, 25000), ("Susana", 45, 39000);

Note: If you have permission problems, enter MySQL and grant all privileges:

mysql> GRANT ALL PRIVILEGES ON *.* TO 'root'@'localhost' IDENTIFIED BY 'MySQL_Key' WITH GRANT OPTION;

Exit MYSQL

mysql> exit;

 

Load MySQL data to HDFS

Example of loading data from the table "mytable" of the database "myddbb" to the HDFS folder named "my_hdfs_table".

MySQL to HDFS

$ sqoop import \
 --connect jdbc:mysql://localhost/myddbb \
 --username=root -P \
 --table=mytable \
 --driver=com.mysql.jdbc.Driver \
 --target-dir=/my_hdfs_table \
 --fields-terminated-by ',' \
 --lines-terminated-by '\n'

--target-dir: HDFS directory where the imported data is stored.
--table: identifies the table to be copied.
--clear-staging-table: indicates that any data present in the staging table can be deleted.
--verbose: prints additional information to facilitate debugging.
--fields-terminated-by: defines the field delimiter.
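
To check that the comma delimiter was applied, the imported file can be read back (a sketch; the part file name may vary):

$ hdfs dfs -cat /my_hdfs_table/part-m-00000 | head   # each line should read id,name,age,salary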

 

Loading data from HDFS to MySQL

Example of loading data from the HDFS folder named “my_hdfs_table” to the “mytable2” table in the “myddbb” database.


$ sqoop export \
 --connect jdbc:mysql://localhost/myddbb \
 --username=root -P \
 --table=mytable2 \
 --export-dir=/my_hdfs_table -m 1
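
The export can be verified back in MySQL; the row count of mytable2 should match the number of records in /my_hdfs_table:

mysql> SELECT COUNT(*) FROM mytable2;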

Note: If you hit the warning "Set $ACCUMULO_HOME to the root of your Accumulo installation", it can be avoided with:

$ ACCUMULO_HOME='/var/lib/accumulo'
$ export ACCUMULO_HOME
$ sudo mkdir /var/lib/accumulo

To practice further with Sqoop, consult "Examples of Sqoop", a section that compiles many examples you may find useful.

