Uploading and Extracting data from mysql as csv

Extract data as csv from mysql table 

From the termial run the following command (you may need to unformat  this if you are copying this command, the quotes may appear in different format),

mysql -u root -pspace123 -e “SELECT * from employee” mydb > asd.csv                                                                         The above command export data from the table employee of the database mydb into a csv file called asd.csv . By default the delimiter would be tab in this csv file and this can be overidden. Also note that there no space after -p. The password is expected be written along with -p attribute unlike what we do for -u. Otherwise error will be thrown.

asd.csv                                                                                                                                                                                   empno ename
1              ram
1000       Jeena

In this, the header row can be omitted by specifying -N parameter along with the previous command

mysql -N -u root -pspace123 -e “SELECT * from employee” mydb > asd.csv

asd.csv
1               ram
1000        Jeena

Upload csv into mysql table                                                                                                                                                        For this, the only requirement is that we need to have a table in the database with the same and structure as that of the csv file we are going to upload. In my case I have an employee.csv file and table with same name employee and same structure.

employee.csv
1;ram
1000;Jeena

Now type the following command from the terminal,

mysqlimport -u root -pspace123 –fields-terminated-by=’;’ –local mydb employee.csv

This would upload this csv file to the table employee and that can be viewed    from mysql console

mysql> select * from employee;
Screenshot from 2016-06-04 12:52:07

Upload csv file into Remote Table

mysqlimport -h remote-host-name –port 3306 -u username -ppassword –fields-terminated-by=’;’ –local remote-db-name filename.csv-on-local-machine

mysqlimport -h example.com –port 3306 -u testUser -ptestUser123 –fields-terminated-by=’;’ –local demo test.csv

Advertisements

Error : java.lang.RuntimeException: Can’t parse input data: ‘NULL’

When we execute a sqoop export action via oozie  (moving data from hdfs to mysql ) using the tag <command>, we may come across the error java.lang.RuntimeException: Can’t parse input data: ‘NULL’.  This sqoop export  command, which was executed successfully though terminal, may raise this issue, when it is combined with oozie.

The fix for the above issue is to remove all the single/double quotes from the command.                                              eg : Instead of –input-null-string ‘NULL’, we need to use –input-null-string NULL                                                  

A working command is added below :                                                

<command>export –connect jdbc:mysql://localhost/demo –username root –password mysql123 –table calc_match_out –input-null-string NULL –input-null-non-string NULL -m 1 –input-fields-terminated-by ; –input-lines-terminated-by \n –export-dir /user/ambari-qa/output/output-calc/part-r-00000</command>

The reason why this fix is working can be interpreted from the following excerpts taken from the blog http://hadooped.blogspot.in/

Sqoop command:
The Sqoop command can be specified either using the command element or multiple arg elements.
– When using the command element, Oozie will split the command on every space into multiple arguments.- When using the arg elements, Oozie will pass each argument value as an argument to Sqoop.  The arg variant should be used when there are spaces within a single argument.  – All the above elements can be parameterized (templatized) using EL expressions.

Also I think the following apache documentation on sqoop is of high relevance for beginners in sqoop domain :

Sqoop features

 The sqoop action runs a Sqoop job synchronously.- The information to be included in the oozie sqoop action  are the job-tracker, the name-node and Sqoop command or arg elements as well as configuration.- A prepare node can be included to do any prep work including hdfs actions.  This will be executed prior to execution of the sqoop job.- Sqoop configuration can be specified with a file, using the job-xml element, and inline, using the configuration elements.- Oozie EL expressions can be used in the inline configuration. Property values specified in the configuration element override values specified in the job-xml file.
Note that Hadoop mapred.job.tracker and fs.default.name properties must not be present in the inline configuration. As with Hadoop map-reduce jobs, it is possible to add files and archives in order to make them available to the Sqoop job. 


Importing CSV files into MYSQL tables

For  importing a csv file into mysql table, we need to execute following command:

LOAD DATA INFILE ‘/tmp/t.csv’ INTO TABLE table1 FIELDS TERMINATED BY ‘,’ ENCLOSED BY ‘”‘ LINES TERMINATED BY ‘\n’;

Sometimes we may face some file not found exceptions and errors and it may persist even after giving full permissions. I got an error which said t.csv file is missing even though I have place it at correct location. The fix for this issue is to change the ownwership of t.csv to mysql using the following command.

chown mysql:mysql t.csv

DATE filed and CSV imports

I had a table table2:

create table table2(ID INT,title INT NULL,dt DATETIME NULL);

t.csv :

1, 3,10/12/2000
1000, 1223,12/12/2014

And when I executed the previopus load command, I got incorrect values in mysql table table2 and which was as follows:

ID            title                    dt
+——+——-+———————+
1          3             0000-00-00 00:00:00
1000   1223        0000-00-00 00:00:00

In this case, we need to modify the date filed while uploading to database. This can be done using the following command:

LOAD DATA INFILE ‘/tmp/t.csv’ INTO TABLE table2 FIELDS TERMINATED BY ‘,’ ENCLOSED BY ‘”‘ LINES TERMINATED BY ‘\n’ (ID, title, @var1) SET dt = STR_TO_DATE(@var1, ‘%m/%d/%Y’);

ID            title                    dt
+——+——-+———————+
1          3            2000-10-12
1000   1223      2014-12-12

And if there are multiple fields which are to be modified,you can specify them in a comma separated way like,

LOAD DATA LOCAL INFILE '/path/to/csv/file.csv' 
INTO TABLE mytable 
LINES TERMINATED BY '\n'
(id, task, hoursWorked, @var1, @var2) 
SET begindate = STR_TO_DATE(@var1, '%m/%d/%Y'),     
enddate = STR_TO_DATE(@var2, '%m/%d/%Y');