Access denied issue with mysqlimport with remote machine

On importing a csv file into a table with same structure that placed on different machine, with mysql import using mysql root user , it is common that we may end up with some access denied error messages.

mysqlimport -h yourhostname –port 3306 -u root -pyourpassword –fields-terminated-by=’;’ –local csv_test tweet.csv

For instance the above stated query which worked in some machines, may raise

mysqlimport: Error: 1045 Access denied for user ‘root’@’ipaddress-of-remote-host’ (using password: YES)

The fix to this issue is as follows :

  1. First check the my.cnf file ( inside /etc/mysql folder ) and look for the line :

bind-address = 127.0.0.1

If it not commented, comments it out, as :

 #bind-address = 127.0.0.1

This tells system that, this mysql is accessible from different hosts as well.

2. Next we need to grant permission to the root user.

For that login to the mysql prompt using mysql -u root -p. Then execute the following query,

GRANT ALL PRIVILEGES ON *.* TO ‘root’@’%’ IDENTIFIED BY ‘password’;

3. Now restart mysql using the command:

service mysql restart

Java – Batch Upload to Database

Usually I/O operations are costly. When I  tried to upload my huge csv file of 1.8 lakhs records one by one,  into a MySQL table , it took almost more than 30 minutes. And obviously it was not an acceptable result. So I had to resort  to batch uploading. On writing data as batches of 1000 records, everything was over in 30-40 seconds.

Lets see the code in detail.

The important thing to remember is that we need to turn off  “auto commit” mode. This means that if this mode is enable, every time record is pushed DB memory, it would automatically get written into tables, nullifying the effects of batch upload. At the same we have to enable it just before writing the records into table once enough number of records are pushed into DB memory or cache, using the commit().

In the the following example, records are read from a csv file named input.csv and its first three fields are written into tables called “batch” in the DB test. MySQL was DB of my choice.

At first auto commit mode is turned off by calling setAutoCommit(false) on DB connection object. Each record will be read and pushed them into DB cache using the addBatch(); 

When the we have 1000 records in the cache ie count variables becomes multiples of 1000, we need to write them into DB . For that we call following method, executeBatch(); Since we have disabled the auto commit mode, we need to enable that as well by calling commit()  on connection object in order to get this data written into the DB.

Gitub link here

BatcUpload.java

public class BatchUpload {

public static void main(String[] args) throws IOException, SQLException {

String line = “”;
String delimiter = “,”;
int count = 1;
DBConnector.createConnection();
Connection dbConn = DBConnector.getDBConnection();
System.out.println(dbConn);
PreparedStatement ps = DBConnector.getPSInstance();
dbConn.setAutoCommit(false);
String inputFile = “input.csv”;
BufferedReader br = new BufferedReader(new FileReader(inputFile));
while ((line = br.readLine()) != null) {
String[] entities = line.split(delimiter);
try {
ps.setString(1,entities[0]);
ps.setString(2,entities[1]);
ps.setString(3,entities[2]);

ps.addBatch();
if(count%1000==0){
ps.executeBatch();
dbConn.commit();
}
count++;

System.out.println(“Records are inserted into DBUSER table!”);

} catch (SQLException e) {

System.out.println(e.getMessage());

}

}

/*** To write the remaining records into DB*/

ps.executeBatch();
dbConn.commit();
dbConn.close();

}
}

DBConnector.java 

public class DBConnector {
private static final String DB_DRIVER = “com.mysql.jdbc.DRIVER”;
private static final String DB_CONNECTION = “jdbc:mysql://localhost:3306/test”;
private static final String DB_USER = “root”;
private static final String DB_PASSWORD = “root”;

private static Connection conn;
private static PreparedStatement ps;

public static void createConnection() {

// conn = null;
System.out.println(“asdasdfafds”);

try {

Class.forName(DB_DRIVER);

} catch(ClassNotFoundException cnf){
System.out.println(“Driver could not be loaded: ” + cnf);
}

try{
conn = DriverManager.getConnection(DB_CONNECTION, DB_USER,DB_PASSWORD);
String query = “INSERT INTO batch”
+ “(userID, username, address) VALUES”
+ “(?,?,?)”;

ps = conn.prepareStatement(query);
}
catch (SQLException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}

}

public static Connection getDBConnection() {

return conn;

}

public static PreparedStatement getPSInstance() {
return ps;

}
}

Uploading and Extracting data from mysql as csv

Extract data as csv from mysql table 

From the termial run the following command (you may need to unformat  this if you are copying this command, the quotes may appear in different format),

mysql -u root -pspace123 -e “SELECT * from employee” mydb > asd.csv                                                                         The above command export data from the table employee of the database mydb into a csv file called asd.csv . By default the delimiter would be tab in this csv file and this can be overidden. Also note that there no space after -p. The password is expected be written along with -p attribute unlike what we do for -u. Otherwise error will be thrown.

asd.csv                                                                                                                                                                                   empno ename
1              ram
1000       Jeena

In this, the header row can be omitted by specifying -N parameter along with the previous command

mysql -N -u root -pspace123 -e “SELECT * from employee” mydb > asd.csv

asd.csv
1               ram
1000        Jeena

Upload csv into mysql table                                                                                                                                                        For this, the only requirement is that we need to have a table in the database with the same and structure as that of the csv file we are going to upload. In my case I have an employee.csv file and table with same name employee and same structure.

employee.csv
1;ram
1000;Jeena

Now type the following command from the terminal,

mysqlimport -u root -pspace123 –fields-terminated-by=’;’ –local mydb employee.csv

This would upload this csv file to the table employee and that can be viewed    from mysql console

mysql> select * from employee;
Screenshot from 2016-06-04 12:52:07

Upload csv file into Remote Table

mysqlimport -h remote-host-name –port 3306 -u username -ppassword –fields-terminated-by=’;’ –local remote-db-name filename.csv-on-local-machine

mysqlimport -h example.com –port 3306 -u testUser -ptestUser123 –fields-terminated-by=’;’ –local demo test.csv

Error : java.lang.RuntimeException: Can’t parse input data: ‘NULL’

When we execute a sqoop export action via oozie  (moving data from hdfs to mysql ) using the tag <command>, we may come across the error java.lang.RuntimeException: Can’t parse input data: ‘NULL’.  This sqoop export  command, which was executed successfully though terminal, may raise this issue, when it is combined with oozie.

The fix for the above issue is to remove all the single/double quotes from the command.                                              eg : Instead of –input-null-string ‘NULL’, we need to use –input-null-string NULL                                                  

A working command is added below :                                                

<command>export –connect jdbc:mysql://localhost/demo –username root –password mysql123 –table calc_match_out –input-null-string NULL –input-null-non-string NULL -m 1 –input-fields-terminated-by ; –input-lines-terminated-by \n –export-dir /user/ambari-qa/output/output-calc/part-r-00000</command>

The reason why this fix is working can be interpreted from the following excerpts taken from the blog http://hadooped.blogspot.in/

Sqoop command:
The Sqoop command can be specified either using the command element or multiple arg elements.
– When using the command element, Oozie will split the command on every space into multiple arguments.- When using the arg elements, Oozie will pass each argument value as an argument to Sqoop.  The arg variant should be used when there are spaces within a single argument.  – All the above elements can be parameterized (templatized) using EL expressions.

Also I think the following apache documentation on sqoop is of high relevance for beginners in sqoop domain :

Sqoop features

 The sqoop action runs a Sqoop job synchronously.- The information to be included in the oozie sqoop action  are the job-tracker, the name-node and Sqoop command or arg elements as well as configuration.- A prepare node can be included to do any prep work including hdfs actions.  This will be executed prior to execution of the sqoop job.- Sqoop configuration can be specified with a file, using the job-xml element, and inline, using the configuration elements.- Oozie EL expressions can be used in the inline configuration. Property values specified in the configuration element override values specified in the job-xml file.
Note that Hadoop mapred.job.tracker and fs.default.name properties must not be present in the inline configuration. As with Hadoop map-reduce jobs, it is possible to add files and archives in order to make them available to the Sqoop job. 


Importing CSV files into MYSQL tables

For  importing a csv file into mysql table, we need to execute following command:

LOAD DATA INFILE ‘/tmp/t.csv’ INTO TABLE table1 FIELDS TERMINATED BY ‘,’ ENCLOSED BY ‘”‘ LINES TERMINATED BY ‘\n’;

Sometimes we may face some file not found exceptions and errors and it may persist even after giving full permissions. I got an error which said t.csv file is missing even though I have place it at correct location. The fix for this issue is to change the ownwership of t.csv to mysql using the following command.

chown mysql:mysql t.csv

DATE filed and CSV imports

I had a table table2:

create table table2(ID INT,title INT NULL,dt DATETIME NULL);

t.csv :

1, 3,10/12/2000
1000, 1223,12/12/2014

And when I executed the previopus load command, I got incorrect values in mysql table table2 and which was as follows:

ID            title                    dt
+——+——-+———————+
1          3             0000-00-00 00:00:00
1000   1223        0000-00-00 00:00:00

In this case, we need to modify the date filed while uploading to database. This can be done using the following command:

LOAD DATA INFILE ‘/tmp/t.csv’ INTO TABLE table2 FIELDS TERMINATED BY ‘,’ ENCLOSED BY ‘”‘ LINES TERMINATED BY ‘\n’ (ID, title, @var1) SET dt = STR_TO_DATE(@var1, ‘%m/%d/%Y’);

ID            title                    dt
+——+——-+———————+
1          3            2000-10-12
1000   1223      2014-12-12

And if there are multiple fields which are to be modified,you can specify them in a comma separated way like,

LOAD DATA LOCAL INFILE '/path/to/csv/file.csv' 
INTO TABLE mytable 
LINES TERMINATED BY '\n'
(id, task, hoursWorked, @var1, @var2) 
SET begindate = STR_TO_DATE(@var1, '%m/%d/%Y'),     
enddate = STR_TO_DATE(@var2, '%m/%d/%Y');