Error: java.lang.RuntimeException: Can't parse input data: 'NULL'

When we execute a Sqoop export action via Oozie (moving data from HDFS to MySQL) using the <command> element, we may come across the error java.lang.RuntimeException: Can't parse input data: 'NULL'. The same sqoop export command may run successfully from the terminal and still fail with this error when it is run through Oozie.

The fix is to remove all single and double quotes from the command. For example, instead of --input-null-string 'NULL', we need to use --input-null-string NULL

A working command is shown below:

<command>export --connect jdbc:mysql://localhost/demo --username root --password mysql123 --table calc_match_out --input-null-string NULL --input-null-non-string NULL -m 1 --input-fields-terminated-by ; --input-lines-terminated-by \n --export-dir /user/ambari-qa/output/output-calc/part-r-00000</command>

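The reason this fix works can be inferred from the following excerpts taken from the blog http://hadooped.blogspot.in/. In short, the command element is split by Oozie and not by a shell, so quotes are not stripped and reach Sqoop as part of the argument value.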

Sqoop command:
The Sqoop command can be specified either using the command element or multiple arg elements.
- When using the command element, Oozie will split the command on every space into multiple arguments.
- When using the arg elements, Oozie will pass each argument value as an argument to Sqoop. The arg variant should be used when there are spaces within a single argument.
- All the above elements can be parameterized (templatized) using EL expressions.
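
To see why the quotes matter, here is a minimal shell sketch of the behaviour described in the excerpt. The function names split_on_spaces and as_arg_elements are hypothetical and are not part of Oozie or Sqoop. The first only imitates splitting on spaces with no quote handling; the second prints one arg element per argument, assuming a real shell has already removed any quotes.

```shell
# Hypothetical helper (not part of Oozie): split a string on spaces only,
# with no quote processing, to imitate how the command element is split.
split_on_spaces() (
  IFS=' '
  set -f                      # keep characters such as * literal
  for tok in $1; do
    printf '[%s]\n' "$tok"
  done
)

# Hypothetical helper: print one <arg> element per shell argument,
# escaping the XML special characters &, < and >.
as_arg_elements() {
  for a in "$@"; do
    printf '%s\n' "$a" |
      sed -e 's/&/\&amp;/g' -e 's/</\&lt;/g' -e 's/>/\&gt;/g' \
          -e 's|.*|<arg>&</arg>|'
  done
}

# The quotes survive the space split and stay attached to the value.
split_on_spaces "--input-null-string 'NULL' -m 1"

# With arg elements each value is passed as it is, so no quotes are needed.
as_arg_elements --input-null-string NULL --input-fields-terminated-by ';'
```

The first call prints ['NULL'] as its second token, quotes included, which is the value Sqoop would then be given as the null string. The second call shows how the same options could be laid out as arg elements inside the sqoop action.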

I also think the following Apache documentation on the Sqoop action is very relevant for beginners:

Sqoop features

- The sqoop action runs a Sqoop job synchronously.
- The information to be included in the Oozie sqoop action are the job-tracker, the name-node and Sqoop command or arg elements as well as configuration.
- A prepare node can be included to do any prep work including HDFS actions. This will be executed prior to execution of the Sqoop job.
- Sqoop configuration can be specified with a file, using the job-xml element, and inline, using the configuration elements.
- Oozie EL expressions can be used in the inline configuration. Property values specified in the configuration element override values specified in the job-xml file.

Note that Hadoop mapred.job.tracker and fs.default.name properties must not be present in the inline configuration. As with Hadoop map-reduce jobs, it is possible to add files and archives in order to make them available to the Sqoop job.


Oozie – Connection Refused Error on creating sharelib directory on hdfs

Please refer to http://anggao.js.org/apache-oozie-installation-on-ubuntu.html

After preparing the Oozie war file, we need to create the sharelib directory on HDFS using the following command (we need to be in the Oozie home directory):

./bin/oozie-setup.sh sharelib create -fs hdfs://localhost:9000

The above command internally issues an HDFS create-directory command to the name node running at hdfs://localhost:9000 and then copies the shared library to that directory.

Though we expect everything to go normally, errors may sometimes pop up. The error I got is shown below.

[Screenshot: oozie-error]

From the connection error shown on screen, it is clear that the name node at localhost:9000 is not reachable. This can be confirmed by typing in a terminal:

telnet localhost 9000 

Next I need to confirm that this is the port configured for the name node. This can be done using the command:

# first get HDFS info
hdfs getconf -confKey fs.defaultFS

The above command gives the output hdfs://localhost:9000, so we are using the correct path and port number.
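
As a small sketch, the host and port can be pulled out of the fs.defaultFS value in the shell and reused for the reachability check. The function name fs_host_port is hypothetical, and the sketch assumes the value always carries an explicit port, as hdfs://localhost:9000 does.

```shell
# Hypothetical helper: print "host port" for a URI like hdfs://localhost:9000.
# Assumes the URI contains an explicit port.
fs_host_port() {
  hostport=${1#*://}          # drop the scheme, e.g. "hdfs://"
  hostport=${hostport%%/*}    # drop any trailing path
  printf '%s %s\n' "${hostport%%:*}" "${hostport##*:}"
}

fs_host_port "hdfs://localhost:9000"    # prints: localhost 9000

# On a machine with Hadoop installed, the two steps could be combined as:
#   set -- $(fs_host_port "$(hdfs getconf -confKey fs.defaultFS)")
#   telnet "$1" "$2"
```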

Now into the details of the issue.

The problem is that the name node has not been started, so we need to make sure that all Hadoop services are running. To check this, run the following command from the hadoop user home (/home/hadoop for me):

jps

This command is expected to give the following result if everything is normal:

2583 DataNode
2970 ResourceManager
3461 Jps
3177 NodeManager
2361 NameNode
2840 SecondaryNameNode
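
The comparison against this list can also be scripted. The sketch below reads jps output on standard input and prints every expected daemon that is absent. The function name missing_daemons is hypothetical, and the daemon list is simply the one from the listing above.

```shell
# Hypothetical helper: read `jps` output on stdin and print each expected
# daemon that does not appear in it. The list matches the listing above.
missing_daemons() {
  jps_output=$(cat)
  for d in NameNode SecondaryNameNode DataNode ResourceManager NodeManager; do
    # Compare the second column exactly, so NameNode does not match
    # SecondaryNameNode.
    printf '%s\n' "$jps_output" |
      awk -v d="$d" '$2 == d { found = 1 } END { exit !found }' ||
      printf '%s\n' "$d"
  done
}

# Sample input with the NameNode line left out; prints: NameNode
printf '%s\n' '2583 DataNode' '2970 ResourceManager' '3461 Jps' \
  '3177 NodeManager' '2840 SecondaryNameNode' | missing_daemons

# On a real node: jps | missing_daemons
```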
But when I executed the command, NameNode was missing from the result. That is why we are not able to create the sharelib on HDFS, and it is our real problem. It can be fixed as follows:
1. Move to the hadoop user home (for me /home/hadoop) and stop all Hadoop services using stop-all.sh
2. Then reformat the name node (note that this erases the existing HDFS metadata) using
   hdfs namenode -format
3. Then restart all Hadoop services using start-all.sh (or by running the commands start-dfs.sh and start-yarn.sh separately)
4. Then issue the command jps
5. If everything is normal, we will get the expected result mentioned earlier. In that case we can resume the Oozie installation and run the command
   ./bin/oozie-setup.sh sharelib create -fs hdfs://localhost:9000
But unfortunately, for me the DataNode was missing from the jps output. Fixing this issue is a bit tricky. A likely cause is that reformatting gives the name node a new clusterID, which no longer matches the one stored in the data node directory.
Before starting, stop all Hadoop services using stop-all.sh
1. First we need to get the HDFS data node path. This is specified in hdfs-site.xml of Hadoop (for me it was in /usr/local/hadoop/etc/hadoop).
2. For me the data node directory was /home/hadoop/mydata/hdfs/datanode
3. cd to the above path and remove all its contents by issuing the rm command
   hadoop@space-Vostro-3800:/home/hadoop/mydata/hdfs/datanode$ rm -r *
4. Then go to the hadoop user home folder and reformat the name node.
   hadoop@space-Vostro-3800:/home/hadoop$ hdfs namenode -format
   Then start all Hadoop services using start-all.sh or by running the commands separately.
5. Then run the command jps. This time I got the expected result:
2583 DataNode
2970 ResourceManager
3461 Jps
3177 NodeManager
2361 NameNode
2840 SecondaryNameNode
So the name node service is now running, and this can be verified by
telnet localhost 9000
So everything is under control now, and we can resume the Oozie installation. We can move to the oozie folder in the hadoop user home and run
hadoop@space-Vostro-3800:/home/hadoop/oozie$ ./bin/oozie-setup.sh sharelib create -fs hdfs://localhost:9000
Success.
Now we can create the database using
hadoop@space-Vostro-3800:/home/hadoop/oozie$ ./bin/ooziedb.sh create -sqlfile oozie.sql -run
OK, with this the database is created.
Now start the Oozie service using
hadoop@space-Vostro-3800:/home/hadoop/oozie$ ./bin/oozied.sh start
The same can be verified from the web interface. Please open the following URL in your browser.
# Web UI
http://localhost:11000/oozie/
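
The same check can be made from the command line with the Oozie client's admin -status subcommand, which reports the system mode. The sketch below assumes that output contains the text "System mode: NORMAL" when the server is healthy; the helper name is_normal_mode is hypothetical.

```shell
# Hypothetical helper: succeed if the text on stdin reports NORMAL mode.
is_normal_mode() {
  grep -q 'System mode: NORMAL'
}

# On the Oozie host (URL as used above):
#   ./bin/oozie admin -oozie http://localhost:11000/oozie -status | is_normal_mode \
#     && echo "Oozie is up"

# Stand-in for the real output, so the sketch runs without a server.
printf 'System mode: NORMAL\n' | is_normal_mode && echo "Oozie is up"
```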
****************
Now we are done with the Oozie installation.