When we execute a Sqoop export action via Oozie (moving data from HDFS to MySQL) using the <command> element, we may come across the error java.lang.RuntimeException: Can't parse input data: 'NULL'. The same Sqoop export command may run successfully from a terminal and still fail with this error when it is run through Oozie.
The fix is to remove all single and double quotes from the command. For example, instead of --input-null-string 'NULL', use --input-null-string NULL.
A working command is shown below:
<command>export --connect jdbc:mysql://localhost/demo --username root --password mysql123 --table calc_match_out --input-null-string NULL --input-null-non-string NULL -m 1 --input-fields-terminated-by ; --input-lines-terminated-by \n --export-dir /user/ambari-qa/output/output-calc/part-r-00000</command>
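The reason this fix works can be inferred from the following excerpt from the blog http://hadooped.blogspot.in/. Since Oozie, not a shell, splits the command, the quotes are not stripped and reach Sqoop as part of the argument value: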
The Sqoop command can be specified either using the command element or multiple arg elements.
- When using the command element, Oozie will split the command on every space into multiple arguments.
- When using the arg elements, Oozie will pass each argument value as an argument to Sqoop. The arg variant should be used when there are spaces within a single argument.
- All the above elements can be parameterized (templatized) using EL expressions.
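As a rough illustration of the arg variant described above, the working export could be written with one arg element per argument. This is only a sketch that reuses the values from the command shown earlier; it has not been run against a cluster, and the enclosing sqoop action elements are left out.

```xml
<!-- Sketch only: the export from this post, written as arg elements. -->
<!-- Each arg value reaches Sqoop as exactly one argument, so nothing is quoted. -->
<arg>export</arg>
<arg>--connect</arg>
<arg>jdbc:mysql://localhost/demo</arg>
<arg>--username</arg>
<arg>root</arg>
<arg>--password</arg>
<arg>mysql123</arg>
<arg>--table</arg>
<arg>calc_match_out</arg>
<arg>--input-null-string</arg>
<arg>NULL</arg>
<arg>--input-null-non-string</arg>
<arg>NULL</arg>
<arg>-m</arg>
<arg>1</arg>
<arg>--input-fields-terminated-by</arg>
<arg>;</arg>
<arg>--input-lines-terminated-by</arg>
<arg>\n</arg>
<arg>--export-dir</arg>
<arg>/user/ambari-qa/output/output-calc/part-r-00000</arg>
```

This form is more verbose, but it is the one to reach for when a single argument value contains spaces, since the command element would split that value into several arguments.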
The following excerpt from the Apache documentation on the Oozie Sqoop action is also highly relevant for beginners with Sqoop:
- The sqoop action runs a Sqoop job synchronously.
- The information to be included in the Oozie sqoop action is the job-tracker, the name-node, and the Sqoop command or arg elements, as well as configuration.
- A prepare node can be included to do any prep work, including HDFS actions. This will be executed prior to execution of the Sqoop job.
- Sqoop configuration can be specified with a file, using the job-xml element, and inline, using the configuration elements.
- Oozie EL expressions can be used in the inline configuration. Property values specified in the configuration element override values specified in the job-xml file.
Note that Hadoop mapred.job.tracker and fs.default.name properties must not be present in the inline configuration. As with Hadoop map-reduce jobs, it is possible to add files and archives in order to make them available to the Sqoop job.
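To show how these pieces fit together, here is a minimal sketch of a workflow containing one Sqoop action. The element names follow the Oozie sqoop action schema, but several values are assumptions made for illustration and are not from the original setup: the workflow and node names, the EL properties ${jobTracker}, ${nameNode} and ${queueName}, the scratch path under prepare, the job-xml file name, and the JDBC driver jar path.

```xml
<!-- Sketch only: names, EL properties and paths marked hypothetical are assumptions. -->
<workflow-app name="sqoop-export-wf" xmlns="uri:oozie:workflow:0.2">
    <start to="sqoop-export"/>
    <action name="sqoop-export">
        <sqoop xmlns="uri:oozie:sqoop-action:0.2">
            <!-- Cluster endpoints, supplied as EL properties (hypothetical names) -->
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <prepare>
                <!-- Prep work that runs before the Sqoop job (hypothetical scratch path) -->
                <delete path="${nameNode}/user/ambari-qa/tmp/export-scratch"/>
            </prepare>
            <!-- File-based configuration (hypothetical file name) -->
            <job-xml>sqoop-site.xml</job-xml>
            <!-- Inline configuration; overrides values from job-xml -->
            <configuration>
                <property>
                    <name>mapred.job.queue.name</name>
                    <value>${queueName}</value>
                </property>
            </configuration>
            <!-- Unquoted command: Oozie splits it on spaces -->
            <command>export --connect jdbc:mysql://localhost/demo --username root --password mysql123 --table calc_match_out --input-null-string NULL --input-null-non-string NULL -m 1 --input-fields-terminated-by ; --input-lines-terminated-by \n --export-dir /user/ambari-qa/output/output-calc/part-r-00000</command>
            <!-- Extra file made available to the job (hypothetical jar path) -->
            <file>lib/mysql-connector-java.jar</file>
        </sqoop>
        <ok to="end"/>
        <error to="fail"/>
    </action>
    <kill name="fail">
        <message>Sqoop export failed: [${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <end name="end"/>
</workflow-app>
```

Note that the inline configuration here sets only a queue name and leaves out mapred.job.tracker and fs.default.name, in line with the restriction quoted above.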