Error processing package nginx

While installing nginx from the apt repository, i.e. when we run the command
apt-get install nginx, we may come across error messages like this:

Job for nginx.service failed. See 'systemctl status nginx.service' and 'journalctl -xn' for details.
invoke-rc.d: initscript nginx, action "start" failed.
dpkg: error processing package nginx-full (--configure):
subprocess installed post-installation script returned error exit status 1
dpkg: dependency problems prevent configuration of nginx:
nginx depends on nginx-full (>= 1.6.2-5+deb8u4) | nginx-light (>= 1.6.2-5+deb8u4) | nginx-extras (>= 1.6.2-5+deb8u4); however:
Package nginx-full is not configured yet.
Package nginx-light is not installed.
Package nginx-extras is not installed.
nginx depends on nginx-full (<< 1.6.2-5+deb8u4.1~) | nginx-light (<< 1.6.2-5+deb8u4.1~) | nginx-extras (<< 1.6.2-5+deb8u4.1~); however:
Package nginx-full is not configured yet.
Package nginx-light is not installed.
Package nginx-extras is not installed.

dpkg: error processing package nginx (--configure):
dependency problems - leaving unconfigured
Errors were encountered while processing:
nginx-full
nginx
E: Sub-process /usr/bin/dpkg returned an error code (1)

Fix

Stopping the apache service (or whichever web server is currently running) before installing nginx solves this issue: nginx fails to start because port 80 is already in use, which makes the post-installation script exit with an error. Once nginx is installed, we can start the apache service again.

Hence the following steps should resolve the issue:

1. sudo systemctl stop apache2.service
2. sudo apt-get install nginx
3. sudo systemctl start apache2.service


Invoking Celery Tasks from Java Application – Part #2

In the previous post we saw how to invoke a Celery task from a Java application, but that approach was based on sending a message to a RabbitMQ queue using the RabbitMQ client libraries. In this post, let's get familiar with a more convenient way: using REST APIs.

For this, we need to install a Celery monitoring tool called flower. Not every version of flower serves our purpose; what worked for me is the development version (the command to install it is written below).
pip install https://github.com/mher/flower/zipball/master#egg=flower

So let us assume that we have a tasks.py with a task named add:

from celery import Celery

app = Celery('tasks', broker='amqp://guest@localhost//')  # broker URL assumed; adjust to your RabbitMQ setup

@app.task
def add(x, y):
    print(x + y)

Now run the worker:
celery -A tasks worker --loglevel=info

Starting flower
Finally, it is time to start flower so that we can access and control both tasks and workers using flower's REST APIs. For that we need to run the following command:

celery flower -A appname (here: celery flower -A tasks)

Care should be taken to specify the project name (here, tasks) in the above command when starting flower, because the APIs will not work properly otherwise.

Now flower can be viewed at the URL http://localhost:5555 (or using the respective hostname). It has different tabs showing the status of tasks, workers and so on. Basically, what we are going to do is use the APIs that flower itself uses for the aforementioned features, directly in our application.

To simulate the REST API calls throughout this post I am using the curl command, as I come from a Linux background. These APIs can be integrated from any programming language.

1. Invoking a celery task

curl -X POST -d '{"args":[1,2]}' http://localhost:5555/api/task/async-apply/tasks.add

This would trigger the celery task add with arguments 1 and 2, and would generate an output similar to the following:

{
  "task-id": "81775ebb-7d88-4e91-b580-b3a2d79fe668",
  "state": "PENDING"
}

So this API returns the task id of the generated task, which can be used to track it whenever we want.
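
Since the end goal is to trigger tasks from a Java application, here is a minimal sketch of the same call using Java 11's built-in HttpClient (the flower host/port and task name are the ones assumed above):

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class FlowerTaskInvoker {
    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        // POST /api/task/async-apply/<task-name> with the task arguments as JSON
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://localhost:5555/api/task/async-apply/tasks.add"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString("{\"args\": [1, 2]}"))
                .build();
        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());
        // Body looks like: {"task-id": "...", "state": "PENDING"}
        System.out.println(response.body());
    }
}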

2. Retrieving information regarding a specific task using its id

curl -X GET http://localhost:5555/api/task/info/81775ebb-7d88-4e91-b580-b3a2d79fe668

Output:
{
  "task-id": "81775ebb-7d88-4e91-b580-b3a2d79fe668",
  "result": "'None'",
  "clock": 371,
  "routing_key": null,
  "retries": 0,
  "failed": false,
  "state": "SUCCESS",
  "kwargs": "{}",
  "sent": false,
  "expires": null,
  "exchange": null,
  "started": 1466248131.745754,
  "timestamp": 1466248131.837694,
  "args": "[1, 2]",
  "worker": "celery@space-Vostro-3800",
  "revoked": false,
  "received": 1466248131.744577,
  "exception": null,
  "name": "tasks.add",
  "succeeded": 1466248131.837694,
  "traceback": null,
  "eta": null,
  "retried": false,
  "runtime": 0.09263942600227892
}
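
From Java, the same status check is a plain GET. A small sketch (again using Java 11's HttpClient, and crudely matching the state string rather than pulling in a JSON library, since flower's exact output formatting is assumed here) could poll until the task finishes:

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class FlowerTaskStatus {
    public static void main(String[] args) throws Exception {
        String taskId = "81775ebb-7d88-4e91-b580-b3a2d79fe668"; // id returned by async-apply
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://localhost:5555/api/task/info/" + taskId))
                .GET()
                .build();
        // Poll until the task reaches a terminal state
        while (true) {
            String body = client.send(request, HttpResponse.BodyHandlers.ofString()).body();
            if (body.contains("\"state\": \"SUCCESS\"") || body.contains("\"state\": \"FAILURE\"")) {
                System.out.println(body);
                break;
            }
            Thread.sleep(500); // wait before asking again
        }
    }
}

The listing endpoint in the next step (/api/tasks) can be called from Java with exactly the same GET pattern.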

3. Listing all the tasks sent to workers

curl -X GET http://localhost:5555/api/tasks

Output:
{
  "81775ebb-7d88-4e91-b580-b3a2d79fe668": {
    "received": 1466248131.744577,
    "revoked": false,
    "name": "tasks.add",
    "succeeded": 1466248131.837694,
    "clock": 371,
    "started": 1466248131.745754,
    "timestamp": 1466248131.837694,
    "args": "[1, 2]",
    "retries": 0,
    "failed": false,
    "state": "SUCCESS",
    "result": "'None'",
    "retried": false,
    "kwargs": "{}",
    "runtime": 0.09263942600227892,
    "sent": false,
    "uuid": "81775ebb-7d88-4e91-b580-b3a2d79fe668"
  },
  "50c589e1-b613-496f-af1e-c94c04b163dc": {
    "received": 1466248086.289584,
    "revoked": false,
    "name": "tasks.add",
    "succeeded": 1466248086.339701,
    "clock": 313,
    "started": 1466248086.291148,
    "timestamp": 1466248086.339701,
    "args": "[4, 3]",
    "retries": 0,
    "failed": false,
    "state": "SUCCESS",
    "result": "'None'",
    "retried": false,
    "kwargs": "{}",
    "runtime": 0.049509562999446644,
    "sent": false,
    "uuid": "50c589e1-b613-496f-af1e-c94c04b163dc"
  }
}

4. Terminating a task
curl -X POST -d 'terminate=True' http://localhost:5555/api/task/revoke/81775ebb-7d88-4e91-b580-b3a2d79fe668
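
And a matching Java sketch for revoking a task (here the body is form-encoded rather than JSON, mirroring the curl call above):

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class FlowerTaskRevoker {
    public static void main(String[] args) throws Exception {
        String taskId = "81775ebb-7d88-4e91-b580-b3a2d79fe668"; // id of the task to terminate
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://localhost:5555/api/task/revoke/" + taskId))
                .header("Content-Type", "application/x-www-form-urlencoded")
                .POST(HttpRequest.BodyPublishers.ofString("terminate=True"))
                .build();
        System.out.println(client.send(request, HttpResponse.BodyHandlers.ofString()).body());
    }
}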

References:
https://pypi.python.org/pypi/flower
http://flower.readthedocs.io/en/latest/api.html
http://nbviewer.jupyter.org/github/mher/flower/blob/master/docs/api.ipynb

Error: java.lang.RuntimeException: Can't parse input data: 'NULL'

When we execute a Sqoop export action via Oozie (moving data from HDFS to MySQL) using the <command> tag, we may come across the error java.lang.RuntimeException: Can't parse input data: 'NULL'. The same Sqoop export command, which ran successfully from the terminal, may raise this issue when combined with Oozie.

The fix for the above issue is to remove all the single/double quotes from the command. For example, instead of --input-null-string 'NULL', we need to use --input-null-string NULL.

A working command is added below:

<command>export --connect jdbc:mysql://localhost/demo --username root --password mysql123 --table calc_match_out --input-null-string NULL --input-null-non-string NULL -m 1 --input-fields-terminated-by ; --input-lines-terminated-by \n --export-dir /user/ambari-qa/output/output-calc/part-r-00000</command>

The reason why this fix works can be understood from the following excerpt taken from the blog http://hadooped.blogspot.in/:

Sqoop command:
The Sqoop command can be specified either using the command element or multiple arg elements.
- When using the command element, Oozie will split the command on every space into multiple arguments.
- When using the arg elements, Oozie will pass each argument value as an argument to Sqoop. The arg variant should be used when there are spaces within a single argument.
- All the above elements can be parameterized (templatized) using EL expressions.

Also, I think the following Apache documentation on Sqoop is highly relevant for beginners in the Sqoop domain:

Sqoop features

- The sqoop action runs a Sqoop job synchronously.
- The information to be included in the Oozie sqoop action is the job-tracker, the name-node and Sqoop command or arg elements, as well as configuration.
- A prepare node can be included to do any prep work, including HDFS actions. This will be executed prior to execution of the Sqoop job.
- Sqoop configuration can be specified with a file, using the job-xml element, and inline, using the configuration elements.
- Oozie EL expressions can be used in the inline configuration. Property values specified in the configuration element override values specified in the job-xml file.
- Note that the Hadoop mapred.job.tracker and fs.default.name properties must not be present in the inline configuration. As with Hadoop map-reduce jobs, it is possible to add files and archives in order to make them available to the Sqoop job.


Importing CSV files into MySQL tables

For importing a CSV file into a MySQL table, we need to execute the following command:

LOAD DATA INFILE '/tmp/t.csv' INTO TABLE table1 FIELDS TERMINATED BY ',' ENCLOSED BY '"' LINES TERMINATED BY '\n';

Sometimes we may face file-not-found exceptions and errors, which may persist even after granting full permissions. I got an error saying the t.csv file was missing even though I had placed it at the correct location. The fix for this issue is to change the ownership of t.csv to mysql using the following command:

chown mysql:mysql t.csv

DATE fields and CSV imports

I had a table table2:

CREATE TABLE table2 (ID INT, title INT NULL, dt DATETIME NULL);

t.csv:

1, 3,10/12/2000
1000, 1223,12/12/2014

And when I executed the previous LOAD command, I got incorrect values in the MySQL table table2, which were as follows:

+------+-------+---------------------+
| ID   | title | dt                  |
+------+-------+---------------------+
|    1 |     3 | 0000-00-00 00:00:00 |
| 1000 |  1223 | 0000-00-00 00:00:00 |
+------+-------+---------------------+

In this case, we need to transform the date field while loading it into the database. This can be done using the following command:

LOAD DATA INFILE '/tmp/t.csv' INTO TABLE table2 FIELDS TERMINATED BY ',' ENCLOSED BY '"' LINES TERMINATED BY '\n' (ID, title, @var1) SET dt = STR_TO_DATE(@var1, '%m/%d/%Y');

+------+-------+------------+
| ID   | title | dt         |
+------+-------+------------+
|    1 |     3 | 2000-10-12 |
| 1000 |  1223 | 2014-12-12 |
+------+-------+------------+

And if there are multiple fields to be transformed, you can specify them in a comma-separated way, like:

LOAD DATA LOCAL INFILE '/path/to/csv/file.csv' 
INTO TABLE mytable 
LINES TERMINATED BY '\n'
(id, task, hoursWorked, @var1, @var2) 
SET begindate = STR_TO_DATE(@var1, '%m/%d/%Y'),     
enddate = STR_TO_DATE(@var2, '%m/%d/%Y');
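
If the import has to happen from a Java application instead of the mysql client, a rough JDBC sketch of the same statement might look like the following. The connection URL, credentials and table layout are illustrative; note that MySQL Connector/J needs allowLoadLocalInfile=true for LOAD DATA LOCAL to work:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class CsvLoader {
    public static void main(String[] args) throws Exception {
        // allowLoadLocalInfile=true lets Connector/J send the local file to the server
        String url = "jdbc:mysql://localhost/demo?allowLoadLocalInfile=true";
        try (Connection conn = DriverManager.getConnection(url, "root", "mysql123");
             Statement stmt = conn.createStatement()) {
            stmt.execute(
                "LOAD DATA LOCAL INFILE '/path/to/csv/file.csv' " +
                "INTO TABLE mytable " +
                "LINES TERMINATED BY '\\n' " +
                "(id, task, hoursWorked, @var1, @var2) " +
                "SET begindate = STR_TO_DATE(@var1, '%m/%d/%Y'), " +
                "    enddate = STR_TO_DATE(@var2, '%m/%d/%Y')");
        }
    }
}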

AttributeError: 'module' object has no attribute 'monkey_patch'

If you ever come across the above error while using eventlet as the worker pool, please be informed that it is most likely an error related to the eventlet installation. Please run the command below to fix the issue.

apt-get install python-eventlet

I would strongly recommend using this in a virtual Python environment (inside a virtualenv, the equivalent would be pip install eventlet).