Sample AWS CodeDeploy configuration for Django

AWS has its own continuous deployment tool called CodeDeploy. With a single command you can deploy to multiple servers whenever you want to change something in your code base.

Installing the CodeDeploy agent on the instance

If the CodeDeploy agent is not installed on your instance, you need to install it first:

sudo yum install -y ruby wget
cd /opt
wget https://aws-codedeploy-ap-south-1.s3.amazonaws.com/latest/install
chmod +x ./install
sudo ./install auto

Create CodeDeploy Application

You have to create a CodeDeploy application with the deployment type set to in-place deployment and the deployment configuration set to CodeDeployDefault.OneAtATime.
Under the EC2 configuration, choose Amazon EC2 instances and give the tag a name, say Code deploy instance. Then add the same tag to all of the instances you want CodeDeploy to deploy to.
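
If you prefer tagging the instances from code rather than the console, a minimal boto3 sketch could look like this (the region, instance ID, and tag value are placeholders for your own setup):

import boto3

# assumes AWS credentials are already configured locally
ec2 = boto3.client('ec2', region_name='ap-south-1')

# apply the same tag that the CodeDeploy deployment group filters on
ec2.create_tags(
    Resources=['i-0123456789abcdef0'],  # hypothetical instance ID
    Tags=[{'Key': 'Name', 'Value': 'Code deploy instance'}]
)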

Set IAM Permissions

Now that we are done with the agent installation, we need to set up IAM permissions.
First create an IAM group called CodeDeployGroup. This group needs the AmazonS3FullAccess and AWSCodeDeployFullAccess permissions. Create a user and add it to this group; this user only needs programmatic access. Save the access key ID and secret access key somewhere safe.

Create a role whose trusted entity is ec2.amazonaws.com and whose attached policies are AWSCodeDeployRole and AmazonS3FullAccess.

Edit the trust relationship to the following:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": [
            "ec2.amazonaws.com",
            "codedeploy.ap-south-1.amazonaws.com"
        ]
      },
      "Action": "sts:AssumeRole"
    }
  ]
}

Create a new S3 bucket that the previously created IAM user and role can access.
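
If you would rather create the bucket from code, a boto3 sketch might look like this (the bucket name is a placeholder; any globally unique name works):

import boto3

# assumes the IAM user created above is configured locally
s3 = boto3.client('s3', region_name='ap-south-1')
s3.create_bucket(
    Bucket='my-codedeploy-artifacts',  # hypothetical bucket name
    CreateBucketConfiguration={'LocationConstraint': 'ap-south-1'}
)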

CodeDeploy configuration

My codebase structure looks something like this:

- src
  - <django project>
- scripts
  - install_dependencies
  - start_server
  - stop_server
- appspec.yml
- codedeploy_deploy.py
- deploy.sh

appspec.yml is the file that contains our hooks and configuration for CodeDeploy:

version: 0.0
os: linux
files:
  - source: src
    destination: /home/centos/proj_name
hooks:
  BeforeInstall:
    - location: scripts/install_dependencies
      timeout: 300
      runas: root
  ApplicationStop:
    - location: scripts/stop_server
      timeout: 300
      runas: root
  ApplicationStart:
    - location: scripts/start_server
      timeout: 300
      runas: root

For Django, scripts/install_dependencies may look like the following:

sudo yum install -y gcc openssl-devel bzip2-devel wget
sudo yum install -y make git
cd /opt
command -v python3.6 || {
    wget https://www.python.org/ftp/python/3.6.3/Python-3.6.3.tgz
    tar xzf Python-3.6.3.tgz
    cd Python-3.6.3
    sudo ./configure --enable-optimizations
    sudo make altinstall
}
sudo yum install -y mysql-devel

For scripts/start_server I have the following:

cd /home/centos/evaly
pip3.6 install -r requirements.txt
nohup uwsgi --http :80 --module evaly.wsgi > /dev/null 2>&1 &

For scripts/stop_server I have the following:

pkill uwsgi

I have borrowed a Python script from the Bitbucket team, which looks like the following:

# Copyright 2016 Amazon.com, Inc. or its affiliates. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License"). You may not use this file
# except in compliance with the License. A copy of the License is located at
#
#     http://aws.amazon.com/apache2.0/
#
# or in the "license" file accompanying this file. This file is distributed on an "AS IS"
# BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
# License for the specific language governing permissions and limitations under the License.
"""
A BitBucket Builds template for deploying an application revision to AWS CodeDeploy
narshiva@amazon.com
v1.0.0
"""
from __future__ import print_function
import os
import sys
from time import strftime, sleep
import boto3
from botocore.exceptions import ClientError

VERSION_LABEL = strftime("%Y%m%d%H%M%S")
BUCKET_KEY = os.getenv('APPLICATION_NAME') + '/' + VERSION_LABEL + \
    '-bitbucket_builds.zip'

def upload_to_s3(artifact):
    """
    Uploads an artifact to Amazon S3
    """
    try:
        client = boto3.client('s3')
    except ClientError as err:
        print("Failed to create boto3 client.\n" + str(err))
        return False
    try:
        client.put_object(
            Body=open(artifact, 'rb'),
            Bucket=os.getenv('S3_BUCKET'),
            Key=BUCKET_KEY
        )
    except ClientError as err:
        print("Failed to upload artifact to S3.\n" + str(err))
        return False
    except IOError as err:
        print("Failed to access artifact.zip in this directory.\n" + str(err))
        return False
    return True

def deploy_new_revision():
    """
    Deploy a new application revision to AWS CodeDeploy Deployment Group
    """
    try:
        client = boto3.client('codedeploy')
    except ClientError as err:
        print("Failed to create boto3 client.\n" + str(err))
        return False

    try:
        response = client.create_deployment(
            applicationName=str(os.getenv('APPLICATION_NAME')),
            deploymentGroupName=str(os.getenv('DEPLOYMENT_GROUP_NAME')),
            revision={
                'revisionType': 'S3',
                's3Location': {
                    'bucket': os.getenv('S3_BUCKET'),
                    'key': BUCKET_KEY,
                    'bundleType': 'zip'
                }
            },
            deploymentConfigName=str(os.getenv('DEPLOYMENT_CONFIG')),
            description='New deployment from BitBucket',
            ignoreApplicationStopFailures=True
        )
    except ClientError as err:
        print("Failed to deploy application revision.\n" + str(err))
        return False     
           
    """
    Wait for deployment to complete
    """
    while 1:
        try:
            deploymentResponse = client.get_deployment(
                deploymentId=str(response['deploymentId'])
            )
            deploymentStatus=deploymentResponse['deploymentInfo']['status']
            if deploymentStatus == 'Succeeded':
                print ("Deployment Succeeded")
                return True
            elif (deploymentStatus == 'Failed') or (deploymentStatus == 'Stopped') :
                print ("Deployment Failed")
                return False
            elif (deploymentStatus == 'InProgress') or (deploymentStatus == 'Queued') or (deploymentStatus == 'Created'):
                sleep(10)  # wait before polling again instead of busy-looping
                continue
        except ClientError as err:
            print("Failed to deploy application revision.\n" + str(err))
            return False      
    return True

def main():
    if not upload_to_s3('/Users/sadafnoor/Projects/evaly/artifact.zip'):
        sys.exit(1)
    if not deploy_new_revision():
        sys.exit(1)

if __name__ == "__main__":
    main()

I have written a script that zips up my source code so that the Python script above can upload the archive to S3; eventually all my EC2 instances will download that zip from S3.

export APPLICATION_NAME="CodeDeployApplicationName" 
export AWS_ACCESS_KEY_ID="IAMUserKeyId"
export AWS_DEFAULT_REGION="ap-south-1"

export AWS_SECRET_ACCESS_KEY="IAMUserSecretKey"
export DEPLOYMENT_CONFIG="CodeDeployDefault.OneAtATime"

export DEPLOYMENT_GROUP_NAME="CodeDeployDeploymentGroup"
export S3_BUCKET="S3BucketName"
zip -r ../artifact.zip src/* appspec.yml scripts/*
python codedeploy_deploy.py

Running a Python/Django app containerised in a Docker image, pushed to a private repo, on top of a Kubernetes cluster

Recently I was looking for a more flexible way to ship our code to production, and Docker and Kubernetes are the sweethearts of DevOps engineers. Docker lets us containerise our app in an image and then run that image in production. Most of us have some experience with virtual machines, and it is easy to confuse the two, which can stop you from appreciating what Docker does. A virtual machine runs separately on top of your computer's hypervisor; Docker, on the other hand, adds another layer of abstraction on top of your OS. It shares what your images have in common with the host OS and adds only a layer containing the differences, so we can run multiple Linux images on one machine without doubling the cost. It optimises, it is intelligent, and it saves us money.

When we run a cluster of n nodes, the complexity grows quickly. What if something breaks in a Docker container somewhere in a cluster of n nodes? How do we ensure which container runs on which node? How do we move a container to another node because its node is going down for maintenance? We need a manager who takes care of all this, don't we? Kubernetes comes along and takes that responsibility.

First of all, let's set up Kubernetes.

cat <<EOF > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://packages.cloud.google.com/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://packages.cloud.google.com/yum/doc/yum-key.gpg https://packages.cloud.google.com/yum/doc/rpm-package-key.gpg
exclude=kube*
EOF

We turn SELinux off (permissive mode), as the documentation instructs:

setenforce 0
sed -i 's/^SELINUX=enforcing$/SELINUX=permissive/' /etc/selinux/config

Now we install Docker and the Kubernetes components: kubelet, kubeadm, and kubectl.

yum install -y docker kubelet kubeadm kubectl --disableexcludes=kubernetes

We need to enable these services so they start as soon as the machine boots:

systemctl enable kubelet && systemctl start kubelet
systemctl enable docker && systemctl start docker

Now that we have Kubernetes and Docker running on the master and slave nodes, we need to change one or two configuration settings on the master for a safe initial launch.

vi /var/lib/kubelet/kubeadm-flags.env

KUBELET_KUBEADM_ARGS=--cgroup-driver=systemd
#--network-plugin=cni

We initialize the cluster on the master:

kubeadm init

For future token creation:

sudo kubeadm token create --print-join-command 

It generates a join command with a token, which you need to copy and paste on your slave node. Before that, though, you need to put the kubeconfig file in the proper directory with the proper permissions:

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

On the slave server:

kubeadm join 10.0.15.10:6443 --token vzau5v.vjiqyxq26lzsf28e --discovery-token-ca-cert-hash sha256:e6d046ba34ee03e7d55e1f5ac6d2de09fd6d7e6959d16782ef0778794b94c61e

If you get something similar to this:

I0706 07:18:56.609843    1084 kernel_validator.go:96] Validating kernel config
	[WARNING RequiredIPVSKernelModulesAvailable]: the IPVS proxier will not be used, because the following required kernel modules are not loaded: [ip_vs_sh ip_vs ip_vs_rr ip_vs_wrr] or no builtin kernel ipvs support: map[ip_vs_wrr:{} ip_vs_sh:{} nf_conntrack_ipv4:{} ip_vs:{} ip_vs_rr:{}]
you can solve this problem with following methods:
 1. Run 'modprobe -- ' to load missing kernel modules;
2. Provide the missing builtin kernel ipvs support

 Pulling images required for setting up a Kubernetes cluster

running the following will load the missing modules:

for i in ip_vs ip_vs_rr ip_vs_wrr ip_vs_sh nf_conntrack_ipv4; do modprobe $i; done

Now if you run the following on the master, you will see the nodes attached to your Kubernetes cluster.

sudo kubectl get nodes
sudo kubectl describe nodes

If the nodes are stuck in NotReady, it is usually because no pod network add-on is installed yet; applying flannel fixes that:

kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml

For this demonstration I will be deploying an existing Python/Django application on this cluster, so first of all I need to dockerize it.

Here is my Dockerfile:

# set the base image 
FROM python:3.7
# File Author / Maintainer
MAINTAINER Sadaf
#add project files to the usr/src/app folder

#set directory where CMD will execute
WORKDIR /usr/src/app
ADD app_name ./app_name
COPY /app_name/requirements.txt .
# Get pip to download and install requirements: --no-cache-dir 
RUN pip install -r requirements.txt
# Expose ports
EXPOSE 8000
# default command to execute
WORKDIR /usr/src/app/app_name
RUN chmod +x app_name/gunicorn.sh
CMD ./app_name/gunicorn.sh
#ENTRYPOINT ["/bin/bash", "app_name/gunicorn.sh"]

Now that we have a Dockerfile ready, let's build the image:

sudo docker build -t app_name_api_server .

Time to run that image and expose it on port 8000:

sudo docker run -p 8000:8000 -i -t app_name_api_server

If you like what you see on localhost:8000, congratulations, your app is working in Docker. Now let's push that image to Docker Hub; I have created a private repo there. To be able to push your image to Docker Hub you need to tag it first, then you can push it:

sudo docker tag app_name_api_server sadaf2605/app_name_api_server
sudo docker push sadaf2605/app_name_api_server

Now that you have your image pushed to Docker Hub, we go back to our Kubernetes master. Since the image we want to pull lives in a private repo, we naturally need some sort of credential to pull it.

DOCKER_REGISTRY_SERVER=docker.io
DOCKER_USER=Type your dockerhub username, same as when you `docker login`
DOCKER_EMAIL=Type your dockerhub email, same as when you `docker login`
DOCKER_PASSWORD=Type your dockerhub pw, same as when you `docker login`

kubectl create secret docker-registry myregistrykey \
  --docker-server=$DOCKER_REGISTRY_SERVER \
  --docker-username=$DOCKER_USER \
  --docker-password=$DOCKER_PASSWORD \
  --docker-email=$DOCKER_EMAIL

Now let's define a YAML file for our Kubernetes deployment, app_name.yaml.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: app-name-api-server
spec:
  selector:
    matchLabels:
      run: app-name-api-server
  replicas: 1
  template:
    metadata:
      labels:
        run: app-name-api-server
    spec:

      containers:
      - name: app-name-api-server
        image: index.docker.io/sadaf2605/app_name_api_server:latest
        imagePullPolicy: Always
        ports:
        - containerPort: 8000
          #hostPort: 8000
        env:
        - name: DB_USERNAME
          value: "user"
        - name: DB_PASSWORD
          value: "password"
        - name: DB_NAME
          value: "dbname"
        - name: DB_HOST
          value: "1.2.2.3"
      imagePullSecrets:
      - name: myregistrykey
      hostNetwork: true
      dnsPolicy: "None"
      dnsConfig:
        nameservers:
          - 8.8.8.8
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: kube-dns
  namespace: kube-system
data:
  upstreamNameservers: |
    ["8.8.8.8"]

Now it is time to create a deployment using that configuration:

sudo kubectl apply -f app_name.yaml

Let's check whether we have a deployment:

sudo kubectl get deployments

Let's check whether any instance of our Docker container is running:

sudo kubectl get pods

Now we create a service that lets us access these pods from outside the cluster:

sudo kubectl expose deployment app-name-api-server --type=LoadBalancer --name=app-name-api-server
sudo kubeadm upgrade plan --feature-gates CoreDNS=true

CentOS PostGIS setup: You need JSON-C for ST_GeomFromGeoJSON

I struggled with this error on our CentOS server all day; this is how I fixed it. I could not recover everything from my command-line history, but this is what I could reconstruct. Hopefully it will be helpful, and I am sure it will help me again when I clean up my server for production. So cheers!

It all started with this error message on my CentOS server:

npm-2 Unhandled rejection SequelizeDatabaseError: You need JSON-C for ST_GeomFromGeoJSON
npm-2     at Query.formatError (/home/centos/jobcue.com/node_modules/sequelize/lib/dialects/postgres/query.js:357:14)
npm-2     at null. (/home/centos/jobcue.com/node_modules/sequelize/lib/dialects/postgres/query.js:88:19)
npm-2     at emitOne (events.js:77:13)
npm-2     at emit (events.js:169:7)
npm-2     at Query.handleError (/home/centos/jobcue.com/node_modules/pg/lib/query.js:108:8)
npm-2     at null. (/home/centos/jobcue.com/node_modules/pg/lib/client.js:171:26)
npm-2     at emitOne (events.js:77:13)
npm-2     at emit (events.js:169:7)
npm-2     at Socket. (/home/centos/jobcue.com/node_modules/pg/lib/connection.js:109:12)
npm-2     at emitOne (events.js:77:13)
npm-2     at Socket.emit (events.js:169:7)
npm-2     at readableAddChunk (_stream_readable.js:153:18)
npm-2     at Socket.Readable.push (_stream_readable.js:111:10)
npm-2     at TCP.onread (net.js:531:20)

When I tried to install json-c on the server, I got:

sudo yum install json-c
Loaded plugins: fastestmirror
Loading mirror speeds from cached hostfile
 * base: centos.eecs.wsu.edu
 * epel: s3-mirror-us-west-2.fedoraproject.org
 * extras: linux.mirrors.es.net
 * updates: mirror.raystedman.net
Package json-c-0.11-4.el7_0.x86_64 already installed and latest version
Nothing to do

Then I started panicking. After 5-6 hours of battling with yum, I figured out a solution that looks like the following.

Install some dependencies first:

yum install geos-devel.x86_64
yum install proj-devel.x86_64
yum install gdal-devel.x86_64
yum install libxml2-devel.x86_64
yum install json-c-devel.x86_64

yum install postgresql92-devel
sudo yum install postgresql-server 

sudo yum install geos geos-devel
wget http://download.osgeo.org/proj/proj-4.8.0.tar.gz
gzip -d proj-4.8.0.tar.gz
tar -xvf proj-4.8.0.tar
cd proj-4.8.0
./configure
make
sudo make install

I needed to install GDAL:

sudo rpm -Uvh http://elgis.argeo.org/repos/6/elgis-release-6-6_0.noarch.rpm
sudo yum install -y gdal
./configure
make
make install

Obviously I needed to install json-c:

sudo yum install json-c-devel

I needed to know where it was located:

rpm -ql json-c json-c-devel

For me it was at:

/usr/include/*

Now it is time to build PostGIS, like this:

wget http://download.osgeo.org/postgis/source/postgis-2.2.1.tar.gz
tar xvzf postgis-2.2.1.tar.gz
cd postgis-2.2.1
./configure --with-jsonc=/usr/include

make
make install
sudo make install

Install PostGIS on your Ubuntu 15.10

It took me quite a lot of time to figure out how to do this! It looks like just three apt-get install commands, so simple, right? But it took quite a toll on me to figure those three out. If you are reading this, you are probably going through something similar!

>sudo apt-get install postgresql postgresql-contrib postgis
> psql --version
psql (PostgreSQL) 9.5.2

> sudo psql --username=postgres --dbname=jobcue -c "CREATE EXTENSION postgis;"
Password for user postgres:
ERROR: could not open extension control file "/usr/share/postgresql/9.5/extension/postgis.control": No such file or directory

>sudo apt-get install postgresql-9.5-postgis-2.0

> sudo psql --username=postgres --dbname=jobcue -c "CREATE EXTENSION postgis;"
Password for user postgres:
ERROR: could not open extension control file "/usr/share/postgresql/9.5/extension/postgis.control": No such file or directory

> sudo apt-get install postgresql-9.5-postgis-scripts
> sudo psql --username=postgres --dbname=jobcue -c "CREATE EXTENSION postgis;"
Password for user postgres:
CREATE EXTENSION

Apparently your Ubuntu can do a blue screen too! Here is how to fix the blue screen in Ubuntu

I did not know that one could get a blue screen in Ubuntu as well. If it happens to you, don't panic; this is how to fix the blue screen in Ubuntu. It probably happened because of a problem while upgrading the distro.


$ sudo su
$ sudo apt-get install libgdk-pixbuf2.0-dev
$ cd /usr/lib/x86_64-linux-gnu/gdk-pixbuf-2.0/
$ find 2.10.0/loaders/ > ~/pixbuf-files
$ nano ~/pixbuf-files
# and delete 1st line 2.10.0/loaders/

$ cat ~/pixbuf-files | xargs -n1 gdk-pixbuf-query-loaders > 2.10.0/loaders.cache
$ reboot

RabbitMQ & Celery demo using an image processing app on Flask

Celery is a distributed system for processing messages on a task queue, with a focus on real-time processing and support for task scheduling. When we have to run an expensive function that keeps the user waiting for what feels like forever, it is always better to hand it off to something like Celery. In this blog we will write a face detection web app using Flask, python-opencv, and Celery.

Before I explain anything, let me share a Flask code snippet with you:

from time import sleep
from flask import Flask

app = Flask(__name__)

@app.route("/")
def hello():
    sleep(10)  # <--- what would you see during these 10 seconds?
    return "Hello World!"

Can you tell me what the user would see in the first 10 seconds while we run our Flask app? I know the answer: before getting the response, the user is kept waiting for 10 seconds, and nobody loves to wait that long. We are impatient; we want everything instantly. That is the expectation modern computing has created. But life is cruel and we can't deliver everything instantly. We understand that, but our users do not understand this simple truth. So we try to at least sell them the feeling that the app responds instantly, or at the very least is not taking forever to load. That means we need to get out from behind that blocking sleep, and how we do that is what I am going to discuss in this blog, with a real-life image processing app in Flask.

Obviously in real life we don't write "sleep" to make our code run slower; we write plenty of functions that do that for us. In this blog we will build an application that lets the user upload a picture and detects the faces in it for them. The expensive piece here is the face detection function: it takes about 3-10 seconds on my machine to detect the face of my favourite actress. Let me share my code:

#server.py

__author__ = 'sadaf2605'


import os
from flask import Flask, request, redirect, url_for
from werkzeug import secure_filename

import face_detect
from os.path import basename


UPLOAD_FOLDER = '/home/sadaf2605/flask_celery_upload_image/uploads'
ALLOWED_EXTENSIONS = set(['txt', 'pdf', 'png', 'jpg', 'jpeg', 'gif'])

app = Flask(__name__)
app.config['UPLOAD_FOLDER'] = UPLOAD_FOLDER


def allowed_file(filename):
    return '.' in filename and \
           filename.rsplit('.', 1)[1] in ALLOWED_EXTENSIONS

@app.route('/', methods=['GET', 'POST'])
def upload_file():
    if request.method == 'POST':
        import time
        start_time = time.time()
        file = request.files['file']

        if file and allowed_file(file.filename):
            filename = secure_filename(file.filename)

            base,ext=os.path.splitext(filename)


            file.save(os.path.join(app.config['UPLOAD_FOLDER'], filename))
            face_detect.detect(os.path.join(app.config['UPLOAD_FOLDER'], filename),os.path.join(app.config['UPLOAD_FOLDER'], base+"-face"+ext))

            print "--- %s seconds ---" % str (time.time() - start_time)
            return redirect("/")
            return redirect(url_for('uploaded_file',
                                    filename="facedetect-"+filename))

    from os import listdir
    from os.path import isfile, join
    htmlpic=""
    for f in sorted(listdir(UPLOAD_FOLDER)):
        if isfile(join(UPLOAD_FOLDER,f)):
            print f
            # show each uploaded image, served by the /uploads/<filename> route below
            htmlpic += """
            <p>
                <img src="/uploads/{0}" height="200">
            </p>
                """.format(f)

    return '''
    <!doctype html>
    <title>Upload new File</title>
    <h1>Upload new File</h1>
    <form action="" method="post" enctype="multipart/form-data">
      <p><input type="file" name="file">
         <input type="submit" value="Upload">
    </form>
    ''' + htmlpic


from flask import send_from_directory

@app.route('/uploads/<filename>')
def uploaded_file(filename):
    return send_from_directory(app.config['UPLOAD_FOLDER'], filename)


from werkzeug import SharedDataMiddleware
app.add_url_rule('/uploads/<filename>', 'uploaded_file', build_only=True)
app.wsgi_app = SharedDataMiddleware(app.wsgi_app, {
    '/uploads': app.config['UPLOAD_FOLDER']
})


if __name__ == "__main__":
    app.debug = True
    app.run()

#face_detect.py

import numpy as np
import cv2


face_cascade = cv2.CascadeClassifier('haarcascade_frontalface_default.xml')

def detect(src_img,dest_img):
    img = cv2.imread(src_img)
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

    faces = face_cascade.detectMultiScale(gray, 1.3, 1)
    for (x,y,w,h) in faces:
        cv2.rectangle(img,(x,y),(x+w,y+h),(255,0,0),5)
        roi_gray = gray[y:y+h, x:x+w]
        roi_color = img[y:y+h, x:x+w]


    cv2.imwrite(dest_img, img)

You can test this app by running:

python server.py

But we don't want our users to wait 10 seconds to see the next page, so we will use Celery and RabbitMQ to help us. First of all, let's install RabbitMQ and Celery.

To install RabbitMQ we will use aptitude, because it pulls in all the dependencies along the way. If you don't have aptitude installed:

sudo apt-get install aptitude

Now it's time to install the RabbitMQ server:

 sudo aptitude install rabbitmq-server

Now we will create a user and a vhost for RabbitMQ:

sudo rabbitmqctl add_user rabbit_user password
sudo rabbitmqctl add_vhost /app_rabbit

We will give our user permission to do everything on that vhost (the three ".*" patterns are the configure, write, and read permissions):

sudo rabbitmqctl set_permissions -p /app_rabbit rabbit_user ".*" ".*" ".*"

Now we need to restart the RabbitMQ server so that the changes take effect:

sudo /etc/init.d/rabbitmq-server stop
sudo /etc/init.d/rabbitmq-server start

Now we will install celery:

pip install celery

Now we need to configure Celery. Celery provides decorators like @app.task to achieve our goal, and RabbitMQ is Celery's default broker. Celery talks to the broker over a broker URL, on a different port from our web app. We want to enqueue our image processing task, so we define it in face_detect.py; it would arguably be better to configure Celery in server.py, since that is the entry point, but this will do for now.

import numpy as np
import cv2

from celery import Celery

app= Celery(broker='amqp://rabbit_user:password@localhost:5672//app_rabbit' )


face_cascade = cv2.CascadeClassifier('haarcascade_frontalface_default.xml')

@app.task
def detect(src_img,dest_img):
    img = cv2.imread(src_img)
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

    faces = face_cascade.detectMultiScale(gray, 1.3, 1)
    for (x,y,w,h) in faces:
        cv2.rectangle(img,(x,y),(x+w,y+h),(255,0,0),5)
        roi_gray = gray[y:y+h, x:x+w]
        roi_color = img[y:y+h, x:x+w]


    cv2.imwrite(dest_img, img)

This alone won't change your life radically, because we are not yet calling the task the way Celery intends. To put the task in the queue we have to use the decorated function's delay method, so we call face_detect.detect.delay(src_img, dest_img), and we need to keep a Celery worker running; otherwise the task just sits in the queue waiting for a worker to pick it up. The -A parameter of celery tells it which module the decorated task functions live in.
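
For clarity, here is a minimal sketch of the difference between calling the task directly and enqueuing it (the image paths are placeholders):

import face_detect

# synchronous call: runs in this process and blocks until the output image is written
face_detect.detect("uploads/sample.jpg", "uploads/sample-face.jpg")

# asynchronous call: only publishes a message to RabbitMQ and returns immediately;
# a running Celery worker picks the task up and does the actual work
result = face_detect.detect.delay("uploads/sample.jpg", "uploads/sample-face.jpg")
print(result.id)  # task id; no result backend is configured, so we don't call .get()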

To run the Celery worker:

celery worker -A face_detect -l INFO

So now, finally, we can change our server.py:

__author__ = 'sadaf2605'


import os
from flask import Flask, request, redirect, url_for
from werkzeug import secure_filename

import face_detect
from os.path import basename


UPLOAD_FOLDER = '/home/sadaf2605/flask_celery_upload_image/uploads'
ALLOWED_EXTENSIONS = set(['txt', 'pdf', 'png', 'jpg', 'jpeg', 'gif'])

app = Flask(__name__)
app.config['UPLOAD_FOLDER'] = UPLOAD_FOLDER


def allowed_file(filename):
    return '.' in filename and \
           filename.rsplit('.', 1)[1] in ALLOWED_EXTENSIONS

@app.route('/', methods=['GET', 'POST'])
def upload_file():
    if request.method == 'POST':
        import time
        start_time = time.time()
        file = request.files['file']

        if file and allowed_file(file.filename):
            filename = secure_filename(file.filename)

            base,ext=os.path.splitext(filename)


            file.save(os.path.join(app.config['UPLOAD_FOLDER'], filename))
            face_detect.detect.delay(os.path.join(app.config['UPLOAD_FOLDER'], filename),os.path.join(app.config['UPLOAD_FOLDER'], base+"-face"+ext))

            print "--- %s seconds ---" % str (time.time() - start_time)
            return redirect("/")
            return redirect(url_for('uploaded_file',
                                    filename="facedetect-"+filename))

    from os import listdir
    from os.path import isfile, join
    htmlpic=""
    for f in sorted(listdir(UPLOAD_FOLDER)):
        if isfile(join(UPLOAD_FOLDER,f)):
            print f
            # show each uploaded image, served by the /uploads/<filename> route below
            htmlpic += """
            <p>
                <img src="/uploads/{0}" height="200">
            </p>
                """.format(f)

    return '''
    <!doctype html>
    <title>Upload new File</title>
    <h1>Upload new File</h1>
    <form action="" method="post" enctype="multipart/form-data">
      <p><input type="file" name="file">
         <input type="submit" value="Upload">
    </form>
    ''' + htmlpic


from flask import send_from_directory

@app.route('/uploads/<filename>')
def uploaded_file(filename):
    return send_from_directory(app.config['UPLOAD_FOLDER'], filename)


from werkzeug import SharedDataMiddleware
app.add_url_rule('/uploads/<filename>', 'uploaded_file', build_only=True)
app.wsgi_app = SharedDataMiddleware(app.wsgi_app, {
    '/uploads': app.config['UPLOAD_FOLDER']
})


if __name__ == "__main__":
    app.debug = True
    app.run()

After uploading, the processed picture won't show up on the front page right away; you will need to refresh a couple of times, and after 5-6 seconds it appears. Keep on refreshing, and you may like to send me a pull request at: https://github.com/sadaf2605/facedetection-flaskwebapp-rabbitmq
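
If you want something smarter than blind refreshing, a rough sketch (not part of the repo) of a status endpoint the page could poll might look like this; it reuses the "-face" naming convention from upload_file(), and the route name is hypothetical:

import os
from flask import jsonify

@app.route('/status/<filename>')
def processing_status(filename):
    # report whether the Celery worker has written the processed "-face" image yet
    base, ext = os.path.splitext(filename)
    done = os.path.isfile(os.path.join(app.config['UPLOAD_FOLDER'], base + "-face" + ext))
    return jsonify({"processed": done})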

Preparing DatabaseCleaner with RSpec, Capybara, and Selenium for testing a Rails app

database_cleaner is a beautiful gem that cleans the database. We can use it to clean our database for each test case. I have been banging my head for hours to make it work, even though I followed the right documentation, so I thought maybe I would look "COOL" if I wrote a blog describing the steps, and at the same time it could be helpful to others.

I would add at my Gemfile:

gem 'database_cleaner'

Then I would do a

bundle install

Now that the database_cleaner gem is installed, I want to add DatabaseCleaner to my RSpec configuration. It is always a good idea to keep this code separate, so I have created a dedicated file, 'support/database_cleaner.rb'. We need to set the database cleanup strategy: by default RSpec's get/post request-driven tests use Transaction, but Capybara/Selenium, which we use for testing JS, needs Truncation as its strategy, so we will need at least two different strategies. It is also good practice to clean up everything before the test suite runs, so that any data left in the database by mistake cannot mess with my test cases. So my 'support/database_cleaner.rb' looks like:

RSpec.configure do |config|
  #It runs before the entire test suite runs and it clears the test database. 
  config.before(:suite) do
    DatabaseCleaner.clean_with(:truncation)
  end
  
  # it sets the default database cleaning strategy to be transactions.
  config.before(:each) do
    DatabaseCleaner.strategy = :transaction
  end
  
  # Tests flagged with :js => true use Capybara's test server and fire up an actual
  # browser window via the Selenium backend. For these types of tests the transaction
  # strategy won't work, so we override the setting and choose the "truncation"
  # strategy instead.
  config.before(:each, :js => true) do
    DatabaseCleaner.strategy = :truncation
  end


# Now we need to start and end database cleaner.
  config.before(:each) do
    DatabaseCleaner.start
  end

  config.after(:each) do
    DatabaseCleaner.clean
  end

end


Now we need to fix a few things in `rails_helper.rb`.
We will add require 'support/database_cleaner'.

# This file is copied to spec/ when you run 'rails generate rspec:install'
ENV['RAILS_ENV'] ||= 'test'
require File.expand_path('../../config/environment', __FILE__)
# Prevent database truncation if the environment is production
abort("The Rails environment is running in production mode!") if Rails.env.production?
require 'rspec/rails'
require 'spec_helper'
require 'support/database_cleaner'

But it still won't work unless we change `rails_helper.rb` from:

config.use_transactional_fixtures = true

to:

config.use_transactional_fixtures = false

If you try to put `config.use_transactional_fixtures = false` in `spec_helper.rb`, you may need to make sure `rspec/rails` is required before `spec_helper` in `rails_helper.rb`:

require 'rspec/rails'
require 'spec_helper'

or else you may get:

spec_helper.rb:53:in `block in <top (requir
ed)>': undefined method `use_transactional_fixtures=' for #<RSpec::Core::Configuration:0x00000001d971b8>

How Ubuntu Server boots up: System V init vs Upstart

How many times did I boot my laptop today? Well, I booted it twice; usually I never leave the house, but shockingly today I did. Don't get too curious about where I went, because that's not the point. The point of curiosity is what happens when an Ubuntu server boots up, and that's what we will try to figure out in this blog. Unlike most other Linux distributions, Ubuntu uses a different startup process for services, known as Upstart. Since it is backward compatible, the difference goes unnoticed most of the time.

When an Ubuntu server starts up, the first thing it runs is the GRUB boot loader. GRUB lives at least partially in the boot code on the Master Boot Record (the first 512 bytes of the hard drive). It selects which Linux kernel the system will boot and which options to use when it boots.

When we look at /boot/grub/grub.cfg or /etc/default/grub, we see references to a program called update-grub. This is a helper program that automates updating the GRUB configuration file when new kernels are added; it executes a number of configuration scripts stored in /etc/grub.d. When we select a kernel to boot from the GRUB menu, GRUB loads the kernel into memory along with its initrd file (initial RAM disk). Under Ubuntu the initrd file is actually a gzipped cpio archive known as an initramfs file, for example initrd.img-2.6.32-14-generic-pae.

When a kernel boots, it needs to be able to at least mount the root file system so that it can access basic configuration files, kernel modules, and system binaries.

With the ever-growing range of hardware and supported file systems, it makes sense to load support for them only when necessary; it keeps the kernel smaller and more flexible.

The kernel needs access to certain files to mount the root file system. The initramfs file provides the essential kernel modules and system binaries the kernel needs to mount the root file system and complete the boot process, and GRUB provides the information about where the root file system is.

When the kernel boots, it extracts the initramfs into RAM and runs a script called init. This script creates some system mount points and mounts the actual root partition. Finally, after it has mounted the real root file system, its last task is to run the /sbin/init program on the root file system, which starts the next phase of the boot process.

The /sbin/init program is the parent process of every program running on the system. This process always has a PID of 1 and is responsible for starting the rest of the processes that make up a running Linux system.

UNIX-like operating systems have a few standard ways to initialize. Most well-known distributions use the System V init model, but Ubuntu Server has switched to a system known as Upstart. Ubuntu still keeps some features of System V init, such as runlevels and the /etc/rc?.d directories, for backward compatibility. The good thing about Upstart is that it manages everything under the hood.

In the System V init system, different system states are known as runlevels. When System V init starts, it reads the configuration file located at /etc/inittab and discovers its default runlevel. It then enters that runlevel and starts the processes configured to run at it. Runlevels are labeled with numbers ranging from 0 to 6. For instance, runlevel 0 is reserved for a halted system state: when we enter runlevel 0, the system shuts down all running processes, unmounts all file systems, and powers off. Likewise, runlevel 6 is reserved for rebooting the machine. Runlevel 1 is reserved for single-user mode, a state where only a single user can log in and only a few processes run, which comes in very handy for diagnosis. Even in the default GRUB menu you will notice a recovery mode option that boots you into runlevel 1.

Runlevels 2 through 5 are left for distributions and for us to define, so we can create our own runlevels. Traditionally in Linux distributions, one runlevel is allocated for the graphical desktop (e.g. runlevel 5 on Red Hat) and another for a system with no graphics (e.g. runlevel 3 on Red Hat). A user can also create their own runlevel; for instance, starting a system without network access could come in handy some time, so we could define a runlevel for that. In that case we pass an argument at the boot prompt to override the default runlevel with the desired one. Once the system is booted, we can change the current runlevel with the init command, for example sudo init 1.

The /etc/init.d directory contains the start-up scripts for all services across all runlevels. These scripts usually accept start and stop commands.

After the runlevel is chosen, init goes to /etc/rcS.d and runs each script that begins with an S in numerical order, with start as an argument. Finally, init is finished but stays running in the background, waiting for the runlevel to change.

Init scripts have a few drawbacks. For instance, if a service dies before completing its task, nothing automatically restarts it, so we need another tool to monitor whether the process succeeded. Init scripts are generally triggered either by a change in runlevel or when the system starts up, and otherwise are not executed. A perfect example is init scripts that depend on a network connection. On Ubuntu, the init script that establishes the network connection is called networking. Since scripts run in numeric order, any init scripts that depend on a network connection are named with a higher number than this script, to ensure they run after the networking script has run. Now imagine you boot up your system while your network cable is unplugged: in a System V init system, the networking init script runs and fails, and the scripts that depend on the network time out one by one.

Upstart was designed not only to address some of the shortcomings of the System V init process, but also to provide a more robust system for managing services. Upstart solves this problem because it is event driven: it constantly monitors the system for certain events, and when they occur, it can be configured to take action based on them. Sample events might be system start-up, system shutdown, the Ctrl-Alt-Del sequence being pressed, the runlevel changing, or an Upstart script starting or stopping.

Upstart does not completely replace System V init, nor the functionality of init, the /etc/inittab file, or runlevel changes; instead, more and more core functionality is being ported to Upstart scripts. The difference is that Upstart now starts and stops services when the runlevel changes. Upstart scripts are defined with either the script or the exec option. The exec option keeps track of the process's PID; the convention is to keep these PIDs in the /var/run/ directory. With the script option, Upstart treats the lines that follow as a shell script until it reaches the end script line. Upstart provides ways to check the status of Upstart jobs and to start and stop them, with the appropriately named status, start, and stop commands. For example, we can use sudo /etc/init.d/networking status, or the Ubuntu shorthand sudo service networking status. To disable an init script from starting at boot we use sudo update-rc.d -f servicename remove, and to enable one we use sudo update-rc.d servicename defaults. When we need to write our own script, we should start from the skeleton provided by Ubuntu at /etc/init.d/skeleton. Init scripts reside in /etc/init.d and have symlinks in the /etc/rc?.d directories, where ? is a number representing a runlevel. So when we create our own script, we need to choose the number in rc?.d wisely: the lower the value, the earlier it runs, and we need to be careful about dependencies.

In Ubuntu, services are managed in two ways: through init scripts or using xinetd. Xinetd is an updated, resource-efficient version of the classic inetd service. A service started by an init script at boot could sit idle for ages before it is ever accessed, wasting valuable server resources. Xinetd, on the other hand, listens on the ports its child services use; if a connection is made on one of those ports, xinetd spawns the corresponding service, and once the connection is finished, the service exits until it is needed again.

Thanks:
1. BRACU Ayesha Abed Library
2. Kyle Rankin and Benjamin Mako Hill
3. My boredom 😛

Ubuntu server directory structure: what and why

I am writing this blog from an Ubuntu desktop distro, but there are other flavours of Ubuntu: Ubuntu Cloud, Ubuntu Server, and so on. Today I am interested in the server version in particular. We may or may not know that Ubuntu has always been used as a server; Canonical, the company that develops Ubuntu, has been running it on their own servers from the beginning, so that has always been one of its goals. Later the desktop and server editions went their separate ways, for their own good.

When I installed my first Ubuntu LVM, I followed the settings blindly, copying what some random guy wrote on a blog or on YouTube. Today I want to shed some light on the Linux directories. The trick is simple: the better we understand what our directories do, the better we can secure them and the faster we can debug when something goes wrong.

The /bin directory holds core binaries like ps, ls, rm, mv, chmod, and df that administrators and regular users alike may need to use.

The /sbin directory contains system binaries that are crucial for mounting the rest of the system and recovering it if it can't boot. Programs like fsck, ifconfig, mkfs, route, and init live here; usually these commands are run by the root user.

The /lib directory contains the core system libraries the system needs to complete the boot process and to run the binaries under /bin and /sbin. All of the kernel's modules are also found here, under /lib/modules/.

/usr is intended to store all of the noncritical binaries and libraries of a UNIX-like operating system. If we keep only the core binaries and libraries in /bin, /sbin, and /lib, we can keep the root partition small. Outside of updates, /usr remains relatively static, so we could add an extra layer of security by mounting it read-only and remounting it read-write only when performing updates.

The /usr/bin directory is similar to /bin, but it stores the rest of the binaries on the system that aren't considered critical. Commands like man, gzip, and nano, and other binaries intended for use by both administrators and regular users, stay here.

The /usr/sbin directory is similar to /sbin, only it stores binaries for administrator use that aren't critical to booting. Commands like traceroute, chroot, and ntpdate, along with the majority of the daemons started by init scripts, such as web servers, NTP daemons, and mail servers, are stored here.

/usr/lib holds the program libraries that support the binaries under /usr/bin and /usr/sbin.

/usr/local is used for your own custom services, binaries, or scripts that you add to the system.

/opt and /usr/local share the same purpose: storing third-party programs. Programs installed under /opt are usually installed in their own directory (such as, say,
/opt/someprogram) and then create their own bin, sbin, and lib directories under there. On one hand this means we can remove a program by removing its directory under /opt; on the other hand it lets our PATH environment variable grow rapidly.

The /boot directory stores kernel images, initramfs files, and the GRUB configuration files. It exists so that we can potentially separate it out into its own small mount point: when we want to experiment with LVM or an experimental file system for our root partition, we can format a separate /boot partition as ext2 or ext3 with no software RAID or LVM configured and be sure that our boot loader can read it.

/etc stores all the configuration files for the system and its services; for example, system startup scripts are stored in /etc/init and /etc/init.d. Keeping /etc separate makes it easy to back up, and it also makes recovery easier.

The /var directory was separated out to store files and directories that can vary in size and change frequently.

/var/log stores the system logs. Logs have a tendency to grow out of control when the system is under load or has a problem, and when they do, they fill up all the space on the system and can cause a crash.

The /var/spool directory contains subdirectories that store user crontabs, printer spools, and mail spools. On a mail server this directory becomes very important, and like everything else under /var it can grow quite large if the server spools a lot of mail for delivery.

The /var/www directory won't exist on all systems, but if a web server is run, it is the default place for the web server's docroot and cgi-bin directories.

/home holds all of the personal files for the user accounts on the machine, and it can also grow quite large. The advantage of separating this directory is that when we decide to change distributions, we can install the new system on the root partition, overwriting what was there, while preserving all the user settings here.

The /dev directory contains all of the device files of the system: disk devices, keyboards, mice, and any other devices the system detects. On classic Linux systems this directory is pretty static, but on a modern Ubuntu server device files are often created on the fly by the udev program as devices are added or modules are loaded.

The /media directory exists for removable devices such as floppy disks, CD-ROMs, and USB drives, so that /mnt can be kept for temporary mount points.

/proc is not an actual area on disk but a virtual file system under Linux; it exists in RAM. Every process gets a directory under /proc, corresponding to its PID, full of live information, and /proc also holds virtual files related to the kernel itself. We can query such things as the options passed to the kernel at boot (/proc/cmdline) or view settings for different kernel devices. A number of the kernel's /proc files can be used not only to read settings but also to set them, by writing to the corresponding file.
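
As a tiny illustration (in Python, just to show the idea; /proc entries are plain files readable with any tool):

# read the options that were passed to the kernel at boot
with open('/proc/cmdline') as f:
    print(f.read().strip())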

/sys directory is also a virtual file system. The files within /sys provide information about devices and drivers on your system, and the file system was created in part so that these sorts of files would no longer clutter up /proc. As with /proc, not only do many of the files within /sys provide information about parts of the system, but we can also write to various files to change settings on the fly.

/tmp stores temporary files that don't have to persist after a reboot. A common problem, though, is that a program might store far too much data in /tmp, and if it is part of the root partition, the entire root partition can fill up. A common culprit is vi: when you open a file with vi, a temporary copy of the file is placed in /tmp. I can't count the number of times I've been paged because a server's root partition was full, only to find out that a user had used vi to open a 500 MB log file and dumped an equivalently sized temporary file under /tmp. I then had to kill the vi session and instruct the user to use a program such as less or more to view large text files.

Thanks:
1. BRACU Ayesha Abed Library
2. Kyle Rankin and Benjamin Mako Hill
3. My boredom 😛

SSL (HTTPS) from Python Flask

In this blog I am going to share a code snippet and describe the other, less successful things I tried along the way. Don't miss my point: when I share my ordeal, it is not to prove myself stupid but to save your time by telling you what did not work over the last couple of hours.

Using Flask I was serving static files migrated directly from Django; while working with Django we had figured out that we didn't actually need a tool as powerful as Django and could use something lightweight like Flask instead, and the switch is easy since both use Jinja as the default front-end templating engine. After this shift I ran into a bit of trouble with HTTPS: when I tried HTTPS the page came back completely blank, and I lost my mind wondering what the hell was going on. Then I realized that with Flask I probably need to define the SSL context myself. They have a nice snippet on their website (http://flask.pocoo.org/snippets/111/); I followed it, and it did not work. Why wouldn't it work? After a couple of tries and Google searches I realized that this is the old-school way of doing it. Fine, I need to go to the new school, obviously! The new-style context looks prettier: ("cert.crt", "key.key"). I was impressed, but what the hell, it did not work either. After hours of fighting I got this error:

Traceback (most recent call last):
  File "/usr/lib/python2.7/threading.py", line 810, in __bootstrap_inner
    self.run()
  File "/usr/lib/python2.7/threading.py", line 763, in run
    self.__target(*self.__args, **self.__kwargs)
  File "/usr/local/lib/python2.7/dist-packages/werkzeug/serving.py", line 602, in inner
    passthrough_errors, ssl_context).serve_forever()
  File "/usr/local/lib/python2.7/dist-packages/werkzeug/serving.py", line 506, in make_server
    passthrough_errors, ssl_context)
  File "/usr/local/lib/python2.7/dist-packages/werkzeug/serving.py", line 450, in __init__
    self.socket = ssl_context.wrap_socket(self.socket,
AttributeError: 'OpenSSL.SSL.Context' object has no attribute 'wrap_socket'

It is because I was using a Python 2.7 version older than 2.7.9.

What else did I try? You don't want to know. I tried to install pyOpenSSL on Heroku using pip, but it looks like it is a ported version and it failed to compile on Heroku. Now I will write about what I had to do to make it work.

I had to make my certificate and key:

openssl genrsa -des3 -passout pass:x -out server.pass.key 2048
openssl rsa -passin pass:x -in server.pass.key -out server.key
rm server.pass.key
openssl req -new -key server.key -out server.csr
openssl x509 -req -days 365 -in server.csr -signkey server.key -out server.crt

So now we have server.key and server.crt in our directory (plus the server.csr used to generate the certificate).

Now the Python script:

from flask import Flask, request, send_from_directory
from flask import render_template

#from OpenSSL import SSL
#context = SSL.Context(SSL.SSLv23_METHOD)
#context.use_privatekey_file('server.key')
#context.use_certificate_file('server.crt')

import os

ASSETS_DIR = os.path.join(os.path.dirname(os.path.abspath(__file__)), '/static')
app = Flask(__name__, static_url_path='/static')

@app.route('/js/')
def send_js(path):
    return send_from_directory('js', path)

@app.route('/signup')
def signup():
    return render_template('signup.html')

if __name__ == '__main__':
    context = ('server.crt', 'server.key')
    app.run(ssl_context=context, threaded=True, debug=True)

Done! Up and running, provided you are on Python 2.7.9 or newer.