Dealing with a mandatory ForeignKey for a field that is not in a Django REST framework serializer

Although I am a big fan of Django REST framework, I sometimes find nested serializers gruesome to deal with. (Maybe I am doing something wrong; feel free to suggest your favourite trick.)

Suppose we have two models: ASerializer is based on the `A` model and BSerializer is based on the `B` model. A and B are related; say B has a foreign key to A. So when creating a B it is mandatory to specify its A, but ASerializer carries so much data that I don't want that overhead in my BSerializer. Still, when creating a B I must have it. Here is how I solved it:

For the sake of brevity let's say A is our Category and B is our Product. Every Product has a Category, so Product has a foreign key to Category, but I am not exposing it in ProductSerializer because Category carries a lot of information that is not needed there.

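from django.shortcuts import get_object_or_404
from rest_framework import serializers


class ProductSerializer(serializers.ModelSerializer):
    def to_internal_value(self, data):
        if data.get('category'):
            # Add the field only when the client actually sent a category.
            self.fields['category'] = serializers.PrimaryKeyRelatedField(
                queryset=Category.objects.all())

            # Swap the nested {'slug': ...} payload for the primary key,
            # which is what PrimaryKeyRelatedField expects.
            cat_slug = data['category']['slug']
            cat = get_object_or_404(Category, slug=cat_slug)
            data['category'] = cat.id

        return super().to_internal_value(data)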
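
To make the payload transformation concrete, here is a small framework-free sketch of what the override does before the normal validation runs. The CATEGORY_IDS_BY_SLUG dict is a hypothetical stand-in for the Category table; in the real serializer that lookup is the get_object_or_404 call. Unlike the serializer above, this sketch works on a copy so the caller's payload is left alone.

```python
# Hypothetical stand-in for the Category table: slug -> primary key.
CATEGORY_IDS_BY_SLUG = {"books": 1, "toys": 2}


def replace_nested_category(data):
    """Return a copy of the payload with {'slug': ...} swapped for the pk."""
    category = data.get("category")
    if not category:
        return dict(data)
    cleaned = dict(data)  # shallow copy, the caller's payload is untouched
    cleaned["category"] = CATEGORY_IDS_BY_SLUG[category["slug"]]
    return cleaned


payload = {"name": "Teddy bear", "category": {"slug": "toys"}}
print(replace_nested_category(payload))
```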

Integrating Amazon S3 with Django using django-storages and boto3

If we are lucky enough to get a high amount of traffic on our website, the next thing we start to think about is performance. How fast a page loads depends on how quickly we can deliver its contents from our storage to our users. In vanilla Django, all assets, including CSS, JS, files and images, are stored locally in a preconfigured folder. To improve performance we may decide to use a third-party storage service that takes away the headache of caching, zoning, replicating and building the infrastructure of a Content Delivery Network. Ideally we would like a pluggable solution, something that lets us switch from one storage to another based on configuration. django-storages is one of the cool libraries from the Django community that supports third-party storage services such as AWS S3, Google Cloud, FTP, Dropbox and so on. Amazon Web Services (AWS) is a trusted provider that offers a large range of services, and S3 is the one that helps us store static assets. boto3 is the Python library distributed by Amazon for talking to AWS services, S3 included.

First things first: to be able to store files on S3 we need permission. In the AWS world, all sorts of permissions are managed through Identity and Access Management (IAM).
i) In the Amazon console, you will find IAM under Security, Identity & Compliance. Go there.
ii) Add a user with programmatic access.
iii) Add a new group.
iv) Set a policy for the group. Amazon provides a bunch of predefined policies; for our use case we can choose AmazonS3FullAccess.
v) Store the user name, the Access key ID and the Secret access key.
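
AmazonS3FullAccess in step iv covers every bucket in the account. If that feels too generous, a custom policy scoped to one bucket is an option. The sketch below only builds the policy JSON to paste into the IAM console; the bucket name is a placeholder and the list of actions is my assumption about what a project like this needs (django-storages may need more, for example ACL-related actions), so treat it as a starting point.

```python
import json


def bucket_policy(bucket_name):
    """Build an IAM policy document limited to a single S3 bucket."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Listing applies to the bucket itself.
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": ["arn:aws:s3:::%s" % bucket_name],
            },
            {
                # Reading, writing and deleting apply to the objects in it.
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"],
                "Resource": ["arn:aws:s3:::%s/*" % bucket_name],
            },
        ],
    }


print(json.dumps(bucket_policy("my-example-bucket"), indent=2))
```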

In S3 we organize our contents into buckets. We can use several buckets for a single Django project, and sometimes that is more efficient, but for now we will use only one. So we need to create a bucket.

Now we need to install:

pip install boto3
pip install django-storages

We need to add storages to INSTALLED_APPS in settings.py, along with the other django-storages configuration:

INSTALLED_APPS = [
    'django.contrib.auth',
    'django.contrib.contenttypes',
    'django.contrib.sessions',
    'django.contrib.messages',
    'django.contrib.staticfiles',

    'storages',
]


AWS_ACCESS_KEY_ID = '#######'
AWS_SECRET_ACCESS_KEY = '#####'
AWS_STORAGE_BUCKET_NAME = '####bucket-name'
AWS_S3_CUSTOM_DOMAIN = '%s.s3.amazonaws.com' % AWS_STORAGE_BUCKET_NAME
#AWS_S3_OBJECT_PARAMETERS = {
#    'CacheControl': 'max-age=86400',
#}
AWS_LOCATION = 'static'

STATICFILES_DIRS = [
    os.path.join(BASE_DIR, 'mysite/static'),
]
STATIC_URL = 'https://%s/%s/' % (AWS_S3_CUSTOM_DOMAIN, AWS_LOCATION)
STATICFILES_STORAGE = 'storages.backends.s3boto3.S3Boto3Storage'
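
Hard-coding the access keys in settings.py makes it easy to commit them by accident. One common alternative is to read them from environment variables. Below is a minimal sketch of that idea; the helper name aws_settings and the choice of environment variable names are my own assumptions, not something django-storages requires.

```python
import os


def aws_settings(environ=os.environ):
    """Collect the S3 settings from environment variables."""
    bucket = environ.get("AWS_STORAGE_BUCKET_NAME", "")
    domain = "%s.s3.amazonaws.com" % bucket
    location = "static"
    return {
        "AWS_ACCESS_KEY_ID": environ.get("AWS_ACCESS_KEY_ID", ""),
        "AWS_SECRET_ACCESS_KEY": environ.get("AWS_SECRET_ACCESS_KEY", ""),
        "AWS_STORAGE_BUCKET_NAME": bucket,
        "AWS_S3_CUSTOM_DOMAIN": domain,
        "AWS_LOCATION": location,
        "STATIC_URL": "https://%s/%s/" % (domain, location),
    }


# In settings.py one could then write: globals().update(aws_settings())
demo = aws_settings({"AWS_STORAGE_BUCKET_NAME": "my-example-bucket"})
print(demo["STATIC_URL"])
```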

Even when we don't write any HTML, CSS or JS files for our Django project, it already has a few, because many of the classes we use in our views have parent classes that come with their own templates, base HTML, CSS and JS files. These static assets live in our Python library folder. To move them from the library folder to S3 we need the following command:

python manage.py collectstatic

The thing to notice here is that static previously resolved to localhost:port, but now it resolves to the S3 link.

{% static 'img/logo.png' %}

We may want some custom configuration for file storage: say we want to put media files in a separate directory, and we don't want a file to be overwritten by another user's upload. In that case we can define a child class of S3Boto3Storage and point DEFAULT_FILE_STORAGE at it.

#storage_backends.py

from storages.backends.s3boto3 import S3Boto3Storage

class MediaStorage(S3Boto3Storage):
    location = 'media'
    file_overwrite = False

#settings.py

DEFAULT_FILE_STORAGE = 'mysite.storage_backends.MediaStorage'

Now all our file-related fields, like models.FileField() and models.ImageField(), will upload files to our S3 bucket inside the directory ‘media’.

We may also have different types of storage: some storing documents, some publicly accessible, some classified. Their directories could be different, and so on and so forth.

class MyPrivateFileStorage(S3Boto3Storage):
    location = 'classified'
    default_acl = 'private'
    file_overwrite = False
    custom_domain = False

If we want to use a storage other than the one set in DEFAULT_FILE_STORAGE in settings.py, we need to pass it to the field of our model: models.FileField(storage=MyPrivateFileStorage()).

Crawling a website with python scrapy

Think of a website as a web: how do we crawl that web? Chances are you went to the navigation menu, found a link that looked interesting, clicked on it, and landed on the page with the information you were looking for. Or probably your favourite search engine did it for you. How did your search engine do that, and how can you make that traversal automatic? That's exactly where a crawler comes into business. Chances are your search engine started crawling your website from a link you shared somewhere. We will create one such crawler using Python's crawling framework, scrapy. I have been using it for the last couple of months, so I felt it would be wrong not to have a blog about it.

It is always better to have a Python virtual environment, so let's set it up:

$ virtualenv .env
$ source .env/bin/activate

Now that we have a virtual environment running, we will install scrapy.
$ pip install scrapy

It has some dependencies, like lxml, which is used for HTML parsing with selectors; cryptography and SSL-related Python libraries will also be installed. Pip takes care of everything, but when we write code we will see these names in our error messages very often, so it is always a good idea to know a bit about the dependencies.

Now that we have it installed, we have access to a few new commands. Using these commands we can create our own scrapy project. This is not strictly necessary, but I personally like to have everything bootstrapped the way the creators intended; that way I can follow the same code standard the authors of scrapy had in mind while writing the framework.

$ scrapy startproject blog_search_engine

It will create a bunch of necessary and unnecessary files; you can read about all of them in the documentation. The interesting part is that it creates a configuration file called scrapy.cfg, which empowers you with a few extra commands. Your spiders reside inside the other folder. A spider is basically the definition of a bot and its characteristics. Usually you can create a spider with the following command as a solid start:

$ scrapy genspider wordpress wordpress.com

It will generate a spider called wordpress inside your blog_search_engine/blog_search_engine/spiders/ directory. It puts 4 or 5 lines of code in the file, which do nothing. Let's give it some functionality, shall we? But we don't know yet what we are automating. We are visiting wordpress.com, where we will find the links to articles; then we will follow each link and fetch the article. So before we write our spider we need to define what we are looking for, right? Let's define our model. Models are usually stored inside items.py. An Article might have the following fields.

class Article(scrapy.Item):
    title = scrapy.Field()
    body = scrapy.Field()
    link = scrapy.Field()

Now we will define our spider.

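import scrapy
from bs4 import BeautifulSoup

from blog_search_engine.items import Article


class WordPressSpider(scrapy.Spider):
    name = 'wordpress'
    # Scrapy needs the scheme; a bare 'www.wordpress.com' is rejected.
    start_urls = ['https://www.wordpress.com']

    def parse(self, response):
        # './/' keeps the XPath relative to #post-river; a plain '//' would
        # match every link on the page.
        article_links = response.css("#post-river").xpath(".//a/@href").extract()

        for link in article_links:
            if "https://en.blog.wordpress.com/" in link:
                yield scrapy.Request(link, self.extract_article)

    def extract_article(self, response):
        article = Article()
        css = lambda s: response.css(s).extract()

        # css() already returns a list of strings, so index it directly.
        article['title'] = css(".post-title::text")[0]
        article['link'] = response.url

        # Strip the markup and keep only the text of the post body.
        body_html = " ".join(css('.entrytext'))
        body_soup = BeautifulSoup(body_html, 'html.parser')
        article['body'] = ''.join(body_soup.findAll(text=True))

        yield article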
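
If you want to play with the link filtering step without running a crawl, the sketch below does the same thing with nothing but the standard library. It is only an illustration: html.parser stands in for scrapy's selectors, and the sample HTML is made up.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag seen while parsing."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def article_links(html, marker="https://en.blog.wordpress.com/"):
    """Keep only the links that contain the blog URL, as parse() does."""
    collector = LinkCollector()
    collector.feed(html)
    return [link for link in collector.links if marker in link]


sample = (
    '<div id="post-river">'
    '<a href="https://en.blog.wordpress.com/2017/01/01/hello/">Hello</a>'
    '<a href="https://wordpress.com/themes">Themes</a>'
    "</div>"
)
print(article_links(sample))
```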

Once the pipeline is enabled in our scrapy settings, every item yielded by the spider is handed over to the pipeline, which makes the pipeline a great place for database operations. This is possibly out of the scope of this particular blog, but here is an outline of what you might do if you are using sqlalchemy for your database. Sqlalchemy won't be particularly helpful for what we intend to do here, but I still felt it would be useful to have the outline.

class BlogSearchEnginePipeline(object):
    def process_item(self, item, spider):
        # a = Article(title=item['title'], body=item['body'])
        # db.session.add(a)
        # db.session.commit()
        print('article found: %s %s' % (item['title'], item['body']))

        return item
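
For a version that actually stores something, here is a minimal sketch that swaps sqlalchemy for the standard library's sqlite3, purely so the example stays self-contained. The class name, table name and columns are my own choices; open_spider, process_item and close_spider are the hooks scrapy calls on a pipeline.

```python
import sqlite3


class SqliteArticlePipeline(object):
    """Store every scraped article in a SQLite table."""

    def __init__(self, path=":memory:"):
        self.path = path
        self.connection = None

    def open_spider(self, spider):
        # Called once when the spider starts.
        self.connection = sqlite3.connect(self.path)
        self.connection.execute(
            "CREATE TABLE IF NOT EXISTS article (title TEXT, body TEXT, link TEXT)"
        )

    def process_item(self, item, spider):
        # Called for every item the spider yields.
        self.connection.execute(
            "INSERT INTO article (title, body, link) VALUES (?, ?, ?)",
            (item["title"], item["body"], item.get("link")),
        )
        self.connection.commit()
        return item

    def close_spider(self, spider):
        # Called once when the spider finishes.
        self.connection.close()
```

To try it, the class would have to be listed in ITEM_PIPELINES in the project settings, and you would probably pass a file path instead of the in-memory default.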

Now we have a spider defined. But how do we run it? It's actually easy, but remember that you need to be inside your scrapy project for this command to work!

$ scrapy crawl wordpress

On a side note, scrapy lets us pass arguments from the command line to the spider; we just need to accept them as parameters of the initializer:

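class WordPressSpider(scrapy.Spider):
    name = "wordpress"
    ...
    def __init__(self, param=None, *args, **kwargs):
        super(WordPressSpider, self).__init__(*args, **kwargs)
        self.param = param
    ...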

Now we could call:

$ scrapy crawl wordpress -a param=helloworld

In this blog I tried to give you an outline of a scrapy crawler. So far we have a spider, but it is of no great use yet; we will try to build a search engine with it in my next blog. The databases that sqlalchemy deals with are not particularly good at text search, and Elasticsearch could be a great option if we want to implement a search feature, so in my next blog I will write about a basic search engine implementation using Elasticsearch. That's on my todo list for this weekend.

[Code snippet] Django rest framework social oauth2 user sign up and sign in

At my serializers.py I have got the following:

from django.contrib.auth.models import User
from rest_framework import serializers
 
 
class UserSerializer(serializers.ModelSerializer):
    class Meta:
        model = User
        fields = ('username', 'email', 'password')

    def create(self, validated_data):
        return User.objects.create_user(**validated_data)

In my views.py I have got the following:

from rest_framework.decorators import permission_classes, api_view
from rest_framework.permissions import AllowAny
from rest_framework.response import Response
from django.views.decorators.csrf import csrf_exempt
from django.http import JsonResponse

from oauth2_provider.models import Application

from .serializers import UserSerializer


# api_view has to be the outermost DRF decorator, otherwise permission_classes
# is set on the already wrapped view and has no effect.
@csrf_exempt
@api_view(['POST'])
@permission_classes((AllowAny,))
def create_auth(request, format=None):
    # is_authenticated is a method on old Django and a property from 1.10 on.
    if request.user.is_authenticated():
        return Response({"already_registered": "User with that username has already registered"}, status=701)

    data = request.data

    serializer = UserSerializer(data=data, partial=True)
    if serializer.is_valid():
        u = serializer.save(username=data.get(u'username'))
        application = Application.objects.create(user=u, client_type="public", authorization_grant_type="password", name="general")
        client_id = application.client_id  # call the url to get your tokens, use urllib or something similar
        client_secret = application.client_secret
        return JsonResponse({'client_id': client_id, 'client_password': client_secret}, status=201)
    else:
        return JsonResponse({'errors': serializer.errors}, status=400)

I have added following at my urls.py

urlpatterns = patterns(
    '',
    url(r'^register/', 'social_app.views.create_auth'),
    url(r'^auth/', include('rest_framework_social_oauth2.urls')),
)

Testing:

 
sadaf2605@pavilion-g4:~$ curl -X POST -H "Content-Type: application/json" -d '{"email":"boba@athingy09876.com", "password":"p4ssword", "username": "user100"}' http://localhost:8000/register/

returns:

{"client_password": "EjQKMCAGmsUEm3L26uO7XSKnrZZVSVBQJUuvqfwi63pRB7d5y3ndlbZV0cBgQU7t3lCy078DS0FLqhaYoe9JZF0cQCIAgFKo7lfYU3npP7Eyv1PLk2eLPRnD3lF3OUUP", "client_id": "JhbwqqvE34vVjWiuMPnkV1eE636QQ3SzyQXLjmgs"}
sadaf2605@pavilion-g4:~$ curl -X POST -d "client_id=JhbwqqvE34vVjWiuMPnkV1eE636QQ3SzyQXLjmgs&client_secret=EjQKMCAGmsUEm3L26uO7XSKnrZZVSVBQJUuvqfwi63pRB7d5y3ndlbZV0cBgQU7t3lCy078DS0FLqhaYoe9JZF0cQCIAgFKo7lfYU3npP7Eyv1PLk2eLPRnD3lF3OUUP&grant_type=password&username=user100&password=p4ssword" http://localhost:8000/auth/token
{"access_token": "bssEYlDNaXefq8TPNRuu8oLolqYJp2", "token_type": "Bearer", "expires_in": 36000, "refresh_token": "fankCVPC3P84pQWI5oWOIhtWLCky4w", "scope": "read write"}
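
Once you have the access token, it goes into the Authorization header of later requests, using the token_type ("Bearer") from the response. A minimal sketch with urllib follows; it only builds the request and does not send it, and the URL is a placeholder for whatever protected endpoint your project has.

```python
from urllib.request import Request


def authorized_request(url, access_token):
    """Build (but do not send) a request carrying the OAuth2 bearer token."""
    return Request(url, headers={"Authorization": "Bearer %s" % access_token})


# Placeholder URL: any endpoint protected by the OAuth2 backend.
req = authorized_request(
    "http://localhost:8000/some-protected-endpoint/",
    "bssEYlDNaXefq8TPNRuu8oLolqYJp2",
)
print(req.get_header("Authorization"))
```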

RabbitMQ & celery demo using image processing app on flask

Celery is a distributed system for processing messages on a task queue with a focus on real-time processing and support for task scheduling. When we have to run an expensive function that keeps user waiting for like “forever”, it is always better to use something like celery. In this blog we will be writing a face detection web app using flask, python-opencv and celery.

Before I can tell something, let me share a flask code snippet with you:

from time import sleep

@app.route("/")
def hello():
    sleep(10) # <---what would you see in this 10s?
    return "Hello World!"

Can you tell me what you would see in the first 10s when we run our flask app? I know the answer: before sending the response it will keep the user waiting for 10s. We don't love to wait 10s. We are so impatient, we want everything instantly; that's the motivation behind modern computing. But life is cruel, we can't get everything instantly. We understand that, but our users DO NOT understand this simple truth. So what do we do? We try to sell them the feeling that we are working instantly, or at least that it is not taking forever to load. So we need to get past that sleep block. How to do that is what I am going to discuss in this blog, with a real-life image processing app in flask.
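
Before bringing in celery, the idea itself can be seen with nothing but the standard library: hand the slow job to something that runs in the background and answer right away. In the sketch below a thread pool stands in for the task queue; this is not how celery works internally, and the names slow_job and handle_request are made up for the illustration.

```python
import time
from concurrent.futures import ThreadPoolExecutor

executor = ThreadPoolExecutor(max_workers=2)


def slow_job(seconds):
    """Pretend to be an expensive function such as face detection."""
    time.sleep(seconds)
    return "done after %.1fs" % seconds


def handle_request(seconds=0.5):
    """Queue the slow job and return right away, like a view would."""
    future = executor.submit(slow_job, seconds)
    return "accepted", future


start = time.time()
status, future = handle_request()
elapsed = time.time() - start  # well below the 0.5s the job itself needs
print(status, "in %.3fs" % elapsed)
print(future.result())  # blocks until the background job has finished
```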

Obviously, in real life we don't need to write “sleep” to make our code run slower; we write plenty of functions that slow things down on their own. In this blog we will write an application that lets a user upload a picture, and we will detect the faces in it. The face detection function is very expensive: it takes almost 3-10s on my machine to detect the face of my favourite actress. Let me share my code:

#server.py

__author__ = 'sadaf2605'


import os
from flask import Flask, request, redirect, url_for
from werkzeug import secure_filename

import face_detect
from os.path import basename


UPLOAD_FOLDER = '/home/sadaf2605/flask_celery_upload_image/uploads'
ALLOWED_EXTENSIONS = set(['txt', 'pdf', 'png', 'jpg', 'jpeg', 'gif'])

app = Flask(__name__)
app.config['UPLOAD_FOLDER'] = UPLOAD_FOLDER


def allowed_file(filename):
    return '.' in filename and \
           filename.rsplit('.', 1)[1] in ALLOWED_EXTENSIONS

@app.route('/', methods=['GET', 'POST'])
def upload_file():
    if request.method == 'POST':
        import time
        start_time = time.time()
        file = request.files['file']

        if file and allowed_file(file.filename):
            filename = secure_filename(file.filename)

            base,ext=os.path.splitext(filename)

            file.save(os.path.join(app.config['UPLOAD_FOLDER'], filename))
            face_detect.detect(os.path.join(app.config['UPLOAD_FOLDER'], filename),os.path.join(app.config['UPLOAD_FOLDER'], base+"-face"+ext))

            print("--- %s seconds ---" % str(time.time() - start_time))
            return redirect("/")

    from os import listdir
    from os.path import isfile, join
    htmlpic=""
    for f in sorted(listdir(UPLOAD_FOLDER)):
        if isfile(join(UPLOAD_FOLDER,f)):
            print(f)
            # Assumed markup: show each file through the /uploads/ route.
            htmlpic+="""
                <img src="/uploads/%s">
                """ % f

    # Assumed markup: a plain upload form posting the 'file' field.
    return '''
    <!doctype html>
    <title>Upload new File</title>
    <h1>Upload new File</h1>
    <form action="" method=post enctype=multipart/form-data>
      <p><input type=file name=file>
         <input type=submit value=Upload>
    </form>
    '''+htmlpic

from flask import send_from_directory

@app.route('/uploads/<filename>')
def uploaded_file(filename):
    return send_from_directory(app.config['UPLOAD_FOLDER'], filename)

from werkzeug import SharedDataMiddleware
app.add_url_rule('/uploads/<filename>', 'uploaded_file', build_only=True)
app.wsgi_app = SharedDataMiddleware(app.wsgi_app, {
    '/uploads': app.config['UPLOAD_FOLDER']
})

if __name__ == "__main__":
    app.debug=True
    app.run()

#face_detect.py

import numpy as np
import cv2


face_cascade = cv2.CascadeClassifier('haarcascade_frontalface_default.xml')

def detect(src_img,dest_img):
    img = cv2.imread(src_img)
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

    faces = face_cascade.detectMultiScale(gray, 1.3, 1)
    for (x,y,w,h) in faces:
        cv2.rectangle(img,(x,y),(x+w,y+h),(255,0,0),5)
        roi_gray = gray[y:y+h, x:x+w]
        roi_color = img[y:y+h, x:x+w]


    cv2.imwrite(dest_img, img)

You can test this app running using

python server.py

But we don't want our user to wait 10s to see the next page. So we will use celery and RabbitMQ to help us. First of all let's install RabbitMQ and celery.

To install RabbitMQ we will use aptitude, because it installs all the dependencies along the way. If you don't have aptitude installed:

sudo apt-get install aptitude

Now it's time to install the RabbitMQ server:

 sudo aptitude install rabbitmq-server

Now we will create a user and a virtual host for RabbitMQ.

sudo rabbitmqctl add_user rabbit_user password
sudo rabbitmqctl add_vhost /app_rabbit

We will give our user permission to do everything on that virtual host:

sudo rabbitmqctl set_permissions -p /app_rabbit rabbit_user ".*" ".*" ".*"

Now we need to restart the RabbitMQ server so that the changes take effect:

sudo /etc/init.d/rabbitmq-server stop
sudo /etc/init.d/rabbitmq-server start

Now we will install celery:

pip install celery

Now we need to configure celery. Celery provides a few decorators, like @task, to achieve our goal. RabbitMQ is the default broker for celery, and celery talks to it through a broker URL on RabbitMQ's own port (5672 here). We want to enqueue our image processing tasks, so we can define the task in face_detect.py. It would be better to put it in our server.py, as that is the entry point… but whatever for now!

import numpy as np
import cv2

from celery import Celery

app= Celery(broker='amqp://rabbit_user:password@localhost:5672//app_rabbit' )


face_cascade = cv2.CascadeClassifier('haarcascade_frontalface_default.xml')

@app.task
def detect(src_img,dest_img):
    img = cv2.imread(src_img)
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

    faces = face_cascade.detectMultiScale(gray, 1.3, 1)
    for (x,y,w,h) in faces:
        cv2.rectangle(img,(x,y),(x+w,y+h),(255,0,0),5)
        roi_gray = gray[y:y+h, x:x+w]
        roi_color = img[y:y+h, x:x+w]


    cv2.imwrite(dest_img, img)

On its own this won't change your life radically, because we are not yet using what the decorator gave us. To put the task in the queue we need to use the delay function that the decorator adds to our function, so we call face_detect.detect.delay(src_img, dest_img). We also need to keep our celery worker running, otherwise the task will just sit in the queue and wait for a worker to start. In the -A parameter of celery we name the module in which the decorated functions are located.

To run the celery worker:

celery -A face_detect worker -l INFO

So now finally we can change our server.py

__author__ = 'sadaf2605'


import os
from flask import Flask, request, redirect, url_for
from werkzeug import secure_filename

import face_detect
from os.path import basename


UPLOAD_FOLDER = '/home/sadaf2605/flask_celery_upload_image/uploads'
ALLOWED_EXTENSIONS = set(['txt', 'pdf', 'png', 'jpg', 'jpeg', 'gif'])

app = Flask(__name__)
app.config['UPLOAD_FOLDER'] = UPLOAD_FOLDER


def allowed_file(filename):
    return '.' in filename and \
           filename.rsplit('.', 1)[1] in ALLOWED_EXTENSIONS

@app.route('/', methods=['GET', 'POST'])
def upload_file():
    if request.method == 'POST':
        import time
        start_time = time.time()
        file = request.files['file']

        if file and allowed_file(file.filename):
            filename = secure_filename(file.filename)

            base,ext=os.path.splitext(filename)

            file.save(os.path.join(app.config['UPLOAD_FOLDER'], filename))
            # .delay() only puts the task on the queue and returns immediately.
            face_detect.detect.delay(os.path.join(app.config['UPLOAD_FOLDER'], filename),os.path.join(app.config['UPLOAD_FOLDER'], base+"-face"+ext))

            print("--- %s seconds ---" % str(time.time() - start_time))
            return redirect("/")

    from os import listdir
    from os.path import isfile, join
    htmlpic=""
    for f in sorted(listdir(UPLOAD_FOLDER)):
        if isfile(join(UPLOAD_FOLDER,f)):
            print(f)
            # Assumed markup: show each file through the /uploads/ route.
            htmlpic+="""
                <img src="/uploads/%s">
                """ % f

    # Assumed markup: a plain upload form posting the 'file' field.
    return '''
    <!doctype html>
    <title>Upload new File</title>
    <h1>Upload new File</h1>
    <form action="" method=post enctype=multipart/form-data>
      <p><input type=file name=file>
         <input type=submit value=Upload>
    </form>
    '''+htmlpic

from flask import send_from_directory

@app.route('/uploads/<filename>')
def uploaded_file(filename):
    return send_from_directory(app.config['UPLOAD_FOLDER'], filename)

from werkzeug import SharedDataMiddleware
app.add_url_rule('/uploads/<filename>', 'uploaded_file', build_only=True)
app.wsgi_app = SharedDataMiddleware(app.wsgi_app, {
    '/uploads': app.config['UPLOAD_FOLDER']
})

if __name__ == "__main__":
    app.debug=True
    app.run()

So now, after uploading, the processed picture won't show up on the front page right away; you will need to refresh a couple of times after 5-6s for it to appear. Keep on refreshing, and you may like to send me a pull request at: https://github.com/sadaf2605/facedetection-flaskwebapp-rabbitmq
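
Instead of refreshing blindly, the page could check whether the worker has written its output yet. Here is a small sketch of such a check; it mirrors the "-face" naming used in server.py, while the helper names and the temporary folder are only there for the demonstration.

```python
import os
import tempfile


def processed_path(upload_folder, filename):
    """Mirror the naming used in server.py: photo.jpg -> photo-face.jpg."""
    base, ext = os.path.splitext(filename)
    return os.path.join(upload_folder, base + "-face" + ext)


def is_processed(upload_folder, filename):
    """True once the worker has written the annotated picture."""
    return os.path.isfile(processed_path(upload_folder, filename))


folder = tempfile.mkdtemp()
before = is_processed(folder, "photo.jpg")  # nothing written yet
open(processed_path(folder, "photo.jpg"), "w").close()  # simulate the worker
after = is_processed(folder, "photo.jpg")
print(before, after)
```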

One workaround to pass variables (context) to the django admin submit_line.html template

Recently I had to modify the django admin pages massively. While trying to add a new button to the "add new item" page of a model in the admin, I got into trouble. The trouble was not in showing the button or getting it to work, but in passing a variable to the template. So in this blog I am going to describe how I solved it.

I am overriding this template submit_line.html:

{% load i18n admin_urls %}
{% if show_save %}{% endif %}
{% if show_save_as_draft %}{% endif %}
{% if show_save_and_add_another %}{% endif %}
{% if show_save_and_continue %}{% endif %}
{% if show_delete_link %}
    {% url opts|admin_urlname:'delete' original.pk|admin_urlquote as delete_url %}
    {% trans "Delete" %}
{% endif %}

Here,

{{show_save_as_draft}}

is set in the extra context of our ModelAdmin when the change view is rendered:

#/home/sadaf2605/PycharmProjects/stripe/stripe/news/admin.py
class ArticleAdmin(admin.ModelAdmin):
    change_form_template = 'admin/news/change_form.html'
    def change_view(self, request,object_id, form_url='', extra_context=None):
        extra_context = extra_context or {}
        extra_context["show_save_as_draft"] = True
        return super(ArticleAdmin, self).change_view(request,object_id, form_url, extra_context)

Still

{{show_save_as_draft}}

is not showing up. This is the problem

To solve this problem I overrode the template tag that is responsible for showing the buttons. Basically, that template tag keeps only a few selected context fields and drops the rest. In my new tag I also keep the context variables that are necessary for my app.

#stripe/stripe/news/templatetags/stripe_admin_tag.py
__author__ = 'sadaf2605'
from django import template
register = template.Library()
from django.contrib.admin.templatetags import admin_modify

@register.inclusion_tag('admin/submit_line.html', takes_context=True)
def submit_line_row(context):
    context = context or {}
    ctx= admin_modify.submit_row(context)
    if "show_save_as_draft" in context.keys():
        ctx["show_save_as_draft"] = context["show_save_as_draft"]
    return  ctx
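
Why the variable disappears is easier to see without django in the way. An inclusion tag renders its template with only the dict the tag function returns, so anything the function does not copy over is gone. The sketch below imitates that with plain dicts; stock_submit_row is a heavily simplified stand-in I wrote for the illustration, not django's actual code.

```python
def stock_submit_row(context):
    """Simplified stand-in for the stock tag: keeps only the keys it knows."""
    kept = ("show_save", "show_delete_link", "show_save_and_continue")
    return {key: context.get(key, False) for key in kept}


def submit_line_row(context):
    """Same idea as the custom tag: start from the stock result, add ours."""
    ctx = stock_submit_row(context)
    if "show_save_as_draft" in context:
        ctx["show_save_as_draft"] = context["show_save_as_draft"]
    return ctx


view_context = {"show_save": True, "show_save_as_draft": True}
print("show_save_as_draft" in stock_submit_row(view_context))
print(submit_line_row(view_context)["show_save_as_draft"])
```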

And then finally I need to override change_form.html as well. I need to replace:

{% block submit_buttons_bottom %}{% submit_row %}{% endblock %}

with:

{% load stripe_admin_tag %}
{% block submit_buttons_bottom %}{% submit_line_row %}{% endblock %}

/stripe/stripe/stripe/templates/admin/news/change_form.html

{% extends "admin/base_site.html" %}
{% load i18n admin_urls admin_static admin_modify %}

{% block extrahead %}{{ block.super }}

{{ media }}
{% endblock %}

{% block extrastyle %}{{ block.super }}{% endblock %}

{% block coltype %}colM{% endblock %}

{% block bodyclass %}{{ block.super }} app-{{ opts.app_label }} model-{{ opts.model_name }} change-form{% endblock %}

{% if not is_popup %}
{% block breadcrumbs %}

{% endblock %}
{% endif %}

{% block content %}
{% block object-tools %}
{% if change %}{% if not is_popup %}
{% endif %}{% endif %}
{% endblock %}
{% csrf_token %}{% block form_top %}{% endblock %}
{% if is_popup %}{% endif %}
{% if to_field %}{% endif %}
{# WP Admin start #}
{% if 0 %}{% block submit_buttons_top %}{% submit_row %}{% endblock %}{% endif %}
{# WP Admin end #}
{% if errors %}
    {% if errors|length == 1 %}{% trans "Please correct the error below." %}{% else %}{% trans "Please correct the errors below." %}{% endif %}
    {{ adminform.form.non_field_errors }}
{% endif %}

{% block field_sets %}
{% for fieldset in adminform %}
  {% include "admin/includes/fieldset.html" %}
{% endfor %}
{% endblock %}

{% block after_field_sets %}{% endblock %}

{% block inline_field_sets %}
{% for inline_admin_formset in inline_admin_formsets %}
    {% include inline_admin_formset.opts.template %}
{% endfor %}
{% endblock %}

{% block after_related_objects %}{% endblock %}

{% load stripe_admin_tag %}
{% block submit_buttons_bottom %}{% submit_line_row %}{% endblock %}

{% if adminform and add %}
{% endif %}

{# JavaScript for prepopulated fields #}
{% prepopulated_fields_js %}
{% endblock %}

ssl (https) from python flask

Basically, in this blog I am going to share a code snippet, and I am also going to describe the stupid things I tried that did not work. Don't miss my point: when I share my ordeal I am not trying to prove myself stupid, I am trying to save your time. I am telling you what did not work over the last couple of hours so that you don't have to find out yourself.

So, the background: I was serving a static site with flask that had been migrated directly from django. While working with django we figured out that we don't actually need a powerful tool like django; we can use something very lightweight like flask instead, and it is much easier to switch from django to flask because flask's default templating, jinja, is very close to django's template language. After this shift I had a little bit of trouble with https, because when I tried https the page was completely blank. I lost my mind: what the hell is going on here? Then I realized that for flask I probably need to set up ssl myself. They have this nice snippet on their website (http://flask.pocoo.org/snippets/111/). I followed it, and it did not work! Why won't it work? After a couple of trials and google searches I realized that this is the old-school way of doing it. Fine, I need to go to the new school, obviously! I got myself a new dress, and now the context looks prettier: {“cert.crt”,”key.key”}. I was impressed, but what the hell? It did not work either. Why won't it work? I lost my mind! After hours of fighting, I got this error:

Traceback (most recent call last):
  File "/usr/lib/python2.7/threading.py", line 810, in __bootstrap_inner
    self.run()
  File "/usr/lib/python2.7/threading.py", line 763, in run
    self.__target(*self.__args, **self.__kwargs)
  File "/usr/local/lib/python2.7/dist-packages/werkzeug/serving.py", line 602, in inner
    passthrough_errors, ssl_context).serve_forever()
  File "/usr/local/lib/python2.7/dist-packages/werkzeug/serving.py", line 506, in make_server
    passthrough_errors, ssl_context)
  File "/usr/local/lib/python2.7/dist-packages/werkzeug/serving.py", line 450, in __init__
    self.socket = ssl_context.wrap_socket(self.socket,
AttributeError: 'OpenSSL.SSL.Context' object has no attribute 'wrap_socket'

It is because I am using a python 2.7 version below 2.7.9.
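
The version problem can be checked at runtime before you lose hours on it. The standard library's ssl.SSLContext class was added in Python 2.7.9 (and 3.2), so looking for it is a quick test. The helper name below is made up; it is just a sketch of the check.

```python
import ssl
import sys


def has_ssl_context():
    """True when the ssl module offers SSLContext (Python 2.7.9+ / 3.2+)."""
    return hasattr(ssl, "SSLContext")


version = sys.version.split()[0]
if has_ssl_context():
    print("Python %s: ssl.SSLContext is available" % version)
else:
    print("Python %s: ssl.SSLContext is missing, upgrade to 2.7.9 or later" % version)
```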

What else did I try? You don't want to know. I tried to install pyopenssl on heroku using pip, but it looks like it is a ported version and it failed to compile on heroku. Now I will write about what I had to do to make it work.

I have to make my certificate and keys:

$ openssl genrsa -des3 -passout pass:x -out server.pass.key 2048
$ openssl rsa -passin pass:x -in server.pass.key -out server.key
$ rm server.pass.key
$ openssl req -new -key server.key -out server.csr
$ openssl x509 -req -days 365 -in server.csr -signkey server.key -out server.crt

So now we have server.key and server.crt in our directory (server.csr was only needed to create the certificate).

Now python script:

from flask import Flask, request, send_from_directory
from flask import render_template

#from OpenSSL import SSL
#context = SSL.Context(SSL.SSLv23_METHOD)
#context.use_privatekey_file('server.key')
#context.use_certificate_file('server.crt')

import os

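# No leading slash in 'static': an absolute second argument would make
# os.path.join throw away the directory in front of it.
ASSETS_DIR = os.path.join(os.path.dirname(os.path.abspath(__file__)), 'static')
app = Flask(__name__, static_url_path='/static')

@app.route('/js/<path:path>')
def send_js(path):
    return send_from_directory('js', path)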

@app.route('/signup')
def signup():
    return render_template('signup.html')

if __name__ == '__main__':
    context = ('server.crt', 'server.key')
    app.run(ssl_context=context, threaded=True, debug=True)

Done! Up and running, if you have a Python version >= 2.7.9!

How to fix theano numpy wrong number of dimensions!

Recently I had to work with the Theano, numpy and scipy libraries in Python. I had to spend an hour with this piece of error, which says TypeError: (‘Bad input argument to theano function at index 1(0-based)’, ‘Wrong number of dimensions: expected 2, got 1 with shape (2,).’) So here it goes, how I fixed it.

For the sake of helping fellow programmers, I am posting the whole error log, so that they can find it via google:

/usr/lib/python2.7/dist-packages/scipy/lib/_util.py:35: DeprecationWarning: Module scipy.linalg.blas.fblas is deprecated, use scipy.linalg.blas instead
DeprecationWarning)
/usr/lib/python2.7/dist-packages/scipy/lib/_util.py:35: DeprecationWarning: Module scipy.linalg.blas.fblas is deprecated, use scipy.linalg.blas instead
DeprecationWarning)
/usr/lib/python2.7/dist-packages/scipy/lib/_util.py:35: DeprecationWarning: Module scipy.linalg.blas.fblas is deprecated, use scipy.linalg.blas instead
DeprecationWarning)
/usr/lib/python2.7/dist-packages/scipy/lib/_util.py:35: DeprecationWarning: Module scipy.linalg.blas.fblas is deprecated, use scipy.linalg.blas instead
DeprecationWarning)
/usr/lib/python2.7/dist-packages/scipy/lib/_util.py:35: DeprecationWarning: Module scipy.linalg.blas.fblas is deprecated, use scipy.linalg.blas instead
DeprecationWarning)
/usr/lib/python2.7/dist-packages/scipy/lib/_util.py:35: DeprecationWarning: Module scipy.linalg.blas.fblas is deprecated, use scipy.linalg.blas instead
DeprecationWarning)
/usr/lib/python2.7/dist-packages/scipy/lib/_util.py:35: DeprecationWarning: Module scipy.linalg.blas.fblas is deprecated, use scipy.linalg.blas instead
DeprecationWarning)
/usr/lib/python2.7/dist-packages/scipy/lib/_util.py:35: DeprecationWarning: Module scipy.linalg.blas.fblas is deprecated, use scipy.linalg.blas instead
DeprecationWarning)
False
Traceback (most recent call last):
  File "bgd.py", line 44, in <module>
    theta = gradient_descent_2(alpha, x, y, 2000)
  File "bgd.py", line 24, in gradient_descent_2
    hypothesis =find_hypothesis(x, theta).T
  File "/usr/local/lib/python2.7/dist-packages/Theano-0.6.0-py2.7.egg/theano/compile/function_module.py", line 497, in __call__
    allow_downcast=s.allow_downcast)
  File "/usr/local/lib/python2.7/dist-packages/Theano-0.6.0-py2.7.egg/theano/tensor/type.py", line 157, in filter
    data.shape))
TypeError: ('Bad input argument to theano function at index 1(0-based)', 'Wrong number of dimensions: expected 2, got 1 with shape (2,).')

So basically it is saying that there is a shape mismatch in my code. Apparently there should not be any mismatch, because I coded everything and I KNOW what I did! But numpy can create a one-dimensional array, which has a length but no second dimension at all. Such an array has a shape of (2,), while a Theano matrix expects two dimensions, such as (2, 1).

Example could be:

import numpy as np
theta = np.array([1,1])
print(theta.shape)      # (2,)

It works fine when you are working with numpy alone, but it causes tremendous problems when you are using another library like Theano.

Now let me share another snippet, from where I got this error:

from theano import tensor as T, function

T_x,T_y, T_z, T_theta, T_loss = T.dmatrices('x','y','z','theta','loss')
find_hypothesis=function([T_x,T_theta],T.dot(T_x,T_theta))

hypothesis =find_hypothesis(x, theta).T

From code like this you will get the error I previously mentioned.

Now let's talk about how I fixed it. It was actually simple, and I had lost hours for no reason. You can solve it just by reshaping your numpy array like this:

theta=theta.reshape(theta.shape[0],1)        #had tough time without this,
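To see what the reshape changes without installing Theano, here is a small numpy-only sketch. The values of x and theta are made up for illustration, and np.dot stands in for the compiled Theano function.

```python
import numpy as np

# Hypothetical data: 3 samples with 2 features each
x = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])
theta = np.array([1.0, 1.0])

# A 1-D array: one axis only, so Theano's dmatrix would reject it
shape_before = theta.shape            # (2,)

# Turn it into an explicit column vector with two axes
theta_col = theta.reshape(theta.shape[0], 1)
shape_after = theta_col.shape         # (2, 1)

# The matrix product now has a well-defined 2-D result of shape (3, 1)
hypothesis = np.dot(x, theta_col)
```

theta.reshape(-1, 1) and theta[:, np.newaxis] give the same column vector, if you find them easier to read.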

How to automate numpy installation in your project using setuptools?

Recently, for an open source project, I had to work with numpy. Numpy makes life a lot easier, without any doubt, but when it comes to the setup.py of setuptools, it does not work just by adding 'install_requires': ["numpy"]. Unlike most other Python packages, the core of numpy is written in C for the sake of optimization, because an interpreted language like Python is not fast enough for heavy mathematical computation. That C code is exposed to Python as extension modules, so it has to be compiled before we can use it.

To do that we need a custom command. For custom commands in setuptools, we add "cmdclass" as a key in the dictionary. "build_ext" is the setuptools command that compiles C extension modules. But numpy must not be imported before setup_requires has had a chance to install it, so we need to customize the existing class: we override build_ext with a subclass of it.

So the code should look like this:

 

from setuptools import setup
from setuptools.command.build_ext import build_ext as _build_ext

class build_ext(_build_ext):
    'to install numpy'
    def finalize_options(self):
        _build_ext.finalize_options(self)
        # Prevent numpy from thinking it is still in its setup process:
        __builtins__.__NUMPY_SETUP__ = False
        import numpy
        self.include_dirs.append(numpy.get_include())

config = {
    'cmdclass': {'build_ext': build_ext},  # numpy hack
    'setup_requires': ['numpy'],           # numpy hack
    #...
    'install_requires': ['numpy'],
}

# hand the options over to setuptools
setup(**config)

 

Now why do we write __NUMPY_SETUP__ = False? I found an interesting answer on Stack Overflow by coldfix.

In the source code of numpy we have this:

 

if __NUMPY_SETUP__:
    import sys as _sys
    _sys.stderr.write('Running from numpy source directory.\n')
    del _sys
else:
    # import submodules etc. (main branch)

So when __NUMPY_SETUP__ is true, numpy does not import its submodules. We therefore need to make sure the flag is False before we import numpy for the first time. After that we add numpy's header directory to our include path.
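The reason a flag set on the builtins can steer another module is that a bare name is looked up in the module's globals first and then in the builtins. Here is a minimal sketch of that mechanism. The flag __DEMO_SETUP__ and the tiny "module" are made up for illustration, and it uses the Python 3 builtins module (in Python 2 the module is called __builtin__).

```python
import builtins

# Stand-in for a package __init__ that checks a setup flag (hypothetical)
MODULE_SOURCE = """
if __DEMO_SETUP__:
    mode = "setup"      # the package would skip its submodules here
else:
    mode = "normal"     # the package would import its submodules here
"""


def load_demo_module():
    """Execute the fake module in a fresh namespace and report its mode."""
    namespace = {}
    exec(MODULE_SOURCE, namespace)
    return namespace["mode"]


# The flag lives on builtins, so the fake module sees it as a bare name
builtins.__DEMO_SETUP__ = True
mode_during_setup = load_demo_module()

builtins.__DEMO_SETUP__ = False
mode_after_reset = load_demo_module()

# Clean up so the demo flag does not leak into other code
del builtins.__DEMO_SETUP__
```

Setting the flag to False before the import is the same trick the build_ext subclass plays on numpy.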

 

 

Implementation and mathematics behind Harris Corner Detection

Say I have this amazing friend who travels a lot; say he visits every corner of the world. If I had the job or the money I would have done the same, but unfortunately I need to be a little more passionate to make it happen. So he wants to share all his photos with me. There could be pictures of the hilltop buildings of Italy, the underground city of Turkey, the Plitvice Lakes of Croatia, or Prague in the Czech Republic. Obviously some of the photos will be from Zambia, capturing the Southern Cross. The Bazaruto Archipelago of Mozambique will obviously be there, and some of them will capture the Nyiragongo volcano of Congo. And so on: not one picture of each, but several. In this age we don't take just one picture at a place, right? But as I have seen these places only in my dreams, I could not really tell which pictures were taken where just by looking at them. I need to take a closer look, figure out something unique in a picture, and match that unique feature against the other pictures; then I can tell whether the place is the same or not. The key point of this paragraph is "feature detection".

So as humans we do feature detection, and we want our algorithm to do it for us. By nature a computer can't tell that a man is "a man", because everything is just a bunch of numbers that represent pixels. How would it know whether they are features or not? For simplicity's sake, let's say we know that two images were captured at the same place. Since the two images represent the same scene, we need to match points of one image to points of the other. If we try to match every pixel of one image against the other image, it will be extremely costly, because even a simple image can have 1000×600 pixels. A better approach is to pick a few distinctive points that are rare in the images. Say that out of 1000×600 = 600,000 pixels we choose 100 points that do not occur too often in either image. Since both images represent the same scene, something that is a unique feature in one image should also be unique in the other. So now our new aim is to match 100 points of one image to the other.

Feature pixels such as mountain peaks, building corners, doorways, or interestingly shaped patches are localized features, often called keypoint features, interest points, or even corners. They are basically defined by the appearance of the patch of pixels surrounding the point location. Another important set of features is edges, say the profile of mountains against the sky.

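As we see, similarity plays a great role in it, so we can use the summed square difference between an image patch and the same patch shifted by (u, v). Mathematically,

$$E(u, v) = \sum_{x, y} w(x, y) \, [I(x+u, y+v) - I(x, y)]^2$$

w is the weight: a pixel that is close to the centre of the patch gets a different weight from one that is far away. Usually it is a Gaussian filter.

This is essentially the auto-correlation equation.

Now we consider the pixels as a 2D array, like I(x+u, y+v). As we know, a smooth function can be approximated by a Taylor series, and so can our I(). So for a small shift, I(x+u, y+v) can be represented as

$$I(x+u, y+v) \approx I(x, y) + u \, I_x(x, y) + v \, I_y(x, y)$$

So putting it all together:

$$E(u, v) \approx \sum_{x, y} w(x, y) \, [u I_x + v I_y]^2 = \begin{bmatrix} u & v \end{bmatrix} M \begin{bmatrix} u \\ v \end{bmatrix}, \qquad M = \sum_{x, y} w(x, y) \begin{bmatrix} I_x^2 & I_x I_y \\ I_x I_y & I_y^2 \end{bmatrix}$$

Now we will compute the eigenvalues of M. Eigenvalues are the same tool that helps us find the least-squares plane of some data or the surface normal of a point cloud; here they tell us how strongly the intensity changes along the two principal directions of the patch.

M is a 2×2 matrix, so it has two eigenvalues, λ1 and λ2.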

If λ1 and λ2 are both small, the region is flat.
If λ1 and λ2 are both large, it is a corner.
If one of λ1 and λ2 is much larger than the other, it is an edge.

Harris measured this cornerness using R = λ1 * λ2 - k (λ1 + λ2)^2 = det M - k (trace M)^2, where k is a small empirical constant (typically 0.04 to 0.06).

If R is negative it is an edge, if |R| is small the region is flat, and if R is large and positive it is a corner.
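To make these three cases concrete, here is a small numpy sketch that builds the 2×2 matrix for three synthetic 8×8 patches and evaluates R. It is only an illustration under some assumptions of mine: uniform weights (w = 1) instead of a Gaussian, np.gradient for the derivatives, k = 0.04, and the made-up helper name harris_measure.

```python
import numpy as np


def harris_measure(patch, k=0.04):
    """Return (eigenvalues, R) of the 2x2 structure matrix of one patch.

    Assumption: uniform weights (w = 1) instead of a Gaussian window.
    """
    # np.gradient returns the derivative along rows (y) first, then columns (x)
    iy, ix = np.gradient(patch.astype(float))
    m = np.array([[np.sum(ix * ix), np.sum(ix * iy)],
                  [np.sum(ix * iy), np.sum(iy * iy)]])
    eigenvalues = np.linalg.eigvalsh(m)          # m is symmetric
    r = np.linalg.det(m) - k * np.trace(m) ** 2  # det M - k (trace M)^2
    return eigenvalues, r


# Three synthetic patches
flat = np.ones((8, 8))                            # no intensity change
edge = np.zeros((8, 8)); edge[:, 4:] = 1.0        # one vertical step
corner = np.zeros((8, 8)); corner[4:, 4:] = 1.0   # steps in both directions

eig_flat, r_flat = harris_measure(flat)
eig_edge, r_edge = harris_measure(edge)
eig_corner, r_corner = harris_measure(corner)
```

The flat patch gives R = 0, the edge patch has one zero eigenvalue and a negative R, and the corner patch has two non-zero eigenvalues and a positive R.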

from numpy import array, zeros, argsort
from scipy.ndimage import gaussian_filter
from pylab import figure, gray, imshow, plot, axis, show
from PIL import Image   # "from PIL import *" does not make Image available

def compute_harris_response(im,sigma=3):
    """ Compute the Harris corner detector response function for each pixel in a graylevel image. """

    # derivatives: Gaussian derivative filters along x (columns) and y (rows)
    imx = zeros(im.shape)
    gaussian_filter(im, (sigma,sigma), (0,1), imx)
    imy = zeros(im.shape)
    gaussian_filter(im, (sigma,sigma), (1,0), imy)

    # compute components of the Harris matrix
    Wxx = gaussian_filter(imx*imx,sigma)
    Wxy = gaussian_filter(imx*imy,sigma)
    Wyy = gaussian_filter(imy*imy,sigma)

    # determinant and trace
    Wdet = Wxx*Wyy - Wxy**2
    Wtr = Wxx + Wyy

    # det/trace is used here in place of det - k*trace^2, so no k has to be chosen
    return Wdet / Wtr


def get_harris_points(harrisim,min_dist=10,threshold=0.1):
    """ Return corners from a Harris response image
    min_dist is the minimum number of pixels separating
    corners and image boundary. """
    # find top corner candidates above a threshold
    corner_threshold = harrisim.max() * threshold
    harrisim_t = (harrisim > corner_threshold) * 1
    # get coordinates of candidates
    coords = array(harrisim_t.nonzero()).T
    # ...and their values
    candidate_values = [harrisim[c[0],c[1]] for c in coords]
    # sort candidates, strongest response first
    index = argsort(candidate_values)[::-1]
    # store allowed point locations in array
    allowed_locations = zeros(harrisim.shape)
    allowed_locations[min_dist:-min_dist,min_dist:-min_dist] = 1
    # select the best points taking min_distance into account
    filtered_coords = []
    for i in index:
        if allowed_locations[coords[i,0],coords[i,1]] == 1:
            filtered_coords.append(coords[i])
            allowed_locations[(coords[i,0]-min_dist):(coords[i,0]+min_dist),
            (coords[i,1]-min_dist):(coords[i,1]+min_dist)] = 0
    return filtered_coords

def plot_harris_points(image,filtered_coords):
    """ Plots corners found in image. """
    figure()
    gray()
    imshow(image)
    plot([p[1] for p in filtered_coords],[p[0] for p in filtered_coords],'*')
    axis('off')
    show()






im = array(Image.open('empire.jpg').convert('L'))
harrisim = compute_harris_response(im)

filtered_coords = get_harris_points(harrisim,10)
plot_harris_points(im, filtered_coords)

Thanks to:
1. Professor Mubarak Shah for sharing his course online.
2. Professor Szeliski, for making his book draft public.
3. Jan Erik Solem for his book on programming computer vision.
4. That random guy at the physics forum who told me why it works, after I had spent a full day wondering why it works.