Running an on-premise MySQL replica with an AWS RDS Aurora master

To solve our problem we run a hybrid cloud. Some of our services run in the cloud, and others run on premise in our own country, where our users are and where AWS does not provide service. To make that work we need a local database replica that can serve reads.

First we need to create a replication user on the master:

CREATE USER 'replica'@'%' IDENTIFIED BY 'slavepass'; 
GRANT REPLICATION SLAVE ON *.* TO 'replica'@'%';

Then create a new DB cluster parameter group and set binlog_format to MIXED. Modify the Aurora cluster and select the custom parameter group, then reboot the database to apply the change. Now, if you run the following command, you will see the binlog file name and position:

SHOW MASTER STATUS;

Now we need to dump the master's data to an SQL file so that we can load it into our slave database.

mysqldump --single-transaction --routines --triggers --events -h XXX.azhxxxxxx2zkqxh3j.us-east-1.rds.amazonaws.com -u bhuvi --password='xxx' my_db_name > my_db_name.sql

Depending on the size of your database this can be anywhere from gigabytes to terabytes of data, so the download will take time.

Run the following to find your MySQL configuration file:

mysqld --help --verbose | grep my.cnf

For me it is /usr/local/etc/my.cnf

vi /usr/local/etc/my.cnf

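and set server-id under the [mysqld] section to a value that differs from the master's. Restart MySQL afterwards so the new value takes effect.

[mysqld]
server-id = 2

Now let's import the data into our local MySQL: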

mysql -u root --password='xxx' my_db_name < my_db_name.sql

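Now we need to tell our slave database who the master is. Use the replica user created earlier and the file name and position reported by SHOW MASTER STATUS:

CHANGE MASTER TO
MASTER_HOST = 'RDS END Point name',
MASTER_PORT = 3306,
MASTER_USER = 'replica',
MASTER_PASSWORD = 'slavepass',
MASTER_LOG_FILE = '<File from SHOW MASTER STATUS>',
MASTER_LOG_POS = <Position from SHOW MASTER STATUS>;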
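
As a rough sketch, the statement can be assembled from the File and Position values that the master status command printed earlier. The helper name build_change_master_sql and the sample host, log file and position are illustrative assumptions, not values from this setup.

```python
def build_change_master_sql(host, user, password, log_file, log_pos, port=3306):
    """Return a CHANGE MASTER TO statement for the given replication coordinates."""

    def quote(value):
        # Escape backslashes and single quotes so the value is a valid SQL string literal.
        return "'" + str(value).replace("\\", "\\\\").replace("'", "\\'") + "'"

    return (
        "CHANGE MASTER TO\n"
        f"  MASTER_HOST = {quote(host)},\n"
        f"  MASTER_PORT = {int(port)},\n"
        f"  MASTER_USER = {quote(user)},\n"
        f"  MASTER_PASSWORD = {quote(password)},\n"
        f"  MASTER_LOG_FILE = {quote(log_file)},\n"
        f"  MASTER_LOG_POS = {int(log_pos)};"
    )


if __name__ == "__main__":
    # Hypothetical values; use your own endpoint and your own master status output.
    print(build_change_master_sql(
        host="my-cluster.example.us-east-1.rds.amazonaws.com",
        user="replica",
        password="slavepass",
        log_file="mysql-bin-changelog.000002",
        log_pos=154,
    ))
```

The printed statement can then be pasted into the mysql client on the slave.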

Now we need to start the slave.

start slave;

Setting up a (Comodo) SSL certificate for your website on AWS

We bought our SSL certificate from Comodo through name.com, as we got a better deal there. After we sent them our certificate signing request (CSR), Comodo emailed us the certificate files that match my private key. In this post I describe how I set the whole thing up on AWS.

First of all, before purchasing, I had to send them a CSR, which I generated together with a new private key using the following OpenSSL command:

openssl req \
       -newkey rsa:2048 -nodes -keyout domain.key \
       -out domain.csr

That was pretty easy. And since we bought Comodo Essential SSL Wildcard, we could buy it without verifying our company, in less than 5 minutes.

After our successful purchase Comodo emailed us the following files as a zip archive:
domain_com.crt
COMODORSADomainValidationSecureServerCA.crt
COMODORSAAddTrustCA.crt
AddTrustExternalCARoot.crt
domain_com.crt is our primary certificate, COMODORSADomainValidationSecureServerCA.crt and COMODORSAAddTrustCA.crt are the intermediate certificates, and AddTrustExternalCARoot.crt is the root certificate.

Now it gets a little tricky: our certificates are in .crt format, but we want them in .pem format, so we need to convert them.

mkdir -p ./pem
openssl x509 -in ./AddTrustExternalCARoot.crt -outform pem -out ./pem/AddTrustExternalCARoot.pem
openssl x509 -in ./COMODORSAAddTrustCA.crt -outform pem -out ./pem/COMODORSAAddTrustCA.pem
openssl x509 -in ./COMODORSADomainValidationSecureServerCA.crt -outform pem -out ./pem/COMODORSADomainValidationSecureServerCA.pem
openssl x509 -in ./domain_com.crt -outform pem -out ./domain.pem

We also need the private key that was used to create the CSR, in PEM format:

openssl rsa -in ./domain.key -outform PEM -out domain.key.pem

Let's create the chain first, reading the converted files from the pem directory:

cat ./pem/COMODORSADomainValidationSecureServerCA.pem > ./CAChain.pem
cat ./pem/COMODORSAAddTrustCA.pem >> ./CAChain.pem
cat ./pem/AddTrustExternalCARoot.pem >> ./CAChain.pem
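
The order of the certificates in the chain matters: the CA that signed the domain certificate comes first and the root comes last. A minimal Python sketch of the same concatenation, with a basic sanity check on each file, could look like this. The function name build_chain is an assumption made for illustration.

```python
from pathlib import Path


def build_chain(cert_paths, out_path):
    """Concatenate PEM certificates, in the order given, into one chain file."""
    parts = []
    for path in cert_paths:
        text = Path(path).read_text().strip()
        if "-----BEGIN CERTIFICATE-----" not in text:
            raise ValueError(f"{path} does not look like a PEM certificate")
        parts.append(text)
    # Join with newlines so every BEGIN marker starts on its own line.
    Path(out_path).write_text("\n".join(parts) + "\n")
    return out_path


# Usage, with the files converted above (signing CA first, root last):
# build_chain(
#     ["pem/COMODORSADomainValidationSecureServerCA.pem",
#      "pem/COMODORSAAddTrustCA.pem",
#      "pem/AddTrustExternalCARoot.pem"],
#     "CAChain.pem",
# )
```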

Now log in to your AWS console and search for ACM (AWS Certificate Manager). If it is your first time there, you need to click on Provision certificates.

It is time to import your certificate into ACM. In the import form, paste the contents of domain.pem under Certificate body*, the contents of domain.key.pem under Certificate private key*, and the contents of CAChain.pem under Certificate chain.
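
The same import can be scripted instead of pasted into the console. The sketch below assumes boto3 is installed and AWS credentials are configured; the ACM client is passed in, so the helper itself has no hard dependency on boto3. The helper name import_certificate is my own, while the call it wraps is the ACM client's import_certificate operation.

```python
def import_certificate(acm_client, cert_path, key_path, chain_path):
    """Upload a certificate, its private key and its chain to ACM; return the ARN."""

    def read_bytes(path):
        with open(path, "rb") as handle:
            return handle.read()

    response = acm_client.import_certificate(
        Certificate=read_bytes(cert_path),        # "Certificate body" in the console
        PrivateKey=read_bytes(key_path),          # "Certificate private key"
        CertificateChain=read_bytes(chain_path),  # "Certificate chain"
    )
    return response["CertificateArn"]


# Usage (requires boto3 and AWS credentials):
#   import boto3
#   arn = import_certificate(boto3.client("acm"),
#                            "domain.pem", "domain.key.pem", "CAChain.pem")
```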

That's it, we are done importing.

Now, if you have a load balancer, you can take advantage of this certificate. Use an existing load balancer or create a new one, add an HTTPS listener instead of HTTP, and for the certificate choose ACM and your domain.

You are good to go.

Configuring Django for centralised log monitoring with the ELK stack and custom logging options (e.g. client IP, username, request & response data)

When you are lucky enough to have so many users that you decide to roll out another cloud instance for your Django app, logging becomes a little tough, because your architecture now needs a load balancer that proxies requests to one instance or another as required. Previously we had all logs on one machine, so monitoring was easy: when someone reported an error, we went to that instance and looked for it. Now that we have multiple instances we have to check all of them, and, security risks aside, that is a lot of work. So I think it is wise to have a centralized log aggregation service.

For log management and monitoring we are using Elasticsearch, Logstash and Kibana, popularly known as the ELK stack. For this blog we will log pretty much every request and its corresponding response so that debugging gets handy for us. To serve this purpose we will leverage Django middleware and python-logstash.

First of all let’s configure our settings.py for logging:

LOGGING = {
    'version': 1,
    'disable_existing_loggers': True,
    'formatters': {
        
        'standard': {
            'format': '%(asctime)s [%(levelname)s] %(name)s: %(message)s'
        },
        'logstash': {
            '()': 'proj_name.formatter.SadafNoorLogstashFormatter',
        },
    },
    'handlers': {
        'default': {
            'level':'DEBUG',
            'class':'logging.handlers.RotatingFileHandler',
            'filename': '/var/log/proj_name/django.log',
            'maxBytes': 1024*1024*5, # 5 MB
            'backupCount': 5,
            'formatter':'standard',
        },  
        'logstash': {
            'level': 'DEBUG',
            'class': 'logstash.TCPLogstashHandler',
            'host': 'ec2*****.compute.amazonaws.com',
            'port': 5959,  # Default value: 5959
            'version': 1,  # Version of logstash event schema. Default value: 0 (for backward compatibility of the library)
            'message_type': 'logstash',  # 'type' field in logstash message. Default value: 'logstash'.
            'fqdn': False,  # Fully qualified domain name. Default value: false.
            #'tags': ['tag1', 'tag2'],  # list of tags. Default: None.
            'formatter': 'logstash',
        },

        'request_handler': {
            'level':'DEBUG',
            'class':'logging.handlers.RotatingFileHandler',
            'filename': '/var/log/proj_name/django.log',
            'maxBytes': 1024*1024*5, # 5 MB
            'backupCount': 5,
            'formatter': 'standard',
        },
    },
    'loggers': {
        'sadaf_logger': {
            'handlers': ['default', 'logstash'],
            'level': 'DEBUG',
            'propagate': True
        },
    }
}

As you can see, we are using a custom logging format. We could leave this out of the configuration, in which case LogstashFormatterVersion1 is the default format and works just fine. But I chose to define my own format because my requirements are different: I am running behind a proxy server, and I want to log who made each request and from which IP. So, roughly, my log formatter looks like the following:

from logstash.formatter import LogstashFormatterVersion1


class SadafNoorLogstashFormatter(LogstashFormatterVersion1):
    """Logstash formatter that adds the client IP and the username."""

    def format(self, record):
        # Only records logged by the middleware carry a request object,
        # so fall back to "unknown" for everything else.
        request = getattr(record, 'request', None)
        caddr = "unknown"
        username = "unknown"
        if request is not None:
            # Behind a proxy the client address arrives in X-Forwarded-For.
            caddr = request.META.get('HTTP_X_FORWARDED_FOR', caddr)
            username = str(getattr(request, 'user', username))

        message = {
            '@timestamp': self.format_timestamp(record.created),
            '@version': '1',
            'message': record.getMessage(),
            'host': self.host,

            'client': caddr,
            'username': username,

            'path': record.pathname,
            'tags': self.tags,
            'type': self.message_type,

            # Extra Fields
            'level': record.levelname,
            'logger_name': record.name,
        }

        # Add extra fields
        message.update(self.get_extra_fields(record))

        # If exception, add debug info
        if record.exc_info:
            message.update(self.get_debug_fields(record))

        return self.serialize(message)
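
When several proxies sit in front of the app, X-Forwarded-For holds a comma-separated list and the originating client is normally the first entry. Here is a small sketch of picking that entry, falling back to REMOTE_ADDR. The helper name client_ip_from_meta is hypothetical, and note that the header can be spoofed unless your proxy overwrites it.

```python
def client_ip_from_meta(meta, default="unknown"):
    """Pick the originating client address out of a Django request.META mapping."""
    forwarded = meta.get("HTTP_X_FORWARDED_FOR", "")
    if forwarded:
        # The header is a comma-separated list: "client, proxy1, proxy2".
        return forwarded.split(",")[0].strip()
    return meta.get("REMOTE_ADDR", default)


if __name__ == "__main__":
    print(client_ip_from_meta({"HTTP_X_FORWARDED_FOR": "203.0.113.7, 10.0.0.1"}))
```

Inside the formatter this could replace the direct META lookup, for example caddr = client_ip_from_meta(record.request.META).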

As our requirement is to log every request our middleware may look like following:

import logging

from django.utils.deprecation import MiddlewareMixin

request_logger = logging.getLogger('sadaf_logger')


class LoggingMiddleware(MiddlewareMixin):
    """
    Provides full logging of requests and responses
    """

    def process_request(self, request):
        # request.body may no longer be readable in process_response once the
        # view has consumed the request stream, so keep a copy on the request
        # itself (not on self: one middleware instance serves every request).
        request._initial_http_body = request.body

    def process_response(self, request, response):
        """
        Adding request and response logging
        """
        is_json_post = (request.method == "POST" and
                        request.META.get('CONTENT_TYPE') == 'application/json')
        if is_json_post or request.method == "GET":
            status_code = getattr(response, 'status_code', None)
            # A missing status code is logged as INFO so log_lvl is always set.
            if status_code and status_code >= 400:
                log_lvl = logging.ERROR
            else:
                log_lvl = logging.INFO

            request_body = getattr(request, '_initial_http_body', b'')
            # Streaming responses have no .content attribute.
            response_body = getattr(response, 'content', b'')
            request_logger.log(
                log_lvl,
                "GET: {}".format(request.GET),
                extra={
                    'request': request,
                    'request_method': request.method,
                    'request_url': request.build_absolute_uri(),
                    'request_body': request_body.decode("utf-8", errors="replace"),
                    'response_body': response_body.decode("utf-8", errors="replace"),
                    'status': status_code,
                }
            )
        return response
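
The formatter can read record.request because the standard logging module copies every key passed in extra onto the log record as an attribute. The following stand-alone sketch shows that mechanism with a handler that only keeps records in memory. The logger name extra_demo and the CapturingHandler class are made up for the illustration.

```python
import logging


class CapturingHandler(logging.Handler):
    """Keeps emitted records in memory so they can be inspected."""

    def __init__(self):
        super().__init__()
        self.records = []

    def emit(self, record):
        self.records.append(record)


logger = logging.getLogger("extra_demo")  # hypothetical logger name
logger.setLevel(logging.DEBUG)
logger.propagate = False
handler = CapturingHandler()
logger.addHandler(handler)

# Every key in `extra` is copied onto the LogRecord as an attribute.
logger.info("GET: %s", "/health", extra={"status": 200, "request_method": "GET"})

record = handler.records[0]
print(record.getMessage(), record.status, record.request_method)
```

A custom formatter receives the same record object, which is why fields such as status or request are available there.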

So pretty much you are done. Log in to your Kibana dashboard, create an index pattern for the logs you are interested in, and see your logs.

Sample AWS CodeDeploy configuration for Django

AWS has its own deployment automation tool known as CodeDeploy. With a simple command you can deploy to multiple servers whenever you change something in the code base.

Installing the CodeDeploy agent on the instance

If the CodeDeploy agent is not installed on your instance, you need to install it:

sudo yum install -y ruby wget
cd /opt
wget https://aws-codedeploy-ap-south-1.s3.amazonaws.com/latest/install
chmod +x ./install
sudo ./install auto

Create CodeDeploy Application

You have to create a CodeDeploy application with the deployment type set to In-place deployment and the deployment configuration set to CodeDeployDefault.OneAtATime.
Under the EC2 configuration, choose Amazon EC2 instances and give it a name, say Code deploy instance. Now you have to add the same tag to all the instances you want to deploy to.

Set IAM Permissions

Now that we are done with the installation, we need to set up IAM permissions:
First create an IAM group called CodeDeployGroup. This group needs the AmazonS3FullAccess and AWSCodeDeployFullAccess permissions. Create a user and add it to this group. This user only needs programmatic access. Save the access key ID and the secret access key somewhere safe.

Create a role whose trusted entity is ec2.amazonaws.com and whose policies are AWSCodeDeployRole and AmazonS3FullAccess.

Edit trust relationship to following:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": [
            "ec2.amazonaws.com",
            "codedeploy.ap-south-1.amazonaws.com"
        ]
      },
      "Action": "sts:AssumeRole"
    }
  ]
}
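
The CodeDeploy service principal in this document contains the region name, so it has to change when you deploy elsewhere. A small sketch that generates the same JSON for any region follows. The function name codedeploy_trust_policy is an assumption for illustration.

```python
import json


def codedeploy_trust_policy(region):
    """Build the trust relationship document for the given AWS region."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {
                    "Service": [
                        "ec2.amazonaws.com",
                        f"codedeploy.{region}.amazonaws.com",
                    ]
                },
                "Action": "sts:AssumeRole",
            }
        ],
    }


if __name__ == "__main__":
    # Prints the document shown above, ready to paste into the IAM console.
    print(json.dumps(codedeploy_trust_policy("ap-south-1"), indent=2))
```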

Create a new S3 bucket that the IAM user and role created above can access.

CodeDeploy configuration

My codebase structure is something like the following:

- src
  - <django project>
- scripts
  - install_dependencies
  - start_server
  - stop_server
- appspec.yml
- codedeploy_deploy.py
- deploy.sh

appspec.yml is the file that contains our hooks and configuration for code deploy.

version: 0.0
os: linux
files:
  - source: src
    destination: /home/centos/proj_name
hooks:
  BeforeInstall:
    - location: scripts/install_dependencies
      timeout: 300
      runas: root
  ApplicationStop:
    - location: scripts/stop_server
      timeout: 300
      runas: root
  ApplicationStart:
    - location: scripts/start_server
      timeout: 300
      runas: root

For Django, scripts/install_dependencies may look like the following:

sudo yum install -y gcc openssl-devel bzip2-devel wget
sudo yum install -y make git
cd /opt
command -v python3.6 || {
    wget https://www.python.org/ftp/python/3.6.3/Python-3.6.3.tgz
    tar xzf Python-3.6.3.tgz
    cd Python-3.6.3
    sudo ./configure --enable-optimizations
    sudo make altinstall
}
sudo yum install -y mysql-devel

For scripts/start_server I have the following:

cd /home/centos/evaly
pip3.6 install -r requirements.txt
nohup uwsgi --http :80 --module evaly.wsgi > /dev/null 2>&1 &

For scripts/stop_server I have the following:

pkill uwsgi

I borrowed a Python script from the Bitbucket team, which looks like the following:

# Copyright 2016 Amazon.com, Inc. or its affiliates. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License"). You may not use this file
# except in compliance with the License. A copy of the License is located at
#
#     http://aws.amazon.com/apache2.0/
#
# or in the "license" file accompanying this file. This file is distributed on an "AS IS"
# BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
# License for the specific language governing permissions and limitations under the License.
"""
A BitBucket Builds template for deploying an application revision to AWS CodeDeploy
narshiva@amazon.com
v1.0.0
"""
from __future__ import print_function
import os
import sys
from time import strftime, sleep
import boto3
from botocore.exceptions import ClientError

VERSION_LABEL = strftime("%Y%m%d%H%M%S")
BUCKET_KEY = os.getenv('APPLICATION_NAME') + '/' + VERSION_LABEL + \
    '-bitbucket_builds.zip'

def upload_to_s3(artifact):
    """
    Uploads an artifact to Amazon S3
    """
    try:
        client = boto3.client('s3')
    except ClientError as err:
        print("Failed to create boto3 client.\n" + str(err))
        return False
    try:
        client.put_object(
            Body=open(artifact, 'rb'),
            Bucket=os.getenv('S3_BUCKET'),
            Key=BUCKET_KEY
        )
    except ClientError as err:
        print("Failed to upload artifact to S3.\n" + str(err))
        return False
    except IOError as err:
        print("Failed to access artifact.zip in this directory.\n" + str(err))
        return False
    return True

def deploy_new_revision():
    """
    Deploy a new application revision to AWS CodeDeploy Deployment Group
    """
    try:
        client = boto3.client('codedeploy')
    except ClientError as err:
        print("Failed to create boto3 client.\n" + str(err))
        return False

    try:
        response = client.create_deployment(
            applicationName=str(os.getenv('APPLICATION_NAME')),
            deploymentGroupName=str(os.getenv('DEPLOYMENT_GROUP_NAME')),
            revision={
                'revisionType': 'S3',
                's3Location': {
                    'bucket': os.getenv('S3_BUCKET'),
                    'key': BUCKET_KEY,
                    'bundleType': 'zip'
                }
            },
            deploymentConfigName=str(os.getenv('DEPLOYMENT_CONFIG')),
            description='New deployment from BitBucket',
            ignoreApplicationStopFailures=True
        )
    except ClientError as err:
        print("Failed to deploy application revision.\n" + str(err))
        return False

    # Wait for the deployment to complete, polling every few seconds.
    while True:
        try:
            deploymentResponse = client.get_deployment(
                deploymentId=str(response['deploymentId'])
            )
            deploymentStatus = deploymentResponse['deploymentInfo']['status']
            if deploymentStatus == 'Succeeded':
                print("Deployment Succeeded")
                return True
            elif deploymentStatus in ('Failed', 'Stopped'):
                print("Deployment Failed")
                return False
            # Still Created, Queued or InProgress: pause before polling again
            # instead of calling the API in a tight loop.
            sleep(5)
        except ClientError as err:
            print("Failed to deploy application revision.\n" + str(err))
            return False

def main():
    if not upload_to_s3('/Users/sadafnoor/Projects/evaly/artifact.zip'):
        sys.exit(1)
    if not deploy_new_revision():
        sys.exit(1)

if __name__ == "__main__":
    main()
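
The script waits for the deployment by polling get_deployment. A hedged sketch of that wait loop as a separate helper, with a pause between polls and an overall timeout, is shown below. The name wait_for_deployment and the default intervals are my own choices, and the client is passed in so the helper can be tried without AWS access.

```python
import time


def wait_for_deployment(client, deployment_id, poll_seconds=5, timeout_seconds=1800):
    """Poll CodeDeploy until the deployment finishes; return True on success."""
    deadline = time.monotonic() + timeout_seconds
    while time.monotonic() < deadline:
        info = client.get_deployment(deploymentId=deployment_id)["deploymentInfo"]
        status = info["status"]
        if status == "Succeeded":
            return True
        if status in ("Failed", "Stopped"):
            return False
        # Created, Queued, InProgress and similar: wait before asking again.
        time.sleep(poll_seconds)
    return False  # gave up after timeout_seconds


# Usage with boto3 (assumed installed and configured):
#   import boto3
#   ok = wait_for_deployment(boto3.client("codedeploy"), "d-XXXXXXXXX")
```

The timeout keeps a stuck deployment from blocking a build pipeline forever.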

I have written a script that zips up my source code so that the Python script can upload it to S3; eventually all my EC2 instances download that zip from S3.

export APPLICATION_NAME="CodeDeployApplicationName" 
export AWS_ACCESS_KEY_ID="IAMUserKeyId"
export AWS_DEFAULT_REGION="ap-south-1"

export AWS_SECRET_ACCESS_KEY="IAMUserSecretKey"
export DEPLOYMENT_CONFIG="CodeDeployDefault.OneAtATime"

export DEPLOYMENT_GROUP_NAME="CodeDeployDeploymentGroup"
export S3_BUCKET="S3BucketName"
zip -r ../artifact.zip src/* appspec.yml scripts/*
python codedeploy_deploy.py
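
If the zip command is not available on the build machine, the archive can also be built from Python with the standard library. This is a minimal sketch, assuming the project layout shown earlier; the function name build_artifact is hypothetical. CodeDeploy expects appspec.yml at the root of the bundle, which is why paths are stored relative to the project root.

```python
import os
import zipfile


def build_artifact(project_root, out_path, include=("src", "scripts", "appspec.yml")):
    """Zip the given files and folders, storing paths relative to project_root."""
    with zipfile.ZipFile(out_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for name in include:
            full = os.path.join(project_root, name)
            if os.path.isfile(full):
                archive.write(full, arcname=name)
                continue
            # Walk directories and add every file they contain.
            for folder, _dirs, files in os.walk(full):
                for filename in files:
                    path = os.path.join(folder, filename)
                    archive.write(path, arcname=os.path.relpath(path, project_root))
    return out_path


# Usage from the project root:
#   build_artifact(".", "../artifact.zip")
```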

Integrating Amazon S3 with Django using django-storages and boto3

If we are lucky enough to get a high amount of traffic on our website, the next thing we start to think about is performance. How fast a website loads depends on how quickly we can deliver its contents from our storage to our users. In vanilla Django, all assets, including CSS, JS, files and images, are stored locally in a predefined or preconfigured folder. To enhance performance we may decide to use a third-party storage service that takes away the headache of caching, zoning, replicating and building the infrastructure of a content delivery network.

Ideally we would like a pluggable solution, something that allows us to switch from one storage to another based on configuration. django-storages is one of the cool libraries from the Django community that provides storage backends for third-party services like AWS S3, Google Cloud, FTP, Dropbox and so on. Amazon Web Services is a trusted provider that offers a large range of services, and S3 is the one that helps us store static assets. boto3 is the Python library distributed by Amazon for interacting with AWS services, including S3.

First things first: to be able to store files on S3 we need permission. In the AWS world, all sorts of permissions are managed using Identity and Access Management (IAM).
i) In the Amazon console, you will find IAM under Security, Identity & Compliance. Go there.
ii) Add a user with programmatic access.
iii) Add a new group.
iv) Set a policy for the group. Amazon provides a bunch of predefined policies. For our use case, we can choose AmazonS3FullAccess.
v) Store the user name, the access key ID and the secret access key.

In S3 we can organize our contents into multiple buckets. We can use several buckets for a single Django project, and sometimes that is more efficient, but for now we will use only one. We will need to create a bucket.

Now we need to install:

pip install boto3
pip install django-storages

We will need to add storages to INSTALLED_APPS in settings.py, along with the other django-storages settings:

INSTALLED_APPS = [
    'django.contrib.auth',
    'django.contrib.contenttypes',
    'django.contrib.sessions',
    'django.contrib.messages',
    'django.contrib.staticfiles',

    'storages',
]


AWS_ACCESS_KEY_ID = '#######'
AWS_SECRET_ACCESS_KEY = '#####'
AWS_STORAGE_BUCKET_NAME = '####bucket-name'
AWS_S3_CUSTOM_DOMAIN = '%s.s3.amazonaws.com' % AWS_STORAGE_BUCKET_NAME
#AWS_S3_OBJECT_PARAMETERS = {
#    'CacheControl': 'max-age=86400',
#}
AWS_LOCATION = 'static'

STATICFILES_DIRS = [
    os.path.join(BASE_DIR, 'mysite/static'),
]
STATIC_URL = 'https://%s/%s/' % (AWS_S3_CUSTOM_DOMAIN, AWS_LOCATION)
STATICFILES_STORAGE = 'storages.backends.s3boto3.S3Boto3Storage'
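
Hard-coding the access keys in settings.py is easy to get wrong once the file is committed. As one possible variation, the same settings can be derived from environment variables. The function name s3_static_settings and the choice of environment variable names are assumptions for this sketch, not something django-storages requires.

```python
import os


def s3_static_settings(env=os.environ):
    """Build the S3 static-file settings from environment variables."""
    bucket = env["AWS_STORAGE_BUCKET_NAME"]
    location = env.get("AWS_LOCATION", "static")
    domain = f"{bucket}.s3.amazonaws.com"
    return {
        "AWS_ACCESS_KEY_ID": env.get("AWS_ACCESS_KEY_ID"),
        "AWS_SECRET_ACCESS_KEY": env.get("AWS_SECRET_ACCESS_KEY"),
        "AWS_STORAGE_BUCKET_NAME": bucket,
        "AWS_S3_CUSTOM_DOMAIN": domain,
        "AWS_LOCATION": location,
        "STATIC_URL": f"https://{domain}/{location}/",
        "STATICFILES_STORAGE": "storages.backends.s3boto3.S3Boto3Storage",
    }


# In settings.py the result could be merged in with:
#   globals().update(s3_static_settings())
```

This also makes it visible how STATIC_URL is composed from the bucket name and AWS_LOCATION.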

Even when we do not write any HTML, CSS or JS files for our Django project, it already has a few, because the parent classes of many of the classes we use in our views may ship static template files, base HTML files, CSS and JS files. These static assets are stored in our Python library folder. To move them from the library folder to S3 we need to use the following command:

python manage.py collectstatic

The thing to notice here is that static URLs previously pointed to localhost:port, but now they point to S3 links.

{% static 'img/logo.png' %}

We may want some custom configuration for file storage; say we want to put media files in a separate directory, and we do not want a file to be overwritten by another user's upload. In that case we can define a child class of S3Boto3Storage and point DEFAULT_FILE_STORAGE at it.

#storage_backends.py

from storages.backends.s3boto3 import S3Boto3Storage

class MediaStorage(S3Boto3Storage):
    location = 'media'
    file_overwrite = False

#settings.py

DEFAULT_FILE_STORAGE = 'mysite.storage_backends.MediaStorage'

Now all our file-related fields, like models.FileField() and models.ImageField(), will upload files to our S3 bucket inside the directory ‘media’.

Now we may have different types of storage: some will store documents, some will be publicly accessible, some will be classified. Their directories could be different, and so on and so forth.

class MyPrivateFileStorage(S3Boto3Storage):
    location = 'classified'
    default_acl = 'private'
    file_overwrite = False
    custom_domain = False

If we want to use a storage other than the one defined by DEFAULT_FILE_STORAGE in settings.py, we need to pass it to the field of our model: models.FileField(storage=MyPrivateFileStorage()).