• Redis and MongoDB can be used together with good results. A company well known for running MongoDB and Redis (along with MySQL and Sphinx) is Craigslist. See this presentation from Jeremy Zawodny.
    MongoDB is interesting for persistent, document-oriented data indexed in various ways. Redis is more interesting for volatile data, or latency-sensitive semi-persistent data.
    Here are a few examples of concrete usage of Redis on top of MongoDB.
    • MongoDB does not yet have an expiration mechanism. Capped collections cannot really be used to implement a real TTL. Redis has a TTL-based expiration mechanism, making it convenient to store volatile data. For instance, user sessions are commonly stored in Redis, while user data will be stored and indexed in MongoDB.
    • Redis provides a convenient set datatype and its associated operations (union, intersection, difference on multiple sets, etc ...). It is quite easy to implement a basic faceted search or tagging engine on top of this feature, which is an interesting addition to MongoDB's more traditional indexing capabilities.
    • Redis supports efficient blocking pop operations on lists. This can be used to implement an ad-hoc distributed queuing system. It is more flexible than MongoDB tailable cursors IMO, since a backend application can listen to several queues with a timeout, transfer items to another queue atomically, etc ... If the application requires some queuing, it makes sense to store the queue in Redis, and keep the persistent functional data in MongoDB.
    • Redis also offers a pub/sub mechanism. In a distributed application, an event propagation system may be useful. This is again an excellent use case for Redis, while the persistent data are kept in MongoDB.
    Because it is much easier to design a data model with MongoDB than with Redis (Redis is more low-level), it is interesting to benefit from the flexibility of MongoDB for main persistent data, and from the extra features provided by Redis (low latency, item expiration, queues, pub/sub, atomic blocks, etc ...). It is indeed a good combination.
    Please note you should never run a Redis and MongoDB server on the same machine. MongoDB memory is designed to be swapped out, Redis is not. If MongoDB triggers some swapping activity, the performance of Redis will be catastrophic. They should be isolated on different nodes.
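
As a rough illustration of the tagging idea in the list above, here is a minimal sketch in which plain Python sets stand in for Redis sets, so it runs without a server. The names `tag_index`, `tag_document`, `find_all` and `find_any` are made up for this example. With a real Redis client the same steps would map onto the SADD, SINTER and SUNION commands.

```python
from collections import defaultdict

# tag -> set of document ids (in Redis: one set per tag, filled with SADD)
tag_index = defaultdict(set)


def tag_document(doc_id, tags):
    """Record doc_id under each tag (the role SADD plays in Redis)."""
    for tag in tags:
        tag_index[tag].add(doc_id)


def find_all(*tags):
    """Ids carrying every given tag (the role SINTER plays in Redis)."""
    sets = [tag_index[t] for t in tags]
    return set.intersection(*sets) if sets else set()


def find_any(*tags):
    """Ids carrying at least one given tag (the role SUNION plays in Redis)."""
    return set().union(*(tag_index[t] for t in tags))


tag_document("doc1", ["python", "redis"])
tag_document("doc2", ["python", "mongodb"])
tag_document("doc3", ["redis", "mongodb"])

both = find_all("python", "redis")
either = find_any("python", "redis")
```

The document ids returned by such a query would then be used to fetch the full documents from MongoDB.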

  • Here are a few examples of how to use the email package to read, write, and send simple email messages, as well as more complex MIME messages.
    First, let’s see how to create and send a simple text message:

    # Import smtplib for the actual sending function
    import smtplib
    # Import the email modules we'll need
    from email.mime.text import MIMEText
    # Open a plain text file for reading.  For this example, assume that
    # the text file contains only ASCII characters.
    fp = open(textfile, 'rb')
    # Create a text/plain message
    msg = MIMEText(fp.read())
    fp.close()
    # me == the sender's email address
    # you == the recipient's email address
    msg['Subject'] = 'The contents of %s' % textfile
    msg['From'] = me
    msg['To'] = you
    # Send the message via our own SMTP server, but don't include the
    # envelope header.
    s = smtplib.SMTP('localhost')
    s.sendmail(me, [you], msg.as_string())
    s.quit()

    And parsing RFC822 headers can easily be done by the parse(filename) or parsestr(message_as_string) methods of the Parser() class:
    # Import the email modules we'll need
    from email.parser import Parser
    #  If the e-mail headers are in a file, uncomment this line:
    #headers = Parser().parse(open(messagefile, 'r'))
    #  Or for parsing headers in a string, use:
    headers = Parser().parsestr('From: <user@example.com>\n'
            'To: <someone_else@example.com>\n'
            'Subject: Test message\n'
            '\n'
            'Body would go here\n')
    #  Now the header items can be accessed as a dictionary:
    print 'To: %s' % headers['to']
    print 'From: %s' % headers['from']
    print 'Subject: %s' % headers['subject']
    Here’s an example of how to send a MIME message containing a bunch of family pictures that may be residing in a directory:
    # Import smtplib for the actual sending function
    import smtplib
    # Here are the email package modules we'll need
    from email.mime.image import MIMEImage
    from email.mime.multipart import MIMEMultipart
    COMMASPACE = ', '
    # Create the container (outer) email message.
    msg = MIMEMultipart()
    msg['Subject'] = 'Our family reunion'
    # me == the sender's email address
    # family = the list of all recipients' email addresses
    msg['From'] = me
    msg['To'] = COMMASPACE.join(family)
    msg.preamble = 'Our family reunion'
    # Assume we know that the image files are all in PNG format
    for file in pngfiles:
        # Open the files in binary mode.  Let the MIMEImage class automatically
        # guess the specific image type.
        fp = open(file, 'rb')
        img = MIMEImage(fp.read())
        fp.close()
        # Add each image to the outer message
        msg.attach(img)
    # Send the email via our own SMTP server.
    s = smtplib.SMTP('localhost')
    s.sendmail(me, family, msg.as_string())
    s.quit()
    Here’s an example of how to send the entire contents of a directory as an email message: [1]
    #!/usr/bin/env python
    """Send the contents of a directory as a MIME message."""
    import os
    import sys
    import smtplib
    # For guessing MIME type based on file name extension
    import mimetypes
    from optparse import OptionParser
    from email import encoders
    from email.message import Message
    from email.mime.audio import MIMEAudio
    from email.mime.base import MIMEBase
    from email.mime.image import MIMEImage
    from email.mime.multipart import MIMEMultipart
    from email.mime.text import MIMEText
    COMMASPACE = ', '
    def main():
        parser = OptionParser(usage="""\
    Send the contents of a directory as a MIME message.
    Usage: %prog [options]
    Unless the -o option is given, the email is sent by forwarding to your local
    SMTP server, which then does the normal delivery process.  Your local machine
    must be running an SMTP server.
    """)
        parser.add_option('-d', '--directory',
                          type='string', action='store',
                          help="""Mail the contents of the specified directory,
                          otherwise use the current directory.  Only the regular
                          files in the directory are sent, and we don't recurse to
                          subdirectories.""")
        parser.add_option('-o', '--output',
                          type='string', action='store', metavar='FILE',
                          help="""Print the composed message to FILE instead of
                          sending the message to the SMTP server.""")
        parser.add_option('-s', '--sender',
                          type='string', action='store', metavar='SENDER',
                          help='The value of the From: header (required)')
        parser.add_option('-r', '--recipient',
                          type='string', action='append', metavar='RECIPIENT',
                          default=[], dest='recipients',
                          help='A To: header value (at least one required)')
        opts, args = parser.parse_args()
        if not opts.sender or not opts.recipients:
            parser.print_help()
            sys.exit(1)
        directory = opts.directory
        if not directory:
            directory = '.'
        # Create the enclosing (outer) message
        outer = MIMEMultipart()
        outer['Subject'] = 'Contents of directory %s' % os.path.abspath(directory)
        outer['To'] = COMMASPACE.join(opts.recipients)
        outer['From'] = opts.sender
        outer.preamble = 'You will not see this in a MIME-aware mail reader.\n'
        for filename in os.listdir(directory):
            path = os.path.join(directory, filename)
            if not os.path.isfile(path):
                continue
            # Guess the content type based on the file's extension.  Encoding
            # will be ignored, although we should check for simple things like
            # gzip'd or compressed files.
            ctype, encoding = mimetypes.guess_type(path)
            if ctype is None or encoding is not None:
                # No guess could be made, or the file is encoded (compressed), so
                # use a generic bag-of-bits type.
                ctype = 'application/octet-stream'
            maintype, subtype = ctype.split('/', 1)
            if maintype == 'text':
                fp = open(path)
                # Note: we should handle calculating the charset
                msg = MIMEText(fp.read(), _subtype=subtype)
                fp.close()
            elif maintype == 'image':
                fp = open(path, 'rb')
                msg = MIMEImage(fp.read(), _subtype=subtype)
                fp.close()
            elif maintype == 'audio':
                fp = open(path, 'rb')
                msg = MIMEAudio(fp.read(), _subtype=subtype)
                fp.close()
            else:
                fp = open(path, 'rb')
                msg = MIMEBase(maintype, subtype)
                msg.set_payload(fp.read())
                fp.close()
                # Encode the payload using Base64
                encoders.encode_base64(msg)
            # Set the filename parameter
            msg.add_header('Content-Disposition', 'attachment', filename=filename)
            outer.attach(msg)
        # Now send or store the message
        composed = outer.as_string()
        if opts.output:
            fp = open(opts.output, 'w')
            fp.write(composed)
            fp.close()
        else:
            s = smtplib.SMTP('localhost')
            s.sendmail(opts.sender, opts.recipients, composed)
            s.quit()
    if __name__ == '__main__':
        main()

    Here’s an example of how to unpack a MIME message like the one above, into a directory of files:
    #!/usr/bin/env python
    """Unpack a MIME message into a directory of files."""
    import os
    import sys
    import email
    import errno
    import mimetypes
    from optparse import OptionParser
    def main():
        parser = OptionParser(usage="""\
    Unpack a MIME message into a directory of files.
    Usage: %prog [options] msgfile
    """)
        parser.add_option('-d', '--directory',
                          type='string', action='store',
                          help="""Unpack the MIME message into the named
                          directory, which will be created if it doesn't already
                          exist.""")
        opts, args = parser.parse_args()
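        if not opts.directory:
            parser.print_help()
            sys.exit(1)
        try:
            msgfile = args[0]
        except IndexError:
            parser.print_help()
            sys.exit(1)
        try:
            os.mkdir(opts.directory)
        except OSError, e:
            # Ignore directory exists error
            if e.errno != errno.EEXIST:
                raise
        fp = open(msgfile)
        msg = email.message_from_file(fp)
        fp.close()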
        counter = 1
        for part in msg.walk():
            # multipart/* are just containers
            if part.get_content_maintype() == 'multipart':
                continue
            # Applications should really sanitize the given filename so that an
            # email message can't be used to overwrite important files
            filename = part.get_filename()
            if not filename:
                ext = mimetypes.guess_extension(part.get_content_type())
                if not ext:
                    # Use a generic bag-of-bits extension
                    ext = '.bin'
                filename = 'part-%03d%s' % (counter, ext)
            counter += 1
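            fp = open(os.path.join(opts.directory, filename), 'wb')
            # Write the decoded payload of this part to its own file
            fp.write(part.get_payload(decode=True))
            fp.close()
    if __name__ == '__main__':
        main()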
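
The comment in the script above warns that the filename taken from a message should be sanitized. One possible way to do that is sketched below, written for Python 3. The helper name `safe_filename` and the fallback name `part.bin` are assumptions made for this example, and the allowed character set is deliberately conservative rather than any kind of standard.

```python
import os
import re


def safe_filename(name, fallback="part.bin"):
    """Reduce an untrusted attachment name to a bare, harmless file name."""
    # Treat backslashes as separators too, then keep only the last component.
    name = os.path.basename(name.replace("\\", "/"))
    # Keep a conservative character set and replace everything else.
    name = re.sub(r"[^A-Za-z0-9._-]", "_", name)
    # Empty names and names made only of dots ("." or "..") are rejected.
    if not name.strip("."):
        return fallback
    return name
```

In the unpacking loop, the result of `part.get_filename()` would be passed through such a helper before being joined to the output directory.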

    Here’s an example of how to create an HTML message with an alternative plain text version: [2]
    #!/usr/bin/env python
    import smtplib
    from email.mime.multipart import MIMEMultipart
    from email.mime.text import MIMEText
    # me == my email address
    # you == recipient's email address
    me = "my@email.com"
    you = "your@email.com"
    # Create message container - the correct MIME type is multipart/alternative.
    msg = MIMEMultipart('alternative')
    msg['Subject'] = "Link"
    msg['From'] = me
    msg['To'] = you
    # Create the body of the message (a plain-text and an HTML version).
    text = "Hi!\nHow are you?\nHere is the link you wanted:\nhttp://www.python.org"
    html = """\
    <html>
      <head></head>
      <body>
        <p>Hi!<br>
           How are you?<br>
           Here is the <a href="http://www.python.org">link</a> you wanted.
        </p>
      </body>
    </html>
    """
    # Record the MIME types of both parts - text/plain and text/html.
    part1 = MIMEText(text, 'plain')
    part2 = MIMEText(html, 'html')
    # Attach parts into message container.
    # According to RFC 2046, the last part of a multipart message, in this case
    # the HTML message, is best and preferred.
    msg.attach(part1)
    msg.attach(part2)
    # Send the message via local SMTP server.
    s = smtplib.SMTP('localhost')
    # sendmail function takes 3 arguments: sender's address, recipient's address
    # and message to send - here it is sent as one string.
    s.sendmail(me, you, msg.as_string())
    s.quit()
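
To check what such a message looks like without an SMTP server, the following sketch (written for Python 3, with shortened body texts chosen for this example) builds a multipart/alternative message, serialises it, and parses it back the way a receiving program might.

```python
from email import message_from_string
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText

# Build the container and attach the plain-text part first, the HTML part last.
msg = MIMEMultipart("alternative")
msg["Subject"] = "Link"
msg["From"] = "my@email.com"
msg["To"] = "your@email.com"
msg.attach(MIMEText("Hi!\nHere is the link you wanted.", "plain"))
msg.attach(MIMEText("<p>Hi!<br>Here is the link you wanted.</p>", "html"))

# Serialise and parse again, then list the content type of every part.
parsed = message_from_string(msg.as_string())
content_types = [part.get_content_type() for part in parsed.walk()]
```

`walk()` visits the container first and then each attached part, so the list shows the order in which a mail reader would see the alternatives.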
  • Sometimes we need to create thumbnails of a certain size from an image. In this post I show you how to create a thumbnail by resizing an image in Python.
    Here is the Python code for creating a thumbnail / resizing an image:

    import Image
    def generate_and_save_thumbnail(imageFile, h, w, ext):
        image = Image.open(imageFile)
        image = image.resize((w, h), Image.ANTIALIAS)
        outFileLocation = "./images/"
        outFileName = "thumb"
        image.save(outFileLocation + outFileName + ext)

    # set the image file name here
    myImageFile = "myfile.jpg"
    # set the image file extension
    extension = ".jpg"
    # set height
    h = 200
    # set width
    w = 200

    generate_and_save_thumbnail(myImageFile, h, w, extension)

    The second argument of image.resize() is the filter. You have several options there:
    • Image.ANTIALIAS
    • Image.BICUBIC
    • Image.BILINEAR
    • Image.NEAREST

    Each uses a different algorithm. For downsampling (reducing the size) you can use ANTIALIAS. Other filters may increase the size.
    You can also look at the following Python modules:
    1. imageop
    2. imghdr
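
Note that `image.resize((w, h), ...)` stretches the image to exactly the requested size. If the proportions should be kept, the target size can be computed first. The helper below is a small sketch of that calculation in plain Python; the name `fit_within` is an assumption made for this example and is not part of PIL.

```python
def fit_within(width, height, max_width, max_height):
    """Return (w, h) scaled to fit the box while keeping the aspect ratio."""
    if width <= 0 or height <= 0:
        raise ValueError("image dimensions must be positive")
    # Use the tighter of the two constraints and never scale up.
    scale = min(float(max_width) / width, float(max_height) / height, 1.0)
    new_width = max(1, int(round(width * scale)))
    new_height = max(1, int(round(height * scale)))
    return new_width, new_height
```

The result could be passed to `image.resize()` in place of `(w, h)`, using `image.size` for the original dimensions. PIL's own `Image.thumbnail()` method performs a similar proportional shrink in place.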

  • Some websites don't allow your spider to scrape their pages unless the request carries a user-agent header. Sending a browser-like user agent makes the website treat the request as if it came from a browser. Here is a piece of code that uses the user agent 'Mozilla/5.0' to get the HTML content of a website:

    import urllib2

    url = "http://www.example.com" #write your url here
    opener = urllib2.build_opener()
    opener.addheaders = [('User-agent', 'Mozilla/5.0')]
    usock = opener.open(url)
    url = usock.geturl()
    data = usock.read()
    print data
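
For readers on Python 3, where `urllib2` no longer exists, a rough equivalent using `urllib.request` is sketched below. The helper name `build_request` is an assumption made for this example. The actual fetch is left commented out so the snippet runs without network access.

```python
import urllib.request


def build_request(url, user_agent="Mozilla/5.0"):
    """Create a Request object that carries a custom User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


req = build_request("http://www.example.com")

# To actually fetch the page (needs network access):
# with urllib.request.urlopen(req) as response:
#     data = response.read()
```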

    You can use other user agents as well. For example, the user agent my Firefox browser uses:
    "Mozilla/5.0 (X11; U; Linux i686; en-US; rv: Gecko/20061201 Firefox/ (Ubuntu-feisty)"

  • Overview

    I'm looking to create a (REST) API for my application. The initial/primary purpose will be for consumption by mobile apps (iPhone, Android, Symbian, etc). I've been looking into different mechanisms for authentication and authorization for web-based APIs (by studying other implementations). I've got my head wrapped around most of the fundamental concepts but am still looking for guidance in a few areas. The last thing I want to do is reinvent the wheel, but I'm not finding any standard solutions that fit my criteria (however my criteria may be misguided so feel free to critique that as well). Additionally, I want the API to be the same for all platforms/applications consuming it.


    I'll go ahead and throw out my objection to oAuth since I know that will likely be the first solution offered. For mobile applications (or more specifically non-web applications), it just seems wrong to leave the application (to go to a web-browser) for the authentication. Additionally, there is no way (I am aware of) for the browser to return the callback to the application (especially cross-platform). I know a couple of apps that do that, but it just feels wrong and gives a break in the application UX.


    1. User enters username/password into application.
    2. Every API call is identified by the calling application.
    3. Overhead is kept to a minimum and the auth aspect is intuitive for developers.
    4. The mechanism is secure for both the end user (their login credentials are not exposed) as well as the developer (their application credentials are not exposed).
    5. If possible, not require https (by no means a hard requirement).

    My Current Thoughts on Implementation

    An external developer will request an API account. They will receive an apikey and apisecret. Every request will require at minimum three parameters.
    • apikey - given to developer at registration
    • timestamp - doubles as a unique identifier for each message for a given apikey
    • hash - a hash of the timestamp + the apisecret
    The apikey is required to identify the application issuing the request. The timestamp acts similarly to the oauth_nonce and avoids/mitigates replay attacks. The hash ensures that the request was actually issued by the owner of the given apikey.
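
A rough sketch of what signing and verifying such a request might look like is given below. It is only one possible reading of the scheme: it uses HMAC-SHA256 keyed with the apisecret instead of a bare hash of timestamp + apisecret, it also signs the apikey, and the function names, the `lookup_secret` callback and the 300 second `max_skew` window are all assumptions made for this example.

```python
import hashlib
import hmac
import time


def sign_request(apikey, apisecret, timestamp=None):
    """Client side: build the three request parameters (apikey, timestamp, hash)."""
    if timestamp is None:
        timestamp = int(time.time())
    message = ("%s:%d" % (apikey, timestamp)).encode("utf-8")
    digest = hmac.new(apisecret.encode("utf-8"), message, hashlib.sha256).hexdigest()
    return {"apikey": apikey, "timestamp": timestamp, "hash": digest}


def verify_request(params, lookup_secret, max_skew=300, now=None):
    """Server side: recompute the hash and reject stale timestamps."""
    now = int(time.time()) if now is None else now
    timestamp = int(params["timestamp"])
    if abs(now - timestamp) > max_skew:
        return False
    secret = lookup_secret(params["apikey"])
    if secret is None:
        return False
    expected = sign_request(params["apikey"], secret, timestamp)["hash"]
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected, params["hash"])
```

The time window alone only narrows replay attacks. To make the timestamp act as a true unique identifier, the server would also need to remember which timestamps it has already accepted for each apikey.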
    For authenticated requests (ones done on the behalf of a user), I'm still undecided between going with an access_token route or a username and password hash combo. Either way, at some point a username/password combo will be required. So when it does, a hash of several pieces of information (apikey, apisecret, timestamp) + the password would be used. I'd love feedback on this aspect. FYI, they would have to hash the password first, since I don't store the passwords in my system without hashing.


    FYI, this isn't a request for how to build/structure the API in general only how to handle the authentication and authorization from solely within an application.

    Random Thoughts/Bonus Questions

    For APIs that only require an apikey as part of the request, how do you prevent someone other than the apikey owner from being able to see the apikey (since sent in the clear) and make excessive requests to push them over usage limits? Maybe I'm just over thinking this, but shouldn't there be something to authenticate that a request was verified to the apikey owner? In my case, that was the purpose of the apisecret, it is never shown/transmitted without being hashed.
    Speaking of hashes, what about md5 vs hmac-sha1? Does it really matter when all of the values are hashed with sufficiently long data (i.e. apisecret)?
    I had been previously considering adding a per user/row salt to my users' password hash. If I were to do that, how could the application create a matching hash without knowing the salt used?

  • Here's my attempt to help others looking into using RadioSelect, CheckboxSelectMultiple, or the SelectDateWidget.

    These are all excellent features and the more I use Django, the more I like it. Chalk up Forms as another part of django that blows away any other web framework I've worked with.

    from django import forms
    from django.forms.fields import ChoiceField
    from django.forms.fields import MultipleChoiceField
    from django.forms.widgets import RadioSelect
    from django.forms.widgets import CheckboxSelectMultiple
    from django.forms.extras.widgets import SelectDateWidget

    YEAR_CHOICES = ('2010','2009')
    RADIO_CHOICES = [['1','Radio 1'],['2','Radio 2']]
    CHECKBOX_CHOICES = (('1','The first choice'),('2','The Second Choice'))

    class SimpleForm(forms.Form):
       radio = forms.ChoiceField( widget=RadioSelect(), choices=RADIO_CHOICES)
       date = forms.DateField(widget=SelectDateWidget(None,YEAR_CHOICES) )
       checkboxes = forms.MultipleChoiceField(required=False,
           widget=CheckboxSelectMultiple(), choices=CHECKBOX_CHOICES)
  • Django/MongoDB Session Middleware

    Folder structure:-


    Step1: Create an app for session in your project
    Step 2: Import your db_connection
    Step 3: Just copy paste the db.py and middleware file in your app

    I hope this will be helpful for Django/MongoDB beginners who want to know about
    "Django Contrib Auth Middleware Session in MongoDB"


    import datetime
    from django.conf import settings
    from django.contrib.sessions.backends.base import SessionBase, CreateError
    from django.core.exceptions import SuspiciousOperation
    from django.utils.encoding import force_unicode
    from <<projectname>>.db import get_db_connection

    class SessionStore(SessionBase):
        """
        Implements database session store.
        """
        def __init__(self, session_key=None):
            super(SessionStore, self).__init__(session_key)

        def load(self):
            db = get_db_connection()
            s = db.session.find_one({"session_key": self.session_key,"expire_date":{"$gte":datetime.datetime.now()}})
            if not s:
                return {}
            return self.decode(force_unicode(s['session_data']))

        def exists(self, session_key):
            db = get_db_connection()
            # True when at least one stored session uses this key
            return db.session.find({"session_key": session_key}).count() > 0

        def create(self):
            while True:
                self.session_key = self._get_new_session_key()
                try:
                    # Save immediately to ensure we have a unique entry in the
                    # database.
                    self.save(must_create=True)
                except CreateError:
                    # Key wasn't unique. Try again.
                    continue
                self.modified = True
                self._session_cache = {}
                return

        def save(self, must_create=False):
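            """
            Saves the current session data to the database. If 'must_create' is
            True, a database error will be raised if the saving operation doesn't
            create a *new* entry (as opposed to possibly updating an existing
            entry).
            """
            db = get_db_connection()
            sesobj = db.session.find_one({"session_key":self.session_key})
            if sesobj:
                # The key already exists: refuse when a new entry was required.
                if must_create:
                    raise CreateError
                sesobj['session_data'] = self.encode(self._get_session(no_load=must_create))
                sesobj['expire_date'] = self.get_expiry_date()
                # Assumes the legacy pymongo Collection.save() call.
                db.session.save(sesobj)
            else:
                obj = {"session_key":self.session_key,
                       "session_data": self.encode(self._get_session(no_load=must_create)),
                       "expire_date": self.get_expiry_date()}
                # Assumes the legacy pymongo Collection.insert() call.
                db.session.insert(obj)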

        def delete(self, session_key=None):
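            if session_key is None:
                if self._session_key is None:
                    return
                session_key = self._session_key
            try:
                db = get_db_connection()
                # Assumes the legacy pymongo Collection.remove() call.
                db.session.remove({"session_key": session_key})
            except Exception, exception:
                pass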


    import time
    from django.conf import settings
    from django.utils.cache import patch_vary_headers
    from django.utils.http import cookie_date

    from <<projectname>>.db import get_db_connection

    class SessionMiddleware(object):
        def process_request(self, request):
            session_key = request.COOKIES.get(settings.SESSION_COOKIE_NAME, None)
            request.session = SessionStore(session_key)

        def process_response(self, request, response):
            """
            If request.session was modified, or if the configuration is to save the
            session every time, save the changes and set a session cookie.
            """
            try:
                accessed = request.session.accessed
                modified = request.session.modified
            except AttributeError:
                pass
            else:
                if accessed:
                    patch_vary_headers(response, ('Cookie',))
                if modified or settings.SESSION_SAVE_EVERY_REQUEST:
                    if request.session.get_expire_at_browser_close():
                        max_age = None
                        expires = None
                    else:
                        max_age = request.session.get_expiry_age()
                        expires_time = time.time() + max_age
                        expires = cookie_date(expires_time)
                    # Save the session data and refresh the client cookie.
                    request.session.save()
                    response.set_cookie(settings.SESSION_COOKIE_NAME,
                            request.session.session_key, max_age=max_age,
                            expires=expires, domain=settings.SESSION_COOKIE_DOMAIN,
                            secure=settings.SESSION_COOKIE_SECURE or None,
                            httponly=settings.SESSION_COOKIE_HTTPONLY or None)
            return response

    code written by,
    T.Thanga Vignesh Raja

  • Quickstart usage of various features:

    import _mssql
    conn = _mssql.connect(server='SQL01', user='user', password='password', \
        database='mydatabase')
    conn.execute_non_query('CREATE TABLE persons(id INT, name VARCHAR(100))')
    conn.execute_non_query("INSERT INTO persons VALUES(1, 'John Doe')")
    conn.execute_non_query("INSERT INTO persons VALUES(2, 'Jane Doe')") 

    # how to fetch rows from a table
    conn.execute_query('SELECT * FROM persons WHERE salesrep=%s', 'John Doe')
    for row in conn:
        print "ID=%d, Name=%s" % (row['id'], row['name']) 

    # examples of other query functions
    numemployees = conn.execute_scalar("SELECT COUNT(*) FROM employees")
    numemployees = conn.execute_scalar("SELECT COUNT(*) FROM employees WHERE name LIKE 'J%'")    # note that '%' is not a special character here
    employeedata = conn.execute_row("SELECT * FROM employees WHERE id=%d", 13) 

    # how to fetch rows from a stored procedure
    conn.execute_query('sp_spaceused')   # sp_spaceused without arguments returns 2 result sets
    res1 = [ row for row in conn ]       # 1st result
    res2 = [ row for row in conn ]       # 2nd result 

    # how to get an output parameter from a stored procedure
    sqlcmd = """
    DECLARE @res INT
    EXEC usp_mystoredproc @res OUT
    SELECT @res
    """
    res = conn.execute_scalar(sqlcmd) 

    # how to get more output parameters from a stored procedure
    sqlcmd = """
    DECLARE @res1 INT, @res2 TEXT, @res3 DATETIME
    EXEC usp_getEmpData %d, %s, @res1 OUT, @res2 OUT, @res3 OUT
    SELECT @res1, @res2, @res3
    """
    res = conn.execute_row(sqlcmd, (13, 'John Doe')) 

    # examples of queries with parameters
    conn.execute_query('SELECT * FROM empl WHERE id=%d', 13)
    conn.execute_query('SELECT * FROM empl WHERE name=%s', 'John Doe')
    conn.execute_query('SELECT * FROM empl WHERE id IN (%s)', ((5, 6),))
    conn.execute_query('SELECT * FROM empl WHERE name LIKE %s', 'J%')
    conn.execute_query('SELECT * FROM empl WHERE name=%(name)s AND city=%(city)s', \
        { 'name': 'John Doe', 'city': 'Nowhere' } )
    conn.execute_query('SELECT * FROM cust WHERE salesrep=%s AND id IN (%s)', \
        ('John Doe', (1, 2, 3)))
    conn.execute_query('SELECT * FROM empl WHERE id IN (%s)', (tuple(xrange(4)),))
    conn.execute_query('SELECT * FROM empl WHERE id IN (%s)', \
        (tuple([3, 5, 7, 11]),)) 

    Please note the usage of iterators and the ability to access results by column name. Also note that the parameters of the connect method have different names than in the pymssql module.

    An example of exception handling:

    import _mssql

    try:
        conn = _mssql.connect(server='SQL01', user='user', password='password', \
            database='mydatabase')
        conn.execute_non_query('CREATE TABLE t1(id INT, name VARCHAR(50))')
    except _mssql.MssqlDatabaseException,e:
        if e.number == 2714 and e.severity == 16:
            # table already existed, so quieten the error
            pass
        else:
            raise # re-raise real error