Building a Twitter Filter With CherryPy, Redis, and tweetstream

Bulkan Evcimen
March 18th, 2010 · 3 min read

All the code is available at

Since reading this post by Simon Willison I’ve been interested in Redis and have been following its development. After a quick play around with Redis I went looking for a project that uses it as a data store. I then came across this blog post by Mirko Froehlich, in which he shows the steps and code to create a Twitter filter using Redis as the datastore and Sinatra as the web app. This blog post explains how I created a similar Twitter filter in Python with the tools listed below.


  • tweetstream - provides the interface to the Twitter Streaming API
  • CherryPy - used for handling the web app side, no need for an ORM
  • Jinja2 - HTML templating
  • jQuery - for doing the AJAXy stuff and visual effects
  • redis-py - Python client for Redis
  • Redis - the “database”, look here for the documentation on how to install it

Retrieving tweets

The first thing we need to do is retrieve tweets from the Twitter Streaming API. Thankfully there is already a Python module called tweetstream that provides a nice interface to it. For more information about tweetstream, look at the Cheeseshop page for its usage guide.

Here is the code for the filter daemon (the module the web app later imports as filter_daemon). When executed as a script from the command line it starts streaming tweets from Twitter that contain the words “why”, “how”, “when”, “lol” or “feeling”. A tweet is only kept if it ends in a question mark and does not contain an “@”.

import time

import redis
import tweetstream

from datetime import datetime

# Prefer simplejson when it is installed, otherwise use the standard library.
try:
    import simplejson as json
except ImportError:
    import json


class FilterRedis(object):

    key = "tweets"
    r = redis.Redis()
    r.connect()
    num_tweets = 20
    trim_threshold = 100

    def __init__(self):
        self.trim_count = 0

    def push(self, data):
        self.r.push(self.key, data, True)

        # Trim once every trim_threshold pushes rather than on every push.
        self.trim_count += 1
        if self.trim_count >= self.trim_threshold:
            self.r.ltrim(self.key, 0, self.num_tweets)
            self.trim_count = 0

    def tweets(self, limit=15, since=0):
        # Read at most `limit` items, then keep those received after `since`.
        data = self.r.lrange(self.key, 0, limit - 1)
        return [json.loads(x) for x in data if int(json.loads(x)['received_at']) > since]


if __name__ == '__main__':
    fr = FilterRedis()

    words = ["why", "how", "when", "lol", "feeling"]

    username = "your twitter username"
    password = "password for twitter account"

    with tweetstream.TrackStream(username, password, words) as stream:
        for tweet in stream:
            # Skip stream messages that are not tweets.
            if 'text' not in tweet: continue
            # Skip mentions/replies and anything that is not a question.
            if '@' in tweet['text'] or not tweet['text'].endswith('?'):
                continue
            fr.push(json.dumps( {'id':tweet['id'],
                                 'text':tweet['text'],
                                 'username':tweet['user']['screen_name'],
                                 'userid':tweet['user']['id'],
                                 'name':tweet['user']['name'],
                                 'profile_image_url':tweet['user']['profile_image_url'],
                                 'received_at':time.time()}
                              )
                   )
            print tweet['user']['screen_name'],':', tweet['text'].encode('utf-8')
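
The filtering rule inside the streaming loop can be pulled out into a small predicate, which makes it easy to try against sample data without a Twitter connection. This is only a sketch: the name keep_tweet and the sample dictionaries are hypothetical and not part of the original script.

```python
def keep_tweet(tweet):
    """Return True if the tweet passes the same checks as the streaming loop.

    keep_tweet is a hypothetical helper name, not part of the original script.
    """
    text = tweet.get('text')
    if text is None:
        # Some stream messages (deletion notices, for example) carry no text.
        return False
    # Skip replies and mentions, and keep only tweets phrased as questions.
    return '@' not in text and text.endswith('?')


# Made-up sample messages, shaped like the dictionaries the loop receives.
samples = [
    {'text': 'why is the sky blue?'},
    {'text': '@bob how are you?'},
    {'text': 'lol that was great'},
    {'delete': {'status': {'id': 1}}},
]
kept = [t['text'] for t in samples if keep_tweet(t)]
```

Only the first sample survives: the second contains an “@”, the third does not end in a question mark, and the fourth has no text at all.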

In this script I define a class, FilterRedis, which I use to abstract some methods that are used both by this script and later by the web app itself.

The important part of this class is the push method, which pushes data onto a Redis list. It also keeps a count of pushed items, and once that count reaches the threshold of 100 it trims the list so that only the elements at indices 0 through 20 are kept and the rest (the oldest tweets) are dropped.
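
The counter-and-trim behaviour of push can be tried without a Redis server. The following is a minimal in-memory sketch, not the real class: CappedList is a hypothetical name, and it assumes new items are placed at index 0, which is what the index template implies by treating tweets.0 as the newest tweet.

```python
class CappedList(object):
    """In-memory stand-in for the Redis list, used only to illustrate trimming.

    CappedList is a hypothetical name. It assumes new items go to index 0.
    """
    num_tweets = 20
    trim_threshold = 100

    def __init__(self):
        self.items = []
        self.trim_count = 0

    def push(self, data):
        self.items.insert(0, data)  # newest item first (assumption)
        self.trim_count += 1
        if self.trim_count >= self.trim_threshold:
            # Same effect as LTRIM key 0 num_tweets: both ends are inclusive,
            # so num_tweets + 1 items survive.
            self.items = self.items[:self.num_tweets + 1]
            self.trim_count = 0


capped = CappedList()
sizes = []
for i in range(250):
    capped.push(i)
    sizes.append(len(capped.items))
```

Under these assumptions the list is not held at exactly 20 entries. It is cut back to 21 entries on every 100th push and grows freely in between, which is why tweets() applies its own limit when reading.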

The schema for the tweet data that gets pushed into the Redis list is a dictionary of values that gets jsonified (we could probably use the new Redis hash type instead):

{
  "id":"the tweet id",
  "text":"text of the tweet",
  "username":"",
  "userid":"userid",
  "name": "name of the twitter user",
  "profile_image_url": "url to profile image",
  "received_at": time.time()
}

‘received_at’ is important because we will be using that to find new tweets to display in the web app.
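
To make the role of received_at concrete, here is a small sketch of the "newer than since" selection on a plain list of JSON strings. The helper name newer_than and the sample tweets are hypothetical. Unlike the tweets method, which truncates received_at with int() before comparing, this sketch compares the float timestamps directly and decodes each item only once.

```python
import json
import time


def newer_than(raw_items, since=0):
    """Decode JSON strings and keep those received after `since`.

    newer_than is a hypothetical helper, not part of the original script.
    """
    decoded = (json.loads(raw) for raw in raw_items)
    return [tweet for tweet in decoded if tweet['received_at'] > since]


now = time.time()
# Made-up stored items, newest first.
raw_items = [
    json.dumps({'id': 3, 'text': 'newest?', 'received_at': now}),
    json.dumps({'id': 2, 'text': 'middle?', 'received_at': now - 10}),
    json.dumps({'id': 1, 'text': 'oldest?', 'received_at': now - 20}),
]
fresh = newer_than(raw_items, since=now - 15)
```

With since set 15 seconds in the past, only the two most recent tweets are returned. With the default since=0, everything is returned.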

Web App

I picked CherryPy to write the web application because I wanted to learn it for the future, when I need to write small web frontends that don't need an ORM. Also, CherryPy has a built-in HTTP server that is sufficient for websites with small loads. I initially used it to run the app, which is now being run with mod_python. For templating I used Jinja2 because its syntax is similar to the Django templating language that I am familiar with.

The following is the code for the CherryPy application.

import time
import os

import cherrypy
import jinja2

from filter_daemon import *

# Use the standard library json module when present, otherwise simplejson.
try:
    import json
except ImportError:
    import simplejson as json

encoder = json.JSONEncoder()

def jsonify_tool_callback(*args, **kwargs):
    # Replace the response body with its JSON encoding.
    response = cherrypy.response
    response.headers['Content-Type'] = 'application/json'
    response.body = encoder.iterencode(response.body)
# Assumed tool name: "jsonify", taken from the name of the callback.
cherrypy.tools.jsonify = cherrypy.Tool('before_finalize', jsonify_tool_callback, priority=30)


root_path = os.path.dirname(__file__)

# jinja2 template renderer
env = jinja2.Environment(loader=jinja2.FileSystemLoader(os.path.join(root_path, 'templates')))
def render_template(template,**context):
    template = env.get_template(template+'.jinja')
    return template.render(context)


class Questions(object):

    _cp_config = {
        'tools.encode.on':True,
        'tools.encode.encoding':'utf8',
    }

    fr = FilterRedis()

    @cherrypy.expose()
    def index(self):
        tweets = self.fr.tweets()
        return render_template('index', tweets=tweets)

    @cherrypy.expose()
    @cherrypy.tools.jsonify()  # assumed decorator, matching the tool above
    def latest(self, since, nt):
        if not since:
            since = 0

        # Assumed call: only the since argument is known, so limit keeps its default.
        tweets = self.fr.tweets(since=float(since))
        return render_template('tweets', tweets=tweets)


if __name__ == '__main__':
    cherrypy.quickstart(Questions())

The index method of the web app gets all the tweets from Redis. The other exposed method is latest, which accepts an argument since that is used to get only newer tweets (since is the received_at value of the latest tweet the page already has). nt is used to create a different URL each time so that IE doesn’t cache the response. This method returns JSON and is reachable at /latest.
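
The page script reads the response of latest as data[0], which suggests the JSON payload is an array holding one string of rendered HTML. The sketch below reproduces just that encoding step with the standard library. It assumes the response body reaches the tool as a one-element list, and the HTML fragment is a made-up placeholder.

```python
import json

encoder = json.JSONEncoder()

# Assumed shape of the response body before the tool runs: a one-element
# list holding the rendered HTML fragment (placeholder markup here).
body = ['<div class="latest"><h1>someuser</h1></div>']

# iterencode yields the JSON text in chunks; joined, it is a JSON array.
payload = ''.join(encoder.iterencode(body))

# What the browser-side callback receives after parsing the payload.
data = json.loads(payload)
fragment = data[0]
```

Here fragment is the same HTML string that went in, ready to be prepended to the content div.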

The templates are located in a directory called templates :)

Here is the template for the root/index of the site; index.jinja

<html xmlns="">
  <head>
    <title>Queshuns</title>
    <script type="text/javascript" src=""></script>
  </head>
  <body>
    <script type="text/javascript">
      function refreshTweets() {
        $.getJSON('/latest', {since: window.latestTweet, nt:(new Date()).getTime()},
          function(data) {
            $('#content').prepend(data[0]);
            $('.latest').slideDown('slow', function() { $(this).removeClass('latest');});
            $('#content div:gt(50)').remove();
            setTimeout(refreshTweets, 10000);
          });
      };

      $(function() { setTimeout(refreshTweets, 10000); });
    </script>

    <div id='content'>
      {% for tweet in tweets %}
      <div>
        <h1><a href="{{ tweet.username }}/status/{{ tweet.id }}" class="more">{{ tweet.username }}</a> </h1>
        <div>
          <p>
            <img height=45 width=48 src="{{ tweet.profile_image_url }}">
            <span> {{ tweet.text }} </span>
          </p>
        </div>
      </div>
      {% endfor %}
    </div>

    {% if tweets %}
    <script type="text/javascript">
      window.latestTweet = {{ tweets.0.received_at }};
    </script>
    {% else %}
    <script type="text/javascript">
      window.latestTweet = 0;
    </script>
    {% endif %}
  </body>
</html>

This template renders a list of tweets and also assigns the first tweet's received_at value to a variable on the window object. The refreshTweets function passes that value to /latest as a GET parameter. refreshTweets tries to get new tweets, prepends them to the content div and then slides the latest tweets into view. This is the template used to render the HTML for the latest tweets:

{% if tweets %}
<div class='latest' style='display:none;'>
{% for tweet in tweets %}
  <div>
    <h1><a href="{{ tweet.username }}/status/{{ tweet.id }}" class="more">{{ tweet.username }}</a> </h1>
    <div class="entry">
      <p>
        <img align='left' height=45 width=48 src="{{ tweet.profile_image_url }}">
        <span> {{ tweet.text }}</span>
      </p>
    </div>
  </div>
{% endfor %}
</div>
<script type="text/javascript">
  window.latestTweet = {{ tweets.0.received_at }};
</script>
{% endif %}

I explicitly set the latest div to “display: none” so that I can animate it.

Now we should be able to run the filter daemon to start retrieving tweets, then start the CherryPy application to look at the web app. In your browser go to http://localhost:8080/ and, if everything went correctly, you should see a list of tweets that updates every 10 seconds.

That's it. Hope this was helpful.
