Python + Twitter = Twython

Being bored and a massive geek, I decided to look into Python interfaces for Twitter. Twitter recommends several Python libraries here; after some deliberation I settled on Twython, as it appeared to provide a simple yet comprehensive interface to Twitter's API. Please note that although I will not be attempting any methods that require authentication, Twython fully supports OAuth, which is now Twitter's sole method of authenticating users.

Installation

To install, click the Download button at the top of the Twython GitHub homepage and select either the zipped source (recommended for Windows) or the tar-gzipped source (Linux). Navigate to the unpacked directory (containing setup.py) and issue the following command:

python setup.py install

Assuming Python is installed, this will install the Twython library.

Getting Started

Firstly, import twython!

from twython import Twython

Then create a Twython object; all API calls are handled through this object. On a side note, I recommend taking a look at the Twython source code, as it demonstrates a rather nice use of the __getattr__ method in the Twython class.

twitter = Twython()

Some Examples

Most methods that retrieve data (searching, getting trend data, etc.) return a dictionary, which will be comprised of various other dictionaries and lists. The format of these can be found in the Twitter API documentation, for example the trends/daily method in the example request section. Dictionaries are denoted with curly braces '{}' and lists with square brackets '[]' (the examples are actually in JSON format but can be read as Python).

Retrieving Daily Trends

#import Twython
from twython import Twython
 
#Instantiate Twython Object
twitter = Twython()
 
#Retrieve daily trends
results = twitter.getDailyTrends()
 
#Daily trends returns the top twenty trends for each hour in a given day
#To Unpack
#1. Iterate through the times and associated lists of trends in the 'trends' map
for time, trend_list in results['trends'].iteritems():
	print time
	#2. Iterate through the trends in each trend list
	for trend in trend_list:
		#3. print the 'query' item of the trend dictionary
		print trend['query'].encode('utf-8')

Note: Twython returns all strings as Unicode; in the example above I have used the encode method to display the strings correctly.

Retrieving Public Timeline

#import Twython
from twython import Twython
 
#Instantiate Twython Object
twitter = Twython()
 
#Get Public Timeline - returns 20 newest posts
tweets = twitter.getPublicTimeline()
 
#Iterate through tweets
for tweet in tweets:
	#print user name and text
	print '%s: %s' % (	tweet['user']['name'].encode('utf-8'),
						tweet['text'].encode('utf-8'))

A note on rate limiting

You should note that most calls to the Twitter API are subject to rate limiting; that is, there is a limit to the number of requests you can make per hour. Twython will raise an exception if you exceed this limit. More information can be found here.
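
As a rough sketch of how you might handle this (the exact exception class depends on your Twython version; recent releases expose TwythonError, but older ones may name it differently):

#import Twython and its base exception (the exception name may vary between versions)
from twython import Twython, TwythonError

#Instantiate Twython Object
twitter = Twython()

try:
	#A rate-limited (or otherwise failed) request will raise an exception
	tweets = twitter.getPublicTimeline()
except TwythonError as e:
	print 'Twitter API error: %s' % e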

 

Python & Excel (xlrd & xlwt)

My current job requires me to work extensively with Microsoft Excel, creating reports and statistics as well as manipulating large datasets. These reports often need to be reproduced on a weekly or monthly basis and as such were ideal candidates for automation. Whilst Excel's macro features are great for many repetitive tasks, I found that in scenarios where I was using data from several different sources (which often did not take the form of another Excel workbook), Python offered a more flexible and powerful solution.

Python libraries

There are two Python libraries available that I have worked with (there are probably others): xlrd, for reading data from existing Excel files, and xlwt, for creating new ones.

Both libraries can run without Excel being present; xlwt-generated workbooks are of course compatible with Excel as well as OpenOffice.

xlrd

A simple example for opening and reading an Excel sheet:

#import xlrd
import xlrd
 
#Open a workbook
workbook = xlrd.open_workbook('statistics.xls')
 
#Get a sheet by index
sheet = workbook.sheet_by_index(0)
 
#Or by name
sheet = workbook.sheet_by_name('Sheet1')
 
#Access the cell value at (2,2)
print sheet.cell_value(2,2)
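
To read more than a single cell you can loop over the sheet's rows; a short sketch, again assuming a statistics.xls workbook exists:

#import xlrd
import xlrd

#Open the workbook and grab the first sheet
workbook = xlrd.open_workbook('statistics.xls')
sheet = workbook.sheet_by_index(0)

#nrows and ncols give the sheet's dimensions
print 'Sheet is %d rows by %d columns' % (sheet.nrows, sheet.ncols)

#row_values returns a list of the values in a given row
for row_index in range(sheet.nrows):
	print sheet.row_values(row_index)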


xlwt

Here is a simple example using xlwt to generate a new spreadsheet:

#Import xlwt
import xlwt
 
#Create a new workbook object
workbook = xlwt.Workbook()
 
#Add a sheet
worksheet = workbook.add_sheet('Statistics')
 
#Add some values
for x in range(0, 10):
	for y in range(0,10):
		worksheet.write(x,y,x*y)
 
workbook.save('statistics.xls')
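
xlwt can also apply basic formatting via easyxf; the sketch below writes a bold header row (the style string, sheet name and headers here are just examples):

#Import xlwt
import xlwt

#Create a workbook and a sheet
workbook = xlwt.Workbook()
worksheet = workbook.add_sheet('Statistics')

#Create a bold cell style
bold = xlwt.easyxf('font: bold on')

#Write a bold header row followed by a row of plain values
headers = ['Name', 'Count', 'Total']
for col, header in enumerate(headers):
	worksheet.write(0, col, header, bold)
	worksheet.write(1, col, 0)

workbook.save('statistics_formatted.xls')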

A more useful example

The example below uses Python's csv module and xlwt to convert a CSV file to an Excel workbook.

#import modules
import sys, csv, xlwt
 
#Called directly?
if __name__ == '__main__':
 
	#Do we have the correct number of arguments?
	#csv2excel [csv input] [excel sheetname] [excel output]
	if len(sys.argv) != 4:
		print 'Usage: %s [csv input] [excel sheetname] [excel output]' % sys.argv[0]
		sys.exit(1)
 
	#Open the given csv file
	csv_input = csv.reader(open(sys.argv[1],'rb'),delimiter=',',quotechar='"')
 
	#Create an Excel workbook
	workbook = xlwt.Workbook()
 
	#Add a new sheet
	sheet = workbook.add_sheet(sys.argv[2])
 
	#Store the current row
	row_count = 0
 
	#Read in each row
	for row in csv_input:
		#Iterate through each column
		for col in range(len(row)):
			#Write to sheet
			sheet.write(row_count,col,row[col])
		#Increment row_count
		row_count += 1
 
	#Save the Excel workbook
	workbook.save(sys.argv[3])

More information on the csv module can be found here: http://docs.python.org/library/csv.html

Python & Oracle (cx_Oracle)

I often need to work with data stored inside an Oracle database, sometimes to export said data in some format or to compute some form of statistics. To automate this I use the cx_Oracle module for Python, which gives me full access to an Oracle database from within a Python script.

cx_Oracle

cx_Oracle can be downloaded here:

http://cx-oracle.sourceforge.net/

Simply choose the download for your operating system and version of Oracle. The documentation can be viewed here:

http://cx-oracle.sourceforge.net/html/index.html

Creating a connection

Having downloaded cx_Oracle, the first step is to import it into your script.

import cx_Oracle

To create a connection to your database, use the module's connect function. There are a few ways to specify the database you wish to connect to; my preferred method is to use a single connect string argument.

connection = cx_Oracle.connect('username/password@localhost')

Here username and password should be replaced with the appropriate Oracle login details, and localhost with the hostname or IP address of the machine where Oracle is running.
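
For reference, cx_Oracle will also accept the username, password and data source name as separate arguments; a small sketch using makedsn (the host, port and SID below are placeholders):

import cx_Oracle

#Build a DSN from host, port and SID, then connect with separate credentials
dsn = cx_Oracle.makedsn('localhost', 1521, 'ORCL')
connection = cx_Oracle.connect('username', 'password', dsn)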

The connection can be closed when no longer needed by using the close function.

connection.close()

Performing a query

To perform an SQL query, first create a new cursor object and then use its execute function with your SQL query as an argument.

cursor = connection.cursor()
 
cursor.execute('SELECT Firstname,Lastname FROM TB_NAME')

A note on security

When accepting input from a user it is naive to simply insert their input into an SQL statement, as this leaves your code vulnerable to SQL injection. Instead, the execute function allows you to use labels (bind variables) in place of the input in the SQL string, and takes additional keyword arguments (where the keywords match your labels) or a dictionary. Labels take the form of a colon ':' followed immediately by an identifier. For example:

#Keywords
#Find the firstnames that match a lastname
lastname = raw_input('Please type a lastname: ')
 
cursor.execute('''
SELECT Firstname
FROM TB_TABLE
WHERE Lastname = :last''', last=lastname)
 
#Dictionary
#Insert a new name
firstname = raw_input('Please type a firstname: ')
lastname = raw_input('Please type a lastname: ')
 
cursor.execute('''
INSERT INTO TB_NAME
VALUES(:first,:last)''', {'first':firstname,'last':lastname})

Retrieving data

Any retrieved data can then be accessed using the cursor's fetchone or fetchall functions which, as the names suggest, fetch either a single row or all available rows respectively. fetchall returns a list of tuples containing the row data, whereas fetchone returns a single tuple; both raise an exception if the executed statement was not a query.

row = cursor.fetchone()
#or
rows = cursor.fetchall()

The rows can then be iterated through and accessed in the usual python manner.

#Assuming you fetched all the rows with fetchall
for row in rows:
	print row[0], row[1]
 
#or
 
for firstname, lastname in rows:
	print firstname, lastname

Inserting or updating rows

The cursor's execute function will accept any (single) SQL statement, so you can insert and update rows in the same way as performing a query.

name_list = [('Joe','Blogs'),('Jim','Jones'),('Dan','Smith')]
 
for name in name_list:
	cursor.execute('''
	INSERT INTO TB_NAME(Firstname,Lastname) VALUES(:first,:last)
	''',
	{'first':name[0],'last':name[1]})
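
Note that inserts and updates are not made permanent until the transaction is committed (cx_Oracle does not autocommit by default). The cursor also provides an executemany function, which can insert the whole list in a single call; a brief sketch using the name_list above:

#Insert all of the names in one call
cursor.executemany('''
INSERT INTO TB_NAME(Firstname,Lastname) VALUES(:first,:last)
''', [{'first': first, 'last': last} for first, last in name_list])

#Commit the transaction so the changes are made permanent
connection.commit()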

This is of course the briefest of introductions; to learn more, read the documentation linked above, Google it and experiment (though preferably not on a live database!).

Python Ray Tracer (Introduction)

Writing a ray tracer can be an extremely rewarding project. A basic ray tracer capable of rendering a few spheres with some simple lighting is relatively easy to achieve and can then be extended at will to any desired level of complexity, with each new iteration bringing new depth and realism to the images it produces. Particularly appealing is the gratifying visual feedback from each new change or addition; this ability to admire your handiwork at each stage in such a visual way is what can make ray tracing particularly addictive.

I plan to write a ray tracer in Python, posting updates and code snippets here.

Why python?

Python isn't a natural choice of language for a ray tracer; ray tracing a complex scene to a high degree of realism is very computationally demanding, and Python, as an interpreted language, is not known for its performance. However, in this case I have chosen Python to further explore the language and for the pure simplicity and speed of development. I will be investigating methods of speeding up rendering with various optimisations and the implementation of a render farm, but more on that later.