Week 2 Python Dev Continued

Tools and Libraries

This section gives an overview of some important tools and libraries that the student will use throughout the course. Learning these tools early helps the student get up to speed before digging into the inner workings of python.

The Python Interpreter

The Python interpreter offers an interactive way of testing your python code. It allows a user to type in single lines or small blocks of code and evaluate them immediately. This can be very useful for learning and experimenting with python. It's great for quick, on-the-fly scripting, and can also be a great way of accessing python documentation.

To start the python interpreter from the Unix command line, start a new session and type 'python'. A new prompt will appear where you can enter python statements.

jcp20@wireless.oit.duke.edu:~> python
Python 2.5.1 (r251:54863, Jan 17 2008, 19:35:16) 
[GCC 4.0.1 (Apple Inc. build 5465)] on darwin
Type "help", "copyright", "credits" or "license" for more information.

From here, the user can begin entering python code. Entering python code in the interpreter works just like writing code in a text editor, with only a few little tricks to get used to. The user can write simple python statements, like so.

>>> print 'Hello, world!'
Hello, world!
>>> a = 8
>>> b = 12
>>> c = a + b
>>> c
20

Full code blocks can be entered into the interpreter. If the user enters a colon at the end of a statement, the interpreter won't evaluate the current line. Instead, it will create a new line underneath the current line in which the user can continue the code block. Note that whitespace is still recognized in the interpreter - for full code statements, tabbing works just like in a text editor.

>>> t = True
>>> f = False
>>> if t:
...     print 'olpc@duke'
... else:
...     print 'this will not print'
...
olpc@duke

The user can also define full functions and even classes if they want to. Press Enter on an empty line to finish a block.

>>> def fibonacci(n):
...     if n == 0:
...         return 0
...     if n == 1:
...         return 1
...     else:
...         return fibonacci(n-1) + fibonacci(n-2)
...
>>> for x in range(5):
...     fibonacci(x)
...
0
1
1
2
3

The interpreter will give an error if you type in something wrong.

>>> print not_defined
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'not_defined' is not defined

The python interpreter can be tremendously useful for testing your code. Instead of writing test classes or separate scripts to test new blocks of code, the programmer can get immediate feedback about whether their code will work, and if it's working the way they want.

The interpreter is also the quickest way to get information about python modules (we'll talk more about modules later in this section). Modules can be imported directly into the session while you're using the interpreter. Once the module is imported, you can use the module's variables, functions, and classes as if you had written them yourself in the session.

>>> import random
>>> random.randint(0, 10)  # returns a random integer from 0 to 10, inclusive
>>> from math import cos
>>> cos(0)
1.0

Once a module has been imported, it's easy to get information about it. The 'dir()' and 'help()' functions are two great examples. For instance, what if I want a list of all the variables and methods that are included in the sys module?

>>> import sys
>>> dir(sys)
['__displayhook__', '__doc__', '__excepthook__', '__name__', '__stderr__', 
'__stdin__', '__stdout__', '_getframe', 'api_version', 'argv', 'builtin_module_names', 
'byteorder', 'call_tracing', 'callstats', 'copyright', 'displayhook', 'exc_clear', 'exc_info', 
'exc_type', 'excepthook', 'exec_prefix', 'executable', 'exit', 'getcheckinterval', 
'getdefaultencoding', 'getdlopenflags', 'getfilesystemencoding', 'getrecursionlimit', 
'getrefcount', 'hexversion', 'maxint', 'maxunicode', 'meta_path', 'modules', 'path', 
'path_hooks', 'path_importer_cache', 'platform', 'prefix', 'ps1', 'ps2', 'setcheckinterval', 
'setdlopenflags', 'setprofile', 'setrecursionlimit', 'settrace', 'stderr', 'stdin', 'stdout', 
'version', 'version_info', 'warnoptions']

>>> help(sys)

Help on built-in module sys:

    This module provides access to some objects used or maintained by the
    interpreter and to functions that interact strongly with the interpreter.

Python modules (aka libraries)

Python modules are collections of python code that can be imported and used by the programmer. They are libraries of information much like those found in other programming languages. Python comes stock with a multitude of useful modules that you can use in your work for this class. Countless more can be downloaded on the internet. What's more, nearly all of the code that comes packaged in these libraries can be viewed by the programmer. By going to the directory where python is installed, you can view nearly all of the code that is used to implement python's many modules. Those that are curious can see exactly how the module's methods work.

A quick note: most python modules are written in python, and their code is freely available for you to view and (if you want) to change. A select few, however, are written in C. The standard python interpreter is itself implemented in C, so a few of its core libraries are implemented in C as well.

The sys module has functions/variables/objects that are closely related to the python interpreter, as well as some basic system information. The sys module is especially important when running scripts from the command line. Take note that the sys module, because of its closeness to the python language, is one of those few modules that are implemented in C.

Useful variables:

modules - (dict) maps the names of the modules that have been loaded in the current session to the module objects
argv - (list) a list of command line arguments (much like the String[] args passed into the main method in Java); argv[0] is the name of the script
path - (list) the directories that python searches when importing modules
platform - (string) an identifier for the platform the system is running on, such as 'darwin'
maxint - (int) the largest value a regular integer can hold. Larger values are still allowed; python automatically stores them as long integers.

Useful functions:
exit() - exits the interpreter or script
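
As a rough illustration of the variables above, the following sketch collects a few of them into a summary. The function name describe_session and the dictionary keys are made up for this example. The print calls use parentheses so that the script also runs on Python 3, unlike the Python 2 sessions shown elsewhere on this page.

```python
import sys


def describe_session(argv):
    """Return a short summary of the command line and the running interpreter.

    describe_session is a hypothetical helper written for this example.
    """
    script_name = argv[0] if argv else '(no script)'
    return {
        'script': script_name,                    # argv[0] is the script name
        'arguments': argv[1:],                    # everything after the script name
        'platform': sys.platform,                 # e.g. 'darwin' or 'linux2'
        'modules_loaded': len(sys.modules),       # sys.modules is a dict
        'search_path_entries': len(sys.path),     # sys.path is a list
    }


summary = describe_session(sys.argv)
for key in sorted(summary):
    print('%s: %s' % (key, summary[key]))
```

Saving this as a script and running it with a few extra words on the command line should show those words under arguments.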

The os module contains lots of tools for working with the operating system. It abstracts a lot of platform-specific information, making it easier to run your code on multiple operating systems. The os module is usually used in close conjunction with the sys module - you'll often see these two modules imported at the same time.

Note: Be sure to import the os module using import os, not from os import *. This will prevent the os module's open() function from overriding python's built-in open() function (discussed later in the Python I/O section).

Useful variables:
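name - (string) the name of the operating-system-dependent module in use, such as 'posix' or 'nt'
curdir - (string) the string the operating system uses to refer to the current directory ('.' on Unix and Windows). To get the full pathname of the directory the python session is working in, call getcwd().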

Useful functions:
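chmod(), chown(), chroot() - change a file's permissions, change a file's owner and group, and change the root directory of the current process. The first two are useful for file handling.
fork() - forks the current process into a new child process (Unix only).
listdir() - gives a list of the names of the entries in a directory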
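
The short sketch below shows one way these pieces might fit together. The helper name list_directory is invented for this example, and os.path.join and os.path.isdir come from the os.path submodule, which is not covered above. The print calls use parentheses so the script also runs on Python 3.

```python
import os


def list_directory(path=os.curdir):
    """Return (name, kind) pairs for each entry in path, sorted by name.

    list_directory is a hypothetical helper written for this example.
    """
    entries = []
    for name in sorted(os.listdir(path)):
        # listdir() returns bare names, so join them back onto the directory
        full_path = os.path.join(path, name)
        kind = 'dir' if os.path.isdir(full_path) else 'file'
        entries.append((name, kind))
    return entries


print('OS module name: %s' % os.name)
print('Working directory: %s' % os.getcwd())
for name, kind in list_directory():
    print('%-4s %s' % (kind, name))
```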

The random module contains tools for random number generation, and shuffling and randomizing lists.

Useful functions:
randint() - generates a random integer between two bounds, with both bounds included
uniform() - generates a random floating-point number between two bounds
shuffle() - shuffles a list in place
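
Here is a small sketch of the three functions in use. Calling random.seed() with a fixed value is an addition for this example; it makes repeated runs produce the same sequence, which is handy when testing. The variable names are arbitrary.

```python
import random

random.seed(42)  # fixed seed (an assumption for this example) for repeatable runs

roll = random.randint(1, 6)           # integer from 1 to 6, both ends included
fraction = random.uniform(0.0, 1.0)   # floating-point number between the bounds

deck = list(range(10))
random.shuffle(deck)                  # reorders deck in place and returns None

print('roll: %d' % roll)
print('fraction: %f' % fraction)
print('shuffled deck: %s' % deck)
```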

The math module contains most of your basic math variables and functions, including trigonometric, logarithmic, and power functions.

Useful variables (mostly self explanatory):
pi - the mathematical constant pi
e - the base of the natural logarithm

Useful functions (again, mostly self explanatory):
log - the natural logarithm
log10 - the base-10 logarithm
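
A quick sketch of the logarithm functions follows. The optional second argument to math.log, which selects the base, is not mentioned above. Results are floating-point numbers, so compare them with a small tolerance rather than with ==.

```python
import math

natural = math.log(math.e)     # natural logarithm
base10 = math.log10(1000)      # base-10 logarithm
base2 = math.log(8, 2)         # optional second argument selects the base
cosine = math.cos(math.pi)     # trigonometric functions take radians

print('log(e)    = %f' % natural)
print('log10(1000) = %f' % base10)
print('log(8, 2) = %f' % base2)
print('cos(pi)   = %f' % cosine)
```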

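The string module provides constants and functions for string manipulation. Most of these functions are also available as methods on string objects, so 'abc'.upper() does the same thing as string.upper('abc'). Check this module before you do any string manipulation - what you're trying to do might have already been done.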

Useful variables:
whitespace - a string containing all characters considered whitespace
lowercase - a string containing all characters considered lowercase letters
uppercase - a string containing all characters considered uppercase letters
letters - a string containing all characters considered letters
digits - a string containing all characters considered decimal digits
hexdigits - a string containing all characters considered hexadecimal digits
octdigits - a string containing all characters considered octal digits
punctuation - a string containing all characters considered punctuation
printable - a string containing all characters considered printable
These descriptions were taken from the string manual file

Useful functions:
upper - uppercases the string's characters
lower - lowercases the string's characters
split - splits the string into a list based on a delimiter (by default, this is whitespace)
join - joins a list into a string based on a delimiter
find - finds the index of a string within another string (or -1 if it is not found)
count - counts the number of times a string appears within another string
replace - replaces a substring within a string with another substring
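
The sketch below tries out the functions listed above. It calls them as methods on string objects rather than as functions of the string module, because the method form works on both Python 2 and Python 3. The sample sentence and variable names are made up for this example.

```python
import string

sentence = 'the quick brown fox jumps over the lazy dog'

words = sentence.split()                        # split on whitespace by default
shouted = sentence.upper()                      # uppercase copy of the string
hyphenated = '-'.join(words)                    # join the list back into one string
first_the = sentence.find('the')                # index of the first match, or -1
the_count = sentence.count('the')               # number of non-overlapping matches
replaced = sentence.replace('lazy', 'sleepy')   # returns a new string

# string.digits is one of the constants listed above
only_digits = [ch for ch in 'xo-2008' if ch in string.digits]

print(words)
print(hyphenated)
print(replaced)
print(only_digits)
```

Strings are immutable, so each of these calls returns a new value and leaves sentence unchanged.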

Non-stock Modules
NumPy/SciPy - comprehensive set of tools for mathematics and scientific programming, often used as a MATLAB replacement
PyGame - tools for building games
Tkinter - basic GUI tools (bundled with most python installations, listed here alongside the other GUI options)
PyGTK - better GUI tools (more on this later)
BeautifulSoup - HTML/XML parser

Application Development

Object-Oriented Programming

Object-Oriented programming in Python is very straightforward. Defining a class is very simple.

class MyClass(InheritedClass):
    """This is called a docstring. It is a triple-quoted string immediately
    following the class statement. Here you should document what your class does."""

    def __init__(self):
        self.data = []

    def anotherMethod(self):
        pass

Here you can see the anatomy of a class. Any class whose methods you'd like to inherit is listed in parentheses after the class name in the class statement. The class will inherit all methods from that class. You can then override those methods with your own.

The docstring is an important item that you should remember to include in all your class definitions. It gives other programmers an idea of how to use your class. Docstrings can be used with function definitions as well as class definitions. The docstring can be used inside a python terminal like so:

>>> x = MyClass()  # this creates an instance of MyClass with the __init__ method
>>> x.__doc__ 
This is called a docstring. It is a triple-quoted...

Notice the __init__ method in our class definition. This is the method that python calls to initialize each new instance of a class. __init__ always receives self as its first parameter. You can add other parameters after self if you want to pass values in when the object is created.

def __init__(self, a, b, c):
     self.a = a
     self.b = b
     self.c = c

>>> a = 'python'
>>> b = 'olpc'
>>> c = 'xo'
>>> x = MyClass(a,b,c)

>>> x.a
'python'
>>> x.b
'olpc'
>>> x.c
'xo'

The self variable refers to the instance itself, and the instance variables are stored on it. When a new object is made from your class definition, python creates the object and then calls __init__ with that object as self. To make a new instance variable, assign to it as an attribute of self, as in self.a = a.
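
To tie inheritance, overriding, docstrings, and __init__ together, here is a minimal sketch. The Shape and Rectangle classes are invented for this example and are not part of any library. The print calls use parentheses so the script also runs on Python 3.

```python
class Shape(object):
    """A generic shape with a name (hypothetical example class)."""

    def __init__(self, name):
        self.name = name          # instance variable stored on self

    def area(self):
        return 0

    def describe(self):
        # Calls whichever area() the actual instance provides
        return '%s with area %s' % (self.name, self.area())


class Rectangle(Shape):
    """A rectangle; overrides area() and inherits describe() from Shape."""

    def __init__(self, width, height):
        Shape.__init__(self, 'rectangle')   # run the parent initializer first
        self.width = width
        self.height = height

    def area(self):
        return self.width * self.height


r = Rectangle(3, 4)
print(r.describe())
print(Rectangle.__doc__)
```

Rectangle never defines describe(), yet r.describe() works because the method is inherited, and it picks up the overridden area().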

A word on functional programming

Python supports the ability to make anonymous functions. Making anonymous functions is a concept borrowed from functional programming languages like Lisp. Anonymous functions are created without being bound to a name. Like all functions in python, they are first-class objects: once created, they can be stored in variables and passed to other functions, giving the programmer a great amount of power and expressiveness.

Anonymous functions are created with the lambda keyword, followed by zero or more parameter names, a colon, and then a single expression whose value is returned.

>>> f = lambda x: x+2
>>> f(2)
4

This is essentially the same as declaring a function such as:

def f(x): return x+2

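The difference between these two blocks of code is that a lambda is an expression, while def is a statement. A lambda can be written inline wherever a value is expected, such as directly in the argument list of another function, but its body is limited to a single expression. In both cases the function is an object that can be passed as a parameter to functions or other lambdas.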

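The following is another example of multiple lambdas at work. This demonstrates a very simple example of a set of MapReduce-style functions. The two following lambdas compute the square of all the numbers in a list, and then sum them. Be aware that assigning to the names map and reduce hides python's built-in functions of the same names for the rest of the session.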

>>> map = lambda x : [y**2 for y in x]
>>> map([1,2,3,4])
[1, 4, 9, 16]
>>> reduce = lambda x:  sum(x)
>>> reduce(map([1,2,3,4]))
30

Used sparingly, lambdas can be very expressive and are a great way to reduce the size of code that passes small functions around.
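
The following sketch shows lambdas being passed to other functions, which is where they tend to be most useful. The apply_twice helper and the word list are made up for this example; sorted() and map() are python built-ins.

```python
words = ['python', 'olpc', 'xo', 'duke']

# Pass a lambda as the sort key: order the words by their length
by_length = sorted(words, key=lambda w: len(w))


def apply_twice(func, value):
    """Call func on value, then call it again on the result (hypothetical helper)."""
    return func(func(value))


result = apply_twice(lambda x: x + 2, 2)

# The built-in map() applies a function to every item; list() makes the
# result a list on Python 3, where map() returns an iterator
squares = list(map(lambda y: y ** 2, [1, 2, 3, 4]))

print(by_length)
print(result)
print(squares)
```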

Python I/O

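Python makes reading and writing files very easy. Opening a file is as easy as calling the open() function with a path and a mode, and assigning the result to a variable. The mode is 'r' for reading, 'w' for writing (which erases any existing contents), or 'a' for appending to the end of the file.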

>>> openfile = open('/usr/john/home/mytextfile.txt', 'r')  # open mytextfile.txt in r (read) mode
>>> openfile.readline()
'The first line of my text file\n'
>>> openfile.readline()
'The second line of my text file\n'
>>> openfile.close()
>>> openfile = open('/usr/john/home/mytextfile.txt', 'a')  # reopen in a (append) mode
>>> openfile.write('a line of text\n')  # this will append 'a line of text' and a newline to the end of mytextfile.txt
>>> openfile.close()

Here we see some basic file I/O. Sometimes you'll use files to keep track of data between sessions. In this case, you'll want to get all of the data from the last session at once, instead of keeping the file open for the duration of your session. This is also very simple.

>>> openfile = open('/usr/john/home/mytextfile.txt', 'r')
>>> data = openfile.readlines()
>>> data
['The first line of my text file\n', 'The second line of my text file\n', ...]
>>> openfile.close()

This will pull all of the data from your file and place it in a list for easy access, one string per line with the trailing newline included. Calling close() releases the file once you are done with it.
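
For a version that can be run anywhere, the sketch below writes, appends to, and reads back a file in a temporary directory. Using the tempfile module to pick a location is an assumption made for this example, so that it does not depend on any particular path existing on your machine.

```python
import os
import tempfile

# Put the file in a fresh temporary directory (an assumption for this example)
path = os.path.join(tempfile.mkdtemp(), 'mytextfile.txt')

outfile = open(path, 'w')        # 'w' creates the file, or erases it if it exists
outfile.write('The first line of my text file\n')
outfile.write('The second line of my text file\n')
outfile.close()

outfile = open(path, 'a')        # 'a' keeps the contents and adds to the end
outfile.write('a line of text\n')
outfile.close()

infile = open(path, 'r')
data = infile.readlines()        # one string per line, newlines included
infile.close()

print(data)
```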

Error handling

Error handling in python is similar to error handling in other languages.

>>> while True:
...     try:
...         x = int(raw_input("Enter a command number: "))
...         break
...     except ValueError:
...         print "You didn't enter a valid number. Please try again."

The interpreter will first try to run the try code block, and will jump to the except block if an error is raised. You don't have to specify the error type, but a bare except catches every error, including ones you did not anticipate. Naming the error type is safer, and lets you handle each type differently when your code block might generate multiple types of errors.
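
The same pattern can be wrapped in a function, which makes it easy to try without typing input by hand. The sketch below is one way to do that; the name parse_command_number and the sample inputs are invented for this example, and the print calls use parentheses so the script also runs on Python 3.

```python
def parse_command_number(text):
    """Return text converted to an int, or None if it is not a valid integer.

    parse_command_number is a hypothetical helper written for this example.
    """
    try:
        return int(text)
    except ValueError:
        # int() raises ValueError for strings such as 'seven' or ''
        return None


for candidate in ['3', ' 42 ', 'seven', '']:
    number = parse_command_number(candidate)
    if number is None:
        print('%r is not a valid number' % candidate)
    else:
        print('%r -> %d' % (candidate, number))
```

Only ValueError is caught here, so any other kind of error would still surface with a full traceback instead of being silently hidden.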

For more error handling options, refer to http://docs.python.org/tutorial/errors.html

Unless otherwise stated, the content of this page is licensed under Creative Commons Attribution-ShareAlike 3.0 License