(An Unofficial) Python Tutorial Wiki

putting the community back in "maintained by the community"

Beginners - Classes

If you're coming to Python from another object-oriented programming language (C#, Java, Smalltalk, Ruby, etc.), then you can probably skip this section (though you might want to skim the parts where I discuss self).

If Python is your first programming language though, or your first encounter with object-oriented programming, then you're definitely going to want to read this.

Objects and Classes

Without getting technical, an object is the basic Python building block, the 'lego brick' with which every program is built. All the elements that we've met up until now - integers, strings, lists, functions, lambda forms etc. - they're all objects. And a class is simply a user-defined kind of object that lets you keep a bunch of closely related things "together".

Let's take a simple example to see why you would want classes. First we'll write some code in a non-OO procedural fashion, and then we'll see how the same code can be written with classes.

The Procedural Approach

Imagine that you're a teacher and, after going through the first few chapters of this tutorial, you've decided to build a little Python program to keep track of some stats for your students.

So how do you start? Well, the first thing you want is a list of students:

>>> student_list = ["Simon", "Mal", "River", "Zoe", "Jane", "Kaylee", "Hoban"]

(We'll implement the program in the interactive interpreter, for the sake of the example, but in real life, you'd put this stuff into actual modules. I'll intersperse the interpreter code with comments, but keep in mind that all this code goes together.)

You know how fickle the administration is, and with the housing market the way it is, people are moving all the time. So you're definitely going to need ways to add and remove kids from the list:

>>> def add_student(student):
...     student_list.append(student)
...
>>> def remove_student(student):
...     student_list.remove(student)
...
>>> remove_student("Mal")
>>> add_student("Bill")
>>> print student_list
['Simon', 'River', 'Zoe', 'Jane', 'Kaylee', 'Hoban', 'Bill']

Ok, so that works nicely. What now? Well, it'd be nice to track the grades of each student. Probably the easiest way to do that is to create a dictionary, where each key in the dict is a student name, and each value is a list of their marks.

>>> student_marks = {}
>>> for student in student_list:
...     student_marks[student] = []
...

So there we've initialized our student_marks, with no marks for any student yet. So now we should make a function to add marks.

>>> def add_mark(student, mark):
...     student_marks[student].append(mark)
...

What if we want a function to change a mark though? That's a lot trickier. To change a mark, we need to know a few things. First, we need to know the student; that's easy. Second, we need to know either where in student_marks[student] the old mark sits, or what the old mark is. Here is a possible way to do this:

>>> def change_mark(student, oldmark, newmark):
...     # If you know the old mark
...     temp_mark_list = student_marks[student]
...     position = temp_mark_list.index(oldmark)
...     temp_mark_list[position] = newmark

So we've given a simple way to change the marks, if you know the old mark and the new mark you want to use.
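To see how change_mark behaves, here's a small self-contained sketch. The student and marks are made up for illustration, and it's written as a plain script rather than an interpreter session. It also shows two things about list.index that are easy to miss: it only finds the first matching mark, and it raises ValueError if the old mark isn't in the list at all.

```python
# Hypothetical data for illustration; same shape as the student_marks dictionary.
student_marks = {"Zoe": [70, 85, 70]}

def change_mark(student, oldmark, newmark):
    # Find the first position of the old mark, then overwrite it.
    mark_list = student_marks[student]
    position = mark_list.index(oldmark)
    mark_list[position] = newmark

change_mark("Zoe", 70, 75)
# Only the first 70 was replaced; the second one is untouched.
first_result = list(student_marks["Zoe"])

# Asking to change a mark that isn't there raises ValueError.
try:
    change_mark("Zoe", 99, 100)
    missing_mark_failed = False
except ValueError:
    missing_mark_failed = True
```

So if a student has the same mark twice, this version can only ever change the first one.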

As a final function, let's add a class attendance feature. We'll assume that most days, the entire class will be there. So the function will, by default, say that everyone was there on a certain day. We can, though, pass in an optional list of names of people who weren't there.

First, we need another dictionary, to track attendance:

>>> student_attendance = {}
>>> for student in student_list:
...     student_attendance[student] = 0
...

This time, we're initializing the dictionary with zeros, i.e. the number of days each student has attended class. Every day, that number will increase by one for each student who is there.

>>> def another_day(absent = []):
...     for student in student_list:
...         if student not in absent:
...             old_attendance = student_attendance[student]
...             student_attendance[student] = old_attendance + 1

So now we can call another_day, and pass in an optional list of students who aren't there. Everyone else will have their attendance increase by 1.
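Here's a quick sketch of that in action, again as a plain script with a shortened, made-up class list. One thing here is my own choice and not part of the version above: the default for absent is an empty tuple rather than []. Python creates a default value once, when the function is defined, so an immutable default is a common precaution even though nothing here modifies it.

```python
# Shortened, hypothetical class list for illustration.
student_list = ["Simon", "River", "Zoe"]

student_attendance = {}
for student in student_list:
    student_attendance[student] = 0

def another_day(absent=()):
    # Assumption: an empty tuple as the default, so no mutable
    # list is shared between calls.
    for student in student_list:
        if student not in absent:
            student_attendance[student] += 1

another_day()            # day one: everyone is present
another_day(["River"])   # day two: River is absent
```

After those two calls, Simon and Zoe have an attendance of 2 and River has 1.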

So this is great and all, and works fine right now. But what if we want to start adding a bunch of other teaching-related functions, lists and dictionaries into this file? Suddenly, at the top level of the file, we'll have a lot of different lists and dictionaries defined (like student_list, student_attendance, student_marks, etc.), a whole lot of functions (another_day, add_mark, etc.) and no way to tell what goes together with what. In other words, which functions need which variables, and how is everything related?

And this is essentially the problem that classes solve. They provide "encapsulation", a way of grouping together things that logically relate to each other.

The Object-Oriented Approach

So how do we do this? The first thing we have to do is create a "class". The class is the "thing" that will group together common elements.

The Student Class

So far, for each student in the class, we've been tracking a lot of different things, in different variables. We're tracking the student's name (in student_list), then we have a mapping between their name and their marks (student_marks), and a mapping between the name and their attendance (student_attendance). It'd be really nice to keep all the information for each student together, in one place. Hence, a Student class:

>>> class Student(object):
...     def __init__(self, name):
...         self.name = name
...         self.attendance = 0
...         self.marks = []

Ok, so some of that definitely looks kind of crazy at this point, but parts of it probably make sense already. For instance, self.attendance = 0 and self.marks = [] should look at least a bit familiar.

So what exactly are we doing here? Well, first off, we're declaring that we are creating a new class, with class Student(object). The name of this class is Student.

So, that's fine, nothing too tough there. But what is that next silly looking thing, the __init__(self, name)? That's called a "constructor". It is a special function of the class that is called whenever we create a new instance of the class. Wow, lots of terminology there. Maybe a simple example will help.

>>> class ExampleClass:
...     def __init__(self, some_message):
...         self.message = some_message
...         print "New ExampleClass instance created, with message:"
...         print some_message
...
>>> first_instance = ExampleClass("message1")
New ExampleClass instance created, with message:
message1
>>> second_instance = ExampleClass("message2")
New ExampleClass instance created, with message:
message2

So what have we done there? Well, we created a new class, called ExampleClass. In the constructor (__init__), we print out a message when a new instance gets created. After defining the class, we created two new instances, first_instance and second_instance. When we created them, we can see that the print statements in the __init__ function got called, and more importantly, the value we passed to the class (i.e. "message1" in ExampleClass("message1")) gets passed to the __init__ function.

Ok, so that's fine, but what's up with the self as the first argument to the __init__ function? Every function in a class (functions in classes are actually called "methods", and I'll call them that from now on) has to take self as the first argument. For anyone coming from another object-oriented language, this will seem VERY strange. For new programmers, it will just seem annoying. For now though, have faith that it's needed, and you'll understand why later.

After the self, you can start putting the "real" arguments to the method, the ones you care about. So what arguments did we define? Just some_message. And what is this some_message used for? Well, in this example, we used it when we did print some_message, but more interestingly, we used it to do self.message = some_message.

So what's that all about? By doing self.message =, we created something called an "attribute". An attribute (as the name implies) is a piece of information attached to an instance of the class. Once we assign that attribute, we can access it from outside the class, like so:

>>> first_instance.message
'message1'
>>> second_instance.message
'message2'

See that? We assigned the attribute in the __init__ constructor, and now, we can access that attribute from outside the class! Is the Student class making more sense now? Let's create an example instance of it, and see what happens:

>>> bobby = Student("Bobby")
>>> bobby.name
'Bobby'
>>> bobby.attendance
0
>>> bobby.marks
[]

Isn't that MUCH nicer than having to keep three separate lists/dictionaries? All the information for the student "Bobby" is kept in one single place, an instance of the Student class.

And remember, it's not just from outside the class that you can access these attributes. You can of course access them from within the class. Any attribute tied to self (like we did with self.name, self.attendance and self.marks) essentially becomes a global variable for that instance. So anytime you do anything with that instance, the value of the attribute is still around. Any variables you create inside a method that aren't prefixed with self. will be local variables, only around during a particular call to that method.

Let's see an example of that. We'll redefine our Student class as follows:

>>> class Student:
...     def __init__(self, name):
...         self.name = name
...         self.attendance = 0
...         self.marks = []
...         number_of_marks = len(self.marks)
...         print "%s marks so far!" % number_of_marks
... 
>>> b = Student("Bobby")
0 marks so far!
>>> b.marks
[]
>>> b.number_of_marks
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
AttributeError: Student instance has no attribute 'number_of_marks'
>>>

So what happened there? In the __init__ function, we created three attributes, name, attendance and marks. We know they are attributes because we put the self in front of them. We also created a local variable though, number_of_marks. As stated above, local variables only hang around for as long as the function is executing. Once the __init__ function is done, any local variable created in it will go away. That's why when we tried to do b.number_of_marks, we got an AttributeError exception.

And remember that values of attribute variables are unique to each instance. So if we do:

>>> b = Student("Bobby")
>>> m = Student("Mary")
>>> b.name
'Bobby'
>>> m.name
'Mary'

We can see that the b instance has its own value for the attribute name, and the m instance has its own value for that attribute.
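The same goes for the other attributes. Here's a small sketch (a plain script, reusing the first Student definition) showing that changing one instance's marks or attendance leaves the other instance alone, because __init__ builds a fresh list for every new Student:

```python
class Student(object):
    def __init__(self, name):
        self.name = name
        self.attendance = 0
        self.marks = []   # a new, separate list for each instance

b = Student("Bobby")
m = Student("Mary")

# Change Bobby's data only.
b.marks.append(90)
b.attendance = b.attendance + 1
```

After this, b.marks is [90] and b.attendance is 1, while m.marks is still [] and m.attendance is still 0.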

So let's get a bit fancier, let's create a StudentTracker. This tracker will receive a list of student names as an argument to its constructor, and then will create a Student instance for EACH of those names:

>>> class StudentTracker:
...     def __init__(self, initial_student_list):
...         self.student_names = initial_student_list
...         self.students = {}
...         for name in self.student_names:
...             self.students[name] = Student(name)
...

So, we created a nice attribute, self.students, which is a dictionary of Student instances (or objects, it is common to call an instance an "object"). We still need to be able to do stuff with those instances though. The way we'll do that is by defining some methods in the class.
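Before we add methods, here's a sketch of what the tracker gives us so far, written as a plain script with a made-up two-name list. The only thing we can do at this point is look a Student up by name in the students dictionary:

```python
class Student(object):
    def __init__(self, name):
        self.name = name
        self.attendance = 0
        self.marks = []

class StudentTracker(object):
    def __init__(self, initial_student_list):
        self.student_names = initial_student_list
        self.students = {}
        for name in self.student_names:
            self.students[name] = Student(name)

# Hypothetical names for illustration.
tracker = StudentTracker(["Simon", "River"])

# Look up one Student instance by its name.
river = tracker.students["River"]
```

Here river is an ordinary Student instance, so river.name is 'River' and river.marks starts out as an empty list.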

A method is a function that is specific just to the class it's defined in. Here's a simple example:

>>> class Multiplier:
...     def __init__(self, number):
...         self.number = number
...     def multiply_by(self, x):
...         return self.number * x

So this class will have one attribute, self.number. It also has one method, multiply_by, which takes another number, multiplies it by our original number, and returns the result. Let's see it in action.

>>> f = Multiplier(10)
>>> f.number
10
>>> f.multiply_by(5)
50
>>> f.number
10

Does that make sense? We created an instance, and called it f. We then showed the attribute, f.number. We then called the method on the instance, by doing f.multiply_by(5), which returned 5*10. Notice though that in our definition of multiply_by, we don't change the value of self.number, which is why it remains 10.

It is important to note how we called the method. We can't just do multiply_by(5), we have to say f.multiply_by(5). Why is that? Well, imagine what would happen if we had created two separate instances. How is Python supposed to know which one to call, unless you tell it?

>>> f = Multiplier(10)
>>> g = Multiplier(20)
>>> f.multiply_by(5)
50
>>> g.multiply_by(5)
100

So we told Python which instance to call multiply_by on, and it did it, and everything worked perfectly!

So let's get back to our StudentTracker. We haven't yet defined any regular methods for it. We did define __init__, but that's a special method: having __ on both sides of the name means you're not supposed to call it yourself, because Python will call it for you.

Let's redefine our Student, and StudentTracker, but this time with useful methods:

>>> class Student:
...     def __init__(self, name):
...         self.name = name
...         self.attendance = 0
...         self.marks = []
...     def add_mark(self, mark):
...         self.marks.append(mark)
...     def present(self):
...         self.attendance = self.attendance + 1
...     def get_average(self):
...         # float() stops the division from truncating to a whole number
...         return float(sum(self.marks)) / len(self.marks)
...     def change_mark(self, oldmark, newmark):
...         position = self.marks.index(oldmark)
...         self.marks[position] = newmark
...     def __str__(self):
...         message = "Name: " + self.name + " "
...         message = message + "Attendance: " + str(self.attendance) + " "
...         message = message + "Average: " + str(self.get_average())
...         return message
...
>>> class StudentTracker:
...     def __init__(self, initial_student_list):
...         self.student_names = initial_student_list
...         self.students = {}
...         for name in self.student_names:
...             self.students[name] = Student(name)
...     def another_day(self, absent = []):
...         for name in self.student_names:
...             if name not in absent:
...                 self.students[name].present()
...     def add_mark(self, name, mark):
...         self.students[name].add_mark(mark)
...     def change_mark(self, name, oldmark, newmark):
...         self.students[name].change_mark(oldmark, newmark)
...     def prettyprint_students(self):
...         for student in self.students.values():
...             print student
...

Almost everything there should be pretty self-explanatory at this point (except the __str__), but I'll point out a few key ideas.

The __str__ method is another special method. It gets called when Python is told to convert something to a string (using the str() function), or when Python is told to print an instance. A small example is as follows:

>>> class Foo:
...     def __str__(self):
...         return "I am an instance of Foo!!!"
...
>>> f = Foo()
>>> print f
I am an instance of Foo!!!
>>> str(f)
'I am an instance of Foo!!!'

In our __str__ method, we build up a nice long message, including the student's name, attendance, and mark average, and return that.

Note that in our __str__ method, we do self.get_average(). Just like when a class instance wants to access one of its own attributes, we must prepend the self. to the method call.
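Here's a cut-down sketch of that idea, as a plain script. This Student only has a name and marks, the student and marks are made up, and the float() in get_average is my own addition so the average isn't rounded down to a whole number on older Pythons:

```python
class Student(object):
    def __init__(self, name):
        self.name = name
        self.marks = []

    def get_average(self):
        # Assumption: float() added so integer marks don't truncate.
        return float(sum(self.marks)) / len(self.marks)

    def __str__(self):
        # self.get_average() calls this same instance's method.
        return "Name: " + self.name + " Average: " + str(self.get_average())

s = Student("Kaylee")
s.marks = [80, 90]
text = str(s)   # Python calls s.__str__() for us
```

Here text ends up as 'Name: Kaylee Average: 85.0'.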

Reminder about self: Note again that all the methods we defined had self as their first argument, but when we actually call a method, we never pass it ourselves. That is a little bit of magic Python is doing for you: when you write f.multiply_by(5), Python hands the instance f to the method as self. It should make more sense when you get deeper into Python programming. For now, just trust that when you define a method, you need self as the first argument, but when you call a method, you leave self out.
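If you're curious what that magic looks like, here's a small sketch using the Multiplier class from earlier (as a plain script). Calling the method through the class and passing the instance yourself does the same thing as the usual call. You wouldn't normally write it that way; it's only here to show where self comes from.

```python
class Multiplier(object):
    def __init__(self, number):
        self.number = number

    def multiply_by(self, x):
        return self.number * x

f = Multiplier(10)

# The usual way: Python passes f in as self for us.
usual_call = f.multiply_by(5)

# The long way: go through the class and pass the instance ourselves.
explicit_call = Multiplier.multiply_by(f, 5)
```

Both usual_call and explicit_call are 50.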

Notice the short-hand in another_day and the StudentTracker versions of add_mark and change_mark. In another_day, we have the following line:

self.students[name].present()

You've probably figured out what that does, but just in case, I'll explain it. Remember what self.students is, right? It's a dictionary, where the keys are the students' names, and the values are instances of the Student class. So if we do self.students[name], that returns an instance, right? So, we would normally do:

student = self.students[name]
student.present()

But, if the only thing we need to do with the instance right now is call one method, why waste space? We can instead just do what we did above, namely:

self.students[name].present()

So, the self.students[name] part of that is executed first, and it returns the instance object. It then does the .present() on the instance object. This is an idiom you'll see all the time in Python code (and in most object-oriented programming languages), so make sure you understand it. We did the exact same thing in the StudentTracker version of add_mark, namely:

self.students[name].add_mark(mark)

And that ends our mini introduction to what classes are. The further sections in this chapter will go into more detail. I leave it as an exercise to the reader to actually try these out. Create a StudentTracker instance with some names, play around a bit, try to break the code (there's no error handling, so there should be a few ways to break it). Messing around and experimenting with it will be the best way to learn.
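To get you started, here's one possible session, written as a plain script so you can paste it into a file. The names and marks are made up. The classes follow the ones above with a few tweaks of my own: I've left out the printing methods to keep it short, get_average uses float(), another_day defaults to an empty tuple, and change_mark takes name as its first argument. The last part shows two of the ways to break the code.

```python
class Student(object):
    def __init__(self, name):
        self.name = name
        self.attendance = 0
        self.marks = []

    def add_mark(self, mark):
        self.marks.append(mark)

    def present(self):
        self.attendance = self.attendance + 1

    def get_average(self):
        return float(sum(self.marks)) / len(self.marks)

    def change_mark(self, oldmark, newmark):
        position = self.marks.index(oldmark)
        self.marks[position] = newmark


class StudentTracker(object):
    def __init__(self, initial_student_list):
        self.student_names = initial_student_list
        self.students = {}
        for name in self.student_names:
            self.students[name] = Student(name)

    def another_day(self, absent=()):
        for name in self.student_names:
            if name not in absent:
                self.students[name].present()

    def add_mark(self, name, mark):
        self.students[name].add_mark(mark)

    def change_mark(self, name, oldmark, newmark):
        self.students[name].change_mark(oldmark, newmark)


tracker = StudentTracker(["Simon", "River", "Zoe"])
tracker.another_day(["River"])        # River misses the first day
tracker.add_mark("Zoe", 80)
tracker.add_mark("Zoe", 90)
tracker.change_mark("Zoe", 80, 70)    # Zoe's marks are now [70, 90]
zoe_average = tracker.students["Zoe"].get_average()

# Two ways to break it:
errors = []
try:
    tracker.students["Simon"].get_average()   # Simon has no marks yet
except ZeroDivisionError:
    errors.append("ZeroDivisionError")
try:
    tracker.add_mark("Mal", 50)               # Mal was never in the tracker
except KeyError:
    errors.append("KeyError")
```

Zoe's average comes out as 80.0, and both of the "break it" attempts raise the exceptions shown. Deciding what the code should do in those cases is a good next exercise.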

And to continue with Python, it is pretty important that you learn how classes work. Most Python code is written with classes, and most of the standard library is written with classes; it's just the way things are done. So even if you never want to write your own classes, you'll have to understand how they work if you want to use other people's code.

