Fredrik Håård's Blaag

@fhaard
I'm a programmer, consultant, developer, occasional teacher and speaker. Among my least disliked programming languages are Python, and a majority of these posts are related to Python in one way or another.
RSS Feed

Explaining comprehensions to programmers

For the first year or two programming Python, I never used list comprehensions (at the time, those were the only comprehensions). I read about them, I kinda figured out how they worked, and then I stuck to map() and filter(), which I understood. Looking back, I think that this has a lot to do with the fact that explanations of comprehensions are done using their origin - mathematics - rather than the domain we use them in - programming.

A quick duckduckgo search tells me this is still the case - Wikipedia asks us to consider something like this: S = {2⋅x|xϵN|x > 3} , and other sources also seem to start out with ‘this is how its done in math, so...’ (a notable exception is the tutorial on python.org).

When talking to programmers, I’d like to explain comprehensions differently, because not all programmers have a background in mathematics. For a programmer, a list comprehension is simply a for loops for constructing lists, using a more declarative notation than your usual for loop. For those of us used to map() and filter(), list comprehensions are both of those as well.

Consider:

def loop(my_list):
  result = []
  for x in my_list:
      if x > 3:
          result.append(x*2)
  return result

Ever written code like this? This is code that explicitly states which steps should be taken to construct your list; but you don’t have to - you can instead state what you want:

def compr(my_list):
  return [x*2 for x in my_list if x > 3]

This translates to give me value*2 for every value in my_list, but only if that value is more than three. Note also that this expression does the work of both map (multiply by two) and filter (take only values that are less than two). The general case would be [add something to the list for each value in an iterable, optionally only if a condition is True for that value]

Comprehensions also work nested - consider this simple but ugly code:

def create_matrix_loop(size, default):
  new_matrix = []
  for y in range(size):
    row = []
    for x in range(size):
      row.append(default)
    new_matrix.append(row)
return new_matrix

Sample output:

>create_matrix_loop(3, None)
[[None, None, None],
 [None, None, None],
 [None, None, None]]

Since comprehensions can be nested, this can be replaced with:

def create_matrix_compr(size, default):
  return [[default for x in range(size)] for x in range(size)]

As an added bonus, when we don’t tell the compiler how we want to do something, but rather what we want done, it can generate better - faster - bytecode for us. The loop version of create_matrix is translated into 35 bytecode instructions, and the version using a list comprehension is only 20 (try import dis; dis.dis(func) to see what func looks like in bytecode) and in reality, you will often avoid making a function at all when using comprehensions since they’re terse enough on their own, making this difference even bigger. Timing the implementations, the difference is evident:

>timeit -n100 create_matrix_loop(1000, None)
100 loops, best of 3: 113 ms per loop
>timeit -n100 create_matrix_compr(1000, None)
100 loops, best of 3: 49.1 ms per loop

That's right: less code, declarative syntax, and faster execution! (Note: I used iPython when creating and timing the examples - it's awesome and you should try it)

Blaag created 120216 11:10
blog comments powered by Disqus


Page created using blaag and abusing docutils. RSS Feed generated using PyRSS2Gen.