Iterating over a window in Python
This weekend, I had a fairly common problem: given a sequence of elements and an element in that sequence, I needed to get the next element; if the element was the last in the sequence, then I needed the first element.
My first stab at a solution was pretty straightforward:
for i in range(len(elements)):
if elements[i] == item:
try:
return elements[i+1]
except IndexError:
return elements[0]
This works for the intended purpose, although the C-style looping doesn't sit too well with me. But this code has a major flaw: it doesn't work for generators.
After a little more thought, I decided that what I wanted was to iterate over a window of two items in the sequence, and if the first item matched, return the second one.
def by_pairs(iterable):
sequence = iter(iterable)
previous = sequence.next()
while True:
current = sequence.next()
yield previous, current
previous = current
To illustrate how this works:
print pair
(0, 1)
(1, 2)
(2, 3)
(3, 4)
(4, 5)
The new item_after function:
pairs = by_pairs(elements)
first, second = pairs.next()
for a, b in chain([(first, second)], pairs):
if a == item:
return b
return first
In the new item_after function, I peel off the first pair so I can cache the first element, then glob it back on the front of the sequence using itertools.chain.
After a little more thought, I realized that the by_pairs function can be made generic to handle a window of any size.
def window_iter(elements, n):
sequence = iter(elements)
window = deque()
while True:
element = sequence.next()
window.append(element)
if len(window) == n:
yield window
window.popleft()
The item_after function now looks like this:
pairs = window_iter(elements, 2)
first, second = pairs.next()
for a, b in chain([(first, second)], pairs):
if a == item:
return b
return first
And with a little data: one of my pets gets a treat every day. I know who got the treat yesterday; who gets it next?
print "Pet after Kiri is", item_after3(pets, "Kiri")
print "Pet after Lady is", item_after3(pets, "Lady")
The output:
Pet after Lady is Mikey
I can think of a lot of potential uses for this window iterator. One is in natural language processing, when calculating the "stickiness" of words in a corpus. You could iterate over a window of three words, calculating the collocations of words before and after the middle word.
Here's a rather contrived example: calculating the average temperature on days when the previous two days were 80 degrees or higher.
after_temps = [z for x, y, z in window_iter(temps, 3)
if x > temp < y]
return sum(after_temps) / float(len(after_temps))
tempdata = "99 101 85 91 71 64 67 90 81 101".split()
temps = (float(x) for x in tempdata)
temp = ave_temp_after(80, temps)
print "Average temperature after two days of 80+:", temp
Output:
It strikes me that this would be great for those baseball statistics guys:
This is just the third time since 1963 that a left-handed pinch hitter has broken a bat on three consecutive Wednesdays.
I'm waiting for a call from The Show any day now.

itertools.tee might be useful in this case.
From http://docs.python.org/library/itertools.html#recipes :
def pairwise(iterable): "s -> (s0,s1), (s1,s2), (s2, s3), ..." a, b = tee(iterable) for elem in b: break return izip(a, b)While the for loop (to advance b) is certainly concise, I don’t find it too readable. I think it demands a comment at least.
That’s a good one. Generalizing to create a window of arbitrary size:
from itertools import tee, izip def window_iter(sequence, n): """ s -> (s1, ... sn), (s2, ... sn+1), ... """ sequences = tee(sequence, n) for it, num in zip(sequences, range(n)): for i in range(num): it.next() return izip(*list(sequences))Example:
I give it plus points for not using an accumulator, and minus points for resorting to “star” magic.
How about this?
>>> a
[1, 2, 3, 4, 5, 6, 7, 8, 9, 0]
>>> map(lambda x,y:(x,y),a[:-1],a[1:])
[(1, 2), (2, 3), (3, 4), (4, 5), (5, 6), (6, 7), (7, 8), (8, 9), (9, 0)]
>>>
I’m new to python so forgive me if it’s silly example