with open('in.txt') as input, open('out.txt') as out:
    for line in input.readlines():
        out.write(foo(line))
Python users are used to loading everything into memory all at once, while in C everything is processed in small chunks whenever possible. Python 3 is also moving in this direction by replacing many default functions with their iterator equivalents (map, range, etc.).
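For instance, Python 3's built-in map behaves like Python 2's itertools.imap, producing values lazily rather than building a list up front:

```python
# In Python 3, map returns a lazy iterator instead of a list,
# mirroring Python 2's itertools.imap
squares = map(lambda x: x * x, range(3))
print(next(squares))   # 0 -- values are produced one at a time
print(list(squares))   # [1, 4] -- only the remaining items
```

Nothing is computed until the iterator is consumed, which is exactly the streaming behavior the comment describes.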
You might think that this means forcing everything into one big context manager, but that's not necessarily true. For example:
import csv
from itertools import imap

def read_file(filename):
    with open(filename, 'r') as f:
        reader = csv.reader(f)
        for line in reader:
            yield line

def write_file(filename, data):
    with open(filename, 'w') as f:
        writer = csv.writer(f)
        map(writer.writerow, data)

write_file(
    filename='out.txt',
    data=imap(foo, read_file('in.txt')))

Don't use `map(writer.writerow, data)`; do `writer.writerows(data)` instead. Use binary mode for CSV files in Python 2, and newline='' in Python 3.
You forgot to open the first output file in write mode.
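For reference, a corrected sketch of the first snippet; foo here is a hypothetical stand-in for whatever line transformation the original intended, and the sample input is fabricated for illustration:

```python
def foo(line):
    # hypothetical stand-in for the original transformation
    return line.upper()

with open('in.txt', 'w') as f:   # set up sample input
    f.write('hello\nworld\n')

# 'w' is required on the output file; iterating the file object directly
# streams line by line, so readlines() (which loads everything) is unnecessary
with open('in.txt') as infile, open('out.txt', 'w') as out:
    for line in infile:
        out.write(foo(line))
```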
This blog post's algorithm accomplishes the same task with two passes and O(m+n) space. It seems odd that an article explicitly about encouraging readers to think like the authors of UNIX and make simple reusable utilities that process stream data would use a two-pass algorithm when a fairly simple one-pass algorithm is available.