If I see this kind of "cute" code in code reviews, there is some serious scolding to be done.
    planets_set = {
        planet for episode in episodes.values()
        for planet in episode['planets']
    }
is less maintainable in a Python shop than, say...
    planets = set()
    for episode in episodes.values():
        planets.update(episode['planets'])
Although the latter will likely make perfect sense to most non-Python developers, the former is faster, has a smaller memory footprint, and might be preferred when dealing with larger or more irregular data sets.
The former also has the advantage of not leaking the "episode" variable into the function/method scope, which could introduce a subtle bug if that variable gets conditionally reused. So while it's harder to understand for a less-experienced Python developer, the set comprehension solution is inherently safer due to Python's design.
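A minimal sketch of the comparison above, assuming a made-up `episodes` dict (the sample data and the two function names are hypothetical). It checks that both forms build the same set and that, in Python 3, only the explicit loop leaves `episode` bound in the function's scope afterwards.

```python
# Hypothetical sample data, for illustration only.
episodes = {
    "IV": {"planets": ["Tatooine", "Alderaan", "Yavin 4"]},
    "V": {"planets": ["Hoth", "Dagobah", "Bespin"]},
    "VI": {"planets": ["Tatooine", "Endor"]},
}


def planets_with_comprehension(episodes):
    result = {
        planet
        for episode in episodes.values()
        for planet in episode["planets"]
    }
    # In Python 3 the comprehension's loop variables are not visible here.
    return result, "episode" in locals()


def planets_with_loop(episodes):
    planets = set()
    for episode in episodes.values():
        planets.update(episode["planets"])
    # The plain for loop leaves 'episode' bound to the last item.
    return planets, "episode" in locals()


comprehension_set, comprehension_leaks = planets_with_comprehension(episodes)
loop_set, loop_leaks = planets_with_loop(episodes)
```

Here `comprehension_leaks` is False and `loop_leaks` is True, while the two sets are equal.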
    coordinates = ((star.x, star.y, star.z) for star in star_map)
Nested list comprehensions with multiple filtering if statements? Scold away.
But the simpler examples there are perfectly readable, understandable and maintainable if you understand list comprehensions. And if you need to alter it to the point where it needs to be a little more verbose in order to convey what you're doing, then sure, you can break it out into some other structure. But just because the complexity might increase later on doesn't mean that you shouldn't use a list comprehension.
And for those few lines of code you've traded the ability for new programmers to understand it easily. To me that's a net loss.
List comprehensions (and inline generators, which you get by using parentheses) make it trivial to handle fairly complex sequence processing, and are one of the reasons Python gets a lot of love from the FP community.
Sure, it's not really an FP language, but these give it an almost LISPy feel.
Regardless, it would be less Pythonic to build a loop and iterate.
Might be easier for complete novices, but I don't suppose people are just hiring novices these days, right?
(edit: Mobile-sponsored typo)
But it's not relevant. What is relevant is: do they create a readability problem for a Python developer who has spent significant time with them? I honestly don't know.
I am rather fond of dictionary comprehensions as mentioned in the article:
    colors = [jedi['lightsaber_color'] for jedi in jedis]
    frequencies = {color: colors.count(color) for color in set(colors)}
    print(frequencies)
    # {'green': 6, 'red': 5, 'blue': 6}
Can somebody explain this one to me (I understand list comprehensions)? I'm having trouble understanding how the second part uses something defined in the first part, but the first part can't stand on its own, so
> [planet for episode in episodes.values()]
returns an error.
    planets_flat = []
    for episode in episodes.values():
        for planet in episode['planets']:
            planets_flat.append(planet)
Notice how the for loops in the comprehension go in the same order as in the imperative code.
Thanks, that's a sane way to explain the order. Up to right now, it was always "the opposite of what you'd expect", which was a memory rule that always failed me.
Once you know how it translates to regular for loops, this syntax is pretty natural. I always put each for loop on its own line, which makes things more readable.
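To make that correspondence concrete, here is a small sketch with made-up data (the `episodes` dict is hypothetical). It writes the comprehension with one `for` per line and checks it against the unrolled loops.

```python
# Hypothetical sample data, for illustration only.
episodes = {
    "IV": {"planets": ["Tatooine", "Alderaan"]},
    "V": {"planets": ["Hoth", "Dagobah"]},
}

# One 'for' clause per line, in the same order as the nested loops below.
planets_flat = [
    planet
    for episode in episodes.values()   # outer loop comes first
    for planet in episode["planets"]   # inner loop comes second
]

# The equivalent unrolled version.
unrolled = []
for episode in episodes.values():
    for planet in episode["planets"]:
        unrolled.append(planet)
```

Both `planets_flat` and `unrolled` hold the planets in the same order, because the clauses are evaluated left to right just like the nested loops.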
    for episode in episodes
        for planet in episode planets
            give me planet
the binding of `planet` isn't evaluated until the end (left to right evaluation for the looping bits)
planets_flat = [planet for planet in episode['planets'] for episode in episodes.values()]
and it makes a lot more sense. In most languages with list comprehensions you'd write the thing you were iterating over first, e.g. (Scala):
    val planetsFlat = for { episode <- episodes.values; planet <- episode(planets) } yield planet
    >>> [y for z in [[1,2,3],[4,5,6]] for y in z]
    [1, 2, 3, 4, 5, 6]
The error you get is because "planet" is bound by the second FOR clause, so when you leave it out, planet becomes unbound.
Don't know what a better term might be, but it's good to be aware of the distinction.
Nope. Well - I could if I tried but that particular form is fairly abhorrent to me. My brain hurts just reading it so this is the stage I would find a clearer way to express the algorithm.
Python 2.7:
>list_of_numbers = [1,2,3]
>[x/2 for x in list_of_numbers]
>print(x)
3
Python 3:
>list_of_numbers = [1,2,3]
>[x/2 for x in list_of_numbers]
>print(x)
NameError: name 'x' is not defined
List comprehensions "leak" their variables in Python 2. If you're someone who reuses variables, this can lead to an enormous headache!
The original concept was that it provided a more concise way to write loops so that:
t = [expr(x, y) for x in s1 for y in s2]
was just a short way to write:
    _t = []
    for x in s1:
        for y in s2:
            _t.append(expr(x, y))
    t = _t
    del _t
The original implementation reflected that design. I later added the LIST_APPEND opcode to give the list comprehensions a speed advantage over the unrolled code.
The question of whether to expose the loop induction variable didn't get much discussion until I proposed Generator Expressions in PEP 279 and decided to give them the behavior of hiding the loop induction variables so that the behavior would match that of a normal unrolled generator function.
When set and dict comprehensions (displays) came along afterwards, they were given the latter behavior because there was precedent, because there was a mechanism to implement that precedent, and to provide a short-cut for the then-common practice of creating dicts and sets with generator expressions:
    s = {expr(x) for x in t}
was a short form for:
    s = set(expr(x) for x in t)
The advent of Python 3 gave us an opportunity to make all four forms (listcomps, genexps, set and dict displays) consistent about hiding the loop induction variable.
The current state in Python 3 has the advantage of being consistent between all four variants and matches how mathematicians treat bound and free variables.
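A small sketch of that consistency, assuming Python 3 (the function name `loop_variable_visibility` is made up for illustration). None of the four forms leaves `x` bound in the enclosing scope, while a plain `for` statement still does.

```python
def loop_variable_visibility():
    data = [1, 2, 3]

    as_list = [x * 2 for x in data]          # list comprehension
    as_gen = list(x * 2 for x in data)       # generator expression
    as_set = {x * 2 for x in data}           # set display
    as_dict = {x: x * 2 for x in data}       # dict display

    # The set display is a short form of set(<generator expression>).
    assert as_set == set(x * 2 for x in data)

    # None of the four forms has bound 'x' in this function's scope.
    visible_after_comprehensions = "x" in locals()

    for x in data:
        pass
    # A regular for statement does leave 'x' bound to the last item.
    visible_after_for_loop = "x" in locals()

    return visible_after_comprehensions, visible_after_for_loop


result = loop_variable_visibility()
```

Under Python 3 this returns `(False, True)`.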
There are some disadvantages as well. List comprehensions can no longer be cleanly explained as being equivalent to the unrolled version. The disassembly is harder to explore because you need to drill into the internal code object. Tracing the execution with PDB is no fun because you go up and down the stack. It is more difficult to explain scoping -- formerly, all you had was locals/globals/builtins, but now we have locals/nonlocals/globals/builtins plus variables bound in list comps, genexps, set/dict displays plus exception instances that are only visible inside the except-block.
http://python-history.blogspot.com/2010/06/from-list-compreh...
The PEP is silent on the issue:
https://www.python.org/dev/peps/pep-0202/
I guess mail list spelunking would probably reveal some conversations about how to handle it.
Just to clarify, this is the behavior of py2 for loops in general, whether in a comprehension or not.
While many of the examples in this helpful article are good, the first example with the octets is an excellent example of how not to do it. Look at the octet parsing code we ended up with in the article:
# Snippet 1
octets = [bbs[i:i+8] for i in range(0, len(bbs), 8)]
It's nice and tight. What does it do? I'd need to peer at it a moment and decode it, executing it in my head. This is subjective, but I think code should be self-explanatory; it's up to the computer to execute code, not people in their heads. Is there an off-by-one error in there? Here's another way to do it that'd be better.
    octets = chunks(bbs, 8)
That function chunks() is something that I keep in an iterutils package which I end up using all the time. It's intuitive, and it has a doctest that shows that we definitely don't have an off-by-one error. It's also easier to re-use than the first one. Maybe it also bears mentioning that chunks() is a generator that yields one chunk at a time, while the first solution builds the whole list of chunks in memory at once. Here's the chunks method I use:
    def chunks(collection, chunk_size):
        """Divides collection into chunks of up to chunk_size elements each.

        >>> l = range(75)
        >>> list(chunks(l, 10))
        [[0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
         [10, 11, 12, 13, 14, 15, 16, 17, 18, 19],
         [20, 21, 22, 23, 24, 25, 26, 27, 28, 29],
         [30, 31, 32, 33, 34, 35, 36, 37, 38, 39],
         [40, 41, 42, 43, 44, 45, 46, 47, 48, 49],
         [50, 51, 52, 53, 54, 55, 56, 57, 58, 59],
         [60, 61, 62, 63, 64, 65, 66, 67, 68, 69],
         [70, 71, 72, 73, 74]]
        """
        for i in xrange(0, len(collection), chunk_size):
            yield collection[i : i + chunk_size]
When you catch yourself playing code golf and trying to pack more and more meaning into a single line, look for ways that you can break the problem down into multiple components that use each other. This kind of functional decomposition is one of the things that makes functional programming so wonderful. Lots of times the intermediate steps in a complex expression have meaning and are useful on their own.
However, I would reject your `chunks` function in a code review and tell you to use `grouper` from the itertools recipes [1].
More generally, any time I've ended up with a long list comprehension, the answer has been to "check itertools and see how you would describe this ugly comprehension in those terms".
[1] https://docs.python.org/3/library/itertools.html#itertools-r...
There is no xrange function in python 3. Now range has the functionality of xrange.
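For Python 3, a hedged sketch of two ways to chunk. The `chunks` below is a rewrite (not the version posted above) that uses `itertools.islice`, so it also accepts iterators that support neither `len()` nor slicing. The `grouper` is written in the spirit of the classic itertools recipe, whose exact form in the docs has changed between Python versions; note that it pads the last group rather than leaving it short.

```python
from itertools import islice, zip_longest


def chunks(iterable, chunk_size):
    """Yield lists of up to chunk_size items from any iterable."""
    iterator = iter(iterable)
    while True:
        chunk = list(islice(iterator, chunk_size))
        if not chunk:
            return
        yield chunk


def grouper(iterable, n, fillvalue=None):
    """Collect items into fixed-length tuples, padding the last one."""
    # The same iterator repeated n times, so zip_longest pulls n items per tuple.
    args = [iter(iterable)] * n
    return zip_longest(*args, fillvalue=fillvalue)


chunked = list(chunks(range(7), 3))
chunked_from_generator = list(chunks((i for i in range(4)), 3))
grouped = list(grouper("abcde", 2, fillvalue="x"))
```

Which one fits depends on whether a short final chunk or a padded one is wanted.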
    octets = [bbs[i:i+8]
              for i in range(0, len(bbs), 8)
              if i%2 != 0]
Nesting them can get ugly, but it's easy to avoid: Just use generator expressions and chain them. No real runtime overhead that way.
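A minimal sketch of chaining generator expressions instead of nesting them (the variable names and the arithmetic are made up for illustration). Each stage is lazy, so no intermediate list is built.

```python
numbers = range(1, 11)

# Stage 1: transform each item.
doubled = (n * 2 for n in numbers)

# Stage 2: filter the output of stage 1.
divisible_by_three = (n for n in doubled if n % 3 == 0)

# Nothing has been computed yet; sum() pulls items through both stages.
total = sum(divisible_by_three)
```

Each named stage can be read, tested, or swapped on its own, which is most of what the nested version loses.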
collection
.map(x => x * 2)
...
.filter(isOdd)
.reduce(blargh)
... style syntax that most other modern, C-style languages offer now (ES6, Rust, Ruby, C# ...)
    [p for p in range(2,100) if all(p%q!=0 for q in range(2,p))]
with
    range(2,100).filter(p => range(2,p).map(q => p%q!=0).all())
It might be a small thing, but in the first one I feel the prime 'p' becomes the centerpiece, whereas in the second, 'p' is buried somewhat in the expression.
I think chaining syntax from e.g. Elixir gives you more flexibility than map/filter/reduce defined on objects too. Big fan of that, though it does interact oddly with Elixir's optional parentheses for function calls (which is a mistake of the language imo)
I hate this design approach deep in my soul. It makes for very brittle code that creates lots of backward compatibility issues. If you're working on some legacy code that has some nonsense like
foo.get_status().dispatch_handler().log_error().close()
it is maddening! You have to untangle just what exactly gets returned by every step of the chain, so that you can ensure you're in the right context to know exactly what the next call of the chain is doing.
In that example, say someone changes `foo.get_status()` to return some new kind of "status" object, and it alters the `dispatch_handler` and so on. Of course one can implement this in a way where the chain of downstream calls doesn't break, but the point isn't so much that, through huge engineering effort it is possible, but rather that it is extremely brittle and adds a layer of complexity that's not needed.
It's just so much better to write something like:
dispatch_result = run_dispatcher(foo.get_status())
log_error(dispatch_result)
When the intermediate points of the chain are just functions, instead of member functions of a class, it means you can easily experiment with them and figure out what's going on without needing to recreate the entire set of context along the whole chain.
`run_dispatcher` in my example would be a hell of a lot easier to unit test and throw some mocked example class into for debugging or refactoring than if it is `some_class.run_dispatcher` ... and then if `some_class` has child classes that specialize the behavior, you're just hosed.
The problem is composability. People think that the fluent interface makes things composable because from some arbitrary point in the middle of the chain of calls, they have easy attribute-like access to the next operation they want to do. This artificially feels easy and convenient.
But contrast this to a functional language like Haskell, where none of these things need to be member functions of an object, and hence the context of the object doesn't have to be created at any point in the fluent chain. Then you can write something even better:
(close . logError . dispatchHandler . getStatus) foo
We can even easily refer to this whole chain of events with a single function name:
    let statusDispatchLog = (close . logError . dispatchHandler . getStatus)
(And, of course, we get lots of nice type checking in statically typed languages to ensure that the composition actually makes sense -- which not only protects you at run time, but is also a huge help to clue you in to your design flaws. If you're trying to shoehorn some stuff into a fluent interface and it's not working, it probably means you haven't thought clearly about how the methods should "flow" in the call chain.)
To do the same thing in a fluent interface, we need a horrible lambda or a whole new function definition, exactly because the fluent interface is only sweeping the composability issues under the rug.
statusDispatchLog = lambda x: x.get_status().dispatch_handler().log_error().close()
The difference is subtle, but important. Instead of making a new function that is explicitly the composition of other functions, you are making a function that just happens to access other functions as attributes, and if you set it up correctly then it acts as a sequence of composition.
In Python this is particularly a shame because functions are first class objects. Of course, you can write helper functions / decorators that sort of do function composition (if you're willing to throw away useful argument signatures), or you can use flaky hacks like the common Infix pattern in Python, and then live with ugly "<< . >>" or "|.|" misleading syntax.
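As a rough illustration of the helper-function approach mentioned above, here is one way a `compose` helper might look. The helper and the three toy functions are hypothetical, and as noted, the composed function carries no information about the signatures of its parts.

```python
from functools import reduce


def compose(*functions):
    """compose(f1, f2, f3)(x) == f1(f2(f3(x))); applied right to left."""
    def composed(x):
        return reduce(lambda acc, func: func(acc), reversed(functions), x)
    return composed


# Toy single-argument functions, for illustration only.
def add_one(x):
    return x + 1


def double(x):
    return x * 2


def minus_three(x):
    return x - 3


# Reads like (minus_three . double . add_one) in Haskell notation.
pipeline = compose(minus_three, double, add_one)
results = list(map(pipeline, [1, 2, 3]))
```

The composed `pipeline` can be passed to `map` directly, with no lambda spelling out the chain.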
It always makes me sad that Python lacks an extremely short function composition infix operator that provides some information about the function signatures of the functions being composed.
Because not even a comprehension can help you when you need to do the fluent interface stuff in Python.
    [x.h().f().g() for x in some_iterator]
This is so much worse than
    map(g.f.h, someIterator)
or
    [(g.f.h) x | x <- someIterator]
or even
    [g(f(h(x))) for x in some_iterator]
1. Rey is not a Jedi. A better dict key would be "fav_force_user".
2. For the planets, Episode II is missing Coruscant, Episode VI is missing Dagobah.
Edit: formatting
We like hearing a story we already know. Rey is well on her way to become a Jedi, probably in two-movies' time.
2: At some point while writing this I realised I would spend all evening if I had to round up all the planets, and chose to note that the lists are "non-exhaustive" instead :)
//Just some Sunday fun. :)
Python showed me the Force, but I'm with the Dark Side now.
>>> n = int(bbs, 2)
>>> n.to_bytes((n.bit_length() + 7) // 8, 'big').decode()
's no i sn e h e r pm o c'
>>> _.replace(' ', '')[::-1]
'comprehensions'
http://stackoverflow.com/questions/7396849/convert-binary-to...
I considered using the [::-1] syntax to reverse the list, but decided that there was enough "cute" stuff in the examples already.
    def palindrome(s):
        return s == s[::-1]
It could be discussed whether ''.join(reversed(s)) is more readable for a novice programmer learning Python. In general, Python prefers words over punctuation.
Also, there are objects that can be reversed() that are not sequences.
http://stackoverflow.com/questions/931092/reverse-a-string-i...
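A small sketch of both points, with a made-up `Countdown` class standing in for "an object that can be reversed() but is not a sequence". It defines `__reversed__` but not `__getitem__`, so slicing with `[::-1]` fails while `reversed()` works.

```python
def palindrome(s):
    # Slice-based reversal: terse, works on any sequence.
    return s == s[::-1]


def palindrome_with_reversed(s):
    # Word-based reversal: reversed() returns an iterator, so join it back.
    return s == "".join(reversed(s))


class Countdown:
    """Hypothetical example: reversible, but not a sequence."""

    def __init__(self, start):
        self.start = start

    def __iter__(self):
        return iter(range(self.start, 0, -1))

    def __reversed__(self):
        return iter(range(1, self.start + 1))


forwards = list(Countdown(3))
backwards = list(reversed(Countdown(3)))

try:
    Countdown(3)[::-1]
    slicing_works = True
except TypeError:
    # No __getitem__, so the slice syntax is not available.
    slicing_works = False
```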
    import struct
    result=''.join([chr(int(bits,2)) for bits in struct.unpack('8p'*(int(len(bbs)/8)),bbs)])
    print(result.replace(' ','')[::-1])
(Not to mention half of ES6 existed in CoffeeScript first, but that's a gripe for another day)
I love Python and write code in it daily but as a (pseudo-)functional language it feels very awkward to me.
    FA('0...11'.split(''))
      .chunk(8)
      .map(x => parseInt(x.join(''), 2))
      .map(x=>String.fromCharCode(x))
      .filter(x=>x!=' ')
      .reverse()
list of items --> filter list --> operate on list
becomes
operate on list <-- (list of items --> filter list)
And because this is how it was decided to tackle map/filter problems, we'll always have a weird gimped anon-function operator instead, to discourage the map/filter patterns of every other language
Han Solo said it best "Hokey religions and ancient weapons are no match for a good blaster at your side, kid."
Let's keep it explicit, alright? It's better than implicit.
Edit: read the article; it makes Python neither cryptic nor culty. The Jedi thing is just a cool SW reference. That said, list comprehensions are a little cryptic to me since I haven't written Python in a while. Personally I think an important attribute of elegance in programming is how little you need to read the docs to understand something, and the more one-liners we do, the more times we are likely to have to look at the documentation before reading them (not necessarily a bad thing, but sometimes a time sink, and sometimes people won't look up the docs when they should!).