Explicit is better than implicit.
And yet, s = ["one", "two" "three"] will implicitly and silently do something that is probably wrong most of the time.
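For anyone who hasn't hit this yet, a minimal sketch of what that line actually produces. Nothing here is assumed beyond the literal from the comment above; adjacent string literals are merged by the compiler before the list is built.

```python
# The missing comma between "two" and "three" is not an error:
# the two adjacent literals are joined into a single string.
s = ["one", "two" "three"]

print(s)       # the list has two elements, not three
print(len(s))
```

The same merging happens outside of lists, which is why it is also used on purpose to split long strings.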
There should be one-- and preferably only one --obvious way to do it.
The author used two different ways of hyphenating (three, if you count the whole PEP 20). PEP 20 is clearly not meant to be taken as law. Nor PEP 8. Nor PEP 257.
People frequently mistake "one obvious way" for "one way". There are lots of ways to iterate through something, for example, but there is really one obvious way. And the philosophy here still applies: when you read anyone else's Python code, the obvious way is probably doing the obvious thing. I think that is the more appropriate takeaway from PEP 20.
Not in comparison to Perl, which usually has multiple ways to do anything, each 'obvious' to different sets of people (each Perl codebase therefore seems to have a distinct dialect based on which 'obvious' alternatives are chosen).
The other direction languages can take that is being contrasted, is there being one non-obvious way to do something.
Python's 'most obvious way' isn't necessarily the fastest/most concise/most efficient/scalable/etc. way to do something in Python, but it will usually be obvious to most Python developers. And although broad styles have certainly developed over time (imperative, functional, OO) as Python has gained power and flexibility, the dictum still largely holds true.
I knew Python wasn't for me in my first foray into it when I fired its REPL and then went to exit it with control-C or whatever and it literally printed out the right way to do it but then didn't do it. Python was more interested in having me do things a certain way even when it knew what I intended to do, just to be a twit.
from that context it makes sense, because the only goal of python in the 1990s was to be more popular than perl, which was notorious for having many ways of doing the same thing.
but yeah, python has had significant feature creep over the years, it's nowhere near the small clear lang it used to be.
What? Something being complex is artificial, we try to avoid it. Problems can be complicated, we try to simplify them, and the more complicated the problem is, the more complex the solutions we tend to develop. So comparing them does not make sense?
Or did I always know them wrong?
This is a list and you must explicitly place a comma when you want to start a new element in the list. Is there ever a time a new element follows a previous one and is NOT separated by a comma? No, this is explicit.
Whereas, strings also always concatenate in this manner be it in a list context or not. It seems like you're assuming behaviors from other languages would be the same in another.
Sarcasm aside, I'd assume people primarily list things in between [ and ], and sometimes concatenate things in there too. The language should err on the side of doing what people expect, unless explicitly told not to.
> It seems like you're assuming behaviors from other languages would be the same in another.
Rather, I think people expect a language, especially one this big and important, to work for them, and not to be designed with unergonomic features instead.
I'd expect that to be an error.
Yes:
[ "one, two", "three" ]
The comma is not an absolute context-free indicator of element separation.
(Python copies some bad ideas from C. Another one is having to import everything you use. It seems that since Python is written in C, its designer took it for granted that there will be something analogous to #include for using libraries, even standard ones that come with the language.)
Implicit string literal catenation is tempting to implement because it solves problems like:
printf("long %s string"
"nicely breaks up"
"with indentation and all",
arg, arg, ...)
and if you're working in a language which has comma separation everywhere, you can get away with it easily.
There are other ways to solve it. In TXR Lisp, I allow string literals to go across multiple lines with a backslash newline sequence. All contiguous unescaped whitespace adjacent to the backslash is eaten:
This is the TXR Lisp interactive listener of TXR 273.
Quit with :quit or Ctrl-D on an empty line. Ctrl-X ? for cheatsheet.
TXR needs money, so even abnormal exits now go through the gift shop.
1> "abcd \
efg"
"abcdefg"
If you want a significant space, you can backslash escape it; the exact placement is up to you:
2> "abcd\ \
efg"
"abcd efg"
3> "abcd \
\ efg"
"abcd efg"
4> "abcd \ \
efg"
"abcd efg"
5> "abcd \ \
\ efg"
"abcd efg"
Maybe it is that through my work I use a half dozen languages, where it is hard to remember each in detail.
I have also worked on a javascript project where there were no imports/requires and the build process created one file. So you had to inspect the confusing build script to even know what was what.
And especially how I can choose the best way to indicate the sources of names in my code:
import time
t = time.perf_counter()

import time, my_module
t1 = time.perf_counter()
t2 = my_module.perf_counter()

from time import perf_counter as std_counter
from my_module import perf_counter as my_counter
t1 = std_counter()
t2 = my_counter()

try:
    from my_module import perf_counter
except ImportError:
    # Fall back to standard implementation
    from time import perf_counter
t = perf_counter()

# import time as m
import my_module as m
t = m.perf_counter()

Build processes creating one file is the seven decade norm in computing.
Even if you literally don't catenate the .js files into one, they get loaded into one running image one way or another.
long %s stringnicely breaks upwith indentation and all"
? In my experience, this always gets ugly when you want to insert spaces (= about always). Do you put them at the end or at the start of each string (apart from the first or last string)?
I think Scala's mkString (https://superruzafa.github.io/visual-scala-reference/mkStrin...) is the best solution, visually, for such things, but unfortunately, it would require hacks in the parser to do the concatenation at compile time, where possible.
Scala’s multiline strings look nice, too, if you want to insert newlines, except for the stripMargin thing (https://docs.scala-lang.org/overviews/scala-book/two-notes-a...)
The alternative is what exactly? Have the entire standard library exposed at once? Make all modules create non-conflicting names for exported objects, so that the json parse function has to be called json_parse and the csv parse function has to be called csv_parse?
Seems less than ideal to me.
If these things are classes in a plain old single-dispatch OOP system, you can have a json-parser and csv-parser which have parse methods.
There could be packages/namespaces. So csv:parse and json:parse. These packages are standard and so they just exist; nothing to import.
In Python, you cannot use anything without an import! The top-level modules (which serve as de facto namespaces) themselves are not visible.
Say there is a csv module with a parse. You cannot just do:
csv.parse(...)
you have to first say import csv
This is jaw-droppingly moronic.
@"
here strings in PS are fine for this purpose and
even allows whitespace anywhere
but because of the latter you can't indent it
with your other code
"@ -split "`r`n" | % {'<SOL>{0}<EOL>' -f $_ }
<SOL> here strings in PS are fine for this purpose and <EOL>
<SOL> even allows whitespace anywhere <EOL>
<SOL> but because of the latter you can't indent it <EOL>
<SOL> with your other code <EOL>
https://unix.stackexchange.com/questions/76481/cant-indent-h...
And backslash doesn’t let you have the literal obey the proper indenting. Might as well use “””
I don't want to be finding definitions of things that the language provides in the code.
Languages that don't work this way have IDE's, editor plug-ins or other tools for easily finding the definitions of things that are in the language, without hunting for them through intermediate definition steps in the same file.
"I've spent all my life in and out of jails, so I expect bars on doors and windows ..."
If the language supports json, it should just do that.
1> #J[1,2,3]
#(1.0 2.0 3.0)
2> (get-json "[1,2,3,{\"foo\":true}]")
#(1.0 2.0 3.0 #H(() ("foo" t)))
3> (put-json #(1.0 2.0 t))
[1,2,true]t
It's an essential feature used in all sorts of everyday code.
C99 added printf conversion specifiers that are hidden behind macros, and idiomatic usage of them relies on string catenation.
uint32_t x = 0;
printf("x = %" PRIx32 "\n", x);
where PRIx32 might expand to "lx" (if uint32_t is the same as unsigned long in that compiler).
All sorts of C macrology relies on string catenation. Kernel print messages:
printk(KERN_EMERG "%s: temperature sensor indicates fire!", dev->name);
                 ^ must not have comma here
Missing an operator resulting in explicit behavior is much more subtle and not even obvious behavior. For those who use python, it is worse.
Yes, you'll certainly find somebody who doesn't know what 'not statically typed' means, but ... And yes, there are also C(++) users, that expect strings to be concatenated like that.
But that's just what comes with a hyper flexible language like python. You can do lots of things in lots of different ways, but you can also screw things up just as easily, and your IDE won't tell you because technically it's valid code.
Is this "operator" overloadable on each type in Python?
And that scares me a lot. I think I have to reevaluate my position towards Python.
I get the use case as you described it, but it just seems like minimal effort to accomplish and have some semblance of explicit/safety.
It's common in some languages and used the way you use it. I looked in PEP8 and it seems they don't discuss this.
I think it's a perfectly valid use case, but clearly there are two camps to this. If this is so contentious, I would recommend PEP8 be revised to either explicitly endorse it as a way to split long lines or to explicitly discourage it and recommend the + operator instead.
mylongstring = "hello" +
"world"
No idea if python's way of indentations allows this but sounds like it should
The string concatenation in itself should not be a problem as it's really just string constants. (But again, it might be irony exactly because of this :) )
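On the trailing "+" question a few lines up: as far as I know, a bare "+" at the end of a line does not continue the statement in Python, but wrapping the expression in parentheses does. A small sketch to check it; the variable names are just for illustration.

```python
# A statement that ends in "+" with the operand on the next line
# is rejected by the compiler.
broken = 'mylongstring = "hello" +\n    "world"\n'
try:
    compile(broken, "<example>", "exec")
    trailing_plus_ok = True
except SyntaxError:
    trailing_plus_ok = False

# Inside parentheses the expression may span lines, so an explicit "+" works.
mylongstring = ("hello" +
                "world")

print(trailing_plus_ok, mylongstring)
```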
I come from a programming platform (C#) where productivity is a key element of language design. I highly doubt that Anders Hejlsberg would have accepted such an error-prone concept like a literal free implicit operator on a key type like strings.
("foo" "bar", "baz")
and
("foo", "bar", "baz")
I've moved away from working in Python in general, but I think the #1 feature I want in the core of the language is the ability to make violating type hints an exception[1]. The core team has been slowly integrating type information, but it feels like they have really struggled to articulate a vision about what type information is "for" in the core ecosystem. I think a little more opinion from them would go a long way to ecosystem health.
[1] I know there are libraries that do this, I am not seeking recommendations.
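To make the idea concrete (not a recommendation, just what "violating a type hint raises" could look like): a rough sketch using only the standard library. The decorator name enforce_hints and the function label are made up for illustration, and it only checks hints that are plain classes on ordinary parameters; generics, unions, *args and **kwargs are out of scope here.

```python
import functools
import inspect
from typing import get_type_hints


def enforce_hints(func):
    """Raise TypeError when an argument does not match a plain-class hint."""
    hints = get_type_hints(func)
    signature = inspect.signature(func)

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        bound = signature.bind(*args, **kwargs)
        for name, value in bound.arguments.items():
            expected = hints.get(name)
            # Only plain classes (int, str, ...) are checked in this sketch.
            if isinstance(expected, type) and not isinstance(value, expected):
                raise TypeError(
                    f"{name} must be {expected.__name__}, "
                    f"got {type(value).__name__}"
                )
        return func(*args, **kwargs)

    return wrapper


@enforce_hints
def label(count: int) -> str:
    # Without the decorator, label("3") would quietly return "3 items".
    return f"{count} items"


print(label(3))
```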
https://github.com/UWQuickstep/quickstep/pull/9
https://github.com/tensorflow/tensorflow/pull/51578
Also, I personally don't mind this approach to string concatenation. I think it's a fine compromise between easy formatting and clarity. I was whining about a corner case of tuple construction - which as far as I know is not a feature of any other language.
In hindsight, singleton tuples are not common or useful enough to deserve their own syntax. If the way to create them was something like this:
t = tuple.single("hello")
we'd think it's ugly or inconsistent, but definitely not confusing or bug-prone.
x = (1,2,3)
#print("the value of x is %s" % x) # breaks if x is a tuple
print("the value of x is %s" % (x,)) # works even if x is a tuple
There is a readable way to create singleton tuples, without the sneaky trailing comma or a new function like tuple.single:
tuple(["hello"])
The square brackets can be slightly annoying. I recall writing the following function to omit them:
def tup(*args):
    return tuple(args)
This basically lets you use the usual tuple syntax, just prefixed with the word "tup". The advantages are that you don't need a trailing comma for singleton tuples, and it's more obvious that a tuple is being created (it can be difficult to distinguish between tuple literals and parentheses used for grouping in a complex expression).
I am reminded of a somewhat similar issue with empty set literals: {1,2} is a set, {1} is a set, but {} is a dict. The way to create empty sets is using set().
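Putting the tuple and set corner cases from the last few comments side by side, as a quick sketch. The tup helper is the one described above; everything else is built-in behavior.

```python
def tup(*args):
    """Build a tuple from the arguments, no trailing comma needed."""
    return tuple(args)


single = tup("hello")            # one-element tuple
also_single = tuple(["hello"])   # same thing via a list
just_a_string = ("hello")        # parentheses alone only group
letters = tuple("hello")         # pitfall: iterates over the characters

empty_braces = {}                # a dict, not a set
empty_set = set()                # the only way to spell an empty set

print(single, also_single, type(just_a_string).__name__, letters)
print(type(empty_braces).__name__, type(empty_set).__name__)
```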
I just wish that the core team would take that same zeal for a "pythonic" experience with small code and use it to develop more scaled-up systems for dealing with larger code bases. My idea is to enforce strong pre-conditions on function calls using type hints, but I am sure there are other ways to do it.
char ch_arr[3][10] = {
"uno",
"dos"
"tres"
};
I say "had a proper type system", but actually it turns out that it does have something like that: When I use python for anything else than a most tiny script now, I use "mypy"[1] which implements static typing according to some existing Python standard (whether that came about because of mypy or the other way around, I don't know).
It is so, so good to have mypy telling me where I messed up my code instead of receiving a cryptic, weird runtime error, or worse, no error and erratic runtime behavior. Because not knowing that a particular type is unexpected and wrong, values often get passed along and even manipulated until the resulting failure is not very indicative of the actual problem anymore.
Even without static typing, argument length verification etc. can be done with a suitable compiler. In python we are left chasing 100% code coverage in unit tests as it's the only way to be certain that the code doesn't include a silly mistake.
One of our products is a universal linter, which wraps the standard open-source tools available for different ecosystems, simplifies the setup/installation process for all of them, and a bunch of other usability things (suppressing existing issues so that you can introduce new linters with minimal pain, CI integration, and more): you can read more about it at http://trunk.io/products/check or try out the VSCode extension[0] :)
[0] https://marketplace.visualstudio.com/items?itemName=Trunk.io
More broadly, the https://codereview.doctors makers are making the point that their tool caught an easy-to-miss issue that most wouldn't think to add a rule for. A bit of an open question to me how many of those there really are at the language level, but still seems like a neat project.
Also in terms of mistakes codereviewdoctor twice linked to the same issue in their blog https://github.com/tensorflow/tensorflow/issues/53636 and raised the PR to the wrong project https://github.com/tensorflow/tensorflow/pull/53637 (I guess Tensorflow vendors Keras, easy mistake)
STOP!
This folder contains the legacy Keras code which is stale and about to be deleted. The current Keras code lives in github/keras-team/keras.
Please do not use the code from this folder.
Yeah, not the most obvious notice.
The fact they didn't find the same mistake(s) in keras-team/keras (I assume they scanned it, it's one of the most popular Python repos) makes me believe these issues have been fixed/removed in the up-to-date keras repo.
Also a factor that bugs in functional code are more visible, both during development and to users once shipped. So there may have been an equal number or more such bugs in the non-test code, that just didn't remain in the code base for this long.
typo in the url (or in HN's markup) btw: it's https://codereview.doctor
I've been bitten by this one at work, and can't help but think it is an insane behaviour, given that ['foo' + 'bar'] explicitly concatenates the strings, and ['foo', 'bar'] is the much more common desired result.
edit: This also applies to un-separated strings, so ['foo''bar'] also becomes ['foobar']
I don't think it fits well in python
And there’s no evaluation of importance as to whether these instances are in test files or non-critical code. Packages are big and can have hundreds or thousands of files.
It could be that if these mattered, they would have been detected and fixed.
A good example for unit tests and perhaps checking to see if these bugs are covered or not covered.
I like these kinds of analyses but don't like them presented as if it's some significant failure.
Python has a few of these things, which is really sad.
test did not work but did not fail either, imagine being that dev maintaining the code that the test professes to cover. Imagine being the user relying on the feature that test was meant to check (if the feature under test actually broke).
Along those lines. I wonder how many of these come from ad-hoc file path handling instead of using pathlib.
https://github.com/PyCQA/pylint/issues/1589
Is there usually enough context for a linter to make an educated guess?
Maybe as a matter of linting. As a matter of language design, I think + for string concatenation is a big mistake; using different symbols for numeric addition and string concatenation is something Perl got right.
> This PEP is rejected. There wasn't enough support in favor, the feature to be removed isn't all that harmful, and there are some use cases that would become harder.
The most common way to split a string in lines is using this concatenation formula.
I suppose this is also something you could catch with a linter?
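A linter can catch at least the simplest form. Here is a minimal sketch, assuming only the standard-library tokenize module; the function name find_implicit_concatenations is made up for illustration. It flags every plain string literal that directly follows another one, deliberate cases included, so it is a starting point and not a finished rule. f-strings are tokenized differently on newer Python versions and are not handled.

```python
import io
import tokenize


def find_implicit_concatenations(source):
    """Return (line, column) for each string literal that directly follows another."""
    hits = []
    previous_was_string = False
    tokens = tokenize.generate_tokens(io.StringIO(source).readline)
    for tok in tokens:
        # Line breaks inside brackets and comments are layout only; skip them.
        if tok.type in (tokenize.NL, tokenize.COMMENT):
            continue
        if tok.type == tokenize.STRING:
            if previous_was_string:
                hits.append(tok.start)
            previous_was_string = True
        else:
            previous_was_string = False
    return hits


sample = """words = (
    'yes',
    'affirmative'
    'agreed',
)
"""

print(find_implicit_concatenations(sample))
```

Whether a hit is a bug or an intentional line split still needs a human, or a heuristic such as "only complain inside a list or tuple that also contains commas".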
Using the plus operator to concatenate strings is just weird.
Think of the usual algebraic properties these operators are supposed to have.
"+" always is supposed to be commutative--so "a"+"b" = "b"+"a", if those mean alternatives (they usually do mean that in mathematics), is just fine.
On the other hand, multiplication is often not commutative--also not here. "a" "b" != "b" "a".
So string concatenation should be the latter. And indeed that's how it's in regular expression mathematics for example.
Furthermore, Python already uses * for strings to indicate repetition: ("foo" * 2 == "foofoo").
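The algebra argument is easy to check against what Python does today. A small sketch using only built-in string behavior:

```python
a, b, c = "a", "b", "c"

# Concatenation is associative and has an identity (the empty string) ...
associative = (a + b) + c == a + (b + c)
has_identity = "" + a == a and a + "" == a

# ... but it is not commutative, unlike numeric addition.
commutative = a + b == b + a

# Repetition with * behaves like repeated concatenation.
repeated = "foo" * 2

print(associative, has_identity, commutative, repeated)
```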
String concatenation really just needs its own separate operator. & is an obvious candidate, if only it wasn't so commonly appropriated for bitwise AND - which is a very poor use of a single-char operator as it's not something that you need often, especially in a language like Python.
On the other hand, D uses binary ~ for concatenation. That has a neat mnemonic: it's a "rope" that "ties strings together".
This might be nice from a math point of view, but I think users are going to be confused using "string"^3 for repetitions (instead of "string"*3). + and * make too much sense to the unwashed masses.
At any rate, explicit is better than implicit.
Well, except if you wanted to support user classes that could duck type as both strings and numbers, which it would make awkward.
puts "a" "b" == "ab" # true
and
puts "a"
"b" == "ab"
prints "a" with "b" == "ab" evaluated to false and discarded. This could create bugs as with Python. However
["a"
"b"] == ["ab"]
is a syntax error at the beginning of the second line. The parser expects a ]
It would evaluate to true if it were on one line.
# list
list = "a","b",
# function
def foobar
end
=> ["a", "b", :foobar]
The implicit concat of string literals is the culprit here. It really should require "+".
https://chromium-review.googlesource.com/c/v8/v8/+/2629465/3...
Personally, I prefer uniform lists with leading commas, because it's easier to add and remove lines for later, inevitable refactoring. For example, I prefer:
things = [
'foo'
, 'bar'
, 'baz'
]
This drives some people crazy, but I think it's the One True Way.
things = [
'foo',
'bar',
'baz',
]
even better? In your case, if you want to add something to the beginning of the list you'll have to modify two lines.
It doesn't seem to have anything to do with typing discipline.
words = (
'yes',
'correct',
'affirmative'
'agreed',
)
Would be a tuple (immutable list) of strings, while
words = (
'yes',
'correct',
'affirmative',
'agreed',
)
would also be a tuple of strings.
If Haskell had for some reason decided to have the same syntax sugar, it also would have caused an issue.
It's both good for those projects and for the company that does the marketing since they reach their exact target group. Plus it gets them on the front page of HN.
I was just doing some simple refactoring, changing a hard coded string into a parameterized list of f-strings that’s filtered and joined back into a string.
I’m glad that I had unit tests that caught the problem! I couldn’t figure out why it was breaking, that comma is very devilish to spot with the naked eye. I’m surprised my linters didn’t catch it either. Maybe time to revisit them.
https://github.com/YosysHQ/prjtrellis/pull/176
https://github.com/UWQuickstep/quickstep/pull/9
https://github.com/tensorflow/tensorflow/pull/51578
https://github.com/mono/mono/pull/21197
https://github.com/llvm/llvm-project/pull/335
https://github.com/PyCQA/baron/pull/156
https://github.com/dagwieers/pygments/pull/1
https://github.com/zhuyifei1999/guppy3/pull/12
https://github.com/pyusb/pyusb/pull/277
https://github.com/KhronosGroup/Vulkan-ValidationLayers/pull...
It is indeed a very common mistake in Python, and can be very hard to debug. It bit me once and wasted a whole day for me, so I've been finding/fixing them ever since trying to save others the same pain I went through.
EDIT: I will point out that I've found this error in other non-Python code too, such as c++ (see the 2nd PR for example).
Here's the regex for anyone curious:
[([{]\s*\n?(\s*['"](\w)+['"],\n)+(\s*['"]\w+['"]\n)(\s*['"]\w+['"],\n)*
{
'key': (
'long string long string long string'
)
}
Using parentheses like this to put long strings on their own line is standard practice.
title = 'Hello world',
I, for one, have often used this deliberately.
Instead of:
s = ['a', 'b', 'c']
I'll type:
s = 'a b c'.split()
For multiline lists where I want to get rid of leading whitespace I'll add lstrip():
lines = """line 1
line 2
line 3
""".split('\n')
lines = [line.lstrip() for line in lines]
..."there are perfectly cromulent reasons a developer would do implicit string concatenation spanning multiple lines"...
https://www.merriam-webster.com/words-at-play/what-does-crom...
"put"
, "Commas"
, "first"
, "to"
avoid these kinds of things.
A paragraph is repeated and the markdown links at the end are broken because there is a space between ] and (.