s1 += s2
There are details at point six of http://docs.python.org/2/library/stdtypes.html#sequence-type..., where it also says that str.join() is preferable.

Method 1: 0.115 seconds
Method 2: I gave up after >120s
Method 3: 0.265
Method 4: 0.160
Method 5: 0.220
Method 6: 0.098
I ran each one a few times to make sure the times were roughly correct. Method 6 is still the fastest, but the naive method one is really close; it has obviously been optimized. Actually, they're all pretty close, with the obvious (and hideous) outlier of MutableString.

EDIT:
I just remembered I have an old version of PyPy (1.8) on my Mac. Thought I'd give that a try.
Method 1: I gave up after >120s
Method 2: I gave up after >120s
Method 3: 0.090 seconds
Method 4: 0.102
Method 5: 0.430
Method 6: 0.102
Method one is a problem again, and method 5 (the pseudo file) is noticeably slower. Otherwise the results aren't too far off.

For a 10M loop count: method 1 -> 1.599 s, method 6 -> 1.91 s
For a 30M loop count: method 1 -> 4.967 s, method 6 -> 5.871 s
Summary: The KISS s1 += s2 always wins
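For anyone who wants to rerun the comparison, here is a minimal sketch of how methods 1 and 6 could be timed. It assumes method 6 is the list-comprehension join, uses Python 3 syntax (range and str() instead of xrange and backticks), and picks a loop count much smaller than the 10M and 30M runs above. The function names and LOOP_COUNT are mine, not from the article.

```python
import timeit

LOOP_COUNT = 100000  # assumed size, far smaller than the 10M and 30M runs above


def method1(loop_count=LOOP_COUNT):
    # Naive concatenation: s1 += s2 inside a loop.
    out = ''
    for num in range(loop_count):
        out += str(num)
    return out


def method6(loop_count=LOOP_COUNT):
    # Assumed shape of method 6: build a list, then join it once.
    return ''.join([str(num) for num in range(loop_count)])


if __name__ == '__main__':
    for func in (method1, method6):
        # Best of three single runs, in seconds.
        seconds = min(timeit.repeat(func, number=1, repeat=3))
        print('%s: %.3f s' % (func.__name__, seconds))
```

Which one wins will depend on the interpreter and its version, as the CPython and PyPy numbers in this thread already show.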
from __pypy__.builders import StringBuilder

def build():  # wrapper name is illustrative; the snippet's return implies a function
    builder = StringBuilder()  # a size estimate can optionally be passed here
    for i in xrange(10000):
        builder.append(str(i))
    return builder.build()

As I understand it, this is implementation dependent. CPython has gone to some lengths to optimize its string concatenation code to do things behind the scenes that are basically equivalent to some of the faster methods given in the article. Other Python implementations have not necessarily done that, so string concatenation can be a lot slower.
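Since the behaviour depends on the implementation, one option is to use PyPy's StringBuilder only when it can be imported and fall back to a plain join everywhere else. This is only a sketch in Python 3 syntax: concat_numbers is a made-up name, and I'm assuming that an ImportError on __pypy__.builders is a good enough signal that we are not running on PyPy.

```python
try:
    # Only importable on PyPy; this import fails on CPython.
    from __pypy__.builders import StringBuilder
except ImportError:
    StringBuilder = None


def concat_numbers(count):
    # Hypothetical helper: concatenate str(0) .. str(count - 1).
    if StringBuilder is not None:
        builder = StringBuilder()
        for i in range(count):
            builder.append(str(i))
        return builder.build()
    # Portable fallback: collect the pieces and join them once.
    return ''.join([str(i) for i in range(count)])


if __name__ == '__main__':
    print(len(concat_numbers(10000)))
```

The caller gets the same string either way, so the choice of builder stays an internal detail.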
return ''.join(`num` for num in xrange(loop_count))
On one hand, it avoids creating a temporary list in memory. On the other, it can't know in advance how long the final output of the loop will be and so couldn't use tricks like preallocating enough RAM.

http://docs.python.org/2/library/timeit.html
Results:
$ python -m timeit '"-".join(str(n) for n in range(100))'
10000 loops, best of 3: 40.3 usec per loop
$ python -m timeit '"-".join([str(n) for n in range(100)])'
10000 loops, best of 3: 33.4 usec per loop
$ python -m timeit '"-".join(map(str, range(100)))'
10000 loops, best of 3: 25.2 usec per loop

Full results: https://gist.github.com/dbarlett/6479378
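The same three statements can also be timed from a script instead of the command line. A rough sketch using the timeit module: VARIANTS and best_usec are names I made up, and the loop and repeat counts simply mirror the "10000 loops, best of 3" output above.

```python
import timeit

# The three statements from the command-line runs above.
VARIANTS = {
    'generator': '"-".join(str(n) for n in range(100))',
    'list comprehension': '"-".join([str(n) for n in range(100)])',
    'map': '"-".join(map(str, range(100)))',
}


def best_usec(stmt, number=10000, repeat=3):
    # Best total time over `repeat` runs, converted to microseconds per loop.
    best = min(timeit.repeat(stmt, number=number, repeat=repeat))
    return best / number * 1e6


if __name__ == '__main__':
    for name, stmt in VARIANTS.items():
        print('%-20s %.1f usec per loop' % (name, best_usec(stmt)))
```

Absolute numbers will differ from the ones posted here depending on machine and interpreter; only the relative ordering is worth comparing.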
No, because that would make the algorithm quadratic again (one loop for the scan, second loop for the concatenation), and the whole point of the join idiom, as recommended by the Python documentation, is to avoid a quadratic algorithm.
''.join(map(str, range(n)))

In Python 3, the map builtin is basically equivalent to what itertools.imap was in Python 2, so it is the best choice. (Also, the range builtin in Python 3 no longer realizes a list.)
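A quick way to see the Python 3 behaviour described here (a sketch only; join_numbers is just an illustrative name):

```python
def join_numbers(n):
    # In Python 3, range(n) and map(...) are both lazy, so the caller
    # never builds an explicit list before handing the items to join().
    return ''.join(map(str, range(n)))


pieces = map(str, range(3))
print(type(pieces).__name__)  # prints "map": an iterator, not a list
print(join_numbers(5))        # prints "01234"
```

In Python 2 the same map call would return a fully built list, which is why itertools.imap was the lazy option there.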
(Here I'm using the stock Python 2.7.1 on a 2011 MacBook Pro.)
$buff = '';
for ($i = 0; $i != 100000; ++$i) {
    $buff .= $i;
}
Runs in about 0.06s

$buff = array();
for ($i = 0; $i != 100000; ++$i) {
    $buff[] = $i;
}
$ret = implode('', $buff);
Runs slower, in about 0.10s

$ret = implode('', range(0, 100000));
Takes roughly the same time, 0.10s

I did many kinds of performance tests when implementing my PyClockPro caching lib, even though I don't use strings in it.