- fewer dependencies for my package
I've written the average() and standard_deviation() functions at least a couple of dozen times, because it doesn't make sense to require numpy in order to summarize, say, benchmark timing results.
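For illustration, here is a minimal sketch of the kind of helper pair meant here. The function names come from the text above; the bodies, the choice of the sample (n - 1) form, and the timing values are assumptions for the example, not any particular library's implementation.

```python
import math

def average(values):
    """Arithmetic mean of a non-empty sequence of numbers."""
    values = list(values)
    if not values:
        raise ValueError("average() requires at least one value")
    # math.fsum limits the rounding error that accumulates with plain sum()
    return math.fsum(values) / len(values)

def standard_deviation(values):
    """Sample standard deviation (assumed here: n - 1 in the denominator)."""
    values = list(values)
    if len(values) < 2:
        raise ValueError("standard_deviation() requires at least two values")
    mean = average(values)
    sum_sq = math.fsum((x - mean) ** 2 for x in values)
    return math.sqrt(sum_sq / (len(values) - 1))

# Made-up benchmark timings in seconds, purely for demonstration
timings = [0.25, 0.27, 0.24, 0.26, 0.28]
print("mean=%.3f s, stddev=%.3f s" % (average(timings), standard_deviation(timings)))
```

Nothing beyond the standard library's math module is needed, which is the point: a dozen or so lines cover the benchmark-summary case.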
- reduced import time
NumPy and SciPy were designed with math-heavy users in mind, who start Python once and either work in the REPL for hours or run non-trivial programs. They were not designed for lightweight use in command-line scripts.
"import scipy.stats" takes 0.25 seconds on my laptop, in part because it brings 439 new modules into sys.modules. That's crazy-mad for someone who just wants to compute, say, a Student's t-test, when the implementation of that test is only a few dozen lines long. (Part of that length comes from the stddev() it depends on.)
Sure, 0.25 seconds isn't all that long, but that's also on a fast local disk. In one networked filesystem I worked with (Lustre), the stat calls were so slow that just starting Python took over a second. We fixed that by switching to zip import of the Python standard library and deferring imports unless they were needed, but there's no simple solution like that for SciPy.
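To make the import-cost numbers above concrete, here is a rough sketch of how such a measurement could be taken, plus a tiny example of deferring an import. It assumes Python 3 (for time.perf_counter); import_cost() and load_config() are hypothetical helper names made up for this example, and colorsys and json are stand-ins so that the sketch runs without SciPy installed.

```python
import importlib
import sys
import time

def import_cost(module_name):
    """Import module_name and return (elapsed seconds, new sys.modules entries).

    Hypothetical helper: it only counts modules that were not already
    loaded, so for a fair number it should run in a fresh interpreter.
    """
    before = set(sys.modules)
    start = time.perf_counter()
    importlib.import_module(module_name)
    elapsed = time.perf_counter() - start
    added = set(sys.modules) - before
    return elapsed, len(added)

def load_config(text):
    """Hypothetical example of a deferred import."""
    # json is imported here rather than at module level, so a script that
    # never calls load_config() never pays for loading it.
    import json
    return json.loads(text)

elapsed, added = import_cost("colorsys")
print("import colorsys: %.4f s, %d new modules" % (elapsed, added))
```

Passing "scipy.stats" instead of "colorsys" in a fresh interpreter should give figures comparable to the ones quoted above, though the exact values depend on the machine, the filesystem, and the SciPy version.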
- less confusing docstring/help
Suppose you read in the documentation that the Student's t-test is implemented as scipy.stats.t.
>>> import scipy.stats
>>> scipy.stats.t
<scipy.stats.distributions.t_gen object at 0x108f87390>
It's a bit confusing to see scipy.stats.distributions.t_gen appear, but okay, it's some implementation thing. Then you do help(scipy.stats.t) and see:
Help on t_gen in module scipy.stats.distributions object:
class t_gen(rv_continuous)
| A Student's T continuous random variable.
|
| %(before_notes)s
|
...
|
| %(example)s
Huh?! What are %(before_notes)s and %(example)s? The answer is that scipy.stats auto-generates various parts of its distribution functions, including things like docstrings. Only, help() gets confused by that, because help() uses the class docstring while SciPy modifies the generator instance's docstring. To see the correct docstring, you have to print it directly:
>>> print scipy.stats.t.__doc__
A Student's T continuous random variable.
Continuous random variables are defined from a standard form and may
require some shape parameters to complete its specification. Any
optional keyword parameters can be passed to the methods of the RV
object as given below: