As with most tools of its kind, this is a good thing if you're looking to hand off very simple NLP work to a junior developer.
If you're an engineer working in the NLP space, using this API would be like tying one hand behind your back. It introduces its own performance problems, and it obscures a number of configuration options that the APIs of the libraries it wraps expose. I also find that its object-oriented style tends to hide performance bottlenecks by obscuring how much just-in-time computation occurs for a given string.
Also, I hate to admit this, but the Java/Scala NLP stack is beating out most Python NLP libraries these days. NLTK only _just_ got Stanford CoreNLP's best-in-class dependency parser, which has been available in Java for years.
spaCy's native Cython dependency parser is both faster and more accurate than CoreNLP.
The NP chunks example from the post:
>>> from spacy.en import English
>>> nlp = English()
>>> doc = nlp(u'ITP is a two-year graduate program located in the Tisch School of the Arts. Perhaps the best way to describe us is as a Center for the Recently Possible.')
>>> for np in doc.noun_chunks:
... print(np.text)
...
ITP
a two-year graduate program
the Tisch School
the Arts
the best way
us
a Center

edit: sorry, I just noticed that it is available for free under the AGPL 3 license.
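For intuition about what `noun_chunks` is doing, here is a toy chunker over (word, POS) pairs. This is purely illustrative and not spaCy's actual algorithm: spaCy derives noun chunks from its dependency parse, not from a flat POS pattern like this, and the tag names and example below are my own.

```python
def np_chunks(tagged):
    """Toy noun-phrase chunker: collect maximal runs of
    determiner/adjective/noun tokens that end in a noun.
    `tagged` is a list of (word, pos) pairs."""
    chunks, current = [], []
    # Sentinel so the final run is flushed like any other.
    for word, pos in tagged + [("", "END")]:
        if pos in ("DET", "ADJ", "NOUN", "PROPN"):
            current.append((word, pos))
        else:
            # Trim trailing non-nouns, then keep the run if non-empty.
            while current and current[-1][1] not in ("NOUN", "PROPN"):
                current.pop()
            if current:
                chunks.append(" ".join(w for w, _ in current))
            current = []
    return chunks

# Hypothetical pre-tagged input for the first clause of the example above.
tagged = [("ITP", "PROPN"), ("is", "VERB"), ("a", "DET"),
          ("two-year", "ADJ"), ("graduate", "ADJ"), ("program", "NOUN")]
print(np_chunks(tagged))  # ['ITP', 'a two-year graduate program']
```

Note that a flat pattern like this can't handle the prepositional attachments in the full sentence ("the Tisch School of the Arts"), which is exactly why a parse-based chunker is the better design.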
I'm an NLP person, and I think the Wit.ai people said it best:
Many papers were kind of “the state of the art for X was Y. We replaced the hand-crafted, manually hacked, heavily engineered Z by a RNN. It improved state of the art by 5 points.” The poor guys who presented deep learning-free papers invariably got the question: “did you also try with a [insert deep net technique here]?”[1]
The only downside is that traditional NLP tools are still probably easier to use, and you'll usually need to understand the traditional vocabulary to be able to talk to other people about your problems.
It's better than textblob / nltk in many ways.
EDIT: From today on spacy is free for commercial use! (MIT license).
Sorry, it doesn't seem that much better to me.
In most NLP libraries, you have to work out two things: what exists, and what you can actually build against. The first is often well documented, but on the second you may be left with no comment at all, even when a model's output is no better than chance. You just have to try it out and see for yourself.
Multi-lingual support is not there yet, but when it is, it'll be good.
Multi-lingual support is an important issue, and a key reason I decided to relocate from Sydney and base the business in Berlin. My new co-founder, Henning Peters, is a native speaker of German.
This is a particularly beautiful articulation of the complexity of English (and language in general).