The Amazing Power of Word Vectors (2016)

(blog.acolyer.org)

127 points · SamuelKillin · 9y ago · 9 comments

iconvalleysil 9y ago

More interesting information from a DOD research lab that resulted in vectors https://www.kaggle.com/c/word2vec-nlp-tutorial/discussion/12...

charlescearl 9y ago

Department of Energy, not to be confused with the Department of Defense. Granted, there is a lot of nuclear weapons work at DOE, but LBNL mostly does big science work, up the hill from Berkeley.

Bioeye 9y ago

This is an awesome explanation of those papers! Does anyone have any cool examples of word2vec being used in a project? I'd be interested in seeing what people could make with it.

mabbo 9y ago

Document type classification. We wanted to predict which of k classes a new text document belonged to.

We trained 100-dim word vectors on all the text content we currently have, plus some 30,000 wiki articles related to the business. When new content comes in, we convert its words to vectors, average them, and use the resulting vector as the input to a basic classifier.

For how simple that is, the method is unreasonably good. Widely applicable too.
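A rough sketch of that pipeline, for anyone who wants to try it. Everything specific here is an assumption for illustration: the 4-dim vectors in `WORD_VECS` are made up (real ones would be trained with word2vec), the class labels are hypothetical, and nearest-centroid stands in for the "basic classifier", which the comment above does not name.

```python
import math

# Toy 4-dim vectors, invented for this sketch. In practice these would be
# ~100-dim vectors trained with word2vec on your own corpus.
WORD_VECS = {
    "invoice": [0.9, 0.1, 0.0, 0.1],
    "payment": [0.8, 0.2, 0.1, 0.0],
    "due":     [0.7, 0.0, 0.2, 0.1],
    "meeting": [0.1, 0.9, 0.1, 0.0],
    "agenda":  [0.0, 0.8, 0.2, 0.1],
    "notes":   [0.1, 0.7, 0.0, 0.2],
}
DIM = 4


def mean_vector(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def doc_vector(text):
    """Average the vectors of the known words; unknown words are skipped."""
    vecs = [WORD_VECS[w] for w in text.lower().split() if w in WORD_VECS]
    return mean_vector(vecs) if vecs else [0.0] * DIM


def train_centroids(labeled_docs):
    """Nearest-centroid 'training': one mean document vector per class."""
    by_label = {}
    for text, label in labeled_docs:
        by_label.setdefault(label, []).append(doc_vector(text))
    return {label: mean_vector(vs) for label, vs in by_label.items()}


def predict(text, centroids):
    """Return the class whose centroid is closest in Euclidean distance."""
    v = doc_vector(text)
    return min(centroids, key=lambda label: math.dist(v, centroids[label]))


# Hypothetical labeled documents.
TRAIN = [
    ("invoice payment due", "billing"),
    ("payment due", "billing"),
    ("meeting agenda notes", "minutes"),
    ("agenda notes", "minutes"),
]
CENTROIDS = train_centroids(TRAIN)
```

Calling `predict("invoice due", CENTROIDS)` then picks whichever class centroid is nearest to the averaged vector. With real data you would likely swap the centroid step for logistic regression or similar; the averaging step stays the same.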

ideonexus 9y ago

For anyone looking for a simple JavaScript explorable explanation of this that you can quickly download and run in a browser, I just found the following GitHub project.

Demo:

http://turbomaze.github.io/word2vecjson/

Code:

https://github.com/turbomaze/word2vecjson

The code looks pretty straightforward so I look forward to exploring this playground of a new and fascinating concept.

ideonexus 9y ago

Update: I'm having a lot of fun exploring the relationships between "nerd," "geek," "dork," etc. in this demo. : )

j_s 9y ago

Recently HN featured a podcast interview with "one of the creators of Word2Vec and fastText", documenting some of his professional history.

https://news.ycombinator.com/item?id=13630678

vonnik 9y ago

Word vectors are great. We've also written about them at length.[0] But anyone interested in word vectors should also be looking at newer ways of applying neural nets to text. Specifically, convolutional nets with pooling over time are producing great results for clustering and classification.

[0] https://deeplearning4j.org//word2vec

mustafabisic1 9y ago

You nailed it with the "German + airlines" example. Up until that point it was tough to read for a newbie like me. Great blog post.
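For other newcomers, here is a minimal sketch of what an additive query like "German + airlines" does mechanically: add the two word vectors, then rank the remaining words by cosine similarity to the sum. The 3-dim vectors and the vocabulary below are invented so the arithmetic is easy to follow by hand; they are not taken from the post or from any trained model.

```python
import math

# Made-up 3-dim vectors; the axes loosely mean (German-ness, French-ness,
# airline-ness). Real word2vec dimensions are not interpretable like this.
VECS = {
    "german":    [1.0, 0.0, 0.1],
    "french":    [0.0, 1.0, 0.1],
    "airlines":  [0.1, 0.1, 1.0],
    "lufthansa": [0.9, 0.0, 0.9],
    "airfrance": [0.0, 0.9, 0.9],
    "berlin":    [0.9, 0.1, 0.0],
}


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest_to_sum(word_a, word_b, k=1):
    """Add two word vectors and return the k most similar other words."""
    target = [x + y for x, y in zip(VECS[word_a], VECS[word_b])]
    candidates = [w for w in VECS if w not in (word_a, word_b)]
    ranked = sorted(candidates, key=lambda w: cosine(target, VECS[w]), reverse=True)
    return ranked[:k]
```

With these toy numbers, `nearest_to_sum("german", "airlines")` ranks "lufthansa" first, because its vector points in nearly the same direction as the sum.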