Could anybody explain (or provide a pointer to an explanation of) the details of how the individual words are mapped to vectors? The source is available, but it's optimized to the point that the underlying hows are a bit opaque, and the underlying whys even more so.
You can think of this as a square matrix W. The size of the matrix is the size of the vocabulary. If we look at the 100k most frequent words in our corpus, W will be a 100k x 100k matrix.
The value of W(i,j) is the distance between words i and j, and row i of the matrix is then the vector representation of word i. Research around word vectors is largely about computing W(i,j) in a way that is both efficient and useful in natural language processing applications.
Word vectors are often used to compute similarity between words: since words are represented as vectors, we can compute the cosine similarity (the cosine of the angle between the two vectors) for a given pair of words to find out how similar they are.
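To make that concrete, here's a minimal sketch of cosine similarity over toy word vectors. The vectors and words are made up purely for illustration; real embeddings would come from a trained model:

```python
import math

def cosine_similarity(u, v):
    # cos(theta) = (u . v) / (|u| * |v|); ranges from -1 to 1,
    # where 1 means the vectors point in the same direction.
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Toy 3-dimensional "word vectors" (made up for illustration).
vectors = {
    "king":  [0.9, 0.8, 0.1],
    "queen": [0.8, 0.9, 0.2],
    "apple": [0.1, 0.2, 0.9],
}

# "king" should come out closer to "queen" than to "apple".
print(cosine_similarity(vectors["king"], vectors["queen"]))
print(cosine_similarity(vectors["king"], vectors["apple"]))
```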
TL;DR: The answer to your query is a person named Chaudhry Sitwell Borisovich who is definitely an entomologist-hymnist and probably is also a mineralogist-ornithologist.
A Google search suggests that he was born in 1961.
I ran a few queries using the code and its default dataset, trying to use neutral words for subtraction: "mosquito -small +mountaineer", "mosquito -big +mountaineer", "mosquito -loud +mountaineer", "mosquito -normal +mountaineer", "mosquito -usual +mountaineer", "mosquito -air +mountaineer", "mosquito -nothing +mountaineer".
You inadvertently stumbled onto the punchline of the joke: "You can't cross them, because a mountaineer is a scalar" ("scaler") - it works better when spoken.
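For anyone curious what those "a -b +c" queries are doing under the hood: a common approach (and roughly what the word2vec demo does) is plain vector arithmetic followed by a nearest-neighbor search by cosine similarity, excluding the query words themselves. A toy sketch with made-up 2-d vectors:

```python
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) *
                  math.sqrt(sum(b * b for b in v)))

def analogy(vectors, pos1, neg, pos2):
    # Compute pos1 - neg + pos2, then return the nearest vocabulary
    # word (excluding the three query words) by cosine similarity.
    target = [a - b + c for a, b, c in
              zip(vectors[pos1], vectors[neg], vectors[pos2])]
    candidates = (w for w in vectors if w not in {pos1, neg, pos2})
    return max(candidates, key=lambda w: cosine(vectors[w], target))

# Toy vectors (made up): dimension 0 ~ "royalty", dimension 1 ~ "gender".
vectors = {
    "king":  [0.9, 0.9],
    "queen": [0.9, 0.1],
    "man":   [0.1, 0.9],
    "woman": [0.1, 0.1],
}

print(analogy(vectors, "king", "man", "woman"))  # prints "queen"
```

With such a tiny vocabulary the search is trivial, but the same arithmetic-then-nearest-neighbor step is what turns "mosquito -small +mountaineer" into an actual word.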
Yeah, the papers linked in the references are probably a better place to start than the readme (though I'm not sure how closely this implementation follows that research, the papers are still a good read), especially [1].
I wrote a simple library[1] in Ruby for measuring the similarity between documents using word vectors. It has none of the cleverness of this one, but it's much simpler, if that helps.
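I can't speak for how that Ruby library works internally, but one common trick for document similarity with word vectors is to represent each document as the average of its word vectors and then compare the averages by cosine similarity. A sketch with made-up toy vectors:

```python
import math

def doc_vector(doc, vectors):
    # Represent a document as the average (centroid) of the vectors
    # of its known words; unknown words are simply skipped.
    words = [w for w in doc.lower().split() if w in vectors]
    dim = len(next(iter(vectors.values())))
    return [sum(vectors[w][i] for w in words) / len(words)
            for i in range(dim)]

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) *
                  math.sqrt(sum(b * b for b in v)))

# Toy vectors (made up for illustration).
vectors = {
    "cat":    [0.9, 0.1],
    "dog":    [0.8, 0.2],
    "stock":  [0.1, 0.9],
    "market": [0.2, 0.8],
}

a = doc_vector("cat dog", vectors)
b = doc_vector("dog cat", vectors)
c = doc_vector("stock market", vectors)
print(cosine(a, b))  # same word set, so similarity is 1.0
print(cosine(a, c))  # animal doc vs finance doc, noticeably lower
```

Averaging throws away word order, so it's crude, but it's cheap and works surprisingly well as a baseline.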