With that, I think he was trying to show how hard it is for a computer to work out what a webpage is about. Humans have it easy: we just read the title of the page or of the article.
"In fact, the least commonly occurring words on a page are frequently more interesting: words like myxomatosis or hermeneutics. To be more precise, what you really want to know is what uncommon words appear on this page more commonly than they do on other pages. The uncommon words are more likely to tell you what the page is about."
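The idea in that quote (words that show up on this page more often than they do on other pages) is close to what is usually called TF-IDF weighting. Here is a minimal sketch of it, not necessarily what the author had in mind. The function names `tokenize` and `distinctive_words` and the sample pages are made up for illustration, and the tokenizer is deliberately naive (lowercase ASCII letters only).

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and pull out runs of ASCII letters as words."""
    return re.findall(r"[a-z]+", text.lower())


def distinctive_words(page, other_pages, top_n=3):
    """Rank the words of `page` by a simple TF-IDF score.

    A word scores high when it is frequent on `page` but appears on few
    of the pages overall. This is an illustrative sketch, not a tuned
    implementation.
    """
    all_pages = [page] + list(other_pages)
    tokenized = [tokenize(p) for p in all_pages]

    # Document frequency: on how many pages each word appears at least once.
    doc_freq = Counter()
    for words in tokenized:
        doc_freq.update(set(words))

    page_words = tokenized[0]
    counts = Counter(page_words)
    n_docs = len(all_pages)

    scores = {}
    for word, count in counts.items():
        tf = count / len(page_words)                 # how common on this page
        idf = math.log(n_docs / doc_freq[word])      # how rare across pages
        scores[word] = tf * idf

    # Highest score first; break ties alphabetically so the order is stable.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [word for word, _ in ranked[:top_n]]


# Hypothetical sample data, just to exercise the function.
page = "myxomatosis is a disease of rabbits and myxomatosis spreads among rabbits"
others = [
    "the cat is a pet and the dog is a pet",
    "rabbits and hares are a common sight",
]
print(distinctive_words(page, others, top_n=1))
```

On this toy data "myxomatosis" comes out on top, while words like "a" and "and", which occur on every page, get a score of zero. That is the point of the quote: counting words on one page is not enough, you have to compare against other pages.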