As the article states in the introductory paragraph, this problem encompasses more than just counting strings. It also involves "some fundamental tasks of natural language processing (NLP): tokenization (dividing a text into words), stemming, and part-of-speech tagging for lemmatization", so a little more work is required here.