Preprocessing
- Stemming: Chop off suffixes such as plural “-s” and “-ing” with simple rules to get word stems (the result need not be a real word)
- Lemmatization: Map a word to its dictionary form (lemma), e.g. a verb to its infinitive or a plural noun to its singular
- Typo correction: Replace a word that is not in our dictionary with the dictionary word at the smallest edit distance
- Stop words: Decide how to handle words like “the”
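
The four steps above can be sketched in a few lines of plain Python. This is only a toy illustration, not a reference implementation: the vocabulary, lemma table, stop-word list, suffix rules, the `max_dist` threshold, and all function names are assumptions made up for the example. Real systems typically use an established stemmer, a proper lemmatizer, and a much larger dictionary. Dropping stop words is also just one way to "handle" them.

```python
# Toy data, all hypothetical and chosen only for this example.
VOCAB = {"the", "in", "dog", "dogs", "park", "ran", "run", "jump", "jumping"}
STOP_WORDS = {"the", "in"}
LEMMAS = {"ran": "run", "went": "go", "mice": "mouse"}


def stem(word):
    """Crude suffix stripping (not the Porter algorithm)."""
    for suffix in ("ing", "es", "s"):
        # Only strip if at least three characters remain.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def lemmatize(word):
    """Look up the dictionary form; fall back to the word itself."""
    return LEMMAS.get(word, word)


def edit_distance(a, b):
    """Levenshtein distance: insertions, deletions, substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                # delete ca
                            curr[j - 1] + 1,            # insert cb
                            prev[j - 1] + (ca != cb)))  # substitute
        prev = curr
    return prev[-1]


def correct(word, vocab, max_dist=1):
    """Replace an unknown word with the nearest vocabulary word,
    but only if it is within max_dist edits (an assumed threshold)."""
    if word in vocab:
        return word
    # sorted() makes tie-breaking deterministic.
    best = min(sorted(vocab), key=lambda v: edit_distance(word, v))
    return best if edit_distance(word, best) <= max_dist else word


def preprocess(text):
    """Lowercase, fix typos, drop stop words, then lemmatize or stem."""
    out = []
    for tok in text.lower().split():
        tok = correct(tok, VOCAB)
        if tok in STOP_WORDS:
            continue
        lemma = lemmatize(tok)
        out.append(lemma if lemma != tok else stem(tok))
    return out


print(preprocess("The dogs ran in the parc"))  # ['dog', 'run', 'park']
```

Note that the crude stemmer turns "running" into "runn", which shows why stems need not be real words and why a lemma lookup is tried first here.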