In this 1-hour project, you will learn how to clean and preprocess data for language classification. You will learn some of the theory behind Naive Bayes modeling and how class imbalance in the training data affects classification performance. You will also learn how to use subword units to further mitigate the negative effects of class imbalance and build an even better model.
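
To give a rough feel for the ideas named above, here is a minimal sketch of a multinomial Naive Bayes language classifier with add-one smoothing. It is an illustration under stated assumptions, not the project's actual code: the class `NaiveBayesLanguageClassifier`, the helper `char_ngrams`, and the four toy sentences are all invented for this example, and overlapping character trigrams are used as a simple stand-in for subword units (the project itself may use a different subword scheme). The toy training set is deliberately imbalanced, with three English sentences and one Spanish sentence.

```python
import math
from collections import Counter, defaultdict


def char_ngrams(text, n=3):
    """Split text into overlapping character n-grams (a simple stand-in for subword units)."""
    padded = f" {text.lower().strip()} "
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


class NaiveBayesLanguageClassifier:
    """Hypothetical multinomial Naive Bayes over character n-grams."""

    def __init__(self, n=3, alpha=1.0):
        self.n = n                      # n-gram length
        self.alpha = alpha              # additive (Laplace) smoothing constant
        self.feature_counts = defaultdict(Counter)  # label -> n-gram counts
        self.class_counts = Counter()   # label -> number of training sentences
        self.vocab = set()              # all n-grams seen during training

    def fit(self, sentences, labels):
        for sentence, label in zip(sentences, labels):
            grams = char_ngrams(sentence, self.n)
            self.feature_counts[label].update(grams)
            self.class_counts[label] += 1
            self.vocab.update(grams)
        return self

    def predict(self, sentence):
        total_sentences = sum(self.class_counts.values())
        vocab_size = len(self.vocab)
        grams = char_ngrams(sentence, self.n)
        best_label, best_score = None, -math.inf
        for label, sentence_count in self.class_counts.items():
            counts = self.feature_counts[label]
            total_grams = sum(counts.values())
            # Log prior: reflects how often each language appears in training.
            score = math.log(sentence_count / total_sentences)
            # Log likelihood: smoothed probability of each n-gram given the language.
            for gram in grams:
                score += math.log(
                    (counts[gram] + self.alpha)
                    / (total_grams + self.alpha * vocab_size)
                )
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy, invented data: three English sentences, one Spanish sentence.
train_sentences = [
    "the cat sat on the mat",
    "this is a short sentence",
    "where is the train station",
    "el gato está en la casa",
]
train_labels = ["en", "en", "en", "es"]

clf = NaiveBayesLanguageClassifier(n=3).fit(train_sentences, train_labels)
print(clf.predict("the station is here"))
print(clf.predict("la casa del gato"))
```

Because the features are short character sequences rather than whole words, a sentence made of words the model has never seen can still share many n-grams with the training text of its language, which is one reason subword features help when a language has few training examples.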