In this course, we study the basics of text mining.
We first cover the basic operations: converting unstructured text into vectors and reading different types of data from public archives.
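As a rough illustration of what turning unstructured text into vectors can look like, here is a minimal bag-of-words sketch in plain Python. The helper names (`tokenize`, `build_vocabulary`, `vectorize`) and the two sample sentences are assumptions made for this example, not course material, and the tokenizer is deliberately simplistic.

```python
from collections import Counter

PUNCTUATION = ".,!?;:\"'()"

def tokenize(text):
    # Hypothetical helper: split on whitespace, strip surrounding
    # punctuation, lowercase, and drop tokens that end up empty.
    tokens = (word.strip(PUNCTUATION).lower() for word in text.split())
    return [tok for tok in tokens if tok]

def build_vocabulary(docs):
    # Map every distinct term to a column index, in alphabetical order.
    terms = sorted({tok for doc in docs for tok in tokenize(doc)})
    return {term: index for index, term in enumerate(terms)}

def vectorize(doc, vocab):
    # Count how often each vocabulary term occurs in the document.
    counts = Counter(tokenize(doc))
    vector = [0] * len(vocab)
    for term, count in counts.items():
        if term in vocab:
            vector[vocab[term]] = count
    return vector

# Illustrative sample documents (assumed for this sketch).
docs = ["Text mining is fun.", "Mining text, mining data."]
vocab = build_vocabulary(docs)
vectors = [vectorize(doc, vocab) for doc in docs]
```

Each document becomes a list of term counts with one position per vocabulary term, which is the kind of fixed-length representation that classification and clustering methods can work with.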
Building on this, we use Natural Language Processing to pre-process our dataset.
Machine learning techniques are used for document classification and clustering, and for evaluating the resulting models.
The information extraction part is covered through topic modeling and through sentiment analysis, using both classifier-based and dictionary-based approaches.
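To give a feel for the dictionary-based approach to sentiment analysis, the following is a minimal sketch that counts positive and negative words. The tiny word lists and the function names `sentiment_score` and `sentiment_label` are assumptions chosen for illustration; a real lexicon would be far larger and would typically handle negation and word intensity.

```python
# Toy sentiment lexicon (assumed for this sketch, not a real resource).
POSITIVE = {"good", "great", "excellent", "love"}
NEGATIVE = {"bad", "poor", "terrible", "hate"}

def sentiment_score(text):
    # Score = number of positive words minus number of negative words.
    tokens = [word.strip(".,!?;:").lower() for word in text.split()]
    positives = sum(token in POSITIVE for token in tokens)
    negatives = sum(token in NEGATIVE for token in tokens)
    return positives - negatives

def sentiment_label(text):
    # Map the numeric score to a coarse label.
    score = sentiment_score(text)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"
```

A classifier-based approach would instead learn these associations from labeled examples rather than relying on a fixed word list.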
Almost all modules are supported by practice assignments.
Two projects are provided that combine most of the topics covered separately in these modules.
Finally, a list of project suggestions is given for students to choose from and build their own project.