Working with text data does not need to be difficult!
Follow along as we explain complex topics for a beginner audience. By the end of this course, you will be able to read in data from websites like Twitter and Wikipedia, clean it, and analyze it.
We keep it easy.
This course is designed for data analysts who are familiar with the R language but have no background in natural language processing, or even in statistics.
We break our course into three main sections: text mining, preparing and exploring text data, and analyzing text data.
Text Mining
As with every other form of analytics, no real work can begin until the data exists and is in a workable format.
What’s Covered: APIs, Twitter Data, Web Scraping, Wikipedia Data
Preparing and Exploring Text Data
Once the data has been properly gathered and mined, it needs to be put into a usable format. The following tutorials cover how to clean and explore text data.
What’s Covered: Regex, stringr package, tidytext package, tm package
Analyzing Text Data
After exploratory data analysis has been performed, we can dig deeper into the relationships and meaning in the text.
What’s Covered: TF-IDF, Sentiment Analysis, Topic Modeling, Part-of-Speech Tagging, Named Entity Recognition, Word Embeddings
So dive in and see what insights are hiding in your text data!