Traditional content analysis methods, such as manual coding, are slow and labour-intensive in the “big data” context. This course introduces an AI-aided content analysis methodology for analysing digital data. It is powered by natural language processing (NLP) and aims to help researchers conduct social science research faster and more efficiently.
The course consists of six modules:
(1) Introduction to content analysis and its operating procedures.
(2) Data management: how to upload research data to the platform, how to customize the database to suit the research purpose, and how to sample data.
(3) Data exploration: an initial step for quickly analysing the data and getting an overview of it. This covers how to present the overall trend of the data as a time series chart, the major themes of the content as a word cloud, and keyword statistics.
(4) Theme identification and codebook development: tips for deriving coding categories from a large text corpus in an efficient, fast, and comprehensible manner.
(5) Machine and manual coding: a data-driven “AI-assisted content analysis” approach that supports researchers in carrying out content analysis efficiently and effectively, with the aim of saving 80% of the time they spend on repetitive work such as coding.
(6) Statistical analysis and data visualization: presenting analysis results with a variety of visualization tools, such as word clouds, radar charts, scatter plots, heat maps, and Sankey diagrams.
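
To give a rough sense of what the data exploration step in module 3 involves, here is a minimal Python sketch that computes keyword statistics and the monthly document volume behind a time series chart. It is not the course platform's own tooling. The sample corpus, the stop-word list, and the function names (tokenize, keyword_counts, monthly_volume) are all hypothetical and chosen only for illustration.

```python
import re
from collections import Counter
from datetime import date

# Hypothetical sample corpus of (publication date, text) pairs.
# These records are invented for illustration and are not course data.
DOCS = [
    (date(2021, 1, 5), "Vaccine rollout begins in the city"),
    (date(2021, 1, 20), "City council debates vaccine funding"),
    (date(2021, 2, 3), "Funding approved for city clinics"),
]

# Tiny illustrative stop-word list; a real study would use a fuller one.
STOP_WORDS = {"the", "in", "for", "a", "of", "and"}


def tokenize(text):
    """Lowercase the text, keep alphabetic tokens, and drop stop words."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOP_WORDS]


def keyword_counts(docs):
    """Count how often each keyword occurs across all documents."""
    counter = Counter()
    for _, text in docs:
        counter.update(tokenize(text))
    return counter


def monthly_volume(docs):
    """Count documents per calendar month, ordered chronologically."""
    counter = Counter(d.strftime("%Y-%m") for d, _ in docs)
    return dict(sorted(counter.items()))


if __name__ == "__main__":
    print(keyword_counts(DOCS).most_common(3))
    print(monthly_volume(DOCS))
```

The keyword counts are the kind of input a word cloud is drawn from, and the monthly totals are the points a time series chart would plot. Both are plain counting steps, which is why they suit a quick first look at the data before any coding begins.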