
TIM-8535 V2

Univariate Analysis

Lesson 7 Resources

Datasets

Sample Sentiment Analysis (Emotions Sample)

This sample Jupyter Notebook demonstrates how to preprocess document text for Natural Language Processing (NLP) and how to build and train a Logistic Regression model to predict sentiment.
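As a rough illustration of the modeling step, the sketch below trains a binary logistic regression classifier on bag-of-words counts using plain Python. It is not the notebook's code: the toy documents, the labels (1 for positive, 0 for negative), and the helper names `featurize`, `train_logreg`, and `predict` are assumptions made for this example, and the notebook most likely relies on a library implementation instead.

```python
import math

def featurize(tokens, vocab):
    """Bag-of-words count vector over a fixed vocabulary."""
    return [tokens.count(word) for word in vocab]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logreg(X, y, lr=0.5, epochs=200):
    """Fit weights and a bias by batch gradient descent on the log loss."""
    w = [0.0] * len(X[0])
    b = 0.0
    n = len(X)
    for _ in range(epochs):
        grad_w = [0.0] * len(w)
        grad_b = 0.0
        for xi, yi in zip(X, y):
            # Prediction error for one example: p(y=1|x) minus the true label.
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def predict(tokens, vocab, w, b):
    """Return 1 (positive) if the predicted probability is at least 0.5, else 0."""
    x = featurize(tokens, vocab)
    return 1 if sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b) >= 0.5 else 0

# Hypothetical, already-tokenized toy data (not taken from the course dataset).
docs = [["so", "happy", "today"], ["love", "it"],
        ["so", "sad", "today"], ["hate", "it"]]
labels = [1, 1, 0, 0]

vocab = sorted({token for doc in docs for token in doc})
X = [featurize(doc, vocab) for doc in docs]
w, b = train_logreg(X, labels)

print(predict(["happy", "love"], vocab, w, b))
print(predict(["sad", "hate"], vocab, w, b))
```

The sketch handles only two classes. A dataset with many emotion labels would need a multiclass variant such as one-vs-rest or softmax regression.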

GoEmotions Dataset 

GoEmotions is a human-annotated dataset of Reddit comments labeled with 27 emotion categories. The version provided here has been modified for the assignment.

Optional Resources

Sample Scripts and Data Files

Sample NLP Preprocessing

This example script shows many common techniques for preprocessing text in preparation for Natural Language Processing (NLP).
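A minimal sketch of a few such techniques, using only the Python standard library, might look like the following. The particular steps (lowercasing, URL removal, punctuation stripping, whitespace tokenization, stop-word filtering), the tiny `STOP_WORDS` set, and the `preprocess` function name are assumptions chosen for illustration and may not match what the sample script does.

```python
import re
import string

# Tiny illustrative stop-word list (an assumption; real pipelines use a fuller list).
STOP_WORDS = {"a", "an", "the", "is", "are", "to", "of", "and", "in", "it", "this"}

def preprocess(text):
    """Lowercase, strip URLs and punctuation, tokenize, and drop stop words."""
    text = text.lower()
    text = re.sub(r"https?://\S+", " ", text)  # remove URLs
    text = text.translate(str.maketrans("", "", string.punctuation))  # remove punctuation
    tokens = text.split()  # simple whitespace tokenization
    return [token for token in tokens if token not in STOP_WORDS]

print(preprocess("This is the BEST movie, see https://example.com!"))
```

Stemming or lemmatization would typically follow these steps, but both require a third-party library and are left out here.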

Sample Sentiment Analysis (Twitter Sample)

This example script shows how to preprocess tweet text for Natural Language Processing (NLP) and how to build and train a Naïve Bayes model to predict sentiment.
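To make the Naïve Bayes step concrete, here is a small multinomial Naïve Bayes classifier with add-one (Laplace) smoothing written in plain Python. The toy tweets, the labels, and the function names `train_nb` and `predict_nb` are hypothetical, and the sample script probably uses a library implementation instead.

```python
import math
from collections import Counter, defaultdict

def train_nb(docs, labels):
    """Count documents per class and word occurrences per class."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    for tokens, label in zip(docs, labels):
        word_counts[label].update(tokens)
    vocab = {word for counts in word_counts.values() for word in counts}
    return class_counts, word_counts, vocab

def predict_nb(model, tokens):
    """Return the class with the highest log posterior for the given tokens."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in class_counts.items():
        score = math.log(n_docs / total_docs)  # log prior
        total_words = sum(word_counts[label].values())
        for token in tokens:
            if token in vocab:  # skip words never seen in training
                # Add-one smoothing avoids a zero probability for unseen pairs.
                score += math.log(
                    (word_counts[label][token] + 1) / (total_words + len(vocab))
                )
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical, already-tokenized toy tweets.
train_docs = [["love", "this", "phone"], ["great", "day"],
              ["hate", "this", "traffic"], ["awful", "service"]]
train_labels = ["positive", "positive", "negative", "negative"]

model = train_nb(train_docs, train_labels)
print(predict_nb(model, ["love", "great"]))
print(predict_nb(model, ["awful", "traffic"]))
```

Working in log space keeps the product of many small probabilities from underflowing on longer texts.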