Akeem Wells (ajw3rg@virginia.edu)
DS 5001
10 May 2021

Setup

Imports

Functions

Configs

Create Vector Space

We use scikit-learn's CountVectorizer to convert our F1 corpus of paragraphs into a document-term matrix of word counts, with one row per paragraph and one column per term.

Generate Model

We fit scikit-learn's LatentDirichletAllocation model to the word counts and extract the THETA (document-topic) and PHI (topic-term) tables.
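The following sketch shows one way the two tables can be obtained from a fitted model. The toy documents, the number of topics, and the variable names `theta` and `phi` are assumptions for illustration, not the notebook's actual configuration.

```python
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer

# Hypothetical toy corpus standing in for the real paragraphs.
docs = [
    "ship sea storm sail",
    "sea storm ship wave",
    "letter detective clue crime",
    "detective crime letter suspect",
]
counts = CountVectorizer().fit_transform(docs)

n_topics = 2  # assumed value; the real setting lives under Configs
lda = LatentDirichletAllocation(n_components=n_topics, random_state=0)

# THETA: documents x topics. fit_transform returns each document's
# topic distribution, so every row sums to 1.
theta = lda.fit_transform(counts)

# PHI: topics x terms. components_ holds unnormalized topic-term weights.
phi = lda.components_

print(theta.shape, phi.shape)
```

Dividing each row of `phi` by its row sum would turn the weights into per-topic term probabilities, if that form is preferred.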

THETA

PHI

Inspect Results

Get Top Terms per Topic
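As a sketch of what this step involves, the top terms for a topic can be read off by sorting that topic's row of PHI in descending order. The small `phi` matrix and term list below are made-up values for illustration, not model output.

```python
import numpy as np

# Hypothetical topic-term weights: 2 topics x 5 terms.
terms = np.array(["ship", "sea", "storm", "letter", "detective"])
phi = np.array([
    [5.0, 4.0, 3.0, 0.1, 0.2],
    [0.2, 0.1, 0.3, 6.0, 5.0],
])

n_top = 3  # assumed number of terms to keep per topic

# Sort each row ascending, reverse to descending, keep the first n_top columns.
top_idx = np.argsort(phi, axis=1)[:, ::-1][:, :n_top]

# Join the selected terms into one label string per topic.
top_terms = [" ".join(terms[row]) for row in top_idx]

for topic_id, label in enumerate(top_terms):
    print(topic_id, label)
```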

Sort Topics by Doc Weight
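One plausible reading of this step is to sum each topic's column of THETA over all documents and rank topics by that total. The sketch below assumes that interpretation and uses a made-up `theta` matrix.

```python
import numpy as np

# Hypothetical document-topic weights: 4 documents x 3 topics.
theta = np.array([
    [0.9, 0.1, 0.0],
    [0.2, 0.3, 0.5],
    [0.1, 0.2, 0.7],
    [0.1, 0.1, 0.8],
])

# Total weight each topic receives across all documents.
doc_weight = theta.sum(axis=0)

# Topic indices ordered from heaviest to lightest.
order = np.argsort(doc_weight)[::-1]

for topic_id in order:
    print(topic_id, round(float(doc_weight[topic_id]), 2))
```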

Explore Topics by Genre
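A simple way to compare topics across genres is to average the THETA rows within each genre. The sketch below assumes each document carries a genre label; the labels `genre_a` and `genre_b`, the topic names, and the weights are placeholders, not values from the corpus.

```python
import pandas as pd

# Hypothetical document-topic weights for four documents and two topics.
theta = pd.DataFrame(
    [[0.8, 0.2], [0.6, 0.4], [0.1, 0.9], [0.3, 0.7]],
    columns=["T0", "T1"],
)

# Placeholder genre label for each document, aligned by row index.
genre = pd.Series(["genre_a", "genre_a", "genre_b", "genre_b"], name="genre")

# Mean topic weight per genre: one row per genre, one column per topic.
genre_topics = theta.groupby(genre).mean()

# Dominant topic for each genre.
dominant = genre_topics.idxmax(axis=1)

print(genre_topics)
print(dominant)
```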