Topic modelers analyze a collection of documents and extract the distinct themes that run through it. Once trained, a topic modeler can determine what new documents are about without anyone needing to read them. In the example below, we trained our topic modeler on thousands of articles and then visualized the results.
This is for demonstration purposes only.
[Interactive visualization: an Intertopic Distance Map (via multidimensional scaling) with the Marginal Topic Distribution, next to a bar chart comparing Overall Term Frequency with Estimated Term Frequency within the Selected Topic. Selected topic: Movie 1; λ = 1.]
1. Saliency (term w) = frequency(w) * [sum_t p(t | w) * log(p(t | w)/p(t))] for topics t; see Chuang et al. (2012)
2. Relevance (term w | topic t) = λ * p(w | t) + (1 - λ) * p(w | t)/p(w); see Sievert & Shirley (2014)
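The saliency formula in the chart legend can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the code behind the demo: the function name `saliency`, the input layout (a dictionary mapping each term to its per-topic counts), and the toy numbers are all hypothetical. Here p(t | w) is taken as a term's share of its own count in each topic, and p(t) as each topic's share of all counted tokens.

```python
import math


def saliency(term_topic_counts):
    """Return {term: saliency} following the legend formula:
    frequency(w) * sum_t p(t | w) * log(p(t | w) / p(t)).

    term_topic_counts maps each term to a list of counts, one per topic
    (a hypothetical input layout chosen for this sketch).
    """
    n_topics = len(next(iter(term_topic_counts.values())))

    # p(t): share of all counted tokens that belongs to each topic.
    topic_totals = [
        sum(counts[t] for counts in term_topic_counts.values())
        for t in range(n_topics)
    ]
    grand_total = sum(topic_totals)
    p_topic = [total / grand_total for total in topic_totals]

    scores = {}
    for term, counts in term_topic_counts.items():
        frequency = sum(counts)
        distinctiveness = 0.0
        for t, count in enumerate(counts):
            if count == 0:
                continue  # a zero-probability term contributes nothing
            p_t_given_w = count / frequency
            distinctiveness += p_t_given_w * math.log(p_t_given_w / p_topic[t])
        scores[term] = frequency * distinctiveness
    return scores


# Made-up counts for three topics (say movies, sports, cars).
toy_counts = {
    "film": [90, 5, 5],
    "game": [10, 80, 10],
    "engine": [0, 10, 90],
    "new": [40, 40, 40],
}

for term, score in sorted(saliency(toy_counts).items(), key=lambda kv: -kv[1]):
    print(f"{term:8s} {score:8.2f}")
```

In this toy data, "new" is spread evenly across the topics, so it scores close to zero even though it is the most frequent term overall, while terms concentrated in a single topic score high.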
how it works
Clicking a topic in the diagram above shows the most frequent words in that topic, and hovering over a word shows how it is distributed across the topics. With a little exploration, you can see that the topics fall into three distinct categories: movies, sports, and cars.
Each of those categories contains several topics, so what makes the topics within a category different? Lowering λ in the relevance metric gives more weight to words that are distinctive to a topic rather than simply frequent in it, and the differences start to show. For example, "Sports 5" appears to be mostly a hockey topic, while "Sports 6" is closer to a baseball topic. Gaining insights into text has never been easier. Get in touch to find out how the Convergence topic modeler can benefit your business.
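To see why changing the relevance metric separates similar topics, here is a minimal sketch of ranking terms by relevance at two values of λ. It rests on assumptions: the function names, the vocabulary, and the probabilities are invented for illustration and are not taken from the demo. The legend abbreviates the formula; as published by Sievert & Shirley (2014), relevance takes the logarithm of both terms, and the sketch follows that version. At the extremes λ = 1 and λ = 0 the ranking is the same with or without the logarithms.

```python
import math


def relevance(p_w_given_t, p_w, lam):
    """Log-scale relevance of a term to a topic.

    lam = 1 ranks purely by p(w | t); lam = 0 ranks purely by the
    lift p(w | t) / p(w).
    """
    return lam * math.log(p_w_given_t) + (1 - lam) * math.log(p_w_given_t / p_w)


def top_terms(topic_term_probs, term_probs, lam, n=3):
    """Return the n terms with the highest relevance for one topic."""
    scored = {
        term: relevance(p, term_probs[term], lam)
        for term, p in topic_term_probs.items()
        if p > 0  # log is undefined for zero-probability terms
    }
    return sorted(scored, key=scored.get, reverse=True)[:n]


# Hypothetical p(w | t) for a single sports topic.
sports_topic = {"game": 0.30, "team": 0.25, "season": 0.20, "puck": 0.05, "goalie": 0.04}
# Hypothetical overall p(w) across the whole collection.
overall = {"game": 0.20, "team": 0.15, "season": 0.13, "puck": 0.006, "goalie": 0.005}

print("lambda = 1:", top_terms(sports_topic, overall, lam=1.0))
print("lambda = 0:", top_terms(sports_topic, overall, lam=0.0))
```

With λ = 1 the generic sports words that every sports topic shares rise to the top. With λ = 0 the rarer words that occur almost only in this topic take over, which is the kind of shift that reveals what a particular topic is about.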
777 Hornby Street, Suite 1500, Vancouver, BC, Canada, V6Z 1S4
© 2022 Convergence Concepts Inc.