Fast Visibility of Research Literature using Thematic Models and Machine Learning

Authors

  • Dr. M. Thamarai

Keywords:

Data science, Latent Dirichlet allocation (LDA), Machine learning, Sentiment analysis, Topic modeling

Abstract

A vast number of scientific papers and articles are published each year, driven by advances in computer technology and the progress of education and science. These articles contain essential knowledge about emerging technologies. A text analytics procedure is therefore needed that gives a fast understanding of this large body of unstructured research text. Here we propose using topic modeling to extract information from the literature and give a quick view of its findings. We also categorize the data into a set of topics, so that each topic can be treated as a cluster and examined in more depth. Latent Dirichlet allocation (LDA) is a generative probabilistic model that uses the probability distribution of terms in a text to extract its themes. It helps explain patterns in a set of observations by relating them to other, similar data in the same dataset. The generated topics reveal many hidden concepts and link terminology to the key theme of each paper, which provides a summary of the papers. In this research, topic modeling is applied to journals on emerging applications.
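
To make the idea of extracting themes from term distributions concrete, the following is a minimal sketch of LDA fitted by collapsed Gibbs sampling, written in plain Python. It is an illustration only and not the implementation used in this paper: the function name fit_lda, the hyperparameter values (alpha, beta, iterations), and the four toy "abstracts" are all assumptions made up for the example. Real studies would more likely rely on an established library.

```python
import random


def fit_lda(docs, num_topics=2, alpha=0.1, beta=0.01, iterations=200, seed=0):
    """Illustrative collapsed Gibbs sampler for LDA on tokenized documents.

    All hyperparameter defaults are assumptions chosen for this toy example.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    V = len(vocab)

    # Count tables: topics per document, words per topic, tokens per topic.
    doc_topic = [[0] * num_topics for _ in docs]
    topic_word = [[0] * V for _ in range(num_topics)]
    topic_total = [0] * num_topics

    # Start by assigning a random topic to every token.
    assignments = []
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            z = rng.randrange(num_topics)
            z_doc.append(z)
            doc_topic[d][z] += 1
            topic_word[z][word_id[w]] += 1
            topic_total[z] += 1
        assignments.append(z_doc)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v = word_id[w]
                z = assignments[d][i]
                # Remove this token's current assignment from the counts.
                doc_topic[d][z] -= 1
                topic_word[z][v] -= 1
                topic_total[z] -= 1
                # Unnormalized conditional probability of each topic.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][v] + beta) / (topic_total[k] + V * beta)
                    for k in range(num_topics)
                ]
                # Resample the topic and restore the counts.
                z = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][v] += 1
                topic_total[z] += 1

    # Smoothed per-document topic proportions.
    theta = [
        [(c + alpha) / (len(doc) + num_topics * alpha) for c in counts]
        for doc, counts in zip(docs, doc_topic)
    ]
    # Three most frequent words assigned to each topic.
    top_words = [
        [vocab[j] for j in sorted(range(V), key=lambda j: -topic_word[k][j])[:3]]
        for k in range(num_topics)
    ]
    return theta, top_words


# Hypothetical toy abstracts, invented for illustration only.
abstracts = [
    "neural network training deep learning network",
    "deep learning neural model training",
    "protein gene sequence genome protein",
    "gene genome sequence expression protein",
]
docs = [a.split() for a in abstracts]
theta, top_words = fit_lda(docs, num_topics=2)

for k, words in enumerate(top_words):
    print(f"topic {k}: {words}")
for d, mix in enumerate(theta):
    print(f"doc {d}: {[round(p, 2) for p in mix]}")
```

Each row of theta is the estimated topic mixture of one document, and top_words lists the terms most often assigned to each topic; together they give the kind of quick thematic summary the abstract describes. On a corpus this small the result depends on the random seed, so the sketch shows the mechanics rather than a meaningful finding.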

Published

2022-12-29

Section

Articles