The number of documents in digital repositories runs into the thousands and is constantly increasing. In such a setting, it becomes important to organize and retrieve these documents in a way that matches how people think about them. In this paper, we present a novel approach that classifies the documents in a digital repository and finds the semantically significant keywords related to them, making organization and retrieval faster. We approach this problem using a probabilistic model trained on incomplete training data, which both organizes the documents and marks their relevant keywords. This approach speeds up classification and, instead of unlabeled clusters, yields a classification with well-defined topics.
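
The abstract does not specify the probabilistic model, so the following is only a minimal sketch of the general idea, not the method of this paper. It assumes that "incomplete training data" means only some documents carry a topic label, and it swaps in a simple multinomial mixture fitted with EM: labeled documents stay fixed to their topic, unlabeled ones receive soft topic assignments, and the most probable words per topic serve as keywords. The names `fit_topics` and `top_keywords`, the smoothing constant `alpha`, and the toy documents are all hypothetical.

```python
import math

def fit_topics(docs, labels, n_topics, n_iter=20, alpha=0.1):
    """docs: list of token lists; labels: a topic index, or None if unlabeled.

    Assumes every topic has at least one labeled document; otherwise the
    uniform start below cannot tell the unlabeled topics apart.
    """
    vocab = sorted({w for d in docs for w in d})
    # Labeled documents are clamped to their topic; unlabeled start uniform.
    resp = [
        [1.0 if k == y else 0.0 for k in range(n_topics)]
        if y is not None else [1.0 / n_topics] * n_topics
        for y in labels
    ]
    word_prob = []
    for _ in range(n_iter):
        # M-step: re-estimate topic priors and smoothed word distributions.
        prior = [
            (sum(r[k] for r in resp) + alpha) / (len(docs) + alpha * n_topics)
            for k in range(n_topics)
        ]
        word_prob = []
        for k in range(n_topics):
            counts = {w: alpha for w in vocab}
            for d, r in zip(docs, resp):
                for w in d:
                    counts[w] += r[k]
            total = sum(counts.values())
            word_prob.append({w: c / total for w, c in counts.items()})
        # E-step: update soft assignments of the unlabeled documents only.
        for i, (d, y) in enumerate(zip(docs, labels)):
            if y is not None:
                continue
            logp = [
                math.log(prior[k]) + sum(math.log(word_prob[k][w]) for w in d)
                for k in range(n_topics)
            ]
            m = max(logp)  # subtract the max for numerical stability
            unnorm = [math.exp(lp - m) for lp in logp]
            z = sum(unnorm)
            resp[i] = [u / z for u in unnorm]
    return resp, word_prob

def top_keywords(word_prob, n=3):
    """Return the n most probable words for each topic."""
    return [sorted(wp, key=wp.get, reverse=True)[:n] for wp in word_prob]

# Toy usage: two labeled documents (topics 0 and 1) and two unlabeled ones.
docs = [
    ["neural", "network", "training"],
    ["court", "law", "judge"],
    ["network", "training", "gradient"],
    ["judge", "law", "appeal"],
]
labels = [0, 1, None, None]
resp, word_prob = fit_topics(docs, labels, n_topics=2)
keywords = top_keywords(word_prob, n=2)
```

Because the labeled documents anchor each mixture component, the resulting groups correspond to named topics rather than anonymous clusters, which is the contrast the abstract draws with unlabeled clustering.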