Construction of Lexicons to Perk Up Re-Clustering

Authors

  • A. George Louis Raja, Research Scholar, Department of Computer Science and Applications, SCSVMV University, Kanchipuram, Tamil Nadu, India
  • F. Sagayaraj Francis, Professor, Department of Computer Science and Engineering, Pondicherry Engineering College, Puducherry, India
  • P. Sugumar, Assistant Professor, Department of Computer Applications, Sacred Heart College (Autonomous), Tirupattur, Tamil Nadu, India

DOI:

https://doi.org/10.51983/ajcst-2018.7.3.1891

Keywords:

Lexicon, Clustering, ATSCA, Keygraph, KBLCA

Abstract

Existing semantic methods cluster documents by comparing their terms, either unabridged or abridged. These terms are not preserved after clustering, so the entire cluster operation must be repeated whenever new documents arrive. Semantic clustering methods can therefore be regarded as “on the go” methods, and re-clustering becomes unavoidable in both iterative and incremental clustering methods. It would be more appropriate to build and evolve a lexicon from the derived keywords of the documents and to refer to it in subsequent cluster operations. The rationale is to avoid re-clustering when new documents arrive and instead to refer to the lexicon to formulate clusters for as long as cluster quality remains intact; once quality breaches a threshold, the cluster operation is repeated. Because re-clustering is deferred until this breakeven point, the re-clustering process becomes faster. This approach may incur additional runtime complexity, but it would greatly simplify and speed up re-clustering. This paper discusses the construction of lexicons and their application in clustering. The Keyword Based Lexicon Construction Algorithm (KBLCA) is demonstrated for building lexicons, and a breakeven point for re-clustering is proposed and described. The theory of avoiding re-clustering is outlined, along with experimental results.
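
The lexicon-and-breakeven idea described in the abstract can be illustrated with a minimal sketch. This is not KBLCA, which the abstract does not specify. The class name `Lexicon`, the Jaccard overlap score, and the parameters `min_overlap` and `max_poor_ratio` are all assumptions introduced here for illustration, and the lexicon is assumed to have been seeded by an earlier clustering run.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class Lexicon:
    """Keyword sets per cluster, preserved after clustering (illustrative only)."""
    clusters: Dict[str, Set[str]] = field(default_factory=dict)
    min_overlap: float = 0.2      # assumed: below this score a document fits poorly
    max_poor_ratio: float = 0.5   # assumed breakeven threshold for re-clustering
    seen: int = 0                 # documents placed since the last full clustering
    poor: int = 0                 # of those, how many fitted poorly

    def assign(self, keywords: Set[str]) -> str:
        """Place a new document by consulting the lexicon, without re-clustering."""
        def overlap(cluster_id: str) -> float:
            # Jaccard similarity between document keywords and cluster keywords.
            terms = self.clusters[cluster_id]
            union = keywords | terms
            return len(keywords & terms) / len(union) if union else 0.0

        best = max(self.clusters, key=overlap)
        self.seen += 1
        if overlap(best) < self.min_overlap:
            self.poor += 1                    # quality is degrading
        else:
            self.clusters[best] |= keywords   # evolve the lexicon with new keywords
        return best

    def needs_reclustering(self) -> bool:
        """True once the share of poorly fitting documents passes the threshold."""
        return self.seen > 0 and self.poor / self.seen > self.max_poor_ratio


lexicon = Lexicon(clusters={
    "sports": {"match", "team", "score"},
    "finance": {"stock", "market", "price"},
})
print(lexicon.assign({"team", "score", "coach"}))  # sports
lexicon.assign({"galaxy", "telescope"})            # matches no cluster keywords
lexicon.assign({"quantum", "laser"})               # matches no cluster keywords
print(lexicon.needs_reclustering())                # True
```

In this sketch a full re-clustering is signalled only when most recent documents no longer match the stored keywords; until then, each new document costs one pass over the lexicon rather than a repeat of the whole cluster operation.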

Published

31-10-2018

How to Cite

Louis Raja, A. G., Francis, F. S., & Sugumar, P. (2018). Construction of Lexicons to Perk Up Re-Clustering. Asian Journal of Computer Science and Technology, 7(3), 82–85. https://doi.org/10.51983/ajcst-2018.7.3.1891