An Efficient Closed Maximal Pattern Sequences Mining on High Dimensional Datasets

Authors

  • J. Krishna, Assistant Professor, Department of Computer Science and Engineering, Annamacharya Institute of Technology and Sciences, Andhra Pradesh, India
  • M. Haritha, UG Student, Department of Computer Science and Engineering, Annamacharya Institute of Technology and Sciences, Andhra Pradesh, India

DOI:

https://doi.org/10.51983/ajcst-2019.8.S3.2088

Keywords:

Multi Support, Sequential Pattern Mining, Maximal Pattern, High Dimensional Sequence

Abstract

Previous work has argued convincingly that the complete set of sequential patterns is too large to use effectively, so a compact but high-quality set of patterns, such as closed patterns and maximal patterns, is needed. Most existing algorithms for mining maximal pattern sequences on high-dimensional sequences, such as biological data sets, work under a single support threshold. In this paper, an efficient algorithm, Closed Maximal Pattern Sequences (CMPS-Mine), is proposed for mining closed maximal patterns based on multi-support. Experiments on Beta-globin gene sequences show that CMPS-Mine uses less memory and run time than PrefixSpan. It generates compact results and two kinds of interesting patterns.
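
To illustrate the difference between the closed and maximal patterns mentioned in the abstract, the following is a minimal brute-force sketch. It is not CMPS-Mine: it assumes a single minimum support threshold rather than multi-support, and the toy sequences and all function names are hypothetical, chosen only for illustration. A frequent pattern is treated as closed if no longer frequent pattern containing it has the same support, and as maximal if no longer frequent pattern contains it at all.

```python
from itertools import combinations


def is_subsequence(pattern, sequence):
    """Return True if pattern occurs in sequence in order (gaps allowed)."""
    it = iter(sequence)
    return all(symbol in it for symbol in pattern)


def support(pattern, database):
    """Count the sequences in the database that contain the pattern."""
    return sum(is_subsequence(pattern, seq) for seq in database)


def frequent_patterns(database, min_support):
    """Brute-force enumeration; only feasible for tiny toy inputs."""
    candidates = set()
    for seq in database:
        for length in range(1, len(seq) + 1):
            for idx in combinations(range(len(seq)), length):
                candidates.add("".join(seq[i] for i in idx))
    result = {}
    for pattern in candidates:
        count = support(pattern, database)
        if count >= min_support:
            result[pattern] = count
    return result


def closed_and_maximal(frequent):
    """Split frequent patterns into closed and maximal subsets."""
    closed, maximal = set(), set()
    for pattern, count in frequent.items():
        supers = [q for q in frequent
                  if len(q) > len(pattern) and is_subsequence(pattern, q)]
        # Closed: no frequent super-pattern has the same support.
        if not any(frequent[q] == count for q in supers):
            closed.add(pattern)
        # Maximal: no frequent super-pattern exists at all.
        if not supers:
            maximal.add(pattern)
    return closed, maximal


# Hypothetical toy DNA-like sequences (not the Beta-globin data).
database = ["ACGT", "ACT", "AGT"]
frequent = frequent_patterns(database, min_support=2)
closed, maximal = closed_and_maximal(frequent)
print(sorted(closed))   # ['ACT', 'AGT', 'AT']
print(sorted(maximal))  # ['ACT', 'AGT']
```

In this toy example, 11 patterns are frequent, 3 are closed, and 2 are maximal. Every maximal pattern is also closed, which is why these compact sets are smaller than the complete set of frequent patterns.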

Published

29-04-2019

How to Cite

Krishna, J., & Haritha, M. (2019). An Efficient Closed Maximal Pattern Sequences Mining on High Dimensional Datasets. Asian Journal of Computer Science and Technology, 8(S3), 50–53. https://doi.org/10.51983/ajcst-2019.8.S3.2088