An Efficient Closed Maximal Pattern Sequences Mining on High Dimensional Datasets
DOI:
https://doi.org/10.51983/ajcst-2019.8.S3.2088
Keywords:
Multi Support, Sequential Pattern Mining, Maximal Pattern, High Dimensional Sequence
Abstract
Previous work has argued convincingly that the complete set of sequential patterns is too large to be used effectively, so a compact but high-quality set of patterns, such as closed patterns or maximal patterns, is needed. Most earlier algorithms for mining maximal pattern sequences from high-dimensional sequences, such as biological data sets, apply a single support threshold. This paper proposes an efficient algorithm, Closed Maximal Pattern Sequences (CMPS-Mine), for mining closed maximal patterns based on multiple supports. Careful experiments on Beta-globin gene sequences show that CMPS-Mine uses less memory and run time than PrefixSpan. It produces compact results and two kinds of interesting patterns.
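To make the distinction between closed and maximal patterns concrete, the following Python sketch enumerates frequent subsequences of a toy sequence database by brute force and then filters them. It is not CMPS-Mine and not PrefixSpan: it assumes a single minimum support (not the multi-support setting of the paper), treats a pattern as a subsequence with gaps allowed, and uses hypothetical function names and a made-up three-sequence database purely for illustration.

```python
def is_subsequence(pattern, sequence):
    """Return True if pattern occurs in sequence in order (gaps allowed)."""
    it = iter(sequence)
    # "symbol in it" advances the iterator, so matches must appear in order.
    return all(symbol in it for symbol in pattern)


def support(pattern, database):
    """Number of sequences in the database that contain the pattern."""
    return sum(is_subsequence(pattern, s) for s in database)


def frequent_patterns(database, min_support):
    """Brute-force pattern growth: extend frequent prefixes one symbol at a time.

    Every prefix of a frequent pattern is itself frequent, so growing only
    frequent prefixes still reaches every frequent pattern.
    """
    alphabet = sorted(set("".join(database)))
    frequent = {}
    frontier = [""]
    while frontier:
        next_frontier = []
        for prefix in frontier:
            for symbol in alphabet:
                candidate = prefix + symbol
                count = support(candidate, database)
                if count >= min_support:
                    frequent[candidate] = count
                    next_frontier.append(candidate)
        frontier = next_frontier
    return frequent


def closed_and_maximal(frequent):
    """Split frequent patterns into closed and maximal ones.

    closed:  no proper super-sequence has the same support
    maximal: no proper super-sequence is frequent at all
    """
    closed, maximal = [], []
    for pattern, count in frequent.items():
        supers = [q for q in frequent
                  if len(q) > len(pattern) and is_subsequence(pattern, q)]
        if not supers:
            maximal.append(pattern)
        if all(frequent[q] < count for q in supers):
            closed.append(pattern)
    return sorted(closed), sorted(maximal)


if __name__ == "__main__":
    toy_db = ["ACGT", "ACT", "AGT"]  # hypothetical toy sequences
    freq = frequent_patterns(toy_db, min_support=2)
    closed, maximal = closed_and_maximal(freq)
    print(len(freq), closed, maximal)
```

On this toy input there are 11 frequent patterns but only three closed ones (ACT, AGT, AT) and two maximal ones (ACT, AGT). Every maximal pattern is closed, while AT is closed without being maximal because its support (3) is higher than that of any of its frequent super-sequences. This is the kind of compaction the abstract refers to, although the paper's algorithm presumably avoids the exhaustive enumeration used here.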
License
Copyright (c) 2019 The Research Publication
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.