Maximum Relevance Maximum Anti-Redundancy (mRmA) Feature Selection

Abdul Mannan, Kashif Javed, Serosh Karim Noon


Filters are a class of feature selection methods that select a subset of useful features from high-dimensional data on the basis of relevance and redundancy analysis. Maximum relevance minimum redundancy (mRMR) is a well-known feature selection algorithm for microarray data [1]. The quotient-based version of the maximum relevance minimum redundancy filter (Q-mRMR) [1],[2] selects, at each iteration, the feature with the maximum ratio between its class relevance and its average redundancy with the already selected subset. This ratio can become extremely large when the denominator, i.e. the redundancy term, is very small; the relevance term is then effectively suppressed, which can lead to the selection of features that are very weak representatives of the class. This paper addresses the issue by presenting a maximum relevance maximum anti-redundancy (mRmA) filter method. For mRmA, the value of the objective function remains within reasonable limits for all values of relevance and redundancy, making the selection of appropriate features more probable. Our 10-fold cross-validation accuracy results with naive Bayes and support vector machine (SVM) classifiers confirm that the proposed method outperforms both the Q-mRMR and Fast Correlation-Based Filter (FCBF) methods on six datasets from applications in the microarray, image, and physical domains.
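The failure mode of the quotient criterion described in the abstract can be sketched in a few lines. This is a hypothetical illustration, not the paper's implementation: the relevance and redundancy values below are made-up numbers standing in for mutual-information estimates, and only the Q-mRMR quotient score from [1],[2] is shown.

```python
def q_mrmr_score(relevance, redundancies):
    """Q-mRMR quotient criterion: a candidate feature's class relevance
    divided by its average redundancy with the already selected features."""
    avg_redundancy = sum(redundancies) / len(redundancies)
    return relevance / avg_redundancy

# A weakly relevant feature whose redundancy with the selected subset is
# near zero gets a hugely inflated score...
weak = q_mrmr_score(relevance=0.01, redundancies=[1e-6, 2e-6])

# ...outranking a strongly relevant feature with modest redundancy.
strong = q_mrmr_score(relevance=0.90, redundancies=[0.30, 0.40])

print(weak > strong)  # prints True: the weak feature wins the quotient
```

A tiny denominator dominates the ratio regardless of relevance, which is the pathology mRmA is designed to avoid by keeping the objective bounded.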

References



Ding, C., & Peng, H. (2005). Minimum redundancy feature selection from microarray gene expression data. Journal of bioinformatics and computational biology, 3(02), 185-205.

Guyon, I., & Elisseeff, A. (2003). An introduction to variable and feature selection. Journal of machine learning research, 3(Mar), 1157-1182.

Javed, K., Babri, H. A., & Saeed, M. (2012). Feature selection based on class-dependent densities for high-dimensional binary data. IEEE Transactions on Knowledge and Data Engineering, 24(3), 465-477.

Guyon, I., Weston, J., Barnhill, S., & Vapnik, V. (2002). Gene selection for cancer classification using support vector machines. Machine learning, 46(1), 389-422.

Kohavi, R., & John, G. H. (1997). Wrappers for feature subset selection. Artificial intelligence, 97(1-2), 273-324.

Koller, D., & Sahami, M. (1996). Toward optimal feature selection. Stanford InfoLab.

Yu, L., & Liu, H. (2004). Efficient feature selection via analysis of relevance and redundancy. Journal of machine learning research, 5(Oct), 1205-1224.

Yu, L., & Liu, H. (2004, August). Redundancy based feature selection for microarray data. In Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining (pp. 737-742). ACM.

Saeys, Y., Inza, I., & Larrañaga, P. (2007). A review of feature selection techniques in bioinformatics. Bioinformatics, 23(19), 2507-2517.

Javed, K., Babri, H. A., & Saeed, M. (2014). Impact of a metric of association between two variables on performance of filters for binary data. Neurocomputing, 143, 248-260.

Blum, A. L., & Langley, P. (1997). Selection of relevant features and examples in machine learning. Artificial intelligence, 97(1), 245-271.

Peng, H., Long, F., & Ding, C. (2005). Feature selection based on mutual information criteria of max-dependency, max-relevance, and min-redundancy. IEEE Transactions on pattern analysis and machine intelligence, 27(8), 1226-1238.

Hall, M. A., & Smith, L. A. (1999, May). Feature Selection for Machine Learning: Comparing a Correlation-Based Filter Approach to the Wrapper. In FLAIRS conference (Vol. 1999, pp. 235-239).

Doquire, G., & Verleysen, M. (2013). Mutual information-based feature selection for multilabel classification. Neurocomputing, 122, 148-155.

Yu, L., & Liu, H. (2003). Feature selection for high-dimensional data: A fast correlation-based filter solution. In Proceedings of the 20th international conference on machine learning (ICML-03) (pp. 856-863).

Press, W. H., Teukolsky, S. A., Vetterling, W. T., & Flannery, B. P. (1996). Numerical recipes in C (Vol. 2). Cambridge: Cambridge university press.

Mahmoud, O., Harrison, A., Perperoglou, A., Gul, A., Khan, Z., Metodiev, M. V., & Lausen, B. (2014). A feature selection method for classification within functional genomics experiments based on the proportional overlapping score. BMC bioinformatics, 15(1), 274.

Brown, G., Pocock, A., Zhao, M. J., & Luján, M. (2012). Conditional likelihood maximisation: a unifying framework for information theoretic feature selection. Journal of machine learning research, 13(Jan), 27-66.

Yang, H., & Moody, J. (1999, June). Feature selection based on joint mutual information. In Proceedings of international ICSC symposium on advances in intelligent data analysis (pp. 22-25).

Apiletti, D., Baralis, E., Bruno, G., & Fiori, A. (2007, August). The painter's feature selection for gene expression data. In Engineering in Medicine and Biology Society, 2007. EMBS 2007. 29th Annual International Conference of the IEEE (pp. 4227-4230). IEEE.

Ooi, C. H., Chetty, M., & Teng, S. W. (2006). Differential prioritization between relevance and redundancy in correlation-based feature selection techniques for multiclass gene expression data. BMC bioinformatics, 7(1), 320.

Gonzalez, R. C. (2009). Digital image processing. Pearson Education India.

Rish, I. (2001, August). An empirical study of the naive Bayes classifier. In IJCAI 2001 workshop on empirical methods in artificial intelligence (Vol. 3, No. 22, pp. 41-46). IBM.

Bolón-Canedo, V., Sánchez-Maroño, N., & Alonso-Betanzos, A. (2013). A review of feature selection methods on synthetic data. Knowledge and information systems, 34(3), 483-519.

Kira, K., & Rendell, L. A. (1992, July). A practical approach to feature selection. In Proceedings of the ninth international workshop on Machine learning (pp. 249-256).

Alpaydin, E. (2014). Introduction to machine learning. MIT press.

Duin, R. P. W., Juszczak, P., Paclik, P., Pekalska, E., De Ridder, D., Tax, D. M. J., & Verzakov, S. (2000). A MATLAB toolbox for pattern recognition. PRTools version, 3, 109-111.

Jelinek, H. F., Depardieu, C., Lucas, C., Cornforth, D. J., Huang, W., & Cree, M. J. (2005, November). Towards vessel characterization in the vicinity of the optic disc in digital retinal images. In Image Vis Comput Conf (pp. 2-7).

Haslinger, C., Schweifer, N., Stilgenbauer, S., Döhner, H., Lichter, P., Kraut, N., & Abseher, R. (2004). Microarray gene expression profiling of B-cell chronic lymphocytic leukemia subgroups defined by genomic aberrations and VH mutation status. Journal of Clinical Oncology, 22(19), 3937-3949.

Cho, S. B., & Won, H. H. (2003, January). Machine learning in DNA microarray analysis for cancer classification. In Proceedings of the First Asia-Pacific bioinformatics conference on Bioinformatics 2003-Volume 19 (pp. 189-198). Australian Computer Society, Inc.

Ben-Dor, A., Bruhn, L., Friedman, N., Nachman, I., Schummer, M., & Yakhini, Z. (2000). Tissue classification with gene expression profiles. Journal of computational biology, 7(3-4), 559-583.

James, A. P., & Dimitrijev, S. (2012). Feature selection using nearest attributes. arXiv preprint arXiv:1201.5946.

Bailey, J., Manoukian, T., & Ramamohanarao, K. (2003, November). A fast algorithm for computing hypergraph transversals and its application in mining emerging patterns. In Data Mining, 2003. ICDM 2003. Third IEEE International Conference on (pp. 485-488). IEEE.

Copyright (c) 2017 Pakistan Journal of Engineering and Applied Sciences
