An Effective Feature Optimization Model for Android Malware Detection

Hussein K. Almulla
Hussam J. Mohammed
Nathan Clarke
Ahmed Adnan Hadi
Mazin Abed Mohammed

Abstract

The rapid growth of Android-based smartphones has exposed these devices to a wide range of malware threats. Over time, Android malware has become increasingly complex and challenging to mitigate. Detection relies on identifying specific sets of features that characterize malicious behavior, and these features have grown more complex and diverse as malware has evolved. Traditional approaches often suffer from high-dimensional feature spaces, which increase computational complexity and reduce detection accuracy. Therefore, this paper proposes a feature optimization approach that selects the most informative malware features and discards redundant and noisy ones to ensure computational efficiency. An ensemble model based on voting combines three base classifiers (LMT, KStar, and Decision Table) that take as input the features selected by the Relief algorithm. The proposed models were evaluated in several experiments on three datasets (Drebin, Malgenome, and Prerna) comprising 35,135 samples (10,820 malware samples and 24,315 benign samples) across feature settings of 50, 100, 150, and all features. The experimental results show that detection and classification accuracy can be improved by using an optimized feature vector. Overall, the model using 150 features achieved the highest performance, 99.61%.
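
The pipeline described in the abstract (Relief-based feature ranking, followed by a voting ensemble) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes binary (0/1) features, a two-class problem, a deterministic Relief variant that visits every sample once instead of sampling at random, and hard majority voting. The function names `relief_scores`, `top_k_features`, and `majority_vote` are hypothetical, and the three base classifiers are stood in for by hard-coded prediction lists rather than trained LMT, KStar, and Decision Table models.

```python
from collections import Counter


def relief_scores(X, y):
    """Basic two-class Relief on features scaled to [0, 1].

    A feature gains weight when it differs on the nearest sample of the
    other class (nearest miss) and loses weight when it differs on the
    nearest sample of the same class (nearest hit).
    Assumption: every sample is visited once (no random sampling).
    """
    n, d = len(X), len(X[0])
    totals = [0.0] * d

    def dist(a, b):
        # Manhattan distance between two feature vectors
        return sum(abs(p - q) for p, q in zip(a, b))

    for i in range(n):
        hits = [j for j in range(n) if j != i and y[j] == y[i]]
        misses = [j for j in range(n) if y[j] != y[i]]
        if not hits or not misses:
            continue  # cannot score a sample without both neighbours
        hit = min(hits, key=lambda j: dist(X[i], X[j]))
        miss = min(misses, key=lambda j: dist(X[i], X[j]))
        for f in range(d):
            totals[f] += abs(X[i][f] - X[miss][f]) - abs(X[i][f] - X[hit][f])
    return [t / n for t in totals]


def top_k_features(weights, k):
    """Indices of the k highest-weighted features, best first."""
    order = sorted(range(len(weights)), key=lambda f: weights[f], reverse=True)
    return order[:k]


def majority_vote(predictions):
    """Combine per-classifier label lists into one label per sample."""
    return [Counter(votes).most_common(1)[0][0] for votes in zip(*predictions)]


# Toy data (made up for illustration): feature 0 separates the classes,
# features 1 and 2 are noise. Label 1 = malware, 0 = benign.
X = [[1, 0, 1], [1, 1, 0], [1, 0, 0], [0, 1, 1], [0, 0, 1], [0, 1, 0]]
y = [1, 1, 1, 0, 0, 0]

weights = relief_scores(X, y)
selected = top_k_features(weights, 1)   # -> [0]

# Stand-ins for the outputs of three trained base classifiers.
votes = majority_vote([[1, 0, 1], [1, 1, 0], [0, 0, 1]])   # -> [1, 0, 1]
```

In the setting the abstract describes, `k` would correspond to the feature settings of 50, 100, or 150, and each base classifier would be trained only on the selected columns before its predictions are combined.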

How to Cite

Almulla, H. K., Mohammed, H. J., Clarke, N., Hadi, A. A., & Mohammed, M. A. (2025). An Effective Feature Optimization Model for Android Malware Detection. Mesopotamian Journal of CyberSecurity, 5(2), 563-576. https://doi.org/10.58496/MJCS/2025/034
