Big Data Predictive Analytics for Personalized Medicine: Perspectives and Challenges


  • Tahsien Al-Quraishi (Victorian Institute of Technology, School of IT, Melbourne, Victoria, Australia)
  • Naseer Al-Quraishi (Alayen Iraqi University, College of Computer Science, Computer Science Department, Nasiriyah, Iraq)
  • Hussein AlNabulsi (Victorian Institute of Technology, School of IT, Melbourne, Victoria, Australia)
  • Hussein AL-Qarishey (Lawrence Technological University, School of Mechanical Engineering, Michigan, USA)
  • Ahmed Hussein Ali (Department of Computer, College of Education, Aliraqia University, Baghdad, Iraq)



Keywords: Predictive Analytics, Personalized Medicine, Perspectives, Challenges, Big Data


The integration of predictive analytics into personalized medicine has become a promising approach for improving patient outcomes and treatment efficacy. This paper reviews the field, examining the tools, methodologies, and challenges involved. Predictive analytics applies machine learning algorithms to large datasets, including Electronic Health Records (EHRs), genomic data, medical imaging, and real-time data from wearable devices. The review covers key tools such as the Hadoop Distributed File System (HDFS), Apache Spark, and Apache Hive, which support scalable storage, efficient data processing, and large-scale data analysis. Key challenges identified include managing the immense volume of healthcare data, ensuring data quality and integration, and addressing privacy and security concerns. The paper also highlights the difficulty of achieving real-time data processing and of integrating predictive insights into clinical practice. Effective data governance and attention to ethical considerations are critical to maintaining trust and transparency. The strategic use of big data tools, combined with investment in skill development and interdisciplinary collaboration, is essential for realizing the full potential of predictive analytics in personalized medicine. By overcoming these challenges, healthcare providers can enhance patient care, optimize resource management, and drive medical discoveries, ultimately transforming healthcare delivery on a global scale.








How to Cite

Al-Quraishi, T., Al-Quraishi, N., AlNabulsi, H., AL-Qarishey, H., & Ali, A. H. (2024). Big Data Predictive Analytics for Personalized Medicine: Perspectives and Challenges. Applied Data Science and Analysis, 2024, 32–38.
DOI: 10.58496/ADSA/2024/004
Published: 2024-04-11