Measuring the Effectiveness of AI Tools in Clinical Research and Writing: A Case Study in Healthcare

This article investigates the capabilities and limitations of ChatGPT, a natural language processing (NLP) tool built on a large language model (LLM), a product of recent advances in artificial intelligence (AI). Designed to help computers understand and produce human-readable text, ChatGPT is considered here particularly for general scientific writing and healthcare research applications. Our methodology involved searching the Scopus database for articles on ’type 2 diabetes’ and ’T2 diabetes’ from reputable journals. After eliminating duplicates, we used ChatGPT to formulate conclusions for each selected article by inputting its structured abstract with the original conclusions removed. We also tested ChatGPT’s response to simple misuse scenarios. Our findings show that ChatGPT can accurately grasp context and concisely summarize primary research findings. It also helps individuals with less experience in mathematical analysis by providing coding guidance for such analyses in a variety of programming languages and by demystifying difficult model results. In conclusion, although ChatGPT and other AI technologies are revolutionizing scientific publishing and healthcare, their use should be strictly governed by authoritative regulation.

The powerful natural language processing (NLP) capabilities of ChatGPT have had a significant impact on healthcare. It is used in patient interaction via automated chatbots, medical documentation assistance, mental health support, and the identification of rare diseases. Furthermore, its ability to process and analyze large datasets [25,26] has a significant impact on diagnostic algorithms, tailored treatment planning, AI-assisted surgical procedures, and illness diagnosis support. Several factors must be considered when addressing the problems of AI applications such as ChatGPT in the healthcare industry. Model interpretability is crucial for transparent decision-making in intricate medical situations. Preventing healthcare disparities requires mitigating data bias, and staying current with medical research and guidelines requires ongoing learning. For best outcomes, integration with existing healthcare IT systems needs to be managed skillfully. Furthermore, respecting ethical and legal requirements is crucial for preserving compliance and trust, especially with regard to patient privacy and informed consent [27]. The creative application of ChatGPT in the medical field demonstrates its adaptability, since it can be used to streamline medical procedures and enhance men's health, both of which improve patient outcomes. Relevance and dependability are supported by training and effectiveness-evaluation techniques designed with healthcare scenarios in mind. With a focus on type 2 diabetes and general scientific research, this study investigates the present state and potential of ChatGPT in clinical research.

BACKGROUND
ChatGPT is a natural language processing (NLP) tool in the field of artificial intelligence (AI). Its primary goal is to assist computers in interpreting and producing text and visuals that are easily interpreted by humans [28,29]. ChatGPT, created by OpenAI, works as a generative language model, producing text and graphics or analyzing images based on the input it receives [30]. ChatGPT, a prominent example of a large language model (LLM), has recently demonstrated outstanding performance [31]. LLMs are a type of AI system intended to mimic human language processing capabilities. They use deep learning (DL) methods, chiefly transformer-based neural networks, and train on large datasets of audio and text data from magazines, websites, books, papers, and other sources. These LLMs are built from many stacked neural network layers and trained on enormous datasets to produce text or graphics that resemble human output [30,32]. ChatGPT has received widespread attention since its launch, outpacing previous social media trends by gaining nearly 120 million users in just a few months. It can now generate text and even source code in a variety of programming languages. ChatGPT's outstanding performance has sparked substantial debate in both academic and popular media. However, there is little empirical data to support its impact in clinical medicine or its reliability in diagnosing type 2 diabetes patients in clinical settings. Previous studies have shown that AI technologies might be used to produce problematic content on social media platforms or to drive social media bots that actively spread propaganda [33]. Numerous research articles have explored the use of ChatGPT in creating scientific literature, with one study showcasing its ability to generate comprehensive and formal research papers [34,35]. While the use of natural language processing (NLP) in healthcare research is not new [36][37][38][39], the recent introduction of ChatGPT has generated substantial interest in the AI community. This has sparked vigorous debates about its anticipated advantages and disadvantages in the healthcare domain, prompting responses from researchers across various medical disciplines. Furthermore, ChatGPT is expected to find diverse applications in clinical and medical fields, including pathology, radiology, cardiology, and more [40][41][42][43]. Reference [44] emphasized that "ChatGPT can assist professionals and researchers in numerous tasks, such as data analysis, literature review, and hypothesis generation." However, it remains uncertain how ChatGPT can serve as a collaborative tool that helps edit or enhance a researcher's manuscript or summarizes literature related to type 2 diabetes.

MATERIALS
Seven articles published by Elsevier in a reputable journal were used. The selected articles had to focus their findings on type 2 diabetes only. An Intel Core i7 ZBook laptop running Windows 10 was used to carry out the searches. The then-current version of ChatGPT (version 3.5) was used.

Table 1 entries (country and reference, method, short abstract, and ChatGPT-generated conclusion, where these survive in the source):

1.
ChatGPT conclusion: Addressing the intricate links (sarcopenia, type 2 diabetes) requires a holistic approach. Exercise, nutrition, and targeted interventions offer promising avenues to enhance muscle health and metabolic control in older adults.

2. China [46]
Method: The Bucher method to assess the treatment effects of SGLT2i and GLP-1 RA between White and Asian patients with T2DM.
Short abstract: significant effect modification; kidney outcomes and cardiovascular
ChatGPT conclusion: In evaluating GLP-1 RA and SGLT2i treatments for type 2 diabetes in Asian and White patients, no substantial differences emerged in major cardiovascular events or kidney outcomes between the groups.

3. Portugal [47]
Method: Retrospective cross-sectional study to predict the availability of insulin therapy prescription to patients living with T2D.
Short abstract: Insulin underprescribed in inadequately controlled T2D.
ChatGPT conclusion: In primary care, insulin therapy for type 2 diabetes patients remains underutilized, with delayed initiation. Older patients with better glycemic control, but higher atherosclerotic cardiovascular disease rates, were more likely to receive insulin treatment.

4.
ChatGPT conclusion: The study implies that sodium-glucose cotransporter 2 (SGLT2) inhibitors might impact thyroid function in type 2 diabetes patients. Those on SGLT2 inhibitors showed altered free T3 and T3:T4 ratios, indicating a potential relationship between SGLT2 inhibitors and thyroid dynamics.

5. Malaysia [49]
Method: Effect of empagliflozin in people with T2D during Ramadan, assessing hypoglycemia symptoms using appropriate measurement.
Short abstract: Empagliflozin safe for Ramadan fasting.
ChatGPT conclusion: The study suggests that empagliflozin use in fasting type 2 diabetes patients during Ramadan did not significantly alter blood pressure, weight, renal parameters, ketone levels, or hypoglycemia symptoms compared to the control group.

6. Ghana [50]
Method: Spirometry and 6-min walk test (6MWT) to examine the connection between pulmonary function and exercise capacity in T2D.
Short abstract: FEV1 is inversely linked to T2D exercise.
ChatGPT conclusion: The study highlights a significant link between pulmonary function and exercise capacity in type 2 diabetes patients. Impaired pulmonary function correlated with reduced exercise capacity, emphasizing the importance of considering pulmonary health in managing diabetes-related exercise programs.

7. United Arab Emirates [51]
Method: Using Pearson correlation, researchers examined the relationships between 3 widely used scales for assessing the discomfort of diabetic peripheral neuropathy in T2D patients.
Short abstract: High DNP risk in UAE Arabs.
ChatGPT conclusion: In the UAE, where type 2 diabetes is prevalent, understanding and managing diabetic peripheral neuropathy (DPN) is vital, particularly considering the diverse population. The study highlights the impact of ethnic origin on DPN risk and significant associations between neuropathy measurement scales.

METHOD
We deliberately restricted our article search to the Scopus database and to articles published in a reputable Elsevier journal. The search was conducted in a single sitting using the keywords 'type 2 diabetes' and 'T2 diabetes', with the criterion that only published research articles with a written conclusion would be selected. Reviews (short or full), surveys (short or full), and letters to the editor were excluded. The search returned 17 English-language results in Scopus. After duplicate and irrelevant articles were removed, seven articles focused on type 2 diabetes remained. We used an AI-based chatbot platform available on the internet that offers cost-free services for research purposes, specifically the latest free version of ChatGPT as of January 30. Specific prompts were entered into the ChatGPT input box, instructing it to generate the abstract conclusion for each 'Diabetes and Metabolic Syndrome: Clinical Research and Reviews' article in no more than 30 words. The original conclusions and those generated by ChatGPT are summarized in Table 1.
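The screening step described above (remove duplicates, exclude reviews, surveys, and letters, keep only articles with a written conclusion) can be sketched in a few lines of Python. This is only an illustrative sketch, not the procedure actually used in the study: the record fields `title`, `doc_type`, and `has_conclusion`, the helper names, and the sample records are all hypothetical.

```python
import re

# Document types excluded by the stated criteria (labels are assumed, not Scopus's own).
EXCLUDED_TYPES = {"review", "short review", "survey", "short survey", "letter"}


def normalize_title(title: str) -> str:
    """Lowercase and collapse punctuation so near-identical titles compare equal."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()


def screen_records(records):
    """Drop duplicate titles, excluded document types, and records without a conclusion."""
    seen = set()
    kept = []
    for rec in records:
        key = normalize_title(rec["title"])
        if key in seen:
            continue  # duplicate of an earlier record
        seen.add(key)
        if rec["doc_type"].lower() in EXCLUDED_TYPES:
            continue  # reviews, surveys, letters are out of scope
        if not rec.get("has_conclusion", False):
            continue  # only articles with a written conclusion qualify
        kept.append(rec)
    return kept


# Hypothetical records for illustration only; these are not the study's search results.
sample = [
    {"title": "Study A on Type 2 Diabetes", "doc_type": "Article", "has_conclusion": True},
    {"title": "Study A on type 2 diabetes.", "doc_type": "Article", "has_conclusion": True},
    {"title": "A Review of T2 Diabetes", "doc_type": "Review", "has_conclusion": True},
    {"title": "Study B on T2 Diabetes", "doc_type": "Article", "has_conclusion": False},
]
selected = screen_records(sample)
print([rec["title"] for rec in selected])
```

Matching on a normalized title is a deliberately simple choice; a real screening pass would more likely compare DOIs where they are available.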

Clinical and General Scientific Write-up
We evaluated the ability of ChatGPT, a large language model, to understand and summarize information from the abstract of an article (Background, Methods, and Results) and to generate conclusions based on that content. To ensure that the conclusions were not already known to the platform, whose information extends only to 2021, we selected seven (7) articles from Diabetes and Metabolic Syndrome: Clinical Research and Reviews published in 2023 [34][35][36][37][38][39][40]. We entered the following text in the ChatGPT box: "Use the Background, Methods, and Results provided below to write the Conclusions of an abstract for Diabetes and Metabolic Syndrome: Clinical Research and Reviews. The Conclusions should not exceed 30 words." Table 1 shows the original and ChatGPT-generated conclusions. In general, ChatGPT was able to properly summarize the results supplied in the prompt and generate relevant conclusions for each study. The generated conclusions might have been longer had there been no word-count restriction.
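The prompt construction described above can be sketched as follows. The instruction text is taken from this section; everything else (the function names, the dictionary layout of the structured abstract, and the placeholder abstract) is a hypothetical illustration rather than the tooling used in the study, where prompts were typed into the ChatGPT box by hand.

```python
# Instruction text as quoted in this section.
PROMPT_HEADER = (
    "Use the Background, Methods, and Results provided below to write the "
    "Conclusions of an abstract for Diabetes and Metabolic Syndrome: "
    "Clinical Research and Reviews. The Conclusions should not exceed 30 words."
)

# Only these sections are passed on; the original Conclusions are deliberately left out.
INPUT_SECTIONS = ("Background", "Methods", "Results")


def build_prompt(abstract: dict) -> str:
    """Assemble the prompt from a structured abstract, omitting its Conclusions."""
    body = "\n".join(f"{name}: {abstract[name]}" for name in INPUT_SECTIONS)
    return f"{PROMPT_HEADER}\n\n{body}"


def within_word_limit(text: str, limit: int = 30) -> bool:
    """Check a generated conclusion against the word limit (whitespace-delimited words)."""
    return len(text.split()) <= limit


# Placeholder abstract for illustration only; not one of the seven selected articles.
example_abstract = {
    "Background": "Placeholder background text.",
    "Methods": "Placeholder methods text.",
    "Results": "Placeholder results text.",
    "Conclusions": "Placeholder original conclusion that must not reach the model.",
}
prompt = build_prompt(example_abstract)
print(prompt)
```

Counting whitespace-delimited words is one simple convention; journals may count hyphenated terms or numerals differently.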

Predicted Misappropriation of ChatGPT in Research and Clinical Writing
We explored the potential misuse of ChatGPT, a large language model that can generate various types of text. We asked ChatGPT to suggest possible scenarios of misuse and evaluated their technical feasibility based on its responses. Table 2 shows some of the scenarios proposed by ChatGPT and their feasibility ratings. Most alarming is the high feasibility of creating false exhibits and documents with a high level of credibility, even though not all the scenarios are entirely attributable to ChatGPT. To assess ChatGPT's aptitude for generating scientific text, we presented a dataset in comma-separated value (CSV) format and tasked ChatGPT with composing a structured abstract conclusion for a scientific journal, without specifying the study's subject matter. The initial output was well organized and cohesive, using the variable names, presenting results accurately, and formulating clear conclusions. However, ChatGPT lacks the capability to conduct mathematical analyses itself and may not disclose this limitation unless probed by the user. Intriguingly, ChatGPT can offer assistance and suggestions concerning code for mathematical analysis in various programming languages, and can even simulate the outcomes of diverse models. These simulations could appear convincing to readers with limited familiarity with mathematical analyses. Misuse aside, ChatGPT offers significant advantages and applications within the medical and healthcare domains, as illustrated in Figure 1.
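Because simulated model output can look convincing, one practical safeguard is to recompute any reported statistic locally from the same CSV file. The sketch below illustrates that idea with a Pearson correlation computed using only the Python standard library. It is an assumption-laden illustration: the column names `var_a` and `var_b` and the inline CSV values are synthetic toy data invented for this example, not data from the study.

```python
import csv
import io
import math


def read_numeric_columns(csv_text: str, col_x: str, col_y: str):
    """Parse two named columns of a CSV string into lists of floats."""
    xs, ys = [], []
    for row in csv.DictReader(io.StringIO(csv_text)):
        xs.append(float(row[col_x]))
        ys.append(float(row[col_y]))
    return xs, ys


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant column")
    return cov / math.sqrt(var_x * var_y)


# Synthetic toy data (hypothetical column names and values), perfectly linear by design.
SAMPLE_CSV = """var_a,var_b
6.0,3.0
7.0,2.5
8.0,2.0
9.0,1.5
"""

xs, ys = read_numeric_columns(SAMPLE_CSV, "var_a", "var_b")
r = pearson_r(xs, ys)
print(f"Pearson r = {r:.3f}")
```

A coefficient recomputed this way can be compared directly with the value quoted in generated text; a mismatch indicates that the quoted figure was not derived from the data.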

CONCLUSION AND FUTURE WORK
ChatGPT is a large language model that provides considerable benefits and applications in the medical and healthcare areas, as shown in Figure 1. Furthermore, it is a useful tool for healthcare workers, assisting with duties such as diagnosis, research, patient monitoring, and medical education. However, using ChatGPT raises ethical questions and limitations concerning honesty, plagiarism, copyright infringement, and bias. These findings make clear that the incorporation of artificial intelligence technologies like ChatGPT in healthcare and scientific publishing demands careful monitoring through solid regulatory standards. For example, setting defined boundaries for plagiarism and developing dedicated applications to evaluate submitted articles is critical, particularly in healthcare and scientific writing. Given the current findings on the use of ChatGPT in healthcare and scientific publishing, future research should focus on improving ethical frameworks to handle data privacy and patient consent issues, as well as establishing strategies to eliminate biases in AI algorithms. A key research topic should be the development of improved AI-based techniques for plagiarism detection in scientific articles that distinguish AI-generated content from human-written work. Furthermore, extensive research into solid regulatory frameworks tailored to AI applications in healthcare is required to ensure their safe and ethical use. It is also critical to investigate the integration of AI tools such as ChatGPT with existing healthcare IT systems, particularly interoperability with electronic health records and telemedicine

TABLE I.
SELECTED STUDIES, THE COUNTRY WHERE THE RESEARCH WAS CONDUCTED, PREVENTIVE MEASURES, THE SHORT ABSTRACT, AND THE ABSTRACT CONCLUSION FROM CHATGPT.

TABLE II.
PREDICTED MISAPPROPRIATION OF CHATGPT, SAMPLE REQUEST TO CHATGPT, AND FEEDBACK FROM EXPERTS (RESPONDENTS) ON HOW FEASIBLE EACH SCENARIO IS WITH CHATGPT.