Table.  Evaluation of Fundamental Cardiovascular Disease Prevention Recommendations From an Online Chat-Based Artificial Intelligence Model Based on Assessment by Preventive Cardiology Clinicians
Research Letter
February 3, 2023

Appropriateness of Cardiovascular Disease Prevention Recommendations Obtained From a Popular Online Chat-Based Artificial Intelligence Model

Author Affiliations
  • 1Department of Cardiovascular Medicine, Cleveland Clinic, Cleveland, Ohio
  • 2Division of Cardiovascular Medicine, Stanford University, Stanford, California
JAMA. 2023;329(10):842-844. doi:10.1001/jama.2023.1044

To obtain cardiovascular disease (CVD) prevention advice, individuals may explore informational sources, including those on the internet, or communicate with clinicians. A research version of a dialogue-based artificial intelligence (AI) language model (ChatGPT) was released in November 2022 and has captured wide attention, with media reports suggesting more than 1 million users within days.1 Using a chat-based interface, this AI model responds to complex queries interactively.2 This study qualitatively evaluated the appropriateness of AI model responses to simple, fundamental CVD prevention questions.

Methods

This study was performed in December 2022. We created 25 questions addressing fundamental preventive concepts, including risk factor counseling, test results, and medication information, based on guideline-based prevention topics and our clinical experience in tertiary care preventive cardiology clinics (Table).3 Each question was posed 3 times to the online AI interface,2 and the responses were recorded. Each set of 3 responses was graded by an experienced preventive cardiology clinician. One reviewer was assigned to each set of responses; 3 reviewers participated in total. Reviewers graded each set of responses as either “appropriate” or “inappropriate” based on their clinical judgment and the content of the response or as “unreliable” if the 3 responses were inconsistent. The set of responses was graded as inappropriate if any of the 3 responses contained inappropriate information. Reviewers graded responses in 2 hypothetical contexts: as responses on a patient-facing information platform (akin to hospital-based informational websites) and as AI-generated draft responses to electronic message questions sent by patients for clinician review.
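
The set-level grading rule described above can be illustrated with a short sketch. This is a hypothetical illustration, not the authors' procedure: the grading in this study was done by clinician judgment, and the function name `grade_response_set`, the `consistent` flag, and the choice to let “inappropriate” take precedence over “unreliable” are assumptions made for the example.

```python
from typing import Sequence

ALLOWED = {"appropriate", "inappropriate"}


def grade_response_set(response_grades: Sequence[str], consistent: bool = True) -> str:
    """Combine a reviewer's judgments on 3 responses into one set-level grade.

    response_grades: the reviewer's judgment of each of the 3 responses.
    consistent: the reviewer's judgment of whether the 3 responses agree
        with one another (hypothetical flag, assumed for this sketch).
    """
    if len(response_grades) != 3:
        raise ValueError("each question was posed 3 times, so 3 grades are expected")
    if any(grade not in ALLOWED for grade in response_grades):
        raise ValueError("each grade must be 'appropriate' or 'inappropriate'")

    # Any inappropriate response makes the whole set inappropriate.
    if "inappropriate" in response_grades:
        return "inappropriate"
    # Assumption: inconsistency is checked only after the rule above.
    if not consistent:
        return "unreliable"
    return "appropriate"


print(grade_response_set(["appropriate", "appropriate", "inappropriate"]))
```

Under these assumptions, a set in which only 1 of the 3 responses is inappropriate is still graded inappropriate as a whole.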

Results

AI model responses to 21 of 25 questions (84%) were graded as appropriate in both contexts (Table). Responses to 4 questions (16%) were graded as inappropriate in both contexts. For 3 of the 4 sets of responses, all 3 responses had inappropriate information; for 1 set, 1 of 3 responses was inappropriate. For example, the AI model responded to questions about exercise by firmly recommending both cardiovascular activity and lifting weights, which may be incorrect and potentially harmful for certain patients. Responses about interpreting a low-density lipoprotein cholesterol level of 200 mg/dL lacked relevant details, including familial hypercholesterolemia and genetic considerations. Responses about inclisiran suggested that it is commercially unavailable. No responses were graded as unreliable.
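
The proportions reported above follow directly from the set-level counts. The sketch below is only an arithmetic check using the counts stated in this section (21 appropriate, 4 inappropriate, 0 unreliable); the helper name `summarize` is hypothetical and is not part of the study's analysis.

```python
from collections import Counter

# Set-level grades as reported in the Results.
set_grades = ["appropriate"] * 21 + ["inappropriate"] * 4

LABELS = ("appropriate", "inappropriate", "unreliable")


def summarize(grades):
    """Return {label: (count, whole-number percentage)} for each grade label."""
    counts = Counter(grades)
    total = len(grades)
    return {label: (counts[label], round(100 * counts[label] / total)) for label in LABELS}


summary = summarize(set_grades)
for label, (n, pct) in summary.items():
    print(f"{label}: {n} of {len(set_grades)} ({pct}%)")
```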

Discussion

This exploratory study found that a popular online AI model provided largely appropriate responses to simple CVD prevention questions as evaluated by preventive cardiology clinicians. Findings suggest the potential of interactive AI to assist clinical workflows by augmenting patient education and patient-clinician communication around common CVD prevention queries. For example, such an application may provide conversational responses to simple queries on informational platforms or create automated draft responses to patient electronic messages for clinicians. Whether these approaches can improve readability should be explored, because prior work has indicated low readability of certain online patient educational materials for CVD prevention.4

There are several limitations to this study. This AI model is a research version of a “chatbot” that is not meant for medical use. CVD prevention is a wide field that is not fully covered by the preliminary list of simple questions in this study. AI accuracy and reliability are susceptible to training data limitations and biases. For instance, an inappropriate answer regarding inclisiran was likely due to a training timeline that missed newer developments. This study used the version of ChatGPT available at the time of this analysis1 and did not assess other AI language models. Future research should compare various models to understand differential limitations. Use of appropriateness ratings was subjective and not validated; the work should be repeated using a more formal system for grading the responses and assessing specific aspects of the responses (eg, accuracy, readability). Only a single reviewer evaluated the responses for each question; having multiple reviewers would have allowed assessment of consistency among the reviewers' ratings. Heterogeneity among the 3 AI responses within each set was not assessed in detail. Finally, the AI tool’s responses did not include references to evidence to support any statements.

Section Editors: Jody W. Zylke, MD, Deputy Editor; Kristin Walter, MD, Senior Editor.
Article Information

Accepted for Publication: January 24, 2023.

Published Online: February 3, 2023. doi:10.1001/jama.2023.1044

Corresponding Author: Ashish Sarraju, MD, Section of Preventive Cardiology and Rehabilitation, Cleveland Clinic, 9500 Euclid Ave, JB1, Cleveland, OH 44195 (sarraja@ccf.org).

Author Contributions: Dr Sarraju had full access to all of the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.

Concept and design: Sarraju, Laffin.

Acquisition, analysis, or interpretation of data: All authors.

Drafting of the manuscript: Sarraju.

Critical revision of the manuscript for important intellectual content: All authors.

Statistical analysis: Bruemmer.

Administrative, technical, or material support: Van Iterson.

Supervision: Sarraju, Cho.

Conflict of Interest Disclosures: Dr Bruemmer reported serving on a scientific advisory board for Esperion and receiving clinical research trial funding to the institution from Novartis. Dr Rodriguez reported receiving grants from the National Institutes of Health/National Heart, Lung, and Blood Institute (1K01HL144607), the American Heart Association/Harold Amos Faculty Development Program, and the Doris Duke Charitable Foundation (#2022051) during the conduct of the study. Dr Laffin reported ownership and interest in Gordy Health and LucidAct Health outside the submitted work; serving as a consultant and/or on steering committees for Medtronic, Eli Lilly, Mineralys Therapeutics, AstraZeneca, and Crispr Therapeutics; and receiving research funding from AstraZeneca. No other disclosures were reported.

Data Sharing Statement: See the Supplement.

References
1. ChatGPT: optimizing language models for dialogue. Accessed December 11, 2022. https://openai.com/blog/chatgpt
2. Stokel-Walker C. AI bot ChatGPT writes smart essays: should professors worry? Nature. Published online December 9, 2022. doi:10.1038/d41586-022-04397-7
3. Arnett DK, Blumenthal RS, Albert MA, et al. 2019 ACC/AHA guideline on the primary prevention of cardiovascular disease: a report of the American College of Cardiology/American Heart Association Task Force on Clinical Practice Guidelines. J Am Coll Cardiol. 2019;74(10):e177-e232. doi:10.1016/j.jacc.2019.03.010
4. Rodriguez F, Ngo S, Baird G, Balla S, Miles R, Garg M. Readability of online patient educational materials for coronary artery calcium scans and implications for health disparities. J Am Heart Assoc. 2020;9(18):e017372. doi:10.1161/JAHA.120.017372