Health Wanted Show Notes: AI in Health Care

Hollywood has been dreaming of artificial intelligence (AI) and its capacity for both good and evil since long before it was developed enough to be a reality.

  • But the AI boom has come to fruition in recent years, touching all aspects of life.
  • And health care and science have been caught up in the AI revolution. This year, Nobel Prizes in both chemistry and physics went to researchers who developed AI platforms in their fields.

But what is AI exactly?

  • To start, the “A” in “AI” has two possible meanings: augmented or artificial.
  • Generally speaking, the difference is that “augmented intelligence” works with the goal of improving human intelligence and decision making, and “artificial intelligence” has a goal of being completely autonomous.
  • AI is a broad term for a category of technology that enables computers and machines to mimic human learning, comprehension, creativity, problem solving, and decision making.
  • There are three main types of AI based on capabilities. Two of them (general AI and super AI), which would be human-like or even superhuman, are still theoretical.

This leaves us with narrow AI, which is what we have in use now.

  • Narrow AI can be trained to perform a single, specific task. It can then build its abilities in that area, but it can’t branch beyond it.
  • ChatGPT is narrow AI because it is limited to the single task of text-based responses.
  • Narrow AI has two basic functionalities:
    • Reactive machine AI refers to systems that have no memory but can analyze large amounts of data to make predictions. You’re probably most familiar with this type of AI in the form of things like social media algorithms.
    • Limited memory AI can draw on past outcomes and events to inform its decision making. The memory isn’t indefinite, but performance can improve the more it’s trained on new data. Examples include websites that generate photos from a prompt and personal assistant AI like Siri.

AI can also be broken down into two additional sub-types: generative and predictive.

  • Generative AI uses massive amounts of data to respond to a user’s prompt with original content.
  • The applications for generative AI in health care predominantly lean toward alleviating administrative workloads.
  • Think of things like a program that can summarize patient visit notes or stand in as a chatbot to help direct patients to proper resources and information, freeing up the time of licensed providers to focus on patient care.
  • Children’s Hospital Los Angeles is currently testing generative AI to translate discharge notes into Spanish. For the 60% of their patient population that speaks Spanish, this could greatly improve accessibility…assuming the translations are appropriate. (A minimal sketch of what that kind of translation call might look like follows this list.)
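To make that concrete, here’s a minimal sketch of how a general-purpose LLM could be asked to do that kind of translation, using OpenAI’s Python SDK. The model choice, the prompt, and the sample note are illustrative assumptions, not the hospital’s actual setup, and any output would still need a bilingual clinician’s review.

```python
# Minimal sketch: asking a general-purpose LLM to translate discharge
# instructions. Model name, prompt, and note are illustrative assumptions.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

discharge_note = (
    "Patient may resume normal activity in 48 hours. "
    "Take amoxicillin 500 mg twice daily for 7 days."
)

response = client.chat.completions.create(
    model="gpt-4o-mini",  # hypothetical model choice
    messages=[
        {
            "role": "system",
            "content": (
                "Translate these discharge instructions into plain-language "
                "Spanish. Do not add or drop any medical details."
            ),
        },
        {"role": "user", "content": discharge_note},
    ],
)

print(response.choices[0].message.content)  # still needs human verification
```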

This kind of AI is often built on what are known as “large language models,” or LLMs.

  • LLMs require huge amounts of data to be able to recognize language patterns and fluently provide outputs that people can understand.
  • And then those outputs need to be evaluated for accuracy…which isn’t always done well.
  • I think by now we’re all familiar with the launch of Google’s AI-generated search overviews, powered by its Gemini model, which gave advice about how many rocks should be eaten a day and suggested adding glue to pizza to keep the cheese from sliding off.
  • It was a good lesson in the importance of evaluation before launching an AI model, but not every health care system seems to have learned that.
  • One systematic review found that only 5% of LLMs in health care were evaluated using real patient data.
  • Instead of taking data from actual doctors’ visits and feeding it into their systems to check that the outcomes matched or beat the providers’ decision making, many evaluations used questions from medical exams or simply invented hypothetical scenarios.
  • Only real patient data will properly encompass all the complexities of clinical care. One researcher likened using medical exam questions for evaluation to certifying a car as road ready with a multiple-choice quiz. (A bare-bones sketch of real-data evaluation follows this list.)
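Here’s a bare-bones sketch of what that kind of real-data evaluation could look like: score the model’s recommendation against what a clinician actually decided, across real (de-identified) visits. The record format and the stand-in model function are hypothetical placeholders.

```python
# Sketch of evaluating an LLM against real clinician decisions.
def model_recommendation(visit: dict) -> str:
    # Stand-in for a call to the LLM under evaluation; a trivial
    # constant answer so the sketch runs end to end.
    return "refer to specialist"

def agreement_rate(real_visits: list[dict]) -> float:
    # Fraction of real visits where the model matched the clinician.
    matches = sum(
        model_recommendation(v) == v["clinician_decision"] for v in real_visits
    )
    return matches / len(real_visits)

# De-identified records from actual visits, not exam questions or
# invented scenarios; the messy clinical detail lives in the notes.
visits = [
    {"notes": "fever, productive cough, 68yo", "clinician_decision": "prescribe antibiotics"},
    {"notes": "persistent migraine, no red flags", "clinician_decision": "refer to specialist"},
]
print(f"Agreement with clinicians: {agreement_rate(visits):.0%}")
```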

The lack of standardization for evaluation is indicative of the sometimes wild-west-like approach to AI in health care.

  • Elon Musk recently encouraged users to upload their medical images to Grok, the subscription-based AI program on X, for analysis.
  • While there are some studies showing promise in the potential for AI to aid in reading radiological diagnostics and assist areas that are lacking in qualified radiologists, Musk’s request brings up issues of both accuracy and privacy when it comes to putting protected health information into a non-specialized AI platform.
  • Even medical providers are eager to apply generative AI to their workflows…whether it’s a good idea or not.
  • A recent survey of general practitioners in the U.K. found that one in five doctors were already using things like ChatGPT to help summarize visit notes, come up with treatment plans, or translate complex medical jargon into plain English for patients.
  • OpenAI, the company that owns ChatGPT, states in its terms of service that customers are prohibited from using the software to provide medical or health advice without “review by a qualified professional and disclosure of the use of AI assistance and its potential limitations.” But it’s unclear how closely that rule is followed.
  • Generative AI can also lighten the administrative load of patients.
    • Health insurance companies reject an estimated one in seven claims for treatment, yet Americans insured under the Affordable Care Act typically file appeals for only 0.1% of those rejections.
    • A tech worker in San Francisco named Holden Karau is aiming to fix that. She created a generative AI platform, called Fight Health Insurance, where you can upload your rejection letter and it will create an appeal letter that’s easy to return to the company.

The other type of AI with immense potential in health care is predictive AI.

  • Predictive AI uses smaller, more specialized datasets to reason through what the most likely outcome of a scenario is.
  • In health care, this can look like systems that decide which patients need to have their care escalated because of the likelihood that they’ll deteriorate quickly. (A toy sketch of such a risk model follows this list.)
  • Wearable technology is often touted as the future of predictive AI for the common man.
  • By collecting and analyzing your data in real time, the hope is that smart technology can warn you in advance of an impending medical issue.
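As a toy illustration of that idea, here’s what a tiny deterioration-risk model might look like in scikit-learn. Every vital sign, label, and threshold below is invented; a real system would have to be trained and validated on each site’s own patient population.

```python
# Toy predictive model: flag patients at risk of rapid deterioration.
# All vitals, labels, and the 0.5 threshold are invented for illustration.
import numpy as np
from sklearn.linear_model import LogisticRegression

# Each row: [heart_rate, respiratory_rate, systolic_bp]; label 1 = deteriorated
X = np.array([
    [72, 14, 118], [88, 16, 110], [121, 24, 92],
    [68, 12, 125], [132, 28, 84], [95, 18, 105],
])
y = np.array([0, 0, 1, 0, 1, 0])

model = LogisticRegression().fit(X, y)

# Score a new patient's real-time vitals
new_patient = np.array([[118, 26, 90]])
risk = model.predict_proba(new_patient)[0, 1]
if risk > 0.5:  # a real threshold would be tuned and validated per site
    print(f"Escalate care: predicted deterioration risk {risk:.0%}")
```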

But as any meteorologist can tell you, predicting the future is hard…and easy to get wrong.

  • A couple of years ago, the electronic medical records giant Epic had to overhaul its sepsis alert system, which used AI to crunch patient data and determine who was at risk of the potentially deadly condition, a leading cause of death among hospitalized patients.
  • The company released its AI sepsis tool without checking to make sure that the parameters it used worked for the real-life hospital population at each different site.
  • They also, confusingly, factored in whether antibiotics had already been ordered to help determine if someone was at risk. But if a doctor has already ordered antibiotics, your early warning system is too late. (See the sketch of this problem below.)
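Here’s a quick sketch of why that feature choice backfires, a problem sometimes called label leakage: a field like “antibiotics ordered” only gets filled in after a clinician already suspects sepsis, so a model leaning on it can’t warn anyone early. The field names are made up for illustration.

```python
# Illustration of label leakage in an "early warning" feature set.
# Field names are invented; the point is the timing of the signals.
patient = {
    "heart_rate": 124,            # available before anyone acts
    "temperature_c": 39.1,        # available before anyone acts
    "antibiotics_ordered": True,  # leaky: only set AFTER a doctor suspects sepsis
}

# An honest early-warning model should only see signals that exist
# before clinicians have already intervened.
early_signals = {k: v for k, v in patient.items() if k != "antibiotics_ordered"}
print(early_signals)
```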

This is another general issue with AI: How a program arrives at its predictions often isn’t transparent.

  • If clinicians aren’t educated on what variables an AI program is using to get to the decisions it’s making, they can’t use their expertise to critique those decisions. This issue is particularly important when programs are launched without sufficient validation.
  • Take, for example, Whisper, a voice-to-text transcription tool made by OpenAI.
  • Over 30,000 clinicians and 40 health systems have begun using a Whisper-based tool built by the company Nabla to aid in transcribing patient visits.
  • The problem is that Whisper has been plagued by what’s known as “hallucinations,” which is when the software adds words or entire sentences to transcriptions that no one said.
  • The creators at Nabla say their version of Whisper is trained on medical records, which fine-tunes its ability to produce proper transcriptions. But it’s impossible for independent parties to compare the recordings to the transcriptions, because the original audio is deleted after transcribing for “safety reasons.” (A bare-bones transcription sketch follows this list.)
  • It’s particularly dangerous for patients who are deaf or hard of hearing because they have no way to identify what parts of the transcript were never said.
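For context, here’s roughly what a Whisper-style transcription call looks like using OpenAI’s open-source whisper package (pip install openai-whisper); this is not Nabla’s product, and the audio file name is a placeholder. Note that nothing in the output distinguishes real words from hallucinated ones, which is why deleting the source audio removes the only ground truth a reviewer could check.

```python
# Bare-bones speech-to-text with the open-source whisper package.
# "patient_visit.wav" is a placeholder file name.
import whisper

model = whisper.load_model("base")            # small general-purpose model
result = model.transcribe("patient_visit.wav")
print(result["text"])  # hallucinated words look identical to real ones
```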

The promise and potential of AI in health care is pretty exciting, just maybe not for direct patient care…yet.

  • One meta-analysis compared decisions made by humans alone versus AI alone and found that both performed better independently than humans working with AI, indicating the need to better train clinicians on how to use these platforms.
  • But there’s huge potential in the shorter term for the use of AI outside clinical practice.
  • A great application for AI could be found in clinical trials.
  • Imagine doing a trial for treatment on a rare disease. Traditionally, you’d need to recruit enough people with that rare disease to fill both your treatment and control arms of the study.
    • With proper predictive AI, you might only need to recruit people for the treatment arm and could use databases of prior cases to predict what their outcomes would have been without treatment (sketched in code after this list).
    • It could cut down on time by reducing the number of people you need to enroll.
  • Trials could also improve their success rates by using algorithms that pick enrollment sites with the proper patient population.
    • Let’s say you want to study a certain outcome or disease in a particular demographic. There might be a location that applies to be an enrollment site, and on paper it appears that their population has enough people of that demographic to meet enrollment needs.
    • AI could analyze patient histories at that location and find that a high number of those people are likely to be excluded because of area-specific health conditions.
    • It would save immense time, energy, and money to know ahead of time that the site is not appropriate for this study.
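Here’s a toy sketch of the first of those ideas, sometimes called a synthetic control arm: train an outcome model on historical untreated cases, then predict what enrolled patients’ outcomes would likely have been without treatment. All the data below is invented.

```python
# Toy "synthetic control arm": predict untreated outcomes from history.
# All patients, features, and labels are invented for illustration.
import numpy as np
from sklearn.linear_model import LogisticRegression

# Historical registry of untreated patients: [age, baseline_severity]
hist_X = np.array([[34, 2], [51, 3], [62, 4], [45, 2], [70, 5], [29, 1]])
hist_y = np.array([0, 0, 1, 0, 1, 0])  # 1 = disease progressed untreated

control_model = LogisticRegression().fit(hist_X, hist_y)

# Patients actually enrolled in the treatment arm
enrolled = np.array([[48, 3], [66, 4]])
predicted_untreated_risk = control_model.predict_proba(enrolled)[:, 1]
print("Predicted progression without treatment:", predicted_untreated_risk.round(2))
# Compare these counterfactual risks to the observed treated outcomes
# instead of recruiting a separate placebo arm.
```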

Aside from easing the bureaucratic headaches of trial design, there are some more tangible uses for AI in health care that have appeared over the last few years.

  • I mentioned earlier that the Nobel Prize in chemistry involved AI. It was awarded, in part, to the two researchers who developed the AI model AlphaFold2.
  • For years, researchers have been trying to predict the 3D structure of proteins based on the sequence of amino acids that they have.
  • AlphaFold2 has been able to predict the structures of the nearly 200 million proteins that scientists have identified.
  • Determining protein structure is a necessary and formerly time-consuming process that helps researchers identify new drug structures, design more effective vaccines, and fight antibiotic resistance.
  • But the program would not have been possible without the analog work of the Protein Data Bank, an open-access database of over 200,000 experimentally determined protein structures that AlphaFold used to train its model. (A quick sketch of pulling a structure from it closes out these notes.)
  • Which goes to show that an AI program is only as good as the data it’s trained on, and our robotic servants are nothing without our labor…for now.
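And as a small demo of how open that resource really is, anyone can pull one of those experimentally determined structures straight from the RCSB Protein Data Bank. The entry ID 1CRN (crambin) is just an arbitrary example.

```python
# Download an experimentally determined structure from the Protein Data Bank.
# 1CRN (crambin) is an arbitrary example entry.
import urllib.request

pdb_id = "1CRN"
url = f"https://files.rcsb.org/download/{pdb_id}.pdb"
with urllib.request.urlopen(url) as resp:
    structure = resp.read().decode()

# ATOM records hold the 3D coordinates models like AlphaFold learn from
atoms = [line for line in structure.splitlines() if line.startswith("ATOM")]
print(f"{pdb_id}: {len(atoms)} atom records")
print(atoms[0])
```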