OscarAI Blog
In the spirit of constantly learning, we are experimenting with several AI use cases and documenting our discoveries.
Needle in a Haystack: Using LLMs to Search for Answers in Medical Records
Part 1 of 3
We are constantly thinking about how to make the workflows of clinicians less tedious, so they can do what they do best: serve patients and provide care. In this series of posts, we will share what we have learned from another application with that same goal: improving clinician workflows to deliver faster, better care for our members.
Related Condition Search
In the first step of our ‘find care’ pipeline, Oscar has an omni-search bar that’s able to distinguish between reasons for visit, providers, facilities, and drugs. The bar uses lexicographic (i.e., close in dictionary order) techniques to find autocomplete matches from each group type and a rules-based model decides which results from each group should be surfaced.
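The lexicographic matching described above can be sketched with a binary search over sorted terms per group. This is a minimal illustration, not Oscar's implementation: the group contents and the "prefer clinical reasons first" surfacing rule are assumptions standing in for the real rules-based model.

```python
import bisect

# Hypothetical toy index; the real groups and their contents differ.
GROUPS = {
    "reasons": ["back pain", "cough", "fever", "headache"],
    "providers": ["dr. adams", "dr. baker", "dr. chen"],
    "drugs": ["ibuprofen", "insulin", "lisinopril"],
}
SORTED = {g: sorted(terms) for g, terms in GROUPS.items()}

def prefix_matches(group: str, query: str, limit: int = 3) -> list[str]:
    """Lexicographic autocomplete: binary-search the sorted term list
    for the first term >= query, then collect the consecutive terms
    that share the query as a prefix."""
    terms = SORTED[group]
    q = query.lower()
    i = bisect.bisect_left(terms, q)
    out = []
    while i < len(terms) and terms[i].startswith(q) and len(out) < limit:
        out.append(terms[i])
        i += 1
    return out

def omni_search(query: str) -> list[tuple[str, str]]:
    """Toy stand-in for the rules-based surfacing model: collect matches
    from every group, then order groups with 'reasons' first."""
    results = {g: prefix_matches(g, query) for g in SORTED}
    ordered = sorted(results.items(), key=lambda kv: (kv[0] != "reasons", kv[0]))
    return [(g, t) for g, ts in ordered for t in ts]
```

For example, `omni_search("co")` matches only the reason-for-visit "cough", while `prefix_matches("drugs", "in")` narrows to "insulin" without scanning the whole list.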
Curious Language Model Limitations
Language models are awesome and all, but my favorite research papers are the ones that show where they fail. It's easier to understand hard limits than soft capabilities. Here are four recent papers with good examples of the limits of current LLMs.
Why Notation Matters
A practical observation about LLMs that is both encouraging and worrisome: it is surprising how much NOTATION matters. Simply how you write something down makes it much easier or harder for a transformer to understand the actual meaning behind it. It's a little as if LLMs could read a quantum physics book, but only if it's written in 18-point Comic Sans to look like a Dr. Seuss book.
Streamlining Commission Reconciliation: An AI Approach
Insurance carriers pay brokers commissions on a monthly basis, but brokers must follow certain guidelines to remain eligible; otherwise, the commission is not paid. If a broker believes they were incorrectly denied a commission, they can reach out to Oscar to ask for an explanation, a process we call a “commission reconciliation.”
AI Use Case: Messaging Encounter Documentation
Last year, we expanded our AI functionality to include a new use case: generating initial drafts for providers to document their secure messaging-based visits with patients.
Call Summarization: Comparing AI and Human Work
Summarization is considered a top use case for generative AI. Our call quality team ran a side-by-side evaluation on real member calls. The initial findings show that AI performs comparably to our Care Guides in summarizing calls overall, but this performance isn't evenly distributed.
Enforced planning and reasoning within our LLM Claim Assistant
The Claim Assistant starts by formulating a strategic plan to tackle the problem at hand. This plan is an array of thoughts or subgoals, much like breaking down a large task into smaller, manageable pieces.
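The plan-then-execute pattern described above can be sketched as follows. This is a hedged illustration under stated assumptions: `call_llm` is a hypothetical stand-in for the real model call, and the hard-coded subgoals merely illustrate the "array of thoughts" structure (in production the plan itself would come from the LLM).

```python
from dataclasses import dataclass, field

def call_llm(prompt: str) -> str:
    # Placeholder for the real model API call.
    return f"[model response to: {prompt[:40]}...]"

@dataclass
class ClaimAssistant:
    question: str
    plan: list[str] = field(default_factory=list)
    results: list[str] = field(default_factory=list)

    def make_plan(self) -> list[str]:
        # Illustrative subgoals; the real plan is generated by the model.
        self.plan = [
            "Identify the claim and pull its processing trace",
            "Determine which rules fired and why",
            "Draft a plain-language explanation",
        ]
        return self.plan

    def run(self) -> list[str]:
        # Enforced planning: execution never starts without a plan.
        if not self.plan:
            self.make_plan()
        for subgoal in self.plan:
            self.results.append(call_llm(f"{self.question}\nSubgoal: {subgoal}"))
        return self.results
```

The key design choice is that `run` refuses to proceed until a plan exists, so each model call tackles one small subgoal rather than the whole problem at once.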
A User-Centric Approach to Working with AI
Take a look at how user research helped inform our experimentation in two areas: automated messaging with members and clinical documentation for primary care providers.
GPT-4 Turbo Benchmarking
The pace of improvement of large language models (LLMs) has been relentless over the past year and a half, with new features and techniques introduced on a monthly basis. To rapidly assess the performance of new models and model versions, we built a benchmarking data set and protocol, composed of representative healthcare AI use cases, that we can quickly run and re-run as needed.
Evaluating the Behavior of Call Chaining LLM Agents
We’re developing a GPT-powered agent designed to answer queries about claims processing. However, providing GPT-4 with sufficient context to respond to questions about internal systems presents a significant challenge due to the API request’s limited payload size.
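One common way to work around a limited payload, sketched here as an assumption rather than Oscar's actual design, is call chaining: split the oversized context into chunks that fit, summarize each chunk with one call, then answer from the combined summaries. `call_llm` and the character-based limit are hypothetical stand-ins.

```python
MAX_CHARS = 200  # stand-in for the real payload limit

def call_llm(prompt: str) -> str:
    # Placeholder for the real GPT-4 API call.
    return f"summary({len(prompt)} chars)"

def chunk(text: str, size: int = MAX_CHARS) -> list[str]:
    """Split context into payload-sized pieces."""
    return [text[i:i + size] for i in range(0, len(text), size)]

def answer_with_chaining(question: str, context: str) -> str:
    """Chain one summarization call per chunk, then a final answer call
    over the much smaller combined summaries."""
    summaries = [call_llm(f"Summarize for '{question}':\n{c}")
                 for c in chunk(context)]
    return call_llm(f"Question: {question}\nNotes:\n" + "\n".join(summaries))
```

The trade-off, and a reason agent behavior needs evaluation, is that each intermediate summarization step can drop details the final call would have needed.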
AI Use Case: Electronic Lab Review
Oscar continues to experiment and iterate on clinical-AI-human use cases through Oscar Medical Group (OMG). OMG is a team of 120+ providers who offer virtual urgent and primary care for our members. It operates on top of Oscar’s in-house technology stack, including our internally-built Electronic Health Record (EHR) system.
A Simple Example for Limits on LLM Prompting Complexity
LLMs are capable of spectacular feats, and they are also capable of spectacularly random flame-outs. A big systems engineering challenge remains figuring out how to tell one from the other. Here is an example of the latter.
Oscar Claim Assistant Built On GPT-4
Behind each health insurance claim are millions of combinations of variables that take into account a mixture of rules relating to regulations, contracts, and common industry practices, among other factors. When a doctor has a question about a particularly complex claim, they contact one of Oscar’s claim specialists, who interprets the claims system’s output. It’s a complex and labor-intensive process for health insurers.
Campaign Builder Actions
GPT enables new types of automation through Campaign Builder. This allows Oscar and +Oscar clients to deliver relevant interventions and intelligently monitor for signals to better serve members’ and patients’ clinical needs.
Automated Claims System Configuration
With AI, we’re exploring ways to automate the translation of provider contracts into highly accurate claims configurations.
Rating Member Services Interactions With LLMs
In our member services operations, we ask members to rate their interactions with us on member satisfaction (MSAT, general satisfaction with Oscar) and agent satisfaction (ASAT, their satisfaction with the care guide who helped them). How good is GPT-4 at estimating those two ratings from member services transcripts?
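A minimal sketch of estimating those two ratings from a transcript might look like the following. The prompt wording, the 1-5 scale formatting, and the `call_llm` stand-in are all assumptions for illustration, not Oscar's production setup.

```python
import re

def call_llm(prompt: str) -> str:
    # Placeholder model reply; the real call would go to GPT-4.
    return "MSAT: 4\nASAT: 5"

def rate_transcript(transcript: str) -> dict[str, int]:
    """Ask the model for MSAT and ASAT on a 1-5 scale, then parse the
    'MSAT: n' / 'ASAT: n' lines out of its reply."""
    reply = call_llm(
        "Read this member services transcript and estimate the member's "
        "satisfaction with Oscar (MSAT) and with the care guide (ASAT), "
        "each on a 1-5 scale, formatted as 'MSAT: n' and 'ASAT: n'.\n\n"
        + transcript
    )
    scores = dict(re.findall(r"(MSAT|ASAT):\s*(\d)", reply))
    return {k: int(v) for k, v in scores.items()}
```

Constraining the output format makes the reply machine-parseable, which is what lets model-estimated ratings be compared against the members' actual survey responses at scale.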
Claims Assistant
A bot is trained to understand system logs and PDFs, breaking down the cost of procedures and identifying how insurance logic was applied in each case.
Clinical Documentation
An AI-powered clinical documentation system that automatically summarizes and documents conversations between our medical team and patients.