OscarAI Blog

In the spirit of constantly learning, we are experimenting with several AI use cases and documenting our discoveries.


Sarah Donna

Needle in a Haystack: Using LLMs to Search for Answers in Medical Records

Part 1 of 3

We are constantly thinking about how to make clinicians' workflows less tedious, so they can do what they do best: serve patients and provide care. In this series of posts, we will share what we learned from another application with the same goal: improving clinician workflows to deliver faster, better care for our members.

Read More
Isabella Brown

Related Condition Search

In the first step of our ‘find care’ pipeline, Oscar has an omni-search bar that distinguishes between reasons for visit, providers, facilities, and drugs. The bar uses lexicographic techniques (i.e., matching terms close in dictionary order) to find autocomplete candidates from each group type, and a rules-based model decides which results from each group should be surfaced.
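The two-stage shape described above (per-group lexicographic autocomplete, then a rule that decides which groups to surface) can be sketched in a few lines. This is a minimal illustration, not Oscar's implementation: the group names, terms, and fixed priority rule are all assumptions made up for the example.

```python
import bisect

# Toy index: each group maps to a sorted list of searchable terms.
# Group names and terms are illustrative, not Oscar's actual data.
INDEX = {
    "reasons": sorted(["back pain", "fever", "flu shot", "flu symptoms"]),
    "providers": sorted(["dr. alvarez", "dr. flumeri"]),
    "drugs": sorted(["flunisolide", "fluoxetine", "ibuprofen"]),
}

def prefix_matches(terms, query, limit=3):
    """Find terms sharing a prefix with the query via binary search
    (this is what makes the matching lexicographic)."""
    start = bisect.bisect_left(terms, query)
    out = []
    for term in terms[start:]:
        if not term.startswith(query):
            break  # sorted order: once a term misses the prefix, all later ones do
        out.append(term)
        if len(out) == limit:
            break
    return out

def omni_search(query):
    """Collect autocomplete matches per group, then apply a surfacing rule."""
    matches = {g: prefix_matches(t, query) for g, t in INDEX.items()}
    # Stand-in for the rules-based model: a fixed group priority,
    # dropping groups with no matches.
    priority = ["reasons", "providers", "drugs"]
    return [(g, matches[g]) for g in priority if matches[g]]
```

A query like `omni_search("flu")` returns reason-for-visit matches ahead of drug matches; in a real system the surfacing rule would weigh many more signals than a static ordering.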

Read More
Sarah Donna

Curious Language Model Limitations

Language models are awesome and all, but my favorite research papers are those that show where they fail. It's easier to understand hard limits than soft capabilities. Here are four recent papers with good examples of the limits of current LLMs.

Read More
Sarah Donna

Why Notation Matters

A practical observation on LLMs that is both encouraging and worrisome: it is surprising how much NOTATION matters. Simply how you write something down makes it much easier or harder for a transformer to understand the actual meaning behind it. It’s a little like LLMs are able to read a quantum physics book, but only if it’s written in 18-point Comic Sans to look like a Dr. Seuss book.

Read More
Isabella Brown

Streamlining Commission Reconciliation: An AI Approach

Brokers are paid commissions by insurance carriers on a monthly basis, but they must follow certain guidelines to remain eligible for payment. If a broker believes they were incorrectly denied a commission, they can reach out to Oscar for an explanation, a process we call a “commission reconciliation.”

Read More
Isabella Brown

Call Summarization: comparing AI and human work

Summarization is considered a top use case for generative AI. Our call quality team ran a side-by-side evaluation on real member calls. The initial findings show that AI performs comparably to our Care Guides in summarizing calls overall, but this performance isn’t evenly distributed.

Read More
Isabella Brown

GPT-4 Turbo Benchmarking

The pace of improvement of large language models (LLMs) has been relentless over the past year and a half, with new features and techniques introduced on a monthly basis. In order to rapidly assess performance for new models and model versions, we built a benchmarking data set and protocol composed of representative AI use cases in healthcare we can quickly run and re-run as needed.

Read More
Michael Farley

Evaluating the Behavior of Call Chaining LLM Agents

We’re developing a GPT-powered agent designed to answer queries about claims processing. However, providing GPT-4 with sufficient context to respond to questions about internal systems presents a significant challenge due to the API request’s limited payload size.

Read More

AI Use Case: Electronic Lab Review

Oscar continues to experiment and iterate on clinical-AI-human use cases through Oscar Medical Group (OMG). OMG is a team of 120+ providers who offer virtual urgent and primary care for our members. It operates on top of Oscar’s in-house technology stack, including our internally-built Electronic Health Record (EHR) system.

Read More
Isabella Brown

A Simple Example for Limits on LLM Prompting Complexity

LLMs are capable of spectacular feats, and they are also capable of spectacularly random flame-outs. A big systems engineering issue remains figuring out how to tell one from the other. Here is an example of the latter.

Read More
Isabella Brown

Oscar Claim Assistant Built On GPT-4

Behind each health insurance claim are millions of combinations of variables that take into account a mixture of rules relating to regulations, contracts, and common industry practices, among other factors. When a doctor has a question about a particularly complex claim, they contact one of Oscar’s claim specialists, who interprets the claims system’s output. It’s a complex and labor-intensive process for health insurers.

Read More
Chat GPT, Post Antonio Lee

Campaign Builder Actions

GPT enables new types of automation through Campaign Builder. This allows Oscar and +Oscar clients to deliver relevant interventions and intelligently monitor for signals to better serve members’ and patients’ clinical needs.

Read More
Isabella Brown

Rating Member Services Interactions With LLMs

In our member services operations, we ask members to rate their interactions with us on member satisfaction (MSAT, general satisfaction with Oscar) and agent satisfaction (ASAT, their satisfaction with the care guide who helped them). How good is GPT-4 at estimating those two ratings from member services transcripts?
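One way to pose this question to a model is to ask for both ratings in a fixed output format and parse them back out. The sketch below shows only that scaffolding; the prompt wording, the 1–5 scale format, and the reply format are assumptions for illustration, and the actual model call is omitted.

```python
import re

def rating_prompt(transcript):
    """Build a prompt asking a model to score a transcript.
    The wording and required reply format are illustrative assumptions."""
    return (
        "Read this member services transcript and rate, on a 1-5 scale:\n"
        "MSAT (the member's overall satisfaction with Oscar) and\n"
        "ASAT (their satisfaction with the care guide who helped them).\n"
        "Answer exactly as 'MSAT: <n>, ASAT: <n>'.\n\n"
        f"Transcript:\n{transcript}"
    )

def parse_ratings(model_reply):
    """Extract the two scores from a reply like 'MSAT: 4, ASAT: 5'."""
    m = re.search(r"MSAT:\s*([1-5]).*?ASAT:\s*([1-5])", model_reply, re.S)
    if not m:
        raise ValueError(f"unparseable rating reply: {model_reply!r}")
    return int(m.group(1)), int(m.group(2))
```

With structure like this, the model's estimates can be compared directly against the ratings members actually submitted, which is the evaluation the post explores.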

Read More
Claims, Demo Antonio Lee

Claims Assistant

A bot is trained to understand system logs and PDFs, breaking down the cost of procedures and identifying how insurance logic was applied in each case.

Read More