AI in Healthcare

A guide introducing artificial intelligence as it applies to healthcare

NYUTron (pronounced: new-tron)

The Health Sciences Library is excited to support innovative AI projects at NYU Langone Health. One such initiative is NYUTron. Led by Dr. Eric Oermann, Assistant Professor of Neurosurgery, Radiology, and Data Science, and Dr. Yin Aphinyanaphongs, Director of Operational Data Science and Machine Learning, NYUTron is a large language model (LLM) platform trained on a decade's worth of clinical notes from NYU Langone Health's inpatient records. It is designed to read unstructured clinical text and perform classification (categorization) and regression (numerical prediction) tasks.

The work behind NYUTron was published in Nature, where it scored exceptionally high (in the 99th percentile) on alternative metrics, indicators that go beyond traditional academic citations to include social media mentions, downloads, and online discussions. This substantiates its impact and popularity within the scientific community.


Main Features:

  • 190 million parameters

  • Trained on 4.1 billion words of clinical notes from 387,144 patients (identified version)

  • Can be flexibly fine-tuned to perform classification and regression tasks with unstructured text inputs

  • Demonstrated its usefulness in five areas:

    • Predicting in-hospital mortality

    • Imputing the comorbidity index

    • Predicting 30-day all-cause readmission

    • Estimating length of stay

    • Predicting insurance denials
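To illustrate the fine-tuning idea behind these tasks, here is a minimal Python sketch of how task-specific heads could map an encoder's output into a prediction. Everything here is hypothetical: the toy "note embedding," weights, and function names are illustrative stand-ins, not NYUTron's actual architecture.

```python
import math

def classification_head(embedding, weights, bias):
    """Binary classification (e.g., 30-day readmission): sigmoid over a linear score."""
    score = sum(e * w for e, w in zip(embedding, weights)) + bias
    return 1.0 / (1.0 + math.exp(-score))  # probability in (0, 1)

def regression_head(embedding, weights, bias):
    """Regression (e.g., length of stay in days): plain linear score."""
    return sum(e * w for e, w in zip(embedding, weights)) + bias

# Toy 4-dimensional "note embedding" standing in for the encoder's output
# on one clinical note (real embeddings have hundreds of dimensions).
note_embedding = [0.2, -0.1, 0.7, 0.05]

p_readmit = classification_head(note_embedding, [1.0, 0.5, -0.3, 2.0], bias=-0.2)
los_days = regression_head(note_embedding, [3.0, 0.0, 4.0, 1.0], bias=2.0)
```

In practice, fine-tuning updates both the head and the encoder weights on labeled examples for each task; only the head differs between classification and regression, which is why one pretrained model can serve all five tasks above.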

Enhancing NLP in Healthcare:

A de-identified model, stripped of protected health information (PHI), is free for use by all NYU Langone Health (NYULH) faculty and staff. To access the de-identified model, please submit a request to the Predictive Analytics Unit (PAU).

We encourage members of the NYU Langone Health community to explore and utilize this model for their own predictive tasks. See the Jiang LY, et al. research paper on health system-scale language models [1] for examples of potential use cases and further details.

Future Plans:

The NYUTron team plans to expand its range of models, including additional BERT-like models as well as generative models. (Bidirectional Encoder Representations from Transformers, or BERT, developed by Google, is a language model that learns from unlabeled text by attending to the words both before and after each position in a sentence; this bidirectional focus helps it capture language nuances, and the pretrained model can then be fine-tuned for specific tasks such as question answering without significant modification.) The team's goal is to provide Natural Language Processing (NLP) tools within the electronic health record (EHR) universal interface.
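To make the bidirectional idea concrete, here is a toy Python sketch that fills in a masked word using both its left and right neighbors. It uses a simple co-occurrence counter over an invented three-sentence corpus, not an actual BERT model; a real model learns attention weights over huge corpora rather than raw counts.

```python
from collections import Counter

# Invented toy corpus of clinical-note-like sentences.
corpus = [
    "patient admitted with chest pain",
    "patient admitted with shortness of breath",
    "patient discharged with chest pain resolved",
]

# Count which word appears between each (left neighbor, right neighbor) pair.
context_counts = {}
for sent in corpus:
    words = sent.split()
    for i in range(1, len(words) - 1):
        key = (words[i - 1], words[i + 1])
        context_counts.setdefault(key, Counter())[words[i]] += 1

def fill_mask(left, right):
    """Return the token most often seen between `left` and `right`, or None."""
    counts = context_counts.get((left, right))
    return counts.most_common(1)[0][0] if counts else None

# "patient [MASK] with": "admitted" fills that slot twice, "discharged" once.
prediction = fill_mask("patient", "with")
```

Because the prediction draws on context from both sides of the mask, even this crude version captures the "dual focus" described above.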


  1. Jiang LY, Liu XC, Nejatian NP, Nasir-Moin M, Wang D, Abidin A, Eaton K, Riina HA, Laufer I, Punjabi P, Miceli M, Kim NC, Orillac C, Schnurman Z, Livia C, Weiss H, Kurland D, Neifert S, Dastagirzada Y, Kondziolka D, Cheung ATM, Yang G, Cao M, Flores M, Costa AB, Aphinyanaphongs Y, Cho K, Oermann EK. "Health system-scale language models are all-purpose prediction engines." Nature. 2023 Jul;619(7969):357-362. doi: 10.1038/s41586-023-06160-y. Epub 2023 Jun 7. PMID: 37286606; PMCID: PMC10338337.

Key Links:

Manuscript: Read the Research Paper

Codebase: Explore the Code on GitHub

PAU Request Form: Request Model Weights

For questions about AI at NYU Langone Health, please email: generative.ai@nyulangone.org