Non-Alcoholic Fatty Liver Disease (NAFLD)

Hepatocellular carcinoma and non-alcoholic fatty liver disease (NAFLD)

Hepatocellular carcinoma (HCC) is increasing in incidence worldwide and is associated with non-alcoholic fatty liver disease (NAFLD), even in patients without advanced liver fibrosis. Predictive models that identify which patients with NAFLD are at significant risk of developing HCC can enable personalized screening strategies and cost-effective care, while improving clinical outcomes through early detection. Machine learning (ML) tools have the potential to produce such predictive models and to estimate precisely the risk of developing HCC in patients with NAFLD.

Our machine learning prediction model

As part of this multi-institutional study, we have created a pilot ML model using data from the UC Davis OMOP database, which includes standard clinical variables that have been curated by our physician collaborators for accurate risk prediction. 

We intend to further improve upon this model by training it on a wider range of datasets from institutions across the country.

Our pathway to creating this model of NAFLD-to-HCC progression is:

  1. Create a base cohort of patients diagnosed with non-alcoholic fatty liver disease (NAFLD)
  2. Identify the patients in that cohort who developed HCC (180 patients)
  3. Identify the phenotypes (features) to be used for training the machine learning models
  4. Split the data 90/10 into training and validation cohorts
  5. Train and evaluate multiple machine learning models on the data
  6. Select the final model based on accuracy and develop a front-end application for it
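
Steps 4 to 6 above can be sketched roughly as follows. This is a minimal illustration, not the study's actual pipeline. The cohort here is synthetic, and the single numeric feature, the two toy models (a majority-class baseline and a one-feature threshold rule) and all function names are assumptions made for this example. The split is stratified, which is also an assumption; it keeps the comparatively rare HCC cases represented in both the training and validation cohorts.

```python
import random
from typing import Callable, Dict, List, Tuple

Row = Tuple[List[float], int]            # (features, label); label 1 = developed HCC
Predictor = Callable[[List[float]], int]
Trainer = Callable[[List[Row]], Predictor]


def make_synthetic_cohort(n_neg: int = 900, n_pos: int = 100, seed: int = 0) -> List[Row]:
    """SYNTHETIC data for illustration only; not derived from patient records."""
    rng = random.Random(seed)
    rows: List[Row] = [([rng.gauss(0.0, 1.0)], 0) for _ in range(n_neg)]
    rows += [([rng.gauss(4.0, 1.0)], 1) for _ in range(n_pos)]
    return rows


def stratified_split(rows: List[Row], train_fraction: float = 0.9,
                     seed: int = 0) -> Tuple[List[Row], List[Row]]:
    """Split each label group separately so both cohorts keep the same label ratio."""
    rng = random.Random(seed)
    train: List[Row] = []
    valid: List[Row] = []
    for label in (0, 1):
        group = [r for r in rows if r[1] == label]
        rng.shuffle(group)
        cut = round(len(group) * train_fraction)
        train += group[:cut]
        valid += group[cut:]
    return train, valid


def train_majority(train: List[Row]) -> Predictor:
    """Baseline: always predict the most common training label."""
    majority = int(2 * sum(y for _, y in train) >= len(train))
    return lambda x: majority


def train_stump(train: List[Row]) -> Predictor:
    """Toy model: threshold feature 0 midway between the two class means.

    Assumes both labels occur in the training cohort.
    """
    pos = [x[0] for x, y in train if y == 1]
    neg = [x[0] for x, y in train if y == 0]
    mean_pos, mean_neg = sum(pos) / len(pos), sum(neg) / len(neg)
    threshold = (mean_pos + mean_neg) / 2
    if mean_pos >= mean_neg:
        return lambda x: int(x[0] > threshold)
    return lambda x: int(x[0] < threshold)


def accuracy(predict: Predictor, rows: List[Row]) -> float:
    """Fraction of rows whose predicted label matches the true label."""
    return sum(predict(x) == y for x, y in rows) / len(rows)


def select_best(trainers: Dict[str, Trainer], train: List[Row],
                valid: List[Row]) -> Tuple[str, Dict[str, float]]:
    """Fit every candidate on the training cohort; rank by validation accuracy."""
    scores = {name: accuracy(fit(train), valid) for name, fit in trainers.items()}
    best = max(scores, key=lambda name: scores[name])
    return best, scores


cohort = make_synthetic_cohort()
train, valid = stratified_split(cohort)          # 90% training, 10% validation
best_name, scores = select_best(
    {"majority": train_majority, "stump": train_stump}, train, valid
)
print(best_name, scores)
```

On this synthetic cohort only 10% of rows are positive, so the majority-class baseline already reaches 0.9 validation accuracy without learning anything. Including such a baseline among the candidates gives the accuracy of the other models a reference point.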