Data Sciences Resources
UC Davis Clinical Data Sources
Several sources of patient data drawn from the UC Davis EHR. The DataPATH option, a de-identified OMOP database, is available to all UC Davis staff and researchers without IRB approval.
Spoke Knowledge Graph
A knowledge network that brings together information from basic molecular research, clinical insights, environmental data, and other sources.
UCSF Information Commons
A platform for data-driven research that provides access to shared multi-modal patient- and population-level biomedical data from the UCSF EHR, along with a wide range of exploration and analysis tools, via secure computational environments.
UC Health Data Warehouse (UCHDW) Data Discovery
A central repository that combines the standardized OMOP instances from the UC academic medical centers, including UCLA Health, UCSD Health, UCSF Health, UCI Health, and UCR Health.
Human Protein Atlas
An atlas that aims to map all human proteins in cells, tissues, and organs by integrating various omics technologies, including antibody-based imaging, mass spectrometry-based proteomics, transcriptomics, and systems biology.
KNIME
An integrated, low-code, open-source platform for data science work, including AI/ML workflows. It also has a large community that contributes extensions and components.
Streamlit
A fast, high-level way to build and share data apps in Python. Apps are easy to extend and share, and can be deployed to the cloud for free.
TensorFlow
A free and open-source Python library for ML and AI, used for building deep neural networks. Developed and supported by the Google Brain team under Alphabet.
PyTorch
A machine learning library based on the Torch library, widely used for AI/ML applications such as computer vision. Originally developed by Meta AI, it is now part of the Linux Foundation umbrella.
Hugging Face
A repository of open-source models from many organizations, each usable under the terms of its license. It also provides space for hosting and deploying custom models in the cloud.
scikit-learn
A free and open-source Python library for traditional ML tasks such as classification and regression.
Relational Databases: Microsoft SQL Server, Oracle and PostgreSQL
Database systems for storing relational, structured data.
Learn more: Microsoft SQL Server, Oracle and PostgreSQL
NoSQL Database: MongoDB
A cross-platform, document-oriented database for storing unstructured data. It is well suited to storing large volumes of data but does not natively enforce relational database properties such as foreign-key constraints.
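
The difference between the relational and document-oriented entries above can be illustrated with a small sketch. It rests on several assumptions: SQLite from the Python standard library stands in for a relational server such as PostgreSQL, a plain Python dictionary stands in for a document as a document store might hold it (no MongoDB API is used), and the table and field names are hypothetical.

```python
import json
import sqlite3

# Relational side: SQLite stands in for a relational server (assumption).
# Table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.execute("CREATE TABLE patient (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE visit ("
    " id INTEGER PRIMARY KEY,"
    " patient_id INTEGER NOT NULL REFERENCES patient(id),"
    " reason TEXT NOT NULL)"
)
conn.execute("INSERT INTO patient (id, name) VALUES (1, 'Example Patient')")
conn.execute("INSERT INTO visit (patient_id, reason) VALUES (1, 'checkup')")

# A join reassembles related rows that live in separate tables.
joined_rows = conn.execute(
    "SELECT patient.name, visit.reason "
    "FROM visit JOIN patient ON patient.id = visit.patient_id"
).fetchall()

# The schema rejects a visit that refers to a patient who does not exist.
try:
    conn.execute("INSERT INTO visit (patient_id, reason) VALUES (999, 'orphan')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

# Document side: one self-contained record with nested data and no fixed
# schema. A plain dict serialized to JSON stands in for a stored document.
patient_document = {
    "name": "Example Patient",
    "visits": [{"reason": "checkup"}],
    "notes": "free-text field that other documents may omit",
}
serialized = json.dumps(patient_document)

print(joined_rows, orphan_rejected, serialized)
```

In the relational half, the database itself guards the link between visits and patients. In the document half, related data travels together in one record, and any consistency rules are left to the application.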