Machine learning approach automates pathologists' work to identify disease markers
Researchers at UC Davis Health and UC San Francisco have found a way to teach a computer to detect one of the hallmarks of Alzheimer’s disease in human brain tissue. The research delivers a proof of concept for a machine-learning approach to distinguishing critical markers of the disease.
Amyloid plaques are clumps of protein fragments in the brains of people with Alzheimer's disease that destroy nerve cell connections. Much like the way Facebook recognizes faces based on images, the machine learning tool can “see” if a sample of brain tissue has one type of amyloid plaque or another, and do it very quickly.
The findings, published May 15 in Nature Communications, suggest that machine learning can add to the expertise and analysis of a neuropathologist. The tool lets pathologists analyze thousands of times more data and ask new questions that would otherwise be impossible given the limited data processing capabilities of even highly trained human experts.
“We still need the pathologist,” said Brittany N. Dugger, an assistant professor in the UC Davis Department of Pathology and Laboratory Medicine and lead author of the study. “This is a tool, like a keyboard is for writing. As keyboards help writing workflows, digital pathology paired with machine learning helps with neuropathology workflows.”
Teaching a computer to see Alzheimer’s disease like a pathologist
In this study, Dugger partnered with Michael J. Keiser, an assistant professor in UCSF’s Institute for Neurodegenerative Diseases and Department of Pharmaceutical Chemistry, to determine if they could teach a computer to automate the time-consuming process of identifying and analyzing tiny amyloid plaques of various types in large slices of autopsied human brain tissue. For this job, Keiser and his team designed a “convolutional neural network” (CNN), a computer program designed to recognize patterns based on thousands of human-labeled examples.
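As a rough illustration of the "pattern recognition" step, the sketch below slides a single 3x3 filter over a tiny image patch, which is the basic operation a convolutional layer repeats many times. Everything here is an assumption made for illustration: the 5x5 patch, the all-ones filter and the function name `conv2d` are hypothetical and are not taken from the study. In a real CNN the filter weights are learned from the labeled examples rather than written by hand.

```python
# Toy illustration only: one hand-written filter, not the study's network.

def conv2d(image, kernel):
    """Slide `kernel` over every position where it fits fully inside `image`
    and return the map of weighted sums (the filter's response)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    response = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            total = 0.0
            for di in range(kh):
                for dj in range(kw):
                    total += image[i + di][j + dj] * kernel[di][dj]
            row.append(total)
        response.append(row)
    return response

# Hypothetical 5x5 grayscale patch with a bright square "blob" in the middle.
patch = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]

# Hypothetical filter that responds most strongly where a 3x3 area is bright.
blob_filter = [
    [1.0, 1.0, 1.0],
    [1.0, 1.0, 1.0],
    [1.0, 1.0, 1.0],
]

response = conv2d(patch, blob_filter)
for row in response:
    print(row)
# The strongest response (9.0) sits at the center, where the filter
# overlaps the whole blob.
```

A trained network stacks many such filters in layers, so that later layers can respond to larger and more specific shapes than any single filter could.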
To create enough training examples to teach the CNN algorithm how Dugger analyzes brain tissue, the researchers devised a method that allowed her to rapidly label tens of thousands of images from a collection of half a million taken from 43 healthy and diseased brain samples.
Like an online dating service that allows users to swipe left or right to label someone’s photo “hot” or “not,” they developed a web platform that allowed Dugger to look at highly zoomed-in regions of potential plaques and quickly label what she saw there. This platform — which researchers called “blob or not” — allowed Dugger to label more than 70,000 “blobs,” or plaque candidates, at a rate of about 2,000 images per hour.
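The labeling figures above imply a fairly modest time investment, which a quick back-of-the-envelope calculation makes concrete. The two inputs come from the article; because it says "more than 70,000" and "about 2,000", the results are approximate, and the variable names are only illustrative.

```python
# Back-of-the-envelope estimate from the figures quoted in the article.
labeled_blobs = 70_000   # "more than 70,000", so treat this as a lower bound
rate_per_hour = 2_000    # "about 2,000 images per hour"

hours = labeled_blobs / rate_per_hour
seconds_per_image = 3600 / rate_per_hour

print(f"roughly {hours:.0f} hours of labeling")
print(f"roughly {seconds_per_image:.1f} seconds per image")
```

Under these assumptions the full labeled set corresponds to about 35 hours of expert time, at a little under two seconds per candidate.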
Machine learning tool accurately detects disease markers
The UCSF team used this database of labeled example images to train their CNN machine-learning algorithm to identify different types of brain changes seen in Alzheimer’s disease. That includes discriminating between so-called cored and diffuse plaques and identifying abnormalities in blood vessels. The researchers showed that their algorithm could process an entire slide of a whole-brain slice with 98.7% accuracy, with speed limited only by the number of computer processors they used. (In the current study they used a single graphics card like those used by home gamers.)
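An accuracy figure of this kind is typically obtained by comparing the algorithm's labels with the expert's labels on images it has not seen during training. The sketch below shows that comparison on a handful of made-up labels. The article does not describe how the 98.7% figure was computed, so the label names (`cored`, `diffuse`, `vascular`, `negative`), the toy data and both helper functions are assumptions for illustration, not the study's evaluation code.

```python
from collections import Counter

def accuracy(expert, predicted):
    """Fraction of items where the predicted label matches the expert label."""
    if not expert or len(expert) != len(predicted):
        raise ValueError("label lists must be non-empty and of equal length")
    correct = sum(e == p for e, p in zip(expert, predicted))
    return correct / len(expert)

def per_class_recall(expert, predicted):
    """For each expert label, the fraction of its items the model got right."""
    totals = Counter(expert)
    hits = Counter(e for e, p in zip(expert, predicted) if e == p)
    return {label: hits[label] / totals[label] for label in totals}

# Made-up labels for eight hypothetical plaque candidates.
expert = ["cored", "diffuse", "diffuse", "vascular",
          "cored", "diffuse", "negative", "diffuse"]
predicted = ["cored", "diffuse", "cored", "vascular",
             "cored", "diffuse", "negative", "diffuse"]

print(accuracy(expert, predicted))          # 0.875 (7 of 8 match)
print(per_class_recall(expert, predicted))  # diffuse is 0.75, the rest 1.0
```

Reporting per-class results alongside overall accuracy matters when some classes are rare, since a model can score well overall while missing most examples of an uncommon plaque type.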
The team then performed rigorous tests of the computer’s identification skills to make sure its analysis was biologically valid.
“It’s notoriously hard to know what a machine-learning algorithm is actually doing under the hood, but we can open the black box and ask it to show us why it made its predictions,” Keiser explained. Keiser emphasized that the machine learning tool is no better at identifying plaques than Dugger, who trained the computer to find them in the first place.
“But it’s tireless and scalable,” he said. “It’s a co-pilot that extends the scope of what we can accomplish and lets us ask questions we never would have attempted manually. For example, we can look for rare plaques in unexpected places that could give us important clues about the course of the disease.”
Study data and algorithms for Alzheimer’s disease pathology tool available online
To promote use of the tool, the researchers have made it and the study data publicly available online. This has already generated interactions with other researchers who have evaluated the data and the algorithms in their own labs. In the future, the researchers hope that such algorithms will become a standard part of neuropathology research, trained to help scientists analyze vast amounts of data, tirelessly seeking out patterns that could unlock new insights into causes and potential treatments for the disease.
“If we can better characterize what we are seeing, this could provide further insights into the diversity of dementia,” Dugger said. “It opens the door to precision medicine for dementias.” She added, “These projects are phenomenal examples of cross-disciplinary translational science: neuropathologists, a statistician, a clinician, and engineers coming together, forming a dialogue and working together to solve a problem.”
Other study authors were Charles DeCarli, Lee-Way Jin and Laurel Beckett of UC Davis; Ziqi Tang of UCSF and Tsinghua University in Beijing, China; and Kangway V. Chuang of UCSF. The study was funded by NIH grant P30 AG010129, a Paul G. Allen Family Foundation Distinguished Investigator Award and the China Scholarship Council.
The authors declare no conflicting interests.