The College of American Pathologists (CAP) systematically reviews and accredits laboratories across the United States to ensure that they meet defined safety and patient care criteria. The CAP fulfills this role through its Laboratory Accreditation Program, which accredits each laboratory under authority granted by the Centers for Medicare and Medicaid Services (CMS) [1]. The site visits are unique in that they are performed by other practicing laboratory professionals, and they occur every 2 years to assess compliance. Checklists are made available to each laboratory before the site visit so that laboratory personnel can compile, organize, and review the required information. Compiling these data often forces laboratory administrators to spend considerable time ensuring that the data are accurate and collated to the CAP's requirements. Expertise in informatics can facilitate this work and alleviate many of the time-consuming tasks that the documentation requires, which reinforces the critical role of informatics in all pathology departments.

One such checklist item that became time consuming, both for our administrative staff and for our LIS analytics team, was the American Society of Clinical Oncology (ASCO)/CAP guideline on estrogen and progesterone receptor testing in breast cancer [2]. These guidelines allow laboratories to establish standard operating procedures that ensure the validity of low positive or negative interpretations of these biomarkers. At our institution, the Beaker LIS environment went live midway through our CAP visit cycle, which complicated the process. Fortunately, we had developed an in-house pathology search engine, PathSearch (Michael Erickson), which combines our legacy LIS data with the new Beaker data.

PathSearch allows pathologists to search for cases by keyword and extract the data into a spreadsheet that can be used for further analysis. For the CAP estrogen and progesterone receptor reporting, we built a small program to further break down the results into a suitable table. The receptor data are not stored as a discrete field but rather as free text, either in the final diagnosis or separately in an addendum. To search for receptor status only once per case, all case text, including every addendum, was merged into a single text field and then queried with regular expressions to identify receptor status. The result is recorded in a new column for each case (row), and the columns are tabulated at the end to produce the final data needed.
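The merge-then-search step described above can be sketched as follows. This is a minimal illustration in Python, not the PathSearch program itself: the record layout (`final_diagnosis`, `addenda`), the function names, the sample report text, and the regular expressions are all assumptions made for the example, and real reports would need many more phrasing variants.

```python
import re

# Hypothetical patterns (not from PathSearch): each looks for a marker name
# followed, within the same clause, by the word "positive" or "negative".
RESULT = r"(positive|negative)"
GAP = r"[^.;\n]{0,40}?"  # short stretch of text that stays inside one clause
PATTERNS = {
    "ER": re.compile(r"\b(?:estrogen receptor|ER)\b" + GAP + r"\b" + RESULT + r"\b", re.I),
    "PR": re.compile(r"\b(?:progesterone receptor|PR)\b" + GAP + r"\b" + RESULT + r"\b", re.I),
    "HER2": re.compile(r"\bHER-?2(?:/neu)?\b" + GAP + r"\b" + RESULT + r"\b", re.I),
}


def merge_case_text(case):
    """Join the final diagnosis and every addendum into one searchable string."""
    parts = [case.get("final_diagnosis", "")] + list(case.get("addenda", []))
    return "\n".join(p for p in parts if p)


def receptor_status(text):
    """Return one status per marker; None means no pattern matched."""
    status = {}
    for marker, pattern in PATTERNS.items():
        match = pattern.search(text)
        status[marker] = match.group(1).lower() if match else None
    return status


# Invented sample cases, for illustration only.
cases = [
    {"case_id": "S-0001",
     "final_diagnosis": "Invasive ductal carcinoma, grade 2.",
     "addenda": ["Estrogen receptor: Positive (95%).\nProgesterone receptor: Negative.",
                 "HER2: Negative (IHC 1+)."]},
    {"case_id": "S-0002",
     "final_diagnosis": "Ductal carcinoma in situ. ER positive, PR positive.",
     "addenda": []},
]

# One row per case, with a new column per receptor.
rows = []
for case in cases:
    row = {"case_id": case["case_id"]}
    row.update(receptor_status(merge_case_text(case)))
    rows.append(row)

for row in rows:
    print(row)
```

Because each case is searched once over its merged text, a receptor result reported only in an addendum is still found. A value of `None` flags a case for manual review rather than silently counting it as negative.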

For this specific task, the CAP required reporting on the total number of breast carcinoma or ductal carcinoma in situ (DCIS) cases for a given period, the age range and mean age of the patients, the number of cases per tumor grade, the receptor status (ER, PR, and HER2) case counts by age (pre- or post-menopausal), and the receptor status (ER, PR, and HER2) case counts by tumor grade.
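Once each case is a row with its receptor columns, the counts listed above reduce to a few grouped tallies. The sketch below uses only the Python standard library. The column names, the sample rows, and in particular the age cutoff used to separate pre- from post-menopausal patients are assumptions for illustration; the cutoff is not taken from the CAP checklist.

```python
from collections import Counter
from statistics import mean

# Assumption for illustration only: a single age cutoff stands in for
# menopausal status. It is not a value taken from the CAP checklist.
MENOPAUSE_AGE_CUTOFF = 50

# Invented rows, one per case, as produced by a receptor-status search.
rows = [
    {"age": 42, "grade": 2, "ER": "positive", "PR": "positive", "HER2": "negative"},
    {"age": 67, "grade": 1, "ER": "positive", "PR": "negative", "HER2": "negative"},
    {"age": 58, "grade": 3, "ER": "negative", "PR": "negative", "HER2": "positive"},
    {"age": 35, "grade": 3, "ER": "negative", "PR": "negative", "HER2": None},
]


def age_group(age):
    """Label a patient as 'pre' or 'post' using the assumed cutoff."""
    return "pre" if age < MENOPAUSE_AGE_CUTOFF else "post"


def summarize(rows):
    """Tabulate totals, age statistics, and receptor counts by group."""
    ages = [r["age"] for r in rows]
    summary = {
        "total_cases": len(rows),
        "age_range": (min(ages), max(ages)),
        "mean_age": mean(ages),
        "cases_by_grade": Counter(r["grade"] for r in rows),
        "status_by_age_group": Counter(),
        "status_by_grade": Counter(),
    }
    for r in rows:
        for marker in ("ER", "PR", "HER2"):
            # Keep unmatched cases visible instead of dropping them.
            status = r[marker] or "not found"
            summary["status_by_age_group"][(marker, age_group(r["age"]), status)] += 1
            summary["status_by_grade"][(marker, r["grade"], status)] += 1
    return summary


summary = summarize(rows)
for key, value in summary.items():
    print(key, value)
```

Keeping a separate "not found" tally makes it easy to see how many cases the text search missed and therefore need manual verification.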

By empowering our physicians to query the data themselves, we have shortened the time needed to generate a report and reduced the number of people needed for such a project. There are, of course, limitations to keep in mind. Regular expressions are a form of advanced text search and depend on an understanding of every way in which the queried data may have been written. Free-text search is often difficult, owing to the complexity of natural language and to differences in writing style among physicians. Individual findings, such as the receptor status for a case, can instead be hard coded as discrete data points, which simplifies data analysis. Without discrete fields, a free-text query may not be comprehensive, and its results may require further verification. Ultimately, having a specialist in informatics within the laboratory allows the department to take control of its data and use the data seamlessly for its regulatory and business needs. However, the records that will be needed for regulatory parameters cannot always be anticipated, because biomarkers and prognostic status evolve. These expected changes require the pathology informatics professional to hold a key role in the administrative aspects of the department in order to facilitate proper data handling during the implementation and maintenance of any LIS. The informatics professional also has a role in educating other pathologists about the importance of adhering to these data standards in their individual practice, which can require considerable buy-in.


  1. CAP Laboratory-Improvement-Program, accessed October 21, 2020
  2. Estrogen and Progesterone Receptor Testing in Breast Cancer Guideline Update, accessed October 21, 2020
  3. A cross-source, system-agnostic solution for clinical data review, accessed October 26, 2020
  4. Extracting Structured Information from Free Text Pathology Reports, accessed October 26, 2020
  5. Analysis of hormone receptor status in primary and recurrent breast cancer via data mining pathology reports, accessed October 26, 2020
  6. Development of a Novel Tool for the Retrieval and Analysis of Hormone Receptor Expression Characteristics in Metastatic Breast Cancer via Data Mining on Pathology Reports, accessed October 26, 2020