Data/AI

September 2017 - Alexapath researchers demonstrated a convolutional neural network (CNN) to Larry Paulson, President of Qualcomm, India. The CNN is composed of four convolutional layers, two fully connected (FC) layers, one dropout layer and a sigmoid output. It uses deep learning to classify images of Pap smear cytology acquired using the ADA system as either normal or abnormal. The classifier achieved an accuracy of over 98%.

The AI was built as a pre-screening tool for cervical cancer testing clinics in low- and middle-income countries. Due to the high rate of mortality associated with cervical cancer in places like rural India, it has become common for pathologists to volunteer their time to travel to these areas and conduct free or extremely low-cost testing at camps. At the testing clinic, the ADA system is set up and a lab technician creates a mobile whole slide image (mWSI) of the Pap smear slide. The mWSI is then analyzed using the CNN app. If an mWSI is deemed abnormal, the pathologist views the slide and determines the grade of dysplasia. If the result is normal, the slide is assumed to be negative. Our solution had to be portable and able to run at the point of care without access to the internet.
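
The pre-screening step described above amounts to a simple triage rule on the classifier's sigmoid output. The sketch below is a minimal illustration, not the logic of the Alexapath app: the names `TriageResult`, `triage` and `review_queue`, the slide identifiers, and the 0.5 decision threshold are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class TriageResult:
    slide_id: str
    score: float             # sigmoid output, read as "probability abnormal"
    needs_pathologist: bool  # True when the slide should be reviewed


def triage(slide_id, abnormal_score, threshold=0.5):
    """Flag a slide for pathologist review when its score reaches the threshold.

    The 0.5 threshold is a placeholder chosen for this example.
    """
    if not 0.0 <= abnormal_score <= 1.0:
        raise ValueError("abnormal_score must lie in [0, 1]")
    return TriageResult(slide_id, abnormal_score, abnormal_score >= threshold)


def review_queue(scores, threshold=0.5):
    """Return the flagged slides, highest score first."""
    results = [triage(sid, s, threshold) for sid, s in scores.items()]
    flagged = [r for r in results if r.needs_pathologist]
    return sorted(flagged, key=lambda r: r.score, reverse=True)


if __name__ == "__main__":
    # Hypothetical scores for three slides from one testing camp.
    queue = review_queue({"slide-01": 0.08, "slide-02": 0.93, "slide-03": 0.61})
    for r in queue:
        print(r.slide_id, r.score)
```

Because a slide scored as normal is assumed negative and is not reviewed, lowering the threshold sends more slides to the pathologist in exchange for fewer missed abnormal slides.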

With support from engineers at Intrinsyc and Qualcomm, the CNN was developed using TensorFlow, and the training was accomplished using cloud-based GPUs available from Paperspace.com. The process was iterative. When we were satisfied with the accuracy (aka the day before we had to present), the final graph and weights were stored as a Protocol Buffer (.pb) file, then deployed as an Android application running on a Snapdragon 820 DragonBoard. In our case, the Protocol Buffer file is a TensorFlow model file. Protocol Buffers is a serialization format whose schema definitions are compiled into classes (for example in Python), so the stored graph and weights can be loaded, saved and accessed in a user-friendly way.

When we first started building the training set we had no Pap smear slides, and the only resource available was the DTU/Herlev Pap Smear Databases created in 2008 by Dr. Jan Jantzen at DTU (Denmark). To supplement the data, we generated our own database of images acquired using the ADA-1 system. The acquired mWSIs were segmented using the watershed method. The segmentation method was developed in MATLAB, then ported to Python so that it could be deployed on the Android device as well. The images were screened and flagged by cytologist Julie O’Keefe (South Bend, Indiana). The dataset was expanded using flipping, rotating, and vignetting.
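
The three augmentation steps named above can be sketched in a few lines. This is an illustrative version on a small grayscale image stored as nested lists, using only the standard library; it is not the pipeline used to build the data set, which would more typically rely on an array or imaging library. The function names, the restriction to 90-degree rotations, and the quadratic vignette falloff with `strength=0.5` are assumptions.

```python
import math


def flip_horizontal(img):
    """Mirror the image left to right."""
    return [row[::-1] for row in img]


def rotate_90(img):
    """Rotate the image 90 degrees clockwise."""
    return [list(col) for col in zip(*img[::-1])]


def vignette(img, strength=0.5):
    """Darken pixels toward the corners with a quadratic radial falloff.

    The centre pixel is unchanged; the farthest corner is scaled by
    (1 - strength). The falloff shape is a choice made for this sketch.
    """
    h, w = len(img), len(img[0])
    cy, cx = (h - 1) / 2, (w - 1) / 2
    max_dist = math.hypot(cy, cx) or 1.0  # avoid dividing by zero on 1x1 input
    out = []
    for y, row in enumerate(img):
        new_row = []
        for x, value in enumerate(row):
            d = math.hypot(y - cy, x - cx) / max_dist
            new_row.append(value * (1.0 - strength * d * d))
        out.append(new_row)
    return out


def augment(img):
    """Return the original, its mirror, their 90/180/270 degree rotations,
    and a vignetted copy of each: 16 images in total."""
    variants = [img, flip_horizontal(img)]
    rotated = []
    for base in variants:
        current = base
        for _ in range(3):
            current = rotate_90(current)
            rotated.append(current)
    variants = variants + rotated
    return variants + [vignette(v) for v in variants]


if __name__ == "__main__":
    cell = [[0.2, 0.4, 0.2],
            [0.4, 1.0, 0.4],
            [0.2, 0.4, 0.2]]
    print(len(augment(cell)), "images generated from one cell crop")
```

Each variant would keep the normal or abnormal label of the image it was derived from, since none of these transforms changes what the cell shows.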

The CNN took two distinct forms during our research. The CNN used for training was both complicated and cloud-based, whereas the deployed CNN was a small .pb file containing the TensorFlow graph and weights that could be used offline as an Android app.

This research was supported by the United States and India Science and Technology Endowment Foundation. It was a proof of concept intended to show that (1) the ADA system is AI ready, (2) AI can be deployed at low cost in low-resource environments with poor internet, and (3) a portable end-to-end system can be used in the field. This has become the basis of our relationship with other researchers looking for a means to deploy their AI in the field. We are currently seeking funding to launch a larger trial of the CNN.

Below is a link to the Alexapath Data Set containing reclassified images captured with Version 1 of the ADA scope. We offer this data set for free to all academic researchers and with licensing for commercial users. Join our community of open source developers and get access to the Alexapath CNN on GitHub.