Many applications in healthcare require classifying data: determining whether a test result is positive or negative, for example, or using accelerometer data to decide whether a patient is sitting, sleeping, or walking.
Most of these applications use ‘supervised learning’, which requires a set of ‘labeled’ data so the ML system can correlate the labels associated with desired outcomes with each data set. As we will see, it can be difficult or impossible for a machine learning model to find a correlation between the data provided and the labels assigned if we provide too few examples, or if the data is limited or restricted in some fashion. For this reason, not all use cases are good candidates for AI/ML, and I hope that the examples in this series will help readers develop a ‘horse sense’ for when AI may be appropriate and when building a traditional algorithm is the better choice.
Dr. Eric Topol shared this example in his excellent book “Deep Medicine” to illustrate the challenges one faces when building and validating a machine learning model for a healthcare application. It concerns AliveCor, a company that set out to detect elevated blood potassium levels from ECG data.
Although AliveCor was given more than one million ECG T-waves with corresponding lab results, the AI/ML algorithm initially could not find a reliable indicator in the data linking T-waves to potassium levels. When AliveCor then asked their hospital partner to provide full ECGs and full lab results for all patients and reran their ML models, voila, the model found a reliable correlation between the complete patient ECG and elevated potassium levels.
So why did the first attempt fail? The first data set covered outpatients only, so the lab results were likely taken at a much greater time separation from the ECG recordings, and outpatients also tend to be healthier, with less tendency toward serious kidney failure, the condition of interest. In addition, the first request was for T-wave data only and excluded the full ECG, an assumption that further blinded the ML algorithm. The lesson here is to be careful when pre-sorting or pre-filtering data: you may lose important information by assuming that only the requested subset of data is significant for your results. (Dr. Eric Topol – Deep Medicine)
In the AliveCor example, AI/ML was an excellent choice for sifting through data to find a correlation, but it succeeded only once it was presented with full sets of ECG data and lab results from admitted patients, taken close to the time of each ECG. As a result, the AliveCor ECG product can be used to detect high-potassium conditions in patients who use it.
AliveCor was able to show that their ML model produced accurate results on a large sample of patients, which enabled them to receive FDA clearance to market their product with a claim that it detects elevated potassium levels. Because this ML model functions as a classifier, it was trained on a large data set and proven to perform reliably on “test data” that was kept separate from the “training data” used to train the model.
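To make the train/test split concrete, here is a minimal, hypothetical sketch in plain NumPy (it has nothing to do with AliveCor’s actual model): a simple nearest-centroid classifier is fit on labeled synthetic data and then scored only on held-out test data it never saw during training.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic labeled data: two classes with shifted means, a stand-in
# for "features extracted from ECGs" paired with lab-result labels
n = 1000
X0 = rng.normal(loc=0.0, scale=1.0, size=(n, 5))   # label 0: normal
X1 = rng.normal(loc=1.5, scale=1.0, size=(n, 5))   # label 1: elevated
X = np.vstack([X0, X1])
y = np.array([0] * n + [1] * n)

# Shuffle, then keep 25% strictly held out as test data
idx = rng.permutation(2 * n)
split = int(0.75 * 2 * n)
train, test = idx[:split], idx[split:]

# Minimal "classifier": nearest class centroid, fit on training data only
c0 = X[train][y[train] == 0].mean(axis=0)
c1 = X[train][y[train] == 1].mean(axis=0)
pred = (np.linalg.norm(X[test] - c1, axis=1)
        < np.linalg.norm(X[test] - c0, axis=1)).astype(int)

accuracy = (pred == y[test]).mean()
print(f"Held-out accuracy: {accuracy:.2f}")
```

Scoring on held-out data is what justifies a reliability claim: accuracy measured on the training data itself would only show that the model memorized its examples.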
The AliveCor example illustrates why the use of Supervised Learning with ML classifiers can be a great use case for Machine Learning in a variety of medical product applications. To this end, a number of software tools make it possible to build Machine Learning models in the cloud that can be deployed on embedded devices, often with only minimal loss in performance.
While ML can be a useful tool, we have learned that an AI/ML model is not ideal in many situations, since it adds complexity and requires a good amount of data for training and testing. It may also be inappropriate where an algorithm can perform a sequence of well-defined tasks, such as applying a filter, an envelope, and a threshold to detect heart rate from a PPG or ECG signal. Detecting a heart rate with an ML model would require significantly more ECG data than the algorithmic approach, and it introduces much complexity into an otherwise well-defined procedure.
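To illustrate, here is a minimal sketch of that filter/envelope/threshold pipeline in plain NumPy, run on a synthetic pulse train. The function and signal are my own illustration of the idea, not production-grade signal processing:

```python
import numpy as np

def heart_rate_bpm(sig, fs):
    """Estimate heart rate via filter -> envelope -> threshold."""
    # 1. Filter: remove the baseline and smooth with a short moving average
    k = int(0.05 * fs)
    smoothed = np.convolve(sig - sig.mean(), np.ones(k) / k, mode="same")
    # 2. Envelope: full-wave rectification
    env = np.abs(smoothed)
    # 3. Threshold: count rising crossings of half the peak amplitude
    above = env > 0.5 * env.max()
    rising = np.flatnonzero(~above[:-1] & above[1:])
    if len(rising) < 2:
        return 0.0
    # Average beat-to-beat interval (samples) -> beats per minute
    return 60.0 * fs / np.mean(np.diff(rising))

# Synthetic "pulse" signal: sharp positive humps at 1.2 Hz (72 BPM)
fs = 250
t = np.arange(0, 10, 1 / fs)
sig = np.maximum(0.0, np.sin(2 * np.pi * 1.2 * t)) ** 3

print(f"{heart_rate_bpm(sig, fs):.0f} BPM")  # ~72 BPM for this signal
```

A handful of deterministic, inspectable steps like these can be verified directly against the signal, whereas an ML model doing the same job would need a labeled training corpus and a separate validation effort.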
I learned long ago that the best musicians first master their instrument and then learn when NOT to play. So it is with planning system architecture and knowing which tools are best suited for the application. As a result, it is best to investigate alternate approaches before deciding whether the AI/ML approach is best for your application.
Our team is always here to help, and in our next blog we will review the latency of an ML implementation versus the latency of an algorithm performing the same function in a tech-enabled version of a familiar everyday medical product.