Machine learning 3: the importance of subject matter expertise
One of the doomsday prophecies often heard in popular culture and media is that ML is going to replace our human experts. It is easy to get that impression because there is a certain, I would even say narrow, set of areas in which ML performs very well. However, it has significant weaknesses that prevent widespread application to all tasks. Human experts in physics, chemistry, engineering, medicine, biology, materials science, mathematics, etc. are at the core of the proper development, continuous monitoring, and feedback needed to build and implement such algorithms. There is a continual need for human intervention beyond the initial algorithm development: to check on its success, to oversee its application to broader data sets, and to evaluate potential biases. This topic of bias is becoming increasingly recognized in the AI community – for example, there could be biases in the training set, or there could be algorithms that are misguided by a meaningless metric (good examples are pointed out in a recent NYT article). ML is a tool for experts, not a replacement for experts.
While data scientists can build an algorithm or a model, a subject matter expert (SME) is needed to identify the problem, understand its underlying intricacies, and make sure that the model works properly and is applied correctly to whatever dataset is of interest. Let’s look at a couple of examples related to image recognition, continuing the theme of our machine learning blogs.
An application of image recognition attracting growing interest is in the field of radiology, to “read” films. Let’s take the example of using a computer to “diagnose” a broken leg from an x-ray (by the way, in September the FDA approved a GE Healthcare x-ray system that diagnoses a pneumothorax – collapsed lung – through AI algorithms). First, the SME needs to identify the initial training set of images: which images display broken bones and which do not? What about gray areas where there could be a fracture – will those be included in the training set, and how will they be classified? While it is fairly straightforward to build an algorithm for a computer to diagnose a break in a bone given a set of images of unbroken and broken bones, it is far more challenging to apply the learning broadly. For example, what if in the image to be analyzed the leg is positioned at a different angle than in any of the images in the training set? Are there other body parts included in the x-ray that were not in the training set? What if there is some anomaly in the image? The NYT article points to a bias based on hospital center, which has nothing to do with the task at hand.
ML algorithms really struggle with such variations. While human experts are very good at generalizing their learnings from a certain set of images, and have no problem extrapolating to changes like different angles or x-rays taken at different hospitals, a computer is not so capable. Again, this is because the algorithm is fundamentally solving a pattern recognition problem, while a human being is able to draw on his or her vast experience reading x-rays in different environments and a fundamental understanding of what the x-ray means.
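To make the point concrete, here is a toy sketch (entirely made up, not from any radiology system) of why raw pattern matching is brittle. The “images” are hypothetical 4x4 binary grids, and the comparison is a naive pixel-by-pixel distance, the crudest form of template matching:

```python
def pixel_distance(a, b):
    """Count the pixels that differ between two same-sized grids."""
    return sum(
        a[r][c] != b[r][c]
        for r in range(len(a))
        for c in range(len(a[0]))
    )


def rotate90(grid):
    """Rotate a square grid 90 degrees clockwise."""
    n = len(grid)
    return [[grid[n - 1 - c][r] for c in range(n)] for r in range(n)]


# A diagonal "fracture line" in a hypothetical 4x4 image.
image = [
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
]

rotated = rotate90(image)

# Same content, different orientation -- yet the pixel distance is large,
# so a naive matcher trained only on one orientation fails on the other.
print(pixel_distance(image, image))    # -> 0 (identical)
print(pixel_distance(image, rotated))  # -> 8 (every "fracture" pixel moved)
```

A human reader instantly recognizes the rotated image as the same fracture; the raw-pixel comparison treats it as maximally different.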
Another great example demonstrating the benefit of SME input was given by Yong Han of Lawrence Livermore in his MRS talk on correlating bulk mechanical measurements with SEM data. His team ran an ML study comparing two ways of correlating parameters in an SEM image with bulk properties: a “kitchen sink” approach (throw all the image analysis parameters at the problem) and an SME-guided approach, in which the SME chose parameters believed to correlate with the bulk properties (e.g. crystal size and roughness). The SME-guided approach performed better than the “kitchen sink” approach.
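The underlying statistics can be sketched in a few lines. This is a minimal illustration with synthetic data (not the Lawrence Livermore study’s), assuming a plain least-squares model: when samples are scarce, a model handed every parameter fits noise in the irrelevant ones, while the SME-restricted model does not.

```python
import numpy as np

rng = np.random.default_rng(0)
n_train, n_test, n_features = 15, 200, 10
sme_features = [0, 1]  # suppose the SME knows only these two matter

kitchen_err, sme_err = [], []
for _ in range(30):  # average over trials for a stable comparison
    X = rng.normal(size=(n_train + n_test, n_features))
    # Ground truth depends only on the two "SME" features, plus noise.
    y = 2.0 * X[:, 0] - 1.5 * X[:, 1] + rng.normal(size=n_train + n_test)
    Xtr, Xte = X[:n_train], X[n_train:]
    ytr, yte = y[:n_train], y[n_train:]

    def mse(cols):
        # Ordinary least squares restricted to the given feature columns.
        coef, *_ = np.linalg.lstsq(Xtr[:, cols], ytr, rcond=None)
        return float(np.mean((Xte[:, cols] @ coef - yte) ** 2))

    kitchen_err.append(mse(list(range(n_features))))  # all 10 features
    sme_err.append(mse(sme_features))                 # SME-chosen 2

print(f"kitchen sink test MSE: {np.mean(kitchen_err):.2f}")
print(f"SME-guided test MSE:   {np.mean(sme_err):.2f}")
```

With only 15 training samples, the 10-coefficient kitchen-sink fit consistently shows higher held-out error than the 2-coefficient SME fit, mirroring the qualitative result of the talk.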
Let’s take a look at an AFM example. A data scientist wants to build an ML model that distinguishes cancerous from non-cancerous cells in AFM images, as outlined in the 2018 PNAS paper by Igor Sokolov et al. The first inclination would be to look at the images and try to discern differences between the healthy and cancerous cells, and then perhaps develop an ML algorithm to discern those differences.
AFM image of a cell. From PNAS (Sokolov et al.)
But there were certain challenges to such an approach, including huge data sets and uncertainty in some of the parameters. So the SMEs in this case decided to try a different route involving the surface properties of the cells. They analyzed 44 different surface parameters to see if there were any patterns in those parameters associated with healthy vs. cancerous cells. Among the most important parameters they found were the “valley fluid retention index” and “surface area ratio”. What was interesting was that even with all 44 parameters, the height channel – our typical go-to channel and the one we always look at first – had no meaningful parameters for correlation. It was a different channel that they collected, the adhesion channel, that proved the most helpful. In this case, the SMEs provided invaluable guidance: not only which channels to collect and examine, but also the insight that perhaps you need to look beyond the “superficial” image.
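The screening idea can be sketched as follows. This is a toy version with synthetic numbers, not the paper’s data: 44 stand-in parameters, of which only two (at arbitrary indices 5 and 17) are built to separate the classes, mimicking the role the paper found for parameters such as the valley fluid retention index and surface area ratio.

```python
import numpy as np

rng = np.random.default_rng(1)
n_cells, n_params = 60, 44
labels = np.array([0] * 30 + [1] * 30)  # 0 = healthy, 1 = cancerous

# All 44 parameters start as uninformative noise...
params = rng.normal(size=(n_cells, n_params))
# ...then two parameters (arbitrary stand-ins) get a class-dependent shift.
params[labels == 1, 5] += 2.0
params[labels == 1, 17] += 2.0

# Score each parameter by its absolute correlation with the class label
# (a point-biserial correlation, since the label is binary), then rank.
scores = np.array([
    abs(np.corrcoef(params[:, j], labels)[0, 1]) for j in range(n_params)
])
ranking = np.argsort(scores)[::-1]
print("top parameters by |correlation|:", ranking[:2].tolist())
```

The two informative parameters surface at the top of the ranking, which is the kind of pattern a computer can pull out of 44 dimensions far more reliably than a person eyeballing images.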
So this AFM bladder cell study is an excellent example of the power and utility of ML: not as something that will replace us experts, but rather as a tool to enhance our capabilities. Identifying a pattern among 44 surface parameters of the adhesion channel and then correlating it with a particular outcome is not something we could have figured out without the power of computation and ML. But the ML algorithm is useless without being fed the proper dataset (the adhesion channel) and being told where to look (the surface parameters).
Subject matter expertise always needs to be incorporated into any machine learning model for AFM or any image recognition. This is especially true since the field of atomic force microscopy is itself an active research area with many remaining unknowns. Again, ML is a tool to enhance expertise, not replace it.
Dalia Yablon and Ishita Chakraborty