Machine learning is everywhere: Is there a role for it in AFM?
I was reading the latest edition of Chemical and Engineering News, the trade journal published by the American Chemical Society, of which I am a member. It seemed like I could not escape articles on machine learning: “A better way to adiponitrile” and “AI identifies drug candidates in weeks.” In the first article, machine learning (using Matlab!) was used to optimize reaction conditions for the synthesis of adiponitrile, a widely produced chemical that is the precursor to nylon 6,6, which is used in carpeting and textiles. The second article highlighted the use of machine learning to identify small molecules for drug discovery.
Between the scientific literature and conferences, it really does feel like machine learning is the new buzzword (with artificial intelligence being an even trendier term that has permeated popular culture). Even the technical conference that I have been involved with for the past decade, Nanotech, added an artificial intelligence track this year (AIConnect). It seems that machine learning is everywhere.
There is quite an active debate about whether the emphasis on machine learning is just hype (e.g. Is Machine Learning over-hyped?) or truly transformative, but it is clear that researchers are looking to it for help in their research. So first of all, what exactly is machine learning?
In 1959, Arthur Samuel, a pioneer in the field of artificial intelligence, defined machine learning as the “field of study that gives computers the ability to learn without being explicitly programmed.” The broad idea is that a computer program learns to predict useful information once it has seen a lot of similar examples or data. Technically, “machine learning” is a subset of “artificial intelligence.” However, the two terms are often used interchangeably nowadays.
Here is an example of how a typical machine learning model for image recognition can work. Suppose you want to build a machine learning model that will tell apart images of dogs and elephants. As a first step, the model will need to see a lot of pre-labeled dog and elephant images, which make up the training data in this case. After the model is trained on the training data, it is tested on another set of images that it has not seen before. This unseen set of images is called a test or a development (dev) set. A good machine learning model performs well on both the training data and the test/dev data, which shows that it has learned patterns that generalize rather than simply memorizing the training images. In formal terminology, this type of problem is called ‘classification’ because the model classifies the input images as dogs or elephants. It is also called ‘supervised’ machine learning since it uses training data that has a labeled output: dog or elephant.
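To make the train-then-test workflow concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: instead of real images, each animal is described by two made-up numbers (a height and a weight), the data are randomly generated, and the "model" is a simple nearest-centroid classifier rather than anything you would use for real image recognition. All function names are hypothetical.

```python
import random

def make_example(label, rng):
    # Made-up features standing in for an image: (height in m, weight in tonnes).
    if label == "dog":
        return (rng.gauss(0.5, 0.1), rng.gauss(0.03, 0.01)), label
    return (rng.gauss(3.0, 0.3), rng.gauss(5.0, 0.5)), label

def train(examples):
    # "Training" here means computing the average feature vector for each label.
    sums, counts = {}, {}
    for features, label in examples:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}

def predict(centroids, features):
    # Predict the label whose average feature vector is closest to the input.
    def sq_dist(center):
        return sum((f - c) ** 2 for f, c in zip(features, center))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))

def accuracy(centroids, examples):
    correct = sum(predict(centroids, f) == label for f, label in examples)
    return correct / len(examples)

rng = random.Random(0)  # fixed seed so the synthetic data are reproducible
data = [make_example(rng.choice(["dog", "elephant"]), rng) for _ in range(200)]

# Labeled examples the model learns from vs. examples it never sees in training.
train_set, test_set = data[:150], data[150:]

model = train(train_set)
train_acc = accuracy(model, train_set)
test_acc = accuracy(model, test_set)
print("training accuracy:", train_acc)
print("test accuracy:", test_acc)
```

The point of the split is the last few lines: accuracy is reported separately on the data the model learned from and on data it never saw, and a useful model scores well on both.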
There are many types of machine learning models that you can build around a dataset. Now, how do these models work? They use principles of linear algebra, optimization, calculus, and statistics. Although the basic concepts of machine learning have been around for a long time, recent advances in computing and the availability of big data have made it a very powerful tool.
You also hear a lot about neural networks in regard to machine learning or AI. Neural networks are a type of machine learning model composed of interconnected layers and nonlinearities. They are widely used for applications such as image recognition, the relevant area for microscopy applications. They are a popular tool for predicting an output y (e.g. house prices, image classification, healthy vs. cancerous cell differentiation, etc.) from an input x (e.g. location and age of the house, RGB channels of the image, or microscopy images of cells, etc.). Their popularity lies in the fact that neural networks figure out for themselves the intermediate layers needed to map the input x to the output y. Neural networks are an example of end-to-end learning. A deep neural network model with multiple layers is referred to as “Deep Learning,” yet another popular term in the jargon of machine learning and AI.
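As a rough sketch of what "interconnected layers and nonlinearities" means, the Python snippet below passes an input x through one hidden layer with a ReLU nonlinearity and then through an output layer to produce y. The weights here are hand-picked purely for illustration (an assumption of this sketch); in a real neural network they would be learned from training data. The function names are hypothetical.

```python
def relu(values):
    # Nonlinearity: negative values become zero, positive values pass through.
    return [max(0.0, v) for v in values]

def dense(inputs, weights, biases):
    # One fully connected layer: every output is a weighted sum of all inputs plus a bias.
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]

# Hand-picked (NOT learned) weights, chosen so that the network computes
# "exactly one of the two inputs is on" (the XOR function).
HIDDEN_W = [[1.0, 1.0], [1.0, 1.0]]
HIDDEN_B = [0.0, -1.0]
OUTPUT_W = [[1.0, -2.0]]
OUTPUT_B = [0.0]

def forward(x):
    hidden = relu(dense(x, HIDDEN_W, HIDDEN_B))    # intermediate layer
    return dense(hidden, OUTPUT_W, OUTPUT_B)[0]    # output y

for x in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x, "->", forward(x))
# (0, 0) -> 0.0
# (0, 1) -> 1.0
# (1, 0) -> 1.0
# (1, 1) -> 0.0
```

No single weighted sum of the two inputs can reproduce this output pattern; it is the hidden layer combined with the nonlinearity that makes it possible. Training a neural network amounts to finding such weights automatically from labeled examples.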
So it got me thinking, does machine learning have a role to play in AFM, or more broadly in microscopy? What can be learned from AFM images that can be applied in practical ways to get new information or perhaps the same information but more quickly?
We are starting to see a tiny bit of machine learning in analyzing AFM images. For example, a group out of Tufts University published a paper last year studying AFM images of urine cells from patients with and without bladder cancer. They used machine learning to more accurately recognize parameters to differentiate the healthy cells from the cancerous cells. Another paper from researchers at Oak Ridge National Lab used machine learning to read and recognize complex molecular structures assembled on a surface.
So the answer is yes, I do believe machine learning can play a useful role in the analysis and interpretation of AFM images. In a subsequent blog post, Ishita and I will discuss some of our ideas.
Ishita Chakraborty Ph.D. (Stress Engineering Services) and Dalia Yablon Ph.D. (SurfaceChar LLC)