Adopting AutoML: Let's do a reality check

There is no cure for Alzheimer’s. But what if we could find a way to detect it early? The question intrigued the scientists at Imagia, who then used Google’s automated machine learning (AutoML) to reduce test processing time from 16 hours to one hour. PayPal experienced similar benefits. In 2018, with H2O’s AutoML, PayPal increased its fraud detection model accuracy by 6% and made the model development process six times faster.

Success stories like these have inspired around 61% of decision-makers in companies using artificial intelligence (AI) to adopt AutoML. Uptake is likely to grow because AutoML can offset much of the shortage of data scientists. Its ability to improve scalability and productivity should also attract customers.

But does this mean that adopting AutoML has become a must-do? That is the question most businesses are facing right now, and examining real-life cases can help answer it.

As a senior software engineer, I have worked with several startups where AI played a pivotal role. I have seen the pros, the cons, and the business impact firsthand. But before going into the use cases, let’s first establish what AutoML is, its present status, and what it can and cannot do.

What is AutoML?

AutoML (automated machine learning) is a system's ability to automatically select a suitable model and set its parameters so that it delivers the best possible result. I will focus only on deep neural networks in this article.

In deep neural networks, finding the right architecture is always a major challenge. By architecture, I mean basic building blocks (for example, for image recognition, basic building blocks would be 3X3 max pooling, 3X1 convolution, and so on) and the interconnection between them for multiple hidden layers. 

Neural architecture search (NAS) is a technique for automating the design of deep neural networks. It is used to design networks that are on par with or can outperform hand-designed architectures. But we need to train vast numbers of candidate networks as part of the search process to come up with the right architecture, which is time-consuming.
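To make the cost of the search concrete, here is a minimal sketch of the simplest possible search strategy: random search over a tiny space of building blocks. Everything in it is an assumption for illustration: the block list, the number of layers, and above all proxy_score, which stands in for the expensive step of training each candidate and measuring its validation accuracy. It is not how any particular platform implements NAS.

```python
import random

# Hypothetical search space: each hidden layer picks one building block.
BLOCKS = ["3x3 max pooling", "3x1 convolution", "3x3 convolution", "identity"]


def sample_architecture(rng, num_layers):
    """Draw one candidate: a list with one block choice per hidden layer."""
    return [rng.choice(BLOCKS) for _ in range(num_layers)]


def proxy_score(architecture):
    """Stand-in for 'train the candidate and measure validation accuracy'.

    Assumption for illustration only: each convolution adds one point, a
    single pooling layer adds one point, and extra pooling layers cost half
    a point each. A real search would train every candidate here, which is
    why NAS is so time-consuming.
    """
    convs = sum("convolution" in block for block in architecture)
    pools = architecture.count("3x3 max pooling")
    return convs + (1 if pools == 1 else 0) - 0.5 * max(0, pools - 1)


def random_search(num_candidates, num_layers, seed=0):
    """Evaluate num_candidates random architectures and keep the best one."""
    rng = random.Random(seed)
    best_arch, best_score = None, float("-inf")
    for _ in range(num_candidates):
        candidate = sample_architecture(rng, num_layers)
        score = proxy_score(candidate)
        if score > best_score:
            best_arch, best_score = candidate, score
    return best_arch, best_score


best_arch, best_score = random_search(num_candidates=50, num_layers=4)
print(best_arch, best_score)
```

Even in this toy setting there are 4^4 = 256 possible four-layer networks. With realistic search spaces and real training runs in place of proxy_score, the number of candidates to evaluate is what drives the time and cost.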

The current state of available platforms

NAS plays a pivotal role in the AutoML frameworks of both Amazon Web Services (AWS) and Google Cloud Platform (GCP). But AutoML is still at an early stage, and these platforms are evolving. Let us look at these two well-known AutoML offerings.

[Figure: Automated machine learning flow. Image by author.]

GCP AutoML

GCP AutoML has NAS and transfer learning at its core. NAS searches for optimal architecture from a pool of architectures based on previous training outcomes. Initially, reinforcement learning algorithms were used for architecture search.

However, these algorithms tend to be computationally expensive due to the large search space. Recently, there has been a shift towards gradient-based methods, which have shown promising results. But what happens inside GCP AutoML is still not clear from the outside, so it is more of a black-box solution.

AWS Autopilot

The main concept of AWS Autopilot is to provide a configurable AutoML solution. Every detail about the machine learning cycle is exposed, from data transformation to model training and hyper-parameter tuning. In contrast to GCP AutoML, AWS Autopilot is a white-box solution.

AWS Autopilot uses different strategies for data and machine learning (ML) pipelines. Some of these strategies are based on if-else rules suggested by domain experts; others depend on choosing the right hyper-parameters (for example, learning rate, over-fitting parameter, embedding size) for the pipeline.
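As an illustration of what "choosing the right hyper-parameters" means in practice, the sketch below runs a plain grid search over the learning rate of a tiny linear model. It is a minimal example under stated assumptions, not Autopilot's actual tuning algorithm: the toy data, the candidate learning rates, and all function names are made up for this article.

```python
def train_linear_model(xs, ys, learning_rate, epochs=200):
    """Fit y = w * x + b by batch gradient descent; return (w, b)."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = sum((w * x + b - y) * x for x, y in zip(xs, ys)) * 2 / n
        grad_b = sum((w * x + b - y) for x, y in zip(xs, ys)) * 2 / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


def mean_squared_error(xs, ys, w, b):
    """Average squared prediction error of the line y = w * x + b."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def grid_search(train, valid, learning_rates):
    """Try every learning rate; keep the one with the lowest validation error."""
    results = {}
    for lr in learning_rates:
        w, b = train_linear_model(*train, learning_rate=lr)
        results[lr] = mean_squared_error(*valid, w, b)
    best_lr = min(results, key=results.get)
    return best_lr, results


# Toy data generated from y = 3x + 1 (made up for this sketch).
train = ([0.0, 1.0, 2.0, 3.0], [1.0, 4.0, 7.0, 10.0])
valid = ([0.5, 1.5, 2.5], [2.5, 5.5, 8.5])

best_lr, results = grid_search(train, valid, learning_rates=[0.0001, 0.01, 0.1])
print(best_lr, results)
```

The two smaller learning rates have not converged after 200 epochs, so the largest candidate wins here. Every extra hyper-parameter multiplies the number of training runs, which is one reason automated tuning gets expensive quickly.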

What AutoML can do and what it cannot do

Some people call AutoML the holy grail of AI/ML, a view that I do not share. The table below summarizes what it can and cannot do:

	What It Can Do	What It Cannot Do
Data Transformation	Takes care of pre-processing and data transformation. Identifies numerical and categorical variables and handles them.	Can make mistakes, such as treating a low-cardinality numerical feature as categorical. You cannot simply dump data in and assume it will work without hiccups.
Feature Extraction	Extracts features to some extent.	Domain-dependent models still need manual feature extraction. Capturing domain knowledge remains a problem.
Modeling and Tuning	Identifies the best hyper-parameters. Can search for the best architecture.	Cannot work with a small amount of data because there is a minimum number of data points. It is overkill for simple problems where linear regression or another basic model is enough. It is time-consuming and can incur high costs, both for simple problems and for problems with a large amount of data.
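The low-cardinality mistake in the data transformation row is easy to reproduce. The sketch below uses a naive, hypothetical type-inference rule (few distinct values means categorical); real AutoML products use their own logic, which is often undocumented, so treat this only as an illustration of the failure mode. The column names and values are made up.

```python
def infer_column_type(values, max_categories=10):
    """Naive heuristic: few distinct values means 'categorical'.

    Hypothetical rule for illustration; not taken from any real product.
    """
    if any(not isinstance(v, (int, float)) for v in values):
        return "categorical"
    if len(set(values)) <= max_categories:
        return "categorical"
    return "numerical"


columns = {
    "city": ["Pune", "Mumbai", "Pune", "Delhi"] * 5,
    "area_sqft": [float(450 + 37 * i) for i in range(20)],
    "bedrooms": [1, 2, 3, 2, 1, 4, 2, 3, 1, 2] * 2,  # numeric, low cardinality
}

inferred = {name: infer_column_type(vals) for name, vals in columns.items()}
# "bedrooms" is ordered and numeric, yet the heuristic labels it categorical.

# A manual override restores the intended type before training.
overrides = {"bedrooms": "numerical"}
schema = {**inferred, **overrides}
print(inferred)
print(schema)
```

Reviewing the inferred schema and overriding it where needed is a cheap check to run before paying for a full AutoML training job.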

Let me share some experiential insights, with real-life examples, to elaborate on where AutoML was the right fit and where it did not work.

Use case 1: Property rent prediction

We had to develop a tool to predict property rent, but AutoML did not perform well because the property market depends heavily on localized (state-wise) information. Our attempt to train one model per region also failed because each region had too little data (fewer than 500 data points) to learn an architecture. A simple XGBoost-style model with a reduced feature set outperformed AutoML.

In states where there was enough data for AutoML, the AutoML prediction model fared better than the in-house solution.
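One way to act on this experience is to decide per region which approach to use before training anything. The sketch below is a minimal illustration, assuming a cut-off of 500 rows that simply echoes the figure mentioned above; actual minimums differ by platform, and the region names and row counts are made up.

```python
from collections import Counter

# Assumed cut-off, echoing the 500-data-point figure from this use case.
MIN_ROWS_FOR_AUTOML = 500


def choose_approach(region_of_each_row, min_rows=MIN_ROWS_FOR_AUTOML):
    """Map each region to 'automl' or 'simple_model' based on its row count."""
    counts = Counter(region_of_each_row)
    return {
        region: "automl" if n >= min_rows else "simple_model"
        for region, n in counts.items()
    }


# Made-up row counts for three hypothetical regions.
rows = ["state_a"] * 1200 + ["state_b"] * 480 + ["state_c"] * 500
plan = choose_approach(rows)
print(plan)
```

Regions routed to simple_model would get something like the gradient-boosted model with reduced features described above, while the rest can be handed to AutoML.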

Use case 2: TV rating prediction

The same thing happened with TV rating prediction. AutoML failed to capture daypart-based behavior across multiple channels. For example, NICK is aimed at children, who mostly watch in the afternoon, while MTV is watched mostly by grown-ups and peaks in the evening. These are simple patterns, but AutoML was not able to capture multiple patterns from multiple categories in one model.
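A data scientist would typically encode this kind of knowledge as an explicit feature. The sketch below shows one hedged way to do it: bucket the hour into a daypart and combine it with the channel into a single interaction key. The daypart boundaries, the function names, and the ratings are all assumptions made up for illustration, not the features we used in the project.

```python
def daypart(hour):
    """Bucket an hour (0-23) into a coarse daypart. Boundaries are assumptions."""
    if 6 <= hour < 12:
        return "morning"
    if 12 <= hour < 18:
        return "afternoon"
    if 18 <= hour < 23:
        return "evening"
    return "night"


def channel_daypart_feature(channel, hour):
    """Interaction key so a model can learn one pattern per channel and daypart."""
    return f"{channel}|{daypart(hour)}"


def mean_rating_by_key(records):
    """Average rating per channel-daypart key; a simple baseline lookup table."""
    totals = {}
    for channel, hour, rating in records:
        key = channel_daypart_feature(channel, hour)
        total, count = totals.get(key, (0.0, 0))
        totals[key] = (total + rating, count + 1)
    return {key: total / count for key, (total, count) in totals.items()}


# Made-up ratings: (channel, hour, rating).
records = [
    ("NICK", 15, 4.0), ("NICK", 16, 5.0), ("NICK", 20, 1.0),
    ("MTV", 15, 1.0), ("MTV", 20, 4.0), ("MTV", 21, 6.0),
]
baseline = mean_rating_by_key(records)
print(baseline)
```

With the interaction key in place, even this lookup-table baseline separates the afternoon peak on NICK from the evening peak on MTV, which is exactly the pattern a single generic model struggled to learn.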

Will AutoML really replace data scientists?

From my experience in the field, I would say "No." AutoML cannot directly replace data scientists. But it can be a useful tool for data scientists.

Where we should be using AutoML

AutoML is more likely to work well without human intervention when the problem is well studied in the literature. For tasks such as image classification or detection of generic objects, you can use AutoML because its models are already tuned on a large amount of data. It can also help you build quick proofs of concept (PoCs), which may or may not give reasonable performance.

Where we should not be using AutoML

Some real-world ML projects need only simple feature engineering and a simple linear regression model. AutoML can cost more in those cases because it does not support domain-driven feature engineering. Internally it uses a deep neural network, which learns some features on its own, but that requires a lot of data. It is also expensive compared with the basic approach, and the model it selects may still need improvement.
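To show how little is needed in such cases, here is a minimal sketch of one engineered feature feeding a closed-form linear regression. The listings are made up, and the rent values were generated as 2 * length * width + 5 so that the right feature (floor area) makes the problem exactly linear; real data would of course be noisier.

```python
def fit_simple_linear_regression(xs, ys):
    """Closed-form least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Made-up listings: (length_m, width_m, rent). Rent was generated as
# 2 * length * width + 5, so the useful signal is the area, not either side.
listings = [(4, 5, 45), (6, 5, 65), (8, 10, 165), (10, 3, 65), (7, 7, 103)]

# Engineered feature: floor area.
areas = [length * width for length, width, _ in listings]
rents = [rent for _, _, rent in listings]

slope, intercept = fit_simple_linear_regression(areas, rents)
print(slope, intercept)
```

The fit needs no search, no cloud upload, and only five rows, because the domain knowledge sits in the feature rather than in the model.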

AutoML is also likely to fail on problems that are highly domain-specific and require domain knowledge. The table below compares AutoML and custom models across common scenarios:

	AutoML	Custom Model
Security and privacy	Raises security concerns because we have to upload data to the cloud.	Safer. We can train custom models on our own machines as well.
Domain-specific problem	AutoML does not work well for domain-specific problems.	We can train the model for domain-specific problems.
Budget constraint	AutoML is expensive for many simple cases, such as linear regression.	Cost depends on the requirements.
Less data	AutoML has a minimum data requirement.	Limited data can affect performance, but there is no such restriction.
Time to market	Using AutoML, we can complete the task quickly.	We have to set up a pipeline, which is time-consuming.
Standard problem	AutoML can solve standard problems quickly.	Custom models take more time to find the optimal architecture.
Feature engineering	AutoML cannot help with feature engineering where we need domain knowledge to create the features.	We must work on feature engineering separately; the engineered features can then be fed to AutoML.
Solution	It will give solutions from already known approaches for existing problems.	Data scientists can try novel approaches that are very specific to the problem statement.

Conclusion

AutoML is not artificial general intelligence (AGI), which means it cannot define the problem statements and solve them automatically. However, it can solve pre-defined problem statements if we give it relevant data and features.

Using AutoML involves a trade-off between generality across problems and performance on a specific problem. Because AutoML generalizes its solution, it must compromise on performance for any one problem, since its architecture is not tuned for that problem. And a general solution cannot help with domain-specific problems that need a novel approach.

Alakh Sharma is a data scientist at Talentica Software.
