FrugalML switches between APIs to improve image classification and cut costs

Stanford University researchers have developed a framework that lets developers switch intelligently between multiple cloud AI APIs (including those from Google and Microsoft) while staying within a budget. In preliminary experiments, they claim their system -- FrugalML -- typically cuts costs by more than 50% while matching the accuracy of the best single API.

Third-party machine learning APIs come with several challenges. One is that providers price the same workload differently. For example, Face++ charges $0.0005 per photo for image classification, while Microsoft charges $0.0010. Moreover, each API performs better on some types of data and worse on others. Studies have brought to light disparities in gender classification across skin colors, among other inequities.

FrugalML attempts to address these challenges by learning the strengths and weaknesses of each API and then performing an optimization to identify the best adaptive strategy. It calls APIs sequentially, subject to a budget. For instance, it might send data to Google's image classification API, and if the API returns a label (e.g., "dog") with high confidence, the framework stops and reports the label. But if the API returns a label (e.g., "hare") with lower confidence and FrugalML knows the API is less accurate for this label, the framework might select a second API to make an additional assessment.
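
The sequential calling strategy can be illustrated with a minimal Python sketch. This is not FrugalML's actual code. The names `Api` and `cascade_predict`, the stub services, the per-label thresholds, and the per-item budget check are all hypothetical. FrugalML learns its strategy through optimization, whereas the thresholds below are hard-coded for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple

Prediction = Tuple[str, float]  # (label, confidence in [0, 1])


@dataclass
class Api:
    """Hypothetical wrapper around one prediction service."""
    name: str
    cost: float  # price per call, in dollars
    predict: Callable[[object], Prediction]


def cascade_predict(
    item: object,
    base: Api,
    fallback: Api,
    thresholds: Dict[str, float],
    default_threshold: float,
    budget: float,
) -> Tuple[str, float]:
    """Call `base` first; escalate to `fallback` only when the confidence
    is below the threshold for the returned label and the per-item budget
    (a simplifying assumption) allows a second call.
    Returns (label, money spent)."""
    label, confidence = base.predict(item)
    spent = base.cost
    threshold = thresholds.get(label, default_threshold)
    can_afford = spent + fallback.cost <= budget
    if confidence >= threshold or not can_afford:
        return label, spent
    label, _ = fallback.predict(item)
    return label, spent + fallback.cost


# Stub services standing in for real APIs (illustrative values only).
cheap = Api(
    "cheap_api",
    0.0005,
    lambda item: ("dog", 0.95) if item == "clear photo" else ("hare", 0.55),
)
strong = Api("strong_api", 0.0010, lambda item: ("rabbit", 0.90))

# Labels the cheap service is assumed to get wrong more often
# receive a higher threshold.
thresholds = {"dog": 0.80, "hare": 0.90}
```

With these stubs, a "clear photo" is answered by the cheap service alone, while any other input comes back as a low-confidence "hare" and is escalated to the second service, provided the budget covers both calls.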

In experiments, the researchers compared the accuracy and incurred costs of FrugalML with those of real-world AI services on open source data sets (FER+, RAFDB, and AFFECTNET). They focused on the task of facial emotion recognition (where the goal was to predict the emotion expressed by people in headshots), and they compared three commercially available APIs: Google's, Microsoft's, and Face++'s.

The team reports that on the FER+ data set, FrugalML needed only 33% of the cost to achieve the same accuracy as the best API -- Microsoft's. They reason this was likely because the base service's quality score was "highly correlated" to its prediction accuracy, so FrugalML needed to call expensive services for only a few difficult images and could rely on cheaper services for easier ones.

In the future, the team plans to conduct more in-depth theoretical analyses of FrugalML as they work to extend the framework to create calling strategies for machine learning tasks beyond image classification. They also plan to release the data set they used to develop FrugalML.

"Our research characterized the substantial heterogeneity in cost and performance across available machine learning APIs, which is useful in its own right and also leveraged by FrugalML," the researchers wrote.