Meta's Optimization Platform Ax 1.0 Streamlines LLM and System Optimization

Now stable at version 1.0, Ax is an open-source platform from Meta designed to help researchers and engineers apply machine learning to complex, resource-intensive experimentation. Over the past several years, Meta has used Ax to improve AI models, accelerate machine learning research, tune production infrastructure, and more.

Ax is aimed primarily at researchers who need to understand and optimize AI models or other systems with complex configurations. In such cases, Meta researchers say, the sheer number of possible configurations makes it impractical to evaluate them all in a linear, exhaustive way. The solution is adaptive experimentation, which evaluates configurations sequentially, using the insights from previous evaluations to guide exploration of the solution space.

Adaptive experiments are highly useful but can be challenging to run. Not only do they require sophisticated machine learning methods to drive the optimization, they also demand specialized infrastructure for managing experiment state, automating orchestration, providing useful analyses and diagnostics, and more.

Problems for which Meta has used Ax internally include hyperparameter optimization and architecture search for machine learning models, discovering optimal data mixtures for training AI models, tuning infrastructure, and optimizing compiler flags.

One particularly interesting application of Ax is optimizing LLMs. Meta researchers have provided a comprehensive introduction to this use case, demonstrating how Ax can be used to write better prompts, select the most effective in-context examples for the model to follow, and more.

An additional challenge in optimization arises because researchers often aim to improve multiple objective metrics subject to constraints and guardrails. Meta researchers describe how they used Ax for "multi-objective optimization to simultaneously improve a machine learning model’s accuracy while minimizing its resource usage."
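
As an illustration, a multi-objective setup of this kind might be expressed through Ax's Service API roughly as in the sketch below. The parameter and metric names are hypothetical, and exact signatures may differ between Ax releases.

# Hypothetical sketch of a multi-objective setup with Ax's Service API.
# Parameter and metric names ("accuracy", "memory_gb") are illustrative only.
from ax.service.ax_client import AxClient, ObjectiveProperties

ax_client = AxClient()
ax_client.create_experiment(
    name="model_tuning",
    parameters=[
        {"name": "learning_rate", "type": "range", "bounds": [1e-5, 1e-1], "log_scale": True},
        {"name": "hidden_size", "type": "range", "bounds": [64, 1024], "value_type": "int"},
    ],
    # Two competing objectives: maximize accuracy, minimize resource usage.
    objectives={
        "accuracy": ObjectiveProperties(minimize=False),
        "memory_gb": ObjectiveProperties(minimize=True),
    },
)

With multiple objectives, Ax searches for a Pareto frontier of trade-offs rather than a single best configuration.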

Beyond driving the optimization itself, Meta researchers note that Ax is an effective tool for gaining a deeper understanding of the system being optimized:

Ax provides a suite of analyses (plots, tables, etc) which helps its users understand how the optimization is progressing over time, tradeoffs between different metrics via a Pareto frontier, visualize the effect of one or two parameters across the input space, and explain how much each input parameter contributes to the results (via sensitivity analysis).

Ax uses Bayesian optimization, built on PyTorch and BoTorch, to iteratively test candidate configurations. It employs a surrogate model to identify the most promising configuration to evaluate next, repeating the process until the desired goal has been reached or the allowed compute budget is exhausted. The typical surrogate model is a Gaussian process, chosen for its ability to make predictions with quantified uncertainty from very few data points.
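
In practice, this loop corresponds to an "ask-tell" pattern in Ax's Service API: the client proposes a candidate, the user evaluates it and reports the result, and the surrogate model is refit before the next proposal. The following is a minimal, illustrative sketch with a toy objective standing in for a real, expensive evaluation; exact signatures may differ between Ax releases.

# Minimal ask-tell loop with Ax's Service API; the quadratic objective is a
# stand-in for an expensive evaluation such as a training run or benchmark.
from ax.service.ax_client import AxClient, ObjectiveProperties

ax_client = AxClient()
ax_client.create_experiment(
    name="toy_optimization",
    parameters=[
        {"name": "x", "type": "range", "bounds": [-5.0, 5.0]},
        {"name": "y", "type": "range", "bounds": [-5.0, 5.0]},
    ],
    objectives={"loss": ObjectiveProperties(minimize=True)},
)

def evaluate(params):
    # Placeholder for an expensive measurement (training run, benchmark, ...).
    return {"loss": (params["x"] - 1.0) ** 2 + (params["y"] + 2.0) ** 2}

for _ in range(20):
    # Ax fits a surrogate model to past results and proposes the next candidate.
    params, trial_index = ax_client.get_next_trial()
    ax_client.complete_trial(trial_index=trial_index, raw_data=evaluate(params))

best_parameters, metrics = ax_client.get_best_parameters()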

Meta researchers emphasize Ax's expressive API, which supports exploring complex search spaces as well as handling multiple objectives, constraints, and noisy observations. Ax can also evaluate distinct configurations in parallel and halt evaluation at any time. A key benefit of Ax is its sensible defaults, which allow practitioners to leverage advanced techniques without needing to be optimization experts.
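
For example, an outcome constraint (guardrail) combined with parallel candidate evaluation might look roughly like the sketch below; run_benchmark and the metric names are hypothetical stand-ins, and exact signatures may differ between Ax releases.

# Hypothetical sketch: an outcome constraint plus parallel candidate
# evaluation with Ax's Service API. All names are illustrative only.
from ax.service.ax_client import AxClient, ObjectiveProperties

ax_client = AxClient()
ax_client.create_experiment(
    name="constrained_tuning",
    parameters=[
        {"name": "batch_size", "type": "range", "bounds": [16, 512], "value_type": "int"},
        {"name": "num_workers", "type": "range", "bounds": [1, 16], "value_type": "int"},
    ],
    objectives={"throughput": ObjectiveProperties(minimize=False)},
    # Guardrail: steer away from configurations whose latency exceeds 100 ms.
    outcome_constraints=["latency_ms <= 100.0"],
)

def run_benchmark(params):
    # Hypothetical stand-in for a real benchmark of the configuration.
    return {"throughput": params["batch_size"] * params["num_workers"] / 10.0,
            "latency_ms": 5.0 * params["batch_size"] / params["num_workers"]}

# Ask for several candidates up front so they can be evaluated in parallel
# (e.g. on separate machines), then report results as they come back.
pending = [ax_client.get_next_trial() for _ in range(4)]
for params, trial_index in pending:
    results = run_benchmark(params)
    ax_client.complete_trial(
        trial_index=trial_index,
        raw_data={"throughput": results["throughput"], "latency_ms": results["latency_ms"]},
    )

Requesting several trials before reporting any results is what allows evaluations to run concurrently, at the cost of each proposal being based on slightly less information.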

Ax is not the only open-source black-box adaptive optimization platform available. Alternatives include SMAC, Nevergrad, Optuna, Dragonfly, and others. Meta researchers claim that Ax provides a broader range of capabilities, including the ability to impose constraints on parameters and outcomes, as well as to handle noisy measurements. Ax and many of its alternatives can also be used with orchestration frameworks like Ray Tune and Hydra.
