Data-sparse Search for Discovery Problems

A discovery problem seeks to identify a set of candidates in a search space that fulfills certain performance criteria. For example, a battery engineer needs to carefully select compounds so that a battery meets specified resistance, capacitance, and inductance thresholds. In these regimes, data is typically sparse and expensive: a material needs to be fabricated or simulated before any measurements can be made. The high cost of obtaining data, the heterogeneity of that data, the sheer size of the search space, and the number of performance thresholds make search and optimization exceedingly difficult.

To tackle these discovery problems, we need to simultaneously handle data sparsity, black-box constraints, and high noise. I’ve worked on search methods that attempt to resolve these challenges by forgoing performance optimality in favor of candidate diversity: under high noise and uncertainty, a diverse set of candidates is more likely to contain one that satisfies the performance thresholds.
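
As a rough illustration of the diversity-first idea (a sketch, not the actual method from my work), the snippet below greedily picks a max-min-diverse subset from the candidates a surrogate model predicts to be feasible; the candidate features and feasibility mask here are placeholders:

```python
import numpy as np

def select_diverse(candidates, feasible_mask, k):
    """Greedy max-min (farthest-point) selection among feasible candidates."""
    pool = candidates[feasible_mask]
    chosen = [0]  # start from an arbitrary feasible candidate
    for _ in range(k - 1):
        # distance from each candidate to its nearest already-chosen candidate
        d = np.linalg.norm(pool[:, None] - pool[chosen][None], axis=-1).min(axis=1)
        d[chosen] = -np.inf  # never re-pick a selected candidate
        chosen.append(int(np.argmax(d)))
    return pool[chosen]

# toy usage: 200 random candidates; the mask stands in for model predictions
rng = np.random.default_rng(0)
cands = rng.uniform(size=(200, 3))
feasible = cands[:, 0] > 0.5
print(select_diverse(cands, feasible, k=5))
```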

I believe this is a paradigm that holds significant potential in a wide range of practical applications, from drug design to simulation optimization, and I’m passionate about working on projects in this area.

Automated Machine Learning (AutoML)

AutoML seeks algorithms and software that streamline and simplify deep learning pipelines. I’ve worked mainly on Bayesian optimization, a class of optimization algorithms used to tune neural network hyperparameters. Neural networks are often quite sensitive to these hyperparameters, so it’s important to tune them with minimal cost overhead.
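
To give a flavor of how Bayesian optimization works, here’s a minimal sketch using scikit-learn (a toy 1-D objective stands in for a real validation loss): fit a Gaussian-process surrogate to the evaluations so far, then evaluate next wherever expected improvement is highest.

```python
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

def expected_improvement(mu, sigma, best):
    # standard expected improvement for minimization
    imp = best - mu
    z = imp / np.maximum(sigma, 1e-12)
    return imp * norm.cdf(z) + sigma * norm.pdf(z)

def objective(x):
    # toy stand-in for an expensive validation loss
    return np.sin(3 * x) + 0.1 * x**2

rng = np.random.default_rng(0)
X = rng.uniform(-2, 2, size=(3, 1))  # initial random evaluations
y = objective(X).ravel()

grid = np.linspace(-2, 2, 500).reshape(-1, 1)
for _ in range(10):
    gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), normalize_y=True).fit(X, y)
    mu, sigma = gp.predict(grid, return_std=True)
    x_next = grid[np.argmax(expected_improvement(mu, sigma, y.min()))]
    X = np.vstack([X, x_next])
    y = np.append(y, objective(x_next))

print("best hyperparameter:", X[np.argmin(y)], "loss:", y.min())
```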

I’ve worked on making these AutoML algorithms more application-aware. The overall idea is that by understanding the underlying learning task, we can make more intelligent decisions about which hyperparameters to tune and how to tune them. I’ve incorporated this idea into vision systems, where the training cost can vary quite significantly and thus needs to be accounted for during the tuning process. I’ve also worked on automatic tuning of gradient-boosted trees, leveraging meta-learning to build up experience tuning these trees over time.
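
One simple way to make a tuner cost-aware, shown below as a sketch rather than the specific method from my work, is to divide expected improvement by the predicted training cost (in the spirit of “expected improvement per second”); the predicted costs would come from a second model fit to observed runtimes.

```python
import numpy as np
from scipy.stats import norm

def expected_improvement(mu, sigma, best):
    # standard expected improvement for minimization
    imp = best - mu
    z = imp / np.maximum(sigma, 1e-12)
    return imp * norm.cdf(z) + sigma * norm.pdf(z)

def ei_per_cost(mu, sigma, best, cost_mu):
    """Cost-aware acquisition: improvement per unit of predicted training cost.

    cost_mu would come from a second model fit to observed runtimes
    (e.g. a GP on log training time); here it is simply an input array.
    """
    return expected_improvement(mu, sigma, best) / np.maximum(cost_mu, 1e-8)

# toy usage: two candidate configs with equal EI but very different cost
mu, sigma = np.array([0.2, 0.2]), np.array([0.1, 0.1])
print(ei_per_cost(mu, sigma, best=0.3, cost_mu=np.array([10.0, 100.0])))
```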

In the future, I think it’ll be very important to continue adapting these AutoML algorithms to the massive models in production today (which, presumably, will only get larger over time).

Gaussian Processes

Gaussian processes are a class of machine learning models that are unique in automatically providing uncertainty quantification. In layman’s terms, this means that they not only make predictions, but also estimate how confident they are in those predictions.
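
Here’s a minimal demonstration with scikit-learn (a toy sine-wave dataset; any regression data would do): the model returns both a prediction and a standard deviation at each test point.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

# fit an exact GP to a handful of noiseless observations of sin(x)
X = np.linspace(0, 5, 8).reshape(-1, 1)
y = np.sin(X).ravel()
gp = GaussianProcessRegressor(kernel=RBF(length_scale=1.0)).fit(X, y)

# predictions come with a standard deviation, i.e. built-in uncertainty
X_test = np.linspace(0, 6, 5).reshape(-1, 1)
mean, std = gp.predict(X_test, return_std=True)
for x, m, s in zip(X_test.ravel(), mean, std):
    print(f"x={x:.2f}: prediction {m:+.3f} ± {1.96 * s:.3f}")
```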

This is enormously useful when it comes to incorporating interpretability into machine learning models. The downside is that Gaussian processes scale poorly with the amount of data (exact training costs O(n³) in the number of data points) and are quite inefficient to train. My work focuses on making this training process more efficient.
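
To illustrate one classical route to efficiency (a sketch of the subset-of-regressors / Nyström approximation, not the specific approach in my work): approximate the kernel through m inducing points, reducing the training cost from O(n³) to O(nm²).

```python
import numpy as np

def rbf(A, B, length_scale=1.0):
    """Squared-exponential kernel matrix between row-vector sets A and B."""
    d2 = ((A[:, None] - B[None]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / length_scale**2)

def sor_predict(X, y, X_test, m=50, noise=1e-2, seed=0):
    """Subset-of-regressors GP mean: O(n m^2) instead of the exact O(n^3)."""
    rng = np.random.default_rng(seed)
    Z = X[rng.choice(len(X), size=m, replace=False)]  # inducing inputs
    Kzx = rbf(Z, X)                                   # m x n
    A = Kzx @ Kzx.T + noise * rbf(Z, Z)               # m x m system
    w = np.linalg.solve(A, Kzx @ y)
    return rbf(X_test, Z) @ w

# toy demo on 10,000 points, far beyond what an exact GP handles comfortably
rng = np.random.default_rng(1)
X = rng.uniform(-3, 3, size=(10_000, 1))
y = np.sin(2 * X).ravel() + 0.1 * rng.standard_normal(len(X))
print(sor_predict(X, y, np.array([[0.0], [1.0]])))  # roughly [0.0, 0.91]
```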