<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Hyperparameter Optimization |</title><link>https://tiao.io/tags/hyperparameter-optimization/</link><atom:link href="https://tiao.io/tags/hyperparameter-optimization/index.xml" rel="self" type="application/rss+xml"/><description>Hyperparameter Optimization</description><generator>HugoBlox Kit (https://hugoblox.com)</generator><language>en-us</language><lastBuildDate>Mon, 01 Sep 2025 00:00:00 +0000</lastBuildDate><image><url>https://tiao.io/media/icon_hu_9c2a75fde2335590.png</url><title>Hyperparameter Optimization</title><link>https://tiao.io/tags/hyperparameter-optimization/</link></image><item><title>Ax: A Platform for Adaptive Experimentation</title><link>https://tiao.io/publications/ax-platform/</link><pubDate>Mon, 01 Sep 2025 00:00:00 +0000</pubDate><guid>https://tiao.io/publications/ax-platform/</guid><description/></item><item><title>Ax</title><link>https://tiao.io/projects/ax/</link><pubDate>Thu, 01 Aug 2024 00:00:00 +0000</pubDate><guid>https://tiao.io/projects/ax/</guid><description>&lt;p&gt;
Ax is an open-source platform for adaptive
experimentation, developed by Meta&amp;rsquo;s Adaptive
Experimentation team. It provides a unified interface for
Bayesian optimization, bandit optimization, multi-objective, and
constrained optimization, built on top of
BoTorch.&lt;/p&gt;
&lt;p&gt;I contribute to Ax as part of my work at Meta, with a particular focus on
sample-efficient methods for hyperparameter optimization, capacity management, and
scaling-law-based modeling. Co-first author on the Ax platform paper
(AutoML 2025).&lt;/p&gt;</description></item><item><title>Batch Bayesian Optimisation via Density-ratio Estimation with Guarantees</title><link>https://tiao.io/publications/batch-bore-guarantees/</link><pubDate>Thu, 01 Dec 2022 00:00:00 +0000</pubDate><guid>https://tiao.io/publications/batch-bore-guarantees/</guid><description/></item><item><title>📄 One paper accepted to NeurIPS 2022</title><link>https://tiao.io/posts/one-paper-accepted-to-neurips2022/</link><pubDate>Sat, 15 Oct 2022 18:36:59 +0000</pubDate><guid>https://tiao.io/posts/one-paper-accepted-to-neurips2022/</guid><description>&lt;p&gt;Our paper
&lt;em&gt;Batch Bayesian Optimisation via Density-ratio Estimation with Guarantees&lt;/em&gt; was accepted
to NeurIPS 2022. Led by Rafael Oliveira, this is the batch extension of
BORE with theoretical convergence
guarantees for parallel Bayesian optimization. Joint work with Rafael
Oliveira, Edwin Bonilla, and Fabio Ramos.&lt;/p&gt;</description></item><item><title>Long Talk: BORE — Bayesian Optimization by Density-Ratio Estimation</title><link>https://tiao.io/events/icml2021-bore/</link><pubDate>Wed, 21 Jul 2021 14:00:00 +0000</pubDate><guid>https://tiao.io/events/icml2021-bore/</guid><description/></item><item><title>BORE</title><link>https://tiao.io/projects/bore/</link><pubDate>Thu, 01 Jul 2021 00:00:00 +0000</pubDate><guid>https://tiao.io/projects/bore/</guid><description>&lt;p&gt;
BORE is the reference implementation of
&lt;em&gt;BORE: Bayesian Optimization by Density-Ratio Estimation&lt;/em&gt; (Tiao et al., ICML 2021).
It recasts the acquisition function in
Bayesian optimization as a probabilistic classification
problem via density-ratio estimation,
sidestepping the analytical-tractability constraints of conventional
surrogate-based methods.&lt;/p&gt;
&lt;p&gt;Developed with Aaron Klein.&lt;/p&gt;</description></item><item><title>Invited Talk: BORE — Bayesian Optimization by Density-Ratio Estimation</title><link>https://tiao.io/events/ellis-automl-seminars-2021/</link><pubDate>Wed, 12 May 2021 16:00:00 +0000</pubDate><guid>https://tiao.io/events/ellis-automl-seminars-2021/</guid><description/></item><item><title>📄 One paper accepted to ICML 2021</title><link>https://tiao.io/posts/one-paper-accepted-to-icml2021/</link><pubDate>Sat, 08 May 2021 00:00:00 +0000</pubDate><guid>https://tiao.io/posts/one-paper-accepted-to-icml2021/</guid><description>&lt;p&gt;Our paper
&lt;em&gt;BORE: Bayesian Optimization by Density-Ratio Estimation&lt;/em&gt; was
accepted to ICML 2021 as a &lt;strong&gt;Long Talk&lt;/strong&gt; (awarded to the top 3% of submissions).
This is joint work with Aaron Klein, Cédric Archambeau, Edwin Bonilla, Matthias
Seeger, and Fabio Ramos — much of it carried out during my AWS Berlin
internship.&lt;/p&gt;</description></item><item><title>BORE: Bayesian Optimization by Density-Ratio Estimation</title><link>https://tiao.io/publications/bore-2/</link><pubDate>Sat, 08 May 2021 00:00:00 +0000</pubDate><guid>https://tiao.io/publications/bore-2/</guid><description>&lt;p&gt;&lt;strong&gt;B&lt;/strong&gt;ayesian &lt;strong&gt;O&lt;/strong&gt;ptimization (BO) by Density-&lt;strong&gt;R&lt;/strong&gt;atio &lt;strong&gt;E&lt;/strong&gt;stimation (DRE),
or &lt;strong&gt;BORE&lt;/strong&gt;, is a simple, yet effective framework for the optimization of
blackbox functions.
BORE is built upon the correspondence between &lt;em&gt;expected improvement (EI)&lt;/em&gt;&amp;mdash;arguably
the predominant &lt;em&gt;acquisition function&lt;/em&gt; used in BO&amp;mdash;and the &lt;em&gt;density-ratio&lt;/em&gt;
between two unknown distributions.&lt;/p&gt;
&lt;p&gt;One of the far-reaching consequences of this correspondence is that we can
reduce the computation of EI to a &lt;em&gt;probabilistic classification&lt;/em&gt; problem&amp;mdash;a
problem we are well-equipped to tackle, as evidenced by the broad range of
streamlined, easy-to-use and, perhaps most importantly, battle-tested
tools and frameworks at our disposal.
Notable among these are TensorFlow/Keras and PyTorch for Deep Learning,
XGBoost for Gradient Tree Boosting,
not to mention scikit-learn for just about
everything else.
The BORE framework lets us take direct advantage of these tools.&lt;/p&gt;
&lt;h2 id="code-example"&gt;Code Example&lt;/h2&gt;
&lt;p&gt;We provide a simple example with Keras to give you a taste of how BORE can
be implemented using a feed-forward &lt;em&gt;neural network (NN)&lt;/em&gt; classifier.
A useful class that the &lt;code&gt;bore&lt;/code&gt; package provides is
&lt;code&gt;MaximizableSequential&lt;/code&gt;,
a subclass of the &lt;code&gt;Sequential&lt;/code&gt; model from
Keras that inherits all of its existing functionality and provides just
one additional method.
We can build and compile a feed-forward NN classifier as usual:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-python" data-lang="python"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;bore.models&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;MaximizableSequential&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;tensorflow.keras.layers&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Dense&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="c1"&gt;# build model&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;classifier&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;MaximizableSequential&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;classifier&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Dense&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;16&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;activation&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;&amp;#34;relu&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;classifier&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Dense&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;16&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;activation&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;&amp;#34;relu&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;classifier&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Dense&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;activation&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;&amp;#34;sigmoid&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="c1"&gt;# compile model&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;classifier&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;compile&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;optimizer&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;&amp;#34;adam&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;loss&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;&amp;#34;binary_crossentropy&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;See
the Keras documentation if this seems unfamiliar to
you.&lt;/p&gt;
&lt;p&gt;The additional method provided is &lt;code&gt;argmax&lt;/code&gt;, which returns the &lt;em&gt;maximizer&lt;/em&gt; of
the network, i.e. the input $\mathbf{x}$ that maximizes the final output of
the network:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-python" data-lang="python"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;x_argmax&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;classifier&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;argmax&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;bounds&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;bounds&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;method&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;&amp;#34;L-BFGS-B&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;num_start_points&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Since the network is differentiable end-to-end with respect to the input $\mathbf{x}$, this
method can be implemented efficiently using a &lt;em&gt;multi-started quasi-Newton
hill-climber&lt;/em&gt; such as L-BFGS-B.
We will see the pivotal role this method plays in the next section.&lt;/p&gt;
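&lt;p&gt;To make the idea concrete, here is a minimal sketch of such a multi-started optimizer built on SciPy. This is illustrative only; it is not the actual &lt;code&gt;bore&lt;/code&gt; implementation, and the function name &lt;code&gt;multi_start_argmax&lt;/code&gt; is ours:&lt;/p&gt;

```python
# Sketch: multi-started quasi-Newton maximization over box bounds,
# in the spirit of the `argmax` method described above.
# (Illustrative only; not the actual `bore` implementation.)
import numpy as np
from scipy.optimize import minimize


def multi_start_argmax(func, bounds, num_start_points=3, seed=0):
    """Maximize `func` by minimizing its negation from several random starts."""
    rng = np.random.default_rng(seed)
    low, high = np.asarray(bounds, dtype=float).T
    best_x, best_val = None, -np.inf
    for _ in range(num_start_points):
        x0 = rng.uniform(low, high)  # random restart inside the box
        res = minimize(lambda x: -func(x), x0, method="L-BFGS-B",
                       bounds=list(zip(low, high)))
        if -res.fun > best_val:
            best_x, best_val = res.x, -res.fun
    return best_x


# Example: a concave quadratic whose maximum is at (0.5, 0.5).
f = lambda x: -np.sum((x - 0.5) ** 2)
x_opt = multi_start_argmax(f, bounds=[(0.0, 1.0), (0.0, 1.0)])
```

Restarting from several random points guards against getting stuck in a poor local optimum of a multi-modal surface.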
&lt;hr&gt;
&lt;p&gt;Using this classifier, the BO loop in BORE looks as follows:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-python" data-lang="python"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;numpy&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="nn"&gt;np&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;features&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;targets&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="c1"&gt;# initialize design&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;features&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;extend&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;features_initial_design&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="n"&gt;targets&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;extend&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;targets_initial_design&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_iterations&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="c1"&gt;# construct classification problem&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;X&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;vstack&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;features&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;y&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;hstack&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;targets&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;tau&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;quantile&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;q&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.25&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;z&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;less&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;y&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;tau&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="c1"&gt;# update classifier&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;classifier&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;fit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;X&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;z&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;epochs&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;200&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;batch_size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;64&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="c1"&gt;# suggest new candidate&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;x_next&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;classifier&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;argmax&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;bounds&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;bounds&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;method&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;&amp;#34;L-BFGS-B&amp;#34;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;num_start_points&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="c1"&gt;# evaluate blackbox&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;y_next&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;blackbox&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;evaluate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x_next&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="c1"&gt;# update dataset&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;features&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;x_next&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt; &lt;span class="n"&gt;targets&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;y_next&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;hr&gt;
&lt;p&gt;Let&amp;rsquo;s break this down a bit:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;At the start of the loop, we construct the classification problem&amp;mdash;by labeling
instances $\mathbf{x}$ whose corresponding target value $y$ falls in the lowest (i.e. best-performing)
&lt;code&gt;q=0.25&lt;/code&gt; quantile of all target values as &lt;em&gt;positive&lt;/em&gt;, and the rest as &lt;em&gt;negative&lt;/em&gt;.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Next, we train the classifier to discriminate between these instances. This
classifier should converge towards
&lt;/p&gt;
$$
\pi^{*}(\mathbf{x}) = \frac{\gamma \ell(\mathbf{x})}{\gamma \ell(\mathbf{x}) + (1-\gamma) g(\mathbf{x})},
$$&lt;p&gt;
where $\ell(\mathbf{x})$ and $g(\mathbf{x})$ are the unknown distributions of
instances belonging to the positive and negative classes, respectively, and
$\gamma$ is the class balance-rate and, by construction, simply the quantile
we specified (i.e. $\gamma=0.25$).&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Once the classifier is a decent approximation to $\pi^{*}(\mathbf{x})$, we
propose the maximizer of this classifier as the next input to evaluate.
In other words, we are now using the classifier &lt;em&gt;itself&lt;/em&gt; as the acquisition
function.&lt;/p&gt;
&lt;p&gt;How is it justifiable to use this in lieu of EI, or some other acquisition
function we&amp;rsquo;re used to?
And what is so special about $\pi^{*}(\mathbf{x})$?&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Well, as it turns out, $\pi^{*}(\mathbf{x})$ is equivalent to EI, up to some
constant factors.&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;The remainder of the loop should now be self-explanatory. Namely, we&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;evaluate the blackbox function at the suggested point, and&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;update the dataset.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
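&lt;p&gt;A quick numerical sanity check of the claim in step 3: $\pi^{*}(\mathbf{x})$ is a monotonically increasing function of the density-ratio $\ell(\mathbf{x})/g(\mathbf{x})$, so ranking candidates by the classifier output is the same as ranking them by the density-ratio (the density values below are made up purely for illustration):&lt;/p&gt;

```python
# Sanity check (illustrative): pi*(x) = g*l / (g*l + (1-g)*g_neg) is a
# monotone transform of the ratio l/g_neg, so both induce the same ranking
# of candidate points -- and hence the same maximizer.
import numpy as np

gamma = 0.25
l = np.array([0.1, 0.5, 2.0, 4.0])      # hypothetical positive-class densities
g = np.array([2.0, 1.0, 0.5, 0.1])      # hypothetical negative-class densities

ratio = l / g
pi_star = gamma * l / (gamma * l + (1 - gamma) * g)
```

Both `ratio` and `pi_star` sort the four hypothetical points into the same order, which is what licenses using the classifier itself as the acquisition function.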
&lt;h3 id="step-by-step-illustration"&gt;Step-by-step Illustration&lt;/h3&gt;
&lt;p&gt;Here is a step-by-step animation of six iterations of this loop in action,
using the &lt;em&gt;Forrester&lt;/em&gt; synthetic function as an example.
The noise-free function is shown as the solid gray curve in the main pane.
This procedure is warm-started with four random initial designs.&lt;/p&gt;
&lt;p&gt;The right pane shows the empirical CDF (ECDF) of the observed $y$ values.
The vertical dashed black line in this pane is located at $\Phi(y) = \gamma$,
where $\gamma = 0.25$.
The horizontal dashed black line is located at $\tau$, the value of $y$ such
that $\Phi(y) = 0.25$, i.e. $\tau = \Phi^{-1}(0.25)$.&lt;/p&gt;
&lt;p&gt;The instances below this horizontal line are assigned binary label $z=1$, while
those above are assigned $z=0$. This is visualized in the bottom pane,
alongside the probabilistic classifier $\pi_{\boldsymbol{\theta}}(\mathbf{x})$
represented by the solid gray curve, which is trained to discriminate between
these instances.&lt;/p&gt;
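&lt;p&gt;The thresholding and labeling shown in these panes can be reproduced in a few lines (the observed $y$ values below are hypothetical):&lt;/p&gt;

```python
# Reproducing the threshold tau and the binary labels z from a set of
# observed target values (hypothetical numbers, for illustration only).
import numpy as np

gamma = 0.25
y = np.array([3.2, 1.5, 0.7, 2.8, 4.1, 0.9, 2.0, 3.6])

tau = np.quantile(y, q=gamma)        # tau = Phi^{-1}(gamma), empirical quantile
z = np.less(y, tau).astype(int)      # z=1 below the threshold, z=0 above
```

By construction, roughly a $\gamma$-fraction of the instances receive the positive label.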
&lt;p&gt;Finally, the maximizer of the classifier is represented by the vertical solid
green line.
This is the location that the BO procedure suggests be evaluated next.&lt;/p&gt;
&lt;p&gt;
&lt;figure &gt;
&lt;div class="flex justify-center "&gt;
&lt;div class="w-full" &gt;
&lt;img alt="Animation"
srcset="https://tiao.io/publications/bore-2/paper_1500x5562_hu_bf54a19b8bc6fbf5.webp 205w"
sizes="(max-width: 480px) 100vw, (max-width: 768px) 90vw, (max-width: 1024px) 80vw, 760px"
src="https://tiao.io/publications/bore-2/paper_1500x5562_hu_bf54a19b8bc6fbf5.webp"
width="205"
height="760"
loading="lazy" data-zoomable /&gt;&lt;/div&gt;
&lt;/div&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p&gt;We see that the procedure converges toward the global minimum of the blackbox
function after half a dozen iterations.&lt;/p&gt;
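&lt;p&gt;For reference, the &lt;em&gt;Forrester&lt;/em&gt; function used in the animation can be written down directly. The definition below is the standard form from the literature (assumed here, not extracted from the animation code); its global minimum lies near $x \approx 0.757$:&lt;/p&gt;

```python
# Standard noise-free Forrester synthetic function on the unit interval.
# (Assumed standard form from the literature, not the exact animation code.)
import numpy as np


def forrester(x):
    """f(x) = (6x - 2)^2 * sin(12x - 4), defined on [0, 1]."""
    return (6 * x - 2) ** 2 * np.sin(12 * x - 4)


# Locate the global minimum on a dense grid.
xs = np.linspace(0.0, 1.0, 100001)
x_min = xs[np.argmin(forrester(xs))]
```

Despite being one-dimensional, this function is multi-modal, which is what makes it a useful toy problem for illustrating global optimization procedures.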
&lt;hr&gt;
&lt;p&gt;To understand how and why this works in more detail, please read our paper!
If you only have 15 minutes to spare, please watch the video recording of our
talk!&lt;/p&gt;
&lt;h2 id="video"&gt;Video&lt;/h2&gt;
&lt;div id="presentation-embed-38942425"&gt;&lt;/div&gt;
&lt;script src='https://slideslive.com/embed_presentation.js'&gt;&lt;/script&gt;
&lt;script&gt;
embed = new SlidesLiveEmbed('presentation-embed-38942425', {
presentationId: '38942425',
autoPlay: false, // change to true to autoplay the embedded presentation
verticalEnabled: true
});
&lt;/script&gt;</description></item><item><title>Simulation-based Scoring for Model-based Asynchronous Hyperparameter and Neural Architecture Search</title><link>https://tiao.io/publications/simulation-based-scoring/</link><pubDate>Sat, 01 May 2021 00:00:00 +0000</pubDate><guid>https://tiao.io/publications/simulation-based-scoring/</guid><description/></item><item><title>Contributed Talk: BORE — Bayesian Optimization by Density-Ratio Estimation</title><link>https://tiao.io/events/neurips2020-meta-learning/</link><pubDate>Fri, 11 Dec 2020 15:00:00 +0000</pubDate><guid>https://tiao.io/events/neurips2020-meta-learning/</guid><description/></item><item><title>Bayesian Optimization by Density Ratio Estimation</title><link>https://tiao.io/publications/bore-1/</link><pubDate>Tue, 01 Dec 2020 00:00:00 +0000</pubDate><guid>https://tiao.io/publications/bore-1/</guid><description/></item><item><title>Model-based Asynchronous Hyperparameter and Neural Architecture Search</title><link>https://tiao.io/publications/async-multi-fidelity-hpo/</link><pubDate>Sun, 01 Mar 2020 00:00:00 +0000</pubDate><guid>https://tiao.io/publications/async-multi-fidelity-hpo/</guid><description/></item><item><title>AutoGluon</title><link>https://tiao.io/projects/autogluon/</link><pubDate>Sun, 01 Sep 2019 00:00:00 +0000</pubDate><guid>https://tiao.io/projects/autogluon/</guid><description>&lt;p&gt;
AutoGluon is an open-source
toolkit from AWS that automates ML for tabular, image, and text data. During
my AWS Berlin internship I was a core developer of the model-based
searcher module — described in
&lt;em&gt;Model-based Asynchronous Hyperparameter and Neural Architecture Search&lt;/em&gt;
and later forming the basis of
Syne Tune.&lt;/p&gt;