Bayesian Optimization by Density Ratio Estimation


Louis C. Tiao, Aaron Klein, Cédric Archambeau, Edwin V. Bonilla, Matthias Seeger, and Fabio Ramos

Bayesian Optimization

  • Let $\mathbf{x}$ denote an input to the black-box function $f$
  • and $y$ its corresponding output value
  • Observations are noisy: $y = f(\mathbf{x}) + \epsilon$, with $\epsilon \sim \mathcal{N}(0, \sigma^2)$

Relative Density Ratio

  • The $\gamma$-relative density ratio $$ r_{\gamma}(\mathbf{x}) = \frac{\ell(\mathbf{x})}{\gamma \ell(\mathbf{x}) + (1 - \gamma) g(\mathbf{x})} $$
  • where $\gamma \ell(\mathbf{x}) + (1 - \gamma) g(\mathbf{x})$ is the $\gamma$-mixture density
    • for some mixing proportion $0 \leq \gamma < 1$
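A minimal sketch of the definition above, assuming two hypothetical one-dimensional Gaussian densities standing in for $\ell$ and $g$ (the means, scales, and the helper names `normal_pdf` and `relative_density_ratio` are illustrative only). It also shows that, for $\gamma > 0$, the ratio never exceeds $1 / \gamma$, because the denominator is at least $\gamma \ell(\mathbf{x})$.

```python
import math

def normal_pdf(x, mean, std):
    """Density of a univariate normal distribution at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def relative_density_ratio(l_x, g_x, gamma):
    """gamma-relative density ratio: l / (gamma * l + (1 - gamma) * g)."""
    return l_x / (gamma * l_x + (1.0 - gamma) * g_x)

gamma = 0.25
xs = [-3.0, -1.0, 0.0, 1.0, 3.0]

# Hypothetical densities, chosen only for illustration:
# l is concentrated around -1, g is centred on +1.
ratios = [
    relative_density_ratio(normal_pdf(x, -1.0, 0.5), normal_pdf(x, 1.0, 1.0), gamma)
    for x in xs
]

for x, r in zip(xs, ratios):
    print(f"x = {x:+.1f}  r_gamma = {r:.4f}  (upper bound 1/gamma = {1.0 / gamma:.1f})")
```

Setting `gamma = 0.0` in `relative_density_ratio` gives the plain ratio `l_x / g_x`, which is unbounded wherever `g_x` is small.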

Ordinary Density Ratio

  • The ordinary density ratio $$ r_{0}(\mathbf{x}) = \frac{\ell(\mathbf{x})}{g(\mathbf{x})} $$
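The two ratios are linked by a simple identity, sketched here under the assumption that $\ell(\mathbf{x}) > 0$: dividing the numerator and denominator of $r_{\gamma}(\mathbf{x})$ by $\ell(\mathbf{x})$ gives

$$ r_{\gamma}(\mathbf{x}) = \frac{1}{\gamma + (1 - \gamma) \frac{g(\mathbf{x})}{\ell(\mathbf{x})}} = \left( \gamma + (1 - \gamma) \, r_{0}(\mathbf{x})^{-1} \right)^{-1} $$

so $r_{\gamma}$ is a monotonically increasing function of $r_{0}$, and setting $\gamma = 0$ recovers $r_{0}$ itself.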

Expected Improvement (EI)

  • Non-negative improvement over a threshold $\tau$ (the $\gamma$-th quantile of the observed $y$ values) $$ I_{\gamma}(\mathbf{x}) = \max(\tau - y, 0) $$
  • Posterior predictive $$ p(y | \mathbf{x}, \mathcal{D}_N) $$
  • Expected value of $I_{\gamma}(\mathbf{x})$ under the posterior predictive $$ \alpha_{\gamma}(\mathbf{x}; \mathcal{D}_N) = \mathbb{E}_{p(y | \mathbf{x}, \mathcal{D}_N)}[I_{\gamma}(\mathbf{x})] $$
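As a point of comparison, here is a minimal sketch of the expectation above in the special case where the posterior predictive is assumed to be Gaussian, $y \sim \mathcal{N}(\mu, \sigma^2)$ (as with a Gaussian process surrogate). In that case the expectation has the closed form $(\tau - \mu)\Phi(z) + \sigma \phi(z)$ with $z = (\tau - \mu) / \sigma$. The function name and arguments are illustrative, not taken from the slides.

```python
import math

def expected_improvement(mu, sigma, tau):
    """Closed-form E[max(tau - y, 0)] for y ~ N(mu, sigma^2), with sigma > 0."""
    z = (tau - mu) / sigma
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))          # standard normal CDF at z
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)   # standard normal PDF at z
    return (tau - mu) * cdf + sigma * pdf

# A lower predicted mean (relative to tau) yields a larger expected improvement.
print(expected_improvement(mu=-1.0, sigma=1.0, tau=0.0))
print(expected_improvement(mu=1.0, sigma=1.0, tau=0.0))
```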

Conditional

Express $p(\mathbf{x} | y, \mathcal{D}_N)$ in terms of the two densities $\ell(\mathbf{x})$ and $g(\mathbf{x})$: $$ p(\mathbf{x} | y, \mathcal{D}_N) = \begin{cases} \ell(\mathbf{x}) & \text{if } y < \tau, \newline g(\mathbf{x}) & \text{if } y \geq \tau \end{cases} $$
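In practice the two cases correspond to a partition of the observed inputs. The sketch below assumes $\tau$ is taken as an empirical $\gamma$-quantile of the observed outputs, using one simple quantile convention among several; the data and the helper `split_by_quantile` are hypothetical. Density estimates of $\ell$ and $g$ would then be fit to the two groups.

```python
import math

def split_by_quantile(xs, ys, gamma):
    """Split inputs into those with y < tau and those with y >= tau.

    tau is an empirical gamma-quantile of ys (a simple convention, assumed
    here): absent ties, the ceil(gamma * n) smallest outputs fall below tau.
    """
    n = len(ys)
    k = min(math.ceil(gamma * n), n - 1)  # keep at least one point at or above tau
    tau = sorted(ys)[k]
    below = [x for x, y in zip(xs, ys) if y < tau]    # samples used to estimate l(x)
    above = [x for x, y in zip(xs, ys) if y >= tau]   # samples used to estimate g(x)
    return tau, below, above

# Hypothetical observations (x_n, y_n), for illustration only.
xs = [0.1, 0.4, 0.5, 0.7, 0.9, 1.2, 1.5, 1.8]
ys = [3.2, 1.1, 0.4, 0.9, 2.5, 4.0, 0.2, 5.1]

tau, below, above = split_by_quantile(xs, ys, gamma=0.25)
print(tau, below, above)
```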

Bergstra et al. 2011

$$ \underbrace{\alpha_{\gamma}(\mathbf{x}; \mathcal{D}_N)}_\text{expected improvement} \propto \underbrace{r_{\gamma}(\mathbf{x})}_\text{relative density ratio} $$
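A sketch of why this holds, following the argument of Bergstra et al. (2011) and assuming $\gamma = p(y < \tau | \mathcal{D}_N)$. By Bayes' rule,

$$ \alpha_{\gamma}(\mathbf{x}; \mathcal{D}_N) = \int_{-\infty}^{\tau} (\tau - y) \, p(y | \mathbf{x}, \mathcal{D}_N) \, dy = \frac{\int_{-\infty}^{\tau} (\tau - y) \, p(\mathbf{x} | y, \mathcal{D}_N) \, p(y | \mathcal{D}_N) \, dy}{p(\mathbf{x} | \mathcal{D}_N)} $$

For $y < \tau$ the conditional $p(\mathbf{x} | y, \mathcal{D}_N)$ equals $\ell(\mathbf{x})$, so the numerator factorizes into $\ell(\mathbf{x})$ times a constant that does not depend on $\mathbf{x}$:

$$ \int_{-\infty}^{\tau} (\tau - y) \, p(\mathbf{x} | y, \mathcal{D}_N) \, p(y | \mathcal{D}_N) \, dy = \ell(\mathbf{x}) \underbrace{\int_{-\infty}^{\tau} (\tau - y) \, p(y | \mathcal{D}_N) \, dy}_{K} $$

Marginalizing over $y$ shows that the denominator is the $\gamma$-mixture density:

$$ p(\mathbf{x} | \mathcal{D}_N) = \gamma \ell(\mathbf{x}) + (1 - \gamma) g(\mathbf{x}) $$

Combining the two gives

$$ \alpha_{\gamma}(\mathbf{x}; \mathcal{D}_N) = K \, \frac{\ell(\mathbf{x})}{\gamma \ell(\mathbf{x}) + (1 - \gamma) g(\mathbf{x})} = K \, r_{\gamma}(\mathbf{x}) $$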

Questions?

References

  • Bergstra, J. S., Bardenet, R., Bengio, Y., & Kégl, B. (2011). Algorithms for Hyper-parameter Optimization. In Advances in Neural Information Processing Systems (pp. 2546-2554).
  • Yamada, M., Suzuki, T., Kanamori, T., Hachiya, H., & Sugiyama, M. (2011). Relative Density-ratio Estimation for Robust Distribution Comparison. In Advances in Neural Information Processing Systems (pp. 594-602).