Exploring the Binance API in Python - Part I: The Orderbook

Draft

In this series, we will interact directly with the API endpoints and make the low-level HTTP requests ourselves. If you're just looking for a high-level way to interact with the API that abstracts away these details, please check out python-binance, an unofficial but slick and well-designed Python client for the Binance API.

We will make the requests using the requests library. Thereafter, we will process the results with pandas and visualize them with matplotlib and seaborn. Let's import these dependencies now:

import requests
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

To make a GET request for the symbol ETHBUSD from the /depth endpoint:

r = requests.get("https://api.binance.com/api/v3/depth", params=dict(symbol="ETHBUSD"))
results = r.json()

Load the buy and sell orders, or bids and asks, into respective dataframes:

frames = {side: pd.DataFrame(data=results[side], columns=["price", "quantity"], dtype=float)
          for side in ["bids", "asks"]}
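
For reference, the /depth response is a JSON object in which bids and asks are lists of [price, quantity] pairs encoded as strings, which is why dtype=float is passed above. The sketch below performs the same conversion in plain Python on a hand-written payload; the values are illustrative and not an actual API response, and parse_side is a hypothetical helper name.

```python
# Hand-written stand-in for r.json(); the numbers are illustrative only.
sample = {
    "lastUpdateId": 1,
    "bids": [["1056.58", "7.555"], ["1056.50", "1.25"]],
    "asks": [["1056.64", "7.43152"], ["1056.70", "0.5"]],
}


def parse_side(levels):
    """Convert [price, quantity] string pairs into tuples of floats."""
    return [(float(price), float(quantity)) for price, quantity in levels]


parsed = {side: parse_side(sample[side]) for side in ("bids", "asks")}
print(parsed["bids"][0])  # (1056.58, 7.555)
```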

Concatenate the dataframes containing bids and asks into one big frame:

frames_list = [frames[side].assign(side=side) for side in frames]
data = pd.concat(frames_list, axis="index", ignore_index=True, sort=True)

Get a statistical summary of the price levels in the bids and asks:

price_summary = data.groupby("side").price.describe()
price_summary.to_markdown()
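
| side   |   count |    mean |      std |     min |     25% |     50% |     75% |     max |
|:-------|--------:|--------:|---------:|--------:|--------:|--------:|--------:|--------:|
| asks   |     100 | 1057.86 | 0.696146 | 1056.64 | 1057.2  | 1057.91 | 1058.49 | 1059.04 |
| bids   |     100 | 1055.06 | 0.832385 | 1053.7  | 1054.4  | 1054.85 | 1055.82 | 1056.58 |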

As a side note, each side contains exactly 100 entries: by default, the /depth endpoint returns only the 100 lowest asks and the 100 highest bids.
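
The size of the snapshot can be changed with the endpoint's optional limit parameter. As a sketch, assuming a limit of 500 is accepted (check the Binance API documentation for the permitted values), the query string would be built as follows. Passing the same dict as params to requests.get has the same effect.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.binance.com/api/v3/depth"

# `limit=500` is an illustrative choice, not a value used in the original request.
params = dict(symbol="ETHBUSD", limit=500)
url = f"{BASE_URL}?{urlencode(params)}"
print(url)  # https://api.binance.com/api/v3/depth?symbol=ETHBUSD&limit=500
```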

Top of the book

The highest bid price and the lowest ask price together are known as the top of the book, and the most recent trades will typically have executed at prices close to these two levels. The difference between these two price levels is known as the bid-ask spread.

>>> frames["bids"].price.max()
1056.58
>>> frames["asks"].price.min()
1056.64
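
The bid-ask spread and the mid price follow directly from these two numbers. The sketch below uses decimal.Decimal rather than floats, purely to keep the arithmetic exact for display; the variable names are our own.

```python
from decimal import Decimal

best_bid = Decimal("1056.58")  # frames["bids"].price.max()
best_ask = Decimal("1056.64")  # frames["asks"].price.min()

spread = best_ask - best_bid             # absolute spread in BUSD
mid_price = (best_bid + best_ask) / 2    # midpoint of the top of the book
spread_bps = spread / mid_price * 10000  # spread relative to the mid price, in basis points

print(spread, mid_price)  # 0.06 1056.61
```

For this snapshot the spread is 0.06 BUSD, which is a little over half a basis point of the mid price.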

We can also get this information from the /ticker/bookTicker endpoint:

r = requests.get("https://api.binance.com/api/v3/ticker/bookTicker", params=dict(symbol="ETHBUSD"))
book_top = r.json()

Read this into a Pandas Series and render as a Markdown table:

name = book_top.pop("symbol")  # get symbol and also delete at the same time
s = pd.Series(book_top, name=name, dtype=float)
s.to_markdown()

|          |    ETHBUSD |
|:---------|-----------:|
| bidPrice | 1056.58    |
| bidQty   |    7.555   |
| askPrice | 1056.64    |
| askQty   |    7.43152 |

Scatter plot

Let us visualize all the orderbook entries using a scatter plot, showing price on the $x$-axis, and quantity on the $y$-axis. The hue designates whether the entry is an ask or a bid.

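# Snapshot metadata for the plot titles. `lastUpdateId` is part of the /depth
# response; `t` is an assumption (local wall-clock time), since the response
# itself carries no timestamp.
last_update_id = results["lastUpdateId"]
t = pd.Timestamp.now()

fig, ax = plt.subplots()

ax.set_title(f"Last update: {t} (ID: {last_update_id})")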

sns.scatterplot(x="price", y="quantity", hue="side", data=data, ax=ax)

ax.set_xlabel("Price")
ax.set_ylabel("Quantity")

plt.show()

[Figure: Scatter plot]

This is the most verbose visualization: it displays all the raw information, but provides the fewest actionable insights.

Histogram plot

binwidth = 0.25  # assumption: bin width in BUSD; the original value is not shown

fig, ax = plt.subplots()

ax.set_title(f"Last update: {t} (ID: {last_update_id})")

sns.histplot(x="price", hue="side", binwidth=binwidth, data=data, ax=ax)
sns.rugplot(x="price", hue="side", data=data, ax=ax)

plt.show()

[Figure: Histogram plot]

This shows the number of bids or asks that fall within each price bin, but obscures the volume, or quantity.

This is obviously misleading. For example, there could be 1 bid at price $p_1$ and 100 bids at $p_2$. However, the 1 bid at price $p_1$ could be for 100 ETH, while each of those 100 bids at $p_2$ could be for just 1 ETH. At both price points, the total quantity of ETH being bid is actually identical. Yet this plot would suggest that there is 100 times greater demand for ETH at $p_2$.
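
The example above can be checked with a few lines of plain Python. The prices and quantities are hypothetical and chosen only to match the scenario described: counting entries gives a 1-to-100 ratio, while summing quantities gives equal totals.

```python
from collections import Counter, defaultdict

# Hypothetical bids: one entry of 100 ETH at p1, one hundred entries of 1 ETH at p2.
p1, p2 = 1056.0, 1055.0
bids = [(p1, 100.0)] + [(p2, 1.0)] * 100

# What an unweighted histogram sees: the number of entries per price.
order_count = Counter(price for price, _ in bids)

# What actually matters for demand: the total quantity per price.
total_quantity = defaultdict(float)
for price, quantity in bids:
    total_quantity[price] += quantity

print(dict(order_count))     # {1056.0: 1, 1055.0: 100}
print(dict(total_quantity))  # {1056.0: 100.0, 1055.0: 100.0}
```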

Weighted histogram plot

This is easy to fix by weighting each entry by its quantity, which amounts to setting weights="quantity" in sns.histplot:

fig, ax = plt.subplots()

ax.set_title(f"Last update: {t} (ID: {last_update_id})")

sns.histplot(x="price", weights="quantity", hue="side", binwidth=binwidth, data=data, ax=ax)
sns.scatterplot(x="price", y="quantity", hue="side", data=data, ax=ax)

ax.set_xlabel("Price")
ax.set_ylabel("Quantity")

plt.show()

[Figure: Weighted histogram plot]
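
To make explicit what weighting by quantity computes, the sketch below sums quantities per price bin in plain Python. It is a simplification under our own assumptions: the entries are made up, the bin width is an illustrative value, and bins are aligned to multiples of the bin width, whereas seaborn derives its bin edges from the data range, so the edges will generally differ.

```python
import math
from collections import defaultdict

binwidth = 0.25  # illustrative bin width in BUSD

# Made-up (price, quantity) entries for one side of the book.
entries = [(1056.58, 2.0), (1056.50, 3.0), (1056.10, 1.5), (1055.90, 0.5)]

weighted = defaultdict(float)
for price, quantity in entries:
    left_edge = math.floor(price / binwidth) * binwidth  # left edge of the entry's bin
    weighted[left_edge] += quantity  # add the quantity instead of counting 1

print(dict(weighted))  # {1056.5: 5.0, 1056.0: 1.5, 1055.75: 0.5}
```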

Weighted empirical CDF (ECDF) plot – aka the “Depth Chart”

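Finally, we accumulate the quantities across price levels by calling sns.ecdfplot with weights="quantity" and stat="count". For the bids we also set complementary=True, so that the quantity accumulates from the highest bid downwards, while for the asks it accumulates from the lowest ask upwards: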

fig, ax = plt.subplots()

ax.set_title(f"Last update: {t} (ID: {last_update_id})")

sns.ecdfplot(x="price", weights="quantity", stat="count", complementary=True, data=frames["bids"], ax=ax)
sns.ecdfplot(x="price", weights="quantity", stat="count", data=frames["asks"], ax=ax)
sns.scatterplot(x="price", y="quantity", hue="side", data=data, ax=ax)

ax.set_xlabel("Price")
ax.set_ylabel("Quantity")

plt.show()

[Figure: Weighted empirical CDF (ECDF) plot]
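
A quantity-weighted ECDF scaled to counts is simply a running total of quantity over sorted prices, and the complementary version runs that total from the other end. As a minimal sketch with made-up price levels (the helper name depth is our own), the two curves of the depth chart can be computed by hand as follows:

```python
from itertools import accumulate

# Made-up (price, quantity) levels; the real ones live in frames["bids"] and frames["asks"].
bids = [(1056.58, 2.0), (1056.50, 3.0), (1056.10, 1.5)]
asks = [(1056.64, 1.0), (1056.70, 2.5), (1056.90, 4.0)]


def depth(levels, descending):
    """Return (price, cumulative quantity) pairs, walking away from the top of the book."""
    ordered = sorted(levels, key=lambda level: level[0], reverse=descending)
    cumulative = accumulate(quantity for _, quantity in ordered)
    return [(price, total) for (price, _), total in zip(ordered, cumulative)]


bid_depth = depth(bids, descending=True)   # accumulate from the highest bid downwards
ask_depth = depth(asks, descending=False)  # accumulate from the lowest ask upwards

print(bid_depth)  # [(1056.58, 2.0), (1056.5, 5.0), (1056.1, 6.5)]
print(ask_depth)  # [(1056.64, 1.0), (1056.7, 3.5), (1056.9, 7.5)]
```

Each pair reads as "this much quantity is available at this price or better", which is what the depth chart displays on either side of the spread.
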
Louis Tiao
PhD Candidate

Probabilistic machine learning and artificial intelligence.