A study exploring the efficacy of sophisticated vs naïve market making in both experimental and real-world environments
By Miles Child
DRAFT
This study explores the efficacy, measured by trading PnL and aggregate trading volume, of a naïve and a slightly more sophisticated market-making strategy. Naïveté, in this study, is defined as making no assumptions about the true value of the underlying security. We deploy these strategies first in an artificial trading environment and second on a real binaries exchange, Kalshi, and find that __. Many studies have examined theoretically efficient market-making techniques, both when the market-making agent has access to exogenous information that can be used to price the underlying asset and when it does not. Few have applied their techniques in real trading environments. Binary market exchanges like Kalshi present novel pricing environments where the price of any given contract suggests a market-implied probability of that contract's event occurring. In some instances, like "Will the end-of-day S&P 500 index value for December 27, 2023 be between 4750-4774.99", computing a theoretical probability (price) of a contract is relatively simple using primary exchange data. In other instances, like "Will Kanye West release a new album on Spotify by Jan 12, 2024?", estimating this probability is considerably more difficult, and the price is better approximated by a low-mean, high-standard-deviation jump process in which information is incorporated into contract prices rarely, noisily, and violently. As such, spreads (and rewards to agents willing to provide liquidity) tend to be thin on easily-estimated contracts and wide on others. In this study, we attempt to develop a strategy that is efficient at providing liquidity in complex binary environments.
This paper was inspired by Sanmay Das's thesis Intelligent Market-Making in Artificial Financial Markets [1], and some of Das's findings are reused here.
- Brief Microstructure Overview
- The Naïve Market Maker
- Inventory Control
- The Sophisticated Naïve Market Maker
- Artificial Experimental Setup
- Artificial Experimental Results
- Real-World Experimental Expectations & Setup
- Real-World Experimental Results
- Conclusion
- References
I will briefly define a few microstructure-related terms here to avoid confusion among readers newer to the topic.
- Market Making: Posting limit orders at the bid and ask on an asset with the intention of making a large number of trades and profiting as a function of your spread and trading volume. In some scenarios, the exchange may also pay you rebates (as a function of trading volume and, sometimes, your spread) for providing liquidity.
- Naïve Market Making: Market making with no exogenous information about the fair price of the underlying security. For the sake of this paper, I consider a market maker to be naïve if it either makes no assumptions about the true value of the asset it trades or estimates a theoretical true value using only orderbook, trade, or other order flow information.
- Orderbook: The orderbook we use here tracks every price level at which market participants have posted limit orders to buy or sell, along with the quantity resting at each level. Our orderbook uses price-time priority: the earliest limit order at the best price is filled first by an incoming order that crosses that price level.
For simplicity, our trading environment is assumed to be made up of three classes of participants (aside from market makers):
- Informed Traders: Informed traders are assumed to have, at any given time, a perfect notion of the true value of the asset in question. They trade this asset rationally at all times, buying when the best ask is below the true value and selling when the best bid is at or above the true value. Informed traders are toxic to market makers because they will only trade when the market maker's bid is above, or its ask is below, the true value.
- Gaussian Informed Traders: Gaussian informed traders are a slight modification of informed traders and are assumed to have, at any given time, a noisy signal $S$ regarding the true value $V_T$ of the underlying stock that can be represented as $S = V_T + \mathcal{N}(0, \sigma)$, where $\mathcal{N}(0, \sigma)$ is a sample from a normal distribution with mean 0 and variance $\sigma^2$. They trade against the best bid and ask in the same manner as informed traders, but because their $V_T$ signal is noisy, it is possible for them to buy when the best offer is actually above the true value (or sell when the best bid is below the true value). We will use Gaussian informed traders as a substitute for perfectly informed traders because they probably represent the real-life informed trading class more accurately.
- Gaussian (or noisy) Traders: Gaussian traders trade against the best bid and ask in the same way as Gaussian informed traders, but their $V_T$ signal is pure noise. In other words, they have, at any given time, a signal $S$ that can be represented as $S = \mathcal{N}(0, \sigma)$, where $\mathcal{N}(0, \sigma)$ is a sample from a normal distribution with mean 0 and variance $\sigma^2$. This trading class trades completely at random and has no accurate notion of the true value at any given time.
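The three trader classes above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the simulation code used in this study: the function names `trader_signal` and `decide` are hypothetical, and each trader is assumed to act on a single signal per auction.

```python
import random

def trader_signal(kind, true_value, sigma, rng):
    """Return the value signal S a trader of the given class observes.

    `kind` is a hypothetical label: "informed", "gaussian_informed" or "noise".
    """
    if kind == "informed":
        return true_value                          # perfect knowledge of V_T
    if kind == "gaussian_informed":
        return true_value + rng.gauss(0.0, sigma)  # S = V_T + N(0, sigma)
    if kind == "noise":
        return rng.gauss(0.0, sigma)               # S = N(0, sigma), as defined above
    raise ValueError(f"unknown trader class: {kind}")

def decide(signal, best_bid, best_ask):
    """Buy when the best ask is below the signal, sell when the best bid
    is at or above it, otherwise do nothing."""
    if best_ask < signal:
        return "buy"
    if best_bid >= signal:
        return "sell"
    return None

rng = random.Random(42)
signal = trader_signal("gaussian_informed", true_value=60.0, sigma=2.0, rng=rng)
action = decide(signal, best_bid=48, best_ask=52)
```

Keeping the signal and the decision rule separate makes it easy to swap trader classes without touching the trading logic.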
The naïve market-making strategy we use here is an adaptation of the Kalshi-adapted Python NMM that can be found at the GitHub repository here. The NMM is a simple liquidity provider whose goal is to always post a bid and ask on a single security with a fixed spread and a fixed trading quantity that the user defines at runtime. The naïve market maker is initialized with a fixed spread σ that is symmetrical about the market's last trading price.
The NMM's behavior will be the same regardless of the volatility of the underlying asset, the presence of other market-making agents, or the underlying portfolio exposure. This model is very simple to implement but has an extensive list of clear flaws that would likely prevent it from achieving positive or Nash PnL in most (if not all) trading environments. The NMM's most critical flaws are as follows:
(1) Bid and ask will always be a lagging indicator of "true value" because the NMM's bid/ask placement only moves as a function of the last trading price. This means that in times of price discovery, the NMM's price adjustment will at best be linear while informed traders continuously lift the toxic side of the spread.
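As a rough illustration of the quoting rule just described, the sketch below centers a fixed integer spread on the last traded price. It is not the repository's implementation: the name `nmm_quotes` is hypothetical, prices are assumed to be integer cents, and the 1 to 99 cent clamp is an assumption about the tradable price range.

```python
def nmm_quotes(last_price, spread, qty, lo=1, hi=99):
    """Return (bid, ask, qty) for a fixed spread centered on the last trade.

    Assumes integer cent prices and spread <= hi - lo.
    """
    bid = last_price - spread // 2
    # Keep both quotes inside the assumed tradable range while preserving the spread.
    bid = max(lo, min(bid, hi - spread))
    ask = bid + spread
    return bid, ask, qty

bid, ask, qty = nmm_quotes(last_price=50, spread=4, qty=10)
```

Note that nothing in this rule looks at volatility, competitors, or inventory, which is exactly the source of the flaws discussed in this section.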
(2) In competitive environments with other market-making agents, the NMM's behavior can easily be learned and exploited to ensure it only wins trades when flow is toxic. For example, in times where the true value is approximately known, other agents can constantly maintain the maximum spread that is tighter than the NMM's. In times of information volatility when there is an unknown delta in the true value, other agents can widen their spreads so the NMM wins every trade and bears the brunt of all the informed trading and subsequent PnL losses. Then, when the true value becomes less uncertain, spreads can again be tightened.
(3) In an environment with no competition and any presence of informed traders, we can always expect the NMM to have negative PnL as spreads are not dynamic to changes in true value uncertainty and inventory will always be accumulated in the toxic direction during times of price discovery.
(4) The NMM is not proactive about mitigating inventory exposure and incentivizing turnover. We can thus expect PnL to be an inverse function of inventory accumulation in the presence of informed traders.
To address the NMM's position exposure and inefficient pricing adjustment flaws, we will introduce a simple inventory-control mechanism below before discussing the sophisticated naïve market-maker.
As any market-making agent provides liquidity, it will tend to accumulate a positive or negative position in the underlying asset that can interfere with its ability to remain market-neutral and generate consistent liquidity reward revenues. This accumulated position of shares, either long (positive) or short (negative), is called inventory and we will assume it is in our best interest to maintain zero inventory while maximizing trading volume. Thus, we need an inventory control mechanism that makes it increasingly difficult to buy shares and easy to sell shares when our inventory is long and the opposite when inventory is short.
Spread placement is improved via the below inventory control mechanism, which computes an adjustment (in cents) to apply to both sides of the spread that effectively makes accumulating inventory in the direction (long/short) of current inventory increasingly difficult up to a parameterized maximum amount.
The inventory control mechanism is as follows:
Where,
- $f(i)$ is the spread adjustment in cents
- $M$ is the maximum allowed spread adjustment in cents
- $i$ is the integer value of current inventory holdings (negative for short)
- $a$ is a scaling factor for the exponential function (use higher $a$ when you are more averse to inventory accumulation)
For example, when the max adjustment is set to 10 cents and
Figure 1: Inventory Control Mechanism
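To make the behavior of such a mechanism concrete, here is a minimal sketch of one exponential form that matches the variable definitions above: bounded by $M$, scaled by $a$, and pushing quotes against the current inventory. The exact functional form is an assumption chosen for illustration and may differ from the formula used in this study; the name `inventory_adjustment` is hypothetical.

```python
import math

def inventory_adjustment(i, M, a):
    """Hypothetical adjustment f(i) in cents, added to both bid and ask.

    Assumed form: f(i) = -sign(i) * M * (1 - exp(-a * |i|)).
    Long inventory (i > 0) lowers both quotes, making it harder to buy and
    easier to sell; short inventory does the opposite. |f(i)| never exceeds M.
    """
    if i == 0:
        return 0.0
    magnitude = M * (1.0 - math.exp(-a * abs(i)))
    return -math.copysign(magnitude, i)

# Adjustment grows quickly at first, then saturates toward the cap M.
adjustments = {i: inventory_adjustment(i, M=10, a=0.1) for i in (-20, -5, 0, 5, 20)}
```

A saturating form like this keeps the market maker quoting even at large inventories, at the cost of a weaker marginal push once the cap is nearly reached.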
In artificial/inventory_control.ipynb, we find the inventory-controlling naïve market maker to be considerably more efficient at managing portfolio exposure risk, with a demonstrated ability to maintain near-zero unrealized PnL throughout the entire trading session. Unsurprisingly, the non-inventory-controlling market maker becomes susceptible to significant positional PnL in times of true value volatility.
The inventory-controlling market maker also appears to be more efficient at tracking the underlying's true value, likely due to the expeditious effect the exponential inventory control adjustment has on price discovery.
Figure 2: Naïve Market Maker Performance, No IC
Figure 3: Inventory-Controlling Naïve Market Maker Performance
A more sophisticated NMM would be able to widen spreads (to avoid negative positional PnL) when the true value of the underlying asset is more uncertain and tighten them (to perform better in competition) when the true value is more certain. In order to do so, the sophisticated NMM would have to make two advancements over the NMM: (1) maintain a sense of the true value of the underlying asset while only using order flow data, and (2) incorporate that true value estimation into the bid
Das explores the possibility of deriving optimal bid and ask prices by using an online probability density estimator that stores, for each possible asset price level, the market maker's probability assumption that the true value
Our
Our
Before detailing the approximations for
Assume a mean value of 100 and standard deviation 10. For simplicity, assume the
In reality, this distribution would accumulate higher probabilities at more concentrated price levels and display far higher kurtosis than the above normal distribution. Before detailing the two mechanisms required for this
- We need to make an assumption about the proportion of all traders that are informed rather than noisy, which will be denoted by $\alpha$
- We need to make an assumption about the standard deviation of the Gaussian informed traders' information signal noise, denoted by $\sigma_W$ (variance $\sigma_W^2$)
- We need to make an assumption about the base probability of an order occurring at any given time (how often Gaussian traders make trades), denoted by $\eta$
- Finally, we need the initial true value of the security at the beginning of the simulation, denoted by $V_{T0}$
The full derivations for the bid and ask prices will not be detailed here but those interested can read about them in Das's paper linked in references.
First, Das's equation for the optimal bid price is:
Where:
- $P_{Sell}$ is the total probability of a sell order occurring. It is calculated by summing the probabilities of a sell order over each possible value of $V$
- $V_i$ ranges over the possible values of the security, from a minimum $V_{min}$ to a maximum $V_{max}$, which in our case is every integer from 0 to 100 (this is how binaries are priced on Kalshi)
- $Pr(V = V_i)$ is obtained from the probability density vector that we will describe shortly
The first summation calculates the expected value of the security for values from the minimum possible value to the bid price. This part uses a weight of
Similarly, for the ask price:
The probability density estimate is first initialized as a normal distribution about the initial true value,
If a buy order is received, each of the probabilities get updated with the following Bayesian update formula:
(1) For the first part of the numerator,
When
- For informed traders (represented by $\alpha$), the likelihood considers the probability that the noise in their estimate of the true value ($\tilde{\eta}(0, \sigma_W^2)$) is greater than the difference between the ask price $P_a$ and the true value $V_i$. This reflects the informed traders' propensity to buy when they believe the security is undervalued.
- For uninformed traders (represented by $(1 - \alpha)\eta$), the base probability $\eta$ of a trade happening is used, as these traders are assumed to trade randomly and not based on an informed estimate of the true value.
And when
(2) The second part of the numerator,
(3) The denominator,
Updating based on a sell order follows the exact same logic.
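The update logic described above can be sketched as follows. This is a minimal illustration that assumes the likelihood of an order at each candidate value $V_i$ is $(1-\alpha)\eta$ plus $\alpha$ times the probability that a Gaussian informed trader's noisy signal crosses the quoted price; the function names are hypothetical and the code is not taken from this study's repository.

```python
import math

def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def initial_density(v0, sd, levels):
    """Discretized normal density about the initial true value, normalized to 1."""
    weights = [math.exp(-0.5 * ((v - v0) / sd) ** 2) for v in levels]
    total = sum(weights)
    return [w / total for w in weights]

def update_density(density, levels, side, price, alpha, eta, sigma_w):
    """Bayesian update of Pr(V = V_i) after a buy at the ask or a sell at the bid."""
    posterior = []
    for v, prior in zip(levels, density):
        z = (price - v) / sigma_w
        if side == "buy":
            informed = 1.0 - norm_cdf(z)   # Pr(noise > P_a - V_i)
        elif side == "sell":
            informed = norm_cdf(z)         # Pr(noise < P_b - V_i)
        else:
            raise ValueError(f"unknown side: {side}")
        likelihood = (1.0 - alpha) * eta + alpha * informed
        posterior.append(likelihood * prior)
    total = sum(posterior)                 # the normalizing denominator
    return [p / total for p in posterior]

levels = list(range(0, 101))
density = initial_density(v0=50, sd=10, levels=levels)
density = update_density(density, levels, "buy", price=52,
                         alpha=0.4, eta=0.2, sigma_w=2.0)
```

Because a buy is more likely when the true value is high, each buy shifts probability mass toward higher price levels, and each sell does the reverse.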
Now, we have a more sophisticated naïve market maker that maintains a dynamic spread based on its estimation of the point-in-time true value of the underlying security given observed order flow. We also equip the aforementioned inventory control mechanism to the sophisticated NMM to incentivize inventory turnover and mitigate positional PnL. The inventory control adjustment is applied to the bid and ask prices after the above
We can expect this evolution of the naïve market maker to be much more effective in competitive environments, more responsive to changes in the true value of the underlying security, and considerably better at tracking that true value. We will test these expectations in the following sections.
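One simple way to turn a probability density into quotes is to search the price grid for the bid closest to $E[V \mid \text{Sell}]$ and the ask closest to $E[V \mid \text{Buy}]$, then shift both by the inventory adjustment. The sketch below does this by brute force. It is an assumption-laden illustration rather than Das's or this study's exact procedure: the names `conditional_mean` and `quotes` are hypothetical, and `inventory_shift` is assumed to be an integer number of cents.

```python
import math

def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def conditional_mean(density, levels, side, price, alpha, eta, sigma_w):
    """E[V | an order on `side` arrives at `price`].

    Requires alpha < 1 and eta > 0 so the denominator is strictly positive.
    """
    num = den = 0.0
    for v, p in zip(levels, density):
        z = (price - v) / sigma_w
        informed = norm_cdf(z) if side == "sell" else 1.0 - norm_cdf(z)
        weight = ((1.0 - alpha) * eta + alpha * informed) * p
        num += weight * v
        den += weight
    return num / den

def quotes(density, levels, alpha, eta, sigma_w, inventory_shift=0):
    """Pick the grid prices nearest each fixed point, then apply the shift."""
    def gap(side):
        return lambda price: abs(
            conditional_mean(density, levels, side, price, alpha, eta, sigma_w) - price
        )
    bid = min(levels, key=gap("sell"))   # bid ~ E[V | Sell at bid]
    ask = min(levels, key=gap("buy"))    # ask ~ E[V | Buy at ask]
    lo, hi = levels[0], levels[-1]
    clamp = lambda x: max(lo, min(hi, x))
    return clamp(bid + inventory_shift), clamp(ask + inventory_shift)

levels = list(range(0, 101))
raw = [math.exp(-0.5 * ((v - 50) / 10) ** 2) for v in levels]
density = [w / sum(raw) for w in raw]
bid, ask = quotes(density, levels, alpha=0.4, eta=0.2, sigma_w=2.0)
```

A wider density (more uncertainty about the true value) moves the two fixed points apart, which is how the spread widens automatically when the market maker is less sure of the price.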
The artificial experimental framework is setup as follows: a single or multiple market makers are dropped at the first auction in a contrived single-security universe with
At each auction, the Gaussian traders trade randomly with probability
First, we will compare the trading efficiency of the NMM versus the sophisticated NMM in a non-competitive, single market-maker environment with a frequent, low-standard-deviation jump process and the following variables:
Definition | Value |
---|---|
The total number of market making agents | 1 |
The total number of auctions | 10000 |
The total number of traders | 10 |
Proportion of total traders that are Gaussian-informed ($\alpha$) | 0.4 |
Standard deviation of Gaussian-informed noise distribution ($\sigma_W$) | 2 |
Probability of a noise trader placing a trade during any auction ($\eta$) | 0.2 |
Standard deviation of the jump process | 5 |
Probability of jump in the true value at each auction | 0.05 |
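For intuition about what these jump parameters imply, the sketch below simulates a true-value path with a 5% per-auction jump probability and a jump standard deviation of 5. It is an illustrative sketch only: the function name `simulate_true_value`, the zero-mean Gaussian jump size, the starting value of 50, and clipping to the 0 to 100 range are all assumptions rather than details taken from this study's simulator.

```python
import random

def simulate_true_value(v0, n_auctions, jump_prob, jump_sd,
                        seed=None, lo=0.0, hi=100.0):
    """Return the true value at each auction under a simple jump process."""
    rng = random.Random(seed)
    value = v0
    path = []
    for _ in range(n_auctions):
        if rng.random() < jump_prob:
            # Zero-mean Gaussian jump, clipped to the assumed price bounds.
            value = min(hi, max(lo, value + rng.gauss(0.0, jump_sd)))
        path.append(value)
    return path

path = simulate_true_value(v0=50.0, n_auctions=10000,
                           jump_prob=0.05, jump_sd=5.0, seed=7)
```

With these settings the true value moves roughly once every 20 auctions, so a market maker sees only a handful of trades between consecutive changes in the price it is trying to track.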
We find that in an environment where the true value evolves frequently via a low-standard-deviation jump process, both the NMM and SNMM can track the true value using only order flow information, with the SNMM tracking much more effectively than the NMM (figures 5, 7). The NMM's inventory control mechanism is considerably more effective at incentivizing inventory turnover; however, the SNMM's performance premium outweighs its weaker ability to maintain low inventories (figures 6, 8).
Figure 5: Naïve Market Maker Performance - Experiment 1
Figure 6: Naïve Market Maker Realized & Unrealized PnL - Experiment 1
Figure 7: Sophisticated Naïve Market Maker Performance - Experiment 1
Figure 8: Sophisticated Naïve Market Maker Realized & Unrealized PnL - Experiment 1
Our goal, however, is ultimately to make markets in environments where information is disseminated much less frequently and with much larger resulting price changes. In experiment two, we adjust the jump process to this effect with the following parameter updates:
Definition | Value |
---|---|
Standard deviation of the jump process | 50 |
Probability of jump in the true value at each auction | 0.0001 |
Experimental results uncover a significant flaw in the sophisticated market maker's behavior: in environments with infrequent but violent jumps, spreads fail to track the true value of the underlying security (figure 11). This is a result of saturation of the SNMM's probability density function: after so many trades at or around the same level, the probability assigned to any price outside that level becomes so small that it takes hundreds (sometimes thousands) of auctions for spreads to update after a large change in the true value.
Experiment two also exemplifies just how terribly the Naïve Market Maker can perform when instructed to maintain tight spreads but still incentivize inventory turnover. It is effective at managing inventory, but its inventory control forces it to oscillate about the true value in the opposite direction of its accumulated inventory. In other words, if it buys at or under the true value, the IC mechanism is going to force it to decrease its asking price until it successfully covers below the price it bought at.
Figure 9: Naïve Market Maker Performance - Experiment 2
Figure 10: Naïve Market Maker Realized & Unrealized PnL - Experiment 2
Figure 11: Sophisticated Naïve Market Maker Performance - Experiment 2
Figure 12: Sophisticated Naïve Market Maker Realized & Unrealized PnL - Experiment 2
In omniscient_snmm.ipynb, we find that notifying the market maker of jumps in the true value without specifying the amount dramatically increases its ability to track the true value of the underlying security. We develop a simple jump detection module (JDM) that observes public trading data and detects anomalies in the ratio of buy orders to sell orders in recent transaction history to trigger signals that the SNMM uses to re-center its probability density function. Figures 13 and 14 display the effect of the JDM's signals on the SNMM's true value tracking ability.
Figure 13: Sophisticated Naïve Market Maker Performance - No Jump Detection
Figure 14: Jump-Detecting Sophisticated Naïve Market Maker Performance
An interesting direction for future work would be a more sophisticated, data-driven jump detection module. To avoid overcomplicating this paper, we have kept the jump detection solution simple; the implementation can be seen in omniscient_snmm.ipynb and jdm.py.
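To illustrate the general idea of detecting jumps from the buy/sell ratio, here is a minimal rolling-window sketch. It is not the code in jdm.py: the class name `JumpDetector`, the window length, and the threshold are hypothetical choices made for illustration.

```python
from collections import deque

class JumpDetector:
    """Flag a suspected jump when recent order flow is heavily one-sided."""

    def __init__(self, window=50, threshold=0.8):
        self.window = window        # number of recent trades to inspect
        self.threshold = threshold  # buy (or sell) fraction that triggers a signal
        self.sides = deque(maxlen=window)

    def observe(self, side):
        """Record one public trade ("buy" or "sell").

        Returns "up", "down", or None when no anomaly is detected.
        """
        self.sides.append(1 if side == "buy" else 0)
        if len(self.sides) < self.window:
            return None             # not enough history yet
        buy_fraction = sum(self.sides) / self.window
        if buy_fraction >= self.threshold:
            direction = "up"
        elif buy_fraction <= 1.0 - self.threshold:
            direction = "down"
        else:
            return None
        self.sides.clear()          # avoid re-signaling on the same burst
        return direction

detector = JumpDetector(window=20, threshold=0.8)
signals = [detector.observe(s) for s in ["buy", "sell"] * 20 + ["buy"] * 20]
```

On a signal, the market maker could re-center or widen its probability density. The threshold trades off detection speed against false alarms from random runs of noise-trader orders.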
In competitive_auctions.ipynb, we introduce naïve and sophisticated naïve market makers to competitive environments and find promising results, particularly from SNMM-on-SNMM competition. SNMM profits and total volume decrease when a naïve market maker is introduced; however, SNMM inventory turnover increases when an NMM is available to assist in price discovery (tables 1, 2). The SNMM still wins considerably more trades than the NMM as a result of its more competitive median spread (table 2). Interestingly, a two-SNMM competitive environment appears to be less harmful to per-trade PnL than the SNMM + NMM environment (tables 2, 3), with total volume split essentially evenly between the two SNMMs (table 3). We can also see that NMM per-trade losses shrink as their spreads widen (table 4).
Table 1: Single-SNMM Environment
Market-Maker | Total Trades | Average PnL Per-Trade | Median Spread |
---|---|---|---|
SNMM | 29810 | 0.16 | 1.0 |
Table 2: SNMM + NMM Environment
Market-Maker | Total Trades | Average PnL Per-Trade | Median Spread |
---|---|---|---|
SNMM | 22109 | 0.10 | 1.0 |
NMM | 8438 | 0.15 | 2.0 |
Table 3: Double-SNMM Environment
Market-Maker | Total Trades | Average PnL Per-Trade | Median Spread |
---|---|---|---|
SNMM_1 | 15288 | 0.14 | 1.0 |
SNMM_2 | 15459 | 0.14 | 1.0 |
Table 4: Double-NMM Environment
Market-Maker | Total Trades | Average PnL Per-Trade | Median Spread |
---|---|---|---|
NMM_1 | 30015 | -1.13 | 2.0 |
NMM_2 | 760 | -0.25 | 6.0 |
Many of the parameters required to implement Das's model are not realistically attainable in a real trading environment and, as such, some estimation and slight modifications have been made prior to live deployment. The following modifications to the theoretical framework have not been exhaustively deliberated, and there is likely much room for improvement among them.
Position Sizes:
The lack of position sizes in Das's model is quite problematic given the wide distribution of trade sizes in real trading environments. First, updates to the probability density function should logically be made in proportion to the relative size of the buy or sell trade, where larger trades have more impact on the probability update. Second, the inventory control mechanism must account for greater-than-one lot trading sizes when the market maker is configured to post bids/offers with greater than one quantity.
We resolve the probability density function update issue by likening the single-qty trades in the experimental framework to 100-lot trades in practice and multiply each PDF update by
The inventory control mechanism issue is easily resolved by adding a trade_qty parameter to the SNMM's spread module and dividing the cur_inv by the SNMM's configured trade_qty when calculating inventory adjustments.
It is also worth noting that an edge can probably be gained by estimating whether an arbitrary trade is informed or random given its relative position size. If possible, this could remove the need to make assumptions about the proportion of informed traders, as new Bayesian updates could be easily derived given the knowledge that a trade is informed or random (or the probability of a single trade being informed could be used in the updates rather than blanket assumptions about random / informed trading frequency and noise). This is an interesting direction for future work.
Initial True Value (
The theoretical model requires that we provide the market maker with the true value and standard deviation of the true value distribution at the first auction. Obviously, there is no such thing as a "true value" in practice and the best we can do is arrive at a sensible estimate based on the data available to us. When historical trading data is available, we subject the market maker to a warm-up period on recent trading history in which we can calculate the mean historical trading value and standard deviation of trading values, which we use as
Proportion of Informed Traders (
The remaining theoretical framework variables are the least realistic to obtain and most difficult to estimate in practice.
However, a historical optimization on past trading data from baskets of similar markets reveals that SNMM performance tends to be best when ______.
As such, in _____ we develop a historical optimization on
Our historical optimization suggests that, in environments where the true value evolves frequently and with small jumps, the following parameters should be used:
Definition | Value |
---|---|
Proportion of total traders that are Gaussian-informed | |
Standard deviation of Gaussian-informed noise distribution | |
Probability of a noise trader placing a trade during any auction | |
Standard deviation of the jump process | |
And in environments with large and infrequent jumps, these parameters should be used:
Definition | Value |
---|---|
Proportion of total traders that are Gaussian-informed | |
Standard deviation of Gaussian-informed noise distribution | |
Probability of a noise trader placing a trade during any auction | |
Standard deviation of the jump process | |
Market Selection Evaluation Methodology
[1] Das, S. (2003). Intelligent Market-Making in Artificial Financial Markets.