How We Reverse Engineered The Chicago Mercantile Exchange
By Dendi Suhubdy, Co-Founder and CEO of Bitwyre Technologies
The exchange is at the heart of our economic system. From the Rialto in Venice (fourteenth century), the Grand Bazaar in Turkey (fifteenth century) and the Amsterdam Bourse in Holland (seventeenth century) to the New York Stock Exchange (eighteenth century), markets have been purpose-built as places for buyers and sellers of goods and services to meet and transact. After the invention of computers, the marketplace went online. A spice trader in the Dutch East Indies (now modern-day Indonesia) used to wait about a month to hear prices from the Amsterdam Bourse; today the same information arrives in a fraction of a second through automated, computerized trading systems. The result is better overall price discovery for buyers and sellers.
Building such a centralized computerized trading platform is not an easy task. We thought that since the matching engine is at the core of so many economic activities, there must be a step-by-step tutorial somewhere on the Internet for building a trading system from scratch. Our assumption turned out to be completely wrong. Building an exchange requires knowledge spanning multiple distinct fields, such as game theory (mathematics), algorithms and data structures (computer science), and market microstructure (finance). Inspired by Cinnober (part of Nasdaq) and LSE, which operates the Oslo Bourse exchange, our team decided to rebuild the foundations from the ground up: we needed to build a financial exchange from scratch.
The tutorials that you find on the web are geared towards high-frequency market making, or towards proprietary buyers and sellers on a financial exchange. In financial lingo, the tutorials are for the buy side/sell side and not for the exchange side: they cover how you listen to price movements, make a decision, and send an automated order (either to buy or to sell a financial instrument) to the exchange. We never found any information on how you would receive orders from clients, match them, and run the proper clearing mechanisms after a trade has happened, i.e. the clearing process. Nor did we find how, after a trade happens, you make sure each market participant correctly hears about the resulting price changes, i.e. the price data feed.
We call this the centralized exchange problem and, truth be told, it has been solved efficiently by large financial groups such as the National Association of Securities Dealers Automated Quotations Exchange (NASDAQ), the New York Stock Exchange (NYSE), and the Chicago Mercantile Exchange (CME) Group. One particular implementation of a financial exchange intrigued us: the CME. The CME is different because the platform not only has standardized spot trading, but also standardized options, futures and spread markets with implied orders functionality.
Many technology groups have solved the spot market fairly easily. You might have heard of several cryptocurrency exchanges springing up around the globe, such as BitMEX, Binance, Bitfinex and Coinbase. There you can easily trade, with leverage in the range of 1-100x, financial products such as perpetual inverse swaps (which are basically spot on leverage), or directly buy and sell cryptocurrencies such as Bitcoin and Ethereum.
Building A Spot Exchange
Before explaining our solution, let’s cover how a typical spot exchange operates. Multiple clients send a sequence of orders, each with a quantity and a price. Each order then goes on the following journey (which also determines the round trip time (RTT) of an exchange, more on that later):
1. Receive the order in the gateway and timestamp it
2. Send the order to the risk management system and check whether the client has enough balance
3. If the client does, send the order to the matching engine; if not, reject the order
4. The matching engine then emits events such as trade events, order events and depth events
5. The events then get passed to several other parts of the exchange
6. The market data feed engine takes the trades and depth and blasts them out to the clients
7. The spot clearing system takes both sides of each trade and debits and credits the account of each trader
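The first four steps of the list above can be sketched in a few lines of Python. This is a minimal illustration and not Bitwyre's implementation: the `Order`, `SpotBook` and `process` names are hypothetical, prices are integer ticks, only limit orders are handled, and the risk check only looks at a buyer's cash balance.

```python
import time
from collections import deque
from dataclasses import dataclass

@dataclass
class Order:
    order_id: int
    account: str
    side: str       # "buy" or "sell"
    price: int      # price in integer ticks, to avoid float rounding
    qty: int
    ts_ns: int = 0  # filled in by the gateway

class SpotBook:
    """Single-instrument limit order book with price-time priority (illustrative)."""

    def __init__(self):
        self.bids = {}  # price -> FIFO queue of resting buy orders
        self.asks = {}  # price -> FIFO queue of resting sell orders

    def submit(self, order):
        """Match an incoming limit order and return the list of trade events."""
        trades = []
        if order.side == "buy":
            own, opposite, best = self.bids, self.asks, min
            crosses = lambda p: p <= order.price
        else:
            own, opposite, best = self.asks, self.bids, max
            crosses = lambda p: p >= order.price
        while order.qty > 0 and opposite:
            top = best(opposite)          # best opposing price level
            if not crosses(top):
                break
            queue = opposite[top]
            resting = queue[0]            # oldest order at that level goes first
            fill = min(order.qty, resting.qty)
            trades.append({"price": top, "qty": fill,
                           "taker": order.order_id, "maker": resting.order_id})
            order.qty -= fill
            resting.qty -= fill
            if resting.qty == 0:
                queue.popleft()
            if not queue:
                del opposite[top]
        if order.qty > 0:                 # rest the unfilled remainder
            own.setdefault(order.price, deque()).append(order)
        return trades

def process(order, balances, book):
    """Gateway timestamp, simplified risk check, then matching."""
    order.ts_ns = time.time_ns()                       # step 1
    cost = order.price * order.qty
    if order.side == "buy" and balances.get(order.account, 0) < cost:
        return {"status": "rejected", "trades": []}    # steps 2-3: reject
    return {"status": "accepted", "trades": book.submit(order)}  # steps 3-4
```

A real gateway would also check a seller's inventory, reserve balances while orders rest, and publish order and depth events alongside the trades.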
Bear in mind that the clearing system at the end takes the counterparty position of the client’s trade and delivers the financial instrument. Say a prop trader buys 3000 shares of Goldman Sachs ($GS) @ 207.36 USD, and a hedge fund in New York sells 3000 shares of $GS, also @ 207.36 USD. The clearing system will then debit 622,080 USD from the trader’s account and credit the trader with 3000 $GS shares, and vice versa: it credits 622,080 USD to the hedge fund and debits 3000 $GS shares from it. The spot exchange clearing engine solves the centralized clearing problem, i.e. the clearing engine becomes the seller for every buyer and the buyer for every seller. This preserves market integrity and ensures the trade actually happens. The details of the clearing process differ between North America and Europe. In the US, clearing is done by DTCC, and in Europe it’s done by clients choosing a Central Counter Party (CCP).
Adding leverage to the equation makes this a little more complicated. Most centralized equity exchanges do not lend money to individual or institutional traders. To meet its clearing objective, the clearing house needs to ensure that the brokerage house lending to the trader can always pay the value of the order sent by the buyer. The same is true for a leveraged seller (short seller).
Now imagine there are no intermediaries like brokerage houses. For a leveraged trade to happen, the clearing house is now exposed to the credit default risk of the leveraged buyer or short seller. On each tick, the clearing house needs to listen for price changes and revalue the leveraged buyer’s or seller’s position, so that in the event of a drastic price change the centralized clearing house can take over the position and liquidate it on the open market. The price level at which a leveraged buyer or seller gets liquidated is called the liquidation price.
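The Goldman Sachs example can be checked with a short Python sketch. The `clear_spot_trade` function and the account layout are hypothetical and only illustrate the debit and credit bookkeeping; `Decimal` is used so that the cash amounts stay exact.

```python
from decimal import Decimal

def clear_spot_trade(accounts, buyer, seller, symbol, qty, price, cash="USD"):
    """Settle one matched trade by moving cash and shares between two accounts."""
    notional = Decimal(qty) * Decimal(price)
    if accounts[buyer].get(cash, 0) < notional:
        raise ValueError("buyer cannot pay for the trade")
    if accounts[seller].get(symbol, 0) < qty:
        raise ValueError("seller cannot deliver the shares")
    # Buyer: debit cash, credit shares.
    accounts[buyer][cash] -= notional
    accounts[buyer][symbol] = accounts[buyer].get(symbol, Decimal(0)) + qty
    # Seller: debit shares, credit cash.
    accounts[seller][symbol] -= qty
    accounts[seller][cash] = accounts[seller].get(cash, Decimal(0)) + notional
    return notional

accounts = {
    "prop_trader": {"USD": Decimal("1000000")},
    "hedge_fund": {"GS": Decimal(3000)},
}
notional = clear_spot_trade(accounts, "prop_trader", "hedge_fund", "GS", 3000, "207.36")
print(notional)  # 3000 x 207.36 = 622080.00
```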
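As a rough illustration of a liquidation price, the sketch below assumes a linear contract with isolated margin, where the initial margin is the entry notional divided by the leverage and the position is liquidated once the remaining equity falls to a maintenance margin rate times the entry notional. These formulas, the function names and the 0.5% default rate are assumptions made for the example, not the rules of any particular exchange.

```python
def liquidation_price(entry_price, leverage, side, maintenance_margin_rate=0.005):
    """Liquidation price under the simplified isolated-margin model described above."""
    if leverage < 1:
        raise ValueError("leverage must be at least 1")
    if side == "long":
        # Equity per unit = entry/leverage + (price - entry); liquidate at mmr * entry.
        return entry_price * (1 - 1 / leverage + maintenance_margin_rate)
    if side == "short":
        # Equity per unit = entry/leverage - (price - entry); liquidate at mmr * entry.
        return entry_price * (1 + 1 / leverage - maintenance_margin_rate)
    raise ValueError("side must be 'long' or 'short'")

def should_liquidate(mark_price, liq_price, side):
    """Check on every tick whether the clearing house has to take over the position."""
    return mark_price <= liq_price if side == "long" else mark_price >= liq_price

long_liq = liquidation_price(10_000, 10, "long")    # about 9050
short_liq = liquidation_price(10_000, 10, "short")  # about 10950
```

Higher leverage moves the liquidation price closer to the entry price, which is why the revaluation has to run on every tick.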
Building A Derivatives Exchange
Now imagine this matching system, but for derivatives. A derivative is a financial contract that is tied to an underlying price. For example, an option on Goldman Sachs ($GS) equity is a right to purchase (or sell) Goldman Sachs ($GS) at a later date (for example in a month) at a specific price fixed now (called the strike price).
The main difference between a spot exchange and a derivatives exchange is that the latter must mark to market every position of every trader on the exchange. For a computer scientist this is a simple matrix computation, but imagine running it for every single tick of market data.
Revisiting the spot market order matching process, steps 1-7 look similar but with a few extra steps.
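To make the matrix view concrete, here is a small pure-Python sketch. It assumes one row per trader and one column per instrument, with signed quantities (negative for shorts) and an entry price per position; the function name and the data layout are hypothetical.

```python
def mark_to_market(positions, entry_prices, marks):
    """Unrealized PnL per trader: sum over instruments of qty * (mark - entry)."""
    return [
        sum(q * (m - e) for q, e, m in zip(qty_row, entry_row, marks))
        for qty_row, entry_row in zip(positions, entry_prices)
    ]

positions = [[2, -1],     # trader 0: long 2 of instrument A, short 1 of B
             [0, 3]]      # trader 1: flat A, long 3 of B
entry_prices = [[100, 50],
                [0, 40]]
marks = [110, 45]         # latest mark prices for A and B

pnl = mark_to_market(positions, entry_prices, marks)
print(pnl)  # [25, 15]
```

Each row is independent of the others, so the same calculation can be spread across cores or vectorised once the number of traders and instruments grows.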
1. Receive the order in the gateway and timestamp it
2. Send the order to the risk management system and check whether the client has enough balance
3. If the client does, send the order to the matching engine
4. The matching engine then emits events such as trade events, order events and depth events
5. The events then get passed to several other parts of the exchange
6. The market data feed engine takes the trades and depth and blasts them out to the clients
7. The derivative clearing system does a couple of things:
   - Take the latest trade, compute volatility projections, price bands (moving average) and several other metrics, and recompute the value of the portfolio of every trader on the exchange
   - Traders whose positions are too risky, or are not covered by their margin balance, get liquidated
   - On liquidation, the position is taken over by the independent clearing house and sold immediately
   This is all done asynchronously and is called post-trade processing.
8. On the delivery date, the derivative clearing house makes sure the seller of the contract delivers on it, by cash or physical settlement depending on the contract
Algorithmically, this is one step harder. To make step 7 efficient, one could use parallel computation methods such as GPGPU computing.
Building Derivatives Pipeline for Implied Markets
If you have traded on the Chicago Mercantile Exchange’s Globex system, you may well admire the beauty of how the whole architecture works, especially around a feature called “implied markets” (for more on implied markets see https://www.cmegroup.com/education/videos/implied-markets-video.html).
According to the CME’s Confluence site (https://www.cmegroup.com/confluence/display/EPICSANDBOX/Implied+Orders), several rules govern the quantity of an implied order sent to the market:
- Implication requires at minimum two orders in related markets in the proper combination.
- Implied bids do not trade against implied offers.
- Implied bids can exist at the same or inverted price levels with implied offers. When this occurs, CME Globex ceases dissemination of the implied bid; however, the bid is still calculated and can be traded against.
- Implied OUT quantity will not disseminate when the leg value of the included spread is in a ratio greater than one. The price and quantity are still calculated and can be traded against.
- Implied quantity in futures markets does not have time priority.
- Implied quantity is not available:
   - When the market is not in a matching state (e.g. Pre-Open)
   - When implied calculation has been suspended
An implied market involves two different calendar futures. Since we are in July right now, say these are the N20 (July 2020) and Q20 (August 2020) delivery months. You then have a third market, the price differential between the N20 and Q20 order books, which is called the N-Q20 calendar spread.
To solve the implied market problem, one needs three instances of the matching engine that are asynchronous and event-driven. Each listens not only to incoming orders from the risk checking system but also makes sure that the calendar spread order book is updated. Likewise, an order coming into the calendar spread order book must cause orders inside both outright matching engines to be added or removed accordingly.
Source: https://www.cmegroup.com/education/videos/implied-markets-video.html
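As a small worked example of how outright orders imply a spread quote, the sketch below assumes the calendar spread is priced as the front leg minus the back leg with a 1:1 leg ratio. It ignores the dissemination and priority rules listed above, and the function name and data layout are hypothetical.

```python
def implied_spread_quotes(front, back):
    """Derive implied calendar spread quotes from the best outright quotes.

    front and back map "bid"/"ask" to a (price, qty) tuple, or to None when
    that side of the outright book is empty.
    """
    bid = ask = None
    # Buying the spread = buying the front leg and selling the back leg, so an
    # outright front bid plus an outright back ask imply a spread bid.
    if front.get("bid") and back.get("ask"):
        (f_price, f_qty), (b_price, b_qty) = front["bid"], back["ask"]
        bid = (f_price - b_price, min(f_qty, b_qty))
    # Selling the spread = selling the front leg and buying the back leg.
    if front.get("ask") and back.get("bid"):
        (f_price, f_qty), (b_price, b_qty) = front["ask"], back["bid"]
        ask = (f_price - b_price, min(f_qty, b_qty))
    return {"bid": bid, "ask": ask}

n20 = {"bid": (100, 5), "ask": (101, 4)}
q20 = {"bid": (97, 3), "ask": (98, 10)}
print(implied_spread_quotes(n20, q20))  # {'bid': (2, 5), 'ask': (4, 3)}
```

The implied quantity is capped by the smaller leg, which is one reason every change in either outright book has to be propagated to the spread book.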
Bitwyre’s Solution to the Implied Markets Problem
It took us a while to realize that a financial exchange is an asynchronous, event-driven system, and that building one means keeping in mind that doing things asynchronously is different from doing things synchronously.
We decided to build an algorithmically controlled system for passing orders to the matching engines. This algorithmic microservice is called the Credit Risk Account Management (CRAM) service. Internally our engineers call it “The Oracle”, for the sake of the joke :)
1. Receive the order in the gateway and timestamp it
2. Send the order to the risk management system and check whether the client has enough balance
3. If the client does, the Oracle does a check:
   - If it’s a normal, non-implied market order, send it to its respective matching engine
   - If it’s tied to an implied market, say a calendar future and a calendar spread, send the order asynchronously both to the calendar future matching engine and to the calendar spread matching engine
   At the CME this is done synchronously: each order is processed as a single transaction, which requires a “lock” on both matching engines while the order is being processed. Our approach is experimental: the orders received by the Oracle and by the different calendar matching engines carry strong ordering and timing guarantees for consuming and producing the messages and orders that relate to the other matching engine.
4. The matching engine then emits events such as trade events, order events and depth events
5. The events then get passed to several other parts of the exchange
6. The market data feed engine takes the trades and depth and blasts them out to the clients
7. The derivative clearing system does a couple of things:
   - Take the latest trade, compute volatility projections, price bands (moving average) and several other metrics, and recompute the value of the portfolio of every trader on the exchange
   - Traders whose positions are too risky, or are not covered by their margin balance, get liquidated
   - On liquidation, the position is taken over by the independent clearing house and sold immediately
8. On the delivery date, the derivative clearing house makes sure the seller of the contract delivers on it, by cash or physical settlement depending on the contract
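The routing step performed by the Oracle can be illustrated with `asyncio` queues. This is a toy sketch under stated assumptions and not the CRAM service itself: the `ROUTES` table, the message layout and the single global sequence number used to give every engine a consistent ordering are all hypothetical.

```python
import asyncio
from itertools import count

# Hypothetical routing table: which matching engines must see an order.
ROUTES = {
    "N20": ["N20", "N-Q20"],
    "Q20": ["Q20", "N-Q20"],
    "N-Q20": ["N-Q20", "N20", "Q20"],
    "BTCUSD": ["BTCUSD"],            # ordinary, non-implied market
}

async def oracle(inbox, engines):
    """Stamp each risk-checked order with a sequence number and fan it out."""
    seq = count(1)
    while True:
        order = await inbox.get()
        if order is None:                       # shutdown sentinel
            for queue in engines.values():
                await queue.put(None)
            return
        stamped = {**order, "seq": next(seq)}
        for name in ROUTES[order["symbol"]]:
            await engines[name].put(stamped)    # FIFO per engine

async def engine(name, queue, log):
    """Stand-in for a matching engine: records the order in which messages arrive."""
    while True:
        msg = await queue.get()
        if msg is None:
            return
        log.append((name, msg["seq"]))

async def demo():
    inbox = asyncio.Queue()
    engines = {name: asyncio.Queue() for name in ("N20", "Q20", "N-Q20", "BTCUSD")}
    log = []
    tasks = [asyncio.create_task(engine(n, q, log)) for n, q in engines.items()]
    tasks.append(asyncio.create_task(oracle(inbox, engines)))
    for symbol in ("N20", "BTCUSD", "N-Q20"):
        await inbox.put({"symbol": symbol, "side": "buy", "price": 100, "qty": 1})
    await inbox.put(None)
    await asyncio.gather(*tasks)
    return log

log = asyncio.run(demo())
```

Because one producer writes to each FIFO queue, every engine sees related orders in the same relative sequence without a lock being held across engines. In production a durable, ordered log would play the role of these in-process queues.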
If you look at the CME’s solution to this problem, it turns out we arrived at much the same design by accident, without having seen their implementation. In the Globex architecture below you can see the separation of each function, such as Order Entry, Drop Copy, Credit Control and Risk Management, Matching, Pricing and Market Integrity, Market Data, and Clearing. Each represents part of the algorithmic process in steps 1-8 that I described above. Be advised that since the CME does not directly handle end customers’ trades (they go through brokerage houses), its clearing system is not as computationally heavy as it would be if it did.
Source: Trading Services, CME Group Confluence page (https://www.cmegroup.com/confluence/display/EPICSANDBOX/Trading+Services)
Our core value proposition in building Bitwyre is this: there currently exists no market in the world where an institution or a retail speculator/hedger can rapidly liquidate or purchase large holdings of cryptocurrencies. To solve that problem we built several solutions:
1. Markets where trade volumes are hidden from all market participants. When a trader liquidates his or her holdings in these dark pools, there is minimal slippage.
2. Cash settlement. An institutional trader such as Jim Simons’s Renaissance Technologies, George Soros’s Quantum Fund or Paul Tudor Jones’s fund can gain exposure to cryptocurrency price movements without actually holding the coin, which carries custodial risk, i.e. being hacked. Our cash settlement system lets us use the Bitcoin Real Time Index (BRTI) from the CME, or a spot bitcoin index that we compute internally, to swap the cash difference at the delivery date.
3. Colocation services for professional market makers, so that their proprietary market-making algorithms get predictable and fast execution.
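The cash settlement in the second point can be written as one line of arithmetic. The sketch below assumes a linear contract that settles against a reference index; the function name, the multiplier parameter and the example prices are illustrative only.

```python
from decimal import Decimal

def cash_settlement(entry_price, index_price, qty, side, multiplier=1):
    """Cash paid to (positive) or owed by (negative) a position at delivery.

    No coins change hands: only the difference between the settlement index
    and the entry price is swapped.
    """
    diff = (Decimal(index_price) - Decimal(entry_price)) * Decimal(qty) * Decimal(multiplier)
    if side == "long":
        return diff
    if side == "short":
        return -diff
    raise ValueError("side must be 'long' or 'short'")

long_cash = cash_settlement("9000", "9250", 2, "long")    # Decimal('500')
short_cash = cash_settlement("9000", "9250", 2, "short")  # Decimal('-500')
```

The two sides always sum to zero, so the clearing house only moves cash from the losing account to the winning one.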
In the author’s discussions with several spot cryptocurrency exchange founders, they mentioned that aggregate throughput is a significant problem in current exchange pipelines. A simple exchange built with Python or PHP might get you to 100-1000 orders/second, but it will not let you ingest 100,000-1,000,000 orders/second, as is common at professional exchanges like the CME, NASDAQ and NYSE. Over the last 40+ years these centralized exchanges have invested heavily in FPGA technology that gives them a very efficient low-latency, high-throughput pipeline.
At Bitwyre we try to take a different approach. Given the lack of crypto exchanges that act as an execution venue for institutional investors, we decided to specialize in that route. That also means we can offer non-lit/dark markets and colocation services. Technology-wise, we started our codebase in C++, conducting all operations in memory, with MemSQL as our database and Redpanda, a Kafka-compatible streaming platform. By gradually adopting exciting new unikernel technology such as IncludeOS and HermitCore, we aim to lower pipeline latency, and we hope to later optimize it into an FPGA pipeline.
One more important point of our architecture is that every computation is done in memory, which is faster than saving every operation to disk. For that purpose the core database system we use is MemSQL. There are several in-memory databases geared towards financial services, but we decided that MemSQL is the best fit for our application: in my experience Redis has problems with the multithreaded asynchronous access needed in low-latency processing applications like ours. We also tried kdb+, but we didn’t like the semantics of its programming language for time series operations. We also understand less of the inner workings of kdb+, since it’s closed source, and it has a high price tag for a small startup like us.
Just as Redpanda is a drop-in replacement for Kafka, MemSQL is a drop-in replacement for MariaDB/MySQL, which is why we prefer it to kdb+. Because we had been doing SET/GET, HSET/HGET and LPUSH/RPUSH operations with JSON schemas against Redis, the conversion to SQL was not as painful as we expected. The CTO, our lead quant and I ported our Redis operations to SQL in 2 weeks (we are talking about a codebase of 200 thousand lines of code spanning 100 microservices, each with 5-10 Redis operations). Once we had succeeded in writing the SQL, we tested it with MariaDB, and after extensive testing we dropped the MemSQL database into our compute clusters as a replacement for MariaDB.
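To give a feel for what porting an HSET/HGET pair with JSON values to SQL can look like, here is a sketch that uses Python’s built-in `sqlite3` module purely as a stand-in for MariaDB or MemSQL. The `kv_hash` table, its columns and the helper names are hypothetical and are not Bitwyre’s schema.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE kv_hash (
           hkey  TEXT NOT NULL,
           field TEXT NOT NULL,
           value TEXT NOT NULL,   -- JSON document, as it was stored in Redis
           PRIMARY KEY (hkey, field)
       )"""
)

def hset(conn, hkey, field, obj):
    """SQL counterpart of Redis HSET with a JSON-encoded value."""
    # "INSERT OR REPLACE" is SQLite syntax; MySQL-compatible databases
    # offer REPLACE INTO or INSERT ... ON DUPLICATE KEY UPDATE instead.
    conn.execute(
        "INSERT OR REPLACE INTO kv_hash (hkey, field, value) VALUES (?, ?, ?)",
        (hkey, field, json.dumps(obj)),
    )
    conn.commit()

def hget(conn, hkey, field):
    """SQL counterpart of Redis HGET; returns None when the field is missing."""
    row = conn.execute(
        "SELECT value FROM kv_hash WHERE hkey = ? AND field = ?",
        (hkey, field),
    ).fetchone()
    return json.loads(row[0]) if row else None

hset(conn, "account:42", "balance", {"USD": "1000.00"})
print(hget(conn, "account:42", "balance"))  # {'USD': '1000.00'}
```

Keeping the helper names close to the Redis commands is one way to keep call sites in each microservice almost unchanged during such a port.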
Some other very interesting features of the CME Globex exchange, such as Velocity Logic and Circuit Breakers, are not implemented on Bitwyre because of the nature of the cryptocurrency derivatives contracts that trade on our platform. Crypto derivatives trade 24/7, unlike bitcoin futures and bitcoin options on the CME. If we implemented Velocity Logic and Circuit Breakers, the inherently volatile nature of Bitcoin would probably make us hit the circuit breakers once every 2 months. Other features, such as Market and Stop Order Protection and Self-Match Prevention, are integral to ensuring market integrity on our platform. We implement both in the CRAM/Oracle. The reason is that, despite being in Asia, where markets are not as regulated as in the United States, we still want to ensure fair market participation and filter wash trades out of our system.
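A self-match check of the kind mentioned above could look roughly like the following. It is a deliberately conservative sketch: it flags an incoming limit order if it would cross any resting order from the same account, regardless of queue position. The function name and the order layout are assumptions, and the actual CRAM/Oracle rules are not described in this article.

```python
def would_self_match(incoming, resting_orders):
    """Return True if the incoming order could trade against its own account."""
    for resting in resting_orders:
        if resting["account"] != incoming["account"]:
            continue
        if resting["side"] == incoming["side"]:
            continue  # same side never trades against itself
        if incoming["side"] == "buy" and incoming["price"] >= resting["price"]:
            return True
        if incoming["side"] == "sell" and incoming["price"] <= resting["price"]:
            return True
    return False

resting = [
    {"account": "mm-1", "side": "sell", "price": 101},
    {"account": "fund-7", "side": "sell", "price": 100},
]
wash = would_self_match({"account": "mm-1", "side": "buy", "price": 101}, resting)
clean = would_self_match({"account": "mm-1", "side": "buy", "price": 100}, resting)
```

Here `wash` is true because the buy at 101 would lift the account’s own offer, while `clean` is false because a buy at 100 can only trade against the other account’s offer.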
Also unlike the CME, we are a derivatives exchange, a clearing house and a brokerage house all together, centralizing most risks for the cryptocurrency trader. Our security protocols for custodian operations exceed those currently implemented by centralized equity market exchanges, for example in the need for multisignature operations and for warm and hot wallet transfers. In current equity exchanges, the custodian operations are conducted by an independent facility. The delivery of, say, 1 US barrel of WTI Crude Oil from the seller of a futures contract to its buyer happens electronically between the two brokerage houses that buy and sell on behalf of the buyer and seller. This settlement process is conducted by a derivatives clearing house. It only very rarely fails, since it consists of just two parts: swapping balances and marking positions to market.
A cryptocurrency is a bearer instrument: whoever owns the private key has the ability to move the coins around. Infamous cryptocurrency exchange hacks seldom happen because a hacker manipulated trading records; most of the time they result from a hot wallet breach, i.e. a hacker gets hold of the private keys of the exchange’s hot wallets and transfers the funds to a crypto address under their control. To mitigate these risks, several cryptocurrency exchanges have offloaded the risk by introducing institutional custodianship of cryptocurrencies. Others, like us, separate wallet functions into a withdrawal wallet (only allowed to transfer out), a deposit wallet (only for transferring in), a warm wallet (for moving funds in between) and cold wallet storage.
If you’re interested in learning more about building derivatives exchanges for fun, don’t hesitate to ping us! Don’t forget to subscribe to our alpha testnet launch at https://bitwyre.com.
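The wallet separation can be expressed as a small allow-list of transfer directions. The wallet roles come from the paragraph above, but the exact set of permitted flows below is an assumption made for illustration, as is the function name.

```python
# Hypothetical allow-list of (source, destination) wallet roles.
ALLOWED_FLOWS = {
    ("external", "deposit"),      # customers pay in
    ("deposit", "warm"),          # deposits are swept inwards
    ("warm", "cold"),             # bulk of the funds goes to cold storage
    ("cold", "warm"),             # refilled manually when needed
    ("warm", "withdrawal"),       # top up the outgoing wallet
    ("withdrawal", "external"),   # customers are paid out
}

def is_transfer_allowed(source, destination):
    """Reject any movement of funds that is not an explicitly permitted flow."""
    return (source, destination) in ALLOWED_FLOWS

print(is_transfer_allowed("deposit", "warm"))      # True
print(is_transfer_allowed("deposit", "external"))  # False
```

With a policy like this, a compromised deposit wallet key cannot be used to pay out directly, because no flow leads from the deposit wallet to an external address.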
-----
Special thanks to Domenic Ravita for allowing us to write this blog. Thanks to Joshua Levine (who invented INSTINET/Island of NASDAQ, long live the SOES Bandits!!!!!!!!!!!!) for his inspiration for the Bitwyre team. To Aditya Kresna, our Chief Research and Development (CR&D), who discussed the implementation of this design with the author, and to Yefta Sutanto, our Chief Technology Officer, who with patience and dedication has deployed our engineering solution. Acknowledgements also go to Aditya Suseno, our CFO, and John Schwall, who have been inspiring us during the journey of building this open-source financial exchange infrastructure. Shout out to Muhammad Augi, Sudhanshu Jashoria, Gaurav Jawla, Gandharv Sharma, Agustin K-Ballo, Alex Hultman, Sonkeng Maldini, Veronika Vřešťálová, Genevieve Brechel, Gary Basin, Erik Rigtorp, Matt Godbolt, Zachary David, Ezra Rapoport, David Senouf, Eric Charvillat, David Amara, Stefan Cheplick, Byrne Hobart, Remco Lenterman, Themis Sal, Per Buer, Alexander Gallego and Alexy Khrabrov for inspiring us during the design of our electronic exchange network. Also to our MILA friends Sai Krishna G.V, Alex Fedorov and Florian Golemo for their discussions on the order routing problem using deep reinforcement learning. Last but not least, to my former supervisor, Prof Yoshua Bengio, who always supports me in building awesome cool things.
References
- CME Globex circuit breakers: https://www.cmegroup.com/education/videos/cme-globex-circuit-breakers.html
- CME dynamic circuit breakers: https://www.cmegroup.com/education/videos/dynamic-circuit-breakers-market-integrity-control.html
- CME protection for market and stop orders: https://www.cmegroup.com/education/protection-functionality-for-market-and-stop-orders.html
- CME Globex timestamps: https://www.cmegroup.com/education/videos/cme-globex-timestamps.html
- Bitcoin paper: https://bitcoin.org/bitcoin.pdf
- Derivatives clearing house default problem: https://www.nytimes.com/2019/05/03/business/central-counterparties-financial-meltdown.html
- ISDA centralized clearing: http://online.wsj.com/public/resources/documents/ISDApaper05232011.pdf