As described in my previous piece, complicated software has existed for almost as long as there has been software. More recently, however, the software we contribute to is not complicated, but complex.
In this Better Bits excerpt, I illustrate complex software by discussing 2010’s Flash Crash.
Modern algorithmic trading is an example of a complex, not complicated, software system. This high-frequency trading attempts to maximize returns by acting fastest on new information; entering and exiting trades thousands of times per second is the norm. Where transaction fees might once have limited the number of trades per second, the networks’ liquidity rebates often exceed those fees; in other words, algorithms can trade at effectively zero marginal cost. Networks can do this because they don’t make their money on transactions; instead, they sell “low-latency” feeds and rent out co-location space. Both let traders receive data before others, which translates into additional arbitrage opportunities. Ultimately, current markets are dominated by interacting software capable of trading at speeds incomprehensible to previous generations.
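To make the “effectively zero marginal cost” claim concrete, here is a back-of-the-envelope sketch in Python. The fee and rebate figures are hypothetical, chosen only to show the mechanism rather than to quote any real exchange’s schedule: when the rebate earned for providing liquidity on one leg of a round trip exceeds the fee paid for taking liquidity on the other, the net transaction cost falls to zero or below.

```python
# Hypothetical per-share economics of one round-trip trade (buy then sell).
# Figures are illustrative only, not any exchange's actual fee schedule.
taker_fee    = 0.0030   # per share, paid on the leg that removes liquidity
maker_rebate = 0.0032   # per share, received on the leg that adds liquidity

# One leg takes liquidity, the other provides it:
net_cost_per_share = taker_fee - maker_rebate   # negative => the venue pays you

round_trips_per_second = 1_000
shares_per_trade = 100
net_per_second = net_cost_per_share * shares_per_trade * round_trips_per_second

print(f"net transaction cost per share: ${net_cost_per_share:+.4f}")
print(f"net per second at {round_trips_per_second} round trips: ${net_per_second:+.2f}")
```

With these made-up numbers the “cost” of trading is slightly negative, which is why trading more, faster, is the rational move for the algorithm.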
On May 6th, 2010, a single Kansas City trader - either through ignorance or malice, we’re still not sure - executed a $4 billion trade in the futures market. Unlike other sales of similar size, the trader did not set any conditions like price, time, or volume for the trade. Without those parameters, the automated trading algorithm attempted to sell the entire amount at once, setting off a reflexive domino effect among the rest of the market’s high-frequency trading programs. In less than twenty minutes, roughly a trillion dollars of value had vanished from the U.S. securities market; before trading could be manually halted, almost nine percent of the market had disappeared.
The 2010 “Flash Crash” exemplifies the kind of unpredictable, emergent behavior that occurs in our increasingly common, complex, software-dominated systems. Complex software isn’t merely higher-order complicatedness.
Paul Cilliers was a South African philosopher, professor, and complexity researcher. He defined seven characteristics of complex systems. All of them apply to the Flash Crash:
Complex systems consist of many elements that in themselves can be simple. In the markets referenced in our Flash Crash example, high-frequency trading algorithms dominate, accounting for 60-75% of the trading volume. Their individual actions, like buy when X event happens or after Y time, are simple to understand.
The elements interact dynamically by exchanging energy or information. Even if specific elements only interact with a few others, the effects of these interactions propagate throughout the system, and the interactions are nonlinear. In our example, actions taken by one software agent create information that may trigger other software agents, which trigger other agents, and so on. Propagated nonlinearly through the entire market, those effects can quickly add up to trillions of dollars.
There are many direct and indirect feedback loops. A sale that triggers another sale, which in turn triggers yet more sell-offs, is a reinforcing feedback loop (sketched in code after this list).
Complex systems are open systems — they exchange energy or information with their environment — and operate at conditions far from equilibrium. New software agents can be added to the market to act on the information exchanged there. Operating far from equilibrium is not only expected - it is precisely how financial gains are made through arbitrage opportunities.
Complex systems have memory, and that history impacts the behavior of the system. However, that memory is distributed throughout the system and not located at a specific place. Likewise, a market’s transaction history is distributed throughout the system; so much so that understanding what happened in the Flash Crash took more than five months of investigation.
The system’s behavior is determined by the nature of the interactions, not by what is contained within the components. Since the interactions are rich, dynamic, fed back, and, above all, nonlinear, the behavior of the system as a whole cannot be predicted from an inspection of its components. This phenomenon is also referred to as “emergence”. Said another way, there was no way of predicting the likelihood of the Flash Crash by studying the individual agents.
Complex systems are adaptive. They can (re)organize their internal structure without the intervention of an external agent. Despite our best efforts to harden markets and prevent algorithmic runs, similar - yet less headline-grabbing - crashes continue as the complex system adapts and produces new emergent behavior.
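To make the feedback-loop and emergence points concrete, here is a minimal, hypothetical simulation sketch in Python. The agent counts, thresholds, and price-impact rule are invented for illustration and are not a model of real market microstructure: each agent follows a trivially simple rule (“sell a slice if the last price move dropped past my tolerance”), yet one large, unconditioned sell order cascades into a market-wide decline that no single agent’s rule predicts.

```python
import random

# Hypothetical illustration only: many simple, momentum-reacting "agents"
# whose individual rules are easy to understand, but whose interactions
# produce a market-wide sell-off.
random.seed(7)

class Agent:
    def __init__(self):
        # Sell a slice of holdings if the last price move fell further
        # than this agent's personal tolerance (a simple, legible rule).
        self.tolerance = random.uniform(0.001, 0.01)   # 0.1%-1% drop tolerance
        self.holdings = 100

    def decide(self, last_return):
        if last_return < -self.tolerance and self.holdings > 0:
            sold = min(10, self.holdings)
            self.holdings -= sold
            return sold            # shares this agent sells this tick
        return 0

agents = [Agent() for _ in range(1_000)]
price = 100.0

# Tick 0: one large sell order with no price/time/volume conditions (the trigger).
pending_sell = 50_000

for tick in range(30):
    # Crude linear price-impact model: the more shares dumped this tick,
    # the further the price falls.
    impact = pending_sell / 1_000_000
    new_price = price * (1 - impact)
    last_return = (new_price - price) / price
    price = new_price

    # Each agent reacts only to the latest move -- no agent sees the whole
    # system, yet their combined reaction feeds the next tick's decline.
    pending_sell = sum(a.decide(last_return) for a in agents)

    print(f"tick {tick:2d}  price {price:7.2f}  next-tick sell volume {pending_sell:6d}")
    if pending_sell == 0:
        break
```

The interesting output isn’t any particular number - it’s that the crash-shaped behavior emerges from the interactions between agents, none of which mentions a “crash” anywhere in its code.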
While a single trade triggered it, the emergent properties of the market’s software system caused the ensuing chaos. That emergent behavior means we may be able to observe a complex system, but we can’t control it.
Trading markets aren’t the only place we find complex software systems. Every industry is wrestling with a new category of problems. Journalist Quinn Norton, in her essay Everything is Broken, explains:
“Software is so bad because it’s so complex, and because it’s trying to talk to other programs on the same computer, or over connections to other computers. Even your computer is kind of more than one computer, boxes within boxes, and each one of those computers is full of little programs trying to coordinate their actions and talk to each other. Computers have gotten incredibly complex, while people have remained the same gray mud with pretensions of godhood.”
Unfortunately, there are some good (and not-so-good) reasons why we’re in this state. Kellan Elliott-McCrea, who led engineering efforts at Dropbox, Etsy, and Flickr (among others), describes five reasons our software is more complex than before:
We expect more from our software; some of this is due to customer expectations, but some is also due to increased regulatory and safety concerns.
Maturing problem-solving approaches have led to more tools, licensing structures, languages, and architectures; while this represents progress, it also increases the challenges of selection and proper utilization.
Decision-making about a growing, increasingly unique technology stack (above) doesn’t happen in a vacuum; instead, decisions must also account for an ever-accruing context that most likely includes missteps and assumptions that are no longer true. This context is ever more challenging to grasp with increasing rates of developer turnover.
Larger systems built to do bigger things make it more difficult to grok the whole - necessitating more onboarding, designing, and triaging from processes that haven’t kept up or are treated as an afterthought (something also exacerbated by increased developer turnover).
During a decade when interest rates were low and techno-solutionism was in favor, companies found it easier to raise capital if what they did sounded high-tech and complex, and so that’s what they built.
This software system complexity has consequences: nonlinearity, randomness, emergence, and surprise. It is why existing software seems more difficult to bend for the better. And it is why communication and coordination dominate all other costs when building software; research shows time spent on collaborative work (e.g., meetings, apps, texts, email) has grown 50% in the past 12 years.
We need a different approach.
If you enjoyed this excerpt and are interested in updates on the book’s progress, please consider subscribing.
Next time: Given that, how do we make our complex software systems better?