On the complexity of hard things

Michelle-Joy Low
Towards Data Science
5 min read · Oct 23, 2020



“Please, speak as you might to a young child, or a golden retriever.”

Margin Call is one of my favourite films — and this line from the movie happens to be one of my all-time favourite quips. I was in the throes of academia when I first watched the film, and it brought dry comic relief, mirroring my experience walking into thesis-defence seminars regularly packed with senior academics of the time-poor, highly acerbic variety.

In my time working as a consultant in data and digital transformation I have witnessed an insatiable appetite for “easy-to-understand” headlines, like “data-driven culture” or “leverage our data”. A decade in, this subject continues to stir in me equal parts fascination and consternation. It seems everyone wants to sprinkle data around these days, but after a rousing PowerPoint presentation filled with easy-to-understand headlines and a few considered nods, interest quickly dissolves. Just a couple of weeks ago someone told me they didn’t want to know why a model seemed unstable (the data was reflecting time-varying market conditions) — they simply needed it to Just-Give-Me-The-Number™.

Uncomplicated feels good

There is no lack of readable material on the importance of good technical communication; indeed, much of the discourse on bridging the gap between commercial and technical domains points to simplification as the answer. The premise seems compelling, almost to the point of stating the obvious — leaders want information in easily digestible, rapidly consumable “treats” for decision-making; simplifying everything, therefore, is the definitive solution, the critical difference between go and no-go.

Reducing complexity to quotable quotes is alluring on many levels. The apparent clarity provides a sense of speed. A well-formatted chart festooned with a crisp headline or two yields an immediate sense of understanding and well-being. To have a few such charts accessible in an interactive dashboard for data exploration compounds the feeling of control and comprehension, and that feeling often leads to believing the problem is already solved.

The shrewd consultant understands this well; he knows to draw attention to the bubbles, not the bath. All in plain language, of course. Because why go into the dark realities of building data infrastructure or machine learning at scale, if saying “scale” and “value” enough times is all it takes to generate audience delight? Moreover, how to actually deploy models at scale is for the 𝚗̶𝚎̶𝚛̶𝚍̶𝚜̶ delivery team to worry about.

Putting complexity in the cupboard doesn’t make it go away

My concern for the undiscerning technology decision-maker is the temptation to mistake obfuscated complexity for elegant solutions. Simpler presentations appeal to System 1 — our inherent propensity to judge the value of a narrative not by the reliability of its evidence, but by its coherence.

The trouble then is often the ensuing underestimation — in particular, of the sheer magnitude of a task. This is typically realised late in the piece, as the warmth of easily digestible executive briefings is replaced by the harsh, cold reality of delivery.

Take for instance the construction of a data lake. Commonly, most of the energy in developing critical data infrastructure is expended on vendor selection, under the oversimplified view that “Once we have a vendor locked in, we’ll buy a data platform and our data-driven culture will fall into place”. The reality of constructing a data platform is that an entire organisation needs to change the way it thinks about accessing, producing, moving, conforming, storing, and retrieving data. As the project unfolds, transformation teams working on the ground realise that the executive presentation’s one-liner about “changing our ways of working” includes trying to convince someone to adopt Git after 20 years of Windows Explorer and local storage.

Hard things actually are complex

Truth be told, decision-making is complex — especially so where technology and quantitative assets are concerned.

To execute effectively in the liminal space between the commercial and technical domains, one must grasp both the 10,000-foot strategic mandate, and accurately map this to the almost non-negotiable demands of people, processes, infrastructure and software. A virtual private network is either configured with appropriate access controls or it is not — regardless of how often one writes “Automated Secure Networking” on a PowerPoint slide.

Data adds to engineering complexity, as its propensity to change state over time and space can present some pretty mind-bending considerations. Consider for instance the implications of eventual consistency on e-commerce customer experience (CX) — where customers might be shown stale information while database updates take time to sync across your “Cloud-Native Architecture”. Surely it would matter to a CX team, in their design mandate, to understand the bounds of such variability.
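To make that stale-read scenario concrete, here is a minimal sketch (every name in it is hypothetical, not a real system) simulating a primary store and a read replica that only catches up after a replication lag:

```python
import time

# Hypothetical illustration of eventual consistency: writes land on a
# primary immediately, but reads are served from a replica that only
# applies each update after a replication lag.
class EventuallyConsistentStore:
    def __init__(self, replication_lag_s=2.0):
        self.primary = {}   # writes land here immediately
        self.replica = {}   # reads are served from here
        self.pending = []   # (apply_at_time, key, value) awaiting replication
        self.lag = replication_lag_s

    def write(self, key, value):
        self.primary[key] = value
        self.pending.append((time.monotonic() + self.lag, key, value))

    def read(self, key):
        # Apply only those replication events whose lag has elapsed.
        now = time.monotonic()
        still_pending = []
        for apply_at, k, v in self.pending:
            if apply_at <= now:
                self.replica[k] = v
            else:
                still_pending.append((apply_at, k, v))
        self.pending = still_pending
        return self.replica.get(key)

store = EventuallyConsistentStore(replication_lag_s=2.0)
store.write("sku-123:stock", 0)       # the item just sold out
print(store.read("sku-123:stock"))    # None/stale: replica hasn't caught up
time.sleep(2.1)
print(store.read("sku-123:stock"))    # 0: replica is now consistent
```

The window between the two reads is exactly the variability a CX team would need to design around.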

And then the addition of machine learning further multiplies data management pressures. Not only do we now need to keep critical infrastructure, software and data assets running in lock step, we must also ensure that the combinatoric ways in which ML can be used (and abused) are carefully managed. “What do you mean I should be worried about using that model with a 100% accuracy rate?”
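That worry is warranted: on imbalanced data, headline accuracy can look near-perfect while the model is useless. A minimal sketch, with all numbers invented for illustration:

```python
import numpy as np

# Hypothetical illustration: 1,000 transactions, of which only 10 (1%)
# are fraudulent -- a typical class imbalance.
y_true = np.zeros(1000, dtype=int)
y_true[:10] = 1                     # the rare positive (fraud) class

# A "model" that simply predicts the majority class every time.
y_pred = np.zeros(1000, dtype=int)

accuracy = (y_true == y_pred).mean()
recall = (y_pred[y_true == 1] == 1).mean()  # fraction of fraud caught

print(f"accuracy: {accuracy:.1%}")  # 99.0% -- sounds impressive
print(f"recall:   {recall:.1%}")    # 0.0%  -- catches no fraud at all
```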

Data and machine learning are topical examples, but something research training has taught me is that all disciplines have incredible complexities that are often glossed over in the pursuit of a good story.

Golden retrievers shouldn’t make hard decisions

Given the complexity of the modern corporation, presiding business leaders should probably not seek to emulate golden retrievers. And given the weight of the decisions at hand, as data practitioners we ultimately need to be contemplating how to communicate substance, not faux clarity.

This requires a seemingly small but pertinent shift: away from simplicity, towards accuracy (and precision). That is, aim for the clarity so desperately sought, supported with only as much complexity as is needed — but make no mistake: that complexity is necessary to convey. As practitioners we bear responsibility for how our data is consumed; if a business user is about to hit a button that deploys a neural network that discriminates against minorities because of bias in the training data, it is the ethics, trade-offs, and risks around class imbalance they need to hear about, not how you’ve used the latest state-of-the-art architecture from Google or OpenAI.

For the decision-maker, the notion of diving into the detail may sound unappetising at first; my recommendation is to at least stir the bathwater and part the froth. A few pithy questions go a long way — “Talk me through the execution” or “Run me through what can go wrong” quickly reveal what lurks beneath the (bubbly) surface.

While neither communicating nor consuming complexity is particularly enjoyable, the impact of good technical communication cannot be disputed. My hope is that, as an industry, we learn to value substance over sugar hits and reward well-executed strategies ahead of simply-told stories.

