No, this is not another article about ChatGPT and how it is going to take over everyone's job. Don't get me wrong! I am as excited about the revolutionary potential of machine learning (ML) technology as the next person. But here and now, we are just going to think in terms of good old systems.
Thinking in Systems by Donella H. Meadows is an excellent introduction to systems thinking. It roughly defines a system to be a set of interconnected elements with a purpose where the sum is more than its parts. The central thesis of the book is that a system's behavior is intrinsic to the system itself and to produce better results from a system, we need to understand the relationship between the structure and behavior of that system.
I will not indulge in the details of systems theory here, but rather focus only on aspects relevant to building digital systems, specifically, software systems. With digitization of every worthwhile human pursuit, we are all part of a global machinery that turns physical reality into a mindless stream of bits, willingly or not. A never-ending quest for efficiency and speed that challenges us to conquer the ever-growing complexity of digital systems and infrastructure. A journey that forces us to confront the limits of our own tools and techniques far too often, and to seek new ways to build better systems.
How do we go about building programs that mimic physical systems?
We develop models of physical systems. As pointed out in Structure and Interpretation of Computer Programs1, "One powerful design strategy, which is particularly appropriate to the construction of programs for modeling physical systems, is to base the structure of our programs on the structure of the system being modeled."
This turned out to be a very effective strategy as evidenced by the huge popularity enjoyed by Object Oriented Design (OOD) and Object Oriented Programming (OOP) in the industry. While there are genuine criticisms that can be leveled against OO approaches, they are a natural fit when modeling systems in many practical domains.
The role and importance of domain models in software development can't be overstated. They are the foundation of any software system. From eliciting the structure and behaviors of systems, establishing the architectural blueprints, verification and validation of working software to deploying and operating them on distributed, virtual infrastructures of today, domain models play a critical role. They may sit majestically at the center of our software development arena or hide in the shadows of its dark alleyways. They lurk behind the software systems of all sizes and shapes, whether we formally acknowledge them or not.
Domain Driven Design (DDD) is an approach to software development that centers the development on programming a domain model that has a rich understanding of the processes and rules of a domain (Martin Fowler, 2020). Based on the 2003 book by Eric Evans, it has been widely adopted and applied in the industry. The adoption of a Ubiquitous Language (UL), a common, rigorous language of communication between developers, domain experts and domain users, is a core concept of DDD. Its bias towards developing models in software and its evolutionary approach to model development have made it attractive to the development community, especially to Extreme Programming (XP) and Agile practitioners.
The focus of DDD is on building better domain models whereas the related approaches of Model-Driven Engineering (MDE) and Model-Driven Architecture (MDA) are more concerned with translating the models into code. But taken together, they provide the essential tools for building better software systems faster, a way to transform strings of conscious reality into streams of flashing bits.
As Albert Einstein2 stated, "... the supreme goal of all theory is to make the irreducible basic elements as simple and as few as possible without having to surrender the adequate representation of a single datum of experience." And therein lies the relentless quest for foundational modeling elements that connect the objective reality of a domain user to the abstract constructs in some programming language.
How do I know there is a gap? I asked ChatGPT, seriously.
Me: Is there a gap between domain models and programming models?
ChatGPT: Yes, there is often a gap between domain models and programming models. Domain models represent the concepts and relationships within a specific field or domain, while programming models are concerned with representing these concepts in code. Bridging this gap can be challenging as the concepts in the domain model may not have a straightforward mapping to code, and vice versa. It requires a good understanding of both the domain and the programming language to effectively translate between the two.
The irrefutable proof of truthiness one can hope for in 2023!
This kind of gap is more fundamental and deep, pervading all fields of study. Philip Anderson, a Nobel Laureate in Physics, in his 1972 article eloquently argued that "more is different" and "ability to reduce everything to simple fundamental laws does not imply the ability to start from those laws and reconstruct the universe". When dealing with the dual difficulties of scale and complexity, he cautions about the emergence of new properties at each level of complexity needing different abstractions (fundamental laws) to explain new behaviors.
As we tackle complex domains, we attempt to model emergent behaviors at each level of complexity. Shoehorning them into the existing programming models, most of them originally designed to deal with memory representations and machine code generation, leads to cognitive dissonance and compromised models. Wrong models lead to wrong systems, period! The saying "You can't fix stupid" is never truer than in this context.
Not all programming languages are the same. Each one was designed with a different purpose in mind. Over time, they evolve, supporting multiple paradigms, and the developer communities build better abstraction layers on top of them to address the aforementioned gap. From a systems perspective, what is important is how we tame domain complexity and handle emergent behaviors.
The time-honored way to solve complex problems is to decompose them into smaller, simpler ones and solve them independently. The way to build complex systems would then be to compose them from simpler ones. A fact we might be willing to accept as self-evident. But as Anderson pointed out, "a reductionist hypothesis does not imply a constructionist one". Since modularity, coupling, cohesion and information hiding have been part of the software engineering vocabulary for ages, and modular organization of code is a nearly universal practice, we naively assume that we are composing software systems, when in fact, we merely decompose them. We will explore this subtle, but crucial difference in more detail in another chapter. But for now, I will simply state that the composition models across programming paradigms require a different set of abstractions.
All programming languages provide means to model structure in terms of values (entities, value objects, classes, etc.) and relations (inheritance, aggregation, etc.), typically encoded in a type system. They also support modeling the interactions using interface definitions (functions, protocols, interfaces, traits, instance methods, etc.). The behaviors emerge from dynamic states of the system that change as a consequence of computations initiated through interfaces. We can really flex the powers of type systems and modern compilers to go a long way to model and validate systems. If you are curious, there is a delightful series on Designing with types by Scott Wlaschin that might be of interest to you. I will continue with my deliberations on dynamic states and encapsulating emergent behaviors here.
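To make the vocabulary above concrete, here is a minimal sketch in Python. Everything in it is hypothetical and chosen only for illustration: the order domain, the names `Money`, `OrderLine`, `Order` and `OrderState`, and the rule that only a non-empty draft order can be placed. The frozen dataclasses model structure (value objects, an entity and an aggregation relation), the functions form the interface, and the state changes only as a consequence of a computation initiated through that interface.

```python
from dataclasses import dataclass, replace
from enum import Enum
from typing import Tuple


class OrderState(Enum):
    """The dynamic states an order can be in (illustrative only)."""
    DRAFT = "draft"
    PLACED = "placed"


@dataclass(frozen=True)
class Money:
    """A value object: equality is by field values, not identity."""
    amount_cents: int
    currency: str


@dataclass(frozen=True)
class OrderLine:
    """A value object that refers to a product by its SKU."""
    sku: str
    quantity: int
    unit_price: Money


@dataclass(frozen=True)
class Order:
    """An entity that aggregates order lines (a structural relation)."""
    order_id: str
    lines: Tuple[OrderLine, ...]
    state: OrderState = OrderState.DRAFT


def total_cents(order: Order) -> int:
    """Interface: a pure query over the structure."""
    return sum(l.quantity * l.unit_price.amount_cents for l in order.lines)


def place(order: Order) -> Order:
    """Interface: a computation that moves the order to a new state.

    The original order is left untouched; a new value is returned.
    """
    if order.state is not OrderState.DRAFT:
        raise ValueError("only a draft order can be placed")
    if not order.lines:
        raise ValueError("cannot place an empty order")
    return replace(order, state=OrderState.PLACED)
```

Note how much of the domain rule still lives in hand-written checks inside `place`. The type system describes the shape of an order well, but the allowed movement between states is only implied by the code.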
Data modeling has a long and storied history. It is a rich and well-established field supported by a thriving database community. Databases and their schemas have been powering most of the systems out there. When we refer to the model in an application context (the Model-View-Controller (MVC) pattern, for example), we are often referring to the data, usually stored in some database. Yet there seems to be a divide between how data is modeled in the data layer and in the computation (or application logic) layer. One focuses more on data at rest while the other is concerned with data in motion (transition). But both have to deal with the dynamics of the system (the changing states) or emergent behaviors.
The data community does this by shoving more status fields into their tables and documents, while the application community deals with them by writing a truckload of code in the name of controllers and logical blocks. The very essence of our system lives in the wild, wild west of boilerplate code that is built to tie these disparate worlds together.
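As a small illustration of the alternative, the sketch below pulls the dynamics out of scattered status checks and into one explicit transition table. It is only a sketch under stated assumptions: the invoice workflow and the names `State`, `Event`, `TRANSITIONS`, `next_state` and `replay` are all hypothetical, and a plain lookup table is just one of several ways to encode a state machine.

```python
from enum import Enum
from typing import Dict, Iterable, Tuple


class State(Enum):
    """Values that could be stored as-is in a database status column."""
    SUBMITTED = "submitted"
    APPROVED = "approved"
    REJECTED = "rejected"
    PAID = "paid"


class Event(Enum):
    APPROVE = "approve"
    REJECT = "reject"
    PAY = "pay"


# The whole behavior of the (hypothetical) invoice workflow in one place.
# Any (state, event) pair that is not listed here is not allowed.
TRANSITIONS: Dict[Tuple[State, Event], State] = {
    (State.SUBMITTED, Event.APPROVE): State.APPROVED,
    (State.SUBMITTED, Event.REJECT): State.REJECTED,
    (State.APPROVED, Event.PAY): State.PAID,
}


def next_state(current: State, event: Event) -> State:
    """Look up the transition, rejecting anything the table does not list."""
    try:
        return TRANSITIONS[(current, event)]
    except KeyError:
        raise ValueError(
            f"{event.name} is not allowed in state {current.name}"
        ) from None


def replay(events: Iterable[Event], start: State = State.SUBMITTED) -> State:
    """Apply a sequence of events and return the resulting state."""
    state = start
    for event in events:
        state = next_state(state, event)
    return state
```

The status column and the controller logic now share a single definition: the data layer can persist `State.value`, and the application layer asks `next_state` instead of re-deriving the rules in every handler.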
Can we have a unified model of a domain that crosses the boundaries of client, server, middleware, databases and other artificial boundaries we have created to organize our teams, software artifacts and infrastructure? Can we do this without all the ceremony and fanfare?
In a much simpler past, many of us could standardize on a single programming language and move on. Not any more. Between our web and mobile applications, multiple public APIs and the language-specific SDKs that we provide to our customers for accelerated adoption, we end up shoehorning our domain models many times over. It is not just the domain complexity we are up against, but the complexities presented by the realities of the global, distributed, hybrid, virtual, polyglot environments of today. Our domain models have to transcend the very confines of programming languages whose programming model we want them to be part of!
If you have led your professional life oblivious to the above challenges, don't feel left out. In any growing business, you will soon be forced to confront them. Sleep well tonight!
It is not just about programmer productivity. We fill the gap between the domain user and the programmer with different roles: domain experts, domain consultants, business managers, business analysts, data analysts, product managers. We equip them with even more automation tools, each churning out its own digital artifacts. Meanwhile, the real developer spends all her time writing boilerplate code to bring all of them together. The official estimates of boilerplate code range from 20 to 30%, but in sufficiently complex projects, it easily exceeds 50%.
A domain model, first and foremost, is a communication tool. It is where we collect, organize, analyze and refine the tiny, shiny granules of domain wisdom. Domain models help build a common understanding and align goals among the stakeholders. They inform and guide the design, implementation and validation of the digital systems we build. The simplicity and expressiveness needed for effective stakeholder communication often stand at odds with the implementation details that creep into the programming models. For all the allegiance we pledge to working software over comprehensive documentation, we end up producing both, each by different people.
As we digitize the domain knowledge, we are at risk of burying more and more of our organizational knowledge in code. In any modernization project in any organization with some history, there is always a spreadsheet or a piece of software that nobody wants to touch. Everyone knows it is important, but no one knows how it works. Marvels of modern architecture built around leaking legacy sewage! The modern systems we build today are the dark abyss of organizational know-how of tomorrow. Just think about the amount of knowledge that is buried in data models, database schemas, spreadsheets and code repositories in our organizations.
In biology, we study changes to organisms (biological systems) from developmental and evolutionary perspectives. So far, we have been looking at systems and their models from a developmental perspective, exploring their changes over a single lifespan. But all systems evolve over generations. Building models resilient to changes under the evolutionary pressures of the systems they model is key to their success. Just as we avoid under- or over-fitting our models to data in our machine learning (ML) systems, we have to be careful about how well we fit our models to the requirements of the systems. Build for change is a mantra that we profess with passion, but pursue with extreme prejudice. Just as we concluded that emergent behaviors need different kinds of abstractions, we have to explore abstractions that enable us to deal with the evolution of systems as well.
It is fair at this point to ask, "What do we want, really?".
- We want to apply a systems lens to building software.3
- We want powerful domain models that can capture the structure and emergent behaviors of complex systems.
- The systems and their models should be composable.
- The domain models should be simple, expressive and enable effective communication between stakeholders, especially bridging the chasm between the domain user and the programmer worlds.
- We want better abstractions built on our programming models to reduce the degrees of separation between the specifications and working software.
- Our models should transcend the artificial boundaries of programming languages, implementation details, software and organization structures, and deployment environments.
- Models capture organizational learning and knowledge. They should not be buried in code.
- We want to build models that are resilient to change.
We will explore how we can achieve these goals in the upcoming chapters. A journey that will take us through models, programming languages, type systems, knowledge representation, state machines and even polynomials!
We have been building software systems for ages now. Do we really need to bother?
I would like to ask a counter question. Do we really need to spend millions of dollars to build a new chat application? Can't we just Google things?
Apparently, even Google does not think so.
In keeping with my stated intention of focusing the discussions here on technology and solutions, I will not attempt an elaborate business case based on lost productivity, time to value or any number of other flavor-of-the-day metrics. If this is an impediment to your appreciation of the subject matter, please do reach out to me.
We are a species that progressed by building better tools and systems. It is our survivalist instinct. It is what makes us who we are.
Prof. Robert Sapolsky, in his lecture Uniqueness of Humans, said it best. "The more clearly, absolutely, utterly, irrevocably, unchangeably clear it is that it is impossible for you to make a difference and make the world better, the more you must."
So, we must!
I would like to hear from you.
This book is a classic and a must read for any self-respecting programmer.
It is an unwritten rule that one can't discuss models without an obligatory quote from Albert Einstein.
Systems analysis and design methodologies have been used in software development for a long time. I hope the practitioners of these methodologies can appreciate the differences in perspectives here.
Copyright 2023 Weavers @ Eternal Loom. All rights reserved.