
Best of Breeds: A Technical Architecture


In the last two chapters, we’ve implemented a Publish/Subscribe pattern twice. Once via a messaging platform, once with a database. In both we had a {CHANGED} message fan out between multiple instances of an application. With messaging, the fan-out was done by our own microservice. With a database, it was done internally by the database’s own main/read-replica topology.

Unfortunately, each implementation has its limitations. Not all databases support such a topology, which would Restrict our system architecture. Messaging does not solve the problem of long-lasting data persistence.

In this chapter, we’re going to mix messaging and databases together, having each do what it does best, and end up with a worthy candidate for a system architecture.

Decoupling the Data

In our last design in Delta Force: Messaging Based Consistency, our microservice needed to fetch the entirety of the data in a parseable format. It would store it all in-memory during boot-up to survive a restart. We chose to place our data.json on S3 because it was the simplest possible solution, though not a database per se.

Once we talked about scaling, we ran into an issue with our design. Our file structure is somewhat fragile, is being concurrently written to and read from, and fetching costs scale with its size. Issues already handled by a database.

Is there anything blocking us from replacing S3 with a database? The answer depends on when. Before our design goes live, everything is on paper. Afterwards, it’s a bit more complicated because we have contracts to maintain. To start off easy, let’s say everything is still on paper. There is nothing that technically prevents us from replacing S3 with MySQL, and having both our microservice and our Web App directly communicate with it.

Unfortunately, doing so will tightly couple our Actor and our microservice through the data and its structure. If an engineer had a Behavioral/Non-Technical Cause to Change the Web App, it might introduce an Instability by unexpectedly Changing the data structure, one propagating all the way through to our microservice.

A mediator, a server-side application that would encapsulate our data, would be handy. An application we sometimes call a Service. As long as all of its external contracts are maintained, all internal Changes would be correctly encapsulated. As a result, some Instabilities would be prevented.

Speaking of contracts, these are the ones that must be maintained in a live production environment. Replacing S3 would require not only encapsulating it, but also maintaining its HTTP API contracts. Currently, it returns the contents of the data.json file in some format. Once again, a mediator is needed. Only once it exists can we safely and seamlessly replace S3 with another storage.

Either live or on paper, we need to encapsulate our data with a Service anyhow. As doing so on paper is too easy, let’s do it in a live production environment. Our first action would be to add this mediator, an application we will name Data Service. It would entail another little tweak to our design. As the Data Service would be the one Changing the data, it would be the one to emit the {CHANGED} messages instead of our Web App.

Our second action would be to make sure the Data Service’s initial version would be maintaining the existing contracts. Its first endpoint would be /getAll, the second would be a POST for updating the data. Both would be encapsulating and replacing the direct HTTP calls made to S3.

There’s another contract to maintain and that is the {CHANGED} message itself, a contract like any other. We’d need to copy/move the logic that creates it from our Web App to our Data Service, and it might not be a simple matter. Once done, we should go ahead and deploy our initial version, because making a few small Changes is more beneficial than one big Change.
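As a sketch of what that initial version might look like, here is a minimal Data Service in TypeScript with Express. The /update path, the storage helpers and the bus wrapper around our messaging platform are all illustrative assumptions; at this point the storage underneath is still S3.

```typescript
import express from "express";

// Hypothetical placeholders: the real implementations would wrap S3 and our messaging platform.
import { readAllFromStorage, writeAllToStorage } from "./storage";
import { bus } from "./messaging";

const app = express();
app.use(express.json());

// Maintains the existing read contract: return the entire data set, as the direct S3 call did.
app.get("/getAll", async (_req, res) => {
  const data = await readAllFromStorage();
  res.json(data);
});

// Maintains the existing write contract, and takes over emitting {CHANGED} from the Web App.
app.post("/update", async (req, res) => {
  const newData = req.body;
  await writeAllToStorage(newData);

  // The Data Service is now the one Changing the data, so it is the one emitting {CHANGED}.
  await bus.publish("permissions.changed", { type: "CHANGED", data: newData });

  res.status(204).end();
});

app.listen(3000);
```

Nothing about this version is clever on purpose: it only encapsulates what already exists, so it can be deployed as one small Change.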

The next Change we wish to make is the one that finally reduces the costs incurred by our Web App. We wish to add that /get?Id= endpoint to get a specific permission. Unless we know the byte ranges in advance, it cannot be done efficiently with S3, if at all, and not with a JSON file. So we either Change the file format, or Change our storage to a database. Either way, our design is decoupled enough to go through with it, without any Change to any other component but the Data Service. As long as our contracts are maintained.
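Assuming we go the database route, the new endpoint becomes a single indexed lookup instead of a full file fetch. A minimal sketch, assuming MySQL via the mysql2 client; the table and column names are illustrative.

```typescript
import express from "express";
import mysql from "mysql2/promise";

const app = express();
const pool = mysql.createPool({ host: "localhost", user: "app", database: "permissions_db" });

// New contract: fetch a single permission by id, with no Change to any other component.
app.get("/get", async (req, res) => {
  const id = req.query.Id as string;
  // A parameterized, indexed lookup replaces downloading and parsing the entire data.json.
  const [rows] = await pool.execute("SELECT * FROM permissions WHERE id = ? LIMIT 1", [id]);
  const permission = (rows as unknown[])[0];
  if (!permission) {
    res.status(404).end();
    return;
  }
  res.json(permission);
});

app.listen(3000);
```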

Cheaper Reliable Cohesion

Unlike before, where we had a database topology with multiple read instances, our latest design requires a database with only one main instance. Our microservice still stores everything in memory and survives a restart.

It is a design we’ve seen before, a Reliable one. Our microservice can continue to function even when the Data Service or its database are entirely down, for as long as it does not require a restart. For some companies and some scenarios, putting more effort into a higher Reliability would be non-beneficial.
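To make that trade-off concrete, here is a sketch of the microservice side, assuming a generic bus wrapper around our messaging platform and the Data Service’s /getAll contract; the hostname and topic name are illustrative. Once booted, reads are served purely from memory, so a Data Service outage does not affect them.

```typescript
import { bus } from "./messaging"; // hypothetical wrapper around our messaging platform

type Permission = { id: string; [key: string]: unknown };

// The microservice's only state: everything held in memory.
const permissions = new Map<string, Permission>();

async function boot() {
  // On (re)start, fetch the entirety of the data from the Data Service.
  const res = await fetch("http://data-service/getAll"); // hostname is an assumption
  const all: Permission[] = await res.json();
  for (const p of all) permissions.set(p.id, p);

  // From here on, stay consistent by consuming {CHANGED} messages.
  bus.subscribe("permissions.changed", (msg: { data: Permission }) => {
    permissions.set(msg.data.id, msg.data);
  });
}

// Reads are served from memory; a Data Service outage does not affect them.
export function getPermission(id: string): Permission | undefined {
  return permissions.get(id);
}

boot();
```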

As long as the contract of the {CHANGED} message is maintained, our Data Service and our microservice have mutually exclusive Causes to Change. Both Technical Causes, such as replacing S3 with a database, and Behavioral Changes to our Web App. In that way, it also respects Cohesion. We have met both Cohesion and Reliability, although a database exists in the shared zone.

When it comes to costs, it would cost us less to launch another microservice instance than another database read replica. If needed, our microservice instances can also be vertically scaled by adding more and more RAM.

We might have found a middle way between costs and Reliability. At least for as long as the entirety of the data fits in memory, and fetching all of it from the Data Service on boot does not take too long and affect scaling.

Independence Day

And here’s the trick. When the day comes, we have a choice. If we wish, we can have each microservice instance launch its own MySQL. It would be the same kind of database our Data Service uses, but run independently from it. And if we do, it would be without any Change to the Data Service, as the decoupling between the two goes both ways!

If we do choose to add an independent one, we might get the worst of both worlds. A database already has an optimized internal messaging system. It would seem we’d just pay double for messaging. Once in our database, once in our messaging platform.

[Side note: in a way, this could be a setup for multiple independent DB clusters to communicate with one another. Like in multi-region or multi-cloud deployments. Beyond the scope of this book.]

However, if both databases are already independent, there is nothing Restricting us from choosing a different kind of database for our microservice. A cheaper or more suitable one for its needs. If its workload is read-heavy, maybe a cluster of Redis will serve our microservice better than a cluster of MySQL. We could also choose to launch a Redis instance per microservice or one cluster for them all. Whatever we choose, it will have nothing to do with our Data Service, its database or its topology.
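For example, here is a sketch of a Redis-backed read model kept up to date from the same {CHANGED} messages, assuming the ioredis client and the same hypothetical bus wrapper; nothing in it touches the Data Service, its database or its topology.

```typescript
import Redis from "ioredis";
import { bus } from "./messaging"; // hypothetical wrapper around our messaging platform

// One Redis per microservice instance, or one cluster shared by all of them:
// either way, this is entirely independent of the Data Service's database.
const redis = new Redis({ host: "localhost", port: 6379 });

bus.subscribe("permissions.changed", async (msg: { data: { id: string } }) => {
  // Keep our own read-optimized copy, in whatever shape suits our read-heavy needs.
  await redis.set(`permission:${msg.data.id}`, JSON.stringify(msg.data));
});

export async function getPermission(id: string) {
  const raw = await redis.get(`permission:${id}`);
  return raw ? JSON.parse(raw) : undefined;
}
```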

Due to our messaging platform and our efforts to maintain the {CHANGED} message contract, the two are completely agnostic to one another. One is not Restricting the other from Changing. If that is so, each one can also be designed independently. If that really is so, we may have found an excellent candidate for a system architecture.

And there’s some more work to be done for our candidate to become an actual one.

Alignment

It may appear from the above diagram that our design has formed a relationship between a service and a microservice. That is incorrect. Our messaging platform and the Publish/Subscribe pattern are agnostic to an application’s size, name or boundaries. They provide a communication method between multiple applications, no matter what they do with the message afterwards.

There’s another difference between our two applications, a less noticeable one. Changing the data within the Data Service is done only through a direct communication line, invoking an HTTP API endpoint. Specifically by our Web App. In contrast, our microservice’s data is Changed through an indirect communication line, via our messaging platform.

Let’s align our design, so both of our applications’ data would be mutated via our messaging platform. As both would be expecting to consume the {CHANGED} messages, our Data Service can no longer be the one emitting them. It falls to our Web App to emit them, as that is where the action of Change is performed by our customer. As an Actor is nothing more than another kind of application, our Web App can emit those to our messaging platform.

Doing so is insufficient, as our Web App requires an ACK/FAIL notification. Luckily, it would just be another message emitted by our Data Service, to be later published to our Web App and consumed by it. Here we do assume more of our messaging platform: that it is not limited to one-way communications. Later in this series, we’ll see that when it comes to Actors, it may not be an easy requirement to fulfill.
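Here is a sketch of the aligned flow, assuming the same hypothetical bus wrapper; the topic names, the correlationId field and the persist helper are all illustrative assumptions, and in reality the Web App and the Data Service are separate applications.

```typescript
import { randomUUID } from "node:crypto";
import { bus } from "./messaging"; // hypothetical wrapper around our messaging platform
import { persist } from "./db";    // hypothetical database write on the Data Service side

// Web App side: the Actor emits {CHANGED}, where the action of Change is actually performed.
export async function onCustomerEdit(permission: { id: string }) {
  const correlationId = randomUUID();
  await bus.publish("permissions.changed", { type: "CHANGED", correlationId, data: permission });
}

// Web App side: consume the ACK/FAIL coming back from the Data Service.
bus.subscribe("permissions.change-result", (msg: { correlationId: string; status: "ACK" | "FAIL" }) => {
  // e.g. resolve a pending request and notify the customer of success or failure
});

// Data Service side: consume {CHANGED}, persist it, and answer with ACK or FAIL.
bus.subscribe("permissions.changed", async (msg: { correlationId: string; data: { id: string } }) => {
  try {
    await persist(msg.data);
    await bus.publish("permissions.change-result", { correlationId: msg.correlationId, status: "ACK" });
  } catch {
    await bus.publish("permissions.change-result", { correlationId: msg.correlationId, status: "FAIL" });
  }
});
```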

Mutual Inconsistency

Noticing the biggest fault in our design requires expertise and experience with distributed and asynchronous systems. Even designing for everything to eventually fail might not be enough to spot it. Our microservice instances will no longer be consistent with each other, nor with our Data Service, because our microservice restarts and has no persistence of its own.

As each application consumes at its own rate, our microservice may consume faster than the Data Service. Once restarted and having called the Data Service’s /getAll, it would end up fetching data from an earlier point in time. As our microservice has already consumed its own {CHANGED} messages, it cannot roll its own state of the data forward to the correct point in time. It will remain stale and inconsistent forever.

We may think it’s another dead end. But it isn’t. It’s exactly the other way around. If a specific design requires a microservice, a pure application, whether it survives a restart or not, our fellow engineer can design it as such. He’ll just have the Service emit {CHANGED} messages. If the design requires persistence, requires a Service, our fellow engineer can design as such. He’ll just have the client/Actor emit {CHANGED} messages. It’s an architecture that Restricts no one in any way from solving their current or future problems.

Chained Change

Although great, it is still only a candidate for a system architecture, because the above opens up the question of when to do the one or the other in a non-Restricting way. And there is one more coupling to handle: the {CHANGED} message itself.

Let’s imagine a scenario of an eCommerce website with Buyers and Sellers. Each has a customer-facing application, each a micro-frontend independently Changed and deployed. Each has its backend tuple and its own independent database. Together they form a Product composed of Features/Flows. Their mutual exclusivity is maintained by our messaging platform, one informing the other of any data Changes through a {CHANGED} message.

As our system architecture does not Restrict it, the Sellers Service uses DynamoDB as a database. And due to its different concerns and requirements, the Buyers Service uses MySQL.

The {CHANGED} message containing the new state of the data would be emitted only after a Create/Update/Delete operation was done on the data, in a database. As such, the {CHANGED} message contents will be coupled to both the data and the database. It potentially creates a chain of transformations, which could lead to future Inefficiencies.

In our imaginary story, our Sellers Service API has a mandatory field called ItemAvailabilityDates. It is a start and an end date, between which the item is available for purchase. For good reasons, the API contract mandates it to be a unix timestamp, which in DynamoDB could be stored as a Number type. Instead, out of valid concerns for correctly querying it, the field is stored as an ISO-8601 date string.

So whenever someone invokes our HTTP API and updates the date field, the {CHANGED} message would contain an ISO-8601 based key/property. Unfortunately, that is not a format MySQL’s date columns accept as-is. It would require the engineer on the other team to transform it from one standard to another. So although the contract is maintained, it also creates a coupling between two mutually exclusive Services.
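Here is a sketch of the transformation the Buyers Team ends up owning, assuming the {CHANGED} message carries ItemAvailabilityDates as ISO-8601 strings and the Buyers Service stores them in MySQL DATETIME columns; the topic, table and column names are illustrative.

```typescript
import mysql from "mysql2/promise";
import { bus } from "./messaging"; // hypothetical wrapper around our messaging platform

const pool = mysql.createPool({ host: "localhost", user: "app", database: "buyers_db" });

// Turn the ISO-8601 string the Sellers Service emitted into the 'YYYY-MM-DD HH:MM:SS'
// form a MySQL DATETIME column expects.
function isoToMySqlDatetime(iso: string): string {
  return new Date(iso).toISOString().slice(0, 19).replace("T", " ");
}

// The Buyers Service is forced to own this transformation, coupling it to the
// Sellers Service's storage format.
bus.subscribe("sellers.item.changed", async (msg: {
  data: { itemId: string; ItemAvailabilityDates: { start: string; end: string } };
}) => {
  const { start, end } = msg.data.ItemAvailabilityDates;
  await pool.execute(
    "UPDATE items SET available_from = ?, available_until = ? WHERE item_id = ?",
    [isoToMySqlDatetime(start), isoToMySqlDatetime(end), msg.data.itemId]
  );
});
```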

Let’s go back one month, to when ItemAvailabilityDates was added. It came about when the Sellers’ Product Manager came up with a new idea for a customer-facing Flow. It was implemented immediately and solely by the Sellers Team, and required no other team’s involvement.

Once done, the Product Manager of the Buyers Team saw it with his very own eyes and wished to add it to the Buyers Frontend. The team started working on it, but it turned out the {CHANGED} message received does not contain the required new field, ItemAvailabilityDates. “We’re sorry, but we’d need the other team to add it”.

The PM would be somewhat confused, because the other frontend already sends it to the backend. “It’s literally right there, on the HTTP {REQUEST} message!” our PM would go on saying. “That’s right, but their frontend is not sending this directly to us. We are getting it indirectly through their backend.”

Technically speaking, it’s nothing but a few minutes of work. But the other team’s next two Sprints are already set. They’ll get it done, maybe in some future Sprint, if they remember to pull it out of their backlog. It’s no one’s fault, it is just a mismatch between an organizational structure and the Change Stream.

As our PM correctly mentioned, the ItemAvailabilityDates both teams require already exists in the {REQUEST} message. Unfortunately it is only sent to one Service and only via a direct communication line. If only we had some way to fan out the {REQUEST} message so it would reach both of them together. Oh yeah, we do. We already have a messaging platform set up!

There will no longer be a need to get the other team involved. To receive the message, all a backend Service needs to do is to register to it. That is a Bottleneck removed in the development workflow, with an eventually beneficial outcome. As both Services feed directly from the {REQUEST} message emitter, there would be two independent transformations, instead of the dependent ones. 
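Here is a sketch of that fan-out, assuming the Sellers Frontend publishes the {REQUEST} message to a topic both backends subscribe to; the topic name and the store/transform helpers are illustrative, and the three applications are collapsed into one file only for brevity.

```typescript
import { bus } from "./messaging"; // hypothetical wrapper around our messaging platform
import { sellersStore, toDynamoItem } from "./sellers"; // hypothetical Sellers-side helpers
import { buyersStore, toMySqlRow } from "./buyers";     // hypothetical Buyers-side helpers

type ItemRequest = {
  itemId: string;
  // Unix timestamps, as the API contract mandates
  ItemAvailabilityDates: { start: number; end: number };
};

// Sellers Frontend: publish the {REQUEST} once; the messaging platform fans it out.
export async function submitItem(request: ItemRequest) {
  await bus.publish("items.request", { type: "REQUEST", data: request });
}

// Sellers Service: consumes the {REQUEST} and applies its own transformation into DynamoDB.
bus.subscribe("items.request", async (msg: { data: ItemRequest }) => {
  await sellersStore.save(toDynamoItem(msg.data));
});

// Buyers Service: consumes the very same {REQUEST} and applies its own, independent
// transformation into MySQL, without waiting on the Sellers Team to forward the field.
bus.subscribe("items.request", async (msg: { data: ItemRequest }) => {
  await buyersStore.save(toMySqlRow(msg.data));
});
```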

Here might lie the difference between a software architecture and a system architecture. The first’s name suggests it only has to do with software; those would be architectures such as Serverless Architecture or Service-Oriented Architecture. The second takes the organization itself into account. Better yet, it comes to serve the organization and its customers. Enterprise Messaging Architecture is one of those. And we are indeed a few steps closer towards an entire system architecture.

The messages themselves are still not aligned with each other; they have no canonical structure. {CHANGED} seems like an Event, but {REQUEST} isn’t one. And we’ve presumed all the business logic is solely in the Actor.
We’ll tie those knots as well. But first, we need to go through it all over again. We need to go back, review and clear up the product requirements and customer expectations that have led to this system architecture. Because it comes to serve our Actors. On this, in the next chapter.
