
Impermanent Continuous Chaos: Customer Experience and [In]stability


TL;DR:

  • Unstable = bad customer experience
  • Physical devices are Waterfall, software is Agile. Smart devices are the worst of both.
  • Things are stable when they are final, not the other way around.
  • Products change frequently, thus they are never final and never stable.
  • Updates can and will eventually include bugs.
  • MVP = Updateable application
  • From a customer’s perspective, a product is as unstable as its linked system.
  • Customer expectations and experience change, and change causes instability.
  • Reduce and prevent instability, the natural state, instead of chasing the non-existent stable state.
  • To minimize the frequency, duration and effect of instability is to maximize the customer experience.

The boog’s first article, I don’t know, ended with the one thing we knew for sure: the customer expectations. Not only that, Silo’s device design, patents and future marketing were all about the customer experience. I remember the first time we showcased the device, repeating again and again “Silo is the world’s smartest and easiest vacuum sealer – ever!”. There are other vacuum sealers on the market, but ours is the most user-friendly there will ever be.

You should have seen our designers work to guarantee that. They spent weeks getting to the right “pssst” sound when you open the container, which we later coined “the sound of freshness”. To a software engineer it may sound ridiculous, but if a container is opened and it doesn’t make any sound, you get the feeling that nothing happened. If the sound is too loud, like an explosion, it is frightening. You may not know this, but Coca Cola’s pssst sound used to be a registered patent and trademark. So after years spent on physical design and manufacturing, the only thing that can ruin it… is… well… software. That cannot happen. Not on my watch. Customers expect a product to work. Needless to say, unstable products create a bad customer experience.

You can’t update the plastics

A physical product, unlike a software product, is almost a one-shot thing. You could say there are only major versions (V1.0, V2.0, V3.0), as once a physical product reaches a customer it cannot be changed, or to be exact, it can no longer be changed once it leaves the factory. There is no remote updating, neither for the plastics nor for the metals. Once a device sits on a kitchen counter, with its electronic board and components, the only way to replace or upgrade a malfunctioning component is a recall. That means taking the product back from an angry customer. If the designers or the mechanical or electronics engineers release a malfunctioning or not friendly enough product, that is doomsday. But they have to get it right only once. That is why the waterfall model is a good fit for physical devices / hardware.

Not only are the plastics supposed to be perfect, but so is the physical product’s software, as it also leaves the factory and may not be updateable at all. Let’s say you bought a simple kitchen scale and a week later it turns out that, due to a bug in its weighing algorithm, which is more complicated than you think, it measures everything incorrectly. As a customer, there is nothing you can do. There is no way to fix this. You won’t be unscrewing the bottom of the scale, plugging a data cable into the electronics board and updating its embedded software, which won’t be released anyway because the weighing algorithm is a commercial secret (not kidding!).

This is why a physical device is thoroughly designed in advance and thoroughly tested by the manufacturer, sometimes manually, unit by unit. Every mistake is unrecoverable, with a huge penalty of a recall and a bad review on Amazon. Not only is one existing customer angry, but you’ve probably lost another 9 future customers. This is way more serious than someone simply uninstalling a mobile application. This is why a physical product’s software must be final and stable when it leaves the factory.

Contrary to what we came to think, it is not final because it is stable (Stable → X → Final). It is stable because it is final (Final → Stable). The scale is fixed. It will not be updated. There will be neither a new feature nor a new “user experience”. A scale remains a scale because someone bought it in order to.. well… to scale. It is named after its one single function. This is why a simple kitchen scale works for years to come. This is why a customer will remain happy. This is why the mainframe servers of your bank keep working for years to come. Stable is a state interrupted by change.

…. but what about a smart kitchen scale?

“It’s software, it breaks”

At Wiser (2016), one customer called his technical account manager (TAM). The customer was a bit angry and a whole lot dissatisfied. He was paying to get his data updated daily, but his data hadn’t been updated in about a month. The TAM, from the San Francisco office, did some poking around the system and eventually told the customer, and this is an exact quote as far as I know, “well you know, it’s software. It breaks”. Unfortunately for the TAM, or so the legend says, the CEO was standing right behind him. He was furious, rounded up the entire customer service and TAM personnel and gave them a proper yell. When we in the Israeli R&D offices heard about it, we laughed our asses off. We went and printed “it’s software, it breaks” on a banner, and were about to hang it in the hallway but alas, we were denied. The furious CEO was on his way to visit the Israeli offices and he wouldn’t find that amusing at all. Us Israelis have a tendency for good humor. Good is a matter of perspective nonetheless. Years later I still can’t forget this story, but now I always end it with “you know what, he might have been right after all!”

A smart/connected kitchen scale, or a vacuum sealer for that matter, is an abomination. It is still true that its physical properties, plastics and electronic components cannot be augmented remotely. But in order to satisfy existing and new customer expectations/needs, its software can and will be remotely updated. That could happen to expand the company’s business or to create entirely new revenue streams. The scale has just lost its property of having one single function, scaling. It is now also connecting to WiFi, also logging into a system and also updating itself.

But maybe it actually is the other way around? It is technically possible for the physical product to leave the factory with only the capability of updating itself. It would become a scale only after it had reached a customer and the software had updated itself for the first time. Within a year it may receive an update that posts the last thing you weighed as your Facebook status. Within two years it may receive an update that turns your smart scale into a remote control for Dance Dance Revolution. That would be such a huge hit that your kids would play with it all day and it would no longer be used as a scale at all. It would now sit permanently in the living room next to the Playstation. It sounds crazy, but a change over time to a product’s main function, to its very definition, is something that indeed happens.

An example of a highly mutable device would be Amazon’s Echo Dot. It started as a virtual assistant that many used just as a cooking timer. Years later they pushed an update that also turned it into an intercom. Suddenly, for many families, its primary function changed into telling the kids that dinner is ready.

Silo’s four-year roadmap for the connected vacuum sealer was the same. It would start as an easy-to-use vacuum sealer with an Echo Dot embedded within it. Then it would transform into a Kitchen Hub that easily manages all your food at home (not only food within containers) and eventually into a point of sale, as you’d be able to order more food through it, maybe even completely automatically. So what would it be that would exit our factory?! The answer would be, as always, I don’t know.

Although it is an annoying answer, it is also a great one. We have just defined a software-only MVP, which must be final and stable in order to be released at all: an application that, no matter what else it does, can be securely and remotely updated with zero chance of failure/bricking. Fortunately, that has nothing to do with what the physical product will do, or when. It is independent of it. That means we have something to start working on, months in advance, without any product specifications. That will give the product team whatever time they need. We’re busy anyhow.

If this Software MVP is final and stable, then the product can leave the factory. But updates to it will occur, so it can’t be final and stable. That’s a contradiction that needs to be technically resolved. Updates can and will eventually include bugs. If not this update, then the next one. If you’ve been in the software industry long enough, you know that this is 100% true. That’s the reality of it. 

If it is continuously updating – it can not be final thus can not be stable. If it is not stable, it can not leave the factory. If it can’t leave the factory, no one can buy and use it. So why make it in the first place?! A conundrum!

This is not limited to physical devices. It is also true for mobile. Companies pivoted entire products during the COVID-19 pandemic. Cheetah, a wholesale restaurant food purchase application, turned overnight into Cheetah for Me, a personal grocery supply for private households. Check was acquired by Intuit and prior users were switched overnight to Mint Bills. The only difference, maybe, is that a part of the update mechanism has been done for you by Google/Apple. Web applications are continuously updating behind the scenes as well, with their own custom CI/CD pipelines. The truth is, entire systems are.

Naturally unstable

So far, we’ve only looked through the scope of a single standalone product. But connected products, physical and pure software ones alike, depend on entire ecosystems in which each component is continuously updating on its own. If the ecosystem is unstable, how can a connected product be stable? From a customer’s perspective, a product is as unstable as its linked system. Imagine Google’s scale for a minute. Google’s ecosystem consists of tens of thousands of applications over hundreds of interconnected products. How can such an ecosystem be stable at all? And yet it is, and a lot of good engineering practice has been invented to make sure of that.
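One rough way to put a number on “as unstable as its linked system” is the sketch below. It assumes that every component in the chain is required for the product to work and that components fail independently; both are simplifying assumptions, and the component list and figures are hypothetical, not measurements from Silo or Google. Under those assumptions, the availability the customer experiences is the product of the availabilities along the chain.

```python
from math import prod


def chain_availability(availabilities):
    """Availability of a product that needs every listed component to be up.

    Assumes independent failures and that every component is required.
    """
    return prod(availabilities)


def downtime_minutes(availability, period_days=30):
    """Expected minutes of downtime over a period at a given availability."""
    return (1.0 - availability) * period_days * 24 * 60


# Hypothetical chain: device firmware, home WiFi, cloud API, message broker, mobile app.
components = [0.999, 0.99, 0.9995, 0.9995, 0.999]
overall = chain_availability(components)

print(f"overall availability: {overall:.4f}")
print(f"expected downtime per 30 days: {downtime_minutes(overall):.0f} minutes")
```

Under these assumptions the chain is always less available than its weakest link, which is why a customer sees the instability of the whole system rather than that of the device alone.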

But there’s no need for all that “engineering” and “science”; there is another solution. Simply stop updating. Don’t deploy. Don’t work on any new features, just fix the last bugs and voila, you’ll be reaching the Nirvana of stability shortly. Just lie back and relax. Everything is about to become stable. Alas, you can’t. Customer expectations and needs change over time, and time never stops passing. If you don’t update, you can’t adapt. Or did you forget the tens of startups just waiting to take your piece of the pie? They do update, and they are now faster than you’ll ever be. They will meet your customers’ expectations and your customers will leave you. You and your company will go bankrupt. But that’s fine, at least you’re stable. Customer expectations and experience change, and change causes instability. That is the truth.

Let’s get back to reality and realise that stable is a state, a temporary, short-lasting state. It may be so infinitesimally short that a stable state may not even exist at all. Agile and Lean, the de-facto winning development and business methodologies, not only cause disruption, they themselves cause instability. You need to manage instability, the natural state. No wonder that Netflix’s mechanism to ensure resilience is called Chaos Monkey and the principles behind it are called Chaos Engineering. Do not waste time on unfulfillable promises of stability. Even Amazon Web Services, who are by far smarter than me and you combined, who hire the best engineers and have an endless amount of resources, do not promise 100% uptime, as it is just not possible. It’s a change of thought and a change of mind. That does not mean you should give up and do nothing.

You’d be amazed how many things do not exist in themselves, but only as the lack of their opposite. Cold is a lack of heat, a lack of energy. Happiness is a lack of suffering. Customer satisfaction is a lack of customer drop-off. Stability is a lack of instability. Instead of maximizing stability, minimize instability. Reduce and prevent instability, which is not a state but an ongoing process. You may not be aware that you’re already doing it. What is a CI/CD pipeline’s role if not to prevent instability through automated tests and automated deployments? What is the purpose of measuring your application’s error rate if not to reduce it? There are entire teams at your disposal guarding against instability: DevOps & QA. They too directly contribute to the customer’s experience.

If I wish to meet my customers’ expectations and the future business of my company, I must design a system that is inherently built to withstand and sustain continuous change, which continuously causes instability. You will see that in the series of articles about Silo’s Service Oriented Architecture, our Message Brokers (plural!), isolation and blast radius [further discussed in a future article]. All of these were in place to minimize the possibility of instability by preventing it in advance. To minimize the frequency, duration and effect of instability is to maximize the customer experience.
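To make “frequency, duration and effect” concrete, here is a minimal sketch of how those three dimensions could be tracked from an incident log. The `Incident` record, the `instability_summary` function and the example numbers are all hypothetical and only illustrate the idea; they are not Silo’s actual metrics.

```python
from dataclasses import dataclass


@dataclass
class Incident:
    duration_minutes: float   # how long the instability lasted
    affected_customers: int   # how many customers noticed it


def instability_summary(incidents, period_days, total_customers):
    """Summarise instability by frequency, duration and effect over a period."""
    count = len(incidents)
    total_duration = sum(i.duration_minutes for i in incidents)
    # Effect: customer-minutes lost, relative to all customer-minutes in the period.
    lost = sum(i.duration_minutes * i.affected_customers for i in incidents)
    possible = period_days * 24 * 60 * total_customers
    return {
        "incidents_per_week": count / (period_days / 7),
        "mean_duration_minutes": total_duration / count if count else 0.0,
        "affected_share": lost / possible if possible else 0.0,
    }


# Hypothetical four-week log: one broad but short incident, one narrow but long one.
log = [Incident(30, 100), Incident(90, 10)]
print(instability_summary(log, period_days=28, total_customers=1000))
```

Each of the three numbers can be pushed down separately: fewer incidents through testing, shorter ones through fast rollback, and smaller ones through isolation that limits the blast radius.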

This is why and how Silo’s system must and will be resilient, and resilient to change. A practical example of this way of thought, one that we actually carried out, would be to define and release an MVP that can be shipped out of a factory and that is both stable and unstable. But how can something be in both of these states at the same time? The answer is that instability is not an entirely black and white situation, not a true || false one. It’s somewhere in the middle.

The technical resolution to this is to invest a lot of time and effort in thoroughly testing this MVP, to make sure that the device connects and updates itself under any circumstance, and to make sure these processes are isolated and independent. These processes change less often than the application logic, and an update to the application logic will not change the update mechanism itself.
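A minimal sketch of that separation, assuming a two-slot (A/B) layout in which the updater is never part of the image it installs. This is a common pattern swapped in for illustration, not a description of Silo’s actual firmware; `Device`, `apply_update` and the health check are hypothetical names. A real device would also verify a cryptographic signature rather than just a checksum.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional


@dataclass
class Device:
    """Hypothetical device with two application slots and a separate updater."""
    slots: Dict[str, Optional[bytes]] = field(
        default_factory=lambda: {"A": None, "B": None}
    )
    active: str = "A"

    def inactive(self) -> str:
        return "B" if self.active == "A" else "A"


def apply_update(device: Device, image: bytes, expected_sha256: str,
                 health_check: Callable[[bytes], bool]) -> bool:
    """Install an application image without touching the running one.

    Returns True if the device switched to the new image, False if it
    stayed on the previous one.
    """
    # 1. Reject corrupted downloads before writing anything.
    if hashlib.sha256(image).hexdigest() != expected_sha256:
        return False

    # 2. Write to the inactive slot only; the active slot stays bootable.
    target = device.inactive()
    device.slots[target] = image

    # 3. Switch only if the new image passes its health check.
    if not health_check(image):
        device.slots[target] = None  # discard the bad image
        return False

    device.active = target
    return True
```

Because `apply_update` only ever writes to the inactive slot, a buggy application update can at worst be discarded; the slot the device is running from, and the updater itself, are left as they were.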

That’s how you minimise and prevent the chance of instability and shorten its duration. That is what everything in Silo should aspire to do. This is how we’d achieve resilience. That is just for one device, but there’s an entire interdependent ecosystem to take care of. There’s a lot of work to be done. This boog will deep dive into these solutions.
