Friday, June 23, 2017

Pricey Architecture

Or, How architecting in the cloud is different 

When designing a cloud-scale, always-on system, system architects are expected to be experienced in the core system requirements - scale, security and high availability. In this day and age, this art is pretty well understood. The public cloud is a great help in addressing those core concerns by providing the hard-to-acquire and hard-to-build foundational elements:

  • Apparently infinite amount of compute and storage capacity for scale
  • Fine grained control at the network and API level for security
  • Fault zone isolation in the form of independent zones and regions for HA
However, the tax the cloud imposes on these building blocks, even if not apparent at first, is its own complexity. If you don't consider the pricing models of the underlying building blocks and misapply them, the tax is converted into dollars. Here are a few examples from recent design mishaps I've witnessed.

An expensive scaling story:
With all the commotion about the NoOps and Serverless movement, the team decided to build their upcoming service using AWS Lambda and API Gateway. The thinking was that scaling problems would be taken care of by AWS: as the service grew in popularity, the magic of Serverless in the public Cloud would solve the problem.
Within 2 months of launch, the charges for the new service were exploding - the service had become so popular that millions of invocations were streaming in continuously.

An expensive security story:
The security team defined a goal of a much more secure deployment in the cloud, with separate Account-level boundaries for management vs production vs staging accounts. The infrastructure in the different accounts would tie back into the secure management network via appropriate network and peering configuration.
Alas, the design didn't consider that all the traffic that was previously free would, in the new design, be charged as internet traffic and incur NAT charges.

An expensive availability story:
In a similar situation, a team that needed to store large volumes of data wanted to ensure that the data would remain accessible even during partial failures. Well versed in the "Cloud", they opted to deploy their backend stores across multiple zones, with online replication between them. This guaranteed that localized failures would not cause availability concerns.
As in the previous story, they were hit with unexpected network transit charges due to the chatty replication behavior of their storage engine.

What could they have done differently?

Lambda and API Gateway are excellent choices for getting a service out to market quickly - they require no investment in infrastructure or ops. However, the price is steep, and as a service matures and becomes heavily used, you should rethink the choice of Serverless. For a 24x7 workload, the equivalent horsepower provisioned in Lambda is roughly 60 times more expensive than in the T2 family. Relying on Serverless for scaling out is an expensive proposition.
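To make the trade-off concrete, here is a minimal back-of-the-envelope sketch in Python. It is not a pricing calculator: the function names are made up for illustration, the prices in the usage section are placeholders rather than quoted rates, and it assumes the always-on fleet is large enough to absorb the load. It only shows the shape of the comparison: pay-per-use cost grows with every request, while an always-on fleet costs the same whether it is idle or busy.

```python
HOURS_PER_MONTH = 24 * 30  # simplifying assumption: a 30-day month


def lambda_monthly_cost(requests, avg_duration_s, memory_gb,
                        price_per_million_requests, price_per_gb_second):
    """Pay-per-use cost: a per-request fee plus a fee per GB-second of execution."""
    request_fee = requests / 1_000_000 * price_per_million_requests
    compute_fee = requests * avg_duration_s * memory_gb * price_per_gb_second
    return request_fee + compute_fee


def instance_monthly_cost(instance_count, price_per_hour):
    """Always-on cost: instances are billed per hour whether or not they are busy."""
    return instance_count * price_per_hour * HOURS_PER_MONTH


def break_even_requests(instance_count, price_per_hour, avg_duration_s, memory_gb,
                        price_per_million_requests, price_per_gb_second):
    """Monthly request volume above which the always-on fleet becomes cheaper."""
    cost_per_request = (price_per_million_requests / 1_000_000
                        + avg_duration_s * memory_gb * price_per_gb_second)
    return instance_monthly_cost(instance_count, price_per_hour) / cost_per_request


if __name__ == "__main__":
    # Placeholder prices for illustration only; substitute your provider's current rates.
    prices = dict(price_per_million_requests=0.20, price_per_gb_second=0.0000167)
    workload = dict(avg_duration_s=0.2, memory_gb=0.5)

    serverless = lambda_monthly_cost(requests=500_000_000, **workload, **prices)
    fleet = instance_monthly_cost(instance_count=4, price_per_hour=0.05)
    threshold = break_even_requests(instance_count=4, price_per_hour=0.05,
                                    **workload, **prices)

    print(f"serverless: ${serverless:,.2f}/month")
    print(f"always-on fleet: ${fleet:,.2f}/month")
    print(f"break-even at about {threshold:,.0f} requests/month")
```

Running the numbers for your own workload before launch, and again once traffic is known, tells you when the pay-per-use convenience stops being worth its price.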

The next two examples share a common thread - transit and especially egress charges can quickly rack up. This is not the data center, where you pay for the equipment and the power and get to use as much of it as you have. In the cloud, every gigabyte that crosses a zone, account or internet boundary may be metered. Designing your architecture for security and availability without taking networking costs into account results in a pricey solution.
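The same kind of arithmetic applies to transit. The sketch below, again in Python with hypothetical names and placeholder numbers, estimates what cross-zone replication might add to the bill. It assumes every gigabyte written is copied once to each additional zone, that a chatty storage engine multiplies that traffic by some amplification factor, and that all of it is billed at a flat per-gigabyte rate; real pricing may differ by direction, region and service.

```python
def replication_transfer_cost(written_gb_per_day, zone_count, amplification,
                              price_per_gb, days=30):
    """Estimate the monthly transfer charge for replicating writes across zones.

    written_gb_per_day: data written to the primary store each day
    zone_count:         total number of zones holding a copy (including the primary)
    amplification:      protocol overhead factor of the storage engine (1.0 = none)
    price_per_gb:       assumed flat charge per replicated gigabyte
    """
    if zone_count < 1:
        raise ValueError("zone_count must be at least 1")
    extra_copies = zone_count - 1  # the primary copy itself is not transferred
    replicated_gb = written_gb_per_day * days * extra_copies * amplification
    return replicated_gb * price_per_gb


if __name__ == "__main__":
    # Placeholder figures for illustration only.
    cost = replication_transfer_cost(written_gb_per_day=2_000, zone_count=3,
                                     amplification=1.5, price_per_gb=0.01)
    print(f"estimated replication transfer: ${cost:,.2f}/month")
```

A single-zone deployment comes out at zero here, which is exactly the point: the availability you gain from extra zones has a recurring price that belongs in the design review.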

New troubles for the Enterprise Architect
The stories above can be taken as a cautionary tale, but rather than scare you away from the public Cloud, they should do the opposite - the variety of building blocks available empowers the architect to build fast and iterate quickly, possibly switching between different building blocks (e.g. Lambda vs Container-Functions) depending on the business needs. The choices made when time to market is paramount should favor those offerings that provide speed at the cost of dollars. However, a responsible architect will admit that when the product gains traction and COGS considerations become important, rearchitecting might be required.

So, yes, Mr./Mrs. Architect - in the cloud, to serve your business, you can't limit yourself to just the considerations of the past - you must now add COGS to your list of worries. This challenge is exciting and new - building a financially efficient product offering!
To whet your appetite, and for a different view of Storage choices, feel free to wander over to Gigabytes and cents, which, despite being a bit old, is still very much relevant.