Wednesday, September 3, 2014

Thinking in Gigabytes and Cents


AWS offers a variety of storage options that fit different usage patterns, retention needs and cost profiles. When making an architectural choice of storage in the cloud, you have a multitude of options to achieve your technical goals. For example, you can use RDS, or stand up your own database using EC2 and EBS. While the two options provide an almost identical service, they differ in flexibility and cost profile - with the right one for you dependent on your specific use case.
Below is an overview of the different categories of storage options available from AWS, and a short summary of the price drivers behind them:

  • Functional storage: This category includes storage exposed as a functional component. These are databases, traditional or modern. While these services store data, their main pricing driver is the speed of access - the faster the required access, the higher the price. In this category you’ll find:
    • Relational Database Service (RDS) in various flavors
    • Redshift - a highly scalable, clustered data warehouse based on PostgreSQL.
    • DynamoDB - a highly scalable NoSQL database descended from Amazon's Dynamo design (which also influenced Cassandra), which trades flexibility in query for speed and scale
    • Elastic MapReduce - a managed Hadoop service, which can leverage different underlying storage options.
    • ElastiCache - managed Redis or Memcached - for smaller, but memory speed, datasets.
    • CloudSearch - a fully managed service similar to Elasticsearch.
  • Block storage: These services look and feel like traditional SAN services, and are priced in similar terms - namely: data volume used.  In this category:
    • Elastic Block Store (EBS) provides 3 types of volumes (Magnetic, Provisioned IOPS and General Purpose SSD) with differences in pricing and performance. EBS Snapshots are an add-on to EBS, which provides highly resilient backups (on top of S3, see below).
    • Storage Gateway - A path into the cloud for the traditional datacenter. It offers a transparent hybrid on-prem/cloud tiering option where backups or less frequently accessed data can be stored in the cloud on top of either EBS, S3 or Glacier (below).
  • Object Storage: Simple Storage Service (S3) was a game changer in terms of $/GB stored when first introduced in 2006, and prices have been dropping since. To provide more flexibility in $/GB, AWS has also introduced:
    • Reduced Redundancy storage option for S3. This option keeps the same API, but offers a lower level of redundancy (with the possibility of rarely losing objects) for a 20% price cut.
    • Glacier long term archiving. With a 66% lower price for storage than S3, it’s worth considering, but only if your data access patterns match the intended use case - rarely accessing large percentages of data, and having the patience to get it.


Choosing the Right Storage for Your Need

The options are many and target different use cases, from huge volumes of mostly offline archive data to highly available databases serving tens of thousands of requests per second. The selection criteria must obviously include suitability to the problem at hand: you can’t use Object Storage as a backend for a database, for example. But you also shouldn’t neglect the details of the pricing model.

Practical Example

As a simple example, consider the options in using EBS volumes:

  • At current pricing, Magnetic Volumes at $0.05 per GB/month are 50% cheaper than SSD storage at $0.10 per GB/month.
  • For a 10GB volume that sustains 200 IOPS throughout the month, the total charges will be $37.20 for magnetic but only $1 for SSD. SSD volumes do not incur an additional cost for IOPS, while magnetic ones do. What appears more expensive at first look is actually about 97% cheaper.
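
The shape of that calculation can be sketched in a few lines of Python. This is a rough illustration rather than a billing formula: the function name, the 30-day month and the $0.05 per million I/O requests rate are assumptions, not figures from this post, so the magnetic total here will not match the one quoted above. Only the per-GB prices come from the example.

```python
SECONDS_PER_DAY = 86_400


def monthly_cost(size_gb, price_per_gb, sustained_iops=0,
                 price_per_million_io=0.0, days=30):
    """Estimate a volume's monthly charge: storage plus per-request I/O fees."""
    storage = size_gb * price_per_gb
    # Total I/O requests over the month, expressed in millions.
    io_millions = sustained_iops * SECONDS_PER_DAY * days / 1_000_000
    return storage + io_millions * price_per_million_io


# Magnetic: cheaper per GB, but every I/O request is billed.
# The 0.05 per-million-I/O rate is an ASSUMED value for illustration.
magnetic = monthly_cost(10, 0.05, sustained_iops=200,
                        price_per_million_io=0.05)

# SSD: pricier per GB, with no separate I/O charge in this example.
ssd = monthly_cost(10, 0.10, sustained_iops=200)

print(f"magnetic: ${magnetic:.2f}  ssd: ${ssd:.2f}")
```

Setting sustained_iops to 0 flips the result back in favor of magnetic, which is exactly why the workload's I/O profile, and not just the $/GB sticker price, has to drive the choice.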

This example highlights the need to understand the price drivers for the services you are planning to consume, and how they map to your workloads. It’s often hard to predict the exact behavior of different workloads, and in an agile world you should ship first, provide the value, and optimize for cost on an ongoing basis. At least that’s the oft-repeated excuse. ;) There is, however, no excuse for not performing the ongoing monitoring and required adjustments.

How CloudHealth Can Help

As a founding engineer at CloudHealth, I can tell you first hand the value of an analytics platform that helps customers understand the drivers behind their spend, justify it and optimize it. Having deep visibility across your cloud infrastructure, usage and spend provides the ability to make well-informed tactical and strategic decisions around architecture.

Small changes can lead to big savings… if you can find them.