Everyone knows that cloud services have usage-based, or what we consultants are now calling “consumption-based,” pricing, right? Questioning that is a bit like asking if the emperor really has any clothes, but that’s okay because that’s something that needs to be done now and then. Let’s start off with the easy one – the idea that cloud pricing is “consumption-based.” When you “consume” something, you basically eat it. That means it’s gone when you’re done, or it can only be used again if you recycle it. Consumables are things like paper, toner cartridges, pencils, sticky notes, etc. But you don’t eat computers. They are still there when you’re done using them, especially in the public cloud. (The only thing you’re “consuming” is a little of the power feeding the equipment you’re sharing.) And if you want to stop using them, then someone else can use them after you’re done. So why do so many consultants use the term “consumption-based”? I can only speculate, but we are known for occasionally making up words for things so that we sound like consultants. It’s the reason “impact” and “action” are now verbs, while “spend” and “ask” are now nouns.
Okay, fine, you may say, but we definitely use computers. Surely cloud pricing is indeed usage-based? It’s pretty reasonable to think so, but in most cases I’m sorry – it really isn’t. Yes, you are, in fact, using technology resources when you use cloud services, but that doesn’t mean you are charged that way. True usage-based pricing is the way most people pay for electricity. There’s a meter that ticks off the number of kWh you use (or consume) each month, and that’s what you pay for. Anyone else with access to the power grid does the same thing, paying only for what they use even though they have shared access to a bunch of power generation and transmission infrastructure. The amount of time you take to use it doesn’t matter, and if you don’t use any kWh you don’t pay anything (other than a few fees and taxes).
Not so with cloud computing. Let’s use the on-demand option of Amazon Web Services (AWS) as the best-known, market-defining example. Amazon bills for compute instances on an hourly basis. The instance has a certain number of processors and a certain amount of memory associated with it, i.e., compute capacity. If your application only spends 5 minutes per hour doing any processing, you still get charged for the whole hour. Amazon isn’t measuring how much processing you’ve done; they are just measuring how much wall-clock time you’ve had access to a certain amount of compute capacity, which means that it’s really still capacity-based pricing. If the power company charged you like that, you would get a bill for the number of hours that you had access to the power grid, regardless of how much power you actually consumed, except that the maximum power you could use would be limited until you ordered more.
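To make the arithmetic concrete, here is a minimal sketch of that hourly, capacity-based calculation. The $0.10 rate and the 720-hour month are illustrative assumptions, not real AWS prices; the point is simply that how busy the instance was never enters the formula.

```python
import math

HOURLY_RATE = 0.10  # assumed illustrative price per instance-hour, not a real AWS rate

def on_demand_bill(hours_of_access: float) -> float:
    """Capacity-based bill: you pay for wall-clock access, not for work done."""
    billable_hours = math.ceil(hours_of_access)  # partial hours round up to a full hour
    return round(billable_hours * HOURLY_RATE, 2)

# A 720-hour month costs the same whether the instance was busy
# for 5 minutes of every hour or for all 60.
print(on_demand_bill(720))  # 72.0
```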
The way we pay for public cloud is actually much more analogous to the way people rent apartments, where we pay for access to capacity. The bigger the family we have, the more rooms we need. The innovation here is that the apartment is rented by the hour instead of the month. You can add more rooms or remove them very quickly, and you can stop renting whenever you want. Those options aren’t very useful in the real-estate market because families don’t change size quickly, but they are very helpful in the fast-paced world of IT, where demand for resources often swings dramatically in short periods of time.
The speed with which you can scale your cloud capacity up and down, and thus your bill, creates the illusion of usage-based pricing. Since you are “using” more or fewer instances at different times, your bill for a whole month exhibits usage-like behavior, because you are being charged by the hour and not by the month (you can see that you incurred more charges during one hour than another). Monthly bills for traditional outsourcing services are for the capacity you have as of a certain day of the month, but if you were to look at a whole year, even that might appear, in a way, to be usage-based if you had a lot of changes in server and storage capacity from month to month. It comes down to the frame of reference you have when you look at the charges.
The other thing to understand is that there is often a lot of spare CPU time left over when a typical application uses a VM (instance) for an hour, because computer chips deal in actions that take only a fraction of a second, and they can also do more than one thing at a time. In our example, Amazon can sell the CPU capacity you aren’t using to other customers if they wish to. Same for Microsoft. Same for Google. To get true usage-based pricing, you would have to bill not for the wall-clock time that the CPU is rented but for the CPU time during which processing was actually going on. In the mainframe world this is measured in CPU seconds. We could do that for cloud computing too, but, well, it’s hard to measure and it’s hard for customers to understand, so it’s probably not going to happen any time soon.
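For contrast, here is a sketch of what mainframe-style, CPU-second metering could look like for the same workload. The per-second rate is an assumption, set so that a fully busy hour would cost the same as the illustrative $0.10 hourly rate used above.

```python
RATE_PER_CPU_SECOND = 0.10 / 3600  # assumed: the illustrative hourly rate spread over 3,600 CPU seconds

def usage_based_bill(cpu_seconds_consumed: float) -> float:
    """True usage-based bill: you pay only for CPU time during which work was done."""
    return round(cpu_seconds_consumed * RATE_PER_CPU_SECOND, 2)

# The same 720-hour month, with the CPU actually working 5 minutes per hour:
cpu_seconds = 720 * 5 * 60
print(usage_based_bill(cpu_seconds))  # 6.0 -- versus the 72.0 capacity-based bill
```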
Okay, that’s a nice academic discussion, but so what? The useful takeaway is the following rule: as the time units used for billing decrease in size, capacity-based pricing resembles usage-based pricing more and more. Let’s call it “Scott’s rule of cloud pricing.” Understanding it can save real money. If an application that only uses 5 CPU minutes per hour (100% of the CPU actually doing work for a total of 5 minutes per hour) were billed by the minute rather than by the hour (assuming the per-minute price is one sixtieth of the per-hour price), the server instances would cost as little as one twelfth as much, since you would pay for 5 billable minutes instead of 60. So to simplify: you always want the time units you are billed for to be as small as possible (per-hour is better than per-month, per-minute is better than per-hour, and so on). The more volatile your workloads are, the more important this will be.
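Here is a rough sketch of the rule in action, again with illustrative rates. It assumes the per-minute rate is exactly one sixtieth of the hourly rate and that, when billed by the minute, you only run the instance during the 5 busy minutes of each hour.

```python
HOURLY_RATE = 0.10                   # illustrative, as before
PER_MINUTE_RATE = HOURLY_RATE / 60   # assumed proportional per-minute pricing

def billed_by_the_hour(hours: int) -> float:
    """Any activity during an hour bills a full hour of capacity."""
    return round(hours * HOURLY_RATE, 2)

def billed_by_the_minute(busy_minutes_per_hour: int, hours: int) -> float:
    """Only the minutes the instance actually runs are billed."""
    return round(hours * busy_minutes_per_hour * PER_MINUTE_RATE, 2)

print(billed_by_the_hour(720))       # 72.0
print(billed_by_the_minute(5, 720))  # 6.0 -- one twelfth as much
```

The more finely the billing clock ticks, the closer the capacity-based bill gets to what true usage-based metering would have charged, which is exactly the rule at work.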
This fact is currently not well understood by cloud customers or the media.