Cloud Storage Is Expensive? Are You Doing It Right?
In my day-to-day job, I talk to a lot of end users. When it comes to the cloud, there are still many differences between Europe and the US. The European cloud market is much more fragmented than the American one for several reasons, including the slightly different regulations in each country. Cloud adoption is slower in Europe, and many organizations still prefer to keep data and infrastructure on their premises. The European approach is quite pragmatic, and many enterprises take advantage, to some extent, of the experience gained by similar organizations on the other side of the pond. One thing both sides share is cloud storage or, better, cloud storage costs and the reactions to them.
It is nothing new that data is growing everywhere at an incredible pace, often faster than predicted in past years. At first glance, an all-in cloud strategy looks very compelling: low $/GB, less CAPEX and more OPEX, increased agility and more. That holds, of course, until your cloud bill starts growing out of control.
As I wrote in one of my latest reports, “Alternatives to Amazon AWS S3”, the $/GB is only the first item on the bill; several others, including egress fees, come after it. This aspect is often overlooked at the beginning and has unpleasant consequences later.
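To make the point concrete, here is a minimal sketch of how the line items on a bill add up. The unit prices, the `Pricing` class and the `monthly_bill` function are all assumptions made up for illustration; they are not taken from any provider's price list, and real bills have more items than these three.

```python
from dataclasses import dataclass


@dataclass
class Pricing:
    """Hypothetical unit prices; real price lists differ by provider and region."""
    storage_per_gb_month: float
    egress_per_gb: float
    per_1k_requests: float


def monthly_bill(stored_gb, egress_gb, requests, pricing):
    """Return each line item of a simplified monthly cloud storage bill."""
    items = {
        "storage": stored_gb * pricing.storage_per_gb_month,
        "egress": egress_gb * pricing.egress_per_gb,
        "requests": requests / 1000 * pricing.per_1k_requests,
    }
    items["total"] = sum(items.values())
    return items


# Illustrative numbers only: 100 TB stored, 50 TB read back out, 20M requests.
PRICING = Pricing(storage_per_gb_month=0.02, egress_per_gb=0.09, per_1k_requests=0.005)
bill = monthly_bill(stored_gb=100_000, egress_gb=50_000, requests=20_000_000, pricing=PRICING)

for item, cost in bill.items():
    print(f"{item:>8}: ${cost:,.2f}")
```

With these made-up prices, reading back half of the stored data in a month costs more than twice as much as storing all of it, which is why looking at the $/GB alone can be misleading.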
There are at least two reasons why a cloud storage bill can get out of control:
- The application is not written properly. Someone wrote or migrated an application that is not specifically designed to work in the cloud and is not resource-efficient. This often happens with legacy applications that are migrated as-is. Sometimes it’s hard to solve because re-engineering an old application is simply not possible. In other cases, the application’s behavior could be corrected with a better understanding of the API and of the mechanisms that regulate the cloud (and how they are charged).
- There is nothing wrong with the workload; it’s just that data is being created, read and moved around more than in the past.
To bring costs back under control, start by optimizing the cloud storage infrastructure. Many providers are adding storage tiers and automation to help with this. In some cases, this adds complexity (someone must manage the new policies and ensure they work properly). Not a big deal, but probably not a huge saving either.
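The kind of tiering policy mentioned above can be sketched as a simple age-based rule. This is only an illustration: the tier names, the idle-day thresholds and the `pick_tier` function are hypothetical, and each provider expresses such rules in its own lifecycle configuration format.

```python
from datetime import date

# Hypothetical tiers, ordered from hottest to coldest: (name, minimum idle days).
TIER_RULES = [("hot", 0), ("infrequent", 30), ("archive", 180)]


def pick_tier(last_access, today, rules=TIER_RULES):
    """Return the coldest tier whose idle-day threshold the object has reached."""
    idle_days = (today - last_access).days
    chosen = rules[0][0]  # default to the hottest tier
    for name, min_idle_days in rules:
        if idle_days >= min_idle_days:
            chosen = name
    return chosen


today = date(2024, 1, 1)
for last_access in (date(2023, 12, 25), date(2023, 11, 1), date(2023, 1, 1)):
    print(last_access, "->", pick_tier(last_access, today))
```

Someone still has to choose the thresholds and check that they match how the data is really accessed, which is the extra management work mentioned above.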
Also, try to optimize the application. That is not always easy, especially if you don’t have control over the code and the application wasn’t written with the intent to run in a cloud environment. Still, this could pay off in the mid to long term, but are you ready to invest in this direction?
BRING DATA BACK…
A common solution, now adopted by a significant number of organizations, is data repatriation: bringing data back on premises (or to a colocation service provider) and accessing it locally or from the cloud. Why not?
At the end of the day, the bigger the infrastructure, the lower the $/GB and, above all, there are no other fees to worry about. When thinking about petabytes, there are several optimizations you can take advantage of to lower the $/GB considerably: fat nodes with plenty of disks, multiple media tiers for performance and cold data, data footprint optimizations, and so on, all translating into low and predictable costs.
At the same time, if this is not enough, or you want to keep a balance between CAPEX and OPEX, go hybrid. Most storage systems on the market now allow you to tier data to S3-compatible storage systems, and I’m not talking only about object stores: NAS and block storage systems can do the same. I covered this topic extensively in this report, but check with your storage vendor of choice and I’m sure they’ll have solutions to help out with this.
…OR GO MULTI-CLOUD
Another option, which doesn’t negate what is written above, is to implement a multi-cloud storage strategy. Instead of focusing on a single cloud storage provider, abstract the access layer and pick what is best depending on the application, the workloads, the cost, and so on, all determined by the needs of the moment. Multi-cloud data controllers are gaining momentum, with big vendors starting to make the first acquisitions (Red Hat with NooBaa, for example), and the number of solutions is growing at a steady pace. In practice, these products offer a standard front-end interface, usually S3-compatible, and can distribute data across several back-end repositories following user-defined policies. This leaves the end user with a lot of freedom of choice and flexibility regarding where to put (or migrate) data, while allowing it to be accessed transparently regardless of where it’s stored. Last week, for example, I met with Leonovus, which has a compelling solution that combines what I just described with a strong set of security features.
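The idea behind a multi-cloud data controller can be sketched in a few lines. This is a toy model, not how NooBaa, Leonovus or any other product is implemented: the class names, the policy function and the in-memory back ends are all hypothetical, and a real controller would expose an S3-compatible API and talk to actual storage services.

```python
class InMemoryBackend:
    """Stand-in for a real back-end repository (cloud bucket, on-prem object store)."""

    def __init__(self, name):
        self.name = name
        self.objects = {}

    def put(self, key, data):
        self.objects[key] = data

    def get(self, key):
        return self.objects[key]

    def delete(self, key):
        del self.objects[key]


class DataController:
    """Single front end that places objects on back ends according to a policy."""

    def __init__(self, backends, policy):
        self.backends = {b.name: b for b in backends}
        self.policy = policy  # callable(key, data, tags) -> back-end name
        self.catalog = {}     # key -> name of the back end holding the object

    def put(self, key, data, tags=None):
        target = self.policy(key, data, tags or {})
        previous = self.catalog.get(key)
        if previous is not None and previous != target:
            self.backends[previous].delete(key)  # avoid leaving a stale copy
        self.backends[target].put(key, data)
        self.catalog[key] = target

    def get(self, key):
        # The caller never needs to know where the object lives.
        return self.backends[self.catalog[key]].get(key)

    def migrate(self, key, target):
        source = self.catalog[key]
        if source == target:
            return
        self.backends[target].put(key, self.backends[source].get(key))
        self.backends[source].delete(key)
        self.catalog[key] = target


def example_policy(key, data, tags):
    # Hypothetical rule: sensitive data stays on premises, the rest goes to cloud_a.
    return "on_prem" if tags.get("sensitive") else "cloud_a"


controller = DataController(
    [InMemoryBackend("on_prem"), InMemoryBackend("cloud_a"), InMemoryBackend("cloud_b")],
    example_policy,
)
controller.put("hr/payroll.csv", b"...", tags={"sensitive": True})
controller.put("logs/app.log", b"...")
controller.migrate("logs/app.log", "cloud_b")  # e.g. after a price change
print(controller.catalog)
```

The application keeps calling `get` with the same key before and after the migration, which is the transparency described above.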
There are several alternatives to the major service providers when it comes to cloud storage. Some of them focus on better pricing and lower or no egress fees, while others work on high performance too. As I wrote last week in another blog post, going all-in with a single service provider could be an easy choice at the beginning but a huge risk in the long term.
CLOSING THE CIRCLE
Data storage is expensive and cloud storage is no exception. Those who think they will save money by just moving all of their data to the cloud as-is are making a big mistake. For example, cold data is a perfect fit for the cloud, thanks to its low $/GB, but as soon as you begin accessing it over and over again the costs can rise to an unsustainable level.
To avoid dealing with this problem later, it’s best to think about the right strategy now. Planning and executing the right hybrid or multi-cloud strategy can help keep costs under control while providing the agility and flexibility needed to keep the IT infrastructure, and therefore the business, competitive.
To learn more about multi-cloud data controllers, alternatives to AWS S3, and two-tier storage strategies, please check my reports on GigaOm. And subscribe to the Voices in Data Storage podcast to listen to the latest news, market and technology trends, with opinions, interviews and other stories coming from the data and data storage field.