Let’s be honest, storage has always been a complicated subject. Teams of dedicated storage administrators would choose between block (SAN), file (NAS) or direct-attached (DAS), and then each choice led to more details such as HDD versus SSD. The cloud is supposed to make everything simple with storage-as-a-service. While the details of infrastructure have now been passed on to the magic of the cloud, there are still many choices that have a notable impact on performance, costs and scale.
Decision-making for cloud storage: Be agile!
Buying an on-premises storage array typically meant compromising on the best storage for a workload and then living with that decision for up to five years. In the cloud, by contrast, innovation and new offerings arrive nonstop.
An on-premises storage array was one big container that had to support many different workloads, often with serious tradeoffs. The cloud now lets you pick the best storage for every workload and even subsets of those workloads. Need ultra-low latency storage for your database? Done! Need low-cost storage for long-term retention of rarely-accessed data? The cloud has the storage for you!
Taking full advantage of cloud storage requires IT managers to stop thinking in terms of storage volumes and pivot to thinking about the data. By segmenting datasets, you can pick the right resources and, more importantly, pivot data to a better cloud resource as your cloud vendor introduces new services or if you realize your initial choice is under- or over-utilized. Let’s look at the options:
Object storage is built for the cloud. It boasts virtually unlimited scale with a global namespace, which is akin to a universal file directory that makes it seem as if all unstructured data distributed across devices and locations is in a single location. Accessible over HTTP rather than file protocols like NFS and SMB, object is perfect for web-scale access of unstructured data. Object storage is presented to applications through a URL, and storage tasks such as read, write and delete are performed through simple commands that make it easy for applications to consume.
- Unlimited scale and simplicity come with a performance trade-off, as object storage typically has lower performance and higher latency compared with file or block storage.
- To protect your data on object storage, most cloud vendors offer automated replication to storage in a secondary region. Object storage also offers immutability that prevents any modification or deletion of the data for a set period, which is an effective ransomware defense tactic.
- All the major cloud vendors offer multiple classes of object storage with a range of performance and price tiers. With new applications, object storage is a great choice, allowing you to take full advantage of cloud data services and cost optimization.
- The simplicity, scale, and price point of object make it the de facto standard for analytics, AI and ML applications. New cloud-based compute functions default to object storage for scale and simplicity.
- Research-intensive industries such as biotech use object storage because it is low cost, can be accessed globally, is durable and provides easy access for research and collaboration. For example, if I want to share a dataset with an external organization, object storage allows for the creation of “pre-signed” URLs to grant read access; the same access using file storage requires a VPN configuration.
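The pre-signed URL idea mentioned above can be illustrated with a minimal sketch. It assumes a generic scheme in which the storage service and the URL issuer share a secret and sign the object path plus an expiry time with HMAC. The secret, the host name `objects.example.com` and the `expires`/`signature` parameter names are all hypothetical; real cloud vendors each use their own signing formats, so treat this as a picture of the concept rather than any vendor's API.

```python
import hashlib
import hmac
import time
from typing import Optional
from urllib.parse import parse_qs, urlencode, urlparse

# Hypothetical secret shared by the URL issuer and the storage service.
SECRET_KEY = b"demo-secret-not-for-production"


def _sign(path: str, expires: int) -> str:
    """Sign a read (GET) request for one object path until `expires`."""
    message = f"GET\n{path}\n{expires}".encode()
    return hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()


def make_presigned_url(base_url: str, object_key: str, ttl_seconds: int,
                       now: Optional[float] = None) -> str:
    """Build a URL that grants read access to one object for a limited time."""
    now = time.time() if now is None else now
    expires = int(now) + ttl_seconds
    path = f"/{object_key}"
    query = urlencode({"expires": expires, "signature": _sign(path, expires)})
    return f"{base_url}{path}?{query}"


def verify_presigned_url(url: str, now: Optional[float] = None) -> bool:
    """Check that the URL is unexpired and its signature matches its path."""
    now = time.time() if now is None else now
    parsed = urlparse(url)
    params = parse_qs(parsed.query)
    try:
        expires = int(params["expires"][0])
        signature = params["signature"][0]
    except (KeyError, ValueError):
        return False
    if now > expires:
        return False
    expected = _sign(parsed.path, expires)
    # Constant-time comparison avoids leaking signature bytes via timing.
    return hmac.compare_digest(signature.encode(), expected.encode())


url = make_presigned_url("https://objects.example.com",
                         "research/genomes.csv", ttl_seconds=3600, now=1000)
```

The recipient needs nothing but the URL: no account, no VPN. Access ends when the expiry passes, and changing the object path invalidates the signature.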
Also known as Network Attached Storage or NAS, file storage is presented via the popular NFS and SMB protocols for unstructured data. File is often the choice for existing applications versus new “born in the cloud” apps. File will typically boast higher performance and lower latency versus object, with the tradeoff of limited capacity both in terms of number of files and size of volumes. File access is optimized for a local or corporate network versus the global namespace that object offers.
- File is extremely versatile as it can be deployed on SSD for high performance or on dense SATA disk for lower costs.
- To protect data stored on file storage, customers may choose to mirror data (synchronously or asynchronously) to another file store or implement a backup application. Additionally, many file store options support a snapshot capability for rapid restores.
- File storage has become the default storage on-premises for all but the most performance-intensive workloads (see block) due to its solid performance and ease of use. With a long track record in IT infrastructure, most customers are comfortable deploying and supporting file storage. For these same reasons, expect to see rapid growth of file storage in the cloud with mature file offerings and enterprises migrating workloads at a faster pace.
Object and file are abstractions on top of storage resources that can increase scale and simplicity, whereas block storage is the equivalent of a local hard disk or direct attached storage. When implemented over a network, block storage is referred to as a storage area network (SAN). Block storage provides the lowest latency and highest performance because it is dedicated to a single application or server without an abstraction layer.
- Block storage is ideal for applications and structured datasets, such as databases, where performance is the primary consideration. Industries such as fintech will implement block storage to achieve ultra-low latency.
- Data protection on block storage is often implemented by the application that is using the storage. For example, to get a consistent backup of a database, the database application must perform the replication or export.
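As a small illustration of application-driven protection, the sketch below uses SQLite from the Python standard library as a stand-in for a database running on a block volume. The `trades` table and its rows are made up for the example, and a production database would use its own backup or replication tooling. The point is only that the database engine itself produces the consistent copy, rather than the storage layer underneath it.

```python
import sqlite3

# Stand-in for an application database living on a block volume.
source = sqlite3.connect(":memory:")
source.execute(
    "CREATE TABLE trades (id INTEGER PRIMARY KEY, symbol TEXT, qty INTEGER)"
)
source.executemany(
    "INSERT INTO trades (symbol, qty) VALUES (?, ?)",
    [("ABC", 100), ("XYZ", 250)],
)
source.commit()


def backup_database(src: sqlite3.Connection, dest: sqlite3.Connection) -> None:
    """Ask the database engine to copy a consistent image of src into dest."""
    src.backup(dest)


# The backup target; in practice this could be a file on cheaper storage.
backup = sqlite3.connect(":memory:")
backup_database(source, backup)

# Writes made after the backup do not appear in the copy.
source.execute("INSERT INTO trades (symbol, qty) VALUES (?, ?)", ("LMN", 75))
source.commit()

backup_rows = backup.execute(
    "SELECT symbol, qty FROM trades ORDER BY id"
).fetchall()
source_count = source.execute("SELECT COUNT(*) FROM trades").fetchone()[0]
```

Here `backup_rows` holds the two rows that existed when the backup ran, while the source has moved on to three.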
Storage is a space for rapid innovation and the choices are getting more nuanced all the time:
- Object storage is the incumbent of the cloud; once thought of only as “cheap and deep” and only for archive or static data, it can now boast relatively high performance.
- File is a viable contender for workloads like databases that were once considered only for block storage. Given that file is the predominant storage on-premises, we expect it will take on more of the scale and global sharing capabilities of object as customers move to the cloud.
- Block storage is ideal for applications and structured datasets, such as databases, where performance is the primary consideration.
The true innovation of the cloud is that all these options are just a few clicks away and you can change storage as your requirements change. The key is to understand those requirements by analyzing your data both before and after you move to the cloud. Make the best decision for your workloads and data based on current usage and then monitor over time with an eye for new offerings. By optimizing your data for performance, durability and costs over the available resources you can innovate faster and save money.
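One way to picture this data-first approach is a rough sketch that tags each dataset with a few attributes and maps it to a storage type. The `Dataset` fields, the example datasets and the three rules are simplifying assumptions drawn from the guidance in this article, not a complete decision model; a real assessment would also weigh cost, capacity and durability.

```python
from dataclasses import dataclass


@dataclass
class Dataset:
    """Hypothetical description of one segment of data."""
    name: str
    structured: bool          # e.g. database tables
    needs_low_latency: bool   # performance is the primary consideration
    uses_file_protocol: bool  # existing application expects NFS or SMB


def suggest_storage(ds: Dataset) -> str:
    """Map a dataset to block, file or object using simplified rules."""
    if ds.structured and ds.needs_low_latency:
        return "block"   # databases where performance comes first
    if ds.uses_file_protocol:
        return "file"    # existing applications built on NFS/SMB
    return "object"      # unstructured data, analytics, new cloud apps


datasets = [
    Dataset("trading-db", structured=True, needs_low_latency=True,
            uses_file_protocol=False),
    Dataset("home-directories", structured=False, needs_low_latency=False,
            uses_file_protocol=True),
    Dataset("sequencing-archive", structured=False, needs_low_latency=False,
            uses_file_protocol=False),
]

plan = {ds.name: suggest_storage(ds) for ds in datasets}
```

Rerunning the same mapping as usage changes, or as a vendor introduces new services, is what lets you move a dataset to a better fit instead of living with the first choice.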
Steve Pruchniewski is Director of Product Marketing at Komprise.