Five Key Factors That Will Define the Future of Unstructured Data Storage

Posted on August 28, 2016

This article contains excerpts from a recently authored ActualTech Media whitepaper. It will be updated with a link to the full paper as soon as it’s available for download!

Every organization leverages data in some way. Whether that data is in the form of neatly designed databases or instead consists of files randomly strewn about the data center, the fact is that every business relies on data in order to operate. However, different kinds of data present different kinds of business and technology-related challenges in the modern enterprise.

As the name implies, “unstructured data” consists of data that does not have a well-defined structure. “Structured data,” on the other hand, is neat and orderly. Relational databases, for instance, are examples of structured data.

With unstructured data, organizations face two challenges:

Where do they store the continually increasing amount of data that’s being produced?
How do they make sense of all of the bits and pieces of data that are being captured?

Many companies attempt to use traditional block and file storage to meet all of their storage needs. However, as you’ll learn in the next section, traditional storage is often not up to the task, particularly as you consider the economics of the solution.

Six Challenges With Traditional Storage

Architectural Rigidity
Block- and File-centricity
Lockstep Scalability
Hardware Lock-in
Location Lock-in
Cost and Complexity

Unstructured Data in the Future

It’s time for storage admins and IT decision makers to begin rethinking storage. In recent years, the storage market has undergone tremendous transformation, with solutions that address some of the challenges in traditional storage. Looking ahead, however, five key characteristics will drive the future of storage systems that support unstructured data. They are:

Object Storage
Frictionless Deployment
Agility and Flexibility
A “no compromise” feature set
Improved Business Outcomes

Introducing NooBaa

We’re pleased to be one of the first to get a look at an awesome new solution that addresses and realizes these five key factors: NooBaa Frictionless Storage!

**Figure 1** – NooBaa architecture

NooBaa storage provides agile, low-cost, scalable storage for unstructured data workloads such as Splunk, enterprise content archival, and media archival. NooBaa supports the use of any compute host as a storage node, including completely heterogeneous or even shared resources.
Part of what makes NooBaa unique is the architecture. What’s interesting about the software-defined storage (SDS) nature of the NooBaa solution is that the design is reminiscent of software-defined networking (SDN): the control plane is separate from the data plane, and the controller manages any number of nodes. See Figure 1 for a simplified look at NooBaa’s architecture.
The team launching NooBaa claims that it can be deployed in 15 minutes or less to start bringing value immediately. We took it for a spin in the ATM lab and found this to be true! We were able to cluster storage resources spanning a handful of AWS EC2 CentOS instances as well as some on-premises Windows Server 2012 machines. It’s as simple as deploying the management appliance (which is the control plane), and then using wget or Invoke-WebRequest (depending on your OS) to install the agents that consume the storage.
In a matter of minutes, we were able to create the storage pools (aggregated storage resources) seen in Figure 2.

**Figure 2** – The NooBaa Overview UI

Then, with a few more clicks, we were able to create S3 buckets with policies applied to do things like mirror data across multiple pools. In this case, one of our buckets, holding mock “important” data, was mirrored across on-premises and AWS nodes to increase availability. On another bucket, the policy was set to stripe data across both pools, allowing us to aggregate storage capacity across multiple data centers.
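To make the difference between the two bucket policies concrete, here is a minimal Python sketch. It is a toy model, not NooBaa’s implementation: the `Pool` class, the function names, and the 4-byte chunk size are all hypothetical and exist only to illustrate mirroring versus striping.

```python
from dataclasses import dataclass, field


@dataclass
class Pool:
    """Hypothetical storage pool: a name plus the data it currently holds."""
    name: str
    chunks: dict = field(default_factory=dict)


def place_mirrored(key, data, pools):
    """Mirror policy: every pool receives a full copy of the object."""
    for pool in pools:
        pool.chunks[key] = data


def place_striped(key, data, pools, chunk_size=4):
    """Stripe policy: fixed-size chunks are dealt round-robin across pools."""
    for offset in range(0, len(data), chunk_size):
        index = offset // chunk_size
        pool = pools[index % len(pools)]
        pool.chunks[(key, index)] = data[offset:offset + chunk_size]


def read_striped(key, pools):
    """Collect the chunks of a striped object from all pools, in order."""
    parts = {}
    for pool in pools:
        for chunk_key, chunk in pool.chunks.items():
            if isinstance(chunk_key, tuple) and chunk_key[0] == key:
                parts[chunk_key[1]] = chunk
    return b"".join(parts[i] for i in sorted(parts))
```

In this model, a mirrored object survives the loss of either pool at the cost of storing it twice, while a striped object uses the combined capacity of both pools but needs both of them to be readable.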
NooBaa storage is all S3, all the time. That means that to consume it, one needs to connect an application or S3 browser using the RESTful interface. This will be very familiar to anyone used to consuming Amazon S3 storage, or other on-prem S3-compatible storage. Figure 3 shows an example of connecting to one of our test buckets.

**Figure 3** – NooBaa Connect Application dialog showing connection information
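For readers who would rather connect from code than from an S3 browser, the following is a rough Python sketch using the third-party boto3 library, which accepts a custom `endpoint_url` for S3-compatible storage. The endpoint, access key, secret key, and bucket name are placeholders we made up; substitute the values from your own connection dialog. We have not verified this exact snippet against NooBaa, so treat it as a starting point.

```python
def s3_client_kwargs(endpoint_url, access_key, secret_key):
    """Build the keyword arguments boto3 needs for an S3-compatible endpoint."""
    return {
        "service_name": "s3",
        "endpoint_url": endpoint_url,
        "aws_access_key_id": access_key,
        "aws_secret_access_key": secret_key,
    }


def connect(endpoint_url, access_key, secret_key):
    """Create an S3 client pointed at the given endpoint instead of Amazon."""
    import boto3  # third-party: pip install boto3

    return boto3.client(**s3_client_kwargs(endpoint_url, access_key, secret_key))


def list_keys(client, bucket):
    """Return the object keys from one list call (at most 1,000 keys)."""
    response = client.list_objects_v2(Bucket=bucket)
    # "Contents" is absent from the response when the bucket is empty.
    return [obj["Key"] for obj in response.get("Contents", [])]


# Example usage with placeholder values (all hypothetical):
# client = connect("https://noobaa.example.internal", "ACCESS_KEY", "SECRET_KEY")
# print(list_keys(client, "test-bucket"))
```

Because the interface is plain S3, the same client object can then be used for the usual `put_object` and `get_object` calls.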

You might wonder, “What would I actually use storage like this for today?” The opportunities for this type of storage are growing rapidly. Today, an organization might consider using this type of storage for retention-oriented enterprise applications such as Splunk (or ELK, etc.). It would also be a really good storage target for S3-compatible backup tools such as those offered by Commvault, Veritas, and others.
In the future, there are other pretty clear areas where we expect to see an uptick in consumption of this type of storage: use cases like video surveillance, media & entertainment, and life sciences. There will be even more, but these are some of the most obvious.

Test Drive It Yourself

There’s no better way to learn about a solution than to get your hands dirty. I’m happy to say that NooBaa has done one of my favorite things: they’re offering a free community edition! With the community edition, you get all the goodness of a full-on NooBaa installation and can manage up to 20 TB of data! That’s more than enough to give it a whirl in the lab, so be sure to grab a copy and test it out!
If you happen to be at VMworld 2016 US, NooBaa has a booth here and would love to meet with you! They’re at Booth 442 in the Solutions Exchange. We’re genuinely enjoying learning about the solution and getting to know the NooBaa team. Be on the lookout for more ActualTech Media content about NooBaa’s offerings during VMworld!