
Achieving high availability at lower costs with modern hyper-storage architectures

Hyper-storage technology was developed to break the myths of the storage world. Trade-offs between cost, capacity, functionality and performance that were absolute in the past are no longer an issue with this modern storage approach. Capabilities that enterprises could previously only dream about have now become reality, creating new opportunities. The hyper-storage approach enables enterprises to achieve the highest data availability at lower cost, and peak performance with infinite scalability and seamless ease of use. With hyper-storage, enterprises no longer need to compromise when making decisions about their storage architectures.

Hyper-storage technology is a unified, multi-petabyte, ultra-high-performance solution, providing very high IOPS and throughput and very low response times. The concept behind hyper-storage is to deliver software so smart that it can beat the strongest hardware in real-life workloads. It is built not on brute force but on brains: optimised, massively parallel utilisation instead of unnecessarily expensive hardware, and machine learning instead of guessing and wasting important resources. It uses every component’s merits while avoiding its flaws to achieve maximum efficiency every second.

The world is heading fast towards data analytics and machine learning across all kinds of technologies; it is time the storage world recognised and adopted them as well. The writing is already on the wall. Look, for instance, at the search engines we all know, and at what happened to those who took these technologies to the next level and to those who did not. Think of who was where 20 years ago, and who is where today.

The same applies to storage. Every single IO needs to be identified, documented, analysed and learned from to improve the next one, and the next one, and the next one. Analysing all kinds of IO patterns and improving them over time is the basis of storage machine learning. The first level of this technology is embedding this automated mechanism in every system on its own, learning the workload patterns running on top of it and improving service dramatically over time. 

The second level, and the most interesting one, is to take these patterns from all the systems in the world, merge them in a single place, and cross-learn from them to constantly improve and optimise the machine learning algorithms. Measuring and analysing cache hits and cache misses, learning and changing methods of data placement, improving pre-fetch mechanisms by learning their effectiveness, re-organising data to improve accessibility: these and many other learnable IO patterns have been waiting for exactly this kind of continuous improvement.
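To make the idea of improving a pre-fetch mechanism by learning its effectiveness more concrete, here is a minimal Python sketch. It is a toy illustration, not a description of any vendor's implementation: the class name `AdaptivePrefetcher`, the depth limits and the simple "grow on a hit, shrink on a miss" rule are all assumptions chosen for clarity.

```python
class AdaptivePrefetcher:
    """Toy read-ahead whose depth adapts to how useful past pre-fetches were."""

    def __init__(self, min_depth=1, max_depth=8):
        self.min_depth = min_depth      # hypothetical lower bound on read-ahead
        self.max_depth = max_depth      # hypothetical upper bound on read-ahead
        self.depth = min_depth
        self.prefetched = set()         # blocks fetched ahead of time, not yet read
        self.hits = 0
        self.misses = 0

    def read(self, block):
        if block in self.prefetched:
            # The earlier guess was right: read further ahead next time.
            self.prefetched.discard(block)
            self.hits += 1
            self.depth = min(self.max_depth, self.depth + 1)
        else:
            # The guess was wrong or absent: waste fewer resources next time.
            self.misses += 1
            self.depth = max(self.min_depth, self.depth - 1)
        self.prefetched.update(range(block + 1, block + 1 + self.depth))


sequential = AdaptivePrefetcher()
for block in range(10):
    sequential.read(block)
print(sequential.hits, sequential.misses, sequential.depth)
```

On a sequential stream the read-ahead depth climbs to its maximum, while on scattered reads it stays at the minimum, so the same code serves both patterns differently. A real system would track far richer statistics than a single counter.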

Treating every pattern differently, and providing it with the dedicated service that suits it specifically, allows this smart architecture to serve all kinds of IO patterns, whatever they are, in the same system. Many different kinds of workloads can run together on the same platform, and each one will be served exactly according to its needs, no more and no less. One storage fits all.

Redundancy over redundancy

Hyper-storage uses many other technologies to maximise performance and availability. Let’s start by putting this on the table: the concept of physical RAID groups is dead. Yet for some inexplicable reason many storage vendors still use it widely. It is inefficient, to say the least, offering limited resiliency and coupling software and hardware together for no reason. A long rebuild process is initiated with every drive failure, and these processes have a dramatic impact on performance, expose systems to data loss and create enormous pain points for every storage manager in the world. Day-to-day performance is constrained in the same way.

Coupling logical entities, such as LUNs and file systems, with a dedicated set of hardware components is limiting, to say the least. As said, this is a thing of the past. The advanced hyper-storage mechanism is based on creating a massive number of block-level logical protection groups, completely decoupled from the hardware. This allows extremely high-performing processes to run in mass parallelism on all components simultaneously, in a completely balanced form.
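The effect of decoupling protection groups from fixed sets of drives can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the pseudo-random placement and the figures (20 drives, 1,000 groups, 6 members per group) are hypothetical and are not taken from any product.

```python
import random
from collections import Counter


def place_groups(num_drives, num_groups, width, seed=0):
    """Assign each logical protection group to `width` distinct drives."""
    rng = random.Random(seed)
    return [rng.sample(range(num_drives), width) for _ in range(num_groups)]


def rebuild_helpers(groups, failed_drive):
    """Count, per surviving drive, the affected groups it can help rebuild."""
    helpers = Counter()
    for members in groups:
        if failed_drive in members:
            helpers.update(d for d in members if d != failed_drive)
    return helpers


groups = place_groups(num_drives=20, num_groups=1000, width=6)
helpers = rebuild_helpers(groups, failed_drive=0)
print(f"{len(helpers)} surviving drives share the rebuild work")
```

Because the groups that lived on the failed drive are scattered over the whole system, the surviving drives all contribute a small share of the reconstruction, instead of a handful of RAID-group members carrying all of it.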

It also provides far faster, data-aware reconstruction processes that have no impact on production performance. Rebuild times have gone from hours and days to seconds and minutes and, needless to say, the performance impact of a rebuild is no longer something we should have to suffer with hyper-storage architectures.

Redundancy over redundancy is another ground rule hyper-storage goes by. The “no single point of failure” standard is obsolete, and five-nines availability is not enough when systems keep getting bigger and the blast radius of each system is on the multi-PB scale. The consequences of unavailability are enormous and can no longer be accepted. Playing in those fields, hyper-storage has to have redundancy over redundancy for every single component, without exception.
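For readers who want to see what "five nines" means in practice, the arithmetic is short. The Python sketch below converts a number of nines into expected downtime per year; the three-nines and seven-nines rows are included only for comparison and are not figures from this article.

```python
MINUTES_PER_YEAR = 365.25 * 24 * 60


def downtime_minutes_per_year(nines):
    """Expected yearly downtime for an availability of `nines` nines."""
    unavailability = 10.0 ** -nines   # e.g. 5 nines -> 0.00001
    return MINUTES_PER_YEAR * unavailability


for n in (3, 5, 7):
    print(f"{n} nines: {downtime_minutes_per_year(n):.4f} minutes per year")
```

Five nines works out to a little over five minutes a year, which sounds small until those minutes make multiple petabytes unreachable at once.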

The system has to stay completely redundant even during failures, no matter which component fails. N+2 protection across the entire data path and the entire power path is the basic architecture required for redundancy over redundancy, but it needs to go far beyond that level for components that tend to fail every once in a while, such as disk drives. By decoupling software from hardware with layers of virtualisation, and by using virtualised spare capacity instead of physical spare drives, hyper-storage allows systems to tolerate dozens of drive failures, no matter which drives fail, before any of them needs to be replaced. When one or more drives fail, a self-healing process is initiated automatically using all active drives, and after several minutes the data is completely reconstructed and balanced across all active drives, back to full parity protection.
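A back-of-the-envelope calculation shows why spreading the rebuild across all active drives changes the time scale. Every number in this Python sketch is an assumption picked for illustration (an 8 TB drive that is half full, 480 drives in the system, 150 MB/s to a single spare, 20 MB/s of rebuild bandwidth per surviving drive), and the model ignores parity computation and read traffic.

```python
TB = 10**12
MB = 10**6


def rebuild_seconds(bytes_to_rebuild, writers, bytes_per_sec_each):
    """Time to reconstruct data when `writers` drives share the work evenly."""
    return bytes_to_rebuild / (writers * bytes_per_sec_each)


# Classic RAID group: the whole drive is rewritten onto one physical spare.
classic = rebuild_seconds(8 * TB, writers=1, bytes_per_sec_each=150 * MB)

# Virtual spare capacity: only the used half of the drive is reconstructed
# (data-aware), and 479 surviving drives each contribute 20 MB/s.
distributed = rebuild_seconds(4 * TB, writers=479, bytes_per_sec_each=20 * MB)

print(f"classic: {classic / 3600:.1f} hours")
print(f"distributed: {distributed / 60:.1f} minutes")
```

Under these assumed figures the single-spare rebuild takes most of a day, while the distributed one finishes in minutes even though each drive gives up only a small fraction of its bandwidth.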

This self-healing process doesn’t require any drive to be replaced; it is completely transparent and has no performance impact whatsoever on the system.

CRC protection is another feature required for systems playing at the multi-PB scale. When such an enormous amount of data is written, logical data corruptions cannot be dismissed: they happen. Hyper-storage, which claims to be at least 100 times more available and reliable than any traditional storage system, solves this issue transparently as well, fixing these corruptions on the fly without users even noticing that they happened. These are only some of the resiliency, availability, serviceability and data integrity features that the hyper-storage architecture contains by design. Many more are implemented in order to offer the best data storage service the world has to offer.
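The CRC protection described above can be sketched with Python's standard `zlib.crc32`. This is a deliberately simplified model: the `ChecksummedStore` class is hypothetical, and a plain second copy stands in for the parity-based redundancy a real system would use to repair a block.

```python
import zlib


class ChecksummedStore:
    """Toy block store: every block keeps a CRC32 and one redundant copy."""

    def __init__(self):
        self.primary = {}   # block id -> (data, crc)
        self.replica = {}   # block id -> data (stand-in for parity)

    def write(self, block_id, data):
        self.primary[block_id] = (data, zlib.crc32(data))
        self.replica[block_id] = data

    def read(self, block_id):
        data, crc = self.primary[block_id]
        if zlib.crc32(data) != crc:
            # Silent corruption detected: repair from the redundant copy,
            # after checking that the copy itself matches the checksum.
            good = self.replica[block_id]
            if zlib.crc32(good) != crc:
                raise IOError(f"block {block_id} is unrecoverable")
            self.primary[block_id] = (good, crc)
            data = good
        return data
```

The caller of `read` never sees the damaged bytes: the mismatch is detected, the block is repaired in place, and the correct data is returned.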

Let's talk money

Before the money, let’s talk a little about management and ease of use. For starters, the way hyper-storage allocates storage capacity to the outside world needs to be made clear. The front-end side is always completely virtual: not a single physical block is formatted or dedicated to the address space provisioned or exported. The space seen by the end-user is always a thinly provisioned address space that is separated from the hardware layer by the layers of virtualisation. This is why the virtual space can be 10, 100 or 1,000 times bigger than the physical space.
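A thinly provisioned address space of this kind can be modelled in a few lines of Python. The sketch below is an assumption-laden toy, not any product's design: the `ThinVolume` class, the 4 KiB block size and the first-free allocator are all hypothetical.

```python
class ThinVolume:
    """Toy thin-provisioned volume: physical blocks are allocated on first write."""

    BLOCK_SIZE = 4096  # hypothetical block size in bytes

    def __init__(self, virtual_blocks, physical_blocks):
        self.virtual_blocks = virtual_blocks
        self.physical = [None] * physical_blocks   # backing store
        self.mapping = {}                          # virtual block -> physical block
        self.next_free = 0

    def _check(self, vblock):
        if not 0 <= vblock < self.virtual_blocks:
            raise IndexError("outside the virtual address space")

    def write(self, vblock, data):
        self._check(vblock)
        if vblock not in self.mapping:
            if self.next_free >= len(self.physical):
                raise RuntimeError("physical capacity exhausted")
            self.mapping[vblock] = self.next_free
            self.next_free += 1
        self.physical[self.mapping[vblock]] = data

    def read(self, vblock):
        self._check(vblock)
        if vblock not in self.mapping:
            return bytes(self.BLOCK_SIZE)   # never written: reads as zeros
        return self.physical[self.mapping[vblock]]


# A volume whose virtual space is 1,000 times its physical space.
volume = ThinVolume(virtual_blocks=1_000_000, physical_blocks=1_000)
volume.write(999_999, b"data")
print(len(volume.mapping), "physical block(s) in use")
```

Nothing is reserved when the volume is created; physical space is consumed only as data is actually written, which is what lets the exported size exceed the installed capacity.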

High flexibility, mass parallelism, auto-balancing and vast data-aware capabilities are a given with this topology. Another nice capability that comes out of it is instant operations across the board. Every entity creation, deletion and modification is completely instant, because only pointers are created and redirected between the layers of virtualisation. The same applies to data copies. A virtually unlimited number of data copies can be created, deleted, cloned, mapped, unmapped, refreshed and restored instantly, at any granularity and with any retention policy required. In fact, every operation done on hyper-storage is completely instant.
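The reason a pointer-based copy can be instant is easy to show in miniature. The following Python sketch uses `collections.ChainMap` to layer block maps; the `SnapshotVolume` class and the redirect-on-write scheme are assumptions made for illustration and do not describe a specific product.

```python
from collections import ChainMap


class SnapshotVolume:
    """Toy volume whose snapshots only add a pointer layer, never copy data."""

    def __init__(self):
        self.blocks = ChainMap()   # newest layer first

    def write(self, block, data):
        # ChainMap writes always land in the first (newest) layer.
        self.blocks[block] = data

    def read(self, block):
        return self.blocks.get(block)

    def snapshot(self):
        # Freeze the current view and redirect new writes to a fresh layer.
        frozen = self.blocks
        self.blocks = frozen.new_child()
        return frozen


volume = SnapshotVolume()
volume.write(0, b"original")
snap = volume.snapshot()
volume.write(0, b"changed")
print(snap[0], volume.read(0))
```

Taking the snapshot touches no data blocks at all, so its cost does not depend on the size of the volume. In this toy the price is paid at read time, since lookups walk the chain of layers.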

An operation can be performed on a single entity, on several, or even on a large number of entities at the same time, and it still takes no more than a split second to complete. It is important to understand that managing these virtualisation layers from DRAM allows for microsecond-latency operations that don’t affect the response times of the IOs being served by the system. These capabilities, alongside many other automation processes built to make every management operation as straightforward as it can be, create incredible ease of use when managing TBs, PBs and even EBs of data using hyper-storage architectures.

Now let’s get down to business and talk about something that is important to every customer in the world: money. The magnificent thing about hyper-storage architecture is that it is built for sky-high performance, yet it can run on low-cost hardware without any penalty. The aggregate throughput that the low-priced “slow” drives provide exceeds that of the TBs of RAM and the dozens of TBs of flash the systems use for cache layers. This means that by converting random workloads to sequential ones and driving all active drives in massive parallel, the drives are far from being the bottleneck of the architecture.
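One common way to turn random writes into sequential ones is a log-structured write path, sketched below in Python. It is offered as an illustration of the general technique, not as the mechanism any particular hyper-storage product uses; the `LogStructuredWriter` class and its flush threshold are hypothetical.

```python
class LogStructuredWriter:
    """Toy write path: random logical writes become one sequential append."""

    def __init__(self, flush_threshold=4):
        self.flush_threshold = flush_threshold
        self.buffer = {}   # logical address -> data, held in memory
        self.log = []      # append-only "disk"; list index = physical offset
        self.index = {}    # logical address -> offset of the latest version

    def write(self, address, data):
        self.buffer[address] = data
        if len(self.buffer) >= self.flush_threshold:
            self.flush()

    def flush(self):
        # Scattered addresses are laid down back to back in the log.
        for address, data in self.buffer.items():
            self.index[address] = len(self.log)
            self.log.append(data)
        self.buffer.clear()

    def read(self, address):
        if address in self.buffer:
            return self.buffer[address]
        offset = self.index.get(address)
        return None if offset is None else self.log[offset]


writer = LogStructuredWriter()
for address in (9001, 17, 4242, 300):
    writer.write(address, str(address).encode())
print([writer.index[a] for a in (9001, 17, 4242, 300)])
```

The four writes target widely scattered logical addresses but land at consecutive physical offsets, which is the access pattern spinning drives handle best. As a rough illustration with assumed figures, 480 drives each sustaining 150 MB/s sequentially would offer 72 GB/s in aggregate.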

This allows for a huge amount of data at a very aggressive price point. It is a race no other architecture has a chance of winning when scaling high, in terms of both CAPEX and OPEX, and a race in which innovation beats brute force by a knockout. For uncompromising quality assurance, the highest-quality hardware is chosen, and every piece of it goes through massive validation, qualification and integration processes to the best standards in the industry. But the ability to run, without any compromise, on media that is not considered the fastest available, and that is priced far more aggressively than today’s fast media, makes the solution very attractive even compared with mid-range market prices. Keep in mind that we are talking about a super-high-end solution. As weird as it sounds, it’s like buying a Mercedes-Benz or a Ferrari for the price of a Fiat.

The millions of IOPS and dozens of GB per second are delivered using hardware about which traditional logic would say: “It doesn’t make any sense, it is impossible! It’s too good to be true!” I keep hearing these and similar phrases about this technology again and again, until the naysayers are proven wrong by seeing it in action. Then the lines change to: “How can it be? What is the secret sauce? Did you sprinkle some magic powder on this system?” One of the nicest things I have experienced is that when the most sceptical are “converted”, they become the biggest fans and the best sponsors.

Let’s summarise

Low-priced, highly qualified hardware and a logical protection group architecture that supports any drive size with no penalty on rebuild times or performance are the building blocks of a super-dense high-end system that can scale infinitely.

Separating the storage innovation from the hardware enables the rapid adoption of the latest and most cost-effective hardware. In addition, by shipping the software with a highly tested, qualified and integrated hardware reference platform, hyper-storage is actually the first true enterprise-class, software-defined storage architecture available today.

Smart data copy management, instant operations, innovative replication, hot upgrades, CRC protection, encryption, compression, auto-balancing, mass parallelism, machine learning and many other features and capabilities are built in by design. Alongside very cost-effective, highly qualified hardware, hyper-storage is the only data storage architecture available today that is built and designed for both the technological and the commercial requirements of the rapidly growing businesses and organisations of the future.

Let me finish by saying that hyper-storage is a very smart software architecture that sets out to overcome hardware limitations, which is the exact opposite of a very strong, expensive hardware architecture that sets out to overcome software limitations.

Hyper-storage is not a technology that sets out to solve only today’s pains in the storage industry; it sets out to solve the challenges of the next 20 and 30 years. Its target is to make organisations future-proof and future-ready for the data explosion of the decades to come. Organisations that do not consider such challenges today will be out of the game tomorrow, and the ones that make the data explosion, data analytics and big data approaches their priorities will lead the world in the future. I suggest you give it a thought.

Re'em Hazan, Technical Sales Director, INFINIDAT

Re'em Hazan is Technical Sales Director for INFINIDAT. He has over 15 years of experience in the IT industry and joined INFINIDAT in mid-2014, managing technical sales for Israel and South Africa. Before joining INFINIDAT, Re'em managed large, complex and heterogeneous IT infrastructures and held technical and management roles in a large financial enterprise and a large IT integrator in Israel (system administrator, backup/storage manager, backup/storage team leader, BCP manager, group manager).