Big Data Challenges

Storing petabytes of data or more requires a new approach to data protection. At petabyte-to-exabyte scale, RAID no longer provides acceptable protection against data loss, and today's backup and replication approaches can't keep up with typical ingest rates at this scale. If you have petabytes of data or more, and are growing at the industry average of 30-40% per year, you need Himalaya.

  • RAID at petabyte scale results in data loss
  • Replication of 3 (or more) copies is too costly
  • Asynchronous replication is inherently vulnerable to data loss (data in flight during a disaster)

With today's high-capacity disk drives (4TB today, 6.4TB in 2014, and predicted to reach 60TB within the decade), the probability of data loss with RAID6 from a non-recoverable read error during a rebuild will soon approach certainty. These large-capacity drives proportionally increase RAID rebuild times to multiple days. According to an Intel study (3), rebuild times between 8.3 and 41.5 hours have been measured for 3TB SATA drives, depending on the IO rates attained from the drives.

During a RAID rebuild, data becomes unprotected (RAID5) or reduced in protection (RAID6). Statistically, this increases the chance that one or more of the remaining disks will also fail due to the increased activity. Studies have also shown that the probability of a subsequent disk failure in a RAID set increases after the first failure. This is due to several factors: the disks are usually the same age, come from the same manufacturing lot, and have been subject to similar read/write patterns as the other disks in the RAID group.
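The risk described above can be illustrated with a rough back-of-the-envelope calculation. This sketch assumes a typical SATA unrecoverable-read-error (URE) specification of 1 error per 10^14 bits and a hypothetical 6-drive RAID5 group of 4TB drives; neither figure is taken from the text beyond the drive capacity, so treat the result as indicative only.

```python
import math

URE_PER_BIT = 1e-14   # typical SATA URE spec (assumption, not from the text)
DRIVE_BYTES = 4e12    # 4TB drive, per the text
DATA_DRIVES = 5       # surviving drives read during a 6-drive RAID5 rebuild (assumption)

bits_read = DRIVE_BYTES * 8 * DATA_DRIVES
# Poisson approximation: P(at least one URE) = 1 - exp(-expected error count)
expected_errors = URE_PER_BIT * bits_read
p_failure = 1 - math.exp(-expected_errors)
print(f"P(URE during rebuild) ~ {p_failure:.0%}")
```

Under these assumptions a single rebuild has roughly an 80% chance of hitting at least one unrecoverable read error, which is why the text calls such an event statistically inevitable at scale.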

Learn More – Get the White Paper

More importantly, a drive encountering an unrecoverable read error (due to bit rot) during a RAID rebuild is statistically almost inevitable, and can lead to data loss or corruption during the rebuild process. In a 10PB data set, data stored in RAID5 (2) groups will statistically experience 2-3 data loss events per year. Storing the data in RAID6 (2) groups still nearly guarantees 1 data loss event per year, as demonstrated in the figure.

This is the reason all large-scale cloud services today, such as Amazon, Microsoft Azure, and others, use either multiple distributed copies (up to 6), erasure coding, or a combination of the two for data protection and durability. Now is the time to abandon obsolete RAID-based storage and eliminate the overhead and vulnerabilities of asynchronously replicated copies by moving to Himalaya, ensuring unmatched data durability for bit-perfect preservation of critical data.

Himalaya is unmatched in both data durability and performance due to its patented BitSpread® rateless erasure coding technology. Himalaya BitSpread delivers durability of greater than 15 nines and can tolerate up to 19 simultaneous hardware failures, while saving up to 60% in storage compared to replication.

This compares to Amazon S3's claimed data durability of eleven nines, and roughly six nines for fully replicated RAID6.
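The storage-saving claim can be sanity-checked with a short sketch. It assumes a "20/4" policy means 20 encoded fragments of which any 4 may be lost (an interpretation of the policy notation, not stated explicitly in the text), and compares its raw-storage overhead to 3-way replication.

```python
def raw_to_usable(total_fragments, tolerated_failures):
    """Raw storage consumed per unit of usable data for an n/k erasure code."""
    data_fragments = total_fragments - tolerated_failures
    return total_fragments / data_fragments

ec_overhead = raw_to_usable(20, 4)   # 20/4 policy (interpretation assumed)
rep_overhead = 3.0                   # 3 full replicated copies
saving = 1 - ec_overhead / rep_overhead
print(f"erasure coding: {ec_overhead:.2f}x raw storage")
print(f"saving vs 3-way replication: {saving:.0%}")
```

Under this reading, erasure coding consumes 1.25x raw storage per usable byte versus 3.0x for triple replication, a saving of about 58%, consistent with the "up to 60%" figure above.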

Data durability is defined as:
  • The chance of losing 1 object in 1 year
  • Equivalently, the fraction of objects lost in 1 year

Himalaya's 15 nines durability for a 20/4 BitSpread policy:
  • 99.9999999999999% durability
  • 1 in 1,000,000,000,000,000 objects might get lost per year

Amazon S3's 11 nines durability:
  • 1 in 100,000,000,000 objects might get lost per year
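The difference between 15 and 11 nines is easier to grasp at fleet scale. This sketch converts "nines" of durability into expected objects lost per year for an illustrative store of one trillion objects (the fleet size is an assumption for illustration only).

```python
def expected_annual_losses(nines, num_objects):
    """Expected objects lost per year, given durability expressed in 'nines'."""
    per_object_loss_prob = 10 ** -nines
    return num_objects * per_object_loss_prob

objects = 1e12  # one trillion stored objects (illustrative assumption)
print(expected_annual_losses(15, objects))  # 15 nines: ~0.001 losses/year
print(expected_annual_losses(11, objects))  # 11 nines: ~10 losses/year
```

At this scale, 15 nines means losing an object roughly once per thousand years, while 11 nines means roughly ten lost objects every year.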

Himalaya also ensures strong consistency of all data written to the object store, including across multiple datacenter sites. Asynchronous replication approaches are susceptible to data loss due to their eventual-consistency limitations.