A look into the Data Deduplication feature and its advantages!

IT professionals in the data storage field have probably come across the term “Data De-Duplication” by now. Still, a small proportion of people connected to the field are not familiar with the term or its real benefits. So, here’s a short post explaining what exactly data deduplication, aka dedup, is and what benefits it brings.

There is no industry-standard definition for this term yet, but the following explanation comes close. Deduplication reduces the required storage capacity because only unique data is stored. For example, a typical email system might contain 100 instances of the same 1MB file attachment. If the email platform is backed up or archived, all 100 instances are saved, requiring 100MB of storage space. With data deduplication, only one instance of the attachment is actually stored; each subsequent instance is just a reference back to the one saved copy. In this example, dedup cuts the storage demand from 100MB to roughly 1MB.
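
The email example can be sketched in a few lines of Python. This is only a minimal illustration, not how any particular product works: the `SingleInstanceStore` class and its method names are hypothetical, and SHA-256 is assumed as the fingerprint used to recognise identical attachments.

```python
import hashlib


class SingleInstanceStore:
    """Toy store that keeps one copy of each unique payload (hypothetical)."""

    def __init__(self):
        self.blobs = {}       # fingerprint -> the single stored copy
        self.references = []  # one fingerprint per saved instance

    def save(self, payload: bytes) -> str:
        # Assumption: a SHA-256 digest identifies identical content.
        fingerprint = hashlib.sha256(payload).hexdigest()
        # Store the bytes only the first time this content is seen.
        self.blobs.setdefault(fingerprint, payload)
        # Every instance, first or later, is recorded as a reference.
        self.references.append(fingerprint)
        return fingerprint

    def logical_bytes(self) -> int:
        """Space needed if every instance were stored in full."""
        return sum(len(self.blobs[f]) for f in self.references)

    def physical_bytes(self) -> int:
        """Space used for unique content (the references are not counted)."""
        return sum(len(b) for b in self.blobs.values())


MB = 1024 * 1024
attachment = bytes(MB)  # stand-in for a 1MB attachment

store = SingleInstanceStore()
for _ in range(100):  # 100 mailboxes holding the same attachment
    store.save(attachment)

print(store.logical_bytes() // MB, "MB before dedup")
print(store.physical_bytes() // MB, "MB after dedup")
```

The sketch reports 100 MB before dedup and 1 MB after. A real system also spends a small amount of space on the references themselves, so the actual saving is slightly smaller.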

With deduplication, here are the benefits you can attain over a normal file system.

Reduced storage allocation- Deduplication can reduce storage needs by up to 80% for files and backups. An enterprise can therefore store far more backup data for a given expenditure, which lengthens the interval between disk purchases. It also makes storing data on disk cost-efficient, so you can take advantage of disk speed and eliminate the need for tape.
Efficient volume replication- Since only unique data is written to disk, only those blocks need to be replicated. Depending on the application, this can reduce replication traffic by as much as 90%.
Effectively increased network bandwidth- If dedup takes place at the source, duplicate copies never need to be transmitted over the network.
A greener environment- Less electricity and fewer cubic feet of space are required to house the data in both primary and remote locations.
Fast recoveries- Faster recovery ensures that line-of-business processes continue unimpeded. Having this feature in your storage appliance also helps ensure that data continuity and disaster recovery plans are well set up.
Lower overall storage costs- Because you’re buying and maintaining less storage, you get a faster return on investment and lower overall storage costs.
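
The replication and bandwidth points can be illustrated with a small Python sketch. It assumes source-side dedup where the sender checks each block's fingerprint against what the target already holds; the `replicate` function, the in-memory `target_store` dictionary, and the use of SHA-256 are all assumptions made for illustration, not a description of any vendor's protocol.

```python
import hashlib


def fingerprint(block: bytes) -> str:
    # Assumption: SHA-256 is used to identify identical blocks.
    return hashlib.sha256(block).hexdigest()


def replicate(source_blocks, target_store):
    """Send only blocks the target does not already hold.

    target_store maps fingerprint -> block on the remote side.
    Returns the number of block bytes sent over the "network".
    """
    bytes_sent = 0
    for block in source_blocks:
        key = fingerprint(block)
        if key not in target_store:
            target_store[key] = block  # transmit the new block
            bytes_sent += len(block)
        # Otherwise only the short fingerprint would need to travel.
    return bytes_sent


BLOCK = 4096
# Ten blocks to replicate, but only two distinct contents among them.
source_blocks = [bytes([i % 2]) * BLOCK for i in range(10)]

target_store = {}
sent = replicate(source_blocks, target_store)
total = sum(len(b) for b in source_blocks)
print(f"sent {sent} of {total} bytes")
```

The ratio printed here is an artifact of the made-up sample data; the actual reduction depends on how much duplicate data the application produces.
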
One more point should be noted here: dedup can act at the file level or at the block level.

File Level Deduplication- With this feature, duplicate files are eliminated: a single instance is kept and pointers are put in place of the other copies. While file-level deduplication is the less computationally demanding of the two, the disadvantage is that even a single minor change to a file results in an additional full copy being stored.
Block Level Deduplication- Block-level deduplication promises much greater overall storage efficiency. It works by searching for redundant information in chunks of data sized 4KB and larger and storing only one copy of each chunk, regardless of how many duplicates exist. The duplicates are then replaced by pointers which reference the original block of data in a way that is seamless to the user, who continues to use a file as if all of the blocks of data it contains are his or hers alone.
So, choose the dedup option that fits your enterprise storage needs.
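
The difference between the two levels shows up clearly when a file changes slightly. The Python sketch below compares them on the same input. It assumes fixed-size 4KB blocks and SHA-256 fingerprints purely for illustration (real products may chunk data differently), and the function names are hypothetical.

```python
import hashlib

BLOCK_SIZE = 4096  # 4KB, the smallest chunk size mentioned above


def digest(data: bytes) -> str:
    # Assumption: SHA-256 identifies identical content.
    return hashlib.sha256(data).hexdigest()


def file_level_bytes(files):
    """Bytes stored when whole files are the unit of deduplication."""
    unique = {digest(f): len(f) for f in files}
    return sum(unique.values())


def block_level_bytes(files):
    """Bytes stored when fixed-size blocks are the unit of deduplication."""
    unique = {}
    for f in files:
        for offset in range(0, len(f), BLOCK_SIZE):
            block = f[offset:offset + BLOCK_SIZE]
            unique[digest(block)] = len(block)
    return sum(unique.values())


# A file made of 16 distinct blocks, plus a copy with one byte changed.
original = b"".join(bytes([i]) * BLOCK_SIZE for i in range(16))
edited = bytearray(original)
edited[0] ^= 0xFF  # flip one byte in the first block
edited = bytes(edited)

files = [original, original, edited]
print("file level: ", file_level_bytes(files))
print("block level:", block_level_bytes(files))
```

With this sample data, file-level dedup stores 131072 bytes (the original plus a full second copy for the edited file), while block-level dedup stores 69632 bytes (the 16 original blocks plus the one block that changed).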

Until a couple of years ago, deduplication was seen as an exclusive tool of large enterprises, with an imposing cost, a daunting learning curve, and file-level deduplication only. Moreover, it could be applied only in support of servers, despite the fact that enormous data stores are contained at the workstation level within most IT infrastructures.

But now, most deduplication products are designed and sold as combined software/hardware solutions. That is, data storage vendors are building this feature into the network operating systems that run on their storage appliances. With the feature integrated into the software intelligence of the storage appliance, storage efficiency improves considerably.

StoneFly, Inc. is one such vendor that offers deduplication in its storage appliances. StoneFly’s StoneFusion Network Storage Platform, a patented and award-winning network operating system, offers deduplication by integrating Permabit’s Albireo data deduplication software. With this feature, users of StoneFly IP SANs gain benefits such as increased storage utilization through resource consolidation, storage provisioning, centralized access control and intelligent volume management.

“Deduplication is a highly sought-after attribute by companies looking for reliable and cost-effective management of their business-critical data,” said Mo Tahmasebi, CEO & President of StoneFly, Inc.
