BakBone Blog

News & Views from BakBone® Software

Posts Tagged ‘deduplication’

Calculating Physical Disk Space for Licensed NetVault: SmartDisk Capacity

Posted by Dawn renee Campbell on March 8, 2010

Dawn renee Campbell, Senior Product Manager for NetVault: SmartDisk & NetVault: Backup

After you have determined how much NetVault: SmartDisk (NVSD) capacity you need to license, the next step is calculating how much physical disk space you will need.

As we discovered in my Calculating NetVault: SmartDisk License Capacity blog, NVSD is licensed based on the Logical Capacity of the data that it can store. However, in Deduplicated NVSD Instances, Logical Capacity does not match Physical Capacity or physical disk space. This is because NVSD Deduplication Option packs up to 12 times more protected data into the same storage area for a 92% reduction in storage footprint.
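As a quick sanity check on how those two figures relate, a deduplication ratio can be converted into a footprint reduction as 1 - 1/ratio. The snippet below is only an illustration; the function name is made up for this example and is not part of NVSD.

```python
def footprint_reduction(dedup_ratio: float) -> float:
    """Fraction of physical space saved for a given logical:physical ratio."""
    if dedup_ratio <= 0:
        raise ValueError("dedup_ratio must be positive")
    return 1.0 - 1.0 / dedup_ratio


# A 12:1 ratio means 12 TB of protected data fits in 1 TB of disk.
print(f"{footprint_reduction(12):.0%}")  # prints 92%
```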

Deduplicated NVSD Instances

A Deduplicated NVSD Instance can hold a combination of Deduplicated and Non-Deduplicated data. In this configuration, you calculate the total Physical Capacity or physical disk space by calculating the Physical Capacity for the Deduplicated Backups, then the Physical Capacity for the Non-Deduplicated Backups, and adding the two results together.

Deduplicated Backups

The Physical Capacity or physical disk space required for Deduplicated Backups in Deduplicated NVSD Instances is equal to the Size of Weekly Full Backups + Unique Data Size and is calculated using the following formula:

(Size of Weekly Full Backups)

+ Size of Weekly Full Backups

+ ((Size of Weekly Full Backups * Weekly Change Rate) * Weekly Full Backup Retention Period)

+ (Size of Daily Backups * (Number of Daily Backups between Weekly Full Backups * Daily Backup Retention Period))
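The formula above can be sketched in Python as follows. This is a minimal illustration that takes the formula exactly as printed, including both Size of Weekly Full Backups terms. The function and parameter names are invented for this example, the sample figures are made up, and the retention periods are assumed to be expressed in weeks.

```python
def dedup_physical_capacity(
    weekly_full_size: float,
    weekly_change_rate: float,
    weekly_full_retention: float,
    daily_size: float,
    dailies_between_fulls: int,
    daily_retention: float,
) -> float:
    """Physical space for Deduplicated Backups, term by term as in the formula."""
    first_full_term = weekly_full_size
    second_full_term = weekly_full_size
    # Changed data from the weekly fulls, kept for the full retention period.
    weekly_changes = (weekly_full_size * weekly_change_rate) * weekly_full_retention
    # Daily backups taken between fulls, kept for the daily retention period.
    dailies = daily_size * (dailies_between_fulls * daily_retention)
    return first_full_term + second_full_term + weekly_changes + dailies


# Hypothetical sizing, all sizes in GB (assumed values, not from the post):
needed_gb = dedup_physical_capacity(
    weekly_full_size=1000,
    weekly_change_rate=0.05,   # 5% of the full changes each week
    weekly_full_retention=4,   # assumed to be in weeks
    daily_size=20,
    dailies_between_fulls=6,
    daily_retention=2,         # assumed to be in weeks
)
print(f"{needed_gb:.0f} GB")  # prints 2440 GB
```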

Non-Deduplicated Backups

The Physical Capacity or physical disk space required for Non-Deduplicated Backups in a Deduplicated NVSD Instance is calculated using the following formula:

(Size of Non-Deduplicated Weekly Full Backups * Weekly Full Backup Retention Period)

+ (Size of Non-Deduplicated Daily Backups * (Number of Daily Backups between Full Backups * Daily Backup Retention Period in Weeks))

Total Required Disk Space = Deduplicated Backup Disk Space + Non-Deduplicated Backup Disk Space
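As a rough sketch, the Non-Deduplicated formula and the total can be written in Python like this. The function names, parameter names and sample figures are assumptions made for this example only, and the retention values are assumed to be in weeks.

```python
def non_dedup_physical_capacity(
    weekly_full_size: float,
    weekly_full_retention: float,
    daily_size: float,
    dailies_between_fulls: int,
    daily_retention_weeks: float,
) -> float:
    """Physical space for Non-Deduplicated Backups in a Deduplicated NVSD Instance."""
    fulls = weekly_full_size * weekly_full_retention
    dailies = daily_size * (dailies_between_fulls * daily_retention_weeks)
    return fulls + dailies


def total_required_disk_space(dedup_space: float, non_dedup_space: float) -> float:
    """Total Required Disk Space = Deduplicated + Non-Deduplicated disk space."""
    return dedup_space + non_dedup_space


# Hypothetical sizing, all sizes in GB (assumed values, not from the post):
non_dedup_gb = non_dedup_physical_capacity(
    weekly_full_size=500,
    weekly_full_retention=4,
    daily_size=10,
    dailies_between_fulls=6,
    daily_retention_weeks=2,
)
dedup_gb = 2440  # example figure for the Deduplicated Backups
print(f"{non_dedup_gb:.0f} GB")                                       # prints 2120 GB
print(f"{total_required_disk_space(dedup_gb, non_dedup_gb):.0f} GB")  # prints 4560 GB
```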

The Total Required Disk Space is divided into the Staging Store and the Chunk Store. If different file systems or disk spindles are going to be utilized for the Staging Store and the Chunk Store, it is important to know how much of the Total Required Disk Space will be allocated to the Staging Store versus the Chunk Store. The calculations below can be used to make this determination.

Read the rest of this entry »

Posted in BakBone North America

Storage Technology Predictions for 2010 – What’s to Come

Posted by BakBone on January 21, 2010

Andrew Martin, VP of APAC

For much of 2009, deduplication was the “buzz” technology in storage. We started seeing major adoption of deduplication across different business sectors and companies of varying sizes, from large enterprises down to SMEs. In addition, we saw a move from deduplication specialist companies dominating the space to traditional storage and software companies (BakBone included) coming to market with their own deduplication offerings. Finally, we saw the soap-opera-like battle of two industry giants slugging it out to take control of the leading deduplication specialist, Data Domain.

So what’s going to be the storage buzz technology of 2010?

Unfortunately, I don’t see anything on the horizon about to explode into every IT department. I stand to be proved wrong, but I do not see a technology taking us by storm in 2010 in the way deduplication did last year.

Cloud storage is growing, with many vendors aiming to get on the bandwagon, form alliances and drive IT departments toward their vision of how cloud should be adopted. However, I have spoken to many people about cloud storage (IT directors, government departments, analysts, journalists and systems integrators), and whilst cloud is becoming increasingly relevant, it seems that adoption of cloud-based storage will still be steady, with corporate companies in particular experimenting with different providers to find out what, if anything, works for them.

Whilst cloud is going to continue to grow, my educated guess is that it won’t be the must-have storage technology of 2010.

I have seen one technology making a comeback – archiving. However, I don’t think we can refer to archiving as something new; in fact, it is one of the oldest storage technologies around, with HSM-type products being popular at the height of the mainframe days.

However, across Asia and beyond, many companies are expressing a renewed interest in spending money on archive and HSM-type products. This is not only for e-mail but also for applications like SharePoint and even for file-system data. In addition, we are seeing numerous startup companies come to market with new archive products. It is interesting to see startups cropping up in such an old arm of storage technology. However, these new companies are working on the next generation of archive technology: products that understand and take advantage of server and storage virtualisation, that can classify data in many different ways and that can even build a repository of archived data for analysis by other applications. BakBone recently released NetVault: Archive in Europe (a “next-generation archive product”), and it has been fascinating to see the level of demand for us to launch this product in Asia.

For me, I wonder what the natural progression from archiving might be. I have wondered if there might ever be demand for a vendor to build a platform that can intelligently delete data. I do not mean secure deletion; there are many light utilities available that securely delete data. My vision is of a technology that can identify, based on strong preset criteria, data that can be permanently deleted without impacting compliance or inadvertently removing required or useful data.

We are always reluctant to simply delete data; however, we know two things for a fact:

1 – The rate of data growth causes significant data management problems.

2 – We store and protect large volumes of data that we do not need and will never use again.

Deleting data sends a chill down the spine of most IT administrators and is something we just don’t do. The risk of deleting something critical outweighs the potential benefit of getting rid of the waste.

However, if intelligent file systems ever become a reality, then maybe a product that allows data to automatically be “thrown” into the waste bin could be in our future.

Posted in BakBone Asia

Choosing what data to deduplicate with NetVault: SmartDisk

Posted by Dawn renee Campbell on December 23, 2009

Dawn renee Campbell, BakBone Senior Product Manager

You have decided that you want to deploy NetVault: SmartDisk (NVSD) Deduplication Option, but you are asking yourself, “What data should I deduplicate?” To answer this question you must first understand that not ALL data can benefit from deduplication, and deduplication does have a restore cost.   

While data deduplication does reduce storage costs by reducing the storage footprint needed to store the data, there is a cost during the restore process. During the restoration of a deduplicated backup, any deduplication technology (such as NVSD) has to reassemble the chunks as it restores the data. This reassembly process, also called rehydration, lengthens the time it takes to restore the data. Therefore, if the recovery time objective (RTO) is important for a particular database, e-mail system, or file system, deciding between reduced storage costs and fast recovery will help you determine what data you want to target for deduplication. If your RTO for Exchange, SQL Server or Windows File Servers is more important than reducing your storage footprint, you should consider protecting this data with BakBone’s NetVault: FASTRecover, which provides 30-second recovery for Exchange, SQL Server and Windows File Servers.

What Data is an Ideal Target for Deduplication?    

For data in which reducing the storage footprint is your top priority, the following types of data are ideal targets for deduplication:   

  • Structured databases such as Oracle and SQL Server databases that are protected by the NetVault: Backup Application Plugin Modules (APMs)
  • Unstructured file system data, such as that stored on File Servers, protected by the NetVault: Backup File System Plugin
  • Workstation data, such as desktops and laptops, protected by the NetVault: Backup Workstation Client
  • Virtual Machine images from the same OS and similar applications, such as those protected by the NetVault: Backup VMware and Hyper-V Plugins
  • E-mail such as Exchange and Lotus Domino that is protected by the NetVault: Backup APMs for Exchange Server and Lotus Notes/Domino. While e-mail is an ideal target for deduplication, a Single Instance Store such as that provided by Exchange will reduce the amount of duplicate data; therefore, e-mail’s deduplication ratio will not be as high as that of other types of data.

What Data is Better Targeted for Non-Deduplication?    

Data that does NOT deduplicate well and should not be deduplicated includes:

  • Encrypted data because the data stream is unique
  • Data with demanding RTOs

What Data Should be Deduplicated Together?    

Now that you have determined what data you want to target for deduplication, you have to determine what data you should deduplicate together. Deduplication ratios can be increased when backups from the same database, file system or application are targeted to the same NVSD Instance or Device. When a backup is deduplicated and a previous backup from the same database, file system, or application has already been deduplicated by the NVSD Instance, only the unique or new chunks that didn’t exist in the previous backup have to be stored in NVSD’s Chunk Store. If a previous deduplicated backup doesn’t already exist in the NVSD Instance, a large majority of the backup will be considered unique data, which increases the number of unique chunks that have to be stored in the Chunk Store.

Therefore, when targeting Backups to NVSD Instances, deduplication ratios will decline if backups are targeted to random NVSD Instances. We recommend backups from the same database, file system or application be targeted to the same NVSD Instance.   
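The effect described above can be illustrated with a toy chunk store in Python. This is a simplified sketch, not how NVSD is implemented: the fixed 4 KB chunk size, the SHA-256 fingerprints and every class and function name here are assumptions made purely for illustration. It needs Python 3.9 or later.

```python
import hashlib

CHUNK_SIZE = 4096  # hypothetical fixed chunk size, chosen only for this illustration


class ChunkStore:
    """Toy content-addressed store: each unique chunk is kept once."""

    def __init__(self) -> None:
        self.chunks: dict[str, bytes] = {}

    def store_backup(self, data: bytes) -> list[str]:
        """Store a backup and return its recipe (ordered chunk fingerprints)."""
        recipe = []
        for offset in range(0, len(data), CHUNK_SIZE):
            chunk = data[offset:offset + CHUNK_SIZE]
            digest = hashlib.sha256(chunk).hexdigest()
            self.chunks.setdefault(digest, chunk)  # only new chunks take up space
            recipe.append(digest)
        return recipe

    def restore_backup(self, recipe: list[str]) -> bytes:
        """Reassemble (rehydrate) a backup from its stored chunks."""
        return b"".join(self.chunks[digest] for digest in recipe)


def make_chunk(tag: int) -> bytes:
    """Build one chunk of repeatable sample data from an integer tag."""
    return tag.to_bytes(4, "big") * (CHUNK_SIZE // 4)


# Two backups of the same source; the second differs in its last chunk only.
monday = b"".join(make_chunk(i) for i in range(100))
tuesday = monday[: 99 * CHUNK_SIZE] + make_chunk(1000)

# Both backups targeted to the same instance.
same_instance = ChunkStore()
same_instance.store_backup(monday)
same_instance.store_backup(tuesday)

# Each backup targeted to a different instance.
instance_a, instance_b = ChunkStore(), ChunkStore()
instance_a.store_backup(monday)
instance_b.store_backup(tuesday)

print(len(same_instance.chunks))                       # prints 101
print(len(instance_a.chunks) + len(instance_b.chunks))  # prints 200
```

In this sketch, sending both backups to one store keeps 101 chunks, while splitting them across two stores keeps 200, because the second store has never seen the first backup's chunks.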

If you want to learn more about defining your NVSD Deployment Strategy, check out the Deployment Strategy chapter in the NetVault: SmartDisk Installation Guide, available at www.bakbone.com/documentation.

Posted in BakBone North America

Podcast – Spotlight on NetVault: Backup Featuring First Looks at NetVault: SmartDisk

Posted by Amber Winans on October 14, 2009

Dawn renee Campbell


    
6 min 44 sec
 
In this technology spotlight podcast we talk to Senior Product Manager Dawn renee Campbell about NetVault: Backup 8.5 and the newly introduced NetVault: SmartDisk. Dawn discusses the new features in NetVault: Backup, including the new Hyper-V and encryption plugins, and dives into the specifics of our new disk-based data protection and deduplication capabilities in NetVault: SmartDisk.
 
 

Posted in BakBone North America, Podcasts

How Green is Green?

Posted by BakBone on October 7, 2009

Andrew Martin


A few months ago I was presenting at a conference in Kolkata aimed at stimulating adoption of IT in West Bengal. One of the other presenters at the event was the trade minister for the region, and I was fascinated as he outlined the challenges and aspirations he saw ahead in West Bengal. However, I was also struck by a comment he made about green computing. He mentioned that it was important to repurpose old IT equipment and recycle old computers from companies into the community.

I have no doubt that his intention is a good one. Taking computers discarded by businesses and placing them into the hands of community centres and schools is something that will deliver enormous benefit to large parts of the population that could not even begin to think about buying their own computers. However, is this green computing as the minister described?

The answer is “I don’t know” and the truth is, when you look into the issue of energy saving it is enormously complex.

On the surface, if you extend the life of a PC and keep it in use after the original owner has discarded it, that would seem to be a green practice, as manufacturing requirements for new PCs will be reduced. However, to really assess whether an IT-related policy is truly green, so much more needs to be considered than simply extending the life of a piece of equipment.

  • How much energy was consumed in the manufacturing process?
  • How much energy was consumed in the manufacturing of the components?
  • What is the ongoing energy consumption over time? (Older equipment typically consumes more energy just to run.)
  • How much energy will be consumed in the safe disposal of the product?

In short, to really understand if we are achieving green computing we need to consider the energy consumption over the entire lifetime of the product, from manufacturing through to the final disposal process. For most of us this is an impossible task. It requires expert consultancy and a long-term vision, with senior-level commitment to that vision. Large, multi-national companies are taking this approach, but for the rest of us, necessity demands that we take a more pragmatic approach.

The view that I have heard from many IT managers and directors is something akin to, “I believe in looking for efficiencies and cost savings and often if I can achieve these, the result is less power consumption and fewer wasted resources. If I can drive efficiency and save money, the spin off is that most likely I am being more green.”

On balance, I subscribe to this view. Efficiency is generally a good thing. When we utilise just enough resources to deliver the functionality that we need, then waste is trimmed, budgets are saved and over-consumption is eliminated.

When it comes to data protection, the practice is intrinsically “non-green.” Data protection involves creating duplicate copies of existing data. By its nature this process creates “waste” as it requires more storage hardware, dedicated storage networks, redundant systems, increased power consumption, and the list goes on.

However, all vendors in the data protection space are trying to drive efficiency, and in doing so, are finding ways to reduce excess resource utilisation. Examples of technologies that aid the green cause, even if their reason for being has nothing to do with green issues, include:

  • Tape – Tape is offline media. It does not consume power when stored in a vault. Perhaps it is the greenest of all media?
  • Data deduplication – Compressing 30TB of data onto 5TB of disk is truly efficient. Less hardware is required and ongoing power consumption is reduced.
  • MAID – Intelligently powering down disks that are not in use saves money on datacenter cooling and on powering the disks themselves.
  • Virtualisation – Converting physical servers or storage into virtual ones enables you to use more of the resources in every piece of hardware you own.
  • Cloud-based backup – Sharing a single backup infrastructure between numerous companies and users.
  • Software enhancement – Things like “incremental forever” eliminate full backups, which in turn reduces media consumption.

Most of us believe the green cause in IT is a good one. Actually doing something about it can be complex and difficult. However, keeping up with the data protection technology curve is in itself a way to support a green agenda and save costs. When it comes to repurposing old PCs in Kolkata, is that green? Possibly not. However, is it wrong? Well, that’s a moral dilemma and a whole different debate!

Posted in BakBone Asia