By Jason Zhang
The most cost-effective way to manage the retention and disposition of data is to use a data archiving system. Archiving keeps primary storage lean by moving cold data off it. It also improves backup performance and makes restores faster, while efficiently retaining data for future use and for compliance and regulatory needs.
Whether it’s a “DATAsaster” or just business as usual, a proper archive solution will provide immediate, in-depth access to data for effortless retrieval.
From an operational perspective, it’s fiscally irrational not to have a quality archiving implementation in each and every IT department.
It’s a no-brainer, providing infrastructure benefits and lowering the overall IT cost of storing and protecting data. Archiving gives the company the flexibility to expand storage when and where it chooses, rather than being driven by past purchase patterns.
Planning a storage archive
An archive storage system is not overwhelming to implement and in many cases won’t require years of committee and policy meetings to figure out what to do with every data type and how long it needs to be stored.
We could talk all day about data types, but let’s keep it simple: grab some data and move it!
If you want to take the easy road, there is software that can monitor data activity and identify its characteristics so you can better choose what to archive. In preparation, there are some questions to ask to make sure the implementation meets the growing needs of the IT infrastructure and business users.
5 must-ask questions
What kind of data are you storing?
The majority of archived data is cold, meaning it hasn’t been accessed in six months or more. Finding the balance between preserving data at the lowest cost and keeping your most important data on the highest-performing storage can be difficult, but once you find an equilibrium, you’ll notice great benefits.
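As a rough illustration, the six-month rule can be automated with a short script. This is a minimal Python sketch, not a specific product’s feature: the `find_cold_files` helper and 180-day threshold are illustrative assumptions, and access times are only trustworthy on volumes that actually record atime (mount options like `noatime` will under-report cold data).

```python
import os
import time

SIX_MONTHS_SECONDS = 180 * 24 * 3600  # illustrative "cold" threshold (~six months)

def is_cold(last_access_epoch, now_epoch, threshold=SIX_MONTHS_SECONDS):
    """Return True if the last access is older than the cold threshold."""
    return (now_epoch - last_access_epoch) > threshold

def find_cold_files(root):
    """Walk a directory tree and yield paths not accessed in ~six months.

    Relies on the filesystem recording access times (atime); volumes
    mounted with noatime/relatime will make everything look cold.
    """
    now = time.time()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                if is_cold(os.stat(path).st_atime, now):
                    yield path
            except OSError:
                continue  # file vanished or is unreadable; skip it
```

A real archiving product layers policy on top of this (owners, legal holds, exclusion lists), but the core selection logic is no more complicated than the comparison above.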
Cold data has to be stored safely, as it may be required in a compliance audit or in potential legal discovery. Many companies are also cashing in on stored information that can be mined for profit as new analytical tools are introduced. For any of these uses, the data must be stored properly to ensure its integrity; an archiving system will also record who accessed the data and when.
Making sure we have a solid understanding of the projected future uses of different data types is important when it comes to introducing policies for data removal from primary storage. Where will it go and for how long will it stay there?
What kind of technology is being used?
Data storage combines hardware and media: tape, hard disk (HDD), solid state (SSD) or a mix. Firmware and software must be kept up to date because of the failure rates associated with these devices, and interfaces will need to change over time as upgrades occur.
Tape clearly has a long life: LTO tape that is used every month can last 17 years, and with very little use and proper storage it could last up to 100 years.
Aside from manufacturing defects and random errors, an HDD typically runs reliably for about three years, after which the mechanical components start to break down; by six years, roughly 50 per cent of drives have failed.
People have the misconception that powering down a drive will help it last longer. In fact, the disk head can land on the platter and stick to the disk surface; if it remains stuck for too long, spinning the drive up again can damage or lose data.
SSDs have no mechanical parts, but they store data electronically, and eventually the cells degrade in capacity and burn out. SSD vendors have mechanisms to relocate data to good cells so that performance is unaffected, but the longevity of long-term retention remains unclear.
Most SSDs are warranted for three years, although some say they last longer than HDDs. In the end, ALL media wear out. So there must be a plan to test the data on a given medium periodically to ensure its readability, and to migrate the data to current media as hardware, software and interfaces change.
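One common way to implement that periodic readability test is a checksum manifest: record a digest for each file at ingest, then re-verify on a schedule. The Python sketch below is illustrative, not any vendor’s API; `verify_manifest` and its path-to-digest manifest format are assumptions for the example.

```python
import hashlib

def sha256_of(path, chunk_size=1 << 20):
    """Stream a file through SHA-256 so large archives never load into memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def verify_manifest(manifest):
    """Compare each archived file against the checksum recorded at ingest.

    `manifest` maps path -> expected hex digest; returns the list of paths
    that are unreadable or no longer match and should be restored from a
    second copy.
    """
    failed = []
    for path, expected in manifest.items():
        try:
            if sha256_of(path) != expected:
                failed.append(path)
        except OSError:
            failed.append(path)  # an unreadable file counts as failed
    return failed
```

Run on a schedule, this catches media decay early enough that a good replica still exists to repair from.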
What is the data format?
To ensure the long-term usability of data, all of the “bits” have to be preserved. There must also be hardware that can read the data from the storage medium and an application that can interpret the data in its file, database or object format. Therefore, it’s mandatory to have a system that regularly checks both the readability of the data and its usability by the application responsible for interpreting it.
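An integrity checksum proves the bits survived, but not that an application can still interpret them. Below is a minimal sketch of that application-level check, using JSON as a stand-in for whatever format the archive actually holds; the helper name is illustrative, and the same try-to-parse pattern applies to any format with a strict reader.

```python
import json

def json_still_usable(path):
    """Application-level check: the bytes can be read AND parsed as JSON.

    A checksum only proves the bytes are intact; this proves a JSON reader
    can still interpret them. Swap in the appropriate parser for other
    formats (XML, Parquet, database dumps, ...).
    """
    try:
        with open(path, "r", encoding="utf-8") as f:
            json.load(f)
        return True
    except (OSError, UnicodeDecodeError, json.JSONDecodeError):
        return False
```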
Is the archive in the cloud or data center?
If we are talking about data center storage and systems cost, ask the IT department; they should be very familiar with all aspects. But isn’t it amazing that we are at a point in technology where we can consider cloud storage as a long-term archive?
Of course, when we’re considering the cloud we have to accept that alongside some great positives there are also some complicated negatives. On the positive side, the cloud is a pay-as-you-go convenience that turns a capital expense into an operational expense. Another clear advantage is that your cloud provider will have strategies in place to manage the type, age and interface of the underlying storage devices. This gives you peace of mind in knowing that your data is always available and readable.
Application readability is still up to you but everything else should be handled by the cloud storage provider.
Now, onto the negatives: as time passes you will accumulate data, and as a result the cost of your cloud-based archive will also grow. Trickling data into the cloud is far more cost-effective than moving data back to the data center.
When data is needed for a recovery operation, system restore or compliance issue, the bandwidth between the data center and the cloud might be too slow for the amount of data being moved and the deadline it must meet. A typical 100 Mbit/s WAN connection can transfer roughly 1TB a day. This is, of course, subject to change depending on how many users are sharing the connection, latency issues, line quality, etc. In an emergency, the data center will need more than 1TB quickly, which requires a dedicated high-speed link or physically shipping the data back to the data center.
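The back-of-the-envelope arithmetic can be sketched as follows, assuming a 100 Mbit/s link (the rate that yields roughly 1TB per day at full utilisation). The `efficiency` factor is an illustrative stand-in for shared users, latency and line quality, not a measured figure.

```python
def transfer_days(data_tb, link_mbps, efficiency=1.0):
    """Days needed to move `data_tb` terabytes over a `link_mbps` megabit/s link.

    `efficiency` (0-1) discounts for shared users, latency and line quality;
    real links rarely sustain their rated speed.
    """
    bytes_total = data_tb * 1e12                 # decimal terabytes
    bytes_per_sec = link_mbps * 1e6 / 8 * efficiency  # megabits -> bytes
    return bytes_total / bytes_per_sec / 86400   # seconds -> days

# A fully utilised 100 Mbit/s link moves 1 TB in a little under a day;
# halve the effective rate and the same transfer takes nearly two.
```

Running the numbers this way for a realistic recovery set (tens of terabytes) makes the case for a dedicated link, or for shipping media, very quickly.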
What do you know about your data?
Many IT departments can tell you their storage capacity, but seldom can they tell you how much data they store by application, owner or age. Several tools are available that scan the environment with no additional hardware purchases.
This software can identify key metadata parameters and produce reports which will assist in planning the archive system by giving deeper insights as to what should be archived and recalling archived data when needed.
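A much-simplified version of such a report could be sketched as follows. This is illustrative Python, not any particular scanning product: real tools also capture owner, application and richer metadata, and the age buckets and their thresholds here are assumptions echoing the six-month cold-data rule.

```python
import os
import time
from collections import defaultdict

def storage_report(root, now=None):
    """Summarise bytes stored per (file extension, age bucket).

    Age is based on modification time. Bucket names and limits are
    illustrative: hot (<6 months), cold (6 months - 3 years),
    frozen (3 years and older).
    """
    now = now if now is not None else time.time()
    buckets = [("hot", 180), ("cold", 3 * 365), ("frozen", float("inf"))]
    totals = defaultdict(int)
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            try:
                st = os.stat(path)
            except OSError:
                continue  # skip files that vanish mid-scan
            age_days = (now - st.st_mtime) / 86400
            label = next(b for b, limit in buckets if age_days < limit)
            ext = os.path.splitext(name)[1].lower() or "(none)"
            totals[(ext, label)] += st.st_size
    return dict(totals)
```

Even a report this crude answers the planning question above: which extensions hold the cold bytes, and how many of them there are.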
When your IT department knows exactly when a department, person or application is going to run out of capacity they can show them the cost of their current data retention policies and work with them to reduce cost or expand services. Business users can have access to weekly, monthly or quarterly reports and charts at the push of a button.
This gives managers the information needed to build an ROI analysis for new system purchases and the details required to implement them. Regular health checks on the data environment and its growth let them see how it’s being managed and make changes immediately if necessary.
Archiving is essential
When it comes to data management, archiving is the service that protects the performance, vitality and health of the IT environment and the longevity of the data. As long as the IT department archives daily or weekly, the environment is not likely to fall apart anytime soon. It’s really the sustained non-performance that leads to data atrophy, sluggish systems and severe costs to upgrade, migrate to new systems or deal with legal and regulatory requests.
Archiving protects against these things and once implemented will positively answer the questions above. The more you know about what to ask yourself and your service provider the better the outcome.