File Servers: Biggest Problem with Protecting Them and How to Fix It
File Servers

File servers consolidate data from multiple users into a single repository. This gives you common storage for all of the data and makes file sharing between multiple users and projects convenient. You allocate large volumes of RAID-protected disk to a file server and then allocate shares to your users. Users have full rights to their own folders and full or limited access to shared folders.

While most people think the real data sits in databases, I believe the real data is on the file servers. Users create a lot of data from their knowledge and experience. This data is not just a transaction; it is the intellectual property of the person creating it. I once saw someone build a final report in MS Excel that pulled data from 11 different Excel files: a department updates its data in one file and the entire report updates immediately. Imagine the impact if that is lost. Apart from losing the data itself, the user has to spend a huge amount of time rebuilding the entire logic, whereas with ERP software you may lose an important record but the base application can simply be reinstalled. It therefore becomes even more important to protect this data from accidental deletion or overwriting. Whether you are a small office or a large enterprise, file servers are among your most critical data assets and need to be protected properly.

The BIG Problem

The biggest problem you face when backing up file servers is that small files take much longer to back up, and organizations hold terabytes of such data.

Small Organization

Many small organizations still prefer copying their data to an external USB drive. This starts out as a good practice but does not continue as a regular, long-term one, because you are not always available to sit down and perform the copy-paste. Investing in automatic backup infrastructure and managing it is not a simple task for organizations with small IT setups, and I have seen many organizations lose their data to exactly this. Many setups are too small to even have RAID protection.

Protecting the data in the cloud is a good option in such scenarios. Allow us to install the Virtual Vault software in the environment and schedule the backups. The first backup processes the entire data set, and the subsequent "always incremental" backups ensure that only the changes are backed up, compressed locally before transfer. This minimizes the bandwidth requirement for the organization. Zero infrastructure investment and a pay-as-you-use model make the cost of protection even easier to handle, and you can protect your desktops and laptops in the same environment without buying more infrastructure or licenses.
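To make the "always incremental" idea concrete, here is a minimal sketch of that style of backup. It is an illustration only, not the Virtual Vault implementation: the share path, staging folder, and manifest format are assumptions I have made for the example. It detects change by hashing file content, so only genuinely modified files are compressed and queued for transfer.

```python
# Minimal sketch of an "always incremental" backup pass: only files whose
# content has changed since the last run are compressed locally.
# SOURCE, BACKUP, and the JSON manifest are assumptions for this example.
import gzip
import hashlib
import json
import shutil
from pathlib import Path

SOURCE = Path("/srv/fileshare")            # assumed share to protect
BACKUP = Path("/var/backups/incremental")  # assumed local staging area
MANIFEST = BACKUP / "manifest.json"        # content hash of every file seen so far


def file_hash(path: Path) -> str:
    """Hash file content in chunks so large files do not exhaust memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def incremental_backup() -> None:
    BACKUP.mkdir(parents=True, exist_ok=True)
    manifest = json.loads(MANIFEST.read_text()) if MANIFEST.exists() else {}

    for path in SOURCE.rglob("*"):
        if not path.is_file():
            continue
        relative = str(path.relative_to(SOURCE))
        digest = file_hash(path)
        if manifest.get(relative) == digest:
            continue  # content unchanged, skip even if metadata changed
        target = BACKUP / (relative + ".gz")
        target.parent.mkdir(parents=True, exist_ok=True)
        with path.open("rb") as src, gzip.open(target, "wb") as dst:
            shutil.copyfileobj(src, dst)  # compress locally before any upload
        manifest[relative] = digest

    MANIFEST.write_text(json.dumps(manifest, indent=2))


if __name__ == "__main__":
    incremental_backup()
```

Comparing content rather than timestamps also means files whose metadata changed but whose data did not are not re-sent, a point that comes up again in the next section.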
Large Organization

Large enterprises can afford external storage with dual controllers, RAID protection, and hot-spare disks; the more redundancy you add, the more reliable the setup becomes. But what if a user is looking for data he created only a day before? Most of the file server recovery requests you get are for a previous version of a small file; you rarely see a complete crash.

For larger organizations, protecting file servers is the biggest challenge. The volumes are huge and the tape infrastructure is slow. I have seen organizations upgrade their tape infrastructure every 18-24 months believing it would reduce their backup windows. It looks good initially but is not a long-term solution. The backup windows keep increasing because the large number of small files takes much longer to back up. Given the volumes of data on file servers, you end up taking full backups every month, as it takes 4-5 days or even more to complete a full cycle. Incremental backups for the rest of the month help shorten the backup window, but recoveries during the month depend on that chain and therefore take an equally long time to complete. On top of this, managing and storing tapes needs a lot of care and investment. Organizations should consider categorizing their data by usage pattern and archiving the old data; this helps immensely in reducing the backup window.

I once dealt with daily incremental backups that took far too long to complete, consuming as much time and media as the full backups. When we diagnosed this further, we found that the access rights were configured in such a way that every time you add or remove a user from a set of shared folders, the access bit of every file in those folders changes and the files get a new timestamp. Incremental backups work on timestamps, so the moment they see a change, all the files get backed up again even though the content has not changed. We immediately initiated a pilot on sample data and could show the benefit of our solution with deduplication: it picked up only the changed data instead of picking up every file again on an access-bit change.

Another good option for large enterprises is to archive their file servers regularly. I recently met someone who does this manually every year. They take a full backup of the file server and find the files that have not been accessed for over a year. Treating the backup as an archive, they delete these files from the production systems; if a user requests an old file, they recover it from the old backups. This reduces the file server workload and improves backup performance. The only issue is the amount of manual effort involved. The process can be automated: a professional archiving tool can handle the entire workflow, give you easier access to your older data, and let users retrieve their own files. Using cloud-based archival infrastructure can drastically reduce the cost of retention and provide much easier access to the archived data.
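As a rough illustration of what such automation might do, here is a minimal sketch that treats the archive as just another directory (it could be a mount backed by cloud storage) and uses last-access time as the ageing criterion. The paths and the one-year threshold are assumptions for the example; a real archiving tool would also keep stubs or an index so that users can retrieve files on their own.

```python
# Minimal sketch: move files that have not been accessed for over a year
# from the production share into an archive area, preserving the directory
# layout. Paths and the threshold are assumptions for this example.
import shutil
import time
from pathlib import Path

SHARE = Path("/srv/fileshare")     # assumed production file share
ARCHIVE = Path("/mnt/archive")     # assumed archive target (e.g. a cloud-backed mount)
MAX_AGE_SECONDS = 365 * 24 * 3600  # "not accessed for over a year"


def archive_stale_files() -> None:
    cutoff = time.time() - MAX_AGE_SECONDS
    for path in SHARE.rglob("*"):
        if not path.is_file():
            continue
        # st_atime is the last access time; some volumes disable access-time
        # updates, in which case last-modified time (st_mtime) is the safer criterion.
        if path.stat().st_atime >= cutoff:
            continue
        target = ARCHIVE / path.relative_to(SHARE)
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.move(str(path), str(target))  # frees space on the production share


if __name__ == "__main__":
    archive_stale_files()
```

In the manual process described above, files are removed from production only after a verified full backup exists; any automated equivalent should keep the same safeguard.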