CET2176C - Server+ Lecture #13 - Backup Technologies

Materials:
Tape Backup Drive
Blank Backup Tape
Tape Backup Drive Driver CD
PC running Windows 2000 Advanced Server
Objectives:
The student will become familiar with:
The data backup concepts, methods, and technologies
Each technology's functions and capabilities
The technical details of each technology
How to install and use a tape backup drive
How to perform backup operations
How to test the backup
Competency:

The student will become familiar with the concept of the data backup as the only viable method of recovery from a variety of data loss scenarios. The student will become familiar with the various backup methods and technologies, including their features, capabilities, and limitations, and will be able to select the best solutions and to install and test these devices and technologies.

Discussion
  1. Lectures 4 and 5, which covered data loss and data security, established that the one universally successful method of recovering from damaged or destroyed data is to restore a previously taken backup of the original data. Because backups are the only reliable means of accomplishing this, ALL businesses must have an effective, reliable, ongoing, and well-maintained backup system in operation. Server+ professionals will be expected to design, implement, upgrade, maintain, troubleshoot, and test backup systems as one of their primary skills.

  2. A backup, simply put, is a copy of the data. In the modern PC industry, backing up implies that the copy will be moved out of the computer to another location, where it can be used to restore the computer in the event that it fails. By this definition, backups do not imply tape backup drives or media. In many ways these devices and their media are some of the least reliable storage technologies in the entire PC industry. Nevertheless, they are the most commonly used, and the technician needs to be fully aware of their questionable reliability.

  3. While one could keep copies of any or all files on the local hard drive of a computer, they are by definition just copies, not backups. In the event that the computer fails, these copies may be no more accessible than the originals stored on the same failed computer. Backed up data must be stored outside of the computer itself in order to be considered a "true" backup from which the failed system could be restored. Backups can be stored on another computer, provided that computer is itself backed up somewhere else as well. Backups can be kept on removable internal or external hard drives or optical media. At least one copy of any and all backups should be physically removed from the site of the computer, so that even if the data loss event is a fire that completely destroys the facility, the backed up data is still safely stored somewhere else, can be restored to a replacement computer, and the business can go on.

  4. There are three basic types of backup:

    Full: Also called "Normal". A full backup, as its name suggests, takes a complete backup of all data on the computer.
    Differential: A differential backup backs up all data on the computer that has changed since the last full backup.
    Incremental: An incremental backup backs up all data on the computer that has changed since the last full or incremental backup.

  5. The difference between the differential and the incremental backups is subtle in the wording but very significant in the way they work and the way they are used in backup schemes. Once a full backup has been taken, any new files or files that have changed will be marked for backup. A differential backup will back these files up but will not change their "marked for backup" status. As new files are created and others are changed, the next differential will back up all of these files, including the ones taken in the previous differential. This will continue to accumulate until another full backup is taken, which will unmark all files and start the process again.

  6. After a full backup, an incremental backup will back up all files that are new or have been changed since the full backup, and it will unmark these files. A subsequent incremental will back up only the files that are new or changed since the last incremental backup. So each incremental is smaller, but it holds only the new or changed files since the last incremental (or since the full backup, if it was the first incremental after the full).

  7. A differential backup scheme requires two "tapes", the full backup tape and the differential backup tape. For example, on Sunday a full backup is performed and all files are backed up onto the full backup tape:

    file1.txt (backed up)
    file2.txt (backed up)
    file3.txt (backed up)

  8. On Monday file4.txt is created and file2.txt is changed:

    file1.txt (backed up)
    file2.txt (changed)
    file3.txt (backed up)
    file4.txt (newly created)

  9. A differential backup at this point will back up file2.txt and file4.txt but it will not change their status. So on Tuesday file5.txt is created and file2.txt is changed again:

    file1.txt (backed up)
    file2.txt (changed)
    file3.txt (backed up)
    file4.txt (newly created)
    file5.txt (newly created)

  10. A differential backup done Tuesday night records file2.txt, file4.txt and file5.txt. On Wednesday file6.txt is created and file1.txt is changed. The Wednesday evening differential backup will record file1.txt, file2.txt, file4.txt, file5.txt, and file6.txt.

    file1.txt (changed)
    file2.txt (changed)
    file3.txt (backed up)
    file4.txt (newly created)
    file5.txt (newly created)
    file6.txt (newly created)

  11. The same differential backup tape is used on each weekday evening since it is accumulating all changes. In the event that a restoration must be done at this point, the full backup tape would be restored and then the differential tape and all files will be up to date. The disadvantage is that the differential tape backups are growing larger and larger, and they keep backing up file4.txt which has not changed since it was created.
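
    The differential walk-through above can be reproduced with a short Python sketch. It is only an illustration: the daily_changes table and the differential_contents function are names invented for this example, and the file names come from the example week.

```python
# Changes made on each weekday after Sunday's full backup (from the example above).
daily_changes = {
    "Monday": {"file2.txt", "file4.txt"},
    "Tuesday": {"file2.txt", "file5.txt"},
    "Wednesday": {"file1.txt", "file6.txt"},
}

def differential_contents(changes_by_day):
    """Return what each evening's differential backup records.

    A differential never clears the "marked for backup" status, so each
    evening it records everything changed since the last full backup.
    """
    marked = set()  # files marked for backup since the full backup
    contents = {}
    for day, changed in changes_by_day.items():
        marked |= changed  # marks accumulate; nothing is unmarked
        contents[day] = sorted(marked)
    return contents

differential = differential_contents(daily_changes)
for day, files in differential.items():
    print(day, files)
```

    Each evening's list contains the previous evening's list plus the new changes, which is why the differential tape keeps growing until the next full backup.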

  12. Now let's follow the backup scheme using incremental backups instead. This time an incremental backup is performed on Monday night. It backs up file2.txt and file4.txt:

    Before the Incremental Backup:
    file1.txt (backed up)
    file2.txt (changed)
    file3.txt (backed up)
    file4.txt (newly created)
    After the Incremental Backup:
    file1.txt (backed up)
    file2.txt (backed up)
    file3.txt (backed up)
    file4.txt (backed up)

  13. On Tuesday, file2.txt is changed again, and file5.txt is created. The Tuesday night incremental backup will only pick up file2.txt and file5.txt, since the previous incremental marked all the others as already backed up:

    Before the Incremental Backup:
    file1.txt (backed up)
    file2.txt (changed)
    file3.txt (backed up)
    file4.txt (backed up)
    file5.txt (newly created)
    After the Incremental Backup:
    file1.txt (backed up)
    file2.txt (backed up)
    file3.txt (backed up)
    file4.txt (backed up)
    file5.txt (backed up)

  14. On Wednesday, file1.txt is changed and file6.txt is created. The Wednesday evening incremental backup will only take file1.txt and file6.txt:

    Before the Incremental Backup:
    file1.txt (changed)
    file2.txt (backed up)
    file3.txt (backed up)
    file4.txt (backed up)
    file5.txt (backed up)
    file6.txt (newly created)
    After the Incremental Backup:
    file1.txt (backed up)
    file2.txt (backed up)
    file3.txt (backed up)
    file4.txt (backed up)
    file5.txt (backed up)
    file6.txt (backed up)

  15. To restore, the full backup tape is restored first, then the Monday evening tape, then the Tuesday evening tape, then the Wednesday evening tape. This must be done completely and in this order or the restoration will fail. The advantage of the incremental backup scheme is that each incremental backs up far less data than the progressively accumulating differential does. The disadvantage is that it requires more individual physical tapes and therefore offers more opportunity for one of them to go bad. However, the logical assumption is that backups must always be done regardless of the circumstances, while restorations are done only if and when the need arises. Since incrementals back up less data and therefore take less time to perform, the business will always prefer to pay for the fewer man-hours spent taking incrementals every day rather than pay longer hours for the technician to "babysit" the system during a differential backup on a Friday night. That the restoration process of the incremental scheme takes much longer is largely irrelevant: restorations happen rarely, and the system is already down, so on the rare occasion that it is down, does it matter that it is down a little longer?
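
    The importance of restore order can be seen in a small Python model. This is a sketch under stated assumptions: each tape is modeled as a dictionary of file name to stored version, the version labels ("sun", "mon", and so on) are invented for illustration, and a real restore utility may refuse an out-of-order tape rather than silently restoring stale data as this model does.

```python
# Each tape maps file name -> the version stored on that tape.
# Version labels ("sun", "mon", ...) are illustrative only.
full_tape = {"file1.txt": "sun", "file2.txt": "sun", "file3.txt": "sun"}
monday = {"file2.txt": "mon", "file4.txt": "mon"}
tuesday = {"file2.txt": "tue", "file5.txt": "tue"}
wednesday = {"file1.txt": "wed", "file6.txt": "wed"}

def restore(full, incrementals):
    """Rebuild the disk from the full tape, then each incremental in the order given."""
    disk = dict(full)
    for tape in incrementals:
        disk.update(tape)  # a later tape overwrites older versions
    return disk

correct = restore(full_tape, [monday, tuesday, wednesday])
wrong_order = restore(full_tape, [tuesday, monday, wednesday])
missing_tape = restore(full_tape, [monday, wednesday])

print(correct["file2.txt"])         # tue (latest version)
print(wrong_order["file2.txt"])     # mon (stale version wins)
print("file5.txt" in missing_tape)  # False (file lost with the Tuesday tape)
```

    In this model a skipped or bad tape loses every file that exists only on that tape, which is the extra risk the incremental scheme carries.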

  16. A common backup scheme based on the incremental backup is called GFS, the Grandfather, Father, Son backup scheme. This method can be employed on a monthly, yearly, or even multiyear basis. Here the GFS method will be used in a monthly backup strategy to illustrate it.

  17. In this backup scheme three sets of tapes are kept. The grandfather set consists of twelve tapes labeled "January" to "December". The father set consists of five tapes labeled "1st Friday" to "5th Friday" and the son set consists of four tapes labeled "Monday" to "Thursday".

  18. Let's say it is Sunday, January 31st. A full backup is performed and stored on the "January" tape, which is put away until next January. The next day is Monday, and an incremental backup is performed on the "Monday" tape. This continues each day until Friday. On Friday a full backup is performed on the "1st Friday" tape. On the following Monday the daily tapes will be reused through the week. On Friday the "2nd Friday" tape will be used.

  19. This continues until Sunday, February 28th. A full backup is taken on the "February" tape and the daily tapes start up through the week as usual. On Friday, the "1st Friday" tape from a month ago will be reused. The system continues rotating along like this, and the following year the monthly tapes will get reused at the end of each month. In the event of a disaster in the middle of June on a Wednesday, the GFS tape set has: a full backup taken on May 31st, a full backup labeled "1st Friday" taken on June 4th, a full backup labeled "2nd Friday" taken on June 11th, and incremental backup tapes for "Monday" and "Tuesday". The administrator can use "2nd Friday" plus "Monday" and "Tuesday" and be back up and running having lost very little. In the event that "2nd Friday" is no good, "1st Friday" can be used, losing at most one week, and "Monday" and "Tuesday" can also be restored behind it, recovering some recent data. In the event that a recently deleted file from May is needed, "3rd Friday" or "4th Friday" may still have the file. GFS allows the system to be restored to the latest state possible (the end of the last business day), or to recover old files by going back a week at a time for up to 4 or 5 weeks, or a month at a time for up to 12 months, as needed.
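
    The monthly GFS rotation described above can be sketched as a function that picks the tape for a given date. Several details here are assumptions made for the example rather than part of the scheme as stated: the function name gfs_tape is invented, the month-end full backup is given priority when the last day of the month falls on a weekday, no backup runs on other weekend days, and the year 2010 is used in the sample calls only because its calendar matches the dates in the example.

```python
import calendar
from datetime import date

ORDINALS = ["1st", "2nd", "3rd", "4th", "5th"]
DAILY = ["Monday", "Tuesday", "Wednesday", "Thursday"]

def gfs_tape(day):
    """Return (tape label, backup type) for a date, or None if no backup runs."""
    last_day = calendar.monthrange(day.year, day.month)[1]
    if day.day == last_day:
        # Grandfather: month-end full backup (assumed to take priority).
        return (calendar.month_name[day.month], "full")
    weekday = day.weekday()  # Monday is 0, Friday is 4
    if weekday == 4:
        # Father: weekly full backup on the Nth Friday of the month.
        return (ORDINALS[(day.day - 1) // 7] + " Friday", "full")
    if weekday < 4:
        # Son: daily incremental, Monday through Thursday.
        return (DAILY[weekday], "incremental")
    return None  # assumed: no backup on other Saturdays and Sundays

print(gfs_tape(date(2010, 1, 31)))  # ('January', 'full')
print(gfs_tape(date(2010, 2, 1)))   # ('Monday', 'incremental')
print(gfs_tape(date(2010, 6, 11)))  # ('2nd Friday', 'full')
```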

  20. Historically the tape drive has been one of the most common devices used for backing up all data on the hard drive, primarily because it was one of the only technologies with the storage capacity to accomplish this. Even in modern times this is still the case, as hard drives swell toward a terabyte in capacity and large companies actually fill this kind of storage capacity with data. Tape drives are still used in many businesses mainly because they are already in use and working. Since the system backups hold all of the company's vital data, businesses are not eager to change a backup system that is working, employing the principle of "If it isn't broken, then don't fix it." The disadvantages of tape backup systems are:

    Unreliable: Tape media is extremely vulnerable to environmental conditions, including temperature and humidity changes that can quickly ruin it. The drive heads, like those of any magnetic tape device, are in contact with the tape and can accumulate particulate debris and lose their read/write responsiveness.
    High Maintenance: Especially because of the exposed read/write heads, tape drives must be cleaned regularly, and the tape drive and the media must be kept in a clean, dry, cool, controlled environment or they will fail quickly.
    Slow: Tape drives may have the capacity of their contemporary hard drives but they certainly do not have the same DTR (data transfer rate). To copy a full 120GB 7200RPM (15MB/sec platter-to-buffer DTR) hard drive would take roughly 2 hours. For a tape drive to store that much data could take eight hours or more.
    Technological Life Span: Many historical tape drive technologies came and went within very short periods of time, stranding users of those transient companies' products.
    Proprietary system: Many tape backup drives and even the media for them are proprietary, forcing the user to buy new media only from the manufacturer. Fortunately most of these have disappeared from the market (see Technological Life Span above).
    Generic system level drivers vs. proprietary backup software: Generic system level drivers allow the drive to be recognized by the operating system and allow any backup software to work with the drive. Proprietary software may suffer the same problems that proprietary hardware and media do (see Technological Life Span above).

  21. The only true advantage of the tape backup drive based backup system is that modern tape drives have very high capacity storage media making it possible to perform a full backup using only one or a few tapes. These tapes can then be copied and stored off-site.
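
    The transfer-time figures in the "Slow" item of the list above can be checked with a few lines of Python. This is a rough sketch: it assumes decimal units (1GB = 1000MB) and a perfectly sustained transfer rate, and the function name hours_to_copy is invented for the example. The tape rate printed at the end is not a drive specification; it is simply the rate implied by an eight-hour backup.

```python
def hours_to_copy(capacity_gb, rate_mb_per_s):
    """Hours needed to move capacity_gb at a sustained rate (assumes 1GB = 1000MB)."""
    seconds = capacity_gb * 1000 / rate_mb_per_s
    return seconds / 3600

# The 120GB hard drive at 15MB/sec from the "Slow" item.
disk_hours = hours_to_copy(120, 15)
print(round(disk_hours, 2))  # 2.22

# Sustained rate implied by an eight-hour tape backup of the same 120GB.
implied_tape_rate = 120 * 1000 / (8 * 3600)
print(round(implied_tape_rate, 2))  # 4.17
```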

  22. Tape drives are only as reliable as the tape, which is usually the poorest quality medium among the commonly used PC industry mass storage technologies, and the tapes are constantly reused to the point of failure. It is difficult to tell that a tape has failed and difficult to test it, since the only real way to do this is to perform the restore. This is a time-consuming task at best, and if the restore fails then a perfectly functioning system could be destroyed! We will, however, develop a reasonable method for testing the backups that will be created later in this exercise.

  23. Tape drives are making a comeback in the area of affordable media but more importantly in the area of media storage capacity. While CD-R/RW media costs pennies per disc, a disc can only hold 700MB. The latest 4mm DAT/DDS cartridges are available in 72GB and even higher capacities. The new tapes sell for under $30. Consider that if a server were using 72GB, then 103 CD-Rs or 16 DVD-Rs would be needed to back it up versus a single modern 72GB tape cartridge. So the cost is comparable to the CD-Rs, but the tape is reusable and one cartridge might be all that is needed to back up the entire server. The drives are more expensive than a burner, but if the server can back up to a single tape, then at least full backups would be easily stored and tracked versus stacks of optical media. When spanning multiple CD-R or DVD-R discs, the system would need an operator to stay with it for hours changing and labeling the discs. It would also cause quite a lot of trouble if one disc failed during the creation of a massive backup set.
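
    The disc counts quoted above follow from a ceiling division, sketched below. The 700MB CD-R capacity comes from the text; the 4.7GB single-layer DVD-R capacity is an assumption (the text does not state it), and decimal units (1GB = 1000MB) are assumed throughout. The function name discs_needed is invented for the example.

```python
import math

def discs_needed(data_mb, disc_mb):
    """Number of discs required; a partly filled last disc still counts as one."""
    return math.ceil(data_mb / disc_mb)

SERVER_MB = 72 * 1000  # the 72GB server from the example

cd_count = discs_needed(SERVER_MB, 700)    # 700MB per CD-R
dvd_count = discs_needed(SERVER_MB, 4700)  # assumed 4.7GB per DVD-R

print(cd_count)   # 103
print(dvd_count)  # 16
```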

  24. Files are "marked for backup" by the file system itself. Whenever a new file is created on the partition (or copied from some other location onto the partition), it is marked for backup by the file system drivers as they write the data onto the clusters and create the new directory entry (or MFT entry in NTFS). When a previously backed up file is modified, even if only a single byte changes, writing it back to the clusters is sufficient for the same file system drivers to return to the directory entry and mark the file for backup again. This ensures that all new or modified files will be properly marked; the procedure does not depend on some other application to monitor all of the files or to periodically check for new or modified files. The system employed to mark files for backup is the file attributes, which are supported by the FAT and NTFS file systems used by Windows. The file attribute used is the archive attribute. If the attribute is set, then the file has been marked by the file system drivers as new or modified since the last full or incremental backup. A full backup will back up all files regardless of the status of their archive attribute, and it will clear the attribute on any file that has it set. An incremental backup will back up only the files with the archive attribute set and will clear the attribute. A differential backup will back up all files with the archive attribute set and will not clear it.

  25. There is another type of backup that backs up all files but it does not remove the archive attribute from any file and is therefore not technically a full backup which does remove the archive attribute from any file that has it set. Such a backup is sometimes referred to as a "snapshot". Using hard drive imaging software to backup the entire hard drive to an image file that can later be restored in full even if the MBR and/or VBR sectors have been destroyed is usually a "snapshot" in that the objective is to preserve all data and to alter nothing. Some such programs are even used in computer forensics which must not alter a single bit of data that is being collected for analysis. Norton GHOST is a drive imaging tool that takes and replaces snapshots of the hard drive, although it is not computer forensics grade software.
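
    The way each of the four backup types treats the archive attribute can be summarized in a small Python model. It is only a sketch: the disk is modeled as a dictionary of file name to archive bit, nothing is read from or written to a real file system, and the function name run_backup is invented for the example.

```python
KINDS = ("full", "incremental", "differential", "snapshot")

def run_backup(files, kind):
    """Simulate one backup pass over files (name -> archive bit, True if set).

    Returns the sorted list of file names copied to the backup media and
    updates the archive bits in place the way that backup type would.
    """
    if kind not in KINDS:
        raise ValueError("unknown backup type: " + kind)
    copies_everything = kind in ("full", "snapshot")
    clears_attribute = kind in ("full", "incremental")
    selected = sorted(name for name, archive in files.items()
                      if copies_everything or archive)
    if clears_attribute:
        for name in selected:
            files[name] = False
    return selected

disk = {"file1.txt": True, "file2.txt": True, "file3.txt": True}
run_backup(disk, "full")          # copies all three, clears every bit
disk["file2.txt"] = True          # file modified: the file system sets the bit
disk["file4.txt"] = True          # new file: created with the bit set

first_diff = run_backup(disk, "differential")   # file2, file4; bits stay set
second_diff = run_backup(disk, "differential")  # same two files again
snapshot = run_backup(disk, "snapshot")         # all four files; bits untouched
incremental = run_backup(disk, "incremental")   # file2, file4; bits cleared
next_incremental = run_backup(disk, "incremental")  # nothing left to copy

print(first_diff, second_diff, snapshot, incremental, next_incremental)
```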

  26. Windows Explorer view settings can be modified to display the file attributes. Open "My Computer" > "Local Disk (C:)" > View (Main Menu) > Details. Right-click on the "Name" column header and select "More..." In the resulting window, uncheck "Type" and check "Attributes", "Owner", "Created", and "Accessed". Select each item and use the Move Up and Move Down buttons to organize the detail columns (items listed top to bottom appear left to right in the Windows Explorer details view of any open folder), then click the OK button:

    Choose the meaningful information to be displayed,
    organize the columns, then click the OK button

  27. With the C: drive still open, click Tools (Main Menu) > Folder Options > General (tab) > "Use Windows Classic Desktop" (radio button) and "Use Windows Classic Folders" (radio button). Then click the View tab > check "Display compressed ...", check both "Display the full path..." options, select "Show hidden files and folders" (radio button), uncheck "Hide extensions of known file types", uncheck "Hide protected operating system files", and uncheck "Remember each folder's view settings". Now click the "Apply" button. These Explorer settings will be applied permanently to all folders.

    Making the needed changes in the Folder Options View tab properties sheet

    After applying the Windows explorer view settings, all open folders
    will display more meaningful information at a glance

  28. DAT mentioned earlier is a storage standard developed by the music industry and means Digital Audio Tape. DDS is the current DAT standard modified for use on computer tape drives and means Digital Data Storage. This is also an industry standard meaning that in theory a DDS tape created by one drive should be readable by another drive as long as the drive has the same data storage capacity and the same software is used since the software will usually impose its own form of data compression on the stored data on the tapes. Another older tape drive technology was the QIC meaning Quarter Inch Cartridge. Here are the major QIC standards (many companies made their own proprietary variations which of course only work with their own drives and tape media):

    Standard   Tracks   Capacity   Cartridge
    QIC-02     9        <60MB      DC-3000
    QIC-24     9        60MB       DC-6000
    QIC-40     20       40MB       DC-2000
    QIC-80     32       80MB       DC-2080
    QIC-100    12/24    100MB      DC-2000
    QIC-150    18       250MB      DC-6000
    QIC-1000   30       1.0GB      DC-6000
    QIC-1350   30       1.35GB     DC-6000
    QIC-2100   30       2.1GB      DC-6000

  29. Travan was a technology developed at 3M (now Imation) that had a higher capacity than the QIC technologies and was also to a certain extent backwards compatible with them. Here are the Travan standards:

    Standard             Capacity (raw)   Capacity (compressed)   Compatibility
    Travan-1             400MB            800MB                   Read-Only: QIC-40, QIC-80
    Travan-3             1.6GB            3.2GB                   TR-1, QIC-80; Read-Only: TR-2
    Travan-4             4GB              8GB                     QIC-3080; Read-Only: QIC-80, TR-1, TR-3
    Travan NS8           4GB              8GB                     QIC-3080, TR-4
    Travan NS20 (TR-5)   10GB             20GB                    TR-4
    Travan 40 (TR-7)     20GB             40GB                    Travan NS20

  30. The DAT/DDS technology was originally developed in the music industry for creating tapes that store music digitally, minimizing the degradation in sound quality over time. DAT uses 4mm tape cartridges. With the proper drivers, the same tape read/write drive can store computer data on the tapes as well. This adaptation requires a new file organization scheme, or file system, for use on the tape. The adaptation, which is mostly software, is called DDS.

  31. DAT then is the physical drive specifications and the modification for the DAT drive to interface with the PC and store data files rather than long multi-track music recordings is referred to as the DDS technology. Here are the DAT/DDS technologies:

    Standard        Raw Capacity   Compressed Capacity   Notes
    DDS-1           2GB            4GB                   The first widely available DAT/DDS
    DDS-2           4GB            8GB                   Direct competitor with Travan NS8
    DDS-3           12GB           24GB                  Higher capacity than the Travan NS20
    DDS-4           20GB           40GB                  Competition with the Travan 40
    DAT72 (DDS-5)   36GB           72GB                  Travan falls obsolete because it cannot compete with this and other technologies

  32. AIT – Advanced Intelligent Tape, was developed independently by Sony. It uses 8mm tape, wider than the 4mm tape in DAT/DDS cartridges, and stores the data using a helical scan recording process. In the simplified picture used here, the heads record multiple file streams on the multiple parallel tracks of the tape in a rotating fashion. So if the tape has four physical tracks, part of data track one is stored on physical track #1, then on track #2, then on track #3, then on track #4, then back to track #1:


    Simplified example of how AIT helical scan lays down data tracks on the physical tape tracks

  33. AIT tapes are divided into 256 partitions that can be quickly individually located and the data from that partition can then be restored. AIT stores 35GB or 70GB at 2:1 on their standard cartridges. AIT-2 stores 50GB/100GB. Note that AIT is a proprietary Sony technology that will need specific cartridges, drives and the software and drivers that ship with the drive. Other systems may not be able to read or write to these tapes. DLT – Digital Linear Tape, a Quantum Corp. technology, lays down each data track linearly one to each physical track without the rotation used in the helical scan technique. DLT records (runs the backup) and plays back (performs the restore) much faster than DAT/DDS or AIT but does not have the level of reliability of AIT in particular. DLT stores 40GB or 80GB using the 2:1 compression on its standard tape cartridge. Note: DLT is a proprietary Quantum technology that requires specific cartridges, drives and the software and drivers that ship with the drive. Other systems may not be able to use the tapes.

  34. The Exabyte Corp. VXA-1 drives, cartridges and data storage technology is another noteworthy proprietary tape backup system. The VXA-1 stores 33GB or 66GB using the 2:1 compression technique. VXA drives are capable of recording (performing the backup) and playing back (performing the restore) at varying speeds based on the ability of the host to deliver the data stream to the drive. The drive can detect the varying speed data laid down on the tape media and automatically compensate and read it back without error. This makes these drives well suited for slower under powered end user machines and they do not have to be SCSI making them less expensive as well as reasonably reliable compared to the other high end tape backup drives listed above, yet with comparable cartridge capacities. The VXA-2 drive can store 80GB or 160GB using 2:1 compression and is currently one of the highest capacity widely used tape backup technologies.
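
    The capacity figures quoted for AIT, DLT, and VXA can be compared against a data set with a simple lookup, sketched below. The native capacities are the ones given in the text above; the table name NATIVE_GB and the function name single_tape_options are invented for the example. Note that 2:1 is the advertised compression figure, and the ratio actually achieved depends on the data being backed up.

```python
# Native (uncompressed) cartridge capacities in GB, as quoted in the text above.
NATIVE_GB = {"AIT": 35, "AIT-2": 50, "DLT": 40, "VXA-1": 33, "VXA-2": 80}

def single_tape_options(data_gb, compression_ratio=1.0):
    """List the technologies whose single cartridge can hold data_gb."""
    return sorted(name for name, native in NATIVE_GB.items()
                  if native * compression_ratio >= data_gb)

print(single_tape_options(60))        # ['VXA-2']
print(single_tape_options(60, 2.0))   # ['AIT', 'AIT-2', 'DLT', 'VXA-1', 'VXA-2']
print(single_tape_options(120, 2.0))  # ['VXA-2']
```

    Sizing a backup against the native capacity is the conservative choice, since data that is already compressed will not reach the 2:1 figure.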

Review Questions
  1. Name and briefly describe the four main types of backup discussed in this lecture:











  2. Name and briefly describe the four main tape drive technologies discussed in this lecture:











  3. A two tape backup scheme would be based on what type of backup? Describe what such a backup does:





  4. Name and describe a commonly used multiple tape backup scheme. What type of backup is this scheme based on?











  5. List and discuss the disadvantages of tape drives.














  6. List and discuss the advantages of tape drives. Do the advantages necessarily outweigh the disadvantages? Explain.











  7. Drive imaging software takes what type of backup? Extremely high quality versions of this type of software can be used in what profession?





Copyright©2000-2008 Brian Robinson ALL RIGHTS RESERVED