The student will become familiar with:
The various forms of data loss,
Each form's primary method of prevention and recovery,
Hardware and Software level solutions to data loss,
Disaster prevention methodology,
Disaster Recovery methodology.
The student will become familiar with the various forms of data loss, including the most common forms, their causes, methods of prevention, and ultimately methods of recovering from them. The student will also become familiar with the methodology of planning for disasters, recovering from them, and implementing a disaster prevention and recovery strategy.
Since the introduction of the first PCs, data loss and the fight against it have been a primary concern. Interestingly enough, disaster prevention and recovery techniques and technologies have not advanced at anything like the pace of the PC industry's performance and capacity. In fact, the same things cause data loss in the 21st century that caused data loss at the dawn of the PC era, and the same basic techniques are used to prevent this data loss and to recover from it.
Data loss in computer systems can be categorized in the following way:
Accidental: The data was lost due to unforeseen events with no malicious intent.
Malicious intent: The data was lost due to specifically planned, malicious intent.
Destruction: The data was lost by virtue of being damaged or destroyed beyond recovery.
Theft: The data was lost by virtue of being stolen and is often left intact.
It will be seen that some forms of data loss are obviously accidental or malicious, destructive or acts of theft. But some forms of data loss are not so easily categorized. For example, is data loss due to the destruction caused by a virus to be classified as accidental or as malicious intent? Someone definitely sat down and spent the time to create the virus for the purpose of causing all of the trouble that it causes, but did they direct it at you specifically, or at anyone unlucky enough to run into it?
It should be noted that such "random acts of digital violence" are to be considered accidental in that there was no malicious intent aimed at the user in particular. Contrast this with the actions of spyware, which can gather enough information about a user that the spyware author can steal directly from them. While still anonymous, this is personally directed malicious intent that can cause the person or organization real monetary loss and provide the malicious code writer with real monetary gain.
If someone fires a gun into the air on July 4th, and the bullet falls to Earth killing someone, they will not be charged with murder, but rather with manslaughter. By the same measure, virus writers are engaging in "digital manslaughter," so to speak. But if someone puts a gun to someone's head, demands all of their money, and then leaves, they have stolen directly from that person in a personal attack. The only difference with spyware leading to identity theft is that the criminal and the victim never met face-to-face; the crime, however, is exactly the same and is certainly to be classified as malicious intent and not as an accident.
The main causes of data loss are categorized in the following table and are listed with the most common cause at the top, and the least common cause at the bottom of the list:
Causes of Data Loss on Computer Systems
|Form of Data Loss||Primary Method of Prevention||Primary Method of Recovery|
|Human Error||Training||Restoration from Backups|
|Software Errors||Maintaining software updates||Restoration from Backups|
|Malicious Code||Running active monitoring software or "shields"||Restoration from Backups|
|Malicious Users||Running active defense software and enacting effective security policy||Restoration from Backups|
|Hardware Failure||Fault Tolerance/Redundancy and||Restoration from Backups|
|Non-Computer Related Disasters||Fault Tolerant and||Restoration from Backups|
It should be abundantly clear then that the only practical way to recover from any form of data loss is to restore the data from backups. What may not be evident is that, given time, all computer systems will suffer a disaster in the form of data loss, and that no system can be adequately recovered without adequate backups to recover from. It should also be understood that under normal operating circumstances it is impossible to take backups continuously. At best only recent copies of the data exist, so no matter how diligent the backup strategy is, when a data loss disaster occurs some of the lost data will not be recovered.
To be clear then: the only absolutely effective method of recovering from a catastrophic data loss event is to restore the data from backups. It is therefore critical that important systems, and servers are the important systems being referred to here, have an effective and diligent backup scheme in place, which may involve both their hardware and their software design. It is equally important to remember that no backup scheme, however effective and diligent, can be 100% perfect. When the data loss disaster does occur, some portion of the lost data will not be retrievable from the backups, primarily because it had not been backed up yet.
Still, any organization would rather have 99.9% of their data recovered than none at all. To this end it is the responsibility of the Server+ professional to plan a backup and restore strategy for the server, implement it and provide methods for testing the backups and for implementing an effective restoration when it is needed.
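One simple way to test a backup, as called for above, is to compare a checksum of every source file against its copy in the backup. The following is only a minimal sketch of that idea and not a complete backup product: the directory layout, the file names, and the helper names `sha256_of`, `verify_backup` and `demo` are assumptions made for illustration, and the "backup" here is just a directory copy.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_backup(source_dir: Path, backup_dir: Path) -> list[str]:
    """Return relative paths that are missing from, or differ in, the backup."""
    problems = []
    for src in sorted(source_dir.rglob("*")):
        if not src.is_file():
            continue
        rel = src.relative_to(source_dir)
        dst = backup_dir / rel
        if not dst.is_file() or sha256_of(src) != sha256_of(dst):
            problems.append(rel.as_posix())
    return problems


def demo() -> tuple[list[str], list[str]]:
    """Back up a tiny hypothetical data set, then damage one backup file."""
    with tempfile.TemporaryDirectory() as tmp:
        source = Path(tmp) / "data"
        backup = Path(tmp) / "backup"
        source.mkdir()
        (source / "customers.txt").write_text("alice\nbob\n")
        (source / "orders.txt").write_text("1001\n1002\n")
        shutil.copytree(source, backup)          # the "backup" step
        clean = verify_backup(source, backup)    # nothing should be reported
        (backup / "orders.txt").write_text("corrupted")
        damaged = verify_backup(source, backup)  # the damaged file is reported
        return clean, damaged
```

A check like this only proves that the backup matched the source at the moment it was run; it does not replace a periodic trial restoration.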
Before we dig deeper into the backup hardware and software technologies that are available, let's examine each form of data loss and the efficacy of the techniques for preventing it from striking the system. As strange as it sounds, nearly 30 years after the introduction of the first personal computers, the number one cause of data loss in computer systems in the 21st century is still human error, which accounts for nearly a third of all data loss events according to industry experts.[1]
Human error is also still the most difficult form of data loss to prevent. The only effective method by which any organization can attempt to prevent data loss due to human error is to implement employee training programs. These are time consuming, which translates into money lost while employees are being paid but are not performing the duties for which they were hired. They are also expensive in payment to the technical trainers themselves. Despite the cost, many large organizations invest heavily in employee training, having learned that it is cost effective to develop employees who are less likely to lose data through inexperience with their systems. But the cost effectiveness is never perfect: some employees will benefit greatly from the training, some marginally, and some not at all, possibly because they do not have enough basic knowledge of the Information Technology systems they are being trained to use.
Because human error is the most common cause of data loss and because it is the most expensive and difficult form of data loss to prevent, and because the only practical method of recovering from data loss due to human error is to perform a restoration of data from the backups, it is understood that any IT network must have in place an effective backup and restore strategy.
Most experts agree that the second most common cause of data loss is software errors or "bugs" in the software. Modern users have seen messages like this:
The principal cause of such errors is a transient software incompatibility, also known as a "glitch." Because Windows is a multitasking operating system capable of running many independent processes of its own as well as many independent programs and their multitude of independent child processes, almost any given moment on any given computer system has a different collection of processes running in different locations in RAM, and this can lead to a unique situation that causes the malfunction. Because the CPU and the OS operate in protected mode, the operating system can usually maintain control despite the erratic program behavior, display the message, terminate the program, and clear the RAM it was occupying. However, if that program was the one the user was actually using at the time, then all unsaved data that program was holding is lost, suddenly and permanently. This usual form of data loss from software malfunctions is also the most likely to be irretrievable from backups: the loss is immediate, and there was no time or opportunity for the data to be backed up.
Malicious code sometimes rises to become the second leading cause of data loss, depending on which authority is reporting the statistics. Hard drive data recovery specialists will skew the numbers to indicate that hardware failure ranks higher, while anti-virus authorities will report that malicious code ranks higher. Normally, malicious code and software bugs trade places as the second and third most common causes of data loss, depending on where the current operating system is in its functional lifespan. Just prior to the release of Windows Vista, the current OS was Windows XP, which was about 5 years old. Having been in use for that long, most of its bugs had already been discovered and fixed, and the vast majority of third party hardware vendors had already learned how to develop stable device drivers for it. At that time, software bugs were at a five year low because the operating system was at the "seasoned" stage of its functional life cycle. Data loss incidents due to malicious code remain relatively constant over time, so they rose to become the second most common cause mainly because of Windows XP's relative stability. With the release of Vista as a new OS, its bugs have yet to be completely purged and the third party hardware vendors are still learning how to develop stable device drivers for it. At this point in time (early 2008) the incidents of data loss due to software bugs are at a cyclical high and have, for the time being, surpassed the relatively constant number of malicious code incidents. As Vista ages, the bugs will be found and fixed, the hardware drivers will increase in number and stability, and the ranking of data loss incidents caused by software bugs will go back down.
Malicious code falls into several categories including:
Philology of Malicious Code
Viruses and their kin (and associated terminology)
Spyware and their kin:
Viruses are designed to destroy data, and in the end the business could not care less about the computers themselves; they are a trivial cost compared to the customer database, which to any company is priceless and irreplaceable. As a result, since viruses are designed specifically to destroy data and the data is the TOP priority on any business network, the occurrence of a single piece of malicious code, no matter how innocuous it may seem, must be treated with the utmost caution and severity; it could have the potential to destroy the entire database and cause irrevocable damage to the business.
The only effective defense against viruses and their kin is to install and maintain an anti-virus software suite. In the event that a virus does penetrate the system, the only effective resolution of this data loss event is the restoration of the system from backups. However, viruses introduce a new twist to this disaster recovery strategy: their warheads generally have event triggers, meaning that they may hide undetected in infected systems for days or even months. In this situation, recent backups quite often contain infected files, and restoring from these backups also restores the virus. To further compound the problem, backup media may not be in a format that allows it to be scanned, so there may be no way of knowing which backups are clean and which are tainted until the restorations are performed, which in turn releases the virus into the system again.
As dangerous as viruses are, spyware can present an even greater danger to businesses. Spyware is malware specifically designed to gather, or more accurately stated, steal information from the business. This can include customer lists, which can then be used directly as a direct mail advertising target. Worse, the list could contain private information such as the users' credit card numbers and addresses, all sufficient to engage in identity theft on a massive scale, wrecking the customers financially and thoroughly destroying their trust in the company that lost their data. Such incidents have put companies out of business and should be considered just as severe and potentially devastating as any virus.
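Where a backup is stored in a format whose files can be read, it can be checked before restoration. The sketch below illustrates the idea with a deliberately simplified "signature list": a set of SHA-256 hashes of known bad files. Real anti-virus products use far richer signatures and heuristics, and the file names, the stand-in payload, and the function names here are all hypothetical.

```python
import hashlib
import tempfile
from pathlib import Path


def file_hash(path: Path) -> str:
    """Return the SHA-256 hex digest of a (small) file."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def scan_backup(backup_dir: Path, bad_hashes: set[str]) -> list[str]:
    """Return relative paths of backup files whose hash is on the bad list."""
    flagged = []
    for path in sorted(backup_dir.rglob("*")):
        if path.is_file() and file_hash(path) in bad_hashes:
            flagged.append(path.relative_to(backup_dir).as_posix())
    return flagged


def demo() -> list[str]:
    """Scan a tiny hypothetical backup that contains one 'infected' file."""
    payload = b"harmless stand-in for an infected file"
    signatures = {hashlib.sha256(payload).hexdigest()}  # simplified signature list
    with tempfile.TemporaryDirectory() as tmp:
        backup = Path(tmp)
        (backup / "report.doc").write_bytes(b"quarterly figures")
        (backup / "game.exe").write_bytes(payload)
        return scan_backup(backup, signatures)
```

A backup set that produces any flagged files should not be restored onto a production system until an older, clean set has been ruled out.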
Spyware has more unrelated forms than viruses have and as such can be more difficult to combat, but the defense begins with the installation of an anti-spyware package that should include both a real-time shield as well as regularly scheduled scans.
Malicious users are the fourth form of data loss. This basically includes intruders to the system. While collectively known today as hackers, this is not an accurate term. The true definition of a hacker is anyone sincerely interested in learning about technical systems, but not necessarily interested in causing any form of damage which includes intrusion and access to private information. Those who choose to intrude and damage or access private information are exceeding the definition of hacker and are properly referred to as crackers but even this term has changed in modern times to mean something totally different.
Malicious Users (and associated terminology)
Hacker: Develops methods of infiltrating information technology systems across the Internet access portal.
Cracker: 1) Develops methods of cracking commercial software so that it will install and run without the original product key. 2) Develops and/or employs decryption techniques to encrypted "cipher text" in an attempt to expose the original "plain text" data and thus gain access to sensitive information.
Pirate: Shares commercial software with others either for free or worse, charges for it.
Script kiddie: Finds hackers' programs and techniques, then downloads and uses them with very little knowledge of his own about what they are or how they work.
Social engineer: Works "face-to-face" or chat window-to-chat window (etc.) to convince the user that he is their friend and to share sensitive information such as username/password combinations etc.
Ethical hacker: Hired by the company to test their intrusion defense and detection. Also known as white hats.
Hacktivist: Hackers who believe that they are the only people saving the world, essentially the Internet, from big corporation and government control.
Disgruntled/disloyal employee: Essentially an insider intruder. Being behind the majority of any network's intrusion defenses and detection systems and already possessing some rights on the system, they have already defeated several layers of security without doing anything; gaining more access within the system will be easier than trying from scratch from the outside.
White Hat: Hackers dedicated to defending the computer world from harmful hackers.
Black Hat: Harmful hackers/intruders
Blue Hat: Hired by the company to test their intrusion defense and detection by doing nothing more than actually attacking it.
Grey Hat: Ambivalent concerning their ethical position concerning hacking.
Firewall: System designed to thwart external intrusion techniques.
Security policy: Security policies include general security policies of the entire site and access control to systems and facilities as well as the network and server security policies within the system and should be comprehensive and thorough and subject to review, testing, and maintenance through the employment of ethical hackers and regular security audits.
Encryption: Conversion of data from "plain text" to "cipher text" by applying encryption keys to the plain text, converting it into encrypted cipher text which is then transmitted from one system to another. This prevents the interception of sensitive data while in transit.
Authentication: Verification of the user generally done by matching the username and the password. There are however, much stronger authentication techniques than this which can be employed by the network logical security policy.
Key generator: Program that scans commercial software to gain clues on its product key detection code in order to create a fake product key that will activate it.
Brute force cracker: Program that attempts to decrypt encrypted "cipher text" by applying every possible encryption key to it, until it finds the correct key that exposes the original "plain text" information.
Dictionary cracker: Program that attempts to decrypt encrypted "cipher text" by applying a list of common possible encryption keys to it taken from a dictionary file, until it finds the correct key that exposes the original "plain text" information.
Event log: A log in which the system records various events, including failed attempts by users to log in. It can then be checked for possible attempts by intruders to gain access to the system.
Security Audit: Routine operation in which the security mechanisms in place are checked and tested to ensure that the system is secure. Includes such activities as logging on as a particular user and then attempting to access services and shares which that user account should not have the rights to. If the user account can engage in such activities or access restricted services or shares, then the administrator must change the account or the underlying global system security policy so that such rights are no longer available.
Honey Pot: A computer purposely left exposed to attract the attention of intruders and to run system monitor software to send alerts when it is breached. Honey pots are also used as decoys to keep intruders from finding and damaging critical systems.
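The event log entry above can be illustrated with a short sketch that counts failed logins per account and flags any account that reaches a threshold. The log format, the sample entries, the threshold of three, and the function names are all invented for this example; a real operating system's event log has its own format and tools.

```python
from collections import Counter

# Hypothetical log format invented for this sketch:
#   "<timestamp> <event> user=<name>"
SAMPLE_LOG = """\
2008-01-14T09:00:01 LOGIN_FAILED user=jsmith
2008-01-14T09:00:05 LOGIN_FAILED user=jsmith
2008-01-14T09:00:09 LOGIN_FAILED user=jsmith
2008-01-14T09:01:30 LOGIN_OK user=mjones
2008-01-14T09:02:10 LOGIN_FAILED user=mjones
"""


def failed_logins(log_text: str) -> Counter:
    """Count LOGIN_FAILED events per user name."""
    counts = Counter()
    for line in log_text.splitlines():
        parts = line.split()
        if (len(parts) == 3 and parts[1] == "LOGIN_FAILED"
                and parts[2].startswith("user=")):
            counts[parts[2][len("user="):]] += 1
    return counts


def suspicious_accounts(log_text: str, threshold: int = 3) -> list[str]:
    """Return users with at least `threshold` failed logins, sorted by name."""
    counts = failed_logins(log_text)
    return sorted(user for user, n in counts.items() if n >= threshold)
```

A review of this kind would normally form part of a routine security audit rather than a one-off check.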
The problem with data loss due to malicious user activities is that the damage they cause may not necessarily be in the form of the destruction of the data, but may instead be in the form of the theft of the data. In this case, there is no clearly defined concise solution and in fact there may be no solution at all to the theft of sensitive information. Because this is the one form of data loss for which an effective recovery may not even be possible, this is the one form of data loss prevention that is the most important consideration of any organization. Because there are many different types of malicious user many employing various intrusion tools and techniques, there is no definitive method of defense other than constant vigilance. The defense of the network from intruders however begins with the firewall and a strong security policy which includes strong authentication techniques, encryption, and security audits as well as strong event logging and monitoring.
Hardware failure falls into one of the smallest percentages of the causes of data loss in organizations. It also has some of the best methods of prevention, and there is most likely a direct correlation between these two facts: the fact that data loss due to hardware failure has well developed prevention technologies may be why it has been reduced to such a low incidence level. The main hardware related failures that will result in data loss are:
Hard drive failure: The hard drive has moving parts, so it is not a question of "if it will fail" but a question of "when will it fail". Hard drive failures can be as limited as the loss of integrity of a single sector or as massive as a total failure of the drive, resulting in the loss of all data stored on it. Hard drives can suffer instant total failure, or gradual degradation over a period of months. Hard drive failures and their severity are impossible to predict.
Cooling fan failure: The system's cooling fans have moving parts, and just as with hard drives it is not a question of "if it will fail" but a question of "when will it fail". Cooling fans can also suffer instant total failure or a gradual degradation of performance over a protracted period and, like hard drive failures, such failures are impossible to predict.
Poor environmental conditions: All components within modern computers are extremely high speed electronic technologies that are, because of this, extremely sensitive to their environment and especially sensitive to ESD damage. No component within an enterprise level system, be it the server, the network interconnectivity devices or the workstations, should ever be subjected to improper conditions, which include high humidity, high temperature (insufficient ventilation), and high dust/dirt concentrations. Computer equipment must be kept in a dry, cool, clean environment to ensure that the components will live beyond their own functional obsolescence date. ESD vulnerability and poor environmental conditions are the principal cause of solid state device failures in the modern PC.
ESD mishandling failure: All components within modern computers are extremely high speed electronic technologies that are, because of this, extremely sensitive to their environment and especially sensitive to ESD damage. No component within an enterprise level system, be it the server, the network interconnectivity devices or the workstations, should ever be improperly handled. ESD damage can cause instant total failure, or latent damage which is completely undetectable and can cause instant total failure months after the component was handled improperly. ESD vulnerability and poor environmental conditions are the principal cause of solid state device failures in the modern PC.
The main methods of preventing data loss due to hardware failure begin with the maintenance of a suitable server environment and the proper handling of all equipment. These two simple steps go a long way toward preventing component failure and extending the useful life expectancy and reliability of all equipment. Furthermore, hard drives should definitely be set up as fault tolerant RAID arrays, which dramatically reduce the chance that the failure of a single drive will result in data loss. The failure of a single hard drive within the RAID array should result in nothing more than the inconvenience of having to replace it. The same is true for the installation of multiple cooling fans, such that the component will not fail if any single fan fails.
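To see why a fault tolerant array can survive the loss of one drive, consider parity as used in RAID levels such as RAID-5: the parity block is the byte-by-byte XOR of the data blocks, so any single missing block can be recomputed from the survivors. The text above does not specify a RAID level, so the following is only a toy sketch of the parity principle; the three four-byte "drives" and the function names are made up, and a real controller also handles striping, parity rotation, and rebuilds.

```python
from functools import reduce


def xor_blocks(blocks: list[bytes]) -> bytes:
    """XOR equal-length blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild_missing(surviving: list[bytes]) -> bytes:
    """Recompute the one missing block from all surviving blocks plus parity."""
    return xor_blocks(surviving)


# Three hypothetical data "drives", each holding one four-byte block.
data = [b"ACCT", b"0042", b"PAID"]
parity = xor_blocks(data)  # stored on a fourth drive

# Simulate the failure of the second drive and rebuild its contents.
rebuilt = rebuild_missing([data[0], data[2], parity])
```

Note that parity protects against a failed drive only. It does nothing for files deleted by mistake or destroyed by malicious code, which is why RAID is never a substitute for backups.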
All modern hard drives since the ATA-3 specification are equipped with S.M.A.R.T. (Self-Monitoring, Analysis and Reporting Technology), so if the drive detects a problem within itself (such as a transient deviation in RPM), it can report it. Unfortunately, most systems either have the feature disabled by default or offer only limited support through the BIOS POST routine. In the latter case hard drives can only report their imminent failure during the POST, which implies a reboot, and high availability servers do not reboot very often. (That is, a high availability server should never go down by design, which means it should never reboot.) The hard drive manufacturers do make utilities capable of actively monitoring their products while they are up and running, and these solutions should be implemented on the server whose hard drives hold the vital data of the organization.
Another form of hardware level failure is the loss of power to the computer system. This may be due to the failure of the internal power supply itself or, worse, to the loss of electricity from the utility provider. In order to minimize damage due to power loss, server systems in particular should be equipped with redundant power supplies and, at the very least, a surge protector, a line conditioner, and a battery backup UPS. An on-line battery backup UPS will likely provide all of the line conditioning required, so the two roles can be combined into a single device. For mission critical systems, the organization will have to invest in a backup AC power generator in order to remain operational at full or partial capacity during extended blackouts. While the installation of such a system is beyond the scope of the Server+ professional, working with plant maintenance to report the total power requirements of the electronic systems is part of the Server+ technician's role, as is coordinating the installation of local powerline appliances, the testing of the local server room devices, and the testing of the generator.
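Reporting the total power requirement usually starts with simple arithmetic: add up the wattage of each device, convert watts to volt-amperes (VA = watts / power factor), and allow some headroom. The sketch below shows that calculation only; the device list, the wattages, the power factor of 0.7, and the 25% headroom are illustrative assumptions and not figures from any manufacturer, and actual sizing should follow the UPS vendor's guidance.

```python
import math


def required_va(loads_watts: dict[str, float],
                power_factor: float = 0.7,
                headroom: float = 0.25) -> int:
    """Estimate the UPS rating in VA for a set of loads given in watts.

    power_factor and headroom are assumed values chosen for illustration.
    """
    total_watts = sum(loads_watts.values())
    va = total_watts / power_factor * (1 + headroom)
    # Round away tiny floating point error before rounding up to a whole VA.
    return math.ceil(round(va, 6))


# Hypothetical server room loads, in watts.
loads = {"server": 350.0, "monitor": 40.0, "switch": 30.0}
estimate = required_va(loads)
```

An estimate like this says nothing about how long the batteries will carry the load; runtime has to be taken from the manufacturer's runtime charts.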
Non-computer related disasters make up the smallest percentage of the total causes of data loss in computer systems even though there are more such causes than any other. By non-computer related causes of data loss it is meant: anything other than the computer system and/or its users (whether invited or not) which causes the loss of its data. This includes theft: #1, and fire/flood/natural disaster: #2. While measures can be taken to prevent the theft of computer system equipment to a certain extent, only the most valuable, sensitive and potentially lucrative data in the possession of wealthy organizations can be adequately defended. In the end, there is no way to prevent nature from taking whatever it desires. To that end, the backups must be kept offsite so that in the event that the entire facility is lost, it can still be replaced and then the data can be restored.
List the six major causes of data loss in computer systems:
What is the only method of recovering data once it has been lost regardless of what caused it?
What is the one form of data loss from which there may be no adequate method of recovery at all?
What is the one cause of data loss that is the most difficult and expensive to prevent?
What is the one cause of data loss that has the most effective preventative techniques and technologies?
What is the one cause of data loss whose preventative technique may cause more harm than good?
Viruses would be categorized in general as accidental or malicious intent? Destructive or data theft?
A trojan horse that installs a key logger would be classified as accidental or malicious intent? Destructive or data theft?
An email that has a subject that reads "New Virus may capture your password" is an example of what spyware technique?
What is the percentage likelihood that any particular computer network will suffer from data loss during its functional lifespan?
What is the one cause of data loss for which there is no way to prevent it?
What is the one cause of data loss that has the highest likelihood that it will not be recoverable from the backups? Explain why.
Define: virus -
Define: Warhead -
Define: vector -
List and describe the high potential physical vectors:
List and describe the high potential logical vectors:
Define: malicious virus -
Define: trojan horse -
Define: worm (both classic and modern definitions) -
Define: resident and non-resident virus -
Define: prolific and non-prolific virus -
Define: encrypted virus -
Define: polymorphic virus -
Define: metamorphic virus -
Define: stealth virus -
Define: KGCC system or boot disk -
Define: signature list -
Define: heuristic analysis -
Define: anti-virus shield -
Which virus takes countermeasures directly against anti-virus software?
What is the only successful method of detecting a metamorphic virus?
Define: adware -
Define: spyware -
Define: malware -
Define: phishing -
Define: spamming -
Define: "drive-by downloading" -
Define: tracking -
Define: stealware -
Define: keylogger -
Define: rootkit -
Define: backdoor -
What is the only satisfactory method of preventing data loss due to malware?
What is the only satisfactory method of preventing data loss caused by viruses?
Define: vulnerability -
A program that was written to take advantage of a software vulnerability is called an:
Define: malicious user
Define: hacker (classic and modern definitions)
Define: cracker (both definitions)
Define: script kiddie
Define: social engineer
Define: ethical hacker
Define: white hat
Define: grey hat
Define: blue hat
Define: black hat
Define: key generator
Define: brute force cracker
Define: dictionary cracker
Define: event log
Define: security audit
Define: security policy
Define: honey pot
Reviewing event logs would be an example of an action undertaken in which of the above terms?
A network security breach never passed through the firewall to the public access domain. This was possibly carried out by which category of malicious user?
Malicious users often engage in data theft, what are the chances of recovery from this?
What is the best defense against network intruders?
What is the first line of defense against intruders from the public access portal?
List and describe the four primary causes of hardware related data loss:
What are the best solutions to prevent hardware related data loss caused by hard drive failure?
What is the best solution to prevent hardware related data loss caused by cooling fan failure?
What is the best solution to prevent hardware related data loss caused by poor server environment?
What is the best solution to prevent hardware related data loss caused by ESD mishandling of components?
What are the technologies available to prevent data loss caused by transient electricity flow from the utility provider?
The new guy at a company has gotten to know the user in the cubicle next to him and invites this person to check out a really cool website he has found. Unbeknownst to his coworker, he created the website, whose homepage can surreptitiously download and install a program that can capture their username and password when they log in on their workstation. What category of malicious user is he? What two types of spyware is he employing?
Go online and find:
A highly rated, free, full featured, anti-virus software that you would try.
A highly rated, free, full featured, anti-spyware software that you would try.
A highly rated, free, full featured, firewall software that you would try.
At least one identified malicious fake of each type of defensive software listed above.
Go online and find:
An inexpensive ATA/SATA RAID controller card capable of RAID-5 that you would try.
An inexpensive, highly rated surge protector that you would try.
An inexpensive highly rated battery backup UPS, minimum 750VA, that you would try.
At least one other high availability computer system device/peripheral/component that you would consider in the planning phase of a well funded commercial server project.
Copyright © 2000-2008 Brian Robinson ALL RIGHTS RESERVED