CET2176C - Server+ Lecture #4 - Data Loss, Prevention and Recovery

Materials:

Lecture Only
Objectives:
The student will become familiar with:

The various forms of data loss,

Each form's primary method of prevention and recovery,

Hardware and Software level solutions to data loss,

Disaster prevention methodology,

Disaster Recovery methodology.
Competency:
The student will become familiar with the various forms of data loss, including the most common forms, their causes, methods of prevention and ultimately methods of recovering from them. The student will be familiar with the methodology of planning for disasters and recovering from them, and of implementing a disaster prevention and recovery strategy.

Since the introduction of the first PCs, data loss and the means of combating it have been a primary concern. Interestingly enough, disaster prevention and recovery techniques and technologies have not advanced at anything like the pace of the performance and capacity of PC technologies. In fact, the same things cause data loss in the 21st century that caused data loss at the dawn of the PC era, and the same basic techniques are used to prevent this data loss and to recover from it.

Data loss in computer systems can be categorized in the following way:

	Accidental: The data was lost due to unforeseen events with no malicious intent.
	Malicious intent: The data was lost due to specifically planned, malicious intent.
	Destruction: The data was lost by virtue of being damaged or destroyed beyond recovery.
	Theft: The data was lost by virtue of being stolen and is often left intact.

Some forms of data loss are obviously accidental or malicious, destructive or acts of theft, but others are not so easily categorized. For example, should data loss caused by a virus be classified as accidental or as malicious intent? Someone definitely sat down and spent the time to create the virus in order to cause all of the trouble that it causes, but did they direct it at you specifically, or at anyone unlucky enough to run into it? For the purposes of this lecture, such "random acts of digital violence" are considered accidental in that no malicious intent was aimed at the user in particular. Contrast this with spyware, which can gather enough information about a user that the spyware author can steal directly from them. While still anonymous, this is personally directed malicious intent that can cause the person or organization real monetary loss and provide the malicious code writer with real monetary gain. By analogy, if someone fires a gun into the air on July 4th and the falling bullet kills someone, the charge is more likely to be manslaughter than murder. By the same measure, virus writers are engaging in "digital manslaughter," so to speak. But if someone puts a gun to someone's head, demands all of their money and then leaves, they have stolen directly from that person in a personal attack. The only difference with spyware leading to identity theft is that the criminal and the victim never meet face-to-face; the crime, however, is the same and is certainly to be classified as malicious intent and not an accident.

The main causes of data loss are categorized in the following table and are listed with the most common cause at the top, and the least common cause at the bottom of the list:

Causes of Data Loss on Computer Systems

Form of Data Loss	Primary Method of Prevention	Primary Method of Recovery
Human Error	Training	Restoration from Backups
Inadvertent Software Failure (bugs)	Maintaining software updates	Restoration from Backups
Malicious Code	Running active monitoring software or "shields"	Restoration from Backups
Malicious Users (Intruders/Hackers)	Running active defense software and enforcing an effective security policy	Restoration from Backups
Hardware Failure	Fault Tolerance/Redundancy and Controlled environment	Restoration from Backups
Non-computer related disaster	Fault Tolerant and Redundant systems	Restoration from Backups

It should be abundantly clear, then, that the only practical way to recover from any form of data loss is to restore the data from backups. What may not be evident is that, given time, every computer system will suffer a data loss disaster, and that it cannot be adequately recovered unless adequate backups exist. It should also be understood that under normal operating circumstances it is impossible to take backups continuously. At best only recent copies of the data exist, so no matter how diligent the backup strategy is, some of the data lost in a disaster will not be recovered.
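As a rough illustration of why some data is always at risk, the sketch below computes how much working time has no backup copy at the moment of a failure. The function name `unprotected_window` and the sample dates are hypothetical, chosen only for this example.

```python
from datetime import datetime, timedelta


def unprotected_window(last_backup: datetime, failure: datetime) -> timedelta:
    """Return the span of time whose changes exist in no backup."""
    if failure < last_backup:
        raise ValueError("failure time precedes the last backup")
    return failure - last_backup


# Hypothetical schedule: nightly backup at 23:00, failure at 16:30 the next day.
last_backup = datetime(2008, 3, 3, 23, 0)
failure = datetime(2008, 3, 4, 16, 30)

window = unprotected_window(last_backup, failure)
print(window)  # 17:30:00
```

Shortening the backup interval shrinks this window, but it never reaches zero under a scheduled backup scheme.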

To be clear, then: the only absolutely effective method of recovering from a catastrophic data loss event is to restore the data from backups. It is therefore critical that important systems (and servers are the important systems being referred to here) have an effective and diligent backup scheme, which may need to be built into both their hardware and software design. It is equally important to remember that no backup scheme, however effective and diligent, can be 100% perfect: when the data loss disaster does occur, some portion of the lost data will not be retrievable from the backups, primarily because it had not been backed up yet.

Still, any organization would rather have 99.9% of their data recovered than none at all. To this end it is the responsibility of the Server+ professional to plan a backup and restore strategy for the server, implement it and provide methods for testing the backups and for implementing an effective restoration when it is needed.
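One simple way to test a backup is to confirm that the backup copy is byte-for-byte identical to the original. The following is a minimal sketch of that idea using SHA-256 checksums; the helper names and the sample file names are assumptions made for illustration, and a real backup product will have its own verification features.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large files do not have to fit in RAM."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_matches(original: Path, backup: Path) -> bool:
    """True when the backup has exactly the same content as the original."""
    return sha256_of(original) == sha256_of(backup)


# Demonstration with throwaway files (hypothetical names).
with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "payroll.dat"
    good_copy = Path(tmp) / "payroll.bak"
    bad_copy = Path(tmp) / "payroll.damaged"

    source.write_bytes(b"employee records" * 1000)
    good_copy.write_bytes(source.read_bytes())
    bad_copy.write_bytes(source.read_bytes()[:-1])  # simulate a truncated backup

    good_ok = backup_matches(source, good_copy)
    bad_ok = backup_matches(source, bad_copy)

print(good_ok, bad_ok)  # True False
```

A checksum match only shows that the copy is intact; a full test of the strategy still requires performing a trial restoration.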

Before we dig deeper into the backup hardware and software technologies that are available, let's examine each form of data loss and the efficacy of the techniques for preventing it. As strange as it sounds, the number one cause of data loss in computer systems in the 21st century, nearly 30 years after the introduction of the first personal computers, is still human error, which accounts for nearly a third of all data loss events according to industry experts.¹

Human error is also still the most difficult form of data loss to prevent. The only effective method by which an organization can attempt to prevent data loss due to human error is to implement employee training programs. These are time consuming, which translates into money lost while employees are being paid but not performing the duties for which they were hired, and the technical trainers themselves must also be paid. Despite the cost, many large organizations invest heavily in employee training, having learned that it is cost effective to develop employees who are less likely to lose data through inexperience with their systems. But the cost effectiveness is never perfect, since some employees will benefit greatly from the training, some marginally and some not at all, possibly because they do not have enough basic knowledge of the Information Technology systems they are being trained to use.

Because human error is the most common cause of data loss and because it is the most expensive and difficult form of data loss to prevent, and because the only practical method of recovering from data loss due to human error is to perform a restoration of data from the backups, it is understood that any IT network must have in place an effective backup and restore strategy.

Most experts agree that the second most common cause of data loss is software errors or "bugs" in the software. Modern users have seen messages like this:

The principal cause of such errors is a transient software incompatibility, also known as a "glitch." Because Windows is a multitasking operating system running many independent processes of its own as well as many independent programs and their multitude of child processes, at almost any given moment a given computer system has a different collection of processes running in different locations in RAM, and this can lead to a unique situation that causes the malfunction. Because the CPU and the OS operate in protected mode, the operating system can usually maintain control despite the erratic program behavior, and is therefore able to display the message, terminate the program and reclaim the RAM it was occupying. However, if that program was the one the user was working in at the time, then all unsaved data that program was holding is lost, suddenly and permanently. This form of data loss is also the most likely to be irretrievable from backups, since it was immediate and there was no time or opportunity for the data to be backed up.

Prevention of data loss due to software malfunction is often free, in the form of update downloads from the OEM's website, with the caveat that in some cases applying an update can cause more harm than good. The basic approach is to keep the software current, starting from the beginning of the project (the planning and prototype stages). The system should have the latest version of the software installed, with the latest service pack, updates, patches and hotfixes applied. This applies to the operating system of choice, to device drivers and finally to software suites. This will ensure reliability better than any other approach. NOTE: if the system is using an older operating system, apply the latest service pack, updates, patches and hotfixes for it, but do not upgrade to the latest major version of the operating system. While the technology has improved, many device drivers may fail during a major OS version upgrade and may have no updated versions for the newer OS. Older software may suffer from similar deficiencies.

Depending on which authority is reporting the statistics, malicious code sometimes rises to become the second leading cause of data loss. Hard drive data recovery specialists tend to report numbers in which hardware failure ranks higher, while anti-virus authorities report that malicious code ranks higher. Normally, malicious code and software bugs trade the rankings of second and third most common form of data loss, depending on where the current operating system is in its functional lifespan. Just prior to the release of Windows Vista, the current OS was Windows XP, which was about 5 years old. After that much time in use, most of its bugs had been discovered and fixed, and the vast majority of third party hardware vendors had learned how to develop stable device drivers for it. At that time, software bugs were at a five year low because the operating system was at the "seasoned" stage of its functional life cycle. Data loss incidents due to malicious code remain relatively constant over time, so they rose to the second most common cause mainly because of Windows XP's relative stability. With the release of Vista as a new OS, its bugs have yet to be completely purged and the third party hardware vendors are still learning how to develop stable device drivers for it. At this point in time (early 2008) incidents of data loss due to software bugs are at a cyclical high and have, for the time being, surpassed the relatively constant number of malicious code incidents. As Vista ages, its bugs will be found and fixed, the hardware drivers will increase in number and stability, and data loss incidents caused by software bugs will fall in the rankings again.

Malicious code falls into several categories including:

Philology of Malicious Code

Viruses and their kin (and associated terminology)

Virus: basically defined in computer science as: a self-replicating piece of executable code. Any program that can copy itself from one location to another without user intervention, and set itself up so that it will execute automatically, also without user intervention, is by definition a virus. It should be noted that nowhere in the definition of the virus does it discuss doing damage. Most do damage the system but such viruses are technically called malicious viruses. Viruses generally come in two classes: fast or prolific replicators and slow or non-prolific replicators. There are advantages and disadvantages to each replication technique.
Warhead: any piece of code designed specifically to cause damage. Almost all warheads do damage to data either directly or by damaging the file system low level data structures which in turn usually causes massive and catastrophic data loss. Warheads are either immediate or event triggered. Immediate warheads cause data destruction the moment they arrive in the system and are rarely used in malicious viruses. Event triggered warheads lie dormant waiting for the trigger event, when this occurs, they proceed to damage the data.
Vector: Just like viruses in the living world, viruses in the digital world must infect hosts and have a method of delivery between hosts. In the living world, infected hosts and delivery methods are referred to as vectors. For example, the Black Plague infected Europeans in the Middle Ages through flea bites. However, a flea could deliver the pathogen in a form that was an actual threat only if it had first bitten an infected rat. The rat was the infected host; the flea that bit it picked up the deadly form of the pathogen and then carried it to a human, infecting the human with its bite. Both the rat and the flea are called vectors in this case: carriers and delivery methods of this human affliction. With computer viruses there are clean host computers, infected host computers, and vectors, which are the means of infection and of delivery of the virus (which is nothing more than data) to another clean host. Common physical delivery vectors include:
- removable media: floppy disks, flash drives, CD-R/RW and DVD±R/±RW
- remote/network connectivity devices: modems, network interface cards
Common binary vectors (structures likely to actually be infected by the virus on any given host computer) include:
- binary executable files: in DOS/Windows this means *.COM and *.EXE files and, to a lesser extent, overlays (*.OVL) and dynamic link libraries (*.DLL)
- Volume Boot Record of removable and non-removable drives
- Master Boot Record of local drives
- Interpreted executable OS "scripts" including: DOS/Windows Batch files - *.BAT, Windows Scripting Host script files - *.VBS, *.WSH, etc., shell scripts - *.SCR and *.CMD
- Interpreted executable software "scripts" including: AutoCAD LISP includes, BASIC programming language source files, embedded javascript source (a.k.a XSS - Cross Site Scripting viruses), embedded Visual BASIC for Applications (a.k.a. MS Office macro viruses).
- Software Data files with object support including: images like *.GIF and *.JPG, web pages capable of being infected by javascript source injection (a.k.a XSS - Cross Site Scripting viruses), MS Office documents capable of carrying embedded Visual BASIC for Applications (a.k.a. MS Office macro viruses), MS Outlook data files which can easily be exploited using VBA as well.
Malicious Virus - Virus + Warhead: This is what most people think of when they hear the term "virus." By definition this structure can copy itself to new locations without the intervention (or knowledge) of the user, set itself up in the new location (including a new computer system at which it has arrived over the network) so that it will automatically get a chance to execute (also without user intervention or knowledge), and carries embedded within it a warhead, usually event triggered, that will do damage to the target system's data. A good example of this is the Jerusalem B virus, which was designed to stay resident in memory and infect the *.COM and *.EXE program files that the user runs: the virus engine component. It continues this behavior until the system date is a Friday the 13th. On that date it deletes every program that the user attempts to run: the event triggered warhead component.
Trojan Horse - Warhead disguised as friendly code: one of the lesser understood forms of malicious code, the trojan horse is NOT a virus, in that it does not have the ability to copy itself from one location to another and consequently does not have the ability to set itself up to execute automatically either. Instead it relies on deception masquerading as a friendly program so that the user will intervene and manually execute it. Once executed by the user, it has all of the power and the authority (rights) of the user and can set itself up to run automatically from then on. Trojan horses can even carry locally infectious viruses and release them into the computer upon user execution. Modern Windows era trojan horses can even release old tried and truly devastating DOS era viruses (even Jerusalem B mentioned earlier) thus causing a resurgence of these viruses as the main warheads being delivered in Windows trojan horse "wrappers."
Worm: The original, now deprecated, definition of a worm was a virus capable of self-replication over the network. However, the vector does not significantly change the nature of a virus, so viruses that can self-replicate over the network are now properly referred to as viruses. The new definition of a worm is: any virus whose self-replication activity is allowed to run unrestricted, such that this behavior in and of itself consumes so many computer resources that it becomes the warhead. Since a virus is by definition any auto-self-replicating program, any such program that replicates so profusely that it bogs down the system or even crashes it is a worm. It should be noted that the old meaning of the term persists, and many modern viruses that can replicate across-the-wire are still called worms by the old definition. An example of a worm under the modern definition was the Blaster Worm, whose prolific self-replication across-the-wire was so wildly out of control, and caused such a network traffic bottleneck, that this was how it was noticed and accounted for much of the trouble it caused.
Resident vs. Non-resident viruses: Non-resident viruses consist of a replication module which, when activated, enters memory, performs whatever self-replication activities it desires and then removes itself from RAM until its next opportunity to be activated. Infected executables often carry the virus at the beginning of the executable so that when the program is run, the replication module gets to execute first. It performs an injection into another executable, erases itself from RAM and passes control to the original executable that it infected. Resident viruses enter RAM at their execution opportunity and then remain resident in RAM for the duration of the computing session, periodically gaining use of the CPU and performing their status checks and replication activities as they desire. Old DOS boot sector viruses were often resident viruses that remained active until the PC was turned off.
Morphic virus: There are three basic types of morphic viruses:
- Encrypted virus: These viruses encrypt themselves using a different key each time thereby disguising their existence and making it difficult to put their signature into an anti-virus list for anti-virus software to attempt to match. The virus then consists of a small decryption module and the rest consists of the encrypted body. Anti-viruses attempt to locate the decryption module which cannot be encrypted or it would not be able to execute.
- Polymorphic virus: These viruses also encrypt themselves with variable encryption keys, but they modify the decryption module as well each time they create a new copy of themselves, which makes them much harder to detect. Polymorphic viruses posed the first truly dangerous threat to the computing world, since no anti-virus could possess a concise, accurate description of one to latch onto in performing its pattern matching searches. The solution was to have the anti-virus create a virtual PC emulator and release files into it. If a file contained a polymorphic virus, the virus would decrypt itself, and the decrypted body could then be checked against the pattern in the anti-virus list.
- Metamorphic virus: A variation of the polymorphic virus in which the virus does not just change by using randomly chosen encryption keys, but also changes its own actual executable code algorithms. Basic metamorphs generally use the polymorphic overall design, but once decrypted in RAM they change their own actual working code, encrypt themselves using another key, then insert this new variant into the target vector. Another term for metamorphic viruses is mutating viruses, since the actual working code mutates with each new iteration or copy. Metamorphic viruses are extremely resistant to detection, even within the virtual PC emulation created in RAM by the anti-virus software, because each individual is unique, which makes flat pattern matching against a master signature held in the anti-virus signature file impossible. The only method of detecting a metamorphic virus is to release it in the emulator region and observe its behavior: the polymorphic behavior (automatic decryption of the body), the metamorphic behavior (automatic modification of the body), the polymorphic behavior again (reencryption of the modified body), and the virus engine behavior (auto-self-replication; injection into a new vector/host), together with possible identification of potentially threatening algorithms within the unencrypted body (identification of the warhead).
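To see why a fixed signature taken from the body fails against an encrypted copy, yet works once the body has been decrypted (for instance inside the AV engine's emulator), consider this minimal sketch. It uses a harmless text string as a stand-in for the body and a single-byte XOR as a stand-in for the encryption. Both are assumptions made for illustration, as are the names `xor_bytes`, `BODY` and `SIGNATURE`.

```python
def xor_bytes(data: bytes, key: int) -> bytes:
    """Toy 'encryption': XOR every byte with a one-byte key."""
    return bytes(b ^ key for b in data)


BODY = b"HARMLESS-DEMO-BODY"   # stand-in for the code being disguised
SIGNATURE = b"DEMO-BODY"       # pattern a scanner would look for

# Two copies of the same body, each encoded with a different key.
copy_a = xor_bytes(BODY, 0x5A)
copy_b = xor_bytes(BODY, 0xC3)

print(copy_a != copy_b)                          # True
print(SIGNATURE in copy_a, SIGNATURE in copy_b)  # False False

# Once a copy has been decoded, the plain signature is visible again.
print(SIGNATURE in xor_bytes(copy_a, 0x5A))      # True
```

The two encoded copies share no recognizable bytes with each other or with the signature, which is why the scanner has to find the unencrypted decryption module or let the body decrypt itself first.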
Stealth virus: While these do go back to the DOS era, modern stealth viruses are a particularly dangerous problem for modern Windows systems. Stealth viruses are by definition a special form of resident virus in that they stay active in RAM and actively monitor system-wide events including the activation of anti-virus software. When they detect that such an event has been undertaken, they take appropriate countermeasures to either avoid detection by the anti-virus software which may include intercepting low-level OS file access and delivery of non-infected versions of the file to the AV engine, or they will take countermeasures to survive the AV scan by hiding out either in RAM or by using some trojan-horse-like maneuver to hide in a single file after the AV has already scanned the file and determined that it is clean. The only known method of defeating a stealth virus is to be 100% certain that it is not active in RAM. The only way to do this is to boot the system from a KGCC alternate OS or from a KGCC disk and then scan and disinfect all files.
KGCC - Known Good/Certified Clean System or Boot Disk: When dealing with a virus on a network, it should be abundantly clear that any system could be infected and therefore it must be assumed that ALL systems at that site have been infected. A virus on a networked machine is a "Site-wide" threat and a "Site-wide" infection until it can be asserted that it is not ... guessing is unacceptable. The only system onsite that can be asserted as uninfected is one that has never been attached to the network with the others. Furthermore, any machine that has been exposed to the Internet without proper defensive countermeasures against malicious code, cannot be asserted as uninfected. A Known Good Certified Clean system is then: a PC that has had a clean install of the OS, and all proper defensive software prior to exposure to the Internet where it was exposed only for the purpose of downloading the latest updates to the OS and the defensive software packages. A Known Good Certified Clean Boot Disk can only originate on a KGCC system and it must be completely write-protected. Such a KGCC would be for example a bootable CD-R which can only be recorded once. Any further attempt to write data to the CD-R will fail making it a perfect write-protected media from which to bootup suspicious hosts.
Signature: The stretch of code that is specific to the virus, either part of the auto-self-replication module (the virus engine), the data damaging component (the warhead), or the decryption/mutation engine, or a combination of these that is unique to the virus.
Signature List or Database: The file or collection of files that hold the signatures for all known viruses. The anti-virus software will use this list by opening each potential vector on the system and scan it byte-by-byte trying to match it against any signature in the database. If a match is found, then the vector has been determined to be infected.
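The matching step described above can be sketched in a few lines. This is a simplified illustration, not how any particular AV product is implemented; the signature names and byte patterns below are made up for the example.

```python
# Hypothetical signature list: name -> byte pattern unique to that entry.
SIGNATURES = {
    "Demo.A": bytes.fromhex("deadbeef0102"),
    "Demo.B": b"EXAMPLE-SIGNATURE",
}


def scan_bytes(data: bytes, signatures: dict[str, bytes]) -> list[str]:
    """Return the names of all signatures found anywhere in the data."""
    return [name for name, pattern in signatures.items() if pattern in data]


clean = b"ordinary document text"
infected = b"\x00\x01" + bytes.fromhex("deadbeef0102") + b"rest of file"

print(scan_bytes(clean, SIGNATURES))     # []
print(scan_bytes(infected, SIGNATURES))  # ['Demo.A']
```

Every file scanned has to be checked against every entry in the list, which is why scan time grows with both the number of files and the size of the signature database.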
Anti-Virus engine: The program code of the anti-virus software that opens each potential vector and compares its contents with the contents of the virus signature list. Modern AV engines are extremely powerful and sophisticated programs capable of generating an entire virtual PC emulator in RAM, where they can release potentially infected files in order to fool a morphic virus into "coming out of hiding" and trying to infect other files within the virtual PC environment that the AV engine has prepared for it. If it does, the AV engine can identify the file as infected and sometimes accurately identify which virus it carries (in the case of encrypted viruses and polymorphic viruses).
Heuristic Analysis: Modern AV engines all possess heuristic analysis capabilities. A heuristic is a rule of thumb: a judgment reached from general characteristics rather than from an exact match. In attempting to identify a metamorphic virus, the AV engine has no entry for it in its signature database, because these viruses constantly mutate, changing their form sufficiently that no simple signature exists to scan for. As a result, the AV engine must instead release the potential mutating virus into the virtual PC emulator in RAM and watch for its decryption (polymorphic module), auto-self-mutation (metamorphic module), reencryption (polymorphic module) and auto-self-replication (virus module) behavior in order to identify it as a potential threat. It also scans the content, while unencrypted, for potentially dangerous program code (the warhead module), again without a fixed template. Because the AV engine must identify these viruses based on their general behavioral traits, it is using a rudimentary form of artificial intelligence in doing so. It is recognizing a general pattern of behavior in the suspected vector, which is a distinctly human-like activity that is very difficult for programs to do well.
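Behavior-based identification can be pictured as a scoring exercise: each suspicious behavior observed in the emulator adds to a total, and the file is flagged once the total crosses a threshold. The sketch below is only an illustration of that idea. The behavior names, weights and threshold are invented for the example and are not taken from any real AV engine.

```python
# Hypothetical weights for behaviors an emulator might report.
BEHAVIOR_WEIGHTS = {
    "self_decrypts": 2,
    "modifies_own_code": 3,
    "re_encrypts": 2,
    "writes_to_other_executables": 4,
}
THRESHOLD = 6  # assumed cut-off for flagging a file


def suspicion_score(observed: set[str]) -> int:
    """Add up the weights of every known behavior that was observed."""
    return sum(weight for name, weight in BEHAVIOR_WEIGHTS.items()
               if name in observed)


def is_suspicious(observed: set[str]) -> bool:
    return suspicion_score(observed) >= THRESHOLD


# A packed installer might only decrypt itself.
packed_installer = {"self_decrypts"}
# A mutating, replicating program shows several behaviors together.
mutating_program = {"self_decrypts", "modifies_own_code",
                    "writes_to_other_executables"}

print(suspicion_score(packed_installer), is_suspicious(packed_installer))  # 2 False
print(suspicion_score(mutating_program), is_suspicious(mutating_program))  # 9 True
```

Because no single behavior is conclusive, a scheme like this can produce false alarms and misses, which is one reason heuristic results are often reported as "possible" threats.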
AV Scan: The main activity of the AV engine in which the software scans the files on the system comparing each one's content with all known virus signatures in its virus signature lists. Modern AV scanners must also create a virtual PC emulator environment in which to release potentially infected files in order to allow polymorphs to decrypt themselves, thus exposing their core signatures for detection, and to observe for such related behaviors in order to use heuristic analysis to identify metamorphs that cannot be detected any other way. AV scans allow the user to choose many options including "full system scans" vs. "scan selected areas" and to scan "all files" vs. "potential threat files". Obviously a system where a virus is suspected should be subjected to a full system scan not just specific areas and such a scan should include all files, not just potential threats which are the most common binary vectors.
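The difference between scanning "all files" and scanning only "potential threat files" comes down to how the list of files is selected. A minimal sketch follows, assuming a hypothetical extension list drawn from the common binary vectors named earlier; the function name and sample file names are also assumptions.

```python
import tempfile
from pathlib import Path

# Assumed list of extensions treated as "potential threat files".
THREAT_EXTENSIONS = {".com", ".exe", ".ovl", ".dll", ".bat", ".vbs", ".cmd"}


def files_to_scan(root: Path, all_files: bool) -> list[Path]:
    """Collect every file under root, or only those with risky extensions."""
    candidates = [p for p in root.rglob("*") if p.is_file()]
    if all_files:
        return candidates
    return [p for p in candidates if p.suffix.lower() in THREAT_EXTENSIONS]


# Demonstration on a throwaway directory tree.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "docs").mkdir()
    for name in ("setup.EXE", "docs/report.txt", "login.bat"):
        (root / name).write_bytes(b"")

    everything = sorted(p.name for p in files_to_scan(root, all_files=True))
    threats_only = sorted(p.name for p in files_to_scan(root, all_files=False))

print(everything)    # ['login.bat', 'report.txt', 'setup.EXE']
print(threats_only)  # ['login.bat', 'setup.EXE']
```

The filtered scan is faster but skips `report.txt`, which is exactly the kind of file that a data-file vector could be hiding in. That is why a full scan of all files is the right choice when infection is suspected.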
Quarantine: In some cases an AV engine cannot be 100% certain that it has identified a virus, or it cannot ascertain with 100% certainty which specific section of the file is actually infected, or it cannot actually delete the affected area of the file due to the countermeasures that the virus is actively engaged in. In any of these scenarios, the AV will announce that it cannot successfully cleanse or inoculate the file and it will offer to quarantine it instead. This means that the file will be copied to a new location and that it will be encrypted by the AV engine. When the user accesses it, the AV engine will decrypt it and allow it to launch into a virtual PC emulation where any destructive activities it attempts will occur within this virtual realm and cause no more damage to any other data on the PC. It should be noted that the virus would have the ability to continue causing damage to the file it has infected. So if the file is important, it should be opened immediately and all data within it should be copied at one time to a new file outside of the quarantine area. The AV engine should then be allowed to scan this new data file to assert that it is clean and that the virus was not copied as well.
Cleanse or disinfect: The AV engine successfully removes all traces of the virus from the file. Sometimes this cannot be accomplished either because the virus is a polymorph and the actual region that it is infecting cannot be accurately determined, or because the file has been infected with a stealth virus which is taking active steps to protect itself within the affected file such as copying itself back into it instantly every time it is cleansed from it, or because the original file has already been partially or totally corrupted by the virus. In such cases the file can either be deleted or quarantined at the discretion of the user.
Inoculation: This term is used by specific AV software to mean one of two things and it is important to determine exactly what the specific AV software means when they use this term: 1) Cleansing an infected vector, or 2) applying active real-time scanning protection to monitor all activity on the host. Obviously the two meanings are totally different and misunderstanding what the AV is trying to do could mean disaster.
Active Real-Time Protection or shield: This is one of the most important features of any good AV software package. The AV runs a small AV engine as a background process, usually visible in the System Tray of Windows, and actively scans all data passing through the OS and its applications as it occurs. This includes watching the stream of data arriving at the Network Interface Card as it travels from the device into the Web browser software, watching files that are opened and read, and especially watching the data that gets written to any file at any time. If this data matches any pattern in the virus signature database, the operation will be halted and the user will be notified of the threat. It should be noted that each data activity must be scanned against the entire virus signature list, and modern lists exceed 20 megabytes in size. That is a lot of data to compare against each data event in the system as it occurs. It should come as no surprise that running a strong and thorough AV shield slows down the overall performance of the system. But no matter how much performance suffers as a result of the AV shield, this is preferable to running the machine unprotected in the presence of high-risk vectors, which include the highest risk one of all: the Internet.

Spyware and their kin:

Spyware: By definition, spyware is any code designed specifically to obtain data from the user without the user's knowledge or intervention, as opposed to a malicious virus (and its kin), which is designed to destroy the user's data. Spyware comes in just about as many varieties as viruses do, and sometimes comes packaged as a virus or a close relative of one.
Adware: By definition, adware is any code designed to generate advertising directed at the user without the user's consent. This includes adware or more malicious forms of spyware embedded within the installers of other programs that the user does want, such that the user unknowingly consents to the installation of the adware or spyware agent.
Malware: By definition is any code designed to perform any kind of malicious act against the system or the user: malicious software. Classifications of malicious actions include: damaging data or stealing data. Forms include:
- Infectious Malware: viruses, worms
- Concealed Malware: trojan horses, "drive-by downloads," phishing
- Profit motive malware: adware, web use tracking, stealware, spyware
Corporate spyware: This spyware is installed purposely by the organization to track the activities of its employees, especially while they are using the Internet. Companies justify installing such packages, and warning their employees of their existence, as a way to thwart the browsing of websites such as online gambling sites (illegal in most states) and pornographic websites whose content may be completely illegal (underage models). Since law enforcement agencies such as the FBI may monitor traffic to such sites, the company is protecting itself by discouraging its employees from visiting potentially illegal sites and by detecting such visits when they happen. The problem lies in the fact that many anti-spyware suites will detect the desired spyware, report it, and in some cases even successfully remove it, reopening the problem (unlimited Internet access) on the workstation. This has led many manufacturers of such desirable corporate spyware to pressure anti-spyware manufacturers into not reporting or disabling their products. This in turn has led unscrupulous individuals to modify these spyware packages, which the anti-spyware leaves alone, for their own use.
Phishing: Sending authentic-looking email that appears to come from reputable websites to end users, using the mass emailing techniques known as spamming. When the user clicks on any link in the email, the user has performed a cooperative action that can be redirected to any kind of undesirable activity, including: downloading and installing malicious code (low-level spyware, trojan horse, virus, etc.), or redirecting the user to the spyware author's site, which looks like the authentic site that the email seems to have come from and presents the user with the familiar-looking user name and password login boxes. At this point, if the user proceeds to log in, the spyware author's server records the user name and password for the author to use at any time. If the victim was fooled into believing that the website was their bank, the result could be devastating.
Spamming: A particularly overwhelming problem for the modern Internet, spamming is by definition mass blind emailing. While the primary intention of most spam may be as relatively harmless as its physical mail equivalent, "junk mail," enough malicious versions exist that all unrecognized email should be deleted immediately. Note: merely opening such an email or any related message can activate malicious code immediately and should be avoided at all costs. This applies to any form of communication that can embed HTML, including unknown website links, web-based email of unverifiable origin, and live chat invitations or transmissions from unknown people, any of which can be a powerful and dangerous phishing event or can even embed devastating malicious code that activates immediately.
"Drive-by downloading": Takes advantage of web browser vulnerabilities to force a surreptitious download from the spyware author's website onto the visitor's system. Actively installing an executable and running it allows this program to accomplish any desired task. It is as powerful and as dangerous as any trojan horse and, technically speaking, is one. The tools that take advantage of well-known vulnerabilities are known in the malicious code and anti-virus/anti-spyware communities as exploits.
tracking: This form of spyware monitors the user's web surfing in order to collect information about the user's interests and create customized advertisements containing items that interest the user, greatly improving the likelihood that the user will pursue them. While many companies, even reputable ones, engage in this activity, which they argue is just "good marketing," they are nevertheless obtaining private information from the user's computer without the user's knowledge or consent, and that is, by definition, spyware.
stealware: Not an end-user concern, but definitely a corporate concern. Stealware authors intercept legitimate companies' advertisements, either pop-ups or embedded webpage banners, and redirect the users who click on them to their own sites instead. They gain from this action either directly, by taking credit for the users' pursuit of the links (pay-per-click advertising reimbursements), or by getting the business of the customer in place of the original advertiser.
keylogger: One of the most dangerous forms of spyware, the keylogger program infiltrates the local system, records all keystroke events at the system level, and then transmits keystroke logs or transcripts back to the author. The keylogger program itself is a stealth technology that usually either does not appear at all in the Windows Task Manager's process list or appears there under a friendly alias. The program is usually equipped with enough internal components to deliver the keystroke transcripts on its own, usually by emailing them, without the system's help (it can make the transmission by itself as soon as the system gains access to the Internet).
rootkit: Another very dangerous form of spyware, the rootkit is a program or group of programs (hence the name rootkit) that establishes administrative-level access rights on the compromised system. Rootkits are the next-generation evolution of the keylogger: they infiltrate the system and allow access to and control of all information and functions from a distance. Once installed, a rootkit is usually extremely difficult to detect because it has the highest system privilege level and can force the Task Manager to skip it when displaying the active process list; it is difficult to remove because of this same privilege level; and it gives an unauthorized user full remote control of the system.
backdoor: Back doors are security holes purposely designed into software or network systems, usually by the maker of the product. The justification is that backdoors allow the maker to bypass certain security techniques in order to identify users, detect whether the system is registered or pirated, etc. When backdoor information leaks, it becomes a major security risk for the entire product line. Backdoors can also be created by software, notably rootkits, and the main objective of viruses and trojan horses is shifting from random data destruction to serving as delivery systems that create backdoors for information collection.
vulnerability and exploit: Vulnerabilities are found in software and protocols, usually through long trial-and-error methods undertaken by crackers. Once a vulnerability has been found, a tool is developed that makes taking advantage of the vulnerability easy; such a tool is called an exploit.
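
The phishing entry above describes email links that appear to lead to a reputable site but actually lead elsewhere. One simple heuristic for spotting this, sketched below in Python, is to compare the host name shown in a link's visible text with the host name in its actual target. This is only a minimal sketch of one possible check and not a complete phishing defense; the sample addresses are invented, and the function name suspicious_links is hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

class LinkCollector(HTMLParser):
    """Collect (href, visible text) pairs for every <a> tag."""
    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None

def suspicious_links(html: str) -> list:
    """Return (text, href) pairs where the text names a different host than the href."""
    collector = LinkCollector()
    collector.feed(html)
    flagged = []
    for href, text in collector.links:
        if not text or " " in text:
            continue  # ordinary wording such as "Click here" is not a host name
        href_host = urlparse(href).hostname or ""
        shown = text if "://" in text else "http://" + text
        text_host = urlparse(shown).hostname or ""
        # Flag only when the visible text looks like a host and differs from the target.
        if "." in text_host and text_host != href_host:
            flagged.append((text, href))
    return flagged
```

A link that displays "www.mybank.example" but points to a bare IP address would be flagged, while a link whose text and target agree would not.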

Viruses are designed to destroy data, and in the end the business could not care less about the computers themselves; they are a trivial cost compared to the customer database, which to any company is priceless and irreplaceable. As a result, since viruses are designed specifically to destroy data and the data is the TOP priority on any business network, the occurrence of a single piece of malicious code, no matter how innocuous it may seem, must be treated with the utmost caution and severity; it could have the potential to destroy the entire database and cause irrevocable damage to the business.

The only effective defense against viruses and their kin is to install and maintain an anti-virus software suite. In the event that a virus does penetrate the system, the only effective resolution of this data loss event is the restoration of the system from backups. However, viruses introduce a new twist to this disaster recovery strategy in that their warheads generally have event triggers, meaning that they may hide undetected in infected systems for days or even months. In this situation, recent backups quite often contain infected files, and restoring from these backups also restores the virus. To further compound the problem, backup media may not be in a format that allows it to be scanned, so there may be no way of knowing which backups are clean and which are tainted until the restorations are performed, which in turn releases the virus into the system again.
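
One way to reason about the tainted backup problem is to work backward from the newest backup until one is found that scans clean. The Python sketch below shows only that search logic. It assumes a hypothetical is_clean callback that restores a backup to an isolated test machine and scans it there (a step the lecture does not prescribe), and the backup labels and dates are invented.

```python
from datetime import date

def latest_clean_backup(backups, is_clean):
    """Walk backups from newest to oldest and return the first that scans clean.

    backups  : list of (backup_date, label) tuples
    is_clean : hypothetical callback that restores a backup to an isolated
               test machine, scans it, and returns True if nothing is found
    """
    for backup_date, label in sorted(backups, reverse=True):
        if is_clean(label):
            return backup_date, label
    return None  # every available backup is tainted

# Invented example data: the two most recent backups contain the dormant virus.
backups = [
    (date(2024, 3, 1), "full-0301"),
    (date(2024, 3, 8), "full-0308"),
    (date(2024, 3, 15), "full-0315"),
]
infected_labels = {"full-0308", "full-0315"}
print(latest_clean_backup(backups, lambda label: label not in infected_labels))
```

The older the clean backup, the more legitimate work is lost in the restoration, which is why long-dormant viruses are so costly.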

As dangerous as viruses are, spyware can present an even greater danger to businesses. Spyware is malware specifically designed to gather or, more accurately stated, steal information from the business. This can include customer lists, which can then be used directly as a direct mail advertising target. Worse, the list could contain private information such as the customers' credit card numbers and addresses, all sufficient to engage in identity theft on a massive scale, wrecking the customers financially and thoroughly destroying their trust in the company that lost their data. Such incidents have put companies out of business and should be considered just as severe and potentially devastating as any virus.

Spyware has more unrelated forms than viruses have and as such can be more difficult to combat, but the defense begins with the installation of an anti-spyware package that includes both a real-time shield and regularly scheduled scans.

Malicious users are the fourth form of data loss. This basically includes intruders to the system. While collectively known today as hackers, this is not an accurate term. The true definition of a hacker is anyone sincerely interested in learning about technical systems, but not necessarily interested in causing any form of damage which includes intrusion and access to private information. Those who choose to intrude and damage or access private information are exceeding the definition of hacker and are properly referred to as crackers but even this term has changed in modern times to mean something totally different.

Malicious Users (and associated terminology)

	Hacker: Develops methods of infiltrating information technology systems across the Internet access portal.
	Cracker: 1) Develops methods of cracking commercial software so that it will install and run without the original product key. 2) Develops and/or employs decryption techniques to encrypted "cipher text" in an attempt to expose the original "plain text" data and thus gain access to sensitive information.
	Pirate: Shares commercial software with others either for free or worse, charges for it.
	Script kiddie: Finds hackers' programs and techniques, then downloads and uses them with very little knowledge of his own about what they are or how they work.
	Social engineer: Works "face-to-face" or chat window-to-chat window (etc.) to convince the user that he is their friend and to share sensitive information such as username/password combinations etc.
	Ethical hacker: Hired by the company to test their intrusion defense and detection. Also known as white hats.
	Hacktivist: Hackers who believe that they are the only people saving the world, essentially the Internet, from big corporation and government control.
	Disgruntled/disloyal employee: Essentially an insider intruder. Being behind the majority of any network's intrusion defenses and detection systems and already possessing some rights on the system, they have already defeated several layers of security without doing anything; gaining more access within the system will be easier than trying from scratch from the outside.
	White Hat: Hackers dedicated to defending the computer world from harmful hackers.
	Black Hat: Harmful hackers/intruders
	Blue Hat: Hired by the company to test their intrusion defense and detection by doing nothing more than actually attacking it.
	Grey Hat: Ambivalent about their ethical position on hacking.
	Firewall: System designed to thwart external intrusion techniques.
	Security policy: Security policies include general security policies of the entire site and access control to systems and facilities as well as the network and server security policies within the system and should be comprehensive and thorough and subject to review, testing, and maintenance through the employment of ethical hackers and regular security audits.
	Encryption: Conversion of data from "plain text" to "cipher text" by applying encryption keys to the plain text, converting it into encrypted cipher text which is then transmitted from one system to another. This prevents the interception of sensitive data while in transit.
	Authentication: Verification of the user generally done by matching the username and the password. There are however, much stronger authentication techniques than this which can be employed by the network logical security policy.
	Key generator: Program that scans commercial software to gain clues on its product key detection code in order to create a fake product key that will activate it.
	Brute force cracker: Program that attempts to decrypt encrypted "cipher text" by applying every possible encryption key to it, until it finds the correct key that exposes the original "plain text" information.
	Dictionary cracker: Program that attempts to decrypt encrypted "cipher text" by applying a list of common possible encryption keys to it taken from a dictionary file, until it finds the correct key that exposes the original "plain text" information.
	Event log: The system records various events, including failed attempts by users to log in, in an event log, which can then be checked for possible attempts by intruders to gain access to the system.
	Security Audit: Routine operation in which the security mechanisms in place are checked and tested to ensure that the system is secure. Includes such activities as logging on as a particular user and then attempting to access services and shares which that user account should not have the rights to. If the user account can engage in such activities or access restricted services or shares, then the administrator must change the account or the underlying global system security policy so that such rights are no longer available.
	Honey Pot: A computer purposely left exposed to attract the attention of intruders and to run system monitor software to send alerts when it is breached. Honey pots are also used as decoys to keep intruders from finding and damaging critical systems.
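
The difference between the brute force cracker and the dictionary cracker defined above can be shown with a toy example. The Python sketch below uses a deliberately weak XOR cipher, invented here purely for illustration and not a real encryption scheme. It also assumes the attacker knows that the plain text begins with "ACCT:", which is how the sketch recognizes the correct key; the key, message, and word list are all made up.

```python
from itertools import product

def xor_cipher(data: bytes, key: bytes) -> bytes:
    """Toy cipher: XOR each byte with a repeating key (the same call decrypts)."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

def looks_like_plain_text(data: bytes) -> bool:
    """Assumed success test: the plain text is known to start with 'ACCT:'."""
    return data.startswith(b"ACCT:")

def dictionary_crack(cipher_text: bytes, words):
    """Try only the candidate keys listed in a dictionary."""
    for word in words:
        key = word.encode()
        if looks_like_plain_text(xor_cipher(cipher_text, key)):
            return key
    return None

def brute_force_crack(cipher_text: bytes, alphabet: str, max_len: int):
    """Try every possible key over the alphabet, shortest keys first."""
    for length in range(1, max_len + 1):
        for combo in product(alphabet, repeat=length):
            key = "".join(combo).encode()
            if looks_like_plain_text(xor_cipher(cipher_text, key)):
                return key
    return None

secret = xor_cipher(b"ACCT: 12345", b"cab")
print(dictionary_crack(secret, ["dog", "cab", "sun"]))
print(brute_force_crack(secret, "abc", 3))
```

The dictionary attack finishes after a handful of guesses but fails if the key is not in its list, while the brute force attack always succeeds eventually at the cost of trying every combination. The number of combinations grows exponentially with key length, which is why long keys matter.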

The problem with data loss due to malicious user activities is that the damage may not take the form of the destruction of the data, but may instead take the form of the theft of the data. In this case, there is no clearly defined, concise solution, and in fact there may be no solution at all to the theft of sensitive information. Because this is the one form of data loss for which an effective recovery may not even be possible, its prevention is the most important consideration of any organization. Because there are many different types of malicious user, many employing various intrusion tools and techniques, there is no definitive method of defense other than constant vigilance. The defense of the network from intruders, however, begins with the firewall and a strong security policy that includes strong authentication techniques, encryption, and security audits as well as strong event logging and monitoring.
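
The event log monitoring mentioned above can be partly automated. The Python sketch below counts failed logins per user and source address and reports any pair that reaches a threshold. The log format, the sample entries, and the threshold of three are all invented for this sketch; a real system's event log would have its own format and would need its own parser.

```python
from collections import Counter

# Hypothetical log format, invented for this sketch:
# "<timestamp> <result> <user> <source address>"
SAMPLE_LOG = """\
2024-03-15T02:11:04 FAIL admin 198.51.100.7
2024-03-15T02:11:06 FAIL admin 198.51.100.7
2024-03-15T02:11:09 FAIL admin 198.51.100.7
2024-03-15T08:30:12 OK jsmith 10.0.0.21
2024-03-15T08:31:40 FAIL jsmith 10.0.0.21
"""

def repeated_failures(log_text: str, threshold: int = 3) -> dict:
    """Return {(user, source): count} for pairs with at least `threshold` failed logins."""
    failures = Counter()
    for line in log_text.splitlines():
        parts = line.split()
        if len(parts) == 4 and parts[1] == "FAIL":
            failures[(parts[2], parts[3])] += 1
    return {pair: count for pair, count in failures.items() if count >= threshold}

print(repeated_failures(SAMPLE_LOG))
```

A single mistyped password, such as the jsmith entry, stays below the threshold, while the repeated failures against the admin account stand out as a possible intrusion attempt.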

Hardware failure falls into one of the smallest percentages of the causes of data loss in organizations. It also has some of the best methods of prevention, and there is most likely a direct correlation between these two facts: the fact that data loss due to hardware failure has well developed prevention technologies may be why it has been reduced to such a low incidence level. The main hardware related failures that will result in data loss are:

	Hard drive failure: The hard drive has moving parts, so it is not a question of "if it will fail" but a question of "when will it fail". Hard drive failures can be as limited as the loss of integrity of a single sector or as massive as a total failure of the drive, resulting in the loss of all data stored on it. Hard drives can suffer instant total failure or gradual degradation over a period of months. Hard drive failures and their severity are impossible to predict.
	Cooling fan failure: The system's cooling fans have moving parts, and just as with hard drives it is not a question of "if it will fail" but a question of "when will it fail". Cooling fans can also suffer instant total failure or a gradual degradation of performance over a protracted period, and as with hard drives, such failures are impossible to predict.
	Poor environmental conditions: All components within modern computers are extremely high speed electronic technologies that are also, because of this, extremely sensitive to environment and especially sensitive to ESD damage. No component within an enterprise level system, be it the server, the network interconnectivity devices, or the workstations, should ever be subjected to improper conditions, which include high humidity, high temperature (insufficient ventilation), and high dust/dirt concentrations. Computer equipment must be kept in a dry, cool, clean environment to ensure that the components will live beyond their own functional obsolescence date. ESD vulnerability and poor environmental conditions are the principal cause of solid state device failures in the modern PC.
	ESD mishandling failure: All components within modern computers are extremely high speed electronic technologies that are also, because of this, extremely sensitive to environment and especially sensitive to ESD damage. No component within an enterprise level system, be it the server, the network interconnectivity devices, or the workstations, should ever be improperly handled. ESD damage can cause instant total failure or latent damage, which is completely undetectable and can cause instant total failure months after the component was handled improperly. ESD vulnerability and poor environmental conditions are the principal cause of solid state device failures in the modern PC.

The main methods of preventing data loss due to hardware failure begin with the maintenance of a suitable server environment and the proper handling of all equipment. These two simple steps go a long way toward preventing component failure and extending the useful life expectancy and reliability of all equipment. Furthermore, hard drives should definitely be set up as fault-tolerant RAID arrays, which dramatically reduce the chances that the failure of a single drive will result in data loss. The failure of a single hard drive within the RAID should result in nothing more than the inconvenience of having to replace it. The same is true for the installation of multiple cooling fans, such that the component will not fail if any single fan fails.
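
How a fault-tolerant RAID survives the loss of one drive can be seen in the XOR parity used by parity-based levels such as RAID 5. The Python sketch below is a simplified illustration only: it treats three short byte strings as the stripes on three data drives and keeps the parity in a single block, whereas a real RAID 5 controller distributes parity across all member drives. The data values are invented.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, column by column."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Three data "drives" holding one stripe each, plus one parity block.
data = [b"CUST", b"OMER", b"DATA"]
parity = xor_blocks(data)

# Simulate losing drive 1, then rebuild it from the survivors and the parity.
survivors = [data[0], data[2], parity]
rebuilt = xor_blocks(survivors)
print(rebuilt)  # b'OMER'
```

Because XOR-ing the parity with the surviving blocks cancels them out, what remains is exactly the missing block. This works for any single failed drive, but not if two drives fail at once.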

Finally, all hard drives since the ATA-3 specification are equipped with S.M.A.R.T. (Self-Monitoring, Analysis and Reporting Technology), so if the drive detects a problem within itself (such as a transient deviation in RPM), it can report it. Unfortunately, most systems either have the feature disabled by default or offer only limited support through the BIOS POST routine, which means hard drives can report their imminent failure only during the POST. That implies a reboot, which high availability servers don't do very often. (That is, a high availability server should by design never go down, which means it should never reboot.) The hard drive manufacturers do make utilities capable of actively monitoring their products while they are up and running, and these solutions should be implemented on any server whose hard drives hold the vital data of the organization.

Another form of hardware level failure is the loss of power to the computer system. This may be due to the failure of the internal power supply itself or, worse, to the loss of electricity from the utility provider. To minimize damage due to power loss, server systems in particular should be equipped with redundant power supplies and, at the very least, a surge protector, a line conditioner, and a battery backup UPS. An on-line battery backup UPS will likely provide all the line conditioning required, so the two roles can be combined into a single device. For mission critical systems, the organization will have to invest in a backup AC power generator in order to remain operational at full or partial capacity during extended blackouts. While the installation of such a system is beyond the scope of the Server+ professional, working with plant maintenance to report the total power requirements of the electronic systems is part of the Server+ technician's role, as is coordinating the installation of local powerline appliances, the testing of the local server room devices, and the testing of the generator.
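
Reporting total power requirements usually comes down to simple arithmetic: add up the wattage of every device, convert watts to volt-amperes (VA) by dividing by the power factor, and add some headroom. The Python sketch below shows that calculation. The device wattages, the power factor of 0.7, and the 25% headroom are all assumed values chosen for illustration; actual figures should come from the equipment and UPS data sheets.

```python
def required_ups_va(loads_watts, power_factor=0.7, headroom=1.25):
    """Estimate the UPS VA rating needed for a set of loads.

    power_factor and headroom are assumed illustrative values; use the
    figures from the equipment and UPS data sheets in practice.
    """
    total_watts = sum(loads_watts.values())
    return total_watts / power_factor * headroom

# Hypothetical server-room loads in watts.
loads = {"server": 350, "switch": 40, "monitor": 30}
print(round(required_ups_va(loads)))  # 750
```

Here 420 watts of load works out to 600 VA, and the headroom raises the recommended rating to 750 VA. Note that this estimates the size of the UPS, not how long its battery will carry the load.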

Non-computer related disasters make up the smallest percentage of the total causes of data loss in computer systems, even though there are more such causes than any other. By non-computer related causes of data loss it is meant: anything other than the computer system and/or its users (whether invited or not) which causes the loss of its data. This includes theft (#1) and fire, flood, or other natural disaster (#2). While measures can be taken to prevent the theft of computer system equipment to a certain extent, only the most valuable, sensitive, and potentially lucrative data in the possession of wealthy organizations can be adequately defended. In the end, there is no way to prevent nature from taking whatever it desires. To that end, the backups must be kept offsite so that in the event that the entire facility is lost, the facility can still be replaced and the data can then be restored.

Review Questions

List the six major causes of data loss in computer systems:
What is the only method of recovering data once it has been lost regardless of what caused it?
What is the one form of data loss from which there may be no adequate method of recovery at all?
What is the one cause of data loss that is the most difficult and expensive to prevent?
What is the one cause of data loss that has the most effective preventative techniques and technologies?
What is the one cause of data loss whose preventative technique may cause more harm than good?
Would viruses in general be categorized as accidental or malicious intent? Destructive or data theft?
Would a trojan horse that installs a key logger be classified as accidental or malicious intent? Destructive or data theft?
An email that has a subject that reads "New Virus may capture your password" is an example of what spyware technique?
What is the percentage likelihood that any particular computer network will suffer from data loss during its functional lifespan?
What is the one cause of data loss for which there is no way to prevent it?
What is the one cause of data loss that has the highest likelihood that it will not be recoverable from the backups? Explain why.
Define: virus -
Define: Warhead -
Define: vector -
List and describe the high potential physical vectors:
List and describe the high potential logical vectors:
Define: malicious virus -
Define: trojan horse -
Define: worm (both classic and modern definitions) -
Define: resident and non-resident virus -
Define: prolific and non-prolific virus -
Define: encrypted virus -
Define: polymorphic virus -
Define: metamorphic virus -
Define: stealth virus -
Define: KGCC system or boot disk -
Define: signature list -
Define: heuristic analysis -
Define: anti-virus shield -
Which virus takes countermeasures directly against anti-virus software?
What is the only successful method of detecting a metamorphic virus?
Define: adware -
Define: spyware -
Define: malware -
Define: phishing -
Define: spamming -
Define: "drive-by downloading" -
Define: tracking -
Define: stealware -
Define: keylogger -
Define: rootkit -
Define: backdoor -
What is the only satisfactory method of preventing data loss due to malware?
What is the only satisfactory method of preventing data loss caused by viruses?
Define: vulnerability -
A program that was written to take advantage of a software vulnerability is called an:
Define: malicious user
Define: hacker (classic and modern definitions)
Define: cracker (both definitions)
Define: pirate
Define: script kiddie
Define: social engineer
Define: ethical hacker
Define: hacktivist
Define: white hat
Define: grey hat
Define: blue hat
Define: black hat
Define: firewall
Define: encryption
Define: authentication
Define: key generator
Define: brute force cracker
Define: dictionary cracker
Define: event log
Define: security audit
Define: security policy
Define: honey pot
Reviewing event logs would be an example of an action undertaken in which of the above terms?
A network security breach never passed through the firewall to the public access domain. This was possibly carried out by which category of malicious user?
Malicious users often engage in data theft. What are the chances of recovery from this?
What is the best defense against network intruders?
What is the first line of defense against intruders from the public access portal?
List and describe the four primary causes of hardware related data loss:
What are the best solutions to prevent hardware related data loss caused by hard drive failure?
What is the best solution to prevent hardware related data loss caused by cooling fan failure?
What is the best solution to prevent hardware related data loss caused by poor server environment?
What is the best solution to prevent hardware related data loss caused by ESD mishandling of components?
What are the technologies available to prevent data loss caused by transient electricity flow from the utility provider?
The new guy at a company has gotten to know the user in the cubicle next to him and invites this person to check out a really cool website he has found. Unbeknownst to his coworker, he created the website, whose homepage can surreptitiously download and install a program that can capture the coworker's username and password when they log in on their workstation. What category of malicious user is he? What two types of spyware is he employing?

Fun Work

Go online and find:

A highly rated, free, full featured, anti-virus software that you would try.
A highly rated, free, full featured, anti-spyware software that you would try.
A highly rated, free, full featured, firewall software that you would try.
At least one identified malicious fake of each type of defensive software listed above.

Go online and find:

An inexpensive ATA/SATA RAID controller card capable of RAID-5 that you would try.
An inexpensive, highly rated surge protector that you would try.
An inexpensive highly rated battery backup UPS, minimum 750VA, that you would try.
At least one other high availability computer system device/peripheral/component that you would consider in the planning phase of a well funded commercial server project.

¹ The stated value is an estimate based on the averages collected from many sources including but not limited to these Online References:
Stellar Information Systems Ltd.
OnTrack Data Recovery Unit
IT Policy Compliance Dot Com
Protect-Data Dot Com