The student will become familiar with:
the general layout of an NTFS partitioned HDD including:
MBR partition tables,
the Microsoft NTFS Volume Boot Record,
the Master File Table.
The student will learn about the general layout of NTFS partition structures on the hard drive including the location and nature of the NTFS partition tables, the Volume Boot Record, the "extended boot strap loader," and the Master File Table location and function. The student will begin to understand the foundation of the NT File System and how it works to store information on the hard drive.
The NTFS file system functions fundamentally differently from the FAT file system. In FAT the root directory is the fundamental starting structure for listing files and subdirectories by their human-given names. This small and very simple "fixed record" database contains the human-given names and the starting FAT entry, or first occupied cluster, of each file. The corresponding FAT entry either indicates that it is also the last used cluster or holds the number of the next used cluster. That next FAT entry in turn indicates the following used cluster or the end of the file, and so on. As such, the directory entries are small, simple, and easily found and interpreted by the data recovery specialist. The FAT is a central repository at the beginning of the partition of ALL used cluster information for all files and subdirectories of the partition.
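The chain-following behavior described above can be sketched in a few lines of Python. This is only an illustration: the dictionary `toy_fat` stands in for a real File Allocation Table, and the end-of-chain threshold and 28-bit mask are the usual FAT32 conventions rather than values taken from this text.

```python
# FAT32 entries at or above this value mark the last cluster of a file
# (assumption: standard FAT32 convention, not stated in the text).
END_OF_CHAIN = 0x0FFFFFF8

def cluster_chain(fat, start_cluster):
    """Return the clusters occupied by a file, in order, by walking the FAT."""
    chain = []
    cluster = start_cluster
    while cluster < END_OF_CHAIN:
        if cluster in chain:
            raise ValueError("loop detected in FAT chain")
        chain.append(cluster)
        # Each FAT entry holds the number of the next used cluster.
        cluster = fat[cluster] & 0x0FFFFFFF  # FAT32 uses the low 28 bits
    return chain

# Hypothetical FAT: the file starts in cluster 2, continues in 3,
# jumps to 7 (fragmentation), and ends there.
toy_fat = {2: 3, 3: 7, 7: 0x0FFFFFFF}
print(cluster_chain(toy_fat, 2))  # [2, 3, 7]
```

The starting cluster passed to `cluster_chain` is the value a directory entry would supply; everything after that comes from the FAT alone.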
FAT file systems are, relative to NTFS, extremely simple and lightweight, meaning that there is very little overhead for the file. An 8GB partition would still use 4KB clusters (8 consecutive sectors per cluster) and therefore have to set up two File Allocation Tables to keep track of the usage of roughly 2,000,000 clusters. Each cluster's entry in a FAT32 File Allocation Table is 32 bits or 4 bytes in length. So the entire FAT would be 2,000,000 x 4 = 8,000,000 bytes. This divided by 512 bytes/sector = 15,625 sectors per FAT x 2 FATs = 31,250 sectors. A typical Windows 98 installation holds about 2000 files in about 200 directories x 32 bytes per directory entry, thus: 2200 entries x 32 bytes/entry = 70,400 bytes ÷ 512 bytes/sector = 137.5 or 138 sectors. But allowing for 200 directories, each of which must use at least one cluster, yields: 200 directories x 8 sectors/cluster = 1,600 sectors + 138 sectors = 1,738 sectors, assuming that each directory except one held only one file (the most inefficient and pessimistic layout possible). All of this means that the roughly 260MB full standard installation of Windows 98 SE would also occupy 32,988 sectors of the drive due to the underlying file system structures that make access to any of it possible. This is only 16,889,856 bytes or roughly 16.1MB, which is about 6% of the space used by the actual files of the installation. As data accumulates and programs are installed, this percentage decreases because the FATs will never grow in size and they are the bulk of this estimate.
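The two largest terms of this estimate, the sectors taken by both FATs and the sectors taken by the directory entries themselves, can be reproduced with a short sketch. The function names are hypothetical and the inputs are the round figures used in the paragraph above.

```python
import math

BYTES_PER_SECTOR = 512

def fat_sectors(clusters, bytes_per_entry=4, copies=2):
    """Sectors used by all copies of the FAT (4-byte entries for FAT32)."""
    per_fat = math.ceil(clusters * bytes_per_entry / BYTES_PER_SECTOR)
    return per_fat * copies

def dir_entry_sectors(entries, bytes_per_entry=32):
    """Sectors needed to hold the 32-byte directory entries themselves."""
    return math.ceil(entries * bytes_per_entry / BYTES_PER_SECTOR)

print(fat_sectors(2_000_000))   # 31250
print(dir_entry_sectors(2200))  # 138
```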
FAT file systems are, relative to NTFS, extremely fast in performance. A file's directory entry can be quickly located because the directory entries are small and the filename is in a fixed position within them. Once the entry is located, the starting cluster number can be quickly found for the same reason: it is small and in a fixed position within the entry. The FAT entry can then be quickly found and the occupied cluster chain quickly determined, because the entries are all small numbers located relatively close to each other. Any of the Windows operating systems running on top of FAT32 rather than NTFS should experience a significant performance improvement because of the much smaller, simpler, and faster FAT file system. FAT16 is in fact even smaller and faster, but Windows XP may not fit into a FAT16 partition.
NTFS, on the other hand, has very large Master File Table entries. The MFT is the foundational file organizational structure in NTFS; in fact it is the ONLY file organizational structure in the file system. Every file on the drive has at least one unique MFT entry, and many files have many MFT entries. An MFT entry is very roughly analogous to the root directory entry of the FAT file systems. The entry consists of a collection of the file's attributes just like in FAT, but this is where the analogy ends.
One of the attributes that is held in the MFT entry is the occupied cluster "runlist" for the file. By storing the entire list of occupied clusters in the entry, the file system drivers do not have to refer back to the beginning of the drive to a File Allocation Table for the rest of the occupied clusters. However, the MFT entries, while fixed in overall size, are not fixed as to their internal organization or total number of attributes. In fact, a file's attributes can outgrow the size of a single MFT entry and then have to be pushed out to occupy another entry or more.
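To give a feel for what parsing a runlist involves, here is a minimal sketch of a decoder. It assumes the commonly documented NTFS data run encoding, which this text does not spell out: each run starts with a header byte whose low nibble gives the byte count of the run length and whose high nibble gives the byte count of the starting cluster offset, the offset is a signed value relative to the previous run, and a zero header byte ends the list. The sample bytes are hypothetical.

```python
def decode_runlist(data):
    """Decode NTFS-style data runs into (starting_cluster, length) pairs.

    Assumed layout: header byte (low nibble = size of length field,
    high nibble = size of offset field), little-endian values, signed
    offset relative to the previous run, 00h terminator.
    """
    runs = []
    pos = 0
    lcn = 0  # running logical cluster number
    while pos < len(data) and data[pos] != 0x00:
        header = data[pos]
        len_size = header & 0x0F
        off_size = header >> 4
        pos += 1
        length = int.from_bytes(data[pos:pos + len_size], "little")
        pos += len_size
        if off_size == 0:
            runs.append((None, length))  # sparse run: nothing allocated
        else:
            delta = int.from_bytes(data[pos:pos + off_size], "little", signed=True)
            lcn += delta
            runs.append((lcn, length))
        pos += off_size
    return runs

# Hypothetical runlist: 24 clusters at cluster 22068, then 16 clusters
# located 16 clusters BEFORE the start of the first run.
sample = bytes.fromhex("21 18 34 56 11 10 F0 00")
print(decode_runlist(sample))  # [(22068, 24), (22052, 16)]
```

Compared with following a FAT chain, every step here depends on variable-width fields, which is the kind of extra parsing work the paragraph above refers to.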
Furthermore, the data content of the file is actually considered just another attribute of the file, so in the end almost all files, except for some small text files, will probably have attributes occupying multiple clusters scattered across the surface of the drive. Add to this the fact that each MFT entry is 1024 bytes in size instead of 32 bytes. Some files have three or four attribute fields, while others have dozens that must be exported to occupy more than one of these bloated 1024 byte entries, and the additional entries can be positioned ahead of the base MFT entry. It is clear that the file system must perform many more calculations to interpret or parse the MFT entry for a file, which bogs it down in overhead of both size and complexity.
NTFS is a journaling file system. It is capable of quickly spotting errors due to incomplete operations that were interrupted by an improper shutdown or system crash, and in many cases it can undo the partially completed transaction. Some data may in the end be lost, but vital operating system files can be saved from permanent corruption that would lead to the need for a complete reinstallation. NTFS also features native compression (nothing new, Drivespace also did this and did it with a superior algorithm), native encryption (very poorly understood by the end user, which can have dire consequences without patient preparation for disaster), native security (unauthorized access to a file will be prevented), and indexing and hard linking, which can improve performance. But these were added to improve its performance because it has such a high overhead in the first place, and they add their own overhead as well.
Many security features and almost all of the server's functions and roles will not run unless the operating system is installed onto an NTFS partition. Furthermore, the Windows 2000 server cannot create any complex hard drive containers (i.e. volumes like a RAID-5 volume) unless they reside on dynamic disks, a further extension of the Microsoft methods of low level organization of hard drives which will be explored later. Most of these must be formatted NTFS in order for all server features to function correctly.
Outside of servers and systems with multiple users that need to hide files from each other, there is no practical reason for using NTFS. A businessman's laptop appears to be safe when files are placed into an encrypted folder, but this is NOT the case. Once in the possession of a thief, all data must be considered compromised. Many spurious programs can obtain the full user list, including the Administrator account which the owner may not even know exists, and they can quickly determine the password or allow it to be changed. Logging on as Administrator allows the user to decrypt all files on the system, meaning the effort of encrypting the folder was quickly and easily thwarted and done in vain.
Nevertheless, almost all OEM PCs featuring Windows 2000 or later will be delivered from the factory using NTFS, and all end users of these products will have NTFS based partitions regardless of whether they are aware of it, whether they know what it means, or whether they need it. And data recovery specialists will be called upon to repair these operating system files and/or rescue users' data files from them.
An NTFS partition occupies a hard drive, generally speaking, similarly to any other type of partition. The MBR layout, particularly the partition tables, size fields, and locations, must be the same because they are read by the boot strap loader code embedded in the motherboard BIOS EEPROM. After that sector all else can deviate, and it does deviate somewhat from any other partition.
The partition table entry for any NTFS partition on a basic disk will include the starting sector's CHS coordinates, the Partition ID byte value of 07h (meaning NTFS, although it was first used by IBM for HPFS, the High Performance File System of OS/2), the ending sector's CHS coordinates, the LBA offset to the starting sector, and the size of the partition in sectors. NTFS can support partitions far larger than the size limit supported by the partition tables. The partition table expresses the partition size as a 32-bit number, the largest of which is FF FF FF FFh = 4,294,967,295 sectors. At 512 bytes/sector this is 2,199,023,255,040 bytes; ÷ 1,048,576 bytes/MB ≈ 2,097,152MB; ÷ 1024MB/GB = 2048GB; ÷ 1024GB/TB = 2TB. In cases where a single partition would be larger than this, the drive could possibly use the 64-bit total partition size field in the VBR, but it would probably have to be converted to a dynamic disk, and additional volume information would be stored in the logical disk manager database at the end of the drive (in the last 1MB of physical sectors on it).
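As a rough sketch of how the fields just listed sit inside a 16-byte partition table entry, the following Python decodes one entry and computes the 32-bit size ceiling. The entry bytes are a made-up example (an active type 07h partition starting at LBA 63), and the function name is hypothetical.

```python
import struct

def parse_partition_entry(entry):
    """Decode one 16-byte MBR partition table entry."""
    # boot flag, start CHS (3 bytes), type ID, end CHS (3 bytes),
    # LBA start (dword), size in sectors (dword); all little-endian
    (boot_flag, start_chs, part_type,
     end_chs, lba_start, sector_count) = struct.unpack("<B3sB3sII", entry)
    return {
        "bootable": boot_flag == 0x80,
        "type": part_type,
        "lba_start": lba_start,
        "sectors": sector_count,
    }

# Hypothetical entry, as it would appear in a hex dump of the MBR.
entry = bytes.fromhex("80 01 01 00 07 FE FF FF 3F 00 00 00 00 00 00 01")
info = parse_partition_entry(entry)
print(info["type"], info["lba_start"], info["sectors"])  # 7 63 16777216

# Largest size the 32-bit sector count can express, in bytes.
MAX_BYTES = 0xFFFFFFFF * 512
print(MAX_BYTES)  # 2199023255040
```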
The starting sector of a type 07h (NTFS) partition is called the Volume Boot Record or VBR. This sector has a similar layout to the FATx DBRs, but it does deviate in the EDPB area. It is also interesting to note that this sector is considered the starting sector of data cluster number zero. It holds two relevant and important pieces of data: the sectors/cluster and the location of the start of the Master File Table or MFT. The first entry in the MFT is the entry for itself, named $MFT.
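A minimal sketch of pulling those two values out of a VBR follows. The offsets used (0Dh for sectors/cluster and 30h for the 64-bit MFT starting cluster) are taken from commonly published NTFS boot sector layouts rather than from this text, so treat them as assumptions. The sample sector is fabricated for the demonstration.

```python
import struct

def mft_location(vbr):
    """Return (sectors/cluster, MFT start cluster, MFT start sector)."""
    sectors_per_cluster = vbr[0x0D]                     # assumed offset 0Dh
    (mft_cluster,) = struct.unpack_from("<Q", vbr, 0x30)  # assumed offset 30h
    # The VBR is the first sector of cluster 0, so the result is a
    # sector offset counted from the VBR itself.
    return sectors_per_cluster, mft_cluster, mft_cluster * sectors_per_cluster

# Hypothetical VBR: 8 sectors/cluster, MFT starting at cluster 786432.
vbr = bytearray(512)
vbr[0x0D] = 8
struct.pack_into("<Q", vbr, 0x30, 786432)
print(mft_location(vbr))  # (8, 786432, 6291456)
```

To turn the last value into an LBA coordinate, the starting LBA of the partition from the partition table would still have to be added.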
The MFT is such a large and complicated structure to parse that the simple little boot strap loader code that fits within this sector cannot do it. To that end, the sixteen sectors that follow it hold the extended or tertiary boot strap loader, which is launched by the VBR code. The tertiary boot strap loader will then find and load the file "ntldr" into RAM and pass control to it.
Ntldr reads boot.ini for operating system options and their locations and can launch other operating systems although its compatibility with non-Microsoft products is marginal at best. Ntldr will launch ntdetect.com and then the kernel if the native OS is chosen. Ntldr will also launch ntbootdd.sys but only if this driver is needed in the case of non-conventional SCSI based boot devices. If the SCSI controller loads a fully INT 13h compatible BIOS during the POST, then it is possible that ntbootdd.sys will not be present or necessary.
Basic layout of the hard drive holding a single NTFS partition setup on a basic disk:
|...||End MFT||...||ntldr data||...|
The field at offset 0Eh holds the number of reserved sectors, which is also the starting sector of the first FAT. Since this value reads "00 20" (20h), the starting sector of the first FAT is sector number 32 counting from the DBR, which is sector #0. This is not always the case; FORMAT may position the start of the first FAT anywhere from sector #2 to sector #65535. It is a logical offset from the DBR and is NEITHER an LBA coordinate NOR a geometric coordinate. If and when the starting sector of the first FAT is needed, its geometric and/or LBA coordinate will have to be calculated.
The field at offset 10h specifies the number of FATs. For all true magnetic storage disks DOS keeps two FATs back to back at the beginning of the partition. This value when working on the HDD should always be "02".
The field at offset 11h indicates the maximum number of root directory entries. For FAT32 file system based partitions the size and location of the root directory are NOT fixed as they are in FAT16. The root directory is considered just another file, like a subdirectory. Because of this, the root directory can grow and therefore fragment, and it is therefore unlimited in size and in the number of entries that it can hold. It is, however, HIGHLY preferable to keep it from becoming fragmented, and its number of entries should be kept below the number that would fit into one cluster. Since the cluster size of this partition (less than 8GB in size) is 4KB and each entry is 32 bytes in size, just like FAT16 based file systems, then 4096/32 = 128. The root should never exceed 128 entries (files and subdirectories). Since the FAT32 root has no limit to the number of entries it can hold, this field is left zeroed.
As with FAT12 and FAT16, FAT32 also stores both FATs back to back. So if the starting sector of the first FAT is known (indicated by the field at offset 0Eh), the number of FATs is known (indicated by the field at offset 10h), their sizes are known (in an upcoming field), and DOS always stores the FATs back to back, then the starting sector of the first data cluster of the partition will be known precisely. The root directory's starting cluster is listed in an upcoming field here in the DBR, and this is how the file system will be able to locate its start. It gets FAT table entries as well, which is how the file system can pursue it if it grows larger than one cluster.
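The calculation described above can be written out as a short sketch. The function names are hypothetical; the sample values are the ones this module reports for the example partition (32 reserved sectors, 2 FATs, 1,267 sectors per FAT), and the 8 sectors/cluster figure assumes the 4KB cluster size mentioned earlier.

```python
def first_data_sector(reserved_sectors, num_fats, sectors_per_fat):
    """Sector (counted from the DBR) where data cluster #2 begins."""
    return reserved_sectors + num_fats * sectors_per_fat

def cluster_to_sector(cluster, data_start, sectors_per_cluster):
    """Logical sector of a data cluster; the first usable cluster is #2."""
    return data_start + (cluster - 2) * sectors_per_cluster

data_start = first_data_sector(32, 2, 1267)
print(data_start)                           # 2566
print(cluster_to_sector(3, data_start, 8))  # 2574
```

Both results are logical offsets from the DBR, so the Hidden Sectors value would have to be added to obtain an LBA coordinate.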
The field at offset 13h holds the total number of sectors in the partition. It is, however, a word (16-bit number) and is therefore limited to a maximum value of 65,535, which multiplied by 512 bytes/sector = 33,553,920 bytes or roughly 32MB. This is a deprecated field that was used in DOS Partition ID Type 04h partitions. Type 04 FAT16 file systems do NOT support clusters, so each FAT entry points to one sector and not to a cluster, which is a group of sectors. Since all FAT32 partitions are larger than 32MB and use Partition ID types other than 04h, this field will be left zeroed.
The field at offset 15h is also a somewhat deprecated field used to indicate the type of physical drive that this DPB is located on. For hard drives the value is F8h. All DPBs of hard drives should indicate F8h at this offset. It has very little functional value within the DPB, but it is good for verifying the DPB. It should not be any other value.
The field at offset 16h indicates sectors/FAT or the size of each FAT. However, FAT32 supports huge file allocation tables, and as such this field is NOT used by FAT32 and is left zeroed. The true FAT size field, coming up, is a 32-bit field of the FAT32 EDPB.
The field at offset 18h is the sectors/track for geometric coordinate access of sectors in the partition. It will match the translations being used by the BIOS on the HDD. This value will be verified later.
The field at offset 1Ah is the total number of heads being used for geometric coordinate access of sectors in the partition. It will match the translations being used by the BIOS on the HDD. This value like the preceding one will be verified later.
The field at offset 1Ch of the DPB is a double word or 32-bit value indicating another cryptic value called "Hidden Sectors." Do not forget that the four bytes must be byte reversed as they are read off of the DEBUG display on screen, so the "3F 00 00 00" on screen becomes "00 00 00 3F". This field indicates the logical offset of THIS DBR from the sector that holds the partition table that defines the partition. In this case the sector that holds the partition table where this partition is defined is the MBR itself. So if the MBR is considered LBA sector #0, then this sector is LBA sector 3Fh (63).
The Hidden Sectors field is a relative location. It is the location of this DBR relative to the location of the MBR in this case. Other sectors besides the MBR can define partitions, which is why it is important to bear in mind that this is a relative offset from the sector that contains the partition table that defines its location. This value MUST match the field at offset 08h of the partition table entry that defined this partition. This is the case here, and it is one of the main fields used in rebuilding the DPB and/or the MBR partition table.
The field at offset 20h indicates the total size of the partition in sectors. If it is a FAT32 partition, it will be larger than 32MB. Any partition larger than 32MB will cause FORMAT to use this field, but only if the partition is not a type 04. Remember that this is a single double word value (32-bit) and must be byte reversed as read off of the sector in the DEBUG screen. The value "86 FA 3F 00" that was found will be recorded as "00 3F FA 86" on the worksheet. This value must match the value of the field at offset 0Ch of the partition table entry that defines this partition. This is another tool for verifying whether the DPB is corrupt, and for rebuilding either the DBR or the MBR. In rare cases this field may be slightly smaller than the size found in the MBR's partition table. In this case, the two values do match.
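The byte reversal and the cross-check against the partition table can be sketched as follows. The helper names are hypothetical; the byte strings are the ones read off the DEBUG screen in this example, and the partition table values passed in are assumed to be the matching ones.

```python
def dword_from_debug(hex_bytes):
    """Convert four bytes as shown by DEBUG (low byte first) to an integer."""
    raw = bytes.fromhex(hex_bytes)
    if len(raw) != 4:
        raise ValueError("a double word is exactly four bytes")
    return int.from_bytes(raw, "little")

def dpb_matches_partition_table(hidden, total, pt_lba_start, pt_sectors):
    """Compare DPB fields 1Ch and 20h with partition table fields 08h and 0Ch."""
    return hidden == pt_lba_start, total == pt_sectors

hidden = dword_from_debug("3F 00 00 00")
total = dword_from_debug("86 FA 3F 00")
print(hidden, total)  # 63 4192902
print(dpb_matches_partition_table(hidden, total, 63, 4192902))  # (True, True)
```

A strict equality test is used for the size; for the rare partitions where the DPB value is slightly smaller, the second result would be False and would need to be judged by hand.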
The field at offset 24h is the first field of the FAT32 EDPB and holds a different piece of data from the FAT16 EDPB. The field at this location is a 32-bit (4 bytes reversed) value that holds the FAT size in sectors. In this case the value is 00 00 04 F3 = 1267. So each FAT is 1,267 sectors in size, and these are relatively small file allocation tables for FAT32. A 2TB partition would have file allocation tables (in theory) occupying 524,288 sectors each (that is over 260 million bytes in size!). We'll have to wait until someone strings together a hardware level 2TB RAID and partitions it FAT32, or until the HDD manufacturers release a cheap 2TB drive, before we can know for sure if FAT32 can actually handle a partition this size.
The field at offset 28h of the FAT32 DBR holds the flags. This is a word sized field, so reverse the bytes and then find the meaningful bits within it. Bit 7 indicates if FAT mirroring is enabled. Note that for FAT12 and FAT16 it is always enabled: any activity is automatically updated to BOTH copies of the FAT simultaneously. Under normal conditions the FAT32 second copy is also updated automatically (FAT mirroring is therefore enabled). If bit 7 = 0 then mirroring is enabled. If it is "1" then it has been disabled. If it is disabled, then bits 0 - 3 indicate which FAT is the active and therefore reliable File Allocation Table. For example, if the flags field reads: 81 00, then first reverse the bytes: 00 81. Then convert to binary and find bit 7:
BIT    15 14 13 12 11 10 09 08 07 06 05 04 03 02 01 00
VALUE   0  0  0  0  0  0  0  0  1  0  0  0  0  0  0  1
                                ^
Bit 7 indicates that FAT mirroring has been disabled. Bits 0 to 3 hold the value 0 0 0 1, which is binary for the number "1". Microsoft's FAT specification counts the active FAT from zero, so this value designates the second FAT as the active FAT. In our example, the flags field is "00 00", indicating that FAT mirroring is enabled, and this will usually be the case.
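The same bit test can be sketched in code. The function name is hypothetical; it takes the two bytes exactly as they appear on the DEBUG screen and returns whether mirroring is enabled together with the raw number held in the low four bits (the four bits written out in the example above).

```python
def decode_fat32_flags(raw):
    """Decode the word at offset 28h, given as the two on-disk bytes."""
    flags = int.from_bytes(raw, "little")   # "81 00" on screen -> 0081h
    mirroring_enabled = (flags & 0x80) == 0  # bit 7 clear means mirrored
    active_fat = flags & 0x0F                # raw value of the low four bits
    # The active FAT number is only meaningful when mirroring is disabled.
    return mirroring_enabled, (None if mirroring_enabled else active_fat)

print(decode_fat32_flags(bytes.fromhex("81 00")))  # (False, 1)
print(decode_fat32_flags(bytes.fromhex("00 00")))  # (True, None)
```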
The field at offset 2Ah of the FAT32 DBR is the FAT32 file system version. This word sized field is almost always "00 00" because Microsoft never changed the file system once they began to concentrate all of their efforts on NTFS, which would ultimately replace FAT32.
The field at offset 2Ch is a double word (32-bit, or 4 byte) value holding the starting FAT entry (cluster number) of the root directory. Cleanly formatted new FAT32 partitions will always place the start of the root directory in the first usable cluster: data cluster #2. Partitions converted from some other file system, such as FAT16, may have the root start at some other cluster. If the root does not start in cluster #2, it is a good indication that the partition has been converted from some other file system either by Microsoft's CONVERT.EXE or a third party utility like Partition Magic. In our example, the value is 00 00 00 02, the expected value indicating that the root directory begins in cluster #2.
The field at offset 30h of the FAT32 DBR is a word sized value that holds the offset to the FSI (File System Information) sector. This is a second sector that has been added to the DBR and holds optimization information for the Windows 98+ file system drivers to use in writing new files to the partition. It will be examined in detail in another module. In the example the value read off of the DEBUG screen was: 01 00, which when byte reversed is 00 01, indicating that the sector that immediately follows the DBR is the FSI sector.
The field at offset 32h of the FAT32 DBR is a word sized value that holds the offset to the backup DBR. This sector is a fully functional duplicate of this sector, and the backup FAT32 DBR will launch the operating system in the event that this DBR is completely zeroed. The only way that this could work is if the MBR 1st stage OS loader code were built to launch the backup in case the primary DBR were found to be missing. This means that the MBR code dictates the location of the backup DBR and it is probably inflexible. The value usually found is: "00 06"
The field at offset 34h of the FAT32 DBR is listed as reserved and usually contains all zeros. It is 12 bytes in length.
The field at offset 40h of the FAT32 DBR consists of the fields originally found in the FAT16 DBR's EDPB and starts with the BIOS drive number identity of the hard drive on which the partition was at the time it was formatted. In the example the value found was 80h indicating that it was most likely the master on the primary ATA channel.
The field at 41h is usually 0, but it does mean the "current head." The file system drivers may temporarily insert this value into this location during large operations, but it is a transient and therefore unreliable value that has little practical use in data recovery operations on the partition's contents.
The field at 42h is the EDPB signature, usually 29h as was observed above.
The field at offset 43h of the FAT32 DBR holds the volume serial number, a randomly generated 32-bit (4 reversed bytes) number created by FORMAT and inserted into this field. While DOS has almost no use for the value, Windows file system drivers do use it, and changing it can have disastrous effects on the Windows NT family of operating systems (i.e. Blue Screens of Death and failure to launch the operating system will occur).
The field at offset 47h holds the volume label as 11 ASCII encoded bytes. Like FAT16, it will hold the value "NO NAME" followed by four spaces if none is specified by the user during the formatting process.
The field at offset 52h holds the file system type as 8 ASCII encoded characters. For FAT32 it MUST read: "FAT32" followed by three spaces as it does in this example.
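The fields from offset 40h onward can be read in one pass, as in the sketch below. The function name and the sample sector are hypothetical (the serial number is an arbitrary value chosen for the demonstration); the offsets and sizes are the ones given in the text.

```python
import struct

def parse_fat32_edpb_tail(dbr):
    """Read the FAT32 EDPB fields located at offsets 40h through 59h."""
    # drive number (40h), current head (41h), signature (42h), serial (43h)
    drive, current_head, signature, serial = struct.unpack_from("<BBBI", dbr, 0x40)
    return {
        "drive": drive,
        "signature": signature,
        "serial": f"{serial >> 16:04X}-{serial & 0xFFFF:04X}",
        "label": dbr[0x47:0x52].decode("ascii"),    # 11 bytes
        "fs_type": dbr[0x52:0x5A].decode("ascii"),  # 8 bytes
    }

# Build a hypothetical sector holding typical values.
dbr = bytearray(512)
dbr[0x40] = 0x80
dbr[0x42] = 0x29
struct.pack_into("<I", dbr, 0x43, 0x1234ABCD)
dbr[0x47:0x52] = b"NO NAME".ljust(11)   # space-padded to 11 bytes
dbr[0x52:0x5A] = b"FAT32".ljust(8)      # space-padded to 8 bytes

fields = parse_fat32_edpb_tail(dbr)
print(fields["serial"], repr(fields["fs_type"]))  # 1234-ABCD 'FAT32   '
```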
Like the FAT16 DBR, the FAT32 DBR holds the mapping to the layout and location of all important file system structures within the partition. Although FAT32 is slightly different and as such there are some extra fields the concepts are the same. In the following exercise, the information contained within this DBR will be used to peruse the partition and its structures.
In working with a FAT32 based partition, it will be necessary to verify values found within the FAT32 DPB within the DBR. The following table is a foundation that holds typical values for the various fields:
|Offset||Size||Typical value||Description|
|00h||3 bytes||EB 58 90||Jump Instruction to bypass the DPB|
|03h||8 bytes||"8 ASCII CHARS"||File System Driver Signature|
|0Bh||1 word||200h ("00 02")||Bytes/Sector|
|0Eh||1 word||(varies)||Reserved Sectors|
|10h||1 byte||02||# of FATs|
|11h||1 word||0 ("00 00")||max. # of root dir entries|
|13h||1 word||0 ("00 00")||total sectors (partitions < 32MB)|
|15h||1 byte||F8||Media Descriptor|
|1Ah||1 word||(varies)||total heads|
|20h||1 dword||(varies)||total sectors (partitions > 32MB)|
|2Ch||1 dword||2 ("02 00 00 00")||Root start cluster|
|30h||1 word||1 ("01 00")||offset to FSI sector|
|32h||1 word||6 ("06 00")||offset to backup DBR|
|34h||12 bytes||0's||reserved field|
|40h||1 byte||80 (rarely: 81, etc.)||BIOS drive #|
|43h||1 dword||xx xx xx xx||volume serial #|
|47h||11 bytes ASCII||"NO NAME    "||Volume Label (irrelevant)|
|52h||8 bytes ASCII||"FAT32   "||File System Type|
The red boxes must be calculated manually as part of the verification process. The green boxes match the BIOS translations in effect at the time the drive was originally formatted. These may have been accidentally incorrect, leading to the current data recovery scenario. This situation will be further discussed in the lecture review of translation methods.
The blue boxes must match the values in the partition table that defined this partition. In some rare cases, especially when the partition uses the entire physical disk, the partition size may not match; the difference, however, should be slight and within reason.
All other fields in the above table should be as listed or they are irrelevant to the accessibility of the partition from the DOS prompt. This concludes the preliminary investigation into the verification of the DPB and EDPB within the DBR of a FAT32 partition.
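The fixed-value checks from the table lend themselves to a small verification routine. The sketch below is an illustration under stated assumptions, not a complete validator: the list `CHECKS` and the function name are hypothetical, only fields with a single typical value are tested, and fields that vary per partition (reserved sectors, geometry, sizes) are left for manual verification as described above. The file system type check follows the text's requirement that the field read "FAT32" plus three spaces.

```python
import struct

# (offset, struct format, expected value, description)
CHECKS = [
    (0x00, "3s", b"\xEB\x58\x90", "jump instruction"),
    (0x0B, "<H", 512, "bytes/sector"),
    (0x10, "<B", 2, "number of FATs"),
    (0x11, "<H", 0, "max root dir entries"),
    (0x13, "<H", 0, "total sectors (16-bit)"),
    (0x15, "<B", 0xF8, "media descriptor"),
    (0x30, "<H", 1, "offset to FSI sector"),
    (0x32, "<H", 6, "offset to backup DBR"),
    (0x52, "8s", b"FAT32   ", "file system type"),
]

def check_fat32_dpb(dbr):
    """Return a list of fields whose values differ from the typical ones."""
    problems = []
    for offset, fmt, expected, name in CHECKS:
        (found,) = struct.unpack_from(fmt, dbr, offset)
        if found != expected:
            problems.append(
                f"{name} at {offset:02X}h: found {found!r}, expected {expected!r}")
    return problems

# Build a hypothetical sector that passes, then damage one field.
good = bytearray(512)
for offset, fmt, expected, _ in CHECKS:
    struct.pack_into(fmt, good, offset, expected)
bad = bytearray(good)
bad[0x15] = 0xF0   # wrong media descriptor

print(check_fat32_dpb(good))  # []
print(check_fat32_dpb(bad))
```

A mismatch reported here is only a prompt for closer inspection; some values, such as the backup DBR offset, are typical rather than mandatory.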
Copyright © 2000-2006 Brian Robinson ALL RIGHTS RESERVED