Find the CHKCPU.ZIP utility on the Student CD-ROM. This is a free download from the Internet and we should all be thankful to Mr. Steunebrink for his contribution to the world. Unzip it to the HDD and open a DOS box and run it. Record its output information in the module below. Note that the program cannot reveal the clock multiplier while in Protected mode. Restart in MS-DOS mode and run the program again to get this information.
C:\BIN>chkcpu
CPU Identification utility v1.9 (c) 1997-2002 Jan Steunebrink
---------------------------------------------------------------------------------
CPU Vendor and Model: Intel Celeron 4 1700/1800 E0-step
Internal CPU speed : 1714.3 MHz (using internal Time Stamp Counter)
Clock Multiplier : Available only in Real Mode!
CPU-ID Vendor string: GenuineIntel
CPU-ID Name string : Intel(R) Celeron(R) CPU 1.70GHz
CPU-ID Signature : 0F13
|||+-- Stepping or sub-model no.
||+--- Model: Indicates CPU Model and 486 L1 cache mode
|+---- Family: 4=486, Am5x86, Cx5x86
| 5=Pentium, Nx586, Cx6x86, K5/K6, C6, mP6
| 6=PentiumPro/II/III, CxMII/III, Athlon, C3
| F=Pentium4
+----- Type: 0=Standard, 1=Overdrive, 2=2nd Dual Pentium
Current CPU mode : Protected
Internal (L1) cache : Enabled in Write-Back mode
C:\BIN>_
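The signature legend in the output above can be read off one hex digit at a time. The following is a minimal Python sketch, assuming the four-hex-digit layout that CHKCPU prints (type, family, model, stepping); the function name decode_signature is made up for illustration and is not part of CHKCPU.

```python
def decode_signature(signature):
    """Split a 4-hex-digit CPU-ID signature string into its four fields.

    Assumes the layout shown in the CHKCPU legend: one hex digit each for
    type, family, model and stepping, from left to right.
    """
    value = int(signature, 16)
    return {
        "type": (value >> 12) & 0xF,     # leftmost digit
        "family": (value >> 8) & 0xF,    # F (15) is listed as Pentium 4
        "model": (value >> 4) & 0xF,
        "stepping": value & 0xF,         # rightmost digit
    }


print(decode_signature("0F13"))
```

For the signature 0F13 recorded above this gives type 0 (Standard), family F, model 1 and stepping 3.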
In order to understand how the IBM PC works it is necessary to understand that the PC is a computer system centered around the Intel 8088 CPU (Central Processing Unit), and that this CPU is in fact the computer itself. The rest of the PC exists for the purpose of allowing information to enter this computer, be stored by the system, be retrieved by the system, be processed through this computer, and be displayed to the user. The computer described here is the 8088 microprocessor, the ancestor of all of the subsequent CPUs developed by Intel for the IBM personal computer family of systems, starting with the IBM PC, then the XT, then the AT. IBM continues to have a huge influence on the personal computer market despite the fact that its sales plummeted in the early 1990s and have since fallen to a single-digit percentage of the market.
One of the most important concepts for the PC technician to know about the processors at the heart of each successive generation of this family of personal computers is that, up to and including the Pentium 4, they are ALL backwards compatible. This means that the Pentium 4 can execute 8088 executable programs and therefore understands 8088 machine language. The 8088 instruction set is a subset of the complete Pentium 4 instruction set. Put the other way around, the Pentium 4 instruction set is a superset of the instruction sets of the preceding x86 family processors, which trace all the way back to the 8088. Later we will see that the 8088 memory addressing scheme is completely different from the Pentium 4's native scheme. Because of this, the Pentium 4 powers on or resets into 8088 emulation, in which it is effectively nothing more than a very fast 8088.
When any CPU from the x86 family is running in 8088 emulation this is referred to as real mode (I suppose the term means it is acting like a real 8088?). When the CPU is switched from real mode into its native mode of operation (in the case of the Pentium 4, switched from acting like an 8088 to acting like a Pentium 4), this is referred to as protected mode. This term refers directly to one of the main features of the way the native 32-bit CPUs manage memory, which we will investigate later.
Because the machine starts up in real mode, all BIOS code is written for real mode (modern BIOSes are including more and more protected mode instructions), and all of DOS is written for real mode. There are a few exceptions in the later versions: DOS 6.22 includes HIMEM.SYS, which provides access to extended memory through a protected mode interface to any program that knows how to use this API – Application Programmer's Interface – and which is written in the 80386 native instruction set. Therefore, learning the 8088 processor as the example has real world value to the PC expert, since so much code exists in modern systems for it and all modern CPUs reset to emulate it.
The activity of the 8088 CPU will be described only partially, enough to get the grasp of the machine necessary for this lecture; it will not be fully described here since it is actually quite a complex computer in its own right. The 8088 CPU has many 16-bit wide registers of several different categories of functionality, a 20-bit wide memory address bus and an 8-bit wide data bus. This narrow data bus is handled transparently by the CPU's bus interface, so that programs can treat it as if it were 16 bits wide. The narrow bus is one of the main reasons that the 8088 was chosen over the 8086, whose true 16-bit wide data bus would have driven the cost of the motherboard, chipset, and expansion cards so high that the system would not have remained affordable.
Each block of the registers in the diagram is actually four binary bits wide. A register is simply a special set of circuits that can hold and also manipulate a large binary place value number. Since each block in the diagram represents 4 binary digits, it can be seen that the general purpose registers named "ax" and "bx" are actually 16 bits wide, as are the segment registers "ds" and "cs" and the pointer registers "si" and "ip".
The physical data bus register exists physically in the CPU but not logically. That is, it can never be referred to directly by the programmer in the machine language instructions and is under the exclusive control of the CPU itself. The same is true of the physical memory address register although the programmer can exert complete control on the registers used to calculate its value.
The diagram illustrates two of the 8088's general purpose registers, named "ax" and "bx". These are where a number can be brought onboard the CPU in order for a mathematical or logical operation to be performed on it. If a program for this machine is to add two numbers together, the numbers must be brought onboard into these registers, then they can be added, then the result can be stored back out in memory. This is true of all of the x86 family processors from the 8088 up to the modern processors.
This diagram also illustrates two of the 8088's "segment" registers named the "ds" (data segment) and the "cs" (code segment). These are used to point to the place in memory where the data is stored (the ds) and to where the program that the machine is currently executing is stored (the cs). The 8088 has one more extremely important segment register called the "ss" or Stack Segment register, which will be described later.
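The load, add and store sequence described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: RAM is modeled as a dictionary holding whole 16-bit values, the variables ax and bx merely stand in for the registers, and the function name and addresses are invented for the example.

```python
def add_via_registers(ram, addr_a, addr_b, addr_result):
    """Add two numbers held in RAM the way the text describes it."""
    ax = ram[addr_a]              # bring the first number onboard into ax
    bx = ram[addr_b]              # bring the second number onboard into bx
    ax = (ax + bx) & 0xFFFF       # add; a 16-bit register cannot hold more than FFFFh
    ram[addr_result] = ax         # store the result back out in memory
    return ax


ram = {0x0100: 1200, 0x0102: 34, 0x0104: 0}
add_via_registers(ram, 0x0100, 0x0102, 0x0104)
print(ram[0x0104])  # 1234
```

The mask with FFFFh shows one practical consequence of the 16-bit register width: a sum larger than 65,535 does not fit and wraps around.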
This diagram also illustrates two of the 8088's "pointer" registers, the "si" (source index) and the "ip" (instruction pointer). The si register points to a particular address within the data segment of memory which the CPU would access. This is how the CPU will read a data value in, or write a data value back out to memory. The x86 family is certainly not limited to the use of the si register for this, it has many addressing schemes for accessing data in RAM; far too many to discuss here. The ip register points to the particular instruction that the CPU is about to fetch, then decode and execute within the program stored in the code segment of memory.
So at this point it is important to take a closer look at memory since it appears that this machine is going to be working with it quite a lot according to the above descriptions of the registers. In fact, the CPU must have memory in order to function and basically does nothing but execute programs loaded into memory, and read raw data that has also been loaded into memory, process this data and then write the results of this processing back out to memory.
The memory referred to here is RAM - Random Access Memory. This is an array of high speed digital electronic circuits that can store the binary numbers that are the numbers that the CPU works with. The CPU will treat some of these numbers as instructions for it to perform and it will treat others as data and information that those instructions have it manipulate.
The machine only knows which is which by having the proper program instructions loaded into the proper location in memory and by having the proper data values loaded into the proper location in memory as well. When designing programs for this machine, the programmers are entirely responsible for seeing that all of this is done correctly: setting up the programs and data in memory, and setting up the values in the segment and pointer registers so that these segment/pointer register combinations point to the locations in memory that hold what they are supposed to hold. The program that the machine is running should reside at the memory address pointed to by the cs:ip combination, and the data being worked on by that program should be pointed to by the ds:si combination.
The address select line comes from the RAM address demultiplexer so that whatever binary number is represented on the address bus, the demultiplexer will decode it and activate one specific address select line, the one for the cell whose address matches the binary number on the address bus.
The state of the Read/Write select line coming to the cell is replicated from the state of the "memrw" line which comes directly out of the CPU. This is how the CPU indicates to memory whether it is reading from memory or writing to memory.
Each cell is individually connected to the data bus lines which run throughout the memory array. If the CPU is indicating address (00000000000000000)010 on the physical address bus, then the address select line to the middle cell of the illustration above would come on. If the CPU is attempting to read the data held in this cell, this would be indicated by the state of the Read/Write line at that cell as well and the address control circuit would switch on connections (transistors as we have seen) between the voltages held in the individual bit cells to the data bus wires thus exposing the contents of this particular memory cell to this bus. The CPU is also attached to these bus lines and would "catch" these voltages at the physical data bus register. This would complete a single RAM memory read activity.
If the state of the CPU's memrw line were indicating a write (note that the state of this line is either a "0" or a "1" meaning either a read of the cell or a write to the cell) then the CPU will place the number that it wants to store in the cell into the physical data bus register and then activate the set of switches (transistors) so that these voltages appear on the data bus lines. Then the address bus would indicate which cell is being written to and the demultiplexer circuits will select it. The state of the memrw line will indicate that the CPU is writing to the cell and so the address control circuit will clear the individual bits of the cell and again switch the bit cells so that they are in contact with the data bus. This time the cells will take on the voltages present on the data bus lines thereby storing the number they represent in the memory cell thus completing a CPU write to memory.
In the situation illustrated here, the CPU has placed the address "100" onto the address bus and the RAM demultiplexer circuit has activated the address select line for the memory cell #4 (100b = 4d):
The memrw line is indicating that this is a memory read so the address control circuit (the blue box) connects the individual bit cells of this RAM address to the data bus lines and the voltages appear on these wires. The CPU's physical data bus register (the boxes in the lower right corner of the CPU) can then "catch" these voltages.
In the case that the CPU is writing to the RAM location, again the address of the location that the CPU wishes to write to is placed on the address bus and the RAM demultiplexer selects the memory location. Note that the CPU will also have the number that it wishes to write to this location staged in its physical data bus register:
At this point the memrw line indicates that this is a write from the CPU to the memory location. So the address control circuit (the blue box) will clear the bits and connect them to the data bus. The bit cells will absorb the voltages on these lines thereby completing the write to this RAM location from the CPU:
Examining this section of RAM it can be seen that each address will be activated only when the actual address number has been placed on the address bus by the CPU. The address demultiplexer circuit of the RAM chip will determine which unique address select line to turn on, which of course will activate the desired memory cell. The CPU has an interface wire coming from it called the "memrw" line (memory read/write). The state of this line will also be forwarded to the cell control circuit so that the memory cell will either expose the voltages held in the individual bit circuits that make up the memory cell onto the data bus wires, effectively a read of its contents, or the bit circuits will absorb the voltage states on the data bus, effectively writing a number from the CPU to them. (Note that all of this is not totally literally true, but it is all close enough to the truth for a PC technician.)
A memory location for the 8088 CPU consists of an address control circuit and the eight bit cell circuits connected to it that hold the byte that it can store. It can hold one 8-bit binary number that can be either one piece of an instruction of a program or one piece of data that the program will manipulate. The entire amount of RAM that the 8088 CPU can address (its maximum amount of RAM supported) is determined by how the segment registers and pointer registers are used. The 8088 shifts the segment register value four bits to the left and then adds the pointer to this. This sounds bizarre (and in fact it is), but by introducing the segment registers Intel made it possible for the 8088 to construct a number larger than the 16 bits held in its registers so that very large programs and/or very large amounts of data could be handled by the CPU. Let's assume that the code segment register currently holds the value 1000h and that the instruction pointer currently holds the value 2345h. Now the CPU will shift the cs value four bits to the left and then add the ip register's value to this:
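The read and write cycles walked through above can be mimicked with a toy model. The sketch below is an assumption-laden simplification, not a description of real RAM hardware: the class name TinyRam and its cycle method are invented, list indexing stands in for the address demultiplexer, and the strings "r" and "w" stand in for the two states of the memrw line.

```python
class TinyRam:
    """Toy RAM array: one 8-bit cell per address."""

    def __init__(self, address_bits):
        # An address bus of n wires can select 2**n distinct cells.
        self.cells = [0] * (2 ** address_bits)

    def cycle(self, address, memrw, data_bus=0):
        """Run one bus cycle and return the value left on the data bus.

        address  -- the number the CPU placed on the address bus
        memrw    -- "r" for a read, "w" for a write
        data_bus -- the value the CPU staged on the data bus (writes only)
        """
        # Indexing the list selects exactly one cell, like the demultiplexer.
        if memrw == "w":
            self.cells[address] = data_bus & 0xFF   # the cell absorbs the bus value
        return self.cells[address]                  # the cell exposes its contents


ram = TinyRam(address_bits=3)
ram.cycle(0b100, "w", data_bus=0x5A)     # CPU writes 5Ah to cell #4
print(hex(ram.cycle(0b100, "r")))        # CPU reads it back: 0x5a
```

A 3-bit address bus is used only to match the small illustrations; the same model with 20 address bits would cover the full range the 8088 can address.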
CS = 1000h, IP = 2345h
Remember that each hex digit is four binary bits so shifting the CS four bits to the left is the same as shifting it one hex digit to the left:
CS <- 1 hex digit  =  10000h
Add the IP to it   +   2345h
Physical address   =  12345h
From this it can be seen that since the cs register is shifted four bits to the left that the code segment of RAM pointed to when the cs register is 1000h is actually 10000h and that the physical address bus is actually 20 bits wide. The highest 20-bit wide number is 11111111111111111111b = 1,048,575. So the maximum amount of RAM that the 8088 can address is 1,048,576 addresses (remember that 00000h is also a valid address and is the 1st address making 00001h the second and FFFFFh the 1,048,576th address). It is the physical address bus register that will be fed the current value of the segment and pointer registers when any RAM address must be read from or written to and it will automatically perform this bit shift and calculate the physical address and place it onto the address bus.
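The shift-and-add calculation above is easy to check in code. This is a minimal Python sketch; the function name physical_address is made up for illustration, and the 20-bit mask is an added detail reflecting the 20 address lines (a sum past FFFFFh cannot be represented on them and wraps around).

```python
def physical_address(segment, offset):
    """Real mode address: segment shifted left 4 bits, plus the offset."""
    return ((segment << 4) + offset) & 0xFFFFF   # keep only 20 bits


print(f"{physical_address(0x1000, 0x2345):05X}h")   # 12345h
print(2 ** 20)                                       # 1048576 addressable locations
```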
We also see that each memory cell's individual bit cells attach to the data bus whose wires ultimately lead back to the 8088 CPU's physical data bus register. When the CPU wishes to read a value from a particular memory location, the proper segment and pointer register values are given to the physical address bus register which calculates the physical address and places it on the address bus.
It is clear then that the voltages on these wires that are held within the bit cell circuits are synonymous with the binary number that they represent. What is not clear is that the transistors that make up these circuits do not know this. All of these circuits are nothing more than collections of switches simply responding to the inputs of the other switches, and they change states and send voltages or lack of voltages, or "on" and "off" states, back and forth between each other. It is only the humans who have designed them to do this in a particular way that make these voltages take on the interpretation by humans as binary digits of value "1" or "0". Because of this it is interesting to note that in some circuits the lack of voltage can be assigned by the designers to mean "1" and the presence of voltage to mean "0", even though you would expect no voltage to mean "0" and some voltage to mean "1". Again it is the humans who assign these meanings; the circuits just switch themselves based on what the other circuits have sent them a signal to do.
Now that it is seen how the machine reads and writes values to and from RAM, it is possible to understand the full operation of the 8088 and the machine will be followed through the execution of a few machine language program instructions so that the student can fully understand this computer and how it works; in real mode for now.
As mentioned earlier, the 8088 operates using the three segment registers described here (although only two were illustrated to keep the diagram as simple as possible). The cs or code segment register is a 16-bit register whose value is actually shifted 4 bits to the left when an address is formed. The cs register could hold the value 1234h, for example, but shifted 4 bits to the left (one hex digit) it is really the number 12340h, which is the physical memory address in RAM. Adding 1 to this register results in the number 1235h and, after the 4 bit shift, 12350h. Since the 4 low bits supplied by the shift can count from 0 to 15, the cs register can only point to specific addresses in RAM on boundaries that occur every 16 bytes. These 16 byte blocks are called paragraphs, and the addresses where they begin (like 12350h) are called paragraph boundaries.
With the cs register set to 1235h it is pointing at the RAM memory segment that begins at the physical address paragraph 12350h. If the ip register is 0000h then the physical address bus register will hold the value 12350h which is the lowest possible physical address that the cs:ip combination can address as long as the cs = 1235h. This then is the starting point or absolute bottom address of RAM segment 1235h.
If the ip register holds the number FFFFh, then the physical address bus register will add 12350h + FFFFh = 2234Fh. This would be the highest possible physical RAM address that could be accessed while the cs register stays set to 1235h and is therefore the absolute top address of the RAM segment 1235h. Subtracting the bottom address (12350h) from the top address (2234Fh) yields FFFFh, which makes sense when one realizes that the range of addresses within the segment is spanned by the ip register, which is 16 bits wide, and FFFFh is the largest number it can hold. Converted to decimal it is 65,535. Counting the zero address at 12350h as well yields 65,536 or 64KB. This is the size of a single RAM segment of the 8088 in real mode addressing.
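The bottom address, top address and size of segment 1235h can be verified with a short Python sketch. The function name segment_bounds is invented for this example, and the sketch ignores what happens for segments so high that the top would pass FFFFFh.

```python
def segment_bounds(segment):
    """Return (bottom, top) physical addresses reachable through one segment value."""
    bottom = segment << 4        # pointer register = 0000h
    top = bottom + 0xFFFF        # pointer register = FFFFh
    return bottom, top


bottom, top = segment_bounds(0x1235)
print(f"bottom = {bottom:05X}h, top = {top:05X}h")   # bottom = 12350h, top = 2234Fh
print(top - bottom + 1)                               # 65536 addresses, i.e. 64KB
```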
A program can easily execute instructions in a variety of ways to change either the code segment or the data segment or even the stack segment to any value from 0000h to FFFFh and therefore is definitely not limited to a single 64KB segment of RAM. Any program can use any number of segments (note that they can be defined every 16 bytes and are 64KB deep so multiple segments can easily overlap often to the unwary programmer's dismay) and can access any physical address within the addressable megabyte spanning from 00000h to FFFFFh.
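Because a new segment can begin every 16 bytes while each one is 64KB deep, the same physical byte can be reached through many different segment and pointer combinations. The sketch below (Python, with an invented helper name and example values chosen for illustration) shows three such pairs landing on one address.

```python
def to_physical(segment, offset):
    """Segment shifted left 4 bits plus offset (no wraparound handling)."""
    return (segment << 4) + offset


# Three different segment:offset pairs chosen to overlap on purpose.
pairs = [(0x1234, 0x0010), (0x1235, 0x0000), (0x1000, 0x2350)]
for segment, offset in pairs:
    print(f"{segment:04X}:{offset:04X} -> {to_physical(segment, offset):05X}h")
```

All three lines print 12350h, which is exactly how an unwary programmer can end up overwriting code through a data pointer.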
Armed with this background information, assume that the program has already been loaded into RAM and that the segment registers have already been initialized so that they are pointing to the correct locations in RAM. That is, the cs and ip pair are pointing to the first executable machine language instruction of the program that the CPU is about to fetch, decode and execute. In fact, the 8088 is totally nonfunctional without RAM and is a computer that is designed to fetch, decode and execute machine language instructions from RAM. It is designed to manipulate raw data stored in RAM and to generate output information in RAM. Interaction with other devices is quite a challenge since this computer essentially does nothing more than operate on numbers stored in RAM. Later the engineering solution of how the CPU was designed and how the PC was designed to allow input and output to peripheral devices will be discussed in detail.
The ds segment is pointing to the segment of RAM where the data that the program is going to manipulate and generate will be stored and the ss (Stack Segment) and sp (Stack Pointer) are pointing to the top of the memory stack.
The three main segment:offset registers point to three physical addresses
in RAM at any given moment: the program it is executing, the data that
program is working on, and the stack where it can keep a note to itself
Assume that the cs = A000h, the ds = 9000h and the ss = 8000h. Also assume that the ip = 0000h and the program is about to begin executing. Note the "?" next to the ds register. This is because the 8088 has many ways of addressing data using the ds as the segment register: it does not have to use the si. In fact it can use a direct value within an instruction, or most of the registers other than the segment registers, as the pointer to the address within the data segment.
The sp or stack pointer's value is FFFFh, which is why the current ss:sp pair points to the physical address 8FFFFh. The stack is nothing more than a special way of treating RAM which requires less overhead, but more planning on the part of the programmer. There are memory read and write operations that are dedicated to the stack, and many low level operations of the CPU depend on the stack in order to retrieve critical values such as the return addresses of far calls (long jumps to a procedure that must later return). Picture the stack as the stack of dishes next to the dishwashing employee in a busy restaurant (hence the name "stack"). The bus boys bring in 10 dishes and the dishwasher begins to wash them: first he washes the top one and sets it aside, then he grabs the next one, which is the new one on the top of the stack, washes it and sets it aside. Then he grabs the next one, which is again the new one on the top of the stack, washes it and sets it aside. Now he has cleaned 3 dishes and the fourth one that was brought in originally is now the dish on top of the stack. Now let's say that the bus boys bring in two more dishes. They will place them on the top of the stack. So now the dishwasher grabs the top dish (and notice that it just arrived on the stack), washes it and sets it aside. Now he grabs the next one, washes it and sets it aside. Now that fourth dish from the original group is back on the top of the stack. And the bus boys bring in 4 more and place them on the top of the stack.
So the stack is a type of memory called a LIFO – Last In, First Out. It starts at the highest address of the segment and works its way down. It does not need a direct numerical address reference to function either. The dishwasher does not need to know the address of that fourth dish at any time, just how many dishes have been stacked on top of it at any given moment, and he will know where it is. This is the power of the stack to the programmer. To the CPU, the stack has a default behavior built into many of its instructions. When the processor is about to execute a far call, a jump to a procedure that will change both the code segment register and the instruction pointer and carry it to a distant location in memory, the CPU will automatically "push" the current cs and ip values onto the current stack position. This is a total of 4 bytes (each register is 16 bits or 2 bytes wide) and so the stack pointer, which points to the current place on the stack that can be written to, will be decreased by 4. As long as the distant part of the program leaves the stack segment and pointer as it found them, a "return" instruction can be executed and it will simply "pop" the four bytes on the top of the stack off and place them into the cs and the ip, effectively returning the program to where it was before the call. There is no need for the return instruction to refer to the original place in memory from which the CPU jumped, so neither it nor the programmer has to know the actual value. It may sound a little weird the first time you read it, but trust me, it is very easy to code large programs for this machine without having to know the exact address every byte of the way through the program. Just write "Call procedure X" and at the end of procedure X just write "return" and it works.
But it's not magic: the programmer has to be aware of exactly how the machine returns, or he will mess up the stack, not knowing that the machine needs it, and then the program will crash. (Despite knowing it, they seem to do a fine job of crashing it anyway!)
At this point the technician should be aware of how the 8088 CPU works at the fundamental level reading and writing to RAM. The technician should note that any program for the 8088 keeps three physical addresses: the place where the actual machine language program instructions reside in RAM pointed to by the cs:ip pair, the place where the raw data resides in RAM pointed to by the ds:si pair (as one example pointer register), and the stack where the CPU can automatically store far call return addresses (and many other things) pointed to by the ss:sp pair. The technician should be aware that the CPU does not "know" the difference between program code and data. It is the responsibility of the programmers to position them in RAM properly and to set up the segments and pointers properly. This is the exact job of the operating system. When the user runs or executes a program, by typing its name at the command prompt or double clicking its icon in Windows, the operating system reads the program file off of the drive and into RAM, sets the segment:pointer pairs properly and then the program begins to execute.
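The push and pop behavior described above (on the x86 the instruction pair that does this is a far call and a far return) can be followed with a small Python model. Everything here is a simplified sketch: the class and method names are invented, each 16-bit value is stored as one dictionary entry rather than as two separate bytes, and the saved ip is simply whatever value the model holds at the moment of the call.

```python
class Cpu:
    """Just enough state to follow a far call and its return."""

    def __init__(self, cs, ip, ss, sp):
        self.cs, self.ip, self.ss, self.sp = cs, ip, ss, sp
        self.ram = {}                          # physical address -> 16-bit value

    def push(self, value):
        self.sp = (self.sp - 2) & 0xFFFF       # the stack grows downward by 2 bytes
        self.ram[(self.ss << 4) + self.sp] = value

    def pop(self):
        value = self.ram[(self.ss << 4) + self.sp]
        self.sp = (self.sp + 2) & 0xFFFF       # the stack shrinks back upward
        return value

    def far_call(self, new_cs, new_ip):
        self.push(self.cs)                     # save where we came from: cs first,
        self.push(self.ip)                     # then ip, 4 bytes in total
        self.cs, self.ip = new_cs, new_ip

    def far_return(self):
        self.ip = self.pop()                   # restore in reverse order (LIFO)
        self.cs = self.pop()


cpu = Cpu(cs=0xA000, ip=0x0010, ss=0x8000, sp=0xFFFF)
cpu.far_call(0xB000, 0x0000)
print(f"{cpu.cs:04X}:{cpu.ip:04X} sp={cpu.sp:04X}")   # B000:0000 sp=FFFB
cpu.far_return()
print(f"{cpu.cs:04X}:{cpu.ip:04X} sp={cpu.sp:04X}")   # A000:0010 sp=FFFF
```

Note that far_return never mentions A000:0010. It finds its way back purely because the stack pointer is where the call left it, which is also why a procedure that disturbs the stack will return to the wrong place.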
The evolution of the x86 family started before the first IBM PC model. The original 8086 processor was actually too expensive for the PC, which, as each successive design problem was solved, was steadily growing in price such that it was going to become too expensive to be a feasible "personal computer." IBM therefore decided to base the original IBM PC on the Intel 8088 processor, a much cheaper variant in which the data bus attachment to the external bus was cut down to 8 bits wide rather than 16 bits. This allowed the data bus throughout the PC to be made 8 bits wide rather than 16 bits, which made the manufacture of everything from the motherboard to the expansion bus slots and cards cheaper. The 8088 chip was also much cheaper than the full blown 8086. However, the chip is still a 16-bit CPU, meaning that its registers can hold 16-bit numbers and it can perform internal mathematical and logical calculations on those 16-bit numbers. Because the data bus is cut down to 8 bits wide on the IBM PC and the 8088 CPU, filling a 16-bit register from memory requires two complete fetches instead of one. This makes the 8088 based system up to twice as slow at memory access as an equivalent 8086 based system, all other factors being equal.
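The cost of the narrower bus comes down to simple arithmetic, sketched below in Python. The function name bus_cycles is invented for this illustration, and the model assumes an idealized bus that always transfers its full width per cycle, ignoring alignment and timing details.

```python
import math


def bus_cycles(operand_bits, bus_width_bits):
    """Number of bus transfers needed to move one operand across the data bus."""
    return math.ceil(operand_bits / bus_width_bits)


print(bus_cycles(16, 8))    # 8088: 2 transfers to fill a 16-bit register
print(bus_cycles(16, 16))   # 8086: 1 transfer
```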
Original 8088 Microprocessor within an IBM PC, open 40-pin DIP
Socket next to it is for the 8087 Math Coprocessor.
Intel would develop a successor to the 8088 called the 80186. It was technologically more advanced but still had a 16-bit data bus like the 8086, and still ran 16-bit machine language, yet it was far more expensive than the 8088. IBM decided to pass on the processor, which Intel had been banking on IBM buying. Intel would scurry to develop a much more powerful step up from the 8088 and released the 80286 shortly afterwards.
The 80286 still features 16-bit registers and is fully backwards compatible, not only capable of running machine language programs written for the 8088/86 but in fact completely emulating it at power reset. However, the 80286 is a different CPU and can be switched out of the 8088/86 emulation mode, called real mode, into its native mode, called protected mode, in which it still shares a common set of machine language instructions with its predecessors. It also has its own new superset of instructions and powerful new memory addressing schemes allowing programs to access up to 16MB of RAM instead of the old 1MB limit of the pure 16-bit 8088/86 instruction set and functionality. The 80286 memory addressing schemes also allow the processor to allocate a fixed block of memory to an application; if that application attempts to execute an instruction that addresses a memory location outside of that range, the CPU will automatically throw an interrupt and execute the interrupt routine instead. The operating system designers can place OS functions in this interrupt routine. A protected mode CPU management interrupt is called an "exception" or a "fault." When it concerns programs violating protected memory spaces it is called a general protection fault.
Windows kernel operating system code handles faults at two levels. If the OS kernel determines that it is still undamaged by the fault, it will open the ubiquitous "This program has performed an illegal operation and will be shut down [OK]" window. If the OS determines that the kernel itself may be compromised, it will switch the video card back to text mode and issue an infamous "Blue Screen Of Death" error message, and more than likely lock up the system at this point. The '286's vastly improved speed (10-12MHz) and memory addressing capacity were enough to attract IBM. While they were modifying the motherboard to accommodate the new CPU, they decided to introduce more motherboard improvements intended to overcome nagging shortcomings of the original PC's design. The result was the IBM AT.
The IBM AT included the following architectural changes:
- 16-bit data bus from the CPU to the rest of the system, including RAM and the expansion bus
- 16-bit data bus extension to the ISA bus slots; old 8-bit cards still work in them
- Added a second PIC – Programmable Interrupt Controller – providing a total of 15 IRQs instead of 8 (1 line attaches the "slave" PIC to the original "master")
- Added a second DMA – Direct Memory Access – controller. Each DMA channel actually has two lines, the DMA Request and DMA Acknowledge. The second controller adds 4 new DMA channels, one of which is unavailable because it attaches the second controller to the original. The second DMA controller supports the 16-bit data bus.
- Added the RTC – Real Time Clock, so that the system can retain the Time/Date internally while off and, because this chip had 64 bytes of RAM built into it:
- Added the BIOS Setup Utility for software configuration of installed hardware, which stores the hardware configuration in the RTC chip's CMOS RAM
Before the IBM AT, that would be the IBM PC and XT, there was no RTC chip on the motherboard, therefore the system did not retain the Time/Date and these had to be manually entered every time the machine was turned on. And because there was no RTC chip, there was no CMOS RAM either, so the BIOS could only be configured through jumpers and DIP switches on the motherboard. Adding RAM, upgrading or adding a second floppy all required setting jumpers or DIP switches on the motherboard or the system would not recognize the hardware or worse: cause a POST failure until the settings were corrected to match the hardware configuration.
The BIOS Setup Utility and the CMOS settings allow the user to change the hardware, then run the utility and make the changes in software from menu choices which are plain and clear on screen, whereas jumpers and DIP switches are generally marked with something like "JP1" on the motherboard and it is absolutely not clear what each one sets. The inclusion of the RTC chip and the CMOS RAM for holding hardware configuration that is set through the BIOS Setup Utility was a huge advance that changed the PC from more of a "hobbyist toy" into a "real" computer, especially since the machine retains the Time/Date, which are critical to the workplace environment.
Setting DIP switches on the motherboard
Intel's next processor, the 80386DX, would be released in late 1985. It would be the first full 32-bit CPU intended for the PC market. It featured 32-bit wide registers, allowing the chip to perform native machine language operations on 32-bit numbers for the first time. The 386 has a large new superset of instructions to accommodate these operations. Like the 286, the 386 will power reset into real mode and must be switched into protected mode. A major difference is that the 386 can be switched back into real mode without having to reset (reboot) the machine. This is one reason that HIMEM.SYS is able to make extended memory available to applications that know how to call it: it switches the system into protected mode, accesses extended memory using 32-bit native 386 instructions, then switches back to real mode and passes control back to the 16-bit DOS program or the 16-bit DOS kernel. The 386's protected mode functions differently from the 286's in memory addressing details but is backwards compatible. The new 32-bit instruction pointer (EIP) register, together with the 32-bit address bus, allows the 386 to address 2^32 memory locations, or 4 Gigabytes of RAM. To put the power of the 386 in perspective, it was released in 1985 capable of running 4GB sized programs in RAM, yet as of this writing, 20 years later, the standard PC still does not come with this much RAM.
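The 4 Gigabyte figure follows directly from the address width. A small sketch of the arithmetic, assuming byte-addressable memory and the usual address widths of 20 bits for the 8088, 24 bits for the 80286 and 32 bits for the 80386DX (the function name `addressable_bytes` is chosen just for this example):

```python
def addressable_bytes(address_bits):
    """Number of distinct byte addresses reachable with the given width."""
    return 2 ** address_bits


# Address widths assumed for each CPU, in bits.
cpus = [("8088", 20), ("80286", 24), ("80386DX", 32)]

for name, bits in cpus:
    size = addressable_bytes(bits)
    print(f"{name}: {bits}-bit addresses -> {size:,} bytes ({size // 2**20} MB)")
```

With 32 address bits the result is 4,294,967,296 bytes, which is 4096 MB or 4 GB.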
Another major feature of protected mode is the processor's capability that is referred to as multitasking. There are two types of multitasking that have been developed to take advantage of this feature of the processor: cooperative multitasking and preemptive multitasking.
In multitasking the CPU can effectively stop the execution of a particular program and push all of the current values in all of the registers onto the stack and read in all of the values for all of the registers of another program in progress from the stack. This means that the execution of one program can be interrupted and saved onto the stack in RAM and another program that was saved in the middle of execution can be read back off of the stack and continue to execute where it left off. Multitasking can handle several programs executing this way and move through them one at a time giving each a small amount of time to execute on the CPU and then stop the program, save the state of the CPU to the stack and reload the next one. Then it can stop and save the state of this next one onto the stack and pull the next one off of the stack until it has given each program a time slice.
Once all of the programs have had a time slice the first program gets to execute again for a short time and this process continues in a round robin fashion constantly. If a program ends it gets removed from getting a turn. If another program begins it gets inserted into the rotation. All of this happens so quickly, the changing from one program to the next, that it appears that the CPU is executing all of the programs simultaneously, which of course is impossible with a single processor.
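The round robin rotation described above can be sketched as a small simulation. This is only a model under simplifying assumptions: each program is reduced to a count of remaining work units, "saving the state" is represented by putting the program back at the end of a queue, and the names `round_robin`, `programs` and `time_slice` are invented for the example.

```python
from collections import deque


def round_robin(programs, time_slice):
    """Simulate round robin time slicing.

    programs maps a program name to its remaining work units.
    Returns the execution order as a list of (name, units_run) tuples.
    """
    ready = deque(programs.items())
    trace = []
    while ready:
        # "Restore" the next program in the rotation.
        name, remaining = ready.popleft()
        ran = min(time_slice, remaining)
        trace.append((name, ran))
        remaining -= ran
        if remaining > 0:
            # Not finished: "save" its state and send it to the back.
            ready.append((name, remaining))
        # A finished program is simply dropped from the rotation.
    return trace


for name, units in round_robin({"A": 3, "B": 1, "C": 2}, time_slice=1):
    print(f"{name} runs for {units} unit(s)")
```

With these inputs the programs run in the order A, B, C, A, C, A: B drops out after its single unit of work, and C drops out after its second turn.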
In cooperative multitasking, each program must voluntarily relinquish control of the CPU by calling the controlling program (the operating system function), which will then coordinate the state save to the stack and load the next program. The next program will run until it calls the control program to coordinate its state save to the stack, and so on. The problem with cooperative multitasking is that if a program crashes, it will never call the control program to get out of the jam, and the whole computer crashes with it. Windows 3.x, even though it is not considered an operating system, did switch the system into protected mode and did use multitasking, but it used the cooperative variety, which led to all of the blue screens and crashes associated with it.
The CPU, however, does support preemptive multitasking. In this case the CPU itself will only allow a given program so many clock cycles of control and then automatically execute an interrupt, which gives control back to the operating system control program function. That function can then decide whether to shelve the state of the current program onto the stack and load the next program, or return control to the current program. In this case the program that is running has no control over how long it gets to retain control of the CPU, and this control can be pulled away from it at any time, even if it has crashed and is stuck in an infinite loop. Windows 9x and NT/2000/XP are based entirely on preemptive multitasking, which obviously has far greater control over the system in the event that a given program crashes.
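The difference between the two varieties can be shown with a toy simulation. It is a sketch under stated assumptions, not a model of any real operating system: programs are Python generators that emit one step per simulated clock cycle, a voluntary call to the control program is represented by the marker `YIELD_CPU`, and the timer interrupt is modeled by the `quantum` parameter. All names here are invented for the example.

```python
from collections import deque

WORK = "work"        # one cycle of ordinary computation
YIELD_CPU = "yield"  # program voluntarily calls the control program


def well_behaved():
    """Does two cycles of work, then hands the CPU back, forever."""
    while True:
        yield WORK
        yield WORK
        yield YIELD_CPU


def hung():
    """Stuck in an infinite loop: never hands the CPU back."""
    while True:
        yield WORK


def simulate(programs, total_cycles, quantum=None):
    """Return the CPU cycles each program received.

    quantum=None models cooperative multitasking (switch only when the
    program yields). An integer quantum models preemptive multitasking
    (switch on a yield or when the time slice runs out).
    """
    ready = deque((name, factory()) for name, factory in programs)
    usage = {name: 0 for name, _ in programs}
    used_in_slice = 0
    for _ in range(total_cycles):
        name, program = ready[0]
        step = next(program)
        usage[name] += 1
        used_in_slice += 1
        timer_fired = quantum is not None and used_in_slice >= quantum
        if step == YIELD_CPU or timer_fired:
            ready.rotate(-1)  # move the current program to the back
            used_in_slice = 0
    return usage


programs = [("editor", well_behaved), ("buggy", hung), ("clock", well_behaved)]
print("cooperative:", simulate(programs, 30))
print("preemptive: ", simulate(programs, 30, quantum=3))
```

In the cooperative run the hung program keeps the CPU once it gets it, so the program queued behind it never runs at all. In the preemptive run the simulated timer takes the CPU away every three cycles, and the other programs continue to get their turns.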
The 386 was also a manufacturing first in that Intel used a manufacturing process or material known as bipolar CMOS. What you already know about CMOS is that it is an extremely low power consumption integrated circuit manufacturing material. This means that the core of the 386 can run on much less current, leading to much less heat generated by the circuit activity in the core of the chip. The 386 would also integrate a cache MMU - memory management unit for the first time. IBM had been experimenting with motherboard cache manager circuits that would attempt to predict what information from main RAM the processor would need next and then prefetch that information into a high speed static RAM chip. If the prediction was correct, the SRAM chip could respond to the CPU memory read request much faster than the main dynamic RAM chips could. The 386 was indeed one of the single largest processor improvements in the PC evolutionary trail. In fact it was far more powerful than the software of its day (DOS and its 16-bit applications) and it was a very expensive processor. Since there existed no OS or applications to take advantage of its enormous power, the industry began to complain that they were paying a lot of money for a lot of CPU that no one knew how to use yet.
386 in PQFP (Plastic Quad Flat Pack) form factor surface soldered
to a regular PGA (Pin Grid Array) adapter board for mounting into
a LIF (Low Insertion Force) socket on the motherboard.
132-pin LIF (Low Insertion Force) socket for 80386 CPU on the motherboard.
Intel quickly developed an "economy model" of the 386 called the 386SX. The 386SX featured the internal core of the 386DX with full 32-bit registers and instruction set; however, it was cut down to a 24-bit address bus and a 16-bit data bus like the 286, and was packaged like the 286 as well. Since the manufacturing process was greatly improved for superior core speed, the 386SX could be inserted into a 286's socket and run much faster. Since it sold for much less than the 386DX, it was a marketing success in that people now had a viable use for the chip as an upgrade to the old 286 CPU on the motherboard. The 386SL would be the first Intel CPU intended for the PC market that would be designed specifically for use in laptops (and other portable PCs). The SL was further slimmed down in current consumption but is architecturally much closer to a 386DX in that it features the 32-bit registers and instruction set along with the full 32-bit data and address buses. The SL introduces for the first time the SMM (System Management Mode) interrupt, in which the CPU can detect long periods of computing inactivity and then fire the SMM interrupt. If the motherboard is equipped to respond to it, the system can systematically shut down devices, thus conserving power. SMM, like all other innovations, would be carried forward to all subsequent generations of CPU in the x86 family.