Working complete PC
Student bootable floppy diskette - "New Boot A Version 2"
Familiarity with the PC's Central Processing Unit,
List and describe the entire Intel x86 family,
List and describe the general x86 family capabilities,
Each CPU's features and capabilities,
Each CPU's limitations,
Be able to identify which CPU is installed in a system,
The student will become more familiar with the microprocessors used in the IBM PC industry. The student will understand the fundamental functionality of this computer-on-a-chip, the general capabilities of the entire x86 family of Intel CPU's and the individual capabilities of each major generation and their major subfamilies. The student will be able to determine which processor a PC is equipped with through various means from visual inspection to real mode and protected mode diagnostic tools.
Boot the PC into Windows 98 normally and open Start > Run. In the Run box type: MSINFO32 and hit [Enter]. This is an alternative way to execute a program when you know its name but don't want to fish around through Windows' endless menus looking for it. It works much like typing a command's name and pressing [Enter] at the DOS prompt.
The System Information program opens. In the left window pane, general information about the system as a whole is displayed; this includes the type of processor. Record this information for later reference. Close MSINFO32.
Right click on My Computer and select Properties. This sheet shows a less detailed set of general system information than MSINFO32.EXE displayed. Again the processor type is displayed. Close the My Computer properties sheet.
At this point, the student should review all of the information in the PC Repair 1 lab: Microprocessors. Here is a quick recap:
|Generation||CPU||Registers||Data Bus||Address Bus||Notes|
|8088||8088||16bit||8bit||20bit||First CPU used in the original IBM PC. Runs at 4.77Mhz avg. 12 clocks/instruction.|
|286||80286||16||16||24||First protected mode CPU. Influences the new IBM AT motherboard architecture. Runs at 8.33Mhz avg. 4.5 clocks/instruction|
|386||80386DX||32||32||32||First full 32-bit CPU. Introduces integral MMU. First biPolar CMOS mfg. Change from protected mode back to real mode w/o reset (reboot)|
| ||80386SX||32||16||24||First economy CPU. Some have same pinout as 286 = first upgrade CPU.|
| ||80386SL||32||32||32||First CPU for portables/laptops. Introduces System Management Mode interrupt. New physical design for reduced power consumption and heat generation|
|486||80486DX||32||32||32||First integrated L1 (8KB). First pipeline decoder (5-stage). MMU supports up to 256KB external L2. First clock multiplied core (2x or 3x). First integrated FPU.|
| ||80486SX||32||32||32||Economy model 486 = missing the integral FPU|
| ||80486SL||32||32||32||Laptop model, introduced Suspend/Resume for instant on/off without reboot (Hibernation), introduced clock throttling (reduces core speed when not needed)|
|Pentium||P5||32||64||32||First dual pipeline decoder. First dual 8KB L1 cache. MMU supports up to 512KB external L2.|
Taking up where the PC Repair 1 microprocessors lab left off, this lab is a complete discussion of the Pentium generations of products from Intel. A brief description of the first Pentium is needed to get started, beginning with a side-by-side comparison of the Pentium against its immediate predecessor, the 486DX:
|Feature||486DX||Pentium (P5)|
|Typical Clock Speed:||50Mhz-100Mhz||50Mhz-66Mhz|
|Avg. Clocks/Instruction:||Between 1 and 2||Less than 1|
|Core L1 Cache:||8KB||Dual 8KB|
|External L2 Cache:||Max. 256KB||Max. 512KB|
Already differences between the two machines are emerging. The 64-bit data bus attachment to the Front Side Bus is fundamental: it changed the chipsets and rendered the VESA Local Bus slots instantly and permanently obsolete, which illustrates how dramatic this change was. It is also directly related to the other fundamental difference between the two machines on the general characteristics chart: this processor can, on average, take less than one clock cycle to complete the execution of an instruction. That was unheard of, considering that it is this very clock cycle that drives the engine that performs the execution of the instruction step by step. Think of the clock cycle as one rotation of this little computing machine's engine. How, then, can it complete the entire execution of an instruction in less than one clock cycle?
It turns out that the Pentium has two complete instruction decoders whereas the 486 has a single 5-stage pipeline decoder. In essence, this makes the Pentium something like a dual core 486, but there is much more to it than that. The two instruction decoders are embedded within one core, work in tandem, and are fed from independent L1 caches of 8KB each, whereas the 486 had just one 8KB L1 cache on-die running at core speed. These two L1's are fed by one MMU - Memory Management Unit, which fills them from one external L2 that can be as large as 512KB, whereas the 486 could only manage an external L2 cache of up to 256KB.
While the above illustration is not accurate, it does give the student some idea of what is going on. When the processor fetches a machine language instruction, it makes a memory read request on the 32-bit address bus, but instead of reading 32 bits across the data bus into a single pipeline decoder, it reads 64 bits across the data bus and feeds two pipeline decoders, each of which can begin decoding its own instruction simultaneously from the same executable. The difference between this and a completely independent dual core CPU is that in a true dual core CPU, EACH core can make its own separate instruction fetch from anywhere in RAM, so the two cores can work on two completely separate executables at the same time. However, this dual pipeline single core is what makes it possible for the Pentium to achieve the apparently impossible execution efficiency of less than one clock cycle per instruction, a feature that Intel calls superscalar execution.
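The effect of the second pipeline on the clocks-per-instruction figure can be sketched with a toy model. The Python below is only an illustration under simplifying assumptions: every instruction takes one clock once issued, and two adjacent instructions may issue in the same clock only when the second does not depend on the first. The function name and the dependency flags are hypothetical and are not taken from Intel documentation.

```python
def clocks_needed(depends_on_previous, issue_width):
    """Count the clocks needed to issue a stream of one-clock instructions.

    depends_on_previous[i] is True when instruction i needs the result of
    instruction i-1 and therefore cannot issue in the same clock as it.
    issue_width is 1 for a single pipeline or 2 for a dual pipeline.
    """
    clocks = 0
    i = 0
    n = len(depends_on_previous)
    while i < n:
        clocks += 1  # every clock issues at least one instruction
        i += 1
        # The second pipeline may take the next instruction in the same clock,
        # but only if it is independent of the instruction just issued.
        if issue_width == 2 and i < n and not depends_on_previous[i]:
            i += 1
    return clocks


# Eight instructions; the 4th and 7th depend on their predecessor.
stream = [False, False, False, True, False, False, True, False]

single = clocks_needed(stream, issue_width=1)
dual = clocks_needed(stream, issue_width=2)
print("single pipeline:", single, "clocks =", single / len(stream), "clocks/instruction")
print("dual pipeline:  ", dual, "clocks =", dual / len(stream), "clocks/instruction")
```

In this made-up stream the single pipeline needs one clock per instruction, while the dual pipeline averages less than one because most instructions can be paired.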
These first Pentiums were 5 volt chips and suffered massive overheating problems. Because of this, none of them could clock multiply the core frequency; they ran at a straight one-to-one ratio with the front side bus clock. This means that the Pentium chipsets ran at 50Mhz, 60Mhz and 66Mhz, while the 486 chipsets ran at 25Mhz to 33Mhz and some exceptionally designed aftermarket overclocked 486 motherboards could run the chipset at 40Mhz. This very fast front side bus and chipset is another reason that Pentium based systems seemed so much faster than their predecessors: in the end the CPU must wait for the chipset to send and receive every data transaction. Since the Pentium motherboard chipsets ran about twice as fast, this send and receive occurred about twice as fast as well.
Intel would quickly switch to a superior manufacturing process for the Pentium. The first generation used the 0.8µ process resolution; the second generation P54's would introduce the 0.6µ manufacturing process. The new process was a multibillion dollar investment in manufacturing equipment, and waiting for it would have delayed the first Pentiums. Instead of delaying them, Intel made them at the coarser resolution on the equipment that had been making the previous processors, which is why they had the higher voltage and heating issues. A 0.6µ manufacturing process means that the physical objects on the semiconductor wafer must be separated by at least this distance or they might accidentally merge.
The conductive traces through the circuit above (the orange bars) are 0.6µ wide, and this is also the minimum width for the nonconductive gaps (the white areas) between them. The light gray rectangles are transistors; each is a three piece "NPN" or "PNP" junction, so each is three little 0.6µ x 0.6µ squares in size. The dark gray objects are resistors; each must be at least one 0.6µ x 0.6µ square, the smallest object that the manufacturing technique can cleanly and reliably create on the surface of the wafer. Later P54's would be manufactured using a 0.35µ process. If a CPU is said to be made with a 0.35µ resolution, then these conductor paths, the gaps between them and the little squares are 0.35µ in size. The micron manufacturing resolution is one of the big buzz words in CPU's and it is a big deal. The smaller the resolution, the more transistors fit on the wafer, but also the higher the resistance in the conductive pathways, which increases the heat that the chip generates and increases the possibility of corrupting a bit moving through the circuit. This is why moving from one resolution to a smaller one is always a major manufacturing breakthrough in electronic engineering and is usually accompanied by a reduction in the voltage requirements of the circuits to reduce the heat they generate.
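As a rough worked example of why a smaller resolution matters: if every feature shrinks by the same factor in both directions, the area taken by a given circuit shrinks by the square of that factor. This is an idealized assumption (real layouts do not scale perfectly), and the function name is only illustrative.

```python
def density_gain(old_micron, new_micron):
    """Idealized factor by which more features fit in the same area when
    every feature is scaled from old_micron to new_micron in both dimensions."""
    return (old_micron / new_micron) ** 2


# The process steps named in the text: 0.8µ -> 0.6µ -> 0.35µ
for old, new in [(0.8, 0.6), (0.6, 0.35)]:
    gain = density_gain(old, new)
    print(f"{old}µ -> {new}µ: roughly {gain:.2f}x as many features in the same area")
```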
The P54 also introduces the clock multiplier to the Pentium cores for the first time. These include the first ever fractional multipliers, 1.5x and 2.5x, as well as cores that multiply the front side bus clock by 2x or 3x. Here is a side-by-side comparison with the P5:
|Feature||Pentium (P5)||Pentium (P54)|
|Clock Multiplier:||1x Only||1.5x/2x/2.5x/3x cores|
|Typical Clock Speed:||50Mhz-66Mhz||75Mhz-200Mhz|
|Native SMP Support:||No||Yes|
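The relationship behind the clock speed rows can be written as a one-line calculation: core clock = front side bus clock x multiplier. The sketch below simply tabulates that product for the bus speeds and multipliers named above. It does not claim that every combination was actually sold, and the helper name is hypothetical.

```python
def core_clock_mhz(fsb_mhz, multiplier):
    """Core frequency produced by multiplying the front side bus clock."""
    return fsb_mhz * multiplier


for fsb in (50, 60, 66):
    for mult in (1.5, 2, 2.5, 3):
        print(f"{fsb}Mhz bus x {mult} = {core_clock_mhz(fsb, mult):g}Mhz core")
```

The nominal 66Mhz bus actually runs closer to 66.67Mhz, which is why the 66 x 3 part is sold as a 200Mhz processor rather than a 198Mhz one.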
In addition to these changes the P54 also introduces the APIC - Advanced Programmable Interrupt Controller. The student will recall that the PIC, the 8259 chip, had up until this time been a separate chip and part of the motherboard chipset design and architecture. The APIC is a different architecture, and part of it is incorporated into the Pentium processor itself. While the CPU still has a single INTR line, the mechanisms of the old PIC chip have been brought into the CPU. The IOAPIC chip on the motherboard forwards IRQ's directly to the CPU, which handles them internally as if the PIC were onboard; hence the system functions as if it were wired up like the bottom diagram. The APIC, including the LAPIC (Local, onboard the CPU) and the IOAPIC (Input/Output, in the chipset), allows multiple CPU's to properly coordinate the sharing of interrupts and is a part of the SMP architecture of the CPU.
Motherboards remain backwards compatible, supporting either the dual 8259 architecture or direct connections of the devices to the external I/O APIC lines, which forward interrupt requests to the internal Local APIC of the processor. This is usually controlled by a setting in the BIOS, but APIC would not become fully implemented for years because expansion device manufacturers had to learn how to use the APIC interrupts rather than the old tried and true PIC managed interrupts. As of 2007, Windows XP Service Pack 2, most 6th generation BIOSes (any BIOS calling itself version 6.0 or better) and most PCI 2.2+ hardware support APIC. The P54 also introduces native SMP - Symmetric Multi-Processor - support. This allows motherboard chipsets to be developed to which two or more processors can be attached. Such motherboards would again take a long time to appear, and operating systems that could recognize and use multiple processors had to be developed, which would also take years.
This is the series of "Pentium MMX" processors. The significant changes from its predecessors include a new 0.28µ and later 0.25µ manufacturing resolution and dual 16KB L1 caches. Both changes meant immediate increased clock speeds and improved performance. The smaller manufacturing resolution was directly tied, not surprisingly, to a reduced voltage requirement of 2.8V, which threatened to render all of the work being done on the standard ATX power supply specification obsolete before it was even finished. To avoid this, Intel introduced Socket 7. Part of the Socket 7 specification is a local VRM (Voltage Regulation Module) that is responsible for tapping the proposed 3.3V output of the ATX power supply and adjusting it to the voltage required by the CPU in the socket. The ATX specification could then proceed without having to change voltages for every new CPU. Because motherboard manufacturers could change the VRM from one motherboard to another, Socket 7 ran for several years, supporting many steppings of Pentium MMX and many generations of AMD and Cyrix clones of the MMX and Pentium II/III processors.
The other change, which was incorporated into the processor's name, was the addition of a new section of the ALU - Arithmetic/Logic Unit in the processor core, including 8 new 64-bit wide registers supporting 57 new instructions designed specifically for multimedia applications: the MMX registers and instructions. However, it would take months or even years before programmers learned how to use them and started incorporating them into the applications they created. This is always the case with new machine language instructions. Acquiring a new CPU based solely on new instructions is a waste, since the instructions will not begin to appear in applications for many months to many years. The P55 however was worth it because of the raw core speed increases and the larger 16KB L1 caches, which greatly increased performance. And months down the road, applications did appear taking advantage of the MMX instructions.
During the time that Intel was offering the Pentium MMX processors they released another processor called the Pentium Pro. This processor was the most expensive product they had ever released, some of them selling for over $4,000 each. It is no surprise that few end users had ever heard of them because they were almost never installed on end-user systems. They were made for high-end engineering workstations and network servers. Even those power users and technicians that had heard of them were probably not aware of the fact that this processor was the first of the sixth generation processors from Intel.
Intel would not change their manufacturing processes for the P6 generation chips until after the Pentium III. Pentium Pro's would be manufactured with the 0.35µ process used in the P54's and Pentium II's would be made with the 0.25µ process developed for the Pentium MMX processor.
The P6 family, starting with the Pentium Pro, includes a totally redesigned core with more than two 10-, 12-, or 14-stage pipeline decoders that give the chip the ability to execute up to three instructions per clock cycle. It also brings what Intel calls dynamic execution, which consists of 1) Multiple Branch Prediction (the MMU can look ahead and pull both targets of the upcoming "if-then" statement into the cache), 2) Dataflow Analysis ("Out-of-Order Execution": the order in which instructions are queued and then sent into the pipelines for decoding can be changed), 3) Speculative Execution (another look ahead feature that allows otherwise idle pipelines to execute a possible upcoming branch in the program; if the branch is taken, then the instructions are already done).
Pentium II/III generation of the P6 introduces MPS - MultiProcessor Specification (1.0) which is a significant redesign and implementation of the CPU core features that began as the SMP support embedded in the Pentium MMX cores. The MPS support of the P6 generation allows not only dual processor motherboards but also supports multiprocessor motherboards with four sockets or even more and allows these to be populated one at a time without causing problems. So the MPS support allows for the manufacture of Quad CPU motherboards, and also allows these to have 1, 2, 3, or 4 CPU's installed and they will function properly (in theory! Chipsets and BIOSes may not like it.) Incidentally the MPS kernels for Windows XP and Server 2003 will not work on Quad CPU motherboards with 3 CPU's installed (too bad.)
Another major change for the P6 is called the DIB - Dual Independent Bus architecture. This provides the processor core with a second complete Address/Data bus pair for exclusive use with the L2 cache. This allows the MMU to draw blocks of data out of it and into the L1's within the core without ever having to contend with the constant traffic on the main front side bus.
Note the separate CPU core and L2 chips attached to the printed circuit board, and that the CPU core has two complete Address buses and two complete Data buses, one that passes through the L2 chip and out of the card where it extends to the main RAM. The other extends directly off of the circuit board and reaches the I/O expansion buses. This is the Dual Independent Bus Architecture.
P6 introduces the VID - Voltage Identification pins that allow the system to autodetect the voltage requirements of the processor eliminating the need for the technician to manually set jumpers on the motherboard to the correct voltage for the processor. Failing to get these jumpers correct would almost always destroy the CPU.
P6's brought more sophisticated MMU architecture including one 16KB L1 for instructions and an 8KB L1 for data. These would increase in size in later P6 models. P6's spanning several years actually range from models supporting zero KB of L2 (shortlived and very unpopular) to models with 2MB (Xeons.) It was the combination of deeper multiple pipelines, dynamic execution, DIB, new more sophisticated MMU managing more powerful L1 + L2 combinations that led to far greater instructions/clock cycle efficiency that in turn led to far faster throughput which means far faster performance.
Finally, P6 brought a 36-bit wide address bus, backwards compatible with the predecessor's 32-bit address bus. When fully implemented, in what the industry now calls PAE - Physical Address Extension, the P6 family can address up to 2^36 addresses or 64GB of RAM.
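The address space figure follows directly from the width of the address bus: each extra address line doubles the number of reachable addresses, so n lines reach 2 to the power n bytes. A minimal sketch using the address bus widths quoted in this lab (the helper names are illustrative only):

```python
def addressable_bytes(address_bits):
    """Number of distinct byte addresses reachable with this many address lines."""
    return 2 ** address_bits


def human(n_bytes):
    """Format a power-of-two byte count using binary units (1KB = 1024 bytes)."""
    for unit in ("B", "KB", "MB", "GB"):
        if n_bytes < 1024:
            return f"{n_bytes}{unit}"
        n_bytes //= 1024
    return f"{n_bytes}TB"


for cpu, bits in [("8088", 20), ("286", 24), ("386DX", 32), ("P6 with PAE", 36)]:
    print(f"{cpu}: {bits}-bit address bus -> {human(addressable_bytes(bits))}")
```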
|Feature||Pentium MMX||Pentium Pro||Pentium II|
|Voltage:||2.8V||Varies under VID/VRM Autodetect||Varies under VID/VRM Autodetect|
|Typical Clock Speed:||166Mhz-266Mhz||150Mhz-200Mhz||233Mhz-450Mhz|
|Native MPS Support:||No||Yes||Yes|
|Dual Independent Bus:||No||Yes||Yes|
|L2 Cache:||512KB @ 66Mhz||512KB/1MB/2MB @ CORE||512KB @ ½ CORE|
The P6, along with all of the technological advances discussed so far, would also begin a new era for Intel in the area of multiple product lines. As far back as the 386, Intel had been offering the "base" version - the 386DX - alongside an "economy" version - the 386SX - and a special low power consumption, low heat generation version for laptops - the 386SL. Since the loss of the lawsuit against AMD which forced Intel to name and trademark their products, such a selection had not been attempted. With the Pentium II, the first full production version of the P6, Intel would resume offering multiple products starting with the Pentium II Celeron, the economy model, and the Pentium II Xeon, the more expensive "deluxe" model - the first time Intel would do this.
So Intel offered the customer the Pentium II, Pentium II Celeron and the Pentium II Xeon each in various core speeds and eventually in different front side bus attachments. Xeon was offered in three L2 size configurations just like its predecessor, the Pentium Pro. Aside from FSB and core multipliers this makes for 5 separate processors. It was in fact the new CPU package that would make both the Celeron and the Xeon relatively simple to produce.
These new P6 CPU's were mounted onto printed circuit boards (instead of being embedded into a wafer of ceramic) and attached to a slot like an expansion card rather than being dropped into a socket like the predecessors. This allowed Intel to manufacture the actual CPU core and the L2 cache chip separately and then attach the two and the pathways between them onto the printed circuit board.
So on one such CPU card, the P6 core could be mounted along with a 512KB L2 cache chip running at ½ core speed. This CPU was the "regular" Pentium II, codenamed "Klamath". Another CPU card could have the P6 core and no L2 cache at all; being much cheaper, this was the Celeron. The same core could be mounted on the CPU card along with either 512KB, 1MB, or 2MB of core speed L2 cache: these were the Pentium II Xeons.
The Celeron with no L2 cache at all had such dreadful performance that Intel dropped it and introduced a modified one called "Celeron A". These had a 128KB core speed L2 cache chip added on to the CPU card. This card was actually called the SEP - Single Edge Package (Intel's full name is SEPP - Single Edge Processor Package) for Celerons, and the SECC - Single Edge Contact Cartridge for the Klamath. Both attached to the "Slot 1" card edge socket on the motherboard, which Intel patented so that AMD could not make faster, cheaper replacement CPU's. SECC had the entire card encased in a plastic shell which protected the exposed chips and fine connection tracings across the surface of the card, but since the heat sink could not come into direct contact with the chip, this led to serious overheating problems. SEP had no shell at all, so the heat sink could come into direct contact with the chip. While cheaper, it proved to be better for cooling as well. Note that for the first time, AMD CPU's and Intel CPU's could no longer be attached to the same motherboard. Choosing the motherboard was now a commitment to one CPU manufacturer or the other.
The Celeron A with 128KB of core speed L2 could be a less expensive alternative to the Klamath, which had 512KB of L2 cache but only running at ½ core speed. This made them interesting enough that Celerons would always have the 128KB of core speed L2, or in future generations even more, but always far less than the standard variety of their family. This made the Celeron more expensive to manufacture, and in fact the SEP and the SECC both proved more expensive to manufacture than Intel had imagined. Ultimately they would change the Celeron A back to a classic square of ceramic called the FC-PGA (Flip-Chip Pin Grid Array), and patent it and the Socket 370 to which it mounted on the motherboard.
The Celeron A with 128KB of core speed L2 would raise enough interest that Intel would offer another significant variant called the "Pentium II PE". This CPU offered a 256KB core speed L2 cache. This L2 was half the size of the regular Klamath L2 but twice as fast. The key to the Celeron A and the Pentium II PE is "light application usage". Any heavy usage, such as large processor hungry applications like high end video games, large spreadsheets or AutoCAD, would need the largest caches possible and would bog down these processors. But small applications like casual web browsing and word processing would run better on the CPU's with the smaller but faster L2 cache.
|Feature||Pentium II||Pentium II PE||Pentium II Celeron A||Pentium II Xeon|
|L2 Cache:||512KB @ ½core||256KB @ core||128KB @ core||½, 1 or 2MB @ core|
With the Pentium III "Katmai", the core was modified and the SSE section of the ALU was added. This brings new 128-bit registers and 70 new instructions for multimedia applications. These are superior to the MMX generation of instructions and, again, there would be a significant lag between the release of the CPU and the appearance of these new instructions in software. SSE means Streaming SIMD Extensions, where SIMD means Single Instruction/Multiple Data. These are single instructions that can simultaneously affect all of the data elements within a large block, rather than one instruction affecting the information within a single register. Think of a single instruction affecting a single register as a tutor teaching a single student at a time. Think of a SIMD instruction as a teacher in front of a class teaching multiple students the same thing and therefore affecting many students simultaneously. These have the power to run graphics processing, which often requires changing large regions of data that are being reflected by the video controller onto the screen. Katmai's MMU management of transactions between main RAM, the L2 and the L1's is superior. Overall however, the chip was not a major change from the Pentium II, either in core architecture or performance.
The Pentium III "Coppermine" would use the new 0.18µ manufacturing process and speeds would rapidly rise. While the last Katmai's did introduce the 133Mhz front side bus, the Coppermine would take Pentium III's from about 600Mhz to 1.0Ghz. Coppermines would move the "regular" version of the CPU back to Socket 370 in the FC-PGA form factor introduced first in the Pentium II Celeron as a cheaper alternative to the SEP packaging. Tualatin cores were based on the 0.13µ manufacturing process and brought the Pentium III from 1.0Ghz to 1.4Ghz. Pentium III's were also released in "Pentium IIIE" forms, which have faster L2 cache chips. The Coppermine Pentium IIIE had 256KB of core speed L2 cache (instead of the 512KB at ½ core speed), and the regular Tualatin Pentium III had the full 512KB at full core speed (not at ½ core speed), so Tualatins did not need an "E" version. They were also released in "Pentium IIIB" forms. The "B" indicated that the processor was running on a new faster front side bus, the 133Mhz bus in the case of the Katmai and the Coppermine. Note that all Tualatins ran on the 133Mhz bus only, so there was no need for a "B" designation with these processors. It follows that a "Pentium IIIEB" was attached to a 133Mhz FSB and possessed a faster but smaller L2 cache.
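The teacher-and-class idea can be made concrete with a small emulation. The Python below is only a sketch of the concept, not real MMX or SSE code: it treats a 64-bit value as eight independent 8-bit lanes and adds them lane by lane, so that an overflow in one lane never spills into its neighbour. Real hardware performs all lanes in one instruction; the loop here exists only because Python has no packed registers. The function and variable names are hypothetical.

```python
def packed_add_bytes(a, b):
    """Add two 64-bit values as eight independent 8-bit lanes.

    Each lane wraps around at 256 and never carries into the next lane,
    which is what distinguishes a packed (SIMD) add from an ordinary
    64-bit integer add.
    """
    result = 0
    for lane in range(8):
        shift = lane * 8
        lane_sum = ((a >> shift) & 0xFF) + ((b >> shift) & 0xFF)
        result |= (lane_sum & 0xFF) << shift
    return result


pixels = 0x10_20_30_40_50_60_70_F0  # eight 8-bit brightness values
boost = 0x10_10_10_10_10_10_10_10   # add 16 to every one of them

packed = packed_add_bytes(pixels, boost)
ordinary = (pixels + boost) & 0xFFFFFFFFFFFFFFFF
print("packed add:  ", hex(packed))
print("ordinary add:", hex(ordinary))
```

The two results differ only where the lowest lane overflows: the ordinary add carries into the next byte, while the packed add keeps every "student" separate.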
|CPU||Mfg.||Frontside Bus||Core Speeds||L2 Cache|
|Pentium III Katmai||0.25µ||100Mhz||450Mhz-600Mhz||512KB @ ½ core|
|Pentium III Katmai E||0.25µ||100Mhz||550Mhz-600Mhz||256KB @ core|
|Pentium III Katmai B||0.25µ||133Mhz||533Mhz-600Mhz||512KB @ ½ core|
|Pentium III Katmai EB||0.25µ||133Mhz||533Mhz-600Mhz||256KB @ core|
|Pentium III Coppermine||0.18µ||100Mhz-133Mhz||500Mhz-1.0Ghz||512KB @ ½ core|
|Pentium III Coppermine E||0.18µ||100Mhz-133Mhz||500Mhz-1.0Ghz||256KB @ core|
|Coppermine-128 Celeron A||0.18µ||66-100Mhz||500Mhz-1.0Ghz||128KB @ core|
|Pentium III Tualatin||0.13µ||133Mhz||1000Mhz-1.4Ghz||512KB @ core|
|Tualatin-256 Celeron A||0.13µ||100Mhz||900Mhz-1.4Ghz||256KB @ core|
The P7's are properly called "Family 15" since their CPUID instruction will reveal the number 0x0F in the family designation rather than the number 7, which had already been used in the early "Merced" prototypes that would ultimately become the Itanium. These processors spanned several years and brought a totally new core architecture that Intel called the Netburst architecture, as well as improvements in computer engineering and VLSI - Very Large Scale Integration - manufacturing technology. The Netburst architecture is based on a core made up of 5 very deep pipelines, 20-stage in the Willamette (the first cores) and 31-stage in the 90nm Prescott cores. Along with these multiple very deep pipeline decoders, the core uses dual MMU's to manage transactions between main RAM and various sizes of L2's running at various speeds, and an unusual L1 cache consisting of two distinct sections: a "classic" 8KB or 16KB L1 holding program information, plus an "Advanced Trace Cache" that can hold up to 12K of the core microcode words. Bearing in mind that the Netburst core microcode words are 100 bits long, this second piece of the L1 is actually 150KB (12K words x 100 bits / 8 bits per byte) and can hold plenty of instructions for the pipelines themselves to work with. Netburst also features a 256-bit wide data path between the L1 and the L2 for the first time. This greatly facilitates the MMU's capability of filling the L1 quickly and is probably the single most significant change to the CPU core aside from the addition of the second MMU and the huge 150KB microcode section of the L1.
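Since identifying the installed CPU is one of the goals of this lab, it may help to see how a diagnostic tool turns the raw CPUID value into a family number. The sketch below decodes the processor signature that CPUID function 1 returns in the EAX register, following the bit layout Intel documents for that instruction. Python cannot execute CPUID itself, so the two signature values are hard-coded samples chosen for illustration, and the function name is hypothetical.

```python
def decode_cpuid_signature(eax):
    """Split a CPUID function 1 EAX value into (family, model, stepping)."""
    stepping = eax & 0xF
    base_model = (eax >> 4) & 0xF
    base_family = (eax >> 8) & 0xF
    ext_model = (eax >> 16) & 0xF
    ext_family = (eax >> 20) & 0xFF

    # The extended family field only counts when the base family is 0x0F.
    family = base_family + ext_family if base_family == 0xF else base_family

    # The extended model field only counts for base families 6 and 0x0F.
    if base_family in (0x6, 0xF):
        model = (ext_model << 4) + base_model
    else:
        model = base_model
    return family, model, stepping


# Sample signatures (assumed values for illustration, not read from hardware)
for signature in (0x0673, 0x0F29):
    family, model, stepping = decode_cpuid_signature(signature)
    print(f"signature {signature:#06x}: family {family}, model {model}, stepping {stepping}")
```

A family of 6 indicates a P6 generation part, while a family of 15 indicates one of the Netburst processors discussed here.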
The Netburst core also increases the total number of registers from 40 various length registers in the Pentium III core to 128 registers all 128-bits long in the Netburst core. Each of these registers is totally generic, unlike the previous cores in which each one had a specific name and function. Some were 32-bits wide and some were 64-bits wide and some were 128-bits wide. The Netburst core can assign any of its registers at any time to "become" the EAX register, for example. In fact, it can assign another one to "become" the EAX register at the same time the first one is "being" the EAX. This flexibility greatly facilitates the CPU's ability to multitask, and to perform speculative execution. This register "aliasing" is what made HyperThreading a natural possibility in the Netburst core evolution, allowing the core to manipulate two independent sets of the same registers simultaneously without having to dump them out to RAM and then retrieve them later to continue, as is done in standard multitasking.
Netburst's main horsepower however is derived from the implementation of a QDR - Quad Data Rate - front side bus. Using the same technique of stuffing multiple bits/wave that began with DDR - Double Data Rate SDRAM and the AGP 2x video cards, this CPU's front side bus connections can stuff 4 bits/wave. So the CPU is attached to the motherboard chipset on an address bus/data bus pair that runs at 100Mhz, but the CPU and the motherboard chipset actually transfer 4 bits/wave, achieving effectively 400MegaTransfers/second. With a 64-bit (8 byte) wide data bus this yields: 400MT/sec x 8 Bytes = 3200MB/sec throughput across the front side bus. This is clearly far more powerful than any previous processor/chipset connection.
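The throughput arithmetic generalizes to any bus: peak throughput = bus clock x transfers per clock x bus width in bytes. A minimal sketch of that calculation (the function name is illustrative, and the figures are peak rates, not sustained real-world rates):

```python
def fsb_throughput_mb_s(bus_clock_mhz, transfers_per_clock, bus_width_bits):
    """Peak front side bus throughput in MB/sec (decimal megabytes)."""
    megatransfers_per_sec = bus_clock_mhz * transfers_per_clock
    bytes_per_transfer = bus_width_bits // 8
    return megatransfers_per_sec * bytes_per_transfer


print(fsb_throughput_mb_s(100, 1, 64))  # 100Mhz bus, one transfer per clock
print(fsb_throughput_mb_s(100, 4, 64))  # the same bus in Quad Data Rate mode
print(fsb_throughput_mb_s(200, 4, 64))  # a 200Mhz bus in Quad Data Rate mode
```

The first two lines show that quad pumping alone quadruples the peak figure without raising the bus clock at all.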
Intel also takes a dramatic new direction in the concept of multiple product lines with the Pentium 4. Whereas before, they bundled all new developments into a base CPU core and then marketed a base product with 512KB of ½ core speed L2, an economy product with 128KB of core speed L2 and a deluxe product with 512KB, 1MB, or 2MB of core speed L2, the Family 15 strategy would be completely different. This time various cores with various groupings of the features would be produced leading to a large field of CPU's each slightly different from the others. This led Intel to start using Model numbers which are now far more significant than just the names of the products or differences between the micron manufacture, front side bus speed, the core multiplier, and the size and speed of the L2 cache. Now the differences from model to model even with the same name and core speed mean that one has certain features that the other does not and vice versa along with the differences in FSB, core multiplier, mfg. "die" (micron mfg. process) and size/speed of the L2. While end users will still be dazzled by the Gigahertz, certain chips with larger and faster caches and certain other significant performance enhancing features can far out-perform the base models that have a much faster core speed.
Because this family has literally dozens upon dozens of models the best way to represent them is to first list each core and what technologies members of those cores would introduce. Then each new technology will be explained in detail.
Thermal throttling introduced by the early Willamette cores allows the CPU to reduce the core frequency automatically when the temperature rises beyond a safe threshold. The core can even come to a complete halt to avoid being destroyed although this is not 100% reliable!
Northwood cores introduce HyperThreading technology. Remembering that the core is based on 5 x 20-stage pipelines, there are times when one or more pipelines have nothing to do for periods from one clock cycle to dozens. Because these pipelines execute their own machine language instructions, called microcode, which have been loaded into the ATC - Advanced Trace Cache (12K x 100-bit microcode words in size), and because the core has 128 x 128-bit registers that can be aliased, the microcode instructions can be modified so that the pipelines emulate two complete independent cores. In essence, by modifying the microcode, the CPU emulates a physical dual core. This allows idling pipelines to have something productive to accomplish almost always and does increase overall performance by 10%-30%. Furthermore, because the core is advertising itself as a dual core, there is no way for the software to know that it is not. Operating Systems that can detect and use multiple processors will see an HT core as a dual core processor and put the dual core emulation capabilities to full use.
Northwoods would introduce Quad Data Rate front side buses based on the 133MHz frequency, allowing 533MT/sec x 8 Bytes/Transfer = 4266MB/sec throughput. This would constitute another step in total computing performance of greater significance than raw Gigahertz. Northwood would also be the first Intel CPU to run on a raw 200MHz front side bus, and it would naturally run it in Quad Data Rate mode for a total of 800 MegaTransfers/Second x 8 Bytes/Transfer = 6400MB/sec throughput, by far the highest throughput front side bus for Intel products up to this time.
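The throughput arithmetic above can be checked with a short sketch. The function name and its default parameters are assumptions made for this illustration (four transfers per clock for Quad Data Rate, 8 bytes per transfer for the 64-bit data bus); the nominal "133MHz" clock is taken as 400/3 MHz.

```python
def fsb_throughput_mb_per_sec(base_clock_mhz, transfers_per_clock=4, bus_width_bytes=8):
    """Front side bus throughput in MB/sec: clock x transfers per clock x bytes per transfer."""
    mega_transfers_per_sec = base_clock_mhz * transfers_per_clock
    return mega_transfers_per_sec * bus_width_bytes

# The nominal 133MHz bus is really 400/3 MHz; the 200MHz bus is exact.
for label, clock_mhz in (("133MHz QDR", 400 / 3), ("200MHz QDR", 200)):
    print(label, int(fsb_throughput_mb_per_sec(clock_mhz)), "MB/sec")
```

Setting `transfers_per_clock` to 1 or 2 gives the figures for an ordinary or a double data rate bus at the same clock.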
Time to criticize the 20-stage pipelines. They are in fact far less efficient than the 10/12/14-stage pipelines of the Pentium II/III products. Running 5 independent very deep pipelines further bogs down the processor with excessive microcode steps to get anything done. However, each stage of the 20-stage pipeline decoders is physically smaller, made out of far fewer transistors, each doing far less work. This allows Intel to increase the clock frequency of the core without changing the micron manufacturing resolution. So Willamettes were introduced at 1.5GHz and went up to 2.0GHz when Pentium III Coppermines simply could not be made to run that fast. When Intel started making Netburst core CPU's on the Tualatin's manufacturing process of 0.13µ these approached 3.0GHz, when Tualatins could never be made this fast. The promise of the greatly delayed 90nm manufacturing process would be even faster cores, but this resolution is extremely difficult to implement and Intel had to stretch the pipelines out to 31 stages in order to again reduce the transistor count and the work being done by each stage so that they would not overheat. Despite the effort, the Prescott core does have heating issues based entirely on how small the conductor pathways are.
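The trade-off between pipeline depth and clock speed can be made concrete with a deliberately crude toy model. Everything in it is an assumption for illustration only: it charges one full pipeline refill for every mispredicted instruction, and the 2% misprediction rate is an invented figure, not a measurement of any real CPU.

```python
def instructions_per_second(clock_mhz, pipeline_stages, mispredict_rate=0.02):
    """Toy estimate: each mispredicted instruction costs one full pipeline refill."""
    cycles_per_instruction = 1.0 + mispredict_rate * pipeline_stages
    return clock_mhz * 1_000_000 / cycles_per_instruction

# Hypothetical comparison: a short pipeline and a deep one at the same clock,
# then the deep pipeline again at the higher clock that its simpler stages allow.
short_same_clock = instructions_per_second(1400, 10)
deep_same_clock = instructions_per_second(1400, 20)
deep_higher_clock = instructions_per_second(2000, 20)

print("10-stage at 1400MHz:", round(short_same_clock / 1e6), "million instr/sec")
print("20-stage at 1400MHz:", round(deep_same_clock / 1e6), "million instr/sec")
print("20-stage at 2000MHz:", round(deep_higher_clock / 1e6), "million instr/sec")
```

In this model the deeper pipeline loses at equal clock speed and only pulls ahead once it is clocked substantially higher, which is the compromise described above.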
Prescott cores would get help from being packaged in the LGA775 form factor introduced in earlier true dual core products. The larger LGA775 package would help carry the heat out over the larger surface area for the heatsink/fan assembly to absorb and dissipate. But Prescotts still run hotter than anyone would care to admit. These would be the first single physical core CPU's packaged as LGA775, although Intel would revisit many of the Netburst core products and outfit them as LGA775's mainly for its better cooling properties.
Willamettes would introduce SSE2, which incorporates all of the original SSE plus its predecessor MMX and adds 144 new instructions that further facilitate multimedia applications. Prescotts would introduce SSE3, which incorporates SSE2 and adds 13 more multimedia instructions. Again, new instructions have a natural time lag of several months at minimum before they find their way into applications, which can then improve performance. So this feature was wisely bundled in along with the many other advancements introduced with the Prescott core.
The XD (eXecute Disable) bit is an enhancement to the protected mode environment that the CPU sets up for each application in extended memory. It is possible to say that this is one of the first significant changes to the 32-bit protected mode environment since the 386DX. This allows the CPU to flag a region of RAM as pure data. Then even if the CPU somehow gets directed to the region for the purposes of execution (like embedded macros within a Word document, or by falling past an address in a stack, as in stack overflow exploits), the CPU will still refuse to execute the code. This, like the original protected mode, is designed to thwart poorly written code as well as malicious code and is a welcome advancement. Interestingly, AMD invented it; Intel emulates it in the Prescott core.
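The behavior of the XD bit can be modeled with a small toy sketch. This is not how the hardware is programmed; `ToyMemory`, `ExecuteDisabledError` and the page numbers are all hypothetical names and values chosen only to show the rule "a page flagged as data is never executed."

```python
class ExecuteDisabledError(Exception):
    """Raised when the toy CPU is asked to execute from a data-only page."""

class ToyMemory:
    def __init__(self):
        self.pages = {}  # page number -> (contents, executable flag)

    def map_page(self, page, contents, executable):
        """Record a page and whether execution from it is allowed."""
        self.pages[page] = (contents, executable)

    def fetch_for_execution(self, page):
        """Return the page contents only if the page is not marked execute-disabled."""
        contents, executable = self.pages[page]
        if not executable:
            raise ExecuteDisabledError(f"page {page} is marked data-only")
        return contents

memory = ToyMemory()
memory.map_page(1, "program code", executable=True)
memory.map_page(2, "stack data", executable=False)

print(memory.fetch_for_execution(1))
try:
    memory.fetch_for_execution(2)
except ExecuteDisabledError as error:
    print("Refused:", error)
```

Reading or writing the data page would still be allowed in a fuller model; only the attempt to execute from it is refused.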
On the subject of AMD inventions, the major feature of the entire Netburst generation, EM64T - Extended Memory 64 Technology, was also invented entirely at AMD for their first 64-bit processors. While Intel had developed the first 64-bit CPU intended for the PC industry, codenamed "Merced" and ultimately released as the Itanium, that processor could not run any existing 16 or 32-bit code. In other words, the Itanium is not an x86 processor and is not backwards compatible at all. Itanium therefore needs all new compilers, operating systems and software. Microsoft worked on this for years and the development was going slowly because they could not reuse decades of existing tried and true code that had been developed from DOS through Windows 9x through Windows NT. When AMD developed a CPU that was backwards compatible with all existing code and added new 64-bit instructions and modes on top, so that the CPU could load and run existing versions of operating systems and applications and then later accept new sections of 64-bit code, Microsoft immediately dropped Intel's 64-bit Itanium architecture and "suggested" to them that they adopt the AMD64 architecture. Quite a shock for the people who had always blindly run wildly into the future with everyone else desperately following them straight toward the cliff ... Now they found themselves forced behind and following.
EM64T is Intel's effort to emulate the AMD64 (also known as x86-64) architecture. It allows the CPU to be fully backwards compatible and capable of executing all existing software and instructions while adding the 64-bit machine language instructions as a new superset. A 64-bit instruction means that the CPU can load a single register with a 64-bit wide number, load another register with another 64-bit wide number, and then with a single instruction add the two together. This is the very definition of the term. Netburst architecture, with all of those 128-bit registers running around inside, could in fact be easily altered to run 128-bit code. But for now 64-bit seems to be both powerful enough and complex enough to keep the programmers entertained.
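What a single 64-bit add saves can be shown by doing the same sum both ways. The sketch below is an illustration in Python rather than machine language, and the function names are made up for it: one function wraps the result at 64 bits as a 64-bit register would, the other builds the same answer out of 32-bit halves with a carry, which is what 32-bit code has to do.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF

def add64_single(a, b):
    """One 64-bit add: the result wraps at 64 bits like a 64-bit register."""
    return (a + b) & MASK64

def add64_with_32bit_registers(a, b):
    """Same sum using only 32-bit pieces: add the low halves, then the high halves plus carry."""
    low = (a & MASK32) + (b & MASK32)
    carry = low >> 32
    high = ((a >> 32) & MASK32) + ((b >> 32) & MASK32) + carry
    return ((high & MASK32) << 32) | (low & MASK32)

a = 0x00000001FFFFFFFF
b = 0x0000000000000001
print(hex(add64_single(a, b)))
print(hex(add64_with_32bit_registers(a, b)))
```

Both calls produce the same value; the difference is that the second needs two additions and carry handling where the 64-bit instruction needs one step.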
The MMU's would be modified so that they could manage and support a third layer of high speed cache for the 90nm Prescott cores, so they introduce L3 cache for the first time to the Intel line of PC industry CPU's. Now the MMU's can pull large blocks of code and data from main RAM into the L3, then move it from the L3 to the L2, then from there to the L1. We know from previous discussions that any improvement to the MMU's and/or cache brings significant performance improvements and this L3 cache and the MMU support for it is no exception.
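The movement of code and data through the three cache levels can be pictured with a toy lookup routine. This is a simplified sketch, not the MMU's actual algorithm: the caches are plain dictionaries with no size limit or eviction, and the function and variable names are invented for the illustration.

```python
def lookup(address, l1, l2, l3, ram):
    """Return (value, level found) and copy the value into every faster level."""
    levels = [("L1", l1), ("L2", l2), ("L3", l3), ("RAM", ram)]
    for index, (name, store) in enumerate(levels):
        if address in store:
            value = store[address]
            # Fill the faster levels so the next access is found sooner.
            for _, faster in levels[:index]:
                faster[address] = value
            return value, name
    raise KeyError(address)

ram = {0x1000: "code block"}
l1, l2, l3 = {}, {}, {}
print(lookup(0x1000, l1, l2, l3, ram))  # first access has to go all the way to RAM
print(lookup(0x1000, l1, l2, l3, ram))  # second access is satisfied by the L1
```

A real cache also has to decide what to throw out when a level fills up, which is where the larger L3 earns its keep.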
EIST - Enhanced Intel Speedstep Technology was first developed in a rudimentary form in the early Pentium 4 Mobile processors. The technology literally steps the core down from its maximum clock frequencies to lower frequencies during moments in which the speed is not needed, such as an unattended system idling at the desktop in Windows. In fact, the core would probably stay running slow under a simple application like the Windows Calculator, which relies on the FPU. But if several CPU intense applications like Microsoft Excel and Access were launched, the core would then throttle back up to full speed. While throttling down in speed, the core will also adjust the voltages: since a slower clock can be sustained at a lower voltage, the voltages can be backed off as well, which in turn generates less heat, keeping the core cooler and prolonging its life. Prescott cores without this technology, constantly running at high core speeds, gradually but continuously degrade until they simply fail; a condition that came to be known in the techie circles as Sudden Prescott Death Syndrome. EIST seems to significantly cut down on the wear and tear on the core and prolong CPU core life from mere months to many years and counting.
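Why lowering the voltage along with the clock pays off so well can be seen from the usual first-order approximation for switching power in CMOS logic, power = capacitance x voltage squared x frequency. The sketch below only compares two operating points; the voltages and clock speeds in it are hypothetical values picked for illustration, not figures for any particular Intel processor.

```python
def relative_dynamic_power(v_low, f_low, v_high, f_high):
    """Ratio of dynamic power at the low operating point to the high one (P ~ C * V^2 * f)."""
    return (v_low ** 2 * f_low) / (v_high ** 2 * f_high)

# Hypothetical operating points: full speed versus stepped down.
full_voltage, full_clock_ghz = 1.4, 3.0
idle_voltage, idle_clock_ghz = 1.2, 2.0

ratio = relative_dynamic_power(idle_voltage, idle_clock_ghz, full_voltage, full_clock_ghz)
print("Stepped-down power as a fraction of full power:", round(ratio, 2))
```

Because the voltage term is squared, even a modest voltage reduction on top of the lower clock cuts the heat generated by noticeably more than the clock reduction alone.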
Intel's VT - Virtualization Technology is a spectacular example of both the multitasking concept and the concept of absorbing technologies born in pure software into the core of the CPU. The software concept that the CPU has absorbed here is called Microsoft Virtual PC. This is a program that sets up a software PC complete with its own BIOS. It opens as a window, can be "booted up," can capture a physical drive like the CD-ROM drive, and can therefore boot from an installation CD-ROM. The installation program will run and detect all of the virtual PC's BIOS, hardware and hard drives (nothing more than files on the actual host PC) and completely install its operating system, totally unaware of the fact that it is actually running on a fake PC in an application window. Now the VT capable CPU needs a complete motherboard and BIOS compatible implementation to work, but these are starting to appear in high end workstation and server platforms. With them, the processor can maintain two completely separated processes, the multitasking capability, but all the way to the point of running two independent operating systems and even rebooting one while continuing to run the other one uninterrupted.
Now that all of the major technologies have been described, the major product families can be listed and described. They are:
Again, the various combinations of core speed, L2 cache speed and size, FSB speed, and the Family feature set are too numerous to chart here. And Intel is marketing these CPU's by Model # rather than name designation. There can be huge differences between one Pentium 4 (small L2, no HT) and another with the same core speed that has an L2 4 times the size running at core speed, and has HT. One final note on the Family 15 "Pentium 4/D/EE" family, there is no such thing as a Pentium 4 Xeon. There are Family 15 Xeons though. They have simply been separated as a product line from the standard CPU's of the same era. So while Intel is selling a "Pentium 4" and a "Pentium 4 Celeron", they are also selling a "Xeon" no longer considered to be "just a souped up Pentium 4." The Family 15 Xeons in general featured much larger L2 and L3 caches and were the first cores to introduce most of the technologies that would trickle down into the Northwood and Prescott cores for use in the standard Pentium 4. At one point Intel was marketing the "Xeon" (32-bit), the "64-bit Xeon", the "Dual Core Xeon", and the "Xeon Dual Core w/HT" along with the end user products listed above: "Pentium 4", "Pentium 4 Celeron", "Pentium 4 M(obile)", "Pentium D", "Celeron D", and "Pentium 4 EE" plus the Itanium for a total of 11 distinct product families each with dozens of core speed, L2 size/speed and Family 15 feature grouping variations!
First of all, do not confuse any of these processors with other processors of similar name that are in fact completely different processors: Pentium M is not even similar to a Pentium 4 M, Pentium Core Duo is not related at all to a Pentium Core 2 Duo, and the Pentium Extreme Edition is not related at all to a Pentium 4 Extreme Edition.
Intel appears to be trying to capitalize on naming tomorrow's products with names extremely similar to yesterday's highly popular and successful products, but this is where the similarity between the products ends: name only.
These three processors are in fact highly related to each other, more so than they are to the other products that have similar names. They form a generation of Intel CPU's with similar core architecture that has no clear generational position. They are clearly not Family 15 Netburst cores. Instead of the 5 x 20/31-stage pipeline decoders they possess dual 11-stage pipeline decoders, making the core essentially a P6 similar to the Pentium II/III processors. But these processors do possess almost all of the collateral technologies developed in Family 15 including the MMU's, L1/L2/L3 cache arrangements and 256-bit attachment between the L1 and L2, the aliasable 128 x 128-bit registers, the 12K microcode word L1 ATC, HT, EIST, VT, EM64T, and so on. A strange hybrid core we might call "½ P6 and ½ Family 15."
It was previously noted that the multiple very deep pipeline decoders of Family 15 were a compromise that allowed Intel to continue to increase core speeds beyond what the manufacturing processes would have allowed otherwise. With Pentium M and its relatives, Intel was now well equipped to manufacture CPU's with the 90nm mfg. process and then the 65nm process. Because of this, they redesigned the core returning to the simpler and far more efficient dual 11-stage pipeline construction. This makes the Pentium M far faster and more efficient than any Family 15 CPU at the same core speed.
Intel had been making mobile versions of their processors for years in "surface solder mount" form factors. These are superheated, dipped into a solder bath and then placed directly onto the motherboard surface where the solder cools and hardens up underneath them. This makes the CPU completely unremovable. Pentium M was developed specifically for laptop systems in a "Surface solder mount" form factor, but turned out to be so fast and desirable, that system builders started putting the CPU into desktop systems as well. Intel in turn would begin offering Pentium M processors in their own unique "Socket M" form factor. Asus would make a "Socket M - to - Socket 478" adapter for some of their motherboards which prompted Intel to manufacture some models of Pentium M in the standard desktop Socket 478 form factor.
Pentium Core Duo is simply a dual physical core Pentium M. And the Pentium Extreme Edition is a Core Duo, dual physical core Pentium M, with HyperThreading technology. Like the Family 15 predecessor, multiprocessor capable operating systems will recognize a Pentium Extreme Edition as a quad core processor. Again Intel would offer each in dozens of models varying in core speed, L2 cache size/speed, and Family 15 feature groupings.
This is quite a confusing name because after all what does Intel make? Processors, of course. So what then is the first thing that occurs to anyone when told "This computer has the Intel Centrino inside"? Well, it must be another one of their processors. And that is a completely natural assumption that happens to be wrong.
"Centrino" is a marketing tool, nothing more. Intel developed the Centrino technology for laptop systems. But again, it became so popular that it made its way into desktop systems as well. A system is said to be equipped with Intel's "Centrino" technology if and only if:
1) It has a Pentium M processor
2) It has either the Intel i915 chipset or the i855 chipset
3) It has an Intel/PRO series wireless LAN adapter built in
"Centrino Duo" is a marketing tool, just like the term "Centrino". Intel developed the Centrino and Centrino Duo technologies for laptops, but they were so popular and powerful that they made their way into desktop systems as well. A system is said to be equipped with Intel's "Centrino Duo" technology if and only if:
1) It has a Pentium Core Duo processor
2) It has either the Intel i915 chipset or the i855 chipset
3) It has an Intel/PRO series wireless LAN adapter built in
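The two "if and only if" lists above amount to a simple three-part test, which can be written out as a short sketch. The function name, the string labels and the chipset set are assumptions made for this illustration and simply restate the conditions exactly as they are listed in this section.

```python
CENTRINO_CHIPSETS = {"i915", "i855"}  # the two chipsets named in the lists above

def platform_brand(cpu, chipset, has_intel_pro_wireless):
    """Return the marketing name a system qualifies for, or None if it qualifies for neither."""
    if chipset not in CENTRINO_CHIPSETS or not has_intel_pro_wireless:
        return None
    if cpu == "Pentium M":
        return "Centrino"
    if cpu == "Pentium Core Duo":
        return "Centrino Duo"
    return None  # any other processor fails condition 1

print(platform_brand("Pentium M", "i915", True))
print(platform_brand("Pentium Core Duo", "i855", True))
print(platform_brand("Pentium M", "i915", False))
```

All three conditions must hold at once; a system with the right processor and chipset but a different wireless adapter gets no brand name at all.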
"Celeron Centrino" does not exist although some system manufacturers may make this marketing claim. Intel does not recognize the Celeron M as satisfying their marketing term "Centrino". So the manufacturer is making a platform based on the Celeron M (lower FSB speed and smaller L2 cache than the Pentium M) either the i915 or i855 chipset and integrated Intel/PRO wireless LAN adapter. But according to Intel, they should not be using the "Centrino" brand name with such a system.
Core Solo is the replacement for the Pentium M designation. Intel no longer actually manufactures Pentium M's from their original dies. Instead they devote the assembly line to making Core Duo's only. But since the Pentium M was so popular and still has a market, they take some of the Pentium Core Duo's, post core production, and disable the second core, effectively making it into a Pentium M. But since it is a modified (chopped in half actually) Pentium Core Duo which is a different CPU in the actual circuit diagram details, they call it the Pentium Core Solo instead.
The Core 2 Duo processor is the latest "desktop" or "end user" product from Intel. This CPU has a totally new processor core which Intel calls the "Intel Core Architecture" for lack of a better name. Inspired and influenced by the Pentium Core Duo (the dual physical core Pentium M) it is not just a souped up Pentium Core Duo, but a whole new core architecture that brings new microcode, pipelines, MMU's, intercache data pathways, dual physical core intercommunication, etc.
The Core 2 Extreme processor is the first variant of the new Intel Core Architecture. It is a Pentium Core 2 Duo without the internal clock speed lock. This allows the end user to change the front side bus clock speed provided to the processor, which in turn allows it to run faster than its rated core clock speed. If the CPU is, for example, provided with a 200MHz FSB that it multiplies internally by 15 to run the core at 3000MHz (3.0GHz), then the Core 2 Extreme processor could be placed on this motherboard, which in turn could have the FSB speed changed to, say, 215MHz. The Core 2 Extreme would multiply this by 15, yielding a core running at 3225MHz or about 3.22GHz. The regular Core 2 Duo will possibly reject this FSB setting. This then is a product designed specifically for gamers and other "power users" interested in overclocking the CPU.
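The overclocking arithmetic in this example is just FSB clock times multiplier, and can be sketched as follows. The function names are made up for the illustration; the 200MHz, 215MHz and x15 figures are the ones used in the paragraph above.

```python
def core_speed_mhz(fsb_mhz, multiplier):
    """Core clock in MHz is the front side bus clock times the internal multiplier."""
    return fsb_mhz * multiplier

def overclock_gain_percent(rated_fsb_mhz, new_fsb_mhz):
    """Percentage increase in core speed when only the FSB clock is raised."""
    return (new_fsb_mhz - rated_fsb_mhz) / rated_fsb_mhz * 100

rated = core_speed_mhz(200, 15)
overclocked = core_speed_mhz(215, 15)
print("Rated core speed:", rated, "MHz")
print("Overclocked core speed:", overclocked, "MHz")
print("Gain:", round(overclock_gain_percent(200, 215), 1), "percent")
```

Since the multiplier stays fixed, the percentage gain in core speed is the same as the percentage increase in the FSB clock.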
The Core 2 Quad processor is the second variant of the new Intel Core Architecture. It is in effect a doubled Pentium Core 2 Duo. Since the Core 2 Duo is a dual physical core processor, the Core 2 Quad is the first quad physical core processor made by Intel for the "desktop" market. Since the base design of the Intel Core Architecture, the Core 2 Duo, is a dual physical core CPU, it appears that Intel is not going back to a single physical core CPU ever again. And once a new territory has been explored, it gets immediately colonized and then overpopulated in PC land, meaning there will be 8-core, 16-core, 256-core, ... CPU's on the not too distant desktop.
What was the first fifth generation Intel CPU called?
List the main differences between the Pentium P5 and its predecessor.
What was the model number and name of the second generation Pentium?
List the differences between the P5 and the P54:
What was the model number and the name of the third generation Pentium?
List the differences between the P54 and the P55:
Describe "superscalar execution."
Explain why the Pentium processors have a 64-bit wide data bus.
What is meant by the term 0.35µ manufacturing process?
What was the first Pentium to be built with a micron manufacturing process superior to its predecessors?
List the micron manufacturing resolution of the P5, P54's, and P55's.
True or False: During the Pentium II Intel introduced a new micron manufacturing process.
After the Pentium MMX what was the first processor to introduce a superior manufacturing process and what was that micron resolution?
List all market names of the sixth generation processors. Which was the first ever processor that was a "deluxe" version of the base CPU?
List the firsts of the P6 processors:
Diagram and describe the dual pipeline core of the Pentium processor:
Define the acronym APIC. Describe the technology and contrast it with the preceding technology.
Define the acronym SMP. Describe this technology.
Describe the MMX technology.
Describe the P6 processor core.
Describe "Dynamic Execution."
Define the acronym MPS. Describe this technology.
Define the acronym DIB. Diagram and describe this technology.
Explain the advantage of the P6 VID technology over the predecessors.
Define the acronym PAE. Explain this technology.
Intel had been marketing "economy" versions of their CPU's since the introduction of the 386SX, what was the first ever economy version "Pentium"?
What major sacrifice was made to the P6 family between the Pentium Pro and the Pentium II?
What gave the first Celeron such terrible performance that it was quickly dropped?
What was the next "economy" version CPU called? What was the difference between the original Celeron and its successor?
Define the acronym SSE. Define the embedded acronym. Explain this technology.
List and describe the three Pentium III cores. What differentiates them specifically?
List and describe the different market name products of the Pentium III generation.
Describe the Intel Netburst core architecture.
List the firsts of the Family 15 processors.
List all the processor market names based on the Intel Netburst core architecture.
List the three main Pentium 4 cores.
Describe the mfg. process and the features introduced in the Willamette core.
Describe the mfg. process and the features introduced in the Northwood core.
Describe the mfg. process and the features introduced in the Prescott core.
Define the acronym HT. Describe the technology.
Describe thermal throttling technology.
Define the acronym ATC. Describe the technology.
The first Pentium 4's were manufactured using the same micron mfg. process as which of their predecessors? What is this micron mfg. resolution?
The Northwood Pentium 4's were manufactured using the same micron mfg. process as which of their predecessors? What is this micron mfg. resolution?
What was the first Pentium 4 to be manufactured with a completely new micron mfg. process? What is this micron mfg. resolution?
Describe SSE2 and SSE3.
What two major enhancements to the protected mode functions and the instruction sets of Pentium 4 were actually pioneered by AMD? Define their acronyms and describe the technologies.
What was the first CPU to include an integrated MMU? Define the acronym MMU. Explain this technology.
What was the first CPU to feature an integrated L1 cache in the core? The first CPU to feature an integrated L2 cache in the core? The first CPU to feature an integrated L3 cache?
What was the first CPU to feature a total of 8 KB of L1 cache? 16KB of L1 cache? 32KB of L1 cache?
What was the first CPU to feature an MMU that could manage up to 512KB of L2 cache? An L2 greater than 512KB?
EIST was first developed for which product? Define the acronym EIST. Describe this technology.
Define the acronym VT. Describe this technology.
What processor core threatened to render the ATX specification obsolete before it was even complete? Explain why. What technology would "save" the ATX standard? Explain how.
Based on the pipeline decoder architecture alone, what family does the Pentium M really belong to? It shares all of the technological developments and micron mfg process of what other family?
What is the market name of a dual physical core Pentium M processor?
Explain why Intel abandoned the Netburst architecture and actually went backwards to a P6 based processor core for the Pentium M.
What is an Intel "Centrino"?
What is an Intel "Centrino Duo"?
What are the differences between a Pentium 4 M and a Pentium M?
What are the differences, if any, between a Pentium 4 EE and a Pentium Extreme Edition?
Describe the Pentium Core 2 Duo.
Describe the Pentium Core 2 Extreme.
Which do you think is faster: 1.4Ghz Pentium III or a 1.4Ghz Pentium 4 Willamette? Why?
All other factors being equal which do you think is faster: Pentium 4 w/HT or a Pentium D? Why?
All other factors being equal which do you think is faster: Pentium 4 or a Pentium M? Why?
Which do you think is faster: any generation Pentium or its Xeon equivalent? Why?
What was the difference between the Pentium II, Pentium II PE and the Pentium II Celeron A? If the Celeron A's L2 cache was too small why might someone still be tempted to choose the Pentium II PE over the regular Pentium II?
Explain the meanings of the designations "E", "B" and "EB" for Pentium III's.
Define the acronym FSB. Describe this technology.
Define the acronym ALU. Describe this technology.
What was the first processor to have a FSB clock speed greater than that of the Coppermine? Calculate the throughput of these FSB's.
What do the following CPU's all have in common: "Pentium 4 EE", "Xeon Dual Core w/HT", "Pentium Extreme Edition", "Pentium Core 2 Extreme"? List and describe the differences between them.
Describe the Intel Itanium processor.
Intel patented which CPU mounting system forcing the buyer to choose one manufacturer with the purchase of the motherboard? What CPU family was this mounting system designed for?
Quad processor motherboards have been available since Intel introduced a chipset that could support them. Go online and see if you can find a modern quad CPU motherboard. What CPU's are supported? In what form factor (sockets or slots)?
Can you figure out where all of these strange code names for the processor cores come from?
Copyright©2000-2007 Brian Robinson ALL RIGHTS RESERVED