MSM2 Lecture #4 - DMA's

Materials:
Working complete PC
Student bootable floppy diskette - "New Boot A Version 2"
Objectives:
Familiarity with hardware system resources,
Discuss how I/O addresses work,
Discuss how DMA works,
Discuss how DMA uses I/O addresses,
List the industry standard DMA assignments,
Be able to discern between 8-bit and 16-bit DMA channels.
Competency:
The student will become more familiar with I/O address system resources and learn about DMA channels and their controllers. The student will understand the fundamental functionality of this resource and be able to list all industry standard assignments. The student will be able to determine the DMA channels, if any, being used by a device and be able to change these settings when possible in BIOS, DOS, and Windows.

Procedures
  1. The instructor will issue each student a complete system unit, monitor, keyboard, and mouse.

  2. Boot the PC into Windows 98 normally. Open the Device Manager by right-clicking My Computer > Properties > Device Manager tab. Device Manager is the central location from which to troubleshoot the PC's hardware, including installing or reinstalling device drivers and investigating system resource conflicts. Just as was done in the IRQ's module, to view the PC's devices listed by system resource category rather than by device type, first right-click the Computer icon and then click Properties. A new window appears that provides system resource usage lists.

  3. Click on the DMA radio button and the list of devices using DMA channels will appear. Every peripheral device that produces data (input devices), transmits data (output devices) or stores data (storage devices) must have at least one I/O address through which that data can travel. However, not every device takes advantage of DMA (Direct Memory Access).

  4. Write down the 8-bit DMA's in use. These are DMA 0 to 3, the channels attached to the original i8237 DMA chip included in the IBM PC architecture. Write down the 16-bit DMA's in use. These are DMA 4 to 7, attached to the second i8237 chip that was added to the IBM AT motherboard architecture. Note that both are the same chip, but the second one was attached to the wider 16-bit data bus of the IBM AT, while the first one, controlling DMA's 0 to 3, stayed attached the way it was on the IBM PC, whose data bus was only 8 bits wide. This could not be changed because it would have broken every device and device driver that depended on these 8-bit DMA channels. Today the i8237 chips are no longer included on the motherboard, but modern south bridge chipsets still emulate their circuitry. If they did not, every device that depends on these DMA's would stop working.

  5. Close Device Manager and open Start > Run. In the Run box type: MSINFO32 and hit [Enter]. This is an alternative way to start a program when you know its name but don't want to fish around through Windows' endless menus looking for it. It works much like typing a command's name and pressing [Enter] at the DOS prompt.

  6. The System Information Program opens. In the left window pane, click the [+] in front of Hardware. The Tree view expands to show its contents. Now click on DMA's. Compare the list in the right pane with the values you got from Device Manager. They should agree. If they ever differ, you should expect problems with the device(s) involved. Close MSINFO32.

  7. Under normal circumstances you should not be able to change a DMA manually, because a device uses one only if its driver code has been written explicitly to use it. The LPT port is an exception: enabling ECP mode makes the port use DMA channel 3, and taking the port out of ECP mode frees DMA channel 3. If another device insists on using this DMA, changing the LPT mode in the BIOS allows that device to use it.

  8. A Direct Memory Access (DMA) controller was included in the original IBM PC. This device provides the service of moving large blocks of data bytes between peripheral devices and RAM. The alternative would occupy the CPU with long, laborious loops moving the data byte by byte between the device and RAM. The CPU being referred to is of course the original CPU of the IBM PC: the i8088. This CPU was not very efficient, often taking a dozen or more clock cycles to complete one instruction. Let's follow a simplified loop that copies some bytes of data from a device's I/O address to the data segment of RAM. Here are the assembly language instructions (the instructions that the CPU will fetch, decode and execute from the code segment of RAM):

    repeat:
    in   al, 3F0  ;read the byte at I/O port 3F0 (hex) into the al register
                  ;(simplified: a real 8088 must first load a port number
                  ;above FF hex into dx and then use "in al, dx")
    mov [si], al  ;write the value in the al register out to the address
                  ;pointed to by the contents of the si register
    inc  si       ;increment (add 1 to) si to point to the next
                  ;address in the data segment of RAM
    loop repeat   ;subtract 1 from the cx counter register, check whether
                  ;it has reached zero yet, and if not jump back to repeat
    

  9. To execute the loop, the CPU must first fetch the "in al, 3F0" instruction:



  10. Next, the CPU must decode the "in al, 3F0" instruction:



  11. Next, the CPU must execute the "in al, 3F0" instruction:



  12. Next, the CPU must fetch the "mov [si], al" instruction:



  13. Next, the CPU must decode the "mov [si], al" instruction:



  14. Next, the CPU must execute the "mov [si], al" instruction:



  15. Next, the CPU must fetch the "inc si" instruction:



  16. Next, the CPU must decode the "inc si" instruction:



  17. Next, the CPU must execute the "inc si" instruction:



    (This instruction is internal to the processor and does not require a RAM access)

  18. Next, the CPU must fetch the "loop repeat" instruction:



  19. Next, the CPU must decode the "loop repeat" instruction:



  20. Next, the CPU must execute the "loop repeat" instruction:



    (This instruction is internal to the processor and does not require a RAM access)

  21. At this point the CPU has moved ONE byte from the I/O address of the device into RAM and must repeat this process for every single byte until the transfer is complete. Again, this is highly simplified. In actual practice the loop must also contain instructions that read the device status port (another I/O address) to make sure the device is ready to deliver the next byte at the data I/O address. If it is not ready yet, the CPU executes wait loops, checking the status over and over until it is, and then returns to this loop. It goes without saying that this is time consuming, especially for the 8088 CPU.
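
    The status-polling version of the loop described in step 21 can be sketched in Python. This is only an illustration: the PolledDevice class, its method names and the "ready after two polls" behavior are assumptions made up for the example, not a model of any real device.

```python
from collections import deque


class PolledDevice:
    """Hypothetical device with a status port and a data port."""

    def __init__(self, payload, polls_until_ready=2):
        self._data = deque(payload)
        self._delay = polls_until_ready
        self._countdown = polls_until_ready

    def read_status(self):
        """Models the status port: True once the next byte is ready."""
        if self._countdown > 0:
            self._countdown -= 1
            return False
        return True

    def read_data(self):
        """Models the data port: hand over one byte, then go busy again."""
        self._countdown = self._delay
        return self._data.popleft()


def programmed_io_read(device, count):
    """Copy count bytes from the device into RAM, polling before each byte."""
    ram = bytearray(count)
    status_reads = 0
    for si in range(count):           # si plays the role of the SI register
        while True:                   # wait loop on the status port
            status_reads += 1
            if device.read_status():
                break
        ram[si] = device.read_data()  # the in / mov pair of the loop
    return ram, status_reads


ram, status_reads = programmed_io_read(PolledDevice(b"DMA!"), 4)
print(ram, status_reads)
```

    With these made-up settings every data byte costs three extra reads of the status port, on top of the fetch, decode and execute work already traced in steps 9 to 20.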

  22. The DMA controller has all of the addressing power of the CPU, but none of the computing power. It does not fetch, decode and execute machine language instructions, but it can address RAM and I/O addresses just like the CPU. So it can read a byte from the I/O address of any device, and then write that byte anywhere in RAM. The DMA controller is "hard-wired" to do just that: its internal circuitry has been designed to do this automatically, without having to fetch, decode and execute instructions telling it what to do. This means that the DMA controller can execute the loop much faster than the CPU can. So why not build a CPU like this? Because the DMA chip is faster, but it can only do two things: move a large block of data from RAM to a device I/O address, or move a large block of data from a device I/O address to RAM. The CPU, on the other hand, can perform any computational task. Being a programmable machine is what makes the CPU so powerful; it is just slow at specific tasks like moving large blocks of data from a device I/O address to RAM.

  23. This is the previous bus with the DMA controller added in:

  24. Once the DMA controller has been told what to do (i.e. transfer a large number of bytes from the device's I/O address to a block of RAM) it begins, first by reading the first byte of the transfer from the device I/O address:

  25. Next, write the byte to RAM:

  26. Next, internal loop circuitry advances the RAM pointer, and checks the block loop counter:

  27. The DMA controller is ready to copy the next byte from the device to RAM. Notice that most of the time saved is because the DMA controller does not have to actually fetch any machine language instructions from RAM and more importantly it does not have to decode them which takes computing time. DMA controllers can execute a large block move in either direction (from the device I/O address to RAM or from RAM to the device I/O address) as much as seven times faster than the 8088. This is why the chip was included on the original IBM PC.
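
    To put rough numbers on the comparison, the sketch below simply counts the steps listed in the two walkthroughs (steps 9 to 20 for the CPU, steps 24 to 26 for the DMA controller). It is an assumption-laden tally, not a timing model: the steps are not clock cycles and do not all take the same time, so the ratio it prints is not the "seven times" figure quoted above.

```python
# Steps per byte, counted from the two walkthroughs in this lecture.
CPU_INSTRUCTIONS = ["in al, 3F0", "mov [si], al", "inc si", "loop repeat"]
CPU_PHASES = ["fetch", "decode", "execute"]
DMA_STEPS = ["read device", "write RAM", "advance pointer and check counter"]


def cpu_steps(byte_count):
    """Every instruction is fetched, decoded and executed for each byte."""
    return byte_count * len(CPU_INSTRUCTIONS) * len(CPU_PHASES)


def dma_steps(byte_count):
    """The hard-wired controller repeats only its three internal steps."""
    return byte_count * len(DMA_STEPS)


block = 512  # hypothetical block size in bytes
print(cpu_steps(block), dma_steps(block), cpu_steps(block) / dma_steps(block))
```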

  28. The original controller was 8-bit technology like the entire PC. This means that the data bus width was 8-bits just like the rest of the data bus pathways throughout the PC's motherboard. Each DMA channel can read from the device to RAM or write to the device from RAM. Communication with the DMA controller involves sending it the I/O address of the device, whether it is a transfer from device to RAM or vice versa, the starting address in RAM, and the number of bytes to be transferred. Each DMA request also has an acknowledge line. So there is a DREQ0 to request the DMA services and a DACK0 for DMA acknowledgement of the request. As a result the original IBM PC had 4 DMA channels: DREQ 0 to 3, and each has a DACK 0 to 3 line, totaling 8 lines. The device communicates directly with the DMA controller using the DREQuest and DACKnowledge lines rather than placing status codes in its I/O addresses. When the device is ready, it will signal this across these DREQ/DACK lines.
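
    A minimal Python sketch of such a transfer follows, assuming a made-up DmaRequest record that holds the direction, starting RAM address and byte count from the description above. The names are hypothetical, and the DREQ/DACK handshake is reduced to a counter.

```python
from dataclasses import dataclass


@dataclass
class DmaRequest:
    """What the controller is told before a transfer (hypothetical names)."""
    channel: int     # 0 to 3 on the original controller
    to_ram: bool     # True: device to RAM, False: RAM to device
    ram_start: int   # starting address in RAM
    count: int       # number of bytes to transfer


def run_transfer(req, ram, device_bytes):
    """Move req.count bytes, with one DREQ/DACK handshake per byte."""
    handshakes = 0
    address = req.ram_start
    for i in range(req.count):
        handshakes += 1                      # device raises DREQ, controller answers DACK
        if req.to_ram:
            ram[address] = device_bytes[i]   # read the device, write RAM
        else:
            device_bytes[i] = ram[address]   # read RAM, write the device
        address += 1                         # internal pointer advance
    return handshakes


ram = bytearray(16)
sector = bytearray(b"FLOPPY")
n = run_transfer(DmaRequest(channel=2, to_ram=True, ram_start=4, count=6), ram, sector)
print(n, ram)
```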

  29. Because the DMA chip drives addresses onto the address bus and takes longer than one machine cycle to complete its task, it would collide with the CPU trying to make a memory read/write request. To avoid this, the DMA controller "locks" the bus with special instructions within its driver code that tell the CPU to stop doing anything until the "all clear" is sent, which then unlocks the buses so the CPU can continue executing the current program instructions. This seems inefficient, but remember that the DMA controller can perform the block transfer of data from device I/O address to RAM up to seven times faster. The time saved is worth having the CPU idle while this is done.

  30. With the IBM AT a second DMA controller was added to the motherboard architecture. This was done at the same time that the second IRQ controller was cascaded to the first PIC for IRQs, and for the same reason: to add system resources to a rather limited motherboard. The second DMA controller effectively doubled the DMA channels available and also offered new ones at double the speed of the original ones. However, the DMA controllers are not cascaded in the way the PIC's are. Instead the DMA controllers communicate with each other over DMA channel 4 of the second DMA controller. But all DMA channels have equal priority and it is up to the driver to lock the bus first, in order to get it first. It should be noted that DMA channels (also known as DRQs) 0 to 3 are "Classic DMA" channels that use the 8-bit transfers of the first DMA controller and a few are reserved for standard system peripherals. DMA channels 4 to 7 are attached to the 16-bit DMA chip and are therefore 16-bit data transfer DMA channels attached by 16 wires to the "new" (at the time) extended 16-bit data bus. Since DMA #4 is claimed as the communication channel between the two controllers themselves, channels 5, 6 and 7 are generally considered available for use by devices using 16-bit transfers of data over the "AT" bus, that is the extended 16-bit ISA bus. Here are the two 40-pin DIP Intel 8237 DMA controllers (one attached to the data bus by 8-bits, the other by 16-bits) on an IBM AT class motherboard:



  31. The DMA controller has been outclassed by the modern PC, its buses and peripherals. Beginning with the 80386DX microprocessor, the CPU could execute the data transfer loop instructions and complete a large block transfer from device I/O address to RAM (or vice versa) faster than the DMA controllers could. Device manufacturers noticed this and began writing their own device driver code to take advantage of the CPU's enormous processing speed and bypass the DMA functions. This ultimately made its way into the BIOS code itself, allowing unsophisticated code to access the greater block transfer power without having to load device drivers. Having the CPU do the work of the DMA controller because it can do it faster is called PIO (Programmed I/O). The standardized implementations in the BIOS code are called PIO modes and are used mainly by the ATA controller for accessing hard drives and ATAPI devices. The DMA controllers run on the ISA bus at 8.33 MHz, and channels 5 to 7, offering 16-bit transfers, can never achieve the ideal transfer rate of 2 bytes X 8.33 MHz = 16.66 MB/sec because the ISA bus requires a minimum of 2 clock cycles per data transfer, for a maximum of 8.33 MB/sec. The older ATA controllers (and hard drives of the same vintage) used the 8-bit DMA channels for data transfers to and from memory. This technology was officially made obsolete and is no longer supported in the ATA-3 generation controllers. The 16-bit DMA modes are still supported for backwards compatibility, however.

    16-bit DMA Mode   Transfer Rate   Controller
    Mode 0            4.17 MB/sec     ATA-1
    Mode 1            13.33 MB/sec    ATA-2 (PCI bus only)
    Mode 2            16.67 MB/sec    ATA-2 (PCI bus only)

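
    The ISA arithmetic in step 31 can be worked as a small Python sketch. The function name and its defaults are assumptions chosen for this example; the clock speed, the two cycles per transfer and the mode rates come from the text and table above.

```python
def isa_peak_rate_mb(width_bytes, clock_mhz=8.33, cycles_per_transfer=2):
    """Ideal ISA DMA throughput in MB/sec (never reached in practice)."""
    return width_bytes * clock_mhz / cycles_per_transfer


# Rates from the 16-bit DMA mode table, in MB/sec.
ATA_DMA_MODES = {"Mode 0": 4.17, "Mode 1": 13.33, "Mode 2": 16.67}

isa_limit = isa_peak_rate_mb(2)  # 16-bit channels 5 to 7
needs_faster_bus = [mode for mode, rate in ATA_DMA_MODES.items() if rate > isa_limit]
print(isa_limit, needs_faster_bus)
```

    Modes 1 and 2 come out above the ISA ceiling, which is consistent with the table marking them "PCI bus only".
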
  32. Newer peripherals, ATA controllers especially, work with the motherboard chipset to temporarily take over the bus forcing all other peripherals off of it by locking it. Then they perform large block transfers to or from RAM with the assistance of the motherboard chipset. This provision in the chipset design, for any device to communicate, and negotiate control of the bus and do its own transfers in coordination with the motherboard chipset is called “bus mastering”. And the ATA controllers have spearheaded the technology which the hard drive industry calls UDMA.

  33. The standard DMA assignments are:

    DMA Channel   Standard Device
    0             Available (Sound card - MIDI)
    1             Available
    2             Floppy Disk Controller
    3             Enhanced Capabilities Port
    4             Tie-back between DMA controllers
    5             Available (Sound card - WAV)
    6             Available
    7             Available

  34. Aside from the industry standard assignments, the student should be aware that, in general, slow devices use 8-bit DMA channels. These move 1 byte per transfer on the 8.33 MHz ISA bus, which is asynchronous and requires two clock cycles per transfer: 1 byte X 8.33 MHz ÷ 2 = 4.16 MB/sec (only in an ideal situation, which never actually exists). Faster devices like hard drives and ATAPI devices attached to the ATA controller need faster speeds and therefore use the 16-bit DMA's. Finally, the sound card, while not an industry standard device, is present on most modern PC's and uses DMA's. It needs the faster speed for moving raw digitized sound wave files to the DAC (Digital to Analog Converter) circuitry, but it does not need this speed when transferring notes to the onboard sound synthesizer. As such, the sound card usually uses two DMA channels: one 8-bit channel for MIDI (Musical Instrument Digital Interface), transferring notes to the synthesizer, and one 16-bit channel for the WAVDAC, which captures or plays raw digitized sounds.
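
    As a study aid, the table in step 33 and the 8-bit/16-bit split can be captured in a short Python lookup. This is only a sketch: the dictionary and function names are made up for the example, and the assignments are copied from the table above.

```python
STANDARD_ASSIGNMENTS = {
    0: "Available (Sound card - MIDI)",
    1: "Available",
    2: "Floppy Disk Controller",
    3: "Enhanced Capabilities Port",
    4: "Tie-back between DMA controllers",
    5: "Available (Sound card - WAV)",
    6: "Available",
    7: "Available",
}


def channel_width_bits(channel):
    """Channels 0 to 3 sit on the 8-bit controller, 4 to 7 on the 16-bit one."""
    if not 0 <= channel <= 7:
        raise ValueError("DMA channels run from 0 to 7")
    return 8 if channel <= 3 else 16


def describe(channel):
    """One line per channel: number, transfer width and standard device."""
    return f"DMA {channel}: {channel_width_bits(channel)}-bit, {STANDARD_ASSIGNMENTS[channel]}"


for ch in range(8):
    print(describe(ch))
```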

Review Questions
  1. DMA controller(s) were first included in what model of the PC?


  2. What was the data bus width attachment of the first DMA controller?


  3. How many DMA control lines does each DMA controller have?


  4. How many DMA channels does each DMA controller have?


  5. How many DMA controllers were added to the motherboard architecture after the original design?


  6. The additional DMA controller(s) were added to what model of the PC?


  7. What is the data bus width of the additional DMA controller(s)?


  8. How many channels have industry standard assignments (and are therefore unavailable) on each DMA controller?


  9. Why was the i8237 chip included in the motherboard architecture of the PC?


  10. Why was the motherboard architecture later modified by adding DMA controller(s)?


  11. How many 16-bit DMA channels are normally available on a PC?


  12. How many 8-bit DMA channels are normally available on the PC?


  13. What industry standard 8-bit DMA channel may still be in use on a modern (Pentium 4 generation) PC?


  14. What other 8-bit DMA channel may still be in use on a modern (Pentium 4 generation) PC?


  15. List the industry standard DMA channel assignments and note which are 8-bit and which are 16-bit:

    DMA Channel   Device   Width
    0  
    1  
    2  
    3  
    4  
    5  
    6  
    7  

  16. List two Windows tools for viewing system resources:


  17. List two reasons why DMA controllers are considered deprecated:


  18. One of the reasons DMA controllers are considered deprecated is that the CPU is now faster at moving data between RAM and the device I/O address. This is called:


  19. What are the two operations that the DMA controller is "hard-wired" to do?


  20. What technology takes advantage of 16-bit DMA modes?


  21. What technology takes advantage of PIO modes?


  22. True or False, DMA controllers can only read device I/O addresses.

  23. By what method does the actual device communicate its "ready" status for data transfer to the DMA controller?


  24. Which devices are more likely to use the 8-bit DMA channels?


  25. Which devices are more likely to use the 16-bit DMA channels?


  26. Which devices are more likely to use both 8-bit and 16-bit DMA channels?


Copyright©2000-2004 Brian Robinson ALL RIGHTS RESERVED