Competency: The student will learn how to design executable assembly language programs and how to compile and link the source code text file into an executable. The student will become familiar with some basic compiler "dot" directives and some simple machine language instructions.
Preparation
The student will need a copy of Borland's Turbo Assembler (TASM) software, preferably version 5.0.
Procedures
After launching QBOOT by pressing [F8] say yes to all CONFIG.SYS prompts and all AUTOEXEC.BAT prompts except the last one to launch Windows 3.11.
Change to the K: drive and format a floppy to use as permanent storage for the source files. Run "edit blank.asm". All assembly language source files should have the file extension .ASM so that the compilers can easily recognize them.
In EDIT's main field type in the following source code:
.model tiny
.stack 200h
.data
.code
main proc
        mov ax, @data
        mov ds, ax
terminate:
        mov ax, 4C00h
        int 21h
main endp
end main
This blank source file can be used as a starting point to create many DOS executables without having to retype the same lines over and over. The first four lines are syntax particular to the Borland Turbo Assembly language compiler and are affectionately called the "dot" directives. They are not true assembly language instructions in that they will not be directly converted into machine language instructions for the CPU to execute. Instead, the compiler software will read them and use them to build the segment information that will ultimately end up in the DOS .EXE file header of the final executable.
The ".model" directive must be followed by the memory model that the final executable will use. The "tiny" model means that the data, stack and program segments will all be only one segment in size and that they will all be the SAME segment. This limits the total size of the final compiled executable to 64KB and the stack should not be overused or it will corrupt the program code occupying the same memory segment while running and cause a lockup.
The ".stack" directive causes the compiler to create a stack segment of the size in bytes specified and choose the physical segment address automatically.
The ".data" directive causes the compiler to create a data segment for holding variables and choose the physical segment address automatically.
The ".code" directive causes the compiler to create a code segment for holding the actual executable instructions of the program and choose the physical segment address automatically.
While it is not necessary to encapsulate the main logic of the .code segment in a procedure, this is a safe programming practice that keeps the machine from starting off at, or later jumping to, unexpected addresses within the program. This program's main logic section is a procedure named "main", which starts with the line "main proc" and ends with the line "main endp". This ensures that every line of code will be kept neatly between them and makes the section stand out better to the human eye.
The source text file must end with the compiler directive "end", which tells the compiler where the program source code ends. It must be followed by the label of the line where the executable portion of the program begins. In this example the line reads "end main", indicating to the compiler that the source ends on this line and that execution starts at the first line, reading from the top of the file down, that begins with the label "main", namely the line "main proc".
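The role of the label after "end" can be seen when the entry procedure is not the first code in the file. The following is a minimal sketch built on the BLANK.ASM skeleton. The helper procedure "show", the variable "note" and the use of call/ret are assumptions added for illustration, not part of the lesson's source.

```asm
.model tiny
.stack 200h
.data
note    db "Started at main", 0Dh, 0Ah, '$'
.code
show proc                       ; hypothetical helper, placed FIRST in the file
        mov ah, 9               ; function 9 = show '$' terminated string at DS:DX
        int 21h
        ret                     ; return to the caller
show endp
main proc                       ; execution begins here because of "end main"
        mov ax, @data
        mov ds, ax
        mov dx, offset note
        call show
terminate:
        mov ax, 4C00h
        int 21h
main endp
end main                        ; "main", not "show", is the entry point
```

Although "show" appears first in the file, the program should begin at "main proc" because that is the label named on the "end" line.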
Press [Alt]+[F] to open the File menu and then press [S] to save the changes. Now press [Alt]+[F] then [A] and name the file HELLO.ASM. Make the following changes:
.model tiny
.stack 200h
.data
msg db "Hello World!",0Dh, 0Ah, '$'
.code
main proc
        mov ax, @data
        mov ds, ax
start:
        mov ah, 9               ;function 9 = show string
        mov dx, offset msg      ;address of the string to be displayed
        int 21h                 ;call DOS to show the string
terminate:
        mov ax, 4C00h           ;function 4C = quit program
        int 21h
main endp
end main
Press [Alt]+[F] then [S] to save the changes. The lines that were added to the original BLANK.ASM are the "msg" declaration and the "start:" label with the three instructions that follow it. The first change creates the new variable named "msg". It is of data type "byte", as indicated by the pseudo-instruction "db", meaning "define byte". There is no such instruction in machine language, so this is also really a compiler directive. There is also a "dw", meaning "define word".
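The difference between "db" and "dw" can be sketched as follows. The variable names "letter", "crlf" and "total" are hypothetical and chosen only for this illustration. The program produces no output and simply loads each variable into a register of matching width.

```asm
.model tiny
.stack 200h
.data
letter  db 'A'                  ; one byte holding 41h
crlf    db 0Dh, 0Ah, '$'        ; three consecutive bytes
total   dw 1234h                ; one 16-bit word, stored low byte first (34h, 12h)
.code
main proc
        mov ax, @data
        mov ds, ax
        mov al, letter          ; byte variable into an 8-bit register
        mov bx, total           ; word variable into a 16-bit register
terminate:
        mov ax, 4C00h
        int 21h
main endp
end main
```

A "db" item pairs naturally with a byte wide register, and a "dw" item with a 16-bit register.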
The next set of changes begins with the line "mov ah, 9", which assigns the literal value 9 to the AH register. The basic registers that will be used are AX, BX, CX and DX. The segment registers are CS (code segment), SS (stack segment), DS (data segment) and ES (extra segment). The pointer registers that will be learned are SI (source index), DI (destination index) and BP (base pointer). Each of the general purpose registers is 16 bits wide, but its top byte can be referred to by the parent's letter followed by "H" instead of "X", and its bottom byte by the parent's letter followed by "L". The 16-bit AX register is therefore composed of two 8-bit (byte wide) registers called AH and AL. An instruction can refer to AX as a whole or to either of its byte wide children.
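The relationship between a 16-bit register and its two byte wide halves can be sketched with DX. This example assumes DOS function 2 (show the single character held in DL), which is a standard INT 21h service but is not introduced in this lesson.

```asm
.model tiny
.stack 200h
.data
.code
main proc
        mov ax, @data
        mov ds, ax
        mov dx, 4142h           ; DH = 41h ('A'), DL = 42h ('B')
        mov ah, 2               ; function 2 = show the character in DL
        int 21h                 ; shows the low byte, 'B'
        mov dl, dh              ; copy the high byte into the low byte
        mov ah, 2               ; reload the function number before calling again
        int 21h                 ; shows what was the high byte, 'A'
terminate:
        mov ax, 4C00h
        int 21h
main endp
end main
```

Loading DX once sets both DH and DL, so the program should show the two characters in the order B then A.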
The next line "mov dx, offset msg" says: place the memory OFFSET or address of the variable named msg into the DX register. Without the keyword "offset" this instruction would attempt to place the CONTENTS of the variable ("Hello World!...") into this register which is not what is needed.
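The distinction between the contents of a variable and its offset can be sketched by using both in one program. As an assumption beyond the lesson, DOS function 2 (show the character in DL) is used to display the single byte fetched from msg.

```asm
.model tiny
.stack 200h
.data
msg db "Hello World!", 0Dh, 0Ah, '$'
.code
main proc
        mov ax, @data
        mov ds, ax
        mov dl, msg             ; CONTENTS: the first byte stored at msg ('H')
        mov ah, 2               ; function 2 = show the character in DL
        int 21h
        mov dx, offset msg      ; ADDRESS: where msg lives in the data segment
        mov ah, 9               ; function 9 = show string
        int 21h
terminate:
        mov ax, 4C00h
        int 21h
main endp
end main
```

The first call receives a data byte in DL, while the second receives an address in DX that DOS then reads from.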
The next line, "INT 21h", is a software invoked interrupt. While hardware devices can invoke interrupts by sending signals along the IRQ wires to the IRQ controllers, an INT instruction can also invoke an interrupt. The DOS operating system kernel parked itself in RAM during the boot sequence and then placed its interrupt entrance address into Interrupt Vector Table (IVT) entry number 21h. When this instruction is executed, the processor pushes the FLAGS register and the current CS:IP (pointing to where it is currently executing) onto the stack, then fetches new CS:IP values from the IVT entry, effectively far jumping to that address. That is where the DOS kernel resides. From there DOS reads the AH register, which holds the "function number", and jumps to the appropriate function handler. In this case the function number 9 handler reads the bytes starting at the offset specified in the DX register (within the segment held in DS) and displays them on screen until it reads a dollar sign. It then executes the interrupt return (IRET) instruction, which restores the saved CS:IP and FLAGS and returns to the line following the INT in the calling program. Function number 9 is therefore the "Display '$' Terminated String" function.
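The IVT lookup that the processor performs for INT 21h can be imitated by hand. This is a minimal sketch, assuming real-mode DOS where the IVT begins at address 0000:0000 and each entry occupies four bytes (offset first, then segment). The program only reads the entry into registers and shows nothing on screen.

```asm
.model tiny
.stack 200h
.data
.code
main proc
        mov ax, @data
        mov ds, ax
        xor ax, ax
        mov es, ax              ; ES = 0000h, the segment that holds the IVT
        mov si, 21h * 4         ; entry 21h starts at offset 84h (4 bytes per entry)
        mov bx, es:[si]         ; BX = offset of the DOS handler (the new IP)
        mov cx, es:[si+2]       ; CX = segment of the DOS handler (the new CS)
terminate:
        mov ax, 4C00h
        int 21h
main endp
end main
```

After the two reads, CX:BX should hold the same far address that the processor jumps to when "int 21h" is executed.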
This is followed by the label "terminate:" This will be useful in future programs as a target of a jump instruction intended to end the program. Remember that "end" is a compiler recognized key word and therefore cannot be used for this purpose.
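A jump that targets the "terminate:" label might look like the following sketch. The label "again", the variable "note" and the counter in CX are assumptions made for this illustration, and the conditional jump instruction has not yet been covered in the lesson.

```asm
.model tiny
.stack 200h
.data
note db "Again", 0Dh, 0Ah, '$'
.code
main proc
        mov ax, @data
        mov ds, ax
        mov cx, 3               ; show the string three times
again:
        mov ah, 9               ; function 9 = show string
        mov dx, offset note
        int 21h
        dec cx                  ; one fewer repetition remaining
        jz terminate            ; counter reached zero: go end the program
        jmp again               ; otherwise repeat
terminate:
        mov ax, 4C00h
        int 21h
main endp
end main
```

Here "terminate" serves as the single exit point that the loop jumps to once the counter runs out.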
The next line "mov ax, 4C00h" loads the AH register with 4Ch and the AL register with 00.
The next line calls DOS again. This time the AH register holds the value 4Ch. When DOS receives this it returns control to COMMAND.COM and considers the program "ended" meaning that the RAM it occupied is now considered available for the next program that executes. Function 4Ch means "End Program."
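With function 4Ch, the value left in AL is handed back to the parent program as the exit code. The sketch below ends with an exit code of 1 instead of 0. The choice of 1 is arbitrary and made only for illustration.

```asm
.model tiny
.stack 200h
.data
.code
main proc
        mov ax, @data
        mov ds, ax
terminate:
        mov ah, 4Ch             ; function 4C = quit program
        mov al, 1               ; exit code returned to the parent (here 1)
        int 21h
main endp
end main
```

A batch file that runs this program could then test the result with a line such as "if errorlevel 1 echo program reported a problem".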
Save the changes and exit EDIT with [Alt]+[F] then [X]. Now copy all ASM files to the floppy. If using QBOOT, the floppy is the B: drive:
K:\>copy *.asm B:
BLANK.ASM
HELLO.ASM
2 file(s) copied
K:\>_
Now it is time to compile the program. Having booted from QBOOT, the compiler software has been included in the PATH environmental variable. If it was installed onto the hard drive, the installation program will also include the PATH entry in the AUTOEXEC.BAT. The compiler path is [drive]:\TASM\BIN, which should be added to the PATH if invoking the compiler yields "Bad command or file name". First ask the compiler for help with "tasm /?". Note that the last switch, "/z", has three different states. In order to include full debug information it must read "/zi". Now invoke the compiler on the HELLO.ASM source file to include full debug information:
K:\>tasm /zi hello
Turbo Assembler Version 4.1 Copyright (c) 1988, 1996 Borland International
Assembling file: hello.ASM
Error messages: None
Warning messages: None
Passes: 1
Remaining memory: 336k
K:\>_
If any error messages occur, they include the line number. Open HELLO.ASM in EDIT and make the necessary changes, then try the compiler again.
TASM.EXE is a multi-pass structured compiler, so if items like variables are mentioned before they are declared, the compiler will watch for the declaration later in the file and will complain only if the item is never declared. This shows off its "structured" capability. Some compilers, like Turbo PASCAL, are "top down" compilers in which variables MUST be declared before they are referred to in an instruction. Because ASM source can refer to external "include" files, items may be referenced that are declared in an include file. The compiler can open the includes and even execute conditional compiler directives, all the while watching for declarations of previously mentioned items, thus demonstrating its multi-pass capability. In this case everything was resolved in one pass through the source with no errors.
However, TASM.EXE did not create an executable file. It created an OBJECT file, which is compiled into machine language code but is not yet in fully executable form. Also note that the file's .ASM extension does not have to be specified, because TASM "knows" the file type it compiles. Any text file can be fed to it, but if its name does not end in .ASM, then the name must be fully specified (e.g. hello.txt). Check the root:
K:\>dir
Volume in drive K is MS-RAMDRIVE
Directory of K:\
HELLO ASM 230 11-07-06 10:09a
HELLO OBJ 481 11-07-06 10:19a
BLANK ASM 145 11-07-06 10:19a
3 file(s) 856 bytes
4,175,872 bytes free
K:\>_
Converting *.OBJ files into executables requires the "linker" tool. The reason for this is that very large programs can be built by linking smaller *.OBJ files with this tool. It allows large reusable libraries of functions to be built and precompiled by the programmer and then linked into other programs as needed, dramatically reducing total programming time by eliminating the need to create the same functions over and over again. To retain full debug information the switch is "/v":
K:\>tlink /v hello
Turbo Link Version 7.1.30.1. Copyright (c) 1987, 1996 Borland International
K:\>_
The linker announces no errors, and it knows that it wants to link an OBJ file. A second name can be supplied after the OBJ name, separated from it by a comma, and the executable will be given this second name. Since a second name was not provided, the linker assumes that the executable should have the same name as the OBJ. Check:
K:\>dir
Volume in drive K is MS-RAMDRIVE
Directory of K:\
HELLO ASM 230 11-07-06 10:09a
HELLO OBJ 481 11-07-06 10:19a
BLANK ASM 145 11-07-06 10:19a
HELLO MAP 232 11-07-06 10:25a
HELLO EXE 1,747 11-07-06 10:25a
5 file(s) 2,835 bytes
4,171,776 bytes free
K:\>_
Also note that the linker has created a *.MAP file. This is used by the full debugger to trace through the program one instruction at a time in order to find logic bugs. This tool will be demonstrated later. Now execute "HELLO.EXE":
K:\>hello
Hello World!
K:\>_
In the next exercise a program will be developed that can take input from the user and then display it on screen.
Copyright©2000-2006 Brian Robinson ALL RIGHTS RESERVED