AS.9
- .\" unchecked (kjb)
- .CD "as \(en assembler"
- .SE "AS\(emASSEMBLER [IBM]"
- .SP 1
- .PP
- This document describes the language accepted by the 80386 assembler
- that is part of the Amsterdam Compiler Kit. Only the syntax is
- described; just a few 386 instructions are shown as examples.
- .SS "Tokens, Numbers, Character Constants, and Strings"
- .PP
- The syntax of numbers is the same as in C.
- The constants 32, 040, and 0x20 all represent the same number, but are
- written in decimal, octal, and hex, respectively.
- The rules for character constants and strings are also the same as in C.
- For example, \(fma\(fm is a character constant.
- A typical string is "string".
- Expressions may be formed with C operators, but must use [ and ] for
- parentheses. (Normal parentheses are claimed by the operand syntax.)
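The number syntax described above can be illustrated with a small Python sketch. The helper name parse_number is hypothetical and is not part of the assembler; it only shows how the three spellings of the same constant are read under the C rules.

```python
def parse_number(text: str) -> int:
    """Read a C-style integer constant: decimal, octal (leading 0), or hex (0x)."""
    lowered = text.lower()
    if lowered.startswith("0x"):
        return int(lowered[2:], 16)       # hex, e.g. 0x20
    if lowered.startswith("0") and len(lowered) > 1:
        return int(lowered[1:], 8)        # octal, e.g. 040
    return int(lowered, 10)               # decimal, e.g. 32


# The three constants from the text all denote the same number.
values = [parse_number(s) for s in ("32", "040", "0x20")]
print(values)  # prints [32, 32, 32]
```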
- .SS "Symbols"
- .PP
- Symbols contain letters and digits, as well as three special characters:
- dot, tilde, and underscore.
- The first character may not be a digit or tilde.
- .PP
- The names of the 80386 registers are reserved. These are:
- .HS
- ~~~al, bl, cl, dl
- .br
- ~~~ah, bh, ch, dh
- .br
- ~~~ax, bx, cx, dx, eax, ebx, ecx, edx
- .br
- ~~~si, di, bp, sp, esi, edi, ebp, esp
- .br
- ~~~cs, ds, ss, es, fs, gs
- .HS
- The xx and exx variants of the eight general registers are treated as
- synonyms by the assembler. Normally "ax" is the 16-bit low half of the
- 32-bit "eax" register. The assembler determines whether a 16-bit or 32-bit
- operation is meant solely by looking at the instruction or the
- instruction prefixes. It is, however, best to use the proper registers
- when writing assembly so as not to confuse those who read the code.
- .HS
- The last group of six segment registers is used for selector + offset mode
- addressing, in which the effective address is at a given offset in one of
- the six segments.
- .PP
- Names of instructions and pseudo-ops are not reserved.
- Alphabetic characters in opcodes and pseudo-ops must be in lower case.
- .SS "Separators"
- .PP
- Commas, blanks, and tabs are separators and can be interspersed freely
- between tokens, but not within tokens.
- Commas are only legal between operands.
- .SS "Comments"
- .PP
- The comment character is \*(OQ!\*(CQ.
- The rest of the line is ignored.
- .SS "Opcodes"
- .PP
- The opcodes are listed below.
- Notes: (1) Different names for the same instruction are separated by \*(OQ/\*(CQ.
- (2) Square brackets ([]) indicate that 0 or 1 of the enclosed characters
- can be included.
- (3) Curly brackets ({}) work similarly, except that one of the
- enclosed characters \fImust\fR be included.
- Thus square brackets indicate an option, whereas curly brackets indicate
- that a choice must be made.
- .sp
- .if t .ta 0.25i 1.2i 3i
- .if n .ta 2 10 24
- .nf
- .B "Data Transfer"
- .HS
- mov[b] dest, source ! Move word/byte from source to dest
- pop dest ! Pop stack
- push source ! Push stack
- xchg[b] op1, op2 ! Exchange word/byte
- xlat ! Translate
- o16 ! Operate on a 16 bit object instead of 32 bit
- .B "Input/Output"
- .HS
- in[b] source ! Input from source I/O port
- in[b] ! Input from DX I/O port
- out[b] dest ! Output to dest I/O port
- out[b] ! Output to DX I/O port
- .B "Address Object"
- .HS
- lds reg,source ! Load reg and DS from source
- les reg,source ! Load reg and ES from source
- lea reg,source ! Load effective address of source into reg
- {cdsefg}seg ! Specify seg register for next instruction
- a16 ! Use 16 bit addressing mode instead of 32 bit
- .B "Flag Transfer"
- .HS
- lahf ! Load AH from flag register
- popf ! Pop flags
- pushf ! Push flags
- sahf ! Store AH in flag register
- .B "Addition"
- .HS
- aaa ! Adjust result of BCD addition
- add[b] dest,source ! Add
- adc[b] dest,source ! Add with carry
- daa ! Decimal Adjust after addition
- inc[b] dest ! Increment by 1
- .B "Subtraction"
- .HS
- aas ! Adjust result of BCD subtraction
- sub[b] dest,source ! Subtract
- sbb[b] dest,source ! Subtract with borrow from dest
- das ! Decimal adjust after subtraction
- dec[b] dest ! Decrement by one
- neg[b] dest ! Negate
- cmp[b] dest,source ! Compare
- .B "Multiplication"
- .HS
- aam ! Adjust result of BCD multiply
- imul[b] source ! Signed multiply
- mul[b] source ! Unsigned multiply
- .B "Division"
- .HS
- aad ! Adjust AX for BCD division
- o16 cbw ! Sign extend AL into AH
- o16 cwd ! Sign extend AX into DX
- cwde ! Sign extend AX into EAX
- cdq ! Sign extend EAX into EDX
- idiv[b] source ! Signed divide
- div[b] source ! Unsigned divide
- .B "Logical"
- .HS
- and[b] dest,source ! Logical and
- not[b] dest ! Logical not
- or[b] dest,source ! Logical inclusive or
- test[b] dest,source ! Logical test
- xor[b] dest,source ! Logical exclusive or
- .B "Shift"
- .HS
- sal[b]/shl[b] dest,CL ! Shift logical left
- sar[b] dest,CL ! Shift arithmetic right
- shr[b] dest,CL ! Shift logical right
- .B "Rotate"
- .HS
- rcl[b] dest,CL ! Rotate left, with carry
- rcr[b] dest,CL ! Rotate right, with carry
- rol[b] dest,CL ! Rotate left
- ror[b] dest,CL ! Rotate right
- .B "String Manipulation"
- .HS
- cmps[b] ! Compare string element ds:esi with es:edi
- lods[b] ! Load from ds:esi into AL, AX, or EAX
- movs[b] ! Move from ds:esi to es:edi
- rep ! Repeat next instruction until ECX=0
- repe/repz ! Repeat next instruction while ECX!=0 and ZF=1
- repne/repnz ! Repeat next instruction while ECX!=0 and ZF=0
- scas[b] ! Compare es:edi with AL/AX/EAX
- stos[b] ! Store AL/AX/EAX in es:edi
- .fi
- .B "Control Transfer"
- .PP
- \fIAs\fR accepts a number of special jump opcodes that can assemble to
- either an instruction with a byte displacement, which can only reach targets
- within \(mi126 to +129 bytes of the branch, or an instruction with a 32-bit
- displacement. The assembler automatically chooses between the byte and
- 32-bit displacement forms.
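The reach quoted above follows from the encoding: a signed byte displacement spans \(mi128 to +127 and is counted from the end of the branch, and the byte form of a jump is two bytes long (opcode plus displacement). The Python sketch below reproduces that arithmetic. The names short_reach and pick_jump_form are hypothetical, and the sketch assumes a plain two-byte branch without prefixes; it is not the assembler's actual selection code.

```python
SHORT_LENGTH = 2  # assumed size of the byte-displacement form: opcode + disp8


def short_reach(length: int = SHORT_LENGTH) -> tuple[int, int]:
    """Reach of the byte form, measured from the first byte of the branch."""
    return (-128 + length, 127 + length)


def pick_jump_form(branch_addr: int, target_addr: int) -> str:
    """Return which displacement size would be needed for this branch."""
    low, high = short_reach()
    distance = target_addr - branch_addr
    return "byte" if low <= distance <= high else "32-bit"


print(short_reach())  # prints (-126, 129)
```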
- .PP
- The English translation of the opcodes should be obvious, with
- \*(OQl(ess)\*(CQ and \*(OQg(reater)\*(CQ for signed comparisons, and
- \*(OQb(elow)\*(CQ and \*(OQa(bove)\*(CQ for unsigned comparisons. There are
- lots of synonyms to allow you to write "jump if not that" instead of "jump
- if this".
- .PP
- The \*(OQcall\*(CQ, \*(OQjmp\*(CQ, and \*(OQret\*(CQ instructions can be
- either intrasegment or
- intersegment. The intersegment versions are indicated with
- the suffix \*(OQf\*(CQ.
- .if t .ta 0.25i 1.2i 3i
- .if n .ta 2 10 24
- .nf
- .B Unconditional
- .HS
- jmp[f] dest ! jump to dest (8 or 32-bit displacement)
- call[f] dest ! call procedure
- ret[f] ! return from procedure
- .B "Conditional"
- .HS
- ja/jnbe ! if above/not below or equal (unsigned)
- jae/jnb/jnc ! if above or equal/not below/not carry (uns.)
- jb/jnae/jc ! if below/not above nor equal/carry (unsigned)
- jbe/jna ! if below or equal/not above (unsigned)
- jg/jnle ! if greater/not less nor equal (signed)
- jge/jnl ! if greater or equal/not less (signed)
- jl/jnge ! if less/not greater nor equal (signed)
- jle/jng ! if less or equal/not greater (signed)
- je/jz ! if equal/zero
- jne/jnz ! if not equal/not zero
- jno ! if overflow not set
- jo ! if overflow set
- jnp/jpo ! if parity not set/parity odd
- jp/jpe ! if parity set/parity even
- jns ! if sign not set
- js ! if sign set
- .B "Iteration Control"
- .HS
- jcxz dest ! jump if ECX = 0
- loop dest ! Decrement ECX and jump if ECX != 0
- loope/loopz dest ! Decrement ECX and jump if ECX != 0 and ZF = 1
- loopne/loopnz dest ! Decrement ECX and jump if ECX != 0 and ZF = 0
- .B "Interrupt"
- .HS
- int n ! Software interrupt n
- into ! Interrupt if overflow set
- iretd ! Return from interrupt
- .B "Flag Operations"
- .HS
- clc ! Clear carry flag
- cld ! Clear direction flag
- cli ! Clear interrupt enable flag
- cmc ! Complement carry flag
- stc ! Set carry flag
- std ! Set direction flag
- sti ! Set interrupt enable flag
- .fi
- .SS "Location Counter"
- .PP
- The special symbol \*(OQ.\*(CQ is the location counter. Its value
- is the address of the first byte of the instruction in which the symbol
- appears, and it can be used in expressions.
- .SS "Segments"
- .PP
- There are four different assembly segments: text, rom, data and bss.
- Segments are declared and selected by the \fI.sect\fR pseudo-op. It is
- customary to declare all segments at the top of an assembly file like
- this:
- .HS
- ~~~.sect .text; .sect .rom; .sect .data; .sect .bss
- .HS
- The assembler accepts up to 16 different segments, but
- .MX
- expects only four to be used. Anything can in principle be assembled
- into any segment, but the
- .MX
- bss segment may only contain uninitialized data.
- Note that the \*(OQ.\*(CQ symbol refers to the location in the current
- segment.
- .SS "Labels"
- .PP
- There are two types: name and numeric. Name labels consist of a name
- followed by a colon (:).
- .PP
- The numeric labels are single digits. The nearest 0: label may be
- referenced as 0f in the forward direction, or 0b backwards.
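How a reference such as 0f or 0b picks "the nearest" label can be sketched in Python. The function resolve_numeric and the use of statement indices are hypothetical illustration only. The sketch also assumes that a label on the same statement as the reference counts as backward, which the text does not specify.

```python
def resolve_numeric(labels, ref_index, digit, direction):
    """Find the statement index of the nearest numeric label.

    labels    -- list of (digit, statement_index) pairs, one per definition
    ref_index -- statement index of the instruction making the reference
    direction -- "f" for forward (0f) or "b" for backward (0b)
    """
    if direction == "b":
        # Assumption: a label at the referencing statement counts as backward.
        found = max((i for d, i in labels if d == digit and i <= ref_index),
                    default=None)
    else:
        found = min((i for d, i in labels if d == digit and i > ref_index),
                    default=None)
    if found is None:
        raise LookupError(f"no label {digit}: in direction {direction!r}")
    return found


# The same digit may be defined several times; each reference binds locally.
labels = [(0, 2), (1, 5), (0, 9)]
print(resolve_numeric(labels, 6, 0, "b"))  # prints 2
print(resolve_numeric(labels, 6, 0, "f"))  # prints 9
```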
- .SS "Statement Syntax"
- .PP
- Each line consists of a single statement.
- Blank or comment lines are allowed.
- .SS "Instruction Statements"
- .PP
- The most general form of an instruction is
- .HS
- ~~~label: opcode operand1, operand2 ! comment
- .HS
- .SS "Expression Semantics"
- .PP
- .tr ~~
- The following operators can be used:
- + \(mi * / & | ^ ~ << (shift left) >> (shift right) \(mi (unary minus).
- .tr ~
- 32-bit integer arithmetic is used.
- Division produces a truncated quotient.
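The two rules above can be mimicked in Python, whose integers are unbounded and whose // operator rounds toward negative infinity. This is only a sketch under two assumptions that the text does not state: values are treated as signed 32-bit quantities that wrap around, and "truncated" means rounding toward zero as in C. The helper names are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def to_signed32(value: int) -> int:
    """Wrap an integer into the signed 32-bit range (assumed behaviour)."""
    value &= MASK32
    return value - (1 << 32) if value & 0x80000000 else value


def div_trunc(a: int, b: int) -> int:
    """Divide and round toward zero (assumed meaning of 'truncated')."""
    quotient = abs(a) // abs(b)
    return to_signed32(-quotient if (a < 0) != (b < 0) else quotient)


print(div_trunc(7, 2))    # prints 3
print(div_trunc(-7, 2))   # prints -3
```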
- .SS "Addressing Modes"
- .PP
- Below is a list of the addressing modes supported.
- Each one is followed by an example.
- .HS
- .ta 0.25i 3i
- .nf
- constant mov eax, 123456
- direct access mov eax, (counter)
- register mov eax, esi
- indirect mov eax, (esi)
- base + disp. mov eax, 6(ebp)
- scaled index mov eax, (4*esi)
- base + index mov eax, (ebp)(2*esi)
- base + index + disp. mov eax, 10(edi)(1*esi)
- .HS
- .fi
- Any of the constants or symbols may be replaced by expressions. Direct
- access, constants and displacements may be any type of expression. A scaled
- index with scale 1 may be written without the \*(OQ1*\*(CQ.
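The memory operands in the list above all reduce to the sum displacement + base + scale * index. The Python sketch below evaluates that sum for two of the examples. The function effective_address and the register values are made up for illustration, and the check on the scale reflects the factors 1, 2, 4 and 8 that the 80386 supports.

```python
def effective_address(disp: int = 0, base: int = 0,
                      index: int = 0, scale: int = 1) -> int:
    """Compute disp + base + scale*index, wrapped to 32 bits."""
    if scale not in (1, 2, 4, 8):
        raise ValueError("80386 scale factor must be 1, 2, 4 or 8")
    return (disp + base + scale * index) & 0xFFFFFFFF


# Hypothetical register contents, chosen only for the example.
regs = {"ebp": 0x2000, "edi": 0x1000, "esi": 4}

# mov eax, (ebp)(2*esi)
print(hex(effective_address(base=regs["ebp"], index=regs["esi"], scale=2)))
# prints 0x2008

# mov eax, 10(edi)(1*esi)
print(hex(effective_address(disp=10, base=regs["edi"], index=regs["esi"])))
# prints 0x100e
```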
- .SS "Call and Jmp"
- .PP
- The \*(OQcall\*(CQ and \*(OQjmp\*(CQ instructions can be interpreted
- as a load into the instruction pointer.
- .HS
- .ta 0.25i 3i
- .nf
- call _routine ! Direct, intrasegment
- call (subloc) ! Indirect, intrasegment
- call 6(ebp) ! Indirect, intrasegment
- call ebx ! Direct, intrasegment
- call (ebx) ! Indirect, intrasegment
- callf (subloc) ! Indirect, intersegment
- callf seg:offs ! Direct, intersegment
- .HS
- .fi
- .SP 1
- .SS "Symbol Assignment"
- .SP 1
- .PP
- Symbols can acquire values in one of two ways.
- Using a symbol as a label sets it to \*(OQ.\*(CQ for the current
- segment with type relocatable.
- Alternatively, a symbol may be given a value via an assignment of the form
- .HS
- ~~~symbol = expression
- .HS
- in which the symbol is assigned the value and type of the expression.
- .SP 1
- .SS "Storage Allocation"
- .SP 1
- .PP
- Space can be reserved for bytes, words, and longs using pseudo-ops.
- They take one or more operands, and for each generate a value
- whose size is a byte, word (2 bytes) or long (4 bytes). For example:
- .HS
- .if t .ta 0.25i 3i
- .if n .ta 2 24
- .data1 2, 6 ! allocate 2 bytes initialized to 2 and 6
- .br
- .data2 3, 0x10 ! allocate 2 words initialized to 3 and 16
- .br
- .data4 010 ! allocate a longword initialized to 8
- .br
- .space 40 ! allocates 40 bytes of zeros
- .HS
- This allocates 50 (decimal) bytes of storage, initializing the first two
- bytes to 2 and 6, the next two words to 3 and 16, then one longword with
- value 8 (010 octal), and finally 40 bytes of zeros.
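The byte count can be checked with a short Python sketch that builds the same 50 bytes using the standard struct module. It assumes the little-endian byte order of the 80386 and that the assembler inserts no padding between the four pseudo-ops, which is consistent with the total given above.

```python
import struct

image = b"".join([
    struct.pack("<2B", 2, 6),      # .data1 2, 6     -> 2 bytes
    struct.pack("<2H", 3, 0x10),   # .data2 3, 0x10  -> 2 words, 4 bytes
    struct.pack("<I", 0o10),       # .data4 010      -> 1 longword, 4 bytes
    bytes(40),                     # .space 40       -> 40 zero bytes
])

print(len(image))        # prints 50
print(image[:10].hex())  # prints 02060300100008000000
```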
- .SS "String Allocation"
- .PP
- The pseudo-ops \fI.ascii\fR and \fI.asciz\fR
- take one string argument and generate the ASCII character
- codes for the letters in the string.
- The latter automatically terminates the string with a null (0) byte.
- For example,
- .HS
- ~~~.ascii "hello"
- .br
- ~~~.asciz "world\en"
- .HS
- .SS "Alignment"
- .PP
- Sometimes it is necessary to force the next item to begin at a word, longword
- or even a 16 byte address boundary.
- The \fI.align\fR pseudo-op inserts zero or more null bytes until the current
- location is a multiple of the argument of .align.
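The number of null bytes needed to reach the next boundary is a one-line computation, sketched here in Python. The name align_padding is hypothetical; the sketch shows only the arithmetic, not how the assembler implements it.

```python
def align_padding(location: int, boundary: int) -> int:
    """Null bytes needed so that location becomes a multiple of boundary."""
    return -location % boundary


print(align_padding(5, 4))    # prints 3
print(align_padding(8, 4))    # prints 0
print(align_padding(17, 16))  # prints 15
```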
- .SS "Segment Control"
- .PP
- Every item assembled goes in one of the four segments: text, rom, data,
- or bss. By using the \fI.sect\fR pseudo-op with argument
- \fI.text, .rom, .data\fR or \fI.bss\fR, the programmer can force the
- next items to go in a particular segment.
- .SS "External Names"
- .PP
- A symbol can be given global scope by including it in a \fI.define\fR pseudo-op.
- Multiple names may be listed, separated by commas.
- It must be used to export symbols defined in the current program.
- Names not defined in the current program are treated as "undefined
- external" automatically, although it is customary to make this explicit
- with the \fI.extern\fR pseudo-op.
- .SS "Common"
- .PP
- The \fI.comm\fR pseudo-op declares storage that can be common to more than
- one module. There are two arguments: a name and an absolute expression giving
- the size in bytes of the area named by the symbol.
- The type of the symbol becomes
- external. The statement can appear in any segment.
- If you think this has something to do with FORTRAN, you are right.
- .SS "Examples"
- .PP
- In the kernel directory, there are several assembly code files that are
- worth inspecting as examples.
- However, note that these files are designed to first be
- run through the C preprocessor. (The very first character is a # to signal
- this.) Thus they contain numerous constructs
- that are not pure assembler.
- For true assembler examples, compile any C program provided with
- .MX
- using the \fB\(enS\fR flag.
- This will result in an assembly language file with the same
- name as the C source file, but ending with the .s suffix.