AS.9
- Command: as - assembler
- AS----ASSEMBLER [IBM]
- This document describes the language accepted by the 80386
- assembler that is part of the Amsterdam Compiler Kit. Note that only
- the syntax is described; only a few 386 instructions are shown as
- examples.
- Tokens, Numbers, Character Constants, and Strings
- The syntax of numbers is the same as in C. The constants 32, 040,
- and 0x20 all represent the same number, but are written in decimal,
- octal, and hex, respectively. The rules for character constants and
- strings are also the same as in C. For example, 'a' is a character
- constant. A typical string is "string". Expressions may be formed with
- C operators, but must use [ and ] for parentheses. (Normal parentheses
- are claimed by the operand syntax.)
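
As a rough illustration of the C number syntax described above, the following Python sketch converts decimal, octal, and hex literals to integers. The helper name parse_c_int is hypothetical and is not part of the assembler; it only mirrors the rule that a leading 0x means hex and a leading 0 means octal.

```python
def parse_c_int(text):
    """Convert a C-style integer literal (decimal, octal, or hex) to an int."""
    lowered = text.lower()
    if lowered.startswith("0x"):
        return int(lowered[2:], 16)   # hex, e.g. 0x20
    if len(lowered) > 1 and lowered.startswith("0"):
        return int(lowered[1:], 8)    # octal, e.g. 040
    return int(lowered, 10)           # decimal, e.g. 32


# The three spellings from the text should all denote the same value.
values = [parse_c_int(s) for s in ("32", "040", "0x20")]
```

Python's own int(text, 0) rejects "040", which is why the octal case is handled by hand here.
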
- Symbols
- Symbols contain letters and digits, as well as three special
- characters: dot, tilde, and underscore. The first character may not be
- a digit or tilde.
- The names of the 80386 registers are reserved. These are:
- al, bl, cl, dl
- ah, bh, ch, dh
- ax, bx, cx, dx, eax, ebx, ecx, edx
- si, di, bp, sp, esi, edi, ebp, esp
- cs, ds, ss, es, fs, gs
- The xx and exx variants of the eight general registers are treated as
- synonyms by the assembler. Normally "ax" is the 16-bit low half of the
- 32-bit "eax" register. The assembler determines if a 16 or 32 bit
- operation is meant solely by looking at the instruction or the
- instruction prefixes. It is, however, best to use the proper registers
- when writing assembly so as not to confuse those who read the code.
- The last group of 6 segment registers is used for selector + offset
- mode addressing, in which the effective address is at a given offset in
- one of the 6 segments.
- Names of instructions and pseudo-ops are not reserved. Alphabetic
- characters in opcodes and pseudo-ops must be in lower case.
- Separators
- Commas, blanks, and tabs are separators and can be interspersed
- freely between tokens, but not within tokens. Commas are only legal
- between operands.
- Comments
- The comment character is '!'. The rest of the line is ignored.
- Opcodes
- The opcodes are listed below. Notes: (1) Different names for the
- same instruction are separated by '/'. (2) Square brackets ([])
- indicate that 0 or 1 of the enclosed characters can be included. (3)
- Curly brackets ({}) work similarly, except that one of the enclosed
- characters must be included. Thus square brackets indicate an option,
- whereas curly brackets indicate that a choice must be made.
- Data Transfer
- mov[b] dest, source ! Move word/byte from source to dest
- pop dest ! Pop stack
- push source ! Push stack
- xchg[b] op1, op2 ! Exchange word/byte
- xlat ! Translate
- o16 ! Operate on a 16 bit object instead of 32 bit
- Input/Output
- in[b] source ! Input from source I/O port
- in[b] ! Input from DX I/O port
- out[b] dest ! Output to dest I/O port
- out[b] ! Output to DX I/O port
- Address Object
- lds reg,source ! Load reg and DS from source
- les reg,source ! Load reg and ES from source
- lea reg,source ! Load effective address of source into reg
- {cdsefg}seg ! Specify seg register for next instruction
- a16 ! Use 16 bit addressing mode instead of 32 bit
- Flag Transfer
- lahf ! Load AH from flag register
- popf ! Pop flags
- pushf ! Push flags
- sahf ! Store AH in flag register
- Addition
- aaa ! Adjust result of BCD addition
- add[b] dest,source ! Add
- adc[b] dest,source ! Add with carry
- daa ! Decimal Adjust after addition
- inc[b] dest ! Increment by 1
- Subtraction
- aas ! Adjust result of BCD subtraction
- sub[b] dest,source ! Subtract
- sbb[b] dest,source ! Subtract with borrow from dest
- das ! Decimal adjust after subtraction
- dec[b] dest ! Decrement by one
- neg[b] dest ! Negate
- cmp[b] dest,source ! Compare
- Multiplication
- aam ! Adjust result of BCD multiply
- imul[b] source ! Signed multiply
- mul[b] source ! Unsigned multiply
- Division
- aad ! Adjust AX for BCD division
- o16 cbw ! Sign extend AL into AX
- o16 cwd ! Sign extend AX into DX:AX
- cwde ! Sign extend AX into EAX
- cdq ! Sign extend EAX into EDX:EAX
- idiv[b] source ! Signed divide
- div[b] source ! Unsigned divide
- Logical
- and[b] dest,source ! Logical and
- not[b] dest ! Logical not
- or[b] dest,source ! Logical inclusive or
- test[b] dest,source ! Logical test
- xor[b] dest,source ! Logical exclusive or
- Shift
- sal[b]/shl[b] dest,CL ! Shift logical left
- sar[b] dest,CL ! Shift arithmetic right
- shr[b] dest,CL ! Shift logical right
- Rotate
- rcl[b] dest,CL ! Rotate left, with carry
- rcr[b] dest,CL ! Rotate right, with carry
- rol[b] dest,CL ! Rotate left
- ror[b] dest,CL ! Rotate right
- String Manipulation
- cmps[b] ! Compare string element ds:esi with es:edi
- lods[b] ! Load from ds:esi into AL, AX, or EAX
- movs[b] ! Move from ds:esi to es:edi
- rep ! Repeat next instruction until ECX=0
- repe/repz ! Repeat next instruction while ECX!=0 and ZF=1
- repne/repnz ! Repeat next instruction while ECX!=0 and ZF=0
- scas[b] ! Compare es:edi with AL/AX/EAX
- stos[b] ! Store AL/AX/EAX in es:edi
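
To make the repeat prefixes concrete, here is a minimal Python model of how the 80386 executes a "repe cmpsb" sequence: the comparison is repeated while ECX is nonzero and the last pair of bytes matched. The function name and the use of byte strings in place of memory are illustrative assumptions, and the direction flag is assumed to be clear.

```python
def repe_cmpsb(src, dst, ecx):
    """Model `repe cmpsb`; return (ecx, zf) as they stand when it stops."""
    i = 0
    zf = True  # assumption: stands in for whatever ZF held if ECX starts at 0
    while ecx != 0:
        zf = src[i] == dst[i]   # cmpsb sets ZF when the two bytes are equal
        i += 1                  # esi and edi advance by one byte
        ecx -= 1                # the prefix decrements ECX after each step
        if not zf:              # repe/repz stops at the first mismatch
            break
    return ecx, zf


equal_run = repe_cmpsb(b"abcd", b"abcd", 4)   # runs until ECX reaches 0
mismatch = repe_cmpsb(b"abXd", b"abYd", 4)    # stops early at the third byte
```

After the loop, ZF tells whether the last comparison matched, and ECX tells how many elements were left unexamined.
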
- Control Transfer
- As accepts a number of special jump opcodes that can assemble to
- instructions with either a byte displacement, which can only reach to
- targets within -126 to +129 bytes of the branch, or an instruction with
- a 32-bit displacement. The assembler automatically chooses a byte or
- 32-bit displacement instruction.
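
The quoted range follows from the encoding: the short form is two bytes long and its signed 8-bit displacement is counted from the end of the instruction. The Python sketch below shows the kind of check involved; the function name is hypothetical and this is not the assembler's actual code.

```python
SHORT_JUMP_LENGTH = 2  # one opcode byte plus one signed displacement byte


def fits_byte_displacement(branch_addr, target_addr):
    """Return True if the target is reachable with the short (byte) form."""
    # The displacement is relative to the end of the 2-byte instruction.
    disp = target_addr - (branch_addr + SHORT_JUMP_LENGTH)
    return -128 <= disp <= 127


# Distances are measured from the first byte of the branch.
reach_back = fits_byte_displacement(1000, 1000 - 126)     # True
too_far_back = fits_byte_displacement(1000, 1000 - 127)   # False
```
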
- The English translation of the opcodes should be obvious, with
- 'l(ess)' and 'g(reater)' for signed comparisons, and 'b(elow)' and
- 'a(bove)' for unsigned comparisons. There are lots of synonyms to
- allow you to write "jump if not that" instead of "jump if this".
- The 'call', 'jmp', and 'ret' instructions can be either
- intrasegment or intersegment. The intersegment versions are indicated
- with the suffix 'f'.
- Unconditional
- jmp[f] dest ! jump to dest (8 or 32-bit displacement)
- call[f] dest ! call procedure
- ret[f] ! return from procedure
- Conditional
- ja/jnbe ! if above/not below or equal (unsigned)
- jae/jnb/jnc ! if above or equal/not below/not carry (uns.)
- jb/jnae/jc ! if below/not above nor equal/carry (unsigned)
- jbe/jna ! if below or equal/not above (unsigned)
- jg/jnle ! if greater/not less nor equal (signed)
- jge/jnl ! if greater or equal/not less (signed)
- jl/jnge ! if less/not greater nor equal (signed)
- jle/jng ! if less or equal/not greater (signed)
- je/jz ! if equal/zero
- jne/jnz ! if not equal/not zero
- jno ! if overflow not set
- jo ! if overflow set
- jnp/jpo ! if parity not set/parity odd
- jp/jpe ! if parity set/parity even
- jns ! if sign not set
- js ! if sign set
- Iteration Control
- jcxz dest ! jump if ECX = 0
- loop dest ! Decrement ECX and jump if ECX != 0
- loope/loopz dest ! Decrement ECX and jump if ECX != 0 and ZF = 1
- loopne/loopnz dest ! Decrement ECX and jump if ECX != 0 and ZF = 0
- Interrupt
- int n ! Software interrupt n
- into ! Interrupt if overflow set
- iretd ! Return from interrupt
- Flag Operations
- clc ! Clear carry flag
- cld ! Clear direction flag
- cli ! Clear interrupt enable flag
- cmc ! Complement carry flag
- stc ! Set carry flag
- std ! Set direction flag
- sti ! Set interrupt enable flag
- Location Counter
- The special symbol '.' is the location counter. Its value is the
- address of the first byte of the instruction in which the symbol
- appears, and it can be used in expressions.
- Segments
- There are four different assembly segments: text, rom, data and
- bss. Segments are declared and selected by the .sect pseudo-op. It is
- customary to declare all segments at the top of an assembly file like
- this:
- .sect .text; .sect .rom; .sect .data; .sect .bss
- The assembler accepts up to 16 different segments, but MINIX expects
- only four to be used. Anything can in principle be assembled into any
- segment, but the MINIX bss segment may only contain uninitialized data.
- Note that the '.' symbol refers to the location in the current segment.
- Labels
- There are two types: name and numeric. Name labels consist of a
- name followed by a colon (:).
- The numeric labels are single digits. The nearest 0: label may be
- referenced as 0f in the forward direction, or 0b backwards.
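
The "nearest label" rule can be sketched as a search over the positions at which a given digit label occurs. The Python below is only a model with hypothetical names; in particular, treating a label on the referencing line itself as a backward target is an assumption, not something the text states.

```python
import bisect


def resolve_numeric(label_lines, ref_line, direction):
    """Find the line of the nearest numeric label as seen from ref_line.

    label_lines: sorted line numbers on which the same digit label occurs.
    direction:   "f" for a forward reference (0f), "b" for backward (0b).
    Returns None when no label exists in that direction.
    """
    pos = bisect.bisect_right(label_lines, ref_line)
    if direction == "f":
        return label_lines[pos] if pos < len(label_lines) else None
    # Assumption: a label on the referencing line counts as backward.
    return label_lines[pos - 1] if pos > 0 else None


zero_labels = [3, 10, 20]                         # lines carrying a "0:" label
forward = resolve_numeric(zero_labels, 12, "f")   # nearest 0: after line 12
backward = resolve_numeric(zero_labels, 12, "b")  # nearest 0: before line 12
```

Because the same digit may be reused many times, a reference always binds to the closest definition in the requested direction rather than to a unique symbol.
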
- Statement Syntax
- Each line consists of a single statement. Blank or comment lines
- are allowed.
- Instruction Statements
- The most general form of an instruction is
- label: opcode operand1, operand2 ! comment
- Expression Semantics
- The following operators can be used: + - * / & | ^ ~ << (shift
- left) >> (shift right) - (unary minus). 32-bit integer arithmetic is
- used. Division produces a truncated quotient.
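
A small Python model of these semantics follows. It assumes that values are treated as signed 32-bit quantities and that "truncated" means rounding toward zero, as in C; neither detail is spelled out above, and the helper names are hypothetical.

```python
def wrap32(value):
    """Reduce an integer to a signed 32-bit value (two's complement)."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value & 0x80000000 else value


def div_trunc(a, b):
    """Divide and truncate toward zero, unlike Python's flooring `//`."""
    q = abs(a) // abs(b)
    return wrap32(q if (a < 0) == (b < 0) else -q)


overflowed = wrap32(0x7FFFFFFF + 1)   # wraps around to the most negative value
quotient = div_trunc(-7, 2)           # -3 under truncation, not -4
```
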
- Addressing Modes
- Below is a list of the addressing modes supported. Each one is
- followed by an example.
- constant mov eax, 123456
- direct access mov eax, (counter)
- register mov eax, esi
- indirect mov eax, (esi)
- base + disp. mov eax, 6(ebp)
- scaled index mov eax, (4*esi)
- base + index mov eax, (ebp)(2*esi)
- base + index + disp. mov eax, 10(edi)(1*esi)
- Any of the constants or symbols may be replaced by expressions.
- Direct access, constants and displacements may be any type of
- expression. A scaled index with scale 1 may be written without the
- '1*'.
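
For the memory operand forms, the effective address is the sum of the displacement, the base register, and the scaled index register. The Python sketch below spells out that arithmetic; the function and the sample register values are illustrative assumptions.

```python
def effective_address(disp=0, base=0, index=0, scale=1):
    """Compute disp + base + scale*index, wrapped to 32 bits."""
    if scale not in (1, 2, 4, 8):
        raise ValueError("the 80386 only supports scales 1, 2, 4 and 8")
    return (disp + base + scale * index) & 0xFFFFFFFF


ebp, esi, edi = 0x1000, 3, 0x2000   # hypothetical register contents

base_disp = effective_address(disp=6, base=ebp)                # 6(ebp)
base_index = effective_address(base=ebp, index=esi, scale=2)   # (ebp)(2*esi)
full_form = effective_address(disp=10, base=edi, index=esi)    # 10(edi)(1*esi)
```
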
- Call and Jmp
- The 'call' and 'jmp' instructions can be interpreted as a load into
- the instruction pointer.
- call _routine ! Direct, intrasegment
- call (subloc) ! Indirect, intrasegment
- call 6(ebp) ! Indirect, intrasegment
- call ebx ! Direct, intrasegment
- call (ebx) ! Indirect, intrasegment
- callf (subloc) ! Indirect, intersegment
- callf seg:offs ! Direct, intersegment
- Symbol Assignment
- Symbols can acquire values in one of two ways. Using a symbol as a
- label sets it to '.' for the current segment with type relocatable.
- Alternatively, a symbol may be given a value via an assignment of the form
- symbol = expression
- in which the symbol is assigned the value and type of the expression.
- Storage Allocation
- Space can be reserved for bytes, words, and longs using pseudo-ops.
- They take one or more operands, and for each generate a value whose size
- is a byte, word (2 bytes) or long (4 bytes). For example:
- .data1 2, 6 ! allocate 2 bytes initialized to 2 and 6
- .data2 3, 0x10 ! allocate 2 words initialized to 3 and 16
- .data4 010 ! allocate a longword initialized to 8
- .space 40 ! allocates 40 bytes of zeros
- This allocates 50 (decimal) bytes of storage, initializing the first two
- bytes to 2 and 6, the next two words to 3 and 16, then one longword with
- value 8 (010 octal), and finally 40 bytes of zeros.
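
The byte count can be checked by building the same image by hand. The Python sketch below packs each value with the struct module in little-endian order, which is how the 80386 stores multi-byte values; it models the resulting bytes and is not output taken from the assembler.

```python
import struct

image = b"".join([
    struct.pack("<2B", 2, 6),      # .data1 2, 6     -> two bytes
    struct.pack("<2H", 3, 0x10),   # .data2 3, 0x10  -> two 2-byte words
    struct.pack("<I", 0o10),       # .data4 010      -> one 4-byte longword
    bytes(40),                     # .space 40       -> forty zero bytes
])

total = len(image)   # 2 + 4 + 4 + 40 bytes
```
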
- String Allocation
- The pseudo-ops .ascii and .asciz take one string argument and
- generate the ASCII character codes for the letters in the string. The
- latter automatically terminates the string with a null (0) byte. For
- example,
- .ascii "hello"
- .asciz "world\n"
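
The only difference between the two pseudo-ops is the trailing null byte. A minimal Python model, with hypothetical helper names, makes that visible:

```python
def ascii_bytes(text):
    """Bytes generated by .ascii: the character codes only."""
    return text.encode("ascii")


def asciz_bytes(text):
    """Bytes generated by .asciz: the character codes plus a null byte."""
    return text.encode("ascii") + b"\x00"


plain = ascii_bytes("hello")        # 5 bytes
terminated = asciz_bytes("hello")   # 6 bytes, the last one is 0
```
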
- Alignment
- Sometimes it is necessary to force the next item to begin at a
- word, longword or even a 16 byte address boundary. The .align pseudo-op
- inserts zero or more null bytes until the current location is a
- multiple of the argument of .align.
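
The amount of padding needed to reach a boundary is a one-line calculation. The Python sketch below shows it; the function name is hypothetical and the code models the arithmetic only, not the assembler itself.

```python
def align_padding(location, boundary):
    """Number of null bytes needed to reach the next multiple of boundary."""
    return -location % boundary


pad_to_long = align_padding(13, 4)    # 3 bytes bring location 13 up to 16
already_ok = align_padding(32, 16)    # 0 bytes, 32 is already a multiple of 16
```
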
- Segment Control
- Every item assembled goes in one of the four segments: text, rom,
- data, or bss. By using the .sect pseudo-op with argument .text, .rom,
- .data or .bss, the programmer can force the next items to go in a
- particular segment.
- External Names
- A symbol can be given global scope by including it in a .define
- pseudo-op. Multiple names may be listed, separated by commas. It must
- be used to export symbols defined in the current program. Names not
- defined in the current program are treated as "undefined external"
- automatically, although it is customary to make this explicit with the
- .extern pseudo-op.
- Common
- The .comm pseudo-op declares storage that can be common to more
- than one module. There are two arguments: a name and an absolute
- expression giving the size in bytes of the area named by the symbol. The
- type of the symbol becomes external. The statement can appear in any
- segment. If you think this has something to do with FORTRAN, you are
- right.
- Examples
- In the kernel directory, there are several assembly code files that
- are worth inspecting as examples. However, note that these files are
- designed to first be run through the C preprocessor. (The very first
- character is a # to signal this.) Thus they contain numerous constructs
- that are not pure assembler. For true assembler examples, compile any C
- program provided with MINIX using the -S flag. This will result in an
- assembly language file with the same name as the C source file, but
- ending with the .s suffix.
-