nasmdoc.src
Resource name: nasm-0.98.zip
Uploaded by: yuppie_zhu
Upload date: 2007-01-08
Resource size: 535k
File size: 370k
Source category: Compilers/Interpreters
Development platform: C/C++
- c section .data align=16
- switches to the section c{.data} and also specifies that it must be
- aligned on a 16-byte boundary.
- The parameter to c{ALIGN} gives the alignment boundary in bytes,
- which determines how many low bits of the section start address are
- forced to zero (four bits for c{align=16}). The alignment value
- given may be any power of two.I{section alignment, in
- bin}I{segment alignment, in bin}I{alignment, in bin sections}
- H{objfmt} ic{obj}: i{Microsoft OMF}I{OMF} Object Files
- The c{obj} file format (NASM calls it c{obj} rather than c{omf}
- for historical reasons) is the one produced by i{MASM} and
- i{TASM}, which is typically fed to 16-bit DOS linkers to produce
- ic{.EXE} files. It is also the format used by i{OS/2}.
- c{obj} provides a default output file-name extension of c{.obj}.
- c{obj} is not exclusively a 16-bit format, though: NASM has full
- support for the 32-bit extensions to the format. In particular,
- 32-bit c{obj} format files are used by i{Borland's Win32
- compilers}, instead of using Microsoft's newer ic{win32} object
- file format.
- The c{obj} format does not define any special segment names: you
- can call your segments anything you like. Typical names for segments
- in c{obj} format files are c{CODE}, c{DATA} and c{BSS}.
- If your source file contains code before specifying an explicit
- c{SEGMENT} directive, then NASM will invent its own segment called
- ic{__NASMDEFSEG} for you.
- When you define a segment in an c{obj} file, NASM defines the
- segment name as a symbol as well, so that you can obtain the address
- of the segment. So, for example:
- c segment data
- c dvar: dw 1234
- c segment code
- c function: mov ax,data ; get segment address of data
- c mov ds,ax ; and move it into DS
- c inc word [dvar] ; now this reference will work
- c ret
- The c{obj} format also enables the use of the ic{SEG} and
- ic{WRT} operators, so that you can write code which does things
- like
- c extern foo
- c mov ax,seg foo ; get preferred segment of foo
- c mov ds,ax
- c mov ax,data ; a different segment
- c mov es,ax
- c mov ax,[ds:foo] ; this accesses `foo'
- c mov [es:foo wrt data],bx ; so does this
- S{objseg} c{obj} Extensions to the c{SEGMENT}
- DirectiveI{SEGMENT, obj extensions to}
- The c{obj} output format extends the c{SEGMENT} (or c{SECTION})
- directive to allow you to specify various properties of the segment
- you are defining. This is done by appending extra qualifiers to the
- end of the segment-definition line. For example,
- c segment code private align=16
- defines the segment c{code}, but also declares it to be a private
- segment, and requires that the portion of it described in this code
- module must be aligned on a 16-byte boundary.
- The available qualifiers are:
- b ic{PRIVATE}, ic{PUBLIC}, ic{COMMON} and ic{STACK} specify
- the combination characteristics of the segment. c{PRIVATE} segments
- do not get combined with any others by the linker; c{PUBLIC} and
- c{STACK} segments get concatenated together at link time; and
- c{COMMON} segments all get overlaid on top of each other rather
- than stuck end-to-end.
- b ic{ALIGN} is used, as shown above, to specify the alignment
- boundary of the segment in bytes, which forces the corresponding low
- bits of the segment start address to zero. The alignment value given
- may be any power of two from 1 to 4096; in reality, the only values
- supported are 1, 2, 4, 16, 256 and 4096, so if 8 is specified it
- will be rounded up to 16, and 32, 64 and 128 will all be rounded up
- to 256, and so on. Note that alignment to 4096-byte
- boundaries is a i{PharLap} extension to the format and may not be
- supported by all linkers.I{section alignment, in OBJ}I{segment
- alignment, in OBJ}I{alignment, in OBJ sections}
- b ic{CLASS} can be used to specify the segment class; this feature
- indicates to the linker that segments of the same class should be
- placed near each other in the output file. The class name can be any
- word, e.g. c{CLASS=CODE}.
- b ic{OVERLAY}, like c{CLASS}, is specified with an arbitrary word
- as an argument, and provides overlay information to an
- overlay-capable linker.
- b Segments can be declared as ic{USE16} or ic{USE32}, which has
- the effect of recording the choice in the object file and also
- ensuring that NASM's default assembly mode when assembling in that
- segment is 16-bit or 32-bit respectively.
- b When writing i{OS/2} object files, you should declare 32-bit
- segments as ic{FLAT}, which causes the default segment base for
- anything in the segment to be the special group c{FLAT}, and also
- defines the group if it is not already defined.
- b The c{obj} file format also allows segments to be declared as
- having a pre-defined absolute segment address, although no linkers
- are currently known to make sensible use of this feature;
- nevertheless, NASM allows you to declare a segment such as
- c{SEGMENT SCREEN ABSOLUTE=0xB800} if you need to. The ic{ABSOLUTE}
- and c{ALIGN} keywords are mutually exclusive.
- NASM's default segment attributes are c{PUBLIC}, c{ALIGN=1}, no
- class, no overlay, and c{USE16}.
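- As a sketch of how these qualifiers combine, a source file might
- declare its segments as follows. The segment names c{mycode},
- c{mydata} and c{screen}, and the class names, are illustrative
- assumptions rather than names that NASM or any linker requires:
- c segment mycode public align=16 class=CODE use16
- c segment mydata private align=2 class=DATA
- c segment screen absolute=0xB800 ; fixed address, so no ALIGN here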
- S{group} ic{GROUP}: Defining Groups of SegmentsI{segments, groups of}
- The c{obj} format also allows segments to be grouped, so that a
- single segment register can be used to refer to all the segments in
- a group. NASM therefore supplies the c{GROUP} directive, whereby
- you can code
- c segment data
- c ; some data
- c segment bss
- c ; some uninitialised data
- c group dgroup data bss
- which will define a group called c{dgroup} to contain the segments
- c{data} and c{bss}. Like c{SEGMENT}, c{GROUP} causes the group
- name to be defined as a symbol, so that you can refer to a variable
- c{var} in the c{data} segment as c{var wrt data} or as c{var wrt
- dgroup}, depending on which segment value is currently in your
- segment register.
- If you just refer to c{var}, however, and c{var} is declared in a
- segment which is part of a group, then NASM will default to giving
- you the offset of c{var} from the beginning of the e{group}, not
- the e{segment}. Therefore c{SEG var}, also, will return the group
- base rather than the segment base.
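- The following sketch illustrates the difference, reusing the
- c{data}, c{bss} and c{dgroup} names from the example above. The
- labels c{var} and c{buf} and the segment c{code} are assumptions
- made for illustration:
- c segment data
- c var: dw 0
- c segment bss
- c buf: resb 64
- c group dgroup data bss
- c segment code
- c mov ax,dgroup ; group base
- c mov ds,ax
- c mov bx,var ; offset of var from the start of dgroup
- c mov ax,[bx] ; reads var through DS=dgroup
- c mov bx,var wrt data ; offset of var from the start of data
- c mov ax,seg var ; group base (dgroup), not the base of data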
- NASM will allow a segment to be part of more than one group, but
- will generate a warning if you do this. Variables declared in a
- segment which is part of more than one group will default to being
- relative to the first group that was defined to contain the segment.
- A group does not have to contain any segments; you can still make
- c{WRT} references to a group which does not contain the variable
- you are referring to. OS/2, for example, defines the special group
- c{FLAT} with no segments in it.
- S{uppercase} ic{UPPERCASE}: Disabling Case Sensitivity in Output
- Although NASM itself is i{case sensitive}, some OMF linkers are
- not; therefore it can be useful for NASM to output single-case
- object files. The c{UPPERCASE} format-specific directive causes all
- segment, group and symbol names that are written to the object file
- to be forced to upper case just before being written. Within a
- source file, NASM is still case-sensitive; but the object file can
- be written entirely in upper case if desired.
- c{UPPERCASE} is used alone on a line; it requires no parameters.
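- For instance, in the sketch below (the names c{code} and c{myfunc}
- are hypothetical), the object file would record the names as
- c{CODE} and c{MYFUNC}, while the source file must still spell them
- consistently in lower case:
- c uppercase
- c segment code
- c global myfunc
- c myfunc: retf ; written to the object file as MYFUNC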
- S{import} ic{IMPORT}: Importing DLL SymbolsI{DLL symbols,
- importing}I{symbols, importing from DLLs}
- The c{IMPORT} format-specific directive defines a symbol to be
- imported from a DLL, for use if you are writing a DLL's i{import
- library} in NASM. You still need to declare the symbol as c{EXTERN}
- as well as using the c{IMPORT} directive.
- The c{IMPORT} directive takes two required parameters, separated by
- white space, which are (respectively) the name of the symbol you
- wish to import and the name of the library you wish to import it
- from. For example:
- c import WSAStartup wsock32.dll
- A third optional parameter gives the name by which the symbol is
- known in the library you are importing it from, in case this is not
- the same as the name you wish the symbol to be known by to your code
- once you have imported it. For example:
- c import asyncsel wsock32.dll WSAAsyncSelect
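- Putting the two requirements together, a minimal sketch of the
- declarations for the examples above might read as follows. How the
- imported symbol is then called depends on the linker in use, so that
- part is deliberately not shown:
- c extern WSAStartup ; still required alongside IMPORT
- c import WSAStartup wsock32.dll
- c extern asyncsel ; local name used in this source file
- c import asyncsel wsock32.dll WSAAsyncSelect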
- S{export} ic{EXPORT}: Exporting DLL SymbolsI{DLL symbols,
- exporting}I{symbols, exporting from DLLs}
- The c{EXPORT} format-specific directive defines a global symbol to
- be exported as a DLL symbol, for use if you are writing a DLL in
- NASM. You still need to declare the symbol as c{GLOBAL} as well as
- using the c{EXPORT} directive.
- c{EXPORT} takes one required parameter, which is the name of the
- symbol you wish to export, as it was defined in your source file. An
- optional second parameter (separated by white space from the first)
- gives the e{external} name of the symbol: the name by which you
- wish the symbol to be known to programs using the DLL. If this name
- is the same as the internal name, you may leave the second parameter
- off.
- Further parameters can be given to define attributes of the exported
- symbol. These parameters, like the second, are separated by white
- space. If further parameters are given, the external name must also
- be specified, even if it is the same as the internal name. The
- available attributes are:
- b c{resident} indicates that the exported name is to be kept
- resident by the system loader. This is an optimisation for
- frequently used symbols imported by name.
- b c{nodata} indicates that the exported symbol is a function which
- does not make use of any initialised data.
- b c{parm=NNN}, where c{NNN} is an integer, sets the number of
- parameter words for the case in which the symbol is a call gate
- between 32-bit and 16-bit segments.
- b An attribute which is just a number indicates that the symbol
- should be exported with an identifying number (ordinal), and gives
- the desired number.
- For example:
- c export myfunc
- c export myfunc TheRealMoreFormalLookingFunctionName
- c export myfunc myfunc 1234 ; export by ordinal
- c export myfunc myfunc resident parm=23 nodata
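- Since c{EXPORT} does not replace c{GLOBAL}, a complete declaration
- pairs the two. In this sketch the names c{code}, c{myfunc} and
- c{MyFunc} are hypothetical:
- c segment code
- c global myfunc ; make the symbol visible to the linker
- c export myfunc MyFunc ; export it under the external name MyFunc
- c myfunc: retf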
- S{dotdotstart} ic{..start}: Defining the i{Program Entry
- Point}
- OMF linkers require exactly one of the object files being linked to
- define the program entry point, where execution will begin when the
- program is run. If the object file that defines the entry point is
- assembled using NASM, you specify the entry point by declaring the
- special symbol c{..start} at the point where you wish execution to
- begin.
- S{objextern} c{obj} Extensions to the c{EXTERN}
- DirectiveI{EXTERN, obj extensions to}
- If you declare an external symbol with the directive
- c extern foo
- then references such as c{mov ax,foo} will give you the offset of
- c{foo} from its preferred segment base (as specified in whichever
- module c{foo} is actually defined in). So to access the contents of
- c{foo} you will usually need to do something like
- c mov ax,seg foo ; get preferred segment base
- c mov es,ax ; move it into ES
- c mov ax,[es:foo] ; and use offset `foo' from it
- This is a little unwieldy, particularly if you know that an external
- is going to be accessible from a given segment or group, say
- c{dgroup}. So if c{DS} already contained c{dgroup}, you could
- simply code
- c mov ax,[foo wrt dgroup]
- However, having to type this every time you want to access c{foo}
- can be a pain; so NASM allows you to declare c{foo} in the
- alternative form
- c extern foo:wrt dgroup
- This form causes NASM to pretend that the preferred segment base of
- c{foo} is in fact c{dgroup}; so the expression c{seg foo} will
- now return c{dgroup}, and the expression c{foo} is equivalent to
- c{foo wrt dgroup}.
- This I{default-WRT mechanism}default-c{WRT} mechanism can be used
- to make externals appear to be relative to any group or segment in
- your program. It can also be applied to common variables: see
- k{objcommon}.
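- A sketch of the default-c{WRT} form in use, assuming a group named
- c{dgroup} built from a hypothetical segment c{data}, and assuming
- that c{foo} is defined by another module in a segment belonging to
- c{dgroup}:
- c segment data
- c group dgroup data
- c extern foo:wrt dgroup
- c segment code
- c mov ax,dgroup
- c mov ds,ax ; DS now holds the base of dgroup
- c mov ax,[foo] ; assembled as [foo wrt dgroup]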
- S{objcommon} c{obj} Extensions to the c{COMMON}
- DirectiveI{COMMON, obj extensions to}
- The c{obj} format allows common variables to be either nearI{near
- common variables} or farI{far common variables}; NASM allows you to
- specify which your variables should be by the use of the syntax
- c common nearvar 2:near ; `nearvar' is a near common
- c common farvar 10:far ; and `farvar' is far
- Far common variables may be greater in size than 64KB, and so the
- OMF specification says that they are declared as a number of
- e{elements} of a given size. So a 10-byte far common variable could
- be declared as ten one-byte elements, five two-byte elements, two
- five-byte elements or one ten-byte element.
- Some OMF linkers require the I{element size, in common
- variables}I{common variables, element size}element size, as well as
- the variable size, to match when resolving common variables declared
- in more than one module. Therefore NASM must allow you to specify
- the element size on your far common variables. This is done by the
- following syntax:
- c common c_5by2 10:far 5 ; two five-byte elements
- c common c_2by5 10:far 2 ; five two-byte elements
- If no element size is specified, the default is 1. Also, the c{FAR}
- keyword is not required when an element size is specified, since
- only far commons may have element sizes at all. So the above
- declarations could equivalently be
- c common c_5by2 10:5 ; two five-byte elements
- c common c_2by5 10:2 ; five two-byte elements
- In addition to these extensions, the c{COMMON} directive in c{obj}
- also supports default-c{WRT} specification like c{EXTERN} does
- (explained in k{objextern}). So you can also declare things like
- c common foo 10:wrt dgroup
- c common bar 16:far 2:wrt data
- c common baz 24:wrt data:6
- H{win32fmt} ic{win32}: Microsoft Win32 Object Files
- The c{win32} output format generates Microsoft Win32 object files,
- suitable for passing to Microsoft linkers such as the one supplied
- with i{Visual C++}.
- Note that Borland Win32 compilers do not use this format, but use
- c{obj} instead (see k{objfmt}).
- c{win32} provides a default output file-name extension of c{.obj}.
- Note that although Microsoft say that Win32 object files follow the
- COFF (Common Object File Format) standard, the object files produced
- by Microsoft Win32 compilers are not compatible with COFF linkers
- such as DJGPP's, and vice versa. This is due to a difference of
- opinion over the precise semantics of PC-relative relocations. To
- produce COFF files suitable for DJGPP, use NASM's c{coff} output
- format; conversely, the c{coff} format does not produce object
- files that Win32 linkers can generate correct output from.
- S{win32sect} c{win32} Extensions to the c{SECTION}
- DirectiveI{SECTION, win32 extensions to}
- Like the c{obj} format, c{win32} allows you to specify additional
- information on the c{SECTION} directive line, to control the type
- and properties of sections you declare. Section types and properties
- are generated automatically by NASM for the i{standard section names}
- c{.text}, c{.data} and c{.bss}, but may still be overridden by
- these qualifiers.
- The available qualifiers are:
- b c{code}, or equivalently c{text}, defines the section to be a
- code section. This marks the section as readable and executable, but
- not writable, and also indicates to the linker that the type of the
- section is code.
- b c{data} and c{bss} define the section to be a data section,
- analogously to c{code}. Data sections are marked as readable and
- writable, but not executable. c{data} declares an initialised data
- section, whereas c{bss} declares an uninitialised data section.
- b c{info} defines the section to be an i{informational section},
- which is not included in the executable file by the linker, but may
- (for example) pass information e{to} the linker. For example,
- declaring an c{info}-type section called ic{.drectve} causes the
- linker to interpret the contents of the section as command-line
- options.
- b c{align=}, used with a trailing number as in c{obj}, gives the
- I{section alignment, in win32}I{alignment, in win32
- sections}alignment requirements of the section. The maximum you may
- specify is 64: the Win32 object file format contains no means to
- request a greater section alignment than this. If alignment is not
- explicitly specified, the defaults are 16-byte alignment for code
- sections, and 4-byte alignment for data (and BSS) sections.
- Informational sections get a default alignment of 1 byte (no
- alignment), though the value does not matter.
- The defaults assumed by NASM if you do not specify the above
- qualifiers are:
- c section .text code align=16
- c section .data data align=4
- c section .bss bss align=4
- Any other section name is treated by default like c{.text}.
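- As an illustration of overriding these defaults, the sketch below
- declares three non-standard sections. The names c{hotcode} and
- c{tables} and the chosen alignments are assumptions, and the
- contents of the c{.drectve} section are left out because they depend
- on the linker being used:
- c section hotcode code align=32 ; executable, 32-byte aligned
- c section tables data align=8 ; readable and writable data
- c section .drectve info ; read by the linker, not linked in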
- H{cofffmt} ic{coff}: i{Common Object File Format}
- The c{coff} output type produces COFF object files suitable for
- linking with the i{DJGPP} linker.
- c{coff} provides a default output file-name extension of c{.o}.
- The c{coff} format supports the same extensions to the c{SECTION}
- directive as c{win32} does, except that the c{align} qualifier and
- the c{info} section type are not supported.
- H{elffmt} ic{elf}: i{Linux ELF}I{Executable and Linkable
- Format} Object Files
- The c{elf} output format generates ELF32 (Executable and Linkable
- Format) object files, as used by Linux. c{elf} provides a default
- output file-name extension of c{.o}.
- S{elfsect} c{elf} Extensions to the c{SECTION}
- DirectiveI{SECTION, elf extensions to}
- Like the c{obj} format, c{elf} allows you to specify additional
- information on the c{SECTION} directive line, to control the type
- and properties of sections you declare. Section types and properties
- are generated automatically by NASM for the i{standard section
- names} ic{.text}, ic{.data} and ic{.bss}, but may still be
- overridden by these qualifiers.
- The available qualifiers are:
- b ic{alloc} defines the section to be one which is loaded into
- memory when the program is run. ic{noalloc} defines it to be one
- which is not, such as an informational or comment section.
- b ic{exec} defines the section to be one which should have execute
- permission when the program is run. ic{noexec} defines it as one
- which should not.
- b ic{write} defines the section to be one which should be writable
- when the program is run. ic{nowrite} defines it as one which should
- not.
- b ic{progbits} defines the section to be one with explicit contents
- stored in the object file: an ordinary code or data section, for
- example. ic{nobits} defines the section to be one with no explicit
- contents given, such as a BSS section.
- b c{align=}, used with a trailing number as in c{obj}, gives the
- I{section alignment, in elf}I{alignment, in elf sections}alignment
- requirements of the section.
- The defaults assumed by NASM if you do not specify the above
- qualifiers are:
- c section .text progbits alloc exec nowrite align=16
- c section .data progbits alloc noexec write align=4
- c section .bss nobits alloc noexec write align=4
- c section other progbits alloc noexec nowrite align=1
- (Any section name other than c{.text}, c{.data} and c{.bss} is
- treated by default like c{other} in the above code.)
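- For example, a read-only data section and a non-loaded comment
- section could be declared as sketched here. The section names
- c{.rodata} and c{.comment} are conventional on Linux but are an
- assumption of this example, as are the label c{message} and the
- strings stored:
- c section .rodata progbits alloc noexec nowrite align=4
- c message: db 'hello',0
- c section .comment progbits noalloc noexec nowrite align=1
- c db 'built with NASM',0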
- S{elfwrt} i{Position-Independent Code}I{PIC}: c{elf} Special
- Symbols and ic{WRT}
- The ELF specification contains enough features to allow
- position-independent code (PIC) to be written, which makes i{ELF
- shared libraries} very flexible. However, it also means NASM has to
- be able to generate a variety of strange relocation types in ELF
- object files, if it is to be an assembler which can write PIC.
- Since ELF does not support segment-base references, the c{WRT}
- operator is not used for its normal purpose; therefore NASM's
- c{elf} output format makes use of c{WRT} for a different purpose,
- namely the PIC-specific I{relocations, PIC-specific}relocation
- types.
- c{elf} defines five special symbols which you can use as the
- right-hand side of the c{WRT} operator to obtain PIC relocation
- types. They are ic{..gotpc}, ic{..gotoff}, ic{..got},
- ic{..plt} and ic{..sym}. Their functions are summarised here:
- b Referring to the symbol marking the global offset table base
- using c{wrt ..gotpc} will end up giving the distance from the
- beginning of the current section to the global offset table.
- (ic{_GLOBAL_OFFSET_TABLE_} is the standard symbol name used to
- refer to the i{GOT}.) So you would then need to add ic{$$} to the
- result to get the real address of the GOT.
- b Referring to a location in one of your own sections using c{wrt
- ..gotoff} will give the distance from the beginning of the GOT to
- the specified location, so that adding on the address of the GOT
- would give the real address of the location you wanted.
- b Referring to an external or global symbol using c{wrt ..got}
- causes the linker to build an entry e{in} the GOT containing the
- address of the symbol, and the reference gives the distance from the
- beginning of the GOT to the entry; so you can add on the address of
- the GOT, load from the resulting address, and end up with the
- address of the symbol.
- b Referring to a procedure name using c{wrt ..plt} causes the
- linker to build a i{procedure linkage table} entry for the symbol,
- and the reference gives the address of the i{PLT} entry. You can
- only use this in contexts which would generate a PC-relative
- relocation normally (i.e. as the destination for c{CALL} or
- c{JMP}), since ELF contains no relocation type to refer to PLT
- entries absolutely.
- b Referring to a symbol name using c{wrt ..sym} causes NASM to
- write an ordinary relocation, but instead of making the relocation
- relative to the start of the section and then adding on the offset
- to the symbol, it will write a relocation record aimed directly at
- the symbol in question. The distinction is a necessary one due to a
- peculiarity of the dynamic linker.
- A fuller explanation of how to use these relocation types to write
- shared libraries entirely in NASM is given in k{picdll}.
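- As a brief sketch of how the first four of these fit together, the
- fragment below follows a common pattern for a PIC function. The names
- c{func}, c{myvar}, c{extvar} and c{helper} are hypothetical, and
- the way c{EBX} is saved and loaded is one convention rather than a
- requirement:
- c extern _GLOBAL_OFFSET_TABLE_
- c extern extvar ; assumed to be defined in another module
- c extern helper ; likewise
- c global func:function
- c section .data
- c myvar: dd 0
- c section .text
- c func: push ebx
- c call .get_GOT
- c .get_GOT: pop ebx ; EBX = address of .get_GOT
- c add ebx,_GLOBAL_OFFSET_TABLE_+$$-.get_GOT wrt ..gotpc ; EBX = GOT
- c lea eax,[ebx+myvar wrt ..gotoff] ; EAX = address of myvar
- c mov eax,[ebx+extvar wrt ..got] ; EAX = address of extvar
- c call helper wrt ..plt ; call through the PLT
- c pop ebx
- c ret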
- S{elfglob} c{elf} Extensions to the c{GLOBAL} DirectiveI{GLOBAL,
- elf extensions to}I{GLOBAL, aoutb extensions to}
- ELF object files can contain more information about a global symbol
- than just its address: they can contain the I{symbol sizes,
- specifying}I{size, of symbols}size of the symbol and its I{symbol
- types, specifying}I{type, of symbols}type as well. These are not
- merely debugger conveniences, but are actually necessary when the
- program being written is a i{shared library}. NASM therefore
- supports some extensions to the c{GLOBAL} directive, allowing you
- to specify these features.
- You can specify whether a global variable is a function or a data
- object by suffixing the name with a colon and the word
- ic{function} or ic{data}. (ic{object} is a synonym for
- c{data}.) For example:
- c global hashlookup:function, hashtable:data
- exports the global symbol c{hashlookup} as a function and
- c{hashtable} as a data object.
- You can also specify the size of the data associated with the
- symbol, as a numeric expression (which may involve labels, and even
- forward references) after the type specifier. Like this:
- c global hashtable:data (hashtable.end - hashtable)
- c hashtable:
- c db this,that,theother ; some data here
- c .end:
- This makes NASM automatically calculate the length of the table and
- place that information into the ELF symbol table.
- Declaring the type and size of global symbols is necessary when
- writing shared library code. For more information, see
- k{picglobal}.
- S{elfcomm} c{elf} Extensions to the c{COMMON} DirectiveI{COMMON,
- elf extensions to}
- ELF also allows you to specify alignment requirements I{common
- variables, alignment in elf}I{alignment, of elf common variables}on
- common variables. This is done by putting a number (which must be a
- power of two) after the name and size of the common variable,
- separated (as usual) by a colon. For example, an array of
- doublewords would benefit from 4-byte alignment:
- c common dwordarray 128:4
- This declares the total size of the array to be 128 bytes, and
- requires that it be aligned on a 4-byte boundary.
- H{aoutfmt} ic{aout}: Linux I{a.out, Linux version}c{a.out} Object Files
- The c{aout} format generates c{a.out} object files, in the form
- used by early Linux systems. (These differ from other c{a.out}
- object files in that the magic number in the first four bytes of the
- file is different. Also, some implementations of c{a.out}, for
- example NetBSD's, support position-independent code, which Linux's
- implementation doesn't.)
- c{aout} provides a default output file-name extension of c{.o}.
- c{a.out} is a very simple object format. It supports no special
- directives, no special symbols, no use of c{SEG} or c{WRT}, and no
- extensions to any standard directives. It supports only the three
- i{standard section names} ic{.text}, ic{.data} and ic{.bss}.
- H{aoutfmt} ic{aoutb}: i{NetBSD}/i{FreeBSD}/i{OpenBSD}
- I{a.out, BSD version}c{a.out} Object Files
- The c{aoutb} format generates c{a.out} object files, in the form
- used by the various free BSD Unix clones, NetBSD, FreeBSD and
- OpenBSD. For simple object files, this object format is exactly the
- same as c{aout} except for the magic number in the first four bytes
- of the file. However, the c{aoutb} format supports
- I{PIC}i{position-independent code} in the same way as the c{elf}
- format, so you can use it to write BSD i{shared libraries}.
- c{aoutb} provides a default output file-name extension of c{.o}.
- c{aoutb} supports no special directives, no special symbols, and
- only the three i{standard section names} ic{.text}, ic{.data}
- and ic{.bss}. However, it also supports the same use of ic{WRT} as
- c{elf} does, to provide position-independent code relocation types.
- See k{elfwrt} for full documentation of this feature.
- c{aoutb} also supports the same extensions to the c{GLOBAL}
- directive as c{elf} does: see k{elfglob} for documentation of
- this.
- H{as86fmt} c{as86}: Linux ic{as86} Object Files
- The Linux 16-bit assembler c{as86} has its own non-standard object
- file format. Although its companion linker ic{ld86} produces
- something close to ordinary c{a.out} binaries as output, the object
- file format used to communicate between c{as86} and c{ld86} is not
- itself c{a.out}.
- NASM supports this format, just in case it is useful, as c{as86}.
- c{as86} provides a default output file-name extension of c{.o}.
- c{as86} is a very simple object format (from the NASM user's point
- of view). It supports no special directives, no special symbols, no
- use of c{SEG} or c{WRT}, and no extensions to any standard
- directives. It supports only the three i{standard section names}
- ic{.text}, ic{.data} and ic{.bss}.
- H{rdffmt} I{RDOFF}ic{rdf}: i{Relocatable Dynamic Object File
- Format}
- The c{rdf} output format produces RDOFF object files. RDOFF
- (Relocatable Dynamic Object File Format) is a home-grown object-file
- format, designed alongside NASM itself and reflecting in its file
- format the internal structure of the assembler.
- RDOFF is not used by any well-known operating systems. Those writing
- their own systems, however, may well wish to use RDOFF as their
- object format, on the grounds that it is designed primarily for
- simplicity and contains very little file-header bureaucracy.
- The Unix NASM archive, and the DOS archive which includes sources,
- both contain an I{rdoff subdirectory}c{rdoff} subdirectory holding
- a set of RDOFF utilities: an RDF linker, an RDF static-library
- manager, an RDF file dump utility, and a program which will load and
- execute an RDF executable under Linux.
- c{rdf} supports only the i{standard section names} ic{.text},
- ic{.data} and ic{.bss}.
- S{rdflib} Requiring a Library: The ic{LIBRARY} Directive
- RDOFF contains a mechanism for an object file to demand a given
- library to be linked to the module, either at load time or run time.
- This is done by the c{LIBRARY} directive, which takes one argument
- which is the name of the module:
- c library mylib.rdl
- H{dbgfmt} ic{dbg}: Debugging Format
- The c{dbg} output format is not built into NASM in the default
- configuration. If you are building your own NASM executable from the
- sources, you can define ic{OF_DBG} in c{outform.h} or on the
- compiler command line, and obtain the c{dbg} output format.
- The c{dbg} format does not output an object file as such; instead,
- it outputs a text file which contains a complete list of all the
- transactions between the main body of NASM and the output-format
- back end module. It is primarily intended to aid people who want to
- write their own output drivers, so that they can get a clearer idea
- of the various requests the main program makes of the output driver,
- and in what order they happen.
- For simple files, one can easily use the c{dbg} format like this:
- c nasm -f dbg filename.asm
- which will generate a diagnostic file called c{filename.dbg}.
- However, this will not work well on files which were designed for a
- different object format, because each object format defines its own
- macros (usually user-level forms of directives), and those macros
- will not be defined in the c{dbg} format. Therefore it can be
- useful to run NASM twice, in order to do the preprocessing with the
- native object format selected:
- c nasm -e -f rdf -o rdfprog.i rdfprog.asm
- c nasm -a -f dbg rdfprog.i
- This preprocesses c{rdfprog.asm} into c{rdfprog.i}, keeping the
- c{rdf} object format selected in order to make sure RDF special
- directives are converted into primitive form correctly. Then the
- preprocessed source is fed through the c{dbg} format to generate
- the final diagnostic output.
- This workaround will still typically not work for programs intended
- for c{obj} format, because the c{obj} c{SEGMENT} and c{GROUP}
- directives have side effects of defining the segment and group names
- as symbols; c{dbg} will not do this, so the program will not
- assemble. You will have to work around that by defining the symbols
- yourself (using c{EXTERN}, for example) if you really need to get a
- c{dbg} trace of an c{obj}-specific source file.
- c{dbg} accepts any section name and any directives at all, and logs
- them all to its output file.
- C{16bit} Writing 16-bit Code (DOS, Windows 3/3.1)
- This chapter attempts to cover some of the common issues encountered
- when writing 16-bit code to run under MS-DOS or Windows 3.x. It
- covers how to link programs to produce c{.EXE} or c{.COM} files,
- how to write c{.SYS} device drivers, and how to interface assembly
- language code with 16-bit C compilers and with Borland Pascal.
- H{exefiles} Producing ic{.EXE} Files
- Any large program written under DOS needs to be built as a c{.EXE}
- file: only c{.EXE} files have the necessary internal structure
- required to span more than one 64K segment. i{Windows} programs,
- also, have to be built as c{.EXE} files, since Windows does not
- support the c{.COM} format.
- In general, you generate c{.EXE} files by using the c{obj} output
- format to produce one or more ic{.OBJ} files, and then linking
- them together using a linker. However, NASM also supports the direct
- generation of simple DOS c{.EXE} files using the c{bin} output
- format (by using c{DB} and c{DW} to construct the c{.EXE} file
- header), and a macro package is supplied to do this. Thanks to
- Yann Guidon for contributing the code for this.
- NASM may also support c{.EXE} natively as another output format in
- future releases.
- S{objexe} Using the c{obj} Format To Generate c{.EXE} Files
- This section describes the usual method of generating c{.EXE} files
- by linking c{.OBJ} files together.
- Most 16-bit programming language packages come with a suitable
- linker; if you have none of these, there is a free linker called
- i{VAL}I{linker, free}, available in c{LZH} archive format from
- W{ftp://x2ftp.oulu.fi/pub/msdos/programming/lang/}ic{x2ftp.oulu.fi}.
- An LZH archiver can be found at
- W{ftp://ftp.simtel.net/pub/simtelnet/msdos/arcers}ic{ftp.simtel.net}.
- There is another `free' linker (though this one doesn't come with
- sources) called i{FREELINK}, available from
- W{http://www.pcorner.com/tpc/old/3-101.html}ic{www.pcorner.com}.
- A third, ic{djlink}, written by DJ Delorie, is available at
- W{http://www.delorie.com/djgpp/16bit/djlink/}ic{www.delorie.com}.
- When linking several c{.OBJ} files into a c{.EXE} file, you should
- ensure that exactly one of them has a start point defined (using the
- I{program entry point}ic{..start} special symbol defined by the
- c{obj} format: see k{dotdotstart}). If no module defines a start
- point, the linker will not know what value to give the entry-point
- field in the output file header; if more than one defines a start
- point, the linker will not know e{which} value to use.
- An example of a NASM source file which can be assembled to a
- c{.OBJ} file and linked on its own to a c{.EXE} is given here. It
- demonstrates the basic principles of defining a stack, initialising
- the segment registers, and declaring a start point. This file is
- also provided in the I{test subdirectory}c{test} subdirectory of
- the NASM archives, under the name c{objexe.asm}.
- c segment code
- c
- c ..start: mov ax,data
- c mov ds,ax
- c mov ax,stack
- c mov ss,ax
- c mov sp,stacktop
- This initial piece of code sets up c{DS} to point to the data
- segment, and initialises c{SS} and c{SP} to point to the top of
- the provided stack. Notice that interrupts are implicitly disabled
- for one instruction after a move into c{SS}, precisely for this
- situation, so that there's no chance of an interrupt occurring
- between the loads of c{SS} and c{SP} and not having a stack to
- execute on.
- Note also that the special symbol c{..start} is defined at the
- beginning of this code, which means that this point will be the
- entry point into the resulting executable file.
- c mov dx,hello
- c mov ah,9
- c int 0x21
- The above is the main program: load c{DS:DX} with a pointer to the
- greeting message (c{hello} is implicitly relative to the segment
- c{data}, which was loaded into c{DS} in the setup code, so the
- full pointer is valid), and call the DOS print-string function.
- c mov ax,0x4c00
- c int 0x21
- This terminates the program using another DOS system call.
- c segment data
- c hello: db 'hello, world', 13, 10, '$'
- The data segment contains the string we want to display.
- c segment stack stack
- c resb 64
- c stacktop:
- The above code declares a stack segment containing 64 bytes of
- uninitialised stack space, and points c{stacktop} at the top of it.
- The directive c{segment stack stack} defines a segment e{called}
- c{stack}, and also of e{type} c{STACK}. The latter is not
- necessary to the correct running of the program, but linkers are
- likely to issue warnings or errors if your program has no segment of
- type c{STACK}.
- The above file, when assembled into a c{.OBJ} file, will link on
- its own to a valid c{.EXE} file, which when run will print `hello,
- world' and then exit.
- S{binexe} Using the c{bin} Format To Generate c{.EXE} Files
- The c{.EXE} file format is simple enough that it's possible to
- build a c{.EXE} file by writing a pure-binary program and sticking
- a 32-byte header on the front. This header is simple enough that it
- can be generated using c{DB} and c{DW} commands by NASM itself, so
- that you can use the c{bin} output format to directly generate
- c{.EXE} files.
- Included in the NASM archives, in the I{misc subdirectory}c{misc}
- subdirectory, is a file ic{exebin.mac} of macros. It defines three
- macros: ic{EXE_begin}, ic{EXE_stack} and ic{EXE_end}.
- To produce a c{.EXE} file using this method, you should start by
- using c{%include} to load the c{exebin.mac} macro package into
- your source file. You should then issue the c{EXE_begin} macro call
- (which takes no arguments) to generate the file header data. Then
- write code as normal for the c{bin} format - you can use all three
- standard sections c{.text}, c{.data} and c{.bss}. At the end of
- the file you should call the c{EXE_end} macro (again, no arguments),
- which defines some symbols to mark section sizes, and these symbols
- are referred to in the header code generated by c{EXE_begin}.
- In this model, the code you end up writing starts at c{0x100}, just
- like a c{.COM} file - in fact, if you strip off the 32-byte header
- from the resulting c{.EXE} file, you will have a valid c{.COM}
- program. All the segment bases are the same, so you are limited to a
- 64K program, again just like a c{.COM} file. Note that an c{ORG}
- directive is issued by the c{EXE_begin} macro, so you should not
- explicitly issue one of your own.
- You can't directly refer to your segment base value, unfortunately,
- since this would require a relocation in the header, and things
- would get a lot more complicated. So you should get your segment
- base by copying it out of c{CS} instead.
- On entry to your c{.EXE} file, c{SS:SP} are already set up to
- point to the top of a 2Kb stack. You can adjust the default stack
- size of 2Kb by calling the c{EXE_stack} macro. For example, to
- change the stack size of your program to 64 bytes, you would call
- c{EXE_stack 64}.
- A sample program which generates a c{.EXE} file in this way is
- given in the c{test} subdirectory of the NASM archive, as
- c{binexe.asm}.
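- As a minimal sketch of the steps above (modelled on the description
- here rather than copied from the shipped sample, and assuming
- c{exebin.mac} can be found on the include path), a complete
- program might look like this. The label c{hello} is purely
- illustrative.

```nasm
; Assemble with the bin format, e.g. nasm -f bin prog.asm -o prog.exe
; (the -i option can point NASM at the directory holding exebin.mac).

%include "exebin.mac"

        EXE_begin               ; emits the .EXE header and issues ORG
        EXE_stack 64            ; optional: override the default 2Kb stack

section .text

        mov     ax,cs           ; segment base cannot be named directly,
        mov     ds,ax           ; so copy it out of CS

        mov     dx,hello        ; DS:DX -> '$'-terminated string
        mov     ah,9            ; DOS print-string function
        int     0x21

        mov     ax,0x4c00       ; DOS terminate, exit code 0
        int     0x21

section .data

hello:  db      'hello, world', 13, 10, '$'

        EXE_end                 ; defines the size symbols the header uses
```

- No c{ORG} directive appears in the source, since c{EXE_begin}
- issues its own.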
- H{comfiles} Producing ic{.COM} Files
- While large DOS programs must be written as c{.EXE} files, small
- ones are often better written as c{.COM} files. c{.COM} files are
- pure binary, and therefore most easily produced using the c{bin}
- output format.
- S{combinfmt} Using the c{bin} Format To Generate c{.COM} Files
- c{.COM} files expect to be loaded at offset c{100h} into their
- segment (though the segment may change). Execution then begins at
- Ic{ORG}c{100h}, i.e. right at the start of the program. So to
- write a c{.COM} program, you would create a source file looking
- like
- c org 100h
- c section .text
- c start: ; put your code here
- c section .data
- c ; put data items here
- c section .bss
- c ; put uninitialised data here
- The c{bin} format puts the c{.text} section first in the file, so
- you can declare data or BSS items before beginning to write code if
- you want to and the code will still end up at the front of the file
- where it belongs.
- The BSS (uninitialised data) section does not take up space in the
- c{.COM} file itself: instead, addresses of BSS items are resolved
- to point at space beyond the end of the file, on the grounds that
- this will be free memory when the program is run. Therefore you
- should not rely on your BSS being initialised to all zeros when you
- run.
- To assemble the above program, you should use a command line like
- c nasm myprog.asm -fbin -o myprog.com
- The c{bin} format would produce a file called c{myprog} if no
- explicit output file name were specified, so you have to override it
- and give the desired file name.
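- For illustration, the skeleton above can be filled in to give a
- complete program. This is only a sketch: it reuses the DOS calls from
- the c{.EXE} example earlier in this chapter, and the label c{hello}
- is an arbitrary choice.

```nasm
        org     100h            ; .COM programs are loaded at offset 100h

section .text

start:  mov     dx,hello        ; DS already equals CS in a .COM program
        mov     ah,9            ; DOS print-string function
        int     0x21

        mov     ax,0x4c00       ; DOS terminate, exit code 0
        int     0x21

section .data

hello:  db      'hello, world', 13, 10, '$'
```

- No segment registers need to be loaded here, because all of them
- hold the same value when a c{.COM} file starts running.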
- S{comobjfmt} Using the c{obj} Format To Generate c{.COM} Files
- If you are writing a c{.COM} program as more than one module, you
- may wish to assemble several c{.OBJ} files and link them together
- into a c{.COM} program. You can do this, provided you have a linker
- capable of outputting c{.COM} files directly (i{TLINK} does this),
- or alternatively a converter program such as ic{EXE2BIN} to
- transform the c{.EXE} file output from the linker into a c{.COM}
- file.
- If you do this, you need to take care of several things:
- b The first object file containing code should start its code
- segment with a line like c{RESB 100h}. This is to ensure that the
- code begins at offset c{100h} relative to the beginning of the code
- segment, so that the linker or converter program does not have to
- adjust address references within the file when generating the
- c{.COM} file. Other assemblers use an ic{ORG} directive for this
- purpose, but c{ORG} in NASM is a format-specific directive to the
- c{bin} output format, and does not mean the same thing as it does
- in MASM-compatible assemblers.
- b You don't need to define a stack segment.
- b All your segments should be in the same group, so that every time
- your code or data references a symbol offset, all offsets are
- relative to the same segment base. This is because, when a c{.COM}
- file is loaded, all the segment registers contain the same value.
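- The points above might be combined as in the following sketch of a
- single-module c{.COM} program in c{obj} format. It has not been
- tried against any particular linker; the segment names c{code} and
- c{data}, the group name c{cgroup} and the label c{hello} are all
- assumptions made for the example.

```nasm
segment code

        resb    100h            ; reserve the PSP area so code starts at 100h

..start:                        ; entry point, at offset 100h in the group
        mov     dx,hello        ; offset is taken relative to the group
        mov     ah,9            ; DOS print-string function
        int     0x21

        mov     ax,0x4c00       ; DOS terminate, exit code 0
        int     0x21

segment data

hello:  db      'hello, world', 13, 10, '$'

; Put both segments in one group, so every offset is relative
; to the same base. No stack segment is declared.
group   cgroup  code data
```

- The resulting c{.OBJ} file would then be passed to a linker able to
- write c{.COM} files, or linked to a c{.EXE} and converted.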
- H{sysfiles} Producing ic{.SYS} Files
- i{MS-DOS device drivers} - c{.SYS} files - are pure binary files,
- similar to c{.COM} files, except that they start at origin zero
- rather than c{100h}. Therefore, if you are writing a device driver
- using the c{bin} format, you do not need the c{ORG} directive,
- since the default origin for c{bin} is zero. Similarly, if you are
- using c{obj}, you do not need the c{RESB 100h} at the start of
- your code segment.
- c{.SYS} files start with a header structure, containing pointers to
- the various routines inside the driver which do the work. This
- structure should be defined at the start of the code segment, even
- though it is not actually code.
- For more information on the format of c{.SYS} files, and the data
- which has to go in the header structure, a list of books is given in
- the Frequently Asked Questions list for the newsgroup
- W{news:comp.os.msdos.programmer}ic{comp.os.msdos.programmer}.
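- As a rough sketch of where the header sits in a c{bin}-format
- source file, the start of a character device driver might be laid
- out as follows. The field layout shown is the conventional DOS device
- header; the device name c{MYDEV} and all labels are hypothetical, and
- the routines are empty stubs, so this is a skeleton rather than a
- loadable driver.

```nasm
; No ORG directive: the bin format's default origin of zero is wanted.

section .text

header: dd      -1              ; link to next driver; DOS fills this in
        dw      0x8000          ; attribute word: bit 15 set = character device
        dw      strategy        ; offset of the strategy routine
        dw      interrupt       ; offset of the interrupt routine
        db      'MYDEV   '      ; device name, padded with spaces to 8 bytes

reqhdr: dd      0               ; saved far pointer to the request header

strategy:
        mov     [cs:reqhdr],bx  ; DOS passes the request header in ES:BX
        mov     [cs:reqhdr+2],es
        retf

interrupt:
        ; A real driver would decode the command in the request header
        ; here and fill in its status word before returning.
        retf
```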
- H{16c} Interfacing to 16-bit C Programs
- This section covers the basics of writing assembly routines that
- call, or are called from, C programs. To do this, you would
- typically write an assembly module as a c{.OBJ} file, and link it
- with your C modules to produce a i{mixed-language program}.
- S{16cunder} External Symbol Names
- I{C symbol names}I{underscore, in C symbols}C compilers have the
- convention that the names of all global symbols (functions or data)
- they define are formed by prefixing an underscore to the name as it
- appears in the C program. So, for example, the function a C
- programmer thinks of as c{printf} appears to an assembly language
- programmer as c{_printf}. This means that in your assembly
- programs, you can define symbols without a leading underscore, and
- not have to worry about name clashes with C symbols.
- If you find the underscores inconvenient, you can define macros to
- replace the c{GLOBAL} and c{EXTERN} directives as follows:
- c %macro cglobal 1
- c global _%1
- c %define %1 _%1
- c %endmacro
- c %macro cextern 1
- c extern _%1
- c %define %1 _%1
- c %endmacro
- (These forms of the macros only take one argument at a time; a
- c{%rep} construct could solve this.)
- If you then declare an external like this:
- c cextern printf
- then the macro will expand it as
- c extern _printf
- c %define printf _printf
- Thereafter, you can reference c{printf} as if it were a symbol, and
- the preprocessor will put the leading underscore on where necessary.
- The c{cglobal} macro works similarly. You must use c{cglobal}
- before defining the symbol in question, but you would have had to do
- that anyway if you used c{GLOBAL}.
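- One possible way of lifting the one-argument restriction, sketched
- here with c{%rep} and c{%rotate}, is to loop over however many
- names are supplied. These variants are an illustration, not part of
- any macro package shipped with NASM.

```nasm
%macro  cglobal 1-*
%rep    %0                      ; once per argument supplied
        global  _%1
        %define %1 _%1
%rotate 1                       ; make the next argument %1
%endrep
%endmacro

%macro  cextern 1-*
%rep    %0
        extern  _%1
        %define %1 _%1
%rotate 1
%endrep
%endmacro

; Example use: declares _printf and _puts as external, and lets the
; rest of the file refer to them as printf and puts.
        cextern printf, puts
```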
- S{16cmodels} i{Memory Models}
- NASM contains no mechanism to support the various C memory models
- directly; you have to keep track yourself of which one you are
- writing for. This means you have to keep track of the following
- things:
- b In models using a single code segment (tiny, small and compact),
- functions are near. This means that function pointers, when stored
- in data segments or pushed on the stack as function arguments, are
- 16 bits long and contain only an offset field (the c{CS} register
- never changes its value, and always gives the segment part of the
- full function address), and that functions are called using ordinary
- near c{CALL} instructions and return using c{RETN} (which, in
- NASM, is synonymous with c{RET} anyway). This means both that you
- should write your own routines to return with c{RETN}, and that you
- should call external C routines with near c{CALL} instructions.
- b In models using more than one code segment (medium, large and
- huge), functions are far. This means that function pointers are 32
- bits long (consisting of a 16-bit offset followed by a 16-bit
- segment), and that functions are called using c{CALL FAR} (or
- c{CALL seg:offset}) and return using c{RETF}. Again, you should
- therefore write your own routines to return with c{RETF} and use
- c{CALL FAR} to call external routines.
- b In models using a single data segment (tiny, small and medium),
- data pointers are 16 bits long, containing only an offset field (the
- c{DS} register doesn't change its value, and always gives the
- segment part of the full data item address).
- b In models using more than one data segment (compact, large and
- huge), data pointers are 32 bits long, consisting of a 16-bit offset
- followed by a 16-bit segment. You should still be careful not to
- modify c{DS} in your routines without restoring it afterwards, but
- c{ES} is free for you to use to access the contents of 32-bit data
- pointers you are passed.
- b The huge memory model allows single data items to exceed 64K in
- size. In all other memory models, you can access the whole of a data
- item just by doing arithmetic on the offset field of the pointer you
- are given, whether a segment field is present or not; in huge model,
- you have to be more careful of your pointer arithmetic.
- b In most memory models, there is a e{default} data segment, whose
- segment address is kept in c{DS} throughout the program. This data
- segment is typically the same segment as the stack, kept in c{SS},
- so that functions' local variables (which are stored on the stack)
- and global data items can both be accessed easily without changing
- c{DS}. Particularly large data items are typically stored in other
- segments. However, some memory models (though not the standard
- ones, usually) allow the assumption that c{SS} and c{DS} hold the
- same value to be removed. Be careful about functions' local
- variables in this latter case.
- In models with a single code segment, the segment is called
- ic{_TEXT}, so your code segment must also go by this name in order
- to be linked into the same place as the main code segment. In models
- with a single data segment, or with a default data segment, it is
- called ic{_DATA}.
- S{16cfunc} Function Definitions and Function Calls
- I{functions, C calling convention}The i{C calling convention} in
- 16-bit programs is as follows. In the following description, the
- words e{caller} and e{callee} are used to denote the function
- doing the calling and the function which gets called.
- b The caller pushes the function's parameters on the stack, one
- after another, in reverse order (right to left, so that the first
- argument specified to the function is pushed last).
- b The caller then executes a c{CALL} instruction to pass control
- to the callee. This c{CALL} is either near or far depending on the
- memory model.
- b The callee receives control, and typically (although this is not
- actually necessary, in functions which do not need to access their
- parameters) starts by saving the value of c{SP} in c{BP} so as to
- be able to use c{BP} as a base pointer to find its parameters on
- the stack. However, the caller was probably doing this too, so part
- of the calling convention states that c{BP} must be preserved by
- any C function. Hence the callee, if it is going to set up c{BP} as
- a ie{frame pointer}, must push the previous value first.
- b The callee may then access its parameters relative to c{BP}.
- The word at c{[BP]} holds the previous value of c{BP} as it was
- pushed; the next word, at c{[BP+2]}, holds the offset part of the
- return address, pushed implicitly by c{CALL}. In a small-model
- (near) function, the parameters start after that, at c{[BP+4]}; in
- a large-model (far) function, the segment part of the return address
- lives at c{[BP+4]}, and the parameters begin at c{[BP+6]}. The
- leftmost parameter of the function, since it was pushed last, is
- accessible at this offset from c{BP}; the others follow, at
- successively greater offsets. Thus, in a function such as c{printf}
- which takes a variable number of parameters, the pushing of the
- parameters in reverse order means that the function knows where to
- find its first parameter, which tells it the number and type of the
- remaining ones.
- b The callee may also wish to decrease c{SP} further, so as to
- allocate space on the stack for local variables, which will then be
- accessible at negative offsets from c{BP}.
- b The callee, if it wishes to return a value to the caller, should
- leave the value in c{AL}, c{AX} or c{DX:AX} depending on the size
- of the value. Floating-point results are sometimes (depending on the
- compiler) returned in c{ST0}.
- b Once the callee has finished processing, it restores c{SP} from
- c{BP} if it had allocated local stack space, then pops the previous
- value of c{BP}, and returns via c{RETN} or c{RETF} depending on
- memory model.
- b When the caller regains control from the callee, the function
- parameters are still on the stack, so it typically adds an immediate
- constant to c{SP} to remove them (instead of executing a number of
- slow c{POP} instructions). Thus, if a function is accidentally
- called with the wrong number of parameters due to a prototype
- mismatch, the stack will still be returned to a sensible state since
- the caller, which e{knows} how many parameters it pushed, does the
- removing.
- It is instructive to compare this calling convention with that for
- Pascal programs (described in k{16bpfunc}). Pascal has a simpler
- convention, since no functions have variable numbers of parameters.
- Therefore the callee knows how many parameters it should have been
- passed, and is able to deallocate them from the stack itself by
- passing an immediate argument to the c{RET} or c{RETF}
- instruction, so the caller does not have to do it. Also, the
- parameters are pushed in left-to-right order, not right-to-left,
- which means that a compiler can give better guarantees about
- sequence points without performance suffering.
- Thus, you would define a function in C style in the following way.
- The following example is for small model:
- c global _myfunc
- c _myfunc: push bp
- c mov bp,sp
- c sub sp,0x40 ; 64 bytes of local stack space
- c mov bx,[bp+4] ; first parameter to function
- c ; some more code
- c mov sp,bp ; undo "sub sp,0x40" above
- c pop bp
- c ret
- For a large-model function, you would replace c{RET} by c{RETF},
- and look for the first parameter at c{[BP+6]} instead of
- c{[BP+4]}. Of course, if one of the parameters is a pointer, then
- the offsets of e{subsequent} parameters will change depending on
- the memory model as well: far pointers take up four bytes on the
- stack when passed as a parameter, whereas near pointers take up two.
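- To make the offsets concrete, here is a sketch of a small-model
- function that a C program could declare as
- c{int add2(int a, int b)}. The function name is made up for the
- example.

```nasm
        global  _add2

_add2:  push    bp
        mov     bp,sp
        mov     ax,[bp+4]       ; a: leftmost parameter, pushed last
        add     ax,[bp+6]       ; b: next parameter, two bytes further up
        pop     bp              ; no locals, so SP needs no restoring
        ret                     ; result is returned in AX

; In large model the same function would read its parameters from
; [bp+6] and [bp+8], and return with RETF.
```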
- At the other end of the process, to call a C function from your
- assembly code, you would do something like this:
- c extern _printf
- c ; and then, further down...
- c push word [myint] ; one of my integer variables
- c push word mystring ; pointer into my data segment
- c call _printf
- c add sp,byte 4 ; `byte' saves space
- c ; then those data items...
- c segment _DATA
- c myint dw 1234
- c mystring db 'This number -> %d <- should be 1234',10,0
- This piece of code is the small-model assembly equivalent of the C
- code
- c int myint = 1234;
- c printf("This number -> %d <- should be 1234\n", myint);
- In large model, the function-call code might look more like this. In
- this example, it is assumed that c{DS} already holds the segment
- base of the segment c{_DATA}. If not, you would have to initialise
- it first.
- c push word [myint]
- c push word seg mystring ; Now push the segment, and...
- c push word mystring ; ... offset of "mystring"
- c call far _printf
- c add sp,byte 6
- The integer value still takes up one word on the stack, since large
- model does not affect the size of the c{int} data type. The first
- argument (pushed last) to c{printf}, however, is a data pointer,
- and therefore has to contain a segment and offset part. The segment
- should be stored second in memory, and therefore must be pushed
- first. (Of course, c{PUSH DS} would have been a shorter instruction
- than c{PUSH WORD SEG mystring}, if c{DS} was set up as the above
- example assumed.) Then the actual call becomes a far call, since
- functions expect far calls in large model; and c{SP} has to be
- increased by 6 rather than 4 afterwards to make up for the extra
- word of parameters.
- S{16cdata} Accessing Data Items
- To get at the contents of C variables, or to declare variables which
- C can access, you need only declare the names as c{GLOBAL} or
- c{EXTERN}. (Again, the names require leading underscores, as stated
- in k{16cunder}.) Thus, a C variable declared as c{int i} can be
- accessed from assembler as
- c extern _i
- c mov ax,[_i]
- And to declare your own integer variable which C programs can access
- as c{extern int j}, you do this (making sure you are assembling in
- the c{_DATA} segment, if necessary):
- c global _j
- c _j dw 0
- To access a C array, you need to know the size of the components of
- the array. For example, c{int} variables are two bytes long, so if
- a C program declares an array as c{int a[10]}, you can access
- c{a[3]} by coding c{mov ax,[_a+6]}. (The byte offset 6 is obtained
- by multiplying the desired array index, 3, by the size of the array
- element, 2.) The sizes of the C base types in 16-bit compilers are:
- 1 for c{char}, 2 for c{short} and c{int}, 4 for c{long} and
- c{float}, and 8 for c{double}.
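- When the index is not known at assembly time, the same scaling has
- to be done at run time. A small-model sketch, assuming c{DS} holds
- the default data segment and using a hypothetical variable c{index}:

```nasm
        extern  _a              ; declared in C as: int a[10];

        mov     bx,[index]      ; bx = array index
        shl     bx,1            ; scale by the element size, 2 bytes
        mov     ax,[_a+bx]      ; ax = a[index]

; then, in the data segment...

segment _DATA

index   dw      3
```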
- To access a C i{data structure}, you need to know the offset from
- the base of the structure to the field you are interested in. You
- can either do this by converting the C structure definition into a
- NASM structure definition (using ic{STRUC}), or by calculating the
- one offset and using just that.
- To do either of these, you should read your C compiler's manual to
- find out how it organises data structures. NASM gives no special
- alignment to structure members in its own c{STRUC} macro, so you
- have to specify alignment yourself if the C compiler generates it.
- Typically, you might find that a structure like
- c struct {
- c char c;
- c int i;
- c } foo;
- might be four bytes long rather than three, since the c{int} field
- would be aligned to a two-byte boundary. However, this sort of
- feature tends to be a configurable option in the C compiler, either
- using command-line options or c{#pragma} lines, so you have to find
- out how your own compiler does it.
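- If the compiler does pad the structure to four bytes, a matching
- NASM definition could be sketched as follows. The names c{foo_t},
- c{foo_c} and c{foo_i} are chosen for this example; c{_foo} is the
- C variable with its leading underscore.

```nasm
struc   foo_t
foo_c:  resb    1               ; char c, at offset 0
        alignb  2               ; padding the C compiler is assumed to insert
foo_i:  resw    1               ; int i, at offset 2
endstruc

        extern  _foo

        mov     al,[_foo+foo_c] ; al = foo.c
        mov     bx,[_foo+foo_i] ; bx = foo.i
```

- If the compiler packs the structure instead, the c{alignb} line
- should be removed so that c{foo_i} falls at offset 1.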
- S{16cmacro} ic{c16.mac}: Helper Macros for the 16-bit C Interface
- Included in the NASM archives, in the I{misc subdirectory}c{misc}
- directory, is a file c{c16.mac} of macros. It defines three macros:
- ic{proc}, ic{arg} and ic{endproc}. These are intended to be
- used for C-style procedure definitions, and they automate a lot of
- the work involved in keeping track of the calling convention.
- An example of an assembly function using the macro set is given
- here:
- c proc _nearproc
- c %$i arg
- c %$j arg
- c mov ax,[bp + %$i]
- c mov bx,[bp + %$j]
- c add ax,[bx]
- c endproc
- This defines c{_nearproc} to be a procedure taking two arguments,
- the first (c{i}) an integer and the second (c{j}) a pointer to an
- integer. It returns c{i + *j}.
- Note that the c{arg} macro has an c{EQU} as the first line of its
- expansion, and since the label before the macro call gets prepended
- to the first line of the expanded macro, the c{EQU} works, defining
- c{%$i} to be an offset from c{BP}. A context-local variable is
- used, local to the context pushed by the c{proc} macro and popped
- by the c{endproc} macro, so that the same argument name can be used
- in later procedures. Of course, you don't e{have} to do that.
- The macro set produces code for near functions (tiny, small and
- compact-model code) by default. You can have it generate far
- functions (medium, large and huge-model code) by means of coding
- Ic{FARCODE}c{%define FARCODE}. This changes the kind of return
- instruction generated by c{endproc}, and also changes the starting
- point for the argument offsets. The macro set contains no intrinsic
- dependency on whether data pointers are far or not.
- c{arg} can take an optional parameter, giving the size of the
- argument. If no size is given, 2 is assumed, since it is likely that
- many function parameters will be of type c{int}.
- The large-model equivalent of the above function would look like this:
- c %define FARCODE
- c proc _farproc
- c %$i arg
- c %$j arg 4
- c mov ax,[bp + %$i]
- c mov bx,[bp + %$j]
- c mov es,[bp + %$j + 2]
- c add ax,[es:bx] ; j is a far pointer, so read through ES
- c endproc
- This makes use of the argument to the c{arg} macro to define a
- parameter of size 4, because c{j} is now a far pointer. When we
- load from c{j}, we must load a segment and an offset.
- H{16bp} Interfacing to i{Borland Pascal} Programs
- Interfacing to Borland Pascal programs is similar in concept to
- interfacing to 16-bit C programs. The differences are:
- b The leading underscore required for interfacing to C programs is
- not required for Pascal.
- b The memory model is always large: functions are far, data
- pointers are far, and no data item can be more than 64K long.
- (Actually, some functions are near, but only those functions that
- are local to a Pascal unit and never called from outside it. All
- assembly functions that Pascal calls, and all Pascal functions that
- assembly routines are able to call, are far.) However, all static
- data declared in a Pascal program goes into the default data
- segment, which is the one whose segment address will be in c{DS}
- when control is passed to your assembly code. The only things that
- do not live in the default data segment are local variables (they
- live in the stack segment) and dynamically allocated variables. All
- data e{pointers}, however, are far.
- b The function calling convention is different - described below.
- b Some data types, such as strings, are stored differently.
- b There are restrictions on the segment names you are allowed to
- use - Borland Pascal will ignore code or data declared in a segment
- it doesn't like the name of. The restrictions are described below.
- S{16bpfunc} The Pascal Calling Convention
- I{functions, Pascal calling convention}I{Pascal calling
- convention}The 16-bit Pascal calling convention is as follows. In
- the following description, the words e{caller} and e{callee} are
- used to denote the function doing the calling and the function which
- gets called.
- b The caller pushes the function's parameters on the stack, one
- after another, in normal order (left to right, so that the first
- argument specified to the function is pushed first).
- b The caller then executes a far c{CALL} instruction to pass
- control to the callee.
- b The callee receives control, and typically (although this is not
- actually necessary, in functions which do not need to access their
- parameters) starts by saving the value of c{SP} in c{BP} so as to
- be able to use c{BP} as a base pointer to find its parameters on
- the stack. However, the caller was probably doing this too, so part
- of the calling convention states that c{BP} must be preserved by
- any function. Hence the callee, if it is going to set up c{BP} as a
- i{frame pointer}, must push the previous value first.
- b The callee may then access its parameters relative to c{BP}.
- The word at c{[BP]} holds the previous value of c{BP} as it was
- pushed. The next word, at c{[BP+2]}, holds the offset part of the
- return address, and the next one at c{[BP+4]} the segment part. The
- parameters begin at c{[BP+6]}. The rightmost parameter of the
- function, since it was pushed last, is accessible at this offset
- from c{BP}; the others follow, at successively greater offsets.
- b The callee may also wish to decrease c{SP} further, so as to
- allocate space on the stack for local variables, which will then be
- accessible at negative offsets from c{BP}.
- b The callee, if it wishes to return a value to the caller, should
- leave the value in c{AL}, c{AX} or c{DX:AX} depending on the size
- of the value. Floating-point results are returned in c{ST0}.
- Results of type c{Real} (Borland's own custom floating-point data
- type, not handled directly by the FPU) are returned in c{DX:BX:AX}.
- To return a result of type c{String}, the caller pushes a pointer
- to a temporary string before pushing the parameters, and the callee
- places the returned string value at that location. The pointer is
- not a parameter, and should not be removed from the stack by the
- c{RETF} instruction.
- b Once the callee has finished processing, it restores c{SP} from
- c{BP} if it had allocated local stack space, then pops the previous
- value of c{BP}, and returns via c{RETF}. It uses the form of
- c{RETF} with an immediate parameter, giving the number of bytes
- taken up by the parameters on the stack. This causes the parameters
- to be removed from the stack as a side effect of the return
- instruction.
- b When the caller regains control from the callee, the function
- parameters have already been removed from the stack, so it needs to
- do nothing further.
- Thus, you would define a function in Pascal style, taking two
- c{Integer}-type parameters, in the following way:
- c global myfunc
- c myfunc: push bp
- c mov bp,sp
- c sub sp,0x40 ; 64 bytes of local stack space
- c mov bx,[bp+8] ; first parameter to function
- c mov cx,[bp+6] ; second parameter to function
- c ; some more code
- c mov sp,bp ; undo "sub sp,0x40" above
- c pop bp
- c retf 4 ; total size of params is 4
- At the other end of the process, to call a Pascal function from your
- assembly code, you would do something like this:
- c extern SomeFunc
- c ; and then, further down...
- c push word seg mystring ; Now push the segment, and...
- c push word mystring ; ... offset of "mystring"
- c push word [myint] ; one of my variables
- c call far SomeFunc
- This is equivalent to the Pascal code
- c procedure SomeFunc(String: PChar; Int: Integer);
- c SomeFunc(@mystring, myint);
- S{16bpseg} Borland Pascal I{segment names, Borland Pascal}Segment
- Name Restrictions
- Since Borland Pascal's internal unit file format is completely
- different from c{OBJ}, it only makes a very sketchy job of actually
- reading and understanding the various information contained in a
- real c{OBJ} file when it links that in. Therefore an object file
- intended to be linked to a Pascal program must obey a number of
- restrictions:
- b Procedures and functions must be in a segment whose name is
- either c{CODE}, c{CSEG}, or something ending in c{_TEXT}.
- b Initialised data must be in a segment whose name is either
- c{CONST} or something ending in c{_DATA}.
- b Uninitialised data must be in a segment whose name is either
- c{DATA}, c{DSEG}, or something ending in c{_BSS}.
- b Any other segments in the object file are completely ignored.
- c{GROUP} directives and segment attributes are also ignored.
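- A module obeying these restrictions might be laid out as in the
- sketch below. The function c{AddStep} and the variables c{step} and
- c{calls} are invented for the example; the function is assumed to be
- declared on the Pascal side as an external far function taking one
- c{Integer} and returning an c{Integer}.

```nasm
segment CODE                    ; procedures and functions

        global  AddStep

AddStep:
        push    bp
        mov     bp,sp
        mov     ax,[bp+6]       ; the single Integer parameter
        add     ax,[step]       ; DS holds the default data segment
        inc     word [calls]
        pop     bp
        retf    2               ; callee removes its 2 bytes of parameters

segment CONST                   ; initialised data

step    dw      1

segment DATA                    ; uninitialised data

calls   resw    1
```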
- S{16bpmacro} Using ic{c16.mac} With Pascal Programs
- The c{c16.mac} macro package, described in k{16cmacro}, can also
- be used to simplify writing functions to be called from Pascal
- programs, if you code Ic{PASCAL}c{%define PASCAL}. This
- definition ensures that functions are far (it implies
- ic{FARCODE}), and also causes procedure return instructions to be
- generated with an operand.
- Defining c{PASCAL} does not change the code which calculates the
- argument offsets; you must declare your function's arguments in
- reverse order. For example:
- c %define PASCAL
- c proc _pascalproc
- c %$j arg 4
- c %$i arg
- c mov ax,[bp + %$i]
- c mov bx,[bp + %$j]
- c mov es,[bp + %$j + 2]
- c add ax,[es:bx] ; j is a far pointer, so read through ES
- c endproc
- This defines the same routine, conceptually, as the example in
- k{16cmacro}: it defines a function taking two arguments, an integer
- and a pointer to an integer, which returns the sum of the integer
- and the contents of the pointer. The only difference between this
- code and the large-model C version is that c{PASCAL} is defined
- instead of c{FARCODE}, and that the arguments are declared in
- reverse order.
- C{32bit} Writing 32-bit Code (Unix, Win32, DJGPP)
- This chapter attempts to cover some of the common issues involved
- when writing 32-bit code, to run under i{Win32} or Unix, or to be
- linked with C code generated by a Unix-style C compiler such as
- i{DJGPP}. It covers how to write assembly code to interface with
- 32-bit C routines, and how to write position-independent code for
- shared libraries.
- Almost all 32-bit code, and in particular all code running under
- Win32, DJGPP or any of the PC Unix variants, runs in the I{flat memory
- model}e{flat} memory model. This means that the segment registers
- and paging have already been set up to give you the same 32-bit 4GB
- address space no matter what segment you work relative to, and that
- you should ignore all segment registers completely. When writing
- flat-model application code, you never need to use a segment
- override or modify any segment register, and the code-section
- addresses you pass to c{CALL} and c{JMP} live in the same address
- space as the data-section addresses you access your variables by and
- the stack-section addresses you access local variables and procedure
- parameters by. Every address is 32 bits long and contains only an
- offset part.
- H{32c} Interfacing to 32-bit C Programs
- A lot of the discussion in k{16c}, about interfacing to 16-bit C
- programs, still applies when working in 32 bits. The absence of
- memory models or segmentation worries simplifies things a lot.
- S{32cunder} External Symbol Names
- Most 32-bit C compilers share the convention used by 16-bit
- compilers, that the names of all global symbols (functions or data)
- they define are formed by prefixing an underscore to the name as it
- appears in the C program. However, not all of them do: the ELF
- specification states that C symbols do e{not} have a leading
- underscore on their assembly-language names.
- The older Linux c{a.out} C compiler, all Win32 compilers, DJGPP,
- and NetBSD and FreeBSD, all use the leading underscore; for these
- compilers, the macros c{cextern} and c{cglobal}, as given in
- k{16cunder}, will still work. For ELF, though, the leading
- underscore should not be used.
- S{32cfunc} Function Definitions and Function Calls
- I{functions, C calling convention}The i{C calling convention} in
- 32-bit programs is as follows. In the
- following description, the words e{caller} and e{callee} are used
- to denote the function doing the calling and the function which gets
- called.
- b The caller pushes the function's parameters on the stack, one
- after another, in reverse order (right to left, so that the first
- argument specified to the function is pushed last).
- b The caller then executes a near c{CALL} instruction to pass
- control to the callee.
- b The callee receives control, and typically (although this is not
- actually necessary, in functions which do not need to access their
- parameters) starts by saving the value of c{ESP} in c{EBP} so as
- to be able to use c{EBP} as a base pointer to find its parameters
- on the stack. However, the caller was probably doing this too, so
- part of the calling convention states that c{EBP} must be preserved
- by any C function. Hence the callee, if it is going to set up
- c{EBP} as a i{frame pointer}, must push the previous value first.
- b The callee may then access its parameters relative to c{EBP}.
- The doubleword at c{[EBP]} holds the previous value of c{EBP} as
- it was pushed; the next doubleword, at c{[EBP+4]}, holds the return
- address, pushed implicitly by c{CALL}. The parameters start after
- that, at c{[EBP+8]}. The leftmost parameter of the function, since
- it was pushed last, is accessible at this offset from c{EBP}; the
- others follow, at successively greater offsets. Thus, in a function
- such as c{printf} which takes a variable number of parameters, the
- pushing of the parameters in reverse order means that the function
- knows where to find its first parameter, which tells it the number
- and type of the remaining ones.
- b The callee may also wish to decrease c{ESP} further, so as to
- allocate space on the stack for local variables, which will then be
- accessible at negative offsets from c{EBP}.
- b The callee, if it wishes to return a value to the caller, should
- leave the value in c{AL}, c{AX} or c{EAX} depending on the size
- of the value. Floating-point results are typically returned in
- c{ST0}.
- b Once the callee has finished processing, it restores c{ESP} from
- c{EBP} if it had allocated local stack space, then pops the previous
- value of c{EBP}, and returns via c{RET} (equivalently, c{RETN}).
- b When the caller regains control from the callee, the function
- parameters are still on the stack, so it typically adds an immediate
- constant to c{ESP} to remove them (instead of executing a number of
- slow c{POP} instructions). Thus, if a function is accidentally
- called with the wrong number of parameters due to a prototype
- mismatch, the stack will still be returned to a sensible state since
- the caller, which e{knows} how many parameters it pushed, does the
- removing.
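- The frame layout described in the list above can be summarised in a
- few lines of C. This is a sketch under one simplifying assumption,
- namely that every parameter occupies exactly one 4-byte stack slot;
- the names c{param_offset} and c{caller_cleanup_bytes} are invented
- for the example.

```c
#include <assert.h>

/* Frame layout after `push ebp` / `mov ebp,esp` in a 32-bit C function. */
enum {
    SLOT_SIZE          = 4, /* each stack slot is one doubleword   */
    SAVED_EBP_OFFSET   = 0, /* [EBP]   holds the caller's EBP       */
    RETURN_ADDR_OFFSET = 4, /* [EBP+4] holds the return address     */
    FIRST_PARAM_OFFSET = 8  /* [EBP+8] holds the leftmost parameter */
};

/* Offset from EBP of the parameter with the given zero-based index,
   assuming every parameter fills exactly one 4-byte slot. */
int param_offset(int index)
{
    assert(index >= 0);
    return FIRST_PARAM_OFFSET + index * SLOT_SIZE;
}

/* Number of bytes the caller adds to ESP once the callee has returned. */
int caller_cleanup_bytes(int param_count)
{
    assert(param_count >= 0);
    return param_count * SLOT_SIZE;
}
```

- For a call with two doubleword parameters, this gives offsets 8 and
- 12 for the parameters and a clean-up of 8 bytes for the caller.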
- There is an alternative calling convention used by Win32 programs
- for Windows API calls, and also for functions called e{by} the
- Windows API such as window procedures: they follow what Microsoft
- calls the c{__stdcall} convention. This is slightly closer to the
- Pascal convention, in that the callee clears the stack by passing a
- parameter to the c{RET} instruction. However, the parameters are
- still pushed in right-to-left order.
- Thus, you would define a function in C style in the following way:
- c global _myfunc
- c _myfunc: push ebp
- c mov ebp,esp
- c sub esp,0x40 ; 64 bytes of local stack space
- c mov ebx,[ebp+8] ; first parameter to function
- c ; some more code
- c leave ; mov esp,ebp / pop ebp
- c ret
- At the other end of the process, to call a C function from your
- assembly code, you would do something like this:
- c extern _printf
- c ; and then, further down...
- c push dword [myint] ; one of my integer variables
- c push dword mystring ; pointer into my data segment
- c call _printf
- c add esp,byte 8 ; `byte' saves space
- c ; then those data items...
- c segment _DATA
- c myint dd 1234
- c mystring db 'This number -> %d <- should be 1234',10,0
- This piece of code is the assembly equivalent of the C code
- c int myint = 1234;
- c printf("This number -> %d <- should be 1234\n", myint);
- S{32cdata} Accessing Data Items
- To get at the contents of C variables, or to declare variables which
- C can access, you need only declare the names as c{GLOBAL} or
- c{EXTERN}. (Again, the names require leading underscores, as stated
- in k{32cunder}.) Thus, a C variable declared as c{int i} can be
- accessed from assembler as
- c extern _i
- c mov eax,[_i]
- And to declare your own integer variable which C programs can access
- as c{extern int j}, you do this (making sure you are assembling in
- the c{_DATA} segment, if necessary):
- c global _j
- c _j dd 0
- To access a C array, you need to know the size of the components of
- the array. For example, c{int} variables are four bytes long, so if
- a C program declares an array as c{int a[10]}, you can access
- c{a[3]} by coding c{mov eax,[_a+12]}. (The byte offset 12 is obtained
- by multiplying the desired array index, 3, by the size of the array
- element, 4.) The sizes of the C base types in 32-bit compilers are:
- 1 for c{char}, 2 for c{short}, 4 for c{int}, c{long} and
- c{float}, and 8 for c{double}. Pointers, being 32-bit addresses,
- are also 4 bytes long.
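- The offset arithmetic can be checked from the C side. The sketch
- below is illustrative only: c{element_offset} and
- c{int32_element_offset} are hypothetical helpers, and c{int32_t} is
- used to stand in for a 32-bit compiler's c{int} so that the result
- does not depend on the machine the sketch is compiled on.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Byte offset of element `index` in an array whose elements are
   `element_size` bytes long: index * element_size. */
size_t element_offset(size_t index, size_t element_size)
{
    assert(element_size > 0);
    return index * element_size;
}

/* The same offset obtained from real pointer arithmetic on an array of
   32-bit integers, standing in for `int a[10]` on a 32-bit compiler. */
size_t int32_element_offset(size_t index)
{
    static int32_t a[10];
    assert(index < 10);
    return (size_t)((const char *)&a[index] - (const char *)&a[0]);
}
```

- Both functions give 12 for index 3 with 4-byte elements, which is the
- displacement used in the assembly operand.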
- To access a C i{data structure}, you need to know the offset from
- the base of the structure to the field you are interested in. You
- can either do this by converting the C structure definition into a
- NASM structure definition (using c{STRUC}), or by calculating the
- one offset and using just that.
- To do either of these, you should read your C compiler's manual to
- find out how it organises data structures. NASM gives no special
- alignment to structure members in its own ic{STRUC} macro, so you
- have to specify alignment yourself if the C compiler generates it.
- Typically, you might find that a structure like
- c struct {
- c char c;
- c int i;
- c } foo;
- might be eight bytes long rather than five, since the c{int} field
- would be aligned to a four-byte boundary. However, this sort of
- feature is sometimes a configurable option in the C compiler, either
- using command-line options or c{#pragma} lines, so you have to find
- out how your own compiler does it.
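- One way to find out what your compiler actually does, rather than
- guessing, is to ask it. The following C sketch uses the standard
- c{offsetof} macro and c{sizeof} on the structure from the text; the
- c{FOO_...} constant names are made up for the example, and the values
- they take depend on the compiler and its options.

```c
#include <stddef.h>

/* The structure from the text, given a tag so it can be named. */
struct foo {
    char c;
    int  i;
};

/* Offsets and size as this particular compiler lays the structure out.
   These are the numbers a matching NASM STRUC would have to reproduce. */
enum {
    FOO_C_OFFSET = offsetof(struct foo, c),
    FOO_I_OFFSET = offsetof(struct foo, i),
    FOO_SIZE     = sizeof(struct foo)
};

/* Padding bytes the compiler inserted between `c` and `i`. */
enum { FOO_PADDING = FOO_I_OFFSET - (FOO_C_OFFSET + sizeof(char)) };
```

- If c{FOO_PADDING} turns out to be non-zero, the NASM version of the
- structure needs the same number of padding bytes before the c{i}
- field.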
- S{32cmacro} ic{c32.mac}: Helper Macros for the 32-bit C Interface
- Included in the NASM archives, in the I{misc directory}c{misc}
- directory, is a file c{c32.mac} of macros. It defines three macros:
- ic{proc}, ic{arg} and ic{endproc}. These are intended to be
- used for C-style procedure definitions, and they automate a lot of
- the work involved in keeping track of the calling convention.
- An example of an assembly function using the macro set is given
- here:
- c proc _proc32
- c %$i arg
- c %$j arg
- c mov eax,[ebp + %$i]
- c mov ebx,[ebp + %$j]
- c add eax,[ebx]
- c endproc
- This defines c{_proc32} to be a procedure taking two arguments, the
- first (c{i}) an integer and the second (c{j}) a pointer to an
- integer. It returns c{i + *j}.
- Note that the c{arg} macro has an c{EQU} as the first line of its
- expansion, and since the label before the macro call gets prepended
- to the first line of the expanded macro, the c{EQU} works, defining
- c{%$i} to be an offset from c{EBP}. A context-local variable is
- used, local to the context pushed by the c{proc} macro and popped
- by the c{endproc} macro, so that the same argument name can be used
- in later procedures. Of course, you don't e{have} to do that.
- c{arg} can take an optional parameter, giving the size of the
- argument. If no size is given, 4 is assumed, since it is likely that
- many function parameters will be of type c{int} or pointers.
- H{picdll} Writing NetBSD/FreeBSD/OpenBSD and Linux/ELF i{Shared
- Libraries}
- ELF replaced the older c{a.out} object file format under Linux
- because it contains support for i{position-independent code}
- (i{PIC}), which makes writing shared libraries much easier. NASM
- supports the ELF position-independent code features, so you can
- write Linux ELF shared libraries in NASM.
- i{NetBSD}, and its close cousins i{FreeBSD} and i{OpenBSD}, take
- a different approach by hacking PIC support into the c{a.out}
- format. NASM supports this as the ic{aoutb} output format, so you
- can write i{BSD} shared libraries in NASM too.
- The operating system loads a PIC shared library by memory-mapping
- the library file at an arbitrarily chosen point in the address space
- of the running process. The contents of the library's code section
- must therefore not depend on where it is loaded in memory.
- Therefore, you cannot get at your variables by writing code like
- this:
- c mov eax,[myvar] ; WRONG
- Instead, the linker provides an area of memory called the
- ie{global offset table}, or i{GOT}; the GOT is situated at a
- constant distance from your library's code, so if you can find out
- where your library is loaded (which is typically done using a
- c{CALL} and c{POP} combination), you can obtain the address of the
- GOT, and you can then load the addresses of your variables out of
- linker-generated entries in the GOT.
- The e{data} section of a PIC shared library does not have these
- restrictions: since the data section is writable, it has to be
- copied into memory anyway rather than just paged in from the library
- file, so as long as it's being copied it can be relocated too. So
- you can put ordinary types of relocation in the data section without
- too much worry (but see k{picglobal} for a caveat).
- S{picgot} Obtaining the Address of the GOT
- Each code module in your shared library should define the GOT as an
- external symbol:
- c extern _GLOBAL_OFFSET_TABLE_ ; in ELF
- c extern __GLOBAL_OFFSET_TABLE_ ; in BSD a.out
- At the beginning of any function in your shared library which plans
- to access your data or BSS sections, you must first calculate the
- address of the GOT. This is typically done by writing the function
- in this form:
- c func: push ebp
- c mov ebp,esp
- c push ebx
- c call .get_GOT
- c .get_GOT: pop ebx
- c add ebx,_GLOBAL_OFFSET_TABLE_+$$-.get_GOT wrt ..gotpc
- c ; the function body comes here
- c mov ebx,[ebp-4]
- c mov esp,ebp
- c pop ebp
- c ret
- (For BSD, again, the symbol c{_GLOBAL_OFFSET_TABLE_} requires a
- second leading underscore.)
- The first two lines of this function are simply the standard C
- prologue to set up a stack frame, and the last three lines are
- standard C function epilogue. The third line, and the fourth to last
- line, save and restore the c{EBX} register, because PIC shared
- libraries use this register to store the address of the GOT.
- The interesting bit is the c{CALL} instruction and the following
- two lines. The c{CALL} and c{POP} combination obtains the address
- of the label c{.get_GOT}, without having to know in advance where
- the program was loaded (since the c{CALL} instruction is encoded
- relative to the current position). The c{ADD} instruction makes use
- of one of the special PIC relocation types: i{GOTPC relocation}.
- With the ic{WRT ..gotpc} qualifier specified, the symbol
- referenced (here c{_GLOBAL_OFFSET_TABLE_}, the special symbol
- assigned to the GOT) is given as an offset from the beginning of the
- section. (Actually, ELF encodes it as the offset from the operand
- field of the c{ADD} instruction, but NASM simplifies this
- deliberately, so you do things the same way for both ELF and BSD.)
- So the instruction then e{adds} the beginning of the section, to
- get the real address of the GOT, and subtracts the value of
- c{.get_GOT} which it knows is in c{EBX}. Therefore, by the time
- that instruction has finished, c{EBX} contains the address of the
- GOT.
- If you didn't follow that, don't worry: it's never necessary to
- obtain the address of the GOT by any other means, so you can put
- those three instructions into a macro and safely ignore them:
- c %macro get_GOT 0
- c call %%getgot
- c %%getgot: pop ebx
- c add ebx,_GLOBAL_OFFSET_TABLE_+$$-%%getgot wrt ..gotpc
- c %endmacro
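- For readers who would like to see the arithmetic behind those three
- instructions, here is a small C model of it. It follows NASM's
- simplified description given above (not the exact ELF relocation
- encoding), and everything in it is hypothetical: the c{layout}
- structure, the function names and the addresses are invented purely
- for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Link-time layout of a made-up library, as offsets from the start of
   the library image. */
struct layout {
    uint32_t section_start; /* what `$$` refers to       */
    uint32_t get_got_label; /* where `.get_GOT` sits     */
    uint32_t got;           /* where the GOT sits        */
};

/* The constant added to EBX: the GOT's offset from the section start,
   plus the section start, minus the label (all link-time values). */
uint32_t add_operand(const struct layout *l)
{
    uint32_t got_wrt_gotpc;
    assert(l != 0);
    got_wrt_gotpc = l->got - l->section_start;
    return got_wrt_gotpc + l->section_start - l->get_got_label;
}

/* Value left in EBX after CALL/POP/ADD when the image is mapped at
   `load_base`: POP yields the run-time address of the label. */
uint32_t ebx_after_get_got(const struct layout *l, uint32_t load_base)
{
    uint32_t ebx = load_base + l->get_got_label; /* call + pop */
    ebx += add_operand(l);                       /* add        */
    return ebx;
}
```

- Whatever value c{load_base} takes, the result is c{load_base} plus
- the GOT's link-time offset, which is the point of the exercise: the
- code never needs to know where it was loaded.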
- S{piclocal} Finding Your Local Data Items
- Having got the GOT, you can then use it to obtain the addresses of
- your data items. Most variables will reside in the sections you have
- declared; they can be accessed using the I{GOTOFF
- relocation}c{..gotoff} special Ic{WRT ..gotoff}c{WRT} type. The
- way this works is like this:
- c lea eax,[ebx+myvar wrt ..gotoff]
- The expression c{myvar wrt ..gotoff} is calculated, when the shared
- library is linked, to be the offset to the local variable c{myvar}
- from the beginning of the GOT. Therefore, adding it to c{EBX} as
- above will place the real address of c{myvar} in c{EAX}.
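- The same calculation, written out as a C sketch. The helper names
- c{gotoff} and c{lea_gotoff} and all addresses are hypothetical;
- 32-bit unsigned arithmetic is used so that variables lying below the
- GOT (a negative displacement) wrap around correctly.

```c
#include <stdint.h>

/* `myvar wrt ..gotoff`: the 32-bit displacement from the GOT to the
   variable, fixed when the library is linked. */
uint32_t gotoff(uint32_t var_link_addr, uint32_t got_link_addr)
{
    return var_link_addr - got_link_addr;
}

/* What `lea eax,[ebx+myvar wrt ..gotoff]` leaves in EAX, given the
   run-time GOT address held in EBX. */
uint32_t lea_gotoff(uint32_t ebx, uint32_t var_gotoff)
{
    return ebx + var_gotoff;
}
```

- Because the displacement is fixed at link time and only c{EBX}
- changes with the load address, no relocation of the code section is
- needed at run time.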
- If you declare variables as c{GLOBAL} without specifying a size for
- them, they are shared between code modules in the library, but do
- not get exported from the library to the program that loaded it.
- They will still be in your ordinary data and BSS sections, so you
- can access them in the same way as local variables, using the above
- c{..gotoff} mechanism.
- Note that due to a peculiarity of the way BSD c{a.out} format
- handles this relocation type, there must be at least one non-local
- symbol in the same section as the address you're trying to access.
- S{picextern} Finding External and Common Data Items
- If your library needs to get at an external variable (external to
- the e{library}, not just to one of the modules within it), you must
- use the I{GOT relocations}Ic{WRT ..got}c{..got} type to get at
- it. The c{..got} type, instead of giving you the offset from the
- GOT base to the variable, gives you the offset from the GOT base to
- a GOT e{entry} containing the address of the variable. The linker
- will set up this GOT entry when it builds the library, and the
- dynamic linker will place the correct address in it at load time. So
- to obtain the address of an external variable c{extvar} in c{EAX},
- you would code
- c mov eax,[ebx+extvar wrt ..got]
- This loads the address of c{extvar} out of an entry in the GOT. The
- linker, when it builds the shared library, collects together every
- relocation of type c{..got}, and builds the GOT so as to ensure it
- has every necessary entry present.
- Common variables must also be accessed in this way.
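- The difference from c{..gotoff} is that the address is e{loaded} from
- the table rather than computed. A minimal C model of the load,
- assuming 4-byte GOT entries; c{load_via_got} and the sample table
- contents are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { GOT_ENTRY_SIZE = 4 }; /* each GOT entry holds one 32-bit address */

/* Model of `mov eax,[ebx+extvar wrt ..got]`: `got` stands for the table
   EBX points at, and `got_offset` for the byte offset of extvar's entry
   (the value of `extvar wrt ..got`).  The result is the address that
   was stored in that entry at load time. */
uint32_t load_via_got(const uint32_t *got, uint32_t got_offset)
{
    assert(got != NULL && got_offset % GOT_ENTRY_SIZE == 0);
    return got[got_offset / GOT_ENTRY_SIZE];
}
```

- A second memory access through the returned address is then needed
- to reach the variable's contents.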
- S{picglobal} Exporting Symbols to the Library User
- If you want to export symbols to the user of the library, you have
- to declare whether they are functions or data, and if they are data,
- you have to give the size of the data item. This is because the
- dynamic linker has to build I{PLT}i{procedure linkage table}
- entries for any exported functions, and also moves exported data
- items away from the library's data section in which they were
- declared.
- So to export a function to users of the library, you must use
- c global func:function ; declare it as a function
- c func: push ebp
- c ; etc.
- And to export a data item such as an array, you would have to code
- c global array:data array.end-array ; give the size too
- c array: resd 128
- c .end:
- Be careful: If you export a variable to the library user, by
- declaring it as c{GLOBAL} and supplying a size, the variable will
- end up living in the data section of the main program, rather than
- in your library's data section, where you declared it. So you will
- have to access your own global variable with the c{..got} mechanism
- rather than c{..gotoff}, as if it were external (which,
- effectively, it has become).
- Equally, if you need to store the address of an exported global in
- one of your data sections, you can't do it by means of the standard
- sort of code:
- c dataptr: dd global_data_item ; WRONG
- NASM will interpret this code as an ordinary relocation, in which
- c{global_data_item} is merely an offset from the beginning of the
- c{.data} section (or whatever); so this reference will end up
- pointing at your data section instead of at the exported global
- which resides elsewhere.
- Instead of the above code, then, you must write
- c dataptr: dd global_data_item wrt ..sym
- which makes use of the special c{WRT} type Ic{WRT ..sym}c{..sym}
- to instruct NASM to search the symbol table for a particular symbol
- at that address, rather than just relocating by section base.
- Either method will work for functions: referring to one of your
- functions by means of
- c funcptr: dd my_function
- will give the user the address of the code you wrote, whereas
- c funcptr: dd my_function wrt ..sym
- will give the address of the procedure linkage table for the
- function, which is where the calling program will e{believe} the
- function lives. Either address is a valid way to call the function.
- S{picproc} Calling Procedures Outside the Library
- Calling procedures outside your shared library has to be done by
- means of a ie{procedure linkage table}, or i{PLT}. The PLT is
- placed at a known offset from where the library is loaded, so the
- library code can make calls to the PLT in a position-independent
- way. Within the PLT there is code to jump to offsets contained in
- the GOT, so function calls to other shared libraries or to routines
- in the main program can be transparently passed off to their real
- destinations.
- To call an external routine, you must use another special PIC
- relocation type, I{PLT relocations}ic{WRT ..plt}. This is much
- easier than the GOT-based ones: you simply replace calls such as
- c{CALL printf} with the PLT-relative version c{CALL printf WRT
- ..plt}.
- S{link} Generating the Library File
- Having written some code modules and assembled them to c{.o} files,
- you then generate your shared library with a command such as
- c ld -shared -o library.so module1.o module2.o # for ELF
- c ld -Bshareable -o library.so module1.o module2.o # for BSD
- For ELF, if your shared library is going to reside in system
- directories such as c{/usr/lib} or c{/lib}, it is usually worth
- using the ic{-soname} flag to the linker, to store the final
- library file name, with a version number, into the library:
- c ld -shared -soname library.so.1 -o library.so.1.2 *.o
- You would then copy c{library.so.1.2} into the library directory,
- and create c{library.so.1} as a symbolic link to it.
- C{mixsize} Mixing 16 and 32 Bit Code
- This chapter tries to cover some of the issues, largely related to
- unusual forms of addressing and jump instructions, encountered when
- writing operating system code such as protected-mode initialisation
- routines, which require code that operates in mixed segment sizes,
- such as code in a 16-bit segment trying to modify data in a 32-bit
- one, or jumps between different-size segments.
- H{mixjump} Mixed-Size JumpsI{jumps, mixed-size}
- I{operating system, writing}I{writing operating systems}The most
- common form of i{mixed-size instruction} is the one used when
- writing a 32-bit OS: having done your setup in 16-bit mode, such as
- loading the kernel, you then have to boot it by switching into
- protected mode and jumping to the 32-bit kernel start address. In a
- fully 32-bit OS, this tends to be the e{only} mixed-size
- instruction you need, since everything before it can be done in pure
- 16-bit code, and everything after it can be pure 32-bit.
- This jump must specify a 48-bit far address, since the target
- segment is a 32-bit one. However, it must be assembled in a 16-bit
- segment, so just coding, for example,
- c jmp 0x1234:0x56789ABC ; wrong!
- will not work, since the offset part of the address will be
- truncated to c{0x9ABC} and the jump will be an ordinary 16-bit far
- one.
- The Linux kernel setup code gets round the inability of c{as86} to
- generate the required instruction by coding it manually, using
- c{DB} instructions. NASM can go one better than that, by actually
- generating the right instruction itself. Here's how to do it right:
- c jmp dword 0x1234:0x56789ABC ; right
- Ic{JMP DWORD}The c{DWORD} prefix (strictly speaking, it should
- come e{after} the colon, since it is declaring the e{offset} field
- to be a doubleword; but NASM will accept either form, since both are
- unambiguous) forces the offset part to be treated as a 32-bit value,
- on the assumption that you are deliberately writing a jump from a 16-bit
- segment to a 32-bit one.
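- To make the encoding concrete, the following C sketch writes out the
- bytes of such a jump as they appear in a 16-bit segment: the
- operand-size prefix c{0x66}, the far-jump opcode c{0xEA}, a four-byte
- offset and a two-byte selector. The function names are hypothetical;
- this is the kind of byte sequence one would otherwise have to emit by
- hand with c{DB}.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Write the bytes of a far jump with a 32-bit offset, as it must appear
   in a 16-bit code segment.  Returns the number of bytes written. */
size_t encode_jmp_far_16_to_32(uint8_t out[8], uint16_t selector,
                               uint32_t offset)
{
    size_t n = 0;
    assert(out != NULL);
    out[n++] = 0x66;                              /* operand-size prefix */
    out[n++] = 0xEA;                              /* JMP far, immediate  */
    out[n++] = (uint8_t)(offset & 0xFF);          /* offset, low byte    */
    out[n++] = (uint8_t)((offset >> 8) & 0xFF);
    out[n++] = (uint8_t)((offset >> 16) & 0xFF);
    out[n++] = (uint8_t)((offset >> 24) & 0xFF);  /* offset, high byte   */
    out[n++] = (uint8_t)(selector & 0xFF);        /* selector, low byte  */
    out[n++] = (uint8_t)((selector >> 8) & 0xFF); /* selector, high byte */
    return n;
}

/* What remains of the offset when the jump is assembled without the
   DWORD prefix in a 16-bit segment: only the low 16 bits. */
uint16_t truncated_offset(uint32_t offset)
{
    return (uint16_t)(offset & 0xFFFF);
}
```

- Without the prefix byte the processor would read only two offset
- bytes, which is why the unprefixed form ends up jumping to c{0x9ABC}.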
- You can do the reverse operation, jumping from a 32-bit segment to a
- 16-bit one, by means of the c{WORD} prefix:
- c jmp word 0x8765:0x4321 ; 32 to 16 bit
- If the c{WORD} prefix is specified in 16-bit mode, or the c{DWORD}
- prefix in 32-bit mode, they will be ignored, since each is
- explicitly forcing NASM into a mode it was in anyway.
- H{mixaddr} Addressing Between Different-Size SegmentsI{addressing,
- mixed-size}I{mixed-size addressing}
- If your OS is mixed 16 and 32-bit, or if you are writing a DOS
- extender, you are likely to have to deal with some 16-bit segments
- and some 32-bit ones. At some point, you will probably end up
- writing code in a 16-bit segment which has to access data in a
- 32-bit segment, or vice versa.
- If the data you are trying to access in a 32-bit segment lies within
- the first 64K of the segment, you may be able to get away with using
- an ordinary 16-bit addressing operation for the purpose; but sooner
- or later, you will want to do 32-bit addressing from 16-bit mode.
- The easiest way to do this is to make sure you use a register for
- the address, since any effective address containing a 32-bit
- register is forced to be a 32-bit address. So you can do
- c mov eax,offset_into_32_bit_segment_specified_by_fs
- c mov dword [fs:eax],0x11223344
- This is fine, but slightly cumbersome (since it wastes an
- instruction and a register) if you already know the precise offset
- you are aiming at. The x86 architecture does allow 32-bit effective
- addresses to specify nothing but a 4-byte offset, so why shouldn't
- NASM be able to generate the best instruction for the purpose?
- It can. As in k{mixjump}, you need only prefix the address with the
- c{DWORD} keyword, and it will be forced to be a 32-bit address:
- c mov dword [fs:dword my_offset],0x11223344
- Also as in k{mixjump}, NASM is not fussy about whether the
- c{DWORD} prefix comes before or after the segment override, so
- arguably a nicer-looking way to code the above instruction is
- c mov dword [dword fs:my_offset],0x11223344
- Don't confuse the c{DWORD} prefix e{outside} the square brackets,
- which controls the size of the data stored at the address, with the
- one e{inside} the square brackets which controls the length of the
- address itself. The two can quite easily be different:
- c mov word [dword 0x12345678],0x9ABC
- This moves 16 bits of data to an address specified by a 32-bit
- offset.
- You can also specify c{WORD} or c{DWORD} prefixes along with the
- c{FAR} prefix to indirect far jumps or calls. For example:
- c call dword far [fs:word 0x4321]
- This instruction contains an address specified by a 16-bit offset;
- it loads a 48-bit far pointer from that (16-bit segment and 32-bit
- offset), and calls that address.
- H{mixother} Other Mixed-Size Instructions
- The other way you might want to access data might be using the
- string instructions (c{LODSx}, c{STOSx} and so on) or the
- c{XLATB} instruction. These instructions, since they take no
- parameters, might seem to have no easy way to make them perform
- 32-bit addressing when assembled in a 16-bit segment.
- This is the purpose of NASM's ic{a16} and ic{a32} prefixes. If
- you are coding c{LODSB} in a 16-bit segment but it is supposed to
- be accessing a string in a 32-bit segment, you should load the
- desired address into c{ESI} and then code
- c a32 lodsb
- The prefix forces the addressing size to 32 bits, meaning that
- c{LODSB} loads from c{[DS:ESI]} instead of c{[DS:SI]}. To access
- a string in a 16-bit segment when coding in a 32-bit one, the
- corresponding c{a16} prefix can be used.
- The c{a16} and c{a32} prefixes can be applied to any instruction
- in NASM's instruction table, but most of them can generate all the
- useful forms without them. The prefixes are necessary only for
- instructions with implicit addressing: c{CMPSx} (k{insCMPSB}),
- c{SCASx} (k{insSCASB}), c{LODSx} (k{insLODSB}), c{STOSx}
- (k{insSTOSB}), c{MOVSx} (k{insMOVSB}), c{INSx} (k{insINSB}),
- c{OUTSx} (k{insOUTSB}), and c{XLATB} (k{insXLATB}). Also, the
- various push and pop instructions (c{PUSHA} and c{POPF} as well as
- the more usual c{PUSH} and c{POP}) can accept c{a16} or c{a32}
- prefixes to force a particular one of c{SP} or c{ESP} to be used
- as a stack pointer, in case the stack segment in use is a different
- size from the code segment.
- c{PUSH} and c{POP}, when applied to segment registers in 32-bit
- mode, also have the slightly odd behaviour that they push and pop 4
- bytes at a time, of which the top two are ignored and the bottom two
- give the value of the segment register being manipulated. To force
- the 16-bit behaviour of segment-register push and pop instructions,
- you can use the operand-size prefix ic{o16}:
- c o16 push ss
- c o16 push ds
- This code saves a doubleword of stack space by fitting two segment
- registers into the space which would normally be consumed by pushing
- one.
- (You can also use the ic{o32} prefix to force the 32-bit behaviour
- when in 16-bit mode, but this seems less useful.)
- C{trouble} Troubleshooting
- This chapter describes some of the common problems that users have
- been known to encounter with NASM, and answers them. It also gives
- instructions for reporting bugs in NASM if you find a difficulty
- that isn't listed here.
- H{problems} Common Problems
- S{inefficient} NASM Generates i{Inefficient Code}
- I get a lot of `bug' reports about NASM generating inefficient, or
- even `wrong', code on instructions such as c{ADD ESP,8}. This is a
- deliberate design feature, connected to predictability of output:
- NASM, on seeing c{ADD ESP,8}, will generate the form of the
- instruction which leaves room for a 32-bit offset. You need to code
- Ic{BYTE}c{ADD ESP,BYTE 8} if you want the space-efficient
- form of the instruction. This isn't a bug: at worst it's a
- misfeature, and that's a matter of opinion only.
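- The size difference between the two forms can be seen by writing out
- their encodings for a 32-bit code segment. The C sketch below is
- illustrative only (c{encode_add_esp} is a hypothetical helper, not
- part of NASM): opcode c{0x83} takes a sign-extended one-byte
- immediate, while opcode c{0x81} takes a full four-byte one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { MODRM_ADD_ESP = 0xC4 }; /* mod=11, /0 (ADD), r/m=100 (ESP) */

/* Encode ADD ESP,imm for a 32-bit code segment.
   byte_form != 0 selects opcode 0x83 with a sign-extended 8-bit
   immediate (what `ADD ESP,BYTE 8` asks for); otherwise opcode 0x81
   with a full 32-bit immediate is used.  Returns the length in bytes. */
size_t encode_add_esp(uint8_t out[6], int32_t imm, int byte_form)
{
    size_t n = 0;
    uint32_t bits = (uint32_t)imm;
    assert(out != NULL);
    if (byte_form) {
        assert(imm >= -128 && imm <= 127); /* must fit in a signed byte */
        out[n++] = 0x83;
        out[n++] = MODRM_ADD_ESP;
        out[n++] = (uint8_t)(bits & 0xFF);
    } else {
        out[n++] = 0x81;
        out[n++] = MODRM_ADD_ESP;
        out[n++] = (uint8_t)(bits & 0xFF);
        out[n++] = (uint8_t)((bits >> 8) & 0xFF);
        out[n++] = (uint8_t)((bits >> 16) & 0xFF);
        out[n++] = (uint8_t)((bits >> 24) & 0xFF);
    }
    return n;
}
```

- The long form costs six bytes and the c{BYTE} form three, so the
- saving is three bytes per instruction.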
- S{jmprange} My Jumps are Out of RangeI{out of range, jumps}
- Similarly, people complain that when they issue i{conditional
- jumps} (which are c{SHORT} by default) that try to jump too far,
- NASM reports `short jump out of range' instead of making the jumps
- longer.
- This, again, is partly a predictability issue, but in fact has a
- more practical reason as well. NASM has no means of being told what
- type of processor the code it is generating will be run on; so it
- cannot decide for itself that it should generate ic{Jcc NEAR} type
- instructions, because it doesn't know that it's working for a 386 or
- above. Alternatively, it could replace the out-of-range short
- c{JNE} instruction with a very short c{JE} instruction that jumps
- over a c{JMP NEAR}; this is a sensible solution for processors
- below a 386, but hardly efficient on processors which have good
- branch prediction e{and} could have used c{JNE NEAR} instead. So,
- once again, it's up to the user, not the assembler, to decide what
- instructions should be generated.
- S{proborg} ic{ORG} Doesn't Work
- People writing i{boot sector} programs in the c{bin} format often
- complain that c{ORG} doesn't work the way they'd like: in order to
- place the c{0xAA55} signature word at the end of a 512-byte boot
- sector, people who are used to MASM tend to code
- c ORG 0
- c ; some boot sector code
- c ORG 510
- c DW 0xAA55
- This is not the intended use of the c{ORG} directive in NASM, and
- will not work. The correct way to solve this problem in NASM is to
- use the ic{TIMES} directive, like this:
- c ORG 0
- c ; some boot sector code
- c TIMES 510-($-$$) DB 0
- c DW 0xAA55
- The c{TIMES} directive will insert exactly enough zero bytes into
- the output to move the assembly point up to 510. This method also
- has the advantage that if you accidentally fill your boot sector too
- full, NASM will catch the problem at assembly time and report it, so
- you won't end up with a boot sector that you have to disassemble to
- find out what's wrong with it.
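- The padding calculation that c{TIMES 510-($-$$) DB 0} performs can be
- mirrored in C. This is a sketch only, with hypothetical names
- (c{padding_needed}, c{build_boot_sector}); it returns -1 in the case
- where NASM would refuse to assemble because the code is too long.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { SECTOR_SIZE = 512, SIGNATURE_OFFSET = 510 };

/* Number of zero bytes needed after `code_len` bytes of code to reach
   offset 510, or -1 if the code already overruns that point. */
int padding_needed(size_t code_len)
{
    if (code_len > SIGNATURE_OFFSET)
        return -1;
    return (int)(SIGNATURE_OFFSET - code_len);
}

/* Build a 512-byte sector image: code, zero padding, then the word
   0xAA55 stored little-endian.  Returns 0 on success, -1 if too long. */
int build_boot_sector(uint8_t sector[SECTOR_SIZE], const uint8_t *code,
                      size_t code_len)
{
    assert(sector != NULL && (code != NULL || code_len == 0));
    if (padding_needed(code_len) < 0)
        return -1;
    memset(sector, 0, SECTOR_SIZE);      /* the zero padding        */
    if (code_len > 0)
        memcpy(sector, code, code_len);  /* the boot sector code    */
    sector[SIGNATURE_OFFSET]     = 0x55; /* low byte of 0xAA55      */
    sector[SIGNATURE_OFFSET + 1] = 0xAA; /* high byte of 0xAA55     */
    return 0;
}
```

- As with the c{TIMES} version, the amount of padding shrinks as the
- code grows, and a negative count is treated as an error rather than
- silently producing an oversized sector.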
- S{probtimes} ic{TIMES} Doesn't Work
- The other common problem with the above code is people who write the
- c{TIMES} line as
- c TIMES 510-$ DB 0
- by reasoning that c{$} should be a pure number, just like 510, so
- the difference between them is also a pure number and can happily be
- fed to c{TIMES}.
- NASM is a e{modular} assembler: the various component parts are
- designed to be easily separable for re-use, so they don't exchange
- information unnecessarily. In consequence, the c{bin} output
- format, even though it has been told by the c{ORG} directive that
- the c{.text} section should start at 0, does not pass that
- information back to the expression evaluator. So from the
- evaluator's point of view, c{$} isn't a pure number: it's an offset
- from a section base. Therefore the difference between c{$} and 510
- is also not a pure number, but involves a section base. Values
- involving section bases cannot be passed as arguments to c{TIMES}.
- The solution, as in the previous section, is to code the c{TIMES}
- line in the form
- c TIMES 510-($-$$) DB 0
- in which c{$} and c{$$} are offsets from the same section base,
- and so their difference is a pure number. This will solve the
- problem and generate sensible code.
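- The reasoning above can be modelled with a toy value type that
- records how many section bases a value involves. This is a
- hypothetical model for illustration only: the class c{Value} and the
- function c{times_count} are invented names, and NASM's real
- expression evaluator is not claimed to work this way.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Value:
    """An expression value: a numeric offset plus a count of section bases."""
    offset: int
    bases: int = 0  # 0 means a pure number; 1 means "section base + offset"

    def __sub__(self, other: "Value") -> "Value":
        # Offsets subtract, and so do the section-base counts.
        return Value(self.offset - other.offset, self.bases - other.bases)


def times_count(value: Value) -> int:
    """Accept only pure numbers, as the text says TIMES does."""
    if value.bases != 0:
        raise ValueError("TIMES count involves a section base")
    return value.offset


here = Value(37, bases=1)   # $  : 37 bytes past the section base (arbitrary)
start = Value(0, bases=1)   # $$ : the section base itself
limit = Value(510)          # the pure number 510

good = times_count(limit - (here - start))   # the bases cancel
```

- In this model c{limit - here} still carries a section base, so
- passing it to c{times_count} raises an error, whereas
- c{limit - (here - start)} is a pure number.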
- H{bugs} i{Bugs}I{reporting bugs}
- We have never yet released a version of NASM with any e{known}
- bugs. That doesn't usually stop there being plenty we didn't know
- about, though. Any that you find should be reported to
- W{mailto:hpa@zytor.com}c{hpa@zytor.com}.
- Please read k{qstart} first, and don't report the bug if it's
- listed in there as a deliberate feature. (If you think the feature
- is badly thought out, feel free to send us reasons why you think it
- should be changed, but don't just send us mail saying `This is a
- bug' if the documentation says we did it on purpose.) Then read
- k{problems}, and don't bother reporting the bug if it's listed
- there.
- If you do report a bug, e{please} give us all of the following
- information:
- b What operating system you're running NASM under. DOS, Linux,
- NetBSD, Win16, Win32, VMS (I'd be impressed), whatever.
- b If you're running NASM under DOS or Win32, tell us whether you've
- compiled your own executable from the DOS source archive, or whether
- you were using the standard distribution binaries out of the
- archive. If you were using a locally built executable, try to
- reproduce the problem using one of the standard binaries, as this
- will make it easier for us to reproduce your problem prior to fixing
- it.
- b Which version of NASM you're using, and exactly how you invoked
- it. Give us the precise command line, and the contents of the
- c{NASM} environment variable if any.
- b Which versions of any supplementary programs you're using, and
- how you invoked them. If the problem only becomes visible at link
- time, tell us what linker you're using, what version of it you've
- got, and the exact linker command line. If the problem involves
- linking against object files generated by a compiler, tell us what
- compiler, what version, and what command line or options you used.
- (If you're compiling in an IDE, please try to reproduce the problem
- with the command-line version of the compiler.)
- b If at all possible, send us a NASM source file which exhibits the
- problem. If this causes copyright problems (e.g. you can only
- reproduce the bug in restricted-distribution code) then bear in mind
- the following two points: firstly, we guarantee that any source code
- sent to us for the purposes of debugging NASM will be used e{only}
- for the purposes of debugging NASM, and that we will delete all our
- copies of it as soon as we have found and fixed the bug or bugs in
- question; and secondly, we would prefer e{not} to be mailed large
- chunks of code anyway. The smaller the file, the better. A
- three-line sample file that does nothing useful e{except}
- demonstrate the problem is much easier to work with than a
- fully fledged ten-thousand-line program. (Of course, some errors
- e{do} only crop up in large files, so this may not be possible.)
- b A description of what the problem actually e{is}. `It doesn't
- work' is e{not} a helpful description! Please describe exactly what
- is happening that shouldn't be, or what isn't happening that should.
- Examples might be: `NASM generates an error message saying Line 3
- for an error that's actually on Line 5'; `NASM generates an error
- message that I believe it shouldn't be generating at all'; `NASM
- fails to generate an error message that I believe it e{should} be
- generating'; `the object file produced from this source code crashes
- my linker'; `the ninth byte of the output file is 66 and I think it
- should be 77 instead'.
- b If you believe the output file from NASM to be faulty, send it to
- us. That allows us to determine whether our own copy of NASM
- generates the same file, or whether the problem is related to
- portability issues between our development platforms and yours. We
- can handle binary files mailed to us as MIME attachments, uuencoded,
- and even BinHex. Alternatively, we may be able to provide an FTP
- site you can upload the suspect files to; but mailing them is easier
- for us.
- b Any other information or data files that might be helpful. If,
- for example, the problem involves NASM failing to generate an object
- file while TASM can generate an equivalent file without trouble,
- then send us e{both} object files, so we can see what TASM is doing
- differently from us.
- A{iref} Intel x86 Instruction Reference
- This appendix provides a complete list of the machine instructions
- which NASM will assemble, and a short description of the function of
- each one.
- It is not intended to be exhaustive documentation on the fine
- details of the instructions' function, such as which exceptions they
- can trigger: for such documentation, you should go to Intel's Web
- site, W{http://www.intel.com/}c{http://www.intel.com/}.
- Instead, this appendix is intended primarily to provide
- documentation on the way the instructions may be used within NASM.
- For example, looking up c{LOOP} will tell you that NASM allows
- c{CX} or c{ECX} to be specified as an optional second argument to
- the c{LOOP} instruction, to enforce which of the two possible
- counter registers should be used if the default is not the one
- desired.
- The instructions are not quite listed in alphabetical order, since
- groups of instructions with similar functions are lumped together in
- the same entry. Even so, most of them do not move very far from
- their alphabetical position.
- H{iref-opr} Key to Operand Specifications
- The instruction descriptions in this appendix specify their operands
- using the following notation:
- b Registers: c{reg8} denotes an 8-bit i{general purpose
- register}, c{reg16} denotes a 16-bit general purpose register, and
- c{reg32} a 32-bit one. c{fpureg} denotes one of the eight FPU
- stack registers, c{mmxreg} denotes one of the eight 64-bit MMX
- registers, and c{segreg} denotes a segment register. In addition,
- some registers (such as c{AL}, c{DX} or
- c{ECX}) may be specified explicitly.
- b Immediate operands: c{imm} denotes a generic i{immediate operand}.
- c{imm8}, c{imm16} and c{imm32} are used when the operand is
- intended to be a specific size. For some of these instructions, NASM
- needs an explicit specifier: for example, c{ADD ESP,16} could be
- interpreted as either c{ADD r/m32,imm32} or c{ADD r/m32,imm8}.
- NASM chooses the former by default, and so you must specify c{ADD
- ESP,BYTE 16} for the latter.
- b Memory references: c{mem} denotes a generic i{memory reference};
- c{mem8}, c{mem16}, c{mem32}, c{mem64} and c{mem80} are used
- when the operand needs to be a specific size. Again, a specifier is
- needed in some cases: c{DEC [address]} is ambiguous and will be
- rejected by NASM. You must specify c{DEC BYTE [address]}, c{DEC
- WORD [address]} or c{DEC DWORD [address]} instead.
- b i{Restricted memory references}: one form of the c{MOV}
- instruction allows a memory address to be specified e{without}
- allowing the normal range of register combinations and effective
- address processing. This is denoted by c{memoffs8}, c{memoffs16}
- and c{memoffs32}.
- b Register or memory choices: many instructions can accept either a
- register e{or} a memory reference as an operand. c{r/m8} is a
- shorthand for c{reg8/mem8}; similarly c{r/m16} and c{r/m32}.
- c{r/m64} is MMX-related, and is a shorthand for c{mmxreg/mem64}.
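- The difference between the two candidate encodings of
- c{ADD ESP,16} can be seen by building both byte sequences. The
- sketch below is written in Python, assumes c{BITS 32} (so no c{66}
- prefix is needed), and relies on the ModR/M byte layout and register
- values described later in this appendix. The helper names are
- hypothetical.

```python
ESP = 4  # register value of ESP


def modrm(mod: int, spare: int, rm: int) -> int:
    """Pack the three ModR/M fields into one byte."""
    return (mod << 6) | (spare << 3) | rm


def add_r32_imm32(reg: int, value: int) -> bytes:
    """ADD r/m32,imm32 (81 /0 id) with a register destination."""
    return bytes([0x81, modrm(3, 0, reg)]) + (value & 0xFFFFFFFF).to_bytes(4, "little")


def add_r32_imm8(reg: int, value: int) -> bytes:
    """ADD r/m32,imm8 (83 /0 ib) with a register destination."""
    if not -128 <= value <= 127:
        raise ValueError("immediate does not fit in a signed byte")
    return bytes([0x83, modrm(3, 0, reg), value & 0xFF])


long_form = add_r32_imm32(ESP, 16)   # ADD ESP,16
short_form = add_r32_imm8(ESP, 16)   # ADD ESP,BYTE 16
```

- The first form occupies six bytes and the second three, which is why
- the c{BYTE} specifier is worth writing when the immediate fits.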
- H{iref-opc} Key to Opcode Descriptions
- This appendix also provides the opcodes which NASM will generate for
- each form of each instruction. The opcodes are listed in the
- following way:
- b A hex number, such as c{3F}, indicates a fixed byte containing
- that number.
- b A hex number followed by c{+r}, such as c{C8+r}, indicates that
- one of the operands to the instruction is a register, and the
- `register value' of that register should be added to the hex number
- to produce the generated byte. For example, EDX has register value
- 2, so the code c{C8+r}, when the register operand is EDX, generates
- the hex byte c{CA}. Register values for specific registers are
- given in k{iref-rv}.
- b A hex number followed by c{+cc}, such as c{40+cc}, indicates
- that the instruction name has a condition code suffix, and the
- numeric representation of the condition code should be added to the
- hex number to produce the generated byte. For example, the code
- c{40+cc}, when the instruction contains the c{NE} condition,
- generates the hex byte c{45}. Condition codes and their numeric
- representations are given in k{iref-cc}.
- b A slash followed by a digit, such as c{/2}, indicates that one
- of the operands to the instruction is a memory address or register
- (denoted c{mem} or c{r/m}, with an optional size). This is to be
- encoded as an effective address, with a i{ModR/M byte}, an optional
- i{SIB byte}, and an optional displacement, and the spare (register)
- field of the ModR/M byte should be the digit given (which will be
- from 0 to 7, so it fits in three bits). The encoding of effective
- addresses is given in k{iref-ea}.
- b The code c{/r} combines the above two: it indicates that one of
- the operands is a memory address or c{r/m}, and another is a
- register, and that an effective address should be generated with the
- spare (register) field in the ModR/M byte being equal to the
- `register value' of the register operand. The encoding of effective
- addresses is given in k{iref-ea}; register values are given in
- k{iref-rv}.
- b The codes c{ib}, c{iw} and c{id} indicate that one of the
- operands to the instruction is an immediate value, and that this is
- to be encoded as a byte, little-endian word or little-endian
- doubleword respectively.
- b The codes c{rb}, c{rw} and c{rd} indicate that one of the
- operands to the instruction is an immediate value, and that the
- e{difference} between this value and the address of the end of the
- instruction is to be encoded as a byte, word or doubleword
- respectively. Where the form c{rw/rd} appears, it indicates that
- either c{rw} or c{rd} should be used according to whether assembly
- is being performed in c{BITS 16} or c{BITS 32} state respectively.
- b The codes c{ow} and c{od} indicate that one of the operands to
- the instruction is a reference to the contents of a memory address
- specified as an immediate value: this encoding is used in some forms
- of the c{MOV} instruction in place of the standard
- effective-address mechanism. The displacement is encoded as a word
- or doubleword. Again, c{ow/od} denotes that c{ow} or c{od} should
- be chosen according to the c{BITS} setting.
- b The codes c{o16} and c{o32} indicate that the given form of the
- instruction should be assembled with operand size 16 or 32 bits. In
- other words, c{o16} indicates a c{66} prefix in c{BITS 32} state,
- but generates no code in c{BITS 16} state; and c{o32} indicates a
- c{66} prefix in c{BITS 16} state but generates nothing in c{BITS
- 32}.
- b The codes c{a16} and c{a32}, similarly to c{o16} and c{o32},
- indicate the address size of the given form of the instruction.
- Where this does not match the c{BITS} setting, a c{67} prefix is
- required.
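- The c{+r}, c{+cc} and c{o16}/c{o32} notations amount to simple
- arithmetic, sketched here in Python. The function names are
- hypothetical; the values checked are the two worked examples given
- in the list above.

```python
def plus_r(base_opcode: int, register_value: int) -> int:
    """Apply the +r notation: add a register value (0-7) to the opcode byte."""
    if not 0 <= register_value <= 7:
        raise ValueError("register value must be 0-7")
    return base_opcode + register_value


def plus_cc(base_opcode: int, condition_code: int) -> int:
    """Apply the +cc notation: add a condition code (0-15) to the opcode byte."""
    if not 0 <= condition_code <= 15:
        raise ValueError("condition code must be 0-15")
    return base_opcode + condition_code


def operand_size_prefix(operand_bits: int, bits_setting: int) -> bytes:
    """Return the 66 prefix byte when o16/o32 differs from the BITS setting."""
    return bytes([0x66]) if operand_bits != bits_setting else b""


edx_byte = plus_r(0xC8, 2)   # C8+r with EDX (register value 2)
ne_byte = plus_cc(0x40, 5)   # 40+cc with the NE condition (code 5)
```

- The c{a16} and c{a32} codes follow the same pattern as
- c{operand_size_prefix}, with c{67} in place of c{66}.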
- S{iref-rv} Register Values
- Where an instruction requires a register value, it is already
- implicit in the encoding of the rest of the instruction what type of
- register is intended: an 8-bit general-purpose register, a segment
- register, a debug register, an MMX register, or whatever. Therefore
- there is no problem with registers of different types sharing an
- encoding value.
- The encodings for the various classes of register are:
- b 8-bit general registers: c{AL} is 0, c{CL} is 1, c{DL} is 2,
- c{BL} is 3, c{AH} is 4, c{CH} is 5, c{DH} is 6, and c{BH} is
- 7.
- b 16-bit general registers: c{AX} is 0, c{CX} is 1, c{DX} is 2,
- c{BX} is 3, c{SP} is 4, c{BP} is 5, c{SI} is 6, and c{DI} is 7.
- b 32-bit general registers: c{EAX} is 0, c{ECX} is 1, c{EDX} is
- 2, c{EBX} is 3, c{ESP} is 4, c{EBP} is 5, c{ESI} is 6, and
- c{EDI} is 7.
- b i{Segment registers}: c{ES} is 0, c{CS} is 1, c{SS} is 2, c{DS}
- is 3, c{FS} is 4, and c{GS} is 5.
- b I{floating-point, registers}{Floating-point registers}: c{ST0}
- is 0, c{ST1} is 1, c{ST2} is 2, c{ST3} is 3, c{ST4} is 4,
- c{ST5} is 5, c{ST6} is 6, and c{ST7} is 7.
- b 64-bit i{MMX registers}: c{MM0} is 0, c{MM1} is 1, c{MM2} is 2,
- c{MM3} is 3, c{MM4} is 4, c{MM5} is 5, c{MM6} is 6, and c{MM7}
- is 7.
- b i{Control registers}: c{CR0} is 0, c{CR2} is 2, c{CR3} is 3,
- and c{CR4} is 4.
- b i{Debug registers}: c{DR0} is 0, c{DR1} is 1, c{DR2} is 2,
- c{DR3} is 3, c{DR6} is 6, and c{DR7} is 7.
- b i{Test registers}: c{TR3} is 3, c{TR4} is 4, c{TR5} is 5,
- c{TR6} is 6, and c{TR7} is 7.
- (Note that wherever a register name contains a number, that number
- is also the register value for that register.)
- S{iref-cc} i{Condition Codes}
- The available condition codes are given here, along with their
- numeric representations as part of opcodes. Many of these condition
- codes have synonyms, so several will be listed at a time.
- In the following descriptions, the word `either', when applied to two
- possible trigger conditions, is used to mean `either or both'. If
- `either but not both' is meant, the phrase `exactly one of' is used.
- b c{O} is 0 (trigger if the overflow flag is set); c{NO} is 1.
- b c{B}, c{C} and c{NAE} are 2 (trigger if the carry flag is
- set); c{AE}, c{NB} and c{NC} are 3.
- b c{E} and c{Z} are 4 (trigger if the zero flag is set); c{NE}
- and c{NZ} are 5.
- b c{BE} and c{NA} are 6 (trigger if either of the carry or zero
- flags is set); c{A} and c{NBE} are 7.
- b c{S} is 8 (trigger if the sign flag is set); c{NS} is 9.
- b c{P} and c{PE} are 10 (trigger if the parity flag is set);
- c{NP} and c{PO} are 11.
- b c{L} and c{NGE} are 12 (trigger if exactly one of the sign and
- overflow flags is set); c{GE} and c{NL} are 13.
- b c{LE} and c{NG} are 14 (trigger if either the zero flag is set,
- or exactly one of the sign and overflow flags is set); c{G} and
- c{NLE} are 15.
- Note that in all cases, the sense of a condition code may be
- reversed by changing the low bit of the numeric representation.
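- As an illustration, the table above can be written out as a Python
- dictionary, together with a sketch of how each code is evaluated
- from the flags. The names c{CONDITION_CODES}, c{invert} and
- c{condition_holds} are hypothetical.

```python
CONDITION_CODES = {
    "O": 0, "NO": 1,
    "B": 2, "C": 2, "NAE": 2, "AE": 3, "NB": 3, "NC": 3,
    "E": 4, "Z": 4, "NE": 5, "NZ": 5,
    "BE": 6, "NA": 6, "A": 7, "NBE": 7,
    "S": 8, "NS": 9,
    "P": 10, "PE": 10, "NP": 11, "PO": 11,
    "L": 12, "NGE": 12, "GE": 13, "NL": 13,
    "LE": 14, "NG": 14, "G": 15, "NLE": 15,
}


def invert(code: int) -> int:
    """Reverse the sense of a condition by flipping the low bit."""
    return code ^ 1


def condition_holds(code: int, cf: bool, zf: bool, sf: bool, of: bool, pf: bool) -> bool:
    """Evaluate a condition code against the carry, zero, sign, overflow and parity flags."""
    # One entry per even code: O, B, E, BE, S, P, L, LE.
    base = [of, cf, zf, cf or zf, sf, pf, sf != of, zf or (sf != of)][code >> 1]
    # An odd code is the negation of the even code below it.
    return bool(base) != bool(code & 1)
```

- For example, c{invert(CONDITION_CODES["E"])} gives 5, the code for
- c{NE}.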
- S{iref-ea} Effective Address Encoding: i{ModR/M} and i{SIB}
- An i{effective address} is encoded in up to three parts: a ModR/M
- byte, an optional SIB byte, and an optional byte, word or doubleword
- displacement field.
- The ModR/M byte consists of three fields: the c{mod} field, ranging
- from 0 to 3, in the upper two bits of the byte, the c{r/m} field,
- ranging from 0 to 7, in the lower three bits, and the spare
- (register) field in the middle (bit 3 to bit 5). The spare field is
- not relevant to the effective address being encoded, and either
- contains an extension to the instruction opcode or the register
- value of another operand.
- The ModR/M system can be used to encode a direct register reference
- rather than a memory access. This is always done by setting the
- c{mod} field to 3 and the c{r/m} field to the register value of
- the register in question (it must be a general-purpose register, and
- the size of the register must already be implicit in the encoding of
- the rest of the instruction). In this case, the SIB byte and
- displacement field are both absent.
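- The field layout of the ModR/M byte can be sketched as a pair of
- Python helpers. The function names are hypothetical. The example
- packs the register-direct form of c{ADD EDX,ECX} using the
- c{ADD reg32,r/m32} encoding (c{03 /r}), with c{EDX} (register value
- 2) in the spare field and c{ECX} (register value 1) in c{r/m}.

```python
def pack_modrm(mod: int, spare: int, rm: int) -> int:
    """Build a ModR/M byte from its mod (2-bit), spare (3-bit) and r/m (3-bit) fields."""
    if not (0 <= mod <= 3 and 0 <= spare <= 7 and 0 <= rm <= 7):
        raise ValueError("field out of range")
    return (mod << 6) | (spare << 3) | rm


def unpack_modrm(byte: int) -> tuple:
    """Split a ModR/M byte into (mod, spare, r/m)."""
    return byte >> 6, (byte >> 3) & 7, byte & 7


# mod = 3 selects a direct register reference in the r/m field.
add_edx_ecx = bytes([0x03, pack_modrm(3, 2, 1)])
```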
- In 16-bit addressing mode (either c{BITS 16} with no c{67} prefix,
- or c{BITS 32} with a c{67} prefix), the SIB byte is never used.
- The general rules for c{mod} and c{r/m} (there is an exception,
- given below) are:
- b The c{mod} field gives the length of the displacement field: 0
- means no displacement, 1 means one byte, and 2 means two bytes.
- b The c{r/m} field encodes the combination of registers to be
- added to the displacement to give the accessed address: 0 means
- c{BX+SI}, 1 means c{BX+DI}, 2 means c{BP+SI}, 3 means c{BP+DI},
- 4 means c{SI} only, 5 means c{DI} only, 6 means c{BP} only, and 7
- means c{BX} only.
- However, there is a special case:
- b If c{mod} is 0 and c{r/m} is 6, the effective address encoded
- is not c{[BP]} as the above rules would suggest, but instead
- c{[disp16]}: the displacement field is present and is two bytes
- long, and no registers are added to the displacement.
- Therefore the effective address c{[BP]} cannot be encoded as
- efficiently as c{[BX]}; so if you code c{[BP]} in a program, NASM
- adds a notional 8-bit zero displacement, and sets c{mod} to 1,
- c{r/m} to 6, and the one-byte displacement field to 0.
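- The 16-bit rules, including the c{[BP]} special case, can be
- sketched as a small Python encoder. The function c{encode_ea16} is a
- hypothetical helper, and picking the shortest displacement that fits
- is an assumption of this sketch rather than a statement about NASM's
- exact behaviour.

```python
RM16 = {
    ("BX", "SI"): 0, ("BX", "DI"): 1, ("BP", "SI"): 2, ("BP", "DI"): 3,
    ("SI",): 4, ("DI",): 5, ("BP",): 6, ("BX",): 7,
}


def encode_ea16(regs=(), disp=0, spare=0) -> bytes:
    """Return the ModR/M byte and displacement for a 16-bit effective address."""
    if not regs:
        # Special case: mod = 0 with r/m = 6 means [disp16], not [BP].
        return bytes([(spare << 3) | 6]) + (disp & 0xFFFF).to_bytes(2, "little")
    rm = RM16[tuple(regs)]
    if disp == 0 and rm != 6:
        mod, disp_bytes = 0, b""
    elif -128 <= disp <= 127:
        # [BP] with no displacement lands here, with a zero byte.
        mod, disp_bytes = 1, bytes([disp & 0xFF])
    else:
        mod, disp_bytes = 2, (disp & 0xFFFF).to_bytes(2, "little")
    return bytes([(mod << 6) | (spare << 3) | rm]) + disp_bytes
```

- With the spare field left at zero, c{encode_ea16(("BX",))} is the
- single byte c{07}, while c{encode_ea16(("BP",))} needs the two bytes
- c{46 00}.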
- In 32-bit addressing mode (either c{BITS 16} with a c{67} prefix,
- or c{BITS 32} with no c{67} prefix) the general rules (again,
- there are exceptions) for c{mod} and c{r/m} are:
- b The c{mod} field gives the length of the displacement field: 0
- means no displacement, 1 means one byte, and 2 means four bytes.
- b If only one register is to be added to the displacement, and it
- is not c{ESP}, the c{r/m} field gives its register value, and the
- SIB byte is absent. If the c{r/m} field is 4 (which would encode
- c{ESP}), the SIB byte is present and gives the combination and
- scaling of registers to be added to the displacement.
- If the SIB byte is present, it describes the combination of
- registers (an optional base register, and an optional index register
- scaled by multiplication by 1, 2, 4 or 8) to be added to the
- displacement. The SIB byte is divided into the c{scale} field, in
- the top two bits, the c{index} field in the next three, and the
- c{base} field in the bottom three. The general rules are:
- b The c{base} field encodes the register value of the base
- register.
- b The c{index} field encodes the register value of the index
- register, unless it is 4, in which case no index register is used
- (so c{ESP} cannot be used as an index register).
- b The c{scale} field encodes the multiplier by which the index
- register is scaled before adding it to the base and displacement: 0
- encodes a multiplier of 1, 1 encodes 2, 2 encodes 4 and 3 encodes 8.
- The exceptions to the 32-bit encoding rules are:
- b If c{mod} is 0 and c{r/m} is 5, the effective address encoded
- is not c{[EBP]} as the above rules would suggest, but instead
- c{[disp32]}: the displacement field is present and is four bytes
- long, and no registers are added to the displacement.
- b If c{mod} is 0, c{r/m} is 4 (meaning the SIB byte is present)
- and c{base} is 5, the effective address encoded is not
- c{[EBP+index]} as the above rules would suggest, but instead
- c{[disp32+index]}: the displacement field is present and is four
- bytes long, and there is no base register (but the index register is
- still processed in the normal way).
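- A minimal Python sketch of the 32-bit rules follows. The helper
- c{encode_ea32} is hypothetical, and choosing the shortest
- displacement that fits is an assumption of the sketch. Note that
- c{EBP} has register value 5, so it is the value 5 in the c{r/m} or
- c{base} field, combined with c{mod} 0, that selects the
- no-base-register forms.

```python
REG32 = {"EAX": 0, "ECX": 1, "EDX": 2, "EBX": 3,
         "ESP": 4, "EBP": 5, "ESI": 6, "EDI": 7}
SCALE = {1: 0, 2: 1, 4: 2, 8: 3}


def encode_ea32(base=None, index=None, scale=1, disp=0, spare=0) -> bytes:
    """Return ModR/M, optional SIB and displacement for a 32-bit effective address."""
    if index == "ESP":
        raise ValueError("ESP cannot be used as an index register")
    need_sib = index is not None or base == "ESP"
    if base is None:
        # No base register: mod = 0 plus the value 5 means a bare disp32.
        mod, disp_bytes = 0, (disp & 0xFFFFFFFF).to_bytes(4, "little")
    elif disp == 0 and base != "EBP":
        mod, disp_bytes = 0, b""
    elif -128 <= disp <= 127:
        # [EBP] with no displacement lands here, with a zero byte.
        mod, disp_bytes = 1, bytes([disp & 0xFF])
    else:
        mod, disp_bytes = 2, (disp & 0xFFFFFFFF).to_bytes(4, "little")
    base_value = 5 if base is None else REG32[base]
    if not need_sib:
        return bytes([(mod << 6) | (spare << 3) | base_value]) + disp_bytes
    index_value = 4 if index is None else REG32[index]  # 4 means "no index"
    sib = (SCALE[scale] << 6) | (index_value << 3) | base_value
    return bytes([(mod << 6) | (spare << 3) | 4, sib]) + disp_bytes
```

- For instance, c{encode_ea32("EAX", "EBX", 4, 8)} describes
- c{[EAX+EBX*4+8]} and yields the bytes c{44 98 08} when the spare
- field is zero.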
- H{iref-flg} Key to Instruction Flags
- Given along with each instruction in this appendix is a set of
- flags, denoting the type of the instruction. The types are as follows:
- b c{8086}, c{186}, c{286}, c{386}, c{486}, c{PENT} and c{P6}
- denote the lowest processor type that supports the instruction. Most
- instructions run on all processors above the given type; those that
- do not are documented. The Pentium II contains no additional
- instructions beyond the P6 (Pentium Pro); from the point of view of
- its instruction set, it can be thought of as a P6 with MMX
- capability.
- b c{CYRIX} indicates that the instruction is specific to Cyrix
- processors, for example the extra MMX instructions in the Cyrix
- extended MMX instruction set.
- b c{FPU} indicates that the instruction is a floating-point one,
- and will only run on machines with a coprocessor (automatically
- including 486DX, Pentium and above).
- b c{MMX} indicates that the instruction is an MMX one, and will
- run on MMX-capable Pentium processors and the Pentium II.
- b c{PRIV} indicates that the instruction is a protected-mode
- management instruction. Many of these may only be used in protected
- mode, or only at privilege level zero.
- b c{UNDOC} indicates that the instruction is an undocumented one,
- and not part of the official Intel Architecture; it may or may not
- be supported on any given machine.
- H{insAAA} ic{AAA}, ic{AAS}, ic{AAM}, ic{AAD}: ASCII
- Adjustments
- c AAA ; 37 [8086]
- c AAS ; 3F [8086]
- c AAD ; D5 0A [8086]
- c AAD imm ; D5 ib [8086]
- c AAM ; D4 0A [8086]
- c AAM imm ; D4 ib [8086]
- These instructions are used in conjunction with the add, subtract,
- multiply and divide instructions to perform binary-coded decimal
- arithmetic in e{unpacked} (one BCD digit per byte - easy to
- translate to and from ASCII, hence the instruction names) form.
- There are also packed BCD instructions c{DAA} and c{DAS}: see
- k{insDAA}.
- c{AAA} should be used after a one-byte c{ADD} instruction whose
- destination was the c{AL} register: by means of examining the value
- in the low nibble of c{AL} and also the auxiliary carry flag
- c{AF}, it determines whether the addition has overflowed, and
- adjusts it (and sets the carry flag) if so. You can add long BCD
- strings together by doing c{ADD}/c{AAA} on the low digits, then
- doing c{ADC}/c{AAA} on each subsequent digit.
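- The c{ADD}/c{AAA} and c{ADC}/c{AAA} sequence can be simulated in
- Python to show how a long unpacked BCD addition proceeds. This is a
- sketch of the documented behaviour under simplifying assumptions
- (only c{AL}, c{AH}, the carry flag and the auxiliary carry flag are
- modelled); all function names are hypothetical.

```python
def add8(a: int, b: int, carry_in: int = 0):
    """8-bit ADD (or ADC when carry_in is 1): return (result, carry, auxiliary carry)."""
    total = a + b + carry_in
    aux = ((a & 0x0F) + (b & 0x0F) + carry_in) > 0x0F
    return total & 0xFF, total > 0xFF, aux


def aaa(al: int, ah: int, aux: bool):
    """Adjust AL after adding two unpacked BCD digits: return (al, ah, carry)."""
    if (al & 0x0F) > 9 or aux:
        al = (al + 6) & 0xFF
        ah = (ah + 1) & 0xFF
        carry = True
    else:
        carry = False
    return al & 0x0F, ah, carry


def add_unpacked_bcd(x, y):
    """Add two equal-length digit lists, least significant digit first."""
    digits, carry = [], 0
    for dx, dy in zip(x, y):
        al, _, aux = add8(dx, dy, carry)   # ADD on the first digit, ADC afterwards
        al, _, cf = aaa(al, 0, aux)
        digits.append(al)
        carry = int(cf)
    return digits, carry


total, carry_out = add_unpacked_bcd([9, 7], [8, 4])   # 79 + 48
```

- Here c{total} holds the digits 7 and 2 with a final carry of 1, that
- is, the decimal result 127.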
- c{AAS} works similarly to c{AAA}, but is for use after c{SUB}
- instructions rather than c{ADD}.
- c{AAM} is for use after you have multiplied two decimal digits
- together and left the result in c{AL}: it divides c{AL} by ten and
- stores the quotient in c{AH}, leaving the remainder in c{AL}. The
- divisor 10 can be changed by specifying an operand to the
- instruction: a particularly handy use of this is c{AAM 16}, causing
- the two nibbles in c{AL} to be separated into c{AH} and c{AL}.
- c{AAD} performs the inverse operation to c{AAM}: it multiplies
- c{AH} by ten, adds it to c{AL}, and sets c{AH} to zero. Again,
- the multiplier 10 can be changed.
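- The effect of c{AAM} and c{AAD} on c{AH} and c{AL} can be sketched
- in Python as follows. The function names are hypothetical and the
- flags are not modelled.

```python
def aam(al: int, base: int = 10):
    """AAM: divide AL by base; return (ah, al) = (quotient, remainder)."""
    return al // base, al % base


def aad(ah: int, al: int, base: int = 10):
    """AAD: return (ah, al) with AL = AH*base + AL (kept to 8 bits) and AH = 0."""
    return 0, (ah * base + al) & 0xFF


split = aam(0xA7, 16)    # AAM 16 separates the two nibbles of AL
joined = aad(6, 3)       # AAD turns the digits 6 and 3 into 63
```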
- H{insADC} ic{ADC}: Add with Carry
- c ADC r/m8,reg8 ; 10 /r [8086]
- c ADC r/m16,reg16 ; o16 11 /r [8086]
- c ADC r/m32,reg32 ; o32 11 /r [386]
- c ADC reg8,r/m8 ; 12 /r [8086]
- c ADC reg16,r/m16 ; o16 13 /r [8086]
- c ADC reg32,r/m32 ; o32 13 /r [386]
- c ADC r/m8,imm8 ; 80 /2 ib [8086]
- c ADC r/m16,imm16 ; o16 81 /2 iw [8086]
- c ADC r/m32,imm32 ; o32 81 /2 id [386]
- c ADC r/m16,imm8 ; o16 83 /2 ib [8086]
- c ADC r/m32,imm8 ; o32 83 /2 ib [386]
- c ADC AL,imm8 ; 14 ib [8086]
- c ADC AX,imm16 ; o16 15 iw [8086]
- c ADC EAX,imm32 ; o32 15 id [386]
- c{ADC} performs integer addition: it adds its two operands
- together, plus the value of the carry flag, and leaves the result in
- its destination (first) operand. The flags are set according to the
- result of the operation: in particular, the carry flag is affected
- and can be used by a subsequent c{ADC} instruction.
- In the forms with an 8-bit immediate second operand and a longer
- first operand, the second operand is considered to be signed, and is
- sign-extended to the length of the first operand. In these cases,
- the c{BYTE} qualifier is necessary to force NASM to generate this
- form of the instruction.
- To add two numbers without also adding the contents of the carry
- flag, use c{ADD} (k{insADD}).
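- The usual reason to reach for c{ADC} is multi-word arithmetic: the
- low parts are added with c{ADD} and the carry is then propagated
- into the high parts with c{ADC}. The Python sketch below models a
- 64-bit addition built from two 32-bit steps; the function names are
- hypothetical.

```python
MASK32 = 0xFFFFFFFF


def add32(a: int, b: int, carry_in: int = 0):
    """32-bit addition with carry in: return (result, carry out)."""
    total = a + b + carry_in
    return total & MASK32, total >> 32


def add64_with_adc(x: int, y: int):
    """Add two 64-bit values as ADD on the low halves, then ADC on the high halves."""
    lo, carry = add32(x & MASK32, y & MASK32)                          # ADD
    hi, carry = add32((x >> 32) & MASK32, (y >> 32) & MASK32, carry)   # ADC
    return (hi << 32) | lo, carry


result, carry_out = add64_with_adc(0x00000001FFFFFFFF, 1)
```

- The low halves overflow here, and the carry raises the high half
- from 1 to 2.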
- H{insADD} ic{ADD}: Add Integers
- c ADD r/m8,reg8 ; 00 /r [8086]
- c ADD r/m16,reg16 ; o16 01 /r [8086]
- c ADD r/m32,reg32 ; o32 01 /r [386]
- c ADD reg8,r/m8 ; 02 /r [8086]
- c ADD reg16,r/m16 ; o16 03 /r [8086]
- c ADD reg32,r/m32 ; o32 03 /r [386]
- c ADD r/m8,imm8 ; 80 /0 ib [8086]
- c ADD r/m16,imm16 ; o16 81 /0 iw [8086]
- c ADD r/m32,imm32 ; o32 81 /0 id [386]
- c ADD r/m16,imm8 ; o16 83 /0 ib [8086]
- c ADD r/m32,imm8 ; o32 83 /0 ib [386]
- c ADD AL,imm8 ; 04 ib [8086]
- c ADD AX,imm16 ; o16 05 iw [8086]
- c ADD EAX,imm32 ; o32 05 id [386]
- c{ADD} performs integer addition: it adds its two operands
- together, and leaves the result in its destination (first) operand.
- The flags are set according to the result of the operation: in
- particular, the carry flag is affected and can be used by a
- subsequent c{ADC} instruction (k{insADC}).
- In the forms with an 8-bit immediate second operand and a longer
- first operand, the second operand is considered to be signed, and is
- sign-extended to the length of the first operand. In these cases,
- the c{BYTE} qualifier is necessary to force NASM to generate this
- form of the instruction.
- H{insAND} ic{AND}: Bitwise AND
- c AND r/m8,reg8 ; 20 /r [8086]
- c AND r/m16,reg16 ; o16 21 /r [8086]
- c AND r/m32,reg32 ; o32 21 /r [386]
- c AND reg8,r/m8 ; 22 /r [8086]
- c AND reg16,r/m16 ; o16 23 /r [8086]
- c AND reg32,r/m32 ; o32 23 /r [386]
- c AND r/m8,imm8 ; 80 /4 ib [8086]
- c AND r/m16,imm16 ; o16 81 /4 iw [8086]
- c AND r/m32,imm32 ; o32 81 /4 id [386]
- c AND r/m16,imm8 ; o16 83 /4 ib [8086]
- c AND r/m32,imm8 ; o32 83 /4 ib [386]
- c AND AL,imm8 ; 24 ib [8086]
- c AND AX,imm16 ; o16 25 iw [8086]
- c AND EAX,imm32 ; o32 25 id [386]
- c{AND} performs a bitwise AND operation between its two operands
- (i.e. each bit of the result is 1 if and only if the corresponding
- bits of the two inputs were both 1), and stores the result in the
- destination (first) operand.
- In the forms with an 8-bit immediate second operand and a longer
- first operand, the second operand is considered to be signed, and is
- sign-extended to the length of the first operand. In these cases,
- the c{BYTE} qualifier is necessary to force NASM to generate this
- form of the instruction.
- The MMX instruction c{PAND} (see k{insPAND}) performs the same
- operation on the 64-bit MMX registers.
- H{insARPL} ic{ARPL}: Adjust RPL Field of Selector
- c ARPL r/m16,reg16 ; 63 /r [286,PRIV]
- c{ARPL} expects its two word operands to be segment selectors. It
- adjusts the RPL (requested privilege level - stored in the bottom
- two bits of the selector) field of the destination (first) operand
- to ensure that it is no less than (i.e. no more privileged than) the RPL
- field of the source operand. The zero flag is set if and only if a
- change had to be made.
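- The adjustment performed by c{ARPL} can be sketched in Python. The
- function name c{arpl} is hypothetical, and selectors are treated
- simply as 16-bit integers.

```python
def arpl(dest: int, src: int):
    """Return (new destination selector, zero flag) after ARPL dest,src."""
    if (dest & 3) < (src & 3):
        # Raise the destination RPL to match the source RPL.
        return (dest & 0xFFFC) | (src & 3), True
    return dest, False


adjusted = arpl(0x0008, 0x0013)    # destination RPL 0, source RPL 3
unchanged = arpl(0x000B, 0x0010)   # destination RPL 3, source RPL 0
```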
- H{insBOUND} ic{BOUND}: Check Array Index against Bounds
- c BOUND reg16,mem ; o16 62 /r [186]
- c BOUND reg32,mem ; o32 62 /r [386]
- c{BOUND} expects its second operand to point to an area of memory
- containing two signed values of the same size as its first operand
- (i.e. two words for the 16-bit form; two doublewords for the 32-bit
- form). It performs two signed comparisons: if the value in the
- register passed as its first operand is less than the first of the
- in-memory values (the lower bound), or is greater than the second
- (the upper bound), it throws a BR exception. Otherwise, it does
- nothing.
- H{insBSF} ic{BSF}, ic{BSR}: Bit Scan
- c BSF reg16,r/m16 ; o16 0F BC /r [386]
- c BSF reg32,r/m32 ; o32 0F BC /r [386]
- c BSR reg16,r/m16 ; o16 0F BD /r [386]
- c BSR reg32,r/m32 ; o32 0F BD /r [386]
- c{BSF} searches for a set bit in its source (second) operand,
- starting from the bottom, and if it finds one, stores the index in
- its destination (first) operand. If no set bit is found, the
- contents of the destination operand are undefined.
- c{BSR} performs the same function, but searches from the top
- instead, so it finds the most significant set bit.
- Bit indices are from 0 (least significant) to 15 or 31 (most
- significant).
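- The two scans can be sketched in Python. The function names are
- hypothetical, and c{None} is used here to stand for the case where
- no set bit is found and the destination is left undefined.

```python
def bsf(value: int):
    """Index of the least significant set bit, or None if value is zero."""
    if value == 0:
        return None
    return (value & -value).bit_length() - 1  # isolate the lowest set bit


def bsr(value: int):
    """Index of the most significant set bit, or None if value is zero."""
    if value == 0:
        return None
    return value.bit_length() - 1


lowest = bsf(0b00101000)
highest = bsr(0b00101000)
```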
- H{insBSWAP} ic{BSWAP}: Byte Swap
- c BSWAP reg32 ; o32 0F C8+r [486]
- c{BSWAP} swaps the order of the four bytes of a 32-bit register:
- bits 0-7 exchange places with bits 24-31, and bits 8-15 swap with
- bits 16-23. There is no explicit 16-bit equivalent: to byte-swap
- c{AX}, c{BX}, c{CX} or c{DX}, c{XCHG} can be used on the two byte
- halves of the register (for example, c{XCHG AL,AH}).
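- Both byte swaps can be sketched in Python; the function names are
- hypothetical. The 16-bit helper models exchanging the low and high
- bytes of a word register.

```python
def bswap32(value: int) -> int:
    """Reverse the order of the four bytes of a 32-bit value."""
    return int.from_bytes((value & 0xFFFFFFFF).to_bytes(4, "little"), "big")


def swap_bytes16(value: int) -> int:
    """Exchange the low and high bytes of a 16-bit value."""
    return ((value & 0xFF) << 8) | ((value >> 8) & 0xFF)


swapped32 = bswap32(0x12345678)
swapped16 = swap_bytes16(0x1234)
```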
- H{insBT} ic{BT}, ic{BTC}, ic{BTR}, ic{BTS}: Bit Test
- c BT r/m16,reg16 ; o16 0F A3 /r [386]
- c BT r/m32,reg32 ; o32 0F A3 /r [386]
- c BT r/m16,imm8 ; o16 0F BA /4 ib [386]
- c BT r/m32,imm8 ; o32 0F BA /4 ib [386]
- c BTC r/m16,reg16 ; o16 0F BB /r [386]
- c BTC r/m32,reg32 ; o32 0F BB /r [386]
- c BTC r/m16,imm8 ; o16 0F BA /7 ib [386]
- c BTC r/m32,imm8 ; o32 0F BA /7 ib [386]
- c BTR r/m16,reg16 ; o16 0F B3 /r [386]
- c BTR r/m32,reg32 ; o32 0F B3 /r [386]
- c BTR r/m16,imm8 ; o16 0F BA /6 ib [386]
- c BTR r/m32,imm8 ; o32 0F BA /6 ib [386]
- c BTS r/m16,reg16 ; o16 0F AB /r [386]
- c BTS r/m32,reg32 ; o32 0F AB /r [386]
- c BTS r/m16,imm8 ; o16 0F BA /5 ib [386]
- c BTS r/m32,imm8 ; o32 0F BA /5 ib [386]
- These instructions all test one bit of their first operand, whose
- index is given by the second operand, and store the value of that
- bit into the carry flag. Bit indices are from 0 (least significant)
- to 15 or 31 (most significant).
- In addition to storing the original value of the bit into the carry
- flag, c{BTR} also resets (clears) the bit in the operand itself.
- c{BTS} sets the bit, and c{BTC} complements the bit. c{BT} does
- not modify its operands.
- The bit offset should be less than the size of the operand in bits.
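- The four instructions differ only in what they do to the tested bit
- after copying it into the carry flag, as this Python sketch shows.
- The function c{bit_test} is a hypothetical helper working on a
- register operand; it accepts bit indices from 0 up to one less than
- the operand size, matching the range of 0 to 15 or 31 given above.

```python
def bit_test(op: str, value: int, index: int, size: int = 32):
    """Return (carry, new value) for BT, BTR, BTS or BTC on a size-bit operand."""
    if not 0 <= index < size:
        raise ValueError("bit index out of range")
    mask = 1 << index
    carry = bool(value & mask)   # original value of the bit
    if op == "BTR":
        value &= ~mask           # reset (clear) the bit
    elif op == "BTS":
        value |= mask            # set the bit
    elif op == "BTC":
        value ^= mask            # complement the bit
    elif op != "BT":
        raise ValueError("unknown instruction")
    return carry, value
```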
- H{insCALL} ic{CALL}: Call Subroutine
- c CALL imm ; E8 rw/rd [8086]
- c CALL imm:imm16 ; o16 9A iw iw [8086]
- c CALL imm:imm32 ; o32 9A id iw [386]
- c CALL FAR mem16 ; o16 FF /3 [8086]
- c CALL FAR mem32 ; o32 FF /3 [386]
- c CALL r/m16 ; o16 FF /2 [8086]
- c CALL r/m32 ; o32 FF /2 [386]
- c{CALL} calls a subroutine, by means of pushing the current
- instruction pointer (c{IP}) and optionally c{CS} as well on the
- stack, and then jumping to a given address.
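- For the c{CALL imm} form (c{E8 rw/rd}), the operand is stored as
- the difference between the target and the address of the end of the
- instruction, as described for the c{rw} and c{rd} codes in
- k{iref-opc}. The Python sketch below works this out; the function
- name and the example addresses are hypothetical.

```python
def encode_call_near(call_address: int, target: int, bits: int = 32) -> bytes:
    """Encode CALL imm (E8 rw/rd) for an instruction located at call_address."""
    size = 2 if bits == 16 else 4             # rw in BITS 16, rd in BITS 32
    end_of_instruction = call_address + 1 + size
    mask = (1 << (8 * size)) - 1
    displacement = (target - end_of_instruction) & mask
    return bytes([0xE8]) + displacement.to_bytes(size, "little")


call_next = encode_call_near(0x1000, 0x1005)   # target is the next instruction
call_self = encode_call_near(0x1000, 0x1000)   # target is the CALL itself
```

- A call to the immediately following instruction therefore has a zero
- displacement, and a call to the c{CALL} itself has a displacement of
- minus five in c{BITS 32}.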