nasmdoc.src: source contents

c section .data align=16
switches to the section c{.data} and also specifies that it must be
aligned on a 16-byte boundary.
The parameter to c{ALIGN} is the required alignment in bytes, which
determines how many low bits of the section start address must be
forced to zero: c{align=16}, for example, forces the low four bits to
zero. The alignment value given may be any power of two.I{section alignment, in
bin}I{segment alignment, in bin}I{alignment, in bin sections}
H{objfmt} ic{obj}: i{Microsoft OMF}I{OMF} Object Files
The c{obj} file format (NASM calls it c{obj} rather than c{omf}
for historical reasons) is the one produced by i{MASM} and
i{TASM}, which is typically fed to 16-bit DOS linkers to produce
ic{.EXE} files. It is also the format used by i{OS/2}.
c{obj} provides a default output file-name extension of c{.obj}.
c{obj} is not exclusively a 16-bit format, though: NASM has full
support for the 32-bit extensions to the format. In particular,
32-bit c{obj} format files are used by i{Borland's Win32
compilers}, instead of using Microsoft's newer ic{win32} object
file format.
The c{obj} format does not define any special segment names: you
can call your segments anything you like. Typical names for segments
in c{obj} format files are c{CODE}, c{DATA} and c{BSS}.
If your source file contains code before specifying an explicit
c{SEGMENT} directive, then NASM will invent its own segment called
ic{__NASMDEFSEG} for you.
When you define a segment in an c{obj} file, NASM defines the
segment name as a symbol as well, so that you can obtain the segment
address of that segment. So, for example:
c segment data
c dvar: dw 1234
c segment code
c function: mov ax,data ; get segment address of data
c mov ds,ax ; and move it into DS
c inc word [dvar] ; now this reference will work
c ret
The c{obj} format also enables the use of the ic{SEG} and
ic{WRT} operators, so that you can write code which does things
like
c extern foo
c mov ax,seg foo ; get preferred segment of foo
c mov ds,ax
c mov ax,data ; a different segment
c mov es,ax
c mov ax,[ds:foo] ; this accesses `foo'
c mov [es:foo wrt data],bx ; so does this
S{objseg} c{obj} Extensions to the c{SEGMENT}
DirectiveI{SEGMENT, obj extensions to}
The c{obj} output format extends the c{SEGMENT} (or c{SECTION})
directive to allow you to specify various properties of the segment
you are defining. This is done by appending extra qualifiers to the
end of the segment-definition line. For example,
c segment code private align=16
defines the segment c{code}, but also declares it to be a private
segment, and requires that the portion of it described in this code
module must be aligned on a 16-byte boundary.
The available qualifiers are:
b ic{PRIVATE}, ic{PUBLIC}, ic{COMMON} and ic{STACK} specify
the combination characteristics of the segment. c{PRIVATE} segments
do not get combined with any others by the linker; c{PUBLIC} and
c{STACK} segments get concatenated together at link time; and
c{COMMON} segments all get overlaid on top of each other rather
than stuck end-to-end.
b ic{ALIGN} is used, as shown above, to specify the alignment of the
segment start address in bytes, and hence how many of its low bits
must be forced to zero. The alignment
value given may be any power of two from 1 to 4096; in reality, the
only values supported are 1, 2, 4, 16, 256 and 4096, so if 8 is
specified it will be rounded up to 16, and 32, 64 and 128 will all
be rounded up to 256, and so on. Note that alignment to 4096-byte
boundaries is a i{PharLap} extension to the format and may not be
supported by all linkers.I{section alignment, in OBJ}I{segment
alignment, in OBJ}I{alignment, in OBJ sections}
b ic{CLASS} can be used to specify the segment class; this feature
indicates to the linker that segments of the same class should be
placed near each other in the output file. The class name can be any
word, e.g. c{CLASS=CODE}.
b ic{OVERLAY}, like c{CLASS}, is specified with an arbitrary word
as an argument, and provides overlay information to an
overlay-capable linker.
b Segments can be declared as ic{USE16} or ic{USE32}, which has
the effect of recording the choice in the object file and also
ensuring that NASM's default assembly mode when assembling in that
segment is 16-bit or 32-bit respectively.
b When writing i{OS/2} object files, you should declare 32-bit
segments as ic{FLAT}, which causes the default segment base for
anything in the segment to be the special group c{FLAT}, and also
defines the group if it is not already defined.
b The c{obj} file format also allows segments to be declared as
having a pre-defined absolute segment address, although no linkers
are currently known to make sensible use of this feature;
nevertheless, NASM allows you to declare a segment such as
c{SEGMENT SCREEN ABSOLUTE=0xB800} if you need to. The ic{ABSOLUTE}
and c{ALIGN} keywords are mutually exclusive.
NASM's default segment attributes are c{PUBLIC}, c{ALIGN=1}, no
class, no overlay, and c{USE16}.
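As a rough sketch of how these qualifiers combine, the declarations
below use only the qualifiers described above. The segment names are
made up for illustration, and the exact diagnostics NASM prints for the
rounded alignment are an assumption.

```nasm
; Hypothetical segment names, chosen only for illustration.
segment mycode  public  align=16 class=CODE use16
segment mydata  public  align=4  class=DATA
segment scratch private align=8         ; 8 is not supported: rounded up to 16
segment mystack stack   align=16 class=STACK
segment screen  absolute=0xB800         ; ABSOLUTE given, so no ALIGN
```

Leaving out every qualifier, as in c{segment plain}, would give the
defaults listed above: c{PUBLIC}, c{ALIGN=1} and c{USE16}.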
S{group} ic{GROUP}: Defining Groups of SegmentsI{segments, groups of}
The c{obj} format also allows segments to be grouped, so that a
single segment register can be used to refer to all the segments in
a group. NASM therefore supplies the c{GROUP} directive, whereby
you can code
c segment data
c ; some data
c segment bss
c ; some uninitialised data
c group dgroup data bss
which will define a group called c{dgroup} to contain the segments
c{data} and c{bss}. Like c{SEGMENT}, c{GROUP} causes the group
name to be defined as a symbol, so that you can refer to a variable
c{var} in the c{data} segment as c{var wrt data} or as c{var wrt
dgroup}, depending on which segment value is currently in your
segment register.
If you just refer to c{var}, however, and c{var} is declared in a
segment which is part of a group, then NASM will default to giving
you the offset of c{var} from the beginning of the e{group}, not
the e{segment}. Therefore c{SEG var}, also, will return the group
base rather than the segment base.
NASM will allow a segment to be part of more than one group, but
will generate a warning if you do this. Variables declared in a
segment which is part of more than one group will default to being
relative to the first group that was defined to contain the segment.
A group does not have to contain any segments; you can still make
c{WRT} references to a group which does not contain the variable
you are referring to. OS/2, for example, defines the special group
c{FLAT} with no segments in it.
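The difference between group-relative and segment-relative references
can be sketched as follows. The names c{var}, c{buf} and c{dgroup}
are illustrative only, and the fragment assumes nothing beyond the
behaviour described above.

```nasm
segment data
var:    dw      0
segment bss
buf:    resb    16
group   dgroup data bss

segment code
        mov     ax,dgroup
        mov     ds,ax                   ; DS = base of the group
        mov     bx,var                  ; default: offset from start of dgroup
        mov     ax,[var]                ; so this works with DS = dgroup

        mov     ax,data
        mov     es,ax                   ; ES = base of segment data
        mov     bx,var wrt data         ; offset from start of segment data
        mov     ax,[es:var wrt data]    ; same word, addressed via ES
```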
S{uppercase} ic{UPPERCASE}: Disabling Case Sensitivity in Output
Although NASM itself is i{case sensitive}, some OMF linkers are
not; therefore it can be useful for NASM to output single-case
object files. The c{UPPERCASE} format-specific directive causes all
segment, group and symbol names that are written to the object file
to be forced to upper case just before being written. Within a
source file, NASM is still case-sensitive; but the object file can
be written entirely in upper case if desired.
c{UPPERCASE} is used alone on a line; it requires no parameters.
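A minimal sketch of its use, with a hypothetical symbol name; the
comments describe the effect stated above rather than verified linker
output.

```nasm
uppercase                       ; names are forced to upper case on output

global  myfunc                  ; hypothetical symbol: written as MYFUNC
segment code                    ; segment name written as CODE
myfunc: ret                     ; within the source, still spelled myfunc
```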
S{import} ic{IMPORT}: Importing DLL SymbolsI{DLL symbols,
importing}I{symbols, importing from DLLs}
The c{IMPORT} format-specific directive defines a symbol to be
imported from a DLL, for use if you are writing a DLL's i{import
library} in NASM. You still need to declare the symbol as c{EXTERN}
as well as using the c{IMPORT} directive.
The c{IMPORT} directive takes two required parameters, separated by
white space, which are (respectively) the name of the symbol you
wish to import and the name of the library you wish to import it
from. For example:
c import WSAStartup wsock32.dll
A third optional parameter gives the name by which the symbol is
known in the library you are importing it from, in case this is not
the same as the name you wish the symbol to be known by to your code
once you have imported it. For example:
c import asyncsel wsock32.dll WSAAsyncSelect
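Since c{IMPORT} does not replace c{EXTERN}, a complete declaration
pairs the two directives. The following sketch reuses the symbols
from the examples above.

```nasm
extern  WSAStartup
import  WSAStartup wsock32.dll

extern  asyncsel                            ; name used in this source file
import  asyncsel wsock32.dll WSAAsyncSelect ; name known inside the DLL
```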
S{export} ic{EXPORT}: Exporting DLL SymbolsI{DLL symbols,
exporting}I{symbols, exporting from DLLs}
The c{EXPORT} format-specific directive defines a global symbol to
be exported as a DLL symbol, for use if you are writing a DLL in
NASM. You still need to declare the symbol as c{GLOBAL} as well as
using the c{EXPORT} directive.
c{EXPORT} takes one required parameter, which is the name of the
symbol you wish to export, as it was defined in your source file. An
optional second parameter (separated by white space from the first)
gives the e{external} name of the symbol: the name by which you
wish the symbol to be known to programs using the DLL. If this name
is the same as the internal name, you may leave the second parameter
off.
Further parameters can be given to define attributes of the exported
symbol. These parameters, like the second, are separated by white
space. If further parameters are given, the external name must also
be specified, even if it is the same as the internal name. The
available attributes are:
b c{resident} indicates that the exported name is to be kept
resident by the system loader. This is an optimisation for
frequently used symbols imported by name.
b c{nodata} indicates that the exported symbol is a function which
does not make use of any initialised data.
b c{parm=NNN}, where c{NNN} is an integer, sets the number of
parameter words for the case in which the symbol is a call gate
between 32-bit and 16-bit segments.
b An attribute which is just a number indicates that the symbol
should be exported with an identifying number (ordinal), and gives
the desired number.
For example:
c export myfunc
c export myfunc TheRealMoreFormalLookingFunctionName
c export myfunc myfunc 1234 ; export by ordinal
c export myfunc myfunc resident parm=23 nodata
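In the same way, c{EXPORT} has to be paired with c{GLOBAL}. A
minimal sketch, in which the external name c{MyFunc} and the ordinal
are arbitrary illustrative choices:

```nasm
segment code use32

global  myfunc                  ; still required alongside EXPORT
export  myfunc MyFunc 1         ; external name MyFunc, exported as ordinal 1

myfunc: ret
```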
S{dotdotstart} ic{..start}: Defining the i{Program Entry
Point}
OMF linkers require exactly one of the object files being linked to
define the program entry point, where execution will begin when the
program is run. If the object file that defines the entry point is
assembled using NASM, you specify the entry point by declaring the
special symbol c{..start} at the point where you wish execution to
begin.
S{objextern} c{obj} Extensions to the c{EXTERN}
DirectiveI{EXTERN, obj extensions to}
If you declare an external symbol with the directive
c extern foo
then references such as c{mov ax,foo} will give you the offset of
c{foo} from its preferred segment base (as specified in whichever
module c{foo} is actually defined in). So to access the contents of
c{foo} you will usually need to do something like
c mov ax,seg foo ; get preferred segment base
c mov es,ax ; move it into ES
c mov ax,[es:foo] ; and use offset `foo' from it
This is a little unwieldy, particularly if you know that an external
is going to be accessible from a given segment or group, say
c{dgroup}. So if c{DS} already contained c{dgroup}, you could
simply code
c mov ax,[foo wrt dgroup]
However, having to type this every time you want to access c{foo}
can be a pain; so NASM allows you to declare c{foo} in the
alternative form
c extern foo:wrt dgroup
This form causes NASM to pretend that the preferred segment base of
c{foo} is in fact c{dgroup}; so the expression c{seg foo} will
now return c{dgroup}, and the expression c{foo} is equivalent to
c{foo wrt dgroup}.
This I{default-WRT mechanism}default-c{WRT} mechanism can be used
to make externals appear to be relative to any group or segment in
your program. It can also be applied to common variables: see
k{objcommon}.
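A sketch of the default-c{WRT} form in use. It assumes that c{foo}
is defined in some other module, in a segment that the linker places in
c{dgroup}; the segment and group names are illustrative.

```nasm
segment data
        ; some data
group   dgroup data

extern  foo:wrt dgroup          ; foo itself is defined elsewhere

segment code
        mov     ax,dgroup
        mov     ds,ax
        mov     ax,[foo]        ; treated as [foo wrt dgroup]
        mov     bx,seg foo      ; BX = dgroup
```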
S{objcommon} c{obj} Extensions to the c{COMMON}
DirectiveI{COMMON, obj extensions to}
The c{obj} format allows common variables to be either nearI{near
common variables} or farI{far common variables}; NASM allows you to
specify which your variables should be by the use of the syntax
c common nearvar 2:near ; `nearvar' is a near common
c common farvar 10:far ; and `farvar' is far
Far common variables may be greater in size than 64KB, and so the
OMF specification says that they are declared as a number of
e{elements} of a given size. So a 10-byte far common variable could
be declared as ten one-byte elements, five two-byte elements, two
five-byte elements or one ten-byte element.
Some OMF linkers require the I{element size, in common
variables}I{common variables, element size}element size, as well as
the variable size, to match when resolving common variables declared
in more than one module. Therefore NASM must allow you to specify
the element size on your far common variables. This is done by the
following syntax:
c common c_5by2 10:far 5 ; two five-byte elements
c common c_2by5 10:far 2 ; five two-byte elements
If no element size is specified, the default is 1. Also, the c{FAR}
keyword is not required when an element size is specified, since
only far commons may have element sizes at all. So the above
declarations could equivalently be
c common c_5by2 10:5 ; two five-byte elements
c common c_2by5 10:2 ; five two-byte elements
In addition to these extensions, the c{COMMON} directive in c{obj}
also supports default-c{WRT} specification like c{EXTERN} does
(explained in k{objextern}). So you can also declare things like
c common foo 10:wrt dgroup
c common bar 16:far 2:wrt data
c common baz 24:wrt data:6
H{win32fmt} ic{win32}: Microsoft Win32 Object Files
The c{win32} output format generates Microsoft Win32 object files,
suitable for passing to Microsoft linkers such as i{Visual C++}.
Note that Borland Win32 compilers do not use this format, but use
c{obj} instead (see k{objfmt}).
c{win32} provides a default output file-name extension of c{.obj}.
Note that although Microsoft say that Win32 object files follow the
COFF (Common Object File Format) standard, the object files produced
by Microsoft Win32 compilers are not compatible with COFF linkers
such as DJGPP's, and vice versa. This is due to a difference of
opinion over the precise semantics of PC-relative relocations. To
produce COFF files suitable for DJGPP, use NASM's c{coff} output
format; conversely, the c{coff} format does not produce object
files that Win32 linkers can generate correct output from.
S{win32sect} c{win32} Extensions to the c{SECTION}
DirectiveI{SECTION, win32 extensions to}
Like the c{obj} format, c{win32} allows you to specify additional
information on the c{SECTION} directive line, to control the type
and properties of sections you declare. Section types and properties
are generated automatically by NASM for the i{standard section names}
c{.text}, c{.data} and c{.bss}, but may still be overridden by
these qualifiers.
The available qualifiers are:
b c{code}, or equivalently c{text}, defines the section to be a
code section. This marks the section as readable and executable, but
not writable, and also indicates to the linker that the type of the
section is code.
b c{data} and c{bss} define the section to be a data section,
analogously to c{code}. Data sections are marked as readable and
writable, but not executable. c{data} declares an initialised data
section, whereas c{bss} declares an uninitialised data section.
b c{info} defines the section to be an i{informational section},
which is not included in the executable file by the linker, but may
(for example) pass information e{to} the linker. For example,
declaring an c{info}-type section called ic{.drectve} causes the
linker to interpret the contents of the section as command-line
options.
b c{align=}, used with a trailing number as in c{obj}, gives the
I{section alignment, in win32}I{alignment, in win32
sections}alignment requirements of the section. The maximum you may
specify is 64: the Win32 object file format contains no means to
request a greater section alignment than this. If alignment is not
explicitly specified, the defaults are 16-byte alignment for code
sections, and 4-byte alignment for data (and BSS) sections.
Informational sections get a default alignment of 1 byte (no
alignment), though the value does not matter.
The defaults assumed by NASM if you do not specify the above
qualifiers are:
c section .text code align=16
c section .data data align=4
c section .bss bss align=4
Any other section name is treated by default like c{.text}.
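Because non-standard section names default to code, a data section
with a name of your own needs an explicit qualifier. The sketch below
uses a hypothetical section name c{consts}, and assumes a Microsoft
linker that accepts the c{/defaultlib:} option in c{.drectve}.

```nasm
section .drectve info                   ; read by the linker, not linked in
        db      '/defaultlib:user32.lib '

section consts data align=8             ; hypothetical name: data, not code
table:  dd      1, 2, 3, 4

section .text                           ; defaults apply: code align=16
        mov     eax,[table]
        ret
```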
H{cofffmt} ic{coff}: i{Common Object File Format}
The c{coff} output type produces COFF object files suitable for
linking with the i{DJGPP} linker.
c{coff} provides a default output file-name extension of c{.o}.
The c{coff} format supports the same extensions to the c{SECTION}
directive as c{win32} does, except that the c{align} qualifier and
the c{info} section type are not supported.
H{elffmt} ic{elf}: i{Linux ELF}I{Executable and Linkable
Format}Object Files
The c{elf} output format generates ELF32 (Executable and Linkable
Format) object files, as used by Linux. c{elf} provides a default
output file-name extension of c{.o}.
S{elfsect} c{elf} Extensions to the c{SECTION}
DirectiveI{SECTION, elf extensions to}
Like the c{obj} format, c{elf} allows you to specify additional
information on the c{SECTION} directive line, to control the type
and properties of sections you declare. Section types and properties
are generated automatically by NASM for the i{standard section
names} ic{.text}, ic{.data} and ic{.bss}, but may still be
overridden by these qualifiers.
The available qualifiers are:
b ic{alloc} defines the section to be one which is loaded into
memory when the program is run. ic{noalloc} defines it to be one
which is not, such as an informational or comment section.
b ic{exec} defines the section to be one which should have execute
permission when the program is run. ic{noexec} defines it as one
which should not.
b ic{write} defines the section to be one which should be writable
when the program is run. ic{nowrite} defines it as one which should
not.
b ic{progbits} defines the section to be one with explicit contents
stored in the object file: an ordinary code or data section, for
example. ic{nobits} defines the section to be one with no explicit
contents given, such as a BSS section.
b c{align=}, used with a trailing number as in c{obj}, gives the
I{section alignment, in elf}I{alignment, in elf sections}alignment
requirements of the section.
The defaults assumed by NASM if you do not specify the above
qualifiers are:
c section .text progbits alloc exec nowrite align=16
c section .data progbits alloc noexec write align=4
c section .bss nobits alloc noexec write align=4
c section other progbits alloc noexec nowrite align=1
(Any section name other than c{.text}, c{.data} and c{.bss} is
treated by default like c{other} in the above code.)
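As a sketch of overriding those defaults, the following declares a
read-only data section and a section that is not loaded at run time.
Both section names are ordinary user-chosen names here, and c{myinfo}
in particular is hypothetical.

```nasm
section .rodata progbits alloc noexec nowrite align=4
msg:    db      'hello', 10             ; loaded, but neither writable nor executable

section myinfo progbits noalloc noexec nowrite
        db      'informational text'    ; stored in the file, not loaded

section .text                           ; defaults: progbits alloc exec nowrite
        mov     eax,msg
        ret
```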
S{elfwrt} i{Position-Independent Code}I{PIC}: c{elf} Special
Symbols and ic{WRT}
The ELF specification contains enough features to allow
position-independent code (PIC) to be written, which makes i{ELF
shared libraries} very flexible. However, it also means NASM has to
be able to generate a variety of strange relocation types in ELF
object files, if it is to be an assembler which can write PIC.
Since ELF does not support segment-base references, the c{WRT}
operator is not used for its normal purpose; therefore NASM's
c{elf} output format makes use of c{WRT} for a different purpose,
namely the PIC-specific I{relocations, PIC-specific}relocation
types.
c{elf} defines five special symbols which you can use as the
right-hand side of the c{WRT} operator to obtain PIC relocation
types. They are ic{..gotpc}, ic{..gotoff}, ic{..got},
ic{..plt} and ic{..sym}. Their functions are summarised here:
b Referring to the symbol marking the global offset table base
using c{wrt ..gotpc} will end up giving the distance from the
beginning of the current section to the global offset table.
(ic{_GLOBAL_OFFSET_TABLE_} is the standard symbol name used to
refer to the i{GOT}.) So you would then need to add ic{$$} to the
result to get the real address of the GOT.
b Referring to a location in one of your own sections using c{wrt
..gotoff} will give the distance from the beginning of the GOT to
the specified location, so that adding on the address of the GOT
would give the real address of the location you wanted.
b Referring to an external or global symbol using c{wrt ..got}
causes the linker to build an entry e{in} the GOT containing the
address of the symbol, and the reference gives the distance from the
beginning of the GOT to the entry; so you can add on the address of
the GOT, load from the resulting address, and end up with the
address of the symbol.
b Referring to a procedure name using c{wrt ..plt} causes the
linker to build a i{procedure linkage table} entry for the symbol,
and the reference gives the address of the i{PLT} entry. You can
only use this in contexts which would generate a PC-relative
relocation normally (i.e. as the destination for c{CALL} or
c{JMP}), since ELF contains no relocation type to refer to PLT
entries absolutely.
b Referring to a symbol name using c{wrt ..sym} causes NASM to
write an ordinary relocation, but instead of making the relocation
relative to the start of the section and then adding on the offset
to the symbol, it will write a relocation record aimed directly at
the symbol in question. The distinction is a necessary one due to a
peculiarity of the dynamic linker.
A fuller explanation of how to use these relocation types to write
shared libraries entirely in NASM is given in k{picdll}.
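As a brief sketch of how four of these symbols fit together in 32-bit
code, the fragment below first loads the GOT address into c{EBX} and
then uses it. The names c{myvar}, c{extvar} and c{extfunc} are
hypothetical, and the choice of c{EBX} follows common ELF practice
rather than anything NASM requires.

```nasm
extern  _GLOBAL_OFFSET_TABLE_
extern  extvar                  ; hypothetical external data symbol
extern  extfunc                 ; hypothetical external function

section .data
myvar:  dd      0               ; hypothetical local variable

section .text
func:   push    ebx
        call    .get_GOT
.get_GOT:
        pop     ebx             ; EBX = run-time address of .get_GOT
        add     ebx,_GLOBAL_OFFSET_TABLE_+$$-.get_GOT wrt ..gotpc
                                ; EBX = run-time address of the GOT

        lea     eax,[ebx+myvar wrt ..gotoff]    ; EAX = address of myvar
        mov     eax,[ebx+extvar wrt ..got]      ; EAX = address of extvar
        call    extfunc wrt ..plt               ; call through the PLT

        pop     ebx
        ret
```

The c{add} line is where c{$$} gets added to the c{..gotpc} value, as
described above; subtracting c{.get_GOT} and adding the popped return
address then turns the section-relative distance into a real address.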
S{elfglob} c{elf} Extensions to the c{GLOBAL} DirectiveI{GLOBAL,
elf extensions to}I{GLOBAL, aoutb extensions to}
ELF object files can contain more information about a global symbol
than just its address: they can contain the I{symbol sizes,
specifying}I{size, of symbols}size of the symbol and its I{symbol
types, specifying}I{type, of symbols}type as well. These are not
merely debugger conveniences, but are actually necessary when the
program being written is a i{shared library}. NASM therefore
supports some extensions to the c{GLOBAL} directive, allowing you
to specify these features.
You can specify whether a global variable is a function or a data
object by suffixing the name with a colon and the word
ic{function} or ic{data}. (ic{object} is a synonym for
c{data}.) For example:
c global hashlookup:function, hashtable:data
exports the global symbol c{hashlookup} as a function and
c{hashtable} as a data object.
You can also specify the size of the data associated with the
symbol, as a numeric expression (which may involve labels, and even
forward references) after the type specifier. Like this:
c global hashtable:data (hashtable.end - hashtable)
c hashtable:
c db this,that,theother ; some data here
c .end:
This makes NASM automatically calculate the length of the table and
place that information into the ELF symbol table.
Declaring the type and size of global symbols is necessary when
writing shared library code. For more information, see
k{picglobal}.
S{elfcomm} c{elf} Extensions to the c{COMMON} DirectiveI{COMMON,
elf extensions to}
ELF also allows you to specify alignment requirements I{common
variables, alignment in elf}I{alignment, of elf common variables}on
common variables. This is done by putting a number (which must be a
power of two) after the name and size of the common variable,
separated (as usual) by a colon. For example, an array of
doublewords would benefit from 4-byte alignment:
c common dwordarray 128:4
This declares the total size of the array to be 128 bytes, and
requires that it be aligned on a 4-byte boundary.
H{aoutfmt} ic{aout}: Linux I{a.out, Linux version}c{a.out} Object Files
The c{aout} format generates c{a.out} object files, in the form
used by early Linux systems. (These differ from other c{a.out}
object files in that the magic number in the first four bytes of the
file is different. Also, some implementations of c{a.out}, for
example NetBSD's, support position-independent code, which Linux's
implementation doesn't.)
c{aout} provides a default output file-name extension of c{.o}.
c{a.out} is a very simple object format. It supports no special
directives, no special symbols, no use of c{SEG} or c{WRT}, and no
extensions to any standard directives. It supports only the three
i{standard section names} ic{.text}, ic{.data} and ic{.bss}.
H{aoutfmt} ic{aoutb}: i{NetBSD}/i{FreeBSD}/i{OpenBSD}
I{a.out, BSD version}c{a.out} Object Files
The c{aoutb} format generates c{a.out} object files, in the form
used by the various free BSD Unix clones, NetBSD, FreeBSD and
OpenBSD. For simple object files, this object format is exactly the
same as c{aout} except for the magic number in the first four bytes
of the file. However, the c{aoutb} format supports
I{PIC}i{position-independent code} in the same way as the c{elf}
format, so you can use it to write BSD i{shared libraries}.
c{aoutb} provides a default output file-name extension of c{.o}.
c{aoutb} supports no special directives, no special symbols, and
only the three i{standard section names} ic{.text}, ic{.data}
and ic{.bss}. However, it also supports the same use of ic{WRT} as
c{elf} does, to provide position-independent code relocation types.
See k{elfwrt} for full documentation of this feature.
c{aoutb} also supports the same extensions to the c{GLOBAL}
directive as c{elf} does: see k{elfglob} for documentation of
this.
H{as86fmt} c{as86}: Linux ic{as86} Object Files
The Linux 16-bit assembler c{as86} has its own non-standard object
file format. Although its companion linker ic{ld86} produces
something close to ordinary c{a.out} binaries as output, the object
file format used to communicate between c{as86} and c{ld86} is not
itself c{a.out}.
NASM supports this format, just in case it is useful, as c{as86}.
c{as86} provides a default output file-name extension of c{.o}.
c{as86} is a very simple object format (from the NASM user's point
of view). It supports no special directives, no special symbols, no
use of c{SEG} or c{WRT}, and no extensions to any standard
directives. It supports only the three i{standard section names}
ic{.text}, ic{.data} and ic{.bss}.
H{rdffmt} I{RDOFF}ic{rdf}: i{Relocatable Dynamic Object File
Format}
The c{rdf} output format produces RDOFF object files. RDOFF
(Relocatable Dynamic Object File Format) is a home-grown object-file
format, designed alongside NASM itself and reflecting in its file
format the internal structure of the assembler.
RDOFF is not used by any well-known operating systems. Those writing
their own systems, however, may well wish to use RDOFF as their
object format, on the grounds that it is designed primarily for
simplicity and contains very little file-header bureaucracy.
The Unix NASM archive, and the DOS archive which includes sources,
both contain an I{rdoff subdirectory}c{rdoff} subdirectory holding
a set of RDOFF utilities: an RDF linker, an RDF static-library
manager, an RDF file dump utility, and a program which will load and
execute an RDF executable under Linux.
c{rdf} supports only the i{standard section names} ic{.text},
ic{.data} and ic{.bss}.
S{rdflib} Requiring a Library: The ic{LIBRARY} Directive
RDOFF contains a mechanism for an object file to demand a given
library to be linked to the module, either at load time or run time.
This is done by the c{LIBRARY} directive, which takes a single
argument, the name of the library module:
c library mylib.rdl
H{dbgfmt} ic{dbg}: Debugging Format
The c{dbg} output format is not built into NASM in the default
configuration. If you are building your own NASM executable from the
sources, you can define ic{OF_DBG} in c{outform.h} or on the
compiler command line, and obtain the c{dbg} output format.
The c{dbg} format does not output an object file as such; instead,
it outputs a text file which contains a complete list of all the
transactions between the main body of NASM and the output-format
back end module. It is primarily intended to aid people who want to
write their own output drivers, so that they can get a clearer idea
of the various requests the main program makes of the output driver,
and in what order they happen.
For simple files, one can easily use the c{dbg} format like this:
c nasm -f dbg filename.asm
which will generate a diagnostic file called c{filename.dbg}.
However, this will not work well on files which were designed for a
different object format, because each object format defines its own
macros (usually user-level forms of directives), and those macros
will not be defined in the c{dbg} format. Therefore it can be
useful to run NASM twice, in order to do the preprocessing with the
native object format selected:
c nasm -e -f rdf -o rdfprog.i rdfprog.asm
c nasm -a -f dbg rdfprog.i
This preprocesses c{rdfprog.asm} into c{rdfprog.i}, keeping the
c{rdf} object format selected in order to make sure RDF special
directives are converted into primitive form correctly. Then the
preprocessed source is fed through the c{dbg} format to generate
the final diagnostic output.
This workaround will still typically not work for programs intended
for c{obj} format, because the c{obj} c{SEGMENT} and c{GROUP}
directives have side effects of defining the segment and group names
as symbols; c{dbg} will not do this, so the program will not
assemble. You will have to work around that by defining the symbols
yourself (using c{EXTERN}, for example) if you really need to get a
c{dbg} trace of an c{obj}-specific source file.
c{dbg} accepts any section name and any directives at all, and logs
them all to its output file.
C{16bit} Writing 16-bit Code (DOS, Windows 3/3.1)
This chapter attempts to cover some of the common issues encountered
when writing 16-bit code to run under MS-DOS or Windows 3.x. It
covers how to link programs to produce c{.EXE} or c{.COM} files,
how to write c{.SYS} device drivers, and how to interface assembly
language code with 16-bit C compilers and with Borland Pascal.
H{exefiles} Producing ic{.EXE} Files
Any large program written under DOS needs to be built as a c{.EXE}
file: only c{.EXE} files have the necessary internal structure
required to span more than one 64K segment. i{Windows} programs,
also, have to be built as c{.EXE} files, since Windows does not
support the c{.COM} format.
In general, you generate c{.EXE} files by using the c{obj} output
format to produce one or more ic{.OBJ} files, and then linking
them together using a linker. However, NASM also supports the direct
generation of simple DOS c{.EXE} files using the c{bin} output
format (by using c{DB} and c{DW} to construct the c{.EXE} file
header), and a macro package is supplied to do this. Thanks to
Yann Guidon for contributing the code for this.
NASM may also support c{.EXE} natively as another output format in
future releases.
S{objexe} Using the c{obj} Format To Generate c{.EXE} Files
This section describes the usual method of generating c{.EXE} files
by linking c{.OBJ} files together.
Most 16-bit programming language packages come with a suitable
linker; if you have none of these, there is a free linker called
i{VAL}I{linker, free}, available in c{LZH} archive format from
W{ftp://x2ftp.oulu.fi/pub/msdos/programming/lang/}ic{x2ftp.oulu.fi}.
An LZH archiver can be found at
W{ftp://ftp.simtel.net/pub/simtelnet/msdos/arcers}ic{ftp.simtel.net}.
There is another `free' linker (though this one doesn't come with
sources) called i{FREELINK}, available from
W{http://www.pcorner.com/tpc/old/3-101.html}ic{www.pcorner.com}.
A third, ic{djlink}, written by DJ Delorie, is available at
W{http://www.delorie.com/djgpp/16bit/djlink/}ic{www.delorie.com}.
When linking several c{.OBJ} files into a c{.EXE} file, you should
ensure that exactly one of them has a start point defined (using the
I{program entry point}ic{..start} special symbol defined by the
c{obj} format: see k{dotdotstart}). If no module defines a start
point, the linker will not know what value to give the entry-point
field in the output file header; if more than one defines a start
point, the linker will not know e{which} value to use.
An example of a NASM source file which can be assembled to a
c{.OBJ} file and linked on its own to a c{.EXE} is given here. It
demonstrates the basic principles of defining a stack, initialising
the segment registers, and declaring a start point. This file is
also provided in the I{test subdirectory}c{test} subdirectory of
the NASM archives, under the name c{objexe.asm}.
c segment code
c
c ..start: mov ax,data
c mov ds,ax
c mov ax,stack
c mov ss,ax
c mov sp,stacktop
This initial piece of code sets up c{DS} to point to the data
segment, and initialises c{SS} and c{SP} to point to the top of
the provided stack. Notice that interrupts are implicitly disabled
for one instruction after a move into c{SS}, precisely for this
situation, so that there's no chance of an interrupt occurring
between the loads of c{SS} and c{SP} and not having a stack to
execute on.
Note also that the special symbol c{..start} is defined at the
beginning of this code, which makes that address the entry point
of the resulting executable file.
c mov dx,hello
c mov ah,9
c int 0x21
The above is the main program: load c{DS:DX} with a pointer to the
greeting message (c{hello} is implicitly relative to the segment
c{data}, which was loaded into c{DS} in the setup code, so the
full pointer is valid), and call the DOS print-string function.
c mov ax,0x4c00
c int 0x21
This terminates the program using another DOS system call.
c segment data
c hello: db 'hello, world', 13, 10, '$'
The data segment contains the string we want to display.
c segment stack stack
c resb 64
c stacktop:
The above code declares a stack segment containing 64 bytes of
uninitialised stack space, and points c{stacktop} at the top of it.
The directive c{segment stack stack} defines a segment e{called}
c{stack}, and also of e{type} c{STACK}. The latter is not
necessary to the correct running of the program, but linkers are
likely to issue warnings or errors if your program has no segment of
type c{STACK}.
The above file, when assembled into a c{.OBJ} file, will link on
its own to a valid c{.EXE} file, which when run will print `hello,
world' and then exit.
S{binexe} Using the c{bin} Format To Generate c{.EXE} Files
The c{.EXE} file format is simple enough that it's possible to
build a c{.EXE} file by writing a pure-binary program and sticking
a 32-byte header on the front. This header is simple enough that it
can be generated using c{DB} and c{DW} commands by NASM itself, so
that you can use the c{bin} output format to directly generate
c{.EXE} files.
Included in the NASM archives, in the I{misc subdirectory}c{misc}
subdirectory, is a file ic{exebin.mac} of macros. It defines three
macros: ic{EXE_begin}, ic{EXE_stack} and ic{EXE_end}.
To produce a c{.EXE} file using this method, you should start by
using c{%include} to load the c{exebin.mac} macro package into
your source file. You should then issue the c{EXE_begin} macro call
(which takes no arguments) to generate the file header data. Then
write code as normal for the c{bin} format - you can use all three
standard sections c{.text}, c{.data} and c{.bss}. At the end of
the file you should call the c{EXE_end} macro (again, no arguments),
which defines some symbols to mark section sizes, and these symbols
are referred to in the header code generated by c{EXE_begin}.
In this model, the code you end up writing starts at c{0x100}, just
like a c{.COM} file - in fact, if you strip off the 32-byte header
from the resulting c{.EXE} file, you will have a valid c{.COM}
program. All the segment bases are the same, so you are limited to a
64K program, again just like a c{.COM} file. Note that an c{ORG}
directive is issued by the c{EXE_begin} macro, so you should not
explicitly issue one of your own.
You can't directly refer to your segment base value, unfortunately,
since this would require a relocation in the header, and things
would get a lot more complicated. So you should get your segment
base by copying it out of c{CS} instead.
On entry to your c{.EXE} file, c{SS:SP} are already set up to
point to the top of a 2KB stack. You can adjust the default stack
size of 2KB by calling the c{EXE_stack} macro. For example, to
change the stack size of your program to 64 bytes, you would call
c{EXE_stack 64}.
A sample program which generates a c{.EXE} file in this way is
given in the c{test} subdirectory of the NASM archive, as
c{binexe.asm}.
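A minimal sketch of a program built this way might look like the following. It assumes that c{exebin.mac} can be found on the include path, the label c{hello} is purely illustrative, and the listing is not guaranteed to match the shipped c{binexe.asm} line for line.

```nasm
%include "exebin.mac"           ; assumed to be reachable on the include path

        EXE_begin               ; emits the 32-byte header and the ORG
        EXE_stack 64            ; optional: override the default stack size

        section .text

        mov     ax,cs           ; no relocations are available, so take
        mov     ds,ax           ; the segment base from CS, not a symbol

        mov     dx,hello        ; DS:DX -> '$'-terminated string
        mov     ah,9            ; DOS print-string function
        int     0x21

        mov     ax,0x4c00       ; DOS terminate, exit code 0
        int     0x21

        section .data

hello:  db      'hello, world', 13, 10, '$'

        EXE_end                 ; defines the size symbols the header uses
```

Such a file would be assembled with the c{bin} format and an explicit output name, for example c{nasm -f bin prog.asm -o prog.exe}, where c{prog} is a placeholder name.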
H{comfiles} Producing ic{.COM} Files
While large DOS programs must be written as c{.EXE} files, small
ones are often better written as c{.COM} files. c{.COM} files are
pure binary, and therefore most easily produced using the c{bin}
output format.
S{combinfmt} Using the c{bin} Format To Generate c{.COM} Files
c{.COM} files expect to be loaded at offset c{100h} into their
segment (though the segment may change). Execution then begins at
Ic{ORG}c{100h}, i.e. right at the start of the program. So to
write a c{.COM} program, you would create a source file looking
like
c org 100h
c section .text
c start: ; put your code here
c section .data
c ; put data items here
c section .bss
c ; put uninitialised data here
The c{bin} format puts the c{.text} section first in the file, so
you can declare data or BSS items before beginning to write code if
you want to and the code will still end up at the front of the file
where it belongs.
The BSS (uninitialised data) section does not take up space in the
c{.COM} file itself: instead, addresses of BSS items are resolved
to point at space beyond the end of the file, on the grounds that
this will be free memory when the program is run. Therefore you
should not rely on your BSS being initialised to all zeros when you
run.
To assemble the above program, you should use a command line like
c nasm myprog.asm -fbin -o myprog.com
The c{bin} format would produce a file called c{myprog} if no
explicit output file name were specified, so you have to override it
and give the desired file name.
S{comobjfmt} Using the c{obj} Format To Generate c{.COM} Files
If you are writing a c{.COM} program as more than one module, you
may wish to assemble several c{.OBJ} files and link them together
into a c{.COM} program. You can do this, provided you have a linker
capable of outputting c{.COM} files directly (i{TLINK} does this),
or alternatively a converter program such as ic{EXE2BIN} to
transform the c{.EXE} file output from the linker into a c{.COM}
file.
If you do this, you need to take care of several things:
b The first object file containing code should start its code
segment with a line like c{RESB 100h}. This is to ensure that the
code begins at offset c{100h} relative to the beginning of the code
segment, so that the linker or converter program does not have to
adjust address references within the file when generating the
c{.COM} file. Other assemblers use an ic{ORG} directive for this
purpose, but c{ORG} in NASM is a format-specific directive to the
c{bin} output format, and does not mean the same thing as it does
in MASM-compatible assemblers.
b You don't need to define a stack segment.
b All your segments should be in the same group, so that every time
your code or data references a symbol offset, all offsets are
relative to the same segment base. This is because, when a c{.COM}
file is loaded, all the segment registers contain the same value.
H{sysfiles} Producing ic{.SYS} Files
i{MS-DOS device drivers} - c{.SYS} files - are pure binary files,
similar to c{.COM} files, except that they start at origin zero
rather than c{100h}. Therefore, if you are writing a device driver
using the c{bin} format, you do not need the c{ORG} directive,
since the default origin for c{bin} is zero. Similarly, if you are
using c{obj}, you do not need the c{RESB 100h} at the start of
your code segment.
c{.SYS} files start with a header structure, containing pointers to
the various routines inside the driver which do the work. This
structure should be defined at the start of the code segment, even
though it is not actually code.
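As a rough sketch of that layout, the conventional MS-DOS device header could be written for the c{bin} format as follows. The device name c{MYDEV}, the attribute value and all label names are hypothetical, and the two routines are stubs only: a real driver must decode the request header that DOS passes in c{ES:BX} and store a status word in it, so this fragment is not loadable as it stands.

```nasm
        section .text           ; bin format: origin defaults to zero

header: dd      -1              ; link to next driver (filled in by DOS)
        dw      0x8000          ; attribute word: bit 15 = character device
        dw      strategy        ; offset of the strategy routine
        dw      interrupt       ; offset of the interrupt routine
        db      'MYDEV   '      ; device name, padded to 8 bytes

reqhdr: dd      0               ; saved far pointer to the request header

strategy:
        mov     [cs:reqhdr],bx  ; DOS passes the request header in ES:BX
        mov     [cs:reqhdr+2],es
        retf

interrupt:
        ; stub: a real driver decodes the command in the request
        ; header here and stores a status word before returning
        retf
```

The point of the sketch is only that the header data comes first in the code segment, with the routine offsets it names resolved relative to origin zero.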
For more information on the format of c{.SYS} files, and the data
which has to go in the header structure, a list of books is given in
the Frequently Asked Questions list for the newsgroup
W{news:comp.os.msdos.programmer}ic{comp.os.msdos.programmer}.
H{16c} Interfacing to 16-bit C Programs
This section covers the basics of writing assembly routines that
call, or are called from, C programs. To do this, you would
typically write an assembly module as a c{.OBJ} file, and link it
with your C modules to produce a i{mixed-language program}.
S{16cunder} External Symbol Names
I{C symbol names}I{underscore, in C symbols}C compilers have the
convention that the names of all global symbols (functions or data)
they define are formed by prefixing an underscore to the name as it
appears in the C program. So, for example, the function a C
programmer thinks of as c{printf} appears to an assembly language
programmer as c{_printf}. This means that in your assembly
programs, you can define symbols without a leading underscore, and
not have to worry about name clashes with C symbols.
If you find the underscores inconvenient, you can define macros to
replace the c{GLOBAL} and c{EXTERN} directives as follows:
c %macro cglobal 1
c global _%1
c %define %1 _%1
c %endmacro
c %macro cextern 1
c extern _%1
c %define %1 _%1
c %endmacro
(These forms of the macros only take one argument at a time; a
c{%rep} construct could solve this.)
If you then declare an external like this:
c cextern printf
then the macro will expand it as
c extern _printf
c %define printf _printf
Thereafter, you can reference c{printf} as if it was a symbol, and
the preprocessor will put the leading underscore on where necessary.
The c{cglobal} macro works similarly. You must use c{cglobal}
before defining the symbol in question, but you would have had to do
that anyway if you used c{GLOBAL}.
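The one-argument limitation mentioned above could be lifted along the following lines. This is a sketch rather than part of any supplied macro package: it reuses the c{%rep} and c{%rotate} idiom to process each argument in turn, and the symbol names in the final line are just examples.

```nasm
%macro  cglobal 1-*
  %rep  %0                      ; once per argument supplied
        global  _%1
    %define %1 _%1
    %rotate 1                   ; the next argument becomes %1
  %endrep
%endmacro

%macro  cextern 1-*
  %rep  %0
        extern  _%1
    %define %1 _%1
    %rotate 1
  %endrep
%endmacro

        cextern printf, puts    ; same effect as two one-argument calls
```

With these definitions a single c{cextern} line can declare every C routine a module uses, at the cost of slightly less obvious macro bodies.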
S{16cmodels} i{Memory Models}
NASM contains no mechanism to support the various C memory models
directly; you have to keep track yourself of which one you are
writing for. This means you have to keep track of the following
things:
b In models using a single code segment (tiny, small and compact),
functions are near. This means that function pointers, when stored
in data segments or pushed on the stack as function arguments, are
16 bits long and contain only an offset field (the c{CS} register
never changes its value, and always gives the segment part of the
full function address), and that functions are called using ordinary
near c{CALL} instructions and return using c{RETN} (which, in
NASM, is synonymous with c{RET} anyway). This means both that you
should write your own routines to return with c{RETN}, and that you
should call external C routines with near c{CALL} instructions.
b In models using more than one code segment (medium, large and
huge), functions are far. This means that function pointers are 32
bits long (consisting of a 16-bit offset followed by a 16-bit
segment), and that functions are called using c{CALL FAR} (or
c{CALL seg:offset}) and return using c{RETF}. Again, you should
therefore write your own routines to return with c{RETF} and use
c{CALL FAR} to call external routines.
b In models using a single data segment (tiny, small and medium),
data pointers are 16 bits long, containing only an offset field (the
c{DS} register doesn't change its value, and always gives the
segment part of the full data item address).
b In models using more than one data segment (compact, large and
huge), data pointers are 32 bits long, consisting of a 16-bit offset
followed by a 16-bit segment. You should still be careful not to
modify c{DS} in your routines without restoring it afterwards, but
c{ES} is free for you to use to access the contents of 32-bit data
pointers you are passed.
b The huge memory model allows single data items to exceed 64K in
size. In all other memory models, you can access the whole of a data
item just by doing arithmetic on the offset field of the pointer you
are given, whether a segment field is present or not; in huge model,
you have to be more careful of your pointer arithmetic.
b In most memory models, there is a e{default} data segment, whose
segment address is kept in c{DS} throughout the program. This data
segment is typically the same segment as the stack, kept in c{SS},
so that functions' local variables (which are stored on the stack)
and global data items can both be accessed easily without changing
c{DS}. Particularly large data items are typically stored in other
segments. However, some memory models (though not the standard
ones, usually) allow the assumption that c{SS} and c{DS} hold the
same value to be removed. Be careful about functions' local
variables in this latter case.
In models with a single code segment, the segment is called
ic{_TEXT}, so your code segment must also go by this name in order
to be linked into the same place as the main code segment. In models
with a single data segment, or with a default data segment, it is
called ic{_DATA}.
S{16cfunc} Function Definitions and Function Calls
I{functions, C calling convention}The i{C calling convention} in
16-bit programs is as follows. In the following description, the
words e{caller} and e{callee} are used to denote the function
doing the calling and the function which gets called.
b The caller pushes the function's parameters on the stack, one
after another, in reverse order (right to left, so that the first
argument specified to the function is pushed last).
b The caller then executes a c{CALL} instruction to pass control
to the callee. This c{CALL} is either near or far depending on the
memory model.
b The callee receives control, and typically (although this is not
actually necessary, in functions which do not need to access their
parameters) starts by saving the value of c{SP} in c{BP} so as to
be able to use c{BP} as a base pointer to find its parameters on
the stack. However, the caller was probably doing this too, so part
of the calling convention states that c{BP} must be preserved by
any C function. Hence the callee, if it is going to set up c{BP} as
a ie{frame pointer}, must push the previous value first.
b The callee may then access its parameters relative to c{BP}.
The word at c{[BP]} holds the previous value of c{BP} as it was
pushed; the next word, at c{[BP+2]}, holds the offset part of the
return address, pushed implicitly by c{CALL}. In a small-model
(near) function, the parameters start after that, at c{[BP+4]}; in
a large-model (far) function, the segment part of the return address
lives at c{[BP+4]}, and the parameters begin at c{[BP+6]}. The
leftmost parameter of the function, since it was pushed last, is
accessible at this offset from c{BP}; the others follow, at
successively greater offsets. Thus, in a function such as c{printf}
which takes a variable number of parameters, the pushing of the
parameters in reverse order means that the function knows where to
find its first parameter, which tells it the number and type of the
remaining ones.
b The callee may also wish to decrease c{SP} further, so as to
allocate space on the stack for local variables, which will then be
accessible at negative offsets from c{BP}.
b The callee, if it wishes to return a value to the caller, should
leave the value in c{AL}, c{AX} or c{DX:AX} depending on the size
of the value. Floating-point results are sometimes (depending on the
compiler) returned in c{ST0}.
b Once the callee has finished processing, it restores c{SP} from
c{BP} if it had allocated local stack space, then pops the previous
value of c{BP}, and returns via c{RETN} or c{RETF} depending on
memory model.
b When the caller regains control from the callee, the function
parameters are still on the stack, so it typically adds an immediate
constant to c{SP} to remove them (instead of executing a number of
slow c{POP} instructions). Thus, if a function is accidentally
called with the wrong number of parameters due to a prototype
mismatch, the stack will still be returned to a sensible state since
the caller, which e{knows} how many parameters it pushed, does the
removing.
It is instructive to compare this calling convention with that for
Pascal programs (described in k{16bpfunc}). Pascal has a simpler
convention, since no functions have variable numbers of parameters.
Therefore the callee knows how many parameters it should have been
passed, and is able to deallocate them from the stack itself by
passing an immediate argument to the c{RET} or c{RETF}
instruction, so the caller does not have to do it. Also, the
parameters are pushed in left-to-right order, not right-to-left,
which means that a compiler can give better guarantees about
sequence points without performance suffering.
Thus, you would define a function in C style in the following way.
The following example is for small model:
c global _myfunc
c _myfunc: push bp
c mov bp,sp
c sub sp,0x40 ; 64 bytes of local stack space
c mov bx,[bp+4] ; first parameter to function
c ; some more code
c mov sp,bp ; undo "sub sp,0x40" above
c pop bp
c ret
For a large-model function, you would replace c{RET} by c{RETF},
and look for the first parameter at c{[BP+6]} instead of
c{[BP+4]}. Of course, if one of the parameters is a pointer, then
the offsets of e{subsequent} parameters will change depending on
the memory model as well: far pointers take up four bytes on the
stack when passed as a parameter, whereas near pointers take up two.
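Applying those two changes to the small-model example gives a large-model sketch like the one below. The function name and the amount of local space are carried over unchanged for comparison.

```nasm
        global  _myfunc

_myfunc:
        push    bp
        mov     bp,sp
        sub     sp,0x40         ; 64 bytes of local stack space
        mov     bx,[bp+6]       ; first parameter: past saved BP and the
                                ; offset and segment of the return address
        ; some more code
        mov     sp,bp           ; undo "sub sp,0x40" above
        pop     bp
        retf                    ; far return, since the caller used CALL FAR
```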
At the other end of the process, to call a C function from your
assembly code, you would do something like this:
c extern _printf
c ; and then, further down...
c push word [myint] ; one of my integer variables
c push word mystring ; pointer into my data segment
c call _printf
c add sp,byte 4 ; `byte' saves space
c ; then those data items...
c segment _DATA
c myint dw 1234
c mystring db 'This number -> %d <- should be 1234',10,0
This piece of code is the small-model assembly equivalent of the C
code
c int myint = 1234;
c printf("This number -> %d <- should be 1234\n", myint);
In large model, the function-call code might look more like this. In
this example, it is assumed that c{DS} already holds the segment
base of the segment c{_DATA}. If not, you would have to initialise
it first.
c push word [myint]
c push word seg mystring ; Now push the segment, and...
c push word mystring ; ... offset of "mystring"
c call far _printf
c add sp,byte 6
The integer value still takes up one word on the stack, since large
model does not affect the size of the c{int} data type. The first
argument (pushed last) to c{printf}, however, is a data pointer,
and therefore has to contain a segment and offset part. The segment
should be stored second in memory, and therefore must be pushed
first. (Of course, c{PUSH DS} would have been a shorter instruction
than c{PUSH WORD SEG mystring}, if c{DS} was set up as the above
example assumed.) Then the actual call becomes a far call, since
functions expect far calls in large model; and c{SP} has to be
increased by 6 rather than 4 afterwards to make up for the extra
word of parameters.
S{16cdata} Accessing Data Items
To get at the contents of C variables, or to declare variables which
C can access, you need only declare the names as c{GLOBAL} or
c{EXTERN}. (Again, the names require leading underscores, as stated
in k{16cunder}.) Thus, a C variable declared as c{int i} can be
accessed from assembler as
c extern _i
c mov ax,[_i]
And to declare your own integer variable which C programs can access
as c{extern int j}, you do this (making sure you are assembling in
the c{_DATA} segment, if necessary):
c global _j
c _j dw 0
To access a C array, you need to know the size of the components of
the array. For example, c{int} variables are two bytes long, so if
a C program declares an array as c{int a[10]}, you can access
c{a[3]} by coding c{mov ax,[_a+6]}. (The byte offset 6 is obtained
by multiplying the desired array index, 3, by the size of the array
element, 2.) The sizes of the C base types in 16-bit compilers are:
1 for c{char}, 2 for c{short} and c{int}, 4 for c{long} and
c{float}, and 8 for c{double}.
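When the index is only known at run time, the same scaling can be done in a register. The following small-model sketch assumes c{DS} already addresses the default data segment; the variable c{index} is a hypothetical name introduced for the illustration.

```nasm
        extern  _a              ; C side: int a[10];

        segment _TEXT

        mov     bx,[index]      ; fetch the element number
        shl     bx,1            ; scale by the size of an int (2 bytes)
        mov     ax,[_a+bx]      ; AX = a[index]

        segment _DATA

index   dw      3               ; hypothetical index variable
```

For a four-byte element type such as c{long}, the index would be shifted left by two instead of one.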
To access a C i{data structure}, you need to know the offset from
the base of the structure to the field you are interested in. You
can either do this by converting the C structure definition into a
NASM structure definition (using ic{STRUC}), or by calculating the
one offset and using just that.
To do either of these, you should read your C compiler's manual to
find out how it organises data structures. NASM gives no special
alignment to structure members in its own c{STRUC} macro, so you
have to specify alignment yourself if the C compiler generates it.
Typically, you might find that a structure like
c struct {
c char c;
c int i;
c } foo;
might be four bytes long rather than three, since the c{int} field
would be aligned to a two-byte boundary. However, this sort of
feature tends to be a configurable option in the C compiler, either
using command-line options or c{#pragma} lines, so you have to find
out how your own compiler does it.
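A NASM description of that structure might be sketched as follows, assuming the compiler does align the c{int} field to a two-byte boundary. The name c{foo_t} is hypothetical, and the padding byte has to be adjusted or removed to match what your compiler actually generates.

```nasm
struc   foo_t                   ; hypothetical name for the layout
  .c:   resb    1               ; char c
        resb    1               ; padding, assuming 2-byte alignment
  .i:   resw    1               ; int i
endstruc                        ; also defines foo_t_size (4 here)

        extern  _foo

        mov     ax,[_foo + foo_t.i]     ; AX = foo.i
```

If only one field is ever needed, replacing c{foo_t.i} with the literal offset 2 gives the same result without the structure definition.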
S{16cmacro} ic{c16.mac}: Helper Macros for the 16-bit C Interface
Included in the NASM archives, in the I{misc subdirectory}c{misc}
directory, is a file c{c16.mac} of macros. It defines three macros:
ic{proc}, ic{arg} and ic{endproc}. These are intended to be
used for C-style procedure definitions, and they automate a lot of
the work involved in keeping track of the calling convention.
An example of an assembly function using the macro set is given
here:
c proc _nearproc
c %$i arg
c %$j arg
c mov ax,[bp + %$i]
c mov bx,[bp + %$j]
c add ax,[bx]
c endproc
This defines c{_nearproc} to be a procedure taking two arguments,
the first (c{i}) an integer and the second (c{j}) a pointer to an
integer. It returns c{i + *j}.
Note that the c{arg} macro has an c{EQU} as the first line of its
expansion, and since the label before the macro call gets prepended
to the first line of the expanded macro, the c{EQU} works, defining
c{%$i} to be an offset from c{BP}. A context-local variable is
used, local to the context pushed by the c{proc} macro and popped
by the c{endproc} macro, so that the same argument name can be used
in later procedures. Of course, you don't e{have} to do that.
The macro set produces code for near functions (tiny, small and
compact-model code) by default. You can have it generate far
functions (medium, large and huge-model code) by means of coding
Ic{FARCODE}c{%define FARCODE}. This changes the kind of return
instruction generated by c{endproc}, and also changes the starting
point for the argument offsets. The macro set contains no intrinsic
dependency on whether data pointers are far or not.
c{arg} can take an optional parameter, giving the size of the
argument. If no size is given, 2 is assumed, since it is likely that
many function parameters will be of type c{int}.
The large-model equivalent of the above function would look like this:
c %define FARCODE
c proc _farproc
c %$i arg
c %$j arg 4
c mov ax,[bp + %$i]
c mov bx,[bp + %$j]
c mov es,[bp + %$j + 2]
c add ax,[es:bx] ; j is a far pointer, so read through ES
c endproc
This makes use of the argument to the c{arg} macro to define a
parameter of size 4, because c{j} is now a far pointer. When we
load from c{j}, we must load a segment and an offset.
H{16bp} Interfacing to i{Borland Pascal} Programs
Interfacing to Borland Pascal programs is similar in concept to
interfacing to 16-bit C programs. The differences are:
b The leading underscore required for interfacing to C programs is
not required for Pascal.
b The memory model is always large: functions are far, data
pointers are far, and no data item can be more than 64K long.
(Actually, some functions are near, but only those functions that
are local to a Pascal unit and never called from outside it. All
assembly functions that Pascal calls, and all Pascal functions that
assembly routines are able to call, are far.) However, all static
data declared in a Pascal program goes into the default data
segment, which is the one whose segment address will be in c{DS}
when control is passed to your assembly code. The only things that
do not live in the default data segment are local variables (they
live in the stack segment) and dynamically allocated variables. All
data e{pointers}, however, are far.
b The function calling convention is different - described below.
b Some data types, such as strings, are stored differently.
b There are restrictions on the segment names you are allowed to
use - Borland Pascal will ignore code or data declared in a segment
it doesn't like the name of. The restrictions are described below.
S{16bpfunc} The Pascal Calling Convention
I{functions, Pascal calling convention}I{Pascal calling
convention}The 16-bit Pascal calling convention is as follows. In
the following description, the words e{caller} and e{callee} are
used to denote the function doing the calling and the function which
gets called.
b The caller pushes the function's parameters on the stack, one
after another, in normal order (left to right, so that the first
argument specified to the function is pushed first).
b The caller then executes a far c{CALL} instruction to pass
control to the callee.
b The callee receives control, and typically (although this is not
actually necessary, in functions which do not need to access their
parameters) starts by saving the value of c{SP} in c{BP} so as to
be able to use c{BP} as a base pointer to find its parameters on
the stack. However, the caller was probably doing this too, so part
of the calling convention states that c{BP} must be preserved by
any function. Hence the callee, if it is going to set up c{BP} as a
i{frame pointer}, must push the previous value first.
b The callee may then access its parameters relative to c{BP}.
The word at c{[BP]} holds the previous value of c{BP} as it was
pushed. The next word, at c{[BP+2]}, holds the offset part of the
return address, and the next one at c{[BP+4]} the segment part. The
parameters begin at c{[BP+6]}. The rightmost parameter of the
function, since it was pushed last, is accessible at this offset
from c{BP}; the others follow, at successively greater offsets.
b The callee may also wish to decrease c{SP} further, so as to
allocate space on the stack for local variables, which will then be
accessible at negative offsets from c{BP}.
b The callee, if it wishes to return a value to the caller, should
leave the value in c{AL}, c{AX} or c{DX:AX} depending on the size
of the value. Floating-point results are returned in c{ST0}.
Results of type c{Real} (Borland's own custom floating-point data
type, not handled directly by the FPU) are returned in c{DX:BX:AX}.
To return a result of type c{String}, the caller pushes a pointer
to a temporary string before pushing the parameters, and the callee
places the returned string value at that location. The pointer is
not a parameter, and should not be removed from the stack by the
c{RETF} instruction.
b Once the callee has finished processing, it restores c{SP} from
c{BP} if it had allocated local stack space, then pops the previous
value of c{BP}, and returns via c{RETF}. It uses the form of
c{RETF} with an immediate parameter, giving the number of bytes
taken up by the parameters on the stack. This causes the parameters
to be removed from the stack as a side effect of the return
instruction.
b When the caller regains control from the callee, the function
parameters have already been removed from the stack, so it needs to
do nothing further.
Thus, you would define a function in Pascal style, taking two
c{Integer}-type parameters, in the following way:
c global myfunc
c myfunc: push bp
c mov bp,sp
c sub sp,0x40 ; 64 bytes of local stack space
c mov bx,[bp+8] ; first parameter to function
c mov cx,[bp+6] ; second parameter to function
c ; some more code
c mov sp,bp ; undo "sub sp,0x40" above
c pop bp
c retf 4 ; total size of params is 4
At the other end of the process, to call a Pascal function from your
assembly code, you would do something like this:
c extern SomeFunc
c ; and then, further down...
c push word seg mystring ; Now push the segment, and...
c push word mystring ; ... offset of "mystring"
c push word [myint] ; one of my variables
c call far SomeFunc
This is equivalent to the Pascal code
c procedure SomeFunc(String: PChar; Int: Integer);
c SomeFunc(@mystring, myint);
S{16bpseg} Borland Pascal I{segment names, Borland Pascal}Segment
Name Restrictions
Since Borland Pascal's internal unit file format is completely
different from c{OBJ}, it only makes a very sketchy job of actually
reading and understanding the various information contained in a
real c{OBJ} file when it links that in. Therefore an object file
intended to be linked to a Pascal program must obey a number of
restrictions:
b Procedures and functions must be in a segment whose name is
either c{CODE}, c{CSEG}, or something ending in c{_TEXT}.
b Initialised data must be in a segment whose name is either
c{CONST} or something ending in c{_DATA}.
b Uninitialised data must be in a segment whose name is either
c{DATA}, c{DSEG}, or something ending in c{_BSS}.
b Any other segments in the object file are completely ignored.
c{GROUP} directives and segment attributes are also ignored.
S{16bpmacro} Using ic{c16.mac} With Pascal Programs
The c{c16.mac} macro package, described in k{16cmacro}, can also
be used to simplify writing functions to be called from Pascal
programs, if you code Ic{PASCAL}c{%define PASCAL}. This
definition ensures that functions are far (it implies
ic{FARCODE}), and also causes procedure return instructions to be
generated with an operand.
Defining c{PASCAL} does not change the code which calculates the
argument offsets; you must declare your function's arguments in
reverse order. For example:
c %define PASCAL
c proc _pascalproc
c %$j arg 4
c %$i arg
c mov ax,[bp + %$i]
c mov bx,[bp + %$j]
c mov es,[bp + %$j + 2]
c add ax,[es:bx] ; j is a far pointer, so read through ES
c endproc
This defines the same routine, conceptually, as the example in
k{16cmacro}: it defines a function taking two arguments, an integer
and a pointer to an integer, which returns the sum of the integer
and the contents of the pointer. The only difference between this
code and the large-model C version is that c{PASCAL} is defined
instead of c{FARCODE}, and that the arguments are declared in
reverse order.
C{32bit} Writing 32-bit Code (Unix, Win32, DJGPP)
This chapter attempts to cover some of the common issues involved
when writing 32-bit code, to run under i{Win32} or Unix, or to be
linked with C code generated by a Unix-style C compiler such as
i{DJGPP}. It covers how to write assembly code to interface with
32-bit C routines, and how to write position-independent code for
shared libraries.
Almost all 32-bit code, and in particular all code running under
Win32, DJGPP or any of the PC Unix variants, runs in I{flat memory
model}e{flat} memory model. This means that the segment registers
and paging have already been set up to give you the same 32-bit 4GB
address space no matter what segment you work relative to, and that
you should ignore all segment registers completely. When writing
flat-model application code, you never need to use a segment
override or modify any segment register, and the code-section
addresses you pass to c{CALL} and c{JMP} live in the same address
space as the data-section addresses you access your variables by and
the stack-section addresses you access local variables and procedure
parameters by. Every address is 32 bits long and contains only an
offset part.
H{32c} Interfacing to 32-bit C Programs
A lot of the discussion in k{16c}, about interfacing to 16-bit C
programs, still applies when working in 32 bits. The absence of
memory models or segmentation worries simplifies things a lot.
S{32cunder} External Symbol Names
Most 32-bit C compilers share the convention used by 16-bit
compilers, that the names of all global symbols (functions or data)
they define are formed by prefixing an underscore to the name as it
appears in the C program. However, not all of them do: the ELF
specification states that C symbols do e{not} have a leading
underscore on their assembly-language names.
The older Linux c{a.out} C compiler, all Win32 compilers, DJGPP,
and NetBSD and FreeBSD, all use the leading underscore; for these
compilers, the macros c{cextern} and c{cglobal}, as given in
k{16cunder}, will still work. For ELF, though, the leading
underscore should not be used.
S{32cfunc} Function Definitions and Function Calls
I{functions, C calling convention}The i{C calling convention} in
32-bit programs is as follows. In the
following description, the words e{caller} and e{callee} are used
to denote the function doing the calling and the function which gets
called.
b The caller pushes the function's parameters on the stack, one
after another, in reverse order (right to left, so that the first
argument specified to the function is pushed last).
b The caller then executes a near c{CALL} instruction to pass
control to the callee.
b The callee receives control, and typically (although this is not
actually necessary, in functions which do not need to access their
parameters) starts by saving the value of c{ESP} in c{EBP} so as
to be able to use c{EBP} as a base pointer to find its parameters
on the stack. However, the caller was probably doing this too, so
part of the calling convention states that c{EBP} must be preserved
by any C function. Hence the callee, if it is going to set up
c{EBP} as a i{frame pointer}, must push the previous value first.
b The callee may then access its parameters relative to c{EBP}.
The doubleword at c{[EBP]} holds the previous value of c{EBP} as
it was pushed; the next doubleword, at c{[EBP+4]}, holds the return
address, pushed implicitly by c{CALL}. The parameters start after
that, at c{[EBP+8]}. The leftmost parameter of the function, since
it was pushed last, is accessible at this offset from c{EBP}; the
others follow, at successively greater offsets. Thus, in a function
such as c{printf} which takes a variable number of parameters, the
pushing of the parameters in reverse order means that the function
knows where to find its first parameter, which tells it the number
and type of the remaining ones.
b The callee may also wish to decrease c{ESP} further, so as to
allocate space on the stack for local variables, which will then be
accessible at negative offsets from c{EBP}.
b The callee, if it wishes to return a value to the caller, should
leave the value in c{AL}, c{AX} or c{EAX} depending on the size
of the value. Floating-point results are typically returned in
c{ST0}.
b Once the callee has finished processing, it restores c{ESP} from
c{EBP} if it had allocated local stack space, then pops the previous
value of c{EBP}, and returns via c{RET} (equivalently, c{RETN}).
b When the caller regains control from the callee, the function
parameters are still on the stack, so it typically adds an immediate
constant to c{ESP} to remove them (instead of executing a number of
slow c{POP} instructions). Thus, if a function is accidentally
called with the wrong number of parameters due to a prototype
mismatch, the stack will still be returned to a sensible state since
the caller, which e{knows} how many parameters it pushed, does the
removing.
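The offsets described in the list above can be worked out mechanically. The following is a minimal sketch in C, assuming every parameter occupies exactly one doubleword on the stack; the function names c{param_offset_from_ebp} and c{caller_cleanup_bytes} are hypothetical and exist only for this illustration.

```c
#include <assert.h>

/* Size of one stack slot in 32-bit code: every parameter is assumed
 * to occupy exactly one doubleword. */
enum { SLOT_BYTES = 4 };

/* Offset from EBP of parameter `index` (0 = leftmost), once the callee
 * has executed `push ebp` / `mov ebp,esp`.  [EBP] holds the saved EBP
 * and [EBP+4] holds the return address, so parameters start at +8. */
int param_offset_from_ebp(int index)
{
    assert(index >= 0);
    return 2 * SLOT_BYTES + index * SLOT_BYTES;
}

/* Number of bytes the caller adds to ESP after the call in order to
 * discard `count` doubleword parameters. */
int caller_cleanup_bytes(int count)
{
    assert(count >= 0);
    return count * SLOT_BYTES;
}
```

For a call that pushes two doubleword parameters, the first is found at c{[EBP+8]}, the second at c{[EBP+12]}, and the caller removes 8 bytes from the stack afterwards.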
There is an alternative calling convention used by Win32 programs
for Windows API calls, and also for functions called e{by} the
Windows API such as window procedures: they follow what Microsoft
calls the c{__stdcall} convention. This is slightly closer to the
Pascal convention, in that the callee clears the stack by passing a
parameter to the c{RET} instruction. However, the parameters are
still pushed in right-to-left order.
Thus, you would define a function in C style in the following way:
c global _myfunc
c _myfunc: push ebp
c mov ebp,esp
c sub esp,0x40 ; 64 bytes of local stack space
c mov ebx,[ebp+8] ; first parameter to function
c ; some more code
c leave ; mov esp,ebp / pop ebp
c ret
At the other end of the process, to call a C function from your
assembly code, you would do something like this:
c extern _printf
c ; and then, further down...
c push dword [myint] ; one of my integer variables
c push dword mystring ; pointer into my data segment
c call _printf
c add esp,byte 8 ; `byte' saves space
c ; then those data items...
c segment _DATA
c myint dd 1234
c mystring db 'This number -> %d <- should be 1234',10,0
This piece of code is the assembly equivalent of the C code
c int myint = 1234;
c printf("This number -> %d <- should be 1234\n", myint);
S{32cdata} Accessing Data Items
To get at the contents of C variables, or to declare variables which
C can access, you need only declare the names as c{GLOBAL} or
c{EXTERN}. (Again, the names require leading underscores, as stated
in k{32cunder}.) Thus, a C variable declared as c{int i} can be
accessed from assembler as
c extern _i
c mov eax,[_i]
And to declare your own integer variable which C programs can access
as c{extern int j}, you do this (making sure you are assembling in
the c{_DATA} segment, if necessary):
c global _j
c _j dd 0
To access a C array, you need to know the size of the components of
the array. For example, c{int} variables are four bytes long, so if
a C program declares an array as c{int a[10]}, you can access
c{a[3]} by coding c{mov eax,[_a+12]}. (The byte offset 12 is obtained
by multiplying the desired array index, 3, by the size of the array
element, 4.) The sizes of the C base types in 32-bit compilers are:
1 for c{char}, 2 for c{short}, 4 for c{int}, c{long} and
c{float}, and 8 for c{double}. Pointers, being 32-bit addresses,
are also 4 bytes long.
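The offset arithmetic can be checked from the C side. This is a small sketch, assuming 4-byte integers (modelled here with c{int32_t}); the helper names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Byte offset of element `index` in an array whose elements are
 * `elem_size` bytes each. */
size_t element_offset(size_t index, size_t elem_size)
{
    return index * elem_size;
}

/* The same offset, measured on a real array of 4-byte integers by
 * subtracting the array base from the address of element 3. */
size_t measured_offset_of_a3(void)
{
    static int32_t a[10];
    return (size_t)((const char *)&a[3] - (const char *)a);
}
```

Both functions give 12 for index 3, which is the displacement to add to c{_a} in the assembly code.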
To access a C i{data structure}, you need to know the offset from
the base of the structure to the field you are interested in. You
can either do this by converting the C structure definition into a
NASM structure definition (using c{STRUC}), or by calculating the
one offset and using just that.
To do either of these, you should read your C compiler's manual to
find out how it organises data structures. NASM gives no special
alignment to structure members in its own ic{STRUC} macro, so you
have to specify alignment yourself if the C compiler generates it.
Typically, you might find that a structure like
c struct {
c char c;
c int i;
c } foo;
might be eight bytes long rather than five, since the c{int} field
would be aligned to a four-byte boundary. However, this sort of
feature is sometimes a configurable option in the C compiler, either
using command-line options or c{#pragma} lines, so you have to find
out how your own compiler does it.
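One way to find out what a particular compiler does, rather than guessing, is to ask it. The sketch below uses the standard c{offsetof} macro and c{sizeof}; the structure tag c{foo_layout} and the function names are hypothetical additions for the example.

```c
#include <stddef.h>

/* The structure from the text, given a tag so that it can be inspected. */
struct foo_layout {
    char c;
    int i;
};

/* Offset of the `i` field from the start of the structure. */
size_t foo_offset_of_i(void)
{
    return offsetof(struct foo_layout, i);
}

/* Total size of the structure, including any padding. */
size_t foo_total_size(void)
{
    return sizeof(struct foo_layout);
}
```

With a compiler that aligns c{int} to a four-byte boundary these would be expected to report 4 and 8, but the values depend on the compiler and its packing options, so they should be printed or checked rather than assumed.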
S{32cmacro} ic{c32.mac}: Helper Macros for the 32-bit C Interface
Included in the NASM archives, in the I{misc directory}c{misc}
directory, is a file c{c32.mac} of macros. It defines three macros:
ic{proc}, ic{arg} and ic{endproc}. These are intended to be
used for C-style procedure definitions, and they automate a lot of
the work involved in keeping track of the calling convention.
An example of an assembly function using the macro set is given
here:
c proc _proc32
c %$i arg
c %$j arg
c mov eax,[ebp + %$i]
c mov ebx,[ebp + %$j]
c add eax,[ebx]
c endproc
This defines c{_proc32} to be a procedure taking two arguments, the
first (c{i}) an integer and the second (c{j}) a pointer to an
integer. It returns c{i + *j}.
Note that the c{arg} macro has an c{EQU} as the first line of its
expansion, and since the label before the macro call gets prepended
to the first line of the expanded macro, the c{EQU} works, defining
c{%$i} to be an offset from c{EBP}. A context-local variable is
used, local to the context pushed by the c{proc} macro and popped
by the c{endproc} macro, so that the same argument name can be used
in later procedures. Of course, you don't e{have} to do that.
c{arg} can take an optional parameter, giving the size of the
argument. If no size is given, 4 is assumed, since it is likely that
many function parameters will be of type c{int} or pointers.
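The bookkeeping that c{proc} and c{arg} perform can be pictured with a small model. This is only a sketch: it assumes that the first argument is assigned offset 8 from c{EBP} and that each c{arg} advances the offset by the argument's size, as the calling convention implies. The macro file itself is not reproduced here, and the names c{proc_context}, c{proc_begin} and c{proc_arg} are hypothetical.

```c
#include <assert.h>

/* Hypothetical model of the state kept between `proc` and `endproc`. */
struct proc_context {
    int next_arg_offset;   /* offset from EBP of the next argument */
};

/* Models `proc`: the first argument sits above the saved EBP and the
 * return address, hence at offset 8. */
void proc_begin(struct proc_context *ctx)
{
    ctx->next_arg_offset = 8;
}

/* Models `arg`: return the current offset, then advance by the size of
 * the argument.  Pass 4 to mimic the default used when no size is given. */
int proc_arg(struct proc_context *ctx, int size)
{
    assert(size > 0);
    int offset = ctx->next_arg_offset;
    ctx->next_arg_offset += size;
    return offset;
}
```

Under these assumptions the two arguments of c{_proc32} would be found at offsets 8 and 12.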
H{picdll} Writing NetBSD/FreeBSD/OpenBSD and Linux/ELF i{Shared
Libraries}
ELF replaced the older c{a.out} object file format under Linux
because it contains support for i{position-independent code}
(i{PIC}), which makes writing shared libraries much easier. NASM
supports the ELF position-independent code features, so you can
write Linux ELF shared libraries in NASM.
i{NetBSD}, and its close cousins i{FreeBSD} and i{OpenBSD}, take
a different approach by hacking PIC support into the c{a.out}
format. NASM supports this as the ic{aoutb} output format, so you
can write i{BSD} shared libraries in NASM too.
The operating system loads a PIC shared library by memory-mapping
the library file at an arbitrarily chosen point in the address space
of the running process. The contents of the library's code section
must therefore not depend on where it is loaded in memory.
Therefore, you cannot get at your variables by writing code like
this:
c mov eax,[myvar] ; WRONG
Instead, the linker provides an area of memory called the
ie{global offset table}, or i{GOT}; the GOT is situated at a
constant distance from your library's code, so if you can find out
where your library is loaded (which is typically done using a
c{CALL} and c{POP} combination), you can obtain the address of the
GOT, and you can then load the addresses of your variables out of
linker-generated entries in the GOT.
The e{data} section of a PIC shared library does not have these
restrictions: since the data section is writable, it has to be
copied into memory anyway rather than just paged in from the library
file, so as long as it's being copied it can be relocated too. So
you can put ordinary types of relocation in the data section without
too much worry (but see k{picglobal} for a caveat).
S{picgot} Obtaining the Address of the GOT
Each code module in your shared library should define the GOT as an
external symbol:
c extern _GLOBAL_OFFSET_TABLE_ ; in ELF
c extern __GLOBAL_OFFSET_TABLE_ ; in BSD a.out
At the beginning of any function in your shared library which plans
to access your data or BSS sections, you must first calculate the
address of the GOT. This is typically done by writing the function
in this form:
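c func: push ebp ; standard prologue
c mov ebp,esp
c push ebx ; EBX is about to hold the GOT address
c call .get_GOT
c .get_GOT: pop ebx ; EBX = run-time address of .get_GOT
c add ebx,_GLOBAL_OFFSET_TABLE_+$$-.get_GOT wrt ..gotpc
c ; the function body comes here
c mov ebx,[ebp-4] ; restore the caller's EBX
c mov esp,ebp
c pop ebp
c ret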
(For BSD, again, the symbol c{_GLOBAL_OFFSET_TABLE_} requires a
second leading underscore.)
The first two lines of this function are simply the standard C
prologue to set up a stack frame, and the last three lines are
the standard C function epilogue. The third line, and the fourth-to-last
line, save and restore the c{EBX} register, because PIC shared
libraries use this register to store the address of the GOT.
The interesting bit is the c{CALL} instruction and the following
two lines. The c{CALL} and c{POP} combination obtains the address
of the label c{.get_GOT}, without having to know in advance where
the program was loaded (since the c{CALL} instruction is encoded
relative to the current position). The c{ADD} instruction makes use
of one of the special PIC relocation types: i{GOTPC relocation}.
With the ic{WRT ..gotpc} qualifier specified, the symbol
referenced (here c{_GLOBAL_OFFSET_TABLE_}, the special symbol
assigned to the GOT) is given as an offset from the beginning of the
section. (Actually, ELF encodes it as the offset from the operand
field of the c{ADD} instruction, but NASM simplifies this
deliberately, so you do things the same way for both ELF and BSD.)
So the instruction then e{adds} the beginning of the section, to
get the real address of the GOT, and subtracts the value of
c{.get_GOT} which it knows is in c{EBX}. Therefore, by the time
that instruction has finished,
c{EBX} contains the address of the GOT.
If you didn't follow that, don't worry: it's never necessary to
obtain the address of the GOT by any other means, so you can put
those three instructions into a macro and safely ignore them:
c %macro get_GOT 0
c call %%getgot
c %%getgot: pop ebx
c add ebx,_GLOBAL_OFFSET_TABLE_+$$-%%getgot wrt ..gotpc
c %endmacro
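The arithmetic performed by the c{CALL}, c{POP} and c{ADD} sequence can be followed with plain numbers. The C sketch below is a model, not part of NASM: it treats c{_GLOBAL_OFFSET_TABLE_ wrt ..gotpc} as the distance from the start of the section to the GOT, as described above, and the function and parameter names are hypothetical.

```c
#include <stdint.h>

/* section_base            run-time address at which the section was loaded
 * get_got_offset          offset of the .get_GOT label within the section
 * got_offset_from_section distance from the section start to the GOT */
uint32_t got_address_after_add(uint32_t section_base,
                               uint32_t get_got_offset,
                               uint32_t got_offset_from_section)
{
    /* CALL pushes the run-time address of .get_GOT; POP puts it in EBX. */
    uint32_t ebx = section_base + get_got_offset;

    /* Operand of ADD: (_GLOBAL_OFFSET_TABLE_ wrt ..gotpc) + ($$ - .get_GOT),
     * that is, (GOT - section start) - (offset of .get_GOT). */
    uint32_t operand = got_offset_from_section - get_got_offset;

    /* The offset of .get_GOT cancels, leaving section start + GOT distance. */
    return ebx + operand;
}
```

Changing c{section_base} moves the result by exactly the same amount, which is the sense in which the sequence is position-independent.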
S{piclocal} Finding Your Local Data Items
Having got the GOT, you can then use it to obtain the addresses of
your data items. Most variables will reside in the sections you have
declared; they can be accessed using the I{GOTOFF
relocation}c{..gotoff} special Ic{WRT ..gotoff}c{WRT} type. It
works like this:
c lea eax,[ebx+myvar wrt ..gotoff]
The expression c{myvar wrt ..gotoff} is calculated, when the shared
library is linked, to be the offset to the local variable c{myvar}
from the beginning of the GOT. Therefore, adding it to c{EBX} as
above will place the real address of c{myvar} in c{EAX}.
If you declare variables as c{GLOBAL} without specifying a size for
them, they are shared between code modules in the library, but do
not get exported from the library to the program that loaded it.
They will still be in your ordinary data and BSS sections, so you
can access them in the same way as local variables, using the above
c{..gotoff} mechanism.
Note that due to a peculiarity of the way BSD c{a.out} format
handles this relocation type, there must be at least one non-local
symbol in the same section as the address you're trying to access.
S{picextern} Finding External and Common Data Items
If your library needs to get at an external variable (external to
the e{library}, not just to one of the modules within it), you must
use the I{GOT relocations}Ic{WRT ..got}c{..got} type to get at
it. The c{..got} type, instead of giving you the offset from the
GOT base to the variable, gives you the offset from the GOT base to
a GOT e{entry} containing the address of the variable. The linker
will set up this GOT entry when it builds the library, and the
dynamic linker will place the correct address in it at load time. So
to obtain the address of an external variable c{extvar} in c{EAX},
you would code
c mov eax,[ebx+extvar wrt ..got]
This loads the address of c{extvar} out of an entry in the GOT. The
linker, when it builds the shared library, collects together every
relocation of type c{..got}, and builds the GOT so as to ensure it
has every necessary entry present.
Common variables must also be accessed in this way.
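The extra level of indirection introduced by c{..got} can be illustrated in C. This is a toy model only: the table c{model_got}, its layout and the helper names are hypothetical, and a real GOT is built by the linker and filled in by the dynamic linker rather than by program code.

```c
#include <stddef.h>
#include <string.h>

/* Stand-in for a variable defined outside the library. */
int extvar = 42;

/* Hypothetical GOT with two entries; entry 1 holds the address of
 * extvar, as the dynamic linker would arrange at load time. */
static void *model_got[2] = { NULL, &extvar };

/* Byte offset of extvar's entry from the GOT base: the value that
 * `extvar wrt ..got` stands for in this model. */
static const size_t extvar_got_offset = 1 * sizeof(void *);

/* Equivalent of `mov eax,[ebx+extvar wrt ..got]`, with `got_base`
 * playing the role of EBX: fetch a pointer stored in the table. */
void *load_address_from_got(const void *got_base, size_t entry_offset)
{
    void *address;
    memcpy(&address, (const char *)got_base + entry_offset, sizeof address);
    return address;
}

/* Returns the address of extvar as read out of the model GOT. */
int *address_of_extvar(void)
{
    return load_address_from_got(model_got, extvar_got_offset);
}
```

The value loaded is the address of the variable, not its contents; a second memory access through that address is needed to read or write c{extvar} itself.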
S{picglobal} Exporting Symbols to the Library User
If you want to export symbols to the user of the library, you have
to declare whether they are functions or data, and if they are data,
you have to give the size of the data item. This is because the
dynamic linker has to build I{PLT}i{procedure linkage table}
entries for any exported functions, and also moves exported data
items away from the library's data section in which they were
declared.
So to export a function to users of the library, you must use
c global func:function ; declare it as a function
c func: push ebp
c ; etc.
And to export a data item such as an array, you would have to code
c global array:data array.end-array ; give the size too
c array: resd 128
c .end:
Be careful: If you export a variable to the library user, by
declaring it as c{GLOBAL} and supplying a size, the variable will
end up living in the data section of the main program, rather than
in your library's data section, where you declared it. So you will
have to access your own global variable with the c{..got} mechanism
rather than c{..gotoff}, as if it were external (which,
effectively, it has become).
Equally, if you need to store the address of an exported global in
one of your data sections, you can't do it by means of the standard
sort of code:
c dataptr: dd global_data_item ; WRONG
NASM will interpret this code as an ordinary relocation, in which
c{global_data_item} is merely an offset from the beginning of the
c{.data} section (or whatever); so this reference will end up
pointing at your data section instead of at the exported global
which resides elsewhere.
Instead of the above code, then, you must write
c dataptr: dd global_data_item wrt ..sym
which makes use of the special c{WRT} type Ic{WRT ..sym}c{..sym}
to instruct NASM to search the symbol table for a particular symbol
at that address, rather than just relocating by section base.
Either method will work for functions: referring to one of your
functions by means of
c funcptr: dd my_function
will give the user the address of the code you wrote, whereas
c funcptr: dd my_function wrt ..sym
will give the address of the procedure linkage table for the
function, which is where the calling program will e{believe} the
function lives. Either address is a valid way to call the function.
S{picproc} Calling Procedures Outside the Library
Calling procedures outside your shared library has to be done by
means of a ie{procedure linkage table}, or i{PLT}. The PLT is
placed at a known offset from where the library is loaded, so the
library code can make calls to the PLT in a position-independent
way. Within the PLT there is code to jump to offsets contained in
the GOT, so function calls to other shared libraries or to routines
in the main program can be transparently passed off to their real
destinations.
To call an external routine, you must use another special PIC
relocation type, I{PLT relocations}ic{WRT ..plt}. This is much
easier than the GOT-based ones: you simply replace calls such as
c{CALL printf} with the PLT-relative version c{CALL printf WRT
..plt}.
S{link} Generating the Library File
Having written some code modules and assembled them to c{.o} files,
you then generate your shared library with a command such as
c ld -shared -o library.so module1.o module2.o # for ELF
c ld -Bshareable -o library.so module1.o module2.o # for BSD
For ELF, if your shared library is going to reside in system
directories such as c{/usr/lib} or c{/lib}, it is usually worth
using the ic{-soname} flag to the linker, to store the final
library file name, with a version number, into the library:
c ld -shared -soname library.so.1 -o library.so.1.2 *.o
You would then copy c{library.so.1.2} into the library directory,
and create c{library.so.1} as a symbolic link to it.
C{mixsize} Mixing 16 and 32 Bit Code
This chapter tries to cover some of the issues, largely related to
unusual forms of addressing and jump instructions, encountered when
writing operating system code such as protected-mode initialisation
routines, which require code that operates in mixed segment sizes,
such as code in a 16-bit segment trying to modify data in a 32-bit
one, or jumps between different-size segments.
H{mixjump} Mixed-Size JumpsI{jumps, mixed-size}
I{operating system, writing}I{writing operating systems}The most
common form of i{mixed-size instruction} is the one used when
writing a 32-bit OS: having done your setup in 16-bit mode, such as
loading the kernel, you then have to boot it by switching into
protected mode and jumping to the 32-bit kernel start address. In a
fully 32-bit OS, this tends to be the e{only} mixed-size
instruction you need, since everything before it can be done in pure
16-bit code, and everything after it can be pure 32-bit.
This jump must specify a 48-bit far address, since the target
segment is a 32-bit one. However, it must be assembled in a 16-bit
segment, so just coding, for example,
c jmp 0x1234:0x56789ABC ; wrong!
will not work, since the offset part of the address will be
truncated to c{0x9ABC} and the jump will be an ordinary 16-bit far
one.
The Linux kernel setup code gets round the inability of c{as86} to
generate the required instruction by coding it manually, using
c{DB} instructions. NASM can go one better than that, by actually
generating the right instruction itself. Here's how to do it right:
c jmp dword 0x1234:0x56789ABC ; right
Ic{JMP DWORD}The c{DWORD} prefix (strictly speaking, it should
come e{after} the colon, since it is declaring the e{offset} field
to be a doubleword; but NASM will accept either form, since both are
unambiguous) forces the offset part to be treated as a doubleword, on the
assumption that you are deliberately writing a jump from a 16-bit
segment to a 32-bit one.
You can do the reverse operation, jumping from a 32-bit segment to a
16-bit one, by means of the c{WORD} prefix:
c jmp word 0x8765:0x4321 ; 32 to 16 bit
If the c{WORD} prefix is specified in 16-bit mode, or the c{DWORD}
prefix in 32-bit mode, they will be ignored, since each is
explicitly forcing NASM into a mode it was in anyway.
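For readers curious about what a hand-coded version of the 16-bit to 32-bit jump would have to contain, the following C sketch lays out the bytes. The values follow the usual x86 encoding of a direct far jump preceded by an operand-size prefix; they are stated from the instruction set rather than taken from this manual, and the function name is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Writes the encoding of a far jump with a 16-bit segment and a 32-bit
 * offset, as assembled in a 16-bit segment, into `out`.
 * Returns the number of bytes written (always 8). */
size_t encode_jmp_far_16_to_32(uint16_t segment, uint32_t offset,
                               uint8_t out[8])
{
    size_t n = 0;
    out[n++] = 0x66;                             /* operand-size prefix */
    out[n++] = 0xEA;                             /* JMP far, direct pointer */
    out[n++] = (uint8_t)(offset & 0xFF);         /* 32-bit offset, low byte first */
    out[n++] = (uint8_t)((offset >> 8) & 0xFF);
    out[n++] = (uint8_t)((offset >> 16) & 0xFF);
    out[n++] = (uint8_t)((offset >> 24) & 0xFF);
    out[n++] = (uint8_t)(segment & 0xFF);        /* 16-bit segment, low byte first */
    out[n++] = (uint8_t)(segment >> 8);
    return n;
}
```

For segment c{0x1234} and offset c{0x56789ABC} the bytes come out as c{66 EA BC 9A 78 56 34 12}. Without the prefix only the low two offset bytes would be stored, which is the truncation to c{0x9ABC} mentioned earlier.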
H{mixaddr} Addressing Between Different-Size SegmentsI{addressing,
mixed-size}I{mixed-size addressing}
If your OS is mixed 16 and 32-bit, or if you are writing a DOS
extender, you are likely to have to deal with some 16-bit segments
and some 32-bit ones. At some point, you will probably end up
writing code in a 16-bit segment which has to access data in a
32-bit segment, or vice versa.
If the data you are trying to access in a 32-bit segment lies within
the first 64K of the segment, you may be able to get away with using
an ordinary 16-bit addressing operation for the purpose; but sooner
or later, you will want to do 32-bit addressing from 16-bit mode.
The easiest way to do this is to make sure you use a register for
the address, since any effective address containing a 32-bit
register is forced to be a 32-bit address. So you can do
c mov eax,offset_into_32_bit_segment_specified_by_fs
c mov dword [fs:eax],0x11223344
This is fine, but slightly cumbersome (since it wastes an
instruction and a register) if you already know the precise offset
you are aiming at. The x86 architecture does allow 32-bit effective
addresses to specify nothing but a 4-byte offset, so why shouldn't
NASM be able to generate the best instruction for the purpose?
It can. As in k{mixjump}, you need only prefix the address with the
c{DWORD} keyword, and it will be forced to be a 32-bit address:
c mov dword [fs:dword my_offset],0x11223344
Also as in k{mixjump}, NASM is not fussy about whether the
c{DWORD} prefix comes before or after the segment override, so
arguably a nicer-looking way to code the above instruction is
c mov dword [dword fs:my_offset],0x11223344
Don't confuse the c{DWORD} prefix e{outside} the square brackets,
which controls the size of the data stored at the address, with the
one e{inside} the square brackets which controls the length of the
address itself. The two can quite easily be different:
c mov word [dword 0x12345678],0x9ABC
This moves 16 bits of data to an address specified by a 32-bit
offset.
You can also specify c{WORD} or c{DWORD} prefixes along with the
c{FAR} prefix to indirect far jumps or calls. For example:
c call dword far [fs:word 0x4321]
This instruction contains an address specified by a 16-bit offset;
it loads a 48-bit far pointer from that (16-bit segment and 32-bit
offset), and calls that address.
H{mixother} Other Mixed-Size Instructions
The other way you might want to access data might be using the
string instructions (c{LODSx}, c{STOSx} and so on) or the
c{XLATB} instruction. These instructions, since they take no
parameters, might seem to have no easy way to make them perform
32-bit addressing when assembled in a 16-bit segment.
This is the purpose of NASM's ic{a16} and ic{a32} prefixes. If
you are coding c{LODSB} in a 16-bit segment but it is supposed to
be accessing a string in a 32-bit segment, you should load the
desired address into c{ESI} and then code
c a32 lodsb
The prefix forces the addressing size to 32 bits, meaning that
c{LODSB} loads from c{[DS:ESI]} instead of c{[DS:SI]}. To access
a string in a 16-bit segment when coding in a 32-bit one, the
corresponding c{a16} prefix can be used.
The c{a16} and c{a32} prefixes can be applied to any instruction
in NASM's instruction table, but most of them can generate all the
useful forms without them. The prefixes are necessary only for
instructions with implicit addressing: c{CMPSx} (k{insCMPSB}),
c{SCASx} (k{insSCASB}), c{LODSx} (k{insLODSB}), c{STOSx}
(k{insSTOSB}), c{MOVSx} (k{insMOVSB}), c{INSx} (k{insINSB}),
c{OUTSx} (k{insOUTSB}), and c{XLATB} (k{insXLATB}). Also, the
various push and pop instructions (c{PUSHA} and c{POPF} as well as
the more usual c{PUSH} and c{POP}) can accept c{a16} or c{a32}
prefixes to force a particular one of c{SP} or c{ESP} to be used
as a stack pointer, in case the stack segment in use is a different
size from the code segment.
c{PUSH} and c{POP}, when applied to segment registers in 32-bit
mode, also have the slightly odd behaviour that they push and pop 4
bytes at a time, of which the top two are ignored and the bottom two
give the value of the segment register being manipulated. To force
the 16-bit behaviour of segment-register push and pop instructions,
you can use the operand-size prefix ic{o16}:
c o16 push ss
c o16 push ds
This code saves a doubleword of stack space by fitting two segment
registers into the space which would normally be consumed by pushing
one.
(You can also use the ic{o32} prefix to force the 32-bit behaviour
when in 16-bit mode, but this seems less useful.)
C{trouble} Troubleshooting
This chapter describes some of the common problems that users have
been known to encounter with NASM, and answers them. It also gives
instructions for reporting bugs in NASM if you find a difficulty
that isn't listed here.
H{problems} Common Problems
S{inefficient} NASM Generates i{Inefficient Code}
I get a lot of `bug' reports about NASM generating inefficient, or
even `wrong', code on instructions such as c{ADD ESP,8}. This is a
deliberate design feature, connected to predictability of output:
NASM, on seeing c{ADD ESP,8}, will generate the form of the
instruction which leaves room for a 32-bit immediate. You need to code
Ic{BYTE}c{ADD ESP,BYTE 8} if you want the space-efficient
form of the instruction. This isn't a bug: at worst it's a
misfeature, and that's a matter of opinion only.
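The size difference between the two forms can be seen from their encodings. The C sketch below writes out both; the byte values are the standard x86 encodings of c{ADD} with a register destination and an immediate source, stated from the instruction set rather than from this manual, and the function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* ADD ESP,imm with a full 32-bit immediate: 81 C4 id (6 bytes). */
size_t encode_add_esp_imm32(uint32_t value, uint8_t out[6])
{
    out[0] = 0x81;
    out[1] = 0xC4;   /* ModRM: register-direct, /0 = ADD, r/m = ESP */
    out[2] = (uint8_t)(value & 0xFF);
    out[3] = (uint8_t)((value >> 8) & 0xFF);
    out[4] = (uint8_t)((value >> 16) & 0xFF);
    out[5] = (uint8_t)((value >> 24) & 0xFF);
    return 6;
}

/* ADD ESP,BYTE imm: 83 C4 ib (3 bytes); the processor sign-extends
 * the byte to 32 bits, so only values from -128 to 127 fit. */
size_t encode_add_esp_imm8(int8_t value, uint8_t out[3])
{
    out[0] = 0x83;
    out[1] = 0xC4;
    out[2] = (uint8_t)value;
    return 3;
}
```

The c{BYTE} form is three bytes shorter, which is the saving on offer when the keyword is written explicitly.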
S{jmprange} My Jumps are Out of RangeI{out of range, jumps}
Similarly, people complain that when they issue i{conditional
jumps} (which are c{SHORT} by default) that try to jump too far,
NASM reports `short jump out of range' instead of making the jumps
longer.
This, again, is partly a predictability issue, but in fact has a
more practical reason as well. NASM has no means of being told what
type of processor the code it is generating will be run on; so it
cannot decide for itself that it should generate ic{Jcc NEAR} type
instructions, because it doesn't know that it's working for a 386 or
above. Alternatively, it could replace the out-of-range short
c{JNE} instruction with a very short c{JE} instruction that jumps
over a c{JMP NEAR}; this is a sensible solution for processors
below a 386, but hardly efficient on processors which have good
branch prediction e{and} could have used c{JNE NEAR} instead. So,
once again, it's up to the user, not the assembler, to decide what
instructions should be generated.
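Whether a given jump can be c{SHORT} is a matter of simple arithmetic, sketched here in C. The check assumes the usual rule that a short jump holds a signed 8-bit displacement measured from the instruction following the jump; the function name is hypothetical.

```c
#include <stdint.h>

/* Returns non-zero if a jump whose following instruction starts at
 * `next_instruction` can reach `target` with a signed 8-bit
 * displacement, i.e. a distance between -128 and +127 bytes. */
int fits_short_jump(uint32_t next_instruction, uint32_t target)
{
    int64_t displacement = (int64_t)target - (int64_t)next_instruction;
    return displacement >= -128 && displacement <= 127;
}
```

When the check fails, the choice between a c{NEAR} conditional jump and an inverted short jump over a c{JMP NEAR} is the decision that the text above leaves to the programmer.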
S{proborg} ic{ORG} Doesn't Work
People writing i{boot sector} programs in the c{bin} format often
complain that c{ORG} doesn't work the way they'd like: in order to
place the c{0xAA55} signature word at the end of a 512-byte boot
sector, people who are used to MASM tend to code
c ORG 0
c ; some boot sector code
c ORG 510
c DW 0xAA55
This is not the intended use of the c{ORG} directive in NASM, and
will not work. The correct way to solve this problem in NASM is to
use the ic{TIMES} directive, like this:
c ORG 0
c ; some boot sector code
c TIMES 510-($-$$) DB 0
c DW 0xAA55
The c{TIMES} directive will insert exactly enough zero bytes into
the output to move the assembly point up to 510. This method also
has the advantage that if you accidentally fill your boot sector too
full, NASM will catch the problem at assembly time and report it, so
you won't end up with a boot sector that you have to disassemble to
find out what's wrong with it.
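The effect of the c{TIMES} line can be mimicked in C to show what ends up in the output. This is a sketch of the resulting byte layout only, not of how NASM works internally; the function name c{build_boot_sector} is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { SECTOR_BYTES = 512, SIGNATURE_AT = 510 };

/* Copies `code_len` bytes of boot code into `sector`, pads with zeros
 * up to offset 510 and appends the signature word.
 * Returns 0 on success, or -1 if the code leaves no room for the
 * signature (the case in which the TIMES count would be negative). */
int build_boot_sector(const uint8_t *code, size_t code_len,
                      uint8_t sector[SECTOR_BYTES])
{
    if (code_len > SIGNATURE_AT)
        return -1;

    memcpy(sector, code, code_len);
    /* TIMES 510-($-$$) DB 0 */
    memset(sector + code_len, 0, SIGNATURE_AT - code_len);
    /* DW 0xAA55, stored low byte first */
    sector[SIGNATURE_AT]     = 0x55;
    sector[SIGNATURE_AT + 1] = 0xAA;
    return 0;
}
```

The early return corresponds to the assembly-time error: code longer than 510 bytes makes the repeat count negative, so the problem is reported instead of producing a malformed sector.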
S{probtimes} ic{TIMES} Doesn't Work
The other common problem with the above code arises when people write
the c{TIMES} line as
c TIMES 510-$ DB 0
by reasoning that c{$} should be a pure number, just like 510, so
the difference between them is also a pure number and can happily be
fed to c{TIMES}.
NASM is a e{modular} assembler: the various component parts are
designed to be easily separable for re-use, so they don't exchange
information unnecessarily. In consequence, the c{bin} output
format, even though it has been told by the c{ORG} directive that
the c{.text} section should start at 0, does not pass that
information back to the expression evaluator. So from the
evaluator's point of view, c{$} isn't a pure number: it's an offset
from a section base. Therefore the difference between c{$} and 510
is also not a pure number, but involves a section base. Values
involving section bases cannot be passed as arguments to c{TIMES}.
The solution, as in the previous section, is to code the c{TIMES}
line in the form
c TIMES 510-($-$$) DB 0
in which c{$} and c{$$} are offsets from the same section base,
and so their difference is a pure number. This will solve the
problem and generate sensible code.
H{bugs} i{Bugs}I{reporting bugs}
We have never yet released a version of NASM with any e{known}
bugs. That doesn't usually stop there being plenty we didn't know
about, though. Any that you find should be reported to
W{mailto:hpa@zytor.com}c{hpa@zytor.com}.
Please read k{qstart} first, and don't report the bug if it's
listed in there as a deliberate feature. (If you think the feature
is badly thought out, feel free to send us reasons why you think it
should be changed, but don't just send us mail saying `This is a
bug' if the documentation says we did it on purpose.) Then read
k{problems}, and don't bother reporting the bug if it's listed
there.
If you do report a bug, e{please} give us all of the following
information:
b What operating system you're running NASM under. DOS, Linux,
NetBSD, Win16, Win32, VMS (I'd be impressed), whatever.
b If you're running NASM under DOS or Win32, tell us whether you've
compiled your own executable from the DOS source archive, or whether
you were using the standard distribution binaries out of the
archive. If you were using a locally built executable, try to
reproduce the problem using one of the standard binaries, as this
will make it easier for us to reproduce your problem prior to fixing
it.
b Which version of NASM you're using, and exactly how you invoked
it. Give us the precise command line, and the contents of the
c{NASM} environment variable if any.
b Which versions of any supplementary programs you're using, and
how you invoked them. If the problem only becomes visible at link
time, tell us what linker you're using, what version of it you've
got, and the exact linker command line. If the problem involves
linking against object files generated by a compiler, tell us what
compiler, what version, and what command line or options you used.
(If you're compiling in an IDE, please try to reproduce the problem
with the command-line version of the compiler.)
b If at all possible, send us a NASM source file which exhibits the
problem. If this causes copyright problems (e.g. you can only
reproduce the bug in restricted-distribution code) then bear in mind
the following two points: firstly, we guarantee that any source code
sent to us for the purposes of debugging NASM will be used e{only}
for the purposes of debugging NASM, and that we will delete all our
copies of it as soon as we have found and fixed the bug or bugs in
question; and secondly, we would prefer e{not} to be mailed large
chunks of code anyway. The smaller the file, the better. A
three-line sample file that does nothing useful e{except}
demonstrate the problem is much easier to work with than a
fully fledged ten-thousand-line program. (Of course, some errors
e{do} only crop up in large files, so this may not be possible.)
b A description of what the problem actually e{is}. `It doesn't
work' is e{not} a helpful description! Please describe exactly what
is happening that shouldn't be, or what isn't happening that should.
Examples might be: `NASM generates an error message saying Line 3
for an error that's actually on Line 5'; `NASM generates an error
message that I believe it shouldn't be generating at all'; `NASM
fails to generate an error message that I believe it e{should} be
generating'; `the object file produced from this source code crashes
my linker'; `the ninth byte of the output file is 66 and I think it
should be 77 instead'.
b If you believe the output file from NASM to be faulty, send it to
us. That allows us to determine whether our own copy of NASM
generates the same file, or whether the problem is related to
portability issues between our development platforms and yours. We
can handle binary files mailed to us as MIME attachments, uuencoded,
and even BinHex. Alternatively, we may be able to provide an FTP
site you can upload the suspect files to; but mailing them is easier
for us.
b Any other information or data files that might be helpful. If,
for example, the problem involves NASM failing to generate an object
file while TASM can generate an equivalent file without trouble,
then send us e{both} object files, so we can see what TASM is doing
differently from us.
A{iref} Intel x86 Instruction Reference
This appendix provides a complete list of the machine instructions
which NASM will assemble, and a short description of the function of
each one.
It is not intended to be exhaustive documentation on the fine
details of the instructions' function, such as which exceptions they
can trigger: for such documentation, you should go to Intel's Web
site, W{http://www.intel.com/}c{http://www.intel.com/}.
Instead, this appendix is intended primarily to provide
documentation on the way the instructions may be used within NASM.
For example, looking up c{LOOP} will tell you that NASM allows
c{CX} or c{ECX} to be specified as an optional second argument to
the c{LOOP} instruction, to enforce which of the two possible
counter registers should be used if the default is not the one
desired.
The instructions are not quite listed in alphabetical order, since
groups of instructions with similar functions are lumped together in
the same entry. Most of them don't move very far from their
alphabetic position because of this.
H{iref-opr} Key to Operand Specifications
The instruction descriptions in this appendix specify their operands
using the following notation:
b Registers: c{reg8} denotes an 8-bit i{general purpose
register}, c{reg16} denotes a 16-bit general purpose register, and
c{reg32} a 32-bit one. c{fpureg} denotes one of the eight FPU
stack registers, c{mmxreg} denotes one of the eight 64-bit MMX
registers, and c{segreg} denotes a segment register. In addition,
some registers (such as c{AL}, c{DX} or
c{ECX}) may be specified explicitly.
b Immediate operands: c{imm} denotes a generic i{immediate operand}.
c{imm8}, c{imm16} and c{imm32} are used when the operand is
intended to be a specific size. For some of these instructions, NASM
needs an explicit specifier: for example, c{ADD ESP,16} could be
interpreted as either c{ADD r/m32,imm32} or c{ADD r/m32,imm8}.
NASM chooses the former by default, and so you must specify c{ADD
ESP,BYTE 16} for the latter.
b Memory references: c{mem} denotes a generic i{memory reference};
c{mem8}, c{mem16}, c{mem32}, c{mem64} and c{mem80} are used
when the operand needs to be a specific size. Again, a specifier is
needed in some cases: c{DEC [address]} is ambiguous and will be
rejected by NASM. You must specify c{DEC BYTE [address]}, c{DEC
WORD [address]} or c{DEC DWORD [address]} instead.
b i{Restricted memory references}: one form of the c{MOV}
instruction allows a memory address to be specified e{without}
allowing the normal range of register combinations and effective
address processing. This is denoted by c{memoffs8}, c{memoffs16}
and c{memoffs32}.
b Register or memory choices: many instructions can accept either a
register e{or} a memory reference as an operand. c{r/m8} is a
shorthand for c{reg8/mem8}; similarly c{r/m16} and c{r/m32}.
c{r/m64} is MMX-related, and is a shorthand for c{mmxreg/mem64}.
H{iref-opc} Key to Opcode Descriptions
This appendix also provides the opcodes which NASM will generate for
each form of each instruction. The opcodes are listed in the
following way:
b A hex number, such as c{3F}, indicates a fixed byte containing
that number.
b A hex number followed by c{+r}, such as c{C8+r}, indicates that
one of the operands to the instruction is a register, and the
`register value' of that register should be added to the hex number
to produce the generated byte. For example, EDX has register value
2, so the code c{C8+r}, when the register operand is EDX, generates
the hex byte c{CA}. Register values for specific registers are
given in k{iref-rv}.
b A hex number followed by c{+cc}, such as c{40+cc}, indicates
that the instruction name has a condition code suffix, and the
numeric representation of the condition code should be added to the
hex number to produce the generated byte. For example, the code
c{40+cc}, when the instruction contains the c{NE} condition,
generates the hex byte c{45}. Condition codes and their numeric
representations are given in k{iref-cc}.
b A slash followed by a digit, such as c{/2}, indicates that one
of the operands to the instruction is a memory address or register
(denoted c{mem} or c{r/m}, with an optional size). This is to be
encoded as an effective address, with a i{ModR/M byte}, an optional
i{SIB byte}, and an optional displacement, and the spare (register)
field of the ModR/M byte should be the digit given (which will be
from 0 to 7, so it fits in three bits). The encoding of effective
addresses is given in k{iref-ea}.
b The code c{/r} combines the above two: it indicates that one of
the operands is a memory address or c{r/m}, and another is a
register, and that an effective address should be generated with the
spare (register) field in the ModR/M byte being equal to the
`register value' of the register operand. The encoding of effective
addresses is given in k{iref-ea}; register values are given in
k{iref-rv}.
b The codes c{ib}, c{iw} and c{id} indicate that one of the
operands to the instruction is an immediate value, and that this is
to be encoded as a byte, little-endian word or little-endian
doubleword respectively.
b The codes c{rb}, c{rw} and c{rd} indicate that one of the
operands to the instruction is an immediate value, and that the
e{difference} between this value and the address of the end of the
instruction is to be encoded as a byte, word or doubleword
respectively. Where the form c{rw/rd} appears, it indicates that
either c{rw} or c{rd} should be used according to whether assembly
is being performed in c{BITS 16} or c{BITS 32} state respectively.
b The codes c{ow} and c{od} indicate that one of the operands to
the instruction is a reference to the contents of a memory address
specified as an immediate value: this encoding is used in some forms
of the c{MOV} instruction in place of the standard
effective-address mechanism. The displacement is encoded as a word
or doubleword. Again, c{ow/od} denotes that c{ow} or c{od} should
be chosen according to the c{BITS} setting.
b The codes c{o16} and c{o32} indicate that the given form of the
instruction should be assembled with operand size 16 or 32 bits. In
other words, c{o16} indicates a c{66} prefix in c{BITS 32} state,
but generates no code in c{BITS 16} state; and c{o32} indicates a
c{66} prefix in c{BITS 16} state but generates nothing in c{BITS
32}.
b The codes c{a16} and c{a32}, similarly to c{o16} and c{o32},
indicate the address size of the given form of the instruction.
Where this does not match the c{BITS} setting, a c{67} prefix is
required.
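The arithmetic behind several of these codes can be sketched in C. This is only an illustration of the rules listed above, not code taken from NASM: the function names c{opcode_plus_r}, c{opcode_plus_cc}, c{needs_66_prefix} and c{needs_67_prefix} are hypothetical.

```c
#include <stdint.h>

/* Hypothetical helpers, not part of NASM: each models one rule from the
   key to opcode descriptions. */

/* `+r': add the register value (0 to 7) to the base opcode byte. */
uint8_t opcode_plus_r(uint8_t base, unsigned reg_value)
{
    return (uint8_t)(base + (reg_value & 7u));
}

/* `+cc': add the condition code (0 to 15) to the base opcode byte. */
uint8_t opcode_plus_cc(uint8_t base, unsigned cond)
{
    return (uint8_t)(base + (cond & 15u));
}

/* `o16'/`o32': a 66 prefix is needed only when the operand size of the
   instruction form differs from the current BITS setting (16 or 32). */
int needs_66_prefix(int bits, int operand_size)
{
    return bits != operand_size;
}

/* `a16'/`a32': likewise for the 67 prefix and the address size. */
int needs_67_prefix(int bits, int address_size)
{
    return bits != address_size;
}
```

With these definitions, c{opcode_plus_r(0xC8, 2)} gives the byte c{CA} from the EDX example, and c{opcode_plus_cc(0x40, 5)} gives c{45} from the c{NE} example.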
S{iref-rv} Register Values
Where an instruction requires a register value, it is already
implicit in the encoding of the rest of the instruction what type of
register is intended: an 8-bit general-purpose register, a segment
register, a debug register, an MMX register, or whatever. Therefore
there is no problem with registers of different types sharing an
encoding value.
The encodings for the various classes of register are:
b 8-bit general registers: c{AL} is 0, c{CL} is 1, c{DL} is 2,
c{BL} is 3, c{AH} is 4, c{CH} is 5, c{DH} is 6, and c{BH} is
7.
b 16-bit general registers: c{AX} is 0, c{CX} is 1, c{DX} is 2,
c{BX} is 3, c{SP} is 4, c{BP} is 5, c{SI} is 6, and c{DI} is 7.
b 32-bit general registers: c{EAX} is 0, c{ECX} is 1, c{EDX} is
2, c{EBX} is 3, c{ESP} is 4, c{EBP} is 5, c{ESI} is 6, and
c{EDI} is 7.
b i{Segment registers}: c{ES} is 0, c{CS} is 1, c{SS} is 2, c{DS}
is 3, c{FS} is 4, and c{GS} is 5.
b I{floating-point, registers}{Floating-point registers}: c{ST0}
is 0, c{ST1} is 1, c{ST2} is 2, c{ST3} is 3, c{ST4} is 4,
c{ST5} is 5, c{ST6} is 6, and c{ST7} is 7.
b 64-bit i{MMX registers}: c{MM0} is 0, c{MM1} is 1, c{MM2} is 2,
c{MM3} is 3, c{MM4} is 4, c{MM5} is 5, c{MM6} is 6, and c{MM7}
is 7.
b i{Control registers}: c{CR0} is 0, c{CR2} is 2, c{CR3} is 3,
and c{CR4} is 4.
b i{Debug registers}: c{DR0} is 0, c{DR1} is 1, c{DR2} is 2,
c{DR3} is 3, c{DR6} is 6, and c{DR7} is 7.
b i{Test registers}: c{TR3} is 3, c{TR4} is 4, c{TR5} is 5,
c{TR6} is 6, and c{TR7} is 7.
(Note that wherever a register name contains a number, that number
is also the register value for that register.)
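As a rough illustration, the general-purpose register values above can be held in three parallel tables, since a register's position in each list is its value. The function c{gp_register_value} below is a hypothetical lookup written for this sketch; it is not how NASM itself stores register information, and it expects upper-case names.

```c
#include <string.h>

/* Hypothetical lookup, not NASM's own table: returns the register value
   (0 to 7) of an upper-case general-purpose register name, or -1 if the
   name is not an 8-, 16- or 32-bit general register. */
int gp_register_value(const char *name)
{
    static const char *const reg8[8] =
        { "AL", "CL", "DL", "BL", "AH", "CH", "DH", "BH" };
    static const char *const reg16[8] =
        { "AX", "CX", "DX", "BX", "SP", "BP", "SI", "DI" };
    static const char *const reg32[8] =
        { "EAX", "ECX", "EDX", "EBX", "ESP", "EBP", "ESI", "EDI" };
    int i;

    for (i = 0; i < 8; i++) {
        /* The index in each table is the register value. */
        if (strcmp(name, reg8[i]) == 0 ||
            strcmp(name, reg16[i]) == 0 ||
            strcmp(name, reg32[i]) == 0)
            return i;
    }
    return -1;
}
```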
S{iref-cc} i{Condition Codes}
The available condition codes are given here, along with their
numeric representations as part of opcodes. Many of these condition
codes have synonyms, so several will be listed at a time.
In the following descriptions, the word `either', when applied to two
possible trigger conditions, is used to mean `either or both'. If
`either but not both' is meant, the phrase `exactly one of' is used.
b c{O} is 0 (trigger if the overflow flag is set); c{NO} is 1.
b c{B}, c{C} and c{NAE} are 2 (trigger if the carry flag is
set); c{AE}, c{NB} and c{NC} are 3.
b c{E} and c{Z} are 4 (trigger if the zero flag is set); c{NE}
and c{NZ} are 5.
b c{BE} and c{NA} are 6 (trigger if either of the carry or zero
flags is set); c{A} and c{NBE} are 7.
b c{S} is 8 (trigger if the sign flag is set); c{NS} is 9.
b c{P} and c{PE} are 10 (trigger if the parity flag is set);
c{NP} and c{PO} are 11.
b c{L} and c{NGE} are 12 (trigger if exactly one of the sign and
overflow flags is set); c{GE} and c{NL} are 13.
b c{LE} and c{NG} are 14 (trigger if either the zero flag is set,
or exactly one of the sign and overflow flags is set); c{G} and
c{NLE} are 15.
Note that in all cases, the sense of a condition code may be
reversed by changing the low bit of the numeric representation.
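The table above can be modelled in a few lines of C. The sketch below is an illustration only: c{condition_holds} and the c{FLAG_} masks are names invented here (the mask values follow the usual flag bit positions). It evaluates the even-numbered condition selected by the upper three bits of the code, then inverts the result when the low bit is set.

```c
/* Flag masks used by this sketch (hypothetical names; the values match
   the usual positions of these flags in the flags register). */
#define FLAG_CF 0x001u   /* carry */
#define FLAG_PF 0x004u   /* parity */
#define FLAG_ZF 0x040u   /* zero */
#define FLAG_SF 0x080u   /* sign */
#define FLAG_OF 0x800u   /* overflow */

/* Returns 1 if condition code cc (0 to 15) triggers for the given flags. */
int condition_holds(unsigned cc, unsigned flags)
{
    int cf = (flags & FLAG_CF) != 0;
    int pf = (flags & FLAG_PF) != 0;
    int zf = (flags & FLAG_ZF) != 0;
    int sf = (flags & FLAG_SF) != 0;
    int of = (flags & FLAG_OF) != 0;
    int result;

    switch ((cc >> 1) & 7u) {
    case 0:  result = of;                 break;  /* O */
    case 1:  result = cf;                 break;  /* B, C, NAE */
    case 2:  result = zf;                 break;  /* E, Z */
    case 3:  result = cf || zf;           break;  /* BE, NA */
    case 4:  result = sf;                 break;  /* S */
    case 5:  result = pf;                 break;  /* P, PE */
    case 6:  result = (sf != of);         break;  /* L, NGE */
    default: result = zf || (sf != of);   break;  /* LE, NG */
    }

    /* The low bit of the code reverses the sense of the condition. */
    return (cc & 1u) ? !result : result;
}
```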
S{iref-ea} Effective Address Encoding: i{ModR/M} and i{SIB}
An i{effective address} is encoded in up to three parts: a ModR/M
byte, an optional SIB byte, and an optional byte, word or doubleword
displacement field.
The ModR/M byte consists of three fields: the c{mod} field, ranging
from 0 to 3, in the upper two bits of the byte, the c{r/m} field,
ranging from 0 to 7, in the lower three bits, and the spare
(register) field in the middle (bit 3 to bit 5). The spare field is
not relevant to the effective address being encoded, and either
contains an extension to the instruction opcode or the register
value of another operand.
The ModR/M system can be used to encode a direct register reference
rather than a memory access. This is always done by setting the
c{mod} field to 3 and the c{r/m} field to the register value of
the register in question (it must be a general-purpose register, and
the size of the register must already be implicit in the encoding of
the rest of the instruction). In this case, the SIB byte and
displacement field are both absent.
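The field layout just described can be sketched as a pair of C helpers. The names c{modrm_byte} and c{modrm_register} are hypothetical and exist only for this illustration.

```c
#include <stdint.h>

/* Pack the three ModR/M fields: mod in bits 7-6, the spare (register)
   field in bits 5-3, and r/m in bits 2-0. */
uint8_t modrm_byte(unsigned mod, unsigned spare, unsigned rm)
{
    return (uint8_t)(((mod & 3u) << 6) | ((spare & 7u) << 3) | (rm & 7u));
}

/* Direct register reference: mod is 3 and r/m is the register value. */
uint8_t modrm_register(unsigned spare, unsigned reg_value)
{
    return modrm_byte(3, spare, reg_value);
}
```

For instance, a c{/0} form applied directly to c{ESP} (register value 4) gives c{modrm_register(0, 4)}, which is the byte c{C4}.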
In 16-bit addressing mode (either c{BITS 16} with no c{67} prefix,
or c{BITS 32} with a c{67} prefix), the SIB byte is never used.
The general rules for c{mod} and c{r/m} (there is an exception,
given below) are:
b The c{mod} field gives the length of the displacement field: 0
means no displacement, 1 means one byte, and 2 means two bytes.
b The c{r/m} field encodes the combination of registers to be
added to the displacement to give the accessed address: 0 means
c{BX+SI}, 1 means c{BX+DI}, 2 means c{BP+SI}, 3 means c{BP+DI},
4 means c{SI} only, 5 means c{DI} only, 6 means c{BP} only, and 7
means c{BX} only.
However, there is a special case:
b If c{mod} is 0 and c{r/m} is 6, the effective address encoded
is not c{[BP]} as the above rules would suggest, but instead
c{[disp16]}: the displacement field is present and is two bytes
long, and no registers are added to the displacement.
Therefore the effective address c{[BP]} cannot be encoded as
efficiently as c{[BX]}; so if you code c{[BP]} in a program, NASM
adds a notional 8-bit zero displacement, and sets c{mod} to 1,
c{r/m} to 6, and the one-byte displacement field to 0.
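The 16-bit rules, including the c{[BP]} special case, can be sketched as follows. This is a simplified model and not NASM's actual code: c{encode_ea16} and c{RM16_NONE} are hypothetical names, the caller supplies the c{r/m} code directly, and the shortest displacement that fits is always chosen.

```c
#include <stddef.h>
#include <stdint.h>

#define RM16_NONE (-1)  /* hypothetical marker: no registers, bare [disp16] */

/* Encode a 16-bit effective address. rm16 is the r/m code from the list
   above (0 = BX+SI ... 7 = BX) or RM16_NONE. Writes the ModR/M byte and
   any displacement to out and returns the number of bytes (1 to 3). */
size_t encode_ea16(unsigned spare, int rm16, int disp, uint8_t out[3])
{
    unsigned mod, rm;
    unsigned u = (unsigned)disp;   /* two's-complement bit pattern */
    size_t n = 0;

    if (rm16 == RM16_NONE) {
        mod = 0;                   /* special case: mod 0, r/m 6 */
        rm = 6;                    /* means [disp16] */
    } else {
        rm = (unsigned)rm16 & 7u;
        if (disp == 0 && rm != 6)
            mod = 0;               /* no displacement field */
        else if (disp >= -128 && disp <= 127)
            mod = 1;               /* one byte; [BP] lands here with 0 */
        else
            mod = 2;               /* two bytes */
    }

    out[n++] = (uint8_t)((mod << 6) | ((spare & 7u) << 3) | rm);
    if (mod == 1) {
        out[n++] = (uint8_t)(u & 0xFFu);
    } else if (mod == 2 || rm16 == RM16_NONE) {
        out[n++] = (uint8_t)(u & 0xFFu);          /* little-endian word */
        out[n++] = (uint8_t)((u >> 8) & 0xFFu);
    }
    return n;
}
```

Under these assumptions c{[BX]} encodes as the single byte c{07} (with a spare field of 0), while c{[BP]} needs the two bytes c{46 00}.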
In 32-bit addressing mode (either c{BITS 16} with a c{67} prefix,
or c{BITS 32} with no c{67} prefix) the general rules (again,
there are exceptions) for c{mod} and c{r/m} are:
b The c{mod} field gives the length of the displacement field: 0
means no displacement, 1 means one byte, and 2 means four bytes.
b If only one register is to be added to the displacement, and it
is not c{ESP}, the c{r/m} field gives its register value, and the
SIB byte is absent. If the c{r/m} field is 4 (which would encode
c{ESP}), the SIB byte is present and gives the combination and
scaling of registers to be added to the displacement.
If the SIB byte is present, it describes the combination of
registers (an optional base register, and an optional index register
scaled by multiplication by 1, 2, 4 or 8) to be added to the
displacement. The SIB byte is divided into the c{scale} field, in
the top two bits, the c{index} field in the next three, and the
c{base} field in the bottom three. The general rules are:
b The c{base} field encodes the register value of the base
register.
b The c{index} field encodes the register value of the index
register, unless it is 4, in which case no index register is used
(so c{ESP} cannot be used as an index register).
b The c{scale} field encodes the multiplier by which the index
register is scaled before adding it to the base and displacement: 0
encodes a multiplier of 1, 1 encodes 2, 2 encodes 4 and 3 encodes 8.
The exceptions to the 32-bit encoding rules are:
b If c{mod} is 0 and c{r/m} is 5, the effective address encoded
is not c{[EBP]} as the above rules would suggest, but instead
c{[disp32]}: the displacement field is present and is four bytes
long, and no registers are added to the displacement.
b If c{mod} is 0, c{r/m} is 4 (meaning the SIB byte is present)
and c{base} is 5, the effective address encoded is not
c{[EBP+index]} as the above rules would suggest, but instead
c{[disp32+index]}: the displacement field is present and is four
bytes long, and there is no base register (but the index register is
still processed in the normal way).
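The SIB layout and the two 32-bit exceptions can be sketched in C. The functions c{sib_byte} and c{disp_length32} are hypothetical names for this illustration. Note that both exceptions are keyed on the value 5, the register value of c{EBP}: once in the c{r/m} field and once in the SIB c{base} field.

```c
#include <stdint.h>

/* Pack a SIB byte: scale in bits 7-6, index in bits 5-3, base in bits
   2-0. multiplier must be 1, 2, 4 or 8; anything else returns -1. */
int sib_byte(unsigned multiplier, unsigned index, unsigned base)
{
    unsigned scale;

    switch (multiplier) {
    case 1:  scale = 0; break;
    case 2:  scale = 1; break;
    case 4:  scale = 2; break;
    case 8:  scale = 3; break;
    default: return -1;
    }
    return (int)((scale << 6) | ((index & 7u) << 3) | (base & 7u));
}

/* For a 32-bit memory operand (mod 0, 1 or 2), return the length of the
   displacement field in bytes and set *has_base to 0 when no base
   register takes part. sib_base is ignored unless rm is 4. Returns -1
   for mod 3, which is a direct register reference. */
int disp_length32(unsigned mod, unsigned rm, unsigned sib_base, int *has_base)
{
    int sib_present = (rm == 4);

    *has_base = 1;
    if (mod > 2)
        return -1;
    if (mod == 0) {
        if (rm == 5 || (sib_present && sib_base == 5)) {
            *has_base = 0;     /* [disp32] or [disp32+index] */
            return 4;
        }
        return 0;
    }
    return (mod == 1) ? 1 : 4; /* mod 2 means four bytes here */
}
```

As an example, c{[EAX+ECX*4]} uses c{sib_byte(4, 1, 0)}, which is c{88}, and a plain c{[ESP]} uses c{sib_byte(1, 4, 4)}, which is c{24}.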
H{iref-flg} Key to Instruction Flags
Given along with each instruction in this appendix is a set of
flags, denoting the type of the instruction. The types are as follows:
b c{8086}, c{186}, c{286}, c{386}, c{486}, c{PENT} and c{P6}
denote the lowest processor type that supports the instruction. Most
instructions run on all processors above the given type; those that
do not are documented. The Pentium II contains no additional
instructions beyond the P6 (Pentium Pro); from the point of view of
its instruction set, it can be thought of as a P6 with MMX
capability.
b c{CYRIX} indicates that the instruction is specific to Cyrix
processors, for example the extra MMX instructions in the Cyrix
extended MMX instruction set.
b c{FPU} indicates that the instruction is a floating-point one,
and will only run on machines with a coprocessor (automatically
including 486DX, Pentium and above).
b c{MMX} indicates that the instruction is an MMX one, and will
run on MMX-capable Pentium processors and the Pentium II.
b c{PRIV} indicates that the instruction is a protected-mode
management instruction. Many of these may only be used in protected
mode, or only at privilege level zero.
b c{UNDOC} indicates that the instruction is an undocumented one,
and not part of the official Intel Architecture; it may or may not
be supported on any given machine.
H{insAAA} ic{AAA}, ic{AAS}, ic{AAM}, ic{AAD}: ASCII
Adjustments
c AAA ; 37 [8086]
c AAS ; 3F [8086]
c AAD ; D5 0A [8086]
c AAD imm ; D5 ib [8086]
c AAM ; D4 0A [8086]
c AAM imm ; D4 ib [8086]
These instructions are used in conjunction with the add, subtract,
multiply and divide instructions to perform binary-coded decimal
arithmetic in e{unpacked} (one BCD digit per byte - easy to
translate to and from ASCII, hence the instruction names) form.
There are also packed BCD instructions c{DAA} and c{DAS}: see
k{insDAA}.
c{AAA} should be used after a one-byte c{ADD} instruction whose
destination was the c{AL} register: by means of examining the value
in the low nibble of c{AL} and also the auxiliary carry flag
c{AF}, it determines whether the addition has overflowed, and
adjusts it (and sets the carry flag) if so. You can add long BCD
strings together by doing c{ADD}/c{AAA} on the low digits, then
doing c{ADC}/c{AAA} on each subsequent digit.
c{AAS} works similarly to c{AAA}, but is for use after c{SUB}
instructions rather than c{ADD}.
c{AAM} is for use after you have multiplied two decimal digits
together and left the result in c{AL}: it divides c{AL} by ten and
stores the quotient in c{AH}, leaving the remainder in c{AL}. The
divisor 10 can be changed by specifying an operand to the
instruction: a particularly handy use of this is c{AAM 16}, causing
the two nibbles in c{AL} to be separated into c{AH} and c{AL}.
c{AAD} performs the inverse operation to c{AAM}: it multiplies
c{AH} by ten, adds it to c{AL}, and sets c{AH} to zero. Again,
the multiplier 10 can be changed.
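The arithmetic performed by c{AAM} and c{AAD} can be modelled in C as below. This is a sketch of the register effects only (flags are ignored), with c{AX} passed as a 16-bit value holding c{AH} in its high byte; the function names c{aam} and c{aad} are simply chosen to match the mnemonics.

```c
#include <stdint.h>

/* AAM imm: AH = AL / base, AL = AL % base (base is 10 by default).
   base must be non-zero: the real instruction faults on a zero divisor. */
uint16_t aam(uint16_t ax, uint8_t base)
{
    uint8_t al = (uint8_t)(ax & 0xFFu);

    return (uint16_t)(((unsigned)(al / base) << 8) | (unsigned)(al % base));
}

/* AAD imm: AL = (AL + AH * base) modulo 256, AH = 0. */
uint16_t aad(uint16_t ax, uint8_t base)
{
    unsigned al = ax & 0xFFu;
    unsigned ah = (ax >> 8) & 0xFFu;

    return (uint16_t)((al + ah * base) & 0xFFu);
}
```

So c{aam(63, 10)} yields c{0x0603} (the digits 6 and 3 of 7 times 9), c{aad(0x0603, 10)} turns that back into 63, and c{aam(0xA7, 16)} splits the nibbles into c{0x0A07}.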
H{insADC} ic{ADC}: Add with Carry
c ADC r/m8,reg8 ; 10 /r [8086]
c ADC r/m16,reg16 ; o16 11 /r [8086]
c ADC r/m32,reg32 ; o32 11 /r [386]
c ADC reg8,r/m8 ; 12 /r [8086]
c ADC reg16,r/m16 ; o16 13 /r [8086]
c ADC reg32,r/m32 ; o32 13 /r [386]
c ADC r/m8,imm8 ; 80 /2 ib [8086]
c ADC r/m16,imm16 ; o16 81 /2 iw [8086]
c ADC r/m32,imm32 ; o32 81 /2 id [386]
c ADC r/m16,imm8 ; o16 83 /2 ib [8086]
c ADC r/m32,imm8 ; o32 83 /2 ib [386]
c ADC AL,imm8 ; 14 ib [8086]
c ADC AX,imm16 ; o16 15 iw [8086]
c ADC EAX,imm32 ; o32 15 id [386]
c{ADC} performs integer addition: it adds its two operands
together, plus the value of the carry flag, and leaves the result in
its destination (first) operand. The flags are set according to the
result of the operation: in particular, the carry flag is affected
and can be used by a subsequent c{ADC} instruction.
In the forms with an 8-bit immediate second operand and a longer
first operand, the second operand is considered to be signed, and is
sign-extended to the length of the first operand. In these cases,
the c{BYTE} qualifier is necessary to force NASM to generate this
form of the instruction.
To add two numbers without also adding the contents of the carry
flag, use c{ADD} (k{insADD}).
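The usual reason to chain c{ADC} after c{ADD} is multi-word arithmetic. The C sketch below models a 64-bit addition done as a 32-bit c{ADD} on the low words followed by a 32-bit c{ADC} on the high words; the function name c{add64_via_adc} is hypothetical, and a local variable stands in for the carry flag.

```c
#include <stdint.h>

/* Adds two 64-bit numbers held as {low, high} pairs of 32-bit words. */
void add64_via_adc(const uint32_t a[2], const uint32_t b[2], uint32_t sum[2])
{
    uint32_t carry;

    sum[0] = a[0] + b[0];               /* ADD: wraps modulo 2^32 */
    carry = (sum[0] < a[0]) ? 1u : 0u;  /* carry out of the low word */
    sum[1] = a[1] + b[1] + carry;       /* ADC: adds the carry as well */
}
```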
H{insADD} ic{ADD}: Add Integers
c ADD r/m8,reg8 ; 00 /r [8086]
c ADD r/m16,reg16 ; o16 01 /r [8086]
c ADD r/m32,reg32 ; o32 01 /r [386]
c ADD reg8,r/m8 ; 02 /r [8086]
c ADD reg16,r/m16 ; o16 03 /r [8086]
c ADD reg32,r/m32 ; o32 03 /r [386]
c ADD r/m8,imm8 ; 80 /0 ib [8086]
c ADD r/m16,imm16 ; o16 81 /0 iw [8086]
c ADD r/m32,imm32 ; o32 81 /0 id [386]
c ADD r/m16,imm8 ; o16 83 /0 ib [8086]
c ADD r/m32,imm8 ; o32 83 /0 ib [386]
c ADD AL,imm8 ; 04 ib [8086]
c ADD AX,imm16 ; o16 05 iw [8086]
c ADD EAX,imm32 ; o32 05 id [386]
c{ADD} performs integer addition: it adds its two operands
together, and leaves the result in its destination (first) operand.
The flags are set according to the result of the operation: in
particular, the carry flag is affected and can be used by a
subsequent c{ADC} instruction (k{insADC}).
In the forms with an 8-bit immediate second operand and a longer
first operand, the second operand is considered to be signed, and is
sign-extended to the length of the first operand. In these cases,
the c{BYTE} qualifier is necessary to force NASM to generate this
form of the instruction.
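To make the difference between the two immediate forms concrete, the following C sketch emits the bytes for c{ADD reg32,imm} with a register destination in c{BITS 32} state (so no c{66} prefix appears). It is an illustration under those assumptions, not NASM's encoder: c{encode_add_reg32_imm} and c{sign_extend8} are hypothetical names, and the c{use_byte_form} argument stands in for the c{BYTE} qualifier.

```c
#include <stddef.h>
#include <stdint.h>

/* How the processor widens the one-byte immediate of the 83 /0 ib form. */
int32_t sign_extend8(uint8_t b)
{
    return (b & 0x80u) ? (int32_t)b - 256 : (int32_t)b;
}

/* Emit ADD reg32,imm. Returns the number of bytes written: 3 for the
   83 /0 ib form, 6 for the 81 /0 id form, or 0 if the byte form was
   requested but imm does not fit in a signed byte. */
size_t encode_add_reg32_imm(unsigned reg_value, int32_t imm,
                            int use_byte_form, uint8_t out[6])
{
    uint8_t modrm = (uint8_t)(0xC0u | (reg_value & 7u)); /* mod 3, spare 0 */
    uint32_t u = (uint32_t)imm;
    size_t n = 0;

    if (use_byte_form) {
        if (imm < -128 || imm > 127)
            return 0;
        out[n++] = 0x83;
        out[n++] = modrm;
        out[n++] = (uint8_t)(u & 0xFFu);
        return n;
    }
    out[n++] = 0x81;
    out[n++] = modrm;
    out[n++] = (uint8_t)(u & 0xFFu);            /* little-endian doubleword */
    out[n++] = (uint8_t)((u >> 8) & 0xFFu);
    out[n++] = (uint8_t)((u >> 16) & 0xFFu);
    out[n++] = (uint8_t)((u >> 24) & 0xFFu);
    return n;
}
```

For c{ADD ESP,16} this produces c{81 C4 10 00 00 00}, and for c{ADD ESP,BYTE 16} it produces the shorter c{83 C4 10}.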
H{insAND} ic{AND}: Bitwise AND
c AND r/m8,reg8 ; 20 /r [8086]
c AND r/m16,reg16 ; o16 21 /r [8086]
c AND r/m32,reg32 ; o32 21 /r [386]
c AND reg8,r/m8 ; 22 /r [8086]
c AND reg16,r/m16 ; o16 23 /r [8086]
c AND reg32,r/m32 ; o32 23 /r [386]
c AND r/m8,imm8 ; 80 /4 ib [8086]
c AND r/m16,imm16 ; o16 81 /4 iw [8086]
c AND r/m32,imm32 ; o32 81 /4 id [386]
c AND r/m16,imm8 ; o16 83 /4 ib [8086]
c AND r/m32,imm8 ; o32 83 /4 ib [386]
c AND AL,imm8 ; 24 ib [8086]
c AND AX,imm16 ; o16 25 iw [8086]
c AND EAX,imm32 ; o32 25 id [386]
c{AND} performs a bitwise AND operation between its two operands
(i.e. each bit of the result is 1 if and only if the corresponding
bits of the two inputs were both 1), and stores the result in the
destination (first) operand.
In the forms with an 8-bit immediate second operand and a longer
first operand, the second operand is considered to be signed, and is
sign-extended to the length of the first operand. In these cases,
the c{BYTE} qualifier is necessary to force NASM to generate this
form of the instruction.
The MMX instruction c{PAND} (see k{insPAND}) performs the same
operation on the 64-bit MMX registers.
H{insARPL} ic{ARPL}: Adjust RPL Field of Selector
c ARPL r/m16,reg16 ; 63 /r [286,PRIV]
c{ARPL} expects its two word operands to be segment selectors. It
adjusts the RPL (requested privilege level - stored in the bottom
two bits of the selector) field of the destination (first) operand
to ensure that it is no less than (i.e. no more privileged than) the RPL
field of the source operand. The zero flag is set if and only if a
change had to be made.
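The adjustment can be modelled in C as shown below. This is a sketch of the selector arithmetic only, with the hypothetical function c{arpl} returning the new destination value and reporting the zero flag through a pointer.

```c
#include <stdint.h>

/* Model of ARPL dest,src. The RPL is the bottom two bits of a selector.
   Returns the adjusted destination; *zf receives the zero flag. */
uint16_t arpl(uint16_t dest, uint16_t src, int *zf)
{
    unsigned dest_rpl = dest & 3u;
    unsigned src_rpl = src & 3u;

    if (dest_rpl < src_rpl) {
        *zf = 1;                                   /* a change was made */
        return (uint16_t)((dest & ~3u) | src_rpl); /* raise RPL to match */
    }
    *zf = 0;
    return dest;
}
```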
H{insBOUND} ic{BOUND}: Check Array Index against Bounds
c BOUND reg16,mem ; o16 62 /r [186]
c BOUND reg32,mem ; o32 62 /r [386]
c{BOUND} expects its second operand to point to an area of memory
containing two signed values of the same size as its first operand
(i.e. two words for the 16-bit form; two doublewords for the 32-bit
form). It performs two signed comparisons: if the value in the
register passed as its first operand is less than the first of the
in-memory values, or is greater than the second, it
throws a BR exception. Otherwise, it does nothing.
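The check can be written out in C as below for the 32-bit form. The function name c{bound_faults32} is hypothetical. The sketch follows Intel's documented operation for c{BOUND}, in which both bounds are inclusive, so an index equal to the upper bound is accepted.

```c
#include <stdint.h>

/* bounds[0] is the lower bound and bounds[1] the upper bound, as laid
   out in memory. Returns 1 if BOUND would raise the BR exception. */
int bound_faults32(int32_t index, const int32_t bounds[2])
{
    return index < bounds[0] || index > bounds[1];
}
```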
H{insBSF} ic{BSF}, ic{BSR}: Bit Scan
c BSF reg16,r/m16 ; o16 0F BC /r [386]
c BSF reg32,r/m32 ; o32 0F BC /r [386]
c BSR reg16,r/m16 ; o16 0F BD /r [386]
c BSR reg32,r/m32 ; o32 0F BD /r [386]
c{BSF} searches for a set bit in its source (second) operand,
starting from the bottom, and if it finds one, stores the index in
its destination (first) operand. If no set bit is found, the
contents of the destination operand are undefined.
c{BSR} performs the same function, but searches from the top
instead, so it finds the most significant set bit.
Bit indices are from 0 (least significant) to 15 or 31 (most
significant).
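A straightforward C model of the two scans, for the 32-bit forms, is given below. The names c{bsf32} and c{bsr32} are hypothetical, and the return value of -1 is a convention of this sketch for the case where no bit is set (where the real instructions leave the destination undefined).

```c
#include <stdint.h>

/* Index of the least significant set bit, or -1 if src is zero. */
int bsf32(uint32_t src)
{
    int i;

    for (i = 0; i < 32; i++)
        if (src & ((uint32_t)1 << i))
            return i;
    return -1;
}

/* Index of the most significant set bit, or -1 if src is zero. */
int bsr32(uint32_t src)
{
    int i;

    for (i = 31; i >= 0; i--)
        if (src & ((uint32_t)1 << i))
            return i;
    return -1;
}
```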
H{insBSWAP} ic{BSWAP}: Byte Swap
c BSWAP reg32 ; o32 0F C8+r [486]
c{BSWAP} swaps the order of the four bytes of a 32-bit register:
bits 0-7 exchange places with bits 24-31, and bits 8-15 swap with
bits 16-23. There is no explicit 16-bit equivalent: to byte-swap
c{AX}, c{BX}, c{CX} or c{DX}, c{XCHG} can be used.
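In C terms the byte movement looks like this. The helper names c{bswap32} and c{swap16} are hypothetical; c{swap16} models what exchanging the two halves of a 16-bit register (for example c{XCHG AL,AH}) does to c{AX}.

```c
#include <stdint.h>

/* Reverse the four bytes of a 32-bit value, as BSWAP does. */
uint32_t bswap32(uint32_t x)
{
    return (x >> 24)
         | ((x >> 8) & 0x0000FF00u)
         | ((x << 8) & 0x00FF0000u)
         | (x << 24);
}

/* Exchange the two bytes of a 16-bit value. */
uint16_t swap16(uint16_t x)
{
    return (uint16_t)(((unsigned)x >> 8) | (((unsigned)x << 8) & 0xFF00u));
}
```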
H{insBT} ic{BT}, ic{BTC}, ic{BTR}, ic{BTS}: Bit Test
c BT r/m16,reg16 ; o16 0F A3 /r [386]
c BT r/m32,reg32 ; o32 0F A3 /r [386]
c BT r/m16,imm8 ; o16 0F BA /4 ib [386]
c BT r/m32,imm8 ; o32 0F BA /4 ib [386]
c BTC r/m16,reg16 ; o16 0F BB /r [386]
c BTC r/m32,reg32 ; o32 0F BB /r [386]
c BTC r/m16,imm8 ; o16 0F BA /7 ib [386]
c BTC r/m32,imm8 ; o32 0F BA /7 ib [386]
c BTR r/m16,reg16 ; o16 0F B3 /r [386]
c BTR r/m32,reg32 ; o32 0F B3 /r [386]
c BTR r/m16,imm8 ; o16 0F BA /6 ib [386]
c BTR r/m32,imm8 ; o32 0F BA /6 ib [386]
c BTS r/m16,reg16 ; o16 0F AB /r [386]
c BTS r/m32,reg32 ; o32 0F AB /r [386]
c BTS r/m16,imm8 ; o16 0F BA /5 ib [386]
c BTS r/m32,imm8 ; o32 0F BA /5 ib [386]
These instructions all test one bit of their first operand, whose
index is given by the second operand, and store the value of that
bit into the carry flag. Bit indices are from 0 (least significant)
to 15 or 31 (most significant).
In addition to storing the original value of the bit into the carry
flag, c{BTR} also resets (clears) the bit in the operand itself.
c{BTS} sets the bit, and c{BTC} complements the bit. c{BT} does
not modify its operands.
The bit offset should be less than the size of the operand in bits.
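The four instructions differ only in what they do to the tested bit afterwards, which the following C sketch of the 32-bit forms makes explicit. The names c{bit_op} and c{bit_test32} are hypothetical, and rejecting indices above 31 with -1 is a choice made for this sketch rather than a description of the processor.

```c
#include <stdint.h>

enum bit_op { OP_BT, OP_BTC, OP_BTR, OP_BTS };

/* Apply one bit-test operation to *operand. Returns the original value
   of the selected bit (what ends up in the carry flag), or -1 if index
   is not in the range 0 to 31. */
int bit_test32(enum bit_op op, uint32_t *operand, unsigned index)
{
    uint32_t mask;
    int carry;

    if (index > 31)
        return -1;
    mask = (uint32_t)1 << index;
    carry = (*operand & mask) != 0;

    switch (op) {
    case OP_BTC: *operand ^= mask;  break;  /* complement the bit */
    case OP_BTR: *operand &= ~mask; break;  /* reset (clear) the bit */
    case OP_BTS: *operand |= mask;  break;  /* set the bit */
    case OP_BT:                     break;  /* operand unchanged */
    }
    return carry;
}
```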
H{insCALL} ic{CALL}: Call Subroutine
c CALL imm ; E8 rw/rd [8086]
c CALL imm:imm16 ; o16 9A iw iw [8086]
c CALL imm:imm32 ; o32 9A id iw [386]
c CALL FAR mem16 ; o16 FF /3 [8086]
c CALL FAR mem32 ; o32 FF /3 [386]
c CALL r/m16 ; o16 FF /2 [8086]
c CALL r/m32 ; o32 FF /2 [386]
c{CALL} calls a subroutine, by means of pushing the current
instruction pointer (c{IP}) and optionally c{CS} as well on the
stack, and then jumping to a given address.