pcreposix.3
- .TH PCRE 3
- .SH NAME
- pcreposix - POSIX API for Perl-compatible regular expressions.
- .SH SYNOPSIS
- .B #include <pcreposix.h>
- .PP
- .SM
- .br
- .B int regcomp(regex_t *\fIpreg\fR, const char *\fIpattern\fR,
- .ti +5n
- .B int \fIcflags\fR);
- .PP
- .br
- .B int regexec(regex_t *\fIpreg\fR, const char *\fIstring\fR,
- .ti +5n
- .B size_t \fInmatch\fR, regmatch_t \fIpmatch\fR[], int \fIeflags\fR);
- .PP
- .br
- .B size_t regerror(int \fIerrcode\fR, const regex_t *\fIpreg\fR,
- .ti +5n
- .B char *\fIerrbuf\fR, size_t \fIerrbuf_size\fR);
- .PP
- .br
- .B void regfree(regex_t *\fIpreg\fR);
- .SH DESCRIPTION
- This set of functions provides a POSIX-style API to the PCRE regular expression
- package. See the \fBpcre\fR documentation for a description of the native API,
- which contains additional functionality.
- .PP
- The functions described here are just wrapper functions that ultimately call
- the native API. Their prototypes are defined in the \fBpcreposix.h\fR header
- file, and on Unix systems the library itself is called \fBlibpcreposix.a\fR, so
- it can be accessed by adding \fB-lpcreposix\fR to the command for linking an
- application that uses them. Because the POSIX functions call the native ones,
- it is also necessary to add \fB-lpcre\fR.
- .PP
- I have implemented only those option bits that can be reasonably mapped to PCRE
- native options. In addition, the options REG_EXTENDED and REG_NOSUB are defined
- with the value zero. They have no effect, but since programs that are written
- to the POSIX interface often use them, this makes it easier to slot in PCRE as
- a replacement library. Other POSIX options are not even defined.
- .PP
- When PCRE is called via these functions, it is only the API that is POSIX-like
- in style. The syntax and semantics of the regular expressions themselves are
- still those of Perl, subject to the setting of various PCRE options, as
- described below.
- .PP
- The header for these functions is supplied as \fBpcreposix.h\fR to avoid any
- potential clash with other POSIX libraries. It can, of course, be renamed or
- aliased as \fBregex.h\fR, which is the "correct" name. It provides two
- structure types, \fIregex_t\fR for compiled internal forms, and
- \fIregmatch_t\fR for returning captured substrings. It also defines some
- constants whose names start with "REG_"; these are used for setting options and
- identifying error codes.
- .SH COMPILING A PATTERN
- The function \fBregcomp()\fR is called to compile a pattern into an
- internal form. The pattern is a C string terminated by a binary zero, and
- is passed in the argument \fIpattern\fR. The \fIpreg\fR argument is a pointer
- to a regex_t structure which is used as a base for storing information about
- the compiled expression.
- .PP
- The argument \fIcflags\fR is either zero, or contains one or more of the bits
- defined by the following macros:
- .PP
- .B REG_ICASE
- .br
- The PCRE_CASELESS option is set when the expression is passed for compilation
- to the native function.
- .PP
- .B REG_NEWLINE
- .br
- The PCRE_MULTILINE option is set when the expression is passed for compilation
- to the native function.
- .PP
- In the absence of these flags, no options are passed to the native function.
- This means that the regex is compiled with PCRE default semantics. In
- particular, the way it handles newline characters in the subject string is the
- Perl way, not the POSIX way. Note that setting PCRE_MULTILINE has only
- \fIsome\fR of the effects specified for REG_NEWLINE. It does not affect the way
- newlines are matched by . (they aren't) or a negative class such as [^a] (they
- are).
- .PP
- The yield of \fBregcomp()\fR is zero on success, and non-zero otherwise. The
- \fIpreg\fR structure is filled in on success, and one member of the structure
- is publicized: \fIre_nsub\fR contains the number of capturing subpatterns in
- the regular expression. Various error codes are defined in the header file.
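- .PP
- As an illustration only, the compile step described above might be wrapped as
- in the following C sketch. The helper name count_captures and the
- USE_PCREPOSIX build switch are assumptions made for this example and are not
- part of PCRE. Without the switch, the system regex.h is used as a stand-in,
- in which case the pattern syntax is POSIX rather than Perl.

```c
#include <assert.h>
#include <stddef.h>

#ifdef USE_PCREPOSIX            /* hypothetical build switch, not defined by PCRE */
#include <pcreposix.h>          /* link with -lpcreposix -lpcre */
#else
#include <regex.h>              /* system POSIX header, used here as a stand-in */
#endif

/* Hypothetical helper: compile `pattern` and return the number of capturing
   subpatterns, or -1 if regcomp() reports an error. */
long count_captures(const char *pattern, int caseless)
{
    regex_t re;
    long nsub;
    /* REG_EXTENDED is zero under pcreposix; other libraries need it for
       parentheses to act as grouping operators. */
    int cflags = REG_EXTENDED;

    assert(pattern != NULL);
    if (caseless)
        cflags |= REG_ICASE;    /* passed on as PCRE_CASELESS */

    if (regcomp(&re, pattern, cflags) != 0)
        return -1;              /* re is only filled in on success */

    nsub = (long)re.re_nsub;    /* the one publicized member of regex_t */
    regfree(&re);
    return nsub;
}
```

- .PP
- With this sketch, count_captures("(a)(b)", 0) should yield 2, and a pattern
- with an unmatched parenthesis such as "(a" should yield -1.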
- .SH MATCHING A PATTERN
- The function \fBregexec()\fR is called to match a pre-compiled pattern
- \fIpreg\fR against a given \fIstring\fR, which is terminated by a zero byte,
- subject to the options in \fIeflags\fR. These can be:
- .PP
- .B REG_NOTBOL
- .br
- The PCRE_NOTBOL option is set when calling the underlying PCRE matching
- function.
- .PP
- .B REG_NOTEOL
- .br
- The PCRE_NOTEOL option is set when calling the underlying PCRE matching
- function.
- .PP
- The portion of the string that was matched, and also any captured substrings,
- are returned via the \fIpmatch\fR argument, which points to an array of
- \fInmatch\fR structures of type \fIregmatch_t\fR, containing the members
- \fIrm_so\fR and \fIrm_eo\fR. These contain the offset to the first character of
- each substring and the offset to the first character after the end of each
- substring, respectively. The 0th element of the vector relates to the entire
- portion of \fIstring\fR that was matched; subsequent elements relate to the
- capturing subpatterns of the regular expression. Unused entries in the array
- have both structure members set to -1.
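- .PP
- The offsets in \fIpmatch\fR can be turned into substrings as in the following
- minimal sketch. The helper name copy_group, the MAX_GROUPS limit and the
- USE_PCREPOSIX build switch are assumptions made for this example; without
- the switch, the system regex.h is used as a stand-in.

```c
#include <assert.h>
#include <string.h>

#ifdef USE_PCREPOSIX            /* hypothetical build switch, not defined by PCRE */
#include <pcreposix.h>          /* link with -lpcreposix -lpcre */
#else
#include <regex.h>              /* system POSIX header, used here as a stand-in */
#endif

/* Hypothetical helper: copy capture `group` of the first match of `pattern`
   in `subject` into `out`, truncating to fit. Returns the number of bytes
   copied, or -1 if there is no match, the entry is unused, or an error occurs. */
int copy_group(const char *pattern, const char *subject, size_t group,
               char *out, size_t out_size)
{
    enum { MAX_GROUPS = 10 };   /* arbitrary limit chosen for this sketch */
    regex_t re;
    regmatch_t pm[MAX_GROUPS];
    int len = -1;

    assert(pattern != NULL && subject != NULL && out != NULL);
    if (group >= MAX_GROUPS || out_size == 0)
        return -1;
    if (regcomp(&re, pattern, REG_EXTENDED) != 0)
        return -1;

    /* Entry 0 is the whole match; unused entries have rm_so set to -1. */
    if (regexec(&re, subject, MAX_GROUPS, pm, 0) == 0 && pm[group].rm_so != -1) {
        size_t n = (size_t)(pm[group].rm_eo - pm[group].rm_so);
        if (n >= out_size)
            n = out_size - 1;   /* leave room for the terminating zero */
        memcpy(out, subject + pm[group].rm_so, n);
        out[n] = '\0';
        len = (int)n;
    }

    regfree(&re);
    return len;
}
```

- .PP
- Because \fIrm_eo\fR is the offset of the first character after the substring,
- the length is simply \fIrm_eo\fR minus \fIrm_so\fR, with no adjustment needed.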
- .PP
- A successful match yields a zero return; various error codes are defined in the
- header file, of which REG_NOMATCH is the "expected" failure code.
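- .PP
- One common use of REG_NOTBOL, sketched here under stated assumptions, is to
- scan a string for successive matches: after the first match the search
- resumes part-way through the string, so that position must not be treated as
- the beginning of a line. The helper name count_matches and the USE_PCREPOSIX
- build switch are hypothetical; without the switch, the system regex.h is
- used as a stand-in.

```c
#include <assert.h>
#include <stddef.h>

#ifdef USE_PCREPOSIX            /* hypothetical build switch, not defined by PCRE */
#include <pcreposix.h>          /* link with -lpcreposix -lpcre */
#else
#include <regex.h>              /* system POSIX header, used here as a stand-in */
#endif

/* Hypothetical helper: count the non-overlapping matches of `pattern` in
   `subject`, or return -1 if the pattern does not compile. */
int count_matches(const char *pattern, const char *subject)
{
    regex_t re;
    regmatch_t pm[1];
    const char *p = subject;
    int eflags = 0;
    int count = 0;

    assert(pattern != NULL && subject != NULL);
    if (regcomp(&re, pattern, REG_EXTENDED) != 0)
        return -1;

    /* Any non-zero return ends the scan: REG_NOMATCH is the expected case,
       and this sketch treats other error codes the same way. */
    while (regexec(&re, p, 1, pm, eflags) == 0) {
        count++;
        if (pm[0].rm_eo > pm[0].rm_so) {
            p += pm[0].rm_eo;           /* resume after the match */
        } else {
            if (p[pm[0].rm_eo] == '\0')
                break;                  /* empty match at end of string */
            p += pm[0].rm_eo + 1;       /* step past an empty match */
        }
        eflags = REG_NOTBOL;            /* p no longer starts the subject */
    }

    regfree(&re);
    return count;
}
```

- .PP
- With REG_NOTBOL set on the later calls, the pattern "^a" should be counted
- once in "aaa" rather than three times.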
- .SH ERROR MESSAGES
- The \fBregerror()\fR function maps a non-zero error code from either
- \fBregcomp()\fR or \fBregexec()\fR to a printable message. If \fIpreg\fR is not
- NULL, the error should have arisen from the use of that structure. A message
- terminated by a binary zero is placed in \fIerrbuf\fR. The length of the
- message, including the zero, is limited to \fIerrbuf_size\fR. The yield of the
- function is the size of buffer needed to hold the whole message.
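- .PP
- Because the yield is the size needed for the whole message, a caller can ask
- for the size first and allocate afterwards, as in the sketch below. It
- assumes, as POSIX specifies, that a zero \fIerrbuf_size\fR causes
- \fIerrbuf\fR to be ignored. The helper name compile_error_message and the
- USE_PCREPOSIX build switch are hypothetical; without the switch, the system
- regex.h is used as a stand-in.

```c
#include <assert.h>
#include <stdlib.h>

#ifdef USE_PCREPOSIX            /* hypothetical build switch, not defined by PCRE */
#include <pcreposix.h>          /* link with -lpcreposix -lpcre */
#else
#include <regex.h>              /* system POSIX header, used here as a stand-in */
#endif

/* Hypothetical helper: return a malloc()ed message describing why `pattern`
   fails to compile, or NULL if it compiles or memory cannot be obtained.
   The caller frees the result. */
char *compile_error_message(const char *pattern)
{
    regex_t re;
    size_t needed;
    char *msg;
    int rc;

    assert(pattern != NULL);
    rc = regcomp(&re, pattern, REG_EXTENDED);
    if (rc == 0) {
        regfree(&re);           /* compiled fine, so there is no message */
        return NULL;
    }

    /* First call: zero-sized buffer, so only the required size is returned. */
    needed = regerror(rc, &re, NULL, 0);
    msg = malloc(needed);
    if (msg != NULL)
        regerror(rc, &re, msg, needed);     /* second call fills the buffer */
    return msg;
}
```

- .PP
- The wording of the message depends on the library in use, so callers should
- not rely on particular text.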
- .SH STORAGE
- Compiling a regular expression causes memory to be allocated and associated
- with the \fIpreg\fR structure. The function \fBregfree()\fR frees all such
- memory, after which \fIpreg\fR may no longer be used as a compiled expression.
- .SH AUTHOR
- Philip Hazel <ph10@cam.ac.uk>
- .br
- University Computing Service,
- .br
- New Museums Site,
- .br
- Cambridge CB2 3QG, England.
- .br
- Phone: +44 1223 334714
- .PP
- Copyright (c) 1997-2000 University of Cambridge.