README.CONF
- WWWOFFLE - World Wide Web Offline Explorer - Version 2.5
- ========================================================
- If you are upgrading from version 1.x to version 2.x then you should read the
- file CHANGES.CONF which explains how to convert the sections in your existing
- wwwoffle.conf file to the new format.
- If you are upgrading from version 2.x to this version then the file CHANGES.CONF
- shows the new options.
- The configuration file (wwwoffle.conf) specifies all of the parameters that
- control the operation of the proxy server. The file is split into named
- sections, each of which can be empty or contain one or more lines of
- configuration information as described below. The order in which the sections
- appear in the file is not important.
- The general format of each of the sections is the same. The name of the section
- is on a line by itself to mark the start. The contents of the section are
- enclosed between a pair of lines containing the '{' and '}' characters or '['
- and ']' characters. When the '{' and '}' characters are used, the lines between
- them contain configuration information. When the '[' and ']' characters are
- used, there must be only a single non-empty line between them, containing the
- name of a file (in the same directory) that holds the configuration information.
- Comments are marked by a '#' character at the start of the line. Comments and
- blank lines are both ignored.
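As an illustration of the layout described above, the following sketch shows one section written inline with '{' and '}' and another that refers to a separate file with '[' and ']'. The file name censor.conf is a hypothetical example and the option value is only a placeholder.

```
# Comments start with '#' at the start of the line.
StartUp
{
 http-port = 8080
}

CensorHeader
[
censor.conf
]
```

In this sketch censor.conf would sit in the same directory as wwwoffle.conf and hold the lines that would otherwise appear between '{' and '}'.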
- StartUp
- -------
- This contains the parameters that are used when the program starts. Changes to
- these are ignored if the configuration file is re-read while the program is
- running.
- http-port = <port> ; An integer specifying the port for the
- HTTP proxy (default=8080).
- wwwoffle-port = <port> ; An integer specifying the port for
- WWWOFFLE control connections
- (default=8081).
- spool-dir = <dir> ; The name of the spool directory
- (default=/var/spool/wwwoffle).
- run-uid = <user> | <uid> ; The username or numeric uid to run the
- wwwoffled server as (default=none).
- run-gid = <group> | <gid> ; The groupname or numeric gid to run the
- wwwoffled server as (default=none).
- use-syslog = yes | no ; Whether to use the syslog facility for
- messages (default=yes).
- password = <word> ; The password used for authentication of
- the control message (default=none).
- max-servers = <integer> ; The maximum number of server processes
- that are started (default=8).
- max-fetch-servers = <integer> ; The maximum number of server processes
- that are started to fetch pages that
- were marked in offline mode (default=4).
- dir-perm = <octal int> ; The permissions to use when creating
- spool directories (default=0755).
- file-perm = <octal int> ; The permissions to use when creating
- spool files (default=0644).
- run-online = <filename> ; The name of a program to run when switched
- to online mode (default=none).
- run-offline = <filename> ; The name of a program to run when switched
- to offline mode (default=none).
- run-autodial = <filename> ; The name of a program to run when switched
- to autodial mode (default=none).
- Notes: For the password to work the configuration file must be set so that only
- authorised users can read it.
- : The run-uid/run-gid options are not applicable to win32 (Win95/98).
- : To use the run-uid/run-gid options the server must be started as root.
- : The max-fetch-servers value must be less than max-servers or you will
- not be able to use WWWOFFLE interactively online while fetching.
- : The dir-perm and file-perm options override the umask settings and must
- be in octal starting with a '0' character.
- : The programs run using the run-online, run-offline and run-autodial
- options are started with a single parameter set to the current mode.
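A possible StartUp section, written as a sketch from the parameters listed above. The user name, group name, password and program path are hypothetical placeholders; the other values are the documented defaults.

```
StartUp
{
 http-port         = 8080
 wwwoffle-port     = 8081
 spool-dir         = /var/spool/wwwoffle
# Hypothetical account names; the server must be started as root for these.
 run-uid           = wwwoffle
 run-gid           = wwwoffle
 use-syslog        = yes
# Hypothetical password; keep the file readable only by authorised users.
 password          = secret
# max-fetch-servers is kept below max-servers.
 max-servers       = 8
 max-fetch-servers = 4
# Octal values with a leading '0'.
 dir-perm          = 0755
 file-perm         = 0644
# Hypothetical script, called with the current mode as its single parameter.
 run-online        = /usr/local/bin/wwwoffle-mode-change
}
```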
- Options
- -------
- Options that control how the program works.
- log-level = debug | info | important | warning | fatal
- ; Log messages with this or higher priority
- (default=important).
- index-latest-days = <age> ; The number of days to display in the index of
- the latest pages (default=7 days).
- request-changed = <time> ; While online pages will only be fetched if the
- cached version is older than this specified
- time in seconds (default=600).
- request-changed-once= yes | no ; While online pages will only be fetched if the
- cached version has not already been fetched
- once this session (default=yes).
- request-expired = yes | no ; While online pages that have expired will
- always be requested again (default=no).
- request-no-cache = yes | no ; While online pages that ask not to be cached
- will always be requested again (default=no).
- pragma-no-cache = yes | no ; Whether to request a new copy of a page if the
- request from the browser has 'Pragma: no-cache'
- (default=yes).
- confirm-requests = yes | no ; Whether to return a page requiring user
- confirmation instead of automatically recording
- requests made while offline (default=no).
- connect-timeout = <time> ; The time in seconds that WWWOFFLE will wait for
- the socket connection to be made (default=30).
- socket-timeout = <time> ; The time in seconds that WWWOFFLE will wait for
- data before giving up on a socket connection
- (default=120).
- connect-retry = yes | no ; If a connection cannot be made to a remote
- server then try again after a short delay
- (default=no).
- ssl-allow-port = <integer>; A port number that can be used for Secure
- Socket Layer (SSL) connections, e.g. https.
- no-lasttime-index = yes | no; Disables creation of the lasttime/prevtime
- indexes (default=no).
- intr-download-keep = yes | no; If the browser closes the connection while
- online the currently downloaded partial page
- should be kept (default=no).
- intr-download-size =<integer>; If the browser closes the connection while
- online the page should continue to download if
- smaller than this size in kB (default=1).
- intr-download-percent=<integer>; If the browser closes the connection while
- online the page should continue to download if
- more than this amount complete (default=80).
- timeout-download-keep= yes | no; If the server connection times out while reading
- then the currently downloaded partial page
- should be kept (default=no).
- Notes: The request-changed option can be set negative to indicate that cached
- pages are always used while online.
- : The request-changed-once option takes precedence over the
- request-changed option.
- : The request-expired and request-no-cache options take precedence over
- the request-changed-once and request-changed options.
- : The pragma-no-cache option should be set to 'no' if a 'broken' browser
- re-requests all pages when browsing offline.
- : The ssl-allow-port should be set to 443 to allow https; there can be more
- than one ssl-allow-port entry for other ports as required.
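The Options section might look like the sketch below. Apart from the ssl-allow-port entry, which the notes suggest for https, the values shown are simply the documented defaults written out explicitly.

```
Options
{
 log-level            = important
 index-latest-days    = 7
 request-changed      = 600
 request-changed-once = yes
 request-expired      = no
 request-no-cache     = no
 pragma-no-cache      = yes
 confirm-requests     = no
 connect-timeout      = 30
 socket-timeout       = 120
 ssl-allow-port       = 443
}
```

With these settings request-changed-once takes precedence over request-changed, so a page already fetched in this session is not fetched again even if it is older than 600 seconds.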
- FetchOptions
- ------------
- Options that control what is downloaded when fetching pages that were requested
- while offline.
- stylesheets = yes | no ; If style sheets are to be fetched.
- images = yes | no ; If images are to be fetched.
- frames = yes | no ; If frames are to be fetched.
- scripts = yes | no ; If scripts (e.g. Javascript) are to be fetched.
- objects = yes | no ; If objects (e.g. Java class files) are to be fetched.
- Notes: These options all default to 'no' if nothing is specified.
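For example, a FetchOptions section that fetches the parts needed to display a page but leaves out scripts and objects could be sketched like this (the choice of which options to enable is illustrative):

```
FetchOptions
{
 stylesheets = yes
 images      = yes
 frames      = yes
 scripts     = no
 objects     = no
}
```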
- ModifyHTML
- ----------
- Options that control how the HTML that is provided from the cache is modified.
- enable-modify-html = yes | no ; Enable the HTML modifications in this
- section (has a speed penalty)
- (default=no).
- add-cache-info = yes | no ; At the bottom of all of the spooled pages
- the date that the page was cached and some
- buttons are to be added (default=no).
- anchor-cached-begin =<HTML code>; Anchors (links) that are cached are to
- have the specified HTML inserted at the
- beginning (default="").
- anchor-cached-end =<HTML code>; Anchors (links) that are cached are to
- have the specified HTML inserted at the
- end (default="").
- anchor-requested-begin =<HTML code>; Anchors (links) that have been requested
- are to have the specified HTML inserted at
- the beginning (default="").
- anchor-requested-end =<HTML code>; Anchors (links) that have been requested
- are to have the specified HTML inserted at
- the end (default="").
- anchor-not-cached-begin =<HTML code>; Anchors (links) that are not cached or
- requested are to have the specified HTML
- inserted at the beginning (default="").
- anchor-not-cached-end =<HTML code>; Anchors (links) that are not cached or
- requested are to have the specified HTML
- inserted at the end (default="").
- disable-script = yes | no; Removes all scripts and scripted events
- (default=no).
- disable-blink = yes | no; Removes the <blink> tag (default=no).
- disable-animated-gif = yes | no; Disables the animation of GIF files
- (default=no).
- Notes: These options all rely on the HTML being syntactically correct; if it is
- not then the result is undefined.
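A small ModifyHTML sketch using only the yes/no options described above. The anchor options are left at their empty defaults, and the selection of options is illustrative.

```
ModifyHTML
{
# Nothing else in this section takes effect unless this is enabled.
 enable-modify-html   = yes
 add-cache-info       = yes
 disable-script       = no
 disable-blink        = yes
 disable-animated-gif = yes
}
```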
- LocalHost
- ---------
- A list of hosts that the host running the wwwoffled server may be known by.
- This is so that the proxy does not need to contact itself to get the server
- local pages.
- <host> ; A hostname or IP address that in connection with the port number (in
- the StartUp section) specifies the WWWOFFLE proxy HTTP server.
- Notes: The host names must match exactly, no wildcard matches.
- : All of these hosts are also used the same way as those in the
- LocalNet and AllowedConnectHosts sections.
- : The first named host is used as the server name for several features
- so must be a name that will work from any client host on the network.
- : None of the entries here or in LocalNet are fetched via a proxy.
- LocalNet
- --------
- A list of hosts that are not to be cached by wwwoffled because they are on a
- local network.
- <host> ; A hostname or IP address that is not to be cached by the server.
- Notes: The host name matching uses wildcards (see the WILDCARD section).
- : A host can be excluded by adding a '!' to the start of the name; all
- possible aliases and IP addresses for the host are also required.
- : All entries here are assumed to be reachable even when offline.
- : All of the hosts in LocalHost are also not cached.
- : None of the entries here or in LocalHost are fetched via a proxy.
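The two sections might be combined as in the following sketch. All host names and addresses are hypothetical. The excluded host is listed before the wildcard entry on the assumption that earlier entries are matched first; this file does not state the matching order.

```
LocalHost
{
# The first name should work from any client on the network.
 proxy.example.lan
 localhost
 127.0.0.1
}

LocalNet
{
# Excluded host, so it is still cached (alias and IP address both listed).
 !public.example.lan
 !192.168.1.20
 *.example.lan
}
```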
- AllowedConnectHosts
- -------------------
- A list of client hosts that are allowed to connect to the server.
- <host> ; A hostname or IP address that is allowed to connect to the server.
- Notes: The host name matching uses wildcards (see the WILDCARD section).
- : A host can be excluded by adding a '!' to the start of the name; all
- possible aliases and IP addresses for the host are also required.
- : All of the hosts in LocalHost are also allowed to connect.
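A sketch of an AllowedConnectHosts section for a small network. The domain and address range are hypothetical, and writing an address range as a wildcard assumes that IP addresses are matched as text in the same way as host names.

```
AllowedConnectHosts
{
 *.example.lan
 192.168.1.*
}
```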
- AllowedConnectUsers
- -------------------
- A list of the users that are allowed to connect to the server.
- <username>:<password> ; The username and password of the users that are allowed
- to connect to the server.
- Notes: If this section is left empty then no user authentication is done.
- : The username and password are both stored in plaintext format.
- : This requires the use of browsers that handle the HTTP/1.1 standard.
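As a sketch, an AllowedConnectUsers section with two hypothetical users could look like this. Because the passwords are stored in plaintext, the file should be readable only by authorised users.

```
AllowedConnectUsers
{
 alice:secret1
 bob:secret2
}
```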
- DontCache
- ---------
- A list of URLs that are not to be cached by wwwoffled.
- URL-SPECIFICATION ; Do not cache any URLs that match this.
- Notes: See the bottom of this file for the description of URL-SPECIFICATION.
- : The URL-SPECIFICATION can be negated, see URL-SPECIFICATION description.
- : The files will still be cached if fetched non-interactively.
- DontGet
- -------
- A list of URLs that are not to be got by wwwoffled (because they contain only
- junk adverts for example).
- URL-SPECIFICATION [ = <URL> ] ; Do not get any URLs that match this [ with
- the option to specify a replacement URL ].
- replacement = <URL> ; The default URL to replace any URLs that match
- the URL-SPECIFICATIONs instead of using the
- standard error message (default=none).
- Notes: See the bottom of this file for the description of URL-SPECIFICATION.
- : The URL-SPECIFICATION can be negated, see URL-SPECIFICATION description.
- : The URL /local/images/trans-1x1.gif is a suggested replacement
- (a 1x1 pixel transparent gif).
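The following DontGet sketch shows the three forms described above. The host names and the path are hypothetical; the replacement URL is the one suggested in the notes.

```
DontGet
{
# Default replacement for matching URLs that have none of their own.
 replacement = /local/images/trans-1x1.gif
# Blocked, using the default replacement.
 *://ads.example.com
 *://*/adverts/*
# Blocked, with its own replacement URL.
 *://banners.example.com = /local/images/trans-1x1.gif
}
```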
- DontGetRecursive
- ----------------
- A list of URLs that are not to be got by wwwoffled when fetching recursively.
- URL-SPECIFICATION ; Do not recursively get any URLs that match this.
- Notes: See the bottom of this file for the description of URL-SPECIFICATION.
- : The URL-SPECIFICATION can be negated, see URL-SPECIFICATION description.
- DontRequestOffline
- ------------------
- A list of URLs that cannot be requested by users when offline.
- URL-SPECIFICATION ; Do not request any URLs that match this.
- Notes: See the bottom of this file for the description of URL-SPECIFICATION.
- : The URL-SPECIFICATION can be negated, see URL-SPECIFICATION description.
- CensorHeader
- ------------
- A list of HTTP header lines that are to be removed from the requests sent to web
- servers and the replies that come back from them.
- <header> = <string> ; A header field name (e.g. From, Cookie, Set-Cookie,
- User-Agent) and the string to replace the header
- value with.
- referer-self = yes | no ; Sets the Referer header to the same as the URL
- (default = no).
- referer-self-dir = yes | no ; Sets the Referer header to the URL directory name
- (default = no).
- Notes: The header is case sensitive, and does not have a ':' at the end.
- : The value of none or no string can be used to remove the header.
- : This only replaces headers it finds, it does not add any new ones.
- : The referer-self-dir option takes precedence over referer-self.
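A CensorHeader sketch based on the description above. The User-Agent replacement string is a made-up example; the header names are those given as examples in this section.

```
CensorHeader
{
# Header names are case sensitive and have no ':' at the end.
# A value of none removes the header.
 From         = none
 Cookie       = none
 Set-Cookie   = none
# Any other value replaces the header value.
 User-Agent   = ExampleBrowser/1.0
 referer-self = yes
}
```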
- FTPOptions
- ----------
- Options to use when fetching files using ftp.
- anon-username = <string> ; The username to use for anonymous ftp
- (default=anonymous).
- anon-password = <string> ; The password to use for anonymous ftp
- (default=<user>@<host>, determined at run time).
- auth-hostname = <host[:port]> ; A host to use a different username and password.
- auth-username = <string> ; The username to use on the above host.
- auth-password = <string> ; The password to use on the above host.
- Notes: The anon-password should be set to a sensible value especially if you
- are behind a firewall.
- : The auth-hostname, auth-username and auth-password options must come
- together as a triplet.
- : The auth-hostname must be exact, it is not used as a WILDCARD match.
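An FTPOptions sketch with one authenticated host. The e-mail address, host name, user name and password are all hypothetical.

```
FTPOptions
{
 anon-username = anonymous
 anon-password = user@example.com
# The three auth- options are given together as a triplet.
 auth-hostname = ftp.example.com
 auth-username = alice
 auth-password = secret
}
```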
- MIMETypes
- ---------
- MIME Types to use when fetching files not using HTTP.
- default = <mime-type>/<subtype> ; The default MIME type
- (default=text/plain).
- .<file-ext> = <mime-type>/<subtype> ; The MIME type to associate with a file
- extension.
- Notes: You must include the '.' in the file extension.
- : If more than one of the extensions match then the longest is used.
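A MIMETypes sketch that also illustrates the longest-match rule. The extensions and types chosen here are illustrative.

```
MIMETypes
{
 default = text/plain
 .html   = text/html
 .gif    = image/gif
 .gz     = application/x-gzip
 .tar.gz = application/x-tar
}
```

With these entries a file named archive.tar.gz matches both .gz and .tar.gz, and the longer .tar.gz entry is the one used.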
- Proxy
- -----
- This contains the names of the HTTP (or other) proxies to use external to the
- local machine.
- default = <host[:port]> ; The hostname and port on it to use as the
- default proxy.
- URL-SPECIFICATION = <host[:port]> ; The hostname and port on it to use as the
- proxy when getting URLs that match the
- URL-SPECIFICATION.
- auth-hostname = <host[:port]> ; A proxy server that uses proxy authentication,
- this is where the user must enter a username
- and password in the browser to use the proxy.
- auth-username = <string> ; The username to use on the above host.
- auth-password = <string> ; The password to use on the above host.
- ssl = <host[:port]> ; A proxy server that should be used for Secure
- Socket Layer (SSL) connections e.g. https.
- Notes: See the bottom of this file for the description of URL-SPECIFICATION.
- : A hostname that matches more than one entry here uses the proxy of the
- longest matching one (protocol is included in assessing length).
- : You can use none or no hostname to indicate that a default or particular
- protocol or host is not to use a proxy.
- : None of the hosts in LocalNet/LocalHost will be fetched via a proxy.
- : The auth-hostname, auth-username and auth-password options must come
- together as a triplet.
- : The auth-hostname must be exact, it is not used as a wildcard match.
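A Proxy sketch combining the options above. The proxy host names, port, user name and password are hypothetical.

```
Proxy
{
 default       = proxy.example.com:8080
# No proxy is used for ftp URLs.
 ftp://        = none
# A specific host matches a longer entry than the default.
 http://www.example.org = other-proxy.example.com:8080
 ssl           = proxy.example.com:8080
# The three auth- options are given together as a triplet.
 auth-hostname = proxy.example.com:8080
 auth-username = alice
 auth-password = secret
}
```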
- DontIndex
- ---------
- A list of URLs that are not to be indexed by wwwoffled.
- outgoing = URL-SPECIFICATION ; Do not index any URLs that match this in the
- outgoing index.
- latest = URL-SPECIFICATION ; Do not index any URLs that match this in the
- lasttime/prevtime/latest indexes.
- monitor = URL-SPECIFICATION ; Do not index any URLs that match this in the
- monitor index.
- host = URL-SPECIFICATION ; Do not index any URLs that match this in the
- host indexes.
- URL-SPECIFICATION ; Do not index any URLs that match this in any
- of the indexes.
- Notes: See the bottom of this file for the description of URL-SPECIFICATION.
- : The URL-SPECIFICATION can be negated, see URL-SPECIFICATION description.
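A DontIndex sketch with one per-index entry of each kind used and one entry that applies to every index. The host name is hypothetical.

```
DontIndex
{
# Keep images out of the lasttime/prevtime/latest indexes only.
 latest = *://*/*.gif
# Keep one host out of the host indexes only.
 host   = *://ads.example.com
# Keep style sheets out of all of the indexes.
 *://*/*.css
}
```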
- Alias
- -----
- A list of aliases that are used to replace the server name and path with another
- server name and path. Also for servers known by two names.
- URL-SPECIFICATION = URL-SPECIFICATION ; Any requests for the first URL-SPEC
- are replaced by the second URL-SPEC.
- Notes: See the bottom of this file for the description of URL-SPECIFICATION.
- : The URL-SPECIFICATIONs must match exactly, no WILDCARDs are used and the
- URL arguments are ignored.
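An Alias sketch for a server that is known by two names. Both names are hypothetical and, as noted above, no wildcards are used.

```
Alias
{
 http://www.example.com/ = http://mirror.example.com/
}
```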
- Purge
- -----
- The method to determine which pages to purge, the default age, the host specific
- maximum age of the pages in days, and the maximum cache size.
- use-mtime = yes | no ; The method to use to decide which files to
- purge, last access time (atime) or last
- modification time (mtime) (default=no).
- max-size = <size> ; The maximum size for the cache in MB after
- purging (default=0).
- min-free = <size> ; The minimum amount of free disk space in MB
- after purging (default=0).
- use-url = yes | no ; If true then use the URL to decide on the purge
- age, otherwise use the protocol and host only
- (default=no).
- del-dontget = yes | no ; If true then delete the files from hosts that
- are in the DontGet section (default=no).
- del-dontcache = yes | no ; If true then delete the files from hosts that
- are in the DontCache section (default=no).
- default = <age> ; The default maximum age of pages in days
- (default=14).
- URL-SPECIFICATION = <age> ; The maximum age of pages that match the
- URL-SPECIFICATION.
- Notes: See the bottom of this file for the description of URL-SPECIFICATION.
- : A hostname that matches more than one entry here uses the age of the
- longest matching one (protocol is included in assessing length).
- : An age of zero means not to keep, negative not to delete.
- : A maximum cache size of 0 means there is no limit to the size.
- : A minimum disk free of 0 means there is no limit to the free space.
- : If the max-size and min-free options are both used the smaller cache size
- is chosen.
- : The max-size and min-free options take into account the hosts that
- are never purged when measuring the cache size but do not purge them.
- : The URL-SPECIFICATION matches only the protocol and host unless use-url
- is set to true.
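A Purge sketch showing a size limit together with per-host ages. The host names, sizes and ages are illustrative.

```
Purge
{
 use-mtime     = no
# Sizes are in MB; when both are set the smaller cache size is chosen.
 max-size      = 100
 min-free      = 20
 use-url       = no
 del-dontget   = yes
 del-dontcache = yes
# Ages are in days.
 default       = 14
 ftp://        = 7
 http://www.example.com  = 28
# A negative age means never delete, zero means do not keep.
 *://archive.example.org = -1
 *://ads.example.com     = 0
}
```

Because use-url is set to no, only the protocol and host parts of each URL-SPECIFICATION are compared.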
- --------------------------------------------------------------------------------
- WILDCARD
- --------
- A wildcard match is one that uses the '*' character to represent any group of
- characters.
- This is basically the same as the command line file matching expressions in DOS
- or the UNIX shell, except that the '*' can match the '/' character. A maximum
- of 2 '*' characters can be used in any wildcard.
- For example
- *.gif matches foo.gif and bar.gif
- *.foo.com matches www.foo.com and ftp.foo.com
- /foo/* matches /foo/bar.html and /foo/bar/foobar.html
- --------------------------------------------------------------------------------
- URL-SPECIFICATION
- -----------------
- When specifying a host and protocol and pathname in many of the sections a
- URL-SPECIFICATION can be used; this is a way of recognising a URL.
- For the purposes of this explanation a URL is considered to be made up of five
- parts.
- proto The protocol that is used (e.g. 'http', 'ftp')
- host The server hostname (e.g. 'www.gedanken.demon.co.uk').
- port The port number on the host (e.g. default of 80 for HTTP).
- path The pathname on the host (e.g. '/bar.html') or a directory name
- (e.g. '/foo/').
- args Optional arguments with the URL used for CGI scripts etc.
- (e.g. 'search=foo').
- For example the WWWOFFLE homepage: http://www.gedanken.demon.co.uk/wwwoffle/
- The protocol is 'http', the host is 'www.gedanken.demon.co.uk', the port is
- the default (in this case 80), and the pathname is '/wwwoffle/'.
- In general this is written as <proto>://<host>[:<port>]/<path>[?<args>]
- Where [] indicates an optional feature, and <> indicate a user supplied name
- or number.
- Some example URL-SPECIFICATION options are the following:
- *://* Any protocol, Any host, Any port, Any path, Any args
- (This is the same as saying 'default').
- *://*/<path> Any protocol, Any host, Any port, Named path, Any args
- *://*/*.<ext> Any protocol, Any host, Any port, Named path, Any args
- *://*/*? Any protocol, Any host, Any port, Any path, No args
- *://*/<path>?* Any protocol, Any host, Any port, Named path, Any args
- *://<host> Any protocol, Named host, Any port, Any path, Any args
- <proto>:// Named protocol, Any host, Any port, Any path, Any args
- <proto>://<host> Named protocol, Named host, Any port, Any path, Any args
- <proto>://<host>: Named protocol, Named host, Default port, Any path, Any args
- *://<host>:<port> Any protocol, Named host, Named port, Any path, Any args
- The matching of the host, the path and the args use the wildcard matching that
- is described above.
- In some sections that accept URL-SPECIFICATIONs they can be negated by adding
- the '!' character to the start. This will mean that the comparison of a URL
- with the URL-SPECIFICATION will return the logically opposite value to what
- would be returned without the '!'. If all of the URL-SPECIFICATIONs in a
- section are negated and '*://*/*' is added to the end then the sense of the
- whole section is negated.
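The negation rule can be sketched with the DontGetRecursive section. The host name is hypothetical, and the reading given in the comments follows from the rule above rather than from a tested configuration.

```
DontGetRecursive
{
# Every entry is negated ...
 !http://www.example.com
# ... and '*://*/*' is added at the end, which reverses the sense of the
# section: only URLs on www.example.com are got when fetching recursively.
 *://*/*
}
```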