FAQ
- WWWOFFLE VERSION 2.5d - FREQUENTLY ASKED QUESTIONS AND ANSWERS
- ==============================================================
- This file contains a list of frequently asked questions and their answers
- relating to WWWOFFLE version 2.5.
- Not all of the questions here are real users' questions; some of them have been
- made up to give some help to people trying to use the program who find that the
- README documentation is insufficient.
- --------------------------------------------------------------------------------
- Section 0 - Why doesn't this FAQ answer my question?
- --------------------
- Section 1 - What does WWWOFFLE do (and what it doesn't)
- Q 1.1 Does WWWOFFLE support http, ftp, finger, https, gopher, ...?
- Q 1.2 Does WWWOFFLE run on systems other than UNIX?
- Q 1.3 Can you change WWWOFFLE so that in the pages that it generates ...?
- --------------------
- Section 2 - How to use WWWOFFLE to serve an intranet
- Q 2.1 Can the WWWOFFLE proxy be accessed by clients other than localhost?
- Q 2.2 Why can't remote clients access the WWWOFFLE proxy?
- Q 2.3 Why can't remote clients follow all of the links?
- Q 2.4 What are the security issues with WWWOFFLE in a multi-user environment?
- Q 2.5 How can I have different configurations for different groups of users?
- --------------------
- Section 3 - What to look for when WWWOFFLE fails
- Q 3.1 Why does my browser return an empty page with WWWOFFLE but not without?
- Q 3.2 Why can't WWWOFFLE find a host when the browser without it can?
- Q 3.3 Why does my browser say "Connection reset by peer" when browsing?
- Q 3.4 Why does following a link on an FTP site go to the wrong server?
- Q 3.5 Why does WWWOFFLE not handle Cookies correctly?
- --------------------
- Section 4 - Applet handling
- Q 4.1 Why doesn't my Browser start applet XYZ?
- Q 4.2 Are unicoded applet names supported?
- Q 4.3 Why does my Netscape Browser throw the trustProxy security exception?
- --------------------
- Section 5 - How to make most use of WWWOFFLE features
- Q 5.1 How can I see what monitored pages were downloaded last time online?
- Q 5.2 How can I do a recursive fetch on a regular interval?
- Q 5.3 How can I stop users from accessing the index?
- Q 5.4 How can I use JunkBuster with WWWOFFLE?
- --------------------
- Section 6 - More information about WWWOFFLE
- Q 6.1 Who wrote WWWOFFLE, When and Why?
- Q 6.2 What WWWOFFLE mailing lists are available?
- Q 6.3 How do I report bugs in WWWOFFLE?
- --------------------------------------------------------------------------------
- Section 0 - Why doesn't this FAQ answer my question?
- This FAQ is released with each new version of the WWWOFFLE program so if you are
- reading the supplied version and if the question is one that is frequently asked
- about this new version then you will by definition not find the answer here.
- This FAQ is also available on the WWWOFFLE homepage along with much other
- information about the program.
- http://www.gedanken.demon.co.uk/wwwoffle/version-2.5/
- --------------------------------------------------------------------------------
- Section 1 - What does WWWOFFLE do (and what it doesn't)
- --------------------
- Q 1.1 Does WWWOFFLE support http, ftp, finger, https, gopher, ...?
- Some of these are supported and some are not.
- http : Yes
- The original version of WWWOFFLE only supported http.
- ftp : Yes
- Since version 2.0 there has been support for ftp URLs.
- finger : Yes
- Since version 2.1 there has been support for finger. Although this is
- not a standard protocol for proxying there is no reason that it cannot
- usefully be performed.
- https : Yes
- Since version 2.4 there has been support for transparent proxying of
- Secure Socket Layer (SSL) connections. This includes the https
- protocol.
- gopher : No
- This is a protocol that is less popular now that the WWW has really
- taken off. From looking at browsers that support it, it would seem to
- be not impossible, but the market for it seems to be limited.
- --------------------
- Q 1.2 Does WWWOFFLE run on systems other than UNIX?
- For example DOS / Win3 / Win95 / WinNT / OS/2.
- UNIX = Yes
- This is the system that the program was designed and initially written
- for; it should work on many versions of UNIX.
- I know that it works on Linux, SunOS 4.1.x, Solaris 2.x, *BSD.
- DOS/Win3 = No
- The program was not designed for DOS, the filenames used and the
- multi-process nature of the program do not allow this.
- Win95/Win98/WinNT = Yes (Partly)
- A Windows 32-bit version of the program is now available thanks to the
- Cygwin development kit that provides a UNIX system call library
- available on MS Windows.
- OS/2 = Maybe
- I do not know of an equivalent of the Cygwin product for OS/2. If one
- exists then it should be possible to port WWWOFFLE as it was for
- Windows 95 / Windows NT above.
- --------------------
- Q 1.3 Can you change WWWOFFLE so that in the pages that it generates ...?
- This is a question that gets asked a lot. People want to see Javascript,
- images, different colours ... on the web pages that WWWOFFLE generates.
- From version 2.2 this is no longer an issue since it is possible to customise
- all of the web-pages that WWWOFFLE itself generates. This means that the
- background colour and the font size can all be changed to suit your preferences.
- To find out how to do this look in the /var/spool/wwwoffle/html/messages
- directory and read the README file.
- --------------------------------------------------------------------------------
- Section 2 - How to use WWWOFFLE to serve an intranet
- --------------------
- Q 2.1 Can the WWWOFFLE proxy be accessed by clients other than localhost?
- Yes it can, that facility has been present from the beginning.
- The other clients can be any type of computer that is connected to the server
- that is running the wwwoffled program. The only requirement is that they are
- networked to the server and that they have browsers on them configured to access
- the WWWOFFLE proxy.
- --------------------
- Q 2.2 Why can't remote clients access the WWWOFFLE proxy?
- The default situation in the wwwoffle.conf file is to not allow any clients to
- access the proxy other than localhost. To allow them to access the proxy the
- wwwoffle.conf file needs to be edited as described below and the new
- configuration loaded.
- The AllowedConnectHosts section of the configuration file contains a list of
- hosts that are allowed to connect to the WWWOFFLE proxy. These names are matched
- against the name that WWWOFFLE gets when the connection is made and access is
- allowed or denied. A form of wildcard matching is applied to the entries in
- this list but no extra name lookups are performed.
- For example, if you are using the private IP address space 192.168.*.* for your
- intranet then your AllowedConnectHosts section in the configuration file should
- look like this.
- AllowedConnectHosts
- {
- 192.168.*
- }
- This will allow all hosts that come from this set of IP addresses to connect to
- the WWWOFFLE proxy.
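- As an illustration of the wildcard matching described above, here is a minimal
- Python sketch. It assumes shell-style patterns as implemented by Python's
- fnmatch module. The function name is_allowed is hypothetical, and the matching
- rules that WWWOFFLE itself applies may differ in detail.

```python
from fnmatch import fnmatchcase

def is_allowed(client, patterns):
    """Return True if the client's name or address matches any pattern."""
    return any(fnmatchcase(client, pattern) for pattern in patterns)

# The pattern from the configuration example above.
allowed = ["192.168.*"]

print(is_allowed("192.168.1.20", allowed))   # an address inside the intranet range
print(is_allowed("10.0.0.5", allowed))       # an address outside the range
```

- In this sketch the comparison is purely textual and no name lookup is made, so
- a client that is seen by name rather than by address would need a pattern that
- matches its name.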
- --------------------
- Q 2.3 Why can't remote clients follow all of the links?
- Some of the links that are generated in the web pages that come out of the
- WWWOFFLE proxy need to point to other pages on the proxy. To be able to do this
- the name of the host running the proxy needs to be specified in the LocalHost
- section of the configuration file.
- For example if the computer running the WWWOFFLE proxy is called www-proxy then
- the LocalHost section of the configuration file would look like this.
- LocalHost
- {
- www-proxy
- localhost
- 127.0.0.1
- }
- The first of the names is what is used by WWWOFFLE to generate these links. The
- others are used for servers that do not get cached by the proxy.
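- The role of the first name in the list can be sketched in Python as follows.
- The function proxy_url is hypothetical, and the port number 8080 and the path
- /index/ are taken from other examples in this FAQ as assumptions. It only
- shows why remote clients need a first entry that they can resolve.

```python
def proxy_url(local_hosts, port, path):
    """Build a link to a page on the proxy from the first LocalHost entry."""
    if not local_hosts:
        raise ValueError("the LocalHost list is empty")
    return "http://%s:%d%s" % (local_hosts[0], port, path)

# The LocalHost entries from the example above, in the same order.
local_hosts = ["www-proxy", "localhost", "127.0.0.1"]

print(proxy_url(local_hosts, 8080, "/index/"))
```

- If localhost were listed first, the generated link would point each remote
- client at itself instead of at the proxy.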
- --------------------
- Q 2.4 What are the security issues with WWWOFFLE in a multi-user environment?
- Security is a feature that I have considered to some extent when writing
- WWWOFFLE although it has not been one of my biggest concerns. The issues are
- listed below.
- For the Win32 version it should be noted that on Win95/98 there is not the user
- level security that is provided by UNIX. It is not possible therefore to create
- files that are readable by WWWOFFLE and not by other users. The security
- features that are present in WWWOFFLE are therefore inapplicable to these
- systems.
- Configuration file password
- This file can have a password specified in it in the StartUp section that is
- used to limit access to the control features of WWWOFFLE. If set this
- password must be used to put WWWOFFLE online, put it offline, purge the
- cache, stop the server, edit the configuration file etc. If you have set a
- password then you should also make the file readable only by authorised users.
- The password is sent as plain text when using the wwwoffle program to control
- the wwwoffled server. The encryption used for the web page authentication is
- trivial.
- Proxy Authentication
- With the ability to control access to WWWOFFLE using the HTTP/1.1
- Proxy Authentication method come added security risks. They are
- basically the same as for the configuration file password: the usernames
- and passwords are in plaintext in the configuration file and the password is
- sent to the server using the same trivial encryption method.
- WWWOFFLE server uid/gid
- The uid and gid of the wwwoffled server process can be controlled by the
- run-uid and run-gid options in the StartUp section of the configuration file.
- This uid/gid needs to be able to read the configuration file (write is not
- required unless the interactive edit page is used) and have read/write access
- to the spool directory. If this option is used then the server must be
- started by root.
- Deleting requested URLs
- Only the user that makes a request for a page can delete that request, and
- then only when the deletion is done immediately. This is because a password
- is made by hashing the contents of the file in the outgoing directory. This
- means that read access to this directory must be denied for this to be secure.
- The built in web server
- This is a very simple server and will follow symbolic links. As a security
- feature, only files that are world readable can be accessed. They must also
- be in a directory that the wwwoffled server can read. A check is not made for
- each directory component, so world readable files in a directory readable only
- by the uid that runs wwwoffled are not safe.
- Accessing the cache
- There is in general no problem with allowing users access to the cache
- provided it is read only (but see URLs with password below). The only
- concern is that if purging is done using the access time of the files then
- running grep on the cache will spoil this.
- URLs with Passwords
- The URLs that use usernames and passwords need to be stored in the cache.
- For simplicity they are not hidden in any way. This means that any URL that
- uses a username/password in it can show up in the log file (with Debug or
- ExtraDebug levels only). The files in the cache also contain the username/
- password information and should be made inaccessible to users for that reason.
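- To illustrate the "Deleting requested URLs" item above, the following Python
- sketch shows why a password derived only from a file's contents is no secret
- to anybody who can read that file. The use of SHA-256, the function name
- delete_token and the sample request are assumptions made for the example. The
- hash function that WWWOFFLE actually uses is not stated here.

```python
import hashlib

def delete_token(request_contents):
    """Derive a token from the bytes of a stored request (SHA-256 is an assumption)."""
    return hashlib.sha256(request_contents).hexdigest()

# A made-up stored request, standing in for a file in the outgoing directory.
request = b"GET http://www.example.com/ HTTP/1.0\r\n\r\n"

owner_token = delete_token(request)
# Another user who can read the same file computes exactly the same token.
other_token = delete_token(request)

print(owner_token == other_token)
```

- This is why read access to the outgoing directory has to be denied to
- ordinary users for the scheme to give any protection.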
- --------------------
- Q 2.5 How can I have different configurations for different groups of users?
- When there are two groups of users that will access the same WWWOFFLE cache but
- where each group has different WWWOFFLE configurations it is possible to run two
- instances of WWWOFFLE.
- For example in a school it may be required that the students can access the
- cache but they cannot request new pages. The teachers must be able to access
- the same cache and to be able to use WWWOFFLE online and request pages while
- offline.
- The two WWWOFFLE configuration files will be the same in most respects, but
- there will be differences as shown below.
-    -- wwwoffle.student.conf --        |    -- wwwoffle.teacher.conf --
- StartUp                               | StartUp
- {                                     | {
-  http-port     = 8080                 |  http-port     = 9080
-  wwwoffle-port = 8081                 |  wwwoffle-port = 9081
-  password      = secret               |  password      = teacher
- }                                     | }
-                                       |
- DontRequestOffline                    | DontRequestOffline
- {                                     | {
-  *://*/*                              |
- }                                     | }
-                                       |
- AllowedConnectUsers                   | AllowedConnectUsers
- {                                     | {
-                                       |  teacher1:password1
-                                       |  teacher2:password2
- }                                     | }
-                                       |
- AllowedConnectHosts                   | AllowedConnectHosts
- {                                     | {
-                                       |  teacher1pc
-                                       |  teacher2pc
- }                                     | }
- The two copies of WWWOFFLE must use different port numbers. They use the same
- spool directory and therefore the same web-pages are available to both sets of
- users. You will need to have a password on the students' version of WWWOFFLE to
- stop them editing the configuration file, but for the teachers it may not be
- required. To keep the students from accessing the teachers' version of WWWOFFLE
- you must use either the AllowedConnectHosts or the AllowedConnectUsers sections
- in the configuration file. These will restrict access to either the set of
- machines that the teachers have access to or will require a username/password to
- be entered before browsing starts.
- In the example above the students are not allowed to request any pages when
- offline. This version of WWWOFFLE is never used in online mode so there is
- never any way that the students can browse while online. Only the teachers'
- version of WWWOFFLE is ever used in online mode.
- --------------------------------------------------------------------------------
- Section 3 - What to look for when WWWOFFLE fails
- --------------------
- Q 3.1 Why does my browser return an empty page with WWWOFFLE but not without?
- When using a browser to visit a web-page nothing is returned when WWWOFFLE is
- used as a proxy but when the site is accessed directly without WWWOFFLE the page
- is visible.
- This can have a number of causes (all reported to me or tested myself):
- a) The web server that you are accessing requires the User-Agent header. If it
- is not present or is set to an uncommon value (not Netscape or IE) then the
- server returns an empty page.
- In this case, if you have the CensorHeader configuration file section set to
- remove the User-Agent header then you should either not censor this header
- line or set a replacement string that is acceptable.
- b) As above, but the server only checks that the User-Agent header is present;
- its value does not matter.
- The solution is the same except that any User-Agent string can be used.
- c) The web server uses cookies to maintain state. This is common on sites that
- are more concerned with form than content, often without warning.
- d) The browser and server are trying to use HTTP/1.1 extensions that WWWOFFLE is
- ignoring.
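- One way to check whether cause a) or b) applies is to request the same page
- through the proxy with different User-Agent values and compare the results.
- The Python sketch below only prepares such a request. The proxy address
- http://localhost:8080 (the port used in other examples in this FAQ), the URL
- and the User-Agent string are assumptions, and build_request is a
- hypothetical helper.

```python
import urllib.request

# Assumed proxy address; change it to match your own http-port setting.
PROXY = "http://localhost:8080"

def build_request(url, user_agent):
    """Return an opener that uses the proxy and a request with the given User-Agent."""
    opener = urllib.request.build_opener(
        urllib.request.ProxyHandler({"http": PROXY}))
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    return opener, request

opener, request = build_request("http://www.example.com/",
                                "Mozilla/4.0 (compatible)")

# Nothing is sent until opener.open(request) is called.
print(request.get_header("User-agent"))
```

- Calling opener.open(request).read() once for each candidate string and
- comparing the lengths of the replies would show whether the server is
- sensitive to this header.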
- --------------------
- Q 3.2 Why can't WWWOFFLE find a host when the browser without it can?
- The most likely reason is that the DNS server that was configured when WWWOFFLE
- was started is no longer valid. This would happen for example if the file
- /etc/resolv.conf was changed after wwwoffled was run. This problem is not
- specific to WWWOFFLE; it affects most programs if the DNS configuration is
- changed while they are running.
- When WWWOFFLE looks up a hostname it uses the standard UNIX library (libc)
- function call gethostbyname(). The name lookup part of libc (called the
- resolver library) is initialised when the program first uses a function from it.
- When a resolver library function is performed later it will use the
- configuration that was in place when the first function was used.
- The DNS configuration change may happen without you being aware of it. Some of
- the user friendly PPP setup programs will change the /etc/resolv.conf file
- depending on which ISP you are connecting to. One example of a program that
- does this is kppp.
- Large browser projects (Netscape in particular) may use other methods of
- performing name lookups than the standard library. This means that they may work
- even if the DNS configuration has changed since they were started. A working
- Netscape and a non-working WWWOFFLE may therefore indicate that your name server
- configuration has changed rather than a WWWOFFLE bug.
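- The "read once at first use" behaviour described above can be modelled with a
- small Python sketch. This is a toy model only and not the libc resolver. The
- class CachedResolverConfig, the helper write_conf and the documentation
- addresses 192.0.2.1 and 192.0.2.2 are all made up for the illustration.

```python
import os
import tempfile

class CachedResolverConfig:
    """Toy model: reads the name server list once, on first use, then reuses it."""

    def __init__(self, path):
        self.path = path
        self._servers = None

    def servers(self):
        if self._servers is None:
            # First use: read the configuration file and remember the result.
            with open(self.path) as conf:
                self._servers = [fields[1]
                                 for fields in (line.split() for line in conf)
                                 if len(fields) >= 2 and fields[0] == "nameserver"]
        return self._servers

def write_conf(path, server):
    """Write a one-line stand-in for /etc/resolv.conf."""
    with open(path, "w") as conf:
        conf.write("nameserver %s\n" % server)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "resolv.conf")
    write_conf(path, "192.0.2.1")
    config = CachedResolverConfig(path)
    first = config.servers()                      # reads the file
    write_conf(path, "192.0.2.2")                 # the file changes afterwards
    second = config.servers()                     # still the remembered value
    fresh = CachedResolverConfig(path).servers()  # a new object reads the new file

print(first, second, fresh)
```

- The new object at the end plays the part of a newly started process, which
- reads whatever configuration is current when it first performs a lookup.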
- --------------------
- Q 3.3 Why does my browser say "Connection reset by peer" when browsing?
- This happens when using Netscape to access some web-pages. The cause is not
- known, but the problem is only seen when WWWOFFLE is used and not when a direct
- connection is made.
- --------------------
- Q 3.4 Why does following a link on an FTP site go to the wrong server?
- If there is a directory called '/dir' on an ftp server and you load the page
- 'ftp://server/' you get a directory listing that includes a link to '/dir'.
- Following this link should take the browser to 'ftp://server/dir/', but on some
- browsers it goes to 'ftp://dir/' instead.
- I think that this behaviour is due to the browser and not WWWOFFLE. If you went
- to 'http://server/' and followed the link to '/dir/' then you would expect to go
- to 'http://server/dir/' and not to 'http://dir/'. This is just common sense.
- Why the browser is different for ftp than http I am not sure.
- [This should be fixed in version 2.1 of WWWOFFLE, so is not really applicable to
- this version of the FAQ]
- --------------------
- Q 3.5 Why does WWWOFFLE not handle Cookies correctly?
- Normal proxies cannot cache the result of URLs that are requested with Cookies
- because the result is different for each user. WWWOFFLE will cache pages that
- have cookies in them because it is intended to reduce the network traffic.
- If you want to use cookies when you are browsing then any pages that you see
- should not be considered as valid when you see them offline. The best way of
- handling this if there is a particular site that you visit is to put it into the
- DontCache section of the configuration file.
- It is not possible for WWWOFFLE to cache pages that use cookies to control the
- content in the same way that it handles pages that do not use cookies. Any
- implementation of cookie handling would need to give different replies to users
- depending on the cookie that is in the request. This would mean caching
- different pages for the same URL.
- But there is a problem that going to page A might set a cookie and then going to
- page B will give a different page. So, for example, if you have a cookie and
- you have page B cached when you are offline, following the link from B to A may
- give you a new cookie from A (when you go online and fetch A). This means that
- you cannot now go back to B when offline because the cookie is different (and so
- is the page, but you don't have it cached).
- An even worse problem is that reloading page C with the same cookie gives you a
- different page each time. This is because the cookie is used to count the
- number of times that you have visited the page. There is no way to know this
- and therefore you would keep getting the same page C (the cached one) even if
- you should be getting different ones.
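- The difference between caching by URL and caching by URL and cookie can be
- shown with two toy caches in Python. The class names, the URL and the cookie
- values are invented for the example. Neither class is a description of how
- the WWWOFFLE cache is implemented.

```python
class UrlCache:
    """Toy cache keyed by URL only; the cookie is ignored."""

    def __init__(self):
        self._pages = {}

    def store(self, url, cookie, page):
        self._pages[url] = page

    def lookup(self, url, cookie):
        return self._pages.get(url)

class UrlCookieCache:
    """Toy cache keyed by (URL, cookie): one stored page per cookie value."""

    def __init__(self):
        self._pages = {}

    def store(self, url, cookie, page):
        self._pages[(url, cookie)] = page

    def lookup(self, url, cookie):
        return self._pages.get((url, cookie))

url = "http://www.example.com/B"
by_url, by_cookie = UrlCache(), UrlCookieCache()
for cache in (by_url, by_cookie):
    cache.store(url, "id=1", "page B for cookie id=1")

# The cookie has changed since the page was stored.
print(by_url.lookup(url, "id=2"))      # the page that was made for the old cookie
print(by_cookie.lookup(url, "id=2"))   # None: nothing stored for the new cookie
```

- The first cache always has an answer but it may be the wrong page. The second
- never gives a wrong page but misses as soon as the cookie changes, which is
- the situation described for pages A and B above.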
- --------------------------------------------------------------------------------
- Section 4 - Applet handling
- --------------------
- Q 4.1 Why doesn't my Browser start applet XYZ?
- [Walter Pfannenmueller <pfn@online.de> writes:]
- I suppose you have enabled java support. Your Browser says something like
- "Can't start Applet XYZ.class". Check if the file has been successfully
- downloaded by WWWOFFLE. If the file is accessible, open a java console (your
- browser should provide something like that) and get more details on the problem.
- Probably it's a security violation. Every Browser has its own
- SecurityManager class and you should consult the manual to find out how you can
- lower these restrictions. If your applet however tries to get in contact with some server
- functionality (servlets, RMI, CORBA), we are at the end of the possibilities of
- an offline reader.
- --------------------
- Q 4.2 Are unicoded applet names supported?
- [Walter Pfannenmueller <pfn@online.de> writes:]
- I don't know. I transform those names to UTF8 encoding and the rest depends on
- what your filesystem or the host filesystem does with it. Java compilers do
- have problems with unicode, too, even though it should be supported. I'd
- appreciate any information that helps enlighten the dark. I'd like to know how
- to code Unicode to UTF8 transformation. The implementation in javaclass.c looks
- somehow awkward.
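- As a reference point for the Unicode to UTF8 question raised above, here is a
- minimal Python sketch that encodes a single Unicode code point as standard
- UTF-8. It is not taken from javaclass.c, and the function name utf8_encode is
- hypothetical. Note that Java class files store strings in a modified form of
- UTF-8 that treats U+0000 and characters above U+FFFF differently, which this
- sketch does not cover.

```python
def utf8_encode(code_point):
    """Encode one Unicode code point as standard UTF-8 and return the bytes."""
    if not 0 <= code_point <= 0x10FFFF or 0xD800 <= code_point <= 0xDFFF:
        raise ValueError("not a Unicode scalar value: %r" % code_point)
    if code_point < 0x80:
        # 7 bits: 0xxxxxxx
        return bytes([code_point])
    if code_point < 0x800:
        # 11 bits: 110xxxxx 10xxxxxx
        return bytes([0xC0 | (code_point >> 6),
                      0x80 | (code_point & 0x3F)])
    if code_point < 0x10000:
        # 16 bits: 1110xxxx 10xxxxxx 10xxxxxx
        return bytes([0xE0 | (code_point >> 12),
                      0x80 | ((code_point >> 6) & 0x3F),
                      0x80 | (code_point & 0x3F)])
    # 21 bits: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    return bytes([0xF0 | (code_point >> 18),
                  0x80 | ((code_point >> 12) & 0x3F),
                  0x80 | ((code_point >> 6) & 0x3F),
                  0x80 | (code_point & 0x3F)])

# Compare with Python's built-in encoder for one sample of each length.
for character in "A\u00e9\u20ac\U0001F600":
    print(utf8_encode(ord(character)) == character.encode("utf-8"))
```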
- --------------------
- Q 4.3 Why does my Netscape Browser throw the trustProxy security exception?
- [Walter Pfannenmueller <pfn@online.de> writes:]
- The error message should be
- Could not resolve IP for host ... See the trustProxy property.
- The Netscape Browser tries to verify the applet's source host IP address.
- While offline this is not possible. Therefore you have to persuade
- the Browser to trust the proxy. To do this you have to find the preferences
- file, preferences.js on UNIX or prefs.js on Windows. Edit the file,
- even though it says "don't edit", and insert the line
- user_pref("security.lower_java_network_security_by_trusting_proxies", true);
- somewhere. Be sure to have closed all browser windows first, because the
- preferences file will be overwritten on closing. This should work for
- all Netscape 4.0x and 4.5.
- For more information have a look at
- http://developer.netscape.com/docs/technote/security/sectn3.html
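- The edit described above could also be made by a small script. The Python
- sketch below appends the line only if it is not already present. The function
- ensure_pref is hypothetical, and the demonstration runs on a temporary
- stand-in file and not on a real preferences file. As with a manual edit, the
- browser should be closed first.

```python
import os
import tempfile

# The preference line quoted in the answer above.
PREF_LINE = ('user_pref("security.lower_java_network_security_by_trusting_proxies",'
             ' true);')

def ensure_pref(path, line=PREF_LINE):
    """Append the line to the file unless it is already there; return True if added."""
    with open(path, "r") as prefs:
        text = prefs.read()
    if line in text.splitlines():
        return False
    # Start on a new line if the file does not end with one.
    separator = "" if text == "" or text.endswith("\n") else "\n"
    with open(path, "a") as prefs:
        prefs.write(separator + line + "\n")
    return True

# Demonstration on a temporary stand-in file.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "prefs.js")
    with open(path, "w") as prefs:
        prefs.write("// example preferences file")
    added_first = ensure_pref(path)
    added_again = ensure_pref(path)
    with open(path) as prefs:
        final_lines = prefs.read().splitlines()

print(added_first, added_again)
```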
- --------------------------------------------------------------------------------
- Section 5 - How to make most use of WWWOFFLE features
- --------------------
- Q 5.1 How can I see what monitored pages were downloaded last time online?
- The easiest way to do this is to go to the monitored web pages index and sort
- the pages by "Access Time" (http://localhost:8080/index/monitor/?atime). Each
- page is accessed when it is monitored so the most recently monitored ones are
- the ones at the top of this listing.
- --------------------
- Q 5.2 How can I do a recursive fetch on a regular interval?
- This is a combination of the recursive fetch option and the monitor option. If
- you select the page that you want in the recursive fetch index
- (http://localhost:8080/refresh-options/) with the options that you want and
- press the button you will be presented with a page telling you that the request
- has been recorded. There is a link on here to allow you to monitor this
- request, which takes you to the normal monitor page
- (http://localhost:8080/monitor-options) but with the URL already filled in.
- --------------------
- Q 5.3 How can I stop users from accessing the index?
- Access to the indexes can be denied to users by using the configuration file
- DontGet section.
- DontGet
- {
- http://localhost:8080/index
- }
- You must make sure that the hostname that you give is the first one in the
- LocalHost section since this is what will be checked.
- --------------------
- Q 5.4 How can I use JunkBuster with WWWOFFLE?
- The Internet Junk Buster is a program that can filter out many of the junk
- adverts and other features of web-pages.
- The most recent versions of WWWOFFLE add in many of the features of the
- JunkBuster program but not all of them. If you look at the options that
- WWWOFFLE has you may decide that you don't need to use JunkBuster.
- If you decide that you do want to use both programs then there are two options:
- 1) Browser <-> WWWOFFLE <-> JunkBuster <-> Internet
- Any pages that the user requests that JunkBuster blocks will have the JunkBuster
- error message stored in the WWWOFFLE cache. Any recursive fetching or fetching
- of images that WWWOFFLE does in the background are passed through JunkBuster and
- the JunkBuster error messages are cached.
- 2) Browser <-> JunkBuster <-> WWWOFFLE <-> Internet
- Any pages that the user requests that JunkBuster blocks will not be stored in
- the WWWOFFLE cache. Any recursive fetching or fetching of images that WWWOFFLE
- does in the background are not passed through JunkBuster and they will be stored
- in the WWWOFFLE cache but blocked when the browser tries to view them.
- If you decide that WWWOFFLE will be doing lots of fetching because you are using
- it to browse offline then the 1st method is best. If you decide that you will
- be only using it while online and not requesting pages when offline then the 2nd
- method is best.
- If reducing bandwidth is the most important feature of JunkBuster then the 1st
- option is the best since it will stop WWWOFFLE fetching the junk pages.
- --------------------------------------------------------------------------------
- Section 6 - More information about WWWOFFLE
- --------------------
- Q 6.1 Who wrote WWWOFFLE, When and Why?
- The WWWOFFLE program was written by Andrew M. Bishop (amb@gedanken.demon.co.uk)
- in 1996,97,98,99,2000.
- There is a WWWOFFLE home-page on the World Wide Web, available via the author's
- home-page at http://www.gedanken.demon.co.uk/ . This is kept updated with news
- about the program, as new versions become available.
- An earlier program by the same author written in perl had been used for a while
- but it was realised that the functionality of that version was insufficient
- except for a small amount of use. Work on the WWWOFFLE program itself started
- in the Christmas holiday in 1996, initially as a hack to improve the perl
- version.
- After the release of the Beta version 0.9 at the beginning of January 1997 there
- was a lot of interest generated which led to the release of version 1.0 later
- that same month. More versions followed until December that year when version
- 2.0 was released. This contained several large new features (like FTP) and
- included a re-write of a large proportion of the code to make it easier to
- maintain and build on; this included completely changing the cache format.
- Version 2.1 was released in March 1998 with some more new features, version 2.2
- in June 1998 with more features and version 2.3 in August 1998 with even more
- features. Version 2.4 had more features when it was released in December 1998
- and version 2.5 had more again in September 1999.
- The Win32 version of the program was made possible by version beta-20 of the
- Cygwin development kit at the end of October 1998 when version 2.3e of WWWOFFLE
- was released. Versions 2.4b and 2.5a of WWWOFFLE were also released for Win32
- although none of them work totally on most platforms due to incompatibilities.
- The WWWOFFLE program can be freely distributed according to the terms of the GNU
- General Public License (see the file `COPYING').
- --------------------
- Q 6.2 What WWWOFFLE mailing lists are available?
- There are now four mailing lists available for WWWOFFLE. They can be subscribed
- to in two different ways - on the WWWOFFLE users web-page and via e-mail.
- wwwoffle-announce For announcements of new versions of WWWOFFLE.
- wwwoffle-users For discussion of WWWOFFLE features, excluding operating
- system specific features.
- wwwoffle-win32 For discussion of WWWOFFLE on the Win32 system.
- The first two are only for announcements from the author of WWWOFFLE, there is
- no discussion allowed on them. The latter two are open for posting from members
- of the list and others who are not subscribed.
- To subscribe by e-mail send a message to majordomo@gedanken.demon.co.uk with the
- message 'subscribe <group-name>' in the body, e.g. 'subscribe wwwoffle-announce'.
- --------------------
- Q 6.3 How do I report bugs in WWWOFFLE?
- By e-mail, send them to me at amb@gedanken.demon.co.uk and put WWWOFFLE somewhere
- in the subject line. You can also report bugs or provide comments via the
- feedback form on the WWWOFFLE home-page on the World Wide Web accessible via
- http://www.gedanken.demon.co.uk/ .
- Before doing this, you should check the FAQ and the WWWOFFLE web-page to see if
- the answer is there. If it is not and you want to report it to me then it helps
- if you can reproduce the error, in particular if you start wwwoffled as
- 'wwwoffled -d 5 -c wwwoffle.conf' and capture the debugging output for the
- session that shows the error.
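- One possible way to capture the debugging output in a file is sketched below
- in Python. The helper run_and_capture and the log file names are made up for
- the example, and it simply records whatever the command writes to its
- standard output and standard error. The demonstration runs a harmless Python
- command in place of wwwoffled so that it works on any machine.

```python
import os
import subprocess
import sys
import tempfile

def run_and_capture(command, log_path):
    """Run a command, write its stdout and stderr to log_path, return the exit status."""
    with open(log_path, "wb") as log:
        return subprocess.run(command, stdout=log,
                              stderr=subprocess.STDOUT).returncode

# Intended use, with the command line quoted above (it returns when wwwoffled exits):
#   run_and_capture(["wwwoffled", "-d", "5", "-c", "wwwoffle.conf"],
#                   "wwwoffled-debug.log")

# Demonstration with a stand-in command.
with tempfile.TemporaryDirectory() as tmp:
    log_path = os.path.join(tmp, "debug.log")
    status = run_and_capture([sys.executable, "-c", "print('debug output')"],
                             log_path)
    with open(log_path) as log:
        captured = log.read()

print(status, captured.strip())
```

- The resulting file can then be attached to the bug report.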
- --------------------------------------------------------------------------------