NOTES
- _________________________________________________________________
-
- Design Notes On Fetchmail
-
- These notes are for the benefit of future hackers and maintainers. The
- following sections are both functional and narrative; read them from
- beginning to end.
-
- History
-
- A direct ancestor of the fetchmail program was originally authored
- (under the name popclient) by Carl Harris <ceharris@mal.com>. I took
- over development in June 1996 and subsequently renamed the program
- `fetchmail' to reflect the addition of IMAP support. In early November
- 1996 Carl officially ended support for the last popclient versions.
-
- Before accepting responsibility for the popclient sources from Carl, I
- had investigated and used and tinkered with every other UNIX
- remote-mail forwarder I could find, including fetchpop1.9,
- PopTart-0.9.3, get-mail, gwpop, pimp-1.0, pop-perl5-1.2, popc,
- popmail-1.6 and upop. My major goal was to get a header-rewrite
- feature like fetchmail's working so I wouldn't have reply problems
- anymore.
-
- Despite having done a good bit of work on fetchpop1.9, when I found
- popclient I quickly concluded that it offered the solidest base for
- future development. I was convinced of this primarily by the presence
- of multiple-protocol support. The competition didn't do
- POP2/RPOP/APOP, and I was already having vague thoughts of maybe
- adding IMAP. (This would advance two other goals: learn IMAP and get
- comfortable writing TCP/IP client software.)
-
- Until popclient 3.05 I was simply following out the implications of
- Carl's basic design. He already had daemon.c in the distribution, and
- I wanted daemon mode almost as badly as I wanted the header rewrite
- feature. The other things I added were bug fixes or minor extensions.
-
- After 3.1, when I put in SMTP-forwarding support (more about this
- below) the nature of the project changed -- it became a
- carefully-thought-out attempt to render obsolete every other program
- in its class. The name change quickly followed.
-
- The rewrite option
-
- RFC 1123 stipulates that MTAs ought to canonicalize the addresses of
- outgoing mail so that From:, To:, Cc:, Bcc: and other address headers
- contain only fully qualified domain names. Failure to do so can break
- the reply function on many mailers.
-
- This problem only becomes obvious when a reply is generated on a
- machine different from where the message was delivered. The two
- machines will have different local username spaces, potentially
- leading to misrouted mail.
-
- Most MTAs (and sendmail in particular) do not canonicalize address
- headers in this way (violating RFC 1123). Fetchmail therefore has to
- do it. This is the first feature I added to the ancestral popclient.
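-
- For illustration only, the core of the rewrite can be sketched in a few
- lines of C. This is not the code in fetchmail; the function name
- qualify_address() and its buffer convention are assumptions made for the
- example, and it handles one bare address with no RFC822 parsing at all.
-
```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: qualify a single bare address with the mailserver's
 * host name. Addresses that already contain '@' are copied unchanged.
 * Returns 0 on success, -1 if the result does not fit in dst. */
int qualify_address(const char *addr, const char *host,
                    char *dst, size_t dstlen)
{
    int n;

    assert(addr != NULL && host != NULL && dst != NULL);

    if (strchr(addr, '@') != NULL)
        n = snprintf(dst, dstlen, "%s", addr);
    else
        n = snprintf(dst, dstlen, "%s@%s", addr, host);

    /* snprintf reports the length it wanted; treat truncation as failure. */
    return (n >= 0 && (size_t)n < dstlen) ? 0 : -1;
}
```
-
- With host "snark.thyrsus.com", the bare name "esr" comes out as
- "esr@snark.thyrsus.com", while an address that already has an @-part is
- copied through untouched.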
-
- Reorganization
-
- The second thing I did was reorganize and simplify popclient a lot. Carl
- Harris's implementation was very sound, but exhibited a kind of
- unnecessary complexity common to many C programmers. He treated the
- code as central and the data structures as support for the code. As a
- result, the code was beautiful but the data structure design ad-hoc
- and rather ugly (at least to this old LISP hacker).
-
- I was able to improve matters significantly by reorganizing most of
- the program around the `query' data structure and eliminating a bunch
- of global context. This especially simplified the main sequence in
- fetchmail.c and was critical in enabling the daemon mode changes.
-
- IMAP support and the method table
-
- The next step was IMAP support. I initially wrote the IMAP code as a
- generic query driver and a method table. The idea was to have all the
- protocol-independent setup logic and flow of control in the driver,
- and the protocol-specific stuff in the method table.
-
- Once this worked, I rewrote the POP3 code to use the same
- organization. The POP2 code kept its own driver for a couple more
- releases, until I found sources of a POP2 server to test against (the
- breed seems to be nearly extinct).
-
- The purpose of this reorganization, of course, is to trivialize the
- development of support for future protocols as much as possible. All
- mail-retrieval protocols have to have pretty similar logical design by
- the nature of the task. By abstracting out that common logic and its
- interface to the rest of the program, both the common and
- protocol-specific parts become easier to understand.
-
- Furthermore, many kinds of new features can instantly be supported
- across all protocols by modifying the one driver module.
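-
- The driver/method-table split described in this section might look
- roughly like the sketch below. The names (struct proto_methods,
- run_query, the toy protocol) and the choice of fields are assumptions
- made for the example, not the declarations in the fetchmail sources.
-
```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical method table: one instance per protocol. */
struct proto_methods {
    const char *name;              /* protocol name, e.g. "POP3" */
    const char *service;           /* default service to connect to */
    int (*getrange)(int *count);   /* how many messages are waiting? */
    int (*fetch)(int msgnum);      /* retrieve one message */
    int (*trail)(int msgnum);      /* optional per-message cleanup; may be NULL */
};

/* Generic driver: all protocol-independent flow of control lives here.
 * Returns the number of messages fetched, or -1 on error. */
int run_query(const struct proto_methods *proto)
{
    int count = 0, fetched = 0;

    assert(proto != NULL && proto->getrange != NULL && proto->fetch != NULL);

    if (proto->getrange(&count) != 0)
        return -1;
    for (int num = 1; num <= count; num++) {
        if (proto->fetch(num) != 0)
            return -1;
        if (proto->trail != NULL && proto->trail(num) != 0)
            return -1;
        fetched++;
    }
    return fetched;
}

/* Toy protocol used only to exercise the driver. */
static int toy_getrange(int *count) { *count = 3; return 0; }
static int toy_fetch(int msgnum)    { return msgnum > 0 ? 0 : -1; }

const struct proto_methods toy_protocol = {
    "TOY", "toy", toy_getrange, toy_fetch, NULL
};
```
-
- Supporting another protocol then means filling in one more table; a
- feature added to run_query() is picked up by every protocol at once.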
-
- Implications of SMTP forwarding
-
- The direction of the project changed radically when Harry Hochheiser
- sent me his scratch code for forwarding fetched mail to the SMTP port.
- I realized almost immediately that a reliable implementation of this
- feature would make all the other delivery modes obsolete.
-
- Why mess with all the complexity of configuring an MDA or setting up
- lock-and-append on a mailbox when port 25 is guaranteed to be there on
- any platform with TCP/IP support in the first place? Especially when
- this means retrieved mail is guaranteed to look like normal sender-
- initiated SMTP mail, which is really what we want anyway.
-
- Clearly, the right thing to do was (1) hack SMTP forwarding support
- into the generic driver, (2) make it the default mode, and (3)
- eventually throw out all the other delivery modes.
-
- I hesitated over step 3 for some time, fearing to upset long-time
- popclient users dependent on the alternate delivery mechanisms. In
- theory, they could immediately switch to .forward files or their
- non-sendmail equivalents to get the same effects. In practice the
- transition might have been messy.
-
- But when I did it (see the NEWS note on the great options massacre)
- the benefits proved huge. The cruftiest parts of the driver code
- vanished. Configuration got radically simpler -- no more grovelling
- around for the system MDA and user's mailbox, no more worries about
- whether the underlying OS supports file locking.
-
- Also, the only way to lose mail vanished. If you specified localfolder
- and the disk got full, your mail got lost. This can't happen with SMTP
- forwarding because your SMTP listener won't return OK unless the
- message can be spooled or processed.
-
- Also, performance improved (though not so you'd notice it in a single
- run). Another not insignificant benefit of this change was that the
- manual page got a lot simpler.
-
- Later, I had to bring --mda back in order to allow handling of some
- obscure situations involving dynamic SLIP. But I found a much simpler
- way to do it.
-
- The moral? Don't hesitate to throw away superannuated features when
- you can do it without loss of effectiveness. I tanked a couple I'd
- added myself and have no regrets at all. As Saint-Exupery said,
- "Perfection [in design] is achieved not when there is nothing more to
- add, but rather when there is nothing more to take away." This program
- isn't perfect, but it's trying.
-
- The most-requested features that I will never add, and why not:
-
- Password encryption in .fetchmailrc
- The reason there's no facility to store passwords encrypted in the
- .fetchmailrc file is that this doesn't actually add protection.
-
- Anyone who's acquired the 0600 permissions needed to read your
- .fetchmailrc file will be able to run fetchmail as you anyway -- and
- if it's your password they're after, they'd be able to rip the
- necessary decoder out of the fetchmail code itself to get it.
-
- All .fetchmailrc encryption would do is give a false sense of security
- to people who don't think very hard.
-
- Truly concurrent queries to multiple hosts
- Occasionally I get a request for this on "efficiency" grounds. These
- people aren't thinking either. True concurrency would do nothing to
- lessen fetchmail's total IP volume. The best it could possibly do is
- change the usage profile to shorten the duration of the active part of
- a poll cycle at the cost of increasing its demand on IP volume per
- unit time.
-
- If one could thread the protocol code so that fetchmail didn't block
- on waiting for a protocol response, but rather switched to trying to
- process another host query, one might get an efficiency gain (close to
- constant loading at the single-host level).
-
- Fortunately, I've only seldom seen a server that incurred significant
- wait time on an individual response. I judge the gain from this not
- worth the hideous complexity increase it would require in the code.
-
- Multiple concurrent instances of fetchmail
- Fetchmail locking is on a per-invoking-user basis because finer-grained
- locks would be really hard to implement in a portable way. The problem
- is that you don't want two fetchmails querying the same site for the
- same remote user at the same time.
-
- To handle this optimally, multiple fetchmails would have to associate
- a system-wide semaphore with each active pair of a remote user and
- host canonical address. A fetchmail would have to block until getting
- this semaphore at the start of a query, and release it at the end of a
- query.
-
- This would be way too complicated to do just for an "it might be nice"
- feature. Instead, you can run a single root fetchmail polling for
- multiple users in either single-drop or multidrop mode.
-
- The fundamental problem here is how an instance of fetchmail polling
- host foo can assert that it's doing so in a way visible to all other
- fetchmails. System V semaphores would be ideal for this purpose, but
- they're not portable.
-
- I've thought about this a lot and roughed up several designs. All are
- complicated and fragile, with a bunch of the standard problems (what
- happens if a fetchmail aborts before clearing its semaphore, and how
- do we recover reliably?).
-
- I'm just not satisfied that there's enough functional gain here to pay
- for the large increase in complexity that adding these semaphores
- would entail.
-
- Multidrop and alias handling
-
- I decided to add the multidrop support partly because some users were
- clamoring for it, but mostly because I thought it would shake bugs out
- of the single-drop code by forcing me to deal with addressing in full
- generality. And so it proved.
-
- There are three important aspects of the features for handling
- multiple-drop aliases and mailing lists which future hackers should be
- careful to preserve.
-
- 1. The logic path for single-recipient mailboxes doesn't involve
- header parsing or DNS lookups at all. This is important -- it
- means the code for the most common case can be much simpler and
- more robust.
- 2. The multidrop handling does not rely on doing the equivalent of
- passing the message to sendmail -oem -t. Instead, it explicitly
- mines members of a specified set of local usernames out of the
- header.
- 3. We do not attempt delivery to multidrop mailboxes in the presence
- of DNS errors. Before each multidrop poll we probe DNS to see if
- we have a nameserver handy. If not, the poll is skipped. If DNS
- crashes during a poll, the error return from the next nameserver
- lookup aborts message delivery and ends the poll. The daemon mode
- will then quietly spin until DNS comes up again, at which point it
- will resume delivering mail.
-
- When I designed this support, I was terrified of doing anything that
- could conceivably cause a mail loop (you should be too). That's why
- the code as written can only append local names (never @-addresses) to
- the recipients list.
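-
- The "mine local names, never append @-addresses" rule can be shown with
- a small sketch. Everything here is an assumption made for illustration:
- the function name, the comma-only address splitting (no comments or
- angle brackets), and the plain string comparison against server_domain,
- which stands in for the DNS check the real code has to make.
-
```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <strings.h>   /* strncasecmp (POSIX) */

/* Hypothetical sketch: collect local recipients from a comma-separated
 * address list. Only entries of `locals` are ever appended, so text taken
 * from the header (and any '@'-address) can never reach the output.
 * Returns the number of recipients stored in out[]. */
size_t find_local_recipients(const char *header_value,
                             const char *server_domain,
                             const char *const locals[], size_t nlocals,
                             const char *out[], size_t maxout)
{
    size_t nout = 0;
    const char *p = header_value;

    assert(header_value != NULL && server_domain != NULL);

    while (*p != '\0') {
        /* Isolate one address: skip leading blanks, stop at ',' or end. */
        while (*p == ' ' || *p == '\t')
            p++;
        const char *end = p + strcspn(p, ",");
        const char *stop = end;
        while (stop > p && (stop[-1] == ' ' || stop[-1] == '\t'))
            stop--;

        /* Split into local part and optional domain. */
        const char *at = memchr(p, '@', (size_t)(stop - p));
        size_t locallen = (size_t)((at ? at : stop) - p);
        int ours = 1;
        if (at != NULL) {
            size_t domlen = (size_t)(stop - (at + 1));
            ours = domlen == strlen(server_domain) &&
                   strncasecmp(at + 1, server_domain, domlen) == 0;
        }

        /* Append the matching entry from `locals`, never the header text. */
        for (size_t i = 0; ours && i < nlocals && nout < maxout; i++) {
            if (strlen(locals[i]) == locallen &&
                strncmp(locals[i], p, locallen) == 0) {
                out[nout++] = locals[i];
                break;
            }
        }
        p = (*end == ',') ? end + 1 : end;
    }
    return nout;
}
```
-
- Because the output can only contain pointers into the caller's own list
- of local names, a hostile or mangled header has no way to inject a
- remote address into the recipient list.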
-
- The code in mxget.c is nasty, no two ways about it. But it's utterly
- necessary; there are a lot of MX pointers out there. It really ought
- to be a (documented!) entry point in the bind library.
-
- DNS error handling
-
- Fetchmail's behavior on DNS errors is to suppress forwarding and
- deletion of the individual message in which each error occurs, leaving
- it queued on the server for retrieval on a subsequent poll. The
- assumption is that DNS errors are transient, due to temporary server
- outages.
-
- Unfortunately this means that if a DNS error is permanent a message
- can be perpetually stuck in the server mailbox. We've had a couple bug
- reports of this kind due to subtle RFC822 parsing errors in the
- fetchmail code that resulted in impossible things getting passed to
- the DNS lookup routines.
-
- Alternative ways to handle the problem: ignore DNS errors (treating
- them as a non-match on the mailserver domain), or forward messages
- with errors to fetchmail's invoking user in addition to any other
- recipients. These would fit an assumption that DNS lookup errors are
- likely to be permanent problems associated with an address.
-
- IPv6 and IPSEC
-
- The IPv6 support patches are really more protocol-family independence
- patches. Because of this, in most places, "ports" (numbers) have been
- replaced with "services" (strings, which may be digits). This allows us
- to run with certain protocols that use strings as "service names"
- where we in the IP world think of port numbers. Someday we'll plumb
- strings all over and then, if inet6 is not enabled, do a
- getservbyname() down in SocketOpen. The IPv6 support patches use
- getaddrinfo(), which is a POSIX p1003.1g mandated function. So, in the
- not too distant future, we'll zap the ifdefs and just let autoconf
- check for getaddrinfo. IPv6 support comes pretty much automatically
- once you have protocol family independence.
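-
- A minimal sketch of what protocol-family independence looks like with
- getaddrinfo() follows. The function name open_service() is hypothetical
- (it is not fetchmail's SocketOpen), and the error handling is reduced to
- a bare -1.
-
```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L   /* expose getaddrinfo() under strict -std= modes */
#endif

#include <assert.h>
#include <netdb.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical stand-in for a SocketOpen-style routine: the service is a
 * string ("pop3", "110", ...) and getaddrinfo() picks the address family.
 * Returns a connected socket descriptor, or -1 on failure. */
int open_service(const char *host, const char *service)
{
    struct addrinfo hints, *res, *ai;
    int sock = -1;

    assert(host != NULL && service != NULL);

    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_UNSPEC;       /* IPv4 or IPv6, whichever resolves */
    hints.ai_socktype = SOCK_STREAM;

    if (getaddrinfo(host, service, &hints, &res) != 0)
        return -1;

    /* Try each returned address in turn until one connects. */
    for (ai = res; ai != NULL; ai = ai->ai_next) {
        sock = socket(ai->ai_family, ai->ai_socktype, ai->ai_protocol);
        if (sock < 0)
            continue;
        if (connect(sock, ai->ai_addr, ai->ai_addrlen) == 0)
            break;                     /* connected */
        close(sock);
        sock = -1;
    }
    freeaddrinfo(res);
    return sock;
}
```
-
- Nothing in the caller mentions AF_INET or AF_INET6, and the service can
- be a name or a string of digits; that is the whole point of passing
- strings rather than port numbers.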
-
- Internationalization
-
- Internationalization is handled using GNU gettext (see the file
- ABOUT_NLS in the source distribution). This places some minor
- constraints on the code.
-
- Strings that must be subject to translation should be wrapped with _()
- or N_() -- the former in function arguments, the latter in static
- initializers and other non-function-argument contexts.
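-
- The difference between the two markers is easiest to see side by side.
- The sketch below assumes the conventional gettext macro definitions; the
- table and the state_name() helper are made up for the example and do not
- appear in fetchmail.
-
```c
#include <assert.h>
#include <libintl.h>
#include <stddef.h>

#define _(s)  gettext(s)   /* translate now, at the point of use */
#define N_(s) (s)          /* mark for extraction only; translate later */

/* Static initializer: no function call is allowed here, so N_() is used. */
static const char *const state_names[] = {
    N_("idle"),
    N_("polling"),
    N_("delivering"),
};

/* Hypothetical helper returning a (possibly translated) state name. */
const char *state_name(int state)
{
    assert(state >= 0);    /* a negative state is a caller bug */

    if ((size_t)state >= sizeof state_names / sizeof state_names[0])
        return _("unknown state");    /* function context: use _() */

    /* Strings marked with N_() still need gettext() when they are used. */
    return gettext(state_names[state]);
}
```
-
- With no message catalog loaded, gettext() hands back its argument
- unchanged, so the program keeps working untranslated.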
-
- Checklist for Adding Options
-
- Adding a control option is not complicated in principle, but there are
- a lot of fiddly details in the process. You'll need to do the
- following minimum steps.
- * Add a field to represent the control in struct run, struct query,
- or struct hostdata.
- * Go to rcfile_y.y. Add the token to the grammar. Don't forget the
- %token declaration.
- * Pick an actual string to declare the option in the .fetchmailrc
- file. Add the token to rcfile_l.l.
- * Pick a long-form option name, and a one-letter short option if any
- are left. Go to options.c. Pick a new LA_ value. Hack the
- longoptions table to set up the association. Hack the big switch
- statement to set the option. Hack the `?' message to describe it.
- * If the default is nonzero, set it in def_opts near the top of
- load_params in fetchmail.c.
- * Add code to dump the option value in fetchmail.c:dump_params.
- * Add proper FLAG_MERGE actions in fetchmail.c's optmerge()
- function. (If it's a global option, add an override at the end of
- load_params.)
- * Document the option in fetchmail.man. This will require at least
- two changes; one to the collected table of options, and one full
- text description of the option.
- * Add the new token and a brief description to the header comment of
- sample.rcfile.
- * Hack fetchmailconf to configure it. Bump the fetchmailconf
- version.
- * Hack conf.c to dump the option so we won't have a version-skew
- problem.
- * Add an entry to NEWS.
-
- There may be other things you have to do in the way of logic, of
- course.
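-
- The options.c step of the checklist can be pictured with a cut-down
- getopt_long() example. The option name --example, the value LA_EXAMPLE,
- struct demo_opts and parse_demo_options() are all hypothetical; only the
- general shape (an LA_ value, a longoptions entry, a case in the switch)
- is taken from the checklist.
-
```c
#include <assert.h>
#include <getopt.h>
#include <stdlib.h>

/* Hypothetical LA_ value for a long option with no short form; it sits
 * above the range of single-character option codes. */
enum { LA_EXAMPLE = 256 };

/* Hypothetical control block standing in for struct run / struct query. */
struct demo_opts {
    int verbose;
    int example;       /* the newly added control */
};

static const struct option longoptions[] = {
    { "verbose", no_argument,       NULL, 'v' },
    { "example", required_argument, NULL, LA_EXAMPLE },
    { NULL, 0, NULL, 0 }
};

/* Parse argv into *o. Returns 0 on success, -1 on an unknown option. */
int parse_demo_options(int argc, char *const argv[], struct demo_opts *o)
{
    int c;

    assert(o != NULL);
    o->verbose = 0;
    o->example = 0;
    optind = 1;                        /* restart scanning for each call */

    while ((c = getopt_long(argc, argv, "v", longoptions, NULL)) != -1) {
        switch (c) {
        case 'v':
            o->verbose = 1;
            break;
        case LA_EXAMPLE:
            o->example = atoi(optarg);
            break;
        default:                       /* '?': getopt_long already complained */
            return -1;
        }
    }
    return 0;
}
```
-
- Even in this toy version a new option touches three places (the enum,
- the table and the switch), which is why the full checklist is as long as
- it is.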
-
- Before you implement an option, though, think hard. Is there any way
- to make fetchmail automatically detect the circumstances under which
- it should change its behavior? If so, don't write an option. Just do
- the check!
-
- Lessons learned
-
- 1. Server-side state is essential
-
- The person(s) responsible for removing LAST from POP3 deserve to
- suffer. Without it, a client has no way to know which messages in a
- box have been read by other means, such as an MUA running on the
- server.
-
- The POP3 UID feature described in RFC1725 to replace LAST is
- insufficient. The only problem it solves is tracking which messages
- have been read by this client -- and even that requires tricky,
- fragile implementation.
-
- The underlying lesson is that maintaining accessible server-side
- `seen' state bits associated with Status headers is indispensable in a
- Unix/RFC822 mail server protocol. IMAP gets this right.
-
- 2. Readable text protocol transactions are a Good Thing
-
- A nice thing about the general class of text-based protocols that
- SMTP, POP2, POP3, and IMAP belong to is that client/server
- transactions are easy to watch and transaction code correspondingly
- easy to debug. Given a decent layer of socket utility functions (which
- Carl provided) it's easy to write protocol engines and not hard to
- show that they're working correctly.
-
- This is an advantage not to be despised! Because of it, this project
- has been interesting and fun -- no serious or persistent bugs, no long
- hours spent looking for subtle pathologies.
-
- 3. IMAP is a Good Thing
-
- If there were a standard IMAP equivalent of the POP3 APOP validation,
- POP3 would be completely obsolete.
-
- 4. SMTP is the Right Thing
-
- In retrospect it seems clear that this program (and others like it)
- should have been designed to forward via SMTP from the beginning. This
- lesson may be applicable to other Unix programs that now call the
- local MDA/MTA as a program.
-
- 5. Syntactic noise can be your friend
-
- The optional `noise' keywords in the rc file syntax started out as a
- late-night experiment. The English-like syntax they allow is
- considerably more readable than the traditional terse keyword-value
- pairs you get when you strip them all out. I think there may be a
- wider lesson here.
-
- Motivation and validation
-
- It is truly written: the best hacks start out as personal solutions to
- the author's everyday problems, and spread because the problem turns
- out to be typical for a large class of users. So it was with Carl
- Harris and the ancestral popclient, and so with me and fetchmail.
-
- It's gratifying that fetchmail has become so popular. Until just
- before 1.9 I was designing strictly to my own taste. The multi-drop
- mailbox support and the new --limit option were the first features to
- go in that I didn't need myself.
-
- By 1.9, four months after I started hacking on popclient and a month
- after the first fetchmail release, there were literally a hundred
- people on the fetchmail-friends contact list. That's pretty powerful
- motivation. And they were a good crowd, too, sending fixes and
- intelligent bug reports in volume. A user population like that is a
- gift from the gods, and this is my expression of gratitude.
-
- The beta testers didn't know it at the time, but they were also the
- subjects of a sociological experiment. The results are described in my
- paper, The Cathedral And The Bazaar.
-
- Credits
-
- Special thanks go to Carl Harris, who built a good solid code base and
- then tolerated me hacking it out of recognition. And to Harry
- Hochheiser, who gave me the idea of the SMTP-forwarding delivery mode.
-
- Other significant contributors to the code have included Dave
- Bodenstab (error.c code and --syslog), George Sipe (--monitor and
- --interface), Gordon Matzigkeit (netrc.c), Al Longyear (UIDL support),
- Chris Hanson (Kerberos V4 support), and Craig Metz (OPIE, IPv6,
- IPSEC).
-
- Conclusion
-
- At this point, the fetchmail code appears to be pretty stable. It will
- probably undergo substantial change only if and when support for a new
- retrieval protocol or authentication method is added.
-
- Relevant RFCs
-
- Not all of these describe standards explicitly used in fetchmail, but
- they all shaped the design in one way or another.
-
- RFC821
- SMTP protocol
-
- RFC822
- Mail header format
-
- RFC937
- Post Office Protocol - Version 2
-
- RFC974
- MX routing
-
- RFC976
- UUCP mail format
-
- RFC1081
- Post Office Protocol - Version 3
-
- RFC1123
- Host requirements (modifies 821, 822, and 974)
-
- RFC1176
- Interactive Mail Access Protocol - Version 2
-
- RFC1203
- Interactive Mail Access Protocol - Version 3
-
- RFC1225
- Post Office Protocol - Version 3
-
- RFC1344
- Implications of MIME for Internet Mail Gateways
-
- RFC1413
- Identification server
-
- RFC1428
- Transition of Internet Mail from Just-Send-8 to 8-bit SMTP/MIME
-
- RFC1460
- Post Office Protocol - Version 3
-
- RFC1521
- MIME: Multipurpose Internet Mail Extensions
-
- RFC1869
- SMTP Service Extensions (ESMTP spec)
-
- RFC1652
- SMTP Service Extension for 8bit-MIMEtransport
-
- RFC1725
- Post Office Protocol - Version 3
-
- RFC1730
- Interactive Mail Access Protocol - Version 4
-
- RFC1731
- IMAP4 Authentication Mechanisms
-
- RFC1732
- IMAP4 Compatibility With IMAP2 And IMAP2bis
-
- RFC1734
- POP3 AUTHentication command
-
- RFC1870
- SMTP Service Extension for Message Size Declaration
-
- RFC1891
- SMTP Service Extension for Delivery Status Notifications
-
- RFC1892
- The Multipart/Report Content Type for the Reporting of Mail
- System Administrative Messages
-
- RFC1893
- Enhanced Mail System Status Codes
-
- RFC1894
- An Extensible Message Format for Delivery Status Notifications
-
- RFC1938
- A One-Time Password System
-
- RFC1939
- Post Office Protocol - Version 3
-
- RFC1957
- Some Observations on Implementations of the Post Office
- Protocol (POP3)
-
- RFC1985
- SMTP Service Extension for Remote Message Queue Starting
-
- RFC2033
- Local Mail Transfer Protocol
-
- RFC2060
- Internet Message Access Protocol - Version 4rev1
-
- RFC2061
- IMAP4 Compatibility With IMAP2bis
-
- RFC2062
- Internet Message Access Protocol - Obsolete Syntax
-
- RFC2195
- IMAP/POP AUTHorize Extension for Simple Challenge/Response
-
- RFC2449
- POP3 Extension Mechanism
- _________________________________________________________________
-
-
-
- Eric S. Raymond <esr@snark.thyrsus.com>