ref: a8eedc80972f224d30d6a5184dee0e089032e3c6
parent: 10666c108567073748e4ddab016ecea29d0e3519
author: Ori Bernstein <ori@eigenstate.org>
date: Thu Nov 26 15:07:55 EST 2020
add upas theory of operations
--- /dev/null
+++ b/upas-theory.md
@@ -1,0 +1,266 @@
+Upas: Theory of Operation
+-------------------------
+
+Upas is the Plan 9 mail system. It's used for
+viewing mail, sending mail, and receiving mail.
+It comes with clients and servers for SMTP,
+IMAP, etc. It also provides a powerful toolkit
+for spam filtering and mail processing.
+
+Upas is configured through a scattering of
+methods. There are a few config files, and
+a number of scripts which users are intended
+to customize.
+
+Here's a list of some important files:
+
+ remotemail
+ The script that you customize
+ for delivering mail to remote
+ systems.
+
+ qmail
+ Enqueues mail for later delivery,
+ applying filters along the way.
+
+ rewrite
+ Rewrites and matches the destination
+ of the email, deciding which mail
+ box or smtp server to put the message
+ into.
+
+ smtpd.conf
+ Configures the SMTP server.
+
+ validateaddress
+ Checks if we should deliver to an
+ address on this system.
+
+There are a number of additional files not mentioned
+in this summary.
+
+
+Viewing
+-------
+
+Viewing mail with upas involves very few moving
+parts. Upas/fs connects to most mail protocols,
+and provides a consistent file system interface
+for all of them, abstracting the storage system
+away from the mail clients.
+
+Upas/fs knows how to render a file system for
+local mailboxes, maildirs, pop, and imap,
+serving them up in a multi-level hierarchy
+in /mail/fs, with one subdirectory for each
+mailbox mounted:
+
+ /mail/fs/$mbox/$mail/$subfiles
+
+For example, to see who sent the first email in
+the default mailbox, you could run:
+
+ cat /mail/fs/mbox/1/from
+
+Typically, you'd access /mail/fs through a client
+such as nedmail or acme Mail.
+
+Upas/fs only has one config file in /mail/lib,
+for configuring which headers are shown.
+
+
+Sending And Receiving
+---------------------
+
+Sending and receiving email via SMTP in upas are
+similar operations: a message enters the
+pipeline, is routed, and is delivered to the
+appropriate destination.
+
+Upas/send is the heart of the delivery pipeline.
+Sending invokes upas/marshal to drop an email
+into upas/send, while receiving does this via
+upas/smtpd.
+
+Sending and receiving in upas both roughly follow
+the same path. Both of them take an email, and
+dump it into upas/send, which applies the rewrite
+rules and sends it on to further routing depending
+on the destination of the email.
+
+The major difference between sending and receiving
+is in the starting point: When composing an email
+on Plan 9, it gets sent to upas/marshal to drop it
+into the delivery pipeline. When Plan 9 is set up
+to receive mail directly, mail comes in through
+upas/smtpd.
+
+Send accepts a well-formed email, and applies
+the rewrite rules in /mail/lib/rewrite. The
+rewrite rules are expected to match an email
+address and take an appropriate action.
+
+With a typical rewrite configuration, if the
+mail matches a local user, then the email will
+get deposited into their mailbox. Otherwise,
+the email is punted to /mail/lib/remotemail.
+
+With the default gateway setup, the rewrite
+rules that upas/send applies hand outgoing
+mail to qmail for queueing, so the pipeline
+looks something like this:
+
+ upas/marshal => upas/send =>
+ /mail/lib/qmail => qer =>
+ /mail/lib/remotemail => upas/smtp
+
+With the example smtp setup, rewrite also
+handles delivering emails locally.
+
+However, because of the flexibility of the
+rewrite rules, everything after upas/send
+can be swapped out and replaced.
+
+Marshal
+-------
+
+Marshal is the simpler of the two entry
+points into the mail system. All it does
+is take a message that you may type by
+hand, format it into an rfc822 envelope,
+and (depending on flags) pass it on to send.
+
+It's also used in the receiving pipeline,
+but only as an address validator, using the
+'-x' flag to examine whether an address is
+deliverable.
+
+Smtpd
+-----
+
+Smtpd is the other entry point. It takes internet
+mail from other systems, and puts it into the
+delivery pipeline.
+
+It reads its config from /mail/lib/smtpd.conf.
+The default options for smtpd are not safe to
+put on the internet: Open relaying should be
+disabled, at minimum.
+
+It uses /mail/lib/validateaddress to check whether
+the user is available on the system. In the
+default implementation of validateaddress,
+upas/marshal -x $addr is used to expand aliases,
+and check if local delivery is possible for the
+address in question.
+
+If the address is locally deliverable, then
+send is invoked to deliver the mail. Otherwise,
+the mail is either relayed or rejected.
+
+Filtering
+---------
+
+In addition to the config files in /mail/lib,
+each user can configure mail filtering by
+editing /mail/box/$user/pipeto. This is where,
+for example, spam filtering would be done.
+
+An example of spam filtering is in:
+
+ /mail/lib/smtpd.example/pipeto.bayes
+
+There are some more complicated examples in:
+
+ /sys/src/cmd/upas/filterkit/pipeto.sample
+ /sys/src/cmd/upas/filterkit/pipefrom.sample
+
+The scripts are run as user 'none', to protect
+you from any funny business.
+
+Upas ships with a number of programs designed
+to work with the pipefrom or pipeto setup.
+
+These include:
+
+ upas/filter
+ upas/list
+ upas/deliver
+ upas/token
+ upas/vf
+ upas/bayes
+
+There's also a utility library used by rc
+to make pipe scripts easier. It can be loaded
+like this:
+
+ . /mail/lib/pipeto.lib $*
+
+The pipeto script is invoked as:
+
+ rfc822-email | pipeto destaddr destmbox
+
+and is expected to eventually invoke
+
+ upas/deliver
+
+to deliver the filtered email.
+
+Storage Formats
+---------------
+
+If /mail/box/$user/$mbox is a file, then it's assumed
+to be in mbox format. If it's a directory, then it's
+assumed to be in mdir format. If the mailbox does not
+exist, then a new maildir is created.
+
+The Binaries
+------------
+
+Spam filtering:
+ upas/addhash: Merges bayes token hash tables together
+ upas/bayes: Evaluates bayes tokens
+ upas/msgtok: Tokenizes spam for bayesian filter
+ upas/isspam: Checks if a message is spam
+ upas/token: Creates a message hash
+ upas/spf: Verifies SPF records
+ upas/ratfs: Spam blocklist FS
+ upas/vf: Virus filtering
+ upas/list: Maintains and checks lists of users
+ upas/scanmail: Fixed-pattern spam filtering
+
+Mail filtering:
+ upas/aliasmail: Manages translating mail aliases
+ upas/deliver: Drops a message into a specific mailbox
+ upas/filter: Reroutes messages to different mailboxes
+
+Mail serving:
+ upas/imap4d: Serves imap
+ upas/pop3: Serves pop3
+ upas/smtpd: Serves smtp
+
+Sending:
+ upas/marshal: Submits a message for delivery
+ upas/smtp: Sends a message to another mail server
+
+Mailing lists:
+ upas/ml: Receives and bounces mailing list messages
+ upas/mlmgr: Manages mailing lists
+ upas/mlowner: Manages mailing list owner/control messages
+
+Internal Plumbing:
+ upas/qer: Enqueues commands
+ upas/runq: Runs and retries enqueued commands
+ upas/mbappend: Appends messages to mbox or mdir mailboxes
+ upas/send: Starts the email delivery process
+
+Utilities:
+ upas/msgcat: Shows message contents
+ upas/testscan: Dry run of scanmail
+ upas/spam: Marks an email as spam
+ upas/unspam: Reduces spam weight of message tokens
+ upas/tfmt: Prevents top-posting
+ upas/unesc: Interprets =?foo?bar?=char?= escapes
+
+Clients:
+ upas/fs: Renders a mailbox as a file system