Authentication Mechanisms

Plaintext authentication
The simplest authentication mechanism is PLAIN. The client simply sends the password unencrypted to Dovecot. All clients support the PLAIN mechanism, but obviously there's the problem that anyone listening on the network can steal the password. For that reason (and some others) other mechanisms were implemented.

Today however many people use SSL/TLS, and there's no problem with sending an unencrypted password inside SSL-secured connections. So if you're using SSL, you probably don't need to bother with anything other than the PLAIN mechanism.

Another plaintext mechanism is LOGIN. It's typically used only by SMTP servers to let Outlook clients perform SMTP authentication. Note that the LOGIN mechanism is not the same as IMAP's LOGIN command. The LOGIN command is internally handled using the PLAIN mechanism.

Non-plaintext authentication
Non-plaintext mechanisms have been designed to be safe to use even without SSL/TLS encryption. Because of how they have been designed, they require access to the plaintext password or their own special hashed version of it. This means that it's impossible to use non-plaintext mechanisms with commonly used DES or MD5 password hashes.

If you want to use more than one non-plaintext mechanism, the passwords must be stored as plaintext so that Dovecot is able to generate the required special hashes for all the different mechanisms. If you want to use only one non-plaintext mechanism, you can store the passwords using the mechanism's own password scheme.

With success/failure password databases (e.g. PAM) it's not possible to use non-plaintext mechanisms at all, because they only support verifying a known plaintext password.

Dovecot supports the following non-plaintext mechanisms:
• CRAM-MD5: Protects the password in transit against eavesdroppers. Somewhat good support in clients.
• DIGEST-MD5: Somewhat stronger cryptographically than CRAM-MD5, but clients rarely support it.
• APOP: This is a POP3-specific authentication. Similar to CRAM-MD5, but requires storing password in plaintext.
• NTLM: Mechanism created by Microsoft and supported by their clients. Optionally supported using Samba's winbind.
• GSS-SPNEGO: Similar to NTLM.
• GSSAPI: Kerberos v5 support.
• RPA: Compuserve RPA authentication mechanism. Similar to DIGEST-MD5, but client support is rare.
• ANONYMOUS: Support for logging in anonymously. This may be useful if you're intending to provide a publicly accessible IMAP archive.
• OTP and SKEY: One time password mechanisms. Supported only by Dovecot v1.1 and later.
• EXTERNAL: EXTERNAL SASL mechanism. Supported only by Dovecot v1.2 and later.
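To illustrate why a mechanism such as CRAM-MD5 needs the plaintext password (or a mechanism-specific hash of it) on the server side, here is a minimal Python sketch of the client half of the exchange as described in RFC 2195. The function name is only illustrative and this is not Dovecot's own code; the server must repeat the same keyed hash to verify the response, which is why a DES or MD5 crypt hash is not enough.

```python
import base64
import hashlib
import hmac


def cram_md5_response(username: str, password: str, challenge_b64: str) -> str:
    """Build the base64 CRAM-MD5 reply for a base64-encoded server challenge."""
    challenge = base64.b64decode(challenge_b64)
    # The password is the HMAC key; it never crosses the wire itself.
    digest = hmac.new(password.encode("utf-8"), challenge, hashlib.md5).hexdigest()
    reply = f"{username} {digest}".encode("utf-8")
    return base64.b64encode(reply).decode("ascii")


if __name__ == "__main__":
    # Challenge and credentials taken from the example in RFC 2195.
    challenge = "PDE4OTYuNjk3MTcwOTUyQHBvc3RvZmZpY2UucmVzdG9uLm1jaS5uZXQ+"
    print(cram_md5_response("tim", "tanstaaftanstaaf", challenge))
```

An eavesdropper sees only the challenge and the keyed digest, never the password, but the server can check the digest only if it can compute the same HMAC.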

Configuration
By default only PLAIN mechanism is enabled. You can change this by modifying dovecot.conf:
auth default {
  mechanisms = plain login cram-md5
  # ..
}

SSL
SSL and TLS terms are often used in confusing ways:
• SSL (Secure Sockets Layer) is the original protocol implementation. SSLv3 is still allowed by Dovecot, but it's rarely used. Some clients use SSL to mean that they're going to connect to the imaps (993), pop3s (995) or smtps (465) port, although they're still going to use TLSv1 protocol.
• TLS (Transport Layer Security) replaced the SSL protocol. TLSv1 protocol is used practically always nowadays. Some clients use TLS to mean that they're going to use STARTTLS command after connecting to the standard imap (143), pop3 (110) or smtp port (25/587). Nothing would prevent using SSLv3 protocol after STARTTLS command.

Using two separate ports for plaintext and SSL connections was thought to be wasteful, so STARTTLS was intended to deprecate the SSL ports (imaps, pop3s, smtps, etc). This never really happened, probably for several reasons:
• Some admins don't even know about STARTTLS.
• Some admins want to require SSL/TLS, but don't realize that this is also possible with STARTTLS (Dovecot has disable_plaintext_auth=yes and ssl=required settings).
• Some admins understand everything, but still prefer to allow only SSL ports. This could be because it makes it easier to ensure that no information is leaked, because SSL/TLS handshake happens immediately.
  ○ Some clients unfortunately try to do plaintext authentication without STARTTLS, even when IMAP server has told the client that it won't work.
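The difference between the dedicated SSL port and STARTTLS can be sketched from the client side with Python's standard imaplib module. This is only an illustration: the function names are made up for this example and the host name passed in is whatever server you are testing against.

```python
import imaplib
import ssl

IMAPS_PORT = 993  # dedicated SSL port: the TLS handshake happens immediately
IMAP_PORT = 143   # standard port: the session starts in plaintext


def connect_implicit_tls(host: str) -> imaplib.IMAP4_SSL:
    """Connect to the imaps port; nothing is sent before the TLS handshake."""
    context = ssl.create_default_context()
    return imaplib.IMAP4_SSL(host, IMAPS_PORT, ssl_context=context)


def connect_starttls(host: str) -> imaplib.IMAP4:
    """Connect to the standard port, then upgrade the session with STARTTLS."""
    context = ssl.create_default_context()
    conn = imaplib.IMAP4(host, IMAP_PORT)
    # Until this call succeeds the connection is unencrypted, so a client
    # must not send a password before it.
    conn.starttls(ssl_context=context)
    return conn
```

In both cases the protocol actually negotiated is decided by the TLS libraries on each side, not by which port was used.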

Unfortunately there doesn't seem to be any clear and simple way to refer to these different meanings. The term SSL is much more widely understood than TLS, so the Dovecot configuration and this documentation talk only about SSL when in fact they mean both SSL and TLS.

Self-signed SSL certificates
Self-signed SSL certificates are the easiest way to get your SSL server working. However unless you take some action to prevent it, this is at the cost of security:

• The first time the client connects to the server, it sees the certificate and asks the user whether to trust it. The user of course doesn't really bother verifying the certificate's fingerprint, so a man-in-the-middle attack can easily bypass all the SSL security, steal the user's password and so on.
• If the client was lucky enough not to get attacked the first time it connected, the following connections will be secure as long as the client had permanently saved the certificate. Some clients do this, while others have to be manually configured to accept the certificate.

The only way to be fully secure is to import the SSL certificate into the client's (or operating system's) list of trusted CA certificates prior to the first connection. See SSL/CertificateClientImporting for how to do it for different clients.

Self-signed certificate creation

Dovecot includes a script to build self-signed SSL certificates using OpenSSL. In the source distribution this exists in doc/mkcert.sh. Binary installations usually create the certificate automatically when installing Dovecot and don't include the script.

The SSL certificate's configuration is taken from the doc/dovecot-openssl.cnf file. Modify the file before running mkcert.sh. An especially important field is the CN (Common Name) field, which should contain your server's host name. The clients will verify that the CN matches the connected host name, otherwise they'll say the certificate is invalid. It's also possible to use wildcards (eg. *.domain.com) in the host name. They should work with most clients.

By default the certificate is created in /etc/ssl/certs/dovecot.pem and the private key file is created in /etc/ssl/private/dovecot.pem. Also by default the certificate will expire in 365 days. If you wish to change any of these, modify the mkcert.sh script.

Certificate Authorities
The correct way to use SSL is to have each SSL certificate signed by a Certificate Authority (CA). The client has a list of trusted Certificate Authorities, so whenever it sees a new SSL certificate signed by a trusted CA, it will automatically trust the new certificate without asking the user any questions.

There are two ways to get a CA signed certificate: buy it, or create your own CA. The clients have a built-in list of trusted CAs, so buying from one of those CAs will have the advantage of the certificate working without any client configuration. If you create your own CA, you'll have to install the CA certificate to all the clients (see SSL/CertificateClientImporting).

There are multiple different tools for managing your own CA. The simplest way is to use a CA managing tool such as gnoMint or TinyCA. However, if you need to tailor the properties of the CA, you can always use OpenSSL, which is very customizable but a bit cumbersome.

Dovecot is a high performance, secure, and fully standards-compliant IMAP/POP3 server. It also boasts a much simpler configuration setup than other IMAP servers and has a broad variety of authentication mechanisms. It also supports SSL and TLS encryption. Many distributions are now available with Dovecot included; it may not be the default IMAP/POP3 server, but it is usually a simple install command away.

Once you have installed Dovecot, the configuration file will most likely be /etc/dovecot.conf. Many of the defaults are likely sufficient and will require little changes unless you need specific locations for the mail spool, whether to change default authentication options, and so forth.

By default, Dovecot will only act as an IMAP server, but it can act as a POP3 server as well. To do this, edit /etc/dovecot.conf and look for the protocols section:

protocols = pop3

This would tell Dovecot to act as a pure POP3 server. If you want to provide the full gamut of POP3 and IMAP, with both the regular and SSL variants, use:

protocols = pop3 pop3s imap imaps

To use SSL, you will need to appropriately set the ssl_cert_file and ssl_key_file settings, and set ssl_disable to no. The simplest way to get these certificates is to use the mkcert.sh script that Dovecot comes with. On Mandriva Linux, this file is stored in /usr/share/doc/dovecot/. There is also a dovecot-openssl.cnf file that you will want to edit to set the SSL certificate options. Depending on where you wish to store the certificate and key file, you may want to edit mkcert.sh as well, or change the SSLDIR variable to override the location:

# cd /usr/share/doc/dovecot
# vim dovecot-openssl.cnf
# mkdir -p /etc/ssl/dovecot/{certs,private}
# SSLDIR=/etc/ssl/dovecot sh mkcert.sh
Generating a 1024 bit RSA private key
.++++++
..++++++
writing new private key to '/etc/ssl/dovecot/private/dovecot.pem'
-----
subject= /C=CA/ST=Alberta/L=Edmonton/O=Foo Company/OU=IMAP server/CN=example.com/emailAddress=admin@example.com
SHA1 Fingerprint=9A:23:B8:B4:0E:16:06:11:B2:FE:4E:49:C8:A8:C2:87:D8:79:1B:82

Next, edit /etc/dovecot.conf again and set the following:

ssl_disable = no
ssl_cert_file = /etc/ssl/dovecot/certs/dovecot.pem
ssl_key_file = /etc/ssl/dovecot/private/dovecot.pem

Now restart dovecot and it will authenticate against the system for users, using PAM. Dovecot does support virtual users as well, which makes it quite versatile. More information on configuring Dovecot and all the other features it provides is available on the Dovecot wiki.

Dovecot LDA
The Dovecot LDA, called deliver, is a local delivery agent which takes mail from an MTA and delivers it to a user's mailbox, while keeping Dovecot index files up to date.

This page describes the common settings required to make deliver work. You should read it first, and then the MTA specific pages:
• LDA/Postfix
• LDA/Exim
• LDA/Sendmail
• LDA/Qmail
• LDA/ZMailer

Main features of Dovecot LDA

• Mailbox indexing during mail delivery, providing faster mailbox access later
• Quota enforcing by a plugin
• Sieve language support by a plugin
  ○ Mail filtering
  ○ Mail forwarding
  ○ Vacation auto-reply

Common configuration
The configuration is done in the protocol lda section in dovecot.conf. The important settings are:
• postmaster_address is used as the From: header address in bounce mails
• hostname is used in generated Message-IDs and in Reporting-UA: header in bounce mails
• sendmail_path is used to send mails. Note that the default is /usr/lib/sendmail, which doesn't necessarily work the same as /usr/sbin/sendmail.
• auth_socket_path specifies the UNIX socket to dovecot-auth where deliver can lookup userdb information when -d parameter is used. See below how to configure Dovecot to create the socket.

Note that dovecot.conf file must be world readable to enable the deliver process to read it while running with user privileges.

Parameters
Parameters accepted by deliver:
• -d <username>: Destination username. If given, the user information is looked up from dovecot-auth. Typically used with virtual users, but not necessarily with system users.
• -a <address>: Destination address (e.g. user+ext@domain). Default is the same as username. (v1.1+ only)
• -f <address>: Envelope sender address.
• -c <path>: Alternative configuration file path.
• -m <mailbox>: Destination mailbox (default is INBOX). If the mailbox doesn't exist, it's created (unless -n is used). If message couldn't be saved to the mailbox for any reason, it's delivered to INBOX instead.
  ○ If Sieve plugin is used, this mailbox is used as the "keep" action's mailbox. It's also used if there is no Sieve script or if the script fails for some reason.
  ○ v1.1: Deliveries to namespace prefix will result in saving the mail to INBOX instead. For example if you have "Mail/" namespace, this allows you to specify deliver -n -m Mail/$mailbox where mail is stored to Mail/$mailbox or to INBOX if $mailbox is empty.
  ○ The mailbox name is specified the same as it's visible in IMAP client. For example if you've a Maildir with .box.sub/ directory and your namespace configuration is prefix=INBOX/, separator=/, the correct way to deliver mail there is to use -m INBOX/box/sub
• -n: If the destination mailbox doesn't exist, don't create it (v1.0.1+). This affects both -m parameter and fileinto action in Sieve scripts. The fallback is to deliver mail to INBOX.
• -s: Subscribe to mailboxes that are automatically created (via -m parameter or fileinto Sieve action). (v1.1.3+)
• -e: If mail gets rejected, write the rejection reason to stderr and exit with EX_NOPERM. The default is to send a rejection mail ourself (v1.1+).
• -k: Don't clear all environment at startup (v1.1+)
• -p <path>: Path to the mail to be delivered instead of reading from stdin. If using maildir the file is hard linked to the destination if possible. This allows a single mail to be delivered to multiple users using hard links, but currently it also prevents deliver from updating cache file so it shouldn't be used unless really necessary. (v1.1+)

Return values
deliver will exit with one of the following values:
• 0 (EX_OK): Delivery was successful
• 64 (EX_USAGE): Invalid parameter given.
• 67 (EX_NOUSER): The destination username was not found.
• 75 (EX_TEMPFAIL): A temporary failure. This is returned for almost all failures. See the log file for details.
• 77 (EX_NOPERM): -e parameter was used and mail was rejected. Typically this happens when user is over quota and quota_full_tempfail=no.
• 78 (EX_CONFIG): Failed to read configuration file, a missing configuration setting or deliver binary is setuid-root and world-executable. (v1.2+ no longer uses this.)
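A wrapper script that invokes deliver may want to log which of the return values listed above it received. The following Python sketch only restates that list as a lookup table; the names DELIVER_EXIT_CODES, describe_deliver_exit and is_temporary_failure are invented for this example and are not part of Dovecot.

```python
# Exit codes documented for deliver, keyed by numeric value.
DELIVER_EXIT_CODES = {
    0: ("EX_OK", "delivery was successful"),
    64: ("EX_USAGE", "invalid parameter given"),
    67: ("EX_NOUSER", "the destination username was not found"),
    75: ("EX_TEMPFAIL", "temporary failure, see the log file for details"),
    77: ("EX_NOPERM", "-e was used and the mail was rejected"),
    78: ("EX_CONFIG", "configuration problem"),
}


def describe_deliver_exit(code: int) -> str:
    """Return a one-line description of a deliver exit code."""
    name, meaning = DELIVER_EXIT_CODES.get(
        code, ("UNKNOWN", "exit code not listed in this document")
    )
    return f"{code} ({name}): {meaning}"


def is_temporary_failure(code: int) -> bool:
    """True only for EX_TEMPFAIL, the code returned for almost all failures."""
    return code == 75
```

For example, describe_deliver_exit(77) yields a line naming EX_NOPERM, which a wrapper could write to its own log before passing the code back to the MTA unchanged.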
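System users
You can use deliver with a few selected system users (ie. user is found from /etc/passwd / NSS) by calling deliver in the user's ~/.forward file:

| "/usr/local/libexec/dovecot/deliver"

This should work with any MTA which supports per-user .forward files. For qmail's per-user setup, see LDA/Qmail.

This method doesn't require the authentication socket explained below since it's executed as the user itself.

Virtual users
With a lookup
Give the destination username to deliver with -d parameter, for example: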

deliver -f $FROM_ENVELOPE -d $DEST_USERNAME

You'll need to set up a master authentication socket for deliver so it knows where to find mailboxes for the users:

protocol lda {
  ..
  # UNIX socket path to master authentication server to find users.
  #auth_socket_path = /var/run/dovecot/auth-master
}
auth default {
  ..
  socket listen {
    # Note that we're setting a master socket. SMTP AUTH for Postfix and Exim uses client sockets.
    master {
      # Typically under base_dir/, if not the directory must be created.
      path = /var/run/dovecot/auth-master
      # Auth master socket can be used to look up userdb information for
      # given usernames. This probably isn't very sensitive information
      # for most systems, but still try to restrict the socket access if possible.
      mode = 0600
      user = vmail # User running deliver
      #group = mail # Or alternatively mode 0660 + deliver user in this group
    }
  }
}

The master socket can be used to do userdb lookups for given usernames. Typically the result will contain the user's UID, GID and home directory, but depending on your configuration it may return other information as well. So the information is similar to what can be found from eg. /etc/passwd for system users. This means that it's probably not a problem to use mode=0666 for the socket, but you should try to restrict it more just to be safe.

Without a lookup
If you have already looked up the user's home directory and you don't need a userdb lookup for any other reason either (such as overriding settings for specific users), you can run deliver similar to how it's run for system users:

HOME=/path/to/user/homedir deliver -f $FROM_ENVELOPE

This way you don't need to have a master listener socket. Note that you should verify the user's existence prior to running deliver, otherwise you'll end up having mail delivered to non-existing users as well.

You must have set the proper UID (and GID) before running deliver. It's not possible to run deliver as root without -d parameter.

Multiple UIDs
If you're using more than one UID for users, you're going to have problems running deliver, as most MTAs won't let you run deliver as root. There are two ways to work around this problem:
1. Make deliver setuid-root.

2. Use sudo to wrap the invocation of deliver.

Making deliver setuid-root:
Beware: it's insecure to make deliver setuid-root, especially if you have untrusted users in your system. Setuid-root deliver can be used to gain root privileges. You should take extra steps to make sure that untrusted users can't run it and potentially gain root privileges. You can do this by making sure only your MTA has execution access to it. For example:

# chgrp secmail /usr/local/libexec/dovecot/deliver
# chmod 04750 /usr/local/libexec/dovecot/deliver
# ls -l /usr/local/libexec/dovecot/deliver
-rwsr-x--- 1 root secmail 4023932 2009-01-15 16:23 deliver

Then start deliver as a user that belongs to secmail group. Note that you have to recreate these rights after each update of dovecot.

Using sudo:
Alternatively, you can use sudo to wrap the invocation of deliver. This has the advantage that updates will not clobber the setuid bit, but note that it is just as insecure being able to run deliver via sudo as setuid-root. Make sure you only give your MTA the ability to invoke deliver via sudo.

First configure sudo to allow 'dovelda' user to invoke deliver by adding the following to your /etc/sudoers:

Defaults:dovelda !syslog
dovelda ALL=NOPASSWD:/usr/local/libexec/dovecot/deliver

Then configure your MTA to invoke deliver as user 'dovelda' and via sudo:

/usr/bin/sudo /usr/local/libexec/dovecot/deliver

instead of just plain /usr/local/libexec/dovecot/deliver.

Problems with deliver
• Namespaces are supported with v1.1 and later. With v1.0 and older versions mails can be delivered only to mailboxes specified by the mail_location setting.
• If you are using prefetch userdb, keep in mind that deliver does not make a password query and thus will not work if -d parameter is used. The UserDatabase/Prefetch page explains how to fix this.
  ○ See Checkpassword for how to make deliver work with checkpassword.

Logging
• Normally Dovecot logs everything through its master process, which is running as root. Deliver doesn't, which means that you might need some special configuration for it to log anything at all.
• If you have trouble finding where Dovecot logs by default, see Logging.
• Note that Postfix's mailbox_size_limit setting applies to all files that are written to. So if you have a limit of 50 MB, deliver can't write to log files larger than 50 MB and you'll start getting temporary failures.
• If deliver fails to write to log files it exits with temporary failure.

If you want deliver to keep using Dovecot's default log files:
• If you're logging to syslog, make sure the syslog socket (usually /dev/log) has enough write permissions for deliver. For example set it world-read/writable: chmod a+rw /dev/log.
• If you're logging to Dovecot's default log files again you'll need to give enough write permissions to the log files for deliver.

You can also specify different log files for deliver. This way you don't have to give any extra write permissions to other log files or the syslog socket. You can do this by overriding the log_path and info_log_path settings:

protocol lda {
  ..
  # remember to give proper permissions for these files as well
  log_path = /var/log/dovecot-deliver-errors.log
  info_log_path = /var/log/dovecot-deliver.log
}

For using syslog with deliver, set the paths empty:

protocol lda {
  ..
  log_path =
  info_log_path =
  # You can also override the default syslog_facility:
  #syslog_facility = mail
}

Dovecot Plugins
• Most of the Dovecot plugins work with deliver. http://wiki.dovecot.org/Plugins
• Virtual quota can be enforced using Quota plugin. http://wiki.dovecot.org/Quota
• Sieve language support can be added with Sieve plugin. http://wiki.dovecot.org/LDA/Sieve

Maildir
This format debuted with the qmail server in the mid-1990s. Each mailbox folder is a directory and each message a file. This improves efficiency because individual emails can be modified, deleted and added without affecting the mailbox or other emails, and makes it safer to use on networked file systems such as NFS.

Contents
1. Maildir
   1. Dovecot extensions
      1. IMAP UID mapping
      2. IMAP keywords
      3. Maildir filename extensions
   2. Maildir and filesystems
      1. General comparisons of Maildir on different filesystems
      2. Linux ext2 / ext3
      3. ReiserFS
      4. XFS
         1. Various tips
   3. Directory Structure
   4. Issues with the specification
      1. Locking
      2. Mail delivery
   5. Procmail Problems
   6. References

Dovecot extensions
Since the standard maildir specification doesn't provide everything needed to fully support the IMAP protocol, Dovecot had to create some of its own non-standard extensions. The extensions still keep the maildir standards compliant, so MUAs not supporting the extensions can still safely use it as a normal maildir.

IMAP UID mapping
IMAP requires each message to have a permanent unique ID number. Dovecot uses dovecot-uidlist file to keep UID <-> filename mapping. The file is basically in the same format as Courier IMAP's courierimapuiddb file, except for one difference (see below).

The file begins with a header:

1 1173189136 20221

Where 1 means the file format version number, 1173189136 is the IMAP UIDVALIDITY and 20221 is the UID that will be given to the next added message.

The version number is always 1 currently. Dovecot used to have version number 2 also for a while, so if the number is ever increased it needs to become version 3.

After the header comes the list of UID <-> filename mappings:

123 1035478339.27041_118.foo.org
20220 1035478339.27041_118.foo.org:2,S

With Courier IMAP the filenames contained only the maildir file's basename (ie. everything before ":2," string). Dovecot instead writes the file's last known full filename. Usually this allows opening the file without reading the directory's contents to find the file's current file name. Because with maildir the filename changes every time the message's flags change, the filename listed in the file doesn't necessarily exist.

The dovecot-uidlist file doesn't need to be locked for reading. When writing dovecot-uidlist, dovecot-uidlist.lock file needs to be created. The dovecot-uidlist file must never be directly modified, it can only be replaced with rename() call.

dovecot-uidlist is updated lazily to optimize for disk I/O. If a message is expunged, it may not be removed from dovecot-uidlist until sometimes later. This means that if you create a new file using the same file name as what already exists in dovecot-uidlist, Dovecot thinks you "unexpunged" message by restoring a message from backup. This causes a warning to be logged and the file to be renamed.

Note that messages must not be modified once they've been delivered. IMAP (and Dovecot) requires that messages are immutable. If you wish to modify them in any way, create a new message instead and expunge the old one.
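The version 1 dovecot-uidlist layout described above (a three-number header followed by "UID filename" lines) can be read with a few lines of Python. This is a read-only sketch under the assumption that the file looks exactly like the sample shown; the function names are illustrative, and as the text says the real file must never be modified directly.

```python
def parse_uidlist(text: str):
    """Parse version 1 dovecot-uidlist content into (header, mappings)."""
    lines = [line for line in text.splitlines() if line.strip()]
    version, uidvalidity, next_uid = (int(field) for field in lines[0].split())
    mappings = {}
    for line in lines[1:]:
        # The UID is the first column; the rest of the line is the filename.
        uid, filename = line.split(" ", 1)
        mappings[int(uid)] = filename
    header = {"version": version, "uidvalidity": uidvalidity, "next_uid": next_uid}
    return header, mappings


def maildir_basename(filename: str) -> str:
    """Return everything before the ':2,' flags separator."""
    return filename.split(":2,", 1)[0]


if __name__ == "__main__":
    sample = (
        "1 1173189136 20221\n"
        "123 1035478339.27041_118.foo.org\n"
        "20220 1035478339.27041_118.foo.org:2,S\n"
    )
    header, mappings = parse_uidlist(sample)
    print(header["uidvalidity"], header["next_uid"])
    print(maildir_basename(mappings[20220]))
```

Because the listed filename is only the last known one, a reader that fails to open it would fall back to scanning the directory for a file with the same basename.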

IMAP keywords
All the non-standard message flags are called keywords in IMAP. Some clients use these automatically for marking spam (eg. $Junk, $NonJunk, $Spam, $NonSpam keywords). Thunderbird uses labels which map to keywords $Label1, $Label2, etc.

Dovecot stores keywords in the maildir filename's flags field using letters a..z. This means that only 26 keywords are possible to store in the maildir. If more are used, they're still stored in Dovecot's index files.

The mapping from single letters to keyword names is stored in dovecot-keywords file. The file is in format:

0 $Junk
1 $NonJunk

0 means letter 'a' in the maildir filename, 1 means 'b' and so on. The file doesn't need to be locked for reading, but when writing dovecot-uidlist file must be locked. The file must not be directly modified, it can only be replaced with rename() call.

Maildir filename extensions
The standard filename definition is: "<base filename>:2,<flags>". Dovecot has extended the <flags> field to be "<flags>[,<non-standard fields>]". This means that if Dovecot sees a comma in the <flags> field while updating flags in the filename, it doesn't touch anything after the comma. However other maildir MUAs may mess them up, so it's still not such a good idea to do that. Basic <flags> are described here. The <non-standard fields> isn't used by Dovecot for anything currently.

Dovecot supports reading a few fields from the <base filename>:
• ,S=<size>: <size> contains the file size. Getting the size from the filename avoids doing a stat(), which may improve the performance. This is especially useful with Maildir++ quota.
• ,W=<vsize>: <vsize> contains the file's RFC822.SIZE, ie. the file size with linefeeds being CR+LF characters. If the message was stored with CR+LF linefeeds, <size> and <vsize> are the same. Setting this may give a small speedup because now Dovecot doesn't need to calculate the size itself.

A maildir filename with those fields would look something like:

1035478339.27041_118.foo.org,S=1000,W=1030:2,S
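The filename layout and the keyword letters described above can be decoded as follows. This Python sketch assumes filenames shaped like the example (base name, optional ,S= and ,W= fields, then ":2," and the flags); the function names and the returned dictionary layout are invented for illustration.

```python
def parse_keywords_file(text: str) -> dict:
    """Parse dovecot-keywords content ('<index> <keyword>' per line)."""
    names = {}
    for line in text.splitlines():
        if line.strip():
            index, name = line.split(" ", 1)
            names[int(index)] = name
    return names


def parse_maildir_filename(name: str, keyword_names=None) -> dict:
    """Split a maildir filename into base name, size fields, flags and keywords."""
    keyword_names = keyword_names or {}
    base, _, flag_part = name.partition(":2,")
    # Anything after a comma in the flags field is a non-standard field.
    flags, _, _extra = flag_part.partition(",")
    unique, *fields = base.split(",")
    sizes = {}
    for field in fields:
        key, sep, value = field.partition("=")
        if sep and key in ("S", "W") and value.isdigit():
            sizes[key] = int(value)
    standard = "".join(ch for ch in flags if ch.isupper())
    # Lowercase letters a..z are keyword indexes 0..25.
    keywords = [
        keyword_names.get(ord(ch) - ord("a"), ch) for ch in flags if "a" <= ch <= "z"
    ]
    return {"unique": unique, "sizes": sizes, "flags": standard, "keywords": keywords}


if __name__ == "__main__":
    names = parse_keywords_file("0 $Junk\n1 $NonJunk\n")
    print(parse_maildir_filename("1035478339.27041_118.foo.org,S=1000,W=1030:2,Sb", names))
```

Reading S= and W= from the name is what lets a program skip the stat() call and the size calculation mentioned above.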

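Maildir and filesystems
General comparisons of Maildir on different filesystems
• http://www.htiweb.inf.br/benchmark/fsbench.html
• http://www.thesmbexchange.com/eng/qmail_fs_benchmark.htm (including some graphs)

Linux ext2 / ext3
The main disadvantage is that searching can be slightly slower, and access to very large mailboxes (thousands of messages) can get slow with filesystems which don't have directory indexes.

Old versions of ext2 and ext3 on Linux don't support directory indexing (to speed up access), but newer versions of ext3 do, although you may have to manually enable it. You can check if the indexing is already enabled with tune2fs:

tune2fs -l /dev/hda3 | grep features

If you see dir_index, you're all set. If dir_index is missing, add it using:

umount /dev/hda3
tune2fs -O dir_index /dev/hda3
e2fsck -fD /dev/hda3
mount /dev/hda3

ReiserFS
ReiserFS was built to be fast with lots of small files, so it works well with maildir.

XFS
XFS performance seems to depend on a lot of factors, also on the system and the file system parameters.
• There are early reports on the dovecot mailing list which suggest that XFS seems quite a lot slower than ext3 or ReiserFS: http://www.dovecot.org/list/dovecot/2006-May/013216.html
• But then again others recommend XFS for the use with Maildir and dovecot: http://www.dovecot.org/list/dovecot/2007-January/018994.html
• This 2007 Linux.conf.au talk about "Choosing and Tuning Linux File Systems" (Slides as PDF) also recommends XFS for Maildir (alternatively ext3 with small blocks and high inode-to-file ratio)
• Someone else wrote here in the wiki: XFS on TSL 3.0.5 works almost twice as fast as our prior EXT3 installation of which is significant in size. ReiserFS is also a good option.
• Comparisons which suggest XFS as being best choice:
  ○ http://www.htiweb.inf.br/benchmark/fsbench.html
  ○ http://www.thesmbexchange.com/eng/qmail_fs_benchmark.htm

Various tips
• Mounting XFS with logbufs=8 option might increase the speed.
• Create the XFS with options -b size=1024 -d su=16k,sw=3 -l logdev=<some_other_device> (Source: http://www.htiweb.inf.br/benchmark/fsbench.html)
• Use mkfs.xfs -f -l size=32768b,version=2 and mount.xfs -o noatime,logbufs=8,logbsize=131072 (Source: http://www.thesmbexchange.com/eng/qmail_fs_benchmark.htm)

Directory Structure
Dovecot uses Maildir++ directory layout for organizing mailbox directories. This means that all the folders are directly inside ~/Maildir directory:
• ~/Maildir/new, ~/Maildir/cur and ~/Maildir/tmp directories contain the messages for INBOX. The tmp directory is used during delivery, new messages arrive in new and read shall be moved to cur by the clients.
• ~/Maildir/.folder/ is a mailbox folder
• ~/Maildir/.folder.subfolder/ is a subfolder of a folder (ie. "folder/subfolder")
• subscriptions file contains IMAP's mailbox subscriptions.

Most importantly this means that if your maildir folders exist in eg. ~/Maildir/folder and ~/Maildir/folder/subfolder, Dovecot won't see them unless you rename them to Maildir++ layout. (Note difference with Mbox.) v1.1 supports them by adding :LAYOUT=fs to mail_location.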
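The Maildir++ naming rule from the Directory Structure section can be expressed as a small Python function. It is a sketch assuming "/" is the hierarchy separator seen by the IMAP client, as in the "folder/subfolder" example; maildirpp_path is a made-up name.

```python
import posixpath


def maildirpp_path(root: str, mailbox: str, separator: str = "/") -> str:
    """Map an IMAP mailbox name to its Maildir++ directory under root."""
    if mailbox.upper() == "INBOX":
        # INBOX messages live directly in root/new, root/cur and root/tmp.
        return root
    # Every other folder is a dot-prefixed directory directly inside root,
    # with hierarchy levels joined by dots instead of nested directories.
    return posixpath.join(root, "." + ".".join(mailbox.split(separator)))


if __name__ == "__main__":
    print(maildirpp_path("/home/user/Maildir", "folder/subfolder"))
```

A directory such as Maildir/folder/subfolder does not match this rule, which is why such folders are not found unless they are renamed.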

Most importantly this means that if your maildir folders exist in eg. ~/Maildir/folder and ~/Maildir/folder/subfolder, Dovecot won't see them unless you rename them to Maildir++ layout. v1.1 supports them by adding :LAYOUT=fs to mail_location.

• subscriptions file contains IMAP's mailbox subscriptions. (Note difference with Mbox.)

Issues with the specification

Locking
Although maildir was designed to be lockless, Dovecot locks the maildir while doing modifications to it or while looking for new messages in it. This is required because otherwise Dovecot might temporarily see mails incorrectly deleted, which would cause trouble. Basically the problem is that if one process modifies the maildir (eg. a rename() to change a message's flag), another process in the middle of listing files at the same time could skip a file. The skipping happens because readdir() system call doesn't guarantee that all the files are returned if the directory is modified between the calls to it. This problem exists with all the commonly used filesystems.

Because Dovecot uses its own non-standard locking (dovecot-uidlist.lock dotlock file), other MUAs accessing the maildir don't support it. This means that if another MUA is updating messages' flags or expunging messages, Dovecot might temporarily lose some message. After the next sync when it finds it again, an error message may be written to log and the message will receive a new UID.

Delivering mails to new/ directory doesn't have any problems, so there's no need for LDAs to support any type of locking.

Mail delivery
Qmail's how a message is delivered page suggests to deliver the mail like this:
1. Create a unique filename (only "time.pid.host" here, later Maildir spec has been updated to allow more uniqueness identifiers)
2. Do stat(tmp/<filename>). If the stat() found a file, wait 2 seconds and go back to step 1.
3. Create and write the message to the tmp/<filename>.
4. link() it into new/ directory. Although not mentioned here, the link() could again fail if the mail existed in new/ dir. In that case you should probably go back to step 1.

All this trouble is rather pointless. Only the first step is what really guarantees that the mails won't get overwritten, the rest just sounds nice. Even though they might catch a problem once in a while, they give no guaranteed protection and will just as easily pass duplicate filenames through and overwrite existing mails.

Step 2 is pointless because there's a race condition between steps 2 and 3. PID/host combination by itself should already guarantee that it never finds such a file. If it does, something's broken and the stat() check won't help since another process might be doing the same thing at the same time, and you end up writing to the same file in tmp/, causing the mail to get corrupted.
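As a rough illustration of the delivery steps discussed here, the following Python sketch builds a unique name, writes the message into tmp/ and links it into new/. It leaves out the stat() check and the 2 second wait. The helper names unique_name() and deliver(), and the per-process counter added to the "time.pid.host" name, are assumptions made for this example; this is not code from Dovecot or qmail.

```python
import itertools
import os
import socket
import time

# Hypothetical per-process delivery counter, so that two deliveries made by
# the same process within the same second still get different names.
_counter = itertools.count()


def unique_name():
    """Build a name from time, pid, a per-process counter and the host name."""
    # "/" and ":" must not appear in a maildir base filename.
    host = socket.gethostname().replace("/", "\\057").replace(":", "\\072")
    return "%d.P%dQ%d.%s" % (time.time(), os.getpid(), next(_counter), host)


def deliver(maildir, data):
    """Write data to tmp/<name>, link() it into new/ and remove the tmp copy."""
    name = unique_name()
    tmp_path = os.path.join(maildir, "tmp", name)
    new_path = os.path.join(maildir, "new", name)
    # O_EXCL makes the create fail instead of silently reusing an existing file.
    fd = os.open(tmp_path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    with os.fdopen(fd, "wb") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())
    os.link(tmp_path, new_path)
    os.unlink(tmp_path)
    return name
```

The safety of this sketch rests on unique_name() alone. The O_EXCL create and the link() call only turn some collisions into visible errors, they don't guarantee anything.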

In step 4 the link() would fail if an identical file already existed in the maildir, right? Wrong. The file may already have been moved to cur/ directory, and since it may contain any number of flags by then you can't check with a simple stat() anymore if it exists or not.

Step 2 was pointed out to be useful if clock had moved backwards. However again this doesn't give any actual safety guarantees, because an identical base filename could already exist in cur/. Besides if the system was just rebooted, the file in tmp/ could probably be even overwritten safely (assuming it wasn't already link()ed to new/).

So really, all that's important in not getting mails overwritten in your maildir is the step 1: Always create filenames that are guaranteed to be unique. Forget about the 2 second waits and such that the Qmail's man page talks about.

Procmail Problems
Maildir format is somewhat compatible with MH format. This is sometimes a problem when people configure their procmail to deliver mails to Maildir/new. This makes procmail create the messages in MH format, which basically means that the file is called msg.inode_number. While this appears to work first, after expunging messages from the maildir the inodes are freed and will be reused later. This means that another file with the same name may come to the maildir, which makes Dovecot think that an expunged file reappeared into the mailbox and an error is logged.

The proper way to configure procmail to deliver to a Maildir is to use Maildir/ as the destination.

Mbox Mailbox Format

Contents
1. Mbox Mailbox Format
   1. Locking
      1. Dotlock
      2. Deadlocks
   2. Directory Structure
   3. Dovecot's Metadata
   4. Dovecot's Speed Optimizations
   5. From Escaping
   6. Mbox Variants
   7. References

Usually UNIX systems are configured by default to deliver mails to /var/mail/username or /var/spool/mail/username mboxes. In IMAP world these files are called INBOX mailboxes. IMAP protocol supports multiple mailboxes however, so there needs to be a place for them as well. Typically they're stored in ~/mail/ or ~/Mail/ directories.

The mbox file contains all the messages of a single mailbox. Because of this, the mbox format is typically thought of as a slow format. However with Dovecot's indexing this isn't true. Only expunging messages from the beginning of a large mbox file is slow with Dovecot, most other

operations should be fast. Also because all the mails are in a single file, searching is much faster than with maildir.

Modifications to mbox may require moving data around within the file, so interruptions (eg. power failures) can cause the mbox to break more or less badly. Although Dovecot tries to minimize the damage by moving the data in a way that data should never get lost (only duplicated), mboxes still aren't recommended to be used for important data.

Locking
Locking is a mess with mboxes. There are multiple different ways to lock a mbox, and software often uses incompatible locking. See MboxLocking for how to check what locking methods some commonly used programs use.

There are at least four different ways to lock a mbox:
• dotlock: mailboxname.lock file created by almost all software when writing to mboxes. This grants the writer an exclusive lock over the mbox, so it's usually not used while reading the mbox so that other processes can also read it at the same time. So while using a dotlock typically prevents actual mailbox corruption, it doesn't protect against read errors if mailbox is modified while a process is reading.
• flock: flock() system call is quite commonly used for both read and write locking. The read lock allows multiple processes to obtain a read lock for the mbox, so it works well for reading as well. The one downside to it is that it doesn't work if mailboxes are stored in NFS.
• fcntl: Very similar to flock, also commonly used by software. In some systems this fcntl() system call is compatible with flock(), but in other systems it's not, so you shouldn't rely on it. fcntl works with NFS if you're using lockd daemon in both NFS server and client.
• lockf: POSIX lockf() locking. Because it allows creating only exclusive locks, it's somewhat useless so Dovecot doesn't support it. With Linux lockf() is internally compatible with fcntl() locks, but again you shouldn't rely on this.

Dotlock
Another problem with dotlocks is that if the mailboxes exist in /var/mail/, the user may not have write access to the directory, so the dotlock file can't be created. There are a couple of ways to work around this:
• Give a mail group write access to the directory and then make sure that all software requiring access to the directory runs with the group's privileges. This may mean making the binary itself setgid-mail, or using a separate dotlock helper program which is setgid-mail. With Dovecot this can be done by setting mail_privileged_group = mail.
• Set sticky bit to the directory (chmod +t /var/mail). This makes it somewhat safe to use, because users can't delete each other's mailboxes, but they can still create new files (the dotlock files). The downside to this is that users can create whatever files they wish in there, such as a mbox for newly created user who hadn't yet received mail.
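To make two of the locking methods above more concrete, here is a minimal Python sketch that takes a dotlock and then an fcntl() write lock on an mbox, releasing them in reverse order. The function name locked_mbox() and its timeout parameter are made up for this example. The dotlock creation is simplified (a plain O_EXCL create, which is not how careful software does it on NFS), and none of this is Dovecot's own locking code. Python's fcntl.lockf() is used because it wraps fcntl()-style locks.

```python
import contextlib
import fcntl
import os
import time


@contextlib.contextmanager
def locked_mbox(path, timeout=5.0):
    """Take a dotlock first, then an fcntl write lock; release in reverse."""
    dotlock = path + ".lock"
    deadline = time.monotonic() + timeout
    while True:
        try:
            # O_EXCL: the create fails if someone else holds the dotlock.
            fd = os.open(dotlock, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o644)
            os.close(fd)
            break
        except FileExistsError:
            if time.monotonic() >= deadline:
                raise TimeoutError("dotlock busy: " + dotlock)
            time.sleep(0.1)
    try:
        with open(path, "r+b") as f:
            fcntl.lockf(f, fcntl.LOCK_EX)  # fcntl()-style exclusive lock
            try:
                yield f
            finally:
                fcntl.lockf(f, fcntl.LOCK_UN)
    finally:
        os.unlink(dotlock)
```

Every program sharing the mbox would have to take the two locks in the same order for this to be of any use.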

Deadlocks
If multiple lock methods are used, which is usually the case since dotlocks aren't typically used for read locking, the order in which the locking is done is important. Consider if two programs were running at the same time, both use dotlock and fcntl locking but in different order:
• Program A: fcntl locks the mbox
• Program B at the same time: dotlocks the mbox
• Program A continues: tries to dotlock the mbox, but since it's already dotlocked by B, it starts waiting
• Program B continues: tries to fcntl lock the mbox, but since it's already fcntl locked by A, it starts waiting
Now both of them are waiting for each other's locks. Finally after a couple of minutes they time out and fail the operation.

Directory Structure
When listing mailboxes, Dovecot simply assumes that all files it sees are mboxes and all directories mean that they contain sub-mailboxes. There are two special cases however which aren't listed:
• .subscriptions file contains IMAP's mailbox subscriptions.
• .imap/ directory contains Dovecot's index files.
Because it's not possible to have a file which is also a directory, it's not possible to create a mailbox and child mailboxes under it.

Dovecot's Metadata
Dovecot uses C-Client (ie. UW-IMAP, Pine) compatible headers in mbox messages to store metadata. These headers are:
• X-IMAPbase: Contains UIDVALIDITY, last used UID and list of used keywords
• X-IMAP: Same as X-IMAPbase but also specifies that the message is a "pseudo message"
• X-UID: Message's allocated UID
• Status: R (\Seen) and O (non-\Recent) flags
• X-Status: A (\Answered), F (\Flagged), T (\Draft) and D (\Deleted) flags
• X-Keywords: Message's keywords
• Content-Length: Length of the message body in bytes

Whenever any of these headers exist, Dovecot treats them as its own private metadata. It does sanity checks for them, so the headers may also be modified or removed completely. None of these headers are sent to IMAP/POP3 clients when they read the mail. Preferably your LDA should strip all these headers before writing the mail to the mbox.

Only the first message contains the X-IMAP or X-IMAPbase header. The difference is that when all the messages are deleted from mbox file, a "pseudo message" is written to the mbox which contains X-IMAP header. This is the "DON'T DELETE THIS MESSAGE -- FOLDER INTERNAL DATA" message which you hate seeing when using non-C-client and non-Dovecot

software. This is however important to prevent abuse, otherwise the first mail which is received could contain faked X-IMAPbase header which could cause trouble.

If message contains X-Keywords header, it contains a space-separated list of keywords for the mail. Since the same header can come from the mail's sender, only the keywords that are listed in the X-IMAP header are used.

The UID for a new message is calculated from "last used UID" in X-IMAP header + 1. This is done always, so fake X-UID headers don't really matter. This is also why the pseudo message is important. Otherwise the UIDs could easily grow over 2^31 which some clients start treating as negative numbers, which then cause all kinds of problems. Also when 2^32 is exceeded, Dovecot will also start having some problems.

Content-Length is used as long as another valid mail starts after that many bytes. Because the byte count must be exact, it's quite unlikely that abusing it can cause messages to be skipped (or rather appended to the previous message's body).

Status and X-Status headers are trusted completely, so it's pretty good idea to filter them in LDA if possible.

Dovecot's Speed Optimizations
Updating messages' flags and keywords can be a slow operation since you may have to insert a new header (Status, X-Status, X-Keywords) or at least insert data in the header's value. Some mbox MUAs do this simply by rewriting all of the mbox after the inserted data. If the mbox is large, this can be very slow. Dovecot optimizes this by always leaving some space characters after some of its internal headers. It can use this space to move only minimal amount of data necessary to get the necessary data inserted. Also if data is removed, it just grows these space areas.

mbox_lazy_writes setting works by adding and/or updating Dovecot's metadata headers only after closing the mailbox or when messages are expunged from the mailbox. C-Client works the same way. The upside of this is that it reduces writes because multiple flag updates to same message can be grouped, and sometimes the writes don't have to be done at all if the whole message is expunged. The downside is that other processes don't notice the changes immediately (but other Dovecot processes do notice because the changes are in index files).

mbox_dirty_syncs setting tries to avoid re-reading the mbox every time something changes. Whenever the mbox changes (ie. timestamp or size), it first checks if the mailbox's size changed. If it didn't, it most likely meant that only message flags were changed so it does a full mbox read to find it. If the mailbox shrunk, it means that mails were expunged and again Dovecot does a full sync. Usually however the only thing besides Dovecot that modifies the mbox is the LDA which appends new mails to the mbox. So if the mbox size was grown, Dovecot first checks if the last known message is still where it was last time. If it is, Dovecot reads only the newly added messages and goes into a "dirty mode". As long as Dovecot is in dirty mode, it can't be certain that mails are where it expects them to be, so whenever accessing some mail, it first verifies that it really is the correct mail by finding its X-UID header. If the X-UID header is different, it falls back to a full sync to find the mail's correct position. The dirty mode goes away after a full sync. If mbox_lazy_writes was enabled and the mail didn't yet have X-UID header, Dovecot uses MD5 sum of a couple of headers to compare the mails.
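The mbox_dirty_syncs decision described above can be summed up as a small decision function. The Python sketch below is only a simplified model of that description: SyncState, plan_sync() and the last_message_still_there callback are names invented for the example and don't correspond to anything in Dovecot's source.

```python
from dataclasses import dataclass


@dataclass
class SyncState:
    size: int         # mbox size seen at the last sync
    mtime: float      # mbox mtime seen at the last sync
    last_offset: int  # file offset of the last known message


def plan_sync(state, size, mtime, last_message_still_there):
    """Return "none", "full" or "partial" for the current mbox size/mtime."""
    if size == state.size and mtime == state.mtime:
        return "none"      # nothing changed since the last sync
    if size == state.size:
        return "full"      # most likely only message flags were changed
    if size < state.size:
        return "full"      # the mbox shrunk, so mails were expunged
    # The mbox grew: usually the LDA appended new mails.
    if last_message_still_there(state.last_offset):
        return "partial"   # read only the new mails and enter dirty mode
    return "full"
```

In this model "partial" stands for reading only the appended mails and going into dirty mode, after which each accessed mail would still be verified by its X-UID header.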

mbox_very_dirty_syncs does the same as mbox_dirty_syncs, but the dirty state is kept also when opening the mailbox. Normally opening the mailbox does a full sync if it had been changed outside Dovecot.

From Escaping
In mboxes a new mail always begins with a "From " line, commonly referred to as From_-line. To avoid confusion, lines beginning with "From " in message bodies are usually prefixed with '>' character while the message is being written to in mbox.

Dovecot doesn't currently do this escaping however. Instead it prevents this confusion by adding Content-Length headers so it knows later where the next message begins. Dovecot doesn't remove the '>' characters either before sending the data to clients. Both of these will probably be implemented later.

Mbox Variants
There are a few minor variants of this format:

mboxo is the name of original mbox format originated with Unix System V. Messages are stored in a single file, with each message beginning with a line containing "From SENDER DATE". If "From" occurs at the beginning of a line anywhere in the email, it is escaped with a greater-than sign (>From).

mboxrd was named for Rahul Dhesi in June 1995, though several people came up with the same idea around the same time. An issue with the mboxo format was that if the text ">From" appeared in the body of an email (such as from a reply quote), it was not possible to distinguish this from the mailbox format's ">From". mboxrd fixes this by always quoting ">From" lines as well, so readers can just remove the first ">" character. This format is used by qmail.

mboxcl format was originated with Unix System V Release 4 mail tools. It adds a Content-Length field which indicates the number of bytes in the message. This is used to determine message boundaries. It still quotes "From" as the original mboxo format does (and not as mboxrd does it).

mboxcl2 is like mboxcl but does away with the "From" quoting.

MMDF (Multi-channel Memorandum Distribution Facility mailbox format) was originated with the MMDF daemon. The format surrounds each message with lines containing four control-A's. This eliminates the need to escape From: lines.

Dovecot currently uses mboxcl2 format internally, but it's planned to move to combination of mboxrd and mboxcl.

LDA Indexing
Dovecot v1.0's deliver updates the main index file while message is being saved. This is useful with mbox format, especially if mbox_very_dirty_syncs=no. With Maildir the benefits of this are pretty small.

Dovecot v1.1+ deliver updates also cache file, which can be very useful with all mailbox formats. It means that when IMAP client wants to fetch the message's metadata (e.g. some header fields) they're already found from the cache file and Dovecot doesn't have to open and parse the message file.

There are some tradeoffs though:
• LDA indexing wastes disk I/O because it has to open and update index files

• LDA indexing saves disk I/O because it already has the message body in memory, so it doesn't need to read it from disk.
• IMAP indexing wastes disk I/O because it has to open and read message files
• IMAP indexing may save disk I/O because IMAP process always has index files opened, and many IMAP clients are configured to download all new message bodies anyway, so the second time message bodies are read they're already in memory

So it depends on IMAP client if it's faster to use LDA or IMAP time indexing. In any case the user experience is typically faster with LDA indexing, because the message list metadata can be returned faster when it's pre-indexed.

Non-indexed mail delivery
Ignoring the benefits of cache file updates, the only thing left is the main index updates. As mentioned above, with Maildir format these benefits are very small. This also means that it's perfectly fine to use a non-Dovecot MDA to deliver mails that doesn't update indexes. Dovecot can efficiently see and index such new mails without doing anything expensive like "rebuilding indexes".

See IndexFiles for more information about what the index files contain.

Dovecot's index files
The basic idea behind Dovecot's index files is that it makes reading the mailboxes a lot faster. The index files consist of the following files:
• dovecot.index: Main index file
• dovecot.index.cache: Cached mailbox data
• dovecot.index.log: Transaction log file
• dovecot.index.log.2: .log file is rotated to .log.2 file when it grows too large.

See Design/Indexes for more technical information how the index files are handled.

Each mailbox has its own separate index files. If the index files are disabled, the same structures are still kept in the memory, except cache file is disabled completely (because the client probably won't fetch the same data twice within a connection).

If index files are missing, Dovecot creates them automatically when the mailbox is opened. If at any point creating a file or growing a file gives "not enough disk space" error, the indexes are transparently moved to memory for the rest of the session.

Main index
The main index contains the following information for each message:
• IMAP UID
• Current flags and keywords
• Pointer to cache file
• mbox-only: mbox file offset
• mbox-only: MD5 sum of some of the message headers, intended to help find the message when its X-UID: header hasn't yet been written

• Other extensions in Dovecot v1.1+, such as mailbox sorting data

This is the same information that most other IMAP servers keep in memory while the mailbox is open, but Dovecot has the advantage of keeping the information permanently stored so it's easy to get it when opening the mailbox.

The index file's header also contains some summary information, such as how many messages exist, how many of them are unseen and how many are marked with \Deleted flag. Opening mailboxes and answering to STATUS IMAP commands can be usually done simply by getting the required information from the index file's header. This is why these operations are extremely fast with Dovecot compared to other servers that don't use an equivalent index file.

Mailbox synchronization
The main index's header also contains mailbox syncing state:
• Maildir: cur/ and new/ directories' timestamps
• mbox: mbox file's mtime and size
The index file is synchronized against mailbox only if the syncing information changes.

Cache file
Cache file may contain the following information for messages:
• Message headers (some, not all)
• Sent date (parsed Date: header)
• Received date (IMAP's INTERNALDATE field)
• Physical and virtual message sizes
• Message's parsed MIME structure, allowing to quickly read only a specific MIME part (IMAP's FETCH BODY[1.2.3] command)
• IMAP's BODY and BODYSTRUCTURE fields
  ○ If both are used, only BODYSTRUCTURE is saved, since BODY can be generated from it
• IMAP's ENVELOPE isn't cached currently. Instead the headers used to build it are cached directly.

IMAP clients can work in many different ways. There are basically 2 types:
1. Online clients that ask for the same information multiple times (eg. webmails, Pine)
2. Offline clients that usually download first some of the interesting message headers and only after that the message bodies (possibly automatically, or possibly only when the user opens the mail). Most IMAP clients behave like this.

Cache file is extremely helpful with the type 1 clients. The first time that client requests message headers or some other metadata they're stored into the cache file. The second time they ask for the same information Dovecot can now get it quickly from the cache file instead of opening the message and parsing the headers.

For type 2 clients the cache file is helpful if they use multiple clients or if the data was cached while the message was being saved (Dovecot v1.1+ can do this). Some of the information is

helpful in any case, for example it's required to know the message's virtual size when downloading the message. Without the virtual size being in cache Dovecot first has to read the whole message to calculate it.

Only the mailbox metadata that client(s) have asked for earlier are stored into cache file. This allows Dovecot to be adaptive to different clients' needs and still not waste disk space (and cause extra disk I/O!) for fields that client never needs.

Dovecot can cache fields either permanently or temporarily. Temporarily cached fields are dropped from the cache file after about a week. Dovecot uses two rules to determine when data should be cached permanently instead of temporarily:
1. Client accessed messages in non-sequential order within this session. This most likely means it doesn't have a local cache.
2. Client accessed a message older than one week.
Design/Indexes/Cache explains the reasons for these rules.

Transaction log
All changes to the main index go through transaction log first. This has two advantages when the mailbox is accessed using multiple simultaneous connections:
1. It allows getting a list of changes quickly so that IMAP clients can be notified of the changes. An alternative would be to do a comparison of two index mappings, which is what most other IMAP servers do.
2. Instead of re-reading the whole main index file after each change it's necessary to only read a few bytes from the transaction log.

In Dovecot v1.1+ the transaction log plays an even more important role. The main index file is updated only "once in a while" to reduce disk writes, so it is common to first read the main index and then apply new changes from the transaction log on top of that. With empty mailboxes (eg. download+delete POP3 users) it would even be possible to delete the whole main index and keep only the transaction log (although this isn't done currently).

mmap_disable=yes implementation relies on the transaction log.

Cache file
Cache file is used for storing immutable data. It supports several different kinds of fields:
MAIL_CACHE_FIELD_FIXED_SIZE: The field size doesn't need to be stored in the cache file. It's always the same.
MAIL_CACHE_FIELD_VARIABLE_SIZE: Variable sized binary data.
MAIL_CACHE_FIELD_STRING: Variable sized string.
MAIL_CACHE_FIELD_BITMASK: A fixed size bitmask field. It's possible to add new bits by updating this field. All the added fields are ORed together.
MAIL_CACHE_FIELD_HEADER: Variable sized message header. The data begins with a 0-terminated uint32_t line_numbers[]. The line number exists only for each header, header continuation lines in multiline headers don't get listed. After the line numbers comes the

list of headers, including the "header-name: " prefix for each line, LFs and the TABs or spaces for continued lines.

The last 3 variable sized fields are treated identically by the cache file code. Their main purpose is to make it easier for "dump cache file's contents" programs (src/util/idxview) to do their job.

Locking
Because cache file is typically used in potentially long-running operations, such as with IMAP command FETCH 1:* (BODY.PEEK[] ENVELOPE BODYSTRUCTURE) it's important that updating the cache file doesn't block out any other readers. Also because the readers are often also writers (if something isn't cached, it's added there), it's important that they don't block writers either.

Reading cache files requires no locking. Writing is done by first locking the file, reserving some space to write to, and immediately after that unlocking the file. This way the transaction can keep writing to the cache file as long as it wants to without blocking other writers. When the transaction is committed, the updated cache offsets are written to the transaction log which makes them visible to other processes.

This also means that it's possible for two processes to write the same cached fields twice to the cache file. Because the data written to the cache file are really just cached data, the fields' contents are identical. Having the data exist twice (or even more times) means wasting some disk space, but otherwise it isn't a problem. The duplicates are dropped the next time the file is compressed.

Cache decisions
Dovecot tries to be smart about what it keeps in the cache file. If the client never fetches the cached data, it's just waste of disk space and disk I/O. The caching decisions are:
MAIL_CACHE_DECISION_NO: This field isn't cached currently.
MAIL_CACHE_DECISION_TEMP: This field is cached for new mails.
MAIL_CACHE_DECISION_YES: This field is cached for all mails.

Normally Dovecot changes the decisions based on what fields are fetched and for what messages. A specific decision can be forced by ORing it with MAIL_CACHE_DECISION_FORCED.

mail-cache-decisions.c file contains the rules how Dovecot changes the decisions. The following is copied from the file:

Users can be divided to three groups:
1. Most users will use only a single IMAP client which caches everything locally. For these users it's quite pointless to do any kind of caching as it only wastes disk space. That might also mean more disk I/O.
2. Some users use multiple IMAP clients which cache everything locally. These could benefit from caching until all clients have fetched the data. After that it's useless.

3. Some clients don't do permanent local caching at all. For example Pine and webmails. These clients would benefit from caching everything. Some locally caching clients might also access some data from server again, such as when searching messages. They could benefit from caching only these fields.

After thinking about these a while, I figured out that people who care about performance most will be using Dovecot optimized LDA anyway which updates the indexes/cache immediately. In that case even the first user group would benefit from caching the same way as second group. LDA reads the mail anyway, so it might as well extract some information about it and store them into cache.

So, group 1. and 2. could be optimally implemented by keeping things cached only for a while. I thought a week would be good. When cache file is compressed, everything older than week will be dropped.

But how to figure out if user is in group 3? One quite easy rule would be to see if client is accessing messages older than a week. But with only that rule we might have already dropped useful cached data. It's not very nice if we have to read and cache it twice.

Most locally caching clients always fetch new messages (all but body) when they see them. They fetch them in ascending order. Noncaching clients might fetch messages in pretty much any order, as they usually don't fetch everything they can, only what's visible in screen. Some will use server side sorting/threading which also makes messages to be fetched in random order. Second rule would then be that if a session doesn't fetch messages in ascending order, the fetched field type will be permanently cached.

So, we have three caching decisions:
1. Don't cache: Clients have never wanted the field
2. Cache temporarily: Clients want this only once
3. Cache permanently: Clients want this more than once

Different mailboxes have different decisions. Different fields have different decisions.

There are some problems, such as if a client accesses message older than a week, we can't know if user just started using a new client which is just filling its local cache for the first time. Or it might be a client user hasn't just used for over a week. In these cases we shouldn't have marked the field to be permanently cached. User might also switch clients from non-caching to caching.

So we should re-evaluate our caching decisions from time to time. This is done by checking the above rules constantly and marking when was the last time the decision was right. If decision hasn't matched for two months, it's changed. I picked two months because people go to at least one month vacations where they might still be reading mails, but with different clients.

Dovecot's index files

Dovecot's index files consist of three different files:
• Main index file (dovecot.index)
• Transaction log (dovecot.index.log and dovecot.index.log.2)
• Cache file (dovecot.index.cache)
See IndexFiles for more generic information about what they contain and why.

The index files can be accessed using mail-index.h API.

Locking
The index files are designed so that readers cannot block a writer, and write locks are always short enough not to cause other processes to wait too long. Dovecot v0.99's index files didn't do this, and it was common to get lock timeouts when using multiple connections to the same large mailbox.

The main index file is the only file which has read locks. They can however block the writer only for two seconds (and even this could be changed to not block at all). The writes are locked only for the duration of the mailbox synchronization.

Transaction logs don't require read locks. The writing is locked for the duration of the mailbox synchronization, and also for single transaction appends.

Cache files don't require read locks. They're locked for writing only for the duration of allocating space inside the file. The actual writing inside the allocated space is done without any locks being held.

In future these could be improved even further. For example there's no need to keep any index files locked while synchronizing, as long as the mailbox backend takes care of the locking issues. Also writing to transaction log could work in a similar way to cache files: Lock, allocate space, unlock, write.

Lockless integers
Dovecot uses several different techniques to allow reading files without locking them. One of them uses fields in a "lockless integer" format. Initially these fields have "unset" value. They can be set to a wanted value in range 0..2^28-1 (with 32bit fields) once, but they cannot be changed. It would be possible to set them back to "unset", but setting them the second time isn't safe anymore, so Dovecot never does this.

The lockless integers work by allocating one bit from each byte of the value to "this value is set" flag. The reader then verifies that the flag is set for all bytes of the value. If all of them aren't set, the value is still "unset". Dovecot uses the highest bit for this flag. So for example:
• 0x00000000: The value is unset
• 0xFFFF7FFF: The value is unset, because one of the bytes didn't have the highest bit set
• 0xFFFFFFFF: The value is 2^28-1
• 0x80808080: The value is 0
• 0x80808180: The value is 0x80
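The examples above can be reproduced with a short Python sketch. It treats the field as a plain 32-bit number, keeps 7 value bits in each byte and uses the highest bit of each byte as the "set" flag. The names to_lockless() and from_lockless() are made up here, the on-disk byte order is ignored, and the "unset" case is returned as None rather than 0. This illustrates the encoding as described, it is not Dovecot's own conversion code.

```python
def to_lockless(value):
    """Encode a 28-bit value into 4 bytes, each with its highest bit set."""
    if not 0 <= value < (1 << 28):
        raise ValueError("value does not fit in 28 bits")
    raw = 0
    for i in range(4):
        septet = (value >> (7 * i)) & 0x7F   # next 7 value bits
        raw |= (0x80 | septet) << (8 * i)    # add the "set" flag bit
    return raw


def from_lockless(raw):
    """Decode a field; return None if any byte lacks the "set" flag."""
    value = 0
    for i in range(4):
        byte = (raw >> (8 * i)) & 0xFF
        if not byte & 0x80:
            return None                      # still (partially) unset
        value |= (byte & 0x7F) << (7 * i)
    return value
```

A reader that sees a half-written field gets "unset" instead of a wrong number, because at least one of the bytes is still missing its flag bit.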

Dovecot contains mail_index_uint32_to_offset() and mail_index_offset_to_uint32() functions to translate values between integers and lockless integers. The "unset" value is returned as 0, so it's not possible to differentiate between "unset" and "set" 0 values.

Main index
The main index can be used to quickly look up messages' UIDs, flags, keywords and extension-specific data, such as cache file or mbox file offsets.

Reading, writing and locking
Reading dovecot.index file requires locking, unfortunately. Shared read locking is done using the standard index locking method specified in lock_method setting (lock_method parameter for mail_index_open()).

Writing to index files requires transaction log to be exclusively locked first. This way the index locking only has to worry about existing read locks. The locking works by first trying to lock the index with the standard locking method, but if it couldn't acquire the lock in two seconds, it'll fall back to copying the index file to a temporary file, and when unlocking it'll rename() the temporary file over the dovecot.index file. Note that this is safe only because of the exclusive transaction log lock. This way the writers are never blocked by readers who are allowed to keep the shared lock as long as they want.

The copy-locking is used always when doing anything that could corrupt the index file if it crashed in the middle of an operation. For example if the header or record size changes, or if messages are expunged. New messages can be appended however, because the message count in the header is updated last. Expunging the last messages would probably be safe also (because only the header needs updating), but it's not done currently.

The index file should never be directly modified. Everything should go through the transaction log, and the only time the index needs to be write-locked is when transactions are written to it.

Currently the index file is updated whenever the backend mailbox is synchronized. This isn't necessary, because an old index file can be updated using the transaction log. In future there could be some smarter decisions about when writing to the index isn't worth the extra disk writes.

Header
Fields that won't change without recreating the index:
major_version: If this doesn't match MAIL_INDEX_MAJOR_VERSION, don't try to read the index. Dovecot recreates the index file then.
minor_version: If this doesn't match MAIL_INDEX_MINOR_VERSION there are some backwards compatible changes in the index file (typically header fields). Try to preserve the headers and the minor version when updating the index file.
base_header_size: Extension headers begin after the base headers. This is normally the same as sizeof(struct mail_index_header).

header_size: Records begin after base and extension headers.
record_size: Size of each record and its extensions. Initially the same as sizeof(struct mail_index_record).
compat_flags: Currently there is just one compatibility flag: MAIL_INDEX_COMPAT_LITTLE_ENDIAN. Dovecot doesn't bother trying to read files of a different endianness, they're simply recreated.
indexid: Unique index file ID. This is used to make sure that the main index, transaction log and cache file are all part of the same index.

Header flags:
MAIL_INDEX_HDR_FLAG_CORRUPTED: Set whenever the index file is found to be corrupted. If the reader notices this flag, it shouldn't try to continue using the index.
MAIL_INDEX_HDR_FLAG_HAVE_DIRTY: This index has records with MAIL_INDEX_MAIL_FLAG_DIRTY flag set.
MAIL_INDEX_HDR_FLAG_FSCK: Call mail_index_fsck() as soon as possible. This flag isn't actually set anywhere currently.

Message UIDs and counters:
uid_validity: IMAP UIDVALIDITY field. Initially can be 0, but after it's set we don't currently try to even handle the case of UIDVALIDITY changing. If it does change, the index file is marked corrupted and recreated. That's a bit ugly, but typically the UIDVALIDITY never changes.
next_uid: UID given to the next appended message. Only increases.
messages_count: Number of records in the index file.
recent_messages_count: Number of records with MAIL_RECENT flag set.
seen_messages_count: Number of records with MAIL_SEEN flag set.
deleted_messages_count: Number of records with MAIL_DELETED flag set.
first_recent_uid_lowwater: There are no UIDs lower than this with MAIL_RECENT flag set.
first_unseen_uid_lowwater: There are no UIDs lower than this without MAIL_SEEN flag set.

first_deleted_uid_lowwater: There are no UIDs lower than this with MAIL_DELETED flag set.

The lowwater fields are used to optimize searching messages with/without a specific flag.

Fields related to syncing:
log_file_seq: Log file the log_*_offset fields point to.
log_file_int_offset, log_file_ext_offset: All the internal/external transactions before this offset in the log file are synced to the index. External transactions are synced more often than internal, so log_file_int_offset <= log_file_ext_offset.
sync_size, sync_stamp: Used by the mailbox backends to store their synchronization information. Some day these should be removed and replaced with extension headers.

Then there are day fields:
day_stamp: UNIX timestamp of the beginning of the day when new records were last added to the index file.
day_first_uid[8]: These fields are updated when day_stamp < today. The [0..6] are first moved to [1..7], then [0] is set to the first appended UID. So they contain the first UID of the day for the last 8 days when messages were appended.

The day_first_uid[] fields are used by cache file compression to decide when to drop MAIL_CACHE_DECISION_TEMP data.

Extension headers
After the base header comes a list of extensions and their headers. The first extension begins from mail_index_header.base_header_size offset. The second begins after the first one's data[] and so on. The extensions always begin 64bit aligned however, so you may need to skip a few bytes before each one. Read the extensions as long as the offset is smaller than mail_index_header.header_size.

    struct mail_index_ext_header {
        uint32_t hdr_size; /* size of data[] */
        uint32_t reset_id;
        uint16_t record_offset;
        uint16_t record_size;
        uint16_t record_align;
        uint16_t name_size;
        /* unsigned char name[name_size] */
        /* unsigned char data[hdr_size] (starting 64bit aligned) */
    };

reset_id, record offset, size and alignment are explained in Design/Indexes/TransactionLog's struct mail_transaction_ext_intro.

Records

There are hdr.messages_count records in the file. Each record contains at least two fields: Record UID and flags. The UID is always increasing for the records, so it's possible to find a record by its UID with binary search. The record size is specified by mail_index_header.record_size.

The flags are a combination of enum mail_flags and enum mail_index_mail_flags. There exists only one index flag currently: MAIL_INDEX_MAIL_FLAG_DIRTY. If a record has this flag set, it means that the mailbox syncing code should ignore the flag in the mailbox and use the flag in the index file instead. This is used for example with mbox and mbox_lazy_writes=yes. It also allows having modifiable flags for read-only mailboxes.

The rest of the data is stored in record extensions.

Keywords
The keywords are stored in record extensions, but for better performance and lower disk space usage in transaction logs, they are quite tightly integrated to the index file code.

The list of keywords is stored in "keywords" extension header:

    struct mail_index_keyword_header {
        uint32_t keywords_count;
        /* struct mail_index_keyword_header_rec[] */
        /* char name[][] */
    };
    struct mail_index_keyword_header_rec {
        uint32_t unused; /* for backwards compatibility */
        uint32_t name_offset; /* relative to beginning of name[] */
    };

So there exist keywords_count keywords, each listed in a NUL-terminated string beginning from name_offset.

Since crashing in the middle of updating the keywords list pretty much breaks the keywords, adding new keywords causes the index file to be always copied to a temporary file and be replaced.

The keywords in the records are stored in a "keywords" extension bitfield. So the nth bit in the bitfield points to the nth keyword listed in the header.

It's not currently possible to safely remove existing keywords.

The unused field originally contained a count field, but while writing this documentation I noticed it's not actually used anywhere. Apparently it was added there accidentally. It'll be removed in later versions.

Extensions
The extensions only specify their wanted size and alignment, the index file syncing code is free to assign any offset inside the record to them. The extensions may be reordered at any time.

Dovecot's current extension ordering code works pretty well, but it's not perfect. If the extension size isn't the same as its alignment, it may create larger records than necessary. This will be fixed later.

The record size is always divisible by the maximum alignment requirement. This isn't strictly necessary either, so it could be fixed later as well.
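Because record UIDs only increase and every record has the same size, a record can be located with a plain binary search, and a keyword can be tested by looking at one bit. The sketch below illustrates both ideas and is not Dovecot's implementation. The function names are hypothetical, and it assumes that the UID is the first 32-bit field of a record in host byte order and that bit n of the keywords bitfield is bit (n % 8) of byte (n / 8).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Read the UID at the start of record number idx. memcpy() avoids
   unaligned reads; host byte order is assumed. */
static uint32_t record_uid(const unsigned char *records,
			   uint32_t record_size, uint32_t idx)
{
	uint32_t uid;

	memcpy(&uid, records + (size_t)idx * record_size, sizeof(uid));
	return uid;
}

/* Binary search over messages_count records of record_size bytes each.
   Returns a pointer to the matching record, or NULL if the UID is absent. */
static const unsigned char *
record_find(const unsigned char *records, uint32_t messages_count,
	    uint32_t record_size, uint32_t uid)
{
	uint32_t left = 0, right = messages_count;

	assert(record_size >= sizeof(uint32_t));
	while (left < right) {
		uint32_t mid = left + (right - left) / 2;
		uint32_t mid_uid = record_uid(records, record_size, mid);

		if (mid_uid < uid)
			left = mid + 1;
		else if (mid_uid > uid)
			right = mid;
		else
			return records + (size_t)mid * record_size;
	}
	return NULL;
}

/* Test whether the nth keyword listed in the header is set in a record's
   keywords bitfield. The bit order within a byte is an assumption. */
static int keyword_is_set(const unsigned char *bitfield,
			  size_t bitfield_size, unsigned int n)
{
	if (n / 8 >= bitfield_size)
		return 0;
	return (bitfield[n / 8] >> (n % 8)) & 1;
}
```

The stride comes from the caller rather than from sizeof(), which mirrors the rule that the record size must be taken from mail_index_header.record_size.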

Transaction log
The transaction log is a bit similar to transaction logs in databases. All the updates to the main index files are first written to the transaction log, and only after that the main index file is updated. There are several advantages to this:
• It provides atomic transactions: The transaction either succeeds, or it doesn't. For example if a transaction sets a flag to one message and removes it from another, it's guaranteed that both changes happen.
○ When updating the changes to the main index file, the last thing that's done is to update the "transaction log position" in the header. So if Dovecot crashes after having updated only the first flag, the next time the mailbox is opened both of the changes are done all over again.
• It allows another process to quickly see what changes have been made. For example IMAP needs to get a list of external changes after each command. Instead of re-reading the whole index file after each external change, Dovecot can simply read the new changes from the transaction log and apply them to the in-memory copy of the main index.
○ This is also important when storing the index files in NFS or in a clustered filesystem. In-memory caching of the dovecot.index.cache file also relies on the transaction log telling what parts of the file have changed.
• In future the transaction logs can be somewhat easily used to implement replication.

Internal vs. external
Transactions are either internal or external. The difference is that external transactions describe changes that were already made to the mailbox, while internal transactions are commands to do something to the mailbox. When beginning to synchronize a mailbox with index files, the index file is first updated with all the external changes, and the uncommitted internal transactions are applied on top of them.

When synchronizing the mailbox, using the synchronization transaction writes only external transactions. Also if the index file is updated when saving new mails to the mailbox, the append transactions must be external. This is because the changes are already in the mailbox at the time the transaction is read.

Reading and writing
Reading transaction logs doesn't require any locking at all. Writing is exclusively locked using the index files' default lock method (as specified by the lock_method setting).

A new log is created by first creating a dovecot.index.log.newlock dotlock file. Once you have the dotlock, check again that the dovecot.index.log wasn't created (or recreated) by another process. If not, go ahead and write the log header to the dotlock file and finally rename() it to dovecot.index.log.

Currently there are no actual transaction boundaries in the log file. All the changes in a transaction are simply written as separate records to the file. Each record begins with a

struct mail_transaction_header, which contains the record's size and type. The size is in lockless integer format. The first transaction record is written with the size field being 0. Once the whole transaction has been written, the 0 is updated with the actual size. This way the transaction log readers won't see partial transactions because they stop at the size=0 if the transaction isn't fully written yet.

Note that because there are no transaction boundaries, there's a small race condition here with mmap()ed log files:
1. Process A: write() half of the transaction
2. Process B: mmap() the file.
3. Process A: write() the rest of the transaction, updating the size=0 also
4. Process B: parse the log file. It'll go past the original size=0 because the size had changed in the mmap, but it stops in the middle of the transaction because the mmap size doesn't contain the whole transaction

This probably isn't a big problem, because I've never seen this happen even with stress tests. Should be fixed at some point anyway.

Header
The transaction log's header never changes, except the indexid field may be overwritten with 0 if the log is found to be corrupted. The fields are:
major_version: If this doesn't match MAIL_TRANSACTION_LOG_MAJOR_VERSION, don't try to parse it. If Dovecot sees this, it'll recreate the log file.
minor_version: If this doesn't match MAIL_TRANSACTION_LOG_MINOR_VERSION, the log file contains some backwards compatible changes. Currently you can just ignore this field.
hdr_size: Size of the log file's header. Use this instead of sizeof(struct mail_transaction_log_header), so that it's possible to add new fields and still be backwards compatible.
indexid: This field must match to main index file's indexid field.
file_seq: The file's creation sequence. Must be increasing.
prev_file_seq, prev_file_offset: Contains the sequence and offset of where the last transaction log ended. When transaction log is rotated and the reader's "sync position" still points to the previous log file, these fields allow it to easily check if there had been any more changes in the previous file.
create_stamp: UNIX timestamp when the file was created. Used in determining when to rotate the log file.

Record header
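As a rough illustration of the role the record header plays, the sketch below walks a log buffer record by record and stops at the first record whose size is still 0, which is how a reader avoids partially written transactions. It is a simplified stand-in, not Dovecot's parser. `struct txn_header` and `txn_count_changes()` are hypothetical, and the size is kept as a plain integer counting only the payload bytes rather than being stored in lockless integer format.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Simplified stand-in for struct mail_transaction_header. */
struct txn_header {
	uint32_t type;
	uint32_t size; /* payload bytes; 0 = transaction not fully written */
};

/* Count the changes of the given type in all fully written records.
   change_size is the size of one type-specific change structure. */
static size_t txn_count_changes(const unsigned char *log, size_t log_size,
				uint32_t type, size_t change_size)
{
	size_t offset = 0, changes = 0;
	struct txn_header hdr;

	assert(change_size > 0);
	while (log_size - offset >= sizeof(hdr)) {
		memcpy(&hdr, log + offset, sizeof(hdr));
		if (hdr.size == 0)
			break; /* the writer hasn't finished this one yet */
		if (hdr.size > log_size - offset - sizeof(hdr))
			break; /* record extends past the data we can see */
		if (hdr.type == type)
			changes += hdr.size / change_size;
		offset += sizeof(hdr) + hdr.size;
	}
	return changes;
}
```

The second break corresponds to the mmap() race described earlier: the size is already visible, but the mapped data does not yet cover the whole record.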

The transaction record header (struct mail_transaction_header) contains size and type fields. The size field is in lockless integer format. A single transaction record may contain multiple changes of the same type, although some types don't allow this. Because the size of the transaction record for each type is known (or can be determined from the type-specific record contents), the size field can be used to figure out how many changes need to be done. So for example a record can contain:
• struct mail_transaction_header { type = MAIL_TRANSACTION_APPEND, size = sizeof(struct mail_index_record) * 2 }
• struct mail_index_record { uid = 1, flags = 0 }
• struct mail_index_record { uid = 2, flags = 0 }

UIDs
Many record types contain uint32_t uid1, uid2 fields. This means that the changes apply to all the messages in the uid1..uid2 range. The messages don't really have to exist in the range, so for example if the first messages in the mailbox had UIDs 1, 100 and 1000, it would be possible to use uid1=1, uid2=1000 to describe changes made to these 3 messages. This also means that it's safe to write transactions describing changes to messages that were just expunged by another process (and already written to the log file before our changes).

Appends
As described above, the appends must be in external transactions. The append transaction's contents is simply the struct mail_index_record, so it contains only the message's UID and flags. The message contents aren't written to the transaction log. Also if the message had any keywords when it was appended, they're in a separate transaction record.

Expunges
Because expunges actually destroy messages, they deserve some extra protection to make it less likely to accidentally expunge wrong messages in case of for example file corruption. The expunge transactions must have MAIL_TRANSACTION_EXPUNGE_PROT ORed to the transaction type field. If an expunge type is found without it, assume a corrupted transaction log.

Flag changes
The flag changes are described in:

    struct mail_transaction_flag_update {
        uint32_t uid1, uid2;
        uint8_t add_flags;
        uint8_t remove_flags;
        uint16_t padding;
    };

A single flag update structure can add new flags or remove existing flags. Replacing all the flags works by setting remove_flags = 0xFF and the add_flags containing the new flags.

The padding is ignored completely.

Keyword changes
Specific keywords can be added or removed one keyword at a time:

    struct mail_transaction_keyword_update {
        uint8_t modify_type; /* enum modify_type : MODIFY_ADD / MODIFY_REMOVE */
        uint8_t padding;
        uint16_t name_size;
        /* unsigned char name[];
           array of { uint32_t uid1, uid2; }
        */
    };

There is padding after name[] so that uid1 begins from a 32bit aligned offset.

If you want to replace all the keywords (e.g. IMAP's STORE 1:* FLAGS (keyword) command), you'll first have to remove all of them with MAIL_TRANSACTION_KEYWORD_RESET and then add the new keywords.

Extensions
Extension records allow creating and updating extension-specific header and message record data. For example messages' offsets to cache file or mbox file are stored in extensions.

Whenever using an extension, you'll need to first write a MAIL_TRANSACTION_EXT_INTRO record. This is a bit kludgy and hopefully will be replaced by something better in future. The intro contains:

    struct mail_transaction_ext_intro {
        /* old extension: set ext_id. don't set name.
           new extension: ext_id = (uint32_t)-1. give name. */
        uint32_t ext_id;
        uint32_t reset_id;
        uint32_t hdr_size;
        uint16_t record_size;
        uint16_t record_align;
        uint16_t unused_padding;
        uint16_t name_size;
        /* unsigned char name[]; */
    };

If the extension already exists in the index file (it can't be removed), you can use the ext_id field directly. Otherwise you'll need to give a name to the extension. It's always possible to just give the name if you don't know the existing extension ID, but this uses more space of course.

reset_id contains kind of a "transaction validity" field. It's updated with a MAIL_TRANSACTION_EXT_RESET record, which also causes the extension records' contents to be zeroed. If an introduction's reset_id doesn't match the last EXT_RESET, it means that the extension changes are stale and they must be ignored. For example:
• dovecot.index.cache file's file_seq header is used as a reset_id. Initially it's 1.
• Process A: Begins a cache transaction, updating some fields in it.
• Process B: Decides to compress the cache file, and issues a reset_id = 2 change.
• Process A: Commits the transaction with reset_id = 1, but the cache file offsets point to the old file, so the changes must be ignored.

hdr_size specifies the number of bytes the extension wants to have in the index file's header. record_size specifies the number of bytes it wants to use for each record. The sizes may grow

or shrink any time. record_align contains the required alignment for the field. For example if the extension contains a 32bit integer, you want it to be 32bit aligned so that the process won't crash in CPUs which require proper alignment. Then again if you want to access the field as 4 bytes, the alignment can be 1.

Extension record updates typically are message-specific, so the changes must be done for each message separately:

    struct mail_transaction_ext_rec_update {
        uint32_t uid;
        /* unsigned char data[]; */
    };

Cache file
Cache file is used for storing immutable data. It supports several different kinds of fields:
MAIL_CACHE_FIELD_FIXED_SIZE: The field size doesn't need to be stored in the cache file. It's always the same.
MAIL_CACHE_FIELD_VARIABLE_SIZE: Variable sized binary data.
MAIL_CACHE_FIELD_STRING: Variable sized string.
MAIL_CACHE_FIELD_BITMASK: A fixed size bitmask field. It's possible to add new bits by updating this field. All the added fields are ORed together.
MAIL_CACHE_FIELD_HEADER: Variable sized message header. The data begins with a 0-terminated uint32_t line_numbers[]. The line number exists only for each header, header continuation lines in multiline headers don't get listed. After the line numbers comes the list of headers, including the "header-name: " prefix for each line, LFs and the TABs or spaces for continued lines.

The last 3 variable sized fields are treated identically by the cache file code. Their main purpose is to make it easier for "dump cache file's contents" programs (src/util/idxview) to do their job.

Locking
Because the cache file is typically used in potentially long-running operations, such as with IMAP command FETCH 1:* (BODY.PEEK[] ENVELOPE BODYSTRUCTURE), it's important that updating the cache file doesn't block out any other readers. Also because the readers are often also writers (if something isn't cached, it's added there), it's important that they don't block writers either.

Reading cache files requires no locking. Writing is done by first locking the file, reserving some space to write to, and immediately after that unlocking the file. This way the transaction can keep writing to the cache file as long as it wants to without blocking other writers. When the transaction is committed, the updated cache offsets are written to the transaction log which makes them visible to other processes.

This also means that it's possible for two processes to write the same cached fields twice to the cache file. Because the data written to the cache file are really just cached data, the fields' contents are identical. Having the data exist twice (or even more times) means wasting some disk

space, but otherwise it isn't a problem. The duplicates are dropped the next time the file is compressed.

Cache decisions
Dovecot tries to be smart about what it keeps in the cache file. If the client never fetches the cached data, it's just waste of disk space and disk I/O.

The caching decisions are:
MAIL_CACHE_DECISION_NO: This field isn't cached currently.
MAIL_CACHE_DECISION_TEMP: This field is cached for new mails.
MAIL_CACHE_DECISION_YES: This field is cached for all mails.

Normally Dovecot changes the decisions based on what fields are fetched and for what messages. A specific decision can be forced by ORing it with MAIL_CACHE_DECISION_FORCED.

mail-cache-decisions.c file contains the rules how Dovecot changes the decisions. The following is copied from the file:

Users can be divided to three groups:
1. Most users will use only a single IMAP client which caches everything locally. For these users it's quite pointless to do any kind of caching as it only wastes disk space. That might also mean more disk I/O.
2. Some users use multiple IMAP clients which cache everything locally. These could benefit from caching until all clients have fetched the data. After that it's useless.
3. Some clients don't do permanent local caching at all. For example Pine and webmails. These clients would benefit from caching everything. Some locally caching clients might also access some data from server again, such as when searching messages. They could benefit from caching only these fields.

After thinking about these a while, I figured out that people who care about performance most will be using Dovecot optimized LDA anyway which updates the indexes/cache immediately. In that case even the first user group would benefit from caching the same way as second group. LDA reads the mail anyway, so it might as well extract some information about it and store them into cache.

So, group 1. and 2. could be optimally implemented by keeping things cached only for a while. I thought a week would be good. When cache file is compressed, everything older than week will be dropped.

But how to figure out if user is in group 3? One quite easy rule would be to see if client is accessing messages older than a week. But with only that rule we might have already dropped useful cached data. It's not very nice if we have to read and cache it twice.

Most locally caching clients always fetch new messages (all but body) when they see them. They fetch them in ascending order. Noncaching clients might fetch messages in pretty much any order, as they usually don't fetch everything they can, only what's visible in screen. Some will use server side sorting/threading which also makes messages to be fetched in random order.

Second rule would then be that if a session doesn't fetch messages in ascending order, the fetched field type will be permanently cached.

So, we have three caching decisions:
1. Don't cache: Clients have never wanted the field
2. Cache temporarily: Clients want this only once
3. Cache permanently: Clients want this more than once

Different mailboxes have different decisions. Different fields have different decisions.

There are some problems, such as if a client accesses message older than a week, we can't know if user just started using a new client which is just filling its local cache for the first time. Or it might be a client user hasn't just used for over a week. In these cases we shouldn't have marked the field to be permanently cached. User might also switch clients from non-caching to caching.

So we should re-evaluate our caching decisions from time to time. This is done by checking the above rules constantly and marking when was the last time the decision was right. If decision hasn't matched for two months, it's changed. I picked two months because people go to at least one month vacations where they might still be reading mails, but with different clients.

Quota
Quota backend specifies the method how Dovecot keeps track of the current quota usage. They don't (usually) specify users' quota limits, that's done by returning extra fields from userdb. There are different quota backends that Dovecot can use:
• fs: Filesystem quota.
• dirsize: The simplest and slowest quota backend, but it works quite well with mboxes.
• dict: Store quota usage in a dictionary (e.g. SQL).
• maildir: Store quota usage in Maildir++ maildirsize files. This is the most commonly used quota for virtual users.

Enabling quota plugins
There are currently two quota related plugins:
• quota: Implements the actual quota handling and includes also all the quota backends.
• imap_quota: For reporting quota information via IMAP.

Usually you'd enable these by adding them to the mail_plugins settings in the config file:

    protocol imap {
      mail_plugins = quota imap_quota
    }
    protocol pop3 {
      mail_plugins = quota
    }
    # In case you're using deliver:
    protocol lda {
      mail_plugins = quota
    }

Configuration

The configuration is done differently for v1.0 and v1.1:
• v1.0 quota configuration
• v1.1 quota configuration

Quota and Trash mailbox
Standard way to expunge messages with IMAP works by:
1. Marking message with \Deleted flag
2. Actually expunging the message using EXPUNGE command

Both of these commands can be successfully used while user's quota is full. However many clients use a "move-to-Trash" feature, which works by:
1. COPY the message to Trash mailbox
2. Mark the message with \Deleted
3. Expunge the message from the original mailbox.
4. (Maybe later expunge the message from Trash when "clean trash" feature is used)

If user is over quota (or just under it), the first COPY command will fail and user may get an unintuitive message about not being able to delete messages because user is over quota. The possible solutions for this are:
• Disable move-to-trash feature from client
• Dovecot v1.0 + Maildir++ quota: You can completely ignore Trash mailbox from quota calculation by appending :ignore=Trash to the quota line. Note that this would allow users to store messages infinitely to the mailbox.
• Dovecot v1.1 or v1.0 quota rewrite: You can ignore Trash like with v1.0, but you can also give a separate quota rule giving Trash mailbox somewhat more quota (but not unlimited).

To make sure users don't start keeping messages permanently in Trash you can use a nightly cronjob or expire plugin (v1.1) to expunge old messages from Trash mailbox.