DAOS ISSUES WHEN IMAP OR POP3 TASKS ARE RUNNING
We have seen several instances of the same issue, in which many duplicate NLO files are created and use up the majority of the disk space while the IMAP or POP3 tasks are running. This issue first came to our attention in Domino 8.5.3.
POP3
There is uncontrolled growth of NLO files in the DAOS repository, caused by POP3 calls to a Lotus Notes mail database. When POP3 is disabled this issue does not emerge.
In the DAOS repository there were a dozen NLO files which reappeared every 10 minutes; they were created each time the user connected over POP3 to the Domino server. The total size of these files is 100 MB, so 100 MB created 6 times an hour, 24 hours a day, adds 14.4 GB of data to the DAOS repository every day. This is a lot of disk space used up unnecessarily.
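The daily growth figure can be checked with a few lines of Python. This is only a sketch of the arithmetic, using the figures quoted above (100 MB per poll, one poll every 10 minutes) and assuming decimal units (1 GB = 1000 MB); the function name is made up for illustration.

```python
# Figures from the text: about 100 MB of new NLO files per POP3 poll,
# and one poll every 10 minutes.
MB_PER_POLL = 100
POLL_INTERVAL_MINUTES = 10


def daily_growth_gb(mb_per_poll: float, interval_minutes: float) -> float:
    """Return DAOS growth per day in GB, assuming 1 GB = 1000 MB."""
    polls_per_day = (60 / interval_minutes) * 24  # 6 polls an hour, 24 hours
    return mb_per_poll * polls_per_day / 1000


if __name__ == "__main__":
    growth = daily_growth_gb(MB_PER_POLL, POLL_INTERVAL_MINUTES)
    print(f"{growth:.1f} GB per day")
```

Changing the poll interval or the size per poll shows how quickly the repository grows with more users or shorter client check intervals.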
The reason for these multiple NLO files is outlined as follows:
The Export (serialize) function that is being called during the POP3 LIST operation (invoked by the Outlook client) is storing the entire email message (formatted correctly to send to a POP3 client) as an attachment in order to find the size. These "attachments" are identical except for the timestamp on the serialise operation within the email header.
If the document exists in the POP3 cache, then it is reused. If it is NOT in the POP3 cache, then it is re-serialised at that time which leads to a different 'serialised by' entry in the header. That difference turns the attachment into a unique one, so a new NLO is allocated to contain it.
So this issue is a side-effect of the cache. The underlying problem is that POP3 uses attachments as a scratchpad for the client-ready formatted version of the email. In fact, having an attachment on that email is a red herring: the overall size of the formatted email (stored as an attachment by POP3) controls whether it goes into DAOS or not; it is just more likely to go over the limit if it has an attachment. A sufficiently large body-only email without attachments would cause the same problem, as it is (re)formatted for the POP3 client in response to the LIST operation.
There is also another case where multiple NLO files can be created.
When a POP3 client does a LIST operation to get a list of the messages in the user's mailfile, the size of each message is included in the response. In order to determine the size, each message is converted (serialised) to the ready-to-download format. This serialised message is stored temporarily in an attachment created in the user's mailfile. If the serialised message is larger than the DAOS minimum participation size, the attachment is stored in DAOS. (Note: having an attachment in the message being serialised increases the size and therefore the likelihood that it will be stored in DAOS, but it's not strictly related. Any sufficiently large body-only message would behave the same way.)
Because the LIST operation is typically performed at each POP3 client poll interval, the message(s) can be serialised in response to these checks. If the message exists in the POP3 cache, it is not serialized again. If it does not exist in the cache (server restarted, aged out of cache, etc) then the message is serialised again. One of the elements of the serialised message header is the serialisation timestamp. Even though the content of the message has not changed since the last poll, the serialisation timestamp in the message header does change, which results in a unique attachment, and therefore a unique NLO file.
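The behaviour described above can be modelled with a small Python sketch. It is a toy simulation, not Domino code: the header name X-Serialised-At, the ContentStore class and the use of SHA-256 to identify unique content are all assumptions made for illustration, since the text does not say how DAOS decides that two attachments differ.

```python
import hashlib


def serialise(body: str, serialised_at: str) -> bytes:
    """Build a client-ready message; the header name is hypothetical."""
    return f"X-Serialised-At: {serialised_at}\r\n\r\n{body}".encode("utf-8")


class ContentStore:
    """Toy stand-in for the repository: one object per unique content digest."""

    def __init__(self):
        self.objects = {}

    def put(self, data: bytes) -> str:
        key = hashlib.sha256(data).hexdigest()  # assumed identity scheme
        self.objects[key] = data
        return key


def poll(store, cache, msg_id, body, now):
    """Serialise only on a cache miss, as described for the LIST operation."""
    if msg_id not in cache:
        cache[msg_id] = store.put(serialise(body, now))
    return cache[msg_id]


store, cache = ContentStore(), {}
body = "same message body"
poll(store, cache, "m1", body, "10:00")  # cache miss: first object stored
poll(store, cache, "m1", body, "10:10")  # cache hit: nothing new is stored
cache.clear()                            # e.g. server restart or cache ageing
poll(store, cache, "m1", body, "10:20")  # cache miss: second, unique object
print(len(store.objects))
```

Although the message body never changes, the store ends up holding two objects, because the only differing bytes (the timestamp in the header) are enough to produce a different digest.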
It turns out that the message size in the LIST command is not used by most clients, and therefore is not necessary to calculate. Skipping this calculation has two benefits:
1) Avoiding the overhead of serialising the uncached messages in the LIST output on each poll request
2) Avoiding creating unique NLO files by not (re)serialising the messages. The messages will only be serialised when a RETR operation is issued to fetch the individual message.
There is an INI variable that causes the size calculation to be skipped and supplies an approximate size instead:
POP3_LIST_SIZE_ESTIMATE=1
Adding this INI variable should avoid the NLO problems, and yield better performance.
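To confirm that the setting is present, the notes.ini contents can be scanned for the variable. The following Python sketch is a hypothetical helper, not an IBM tool: the function name, the sample file contents and the case-insensitive name comparison are assumptions, and the location of notes.ini on a given server is deliberately left out.

```python
def ini_value(ini_text: str, key: str):
    """Return the value of key from notes.ini-style text, or None if absent."""
    for line in ini_text.splitlines():
        name, sep, value = line.partition("=")
        # Lines without "=" (such as section headers) are skipped.
        if sep and name.strip().lower() == key.lower():
            return value.strip()
    return None


# Sample contents, invented for illustration only.
sample = "[Notes]\nServerTasks=Router,POP3\nPOP3_LIST_SIZE_ESTIMATE=1\n"

if __name__ == "__main__":
    print(ini_value(sample, "POP3_LIST_SIZE_ESTIMATE") == "1")
```

In practice the text would be read from the server's own notes.ini rather than from an inline sample.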
DAOS is hooked in just above the NSF file I/O level, so it only sees that an object is being written to an NSF, and that object has the attachment attribute on it. It does not know (or care) where the attachment originated, or why it is being written. Even though POP3 is only using the attachment as a temporary container, DAOS doesn't know that, and stores it in the DAOS repository as it would with any other attachment. The reference count on the NLO soon goes from 1 to 0, but the NLO is not deleted until the DAOS deferred deletion interval (sort of like soft delete) has expired, 30 days by default. With potentially thousands of documents being serialised...uniquely...every hour, there could be tens of thousand
Even with the INI set, when the POP3 server gets a RETR command, the document still needs to be serialised to prepare it. Since the POP3 code will create the serialised version of the document in an attachment in the user's mailfile, a 'temporary' DAOS object will result. Assuming that the client is smart enough to only RETR the document once, there should only ever be one of the temporary NLO files created for each document the POP client requests.
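The effect of the deferred deletion interval on disk usage can be estimated with a rough model. The Python sketch below assumes a constant creation rate and that every unreferenced NLO is removed as soon as the interval expires; the function name is hypothetical, and the example figures (a dozen files totalling 100 MB every 10 minutes, 30-day interval) are the ones quoted earlier in this note.

```python
def orphan_backlog(unique_per_hour: float, avg_size_mb: float,
                   deferred_days: int = 30):
    """Return (count, size in GB) of zero-reference NLOs awaiting deletion.

    Assumes a steady creation rate and 1 GB = 1000 MB.
    """
    count = unique_per_hour * 24 * deferred_days
    return count, count * avg_size_mb / 1000


# A dozen files every 10 minutes is 72 files an hour, averaging 100/12 MB each.
count, size_gb = orphan_backlog(unique_per_hour=12 * 6, avg_size_mb=100 / 12)

if __name__ == "__main__":
    print(f"{count} files, about {size_gb:.0f} GB")
```

Under these assumptions a single affected user keeps roughly 30 days of growth on disk at any time, which is why the repository fills up long before the deferred deletions take effect.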
There is room for improvement with some future code changes (i.e. have POP3 serialise the document into a non-attachment object so DAOS will ignore it). This issue has been reported in SPR # PMAO8XTM5M: POP3 server serialises each document into an attachment object, which unnecessarily invokes DAOS.
This issue has been resolved in Domino 8.5.4.
IMAP
Multiple NLO files are created using up all of the disk space for the DAOS directory only when the IMAP task is running. When the IMAP task is disabled this issue does not emerge.
It seems that IMAP creates a new serialised version of old messages (complete with body, headers and attachments in MIME format) every time an IMAP client connects. These NLO files are not referenced in any mail database and are generated over and over, consuming all the disk space available.
These duplicate NLO files are intermediate objects that were most likely used as part of a conversion between different encoding/compression styles. Duplicate NLO files can also be created if you are using mail journaling.
Things that should be checked:
- Is mail journaling enabled? If so, note that mail journaling does not work well with DAOS and will create duplicate NLO files.
- Do you use different compression types or just 1? It is advisable to only use LZ1.
- Do you have different encoding methods (e.g. prefers MIME) or only 1? It is advisable to only use 1 type.
To resolve this, a manual prune (prune=0) will remove these duplicate NLO files, as they are temporary files that have no reference.
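The difference between waiting for the deferred deletion interval and running a manual prune with a value of 0 can be illustrated with a toy model. The Python sketch below is not the DAOS implementation: the Nlo record, the file names and the assumption that the prune value is a minimum age in days for unreferenced objects are all illustrative.

```python
from dataclasses import dataclass


@dataclass
class Nlo:
    name: str                   # hypothetical file name
    refs: int                   # number of documents referencing the object
    days_unreferenced: int = 0  # age since the reference count reached 0


def prune(repo, min_days):
    """Keep referenced objects and orphans younger than min_days."""
    return [o for o in repo if o.refs > 0 or o.days_unreferenced < min_days]


repo = [
    Nlo("a.nlo", refs=2),
    Nlo("b.nlo", refs=0, days_unreferenced=1),
    Nlo("c.nlo", refs=0, days_unreferenced=45),
]

after_default = [o.name for o in prune(repo, 30)]  # recent orphan survives
after_zero = [o.name for o in prune(repo, 0)]      # every orphan is removed

if __name__ == "__main__":
    print(after_default, after_zero)
```

In this model only a value of 0 removes the recently orphaned object, which matches the reason a manual prune=0 clears the temporary files straight away.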
This issue has been reproduced by IBM by creating many NLO files during IMAP client interactions. These are only used temporarily, have 0 references, and can be removed with PRUNE.
But we also need to determine if there is a way to avoid creating them in the first place.
A hotfix has been created for this purpose. The HF should also be applied if you have confirmed that mail journaling is not enabled, if you are using only 1 compression type and if you are using only 1 encoding method.
HOTFIX FOR BOTH POP3 AND IMAP ISSUES
A hotfix has been created to resolve this issue, which has been reported in SPR PPOR8XZLPN. This HF was created for Domino 8.5.3. However, the HF can be requested for 8.5.3 FP1 etc. by raising a support ticket. This HF was created for the IMAP issue but it will also resolve the POP3 issue.
The NLO contents tell us that both the IMAP and POP3 issues are the same. The attachment is being used as a scratchpad to contain the formatted/encoded message prior to download to the client. The only difference in these is the date stamp on the serialisation line in the message header (inside the scratchpad attachment), but that's enough to make the attachment unique, which means DAOS will allocate a new NLO to contain it. All of these end up with 0 refs, so pruning gets rid of them. But to avoid the issue in the first place the HF can be applied.