Technote Number: 1211241
Problem:
This issue was reported to Quality Engineering as SPR #DSTT6APL83, and is fixed
in Domino 6.5.5 Fix Pack 2 (FP2) and 6.5.6.
Excerpt from the Lotus Domino Release 6.5.6 fix list (available at
http://www.ibm.com/developerworks/lotus):
Server
SPR# DSTT6APL83 - This fix provides Notes.ini variables to control the maximum
amount of shared memory the backup process will consume before using temporary
files on disk. The Notes.ini variable "NSF_Backup_Memory_Constrained=1"
enables the feature with a default size of 20 MB; the Notes.ini variable
"NSF_Backup_Memory_Limit" is provided to tune that size. The default is
NSF_Backup_Memory_Limit=20000000.
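As a minimal sketch, the two variables named in the fix list excerpt would
appear in the server's notes.ini file as shown here. Both lines use only the
values quoted in the excerpt. The limit shown is the stated default, which
appears to be expressed in bytes given the 20 MB default size.

```ini
NSF_Backup_Memory_Constrained=1
NSF_Backup_Memory_Limit=20000000
```

The second line is only needed when a size other than the default is wanted.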
Refer to the Upgrade Central site for details on upgrading Notes/Domino.
When a third party backup application begins a backup of an NSF file, each
update to that NSF results in Domino recording (in shared memory) the "before
image" state of the block on disk being updated. This continues until the
backup application has completed backing up the NSF.
Scheduled agents, compact, fixup, updall, or any other maintenance task that
produces a large number of updates to an NSF while it is being backed up may
cause Domino to store too much disk state information in shared memory.
Eventually this exhausts all addressable virtual memory for the process, and
Domino panics. For example, a crash could occur when an agent modifies many
documents while Tivoli Data Protector (domdsmc task) backs up the same large
database.
As a best practice, this panic can typically be avoided by simply limiting the
maintenance tasks that are scheduled to run during backup.
Note: In one case, this problem was the result of running updall -R while
domdsmc was backing up admin4.nsf (1.5 GB). Rescheduling updall and the backup
so that they do not overlap resolved the issue.
To summarize, any maintenance task or agent that accesses one of these large
databases during the backup process changes the database while it is being
backed up. The backup must take those changes into account, which causes more
shared memory to be used. This increases the chances of the backup failing
and/or Domino crashing.
Workarounds
One solution is to have a quiet time in which to do the backup. Fewer changes
to a database while it is being backed up also reduce the time the backup
software requires to finish the backup.
For extremely large databases where backup failures or Domino crashes are
occurring frequently, the following workarounds are available:
1. For extremely large databases, schedule maintenance to run well before or
after the backup runs. Domino maintenance can be kicked off in three ways:
a. Notes.ini* (For example, ServerTasksAt2=UpdAll)
b. Program documents
c. Manually
* Upgrading servers to a later version of Domino will cause the default entries
for scheduled maintenance to be placed back into the notes.ini file. If you
have rearranged these entries to allow for a quiet time for running your
backups, be sure that your upgrade procedure includes resetting the notes.ini
maintenance entries to the settings you want.
2. Schedule backups during times of low user access to the databases.
3. Reschedule agents that run during the backup quiet time. In one case, an
agent was making regular modifications every 15 minutes to a large database
(including during the quiet time scheduled for backups). Below are some
suggestions for cases where an agent constantly makes changes to a very large
Notes database during the backup process.
a. Investigate the agents and change the schedules of the agents that run on
many or all documents in a particular database so that they do not run during
the backup.
b. If you can do without running the agents during the backup, you can set up
a Program document to quit the AMGR task before you start the backup. You can
then have another Program document to load the task again. This will ensure
that you do not run into problem agents during the backup.
The Program document to quit AMGR would be configured as follows:
Basics Tab:
Program Name: nserver
Command line: -c "tell amgr quit"
The Program document to load AMGR would contain:
Program Name: amgr
Specify the days and times to run these so that they match your backup
schedule. Once this is set up correctly, your backup should complete much more
quickly and without any outages.
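For workaround 1, the notes.ini rearrangement might look like the sketch
below. It assumes a hypothetical backup window of 01:00 to 04:00 and moves the
ServerTasksAt2=UpdAll entry from the example above to 22:00. The backup window
and the choice of hour 22 are illustrative assumptions, not values from this
technote. Lines beginning with a semicolon are annotations for this example
only and should be left out of a real notes.ini file.

```ini
; Assumption: backups run from 01:00 to 04:00 (hypothetical window).
; Before: UpdAll starts at 02:00, inside the backup window.
;   ServerTasksAt2=UpdAll
; After: UpdAll starts at 22:00, well before the backup begins.
ServerTasksAt22=UpdAll
```

After a Domino upgrade, check that the default ServerTasksAt entries have not
been restored alongside the rearranged ones.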
Note: IBM Support can assist you with running debug to help determine if agents
are constantly making changes to the database during the backup process.