About the authors
Ming Zhuo joined IBM CSTL in 2005. He currently works on testing Lotus Domino for IBM i and on project management.
Yao Zhao joined IBM CSTL in 2008. He currently works on testing Lotus Domino for IBM i.
Introduction
NSD is short for Notes System Diagnostic, one of the most useful and important diagnostic tools in the Lotus Notes and Domino product suite. It can be used to troubleshoot severe issues, such as hangs and crashes, in products such as:
- Domino Server, Notes Client
- Quickr, Quickplace, DomDoc, Domino Workflow
- Sametime
The main objective of NSD is to collect the call stacks of Domino jobs. The call stack section is the most important part of an NSD file for most Domino server or Notes client crashes and hangs. The call stacks show where a crash occurred, or what each thread was doing at the point when Domino hung. Without call stacks, most Domino problems are very difficult to diagnose.
Because collecting NSD data requires operating-system-specific APIs and mechanisms, there are three implementations of NSD: one for Windows, one for UNIX-like platforms, and one for IBM i. They collect broadly similar sets of information, but there are differences, especially on IBM i. In this article, we introduce the NSD tool of Lotus Domino on the IBM i platform: a brief introduction to each section of an NSD file, the history and evolution of NSD on IBM i, and how NSD is triggered or invoked by the end user.
Part A. General introduction to NSD
In Domino release 8.5.3, an IBM i NSD file contains the following parts:
Header
The header includes the following:
- Path to the NSD file
- Server name
- Date and time when the NSD was generated
- IBM i system name
- OS type (OS400) and the OS release
- Domino release information
- List of hot fixes for current Domino release
- Job and thread information of the failing thread
- Call stack of the failing thread
Here is a typical header:
/zm853/data/IBM_TECHNICAL_SUPPORT/nsd_03_26_12@14_45_49.nsd
Server: ZM853
Date: Mon Mar 26 14:45:49 2012
System: H6060A1C
OS: OS400
Release: V5R4M0
Notes Version: Release 8.5.3!September 15, 2011
Hotfixes for product 5733L85, V8R5M3 and Product Option 13:
none.
<@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@>
Section: Notes Process Info (Time 14:45:51)
<@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@>
<@@ Notes Process Info -> Call Stack for Process @@>
JOB: 133148/QNOTES/NSD THREAD: 0x35
_CXX_PEP__Fv 0 QP0ZPCP2 QP0ZPCP2
Qp0zNewProcess 266 QP0ZPCPN QP0ZPCPN
InvokeTargetPgm__FP11qp0z_pcp_cb 210
_C_pep 0 NSD NSD
main 164
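The "Name: value" lines at the top of the header lend themselves to simple scripted extraction, for example when many NSD files have to be sorted by server or release. The following is a minimal Python sketch, assuming the header fields appear exactly as in the sample above; the function name `parse_nsd_header` and the choice of fields are illustrative and not part of NSD itself.

```python
import re

# Field names taken from the sample header above; other releases may differ.
HEADER_FIELD = re.compile(
    r"^(Server|Date|System|OS|Release|Notes Version):\s*(.*)$"
)

def parse_nsd_header(lines):
    """Collect 'Name: value' header fields that precede the first section rule."""
    fields = {}
    for line in lines:
        if line.startswith("<@@"):
            break  # first major section reached, header is over
        match = HEADER_FIELD.match(line.strip())
        if match:
            fields[match.group(1)] = match.group(2).strip()
    return fields

sample = """\
/zm853/data/IBM_TECHNICAL_SUPPORT/nsd_03_26_12@14_45_49.nsd
Server: ZM853
Date: Mon Mar 26 14:45:49 2012
System: H6060A1C
OS: OS400
Release: V5R4M0
Notes Version: Release 8.5.3!September 15, 2011
Hotfixes for product 5733L85, V8R5M3 and Product Option 13:
none.
<@@@@@@@@>
Section: Notes Process Info (Time 14:45:51)
""".splitlines()

header = parse_nsd_header(sample)
print(header["Server"], header["Release"])
```

The hot fix list is deliberately left out of the sketch, because its layout beyond the "none." case is not shown here.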
Shared memory
Shared memory is one of the most important aspects of troubleshooting Domino issues. Memcheck is a tool that collects shared memory information. It is not turned on by default when NSD is generated on IBM i, because it can produce a massive amount of data and take quite a while to dump it. Customers can turn it on when necessary by adding "NSD_RUN_MEMCHECK=1" to the Domino server's notes.ini.
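Before asking for a new NSD, it can help to confirm that the setting is really present in notes.ini. The following Python sketch shows one way to check; the function name `memcheck_enabled` and the sample file content (including the `Directory` line) are hypothetical, and treating the variable name as case-insensitive is an assumption.

```python
def memcheck_enabled(ini_text):
    """Return True if the notes.ini text sets NSD_RUN_MEMCHECK=1."""
    for raw_line in ini_text.splitlines():
        name, sep, value = raw_line.partition("=")
        # Assumption: variable names are compared case-insensitively
        # and the first occurrence decides.
        if sep and name.strip().upper() == "NSD_RUN_MEMCHECK":
            return value.strip() == "1"
    return False

# Hypothetical notes.ini content, for illustration only.
sample_ini = "[Notes]\nDirectory=/zm853/data\nNSD_RUN_MEMCHECK=1\n"
print(memcheck_enabled(sample_ini))
```

To check a real server, read the notes.ini file from that server's data directory and pass its text to the function.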
Here is an example of the data collected by memcheck. To save space, we list only a few rows of the memcheck result.
<@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@>
Section: Notes Memory Analyzer (memcheck) (Time 13:03:52)
<@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@>
Arguments: QDOMINO853/MEMCHECK -k curr -d err
Copyright (c) IBM Corporation 1987, 2012. All Rights Reserved.
<@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@>
Section: Notes Memory Analyzer (memcheck) -> Shared Memory Analysis (Time 13:03:52)
<@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@>
<@@ ------ Notes Memory -> Analysis -> DPOOL Memory Analysis :: (Shared) (Time 13:03:52) ------ @@>
** Analyzing shared memory DPOOL 'key=0xf8004000' size=0
Number of Shared Pools = 40
Number of Small Shared Pools = 9
SharedDPoolSize = 0
SmallSharedDPoolSize = 0
Amount of memory allocated in all handle tables = 0
Amount of shared memory mapped from the system = 0
** Analyzing shared memory DPOOL 'key=0xf8004001' size=0
** Analyzing shared memory DPOOL 'key=0xf8004002' size=0
** Analyzing shared memory DPOOL 'key=0xf8004003' size=0
** Analyzing shared memory DPOOL 'key=0xf8004004' size=0
** Analyzing shared memory DPOOL 'key=0xf8004005' size=0
** Analyzing shared memory DPOOL 'key=0xf8004006' size=0
.......
Major/Minor headings of Domino and system data
Major sections look like this:
<@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@......@>
Section: <Major heading title> (<Time xx:xx:xx>)
<@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@......@>
Minor sections look like this:
<@@ <Major heading title> -> <Minor heading title> @@>
Typical headings include:
Dump Job Call Stacks
Environment section
OS Process Table section
Dump of Domino only IPC Shared Memory Segment Info section
System Information section
Domino Console Entries section
Summary section
<@@ Notes Process Info -> Call Stack for Process 132814/QNOTES/SERVER @@>
Dump Job Call Stacks
--------------------
!
!
###################################
## Thread: 0000000000000002 (1/66) PID: 38678 Job: 132814/QNOTES/SERVER Time: 2012/03/26 14:46:02.966
###################################
!
!
Lib Name Pgm Name Mod Name Statement Procedure Name
---------- ---------- ---------- ---------- ------------------------------------------------------------
QSYS QP0ZPCP2 QP0ZPCP2 _CXX_PEP__Fv
QSYS QP0ZPCPN QP0ZPCPN 0000000266 Qp0zNewProcess
QSYS QP0ZPCPN QP0ZPCPN 0000000210 InvokeTargetPgm__FP11qp0z_pcp_cb
QDOMINO853 SERVER MAIN _C_pep
QDOMINO853 SERVER MAIN 0000000010 main
QDOMINO853 SERVER LIBMAIN 0000000050 ServerMain
QDOMINO853 SERVER LIBMAIN 0000000003 FirstProcessMain
QDOMINO853 SERVER POLL 0000000006 ServerPoller
QDOMINO853 LIBNOTES THREAD 0000000003 OSDelayThread
QDOMINO853 LIBNOTES USLEEP 0000000006 unix_usleep
QSYS QP0LLIB1 QP0LLIB1 0000000009 select
!
!
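Each thread's call stack starts with a "## Thread:" line, so a script can index the threads of a large NSD file by scanning for these lines. The following Python sketch parses one such line, assuming the layout of the sample above. The function name and the dictionary keys are illustrative, and reading "(1/66)" as "thread 1 of 66" is an assumption.

```python
import re
from datetime import datetime

THREAD_LINE = re.compile(
    r"^##\s+Thread:\s+(?P<thread>[0-9A-Fa-f]+)\s+"
    r"\((?P<index>\d+)/(?P<total>\d+)\)\s+"
    r"PID:\s+(?P<pid>\d+)\s+"
    r"Job:\s+(?P<job>\d+/[^/\s]+/\S+)\s+"
    r"Time:\s+(?P<time>\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}\.\d+)\s*$"
)

def parse_thread_line(line):
    """Return the fields of a '## Thread:' line, or None if it does not match."""
    match = THREAD_LINE.match(line.strip())
    if match is None:
        return None
    # An IBM i qualified job name has the form number/user/name.
    number, user, name = match.group("job").split("/")
    return {
        "thread": match.group("thread"),
        "index": int(match.group("index")),   # assumed: position of this thread
        "total": int(match.group("total")),   # assumed: thread count of the job
        "pid": int(match.group("pid")),
        "job_number": number,
        "job_user": user,
        "job_name": name,
        "time": datetime.strptime(match.group("time"), "%Y/%m/%d %H:%M:%S.%f"),
    }

info = parse_thread_line(
    "## Thread: 0000000000000002 (1/66) PID: 38678 "
    "Job: 132814/QNOTES/SERVER Time: 2012/03/26 14:46:02.966"
)
print(info["job_name"], info["pid"], info["time"])
```

The plain "#####" separator lines do not match the pattern, so the function returns None for them.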
Because of space constraints, we list only the headers of the following sections:
<@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@>
Section: Environment (Time 14:46:15)
<@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@>
<@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@>
Section: OS Process Table (Time 14:46:17)
<@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@>
<@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@>
Section: Dump of Domino only IPC Shared Memory Segment Info (Time 14:46:20)
<@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@>
<@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@>
Section: System Information (Time 14:46:20)
<@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@>
<@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@>
Section: Domino Console Entries (Time 14:46:23)
<@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@>
<@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@>
Section: Summary (Time 14:46:24)
<@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@>
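Because major and minor headings follow a fixed pattern, a short script can produce a table of contents for an NSD file. The following Python sketch is one way to do that under the heading formats described in this section; the names `Heading` and `list_headings` are illustrative, and headings that deviate from these two patterns are simply skipped.

```python
import re
from collections import namedtuple

Heading = namedtuple("Heading", ["kind", "title", "time"])

# "Section: <title> (Time hh:mm:ss)" between two "<@@@...@>" rule lines.
MAJOR = re.compile(
    r"^Section:\s*(?P<title>.*?)\s*\(Time (?P<time>\d{2}:\d{2}:\d{2})\)\s*$"
)
# "<@@ <major title> -> <minor title> @@>"; rule lines have no space after "<@@".
MINOR = re.compile(r"^<@@ (?P<path>.+?) @@>$")

def list_headings(lines):
    """Return the major and minor headings found in the given NSD lines."""
    headings = []
    for line in lines:
        line = line.strip()
        major = MAJOR.match(line)
        if major:
            headings.append(Heading("major", major.group("title"), major.group("time")))
            continue
        minor = MINOR.match(line)
        if minor:
            headings.append(Heading("minor", minor.group("path"), None))
    return headings

sample = """\
<@@@@@@@@@@@@@@@@>
Section: Environment (Time 14:46:15)
<@@@@@@@@@@@@@@@@>
<@@ Environment -> QNOTES User Profile @@>
<@@@@@@@@@@@@@@@@>
Section: Summary (Time 14:46:24)
<@@@@@@@@@@@@@@@@>
""".splitlines()

toc = list_headings(sample)
for heading in toc:
    print(heading.kind, heading.title, heading.time or "")
```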
Other useful information for debugging
Some minor sections, such as system values, subsystem description, and shared memory information, are added to NSD so that this data is available in the NSD file for later diagnosis. Refer to table 1 in part B for details.
Part B. History and evolution of NSD
NSD for Lotus Domino on IBM i has evolved gradually since its first release, to stay synchronized with other platforms, provide more useful information, and be more user friendly. Reviewing the history of NSD shows the enhancements made between Domino releases, most of which were requested by customers or by our laboratory engineers.
Table 1: NSD enhancement history
Domino Release | IBM i specific or porting | Enhancement and sample |
6.5.4 | porting | Memcheck is added to NSD on IBM i. The IBM i port does not include all memcheck options, and memcheck is not turned on by default; the customer must set "NSD_RUN_MEMCHECK=1" in the notes.ini file to turn it on.
Sample:
<@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@>
Section: Notes Memory Analyzer (memcheck) (Time 14:45:57)
<@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@>
Arguments: QDOMINO853/MEMCHECK -k curr -d err -l /zm853/data/IBM_TECHNICAL_SUPPORT/nsd_03_26_12@14_45_49.nsd -o /zm853/data/IBM_TECHNICAL_SUPPORT/memcheck_03_26_12@14_45_49.dmp
Copyright (c) IBM Corporation 1987, 2011. All Rights Reserved.
|
6.5.5 | porting | The section that dumps the call stacks of all Domino jobs is added to the NSD file on IBM i.
Sample:
<@@ Notes Process Info -> Call Stack for Process 132814/QNOTES/SERVER @@>
Dump Job Call Stacks
--------------------
!
!
###################################
## Thread: 0000000000000002 (1/66) PID: 38678 Job: 132814/QNOTES/SERVER Time: 2012/03/26 14:46:02.966
###################################
|
7.0.2 | porting | Time stamps are added to the headers of the major sections of NSD on IBM i, for instance in memcheck and in each job's call stacks.
Sample:
<@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@>
Section: Notes Process Info (Time 14:45:51)
<@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@>
<@@ Notes Process Info -> Call Stack for Process 132814/QNOTES/SERVER @@>
Dump Job Call Stacks
--------------------
!
!
###################################
## Thread: 0000000000000002 (1/66) PID: 38678 Job: 132814/QNOTES/SERVER Time: 2012/03/26 14:46:02.966
###################################
|
7.0.4 & 8.0.1 | IBM i specific | Update IBM i NSD to reliably find spool files and copy them into the NSD file. This enhancement uses an IBM i system API to retrieve spool file information and passes it to the CPYSPLF command, which copies the generated spool files into the NSD file.
This allows the following commands to generate an NSD file that contains the call stacks of a running server and the Domino console entries. In these and later releases, a customer or developer can issue RUNDOMCMD to get a single NSD file containing all the call stacks, instead of a collection of spool files that each contain one call stack.
RUNDOMCMD <server name> CMD(CALL QDOMINO704/NSD) BATCH(*NO)
RUNDOMCMD <server name> CMD(CALL QDOMINO801/NSD) BATCH(*NO)
|
8.0.2 | IBM i specific | Add a new option, 17=Create NSD, to the WRKDOMSVR screen. This allows an end user to create an NSD file without going to the Domino console and issuing the command. See figure 1.
|
8.5.2 | IBM i specific | Update IBM i NSD to retrieve and dump PASE and Java call stack entries, which use a different format.
Sample:
###################################
## Thread: 0000000000000005 (2/65) PID: 43639 Job: 137844/QNOTES/HTTP Time: 2012/03/27 15:47:49.858
###################################
|
|
Lib Name Pgm Name Mod Name Statement Procedure Name
---------- ---------- ---------- ---------- ------------------------------------------------------------
QSYS QLESPI QLECRTTH 0000000017 LE_Create_Thread2__FP12crtth_parm_t
QSYS QP0WPINT QP0WSPTHR 0000000019 pthread_create_part2
QJVM6032 QXJ9VM QXJ9STRTPA 0000000041 paseMainThread__13Qxj9StartPaseFPv
QSYS QP2SHELL2 QP2SHELL2 _CXX_PEP__Fv
QSYS QP2SHELL2 QP2SHELL2 0000000041 main
QSYS QP2USER QP2USER 0000000004 Qp2RunPase
QSYS QP2USER2 QP2API 0000000007 __Qp2RunPase
QSYS QP2USER2 QP2API 0000000006 runpase_main__FPi
QSYS QP2USER2 QP2API 0000000002 runpase_common__FiPvT2
P : *PASE_32 0 000000008C __start
Module: /QOpenSys/QIBM/ProdData/JavaVM/jdk60/32bit/jre/lib/ppc/jvmStartPase
P : *PASE_32 122 00000000DC main@AF2_1
Module: /QOpenSys/QIBM/ProdData/JavaVM/jdk60/32bit/jre/lib/ppc/jvmStartPase
P : *PASE_32 0 0000000003 *N
Module: /unix
P : *PASE_32 0 0000000018 _ILECALL
Module: /usr/lib/libc.a(shr.o)
P : *PASE_32 0 0000000008 <syscall32>:_ILECALLX
Module: /unix
QJVM6032 QXJ9VM QXJ9STRTPA 0000000015 xj9PaseActive
QSYS QP0WPTHR QP0WCOND 0000000049 pthread_cond_wait
QSYS QP0WPINT QP0WSCOND 0000000086 wait__20Qp0wPthreadConditionFP7Qp0wTcbP9Qp0wMutex
|
8.5.3 | IBM i specific | Add more system information to the NSD file:
1. QNOTES user profile
2. QNOTES authorities
3. WRKSYSSTS output
4. SYSTEM VALUES
5. Domino subsystem description
6. LOTUS_SERVERS |
8.5.3 | IBM i specific | Enhance IBM i NSD processing to dump the call stacks of jobs that run under the Domino server but are not found in the server's pid.nbf file.
Sample:
Because of space constraints, only the headers of the following sections are listed:
<@@ Environment -> QNOTES User Profile @@>
Display User Profile - *BASIC Page 1
<@@ Environment -> QNOTES User Profile Authority @@>
Display Object Authority Page 1
<@@ System Information -> System Status Information @@>
System Status Information Page 1
<@@ System Information -> System Values @@>
System Values Page 1
<@@ System Information -> Subsystem Description @@>
5722SS1 V5R4M0 060210 Display Subsystem Description 12/03/26 15:47:58 Page 1
<@@ System Information -> LOTUS_SERVERS Information @@>
Jobs such as kvoop do not have their PIDs recorded in pid.nbf. This enhancement dumps the call stacks of these jobs when the NSD file is generated. |
8.5.3 | IBM i specific | Enhance NSD to dump Domino shared memory segment information, including the Key, owner, creator, etc.
Sample:
<@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@>
Section: Dump of Domino only IPC Shared Memory Segment Info (Time 15:47:55)
<@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@>
Domino partition number 8 has a Key range of F8004000 - F80047FF
ID Key Owner Creator Number Seg Size Damaged Resize Mark TS Last Admin Chg
Attached Del Time YYMMDD
---------- -------- ---------- ---------- ---------- ---------- ------- ------ ---- --- ---------------
1. 50812 F8004000 QNOTES QNOTES 15 12521040 N Y N N 02:46:44 120326
|
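The key range line in the shared memory segment sample ("Domino partition number 8 has a Key range of F8004000 - F80047FF") can be used to check whether a given shared memory key belongs to that Domino partition. The following Python sketch assumes the range is inclusive at both ends and that the line has exactly the wording shown in the sample; the function names are illustrative.

```python
import re

RANGE_LINE = re.compile(r"Key range of ([0-9A-Fa-f]+) - ([0-9A-Fa-f]+)")

def parse_key_range(line):
    """Extract the (low, high) key range, as integers, from the range line."""
    match = RANGE_LINE.search(line)
    if match is None:
        raise ValueError("no key range found in line")
    return int(match.group(1), 16), int(match.group(2), 16)

def key_in_range(key, key_range):
    """Assumption: both ends of the reported range are inclusive."""
    low, high = key_range
    return low <= key <= high

key_range = parse_key_range(
    "Domino partition number 8 has a Key range of F8004000 - F80047FF"
)
print(key_in_range(0xF8004006, key_range))
print(key_in_range(0xF8004800, key_range))
```

With the sample values, the DPOOL keys 0xf8004000 through 0xf8004006 shown in the memcheck example earlier in this article fall inside the reported range.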
Figure 1. Option 17 in WRKDOMSVR screen
Part C. Main reasons an NSD is generated
On the IBM i platform, there are four main reasons an NSD is generated:
- An unhandled exception occurs
- A Domino determined critical failure
- A Domino subsystem or job ends abnormally
- User requests an NSD
An unhandled exception occurs
The exception message ID is usually MCH3601 or MCH0601. Other exceptions can occur, but the vast majority are one of these two.
A Domino determined critical failure
These failures vary by scenario. A common scenario is that PANIC or QUIT takes longer than the default timeout. The default timeout for QUIT on a Domino server is 300 seconds, but you can modify this value in the server document. If the server has not ended within the timeout interval after the QUIT command is issued, it is forced to end and an NSD file is generated.
Domino subsystem or job ends abnormally
This happens when a user issues ENDJOB *IMMED against a Domino job, ENDSBS *IMMED against a Domino subsystem, or ENDDOMSVR *IMMED (which becomes an ENDSBS *IMMED command internally). This leaves a small, truncated NSD.
User requests an NSD
A user can request an NSD in the following ways:
a. Choose option 17 from WRKDOMSVR
b. Use the CL command RUNDOMCMD SERVER(<server name>) CMD(CALL PGM(QNOTES/NSD))
c. Issue the command "Load nsd" from the Domino console
Invoking NSD from outside the Domino server requires a RUNDOMCMD command or option 17 of WRKDOMSVR. If you just run CALL QNOTES/NSD from an IBM i command line, an NSD file is created under your current working directory, but it is missing all of the Domino-specific data.
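When NSD has to be requested for several servers, the RUNDOMCMD string can be assembled by a small helper instead of being typed each time. The following Python sketch only builds the command text and does not run it. The function name `build_nsd_command` is hypothetical, the SERVER(...) keyword form is an assumption about the command syntax, and the optional BATCH(*NO) mirrors the commands shown in Table 1. No quoting or validation of the server name is attempted.

```python
def build_nsd_command(server, library="QNOTES", batch_no=False):
    """Build a RUNDOMCMD string that calls the NSD program for one server."""
    if not server:
        raise ValueError("server name must not be empty")
    # Assumed syntax: SERVER(...) keyword followed by the CALL to run.
    command = f"RUNDOMCMD SERVER({server}) CMD(CALL PGM({library}/NSD))"
    if batch_no:
        command += " BATCH(*NO)"  # as in the Table 1 sample commands
    return command

print(build_nsd_command("ZM853"))
print(build_nsd_command("ZM853", library="QDOMINO801", batch_no=True))
```

The resulting string would then be entered on an IBM i command line by a user with the required authority.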
Conclusion
In this article, we introduced the NSD tool of IBM Domino on the IBM i platform: a brief introduction to each section of an NSD file, the history and evolution of NSD on IBM i, and how NSD is triggered or invoked by the end user. NSD provides an easy and flexible way to collect the information needed for diagnostics when a Domino server runs into error conditions. We hope this helps you make better use of the tool when troubleshooting Domino problems on IBM i.