Operation Tips and Techniques
This document section provides advice about ways to use OpCon and the IBM i Agent toolkit to automate IBM i operations. Details about how to use the IBM i toolkit components referenced in this section may be found in other topics of this document.
Monitoring for IBM i Jobs in MSGW Status
IBM discontinued its Navigator Monitor support for Job Status monitoring as it began distribution of its all-new IBM Navigator for i. The former solution proposed by SMA for interconnecting that type of Job Status monitor with the OpCon Agent for IBM i is no longer supported by IBM.
SMA has replaced the IBM solution with the Agent's own "Job Status Monitor" server job. The LSAM server job name is JOBSTS. This service is optional and it will only be started automatically if the LSAM Administrator sets the LSAM control that requests this service.
This document section offers suggestions about ways that clients of SMA Technologies can engage the IBM i LSAM Job Status Monitors for the purpose of detecting any jobs in selected IBM i subsystems that are stuck in MSGW status. It describes how to configure a Job Status Monitor to generate a message that can be intercepted by the IBM i LSAM Message Management facility, from which many forms of notification and response can be generated, including OpCon Event commands.
JOBSTS Server Implementation Outline
The strategy recommended by this document is to configure and start the IBM i LSAM Job Status Monitor server. If a job is found in one of the pre-registered IBM i subsystems with a status of "MSGW", a readily identifiable message is sent to QSYSOPR. An IBM i LSAM Message Management Parameter record can then be configured to recognize this unique message, and attached Capture Rules and Response Rules can capture and communicate specific information about the job. The Response Rules can initiate OpCon services such as notifications, or trigger automated OpCon Schedules that perform error recovery measures, for example.
The general purposes anticipated by this strategy include:
- The client operations staff and/or administrators can be aggressively notified when any jobs are stuck in the MSGW status.
- As each MSGW event is detected, the client may wish to implement new Message Management rules that can answer and/or respond to the specific messages that are discovered during follow-up research that is initiated by each Monitor event.
- As a result, there should be fewer jobs that are actually stuck, since the IBM i LSAM and OpCon can answer inquiry messages and possibly also initiate self-healing procedures to recover from the anticipated error condition.
The activities required to implement this strategy include:
- Configure the Job Status Monitor.
- See the following descriptions of the LSAM menu options and the screens used to maintain the JOBSTS server features.
- Using option 7 in the Job Status Monitor sub-menu, change the auto-start option to a value of 'Y' to enable automatic starting and stopping of this LSAM server job at the same time as the other LSAM server jobs are being managed.
- Choose an estimated Activity poll interval, depending on how aggressive the monitor operations should be.
- This can be adjusted later, after experience shows whether the monitor should be more active (for faster response) or less active, if the goal is to reduce the workload of the operating system.
- HINT: The default value of 15 seconds may be somewhat slow. Modern Power Processors are able to handle a lot more work per second than legacy systems.
- Add at least one IBM i subsystem to the Job Status Subsystem Management.
- A maximum of 25 subsystems can be monitored by the Job Status Monitor.
- Optional: Register any jobs that will be excluded by adding the jobs to the Job Status Job Exclusion Management.
- This feature prevents false notifications about jobs that are expected to issue messages requiring a reply.
- For example, a backup job might request a message reply after some backup media has been manually mounted.
- Jobs that are already being managed by OpCon automation do not need to be detected by the LSAM Job Status Monitor.
- Register LSAM Message Management Parameters that will detect the expected notifications from the Job Status Monitor.
- Define Capture Rules linked to the Message Management Parameters that can capture identifying information from the triggered Job Status Monitor message.
- Define Response Rules linked to each Capture Rule that will:
- Store the message identifying information into LSAM Dynamic Variables, if necessary.
- Generate any form of Notification Event, typically via OpCon using the OpCon External Event commands that are supported by prompts from within Response Rule maintenance and, optionally, from LSAM Multi-Step Job Script Steps.
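The detection logic outlined in the steps above can be modelled in a few lines of Python. This is only a conceptual sketch, not the LSAM's implementation: the `Job` record, the `find_msgw_jobs` function, the sample job and subsystem names, and the assumption that exclusions are keyed by job name are all hypothetical. Only the MSGW status, the 25-subsystem limit, and the idea of excluding expected jobs come from this document.

```python
from dataclasses import dataclass

MAX_SUBSYSTEMS = 25  # limit stated in this document


@dataclass(frozen=True)
class Job:
    """Hypothetical snapshot of one active job (not an LSAM structure)."""
    number: str
    user: str
    name: str
    subsystem: str
    status: str


def find_msgw_jobs(jobs, monitored_subsystems, excluded_job_names):
    """Return jobs in MSGW status in a monitored subsystem, minus exclusions."""
    if len(monitored_subsystems) > MAX_SUBSYSTEMS:
        raise ValueError("at most 25 subsystems can be monitored")
    return [
        job for job in jobs
        if job.status == "MSGW"
        and job.subsystem in monitored_subsystems
        and job.name not in excluded_job_names  # assumption: keyed by job name
    ]


# Illustrative data only.
jobs = [
    Job("123456", "QPGMR", "NIGHTLY", "QBATCH", "MSGW"),
    Job("123457", "BACKUP", "SAVLIB01", "QBATCH", "MSGW"),  # expected to wait for a reply
    Job("123458", "QUSER", "ORDERS", "QINTER", "MSGW"),     # subsystem not monitored
    Job("123459", "QPGMR", "REPORTS", "QBATCH", "RUN"),     # not waiting on a message
]
stuck = find_msgw_jobs(jobs, {"QBATCH"}, {"SAVLIB01"})
print([job.name for job in stuck])
```

In this model, only the job that is in MSGW status, runs in a registered subsystem, and is not excluded would produce an alert. The excluded backup job illustrates how the Job Exclusion list prevents false notifications.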
JOBSTS Server Standard Alert Message
The IBM i System Job Status Monitor generates a very specific message that is sent to the QSYSOPR operator message queue. The characteristics of this message are needed to build a correctly defined LSAM Message Management Parameter record. They also govern how Data Capture and Response Rules can be configured effectively.
- The message ID is 'SMA5802', as defined in the LSAM's SMAMSGF message file.
- The text of this message starts with the characters 'SMA5802'.
- The reason for quoting the message ID within the message primary text is to keep this LSAM JOBSTS server feature compatible with the strategy previously recommended by SMA for integrating the OpCon Agent with IBM's Navigator Monitors. In that circumstance it was not feasible to rely on a message ID, so the strategy had suggested registering a non-ID text message string that would be recognized by the characters 'SMA5802' in the first seven positions of the primary message text. Sites that had previously used the IBM Navigator Monitor integration strategy should be able to retain much of their previous response strategy for jobs found in a MSGW status.
- The IBM i Job ID for the job that was discovered in MSGW status is merged into the primary message text of message ID SMA5802.
- Using LSAM Message Data Capture Definitions, the Job ID can be found by specifying these Capture Definition parameters:
- Primary/Secondary text: P
- Message data from position: 70
- Length of data string: 28
- NOTE: Since the primary message text is a fixed-length string, the start of the IBM i Job ID is predictable. It is not necessary to use the Scan label to locate where the Job ID information may be found.
- The Job ID (a substitutable parameter in the primary message text) can hold up to 28 characters, which is the maximum length of an IBM i Job ID. The Job Name and User Name may vary in length, but the Job Number always occupies 6 positions. Since blanks are not allowed in the Job Name or User Name fields, the actual length of the Job ID is the sum of the lengths of these two fields, plus 8 (6 for the Job Number and 2 for the slash characters). The Job ID is composed of the following components:
- Job number (always 6 characters)
- forward slash (1 character)
- USERNAME (1 to 10 characters)
- forward slash (1 character)
- JOBNAME (1 to 10 characters)
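To make the capture parameters concrete, the following Python sketch mimics what a Capture Definition with "Message data from position" 70 and "Length of data string" 28 would select from the primary text. It is an illustration under stated assumptions, not LSAM code: the function names are hypothetical, the sample message text between the message ID and position 70 is a made-up placeholder, the component order in the sample follows the list above, and trimming trailing blanks is assumed.

```python
ALERT_ID = "SMA5802"
JOB_ID_START = 70   # "Message data from position" (1-based)
JOB_ID_LENGTH = 28  # "Length of data string"


def is_jobsts_alert(primary_text):
    """True when the first seven characters of the text are 'SMA5802'."""
    return primary_text[:7] == ALERT_ID


def capture_job_id(primary_text):
    """Take 28 characters starting at position 70; drop trailing blanks (assumed)."""
    start = JOB_ID_START - 1  # convert the 1-based position to a 0-based index
    return primary_text[start:start + JOB_ID_LENGTH].rstrip()


def job_id_length(job_name, user_name):
    """Job name + user name + 6-character job number + 2 slashes."""
    return len(job_name) + len(user_name) + 8


# Placeholder text: the real wording before position 70 is not reproduced here.
sample = ALERT_ID.ljust(69, ".") + "123456/QPGMR/NIGHTLY"

if is_jobsts_alert(sample):
    job_id = capture_job_id(sample)
    print(job_id, len(job_id), job_id_length("NIGHTLY", "QPGMR"))
```

Because the primary text has a fixed length before the Job ID, a fixed start position is sufficient and no scan for a label is needed, which matches the NOTE above.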
Job Status Monitor Menu
SYSTEMNAME JOB STATUS MONITOR MENU 00/00/00
USERNAME 01:01:01
Select one of the following:
1. Job Status Subsystem Management
2. Job Status Job Exclusion Management
3. Job Status Monitor Activity Log
7. System Job Status Monitor Configuration
Selection or command (C) SMA 1995,2012
===> ________________________________________________________________________
______________________________________________________________________________
F3=Exit F4=Prompt F9=Retrieve F12=Cancel
F13=Information Assistant F16=System main menu
Menu Pathways
Main Menu > LSAM management menu (#6) > Job Status Monitor menu (#9)
Field
Selection or command
Options
- 1=Job Status Subsystem Management
- 2=Job Status Job Exclusion Management
- 3=Job Status Monitor Activity Log
- 7=System Job Status Monitor Configuration
The options displayed on this menu are explained in the following Screens sections of this document. Type an option number in the Selection or command line and press <Enter> to begin using any of the options.
Functions
- F3=Exit: Returns to the master menu.
- F4=Prompt: Prompts for keywords for any command entered in the Selection or command line.
- F9=Retrieve: Retrieves the previous command that was entered in the Selection or command line. If it is pressed multiple times, the system goes further and further back through previous commands.
- F12=Cancel: Returns to the master menu.
- F13=Information Assistant: Displays the IBM i general help screen.
- F16=System main menu: This is always shown on any system-generated menu screen. It branches to the general command entry menu for IBM i. Return to the previous menu by pressing <F3> or <F12>. This function is not commonly used and can be restricted for certain user profiles.