Network Working Group                                      S. Waldbusser
Request for Comments: 3729                                    March 2004
Category: Standards Track


              Application Performance Measurement MIB

Status of this Memo

   This document specifies an Internet standards track protocol for the
   Internet community, and requests discussion and suggestions for
   improvements.  Please refer to the current edition of the "Internet
   Official Protocol Standards" (STD 1) for the standardization state
   and status of this protocol.  Distribution of this memo is unlimited.

Copyright Notice

   Copyright (C) The Internet Society (2004).  All Rights Reserved.
Abstract

   This memo defines a portion of the Management Information Base (MIB)
   for use with network management protocols in TCP/IP-based internets.
   In particular, it defines objects for measuring the application
   performance as experienced by end-users.

Table of Contents

   1. The Internet-Standard Management Framework
   2. Overview
      2.1. Report Aggregation
      2.2. AppLocalIndex Linkages
      2.3. Measurement Methodology
      2.4. Instrumentation Architectures
           2.4.1. Application Directory Caching
           2.4.2. Push Model
      2.5. Structure of this MIB Module
           2.5.1. The APM Application Directory Group
           2.5.2. The APM User Defined Applications Group
           2.5.3. The APM Report Group
           2.5.4. The APM Transaction Group
           2.5.5. The APM Exception Group
           2.5.6. The APM Notification Group
   3. Definitions
   4. Security Considerations
   5. References
      5.1. Normative References
      5.2. Informative References
   6. Author's Address
   7. Full Copyright Statement

   section 7 of RFC 3410.

   Managed objects are accessed via a virtual information store, termed
   the Management Information Base or MIB.  MIB objects are generally
   accessed through the Simple Network Management Protocol (SNMP).
   Objects in the MIB are defined using the mechanisms defined in the
   Structure of Management Information (SMI).  This memo specifies a MIB
   module that is compliant to the SMIv2, which is described in STD 58,
   RFC 2578, STD 58, RFC 2579 and STD 58, RFC 2580.

   7] by providing analysis of application performance as experienced by
   end-users.

   Application performance measurement measures the quality of service
   delivered to end-users by applications.  With this perspective, a
   true end-to-end view of the IT infrastructure results, combining the
   performance of the application, desktop, network, and server, as well
   as any positive or negative interactions between these components.

   Despite all the technically sophisticated ways in which networking
   and system resources can be measured, human end-users perceive only
   two things about an application: availability and responsiveness.

      Availability - The percentage of the time that the application is
      ready to give a user service.

      Responsiveness - The speed at which the application delivers the
      requested service.

   A transaction is an action initiated by a user that starts and
   completes a distributed processing function.  A transaction begins
   when a user initiates a request for service (e.g., pushing a submit
   button) and ends when the work is completed (e.g., information is
   provided or a confirmation is delivered).  A transaction is the
   fundamental item measured by the APM MIB.
   A failed transaction is a transaction that fails to provide the
   service requested by the end user, regardless of whether it is due to
   a processing failure or transport failure.

   An application protocol (e.g., POP3) may implement different commands
   or application "verbs" (e.g., POP3 Login and POP3 Retrieval).  It
   will often be interesting to monitor these verbs separately because:

      1) The verbs may have widely differing performance characteristics
         (in fact some may be response time oriented while others are
         throughput oriented)

      2) The verbs have varying business significance

      3) It provides more granularity of exactly what might be
         performing poorly

   This MIB Module allows the measurement of a parent application, its
   component verbs, or both.  If monitoring both, one can watch the
   top-level application and then drill down to the verbs when trouble
   is spotted to learn which subcomponents are in trouble.

   Each application verb is registered separately in the Protocol
   Directory as a child of its parent application.

   Application protocols implement one of three different types of
   transactions: transaction-oriented, throughput-oriented, or
   streaming-oriented.  While the availability metric is the same for
   all three types, the responsiveness metric varies:

      Transaction-Oriented: These transactions have a fairly constant
      workload to perform for all transactions.  In particular, to the
      degree that the workload may vary, it doesn't vary based on the
      amount of data to be transferred but based on the parameters of
      the transaction.  The responsiveness metric for
      transaction-oriented applications is application response time,
      the elapsed time between the user's request for service (e.g.,
      pushing the submit button) and the completion of the request
      (e.g., displaying the results) and is measured in milliseconds.
      This is commonly referred to as end-user response time.

      Throughput-Oriented: These transactions have widely varying
      workloads based on the amount of data requested.  The
      responsiveness metric for throughput-oriented applications is
      kilobits per second.

      Streaming-Oriented: These transactions deliver data at a constant
      metered rate of speed regardless of excess capacity in the
      networking and computing infrastructure.  However, when the
      infrastructures cannot deliver data at this speed, interruption of
      service or degradation of service can result.  The responsiveness
      metric for streaming-oriented applications is the signal quality
      ratio of time that the service is degraded or interrupted to the
      total service time.  This metric is measured in parts per million.
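
   As an informal illustration of the three responsiveness metrics
   described above, the following minimal sketch derives each one from
   raw measurements.  The function names and their inputs are
   hypothetical and are not defined by this MIB Module; a kilobit is
   assumed here to be 1000 bits.

```python
def response_time_ms(request_time_s, completion_time_s):
    """Transaction-oriented: elapsed time from request to completion, in ms."""
    return round((completion_time_s - request_time_s) * 1000)


def throughput_kbps(bytes_transferred, elapsed_s):
    """Throughput-oriented: kilobits per second (assumes 1 kilobit = 1000 bits)."""
    return bytes_transferred * 8 / 1000 / elapsed_s


def degraded_ppm(degraded_s, total_service_s):
    """Streaming-oriented: degraded or interrupted time relative to total
    service time, expressed in parts per million."""
    return round(degraded_s / total_service_s * 1_000_000)
```

   Under these assumptions, a stream that is degraded for 3 seconds of a
   one-hour session yields degraded_ppm(3, 3600), or 833 parts per
   million.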
   SuccessfulTransactions
      The total number of transactions that were successful.  The
      management station can derive the percent success by dividing
      SuccessfulTransactions by the TransactionCount.

   ResponsivenessMean
      The average of the responsiveness metric for all aggregated
      transactions that completed successfully.

   ResponsivenessMin
      The minimum responsiveness metric for all aggregated transactions
      that completed successfully.

   ResponsivenessMax
      The maximum responsiveness metric for all aggregated transactions
      that completed successfully.

   ResponsivenessBx
      The count of successful transactions whose responsiveness metric
      fell into the range specified for Bx.  There are 7 buckets
      specified.  Because the performance of different applications
      varies widely, the bucket ranges are specified separately for each
      application (in the apmAppDirTable) so that they may be tuned to
      typical performance of each application.

   For example, when aggregating the previous set of transactions by
   application we get (for simplicity the example only shows
   TransactionCount, SuccessfulTransactions, and ResponsivenessMean):

      Application   Count   Successful   ResponsivenessMean
      HTTP              4            3   12 sec.
      SAP/R3            1            1   17 sec.
      FTP               1            1   212 Kbps.
      RealVideo         1            1   100.0%

   There are four different types of aggregation.

   The flows(1) aggregation is the simplest.  All transactions that
   share common application/server/client 3-tuples are aggregated
   together, resulting in a set of metrics for all such unique 3-tuples.

   The clients(2) aggregation results in somewhat more aggregation
   (i.e., fewer resulting records).  All transactions that share common
   application/client tuples are aggregated together, resulting in a set
   of metrics for all such unique tuples.
   The servers(3) aggregation usually results in still more aggregation
   (i.e., fewer resulting records).  All transactions that share common
   application/server tuples are aggregated together, resulting in a set
   of metrics for all such unique tuples.

   The applications(4) aggregation results in the most aggregation
   (i.e., the fewest resulting records).  All transactions that share a
   common application are aggregated together, resulting in a set of
   metrics for all such unique applications.

   For example, if in a 5 minute period the following transactions
   occurred:

   Actual Transactions:
      #  App     Client  Server   Successful  Responsiveness
      1  HTTP    Jim     CallCtr  N           -
      2  HTTP    Jim     HR       Y           12 sec.
      3  HTTP    Jim     Sales    Y           7 sec.
      4  HTTP    Jim     CallCtr  Y           5 sec.
      5  Email   Jim     Pop3     Y           12 sec.
      6  HTTP    Jane    CallCtr  Y           3 sec.
      7  SAP/R3  Jane    Finance  Y           19 sec.
      8  Email   Jane    Pop3     Y           16 sec.
      9  HTTP    Joe     HR       Y           18 sec.

   The flows(1) aggregation results in the following table.  Note that
   the first record (HTTP/Jim/CallCtr) is the aggregation of
   transactions #1 and #4:

   Flow Aggregation:
      App     Client  Server   Count  Succe-  Rsp   Rsp  Rsp  RspB1  RspB2
                                      ssful   Mean  Min  Max
      HTTP    Jim     CallCtr  2      1       5     5    5    1      0
      HTTP    Jim     HR       1      1       12    12   12   0      1
      HTTP    Jim     Sales    1      1       7     7    7    1      0
      Email   Jim     Pop3     1      1       12    12   12   0      1
      HTTP    Jane    CallCtr  1      1       3     3    3    1      0
      SAP/R3  Jane    Finance  1      1       19    19   19   0      1
      Email   Jane    Pop3     1      1       16    16   16   0      1
      HTTP    Joe     HR       1      1       18    18   18   0      1

   (Note: Columns above such as RspMean and RspB1 are abbreviations for
   objects in the apmReportTable)

   The clients(2) aggregation results in the following table.  Note that
   the first record (HTTP/Jim) is the aggregate of transactions #1, #2,
   #3 and #4:
   Client Aggregation:
      App     Client  Count  Succe-  Rsp   Rsp  Rsp  RspB1  RspB2 ...
                             ssful   Mean  Min  Max
      HTTP    Jim     4      3       8     5    12   2      1
      Email   Jim     1      1       12    12   12   0      1
      HTTP    Jane    1      1       3     3    3    1      0
      SAP/R3  Jane    1      1       19    19   19   0      1
      Email   Jane    1      1       16    16   16   0      1
      HTTP    Joe     1      1       18    18   18   0      1

   The servers(3) aggregation results in the following table.  Note that
   the first record (HTTP/CallCtr) is the aggregation of transactions
   #1, #4 and #6:

   Server Aggregation:
      App     Server   Count  Succe-  Rsp   Rsp  Rsp  RspB1  RspB2 ...
                              ssful   Mean  Min  Max
      HTTP    CallCtr  3      2       4     3    5    2      0
      HTTP    HR       2      2       15    12   18   0      2
      HTTP    Sales    1      1       7     7    7    1      0
      Email   Pop3     2      2       14    12   16   0      2
      SAP/R3  Finance  1      1       19    19   19   0      1

   The applications(4) aggregation results in the following table.  Note
   that the first record (HTTP) is the aggregate of transactions #1, #2,
   #3, #4, #6 and #9:

   Application Aggregation:
      App     Count  Succe-  Rsp   Rsp  Rsp  RspB1  RspB2 ...
                     ssful   Mean  Min  Max
      HTTP    6      5       9     3    18   3      2
      Email   2      2       14    12   16   0      2
      SAP/R3  1      1       19    19   19   0      1

   The apmReportControlTable provides for a historical set of the last
   'X' reports, combining the historical records found in history tables
   with the periodic snapshots found in TopN tables.  Conceptually the
   components are:

   apmReportControlTable
      Specifies data collection and summarization parameters, including
      the number of reports to keep and the size of each report.

   apmReport
      Each APM Report contains an aggregated list of records that
      represent data collected during a specific time period.
   An apmReportControlEntry causes a family of APM Reports to be
   created, where each report summarizes different, successive,
   contiguous periods of time.  While the conceptual model of APM
   Reports shows them as distinct entities, they are all entries in a
   single apmReportTable, where entries in report 'A' are separated from
   entries in report 'B' by different values of the apmReportIndex.

   [Figure: the apmReportControlTable linked to a stack of apmReports.
   One report, labeled "Thu Mar 30 12-1PM", is expanded to show its
   records:]

      CLNT  SERV  PROT   stats
      Joe   News  HTTP   data
      Jan   POP   POP3   data
      Jan   POP   SMTP   data
      Bob   HR    PSOFT  data
      ...
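
   The four aggregation types in the example earlier in this section
   can be reproduced with a short sketch.  This is only an illustration
   under stated assumptions: the tuple layout, the function name, and
   the 10 second boundary between buckets B1 and B2 are hypothetical
   (the boundary is merely consistent with the example tables), and
   only two of the seven buckets are modeled.

```python
from collections import defaultdict

# The nine example transactions:
# (app, client, server, successful, responsiveness in seconds or None)
TRANSACTIONS = [
    ("HTTP", "Jim", "CallCtr", False, None),
    ("HTTP", "Jim", "HR", True, 12),
    ("HTTP", "Jim", "Sales", True, 7),
    ("HTTP", "Jim", "CallCtr", True, 5),
    ("Email", "Jim", "Pop3", True, 12),
    ("HTTP", "Jane", "CallCtr", True, 3),
    ("SAP/R3", "Jane", "Finance", True, 19),
    ("Email", "Jane", "Pop3", True, 16),
    ("HTTP", "Joe", "HR", True, 18),
]

# Tuple positions that form the grouping key for each aggregation type.
KEY_FIELDS = {
    "flows": (0, 1, 2),       # application/client/server
    "clients": (0, 1),        # application/client
    "servers": (0, 2),        # application/server
    "applications": (0,),     # application only
}

B1_UPPER_SEC = 10  # assumed B1/B2 boundary, chosen to match the example


def aggregate(transactions, aggregation):
    """Group transactions by the key for `aggregation` and summarize them."""
    groups = defaultdict(list)
    for txn in transactions:
        key = tuple(txn[i] for i in KEY_FIELDS[aggregation])
        groups[key].append(txn)

    report = {}
    for key, txns in groups.items():
        # Responsiveness statistics cover successful transactions only.
        times = [t[4] for t in txns if t[3]]
        report[key] = {
            "count": len(txns),
            "successful": len(times),
            "mean": sum(times) / len(times) if times else None,
            "min": min(times) if times else None,
            "max": max(times) if times else None,
            "b1": sum(1 for r in times if r < B1_UPPER_SEC),
            "b2": sum(1 for r in times if r >= B1_UPPER_SEC),
        }
    return report
```

   With these inputs, aggregate(TRANSACTIONS, "clients") yields six
   records, and the ("HTTP", "Jim") record has a count of 4, 3
   successful transactions, and a mean of 8, matching the Client
   Aggregation table.  A management station could derive the percent
   success for a record as successful divided by count.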
   Assuming the following entries in the RMON2 protocolDirectory:

      protocolDirectory
      ID (*)    Parameters  |  LocalIndex ...
      WWW       None        |  1
      WWW Get   None        |  2
      SAP/R3    None        |  3

      (*) These IDs are represented here symbolically.  Consult the
          RMON2 protocolDirectory documentation for more detail on
          their format.

   and the following entry in the apmHttpFilterTable:

      ApmHttpFilterTable
      Index  |  AppLocalIndex  ServerAddress   URLPath   MatchType ...
      5      |  20             hr.example.com  /expense  prefix(3) ...

   the apmAppDirTable would be populated with the following entries:

      apmAppDir
      AppLocalIndex  ResponsivenessType  |  Config ...
      1              transaction(1)      |  On ...
      1              throughput(2)       |  On ...
      2              transaction(1)      |  On ...
      2              throughput(2)       |  On ...
      3              transaction(1)      |  On ...
      20             transaction(1)      |  On ...
      20             throughput(2)       |  On ...

   The entries in the apmAppDirTable with an appLocalIndex of 1, 2 and 3
   correspond to the identically named entries in the protocolDirectory
   table.  appLocalIndex #1 results in 2 entries, one to measure the
   transaction responsiveness of WWW and one to measure its throughput
   responsiveness.  In contrast, appLocalIndex #3 results in only a
   transaction entry because the agent does not measure the throughput
   responsiveness for SAP/R3 (probably because it isn't very
   meaningful).  Finally, appLocalIndex #20 corresponds to the entry in
   the apmHttpFilterTable and has transaction responsiveness and
   throughput responsiveness measurements available.

   If a report was configured using application aggregation, entries in
   that report might look like:
      apmReportTable
      CtlIndex  Index  AppLocalIdx  ResponsivenessType  |  TransactionCount ...
      1         1      1            transaction(1)      |  counters...
      1         1      1            throughput(2)       |  counters...
      1         1      2            transaction(1)      |  counters...
      1         1      2            throughput(2)       |  counters...
      1         1      3            transaction(1)      |  counters...
      1         1      20           transaction(1)      |  counters...
      1         1      20           throughput(2)       |  counters...

   Note that the index items protocolDirLocalIndex,
   apmReportServerAddress and apmReportClientID were omitted from the
   apmReportTable example for brevity because they would have been equal
   to zero due to the use of the application aggregation in this
   example.
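
   The linkage in the example above can be sketched as follows.  The
   data structures and the function name are hypothetical, and the
   supported_types mapping stands in for whatever the agent actually
   knows about which responsiveness types it can measure for each
   application; it is an assumption of this sketch, not something
   defined by the MIB.

```python
TRANSACTION, THROUGHPUT = 1, 2  # transaction(1), throughput(2)

# protocolDirLocalIndex -> symbolic protocolDirectory ID
protocol_dir = {1: "WWW", 2: "WWW Get", 3: "SAP/R3"}

# apmHttpFilterTable index -> the filter entry from the example
http_filters = {
    5: {"app_local_index": 20, "server": "hr.example.com",
        "url_path": "/expense", "match_type": "prefix"},
}

# Assumed agent capability: responsiveness types measured per application.
supported_types = {
    1: (TRANSACTION, THROUGHPUT),
    2: (TRANSACTION, THROUGHPUT),
    3: (TRANSACTION,),            # no throughput measurement for SAP/R3
    20: (TRANSACTION, THROUGHPUT),
}


def build_app_dir(protocol_dir, http_filters, supported_types):
    """Return the (appLocalIndex, responsivenessType) index pairs that
    would populate the application directory."""
    indexes = set(protocol_dir)
    indexes.update(f["app_local_index"] for f in http_filters.values())
    return [(idx, rtype)
            for idx in sorted(indexes)
            for rtype in supported_types[idx]]
```

   Called with the values above, build_app_dir returns seven index
   pairs, one per row of the apmAppDir example.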
thousands of APM agents, this Application Directory will be the same on many, if not all of the agents. Repeated downloads of the Application Directory may be inefficient. The apmAppDirID object is a single object that identifies the configuration of all aspects of the Application Directory when it is equal to a well-known, registered configuration. Thus, when a manager sees an apmAppDirID value that it recognizes, it need not download the Application Directory from that agent. In fact, the manager may discover a new registered Application Directory configuration on one agent and then re-use that configuration on another agent that shares the same apmAppDirID value. Application directory registrations are unique within an administrative domain, allowing an administrator to create a custom application directory configuration without the need to assign it a globally-unique registration.
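
   A manager-side cache keyed by apmAppDirID might look like the
   following minimal sketch.  The class name, the download callable,
   and the use of None to represent an agent whose configuration does
   not match any registered one are all hypothetical and chosen only
   for illustration.

```python
class AppDirCache:
    """Caches Application Directory contents by apmAppDirID value."""

    def __init__(self):
        self._directories = {}

    def get(self, app_dir_id, download):
        """Return the directory for an agent.

        `download` is a caller-supplied function that fetches the full
        Application Directory from the agent.  It is invoked only when
        the ID is unregistered (None here) or not yet cached.
        """
        if app_dir_id is None:
            return download()
        if app_dir_id not in self._directories:
            self._directories[app_dir_id] = download()
        return self._directories[app_dir_id]
```

   With such a cache, a manager polling many agents that report the
   same apmAppDirID downloads the directory once and reuses it for the
   rest.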
   APM Reports

      If an agent wishes to push APM reports to a manager, it must
      send:

         apmAppDirID
         apmNameTable (any data updated since the last push)

      For each report the agent wishes to upload, it must send the
      entire apmReportControlEntry associated with that report and the
      associated entries in the apmReportTable that have changed since
      the last report.

   APM Transactions

      If an agent wishes to push APM transactions to a manager, it must
      send:

         apmAppDirID
         apmNameTable (any data updated since the last push)
         apmTransactionTable (relevant entries)

   APM Exceptions

      The agent must send:

         apmAppDirID
         apmNameTable (any data updated since the last push)
         apmTransactionEntry (of exception transaction)
         apmExceptionEntry (entry that generated exception)

      [Note that this list supersedes the information in the OBJECTS
      clauses of the apmTransactionResponsivenessAlarm and
      apmTransactionUnsuccessfulAlarm when the agent is using a push
      model.  This additional information eliminates the need for the
      manager to request additional data to understand the exception.]

   The order of varbinds and where to segment varbinds into PDUs is at
   the discretion of the agent.
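
   The requirement to send only the apmNameTable data updated since the
   last push can be sketched on the agent side as follows.  The class,
   its methods, and the dictionary payload are hypothetical; a real
   agent would encode the same information as varbinds in one or more
   PDUs, in whatever order and segmentation it chooses.

```python
class PushState:
    """Tracks which name-table entries changed since the last push."""

    def __init__(self, app_dir_id):
        self.app_dir_id = app_dir_id
        self.names = {}       # name-table key -> current value
        self.changed = set()  # keys modified since the last push

    def set_name(self, key, value):
        """Record a name, marking it for the next push only if it changed."""
        if key not in self.names or self.names[key] != value:
            self.names[key] = value
            self.changed.add(key)

    def build_transaction_push(self, transaction_entries):
        """Assemble the items listed above for an APM Transactions push."""
        payload = {
            "apmAppDirID": self.app_dir_id,
            "apmNameTable": {k: self.names[k] for k in sorted(self.changed)},
            "apmTransactionTable": list(transaction_entries),
        }
        self.changed.clear()  # the next push carries only newer updates
        return payload
```

   The same bookkeeping would apply to report and exception pushes,
   with the payload extended by the entries listed for each case.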
These groups are the basic unit of conformance. If an agent implements a group, then it must implement all objects in that group. While this section provides an overview of grouping and conformance information for this MIB Module, the authoritative reference for such information is contained in the MODULE-COMPLIANCE and OBJECT-GROUP macros later in this MIB Module. These groups are defined to provide a means of assigning object identifiers, and to provide a method for implementors of managed agents to know which objects they must implement.
some time into the performance of long-lived transactions such as streaming applications, large data transfers, or (very) poorly performing transactions. In fact, by their very definition, the apmReport and apmException mechanisms only provide visibility into a problem after nothing can be done about it. This group consists primarily of the apmTransactionTable.