RFC 4150

Transport Performance Metrics MIB

Pages: 57
Proposed Standard

Network Working Group                                           R. Dietz
Request for Comments: 4150                                    Hifn, Inc.
Category: Standards Track                                        R. Cole
                                                             August 2005

                   Transport Performance Metrics MIB

Status of This Memo

   This document specifies an Internet standards track protocol for the
   Internet community, and requests discussion and suggestions for
   improvements.  Please refer to the current edition of the "Internet
   Official Protocol Standards" (STD 1) for the standardization state
   and status of this protocol.  Distribution of this memo is unlimited.

Copyright Notice

   Copyright (C) The Internet Society (2005).


Abstract

This memo defines a portion of the Management Information Base (MIB) for use with network management protocols in the Internet community. In particular, it describes managed objects used for monitoring selectable performance metrics and statistics derived from the monitoring of network packets and sub-application level transactions. The metrics can be defined through reference to existing IETF, ITU, and other standards organizations' documents. The monitoring covers both passive and active traffic generation sources.

Table of Contents

   1. The Internet-Standard Management Framework
   2. Overview
      2.1. Terms
      2.2. Report Aggregation
      2.3. Structure of the MIB
      2.4. Statistics for Aggregation of Data: Conventions
      2.5. Relationship to the Remote Monitoring MIB
      2.6. Relationship to RMON2-MIB Protocol Identifier Reference
      2.7. Relationship to Standards-Based Performance Metrics
      2.8. Relationship to Application Performance Measurement MIB
   3. Statistics Perspective
      3.1. Statistics Structure
      3.2. Statistics Analysis
   4. Definitions
   5. Acknowledgements
   6. Security Considerations
   7. Normative References
   8. Informative References

1. The Internet-Standard Management Framework

For a detailed overview of the documents that describe the current Internet-Standard Management Framework, please refer to section 7 of RFC 3410 [RFC3410]. Managed objects are accessed via a virtual information store, termed the Management Information Base or MIB. MIB objects are generally accessed through the Simple Network Management Protocol (SNMP). Objects in the MIB are defined using the mechanisms defined in the Structure of Management Information (SMI). This memo specifies a MIB module that is compliant to the SMIv2, which is described in STD 58, RFC 2578 [RFC2578], STD 58, RFC 2579 [RFC2579] and STD 58, RFC 2580 [RFC2580].

2. Overview

This document continues the architecture created in the RMON2-MIB [RFC2021] by providing a major feature upgrade, primarily by providing new metrics and studies to assist in the analysis of performance for sub-application transaction flows in the network, in direct relationship to the transport of application layer protocols. Performance-monitoring agents have been widely used to analyze the parameters and metrics related to the perceived performance of distributed applications and services in networks. The metrics collected by these agents have ranged from basic response time to a
   combination of metrics related to the loss and re-transmission of
   datagrams and PDUs.  Although the metrics are becoming more useful in
   the implementation of service-level monitoring and troubleshooting
   tools, the lack of a standard method to report these has limited the
   deployment to very specific customer needs and areas.

   This document is intended to create a general framework for the
   collection and reporting of performance-related metrics on sub-
   application level transaction flows in a network.  The MIB in this
   document is directly linked to the current RMON2-MIB [RFC2021], and
   uses the Protocol Directory as a key component in reporting the
   layering involved in the sub-application level transaction flows.

   The specific objectives of this document are to:

      + Provide a drill-down capability to complement the user-perceived
        monitoring defined within the Application Performance
        Measurement MIB (APM-MIB) [RFC3729].  This capability is
        intended to support trouble resolution, further characterization
        of performance, and a finer granularity of monitoring
        capabilities.  The APM-MIB provides a method for retrieving
        aggregated measurement data of the end-user's perception of
        application-level performance.  APM additionally provides
        thresholding and associated alarms if the end-user perceived
        performance degrades below defined thresholds.  The Transport
        Performance Metrics MIB (TPM-MIB) complements the APM-MIB
        capabilities by monitoring sub-application level transaction
        aspects not typically perceived by the end-user.  As an example,
        APM-MIB provides response time statistics of a typical web-
        browser application.  This application typically consists of DNS
        transactions, TCP connection establishment (or multiple
        establishments), HTTP download of the base page, and multiple
        downloads of the various embedded objects.  Ideally, TPM-MIB
        would provide statistics on the performance aspects of these
        multiple sub-application level transactions.

      + Provide additional performance metrics and related statistics.
        For troubleshooting and a finer granularity of performance
        monitoring, it is useful to provide measurements of additional
        metrics beyond those supported by the APM-MIB.

      + Support standards-based metrics and associated statistical
        aggregation by defining methods to reference those standards.
        The TPM-MIB provides a capability to describe metrics by
        reference to appropriate IETF, ITU, or other standards bodies
        defining metrics, including enterprise-specific standards
        bodies.  This capability is provided through the
        Specifically, this MIB itself does not make references to metric
        specifications of the IETF, ITU and other organizations.
        Instead, it allows for the setup of the tpmMetricDefTable that
        does reference such IETF, ITU, and other metric specifications,
        and it allows pointers to such specifications to be dynamically
        listed in this table.  The following objects allow for that, and
        the DESCRIPTION clauses (of the objects below) explain how this
        is done:

               tpmMetricDefName OBJECT-TYPE
               tpmMetricDefReference OBJECT-TYPE
               tpmMetricDefGlobalID OBJECT-TYPE

        The tpmMetricDefGlobalID object contains a reference to the
        Object ID in a metrics registration MIB being developed in the
        IP Performance Metrics (IPPM) Working Group at the IETF; e.g.,
        the IPPM-REGISTRY-MIB [RFC4148], which defines the metric.  For
        metrics defined within the IPPM Working Group, which are
        included in the IPPM-REGISTRY-MIB, this object is used to
        reference those metrics directly.  For metrics not included
        within the IPPM-REGISTRY-MIB, the value of this object is set
        to 0.0 for none.

        Examples of appropriate references include the ITU-T
        Recommendation Y.1540 [Y.1540] on IP packet transfer performance
        metrics, and the IETF documents from the IPPM WG; e.g., RFC 2681
        on the round trip delay metric [RFC2681] or RFC3393 on the delay
        variation metric [RFC3393].  Others include RFC 2679 [RFC2679],
        RFC2680 [RFC2680], and RFC3432 [RFC3432].  Although no specific
        metric is mandatory, implementations should, at a minimum,
        support a round-trip delay and a round-trip loss metric.

      + Provide (as an option) a table storing the measurements of the
        metrics on a transaction by transaction basis.  There are times
        when it is useful to have access to the raw measurements.  The
        tpmCurReportTable optionally provides access to this capability.

   Although this document outlines the basic measurements of performance
   in regard to the transport of application flows, it does not attempt
   to measure or provide a means to measure the actual perceived
   performance of the application transactions or quality.  The detailed
   measurements of end-user-perceived performance are directly related
   to this document and may be found in the APM-MIB [RFC3729].

   The objects defined in this document are intended as an interface
   between an RMON agent and an RMON management application and are not
   intended for direct manipulation by humans.  Although some users may
   tolerate the direct display of some of these objects, few will
   tolerate the complexity of manually manipulating objects to
   accomplish row creation.  These functions should be handled by the
   management application.

2.1. Terms

   This document uses some terms that need introduction:

   DataSource
      A source of data for monitoring purposes.  This term is used
      exactly as defined in the RMON2-MIB [RFC2021].

   protocol
      A specific protocol encapsulation, as identified for monitoring
      purposes.  This term is used exactly as defined in the RMON
      Protocol Identifiers document [RFC2895].

   performance metric
      A specific, measured reporting metric, as identified for
      monitoring purposes.  There can be several metrics reported by an
      agent in the same implementation.  The metrics are extensible
      based on the agent implementation.

   application
      A network-based, high-level protocol performing useful work for an
      end-user of an end-system.  Typically, the application performs
      multiple request/response transactions to complete its work.
      E.g., a web application downloading a web page completes DNS,
      TCP-connect, and multiple HTTP GET transactions prior to
      completing its task.

   transactions
      Elemental request/response transactions comprising more complex
      network-based applications.  E.g., a transaction may include an
      FTP get request and the file download in response.

2.2. Report Aggregation

This MIB module provides functions that aggregate measurements into higher-level summaries identical to the aggregation defined in the APM-MIB [RFC3729]. In addition to temporal aggregation of data, this module imports the Textual Convention TransactionAggregationType from the APM-MIB; this convention specifies the nature of the spatial aggregation employed.

2.3. Structure of the MIB

   The objects are arranged in the following groups:

      -- tpmCapabilitiesGroup
      -- tpmAggregateReportsGroup
      -- tpmCurrentReportsGroup
      -- tpmExceptionReportsGroup

   These groups are the basic units of conformance.  If an agent
   implements a group, then it must implement all objects in that
   group.  Although this section provides an overview of grouping and
   conformance information for this MIB module, the authoritative
   reference for such information is contained in the MODULE-COMPLIANCE
   and OBJECT-GROUP macros later in this MIB module.

   These groups are defined to provide a means of assigning object
   identifiers, and to provide a method for implementers of managed
   agents to know which objects they must implement.

2.3.1. The tpmCapabilitiesGroup

The tpmCapabilitiesGroup contains objects and tables that show the measurement protocol and metric capabilities of the agent. This group primarily consists of the tpmTransMetricDirTable and the tpmMetricDefTable.

2.3.2. The tpmAggregateReportsGroup

The tpmAggregateReportsGroup is used to provide the collection of aggregated statistical measurements for the configured report intervals. The tpmAggregateReportsGroup consists of the tpmAggrReportCntrlTable and the tpmAggrReportTable.

2.3.3. The tpmCurrentReportsGroup

The tpmCurrentReportsGroup is used to provide the collection of uncompleted measurements for the current configured report for those transactions caught in progress. A history of these transactions is also maintained once the current transaction has been completed. The tpmCurrentReportsGroup consists of the tpmCurReportTable and the tpmCurReportSize object.

2.3.4. The tpmExceptionReportsGroup

The tpmExceptionReportsGroup is used to link immediate notifications of transactions that exceed certain thresholds defined in the apmExceptionGroup [RFC3729]. This group reports the aggregated sub-application measurements for those applications exceeding thresholds. The tpmExceptionReportsGroup consists of the tpmExcpReportTable.

2.4. Statistics for Aggregation of Data: Conventions

In order to measure the performance of traffic flows in a network, the proper analysis of a set of statistics is required. Because a large majority of the statistics have a basis of time, the use of a simple statistical model is feasible. Therefore, the MIB definitions within this document all use a basic set of computed statistical values to assist in further analysis by a management application. The remaining subsections in this section detail the common structured features that are applied to the performance metrics in the statistical format described above. The tpmMetricDefTable (discussed below) describes the set of metrics supported in this MIB module.

2.5. Relationship to the Remote Monitoring MIB

This document describes the implementation of an additional MIB for the support of performance-related metrics within the framework of the RMON2-MIB [RFC2021]. The objects and tables defined in this MIB module are an extension to the existing framework for the support of both Client/Server and Server push-related applications and services.

2.6. Relationship to RMON2-MIB Protocol Identifier Reference

This document uses the Protocol Identifiers outlined in the current Protocol Identifier Reference document, RFC 2895 [RFC2895]. The protocol index values throughout the document are a direct reference to the same relationship that exists between the RMON2-MIB [RFC2021] and the Protocol Identifier Reference document, RFC 2895 [RFC2895]. An important extension of the Protocol Identification to application-level verbs is found in RFC 3395 [RFC3395].

2.7. Relationship to Standards-Based Performance Metrics

This document uses the tpmMetricDefTable to describe the metrics supported by an instance of the TPM-MIB. The performance metric index values throughout the document are a direct reference to the
   metrics defined in that table.  The table defines metrics by directly
   referencing other standards that provide definitive descriptions of
   the metric.

2.8. Relationship to Application Performance Measurement MIB

This document uses the apmReportControlIndex, appLocalIndex, and apmReportIndex, as outlined in the current Application Performance Measurement MIB [RFC3729]. These objects are used to create a reference link for the purpose of reporting transaction flow details on application-level measurements. As such, the TPM-MIB is designed to provide a drill-down extension to the APM-MIB. Further, it draws heavily on the ideas and designs laid out in the APM-MIB.

3. Statistics Perspective

   When dealing with time-based measurements on application data
   packets, ideally all the timestamps and related data could be stored
   and forwarded for later analysis.  However, when faced with thousands
   of conversations per second on ever-faster networks, storing all the
   data, even if compressed, would take too much processing, memory, and
   manager download time to be practical.

   It is important to note that in dealing with network data we will be
   dealing with statistical populations and not samples.  Statistics
   books deal with both because the math is similar.  In collecting
   agent data, a population (i.e., all the data) must be processed.
   Because of the nature of application protocols, just sampling some of
   the packets will not give good results.  Missing just one critical
   packet, such as one that specified an ephemeral port on which data
   will be transmitted or what application will be run, can cause much
   valid data to be lost.

   The time-based measurements the agent collects will come from
   examining the entire group of data, i.e., the population.  The
   population will be finite.  The agent will seek only to provide
   information that will describe the actual data.  Analysis of that
   data will be left to the management station.

   The simplest form of representing a group of data is by frequency
   distributions, i.e., buckets.  Statistics provides a great many ways
   of analyzing this type of data, and there are some rules in creating
   the buckets.  First, the range needs to be known.  Second, a bucket
   size needs to be determined.  Fixed bucket sizes are best, although
   variable may be used if needed.  However, the statistics texts tend
   only to refer to operations of fixed-size buckets.

   This method of describing data is expensive for an agent to
   implement.  First, the
   agent must process a great amount of data at a time.  Storing the
   data, determining the range, locating the buckets, and then filling
   in the data after the fact takes a fair amount of storage and time.
   Fixing the range and bucket sizes in the beginning can be
   problematic, as the agent may have to adjust the values for each of
   the applications it collects data on.  Such numbers can be in the
   thousands.  Additional complexity arises in adding new protocols and
   even in describing the buckets themselves to the management
   application.  This is the approach taken in the APM-MIB.

   A complementary approach is to provide frequency distribution
   statistics.  They describe aggregation such as mean and standard
   deviation that can be obtained by summation functions on the
   individual data elements in a population.  Analysis of the data
   described by these functions has been thoroughly studied, and
   interpretation of these values is available to anyone with an
   introduction to statistics.  In fact, frequency distributions are
   routinely analyzed to generate these varied numbers, which are then
   used for further analysis.  Note that statistics computed directly by
   summation-type formulas characterize the data exactly, whereas
   buckets introduce error factors that are not present with such
   direct analysis.  Because the TPM-MIB provides a drill-down
   capability to the APM-MIB, it has to measure
   and store much more information than the APM-MIB.  For this reason,
   and in order to complement the APM-MIB, the TPM-MIB relies on
   statistical descriptions rather than a bucket description of the
   measurement data.
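
   The difference between the two descriptions can be illustrated with
   a small sketch.  It is not part of the MIB; the sample values, the
   20-unit bucket width, and the midpoint estimate are all assumptions
   chosen only to show how a bucketed description can lose precision
   that running sums retain.

```python
# Hypothetical response-time measurements (made-up values, arbitrary units).
samples = [12, 17, 23, 48, 51, 95]

# Summation approach: only the running totals need to be kept.
n = len(samples)
sum_x = sum(samples)
exact_mean = sum_x / n

# Bucket approach: fixed-size buckets (width is an assumption); only the
# per-bucket counts survive, the individual values are discarded.
BUCKET_WIDTH = 20
buckets = {}
for x in samples:
    index = x // BUCKET_WIDTH
    buckets[index] = buckets.get(index, 0) + 1

# With counts alone, a manager can only assume each value sat at the
# midpoint of its bucket, which introduces an estimation error.
bucket_mean = sum((i + 0.5) * BUCKET_WIDTH * c for i, c in buckets.items()) / n

print("mean from sums:   ", exact_mean)
print("mean from buckets:", bucket_mean)
```

   In this made-up data set the two means differ by one unit; the size
   of the error in general depends on the bucket width and on how the
   values fall within each bucket.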

   The agent will provide data that can be used to calculate the most
   basic and useful statistical aggregates.  The agent will not perform
   the calculations and will not provide the statistical measurement
   directly.  There are several reasons why this is not desired.  The
   first is that finding the final measurement can be expensive in terms
   of computation and representation.  There are divisions and square
   roots, and the measurements are expressed as floating point values.
   The second is that by providing the variables to the statistical
   functions, those variables are scalable.  It is possible to combine
   smaller intervals into larger ones.

   An example is the arithmetic mean or average.  This is the sum of the
   data divided by the number of data elements.  The agent will provide
   the sum of the x and the number of elements N.  The management
   station can perform the division to obtain the average.  Given two
   samples, they can be combined by adding the sum of the x's and by
   adding the number of elements to get a combined sum and number of
   elements.  The average formula then works just the same.  Also, the
   sum of the x and the number of element variables are used in
   calculating other statistical measurement values.
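
   The paragraph above can be sketched in a few lines.  This is an
   illustrative example only: the function name and the interval values
   are hypothetical, and the code stands in for whatever a management
   station would actually do with the retrieved sums and counts.

```python
def mean(sum_x, n):
    """Arithmetic mean from the two values the agent reports."""
    return sum_x / n


# Two adjacent intervals as (sum of x, number of elements); made-up values.
first = (300, 4)
second = (500, 6)

# Combine by adding the sums and adding the counts, then divide once.
combined_sum = first[0] + second[0]
combined_n = first[1] + second[1]
combined_mean = mean(combined_sum, combined_n)

# Averaging the two per-interval means would weight them incorrectly.
naive_mean = (mean(*first) + mean(*second)) / 2

print(combined_mean, naive_mean)
```

   Keeping the sum and the count separate is what makes the intervals
   combinable; once the division has been performed, the weighting
   information is lost.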

3.1. Statistics Structure

   The data statistical elements (datum) of the metric have been chosen
   to maximize the amount of data available while minimizing the amount
   of memory needed to store the statistic and the CPU processing
   needed to generate the statistic.  The statistic data structure
   contains six unsigned integer datum:

      N        count of the number of data points for the metric
      S(X)     sum of all the data point values for the metric
      S(X2)    sum of the squares of all the data point values for the
               metric
      Xmax     maximum data point value for the metric
      Xmin     minimum data point value for the metric
      S(I*X)   sum of the data points multiplied by their order,
               i.e., SUM from i=1 to N { i * X sub i }

   A performance metric is used to describe events over a time
   interval.  The measurement points can be processed immediately into
   the statistic and do not have to be stored for later processing.
   For example, to count the number of events in a time interval, it is
   sufficient to increment a counter for each event.  It is not
   necessary to cache all the events and then to count them at the end
   of the interval.

   The statistic is also designed to be easily scalable in terms of
   combining adjacent intervals.  For example, if an agent created a
   specific statistic every 30 seconds and a user table interval was
   set to 60 seconds, the 60-second statistic could be obtained by
   combining the two 30-second statistics.  The following rules will be
   applied when combining adjacent statistics.

      Datum    Combined value
      N        S(N)
      S(X)     S(S(X))
      S(X2)    S(S(X2))
      Xmax     MAX(Xmax)
      Xmin     MIN(Xmin)
      S(I*X)   S(I*X) + N*S(X) + S(I*X)

   where, in the rule for S(I*X), the first term is the statistic from
   the former 30-second period, the last two terms refer to the
   statistics from the later 30-second period, and N is the count from
   the former 30-second period.

   This structure gives a generic framework upon which the actual
   performance statistics will be defined.  Each specific statistical
   definition must address the specific significance, if any, given to
   each metric datum.  While a specific metric definition should try to
   conform to the generic framework, it is acceptable for a metric
   datum to not be used, and to have no meaning, for a specific metric.
   In such cases the datum will default to a 0 value.
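
   The structure and the combination rules above can be sketched as
   follows.  This is a minimal illustration under stated assumptions,
   not the MIB's own definition: the class name, field names, and the
   handling of empty intervals are hypothetical, and Python integers
   stand in for the unsigned counters an agent would use.

```python
from dataclasses import dataclass, replace


@dataclass
class Statistic:
    """Hypothetical in-memory form of the six datum described above."""
    n: int = 0        # N
    sum_x: int = 0    # S(X)
    sum_x2: int = 0   # S(X2)
    x_max: int = 0    # Xmax (meaningful only when n > 0)
    x_min: int = 0    # Xmin (meaningful only when n > 0)
    sum_ix: int = 0   # S(I*X)

    def add(self, x: int) -> None:
        """Fold one data point into the statistic without storing it."""
        self.n += 1
        self.sum_x += x
        self.sum_x2 += x * x
        self.x_max = x if self.n == 1 else max(self.x_max, x)
        self.x_min = x if self.n == 1 else min(self.x_min, x)
        self.sum_ix += self.n * x  # order index of this point is the new N


def combine(former: Statistic, later: Statistic) -> Statistic:
    """Combine two adjacent intervals using the rules listed above."""
    if former.n == 0:  # assumption: an empty interval contributes nothing
        return replace(later)
    if later.n == 0:
        return replace(former)
    return Statistic(
        n=former.n + later.n,
        sum_x=former.sum_x + later.sum_x,
        sum_x2=former.sum_x2 + later.sum_x2,
        x_max=max(former.x_max, later.x_max),
        x_min=min(former.x_min, later.x_min),
        # Later points are shifted by the former count N in the ordering.
        sum_ix=former.sum_ix + former.n * later.sum_x + later.sum_ix,
    )


def build(values):
    """Helper for the example: accumulate a list of made-up values."""
    stat = Statistic()
    for value in values:
        stat.add(value)
    return stat


a = build([5, 9, 2])          # former 30-second interval
b = build([7, 4])             # later 30-second interval
merged = combine(a, b)        # 60-second statistic from the two halves
whole = build([5, 9, 2, 7, 4])
print(merged == whole)
```

   The N*S(X) term accounts for the fact that each point in the later
   interval moves N positions further along in the combined ordering,
   which is why the merged statistic matches one accumulated over the
   whole interval.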

3.2. Statistics Analysis

   The actual meaning of a specific statistical datum is determined by
   the definition of the specific statistic.  The following is a
   discussion of the operations and observations that can be performed
   on a generic metric.  This means that the following may or may not
   apply and/or have meaning when applied to any specific metric.

   The following observations and analysis techniques are not
   all-inclusive.  Rather, these are the ones identified at the time of
   writing this document.

      + Number.
      + Frequency.
      + The time interval is that specified in the control table.  It
        is not a metric datum, but it is associated with the metric
        sample.
      + Maximum
      + Minimum
      + Range
      + Arithmetic Mean
      + Root Mean Square
      + Variance
      + Standard Deviation
      + Slope of a least-squares line

   These are accessible from the statistical datum provided by this MIB
   module.
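
   As a rough sketch of how a management station might derive these
   observations, the example below applies the usual textbook
   population formulas to the six datum.  The function name, the
   dictionary keys, and the choice of formulas (including fitting the
   least-squares line against the order index i = 1..N) are assumptions
   made for illustration and are not taken from this section.

```python
import math


def analyze(n, sum_x, sum_x2, x_max, x_min, sum_ix, interval_seconds):
    """Derive the listed observations from one statistic (assumed formulas)."""
    if n == 0:
        raise ValueError("no data points in this interval")
    mean = sum_x / n
    # Population variance; clamp tiny negative values caused by rounding.
    variance = max(sum_x2 / n - mean * mean, 0.0)
    # Closed forms for SUM(i) and SUM(i*i) over i = 1..N.
    sum_i = n * (n + 1) // 2
    sum_i2 = n * (n + 1) * (2 * n + 1) // 6
    denominator = n * sum_i2 - sum_i * sum_i
    # With a single point the slope is undefined; report 0.0 by convention.
    slope = (n * sum_ix - sum_i * sum_x) / denominator if denominator else 0.0
    return {
        "number": n,
        "frequency": n / interval_seconds,
        "maximum": x_max,
        "minimum": x_min,
        "range": x_max - x_min,
        "mean": mean,
        "rms": math.sqrt(sum_x2 / n),
        "variance": variance,
        "std_dev": math.sqrt(variance),
        "slope": slope,
    }


# Made-up statistic for the values 5, 9, 2, 7, 4 seen in a 60-second interval.
report = analyze(n=5, sum_x=27, sum_x2=175, x_max=9, x_min=2,
                 sum_ix=77, interval_seconds=60)
for name, value in report.items():
    print(name, value)
```

   All of the divisions and square roots happen on the management
   station, so the agent only has to maintain integer sums.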
