RFC 4150

Transport Performance Metrics MIB

Pages: 57
Proposed Standard

RFC4150 - Page 1

Network Working Group                                           R. Dietz
Request for Comments: 4150                                    Hifn, Inc.
Category: Standards Track                                        R. Cole
                                                                 JHU/APL
                                                             August 2005


                   Transport Performance Metrics MIB

Status of This Memo

   This document specifies an Internet standards track protocol for the
   Internet community, and requests discussion and suggestions for
   improvements.  Please refer to the current edition of the "Internet
   Official Protocol Standards" (STD 1) for the standardization state
   and status of this protocol.  Distribution of this memo is unlimited.

Copyright Notice

   Copyright (C) The Internet Society (2005).

Abstract

   This memo defines a portion of the Management Information Base (MIB)
   for use with network management protocols in the Internet community.
   In particular, it describes managed objects used for monitoring
   selectable performance metrics and statistics derived from the
   monitoring of network packets and sub-application level transactions.
   The metrics can be defined through reference to existing IETF, ITU,
   and other standards organizations' documents.  The monitoring covers
   both passive and active traffic generation sources.

Table of Contents

   1. The Internet-Standard Management Framework
   2. Overview
      2.1. Terms
      2.2. Report Aggregation
      2.3. Structure of the MIB
      2.4. Statistics for Aggregation of Data: Conventions
      2.5. Relationship to the Remote Monitoring MIB
      2.6. Relationship to RMON2-MIB Protocol Identifier Reference
      2.7. Relationship to Standards-Based Performance Metrics
      2.8. Relationship to Application Performance Measurement MIB
   3. Statistics Perspective
      3.1. Statistics Structure
      3.2. Statistics Analysis
   4. Definitions
   5. Acknowledgements
   6. Security Considerations
   7. Normative References
   8. Informative References

1.  The Internet-Standard Management Framework

   For a detailed overview of the documents that describe the current
   Internet-Standard Management Framework, please refer to section 7 of
   RFC 3410 [RFC3410].

   Managed objects are accessed via a virtual information store, termed
   the Management Information Base or MIB.  MIB objects are generally
   accessed through the Simple Network Management Protocol (SNMP).
   Objects in the MIB are defined using the mechanisms defined in the
   Structure of Management Information (SMI).  This memo specifies a MIB
   module that is compliant to the SMIv2, which is described in STD 58,
   RFC 2578 [RFC2578], STD 58, RFC 2579 [RFC2579] and STD 58, RFC 2580
   [RFC2580].

2.  Overview

   This document continues the architecture created in the RMON2-MIB
   [RFC2021] by providing a major feature upgrade, primarily by
   providing new metrics and studies to assist in the analysis of
   performance for sub-application transaction flows in the network, in
   direct relationship to the transport of application layer protocols.

   Performance-monitoring agents have been widely used to analyze the
   parameters and metrics related to the perceived performance of
   distributed applications and services in networks.  The metrics
   collected by these agents have ranged from basic response time to a
   combination of metrics related to the loss and re-transmission of
   datagrams and PDUs.  Although the metrics are becoming more useful in
   the implementation of service-level monitoring and troubleshooting
   tools, the lack of a standard method to report these has limited the
   deployment to very specific customer needs and areas.

   This document is intended to create a general framework for the
   collection and reporting of performance-related metrics on sub-
   application level transaction flows in a network.  The MIB in this
   document is directly linked to the current RMON2-MIB [RFC2021], and
   uses the Protocol Directory as a key component in reporting the
   layering involved in the sub-application level transaction flows.

   The specific objectives of this document are to:

      + Provide a drill-down capability to complement the user-perceived
        monitoring defined within the Application Performance
        Measurement MIB (APM-MIB) [RFC3729].  This capability is
        intended to support trouble resolution, further characterization
        of performance, and a finer granularity of monitoring
        capabilities.  The APM-MIB provides a method for retrieving
        aggregated measurement data of the end-user's perception of
        application-level performance.  APM additionally provides
        thresholding and associated alarms if the end-user perceived
        performance degrades below defined thresholds.  The Transport
        Performance Metrics MIB (TPM-MIB) complements the APM-MIB
        capabilities by monitoring sub-application level transaction
        aspects not typically perceived by the end-user.  As an example,
        APM-MIB provides response time statistics of a typical web-
        browser application.  This application typically consists of DNS
        transactions, TCP connection establishment (or multiple
        establishments), HTTP download of the base page, and multiple
        downloads of the various embedded objects.  Ideally, TPM-MIB
        would provide statistics on the performance aspects of these
        multiple sub-application level transactions.

      + Provide additional performance metrics and related statistics.
        For troubleshooting and a finer granularity of performance
        monitoring, it is useful to provide measurements of additional
        metrics beyond those supported by the APM-MIB.

      + Support standards-based metrics and associated statistical
        aggregation by defining methods to reference those standards.
        The TPM-MIB provides a capability to describe metrics by
        reference to appropriate IETF, ITU, or other standards bodies
        defining metrics, including enterprise-specific standards
        bodies.  This capability is provided through the
        tpmMetricDefTable.

        Specifically, this MIB itself does not make references to metric
        specifications of the IETF, ITU and other organizations.
        Instead, it allows for the setup of the tpmMetricDefTable that
        does reference such IETF, ITU, and other metric specifications,
        and it allows pointers to such specifications to be dynamically
        listed in this table.  The following objects allow for that, and
        the DESCRIPTION clauses (of the objects below) explain how this
        is done:

               tpmMetricDefName OBJECT-TYPE
               tpmMetricDefReference OBJECT-TYPE
               tpmMetricDefGlobalID OBJECT-TYPE

        The tpmMetricDefGlobalID object contains a reference to the
        Object ID in a metrics registration MIB being developed in the
        IP Performance Metrics (IPPM) Working Group at the IETF; e.g.,
        the IPPM-REGISTRY-MIB [RFC4148], which defines the metric.  For
        metrics defined within the IPPM Working Group, which are
        included in the IPPM-REGISTRY-MIB, this object is used to
        reference those metrics directly.  For metrics not included
        within the IPPM-REGISTRY-MIB, the value of this object is set
        to 0.0 for none.

        Examples of appropriate references include the ITU-T
        Recommendation Y.1540 [Y.1540] on IP packet transfer performance
        metrics, and the IETF documents from the IPPM WG; e.g., RFC 2681
        on the round-trip delay metric [RFC2681] or RFC 3393 on the
        delay variation metric [RFC3393].  Others include RFC 2679
        [RFC2679], RFC 2680 [RFC2680], and RFC 3432 [RFC3432].  Although
        no specific metric is mandatory, implementations should, at a
        minimum, support a round-trip delay and a round-trip loss
        metric.

      + Provide (as an option) a table storing the measurements of the
        metrics on a transaction by transaction basis.  There are times
        when it is useful to have access to the raw measurements.  The
        tpmCurReportTable optionally provides access to this capability.

   Although this document outlines the basic measurements of performance
   in regard to the transport of application flows, it does not attempt
   to measure or provide a means to measure the actual perceived
   performance of the application transactions or quality.  The detailed
   measurements of end-user-perceived performance are directly related
   to this document and may be found in the APM-MIB [RFC3729].

   The objects defined in this document are intended as an interface
   between an RMON agent and an RMON management application and are not
   intended for direct manipulation by humans.  Although some users may
   tolerate the direct display of some of these objects, few will
   tolerate the complexity of manually manipulating objects to
   accomplish row creation.  These functions should be handled by the
   management application.

2.1.  Terms

   This document uses some terms that need introduction:

   DataSource
        A source of data for monitoring purposes.  This term is used
        exactly as defined in the RMON2-MIB [RFC2021].

   protocol
        A specific protocol encapsulation, as identified for monitoring
        purposes.  This term is used exactly as defined in the RMON
        Protocol Identifiers document [RFC2895].

   performance metric
        A specific, measured reporting metric, as identified for
        monitoring purposes.  There can be several metrics reported by
        an agent in the same implementation.  The metrics are extensible
        based on the agent implementation.

   application
        A network-based, high-level protocol performing useful work for
        an end-user of an end-system.  Typically, the application
        performs multiple request/response transactions to complete its
        work.  E.g., a web application downloading a web page completes
        DNS, TCP-connect, and multiple HTTP GET transactions prior to
        completing its task.

   transactions
        Elemental request/response transactions comprising more complex
        network-based applications.  E.g., a transaction may include an
        FTP get request and the file download in response.

2.2.  Report Aggregation

   This MIB module provides functions that aggregate measurements into
   higher-level summaries identical to the aggregation defined in the
   APM-MIB [RFC3729].  In addition to temporal aggregation of data,
   spatial aggregation is supported: the Textual Convention
   TransactionAggregationType, imported from the APM-MIB, specifies the
   nature of the spatial aggregation employed.

2.3.  Structure of the MIB

   The objects are arranged in the following groups:

        -- tpmCapabilitiesGroup

        -- tpmAggregateReportsGroup

        -- tpmCurrentReportsGroup

        -- tpmExceptionReportsGroup

   These groups are the basic units of conformance.  If an agent
   implements a group, then it must implement all objects in that group.
   Although this section provides an overview of grouping and
   conformance information for this MIB module, the authoritative
   reference for such information is contained in the MODULE-COMPLIANCE
   and OBJECT-GROUP macros later in this MIB module.

   These groups are defined to provide a means of assigning object
   identifiers, and to provide a method for implementers of managed
   agents to know which objects they must implement.

2.3.1.  The tpmCapabilitiesGroup

   The tpmCapabilitiesGroup contains objects and tables that show the
   measurement protocol and metric capabilities of the agent.  This
   group primarily consists of the tpmTransMetricDirTable and the
   tpmMetricDefTable.

2.3.2.  The tpmAggregateReportsGroup

   The tpmAggregateReportsGroup is used to provide the collection of
   aggregated statistical measurements for the configured report
   intervals.  The tpmAggregateReportsGroup consists of the
   tpmAggrReportCntrlTable and the tpmAggrReportTable.

2.3.3.  The tpmCurrentReportsGroup

   The tpmCurrentReportsGroup is used to provide the collection of
   uncompleted measurements for the current configured report for those
   transactions caught in progress.  A history of these transactions is
   also maintained once the current transaction has been completed.  The
   tpmCurrentReportsGroup consists of the tpmCurReportTable and the
   tpmCurReportSize object.

2.3.4.  The tpmExceptionReportsGroup

   The tpmExceptionReportsGroup is used to link immediate notifications
   of transactions that exceed certain thresholds defined in the
   apmExceptionGroup [RFC3729].  This group reports the aggregated sub-
   application measurements for those applications exceeding thresholds.
   The tpmExceptionReportsGroup consists of the tpmExcpReportTable.

2.4.  Statistics for Aggregation of Data: Conventions

   In order to measure the performance of traffic flows in a network,
   the proper analysis of a set of statistics is required.  Because a
   large majority of the statistics have a basis of time, the use of a
   simple statistical model is feasible.  Therefore, the MIB definitions
   within this document all use a basic set of statistical computed
   values to assist in further analysis by a management application.

   The remaining subsections in this section detail the common
   structured features that are applied to the performance metrics in
   the statistical format described above.  The tpmMetricDefTable
   (discussed below) describes the set of metrics supported in this MIB
   module.

2.5.  Relationship to the Remote Monitoring MIB

   This document describes the implementation of an additional MIB for
   the support of performance-related metrics within the framework of
   the RMON2-MIB [RFC2021].  The objects and tables defined in this MIB
   module are an extension to the existing framework for the support of
   both Client/Server and Server push-related applications and services.

2.6.  Relationship to RMON2-MIB Protocol Identifier Reference

   This document uses the Protocol Identifiers outlined in the current
   Protocol Identifier Reference document, RFC 2895 [RFC2895].  The
   protocol index values throughout the document are a direct reference
   to the same relationship that exists between the RMON2-MIB [RFC2021]
   and the Protocol Identifier Reference document, RFC 2895 [RFC2895].
   An important extension of the Protocol Identification to application-
   level verbs is found in RFC 3395 [RFC3395].

2.7.  Relationship to Standards-Based Performance Metrics

   This document uses the tpmMetricDefTable to describe the metrics
   supported by an instance of the TPM-MIB.  The performance metric
   index values throughout the document are a direct reference to the
   metrics defined in that table.  The table defines metrics by directly
   referencing other standards that provide definitive descriptions of
   the metric.

2.8.  Relationship to Application Performance Measurement MIB

   This document uses the apmReportControlIndex, appLocalIndex, and
   apmReportIndex, as outlined in the current Application Performance
   Measurement MIB [RFC3729].  These objects are used to create a
   reference link for the purpose of reporting transaction flow details
   on application-level measurements.  As such, the TPM-MIB is designed
   to provide a drill-down extension to the APM-MIB.  Further, it draws
   heavily on the ideas and designs laid out in the APM-MIB.

3.  Statistics Perspective

   When dealing with time-based measurements on application data
   packets, ideally all the timestamps and related data could be stored
   and forwarded for later analysis.  However, when faced with thousands
   of conversations per second on ever-faster networks, storing all the
   data, even if compressed, would take too much processing, memory, and
   manager download time to be practical.

   It is important to note that in dealing with network data we will be
   dealing with statistical populations and not samples.  Statistics
   books deal with both because the math is similar.  In collecting
   agent data, a population (i.e., all the data) must be processed.

   Because of the nature of application protocols, just sampling some of
   the packets will not give good results.  Missing just one critical
   packet, such as one that specified an ephemeral port on which data
   will be transmitted or what application will be run, can cause much
   valid data to be lost.

   The time-based measurements the agent collects will come from
   examining the entire group of data, i.e., the population.  The
   population will be finite.  The agent will seek only to provide
   information that will describe the actual data.  Analysis of that
   data will be left to the management station.

   The simplest form of representing a group of data is by frequency
   distributions, i.e., buckets.  Statistics provides a great many ways
   of analyzing this type of data, and there are some rules in creating
   the buckets.  First, the range needs to be known.  Second, a bucket
   size needs to be determined.  Fixed bucket sizes are best, although
   variable may be used if needed.  However, the statistics texts tend
   only to refer to operations of fixed-size buckets.  This method of
   describing data is expensive for an agent to implement.  First, the
   agent must process a great amount of data at a time.  Storing the
   data, determining the range, locating the buckets, and then filling
   in the data after the fact takes a fair amount of storage and time.
   Fixing the range and bucket sizes in the beginning can be
   problematic, as the agent may have to adjust the values for each of
   the applications it collects data on.  Such numbers can be in the
   thousands.  Additional complexity arises in adding new protocols and
   even in describing the buckets themselves to the management
   application.  This is the approach taken in the APM-MIB.

   A complementary approach is to provide frequency distribution
   statistics.  They describe aggregations, such as mean and standard
   deviation, that can be obtained by summation functions on the
   individual data elements in a population.  Analysis of the data
   described by these functions has been thoroughly studied, and
   interpretation of these values is available to anyone with an
   introduction to statistics.  In fact, frequency distributions are
   routinely analyzed to generate these varied numbers, which are then
   used for further analysis.  Note that frequency distribution
   statistics, by their very nature, are computed exactly from the
   data, whereas buckets introduce error factors that are not present
   with direct analysis by summation-type formulas.  Because the TPM-MIB
   provides a drill-down capability to the APM-MIB, it has to measure
   and store much more information than the APM-MIB.  For this reason,
   and in order to complement the APM-MIB, the TPM-MIB relies on
   statistical descriptions rather than a bucket description of the
   measurement data.
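
   The difference between a bucket description and a summation
   description can be illustrated with a minimal Python sketch.  The
   sample values, the 20 ms bucket width, and the use of bucket
   midpoints by the manager are all assumptions made for illustration;
   none of them is specified by this document or by the APM-MIB.

```python
# Hypothetical response times in milliseconds (illustrative data only).
samples = [12, 18, 25, 31, 47, 52, 68, 75, 93, 99]

BUCKET_SIZE = 20  # assumed fixed bucket width, in ms

# Bucket description: a count of data points per fixed-size bucket.
buckets = {}
for x in samples:
    index = x // BUCKET_SIZE
    buckets[index] = buckets.get(index, 0) + 1

# A manager that sees only the buckets must assume a value for every
# point in a bucket; the bucket midpoint is assumed here.
bucket_mean = sum(
    (index * BUCKET_SIZE + BUCKET_SIZE / 2) * count
    for index, count in buckets.items()
) / len(samples)

# Summation description: N and S(X) are accumulated exactly.
n = len(samples)
sum_x = sum(samples)
exact_mean = sum_x / n

print(exact_mean, bucket_mean)
```

   For this data, the mean recovered from the buckets differs from the
   mean obtained from the sums, which is the kind of error factor that
   bucketing can introduce.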

   The agent will provide data that can be used to calculate the most
   basic and useful statistical aggregates.  The agent will not perform
   the calculations and will not provide the statistical measurement
   directly.  There are several reasons why this is not desired.  The
   first is that finding the final measurement can be expensive in terms
   of computation and representation.  There are divisions and square
   roots, and the measurements are expressed as floating point values.
   The second is that by providing the variables to the statistical
   functions, those variables are scalable.  It is possible to combine
   smaller intervals into larger ones.

   An example is the arithmetic mean or average.  This is the sum of the
   data divided by the number of data elements.  The agent will provide
   the sum of the x values and the number of elements N.  The management
   station can perform the division to obtain the average.  Two
   intervals can be combined by adding their sums of the x values and
   adding their numbers of elements to get a combined sum and number of
   elements.  The average formula then works just the same.  The sum of
   the x values and the number of elements are also used in calculating
   other statistical measurement values.
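
   The arithmetic described above can be sketched in Python as follows.
   The function name, the interval values, and the choice to return 0.0
   for an empty interval are assumptions made for illustration only.

```python
def mean(sum_x, n):
    """Arithmetic mean from the two values the agent reports.

    Returning 0.0 when n is 0 is an assumption of this sketch.
    """
    return sum_x / n if n else 0.0


# Two hypothetical adjacent intervals, each reported as (S(X), N).
first = (300, 4)
second = (500, 6)

# Combine by adding the sums and adding the element counts.
combined_sum = first[0] + second[0]
combined_n = first[1] + second[1]
combined_mean = mean(combined_sum, combined_n)

# Averaging the two per-interval means would weight them equally and
# generally gives a different (incorrect) answer.
mean_of_means = (mean(*first) + mean(*second)) / 2

print(combined_mean, mean_of_means)
```

   Because the agent exports the sum and the count rather than the
   quotient, the management station can merge intervals without losing
   the weighting of each interval.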

3.1.  Statistics Structure

   The statistical data elements (datum) of the metric have been chosen
   to maximize the amount of data available while minimizing the amount
   of memory needed to store the statistic and minimizing the CPU
   processing requirement needed to generate the statistic.

   The statistic data structure contains six unsigned integer datum.

       N        count of the number of data points for the metric
       S(X)     sum of all the data point values for the metric
       S(X2)    sum of all the data point values squared for the metric
       Xmax     maximum data point value for the metric
       Xmin     minimum data point value for the metric
       S(I*X)   sum of the data points multiplied by their order, i.e.,
                = SUM from i=1 to N { i*X sub i}
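
   As an illustration of the datum listed above, the following Python
   sketch accumulates them one data point at a time, without storing
   the individual points.  The class and field names are hypothetical
   and are not the MIB object names.  Python integers are unbounded, so
   the sketch does not model the fixed-width counters or overflow
   handling that an agent would need.

```python
from dataclasses import dataclass


@dataclass
class MetricStat:
    n: int = 0        # N:      count of data points
    sum_x: int = 0    # S(X):   sum of data point values
    sum_x2: int = 0   # S(X2):  sum of squared data point values
    x_max: int = 0    # Xmax:   maximum data point value
    x_min: int = 0    # Xmin:   minimum data point value
    sum_ix: int = 0   # S(I*X): sum of i * X_i for i = 1..N

    def add(self, x: int) -> None:
        """Fold one new data point into the statistic."""
        self.n += 1
        self.sum_x += x
        self.sum_x2 += x * x
        # The first point initializes both extremes (an assumption of
        # this sketch, since an unused datum defaults to 0).
        self.x_max = x if self.n == 1 else max(self.x_max, x)
        self.x_min = x if self.n == 1 else min(self.x_min, x)
        # The point's order is its 1-based arrival index.
        self.sum_ix += self.n * x


stat = MetricStat()
for value in [5, 3, 8]:
    stat.add(value)
print(stat)
```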

   A performance metric is used to describe events over a time interval.
   The measurement points can be processed immediately into the
   statistic and do not have to be stored for later processing.  For
   example, to count the number of events in a time interval, it is
   sufficient to increment a counter for each event.  It is not
   necessary to cache all the events and then to count them at the end
   of the interval.  The statistic is also designed to be easily
   scalable in terms of combining adjacent intervals.  For example, if
   an agent created a specific statistic every 30 seconds and a user
   table interval was set to 60 seconds, the 60-second statistic could
   be obtained by combining the two 30-second statistics.  The following
   rules will be applied when combining adjacent statistics.

       N         S(N)
       S(X)      S(S(X))
       S(X2)     S(S(X2))
       Xmax      MAX(Xmax)
       Xmin      MIN(Xmin)
       S(I*X)    S(I*X) + N*S(X) + S(I*X)
                 where the first S(I*X) term and N come from the
                 former 30-second period, and the S(X) and final
                 S(I*X) terms come from the later 30-second
                 period.
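
   The combining rules above can be checked with a short Python sketch
   that merges two adjacent statistics and compares the result with a
   statistic built directly from all the data points.  The names
   MetricStat, accumulate, and combine are hypothetical, and the
   handling of an empty interval is an assumption of the sketch rather
   than a rule stated in this document.

```python
from dataclasses import dataclass


@dataclass
class MetricStat:
    n: int = 0        # N
    sum_x: int = 0    # S(X)
    sum_x2: int = 0   # S(X2)
    x_max: int = 0    # Xmax
    x_min: int = 0    # Xmin
    sum_ix: int = 0   # S(I*X)


def accumulate(values):
    """Build a statistic directly from raw data points."""
    stat = MetricStat()
    for x in values:
        stat.n += 1
        stat.sum_x += x
        stat.sum_x2 += x * x
        stat.x_max = x if stat.n == 1 else max(stat.x_max, x)
        stat.x_min = x if stat.n == 1 else min(stat.x_min, x)
        stat.sum_ix += stat.n * x
    return stat


def combine(former, later):
    """Merge two adjacent statistics; 'former' precedes 'later'."""
    # Assumption: an empty interval contributes nothing, so its
    # default Xmax/Xmin of 0 must not take part in MAX/MIN.
    if former.n == 0:
        return later
    if later.n == 0:
        return former
    return MetricStat(
        n=former.n + later.n,
        sum_x=former.sum_x + later.sum_x,
        sum_x2=former.sum_x2 + later.sum_x2,
        x_max=max(former.x_max, later.x_max),
        x_min=min(former.x_min, later.x_min),
        # Each later point's order shifts up by the former count.
        sum_ix=former.sum_ix + former.n * later.sum_x + later.sum_ix,
    )


former = accumulate([5, 3, 8])
later = accumulate([2, 9])
merged = combine(former, later)
print(merged == accumulate([5, 3, 8, 2, 9]))
```

   The N*S(X) term arises because the i-th point of the later period
   becomes point N+i of the combined period, so its contribution grows
   by N times its value.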

   This structure gives a generic framework upon which the actual
   performance statistics will be defined.  Each specific statistical
   definition must address the specific significance, if any, given to
   each metric datum.  While a specific metric definition should try to
   conform to the generic framework, it is acceptable for a metric datum
   to not be used, and to have no meaning, for a specific metric.  In
   such cases the datum will default to a 0 value.

3.2.  Statistics Analysis

   The actual meaning of a specific statistical datum is determined by
   the definition of the specific statistic.  The following is a
   discussion of the operations and observations that can be performed
   on a generic metric.  This means that the following may or may not
   apply and/or have meaning when applied to any specific metric.

   The following observations and analysis techniques are not
   all-inclusive.  Rather, these are the ones identified at the time of
   writing this document.

       + Number.

       + Frequency.

       + The time interval is that specified in the control table.  It
         is not a metric datum, but it is associated with the metric
         sample.

       + Maximum

       + Minimum

       + Range

       + Arithmetic Mean

       + Root Mean Square

       + Variance

       + Standard Deviation

       + Slope of a least-squares line

   These are accessible from the statistical datum provided by this MIB
   module.
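
   As a sketch of how a management application might derive the
   quantities listed above from the statistical datum, consider the
   following Python function.  The function name, the dictionary keys,
   and the interpretation of frequency as N divided by the interval
   length are assumptions.  The variance is the population variance, in
   keeping with the population view taken in Section 3.  The slope
   assumes the least-squares line is fitted against the order index
   i = 1..N with unit spacing.

```python
import math


def analyze(n, sum_x, sum_x2, x_max, x_min, sum_ix, interval_seconds):
    """Derive summary values from one statistic and its interval."""
    if n == 0:
        return {}
    mean = sum_x / n
    mean_square = sum_x2 / n
    # Clamp at zero to guard against floating-point rounding.
    variance = max(mean_square - mean * mean, 0.0)
    result = {
        "number": n,
        "frequency": n / interval_seconds,
        "maximum": x_max,
        "minimum": x_min,
        "range": x_max - x_min,
        "mean": mean,
        "rms": math.sqrt(mean_square),
        "variance": variance,
        "std_dev": math.sqrt(variance),
    }
    if n > 1:
        # Least-squares slope of X against i = 1..N, using the closed
        # forms S(I) = N(N+1)/2 and S(I^2) = N(N+1)(2N+1)/6.
        result["slope"] = (
            12 * (sum_ix - (n + 1) / 2 * sum_x) / (n * (n * n - 1))
        )
    return result


# Hypothetical data points 2, 4, 6, 8 observed in a 60-second interval.
summary = analyze(n=4, sum_x=20, sum_x2=120, x_max=8, x_min=2,
                  sum_ix=60, interval_seconds=60)
print(summary)
```

   All of these values follow from the six datum and the control-table
   interval; no individual data point is needed.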
