NAME

trapstat - report trap statistics

SYNOPSIS

/usr/sbin/trapstat [-t | -T | -e entry]


     [-C processor_set_id | -c cpulist] [-P] [-a]


     [-r rate] [ [interval [count]] | command | [args]]

/usr/sbin/trapstat -l

DESCRIPTION

The trapstat utility gathers and displays run-time trap statistics on UltraSPARC-based systems. The default output is a table of trap types and CPU IDs, with each row of the table denoting a trap type and each column of the table denoting a CPU. If standard output is a terminal, the table contains as many columns of data as can fit within the terminal width; if standard output is not a terminal, the table contains at most six columns of data. By default, data is gathered and displayed for all CPUs; if the data cannot fit in a single table, it is printed across multiple tables. The set of CPUs for which data is gathered and displayed can be optionally specified with the -c or -C option.

Unless the -r option or the -a option is specified, the value displayed in each entry of the table corresponds to the number of traps per second. If the -r option is specified, the value corresponds to the number of traps over the interval implied by the specified sampling rate; if the -a option is specified, the value corresponds to the accumulated number of traps since the invocation of trapstat.

By default, trapstat displays data once per second, and runs indefinitely; both of these behaviors can be optionally controlled with the interval and count parameters, respectively. The interval is specified in seconds; the count indicates the number of intervals to be executed before exiting. Alternatively, command can be specified, in which case trapstat executes the provided command and continues to run until the command exits. A positive integer is assumed to be an interval; if the desired command cannot be distinguished from an integer, the full path of command must be specified.

UltraSPARC I (obsolete), II, and III handle translation lookaside buffer (TLB) misses by trapping to the operating system. TLB miss traps can be a significant component of overall system performance for some workloads; the -t option provides in-depth information on these traps. When run with this option, trapstat displays both the rate of TLB miss traps and the percentage of time spent processing those traps. Additionally, TLB misses that hit in the translation storage buffer (TSB) are differentiated from TLB misses that further miss in the TSB. (The TSB is a software structure used as a translation entry cache to allow the TLB to be quickly filled; it is discussed in detail in the UltraSPARC II User's Manual.) The TLB and TSB miss information is further broken down into user- and kernel-mode misses.

Workloads with working sets that exceed the TLB reach may spend a significant amount of time missing in the TLB. To accommodate such workloads, the operating system supports multiple page sizes: larger page sizes increase the effective TLB reach and thereby reduce the number of TLB misses. To provide insight into the relationship between page size and TLB miss rate, trapstat optionally provides in-depth TLB miss information broken down by page size using the -T option. The information provided by the -T option is a superset of that provided by the -t option; only one of -t and -T can be specified.

OPTIONS

The following options are supported:

-a

Displays the number of traps as accumulating, monotonically increasing values instead of per-second or per-interval rates.

-c cpulist

Enables trapstat only on the CPUs specified by cpulist.

cpulist can be a single processor ID (for example, 4), a range of processor IDs (for example, 4-6), or a comma separated list of processor IDs or processor ID ranges (for example, 4,5,6 or 4,6-8).

-C processor_set_id

Enables trapstat only on the CPUs in the processor set specified by processor_set_id.

trapstat modifies its output to always reflect the CPUs in the specified processor set. If a CPU is added to the set, trapstat modifies its output to include the added CPU; if a CPU is removed from the set, trapstat modifies its output to exclude the removed CPU. At most one processor set can be specified.

-e entrylist

Enables trapstat only for the trap table entry or entries specified by entrylist. A trap table entry can be specified by trap number or by trap name (for example, the level-10 trap can be specified as 74, 0x4A, 0x4a, or level-10).

entrylist can be a single trap table entry or a comma separated list of trap table entries. If the specified trap table entry is not valid, trapstat prints a table of all valid trap table entries and values. A list of valid trap table entries is also found in The SPARC Architecture Manual, Version 9 and the Sun Microelectronics UltraSPARC II User's Manual. If the parsable option (-P) is specified in addition to the -e option, the format of the data is as follows:

Field	Contents
1	Timestamp (nanoseconds since start)
2	CPU ID
3	Trap number (in hexadecimal)
4	Trap name
5	Trap rate per interval

Each field is separated with whitespace. If the format is modified, it will be modified by adding potentially new fields beginning with field 6; exant fields will remain unchanged.

-l

Lists trap table entries. By default, a table is displayed containing all valid trap numbers, their names and a brief description. The trap name is used in both the default output and in the entrylist parameter for the -e argument. If the parsable option (-P) is specified in addition to the -l option, the format of the data is as follows:

Field	Contents
1	Trap number in hexadecimal
2	Trap number in decimal
3	Trap name
Remaining	Trap description

-P

Generates parsable output. When run without other data gathering modifying options (that is, -e, -t or -T), trapstat's the parsable output has the following format:

Field	Contents
1	Timestamp (nanoseconds since start)
2	CPU ID
3	Trap number (in hexadecimal)
4	Trap name
5	Trap rate per interval

Each field is separated with whitespace. If the format is modified, it will be modified by adding potentially new fields beginning with field 6; extant fields will remain unchanged.

-r rate

Explicitly sets the sampling rate to be rate samples per second. If this option is specified, trapstat's output changes from a traps-per-second to traps-per-sampling-interval.

-t

Enables TLB statistics.

A table is displayed with four principal columns of data: itlb-miss, itsb-miss, dtlb-miss, and dtsb-miss. The columns contain both the rate of the corresponding event and the percentage of CPU time spent processing the event. The percentage of CPU time is given only in terms of a single CPU. The rows of the table correspond to CPUs, with each CPU consuming two rows: one row for user-mode events (denoted with u) and one row for kernel-mode events (denoted with k). For each row, the percentage of CPU time is totalled and displayed in the rightmost column. The CPUs are delineated with a solid line. If the parsable option (-P) is specified in addition to the -t option, the format of the data is as follows:

Field	Contents
1	Timestamp (nanoseconds since start)
2	CPU ID
3	Mode (k denotes kernel, u denotes user)
4	I-TLB misses
5	Percentage of time in I-TLB miss handler
6	I-TSB misses
7	Percentage of time in I-TSB miss handler
8	D-TLB misses
9	Percentage of time in D-TLB miss handler
10	D-TSB misses
11	Percentage of time in D-TSB miss handler

Each field is separated with whitespace. If the format is modified, it will be modified by adding potentially new fields beginning with field 12; extant fields will remain unchanged.

-T

Enables TLB statistics, with page size information. As with the -t option, a table is displayed with four principal columns of data: itlb-miss, itsb-miss, dtlb-miss, and dtsb-miss. The columns contain both the absolute number of the corresponding event, and the percentage of CPU time spent processing the event. The percentage of CPU time is given only in terms of a single CPU. The rows of the table correspond to CPUs, with each CPU consuming two sets of rows: one set for user-level events (denoted with u) and one set for kernel-level events (denoted with k). Each set, in turn, contains as many rows as there are page sizes supported (see getpagesizes(3C)). For each row, the percentage of CPU time is totalled and displayed in the right-most column. The two sets are delineated with a dashed line; CPUs are delineated with a solid line. If the parsable option (-P) is specified in addition to the -T option, the format of the data is as follows:

Field	Contents
1	Timestamp (nanoseconds since start)
2	CPU ID
3	Mode k denotes kernel, u denotes user)
4	Page size, in decimal
5	I-TLB misses
6	Percentage of time in I-TLB miss handler
7	I-TSB misses
8	Percentage of time in I-TSB miss handler
9	D-TLB misses
10	Percentage of time in D-TLB miss handler
11	D-TSB misses
12	Percentage of time in D-TSB miss handler

Each field is separated with whitespace. If the format is modified, it will be modified by adding potentially new fields beginning with field 13; extant fields will remain unchanged.

EXAMPLES

Example 1 Using trapstat Without Options

When run without options, trapstat displays a table of trap types and CPUs. At most six columns can fit in the default terminal width; if (as in this example) there are more than six CPUs, multiple tables are displayed:

example# trapstat
vct  name               |     cpu0     cpu1     cpu4     cpu5     cpu8     cpu9
------------------------+------------------------------------------------------


 24 cleanwin            |     6446     4837     6368     2153     2623     1321


 41 level-1             |      100        0        0        0        1        0


 44 level-4             |        0        1        1        1        0        0


 45 level-5             |        0        0        0        0        0        0


 47 level-7             |        0        0        0        0        9        0


 49 level-9             |      100      100      100      100      100      100


 4a level-10            |      100        0        0        0        0        0


 4d level-13            |        6       10        7       16       13       11


 4e level-14            |      100        0        0        0        1        0


 60 int-vec             |     2607     2740     2642     2922     2920     3033


 64 itlb-miss           |     3129     2475     3167     1037     1200      569


 68 dtlb-miss           |   121061    86162   109838    37386    45639    20269


 6c dtlb-prot           |      997      847     1061      379      406      184


 84 spill-user-32       |     2809     2133     2739   200806   332776   454504


 88 spill-user-64       |    45819   207856    93487   228529    68373    77590


 8c spill-user-32-cln   |      784      561      767      274      353      215


 90 spill-user-64-cln   |        9       37       17       39       12       13


 98 spill-kern-64       |    62913    50145    63869    21916    28431    11738


 a4 spill-asuser-32     |     1327      947     1288      460      572      335


 a8 spill-asuser-64     |       26       48       18       54       10       14


 ac spill-asuser-32-cln |     4580     3599     4555     1538     1978      857


 b0 spill-asuser-64-cln |       26        0        0        2        0        0


 c4 fill-user-32        |     2862     2161     2798   191746   318115   435850


 c8 fill-user-64        |    45813   197781    89179   217668    63905    74281


 cc fill-user-32-cln    |     3802     2833     3733    10153    16419    19475


 d0 fill-user-64-cln    |      329    10105     4873    10603     4235     3649


 d8 fill-kern-64        |    62519    49943    63611    21824    28328    11693
108 syscall-32          |     2285     1634     2278      737      957      383
126 self-xcall          |      100        0        0        0        0        0
vct  name               |    cpu12    cpu13    cpu14    cpu15
------------------------+------------------------------------


 24 cleanwin            |     5435     4232     6302     6104


 41 level-1             |        0        0        0        0


 44 level-4             |        2        0        0        1


 45 level-5             |        0        0        0        0


 47 level-7             |        0        0        0        0


 49 level-9             |      100      100      100      100


 4a level-10            |        0        0        0        0


 4d level-13            |       15       11       22       11


 4e level-14            |        0        0        0        0


 60 int-vec             |     2813     2833     2738     2714


 64 itlb-miss           |     2636     1925     3133     3029


 68 dtlb-miss           |    90528    70639   107786   103425


 6c dtlb-prot           |      819      675      988      954


 84 spill-user-32       |   175768    39933     2811     2742


 88 spill-user-64       |        0   241348    96907   118298


 8c spill-user-32-cln   |      681      513      753      730


 90 spill-user-64-cln   |        0       42       16       20


 98 spill-kern-64       |    52158    40914    62305    60141


 a4 spill-asuser-32     |     1113      856     1251     1208


 a8 spill-asuser-64     |        0       64       16       24


 ac spill-asuser-32-cln |     3816     2942     4515     4381


 b0 spill-asuser-64-cln |        0        0        0        0


 c4 fill-user-32        |   170744    38444     2876     2784


 c8 fill-user-64        |        0   230381    92941   111694


 cc fill-user-32-cln    |     8550     3790     3612     3553


 d0 fill-user-64-cln    |        0    10726     4495     5845


 d8 fill-kern-64        |    51968    40760    62053    59922
108 syscall-32          |     1839     1495     2144     2083
126 self-xcall          |        0        0        0        0

Example 2 Using trapset with CPU Filtering

The -c option can be used to limit the CPUs on which trapstat is enabled. This example limits CPU 1 and CPUs 12 through 15.

example# trapstat -c 1,12-15
vct  name               |     cpu1    cpu12    cpu13    cpu14    cpu15
------------------------+---------------------------------------------


 24 cleanwin            |     6923     3072     2500     3518     2261


 44 level-4             |        3        0        0        1        1


 49 level-9             |      100      100      100      100      100


 4d level-13            |       23        8       14       19       14


 60 int-vec             |     2559     2699     2752     2688     2792


 64 itlb-miss           |     3296     1548     1174     1698     1087


 68 dtlb-miss           |   114788    54313    43040    58336    38057


 6c dtlb-prot           |     1046      549      417      545      370


 84 spill-user-32       |    66551    29480   301588    26522   213032


 88 spill-user-64       |        0   318652   111239   299829   221716


 8c spill-user-32-cln   |      856      347      331      416      293


 90 spill-user-64-cln   |        0       55       21       59       39


 98 spill-kern-64       |    66464    31803    24758    34004    22277


 a4 spill-asuser-32     |     1423      569      560      698      483


 a8 spill-asuser-64     |        0       74       32       98       46


 ac spill-asuser-32-cln |     4875     2250     1728     2384     1584


 b0 spill-asuser-64-cln |        0        2        0        1        0


 c4 fill-user-32        |    64193    28418   287516    27055   202093


 c8 fill-user-64        |        0   305016   106692   288542   210654


 cc fill-user-32-cln    |     6733     3520    15185     2396    12035


 d0 fill-user-64-cln    |        0    13226     3506    12933    11032


 d8 fill-kern-64        |    66220    31680    24674    33892    22196
108 syscall-32          |     2446      967      817     1196      755

Example 3 Using trapstat with TLB Statistics

The -t option displays in-depth TLB statistics, including the amount of time spent performing TLB miss processing. The following example shows that the machine is spending 14.1 percent of its time just handling D-TLB misses:

example# trapstat -t
cpu m| itlb-miss %tim itsb-miss %tim | dtlb-miss %tim dtsb-miss %tim |%tim
-----+-------------------------------+-------------------------------+----


  0 u|      2571  0.3         0  0.0 |     10802  1.3         0  0.0 | 1.6


  0 k|         0  0.0         0  0.0 |    106420 13.4       184  0.1 |13.6
-----+-------------------------------+-------------------------------+----


  1 u|      3069  0.3         0  0.0 |     10983  1.2       100  0.0 | 1.6


  1 k|        27  0.0         0  0.0 |    106974 12.6        19  0.0 |12.7
-----+-------------------------------+-------------------------------+----


  2 u|      3033  0.3         0  0.0 |     11045  1.2       105  0.0 | 1.6


  2 k|        43  0.0         0  0.0 |    107842 12.7       108  0.0 |12.8
-----+-------------------------------+-------------------------------+----


  3 u|      2924  0.3         0  0.0 |     10380  1.2       121  0.0 | 1.6


  3 k|        54  0.0         0  0.0 |    102682 12.2        16  0.0 |12.2
-----+-------------------------------+-------------------------------+----


  4 u|      3064  0.3         0  0.0 |     10832  1.2       120  0.0 | 1.6


  4 k|        31  0.0         0  0.0 |    107977 13.0       236  0.1 |13.1
=====+===============================+===============================+====


 ttl |     14816  0.3         0  0.0 |    585937 14.1      1009  0.0 |14.5

Example 4 Using trapstat with TLB Statistics and Page Size Information

By specifying the -T option, trapstat shows TLB misses broken down by page size. In this example, CPU 0 is spending 7.9 percent of its time handling user-mode TLB misses on 8K pages, and another 2.3 percent of its time handling user-mode TLB misses on 64K pages.

example# trapstat -T -c 0
cpu m size| itlb-miss %tim itsb-miss %tim | dtlb-miss %tim dtsb-miss %tim |%tim
----------+-------------------------------+-------------------------------+----


  0 u   8k|      1300  0.1        15  0.0 |    104897  7.9        90  0.0 | 8.0


  0 u  64k|         0  0.0         0  0.0 |     29935  2.3         7  0.0 | 2.3


  0 u 512k|         0  0.0         0  0.0 |      3569  0.2         2  0.0 | 0.2


  0 u   4m|         0  0.0         0  0.0 |       233  0.0         2  0.0 | 0.0
- - - - - + - - - - - - - - - - - - - - - + - - - - - - - - - - - - - - - + - -


  0 k   8k|        13  0.0         0  0.0 |     71733  6.5       110  0.0 | 6.5


  0 k  64k|         0  0.0         0  0.0 |         0  0.0         0  0.0 | 0.0


  0 k 512k|         0  0.0         0  0.0 |         0  0.0       206  0.1 | 0.1


  0 k   4m|         0  0.0         0  0.0 |         0  0.0         0  0.0 | 0.0
==========+===============================+===============================+====


      ttl |      1313  0.1        15  0.0 |    210367 17.1       417  0.2 |17.5

Example 5 Using trapstat with Entry Filtering

By specifying the -e option, trapstat displays statistics for only specific trap types. Using this option minimizes the probe effect when seeking specific data. This example yields statistics for only the dtlb-prot and syscall-32 traps on CPUs 12 through 15:

example# trapstat -e dtlb-prot,syscall-32 -c 12-15
vct  name               |    cpu12    cpu13    cpu14    cpu15
------------------------+------------------------------------


 6c dtlb-prot           |      817      754     1018      560
108 syscall-32          |     1426     1647     2186     1142
vct  name               |    cpu12    cpu13    cpu14    cpu15
------------------------+------------------------------------


 6c dtlb-prot           |     1085      996      800      707
108 syscall-32          |     2578     2167     1638     1452

Example 6 Using trapstat with a Higher Sampling Rate

The following example uses the -r option to specify a sampling rate of 1000 samples per second, and filter only for the level-10 trap. Additionally, specifying the -P option yields parsable output.

Notice the timestamp difference between the level-10 events: 9,998,000 nanoseconds and 10,007,000 nanoseconds. These level-10 events correspond to the system clock, which by default ticks at 100 hertz (that is, every 10,000,000 nanoseconds).

example# trapstat -e level-10 -P -r 1000
1070400 0 4a level-10 0
2048600 0 4a level-10 0
3030400 0 4a level-10 1
4035800 0 4a level-10 0
5027200 0 4a level-10 0
6027200 0 4a level-10 0
7027400 0 4a level-10 0
8028200 0 4a level-10 0
9026400 0 4a level-10 0
10029600 0 4a level-10 0
11028600 0 4a level-10 0
12024000 0 4a level-10 0
13028400 0 4a level-10 1
14031200 0 4a level-10 0
15027200 0 4a level-10 0
16027600 0 4a level-10 0
17025000 0 4a level-10 0
18026000 0 4a level-10 0
19027800 0 4a level-10 0
20025600 0 4a level-10 0
21025200 0 4a level-10 0
22025000 0 4a level-10 0
23035400 0 4a level-10 1
24027400 0 4a level-10 0
25026000 0 4a level-10 0
26027000 0 4a level-10 0

ATTRIBUTES

See attributes(7) for descriptions of the following attributes:

ATTRIBUTE TYPE	ATTRIBUTE VALUE
Interface Stability
Human Readable Output	Unstable
Parsable Output	Evolving

NOTES

When enabled, trapstat induces a varying probe effect, depending on the type of information collected. While the precise probe effect depends upon the specifics of the hardware, the following table can be used as a rough guide:

Option	Approximate probe effect
default	3-5% per trap
-e	3-5% per specified trap
-t, -T	40-45% per TLB miss trap hitting in the TSB, 25-30% per TLB miss trap missing in the TSB

These probe effects are per trap not for the system as a whole. For example, running trapstat with the default options on a system that spends 7% of total time handling traps induces a performance degradation of less than one half of one percent; running trapstat with the -t or -T option on a system spending 5% of total time processing TLB misses induce a performance degradation of no more than 2.5%.

When run with the -t or -T option, trapstat accounts for its probe effect when calculating the %tim fields. This assures that the %tim fields are a reasonably accurate indicator of the time a given workload is spending handling TLB misses — regardless of the perturbing presence of trapstat.

While the %tim fields include the explicit cost of executing the TLB miss handler, they do not include the implicit costs of TLB miss traps (for example, pipeline effects, cache pollution, etc). These implicit costs become more significant as the trap rate grows; if high %tim values are reported (greater than 50%), you can accurately infer that much of the balance of time is being spent on the implicit costs of the TLB miss traps.

Due to the potential system wide degradation induced, only the super-user can run trapstat.

Due to the limitation of the underlying statistics gathering methodology, only one instance of trapstat can run at a time.