Objectives
Learn how to apply
principles of effective reporting to performance test data.
Learn when to share
technical results versus produce stakeholder reports.
Learn what questions various
team members expect performance reports to answer.
Overview
Managers and stakeholders need more than simply the
results from various tests — they need conclusions based on those results, and
consolidated data that supports those conclusions. Technical team members also
need more than just results — they need analysis, comparisons, and details of
how the results were obtained. Team members of all types get value from
performance results being shared more frequently. In this chapter, you will
learn how to satisfy the needs of all the consumers of performance test results
and data by employing a variety of reporting and results-sharing techniques,
and by learning exemplar scenarios where each technique tends to be well received.
Principles of Effective Reporting
The key to effective reporting is to present
information of interest to the intended audience in a quick, simple, and
intuitive manner. The following are some of the underlying principles of effective
reporting:
Report early, report often
Report visually
Report intuitively
Use the right statistics
Consolidate data correctly
Summarize data effectively
Customize reports for the
intended audience
Use concise verbal summaries
Make the data available
Report Early, Report Often
Continual sharing of information and data is critical
to the efficiency and overall success of a performance-testing project.
However, not all of the information and data to be shared needs to take the
form of a formal or semiformal report. One effective approach is to send
stakeholders summary charts and tables every day or two in an e-mail message
that contains a concise statement of key points. Use the feedback and questions
you receive from those stakeholders when deciding what to put in the next formal
or semiformal report. In this way you can gauge the needs of your audience
before writing what is intended to be a stand-alone or final document.
Sharing information and data with the technical team
can be an even more straightforward process. It may be as simple as posting the
location of the new results files to a team wiki before you begin analyzing
them, and then posting links to any charts and graphs that derive from your
analysis.
Report Visually
Most people find that data and statistics reported in
a graphical format are easier to digest. This is especially true of performance
results data, where the volume of data is frequently very large and most
significant findings result from detecting patterns in the data. It is possible
to find these patterns by scanning through tables or by using complex
mathematical algorithms, but the human eye is far quicker and more accurate in
the vast majority of cases.
Once a pattern or “point of interest” has been
identified visually, you will typically want to isolate that pattern by
removing the remaining “chart noise.” In this context, chart noise includes all
of the data points representing activities and time slices that contain no
points of interest (that is, the ones that look the way you expect them to).
Removing the chart noise enables you to evaluate the pattern you are interested
in more clearly and makes the report easier to read.
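To make this concrete, the following is a minimal Python sketch of stripping chart noise before charting. The page names, response-time series, and goal values are hypothetical assumptions invented for illustration, not data from any real test:

```python
# A minimal sketch: strip "chart noise" before charting by keeping only the
# pages that ever exceed their goal. All page names, values, and goals below
# are hypothetical.
response_series = {
    "Home":     [0.8, 0.9, 0.8, 0.9, 0.8],
    "Login":    [1.1, 1.2, 1.1, 1.3, 1.2],
    "Search":   [1.5, 1.7, 2.4, 4.8, 9.6],   # degrades as the test progresses
    "Checkout": [2.0, 2.1, 2.2, 2.3, 2.2],
}
goals = {"Home": 2.0, "Login": 2.0, "Search": 3.0, "Checkout": 3.0}

# Pages whose response times stay within their goal are "chart noise" here;
# only the pages containing a point of interest are kept for the chart.
points_of_interest = {
    page: series
    for page, series in response_series.items()
    if any(value > goals[page] for value in series)
}

print(points_of_interest)   # only "Search" remains to be charted
```

In this sketch only the Search page would be plotted, which isolates the pattern of interest without the visual clutter of the pages that behaved as expected.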
Report Intuitively
Whether formal or informal, reports should be able to
stand on their own. If a report leaves the reader with questions as to why the
information is important, the report has failed. While reports do not need to
provide the answers to issues to be effective, the issues should be quickly and
intuitively clear from the presentation.
One method to validate the intuitiveness of a report
is to remove all labels or identifiers from charts and graphs and all
identifying information from narratives and then present the report to someone
unfamiliar with the project. If that person is able to quickly and correctly
point to the issue of concern in the chart or graph, or identify why the issue
discussed in the narrative is relevant, then you have created an intuitive
report.
Use the Right Statistics
Although a solid grasp of many statistical concepts is widely needed, many
software developers, testers, and managers either lack strong backgrounds in
statistics or simply do not enjoy the subject. This can
lead to significant misrepresentations of performance test results when
reporting. If you are not sure what statistics to use to highlight a particular
issue, do not hesitate to ask for assistance.
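As a small illustration of why the choice of statistic matters, the following Python sketch compares the mean, median, and two percentiles of a hypothetical, skewed set of response times. The values are invented, and the nearest-rank percentile shown is only one of several common percentile definitions:

```python
import math
import statistics

def percentile(values, pct):
    # Nearest-rank percentile: the sample at or below which pct percent of values fall.
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]

# Hypothetical response times (seconds): most requests are fast, one in ten is very slow.
response_times = [0.8] * 90 + [9.0] * 10

print(f"mean:            {statistics.mean(response_times):.2f} s")   # 1.62 s
print(f"median:          {statistics.median(response_times):.2f} s") # 0.80 s
print(f"90th percentile: {percentile(response_times, 90):.2f} s")    # 0.80 s
print(f"95th percentile: {percentile(response_times, 95):.2f} s")    # 9.00 s
```

Reporting only the mean here would understate how slow the slow requests are, while reporting only the median or 90th percentile would hide them entirely; which statistic (or combination of statistics) is appropriate depends on the goal being measured.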
Consolidate Data Correctly
While it is not strictly necessary to consolidate
results, it tends to be much easier to demonstrate patterns in results when
those results are consolidated into one or two graphs rather than distributed
over dozens. That said, it is important to remember that only results from
identical test executions that are statistically similar can be consolidated
into performance report output tables and charts.
Additional Considerations
In order for results to be consolidated, both the test
and the test environment must be identical, and the test results must be
statistically equivalent. One approach to determining whether results are similar
enough to be consolidated is to compare results from at least five test
executions and apply the following rules (a minimal scripted check of these rules is sketched after the list):
If more than 20 percent (or
one out of five) of the test execution results appear not to be similar
to the rest, something is generally wrong with the test environment, the
application, or the test itself.
If a 95th percentile value
for any test execution is greater than the maximum or less than the minimum
value for any of the other test executions, it is not statistically similar.
If every page/timer result
in a test execution is noticeably higher or lower on the chart than the results
of all the rest of the test executions, it is not statistically similar.
If a single page/timer
result in a test execution is noticeably higher or lower on the chart than all
the rest of the test execution results, but the results for all the rest of the
pages/timers in that test execution are not, the test executions are probably
statistically similar.
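The following is a minimal Python sketch of one way to script the checks above. The sample data are invented, and the 95th-percentile rule is interpreted here as comparing each run's 95th percentile against the combined minimum and maximum of all other runs; confirm the interpretation your team uses before adopting it:

```python
import math

def percentile(values, pct):
    # Nearest-rank percentile.
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]

# Hypothetical response-time samples (seconds) from five identical test executions.
executions = {
    "run1": [1.1, 1.2, 1.3, 1.2, 1.4, 1.3],
    "run2": [1.0, 1.2, 1.2, 1.3, 1.3, 1.4],
    "run3": [1.2, 1.1, 1.3, 1.4, 1.2, 1.3],
    "run4": [1.1, 1.3, 1.2, 1.2, 1.4, 1.3],
    "run5": [2.6, 2.8, 2.9, 2.7, 3.0, 2.8],   # suspect run
}

flagged = []
for name, samples in executions.items():
    p95 = percentile(samples, 95)
    other_values = [v for n, s in executions.items() if n != name for v in s]
    if p95 > max(other_values) or p95 < min(other_values):
        flagged.append(name)

print("Runs that are not statistically similar:", flagged)
if len(flagged) / len(executions) >= 0.20:   # the "one out of five" threshold above
    print("Check the test environment, the application, or the test itself.")
```

In this example run5 is flagged and, because that is one run out of five, the script also prints the warning to investigate the environment, the application, or the test itself before consolidating.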
Summarize Data Effectively
Summarizing results often makes it much easier to
demonstrate meaningful patterns in the test results. Summary charts and tables
present data from different test executions side by side so that trends and
patterns are easy to identify. The overall point of these tables and charts is
to show team members how the test results compare to the performance goals of
the system so they can make important decisions about taking the system live,
upgrading the system, or even, in some cases, completely reevaluating the
project.
Additional Considerations
Keep the following key points in mind when summarizing
test data:
Use charts and tables that
make your findings clear.
Use text to supplement
tables and charts, not the other way around.
If a chart or table is
confusing to the reader, don’t use it.
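As an illustration of the side-by-side summaries described above, the following Python sketch prints a simple comparison table of 95th-percentile response times per page across several test executions and flags values that miss their goals. All build names, page names, goals, and values are hypothetical:

```python
# A minimal sketch of a side-by-side summary table, assuming hypothetical
# 95th-percentile response times (seconds) already computed per page per run.
goals = {"Home": 2.0, "Search": 3.0, "Checkout": 5.0}
runs = {
    "Build 41": {"Home": 1.2, "Search": 2.4, "Checkout": 4.1},
    "Build 42": {"Home": 1.3, "Search": 2.9, "Checkout": 4.6},
    "Build 43": {"Home": 1.2, "Search": 3.8, "Checkout": 4.8},
}

print(f"{'Page':<10}{'Goal':>8}" + "".join(f"{run:>12}" for run in runs))
for page, goal in goals.items():
    row = f"{page:<10}{goal:>8.1f}"
    for results in runs.values():
        value = results[page]
        flag = "*" if value > goal else " "   # flag values that miss the goal
        row += f"{value:>11.1f}{flag}"
    print(row)
print("* exceeds the performance goal")
```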
Customize Reports for the Intended Audience
Performance test results are most commonly read by one
of three audiences: technical team members, non-technical team members, and
stakeholders outside of the core team. These three groups tend to look for very
different things in a performance report and are inclined to prefer different
presentation methods. When reporting, make sure that you identify which group
or groups you are reporting to and what their expectations are before deciding
on the best way to present the results you have collected.
Use Concise Verbal Summaries
Results should have at least a short verbal summary
associated with them, and some results are best or most easily presented in
writing alone. What you decide to include in that text depends entirely on your
intended audience. Some audiences may require just one or two sentences
capturing the key point(s) you are trying to make with the graphic. For
example:
“From observing this graph, you can see that the
system under test meets all stated performance goals up to 150 hourly users but
at that point degrades quickly to an essentially unusable state.”
Other audiences may also require a detailed
explanation of the graph being presented. For example:
“In this graph, you see the average response time in
seconds, portrayed vertically on the left side of the graph, plotted against
the total number of hourly users simulated during each test execution,
portrayed horizontally along the bottom of the graph. The intersection points
depict …”
Make the Data Available
There is a disturbingly popular belief that
performance testing (or other testing) data should not be shared in its raw
form out of fear that the consumers of that data will use or analyze it improperly.
While this concern is not without merit, of much greater concern is the fact that it
is simply not reasonable to expect any one person or team to be able to extract
all of the value from a set of data at one point in time. Data provides
different value to different people at different times, and the only way to get
the most out of the data is to make that data continually available to the
team. Additionally, making the data available tends to minimize some people’s
perception that the performance results are simply fabrications based on a set
of tools and processes that they do not understand.
Frequently Reported Performance Data
The following are the most frequently reported types
of results data. The sections that follow describe who finds each type of data
valuable and why, as well as considerations for reporting that type of data.
End-user response times
Resource utilizations
Volumes, capacities, and
rates
Component response times
Trends
End-user Response Times
End-user response time is by far the most commonly
requested and reported metric in performance testing. If you have captured
goals and requirements effectively, this is a measure of presumed user
satisfaction with the performance characteristics of the system or application.
Stakeholders are interested in end-user response times to judge the degree to
which users will be satisfied with the application. Technical team members are
interested because they want to know if they are achieving the overall
performance goals from a user’s perspective and, if not, in what areas those
goals are not being met.
Exemplar 1
Figure 16.1 Response Time
Exemplar 2
Figure 16.2 Response Time Degradation
Considerations
Even though end-user response times are the most
commonly reported performance-testing metric, there are still important points
to consider.
Eliminate
outliers before reporting. Even one
legitimate outlier can dramatically skew your results (a simple screening sketch follows this list).
Ensure that the
statistics are clearly communicated. The difference between an average and a 90th percentile, for example,
can easily be the difference between “ship it” and “fix it.”
Report
abandonment separately. If you are
accounting for user abandonment, the collected response times for abandoned
pages may not represent the same activity as non-abandoned pages. To be safe,
report response times for non-abandoned pages with an end-user response time
graph and response times and abandonment percentages by page on a separate
graph or table.
Report every
page or transaction separately. Even
though some pages may appear to represent an equivalence class, there could be
differences that you are unaware of.
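The following is a minimal Python sketch of screening outliers before reporting, using a three-standard-deviation cutoff on hypothetical response-time samples. The cutoff rule and the data are illustrative assumptions; other screening rules are equally valid, and any screened-out values should still be investigated and reported separately:

```python
import statistics

# A minimal sketch of outlier screening before reporting, using a
# three-standard-deviation cutoff on hypothetical response-time samples (seconds).
samples = [1.1, 1.2, 1.3] * 7 + [38.0]   # one extreme outlier

mean = statistics.mean(samples)
cutoff = mean + 3 * statistics.stdev(samples)

kept = [s for s in samples if s <= cutoff]
removed = [s for s in samples if s > cutoff]

print(f"mean with outlier:    {mean:.2f} s")
print(f"mean without outlier: {statistics.mean(kept):.2f} s")
print(f"outliers removed:     {removed}  (investigate and report these separately)")
```

The contrast between the two means shows how a single legitimate outlier can skew a reported average, which is exactly why outliers should be identified, reported, and explained rather than silently included or discarded.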
Resource Utilizations
Resource utilizations are the second most requested
and reported metrics in performance testing. Most frequently, resource
utilization metrics are reported verbally or in a narrative fashion. For
example, “The CPU utilization of the application server never exceeded 45
percent. The target is to stay below 70 percent.” It is generally valuable to
report resource utilizations graphically when there is an issue to be
communicated.
Exemplar for Stakeholders
Figure 16.3 Processor Time
Exemplar for Technical Team Members
Figure 16.4 Processor Time and Queue
Additional Considerations
Points to consider when reporting resource
utilizations include:
Know when to
report all of the data and when to summarize. Very often, simply reporting the peak value for a monitored resource
during the course of a test is adequate. Unless an issue is detected, the
report only needs to demonstrate that the correct metrics were collected and
that they would have revealed an issue had one been present during the test.
Overlay resource
utilization metrics with other load and response data. Resource utilization metrics are most powerful when
presented on the same graph as load and/or response time data (see the overlay sketch after this list). If there is a
performance issue, this helps to identify relationships across various metrics.
If you decide to
present more than one data point, present them all. Resource utilization rates will often change
dramatically from one measurement to the next. The pattern of change across
measurements is at least as important as the current value. Moving averages and
trend lines obfuscate these patterns, which can lead to incorrect assumptions
and regrettable decisions.
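As a sketch of the overlay technique described in the list above, the following Python example plots response-time and CPU-utilization measurements from a single test run on a shared time axis. The use of the third-party matplotlib library and all of the data values are assumptions made for illustration:

```python
import matplotlib.pyplot as plt

# Hypothetical per-minute measurements collected during one test run.
minutes         = [1, 2, 3, 4, 5, 6, 7, 8]
response_time_s = [1.1, 1.2, 1.3, 1.5, 2.2, 3.8, 6.5, 9.0]
cpu_percent     = [22, 28, 35, 48, 63, 78, 91, 97]

fig, ax_rt = plt.subplots()
ax_cpu = ax_rt.twinx()   # second y-axis sharing the same x-axis

ax_rt.plot(minutes, response_time_s, marker="o", color="tab:blue",
           label="Response time (s)")
ax_cpu.plot(minutes, cpu_percent, marker="s", color="tab:red",
            label="App server CPU (%)")

ax_rt.set_xlabel("Elapsed test time (minutes)")
ax_rt.set_ylabel("Response time (s)")
ax_cpu.set_ylabel("CPU utilization (%)")
fig.legend(loc="upper left")
fig.suptitle("Response time overlaid with application server CPU")
plt.show()
```

Plotting every measured point, rather than a moving average, preserves the pattern of change that the considerations above warn against obscuring.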
Volumes, Capacities, and Rates
Volume, capacity, and rate metrics are also frequently
requested by stakeholders, even though the implications of these metrics are
often more challenging to interpret. For this reason, it is important to report
these metrics in relation to specific performance criteria or a specific
performance issue. Some examples of commonly requested volume, capacity, and
rate metrics include:
Bandwidth consumed
Throughput
Transactions per second
Hits per second
Number of supported
registered users
Number of records/items able
to be stored in the database
Exemplar
Figure 16.5 Throughput
Additional Considerations
Points to consider when reporting volumes, capacities,
and rates include:
Report metrics
in context. Volume, capacity, and rate
metrics typically have little stand-alone value.
Have test
conditions and supporting data available. While this is a good idea in general, it is particularly important with
volume, capacity, and rate metrics.
Include narrative
summaries with implications. Again, while
this is a good idea in general, it is all but essential for ensuring that
volume, capacity, and rate metrics are understood.
Component Response Times
Even though component response times are not reported
to stakeholders as commonly as end-user response times or resource utilization
metrics, they are frequently collected and shared with the technical team.
These response times help developers, architects, database administrators
(DBAs), and administrators determine which part or parts of the system are
responsible for the majority of end-user response time.
Exemplar
Figure 16.6 Sequential Consecutive Database Updates
Additional Considerations
Points to consider when reporting component response
times include:
Relate component
response times to end-user activities. Because it is not always obvious what end-user activities are impacted
by a component’s response time, it is a good idea to include those
relationships in your report.
Explain the
degree to which the component response time matters. Sometimes the concern is that a component might
become a bottleneck under load because it is processing too slowly; at other
times, the concern is that end-user response times are noticeably degraded as a
result of the component. Knowing which of these conditions applies to your
project enables you to make effective decisions.
Trends
Trends are one of the most powerful but
least-frequently used data-reporting methods. Trends can show whether
performance is improving or degrading from build to build, or the rate of
degradation as load increases. Trends can help technical team members quickly
understand whether the changes they recently made achieved the desired
performance impact.
Exemplar
Figure 16.7 Response Time Trends for Key Pages
Additional Considerations
Points to consider when reporting trends include:
Trends typically
do not add value until there are at least three measurements. Sometimes trends cannot be effectively detected until
there are more than three measurements. Start creating your trend charts with
the first set of data, but be cautious about including them in formal reports
until you have collected enough data for there to be an actual trend to report
(a minimal build-over-build trend table is sketched after this list).
Share trends
with the technical team before including them in formal reports. This is another generally good practice, but it is
particularly relevant to trends because developers, architects, administrators,
and DBAs often will have already backed out a change that caused the trend to
move in the wrong direction before you are able to compile your report. In
this case, you can decide that the trend report is not worth including, or you
can simply make an annotation describing the cause and stating that the issue
has already been resolved.
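The following is a minimal Python sketch of a build-over-build trend table for key pages, using hypothetical 95th-percentile response times. The builds, pages, and values are invented for illustration:

```python
# A minimal sketch of a build-over-build trend table for key pages, using
# hypothetical 95th-percentile response times (seconds).
builds = ["Build 41", "Build 42", "Build 43", "Build 44"]
p95_by_page = {
    "Home":     [1.2, 1.2, 1.3, 1.2],
    "Search":   [2.4, 2.9, 3.8, 4.6],   # degrading build over build
    "Checkout": [4.1, 4.0, 3.7, 3.5],   # improving build over build
}

print(f"{'Page':<10}" + "".join(f"{b:>10}" for b in builds) + f"{'Change':>10}")
for page, series in p95_by_page.items():
    change = (series[-1] - series[0]) / series[0] * 100   # percent change, first to last build
    print(f"{page:<10}" + "".join(f"{v:>10.1f}" for v in series) + f"{change:>+9.0f}%")
```

A table like this makes it immediately clear which pages are improving and which are degrading, and by how much, without requiring the reader to compare separate reports.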
Questions to Be Answered By Reporting
Almost every team member has unique wants, needs, and
expectations when it comes to reporting data and results obtained through
performance testing. While this makes sharing information obtained through
performance testing challenging, knowing what various team members expect and
value in advance makes providing valuable information to the right people, at
the right level of detail, and at the right time, much easier.
All Roles
Some questions that are commonly posed by team members
include:
Is performance
getting better or worse?
Have we met the
requirements/service level agreements (SLAs)?
What reports are
available?
How frequently
can I get reports?
Can I get a
report with more/less detail?
Executive Stakeholders
Executive stakeholders tend to have very specific
reporting needs and expectations that are often quite different from those of
other team members. Stakeholders tend to prefer information in small,
digestible chunks that clearly highlight the key points. Additionally,
stakeholders like visual representations of data that are intuitive at a
glance, as well as “sound bite”–size explanations of those visual
representations. Finally, stakeholders tend to prefer consolidated and
summarized information delivered somewhat less frequently than other team
members receive it. The following are common questions that
executive stakeholders want performance test reports to answer:
Is this ready to
ship?
How will these
results relate to production?
How much
confidence should I have in these results?
What needs to be
done to get this ready to ship?
Is the
performance testing proceeding as anticipated?
Is the
performance testing adding value?
Project-Level Managers
Project-level managers — including the project
manager, development lead or manager, and the test lead or manager — have all
of the same needs and questions as the executive stakeholders, except that they
want the answers more frequently and in more detail. Additionally, they
commonly want to know the following:
Are performance
issues being detected efficiently?
Are performance
issues being resolved efficiently?
What performance
testing should we be conducting that we currently are not?
What performance
testing are we currently doing that is not adding value?
Are there
currently any blockers? If so, what are they?
Technical Team Members
Although technical team members have some degree of
interest in all of the questions posed by managers and stakeholders, they are
more interested in receiving a continual flow of information related to test
results, monitored data, observations, and opportunities for analysis and
improvement. Technical team members tend to want to know the following:
What do these
results mean to my specialty/focus area?
Where can I go
to see the results for the last test?
Where can I go
to get the raw data?
Can you capture
metric X during the next test run?
Types of Results Sharing
In the most basic sense, there are three distinct
types of results sharing: raw data display, technical reports, and stakeholder reports.
While all are based on timely, accurate, and relevant communication of results,
observations, concerns, and recommendations, each type targets a different
audience, and the most effective methods of communicating data differ
dramatically.
Raw Data Display
Although sharing raw data for collaboration purposes is not strictly a
reporting scenario, applying the same principles of data presentation that are
applied to reports makes that collaboration more effective.
In general, most people would rather view data and
statistics in graphical form instead of in tables. In some cases, however,
tables are the most efficient way to show calculated results for all of the
data. It is recommended that you use tables sparingly in reports, while
including the tabular form of the data used to create charts and graphs as an
appendix or attachment to a report, so that interested stakeholders can refer
to it.
Results from the following types of tests can be well
represented in a tabular format:
Baseline
Benchmark
Scalability
Any other
user-experience–based test
Tables are an excellent way to present volumes of data
in a clean and orderly manner and to support the findings they ultimately lead
to. However, you should be careful not to overuse tables in your reports. Many
people quickly skip over tables and read only the surrounding text or examine
only the charts that go with them. Whichever types of tables you use, be
certain to present in your report only those that clearly make an important
point. Huge tables containing all of the
supporting data may be of interest to a few individuals, but not to most, and
thus should be included only in an appendix to a report. Raw data is most
commonly shared in the following formats:
Spreadsheets
Text files (and regular
expression searches; see the sketch after this list)
Data collection tools
(“canned” reports)
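As an illustration of the regular-expression searches mentioned above, the following Python sketch scans a raw results file for slow requests. The file name and line format are hypothetical assumptions; adjust the pattern to match the actual output of your load-generation or monitoring tool:

```python
import re

# A minimal sketch of a regular-expression search over a raw results file.
# The file name and line format below are hypothetical, for example:
#   2016-05-01 10:15:02  page=Checkout  response_ms=4812
pattern = re.compile(r"page=(?P<page>\S+)\s+response_ms=(?P<ms>\d+)")
threshold_ms = 3000

slow_requests = []
with open("raw_results.txt") as results:
    for line in results:
        match = pattern.search(line)
        if match and int(match.group("ms")) > threshold_ms:
            slow_requests.append((match.group("page"), int(match.group("ms"))))

for page, ms in slow_requests:
    print(f"{page}: {ms} ms")
```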
Technical Reports
Technical reports are generally more formal than raw
data display, but not excessively so. Technical reports should stand on their
own, but since they are intended for technical members of the team who are
currently working on the project, they do not need to contain all of the
supplemental detail that a stakeholder report normally does. In the simplest
sense, technical reports are made up of the following:
Description of the test,
including workload model and test environment
Easily digestible data with
minimal pre-processing
Access to the complete data
set and test conditions
Short statements of
observations, concerns, questions, and requests for collaboration
Technical reports most commonly include data in the
following formats:
Scatter plots
Pareto charts
Trend charts
Summary spreadsheets
Stakeholder Reports
Stakeholder reports are the most formal of the
performance data sharing formats. These reports must be able to stand alone
while at the same time being intuitive to someone who is not working on the
project in a day-to-day technical role. Typically, these reports center on
acceptance criteria and risks. To be effective, stakeholder reports typically
need to include:
The acceptance criteria to
which the results relate
Intuitive, visual
representations of the most relevant data
A brief verbal summary of
the chart or graph in terms of criteria
Intuitive, visual
representations of the workload model and test environment
Access to associated
technical reports, complete data sets, and test conditions
A summary of observations,
concerns, and recommendations
When preparing stakeholder reports, consider that most
stakeholder reports meet with one (or more) of the following three reactions.
All three are positive in their own way, even if they may not seem so at
first. These reactions and some recommended responses follow:
“These are
great, but where’s the supporting data?” This is the most common response from a technical stakeholder. Many
people and organizations want to have all of the data so that they can draw
their own conclusions. Fortunately, this is an easy question to handle: simply
include the entire spreadsheet with this supporting data as an appendix to the
report.
“Very pretty,
but what do they mean?” This is where
text explanations are useful. People who are not familiar with performance
testing or performance results often need to have the implications of the
results spelled out for them. Remember that more than 90 percent of the time,
performance testers are the bearers of bad news that the stakeholder was not
expecting. The tester has the responsibility to ensure that the stakeholder has
confidence in the findings and to present this news in a constructive
manner.
“Terrific! This
is exactly what I wanted! Don’t worry about the final report — these will do
nicely.” While this seems like a blessing, do not take
it as one. Sooner or later, your tables and charts will be presented to someone
who asks one of the two preceding questions, or worse, asks how the data was
obtained. If there is not at least a final report that tells people where to
find the rest of the data, people will question the results because you are not
present to answer those questions.
Creating a Technical Report
Although six key components of a technical report are
listed below, not all six will be appropriate for every technical report.
Similarly, there may be additional information that should be included based on
exactly what message you are trying to convey with the report. While these six
components will result in successful technical reports most of the time,
remember that sometimes creativity is needed to make your message clear and
intuitive.
Consider including the following key components when
preparing a technical report:
A results graph
A table for single-instance
measurements (e.g., maximum throughput achieved)
Workload model (graphic)
Test environment (annotated
graphic)
Short statements of
observations, concerns, questions, and requests for collaboration
References section
Exemplar Results Graph
Figure 16.8 Consolidated Statistics
Exemplar Tables for Single-Instance Measurements
Figure 16.9 Single Instance Measurements
Exemplar Workload Model Graphic
Figure 16.10 Workload Model
Exemplar Test Environment Graphic
Figure 16.11 Test Environment
Exemplar Summary Statement
“The results graph shows both response times and
resource utilization together. Close examination shows that Application Server
CPU Usage and queue length coincide with significantly degraded response time.
It appears as if the application server CPU usage was the catalyst for the
degradation, but this has yet to be confirmed. The remaining charts and graphs
are included as supplemental information for easy reference.”
Exemplar References Section
“Raw data and additional supporting information are
checked into the version-control system with the build and tagged as
PerfTest-{date}-{issue number}.”
Creating a Stakeholder Report
Although eight key components of a stakeholder report
are listed below, not all eight will be appropriate for every stakeholder
report. Similarly, there may be additional information that should be included
based on exactly what message you are trying to convey with the report. While
these eight components will result in successful stakeholder reports most of
the time, remember that sometimes creativity is needed to make your message
clear and intuitive.
Consider including the following key components when
preparing a stakeholder report:
Criteria to which the
results relate
A results graph
A table for single-instance
measurements (e.g., maximum throughput achieved)
A brief verbal summary of
the chart or graph in terms of criteria
Workload model (graphic)
Test environment (annotated
graphic)
Summary of observations,
concerns, and recommendations
References section
Exemplar Criteria Statement
“This report relates to end-user response time
compliance as documented in the requirements management system as requirements
Perf### through Perf??? at one-half of the expected peak load with the most
commonly expected usage scenario.”
Exemplar Results Graph
Figure 16.12 Response Time Compliance Summary
Exemplar Tables for Single-Instance Measurements
Figure 16.13 Single Instance Measurements
Exemplar Criteria-Based Results Summary
“All metrics collected achieved their required values
except for the response times of pages 8 and 10.
Page 10 failed
to achieve its required value by 2 percent.
Page 8 failed to
achieve its required value by 38 percent.”
Exemplar Workload Model Graphic
Figure 16.14 Workload Model
Exemplar Test Environment Graphic
Figure 16.15 Test Environment
Exemplar Observations and Recommendations Statement
“Based on the test conditions and results, the
performance testing and tuning team recommends the following.
1. Continue performance testing with increasingly
strenuous scenarios and loads.
2. Priority should be given to determining the root cause
of pages 8 and 10 not achieving their acceptance criteria, and subsequently
tuning those root causes.
3. At such time as additional pages demonstrate a failure
to achieve their acceptance criteria, a dedicated root cause and tuning cycle
should be undertaken.”
Summary
Performance test reporting is the process of
presenting results data that will support key technological and business
decisions. The key to creating effective reports is to consider the audience of
the data before determining how best to present the data. The most effective
performance-test results will present analysis, comparisons, and details behind
how the results were obtained, and will influence critical business
decision-making.