unison/doc/testing/testing-framework.texi

@c ========================================================================
@c Testing framework
@c ========================================================================

@unnumbered Part I:  Testing

@node TestingFramework
@chapter Testing Framework

ns-3 consists of a simulation core engine, a set of models, example programs,
and tests.  Over time, new contributors contribute models, tests, and
examples.  A Python test program @samp{test.py} serves as the test
execution manager; @code{test.py} can run test code and examples to
look for regressions, can output the results into a number of forms, and
can manage code coverage analysis tools.  On top of this, we layer
@samp{Buildbots} that are automated build robots that perform
robustness testing by running the test framework on different systems
and with different configuration options.

@cartouche
Insert figure showing the components here
@end cartouche

@node BuildBots
@section Buildbots

At the highest level of ns-3 testing are the buildbots (build robots).
If you are unfamiliar with
this system look at @uref{http://djmitche.github.com/buildbot/docs/0.7.11/}.
This is an open-source automated system that allows @command{ns-3} to be rebuilt
and tested each time something has changed.  By running the buildbots on a number
of different systems we can ensure that @command{ns-3} builds and executes
properly on all of its supported systems.

Users (and developers) typically will not interact with the buildbot system other
than to read its messages regarding test results.  If a failure is detected in
one of the automated build and test jobs, the buildbot will send an email to the
@emph{ns-developers} mailing list.  This email will look something like:

@smallformat
@verbatim
  The Buildbot has detected a new failure of osx-ppc-g++-4.2 on NsNam.
  Full details are available at:
   http://ns-regression.ee.washington.edu:8010/builders/osx-ppc-g%2B%2B-4.2/builds/0

  Buildbot URL: http://ns-regression.ee.washington.edu:8010/

  Buildslave for this Build: darwin-ppc

  Build Reason: The web-page 'force build' button was pressed by 'ww': ww

  Build Source Stamp: HEAD
  Blamelist:

  BUILD FAILED: failed shell_5 shell_6 shell_7 shell_8 shell_9 shell_10 shell_11 shell_12

  sincerely,
  -The Buildbot
@end verbatim
@end smallformat

In the full details URL shown in the email, one can search for the keyword
@code{failed} and select the @code{stdio} link for the corresponding step to see
the reason for the failure.

The buildbot will do its job quietly if there are no errors, and the system will
undergo build and test cycles every day to verify that all is well.

@node Testpy
@section Test.py
The buildbots use a Python program, @command{test.py}, that is responsible for
running all of the tests and collecting the resulting reports into a human-
readable form.  This program is also available for use by users and developers
as well.

@command{test.py} is very flexible in allowing the user to specify the number
and kind of tests to run; and also the amount and kind of output to generate.

By default, @command{test.py} will run all available tests and report status
back in a very concise form.  Running the command,

@verbatim
  ./test.py
@end verbatim

will result in a number of @code{PASS}, @code{FAIL}, @code{CRASH} or @code{SKIP}
indications followed by the kind of test that was run and its display name.

@smallformat
@verbatim
  Waf: Entering directory `/home/craigdo/repos/ns-3-allinone-test/ns-3-dev/build'
  Waf: Leaving directory `/home/craigdo/repos/ns-3-allinone-test/ns-3-dev/build'
  'build' finished successfully (0.939s)
  FAIL: TestSuite ns3-wifi-propagation-loss-models
  PASS: TestSuite object-name-service
  PASS: TestSuite pcap-file-object
  PASS: TestSuite ns3-tcp-cwnd
  ...

  PASS: TestSuite ns3-tcp-interoperability
  PASS: Example csma-broadcast
  PASS: Example csma-multicast
@end verbatim
@end smallformat

This mode is indented to be used by users who are interested in determining if
their distribution is working correctly, and by developers who are interested
in determining if changes they have made have caused any regressions.

There are a number of options available to control the behavior of @code{test.py}.
if you run @code{test.py --help} you should see a command summary like:

@verbatim
  Usage: test.py [options]

  Options:
    -h, --help            show this help message and exit
    -c KIND, --constrain=KIND
                          constrain the test-runner by kind of test
    -e EXAMPLE, --example=EXAMPLE
                          specify a single example to run
    -g, --grind           run the test suites and examples using valgrind
    -k, --kinds           print the kinds of tests available
    -l, --list            print the list of known tests
    -m, --multiple        report multiple failures from test suites and test
                          cases
    -n, --nowaf           do not run waf before starting testing
    -s TEST-SUITE, --suite=TEST-SUITE
                          specify a single test suite to run
    -v, --verbose         print progress and informational messages
    -w HTML-FILE, --web=HTML-FILE, --html=HTML-FILE
                          write detailed test results into HTML-FILE.html
    -r, --retain          retain all temporary files (which are normally
                          deleted)
    -t TEXT-FILE, --text=TEXT-FILE
                          write detailed test results into TEXT-FILE.txt
    -x XML-FILE, --xml=XML-FILE
                          write detailed test results into XML-FILE.xml
@end verbatim

If one specifies an optional output style, one can generate detailed descriptions
of the tests and status.  Available styles are @command{text} and @command{HTML}.
The buildbots will select the HTML option to generate HTML test reports for the
nightly builds using,

@verbatim
  ./test.py --html=nightly.html
@end verbatim

In this case, an HTML file named ``nightly.html'' would be created with a pretty
summary of the testing done.  A ``human readable'' format is available for users
interested in the details.

@verbatim
  ./test.py --text=results.txt
@end verbatim

In the example above, the test suite checking the @command{ns-3} wireless
device propagation loss models failed.  By default no further information is
provided.

To further explore the failure, @command{test.py} allows a single test suite
to be specified.  Running the command,

@verbatim
  ./test.py --suite=ns3-wifi-propagation-loss-models
@end verbatim

results in that single test suite being run.

@verbatim
  FAIL: TestSuite ns3-wifi-propagation-loss-models
@end verbatim

To find detailed information regarding the failure, one must specify the kind
of output desired.  For example, most people will probably be interested in
a text file:

@verbatim
  ./test.py --suite=ns3-wifi-propagation-loss-models --text=results.txt
@end verbatim

This will result in that single test suite being run with the test status written to
the file ``results.txt''.

You should find something similar to the following in that file:

@smallformat
@verbatim
FAIL: Test Suite ``ns3-wifi-propagation-loss-models'' (real 0.02 user 0.01 system 0.00)
  PASS: Test Case "Check ... Friis ... model ..." (real 0.01 user 0.00 system 0.00)
  FAIL: Test Case "Check ... Log Distance ... model" (real 0.01 user 0.01 system 0.00)
    Details:
      Message:   Got unexpected SNR value
      Condition: [long description of what actually failed]
      Actual:    176.395
      Limit:     176.407 +- 0.0005
      File:      ../src/test/ns3wifi/propagation-loss-models-test-suite.cc
      Line:      360
@end verbatim
@end smallformat

Notice that the Test Suite is composed of two Test Cases.  The first test case
checked the Friis propagation loss model and passed.  The second test case
failed checking the Log Distance propagation model.  In this case, an SNR of
176.395 was found, and the test expected a value of 176.407 correct to three
decimal places.  The file which implemented the failing test is listed as well
as the line of code which triggered the failure.

If you desire, you could just as easily have written an HTML file using the
@code{--html} option as described above.

Typically a user will run all tests at least once after downloading
@command{ns-3} to ensure that his or her environment has been built correctly
and is generating correct results according to the test suites.  Developers
will typically run the test suites before and after making a change to ensure
that they have not introduced a regression with their changes.  In this case,
developers may not want to run all tests, but only a subset.  For example,
the developer might only want to run the unit tests periodically while making
changes to a repository.  In this case, @code{test.py} can be told to constrain
the types of tests being run to a particular class of tests.  The following
command will result in only the unit tests being run:

@verbatim
  ./test.py --constrain=unit
@end verbatim

Similarly, the following command will result in only the example smoke tests
being run:

@verbatim
  ./test.py --constrain=unit
@end verbatim

To see a quick list of the legal kinds of constraints, you can ask for them
to be listed.  The following command

@verbatim
  ./test.py --kinds
@end verbatim

will result in the following list being displayed:

@smallformat
@verbatim
  Waf: Entering directory `/home/craigdo/repos/ns-3-allinone-test/ns-3-dev/build'
  Waf: Leaving directory `/home/craigdo/repos/ns-3-allinone-test/ns-3-dev/build'
  'build' finished successfully (0.939s)Waf: Entering directory `/home/craigdo/repos/ns-3-allinone-test/ns-3-dev/build'
  bvt:         Build Verification Tests (to see if build completed successfully)
  core:        Run all TestSuite-based tests (exclude examples)
  example:     Examples (to see if example programs run successfully)
  performance: Performance Tests (check to see if the system is as fast as expected)
  system:      System Tests (spans modules to check integration of modules)
  unit:        Unit Tests (within modules to check basic functionality)
@end verbatim
@end smallformat

Any of these kinds of test can be provided as a constraint using the @code{--constraint}
option.

To see a quick list of all of the test suites available, you can ask for them
to be listed.  The following command,

@verbatim
  ./test.py --list
@end verbatim

will result in a list of the test suite being displayed, similar to :

@smallformat
@verbatim
  Waf: Entering directory `/home/craigdo/repos/ns-3-allinone-test/ns-3-dev/build'
  Waf: Leaving directory `/home/craigdo/repos/ns-3-allinone-test/ns-3-dev/build'
  'build' finished successfully (0.939s)
  histogram
  ns3-wifi-interference
  ns3-tcp-cwnd
  ns3-tcp-interoperability
  sample
  devices-mesh-flame
  devices-mesh-dot11s
  devices-mesh
  ...
  object-name-service
  callback
  attributes
  config
  global-value
  command-line
  basic-random-number
  object
@end verbatim
@end smallformat

Any of these listed suites can be selected to be run by itself using the
@code{--suite} option as shown above.

Similarly to test suites, one can run a single example program using the @code{--example}
option.

@verbatim
  ./test.py --example=udp-echo
@end verbatim

results in that single example being run.

@verbatim
  PASS: Example udp-echo
@end verbatim

Normally when example programs are executed, they write a large amount of trace
file data.  This is normally saved to the base directory of the distribution
(e.g., /home/user/ns-3-dev).  When @command{test.py} runs an example, it really
is completely unconcerned with the trace files.  It just wants to to determine
if the example can be built and run without error.  Since this is the case, the
trace files are written into a @code{/tmp/unchecked-traces} directory.  If you
run the above example, you should be able to find the associated
@code{udp-echo.tr} and @code{udp-echo-n-1.pcap} files there.

The list of available examples is defined by the contents of the ``examples''
directory in the distribution.  If you select an example for execution using
the @code{--example} option, @code{test.py} will not make any attempt to decide
if the example has been configured or not, it will just try to run it and
report the result of the attempt.

When @command{test.py} runs, by default it will first ensure that the system has
been completely built.  This can be defeated by selecting the @code{--nowaf}
option.

@verbatim
  ./test.py --list --nowaf
@end verbatim

will result in a list of the currently built test suites being displayed, similar to :

@verbatim
  ns3-wifi-propagation-loss-models
  ns3-tcp-cwnd
  ns3-tcp-interoperability
  pcap-file-object
  object-name-service
  random-number-generators
@end verbatim

Note the absence of the @command{Waf} build messages.

@code{test.py} also supports running the test suites and examples under valgrind.
Valgrind is a flexible program for debugging and profiling Linux executables.  By
default, valgrind runs a tool called memcheck, which performs a range of memory-
checking functions, including detecting accesses to uninitialised memory, misuse
of allocated memory (double frees, access after free, etc.) and detecting memory
leaks.  This can be selected by using the @code{--grind} option.

@verbatim
  ./test.py --grind
@end verbatim

As it runs, @code{test.py} and the programs that it runs indirectly, generate large
numbers of temporary files.  Usually, the content of these files is not interesting,
however in some cases it can be useful (for debugging purposes) to view these files.
@code{test.py} provides a @command{--retain} option which will cause these temporary
files to be kept after the run is completed.  The files are saved in a directory
named @code{testpy} under a subdirectory named according to the current Coordinated
Universal Time (also known as Greenwich Mean Time).

@verbatim
  ./test.py --retain
@end verbatim

Finally, @code{test.py} provides a @command{--verbose} option which will print
large amounts of information about its progress.  It is not expected that this
will be terribly useful unless there is an error.  In this case, you can get
access to the standard output and standard error reported by running test suites
and examples.  Select verbose in the following way:

@verbatim
  ./test.py --verbose
@end verbatim

All of these options can be mixed and matched.  For example, to run all of the
ns-3 core test suites under valgrind, in verbose mode, while generating an HTML
output file, one would do:

@verbatim
  ./test.py --verbose --grind --constrain=core --html=results.html
@end verbatim

@node TestTaxonomy
@section Test Taxonomy

As mentioned above, tests are grouped into a number of broadly defined
classifications to allow users to selectively run tests to address the different
kinds of testing that need to be done.

@itemize @bullet
@item Build Verification Tests
@item Unit Tests
@item System Tests
@item Examples
@item Performance Tests
@end itemize

@node BuildVerificationTests
@subsection Build Verification Tests

These are relatively simple tests that are built along with the distribution
and are used to make sure that the build is pretty much working.  Our
current unit tests live in the source files of the code they test and are
built into the ns-3 modules; and so fit the description of BVTs.  BVTs live
in the same source code that is built into the ns-3 code.  Our current tests
are examples of this kind of test.

@node UnitTests
@subsection Unit Tests

Unit tests are more involved tests that go into detail to make sure that a
piece of code works as advertised in isolation.  There is really no reason
for this kind of test to be built into an ns-3 module.  It turns out, for
example, that the unit tests for the object name service are about the same
size as the object name service code itself.  Unit tests are tests that
check a single bit of functionality that are not built into the ns-3 code,
but live in the same directory as the code it tests.  It is possible that
these tests check integration of multiple implementation files in a module
as well.  The file src/core/names-test-suite.cc is an example of this kind
of test.  The file src/common/pcap-file-test-suite.cc is another example
that uses a known good pcap file as a test vector file.  This file is stored
locally in the src/common directory.

@node SystemTests
@subsection System Tests

System tests are those that involve more than one module in the system.  We
have lots of this kind of test running in our current regression framework,
but they are typically overloaded examples.  We provide a new place
for this kind of test in the directory ``src/test''.  The file
src/test/ns3tcp/ns3-interop-test-suite.cc is an example of this kind of
test.  It uses NSC TCP to test the ns-3 TCP implementation.  Often there
will be test vectors required for this kind of test, and they are stored in
the directory where the test lives.  For example,
ns3tcp-interop-response-vectors.pcap is a file consisting of a number of TCP
headers that are used as the expected responses of the ns-3 TCP under test
to a stimulus generated by the NSC TCP which is used as a ``known good''
implementation.

@node Examples
@subsection Examples

The examples are tested by the framework to make sure they built and will
run.  Nothing is checked, and currently the pcap files are just written off
into /tmp to be discarded.  If the examples run (don't crash) they pass this
smoke test.

@node PerformanceTests
@subsection Performance Tests

Performance tests are those which exercise a particular part of the system
and determine if the tests have executed to completion in a reasonable time.

@node RunningTests
@section Running Tests

Tests are typically run using the high level @code{test.py} program.  They
can also be run ``manually'' using a low level test-runner executable directly
from @code{waf}.

@node RunningTestsUnderTestRunnerExecutable
@section Running Tests Under the Test Runner Executable

The test-runner is the bridge from generic Python code to @command{ns-3} code.
It is written in C++ and uses the automatic test discovery process in the
@command{ns-3} code to find and allow execution of all of the various tests.

Although it may not be used directly very often, it is good to understand how
@code{test.py} actually runs the various tests.

In order to execute the test-runner, you run it like any other ns-3 executable
-- using @code{waf}.  To get a list of available options, you can type:

@verbatim
  ./waf --run "test-runner --help"
@end verbatim

You should see something like the following:

@smallformat
@verbatim
  Waf: Entering directory `/home/craigdo/repos/ns-3-allinone-test/ns-3-dev/build'
  Waf: Leaving directory `/home/craigdo/repos/ns-3-allinone-test/ns-3-dev/build'
  'build' finished successfully (0.353s)
  --basedir=dir:          Set the base directory (where to find src) to ``dir''
  --constrain=test-type:  Constrain checks to test suites of type ``test-type''
  --help:                 Print this message
  --kinds:                List all of the available kinds of tests
  --list:                 List all of the test suites (optionally constrained by test-type)
  --out=file-name:        Set the test status output file to ``file-name''
  --suite=suite-name:     Run the test suite named ``suite-name''
  --verbose:              Turn on messages in the run test suites
@end verbatim
@end smallformat

There are a number of things available to you which will be familiar to you if
you have looked at @command{test.py}.  This should be expected since the test-
runner is just an interface between @code{test.py} and @command{ns-3}.  You
may notice that example-related commands are missing here.  That is because
the examples are really not @command{ns-3} tests.  @command{test.py} runs them
as if they were to present a unified testing environment, but they are really
completely different and not to be found here.

One new option that appears here is the @code{--basedir} option.  It turns out
that the tests may need to reference the source directory of the @code{ns-3}
distribution to find local data, so a base directory is always required to run
a test.  To run one of the tests directly from the test-runner, you will need
to specify the test suite to run along with the base directory.  So you could do,

@verbatim
  ./waf --run "test-runner --basedir=`pwd` --suite=pcap-file-object"
@end verbatim

Note the ``backward'' quotation marks on the @code{pwd} command.  This will run
the @code{pcap-file-object} test quietly.  The only indication that
you will get that the test passed is the @emph{absence} of a message from
@code{waf} saying that the program returned something other than a zero
exit code.  To get some output from the test, you need to specify an output
file to which the tests will write their XML status using the @code{--out}
option.  You need to be careful interpreting the results because the test
suites will @emph{append} results onto this file.  Try,

@smallformat
@verbatim
  ./waf --run "test-runner --basedir=`pwd` --suite=pcap-file-object --out=myfile.xml''
@end verbatim
@end smallformat

If you look at the file @code{myfile.xml} you should see something like,

@smallformat
@verbatim
<TestSuite>
  <SuiteName>pcap-file-object</SuiteName>
  <TestCase>
    <CaseName>Check to see that PcapFile::Open with mode ``w'' works</CaseName>
    <CaseResult>PASS</CaseResult>
    <CaseTime>real 0.00 user 0.00 system 0.00</CaseTime>
  </TestCase>
  <TestCase>
    <CaseName>Check to see that PcapFile::Open with mode ``r'' works</CaseName>
    <CaseResult>PASS</CaseResult>
    <CaseTime>real 0.00 user 0.00 system 0.00</CaseTime>
  </TestCase>
  <TestCase>
    <CaseName>Check to see that PcapFile::Open with mode ``a'' works</CaseName>
    <CaseResult>PASS</CaseResult>
    <CaseTime>real 0.00 user 0.00 system 0.00</CaseTime>
  </TestCase>
  <TestCase>
    <CaseName>Check to see that PcapFileHeader is managed correctly</CaseName>
    <CaseResult>PASS</CaseResult>
    <CaseTime>real 0.00 user 0.00 system 0.00</CaseTime>
  </TestCase>
  <TestCase>
    <CaseName>Check to see that PcapRecordHeader is managed correctly</CaseName>
    <CaseResult>PASS</CaseResult>
    <CaseTime>real 0.00 user 0.00 system 0.00</CaseTime>
  </TestCase>
  <TestCase>
    <CaseName>Check to see that PcapFile can read out a known good pcap file</CaseName>
    <CaseResult>PASS</CaseResult>
    <CaseTime>real 0.00 user 0.00 system 0.00</CaseTime>
  </TestCase>
  <SuiteResult>PASS</SuiteResult>
  <SuiteTime>real 0.00 user 0.00 system 0.00</SuiteTime>
</TestSuite>
@end verbatim
@end smallformat

If you are familiar with XML this should be fairly self-explanatory.  It is
also not a complete XML file since test suites are designed to have their
output appended to a master XML status file as described in the @command{test.py}
section.

@node ClassTestRunner
@section Class TestRunner

The executables that run dedicated test programs use a TestRunner class.  This
class provides for automatic test registration and listing, as well as a way to
execute the individual tests.  Individual test suites use C++ global
constructors
to add themselves to a collection of test suites managed by the test runner.
The test runner is used to list all of the available tests and to select a test
to be run.  This is a quite simple class that provides three static methods to
provide or Adding and Getting test suites to a collection of tests.  See the
doxygen for class @code{ns3::TestRunner} for details.

@node TestSuite
@section Test Suite

All @command{ns-3} tests are classified into Test Suites and Test Cases.  A
test suite is a collection of test cases that completely exercise a given kind
of functionality.  As described above, test suites can be classified as,

@itemize @bullet
@item Build Verification Tests
@item Unit Tests
@item System Tests
@item Examples
@item Performance Tests
@end itemize

This classification is exported from the TestSuite class.  This class is quite
simple, existing only as a place to export this type and to accumulate test
cases.  From a user perspective, in order to create a new TestSuite in the
system one only has to define a new class that inherits from class @code{TestSuite}
and perform these two duties.

The following code will define a new class that can be run by @code{test.py}
as a ``unit'' test with the display name, ``my-test-suite-name''.

@verbatim
  class MySuite : public TestSuite
  {
  public:
    MyTestSuite ();
  };

  MyTestSuite::MyTestSuite ()
    : TestSuite ("my-test-suite-name", UNIT)
  {
    AddTestCase (new MyTestCase);
  }

  MyTestSuite myTestSuite;
@end verbatim

The base class takes care of all of the registration and reporting required to
be a good citizen in the test framework.

@node TestCase
@section Test Case

Individual tests are created using a TestCase class.  Common models for the use
of a test case include "one test case per feature", and "one test case per method."
Mixtures of these models may be used.

In order to create a new test case in the system, all one has to do is to inherit
from the  @code{TestCase} base class, override the constructor to give the test
case a name and override the @code{DoRun} method to run the test.

@verbatim
class MyTestCase : public TestCase
{
  MyTestCase ();
  virtual bool DoRun (void);
};

MyTestCase::MyTestCase ()
  : TestCase ("Check some bit of functionality")
{
}

bool
MyTestCase::DoRun (void)
{
  NS_TEST_ASSERT_MSG_EQ (true, true, "Some failure message");
  return GetErrorStatus ();
}
@end verbatim

@node Utilities
@section Utilities

There are a number of utilities of various kinds that are also part of the
testing framework.  Examples include a generalized pcap file useful for
storing test vectors; a generic container useful for transient storage of
test vectors during test execution; and tools for generating presentations
based on validation and verification testing results.

@node Debugging test suite failures
@section Debugging test suite failures

To debug test crashes, such as:
@verbatim
CRASH: TestSuite ns3-wifi-interference
@end verbatim

You can access the underlying test-runner program via gdb as follows, and
then pass the "--basedir=`pwd`" argument to run (you can also pass other
arguments as needed, but basedir is the minimum needed)::
@smallformat
@verbatim
./waf --command-template="gdb %s" --run "test-runner"

Waf: Entering directory `/home/tomh/hg/sep09/ns-3-allinone/ns-3-dev-678/build'
Waf: Leaving directory `/home/tomh/hg/sep09/ns-3-allinone/ns-3-dev-678/build'
'build' finished successfully (0.380s)
GNU gdb 6.8-debian
Copyright (C) 2008 Free Software Foundation, Inc.
L cense GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu"...
(gdb) r --basedir=`pwd`
Starting program: <..>/build/debug/utils/test-runner --basedir=`pwd`
[Thread debugging using libthread_db enabled]
assert failed. file=../src/core/type-id.cc, line=138, cond="uid <= m_information.size () && uid != 0"
...
@end verbatim
@end smallformat

Here is another example of how to use valgrind to debug a memory problem
such as:
@verbatim
VALGR: TestSuite devices-mesh-dot11s-regression
@end verbatim

@smallformat
@verbatim
./waf --command-template="valgrind %s --basedir=`pwd` --suite=devices-mesh-dot11s-regression" --run test-runner
@end verbatim
@end smallformat