Usage¶
This plugin provides a benchmark fixture. This fixture is a callable object that will benchmark any function passed to it.
Example:
import time

def something(duration=0.000001):
    """
    Function that needs some serious benchmarking.
    """
    time.sleep(duration)
    # You may return anything you want, like the result of a computation
    return 123

def test_my_stuff(benchmark):
    # benchmark something
    result = benchmark(something)

    # Extra code, to verify that the run completed correctly.
    # Sometimes you may want to check the result, fast functions
    # are no good if they return incorrect results :-)
    assert result == 123
You can also pass extra arguments:
def test_my_stuff(benchmark):
    benchmark(time.sleep, 0.02)
Or even keyword arguments:
def test_my_stuff(benchmark):
    benchmark(something, duration=0.02)
Another pattern seen in the wild, which is not recommended for micro-benchmarks (very fast code) but may be convenient:
def test_my_stuff(benchmark):
    @benchmark
    def something():  # unnecessary function call
        time.sleep(0.000001)
A better way is to just benchmark the final function:
def test_my_stuff(benchmark):
    benchmark(time.sleep, 0.000001)  # way more accurate results!
If you need fine-grained control over how the benchmark is run (for example a setup function, or exact control of iterations and rounds), there's a special mode - pedantic:
def my_special_setup():
    ...

def test_with_setup(benchmark):
    benchmark.pedantic(something, setup=my_special_setup, args=(1, 2, 3), kwargs={'foo': 'bar'}, iterations=10, rounds=100)
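When the benchmarked code mutates its input, the setup callable can also build fresh arguments for every round. A minimal sketch, assuming setup may return an (args, kwargs) tuple that pedantic then forwards to the benchmarked function instead of args/kwargs being passed directly (the sorted/payload workload here is purely illustrative):

def fresh_args():
    # Rebuild the input before each round so every measurement starts from
    # the same unsorted data (assumption: the returned (args, kwargs) tuple
    # is forwarded to the benchmarked callable).
    payload = list(range(10000, 0, -1))
    return (payload,), {}

def test_sort(benchmark):
    benchmark.pedantic(sorted, setup=fresh_args, rounds=20)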
Commandline options¶
py.test command-line options:
--benchmark-min-time=SECONDS
    Minimum time per round in seconds. Default: '0.000005'
--benchmark-max-time=SECONDS
    Maximum run time per test - it will be repeated until this total time is reached. It may be exceeded if the test function is very slow or --benchmark-min-rounds is large (it takes precedence). Default: '1.0'
--benchmark-min-rounds=NUM
    Minimum rounds, even if total time would exceed --max-time. Default: 5
--benchmark-timer=FUNC
    Timer to use when measuring time. Default: 'time.perf_counter'
--benchmark-calibration-precision=NUM
    Precision to use when calibrating the number of iterations. A precision of 10 will make the timer look 10 times more accurate, at the cost of a less precise measure of deviations. Default: 10
--benchmark-warmup=KIND
    Activates warmup. Will run the test function up to the warmup iteration count during the calibration phase. See --benchmark-warmup-iterations. Note: even the warmup phase obeys --benchmark-max-time. Available KIND: 'auto', 'off', 'on'. Default: 'auto' (automatically activated on PyPy).
--benchmark-warmup-iterations=NUM
    Max number of iterations to run in the warmup phase. Default: 100000
--benchmark-disable-gc
    Disable GC during benchmarks.
--benchmark-skip
    Skip running any tests that contain benchmarks.
--benchmark-disable
    Disable benchmarks. Benchmarked functions are only run once and no stats are reported. Use this if you want to run the tests but not do any benchmarking.
--benchmark-enable
    Forcibly enable benchmarks. Use this option to override --benchmark-disable (in case you have it in your pytest configuration).
--benchmark-only
    Only run benchmarks. This overrides --benchmark-skip.
--benchmark-save=NAME
    Save the current run into 'STORAGE-PATH/counter-NAME.json'. Default: '<commitid>_<date>_<time>_<isdirty>', example: 'e689af57e7439b9005749d806248897ad550eab5_20150811_041632_uncommitted-changes'.
--benchmark-autosave
    Autosave the current run into 'STORAGE-PATH/<counter>_<commitid>_<date>_<time>_<isdirty>', example: 'STORAGE-PATH/0123_525685bcd6a51d1ade0be75e2892e713e02dfd19_20151028_221708_uncommitted-changes.json'
--benchmark-save-data
    Use this to make --benchmark-save and --benchmark-autosave include all the timing data, not just the stats.
--benchmark-json=PATH
    Dump a JSON report into PATH. Note that this will include the complete data (all the timings, not just the stats).
--benchmark-compare=NUM
    Compare the current run against run NUM (or a prefix of _id in elasticsearch), or against the latest saved run if unspecified.
--benchmark-compare-fail=EXPR
    Fail the test if performance regresses according to the given EXPR (eg: min:5% or mean:0.001 for number of seconds). Can be used multiple times.
--benchmark-cprofile=COLUMN
    If specified, measure one run with cProfile and store the top 10 functions. The argument is a column to sort by. Available columns: 'ncalls_recursion', 'ncalls', 'tottime', 'tottime_per', 'cumtime', 'cumtime_per', 'function_name'.
--benchmark-storage=URI
    Specify a path to store the runs, as a URI in the form file://path or elasticsearch+http[s]://host1,host2/[index/doctype?project_name=Project] (used when --benchmark-save or --benchmark-autosave are given). For backwards compatibility, unexpected values are converted to file://<value>. Default: 'file://./.benchmarks'.
--benchmark-netrc=BENCHMARK_NETRC
    Load elasticsearch credentials from a netrc file. Default: ''.
--benchmark-verbose
    Dump diagnostic and progress information.
--benchmark-sort=COL
    Column to sort on. Can be one of: 'min', 'max', 'mean', 'stddev', 'name', 'fullname'. Default: 'min'
--benchmark-group-by=LABELS
    Comma-separated list of categories by which to group tests. Can be one or more of: 'group', 'name', 'fullname', 'func', 'fullfunc', 'param' or 'param:NAME', where NAME is the name passed to @pytest.parametrize. Default: 'group'
--benchmark-columns=LABELS
    Comma-separated list of columns to show in the result table. Default: 'min, max, mean, stddev, median, iqr, outliers, ops, rounds, iterations'
--benchmark-name=FORMAT
    How to format names in results. Can be one of 'short', 'normal', 'long', or 'trial'. Default: 'normal'
--benchmark-histogram=FILENAME-PREFIX
    Plot graphs of min/max/avg/stddev over time in FILENAME-PREFIX-test_name.svg. If FILENAME-PREFIX contains slashes ('/') then directories will be created. Default: 'benchmark_<date>_<time>'
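For example, a run that autosaves its results and fails if the mean time regresses by more than 5% relative to the most recently saved run could be invoked like this (the test path and threshold are illustrative):

pytest tests/ --benchmark-autosave --benchmark-compare --benchmark-compare-fail=mean:5%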
Comparison CLI¶
An extra py.test-benchmark command-line tool is available for inspecting previous benchmark data:
py.test-benchmark [-h [COMMAND]] [--storage URI] [--netrc [NETRC]]
                  [--verbose]
                  {help,list,compare} ...
Commands:
help Display help and exit.
list List saved runs.
compare Compare saved runs.
The compare command takes almost all of the --benchmark-* options, minus the --benchmark- prefix:
positional arguments:
  glob_or_file
      Glob or exact path for json files. If not specified, all runs are loaded.

options:
  -h, --help
      Show this help message and exit.
  --sort=COL
      Column to sort on. Can be one of: 'min', 'max', 'mean', 'stddev', 'name', 'fullname'. Default: 'min'
  --group-by=LABELS
      Comma-separated list of categories by which to group tests. Can be one or more of: 'group', 'name', 'fullname', 'func', 'fullfunc', 'param' or 'param:NAME', where NAME is the name passed to @pytest.parametrize. Default: 'group'
  --columns=LABELS
      Comma-separated list of columns to show in the result table. Default: 'min, max, mean, stddev, median, iqr, outliers, rounds, iterations'
  --name=FORMAT
      How to format names in results. Can be one of 'short', 'normal', 'long', or 'trial'. Default: 'normal'
  --histogram=FILENAME-PREFIX
      Plot graphs of min/max/avg/stddev over time in FILENAME-PREFIX-test_name.svg. If FILENAME-PREFIX contains slashes ('/') then directories will be created. Default: 'benchmark_<date>_<time>'
  --csv=FILENAME
      Save a csv report. If FILENAME contains slashes ('/') then directories will be created. Default: 'benchmark_<date>_<time>'

examples:
pytest-benchmark compare 'Linux-CPython-3.5-64bit/*'
    Loads all benchmarks run with that interpreter. Note the special quoting that disables your shell's glob expansion.

pytest-benchmark compare 0001
    Loads the first run from all the interpreters.

pytest-benchmark compare /foo/bar/0001_abc.json /lorem/ipsum/0001_sir_dolor.json
    Loads runs from exactly those files.
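The options can be combined with any of these forms; for instance, to compare two saved runs sorted by mean and also dump a CSV report (the run numbers and output name are illustrative):

pytest-benchmark compare 0001 0002 --sort=mean --csv=comparison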
Markers¶
You can set per-test options with the benchmark marker:
@pytest.mark.benchmark(
    group="group-name",
    min_time=0.1,
    max_time=0.5,
    min_rounds=5,
    timer=time.time,
    disable_gc=True,
    warmup=False
)
def test_my_stuff(benchmark):
    @benchmark
    def result():
        # Code to be measured
        return time.sleep(0.000001)

    # Extra code, to verify that the run
    # completed correctly.
    # Note: this code is not measured.
    assert result is None
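Markers also combine well with parametrization: together with --benchmark-group-by=param the result table can be grouped by parameter value. A minimal sketch (the sizes and group name are illustrative):

import pytest

@pytest.mark.benchmark(group="list-building")
@pytest.mark.parametrize("size", [100, 1000, 10000])
def test_build_list(benchmark, size):
    # Each parametrized case is measured separately; run pytest with
    # --benchmark-group-by=param to group the table rows by `size`.
    result = benchmark(lambda: list(range(size)))
    assert len(result) == size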
Extra info¶
You can set arbitrary values in the benchmark.extra_info dictionary, which will be saved in the JSON if you use --benchmark-autosave or similar:
def test_my_stuff(benchmark):
    benchmark.extra_info['foo'] = 'bar'
    benchmark(time.sleep, 0.02)
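The recorded values end up in the saved report. A small sketch for reading them back, assuming the default file://./.benchmarks storage and that each entry in the report's "benchmarks" list carries its extra_info next to the stats:

import json
from pathlib import Path

# Pick the most recently written report from the default storage directory
# (assumes at least one autosaved run exists under ./.benchmarks).
latest = max(Path(".benchmarks").rglob("*.json"), key=lambda p: p.stat().st_mtime)
report = json.loads(latest.read_text())
for bench in report["benchmarks"]:
    print(bench["fullname"], bench.get("extra_info"))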
Patch utilities¶
Suppose you want to benchmark an internal function from a class:
class Foo(object):
    def __init__(self, arg=0.01):
        self.arg = arg

    def run(self):
        self.internal(self.arg)

    def internal(self, duration):
        time.sleep(duration)
With the benchmark fixture this is quite hard to test if you don't control the Foo code or it has very complicated construction.
For this there's experimental weaving support on the benchmark fixture (benchmark.weave) that can patch stuff using aspectlib (make sure you pip install aspectlib or pip install pytest-benchmark[aspect]):
def test_foo(benchmark):
    benchmark.weave(Foo.internal, lazy=True)
    f = Foo()
    f.run()