ecify `label`, `sub_label`, `description`, and `env` (defined later).
    These fields are included in the representation of the result object and
    are used by the `Compare` class to group and display results for
    comparison.

4) Instruction counts
    In addition to wall times, Timer can run a statement under Callgrind
    and report the number of instructions executed.

Directly analogous to `timeit.Timer` constructor arguments:

    `stmt`, `setup`, `timer`, `globals`

PyTorch Timer specific constructor arguments:

    `label`, `sub_label`, `description`, `env`, `num_threads`

Args:
    stmt:
        Code snippet to be run in a loop and timed.

    setup:
        Optional setup code. Used to define variables used in `stmt`.

    global_setup: (C++ only)
        Code which is placed at the top level of the file for things like
        `#include` statements.

    timer:
        Callable which returns the current time. If PyTorch was built
        without CUDA or there is no GPU present, this defaults to
        `timeit.default_timer`; otherwise it will synchronize CUDA before
        measuring the time.

    globals:
        A dict which defines the global variables when `stmt` is being
        executed. This is the other method (besides `setup`) for providing
        variables which `stmt` needs.

    label:
        String which summarizes `stmt`. For instance, if `stmt` is
        "torch.nn.functional.relu(torch.add(x, 1, out=out))"
        one might set label to "ReLU(x + 1)" to improve readability.

    sub_label:
        Provide supplemental information to disambiguate measurements
        with identical stmt or label. For instance, in our example
        above sub_label might be "float" or "int", so that it is easy
        to differentiate:

            "ReLU(x + 1): (float)"
            "ReLU(x + 1): (int)"

        when printing Measurements or summarizing using `Compare`.

    description:
        String to distinguish measurements with identical label and
        sub_label. The principal use of `description` is to signal to
        `Compare` the columns of data. For instance one might set it
        based on the input size to create a table of the form: ::

                                    | n=1 | n=4 | ...
                                    ------------- ...
            ReLU(x + 1): (float)    | ... | ... | ...
            ReLU(x + 1): (int)      | ... | ... | ...

        using `Compare`. It is also included when printing a Measurement.

    env:
        This tag indicates that otherwise identical tasks were run in
        different environments, and are therefore not equivalent, for
        instance when A/B testing a change to a kernel. `Compare` will
        treat Measurements with different `env` specifications as distinct
        when merging replicate runs.

    num_threads:
        The size of the PyTorch threadpool when executing `stmt`. Single
        threaded performance is important as both a key inference workload
        and a good indicator of intrinsic algorithmic efficiency, so the
        default is set to one. This is in contrast to the default PyTorch
        threadpool size, which tries to utilize all cores.
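The two ways of supplying variables to `stmt` described above (`setup` and `globals`) can be sketched with the standard library's `timeit.Timer`, whose `stmt`, `setup`, `timer`, and `globals` arguments this class mirrors. This is a minimal illustration rather than PyTorch-specific usage: it deliberately uses only the standard library, and the helper `relu_plus_one` and the input values are hypothetical names chosen for the example.

```python
import timeit


def relu_plus_one(values):
    """Hypothetical pure-Python stand-in for "ReLU(x + 1)"."""
    return [max(v + 1, 0) for v in values]


# Method 1: `setup` defines the variables that `stmt` needs.
timer_setup = timeit.Timer(
    stmt="[max(v + 1, 0) for v in x]",
    setup="x = list(range(-8, 8))",
)

# Method 2: `globals` supplies the same names as a dict instead.
timer_globals = timeit.Timer(
    stmt="relu_plus_one(x)",
    globals={"relu_plus_one": relu_plus_one, "x": list(range(-8, 8))},
)

# Each call returns the total wall time in seconds for `number` runs.
elapsed_setup = timer_setup.timeit(number=1000)
elapsed_globals = timer_globals.timeit(number=1000)

print(f"setup:   {elapsed_setup:.6f} s for 1000 runs")
print(f"globals: {elapsed_globals:.6f} s for 1000 runs")
```

`setup` is convenient when the inputs can be built from a short snippet, while `globals` is the natural choice when `stmt` must refer to objects that already exist in the calling program.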