Kernel Benchmark

How To

  1. Write a configuration file such as bench_add.yaml (see Example).

  2. Run python -m oclk.benchmark bench_add.yaml
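
The same two steps can be driven from a script. The sketch below only wraps the command from step 2; the helper names `build_command` and `run_benchmark` are hypothetical and are not part of oclk.

```python
import subprocess
import sys


def build_command(config_path):
    """Return the argv list for the benchmark command shown in step 2."""
    # sys.executable keeps the call inside the current Python environment.
    return [sys.executable, "-m", "oclk.benchmark", config_path]


def run_benchmark(config_path):
    """Run the benchmark and return its exit code (oclk must be installed)."""
    return subprocess.run(build_command(config_path)).returncode
```

Calling `run_benchmark("bench_add.yaml")` would then be equivalent to typing the command by hand.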

Configuration Reference

A configuration file is a YAML list of Suite entries.

Suite

suite_name: str:

the name of the suite

kernel_file: str:

path to the kernel source file (for example examples/add.cl)

kernels: List[Kernel]:

the kernels in that file to benchmark

timer: Timer:

timer settings for the suite, see Timer

Kernel

name: str:

the name of the kernel

suffix: Optional[str]:

suffix for the kernel, used in the timer name

definition: Optional[str]:

compile definitions for the program, including the "-D" prefix (for example "-DBATCH_SIZE=4")

local_work_size: List[int]:

local work size of kernel

global_work_size: List[int]:

global work size of kernel

args: List[KernelArg]:

list of arguments, in the order the kernel expects them
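
In OpenCL 1.x, the global work size must be a multiple of the local work size in every dimension (OpenCL 2.0 and later can relax this with non-uniform work-groups). A minimal sketch of a sanity check for the two fields follows; the helper `work_group_counts` is hypothetical, and whether oclk performs such a check itself is not stated here.

```python
def work_group_counts(global_work_size, local_work_size):
    """Return the number of work-groups per dimension, or raise ValueError."""
    if len(global_work_size) != len(local_work_size):
        raise ValueError("global and local work sizes need the same number of dimensions")
    counts = []
    for g, loc in zip(global_work_size, local_work_size):
        if loc <= 0 or g % loc != 0:
            raise ValueError(f"global size {g} is not a multiple of local size {loc}")
        counts.append(g // loc)
    return counts


print(work_group_counts([262144], [1]))        # [262144]
print(work_group_counts([1024, 64], [16, 8]))  # [64, 8]
```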

KernelArg

name: str:

name of the argument

value: ArgValueGenerator:

defines how the argument value is generated; the default is constant zeros

type: str:

argument type: array, or a ctypes scalar type such as float, int …; see Runner

dtype: str:

np.dtype of the array, used when type is array; one of int32, int64, float32, float64; default is float32

shape: List[int]:

array shape, used when type is array; default is [1]
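
The defaults listed above can be summarized in a few lines. This is only an illustrative sketch that operates on plain dictionaries; `normalize_arg` and `ALLOWED_DTYPES` are hypothetical names, and oclk's own parsing may differ.

```python
ALLOWED_DTYPES = {"int32", "int64", "float32", "float64"}


def normalize_arg(arg):
    """Return a copy of a KernelArg mapping with the documented defaults filled in."""
    out = dict(arg)
    # Documented default generator: constant zeros.
    out.setdefault("value", {"method": "constant", "value": 0})
    if out.get("type") == "array":
        # dtype and shape only apply to array arguments.
        out.setdefault("dtype", "float32")
        out.setdefault("shape", [1])
        if out["dtype"] not in ALLOWED_DTYPES:
            raise ValueError(f"unsupported dtype: {out['dtype']}")
    return out


arg = normalize_arg({"name": "out", "type": "array", "shape": [64, 64, 64]})
print(arg["dtype"], arg["value"])
```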

ArgValueGenerator

method: str:

generation method, either random or constant; default is constant

value: Union[int, float, List[Union[int, float]]]:

the constant value, or the range of the random values; default is 0
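
To make the two methods concrete, here is a rough sketch in plain Python. It rests on assumptions the reference does not confirm: a two-element list is read as a [low, high] range for random, any other value falls back to the range [0, 1], and constant takes a scalar. The function name `generate_values` is hypothetical.

```python
import random


def generate_values(generator, count, seed=None):
    """Produce `count` numbers for an ArgValueGenerator-like mapping."""
    method = generator.get("method", "constant")
    value = generator.get("value", 0)
    if method == "constant":
        # Assumption: a scalar constant repeated for every element.
        return [value] * count
    if method == "random":
        # Assumption: [low, high] gives the range; otherwise use [0, 1].
        if isinstance(value, list) and len(value) == 2:
            low, high = value
        else:
            low, high = 0.0, 1.0
        rng = random.Random(seed)
        return [rng.uniform(low, high) for _ in range(count)]
    raise ValueError(f"unknown method: {method}")
```

For example, `generate_values({"method": "random", "value": [-1, 1]}, 8)` yields eight floats between -1 and 1 under these assumptions.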

Timer

prefix: str:

timer name prefix, default ""

repeat: int:

number of repetitions within ONE call of the timer, default 1

warmup: int:

number of warm-up runs before timing starts, default 0
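
The roles of repeat and warmup can be illustrated with a generic timing loop. This is a sketch of the usual pattern, not oclk's timer: `time_call` is a hypothetical helper, and a real OpenCL measurement would also have to wait for the device to finish before reading the clock.

```python
import time


def time_call(fn, repeat=1, warmup=0):
    """Return the mean wall-clock seconds per call of fn over `repeat` runs."""
    for _ in range(warmup):
        fn()  # warm-up runs are executed but not timed
    start = time.perf_counter()
    for _ in range(repeat):
        fn()
    return (time.perf_counter() - start) / repeat
```

With `repeat: 5` and `warmup: 5`, such a loop would run the kernel ten times and report the average over the last five.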

Example

- suite_name: add
  kernel_file: examples/add.cl
  kernels:
    - name: add
      definition: ""
      args:
        - name: a
          type: array
          dtype: float32
          shape: [64, 64, 64]
          value:
            method: random
        - name: b
          type: array
          dtype: float32
          shape: [64, 64, 64]
          value:
            method: constant
            value: 0
        - name: length
          type: int
          value:
            method: constant
            value: 262144
        - name: out
          type: array
          dtype: float32
          shape: [64, 64, 64]
          # value field default: constant zero
      local_work_size: [1]
      global_work_size: [262144]
    - name: add_constant
      definition: ""
      local_work_size: [ 1 ]
      global_work_size: [ 262144 ]
      args:
        - name: a
          type: array
          dtype: float32
          shape:
            - 64
            - 64
            - 64
          value:
            method: random
        - name: x
          type: float
          shape: [ 64, 64, 64 ]
          value:
            method: constant
            value: 0
        - name: length
          type: long
          value:
            method: constant
            value: 262144
        - name: out
          type: array
          dtype: float32
          shape: [ 64, 64, 64 ]
    - name: add_batch
      definition: "-DBATCH_SIZE=4"
      local_work_size: [ 1 ]
      global_work_size: [ 65536 ]
      args:
        - name: a
          type: array
          dtype: float32
          shape:
            - 64
            - 64
            - 64
          value:
            method: random
        - name: b
          type: array
          dtype: float32
          shape: [64, 64, 64]
          value:
            method: constant
            value: 0
        - name: length
          type: long
          value:
            method: constant
            value: 262144
        - name: out
          type: array
          dtype: float32
          shape: [64, 64, 64]
  timer:
    prefix: "bench_add"
    repeat: 5
    warmup: 5
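
The sizes in this example are related by simple arithmetic, checked below. The reading that each add_batch work-item handles BATCH_SIZE elements is an assumption inferred from the numbers, not something the configuration states.

```python
import math

shape = [64, 64, 64]
length = math.prod(shape)    # elements per array (Python 3.8+)
batch_size = 4               # from -DBATCH_SIZE=4 in add_batch

print(length)                # 262144, the `length` argument
print(length // batch_size)  # 65536, global_work_size of add_batch
print(length * 4)            # 1048576 bytes (1 MiB) per float32 array
```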