File kernel.h

Kernel functions.

Functions

int GpuKernel_init(GpuKernel * k, gpucontext * ctx, unsigned int count, const char ** strs, const size_t * lens, const char * name, unsigned int argcount, const int * types, int flags, char ** err_str)

Initialize a kernel structure.

lens holds the size of each source string. If it is NULL, or if an element has a value of 0, the length is determined using strlen() or equivalent code.
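The length rule above can be sketched in plain C. The helper source_length() is hypothetical and is not part of kernel.h; it only mirrors the documented behaviour so the NULL and 0 cases are easy to see.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of kernel.h: returns the length that the
 * documented rule would assign to source string number i. */
static size_t source_length(const char **strs, const size_t *lens,
                            unsigned int i)
{
    /* A NULL lens array, or a 0 entry, means "measure with strlen()". */
    if (lens == NULL || lens[i] == 0)
        return strlen(strs[i]);
    /* Otherwise the caller-supplied length is used as given. */
    return lens[i];
}
```

Passing an explicit nonzero length is what allows a source fragment that is not NUL-terminated, or only a prefix of a longer buffer, to be used.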

If *err_str is not NULL on return, it must be free()d by the caller.

Parameters
  • k: a kernel structure
  • ctx: context in which to build the kernel
  • count: number of source code strings
  • strs: C array of source code strings
  • lens: C array with the size of each string or NULL
  • name: name of the kernel function
  • argcount: number of kernel arguments
  • types: typecode for each argument
  • flags: kernel use flags (see ga_usefl)
  • err_str: (if not NULL) location to write GPU-backend provided debug info

Return
GA_NO_ERROR if the operation is successful, any other value if an error occurred

void GpuKernel_clear(GpuKernel * k)

Clear and release data associated with a kernel.

Parameters
  • k: the kernel to release

gpucontext* GpuKernel_context(GpuKernel * k)

Returns the context in which a kernel was built.

Parameters
  • k: a kernel

Return
a context pointer

int GpuKernel_setarg(GpuKernel * k, unsigned int i, void * val)
int GpuKernel_sched(GpuKernel * k, size_t n, size_t * gs, size_t * ls)

Schedule the local and global sizes for a kernel.

This function finds an optimal grid and block size for the number of elements specified in n when running kernel k. The chosen sizes may launch slightly more instances than n for efficiency reasons, so the kernel must be ready to handle the surplus instances.

If either gs or ls is not 0 on entry, its value will not be altered and will be taken into account when choosing the other value.

Parameters
  • k: the kernel to schedule for
  • n: number of elements to handle
  • gs: grid size (in/out)
  • ls: local size (in/out)
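The in/out contract and the over-launch caveat described above can be illustrated with a host-only sketch. Both functions below are hypothetical stand-ins, not the library's scheduling algorithm: sketch_sched() assumes a default block size of 256 and a nonzero n, and simulate_launch() assumes the launch runs gs * ls instances in total.

```c
#include <stddef.h>

/* Hypothetical stand-in for the documented in/out contract. The default
 * block size of 256 is an assumption, not the library's heuristic.
 * Assumes n > 0. */
static void sketch_sched(size_t n, size_t *gs, size_t *ls)
{
    if (*gs == 0 && *ls == 0)
        *ls = 256;                    /* nothing fixed: pick a block size */
    if (*ls == 0)
        *ls = (n + *gs - 1) / *gs;    /* gs fixed by caller: derive ls */
    else if (*gs == 0)
        *gs = (n + *ls - 1) / *ls;    /* ls known: round gs up to cover n */
    /* both nonzero on entry: neither value is altered */
}

/* Simulates on the host the instances that a gs * ls launch would run.
 * Each instance guards against indices at or beyond n, which is the
 * check a real kernel body needs when the launch is rounded up. */
static size_t simulate_launch(size_t n, size_t gs, size_t ls, int *out)
{
    size_t launched = gs * ls;
    for (size_t idx = 0; idx < launched; idx++) {
        if (idx >= n)
            continue;                 /* surplus instance: do nothing */
        out[idx] = 1;
    }
    return launched;
}
```

With n = 1000 and both sizes left at 0, this sketch picks ls = 256 and gs = 4, so 1024 instances run and the last 24 must be skipped by the guard.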

int GpuKernel_call(GpuKernel * k, unsigned int n, const size_t * gs, const size_t * ls, size_t shared, void ** args)

Launch the execution of a kernel.

Parameters
  • k: the kernel to launch
  • n: dimensionality of the grid/blocks
  • gs: sizes of launch grid
  • ls: sizes of launch blocks
  • shared: amount of dynamic shared memory to allocate
  • args: table of pointers to arguments
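The shape of the gs, ls and args parameters can be shown without a GPU. The sketch below is a host-only mock, not the library's implementation: launch_instances() assumes that gs counts blocks and ls counts instances per block in each of the n dimensions, and mock_read_args() is a hypothetical consumer that expects a float followed by an unsigned int.

```c
#include <stddef.h>

/* Total instances implied by an n-dimensional launch, assuming gs counts
 * blocks and ls counts instances per block in every dimension. */
static size_t launch_instances(unsigned int n, const size_t *gs,
                               const size_t *ls)
{
    size_t total = 1;
    for (unsigned int d = 0; d < n; d++)
        total *= gs[d] * ls[d];
    return total;
}

/* Hypothetical consumer of an argument table: args[i] is a pointer to the
 * value of argument i, not the value itself. Here args[0] must point at a
 * float and args[1] at an unsigned int. */
static float mock_read_args(void **args)
{
    float scale = *(float *)args[0];
    unsigned int count = *(unsigned int *)args[1];
    return scale * (float)count;
}
```

A caller would build the table from the addresses of its local values, for example `void *args[2] = {&scale, &count};`, and keep those values alive until the launch call returns.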

int GpuKernel_binary(const GpuKernel * k, size_t * sz, void ** obj)
const char* GpuKernel_error(const GpuKernel * k, int err)
struct GpuKernel
#include <kernel.h>

Kernel information structure.

Public Members

gpukernel* k

Device kernel reference.

void** args

Argument buffer.