API extensions to interact with the underlying Threadpool run-time.
Functions
| dnnl_status_t DNNL_API | dnnl_threadpool_interop_stream_create (dnnl_stream_t *stream, dnnl_engine_t engine, void *threadpool) | Creates an execution stream with specified threadpool. |
| dnnl_status_t DNNL_API | dnnl_threadpool_interop_stream_get_threadpool (dnnl_stream_t astream, void **threadpool) | Returns a threadpool to be used by the execution stream. |
| dnnl_status_t DNNL_API | dnnl_threadpool_interop_sgemm (char transa, char transb, dnnl_dim_t M, dnnl_dim_t N, dnnl_dim_t K, float alpha, const float *A, dnnl_dim_t lda, const float *B, dnnl_dim_t ldb, float beta, float *C, dnnl_dim_t ldc, void *threadpool) | Performs single-precision matrix-matrix multiply. |
| dnnl_status_t DNNL_API | dnnl_threadpool_interop_gemm_u8s8s32 (char transa, char transb, char offsetc, dnnl_dim_t M, dnnl_dim_t N, dnnl_dim_t K, float alpha, const uint8_t *A, dnnl_dim_t lda, uint8_t ao, const int8_t *B, dnnl_dim_t ldb, int8_t bo, float beta, int32_t *C, dnnl_dim_t ldc, const int32_t *co, void *threadpool) | Performs integer matrix-matrix multiply on 8-bit unsigned matrix A, 8-bit signed matrix B, and 32-bit signed resulting matrix C. |
| dnnl_status_t DNNL_API | dnnl_threadpool_interop_gemm_s8s8s32 (char transa, char transb, char offsetc, dnnl_dim_t M, dnnl_dim_t N, dnnl_dim_t K, float alpha, const int8_t *A, dnnl_dim_t lda, int8_t ao, const int8_t *B, dnnl_dim_t ldb, int8_t bo, float beta, int32_t *C, dnnl_dim_t ldc, const int32_t *co, void *threadpool) | Performs integer matrix-matrix multiply on 8-bit signed matrix A, 8-bit signed matrix B, and 32-bit signed resulting matrix C. |
dnnl_status_t DNNL_API dnnl_threadpool_interop_stream_create (dnnl_stream_t *stream, dnnl_engine_t engine, void *threadpool)
Creates an execution stream with specified threadpool.
| stream | Output execution stream. |
| engine | Engine to create the execution stream on. |
| threadpool | Pointer to an instance of a C++ class that implements the dnnl::threadpool_iface interface. |
dnnl_status_t DNNL_API dnnl_threadpool_interop_stream_get_threadpool (dnnl_stream_t astream, void **threadpool)
Returns a threadpool to be used by the execution stream.
| astream | Execution stream. |
| threadpool | Output pointer to an instance of a C++ class that implements the dnnl::threadpool_iface interface. Set to NULL if the stream was created without a threadpool. |
dnnl_status_t DNNL_API dnnl_threadpool_interop_sgemm (char transa, char transb, dnnl_dim_t M, dnnl_dim_t N, dnnl_dim_t K, float alpha, const float *A, dnnl_dim_t lda, const float *B, dnnl_dim_t ldb, float beta, float *C, dnnl_dim_t ldc, void *threadpool)
Performs single-precision matrix-matrix multiply.
The operation is defined as:
C := alpha * op( A ) * op( B ) + beta * C
where
- op( X ) = X or op( X ) = X**T,
- alpha and beta are scalars, and
- A, B, and C are matrices:
  - op( A ) is an MxK matrix,
  - op( B ) is a KxN matrix,
  - C is an MxN matrix.

The matrices are assumed to be stored in row-major order (the elements in each of the matrix rows are contiguous in memory).
| transa | Transposition flag for matrix A: 'N' or 'n' means A is not transposed, and 'T' or 't' means that A is transposed. |
| transb | Transposition flag for matrix B: 'N' or 'n' means B is not transposed, and 'T' or 't' means that B is transposed. |
| M | The M dimension. |
| N | The N dimension. |
| K | The K dimension. |
| alpha | The alpha parameter that is used to scale the product of matrices A and B. |
| A | A pointer to the A matrix data. |
| lda | The leading dimension for the matrix A. |
| B | A pointer to the B matrix data. |
| ldb | The leading dimension for the matrix B. |
| beta | The beta parameter that is used to scale the matrix C. |
| C | A pointer to the C matrix data. |
| ldc | The leading dimension for the matrix C. |
| threadpool | A pointer to a threadpool interface (only when built with the THREADPOOL CPU runtime). |
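As an illustration of the operation defined above, the following is a minimal reference sketch in plain C of the row-major computation C := alpha * op( A ) * op( B ) + beta * C. It is not the library's implementation and takes no threadpool argument. The names ref_sgemm and op_at are hypothetical, and ptrdiff_t stands in for dnnl_dim_t so that the snippet compiles without the oneDNN headers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of oneDNN): element (i, j) of op(X) for a
 * row-major matrix X with leading dimension ld. */
float op_at(const float *X, ptrdiff_t ld, char trans, ptrdiff_t i, ptrdiff_t j) {
    int transposed = (trans == 'T' || trans == 't');
    return transposed ? X[j * ld + i] : X[i * ld + j];
}

/* Naive reference for C := alpha * op(A) * op(B) + beta * C (row-major). */
void ref_sgemm(char transa, char transb,
               ptrdiff_t M, ptrdiff_t N, ptrdiff_t K,
               float alpha, const float *A, ptrdiff_t lda,
               const float *B, ptrdiff_t ldb,
               float beta, float *C, ptrdiff_t ldc) {
    assert(ldc >= N); /* each stored row of C must hold at least N elements */
    for (ptrdiff_t i = 0; i < M; ++i) {
        for (ptrdiff_t j = 0; j < N; ++j) {
            float acc = 0.0f;
            for (ptrdiff_t k = 0; k < K; ++k)
                acc += op_at(A, lda, transa, i, k) * op_at(B, ldb, transb, k, j);
            C[i * ldc + j] = alpha * acc + beta * C[i * ldc + j];
        }
    }
}
```

For a 2x3 row-major A and a 3x2 row-major B, a call such as ref_sgemm('N', 'N', 2, 2, 3, 1.0f, A, 3, B, 2, 0.0f, C, 2) passes lda = 3, ldb = 2, and ldc = 2, which are the row lengths of the stored matrices.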
dnnl_status_t DNNL_API dnnl_threadpool_interop_gemm_u8s8s32 (char transa, char transb, char offsetc, dnnl_dim_t M, dnnl_dim_t N, dnnl_dim_t K, float alpha, const uint8_t *A, dnnl_dim_t lda, uint8_t ao, const int8_t *B, dnnl_dim_t ldb, int8_t bo, float beta, int32_t *C, dnnl_dim_t ldc, const int32_t *co, void *threadpool)
Performs integer matrix-matrix multiply on 8-bit unsigned matrix A, 8-bit signed matrix B, and 32-bit signed resulting matrix C.
The operation is defined as:
C := alpha * (op(A) - A_offset) * (op(B) - B_offset) + beta * C + C_offset
where
- op( X ) = X or op( X ) = X**T,
- alpha and beta are scalars, and
- A, B, and C are matrices:
  - op( A ) is an MxK matrix,
  - op( B ) is a KxN matrix,
  - C is an MxN matrix.
- A_offset is an MxK matrix with every element equal to the ao value,
- B_offset is a KxN matrix with every element equal to the bo value,
- C_offset is an MxN matrix which is defined by the co array of size len:
  - offsetc = F: the len must be at least 1,
  - offsetc = C: the len must be at least max(1, M),
  - offsetc = R: the len must be at least max(1, N).

The matrices are assumed to be stored in row-major order (the elements in each of the matrix rows are contiguous in memory).
| transa | Transposition flag for matrix A: 'N' or 'n' means A is not transposed, and 'T' or 't' means that A is transposed. |
| transb | Transposition flag for matrix B: 'N' or 'n' means B is not transposed, and 'T' or 't' means that B is transposed. |
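| offsetc | Flag specifying how offsets should be applied to matrix C: 'F' means that the same offset is applied to each element of C, 'C' means that an individual offset is applied to each element within each column, and 'R' means that an individual offset is applied to each element within each row. |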
| M | The M dimension. |
| N | The N dimension. |
| K | The K dimension. |
| alpha | The alpha parameter that is used to scale the product of matrices A and B. |
| A | A pointer to the A matrix data. |
| lda | The leading dimension for the matrix A. |
| ao | The offset value for the matrix A. |
| B | A pointer to the B matrix data. |
| ldb | The leading dimension for the matrix B. |
| bo | The offset value for the matrix B. |
| beta | The beta parameter that is used to scale the matrix C. |
| C | A pointer to the C matrix data. |
| ldc | The leading dimension for the matrix C. |
| co | An array of offset values for the matrix C. The number of elements in the array depends on the value of offsetc. |
| threadpool | A pointer to a threadpool interface (only when built with the THREADPOOL CPU runtime). |
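To make the role of ao, bo, and co concrete, here is a minimal reference sketch in plain C of the integer operation defined above. It is not the library's implementation and takes no threadpool argument. The names ref_gemm_u8s8s32 and c_offset_at are hypothetical, ptrdiff_t stands in for dnnl_dim_t, and only upper-case offsetc flags are handled. The mapping of co onto C_offset (co[0] for F, co[i] for C, co[j] for R) is read off the length requirements above, and rounding the scaled result to the nearest integer is an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: C_offset(i, j) as selected by offsetc.
 * F: one value for all of C; C: one value per row index i (len >= M);
 * R: one value per column index j (len >= N). */
int32_t c_offset_at(char offsetc, const int32_t *co, ptrdiff_t i, ptrdiff_t j) {
    switch (offsetc) {
        case 'F': return co[0];
        case 'C': return co[i];
        case 'R': return co[j];
        default: assert(0 && "offsetc must be F, C, or R"); return 0;
    }
}

/* Naive reference for
 * C := alpha * (op(A) - A_offset) * (op(B) - B_offset) + beta * C + C_offset
 * with row-major storage. */
void ref_gemm_u8s8s32(char transa, char transb, char offsetc,
                      ptrdiff_t M, ptrdiff_t N, ptrdiff_t K, float alpha,
                      const uint8_t *A, ptrdiff_t lda, uint8_t ao,
                      const int8_t *B, ptrdiff_t ldb, int8_t bo,
                      float beta, int32_t *C, ptrdiff_t ldc,
                      const int32_t *co) {
    int ta = (transa == 'T' || transa == 't');
    int tb = (transb == 'T' || transb == 't');
    for (ptrdiff_t i = 0; i < M; ++i) {
        for (ptrdiff_t j = 0; j < N; ++j) {
            int32_t acc = 0;
            for (ptrdiff_t k = 0; k < K; ++k) {
                int32_t a = ta ? A[k * lda + i] : A[i * lda + k];
                int32_t b = tb ? B[j * ldb + k] : B[k * ldb + j];
                acc += (a - ao) * (b - bo);
            }
            double v = (double)alpha * acc + (double)beta * C[i * ldc + j]
                    + c_offset_at(offsetc, co, i, j);
            /* Assumption of this sketch: round half away from zero. */
            C[i * ldc + j] = (int32_t)(v >= 0 ? v + 0.5 : v - 0.5);
        }
    }
}
```

With offsetc = 'F' the caller passes a one-element co array; with 'C' or 'R' the array carries one entry per row index or per column index of C, respectively.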
dnnl_status_t DNNL_API dnnl_threadpool_interop_gemm_s8s8s32 (char transa, char transb, char offsetc, dnnl_dim_t M, dnnl_dim_t N, dnnl_dim_t K, float alpha, const int8_t *A, dnnl_dim_t lda, int8_t ao, const int8_t *B, dnnl_dim_t ldb, int8_t bo, float beta, int32_t *C, dnnl_dim_t ldc, const int32_t *co, void *threadpool)
Performs integer matrix-matrix multiply on 8-bit signed matrix A, 8-bit signed matrix B, and 32-bit signed resulting matrix C.
The operation is defined as:
C := alpha * (op(A) - A_offset) * (op(B) - B_offset) + beta * C + C_offset
where
- op( X ) = X or op( X ) = X**T,
- alpha and beta are scalars, and
- A, B, and C are matrices:
  - op( A ) is an MxK matrix,
  - op( B ) is a KxN matrix,
  - C is an MxN matrix.
- A_offset is an MxK matrix with every element equal to the ao value,
- B_offset is a KxN matrix with every element equal to the bo value,
- C_offset is an MxN matrix which is defined by the co array of size len:
  - offsetc = F: the len must be at least 1,
  - offsetc = C: the len must be at least max(1, M),
  - offsetc = R: the len must be at least max(1, N).

The matrices are assumed to be stored in row-major order (the elements in each of the matrix rows are contiguous in memory).
| transa | Transposition flag for matrix A: 'N' or 'n' means A is not transposed, and 'T' or 't' means that A is transposed. |
| transb | Transposition flag for matrix B: 'N' or 'n' means B is not transposed, and 'T' or 't' means that B is transposed. |
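| offsetc | Flag specifying how offsets should be applied to matrix C: 'F' means that the same offset is applied to each element of C, 'C' means that an individual offset is applied to each element within each column, and 'R' means that an individual offset is applied to each element within each row. |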
| M | The M dimension. |
| N | The N dimension. |
| K | The K dimension. |
| alpha | The alpha parameter that is used to scale the product of matrices A and B. |
| A | A pointer to the A matrix data. |
| lda | The leading dimension for the matrix A. |
| ao | The offset value for the matrix A. |
| B | A pointer to the B matrix data. |
| ldb | The leading dimension for the matrix B. |
| bo | The offset value for the matrix B. |
| beta | The beta parameter that is used to scale the matrix C. |
| C | A pointer to the C matrix data. |
| ldc | The leading dimension for the matrix C. |
| co | An array of offset values for the matrix C. The number of elements in the array depends on the value of offsetc. |
| threadpool | A pointer to a threadpool interface (only when built with the THREADPOOL CPU runtime). |
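Because the required size of the co array depends on offsetc, a caller may want to compute it before allocating. The following is a small sketch of the length rule stated above. The helper min_co_len is hypothetical and not a oneDNN function, ptrdiff_t stands in for dnnl_dim_t, and only the upper-case flags are recognized.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not a oneDNN function): smallest valid length of the
 * co array for a given offsetc flag, or -1 for an unrecognized flag. */
ptrdiff_t min_co_len(char offsetc, ptrdiff_t M, ptrdiff_t N) {
    assert(M >= 0 && N >= 0); /* dimensions are expected to be non-negative */
    switch (offsetc) {
        case 'F': return 1;             /* a single shared offset */
        case 'C': return M > 1 ? M : 1; /* max(1, M) */
        case 'R': return N > 1 ? N : 1; /* max(1, N) */
        default: return -1;
    }
}
```

For example, a 4x7 matrix C needs at least 1, 4, or 7 elements in co for offsetc equal to F, C, or R, respectively.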