Primitive attributes.
#include <dnnl.hpp>


Public Member Functions | |
| primitive_attr () | |
| Constructs default (empty) primitive attributes. | |
| primitive_attr (dnnl_primitive_attr_t attr) | |
| Creates primitive attributes from a C API dnnl_primitive_attr_t handle. | |
| scratchpad_mode | get_scratchpad_mode () const |
| Returns the scratchpad mode. | |
| void | set_scratchpad_mode (scratchpad_mode mode) |
| Sets scratchpad mode. | |
| void | get_output_scales (int &mask, std::vector< float > &scales) const |
| Returns output scaling factors correspondence mask and values. | |
| void | set_output_scales (int mask, const std::vector< float > &scales) |
| Sets output scaling factors correspondence mask and values. | |
| void | get_scales (int arg, int &mask, std::vector< float > &scales) const |
| Returns scaling factors correspondence mask and values for a given memory argument. | |
| void | set_scales (int arg, int mask, const std::vector< float > &scales) |
| Sets scaling factors for primitive operations for a given memory argument. | |
| void | get_zero_points (int arg, int &mask, std::vector< int32_t > &zero_points) const |
| Returns zero points correspondence mask and values. | |
| void | set_zero_points (int arg, int mask, const std::vector< int32_t > &zero_points) |
| Sets zero points for primitive operations for a given memory argument. | |
| const post_ops | get_post_ops () const |
| Returns post-ops previously set via set_post_ops(). | |
| void | set_post_ops (const post_ops ops) |
| Sets post-ops. | |
| void | set_rnn_data_qparams (float scale, float shift) |
| Sets quantization scale and shift parameters for RNN data tensors. | |
| void | get_rnn_data_qparams (float &scale, float &shift) |
| Returns the quantization scale and shift parameters for RNN data tensors. | |
| void | set_rnn_weights_qparams (int mask, const std::vector< float > &scales) |
| Sets quantization scaling factors for RNN weights tensors. | |
| void | get_rnn_weights_qparams (int &mask, std::vector< float > &scales) |
| Returns the quantization scaling factors for RNN weights tensors. | |
| void | set_rnn_weights_projection_qparams (int mask, const std::vector< float > &scales) |
| Sets quantization scaling factors for RNN projection weights tensors. | |
| void | get_rnn_weights_projection_qparams (int &mask, std::vector< float > &scales) |
| Returns the quantization scaling factors for RNN projection weights tensors. | |
Public Member Functions inherited from dnnl::handle< dnnl_primitive_attr_t > | |
| bool | operator== (const handle< dnnl_primitive_attr_t, handle_traits< dnnl_primitive_attr_t > > &other) const |
| Equality operator. | |
| bool | operator!= (const handle &other) const |
| Inequality operator. | |
| handle ()=default | |
| Constructs an empty handle object. | |
| handle (const handle< dnnl_primitive_attr_t, handle_traits< dnnl_primitive_attr_t > > &)=default | |
| Copy constructor. | |
| handle (handle< dnnl_primitive_attr_t, handle_traits< dnnl_primitive_attr_t > > &&)=default | |
| Move constructor. | |
| handle (dnnl_primitive_attr_t t, bool weak=false) | |
| Constructs a handle wrapper object from a C API handle. | |
| handle< dnnl_primitive_attr_t, handle_traits< dnnl_primitive_attr_t > > & | operator= (const handle< dnnl_primitive_attr_t, handle_traits< dnnl_primitive_attr_t > > &)=default |
| Assignment operator. | |
| handle< dnnl_primitive_attr_t, handle_traits< dnnl_primitive_attr_t > > & | operator= (handle< dnnl_primitive_attr_t, handle_traits< dnnl_primitive_attr_t > > &&)=default |
| Move assignment operator. | |
| void | reset (dnnl_primitive_attr_t t, bool weak=false) |
| Resets the handle wrapper object to wrap a new C API handle. | |
| dnnl_primitive_attr_t | get (bool allow_empty=false) const |
| Returns the underlying C API handle. | |
| operator dnnl_primitive_attr_t () const | |
| Converts a handle to the underlying C API handle type. | |
| operator bool () const | |
| Checks whether the object is not empty. | |
Primitive attributes.
| primitive_attr (dnnl_primitive_attr_t attr) |
inline |
Creates primitive attributes from a C API dnnl_primitive_attr_t handle.
The resulting handle is not weak and the C handle will be destroyed during the destruction of the C++ object.
| attr | The C API primitive attributes. |
| void set_scratchpad_mode (scratchpad_mode mode) |
inline |
Sets scratchpad mode.
| mode | Specified scratchpad mode. |
| void get_output_scales (int &mask, std::vector< float > &scales) const |
inline |
Returns output scaling factors correspondence mask and values.
| mask | Scaling factors correspondence mask that defines the correspondence between the output tensor dimensions and the scales vector. The set i-th bit indicates that a dedicated output scaling factor is used for each index along that dimension. The mask value of 0 implies a common output scaling factor for the whole output tensor. |
| scales | Vector of output scaling factors. |
| void set_output_scales (int mask, const std::vector< float > &scales) |
inline |
Sets output scaling factors correspondence mask and values.
Example usage:
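A minimal sketch of the size rule that the `scales` parameter must satisfy, assuming a hypothetical helper `expected_scales_count` that is not part of the library: it multiplies the sizes of the dimensions whose bits are set in `mask`. The tensor shape in the trailing comment is illustrative only, and the commented call relies on the `set_output_scales (int mask, const std::vector< float > &scales)` signature listed in the member summary.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of the library): number of scaling factors
// implied by `mask` for a tensor with dimensions `dims`. Bit d of the mask
// selects dimension d; a mask of 0 yields 1 (one common scaling factor).
inline std::size_t expected_scales_count(
        int mask, const std::vector<std::int64_t> &dims) {
    constexpr std::size_t max_bits = 31; // usable bits of a non-negative int
    assert(mask >= 0);
    // The mask must not reference dimensions the tensor does not have.
    assert(dims.size() >= max_bits || (mask >> dims.size()) == 0);

    std::size_t count = 1;
    for (std::size_t d = 0; d < dims.size() && d < max_bits; ++d) {
        if ((mask >> d) & 1) count *= static_cast<std::size_t>(dims[d]);
    }
    return count;
}

// Usage sketch for an output of shape {mb, oc, oh, ow} = {32, 32, 14, 14}
// (illustrative values), with one scaling factor per output channel:
//   std::vector<float> scales(
//           expected_scales_count(1 << 1, {32, 32, 14, 14}), 0.5f);
//   attr.set_output_scales(1 << 1, scales);
```

Checking the vector size up front can be useful because, as noted for the `scales` parameter, a mismatch is only detected when the attributes are used to create a primitive descriptor.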
| mask | Defines the correspondence between the output tensor dimensions and the scales vector. The set i-th bit indicates that a dedicated scaling factor is used for each index along that dimension. Set the mask to 0 to use a common output scaling factor for the whole output tensor. |
| scales | Constant vector of output scaling factors. If the scaling factors are known at the time of this call, the following equality must hold: \(scales.size() = \prod\limits_{d \in mask} output.dims[d].\) Violations can only be detected when the attributes are used to create a primitive descriptor. If the scaling factors are not known at the time of the call, this vector must contain a single DNNL_RUNTIME_F32_VAL value and the output scaling factors must be passed at execution time as an argument with index DNNL_ARG_ATTR_OUTPUT_SCALES. |
| void get_scales (int arg, int &mask, std::vector< float > &scales) const |
inline |
Returns scaling factors correspondence mask and values for a given memory argument.
| arg | Parameter argument index as passed to the primitive::execute() call. |
| mask | Scaling factors correspondence mask that defines the correspondence between the tensor dimensions and the scales vector. A set i-th bit indicates that a dedicated scaling factor is used for each index along that dimension. A mask value of 0 implies a common scaling factor for the whole tensor. |
| scales | Output vector of scaling factors. |
| void set_scales (int arg, int mask, const std::vector< float > &scales) |
inline |
Sets scaling factors for primitive operations for a given memory argument.
| arg | Parameter argument index as passed to the primitive::execute() call. |
| mask | Scaling factors correspondence mask that defines the correspondence between the tensor dimensions and the scales vector. A set i-th bit indicates that a dedicated scaling factor is used for each index along that dimension. Set the mask to 0 to use a common scaling factor for the whole tensor. |
| scales | Constant vector of scaling factors. The following equality must hold: \(scales.size() = \prod\limits_{d \in mask} argument.dims[d].\) |
| void get_zero_points (int arg, int &mask, std::vector< int32_t > &zero_points) const |
inline |
Returns zero points correspondence mask and values.
| arg | Parameter argument index as passed to the primitive::execute() call. |
| mask | Zero points correspondence mask that defines the correspondence between the tensor dimensions and the zero_points vector. A set i-th bit indicates that a dedicated zero point is used for each index along that dimension. A mask value of 0 implies a common zero point for the whole tensor. |
| zero_points | Output vector of zero points. |
| void set_zero_points (int arg, int mask, const std::vector< int32_t > &zero_points) |
inline |
Sets zero points for primitive operations for a given memory argument.
| arg | Parameter argument index as passed to the primitive::execute() call. |
| mask | Zero points correspondence mask that defines the correspondence between the tensor dimensions and the zero_points vector. A set i-th bit indicates that a dedicated zero point is used for each index along that dimension. Set the mask to 0 to use a common zero point for the whole tensor. |
| zero_points | Constant vector of zero points. If the zero points are known at the time of this call, the following equality must hold: \(zero\_points.size() = \prod\limits_{d \in mask} argument.dims[d].\) If the zero points are not known at the time of the call, this vector must contain a single DNNL_RUNTIME_S32_VAL value and the zero points must be passed at execution time as an argument with index DNNL_ARG_ATTR_ZERO_POINTS. |
| const post_ops get_post_ops () const |
inline |
Returns post-ops previously set via set_post_ops().
| void set_post_ops (const post_ops ops) |
inline |
Sets post-ops.
| ops | Post-ops object to copy post-ops from. |
| void set_rnn_data_qparams (float scale, float shift) |
inline |
Sets quantization scale and shift parameters for RNN data tensors.
For performance reasons, the low-precision configuration of the RNN primitives expects input activations to have the unsigned 8-bit integer data type. The scale and shift parameters are used to quantize floating-point data to unsigned integer and must be passed to the RNN primitive using attributes.
The quantization formula is scale * data + shift.
Example usage:
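A minimal sketch of the quantization formula `scale * data + shift`, assuming hypothetical helpers `quantize_u8` and `dequantize_u8` that are not part of the library. Rounding to nearest and saturating to the range [0, 255] are assumptions of this sketch; the source states only the formula and the unsigned 8-bit target type.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical helper (not part of the library): applies scale * data + shift
// and converts the result to unsigned 8-bit. Rounding to nearest and
// saturation to [0, 255] are assumptions of this sketch.
inline std::uint8_t quantize_u8(float data, float scale, float shift) {
    const float q = std::round(scale * data + shift);
    assert(!std::isnan(q)); // NaN has no integer representation
    return static_cast<std::uint8_t>(std::min(255.0f, std::max(0.0f, q)));
}

// Inverse mapping under the same assumptions: (q - shift) / scale.
inline float dequantize_u8(std::uint8_t q, float scale, float shift) {
    return (static_cast<float>(q) - shift) / scale;
}
```

With the illustrative values `scale = 63` and `shift = 64`, data in the range [-1, 1] maps to [1, 127]; the same pair would then be passed as `set_rnn_data_qparams(scale, shift)`.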
| scale | The value to scale the data by. |
| shift | The value to shift the data by. |
| void get_rnn_data_qparams (float &scale, float &shift) |
inline |
Returns the quantization scale and shift parameters for RNN data tensors.
| scale | The value to scale the data by. |
| shift | The value to shift the data by. |
| void set_rnn_weights_qparams (int mask, const std::vector< float > &scales) |
inline |
Sets quantization scaling factors for RNN weights tensors.
The low-precision configuration of the RNN primitives expects input weights to use the signed 8-bit integer data type. The scaling factors are used to quantize floating-point data to signed integer and must be passed to RNN primitives using attributes.
| mask | Scaling factors correspondence mask that defines the correspondence between the weights tensor dimensions and the scales vector. A set i-th bit indicates that a dedicated scaling factor should be used for each index along that dimension. Set the mask to 0 to use a common scaling factor for the whole weights tensor. |
| scales | Constant vector of scaling factors. The following equality must hold: \(scales.size() = \prod\limits_{d \in mask} weights.dims[d].\) Violations can only be detected when the attributes are used to create a primitive descriptor. |
| void get_rnn_weights_qparams (int &mask, std::vector< float > &scales) |
inline |
Returns the quantization scaling factors for RNN weights tensors.
| mask | Scaling factors correspondence mask that defines the correspondence between the weights tensor dimensions and the scales vector. A set i-th bit indicates that a dedicated scaling factor is used for each index along that dimension. A mask value of 0 implies a common scaling factor for the whole weights tensor. |
| scales | Output vector of scaling factors. |
| void set_rnn_weights_projection_qparams (int mask, const std::vector< float > &scales) |
inline |
Sets quantization scaling factors for RNN projection weights tensors.
The low-precision configuration of the RNN primitives expects input weights to use the signed 8-bit integer data type. The scaling factors are used to quantize floating-point data to signed integer and must be passed to RNN primitives using attributes.
| mask | Scaling factors correspondence mask that defines the correspondence between the weights tensor dimensions and the scales vector. A set i-th bit indicates that a dedicated scaling factor should be used for each index along that dimension. Set the mask to 0 to use a common scaling factor for the whole weights tensor. |
| scales | Constant vector of scaling factors. The following equality must hold: \(scales.size() = \prod\limits_{d \in mask} weights.dims[d].\) Violations can only be detected when the attributes are used to create a primitive descriptor. |
| void get_rnn_weights_projection_qparams (int &mask, std::vector< float > &scales) |
inline |
Returns the quantization scaling factors for RNN projection weights tensors.
| mask | Scaling factors correspondence mask that defines the correspondence between the weights tensor dimensions and the scales vector. A set i-th bit indicates that a dedicated scaling factor is used for each index along that dimension. A mask value of 0 implies a common scaling factor for the whole weights tensor. |
| scales | Output vector of scaling factors. |