- Tools: Quantizer: Fixed issue observed with int4 weight override support.
- Core: Added user logs (`--userlogs=warn`) for Op validation failures for both offline and online prepare, making it easier to track fallback.
- Core: HTP Offline Cache Blob backward compatibility: SNPE version check relaxed.
- GPU Runtime: Fixed accuracy issues related to tensor memory optimization.
- SNPE Core: Fixed stability issues in concurrency use cases.
- SNPE Core: Fixed online dequantization of int4 axis-quantized DLCs when run on CPU/GPU.
- Tools: Quantizer: Added a fix to use the default activation bitwidth for static tensors instead of the default parameter bitwidth, except for static tensors that are known to be parameters, such as convolution weights and biases.
- Tools: GoogleNAS: Added support for utilizing the GoogleNAS service with SNPE hardware in the loop (HIL).
- Tools: Converters: Onnx: Added 5D tensor support for PoolMax3d.
- GPU Runtime: Improved network initialization time in subsequent runs on GPU when using the snpe-net-run `--storage_dir` option.
- GPU Runtime: Fixed verifier issue in Softmax2UdoPackage.
- GPU Runtime: Improved accuracy in models having a Concat op with large dimensions.
- DSP Runtime: Fixed a bug in running HTP FP16 networks on SoCs without FP16 support (such as sm8350 and sm7350).
- GPU Runtime: Fixed validation errors for the Concat op with large dimensions.
- Tools: Fixed a bug in snpe-dlc-quantize with options `--axis_quant` and `--enable_htp` when multiple SoCs are passed using `--htp_socs`.
- Tools: Quantizer: Improved error handling to remove 'uncaught exception' errors.
- Tools: Fixed functional failure for snpe-architecture-checker.
- Tools: Missing files for snpe-quantization-checker have been added to the SDK.
- Tools: ONNX Converter: Fixed an issue related to a missing Cast operation.
- GPU Runtime: Improved network initialization time in subsequent runs on GPU when using setInitCacheMode.
- SDK: Added missing documentation files for snpe-quantization-checker.
- SNPE Core: Enabled logging in Op validation.
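Several items above concern int4 axis quantization, i.e. symmetric per-axis (per-channel) quantization with one scale per slice along a chosen axis. As a rough illustration of the mechanism only (all function names are hypothetical; this is not SNPE code), a minimal numpy sketch of symmetric per-axis int4 quantize/dequantize:

```python
import numpy as np

def quantize_per_axis_int4(w, axis=0):
    """Symmetric per-axis int4 quantization: one scale per slice along `axis`.
    Hypothetical illustration, not the SNPE quantizer."""
    w = np.moveaxis(np.asarray(w, dtype=np.float32), axis, 0)
    flat = w.reshape(w.shape[0], -1)
    # Symmetric int4 range is [-8, 7]; scale maps max |w| per channel to 7.
    scales = np.abs(flat).max(axis=1) / 7.0
    scales[scales == 0] = 1.0  # avoid division by zero for all-zero channels
    q = np.clip(np.round(flat / scales[:, None]), -8, 7).astype(np.int8)
    return q, scales

def dequantize_per_axis(q, scales):
    """Online dequantization back to float32, as a CPU/GPU fallback would do."""
    return q.astype(np.float32) * scales[:, None]

w = np.array([[0.5, -1.0, 0.25], [2.0, 4.0, -3.5]])
q, s = quantize_per_axis_int4(w)
w_hat = dequantize_per_axis(q, s)
```

The per-channel reconstruction error is bounded by half of that channel's scale, which is why a runtime fallback that dequantizes online (as the CPU/GPU fix above describes) can still match quantized accuracy.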
- snpe-net-run: Added a new flag `--userbuffer_auto` to automatically detect and use the right buffer type based on the tensor data type in the model.
- Tools: New tools snpe-architecture-checker and snpe-quantization-checker have been added.
- HTP: Fixed VTCM overflow for TransposeConv2d layers with groups > 1, input depth equal to output depth, padding 0, and groups not equal to input depth.
- Tools: Tensorflow Converter: Fixed issues with per-channel quantization of weights: set is_symmetric = true by default, and added the "axis" and "is_symmetric" params to the weight encodings info.
- Tools: snpe-dlc-quant: Fixed abnormal DLC size increase when axis quantization is used.
- Tools: snpe-throughput-net-run: Now captures the status of a lost thread in the result summary.
- Tools: Added a new flag `--userbuffer_auto` in snpe-parallel-run to automatically detect and use the right buffer type based on the tensor data type in the model.
- Documentation: SNPE1 to SNPE2 migration guide added.
- Tools: Added new options for snpe-net-run and snpe-parallel-run, `--use_native_input_files` and `--use_native_output_files`, to support inputs in their native format as opposed to the default float32 format.
- Tools: ONNX Converter: Support Conv whose input data is an Initializer.
- DSP: Improved execution time of dynamic depthwise convolution with uint8 weights.
- Core: Added error handling based on buffer data size in execute().
- Tools: Converters: Fixed a bug in the optimization that merges MatMul + Reshape + Add into an FC op, which would incorrectly insert the FC op before the constant bias op.
- Tools: ONNX Converter: Fixed TransposeOp input axis format NT issue.
- HTP: Fixed a VTCM overflow issue that occurred when changing data layout from uint8 flat to uint8 crouton in TCM.
- Tools: Converters: Onnx: Added support for Sign.
- Tools: snpe-dlc-graph-prepare: Fixed a benign error message during offline prepare for v68-based SoCs (`--htp_socs` sm8350, sm7350, etc.).
- Tools: ONNX converter: Added support for the NonMaxSuppression op.
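The `--userbuffer_auto` flag above selects the user-buffer type from each tensor's data type instead of requiring the caller to pick one. Conceptually this is a simple dtype-to-buffer-type mapping; the sketch below illustrates the idea only, and the table entries and function name are hypothetical, not the SNPE implementation:

```python
# Hypothetical mapping from a model tensor's data type to a user-buffer type,
# illustrating what an "auto" buffer selection might do.
BUFFER_TYPE_FOR_DTYPE = {
    "float32": "USERBUFFER_FLOAT",
    "float16": "USERBUFFER_FLOAT16",
    "uint8":   "USERBUFFER_TF8",    # 8-bit quantized tensors
    "uint16":  "USERBUFFER_TF16",   # 16-bit quantized tensors
}

def pick_buffer_type(tensor_dtype: str) -> str:
    """Return the buffer type matching a tensor dtype, or raise for unknown ones."""
    try:
        return BUFFER_TYPE_FOR_DTYPE[tensor_dtype]
    except KeyError:
        raise ValueError(f"no user-buffer type known for dtype {tensor_dtype!r}")

print(pick_buffer_type("uint8"))  # prints USERBUFFER_TF8
```

Failing loudly on an unknown dtype, rather than silently defaulting to float, mirrors why an explicit auto-detection flag is safer than guessing the buffer type.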
- SNPE AIP: Fixed perf profile setting for multithread scenarios.
- Core: Added a new C API, Snpe_SNPE_GetInputDimensionsOfFirstTensor(), to facilitate retrieving input dimensions without the input tensor name.
- Tools: Fixed the converter issue for the GRU op.
- Tools: Converters: Added a new optimization sequence to squash BatchNorm into FullyConnected.
- Tools: snpe-throughput-net-run now supports the `--userbuffer_auto` option (similar to snpe-net-run) for automatic IO tensor data type detection.
- GPU Runtime: Support Pack operation with 1 input.
- Core: Updated C API documentation for ITensor/UserBuffer creation indicating data size.
- Core: setLogLevel() API hooked up to the runtimes for updating the logging level after creating the logger handle.
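The BatchNorm-into-FullyConnected squash mentioned above works because both ops are affine: BN(Wx + b) can be rewritten as a single FC with rescaled weights and bias. A minimal numpy sketch of the folding arithmetic (the function name is hypothetical; this illustrates the math, not the converter's code):

```python
import numpy as np

def fold_batchnorm_into_fc(W, b, gamma, beta, mean, var, eps=1e-5):
    """Fold a BatchNorm that follows a FullyConnected layer into the FC
    weights/bias, so BN(W @ x + b) == W_folded @ x + b_folded."""
    scale = gamma / np.sqrt(var + eps)   # per-output-channel scale
    W_folded = W * scale[:, None]        # rescale each output row of W
    b_folded = (b - mean) * scale + beta
    return W_folded, b_folded

rng = np.random.default_rng(0)
W = rng.standard_normal((4, 3)); b = rng.standard_normal(4)
gamma = rng.standard_normal(4); beta = rng.standard_normal(4)
mean = rng.standard_normal(4); var = rng.random(4) + 0.1
x = rng.standard_normal(3)

ref = gamma * ((W @ x + b) - mean) / np.sqrt(var + 1e-5) + beta
W_f, b_f = fold_batchnorm_into_fc(W, b, gamma, beta, mean, var)
```

Because the folded layer computes exactly the same function, the optimization removes a per-inference elementwise pass at no accuracy cost.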