Speed Up Inference with Mixed Precision | AI Model Optimization with Intel® Neural Compressor

2023-07-26 · YouTube


Learn the simplest model optimization technique to speed up AI inference. Mixed precision, often used to speed up training, can also speed up inference without sacrificing accuracy.

Mixed precision is a popular technique for accelerating the training of large AI models, and it is also a simple way to reduce model size and inference latency. The approach mixes lower-precision floating-point formats, such as FP16 and BFloat16, with the original 32-bit floating-point parameters. Choosing how to mix formats requires assessing the accuracy impact, knowing which formats a given device supports, and knowing which layers the model uses. Intel® Neural Compressor automatically mixes in the lower-precision formats supported by the hardware and the model's layers. This video shows how to get started, whether you're using PyTorch*, TensorFlow*, or ONNX* Runtime, and how to automatically assess the accuracy effects of lower precision (a minimal code sketch follows at the end of this description).

Intel® Neural Compressor: bit.ly/3Nl6pVj
Intel® Neural Compressor GitHub: bit.ly/3NlBgkH

About Intel Software:
Intel® Developer Zone is committed to empowering and assisting software developers in creating applications for Intel hardware and software products. The Intel Software YouTube channel is an excellent resource for those seeking to enhance their knowledge. Our channel provides the latest news, helpful tips, and engaging product demos from Intel and our numerous industry partners. Our videos cover various topics; you can explore them further by following the links.

Connect with Intel Software:
INTEL SOFTWARE WEBSITE: https://intel.ly/2KeP1hD
INTEL SOFTWARE on FACEBOOK: http://bit.ly/2z8MPFF
INTEL SOFTWARE on TWITTER: http://bit.ly/2zahGSn
INTEL SOFTWARE GITHUB: http://bit.ly/2zaih6z
INTEL DEVELOPER ZONE LINKEDIN: http://bit.ly/2z979qs
INTEL DEVELOPER ZONE INSTAGRAM: http://bit.ly/2z9Xsby
INTEL GAME DEV TWITCH: http://bit.ly/2BkNshu

#intelsoftware #ai
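As a concrete starting point, here is a minimal sketch of the automatic conversion described above, assuming the Intel® Neural Compressor 2.x Python API (`MixedPrecisionConfig` and `mix_precision.fit`). The model choice, save path, and placeholder evaluation function are illustrative assumptions, not taken from the video.

```python
import torchvision.models as models

from neural_compressor import mix_precision
from neural_compressor.config import MixedPrecisionConfig

# An FP32 model to convert; ResNet-18 is a stand-in here, since the
# video does not name a specific model.
fp32_model = models.resnet18(weights=None)

# Ask Neural Compressor to mix in BF16 wherever the hardware and the
# model's layers support it; everything else stays in FP32.
conf = MixedPrecisionConfig()  # BF16 is the default lower precision

def eval_func(model):
    # Placeholder: run your validation set and return a scalar accuracy.
    # Supplying an eval function lets Neural Compressor assess the
    # accuracy effect of the lower-precision layers automatically.
    return 1.0

converted = mix_precision.fit(fp32_model, conf=conf, eval_func=eval_func)
converted.save("./mixed_precision_model")
```

Note that BF16 execution generally needs hardware with native BF16 support; on other machines Neural Compressor may skip or refuse the conversion. The same `fit` call pattern applies whether the input model comes from PyTorch, TensorFlow, or ONNX Runtime.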
