Best Practices For Speed In Deep Learning Applications On Intel Architecture

Alaa Eltablawy, Colfax International

You have set up a deep learning model that you plan to train on an Intel architecture processor. To be productive, you have to minimize the training time. You run the application and see that a single training epoch takes N seconds. How do you know whether that is good? If improvement is possible, what can you do to reduce the training time? Are there tools that help identify a tuning strategy?

This workshop demonstrates how Intel software development tools can answer these questions and maximize your productivity in deep learning on Intel architecture. You will get access to the Intel® AI DevCloud, where you can experiment with optimizing a TensorFlow-based application for image segmentation. The instructor will demonstrate performance analysis results obtained with Intel® VTune Amplifier and Application Performance Snapshot, and explain how this analysis consistently guides you to known "performance tuning knobs" in TensorFlow.

You can apply the techniques learned in the workshop to other deep learning frameworks, to classical machine learning with Python, and to other AI-related applications. In addition, the workshop will present information on obtaining Intel® Distribution for Python, Intel® VTune Amplifier, and other tools for enterprise, academic, and non-commercial use.
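
Deciding whether N seconds per epoch is good starts with a repeatable measurement and a set of candidate settings to compare. The abstract does not name the tuning knobs, so the sketch below assumes a commonly cited set for CPU training: the OpenMP runtime variables `OMP_NUM_THREADS`, `KMP_BLOCKTIME`, and `KMP_AFFINITY`. The helper names (`time_epochs`, `threading_env`, `apply_env`), the core count, and the starting values are hypothetical placeholders, not recommendations from the workshop.

```python
import os
import time
from statistics import median


def time_epochs(train_one_epoch, n_epochs=3):
    """Return the wall-clock seconds taken by each call to train_one_epoch()."""
    timings = []
    for _ in range(n_epochs):
        start = time.perf_counter()
        train_one_epoch()
        timings.append(time.perf_counter() - start)
    return timings


def threading_env(physical_cores, blocktime_ms=1):
    """Build candidate OpenMP settings (assumed starting points, not tuned values)."""
    return {
        "OMP_NUM_THREADS": str(physical_cores),
        "KMP_BLOCKTIME": str(blocktime_ms),
        "KMP_AFFINITY": "granularity=fine,compact,1,0",
    }


def apply_env(settings):
    """Export the settings; call this before importing the deep learning framework."""
    os.environ.update(settings)


if __name__ == "__main__":
    apply_env(threading_env(physical_cores=4))  # 4 is a placeholder core count

    def dummy_epoch():
        # Stand-in workload; replace with one pass over the real training data.
        sum(i * i for i in range(200_000))

    timings = time_epochs(dummy_epoch)
    print(f"median epoch time: {median(timings):.4f} s")
```

Comparing the median epoch time across a few candidate settings gives a baseline against which profiler findings can be checked. In TensorFlow 2.x, the in-framework counterparts are `tf.config.threading.set_intra_op_parallelism_threads` and `tf.config.threading.set_inter_op_parallelism_threads`; which values work best depends on the model and the machine.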