Native CPU Optimization on Deeplearning4j

There are a few ways to tune or improve performance for CPU systems on DL4J and ND4J, which this guide will address. First, let’s define some terminology: OpenMP OpenMP is open API for parallel programming for C/C++/Fortran, and since ND4j uses a backend written in C++, we use OpenMP for better parallel performance on CPUs. CPU vs. Core vs. HyperThreading A CPU is a single physical unit that usually consists of multiple cores. Each core is able to process instructions independent of other cores. And each core is subject for HyperThreading – in such systems, that’s shown as an additional set of cores. Say you have Intel i7-4790 CPU installed: It’s one physical CPU, four physical cores and eight total threads. Or you have a dual Intel® Xeon® Processor E5-2683…


Link to Full Article: Native CPU Optimization on Deeplearning4j