Hitaya OneAPI: Skin Cancer Detection using Intel® Optimization for TensorFlow

Jayita Bhattacharyya
8 min read · May 9


Following up on our earlier work on Hitaya healthcare for underserved communities using Intel oneAPI products, this time we illustrate how Intel oneDNN can help accelerate computer vision problems. Here we tackle an image classification problem: skin cancer detection.

Skin diseases have been prevalent for many years, and only recently has early diagnosis begun to help save lives. The main cause of skin cancer is exposure to ultraviolet (UV) radiation from the sun or tanning beds. Other risk factors include fair skin, a history of sunburns or skin cancer, a family history of the disease, a weakened immune system, and exposure to certain chemicals. Dermatologists could enhance their treatments with the help of modern data science techniques and algorithms for faster diagnosis.

According to the World Health Organization (WHO), there were an estimated 61,000 deaths from melanoma worldwide in the past year. Melanoma remains a significant cause of death from skin cancer, with an estimated 7,180 deaths in the US last year.

Data Collection & Preparation

We use the open-source Society for Imaging Informatics in Medicine (SIIM) and International Skin Imaging Collaboration (ISIC) Melanoma Classification dataset. We’ve extended our detection to 8 classes of skin lesions, not all of which are cancerous, namely:

  1. Melanoma: This is the most dangerous type of skin cancer, and it can be deadly if not detected and treated early. Melanoma typically appears as a dark, irregularly shaped mole on the skin that may have uneven borders, multiple colours, or a raised appearance.
  2. Melanocytic nevus: A melanocytic nevus, also known as a mole, is a type of skin lesion that is usually benign, but in rare cases it can develop into melanoma.
  3. Basal cell carcinoma: This is the most common type of skin cancer, and it usually appears as a small, raised bump on the skin that is smooth and shiny or waxy in texture. It tends to grow slowly and rarely spreads to other body parts.
  4. Actinic keratosis: Actinic keratosis (AK) is a precancerous skin lesion, meaning it has the potential to develop into skin cancer, typically squamous cell carcinoma.
  5. Benign keratosis (solar lentigo / seborrheic keratosis / lichen planus-like keratosis)
  6. Dermatofibroma: This is a common, benign skin nodule that typically appears as a firm, raised bump on the skin that may be pink, red, or brown in colour. It tends to grow slowly and is usually harmless.
  7. Vascular lesion: Vascular lesions are skin lesions caused by abnormalities in the blood vessels. While these lesions are not typically considered to be a type of skin cancer, some types of vascular lesions can be associated with an increased risk of certain types of skin cancer, such as basal cell carcinoma.
  8. Squamous cell carcinoma: This type of skin cancer usually appears as a firm, red bump on the skin that may have a scaly or crusty appearance. It can grow quickly and may spread to other parts of the body if not treated.

The dataset comprises about 25,000 images spread across these classes and split into train, test and validation sets.
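As an illustration of the train/validation/test split described above, here is a minimal sketch in plain Python. The 80/10/10 ratios, the `stratified_split` helper, the class-name spellings, and the demo file names are all assumptions made for illustration; the actual split ratios and label encoding used in the project are not stated here.

```python
import random
from collections import defaultdict

# Class names follow the list above; spelling and order are arbitrary choices.
CLASS_NAMES = [
    "melanoma", "melanocytic_nevus", "basal_cell_carcinoma", "actinic_keratosis",
    "benign_keratosis", "dermatofibroma", "vascular_lesion", "squamous_cell_carcinoma",
]

def stratified_split(samples, train_frac=0.8, val_frac=0.1, seed=42):
    """Split (image_path, label) pairs so every class keeps the same proportions."""
    by_label = defaultdict(list)
    for path, label in samples:
        by_label[label].append(path)

    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    splits = {"train": [], "val": [], "test": []}
    for label in sorted(by_label):
        paths = by_label[label]
        rng.shuffle(paths)
        n_train = int(len(paths) * train_frac)
        n_val = int(len(paths) * val_frac)
        splits["train"] += [(p, label) for p in paths[:n_train]]
        splits["val"] += [(p, label) for p in paths[n_train:n_train + n_val]]
        splits["test"] += [(p, label) for p in paths[n_train + n_val:]]  # remainder
    return splits

# Hypothetical file names, 10 images per class, just to exercise the helper.
demo = [(f"{name}_{i}.jpg", name) for name in CLASS_NAMES for i in range(10)]
splits = stratified_split(demo)
print({name: len(items) for name, items in splits.items()})
```

Splitting per class rather than over the whole pool keeps rare classes represented in every subset, which matters when the classes are imbalanced.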


We make use of Intel Distribution for Python for our implementation. The hardware features and Intel® oneAPI AI Analytics Toolkit components that feature most prominently in our workload are as follows:

Intel® Advanced Matrix Extensions

Intel AMX is designed to accelerate a wide range of matrix operations commonly used in deep learning and other AI applications, including convolution, matrix multiplication, and matrix addition. By providing dedicated hardware support for these operations, Intel® AMX can significantly improve the performance of AI workloads, while also reducing power consumption.

Intel® AMX supports the bfloat16 format (alongside INT8), a 16-bit floating-point format optimized for AI workloads. Because bfloat16 keeps the 8-bit exponent of 32-bit floating point and shortens only the mantissa, it can provide better performance than traditional 32-bit floating-point formats while still providing sufficient precision for most AI workloads.
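To make the bfloat16 format concrete, the sketch below converts a 32-bit float to bfloat16 and back using only the Python standard library. It relies on the fact that bfloat16 is the upper 16 bits of an IEEE 754 float32 (1 sign bit, 8 exponent bits, 7 mantissa bits). The helper names are hypothetical, and NaN/infinity handling is left out for brevity.

```python
import struct

def float32_to_bfloat16_bits(value: float) -> int:
    """Return the 16-bit bfloat16 pattern for a float (round to nearest even).

    NaN and infinity are not handled specially in this sketch.
    """
    bits32 = struct.unpack("<I", struct.pack("<f", value))[0]
    # Add a rounding bias so that dropping the low 16 bits rounds to nearest even.
    rounding_bias = 0x7FFF + ((bits32 >> 16) & 1)
    return ((bits32 + rounding_bias) >> 16) & 0xFFFF

def bfloat16_bits_to_float32(bits16: int) -> float:
    """Expand a bfloat16 pattern back to float32 by zero-filling the low 16 bits."""
    return struct.unpack("<f", struct.pack("<I", (bits16 & 0xFFFF) << 16))[0]

original = 3.14159
as_bf16 = float32_to_bfloat16_bits(original)
restored = bfloat16_bits_to_float32(as_bf16)
print(hex(as_bf16), restored)  # the value keeps only about 2-3 significant decimal digits
```

The round trip shows the trade-off: the value range matches float32, but only the leading mantissa bits survive.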

Intel® Deep Learning Boost

Intel® DL Boost is designed to accelerate deep learning inference workloads by extending the processor's vector instruction set with instructions for the multiply-accumulate operations that dominate neural network layers such as convolutions. By executing these operations more efficiently in hardware, Intel® DL Boost can significantly improve the performance of deep learning inference, while also reducing power consumption.

The main feature of Intel® DL Boost is support for low-precision arithmetic through Vector Neural Network Instructions (VNNI), which accelerate inference workloads by reducing the number of bits used to represent data (for example INT8 instead of FP32). VNNI fuses what was previously a sequence of three instructions for an INT8 multiply-accumulate into a single instruction, and the smaller data types also help reduce memory bandwidth requirements and improve efficiency.
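The low-precision arithmetic mentioned above can be sketched in plain Python. The function below emulates the arithmetic pattern of a VNNI-style INT8 dot product: groups of four unsigned 8-bit by signed 8-bit products are summed and added to a 32-bit accumulator. It only illustrates the arithmetic, not the speed-up, and the function name is hypothetical.

```python
def int8_dot_accumulate(u8_values, s8_values, acc=0):
    """Emulate an INT8 dot product with 32-bit accumulation.

    u8_values: unsigned 8-bit integers (0..255), e.g. quantized activations.
    s8_values: signed 8-bit integers (-128..127), e.g. quantized weights.
    """
    if len(u8_values) != len(s8_values) or len(u8_values) % 4:
        raise ValueError("inputs must have equal length that is a multiple of 4")
    if any(not 0 <= a <= 255 for a in u8_values):
        raise ValueError("u8_values must lie in 0..255")
    if any(not -128 <= b <= 127 for b in s8_values):
        raise ValueError("s8_values must lie in -128..127")

    for i in range(0, len(u8_values), 4):
        # Four 8-bit products are summed, then added to the wide accumulator.
        acc += sum(a * b for a, b in zip(u8_values[i:i + 4], s8_values[i:i + 4]))

    if not -2**31 <= acc < 2**31:
        raise OverflowError("result does not fit a signed 32-bit accumulator")
    return acc

print(int8_dot_accumulate([1, 2, 3, 4], [1, -1, 1, -1]))
```

Accumulating in 32 bits is what lets many small 8-bit products be summed without overflow.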

Intel® Neural Compressor


Intel® Neural Compressor is an open-source Python library that applies model compression techniques such as quantization, pruning, and knowledge distillation to deep learning models. Deep learning models are typically very large, which can make them difficult to deploy on resource-constrained devices such as smartphones and embedded systems. Intel® Neural Compressor addresses this challenge by compressing deep learning models so that they can be deployed more efficiently on these devices.

For example, post-training quantization converts a model's FP32 weights and activations to a lower-precision format such as INT8, using a small calibration dataset to choose value ranges, while an accuracy-driven tuning loop checks that the compressed model still meets a target accuracy. The result is a smaller, more efficient version of the original deep learning model that retains the information needed for accurate inference and can be deployed on resource-constrained devices.


Intel® Neural Compressor can significantly reduce the size of deep learning models: storing weights as INT8 instead of FP32 alone cuts their size by roughly 4x, and pruning can reduce it further. This can make it much easier to deploy deep learning models on a wide range of devices and can help to accelerate the development of new AI applications in areas such as robotics, autonomous vehicles, and the Internet of Things.
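Since this workload later quantizes an FP32 model to INT8, it may help to see what quantization does to a handful of weights. The sketch below is a generic affine (scale and zero-point) quantizer written in plain Python; it is not Intel® Neural Compressor's implementation, and the function names and sample weights are made up for illustration.

```python
def affine_quantize(weights):
    """Map floats to unsigned 8-bit integers using a scale and zero point."""
    lo = min(min(weights), 0.0)  # the representable range must include 0.0
    hi = max(max(weights), 0.0)
    scale = (hi - lo) / 255 or 1.0  # fall back to 1.0 if all weights are zero
    zero_point = round(-lo / scale)
    quantized = [min(255, max(0, round(w / scale) + zero_point)) for w in weights]
    return quantized, scale, zero_point

def dequantize(quantized, scale, zero_point):
    """Recover approximate float values from the 8-bit representation."""
    return [(q - zero_point) * scale for q in quantized]

weights = [-0.62, -0.1, 0.0, 0.33, 0.91]  # illustrative values only
quantized, scale, zero_point = affine_quantize(weights)
restored = dequantize(quantized, scale, zero_point)
max_error = max(abs(w, ) if False else abs(w - r) for w, r in zip(weights, restored))

fp32_bytes = 4 * len(weights)  # 4 bytes per FP32 weight
int8_bytes = 1 * len(weights)  # 1 byte per INT8 weight
print(quantized, f"max error {max_error:.4f}", f"size ratio {fp32_bytes / int8_bytes:.0f}x")
```

Each restored weight differs from the original by at most about one quantization step, while storage per weight drops from four bytes to one.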

Neural Compressor is available as a PyPI package and can be installed easily:

# install stable basic version from pypi
pip install neural-compressor

# install from source
git clone https://github.com/intel/neural-compressor.git
cd neural-compressor
pip install .

Intel® Optimization for TensorFlow

Intel® Extension for TensorFlow

Intel® Optimization for TensorFlow is a set of software optimizations that are specifically designed to improve the performance of TensorFlow, a popular open-source machine learning framework. The optimizations are tailored to take advantage of the hardware capabilities of Intel processors, such as Intel® Advanced Vector Extensions 512 (Intel® AVX-512), and build on Intel software such as the Intel® Math Kernel Library (Intel® MKL) and the Intel® Distribution for Python.

Intel® oneAPI AI Analytics Toolkit

Some of the key features of Intel® Optimization for TensorFlow include:

  1. TensorFlow compiled with Intel® AVX-512 instructions to enable vectorized math operations and improved performance.
  2. Intel® oneDNN (formerly Intel® MKL-DNN) library integration to optimize neural network computations on Intel processors.
  3. TensorFlow Docker images are pre-optimized with Intel® MKL, Intel® MPI, and other libraries to simplify deployment.
  4. Support for mixed precision training, which can reduce the memory requirements of deep learning models and improve training speed.
  5. TensorFlow Profiler integration to help developers optimize their models for performance.


# gpu version
pip install --upgrade intel-extension-for-tensorflow[gpu]
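As a rough sketch of how the oneDNN integration and mixed precision listed above might be switched on from Python in stock TensorFlow: `TF_ENABLE_ONEDNN_OPTS` and the `mixed_bfloat16` Keras policy are standard TensorFlow settings, but whether they help (or are already on by default) depends on the TensorFlow version and the CPU. The helper name is hypothetical.

```python
import os

# This must be set before TensorFlow is imported to have any effect.
os.environ["TF_ENABLE_ONEDNN_OPTS"] = "1"  # request oneDNN-optimized kernels

def enable_bfloat16_mixed_precision():
    """Switch Keras to bfloat16 compute while keeping float32 variables."""
    import tensorflow as tf  # imported lazily so the variable above is already set

    tf.keras.mixed_precision.set_global_policy("mixed_bfloat16")
    return tf.keras.mixed_precision.global_policy().name
```

Calling `enable_bfloat16_mixed_precision()` once, before building the model, applies the policy to every layer created afterwards.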

4th Gen Intel® Xeon® Scalable processors

Intel® Xeon® Scalable processors are a family of server processors designed for use in data centres and other high-performance computing applications. They are built using Intel’s latest microarchitecture. Key features of Intel® Xeon® Scalable processors include high core counts, high memory capacity, high performance, enhanced security, and scalability.

Overall, Intel® Xeon® Scalable processors are well-suited for a wide range of high-performance computing applications such as scientific simulations, big data analytics, and AI workloads.

Intel DevCloud

Intel DevCloud is a cloud-based platform that provides developers and data scientists with access to a wide range of Intel hardware and software tools for developing, testing, and optimizing their applications. The platform is designed to enable rapid experimentation and prototyping of applications on a variety of hardware configurations, including CPUs, GPUs, FPGAs, and other accelerators. Key features of Intel DevCloud include easy access to hardware, pre-installed software and tools, integration with popular development tools, scalability, and collaboration.

Next Steps

  1. Train a CNN ResNet model.
  2. Quantize the frozen PB model to an INT8 model with Intel® Neural Compressor.
  3. Compare the performance of the FP32 and INT8 models.
  4. Run skin cancer prediction.
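For steps 3 and 4 above, a minimal, framework-agnostic sketch might look like the following. It assumes each model is wrapped in a callable that takes a batch and returns raw class scores; `mean_latency_ms`, `top_prediction`, and the stand-in models are hypothetical names used only for illustration, and a real comparison would also measure accuracy, not just latency.

```python
import math
import statistics
import time

def mean_latency_ms(predict_fn, batch, warmup=3, runs=20):
    """Average wall-clock latency of predict_fn(batch) in milliseconds."""
    for _ in range(warmup):  # warm-up calls are excluded from the timing
        predict_fn(batch)
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        predict_fn(batch)
        timings.append((time.perf_counter() - start) * 1000.0)
    return statistics.mean(timings)

def top_prediction(scores, class_names):
    """Turn raw class scores into (label, probability) using a softmax."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]  # subtract the max for stability
    total = sum(exps)
    probs = [e / total for e in exps]
    best = max(range(len(probs)), key=probs.__getitem__)
    return class_names[best], probs[best]

# Stand-in "models": replace these with the real FP32 and INT8 predict calls.
fake_fp32_model = lambda batch: [sum(batch) * 0.1, 2.0, -1.0]
fake_int8_model = lambda batch: [round(sum(batch) * 0.1), 2, -1]

batch = [0.2, 0.4, 0.4]
fp32_ms = mean_latency_ms(fake_fp32_model, batch)
int8_ms = mean_latency_ms(fake_int8_model, batch)
label, prob = top_prediction(fake_fp32_model(batch), ["melanoma", "nevus", "other"])
print(f"FP32 {fp32_ms:.4f} ms, INT8 {int8_ms:.4f} ms, predicted {label} ({prob:.2f})")
```

Running both models on the same batch with identical warm-up and run counts keeps the latency comparison fair.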


Intel® Optimization for TensorFlow is designed to help data scientists and developers get the best possible performance from their TensorFlow workloads on Intel hardware. By taking advantage of the latest hardware capabilities and software optimizations, it can help to accelerate the development and deployment of AI applications across a wide range of industries and use cases.

Optimizing TensorFlow for 4th Gen Intel Xeon Processors

Follow up links


