
Tips and Tricks: Using DICOM Images for High School AI Science Fair Projects

Updated: Apr 11, 2022


How to convert DICOM Images into JPEG for STEM and Artificial Intelligence Science Fair Competition Projects

This blog is part of our Tips and Tricks series, where students can learn techniques and code snippets that can help them build science fair projects with AI and real-world applications and data.


In this blog post, we show you how to convert DICOM images to JPEG for use with deep learning models and neural networks such as ResNet.


What is DICOM?

DICOM stands for Digital Imaging and Communications in Medicine. It is a standard that medical practitioners and hospitals around the world use to share medical data with each other.


Why is DICOM important for science fair projects?


Science fair projects benefit from using real-world data. Judges of science fairs will frequently ask you where you got your data and why your data is credible. These days, fortunately, many cleaned and anonymized medical data sets are available to students for use in studies and projects. However, they usually come in formats used by medical professionals (like the DICOM format), and you will need to convert them to a more common format (like JPEG) to use them in standard AI programs.


How can I convert DICOM to JPEG?


The Python example code snippet below shows you how to do this. The key things to note are:

  • You will need to use a DICOM-specific library to read the image. We are using pydicom.

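  • The pixel values in a DICOM image can span a range that differs from file to file (many scans store more than 8 bits per pixel), so you will need to determine that range.

  • Once you find the range (by finding the max and min of the pixel values), you will need to rescale the values to the range your neural network's input pipeline expects. Image pipelines for networks like ResNet frequently start from 8-bit images with values from 0 to 255, which is also what JPEG stores, so we show that example here.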

  • Once you have rescaled the image, you need to write it back out in JPEG form. We are using OpenCV (the cv2 module) to do that.

  • Real-world datasets usually have hundreds or thousands of images. The simplest thing to do is to put them in a directory and process all of them at once. You can see how the code does that below.
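
The rescaling step from the list above can be sketched on its own before looking at the full script. This is a minimal illustration, not part of the original snippet: the function name scale_to_uint8 and the sample values are made up for the example, and it rounds to the nearest integer where the full script simply truncates. It also returns zeros for an image in which every pixel is identical, to avoid dividing by zero.

```python
import numpy as np


def scale_to_uint8(pixels):
    """Linearly map an array's values onto 0-255 and return it as uint8."""
    pixels = np.asarray(pixels, dtype=np.float64)
    min_value = pixels.min()
    value_range = pixels.max() - min_value
    if value_range == 0:
        # A flat image has no contrast to stretch; return all zeros
        # rather than dividing by zero.
        return np.zeros(pixels.shape, dtype=np.uint8)
    scaled = (pixels - min_value) / value_range * 255.0
    return np.round(scaled).astype(np.uint8)


# Made-up 12-bit style values standing in for ds.pixel_array.
example = np.array([[0, 1024], [2048, 4095]], dtype=np.uint16)
result = scale_to_uint8(example)
print(result)
```

The smallest input value becomes 0 and the largest becomes 255, whatever range the scanner originally used.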



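import os

import cv2
import numpy as np
import pydicom as dicom

# Set PNG to True to write .png files instead of .jpg
PNG = False
# Specify the .dcm folder path
folder_path = "DICOM IMAGE"
# Specify the output jpg/png folder path
jpg_folder_path = "JPG TEST"
# Create the output folder if it does not exist yet
os.makedirs(jpg_folder_path, exist_ok=True)
images_path = os.listdir(folder_path)
for n, image in enumerate(images_path):
  # Read the DICOM file and get its pixel data as a NumPy array
  ds = dicom.dcmread(os.path.join(folder_path, image))
  pixel_array_numpy = ds.pixel_array
  # Find the range of pixel values in this image
  min_value = np.min(pixel_array_numpy)
  max_value = np.max(pixel_array_numpy)
  range_pixels = max_value - min_value
  # Avoid dividing by zero when every pixel has the same value
  if range_pixels == 0:
    range_pixels = 1
  # Rescale to 0-255 and store as 8-bit integers, which cv2.imwrite expects
  pixel_array_numpy = ((pixel_array_numpy - min_value) / range_pixels) * 255
  pixel_array_numpy = pixel_array_numpy.astype('uint8')
  if PNG:
    output_name = image.replace('.dcm', '.png')
  else:
    output_name = image.replace('.dcm', '.jpg')
  # The JPEG quality setting only applies when writing .jpg files
  cv2.imwrite(os.path.join(jpg_folder_path, output_name), pixel_array_numpy, [cv2.IMWRITE_JPEG_QUALITY, 100])
  print('{} image converted'.format(n))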
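
Downloaded datasets often mix other files (notes, spreadsheets) in with the scans, and some use an uppercase .DCM extension. One possible way to handle that, sketched here with only the standard library, is to filter the folder listing before converting and to build the output name from the file's stem. The helper names list_dicom_files and output_name are hypothetical and are not part of the script above.

```python
import os
import tempfile


def list_dicom_files(folder_path):
    """Return the sorted names of files in folder_path that end in .dcm."""
    return sorted(
        name for name in os.listdir(folder_path)
        if name.lower().endswith(".dcm")
        and os.path.isfile(os.path.join(folder_path, name))
    )


def output_name(dicom_name, extension=".jpg"):
    """Swap the file extension, e.g. scan1.dcm becomes scan1.jpg."""
    stem, _ = os.path.splitext(dicom_name)
    return stem + extension


# Small demonstration in a throwaway folder with made-up, empty files.
with tempfile.TemporaryDirectory() as demo_dir:
    for name in ("b.dcm", "a.DCM", "notes.txt"):
        open(os.path.join(demo_dir, name), "w").close()
    found = list_dicom_files(demo_dir)

print(found)
print([output_name(name) for name in found])
```

In the conversion loop, you could iterate over list_dicom_files(folder_path) instead of os.listdir(folder_path), so that files that are not DICOM images are never passed to dcmread.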
 

Hope you find this helpful! For other tips and tricks, please select the Tips and Tricks category on our blog.


