COVID-19 Detection using CNNs
Detecting the presence of COVID-19 in a patient’s lungs by analyzing chest X-rays using CNNs.
COVID-19 is an infection caused by the SARS-CoV-2 virus. The virus has multiple variants; one particularly dangerous one is the Delta variant, which is known to damage the lungs of its host. This damage can be spotted on a chest X-ray scan, and such scans are the basis for this project: a Convolutional Neural Network analyzes them for signs of COVID-19 infection. The tricky bit is differentiating between scans of patients with viral pneumonia and those with COVID-19, because the later stages of a COVID-19 infection are marked by a pneumonia-like infection in the lungs. This project was implemented in the PyTorch framework.
1. The Dataset
The COVID-19 Radiography Database is a collection of chest X-rays of patients with the following conditions:
- COVID-19 infection: 3,616 images
- Normal lungs: 10,192 images
- Non-COVID lung infections: 6,012 images
- Viral pneumonia infection: 1,345 images
This database was created by researchers from Qatar University and the University of Dhaka in collaboration with medical professionals in their countries as well as Malaysia and Pakistan.
2. Custom Dataset
I created a custom dataset class to simplify training and testing the model. This class inherits from torch.utils.data.Dataset and implements the __getitem__() method.
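A minimal sketch of what such a class could look like, assuming one folder of images per class. The class name ChestXRayDataset, the image_dirs argument, and the folder layout are hypothetical and are not taken from the original project code.

```python
import os

import torch
from PIL import Image


class ChestXRayDataset(torch.utils.data.Dataset):
    """Hypothetical sketch: serves (image, label) pairs from per-class folders."""

    def __init__(self, image_dirs, transform=None):
        # image_dirs maps a class name to a folder of images (assumed layout).
        self.class_names = sorted(image_dirs)
        self.transform = transform
        self.samples = []
        for label, name in enumerate(self.class_names):
            folder = image_dirs[name]
            for fname in sorted(os.listdir(folder)):
                if fname.lower().endswith((".png", ".jpg", ".jpeg")):
                    self.samples.append((os.path.join(folder, fname), label))

    def __len__(self):
        # DataLoader needs the dataset size to build batches.
        return len(self.samples)

    def __getitem__(self, index):
        path, label = self.samples[index]
        # X-rays are often grayscale; ResNet-style models expect 3 channels.
        image = Image.open(path).convert("RGB")
        if self.transform is not None:
            image = self.transform(image)
        return image, label
```

An instance can be wrapped in torch.utils.data.DataLoader to get shuffled mini-batches, provided the transform converts each image to a tensor.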
3. Image Transforms
A few preprocessing steps were added to the pipeline before the model could be trained. For the training images:
- Resizing: match input dimensions for pretrained ResNet18
- Augmentation: RandomHorizontalFlip
- Normalization
And for the test images:
- Resizing
- Normalization
Normalization was done separately for the training and test images to avoid data leakage.
4. The Model
The Convolutional Neural Network used for this project is a relatively lightweight residual network, ResNet18. This model was chosen for ease of transfer learning: it was pretrained on the ImageNet dataset, whose standard classification subset covers 1,000 classes.
5. Training and Performance
The model was trained for a few epochs to fine-tune the weights and improve prediction accuracy. The performance criterion was set to 95% accuracy, and the model reached it in under 2 epochs. The full code, including the training loop, is available in my GitHub repository for this project.
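A fine-tuning loop with an accuracy-based stopping rule might look like the sketch below. This is not the loop from the repository: the function names, the Adam optimizer, the learning rate, and the evaluation interval are all assumptions chosen for illustration.

```python
import torch


def evaluate(model, loader, device="cpu"):
    """Return the classification accuracy of model over loader."""
    model.eval()
    correct = total = 0
    with torch.no_grad():
        for images, labels in loader:
            images, labels = images.to(device), labels.to(device)
            preds = model(images).argmax(dim=1)
            correct += (preds == labels).sum().item()
            total += labels.size(0)
    return correct / max(total, 1)


def train(model, train_loader, test_loader, epochs=2, target_acc=0.95,
          lr=3e-5, eval_every=20, device="cpu"):
    """Fine-tune until target_acc is reached or the epochs run out."""
    model.to(device)
    loss_fn = torch.nn.CrossEntropyLoss()
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)  # assumed optimizer
    step = 0
    for _ in range(epochs):
        for images, labels in train_loader:
            model.train()
            images, labels = images.to(device), labels.to(device)
            optimizer.zero_grad()
            loss = loss_fn(model(images), labels)
            loss.backward()
            optimizer.step()
            step += 1
            # Check accuracy periodically so training can stop mid-epoch.
            if step % eval_every == 0:
                acc = evaluate(model, test_loader, device)
                if acc >= target_acc:
                    return acc
    return evaluate(model, test_loader, device)
```

Evaluating every few steps instead of once per epoch is what allows a run to stop partway through an epoch once the target is met.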
6. Results
With an accuracy of 95%, this model can be used as a tool to detect COVID-19 in the lungs of patients. It cannot, in any capacity, be a substitute for a medical professional.