A practical approach to supporting the medical practitioners who are helping us in the battle against COVID-19

Coronavirus disease 2019 (COVID-19) is a highly infectious disease caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). The disease was first identified in December 2019 in Wuhan, China, and has since spread across the world, affecting more than 200 countries. The impact is such that the World Health Organization (WHO) has declared the ongoing COVID-19 pandemic a Public Health Emergency of International Concern.

As of 29 April 2020, there was a total of 3,130,191 cases with 217,674 deaths in more than 200 countries across the world (source: Bing COVID-19 tracker).

In this scenario, one primary measure, already under way in most countries, is manual testing, so that the true situation can be understood and appropriate decisions can be taken.

But manual testing has drawbacks: testing kits are in short supply, and blood tests are costly and slow, taking around 5–6 hours to produce a result.

So the idea is to work around these limitations with deep learning, for faster and more efficient screening. Because the disease is highly contagious, the earlier we get results, the fewer cases there will be in a city. That is why we can use a Convolutional Neural Network (CNN) to get the job done.

Can you tell the two X-rays apart if they aren’t labeled? I bet you can’t, but a CNN can.

The analysis uses chest X-rays of patients, and the basic idea is to classify each X-ray as COVID-19 affected or normal. In short, this is a binary classification problem: Normal vs. COVID-19.

There are pros and cons to using deep learning in situations like this:

  1. Pros: saves time; less expensive; easy to operate.
  2. Cons: in practice we need ~100% accuracy, because a wrongly identified patient could spread the disease further.

Still, this model returns good accuracy and can be enhanced further.

Data Preparation

Machine learning needs a lot of data to train on; for this problem we need chest X-rays of both COVID-19 patients and healthy people. No single ready-made dataset is available, but we can assemble one ourselves and get started.

  1. Dr. Joseph Paul Cohen recently open-sourced a database of chest X-ray images of patients suffering from COVID-19. It collects COVID-19 images from publicly available research, as well as lung images from other pneumonia-causing diseases such as SARS, Streptococcus, and Pneumocystis.
  2. I also used Kaggle’s Chest X-ray competition dataset to extract X-rays of healthy patients, sampling 100 images to balance the available COVID-19 images.

The dataset contains COVID-19 X-ray scans along with the view (angle) from which each scan was taken. The most frequently used view turns out to be the posteroanterior (PA) view, so I have considered only the COVID-19 PA-view scans in my analysis.

To keep the classes balanced, we take an equal number of images from each class, shuffle them together, and then divide them into training and test sets.

You can skip the steps above if you want: I have already split and prepared the data, which can be found in my GitHub repository.
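The balancing-and-splitting step can be sketched in plain Python. This is only an illustration, not the exact script I used: the file names, the `test_fraction` of 0.2, and the seed are all assumptions. It also splits each class separately (rather than shuffling first and cutting once), which guarantees that both sets stay balanced.

```python
import random


def balanced_split(covid_files, normal_files, test_fraction=0.2, seed=42):
    """Sample the same number of images per class, then split into train/test.

    test_fraction and seed are illustrative defaults, not values from the article.
    """
    rng = random.Random(seed)
    # Use as many images per class as the smaller class can provide.
    n_per_class = min(len(covid_files), len(normal_files))
    n_test = round(n_per_class * test_fraction)

    train, test = [], []
    for label, files in (("covid", covid_files), ("normal", normal_files)):
        chosen = rng.sample(list(files), n_per_class)
        test += [(name, label) for name in chosen[:n_test]]
        train += [(name, label) for name in chosen[n_test:]]

    # Shuffle so that the two classes are mixed within each set.
    rng.shuffle(train)
    rng.shuffle(test)
    return train, test


# Hypothetical file names, only to show the call.
covid = [f"covid_{i}.png" for i in range(100)]
normal = [f"normal_{i}.png" for i in range(140)]
train_set, test_set = balanced_split(covid, normal)
print(len(train_set), len(test_set))
```

Each returned item is a `(file_name, label)` pair, so copying the files into `train/` and `test/` folders per class is a short loop from here.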

Model Deployment

Now that the data is prepared, which is the most tedious part of this project, let’s move to the next step: creating a deep learning model that learns the difference between normal and COVID-19 affected X-rays and can then make predictions.

I assume you know the basics of CNN architecture; if not, I highly recommend reading: Basics of CNN

Model Architecture

I use 3 hidden layers; you can experiment with more or fewer. I follow the traditional approach of increasing the number of neurons as we go deeper into the network, as this helps the model learn more features from the image, which gives us more reliable predictions.

The input shape is (224, 224, 3): we resize each image to 224*224 with 3 channels, a commonly used size that lets the model pick up even fine details and the necessary features of the image.

Finally, I flatten the features and use sigmoid as the output activation, since this is a binary classification problem and the output therefore contains a single unit. Adam as the optimizer works well with sigmoid, so I compile the model with Adam and binary cross-entropy loss.
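A minimal Keras sketch of such a network might look like the following. Only the input shape, the single sigmoid output, Adam, and binary cross-entropy come from the description above; the filter counts (32, 64, 128), the 3x3 kernels, and the 2x2 pooling are assumptions, so this sketch will not reproduce the parameter count of my actual model. TensorFlow is imported inside the function so the rest of the snippet runs even without it installed.

```python
INPUT_SHAPE = (224, 224, 3)    # height, width, channels (from the article)
CONV_FILTERS = (32, 64, 128)   # assumed filter counts, increasing with depth


def flattened_size(input_shape=INPUT_SHAPE, conv_filters=CONV_FILTERS):
    """Features reaching Flatten, for 3x3 'valid' convs each followed by 2x2 pooling."""
    height, width, _ = input_shape
    for _ in conv_filters:
        # A 3x3 'valid' conv removes 2 pixels; 2x2 pooling halves (rounding down).
        height, width = (height - 2) // 2, (width - 2) // 2
    return height * width * conv_filters[-1]


def build_model(input_shape=INPUT_SHAPE, conv_filters=CONV_FILTERS):
    """Build and compile a small binary-classification CNN (illustrative sizes)."""
    # Imported lazily so this file can be loaded without TensorFlow installed.
    from tensorflow.keras import layers, models

    model = models.Sequential()
    model.add(layers.Input(shape=input_shape))
    for n_filters in conv_filters:
        model.add(layers.Conv2D(n_filters, (3, 3), activation="relu"))
        model.add(layers.MaxPooling2D((2, 2)))
    model.add(layers.Flatten())
    # One output unit with sigmoid: a single score between 0 and 1.
    model.add(layers.Dense(1, activation="sigmoid"))
    model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
    return model


print(flattened_size())  # 26 * 26 * 128 features feed the output unit
```

Calling `build_model().summary()` prints the layer-by-layer shapes, which is a quick way to check that the flattened size matches the helper above.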

Parameters

You might be wondering why I didn’t directly use VGG16 or another predefined model. VGG16 contains roughly 138 million parameters, whereas our model has around 5.7 million. For a small dataset like this, a small custom model is quicker to train than spending hours on transfer learning with a much larger network.
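To see where the VGG16 figure comes from, we can count its parameters by hand. The sketch below assumes the standard VGG16 layout (thirteen 3x3 convolutions in five blocks, then three fully connected layers, with a 224*224*3 input and 1000 output classes); the helper names are my own.

```python
def conv_params(kernel, in_channels, out_channels):
    """Weights plus one bias per filter for a square-kernel convolution."""
    return kernel * kernel * in_channels * out_channels + out_channels


def dense_params(in_units, out_units):
    """Weights plus one bias per output unit for a fully connected layer."""
    return in_units * out_units + out_units


def vgg16_params(num_classes=1000):
    # (number of conv layers, filters) for each of the five VGG16 blocks.
    blocks = [(2, 64), (2, 128), (3, 256), (3, 512), (3, 512)]
    total, channels = 0, 3  # start from an RGB input
    for n_layers, filters in blocks:
        for _ in range(n_layers):
            total += conv_params(3, channels, filters)
            channels = filters
    # Five 2x2 poolings shrink 224x224 to 7x7 before the dense layers.
    total += dense_params(7 * 7 * 512, 4096)
    total += dense_params(4096, 4096)
    total += dense_params(4096, num_classes)
    return total


print(vgg16_params())  # 138357544
```

Most of those parameters sit in the first fully connected layer, which is why a small custom head makes such a difference to model size.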

Training Data

Now that the model is defined, the next task is to train it on our data.

I first tried training without shearing, zooming, and horizontal shifts, and got an accuracy of around 50%, which is no better than guessing on a balanced two-class problem and far too low for a real-world project like this.

Accuracy was low because the data wasn’t augmented

So it is better to augment the data so the model picks up features more reliably; we therefore apply shearing, zooming, and horizontal shifts to the training data.

Once the images are augmented, we resize them to the input shape of 224*224, group them into batches of 32, and train on the training set.
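With Keras, this preparation could be sketched as below. The target size and batch size come from the text; the augmentation strengths (0.2, 0.2, 0.1), the pixel rescaling, and the folder layout (one sub-folder per class under `train_dir`) are assumptions. Note that `ImageDataGenerator` is marked as deprecated in recent TensorFlow releases, so treat this as one possible way to do it.

```python
TARGET_SIZE = (224, 224)  # images are resized to this height and width
BATCH_SIZE = 32

# Assumed augmentation strengths; tune them for your own data.
AUGMENTATION = {
    "rescale": 1.0 / 255,        # scale pixel values to [0, 1]
    "shear_range": 0.2,          # shearing
    "zoom_range": 0.2,           # zooming
    "width_shift_range": 0.1,    # horizontal shifts
}


def make_train_generator(train_dir):
    """Yield augmented (images, labels) batches from a folder with one sub-folder per class."""
    # Imported lazily so this file can be loaded without TensorFlow installed.
    from tensorflow.keras.preprocessing.image import ImageDataGenerator

    datagen = ImageDataGenerator(**AUGMENTATION)
    return datagen.flow_from_directory(
        train_dir,
        target_size=TARGET_SIZE,
        batch_size=BATCH_SIZE,
        class_mode="binary",  # one 0/1 label per image
    )
```

The test images should only be rescaled, not augmented, so that evaluation reflects the real scans.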

Preparing training data

I train for 10 epochs with 8 steps per epoch; again, feel free to experiment with the hyperparameters, as that may yield better results.
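To make these numbers concrete: with a batch size of 32, 8 steps per epoch means the model sees 256 (augmented) images per epoch. Here is a small sketch of the training call together with that arithmetic; `train` and the two helpers are names I made up for illustration, and `model` and `train_generator` are assumed to be a compiled Keras model and a batch generator.

```python
import math

BATCH_SIZE = 32
EPOCHS = 10
STEPS_PER_EPOCH = 8


def images_per_epoch(steps=STEPS_PER_EPOCH, batch_size=BATCH_SIZE):
    """How many images the model sees in one epoch."""
    return steps * batch_size


def steps_to_cover(n_images, batch_size=BATCH_SIZE):
    """Steps needed to pass over n_images once (the last batch may be partial)."""
    return math.ceil(n_images / batch_size)


def train(model, train_generator):
    """Fit the model with the hyperparameters from the article."""
    return model.fit(
        train_generator,
        steps_per_epoch=STEPS_PER_EPOCH,
        epochs=EPOCHS,
    )


print(images_per_epoch())           # 256 images per epoch
print(images_per_epoch() * EPOCHS)  # 2560 images over the whole run
```

If your training set is a different size, `steps_to_cover` gives a reasonable starting value for the steps per epoch.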

Training the model
Summarizing the model

We can also plot loss and accuracy to better understand which hyperparameters are needed.

The chosen hyperparameters produce 96.4% accuracy, which isn’t bad but can still be improved: if we deployed a model with around 96.4% accuracy in a real scenario, wrongly identified patients could still spread the disease, and our goal of an efficient approach would not be met.

Again, if you want to skip training, you can download the trained model from my GitHub repository.

Confusion Matrix

To visualize the results in a more understandable way, we’re going to use a confusion matrix.

The prototype for the confusion matrix is as follows:

The confusion matrix we’re getting is as follows:

Decoding the confusion matrix: out of 30 COVID-19 affected patients, all 30 are classified correctly and 0 wrongly; out of 30 normal patients, 28 are classified correctly and 2 wrongly.
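The same counts can be computed directly from the true and predicted labels. The sketch below is plain Python rather than my notebook code; it assumes label 1 means COVID-19 and 0 means normal, and it rebuilds the test results as 30 COVID-19 images (all correct) and 30 normal images (28 correct, 2 wrong).

```python
def confusion_matrix(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary problem."""
    counts = {"tp": 0, "fn": 0, "fp": 0, "tn": 0}
    for truth, pred in zip(y_true, y_pred):
        if truth == positive and pred == positive:
            counts["tp"] += 1   # COVID-19 correctly detected
        elif truth == positive:
            counts["fn"] += 1   # COVID-19 missed
        elif pred == positive:
            counts["fp"] += 1   # normal flagged as COVID-19
        else:
            counts["tn"] += 1   # normal correctly cleared
    return counts


def accuracy(counts):
    """Share of all predictions that were correct."""
    return (counts["tp"] + counts["tn"]) / sum(counts.values())


# Assumed encoding: 1 = COVID-19, 0 = normal.
y_true = [1] * 30 + [0] * 30
y_pred = [1] * 30 + [0] * 28 + [1] * 2

cm = confusion_matrix(y_true, y_pred)
print(cm)                      # {'tp': 30, 'fn': 0, 'fp': 2, 'tn': 28}
print(round(accuracy(cm), 4))  # 0.9667
```

For this problem the false negatives matter most, since a missed COVID-19 case is the one that can spread the disease.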

The results are good, but the accuracy can still be improved to fulfil our intention.

Source Code

The whole source code, along with the dataset and trained model, can be found in my GitHub repository: Covid-19 Detection

Conclusion

To conclude, I want to stress again that the analysis was done on a limited dataset; the results are exploratory, and nothing conclusive can be inferred from them. The approach has not been medically validated.

I plan to improve the robustness of the model with more X-ray scans so that it generalizes better. Furthermore, I encourage readers to experiment with the model to make it more precise.

As always, thanks so much for reading, and please share this article if you found it useful!

Stay home, Stay safe!

Feel free to connect:

LinkedIn ~ https://www.linkedin.com/in/dakshtrehan/
Instagram ~ https://www.instagram.com/_daksh_trehan_/
GitHub ~ https://github.com/dakshtrehan

Follow for further Machine Learning/Deep Learning blogs.

Medium ~ https://medium.com/@dakshtrehan
Cheers.

The cover template and confusion matrix template were made by me on www.canva.com. The X-ray images are part of open-source datasets available on GitHub and Kaggle. The remaining pictures are from my Jupyter Notebook. Please contact me if you would like permission to use them.

Also, a shoutout to Prateek bhaiyaa and Coding Blocks for helping me understand these models better through their data science sessions.