The Success Of Tomato Genius

Updated: Mar 15

Aim

I wanted to learn the basics of image classification using a neural network.

What I did

The COVID-19 pandemic has infected people of many races and ages, and I fear to imagine what would happen if a disease like that emerged in crops. Crops are the number one source of food for humans, and diseases can easily wipe them out. With that inspiration, I created an AI service that predicts which disease a tomato plant has, such as Early Blight, Late Blight, Septoria, or Curl Virus. To achieve this, I used images of tomato plant leaves as the dataset for the image classification training program provided by the aiclub.world website. The dataset was made up of 5 categories (1 for each disease and 1 healthy group) with roughly 20 images per category, about 120 images in total.

Problem

The leaf on the left has Early Blight. This is the AI's prediction for it before I moved the Early Blight images into the Late Blight category.

When I tested my AI service, I got a disappointing accuracy of 23.3%. That is barely better than the 20% you would expect from randomly picking 1 of the 5 categories. This happened because my AI service was confusing Early Blight with the healthy leaves, as the two look very similar.
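To make the comparison with random guessing concrete, here is a small Python sketch. The function names are my own for illustration and are not part of the aiclub.world tool; the only figures taken from this post are the 5 categories and the 23.3% accuracy.

```python
def chance_accuracy(num_categories):
    """Expected accuracy when guessing uniformly at random among the categories."""
    if num_categories < 1:
        raise ValueError("need at least one category")
    return 1.0 / num_categories


def accuracy(true_labels, predicted_labels):
    """Fraction of predictions that match the true labels."""
    if not true_labels or len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must be non-empty and the same length")
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


baseline = chance_accuracy(5)   # 5 categories, as in this project
reported = 0.233                # accuracy reported in this post
print(f"chance: {baseline:.1%}, reported: {reported:.1%}")
```

An accuracy this close to the chance level suggests the model has learned very little that separates the categories.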

Solution

I first changed my dataset 7 times, but that didn’t improve the accuracy much. After some tests, I suspected that Early Blight was being confused with the healthy leaves. To check, I used a process of elimination: I removed one category from the dataset and tested the AI again, then repeated this for each of the five categories.
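The elimination process can be sketched in Python as a loop that removes one category at a time and scores what is left. Everything below is a toy illustration under my own assumptions: the numbers are made up (each one stands for how much of a leaf looks diseased), and `nearest_centroid_accuracy` is a simple stand-in for the real training program, not what aiclub.world actually runs.

```python
def nearest_centroid_accuracy(dataset):
    """Toy stand-in for training: label each sample by the closest category mean."""
    centroids = {cat: sum(vals) / len(vals) for cat, vals in dataset.items()}
    correct = 0
    total = 0
    for cat, vals in dataset.items():
        for x in vals:
            predicted = min(centroids, key=lambda c: abs(x - centroids[c]))
            correct += predicted == cat
            total += 1
    return correct / total


def leave_one_category_out(dataset, train_and_score):
    """Remove each category in turn and record the score on the remaining ones."""
    results = {}
    for removed in dataset:
        reduced = {cat: vals for cat, vals in dataset.items() if cat != removed}
        results[removed] = train_and_score(reduced)
    return results


# Made-up values: healthy and early_blight overlap on purpose.
toy_dataset = {
    "healthy": [0.00, 0.02, 0.05, 0.08],
    "early_blight": [0.03, 0.06, 0.10, 0.12],
    "late_blight": [0.60, 0.70, 0.75, 0.80],
    "septoria": [0.35, 0.40, 0.45, 0.50],
}

results = leave_one_category_out(toy_dataset, nearest_centroid_accuracy)
for removed, score in results.items():
    print(f"without {removed}: {score:.1%}")
```

In this toy data the score only becomes perfect when one of the two overlapping categories is removed, which is the kind of signal that points to a confused pair.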

This is an Early Blight leaf that was predicted as Late Blight after the Early Blight images were put under the Late Blight category.

I found that removing Early Blight from the dataset resulted in an acceptable accuracy of 90.47%. However, I still wanted my AI to classify a leaf with Early Blight as dangerous, so I put the Early Blight images under the Late Blight category. My intuition for why this works is as follows. The AI learns that Late Blight has a certain kind of pixel arrangement in which about 70% of the leaf looks diseased. Early Blight leaves look only about 20-30% diseased. With both in one category, the AI learns weights that cover the whole range, which I picture as averaging the two percentages to decide how much of the leaf must look diseased. A real network does not literally average percentages, but training on both kinds of leaves helps it recognize a leaf that looks only about thirty percent diseased as dangerous.
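Putting the Early Blight images under the Late Blight category amounts to relabeling them before training. A minimal Python sketch of that relabeling is below; the label list is a made-up example, and `merge_categories` is a hypothetical helper, since in the actual tool this was presumably done by moving images between folders.

```python
from collections import Counter

# Which original category should be folded into which target category.
MERGE_MAP = {"Early Blight": "Late Blight"}


def merge_categories(labels, merge_map):
    """Return a new label list with merged categories renamed to their target."""
    return [merge_map.get(label, label) for label in labels]


# Made-up labels for six images, for illustration only.
labels = ["Healthy", "Early Blight", "Late Blight",
          "Septoria", "Early Blight", "Curl Virus"]

merged = merge_categories(labels, MERGE_MAP)
print(Counter(merged))
```

After the merge there are four categories instead of five, and every Early Blight image counts as an example of the dangerous Late Blight group.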

Learning

I learned how basic neural networks work and how images are converted into a form they can use. In general, a neural network runs on weights and calculations that are determined by the dataset you give it. An input is given to the neurons in the first layer of the network. That information is passed on to the next layer, which runs calculations on it and sends the results to the neurons in the layer after that, and so on. These in-between layers are called hidden layers, and they are where the AI does its thinking.
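As a rough sketch of what one layer's calculation can look like, the Python below computes a weighted sum for each neuron and passes the result on to the next layer. The weights, biases, and the ReLU step (which turns negative results into zero) are my own assumptions for illustration; the actual network trained by the tool may be built differently.

```python
def dense_layer(inputs, weights, biases):
    """Compute one layer: each neuron takes a weighted sum of the inputs plus a bias."""
    outputs = []
    for neuron_weights, bias in zip(weights, biases):
        total = sum(w * x for w, x in zip(neuron_weights, inputs)) + bias
        outputs.append(max(0.0, total))  # ReLU: negative sums become 0
    return outputs


# Made-up numbers: 2 inputs feeding a hidden layer of 2 neurons.
inputs = [1.0, 2.0]
hidden = dense_layer(inputs, weights=[[0.5, -1.0], [1.0, 1.0]], biases=[0.0, -1.0])

# The hidden layer's outputs become the inputs of the next layer (1 neuron here).
next_layer = dense_layer(hidden, weights=[[2.0, 0.5]], biases=[0.5])

print(hidden, next_layer)
```

Training is the process of adjusting those weights and biases so that the final layer's output matches the labels in the dataset.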

At the end of the network, there is an output layer where the AI has several possible outputs and chooses the best one for the input you give. In practice, it is very hard to work out why the AI follows a particular path of neurons from your input to its final prediction.
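One common way for an output layer to choose the best output is to turn its raw scores into probabilities with a softmax and pick the highest one. The Python sketch below assumes that approach, which this post does not confirm for the aiclub.world tool; the scores are made up, and only the category names come from this project.

```python
import math

CATEGORIES = ["Healthy", "Early Blight", "Late Blight", "Septoria", "Curl Virus"]


def softmax(scores):
    """Convert raw scores into probabilities that add up to 1."""
    highest = max(scores)  # subtracting the max keeps exp() from overflowing
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict(scores, category_names):
    """Return the most probable category and its probability."""
    probabilities = softmax(scores)
    best = max(range(len(probabilities)), key=probabilities.__getitem__)
    return category_names[best], probabilities[best]


# Made-up output-layer scores, one per category.
name, probability = predict([0.1, 0.3, 2.0, 0.2, 0.1], CATEGORIES)
print(f"{name} ({probability:.0%})")
```

The prediction is simply the category with the largest score, and the probability gives a rough sense of how confident the network is.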

One process used when images are converted to numbers and fed into a neural network is called flattening. Pixels in an image contain a number that represents a color in the light s