Data Collection and Labeling
Red Rot Images
We collected a dataset of approximately 800-1000 images of sugarcane
plants, encompassing both healthy and red rot-infected plants. The
images were sourced from multiple agricultural databases and field
surveys, covering a variety of environmental conditions and geographic
regions. Expert agricultural professionals labeled each image to ensure
that disease identification was precise, providing accurate annotations
of red rot disease at different stages.
Data Augmentation
To improve the generalization of our models and compensate for the
relatively small dataset, we applied extensive data augmentation. This
increased the dataset size to over 8000 images and added variation to
the training data, making the models more robust. Augmentation techniques included (a sketch of such a pipeline follows this list):
• Rotation: random rotations between -40 and +40 degrees to simulate various orientations of plants in the field.
• Flipping: horizontal and vertical flips to mimic different plant orientations.
• Brightness and contrast adjustments: adjusting brightness levels within ±20% to simulate various lighting conditions during data collection.
• Zoom and cropping: introducing random zooms and crops to simulate differing distances between the camera and the plants.
• Random noise injection: adding small amounts of Gaussian noise to simulate real-world image imperfections.
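To make the augmentation step concrete, the sketch below shows how such a pipeline could be assembled with torchvision transforms. This is an illustration rather than the exact configuration used in this protocol; the parameter values mirror the ranges listed above, and the name augment is our own.

# Illustrative augmentation pipeline (assumed torchvision-based; parameter
# values mirror the ranges described above, not the exact protocol settings).
import torch
import torchvision.transforms as T

augment = T.Compose([
    T.RandomRotation(degrees=40),                     # random rotation in [-40, +40] degrees
    T.RandomHorizontalFlip(p=0.5),                    # horizontal flip
    T.RandomVerticalFlip(p=0.5),                      # vertical flip
    T.ColorJitter(brightness=0.2, contrast=0.2),      # +/-20% brightness and contrast variation
    T.RandomResizedCrop(size=224, scale=(0.8, 1.0)),  # random zoom and crop
    T.ToTensor(),
    T.Lambda(lambda x: x + 0.01 * torch.randn_like(x)),  # small Gaussian noise injection
])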
Model Training Pipeline
Our pipeline involved training several deep learning models, including
EfficientNet, ResNet-152, VGG16, and Vision Transformers, as well as a
Hybrid CNN + Random Forest model. The pipeline was designed to compare
the performance of these architectures on the red rot detection task and
identify the most efficient and accurate models for real-world
deployment.
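As a minimal sketch of how one architecture in this comparison could be fine-tuned, assuming a PyTorch/torchvision setup (which the protocol does not prescribe), the example below loads a pretrained EfficientNet, replaces its head with a two-class classifier (healthy vs. red rot), and trains it on an ImageFolder-style dataset. The directory path, batch size, learning rate, and epoch count are placeholders.

# Minimal fine-tuning sketch for one architecture in the comparison
# (assumed PyTorch/torchvision setup; paths and hyperparameters are placeholders).
import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from torchvision import datasets, models, transforms

device = "cuda" if torch.cuda.is_available() else "cpu"

# Pretrained backbone with a 2-class head (healthy vs. red rot).
model = models.efficientnet_b0(weights="IMAGENET1K_V1")
model.classifier[1] = nn.Linear(model.classifier[1].in_features, 2)
model = model.to(device)

# The augmentation transforms sketched earlier would replace this basic pipeline.
train_tf = transforms.Compose([transforms.Resize((224, 224)), transforms.ToTensor()])
train_ds = datasets.ImageFolder("data/train", transform=train_tf)   # hypothetical path
train_loader = DataLoader(train_ds, batch_size=32, shuffle=True)

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

for epoch in range(10):
    model.train()
    for images, labels in train_loader:
        images, labels = images.to(device), labels.to(device)
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()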
Model Architectures and Functionality
1. Hybrid CNN + Random Forest Model
• Why Chosen: The hybrid model combined the feature extraction capabilities of Convolutional Neural Networks (CNNs) with the robust classification of Random Forests, aiming to leverage the strengths of both deep learning and traditional machine learning (a sketch of this approach follows the list).
• Results: Test Accuracy: 96.15%, F1 Score: 96.11%.
2. EfficientNet Model
• Why Chosen: EfficientNet was selected for its ability to scale network depth, width, and resolution efficiently, providing a good balance between model complexity and computational resources.
• Results: Best Validation Accuracy: 99.88%, Test Accuracy: 99.88%, Precision: 100.00%, Recall: 99.76%, F1 Score: 99.88%.
3. ResNet-152 Model
• Why Chosen: ResNet-152 was chosen for its depth and use of residual connections, which help mitigate the vanishing gradient problem in deep neural networks, allowing the model to learn more intricate features.
• Results: Best Validation Accuracy: 100.00%, Test Accuracy: 99.64%, Precision: 99.53%, Recall: 99.77%, F1 Score: 99.65%.
4. VGG16 Model
• Why Chosen: VGG16 was included for comparative analysis due to its historical significance in image recognition tasks.
• Results: Best Validation Accuracy: 99.16%, Test Accuracy: 99.16%, Precision: 99.76%, Recall: 98.59%, F1 Score: 99.18%.
5. Vision Transformers Model
• Why Chosen: Vision Transformers (ViT) were explored to leverage self-attention mechanisms, potentially capturing global image features more effectively than CNNs.
• Results: Best Validation Accuracy: 97.71%, Test Accuracy: 96.51%, Precision: 96.73%, Recall: 96.01%, F1 Score: 96.37%.
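To make the hybrid design concrete, the sketch below (our illustration, not the protocol's exact implementation) uses a frozen pretrained CNN as a feature extractor and a scikit-learn Random Forest as the classifier; the backbone choice, feature dimension, and forest parameters are assumptions.

# Illustrative hybrid CNN + Random Forest: a frozen pretrained CNN extracts
# features, and a Random Forest classifies them (assumed PyTorch + scikit-learn).
import numpy as np
import torch
import torch.nn as nn
from torchvision import models
from sklearn.ensemble import RandomForestClassifier

device = "cuda" if torch.cuda.is_available() else "cpu"

# Pretrained backbone with the classification head removed.
backbone = models.resnet50(weights="IMAGENET1K_V1")
backbone.fc = nn.Identity()            # yields a 2048-dim feature vector per image
backbone = backbone.to(device).eval()

def extract_features(loader):
    """Run the frozen CNN over a DataLoader and collect features and labels."""
    feats, labels = [], []
    with torch.no_grad():
        for images, targets in loader:
            out = backbone(images.to(device))
            feats.append(out.cpu().numpy())
            labels.append(targets.numpy())
    return np.concatenate(feats), np.concatenate(labels)

# Usage (train_loader/test_loader as in the training sketch above):
# X_train, y_train = extract_features(train_loader)
# rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_train, y_train)
# X_test, y_test = extract_features(test_loader)
# print("Test accuracy:", rf.score(X_test, y_test))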
Evaluation Metrics
To assess the performance of the models, we used the following metrics (computed as sketched after this list):
• Accuracy: the percentage of correctly classified images.
• Precision: the ratio of true positive predictions to the total number of positive predictions.
• Recall (True Positive Rate): the percentage of actual positive cases (diseased plants) correctly identified by the model.
• F1 Score: the harmonic mean of precision and recall, balancing these two metrics.
• ROC Curve & AUC: the ROC curve evaluated the trade-off between the true positive rate and the false positive rate.
• Confusion Matrix: a confusion matrix provided detailed insights into model misclassifications.
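These metrics can be computed directly from the held-out test predictions with scikit-learn, as in the sketch below; y_true, y_pred, and y_score are placeholders for the test labels, predicted classes, and predicted red rot probabilities of any model above.

# Computing the reported evaluation metrics from test-set predictions
# (illustrative; y_true, y_pred, and y_score are placeholders).
from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                             f1_score, roc_auc_score, confusion_matrix)

def evaluate(y_true, y_pred, y_score):
    """y_true/y_pred are class labels (0 = healthy, 1 = red rot);
    y_score is the predicted probability of the red rot class."""
    print("Accuracy :", accuracy_score(y_true, y_pred))
    print("Precision:", precision_score(y_true, y_pred))
    print("Recall   :", recall_score(y_true, y_pred))
    print("F1 score :", f1_score(y_true, y_pred))
    print("ROC AUC  :", roc_auc_score(y_true, y_score))
    print("Confusion matrix:\n", confusion_matrix(y_true, y_pred))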
Conclusion
These protocols provide a comprehensive approach to building an AI-based
system for red rot disease detection in sugarcane. By following these
steps, we were able to develop models that generalize well across
different environmental conditions and disease stages. Our protocols
ensure that this system can be replicated and scaled for use in other
agricultural settings.
Analysis
1. ResNet-152 Model
Results: Best Validation Accuracy: 100.00%, Test Accuracy: 99.64%, Precision: 99.53%, Recall: 99.77%, F1 Score: 99.65%.
ResNet-152 delivered the strongest validation performance, achieving perfect validation accuracy and near-perfect test accuracy. Its depth and use of residual connections allowed it to capture complex features, making it highly effective at distinguishing between healthy and diseased sugarcane plants.
2. EfficientNet Model
Analysis: EfficientNet proved to be a highly accurate model, excelling in both precision and F1 score. Its ability to scale network depth, width, and resolution efficiently contributed to its high performance.
3. Hybrid CNN + Random Forest Model
Results: Test Accuracy: 96.15%, F1 Score: 96.11%.
4. VGG16 Model
Results: Best Validation Accuracy: 99.16%, Test Accuracy: 99.16%, Precision: 99.76%, Recall: 98.59%, F1 Score: 99.18%.
5. Vision Transformers Model
Results: Best Validation Accuracy: 97.71%, Test Accuracy: 96.51%, Precision: 96.73%, Recall: 96.01%, F1 Score: 96.37%.