Semi-supervised approach for object density estimation, localization, and counting

Proposed AgRegNet, an attention-based deep regression network designed to simultaneously estimate density, localize, and count objects in complex scenes from minimal point annotations. It bypasses resource-intensive object detection and polygon annotation, directly generating high-fidelity density maps whose pixel-wise sum yields the object count. The approach delivers high accuracy in both sparse and dense object distributions under heavy occlusion. Details: (Bhattarai et al., 2024)
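The point-annotation idea can be illustrated with a minimal sketch (not the paper's code): each annotated point becomes a unit impulse, a Gaussian kernel spreads it into a density blob, and the map's pixel-wise sum recovers the object count. The function name and sigma value here are illustrative assumptions.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def density_map_from_points(points, shape, sigma=4.0):
    """Place a unit impulse at each annotated (row, col) point, then
    smooth with a Gaussian kernel. Because the kernel is normalized,
    the map integrates to the number of annotated objects."""
    dmap = np.zeros(shape, dtype=np.float64)
    for y, x in points:
        dmap[int(y), int(x)] += 1.0
    return gaussian_filter(dmap, sigma=sigma, mode="constant")

# Three point annotations on a 100x100 image (hypothetical example).
points = [(20, 30), (50, 60), (52, 62)]
dmap = density_map_from_points(points, shape=(100, 100))
count = dmap.sum()  # close to 3.0 for interior points
```

A learned network regresses such a map directly from the image; summing its output gives the count without any detection step.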

[Left/Top] Canopy image with early-stage flowers. [Right/Bottom] The same canopy with the density map generated by the proposed approach overlaid.
[Left/Top] Canopy image with full-bloom flowers. [Right/Bottom] The same canopy with the density map generated by the proposed approach overlaid.
[Left/Top] Canopy image with harvest-ready apples. [Right/Bottom] The same canopy with the density map generated by the proposed approach overlaid.

Once the density maps were generated, a post-processing algorithm estimated flower and fruit locations. Individual flowers and fruits were localized by identifying local peaks in the density map, which correspond to object centroids. The predicted peaks were then matched to ground-truth annotations by bipartite graph matching with the Hungarian algorithm, giving a precise one-to-one correspondence.
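These two post-processing steps can be sketched as follows; this is a simplified illustration under assumed thresholds (`min_val`, window `size`, `max_dist`), not the paper's implementation:

```python
import numpy as np
from scipy.ndimage import maximum_filter
from scipy.optimize import linear_sum_assignment

def localize_peaks(dmap, min_val=0.01, size=5):
    """Return local maxima of the density map as predicted centroids.
    A pixel is a peak if it equals the max of its neighborhood and
    exceeds a small threshold that suppresses background noise."""
    neighborhood_max = maximum_filter(dmap, size=size)
    peaks = (dmap == neighborhood_max) & (dmap > min_val)
    return np.argwhere(peaks)  # array of (row, col) coordinates

def match_predictions(pred, gt, max_dist=10.0):
    """One-to-one match of predicted peaks to ground-truth points:
    Hungarian algorithm on the pairwise Euclidean distance matrix,
    keeping only pairs closer than max_dist."""
    if len(pred) == 0 or len(gt) == 0:
        return []
    cost = np.linalg.norm(pred[:, None, :] - gt[None, :, :], axis=-1)
    rows, cols = linear_sum_assignment(cost)
    return [(r, c) for r, c in zip(rows, cols) if cost[r, c] <= max_dist]
```

Matched pairs become true positives; unmatched predictions are false positives and unmatched ground-truth points are false negatives.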

[Zoom in for better view] Localization results for full-bloom flowers and harvest-ready apples. Red: ground truth; Blue: true positive; Magenta: false negative; Yellow: false positive; Cyan: line connecting a ground-truth point to its matched true positive.

For flower images, AgRegNet achieved a density map similarity score of 93.8 out of 100, an object counting accuracy of 87.3%, and a localization accuracy score of 0.81 (on a scale of 0 to 1). For fruit images, the model reached a density map similarity score of 91.0, a counting accuracy of 94.4%, and a localization accuracy of 0.93, demonstrating strong performance in both visual scene understanding and object localization.
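For orientation, one common way such numbers are computed (the paper may define its metrics differently; these formulas are assumptions for illustration) is relative counting accuracy and an F1-style localization score over the matched pairs:

```python
def counting_accuracy(pred_count, gt_count):
    """Counting accuracy (%) as 1 minus the relative absolute error."""
    return 100.0 * (1.0 - abs(pred_count - gt_count) / gt_count)

def localization_f1(n_matched, n_pred, n_gt):
    """F1 score from matched pairs: precision over predictions,
    recall over ground-truth points."""
    precision = n_matched / n_pred if n_pred else 0.0
    recall = n_matched / n_gt if n_gt else 0.0
    denom = precision + recall
    return 2.0 * precision * recall / denom if denom else 0.0
```

For example, predicting 94 apples where 100 exist gives 94% counting accuracy, and matching 9 of 10 predictions to 9 of 10 ground-truth points gives an F1 of 0.9.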

Contact me for code and dataset.


SKILLS: Python, OpenCV, PyTorch, NumPy, Pandas, Scikit-learn, Linux

References

2024

  1. Uddhav Bhattarai, Santosh Bhusal, Qin Zhang, and 1 more author
     Computers and Electronics in Agriculture (COMPAG), 2024