TY - JOUR
T1 - MASPP and MWASP
T2 - multi-head self-attention based modules for UNet network in melon spot segmentation
AU - Tran, Khoa Dang
AU - Ho, Trang Thi
AU - Huang, Yennun
AU - Le, Nguyen Quoc Khanh
AU - Tuan, Le Quoc
AU - Ho, Van Lam
N1 - Publisher Copyright: © The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2024.
PY - 2024
Y1 - 2024
N2 - Sweet melon, and spotted melon in particular, is one of the most profitable fruit crops for farmers in the international market. Because the spot ratio affects the melon’s visual appeal, it plays a significant role in shaping consumers’ initial impressions and influencing their decision to purchase a spotted melon. However, accurately determining the spot area on a melon’s skin is challenging due to the diverse sizes and colors of these spots among different types of melons. In this study, novel networks based on the UNet model are proposed to accurately determine the spot area on melon skins after harvesting. First, a Mask R-CNN model was employed to isolate the melons from unwanted objects and backgrounds. Then, novel variants of the Atrous Spatial Pyramid Pooling (ASPP) and Waterfall Atrous Spatial Pooling (WASP) modules were developed based on the multi-head self-attention (MHSA) approach to efficiently enhance the original structures. Finally, the proposed modules were integrated into a VGG16-UNet network to segment the spots on melon skins. The experimental results demonstrate that the proposed methods yielded promising outcomes, achieving a mean IoU of 89.86% and an accuracy of 99.45% across all classes. Moreover, they outperformed other existing models.
AB - Sweet melon, and spotted melon in particular, is one of the most profitable fruit crops for farmers in the international market. Because the spot ratio affects the melon’s visual appeal, it plays a significant role in shaping consumers’ initial impressions and influencing their decision to purchase a spotted melon. However, accurately determining the spot area on a melon’s skin is challenging due to the diverse sizes and colors of these spots among different types of melons. In this study, novel networks based on the UNet model are proposed to accurately determine the spot area on melon skins after harvesting. First, a Mask R-CNN model was employed to isolate the melons from unwanted objects and backgrounds. Then, novel variants of the Atrous Spatial Pyramid Pooling (ASPP) and Waterfall Atrous Spatial Pooling (WASP) modules were developed based on the multi-head self-attention (MHSA) approach to efficiently enhance the original structures. Finally, the proposed modules were integrated into a VGG16-UNet network to segment the spots on melon skins. The experimental results demonstrate that the proposed methods yielded promising outcomes, achieving a mean IoU of 89.86% and an accuracy of 99.45% across all classes. Moreover, they outperformed other existing models.
KW - Atrous spatial pyramid pooling
KW - Multi-head self-attention
KW - Semantic segmentation
KW - UNet
KW - Waterfall atrous spatial pooling
UR - http://www.scopus.com/inward/record.url?scp=85188904692&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85188904692&partnerID=8YFLogxK
U2 - 10.1007/s11694-024-02466-1
DO - 10.1007/s11694-024-02466-1
M3 - Article
AN - SCOPUS:85188904692
SN - 2193-4126
VL - 18
SP - 3935
EP - 3949
JO - Journal of Food Measurement and Characterization
JF - Journal of Food Measurement and Characterization
IS - 5
ER -