TY - GEN
T1 - Hierarchical network for facial palsy detection
AU - Hsu, Gee Sern Jison
AU - Huang, Wen Fong
AU - Kang, Jiunn Horng
N1 - Publisher Copyright: © 2018 IEEE.
PY - 2018/12/13
Y1 - 2018/12/13
AB - We propose the Hierarchical Detection Network (HDN) for the detection of facial palsy syndrome. This may be the first deep-learning-based approach to facial palsy detection. The proposed HDN consists of three component networks: the first detects faces, the second detects the landmarks on the detected faces, and the third detects the local palsy regions. The first and third component networks are built on the Darknet framework, but with fewer convolutional layers for shorter processing time. The second component network employs the latest 3D face alignment network to locate the landmarks. The first component network employs an Na × Na grid over the whole input image, while the third employs an Nb × Nb grid over each detected face, enabling the HDN to locate the affected palsy regions efficiently. Because previous approaches were evaluated on proprietary databases, we collected 32 videos from YouTube and built the first public database for facial palsy study. To improve robustness against expression variations, we include the CK+ facial expression database in the training and testing phases. We show that the HDN not only detects the local palsy regions but also captures the frequency of the intensity, enabling video-to-description diagnosis of the syndrome. Experiments show that the proposed approach offers an accurate and efficient solution for facial palsy detection and diagnosis.
UR - http://www.scopus.com/inward/record.url?scp=85060846803&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85060846803&partnerID=8YFLogxK
U2 - 10.1109/CVPRW.2018.00100
DO - 10.1109/CVPRW.2018.00100
M3 - Conference contribution
AN - SCOPUS:85060846803
T3 - IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops
SP - 693
EP - 699
BT - Proceedings - 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, CVPRW 2018
PB - IEEE Computer Society
T2 - 31st Meeting of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, CVPRW 2018
Y2 - 18 June 2018 through 22 June 2018
ER -