TY - JOUR
T1 - Deep Learning-Based Hepatocellular Carcinoma Histopathology Image Classification
T2 - Accuracy Versus Training Dataset Size
AU - Lin, Yu Shiang
AU - Huang, Pei Hsin
AU - Chen, Yung Yaw
N1 - Funding Information:
This work was supported in part by the Ministry of Science and Technology, and in part by the National Taiwan University (NTU), Taiwan.
Publisher Copyright:
© 2013 IEEE.
PY - 2021
Y1 - 2021
N2 - Globally, liver cancer causes more than 700,000 deaths each year and is the second-leading cause of death from cancer. Hepatocellular carcinoma (HCC) is the most common type of liver cancer in adults and accounts for most deaths in cirrhosis patients. Patients with early-stage liver cancer can be treated by surgical intervention with a good prognosis; thus, early diagnosis, as confirmed by liver pathology examination, is necessary to combat HCC. Conventional manual pathology examination requires considerable time and labor, even with established expertise. It is widely accepted that intelligent classifiers may prove effective in the diagnosis process. In this study, we used a GoogLeNet (Inception-V1)-based binary classifier to classify HCC histopathology images. The classifier achieved 91.37% (±2.49) accuracy, 92.16% (±4.93) sensitivity, and 90.57% (±2.54) specificity in HCC classification. Although the classification accuracy of deep learning is reported to be positively correlated with the amount of training data, it is often uncertain how much training data are required for deep learning to achieve satisfactory performance in clinical diagnosis. Moreover, deep learning methods require annotated data to generate efficient models. However, annotated data are a relatively scarce resource and can be expensive to obtain. Hence, the relationship between classification accuracy and the number of liver histopathology images for training was investigated. An inverse power law function-based estimation model is proposed to evaluate the minimum number of annotated training images required for a desired diagnostic accuracy.
KW - Convolutional neural network
KW - deep learning
KW - hepatocellular carcinoma
KW - histopathology image classification
KW - inverse power law function-based fitting curve regression
UR - http://www.scopus.com/inward/record.url?scp=85101731801&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85101731801&partnerID=8YFLogxK
U2 - 10.1109/ACCESS.2021.3060765
DO - 10.1109/ACCESS.2021.3060765
M3 - Article
AN - SCOPUS:85101731801
SN - 2169-3536
VL - 9
SP - 33144
EP - 33157
JO - IEEE Access
JF - IEEE Access
M1 - 9359762
ER -