TY - GEN
T1 - Using AI Algorithm to Establish the CVD Risk Assessment Model
AU - Chen, Yin Chen
AU - Lee, Hsiu An
AU - Hsu, Chien Yeh
N1 - Publisher Copyright: © 2022, The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
PY - 2022
Y1 - 2022
AB - In Taiwan, cardiovascular diseases, including heart disease and cerebrovascular disease, are among the top ten causes of death. With the development of data mining in the medical field, disease risk prediction models can be built as tools to assist physicians in diagnostic decision-making. Taking Taiwanese health examination data as an example, this study aims to identify the significant risk factors for cardiovascular disease and to establish a cardiovascular disease risk assessment model using data mining. The Chi-square test and Information Gain are used to identify the correlation between various factors and cardiovascular disease. Eight algorithms (decision tree, random forest, XGBoost, neural network, logistic regression, support vector machine, K-nearest neighbor, and a voting algorithm) are used to build the risk assessment model, with the confusion matrix and AUC used for model evaluation. Model performance is compared across different factor combinations. The experimental results identify 22 risk factors among the questionnaire and biochemical variables, and 10 risk factors among the questionnaire variables. ANN and VOTE with the 22 factors selected by Information Gain (threshold > 0.01) are the best models; both achieve the same accuracy (0.88) and AUC (0.90). The best model using questionnaire variables is ANN, with an accuracy of 0.89 and an AUC of 0.91.
KW - Cardiovascular disease
KW - Data mining
KW - Feature selection
KW - Risk assessment model
UR - http://www.scopus.com/inward/record.url?scp=85141716601&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85141716601&partnerID=8YFLogxK
U2 - 10.1007/978-981-19-4132-0_18
DO - 10.1007/978-981-19-4132-0_18
M3 - Conference contribution
AN - SCOPUS:85141716601
SN - 9789811941313
T3 - Lecture Notes in Electrical Engineering
SP - 156
EP - 166
BT - Innovative Computing - Proceedings of the 5th International Conference on Innovative Computing, IC 2022
A2 - Pei, Yan
A2 - Chang, Jia-Wei
A2 - Hung, Jason C.
PB - Springer Science and Business Media Deutschland GmbH
T2 - 5th International Conference on Innovative Computing, IC 2022
Y2 - 19 January 2022 through 21 January 2022
ER -