Abstract
Data is a vital source of information for organizations, especially in the information age. In practice, databases are often imperfect and contain missing values, and analyses run on such databases may yield biased or misleading results. Imputing missing data is therefore regarded as one of the major steps in data mining. This study used several data mining methods to construct imputation models for different types of missing data. For continuous missing data, regression models and neural networks were used to build the imputation models. For categorical missing data, logistic regression, neural networks, C5.0, and CART were used. The results showed that the regression model provided the best estimates of continuous missing data, while the C5.0 model performed best for categorical missing data.
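To make the idea of regression-based imputation for a continuous variable concrete, the following is a minimal sketch and not the study's actual implementation. It assumes a single predictor, a hand-rolled ordinary least squares fit, and a hypothetical toy dataset; the function names `fit_simple_regression` and `impute_continuous` are illustrative only.

```python
def fit_simple_regression(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def impute_continuous(rows):
    """Fill missing y values (None) using a regression fitted on complete rows.

    rows is a list of (x, y) pairs; observed y values are left unchanged.
    """
    complete = [(x, y) for x, y in rows if y is not None]
    intercept, slope = fit_simple_regression(
        [x for x, _ in complete], [y for _, y in complete]
    )
    return [
        (x, y if y is not None else intercept + slope * x)
        for x, y in rows
    ]


# Hypothetical toy data: the third record is missing its y value.
records = [(1.0, 2.0), (2.0, 4.0), (3.0, None), (4.0, 8.0)]
filled = impute_continuous(records)
print(filled)
```

The model is fitted only on records where the target is observed, and its predictions replace the missing entries. For categorical targets, the same pattern would apply with a classifier (for example a decision tree) in place of the regression.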
Original language | English |
---|---|
Pages (from-to) | 109-118 |
Number of pages | 10 |
Journal | Journal of Intelligent Manufacturing |
Volume | 19 |
Issue number | 1 |
Publication status | Published - Feb 2008 |
Externally published | Yes |
Keywords
- BPNN
- C5.0
- Data mining
- Imputation
- Missing data
- Regression
ASJC Scopus subject areas
- Software
- Artificial Intelligence
- Industrial and Manufacturing Engineering