Missing data imputation using NIPALS in Excel
This tutorial shows how to easily impute missing data in Excel using the NIPALS algorithm with the XLSTAT software.
Principle of the NIPALS approach for completing missing data
The NIPALS method is a method presented by H. Wold (1973) to allow principal component analysis with missing values. The NIPALS algorithm is applied on the dataset and the obtained PCA model is used to predict the missing values.
Imputing missing values with XLSTAT
After opening XLSTAT, select XLSTAT / Preparing data / Missing data.
The dataset we are using here contains six variables and six observations with six missing values.
The Missing data dialog box appears. Select the data you want to complete in the Quantitative data field (in our case the table with missing values). Select the NIPALS missing data method. Activate the option for observation labels and select the name of the cars.
Once you have clicked on the OK button, the results are displayed on a new sheet.
Results of the NIPALS imputation process with XLSTAT
Descriptive statistics tables are displayed (one table before and a second one after imputation).
Then the completed data is displayed.
We can see in bold the completed values. We can compare these values to the real value available in this dataset:
We can see that imputed missing values are very close to the real values. For example for the displacement of Honda Civic, the real value is 1396 and the imputed value is 1365.236. If we had used a mean imputation method, the imputed value would have been 1781.4 which is very far from the value obtained with NIPALS.
Was this article useful?
- Yes
- No