CIS520 Machine Learning | Lectures / Missing Data

Missing Data

 

Data are often missing. (Think about examples)

Missing at Random (MAR)

Data are rarely missing at random. When they are, there is usually a simple EM algorithm to impute the missing values. One can then do machine learning on the ‘complete’ data set.

A more complete definition is here
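
As an illustration of the EM idea for MAR data, here is a minimal sketch that assumes the rows are i.i.d. multivariate Gaussian. That distributional assumption, the function name `em_impute_gaussian`, and the `n_iter` parameter are all choices made for this example, not something the notes specify. It also assumes at least two features and that no feature is missing in every row.

```python
import numpy as np

def em_impute_gaussian(X, n_iter=50):
    """Fill NaNs in X with conditional means under a fitted Gaussian (hypothetical helper)."""
    X = np.asarray(X, dtype=float)
    n, d = X.shape
    missing = np.isnan(X)
    # Initialise from the observed column means and the mean-filled covariance.
    mu = np.nanmean(X, axis=0)
    X_hat = np.where(missing, mu, X)
    sigma = np.cov(X_hat, rowvar=False, bias=True) + 1e-6 * np.eye(d)
    for _ in range(n_iter):
        extra_cov = np.zeros((d, d))  # conditional covariance of the missing parts
        for i in range(n):
            m = missing[i]
            if not m.any():
                continue
            o = ~m
            if not o.any():
                # Nothing observed in this row: fall back to the current mean.
                X_hat[i] = mu
                extra_cov += sigma
                continue
            sigma_om = sigma[np.ix_(o, m)]
            # E-step: E[x_m | x_o] = mu_m + Sigma_mo Sigma_oo^{-1} (x_o - mu_o)
            gain = np.linalg.solve(sigma[np.ix_(o, o)], sigma_om).T
            X_hat[i, m] = mu[m] + gain @ (X[i, o] - mu[o])
            extra_cov[np.ix_(m, m)] += sigma[np.ix_(m, m)] - gain @ sigma_om
        # M-step: re-estimate the mean and covariance from the completed data.
        mu = X_hat.mean(axis=0)
        centered = X_hat - mu
        sigma = (centered.T @ centered + extra_cov) / n
    return X_hat

# Synthetic demo: two correlated features, the second hidden completely at random.
rng = np.random.default_rng(0)
true_cov = np.array([[1.0, 0.9], [0.9, 1.0]])
X_true = rng.multivariate_normal([0.0, 0.0], true_cov, size=500)
hidden = rng.random(500) < 0.3
X_obs = X_true.copy()
X_obs[hidden, 1] = np.nan
X_imp = em_impute_gaussian(X_obs)
```

The completed matrix `X_imp` can then be passed to any learner that expects complete data. When features are correlated, the conditional-mean fill should track the hidden values more closely than a plain column mean.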

Missing Not at Random (MNAR)

Most missing data are not missing at random. Values are usually missing for a reason, so the fact that a value is missing is itself informative.

For regression, a standard approach is:

  1. Replace each missing value with the mean of the observed values for that feature. (“imputation”)
  2. For each feature, add a separate indicator column: 1 if the value was missing, 0 if it was present.
  3. Run standard regression and feature selection.
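
The three steps above can be sketched in NumPy as follows. The helper name `impute_with_indicators` and the toy data are made up for this example, and the regression step uses plain least squares with no feature selection, so treat it as a starting point rather than a full pipeline.

```python
import numpy as np

def impute_with_indicators(X):
    """Mean-impute NaNs and append one missing-value indicator column per feature."""
    X = np.asarray(X, dtype=float)
    missing = np.isnan(X)
    # Step 1: fill each gap with the mean of the observed values in that column.
    col_means = np.nanmean(X, axis=0)
    X_filled = np.where(missing, col_means, X)
    # Step 2: indicator columns, 1.0 where the value was missing, 0.0 otherwise.
    indicators = missing.astype(float)
    return np.hstack([X_filled, indicators])

# Toy data: two features, some values missing.
X = np.array([[1.0, 10.0],
              [np.nan, 20.0],
              [3.0, np.nan],
              [5.0, 30.0],
              [7.0, 40.0],
              [np.nan, 50.0]])
y = np.array([1.0, 2.5, 2.0, 3.0, 4.0, 5.5])

design = impute_with_indicators(X)

# Step 3: ordinary least squares with an intercept (feature selection omitted here).
A = np.column_stack([np.ones(len(y)), design])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
```

A feature with no missing values produces an all-zero indicator column. `np.linalg.lstsq` tolerates that, but such columns can simply be dropped before fitting.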

Oddly, most packages don’t automatically add the missing-value indicators (step 2), although plenty of them will do step 1 (“imputation”).

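If you aren’t doing feature selection, you can just use a zero instead of the mean in step 1. With the indicator columns included, each mean-filled column equals the zero-filled column plus the mean times its indicator, so ordinary least squares gives the same predictions either way.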
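
A quick numerical check of the zero-versus-mean claim, as a sketch on synthetic data. The helper `design_matrix` and the data-generating choices are assumptions made for illustration only.

```python
import numpy as np

def design_matrix(X, fill):
    """Intercept, filled features, and missing-value indicators (hypothetical helper)."""
    missing = np.isnan(X)
    X_filled = np.where(missing, fill, X)
    return np.column_stack([np.ones(len(X)), X_filled, missing.astype(float)])

rng = np.random.default_rng(0)
X = rng.normal(loc=5.0, size=(200, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(scale=0.1, size=200)
X[rng.random(X.shape) < 0.2] = np.nan  # hide about 20% of the entries

A_mean = design_matrix(X, np.nanmean(X, axis=0))  # mean imputation
A_zero = design_matrix(X, 0.0)                    # zero imputation

# Fit ordinary least squares on each design and compare fitted values.
pred_mean = A_mean @ np.linalg.lstsq(A_mean, y, rcond=None)[0]
pred_zero = A_zero @ np.linalg.lstsq(A_zero, y, rcond=None)[0]
same_fit = np.allclose(pred_mean, pred_zero)
```

The two designs span the same column space, so `same_fit` should hold up to floating-point error. The coefficients on the indicator columns differ between the two fits, which is why the choice of fill value can matter once a feature-selection step or penalty is added.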

Page last modified on 17 November 2016 at 12:52 PM