
Random forests

Given {$n$} observations, each with {$p$} predictors.

Input: {$m \ll p$}, the number of predictors to sample at each split (often {$\sqrt{p}$}), and {$f$}, the fraction of the data to use for training.

Repeat many times:

  • Choose a training set by drawing {$f \cdot n$} training cases with replacement. This is called bagging (bootstrap aggregating)
  • Build a decision tree as follows (see the sketch after this list)
    • For each node of the tree, randomly choose {$m$} variables and find the best split from among those {$m$} variables
    • Repeat until the full tree is built (no pruning)
      • Sometimes people just do this with “stumps”: trees with a single split
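
A minimal sketch of this training loop in Python, assuming numpy arrays and using scikit-learn's DecisionTreeClassifier, whose max_features option handles the "best split among {$m$} variables" step. The name train_forest and its defaults are illustrative assumptions, not part of these notes:

    import numpy as np
    from sklearn.tree import DecisionTreeClassifier

    def train_forest(X, y, n_trees=100, f=1.0, seed=0):
        """Grow a forest of unpruned, feature-subsampled trees (illustrative sketch)."""
        rng = np.random.default_rng(seed)
        n, p = X.shape
        m = max(1, int(np.sqrt(p)))  # often sqrt(p), as noted above
        forest = []
        for _ in range(n_trees):
            # Bagging: draw f*n training cases with replacement
            idx = rng.integers(0, n, size=int(f * n))
            # max_features=m: each node considers only m randomly chosen variables;
            # the default settings grow the full tree with no pruning
            tree = DecisionTreeClassifier(max_features=m,
                                          random_state=int(rng.integers(1 << 31)))
            tree.fit(X[idx], y[idx])
            forest.append(tree)
        return forest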

To predict, take the modal classification (‘majority vote’) over all the trees.
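
A matching sketch of the majority vote, assuming the train_forest() helper above (again an illustration, not the notes' own code):

    import numpy as np
    from collections import Counter

    def predict_forest(forest, X):
        # Each tree casts one vote per case; return the modal classification
        all_votes = [tree.predict(X) for tree in forest]
        return np.array([Counter(votes).most_common(1)[0][0]
                         for votes in zip(*all_votes)])

For example, predict_forest(train_forest(X_train, y_train), X_test) would return one predicted class per test case (X_train, y_train, and X_test being hypothetical data arrays).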

See also Wikipedia: https://en.wikipedia.org/wiki/Random_forest

Back to Lectures
