CIS520 Machine Learning | Lectures / Real ML

Real ML

 
  • Overfitting is your worst enemy
    • Train, Test (Quiz), Validate
    • Out-of-sample in the real world is subtle
  • Loss functions
  • Feature generation is critical
    • Think about the problem!!
    • How might you transform the features?
      • Do you want a scale-invariant method or not?
    • What else could you measure?
    • Is semi-supervised learning possible?
    • Are there surrogate labels you might use?
  • Feature Blocks
    • Different feature sets need different regularization
    • One solution: block-stagewise regression
  • Combinations of multiple methods (“ensemble methods”) are usually the most accurate
  • Missing data
    • Data that are missing at random and data that are not require different imputation methods
  • Explanation/Insight is often important
    • Visualization: word clouds, PCA, MDS
      • MDS (multidimensional scaling): given an {$n \times n$} matrix of pairwise distances between points, find a new (usually 2-D) representation of each point that preserves those distances as closely as possible
    • Look at the data
      • Posts or images that score highest on some feature or outcome
    • Variable importance
      • How “important” is each feature for the prediction?
    • Correlation is not causality
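
The MDS item above can be made concrete with a short sketch. It uses classical (Torgerson) MDS, which is only one of several MDS variants and is not necessarily the one intended in the lecture. The function name classical_mds and the toy data are assumptions made for illustration, and the input is assumed to be a symmetric matrix of Euclidean distances.

```python
import numpy as np

def classical_mds(D, k=2):
    """Embed n points in k dimensions from an n x n distance matrix D.

    Illustrative sketch of classical (Torgerson) MDS; assumes D is
    symmetric and holds Euclidean distances.
    """
    D = np.asarray(D, dtype=float)
    n = D.shape[0]
    # Centering matrix: removes row and column means.
    J = np.eye(n) - np.ones((n, n)) / n
    # Double-center the squared distances to get a Gram (inner product) matrix.
    B = -0.5 * J @ (D ** 2) @ J
    # eigh returns eigenvalues of a symmetric matrix in ascending order.
    eigvals, eigvecs = np.linalg.eigh(B)
    # Keep the k largest eigenvalues and their eigenvectors.
    idx = np.argsort(eigvals)[::-1][:k]
    # Clip tiny negative eigenvalues caused by round-off or non-Euclidean input.
    lam = np.clip(eigvals[idx], 0.0, None)
    return eigvecs[:, idx] * np.sqrt(lam)

# Toy example: distances between 6 random 2-D points (made-up data).
rng = np.random.default_rng(0)
points = rng.normal(size=(6, 2))
D = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)

Y = classical_mds(D, k=2)
# Pairwise distances in the recovered 2-D representation.
D_rec = np.linalg.norm(Y[:, None, :] - Y[None, :, :], axis=-1)
```

Because the toy points already live in 2-D, the recovered distances D_rec should match D up to floating-point error, although Y itself may be a rotated, reflected, or shifted copy of the original points. With real data the fit is generally only approximate.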

Page last modified on 21 November 2016 at 02:52 PM