
Description

CIS 5200 provides a fundamental introduction to the mathematics, algorithms and practice of machine learning, focusing on representation, loss functions, and optimization. Topics covered include:

Supervised learning: least squares regression, logistic regression, L0/L1/L2 feature selection/regularization, online learning, boosting, Naive Bayes, support vector machines, ensemble methods, neural nets/deep learning
Unsupervised learning: PCA, K-means clustering, Gaussian Mixture Models, EM, HMMs, Bayesian networks
Reinforcement learning: TD-learning, Q-learning, deep learning

Audience

The course is aimed broadly at advanced undergraduates and beginning graduate students in computer science, electrical engineering, mathematics, physics, and statistics. This is a hard course. A good alternative for those with less math background or less time is CIS419/519; if you want a really nice, much easier intro, take the Coursera ML course. If unsure which to take, see this.

Software

We will be coding in Python, using Jupyter notebooks with the scikit-learn and PyTorch libraries, running on Google Colab.

Pre-requisites

Basic probability and statistics (random variables, covariance matrix, CDF/PDF, Gaussian and other distributions, multiple regression). [CSE 261]
Basic linear algebra (matrices, vectors, rank, basis, projection, inverse, eigenvectors).
Reasonable programming skills, including basic knowledge of Python.

Format

The format this year will be a mix of

Lectures - these can be watched live or via video recording
Worksheets - step-by-step Jupyter notebooks that cover the core material
Homework, quizzes, final project, midterm, final (see below)
"Pods" - mandatory-attendance, TA-led discussion groups that meet for one hour a week. They should help you meet other students, ensure you understand the material, and allow for discussion of broader topics.

Evaluation

20% Worksheets: These are graded primarily on being completed on time (but answers need to be sensible).
20% Problem Sets: Your lowest homework score will be dropped. Any homework turned in late will be penalized 25 points per late day or fraction of day.
10% Participation/attendance: Only for your pod! One permitted absence.
10% Final project: Late days are not permitted for the final project.
10% Quizzes and Surveys
10% Midterm
20% Final (cumulative: 1/3 on pre-midterm; 2/3 on post-midterm)

The problem sets include programming questions. The midterm and final will be semi-closed book exams (cheat sheet allowed: one 2-sided sheet for the midterm, two 2-sided sheets for the final), which will encompass material covered in the lectures and assigned in the readings. The project is an open-ended three-person team project.
We do not take attendance except for the pods, but you will learn more if you attend lectures instead of watching the recordings.
Worksheets, quizzes and surveys should be completed before your pod meeting in the week after they are assigned.
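
To make the grading arithmetic concrete, here is a minimal Python sketch of how the weights above might combine into a course score. It is an illustration, not the official grade calculation. The function names are hypothetical, and it assumes that every component is scored on a 0-100 scale, that a late-penalized homework score cannot drop below 0, and that late days are measured as a possibly fractional number of days.

```python
import math

# Weights copied from the Evaluation list above; they sum to 1.0.
WEIGHTS = {
    "worksheets": 0.20,
    "problem_sets": 0.20,
    "participation": 0.10,
    "final_project": 0.10,
    "quizzes_surveys": 0.10,
    "midterm": 0.10,
    "final": 0.20,
}


def late_penalized(score, days_late):
    """Apply 25 points per late day or fraction of a day.

    Flooring the result at 0 is an assumption; the policy does not say.
    """
    penalty = 25 * math.ceil(days_late)
    return max(0.0, score - penalty)


def problem_set_average(scores):
    """Average the homework scores after dropping the single lowest one."""
    if len(scores) < 2:
        raise ValueError("need at least two scores to drop the lowest")
    kept = sorted(scores)[1:]
    return sum(kept) / len(kept)


def course_score(components):
    """Weighted sum of component scores, each assumed to be on a 0-100 scale."""
    return sum(WEIGHTS[name] * components[name] for name in WEIGHTS)
```

For example, a homework scored 90 and turned in a day and a half late counts as two late days under this reading, so `late_penalized(90, 1.5)` gives 40.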

Reading Materials

For the mathematical side of ML: C. Bishop, Pattern Recognition and Machine Learning. 2007
For classical ML in scikit-learn: Hands-On Machine Learning
For deep learning in PyTorch: Dive into Deep Learning
Example final projects: demo1 and demo2
See also Resources and Lectures