# CS780 / CS880: Introduction to Machine Learning

### When and Where

Tue & Thu, 12:40 pm - 2:00 pm, Kingsbury N133

See class overview for more information on textbooks, syllabus, assignments, office hours, and grading.

## Assignments

Please use Piazza for questions about assignments.

Assignment | Due Date |
---|---|
Assignment 1 | 2/14/17 at 12:40PM |
Assignment 2 | 2/21/17 at 12:40PM |
Assignment 3 | 3/09/17 at 12:40PM |
Assignment 4 | 4/06/17 at 12:40PM |
Assignment 5 | 4/20/17 at 12:40PM |

## Syllabus

Date | Slides | Reading | Notebooks |
---|---|---|---|
1/26 | Statistical learning | ISL 1,2 | (html) (RMD) |
1/31 | Linear regression I | ISL 3.1-2 | (html) (RMD) |
2/02 | No class | | |
2/07 | Linear regression II | ISL 3.3-6 | |
2/09 | No class | | |
2/14 | Logistic regression | ISL 4.1-3 | (html) (RMD) |
2/16 | LDA, QDA, Bayes | ISL 4.4-6 | |
2/21 | Cross-validation | ISL 5 | |
2/23 | Model selection | ISL 6.1-6.2 | |
2/28 | Dimensionality | ISL 6.3-6.4 | |
3/2 | PCA ML/MAP | ISL 10.1-2 | ML PCA |
3/6 | Clustering and EM | ISL 10.3-5 | kmeans |
3/9 | Midterm Review | ISL 1-6, 10 | |
3/21 | **Midterm** | | |
3/23 | Linear algebra | LAO 1.1-2,2,3 | |
3/28 | LA in ML | LAR | linear algebra |
3/30 | LA in ML | LAR | linear algebra |
4/04 | SVM | ISL 9 | |
4/06 | Decision trees and boosting | ISL 8 | |
4/11 | Nonlinear methods | ISL 7 | |
4/13 | Recommender systems | | |
4/18 | Bayes nets | MLP 10 | |
4/20 | Reinforcement learning | RL | |
4/25 | Final exam review | | |
4/27 | Project presentations (Graduate) | | |
5/02 | Deep learning and big data | DL | |
5/04 | Project presentations (Undergraduate) | | |

## Project

See the project overview for details on the deliverables. Each deliverable is due by the end of the day (midnight) on the date listed.

Date | Deliverable | Page Limit |
---|---|---|
2/24 | Project description and data sources | 1 |
3/07 | Evaluation methodology | 1 |
3/23 | Method and literature overview | 2 |
4/06 | Preliminary results | 3 |
4/27 | Final report | 7 |

## Exams

See practice questions for the kinds of questions you should be able to answer before the midterm and final exams.

Date | Exam |
---|---|
3/21 | Midterm (take home) |

## Textbooks

### Main reference:

ISL: James, G., Witten, D., Hastie, T., & Tibshirani, R. (2013). An Introduction to Statistical Learning

### More in-depth material:

### Related topics:

- LAO: Hefferon, J. Linear Algebra (2017)
- LA: Strang, G. Introduction to Linear Algebra. (2016) *Also see:* online lectures
- LAR: Introductory Linear Algebra with R
- CO: Boyd, S., & Vandenberghe, L. (2004). Convex Optimization.
- RL: Sutton, R. S., & Barto, A. (2012). Reinforcement learning. 2nd edition (forthcoming?)
- RLA: Szepesvari, C. (2013), Algorithms for Reinforcement Learning
- DL: Goodfellow, I., Bengio, Y., & Courville, A. (2016). Deep Learning
- MLP: Murphy, K. (2012). Machine Learning: A Probabilistic Perspective.

See class overview for more information on the textbook.

## Class Content

The goal of this class is to teach you how to use *machine learning* to *understand data* and *make predictions* in practice. The class will cover the fundamental concepts and algorithms in machine learning and data science as well as a wide variety of practical algorithms. The main topics we will cover are:

- The maximum likelihood principle
- *Regression*: Linear regression
- *Classification*: Logistic regression and linear discriminant analysis
- Cross-validation, bootstrap, and over-fitting
- *Model selection*: Regularization, Lasso
- *Nonlinear models*: Decision trees, Support vector machines
- *Unsupervised*: Principal component analysis, k-means
- *Advanced topics*: Bayes nets and deep learning

The graduate version of the class will cover the same topics in greater depth.

### Programming Language

The class will involve hands-on data analysis using machine learning methods. The recommended language for programming assignments is R, which is an excellent tool for statistical analysis and machine learning. *No prior knowledge of R is needed or expected*; the book and lectures will provide a gentle introduction to the language. Experienced students may also choose alternatives such as Python or Matlab.

## Pre-requisites

Basic programming skills (scripting languages like Python are OK) and some familiarity with statistics and calculus. If in doubt, please email me.