Syllabus

Course Objectives

By the end of the course, you will be able to:

  • articulate the main assumptions underlying machine learning approaches
  • demonstrate the basic principles of dataset creation
  • articulate the importance of data representations
  • evaluate machine learning algorithms
  • articulate the difference between supervised and unsupervised learning
  • apply a range of supervised and unsupervised learning techniques

Grade Breakdown

Class Participation: 10%

The participation grade is a combination of attendance (including arriving on time); attentiveness, engagement, and participation during class; and general preparedness for class discussions.

DataCamp Assignments: 25%

These assignments are hands-on activities designed both to provide coding background and to reinforce the concepts covered in class.

Project 1 (Dataset creation): 15%
Due: June 13

Curation and cleaning of a labelled dataset that you will use for the supervised and unsupervised learning tasks in Projects 2 and 3. The dataset can be built from existing data and should be stored in your GitHub repository.

Project 2 (Supervised learning): 15%
Due: June 21

Application of two supervised learning techniques to the dataset you created in Project 1. This assignment should be completed as a Jupyter notebook in your GitHub repository.

Project 3 (Unsupervised learning): 15%
Due: July 1

Application of two unsupervised learning techniques to the dataset you created in Project 1. This assignment should be completed as a Jupyter notebook in your GitHub repository.

Final Paper: 20%
Due: July 8

A 5–8 page paper describing the work you did in Projects 1–3 (your dataset and your supervised and unsupervised experiments). The paper should describe both what you did technically and what you learned from the relative performance of the machine learning approaches you applied to your dataset. This assignment should be posted as a PDF in your GitHub repository.