Syllabus

Course Objectives

By the end of the course, you will be able to:

articulate the main assumptions underlying machine learning approaches
demonstrate the basic principles of dataset creation
articulate the importance of data representations
evaluate machine learning algorithms
articulate the difference between supervised and unsupervised learning
apply a range of supervised and unsupervised learning techniques

Grade Breakdown

Class Participation: 10%

The participation grade is a combination of attendance (including arriving on time); attentiveness, engagement, and participation during class; and general preparedness for class discussions.

Datacamp Assignments: 25%

These assignments are hands-on activities designed to provide coding background and reinforce the concepts covered in class.

Project 1 (Dataset creation): 15%
Due: June 13

Curation and cleaning of a labelled dataset that you will use for the supervised and unsupervised learning tasks in Projects 2 and 3. The dataset can be built from existing data and should be stored in your GitHub repository.

Project 2 (Supervised learning): 15%
Due: June 21

Application of two supervised learning techniques to the dataset you created in Project 1. This assignment should be completed as a Jupyter notebook in your GitHub repository.

Project 3 (Unsupervised learning): 15%
Due: July 1

Application of two unsupervised learning techniques to the dataset you created in Project 1. This assignment should be completed as a Jupyter notebook in your GitHub repository.

Final Paper: 20%
Due: July 8

A 5–8 page paper describing the work you did in Projects 1–3 (your dataset and your supervised and unsupervised experiments). The paper should describe both what you did technically and what you learned from the relative performance of the machine learning approaches you applied to your dataset. This assignment should be posted as a PDF in your GitHub repository.

This entry is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International license.

DATA 71200 Advanced Data Analysis Methods (Summer 2022)

M.S. Program in Data Analysis and Visualization, CUNY Graduate Center
