[ Image Source: Robinson, E. and Nolis, J. ]

Description of Course

This course introduces students to the principles and tools of data science. It provides a foundation for properly collecting and analyzing data to draw insights and answer data-driven questions. The course has three main components: applied probability and statistics, data analysis and visualization, and machine learning. In the first component, students will be introduced to the fundamentals of applied probability and statistics and will learn how to interpret randomness and assess predictive uncertainty. Students will then learn how to handle, clean, process, and visualize data of varying types using Python. Finally, students will be introduced to the basics of machine learning for building predictive models, and will learn how to assess model validity and interpret the quality of model predictions.

Primary Resources

All reading material will be made available through presentation slides or the course webpage. Students will find the following optional textbooks useful throughout this course:

  • WL : Wasserman, L. "All of Statistics: A Concise Course in Statistical Inference." Springer, 2004.
  • MK : Murphy, K. "Machine Learning: A Probabilistic Perspective." MIT Press, 2012.

Instructor and Contact Information:

Instructor: Jason Pacheco, GS 724, Email: pachecoj@cs.arizona.edu
TA: Enfa Rose George, Email: enfageorge@email.arizona.edu
TA: Saiful Islam Salim, Email: saifulislam@email.arizona.edu
Office Hours:
    Enfa, Mondays, 10:30 - 11:30, Gould-Simpson Rm 934, Desk #6 (Hybrid)
    Saiful, Tuesdays, 10:00 - 11:00, Gould-Simpson Rm 942 (Hybrid)
    Jason, Wednesdays, 10:00 - 11:00 (Zoom)
D2L: https://d2l.arizona.edu/d2l/home/1072117
Piazza: https://piazza.com/arizona/fall2021/csc380
Instructor Homepage: http://www.pachecoj.com

Date | Topic | Readings | Assignment
8/24 | Introduction + Course Overview (slides) | What is Data Science? Robinson, E. and Nolis, J. |
8/26 | Random Events and Probability (slides) | WL : CH1 |
8/31 | Discrete Probability Distributions + numpy.random (slides) | WL : CH2 | HW1 (Due: 9/9)
9/2 | Continuous Probability, PDFs (slides) | |
9/7 | Moments and Dependence (slides) | WL : CH3 |
9/9 | Statistics and Estimation (slides) | WL : Sec. 9.1 - 9.7 | HW2 (Due: 9/16)
9/14 | Bayesian Statistics (slides) | WL : Sec. 6.3, Sec. 10.2, Sec. 11.1-11.4; Scribbr: A Step-by-step Guide to Statistical Analysis |
9/16 | Data Collection | Scribbr: | HW3 (Due: 9/23)
9/21 | Exploratory Data Analysis | |
9/23 | Data Preprocessing | |
9/28 | Introduction to Data Visualization | |
9/30 | Data Visualization | |
10/5 | Review + Midterm | |
11/11 | Veterans Day / NO CLASS | |
11/25 | Thanksgiving Recess / NO CLASS | |
12/15 | Final Exam | |

© Jason Pacheco, 2020