
MS Curriculum
Our 12-month Master’s in Data Science program develops deep expertise in research design, causal inference, and the statistical and computational foundations of modern analytics. The program is grounded in rigorous quantitative reasoning and advanced computing, and it is applicable across industry, academia, and beyond.
The structured curriculum emphasizes building, evaluating, and interpreting models from first principles, so students understand not just how tools work, but why. Through hands-on, applied use of advanced statistical and AI-enhanced methods to address real-world challenges in business, health care, and public policy, students learn to code productively, ethically, and reproducibly—and to communicate complex results clearly to diverse audiences.
The degree requires 36 credit hours of 500-level coursework, consisting of the courses listed. You can find full course descriptions and syllabi below.
First Term (Fall) | Course Content |
| Data scientists are increasingly expected not just to run analyses but to determine what questions data can actually answer, and how. This course develops the reasoning skills that sit beneath every modeling decision. We work backward from question types ("What is?", "Why?", "What if?", "What's next?") to the logic that justifies particular analytical approaches. You'll learn to distinguish causal from predictive questions, understand why randomization enables certain inferences, recognize when and why models generalize, and critically evaluate the reasoning behind modern AI workflows. The emphasis is on why methods work, not just how to implement them. 4 Credit Hours | |
This course examines principled approaches to model selection and tuning grounded in loss functions, statistical risk, and generalization theory, rather than ad hoc procedures. Topics include regularization, cross-validation, and modern learning algorithms, with emphasis on how these methods navigate the bias–variance tradeoff across widely used modeling families. Upon completion, students will be able to rigorously justify modeling choices and proactively identify potential failure modes in research and applied settings. | |
This class introduces students to advanced Python programming for data science applications through several independent projects. In the first half of the course, students are introduced to core data structures, parallel processing, data retrieval via web scraping and APIs, and visualization. In the second half, students are introduced to supervised and unsupervised machine learning tools using scikit-learn, covering topics related to model tuning, selection, out-of-sample validation, replicability, and version control. | |
This course focuses on the theory and practice of communicating data-driven insights through writing, visualization, and presentation. Students study perceptual, cognitive, and rhetorical principles for effective and persuasive data communication and apply them using contemporary visualization and presentation tools. Emphasis is placed on crafting clear analytical narratives, designing ethical and interpretable graphics, and communicating results with impact to technical and non-technical audiences. Through iterative projects, students develop skills in evidence-based writing and presentation that support rigorous and compelling data-informed arguments. | |
Second Term (Spring) | Course Content |
This course immerses students in the full lifecycle of real-world data science projects, integrating modeling, analysis, and theory through a series of iterative, applied mini-projects. Students develop professional skills in project planning and management, audience-aware communication, ethical and privacy-conscious practice, and the effective incorporation of AI tools. Emphasis is placed on justifying decisions, giving and receiving feedback, and delivering data science work that is clear, responsible, and impactful across stakeholders. | |
This course covers applied math and statistics for generalized linear methods, dimensionality reduction, panel data methods, and machine learning techniques. | |
| This course focuses on programming skills related to data analysis, machine learning, and practical concepts for reproducible big data research. Coding is done in Python. 4 Credit Hours | |
Students form teams and apply their skills to a real problem or question of their choosing. All projects yield written reports and oral presentations. | |
Third Term (Summer) | Course Content |
Students apply data science methods to real problems in industry, government, or non-profits. Developing professional skills and practicing communication skills are key goals. Teams of two to three students work with community partners to produce solutions to real data-centered issues. |