Course Overview

We live in a world rich in information, where data can rapidly answer questions that were previously unanswerable. This course teaches students how to make sense of the large datasets that are now commonly available, covering the full process from hypothesis formation and data collection to analysis and visualization. We begin by discussing how to set up Internet-scale experiments and formulate testable hypotheses. We then learn ways to automatically gather, store and query large datasets. Next, we introduce two important classes of analysis: statistical methods (descriptive and predictive) and information visualization. Students learn to use the Python and R programming languages to carry out data collection, analysis and visualization, culminating in a final project that uses real data of the students' choosing.
