CSE 5/7338 Project Description

The final project is on a topic selected by the students. See the topics page for potential topics and resources.

Students are required to work in pairs. Because we have an odd number of students, we can have either one group of three or one person working alone.

Project reports are due on the final day of class. The project proposal is worth 10% of the overall project grade, the project report is worth 70%, and the project presentation is worth 20%.

The scope of the project is to identify, process and analyze an existing dataset on a security-related topic. It may be necessary to link two or more disparate data sources in order to conduct interesting analysis (e.g., linking data on US hospitals with records of data breaches so that you can report on the incidence of data breaches at hospitals and see if characteristics of the hospital affect the likelihood of breaches). Begin by stating hypotheses that you would like to examine with the data. Then write code in R to explore and then analyze the data. Apply any relevant statistical tests to confirm or refute your stated hypotheses. Finally, you will write up a report describing the dataset and providing basic analysis.
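As a rough illustration of the workflow described above (link two sources, state a hypothesis, apply a test), here is a minimal R sketch. The data frames, column names (`hospital_id`, `size`, `year`), and values are all made up for illustration; a real project would load its own data and choose a test suited to it. Fisher's exact test is used here only because the toy counts are small.

```r
# Hypothetical toy data; a real project would read these from files.
hospitals <- data.frame(
  hospital_id = 1:8,
  size = c("large", "large", "large", "large",
           "small", "small", "small", "small"),
  stringsAsFactors = FALSE
)
breaches <- data.frame(
  hospital_id = c(1L, 2L, 3L, 5L, 1L),
  year = c(2012, 2013, 2013, 2014, 2014)
)

# Count breach records per hospital.
breach_counts <- aggregate(year ~ hospital_id, data = breaches, FUN = length)
names(breach_counts)[2] <- "n_breaches"

# Left join so that hospitals with no recorded breach are kept.
linked <- merge(hospitals, breach_counts, by = "hospital_id", all.x = TRUE)
linked$n_breaches[is.na(linked$n_breaches)] <- 0
linked$breached <- linked$n_breaches > 0

# Example hypothesis: hospital size is associated with having
# at least one recorded breach.
tab <- table(size = linked$size, breached = linked$breached)
print(tab)

# Fisher's exact test on the 2x2 table (appropriate for small counts).
result <- fisher.test(tab)
print(result$p.value)
```

Keeping unmatched hospitals in the join (`all.x = TRUE`) matters here: dropping them would leave only breached hospitals and make the comparison meaningless.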

Key Dates and Project Tasks

The project is divided into the following tasks:

A key purpose of the course project is to help students gain experience in carrying out research projects and writing effective research papers. Consequently, the quality of the writing of the Project Proposal and Report will be evaluated carefully for clarity.

Project Proposal

The Project Proposal can be thought of as a first draft of the introduction and methodology section of the Project Report. Your task is to introduce the topic you are proposing to investigate and explain why it is interesting. You must then describe the methodology for investigating the topic. This must include a description of the data set(s) you will be using, a clear statement of any hypotheses you hope to answer by examining the data, and a justification for why the examined data set can help answer these questions.

The Project Proposal should be roughly 800 words long. To get improved feedback on writing and project proposals, students will do peer evaluation of each other's proposal documents as part of a homework assignment. Feedback from the instructor and peer evaluation can be used to improve the proposal, and revised text from the Project Proposal can be incorporated into the final Project Report.

Full details on the project proposal can be found here.

Project Report Requirements

The write-up should include the following sections (results can be broken up into multiple sections if you prefer):

If there is any related work on the topic, please include references to it as well. References should follow the ACM format (see the bottom of this page for examples). In particular, for empirical projects that attempt to estimate costs, be sure to include references to how the costs are calculated.

I have included very rough indications of recommended word counts. Please note that what I really care about is that you write just enough to clearly communicate your project's motivation and results. These word counts are only an indication, so please do not add fluff to meet the minimum or cut out essential parts if you go over.

Abstract

Usually written after the rest of the report has been completed, the abstract should provide a pithy summary of the report's main findings. It must introduce the topic and describe the key results. As a general rule, abstracts should be no longer than 200 words (and closer to 100 is often acceptable).

Introduction

Here you should introduce and motivate your project topic. Explain the key questions you are investigating, why you think they are interesting, and how you expect the data you are analyzing to help answer them. Explicitly relate the question under investigation to any economics concepts discussed in the course. You should also briefly summarize the paper's key contributions and provide an outline of the subsequent sections of the paper.

Methodology

Here you should describe the data you used. In particular, explain what time period the data covers, how it was collected, whether it can be considered representative, and any limitations to its representativeness. Note that you do not need to go into too much detail here about the logistics of data collection. For example, you do not need to list the columns of all the tables in your database.

Results

Here you should describe the key results from your analysis. You can first include summary statistics to introduce the data set, followed by an explanation of the questions you are trying to answer and a description of the graphs and relevant statistics to answer the questions appropriately.
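For the summary statistics mentioned above, a short R sketch might look like the following. The data frame and its columns (`year`, `records_exposed`) are hypothetical and stand in for whatever data set the project actually uses.

```r
# Hypothetical breach records; values are made up for illustration.
breaches <- data.frame(
  year = c(2012, 2012, 2013, 2013, 2013, 2014),
  records_exposed = c(500, 12000, 800, 1500, 250000, 3000)
)

# Overall distribution of breach sizes (min, quartiles, mean, max).
print(summary(breaches$records_exposed))

# Number of breaches per year.
counts_by_year <- table(breaches$year)
print(counts_by_year)

# Median breach size per year; the median is less sensitive than the
# mean to a single very large breach.
median_by_year <- tapply(breaches$records_exposed, breaches$year, median)
print(median_by_year)
```

A call such as `barplot(counts_by_year)` would turn the yearly counts into a simple graph for the report.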

Conclusions, Limitations and Future Work

You should begin by restating your key results. Next you should describe the limitations of the current study, such as ways in which the data set fell short or difficulties in generalizing results. The discussion of limitations can transition into a discussion of opportunities for future work. What would you do if you had more time? Is there additional data you would collect, given what you now know about your data source? You can think of future work as a curated "brain dump" of what you would like to do if you were able to continue working on this topic.

Appendix

Include any code as an appendix to the report.

Note that I will use the following rubric when evaluating the project reports.

Project Presentations

Each group will give a brief presentation during the last class summarizing its key findings. The project presentation is worth 20% of the overall project grade.

The goal of the presentation is to explain to others the topic you investigated, including the questions you hoped to answer, your results, and any lessons you learned from carrying out the study. An evaluation rubric is available here.