P5: Project Report


The project report is the final deliverable summarizing the key findings of your project. It is not meant to detail everything you did this semester; instead, it should motivate the topic, explain the data collection methodology, and report on key findings.

Paper Structure

Your paper should have each of the following five sections (Results can be broken up into multiple sections if you prefer):

  1. Abstract (200 words)
  2. Introduction (500 words)
  3. Methodology (1000-2000 words)
  4. Results (1000-2000 words)
  5. Conclusions, Limitations, and Future Work (250-500 words)

If there is any related work on the topic, please include references to it as well.

I have included very rough indications of recommended word counts. Please note that what I really care about is that you write just enough to clearly communicate your project's motivation and results. These word counts are only an indication, so please do not add fluff to meet the minimum or cut out essential parts if you go over.


Abstract

Usually written after the rest of the report has been completed, the abstract should provide a pithy summary of the report's main findings. It must introduce the topic and describe the key results. As a general rule, abstracts should be no longer than 200 words (and closer to 100 is often acceptable).


Introduction

Here you should introduce and motivate your project topic. You should explain the key questions you are investigating, why you think they are interesting, and how you expect the data collection to help answer them. You should also briefly summarize the paper's key contributions and provide an outline of the subsequent sections of the paper.


Methodology

Here you should describe at a high level how your data collection was designed. You should name the APIs you used and explain what data you retrieved from them. You should also explain your strategy for issuing queries to the APIs so that the resulting data samples are representative enough to support the analysis.

Note that you do not need to go into too much detail here about the logistics of data collection. For example, you do not need to list the columns of all the tables in your database.


Results

Here you should describe the key results from your analysis. You can first include summary statistics to introduce the data set, then explain the questions you are trying to answer and present the graphs and data that answer each one.
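
As an illustration of the kind of summary statistics that could open this section, here is a minimal Python sketch using only the standard library. The data values and the names `summarize` and `replies_per_post` are hypothetical and not taken from any project; substitute a numeric column from your own data set.

```python
import statistics


def summarize(values):
    """Return basic summary statistics for a non-empty list of numbers."""
    ordered = sorted(values)
    return {
        "count": len(ordered),
        "min": ordered[0],
        "max": ordered[-1],
        "mean": statistics.mean(ordered),
        "median": statistics.median(ordered),
        # The sample standard deviation needs at least two data points.
        "stdev": statistics.stdev(ordered) if len(ordered) > 1 else 0.0,
    }


# Hypothetical sample: number of replies for each collected post.
replies_per_post = [0, 2, 2, 3, 5, 8, 13]

summary = summarize(replies_per_post)
for name, value in summary.items():
    print(f"{name:>6}: {value}")
```

A small table of values like these, reported for each main variable, is usually enough to orient the reader before the graphs.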

Conclusions, Limitations and Future Work

You should begin by restating your key results. Next you should describe the limitations of the current study -- ways in which the data set fell short, difficulties in generalizing results, etc. The discussion of limitations can transition into a discussion of opportunities for future work. What would you do if you had more time? Is there additional data you would collect, perhaps from different APIs, given what you now know about your data source? You can think of future work as a "brain dump" of what you would like to do if you were able to continue working on this topic.

Working in collaboration

It is OK for different team members to take primary responsibility for different sections (or subsections) of the paper. In this sense P5 can be worked on in parallel more so than other project phases. I suggest that you decide early on who will be responsible for producing the first draft of each section. You should assign one person to each of the Introduction, Methodology and Results sections. After each person has written a draft of their section, the group should meet to combine the drafts and work together to create a cohesive paper. It is essential that you read and give feedback on each other's writing multiple times in order to improve the clarity of language and argument.

Once you have a solid Introduction, Methodology and Results, you can collaboratively work on the Conclusions, Limitations and Future Work section (dividing initial writing responsibility across team members but discussing what should go in each section as a group).

Integrating earlier work

It is acceptable (and even encouraged!) for you to re-use what has been done in earlier project phases when writing the report. In particular, I expect that P0 will be helpful for the Introduction, P1_documentation.pdf and BP1 will be helpful for the Methodology, and P2 (p2writeup.pdf) will be helpful for Results. Note, however, that there is no need to mention P4 in the report.

You should not simply copy and paste from those documents and call it a day, however. You should incorporate any feedback I gave and any changes you have made since, and construct a cohesive argument that can be read from start to finish.

Exemplary papers

On the resources web page there is a listing of exemplary papers and projects. Skimming a few of the papers should give you an idea of what goes into a paper describing a data collection effort.