Data Sources and Ideas for Topics

Data sources

There are many more potential sources of security data than I have listed here. Please do not limit your search to these websites.

If you are considering an empirical paper, browse some of the data sources and check whether any of them supports a suitable analysis. In particular, look for datasets that include both numerical and categorical variables. Another approach is to pick a class of cybercrime and gather as much information as you can to estimate its cost, who is affected by it, and how likely an attack is.

Linking data sources

Again, this is a partial list. The idea here is to find supplemental data that can shed more light on existing security-related data sources.

Economic Indicators

Topics

Please note that this is not an exhaustive list. You are strongly encouraged to select a topic that is not on this list if it is of interest to you.

Example empirical projects

You can base an empirical project on any of the topics above. Here are a few more ideas for empirical projects.

  1. Data breaches at universities. Combine reports of data breaches with information on universities. Join the two data sets based on names and then examine the data to see if particular characteristics of the university affect the probability of a breach occurring (e.g., public vs. private, enrollment, etc.). One could also look at university rankings to see if there is any correlation between university ranking and breach probability.
  2. Data breaches at hospitals. Combine reports of data breaches with information on hospitals. Join the two data sets based on names and then examine the data to see if particular characteristics of the hospital affect the probability of a breach occurring (e.g., hospital size).
  3. Online password database hacks. Examine a dataset on web password database breaches to estimate the annual probability of a breach occurring, how many customers are affected per breach, and the success rate per hacker group.
  4. Compare malware and phishing distribution by TLD. Compare a dataset of known malware domains to known phishing domains. For malware, one can also look at differences in the type of site as classified by the data. Also, identify the TLDs that are most afflicted by malware and phishing by normalizing according to the number of registrations per top-level domain, as indicated in the appendix of this document.
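
Two mechanical steps recur in the projects above: joining two data sets on institution names, and normalizing incident counts by the number of registrations per TLD. The following is a minimal Python sketch of both, not a prescribed method. All field names (org, name, control), the sample records, and the registration counts are hypothetical placeholders chosen for illustration. Real data will need more careful name matching (abbreviations, campus suffixes, misspellings) than the simple normalization shown here.

```python
import re
from collections import Counter


def normalize_name(name):
    """Lowercase, turn punctuation into spaces, and collapse whitespace."""
    return re.sub(r"[^a-z0-9]+", " ", name.lower()).strip()


def join_on_name(breaches, institutions):
    """Attach institution attributes to each breach whose name matches.

    Returns (matched, unmatched) so the unmatched records can be
    inspected by hand instead of being silently dropped.
    """
    lookup = {normalize_name(inst["name"]): inst for inst in institutions}
    matched, unmatched = [], []
    for breach in breaches:
        inst = lookup.get(normalize_name(breach["org"]))
        if inst is None:
            unmatched.append(breach)
        else:
            matched.append({**inst, **breach})
    return matched, unmatched


def tld_of(domain):
    """Return the last label of a domain name, e.g. 'com'."""
    return domain.lower().rstrip(".").rsplit(".", 1)[-1]


def rate_per_10k(domains, registrations):
    """Flagged domains per 10,000 registrations, for each TLD.

    TLDs without a known registration count are skipped.
    """
    counts = Counter(tld_of(d) for d in domains)
    return {
        tld: 10000 * counts[tld] / registrations[tld]
        for tld in counts
        if registrations.get(tld)
    }


# Hypothetical sample data, for illustration only.
breaches = [
    {"org": "Example State University", "year": 2020},
    {"org": "EXAMPLE  COLLEGE.", "year": 2021},
    {"org": "Unknown Institute", "year": 2021},
]
institutions = [
    {"name": "Example State University", "control": "public"},
    {"name": "Example College", "control": "private"},
]
matched, unmatched = join_on_name(breaches, institutions)

flagged = ["bad.example.com", "phish.example.net",
           "evil.example.com", "x.example.org"]
registrations = {"com": 20000, "net": 5000}  # made-up counts
rates = rate_per_10k(flagged, registrations)
```

Keeping the unmatched records visible matters for the breach projects: institutions that fail to join are not necessarily institutions without breaches, and dropping them silently would bias any estimate of breach probability.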

Paper resources

Curated Paper Lists and Literature Reviews

Relevant Conferences

Here are some relevant conferences that publish papers on security economics topics. In addition, you are encouraged to search Google Scholar and DBLP. A hint for DBLP searches: include "venue:weis" in the query to restrict results to WEIS.
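
If you want to script such searches, the sketch below builds a DBLP query URL that includes the venue filter from the hint above. It assumes DBLP's publication search endpoint at https://dblp.org/search/publ/api and its q, format, and h (maximum hits) parameters; check the DBLP documentation before relying on them. The function name dblp_query_url and the example keywords are hypothetical. The code only constructs the URL and makes no network request.

```python
from urllib.parse import urlencode

# Assumed DBLP publication search endpoint; verify against DBLP's docs.
DBLP_API = "https://dblp.org/search/publ/api"


def dblp_query_url(keywords, venue=None, max_hits=30):
    """Build a DBLP search URL, optionally restricted to one venue."""
    terms = keywords
    if venue:
        # Same filter syntax as the hint: e.g. "venue:weis".
        terms = f"{keywords} venue:{venue}"
    params = {"q": terms, "format": "json", "h": max_hits}
    return f"{DBLP_API}?{urlencode(params)}"


url = dblp_query_url("cyber insurance", venue="weis")
```

The resulting URL can be opened in a browser or fetched with any HTTP client, and the same query string can be typed directly into the DBLP search box.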

Here are some top computer security conferences where papers on security economics are occasionally published:

Papers are also published in information-school and business-school journals, such as Management Science and Information Systems Research.