A practical discussion of how potentially revolutionary yet ethically questionable data, such as that from Facebook, is currently being handled in academia. With every day that passes, the users of social media websites provide scientists with ever richer and larger datasets on human behavior. At the same time, machine-learning techniques allow us to exploit this data to accurately predict who these users are and how they will behave in the future. I begin this talk by outlining the need for public datasets containing rich information on individuals and their social relations. I then show how, in practice, the distribution and use of such datasets by academics is awkward and confused. I conclude with some consideration of how "enhancing" datasets by, for example, inferring missing or hidden data using machine-learning classifiers creates yet another ethical grey zone.
Secdocs is a project that aims to index high-quality IT security and hacking documents. These are fetched from multiple data sources: events, conferences, and the internet in general.
Serving 8166 documents and 531.0 GB of hacking knowledge, indexed from 2419 authors from 163 security conferences.