Data Policy Compliance

Recently, Google took down an extension from the Chrome Web Store that I had previously been using as part of a research project. Here are my thoughts on the matter.

An experiment with extensions

First, it is worth mentioning that the extension was originally used for a pilot study a while ago and is no longer in use, so no one is being adversely affected by this. I also fully support any move that will increase users’ privacy, and like what I’m seeing.

But it does strike me as a little hypocritical that Google should take down a tool which was being used to collect data for research when their whole business model is about hoarding data. Whatever, I’m not the multi-billion-dollar conglomerate, so what does my opinion count?

The extension in question was used in a controlled study environment, and collected the following data:

  • Domains of sites visited for the duration of the experiment
  • Some code that people wrote in an editor
  • An email address
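
As a rough illustration of what capturing "domains only" can mean in practice, here is a minimal sketch that reduces a full URL to its hostname before anything is stored. It is not the code from the extension itself: the function name `domainOf` and the choice to ignore non-web schemes are assumptions made for this example.

```typescript
// Hypothetical helper (not taken from the original extension): reduce a
// full URL to its hostname so paths, query strings and fragments are
// never stored.
function domainOf(rawUrl: string): string | null {
  try {
    const url = new URL(rawUrl);
    // Assumption: only ordinary web pages are of interest, so skip
    // chrome://, file:// and similar schemes.
    if (url.protocol !== "http:" && url.protocol !== "https:") {
      return null;
    }
    return url.hostname;
  } catch {
    // The input was not a parseable absolute URL.
    return null;
  }
}
```

Dropping everything but the hostname at the point of capture means search terms and page paths never reach the stored data set in the first place.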

Prior to taking part, all participants would have had this explained to them, and completed a consent form that detailed how their data would be stored and how to opt out at a later date (i.e. they were talked through the privacy policy). We did this as part of going through the university’s ethics approval process.

Despite this, and despite the fact that the Chrome extension itself was “unlisted” (i.e. people could not install it without our explicit approval), the extension was found to be violating the Chrome Web Store guidelines. The clinching failure seems to centre around not having a viable privacy policy (a failing of my own – I had neglected this, considering it to only concern commercial enterprises).

If that’s how Google wants to handle things, fair enough. For a long time the web store has been a hive of malware, and if Google are finally cleaning up and policing their storefront, that’s a good thing. I’ll take down my extension and find some other route for collecting data in future research projects if it means there will be less malware for people to encounter.

For now, as part of further research, we’ve decided to build a custom system that has all of the coding side integrated using a GitLab instance. Once it is built, we’ll be able to handle the code collection, code execution, email addresses and any questionnaires ourselves. If we want to capture visited domains, though, we’ll need to work out something else that is in line with extension hosting guidelines.

To summarise my thoughts on Google’s newfound privacy-centric attitude:

  1. A privacy policy isn’t a measure of trust – Just because someone provides a privacy policy, that doesn’t in and of itself make their extension user friendly and privacy compliant. Unless you have some way of vetting that the entities that make extensions actually follow through with what they claim in their privacy policy, it means absolutely nothing. It strikes me that Google’s lawyers are just covering their backsides by requiring these, and if that’s what they’re using to police their web store, I am not filled with confidence.
  2. Their developer guidelines require Prominent Disclosure of collected data – This is when you state up front, clearly, what data is going to be collected whenever it gets collected. When we collected data, a big red banner was added to the webpage that said “Domains you visit are currently being captured”. I can’t help but notice that Google doesn’t prominently disclose the fact that they are capturing your data every time you search, watch a YouTube video, use your Android phone in any fashion… you get the idea.
  3. Data collection requires affirmative consent – Beyond just having a privacy policy, you need explicit consent to capture private data. We properly explained how data was going to be collected both before (the consent form) and during (a red banner). I can’t help but think that, given the massively long privacy document (which the majority of people won’t read) you have to agree to in order to use Google services, they’re kind of missing the “affirmative consent” part when people start using their services.

Some advice for researchers:

  • This goes without saying, but get a consent form from your ethics approval department. In addition to this, get your ethics approval department to provide you with a legalese privacy policy as well – you can use that if you need a browser extension, but I imagine it will come in handy in other aspects of research that may be impacted by GDPR.
  • Any time data is being collected (private or not, just to be safe), use big red warning banners to let participants know what’s happening. I appreciate that this could feasibly impact the neutrality of the data in some circumstances, but informed consent is still important.
  • Treat your extension (indeed any software or setup used) as if it were going to be public, even if it is marked as private or unlisted. This comes in handy if you have to rely on a third party, but should also future-proof your setup should you wish to open source it at a later date (something I would strongly recommend doing for replication reasons).
  • If using a browser extension, avoid the browser vendor web store entirely where possible – if you’re doing the study in person, this should be doable, and you won’t have to worry about Google / Mozilla making changes to their service mid-experiment. If you want remote participants, relying on the store is a consideration you will have to weigh up.

And some advice for Google:

  • If you’re going to hold developers to such a high standard of privacy, good for you. But maybe consider holding yourselves to that same standard. It just seems like the right thing to do.
