Carnegie Mellon University

Online Consent – Comparing Web Tracking of Users vs. Bots – Main Study

This survey and data collection task is part of a research study conducted by Eric Zeng at Carnegie Mellon University (CMU) and is funded by CyLab.

Summary

We are conducting a study of user privacy on the web. Many websites use “web trackers” to track the websites that people visit. Websites use this data to analyze the performance of their sites and enable advertisers to target people with online ads tailored to their interests. A common technique that researchers use to study web trackers is automated web crawlers (bots). However, we do not know if the trackers that we observe using bots are the same as what real people see when browsing the web.

To compare how web tracking differs between bots and real people, we are looking for participants to install a browser extension that will send us data on your browsing history and the web trackers that you encounter on the websites you visit. Comparing your data with ours will help us understand whether the tools researchers use to measure web trackers are accurate. The study will run for 14 days, and participants who complete the full study will be paid $48 as compensation for their time and data.

Purpose

The purpose of the research is to determine whether data on web trackers collected by automated web crawlers is representative of data collected from real users. These results will help researchers understand the strengths and limitations of other research on web privacy that uses automated web crawlers for data collection.

Procedures

This study consists of two parts: a pre-screening survey and a 14-day data collection task (the part you are currently on).

Pre-screening survey. You previously completed this part of this study. In this survey, we asked about your demographic information, your internet use habits, your attitudes towards online privacy, and your general digital knowledge. We used this information to determine your eligibility for this part of the study.

Data collection task. This consent form is for the data collection task. If you consent to participate, we will provide you with instructions on how to install the browser extension that we use to collect data in this study. This browser extension will collect data on the websites that you visit, including data from the past 30 days, and the web trackers that you encounter while browsing. You will continue to use your browser normally for 14 days with the extension installed. After the study concludes, the extension will uninstall itself.

It is very important to our research that we collect real browsing behavior from users. Data that does not represent real browsing will interfere with our ability to accurately understand online tracking. For that reason, we expect that you will use your browser with our extension for at least one hour per day, on average.

During the study, if we determine there has not been enough browsing activity, or if we detect inauthentic browsing behavior, we reserve the right to terminate the study early at our discretion. If there are some days where you will not be able to use your computer (e.g. travel), please let us know, and we will be happy to accommodate those situations.

Participant Requirements

Participation in this study is limited to individuals age 18 and older living in Allegheny County, PA.

Participants must use Google Chrome as their primary desktop web browser during the main study.

Participants who block all JavaScript or all third-party cookies in their primary desktop browser are not eligible for the main study.

Participants who do not regularly use their primary desktop web browser, or who have deleted a substantial amount of their web history in the previous 30 days, are not eligible.

Risks

The risks and discomfort associated with participation in this study are no greater than those ordinarily encountered in daily life or during other online activities.

We will collect the URLs (links) of each website that you visit while the browser extension is active. A URL may link to confidential information if the page does not require a login or account to access. However, we will not have access to your data on sites that require logins (e.g. your email, social media), nor will we collect any information on what you see on the page. We will filter out any URLs that are not accessible by the public.

If at any time you want to temporarily stop collecting data, you have the option to pause data collection for an hour. Additionally, we do not collect any of your data in Incognito/Private Browsing mode.

We will keep the data we collect as secure as possible: we will use encryption to protect all data sent to us, and the data will be stored on a CMU server. We make a best effort to protect data but cannot guarantee full protection.

Benefits

There may be no personal benefit from your participation in the study, but the data collected may be helpful for future research and policymaking in online privacy. The data collected may help researchers determine whether the current methods for measuring online tracking and privacy accurately represent what real people experience, and if not, how to design more representative studies. Developing better methods will help researchers advance a clearer understanding of how much people’s privacy is eroded by the online advertising and tracking ecosystem.

Compensation & Costs

Participants who are selected for the second part of the study and collect data using the browser extension for the full 14 days will be paid up to $48. Participants will receive $2 per day of data collected, and $20 for completing the entire study. Participants who choose to stop participating before completing the full 14 days of data collection will receive compensation for the number of days they participated. There will be no compensation for filling out the pre-screening survey.

There will be no cost to you if you participate in this study, but data sent by the browser extension to our servers will count towards your data cap if you are using a metered internet plan for your computer (we estimate this will be <1% of total bandwidth usage).

Future Use of Information

In the future, once we have removed all identifiable information from your data, we may use the data for our future research studies, or we may distribute the data to other researchers for their research studies. We would do this without getting additional informed consent from you (or your legally authorized representative). Sharing of data with other researchers will only be done in such a manner that you will not be identified.

Confidentiality

By participating in this research, you understand and agree that Carnegie Mellon may be required to disclose your consent form, data and other personally identifiable information as required by law, regulation, subpoena or court order. Otherwise, your confidentiality will be maintained in the following manner:

Your data and consent form will be kept separate. Your consent form will be stored in a secure location on Carnegie Mellon property and will not be disclosed to third parties. By participating, you understand and agree that the data and information gathered during this study may be used by Carnegie Mellon and published and/or disclosed by Carnegie Mellon to others outside of Carnegie Mellon. However, your name, address, contact information and other direct personal identifiers will not be mentioned in any such publication or dissemination of the research data and/or results by Carnegie Mellon. Note that per regulation all research data must be kept for a minimum of 3 years.

The researchers will take the following steps to protect participants’ identities during this study: (1) Each participant will be assigned a unique ID; (2) all data collected will be associated with the participant ID, not their name.

No third-party annotation services will be used to annotate your data.

Right to Ask Questions & Contact Information

If you have any questions about this study, you should feel free to ask them by contacting the Principal Investigator now at:
Eric Zeng,
Postdoctoral Research Associate
CyLab
4720 Forbes Ave
Pittsburgh, PA 15213
ericzeng@cmu.edu

If you have questions later, desire additional information, or wish to withdraw your participation please contact the Principal Investigator by mail, phone or e-mail in accordance with the contact information listed above.

If you have questions pertaining to your rights as a research participant, or wish to report concerns about this study, you should contact the Office of Research Integrity and Compliance at Carnegie Mellon University. Email: irb-review@andrew.cmu.edu. Phone: 412-268-4721

Voluntary Participation

Your participation in this research is voluntary. You may discontinue participation at any time during the research activity. You may print a copy of this consent form for your records.


I am at least 18 years of age.