Measuring and Managing Honesty in a Participant Pool Online
Intro
SocialSci has been cultivating our participant pool during our open beta period for the past 6 months. We've grown our pool to thousands of participants, conducted dozens and dozens of studies, and produced well over 2 million answers for researchers using our platform. During this time we've learned a thing or two about how to gauge and manage participant honesty. Many of the things discussed here have been implemented; others are under development or on the product road map. Our goal is to provide a means for researchers to get their results faster by bringing their studies online, to the SocialSci platform, and letting us worry about acquiring participants, getting responses, compensating participants, and dealing with cheaters.
Framing our participants for honest engagement
Before starting this discussion, we should explain how SocialSci frames participants to be honest, and weeds out some who may be pre-disposed to lying from the get-go:
- We differentiate ourselves from online marketing surveying by using the tagline "Real Science"
- We require a valid cell phone number to limit account creation
- We ask you to enter a username which doesn’t identify you
- We ask for a password
- We encourage you to sign up for our mailing list by entering your email address
- As we don't associate your email address or phone number with your username, we ask you to answer 5 questions from a list of security questions, such as “how do you like your eggs prepared?”
If you are signing up for an account, you'll have to be truthful with at least your cell phone number. And if you ever intend to get back into your account, the username, password, and security question answers should be something you know; it is easier to remember true answers here than falsehoods. [1] If you hope to continue taking studies and earning more, you'll likely sign up for our newsletter with a valid email address. We have a 30% click-through rate on each newsletter that goes out, indicating a very high response rate and valid email addresses. Approximately 75% of our participants are on our mailing list. Others may find out about new studies via Twitter or Facebook.
In SocialSci's system we offer participants points for completing studies when they're first released by researchers. We also offer 'archived studies' worth no points, only social rewards such as badges. Historically, 3.3% of participants in our system have been flagged for cheating, with only 0.3% of total users cheating a second time.
The information discussed here is based on our experiences and interactions with that subset of cheaters (still numbering in the hundreds), compared to the majority of honest participants. While I work with researchers on a daily basis, I am not a researcher, so take this discussion in the context of a curious entrepreneur, not a rigorous scientific investigation.
Motivations for lying
I'll break these down into four broad categories: anonymity, curiosity, intention and desperation.
- Anonymity: Those who believe that providing inconsistent answers helps maintain their privacy or anonymity. Answers are provided to mislead.
- Curiosity: Those who are just curious about our service or a particular study, and submit answers which are not intended to be serious. Many of them never fully complete the study. They're not driven by attaining the points, but merely by wanting to understand what a study is like, what we offer, etc. These are the people who answer 'human' to the ethnicity questions and give ages of 2 or 1000. Mostly harmless, as such outlier responses are auto-flagged as invalid by us and by researchers.
- Intention: Those who create an account or take studies purely to maximize profit. These are some of the easiest to catch, because they magically qualify for EVERY study. Even the ones which conflict. They're male AND female, Caucasian and African-American, and a smoker and a non-smoker! Trying to maximize profit by answering dishonestly without being caught by our algorithm is a thought-provoking but essentially impossible challenge: you can't see into the future to know which studies we'll be offering, or at what point values, so you can't determine what your consistent false persona should be over time.
- Desperation: These are survey takers who start out answering studies honestly, until they realize they're only one more survey away from getting a reward. Then they begin to evaluate whether they should lie to get more points and the ability to cash in for a reward. This typically happens in one session, their first. Participants who have been taking studies honestly over time rarely become suddenly desperate and lie to take a study they do not qualify for.
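
As a rough illustration of the auto-flagging mentioned under Curiosity and the conflicting answers mentioned under Intention, here is a minimal Python sketch. It is not SocialSci's actual algorithm: the age bounds, the function names, and the `(study_id, attribute, value)` tuple layout are all assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical plausibility bounds. The post only gives 2 and 1000 as
# examples of invalid ages; the real limits are not stated.
MIN_AGE, MAX_AGE = 18, 110


def is_plausible_age(age):
    """Return True when a self-reported age falls inside the assumed range."""
    return MIN_AGE <= age <= MAX_AGE


def conflicting_attributes(responses):
    """Find attributes for which one participant gave more than one answer.

    `responses` is an iterable of (study_id, attribute, value) tuples for a
    single participant (an assumed layout). Returns a dict mapping each
    conflicting attribute to the sorted list of distinct values seen.
    """
    seen = defaultdict(set)
    for _study_id, attribute, value in responses:
        # Normalize case and whitespace so "Yes" and "yes" count as one answer.
        seen[attribute].add(value.strip().lower())
    return {attr: sorted(vals) for attr, vals in seen.items() if len(vals) > 1}
```

A participant who reports being male in one study and female in another would show up in the returned dict. Some attributes, such as smoking status, can legitimately change over time, so a real system would probably weigh conflicts rather than treat each one as proof of lying.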
Delving more into desperation, an easy-to-understand pattern emerges: Even the best-intentioned participants begin to run into an ethical quandary when presented with the prospect of rewards. When points are offered on a study, a small subset of users choose to lie to gain additional points, primarily driven by the goal of cashing out points.
The most common pattern we see for lying participants goes as follows:
- Participants sign up for a SocialSci account
- Either through announcements within our system or word of mouth, participants gain an expectation of earning a reward.
- They begin taking studies, and earning points
- They realize, for instance, that they're only one more study's worth of points away from redeeming for a reward.
- They also realize they don't qualify for the study -- by having read the title or the researcher’s terms agreement.
- They decide to 'become' who the researcher is looking for -- lie on the study, and attempt to earn points.
- They redeem their points for a reward.
- Gold Standard Questions: Perhaps asking what color the sky is, or asking the participant to type a word instead of choosing from the options. If the user isn't taking the study seriously enough to answer that question correctly, then it's a red flag to the researcher for that result. Our researchers do this. SocialSci does this. We interject such questions along with our qualifying studies, and incorporate the results into our algorithm which determines the 'credibility score' of participants.
- Hidden Complex Questions make it more difficult to lie: What race is your mother? What race is your father? Will the liar think through the complexity of ensuring that these answers match the race they falsely claimed for themselves in a different study? What about asking what year you were born vs. your age?
- Liars lie fast. If a lie is prepared (as in you've taken the study before, and you're attempting to game our system), this is easily detected and taken into account. Even attempts to game this by taking far too long to submit answers will raise flags, as the time will fall outside the normal range, measured in standard deviations, for question, page, and survey answer times. The only real way to not flag this is to honestly take the survey!
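
Two of the checks above lend themselves to a short sketch: comparing a completion time against the spread of other participants' times, and cross-checking a stated age against a stated birth year. The Python below is only an illustration under assumptions. The threshold of 2 standard deviations and all function names are made up for the example, and the post does not say how these signals are actually combined into the credibility score.

```python
from statistics import mean, stdev


def timing_z_score(seconds, reference_times):
    """Return how many standard deviations `seconds` lies from the mean.

    `reference_times` holds completion times (in seconds) from other
    participants and must contain at least two values.
    """
    mu = mean(reference_times)
    sigma = stdev(reference_times)  # sample standard deviation
    if sigma == 0:
        return 0.0  # no spread in the reference data, so nothing to compare
    return (seconds - mu) / sigma


def is_timing_outlier(seconds, reference_times, threshold=2.0):
    """Flag times that are too fast OR too slow (threshold is an assumption)."""
    return abs(timing_z_score(seconds, reference_times)) > threshold


def age_matches_birth_year(age, birth_year, current_year):
    """Check that a stated age is consistent with a stated birth year.

    Someone born in `birth_year` is either (current_year - birth_year) years
    old, or one year younger if their birthday has not happened yet.
    """
    return current_year - birth_year - age in (0, 1)
```

Using the absolute value of the z-score is what makes stalling as suspicious as rushing, which matches the point that deliberately slow answers raise flags too.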
- For starters, they conducted much of their business from the same IP addresses. We don't store IP addresses, but we do store a one-way bcrypt hash of them for comparison purposes.
- Similarly, they would log out of one account and right into another: a clear pattern of cheating.
- Their survey completion time was too short: an average of 20 minutes vs. their 3 minutes.
- They provided very similar responses: They were always a white homosexual male.
- They chose the same studies to complete, and in the same order.
- The human element: We could visually see an abnormal pattern in the usernames being created.
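
The first signal in the list above, many accounts sharing a hashed IP address, can be sketched in a few lines of Python. Note the swap: the post says a bcrypt hash is stored, but this example uses a keyed HMAC-SHA-256 from the standard library so that it stays self-contained and so that equal addresses produce equal digests. `SECRET_KEY`, the function names, and the `min_accounts` cutoff are all hypothetical.

```python
import hashlib
import hmac
from collections import defaultdict

# Hypothetical server-side secret; in practice it would come from
# configuration, never from source code.
SECRET_KEY = b"replace-with-a-real-secret"


def ip_fingerprint(ip_address, key=SECRET_KEY):
    """Return a one-way keyed digest of an IP address (stand-in for bcrypt)."""
    return hmac.new(key, ip_address.encode("utf-8"), hashlib.sha256).hexdigest()


def accounts_sharing_fingerprint(logins, min_accounts=3):
    """Group usernames by fingerprint and keep the suspiciously large groups.

    `logins` is an iterable of (username, fingerprint) pairs. Returns a dict
    mapping each fingerprint used by at least `min_accounts` distinct
    accounts to the sorted list of those usernames.
    """
    groups = defaultdict(set)
    for username, fingerprint in logins:
        groups[fingerprint].add(username)
    return {
        fp: sorted(users)
        for fp, users in groups.items()
        if len(users) >= min_accounts
    }
```

Shared addresses alone are weak evidence, since households, campuses, and offices often share one IP, so a grouping like this would make most sense alongside the other signals in the list.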
Our 24 hour review period worked exactly as intended: We were able to mass block the accounts, deny the orders, and flag the responses as invalid, not paying out a single dime to their cheating ways.
Conclusion
Although detecting and dealing with cheaters can be a difficult business, it is also a rewarding one. We're creating the best system online for researchers to get responses quickly and inexpensively for their studies, and trust in the answers they receive. We've already helped researchers around the globe get their results faster, and start writing up their results sooner, instead of spending endless months hunting down more participants, dealing with the hassles of collecting their responses, and compensating them.
Despite covering many of our strategies here, we have many more built into the system, and many more to come. Let us worry about this, and we'll let the researchers focus on their research. We look forward to continuing to explore the challenges of running an honest, payable, and anonymous online pool of participants and sharing our findings with the research community.
- Mike
& The SocialSci Team
References:
[1] Self-relevance effect
Anthony G. Greenwald, The totalitarian ego: Fabrication and revision of personal history, American Psychologist, Volume 35, Issue 7, July 1980, Pages 603-618, ISSN 0003-066X, DOI: 10.1037/0003-066X.35.7.603.