Measuring and Managing Honesty in a Participant Pool Online
- We differentiate ourselves from online marketing surveying by using the tagline "Real Science"
- We require a valid cell phone number to limit account creation
- We ask you to enter a username which doesn’t identify you
- We ask for a password
- We encourage you to sign up for our mailing list by entering your email address
- As we don't associate your email address or phone number with your username, we ask you to answer 5 questions from a list of security questions, such as “How do you like your eggs prepared?”
If you are signing up for an account, you have to be truthful about at least your cell phone number. If you ever intend to get back into your account, the username, password, and security question answers should be something you know, and it is easier to remember true answers than falsehoods. [1] If you hope to continue taking studies and earning more, you'll likely sign up for our newsletter with a valid email address. We see a 30% click-through rate on each newsletter that goes out, which indicates a very high response rate and valid emails. Approximately 75% of our participants are on our mailing list. Others may find out about new studies via Twitter or Facebook.
In SocialSci's system we offer participants points for completing studies when they're first released by researchers. We also offer 'archived studies', which are worth no points and carry only social rewards such as badges. Historically, 3.3% of participants in our system have been flagged for cheating, with only 0.3% of total users cheating a second time.
The information discussed here is based on our experiences and interactions with that subset of cheaters (still numbering in the hundreds), compared to the majority of honest participants. While I work with researchers on a daily basis, I am not a researcher, so take this discussion as the view of a curious entrepreneur, not a rigorous scientific investigation.
Motivations for lying
I'll break these down into four broad categories: anonymity, curiosity, intention and desperation.
- Anonymity: Those who believe that providing inconsistent answers helps to maintain their privacy or anonymity. Their answers are provided to mislead.
- Curiosity: Those who are just curious about our service or a particular study, and submit answers which are not intended to be serious. Many of these never actually fully complete the study. They're not driven by attaining the points, but merely understanding what a study is like, what we offer, etc. These are the people who answer 'human' to the ethnicity questions, provide age answers of 2 and 1000. Mostly harmless, as such outlier responses are auto-flagged as invalid by us and researchers.
- Intention: Those who create an account or take studies purely to maximize profit. These are some of the easiest to catch, because they magically qualify for EVERY study. Even the ones which conflict. They're male AND female, Caucasian and African-American, and a smoker and a non-smoker! Trying to maximize profit by answering dishonestly without being caught by our algorithm is a thought-provoking but inherently impossible challenge: you can't see into the future to know which studies we'll be offering, and at what point values, so you can't determine what your consistent false persona should be over time.
- Desperation: These are survey takers who start out taking studies and answering them honestly, until they reach a point where they realize they're only one more survey away from getting a reward. They then begin to evaluate whether they should lie to get more points and the ability to cash in for a reward. This typically happens in one session, their first. Participants who have been taking studies honestly over time rarely become suddenly desperate and lie to take a study they do not qualify for.
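The "qualifies for every study" signal described under Intention can be illustrated with a small check for conflicting self-reports. This is only a sketch under stated assumptions: the attribute names, the tuple layout, and the `find_contradictions` helper are hypothetical and are not SocialSci's actual algorithm.

```python
from collections import defaultdict

# Hypothetical set of attributes that should not change between studies.
STABLE_ATTRIBUTES = {"sex", "ethnicity", "smoker"}


def find_contradictions(responses):
    """Return {attribute: sorted distinct values} for every stable attribute
    on which a single participant gave more than one answer.

    `responses` is an iterable of (study_id, attribute, value) tuples
    belonging to one participant (an assumed data layout).
    """
    seen = defaultdict(set)
    for _study_id, attribute, value in responses:
        if attribute in STABLE_ATTRIBUTES:
            # Normalize case and whitespace so "Male" and "male " match.
            seen[attribute].add(value.strip().lower())
    return {attr: sorted(vals) for attr, vals in seen.items() if len(vals) > 1}


responses = [
    ("study-a", "sex", "Male"),
    ("study-b", "sex", "Female"),
    ("study-a", "smoker", "yes"),
    ("study-c", "smoker", "yes"),
]
print(find_contradictions(responses))  # {'sex': ['female', 'male']}
```

A participant whose result is non-empty has claimed mutually exclusive traits in different studies, which is the pattern that makes this group easy to catch.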
When we delve further into desperation, an easy-to-understand pattern emerges: even the best-intentioned participants run into an ethical quandary when presented with the prospect of rewards. When points are offered on a study, a small subset of users choose to lie to gain additional points, primarily driven by the goal of cashing out those points.
The most common pattern we see for lying participants goes as follows:
- Participants sign up for a SocialSci account
- Either through announcements within our system or word of mouth, participants gain an expectation of earning a reward.
- They begin taking studies, and earning points
- They realize, for instance, that they're only one more study's worth of points away from redeeming for a reward.
- They also realize they don't qualify for the study -- by having read the title or the researcher’s terms agreement.
- They decide to 'become' who the researcher is looking for -- lie on the study, and attempt to earn points.
- They redeem their points for a reward.
- Gold Standard Questions - Perhaps asking what color the sky is, or asking the user to type a word instead of choosing from the options. If the user isn't taking the study seriously enough to answer that question correctly, then it's a red flag to the researcher for that result. Our researchers do this. SocialSci does this. We interject such questions alongside our qualifying studies, and incorporate the results into our algorithm, which determines the 'credibility score' of participants.
- Hidden Complex Questions make it more difficult to lie: What race is your mother, what race is your father? Will the liar think through the complexity of ensuring that this correlates with the race they falsely claimed for themselves in a different study? What about asking what year you were born vs. your age?
- Liars lie fast. If a lie is prepared (as in you've taken the study before, and you're attempting to game our system), this is easily detected and taken into account. Even attempts to game this by taking far too long to submit answers will raise flags, as the times will fall outside the normal spread (standard deviation) of answer times for the question, page, and survey. The only real way to avoid a flag is to take the survey honestly!
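Two of the checks above, the birth-year-versus-age comparison and the answer-time test, can be sketched in a few lines. The function names, the `z_threshold` of 2.0, and the sample timings are assumptions made for illustration; the text does not say which thresholds are actually used.

```python
from statistics import mean, stdev


def age_matches_birth_year(age, birth_year, survey_year):
    """True when the stated age is consistent with the stated birth year.

    Someone born in `birth_year` is either (survey_year - birth_year) or one
    year younger, depending on whether their birthday has passed yet.
    """
    expected = survey_year - birth_year
    return age in (expected - 1, expected)


def timing_is_outlier(seconds, reference_times, z_threshold=2.0):
    """True when `seconds` lies more than `z_threshold` sample standard
    deviations from the mean of `reference_times` (too fast or too slow)."""
    mu = mean(reference_times)
    sigma = stdev(reference_times)
    if sigma == 0:
        # All reference times identical: any different time is an outlier.
        return seconds != mu
    return abs(seconds - mu) / sigma > z_threshold


# Hypothetical completion times, in seconds, from other participants.
reference_times = [1100, 1250, 1200, 1300, 1150]

print(age_matches_birth_year(30, 1990, 2020))    # True
print(age_matches_birth_year(30, 1975, 2020))    # False
print(timing_is_outlier(180, reference_times))   # True: far too fast
print(timing_is_outlier(3600, reference_times))  # True: far too slow
print(timing_is_outlier(1210, reference_times))  # False
```

The same timing test could be applied per question, per page, and per survey, since the flag is symmetric: it fires for answers that are implausibly fast and for those that are implausibly slow.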
- For starters, they conducted much of their business from the same IP addresses. We don't store IP addresses, but we do store a one-way bcrypt hash of them for comparison purposes.
- Similarly, they would log out of one account and right into another: a clear pattern of cheating.
- The survey completion time was too short: their 3 minutes vs. an average of 20 minutes.
- They provided very similar responses: They were always a white homosexual male.
- They chose the same studies to complete, and in the same order.
- The human element: We could visually see a pattern in the username creation that was not normal.
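The shared-IP signal from the list above can be sketched as a grouping step over hashed addresses. Note the substitution: the text mentions a bcrypt hash, but this sketch uses a keyed HMAC-SHA256 digest instead, because it is deterministic and available in the Python standard library, which keeps the example self-contained. The secret key, the function names, and the `min_accounts` threshold are all hypothetical.

```python
import hashlib
import hmac
from collections import defaultdict

# Hypothetical secret; in practice it would be loaded from configuration.
SECRET_KEY = b"replace-with-a-real-secret"


def fingerprint_ip(ip_address):
    """Return a keyed one-way digest of an IP address as a hex string."""
    return hmac.new(SECRET_KEY, ip_address.encode("utf-8"), hashlib.sha256).hexdigest()


def accounts_sharing_fingerprint(logins, min_accounts=3):
    """Group usernames by IP fingerprint and keep the groups that contain at
    least `min_accounts` distinct usernames.

    `logins` is an iterable of (username, ip_fingerprint) pairs.
    """
    groups = defaultdict(set)
    for username, fingerprint in logins:
        groups[fingerprint].add(username)
    return [sorted(names) for names in groups.values() if len(names) >= min_accounts]


# Example addresses come from the reserved documentation ranges.
logins = [
    ("user_a", fingerprint_ip("203.0.113.7")),
    ("user_b", fingerprint_ip("203.0.113.7")),
    ("user_c", fingerprint_ip("203.0.113.7")),
    ("user_d", fingerprint_ip("198.51.100.2")),
]
print(accounts_sharing_fingerprint(logins))  # [['user_a', 'user_b', 'user_c']]
```

Only the digest is stored and compared, so accounts can be linked to one another without keeping the raw address. A shared address alone is weak evidence (households and campuses share IPs), which is why it is one signal among several in the list above.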
[1] Self-relevance effect
Anthony G. Greenwald, The totalitarian ego: Fabrication and revision of personal history, American Psychologist, Volume 35, Issue 7, July 1980, Pages 603-618, ISSN 0003-066X, DOI: 10.1037/0003-066X.35.7.603.