Measuring and Managing Honesty in a Participant Pool Online

Intro

SocialSci has been cultivating our participant pool during our open beta period for the past 6 months.  We've grown our pool to thousands of participants, conducted dozens and dozens of studies, and produced well over 2 million answers for researchers using our platform. During this time we've learned a thing or two about how to gauge and manage participant honesty.  Many of the things discussed here have been implemented; others are under development or on the product road map.  Our goal is to provide a means for researchers to get their results faster by bringing their studies online, to the SocialSci platform, and letting us worry about acquiring participants, getting responses, compensating participants, and dealing with cheaters.

Framing our participants for honest engagement

Before starting this discussion, we should explain how SocialSci frames participants to be honest, and weeds out some who may be predisposed to lying from the get-go:

  • We differentiate ourselves from online marketing surveying by using the tagline "Real Science"
  • We require a valid cell phone number to limit account creation
  • We ask you to enter a username which doesn’t identify you
  • We ask for a password
  • We encourage you to sign up for our mailing list by entering your email address
  • As we don't associate your email address or phone number with your username, we ask you to answer 5 questions from a list of security questions, such as “How do you like your eggs prepared?”

If you are signing up for an account, you'll have to be truthful with at least your cell phone number. And if you ever intend to get back into your account, the username, password, and security question answers should be something you know; it is easier to remember true answers here than falsehoods. [1] If you hope to continue taking studies and earning more, you'll likely sign up for our newsletter with a valid email address. We have a 30% click-through rate on each newsletter that goes out, which indicates a very high response rate and valid email addresses. Approximately 75% of our participants are on our mailing list.  Others may find out about new studies via Twitter or Facebook.

In SocialSci's system we offer participants points for completing studies when they're first released by researchers. We also offer 'archived studies', which are worth no points and earn only social rewards such as badges. Historically 3.3% of participants in our system have been flagged for cheating, with only 0.3% of total users cheating a second time.

The information discussed here is based on our experiences and interactions with that subset of cheaters (still numbering in the hundreds), compared to the majority of honest participants. While I work with researchers on a daily basis, I am not a researcher, and so take this discussion in the context of a curious entrepreneur, not a rigorous scientific investigation.

Motivations for lying

I'll break these down into four broad categories: anonymity, curiosity, intention and desperation.

  • Anonymity: Those who believe that providing inconsistent answers helps to maintain their privacy or anonymity. Their answers are intended to mislead.
  • Curiosity: Those who are just curious about our service or a particular study, and submit answers which are not intended to be serious. Many of these never actually complete the study.  They're not driven by attaining the points, but merely by understanding what a study is like, what we offer, etc. These are the people who answer 'human' to the ethnicity questions, or give ages of 2 or 1000. Mostly harmless, as such outlier responses are auto-flagged as invalid by us and researchers.
  • Intention: Those who create an account or take studies purely to maximize profit. These are some of the easiest to catch, because they magically qualify for EVERY study.  Even the ones which conflict.  They're male AND female, Caucasian AND African-American, a smoker AND a non-smoker!  Trying to maximize profit by answering dishonestly without being caught by our algorithm is a thought-provoking but practically impossible challenge: You can't see into the future to know which studies we'll be offering, and at what point values, so you can't determine what your consistent false persona should be over time.
  • Desperation: These are survey takers who start out answering studies honestly, until they realize they're only one more survey away from getting a reward. They then begin to weigh whether they should lie to get more points and the ability to cash in for a reward.  This typically happens in one session, their first. Participants who have been taking studies honestly over time rarely become suddenly desperate and lie to take a study they do not qualify for.
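
To make the 'Intention' pattern concrete, here is a minimal sketch in Python of how conflicting answers across studies could be spotted. It is an illustration only, not SocialSci's actual algorithm: the attribute names, the STABLE_ATTRIBUTES set, and the tuple layout of the responses are all assumptions made up for this example.

```python
from collections import defaultdict

# Hypothetical set of attributes that should not change between studies.
STABLE_ATTRIBUTES = {"gender", "ethnicity", "smoker"}


def find_conflicts(responses):
    """Return {attribute: sorted distinct answers} for every stable
    attribute that one participant answered in more than one way.

    `responses` is a list of (study_id, attribute, answer) tuples
    belonging to a single participant (an assumed layout).
    """
    seen = defaultdict(set)
    for _study_id, attribute, answer in responses:
        if attribute in STABLE_ATTRIBUTES:
            # Normalize case and whitespace so "No" and "no" match.
            seen[attribute].add(answer.strip().lower())
    return {attr: sorted(vals) for attr, vals in seen.items() if len(vals) > 1}


responses = [
    ("study-1", "gender", "Male"),
    ("study-1", "smoker", "no"),
    ("study-2", "gender", "Female"),
    ("study-2", "smoker", "No"),
]
print(find_conflicts(responses))  # {'gender': ['female', 'male']}
```

A participant who qualifies for every study, including the ones that conflict, shows up here as a non-empty result. A single conflict could also be an honest mistake, so a result like this is better treated as one signal among several than as proof.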

Delving more into desperation, an easy-to-understand pattern emerges: Even the best-intentioned participants begin to run into an ethical quandary when presented with the prospect of rewards.  When points are offered on a study, a small subset of users choose to lie to gain additional points, primarily driven by the goal of cashing out those points.

The most common pattern we see for lying participants goes as follows:

  1. Participants sign up for a SocialSci account
  2. Either through announcements within our system or word of mouth, participants gain an expectation of earning a reward.
  3. They begin taking studies, and earning points
  4. They realize, for instance, that they're only one more study's worth of points away from redeeming for a reward.
  5. They also realize they don't qualify for the study -- by having read the title or the researcher’s terms agreement.
  6. They decide to 'become' who the researcher is looking for -- lie on the study, and attempt to earn points.
  7. They redeem their points for a reward.

From this, we can observe that lying is seldom a participant’s primary intent: They answered honestly on the first few studies -- the ones they did qualify for. They only started lying out of desperation to reach their reward goal.

Let's dive further into the patterns and minds of those lying due to intention or desperation, and how we detect that they're lying.

Lack of attention - Not quite lying

Despite having the intention of lying, few provide the attention required to do so well.  If a participant's age changes wildly between answers, for instance, this is a clear example of being caught lying through not paying attention. Plenty of people don't want to take the online survey process seriously, and so attempt to complete surveys as quickly as possible, putting in the first answers that pop into their mind, and blindly clicking away at the options.  This is the easiest to detect: wildly unrealistic survey completion times, skipping every answer possible, submitting pages without required questions answered, missing gold standard questions, and other easily caught errors.
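
A minimal sketch of two of these checks (skipped required questions and missed gold standard questions) might look like the following in Python. The function name, the question ids, and the dictionary layout are assumptions for illustration; this is not a description of SocialSci's real implementation.

```python
def attention_flags(answers, required, gold_standard):
    """Return a list of human-readable flags for one submitted study.

    answers:       {question_id: answer string, or None if skipped}
    required:      set of question ids that must be answered
    gold_standard: {question_id: expected answer string}
    All three structures are hypothetical, chosen for this example.
    """
    flags = []
    # Required questions that were skipped or left empty.
    missing = sorted(q for q in required if not answers.get(q))
    if missing:
        flags.append("missing required: " + ", ".join(missing))
    # Gold standard questions with an obviously wrong (or no) answer.
    for q, expected in sorted(gold_standard.items()):
        given = (answers.get(q) or "").strip().lower()
        if given != expected.lower():
            flags.append("failed gold standard: " + q)
    return flags


answers = {"q1": "25", "q2": None, "sky": "green"}
print(attention_flags(answers, {"q1", "q2"}, {"sky": "blue"}))
# ['missing required: q2', 'failed gold standard: sky']
```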

Another phenomenon we see has to do with how closely participants pay attention to the researcher's survey terms agreement. Oftentimes researchers publish a long multi-page terms sheet that participants are expected to read and agree to before taking a study.  Most of the information contained within is of little significance to most participants, and so they accept the terms without reading them. Buried among those terms is a paragraph explaining who qualifies to participate in the study.

For one researcher who asked participants to confirm that they matched the demographic they had just agreed to in the terms, we saw a 9% reduction in the number of invalid participants continuing on in the study. That made the confirmation question nearly 100% more effective than waiting for people who lied on that question to stop taking the survey at some later point (lying difficulty?). It was also 80% as effective as catching those who lied, completed the study, and were then caught by us.

SocialSci does this for many of our researchers. If our system doesn't have a high enough confidence based on past responses (or lack of responses for new users), we present the user with a qualifying study.  We don't tell them what the researcher is looking for. We just ask them to answer a few demographic questions, so we can determine if they qualify for that researcher's set of studies.  This is extremely effective: When people aren't provided information about what is being looked for, they have no incentive to answer differently than the truth.
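
The idea can be sketched in a few lines of Python. Everything named here (the 0.8 threshold, the profile and criteria dictionaries, the function names) is an assumption for illustration, not SocialSci's actual logic. The point is only that the criteria stay on the server side and the participant sees neutral demographic questions.

```python
CONFIDENCE_THRESHOLD = 0.8  # hypothetical cut-off, not a real SocialSci value


def needs_prescreen(confidence):
    """New users (confidence is None) and low-confidence users get a
    qualifying study before they can see the researcher's studies."""
    return confidence is None or confidence < CONFIDENCE_THRESHOLD


def qualifies(profile, criteria):
    """Check pre-screen answers against criteria the participant never sees.

    profile:  {attribute: answer} collected by the qualifying study
    criteria: {attribute: set of accepted answers} set by the researcher
    """
    return all(profile.get(attr) in accepted for attr, accepted in criteria.items())


criteria = {"gender": {"female"}, "age_group": {"18-30"}}
print(needs_prescreen(None))                                            # True
print(qualifies({"gender": "female", "age_group": "18-30"}, criteria))  # True
print(qualifies({"gender": "male", "age_group": "18-30"}, criteria))    # False
```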

The more directly participants are asked to confirm who they are, and whether they qualify, the higher the percentage who will disqualify themselves.

Consistent Lying is Difficult

Outside of SocialSci, I've observed that most people aren't good liars. They're especially not good consistent liars. This holds doubly true within SocialSci, where our system is designed to compare participants' responses over time for consistency.

Whether with full intention, or only out of desperation, sometimes participants will decide to lie, and having read the terms, will take the study not as themselves, but emulating who the researcher is looking for. This is easier said than done. Even if they start a study with the intent of lying, we've found that the further into the study people are asked for their demographic info, the more likely they are to forget to lie and provide the true info -- getting caught by our system.

For instance, say you're a white male, and you decide to take a study that requires you to be an African-American female, age 18-30, who is bisexual. We find that cheaters who are asked for this information within the first few pages of a study will lie effectively. If instead that information is collected on the last page of a 15 page, 20 minute study, they often will not remember who they are supposed to be. Without a way to go back and review the terms once they have started the study, they eventually fail at lying consistently to match the study requirements.

Detecting lying - Onion layer approach

Thankfully I don't believe sharing our approach necessarily lessens its effectiveness, so I’m happy to share. I've spent 5 years working in the computer security field, and have taken away a number of lessons which I've applied to our lie detection strategy.  Many people have the mindset that if something isn't 100% foolproof, it's not worth doing.  Nothing is 100%, and in the security world, we combat this with the concept of 'onion' layers of security. You layer one approach on top of another: If someone gets through one method, as it’s not 100% effective, they will have to contend with another layer, and so on, increasing the overall probability of success in detection. For example, a cheater taking surveys will try to optimize their monetary gain by taking studies faster as they become accustomed to what's being asked of them. We'll detect this as out of the norm. Or they may forget their persona for one of the studies, and we'll catch them. By having a multi-layered approach to detecting a lack of truthfulness, we can gauge whether someone made a one-time error, or has a history of deceitful answers and actions.
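
The arithmetic behind layering is worth a quick look. As a rough sketch, if we assume each layer catches a cheater independently with some probability (a simplifying assumption, and the rates below are made-up numbers, not measured ones), the chance that at least one layer catches them is one minus the chance that every layer misses:

```python
from math import prod


def combined_detection_probability(layer_probabilities):
    """Probability that at least one layer catches a cheater, assuming
    the layers act independently (a simplifying assumption)."""
    miss_all = prod(1.0 - p for p in layer_probabilities)
    return 1.0 - miss_all


# Three hypothetical layers that each catch only half of all cheaters.
print(combined_detection_probability([0.5, 0.5, 0.5]))  # 0.875
```

No single layer in that example is better than a coin flip, yet together they catch 87.5% of cheaters. Real layers are rarely fully independent, so the true figure would differ, but the direction of the effect is the same.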

The final goal is that once we've added enough layers, or hurdles for someone to jump through, it is no longer economically viable (nor intellectually possible) for them to be smart enough, take enough time, and be persistent enough to hop through all the hurdles just to get a $5 Amazon gift card from us. And even if after all that, they still do? It's not going to be a statistically significant proportion of users who participate in studies to skew the results.  Such lying exists in the offline world also.  Researchers take that into account. We're not promising a fool-proof system, we're promising a much better system than what currently exists online for honest, payable and anonymous participants.

Lie detection is a tricky business, but with enough data and time, it becomes increasingly apparent whether a participant has lied. Some more examples of our onion layers:

  • Gold Standard Questions - Perhaps asking what color the sky is, or to type a word instead of choosing from the options.  If the user isn't taking the study seriously enough to correctly answer that question, then it's a red flag to the researcher for that result.  Our researchers do this. SocialSci does this.  We interject such questions along with our qualifying studies, and incorporate the results into our algorithm which determines the 'credibility score' of participants.
  • Hidden Complex Questions make it more difficult to lie: What race is your mother, what race is your father? -- Will the liar think through the complexity of ensuring that this correlates with their own self-identified race-lie in a different study? What about asking what year you were born vs. your age?
  • Liars lie fast. If a lie is prepared (as in you've taken the study before, and you're attempting to game our system) -- this is easily detected, and taken into account. Even attempts to game this by taking far too long to submit answers will raise flags, as the times will fall outside of the standard deviation for question, page, and survey answer times. The only real way to not flag this is to honestly take the survey!
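
The timing layer can be sketched with nothing more than a mean and a standard deviation. The Python below is an illustration under assumptions: the 2.0 cut-off, the function names, and the sample times are invented for the example and are not SocialSci's real thresholds or data.

```python
from statistics import mean, stdev


def timing_z_score(seconds, reference_times):
    """How many standard deviations `seconds` lies from the mean of
    `reference_times` (needs at least two reference values)."""
    mu = mean(reference_times)
    sigma = stdev(reference_times)
    if sigma == 0:
        return 0.0
    return (seconds - mu) / sigma


def is_timing_outlier(seconds, reference_times, max_z=2.0):
    """Flag completion times that are far too fast OR far too slow.
    The default cut-off of 2.0 is a hypothetical choice."""
    return abs(timing_z_score(seconds, reference_times)) > max_z


# Hypothetical completion times (in seconds) of honest participants,
# averaging 20 minutes.
honest_times = [1100, 1200, 1250, 1300, 1150]
print(is_timing_outlier(180, honest_times))   # True: a 3 minute run
print(is_timing_outlier(1210, honest_times))  # False
print(is_timing_outlier(3600, honest_times))  # True: stalling is flagged too
```

The same check could be applied per question and per page as well as per survey, which is what makes deliberately slowing down hard to get right.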

In the security world, there's a well-known concept: 'As a defender, I must defend all avenues all the time from everyone; as an attacker, I must find only one chink in the armor at one point in time to gain access.'  I believe this concept similarly applies to catching liars: They need to make only one slip-up in one study to begin to arouse suspicion in our algorithm, which makes cheating a far more difficult task than just being honest. Sir Walter Scott's famous couplet "Oh, what a tangled web we weave / When first we practice to deceive!" describes the often difficult procedure of covering up a lie so that it is not detected in the future.

Detecting lying - Human review

We combine our automated algorithm with human reviews of participants’ activities. We implement a 24 hour review period for all orders, as the largest incentive for lying is to profit through rewards. This gives SocialSci staff a window of time to review any flags on a participant, and always keep on top of new advances in ways participants may try to lie, or cheat the system.  Our staff has the ability to positively or negatively flag accounts with different tags, worth different points, much like SpamAssassin would do for spam emails, as part of a Bayesian filter. We combine these manual flags with the algorithm output for an overall profile of participants, and that feedback goes into future development iterations to improve our algorithms.
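
To illustrate the tagging idea, here is a deliberately simple additive sketch in Python, in the spirit of SpamAssassin's rule scores. It is not a Bayesian filter and not SocialSci's real scoring: the tag names, point values, and threshold are all hypothetical.

```python
# Hypothetical staff tags and the points each one is worth.
# Positive points raise suspicion; negative points lower it.
TAG_POINTS = {
    "failed_gold_standard": 2.0,
    "conflicting_demographics": 3.0,
    "shared_ip_hash": 1.5,
    "manually_verified": -4.0,
}
REVIEW_THRESHOLD = 5.0  # hypothetical cut-off for holding an order


def account_score(algorithm_score, staff_tags):
    """Combine the automated score with the points of manual tags."""
    return algorithm_score + sum(TAG_POINTS[tag] for tag in staff_tags)


def hold_order(algorithm_score, staff_tags):
    """Decide whether a reward order stays on hold during the review window."""
    return account_score(algorithm_score, staff_tags) >= REVIEW_THRESHOLD


print(hold_order(1.0, ["conflicting_demographics", "shared_ip_hash"]))  # True
print(hold_order(1.0, ["conflicting_demographics", "shared_ip_hash",
                       "manually_verified"]))                            # False
```

The second call shows why negative tags matter: a staff member who has checked an account can push it back below the threshold without editing the algorithm's own output.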

Notifying participants of cheating

Based on all these tactics for detection, we take a variety of actions to alert participants.  We have two competing incentives: It costs us to acquire participants, and we want to have a large and diverse pool of participants for our researchers to access.  At the same time, we cannot tolerate having poor quality participants, who lie, or do not otherwise take our researchers’ studies seriously.  We take a graduated approach of notifying participants about the importance of being honest, and taking the studies seriously.  We hope to correct any poor behavior early and often, and aim for improvement by notifying participants of disqualification while taking a study, or when redeeming rewards. Opportunities are given to improve their cred score by taking archived studies, which are not worth points. If warning them is not possible, or the behavior is so egregiously fraudulent, we may be forced to block them from redeeming rewards, or participating in future studies. We always leave the door open for participants to discuss these decisions with us, and contest any account flags.  Sometimes we or our algorithms make mistakes, and we wish to manually correct this, and improve the algorithms.

Priming participants for honesty, and keeping them honest

Based on everything discussed, and the patterns and behaviors we see in participants, there are a number of things we can do to encourage honesty within the system, and quickly correct dishonesty.

Our biggest challenge is separating ourselves from other surveying online.  We are not marketing surveys. We're real scientific studies. Your answers matter: They affect the researchers’ science.  We care about participants’ honesty, researchers care, and we want them to care!  

For starters, we can ensure that we have sufficient new studies to take for any new participant, or when notifying our existing pool. If they are able to earn enough points to cash in for a reward, then they have less incentive for opportunistic lying.  For those who have had poor experiences with online surveying in the past, this helps us gain credibility with participants, and begin changing their mindset.

As discussed earlier, we can also improve the survey terms agreement. We can emphasize specific requirements, and point-blank ask participants to agree to being the age and gender called for.  Many people who do not qualify will not continue when prompted in this way.

Case Study: Mass cheating -- what researchers experience online today

Just before publishing this post, we had our most egregious case of cheating yet: One or more persons created over 160 accounts, participated in studies, and submitted orders for rewards. The spoils for their efforts? A grand total of $5.

Combating situations like this is precisely why we designed SocialSci. We want researchers to be able to conduct their research, and get trustworthy results fast -- without having to play the cat and mouse game with cheaters, or design complex systems as SocialSci has: Let us worry about that.

Our onion layers of protection worked as intended, and enabled us to catch this cheater, and prevent them from skewing results, or pilfering our bank accounts. Here's what they did, and how we caught them.

Just like any user, they signed up for a new account by receiving a text message from us, but at the time didn't qualify for any studies.  A few days later we announced new studies and they signed up for another account. They began taking studies, finding two studies they qualified for, and earned enough points to redeem for a gift card. As everything checked out with their account, our system awarded the gift card 24 hours later.

In the meantime, the prospect of illicit gains was just too much for this user, and they decided to try to game the system.  They found an online SMS platform where they could receive messages from us, and signed up for multiple accounts.  As the accounts, their responses, and their orders piled up, the flags began going off.  We could identify them all as rogue: The only question was how they were signing up for multiple accounts.  For debugging purposes, we wait a short period of time after phone numbers are submitted before obscuring them further. By reviewing the recent phone numbers of new accounts, we saw an unusual pattern: many numbers in the same area code and exchange. We called one of them and were greeted with an automated message from an online texting service.  We visited their website (they shall go unnamed) and reviewed the service. We've since followed their guidelines to submit our texting numbers to their block list to prevent this service from being used in such a way in the future.
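
The pattern we spotted by eye (many numbers in the same area code and exchange) is easy to sketch in code. The Python below assumes 10-digit North American numbers and a made-up threshold of three sign-ups per prefix; the function name and the sample numbers (drawn from the fictional 555-01xx range) are for illustration only.

```python
from collections import Counter


def crowded_prefixes(phone_numbers, min_count=3):
    """Return {area code + exchange: count} for every 6-digit prefix that
    appears at least `min_count` times among recent sign-ups.

    Assumes North American numbers; anything that does not reduce to
    10 digits is ignored. `min_count` is a hypothetical threshold.
    """
    prefixes = Counter()
    for number in phone_numbers:
        digits = "".join(ch for ch in number if ch.isdigit())
        if len(digits) == 11 and digits.startswith("1"):
            digits = digits[1:]  # drop the country code
        if len(digits) == 10:
            prefixes[digits[:6]] += 1
    return {prefix: n for prefix, n in prefixes.items() if n >= min_count}


recent = ["(202) 555-0101", "202-555-0102", "+1 202 555 0103", "415-555-0199"]
print(crowded_prefixes(recent))  # {'202555': 3}
```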

But what did the user do that led to us being able to identify them?

  • For starters, they conducted much of their business from the same IP addresses. We don't store IP addresses, but we do store a one-way bcrypt hash of them for comparison purposes.
  • Similarly, they would log out of one account and right into another: a clear pattern of cheating.
  • Their survey completion times were too short: about 3 minutes, against an average of 20 minutes.
  • They provided very similar responses: They were always a white homosexual male.
  • They chose the same studies to complete, and in the same order.
  • The human element: We could visually see a pattern in the username creation that was not normal.
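
As an illustration of the first point, here is a sketch in Python of comparing accounts by a one-way fingerprint of the IP address instead of the address itself. Note the swap: we mention bcrypt above, but to keep this example dependency-free and deterministic it uses a keyed HMAC-SHA-256 from the standard library instead. The key, the function names, and the sample addresses (from the reserved documentation ranges) are all assumptions, not our production code.

```python
import hashlib
import hmac
from collections import defaultdict

SECRET_KEY = b"replace-with-a-real-secret"  # hypothetical server-side key


def ip_fingerprint(ip_address, key=SECRET_KEY):
    """One-way keyed fingerprint of an IP address (HMAC-SHA-256).
    The same address always yields the same fingerprint, so fingerprints
    can be compared without storing the address itself."""
    return hmac.new(key, ip_address.encode("utf-8"), hashlib.sha256).hexdigest()


def accounts_sharing_ip(logins, min_accounts=2):
    """Group usernames by fingerprint and return the groups that contain
    at least `min_accounts` distinct accounts.

    logins: iterable of (username, fingerprint) pairs (assumed layout).
    """
    by_fingerprint = defaultdict(set)
    for username, fingerprint in logins:
        by_fingerprint[fingerprint].add(username)
    return [sorted(users) for users in by_fingerprint.values()
            if len(users) >= min_accounts]


logins = [
    ("user_a", ip_fingerprint("203.0.113.7")),
    ("user_b", ip_fingerprint("203.0.113.7")),
    ("user_c", ip_fingerprint("198.51.100.4")),
]
print(accounts_sharing_ip(logins))  # [['user_a', 'user_b']]
```

Shared addresses are common on campuses and in households, so a group like this is a reason to look closer, not a verdict on its own.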

Our 24 hour review period worked exactly as intended: We were able to mass block the accounts, deny the orders, and flag the responses as invalid, not paying out a single dime to their cheating ways.

Conclusion

Although detecting and dealing with cheaters can be a difficult business, it is also a rewarding one. We're creating the best system online for researchers to get responses quickly and inexpensively for their studies, and trust in the answers they receive. We've already helped researchers around the globe get their results faster, and start writing up their results sooner, instead of spending endless months hunting down more participants, dealing with the hassles of collecting their responses, and compensating them.

Despite covering many of our strategies here, we have many more built into the system, and many more to come. Let us worry about this, and we'll let the researchers focus on their research.  We look forward to continuing to explore the challenges of running an honest, payable, and anonymous online pool of participants and sharing our findings with the research community.

- Mike

& The SocialSci Team
 

References:
[1] Self-relevance effect
Anthony G. Greenwald, The totalitarian ego: Fabrication and revision of personal history, American Psychologist, Volume 35, Issue 7, July 1980, Pages 603-618, ISSN 0003-066X, DOI: 10.1037/0003-066X.35.7.603.