
8 Ways to Check your Data Quality when using ODK or KOBO

Better data quality is one of the key benefits of mobile data collection.  You can run in-depth checks on data entry quality daily, while the survey is still being carried out, instead of waiting until the end of the survey!

Here’s a quick list of the top 8 ways to check the quality of your incoming data from your mobile survey:


Data Quality Check #1: Length of Time to Finish Each Survey


  • To check this, subtract the form’s start time from its end time to get the overall duration.
  • Why does this matter? An enumerator who is filling in forms too quickly (or taking far too long) may be filling in data dishonestly.  That’s not always the case, but at least it flags anomalies that you can go and check out in person.
  • ODK has three ways for you to track timestamps through a form –
    • start and end times for the form being filled in,
    • a timestamp collected the first time the enumerator reaches a particular question (you can set this up in your XLSForm),
    • a full audit on your ODK form – a .csv file produced when the questionnaire is finalized that gives you full details on how the enumerator filled out the form, and when they accessed every single question.
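Once you have start and end times in your export, the duration check is a few lines of code. A minimal sketch, assuming a CSV-style export where each submission has ISO-formatted `start` and `end` columns (the enumerator names, timestamps, and the 10–90 minute thresholds below are all made-up examples – tune the thresholds to your own questionnaire):

```python
from datetime import datetime

# Hypothetical submissions; in a real export, "start" and "end" come from
# the form's start/end metadata columns.
submissions = [
    {"enumerator": "A", "start": "2023-05-01T09:00:00", "end": "2023-05-01T09:25:00"},
    {"enumerator": "B", "start": "2023-05-01T09:10:00", "end": "2023-05-01T09:14:00"},
]

def duration_minutes(sub):
    """Overall interview time: end time minus start time, in minutes."""
    start = datetime.fromisoformat(sub["start"])
    end = datetime.fromisoformat(sub["end"])
    return (end - start).total_seconds() / 60

# Flag interviews that look suspiciously fast or slow.
flagged = [s for s in submissions if not 10 <= duration_minutes(s) <= 90]
for s in flagged:
    print(s["enumerator"], round(duration_minutes(s), 1), "minutes")
```

Run this daily against the cumulative export and follow up in person on whoever keeps appearing in the flagged list.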

Data Quality Check #2: Collecting GPS Points


  • Check what percentage of the time your enumerators are collecting GPS.  If Enumerator X only collected GPS points 15% of the time, while all other enumerators collected 95%+ – what’s going on with Enumerator X?
  • ODK is working on a feature to collect GPS in the background (as part of the form’s metadata) instead of making GPS collection an explicit question in the interview. This takes some control away from the enumerator, and in some contexts GPS collection is controversial.  Therefore, use this feature wisely.
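The per-enumerator GPS collection rate is easy to compute from an export. A minimal sketch, assuming hypothetical rows where the `gps` column is empty (`None`) when the question was skipped – names and coordinates are invented:

```python
from collections import defaultdict

# Hypothetical export rows: gps is None when the question was skipped.
rows = [
    {"enumerator": "X", "gps": None},
    {"enumerator": "X", "gps": None},
    {"enumerator": "X", "gps": "4.61 -74.08"},
    {"enumerator": "Y", "gps": "4.60 -74.07"},
    {"enumerator": "Y", "gps": "4.65 -74.10"},
]

totals = defaultdict(int)
with_gps = defaultdict(int)
for row in rows:
    totals[row["enumerator"]] += 1
    if row["gps"] is not None:
        with_gps[row["enumerator"]] += 1

# Share of interviews with a GPS point, per enumerator.
rates = {e: with_gps[e] / totals[e] for e in totals}
print(rates)
```

An enumerator whose rate sits far below the team average is worth a conversation.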

Data Quality Check #3: Random Distribution of GPS Points


  • If you’re doing a random household survey, you’ll want to check for randomness in the location of the households surveyed. Collect GPS at each interview location, and then throw these points onto a map and check their randomness visually.
  • Watch out for points all along a straight line (a road).  Maybe an enumerator simply walked down a single road collecting data. If you’re trying to collect random, representative data, then this kind of data collection will not give you a random or representative view of the entire population of that community or location.
  • It’s easiest to do this kind of visual check with aerial imagery in the background of your map.
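The visual check on a map is the main tool, but you can also screen for the “everyone along one road” pattern numerically. A crude sketch (my own heuristic, not an ODK feature): fit a least-squares line through the (lat, lon) points and see what fraction fall within a small tolerance of it – the point lists and the 0.0005-degree tolerance (roughly 50 m) are illustrative assumptions:

```python
import math

def fraction_near_line(points, tol=0.0005):
    """Fit a least-squares line through (lat, lon) points and return the
    fraction lying within `tol` degrees of it -- a crude road detector."""
    n = len(points)
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0:
        return 1.0  # all points share one x value: a perfectly straight track
    b = sum((x - mx) * (y - my) for x, y in points) / sxx
    denom = math.hypot(b, 1.0)
    near = sum(abs((y - my) - b * (x - mx)) / denom <= tol for x, y in points)
    return near / n

# Invented examples: points marching down one road vs. spread across an area.
road = [(4.600 + 0.001 * i, -74.080 + 0.002 * i) for i in range(20)]
spread = [(4.600, -74.080), (4.610, -74.080), (4.600, -74.060),
          (4.610, -74.060), (4.605, -74.100)]
print(fraction_near_line(road), fraction_near_line(spread))
```

A fraction near 1.0 for an enumerator’s points is a signal to zoom in on their cluster on the map.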

Data Quality Check #4: Gender Ratios


  • If you are doing a random survey (with truly random interviewee selection methodologies), and you’re collecting complete gender data on your population, you should end up with a 50/50 split between male and female. Perhaps you’ll get a 48/52 split, maaaayyybe a 47/53 split – but that is probably the max.  If you’re showing a 45/55 split, or a 40/60 split – you’ve got some issues with the random selection methodology.
  • The great thing about mobile data collection is that you can check these ratios daily. And if, by halfway through the survey, those numbers aren’t around 50/50, you can pause the survey and fix whatever the problem is.
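The daily ratio check is one line of arithmetic. A minimal sketch with an invented response list and the rough 47/53 threshold from above (remember that with a small sample, some deviation from 50/50 is just random noise, so don’t panic over the first day’s numbers):

```python
# Hypothetical daily export of a gender question.
responses = ["female", "male", "female", "female", "male", "female",
             "female", "male", "female", "female"]

female_share = responses.count("female") / len(responses)
print(f"female share: {female_share:.0%}")

# Flag when the split drifts beyond roughly 47/53 -- a rule of thumb,
# to be adjusted for your sampling design and sample size.
if abs(female_share - 0.5) > 0.03:
    print("Check your random selection methodology!")
```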


Data Quality Check #5: Bell Curves look ‘Normal’


  • Many collected datasets will show some sort of a bell-curve – probably a skewed bell-curve. You might notice “spikes” in your curve.  Or you might notice how values along your dataset “jump” suddenly.  If you see this, then you might want to look a little more closely into these non-normal data distributions.
  • For example, we as humans tend towards entering numbers that are rounded to multiples of 5. So if you see spikes at “5, 10, 15, 20”, etc., then you’re most likely seeing human bias in number entry. (Human bias in this example is numbers being rounded up or down to the nearest 5.)
  • You might also notice a jump in the dataset if a data entry assistant is skewing the data purposely. For example, if people get a benefit with a score of 50 or above, you may have scores that are artificially skewed towards being above 50.  Watch out for these signs of untruthful data collection.
  • Remember – the person entering data may not even realize they’re doing it! So if you notice this human bias showing up in your numbers, then ask yourself – Why, Why Why??  Get to the bottom of it.  Re-train your team so that they understand how bias causes problems in the data.  Train them on how to remain unbiased during their surveys.
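You can quantify the “spikes at multiples of 5” pattern (often called heaping) directly. A minimal sketch with an invented list of reported ages: if values were spread evenly, roughly one in five (20%) would land on a multiple of 5, so a much higher share suggests rounding bias:

```python
from collections import Counter

# Hypothetical reported ages from a survey export.
ages = [20, 25, 25, 30, 30, 30, 35, 40, 40, 45,
        23, 37, 50, 55, 60, 28, 45, 35, 40, 25]

counts = Counter(ages)
heaped = sum(v for k, v in counts.items() if k % 5 == 0)
share_on_multiples = heaped / len(ages)
print(f"{share_on_multiples:.0%} of values land on multiples of 5")
```

Here 85% of values sit on multiples of 5 – far above the ~20% you’d expect by chance, which is exactly the kind of non-normal distribution worth digging into.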

Data Quality Check #6: Re-Interview People by Phone


  • It’s tough to monitor a survey. Most of our monitoring and evaluation efforts exist to check up on physical programmes and activities being run in a community.  However, monitoring a survey is good practice as well.  One of the ways you can monitor the implementation of a survey is to randomly select some interviewees and phone them up to double-check a few things.
    • Where were they interviewed? (Does the location match where they were supposed to be interviewed according to the survey methodology?)
    • Ask them three or four questions from the survey to see if you get the same answers as what the enumerator entered into the form.
  • To do this, collect the interviewee’s first name and phone number, and get their consent to be called for monitoring purposes.
    • Names and numbers are personal, sensitive data, and must be protected! When you collect personal and sensitive data, you’ve got to make sure you’ve got good data protection practices in place.  Here’s a list of 27 tips for good data protection for humanitarian teams.
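Drawing the back-check sample should itself be random and reproducible. A minimal sketch, with invented submission IDs and phone numbers and an assumed 10% back-check rate – the fixed seed means a colleague can re-draw the same sample and verify whom you called:

```python
import random

# Hypothetical list of submissions with consented phone numbers.
interviews = [{"id": i, "phone": f"+000-000-{i:04d}"} for i in range(1, 101)]

rng = random.Random(42)                    # fixed seed: reproducible sample
back_check = rng.sample(interviews, k=10)  # ~10% back-check rate
print([i["id"] for i in back_check])
```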

Data Quality Check #7: Unique Beneficiary Signatures


  • If you are collecting beneficiary signatures using XLSForm, then do a quick visual comparison of all the signatures collected by opening up the media folder. Are the signatures unique?  Or do they all look pretty similar?  If all the signatures are similar, then you might want to check that the enumerators know to collect the beneficiary signature.  Make sure they aren’t just putting their own signature down on the form.
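The visual check is irreplaceable for signatures that merely look alike, but you can automatically catch files that are byte-for-byte identical (e.g. an enumerator re-attaching the same image) by hashing them. A minimal sketch with invented filenames and byte content – in practice you would read each file from the form’s media folder with `open(path, "rb").read()`:

```python
import hashlib

# Hypothetical signature images as raw bytes.
signatures = {
    "sub_001.png": b"\x89PNG...sig-a",
    "sub_002.png": b"\x89PNG...sig-b",
    "sub_003.png": b"\x89PNG...sig-a",   # byte-identical to sub_001
}

seen = {}
duplicates = []
for name, data in signatures.items():
    digest = hashlib.sha256(data).hexdigest()
    if digest in seen:
        duplicates.append((name, seen[digest]))
    else:
        seen[digest] = name

print(duplicates)  # exact duplicates only; similar-looking ones still need eyes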

Data Quality Check #8: Photo Evidence


  • One great way to add evidence to your survey is to collect one or two photos with the survey. You can always give enumerators the option not to collect photos.  You can do a quality check to see which enumerators are collecting photo evidence regularly vs. hardly at all.
  • There is an option in ODK that forces the form to collect a NEW picture right then and there, instead of letting the enumerator select a previously taken picture from the device. Use this option if you want to ensure that pictures are taken at the exact moment the question appears.
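This is the same kind of rate check as the GPS one in Check #2, applied to the photo column. A minimal sketch with invented rows, flagging enumerators who attach photos less than half the time (the 50% threshold is an assumption to adjust for your survey):

```python
from collections import defaultdict

# Hypothetical export rows: "photo" holds the attachment filename, or None.
rows = [
    {"enumerator": "A", "photo": "house_1.jpg"},
    {"enumerator": "A", "photo": "house_2.jpg"},
    {"enumerator": "B", "photo": None},
    {"enumerator": "B", "photo": None},
    {"enumerator": "B", "photo": "house_3.jpg"},
]

taken = defaultdict(list)
for row in rows:
    taken[row["enumerator"]].append(row["photo"] is not None)

# Enumerators collecting photo evidence less than half the time.
low = {e: sum(v) / len(v) for e, v in taken.items() if sum(v) / len(v) < 0.5}
print(low)
```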


By Janna

Janna is an aid worker, an engineer, a mom, a wife, and a self-declared data-lover! Her mission is to connect with every field worker in the world to help the humanitarian sector use information management and technology to make aid faster and more accountable.