It's a year since we (at the Centre for Humanitarian Data) released the IATI COVID-19 Funding dashboard. 

It's continued to successfully pull in and use IATI data every day since.  However, we have encountered new and unexpected data challenges along the way.  For example, we found that some large publishers started to use the term "COVID-19" in all their descriptions, regardless of whether the activity related to the pandemic.  Hence, we had to adjust our processing logic so that we didn't use this data.  
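One way such a check could work, as a minimal Python sketch rather than the dashboard's actual logic: count how often the term appears in each publisher's descriptions, and set aside publishers where it appears in nearly all of them. The field names (`publisher`, `description`), the function name, and both thresholds are assumptions made for illustration.

```python
from collections import defaultdict


def publishers_overusing_term(activities, term="COVID-19",
                              min_activities=20, min_share=0.95):
    """Return publishers whose descriptions almost all contain `term`.

    `min_activities` and `min_share` are illustrative thresholds, not
    values taken from the dashboard.
    """
    totals = defaultdict(int)
    matches = defaultdict(int)
    needle = term.lower()
    for activity in activities:
        publisher = activity["publisher"]
        totals[publisher] += 1
        # Treat a missing description as empty text.
        description = activity.get("description") or ""
        if needle in description.lower():
            matches[publisher] += 1
    return {
        publisher
        for publisher, total in totals.items()
        if total >= min_activities and matches[publisher] / total >= min_share
    }
```

Publishers returned by this function would then have the term ignored in their descriptions, so that other signals decide whether an activity relates to the pandemic.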

A recent phenomenon I wanted to share is the publication of some transaction values that seem implausibly large.

For example, this activity includes a $200 billion commitment to COVID-19 responses in Morocco.

[Image: $200B outgoing commitment to COVID-19 in Morocco]

 

In another example, a publisher seemed to report values in billions where millions looked more plausible. The IATI Technical Team instigated some excellent work to analyse this:

[Image: Analysis of BRC data by IATI tech team]

 

In the first example, the activity is still published as we found it.  We've had to exclude it from any data use, given that the value seems to be wrong.  In the second case, things look to be fixed, but we're still excluding it in the meantime.

In both cases, the values are technically valid (in that they don't fail schema validation).  We only learnt about them through other data users, or through the impact on our own tools.

To help with our data routines, we maintain a simple lookup sheet to block any such activities.  That's particular to this one instance of using IATI data, so it doesn't feel very robust.
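For anyone wanting to try something similar, here is a minimal Python sketch of a lookup-sheet filter. It is not our actual routine: the column names, the example identifiers, and the function names are all invented for illustration, and the sheet is inlined as CSV text so the example is self-contained.

```python
import csv
import io

# Hypothetical lookup sheet: one row per blocked activity, with a reason.
# The column names and identifiers below are made up for this sketch.
BLOCKLIST_CSV = """iati_identifier,reason
XX-EXAMPLE-0001,Commitment value looks implausibly large
XX-EXAMPLE-0002,Values appear to be out by a factor of 1000
"""


def load_blocklist(text):
    """Return a dict mapping blocked activity identifiers to the reason."""
    reader = csv.DictReader(io.StringIO(text))
    return {row["iati_identifier"].strip(): row["reason"] for row in reader}


def filter_activities(activities, blocklist):
    """Split activities into (kept, excluded) using the blocklist."""
    kept, excluded = [], []
    for activity in activities:
        if activity["iati_identifier"] in blocklist:
            excluded.append(activity)
        else:
            kept.append(activity)
    return kept, excluded


activities = [
    {"iati_identifier": "XX-EXAMPLE-0001", "title": "Blocked activity"},
    {"iati_identifier": "XX-EXAMPLE-0003", "title": "Ordinary activity"},
]
blocklist = load_blocklist(BLOCKLIST_CSV)
kept, excluded = filter_activities(activities, blocklist)
```

Keeping a reason next to each identifier means the sheet can be shared with other data users, or with the publisher, without extra explanation.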

Of course, we have long established that IATI is not an accounting system - but these examples do disrupt those that use the financial data!  If we leave this to just these people, it both multiplies and duplicates the work...

The question for this Community of Practice is therefore: what do we need to help flag and communicate when a transaction value doesn't look right?  In turn, how do we actually identify what might be candidates for such behaviour? 

Comments (2)

Herman van Loon

Hi Steven,

Good points you are raising. On the positive side: since you are using the data, these data errors see the light! To catch these kinds of errors without reverting to eyeballing each individual transaction, the only way I can think of to flag them is to have additional plausibility checks. E.g. at the transaction level, I doubt there are legitimate single IATI transactions greater than 1 billion USD. Depending on the type of organization, you can probably set stricter plausibility rules. E.g. a large bilateral donor or multilateral organization may have some transactions exceeding 50 million USD.  That seems very unlikely though for local or donor-country based NGOs.
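A minimal Python sketch of such plausibility checks could look like the following. The 50 million and 1 billion USD figures echo the examples above; the organization type labels, the field names, and the assumption that values have already been converted to USD are invented for this sketch.

```python
# Illustrative ceilings in USD for a single transaction, by organization type.
# The type labels and the mapping itself are assumptions, not agreed rules.
CEILINGS_USD = {
    "ngo": 50_000_000,
    "bilateral": 1_000_000_000,
    "multilateral": 1_000_000_000,
}
DEFAULT_CEILING_USD = 1_000_000_000


def flag_implausible(transactions):
    """Return the transactions whose absolute USD value exceeds the ceiling.

    Each transaction is a dict with hypothetical keys "org_type" and
    "value_usd" (value already converted to USD).
    """
    flagged = []
    for transaction in transactions:
        ceiling = CEILINGS_USD.get(transaction["org_type"], DEFAULT_CEILING_USD)
        # Use the absolute value so large negative corrections are caught too.
        if abs(transaction["value_usd"]) > ceiling:
            flagged.append(transaction)
    return flagged


transactions = [
    {"ref": "t1", "org_type": "ngo", "value_usd": 60_000_000},
    {"ref": "t2", "org_type": "bilateral", "value_usd": 60_000_000},
    {"ref": "t3", "org_type": "bilateral", "value_usd": 200_000_000_000},
]
flagged = flag_implausible(transactions)
print([t["ref"] for t in flagged])  # prints ['t1', 't3']
```

A flag here only means "worth a second look": a flagged transaction may still be legitimate, so the output is better suited to a review queue than to automatic exclusion.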

David Megginson

Great points, Herman -- thanks. In cases like these, I think there's also value in making both the rules and the list of blocked orgs and transactions public (when safely possible), to keep the "T" in IATI. I think we were doing that originally, but I'm not sure about the block list now, since I was away on medical leave for a while.

