We plan to release some upgrades to the Unified Data Pipeline in January 2024.
Bug fix: providing data for activities in multiple files with the same iati-identifier
This release includes a bug fix that ensures the Datastore provides correct data when activities with the same ID (iati-identifier) appear in more than one file. The guidance says not to do this (see “with each activity contained in only one file.”), but we know that many publishers do.
What is the problem being fixed?
Currently, a specific problem occurs when two files contain different data for the same activity:
- File F1 has an activity with iati-identifier 1, and it has some data.
- File F2 has an activity with the same iati-identifier 1, and its data is different from the data in F1.
In these cases, when you download XML data or request JSON or XML data via the API, you will see 2 activity records, both with the same iati-identifier 1. Sometimes, however, both records will have the data from F1 attached, or both will have the data from F2 attached.
This fix ensures that in these cases each record always carries the data from its own file: you will get 2 activities, one with the data from file F1 and the other with the data from file F2. This allows consumers to consider all of the available data and make decisions about how to use it in their application.
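As an illustration only, here is a minimal Python sketch of how a data consumer might spot activities that share an iati-identifier in XML they have downloaded. It assumes the XML contains iati-activity elements that each have an iati-identifier child. The sample data, the identifiers and the function names (group_by_identifier, find_duplicates) are made up for this example and are not part of the Datastore.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical sample data: two records share an identifier, as happens
# when the same activity is published in two files (F1 and F2).
SAMPLE_XML = """<iati-activities>
  <iati-activity>
    <iati-identifier>XM-EXAMPLE-1</iati-identifier>
    <title><narrative>Title from file F1</narrative></title>
  </iati-activity>
  <iati-activity>
    <iati-identifier>XM-EXAMPLE-1</iati-identifier>
    <title><narrative>Title from file F2</narrative></title>
  </iati-activity>
  <iati-activity>
    <iati-identifier>XM-EXAMPLE-2</iati-identifier>
    <title><narrative>Only published once</narrative></title>
  </iati-activity>
</iati-activities>"""


def group_by_identifier(xml_text):
    """Return a dict mapping each iati-identifier to its activity elements."""
    root = ET.fromstring(xml_text)
    groups = defaultdict(list)
    for activity in root.iter("iati-activity"):
        identifier = (activity.findtext("iati-identifier") or "").strip()
        groups[identifier].append(activity)
    return dict(groups)


def find_duplicates(groups):
    """Keep only the identifiers that appear on more than one record."""
    return {key: records for key, records in groups.items() if len(records) > 1}


groups = group_by_identifier(SAMPLE_XML)
duplicates = find_duplicates(groups)

for identifier, records in duplicates.items():
    titles = [record.findtext("title/narrative") for record in records]
    print(identifier, len(records), titles)
```

Once duplicates are grouped like this, an application can decide for itself whether to pick one record, compare them, or merge them.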
IATI Standard guidance: please publish activities correctly
To be clear: even after this fix is released, the guidance is still that each activity should be published in only one file. Not doing so makes the data ambiguous and harder to use, as data consumers have to decide whether to choose the data from one file or try to merge the data in some way. It may also cause problems in other tools.
If you have any questions or comments, let us know here or via our support email address: firstname.lastname@example.org.