I have often had to convert data from one format to another. Most of the time, the task is trivial and only requires writing a single script that does the job. The goal might be to move data from an old system to a new one, to let different tools analyze it, or simply to import data from an external entity. In most cases, errors can have huge implications. Since conversions don’t usually occur very often, it’s usually wise to spend a few CPU cycles making sure the incoming data is right and the output makes sense.
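As a rough illustration of spending those extra cycles, here is a minimal Python sketch of a conversion script that validates each incoming row and then sanity-checks its own output. The CSV layout, the `id` and `quantity` columns, and the function names `convert_row` and `convert` are all assumptions made up for the example, not part of any real system.

```python
import csv
import io


def convert_row(row):
    """Validate one input row and return the converted record.

    The 'id' and 'quantity' fields are hypothetical column names.
    """
    entry_id = (row.get("id") or "").strip()
    if not entry_id:
        raise ValueError("missing id")
    try:
        quantity = int(row.get("quantity") or "")
    except ValueError:
        raise ValueError(f"quantity is not an integer for id {entry_id!r}") from None
    if quantity < 0:
        raise ValueError(f"negative quantity for id {entry_id!r}")
    return {"id": entry_id, "quantity": quantity}


def convert(source):
    """Convert every row, collecting rejects instead of silently dropping them."""
    converted, rejected = [], []
    total = 0
    for total, row in enumerate(csv.DictReader(source), start=1):
        try:
            converted.append(convert_row(row))
        except ValueError as exc:
            # Keep the data row number so the reject can be traced back.
            rejected.append((total, str(exc)))
    # Sanity checks on the output: no row lost, no duplicate ids.
    if len(converted) + len(rejected) != total:
        raise RuntimeError("row count mismatch")
    ids = [record["id"] for record in converted]
    if len(ids) != len(set(ids)):
        raise RuntimeError("duplicate ids in output")
    return converted, rejected


sample = io.StringIO("id,quantity\nA1,3\n,5\nB2,many\nC3,7\n")
records, errors = convert(sample)
```

With this sample input, two rows are converted and two are rejected along with their row numbers and a reason. The point is not the specific checks, which depend entirely on the data, but that nothing is dropped without a trace.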
In a perfect world, you decide what the input format is and what the content is. In the real world, that’s not always possible because of limitations in the exporting application. In the worst case, information will be missing or inaccessible because of bad references. It’s not uncommon to see data scattered across multiple applications, and some tasks might require access to multiple sources. Accessing multiple databases or gathering information from multiple files should not be a technical problem. The problems come from inconsistencies in the data. I have seen cases where the same entry is not referred to by the same number in different systems, or where the numbering conventions simply changed over time and were never documented. The task was actually to make the user’s work more efficient when accessing various systems.
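One way to surface this kind of inconsistency early is to reconcile the identifiers from both sources before converting anything. The Python sketch below assumes a purely hypothetical convention (an `OLD-` prefix and zero padding in one system) and made-up names `normalize_id` and `reconcile`. In practice the normalization rule has to be recovered from the actual data, precisely because it was never documented.

```python
def normalize_id(raw_id):
    """Map an identifier to a canonical form (hypothetical convention)."""
    cleaned = raw_id.strip().upper()
    if cleaned.startswith("OLD-"):
        cleaned = cleaned[len("OLD-"):]
    # Drop zero padding, but keep a lone "0" as a valid identifier.
    return cleaned.lstrip("0") or "0"


def index_by_canonical_id(raw_ids):
    """Build canonical id -> raw id, refusing ambiguous collisions."""
    index = {}
    for raw_id in raw_ids:
        key = normalize_id(raw_id)
        if key in index:
            raise ValueError(f"{raw_id!r} and {index[key]!r} both map to {key!r}")
        index[key] = raw_id
    return index


def reconcile(ids_a, ids_b):
    """Return matched id pairs plus the ids found in only one system."""
    a_index = index_by_canonical_id(ids_a)
    b_index = index_by_canonical_id(ids_b)
    matched = {key: (a_index[key], b_index[key])
               for key in a_index.keys() & b_index.keys()}
    only_a = sorted(a_index[key] for key in a_index.keys() - b_index.keys())
    only_b = sorted(b_index[key] for key in b_index.keys() - a_index.keys())
    return matched, only_a, only_b


matched, only_a, only_b = reconcile(
    ["OLD-0042", "0107", "0300"],  # ids as exported by one system
    ["42", "107", "555"],          # ids as exported by the other
)
```

Here `0300` and `555` end up in the unmatched lists, which is exactly the kind of report worth reviewing with someone who knows the data before running the real conversion.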
A good trick to prevent further problems is to write good user documentation (yet another task most developers hate). Documentation is the only protection a developer can have against user errors and bogus bug reports. The documentation shouldn’t only explain the interface details of the application, but also its scope (what it can do, what it can’t do and why it exists). A simple RTFM will then solve most problems.