Electronic Batch Registration to Common Works Registration - The Story behind the Conversion Tool
At the beginning of 2018, we released a free EBR to CWR Conversion tool. It has been steadily gaining in popularity. It has also become an internal workhorse for the Data Cleaning and Conversion service we provide.
Almost by Accident
We have conversion tools that turn various spreadsheet formats (CSV/Excel) into CWR, and EBR falls into that category. However, when a client needed their EBR files converted to CWR, our general CSV to CWR converter core would have needed significant changes to handle them, so we took another look at the issue. It turned out that EBR is essentially CWR slimmed down and put into a spreadsheet structure. With our CWR library done and well tested, we only had to extend its reading capability slightly to handle the EBR format. Then we just added a form and released it as a free tool.
Vanilla Tool versus Scripts
At the time, it seemed like an edge case, so we put absolutely no limitations on the tool. Internally, we do use more complex scripts that add additional information to the CWR file, such as a list of sub-publishing entities with their respective territories. But it turned out that we mostly used the very same tool we released to the public, which was gradually extended so that the conversion would accept additional columns in the EBR file.
Free versus Paid?
At the time, we saw the tools as foundations for our services, not as the product itself. The conversion service did bring in some money, but even more importantly, clients who used these services have registered works with CWRs we made at several societies in South America, Europe and Australia. And not just paying clients.
Until GDPR set in, we were saving both the EBR files and the resulting CWR files. We also released the CWR Acknowledgement parser, which was at the time a trimmed-down version of the Visual Validator. So we were able to have both the CWRs our users made from their EBRs and the resulting acknowledgement files. This gave us insights into several other collecting organizations.
Not every visitor succeeded with the conversion. The validation errors reported may look like:
Error in THE WORK: SPU, field `publisher_name`: Text contains invalid characters.
To someone who is fluent in CWR, the meaning of this message is completely obvious: one of the controlled publisher names has characters that are not allowed in names in a CWR file. But not every visitor succeeds in finding and fixing the error. And there might be another error. And another. And with the web interface, only the first error was being reported.
So, we needed to do something about this.
The conversion tool gained a ‘butcher’ mode (as opposed to the ‘surgeon’ mode), which dealt with most issues in EBR files. Initially we thought it risky and cautioned against using it, but we tested it on all the EBR files we had collected up to that point, and not once did it cause a problem, while in most cases it enabled the conversion. So, we renamed it to ‘Fix bad data’.
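As a rough illustration of the difference between the two modes, the sketch below first validates a name strictly and then repairs it instead of rejecting it. The allowed character set, the function names and the repair rule (uppercase, strip accents, drop whatever is still invalid) are all assumptions made for this example. They are not the actual rules used by the tool or defined by the CWR specification.

```python
import unicodedata

# Hypothetical allowed character set, for illustration only.
ALLOWED = set("ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789 .,'&-/")


def validate_name(value):
    """Strict mode: return a list of error messages, empty if the name is valid."""
    bad = sorted({ch for ch in value if ch not in ALLOWED})
    if bad:
        return ["Text contains invalid characters: " + "".join(bad)]
    return []


def fix_name(value):
    """'Fix bad data' style mode: repair the name instead of rejecting it."""
    # Uppercase, then decompose accented letters into base letter + accent mark.
    decomposed = unicodedata.normalize("NFKD", value.upper())
    # Keep only allowed characters (this drops the accent marks too).
    kept = "".join(ch for ch in decomposed if ch in ALLOWED)
    # Collapse runs of whitespace left behind by removed characters.
    return " ".join(kept.split())
```

With these assumed rules, a name such as "Café  Records!" fails strict validation but comes out of `fix_name` in a form that passes it. The trade-off is the one described above: the data is changed without asking the user.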
But with GDPR setting in, and our CWR being compatible with dozens of collecting organizations and major sub-publishers, the free version made no sense any more, except maybe as a lure for a paid automated service that would complement our aforementioned conversion service.
Dropbox as Interface
In order to solve all these issues, we chose to try a different approach: to use Dropbox (at first, to be followed by Google Drive, etc.) as an interface. This really simplifies the conversion, as we can have different IN folders with different settings, as well as an OUT folder for successful conversions and an ERROR folder for reporting issues. And, as output could be more than one file or one view, this made partial conversions possible.
In partial conversions, all EBR rows (works) that can be converted to CWR are converted, and a new EBR file, containing only the bad rows, is also returned. So the user can choose whether to fix the issues and try again, or to use the partial CWR file immediately if time is of the essence.
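The idea of a partial conversion can be sketched in a few lines. The example below is a minimal illustration, not our actual implementation: it treats the EBR file as plain CSV text, and the column names, `split_rows` and `check_publisher_name` are hypothetical. It separates rows that pass validation from those that do not, and collects every error rather than only the first one.

```python
import csv
import io


def check_publisher_name(row):
    """Hypothetical validator: flag non-ASCII characters in publisher_name."""
    if row["publisher_name"].isascii():
        return []
    return ["publisher_name: Text contains invalid characters."]


def split_rows(ebr_text, validate_row):
    """Split CSV text into convertible rows and a CSV holding only the bad rows.

    Returns (good_rows, bad_rows_csv, errors).
    """
    reader = csv.DictReader(io.StringIO(ebr_text))
    fieldnames = reader.fieldnames or []
    bad_out = io.StringIO()
    writer = csv.DictWriter(bad_out, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    good_rows = []
    errors = []
    # Row 1 is the header, so data rows are numbered from 2.
    for row_no, row in enumerate(reader, start=2):
        row_errors = validate_row(row)
        if row_errors:
            writer.writerow(row)  # goes into the file returned for fixing
            errors.extend(f"Row {row_no}: {e}" for e in row_errors)
        else:
            good_rows.append(row)  # would be passed on to the CWR writer
    return good_rows, bad_out.getvalue(), errors
```

In a folder-based setup like the one described above, the good rows would feed the CWR file written to the OUT folder, while the bad-rows file and the error list would go to the ERROR folder.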
All this made it possible for us to have a proper subscription service that is simple to use and can be tailored to many different needs. The details are visible in the comparison chart.