Syntax Highlighting for Common Works Registration Format
This article explains why a syntax highlighter for CWR files is useful in various situations. But first, the background story.
History of CWR syntax highlighting
I made the first CWR syntax highlighting in late 2013 (for Sublime Text 2), and have been using it ever since. Early on, I used it to find errors in CWR files. This is where I got the idea for the Visual CWR Validator, released in 2017. It was the first tool based on my Music Metadata Engine. At the time, it was free-to-use.
At some point, I realized that for most users, syntax highlighting was more important than validation. The validation part was actually just confusing. All they wanted was to understand the data contained in the file. And it had to be significantly more readable and clear to smart people with no previous CWR experience.
So, in 2018, I created a new tool, removing the “validator” part. Well, in reality, it is completely different code, still based on the same engine. But it follows the “less is more” philosophy. All fields show additional information on mouseover, but most of them have no colour: counters and defaults are light grey, and fields with actual values are black. Colour is used sparingly, and each colour has a special meaning. Some spacing was added as well.
This new tool, called Visual CWR, is free to use, and that will not change. The aforementioned Visual CWR Validator is now only available as a yearly subscription.
The highlighting also made it into open-source code.
Note that the above screenshot from Django-Music-Publisher (DMP) depicts CWR version 3.0, not version 2.1, which is shown in the previous screenshots. CWR 3.0 is significantly easier to read.
Even less information is colour-coded in DMP. There are three reasons for this. This highlighting is not based on a metadata engine, but rather a simple Django template. More importantly, the target group is different. And, most importantly, I learned something in between about users. They want only what they need.
Why is CWR Syntax Highlighting Important?
To a software developer, the question seems equivalent to: Why is walking important? We could just be crawling.
While I can read any code or data format without syntax highlighting, I prefer to use a tool that provides it. I am faster and can focus better on the issue at hand. The more unreadable the format, the more important the syntax highlighting. And CWR, even 3.0, is very unreadable.
Not all records (rows) are equal. They actually form a structure. A file has groups of transactions. Every transaction should be visually separated from the next. Every transaction has a header row, and all other rows in the transaction are, directly or indirectly, below it. Indentation can be used to show this. SPT records are really children of SPU records, so they can be indented even further. There also has to be an indent after the record type field, so that the rest of the row aligns again.
In the first image, there is no indentation. In the other two, there is, and it is somewhat different in each. But after 19 characters, the file is again vertically aligned in all three cases. This is where the actual data starts in all records within transactions.
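The indentation scheme described above could be sketched in Python roughly as follows. This is a minimal sketch, not the code behind any of the tools mentioned here. It assumes that the record type is the first three characters of each row, and it only handles rows inside transactions. The sets of record types, the two-space indent, and all function names are illustrative assumptions.

```python
# Illustrative subsets; real CWR files contain more record types.
TRANSACTION_HEADERS = {"NWR", "REV"}  # assumed transaction header rows
NESTED_RECORDS = {"SPT"}              # children of SPU, indented one level deeper
INDENT = "  "                         # assumed indent width
MAX_DEPTH = 2


def depth_of(record_type: str) -> int:
    """Return the indentation depth for a row inside a transaction."""
    if record_type in TRANSACTION_HEADERS:
        return 0
    if record_type in NESTED_RECORDS:
        return 2
    return 1


def indent_record(line: str) -> str:
    """Indent the record type, then pad after it so the data aligns again."""
    record_type, rest = line[:3], line[3:]
    depth = depth_of(record_type)
    lead = INDENT * depth
    pad = INDENT * (MAX_DEPTH - depth)
    return f"{lead}{record_type}{pad}{rest}"


def layout_transactions(lines):
    """Yield indented rows, with an empty row before each transaction header."""
    for line in lines:
        if line[:3] in TRANSACTION_HEADERS:
            yield ""
        yield indent_record(line)
```

Because the padding after the record type complements the indent before it, every row grows by the same number of characters, so everything after the record type stays vertically aligned.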
Further readability is achieved by grouping the record headers with colour-coding. In the last example, publisher data is green, writer data is blue, etc.
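As a rough illustration, grouping the record headers by colour can be reduced to a lookup table. The sketch below is hypothetical: the grouping of record types and the name `render_record_type` are assumptions, and only the two colours named above are included.

```python
from html import escape

# Assumed grouping of record types into publisher and writer data.
RECORD_GROUPS = {
    "SPU": "publisher", "OPU": "publisher", "SPT": "publisher",
    "SWR": "writer", "OWR": "writer", "SWT": "writer", "PWR": "writer",
}
GROUP_COLOURS = {"publisher": "green", "writer": "blue"}


def render_record_type(line: str) -> str:
    """Wrap the record type in a coloured span if it belongs to a known group."""
    record_type = line[:3]
    group = RECORD_GROUPS.get(record_type)
    if group is None:
        return escape(record_type)  # no group, no colour
    colour = GROUP_COLOURS[group]
    return f'<span style="color: {colour}">{escape(record_type)}</span>'
```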
Sequences and Blanks
Each transaction record has at least two sequence numbers, and some have more. These may be important when debugging, but for most people, they are not. In the last two examples, this lack of importance is signified with light grey. There is one exception, though: the publisher chain sequence. It is important in CWR 2.2 and 3.0, because it represents the link between writers and their original publishers. In the last screenshot, it is marked with red.
If a field does not have a value, it is either blank or filled with zeros for numeric, date, time and duration fields. In the second screenshot, such values are shown as light grey.
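Deciding what to grey out can be sketched as two small helpers. The 19-character prefix comes from the alignment described earlier. Splitting it into a 3-character record type and two 8-character sequence numbers is my assumption about the record layout and should be checked against the CWR specification. The names `split_prefix` and `is_empty_value` are made up for this example.

```python
from typing import NamedTuple, Tuple


class RecordPrefix(NamedTuple):
    record_type: str
    transaction_sequence: str
    record_sequence: str


def split_prefix(line: str) -> Tuple[RecordPrefix, str]:
    """Split a transaction row into its assumed 3+8+8 prefix and the data."""
    prefix = RecordPrefix(line[0:3], line[3:11], line[11:19])
    return prefix, line[19:]


def is_empty_value(raw: str, zero_filled: bool = False) -> bool:
    """Return True for blank fields, and for all-zero fields when the field
    type (numeric, date, time, duration) uses zeros to mean "no value"."""
    if raw.strip() == "":
        return True
    return zero_filled and set(raw) == {"0"}
```

The `zero_filled` flag is there because a string of zeros in an alphanumeric field could be an actual value, so the caller has to know the field type.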
So, with light grey, we remove a lot of clutter. Now the real work begins: colour-coding the important fields.
Which Fields are Important?
The answer really depends on the user and the use case. For the rest of the article, we will presume that a Django Music Publisher user made a registration and wants to double-check things. For this, we will use the following screenshot of a hypothetical co-published library work.
Titles, including work titles, alternative titles and record titles, are marked with purple. Titles are always important, particularly the work title when one searches through a large file.
In the example above, there are two publishers, but one of them is unknown (OPU record). The known one is marked with green. This includes the internal IP code (here DMP), which appears in various records, while the publisher name only appears in SPU/OPU records.
Writers are treated very similarly to publishers, but are marked with blue. Please note that writer-related data includes PWR records, which, among other data, include the internal IP codes for the writer (blue) and the publisher (green), as well as the publisher sequence number (red).
Artist names are shown in cyan, both in PER and REC records.
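Once a row has been split into fields, the colour scheme described in this section can be expressed as a mapping from a field's role to a colour. The following is a minimal sketch under that assumption. The role names and the function `render_field` are hypothetical, and it only treats blank fields as empty, not zero-filled ones.

```python
from html import escape

# Colours as described in this section; role names are illustrative.
ROLE_COLOURS = {
    "title": "purple",
    "publisher": "green",
    "writer": "blue",
    "publisher_sequence": "red",
    "artist": "cyan",
}


def render_field(raw: str, role: str = "") -> str:
    """Render one fixed-width field as a coloured HTML span."""
    if raw.strip() == "" or role == "sequence":
        colour = "lightgrey"  # clutter: blanks and ordinary sequence numbers
    else:
        colour = ROLE_COLOURS.get(role, "black")  # plain values stay black
    return f'<span style="color: {colour}">{escape(raw)}</span>'
```

For example, `render_field("DMP", "publisher")` produces a green span. The output is meant to sit inside a preformatted block, so that the fixed-width spacing of the original row is preserved.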
Implementing CWR Syntax Highlighting
I have worked on several different implementations, three of which have been mentioned here. If you want to implement CWR syntax highlighting, there are several options.
Do it Yourself
Of course, if you have CWR-capable software, the same developers should be able to implement this. It is not a difficult task, though experience matters.
Use my REST API
This is, by far, the fastest way, and probably the cheapest in the short to medium run. It can usually be achieved in under a week. It can be configured to match your needs and the colours in your app.
Buy the code
You can get it for most programming languages. So far, we have implemented it in Python, PHP and JS, but we can translate it to a programming language of your choice.