Syntax Highlighting for Common Works Registration Format

This article explains why a syntax highlighter for CWR files is useful in various situations. But first, the background story.

History of CWR syntax highlighting

I made the first CWR syntax highlighting in late 2013 (for Sublime Text 2), and have been using it ever since. Early on, I used it to find errors in CWR files. This is where I got the idea for the Visual CWR Validator, released in 2017. It was the first tool based on my Music Metadata Engine. At the time, it was free-to-use.

CWR 2.1 complete highlighting with validation from Visual CWR Validator

At some point, I realized that for most users, syntax highlighting was more important than validation. The validation part was actually just confusing. All they wanted was to understand the data contained in the file. And it had to be significantly more readable and clear to smart people with no previous CWR experience.

So, in 2018, I removed the “validator” part. Well, in reality, it is a completely different code base, still built on the same engine, but it follows the “less is more” philosophy. All fields have additional information on mouseover, but most of them have no color: they are light gray for counters and defaults, or black for fields with actual values. Color is used sparingly, and each color has a special meaning. Some spacing was added as well.

CWR 2.1 complete highlighting without validation from Visual CWR

This new tool is free to use. It is, by far, the most used tool on my website. But, free to use is not free in my book. For a software tool to be free, the source has to be freely available. And I am not releasing the engine code as open source.

Please note that the above images contain links to videos about the respective tools. The latter one is now somewhat obsolete, as Django Music Publisher now (since July 2019) has basic syntax highlighting included in the code.

CWR 3.0 basic highlighting without validation from Django Music Publisher

Also, note that the above image (also a link to a video playlist about Django Music Publisher) depicts CWR version 3.0, not 2.1, shown in the previous screenshots. CWR 3.0 is significantly easier to read.

However, even less information is color-coded. But one can notice that the titles and names are, unlike in the previous example, color-coded. There are three reasons for this. This highlighting is not based on a metadata engine, but rather a simple Django template. More importantly, the target group is different. And, most importantly, I learned something in between.

Why is CWR Syntax Highlighting Important?

To a software developer, the question seems equivalent to: Why is walking important? We could just be crawling.

While I can read any code or data format without syntax highlighting, I prefer to use a tool that provides it. I am faster and can focus better on the issue at hand. The more unreadable the format, the more important syntax highlighting becomes. And CWR, even 3.0, is very unreadable.

Data Structure

Not all records (rows) are equal. They actually form a structure. A file has groups of transactions. Transactions should be visually separated from each other. Every transaction has a header row, and all other rows in the transaction are, directly or indirectly, below it. Indentation can be used to show this. SPT records are really children of SPU records, so they can be indented even further. There has to be padding after the record type field as well, so that the rest of each row aligns again.
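The indentation idea described above can be sketched in a few lines of Python. This is only an illustration, not the code used in any of the tools mentioned here. It assumes that the record type is the first three characters of each row, and the sets of control and transaction header record types, the depth values and the function names are all hypothetical choices. Only the nesting of SPT under SPU comes from the text.

```python
# Assumed record types; adjust to the CWR version being displayed.
CONTROL = {"HDR", "GRH", "GRT", "TRL"}
TRANSACTION_HEADERS = {"NWR", "REV", "ISW", "EXC"}

MAX_DEPTH = 2   # deepest nesting level used in this sketch
INDENT = "  "   # two spaces per level


def depth(record_type):
    """Return the nesting depth for a record type (illustrative values)."""
    if record_type in CONTROL or record_type in TRANSACTION_HEADERS:
        return 0
    if record_type == "SPT":
        return 2  # child of SPU, so indented one level further
    return 1


def indent_line(line):
    """Indent the record type, then pad so the rest of the row aligns."""
    record_type, rest = line[:3], line[3:]
    d = depth(record_type)
    # Leading indent shows the nesting; trailing padding restores alignment.
    return INDENT * d + record_type + INDENT * (MAX_DEPTH - d) + " " + rest


def indent_file(lines):
    """Indent all rows and put a blank line before each new transaction."""
    out = []
    for line in lines:
        if line[:3] in TRANSACTION_HEADERS and out:
            out.append("")
        out.append(indent_line(line))
    return out
```

Because the padding after the record type shrinks as the leading indent grows, everything after the record type starts in the same column on every row.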

In the first image, there is no indentation. In the other two, there is, and it is somewhat different between them. But in all three cases, the file is vertically aligned again after 19 characters. This is where the actual data starts in all records within transactions.

Further readability is achieved by grouping the record headers with color-coding. In the last example, publisher data is green, writer data is blue, etc.
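As a rough sketch of this kind of grouping, the snippet below wraps the record type in an HTML span with a color per group. It is not taken from any of the tools above. The assignment of record types to the publisher and writer groups is an assumption made for illustration, and the function names are hypothetical. Only the green and blue colors come from the text.

```python
from html import escape

# Assumed grouping of record types; only the colors come from the article.
RECORD_GROUPS = {
    "publisher": {"SPU", "OPU", "SPT", "OPT"},
    "writer": {"SWR", "OWR", "SWT", "OWT", "PWR"},
}
GROUP_COLORS = {"publisher": "green", "writer": "blue"}


def record_group(record_type):
    """Return the group name for a record type, or None if ungrouped."""
    for group, types in RECORD_GROUPS.items():
        if record_type in types:
            return group
    return None


def highlight_record_type(line):
    """Return the row as HTML, with a colored record type if it is grouped."""
    record_type, rest = line[:3], line[3:]
    group = record_group(record_type)
    if group is None:
        return escape(line)
    color = GROUP_COLORS[group]
    return f'<span style="color: {color}">{escape(record_type)}</span>{escape(rest)}'
```

Keeping the grouping in a plain dictionary makes it easy to add further groups or to swap inline colors for CSS classes.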

Sequences and Blanks

Each transaction record has at least 2 sequence numbers, and some have more. These may be important when debugging, but for most people, they are not. In the last two examples, this lack of importance is signified with light gray. There is one exception, though: the publisher chain sequence. It is important in CWR 2.2 and 3.0, where it represents the link between writers and their original publishers. In the last screenshot, it is marked with red.
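To gray out the two leading sequence numbers, a highlighter first has to cut them out of the row. The sketch below assumes the CWR 2.1-style record prefix of a 3-character record type followed by two 8-character sequence numbers, which adds up to the 19 characters mentioned earlier. The widths for other versions may differ, and the style names ("type", "muted", "data") and function names are hypothetical.

```python
from typing import NamedTuple


class Prefix(NamedTuple):
    record_type: str
    transaction_sequence: str
    record_sequence: str
    data: str


def split_prefix(line):
    """Split a row into the assumed 3 + 8 + 8 character prefix and the data."""
    return Prefix(line[0:3], line[3:11], line[11:19], line[19:])


def style_segments(line):
    """Pair each part of the row with a style name; sequences are muted."""
    p = split_prefix(line)
    return [
        (p.record_type, "type"),
        (p.transaction_sequence, "muted"),
        (p.record_sequence, "muted"),
        (p.data, "data"),
    ]
```

A renderer can then map "muted" to light gray and leave the remaining segments for more specific highlighting.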

If a field does not have a value, it is either blank or filled with zeros for numeric, date, time and duration fields. In the second screenshot, such values are shown as light gray.
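The rule for detecting such empty values is simple enough to sketch. The function names and the `numeric_like` flag below are hypothetical. The flag stands for numeric, date, time and duration fields, where a run of zeros means "no value", while in other field types zeros are treated as real data.

```python
def is_empty(value, numeric_like=False):
    """Return True if a fixed-width field carries no actual value."""
    if value.strip() == "":
        return True  # blank (or zero-length) field
    # Zero-filled fields only count as empty for numeric-like field types.
    return numeric_like and set(value) == {"0"}


def style_for(value, numeric_like=False):
    """Pick a style name: light gray for empty fields, black otherwise."""
    return "lightgray" if is_empty(value, numeric_like) else "black"
```

For example, `style_for("00000000", numeric_like=True)` would mark an unset date as light gray, while the same string in a text field would stay black.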

So, with light gray, we remove a lot of clutter. Now the real work begins. Color-coding the important fields.

Which Fields are Important?

The answer really depends on the user and the use case. For the rest of the article, we will presume that a Django Music Publisher user made the registration and wants to double-check things. For this, we will use the following screenshot of a hypothetical co-published library work.


CWR 3.0 transaction, highlighting available as open-source in Django Music Publisher

Titles

Titles, including work titles, alternative titles and record titles, are marked with purple. Titles are always important, particularly the work title when one searches through a large file.

Publishers

In the example above, there are two publishers, but one of them is unknown (OPU record). The known one is marked with green. This includes the internal IP code (here DMP), as it appears in various records, while the publisher name only appears in SPU/OPU records.

Writers

Writers are handled very similarly to publishers and are marked with blue. Please note that writer-related data includes PWR records that, among other data, include internal IP codes for the writer (blue) and the publisher (green), as well as the publisher sequence number (red).

Artists

Artist names are shown in cyan, both in PER and REC records.

Shares and Society Affiliations

For this explanation, we will refer to the third (SPT) record. There are three types of collectible shares: performance, mechanical and synchronization, depicted with cyan, purple and red respectively. The society affiliation codes use the same colors.

For both fields, having additional data on mouseover is important. This is not shown in the screenshot, but you can easily test it with Django Music Publisher. If you don’t want to install it yourself, you can try out DMP.guru, where the first 30 days are free. Alternatively, use the Visual CWR tool with your own CWR 2.1 file, or this one.

Implementing CWR Syntax Highlighting

I have worked on several different implementations, three of which have been mentioned here. If you want to implement CWR syntax highlighting, there are several options.

Do it Yourself

Of course, if you have CWR-capable software, the same developers should be able to implement this. This is not a difficult task, though experience matters.

Use my REST API

This is, by far, the fastest way, and probably the cheapest in the short to medium run. Integration can usually be achieved in under a week. It can be configured to match your needs and the colors in your app.

Buy the code

You can get it for most programming languages. So far, we have implemented it in Python, PHP and JS, but we can translate it to a programming language of your choice.