BitPerfect: May 2013

Friday 31 May 2013

BitPerfect v1.0.7 Released!

BitPerfect version 1.0.7 has now been released to the App Store. It is a free upgrade for all existing BitPerfect users. Version 1.0.7 is a maintenance release.
Version 1.0.7 fixes the incompatibility with iTunes 11.0.3.

Thursday 23 May 2013

More issues with iTunes 11.0.3

Please read this carefully.

We have a release candidate for BitPerfect 1.0.7 which addresses the problems we have reported with iTunes 11.0.3 and I have been testing it myself all day on two systems, one using iTunes 11.0.3 and the other using iTunes 11.0.2. If our Beta Team blesses it, and Apple approves it, it will be available soon as a free update.

1.0.7 has been performing flawlessly for me all day, on both systems, with no problems whatsoever. But suddenly, in the late afternoon, the original problem returned again with a vengeance, completely out of the blue. At least it appeared that way. I won't take you through the diagnostic process blow-by-blow, but instead I will present you with my conclusions.

Basically, if I took BitPerfect out of the picture completely, and played music using iTunes alone, what I observed was that playback of a track would proceed normally, but the track position indicator would remain frozen at 0 seconds. When this happens with BitPerfect playing, after a few seconds BitPerfect interprets this as the user having manually returned the track position slider to the zero position, and re-starts playback of the track from the beginning. If the track position slider continues to stay frozen at the zero mark, the track will continue to repeat ad nauseam from zero. This is the same thing that used to happen before we made the fix, except that instead of the track position slider staying at zero, the call to Apple Scripting Bridge to return the track position was returning a garbled value that BitPerfect was interpreting as zero. This is different behavior entirely, but with an identical outcome.
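The looping behavior described above can be sketched in a few lines of Python. This is purely illustrative and is not BitPerfect's actual code: the function name, the three-second grace period, and the once-per-second polling are all assumptions made for the example.

```python
def count_restarts(reported_positions, grace_s=3):
    """Count restarts for a player that polls the slider once per second.

    reported_positions: slider position (seconds) seen at each poll.
    grace_s: hypothetical threshold, not a documented BitPerfect value.
    """
    restarts = 0
    played_s = 0  # seconds actually played since the last (re)start
    for reported in reported_positions:
        played_s += 1
        if reported == 0 and played_s > grace_s:
            restarts += 1   # looks like the user dragged the slider to zero
            played_s = 0    # so playback starts again from the beginning
    return restarts

healthy = list(range(1, 13))   # slider advances normally: 1, 2, ..., 12
frozen = [0] * 12              # slider stuck at zero for 12 seconds
print(count_restarts(healthy), count_restarts(frozen))
```

With a healthy slider nothing is ever restarted. With a frozen one, the track is restarted every few seconds for as long as the slider stays at zero.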

My music is located on a NAS, and the NAS is in turn accessed via an ethernet hub. iTunes must load every tune over ethernet. What seems to be happening is that my ethernet hub is in the process of failing. If I cold boot it, everything seems to start working perfectly for a while, and then all of a sudden the problem returns. My Mac is a ~2009 Mac Mini with an Intel dual-core processor. Our best interpretation of what is happening is that iTunes 11.0.3 runs a more tightly thread-based execution model. If the ethernet connection is flaky (either because your hub is flaky, or maybe because you are running a WiFi connection with a low signal or unexpected interference) then one core is busy running the music playback, and iTunes spawns another thread to manage the housekeeping tasks such as updating the track position slider. However, this thread wants to run on the other core, which is busy trying to manage a non-responsive ethernet connection. If I move the music off the NAS and onto the local HD, the problem immediately goes away.

I have no idea whether this is a very specific problem to my own configuration, or if other users might also see something similar. Generally speaking, anybody with an older Mac, which is doing something - anything, really - that ties up one core and won't release it, could experience this problem.

That represents our initial take on what we are observing. It may end up being totally wrong, so don't take it as gospel. But in any case I thought it was worth reporting, since the symptoms for BitPerfect users would appear at first to be the same as those associated with a known BitPerfect problem (which we have since fixed). If you suspect you are seeing the same problem, just try quitting BitPerfect and playing music the normal way without it. Watch the track position slider and make sure it is behaving normally.

Wednesday 22 May 2013

iTunes 11.0.3 update

Preliminary testing of a test version of BitPerfect which fixes the known problems with iTunes 11.0.3 is proceeding successfully. This means that a permanent solution should be available soon. We are looking at perhaps 2-3 weeks, most of which will comprise waiting for Apple to approve release to the App Store.

Thank you for your patience!

Tuesday 21 May 2013

Loudspeakers

Upgrading your Hi-Fi equipment can be deeply satisfying and deeply unsatisfying, both at the same time. Satisfying, because you have invested in something that you have either wanted for a long time, or have spent a long time researching and preparing. Unsatisfying, because, so many times, after the 'thrill of the chase' is over, and the 'new car smell' has worn off, you find that your new installation is somehow not much more musically satisfying than the previous system. Go on - admit it. We've all been there.

There are many reasons and causes for this, but I am going to focus on one of them that I see happening more frequently than any other. The loudspeaker fixation.

It is perhaps natural to focus on the loudspeaker as the 'most important' item in your playback chain. Very clearly, different loudspeakers have an immediately obvious different sound signature. Anybody can hear readily the differences in sound when you change from one loudspeaker to another, whether you are an audiophile or a civilian. Although (thankfully) it is not something we see too much of these days, it is relatively easy for a retailer to hook a bunch of loudspeakers up to a switching box and have the customer switch back and forth between them in real time, the differences between each model standing in stark contrast. It is not so easy to set this up for - say - a bunch of amplifiers, or a bunch of speaker cables. But even so, as O. J. Simpson's lawyers might have put it, I am willing to stipulate to the differences between loudspeakers, that they are real, substantial, and readily apparent.

However, differences in sound, and differences in musicality, appear consistently to be different things. It is an inconvenient truth that while anybody can readily appreciate the former, for most people it appears that the latter has to be learned. This is an uncomfortable notion, in that it implies all sorts of connotations such as 'golden ears' or 'trained musician', and other faintly elitist notions. But it's generally true, and is perhaps a subject for a post all of its own one of these days. It used to be that this learning process was something a good dealer should be able to train you to do. (This was something Ivor Tiefenbrun, the founder of Linn, drilled mercilessly into all of his dealers back in the day.) But too few dealerships these days seem to have mastered that art. In my case, I was shown the light by a sales clerk in a high-end London audio store during the course of one Tuesday afternoon. What I learned that afternoon has stayed with me all my life. (Geez - it sounds like I'm coming out!...)

But back to loudspeakers. Most people who are not audiophiles - and many who are - have a tendency to categorize audio equipment in one of two groups. Those that sound different, and those that don't. For example, most will happily put loudspeakers into the "those that do" category, and interconnect cables into "those that don't". Amplifiers are usually described by audiophiles as being among "those that do", although real-world behavior tends to suggest that people actually treat them as "those that don't". The easiest way to quantify this behavior is to consider the way people apportion their budgets in building an audio system. It makes sense that individuals would apportion the money they spend in the way that best reflects their own perception of where the most bang can be had for the buck. Most people - audiophile or otherwise - seem comfortable with the notion that a full 50% of the budget for an audio system should be set aside for the loudspeakers. It is my view that this is seldom the most satisfactory proportion.

So, given that loudspeakers indisputably sound more immediately different than do amplifiers, how does an audiophile set about choosing the components that will comprise that new upgraded system? I think that the best way to think about different loudspeaker systems is to recognize that each different model has a 'character' of sound which is immediately evident, and a 'quality' of sound that is not. The elements of 'character' are often expressed in terms of a loudspeaker being more or less suitable for one genre of music or another. And it's true. Some speakers definitely make a better job of pop/rock than classical, and vice versa, and yet others come into their own with intimate jazz. And for most people, buying an audio system involves an element of compromise driven by budgetary constraints. If you want to hear 'quality' go listen to a pair of Wilson's entry-level (but still car-priced) Sophia 3's. These loudspeakers are widely available, and are all about sound quality. The 'character' might or might not be to your taste, but the 'quality' is indisputable, and is there in spades.

So the process of buying a system starts with identifying a loudspeaker model that works well with the sort of program material the listener wants to play over them. That's not at all a bad way to start. But don't make it the end. By choosing your loudspeakers you have not broken the back of the task by any means. The key to musical fulfillment lies in what comes next. You should spend equal amounts of time on loudspeakers, source components, amplifiers, and cables (in that order). There is a solid argument for apportioning costs in the same way as well. Many people have a real problem spending as much on cables as they do on a pair of loudspeakers - and I can sympathize enormously with that argument - but if you are serious about buying your system based entirely on what you hear, then that is what you should be prepared to do.

In closing, I must say that the effects of cables (power cords, interconnects, USB cables, and speaker cables) never cease to floor me. Like almost everybody else, I have a real problem in accepting the perceived value of (for example) a pair of speaker cables as being on the same level as the loudspeakers to which they are connected, but I must disclose that my B&W 802 Diamond loudspeakers are connected using a pair of Transparent Audio Reference interconnects which cost very nearly the same price. And dammit, those Transparents really do make the Diamonds sing!

Sunday 19 May 2013

e-mail up and running again!

We have had a problem with our e-mail over the last few days, but this appears to be resolved now. I apologize if you have had problems contacting us.

Thursday 16 May 2013

iTunes 11.0.3

If you have already installed iTunes 11.0.3 then you have two options.

Option 1 is to use BitPerfect in "Minimize iTunes" mode. We have tried that here for a couple of hours and it seems to work.

Option 2 is to uninstall iTunes and re-install iTunes 11.0.2 - here are some instructions:
http://osxdaily.com/2012/02/06/delete-itunes-mac-os-x/
You will have to download iTunes 11.0.2 from here:
http://mac.filehorse.com/download-itunes/

After re-installing iTunes 11.0.2 it may show an error message reading one of iTunes' ".itl" files. Ideally, you will be able to replace this file with an older version that you have previously backed up using Time Machine. Failing that, the solution is to re-import your music after telling iTunes not to organize and not to copy files (don't forget to re-set the organize/copy settings back to what they were afterwards).

DO NOT INSTALL iTunes 11.0.3

DO NOT INSTALL iTunes 11.0.3 yet! I just installed it and it does not appear to function correctly with BitPerfect. I need to do more testing to confirm. I will post an update ASAP.

Tuesday 7 May 2013

Digital Clipping

Clipping is a term that originated with analog audio, and refers to the situation where the magnitude of the signal rises to a level that is larger than the ability of the medium to store it, or the electronics to deliver it. For example, with magnetic tape, the signal is stored by embedding a magnetic signal onto the tape. But if the musical signal gets beyond a certain level, the tape will not have the magnetic capacity to store a large enough magnetic signal. Same thing with an amplifier – if you amplify a signal enough, you will eventually run out of voltage (or current) at the output.

With magnetic tape, as well as – generally speaking – with good old-fashioned vacuum tube amplifiers, when the signal level approaches and exceeds the maximum the system was designed to handle, the musical peaks get gradually compressed, so gradually in fact that for the most part you don’t notice it happening. This so-called “soft clipping” meant that, for the most part, clipping was not the most crucial sonically degrading issue faced by early audio designers.

This all changed with the advent of solid-state electronics. Your typical transistor amplifier does not soft-clip. It hard-clips. This means that when it tries to deliver an output voltage larger than the maximum it was designed for, the output voltage just sits at the maximum value and stays there until the output signal drops below that maximum value. The peak of the signal is just wiped out, and the signal waveform develops a flat-topped appearance everywhere this hard-clip occurs. Imagine Shaquille O’Neal walking through your front door, and instead of gracefully ducking to avoid bumping his head, the door simply chops his head off. The effect on the music is similarly messy.
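For the curious, here is a minimal Python sketch of the two behaviors. The hard clipper does exactly what the paragraph above describes. The soft clipper uses a tanh curve, which is only an illustrative stand-in of my choosing and not a model of any particular tape or tube circuit.

```python
import math

def hard_clip(x, limit=1.0):
    """Flat-top the signal at +/- limit, as a transistor stage would."""
    return max(-limit, min(limit, x))

def soft_clip(x, limit=1.0):
    """Compress peaks gradually; tanh is an illustrative stand-in curve."""
    return limit * math.tanh(x / limit)

# One cycle of a sine wave driven to 1.5x the maximum level.
wave = [1.5 * math.sin(2 * math.pi * n / 32) for n in range(32)]
hard = [hard_clip(s) for s in wave]
soft = [soft_clip(s) for s in wave]

flat_top = sum(1 for s in hard if s == 1.0)
print("samples pinned at the maximum:", flat_top)
print("largest soft-clipped sample: %.3f" % max(soft))
```

The hard-clipped wave sits at exactly the maximum for a run of consecutive samples (the flat top), whereas the soft-clipped wave bends towards the maximum without ever sitting on it.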

In digital audio, the effect of clipping can actually be even worse! Let's look at what happens when a signal is clipped. The easiest way to do that is to consider the clipping as being an error signal which is added to the music signal. This error signal comprises nothing but the peaks that got chopped off. If we analyze this signal, we find that it has frequency components which extend from within the audio bandwidth (which is considered to be about 16Hz – 20,000Hz) on up into frequency ranges above the audio bandwidth. In analog space, we can generally just ignore any components above the audio bandwidth because we can’t hear them anyway. But in digital audio we can’t do that.

Typical digital audio has a sampling frequency of 44,100Hz, the standard developed for the Compact Disc. There is a firm and fixed mathematical law that says if we want to sample a waveform at a certain frequency, then we have to make sure that the waveform contains no frequencies above exactly one half of the sampling frequency. This frequency is termed the “Nyquist” frequency. For CD, that means it has to have no content at any frequency above 22,050Hz. What happens if you try and encode a signal at, say, “N” Hz ABOVE the Nyquist frequency? What you find is that the result you get is EXACTLY THE SAME as you would have got if instead the signal was “N” Hz BELOW the Nyquist frequency. When you play back this signal, it is not the original high frequencies you will hear, but the "bogus" lower ones. This effect is called mirroring (more widely known as aliasing), and is a very audibly destructive artifact. It explains why the original analog signal has to be very tightly filtered prior to being sampled, to eliminate all traces of any frequency components above the Nyquist frequency.
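You can check the "exactly the same" claim for yourself with a few lines of Python. This is a minimal sketch: the 1,000Hz offset and the 64-sample window are arbitrary choices of mine, and I use cosine waves because their samples match exactly (sine waves come out identical apart from inverted polarity).

```python
import math

FS = 44100            # CD sampling frequency, Hz
NYQUIST = FS / 2      # 22,050 Hz

def sample_cosine(freq_hz, count=64):
    """Sample a unit-amplitude cosine at FS and return the sample values."""
    return [math.cos(2 * math.pi * freq_hz * n / FS) for n in range(count)]

offset = 1000                              # the "N" Hz of the text
above = sample_cosine(NYQUIST + offset)    # 23,050 Hz, above Nyquist
below = sample_cosine(NYQUIST - offset)    # 21,050 Hz, below Nyquist

worst = max(abs(a - b) for a, b in zip(above, below))
print("largest difference between the two sample sets:", worst)
```

The difference is nothing more than floating point rounding error. Once sampled, the two tones are indistinguishable.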

Back to clipping. If you take a perfectly good signal in the digital domain, and perform some signal processing on it, then the possibility generally exists that the resultant signal will contain peaks that are above the maximum value that can be represented by the digital encoding system. What do you do with those peaks? The easiest thing is to “clip” them at the digital maximum, so that just as with analog clipping in a solid-state amplifier, each sample that works out to be above the digital maximum is encoded as a digital maximum. You will have, in effect, encoded a waveform containing frequency components above the Nyquist frequency. When you play back that signal, those otherwise inaudible components will be recreated as audible components at corresponding frequencies below the Nyquist frequency. This will sound even worse than hard-clipping in an amplifier.
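As a rough worked example, consider a pure tone that gets clipped symmetrically, which adds odd harmonics of the tone. The sketch below works out where each harmonic lands after folding about the Nyquist frequency. The 7,000Hz test tone and the function name are my own illustrative choices.

```python
def alias_frequency(freq_hz, fs_hz=44100):
    """Frequency at which a tone of freq_hz shows up after sampling at fs_hz."""
    folded = freq_hz % fs_hz
    return fs_hz - folded if folded > fs_hz / 2 else folded

fundamental = 7000  # Hz, an arbitrary example tone
for k in (3, 5, 7):
    harmonic = k * fundamental
    print(harmonic, "Hz ->", alias_frequency(harmonic), "Hz")
```

The third harmonic at 21,000Hz is below the Nyquist frequency and stays put. The fifth and seventh, at 35,000Hz and 49,000Hz, would be inaudible in an analog system, but here they fold down to 9,100Hz and 4,900Hz, right in the middle of the audio band and musically unrelated to the original tone.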

The solution is to use mathematics to “re-shape” the portion of the signal that is being driven into clipping, in such a way as to remove all of the unwanted high-frequency components. Of course, there will be a sonic price to pay, even for this. But once you have driven the signal into overload in the first place, there is no escaping without some sort of penalty.

This sort of situation arises in general with any form of signal processing, but "mirroring" is most commonly encountered when down-sampling from a higher sample rate to a lower one, particularly one derived from a DSD source which has (by design) a lot of high-frequency noise. In general, you have to assume that the higher-rate-sampled “source” data can contain frequency components anywhere below its own Nyquist frequency. But some of those frequencies can still be higher than the Nyquist frequency of the lower sample rate which is the “target” of the conversion. So, unless you absolutely know for a certainty that the “source” material contains no frequency content above the Nyquist frequency of the “target”, then your downsampling process needs to incorporate an appropriately designed low-pass digital filter.
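Here is a minimal sketch of such a conversion, from 88,200Hz down to 44,100Hz. The filter is a simple 101-tap windowed-sinc design with a Hann window and a 20,000Hz cutoff. Those are illustrative choices on my part, and a production-quality converter would use a far more carefully designed filter.

```python
import math

def lowpass_taps(cutoff_hz, fs_hz, num_taps=101):
    """Windowed-sinc (Hann) low-pass FIR taps with unity gain at DC."""
    fc = cutoff_hz / fs_hz                 # cutoff in cycles per sample
    mid = (num_taps - 1) / 2
    taps = []
    for n in range(num_taps):
        t = n - mid
        ideal = 2 * fc if t == 0 else math.sin(2 * math.pi * fc * t) / (math.pi * t)
        window = 0.5 - 0.5 * math.cos(2 * math.pi * n / (num_taps - 1))
        taps.append(ideal * window)
    total = sum(taps)
    return [h / total for h in taps]

def downsample_by_2(samples, taps):
    """Low-pass filter, then keep every second sample."""
    out = []
    for i in range(len(taps) - 1, len(samples), 2):
        out.append(sum(h * samples[i - k] for k, h in enumerate(taps)))
    return out

def tone(freq_hz, fs_hz, count):
    """Unit-amplitude sine test tone."""
    return [math.sin(2 * math.pi * freq_hz * n / fs_hz) for n in range(count)]

SOURCE_FS = 88200                          # "source" rate; "target" is 44,100 Hz
taps = lowpass_taps(20000, SOURCE_FS)      # pass the audio band, stop the rest

kept = downsample_by_2(tone(1000, SOURCE_FS, 2000), taps)
removed = downsample_by_2(tone(30000, SOURCE_FS, 2000), taps)
print("1 kHz peak after conversion:  %.3f" % max(abs(s) for s in kept))
print("30 kHz peak after conversion: %.3f" % max(abs(s) for s in removed))
```

The 1kHz tone comes through essentially untouched. The 30kHz tone, which lies above the "target" Nyquist frequency and would otherwise mirror down to 14,100Hz, is all but eliminated before the samples are thrown away.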

Monday 6 May 2013

Two’s Complement

Most of us understand that PCM audio data “samples” (measures) the music signal many times a second (44,100 times a second for a CD) and stores the result in a number. For a CD this number is a 16-bit number. A 16-bit number can take on whole-number values anywhere between 0 and 65,535. Whole-number values means it can take on the values such as 27,995 and 13,288. But it cannot take on values such as 1.316 or 377½. Whilst this works just fine, recall that the music waveforms that we are trying to measure swing from positive to negative, and not from zero to a positive number. But it turns out we can work around that. You see, an interesting property of binary numbers and the inner workings of computers can be brought to bear.

A 16-bit number is just a string of 16 digits which can be either one or zero. Here is an example, the number 13,244 expressed in ordinary 16-bit binary form: 0011001110111100. (I hope I don’t need to explain binary numbers to you.) If this were all zeros it would represent 0, and if it were all ones, it would represent 65,535. But there are actually different ways in which to interpret a sequence of 16 binary digits, and one of these is called “Twos Complement”.
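If you want to check the arithmetic, Python's built-in formatting will do the conversion in both directions:

```python
value = 13244
bits = format(value, "016b")      # 16 binary digits, zero-padded
print(bits)                       # 0011001110111100
print(int(bits, 2))               # 13244
print(int("1" * 16, 2))           # 65535
```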

Before going into that, I want to talk about 15-bit numbers. A 15-bit number can take on values between 0 and 32,767. Wouldn’t it be nice if we could encode our music as one 15-bit number representing 0 to +32,767 for all those times when the musical waveform swings positive, and another 15-bit number representing 0 to -32,767 for all those times when the musical waveform swings negative? In fact, we can do that very easily. We take a 16-bit number, and reserve one of the bits (say, the most significant bit) to read 0 to represent a positive number and 1 to represent a negative number, and use the remaining 15 bits to say how positive (or negative) it is! Are you with me so far?

We need to make one small modification. Both the positive and the negative swings encode the value zero. We can’t have two different numbers both representing the same value, so we need to fix that. What we do is we say that the negative waveform swings encode the numbers -1 to -32,768 so that the value zero is only encoded as part of the positive waveform swing. So now we have a system where we can encode the values from -32,768 to +32,767 which makes us very happy.

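Let's do a simple thing. Take each of our numbers from -32,768 to +32,767 and add 32,768 to them. We end up with numbers that range from 0 to 65,535. This is our original 16-bit number! What we have done, in a roundabout way, is to create a close cousin of the “Twos Complement” of our 16-bit number, known as offset binary. Twos Complement proper leaves the positive numbers as they are and adds 65,536 to the negative ones, which produces the same bit pattern with the most significant bit flipped. Either way, we can express 16-bit data in a form that covers both positive and negative values.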
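A short Python sketch makes the encoding concrete. It shows the conventional Twos Complement pattern, in which a negative value is stored as that value plus 65,536, alongside the "add 32,768" mapping. The two differ only in the most significant bit. The function names are my own.

```python
def to_twos_complement(value, bits=16):
    """Bit pattern (as an unsigned integer) that stores a signed value."""
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    if not lo <= value <= hi:
        raise ValueError("value does not fit in %d bits" % bits)
    return value & ((1 << bits) - 1)    # same as value + 2**bits when negative

def from_twos_complement(pattern, bits=16):
    """Signed value stored in an unsigned bit pattern."""
    sign_bit = 1 << (bits - 1)
    return pattern - (1 << bits) if pattern & sign_bit else pattern

def to_offset_binary(value, bits=16):
    """The 'add 32,768' mapping: signed range shifted up to start at zero."""
    return value + (1 << (bits - 1))

for v in (-32768, -1, 0, 1, 32767):
    tc = to_twos_complement(v)
    ob = to_offset_binary(v)
    print("%6d  twos=%s  offset=%s" % (v, format(tc, "016b"), format(ob, "016b")))
```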

It turns out that this makes computers very happy as well, because numbers represented as twos complement respond identically to the arithmetic operations of addition, subtraction, and multiplication. So we can manipulate them in exactly the same way as we do regular integers. In fact, twos complement representation is so inherently useful to computers that they use an even more friendly term for them – Signed Integers.
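Here is a small demonstration of that point. The bit patterns are added and multiplied as if they were ordinary unsigned numbers, keeping only the low 16 bits as 16-bit hardware would, and the results still decode to the correct signed answers. The helper names and the two test values are mine.

```python
MASK = 0xFFFF  # keep only the low 16 bits, as 16-bit hardware would

def encode(value):
    """Signed value -> 16-bit Twos Complement pattern."""
    return value & MASK

def decode(pattern):
    """16-bit Twos Complement pattern -> signed value."""
    return pattern - 0x10000 if pattern & 0x8000 else pattern

a, b = -120, 45
total = (encode(a) + encode(b)) & MASK       # plain unsigned addition
product = (encode(a) * encode(b)) & MASK     # plain unsigned multiplication

print(decode(total))     # -75
print(decode(product))   # -5400
```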

Twos Complement (or Signed Integer) representation is such a huge convenience for computer audio, that most audio processing uses this representation. Amongst other things, simple signal processing functions like Digital Volume Control are more efficient to code with Signed Integers.

There is one thing to bear in mind, though, and it catches a lot of people out. Recall that the negative swing encodes a higher maximum number than the positive swing. Here I am going to shift the discussion from the illustrative example of 16-bit numbers to the more general case of N-bit numbers. The largest negative swing that can be encoded is 2^(N-1) whereas the largest positive swing that can be encoded is 2^(N-1) - 1. Where this becomes important is to note that the ratio between the two is not constant, and depends on N, the bit depth. This comes into play if you are designing a D-to-A Converter with separate DACs for the negative and positive voltage swings. You need to design it such that the negative and positive sides both reach the same peak output with an input signal of 2^(N-1), while recognizing that the positive side can never see it in practice, since it should only ever receive a maximum signal of 2^(N-1) - 1. If it ever receives a signal of 2^(N-1) this would indicate an error in its internal processing algorithms.
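To see how the asymmetry varies with bit depth, this short sketch prints the two limits and their ratio for a few common values of N:

```python
def signed_range(bits):
    """Most negative and most positive values of an N-bit signed integer."""
    return -(1 << (bits - 1)), (1 << (bits - 1)) - 1

for bits in (8, 16, 24):
    lowest, highest = signed_range(bits)
    ratio = highest / -lowest
    print("%2d bits: %9d to %8d, positive/negative ratio %.8f"
          % (bits, lowest, highest, ratio))
```

The ratio creeps closer to 1 as the bit depth increases, but it never gets there.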

Similar considerations exist when normalizing the output of a DSP stage (which should properly be in floating point format) for rendering to integer format. The processed floating point data is typically normalized to ±1.0000 and it would be an error to map this to ±2^(N-1) in Twos Complement integer space, because this would result in clipping of the positive voltage swing at its peak. Instead it must be mapped to ±(2^(N-1) - 1).
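A minimal sketch of that mapping follows. The function name, and the decision to clamp anything beyond ±1.0, are my own assumptions for the sake of the example.

```python
def float_to_pcm(sample, bits=16):
    """Map a float in [-1.0, +1.0] to a signed integer of the given depth."""
    full_scale = (1 << (bits - 1)) - 1        # 32,767 for 16 bits
    clamped = max(-1.0, min(1.0, sample))     # guard against overs
    return int(round(clamped * full_scale))

naive_peak = int(round(1.0 * (1 << 15)))      # scaling by 2^(N-1) instead
print(float_to_pcm(1.0), float_to_pcm(-1.0))  # 32767 -32767
print(naive_peak > 32767)                     # True: +32768 does not fit
```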

Such things make a difference when you operate at the cutting edge of sound quality.

Friday 3 May 2013

“Pure” PCM and “Pure” DSD

I wrote a little while ago about the relationship between DSD and PCM – how DSD is a specific implementation of SDM (Sigma-Delta Modulation), and how both ADCs and DACs for PCM are built around SDM engines. I also wrote about the algorithms that convert data between the two formats, and how the conversions are not entirely lossless.

The picture I left you with was that – for all practical purposes – there is no such thing as “Pure” PCM. Any PCM music data is derived from SDM at some point in its creation, and has therefore undergone at least one conversion sequence.

I want to ramble further on DSD, and whether music stored and replayed in the DSD format can be any more “Pure”. The problem is the inescapable fact that – until somebody comes up with a truly significant breakthrough – you cannot “edit” music in the SDM domain. This has huge ramifications for the recording industry, where recording, mixing, and mastering can involve very profound manipulation of the music. Indeed something as simple as volume control – a fade-out, for example – cannot be done in the SDM format. And the recording industry routinely employs way more elaborate effects that would make your hair curl (have you noticed how many recording artists have unnaturally curly hair?...).

To my knowledge, there are only two studio-grade recording desks out there capable of producing commercial DSD recordings – Sonoma and Pyramix. Of the two, Sonoma is the older, and less functional. Sonoma drops the signal out to PCM for fade-in and fade-out, but apart from that offers no sound manipulation capability. Pyramix is modern and quite progressive, but it does all its mixing in “DXD” which is 24-bit 352.8kHz PCM, so all Pyramix DSD is derived from what are essentially DXD masters.

Is there any such thing as a pure DSD recording? Well, yes, there is, but you have to restrict yourself to transcriptions of analog tape, where no further audio processing is required. To be fair, there is a fair amount of archival material out there which could benefit greatly from transcription to DSD for re-release. But, for new recordings, you would need to record to analog tape and mix on an analog deck if you wanted to create true DSD recordings. Some boutique studios do follow this approach.

And as far as it goes, that sounds all well and good. But then I came across an interesting paragraph in a 10-year old technical paper from Philips in Holland (who, together with Sony, were the driving force behind SACD). Here they talk about the typical “DAC” configuration used in a SACD player, and I was rather surprised to read it. According to this paper, the low-pass analog filters that are required to convert pure DSD to analog do not possess the impulse response characteristics they consider to be necessary for high-end audio performance. However, digital low-pass filters are more than up to the task. Therefore, the first thing the DAC does is to use a digital low-pass filter to convert the DSD to a 2.822MHz multi-bit PCM signal. This PCM signal is then fed into a SDM to generate a multi-bit (typically between 3 and 5 bits) SDM signal at 5.6MHz or even 11.3MHz. This multi-bit SDM can finally be passed through a low-pass analog filter without having to sacrifice the impulse response characteristic.
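As an aside, the sample rates mentioned in this post are all whole multiples of the CD rate, as this little sketch shows. The multipliers of 8, 64, 128 and 256 are the conventional ones. I am assuming that the 5.6MHz and 11.3MHz figures are rounded versions of the last two.

```python
BASE_HZ = 44100  # CD sampling frequency

rates = {
    "DXD (24-bit PCM)": 8 * BASE_HZ,
    "DSD, 1-bit": 64 * BASE_HZ,
    "multi-bit SDM, 2x DSD rate": 128 * BASE_HZ,
    "multi-bit SDM, 4x DSD rate": 256 * BASE_HZ,
}
for name, hz in rates.items():
    print("%-28s %10d Hz  (%.4f MHz)" % (name, hz, hz / 1e6))
```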

So, who’d-a thunk it? DSD gets converted to PCM and back again in the DAC of a SACD player! It would be interesting to find out whether modern DSD DACs utilize a similar approach. If so, then arguably, as well as there being no such thing as “Pure” PCM, there could be no such thing as “Pure” DSD either!

Thursday 2 May 2013

DSD and Analog

There is a lot of buzz about DSD these days, and how it sounds more like Analog than PCM does. Well that’s just opinion of course, but a lot of people seem to be lining up behind that idea. So here’s a tidbit just to get you thinking...

What exactly is Analog anyway? Let's look at an analog electrical waveform. When we transmit an electrical signal from one place to another (such as from an amplifier to a loudspeaker, or from one place on a circuit board to another), that transmission takes place in the form of electrical current flow. The variations in the magnitude of the current flow represent the information that is being transmitted. When we play music through an electronic device, the variations in current flow, backwards and forwards, represent exactly the variations in air pressure, up and down, that we perceive as sound. In a loudspeaker these variations in current are converted into corresponding variations in air pressure, and thus we get to hear the music.

Electrical current flow is carried by fundamental particles called electrons. These, together with protons and neutrons, are the building blocks that together comprise atoms. Electrons carry an “electric charge”, just like balloons do when you rub them against a nylon sweater. But unlike the balloon, an electron’s electric charge is built right into it. Electrons cannot acquire or lose their electric charge; they always have the same amount of charge and they keep it permanently. So when electrons flow along a metal wire, for example, their electric charge is also flowing along the wire. It is this flow of electric charge that we define as an Electric Current. The more charge that flows along the wire, the bigger the current.

Now, it is a very interesting property of electrons that the amount of electric charge that each one carries is a fixed and very precise amount. There is absolutely no variation whatsoever – not even the smallest imaginable amount – in the magnitude of the charge on an electron. Every single one is absolutely identical. Electrical charge can only exist in amounts which are whole-number multiples of this fundamental quantity. This has an interesting consequence on current flow. An electrical current measured in Amperes is, to all intents and purposes, a measure of the number of electrons flowing through the wire in question. And that number is a whole number – there is no such thing as “part of” an electron. Either the whole electron has flowed through the wire or none of it has. Current flow therefore has a fundamental “granularity” to it.

Can we detect this granularity? Yes we can, but in ordinary electronic circuits the answer is no. The number of electrons that flow down a wire per second when it is carrying a current of one Ampere is so big that, if you wrote it down, it would have nineteen digits. Try writing down a 19-digit number and see if you can make sense of it! (During the Iraq war, Condoleezza Rice enters the Oval Office and informs George Bush that two Brazilian soldiers were killed in Baghdad that morning. President Bush instantly turns an ashen shade of grey and slumps into his office chair, holding his head between his hands. Finally he looks up through red eyes and asks, “Just how many, exactly, is a bazillion?...”).
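If you would rather let a computer write the number down for you, one Ampere is one Coulomb of charge per second, so the electron count is simply one divided by the charge on a single electron. The result is only as precise as floating point arithmetic allows, so treat the trailing digits as noise.

```python
ELEMENTARY_CHARGE_C = 1.602176634e-19   # coulombs per electron (SI definition)

electrons_per_second = int(1.0 / ELEMENTARY_CHARGE_C)   # at one ampere
print(electrons_per_second)
print(len(str(electrons_per_second)), "digits")
```

That works out to a little over six billion billion electrons every second.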

But the concept does have some relevance to the line of thought that caused me to write this post. Because any electrical current flow comprises a discrete flow of electrons, I can represent that current flow by recording when each and every electron arrives at the detection point. Since an electron has either arrived or it hasn’t, I can indicate this by writing “1” every time an electron arrives, and “0” every time an electron doesn’t arrive! If the current flow is large, I will be writing more 1’s than 0’s and vice-versa if the current flow is small.

Strange thing, but isn’t that exactly how DSD works?...
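To make the analogy concrete, here is a toy encoder in Python. It is only a first-order accumulator of my own devising that works on levels between 0 and 1. Real DSD uses far more sophisticated high-order modulators running at 2.8224MHz, so take this as a sketch of the idea and nothing more.

```python
def one_bit_stream(levels):
    """First-order sigma-delta style encoder for levels between 0.0 and 1.0.

    Emits 1 whenever the running total of the input reaches one whole
    unit (an "electron" arrives) and 0 otherwise.
    """
    bits = []
    accumulator = 0.0
    for level in levels:
        accumulator += level
        if accumulator >= 1.0:
            bits.append(1)
            accumulator -= 1.0
        else:
            bits.append(0)
    return bits

large = one_bit_stream([0.75] * 1000)   # large current flow
small = one_bit_stream([0.25] * 1000)   # small current flow
print("ones at high level:", sum(large))
print("ones at low level: ", sum(small))
```

The larger level produces three times as many 1's as the smaller one. The density of 1's in the stream tracks the size of the signal, which is the essence of the thing.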

Wednesday 1 May 2013

Ripping your CD Collection - VI. Classical Music Metadata

This is the last part of my series on ripping your CD collection. I hope you have found it useful. This part deals with the thorny problems faced by collectors of classical music, who face additional frustrations when it comes to managing a computerized music collection. A warning straight off the top – there are no definitive answers for this!

Rock, pop, and jazz albums are generally conceived right from the word go as albums. The artist goes into the studio with the objective of recording an album. The record company signs the artist up to a deal that is usually expressed in terms of so many albums. An advance is paid out to the artist to prepare an album. And as a rule, record companies, studios, artists, and consumers all have converging expectations as to what an “Album” means.

Classical music is different. Take for example the CD of Dvorak’s 6th symphony I have in front of me. The performance is about 40 minutes long. Most record companies don’t want to short-change their customers by selling them a 40-minute CD (of course, 40-minute LPs were not at all unusual!), and customers don’t like to be short-changed by buying a 40-minute CD. So what usually happens is some other piece of filler gets put on the disk as a companion piece. Maybe a piece by Dvorak, maybe by someone else. In my case in point it is a piece by the obscure Vítezslav Novak called “Eternal Longing”.

Most classical albums don’t have a formal name. They are often content to just list the pieces they contain on the front cover. In my example the disk says on the cover “DVORAK Symphony No 6, NOVAK Eternal longing, BBC SO/Jiri Belohlavek”. If I enter that as the album name then it (a) will be a large mouthful; (b) will probably not fit into the space allocated to display the album name; and (c) will be lost amongst all the other classical albums whose names are equally clumsy. In my case, the standard I have adopted leads me to store this album as “Dvorak – Symphony No 6 (Belohlavek)”. FYI, freedb recognizes this disc, not unreasonably, as "Dvorak, Symphony No. 6 - Novak, Eternal Longing". An option, and one that I have experimented with, is to store the two works as two virtual albums, one for each piece, with both showing the same Album Art.

That example was not too much of a challenge. How about this one from Deutsche Grammophon? (The great Teutonic classical music label loves this sort of thing.) There are two major pieces on this disc, each is equally prominent, and the common theme is the Cello. The two pieces are “DVORAK: CELLO CONCERTO IN B MINOR” (DG loves to put its titles in upper case) and “TCHAIKOVSKY: VARIATIONS ON A ROCOCO THEME, OP 33”. First of all, what title do I provide for this “Album”? You could do what I did: “Dvorak & Tchaikovsky – Cello Concertos” (yes, I know …), or you could find some other bastardization, or even just replicate in its entirety the original DG mouthful. The main point is, suppose you wanted to see if you had one or other of those pieces in your very large album collection (which you inherited from a late relative, for example, so you don’t know for sure everything that’s in it). What type of search through your database would properly identify it? There really is absolutely no answer to this problem. Metadata standards have evolved in a fashion that is extremely unhelpful to classical music enthusiasts.
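For what it is worth, here is a sketch of the kind of search I would like to be able to run. It assumes that every track has been hand-groomed with a composer and a work field, which is exactly the part that no standard guarantees. The field names and the little library are purely hypothetical.

```python
# Hypothetical track records; the field names are illustrative, not a standard.
library = [
    {"album": "Dvorak & Tchaikovsky – Cello Concertos",
     "composer": "Dvorak", "work": "Cello Concerto in B minor"},
    {"album": "Dvorak & Tchaikovsky – Cello Concertos",
     "composer": "Tchaikovsky", "work": "Variations on a Rococo Theme, Op 33"},
    {"album": "Dvorak – Symphony No 6 (Belohlavek)",
     "composer": "Dvorak", "work": "Symphony No 6"},
    {"album": "Dvorak – Symphony No 6 (Belohlavek)",
     "composer": "Novak", "work": "Eternal Longing"},
]

def find_work(tracks, composer, phrase):
    """Albums containing a work by `composer` whose title includes `phrase`."""
    return sorted({t["album"] for t in tracks
                   if t["composer"].lower() == composer.lower()
                   and phrase.lower() in t["work"].lower()})

print(find_work(library, "Tchaikovsky", "rococo"))
print(find_work(library, "Novak", "eternal longing"))
```

The search only works because the information lives in the individual tracks and not in the album title. Unfortunately, the moment one disc says "Dvorak" and another says "Dvořák", you are back to square one.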

Here is another unwelcome wrinkle. When a record company finally decides to make its music available for purchase by download, the way they do this is that they generally retain the services of an “Aggregator”. The aggregator does various things, but one of them is to embed the metadata. Since there are no firm standards for doing this with classical music, you get all sorts of inconsistencies. One common one is who is the Artist? Or more to the point, whose name gets put in the Artist field? Sometimes you will find the conductor’s name in there. Other times the name of the orchestra is used (Is an orchestra actually an Artist? If not, what is it? Most metadata standards now recognize “Ensemble” as a Field, but I have not yet seen it implemented in any of the mainstream music player Apps). Yet other times it is the composer’s name that appears in the Artist field, even though the composer already has his own unambiguous Field! When something as simple as this can get screwed up through ambiguity, you know you have a problem.

And another one! Opera. On what basis do you determine whose names go into the Artist fields on an operatic recording? A cast can have over a dozen listed performers. Do you embed all of their names, or just a selection? Does everybody’s name get embedded in every track, or do we just include those Artists who perform on the individual tracks? (Some people might prefer to see that, but it represents a Herculean metadata grooming task.)

Classical music listeners are left with the short end of the stick. The existing metadata standards just don’t serve their needs at all well. Each individual has a choice to make as to how far, and in what direction, to bend it in order to make it fit. There is no one solution that will meet everybody’s needs. To make one will require someone with enough clout to force all of the stakeholders to line up behind them, and I don’t see anyone who has the combination of ideas, motivation, and (most important) resources to take that on.

Back to Part V.