Monday, 20 April 2015

PERSPECTIVE: Poll - Upgraded USB and Sound?

I enjoy visiting the Steve Hoffman Forums for perspective. I think it's probably one of the most balanced places to hang out at. Of course, as in any audio forum, there will be a number of heated arguments here and there, but overall, it's great to see a place where heterogeneous music lovers of various experience levels and beliefs can congregate, share tips, and get advice... It's certainly one of the more vibrant audio communities out there!

Recently there was an interesting poll on whether USB cable upgrades can improve a system's sound quality. Titled "The Great USB cable debate poll" (full disclosure, I took part and posted a comment as well), it ran from February 22 to March 8th before it was closed. In total, it received 415 votes, gathered 36 pages of responses and the outcome was:

About 1/4 felt that upgraded USB cables make a difference, and 3/4 did not. Surprised?

Of course, any poll must be viewed in the context of the respondents. This one was conducted in the "Audio Hardware" forum and like I said, I think the denizens of this site do capture a broader group of audiophiles and music lovers than most places; from pros to reviewers to the general music enthusiast. Topics range from people posting pictures of their audio room, to discussion of LPs and turntables, to opinions on the latest DACs, computer servers, speakers, headphones...

I certainly do not expect we should put full faith in something or other based on general consensus. In areas we are passionate about, of course we should figure things out ourselves and come to our own conclusions... Once in a while, however, it is nice to see a poll such as this to get a sense of what the "cohort" thinks. As an audiophile who reads the usual magazines, one might get the impression that there's almost 100% acceptance that upgrading USB cables for sound quality is a "given" within this hobby. Likewise, some might pigeon-hole all "audiophiles" as "audiophools" spending all kinds of money on questionable claims. IMO both perspectives would be inaccurate.

I think it's fair to recognize that most audio lovers are reasonable people, even though the most fringe and outlandish voices sometimes appear to be the mouthpieces of this hobby.

Hmmm, I wonder what the result would be if we asked "Can upgraded ethernet cables at various price points improve the sound of your system?"


Okay, just a quick post here today as I'm working behind the scenes to get another blind test going... More information in the next week or so I hope!

Have a great week.

Saturday, 11 April 2015

ANALYSIS: DSD-to-PCM 2015 - foobar SACD Plug-In, AuI ConverteR, noise & impulse response...

Noise characteristics of PCM vs. DSD - image found here.
In my post last week looking at the various DSD-to-PCM converters, Solderdude Frans made a good suggestion... Let's have a look at the newer SACD plugin which has superseded the DSDIFF plug-in as the converter of choice these days for foobar. Also, it was suggested by Yuri Korzunov, the author of AuI ConverteR 48x44 to have a look at his converter package as well.

I. DSD-to-PCM - foobar SACD plug-in & AuI ConverteR

So, I downloaded the newest SACD plug-in currently at version 0.7.7 dated 2015/03/16. I deleted the DSDIFF plug-in from my computer so there are no interactions, and installed the new files.

Notice that the SACD plugin has a configuration panel for settings:

Because this plug-in does not directly output to 24/96, I figure let's try with the highest output (352.8kHz) and I will use the best samplerate converter (SRC) I have (the excellent iZotope RX 4) to bring it down to 24/96 for analysis as before. Here are the settings I used in iZotope RX 4:
Sharpest "max" filter for 24/96 in iZotope RX 4, linear phase without suppression of pre-ringing - nothing fancy...

The other parameter we can play with in the SACD plug-in is the DSD2PCM mathematics setting. By default, it's the standard fixed-point integer mode. Let us also analyze the result from the highest precision mode "Multistage (Double Precision)".

As well, I downloaded the AuI ConverteR 48x44 software. The current demo is version 4.1.20. Other than setting the output for WAV 24/96, I left the rest of the settings to default.

Using the exact same procedure as last week, here's a summary of what I got:

Interesting... It looks like the SACD plug-in is actually about the same as the old DSDIFF 1.4 (<1dB difference) in terms of noise level and dynamic range. Notice, just like last week's results from XLD, that going from fixed-point to double-precision floating-point calculations made no difference here.

AuI ConverteR resulted in some impressive numbers! Let's see what it's doing in detail...

As you can see, AuI ConverteR is using a sharp "brick-wall" filter right at ~20kHz to remove essentially everything after 20kHz. As such, have a look at what it does with the noise profile:

Wow. That is an impressively sharp, precise filter at 20kHz! I can approximate that effect with iZotope RX 4's EQ plug-in with a low-pass at 20kHz, high Q of 25 or so (not shown) but AuI ConverteR looks even cleaner with less noise floor irregularity.

That little bit of high-frequency "rippling" with DSDIFF is probably a result of the resampling algorithm. Otherwise, foobar DSDIFF and the newer SACD plug-ins appear very similar.

Basically, this is what we can say at this point...

1. foobar SACD plug-in works about the same as the old DSDIFF plug-in. I would not be surprised if the algorithm (DSD2PCM) is essentially the same if we look "under the hood".

2. The AuI ConverteR software puts up some impressive numbers. This is done with a very strong low-pass filter. If you feel there is no need to retain frequencies >20kHz, then this will clearly get the job done.

II. All that noise!

But wait, there's more! SACD Plug-in also has a 30kHz lowpass mode - "Direct - (Double Precision, 30kHz LF)". Hmmm, I wonder how that looks?

Engaging the 30kHz lowpass mode really resulted in a step down in calculated accuracy. Here's a look at the graphs:

Indeed, the 30kHz low-pass filter is doing the job (yellow).

We can see the effect of that 30kHz filter on the noise floor... Certainly not the prettiest filter out there! Realize that although the differences are there in these graphs with a synthetic test signal, we're talking about noise down below -150dB (below 20kHz). It's just not an issue in terms of audibility.

Now, let us see if we can do it better by using iZotope RX 4 to do the 30kHz low-pass filtering instead of the algorithm used by SACD plug-in. Here's a simple setting:

Low-pass filter at 30kHz as seen, Q = 5.0 (not too steep), linear-phase FIR with FFT size of 32k.


This is what a good low-pass filter can do for the results. As you can see, when we use the 30kHz iZotope low-pass filtering, the calculated noise level drops substantially on the RightMark analysis... This also tells us that the reason Saracon and JRiver measured so well is because they implement very good quality noise filtering algorithms beyond just the DSD-to-PCM calculations.

The SACD Plug-in's DSD2PCM algorithm is excellent, and when we pair it up with iZotope's SRC and 30kHz low-pass filtering, we get some fantastic results easily on par to what Saracon does:

III. Impulse Response and DSD-to-PCM Converters

Despite the inherent noise in DSD, we can drop overall noise levels substantially with a good low-pass filter. In fact, since a picture is worth a thousand words, this is what a 15kHz (-12dBFS) sine wave looks like comparing the unfiltered foobar SACD plug-in output at 352kHz with the 30kHz iZotope RX 4 low-pass filtering (again, this is with Saracon as the encoding software for PCM-to-DSD):

This is what all that extra high-frequency noise looks like in DSD when you don't filter it out at all. Notice that DSD128 is significantly less noisy. The question is, just how much noise reduction should we actually do? (You can also see the noise through an analogue oscilloscope - as shown here.)

As noted by Juergen in the comments to the previous post, there is this matter about time-domain behaviour as well which can be skewed as we apply various filters.

Let's see what a 24/96 impulse looks like after going through the DSD encoder [Saracon] and most of the decoders I looked at (DSD-to-PCM converter output set to 24/352 for each, AudioGate's max was 192kHz):
In the top left panel, this is what a 0dBFS 24/96 "impulse" would look like with a typical linear-phase oversampling interpolation showing symmetrical pre- and post-ringing. Even though Adobe Audition renders the interpolation, the actual PCM data itself is a simple, single "pulse" (see Addendum below for screenshots using Audacity).

When I convert this waveform to DSD64 and DSD128 with Saracon and then back to PCM with the foobar SACD plug-in to 24/352 unfiltered (retaining all that ultrasonic noise), you get the 2nd and 3rd left images. Notice again the amount of noise in the signal and again, we see the superiority of DSD128. From a time domain perspective, the SACD conversion process is excellent. The shape and timing of the impulse would be completely retained since the 2.8224 MHz sampling rate of DSD64 provides ~29 samples within each 96kHz time period.
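As a quick check of that "~29 samples" figure, here is a minimal sketch of the arithmetic. The function name is my own for illustration and is not taken from any of the converters discussed.

```python
# How many 1-bit DSD samples fall inside one 96kHz PCM sample period?
# DSD64 runs at 64 x 44.1kHz; DSD128 at twice that.

DSD64_RATE_HZ = 64 * 44_100   # 2,822,400 Hz
DSD128_RATE_HZ = 2 * DSD64_RATE_HZ
PCM_RATE_HZ = 96_000


def dsd_samples_per_pcm_period(dsd_rate_hz, pcm_rate_hz):
    """Ratio of the DSD sampling rate to the PCM sampling rate."""
    return dsd_rate_hz / pcm_rate_hz


ratio_dsd64 = dsd_samples_per_pcm_period(DSD64_RATE_HZ, PCM_RATE_HZ)
ratio_dsd128 = dsd_samples_per_pcm_period(DSD128_RATE_HZ, PCM_RATE_HZ)

print(f"DSD64:  {ratio_dsd64:.1f} DSD samples per 96kHz period")
print(f"DSD128: {ratio_dsd128:.1f} DSD samples per 96kHz period")
```

This only speaks to timing granularity; it says nothing about the noise riding on those samples.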

When we use iZotope RX 4 with 30kHz low-pass filtering (4th left image from the top), the "impulse" amplitude is significantly reduced and we see the corresponding ringing pattern as the high frequency noise is removed and no longer obstructing the picture.

AudioGate and Saracon both look very similar. Both use linear phase filters with characteristic symmetrical pre- and post-ringing. Whereas AudioGate allows high frequencies through (and thus well formed impulse), we see the effect of Saracon's filter (pre- and post-ringing ~30kHz).  JRiver looks like it uses an intermediate phase filter (with 24kHz or 30kHz low-pass) which minimizes but does not remove pre-ringing. Comparatively, we see that DSD Master is using a form of minimum phase filter that removes the pre-ringing but the post-ringing is augmented.

AuI ConverteR is an interesting case. As we saw above with the RightMark tests, it implements a very sharp ~20kHz low-pass filter. This impulse response looks to be linear phase with accentuated pre- and post-ringing due to the sharpness of the filter; the "price" to pay I suppose.
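To illustrate the trade-off in general terms, here is a rough sketch of a textbook Hann-windowed sinc low-pass filter. This is NOT the filter AuI ConverteR (or any other program here) actually uses; the tap counts, the 1% threshold and the function names are all assumptions chosen for illustration. It shows two things: a linear-phase FIR has symmetric coefficients (hence matching pre- and post-ringing), and a sharper filter (more taps, narrower transition band) rings for longer before the main pulse.

```python
import math


def windowed_sinc_lowpass(cutoff_hz, sample_rate_hz, num_taps):
    """Linear-phase FIR low-pass: ideal sinc response shaped by a Hann window."""
    if num_taps % 2 == 0:
        raise ValueError("num_taps must be odd so the filter has a centre tap")
    fc = cutoff_hz / sample_rate_hz          # cutoff in cycles per sample
    mid = (num_taps - 1) // 2
    taps = []
    for n in range(num_taps):
        k = n - mid                          # distance from the centre tap
        ideal = 2.0 * fc if k == 0 else math.sin(2.0 * math.pi * fc * k) / (math.pi * k)
        hann = 0.5 - 0.5 * math.cos(2.0 * math.pi * n / (num_taps - 1))
        taps.append(ideal * hann)
    dc_gain = sum(taps)                      # normalise to unity gain at 0Hz
    return [t / dc_gain for t in taps]


def pre_ringing_samples(taps, threshold=0.01):
    """Samples between the first tap above threshold*peak and the centre tap."""
    peak = max(abs(t) for t in taps)
    mid = (len(taps) - 1) // 2
    for n, t in enumerate(taps[:mid]):
        if abs(t) > threshold * peak:
            return mid - n
    return 0


# The response of an FIR filter to a single-sample impulse is its tap list.
gentle = windowed_sinc_lowpass(20_000, 352_800, 63)    # wide transition band
sharp = windowed_sinc_lowpass(20_000, 352_800, 511)    # narrow transition band

print("gentle filter pre-ringing (samples):", pre_ringing_samples(gentle))
print("sharp filter pre-ringing (samples): ", pre_ringing_samples(sharp))
```

Because the taps are mirror images around the centre, whatever ringing appears before the pulse appears after it as well, which is the symmetrical pattern seen with the linear-phase converters.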

I'll leave you to decide how you feel about this information and whether you think the relative time domain effects resulting from implementation of the filters are audible. Back in 2013 I had a listen to some filter settings off the TEAC DAC and had difficulty noticing much of a difference; again here, I listen and fail to convince myself that I have any clear preferences among the converters including using ABX Comparator. So far I'm using headphones (Sennheiser HD800 + TEAC UD501 DAC, ASIO driver playing DSD64 converted to 24/352kHz) so perhaps I need to try again with the speaker system. You guys up for an internet "blind" test to see if there's a preference towards linear phase vs. minimum phase upsampling???

IV. Conclusion

I hope we can appreciate the compromises we face with DSD to PCM conversion. How much noise can we tolerate from the 1-bit quantization when we move the signal to PCM? What's the best frequency to set a low-pass filter assuming one believes it's necessary? What parameters should we use to filter (minimum / linear / intermediate phase, sharp vs. gradual roll-off...)? What's the best sampling rate to spit out the PCM data (eg. do we need to produce >96kHz files if we roll-off before 48kHz)?

As I noted last week, I really am not convinced that these differences are audible beyond volume level changes and whether the ultrasonic noise causes problems for one's audio system (eg. intermodulation distortions, interaction with tweeter ultrasonic peaks, and other non-linearities). This is why I don't think there's any point in "crowning" any software package as being superior. Although it's interesting to demonstrate and experiment with, I suspect these are all rather obsessive, academic results of interest mainly to audio geeks :-).

If I had to choose, I remain partial to Saracon and JRiver because of the excellent results from the low-pass filtering used by default with those programs; one-step easy conversion using very reasonable parameters. As you can see, I can get similar results with the foobar SACD plug-in creating 24/352 output, and running that through iZotope RX 4 with high-precision samplerate conversion and low-pass filtering, indicating that the underlying free DSD-to-PCM algorithm works well. AuI ConverteR is interesting in that the default setting I looked at resulted in a very clean output, so long as one does not feel there is any need for >20kHz signals to be retained and is not concerned about the effect on the impulse tracing.


You can perhaps imagine, after "penning" these last 2 posts, I'm pretty well done with talking about DSD for a while. The most interesting question for me currently, as suggested by the discussions with the previous post, is this whole notion of just how much significance we should place on resolution in the time-domain irrespective of audible frequencies.

If it is significant (I hesitate to use the word "important" since that should be obvious by now if it is the case), then how much is enough? Should we take research like this paper by Kunchur (2008) seriously? Or is it possible that for practical purposes, it doesn't really matter that much when we're listening to real music as opposed to test signals? In any case, I have a strong suspicion that we will be revisiting this in the days ahead since this seems like an area that will be brought out when Meridian's MQA becomes available as I suspect they will emphasize time parameters, digital filter types, and samplerate given their apparent satisfaction with 16-bit resolution.


Finally, it has come to my attention that there was much unhappiness regarding a recent blog post on the importance of noise (here also) in digital audio reproduction to the point of using speculation to support an underlying belief that expensive ethernet cables could somehow impart beneficial effects (as you know, I found no evidence of significance in my testing with various types of ethernet cables). As usual, no empirical data or real-life examples were provided and support came from more testimony from the like-minded and some links that are at best tangential to high-fidelity audio. It looks like bans from commenting were issued for what seems like rather fair statements calling out the obvious lack of substance. I guess that's how people not felt to be "true believers in the audiophile experience" are dealt with. IMO, this is unfortunate behaviour for a site reporting on mature audio computing technology.

There is much that can be said, argued and refuted in that article, but I think for most reasonable audiophiles it's rather obvious and many excellent points can be found in the comments... What is of relevance to this blog entry is that if one believes that expensive ethernet cabling can reduce noise in the "system" (in a way that appears difficult for these people to produce empirical evidence for), why would any audiophile who subscribes to this theory even want to listen to DSD64 (where the noise is obviously demonstrable and a potential cause of distortion)? Or even consider DSD64 superior to 24/192 at times? Would it not be just as likely that some folks actually like the ultrasonic noise and what it actually is doing through the system? Perhaps similar to how some tube-lovers talk about certain types of distortion being unobjectionable? In fact, back in late 2013, I posted on my impressions with realtime PCM-to-DSD transcoding with JRiver 19 and felt that DSD64 did impart a subtle change to the sound. I wouldn't say that I felt the sonic difference compelled me to convert all my PCM files to DSD for listening, but it was an interesting effect. Maybe that's why some people would prefer a DAC that purposely converts PCM to DSD like the PS Audio DirectStream DAC (the signal is purposely downsampled to DSD128, and then only noise filtered by 80kHz according to this review).

I hope you enjoyed this exploration into the world of DSD (again)... I've got a few projects piled up to work on so might not be around as much for the next couple of weeks. I'll also be in Boston in the next little while so if anyone has a recommendation on a music store I should check out near downtown, let me know!

I'm also thoroughly enjoying David Byrne's book How Music Works (2012, with 2013 update) - check it out for entertaining reading!

Enjoy the music folks :-).

Addendum: Note that Adobe Audition renders the PCM data with a linear phase interpolation filter. For reference, here are renderings of some impulse waveforms using Audacity, which does not do the fancy interpolation:

Friday, 3 April 2015

ANALYSIS: DSD-to-PCM Conversion 2015 - Windows & Mac OS X

Impulse Response: One of the talking points from back in the day as a selling point for DSD... Yup! DSD can better reproduce a 0.000003 second "click". Source: Merging Technologies

I. Preamble

It is amazing how quickly another year has passed. About this time last year, I posted the first "shoot-out" of sorts comparing DSD encoders and decoders: Weiss Saracon 01.61-27, KORG AudioGate 2.3.3 and JRiver 19.0.117, judged on both encoding and decoding fidelity using the RightMark Audio Analyzer software. The idea was to determine which of the three could create a DSD file from an original 24/96 PCM test signal and then decode it back to 24/96 with the least change; that is, the lowest distortion, flattest frequency response, and lowest amount of added noise.

Perhaps not surprisingly, the expensive Weiss Saracon software set the standard as the most consistent DSD encoder that resulted in the best output once decoded. The differences in decoding capability appeared to be very minor (questionable audibility between the 3) but objectively, both Saracon and JRiver 19 were on par and the free AudioGate 2 was somewhat "noisier" in terms of the PCM output (I speculated this was due to a stronger dithering algorithm).

Well, another year has passed in terms of software upgrades to DSD decoding and I was interested to compare the decoding capabilities as of late. We have brand new versions of JRiver and AudioGate now, plus I didn't get to test foobar with the DSDIFF plugin last year. We also now have DSD decoding available on Mac OS X with XLD and, commercially, with DSD Master.

To maintain an "apples to apples" comparison as best I can, I will use the Saracon 01.61-27 encoded DSD file of a 24/96 test signal with each of these decoders. I think this is fair given that Saracon produced excellent results last time as an encoder plus it is an "industry standard" used in many professional studios.

Let us have a look at the decoders I will be testing:

1. foobar with DSDIFF 1.4 plugin [Windows]. foobar2000 does not need any introduction as a freely available Windows music player. It's stable. Sounds fantastic. Is bitperfect. Is extremely feature-rich. And as witnessed by the myriad plugins like DSDIFF, very extensible. The DSDIFF plugin has been around as version 1.4 since 2011 by kode54. Decoding was set to 24/96 PCM, WAV output.

2. KORG AudioGate 3.0.2 [Windows]. AudioGate got an upgrade in June 2014 to version 3. It's no longer "free" like before where you can do conversions after just allowing Tweets from your account. I'm going to test with the output set at 24/96, "High Quality", no dithering. Let's see if the noise floor is better than AudioGate 2.

3. JRiver Media Center 20.0.87 [Windows used, Mac and Linux also available]. Upgrade from version 19 to 20 has introduced some new features (one of which I'm most interested in I'll talk about another time). Don't know if DSD decoding has changed... We shall see! For the test this year, I decided to stay with the default "safe" 24kHz low-pass filter like last year. Remember, you do have a choice: go to Tools --> Options... --> Advanced --> ... Configure input plug-in ... --> DSD input plug-in...

4. X Lossless Decoder (XLD) 20141129 [Mac OS X]. Freely available audio tool for Mac users. As of the 2014/11/09 release, it has the capability of decoding DSD files. Let's see how this newcomer compares! Again, decoding target was 24/96, I used higher quality "SoX VHQ Linear" resampling. Since the options allow easy adjustment of parameters like decimation and quantization, I will test both the default (8:1 decimation, 24-bit integer quantization) as well as higher quality (8:1 decimation, 32-bit floating point). [Decimation correlates to the output samplerate, so 8:1 for DSD64 is 352.8kHz output which then gets resampled by SoX to 96kHz.]

5. DSD Master 1.0 [Mac OS X]. Thank you to Richard at BitPerfect Sound Inc. for letting me give this program a spin (US$29.99 on the AppStore)! As per the company namesake, these are the same guys who brought you BitPerfect for the Mac. I actually downloaded this program in late 2014; unfortunately it took me a while to get to the testing... After migrating to Windows in the last 2 years, I just haven't been using the Mac nearly as much. As with the other programs, I've set DSD Master to do all processing back to 24/96. Write as a WAV file. No gain applied. I see DSD Master can handle up to DSD256. One interesting feature is the DSD Hybrid mode where the DSD data is retained along with PCM conversion - large file sizes to be expected, and you will need BitPerfect 2.0 to play through iTunes to a DSD DAC.
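For anyone wondering how the decimation setting mentioned for XLD maps to a sample rate, here is a small sketch of the arithmetic. The function names are hypothetical and not part of XLD or SoX; the code only reproduces the 8:1 relationship described above and the ratio the resampler then has to bridge to reach 96kHz.

```python
from fractions import Fraction

CD_BASE_RATE_HZ = 44_100


def dsd_rate_hz(dsd_multiple):
    """DSD64 is 64 x 44.1kHz, DSD128 is 128 x 44.1kHz, and so on."""
    return dsd_multiple * CD_BASE_RATE_HZ


def decimated_pcm_rate_hz(dsd_multiple, decimation):
    """PCM sample rate after N:1 decimation of the DSD bitstream."""
    return dsd_rate_hz(dsd_multiple) // decimation


intermediate_rate = decimated_pcm_rate_hz(64, 8)       # DSD64 at 8:1
resample_ratio = Fraction(96_000, intermediate_rate)   # work left for the SRC

print(f"DSD64 at 8:1 decimation -> {intermediate_rate} Hz PCM")
print(f"Resampling ratio to 96kHz: {resample_ratio}")
```

The non-integer ratio to 96kHz is why a good samplerate converter matters for this last step.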

For completeness, I will also post up the results for Weiss Saracon 01.61-27 which I obtained last year as the standard for comparison. (I borrowed Saracon to test last year so have not kept up as to whether there have been updated versions since.)

II. Results

Here are the summary results (to keep the tables smaller, I separated the Windows and Mac software used):
Windows DSD Conversion
Mac DSD Conversion
Remember that the RightMark software is calculating these results based on the audible 20Hz to 20kHz spectrum. Ultrasonic components therefore will be ignored in the numerical result so factors like different low-pass filtering settings will not show up here unless it clearly encroaches into the audible range. The first column is for an untouched copy of the original 24/96 PCM test signal - this represents the ideal / "perfect" result possible. The second column is the results from Saracon - same numbers as last year for comparison with a professional piece of software (with professional price tag).

If we compare the software updates, we see that there has been a substantial change with KORG AudioGate 3 compared to version 2 in terms of noise level; I'm seeing about a 10dB improvement compared to last year. If my suspicion is correct, perhaps they were using a stronger dithering algorithm back in version 2. The result is now much more in line with the other conversion packages.

As for JRiver 20 vs. 19, there has been little change; about 1-2dB difference. Note that JRiver posted fantastic numbers last year already so the slight improvement is actually very impressive!

Let's now look at the newcomers to this round-up...

Foobar DSDIFF did okay overall. Good performance and about the same as AudioGate when decoding the test signal. Hey, it's free :-).

XLD on the Mac provided good results as well. As you can see comparing the 24-bit integer to 32-bit floating point calculations, there was no difference. Any difference between 24-bit and 32-bit processing would be below the precision of the final 24/96 WAV file and RightMark's calculations. Therefore, if you use this program, you might as well stick with 24-bit integer calculations as it's faster.
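As a rough illustration of why extra internal precision has nowhere to show up, here is a sketch of the usual rule of thumb for the span of a fixed-point word (about 6dB per bit). The helper name is my own, and this ignores dither and noise-shaping details.

```python
import math


def fixed_point_range_db(bits):
    """Ratio between full scale and one quantization step, in dB."""
    return 20.0 * math.log10(2 ** bits)


range_16 = fixed_point_range_db(16)
range_24 = fixed_point_range_db(24)

print(f"16-bit: {range_16:.1f} dB")
print(f"24-bit: {range_24:.1f} dB")
```

Anything the 32-bit floating point path preserves below the 24-bit floor is discarded when the final 24/96 WAV file is written.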

Finally, we have the DSD Master software. Excellent quality output converting DSD64 to 24/96. Very low noise level and high dynamic range essentially the same as the much more expensive Weiss Saracon. (Remember, this is all relative since we're talking about noise levels below -130dB!)

Let's have a look at the graphs:
Windows DSD Conversion - Frequency Response
Mac DSD Conversion - Frequency Response
As you can see, JRiver has the strongest low-pass filter of all the programs (24kHz, 48dB/octave by default) followed by Saracon. Foobar DSDIFF plugin rolls off a little earlier and finally AudioGate looks like it lets most/all of the noise through. [Remember, in principle, it is a good idea to keep the low-pass filter to reduce potential for intermodulation distortion on playback and reduce any risk of damage from excess high-frequency noise. It also reduces the file size.]

On the Mac, I see that neither XLD nor DSD Master perform much low-pass filtering - at least not within 48kHz.

Here are the noise level graphs:
Windows DSD Conversion - Noise Level. Slight irregularity in the DSDIFF noise floor at high frequencies.
Mac DSD Conversion - Noise Level [note XLD tracings exactly the same so only the purple 32-bit tracing showing up]
No surprises around the amount of ultrasonic noise resulting from the noise-shaping required with 1-bit DSD quantization. As usual, with DSD64, the noise starts right around 20kHz using Saracon as encoder (remember that with some encoders, like AudioGate 2 last year, the noise floor isn't as flat all the way to 20kHz). We know that Saracon conversion back to PCM deals with this ultrasonic noise using a strong filter which essentially suppresses everything by 40kHz (hence 24/88 is good enough with Saracon). In a similar fashion, the default 24kHz low-pass filter with JRiver does a good job keeping ultrasonic noise down. As you can see, the others - foobar DSDIFF plugin, KORG AudioGate, XLD, and DSD Master - are all allowing the high frequency noise through at default settings.

III. Conclusions

As I suggested last year, I believe that DSD --> PCM conversion is transparent. I'm measuring the distortion added by both PCM (24/96) --> DSD64 [via Saracon] as well as DSD64 --> PCM (24/96) steps and as you can see, there is nothing showing up of concern. Fidelity is maintained beyond any DAC's analogue output I am aware of except for all that ultrasonic stuff peaking at about -85dB if filtering is not applied. Speaker system / headphone playback would add more distortion than this (at least within the audible frequencies).

You might be asking - what about subjective listening?

Well, I did convert Jorma Kaukonen's "Blue Railroad Train" (from the SACD Blue Country Heart, 2002, recorded in DSD) with each converter for a listen with the TEAC UD-501 DAC (reviewed here) through my Emotiva XSP-1 preamp, XPA-1L monoblocks in Class A bias mode, and Paradigm Signature S8v3 speakers, connected with all balanced interconnects. Once I made sure the levels were matched (for example, by default JRiver did not perform +6dB gain, whereas AudioGate does), I could not confidently differentiate the output from each converter. It's hard to do a blinded comparison between DSD and PCM because the DAC emits a faint "click" each time there is a switch between the formats to let me know something has changed. Furthermore, DSD playback isn't at exactly the same level (not to mention the DAC boldly declares "PCM" or "DSD" on the front LCD panel). The recording sounded beautiful, "crystal clear" - excellent whether in the form of a direct DSD playback (native ASIO) or converted to 24/96 PCM. Let's just say that if I walked in the room, I would have no trouble thoroughly enjoying the music produced through these converters...
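On the level-matching point, a +6dB gain difference is roughly a doubling of sample amplitude, which is easily enough to bias a listening comparison. Here is a minimal sketch of the conversion; the function names are my own, and the 6dB figure is the only value taken from the text above.

```python
import math


def db_to_amplitude_ratio(gain_db):
    """Convert a gain in dB to a linear amplitude (voltage) ratio."""
    return 10.0 ** (gain_db / 20.0)


def amplitude_ratio_to_db(ratio):
    """Convert a linear amplitude ratio back to dB."""
    return 20.0 * math.log10(ratio)


ratio_6db = db_to_amplitude_ratio(6.0)
print(f"+6dB gain multiplies sample amplitudes by about {ratio_6db:.3f}")

# To level-match, attenuate the louder file by the same amount.
attenuation = db_to_amplitude_ratio(-6.0)
print(f"Matching attenuation factor: {attenuation:.3f}")
```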

Bottom line: No need to worry about the sonic output from any of these converters IMO. Conversion algorithms and software look mature with little difference between them. The only significant choice is whether you want to have a low-pass filter in the conversion process (I do so I'd prefer Saracon or JRiver). I know some people claim they can hear qualitative differences between conversion programs beyond just level differences... Maybe. I'd certainly be impressed if anyone can show positive controlled, blinded listening test results given the minute changes I see/hear while doing these tests!


So guys, what do you think about the state of DSD these days? It really looks like the "push" has fizzled lately... Other than more DACs supporting native DSD playback, there seems to be little news out there. Anyone actually buying many DSD downloads?

As I wrote back in April 2013 (On SACD & DSD audio...), there are many factors working against DSD audio if the goal is to expand beyond just a small audio-geek niche format. It appears my concerns around the need for a modern file format that provides full tagging and data compression persist...

Psssstttt... Coders... Want to be famous? Pull together some code to create an open source ID3 taggable compression CODEC for DSD (.fdac? Free DSD Audio Codec - how about just .dac format). Get the guys at JRiver and foobar2000 to support it, and make sure it runs in Linux for music servers. I bet this format would become widely used among the guys ripping their SACD's and those who rip LPs into DSD! Make sure to support compression of DSD128+ as well just in case hi-res DSD becomes available (as far as I know the old Philips ProTECH DST Encoder could only handle DSD64)...

Until next time, have a wonderful April and hope you're enjoying the music!

Thursday, 26 March 2015

MUSINGS: Gone 4K / UHD - A "Look" At Ultra-High-Definition...

This week, I thought I'd take a break from just the audio stuff and discuss a new "toy" I got 5 weeks ago. That's right, as the title suggests, a 4K / UHD screen; it's a computer monitor to be exact:

A view from behind the commander's chair :-). BenQ BL3201PH on the table.
Remember that there's a separate "body" defining 4K movies at the local movieplex - the Digital Cinema Initiative (DCI). They have "true" 4000 pixel horizontal resolution like the 4096x1716 (2.39:1 aspect), or the very close 3996x2160 (1.85:1) resolutions. Whereas for the smaller screens like computer monitors and TV's, we have the UHD "Ultra High Definition" standard defined as 3840x2160 (16:9 / 1.78:1). So although it's not exactly "4K" horizontally, it's close and I guess "4K" is a better advertising catch-phrase than "2160P". Needless to say 3840x2160 is 2x the linear resolution of 1080P or 4x the actual number of pixels.
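For those who like to see the numbers, here is a quick sketch checking the pixel counts and aspect ratios quoted above. The dictionary and its labels are mine, for illustration only.

```python
# (width, height) for the resolutions mentioned in the text
RESOLUTIONS = {
    "DCI 4K scope": (4096, 1716),
    "DCI 4K flat": (3996, 2160),
    "UHD": (3840, 2160),
    "1080P": (1920, 1080),
}


def pixel_count(width, height):
    return width * height


def aspect_ratio(width, height):
    return width / height


for name, (w, h) in RESOLUTIONS.items():
    print(f"{name:13s} {w}x{h}  {pixel_count(w, h):>9,d} pixels  "
          f"{aspect_ratio(w, h):.2f}:1")

uhd_w, uhd_h = RESOLUTIONS["UHD"]
hd_w, hd_h = RESOLUTIONS["1080P"]
linear_factor = uhd_w / hd_w
pixel_factor = pixel_count(uhd_w, uhd_h) / pixel_count(hd_w, hd_h)
print(f"UHD vs 1080P: {linear_factor:.0f}x linear, {pixel_factor:.0f}x pixels")
```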

Please allow me to reminisce a little on "ancient" technology history... Back in 1989, in my university undergrad, I worked for a summer doing computer science research and saw for the first time a SUN SPARCstation 1 "pizza box" with 20MHz processor, 16MB RAM, and a 256-color "megapixel" (1152 x 900) display. I was blown away! This was a "dream machine" compared to my 7MHz Motorola 68000, 512KB Commodore Amiga 1000 with 32 colors (4096-color HAM mode was cool but limited in application, before the 64-color EHB mode) and a maximum resolution of 640x400 interlaced (can be pushed a bit into overscan). Back in those days, even a relatively expensive Macintosh was only capable of 640x480 8-bit (256) color.

The closest to "true-color" I saw in the 80's was an old Motorola VME 68020 machine I worked on to develop a rudimentary GUI for image recognition software running an ancient 16-bit color Matrox frame buffer video card. Although limited to 640x480 interlaced, it was impressive to see an actual digital picture on a computer screen that looked like something out of a video!

[Even back then, although the sound quality was nothing to write home about, in 1989, the first PC Sound Blaster card was introduced. By then, we had been living with CD audio for a number of years already, and even this first generation card was capable of 8/22 mono already. It was just a matter of time before 16/44 stereo sampling was an option given enough memory and storage space. The Sound Blaster 16 with 16/44 stereo came just a few years later in 1992. Clearly, technology for imaging / video has always been behind audio in capability and relative fidelity due to complexity and storage requirements (this of course also speaks to the neurological sophistication of the visual architecture compared to audio in our brains).]

At some point in the early 1990's I saw a TI "true color" 24-bit graphics card machine at the university (remember the TARGA graphics file format anyone?). By 1994, I bought myself a Picasso II graphics card for the Amiga capable of 800x600, 24-bit color (sweet!). By 1997, my first PC graphics card capable of >1024x768, 24-bit color was the Matrox Mystique. From then on, each generation of graphics card became more about 3D performance rather than 2D speed or resolution... My computer display also got upgraded through the years, from NEC MultiSync CRTs to 1280x1024 LCD, to Dell's UltraSharp 24" series (1920x1200), and last year I got the excellent 27" BenQ BL2710PT (2560x1440).

But one goal remained elusive on my desktop machine. A large screen monitor (in the 30" range) with at least spatial "high fidelity"; looking smooth, detailed, with clearly enough fidelity that my eyes/mind no longer would be able to distinguish those digital pixels anymore - in essence, something close to the limit of our visual spatial apparatus in 2D (perhaps like how CD is close to our auditory limits within the stereo domain). Although in the visual sphere there's still room for improvement in terms of color accuracy, contrast (dynamics), and black levels, finally it looks like we're "there" with pixel resolution (and at minimum flickerless 60Hz refresh rates with decent response time).

This goal of achieving pixel resolution meeting biological limits is obvious and technology companies have been building up towards it for years. Apple's "marketing speak" captured it nicely; they called it "Retina Display" - a screen resolution packed tightly enough that individual pixels would not be visible to the user. The first product they released to the public with this resolution designation was the iPhone 4 with a screen resolution of 960x640 (3.5", 326ppi) in June 2010 (of course other phone companies use high resolution screens and have surpassed Apple's screens; though I must credit Apple with their superb marketing prowess!). Steve Jobs back then made a presentation about the resolution of the human eye being around 300 dpi for cellphone use:

Realize that this number is only relevant in relation to distance from which the screen is viewed. When we test eye-sight, the "target" of 20/20 vision is when we are able to discriminate two visual contours separated by 1 "arc minute" of angular resolution (1/60th of 1 degree). Like I mentioned in the post a couple weeks ago, like hearing acuity, there will be phenotypic variation to this in the population and some folks will achieve better than 20/20 vision just like some people will have better hearing than others ("golden ears"). For those interested in the physics and calculated limits of vision, check out this page.

Coming back to technology then... As per Steve Jobs, when we use a cell phone, we generally view it at a distance closer than say a laptop or desktop monitor. Normally we'll view a smallish screen phone (say <6" diagonal) at about 10-12 inches. In that context, the 300 pixel per inch specification is about right... Just like in audio where we can argue about "Is CD Resolution Enough?", the visual resolution guys also argue if more is needed - witness the passion of the Cult Of Mac and their plea for "True Retina" (something like 900 ppi for the iPhone 4, and 9K for a 27" computer screen)!

Until that day when we can see for ourselves if 9K is needed though (the UHD definition offers 8K for those truly on the bleeding edge of technology), check out this helpful web site for calculating at what viewing distance a screen becomes "retina" grade:

Enter the horizontal and vertical resolution, then screen size, and press "CALCULATE". It'll tell you the PPI resolution, aspect ratio, and most importantly in this discussion at what distance the angular resolution of the pixels reaches the 20/20 threshold. Using this calculator, my BenQ BL3201PH, 32" 4K/UHD (3840 x 2160, 137 dpi) monitor reaches "retina" resolution at a viewing distance of 25".
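If you'd rather not rely on a web page, the same calculation can be sketched in a few lines of Python. To be clear, this is my own approximation and not necessarily how that calculator is implemented: I'm assuming square pixels, a flat screen viewed head-on, and the 1 arc minute (20/20) threshold discussed above. The function names are made up for this example.

```python
import math

# 20/20 acuity threshold used in the text: 1 arc minute = 1/60 of a degree.
ARC_MINUTE_RAD = math.radians(1 / 60)


def pixels_per_inch(h_px, v_px, diagonal_in):
    """Pixel density from the resolution and the diagonal size in inches."""
    return math.hypot(h_px, v_px) / diagonal_in


def retina_distance_in(ppi, acuity_rad=ARC_MINUTE_RAD):
    """Viewing distance (inches) at which one pixel subtends `acuity_rad`.

    Sit farther away than this and a single pixel spans less than the
    threshold angle, so it should no longer be individually resolvable.
    """
    pixel_pitch_in = 1 / ppi
    return pixel_pitch_in / math.tan(acuity_rad)


# 32" UHD monitor, as discussed above.
monitor_ppi = pixels_per_inch(3840, 2160, 32)
monitor_distance = retina_distance_in(monitor_ppi)
print(f'32" UHD: {monitor_ppi:.1f} ppi, "retina" beyond {monitor_distance:.1f} inches')

# iPhone 4, using the quoted 326 ppi figure directly.
phone_distance = retina_distance_in(326)
print(f'326 ppi phone: "retina" beyond {phone_distance:.1f} inches')
```

The phone result lands inside the 10-12 inch viewing range mentioned earlier, which is a nice sanity check that the 300 ppi claim and the 1 arc minute threshold are telling the same story.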

Considering that I generally sit >25" away from the monitor, it looks like I've achieved that "magic" resolution I've been hoping for all these years :-). With a 32" monitor, you actually wouldn't want to sit too close, otherwise you'd be moving the head too much to scan the screen all the time. Subjectively, the monitor image looks gorgeous and it really is wonderful not noticing any pixels or easily making out any aliasing imperfections in text. I think I can live with this for a few years!

There's something special about achieving high fidelity (whether audio or visual). For a machine to match (and these days surpass) biological sensory limitations is a milestone. And to do it at price points within reach of most consumers is further evidence of technological maturation. In just a few years, we've witnessed the transformation of high resolution screen technology with "retina" resolution starting in handheld devices, to laptops, and now to the desktop monitor...

In the Archimago household, there remains one large screen screaming for these high resolutions. My TV in the sound & home theater room. If I plug the numbers into the website, it looks like I'll need an 80" 4K TV :-). Well, I'll be keeping an eye on those prices then! Although I'm willing to jump into the 4K computer monitor waters at this time, I think I'll wait when it comes to the TV. HDMI 2.0, DisplayPort 1.3, HDCP 2.2 all need to be hashed out and widely supported before I jump in with a big purchase. Also, OLED 4K could be spectacular... Maybe next year?


I want to say a few words about the usability of 4K monitors. I was actually a little apprehensive at first about buying one due to some reviewers complaining that text size was too small and it was too difficult to use with Windows 8.1. I suspect this would be the case with smaller 4K screens like 27" models (Huh!? What's with that 5K iMac at 27"?). At 32", I can actually use it even at 100% (1:1) although a 125% scaling made things easier on the eyes. Note that many/most Windows programs are still not "scaling aware", which is why having the screen usable at 100% from a standard viewing distance is beneficial at this time.

Use the "scaling"!
Firefox runs great with 125% scaling and you can go into Chrome's options to set the default scaling to 125% as well. Internet Explorer looks excellent out-of-the-box.

For digital photography, 32" 4K was made for Lightroom / Photoshop! The ability to see your photos on a large screen with 8 full megapixels is stunning. The bad news is that my quad-core Intel i7 CPU is feeling slower processing all those megapixels from a RAW file; not quite enough to make me feel I need a CPU upgrade just yet.

There are some 90+Mbps AVC 4K demo videos floating around providing a tantalizing taste of what 4K Blu-Ray could look like in the home theater. Panasonic showed off their 4K "ULTRA HD Blu-Ray" at CES2015 recently and I suspect that will be the best image quality we're going to get for a while simply because of the large capacity Blu-Ray disks have to offer. It looks like the new encoding standard H.265/HEVC will be used for these future videos and this will provide even better compression efficiency and image quality for the same bitrate (supposedly similar image quality at 50% data rate compared to H.264/AVC). This could end up being the last copy of Shawshank Redemption I ever buy... Hopefully :-). [Even here, we can get into a debate about analogue vs. digital... Arguably, unless the movie was filmed in 70mm, 4K should be more than adequate to capture the full image quality of any 35mm production.]

For the time being, 4K YouTube streaming does look better than 1080P but it's clear that Internet bitrates impose significant compression penalties (noticeable macroblock distortions with busy scenes). Netflix has some material but will not currently stream 4K to the computer (only 4K TVs so far - probably due to copyright protection). I have watched 4K shows like House Of Cards and Breaking Bad off Netflix, but like 4K YouTube, the quality isn't really that impressive at this point.

Finally, remember the hardware needed to run a 4K/UHD monitor. I decided at this point to get the screen because we now have 2nd generation reasonably priced screens (~$1000) at 60Hz, with IPS-type technology. The BenQ uses the DisplayPort to achieve 60Hz refresh rate and is SST (Single Stream Transport) instead of MST (Multi-Stream Transport) which splits the screen into 2 x 2K "tiles". SST should be hassle free as I have heard of folks experiencing driver issues with the tiled screens not handled properly (imagine only half the screen displaying if the software fails the tiling process). Note that for a bit more money, the Samsung U32D970Q has received some excellent reviews for image quality and color accuracy.

I'm currently using an AMD/ATI Radeon R9 270X graphics card I got last year. Not expensive and has been trouble free for 60Hz SST operation. Just remember to buy some quality DisplayPort 1.2 cables (the BenQ has both full-sized and mini DisplayPort inputs). Here's an example of a very high speed digital interface that requires about 12Gbps of data transferred to achieve 3840x2160, 24-bits/pixel at 60Hz. The 6' DP-to-miniDP cable that comes with the monitor does the job fine but so far I have had no luck with 10' generic cables just to give some extra flexibility to my setup (anyone know of a reliable 10' 4K/60Hz cable, maybe 26AWG conductors?). Even at data rates 25x that of high-speed USB 2.0 (and 2x USB 3.0 speed), there's no need to spend >$20 for a good 6' cable.
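Here's a quick back-of-the-envelope check of that ~12Gbps figure as a Python sketch. Note the assumptions: I'm counting only the raw active-pixel payload (no blanking intervals or link encoding overhead, so the actual rate on the wire will be higher), and comparing against the nominal USB signalling rates of 480Mbps (high-speed USB 2.0) and 5Gbps (USB 3.0). The function and constant names are just for illustration.

```python
def video_data_rate_gbps(width, height, bits_per_pixel, refresh_hz):
    """Raw uncompressed pixel data rate in Gbps (active pixels only)."""
    return width * height * bits_per_pixel * refresh_hz / 1e9


# Nominal signalling rates, in Gbps.
USB2_HIGH_SPEED_GBPS = 0.48
USB3_GBPS = 5.0

uhd_rate = video_data_rate_gbps(3840, 2160, 24, 60)
print(f"3840x2160, 24-bit, 60Hz: {uhd_rate:.2f} Gbps")
print(f"vs USB 2.0 high-speed: {uhd_rate / USB2_HIGH_SPEED_GBPS:.1f}x")
print(f"vs USB 3.0: {uhd_rate / USB3_GBPS:.1f}x")
```

So the "25x USB 2.0" and "2x USB 3.0" comparisons hold up as round numbers, and a cheap cable still manages it without drama.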

Modern high-performance gaming at 4K would really demand a more powerful graphics processor so I haven't tried on this machine. I suspect less demanding ones would run just fine.


As noted earlier, remember that pixel resolution is only one factor in overall image quality. The ability to display good contrast (like dynamic range in music) and also color accuracy are very important. Clearly it's in these other areas that computer and TV displays can further improve. Note also that UHD defines an enlarged color space as well (ITU-R BT.2020 vs. the previous Rec. 709 for standard HDTV - see here) so the improvement in this regard is another tangible benefit.

I hope you enjoyed this foray outside the usual audio technical discussions... Enjoy the music and whatever visual set-up you're running!

PS: Happy Dynamic Range Day (March 27, 2015)! Great to see a recent purchase; Mark Knopfler's Tracker was mastered at decent DR11... Keep 'em coming - "rescuing the art form" is about preserving qualities like the full dynamic range and releasing music meant for listening with systems superior to boomboxes and earbuds!

Tuesday, 17 March 2015

MUSINGS: Audiophiles "Us vs. Them" (Objectivists vs. Subjectivists) Attitudes and Envy!?

Since I'm stuck at LAX on my way home from a wonderful Spring Break with the wife and kids down in Texas as well as a Caribbean cruise, I thought I'd polish my response to Hal Espen's comment in the last post... BTW, I enjoyed visiting Bjorn's in San Antonio just to see the audio and home theater gear they had on display. Some really nice stuff and it looks like they're upgrading their main demo room to Atmos soon. I appreciated the knowledgeable staff and friendly attitude; taking time to demo the gear even though they knew I didn't even live in the USA.

So Hal, nice comment:
Pure confirmation bias from beginning to end. None of this really exists. : )
You can't have it both ways. Either your blog is about providing the little bursts of brain chemicals that us vs. them skeptics receive when scientism is seen to be crushing audiophilia, or you're going to go wafting into the subjective realm of the subtle and imaginative "classy" pleasures of reproducing music electronically, as you do with evident misgivings here.

Which side are you on, boys? Which side are you on?
Gets right to the heart of some of the heated debates and arguments I suppose... I guess I "swing both ways" in some regards. :-)

Remember though, I am "more objective" in my leanings in terms of intolerance for some of the true BS (like certain cables). However, I have no issues with enjoying the finer things in life. If a $50,000 pair of speakers made with premium materials looks fantastic with my decor, sounds great, and I really wanted them; I would happily buy them! But as an objectivist, I would just like to make sure that they are built to scientific principles around the ability to convey accurate sound (decent frequency response, minimal enclosure resonance, good time alignment, rational crossovers, adequate power handling...). The philosophical issue I have with pure subjectivism is that I see these parameters as pre-requisites as an informed consumer to my purchase and essential to any complete review due to psychological limitations (biases) and limits of human hearing acuity based on personal experience with my own failings and knowing the limits of various "golden ears" I have come across in my travels. I don't think it's unreasonable to state that some folks lack insight into their own abilities and limitations - this is not just a comment about audiophiles but applies to many other situations as well.

There are many examples in the Stereophile reviews where IMO it's quite clear that certain "recommended" components should be viewed with suspicion in the eyes of those interested in objective criteria of accuracy and "high fidelity". An example is something like the DeVore Orangutans - they don't "measure up" as can be seen with the Stereophile measurements and there's even a comment about audible coloration with solo piano by JA. Many interesting comments in that post. For the asking price of $12000/pair, I think that's unreasonable performance for the expense given a myriad of other options at that price point and below. BTW I have heard them and although they sound OK especially with low power amps, I am just not interested in gear that "colors" the sound in a significant way. No matter what some subjective folks "think" or "hear" or "feel". (Esthetically these speakers do nothing for me either.)

This principle is all the more relevant with stuff like cables (especially for digital signals) where there's literally nothing there to measure or difference to be heard once any kind of controlled protocol is put in place. Other than subjective esthetic preferences and psychological "feel good" about owning expensive copper snakes. I really don't care enough about the "bling" of cables to feel it's worthy of the expense since that is all they offer.

Ultimately, I think it's OK to embrace the various "shades of grey" in audiophile philosophical leanings and I hope I don't come off too intolerant of anyone's freedom to believe what they want. However I don't have to believe everything I hear/read and I choose to take a stand on buying gear based on what appears to be reasonably "sound" science. Some folks seem to think it's about expense or "envy" about the cost of audiophile gear; and that's the reason why some people criticize the equipment or company. While this may be the case at times, personally I do not believe this is my concern at all (nor have I met many objectivists where I thought they might be projecting envy as a major reason for their disdain of nonsense). Over the years I've easily spent >$50,000 on audio gear and much more than that to buy a house meeting my criteria for a decent enough sound room (yet another pre-requisite - something I wish all audio reviewers would talk more about and show us pictures of the soundroom rather than listing likely insignificant accessories like cables used).

I truly find it bizarre that recently folks like Michael Lavorgna at AudioStream keep talking about "envy" (like here and here)... Folks, when objectively some things don't make sense like $1000 ethernet cables, what is there to be envious about unless one is honestly willing to accept that they are in this audio hobby not for sound quality but acknowledging that "bling" is worth coveting (like that $17,000 Apple Watch)?

Gents (and ladies). Enjoy the music!

No need to get upset in flame wars since it's just a hobby... One of I hope a number of others since there is so much in this world to enjoy. Figure out what's important to you and your stance. Most of all, for the love of the community, stay cool when it comes to debates out there :-). IMO, the objective perspective has so much to offer in terms of reality testing, tools to help adjudicate qualitative differences, and a way to tease out collective facts from individual beliefs... For something as obvious as ethernet cables, put the facts forward and wait for reasonable responses or evidence to show otherwise. Hopefully folks will think about their beliefs and engage in reasonable conversation about what is important and how we can all benefit from improved sound quality.

And it's always good to realize there's more to this than a simplistic and childish "us vs. them" attitude of course...

BTW: I just couldn't help but run into this article on the "JCAT Reference USB" cables. Can someone tell me the definition of a "true believer in the audiophile experience" or the "true hobbyist"? So what does that make "us" or are we "them"? :-) Also, shouldn't we be reserving phrases like "true believer" for religion and faith rather than engineered products based on applied physics!?

Saturday, 7 March 2015

2015-02-27: HiFi Centre Grand Reopening...

About a year ago, I reported on the B&W Nautilus demo at the HiFi Centre here in Vancouver. That was at their old location... As of late February, they're now in the new place near Vancouver Chinatown and to celebrate, they had a nice (re)opening event (February 27, 2015). It caught some publicity from the Stereophile web site as well.

I decided to go check it out after work on the Friday and see the new space. For fun, here are a few cell phone snapshots of some of the gear on display. Note that it was well attended and I purposely tried not to take shots with people in them to protect the innocent :-).

Upon entering the main building, we see a Nautilus "shell" display; HiFi Centre has always had a good partnership with B&W:

Of course what's a party for audiophiles without some live jazzy music? And there were free drinks on the house as well... Thanks guys!

There's a nice "headphone bar" to demo. Just plug in the appropriate headphone to the Bryston BHA-1 amp and use the iPad to play a tune. Good to see balanced cabling used for the higher model Audeze and Sennheisers (not that it'd make much difference in a store but at least being demoed with best potential quality). I'm only showing the Audeze and Sennheiser selections here. They also have a full line of Audio-Technica headphones and Grados on the other side out of the picture to the left.

I already have the Sennheiser HD800's so was eyeing that Audeze LCD-3 - maybe I'll add a planar magnetic headphone to my collection at some point. Beside the LCD-3 on the top rack was the LCD-XC, a closed-back unit which sounded excellent with great noise isolation in the room. It feels heavy in hand and I can see it potentially getting heavy over time when worn, but the comfy ear pads really felt great (at least over the few minutes I was listening).

Across the way was an Auralic stack with the Aries streamer, Vega DAC and Taurus headphone amp. It was connected to the yet-to-be-released AudioQuest NightHawk headphones. It's a "semi-open" design so even in a loud area with folks chatting and music playing, it wasn't too difficult to hear the music playing; not as good as a fully closed design of course for noise isolation. Keep an eye on the Head-Fi posts to see when it's officially released. The AudioQuest rep says it's coming in April. It felt comfortable and I didn't notice any sonic issues for what it's worth in a noisy environment. As for the Auralic devices I think the Vega is a great DAC and I've always liked that amber LCD front panel design. Esthetically I still think the Aries looks too much like a Cylon Basestar on the old Battlestar Galactica! I don't see the point of the vertical visual obstruction from the "flares" on top and bottom. I'm sure it works fine as a digital transport, but it's not all that visually pleasing to me and doesn't help functionally IMO. And for this price, I'm just not impressed by the plastic facade.

Moving along, we see turntables on the wall along with various speakers below them. Most of the tables were Rega and Music Hall. These were not connected; for that you'll have to enter the 5 listening rooms. The classic Linn LP12 was on display as well.

As for the music rooms (5 in total), I am impressed by the sensible and pleasing room treatments... Each room has a wall-mounted iPad running control software along with ethernet wired network streaming. Plenty of tracks available to play and presumably if all the rooms are connected to the same central server, one could play the same music and assess sonic differences originating from the same mastering. Computer-based music servers are obviously here to stay. Again, the noise level was a bit too high that night to appreciate the music playback but overall no complaints!

Here is the Bryston / B&W room. A few jazz and rock tracks were played. I was there when Lorde's "Royals" (from Pure Heroine of course) was playing... Good rendition. Although those B&W 802 Diamonds are capable of reproducing the low end quite well, I noticed that it didn't sound as "precise" as I have heard this track with a good sub; seemed just a little bloated to me even listening at what I thought would have been the sweet spot (too close to wall?). Something else I need to double check with my next visit. Listeners were suitably impressed nonetheless and I suspect many audiophiles have not heard this track with a system capable of "flat" response to 20Hz. I don't remember what was being used to render the music stream but there was a dCS Puccini CD/SACD player in there if anyone is interested in expensive disk spinners these days.

Another room featured the Vienna Acoustics Liszt speakers (~$15k/pair). Again, very nice room and sound quality... Forgot to take note of what amp they were using. They were playing an LP at the time interestingly enough. There was also a room featuring Totem speakers; in this case the Element Earth (~$9k/pair) on the right, I believe driven by Naim electronics. I was quite impressed by the tonal balance on acoustic and bass response from percussion tracks out of those little guys! They're smaller in person than my impression looking at the picture. Demoing the Totem was none other than the founder Vince Bruzzese... We had a nice chat about their speakers and design of the Element line. Personable guy, enthusiastic, unostentatious - I like that! I think most folks came out of that room impressed by these Canadian speakers.

The "highest end" room in terms of cost was the Sonus Faber Lilium (>$65k/pair - sorry for poor focus, you know, alcohol and all...) driven by dual McIntosh MC2301 300W tube monoblocks (>$20k/pair). They were playing light classical at the time; nothing that I thought really challenged the amp/speakers. For anyone wondering about the "grill" in front of the Lilium, it's just soft string-like material so IMO there's no real protection for the drivers behind... Not something you'd want to put in the family room when friends with kids come over for a visit! :-)

The Mac Rack...
Speaking of McIntosh, here's one for the "high-end" computer audiophile:
Looks like we have an MXA70 (50Wpc) integrated amp on the left I think with XR50 bookshelf speakers. I didn't get to hear this setup as it was in the main hallway with many folks hanging around and chatting.

One of the rooms featured NAD and Bluesound (the server "Vault" was to the right just outside the picture). Both NAD and Bluesound are divisions of Lenbrook Industries so it made sense they were paired. If I'm not mistaken, that would be the B&W CM6 S2 to the right (~$2k/pair). I'll have to come back again to check out the NAD streaming devices especially that Masters M12. It looks like it has the BluOS module for streaming installed. The NAD rep was a pleasure to chat with as well. I'm certainly not about to change my Logitech Media Server (Squeezebox) system soon since it's working so well over the years (currently using one of the LMS 7.9 builds, >100 days uptime on the Windows server), but the BluOS control system seemed well thought out.

So, apparently this new store is built around the "sensory" retail Bang & Olufsen concept... Not sure what the specific elements are in the design here or how it compares to the New York or Copenhagen stores but it is well laid out. Of course we have B&O lifestyle products on display. They look good...

Those BeoLab 5's (active, Class-D, ~$16k/pair?) on either side of Paul McCartney are positively futuristic looking and filled the room nicely with sound; not sure how good they are with soundstaging or accuracy however. They were just playing some B&O promo material. I've seen mixed reviews.

Finally, what is an audiophile store without gadgets like cartridges and of course cables?

There you go. The AudioQuest USB and ethernet Diamonds. Yeah...

It was funny seeing the wives and girlfriends hanging outside the showroom entrance while the men mingled amidst the audio gear. Partly makes up for all the times you see guys just sitting on mall benches when the girls go clothes/shoe shopping I guess.

Dear readers, see, even an objectivist can have fun in this hobby :-). It's also great to see that there are currently plans for the first Vancouver Audio Show this May 8-10 - I might check that out if I'm in town around then. It will be great to see Vancouver increase in audiophile hobby prominence situated where it is with a huge influx from Asia in general and China specifically. Nice to have a store such as this to visit once in a while; especially not far from home...

A classy party for a classy new store... Bravo HiFi Centre!


Off for some R&R over Spring Break with the family in Texas. I just hope it warms up down there! If anyone has recommendations for a good hi-fi or music store (used vinyl!?) to check out in San Antonio - let me know.

As always... Hope you're all enjoying your music.

Sunday, 1 March 2015

MUSINGS: Audio Quality, The Various Formats, and Diminishing Returns - In Pictures!

Let me be the first to say that graphs and charts where audio formats are plotted out in terms of unidimensional sound quality ratings are ridiculously oversimplified and can be very misleading! However, they can be fun to look at and could be used as bite-sized "memes" for discussion when meeting up with audio friends or for illustration when people ask about audio quality.

Since they're out there already, let us spend some time this week to look at these visual analogies as a way to "think" about what the authors of these works want us to consider/believe. I'm going to screen capture without permission a couple of these images to explore. As usual, I do this out of a desire to discuss, critique, and hopefully educate which I consider "fair use" of copyrighted material; as a reminder to readers, other than a tiny bit of ad revenue on this blog (hey, why not?), I do not expect any other gain from writing a post like this.

First, here's PONO's "Underwater Listening" diagram released around the time of the 2014 SXSW (March 2014):
PONO: Underwater Listening
Others have already commented on this of course (here also). I don't know what ad "genius" came up with this diagram, but it is cute, I suppose. I remember being taken aback by this picture initially as it's so far out of "left field" (creative?) that I felt disoriented when I first saw this thing...

How audio formats would evoke a desire to compare underwater depths remains a mystery to me. Obviously, there's a desire to impress upon the recipient two main messages - a direct correlation between sampling rate (from CD up) with quality, and to make sure the MP3 format gets deprecated as much as possible (1000 ft?!). On both those counts, this image gets it so wrong, it's almost comedic. Clearly, one cannot directly correlate samplerate and bitrate with audio quality because the relationship isn't some kind of linear correlation. Why would CD quality be "200 ft", and 96kHz "20 ft"? Surely nobody in their right mind would say that 96kHz is 10 times perceptually "better". Sure, there is a correlation such that a low bitrate file like 64kbps MP3 will sound quite compromised with poor resolution, but without any qualification around this important bitrate parameter, how can anyone honestly say that all MP3s sound bad? I might as well say that Neil Young's a poor-sounding recording artist because the Le Noise (2010) and A Letter Home (2014) albums are low fidelity.

I suspect that the PONO camp must be a bit ashamed of this diagram since I don't see it around anymore and I don't find it on their website (might have missed it). I don't think the "underwater" diagram made many friends nor sold many machines in any case...

Here's a more recent chart from Meridian circa late 2014:
Meridian: History of audio quality & convenience?!
From this, we "learn" that "downloads" have poorer quality than CDs (always?!). Also, I "learn" that LPs sound significantly better than "DVD-A/SACD" (and by extension high-resolution audio). But the most important point is that current streaming audio sounds worse than cassette tapes in quality. Does that make sense to anyone? Is this saying that streaming Spotify, Tidal, Qobuz, etc. customers are so hung up with convenience that they're willing to listen to sound quality worse than an 80's Walkman?

Of course this is the myth that they primarily want to perpetuate because guess what... Buy this "revolutionary" Meridian MQA and that'll make streaming sound awesome!

While in some cases, sure, we can say a very poorly encoded 192kbps MP3 download (like something done in 1999 with XING MP3) could sound significantly worse than CD and a 64kbps stream can be worse than an excellent cassette copy, like the PONO "artwork" above, there are some truly horrible gross generalizations here! Many LPs sound poor due to low quality pressings, many downloads are qualitatively superb, and clearly any reasonable music streaming service sounds better than a cassette tape - who's kidding whom?! Furthermore, a high resolution digital master (like with high-res downloads or encoded on DVD-A/SACD) has the capability to be more accurate than reel-to-reel tape, but of course subjectively, analogue tape can add its own unique signature/color/distortion that can be preferred... (To be able to mix in the digital domain without generational fidelity loss compared to analogue tape is obviously a big plus.)

Of course, it's easy for me to just criticize without putting something forward... Therefore, please allow me to add for your consideration my submission to the "overgeneralized sound quality vs. audio format graph":

It's a graph of the law of diminishing returns in terms of audio technology and sound quality. I think it's important to take into account the fact that hearing ability is obviously NOT infinite. Due to biological phenotypic variation, there's probably a bell-shaped curve to hearing ability as well as moment-to-moment fluctuations in acuity which is represented by the "Zone of max. auditory acuity" gradient [See comments: probably more of an asymmetrical negatively skewed distribution]. Depending on a person's maximum hearing ability, the 100% point will shift up or down relative to another but let's keep this graph simple and say that for any individual, we can only hear up to 100% based on how we're endowed. Day to day, our hearing acuity changes - everything from current stress level affecting the ability to attend to the sound, to ear wax, to allergies, to sinus/ear infections, to noise induced hearing loss, to tinnitus, to age will result in a decline in the maximum acuity (some of this sadly irreversible). Obviously, mental training can help improve how well we attend and pick up subtle cues.

The Y-axis therefore represents the "Perceived Fidelity" up to 100%. Exactly how fidelity is measured is not important in this simple diagram, but it would obviously include frequency response, dynamic resolution, low noise floor, and low distortion (including timing anomalies), assuming the same mastering of a superb quality recording is used for all formats. On the X-axis, we have "Effective Uncompressed PCM Bitrate" as a measure of approximately how much data is used for encoding the audio. This is a proxy for how much "effort" is needed to achieve the level of fidelity. Note that the scale is logarithmic rather than linear, to correspond to the logarithmic perception of frequencies and dynamic range. More data, more storage, and more "effort" are needed to achieve any improvement in perceived quality as we approach the plateau at the right of the graph.

As you can see, the curve plateaus since we obviously cannot hear beyond around "100%". At some point, it really does not matter how much data we use to encode the sound, there just will not be any significant perceivable difference and all we've done is wasted storage. The big question of course is at what point along this curve do we place the capabilities of the various audio formats.

Starting with good old CD, controlled trials have shown little evidence that higher resolution sounds much better (see discussion here). Therefore, I think it's reasonable to put it at point (1), which is quite far along the curve already - this would correspond to the 16/44 stereo PCM bitrate of ~1.5Mbps. It's very close to the 100% point - I don't think it's unreasonable to say around 95%, so there is a possibility for some improvement. Where exactly this lies is not that important (it could be 90%, for example); the main idea is that qualitative gains beyond the CD format are not going to be massive. As we go higher to 24/96 (~5Mbps, point 3) and 24/192 (~10Mbps, point 5), we achieve essentially 100% perceived quality, and for all the effort in terms of bitrate/file size, relatively little is gained. Although mathematically these high-resolution formats can capture more frequencies and greater dynamic range, the actual auditory benefits are limited.
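For anyone who wants to check the X-axis numbers, here is a minimal sketch of the arithmetic, assuming stereo and the usual 44.1kHz rate for what I've been calling "16/44". The function name is just for illustration. The figures quoted above are rounded versions of these results.

```python
def pcm_bitrate_bps(bit_depth, sample_rate_hz, channels=2):
    """Uncompressed PCM bitrate in bits per second."""
    return bit_depth * sample_rate_hz * channels

# (bit depth, sample rate in Hz); stereo is assumed throughout
formats = {
    "16/44.1 (CD)": (16, 44_100),
    "24/96": (24, 96_000),
    "24/192": (24, 192_000),
}

for name, (bits, rate) in formats.items():
    mbps = pcm_bitrate_bps(bits, rate) / 1_000_000
    print(f"{name}: {mbps:.2f} Mbps")

# 16/44.1 (CD): 1.41 Mbps
# 24/96: 4.61 Mbps
# 24/192: 9.22 Mbps
```

Going from CD to 24/192 costs roughly 6.5 times the data, which is the "effort" the logarithmic X-axis is meant to convey.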

Where does DSD sit in this? Realize that 1-bit DSD isn't as efficient as PCM (a description I've seen calls each bit of DSD an "additive" refinement to the sound, versus a "geometric" refinement with multibit PCM). Furthermore, noise shaping shifts the quantization noise into the higher frequencies, resulting in non-uniform dynamic range across the spectrum; this is generally not a problem because hearing sensitivity also drops in the higher frequencies. From what I have heard and through examining DSD rips, I think that DSD64 is better (more accurate) than CD but not by much (I personally think 21-bit/50kHz PCM, ~2Mbps, is good enough for DSD64 conversion and avoids encoding all that excess noise), whereas DSD128 is just short of 24/96 but very close. Note that this inefficiency in DSD encoding screams for the use of compression, which I argued a couple of years back should really be implemented in DSD file formats.
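To put rough numbers on that inefficiency, here is a small sketch comparing raw stereo bitrates. It assumes DSD64 and DSD128 mean 1-bit sampling at 64x and 128x the 44.1kHz CD rate; the function names are my own and purely illustrative.

```python
CD_SAMPLE_RATE_HZ = 44_100  # DSD rates are defined as multiples of this

def dsd_bitrate_bps(multiple, channels=2):
    """Bitrate of 1-bit DSD at `multiple` x 44.1 kHz (64 for DSD64)."""
    return 1 * multiple * CD_SAMPLE_RATE_HZ * channels

def pcm_bitrate_bps(bit_depth, sample_rate_hz, channels=2):
    """Uncompressed PCM bitrate in bits per second."""
    return bit_depth * sample_rate_hz * channels

dsd64 = dsd_bitrate_bps(64)
dsd128 = dsd_bitrate_bps(128)
pcm_21_50 = pcm_bitrate_bps(21, 50_000)

print(f"DSD64:     {dsd64 / 1e6:.2f} Mbps")      # 5.64
print(f"DSD128:    {dsd128 / 1e6:.2f} Mbps")     # 11.29
print(f"21/50 PCM: {pcm_21_50 / 1e6:.2f} Mbps")  # 2.10
print(f"DSD64 vs 21/50 PCM: {dsd64 / pcm_21_50:.2f}x")  # 2.69x
```

In other words, DSD64 spends more raw data than 24/96 PCM (about 4.6Mbps) to land, in my estimation, only a little above CD on the fidelity axis.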

So what about lossy compression in terms of perceived fidelity? Considering that there has not been good data to demonstrate that many people can differentiate high bitrate MP3 from lossless PCM, I have no issues placing it just shy of CD quality. To keep the graph clutter-free, I just used a single line to denote the MP3 320kbps quality, even though I recognize that there could be a wide range to the fidelity depending on the quality of the encoder and the demands of the music. There are special cases, usually containing high frequency content, that can demonstrate limitations with high bitrate MP3, but these are rare and generally will not be evident in actual music. You might ask "why is 320kbps MP3 equivalent to ~1.5Mbps uncompressed PCM!?" The answer lies in the psychoacoustic techniques employed. Sure, there is significant data reduction, and yes, you can hear what was removed when it is taken out of the context of the rest of the audio (as in "The Ghost in the MP3" project). However, the data removal was done with sophisticated algorithms informed by models of human hearing. As encoding algorithms have improved, so too has the sonic quality of the resulting MP3s over the years. This is a good example of how you cannot compare bitrates directly; the way the data is being encoded is obviously very different! And sadly PONO advertising doesn't seem to understand this when they keep using diagrams like this:

Just because a lot of data is used doesn't mean there's much benefit, even if the recording were done in true high resolution. By the time we get to 24/192, we're way into the zone of diminishing returns and may in fact, as some have suggested, have entered a point where the sound quality suffers because of potential intermodulation distortion from ultrasonic frequencies and because some DACs may no longer be functioning in an optimal fashion. The fact that technologically we can get this far into the curve is also a reflection of the state of maturity of audio science. Personally I remain partial to 24/96 as a "container" for the highest resolution humans will ever need; one which is already standard on both recording and playback equipment.

Finally, as I indicated in a previous post, vinyl has limitations. Yes, it can of course sound great but there are limitations to accuracy (including differences for outer grooves vs. inner grooves), higher overall distortion, and material imperfections. As a result, there will be a wide range to the sound of LP playback as identified in the graph. Perceived fidelity compared to the original source would be lower but also remember that just like the reel-to-reel tape discussion above, some of the distortion and coloration could be "euphonic" as well - hence preferred by some (many?).

I'm sure a graphic artist could produce a much more pleasing image than what I kludged together above :-). Like the PONO and Meridian pictures, it's simplistic, but I think it's a more realistic representation than the others.

Notice that the Meridian graph above tries to suggest that there has been deterioration of potential sound quality over time (especially when they suggest streaming quality is like cassette tape!). I've seen a number of people parrot this same idea in magazines and forums. I think this is nonsense. Consider that even free Spotify is streaming with Ogg Vorbis 160kbps on the desktop (still very good!). With a premium account, you get 320kbps. And sites like Tidal already do lossless 16/44 FLAC. We're looking at quality either reasonably close or identical to CD quality. Here's my version of the chart:

As you can see, I don't believe there has really been any inverse correlation between sound quality and convenience over time. Note the drop in convenience from CD to DVD-A/SACD, which I don't think is a big deal since many DVD-As play in regular DVD players and are easy to rip now (dead format anyway), plus SACDs are often hybrid and play on standard CD players (and can also be ripped these days with some inconvenience). The shift from physical media to "virtual" digital data storage has been tremendously convenient, although it brings with it a new skill set - file management, proper tagging, and of course managing backups. Now the shift towards streaming has become even more convenient and "mobile" through wireless data networks (but there's limited ability to customize and tag one's collection, and less of a sense of "ownership" of the music - a problem if one is a "collector"). As far as I'm concerned, the only real qualitative decline was from LP to cassette tapes, where convenience in terms of portability improved (can listen in cars and Walkmen, less need for cleaning, but no random access song selection, which is why I gave LP a 50 and cassette only an increase to 60 overall). I believe streaming just needs a little more bandwidth, and if we can reliably get 24/48 FLAC streaming, we will achieve a quality and convenience beyond what most music lovers and audiophiles would feel they "need" (we'll see if MQA really offers much more). Of course, there's always that desire to have physical artwork and booklets to thumb through while listening to the music - vinyl remains the "king" of album art in that regard.

One final comment to those who feel that, just because folks like myself do not believe high bitrate MP3 sounds substantially different from lossless 16/44, I'm somehow "advocating" for lossy audio. That's not exactly true, since I don't think anyone would deny that lossless formats are superior for the best accuracy / fidelity. I still prefer FLAC as my archive format because then I can convert to whatever other format I want without multigenerational lossy degradation. However, I do believe MP3 is the way to go with cars and portable audio, even if they support lossless and high resolution. High bitrate MP3s are quick to transfer, take up less space, and there's just no way I will be able to hear a difference in my car or walking down the street. I personally find high-resolution lossless files (or God forbid uncompressed DSD) on a phone or portable device extremely wasteful even if storage size were not an issue. MP3 (and similar formats like AAC, WMA, Vorbis...) has its place as a tool for high quality compression, and there are many applications where it's all one ever needs to get the job done completely. Plus MP3s are universally supported.
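To give a rough sense of the "take up less space" point, here is a back-of-envelope sketch. The 4-minute track length is an assumption picked for illustration, sizes are in decimal megabytes, and the PCM figures are uncompressed; FLAC would come in smaller, by an amount that depends on the music, so it isn't modeled here.

```python
def size_mb(bitrate_bps, seconds):
    """Approximate file size in decimal megabytes for a constant bitrate."""
    return bitrate_bps * seconds / 8 / 1_000_000

TRACK_SECONDS = 4 * 60  # hypothetical 4-minute track

bitrates_bps = {
    "MP3 320kbps": 320_000,
    "16/44.1 PCM (uncompressed)": 16 * 44_100 * 2,
    "24/96 PCM (uncompressed)": 24 * 96_000 * 2,
}

for name, bps in bitrates_bps.items():
    print(f"{name}: {size_mb(bps, TRACK_SECONDS):.1f} MB")

# MP3 320kbps: 9.6 MB
# 16/44.1 PCM (uncompressed): 42.3 MB
# 24/96 PCM (uncompressed): 138.2 MB
```

That works out to about 4.4 times the data for uncompressed CD-quality audio compared with 320kbps MP3.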

Bottom Line: Remember the principle of diminishing returns as we're dealing with mature audio technology and limitations of the hearing apparatus. It's important to keep this in mind when assessing the promise of "new technology" and manufacturer claims such as the diagrams above.

(Did anyone see any critical comments from the audiophile press about PONO or Meridian's ad material above? How about Sony's 64GB "Premium Sound" SD card recently? There sadly seems to be a lack of critical thinking in much of the audiophile reporting these days, which only serves to isolate this hobby and solidify the concept of the pejorative "audiophool".)


Regretfully, I missed a live performance by Cécile McLorin Salvant here in Vancouver last weekend. A friend went and thought the performance was amazing! She seems to be channeling a young Ella...

Check out her albums Cecile (2010) and the Grammy-nominated Womanchild (2013) if you like jazz vocals.

Enjoy the music...