What do we really mean when we talk about "loudspeaker sound"? We've said it – we've heard it – but what should a good loudspeaker sound like?
If we are being true to the goal of seeking higher fidelity, then an ideal loudspeaker should have no "sound" at all. In other words, it should be completely transparent, an open door to the musical performance. An accurate speaker's role is to replicate the sounds of musical instruments or voices exactly as they were captured in the original recording. The speaker shouldn't add any sound of its own–it shouldn't make violins and other strings sound screechy or edgy, nor should it add artificial bass emphasis that makes male singers' voices sound too bassy or "fat" (a common fault of many speakers).
So what sonic clues should immediately become evident? What attributes do we listen for, and what weaknesses should we be mindful of? Let's start with the midrange, which is where most musical content resides, and where our hearing is by far the most sensitive. (Yes, I know we all love bass, but a speaker must reproduce the midrange smoothly if we are ultimately going to like it.) If a loudspeaker nails the midrange, without harsh-sounding peaks or dips that make the mids sound muffled and distant, your ears will immediately accept it as natural and "musical." If not, you'll reject it as tonally false or "colored," and music won't sound realistic. We've all heard speech, and we're familiar with the sound and nuances of male and female voices. We've also grown up hearing pop music, which mostly features vocalists or groups of singers, as well as live choruses in schools or church, or even the national anthem at sporting events. Therefore, a good place to start is with a good CD of an individual singer or a group.
If it's a female vocalist, does the voice sound smooth and clear? On the other hand, the vocal may sound too clear, with exaggerated "s" and "t" sounds (that's called sibilance), which can tip you off to a speaker with a midrange peak that makes female voices sound sharp or shrill. Make sure you choose a recording that's smooth and natural to begin with–many are not–in order to properly judge whether a loudspeaker can accurately reproduce that voice. An older CD with a natural female vocal is Jennifer Warnes's Famous Blue Raincoat. Norah Jones's voice on the 5.1-channel DVD Live in New Orleans is a current recording that very naturally captures her vocals, without excessive sibilance or brightness. Likewise, Diana Krall's Live in Paris is well engineered. If you are listening in stereo, the vocal should be naturally placed with the other instruments; it shouldn't sound dull, muffled, or set farther back. A distant-sounding vocal is a sure sign of a speaker's reduced or recessed midrange (sometimes termed "laid-back"), which will place midrange sounds farther away and make them less distinct.
A choral recording of men's and women's voices is an excellent test of midrange clarity and detail: Can you separate the four parts of a chorus–the male basses and tenors, and the female altos and sopranos? The sopranos are the highest-pitched female voices; the altos are lower. You should be able to hear each section of a choir clearly. Speakers with depressed midrange response make all choirs sound somewhat muffled and blurred, woolly or "fuzzy." Almost any modern CD of the "Hallelujah" chorus from Handel's Messiah will do, but lately I've been using a wonderfully natural CD of English composer Edward Elgar's Scenes from the Bavarian Highlands, with the London Symphony Chorus and Orchestra (Chandos 9436). Try track 10, "The Dance" (listen for the triangle–it should have an ultra-clear metallic ring, which is exactly how it sounds on Axiom M22tis) and track 15, "The Marksmen." Listen for each section of the chorus. This disc also features a very spacious and open sound with great horns, strings and percussion (try playing it back in multichannel through Dolby Pro Logic II or Logic 7 processors; it decodes wonderfully) and very convincing hall sound and depth. And there are loud, exciting musical climaxes, with plenty of cymbals and drums. Good speakers should remain clear and not get muddy during the loudest peaks in the music. (Incidentally, this CD is a great choice for anyone who finds some older classical works boring and ponderous, and would prefer an introduction to Elgar's richly melodic and lively 20th-century music.)
A well-recorded male vocal, such as Harry Connick Jr.'s voice on the soundtrack CD from When Harry Met Sally (any track except track 1, which has harsh trumpets) or James Taylor's concert DVD Live at the Beacon Theatre, should be smooth and natural, with no fuzziness or bass emphasis. Speakers with an elevated upper-bass hump will make most male voices sound thick or fat, rather than natural and real. These recordings also have excellent acoustic and electric bass lines, respectively, with crisp-sounding cymbals and good drums. On loudspeakers with good bass response (even some bookshelf models like Axiom's M3ti or M22ti have clear, smooth bass to about 40 Hz) you can follow each individual bass note. A speaker with poor or uneven bass output will make electric or acoustic bass sound like a dull thump, with individual notes hard to distinguish.
Some classic rock recordings like Dire Straits' Brothers in Arms are very well recorded, with natural-sounding vocals, deep bass, and guitar lines that aren't harsh. Likewise, Eric Clapton's Unplugged DVD is an engineering standout. You will likely have your own favorites that you've heard so often you'll just know when they sound "right" on good speakers.
If you are considering speakers mostly for home theater use, the same standards of fidelity apply. A speaker that is smooth and accurate on music alone will be just as neutral and transparent with movie soundtracks. There is no separate category of speakers that are "good for home theater" or "good for music."
Center-channel speakers pose a particular challenge to any manufacturer, because the tonal (timbral) match between the center and the front left and right main speakers is very important. If there isn't a reasonably smooth tonal match, characters' voices will shift tonality as they move across the stereo soundstage, and the seamless blend of dialog with music and sound effects will be broken. So listen to centers for any "boxy" effects. A speaking voice heard through a good center shouldn't sound like a disembodied voice inside a box. It should seem natural and not change tonal quality as the actor moves across the stereo stage from left to right, or vice versa.
I asked Tom Cumberland, an Axiom colleague and an experienced engineer involved in the audio business for many years, how he judges loudspeaker sound: "As an electronics guy, I always listen for listening fatigue, which shows up as non-linearities in a loudspeaker's sound. With amplifiers, listening fatigue can result from distortion and other artifacts. With speakers, non-linearities show up as peaks and dips in the frequency response as well as other distortions."
"You can't just listen for a few minutes," notes Tom. "You have to listen for an hour to music or watch a whole movie. If the speakers cause listening fatigue, it will show up with longer-term exposure."
I also queried Ian Colquhoun, the founder of Axiom, on what he listens for when assessing prototype speakers:
"As the designer of Axiom loudspeakers, I am really analyzing and comparing the connections between the data from double-blind listening tests to laboratory acoustical measurements. For those of us who do extensive listening, we become adept at picking out fairly quickly when the loudspeaker is coloring the sound. For most people who do not listen to loudspeakers for a living, this process can take some time. Regardless of the experience level, however, the individual results are shockingly consistent from double-blind listening tests. An inexperienced listener, and even listeners who claim they are tone-deaf, eventually will come to the same conclusion as the experienced listener. This process lends itself to listeners becoming more and more satisfied with the sound of a good speaker over time and more and more frustrated with a bad one.
"An analysis of why this happens shows that it is related to amplitude and balance in relation to the source material. Even using source material that is colored on the original recording–and there are lots of recordings like that–it will still sound as good as it is ever likely to sound when reproduced on properly designed loudspeakers. This is because the odds of a colored loudspeaker lining up in exactly the inverse coloration of a bad recording is simply impossible in the long run. The amplitude, or volume level, has a dramatic effect on the blind listening test. Two identical loudspeakers with one set just slightly higher in amplitude will consistently result in the louder one winning the listening test. Since a colored loudspeaker will have broad variations in amplitude response throughout the audible frequency range there will inevitably be certain sections of the performance that are unnaturally louder than they should be. The inexperienced listener may conclude early on in the listening session that this is a good thing because it is a louder thing, but as various source materials are used and as various instruments or vocals on the same source material coincide with this unnaturally loud frequency region, the fatigue and the imbalance begin to become apparent and undesirable."