- Looking good on stage
- Dancing (sometimes also factored in when people discuss "talent" - but rarely)
- Looking good on TV
- Stage presence
- Looking good in front of still cameras (i.e. modelling)
- Talking to the media (if you don't think this one matters, just ask anyone in KARA)
- Looking good in a versatile way for different outfits to use in different concepts
- Emotional labour (what air hostesses do - keeping the happy facade up, smiling constantly when you've had a shit day, etc. - ask f(x) about this one)
- Looking good even in airports
- Fellatio technique (just ask [insert your bias here])
- Looking good even in a car accident (or else)
- Behind-the-scenes talent that supports your talent (songwriters, producers, stylists, choreographers)
- Looking good at all times
The obsession with vocal performance always seems odd to me when we're talking about the idol pop genre - a genre of music where the #1 most successful female idol group of all time was The Spice Girls, and the most successful solo singers of all time in their respective genders were Elvis Presley and Madonna. I thought it would have been fairly obvious to anybody with their eyes even half-open that outstanding vocal performance was kind of an optional requirement at most. In this context it's easy to see why k-pop companies don't bother to train their artists too hard in vocal performance; they've sensibly worked out through market research that it's not something that's really needed.
Nevertheless, this doesn't stop a bunch of armchair douchebags obsessing about vocal quality anyway, and one of the favourite tools that they use to pick apart vocal performance is the "MR removed" mix. This mix is created by ripping out all the backing track with audio editing software, so you can just hear (and snidely judge as if your opinion is in any way relevant) the vocals on their own. OMG THE TRUTH IS REVEALED, AMIRITE?
Well, not exactly. The problem with this is that the results are typically not indicative of the true vocal performance, or anything else for that matter, except how much time some bored Starcraft player has to fuck around with sound waveforms and make them sound like you're listening to your neighbour's TV set from inside a toilet bowl.
So what's the problem?
To understand why MR removed videos are essentially completely fucking useless as a tool to evaluate singers, we first need to understand how the software that removes backing tracks works. The rest of this blog post is going to get a bit fucking technical, but there's really no way this can be avoided. I'll do my best to explain it all in language that any 11-year-old EXO fan can understand.
There are two techniques that are used to create an "MR removed" mix, and we'll discuss them (and the associated problems) separately.
1. PHASE CANCELLATION
All sound is the vibration of molecules. When a sound is generated from a singer, the vibration of the singer's vocal cords from side to side vibrates nearby air molecules which also start going from side to side. These molecules bumpity-bumpity onto other air molecules until they eventually get to your ear, where they bumpity onto the hairs in your ear that then also start going from side to side. Because your brain is ultra-clevery-smart and stuff, it then converts those hair movements into brainwaves and that's how you hear "ULF NEGA ULF AWOOOOOOOO".
We can chart molecular motion of sound onto a graph, like this:
The horizontal axis is time (in fractions of a second), and the vertical axis is amplitude. From the center 0.0, the molecule moves up, and then down, up and down...
The result is a waveform of sound that you can hear. But what would happen if we had two waveforms, and they were exact opposites of each other?
As you can see, the second waveform we've now added below goes up where the first one goes down, and vice versa, just like I would if I was lying down on that stage while Eunjung bopped up and down over my lap. The second signal is what is called an "out of phase" signal, as its wave motion is 180 degrees different from, or "out of phase" with, the original wave motion - in other words a total opposite, like how the 180 degree point sits at the opposite end of a protractor from the zero degree point.
In this case, the two waveforms, being opposites, would mathematically cancel each other out, and if you played them both as they are displayed here, you wouldn't hear any sound at all - even though the sound is still being generated, it's being generated in equal-but-opposite directions. This effect is known as "phase cancellation". This is how active noise-cancelling headphones work, and it's also part of how the cancellation in "MR removed" MVs works.
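None of this is magic, and you don't need fancy audio software to see it happen. Here's a minimal sketch in plain Python (no audio libraries - the numbers just stand in for molecule positions, and the 440 Hz tone and sample rate are arbitrary example values) showing that a waveform plus its exact opposite adds up to silence:

```python
import math

RATE = 44100  # samples per second (standard CD quality)
FREQ = 440.0  # an arbitrary test tone (concert pitch A)

# One second of a sine wave - our stand-in for a sound...
wave = [math.sin(2 * math.pi * FREQ * n / RATE) for n in range(RATE)]

# ...and its exact opposite: every sample flipped, i.e. 180 degrees out of phase.
inverted = [-s for s in wave]

# "Playing" both at once just means adding them together, sample by sample.
combined = [a + b for a, b in zip(wave, inverted)]

# Equal-but-opposite motion cancels perfectly: the result is dead silence.
peak = max(abs(s) for s in combined)
print(f"peak amplitude of combined signal: {peak}")  # prints 0.0
```

That sum-to-zero is exactly what noise-cancelling headphones do in real time: measure the incoming wave, flip it, play it back on top.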
So let's apply this to k-pop. Say you've got a live recording of Dal Shabet's new song "Molest Me On The Subway, Oppa".
Let's also say that because you're a big fan, you've also got the studio version.
Since you know that the group just sings along to the studio recording on the live stage, by combining the two as above, lining them up just right, and then inverting the waveform so that the studio version is out of phase, the studio version's audio should cancel out its own matching waveforms inside the live broadcast that the girls are singing along with, just leaving the "difference" - which is the live vocals and any cheering, right?
Wrong - as you can hear. There's all sorts of weird crusty shit in the mix, for a start - yes, the main audio track is cancelled, but the reproduction of it in the TV studio has a different ambience which changes the sound slightly, and those differences can still be heard, including not just the effect of whatever speaker system they're using in the studio, but also any reflections of sound that are bouncing off the back walls and back into the microphones.

Also, half the vocals are actually missing - what's with that? Is it because the girls are so busy dreaming of all the clit-rubbing action they're going to get next time they take the subway that they just chose not to sing some of the syllables? Not likely (sadly). The problem with phase cancellation is that it acts across the whole mix, not just the bits you want it to act on, so if you're singing along to the vocals on the backing track, then every time the waveform of your voice becomes equal-but-opposite to the out-of-phase waveform of your voice that some Starcraft nerd is using to perform the cancellation, your voice gets cancelled out as well. Oops.

Paradoxically, the more true to the original recording your vocal performance is, the more likely this is to happen, and the phase-cancelling software will cancel a big chunk of your voice out almost completely. So when you're hearing an "MR removed" mix and the voice is kind of fading in and out and sounds really weak, that could be because that person is singing really poorly, or it could be because they are singing a little bit too well - so close to what's on the recording that a large chunk of it is being cancelled, which is of course exactly what the software is trying to do. Unless you were actually in the studio controlling those levels, there's no true way of knowing which one of these possibilities is true.
If backing tracks don't actually contain the voice itself, then this isn't a problem, and the phase cancellation works a lot better. However, if the backing tracks don't contain the voice, you don't exactly need an MR removed version anyway, for obvious reasons - you can already hear the vocal on its own.
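The "sing too well and get cancelled" paradox is easy to demonstrate with the same sort of toy sketch. Below, two pretend "live takes" get phase-cancelled against the studio vocal baked into the backing track - one take is note-perfect, the other is slightly flat. All the frequencies are made-up example values; a real voice is obviously more complicated than one sine wave:

```python
import math

RATE = 44100
N = RATE  # one second of audio

def tone(freq):
    return [math.sin(2 * math.pi * freq * n / RATE) for n in range(N)]

def rms(sig):
    # root-mean-square: a rough measure of how loud a signal is overall
    return math.sqrt(sum(s * s for s in sig) / len(sig))

# The studio vocal baked into the backing track: a 440 Hz "voice".
studio_vocal = tone(440.0)

# Live take 1: the singer nails it - pitch and timing identical to the record.
perfect_live = tone(440.0)

# Live take 2: the singer is 3 Hz flat - close, but not identical.
flat_live = tone(437.0)

# Phase cancellation: subtract the studio vocal from each live take.
residual_perfect = [l - s for l, s in zip(perfect_live, studio_vocal)]
residual_flat = [l - s for l, s in zip(flat_live, studio_vocal)]

print(f"residual when singing perfectly: {rms(residual_perfect):.4f}")  # 0.0000 - voice erased
print(f"residual when singing flat:      {rms(residual_flat):.4f}")     # nonzero - the off-pitch take survives
```

The better singer comes out of the "MR removed" mix quieter. That's the whole scam in two lines of subtraction.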
Let's move on to our second useless technique that doesn't work all that well for removing backing tracks from vocals, so we can understand why it also sucks:
2. STEREO BANDPASS FILTERING
Sometimes, you just ain't got a studio version. Maybe it doesn't exist, because it's a one-off never-to-be-repeated live performance of some song that this artist doesn't normally do. On the other hand, maybe it does exist but you don't have access to it because you're anti this artist and you wouldn't buy their stuff; you just want to make an MR removed mix to prove to the rest of the world why they shouldn't buy it either, in the vain and futile hope that this will actually affect the artist's bottom line, because you suck and should be destroyed. Or perhaps you've already done the phase cancellation but there's still a crapload of noise in the background and you want to get rid of more of it so your pristine vocal track shines through so you can hear how shit it is.
Now, common conventional audio mixing wisdom dictates that both vocals and instruments in an audio mix need "room", which means you've got to find somewhere in the audio field to put them, otherwise you can't hear everything clearly. Let's look at a visual representation of an audio field.
Now let's separate our field into areas, so we know what we're dealing with. The vertical axis of our field is the "frequency field", which means the pitch of our instruments and voices. High sounding things go up the top, low sounding things down the bottom.
Now let's add stereo. We'll conceptualise our sound as being either in the center of the stereo field (coming out equally through both speakers) or it will be panned either "hard left" or "hard right", and we'll use the horizontal axis to represent this.
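In case "center of the stereo field" sounds abstract: center just means the exact same signal coming out of both speakers, and hard left means it's only in the left one. Here's a toy sketch (plain Python, made-up frequencies) of why that matters for filtering - the *difference* between the two speakers wipes out centered material completely, while the *sum* keeps it:

```python
import math

RATE = 8000
N = RATE  # one second

def tone(freq):
    return [math.sin(2 * math.pi * freq * n / RATE) for n in range(N)]

def rms(sig):
    # rough overall loudness of a signal
    return math.sqrt(sum(s * s for s in sig) / len(sig))

vocal = tone(500.0)    # dead center: identical in both speakers
guitar = tone(3000.0)  # panned hard left: left speaker only

left = [v + g for v, g in zip(vocal, guitar)]
right = vocal[:]

# "Mid" = what the speakers have in common. The centered vocal comes through
# at full strength; the hard-left guitar is only halved, not removed.
mid = [(l + r) / 2 for l, r in zip(left, right)]

# "Side" = the difference between speakers. The centered vocal cancels
# completely; only the panned guitar remains.
side = [(l - r) / 2 for l, r in zip(left, right)]

print(f"side channel loudness: {rms(side):.3f}  (vocal fully cancelled)")
print(f"mid channel loudness:  {rms(mid):.3f}  (vocal intact, guitar merely halved)")
```

The side channel (left minus right) is the classic cheap "karaoke" vocal remover; keeping the mid channel is the crude reverse of that, and it's roughly what the "filter out the left and right edges" step below amounts to.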
Now, when someone mixes a pop hit what they're attempting to do is fill up all the boxes with "stuff" so they get a nice full-sounding mix, but without anything overlapping. If there are too many things in the one box, they tend to compete for space, so the aim is for a reasonably even distribution of sounds.
A typical result of elements that you might hear in a pop mix:
Dead center is almost always where the main vocal track sits. Seeing as we want to isolate the vocals and hear them on their own, we can apply a filter that cuts out the deep stuff and the high stuff (leaving a "band" of audio in the middle - hence "bandpass", because we let that bit "pass" through and block the rest), and we can also filter out the stuff at the left and right edges of the mix. This should just leave us with the vocal, right?
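For the curious, here's roughly what the frequency half of that looks like in code - a deliberately crude bandpass built out of one-pole smoothers in plain Python. Real MR removal tools use much sharper filters than this, and the 500 Hz "vocal", 50 Hz "bass" and 3 kHz "hi-hat" are made-up example values, but the principle is the same: things inside the band survive, things outside get squashed.

```python
import math

RATE = 8000
N = RATE  # one second

def tone(freq):
    return [math.sin(2 * math.pi * freq * n / RATE) for n in range(N)]

def rms(sig):
    # rough overall loudness of a signal
    return math.sqrt(sum(s * s for s in sig) / len(sig))

def lowpass(sig, alpha):
    # one-pole smoother: the smaller alpha is, the more high end it removes
    out, y = [], 0.0
    for s in sig:
        y += alpha * (s - y)
        out.append(y)
    return out

def bandpass(sig):
    # cut the highs (two low-pass stages), then subtract what's left of the
    # low rumble - only the "band" in the middle passes through
    smooth = lowpass(lowpass(sig, 0.5), 0.5)
    rumble = lowpass(smooth, 0.15)
    return [a - b for a, b in zip(smooth, rumble)]

vocal = tone(500.0)   # vocal-ish register: inside the band
bass = tone(50.0)     # below the band
hihat = tone(3000.0)  # above the band

for name, sig in [("vocal", vocal), ("bass", bass), ("hihat", hihat)]:
    print(f"{name}: loudness in {rms(sig):.3f} -> loudness out {rms(bandpass(sig)):.3f}")
```

Note that the bass and hi-hat are attenuated, not eliminated - cheap filters leak, which is exactly the "bleeding through" you'll hear in a minute.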
Well, yes...actually it works great. Check this original and then the MR removed version:
But whoever made that video wasted their time, because with a recording like that, you don't need the MR removed version anyway - there's no studio version with a vocal backing track for Ailee to sing over the top of, therefore no reason to separate the parts. Whoever made this is just having a "look how good I can make an MR removed video sound" wank.
If we're talking about the more typical k-pop scenario of a singer singing over a backing track that includes their own voice, then we're straight back into shit-filled toilet bowl territory again, because there's no way that bandpass filtering can tell the difference between the studio vocal track and the live one that's been plopped over the top. Most MR removed videos therefore have to use a combination of stereo bandpass filtering and phase cancellation to bring you a result, which in turn butchers all the audio, including the stuff that you're actually there to listen to:
You can hear the guitars and the snare drum bleeding through quite strongly - because these instruments are operating on a similar frequency range and stereo location to Lee Hi's voice. Other instruments you can't hear at all, they're outside the filter range. However, what you also can't hear is half of Lee Hi's actual vocal performance, and what is there sounds like a bunch of warbly crap because half of what actually makes her sound decent has been ripped out along with all the other stuff. If you didn't know who she was you could well be forgiven for thinking that she's no better or worse at singing than anyone in Dal Shabet.
Here's Ailee again, singing to a backing track of her own voice this time, and you'll notice half her voice is actually gone - from 0:40 the audio is a shitfest and she's dropping in volume everywhere:
What a mess, right? Forgetting the fact that this is Ailee, who we do actually know is a genuinely capable singer - the MR removed treatment still makes her sound like garbage.
I hope this blog post has demonstrated to you how MR removed gadgetry doesn't actually do the job it's supposed to. Having said all that, even if it did work, you'd still be an idiot to evaluate someone's vocals that way, for one very obvious reason - why should the way you are not hearing the performance in a live setting take primacy over the way you are hearing the performance in a live setting? Or to put it another way, if a tree falls in the forest, and nobody is around to hear it, are you a cunt for wanting to know the frequency of the sound it made when it fell over and killed a bunch of animals fucking?