There's a tricky balance on response curves and where to put precision. Using an imaging example (because I happen to know more about that): if one does computations with images, those computations often (though not always) work better with a linear measure of photon energy — i.e., twice as much photon energy is represented by a number twice as big. Exposure adjustments are a prime example of where this is the case.

On the other hand, people's perception of light is decidedly non-linear: twice as much photon energy does not necessarily look twice as bright. That makes linear encodings perceptually inefficient, because for a steady linear progression there is a lot less precision in the shadows than perception warrants, and a lot more in the highlights. I may muff the exact numbers here, but what people perceive as middle gray sits at roughly 18% on a linear scale with black at 0% and white at 100%. So if we just stored values ranging 0-100, the darker half of the perceptual range would get only 18 values while the brighter half would get 82.

Imaging professionals bash on JPEG for being an 8-bit format — only 256 distinct levels — but those levels are distributed in a way that is more perceptually uniform, so often they are more than enough. (Where it does get challenged is editing. Almost every image correction loses levels as things get remapped — open up the shadows and you lose levels in the highlights. So 256 well-distributed levels is good representationally but a bit challenged when it comes to large tonal moves.)

Turning back to audio: if MIDI velocities are encoded on a linear scale, that may make sense in some ways, but since we perceive loudness on a logarithmic scale, it probably puts the precision in the wrong places. Or, considering the middle-gray case: does a velocity value of 64 sound half as loud as 127? Does it do so in an interesting way?
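A quick back-of-the-envelope sketch of that quantization argument. The 2.2 gamma here is just a stand-in for a perceptual curve (real standards like sRGB differ slightly), and the linear velocity-to-amplitude mapping is an assumption for illustration; actual synths map velocity in all sorts of ways:

```python
import math

MIDDLE_GRAY = 0.18  # linear-light fraction commonly cited as "middle gray"

def levels_below(x, gamma=1.0, bits=8):
    """Count of code values at or below linear fraction x, for a given encoding gamma."""
    return round((x ** (1 / gamma)) * (2 ** bits - 1))

# Linear 8-bit encoding: the shadows get very few codes.
print(levels_below(MIDDLE_GRAY))             # 46 of 255 codes below middle gray
# Gamma-compressed 8-bit encoding: far more codes spent on shadow detail.
print(levels_below(MIDDLE_GRAY, gamma=2.2))  # 117 of 255 codes below middle gray

# MIDI analogue: if velocity mapped linearly to amplitude (an assumption),
# velocity 64 would sit only about -6 dB below 127 — well short of the
# rough -10 dB often quoted for a perceived halving of loudness.
def velocity_db(v, v_ref=127):
    """Level of velocity v relative to v_ref, assuming linear amplitude mapping."""
    return 20 * math.log10(v / v_ref)

print(round(velocity_db(64), 1))  # -6.0
```

In other words, under these assumptions velocity 64 would not sound anywhere near half as loud as 127, which is the same shape of problem as the 18%-gray case above.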
Is the precision where you want it when programming a drum loop?

Mark