(taken from a message board post where I was discussing PITCHf/x uncertainty)
Discussing these correction algorithms, and the uncertainty around something that is supposedly measured precisely, brings up a tangential point. Physicists are rather famous for saying, “Any measurement that you make without knowledge of its uncertainty is completely meaningless.” (That one is Walter Lewin, actually.)
So it is a good thing that we talk about uncertainty for PITCHf/x, because acknowledging uncertainty is good. The move in sabermetrics to blindly accept observed data, however, is very bad. I’ll stand behind OBP and SLG all day, since these have no measurement uncertainty around them. Same with linear weights (for what they are). But UZR/DRS/TZ? No. These are based on observations from BIS/GIS stringers that carry serious uncertainty. Additionally, the data has been shown to have serious park biases, especially in Chavez Ravine.
This is the old PECOTA/BPro issue all over again: when you keep data proprietary and sell it piecemeal, you suffer from publisher’s bias and all sorts of conflicts of interest. This data is then fitted to an equation that involves some regression, which further compounds the error (and worse, draws conclusions from facts not in evidence).
UZR and similar metrics should be listed with an uncertainty. Saying someone’s UZR is +15.5 is ridiculous; the same is true for saying someone’s fastball has a linear weight of +1.2 runs. The former is stupid because the stringers’ observations carry serious uncertainty (which goes unreported and unquantified), and the latter is dumb because we do not know for sure that someone’s fastball is indeed a fastball (not all pitch types are classified correctly).
So the derivation of things like linear weights and other objective data needs to be separated from the, well, pseudoscience (pseudoanalysis?) that is often done with UZR/DRS and other measurements like them. Analysts admitting that the data is “fuzzy” does not make it okay. You need to publish uncertainty measurements or error bars; otherwise the data (and especially the conclusions drawn from it) are worthless.
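To make "publish error bars" concrete, here is a minimal sketch of one way to attach an interval to a season total, assuming you had per-play run values for a fielder. It uses a plain percentile bootstrap, which is my choice for illustration and not how UZR or DRS is actually computed. The function name and the play values are hypothetical and made up. Note that this only captures sampling variability; it says nothing about stringer error or park bias, which would make the real interval wider.

```python
import random


def bootstrap_total_interval(play_values, n_resamples=2000, level=0.95, seed=0):
    """Return (total, low, high) for the sum of per-play run values.

    Percentile bootstrap: resample the plays with replacement, re-sum,
    and read the interval off the sorted resampled totals.
    """
    rng = random.Random(seed)
    n = len(play_values)
    totals = sorted(
        sum(rng.choices(play_values, k=n)) for _ in range(n_resamples)
    )
    low_idx = int((1 - level) / 2 * n_resamples)
    high_idx = int((1 + level) / 2 * n_resamples) - 1
    return sum(play_values), totals[low_idx], totals[high_idx]


# Hypothetical data: 500 made-up per-play run values, NOT real fielding data.
fake_rng = random.Random(42)
fake_plays = [fake_rng.gauss(0.03, 0.4) for _ in range(500)]

total, low, high = bootstrap_total_interval(fake_plays)
print(f"{total:+.1f} runs, 95% interval [{low:+.1f}, {high:+.1f}]")
```

Reporting the total together with the interval, rather than a bare number, is the whole point: if the interval comfortably straddles zero, the headline number should not be driving any conclusions.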