As part of its modeling, Tyler needs to prove its algorithm works. So it handpicked 1,294 sales (counting two of them twice) and produced a “Sales Ratio Study”.

Based on those results, the model looks great! It falls well within the limits for the coefficient of dispersion given by New York State: https://www.tax.ny.gov/pubs_and_bulls/orpts/ownershandbook.htm.

But that’s because they cherry-picked only 53% of the sales during the period they were looking at; they chose to ignore 1,135 sales! So what if we used ALL of the sales between 7/1/2013 and 7/1/2015? And what if we used the sales AFTER 7/1/2015?

Instead of a good R-squared metric of 0.97, you get under 0.9, with coefficients of dispersion of 159% and 289%. That shows how much worse the actual model is than the 7.5% coefficient implies, a figure calculated by cherry-picking about 1,300 sales out of the roughly 2,400 available.
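
For anyone who wants to check a coefficient of dispersion themselves, here is a minimal sketch in Python. It uses the usual textbook definition: the average absolute deviation of the assessment-to-sale-price ratios from their median, divided by that median, expressed as a percentage. I am assuming that is how the study's number was computed. The function name and the sample figures below are made up for illustration and are not taken from Tyler's study or the Excel file.

```python
from statistics import median


def coefficient_of_dispersion(assessed, sale_prices):
    """Return the COD in percent for paired assessed values and sale prices.

    COD = 100 * (average absolute deviation of the ratios from the
    median ratio) / (median ratio).
    """
    # One assessment-to-sale-price ratio per sale
    ratios = [a / s for a, s in zip(assessed, sale_prices)]
    med = median(ratios)
    # Average distance of each ratio from the median ratio
    avg_abs_dev = sum(abs(r - med) for r in ratios) / len(ratios)
    return 100.0 * avg_abs_dev / med


# Hypothetical figures for illustration only (not from the study)
assessed = [90_000, 100_000, 110_000, 100_000]
sale_prices = [100_000, 100_000, 100_000, 100_000]

cod = coefficient_of_dispersion(assessed, sale_prices)
print(f"COD: {cod:.1f}%")
```

A lower number means assessments track sale prices more uniformly. Which sales you feed in matters a great deal: dropping the sales with the worst ratios will shrink the result.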

This Excel file (sales ratio study check) shows the details of arriving at the above numbers.