
on your optimizer unless I know how it works.
The EM report gives mean, median and absolute WPEs, so any model has to match those, but you say your optimizer is not a model. So I have no views. I do not know what it is trying to do. All I know is that the alphas that you output are not the alphas in the data, and that the alphas in the data do not vary significantly with increase in Bush's vote.
I am talking about real numbers. Real numbers are exactly what I am talking about.
If you want the algebra, here it is: The alpha for a precinct is:
Kerry responses/Kerry votes divided by Bush responses/Bush votes. I call Kerry responses/Kerry votes "Kp" and Bush responses/Bush votes "Bp", i.e. Kp for Kerry participation rate and Bp for Bush participation rate. So alpha = Kp/Bp.
If you want the mean alpha for a category you cannot simply take the mean of all the precinct alphas, because, as you say, it is a ratio, and therefore does not have a normal distribution. What you do is you take the arctangent of the ratios, take the mean of the arctans, then the tangent of the mean. Mean alpha = the tangent of the mean arctan(Kp/Bp).
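To make the algebra concrete, here is a minimal sketch in Python of the two steps just described: alpha for one precinct, then the category mean taken through the arctangent. The function names and arguments are my own for illustration, and the numbers in the comments are made up, not exit poll data.

```python
import math


def precinct_alpha(kerry_responses, kerry_votes, bush_responses, bush_votes):
    """alpha = Kp / Bp, where Kp and Bp are the participation rates."""
    kp = kerry_responses / kerry_votes  # Kerry participation rate
    bp = bush_responses / bush_votes    # Bush participation rate
    return kp / bp


def mean_alpha(alphas):
    """Tangent of the mean arctangent of the precinct alphas."""
    mean_arctan = sum(math.atan(a) for a in alphas) / len(alphas)
    return math.tan(mean_arctan)


# Illustration only (invented numbers): alphas of 0.5 and 2.0 are
# mirror images of each other, so the arctan-based mean comes out at
# about 1.0, whereas the plain arithmetic mean would give 1.25.
```

The point of going through the arctangent is that a ratio of 2.0 and a ratio of 0.5 are treated as equal and opposite, which a simple average of the ratios does not do.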
Obviously you can't do this because you don't have the data. So you approximate to it. Fair enough, it is the best you can do with what you have got.
But I am talking about real data. Mitofsky has the real data, and he has given the answer for ln(alpha). I have also checked the answer using arctan. There is no significant slope between ln(alpha) or arctan(alpha) and Bush's share of the vote. So yes, I dispute that the means are significantly different between categories. They are different, but not significantly different. It is not an assumption. It is a computation. There is no slope.
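For anyone who wants to see what that kind of computation looks like, here is a rough sketch in Python of an ordinary least-squares slope of ln(alpha) on Bush's vote share, with its standard error and t-statistic. This is my own illustration of a zero-order slope test, not necessarily the exact procedure used on the real data, and the function name and inputs are hypothetical.

```python
import math


def ln_alpha_slope(bush_share, alphas):
    """OLS slope of ln(alpha) on Bush vote share, one point per precinct.

    Returns (slope, standard_error, t_statistic). Needs at least three
    precincts and some variation in bush_share.
    """
    x = list(bush_share)
    y = [math.log(a) for a in alphas]  # work with ln(alpha), not alpha
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual sum of squares around the fitted line.
    rss = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    se = math.sqrt(rss / (n - 2) / sxx)
    # Guard against a perfect fit, where the standard error is zero.
    t = slope / se if se > 0 else math.copysign(math.inf, slope)
    return slope, se, t
```

As a rule of thumb (an approximation, not a substitute for a proper p-value), with a large number of precincts a t-statistic well inside plus or minus 2 means the slope is not significantly different from zero. The same function works for the arctan version if you swap `math.log` for `math.atan`.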
So whether the optimizer output makes sense or not makes no difference. In the real numbers there is no slope.
And the real numbers have absolutely nothing to do with the timeline. They are simply the raw responses, completely unweighted, that went into the estimate. The weights were what produced the timeline changes. The raw data is the raw data. If there is bias in the raw data it is either bias in the poll or bias in the count.
The unweighted data we are working with show, unambiguously, that there was a massively significant redshift of something around alpha=1.15 on average, with a huge amount of variance, as can be seen from the plot.
But alpha did not vary with partisanship. It varied with lots of other things, but not with that. Not at the zero-order correlation level, anyway.
There is no significant slope.
