Democratic Underground - I'll have a go

at answering here too:

What I am referring to as total sampling rate is:

responses/votes

So each candidate's sampling rate would be:

responses for candidate/votes for candidate. The ratio between the two sampling rates (let's call them Ks for Kerry and Bs for Bush) will give you what I call alpha, i.e. alpha = Ks/Bs. Ln(Ks/Bs) will give you WPE_bias.

These two sampling rates are knowable - you just divide each candidate's response tally by that candidate's vote count. Or, if all you have are the WPEs and the vote count proportions, you can use my fancy function to derive them from those.
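To make the arithmetic concrete, here is a minimal Python sketch of the direct calculation from tallies and vote counts. The function name and the precinct numbers are made up for illustration, and it does not attempt the derivation from WPEs, only the straightforward division described above.

```python
import math

def sampling_rates(kerry_responses, bush_responses, kerry_votes, bush_votes):
    """Return (Ks, Bs, alpha, ln(alpha)) for one precinct.

    Ks and Bs are each candidate's responses divided by that candidate's votes.
    alpha is the ratio Ks/Bs, and ln(alpha) is the bias measure on the log scale.
    """
    ks = kerry_responses / kerry_votes
    bs = bush_responses / bush_votes
    alpha = ks / bs
    return ks, bs, alpha, math.log(alpha)

# Hypothetical precinct: 500 votes each, 56 Kerry responses, 44 Bush responses.
ks, bs, alpha, log_alpha = sampling_rates(56, 44, 500, 500)
print(ks, bs, alpha, log_alpha)
```

An alpha above 1 (ln(alpha) above 0) means Kerry voters were sampled at a higher rate than Bush voters in that precinct, and an alpha below 1 means the reverse.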

Total completion rate is also knowable - total tallies divided by total selected (responses+refusals+misses). However, what we don't know is the completion rate for each candidate's voters - because we don't know how many refusers and misses were Kerry voters and how many refusers and misses were Bush voters.
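A small sketch of why the per-candidate completion rates are not pinned down by what was recorded. All the numbers and function names here are hypothetical. The same observed totals (100 responses, 60 refusals, 40 misses) are compatible with quite different splits of the non-responders between the two candidates.

```python
def completion_rate(responses, refusals, misses):
    """Total completion rate: responses over everyone selected."""
    return responses / (responses + refusals + misses)

def candidate_completion_rates(kerry_resp, bush_resp, kerry_nonresp, bush_nonresp):
    """Per-candidate completion rates, given an ASSUMED split of non-responders."""
    kerry_rate = kerry_resp / (kerry_resp + kerry_nonresp)
    bush_rate = bush_resp / (bush_resp + bush_nonresp)
    return kerry_rate, bush_rate

# Observable: 100 responses (56 Kerry, 44 Bush), 60 refusals, 40 misses.
total = completion_rate(100, 60, 40)

# Two hypothetical ways the 100 refusers and misses might have voted.
# Both give the same observed totals, so the data cannot tell them apart.
split_a = candidate_completion_rates(56, 44, 44, 56)
split_b = candidate_completion_rates(56, 44, 56, 44)
print(total, split_a, split_b)
```

In the first split Kerry voters complete at a higher rate than Bush voters. In the second the two rates are equal and the Kerry excess in the responses comes entirely from who was selected.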

We can attempt to compute it algebraically from the WPE, but that entails the assumption that the sample (including refusers and misses) was a random 1/Nth (where N is the interviewing rate) of the voters in the precinct. Where N is very small (ideally 1, with every voter approached) the ratio between your two completion rate estimates should match the ratio between your two sampling rates pretty well. But where N is large, there is no guarantee that your 1/Nth of voters was truly a random 1/Nth, and in any case it will be subject to sampling error. It may have been "enriched" by voters for one particular candidate. Bush voters (or Kerry voters in some precincts) may have avoided the pollsters altogether. In precincts where refusal rates were high, interviewers may have unconsciously sampled the (N+1)th voter if the Nth looked hostile - especially if voters were milling about. We know that WPE was greater where N was large. Refusals and misses (especially misses) may also not have been very diligently recorded. All these things happen.
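The sampling error part of this can be illustrated with a toy simulation. This is only a sketch under simple assumptions of my choosing: a made-up precinct of 1000 voters split 50/50, leaving in random order, with every Nth voter selected. It shows pure chance variation in who gets selected, and none of the selection effects (avoidance, substituting the (N+1)th voter) described above, which would come on top.

```python
import random
import statistics

def selected_kerry_share(n_voters, kerry_share, interval, rng):
    """Kerry share among the voters selected by taking every `interval`-th voter."""
    n_kerry = round(n_voters * kerry_share)
    voters = [1] * n_kerry + [0] * (n_voters - n_kerry)  # 1 = Kerry, 0 = Bush
    rng.shuffle(voters)                                  # random exit order
    selected = voters[::interval]                        # every Nth voter
    return sum(selected) / len(selected)

def spread(interval, trials=2000, seed=0):
    """Standard deviation of the selected Kerry share over repeated simulated precincts."""
    rng = random.Random(seed)
    shares = [selected_kerry_share(1000, 0.5, interval, rng) for _ in range(trials)]
    return statistics.pstdev(shares)

spread_every_voter = spread(1)    # N = 1: the selected sample is the whole precinct
spread_every_tenth = spread(10)   # N = 10: the selected sample is 100 voters
print(spread_every_voter, spread_every_tenth)
```

With N = 1 the selected group always matches the precinct exactly, so there is no spread. With N = 10 the make-up of the selected group wanders from precinct to precinct by chance alone, before anyone refuses or is missed.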

All of which makes me skeptical of completion rate calculations. All we know for certain is the ratio between the sampling rates - that's what my algebra applied to Mitofsky's data gives you. The algebra doesn't work very well (as my modelling showed) when applied to aggregate means or medians. To go on from there and derive separate completion rate estimates from this ratio, given only a mean completion rate and some form of aggregate WPE for a group of precincts, seems to me to run the risk of piling error on error. We know the completion rate variance was large. We don't know which completion rates applied to which precincts (those with large alpha or those with small alpha). And it is also likely that total completion rates at least partly reflect how desperate the interviewers became for responses. After all, it was tallies, not refusals and misses, that the pollsters wanted.
