Pages 1, 2 and 5 reference either Lindeman, myself, or both; I have also commented on pages 3 and 4. Page 6 refers to a largely methodological paper by Michael McDonald.
Page 1.
Dopp uses quotation marks to frame a statement attributed to Lindeman and myself (footnote 1): "If there is vote fraud, then there will be a positive correlation between Bush vote increase from 2000 to 2004 and the exit poll discrepancy." This statement does not appear in the source given, I am not an author of the source, and it is not true. It is crucially not true. As Dopp correctly demonstrates, it is perfectly possible for there to be vote fraud and for there to be no positive correlation between swing and shift. The relevant question is not whether such a counter-case is possible, but how probable it is, given the variance in each variable. That, of course, is what is tested when a correlational hypothesis is tested. Indeed it is exactly what Dopp advocates testing when she suggests that fraud is indicated by a negative correlation between WPE and Kerry's exit poll share and/or by a positive correlation between WPE and Kerry's vote-share. It's how correlations work.
My argument is therefore not invalidated by Dopp's because she simply invalidates (correctly) an argument I do not make. What I conclude (by calculating the confidence limits of the regression line) is that the maximum shared variance between swing and shift is very small. However, translating this into votes is less easy, as the amount of variance shared will depend on the variance in fraud. If fraud was uniform in extent and magnitude then there would be no variance, and thus no shared variance. However, by the same token the correlations between shift and other factors from which others have deduced fraud would also be absent or meaningless. Fraud would correlate with nothing. Moreover, one would have to ascribe virtually all variance in redshift over and above sampling variance to non-sampling error in the poll. There is no problem with doing so, but it flies in the face of most fraud arguments, including many of those advanced by Baiman and Dopp. Practically pertinent, however, is the fact that a range of voting methods are represented in the poll, rendering substantial variance inevitable. In order to estimate the maximum number of votes likely to be stolen (for a given probability threshold) we therefore have to make some assumptions as to the maximum likely proportion of corrupt precincts, and also the likely variance in fraud magnitude. I have been modelling various scenarios, and I cannot squeeze anything like popular-vote margin scale theft into the plot without making heroic assumptions about the distribution and uniformity of the fraud, and the more generous my assumptions the more all other fraud arguments (Freeman's "neglected correlations" for example) are invalidated.
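To illustrate why the shared variance depends on the variance in fraud, here is a minimal simulation sketch in Python. It is not the modelling referred to above; every number in it (noise levels, fraud sizes, the proportion of corrupt precincts) is a hypothetical value chosen for illustration, and it simply assumes that fraud adds the same amount to both swing and shift in an affected precinct.

```python
import random
from math import sqrt

def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def swing_shift_correlation(fraud_for_precinct, n_precincts=2000, seed=1):
    """Simulate precincts where fraud (as a vote-share proportion) is added
    to both swing and shift, on top of independent noise in each.
    The noise standard deviations are hypothetical."""
    rng = random.Random(seed)
    swing, shift = [], []
    for _ in range(n_precincts):
        fraud = fraud_for_precinct(rng)
        swing.append(rng.gauss(0.0, 0.04) + fraud)  # genuine change in vote
        shift.append(rng.gauss(0.0, 0.06) + fraud)  # poll error of all kinds
    return pearson(swing, shift)

# Uniform fraud: every precinct shifted by the same (hypothetical) 2 points.
uniform_r = swing_shift_correlation(lambda rng: 0.02)

# Patchy fraud: a hypothetical 30% of precincts shifted by 6 points.
patchy_r = swing_shift_correlation(
    lambda rng: 0.06 if rng.random() < 0.3 else 0.0
)

print(f"uniform fraud: r = {uniform_r:.3f}")
print(f"patchy fraud:  r = {patchy_r:.3f}")
```

Under these assumptions the uniform case produces a correlation near zero, because a constant contributes no variance, while the patchy case produces a clearly positive one.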
Dopp states on this page that "ESI's analysis method is equally invalid as a statistical model or trend <....>". She appears to misunderstand the term "statistical model". The way correlational analysis works is that a "model" is fitted to data. This model is sometimes called the "model hypothesis" to distinguish it from the "null hypothesis". If the data are explained better by the model hypothesis than by the null hypothesis (as tested, for example, by an F test), the model is supported. If they are not, the null is retained. When the null is retained, it could well be that the model hypothesis is nonetheless true - in other words that what we are observing may be a "counter-example", to use Dopp's phrase. This is why, when we retain the null, we compute confidence limits for the best model fit. In other words, we do not claim (ever) that the null is "true" (actually a two-tailed null is never true). What we do is to compute the largest effect that is probable (to a given degree of probability, say 95%) given the data. As the statistical power in the Ohio sample is small, the confidence intervals are wide, and could accommodate fraud without stretching probability unduly. However, as the statistical power in the nationwide sample is large, it places strong constraints on the likely prevalence and variance of fraud, as I argue above. (Elsewhere, Dopp has accused Lindeman and myself of using the term "statistical model" as some kind of retreat from the word "hypothesis" - this, clearly, is not the case. All hypotheses in statistics are tested using a "statistical model". It's how inferential statistics works).
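The effect of sample size on the confidence limits can be sketched as follows. This is an illustration only: it uses the standard Fisher z approximation for a correlation coefficient rather than the confidence limits of the regression line mentioned above, and the two sample sizes (50 and 1200) are hypothetical stand-ins for a small and a large sample, not the actual precinct counts.

```python
from math import atanh, tanh, sqrt

def correlation_ci(r, n, z_crit=1.96):
    """Approximate 95% confidence interval for a correlation r observed
    in n cases, using the Fisher z transformation."""
    z = atanh(r)
    se = 1.0 / sqrt(n - 3)
    return tanh(z - z_crit * se), tanh(z + z_crit * se)

# An observed correlation of exactly zero in a small and a large sample
# (both sample sizes are hypothetical).
lo_small, hi_small = correlation_ci(0.0, 50)
lo_large, hi_large = correlation_ci(0.0, 1200)

# The squared upper limit is the largest shared variance still compatible
# with the data at this confidence level.
print(f"n=50:   r in [{lo_small:.3f}, {hi_small:.3f}], max r^2 = {hi_small**2:.4f}")
print(f"n=1200: r in [{lo_large:.3f}, {hi_large:.3f}], max r^2 = {hi_large**2:.4f}")
```

With the small sample, retaining the null still leaves room for a modest true correlation; with the large sample the upper limit is squeezed close to zero.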
As for the word "trend" - this term is usually reserved for a model that narrowly fails to achieve significance. There is no "trend" for swing and shift to be correlated, either in the ESI's Ohio dataset or in the nationwide dataset.
(I should also point out that the other apparently verbatim quotation on this page, attributed to ESI: "there is no exit poll evidence of fraud..." also does not appear in the source given, although unlike the statement attributed to myself and Lindeman, it is a reasonable paraphrase of ESI's conclusion.)
Page 2.
The idea behind the measure is not to "eliminate or 'straighten out' the influence of exit-poll-partisan-response bias". I refer you to my original paper (www.geocities.com/lizzielid/WPEpaper.pdf) and to the Lindeman, Liddle and Brady paper here:
http://inside.bard.edu/~lindeman/ASApaper_060409.pdf. The idea was to find a way of quantifying "bias" (whether in poll or count) that would be invariant with regard to vote-share. It was not easy to do, but I think we managed it.
What Dopp's simulations demonstrate is the nature of the problem we were trying to solve. WPE has greater variance in the centre of the plot than at the ends. For a given magnitude of "bias" (in count or poll) the WPE value is smaller at the extremes than in the centre. Our suggested measure ("tau") is uniform for a given degree of bias. However, because of the asymmetric nature of the Poisson distribution for all vote-share proportions except 50%, sampling error results in a non-symmetrical distribution of both WPE and tau at the extremes of the vote-share distribution. In the case of tau, a) the variance due to sampling error increases at the extremes and b) the expected value of the mean departs from zero (the magnitude of the deviation being a function of sample size). This was why we developed "tau prime" and recommended that a Weighted Least Squares regression technique be used to tame the remaining heteroskedasticity. It has nothing to do with eliminating "the influence of exit-poll partisan-response bias". I can only think Dopp has not read our papers. I would also note, while I am on this page, that the paper she cites in footnote 5 as being a "mathematically correct way to evaluate exit poll data" advocates, inter alia, correlating WPE with exit poll share; as any error in the poll (including sampling error) will then appear on both sides of the regression equation, absolutely nothing can be deduced from such a correlation. Far from being "mathematically correct", this is a fundamental statistical error.
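The problem with having the poll error on both sides of the equation can be shown with a small simulation. This is a sketch under stated assumptions: there is no fraud and no bias at all, only binomial sampling error; the discrepancy is taken simply as poll share minus vote share (a sign convention chosen here for convenience, not the WPE definition); and the precinct count, sample size and range of vote shares are hypothetical.

```python
import random
from math import sqrt

def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def simulate_unbiased_polls(n_precincts=5000, n_respondents=50, seed=2):
    """Return (vote shares, poll shares, discrepancies) for precincts in
    which the poll is a perfectly unbiased random sample of the voters."""
    rng = random.Random(seed)
    votes, polls, discrepancies = [], [], []
    for _ in range(n_precincts):
        vote_share = rng.uniform(0.2, 0.8)
        hits = sum(1 for _ in range(n_respondents) if rng.random() < vote_share)
        poll_share = hits / n_respondents
        votes.append(vote_share)
        polls.append(poll_share)
        discrepancies.append(poll_share - vote_share)
    return votes, polls, discrepancies

votes, polls, discrepancies = simulate_unbiased_polls()
r_with_poll_share = pearson(discrepancies, polls)
r_with_vote_share = pearson(discrepancies, votes)

print(f"discrepancy vs exit poll share: r = {r_with_poll_share:.3f}")
print(f"discrepancy vs vote share:      r = {r_with_vote_share:.3f}")
```

Even with nothing but sampling error, the discrepancy correlates substantially with exit poll share, because the same error term is in both variables. Against the counted vote share the correlation is close to zero.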
Dopp also alleges on this page that "Liddle also drops the most suspicious (and most indictative of vote miscounts) precincts from her calculations as 'outliers'". I am not sure what she is referring to here, and she has not provided a citation. I have not dropped any outliers from any analysis, although I have certainly conducted leverage diagnostics and outlier analyses, as any competent and diligent data analyst should.
Page 3.
NEDA's equations to derive Kerry and Bush exit poll response rates unfortunately do not work, for several reasons. One of these relates, again, to the asymmetry of the Poisson distribution when events are rare (such as encountering a Kerry voter in a high Bush precinct). It can be readily shown, using the equations given, that even if overall response rates were known with accuracy (which they are not) and for each precinct (the plot merely shows aggregate values), sampling error alone will commonly generate impossibly high response rates (>1) for the group of voters who are in the minority.
It is simply not possible to infer differential response rates from these, or any, data. However, differential participation (or "representation") rates can be computed directly from the data (tallied responses for each candidate/tallied votes for each candidate), and the ratio between them is what I termed "alpha", from which "tau" and "tau prime" are derived. (See Lindeman, Liddle and Brady for further details.)
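As a worked illustration of why a fixed participation ratio produces different discrepancies at different vote shares, here is a minimal sketch. It takes "alpha" as the Kerry participation rate divided by the Bush participation rate (the direction of the ratio is an assumption made here), uses a hypothetical alpha of 1.12, and expresses the discrepancy as the poll margin minus the count margin rather than in the sign convention of WPE. Sampling error is ignored, so these are expected values only.

```python
def expected_poll_share(kerry_vote_share, alpha):
    """Expected Kerry share of poll responses when Kerry voters participate
    at alpha times the rate of Bush voters (two-candidate precinct)."""
    k = kerry_vote_share
    return alpha * k / (alpha * k + (1.0 - k))

def expected_margin_discrepancy(kerry_vote_share, alpha):
    """Poll margin minus count margin, both as Kerry minus Bush."""
    return 2.0 * (expected_poll_share(kerry_vote_share, alpha) - kerry_vote_share)

def recover_alpha(kerry_poll_share, kerry_vote_share):
    """Ratio of participation rates: (responses / votes) for Kerry divided
    by (responses / votes) for Bush."""
    kerry_rate = kerry_poll_share / kerry_vote_share
    bush_rate = (1.0 - kerry_poll_share) / (1.0 - kerry_vote_share)
    return kerry_rate / bush_rate

ALPHA = 1.12  # hypothetical value, for illustration only

for share in (0.1, 0.3, 0.5, 0.7, 0.9):
    poll = expected_poll_share(share, ALPHA)
    disc = expected_margin_discrepancy(share, ALPHA)
    print(f"vote share {share:.1f}: discrepancy {disc:+.4f}, "
          f"recovered alpha {recover_alpha(poll, share):.3f}")
```

The discrepancy in the margin is largest near a 50% vote share and shrinks towards the extremes, while the ratio recovered from the same figures stays constant across the range.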
I would agree with NEDA that differential non-response may be a relatively minor factor in producing the overall discrepancy. The exit poll data are more strongly supportive of selection bias as the key to the discrepancy, and this will be reflected in the ratio between participation (representation) rates for each set of voters, as of course will vote-miscounts favoring one candidate. However, both differential non-response and selection bias may arise from the same latent variable - a greater willingness for Kerry voters than for Bush voters to participate in the poll.
Page 4.
The confidentiality issue is far more serious than Kathy appears to realise, and "vote-counts" can't be blurred, at least by Scheuren's method. "Vote-share" can be blurred by collecting data from a large number of precincts (all, I think), banding them by vote-share, and assigning each NEP precinct to the mean of the band in which it falls. I know of no method of blurring a quantity like vote counts that would preserve the statistical properties of the data, and yet render precincts unidentifiable. The more variables provided, blurred or otherwise, the greater the chance of precincts being identified.
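The banding procedure for vote-share described above can be sketched in a few lines. This is only an illustration of the general idea as I have described it, not Scheuren's actual procedure; the function names, the choice of ten equal-width bands and the example figures are all hypothetical.

```python
def band_index(vote_share, n_bands=10):
    """Index of the equal-width band a vote share (0 to 1) falls into."""
    return min(int(vote_share * n_bands), n_bands - 1)

def blur_vote_shares(all_precinct_shares, nep_precinct_shares, n_bands=10):
    """Replace each NEP precinct's vote share with the mean vote share of
    all precincts in the same band. Assumes the NEP precincts are included
    in all_precinct_shares, so every band that is needed is populated."""
    bands = {}
    for share in all_precinct_shares:
        bands.setdefault(band_index(share, n_bands), []).append(share)
    band_means = {b: sum(v) / len(v) for b, v in bands.items()}
    return [band_means[band_index(s, n_bands)] for s in nep_precinct_shares]

# Hypothetical example: six precincts in total, two of them in the poll.
all_shares = [0.12, 0.18, 0.41, 0.45, 0.49, 0.83]
nep_shares = [0.18, 0.45]
blurred = blur_vote_shares(all_shares, nep_shares)
print(blurred)
```

Each polled precinct then carries only its band mean, which it shares with every other precinct in that band. Nothing comparable is available for raw vote counts, which is the point made above.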
As for the claim that
no scientific hypothesis can be considered proven before the supporting data are released to competing investigators
Firstly, no scientific hypothesis is ever "proven"; it is merely supported. Secondly, in general, in science, investigators do not "compete" over the same data. If a hypothesis is supported by one group, then the challenge is for the same or other groups to replicate the finding on new data. In the social sciences, data are not generally released because of confidentiality issues. That is why methods and results sections of papers are so important - other investigators need to know exactly what was done and what was found. They do not need to do the same thing again, although reviewers might legitimately demand that other analyses be carried out on the data. However, I will agree with Dopp that the E-M evaluation document was not couched as a scientific report, and lacked the kind of information, such as t tests, F tests and standard deviations, that would be required to critically evaluate the document. That was one of the reasons I wrote my original paper.
Page 5.
Regarding the swing-shift correlation, see my comments on page 1. Curiously, on this page, Dopp avers that
Baiman and Dopp never did any of the 'vote share/red shift analysis' Lindeman refers to because Dopp mathematically proved that “vote share/red shift” analysis is useless for analyzing exit poll data
even though at least two of the referenced papers (footnote 13) plot linear trend lines between vote-share and redshift (the latter measured by WPE) and one states that "evaluating exit poll data" involves, inter alia, "Plotting the pattern of actual and significant precinct discrepancies according to their official vote share and exit poll shares" (my italics). I can only think Dopp has misunderstood Lindeman's statement, which remains perfectly valid. If a single counter-example invalidated a correlational hypothesis, Dopp's method itself would be rendered invalid, as would her advocated correlation between redshift and exit poll share, were it not invalid anyway owing to the error term confound mentioned above.
I will take the opportunity to point out that Lindeman's paper does not characterise "anyone who believes that exit polls are correct as 'fundamentalists'". From his abstract:
Note well: “exit poll fundamentalism” does not refer to the hypothesis that Kerry received more votes, nor the belief or hypothesis that the exit polls evince fraud. These are empirical issues amenable to rational debate, and reasonable people may disagree. Still less does it refer to any and all criticisms of the 2004 election or of election systems. Exit poll fundamentalism as I have encountered it, and as I define it here, amounts to a closed belief system that forecloses further discourse and discovery
(my bold), which is a direct contradiction of Kathy's paraphrase.
edited for errors - I expect some remain