2016 Postmortem
Today in polling, the RCP mega-sample
So, as most people know, there are two primary poll aggregation websites out there: Pollster and Real Clear Politics. Each of them approaches aggregating polls in a different way. Today, I will talk about RCP. If the interest is there, I will address Pollster at another time.
So first, the thing to know about aggregating services like RCP is that their results do not actually have a meaningful margin of error. There are many reasons for this, but my favorite is that aggregation combines a lot of information collected from different sources, which makes it impossible to treat the result as a sample of one unifying population. Keep that in mind when looking at aggregation sites.
Anyway, RCP creates its current polling results through a technique that has many names, but that I like to refer to as mega-sampling. Mega-sampling is where you take individual poll samples and results and combine them to make a new, much larger sample. This gives a higher weight to the results of polls with larger sample sizes while not negating the results of smaller polls. A mega-sample is usually created on a rolling time period (all polls in the last month) or a rolling number of polls (the last 5 polls, regardless of when they occurred). As new polls are released, they're added into the mega-sample and, depending on the type of mega-sample, the oldest poll is removed.
So, practically what does this look like? Here is an example:
Poll A has a sample of 500 people. 80% like Clinton, 20% like Sanders (400 to 100)
Poll B has a sample of 1000 people. 70% like Clinton, 30% like Sanders (700 to 300)
Poll C has a sample of 1500 people. Candidates are split at 50% each (750 to 750)
The mega-sample is 3000 people. Clinton has 1850 vs Sanders' 1150. The aggregate percentages therefore are roughly 62% to 38% (61.7% to 38.3% before rounding).
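For anyone who wants to check the arithmetic, here is a minimal Python sketch of the pooling step, using the three example polls above. The `Poll` class and the `mega_sample` function are names made up for illustration; this is not RCP's actual code, just the add-up-the-respondents idea as described in this post.

```python
from dataclasses import dataclass


@dataclass
class Poll:
    # Hypothetical container for one poll's raw respondent counts.
    name: str
    sample_size: int
    clinton: int  # respondents favoring Clinton
    sanders: int  # respondents favoring Sanders


def mega_sample(polls):
    """Pool raw counts across polls; return (total, clinton_pct, sanders_pct)."""
    total = sum(p.sample_size for p in polls)
    clinton = sum(p.clinton for p in polls)
    sanders = sum(p.sanders for p in polls)
    return total, 100 * clinton / total, 100 * sanders / total


# The three polls from the example, oldest first.
polls = [
    Poll("A", 500, 400, 100),
    Poll("B", 1000, 700, 300),
    Poll("C", 1500, 750, 750),
]

total, clinton_pct, sanders_pct = mega_sample(polls)
print(total, round(clinton_pct, 1), round(sanders_pct, 1))  # 3000 61.7 38.3

# A rolling "last 2 polls" mega-sample simply drops the oldest poll (A).
print(mega_sample(polls[-2:]))  # (2500, 58.0, 42.0)
```

Because the counts are pooled directly, Poll C (1500 people) pulls the result three times as hard as Poll A (500 people), which is the weighting by sample size mentioned earlier.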
Pretty simple, really. RCP, in my opinion, has two main strengths over Pollster. First, they don't include internet-only polls, which are notoriously plagued with sample bias. Second, they don't try to smooth trend lines like Pollster does. Smoothing makes the trend line lag behind directional shifts, which is extremely annoying from a statistical point of view.
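To illustrate the lag point, here is a small Python sketch with a plain trailing moving average. To be clear, this is a stand-in smoother chosen for simplicity, not Pollster's actual method, and the numbers are made up: support sits at 50 and then jumps to 60.

```python
def trailing_average(values, window):
    """Average each point with up to (window - 1) points before it."""
    smoothed = []
    for i in range(len(values)):
        start = max(0, i - window + 1)
        chunk = values[start:i + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


# Made-up series: a sudden 10-point shift at index 4.
raw = [50, 50, 50, 50, 60, 60, 60, 60]
smooth = trailing_average(raw, window=4)
print(smooth)  # [50.0, 50.0, 50.0, 50.0, 52.5, 55.0, 57.5, 60.0]
```

The raw numbers move to 60 immediately, but the smoothed line needs three more data points to get there. That delay is the lagging effect.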
Let me know if you have any questions. Depending on response, I will try and tackle the Pollster methodology next week.
Godhumor
(6,437 posts)
DemocratSinceBirth
(99,710 posts)
As you combine samples, the size of your sample increases and your margin of error decreases.
kenn3d
(486 posts)
I see this post got very little notice, and I'm sorry I missed it earlier. I really appreciate your insights. Excellent, easy to understand explanation of a poorly understood subject.
I tend to agree with RCP on the potential bias in internet polling and think that some culling of the Pollster dataset tends to improve its accuracy. I hope other DUers will show some interest in the subject of aggregation of polls, and I'd be keen to read your post on the HuffPollster methodology to learn how it differs from RCP.
Thanks again
mythology
(9,527 posts)
Thanks for sharing it.
brooklynite
(94,547 posts)
Godhumor
(6,437 posts)
I think I kept the explanation pretty short to show how all polls are used to generate one big poll, honestly.