| Title: | Clean and Classify Constituency-Level Election Results |
|---|---|
| Description: | Provides cleaned constituency-level election results derived from the Constituency-Level Elections Archive and an analysis-ready subset of elections conducted under simple electoral systems. The repository also contains the auditable maintainer workflow used to construct the datasets. |
| Authors: | Jack Bailey [aut, cre], Chris Hanretty [ctb] |
| Maintainer: | Jack Bailey <[email protected]> |
| License: | MIT + file LICENSE |
| Version: | 0.1.0 |
| Built: | 2026-06-04 18:40:36 UTC |
| Source: | https://github.com/jackobailey/clean_clea |
A cleaned version of Release 18 of the Constituency-Level Elections Archive
lower-chamber dataset. Election-specific corrections are documented in the
package repository under data-raw/corrections/.
clean_clea
A tibble with 1,296,813 rows and 33 variables:
CLEA release identifier.
CLEA election identifier.
CLEA region code.
Country or territory name.
Country or territory numeric code.
Election year.
Election month.
Subnational election identifier, where applicable.
Constituency name.
Constituency identifier.
Constituency district magnitude.
Party name.
Party identifier.
Candidate name.
First-election-period eligible voters.
First-election-period voters.
First-election-period valid votes.
First-election-period invalid votes.
First-election-period turnout.
First-election-period candidate votes.
First-election-period candidate vote share.
First-election-period party votes.
First-election-period party vote share.
Second-election-period eligible voters.
Second-election-period voters.
Second-election-period valid votes.
Second-election-period invalid votes.
Second-election-period turnout.
Second-election-period candidate votes.
Second-election-period candidate vote share.
Second-election-period party votes.
Second-election-period party vote share.
Seats won.
The maintainer pipeline verifies the pinned source checksum before applying 610 election-specific correction scripts and tracked manual patches. These corrections can replace values, restructure results, add documented missing rows, or remove elections and rows whose source data cannot be reconciled.
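The checksum gate described above can be sketched in base R. This is a minimal illustration, not the maintainer pipeline itself: the helper name `verify_checksum`, the use of MD5 via `tools::md5sum()`, and the temporary file standing in for the raw CLEA download are all assumptions, since the text does not say which hash algorithm or file layout the repository uses.

```r
# Hypothetical helper: stop unless a file's MD5 digest matches a pinned value.
# MD5 is an assumption; the real pipeline may use a different algorithm.
verify_checksum <- function(path, expected) {
  actual <- unname(tools::md5sum(path))  # NA if the file cannot be read
  if (is.na(actual)) {
    stop("File not found or unreadable: ", path)
  }
  if (!identical(tolower(actual), tolower(expected))) {
    stop("Checksum mismatch for ", path,
         ": expected ", expected, ", got ", actual)
  }
  invisible(TRUE)
}

# Demonstration on a temporary file standing in for the raw source data.
tmp <- tempfile(fileext = ".csv")
writeLines(c("id,yr", "1,1900"), tmp)

# In practice the pinned value would be stored in the repository,
# not computed from the file being checked.
pinned <- unname(tools::md5sum(tmp))

verify_checksum(tmp, pinned)  # passes silently; corrections may then run
```

Failing loudly before any correction script runs means a silently changed source file cannot be patched with corrections written for a different release.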
This dataset preserves CLEA sentinel values and 888 duplicate rows where no audited correction has been made. It inherits the coverage and coding limitations of CLEA and the cited correction sources.
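Because sentinel values and duplicate rows are preserved, users may want to handle them before analysis. The sketch below uses a toy data frame. The column names (`id`, `party`, `votes`) and the sentinel codes (`-990`, `-992`, `-994`) are illustrative assumptions and should be checked against the CLEA codebook and the actual column names in clean_clea.

```r
# Toy data; column names and sentinel codes are illustrative assumptions.
toy <- data.frame(
  id    = c(1, 1, 1, 2),
  party = c("A", "A", "B", "A"),
  votes = c(100, 100, -990, 250)
)

# Assumed sentinel codes; confirm against the CLEA codebook.
sentinels <- c(-990, -992, -994)

# Flag every row that has an identical twin (first and later copies alike).
toy$is_duplicate <- duplicated(toy) | duplicated(toy, fromLast = TRUE)

# Recode sentinel codes to NA in a copy used for analysis,
# leaving the original values untouched.
analysis <- toy
analysis$votes[analysis$votes %in% sentinels] <- NA

print(toy)
print(analysis)
```

Flagging rather than dropping duplicates keeps the decision about which copy to retain explicit and auditable.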
CLEA Lower Chamber Elections Archive, Release 18 (October 15, 2025)
Kollman, K., Hicken, A., Caramani, D., Backer, D. A., and Lublin, D. Constituency-Level Elections Archive.
An analysis-ready subset of clean_clea containing elections classified as simple electoral systems using V-Dem classifications and documented manual corrections. It also includes vote shares, seat shares, and district-level summary statistics.
simple_systems
A tibble with 280,569 rows and 31 variables:
CLEA election identifier.
Three-letter country or territory code.
Country or territory name.
United Nations geographic subregion.
United Nations geographic region.
Election year.
Election month.
Constituency name.
Constituency identifier.
District magnitude.
Party name.
Party identifier.
Candidate name.
Candidate votes in the source data.
Party votes in the source data.
Vote count used for analysis.
Seats won.
District vote share.
District seat share.
Whether the source result was uncontested.
V-Dem electoral-system classification.
Whether a legal electoral threshold is recorded.
Actual number of vote-winning parties or candidates.
Actual number of seat-winning parties or candidates.
Effective number of vote-winning parties or candidates.
Effective number of seat-winning parties or candidates.
Gallagher disproportionality index.
Vote share cast for parties or candidates winning no seats.
Threshold of exclusion, calculated as 1 / (m + 1).
Threshold of representation, calculated as 1 / (m * nv2).
Minimum of tx and tr.
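The summary statistics listed above can be written out as formulas. The threshold expressions restate the entries in the list, with m, nv2, tx, and tr as named there. The effective-number and disproportionality formulas are the standard Laakso-Taagepera and Gallagher definitions. That the package computes them in exactly this form, and on proportions rather than percentages, is an assumption. The symbol t_min is a label introduced here for illustration.

```latex
% Thresholds as stated in the variable list (m = district magnitude).
t_x = \frac{1}{m + 1}, \qquad
t_r = \frac{1}{m \cdot \mathit{nv2}}, \qquad
t_{\min} = \min(t_x, t_r)

% Worked example with m = 1 and nv2 = 3:
t_x = \frac{1}{2} = 0.5, \qquad
t_r = \frac{1}{1 \cdot 3} \approx 0.333, \qquad
t_{\min} \approx 0.333

% Standard definitions (assumed), with district vote shares v_i and
% seat shares s_i expressed as proportions:
N_v = \frac{1}{\sum_i v_i^2}, \qquad
N_s = \frac{1}{\sum_i s_i^2}, \qquad
G = \sqrt{\tfrac{1}{2} \sum_i (v_i - s_i)^2}
```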
The classifier combines V-Dem electoral-system classifications with tracked manual classifications.
simple_systems excludes elections before 1900, systems not classified as
simple, the United States and Panama, and elections that retain missing
votes, missing seats, aggregate-party codes, or inconsistent vote and seat
rankings. Uncontested results are assigned one analytical vote before shares
and summary statistics are calculated.
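The treatment of uncontested results can be illustrated with a small base R sketch. The data frame, its column names, and the helper `summarise_district` are hypothetical. The effective-number and Gallagher calculations use the standard definitions on proportions, which is an assumption about how the package computes them.

```r
# Toy results for two districts; all column names are illustrative assumptions.
results <- data.frame(
  district    = c("X", "X", "X", "Y"),
  party       = c("A", "B", "C", "A"),
  votes       = c(50, 30, 20, NA),   # no votes recorded for the uncontested seat
  seats       = c(1, 0, 0, 1),
  uncontested = c(FALSE, FALSE, FALSE, TRUE)
)

# Uncontested winners receive one analytical vote so that shares are defined.
results$votes_used <- ifelse(results$uncontested, 1, results$votes)

# Vote and seat shares within each district, as proportions.
results$vote_share <- results$votes_used /
  ave(results$votes_used, results$district, FUN = sum)
results$seat_share <- results$seats /
  ave(results$seats, results$district, FUN = sum)

# Hypothetical helper: district-level summaries from the shares.
summarise_district <- function(d) {
  v <- d$vote_share
  s <- d$seat_share
  data.frame(
    district    = d$district[1],
    n_votes_eff = 1 / sum(v^2),               # effective number, votes
    n_seats_eff = 1 / sum(s^2),               # effective number, seats
    gallagher   = sqrt(sum((v - s)^2) / 2),   # Gallagher index
    wasted      = sum(v[d$seats == 0])        # vote share winning no seats
  )
}

summary_stats <- do.call(
  rbind,
  lapply(split(results, results$district), summarise_district)
)
print(summary_stats)
```

With a single analytical vote, an uncontested district has a vote share of 1 for the winner, so its effective number of parties is 1 and its disproportionality is 0.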
The data inherit the coverage and coding limitations of CLEA, V-Dem, and the tracked manual sources. Legal thresholds and complex allocation rules may not be fully represented by district-level summary statistics.
CLEA Lower Chamber Elections Archive and V-Dem.