L2 dataset
Notable People - Cross-verified Dataset
This cross-verified dataset contains 2.2 million individuals and can be used for research purposes. It is linked to the following paper, which should be cited directly instead of the data itself:
Publisher: Sciences Po Paris
Access: File download
License: CC-BY-SA
Updated: 2026-06-07
Views: 1
Topics
birth date, people, history, wikipedia
Links
Details
Access & cost
- Pricing
- free open access
Legal & licence
- Access Rights
- Public
- Legal Risk Notes
- In this paper, we introduced a multi-language database of notable individuals, using 7 language editions of Wikipedia together with Wikidata to assemble a list of 4,678,040 individuals. This significantly reduced the Anglo-Saxon bias but did not eliminate it. Two main drawbacks remain. First, we did not exploit the non-Western language editions to cross-verify information on individual characteristics. Second, we did not collect the number of words beyond these 7 language editions: word counts enter the notability index, so the index cannot be considered global, which results in a Western-world bias in notability measures. This is, however, partly compensated by the use of the total number of hits across all Wikipedia editions, not only these 7, in our aggregate notability measure. Because Wikipedia's accuracy is not perfect [7,8], our data is only as good as the source data, but our approach adds new possibilities: cross-checking across different language editions and reducing errors when possible.
- Conforms To
- DCAT-3, SourceCommons-SCF-0.1
Coverage
- Spatial
- world
- Wikidata Main Topics
- public figure (Q662729), notability (Q4993710)
Identifiers & provenance
- Wikidata Id
- Q662729
- Prefill Status
- Not checked
Evaluations
5.0
- solidago-flaccidifolia (contributor)