-
Notifications
You must be signed in to change notification settings - Fork 4
/
Copy pathREADME.qmd
104 lines (81 loc) · 3.25 KB
/
README.qmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
---
format: gfm
---
Warning: This is *alpha* software.
# `countrycode` for Python
`countrycode` standardizes country names, converts them into ~40 different coding schemes, and assigns region descriptors.
Convert country names to and from 9 country code schemes.
* Bugs & Development: https://github.com/vincentarelbundock/pycountrycode
* Pypi: https://pypi.python.org/pypi/countrycode
* [Vincent's webpage](https://arelbundock.com)
This is a Python port of [the `countrycode` package for R.](https://vincentarelbundock.github.io/countrycode)
## The Problem
Different data sources use different coding schemes to represent countries (e.g. CoW or ISO). This poses two main problems: (1) some of these coding schemes are less than intuitive, and (2) merging these data requires converting from one coding scheme to another, or from long country names to a coding scheme.
## The Solution
The `countrycode` function can convert to and from 40+ different country coding schemes, and to 600+ variants of country names in different languages and formats. It uses regular expressions to convert long country names (e.g. Sri Lanka) into any of those coding schemes or country names. It can create new variables with various regional groupings.
## Supported codes
See the section at the very end of this README for a full list of supported codes and languages. These include:
* 600+ variants of country names in different languages and formats.
* AR5
* Continent and region identifiers.
* Correlates of War (numeric and character)
* European Central Bank
* [EUROCONTROL](https://www.eurocontrol.int) - The European Organisation for the Safety of Air Navigation
* Eurostat
* Federal Information Processing Standard (FIPS)
* Food and Agriculture Organization of the United Nations
* Global Administrative Unit Layers (GAUL)
* Geopolitical Entities, Names and Codes (GENC)
* Gleditsch & Ward (numeric and character)
* International Civil Aviation Organization
* International Monetary Fund
* International Olympic Committee
* ISO (2/3-character and numeric)
* Polity IV
* United Nations
* United Nations Procurement Division
* Varieties of Democracy
* World Bank
* World Values Survey
* Unicode symbols (flags)
* And more...
# Installation
From `pypi`:
```python
pip install countrycode
```
Latest version from Gitub:
```sh
git clone https://github.com/vincentarelbundock/pycountrycode
cd pycountrycode
pip install .
```
# Usage
```{python}
import polars as pl
from countrycode import countrycode
countrycode([12, 124], origin = "iso3n", destination = "cldr.short.fr")
```
```{python}
countrycode(['canada', 'United States', 'alGeria'], origin = "country.name.en.regex", destination = "iso3c")
```
```{python}
countrycode(pl.Series(['DZA', 'CAN', 'USA']), origin = "iso3c", destination = "cldr.short.fr")
```
```{python}
countrycode(['DZA', 'CAN', 'USA'], origin = "iso3c", destination = "iso3n")
```
```{python}
countrycode(12, origin = "iso3n", destination = "cldr.short.de")
```
```{python}
countrycode(["Democratic Republic of Vietnam", "Algeria"], origin = "country.name.en.regex", destination = "iso3c")
```
```{python}
countrycode(["Algerien"], origin = "country.name.de", destination = "iso3c")
```
# Supported codes: Full list
```{python}
from countrycode import codelist
", ".join(codelist.columns)
```