-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy path2_glottolog.Rmd
executable file
·185 lines (149 loc) · 6.46 KB
/
2_glottolog.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
---
title: "2. Glottolog functions"
author: "[G. Moroz](mailto:agricolamz@gmail.com) <br> Presentation link: goo.gl/PS2UD4"
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(message = FALSE, comment=NA)
```
<style>
.parallax {
/* The image used */
background-image: url("glottolog.png");
/* Set a specific height */
min-height: 350px;
/* Create the parallax scrolling effect */
background-attachment: fixed;
background-position: center;
background-repeat: no-repeat;
background-size: auto;
}
#wrap { width: 700px; height: 550px; padding: 0; overflow: hidden; }
#frame { width: 900px; height: 700px; border: 1px solid LightGray; }
#frame { zoom: 0.75;
-moz-transform: scale(0.7);
-moz-transform-origin: 0 0; }
</style>
<div class="parallax"></div>
## 1. About [Glottolog](http://glottolog.org/)
> _Glottolog provides a comprehensive catalogue of the world's languages, language families and dialects. It assigns a unique and stable identifier (the Glottocode) to (in principle) all languoids, i.e. all families, languages, and dialects._
## 2. Glottolog functions
Don't forget to load a library:
```{r}
library("lingtypology")
```
### 2.1 Command name's syntax
Most of the functions in `lingtypology` have the same syntax: **what you need.what you have**. Most of them are based on _language name_.
* **aff.lang()** — get affiliation by language
* **area.lang()** — get macro area by language
* **country.lang()** — get country by language
* **iso.lang()** — get ISO 639-3 code by language
* **gltc.lang()** — get glottocode (identifier for a language in the Glottolog database) code by language
* **lat.lang()** — get latitude by language
* **long.lang()** — get longitude by language
Some of them help to define a vector of languages.
* **lang.aff()** — get language by affiliation
* **lang.iso()** — get language by ISO 639-3 code
* **lang.gltc()** — get language by glottocode
Additionally there are some functions to convert glottocodes to ISO 639-3 codes and vice versa:
* **gltc.iso()** — get glottocode by ISO 639-3 code
* **iso.gltc()** — get ISO 639-3 code by glottocode
[Glottolog database (v. 2.7)](http://glottolog.org/) provides `lingtypology` with language names, ISO codes, genealogical affiliation, macro area, countries, coordinates, and much information. This set of functions doesn't have a goal to cover all possible combinations of functions. Check out additional information that is preserved in the version of the Glottolog database used in `lingtypology`:
```{r}
names(glottolog.original)
```
Using R functions for data manipulation you can create your own database for your purpose.
### 2.2 Using base functions
All functions introduced in the previous section are regular functions, so they can take the following objects as input:
* a regular string
```{r}
iso.lang("Adyghe")
lang.iso("ady")
country.lang("Adyghe")
lang.aff("West Caucasian")
```
* a vector of strings
```{r}
area.lang(c("Adyghe", "Aduge"))
lang <- c("Adyghe", "Russian")
aff.lang(lang)
```
* other functions. For example, let's try to get a vector of ISO codes for the Circassian languages
```{r}
iso.lang(lang.aff("Circassian"))
```
The behavior of most functions is rather predictable, but the function `country.lang` has an additional feature. By default this function takes a vector of languages and returns a vector of countries. But if you set the argument `intersection = TRUE`, then the function returns a vector of countries where all languages from the query are spoken.
```{r}
country.lang(c("Udi", "Laz"))
country.lang(c("Udi", "Laz"), intersection = TRUE)
```
### 2.3 Spell Checker: look carefully at warnings!
There are some functions that take country names as input. Unfortunately, some countries have alternative names. In order to save users the trouble of having to figure out the exact name stored in the database (for example _Ivory Coast_ or _Cote d'Ivoire_), all official country names and standard abbreviations are stored in the database:
```{r}
lang.country("Cape Verde")
lang.country("Cabo Verde")
head(lang.country("USA"))
```
All functions which take a vector of languages are enriched with a kind of a spell checker. If a language from a query is absent in the database, functions return a warning message containing a set of candidates with the minimal Levenshtein distance to the language from the query.
```{r}
aff.lang("Adyge")
```
### 2.4 Changes in the glottolog database
Unfortunately, the [Glottolog database (v. 2.7)](http://glottolog.org/) is not perfect for all my tasks, so I changed it a little bit. After [Robert Forkel's issue](https://github.com/ropensci/lingtypology/issues/1) I decided to add an argument `glottolog.source`, so that everybody has access to "original" and "modified" (by default) glottolog versions:
```{r}
is.glottolog(c("Abkhaz", "Abkhazian"), glottolog.source = "original")
```
```{r}
is.glottolog(c("Abkhaz", "Abkhazian"), glottolog.source = "modified")
```
<div class="parallax"></div>
## 3. Tasks
### 3.1 Celtic languages
How many Celtic languages in the database?
<form name="FormOne" onsubmit="return validateFormOne()" method="post">
<input type="text" name="answerOne">
<input type="submit" value="check">
</form>
### 3.2 Austronesian languages
How many Austronesian languages in the database?
<form name="FormTwo" onsubmit="return validateFormTwo()" method="post">
<input type="text" name="answerTwo">
<input type="submit" value="check">
</form>
### 3.3 Russian and Standard Arabic
What is the country where, according the database, Russian and Standard Arabic are spoken?
<form name="FormThree" onsubmit="return validateFormThree()" method="post">
<input type="text" name="answerThree">
<input type="submit" value="check">
</form>
<script>
function validateFormOne() {
var x = document.forms["FormOne"]["answerOne"].value;
if (x != "6") {
alert("I have a different answer...");
return false;
} else {
alert("Exactly!");
return false;
}
}
function validateFormTwo() {
var y = document.forms["FormTwo"]["answerTwo"].value;
if (y != "1247") {
alert("I have a different answer...");
return false;
} else {
alert("Exactly!");
return false;
}
}
function validateFormThree() {
var z = document.forms["FormThree"]["answerThree"].value;
if (z != "Israel") {
alert("I have a different answer...");
return false;
} else {
alert("Exactly!");
return false;
}
}
</script>