# Assignment Three {#a3resp}
### Assignment Three: The Week Four Assignment {.unnumbered}
**WDI and `ggplot2`**
* Create an R Notebook of a data analysis containing the following, and submit the rendered HTML file (e.g., `a3_123456.nb.html`, replacing 123456 with your ID)
1. create an R Notebook using the R Notebook Template in Moodle, save as `a3_123456.Rmd`,
2. write your name and ID and the contents,
3. run each code block,
4. preview to create `a3_123456.nb.html`,
5. submit `a3_123456.nb.html` to Moodle.
1. Choose at least one indicator of WDI
- Information of the data: Name, Indicator, Description, Source, etc.
- Download the data with `WDI`
- Explain why you chose the indicator
- List questions you want to study
2. Explore the data through visualization with `ggplot2`
- Use a histogram (geom_histogram), boxplot (geom_boxplot), a scatter plot (geom_point), a line plot (geom_line)
- For at least one chart, add a title and axis labels, and explain the chart
3. Observations and difficulties encountered.
**Due:** 2023-01-16 23:59:00. Submit your R Notebook file in Moodle (The Third Assignment). Due on Monday!
## Set up
Follow the workflow explained in EDA4 on January 18.
### EDA by RStudio: Step 1
In RStudio,
1.1. Project
* Create a new project: File > New Project; or
* Open a project: File > Open Project, Open Project in New Session, Open Recent Project
- It is easier to find an existing project from: File > Recent Project
* _Check there is a file `project_name.Rproj` in your project folder (directory)_
1.2. data folder (directory) `data`
* Create a data folder: Press New Folder at the right bottom pane; or
* Confirm the data folder previously created: Press Files at the right bottom pane
* _If you follow 1, the data folder exists in your project folder_
1.3. Move (or copy) data for the project to the data folder
* If you downloaded the data, it is in your Download folder. Move it to `data`.
* _Check in your RStudio that your data is in `data`: Press Files at the right bottom pane and click `data`, the data folder._
### EDA by RStudio: Step 2
2.1. Project Notebook: Memo
- Create an R Notebook: File > New File > R Notebook
+ You can use the R Notebook template in Moodle: move the template file (template.Rmd or template.nb.Rmd) into your project folder, or copy and paste its text into your new R Notebook.
+ If you use template.nb.Rmd (R Notebook File), choose Open in Editor.
- Add a descriptive title.
2.2. Setup Code Chunk
- Create a code chunk and add packages to use in the project and RUN the code.
+ library(tidyverse)
+ library(WDI)
+ or any other packages
```{r 9-1}
library(tidyverse)
library(WDI)
```
_Readers of a paper want to know the minimal set of packages they need to install, so load with `library()` only the packages you actually use in the paper._
2.3. Choose `Source` or `Visual` editor mode, and start editing Project Notebook
- Set up Headings such as: About, Data, Analysis and Visualizations, Conclusions
- Under About or Data, paste the URLs of the sites and/or the data
2.4. To write a report, save the notebook under a new name and edit the new file
- File > Save As...
## General Comments
### Variables
First, get to know the variables. At a minimum, you must know whether each variable is categorical or numerical.
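One way to make this check concrete is sketched below. It uses a small made-up data frame shaped like WDI output with `extra = TRUE` (the values and the names `df_toy`, `num_vars`, and `cat_vars` are invented for illustration) and sorts the columns into numerical and categorical ones.

```r
library(dplyr)

# Made-up rows shaped like WDI output with extra = TRUE (illustrative values only)
df_toy <- tibble(
  country = c("A", "B", "C"),
  year    = c(2020, 2020, 2020),
  value   = c(1.5, 2.5, 3.5),
  income  = c("High income", "Low income", "High income")
)

# Names of the numerical columns
num_vars <- df_toy %>% select(where(is.numeric)) %>% names()

# Names of the categorical columns (stored as character or factor)
cat_vars <- df_toy %>%
  select(where(~ is.character(.x) || is.factor(.x))) %>%
  names()

num_vars
cat_vars
```

Note that `year` is stored as a number even though it is often treated as a discrete label in charts, so the storage type is only a starting point for this decision.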
### Visualization
* What type of variation occurs within my variables?
* What type of covariation occurs between my variables?
## World Development Indicators (WDI)
### Setup
This is optional, but it is a good idea to run the following code chunk and pass `wdi_cache` to `WDIsearch()` so that searches use up-to-date metadata. See the help for `WDIcache`.
```{r 9-2, eval=FALSE}
wdi_cache <- WDIcache()
```
```{r 9-3, echo=FALSE, eval=FALSE}
wdi_cache <- WDIcache()
write_rds(wdi_cache, "./data/wdi_cache.RData")
```
```{r 9-4, echo=FALSE, eval=TRUE}
wdi_cache <- read_rds("./data/wdi_cache.RData")
```
### Indicators Used in Assignment Three
* NY.GDP.PCAP.KD: GDP per capita (constant 2015 US$)
* NY.GDP.MKTP.CD: GDP (current US$)
* NY.GDP.PCAP.KD.ZG: GDP per capita growth (annual %)
* SP.POP.TOTL: Total Population
* IP.JRN.ARTC.SC: Scientific and technical journal articles
* FP.CPI.TOTL.ZG: Inflation, consumer prices (annual %)
* DT.ODA.ODAT.PC.ZS: Net ODA received per capita (current US$)
* BX.KLT.DINV.CD.WD: Foreign direct investment, net inflows (BoP, current US$)
* MS.MIL.XPND.GD.ZS: Military expenditure (% of GDP)
* MS.MIL.TOTL.P1: Total Armed Forces Personnel
* SI.POV.MDIM: Multidimensional poverty headcount ratio (% of total population)
* SI.POV.NAHC: Poverty headcount ratio at national poverty lines (% of population)
* SI.POV.GINI: Gini index
* NY.GNS.ICTR.ZS: Gross savings (% of GDP)
* SL.UEM.TOTL.ZS: Unemployment, total (% of total labor force) (modeled ILO estimate)
* SP.URB.TOTL.IN.ZS: Urban population (% of total population)
* AG.LND.AGRI.ZS: Agricultural land (% of land area)
* SE.ADT.LITR.FE.ZS: Literacy rate, adult female (% of females ages 15 and above)
* SE.ADT.LITR.MA.ZS: Literacy rate, adult male (% of males ages 15 and above)
* SE.ADT.LITR.ZS: Literacy rate, adult total (% of people ages 15 and above)
* SE.ADT.1524.LT.FE.ZS: Literacy rate, youth female (% of females ages 15-24)
* SE.XPD.TOTL.GD.ZS: Government expenditure on education, total (% of GDP)
* SG.GEN.PARL.ZS: Proportion of seats held by women in national parliaments (%)
### WDIsearch
We obtain the metadata of WDI data (name, indicator, description, source, etc.) with `WDIsearch`.
* Usage
```
WDIsearch(string = "gdp", field = "name", short = TRUE, cache = NULL)
```
* Arguments
* `string`: Character string. Search for this string using grep with ignore.case=TRUE.
* `field`: Character string. Search this field. Admissible fields: 'indicator', 'name', 'description', 'sourceDatabase', 'sourceOrganization'
* `short`: TRUE: Returns only the indicator's code and name. FALSE: Returns the indicator's code, name, description, and source.
* `cache`: Data list generated by the WDIcache function. If omitted, WDIsearch will search a local list of series.
If you follow the order of the arguments, you can omit argument names: `string`, `field`, `short`, `cache`.
* The simplest format 1.
```{r 9-5}
WDIsearch("NY.GNS.ICTR.ZS", "indicator")
```
* The simplest format 2.
```{r 9-6}
WDIsearch("Literacy rate", "name")
```
* Detailed with short = FALSE
```{r 9-7}
WDIsearch("IP.JRN.ARTC.SC", "indicator", FALSE)
```
* You can use `wdi_cache` downloaded above to use the updated meta data.
```{r 9-8}
WDIsearch("Government expenditure on education", "name", FALSE, wdi_cache)
```
### Importing Data Using WDI
* The shortest form. The following is the same as:
`WDI(country = "all", indicator = "NY.GDP.PCAP.KD", start = 1960, end = NULL, extra = FALSE, cache = NULL, latest = NULL, language = "en")`
```{r 9-9, eval=FALSE}
WDI(indicator = "NY.GDP.PCAP.KD")
```
```{r 9-10, echo=FALSE, eval=FALSE}
df_gdp <- WDI(indicator = "NY.GDP.PCAP.KD")
write_csv(df_gdp, "./data/gdp.csv")
```
```{r 9-11, echo=FALSE, eval=TRUE}
df_gdp <- read_csv("./data/gdp.csv")
df_gdp
```
* To use the downloaded data, we need to assign a name to it. The name can be `df` if you download and use only one dataset, but a more descriptive name is better.
```{r 9-12, eval=FALSE}
df_gdppcap <- WDI(indicator = "NY.GDP.PCAP.KD")
df_gdppcap
```
```{r 9-13, echo=FALSE, eval=FALSE}
df_gdppcap <- WDI(indicator = "NY.GDP.PCAP.KD")
write_csv(df_gdppcap, "./data/gdppcap.csv")
```
```{r 9-14, echo=FALSE, eval=TRUE}
df_gdppcap <- read_csv("./data/gdppcap.csv")
df_gdppcap
```
* If you want to use extra information such as `region`, `income`, `lending`, use `extra = TRUE`.
```{r 9-15, eval=FALSE}
df_gdppcap_extra <- WDI(indicator = "NY.GDP.PCAP.KD", extra = TRUE)
df_gdppcap_extra
```
```{r 9-16, echo=FALSE, eval=FALSE}
df_gdppcap_extra <- WDI(indicator = "NY.GDP.PCAP.KD", extra = TRUE)
write_csv(df_gdppcap_extra, "./data/gdppcap_extra.csv")
```
```{r 9-17, echo=FALSE, eval=TRUE}
df_gdppcap_extra <- read_csv("./data/gdppcap_extra.csv")
df_gdppcap_extra
```
### Writing and reading a csv file
Since the data is generally large, it is better to work from a copy saved on your computer.
Before saving, check that you have a `data` folder (directory) in your project folder, using the `Files` tab of the bottom-right pane. You should see the project file ending in `.Rproj`, the R Notebook file you are editing, and the `data` folder. Note that `write_csv()` writes to the path exactly as given and does not append `.csv`, so include the extension in the file name as below.
```{r 9-18, eval = FALSE}
write_csv(df_gdppcap_extra, "./data/gdppcap_extra.csv")
```
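Before relying on a saved file, it can help to see the whole round trip on a small scale. The sketch below writes a made-up data frame to a temporary folder and reads it back; `df_toy`, `path`, and `df_back` are names invented for this illustration, and the temporary folder is used only so that nothing is written into your project.

```r
library(readr)

# Made-up data frame standing in for downloaded WDI data
df_toy <- data.frame(
  iso2c = c("AA", "BB"),
  year  = c(2020, 2021),
  value = c(1.5, 2.5)
)

# Write to a temporary path; the file name is used exactly as given
path <- file.path(tempdir(), "toy_extra.csv")
write_csv(df_toy, path)
file.exists(path)

# Read the file back; show_col_types = FALSE only silences the column-type message
df_back <- read_csv(path, show_col_types = FALSE)
df_back
```

In a project you would replace `path` with a relative path such as `"./data/toy_extra.csv"`.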
To read the data, run the next code chunk. You do not have to run the code `df_gdppcap_extra <- WDI(indicator = "NY.GDP.PCAP.KD", extra = TRUE)` again.
```{r 9-19, eval = FALSE}
df_gdppcap_extra <- read_csv("./data/gdppcap_extra.csv")
```
`"./data/gdppcap_extra.csv"` is the path to the file `gdppcap_extra.csv`, in csv format, inside the `data` folder. The leading `.` stands for the current working directory, which is normally the project folder.
### `country` argument
If you want to import data for several countries, you can use `iso2c` codes of the countries.
```{r 9-20}
ASEAN <- c("BN", "ID", "KH", "LA", "MM", "MY", "PH", "SG")
```
```{r 9-21, eval=FALSE}
df_gdppcap_asean <- WDI(ASEAN, "NY.GDP.PCAP.KD")
```
```{r 9-22, echo=FALSE, eval=FALSE}
df_gdppcap_asean <- WDI(ASEAN, "NY.GDP.PCAP.KD")
write_csv(df_gdppcap_asean, "./data/gdppcap_asean.csv")
```
```{r 9-23, echo=FALSE, eval=TRUE}
df_gdppcap_asean <- read_csv("./data/gdppcap_asean.csv")
df_gdppcap_asean
```
```{r 9-24}
df_gdppcap_asean %>% distinct(country) %>% pull()
```
```{r 9-25}
wdi_cache$country %>% filter(iso2c %in% ASEAN) %>% distinct(country) %>% pull()
```
You can also use `wdi_cache$country`.
```{r 9-26}
wdi_cache$country %>% slice(1:10)
```
```{r 9-27}
wdi_cache$country %>% filter(iso2c %in% ASEAN) %>% pull(country)
```
You can find `iso2c` codes in the downloaded data or in `wdi_cache$country`.
```{r 9-28}
wdi_cache$country %>%
filter(country %in% c("Brunei Darussalam", "Indonesia", "Cambodia", "Lao PDR", "Myanmar", "Malaysia", "Philippines", "Singapore")) %>%
pull(iso2c)
```
You can also find countries in some category.
```{r 9-29}
wdi_cache$country %>%
filter(region == "South Asia") %>%
pull(country)
```
```{r 9-30}
wdi_cache$country %>%
filter(income == "Lower middle income") %>%
pull(iso2c)
```
The following shows the first ten rows of the list of indicators.
```{r 9-31}
wdi_cache$series %>% slice(1:10)
```
The following is the same as `WDIsearch("gdp per cap", "name")`. See the help for `WDIsearch`.
```{r 9-32, eval=FALSE}
wdi_cache$series %>% filter(grepl("gdp per cap", name, ignore.case = TRUE))
```
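To see what this kind of filter does without downloading anything, here is a minimal sketch on a three-row stand-in for `wdi_cache$series`. The table `series_toy` is invented for illustration and has only two of the columns of the real table; the three codes and names are taken from the indicator list earlier in this chapter.

```r
library(dplyr)

# Small stand-in for wdi_cache$series (the real table has more rows and columns)
series_toy <- tibble(
  indicator = c("NY.GDP.PCAP.KD", "SP.POP.TOTL", "NY.GDP.PCAP.KD.ZG"),
  name = c(
    "GDP per capita (constant 2015 US$)",
    "Population, total",
    "GDP per capita growth (annual %)"
  )
)

# grepl() returns TRUE for each name containing the pattern;
# ignore.case = TRUE lets the lower-case "gdp" match "GDP"
hits <- series_toy %>%
  filter(grepl("gdp per cap", name, ignore.case = TRUE))
hits
```

Because the pattern is matched anywhere in the name, a short pattern such as `"gdp"` returns many indicators on the real table; a longer pattern narrows the search.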
## Responses to Questions
### Q1. How do we use histograms?
* [Histograms and frequency polygons](https://ggplot2.tidyverse.org/reference/geom_histogram.html)
* [Posit Cloud: Histograms](https://posit.cloud/learn/primers/3.3)
```{r 9-33}
wdi_cache$series %>% filter(indicator == "SG.GEN.PARL.ZS") %>% pull(name)
```
```{r 9-34, eval=FALSE}
df_women_in_parl <- WDI(indicator = c(women_in_parl = "SG.GEN.PARL.ZS")) %>%
drop_na(women_in_parl)
```
```{r 9-35, echo=FALSE, eval=FALSE}
write_csv(df_women_in_parl, "./data/women_in.parl.csv")
```
```{r 9-36, echo=FALSE, eval=TRUE}
df_women_in_parl <-read_csv("./data/women_in.parl.csv")
```
```{r 9-37}
df_women_in_parl %>% filter(year == 2020) %>%
ggplot(aes(women_in_parl)) + geom_histogram()
```
```{r 9-38}
df_women_in_parl %>% filter(year == 2020) %>%
ggplot(aes(women_in_parl)) + geom_histogram(binwidth = 5)
```
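The two histograms above differ only in how the bins are specified. As a rough sketch on made-up data (uniform random numbers, not WDI values), the chunk below contrasts `bins`, which sets how many bars are drawn, with `binwidth`, which sets the width of each bar in the units of the variable. The names `df_toy` and `pct` are invented for this illustration.

```r
library(ggplot2)

# Made-up percentages between 0 and 60 (illustrative values only)
set.seed(1)
df_toy <- data.frame(pct = runif(200, min = 0, max = 60))

# bins fixes the number of bars; binwidth fixes the width of each bar
p_bins  <- ggplot(df_toy, aes(pct)) + geom_histogram(bins = 12)
p_width <- ggplot(df_toy, aes(pct)) + geom_histogram(binwidth = 5)

# layer_data() returns the computed bars, so the bin edges can be inspected
bars_width <- layer_data(p_width)
unique(round(bars_width$xmax - bars_width$xmin, 6))

# Every observation falls into exactly one bar
sum(bars_width$count)
```

For a percentage variable, a `binwidth` such as 5 is often easier to explain in a report than a bin count, because each bar then covers a round interval.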
```{r 9-39}
wdi_cache$series %>% filter(indicator == "IP.JRN.ARTC.SC") %>% pull(name)
```
```{r 9-40, eval=FALSE}
df_stja <- WDI(indicator = c(stja = "IP.JRN.ARTC.SC"), extra = TRUE) %>%
drop_na(stja)
```
```{r 9-41, echo=FALSE, eval=FALSE}
write_csv(df_stja, "./data/stja.csv")
```
```{r 9-42, echo=FALSE, eval=TRUE}
df_stja <-read_csv("./data/stja.csv")
```
```{r 9-43}
df_stja$year %>% summary()
```
The default number of bins is 30. Try other values to find an appropriate one.
```{r 9-44}
df_stja %>% filter(income != "Aggregates") %>%
ggplot(aes(stja)) + geom_histogram(bins = 20)
```
```{r 9-45}
df_stja %>% filter(income != "Aggregates", stja >0) %>%
ggplot(aes(stja)) + geom_histogram(bins = 20) + scale_x_log10()
```
```{r 9-46}
df_stja %>% filter(income != "Aggregates", income != "Not classified", stja >0) %>%
ggplot(aes(stja, fill = income)) + geom_histogram(alpha = 0.7, bins = 20, color = "black") + scale_x_log10()
```
```{r 9-47}
df_stja %>% filter(income != "Aggregates", income != "Not classified", stja >0) %>%
ggplot(aes(stja, fill = income)) + geom_density(alpha = 0.3) + scale_x_log10()
```
```{r 9-48}
wdi_cache$series %>% filter(indicator == "MS.MIL.XPND.GD.ZS") %>% pull(name)
```
```{r 9-49, eval=FALSE}
df_milxpnd <- WDI(indicator = c(milxpnd = "MS.MIL.XPND.GD.ZS"), extra = TRUE) %>%
drop_na(milxpnd)
```
```{r 9-50, echo=FALSE, eval=FALSE}
write_csv(df_milxpnd, "./data/milxpnd.csv")
```
```{r 9-51, echo=FALSE, eval=TRUE}
df_milxpnd <-read_csv("./data/milxpnd.csv")
```
The default number of bins is 30. Here I chose `binwidth = 0.5` (percentage points).
```{r 9-52}
df_milxpnd %>% filter(income != "Aggregates", year == 2021) %>%
ggplot(aes(milxpnd, fill = income)) + geom_histogram(color = "black", binwidth = 0.5) +
labs(title = "Military expenditure (% of GDP)")
```
```{r 9-53}
df_milxpnd %>% filter(income != "Aggregates", income != "Not classified", year == 2021) %>%
ggplot(aes(milxpnd, fill = income)) + geom_density(alpha = 0.3) +
labs(title = "Military expenditure (% of GDP)")
```
### Q2. Effect of the pandemic on the unemployment rate, by income group
```{r 9-54}
wdi_cache$series %>% filter(indicator == "SL.UEM.TOTL.ZS") %>% pull(name)
```
```{r 9-55, eval=FALSE}
df_ur <- WDI(indicator = c(ur = "SL.UEM.TOTL.ZS"),
extra = TRUE, cache = wdi_cache) %>% drop_na(ur)
```
```{r 9-56, echo=FALSE, eval=FALSE}
write_csv(df_ur, "./data/ur.csv")
```
```{r 9-57, echo=FALSE, eval=TRUE}
df_ur <-read_csv("./data/ur.csv")
```
```{r 9-58}
df_ur %>% filter(income == "Aggregates") %>% filter(grepl('income', country)) %>%
filter(year >= 2018) %>% ggplot() + geom_line(aes(x = year, y = ur, color = country))
```
### Q3. Add values at each data point.
```{r 9-59}
wdi_cache$series %>% filter(indicator == "FP.CPI.TOTL.ZG") %>% pull(name)
```
```{r 9-60, eval=FALSE}
df_infl <- WDI(country = c("VN","CN","JP"),
indicator = c(inf = "FP.CPI.TOTL.ZG"), extra = TRUE) %>% drop_na(inf)
```
```{r 9-61, echo=FALSE, eval=FALSE}
write_csv(df_infl, "./data/infl.csv")
```
```{r 9-62, echo=FALSE, eval=TRUE}
df_infl <-read_csv("./data/infl.csv")
```
```{r 9-63}
df_infl_s <- df_infl %>% filter(year %in% c(2000, 2005, 2010, 2015, 2020),
country %in% c("Japan", "Vietnam", "China"))
df_infl_s
```
```{r 9-64}
df_infl_s %>%
ggplot(aes(x = year, y = inf, color= country)) +
geom_line() + geom_point() +
geom_text(aes(label = round(inf,2)), nudge_y = 0.8) +
labs(title = "Inflation of Japan, Vietnam and China from 2000 to 2020",
x = "", y = "Inflation, consumer prices (annual %)")
```
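Two small refinements are sometimes useful when labelling points: formatting the labels to a fixed number of decimals, and keeping the text layer out of the legend. The sketch below shows both on made-up values for two hypothetical countries `A` and `B`; it is one possible variation, not a replacement for the chart above.

```r
library(ggplot2)

# Made-up inflation values for two hypothetical countries
df_toy <- data.frame(
  country = rep(c("A", "B"), each = 3),
  year    = rep(c(2010, 2015, 2020), times = 2),
  inf     = c(1.234, 2.345, 0.567, 5.432, 3.21, 4.5)
)

p <- ggplot(df_toy, aes(x = year, y = inf, color = country)) +
  geom_line() +
  geom_point() +
  # sprintf() formats each value with one decimal; vjust = -1 places the label
  # just above its point; show.legend = FALSE keeps the text glyph out of the legend
  geom_text(aes(label = sprintf("%.1f", inf)), vjust = -1, show.legend = FALSE) +
  # show only the years that are actually in the data
  scale_x_continuous(breaks = c(2010, 2015, 2020))
p
```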
## Regression Lines; Topic of EDA5
```{r 9-65}
wdi_cache$series %>% filter(indicator %in% c("SI.POV.NAHC","SI.POV.MDIM")) %>% pull(name)
```
```{r 9-66, cache=TRUE, eval=FALSE}
df_wdi_poverty <- WDI(
indicator = c(poverty = "SI.POV.NAHC", multipoverty = "SI.POV.MDIM", gdppercap = "NY.GDP.PCAP.KD"), start = 1990,
extra = TRUE) %>% drop_na(poverty, multipoverty, gdppercap)
```
```{r 9-67, echo=FALSE, eval=FALSE}
write_csv(df_wdi_poverty, "./data/wdi_poverty.csv")
```
```{r 9-68, echo=FALSE, eval=TRUE}
df_wdi_poverty <-read_csv("./data/wdi_poverty.csv")
```
```{r 9-69}
df_wdi_poverty %>%
  filter(income != "Aggregates") %>%
  # one row per country: average over the available years
  group_by(country, income) %>%
  summarize(mean_gdp = mean(gdppercap), mean_poverty = mean(poverty), .groups = "drop") %>%
  ggplot(aes(x = mean_gdp, y = mean_poverty)) +
  geom_point(aes(color = income)) +
  geom_smooth(formula = y ~ x, method = "lm", se = FALSE, linetype = "longdash", color = "black") +
  scale_x_log10() +
  labs(x = "GDP per capita", y = "poverty rate (% of population)", title = "Poverty rates and GDP per capita", subtitle = "world countries, 1990-2021 average, by income level")
```
```{r 9-70}
df_wdi_poverty %>%
  filter(region != "Aggregates") %>%
  # one row per country: average over the available years
  group_by(country, region) %>%
  summarize(mean_gdp = mean(gdppercap), mean_multipoverty = mean(multipoverty), .groups = "drop") %>%
  ggplot(aes(x = mean_gdp, y = mean_multipoverty)) +
  geom_point(aes(color = region)) +
  geom_smooth(formula = y ~ x, method = "lm", se = FALSE, linetype = "longdash", color = "black") +
  scale_x_log10() +
  labs(x = "GDP per capita", y = "Multidimensional poverty rate (% of population)", title = "Multidimensional poverty rates and GDP per capita", subtitle = "world countries, 1990-2021 average, by region")
```
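Because `scale_x_log10()` transforms the x values before the smoothing statistic is computed, the dashed line in these charts is a least-squares fit of the poverty rate on log10 of GDP per capita. The sketch below fits that kind of model directly with `lm()` on made-up numbers, chosen so that the answer is easy to check by hand; `df_toy` and `fit` are names invented for this illustration.

```r
# Made-up data: poverty falls by exactly 10 points for every tenfold increase in gdp
df_toy <- data.frame(
  gdp     = c(100, 1000, 10000, 100000),
  poverty = c(40, 30, 20, 10)
)

# Regress poverty on log10(gdp), the scale on which the charts draw their line
fit <- lm(poverty ~ log10(gdp), data = df_toy)

# Intercept and slope; the slope is the change in poverty per tenfold increase in gdp
coef(fit)
```

Read this way, the slope of the dashed line describes the change in the poverty rate associated with a tenfold difference in GDP per capita, not with a one-dollar difference.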
## Appendix: A secondary axis
* [Specify a secondary axis](https://ggplot2.tidyverse.org/reference/sec_axis.html)
- This function is used in conjunction with a position scale to create a secondary axis, positioned opposite of the primary axis. All secondary axes must be based on a one-to-one transformation of the primary axes.
`scale_y_continuous(sec.axis = sec_axis(~ . scaling_function))`
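As a minimal sketch of the idea, assume two made-up series on very different scales. The second series is divided by a factor `k` so that it can be drawn against the primary axis, and the secondary axis multiplies by the same `k` so that its tick labels are in the original units. The names `df_toy`, `small`, `big`, and `k` are invented for this illustration, and taking the ratio of the maxima is only one possible way to choose the factor.

```r
library(ggplot2)

# Made-up series on very different scales (illustrative values only)
df_toy <- data.frame(
  year  = 2000:2004,
  small = c(1, 2, 3, 4, 5),
  big   = c(2e6, 4e6, 5e6, 8e6, 1e7)
)

# One possible scaling factor: the ratio of the two maxima
k <- max(df_toy$big) / max(df_toy$small)

p <- ggplot(df_toy, aes(x = year)) +
  geom_line(aes(y = small)) +
  # divide by k so that big fits on the primary axis
  geom_line(aes(y = big / k), linetype = "dashed") +
  # multiply by k so that the secondary axis shows big in its own units
  scale_y_continuous(name = "small", sec.axis = sec_axis(~ . * k, name = "big"))
p
```

The factor used inside `sec_axis()` must be the inverse of the one applied to the data; otherwise the secondary axis labels no longer correspond to the plotted line.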
### Example
Suppose you have two indicators,
* NY.GDP.MKTP.KD: GDP (constant 2015 US$)
* NY.GDP.PCAP.KD: GDP per capita (constant 2015 US$)
```{r 9-71}
WDIsearch(string = "NY.GDP.MKTP.KD", field = "indicator", short = FALSE, cache = wdi_cache)
```
```{r 9-72}
WDIsearch(string = "NY.GDP.PCAP.KD", field = "indicator", short = FALSE, cache = wdi_cache)
```
List the names of the ASEAN and BRICs countries as they appear in `wdi_cache$country`.
```{r 9-73}
asean <- c("Brunei Darussalam", "Cambodia", "Lao PDR", "Myanmar",
"Philippines", "Indonesia", "Malaysia", "Singapore")
brics <- c("Brazil", "Russian Federation", "India", "China")
```
Find the `iso2c` of the countries using `wdi_cache$country`.
```{r 9-74}
wdi_cache$country %>%
filter(country %in%
c("Brunei Darussalam", "Cambodia", "Lao PDR", "Myanmar",
"Philippines", "Indonesia", "Malaysia", "Singapore",
"Brazil", "Russian Federation", "India", "China")) %>%
pull(iso2c)
```
Separate the `iso2c` codes of the countries with commas and read the data using `WDI`.
```{r 9-75, eval=FALSE}
wdi_gdp <- WDI(
country = c("BR", "BN", "CN", "ID", "IN", "KH", "LA", "MM", "MY", "PH", "RU", "SG"),
indicator = c(gdp = "NY.GDP.MKTP.KD", gdpPercap = "NY.GDP.PCAP.KD"),
start = 1960, extra = TRUE, cache = wdi_cache)
```
```{r 9-76, echo=FALSE, eval=FALSE}
write_csv(wdi_gdp, "./data/wdi_gdp.csv")
```
```{r 9-77, echo=FALSE, eval=TRUE}
wdi_gdp <-read_csv("./data/wdi_gdp.csv")
```
```{r 9-78}
wdi_gdp %>% filter(country %in% asean) %>% drop_na(gdp, gdpPercap) %>% summary()
```
Using the summary, decide the scaling of two variables, `gdpPercap` and `gdp`.
```{r 9-79}
wdi_gdp %>% drop_na(gdp, gdpPercap) %>%
filter(country %in% asean) %>%
ggplot() +
geom_line(aes(x = year, y = gdpPercap, linetype = country)) +
geom_line(aes(x = year, y = gdp/(10^7), col = country)) +
coord_trans(x ="identity", y="log10") +
scale_y_continuous(sec.axis = sec_axis(~ . *(10^7), name = "gdp"))
```
#### BRICs
```{r 9-80}
wdi_gdp %>% filter(country %in% brics) %>% drop_na(gdp, gdpPercap) %>% summary()
```
Using the summary, decide the scaling of two variables, `gdpPercap` and `gdp`.
```{r 9-81}
wdi_gdp %>% drop_na(gdp, gdpPercap) %>%
filter(country %in% brics) %>%
ggplot() +
geom_line(aes(x = year, y = gdpPercap, linetype = country)) +
geom_line(aes(x = year, y = gdp/(10^9), col = country)) +
coord_trans(x ="identity", y="log10") +
scale_y_continuous(sec.axis = sec_axis(~ . *(10^9), name = "gdp"))
```