-
Notifications
You must be signed in to change notification settings - Fork 3
/
Copy pathIntroduction_to_RMarkdown.Rmd
390 lines (224 loc) · 10.2 KB
/
Introduction_to_RMarkdown.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
---
title: "A Hands-On Introduction to R Markdown"
author: "Ben B-L"
date: "`r format(Sys.time(), '%d %B, %Y')`"
output:
ioslides_presentation:
code_folding: "hide"
logo: images-rmarkdown/rmarkdown.png
incremental: true
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = FALSE)
library(dplyr)
library(ggplot2)
theme_set(theme_bw())
```
## Topics {data-background=#cceeff}
* Concept of reproducible documents
* R Markdown (and Markdown)
* Running code _inline_ and via _code chunks_
* Graphics and tables
* Output formats
* Fancier things
* Resources
**Goal: to be comfortable making basic reproducible documents.**
## Assumptions
* You're familiar with the basic mechanics of R
This is intended to be a hands-on workshop, so also:
* You have R (and probably RStudio) installed
* You have the [rmarkdown](https://cran.r-project.org/web/packages/rmarkdown/index.html) package installed
**Note:** this presentation doesn't (yet) cover the newer [Quarto](https://quarto.org) publishing system, but will provide a good foundation for learning it.
## Reproducible documents {data-background=#cceeff}
_What_ are reproducible documents?
"A reproducible document is one where any code used by the original researcher to compute an output (a graph, equation, table or statistical result) is embedded within the narrative describing the work."
From the [eLife Reproducible Document Stack](https://elifesciences.org/labs/7dbeb390/reproducible-document-stack-supporting-the-next-generation-research-article)
## Reproducible documents
Here we are in Word:
<img src="images-rmarkdown/word_screenshot.png" width = "100%">
We have _no idea_ where these numbers came from.
## Reproducible documents
Copy-and-paste into word processing documents _sucks_, and it's easy to make mistakes (not update numbers, copy the wrong number, etc).
<img src="images-rmarkdown/copypaste.jpg" style="float:right" width = "50%">
## Reproducible documents
What if I want to generate 100 formatted reports for 100 different datasets?
<img src="images-rmarkdown/dataset_reports.png" width = "100%">
## Reproducible documents
<div style="float: left; width: 50%;">
_Why_ do we want to use reproducible documents?
* Transparency
* Reproducibility
* Reduce mistakes
* **SAVE TIME AND WORK**
</div>
<div style="float: right; width: 50%;">
<img src="images-rmarkdown/automation.png" width = "95%">
[xkcd](https://xkcd.com/1319/)
</div>
## Reproducible documents
<iframe width="922" height="519" src="https://www.youtube.com/embed/PCRoc0JxRWg" frameborder="0" allow="accelerometer; autoplay; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe>
## R Markdown {data-background=#cceeff}
R Markdown is an easy-to-write plain text format for creating dynamic documents and reports.
It's designed to let us mix _text_ and _code_ to produce a [Markdown](https://en.wikipedia.org/wiki/Markdown) document, that is then transformed into a final output
## R Markdown: under the hood
<img src="images-rmarkdown/workflow.png" width = "100%">
[The R Markdown Cookbook](https://bookdown.org/yihui/rmarkdown-cookbook/rmarkdown-process.html)
## Let's dive in
<img src="images-rmarkdown/create-img1.png" width = "100%">
## Let's dive in
<img src="images-rmarkdown/create-img2.png" width = "100%">
## Let's dive in
What do we have? A _text document_ (the R Markdown file).
<img src="images-rmarkdown/knit.png" width = "100%">
## Let's dive in
What happened?
<img src="images-rmarkdown/knitted.png" width = "100%">
## Text: Markdown
```
# A big header
## A smaller one
### Smaller still
A [link](https://daringfireball.net/projects/markdown/).
A sentence with _italics_ in it. **Boldface.**
```
A [link](https://daringfireball.net/projects/markdown/). A sentence with _italics_ in it. **Boldface.**
<img src="images-rmarkdown/third_way.png" width = "50%">
https://xkcd.com/1285/
## Text: Markdown
<img src="images-rmarkdown/markdown-quickref.png" width = "50%">
## Code: inline
R code in Markdown documents can be _inline_ or in _code chunks_.
Inline code uses backticks, inside of which is an R expression that starts with "r". It is typically used for something short:
* These slides were made with `` `r "\u0060r R.version.string\u0060"` ``.
* These slides were made with `r R.version.string`.
* Two plus two equals `` `r "\u0060r 2+2\u0060"` ``.
* Two plus two equals `r 2+2`.
You can also easily insert equations like $\sum_{n=1}^{10} n^2$
## Code: chunks
The other mechanism to include R code in your document is via code chunks. These are evaluated by `knitr` and the results inserted into your document.
Let's try this!
**Two important notes:**
* You can give chunks individual names, which makes it easier to navigate around longer documents.
* RStudio has nice controls for running individual chunks.
## Code: chunks
Side note: code chunks don't have to be in R!
<img src="images-rmarkdown/chunk-langs.png" width = "15%">
```{bash bash-no-eval, echo=TRUE, eval=FALSE}
ls *.Rmd
```
```{bash bash}
ls *.Rmd
```
## Chunk options
We can control the behavior of code chunks by using _chunk options_ (settings). Some common ones include:
Option | Default | Effect
------- | -------- | --------
eval | TRUE | Whether to evaluate the chunk's code
echo | TRUE | Whether to display chunk's code
include | TRUE | Whether to include chunk output
message | TRUE | Whether to display chunk messages
cache | FALSE | Whether to cache results
## Chunk options
Try inserting a new chunk with the code `dim(iris)`.
What happens if you set `eval=FALSE`? What about `echo=FALSE`?
Why?
<img src="images-rmarkdown/chunk-options.png" width = "100%">
## Graphics
Say we have a figure we want to include:
```{r gapminder-chunk, echo=TRUE, eval=FALSE}
library(gapminder)
p <- gapminder %>%
filter(year == 1977) %>%
ggplot(aes(gdpPercap, lifeExp, size = pop, color = continent)) +
geom_point() +
scale_x_log10() +
ggtitle("gapminder 1977")
print(p)
```
## Graphics
```{r gapminder-graph, echo=FALSE}
library(gapminder)
p <- gapminder %>%
filter(year == 1977) %>%
ggplot(aes(gdpPercap, lifeExp, size = pop, color = continent)) +
geom_point() +
scale_x_log10()
print(p)
```
## Graphics (behind the scenes)
<img src="images-rmarkdown/gapminder-code.png" width = "100%">
## Graphics
Let's add a graph to your document.
Note there are chunk options `fig.cap`, `fig.width`, and `fig.height` you can change.
## Tables
Tables can be made by hand (see the Markdown help page), but usually we have a data frame that we want to display.
The simplest method uses `knitr`'s built-in `kable()` function:
```{r simple-table}
knitr::kable(iris)
```
## Output formats
We haven't really talked about [pandoc](https://pandoc.org/), the software that transforms Markdown into an output format of our choice.
<img src="images-rmarkdown/knit-options.png" width = "25%">
Use the Knit menu button, or the `output:` line in the YAML header, to change the output format to e.g. `word_document" or "pdf_document".
These may require extra software to be installed.
## Session info
For reproducibility, it is a good practice to call `sessionInfo()` at the end of your document. Like this:
```{r sessioninfo, echo=TRUE}
sessionInfo()
```
## Fancier things to whet your appetite {data-background=#cceeff}
This has been the tip of the iceberg.
<img src="images-rmarkdown/iceberg.jpg" width = "100%">
## Fancier things: interactive graphs
```{r plotly, message=FALSE}
library(plotly)
p <- gapminder %>%
filter(year == 1977) %>%
ggplot(aes(gdpPercap, lifeExp, size = pop, color = continent)) +
geom_point() +
scale_x_log10() +
theme_bw() +
ggtitle("gapminder 1977")
ggplotly(p)
```
## Fancier things: interactive tables
```{r DT, message=FALSE}
library(DT)
datatable(gapminder, rownames = FALSE, filter = "top",
options = list(pageLength = 5, scrollX = TRUE)) %>%
formatStyle(columns = 1:6, fontSize = "50%")
```
\
This takes _~~one~~ two_ lines of code in RMarkdown. Example based on [this post](https://holtzy.github.io/Pimp-my-rmd/#use_dt_for_tables).
## Fancier things: interactive maps
```{r leaflet, out.width='100%', echo=FALSE}
library(leaflet)
leaflet() %>%
addTiles() %>%
setView(-76.9219, 38.9709, zoom = 17) %>%
addPopups(-76.9219, 38.9709,
"Here is the <b>Joint Global Change Research Institute</b>")
```
## Fancier things: citation management
```
In a subsequent paper [@Bond-Lamberty2009-py], we used the same model
outputs to examine the _hydrological_ implications of these wildfire
regime shifts [@Nolan2014-us].
```
In a subsequent paper (Bond-Lamberty et al. 2009), we used the same model outputs to examine the _hydrological_ implications of these wildfire regime shifts (Nolan et al. 2014).
**References**
Bond-Lamberty, Ben, Scott D Peckham, Stith T Gower, and Brent E Ewers. 2009. “Effects of Fire on Regional Evapotranspiration in the Central Canadian Boreal Forest.” Glob. Chang. Biol. 15 (5): 1242–54.
Nolan, Rachael H, Patrick N J Lane, Richard G Benyon, Ross A Bradstock, and Patrick J Mitchell. 2014. “Changes in Evapotranspiration Following Wildfire in Resprouting Eucalypt Forests.” Ecohydrol. 6 (January). Wiley Online Library.
## Resources
Good resources:
* The [R Markdown Cheat Sheet](https://rstudio.com/wp-content/uploads/2015/02/rmarkdown-cheatsheet.pdf)
* The [R Markdown Cookbook](https://bookdown.org/yihui/rmarkdown-cookbook/)
* [15 Tips on Making Better Use of R Markdown](https://slides.yihui.org/2019-dahshu-rmarkdown#1)
* [Stack Overflow](https://stackoverflow.com/questions/tagged/r-markdown) of course
* Allison Hill's [How I Teach R Markdown](https://alison.rbind.io/post/2020-05-28-how-i-teach-r-markdown/)
* [How to Make Beautiful Tables in R](https://rfortherestofus.com/2019/11/how-to-make-beautiful-tables-in-r/)
## The End {data-background=#cceeff}
Thanks for attending this introduction to R Markdown documents workshop! We hope it was useful.
This presentation was made using R Markdown version `r packageVersion("rmarkdown")` running under `r R.version.string`.
These slides are available at https://rpubs.com/bpbond/983114.
They were written in R Markdown! The code is [here](https://github.com/bpbond/R-workshops/blob/main/Introduction_to_RMarkdown.Rmd).