From b16a2d3cef2a456f6eea654b0e966037e18e3871 Mon Sep 17 00:00:00 2001
From: Manny Gimond <mgimond@colby.edu>
Date: Wed, 17 Apr 2024 09:22:42 -0400
Subject: [PATCH] small edits

---
 bivariate.qmd                                 |  10 +++++-----
 ...-25_015ed87426b2e1459bdf2eaf4001adcd.RData | Bin 365 -> 0 bytes
 ...-25_580c4e09e0751ca71793d95540cf5875.RData | Bin 0 -> 363 bytes
 ...k-25_580c4e09e0751ca71793d95540cf5875.rdb} |   0
 ...k-25_580c4e09e0751ca71793d95540cf5875.rdx} | Bin
 .../figure-html/unnamed-chunk-25-1.png        | Bin 5582 -> 5502 bytes
 docs/bivariate.html                           |  12 ++++++------
 .../figure-html/unnamed-chunk-25-1.png        | Bin 5582 -> 5502 bytes
 docs/search.json                              |   4 ++--
 9 files changed, 13 insertions(+), 13 deletions(-)
 delete mode 100644 bivariate_cache/html/unnamed-chunk-25_015ed87426b2e1459bdf2eaf4001adcd.RData
 create mode 100644 bivariate_cache/html/unnamed-chunk-25_580c4e09e0751ca71793d95540cf5875.RData
 rename bivariate_cache/html/{unnamed-chunk-25_015ed87426b2e1459bdf2eaf4001adcd.rdb => unnamed-chunk-25_580c4e09e0751ca71793d95540cf5875.rdb} (100%)
 rename bivariate_cache/html/{unnamed-chunk-25_015ed87426b2e1459bdf2eaf4001adcd.rdx => unnamed-chunk-25_580c4e09e0751ca71793d95540cf5875.rdx} (100%)

diff --git a/bivariate.qmd b/bivariate.qmd
index 3537fd97..da7d2980 100644
--- a/bivariate.qmd
+++ b/bivariate.qmd
@@ -118,7 +118,7 @@ Non-parametric fit applies to the family of fitting strategies that *do not* imp
 
 #### Loess
 
-A flexible curve fitting option is the **loess**  curve (short for **lo**cal regr**ess**ion; also known as the *local weighted regression*). Unlike the parametric approach to fitting a curve, the loess does **not** impose a structure on the data. The loess curve fits small segments of a regression lines across the range of x-values, then links the mid-points of these regression lines to generate the *smooth* curve. The range of x-values that contribute to each localized regression lines is defined by the $\alpha$ parameter which usually ranges from 0.2 to 1. The larger the $\alpha$ value, the smoother the curve. The other parameter that defines a loess curve is $\lambda$: it defines the polynomial order of the localized regression line. This is usually set to 1 (though `ggplot2`'s implementation of the loess defaults to a 2^nd^ order polynomial).
+A flexible curve fitting option is the **loess** curve (short for **lo**cal regr**ess**ion; also known as the *local weighted regression*). Unlike the parametric approach to fitting a curve, the loess does **not** impose a structure on the data. The loess curve fits short regression line segments across the range of x-values, then links the mid-points of these segments to generate the *smooth* curve. The range of x-values that contributes to each localized regression line is defined by the **span** parameter, $\alpha$, which usually ranges from 0.2 to 1 (though it can be greater than 1 for smaller datasets). The larger the $\alpha$ value, the smoother the curve. The other parameter that defines a loess curve is $\lambda$: it defines the **polynomial order** of the localized regression line. This is usually set to 1 (though `ggplot2`'s implementation of the loess defaults to a 2^nd^ order polynomial).
 
 ```{r echo=FALSE}
 library(dplyr)
@@ -297,7 +297,7 @@ ggplot(df, aes(x = area, y = residuals)) + geom_point() +
 
 ```
 
-We are interested in identifying any pattern in the residuals. If the model does a good job in fitting the data, the points should be uniformly distributed across the plot and the loess fit should approximate a horizontal line. With the linear model `M`, we observe a convex pattern in the residuals suggesting that the linear model is not a good fit. We say that the residuals show *dependence* on the x values. 
+We are interested in identifying any pattern in the residuals. **If the model does a good job in fitting the data, the points should be uniformly distributed across the plot** and the loess fit should approximate a horizontal line. With the linear model `M`, we observe a convex pattern in the residuals suggesting that the linear model is not a good fit. We say that the residuals show *dependence* on the x values. 
 
 Next, we'll look at the residuals from the second order polynomial model `M2`.
 
@@ -334,7 +334,7 @@ where $\varepsilon$ is a constant that does not vary as a function of varying $x
 
 ### Spread-location plot
 
-The `M2` and `lo` models do a good job in eliminating any dependence between residual and x-value. Next, we will check that the residuals do not show a dependence with *fitted* y-values. This is analogous to univariate analysis where we checked if residuals increased or decreased with increasing medians across categories. Here we will compare residuals to the fitted `cp.ratio` values (for a univariate analogy, think of the fitted line as representing a *level* across different segments along the x-axis). We'll generate a spread-level plot of model `M2`'s residuals (note that in the realm of regression analysis, such plot is often referred to as a **scale-location** plot). We'll also add a loess curve to help visualize any patterns in the plot.
+The `M2` and `lo` models do a good job in eliminating any dependence between residual and x-value. Next, we will check that **the residuals do not show a dependence with *fitted* y-values**. This is analogous to univariate analysis where we checked if residuals increased or decreased with increasing medians across categories. Here we will compare residuals to the fitted `cp.ratio` values (for a univariate analogy, think of the fitted line as representing a *level* across different segments along the x-axis). We'll generate a spread-level plot of model `M2`'s residuals (note that in the realm of regression analysis, such a plot is often referred to as a **scale-location** plot). We'll also add a loess curve to help visualize any patterns in the plot.
 
 ```{r fig.height=2.5, fig.width=2.5, small.mar=TRUE}
 sl2 <- data.frame( std.res = sqrt(abs(residuals(M2))), 
@@ -345,12 +345,12 @@ ggplot(sl2, aes(x = fit, y  =std.res)) + geom_point() +
                           method.args = list(degree = 1) )
 ```
 
-The function `predict()` extracts the y-values from the fitted model `M2` and is plotted along the x-axis. It's clear from this plot that the residuals are not homogeneous; they increase as a function of increasing *fitted* CP ratio. The "bend" observed in the loess curve is most likely due to a single point at the far (right) end of the fitted range. Given that we have a small batch of numbers, a loess can be easily influenced by an outlier. We may want to increase the loess span.
+The function `predict()` extracts the fitted y-values from model `M2`; these are plotted along the x-axis. It's clear from this plot that the residuals are not homogeneous; they increase as a function of increasing *fitted* CP ratio. The "bend" observed in the loess curve is most likely due to a single point at the far (right) end of the fitted range. Given that we have a small batch of numbers, a loess can be easily influenced by an outlier. We may therefore want to increase the loess span by setting `span = 2`.
 
 
 ```{r fig.height=2.5, fig.width=2.5, small.mar=TRUE}
 ggplot(sl2, aes(x = fit, y = std.res)) + geom_point() +
-              stat_smooth(method = "loess", se = FALSE, span = 1.5, 
+              stat_smooth(method = "loess", se = FALSE, span = 2, 
                           method.args = list(degree = 1) )
 ```
 
diff --git a/bivariate_cache/html/unnamed-chunk-25_015ed87426b2e1459bdf2eaf4001adcd.RData b/bivariate_cache/html/unnamed-chunk-25_015ed87426b2e1459bdf2eaf4001adcd.RData
deleted file mode 100644
index c200f7f4936e2dff8c52352fde4a7307aa427931..0000000000000000000000000000000000000000
GIT binary patch
literal 0
HcmV?d00001

literal 365
zcmV-z0h0b7iwFP!000002Cb3bO2aS|#xu8Fm@6M4?xroaU0XMugI)+aZ@lsk3`Dl9
zIc<Y!Q<_vAV-Mqtim6^q#Y+!_d>@?iJLJpF#rzo0F+!+|dfh$L?Hh^u_jfaIf>3YI
zm=5ZpeRE&=EB;JtPBknNn&dFdbs>gLS<(WSm*h?P<VE8}5RQROPDjznDuQq{K3lOg
z0-BD3Afzl|26eE*cHXAbDcSf52tiaqh42en$Cj=eut-i-9$Wvj*V<ydTrM|$O+L4I
ziDa<FSyl<DZ6zX?P*C<OiAl<}OV&mx&HNga<B%hgfh-o4<fXQa#pFjRP4z+*QtI3;
zK<AQ~>4A`-)W9VQjGLWb-(DJB(bBlkAG_oi<a>QVd|GG9bPKMu%^<6xC4~-g+NWT=
z%g&RHt{Uwzt|~#-TYUKNXzzkcxd1gxdDhh6{h6N^`c;`Zn-|V>9!H~K+tS-(KI{Dh
L;+EC?C;|Wg^3kv*

diff --git a/bivariate_cache/html/unnamed-chunk-25_580c4e09e0751ca71793d95540cf5875.RData b/bivariate_cache/html/unnamed-chunk-25_580c4e09e0751ca71793d95540cf5875.RData
new file mode 100644
index 0000000000000000000000000000000000000000..7e6dfe420c5ff1c9fc45d59c22aaa8b3ae142d34
GIT binary patch
literal 363
zcmV-x0hIn9iwFP!000002Cb1zOT#c6#<RDBIr$OdE^Tq`+HIxVpa((c#gi{E5b0w3
zZyQXT(xmDb`z(G_G1bFVJo<-_=Yiz+kSEvY>nUDigisIldtKBUXo&`Qw<~*&P`|59
z2ldgB-Y=a;_DpL=6+|(Oa~S8U;Nzw&X#t5H=S}%!PyEQAyYUR%1-OCl#We7O#WY#?
z{>+V2e;)W6b$q~f-j>TH**P(APGmtjcM4jEhN>GdNKR!Q8vm1P>@nVKHan*#pJ{e1
z64>J`tGH06<dYGhAk9}2l9Z_t*=iwG;?y85i=2`SL=jbjmCDo>lOH1$Rgo-&P`O!v
z%0;4Y54iv-ha)0EyVcp%&4tz#Ew$5WzYyQ&3*pc@llm-YQke;4HMAVhB3Anlj1SG(
zqES_&?1agR)9oIg+&`Fy+ET_rj#HL3HQ0Z)XM0XnX4dY7B`OamGk4$8+oiwi{R7-0
JS;r;<003x<u0H?(

literal 0
HcmV?d00001

diff --git a/bivariate_cache/html/unnamed-chunk-25_015ed87426b2e1459bdf2eaf4001adcd.rdb b/bivariate_cache/html/unnamed-chunk-25_580c4e09e0751ca71793d95540cf5875.rdb
similarity index 100%
rename from bivariate_cache/html/unnamed-chunk-25_015ed87426b2e1459bdf2eaf4001adcd.rdb
rename to bivariate_cache/html/unnamed-chunk-25_580c4e09e0751ca71793d95540cf5875.rdb
diff --git a/bivariate_cache/html/unnamed-chunk-25_015ed87426b2e1459bdf2eaf4001adcd.rdx b/bivariate_cache/html/unnamed-chunk-25_580c4e09e0751ca71793d95540cf5875.rdx
similarity index 100%
rename from bivariate_cache/html/unnamed-chunk-25_015ed87426b2e1459bdf2eaf4001adcd.rdx
rename to bivariate_cache/html/unnamed-chunk-25_580c4e09e0751ca71793d95540cf5875.rdx
diff --git a/bivariate_files/figure-html/unnamed-chunk-25-1.png b/bivariate_files/figure-html/unnamed-chunk-25-1.png
index feb65c5beb4044ddd975dd4fd23967e89b784560..e498ba6d975355cbe911591f2ab84a322aabd235 100644
GIT binary patch
delta 4669
zcmYjVdpy(o|KEj8uF3V}*6l<sqjN;=U!|nNaf*py&M708T*nxnEhoNFR14)=Bo&!d
zYz)JZBow*T*h(XWF?X|lKhrtC$8UeV-mmxT{(8ROug`m3DAE4PG^n@Au%o@*+1R}4
zVey}z{{9IOR`LR+AdiqWk*+5Ed8jz#jYkLhx<4hDRsGQGW3-l2;~vAgoI$a;n|i)I
z>wJ5R{)F?ab=-*FiJ)D!waaJPdmO8i_Q};=ZorT))KHC8RH--A6csy6`OnMMfnS~k
zw(yd3sW;*JfX=wIE8}g4dO2_$7IRla>Q}#mqXM17!?+wXma&Vgou|~yBj^yKmkISJ
zrpk^T<0g+9pv^dvZ37oKDR<h9KeVs#)Hi}hAXKXh${zJcQp2~>{hQYCdnyS9wXru!
zusd51{{<JAI5NPk*EX-?UR_n>$?ki#bKS%dr&ut~>e*}YmA9%`UeEoFI<rmPrE-_c
zvVC0{)4xNR9$2t+_M8JzDzqBgYqiln6FSdCE|o_N#Mq^Jh2590a3I_o7ps^1fIeh`
z?>KPXj^V;^y65kM{<i4cQPKIvYM}bT|D;~pDogd*ss$)(T4k_K5H-+et#(rm&ZGCZ
z&0U+O=vx*WGph@M0HLF1C~56Bt70CFtwfTl)Lt!iZL}WEpgJ_CEq642@`B$Jpub*R
z{mBbHdnI?hcaA_e=io9N=8jV187VTs`(}?k&MJIn$e%M9BDhqWHniWg&EH@;;nIi~
z*s0c00Fs6LT(U5Jec@(1K{o!j$OslwP!sG`vR*F7t^c{RoltvFFXPR60A9hvbE^SO
zwkEj-&Xw&<P4RK+SeZDN_9XfGDY3Z)G~IOa)PxJHh8Bl8%BFilyIpP<6@Qy@87=<S
z%s7K}CirE_#buTnKOHpul>4Nd)MR}pA1TfPXiaZ=UY6H*=i`yFYhV2}6Pevf<s}y<
z8L}~M1)0U)h;t@^KGnFX`H0;Ca(Ve;bXQX35-;Tg>cO{MJgeHfgXjisU=LC=J@qow
zl@zuPRtK8XLZw0|`ky|x_FxxW>EWn29bLmqy_0IcUu#j~zRKEVw2~2tq4r!~8Uel=
z#7%g%Y$bbz<nE%Bj^@h<JTKp!&eQ``lV?XfMQ)1{U}(`7oiTeW?H*%J98_em@|kzG
zdF&0-rz41Ja{ac&*{oNb$*Gme^gMu-2+IYGYF;J}1k<HL52v4J>@ZfZ?%}J&Ux1lu
zJ20_Z@5p%tth~v_g3A<59MB3?qiCfcm8qPMy~|Q)+X?d+J^x<`!QF9c>TEVhy<ZQ!
zHFg>SN)^vuRkZejbp(=K3(&NCbIKmJE`^wPD>CJ|_-0f9Nvi+tuV2(mY+zHoyK}PJ
z^(fKv@@AZVqQd53LJ6)BrLSkO-Y^l&%dZ@8Za%wx&4JO~{K?jwO=*n#8DDd7{p8{s
zi7_A7&II1I4Ax>_p2P>%D}*Fj9_ssJnQdU?eT~(b^=nrLX}w;=w$r7cf@V(P?76K&
zMigaAq*0?3LaIy8M_E&me%Ra*XTt3nQl>PKJMfKCuV6ko!#j*s`SV}ZD8NlJYOnql
zc`59n)zdg_?wd3@sU_&)L{!m5MoyZ8)DpyFyz}p94dj*sdLHAWkc&p5Z5~#YUC6i(
zzHGhi{0X2zjMC+&#+xO>g<m=xdM7%Rifr=h;RxpBNL~}fR;G)c?;Jxt-U`5AKE&C+
z-^fZT{o_+j3<skTxsgynk$v<+L!US%{8E?klZOg*H-q0Y3HqW-!tvx1kw)oC%E!6D
zTazR0l;Eb)!q%#Nf{ta+@5Ozn2bvBwDmwc)_aWV<eH%3L3fE+?xupv_>R}7pNNQU=
zIxRK1L+CZT`EB1hL19h>>9fC)i!mI*+~>#xpJog^CD&3^KLjx3G(q&b((P%^F1AYk
zpPtRK;u?~m>4a|Ad>6$R{S~ImYVh(l%N^W_G@}ZV+D7cEs9QABjFGLZm?e`f_e=2k
z-Vy#H0f`{3!KEt4IxgXf$I}*(!r8($!FwZ)xN_C}mOM^{^BGboA`*1=S-iH09|*$j
z1{t#mplQgrqL2~4wtlbE%0bNXmpJcA0`j|DZStHv;$(wgS!#2KW0&^l|CM+H8^Ko@
z{19Wndd~Q!y0ThZqoH*zAqAG_jXr&Tqdw<62!TSf){3I^%EI#%*8a@J#RxwYNr@E8
z{aMP@*~oL(Nl3VmWF0%@3KM>T+Zq8HrUD)Lum}c!aq*>@Z(3qCBgkD+8xw5B{n;>b
zso1l7#wauz^CIqw>S#bU_6!FKbpeDkizk*d3E4^Uu+Z<8d$^$|H+9ZcFlKS&7_U%z
z0!+)tCm=r?Js9?^VXwtUEbbkwU-$|yK)jgOQ4GtCQJsWLCF1LVaQH922Nh~^Fz@XR
zxeN7<OXO%9|NE}+u}`x*$YnW?q_y)Ag;Qh4=(SJ_z31;9LDPl>Tgj;0M@Zpl&HJBv
za<hADdls=T47u!%SqQnT4%E;&&npv@IcAXsDHy3&aJ06KN~~H~j9V{63wf@!cQ<TV
zHu$s2<ES-D2*5Q(p6jPr7Jf5uxg-yzw+7xy;HGO;nj{>O;l5JQSxvS5?kqI4?C7Po
zKq^Sbk8;2L*Hh>eqt@H~$MzWuB~1?+)cdQOp~27HnuAH+QGLAK2Nh!S@$&2n@QTz-
zc@>eFjchexN9$^bbuNOL#^k3UurUd~r(3+oACmz<I2-YDWe-vUqx&8eOCkk{NGSyE
z?Oz%7tO=)GmR;6p>mZS~gVN|5CneNAaxt~&L4(5Omnc9eTDN7YmnRr3ngsSuM@xco
zf#&shv?9;isIK1>p)M7w_J$1*<5+bJv+@JLUoGeHs>uUOa!SK#x^c@u)+ugZO<QUp
zeagy^+w7g$J#cDP$!17)G;Un<<^J6F`}|r<T<i2^wk~q?FyDn9r=IxFW1=XgPN+?%
z?%m)Njpb4*@Rt^c4s#zKK|lOlAuyrHd4(yj#NH0B)Ay3=ERKq_kufrdxv#$Y;b#Tj
zgc9drtYj_$4>Ihv_|uc)&!o#2GJgb`V#n|Qe#K05y5-CI?WNy4i&w|usxkp<0;GKD
zACdN$f{lfFUdNoJkh5P{2u{f$8{G>=&2xpi;NZ|`ED-mX1@>7b)SXOtYGvm6RAk)q
z;N&cyh8Vf4bByykv16}sg2Kf<OCd1&bU9P01G7DXHI%+;U}G1zJzvh!Jhgw&HYTT(
zaqqTO2)Fjh;N^@er<iV$Yt#l$x)Y%}7HDMfvy0ryLK=I$vq1diy3pQ%hmM=r>BXvT
zh>)~EE|zECHt)JX!q*Q5dBOUB1U|;%T}}TkGy`<=LYo}!4;y*_9E|fO63Wmr8skMM
z|Ej8ik@{A@7YeupA3F&YU|kIh2Yg$vrVsrc>4#p6RrSSIC8&mcL6w4qzRfBH7^GE=
zsUShkvahI*bF%SZm$%c&3|ji}_M=*-e$A2$-9Mk@-K6Cu*L9tNwUw;g&JVYG8g4k0
z_#FN%93Wj?8Qm!Hb(4Iy33zWwJP@k|>S5sVaFa~i^kDO&zUGoRq}kq^j3{$(N`VYK
zLc8&xBdqe-rsZDg#JLMEsEMm~QSu4Zo%o_n+4JU+Ka{r!cQCReYNkWp4@1oO|Mq#y
zlmJ-dvWrEHU+S;^FasZP@~0_MX+huCy)fJV#mhqV&`*)woZJs@H>(v0#rs~%WQPu*
zDw`zzFi_Xp#up4Nr!Q?`5B?+3O?a>5&$;tqG$`z!7^Jm1ayoi8ZB^E~$7=qK0Z18&
zZtm?)Qfyp3EfC1{UTpf5(Bf7$^HbP-C;s`Yk$nG0lC6W^l_*<El>1XsT_`=GF9BZA
zu=3>%=6y!?Y|Z-Re4(sc;^B)S7h-H<h9Rwg^mRPO3|n&3_tzXrv$T&w+*nP8^m|0Z
za-sNjJS=x?6>0@+{9GAWputKT1yCSo5m^r0$h1t%g1${UV2JDNebBs+ycNgl-J`|w
zRMBajK<#6Nspt^Mhn|>N??VjcT772mCh%e{5NM|So`nZuupj!{V~J}+VzA3pd2Beb
zNkXZ}W20AJZ&GsDglNv@t}HfkIeAk<l|BQ^Xm6U?8#X<R5t)I#KfztAixgf0?}Mu>
z)J(&CBM37o7_GLhYts%AyFMTE7@FG3?4}}^A2sfQ+kZh(B&Y=S>XmOc8TQb!SY-)g
zTjdW4DQM6NOoA+JoQ^M95*fUlC-*LF03iNOUi_gfQ~A*7Kj2wthx9K>2Y~*SSsxo9
zU}oc*^s3)EHbqEGLU7aCC!=O{o&^oSscagW(()h@n+WiCf;fcff<%v*NO4M#ghTV?
z;*&Qm=tJln#J%xFdEwiIG7Tn}9J7Jq)b=@{1?v6(UIUtdEjj7YfkXd!c4wLB4RIkH
z_lpg&6~CkA`qAemdeg~CJ5-h=^{{~a-wm1c^(Ozz#{BFP%%h^qIP`Y_%pe-V)Ciq%
zY5VkfjFe=p8^X?UZZm5+<qbhL9T}8VF-VDIb+USx`t<rG^{-|T;uKcG*h{3%381Do
z>F&h}=V}>tyiYv0o==z;1O%p#tUeyl^M8y9V92od(hl*@TMi#G3b)ZMS#Yh-$(rlc
zM`gMfUpaSWQZd%Z!euQ^RO8M%vqo@#N*2)QRND=I)lgxD@YAIC$qV5{#o<2LEFOlf
z&q#=}<gzDC!D&r2r4)}=%pV0dKhf7E5h^gh7}PdnC1X971I~|W%5wpyl?eJiq#q^*
zmneJ{S}k!Ky^*=ZdcPDG9*DMJilQtS8tAuM*k{1#Cl1FPnuxu9N4K+LZ^&(>A`Gm%
z`%0cE#F_@@PQ>arGqZvl_wl2Ij(>YG?Xk!_1{#f7vj~C2`Dm~ZFPTT~2F(=FdL?aB
z&_0kk3Z8O-4Gqt|;0ue%b-?khU`gXw;e=tx0<(Y;iR`v=s=75`7S@%iqH}}gS)r~)
zE*P+_96BuNHkAff1J<NXEs)i{pu7*|rN6$R85oHRu6{}`OCq|vaz<@Wox*>*cN@p*
zQkn!Z@*vMxsS5jbaSPVUsi$bO@p00L?(;kyod8x^58HcPQkWxTBBtc2B$~~qBv-<?
zaRKEbYwSWjz`F_^vroylGH1m}kMbKnSAg$XZOPkH)*KP)<<v;LN(`j#glS4mu*$Qj
z_tjK^Q@`-rzsINIeb0X1VYt3D-`-^)k80RI@O#0q-${gxE<>sSu-<6TqJAL$;JoqV
z-rF1RmsPD1n$^%W61vG9k-J1sjf?p8?e;{glcpuqr$d-rukPnor<+*RkX%Fb2Rlt4
zAkS2Y6A6;3Rvj&>@}#PltKHx9fVH;j3__6wePxm4!q6CTPW;G@91|Q+I%6D0;NLW5
z6G{{cFa6fGyUeIL61S<SI^9;_EUr<h9Xhit9ZbA)nW8kd{#Fb9C5tj$V4#!fBQR6|
zwcQ;EE*|O!H-eke8ok;Dkin&gx7|6ctmFyl$_~T9cGsvrf!B86A(q=-UF=*Mebz$!
zlBk>4@!a|}gEe%1nhDREdd)lCZ_?OSX#wW(e#;+GBIAaMG0mV6{-a)v>F#41T}dH|
zpew8U+1Lx=#+ipYr+2l5R_yH*d{;d4iv((#q1c$JjWJ(%o26?bZ798yH>CYA0O!8G
z%Q#k9lPM!lU50K0{~3|>)0kSQTikIt5^r{R`O&P~#6uZu`S6OCS~t6stk>|l?pKQ3
zBFTEi@pKLh9{gGg{lRZ5wsRws6%egiKHM%Lj^Vz=8Lzv~zdn4^NxVkjl=VH4$wR?f
zFx6>et|kGXm+kr|SxQBX1)_HXi<c;v#uF)$X7{neSz`W2HHw@YtiT4FL9QP;r9!=_
prceD%ZHu0aypI&Z2$nP;hN|OekE{cC#zVlL<1uIZiX*to{{xPwNcjK&

delta 4733
zcmYjUc|4SB`=14qWlXYX?IcTvNo8MKBu8jDp={HuB4j2NhT)k>at`5SEM*y$P+>S!
zw!u@5gd&uE8S5Ck8T<Oq(EIuQp1+>Yy<Fe#b=}wXz3*%5aFb?XG}J{x%lh<*KZ4Sx
z`=&j5e#r9Uv)*YUMFfOYv>pd|=w{w-aH@dHT_^j3AF`dh@5|Yys%p%~cW!Spt&6VP
zd>9ESpw0BO5X_Fp$c>(DE(orO&-eQJfc9lkr0}a0WgaR^*^m%KJ`)JL@M8-1yRM(1
zD?n*NsR5<KbajnXLhZwE)lWdzG0T?IovEY33I^`iy7et|?9X3t<C}R6trcpkf_f7k
zL}gntk{U0h`;~usha+92zTShT9{A}0M|SFBaVY!`_CrhTyog{@IEWP2+dc!cVg413
zxbjJMA3Bl=n%M4Z`WtJwVfH2XBUyBiyR2rmO-u#FfgzS|Kd-5#e<x?DP5Drud!_89
zM(U|vfr%Y{MzE4^6ve}n>T%$7woj^>d1s%k+Z;%~%v#evdk0BN__X-Nj{V6!y!M!i
zlPFAcOf2Y%n2h{W#FcR_;7P=o1;PkFP8mR*rI6relzx;5P!3VNjT%~McU#RGRJ%i8
zVnNC2%=-}H2ics(rpzvz<|4<TMJtZECeJgz0elZZ0W$kpE@XJnfv)3v&0c_IlNp(n
zyk|<?UTNj}-}x1usXhY<Ur?&<v*Mk1sAh9vMBVLRHQ>g(#2UrX?bFym!Wvj1oRXPO
zykRE$6~JZkxW+Q%{EY48O3xIJWjS&#ZY}2d<%%<|XH}&D(_(Xpqoct~>{b|+s(#3-
zr3BTQIo?oq>FSVe6xL>BRP_?pIG@IzQ1#u4>4P>0l&(`ajDZEL5;|HFp&&H3vf5{`
z6@=PU&PC4n-$0RlVqfM&5K~@7^!zz>5WwHrGB!qOcQse5@H`BHp?{=g*zN1GVx&CD
z>s`BkBt%fi=|6X!_0#!^;=s)Opin>bnz~y>@>C=udh%V+BfiE#XyIPRgp$+TJ@bY4
zbT!;>DcQxol&0xEM4`hq6CLBG4$TFXrepgXjzs6Pi{pk?`a+`34-b&CyKV#eu2zKj
z@*={6>G!RV^|nqSK%uO5QhC>HeOEQYVUbpX!U1t7V|%292Cm-jO7I!E?E~kqqd9Bd
zaLoWWQ+t%Gh|^|SZ#no@3zyuXC@X9bl@PH=GIq&)NI>@ZoCwm%(5==nVW?k}Zio5(
z<2W^8o*=PMv<>Hz8hg)}9s!guK1Y|ui#sKBG?e5P^Mzi6VKVX*3hH09C_<b~zIadi
zjF?pO<DP@<c9d>*io%yScWaGoi*&!Y^mk-DDYqd1RP)ub4RMQ0iUVV^h9{fZVx=#R
z>ow=I+4XyR{ftKGpE?$i0^I~YTK*dh`2abm#}jr6)*MjHcZN?}0dNzOL6!QioWmM0
zCRk$Rk0hqmwBTcYI{xw_@H`lu>Bx5mMw3lfIkFeu$2dm5L-N6B@0AfK-5L@jUm(e8
zi|&)*D`gE~$T}oB%#64UtaNn@2|EbElo>{X5U|m&>NorfXoI62#E*jyQq=_I4GPD{
zoM4eiH2V;~697-T!w^XH$RWH8$Z{6*$)?O5!tViXajHq&9|&lxgis!~!ETdN&Ay*%
zC<z3a+pAjzGT>2Th67h8wAAhjY_OwqhUZ!9)*0Gi{17y&M~P9_5`h?=w`+gBfFr{j
z*OmlUj(C{5#b}#`610LVObf#g6Kyb~RZTa6`fIw7S(PFUPJ<&WgEiosv;w3t%)Jpg
z)<g4}mGpWZLa{)`wszvTXZ8iRe0R!4bdxM9IZy?$KWATHnOt`O=EzPUuJfl&J-`W|
z^2U0k8TG1bcRmP<;*?yPE)7p3@kPs(X^Lh!MZs1rLp%X;gZ~j40w)>(hWjcF&9>1g
zd36WiJ&l(#zV=yIT}&>9=O?C4hbq7|*Ow4M8*Gl8yQ;h0z<Sq!F0l`myS}uiG;?1w
zSHF($e@=(DfA!IQc8S4JIw{~c+@r<-woMe`%tB{tmY8Sl$j+-vh?R|I%H-Ml6YaH#
z1F#a#G-}RH!gZr|A_71#tK;r{HcW$-EO2T1D{kZxAG7PPgGbSf4^@@Weru}&_KESj
zu0xS~$<w0|0w%K2P2@Po@QTjw-6a^XcJ$LZJ$kP7bf};_q;QD;<TRc5DTz|neL4r>
zQ09wG7Jg4$hnW}_ra;s>4=7xhtm;4B8$gwxoA+L5ZxdMoM%PccYNyevcZcx-Wpj%^
z6ZPF~swR4McQo&{t$vS5NE=u9;=y4cUus#}r%zIgrxoh9j@_HVyqO|!SgAV1v?bJX
z*1u338$uWr46UaK%w+$|)8Rx`LuB(W*x0eCNxoA5;1S!J+e~X{2H@6K1}gKTSO4{l
z5O6s#H=_SLbdsKVUchPT*)$vZRBm7{d+-d$J4Ky8ty;fc9^Y1n_BxLB+juJt-nUJF
z+1Z6-?iY*Ga2C50VZ<U7_}(@FzBS)7BEtGMN$497Mwf>$BIcQjkdD9$LI#Fk4m&a(
zDMko4I%ol4RTwG4XZhR&dR=voKko_8D;Ivq7a-Kjx^;1ypP>kul)o;En882K+=2m0
zBwahoS9oAM>5-g4Ftj)Ld8P|1vX4M)v};2eZ9-_9F;QT>*!w#DKHYxOD9^=fQ2hZc
zllP|qU#I`&<`2)(N0y}Lh~zo)@~{plVkPw{Px2NFLN)a)4&qo3t45$uOy$SE)JHu{
z;?V7tf!fv;6$I}l%_NkM;;`I^eBkhbcavpBTz$_w#eQb-&7Pm<4W1|QS8egm7h+(j
z(J;}Yezk1AZnk>6#;2trQjx0q7jg4e>9h$H@0L@HdQBV}wgT3KLwCju8R@NZ4pQcy
zuDvt;bJ#f~F<DU+QyJ2{uduTtgG+=0qxD`U(2bV27{%WV@&gsWx$U=ei^4fWljat4
zxY(~qn#-1$UX@BY+*C}Y^Ze%%^9^2_ZUb_0TVe&F9;nhA@T()%w&ONb75oh(4VEEN
zbbr|wxSLH#C}>)rzbp(M-Q_NUUn4W#0~pPjx<IyLyyQyQ=By4BJvwR64N^T+=}e0=
zMFF?=?9tC<jp?xztG1tuzQuHNJrQ&<eCDWqz)5%Z2&veUy1B5nILR{J7<jwU=%qPu
zJVPWZ9luuMsZ=|^*3p&`tXuHNeE_KPA7}zF-dfv7TJy&G2i;M0Q^TmLOB)Fq*tNs$
zMS&|hKC_xiP42hUq0(|=nG1X@WkYGbFbS=)&>6J-H7?jIa_ZZgNVCHZeXccDE$m-T
z%&juBmQxjgxif_4nE~RJbMazWes^F$_nH?9Vp-G;U0As1VcG+Dr$QX&cLMl>%tRPP
zaR%4~e3gJwt!f8Wa(6wSOW=u8BPrKo@t3DU0zD&J`rw+YYrMMqjWP7XbM&p~y>kLo
zO^+6}@P(oHu1qi8-F<tATgj6h=1#89sw%hFFxz^|;x(;5Uu~>Rc>P`Z?!%G)XiyIT
zAN?iTG^R<$UoKI<Cz~9=UwvTTC7WHdzoclf#DJujo&2YHzw^okUiKZ#u>1MHTu4*B
zmB_pMb)MOE{TiF0@v1%6*XaH(JklW4BR+&7wHpxhTq7nvW`VU{ud`s8-h5h|r=Fj+
z0Hp`JQUEh=n#t;IsnjYz#h~q1Zt~H8fg$*Vk+PHX-uGB<I9Bu$_D+~aNP$HyxiwRF
zkg*x%txdncn{cV=M=C=7F6Jr}o^dlLMI$_Jl=K+#`%UhojRr)?o@zX?uk?L^Ik)`O
z_)WS_n^R46>VZMC(#l_BY5l_KygNcP?xpT-p%^l1hKvSRJugJsn^874KGE+Hjd7)3
z?QnP#m8eGuyY}uShZx%+QXd9my?RHi@GHf)pMon^Z)|){u1{`TUQjH&M19eXvbvo&
zBsxx$Q+}EkW;CCNK=fWnyy+DX`R7*vc5W6wTMm6t@0~+=dLNbxmFB(=a>pW|CVkuc
zoOCfN*n`Nu=gu(NK_SGI;$cs765;Pd{OHh>Ka9-Td@^&#>-h?ME$R|A6=5~VdrzTB
zZ=wp!!H*-WqArX4Ufg1DCNqgpZrU~Dcp5mzeP=fNFlr}-2laac+#mpWlgkRdVv<4j
zp`jk}-nWYI4%cmDHMgWU_-U1(QW&@F2*NG^+-ijzI;JMR25H3GOw9pp^|2AWEnzVI
zeysG)Lcv}l{Kn2gox*g1J*vWH%~p&<`WBHpreT<hkER=3IpLs!SjIA+Z~&L#EGc;n
z##1dE*ciG)^%$0nvEah4@}t>1P85(Lsvtac@xo5&=Ui$gu%x-28Gv+S*S(|oyW9r_
zJLF<hJNC_@^tG6xbf$D+xWIEJ0fuG?f9KJJH0}aPgYm}PxLP!Guu!M|U4syZU4zTu
z;aw<sr}~{47kKXK4m}U7r8|ks{jPA{#Ev~bvb*}OK?LY0N1`%iRw5>jsfOdYPIZ89
z8^wq<w<$XC+du?V{?Lur+(UlQJP8Zh71+^a>Ojo!dwY}-EcWF|171)jAuGPe>?YH#
zij$X$|JQcBNQrv+2G5IV2XxMl{*S(*U#(o5<kI!R0Gx?pe@sCDRNnH~An#!RUkCE_
z%WT3{1gOnfATM)(#e2ZZt8a^KVaCmEi2j&aPxm+}y)SCvF{?G`c@Zrs@_qI@vX%2{
zHi69PQqHlT_(@LyLiH{Y-lsgh{APv}Na4IH$Qif=jbrs(xdlyLF3mLG4BlI?8L*X8
zo0`x)WMMT)*CsQUOPg(ayi6IbQy)@k)dW>CWtg^_{4el-9ZpdFg?lk&Jmug8hslCV
zGpma^j1~t!JM1#IGkSmUile5x;FPw#lED@C4(ZP?hkFpiEj2+m?XXtd`Qd9U@78+d
zW2PxCwpUZl|8x^FF_kXrhAhxUNp&^Td>n@lTVvqxpqe0Vao-(r{Gd_AxX<PH&~bt*
zMDxdxh$Jm5y2^ofR{IUEsyA{=ONvPKU|71t0#mK9b0)BGCIR7F!CfCqwm~$T*dC2%
z?k^o_Ad8;GYB+{-Q|2xz?}PkDZvJyyQ%S&&oY+MEb{t~btWo%5{M8^g-OBiqDyp!h
zToa<?8wJYGZbO0<$saj;M5sJ~NYLa<(G%vr(%=*oO9fuZ81jedFdL1HzLPu;m6eI|
z3eN;O=9-wz{<Ve23HCJ{eTh&7;T*3&Pt^zI$B$5$sKQ?UotBi_(5ecd;gb$MY8V|1
zhBA+m=4Toou_>m^OGp5#B3!Amn;$t*E7cm6lN|Q}amR)8+T|4CrgWPpgF?1tu!dR9
zdik&HIA681n?Wl}Ve}6#eeKqO;g~YD&eU(mDxR9bXFmt}TAEhNNh8f=ujjUQjg7R7
zeddT@vg^Q1Es7THN0|ITSdu;wSX+!uG^vSZ>CAPFeKx0T1LI1mX5H4_z~&??$zQ)J
zBUHC+6wQ=%I~PZe4v(V`W^3YB?nM~M%Jd1^laV}wR`mf>*p2Gx&ZeYp$-=?f1(_hA
z{2Sflpr2Uh)L*SFy=DX7{}U|Y^5aRDc+!=}sk%W+l~3ppl5N57gfaoWcTZ11dR)Tu
z<t;ll!%hLwEX<r&<Ha0~QHt@lE<w?y;aKI<Etqup`!-P-+QHO=SBBYAHRD`^X=%!;
zOjJ!pK<q!+NPDA7zc6&UzoM_dj;VkhVxcG(E%m7AbC&JM_okn(Zc)2x7i5D>HdJCe
z{G6`<jqBiGK<lh?wucQtRl3awn-nG{L*FCfMma92=#qPXtIu1)b;Vn)eBsAp-n~nF
zw066-)3w4VA%+_6;mkS4?+OKH%&o2OVo8$Ez(k|RFH^H{f{z^{TN)g9e)t~I>L@w!
zYM-C^ZcX>lJZ*CKxEzMnIR5h&WkNzxb&RB>HbC8G1W^~Fz{C`|*}1xx7-Onh<!|_`
sF;bnqgE-@PNh@U%szfo7+%L5VNaUm)Yp=>~g>XM>OS{vwlh_CU1!5dv2LJ#7

diff --git a/docs/bivariate.html b/docs/bivariate.html
index 150c22c9..111bb2cb 100644
--- a/docs/bivariate.html
+++ b/docs/bivariate.html
@@ -606,7 +606,7 @@ <h3 data-number="24.2.2" class="anchored" data-anchor-id="non-parametric-fits"><
 <p>Non-parametric fit applies to the family of fitting strategies that <em>do not</em> impose a structure on the data. Instead, they are designed to let the dataset reveal its inherent structure. One explored in this course is the <em>loess</em> fit.</p>
 <section id="loess" class="level4" data-number="24.2.2.1">
 <h4 data-number="24.2.2.1" class="anchored" data-anchor-id="loess"><span class="header-section-number">24.2.2.1</span> Loess</h4>
-<p>A flexible curve fitting option is the <strong>loess</strong> curve (short for <strong>lo</strong>cal regr<strong>ess</strong>ion; also known as the <em>local weighted regression</em>). Unlike the parametric approach to fitting a curve, the loess does <strong>not</strong> impose a structure on the data. The loess curve fits small segments of a regression lines across the range of x-values, then links the mid-points of these regression lines to generate the <em>smooth</em> curve. The range of x-values that contribute to each localized regression lines is defined by the <span class="math inline">\(\alpha\)</span> parameter which usually ranges from 0.2 to 1. The larger the <span class="math inline">\(\alpha\)</span> value, the smoother the curve. The other parameter that defines a loess curve is <span class="math inline">\(\lambda\)</span>: it defines the polynomial order of the localized regression line. This is usually set to 1 (though <code>ggplot2</code>’s implementation of the loess defaults to a 2<sup>nd</sup> order polynomial).</p>
+<p>A flexible curve fitting option is the <strong>loess</strong> curve (short for <strong>lo</strong>cal regr<strong>ess</strong>ion; also known as the <em>local weighted regression</em>). Unlike the parametric approach to fitting a curve, the loess does <strong>not</strong> impose a structure on the data. The loess curve fits short regression line segments across the range of x-values, then links the mid-points of these segments to generate the <em>smooth</em> curve. The range of x-values that contributes to each localized regression line is defined by the <strong>span</strong> parameter, <span class="math inline">\(\alpha\)</span>, which usually ranges from 0.2 to 1 (though it can be greater than 1 for smaller datasets). The larger the <span class="math inline">\(\alpha\)</span> value, the smoother the curve. The other parameter that defines a loess curve is <span class="math inline">\(\lambda\)</span>: it defines the <strong>polynomial order</strong> of the localized regression line. This is usually set to 1 (though <code>ggplot2</code>’s implementation of the loess defaults to a 2<sup>nd</sup> order polynomial).</p>
 </section>
 <section id="how-a-loess-is-constructed" class="level4" data-number="24.2.2.2">
 <h4 data-number="24.2.2.2" class="anchored" data-anchor-id="how-a-loess-is-constructed"><span class="header-section-number">24.2.2.2</span> How a loess is constructed</h4>
@@ -703,7 +703,7 @@ <h3 data-number="24.3.1" class="anchored" data-anchor-id="residual-dependence-pl
 <p><img src="bivariate_files/figure-html/unnamed-chunk-21-1.png" class="img-fluid" width="240"></p>
 </div>
 </div>
-<p>We are interested in identifying any pattern in the residuals. If the model does a good job in fitting the data, the points should be uniformly distributed across the plot and the loess fit should approximate a horizontal line. With the linear model <code>M</code>, we observe a convex pattern in the residuals suggesting that the linear model is not a good fit. We say that the residuals show <em>dependence</em> on the x values.</p>
+<p>We are interested in identifying any pattern in the residuals. <strong>If the model does a good job in fitting the data, the points should be uniformly distributed across the plot</strong> and the loess fit should approximate a horizontal line. With the linear model <code>M</code>, we observe a convex pattern in the residuals suggesting that the linear model is not a good fit. We say that the residuals show <em>dependence</em> on the x values.</p>
 <p>Next, we’ll look at the residuals from the second order polynomial model <code>M2</code>.</p>
 <div class="cell" data-small.mar="true" data-hash="bivariate_cache/html/unnamed-chunk-22_fb96e1aa15975b174d62d20b9081d5bb">
 <div class="sourceCode cell-code" id="cb18"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb18-1"><a href="#cb18-1" aria-hidden="true" tabindex="-1"></a>df<span class="sc">$</span>residuals2 <span class="ot">&lt;-</span> <span class="fu">residuals</span>(M2)</span>
@@ -735,7 +735,7 @@ <h3 data-number="24.3.1" class="anchored" data-anchor-id="residual-dependence-pl
 </section>
 <section id="spread-location-plot" class="level3" data-number="24.3.2">
 <h3 data-number="24.3.2" class="anchored" data-anchor-id="spread-location-plot"><span class="header-section-number">24.3.2</span> Spread-location plot</h3>
-<p>The <code>M2</code> and <code>lo</code> models do a good job in eliminating any dependence between residual and x-value. Next, we will check that the residuals do not show a dependence with <em>fitted</em> y-values. This is analogous to univariate analysis where we checked if residuals increased or decreased with increasing medians across categories. Here we will compare residuals to the fitted <code>cp.ratio</code> values (for a univariate analogy, think of the fitted line as representing a <em>level</em> across different segments along the x-axis). We’ll generate a spread-level plot of model <code>M2</code>’s residuals (note that in the realm of regression analysis, such plot is often referred to as a <strong>scale-location</strong> plot). We’ll also add a loess curve to help visualize any patterns in the plot.</p>
+<p>The <code>M2</code> and <code>lo</code> models do a good job in eliminating any dependence between residual and x-value. Next, we will check that <strong>the residuals do not show a dependence with <em>fitted</em> y-values</strong>. This is analogous to univariate analysis where we checked if residuals increased or decreased with increasing medians across categories. Here we will compare residuals to the fitted <code>cp.ratio</code> values (for a univariate analogy, think of the fitted line as representing a <em>level</em> across different segments along the x-axis). We’ll generate a spread-level plot of model <code>M2</code>’s residuals (note that in the realm of regression analysis, such a plot is often referred to as a <strong>scale-location</strong> plot). We’ll also add a loess curve to help visualize any patterns in the plot.</p>
 <div class="cell" data-small.mar="true" data-hash="bivariate_cache/html/unnamed-chunk-24_beb1abcfc562cfc3cb631cc257e96e5b">
 <div class="sourceCode cell-code" id="cb20"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb20-1"><a href="#cb20-1" aria-hidden="true" tabindex="-1"></a>sl2 <span class="ot">&lt;-</span> <span class="fu">data.frame</span>( <span class="at">std.res =</span> <span class="fu">sqrt</span>(<span class="fu">abs</span>(<span class="fu">residuals</span>(M2))), </span>
 <span id="cb20-2"><a href="#cb20-2" aria-hidden="true" tabindex="-1"></a>                   <span class="at">fit     =</span> <span class="fu">predict</span>(M2))</span>
@@ -747,10 +747,10 @@ <h3 data-number="24.3.2" class="anchored" data-anchor-id="spread-location-plot">
 <p><img src="bivariate_files/figure-html/unnamed-chunk-24-1.png" class="img-fluid" width="240"></p>
 </div>
 </div>
-<p>The function <code>predict()</code> extracts the y-values from the fitted model <code>M2</code> and is plotted along the x-axis. It’s clear from this plot that the residuals are not homogeneous; they increase as a function of increasing <em>fitted</em> CP ratio. The “bend” observed in the loess curve is most likely due to a single point at the far (right) end of the fitted range. Given that we have a small batch of numbers, a loess can be easily influenced by an outlier. We may want to increase the loess span.</p>
-<div class="cell" data-small.mar="true" data-hash="bivariate_cache/html/unnamed-chunk-25_015ed87426b2e1459bdf2eaf4001adcd">
+<p>The function <code>predict()</code> extracts the fitted y-values from model <code>M2</code>; these are plotted along the x-axis. It’s clear from this plot that the residuals are not homogeneous; they increase as a function of increasing <em>fitted</em> CP ratio. The “bend” observed in the loess curve is most likely due to a single point at the far (right) end of the fitted range. Given that we have a small batch of numbers, a loess can be easily influenced by an outlier. We may therefore want to increase the loess span by setting <code>span = 2</code>.</p>
+<div class="cell" data-small.mar="true" data-hash="bivariate_cache/html/unnamed-chunk-25_580c4e09e0751ca71793d95540cf5875">
 <div class="sourceCode cell-code" id="cb21"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb21-1"><a href="#cb21-1" aria-hidden="true" tabindex="-1"></a><span class="fu">ggplot</span>(sl2, <span class="fu">aes</span>(<span class="at">x =</span> fit, <span class="at">y =</span> std.res)) <span class="sc">+</span> <span class="fu">geom_point</span>() <span class="sc">+</span></span>
-<span id="cb21-2"><a href="#cb21-2" aria-hidden="true" tabindex="-1"></a>              <span class="fu">stat_smooth</span>(<span class="at">method =</span> <span class="st">"loess"</span>, <span class="at">se =</span> <span class="cn">FALSE</span>, <span class="at">span =</span> <span class="fl">1.5</span>, </span>
+<span id="cb21-2"><a href="#cb21-2" aria-hidden="true" tabindex="-1"></a>              <span class="fu">stat_smooth</span>(<span class="at">method =</span> <span class="st">"loess"</span>, <span class="at">se =</span> <span class="cn">FALSE</span>, <span class="at">span =</span> <span class="dv">2</span>, </span>
 <span id="cb21-3"><a href="#cb21-3" aria-hidden="true" tabindex="-1"></a>                          <span class="at">method.args =</span> <span class="fu">list</span>(<span class="at">degree =</span> <span class="dv">1</span>) )</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
 <div class="cell-output-display">
 <p><img src="bivariate_files/figure-html/unnamed-chunk-25-1.png" class="img-fluid" width="240"></p>
diff --git a/docs/bivariate_files/figure-html/unnamed-chunk-25-1.png b/docs/bivariate_files/figure-html/unnamed-chunk-25-1.png
index feb65c5beb4044ddd975dd4fd23967e89b784560..e498ba6d975355cbe911591f2ab84a322aabd235 100644
Binary files a/docs/bivariate_files/figure-html/unnamed-chunk-25-1.png and b/docs/bivariate_files/figure-html/unnamed-chunk-25-1.png differ

diff --git a/docs/search.json b/docs/search.json
index f7aeb6ba..7bec3135 100755
--- a/docs/search.json
+++ b/docs/search.json
@@ -865,14 +865,14 @@
     "href": "bivariate.html#fitting-the-data",
     "title": "24  Fits and residuals",
     "section": "24.2 Fitting the data",
-    "text": "24.2 Fitting the data\nScatter plots are a good first start in visualizing bivariate data, but this is sometimes not enough. Our eyes need “guidance” to help perceive patterns. Another visual aid involves fitting the data with a line. We will explore two fitting strategies: the parametric fit and the non-parametric fit.\n\n24.2.1 Parametric fit\nA parametric fit is one where we impose a structure to the data–an example of which is a straight line (also referred to as a 1st order polynomial fit). The parametric fit is by far the most popular fitting strategy used in the realm of data analysis and statistics.\n\n24.2.1.1 Fitting a straight line\nA straight line is the simplest fit one can make to bivariate data. A popular method for fitting a straight line is the least-squares method. We’ll use R’s lm() function which provides us with a slope and intercept for the best fit line.\nThis can be implemented in the base plotting environment as follows:\n\nM &lt;- lm(cp.ratio ~ area, dat = df)\nplot(cp.ratio ~ area, dat = df)\nabline(M, col = \"red\")\n\n\n\n\nIn the ggplot2 plotting environment, we can make use of the stat_smooth function to generate the regression line.\n\nlibrary(ggplot2)\nggplot(df, aes(x = area, y = cp.ratio)) + geom_point() + \n             stat_smooth(method =\"lm\", se = FALSE)\n\n\n\n\nThe se = FALSE option prevents R from drawing a confidence envelope around the regression line. 
Confidence envelopes are used in inferential statistics (a topic not covered in this course).\nThe straight line is a first order polynomial with two parameters, \\(a\\) and \\(b\\), that define an equation that best describes the relationship between the two variables:\n\\[\ny = a + b (x)\n\\]\nwhere \\(a\\) and \\(b\\) can be extracted from the regression model object M as follows:\n\ncoef(M)\n\n(Intercept)        area \n 0.01399056  0.10733436 \n\n\nThus \\(a\\) = 0.014 and \\(b\\) = 0.11.\n\n\n24.2.1.2 Fitting a 2nd order polynomial\nA second order polynomial is a three parameter function (\\(a\\), \\(b\\) and \\(c\\)) whose equation \\(y = a + bx + cx^2\\) defines a curve that best fits the data. We define such a relationship in R using the formula cp.ratio ~ area + I(area^2). The identity function I() preserves the arithmetic interpretation of area^2 as part of the model. Our new lm expression and resulting coefficients follow:\n\nM2 &lt;- lm(cp.ratio ~  area + I(area^2) , dat = df)\ncoef(M2)\n\n  (Intercept)          area     I(area^2) \n 2.8684792029 -0.0118691702  0.0008393243 \n\n\nThe quadratic fit is thus,\n\\[\ny = 2.87 - 0.012 x + 0.000839 x^2\n\\]\nIn using the base plot environment, we cannot use abline to plot the predicted 2nd order polynomial curve since abline only draws straight lines. We will need to construct the line manually using the predict and lines functions.\n\nplot(cp.ratio ~ area, dat=df)\nx.pred &lt;- data.frame( area = seq(min(df$area), max(df$area), length.out = 50) )\ny.pred &lt;- predict(M2, x.pred)\nlines(x.pred$area, y.pred, col = \"red\")\n\n\n\n\nTo generate the same plot in ggplot2, simply pass the formula as an argument to stat_smooth:\n\nggplot(df, aes(x = area, y = cp.ratio)) + geom_point() + \n  stat_smooth(method = \"lm\", se = FALSE, formula = y ~  x + I(x^2) )\n\n\n\n\n\n\n\n24.2.2 Non-parametric fits\nNon-parametric fit applies to the family of fitting strategies that do not impose a structure on the data. 
Instead, they are designed to let the dataset reveal its inherent structure. One explored in this course is the loess fit.\n\n24.2.2.1 Loess\nA flexible curve fitting option is the loess curve (short for local regression; also known as the local weighted regression). Unlike the parametric approach to fitting a curve, the loess does not impose a structure on the data. The loess curve fits small segments of a regression lines across the range of x-values, then links the mid-points of these regression lines to generate the smooth curve. The range of x-values that contribute to each localized regression lines is defined by the \\(\\alpha\\) parameter which usually ranges from 0.2 to 1. The larger the \\(\\alpha\\) value, the smoother the curve. The other parameter that defines a loess curve is \\(\\lambda\\): it defines the polynomial order of the localized regression line. This is usually set to 1 (though ggplot2’s implementation of the loess defaults to a 2nd order polynomial).\n\n\n24.2.2.2 How a loess is constructed\nBehind the scenes, each point (xi,yi) that defines the loess curve is constructed as follows:\n\nA subset of data points closest to point xi are identified (xi i shown as the vertical dashed line in the figures below). The number of points in the subset is computed by multiplying the bandwidth \\(\\alpha\\) by the total number of observations. In our current example, \\(\\alpha\\) is set to 0.5. The number of points defining the subset is thus 0.5 * 14 = 7. The points are identified in the light blue area of the plot in panel (a) of the figure below.\nThe points in the subset are assigned weights. Greater weight is assigned to points closest to xi and vice versa. The weights define the points’ influence on the fitted line. Different weighting techniques can be implemented in a loess with the gaussian weight being the most common. 
Another weighting strategy we will also explore later in this course is the symmetric weight.\nA regression line is fit to the subset of points. Points with smaller weights will have less leverage on the fitted line than points with larger weights. The fitted line can be either a first order polynomial fit or a second order polynomial fit.\nNext, the value yi from the regression line is computed. This is shown as the red dot in panel (d). This is one of the points that will define the shape of the loess.\n\n\n\n\n\n\nThe above steps are repeated for as many xi values practically possible. Note that when xi approaches an upper or lower x limit, the subset of points becomes skewed to one side of xi. For example, when estimating x10, the seven closest points to the right of x10 are selected. Likewise, for the upper bound x140, the seven closest points to the left of x140 are selected.\n\n\n\n\n\nIn the following example, just under 30 loess points are computed at equal intervals. This defines the shape of the loess.\n\n\n\n\n\nIt’s more conventional to plot the line segments than it is to plot the points.\n\n\n\n\n\n\n\n\n24.2.2.3 Plotting a loess in R\nThe loess fit can be computed in R using the loess() function. It takes as arguments span (\\(\\alpha\\)), and degree (\\(\\lambda\\)).\n\n# Fit loess function\nlo &lt;- loess(cp.ratio ~ area, df, span = 0.5, degree = 1)\n\n# Predict loess values for a range of x-values\nlo.x &lt;- seq(min(df$area), max(df$area), length.out = 50)\nlo.y &lt;- predict(lo, lo.x)\n\nThe modeled loess curve can be added to the scatter plot using the lines function.\n\nplot(cp.ratio ~ area, dat = df)\nlines(lo.x, lo.y, col = \"red\")\n\n\n\n\nIn ggplot2 simply pass the method=\"loess\" parameter to the stat_smooth function.\n\nggplot(df, aes(x = area, y = cp.ratio)) + geom_point() + \n             stat_smooth(method = \"loess\", se = FALSE, span = 0.5)\n\nHowever, ggplot defaults to a second degree loess (i.e. 
the small regression line elements that define the loess are modeled using a 2nd order polynomial and not a 1st order polynomial). If a first order polynomial (degree=1) is desired, you need to include an argument list in the form of method.args=list(degree=1) to the stat_smooth function.\n\nggplot(df, aes(x = area, y = cp.ratio)) + geom_point() + \n             stat_smooth(method = \"loess\", se = FALSE, span = 0.5, \n                         method.args = list(degree = 1) )"
+    "text": "24.2 Fitting the data\nScatter plots are a good first start in visualizing bivariate data, but this is sometimes not enough. Our eyes need “guidance” to help perceive patterns. Another visual aid involves fitting the data with a line. We will explore two fitting strategies: the parametric fit and the non-parametric fit.\n\n24.2.1 Parametric fit\nA parametric fit is one where we impose a structure to the data–an example of which is a straight line (also referred to as a 1st order polynomial fit). The parametric fit is by far the most popular fitting strategy used in the realm of data analysis and statistics.\n\n24.2.1.1 Fitting a straight line\nA straight line is the simplest fit one can make to bivariate data. A popular method for fitting a straight line is the least-squares method. We’ll use R’s lm() function which provides us with a slope and intercept for the best fit line.\nThis can be implemented in the base plotting environment as follows:\n\nM &lt;- lm(cp.ratio ~ area, dat = df)\nplot(cp.ratio ~ area, dat = df)\nabline(M, col = \"red\")\n\n\n\n\nIn the ggplot2 plotting environment, we can make use of the stat_smooth function to generate the regression line.\n\nlibrary(ggplot2)\nggplot(df, aes(x = area, y = cp.ratio)) + geom_point() + \n             stat_smooth(method =\"lm\", se = FALSE)\n\n\n\n\nThe se = FALSE option prevents R from drawing a confidence envelope around the regression line. 
Confidence envelopes are used in inferential statistics (a topic not covered in this course).\nThe straight line is a first order polynomial with two parameters, \\(a\\) and \\(b\\), that define an equation that best describes the relationship between the two variables:\n\\[\ny = a + b (x)\n\\]\nwhere \\(a\\) and \\(b\\) can be extracted from the regression model object M as follows:\n\ncoef(M)\n\n(Intercept)        area \n 0.01399056  0.10733436 \n\n\nThus \\(a\\) = 0.014 and \\(b\\) = 0.11.\n\n\n24.2.1.2 Fitting a 2nd order polynomial\nA second order polynomial is a three parameter function (\\(a\\), \\(b\\) and \\(c\\)) whose equation \\(y = a + bx + cx^2\\) defines a curve that best fits the data. We define such a relationship in R using the formula cp.ratio ~ area + I(area^2). The identity function I() preserves the arithmetic interpretation of area^2 as part of the model. Our new lm expression and resulting coefficients follow:\n\nM2 &lt;- lm(cp.ratio ~  area + I(area^2) , dat = df)\ncoef(M2)\n\n  (Intercept)          area     I(area^2) \n 2.8684792029 -0.0118691702  0.0008393243 \n\n\nThe quadratic fit is thus,\n\\[\ny = 2.87 - 0.012 x + 0.000839 x^2\n\\]\nIn using the base plot environment, we cannot use abline to plot the predicted 2nd order polynomial curve since abline only draws straight lines. We will need to construct the line manually using the predict and lines functions.\n\nplot(cp.ratio ~ area, dat=df)\nx.pred &lt;- data.frame( area = seq(min(df$area), max(df$area), length.out = 50) )\ny.pred &lt;- predict(M2, x.pred)\nlines(x.pred$area, y.pred, col = \"red\")\n\n\n\n\nTo generate the same plot in ggplot2, simply pass the formula as an argument to stat_smooth:\n\nggplot(df, aes(x = area, y = cp.ratio)) + geom_point() + \n  stat_smooth(method = \"lm\", se = FALSE, formula = y ~  x + I(x^2) )\n\n\n\n\n\n\n\n24.2.2 Non-parametric fits\nNon-parametric fit applies to the family of fitting strategies that do not impose a structure on the data. 
Instead, they are designed to let the dataset reveal its inherent structure. One explored in this course is the loess fit.\n\n24.2.2.1 Loess\nA flexible curve fitting option is the loess curve (short for local regression; also known as the local weighted regression). Unlike the parametric approach to fitting a curve, the loess does not impose a structure on the data. The loess curve fits small segments of a regression lines across the range of x-values, then links the mid-points of these regression lines to generate the smooth curve. The range of x-values that contribute to each localized regression lines is defined by the span parameter, \\(\\alpha\\), which usually ranges from 0.2 to 1 (but, it can be greater than 1 for smaller datasets). The larger the \\(\\alpha\\) value, the smoother the curve. The other parameter that defines a loess curve is \\(\\lambda\\): it defines the polynomial order of the localized regression line. This is usually set to 1 (though ggplot2’s implementation of the loess defaults to a 2nd order polynomial).\n\n\n24.2.2.2 How a loess is constructed\nBehind the scenes, each point (xi,yi) that defines the loess curve is constructed as follows:\n\nA subset of data points closest to point xi are identified (xi i shown as the vertical dashed line in the figures below). The number of points in the subset is computed by multiplying the bandwidth \\(\\alpha\\) by the total number of observations. In our current example, \\(\\alpha\\) is set to 0.5. The number of points defining the subset is thus 0.5 * 14 = 7. The points are identified in the light blue area of the plot in panel (a) of the figure below.\nThe points in the subset are assigned weights. Greater weight is assigned to points closest to xi and vice versa. The weights define the points’ influence on the fitted line. Different weighting techniques can be implemented in a loess with the gaussian weight being the most common. 
Another weighting strategy we will also explore later in this course is the symmetric weight.\nA regression line is fit to the subset of points. Points with smaller weights will have less leverage on the fitted line than points with larger weights. The fitted line can be either a first order polynomial fit or a second order polynomial fit.\nNext, the value yi from the regression line is computed. This is shown as the red dot in panel (d). This is one of the points that will define the shape of the loess.\n\n\n\n\n\n\nThe above steps are repeated for as many xi values practically possible. Note that when xi approaches an upper or lower x limit, the subset of points becomes skewed to one side of xi. For example, when estimating x10, the seven closest points to the right of x10 are selected. Likewise, for the upper bound x140, the seven closest points to the left of x140 are selected.\n\n\n\n\n\nIn the following example, just under 30 loess points are computed at equal intervals. This defines the shape of the loess.\n\n\n\n\n\nIt’s more conventional to plot the line segments than it is to plot the points.\n\n\n\n\n\n\n\n\n24.2.2.3 Plotting a loess in R\nThe loess fit can be computed in R using the loess() function. It takes as arguments span (\\(\\alpha\\)), and degree (\\(\\lambda\\)).\n\n# Fit loess function\nlo &lt;- loess(cp.ratio ~ area, df, span = 0.5, degree = 1)\n\n# Predict loess values for a range of x-values\nlo.x &lt;- seq(min(df$area), max(df$area), length.out = 50)\nlo.y &lt;- predict(lo, lo.x)\n\nThe modeled loess curve can be added to the scatter plot using the lines function.\n\nplot(cp.ratio ~ area, dat = df)\nlines(lo.x, lo.y, col = \"red\")\n\n\n\n\nIn ggplot2 simply pass the method=\"loess\" parameter to the stat_smooth function.\n\nggplot(df, aes(x = area, y = cp.ratio)) + geom_point() + \n             stat_smooth(method = \"loess\", se = FALSE, span = 0.5)\n\nHowever, ggplot defaults to a second degree loess (i.e. 
the small regression line elements that define the loess are modeled using a 2nd order polynomial and not a 1st order polynomial). If a first order polynomial (degree=1) is desired, you need to include an argument list in the form of method.args=list(degree=1) to the stat_smooth function.\n\nggplot(df, aes(x = area, y = cp.ratio)) + geom_point() + \n             stat_smooth(method = \"loess\", se = FALSE, span = 0.5, \n                         method.args = list(degree = 1) )"
   },
   {
     "objectID": "bivariate.html#residuals",
     "href": "bivariate.html#residuals",
     "title": "24  Fits and residuals",
     "section": "24.3 Residuals",
-    "text": "24.3 Residuals\nFitting the data with a line is just the first step in EDA. Your next step should be to explore the residuals. The residuals are the distances (parallel to the y-axis) between the observed points and the fitted line. The closer the points are to the line (i.e. the smaller the residuals) the better the fit.\nThe residuals can be computed using the residuals() function. It takes as argument the model object. For example, to extract the residuals from the linear model M computed earlier type,\n\nresiduals(M)\n\n         1          2          3          4          5          6          7          8          9         10 \n 1.3596154  1.3146525  0.5267426 -0.1133131 -0.1421694  0.3998247 -1.2188692 -2.0509148 -1.8622085 -0.1704418 \n        11         12         13         14 \n-0.5788861  0.4484394 -1.0727624  3.1602905 \n\n\n\n24.3.1 Residual-dependence plot\nOne way to visualize the residuals is to create a residual-dependence plot–this plots the residuals as a function of the x-values. We’ll do this using ggplot so that we can also fit a loess curve to help discern any pattern in the residuals.\n\ndf$residuals &lt;- residuals(M)\nggplot(df, aes(x = area, y = residuals)) + geom_point() +\n             stat_smooth(method = \"loess\", se = FALSE, span = 1, \n                         method.args = list(degree = 1) )\n\n\n\n\nWe are interested in identifying any pattern in the residuals. If the model does a good job in fitting the data, the points should be uniformly distributed across the plot and the loess fit should approximate a horizontal line. With the linear model M, we observe a convex pattern in the residuals suggesting that the linear model is not a good fit. 
We say that the residuals show dependence on the x values.\nNext, we’ll look at the residuals from the second order polynomial model M2.\n\ndf$residuals2 &lt;- residuals(M2)\nggplot(df, aes(x = area, y = residuals2)) + geom_point() +\n               stat_smooth(method = \"loess\", se = FALSE, span = 1, \n                           method.args = list(degree = 1) )\n\n\n\n\nThere is no indication of dependency between the residual and the area values. The second order polynomial is a large improvement over the first order polynomial. Let’s look at the loess model.\n\ndf$residuals3 &lt;- residuals(lo)\nggplot(df, aes(x = area, y = residuals3)) + geom_point() +\n               stat_smooth(method = \"loess\", se = FALSE, span = 1, \n                           method.args = list(degree = 1) )\n\n\n\n\nNo surprise here. The loess model does a good job in capturing any overall pattern in the data. But note that this is to be expected given that the loess fit is a non-parametric model–it lets the data define its shape!\n\nYou may ask “if the loess model does such a good job in fitting the data, why bother with polynomial fits?” If you are seeking to generate a predictive model that explains the relationship between the y and x variables, then a mathematically tractable model (like a polynomial model) should be sought. If the interest is simply in identifying a pattern in the data, then a loess fit is a good choice.\n\nA model is deemed “well defined” when its residuals are constant over the full range of \\(x\\). In essence, we expect a formula of the form:\n\\[\nY = a + bx + \\varepsilon\n\\] where \\(\\varepsilon\\) is a constant that does not vary as a function of varying \\(x\\). This should sound quite familiar to you given that we’ve spent a good part of the univariate analysis section seeking a homogeneous spread in the data (i.e. a spread that did not change as a function of fitted group values). 
So what other diagnostic plots can we (and should we) generate from the fitted model? We explore such plots next.\n\n\n24.3.2 Spread-location plot\nThe M2 and lo models do a good job in eliminating any dependence between residual and x-value. Next, we will check that the residuals do not show a dependence with fitted y-values. This is analogous to univariate analysis where we checked if residuals increased or decreased with increasing medians across categories. Here we will compare residuals to the fitted cp.ratio values (for a univariate analogy, think of the fitted line as representing a level across different segments along the x-axis). We’ll generate a spread-level plot of model M2’s residuals (note that in the realm of regression analysis, such plot is often referred to as a scale-location plot). We’ll also add a loess curve to help visualize any patterns in the plot.\n\nsl2 &lt;- data.frame( std.res = sqrt(abs(residuals(M2))), \n                   fit     = predict(M2))\n\nggplot(sl2, aes(x = fit, y  =std.res)) + geom_point() +\n              stat_smooth(method = \"loess\", se = FALSE, span = 1, \n                          method.args = list(degree = 1) )\n\n\n\n\nThe function predict() extracts the y-values from the fitted model M2 and is plotted along the x-axis. It’s clear from this plot that the residuals are not homogeneous; they increase as a function of increasing fitted CP ratio. The “bend” observed in the loess curve is most likely due to a single point at the far (right) end of the fitted range. Given that we have a small batch of numbers, a loess can be easily influenced by an outlier. 
We may want to increase the loess span.\n\nggplot(sl2, aes(x = fit, y = std.res)) + geom_point() +\n              stat_smooth(method = \"loess\", se = FALSE, span = 1.5, \n                          method.args = list(degree = 1) )\n\n\n\n\nThe point’s influence is reduced enough to convince us that the observed monotonic increase is real.\nWe learned with the univariate data that re-expressing values was one way to correct for any increasing or decreasing spreads across the range of fitted values. At this point, we may want to look into re-expressing the data. But, before we do, we’ll explore another diagnostic plot: the normal q-q plot.\n\n\n24.3.3 Checking residuals for normality\nIf you are interested in conducting a hypothesis test (i.e. addressing the question “is the slope significantly different from 0”) you will likely want to check the residuals for normality since this is an assumption made when computing a confidence interval and a p-value.\nIn ggplot, we learned that a normal q-q plot could be generated using the stat_qq and stat_qq_line function.\n\nggplot(df, aes(sample = residuals2)) + \n  stat_qq(distribution = qnorm) +\n  stat_qq_line(distribution = qnorm, col = \"blue\")\n\n\n\n\nHere, the residuals seem to stray a little from a normal distribution."
+    "text": "24.3 Residuals\nFitting the data with a line is just the first step in EDA. Your next step should be to explore the residuals. The residuals are the distances (parallel to the y-axis) between the observed points and the fitted line. The closer the points are to the line (i.e. the smaller the residuals) the better the fit.\nThe residuals can be computed using the residuals() function. It takes as argument the model object. For example, to extract the residuals from the linear model M computed earlier type,\n\nresiduals(M)\n\n         1          2          3          4          5          6          7          8          9         10 \n 1.3596154  1.3146525  0.5267426 -0.1133131 -0.1421694  0.3998247 -1.2188692 -2.0509148 -1.8622085 -0.1704418 \n        11         12         13         14 \n-0.5788861  0.4484394 -1.0727624  3.1602905 \n\n\n\n24.3.1 Residual-dependence plot\nOne way to visualize the residuals is to create a residual-dependence plot–this plots the residuals as a function of the x-values. We’ll do this using ggplot so that we can also fit a loess curve to help discern any pattern in the residuals.\n\ndf$residuals &lt;- residuals(M)\nggplot(df, aes(x = area, y = residuals)) + geom_point() +\n             stat_smooth(method = \"loess\", se = FALSE, span = 1, \n                         method.args = list(degree = 1) )\n\n\n\n\nWe are interested in identifying any pattern in the residuals. If the model does a good job in fitting the data, the points should be uniformly distributed across the plot and the loess fit should approximate a horizontal line. With the linear model M, we observe a convex pattern in the residuals suggesting that the linear model is not a good fit. 
We say that the residuals show dependence on the x values.\nNext, we’ll look at the residuals from the second order polynomial model M2.\n\ndf$residuals2 &lt;- residuals(M2)\nggplot(df, aes(x = area, y = residuals2)) + geom_point() +\n               stat_smooth(method = \"loess\", se = FALSE, span = 1, \n                           method.args = list(degree = 1) )\n\n\n\n\nThere is no indication of dependency between the residual and the area values. The second order polynomial is a large improvement over the first order polynomial. Let’s look at the loess model.\n\ndf$residuals3 &lt;- residuals(lo)\nggplot(df, aes(x = area, y = residuals3)) + geom_point() +\n               stat_smooth(method = \"loess\", se = FALSE, span = 1, \n                           method.args = list(degree = 1) )\n\n\n\n\nNo surprise here. The loess model does a good job in capturing any overall pattern in the data. But note that this is to be expected given that the loess fit is a non-parametric model–it lets the data define its shape!\n\nYou may ask “if the loess model does such a good job in fitting the data, why bother with polynomial fits?” If you are seeking to generate a predictive model that explains the relationship between the y and x variables, then a mathematically tractable model (like a polynomial model) should be sought. If the interest is simply in identifying a pattern in the data, then a loess fit is a good choice.\n\nA model is deemed “well defined” when its residuals are constant over the full range of \\(x\\). In essence, we expect a formula of the form:\n\\[\nY = a + bx + \\varepsilon\n\\] where \\(\\varepsilon\\) is a constant that does not vary as a function of varying \\(x\\). This should sound quite familiar to you given that we’ve spent a good part of the univariate analysis section seeking a homogeneous spread in the data (i.e. a spread that did not change as a function of fitted group values). 
So what other diagnostic plots can we (and should we) generate from the fitted model? We explore such plots next.\n\n\n24.3.2 Spread-location plot\nThe M2 and lo models do a good job in eliminating any dependence between residual and x-value. Next, we will check that the residuals do not show a dependence with fitted y-values. This is analogous to univariate analysis where we checked if residuals increased or decreased with increasing medians across categories. Here we will compare residuals to the fitted cp.ratio values (for a univariate analogy, think of the fitted line as representing a level across different segments along the x-axis). We’ll generate a spread-level plot of model M2’s residuals (note that in the realm of regression analysis, such plot is often referred to as a scale-location plot). We’ll also add a loess curve to help visualize any patterns in the plot.\n\nsl2 &lt;- data.frame( std.res = sqrt(abs(residuals(M2))), \n                   fit     = predict(M2))\n\nggplot(sl2, aes(x = fit, y  =std.res)) + geom_point() +\n              stat_smooth(method = \"loess\", se = FALSE, span = 1, \n                          method.args = list(degree = 1) )\n\n\n\n\nThe function predict() extracts the y-values from the fitted model M2 and is plotted along the x-axis. It’s clear from this plot that the residuals are not homogeneous; they increase as a function of increasing fitted CP ratio. The “bend” observed in the loess curve is most likely due to a single point at the far (right) end of the fitted range. Given that we have a small batch of numbers, a loess can be easily influenced by an outlier. 
We may therefore want to increase the loess span by setting span = 2.\n\nggplot(sl2, aes(x = fit, y = std.res)) + geom_point() +\n              stat_smooth(method = \"loess\", se = FALSE, span = 2, \n                          method.args = list(degree = 1) )\n\n\n\n\nThe point’s influence is reduced enough to convince us that the observed monotonic increase is real.\nWe learned with the univariate data that re-expressing values was one way to correct for any increasing or decreasing spreads across the range of fitted values. At this point, we may want to look into re-expressing the data. But, before we do, we’ll explore another diagnostic plot: the normal q-q plot.\n\n\n24.3.3 Checking residuals for normality\nIf you are interested in conducting a hypothesis test (i.e. addressing the question “is the slope significantly different from 0”) you will likely want to check the residuals for normality since this is an assumption made when computing a confidence interval and a p-value.\nIn ggplot, we learned that a normal q-q plot could be generated using the stat_qq and stat_qq_line function.\n\nggplot(df, aes(sample = residuals2)) + \n  stat_qq(distribution = qnorm) +\n  stat_qq_line(distribution = qnorm, col = \"blue\")\n\n\n\n\nHere, the residuals seem to stray a little from a normal distribution."
   },
   {
     "objectID": "bivariate.html#re-expressing-the-data",