Commit

Permalink
Merge pull request #343 from UBC-DSCI/main
PDF fixes production update
trevorcampbell authored Dec 28, 2023
2 parents 33a628b + c485fd1 commit 4bf8838
Showing 18 changed files with 228 additions and 127 deletions.
2 changes: 2 additions & 0 deletions .github/workflows/update_build_environment.yml
@@ -2,6 +2,8 @@ name: Rebuild and publish new ubcdsci/py-intro-to-ds image on DockerHub
on:
  pull_request:
    types: [opened, synchronize]
    branches:
      - 'main'

jobs:
  rebuild-docker:
37 changes: 26 additions & 11 deletions source/classification1.md
@@ -16,6 +16,7 @@ kernelspec:
:tags: [remove-cell]
from chapter_preamble import *
from IPython.display import HTML
from IPython.display import Image
from sklearn.metrics.pairwise import euclidean_distances
import numpy as np
import plotly.express as px
@@ -281,6 +282,7 @@ perimeter and concavity variables. Recall that the default palette in `altair`
is colorblind-friendly, so we can stick with that here.

```{code-cell} ipython3
:tags: ["remove-output"]
perim_concav = alt.Chart(cancer).mark_circle().encode(
    x=alt.X("Perimeter").title("Perimeter (standardized)"),
    y=alt.Y("Concavity").title("Concavity (standardized)"),
@@ -289,12 +291,16 @@ perim_concav = alt.Chart(cancer).mark_circle().encode(
perim_concav
```

```{figure} data:image/gif;base64,R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7
```{code-cell} ipython3
:tags: ["remove-cell"]
glue("fig:05-scatter", perim_concav)
```

:::{glue:figure} fig:05-scatter
:name: fig:05-scatter
:figclass: caption-hack

Scatter plot of concavity versus perimeter colored by diagnosis label.
```
:::

+++

@@ -855,7 +861,11 @@ for neighbor_df in neighbor_df_list:
# tight layout
fig.update_layout(margin=dict(l=0, r=0, b=0, t=1), template="plotly_white")
glue("fig:05-more", fig)
# if HTML, use the plotly 3d image; if PDF, use static image
if "BOOK_BUILD_TYPE" in os.environ and os.environ["BOOK_BUILD_TYPE"] == "PDF":
    glue("fig:05-more", Image("img/classification1/plot3d_knn_classification.png"))
else:
    glue("fig:05-more", fig)
```
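
The build-type switch in this hunk can be sketched in isolation. This is a minimal illustration, assuming only that the build sets a `BOOK_BUILD_TYPE` environment variable as the diff shows. The helper name `select_figure`, its `env` parameter, and the placeholder strings are hypothetical and stand in for the plotly figure and the static `Image`.

```python
import os


def select_figure(interactive, static, env=None):
    """Return the static figure for PDF builds, the interactive one otherwise.

    `select_figure` is a hypothetical helper, not part of the book's code.
    """
    env = os.environ if env is None else env
    # .get returns None when the variable is unset, so a single comparison
    # covers both the "missing" and the "not PDF" cases.
    if env.get("BOOK_BUILD_TYPE") == "PDF":
        return static
    return interactive


# Passing an explicit mapping keeps the example independent of the real environment.
print(select_figure("plotly figure", "static PNG", env={"BOOK_BUILD_TYPE": "PDF"}))
print(select_figure("plotly figure", "static PNG", env={}))
```

Using `.get` is one possible simplification of the two-part `in os.environ and ... ==` test; both forms select the same branch.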

```{figure} data:image/gif;base64,R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7
@@ -1432,6 +1442,7 @@ The new imbalanced data is shown in {numref}`fig:05-unbalanced`,
and we print the counts of the classes using the `value_counts` function.

```{code-cell} ipython3
:tags: ["remove-output"]
rare_cancer = pd.concat((
    cancer[cancer["Class"] == "Benign"],
    cancer[cancer["Class"] == "Malignant"].head(3)
@@ -1445,12 +1456,16 @@ rare_plot = alt.Chart(rare_cancer).mark_circle().encode(
rare_plot
```

```{figure} data:image/gif;base64,R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7
```{code-cell} ipython3
:tags: ["remove-cell"]
glue("fig:05-unbalanced", rare_plot)
```

:::{glue:figure} fig:05-unbalanced
:name: fig:05-unbalanced
:figclass: caption-hack

Imbalanced data.
```
:::

```{code-cell} ipython3
rare_cancer["Class"].value_counts()
@@ -1947,16 +1962,15 @@ unscaled_plot + prediction_plot
```

```{code-cell} ipython3
:tags: [remove-input]
:tags: [remove-cell]
glue("fig:05-workflow-plot", (unscaled_plot + prediction_plot))
```

```{figure} data:image/gif;base64,R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7
:::{glue:figure} fig:05-workflow-plot
:name: fig:05-workflow-plot
:figclass: caption-hack

Scatter plot of smoothness versus area where background color indicates the decision of the classifier.
```
:::

+++

@@ -1974,6 +1988,7 @@ found in {numref}`Chapter %s <move-to-your-own-machine>`. This will ensure that
and guidance that the worksheets provide will function as intended.

+++

## References

```{bibliography}
24 changes: 16 additions & 8 deletions source/classification2.md
@@ -395,20 +395,20 @@ the `random_state` argument that is available in many `pandas` and `scikit-learn
functions. Those functions will then use your `Generator` to generate random numbers instead of
`numpy`'s default generator. For example, we can reproduce our earlier example by using a `Generator`
object with the `seed` value set to 1; we get the same lists of numbers once again.
```{code}
```python
from numpy.random import Generator, PCG64
rng = Generator(PCG64(seed=1))
random_numbers1_third = nums_0_to_9.sample(n=10, random_state=rng).to_list()
random_numbers1_third
```
```{code}
```text
array([2, 9, 6, 4, 0, 3, 1, 7, 8, 5])
```
```{code}
```python
random_numbers2_third = nums_0_to_9.sample(n=10, random_state=rng).to_list()
random_numbers2_third
```
```{code}
```text
array([9, 5, 3, 0, 8, 4, 2, 1, 6, 7])
```
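
The reproducibility claim in this passage can be checked with a small sketch. It assumes only `numpy`, and it uses `Generator.permutation` in place of the `pandas` `sample` call from the text so that the example stays self-contained; that substitution, and the variable names, are illustrative assumptions rather than the book's code.

```python
from numpy.random import Generator, PCG64

# Two generators built from the same seed produce the same stream of numbers.
rng_a = Generator(PCG64(seed=1))
rng_b = Generator(PCG64(seed=1))

first_a = rng_a.permutation(10).tolist()
first_b = rng_b.permutation(10).tolist()

# A second draw from rng_a continues from its advanced internal state.
second_a = rng_a.permutation(10).tolist()

print(first_a == first_b)                  # same seed, same first draw
print(sorted(first_a) == list(range(10)))  # a shuffle of 0..9
print(second_a)
```

Reusing one generator object across calls, as the hunk does with `rng`, is what makes the sequence of draws repeatable from run to run.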
@@ -432,6 +432,7 @@ You will also notice that we set the random seed using the `np.random.seed` func
as described in {numref}`randomseeds`.

```{code-cell} ipython3
:tags: ["remove-output"]
# load packages
import altair as alt
import pandas as pd
@@ -462,11 +463,18 @@ perim_concav = alt.Chart(cancer).mark_circle().encode(
perim_concav
```

```{figure} data:image/gif;base64,R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7
```{code-cell} ipython3
:tags: ["remove-cell"]
glue("fig:06-precode", perim_concav)
```

:::{glue:figure} fig:06-precode
:name: fig:06-precode

Scatter plot of tumor cell concavity versus smoothness colored by diagnosis label.
```
:::



+++

@@ -2205,10 +2213,10 @@ and guidance that the worksheets provide will function as intended.
text, it requires a bit more mathematical background than we require.


## References

+++

## References

```{bibliography}
:filter: docname in docnames
```
4 changes: 2 additions & 2 deletions source/clustering.md
@@ -1063,10 +1063,10 @@ and guidance that the worksheets provide will function as intended.
learning, it covers *principal components analysis (PCA)*, which is a very
popular technique for reducing the number of predictors in a data set.

## References

+++

## References

```{bibliography}
:filter: docname in docnames
```
Binary file added source/img/regression1/plot3d_knn_regression.png