PDF fixes production update #343

Merged
merged 17 commits into from
Dec 28, 2023
2 changes: 2 additions & 0 deletions .github/workflows/update_build_environment.yml
@@ -2,6 +2,8 @@ name: Rebuild and publish new ubcdsci/py-intro-to-ds image on DockerHub
on:
  pull_request:
    types: [opened, synchronize]
    branches:
      - 'main'

jobs:
  rebuild-docker:
37 changes: 26 additions & 11 deletions source/classification1.md
@@ -16,6 +16,7 @@ kernelspec:
:tags: [remove-cell]
from chapter_preamble import *
from IPython.display import HTML
from IPython.display import Image
from sklearn.metrics.pairwise import euclidean_distances
import numpy as np
import plotly.express as px
@@ -281,6 +282,7 @@ perimeter and concavity variables. Recall that the default palette in `altair`
is colorblind-friendly, so we can stick with that here.

```{code-cell} ipython3
:tags: ["remove-output"]
perim_concav = alt.Chart(cancer).mark_circle().encode(
x=alt.X("Perimeter").title("Perimeter (standardized)"),
y=alt.Y("Concavity").title("Concavity (standardized)"),
@@ -289,12 +291,16 @@ perim_concav = alt.Chart(cancer).mark_circle().encode(
perim_concav
```

```{figure} 
```{code-cell} ipython3
:tags: ["remove-cell"]
glue("fig:05-scatter", perim_concav)
```

:::{glue:figure} fig:05-scatter
:name: fig:05-scatter
:figclass: caption-hack

Scatter plot of concavity versus perimeter colored by diagnosis label.
```
:::

+++

@@ -855,7 +861,11 @@ for neighbor_df in neighbor_df_list:
# tight layout
fig.update_layout(margin=dict(l=0, r=0, b=0, t=1), template="plotly_white")

glue("fig:05-more", fig)
# if HTML, use the plotly 3d image; if PDF, use static image
if "BOOK_BUILD_TYPE" in os.environ and os.environ["BOOK_BUILD_TYPE"] == "PDF":
    glue("fig:05-more", Image("img/classification1/plot3d_knn_classification.png"))
else:
    glue("fig:05-more", fig)
```
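
The build-type switch in this hunk can be isolated into a small helper. The following is only a sketch, assuming the `BOOK_BUILD_TYPE` environment variable is set to `"PDF"` by the PDF build, as the diff suggests; the names `is_pdf_build` and `choose_figure` are hypothetical and do not appear in the PR.

```python
import os


def is_pdf_build(environ=os.environ):
    """Return True when BOOK_BUILD_TYPE is set to "PDF" (hypothetical helper)."""
    # .get returns None when the variable is missing, so a single comparison
    # covers both the "not set" and the "set to something else" cases.
    return environ.get("BOOK_BUILD_TYPE") == "PDF"


def choose_figure(interactive_fig, static_image, environ=os.environ):
    """Pick the static image for PDF builds, the interactive figure otherwise."""
    return static_image if is_pdf_build(environ) else interactive_fig


# Usage with explicit mappings, so the example does not depend on the real environment.
print(choose_figure("plotly-figure", "static.png", {"BOOK_BUILD_TYPE": "PDF"}))
print(choose_figure("plotly-figure", "static.png", {}))
```

Passing the environment as a parameter keeps the check testable without modifying `os.environ`.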

```{figure} 
@@ -1432,6 +1442,7 @@ The new imbalanced data is shown in {numref}`fig:05-unbalanced`,
and we print the counts of the classes using the `value_counts` function.

```{code-cell} ipython3
:tags: ["remove-output"]
rare_cancer = pd.concat((
cancer[cancer["Class"] == "Benign"],
cancer[cancer["Class"] == "Malignant"].head(3)
@@ -1445,12 +1456,16 @@ rare_plot = alt.Chart(rare_cancer).mark_circle().encode(
rare_plot
```

```{figure} 
```{code-cell} ipython3
:tags: ["remove-cell"]
glue("fig:05-unbalanced", rare_plot)
```

:::{glue:figure} fig:05-unbalanced
:name: fig:05-unbalanced
:figclass: caption-hack

Imbalanced data.
```
:::

```{code-cell} ipython3
rare_cancer["Class"].value_counts()
@@ -1947,16 +1962,15 @@ unscaled_plot + prediction_plot
```

```{code-cell} ipython3
:tags: [remove-input]
:tags: [remove-cell]
glue("fig:05-workflow-plot", (unscaled_plot + prediction_plot))
```

```{figure} 
:::{glue:figure} fig:05-workflow-plot
:name: fig:05-workflow-plot
:figclass: caption-hack

Scatter plot of smoothness versus area where background color indicates the decision of the classifier.
```
:::

+++

@@ -1974,6 +1988,7 @@ found in {numref}`Chapter %s <move-to-your-own-machine>`. This will ensure that
and guidance that the worksheets provide will function as intended.

+++

## References

```{bibliography}
24 changes: 16 additions & 8 deletions source/classification2.md
@@ -395,20 +395,20 @@ the `random_state` argument that is available in many `pandas` and `scikit-learn
functions. Those functions will then use your `Generator` to generate random numbers instead of
`numpy`'s default generator. For example, we can reproduce our earlier example by using a `Generator`
object with the `seed` value set to 1; we get the same lists of numbers once again.
```{code}
```python
from numpy.random import Generator, PCG64
rng = Generator(PCG64(seed=1))
random_numbers1_third = nums_0_to_9.sample(n=10, random_state=rng).to_list()
random_numbers1_third
```
```{code}
```text
array([2, 9, 6, 4, 0, 3, 1, 7, 8, 5])
```
```{code}
```python
random_numbers2_third = nums_0_to_9.sample(n=10, random_state=rng).to_list()
random_numbers2_third
```
```{code}
```text
array([9, 5, 3, 0, 8, 4, 2, 1, 6, 7])
```
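
The reproducibility claim in this passage can be checked with a self-contained sketch. It assumes `nums_0_to_9` is a `pandas` Series holding the integers 0 through 9 (its definition is outside this hunk), and `two_draws` is a hypothetical helper introduced only for illustration.

```python
import pandas as pd
from numpy.random import Generator, PCG64

# Assumed definition; the hunk does not show how nums_0_to_9 is created.
nums_0_to_9 = pd.Series(range(10))


def two_draws(seed):
    """Shuffle the series twice with one Generator built from `seed`."""
    rng = Generator(PCG64(seed=seed))
    # Both calls share the same Generator, so the second draw continues
    # from the state left behind by the first.
    first = nums_0_to_9.sample(n=10, random_state=rng).to_list()
    second = nums_0_to_9.sample(n=10, random_state=rng).to_list()
    return first, second


run_a = two_draws(1)
run_b = two_draws(1)

# Rebuilding the Generator from the same seed reproduces the same pair of shuffles.
print(run_a == run_b)
```

The exact permutations depend on the installed `numpy` and `pandas` versions, so the sketch compares two runs rather than hard-coding the expected lists.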

@@ -432,6 +432,7 @@ You will also notice that we set the random seed using the `np.random.seed` func
as described in {numref}`randomseeds`.

```{code-cell} ipython3
:tags: ["remove-output"]
# load packages
import altair as alt
import pandas as pd
@@ -462,11 +463,18 @@ perim_concav = alt.Chart(cancer).mark_circle().encode(
perim_concav
```

```{figure} 
```{code-cell} ipython3
:tags: ["remove-cell"]
glue("fig:06-precode", perim_concav)
```

:::{glue:figure} fig:06-precode
:name: fig:06-precode

Scatter plot of tumor cell concavity versus smoothness colored by diagnosis label.
```
:::



+++

@@ -2205,10 +2213,10 @@ and guidance that the worksheets provide will function as intended.
text, it requires a bit more mathematical background than we require.


## References

+++

## References

```{bibliography}
:filter: docname in docnames
```
4 changes: 2 additions & 2 deletions source/clustering.md
@@ -1063,10 +1063,10 @@ and guidance that the worksheets provide will function as intended.
learning, it covers *principal components analysis (PCA)*, which is a very
popular technique for reducing the number of predictors in a data set.

## References

+++

## References

```{bibliography}
:filter: docname in docnames
```
Binary file added source/img/regression1/plot3d_knn_regression.png