fix links in docs
krypticmouse committed Oct 22, 2024
1 parent d4b2f70 commit 1120c4c
Showing 11 changed files with 25 additions and 25 deletions.
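
Most of the changes below swap an absolute `https://dspy-docs.vercel.app/...` URL for a site-relative path. As a rough illustration only (the commit does not say how the edits were made, and `to_site_relative` is a hypothetical helper), that kind of rewrite could be scripted as follows. The sketch only strips the host and an optional leading `/docs` segment; it does not reproduce the numbered slugs (such as `/building-blocks/2-signatures`) that some of the new links use.

```python
import re

# Hypothetical helper, not taken from the commit: matches the old docs host,
# an optional leading "/docs" segment, and captures the rest of the path.
OLD_HOST = re.compile(r"https://dspy-docs\.vercel\.app(?:/docs)?(/[^)\s]*)")


def to_site_relative(markdown: str) -> str:
    """Replace absolute links to the old docs host with site-relative paths."""
    return OLD_HOST.sub(r"\1", markdown)


before = (
    "See [Metrics docs](https://dspy-docs.vercel.app/docs/building-blocks/metrics) "
    "and [COPRO](https://dspy-docs.vercel.app/deep-dive/optimizers/copro)."
)
print(to_site_relative(before))
```

Links that already are relative, or that point at other hosts, pass through unchanged.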
2 changes: 1 addition & 1 deletion docs/docs/building-blocks/1-language_models.md
@@ -59,7 +59,7 @@ This is almost never the recommended way to interact with LMs in DSPy, but it is

### Using the LM with DSPy signatures

-You can also use the LM via DSPy [`signature` (input/output spec)](https://dspy-docs.vercel.app/docs/building-blocks/signatures) and [`modules`](https://dspy-docs.vercel.app/docs/building-blocks/modules), which we discuss in more depth in the remaining guides.
+You can also use the LM via DSPy [`signature` (input/output spec)](/building-blocks/2-signatures) and [`modules`](/building-blocks/3-modules), which we discuss in more depth in the remaining guides.

```python
# Define a module (ChainOfThought) and assign it a signature (return an answer, given a question).
2 changes: 1 addition & 1 deletion docs/docs/building-blocks/2-signatures.md
@@ -157,4 +157,4 @@ Prediction(

While signatures are convenient for prototyping with structured inputs/outputs, that's not the main reason to use them!

-You should compose multiple signatures into bigger [DSPy modules](https://dspy-docs.vercel.app/docs/building-blocks/modules) and [compile these modules into optimized prompts](https://dspy-docs.vercel.app/docs/building-blocks/optimizers#what-does-a-dspy-optimizer-tune-how-does-it-tune-them) and finetunes.
+You should compose multiple signatures into bigger [DSPy modules](/building-blocks/modules) and [compile these modules into optimized prompts](/building-blocks/optimizers#what-does-a-dspy-optimizer-tune-how-does-it-tune-them) and finetunes.
4 changes: 2 additions & 2 deletions docs/docs/building-blocks/3-modules.md
@@ -6,7 +6,7 @@ sidebar_position: 3

A **DSPy module** is a building block for programs that use LMs.

-- Each built-in module abstracts a **prompting technique** (like chain of thought or ReAct). Crucially, they are generalized to handle any [DSPy Signature](/2-signatures).
+- Each built-in module abstracts a **prompting technique** (like chain of thought or ReAct). Crucially, they are generalized to handle any [DSPy Signature](/building-blocks/2-signatures).

- A DSPy module has **learnable parameters** (i.e., the little pieces comprising the prompt and the LM weights) and can be invoked (called) to process inputs and return outputs.

@@ -17,7 +17,7 @@ A **DSPy module** is a building block for programs that use LMs.

Let's start with the most fundamental module, `dspy.Predict`. Internally, all other DSPy modules are just built using `dspy.Predict`.

-We'll assume you are already at least a little familiar with [DSPy signatures](/2-signatures), which are declarative specs for defining the behavior of any module we use in DSPy.
+We'll assume you are already at least a little familiar with [DSPy signatures](/building-blocks/2-signatures), which are declarative specs for defining the behavior of any module we use in DSPy.

To use a module, we first **declare** it by giving it a signature. Then we **call** the module with the input arguments, and extract the output fields!

12 changes: 6 additions & 6 deletions docs/docs/building-blocks/6-optimizers.md
@@ -38,7 +38,7 @@ In many cases, we found that compiling leads to better prompts than human writin
<!-- 3. Using dot to compile the `.dot` file into a PNG -->
<!-- Robert Goldman [2024/05/11:rpg] -->

-[Subclasses of Teleprompter](figures/teleprompter-classes.png)
+![Subclasses of Teleprompter](figures/teleprompter-classes.png)

All of these can be accessed via `from dspy.teleprompt import *`.

@@ -50,7 +50,7 @@ These optimizers extend the signature by automatically generating and including

2. **`BootstrapFewShot`**: Uses a `teacher` module (which defaults to your program) to generate complete demonstrations for every stage of your program, along with labeled examples in `trainset`. Parameters include `max_labeled_demos` (the number of demonstrations randomly selected from the `trainset`) and `max_bootstrapped_demos` (the number of additional examples generated by the `teacher`). The bootstrapping process employs the metric to validate demonstrations, including only those that pass the metric in the "compiled" prompt. Advanced: Supports using a `teacher` program that is a *different* DSPy program that has compatible structure, for harder tasks.

-3. [**`BootstrapFewShotWithRandomSearch`**](https://dspy-docs.vercel.app/deep-dive/optimizers/bootstrap-fewshot): Applies `BootstrapFewShot` several times with random search over generated demonstrations, and selects the best program over the optimization. Parameters mirror those of `BootstrapFewShot`, with the addition of `num_candidate_programs`, which specifies the number of random programs evaluated over the optimization, including candidates of the uncompiled program, `LabeledFewShot` optimized program, `BootstrapFewShot` compiled program with unshuffled examples and `num_candidate_programs` of `BootstrapFewShot` compiled programs with randomized example sets.
+3. [**`BootstrapFewShotWithRandomSearch`**](/deep-dive/optimizers/bootstrap-fewshot): Applies `BootstrapFewShot` several times with random search over generated demonstrations, and selects the best program over the optimization. Parameters mirror those of `BootstrapFewShot`, with the addition of `num_candidate_programs`, which specifies the number of random programs evaluated over the optimization, including candidates of the uncompiled program, `LabeledFewShot` optimized program, `BootstrapFewShot` compiled program with unshuffled examples and `num_candidate_programs` of `BootstrapFewShot` compiled programs with randomized example sets.

4. **`KNNFewShot`**. Uses k-Nearest Neighbors algorithm to find the nearest training example demonstrations for a given input example. These nearest neighbor demonstrations are then used as the trainset for the BootstrapFewShot optimization process. See [this notebook](https://github.com/stanfordnlp/dspy/blob/main/examples/knn.ipynb) for an example.

@@ -59,9 +59,9 @@ These optimizers extend the signature by automatically generating and including

These optimizers produce optimal instructions for the prompt and, in the case of MIPROv2 can also optimize the set of few-shot demonstrations.

-5. [**`COPRO`**](https://dspy-docs.vercel.app/deep-dive/optimizers/copro): Generates and refines new instructions for each step, and optimizes them with coordinate ascent (hill-climbing using the metric function and the `trainset`). Parameters include `depth` which is the number of iterations of prompt improvement the optimizer runs over.
+5. [**`COPRO`**](/deep-dive/optimizers/copro): Generates and refines new instructions for each step, and optimizes them with coordinate ascent (hill-climbing using the metric function and the `trainset`). Parameters include `depth` which is the number of iterations of prompt improvement the optimizer runs over.

-6. [**`MIPROv2`**](https://dspy-docs.vercel.app/deep-dive/optimizers/miprov2): Generates instructions *and* few-shot examples in each step. The instruction generation is data-aware and demonstration-aware. Uses Bayesian Optimization to effectively search over the space of generation instructions/demonstrations across your modules.
+6. [**`MIPROv2`**](/deep-dive/optimizers/miprov2): Generates instructions *and* few-shot examples in each step. The instruction generation is data-aware and demonstration-aware. Uses Bayesian Optimization to effectively search over the space of generation instructions/demonstrations across your modules.


### Automatic Finetuning
@@ -83,13 +83,13 @@ Ultimately, finding the ‘right’ optimizer to use & the best configuration fo
That being said, here's the general guidance on getting started:
* If you have **very few examples** (around 10), start with `BootstrapFewShot`.
* If you have **more data** (50 examples or more), try `BootstrapFewShotWithRandomSearch`.
-* If you prefer to do **instruction optimization only** (i.e. you want to keep your prompt 0-shot), use `MIPROv2` [configured for 0-shot optimization to optimize](https://dspy-docs.vercel.app/deep-dive/optimizers/miprov2#optimizing-instructions-only-with-miprov2-0-shot).
+* If you prefer to do **instruction optimization only** (i.e. you want to keep your prompt 0-shot), use `MIPROv2` [configured for 0-shot optimization to optimize](/deep-dive/optimizers/miprov2#optimizing-instructions-only-with-miprov2-0-shot).
* If you’re willing to use more inference calls to perform **longer optimization runs** (e.g. 40 trials or more), and have enough data (e.g. 200 examples or more to prevent overfitting) then try `MIPROv2`.
* If you have been able to use one of these with a large LM (e.g., 7B parameters or above) and need a very **efficient program**, finetune a small LM for your task with `BootstrapFinetune`.

## How do I use an optimizer?

-They all share this general interface, with some differences in the keyword arguments (hyperparameters). Detailed documentation for key optimizers can be found [here](https://dspy-docs.vercel.app/deep-dive/optimizers/), and a full list can be found [here](https://dspy-docs.vercel.app/cheatsheet).
+They all share this general interface, with some differences in the keyword arguments (hyperparameters). Detailed documentation for key optimizers can be found [here](/deep-dive/optimizers/vfrs), and a full list can be found [here](/cheatsheet).

Let's see this with the most common one, `BootstrapFewShotWithRandomSearch`.

2 changes: 1 addition & 1 deletion docs/docs/deep-dive/optimizers/bootstrap-fewshot.md
@@ -8,7 +8,7 @@ When compiling a DSPy program, we generally invoke a teleprompter, which is an o

## Setting up a Sample Pipeline

-We'll be making a basic answer generation pipeline over GSM8K dataset that we saw in the [Minimal Example](https://dspy-docs.vercel.app/docs/quick-start/minimal-example), we won't be changing anything in it! So let's start by configuring the LM which will be OpenAI LM client with `gpt-3.5-turbo` as the LLM in use.
+We'll be making a basic answer generation pipeline over GSM8K dataset that we saw in the [Minimal Example](/quick-start/minimal-example), we won't be changing anything in it! So let's start by configuring the LM which will be OpenAI LM client with `gpt-3.5-turbo` as the LLM in use.

```python
import dspy
2 changes: 1 addition & 1 deletion docs/docs/deep-dive/optimizers/miprov2.md
@@ -12,7 +12,7 @@ sidebar_position: 6

### Setting up a Sample Pipeline

-We'll be making a basic answer generation pipeline over GSM8K dataset that we saw in the [Minimal Example](https://dspy-docs.vercel.app/docs/quick-start/minimal-example), we won't be changing anything in it! So let's start by configuring the LM which will be OpenAI LM client with `gpt-3.5-turbo` as the LLM in use.
+We'll be making a basic answer generation pipeline over GSM8K dataset that we saw in the [Minimal Example](/quick-start/minimal-example), we won't be changing anything in it! So let's start by configuring the LM which will be OpenAI LM client with `gpt-3.5-turbo` as the LLM in use.

```python
import dspy
12 changes: 6 additions & 6 deletions docs/docs/faqs.md
@@ -16,7 +16,7 @@ The **DSPy** philosophy and abstraction differ significantly from other librarie

## Basic Usage

-**How should I use DSPy for my task?** We wrote a [eight-step guide](https://dspy-docs.vercel.app/docs/building-blocks/solving_your_task) on this. In short, using DSPy is an iterative process. You first define your task and the metrics you want to maximize, and prepare a few example inputs — typically without labels (or only with labels for the final outputs, if your metric requires them). Then, you build your pipeline by selecting built-in layers (`modules`) to use, giving each layer a `signature` (input/output spec), and then calling your modules freely in your Python code. Lastly, you use a DSPy `optimizer` to compile your code into high-quality instructions, automatic few-shot examples, or updated LM weights for your LM.
+**How should I use DSPy for my task?** We wrote a [eight-step guide](/building-blocks/solving_your_task) on this. In short, using DSPy is an iterative process. You first define your task and the metrics you want to maximize, and prepare a few example inputs — typically without labels (or only with labels for the final outputs, if your metric requires them). Then, you build your pipeline by selecting built-in layers (`modules`) to use, giving each layer a `signature` (input/output spec), and then calling your modules freely in your Python code. Lastly, you use a DSPy `optimizer` to compile your code into high-quality instructions, automatic few-shot examples, or updated LM weights for your LM.

**How do I convert my complex prompt into a DSPy pipeline?** See the same answer above.

@@ -34,7 +34,7 @@ You can specify the generation of long responses as a `dspy.OutputField`. To ens

- **How can I ensure that DSPy doesn't strip new line characters from my inputs or outputs?**

-DSPy uses [Signatures](https://dspy-docs.vercel.app/docs/deep-dive/signature/understanding-signatures) to format prompts passed into LMs. In order to ensure that new line characters aren't stripped from longer inputs, you must specify `format=str` when creating a field.
+DSPy uses [Signatures](/deep-dive/signature/understanding-signatures) to format prompts passed into LMs. In order to ensure that new line characters aren't stripped from longer inputs, you must specify `format=str` when creating a field.

```python
class UnstrippedSignature(dspy.Signature):
@@ -49,11 +49,11 @@ class UnstrippedSignature(dspy.Signature):

- **How do I define my own metrics? Can metrics return a float?**

-You can define metrics as simply Python functions that process model generations and evaluate them based on user-defined requirements. Metrics can compare existent data (e.g. gold labels) to model predictions or they can be used to assess various components of an output using validation feedback from LMs (e.g. LLMs-as-Judges). Metrics can return `bool`, `int`, and `float` types scores. Check out the official [Metrics docs](https://dspy-docs.vercel.app/docs/building-blocks/metrics) to learn more about defining custom metrics and advanced evaluations using AI feedback and/or DSPy programs.
+You can define metrics as simply Python functions that process model generations and evaluate them based on user-defined requirements. Metrics can compare existent data (e.g. gold labels) to model predictions or they can be used to assess various components of an output using validation feedback from LMs (e.g. LLMs-as-Judges). Metrics can return `bool`, `int`, and `float` types scores. Check out the official [Metrics docs](/building-blocks/metrics) to learn more about defining custom metrics and advanced evaluations using AI feedback and/or DSPy programs.
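
As a minimal sketch of the metrics answer above: a metric is an ordinary Python function that takes a gold example and a prediction (plus an optional `trace`) and returns a score. The `answer` field name and the `SimpleNamespace` stand-ins are assumptions made for illustration; in a real program the objects would come from DSPy.

```python
from types import SimpleNamespace


def exact_match(example, pred, trace=None) -> bool:
    """Boolean metric: gold and predicted answers agree, ignoring case and outer whitespace."""
    return example.answer.strip().lower() == pred.answer.strip().lower()


def token_recall(example, pred, trace=None) -> float:
    """Float metric: fraction of gold-answer tokens that also appear in the prediction."""
    gold = set(example.answer.lower().split())
    guess = set(pred.answer.lower().split())
    return len(gold & guess) / len(gold) if gold else 0.0


# Stand-in objects (hypothetical); only the `answer` attribute is read by the metrics.
example = SimpleNamespace(question="What is the capital of France?", answer="Paris")
pred = SimpleNamespace(answer=" paris ")

print(exact_match(example, pred))   # True
print(token_recall(example, pred))  # 1.0
```

Either function shape works wherever a metric is expected: the first returns a `bool`, the second a `float` between 0 and 1.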

- **How expensive or slow is compiling??**

-To reflect compiling metrics, we highlight an experiment for reference, compiling the [`SimplifiedBaleen`](https://dspy-docs.vercel.app/docs/tutorials/simplified-baleen) using the [`dspy.BootstrapFewShotWithRandomSearch`](https://dspy-docs.vercel.app/docs/deep-dive/teleprompter/bootstrap-fewshot) optimizer on the `gpt-3.5-turbo-1106` model over 7 candidate programs and 10 threads. We report that compiling this program takes around 6 minutes with 3200 API calls, 2.7 million input tokens and 156,000 output tokens, reporting a total cost of $3 USD (at the current pricing of the OpenAI model).
+To reflect compiling metrics, we highlight an experiment for reference, compiling the [`SimplifiedBaleen`](/tutorials/simplified-baleen) using the [`dspy.BootstrapFewShotWithRandomSearch`](/deep-dive/teleprompter/bootstrap-fewshot) optimizer on the `gpt-3.5-turbo-1106` model over 7 candidate programs and 10 threads. We report that compiling this program takes around 6 minutes with 3200 API calls, 2.7 million input tokens and 156,000 output tokens, reporting a total cost of $3 USD (at the current pricing of the OpenAI model).

Compiling DSPy `optimizers` naturally will incur additional LM calls, but we substantiate this overhead with minimalistic executions with the goal of maximizing performance. This invites avenues to enhance performance of smaller models by compiling DSPy programs with larger models to learn enhanced behavior during compile-time and propagate such behavior to the tested smaller model during inference-time.
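
The compile-cost figures quoted above can be sanity-checked with a little arithmetic. The per-token prices in this sketch are an assumption (the text only says "current pricing"); they are set to $0.001 per 1K input tokens and $0.002 per 1K output tokens, which roughly reproduces the reported total.

```python
# Figures reported in the FAQ answer above.
api_calls = 3200
input_tokens = 2_700_000
output_tokens = 156_000

# Assumed prices in USD per 1K tokens; not stated in the source.
PRICE_IN_PER_1K = 0.001
PRICE_OUT_PER_1K = 0.002

avg_input_per_call = input_tokens / api_calls    # 843.75
avg_output_per_call = output_tokens / api_calls  # 48.75

total_cost = (
    input_tokens / 1000 * PRICE_IN_PER_1K
    + output_tokens / 1000 * PRICE_OUT_PER_1K
)

print(f"avg input tokens per call:  {avg_input_per_call}")
print(f"avg output tokens per call: {avg_output_per_call}")
print(f"estimated total cost: ${total_cost:.2f}")
```

Under those assumed prices the input tokens dominate the bill, which is what one would expect from few-shot prompts that are long relative to the generated answers.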

@@ -115,7 +115,7 @@ Modules can be frozen by setting their `._compiled` attribute to be True, indica

You can specify JSON-type descriptions in the `desc` field of the long-form signature `dspy.OutputField` (e.g. `output = dspy.OutputField(desc='key-value pairs')`).

-If you notice outputs are still not conforming to JSON formatting, try Asserting this constraint! Check out [Assertions](https://dspy-docs.vercel.app/docs/building-blocks/assertions) (or the next question!)
+If you notice outputs are still not conforming to JSON formatting, try Asserting this constraint! Check out [Assertions](/building-blocks/assertions) (or the next question!)
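
The JSON constraint mentioned above ultimately comes down to a boolean check on the generated text. A minimal sketch of such a check in plain Python follows; `is_json_object` is a hypothetical helper name, and how the result is wired into an assertion is left to the Assertions docs.

```python
import json


def is_json_object(text) -> bool:
    """Return True only if `text` parses as JSON and the top-level value is an object."""
    try:
        return isinstance(json.loads(text), dict)
    except (json.JSONDecodeError, TypeError):
        # JSONDecodeError: malformed JSON; TypeError: input was not str/bytes.
        return False


print(is_json_object('{"city": "Paris", "country": "France"}'))  # True
print(is_json_object("city: Paris"))                             # False
```

Requiring a top-level object (rather than any valid JSON value) matches the "key-value pairs" description used in the `desc` example; relax the `isinstance` check if lists or scalars are acceptable.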

- **How do I use DSPy assertions?**

@@ -172,4 +172,4 @@ At times, DSPy may have hard-coded arguments that are not relevant for your comp

**How can I add my favorite LM or vector store?**

-Check out these walkthroughs on setting up a [Custom LM client](https://dspy-docs.vercel.app/docs/deep-dive/language_model_clients/custom-lm-client) and [Custom RM client](https://dspy-docs.vercel.app/docs/deep-dive/retrieval_models_clients/custom-rm-client).
+Check out these walkthroughs on setting up a [Custom LM client](/deep-dive/language_model_clients/custom-lm-client) and [Custom RM client](/deep-dive/retrieval_models_clients/custom-rm-client).
2 changes: 1 addition & 1 deletion docs/docs/roadmap.md
@@ -108,7 +108,7 @@ Over the next six months, our goal is to dramatically improve each angle of thes

Using DSPy well for solving a new task is just doing good machine learning with LMs, but teaching this is hard. On the one hand, it's an iterative process: you make some initial choices, which will be sub-optimal, and then you refine them incrementally. It's highly exploratory: it's often the case that no one knows yet how to best solve a problem in a DSPy-esque way. One the other hand, DSPy offers many emerging lessons from several years of building LM systems, in which the design space, the data regime, and many other factors are new both to ML experts and to the very large fraction of users that have no ML experience.

-Though current docs do address [a bunch of this](https://dspy-docs.vercel.app/docs/building-blocks/solving_your_task) in isolated ways, one thing we've learned is that we should separate teaching the core DSPy language (which is ultimately pretty small) from teaching the emerging ML workflow that works well in a DSPy-esque setting. As a natural extension of this, we need to place more emphasis on steps prior and after to the explicit coding in DSPy, from data collection to deployment that serves and monitors the optimized DSPy program in practice. This is just starting but efforts will be ramping up led by Omar Khattab, Isaac Miller, and Herumb Shandilya.
+Though current docs do address [a bunch of this](/building-blocks/solving_your_task) in isolated ways, one thing we've learned is that we should separate teaching the core DSPy language (which is ultimately pretty small) from teaching the emerging ML workflow that works well in a DSPy-esque setting. As a natural extension of this, we need to place more emphasis on steps prior and after to the explicit coding in DSPy, from data collection to deployment that serves and monitors the optimized DSPy program in practice. This is just starting but efforts will be ramping up led by Omar Khattab, Isaac Miller, and Herumb Shandilya.


## 4) Shifting towards more interactive optimization & tracking.