Introduce MLflow in tutorials for better understanding for what happens under the hood (stanfordnlp#7732)

* Introduce MLflow in tutorials for better understanding for what happens under the hood

Signed-off-by: B-Step62 <yuki.watanabe@databricks.com>

* wording

Signed-off-by: B-Step62 <yuki.watanabe@databricks.com>

---------

Signed-off-by: B-Step62 <yuki.watanabe@databricks.com>
B-Step62 authored Feb 4, 2025
1 parent 1e3a412 commit 168c052
Showing 15 changed files with 841 additions and 6 deletions.
Binary file added docs/docs/static/img/mlflow-tracing-rag.png
127 changes: 125 additions & 2 deletions docs/docs/tutorials/agents/index.ipynb
@@ -8,7 +8,46 @@
"\n",
"Let's walk through a quick example of setting up a `dspy.ReAct` agent with a couple of tools and optimizing it to conduct advanced browsing for multi-hop search.\n",
"\n",
"Install the latest DSPy via `pip install -U dspy` and follow along."
"Install the latest DSPy via `pip install -U dspy` and follow along.\n",
"\n",
"<details>\n",
"<summary>Optional: Set up MLflow Tracing to understand what's happening under the hood.</summary>\n",
"\n",
"### MLflow DSPy Integration\n",
"\n",
"<a href=\"https://mlflow.org/\">MLflow</a> is an LLMOps tool that natively integrates with DSPy and offer explainability and experiment tracking. In this tutorial, you can use MLflow to visualize prompts and optimization progress as traces to understand the DSPy's behavior better. You can set up MLflow easily by following the four steps below.\n",
"\n",
"![MLflow Trace](./mlflow-tracing-agent.png)\n",
"\n",
"1. Install MLflow\n",
"\n",
"```bash\n",
"%pip install mlflow>=2.20\n",
"```\n",
"\n",
"2. Start MLflow UI in a separate terminal\n",
"```bash\n",
"mlflow ui --port 5000\n",
"```\n",
"\n",
"3. Connect the notebook to MLflow\n",
"```python\n",
"import mlflow\n",
"\n",
"mlflow.set_tracking_uri(\"http://localhost:5000\")\n",
"mlflow.set_experiment(\"DSPy\")\n",
"```\n",
"\n",
"4. Enabling tracing.\n",
"```python\n",
"mlflow.dspy.autolog()\n",
"```\n",
"\n",
"Once you have completed the steps above, you can see traces for each program execution on the notebook. They provide great visibility into the model's behavior and helps you understand the DSPy's concepts better throughout the tutorial.\n",
"\n",
"To kearn more about the integration, visit [MLflow DSPy Documentation](https://mlflow.org/docs/latest/llms/dspy/index.html) as well.\n",
"\n",
"</details>"
]
},
{
@@ -440,6 +479,54 @@
"evaluate(safe_react)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<details>\n",
"<summary>Tracking Evaluation Results in MLflow Experiment</summary>\n",
"\n",
"<br/>\n",
"\n",
"To track and visualize the evaluation results over time, you can record the results in MLflow Experiment.\n",
"\n",
"\n",
"```python\n",
"import mlflow\n",
"\n",
"with mlflow.start_run(run_name=\"agent_evaluation\"):\n",
" evaluate = dspy.Evaluate(\n",
" devset=devset,\n",
" metric=top5_recall,\n",
" num_threads=16,\n",
" display_progress=True,\n",
" # To record the outputs and detailed scores to MLflow\n",
" return_all_scores=True,\n",
" return_outputs=True,\n",
" )\n",
"\n",
" # Evaluate the program as usual\n",
" aggregated_score, outputs, all_scores = evaluate(cot)\n",
"\n",
" # Log the aggregated score\n",
" mlflow.log_metric(\"top5_recall\", aggregated_score)\n",
" # Log the detailed evaluation results as a table\n",
" mlflow.log_table(\n",
" {\n",
" \"Claim\": [example.claim for example in eval_set],\n",
" \"Expected Titles\": [example.titles for example in eval_set],\n",
" \"Predicted Titles\": outputs,\n",
" \"Top 5 Recall\": all_scores,\n",
" },\n",
" artifact_file=\"eval_results.json\",\n",
" )\n",
"```\n",
"\n",
"To learn more about the integration, visit [MLflow DSPy Documentation](https://mlflow.org/docs/latest/llms/dspy/index.html) as well.\n",
"\n",
"</details>"
]
},
{
"cell_type": "markdown",
"metadata": {},
@@ -662,7 +749,7 @@
"source": [
"Awesome. It looks like the system improved drastically from 8% recall to around 40% recall. That was a pretty straightforward approach, but DSPy gives you many tools to continue iterating on this from here.\n",
"\n",
"Next, let's inspect the optimized prompts to understand what it has learned. We'll run one query and then inspect the last two prompts, which will show us the prompts used for both ReAct sub-modules, the one that does the agentic loop and the other than prepares the final results."
"Next, let's inspect the optimized prompts to understand what it has learned. We'll run one query and then inspect the last two prompts, which will show us the prompts used for both ReAct sub-modules, the one that does the agentic loop and the other than prepares the final results. (Alternatively, if you enabled MLflow Tracing following the instructions above, you can see all steps done by the agent including LLM calls, prompts, tool execution, in a rich tree-view.)"
]
},
{
@@ -1350,6 +1437,42 @@
"\n",
"loaded_react(claim=\"The author of the 1960s unproduced script written for The Beatles, Up Against It, and Bernard-Marie Koltès are both playwrights.\").titles"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<details>\n",
"<summary>Saving programs in MLflow Experiment</summary>\n",
"\n",
"<br/>\n",
"\n",
"Instead of saving the program to a local file, you can track it in MLflow for better reproducibility and collaboration.\n",
"\n",
"1. **Dependency Management**: MLflow automatically save the frozen environment metadata along with the program to ensure reproducibility.\n",
"2. **Experiment Tracking**: With MLflow, you can track the program's performance and cost along with the program itself.\n",
"3. **Collaboration**: You can share the program and results with your team members by sharing the MLflow experiment.\n",
"\n",
"To save the program in MLflow, run the following code:\n",
"\n",
"```python\n",
"import mlflow\n",
"\n",
"# Start an MLflow Run and save the program\n",
"with mlflow.start_run(run_name=\"optimized_rag\"):\n",
" model_info = mlflow.dspy.log_model(\n",
" optimized_react,\n",
" artifact_path=\"model\", # Any name to save the program in MLflow\n",
" )\n",
"\n",
"# Load the program back from MLflow\n",
"loaded = mlflow.dspy.load_model(model_info.model_uri)\n",
"```\n",
"\n",
"To learn more about the integration, visit [MLflow DSPy Documentation](https://mlflow.org/docs/latest/llms/dspy/index.html) as well.\n",
"\n",
"</details>"
]
}
],
"metadata": {
126 changes: 126 additions & 0 deletions docs/docs/tutorials/classification_finetuning/index.ipynb
@@ -24,6 +24,48 @@
"```"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<details>\n",
"<summary>Optional: Set up MLflow Tracing to understand what's happening under the hood.</summary>\n",
"\n",
"### MLflow DSPy Integration\n",
"\n",
"<a href=\"https://mlflow.org/\">MLflow</a> is an LLMOps tool that natively integrates with DSPy and offer explainability and experiment tracking. In this tutorial, you can use MLflow to visualize prompts and optimization progress as traces to understand the DSPy's behavior better. You can set up MLflow easily by following the four steps below.\n",
"\n",
"![MLflow Trace](./mlflow-tracing-classification.png)\n",
"\n",
"1. Install MLflow\n",
"\n",
"```bash\n",
"%pip install mlflow>=2.20\n",
"```\n",
"\n",
"2. Start MLflow UI in a separate terminal\n",
"```bash\n",
"mlflow ui --port 5000\n",
"```\n",
"\n",
"3. Connect the notebook to MLflow\n",
"```python\n",
"import mlflow\n",
"\n",
"mlflow.set_tracking_uri(\"http://localhost:5000\")\n",
"mlflow.set_experiment(\"DSPy\")\n",
"```\n",
"\n",
"4. Enabling tracing.\n",
"```python\n",
"mlflow.dspy.autolog()\n",
"```\n",
"\n",
"\n",
"To learn more about the integration, visit [MLflow DSPy Documentation](https://mlflow.org/docs/latest/llms/dspy/index.html) as well.\n",
"</details>"
]
},
{
"cell_type": "markdown",
"metadata": {},
@@ -472,6 +514,54 @@
"evaluate(classify_ft)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<details>\n",
"<summary>Tracking Evaluation Results in MLflow Experiment</summary>\n",
"\n",
"<br/>\n",
"\n",
"To track and visualize the evaluation results over time, you can record the results in MLflow Experiment.\n",
"\n",
"\n",
"```python\n",
"import mlflow\n",
"\n",
"with mlflow.start_run(run_name=\"classifier_evaluation\"):\n",
" evaluate_correctness = dspy.Evaluate(\n",
" devset=devset,\n",
" metric=extraction_correctness_metric,\n",
" num_threads=16,\n",
" display_progress=True,\n",
" # To record the outputs and detailed scores to MLflow\n",
" return_all_scores=True,\n",
" return_outputs=True,\n",
" )\n",
"\n",
" # Evaluate the program as usual\n",
" aggregated_score, outputs, all_scores = evaluate_correctness(people_extractor)\n",
"\n",
" # Log the aggregated score\n",
" mlflow.log_metric(\"exact_match\", aggregated_score)\n",
" # Log the detailed evaluation results as a table\n",
" mlflow.log_table(\n",
" {\n",
" \"Text\": [example.text for example in devset],\n",
" \"Expected\": [example.example_label for example in devset],\n",
" \"Predicted\": outputs,\n",
" \"Exact match\": all_scores,\n",
" },\n",
" artifact_file=\"eval_results.json\",\n",
" )\n",
"```\n",
"\n",
"To learn more about the integration, visit [MLflow DSPy Documentation](https://mlflow.org/docs/latest/llms/dspy/index.html) as well.\n",
"\n",
"</details>"
]
},
{
"cell_type": "markdown",
"metadata": {},
@@ -944,6 +1034,42 @@
"classify_ft(text=\"why hasnt my card come in yet?\")\n",
"dspy.inspect_history()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<details>\n",
"<summary>Saving fine-tuned programs in MLflow Experiment</summary>\n",
"\n",
"<br/>\n",
"\n",
"To deploy the fine-tuned program in production or share it with your team, you can save it in MLflow Experiment. Compared to simply saving it to a local file, MLflow offers the following benefits:\n",
"\n",
"1. **Dependency Management**: MLflow automatically save the frozen environment metadata along with the program to ensure reproducibility.\n",
"2. **Experiment Tracking**: With MLflow, you can track the program's performance and cost along with the program itself.\n",
"3. **Collaboration**: You can share the program and results with your team members by sharing the MLflow experiment.\n",
"\n",
"To save the program in MLflow, run the following code:\n",
"\n",
"```python\n",
"import mlflow\n",
"\n",
"# Start an MLflow Run and save the program\n",
"with mlflow.start_run(run_name=\"optimized_classifier\"):\n",
" model_info = mlflow.dspy.log_model(\n",
" classify_ft,\n",
" artifact_path=\"model\", # Any name to save the program in MLflow\n",
" )\n",
"\n",
"# Load the program back from MLflow\n",
"loaded = mlflow.dspy.load_model(model_info.model_uri)\n",
"```\n",
"\n",
"To learn more about the integration, visit [MLflow DSPy Documentation](https://mlflow.org/docs/latest/llms/dspy/index.html) as well.\n",
"\n",
"</details>"
]
}
],
"metadata": {