diff --git a/content/blogs/cllm/index.md b/content/blogs/cllm/index.md
index df34d05..127c713 100644
--- a/content/blogs/cllm/index.md
+++ b/content/blogs/cllm/index.md
@@ -160,6 +160,23 @@ Our experiments contain three domain-specific tasks, including Spider (text-to-S
 **Open-domain conversational Challenge (MT-bench):** CLLM trained from LLaMA2-7B using ShareGPT dataset can achieve roughly the same speedup as Medusa2 when combined with lookahead decoding, with comparable scores on MT-bench. However, CLLM offers higher adaptability and memory efficiency as it requires no modifications to the target model's original architecture and no auxiliary components.
 {{< /justify >}}
 
+**Training Cost:**
+{{< justify >}}
+The fine-tuning cost of CLLMs is moderate: for example, training on only around 1M tokens is enough for LLaMA-7B to achieve a $3.4\times$ speedup on the Spider dataset. When the dataset is large, e.g., CodeSearchNet-Python, only 10% of it is needed to generate the Jacobi trajectories for training CLLMs, yielding a roughly $2.5\times$ speedup. The total number of training tokens can be estimated as:
+
+$N = \text{avg \# of trajectories per prompt} \times \text{avg seq length} \times \text{\# of prompts}$.
+{{< /justify >}}
+
+{{< center >}}
+| Dataset | Estimated training cost (tokens) |
+|:---:|:---:|
+| Spider | $2\times 10^6$ |
+| CodeSearchNet-Python | $1 \times 10^8$ |
+| GSM8K | $1 \times 10^7$ |
+| ShareGPT | $2 \times 10^8$ |
+
+{{< /center >}}
+
 ### Fast Forwarding and Stationary Tokens
 
 {{< image src="img/trajectory_compare_aligned.png" alt="trajectory_compare" width="120%" title="Figure 7: Comparison of Jacobi trajectory between a target LLM and CLLMs on Spider. Each point along the Jacobi trajectory is a color-coded sequence: blue for correct tokens matching with AR results, and red for inaccurate ones. CLLM demonstrates enhanced efficiency, converging to the fixed point $2\times$ faster the Target LLM. This increased efficiency in the CLLM can be attributed to the consistency loss which facilitates the learning of the structure of each $n$-token sequence given a prefix.">}}
diff --git a/layouts/shortcodes/center.html b/layouts/shortcodes/center.html
new file mode 100644
index 0000000..ec9efdb
--- /dev/null
+++ b/layouts/shortcodes/center.html
@@ -0,0 +1,3 @@
+
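As a quick sanity check on the estimate $N = \text{avg \# of trajectories per prompt} \times \text{avg seq length} \times \text{\# of prompts}$ added in the hunk above, the sketch below computes the product directly. The per-dataset inputs are hypothetical, chosen only to illustrate how a Spider-scale figure of roughly $2\times 10^6$ tokens can arise; they are not numbers reported for CLLM training.

```python
def estimate_training_tokens(avg_trajectories_per_prompt: float,
                             avg_seq_length: float,
                             num_prompts: int) -> float:
    """N = avg # of trajectories per prompt * avg seq length * # of prompts."""
    return avg_trajectories_per_prompt * avg_seq_length * num_prompts


# Hypothetical, illustration-only inputs for a Spider-sized run:
# ~16 Jacobi trajectories per prompt, ~128 tokens per sequence, ~1,000 prompts.
n = estimate_training_tokens(avg_trajectories_per_prompt=16,
                             avg_seq_length=128,
                             num_prompts=1_000)
print(f"estimated training tokens: {n:.1e}")  # -> 2.0e+06, same order as the Spider row
```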