Commit

Update README.md (#58)
SF-Zhou authored Mar 3, 2025
1 parent fcb915b commit 85d3212
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion README.md
@@ -43,7 +43,7 @@ The test cluster comprised 25 storage nodes (2 NUMA domains/node, 1 storage serv
### 3. KVCache

KVCache is a technique used to optimize the LLM inference process. It avoids redundant computations by caching the key and value vectors of previous tokens in the decoder layers.
- The top figure demonstrates the read throughput of all KVCache clients, highlighting both peak and average values, with peak throughput reaching up to 40 GiB/s. The bottom figure presents the IOPS of removing ops from garbage collection (GC) during the same time period.
+ The top figure demonstrates the read throughput of all KVCache clients (1×400Gbps NIC/node), highlighting both peak and average values, with peak throughput reaching up to 40 GiB/s. The bottom figure presents the IOPS of removing ops from garbage collection (GC) during the same time period.
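
The added NIC detail makes it possible to relate the 40 GiB/s peak to the link speed. The arithmetic below is a rough sketch, assuming the peak figure is read per client node and ignoring protocol overhead; neither assumption is stated in the README, and the names are illustrative only.

```python
# Convert a NIC line rate in Gbps (decimal gigabits) to GiB/s (binary gibibytes).
GIB = 2**30  # bytes per GiB

def gbps_to_gib_per_s(gbps: float) -> float:
    bytes_per_second = gbps * 1e9 / 8  # 8 bits per byte
    return bytes_per_second / GIB

# Assumption: one 400 Gbps NIC per client node, as in the updated sentence.
nic_limit = gbps_to_gib_per_s(400)   # raw line rate, roughly 46.57 GiB/s
peak_read = 40.0                     # GiB/s, peak quoted in the README
utilization = peak_read / nic_limit  # fraction of the raw line rate

print(f"NIC line rate: {nic_limit:.2f} GiB/s")
print(f"Peak read is {utilization:.1%} of the line rate")
```

Under those assumptions the quoted peak sits at about 86% of the raw line rate of a single 400 Gbps NIC.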

![KVCache Read Throughput](./docs/images/kvcache_read_throughput.png)
![KVCache GC IOPS](./docs/images/kvcache_gc_iops.png)
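
The description of KVCache above (caching the key and value vectors of previous tokens) can be illustrated with a toy single-head attention loop. This is a minimal sketch and not code from this repository: `project_key` and `project_value` are hypothetical stand-ins for learned projections, and each token vector doubles as its own query for brevity.

```python
import math

def project_key(x):
    # Hypothetical stand-in for a learned key projection.
    return [2.0 * v for v in x]

def project_value(x):
    # Hypothetical stand-in for a learned value projection.
    return [v + 1.0 for v in x]

def attend(query, keys, values):
    # Scaled dot-product attention of one query over all given positions.
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    peak = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - peak) for s in scores]
    total = sum(weights)
    dim = len(values[0])
    return [sum(w * v[i] for w, v in zip(weights, values)) / total
            for i in range(dim)]

def decode_without_cache(tokens):
    # Recomputes keys and values for the whole prefix at every step.
    outputs, projections = [], 0
    for t in range(1, len(tokens) + 1):
        prefix = tokens[:t]
        keys = [project_key(x) for x in prefix]
        values = [project_value(x) for x in prefix]
        projections += len(prefix)
        outputs.append(attend(prefix[-1], keys, values))
    return outputs, projections

def decode_with_cache(tokens):
    # Projects each token once and reuses the stored keys and values.
    outputs, projections = [], 0
    keys, values = [], []
    for x in tokens:
        keys.append(project_key(x))
        values.append(project_value(x))
        projections += 1
        outputs.append(attend(x, keys, values))
    return outputs, projections

tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, -0.5]]
slow_out, slow_cost = decode_without_cache(tokens)
fast_out, fast_cost = decode_with_cache(tokens)
print(slow_out == fast_out, slow_cost, fast_cost)
```

Both loops produce the same outputs, but the cached version projects each token once (4 projections here) while the uncached version re-projects the whole prefix at every step (10 projections). The cached keys and values are the data that a KVCache storage backend would hold and serve on reads.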