fix: Use 1MB socket TX and RX buffers #2470

Open

larseggert wants to merge 3 commits into main

Conversation

larseggert
Collaborator

We need to do the same for neqo-glue.

Fixes #1962

codecov bot commented Mar 3, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 95.41%. Comparing base (483916d) to head (fea46ec).

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2470      +/-   ##
==========================================
- Coverage   95.42%   95.41%   -0.02%     
==========================================
  Files         115      115              
  Lines       36996    36996              
  Branches    36996    36996              
==========================================
- Hits        35305    35301       -4     
- Misses       1687     1689       +2     
- Partials        4        6       +2     
Components Coverage Δ
neqo-common 97.17% <ø> (-0.36%) ⬇️
neqo-crypto 90.44% <ø> (ø)
neqo-http3 94.50% <ø> (ø)
neqo-qpack 96.29% <ø> (ø)
neqo-transport 96.24% <ø> (ø)
neqo-udp 95.29% <ø> (+0.58%) ⬆️

github-actions bot commented Mar 3, 2025

Failed Interop Tests

QUIC Interop Runner, client vs. server, differences relative to 79068bc.

neqo-latest as client

neqo-latest as server

All results

Succeeded Interop Tests

QUIC Interop Runner, client vs. server

neqo-latest as client

neqo-latest as server

Unsupported Interop Tests

QUIC Interop Runner, client vs. server

neqo-latest as client

neqo-latest as server

github-actions bot commented Mar 3, 2025

Benchmark results

Performance differences relative to 79068bc.

decode 4096 bytes, mask ff: No change in performance detected.
       time:   [11.834 µs 11.866 µs 11.905 µs]
       change: [-0.2592% +0.2090% +0.6971%] (p = 0.43 > 0.05)

Found 10 outliers among 100 measurements (10.00%)
1 (1.00%) low mild
9 (9.00%) high severe

decode 1048576 bytes, mask ff: No change in performance detected.
       time:   [3.0696 ms 3.0792 ms 3.0904 ms]
       change: [-0.3548% +0.1200% +0.5542%] (p = 0.63 > 0.05)

Found 12 outliers among 100 measurements (12.00%)
2 (2.00%) low mild
10 (10.00%) high severe

decode 4096 bytes, mask 7f: No change in performance detected.
       time:   [19.765 µs 19.813 µs 19.864 µs]
       change: [-0.7559% -0.2047% +0.3059%] (p = 0.47 > 0.05)

Found 17 outliers among 100 measurements (17.00%)
2 (2.00%) low severe
3 (3.00%) low mild
12 (12.00%) high severe

decode 1048576 bytes, mask 7f: No change in performance detected.
       time:   [5.1493 ms 5.1606 ms 5.1733 ms]
       change: [-0.2918% +0.0325% +0.3619%] (p = 0.85 > 0.05)

Found 13 outliers among 100 measurements (13.00%)
13 (13.00%) high severe

decode 4096 bytes, mask 3f: No change in performance detected.
       time:   [6.8741 µs 6.9005 µs 6.9336 µs]
       change: [-0.3551% +0.2139% +0.9690%] (p = 0.56 > 0.05)

Found 13 outliers among 100 measurements (13.00%)
3 (3.00%) high mild
10 (10.00%) high severe

decode 1048576 bytes, mask 3f: No change in performance detected.
       time:   [1.7539 ms 1.7581 ms 1.7637 ms]
       change: [-0.6174% -0.1459% +0.3291%] (p = 0.56 > 0.05)

Found 4 outliers among 100 measurements (4.00%)
4 (4.00%) high severe

1 streams of 1 bytes/multistream: No change in performance detected.
       time:   [69.025 µs 70.038 µs 71.512 µs]
       change: [-2.6302% +0.0527% +2.7653%] (p = 0.95 > 0.05)

Found 4 outliers among 100 measurements (4.00%)
2 (2.00%) high mild
2 (2.00%) high severe

1000 streams of 1 bytes/multistream: No change in performance detected.
       time:   [24.510 ms 24.544 ms 24.579 ms]
       change: [-0.1906% +0.0147% +0.2159%] (p = 0.89 > 0.05)

Found 1 outliers among 100 measurements (1.00%)
1 (1.00%) high mild

10000 streams of 1 bytes/multistream: Change within noise threshold.
       time:   [1.6453 s 1.6468 s 1.6484 s]
       change: [-0.3740% -0.2276% -0.0886%] (p = 0.00 < 0.05)

Found 19 outliers among 100 measurements (19.00%)
2 (2.00%) low severe
7 (7.00%) low mild
3 (3.00%) high mild
7 (7.00%) high severe

1 streams of 1000 bytes/multistream: No change in performance detected.
       time:   [70.727 µs 71.715 µs 73.151 µs]
       change: [-1.5245% +0.5445% +2.6527%] (p = 0.70 > 0.05)

Found 3 outliers among 100 measurements (3.00%)
1 (1.00%) high mild
2 (2.00%) high severe

100 streams of 1000 bytes/multistream: No change in performance detected.
       time:   [3.2461 ms 3.2522 ms 3.2588 ms]
       change: [-0.4924% -0.1933% +0.0739%] (p = 0.19 > 0.05)

Found 19 outliers among 100 measurements (19.00%)
19 (19.00%) high severe

1000 streams of 1000 bytes/multistream: No change in performance detected.
       time:   [138.13 ms 138.19 ms 138.26 ms]
       change: [-0.1430% -0.0699% +0.0004%] (p = 0.06 > 0.05)
coalesce_acked_from_zero 1+1 entries: No change in performance detected.
       time:   [92.533 ns 92.831 ns 93.128 ns]
       change: [-1.3809% -0.0963% +0.9150%] (p = 0.88 > 0.05)

Found 10 outliers among 100 measurements (10.00%)
7 (7.00%) high mild
3 (3.00%) high severe

coalesce_acked_from_zero 3+1 entries: No change in performance detected.
       time:   [110.33 ns 110.65 ns 110.99 ns]
       change: [-0.7340% -0.2089% +0.2866%] (p = 0.43 > 0.05)

Found 14 outliers among 100 measurements (14.00%)
4 (4.00%) low mild
2 (2.00%) high mild
8 (8.00%) high severe

coalesce_acked_from_zero 10+1 entries: No change in performance detected.
       time:   [109.90 ns 110.23 ns 110.66 ns]
       change: [-1.9362% -0.6801% +0.3640%] (p = 0.28 > 0.05)

Found 13 outliers among 100 measurements (13.00%)
7 (7.00%) low mild
1 (1.00%) high mild
5 (5.00%) high severe

coalesce_acked_from_zero 1000+1 entries: No change in performance detected.
       time:   [91.275 ns 91.310 ns 91.351 ns]
       change: [-1.0702% -0.2103% +0.6373%] (p = 0.65 > 0.05)

Found 14 outliers among 100 measurements (14.00%)
5 (5.00%) high mild
9 (9.00%) high severe

RxStreamOrderer::inbound_frame(): Change within noise threshold.
       time:   [115.65 ms 115.70 ms 115.75 ms]
       change: [-0.1856% -0.1214% -0.0627%] (p = 0.00 < 0.05)

Found 16 outliers among 100 measurements (16.00%)
4 (4.00%) low severe
2 (2.00%) low mild
9 (9.00%) high mild
1 (1.00%) high severe

SentPackets::take_ranges: No change in performance detected.
       time:   [5.3703 µs 5.5566 µs 5.7561 µs]
       change: [-0.2782% +2.5626% +5.4953%] (p = 0.09 > 0.05)

Found 9 outliers among 100 measurements (9.00%)
9 (9.00%) high mild

transfer/pacing-false/varying-seeds: Change within noise threshold.
       time:   [34.257 ms 34.325 ms 34.394 ms]
       change: [+0.4256% +0.6817% +0.9488%] (p = 0.00 < 0.05)
transfer/pacing-true/varying-seeds: Change within noise threshold.
       time:   [34.105 ms 34.162 ms 34.219 ms]
       change: [-0.7713% -0.5383% -0.2868%] (p = 0.00 < 0.05)
transfer/pacing-false/same-seed: Change within noise threshold.
       time:   [33.887 ms 33.932 ms 33.978 ms]
       change: [+0.0531% +0.2456% +0.4449%] (p = 0.02 < 0.05)

Found 2 outliers among 100 measurements (2.00%)
1 (1.00%) low mild
1 (1.00%) high mild

transfer/pacing-true/same-seed: Change within noise threshold.
       time:   [34.599 ms 34.657 ms 34.716 ms]
       change: [-0.4572% -0.2370% -0.0183%] (p = 0.04 < 0.05)

Found 5 outliers among 100 measurements (5.00%)
2 (2.00%) low mild
3 (3.00%) high mild

1-conn/1-100mb-resp/mtu-1504 (aka. Download)/client: 💚 Performance has improved.
       time:   [2.0295 s 2.0369 s 2.0443 s]
       thrpt:  [48.917 MiB/s 49.094 MiB/s 49.272 MiB/s]
change:
       time:   [-9.1521% -8.6803% -8.2298%] (p = 0.00 < 0.05)
       thrpt:  [+8.9679% +9.5054% +10.074%]
1-conn/10_000-parallel-1b-resp/mtu-1504 (aka. RPS)/client: Change within noise threshold.
       time:   [385.95 ms 387.77 ms 389.63 ms]
       thrpt:  [25.665 Kelem/s 25.789 Kelem/s 25.910 Kelem/s]
change:
       time:   [-1.8457% -1.1416% -0.4341%] (p = 0.00 < 0.05)
       thrpt:  [+0.4360% +1.1548% +1.8804%]

Found 1 outliers among 100 measurements (1.00%)
1 (1.00%) high mild

1-conn/1-1b-resp/mtu-1504 (aka. HPS)/client: 💚 Performance has improved.
       time:   [27.379 ms 28.055 ms 28.744 ms]
       thrpt:  [34.790  elem/s 35.645  elem/s 36.524  elem/s]
change:
       time:   [-11.525% -8.4360% -5.1664%] (p = 0.00 < 0.05)
       thrpt:  [+5.4479% +9.2132% +13.026%]
1-conn/1-100mb-resp/mtu-1504 (aka. Upload)/client: 💚 Performance has improved.
       time:   [2.8967 s 2.9240 s 2.9535 s]
       thrpt:  [33.858 MiB/s 34.200 MiB/s 34.522 MiB/s]
change:
       time:   [-17.495% -16.475% -15.389%] (p = 0.00 < 0.05)
       thrpt:  [+18.188% +19.724% +21.205%]

Found 7 outliers among 100 measurements (7.00%)
6 (6.00%) high mild
1 (1.00%) high severe

Client/server transfer results

Performance differences relative to 79068bc.

Transfer of 33554432 bytes over loopback, 30 runs. All unit-less numbers are in milliseconds.

Client Server CC Pacing Mean ± σ Min Max Δ main (ms) Δ main (%)
neqo neqo reno on 589.8 ± 40.2 545.8 743.6 💔 98.0 4.5%
neqo neqo reno 627.1 ± 132.4 549.5 1212.1 💔 104.6 4.5%
neqo neqo cubic on 593.8 ± 32.7 560.3 694.8 💔 95.5 4.4%
neqo neqo cubic 584.3 ± 31.3 556.9 735.9 💔 91.4 4.2%
google neqo reno on 910.3 ± 104.2 668.9 1095.3 9.0 0.2%
google neqo reno 904.3 ± 100.5 666.1 1108.9 2.4 0.1%
google neqo cubic on 896.4 ± 96.5 657.3 1046.0 2.1 0.1%
google neqo cubic 895.1 ± 96.8 648.8 1039.2 -0.8 -0.0%
google google 546.2 ± 30.5 524.2 692.0 -3.3 -0.2%
neqo msquic reno on 214.7 ± 14.8 196.8 262.5 -20.0 -2.2%
neqo msquic reno 231.2 ± 68.8 195.4 568.0 16.6 1.9%
neqo msquic cubic on 225.5 ± 40.5 197.5 361.8 4.4 0.5%
neqo msquic cubic 222.1 ± 33.6 198.3 376.6 -6.7 -0.7%
msquic msquic 120.9 ± 29.7 102.1 235.4 4.6 1.0%


Collaborator

@mxinden mxinden left a comment

Instead of implementing this in neqo-bin, how about proposing it upstream quinn-udp:

https://github.com/quinn-rs/quinn/blob/8d6e48c20b71f7e915c13c33b66e2a09b6d59888/quinn-udp/src/unix.rs#L85-L86

I assume close to all QUIC implementations will want a large send and receive buffer.

@larseggert
Collaborator Author

Instead of implementing this in neqo-bin, how about proposing it upstream quinn-udp:

Sounds good. Or maybe quinn-udp can offer functions to set those values.

I don't understand why the benches show a performance improvement, but the runs over loopback show a regression.

@larseggert
Collaborator Author

@mxinden where would you suggest adding this in quinn-udp? As part of UdpSocketState, or somewhere else?

@larseggert
Collaborator Author

@mxinden alternatively, I'd suggest merging this into neqo now and switching over to a quinn-udp mechanism in a future PR.

@mxinden
Collaborator

mxinden commented Mar 19, 2025

@larseggert how about something along the lines of:

diff --git a/quinn-udp/src/unix.rs b/quinn-udp/src/unix.rs
index c39941d5..0b5bbd21 100644
--- a/quinn-udp/src/unix.rs
+++ b/quinn-udp/src/unix.rs
@@ -257,6 +257,14 @@ impl UdpSocketState {
         self.may_fragment
     }
 
+    pub fn set_send_buffer_size(&self, bytes: usize) -> io::Result<()> {
+        todo!();
+    }
+
+    pub fn set_recv_buffer_size(&self, bytes: usize) -> io::Result<()> {
+        todo!();
+    }
+
     /// Returns true if we previously got an EINVAL error from `sendmsg` syscall.
     fn sendmsg_einval(&self) -> bool {
         self.sendmsg_einval.load(Ordering::Relaxed)
@@ -543,7 +551,7 @@ fn recv(io: SockRef<'_>, bufs: &mut [IoSliceMut<'_>], meta: &mut [RecvMeta]) ->
     Ok(1)
 }
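
For illustration, a minimal sketch of what the two `todo!()` bodies above could do, using socket2 (which quinn-udp already builds on). The free-function shape and the `SockRef` parameter are assumptions made for this sketch, not the signatures that ultimately landed upstream:

```rust
// Sketch only: set SO_SNDBUF / SO_RCVBUF through socket2. The helper names
// and the SockRef parameter are assumptions; in quinn-udp these would hang
// off UdpSocketState as in the diff above.
use std::io;

use socket2::SockRef;

pub fn set_send_buffer_size(socket: SockRef<'_>, bytes: usize) -> io::Result<()> {
    // The kernel may clamp the requested value (and Linux doubles it for
    // bookkeeping), so reading it back can return something different.
    socket.set_send_buffer_size(bytes)
}

pub fn set_recv_buffer_size(socket: SockRef<'_>, bytes: usize) -> io::Result<()> {
    socket.set_recv_buffer_size(bytes)
}
```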

@larseggert
Collaborator Author

@larseggert how about something along the lines of

Do we also need the equivalent for Windows?

@mxinden
Collaborator

mxinden commented Mar 19, 2025

Yes. I would say the more platforms the better.

@larseggert
Collaborator Author

@mxinden let me know if quinn-rs/quinn#2179 is what you had in mind.

larseggert marked this pull request as draft March 24, 2025 15:36
@larseggert
Collaborator Author

larseggert commented Mar 25, 2025

On my Mac, I now see:

0.003 DEBUG Increasing send buffer size from 9216 to 1048576
0.003 DEBUG Increasing receive buffer size from 786896 to 1048576

Note that the send buffer size is apparently based on the net.inet.udp.maxdgram sysctl? Weird. Need to check this on other platforms.

Linux (from a QNS run):

0.002 DEBUG Increasing send buffer size from 212992 to 1048576
0.002 DEBUG Default receive buffer size is 1048576, not changing

Windows (from CI):

0.000 WARN Increasing send buffer size from 131072 to 1048576
0.000 WARN Increasing receive buffer size from 131072 to 1048576
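
For context, the grow-if-smaller logic that emits log lines like the ones above could look roughly like this sketch, using socket2 directly; the helper name, the logging macro, and the 1 MiB constant are assumptions rather than the exact code in this PR, and the send-buffer side is analogous:

```rust
// Sketch: raise SO_RCVBUF to 1 MiB only if the OS default is smaller.
use std::{io, net::UdpSocket};

use socket2::SockRef;

const ONE_MB: usize = 1 << 20;

fn maybe_grow_recv_buffer(socket: &UdpSocket) -> io::Result<()> {
    let sock = SockRef::from(socket);
    let current = sock.recv_buffer_size()?;
    if current < ONE_MB {
        log::debug!("Increasing receive buffer size from {current} to {ONE_MB}");
        sock.set_recv_buffer_size(ONE_MB)?;
    } else {
        log::debug!("Default receive buffer size is {current}, not changing");
    }
    Ok(())
}
```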

https://lists.freebsd.org/pipermail/freebsd-questions/2005-February/075624.html says:

UDP is not buffered in the kernel like TCP is, so a sendspace sysctl
is not possible. When the interface queue becomes full (i.e. you are
sending data to the interface at a greater rate than it can put it on
the wire), you will see this error message, and your application needs
to be able to handle it.

This seems to indicate that we should not do this increase on macOS and the BSDs?
@mxinden do we see ENOBUFS, and are we handling it similarly to EAGAIN?

larseggert marked this pull request as ready for review March 25, 2025 12:32
@larseggert
Collaborator Author

I also don't understand why loopback benchmark performance is going down.

@mxinden
Collaborator

mxinden commented Mar 25, 2025

I assume more buffer space is not always better. This reminds me of the Source Buffer Management draft.

https://stuartcheshire.github.io/draft-cheshire-sbm/draft-cheshire-sbm.html

@mxinden
Collaborator

mxinden commented Mar 25, 2025

@mxinden do we see ENOBUFS and are we handling that similar to EAGAIN?

We currently don't. Something along the following lines should work. Want to try running a benchmark on your local Mac to see whether either of them triggers?

diff --git a/Cargo.lock b/Cargo.lock
index 64bfcc1f..d6b8b6a7 100644
--- a/Cargo.lock
+++ b/Cargo.lock
@@ -1028,6 +1028,7 @@ name = "neqo-udp"
 version = "0.12.2"
 dependencies = [
  "cfg_aliases",
+ "libc",
  "log",
  "neqo-common",
  "quinn-udp",
diff --git a/neqo-udp/Cargo.toml b/neqo-udp/Cargo.toml
index 4abd402a..8ebc2e83 100644
--- a/neqo-udp/Cargo.toml
+++ b/neqo-udp/Cargo.toml
@@ -19,6 +19,7 @@ workspace = true
 log = { workspace = true }
 neqo-common = { path = "./../neqo-common" }
 quinn-udp = { workspace = true }
+libc = "0.2"
 
 [build-dependencies]
 cfg_aliases = "0.2"
diff --git a/neqo-udp/src/lib.rs b/neqo-udp/src/lib.rs
index d498f5aa..0cfe0e74 100644
--- a/neqo-udp/src/lib.rs
+++ b/neqo-udp/src/lib.rs
@@ -74,7 +74,19 @@ pub fn send_inner(
         src_ip: None,
     };
 
-    state.try_send(socket, &transmit)?;
+    if let Err(e) = state.try_send(socket, &transmit) {
+        match e.raw_os_error() {
+            Some(libc::ENOBUFS) => {
+                todo!();
+            }
+            Some(libc::EAGAIN) => {
+                todo!();
+            }
+            Some(_) | None => {}
+        }
+
+        return Err(e);
+    }
 
     qtrace!(
         "sent {} bytes from {} to {}",

Related: Neither of them is currently exposed through std::io::ErrorKind. See rust-lang/rust#79965 for some discussion on adding these to std.
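
For illustration, one hedged way to fill in the two `todo!()` arms from the diff above: treat both errors as a transiently full socket or interface queue, drop the datagram, and rely on QUIC loss recovery. The policy and the function name are assumptions, not what was eventually merged:

```rust
// Sketch of a send-error classifier for the two arms above; the policy and
// the name are assumptions, not the code neqo merged.
use std::io;

fn handle_send_error(e: io::Error) -> io::Result<()> {
    match e.raw_os_error() {
        // Socket buffer or interface queue full (ENOBUFS on macOS/BSD, EAGAIN
        // for a full non-blocking send buffer): treat the datagram as dropped
        // and let QUIC loss recovery retransmit it if it mattered.
        Some(libc::ENOBUFS | libc::EAGAIN) => Ok(()),
        // Anything else is a real error for the caller.
        _ => Err(e),
    }
}
```

With something like this in place, the call in `send_inner` could become `state.try_send(socket, &transmit).or_else(handle_send_error)?;`.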

@larseggert
Collaborator Author

I assume more buffer space is not always better. This reminds me of the Source Buffer Management draft.

stuartcheshire.github.io/draft-cheshire-sbm/draft-cheshire-sbm.html

It's different, in that I doubt we actually fill the larger buffer, whereas they do. (If we filled it, we should IMO also see a throughput increase.)


Successfully merging this pull request may close these issues.

Consider increasing OS socket buffer (SO_SNDBUF, SO_RCVBUF)
2 participants