Add a channel congestion control mechanism #2330
Closed
Channels have a limited number of HTLCs that can be in-flight at a given time, because the commitment transaction cannot have an unbounded number of outputs. Malicious actors can exploit this by filling our channels with HTLCs and waiting as long as possible before failing them (also known as a channel jamming attack).
To increase the cost of this attack, we don't let our channels be filled with low-value HTLCs. When we already have many low-value HTLCs in-flight, we only accept higher value HTLCs. Attackers will have to lock non-negligible amounts to carry out the attack.
I chose to use hard-coded buckets for now. They encode four constraints, each expressed relative to `max-htlc-value-in-flight`.
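To make the bucketing idea concrete, here is a minimal Python sketch of an acceptance check. The bucket values (HTLCs below 1%, 5% and 10% of `max-htlc-value-in-flight` limited to 50%, 80% and 90% of the HTLC slots) are an assumption inferred from the guarantees listed in this description, not copied from the PR's code. The names `BUCKETS` and `accept_htlc` are hypothetical, and the sketch ignores the separate check on the total in-flight value.

```python
# Hypothetical sketch: bucket values are inferred from the figures in this
# description, and the names are illustrative, not eclair's.
# Each bucket is (value_percent, slot_percent): HTLCs below value_percent of
# max-htlc-value-in-flight may occupy at most slot_percent of the HTLC slots.
BUCKETS = [(1, 50), (5, 80), (10, 90)]


def accept_htlc(amount_sat, in_flight_sat, max_in_flight_sat, max_accepted_htlcs):
    """Return True if a new HTLC of amount_sat fits the bucket constraints."""
    if len(in_flight_sat) >= max_accepted_htlcs:
        return False  # no slot left at all
    for value_percent, slot_percent in BUCKETS:
        threshold_sat = max_in_flight_sat * value_percent // 100
        if amount_sat < threshold_sat:
            slots = max_accepted_htlcs * slot_percent // 100
            # Count in-flight HTLCs that fall in this bucket (assumption).
            used = sum(1 for a in in_flight_sat if a < threshold_sat)
            if used >= slots:
                return False
    return True


# 15 dust-sized HTLCs fill the first bucket of a 30-slot channel.
in_flight = [330] * 15
print(accept_htlc(4_999, in_flight, 500_000, 30))  # False: below 1% of 500_000 sat
print(accept_htlc(5_000, in_flight, 500_000, 30))  # True: falls in the next bucket
```

Counting only the in-flight HTLCs below each threshold (rather than all in-flight HTLCs) is a design choice of this sketch: it keeps large HTLCs from consuming the slots reserved for small ones.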
I don't know if we should make it configurable. It's quite hard to reason about the effectiveness of a specific configuration: it really needs to be plotted against various channel configurations, as the `reject htlcs when buckets are full` unit test does.

When channels are very big, we probably never expect HTLCs bigger than 1% to be relayed: instead of using a percentage of `max-in-flight`, should we use something non-linear here, such as a percentage of `f(max-in-flight)` with a carefully chosen non-linear function `f`?

Assuming `anchor_outputs_zero_fee_htlc_txs` and a dust limit of `330 sat`, the current configuration offers the following guarantees:

- `max-htlc-value-in-flight = 500_000 sat` and `max-accepted-htlcs = 30`:
  - `4_950 sat` to block HTLCs below `5_000 sat`
  - `49_950 sat` to block HTLCs below `25_000 sat`
  - `124_950 sat` to block HTLCs below `50_000 sat`
  - `274_950 sat` to jam the channel entirely
- `max-htlc-value-in-flight = 1_000_000 sat` and `max-accepted-htlcs = 30`:
  - `4_950 sat` to block HTLCs below `10_000 sat`
  - `94_950 sat` to block HTLCs below `50_000 sat`
  - `244_950 sat` to block HTLCs below `100_000 sat`
  - `544_950 sat` to jam the channel entirely
- `max-htlc-value-in-flight = 5_000_000 sat` and `max-accepted-htlcs = 30`:
  - `4_950 sat` to block HTLCs below `50_000 sat`
  - `454_950 sat` to block HTLCs below `250_000 sat`
  - `1_204_950 sat` to block HTLCs below `500_000 sat`
  - `2_704_950 sat` to jam the channel entirely
- `max-htlc-value-in-flight = 500_000 sat` and `max-accepted-htlcs = 50`:
  - `8_250 sat` to block HTLCs below `5_000 sat`
  - `83_250 sat` to block HTLCs below `25_000 sat`
  - `208_250 sat` to block HTLCs below `50_000 sat`
  - `458_250 sat` to jam the channel entirely
- `max-htlc-value-in-flight = 1_000_000 sat` and `max-accepted-htlcs = 50`:
  - `8_250 sat` to block HTLCs below `10_000 sat`
  - `158_250 sat` to block HTLCs below `50_000 sat`
  - `408_250 sat` to block HTLCs below `100_000 sat`
  - `908_250 sat` to jam the channel entirely
- `max-htlc-value-in-flight = 5_000_000 sat` and `max-accepted-htlcs = 50`:
  - `8_250 sat` to block HTLCs below `50_000 sat`
  - `758_250 sat` to block HTLCs below `250_000 sat`
  - `2_008_250 sat` to block HTLCs below `500_000 sat`
  - `4_508_250 sat` to jam the channel entirely

I'm not sure how we could plot this to make it easier to analyze, there are quite a lot of moving parameters to take into account...
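The attacker costs listed above can be reproduced with a few lines of arithmetic, which may also help when exploring other channel configurations. The sketch below is in Python and rests on assumptions: the bucket values (1%, 5% and 10% of `max-htlc-value-in-flight`, capped at 50%, 80% and 90% of the slots) are inferred from those figures rather than taken from the PR's code, and `jamming_costs` is a hypothetical helper name. It assumes the attacker always uses the cheapest HTLC that is still accepted, starting at the dust limit.

```python
# Hypothetical sketch: bucket values inferred from the figures in this description.
# (value_percent, slot_percent): HTLCs below value_percent of
# max-htlc-value-in-flight may use at most slot_percent of the HTLC slots.
BUCKETS = [(1, 50), (5, 80), (10, 90)]


def jamming_costs(max_in_flight_sat, max_accepted_htlcs, dust_limit_sat=330):
    """Cumulative sats an attacker must lock to fill each bucket, then every slot."""
    costs = []
    total_sat = 0
    used_slots = 0
    cheapest_sat = dust_limit_sat  # smallest HTLC that still takes a slot
    for value_percent, slot_percent in BUCKETS + [(None, 100)]:
        slots = max_accepted_htlcs * slot_percent // 100
        total_sat += (slots - used_slots) * cheapest_sat
        used_slots = slots
        costs.append(total_sat)
        if value_percent is not None:
            # Once this bucket is full, only HTLCs at or above its threshold pass.
            threshold_sat = max_in_flight_sat * value_percent // 100
            cheapest_sat = max(cheapest_sat, threshold_sat)
    return costs


for max_in_flight in (500_000, 1_000_000, 5_000_000):
    for max_accepted in (30, 50):
        print(max_in_flight, max_accepted, jamming_costs(max_in_flight, max_accepted))
```

Each returned list has four entries: the cost to fill the first, second and third bucket, and the cost to jam the channel entirely. For example, `jamming_costs(500_000, 30)` yields `[4950, 49950, 124950, 274950]`, matching the first configuration above.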
An important caveat of this bucketing strategy is how it interacts with balance estimation. If our first bucket is full, a sender may conclude that our balance is low and avoid sending large HTLCs through the channel, even though such a large HTLC could be forwarded. This means that the balance estimation feature would actually defeat the congestion control mechanism...
We may need to return a dedicated onion error when applying such congestion control, so that senders don't misinterpret it as a liquidity failure. But then every relaying node has an incentive to use that new error to avoid revealing that it has a liquidity failure ¯_(ツ)_/¯
NB: this PR builds on top of #2299