I don’t think this is correct, or at least not as clear-cut as you make it sound: can you detail your reasoning? The main performance bottleneck for routing nodes is by far the disk IO (database operations) that happens when commitment_signed is sent. Nodes are thus incentivized to batch HTLCs to minimize this disk IO, especially when they route a lot of payments. And even a node that doesn’t route a lot of payments cannot know when the next HTLC will come: it is thus a sound strategy to wait for a small random duration before signing, in case another HTLC arrives in the meantime.
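To make this concrete, here is a minimal sketch of such a batcher. Everything in it is illustrative: the Htlc type, sign_and_persist, random_jitter_ms and the delay bound are invented for the example, and a real node would use a proper RNG and its actual channel state machine.

```rust
use std::sync::mpsc::{channel, Receiver};
use std::thread;
use std::time::{Duration, Instant, SystemTime, UNIX_EPOCH};

// Hypothetical stand-in for an incoming update_add_htlc.
struct Htlc {
    id: u64,
}

// Placeholder randomness, good enough for illustration only.
fn random_jitter_ms(max: u64) -> u64 {
    let nanos = SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .unwrap()
        .subsec_nanos() as u64;
    nanos % (max + 1)
}

// Stand-in for updating the commitment tx, sending commitment_signed and
// persisting the new state: one signature and one fsync for the whole batch.
fn sign_and_persist(batch: &[Htlc]) {
    println!("signed batch of {} HTLCs (first id {})", batch.len(), batch[0].id);
}

// Wait for the first HTLC, then keep the batch open for a small random
// window in case more HTLCs arrive, and sign them all at once.
fn batch_and_sign(rx: Receiver<Htlc>, max_wait_ms: u64) {
    while let Ok(first) = rx.recv() {
        let mut batch = vec![first];
        let deadline = Instant::now() + Duration::from_millis(random_jitter_ms(max_wait_ms));
        loop {
            let remaining = deadline.saturating_duration_since(Instant::now());
            match rx.recv_timeout(remaining) {
                Ok(htlc) => batch.push(htlc),
                Err(_) => break, // window expired (or the sender hung up)
            }
        }
        sign_and_persist(&batch);
    }
}

fn main() {
    let (tx, rx) = channel();
    let node = thread::spawn(move || batch_and_sign(rx, 50));
    for id in 0..20 {
        tx.send(Htlc { id }).unwrap();
        thread::sleep(Duration::from_millis(random_jitter_ms(20)));
    }
    drop(tx); // closing the channel lets the batching loop exit
    node.join().unwrap();
}
```

The key property is that one signature and one fsync are amortized over however many HTLCs arrived during the random window.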
Interestingly, since such batching reduces the frequency of disk IO, it also provides more stable latency: individual HTLCs no longer queue behind unpredictable bursts of per-HTLC database writes. The end result is a higher median latency (i.e. not chasing every millisecond) but a smaller standard deviation.
The right batching interval really depends on the expected frequency of payments relative to the performance of the DB. But if lightning is successful at being a payment network, I don’t believe it has to be a large value: the busier a node is, the more HTLCs each batch captures even with a short interval. I think we can use a value that provides a good enough payment UX while providing good node performance.
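As a toy back-of-envelope model of that trade-off (every number below is invented, and it assumes flushes on a fixed timer, so an HTLC waits half the interval on average):

```rust
// Toy model of the batching-interval trade-off; batching_stats and all the
// numbers below are invented for illustration, not from any implementation.
fn batching_stats(arrival_rate_per_s: f64, interval_ms: f64, fsync_ms: f64) {
    // Expected HTLCs accumulated per flush (at least one, or nothing is signed).
    let batch_size = (arrival_rate_per_s * interval_ms / 1000.0).max(1.0);
    // With flushes on a fixed timer, an HTLC waits half the interval on average.
    let avg_wait_ms = interval_ms / 2.0;
    // The fsync cost is paid once per batch instead of once per HTLC.
    let io_per_htlc_ms = fsync_ms / batch_size;
    println!(
        "batch≈{batch_size:.1} HTLCs, added latency≈{:.1}ms, amortized IO≈{:.2}ms/HTLC",
        avg_wait_ms + io_per_htlc_ms,
        io_per_htlc_ms
    );
}

fn main() {
    // Busy node: a 20ms interval barely hurts latency but slashes disk IO.
    batching_stats(500.0, 20.0, 5.0); // batch≈10, latency≈10.5ms, IO≈0.50ms/HTLC
    // Quiet node: batches rarely fill, so a long interval buys little.
    batching_stats(5.0, 20.0, 5.0); // batch≈1, latency≈15ms, IO≈5.00ms/HTLC
}
```

The point is simply that the busier the node, the shorter the interval can be while still amortizing the IO, which is why a modest value could be good enough network-wide.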
I agree with you. Based on my current understanding, my preferred choice would be the following (see the sketch after the list):
- receiver-side random delays
- sender-side random delay on retries
- small randomized batching interval at intermediate nodes (mostly for performance, but also to add a small amount of noise to relay latency)
- random message padding / cover traffic (which I think doesn’t have to be CBR, i.e. constant bitrate, to be effective)
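For completeness, here is a rough sketch of how these four knobs could be sampled. All the names, ranges and the uniform distribution are assumptions I’m making for illustration; none of this is specified, and a real node would use a CSPRNG and probably tuned distributions.

```rust
use std::time::{Duration, SystemTime, UNIX_EPOCH};

// Placeholder uniform sampler, for illustration only.
fn sample(lo: u64, hi: u64) -> u64 {
    let nanos = SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .unwrap()
        .subsec_nanos() as u64;
    lo + nanos % (hi - lo + 1)
}

/// Receiver-side: random hold time before sending update_fulfill_htlc, so the
/// final recipient looks like just another forwarding hop.
fn receiver_fulfill_delay() -> Duration {
    Duration::from_millis(sample(50, 200))
}

/// Sender-side: random back-off before retrying a failed payment, so the retry
/// cannot be trivially correlated with the failed attempt.
fn sender_retry_delay() -> Duration {
    Duration::from_millis(sample(100, 500))
}

/// Intermediate nodes: randomized batching interval (see the first sketch above).
fn next_batch_interval() -> Duration {
    Duration::from_millis(sample(5, 50))
}

/// Padding: a random number of filler bytes per message, which already blurs
/// message sizes without requiring constant-bitrate cover traffic.
fn padding_len_bytes() -> usize {
    sample(0, 255) as usize
}

fn main() {
    println!("receiver delay: {:?}", receiver_fulfill_delay());
    println!("retry delay: {:?}", sender_retry_delay());
    println!("batch interval: {:?}", next_batch_interval());
    println!("padding: {} bytes", padding_len_bytes());
}
```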
As you say, this doesn’t rule out the current 1ms encoding for attributable failures. But I’d be curious to have @MattCorallo and @tnull’s thoughts here: they mentioned during the spec meeting that intermediate forwarding delays are important for privacy even when we have random message padding and cover traffic, and I don’t understand why. So I may be missing something important.