The following disclosure is copied verbatim from a blog post on morehouse.github.io, reproduced here to facilitate discussion.
LND 0.17.5 and below contain a bug in the on-chain resolution logic that can be exploited to steal funds. For the attack to be practical the attacker must be able to force a restart of the victim node, perhaps via an unpatched DoS vector. Update to at least LND 0.18.0 to protect your node.
Background
Whenever a new payment is routed through a lightning channel, or whenever an existing payment is settled on the channel, the parties in that channel need to update their commitment transactions to match the new set of active HTLCs. During the course of these regular commitment updates, there is always a brief moment where one of the parties holds two valid commitment transactions. Normally that party immediately revokes the older commitment transaction after it receives a signature for the new one, bringing their number of valid commitment transactions back down to one. But for that brief moment, the other party in the channel must be able to handle the case where either of the valid commitments confirms on chain.
As part of this handling, nodes need to detect when any currently outstanding HTLCs are missing from the confirmed commitment transaction so that those HTLCs can be failed backward on the upstream channel.
The Excessive Failback Bug
Prior to v0.18.0, LND’s logic to detect and fail back missing HTLCs works like this:
func failBackMissingHtlcs(confirmedCommit Commitment) {
	currentCommit, pendingCommit := getValidCounterpartyCommitments()

	var danglingHtlcs HtlcSet
	if confirmedCommit == pendingCommit {
		danglingHtlcs = currentCommit.Htlcs()
	} else {
		danglingHtlcs = pendingCommit.Htlcs()
	}

	confirmedHtlcs := confirmedCommit.Htlcs()
	missingHtlcs := danglingHtlcs.SetDifference(confirmedHtlcs)
	for _, htlc := range missingHtlcs {
		failBackHtlc(htlc)
	}
}
LND compares the HTLCs present on the confirmed commitment transaction against the HTLCs present on the counterparty’s other valid commitment (if there is one) and fails back any HTLCs that are missing from the confirmed commitment. This logic is mostly correct, but it does the wrong thing in one particular scenario:
1. LND forwards an HTLC H to the counterparty, signing commitment C0 with H added as an output. The previous commitment is revoked.
2. The counterparty claims H by revealing the preimage to LND.
3. LND forwards the preimage upstream to start the process of claiming the incoming HTLC.
4. LND signs a new counterparty commitment C1 with H removed and its value added to the counterparty’s balance.
5. The counterparty refuses to revoke C0.
6. The counterparty broadcasts and confirms C1.
In this case, LND compares the confirmed commitment C1 against the other valid commitment C0 and determines that H is missing from the confirmed commitment. As a result, LND incorrectly determines that H needs to be failed back upstream, and executes the following logic:
func failBackHtlc(htlc Htlc) {
	markFailedInDatabase(htlc)

	incomingHtlc, ok := incomingHtlcMap[htlc]
	if !ok {
		log("Incoming HTLC has already been resolved")
		return
	}

	failHtlc(incomingHtlc)
	delete(incomingHtlcMap, htlc)
}
In this case, the preimage for the incoming HTLC was already sent upstream (step 3), so the corresponding entry in incomingHtlcMap has already been removed. Thus LND catches the “double resolution” and returns from failBackHtlc without sending the incorrect failure message upstream.
Unfortunately, LND only catches the double resolution after H is marked as failed in the database. As a result, when LND next restarts it will reconstruct its state from the database and determine that H still needs to be failed back. If the incoming HTLC hasn’t been fully resolved with the upstream node, the reconstructed incomingHtlcMap will have an entry for H this time, and LND will incorrectly send a failure message upstream.
At that point, the downstream node will have claimed H via preimage while the upstream node will have had the HTLC refunded to them, causing LND to lose the full value of H.
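The ordering problem described above can be illustrated with a small, self-contained Go model. This is only a sketch under stated assumptions, not LND's actual code: the `node` type, its fields, and the `acceptIncoming`, `forwardPreimage`, and `restart` methods are hypothetical stand-ins for LND's database and in-memory state. Only the order of operations inside `failBackHtlc` follows the pseudocode above.

```go
package main

import "fmt"

// htlcID identifies an HTLC in this toy model (hypothetical type).
type htlcID string

// node is a toy stand-in for a routing node, not LND's real data model.
type node struct {
	failedInDB      map[htlcID]bool // persistent: HTLCs marked as failed
	unresolvedInDB  map[htlcID]bool // persistent: incoming HTLCs not fully resolved upstream
	incomingHtlcMap map[htlcID]bool // in-memory only: rebuilt on restart
	failuresSent    []htlcID        // failure messages sent upstream
}

func newNode() *node {
	return &node{
		failedInDB:      make(map[htlcID]bool),
		unresolvedInDB:  make(map[htlcID]bool),
		incomingHtlcMap: make(map[htlcID]bool),
	}
}

// acceptIncoming records a new incoming HTLC on disk and in memory.
func (n *node) acceptIncoming(h htlcID) {
	n.unresolvedInDB[h] = true
	n.incomingHtlcMap[h] = true
}

// forwardPreimage models step 3: the preimage is handed upstream and the
// in-memory entry is dropped, but the upstream peer has not yet completed
// the settlement, so the HTLC stays unresolved on disk.
func (n *node) forwardPreimage(h htlcID) {
	delete(n.incomingHtlcMap, h)
}

// failBackHtlc keeps the buggy ordering: database write first, check second.
func (n *node) failBackHtlc(h htlcID) {
	n.failedInDB[h] = true
	if !n.incomingHtlcMap[h] {
		return // "double resolution" caught, but the DB mark remains
	}
	n.failuresSent = append(n.failuresSent, h)
	delete(n.incomingHtlcMap, h)
}

// restart rebuilds in-memory state from disk and replays recorded failbacks.
func (n *node) restart() {
	n.incomingHtlcMap = make(map[htlcID]bool)
	for h := range n.unresolvedInDB {
		n.incomingHtlcMap[h] = true
	}
	pending := make([]htlcID, 0, len(n.failedInDB))
	for h := range n.failedInDB {
		pending = append(pending, h)
	}
	for _, h := range pending {
		n.failBackHtlc(h)
	}
}

func main() {
	n := newNode()
	n.acceptIncoming("H")
	n.forwardPreimage("H")
	n.failBackHtlc("H") // C1 confirms and H is judged "missing"
	fmt.Println("failures sent before restart:", len(n.failuresSent))
	n.restart()
	fmt.Println("failures sent after restart:", len(n.failuresSent))
}
```

In this model no failure is sent before the restart, and exactly one is sent after it, because the database mark outlives the in-memory check that suppressed the failure the first time.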
Stealing HTLCs
Consider the following topology, where B is the victim and M0 and M1 are controlled by the attacker.
M0 -- B -- M1
The attacker can steal funds as follows:
1. M0 routes a large HTLC along the path M0 -> B -> M1.
2. M0 goes offline.
3. M1 claims the HTLC from B by revealing the preimage, receives a new commitment signature from B, and then refuses to revoke the previous commitment.
4. B attempts to claim the upstream HTLC from M0 but can’t because M0 is offline.
5. M1 force closes the B-M1 channel using their new commitment, thus triggering the excessive failback bug.
6. The attacker crashes B using an unpatched DoS vector.
7. M0 comes back online.
8. B restarts, loads HTLC resolution data from the database, and incorrectly fails the HTLC with M0.
At this point, the attacker has succeeded in stealing the HTLC from B. M0 got the HTLC refunded, while M1 got the value of the HTLC added to their balance on the confirmed commitment.
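To make the loss concrete, here is a minimal accounting sketch in Go. It is an illustration only: the `resolution` type, the `netForRouter` function, and the 1,000,000 satoshi amount are hypothetical, and routing fees are ignored. It shows that a forwarding node breaks even when both hops resolve the same way, and loses the full amount when the downstream hop settles while the upstream hop fails.

```go
package main

import "fmt"

// resolution is how one hop of a forwarded HTLC ended (hypothetical type).
type resolution int

const (
	settled resolution = iota // preimage accepted: value moves toward the recipient
	failed                    // HTLC cancelled: value returns to the sender
)

// netForRouter returns the forwarding node's net balance change for one
// forwarded HTLC of amt satoshis, ignoring routing fees.
func netForRouter(amt int64, upstream, downstream resolution) int64 {
	var net int64
	if upstream == settled {
		net += amt // the router collects from the upstream peer
	}
	if downstream == settled {
		net -= amt // the router pays the downstream peer
	}
	return net
}

func main() {
	const amt = 1_000_000 // hypothetical HTLC value in satoshis

	fmt.Println("both settled:", netForRouter(amt, settled, settled))
	fmt.Println("both failed:", netForRouter(amt, failed, failed))
	// The attack: M1 settles downstream, B wrongly fails M0 upstream.
	fmt.Println("attack:", netForRouter(amt, failed, settled))
}
```

The first two cases print 0 and the attack case prints -1000000, matching the statement that B loses the full value of the HTLC.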
The Fix
The excessive failback bug was fixed by a small change to prevent failback of HTLCs for which the preimage is already known. The updated logic now explicitly checks for preimage availability before failing back each HTLC:
func failBackMissingHtlcs(confirmedCommit Commitment) {
	currentCommit, pendingCommit := getValidCounterpartyCommitments()

	var danglingHtlcs HtlcSet
	if confirmedCommit == pendingCommit {
		danglingHtlcs = currentCommit.Htlcs()
	} else {
		danglingHtlcs = pendingCommit.Htlcs()
	}

	confirmedHtlcs := confirmedCommit.Htlcs()
	missingHtlcs := danglingHtlcs.SetDifference(confirmedHtlcs)
	for _, htlc := range missingHtlcs {
		if preimageIsKnown(htlc.PaymentHash()) {
			continue // Don't fail back HTLCs we can claim.
		}
		failBackHtlc(htlc)
	}
}
The preimageIsKnown check prevents failBackHtlc from being called when the preimage is known, so such HTLCs are never failed backward or marked as failed in the database. On restart, the incorrect failback behavior no longer occurs.
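The effect of the preimage check can be tried out with a small runnable Go sketch. This is a simplified model under stated assumptions, not the actual patch: the `htlc` struct, the `failBackMissing` function, the slice-based commitments, and the `knownPreimages` map are all hypothetical. G is an invented second HTLC whose preimage was never revealed, included only to show that ordinary failbacks still happen.

```go
package main

import "fmt"

// htlc is a toy HTLC record (hypothetical type).
type htlc struct {
	id          string
	paymentHash string
}

// failBackMissing returns the HTLCs to fail upstream: those present in the
// dangling set but absent from the confirmed commitment, excluding any
// HTLC whose preimage is already known.
func failBackMissing(dangling, confirmed []htlc, knownPreimages map[string]bool) []htlc {
	inConfirmed := make(map[string]bool, len(confirmed))
	for _, h := range confirmed {
		inConfirmed[h.id] = true
	}

	var toFail []htlc
	for _, h := range dangling {
		if inConfirmed[h.id] {
			continue // still has an output on chain
		}
		if knownPreimages[h.paymentHash] {
			continue // claimable upstream, so never fail it back
		}
		toFail = append(toFail, h)
	}
	return toFail
}

func main() {
	h := htlc{id: "H", paymentHash: "hashH"}
	g := htlc{id: "G", paymentHash: "hashG"}

	c0 := []htlc{h, g} // the other valid commitment
	c1 := []htlc{}     // the confirmed commitment, with no HTLC outputs
	known := map[string]bool{"hashH": true}

	for _, x := range failBackMissing(c0, c1, known) {
		fmt.Println("fail back:", x.id)
	}
}
```

Here only G is reported for failback. H is skipped before any failure is recorded, so in this model there is nothing to replay on restart.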
The patch was hidden in a massive rewrite of LND’s sweeper system and was released in LND 0.18.0.
Discovery
This vulnerability was discovered during an audit of LND’s contractcourt package, which handles on-chain resolution of force closures.
Timeline
- 2024-03-20: Vulnerability reported to the LND security mailing list.
- 2024-04-19: Fix merged.
- 2024-05-30: LND 0.18.0 released containing the fix.
- 2025-02-17: Gijs gives the OK to disclose publicly in March.
- 2025-03-04: Public disclosure.
Prevention
It appears all other lightning implementations have independently discovered and handled the corner case that LND mishandled:
- CLN added a preimage check to the failback logic in 2018.
- eclair introduced failback logic in 2023 that filtered upstream HTLCs by preimage availability.
- LDK added a preimage check to the failback logic in 2023.
Yet the BOLT specification has not been updated to describe this corner case. In fact, by a strict interpretation the specification actually requires the incorrect behavior that LND implemented:
## HTLC Output Handling: Remote Commitment, Local Offers
### Requirements
A local node:
- for any committed HTLC that does NOT have an output in this commitment transaction:
  - once the commitment transaction has reached reasonable depth:
    - MUST fail the corresponding incoming HTLC (if any).
It is quite unfortunate that all implementations had to independently discover and correct this bug. If any single implementation had contributed a small patch to the specification after discovering the issue, it would have at least sparked some discussion about whether the other implementations had considered this corner case. And if CLN had recognized that the specification needed updating back in 2018, there’s a good chance all other implementations would have handled this case correctly from the start.
Takeaways
- Keeping specifications up-to-date can improve security for all implementations.
- Update to at least LND 0.18.0 to protect your funds.