Hi folks. I am posting a draft proposal to attempt to improve the wallet recovery experience for users or their heirs.
TLDR: I am worried an heir will have to copy-paste descriptors from a txt or json into future wallet software. Ideally, heirs may simply drag-and-drop a file into future wallet software and be assured all relevant data that encapsulates a wallet is present. This file should also store metadata, like where signers may be located, or a height to start searching for outputs.
I know that reading a file format specification may not be very exciting, but I am hoping to gather feedback and ultimately improve recovery for the community. The specification is outlined below, and I also have a reference implementation in Rust to experiment with. Any feedback is greatly appreciated!
Motivation
The purpose of a standarized file format is to decrease the chance of loss of funds in the case of wallet recovery or inheritance. BIP32 seed phrases are not sufficient information to recover funds if multiple signers are used, a non-standard derivation path is used, or if the recoverer does not know what to do with such seed phrase. Moreover, seed phrases must be entered into a device to recover wallet public keys. Output descriptors alleviate these issues, however the process of storing, exporting, and sharing descriptors is currently determined by users. A unified file format allows for “one-click” backups and recoveries for descriptor-based bitcoin wallets, transferring the responsibilities of handling descriptors from users to software.
Output descriptors are often exported, shared, and stored in .txt files, but other formats such as JSON are also used. This discrepancy causes developer burden, and varying formats cause uncertainty of support for parsing such data in future wallet software. If descriptor parsing from a txt or json file fails, it is up to the heir or recoverer to resolve the problem. A unifed file format for importing and exporting descriptors not only decreases developer burden, but allows for a higher degree of certainty that an heir will not have to manually “copy and paste” descriptors into future software. Files are portable, duplicable, and easily parsed by most devices. Files may also be encrypted and safely shared with attorneys, business partners, or other semi-trusted entities.
Futher still, metadata about the descriptors, such as the name of the wallet, a description of its uses, where to start scanning for ouputs, and additional information to recover signers are either encoded as a comment in the .txt, introducing difficulty for software to parse reliably, or fields in a JSON, introducing forward-compatiblity concerns for future wallet software. These fields are easily encoded and decoded with a standard format.
Definitions
descriptor, shorthand for “output descriptor.”
|| denotes the concatenation of two elements.
[]bytes represents a variable array of bytes.
[N]bytes represents an array of N bytes.
Record is an entry in an import.
Import an entry in the file that describes metadata and associated descriptors.
Specification
Record
Information is stored in a WDEF file in the form of Records. A Record is a tuple of a Type, Length, and Value, followed by a 4 byte checksum. Every WDEF file is prefixed with a single byte representing the number of records in the file. A record type is represented as a byte, with possible types listed below:
Record Type (Type) |
Value (u8) |
Description |
|---|---|---|
| Name | 0x00 | The name of the wallet in the file |
| Description | 0x01 | Summary of this wallet’s use(s) |
| Info | 0x02 | Any additional information to recover funds |
| RecoveryHeight | 0x03 | Height in the blockchain this wallet began to receive and send payments |
| ExternalDescriptor | 0x04 | A descriptor that is used to receive bitcoins. Encodes public keys and cannot spend bitcoins |
| InternalDescriptor | 0x05 | A descriptor that is used to generate change outputs when spending bitcoins. Encodes public keys and cannot spend bitcoins |
| MultipathDescriptor | 0x06 | A descriptor that encodes multiple paths to unique descriptors. |
A Length is a 16-bit number represented as bytes in little endian. The length represents the number of bytes in the value encoding that follows.
Name, Description, Info, ExternalDescriptor, InternalDescriptor, and MultipathDescriptor are all represented as strings and encoded as the UTF-8 byte array
for such a string representation. RecoveryHeights are represented as a 4 byte little endian array representation.
The checksum for a Record is calculated by SHA256( Type || Value ) and taking the first four bytes of the resulting hash.
A Record is completely defined as:
Type:[1]bytesIDLength:[2]byteslittle endian value representing the length of the next fieldValue:[]bytesvariable length contents representing the record type, often a UTF-8 encoded stringChecksum:[4]bytesa commitment to the record type and record value
Import
An Import is simply a structured list of Records. An import contains a single byte that represents the count of Records in the file, followed by the Records. All Imports must adhere to these requirements, or the file is invalid:
- A single
Nameis required. - Any
Descriptionmust be unique, but is not required. - Any number of
Infomay be included. - A
RecoveryHeightmay be provided, but is not required. TheRecoveryHeightmust be unique. - Any number of
ExternalDescriptor,InternalDescriptor, orMultipathDescriptormay be provided. - At least one
ExternalDescriptororMultipathDescriptormust exist.
An Import is encoded as:
Length:[1]byteslittle endian integer representing theRecordcountRecords:[]bytesthe entries of the file.
File Prefix
Every WDEF file is prefixed with seven bytes of data that identifies the file: 0x00, 0x00, 0x00, 0x57, 0x44, 0x45, 0x46. This data is followed by the protocol version, which is set to 0x00 for version one.
Complete Structure
- File identifier:
0x00, 0x00, 0x00, 0x57, 0x44, 0x45, 0x46 Version:[1]bytesLength:[1]bytesRecords:[]bytes
Encoding Files
First, the seven bytes of the file identifier and single verison byte are written to the file.
Next, the number of records to be recorded in the file should be determined, and the byte representing the length is added to the serialization buffer. If the number of records cannot fit into an 8-bit unsigned integer, encoding fails.
Next, each record is composed as Type || Length || Value || Checksum, and each array of bytes are concatinated together. Encoding fails if the Length cannot be represented as a 16-bit unsigned integer.
If any of the Import requirements are violated, encoding fails.
Decoding Files
Read the first seven bytes of the file, failing the decoding if the file identifier does not match. Read the next byte and interpret it as the version. For versions greater than those supported by the software, decoding fails. All WDEF decoders must implement 0x00.
The next byte is read from the file and interpreted as the number of records. For each record in the record count:
-
Read and interpret the first byte as the
Type. Decoding fails for unrecognizedTypevalues. -
Read the next two bytes and interpret them as an 16-bit unsigned integer, denoted
L -
Read the next
Lbytes and parse into the desired representation. Decoding fails if the bytes cannot be parsed. -
Calculate the checksum using the
ValueandType -
Read the next 4 bytes and fail if the calculated and presented checksum do not match.
-
For descriptors, parse the computed string and attempt to cast it to a descriptor that encodes public keys. Decoding fails for if the provided string cannot be cast to a public key descriptor.
After all the Records have been collected, evalutate if the collection follows the rules for an Import. Decoding fails if a rule is violated.
Files adhering to this standard should be postfixed with the .wdef extension.
Rationale
Descriptor Values are represented strings for the following reasons:
-
Bitcoin Core supports parsing strings as descriptors within
bitcoin-cli. -
UTF-8 encodings are an open standard with forwards compatability for new
Types. -
Most encoding logic may be shared for each
Type.
The checksum is added to ensure that invalid Record encodings are discovered during decoding and file integrity is guaranteed when successfully parsing a WDEF file.