If the concern is really that you might have a very large commitment transaction (say 30k-40k vbytes) that could require a lot of UTXOs to CPFP, then the downside of first consolidating your UTXOs down to one in a separate transaction, getting that confirmed, and then using that single UTXO to CPFP would seem not so large in percentage terms?
The additional number of vbytes consumed to consolidate first would be something like 110 vbytes, if I’m calculating right (1 extra transaction’s overhead, plus one extra output created and one extra input spent).
Not ideal as a long-run solution, but in thinking about the tradeoffs, maybe minimizing pinning potential by going with a smaller child size is more valuable?
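
For anyone who wants to check the ~110 vbyte figure, here is a rough sketch of the arithmetic. The per-component sizes are my assumptions (commonly cited approximate P2WPKH sizes, not anything specified above), and the function names are made up for illustration. It only counts extra vbytes; it ignores the fee rate and the delay of waiting for the consolidation to confirm.

```python
# Approximate sizes in vbytes. These are assumed P2WPKH figures,
# not numbers taken from the discussion above.
TX_OVERHEAD_VB = 10.5    # version, locktime, input/output counts, segwit marker+flag
P2WPKH_OUTPUT_VB = 31.0  # 8 value + 1 script length + 22 scriptPubKey
P2WPKH_INPUT_VB = 68.0   # 41 non-witness bytes + roughly 108 witness bytes / 4


def extra_consolidation_vbytes() -> float:
    """Extra vbytes from consolidating first, versus spending all UTXOs in the child.

    The many small inputs are spent either way, so the only additions are the
    consolidation transaction's overhead, the one output it creates, and the
    one input the CPFP child spends to consume that output.
    """
    return TX_OVERHEAD_VB + P2WPKH_OUTPUT_VB + P2WPKH_INPUT_VB


def overhead_fraction(commitment_vbytes: float) -> float:
    """Extra vbytes as a fraction of the commitment transaction's size."""
    return extra_consolidation_vbytes() / commitment_vbytes


if __name__ == "__main__":
    extra = extra_consolidation_vbytes()
    for size in (30_000, 40_000):
        print(f"{size} vbyte commitment: +{extra} vbytes, "
              f"{overhead_fraction(size) * 100:.3f}% of its size")
```

Under those assumptions the extra cost comes to 109.5 vbytes, which is well under half a percent of a 30k-40k vbyte commitment transaction. Other output types (e.g. P2TR) shift the total by a vbyte or two but not the conclusion.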