It is true that larger blocks are a centralizing pressure upon miners. But there are several questions that must be answered before we can decide whether a given centralizing force should be avoided. One such question relates to the goals of decentralization itself, and whether slightly more or less centralization of miners has any practical effect on Bitcoin’s censorship-resistant qualities. All else being equal, less centralization would be better, but we must treat the question as a matter of cost-benefit.
Another question relates to the balance and magnitude of the other centralizing and decentralizing forces that operate on miners, and how propagation delays would affect the result. For instance, it is a fact that stranded and excess electricity are geographically distributed around the globe and across different political regimes. More generally, cheap electricity is also well distributed. Losses to small miners from propagation delays would need to be large enough to offset the higher electricity costs incurred by co-locating to where propagation losses would be smaller. Aren’t these costs currently an order of magnitude or two apart, with electricity costs dwarfing propagation delay losses in miners’ location decisions? And to clear up an earlier point: there is only a finite quantity of free or low-cost electricity in each location where it is present. This is a very strong force keeping miners geographically distributed.
Indeed there are N^2 dynamics at play, which can quickly cause problems if we aren’t careful. So we need to be careful.
I disagree that we need to consider the ‘efficiency of the network as a whole’ in the way you are advocating. Instead, we primarily ought to consider current and potential end users, and whether the costs and benefits weigh out for them to become or remain a user. And generally, we ought to want Bitcoin to be a viable choice for transacting for as many people as possible, for both practical and philosophical reasons. Individuals do not worry about their impact on “total global bandwidth consumption”. Not at all. And for the professionals involved in keeping the internet up and running, the push is for more fiber, faster switches and overall better connectivity. Not the rationing of bandwidth.
There are parallels between your point about blocksize N^2 scaling and the advances in megapixel counts for smartphone cameras. Higher pixel counts lead to better photos (which take up more storage), and the better results encourage people to take more photos. Indeed, on my smartphone today I have (checks phone) 236 GB of photos and videos. Which is almost embarrassing, but no one argued against better phone cameras for fear of this outcome.
Perhaps this is one reason why phone cameras aren’t 200 MP already, because storage does matter and diminishing returns exist. But N^2 consumption of a resource doesn’t mean an approach is flawed, or that increasing N has no benefit; it just means that over some relatively small range of N, consumption of the available resources goes from trivial to manageable to unworkable.
In terms of the viability of the network as a whole, we should keep in mind that not all nodes serve blocks, so the burden of the N^2 network traffic can be concentrated. For this type of question, though, we ought to use the cheapest bandwidth available as the measure, since there is no reason block-serving nodes couldn’t locate in those places. For instance, in Switzerland there are 10 Gbit/s fiber-to-the-home plans for $100 USD / month, which could serve about 2,500 TB of data per month at a sustained rate of roughly 1 GB / second. That works out to $0.04 / TB transmitted. Bitcoin’s chain is currently just under 600 GB of data, or 2.4¢ in server transmission costs to do an IBD. It would cost $480 total to provide data to all of the 20,000 currently existing nodes for their IBD. The 17 GB / month needed to keep a node in sync with 4 MB blocks, across 20k nodes, would total about 340 TB, at a cost of roughly $14. Roughly $14 to feed the entire trillion dollar-plus network with data for a month. Multiply this by 10 and then square it and it would still be a completely trivial cost.
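For anyone who wants to check or vary the arithmetic above, here is a minimal sketch in Python. The plan price, monthly transfer volume, chain size, node count and block size are the figures quoted in this post. The 144 blocks per day and the 30-day month are my own assumptions, as are the function names, and decimal units (1 TB = 1,000 GB) are used throughout, so treat the results as back-of-envelope estimates rather than precise costs.

```python
# Back-of-envelope bandwidth costs for serving block data.
# Inputs are the figures quoted in the post unless marked as an assumption.

PLAN_USD_PER_MONTH = 100.0   # 10 Gbit/s fiber plan price
PLAN_TB_PER_MONTH = 2_500.0  # data served per month at ~1 GB/s sustained
CHAIN_GB = 600.0             # approximate size of the chain
NODE_COUNT = 20_000          # approximate number of nodes
BLOCK_MB = 4.0               # block size used for the sync estimate
BLOCKS_PER_DAY = 144         # assumption: one block every 10 minutes
DAYS_PER_MONTH = 30          # assumption


def usd_per_tb() -> float:
    """Cost to transmit one TB on the quoted plan."""
    return PLAN_USD_PER_MONTH / PLAN_TB_PER_MONTH


def ibd_cost_usd(chain_gb: float = CHAIN_GB) -> float:
    """Cost to serve one full initial block download."""
    return chain_gb / 1_000 * usd_per_tb()


def monthly_sync_gb(block_mb: float = BLOCK_MB) -> float:
    """Data one node needs per month to stay in sync with full blocks."""
    return block_mb * BLOCKS_PER_DAY * DAYS_PER_MONTH / 1_000


def monthly_sync_cost_usd(nodes: int = NODE_COUNT,
                          block_mb: float = BLOCK_MB) -> float:
    """Cost to serve one month of blocks to every node."""
    total_tb = monthly_sync_gb(block_mb) * nodes / 1_000
    return total_tb * usd_per_tb()


if __name__ == "__main__":
    print(f"USD per TB:              {usd_per_tb():.2f}")
    print(f"One IBD (USD):           {ibd_cost_usd():.3f}")
    print(f"IBD for all nodes (USD): {ibd_cost_usd() * NODE_COUNT:.0f}")
    print(f"Sync per node (GB/mo):   {monthly_sync_gb():.2f}")
    monthly = monthly_sync_cost_usd()
    print(f"Sync, all nodes (USD):   {monthly:.2f}")
    # "Multiply this by 10 and then square it":
    print(f"(10x) squared (USD):     {(monthly * 10) ** 2:.0f}")
```

With these inputs the monthly figure lands in the low teens of dollars, and even the "times 10, then squared" case stays under $20,000 per month. Changing `BLOCK_MB` or `NODE_COUNT` shows how quickly, or how slowly, the total moves.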
I don’t disagree on the importance of Lightning, or other scaling layers. And I do agree that the base chain will never have the capacity to settle even a meaningful fraction of global demand for transactions. But neither of those truths suggests we shouldn’t increase the on-chain capacity to the largest practical and safe amount that current conditions allow for. We should. And today that amount is larger than the current blocksize.
There is not a dichotomy here. It is not one or the other. We can strive to reduce demand for on-chain transactions by creating other options that are superior, while at the same time increasing on-chain capacity to its safe and viable limits. Bitcoin is an inherently inefficient design, and it has been enormously popular despite that.