I find it a safer bet that there are terrible economics all over. Especially when the buyers are not the users, as is usually the case with supercomputers (just like with all "enterprise" stuff).
In the cluster I'm using there are 36 nodes, of which 13 are currently not idling (which doesn't mean they are computing). There are 8 V100 GPUs and 7 A100 GPUs, and all of them are idling. Admittedly it's holiday season and 3AM here, but it's similar at other times too.
This is of course great for me, but I think the safer bet is that the typical load average of a "supercomputer" is under 0.10. And the less useful the hardware, the lower its load will be.
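If you want to eyeball this on your own cluster, here's a minimal sketch, assuming it runs Slurm (an assumption on my part; other schedulers need different commands), that tallies node states from `sinfo`:

    # Minimal sketch, assuming a Slurm-managed cluster.
    # Counts how many nodes are in each state by parsing `sinfo` output.
    import subprocess
    from collections import Counter

    def node_states():
        # -N: one line per node, -h: no header, format: "<name> <state> <gres>"
        out = subprocess.run(
            ["sinfo", "-N", "-h", "-o", "%N %T %G"],
            capture_output=True, text=True, check=True,
        ).stdout
        states = Counter()
        for line in out.strip().splitlines():
            parts = line.split()
            if len(parts) >= 2:
                states[parts[1]] += 1  # e.g. "idle", "mixed", "allocated"
        return states

    if __name__ == "__main__":
        for state, count in node_states().most_common():
            print(f"{state:>12}: {count} nodes")

That only shows whether nodes are allocated, not whether the allocated jobs are actually doing useful work, which is the distinction I was getting at above.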
It is not reasonable to compare your local cluster to the largest clusters within DOE or their equivalents in Europe/Japan. Those machines regularly run at >90% utilization, and you will not be given an allocation if you can't prove that you'll actually use the machine.
I do see the phenomenon you describe on smaller university clusters, but those users are generally not power users who know how to leverage HPC to its full capacity. People in DOE spend their careers working to use as much of these machines as efficiently as possible.
In Europe, at least, supercomputers are organised in tiers. Tier 0 machines are the highest grade; Tier 3 machines are small local university clusters like the one you describe. From Tier 2 or Tier 1 upward you usually have to apply for time, and those machines are definitely highly utilised. At Tier 3 the situation will vary a lot from one university to the next, but you can be sure that funding bodies will look at utilisation before deciding on upgrades.
Also, this number of GPUs is not sufficient for competitive pure-ML research groups, from what I have seen. The point of these small, decentralised, underutilised resources is to have slack for experimentation. Want to explore an ML application with a master's student in your (non-ML) field? Go for it.
Edit: No idea how much of the total HPC market is in the many small installations vs. the fewer large ones. My instinct is that funders prefer to fund large centralised infrastructure, and getting smaller decentralised stuff done is always a battle. But that's all based on very local experience, and I couldn't guess how well this generalises.