By ARIA scanning tools I mean static checkers that throw an error when they see an element missing an attribute, without ever invoking a real screen reader.
I'm arguing for automated test scripts that use tools like Guidepup to launch a real screen reader and assert, for example, that the new content injected by fetch() is read out to the user once the form submission completes.
I want LLMs and coding agents to help me write those scripts, so I can run them in CI along with the rest of my automated tests.
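To make that concrete, here's a rough sketch of the kind of test I mean, using Guidepup's Playwright integration. Treat it as a sketch, not a recipe: the /contact URL, the field labels, and the "Message sent" status text are hypothetical stand-ins for whatever your app actually does.

    // contact-form.voiceover.test.ts — sketch only; URL and labels are made up.
    import { voiceOverTest as test } from "@guidepup/playwright";
    import { expect } from "@playwright/test";

    test("announces the fetch()-injected status after submit", async ({
      page,
      voiceOver, // fixture starts/stops a real VoiceOver instance (macOS)
    }) => {
      await page.goto("http://localhost:3000/contact", { waitUntil: "load" });
      await voiceOver.navigateToWebContent();

      // Drive the real UI; the app swaps in the result with fetch().
      await page.getByLabel("Email").fill("test@example.com");
      await page.getByRole("button", { name: "Send" }).click();
      await page.getByText("Message sent").waitFor();

      // The assertion that matters: the screen reader actually spoke the
      // injected status, not merely that the DOM contains it.
      const spoken = await voiceOver.spokenPhraseLog();
      expect(spoken.join(" ")).toContain("Message sent");
    });

Guidepup also ships an NVDA equivalent for Windows, so the same test shape can run against more than one reader. The main operational cost is that CI runners need a real desktop session for the screen reader to attach to.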
That's very different from what I thought you were arguing for in your top comment, though: a computer-use agent proving the app is usable through a screen reader alone (and hopefully caching a replayable trace so it isn't prompted on every run).
Guidepup already exists; if people cared, they'd already be using it for tests, with or without LLMs. Thanks for showing me this tool, BTW! I agree that testing against real readers is better than relying on a third party's heuristics.
So does hiring a person, and so do tests that rely on entropy because exhaustive testing is infeasible. If you can wrangle the randomness (each approach has its own way of doing that), you end up with very useful tests in all three scenarios, but only automated tests scale to running on every commit. You probably still want the non-automated tests per release or so if you can, depending on what you're doing, but you don't necessarily want only invariant tests in either case.