Something I'm desperately keen to see is AI-assisted accessibility testing. I'm ...

wouldbecouldbe · 2025-12-09T11:21:40 1765279300

Not a joke. If truly you want a properly functioning website for blind/bad sight/ Step 1 would probably be to put on a blindfold and go through your website with a screenreader (cmd + f5 on a mac).

tdeck · 2025-12-09T13:49:40 1765288180

I always wonder why this isn't a bigger part of the discussion. None of us would develop a visual UI flow without trying it manually at least once, but for some reason this idea isn't extended to discussions about accessibility. The advice always fits into these three groups:

1. Follow a checklist

2. Buy my software

3. Hire blind people to test your app

I'm not saying that these are bad (although some overlay software is actually worse than nothing), but aren't people even a little bit curious to try the user experience you're shipping?

There are popular, free screen readers out there and using one can teach you a lot.

BeFlatXIII · 2025-12-09T18:24:00 1765304640

Perhaps a blindfolded person and a person who has always been blind have very different expectations of how to use software, such that they would give divergent opinions on what makes a good screen reader UI.

tracker1 · 2025-12-09T16:46:48 1765298808

Can't speak for others... and though visually impaired, I don't think I could handle navigating with a screen reader myself. I've sat through blind testing before and it's definitely impressive and I learned a lot. I will say that I do make an effort to do a lot of keyboard only navigation as part of testing. Just that can help a lot in terms of limiting janky UX.

Especially with flexbox and other more modern layout options.

simonw · 2025-12-09T11:24:50 1765279490

Here are notes from last time I tried that: https://tools.simonwillison.net/aria-live-regions

It's not something I'm comfortable enough with to do on a regular basis though.

wouldbecouldbe · 2025-12-09T11:25:39 1765279539

yeah its a real challenge, but probably the only way to really understand it. It's a completely different way of using the web. GPT can really help understand it while doing it.

wonger_ · 2025-12-09T15:41:57 1765294917

Yes, I did this! https://wonger.dev/posts/blindfolded-deployment I navigated GitHub with Windows' default screen reader, and videoed the process and wrote up my learnings. It changed the way I build websites because I'm always thinking about the screen reader routes in the back of my mind. Highly recommend.

hombre_fatal · 2025-12-09T13:27:08 1765286828

MacOS has a VoiceOver tutorial.

It’s pretty eye-opening (heh) to do it and then try to use your websites.

Before you even get to aria labels, you’ll find a lot of things to fix like:

- Add or remove components from tabindex

- Esc should close the modal or the slide-out sidebar

- Closing the sidebar/modal should return focus to the button the user toggled to open it (this one is huge for UX)

I recommend it. These things are useful for keyboard nav in general.

dawnerd · 2025-12-09T16:30:32 1765297832

What’s frustrating is you basically are required to use JS when basic css and html would normally work. There’s been some improvements thanks to the dialog and popover apis but it’s still not fully there yet.

Also seems the large companies that have to have compliance only care about it from a legal standpoint and are fine with just making the tests pass from whatever compliance company they use.

tracker1 · 2025-12-09T16:50:06 1765299006

Using a good component library goes a long way here... I've yet to see a better overall experience than React+MUI myself. Though you should adjust the default color palette.

tracker1 · 2025-12-09T16:48:56 1765298936

I'll add focus the first input field, or the error section at the top, when full-form validation fails. And related, don't allow modals to render buttons off-screen when text/display zooming is maxed out on mobile devices (personally see this a lot).

tracker1 · 2025-12-09T16:44:27 1765298667

+1, or simply hire a blind person to run one or more stages of QC on your application.

Adjacently, I cannot state enough how much I wish other toolkits would offer component libraries as cohesive as MUI is for React... the use of Aria is "just right" IMO and is far more broad, complete and cohesive as a whole (aside from some poor default color/contrast choices in Material defaults inherited from Google).

Another thing that bugs me to no end, since I've developed some visual impairments, is sites/apps that don't function on mobile devices with text/display scaling cranked up. Modals where the buttons are off-screen and no way to scroll to them are useless, similarly allowing text to go too big (gmail) to where an 8-letter word gets split and wrapped.

All around, I definitely think that if you're spending 8+ figures on application developers you can afford testing by a few people who are visually impaired and blind.

Earlier in my career, I sat with a blind user through testing a bunch of eLearning content and it was really eye opening back then... the establishment of aria labels helps a lot... but as the article mentions, you need to use them right. I find that more often than not, using the right elements/components, labels, titles, etc in the first place goes a long way.

burningChrome · 2025-12-09T16:19:26 1765297166

Keep in mind, this is only one area of accessibility. Neurodiverse users, users with cognitive issues, users that have a hard time using a mouse, low vision users and even deaf users all have specific issues.

Simply testing with a screen reader is missing entire groups of users.

tracker1 · 2025-12-09T16:53:59 1765299239

+1 on this one... I've mentioned it a few times in these threads already... but for the love of all that is holy, try your mobile website/app in an actual phone where you have accessibility turned up for text and display.

I cannot tell you how many times I've experienced modals with buttons literally off screen and no navigation option in apps from multi-billion and trillion dollar tech companies. Almost as bad is gmail allowing text to scale so insanely large that it's equally unusable.

ErroneousBosh · 2025-12-09T12:30:48 1765283448

I would very much like to be better at building websites that handle assistive technologies like screen readers well. I don't have a lot of blind users to worry about because they don't let you be a firefighter if you're blind.

But a lot of firefighters are people who simply did not do well in school, even the very senior ones, and that's because they are often very clever people who are of an age where things like dyslexia were just not diagnosed early or often.

So now I deal with a lot of people who use assistive technologies to help with dyslexia and related comorbidities. I have dyscalculia that wasn't diagnosed until I was 19 and at uni, and even then it was diagnosed initially by my mate's Educational Psychology undergrad girlfriend in the pub one evening. That was in the early 90s and we're better at it now - but not by much.

PaulHoule · 2025-12-09T13:35:28 1765287328

I'd argue it's pretty hard on Windows.

NVDA hasn't worked for me since Windows 11.

Narrator + IE and Narrator + Chrome basically work but make ARIA look like vandalism. It will just be reading the text and blurt out "LANDMARK WILL ROBINSON!" in the middle of the text for no obvious reason and doesn't do it the same very time. It's basically possible to use either one of those but Narrator + Firefox is especially spastic and blurts out random landmarks and ARIA junk to the extent that it's really difficult to evaluate.

I mean, that's part of the problem, there is a spec for how HTML is supposed to be rendered, but ARIA is not specific about how ARIA markup is rendered which might means tools could bend to meet users' needs but it also means there is no such thing as "I've convinced myself that my ARIA markup is proper and will work for everyone with mainstream tools"

dawnerd · 2025-12-09T16:32:53 1765297973

Sounds like you might have content that’s changing? But what you’re running into is exactly what a user would be running into.

PaulHoule · 2025-12-09T17:37:31 1765301851

No, I just don't there's a specification for what exactly a screen reader is supposed to with ARIA markup.

For example it could show you or read to you a list of all the nav landmarks on the page, which would be (1) helpful for end users, and (2) sure as hell helpful for me as a dev so I can know how the landmarks work as a system. All screen browsers really seem to do is blurt out "NAVIGATION LANDMARK!" randomly before or in the middle or after the <nav>

devinprater · 2025-12-09T11:09:37 1765278577

There are thousands of blind people on the net. Can't you hire one of them to test for you? Please?

angiolillo · 2025-12-09T14:19:15 1765289955

> There are thousands of blind people on the net. Can't you hire one of them to test for you?

Testing is a professional skill -- not all blind people are good at accessibility testing, just as not all sighted people are good at GUI testing.

My team has carved out an accessibility budget so that every couple years we can hire an accessibility consultancy (which employs a couple blind testers) for a few tens of hours of work to review one of our application workflows. Based on the issues they identify we attempt to write tests to prevent those classes of issues across the whole application suite, but our budget means that less than one percent of our UI has ever been functionally tested for accessibility.

It comes down to cost/benefit. Good testers are expensive, good accessibility testers doubly-so. And while I personally think there's a moral imperative and maybe a marketing angle, improving accessibility truthfully doesn't seem to meaningfully improve sales. But if the testing costs came down by a couple orders of magnitude it would be a complete game-changer.

tracker1 · 2025-12-09T17:03:52 1765299832

I think one path that could be used and wholy underrated is simply sitting with a user while they use/navigate your application. The user themselves doesn't necessarily need to be skilled tester, and you will need to have more session time than a skilled tester, but it does work and can help a lot.

Also, try using your app/site without a mouse. I've found it funny how many actual, experienced, sighted testers don't actually know the keyboard navigation for things like triggering menus, select boxes, etc. Personally, I don't think I could get used to the voice navigation myself, it's not that it doesn't work, it's just kind of noisy. Although, most sites are excessively noisy visually imo.

angiolillo · 2025-12-09T17:56:21 1765302981

Completely agree on both counts. We do usability testing, including with keyboard-focused advanced users.

But usability testing with blind users presents some unique challenges. A past org I worked at ran some usability studies with blind users [1] and while I was only tangentially involved in that project it seemed that subject recruitment and observations were much more complex than typical usability studies. I haven't managed to run a usability study with blind participants at my current org though we have discussed ways we could recruit blind users for studies -- our software is complex enough that we'd need someone who is both blind and a prospective user of our software.

[1] https://www.bloomberg.com/ux/2018/08/28/visually-impaired-wo...

m12k · 2025-12-09T11:24:45 1765279485

If you don't want this to break eventually, you need it tested every time your CI/CD test suite runs. Manual testing just doesn't cut it

tdeck · 2025-12-09T13:54:01 1765288441

We have the exact same problem with visual interfaces, and the combination of manual testing for major changes + deterministic UI testing works pretty well.

Actually it could be even easier to write tests for the screen reader workflow, since the interactions are all text I/O and pushing keys.

cenamus · 2025-12-09T11:58:24 1765281504

AI in your CI pipeline won't help either then, if it randomly gives different answers

simonw · 2025-12-09T12:07:05 1765282025

An AI-generated automated testing script in your pipeline will do great though.

debugnik · 2025-12-09T12:24:19 1765283059

And then we're back at your own:

> I'm not convinced at all by most of the heuristic-driven ARIA scanning tools.

simonw · 2025-12-09T12:36:27 1765283787

That's entirely different.

ARIA scanning tools are things that throw an error if they see an element that's missing an attribute, without even attempting to invoke a real screenreader.

I'm arguing for automated testing scripts that use tools like Guidepup to launch a real screenreader and assert things like the new content that was added by fetch() being read out to the user after the form submission has completed.

I want LLMs and coding agents to help me write those scripts, so I can run them in CI along with the rest of my automated tests.

debugnik · 2025-12-09T15:09:55 1765292995

That's very different from what I thought you were arguing for in your top comment, though: a computer-use agent proving the app is usable through a screen reader alone (and hopefully caching a replayable trace to not prompt it on every run).

Guidepup already exists, if people cared they'd use it for tests with or without LLMs. Thanks for showing me this tool BTW! I agree testing against real readers is better than using a third-party's heuristics.

zamadatix · 2025-12-09T12:17:17 1765282637

So does hiring a person or tests which rely on entropy because exhaustive testing is infeasible. If you can wrangle the randomness (each has different ways of going about that) then you end up with very useful tests in all 3 scenarios, but only automated tests scale to running every commit. You probably still want the non-automated tests per release or something as well if you can, depending what you're doing, but you don't necessarily want only invariant tests in either case.

simonw · 2025-12-09T11:23:07 1765279387

Realistically I'm unlikely to do that for my dozens of non-income-generating personal projects.

I still want them to be accessible!

(The amount of accessibility testing I want to do would bankrupt me very quickly.)

PaulHoule · 2025-12-09T13:30:59 1765287059

This. I've been doing a lot of accessibility work and it seems like the one thing nobody ever does is talk to a screenreader user.

vintermann · 2025-12-09T14:18:21 1765289901

Most of us can't, no. Sorry. There's just not enough money in the things we make.

Andrex · 2025-12-09T12:12:54 1765282374

Why not both?

javchz · 2025-12-09T13:57:52 1765288672

I think this is the way, automated testing for all patches, small changes, and manual testing for big releases.

Moto7451 · 2025-12-09T16:25:31 1765297531

I’m doing a PoC at work with Workback.ai, which is essentially what you’re asking about. So far it’s early but it seems ok at first brush. We have a firm we pay for traditional accessibility assessments, remediation, and VPATs and my expectation is that the AI tooling does not replace them due to how business needs and product design interact with accessibility.

I.e. ChatGPT and Cursor can probably remediate adding screen reader support for a solving a Captia for the blind, but do you want to really do that? There’s likely a better design for the blind.

Either way, I agree. This is a big area where there can be real impact in the industry. So far we’ve gotten scans back in record time compared to human in the loop scans.

jillesvangurp · 2025-12-09T12:57:04 1765285024

A more viable path might actually be agentic testing via agents that simply use a browser or screen reader that can work off high level test scenarios.

I've done some UI testing via the agent mode in chat gpt and I got some pretty decent feedback out of that. I've been trying to do more of that.

Accessibility testing might require a bit more additional tooling than comes with chat gpt by default. But otherwise, this could work.

edflsafoiewq · 2025-12-09T10:42:37 1765276957

Can screen readers emit their narration as text instead of / in addition to audio?

devinprater · 2025-12-09T11:14:32 1765278872

Yes, they can. NVDA has a Speech Viewer. VoiceOver (Mac) has the caption panel.

NVDA Speech viewer: https://download.nvaccess.org/documentation/userGuide.html#S... Caption Panel: https://support.apple.com/guide/voiceover/use-the-caption-pa...

askew · 2025-12-09T11:15:26 1765278926

Yes. Output can be sent to a Braille device too.

klysm · 2025-12-09T16:14:13 1765296853

There might be a great use case here, but the economics make me nervous. Don't the same problems apply here for why we don't have great accessibility? Who is paying for it? How do I justify the investment (AI or not) to management?

askew · 2025-12-09T10:36:08 1765276568

Guidepup also includes a Virtual Screenreader[1].

[1] https://www.guidepup.dev/docs/virtual

PebblesHD · 2025-12-09T10:25:32 1765275932

Rather than improving testing for fallible accessibility assists, why not leverage AI to eliminate the need for them? An agent on your device can interpret the same page a sighted or otherwise unimpaired person would giving you as a disabled user the same experience they would have. Why would that not be preferable? It also puts you in control of how you want that agent to interpret pages.

simonw · 2025-12-09T10:50:02 1765277402

I'm optimistic that modern AI will lead to future improvements in accessibility tech, but for the moment I want to meet existing screenreader users where they are and ensure the products I build are as widely accessible as possible.

K0nserv · 2025-12-09T11:39:55 1765280395

It adds loads of latency for one. If you watch someone who is a competent screen reader user you'll notice they have the speech rate set very high, to you it'll be hard to understand anything. Adding an LLM in the middle of this will add, at least, hundreds of milliseconds of latency to interactions.

eru · 2025-12-09T10:33:36 1765276416

What you are describing is something the end user can do.

What simonw was describing is something the author can do, and end user can benefit whether they use AI or not.

8organicbits · 2025-12-09T12:22:13 1765282933

The golden rule of LLMs is that they can make mistakes and you need to check their work. You're describing a situation where the intended user cannot check the LLM output for mistakes. That violates a safety constraint and is not a good use case for LLMs.

devinprater · 2025-12-09T14:12:20 1765289540

I, myself, as a singular blind person, would absolutely love this. But we ain't there yet. On-device AI isn't finetuned for this, and neither Apple nor Google have shown indications of working on this in release software, so I'm sure we're a good 3 years away from the first version of this.

simonw · 2025-12-09T10:11:03 1765275063

... I just saw Guidepup has an official GitHub Actions setup action, so that's great news! https://github.com/guidepup/setup-action

On macOS it can record audio too.

westont5 · 2025-12-09T17:48:02 1765302482

The agent-driving-a-screenreader footgun is that it's quite easy to build UI that creates a reasonable screereader UX, but unintentionally creates accessibility barriers for other assistive technologies, like voice control.

Ex: a search control is built as <div aria-label="Search"><input type="search"/></div>. An agent driving a screenreader is trying to accomplish a goal that requires search. Perhaps it tries using Tab to explore, in which case it will hear "Search, edit blank" when the input is focused. Great! It moves on.

But voice control users can't say "click Search". That only works if "Search" is in the control's accessible name, which is still blank, the outer div's aria-label has no effect on components it wraps. Would an agent catch that nuance? Would you?

You could realign to "I want to know if my features work for screenreader, voice control, switch, keyboard, mobile keyboard [...] users", but you can imagine the inverse, where an improvement to the voice control UX unintentionally degrades the screenreader UX. Accessibility is full of these tensions, I worry an multi-agent approach would result in either the agents or you getting bogged down by them.

I think a solution needs to incorporate some heuristics, if you consider WCAG a heuristic. For all its faults, a lot of thought went into rules that balance the tensions reasonably well. I used to be more of the "forget WCAG, just show me the UX" attitude, but over the years I've come to appreciate the baseline it sets. To the example above, 2.5.3 Label in Name clearly guides you towards setting an accessible name (not description) on the search itself, patching up support for screenreaders and voice control.

Not that WCAG concerns itself with the details of ARIA (it packs all that complexity into the vague "accessibility supported"[1]). We do need more seamless ways of quickly evaluating whether ARIA or whatever markup pattern has the intended rendering in screen readers, voice control, etc, but at a level that's already constrained. In the example, WCAG should apply its constraints first. Only then should we start running real screen readers and analyzing their audio, and to avoid the footguns that analysis should be at a low level ("does the audio rendering of this control reasonably convey the expected name, state, value, etc?"), not a high level ("does the audio rendering contain the info necessary to move to the next step?").

Unfortunately both agents and heuristic-driven accessibility scanning tools struggle to apply WCAG today. Agents can go deeper than scanners, but they're inconsistent and in my experience really have trouble keeping 55+ high level rules in mind all the time. In the example, an agent would need to use the accessibility tree to accomplish its goal and need to reject a node with label "Search" containing a role=textbox as an option for searching (or at least flag it), which is made trickier by the fact that sometimes it _is_ ok to use container labels to understand context.

I think the answer might be to bundle a lot of those concerns into an E2E test framework, have the agent write a test it thinks can accomplish the goal, and enter a debug loop to shake out issues progressively. Ex: if the agent needs to select the search control for the task, and the only allowed selector syntax requires specifying the control's accessible name (i.e. Playwright's getByRole()), will the agent correctly get blocked? And say the control does have an accessible name, but has some ARIA misconfigurations — can the test framework automatically run against some real screenreaders and report an issue that things aren't working as expected? We've done some explorations on the framework side you might be interested in [2].

[1] https://www.w3.org/TR/WCAG22/#dfn-accessibility-supported [2] https://assistivlabs.com/articles/automating-screen-readers-...

api · 2025-12-09T12:45:42 1765284342

What about just AI assisted accessibility? Like stop requiring apps to do anything at all. The AI visually parses the app UI for the user, explains it, and interacts.

Accessible is an also-have at best for the vast majority of software. This would open a lot more software to blind users than is currently available.

simonw · 2025-12-09T12:47:20 1765284440

That's expensive, slow (listen to a screenreader user some time to see how quickly they operate) and likely only works online.

I'm also not going to shirk my responsibilities as a developer based on a hope that the assistive tech will improve.

api · 2025-12-09T12:55:10 1765284910

It’s expensive for now, slow for now, online for now, but it’s pretty clear that this is the future. If I were blind I’d want it to go here since it would just unlock so much more. Very little software or sites have good accessibility. Open source and indie stuff often has none.

A custom local model trained only for this task seems like a possibility, and could be way smaller than some general purpose model being instructed for this task. I’m thinking screen reader and UI assist only. Could probably be like a 7B quantized model. Maybe smaller.

throawayonthe · 2025-12-09T11:15:24 1765278924

can we just hire disabled people as testers please

simonw · 2025-12-09T11:26:28 1765279588

How about both?