I’m almost sure it’s incorrect usage of D3D. I’ve encountered and fixed similar bugs in my 3D graphics code.
Based on the symptoms and how hard it is to reproduce, I think what you see in the screenshots are incomplete renders. The FF code submitted some draw calls to the GPU, and copied the render target texture without waiting for the draw calls to complete. A good way to fix it is to use a D3D11_QUERY_EVENT query to wait for completion of rendering.
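For reference, a minimal sketch of that wait, assuming you already have the device and its immediate context. `waitForGpuIdle` is a made-up helper name, not something from the FF code base, and the `_WIN32` guard is only there so the snippet compiles to nothing on other platforms:

```cpp
#ifdef _WIN32  // D3D11 is Windows-only
#include <windows.h>
#include <d3d11.h>

// Hypothetical helper: blocks the calling thread until the GPU has finished
// every command submitted on `ctx` before this call. Returns false on failure.
inline bool waitForGpuIdle(ID3D11Device* device, ID3D11DeviceContext* ctx)
{
    D3D11_QUERY_DESC desc = {};
    desc.Query = D3D11_QUERY_EVENT;

    ID3D11Query* query = nullptr;
    if (FAILED(device->CreateQuery(&desc, &query)))
        return false;

    // Event queries have no Begin(); End() puts the marker into the command stream.
    ctx->End(query);

    BOOL done = FALSE;
    HRESULT hr;
    // S_FALSE means "not ready yet". Passing 0 as the last argument (instead of
    // D3D11_ASYNC_GETDATA_DONOTFLUSH) lets the runtime flush pending commands,
    // so the loop cannot spin forever on an unsubmitted marker.
    while ((hr = ctx->GetData(query, &done, sizeof(done), 0)) == S_FALSE)
        Sleep(0);  // yield the rest of the time slice

    query->Release();
    return hr == S_OK && done == TRUE;
}
#endif
```

You'd call it after the draw calls and before handing the texture to the other API. It stalls the CPU, so it's a correctness fix rather than a free one.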
If all GPU access happens through D3D11 this shouldn’t happen, i.e. the API guarantees to wait for the completion of previously submitted draw calls. It may happen in practice when mixing multiple GPU APIs, e.g. when using DXGI surface sharing to pass textures between D3D11 and DX9. It may also happen when using D3D from multiple threads. An easy way to detect the latter: set the D3D11_CREATE_DEVICE_DEBUG flag when creating the device, and read the warnings in the debug output of the process.
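Enabling the debug layer is a single flag at device creation. A sketch, with `createDebugDevice` as a hypothetical helper name and default adapter and feature levels assumed; the call fails when the SDK debug layers aren't installed on the machine, so it's something for dev builds only:

```cpp
#ifdef _WIN32  // D3D11 is Windows-only
#include <windows.h>
#include <d3d11.h>
#pragma comment(lib, "d3d11.lib")  // MSVC-only way to link against d3d11.lib

// Hypothetical helper: creates a hardware device with the debug layer enabled.
// On failure the output pointers are left as the caller initialized them.
inline HRESULT createDebugDevice(ID3D11Device** device, ID3D11DeviceContext** ctx)
{
    const UINT flags = D3D11_CREATE_DEVICE_DEBUG;
    return D3D11CreateDevice(
        nullptr,                   // default adapter
        D3D_DRIVER_TYPE_HARDWARE,
        nullptr,                   // no software rasterizer module
        flags,
        nullptr, 0,                // default feature level list
        D3D11_SDK_VERSION,
        device,
        nullptr,                   // don't care which feature level was picked
        ctx);
}
#endif
```

With the flag set, the runtime's messages go to the debugger output, so run the process under a debugger (or a viewer such as DebugView) to see them.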
Another possibility is just bugs in the rendering code. To troubleshoot them, https://renderdoc.org/ is awesome. Unfortunately, if FF doesn’t use D3D to present and only renders to textures, some changes to FF’s source code are required to capture frames with RD, see this page for details: https://renderdoc.org/docs/in_application_api.html
Firefox + WebRender works great in RenderDoc, either capturing the D3D11 commands generated by ANGLE or forcing OpenGL. The D3D11 commands and shaders correlate pretty well with the GL commands emitted by WebRender anyway. This tool is absolutely fantastic.
Bugs are usually quickly fixed once someone on the graphics team manages to reproduce them (in a renderdoc capture).
Unfortunately nobody has managed to reproduce it yet, even though the issue has been on the team's radar for a while.
The problem with using RenderDoc is that it doesn't capture ANGLE GL function usage, only the underlying D3D calls. So you have to correlate the high-level GL API calls with whatever ANGLE is lowering them to yourself.
It would be nice to have a native D3D11 backend (and D3D12, and Metal, and Vulkan) someday, but that day isn't today. gfx-rs, or wgpu-rs, looks promising as an abstraction layer over all of these APIs.
Have you spoken to baldurk about this? Since RenderDoc has GL support and it captures with Detours, in theory with some work it should be able to capture at a level above ANGLE. Not sure if that'd be helpful, but I expect maybe.
Alternatively, could you emit markers at or above the ANGLE level to provide an easier-to-understand RenderDoc trace?
You can compile ANGLE with ANGLE_ENABLE_DEBUG_TRACE=1 to get a log of all GLES calls to ANGLE's frontend. It's not the same level of detail as RenderDoc, but it's something.
It's open source under the MIT license. I wouldn't expect too many issues supporting such a use case, given RD's support for normal OS-supplied GLES. I would only expect complications if you try capturing both layers at the same time: GLES between your app and ANGLE, and D3D between ANGLE and d3d11.dll.
Makes sense that a timing-sensitive bug is hard to repro too. I have definitely seen similar artifacts come and go in Chrome over the years as well; it seems to be a common problem in browser D3D implementations.
A few years ago a customer reported a bug in my software which only happened with a specific 3D scene, on specific GPUs (my desktop rendered that scene fine), and only rarely. That’s how I learned about the GPU synchronization issue across graphics APIs, which may cause incomplete renders.
This is interesting, as WebRender does not use D3D but OpenGL, so the parts of Firefox that use D3D might interact badly with it.
There is work to port WebRender to gfx-rs, which abstracts over Vulkan, Metal and modern DirectX, but it seems to live only in a fork maintained by the University of Szeged in Hungary for now: https://github.com/szeged/webrender/issues/198
i was also wondering about the gfx-rs port. i've been watching that repo for over a year, and it's quite impossible to tell how far along things actually are today.
Note that use of code formatting disables these links on HN, as well as making copy paste of them difficult on mobile. Please avoid using code formatting (space-indented blocks) for non-code.
I saw the bug on Ubuntu 18 this morning, on FF 72.0.2 on Nvidia 435.21-0ubuntu0.18.04.2. Can't reproduce. Assumed it was because I have 2x 4K screens plus another 1080p one and I often get weird graphics issues with this setup.
I don't know if it's the same bug but I've certainly seen a similar bug (weird glitches in display that disappear when you scroll) on Fedora with AMD graphics (open source amdgpu driver, X), running Firefox nightly. One screen only.
I have no idea how exactly FF uses the GPU. However, all 3D APIs, especially modern ones, are just ways to access the functionality of the underlying GPU hardware, which doesn't depend on the OS.
If FF uses Vulkan, the direct equivalent of that query is probably VkFence, but unlike D3D11, VK has other ways too: subpasses and VkSubpassDependency, and maybe something else I forgot.