Thanks! Good work! It's of much more benefit to analyze slow PDFs than to construct "proofs" that short and simple PDFs are displayed "fast enough" (though I understand that the latter feels so good). Especially since it seems that not only pdf.js will benefit from analyzing the slow ones -- if I understand correctly, you traced the problems to the C++ code of the browser? That would mean that even non-pdf.js content will be faster once it's fixed, is that correct?
Yes, the fix to bug 1007897 will help all web content.
That profile will show calls from JS and how they call into native C++ code. This is very useful for profiling things like canvas.
PDF.js, like many well-tuned apps out there, isn't bottlenecked by JavaScript performance, so there are a lot of improvements that can be made by tweaking the web platform.
Does your profile explain why large PDFs are so laggy when scrolling in PDF.js? For example, take
http://www.math.mtu.edu/~msgocken/pdebook2/mapletut2.pdf
open it in FF and hold Page Down or Page Up. On my laptop, that will lock up FF. Yet if I open it in Acrobat Reader or Chrome, I can scroll up and down much faster without the jerky behavior.
Is it even possible for a JavaScript app in FF to get the kind of performance Google gets with their PDF plugin? With the power of today's PCs, it seems like something is seriously wrong with "web technology" if my machine struggles to render PDF documents.
The first PDF.js document uses the DOM, instead of canvas like the previous one. This is done to support text selection. Most of the time is spent in the style system. I don't know that area very well, but in the past I've seen simplifying CSS selectors make all the difference. I know a fairly important problem for B2G is speeding up expensive style flushes (https://bugzilla.mozilla.org/show_bug.cgi?id=931668), but I don't know enough about CSS to know if that fix will solve the problem here.
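For anyone who hasn't run into style flushes before, here is a generic illustration (hypothetical code, not taken from PDF.js) of why interleaving layout reads and writes is expensive, and how batching avoids repeated synchronous flushes:

    // Slow: each offsetHeight read forces the engine to flush pending
    // style/layout changes, and each style write dirties layout again,
    // so the loop pays for one full flush per element.
    function resizeSlow(divs) {
      for (var i = 0; i < divs.length; i++) {
        var h = divs[i].offsetHeight;               // read: forces a flush
        divs[i].style.height = (h + 10) + "px";     // write: invalidates layout
      }
    }

    // Faster: do all the reads first, then all the writes,
    // so the engine only has to flush layout once.
    function resizeFast(divs) {
      var heights = [];
      for (var i = 0; i < divs.length; i++)
        heights[i] = divs[i].offsetHeight;          // all reads first
      for (var j = 0; j < divs.length; j++)
        divs[j].style.height = (heights[j] + 10) + "px"; // then all writes
    }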
A little OT, but since you're so good at profiling Firefox, I have one more interesting "a lot of real work" page that may inspire you or somebody you know:
It emulates in JavaScript an x86 PC and the necessary hardware to really boot Linux 2.6.20(!). On my computer, Opera 12.17 shows "booted in 2.8 seconds," whereas Firefox 29 shows "booted in 7.9 seconds." That's 2.8 times slower.
Looks like getaliasedvar is causing excessive bailouts from Ion (the 3rd-tier JIT). On top of that, the platform is trying to synchronously cancel background Ion compilation, and that is taking an excessive amount of time. Reported as: https://bugzilla.mozilla.org/show_bug.cgi?id=1007927
Tweaking the functions listed in the profile so they avoid bailing out should drastically improve this test case.
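To illustrate what that kind of tweak might look like, here is a hypothetical sketch (not code from the actual emulator): a variable captured by a nested function is "aliased" in SpiderMonkey, i.e. stored in a scope object and accessed via getaliasedvar, so copying it into a plain local before a hot loop can avoid that access pattern:

    function makeCounter() {
      var total = 0;                  // captured by the inner function, so it is "aliased"
      return function step(n) {
        // Slow pattern: read/write `total` directly inside the hot loop.
        // Faster pattern: copy into a plain (unaliased) local, loop, write back once.
        var t = total;
        for (var i = 0; i < n; i++)
          t = (t + 1) | 0;            // |0 keeps the value int32, which also helps Ion avoid type bailouts
        total = t;
        return total;
      };
    }

    var counter = makeCounter();
    print(counter(1000000));          // 1000000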
And at the opposite end from Bellard's useful code, I've also observed that a simple loop which just sums doubles, like this
var s = 0.01;
for (var i = 0; i < 100000000; i++)
    s += 0.1;
print(s); // stands in for displaying the result to the user
became around twice as slow starting with some version of Firefox (of course, before that point there were a lot of speedups; very old FF can't be compared with the present state).
Still, the biggest problems I know of at the moment are really those PDFs that the architects produce.
That loop doesn't actually do anything, so benchmarking it is pretty much meaningless.
It's important to have benchmarks that aren't trivially converted to no-ops or constant loads by the compiler. (In practice the JIT might not be optimizing that one out, but an aggressive C++ compiler certainly would as long as fast math is enabled - so at some point, a typical JS JIT will too).
Also ensure that you're benchmarking warmed code that has been fully jitted. JS code (other than asm.js in Firefox) has multiple stages of compilation in modern runtimes, typically triggered based on how often it is called.
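A minimal sketch of what that looks like in practice, with `print` again standing in for whatever output call the environment provides (the iteration counts here are arbitrary):

    // Warm the function up so the JIT reaches its top tier, then time it,
    // and consume the result so the work can't be dead-code-eliminated.
    function sum() {
      var s = 0.01;
      for (var i = 0; i < 100000000; i++)
        s += 0.1;
      return s;
    }

    var sink = 0;
    for (var w = 0; w < 3; w++)
      sink += sum();                  // warm-up runs

    var t0 = Date.now();
    sink += sum();                    // timed run of fully jitted code
    var t1 = Date.now();
    print((t1 - t0) + " ms, result " + sink);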
You are wrong. The last line has the meaning of displaying the result to the user (you are supposed to implement that part; I'm lazy). The same goes for prior warm-up: I don't have to specify it here, I just show the loop. Because the result has to be shown, the browser is certainly not allowed to optimize the calculation away. Second, it's not allowed to replace it with a multiplication, as this is floating-point arithmetic and the binary representation of the constants involved is not "nice," and the same holds for the partial results too. Do compare the result with the multiplication to get the idea (10000000.01 vs 9999999.99112945). All the additions have to be performed one way or another between the loading of the JS and the displaying of the result.

So it is a good measure of the quality of the translation from JS to the machine code which does the actual calculation, and, being very simple, it can also easily point to unnecessary overheads. The regression I observed is therefore a real one, probably observable in other scenarios but harder to pinpoint, and probably avoidable, as the better results did exist once. (Of course, if it were part of some widely popular benchmark, cheats would probably be developed, but at the moment there aren't any. Once anybody implements a "we don't care for numerics" optimization, it of course should not be used anymore to assess the quality of JS.)
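To make that comparison concrete, a minimal sketch (with `print` once more standing in for output; the exact printed digits may vary by engine):

    // Summing 0.1 a hundred million times is not the same as one multiplication,
    // because every addition rounds its result to the nearest double.
    var s = 0.01;
    for (var i = 0; i < 100000000; i++)
      s += 0.1;
    print(s);                         // accumulated rounding error: ~9999999.99112945
    print(0.01 + 100000000 * 0.1);    // 10000000.01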
Please add that PDF to your test suites and consider it a worthy goal, as it really represents a lot of documents typical of users who produce complex plans. Have you tried it?
It's immediate in Adobe Reader (as in one second) but takes several minutes in pdf.js in Firefox 29.
Looks like it could be running much faster:
* 20% of the time is spent copying the canvas because someone, likely erroneously, is holding a reference to the canvas. Looking into it: https://bugzilla.mozilla.org/show_bug.cgi?id=1007897
* 10% of the time is spent waiting on display transaction swaps because canvas isn't triple buffered.
* PDF.js is not getting empty transactions (canvas draw optimizations).
That's just from a quick profile. I'm sure there are a ton more things that could be improved.