Interestingly, this becomes impossible to do correctly because of the halting problem. For example, suppose a routine calls F in a loop for most of its work, then at the end takes a square root via sqrtf(). Clearly the number of calls matters for the edge weight in the call graph, but this tool would weight F and sqrtf equally.
I suppose you could approximate it by sampling; then you just look at the sample distribution. Though that would give you the graph weighted by cumulative execution time per routine, not call counts.
As they say though, never let perfect be the enemy of good. Neat idea.
This reaction is kind of like reading about a pre-election poll where the participants were selected by pulling them off 86th street at 11 am, and objecting that the percentages shouldn't be presented with more than two sig figs with a sample size of 100.
You're not, strictly speaking, wrong. But the methodology is already known to be deeply compromised, so your objection is kind of beside the point.
But this is about ranking code based on how central it is, for the purpose of choosing what to translate to TypeScript first. It's not a compromise; it's the correct approach. How often a line of code is executed is not relevant for this purpose.
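To make the distinction concrete, here is a minimal sketch of centrality-based ranking over a static call graph. The function names and edges are made up for illustration, and a real tool would extract them from source; the point is that the ordering comes from dependency structure alone, with call frequency deliberately ignored:

```python
# Hypothetical static call graph: caller -> set of callees.
# Edge multiplicity (how many times a call happens at runtime)
# is intentionally not represented.
call_graph = {
    "main": {"parse", "render"},
    "parse": {"tokenize", "util_log"},
    "render": {"layout", "util_log"},
    "tokenize": set(),
    "layout": {"util_log"},
    "util_log": set(),
}

def pagerank(graph, damping=0.85, iterations=50):
    """Plain power-iteration PageRank over the call graph."""
    nodes = list(graph)
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        new = {n: (1.0 - damping) / len(nodes) for n in nodes}
        for caller, callees in graph.items():
            if callees:
                share = damping * rank[caller] / len(callees)
                for callee in callees:
                    new[callee] += share
            else:
                # Leaf function: redistribute its rank uniformly.
                for n in nodes:
                    new[n] += damping * rank[caller] / len(nodes)
        rank = new
    return rank

ranks = pagerank(call_graph)
# Most-depended-upon functions first: translate these before
# their callers, so callers can target the TypeScript versions.
for name, score in sorted(ranks.items(), key=lambda kv: -kv[1]):
    print(f"{name:10s} {score:.3f}")
```

Here util_log comes out on top because three routines depend on it, even though nothing in the graph says how often it actually runs.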
I think you are mixing performance analysis with dependency analysis, which is the point of the project. The sampling you are describing is commonly done by tools called "sampling profilers".