Or for Chinese, Japanese, Hindi, Greek, Ethiopic, Thai, Korean, and so on ... basically, for any text that uses characters outside the Latin and Cyrillic character sets that the Flow Circular font covers.
This is a simple benchmark comparing SIMD and non-SIMD WebAssembly performance using showcqt-js [1]. Please wait a moment until the benchmark completes, indicated by the avg entries (w = avg, h = avg, r = avg).
The result on my laptop:
chrome:
name = reference, calc = 444 us, render = 2211 us, total = 2655 us
name = standard, calc = 450 us, render = 1692 us, total = 2141 us
name = simd, calc = 244 us, render = 484 us, total = 728 us
firefox:
name = reference, calc = 508 us, render = 2522 us, total = 3030 us
name = standard, calc = 503 us, render = 1575 us, total = 2078 us
name = simd, calc = 310 us, render = 568 us, total = 878 us
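The averaging behind numbers like these can be sketched roughly as below. This is my own minimal illustration, not the benchmark's actual code; `work()` is a stand-in for showcqt-js's calc/render calls.

```javascript
// Minimal sketch: time a workload repeatedly and report the average in us.
function benchmark(name, work, iterations = 100) {
    let totalMs = 0;
    for (let i = 0; i < iterations; i++) {
        const t0 = performance.now();
        work();
        totalMs += performance.now() - t0;
    }
    return { name, avgUs: (totalMs / iterations) * 1000 }; // ms -> us
}

// Stand-in workload: sum a buffer (the real benchmark runs calc/render).
const buf = new Float32Array(4096).fill(1);
const result = benchmark("standard", () => buf.reduce((a, b) => a + b, 0));
```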
The visualization engine is showcqt-js (https://github.com/mfcc64/showcqt-js). It is a JavaScript port of ffmpeg's showcqt filter (Show Constant Q Transform). A more correct name would probably be Variable Q Transform.
I've implemented the Constant Q Transform before, and it does work well for Western music. Are you using autocorrelation to boost the fundamental? It can work well, particularly when there's just one note being played.
It's a cool project. I'm just curious, because I haven't bumped into anyone else who's dabbled with the CQT before.
So in what way is it variable Q? I thought the innovation of CQT over a regular FFT is that the bins represent each note step with the same 'Q', making notes more distinguishable? (whereas an FFT has more 'Q' in the high frequencies)
I think having an autocorrelation option on the visualization could be cool, as it could reduce the spikes from the overtones (and also show a missing fundamental). But I think you'd need a different convolution for each instrument.
linear STFT's window length in time domain = k (constant)
CQT's window length in time domain = k / f
showcqt-js window length in time domain = a * b / (a / c + b * f / (1 - c)) + a * b / (b * f / c + a / (1 - c))
where a = 384, b = 0.33, c = 0.17
I also apply an asymmetric window to reduce latency before doing the VQT.
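The window lengths above can be sketched as below (my reading of the formulas, with the stated constants a = 384, b = 0.33, c = 0.17):

```javascript
// Constants as given in the thread.
const a = 384, b = 0.33, c = 0.17;

// Plain CQT: window length grows as 1/f, so low frequencies need very long windows.
const cqtLength = (f, k = a * b) => k / f;

// showcqt-js variant: the sum of two saturating terms.
function showcqtLength(f) {
    return a * b / (a / c + b * f / (1 - c))
         + a * b / (b * f / c + a / (1 - c));
}
```

One property worth noting: as f → 0 the first term tends to b\*c and the second to b\*(1 − c), so the window length is capped at b = 0.33 s instead of diverging like k/f, which is what gives the bounded low-frequency latency discussed further down.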
> I think having an autocorrelation option on the visualization could be cool, as it could reduce the spikes from the overtones (and also show a missing fundamental). But I think you'd need a different convolution for each instrument.
Research is needed. Doing autocorrelation on simple, single-instrument monophonic audio is probably easy, but doing it on complex multi-instrument audio isn't. It probably needs some sort of machine learning.
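For the easy monophonic case, a naive autocorrelation pitch estimate looks something like this (illustrative only; as noted above, multi-instrument audio needs far more):

```javascript
// Pick the lag that maximizes the autocorrelation of the signal.
// The best lag approximates the fundamental period in samples.
function autocorrelate(samples, minLag, maxLag) {
    let bestLag = minLag, bestScore = -Infinity;
    for (let lag = minLag; lag <= maxLag; lag++) {
        let score = 0;
        for (let i = 0; i + lag < samples.length; i++) {
            score += samples[i] * samples[i + lag];
        }
        if (score > bestScore) { bestScore = score; bestLag = lag; }
    }
    return bestLag;
}

// 440 Hz sine at 44100 Hz: expected period ~ 44100 / 440 ≈ 100 samples.
const sr = 44100, f0 = 440;
const sig = Float32Array.from({ length: 2048 }, (_, i) =>
    Math.sin(2 * Math.PI * f0 * i / sr));
const lag = autocorrelate(sig, 50, 400);
```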
> So this is some sort of compromise to increase speed?
Partially. The main purpose is to increase time-domain accuracy at low frequencies. If you want to experiment with window length (in the time domain), use the ffmpeg showcqt filter (https://ffmpeg.org/ffmpeg-filters.html#showcqt). The showcqt-js window length is hardcoded to tc=0.33, attack=0.033, and tlength='st(0,0.17); 384*tc / (384 / ld(0) + tc*f /(1-ld(0))) + 384*tc / (tc*f / ld(0) + 384 /(1-ld(0)))' (https://github.com/mfcc64/mpv-scripts/blob/a0cd87eeb974a4602...).
> Have you thought about implementing in WASM?
It is implemented in WASM.
> Agreed, but for a visualisation, it could just be a parameter just for the user to mess with.
If you want to experiment with it, showcqt.js exposes the intermediate color data cqt.color[]. It can be modified arbitrarily, including with autocorrelation. If some day I experiment with it and find the result satisfying, maybe I'll include it in YouTube Musical Spectrum.
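A post-processing pass over that intermediate data could look roughly like this. Note the assumptions: a plain Float32Array stands in for cqt.color[], and the per-component layout and gamma tweak are my own illustration, not showcqt-js's documented format.

```javascript
// Hypothetical post-processing pass: apply a gamma curve to each component
// of a color buffer. A plain Float32Array stands in for cqt.color[] here.
function postProcess(color, gamma = 0.7) {
    for (let i = 0; i < color.length; i++) {
        color[i] = Math.pow(Math.max(color[i], 0), gamma);
    }
    return color;
}

const color = Float32Array.from([0.0, 0.25, 1.0]);
postProcess(color); // brightens mid-range values in place
```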
> The main purpose is to increase time domain accuracy on low frequency
I thought so. The main issue with a regular CQT is latency at low frequencies due to the excessive window length required. That makes it unsuitable for real-time applications, as you know. Thanks for the insights.
Actually, you can use the mpv player and install visualizer.lua from the mpv-scripts repository. The engine of YouTube Musical Spectrum (showcqt-js) is a JavaScript port of the ffmpeg showcqt filter.