Hacker News | mfcc64's comments

It doesn't work for Arabic text.


Or for Chinese, Japanese, Hindi, Greek, Ethiopic, Thai, Korean, etc., etc. ... basically, for any text that uses characters other than the Latin and Cyrillic character sets that the Flow Circular font covers.

(Simple example: try https://www.unicode.org/standard/WhatIsUnicode-more.html and observe the list of language links on the left.)


Oh thanks, I have an idea about why that would be the case. Not sure if I'll be able to fix it, but logged it here: https://gitlab.com/vincenttunru/obfuscate/-/issues/6



Dear HN,

This is a simple benchmark to compare SIMD and non-SIMD WebAssembly performance using showcqt-js [1]. Please wait a moment until the benchmark completes, indicated by the avg entries (w = avg, h = avg, r = avg).

The result on my laptop:

    chrome:
    name =  reference, calc =     444 us, render =    2211 us, total =    2655 us
    name =   standard, calc =     450 us, render =    1692 us, total =    2141 us
    name =       simd, calc =     244 us, render =     484 us, total =     728 us
    firefox:
    name =  reference, calc =     508 us, render =    2522 us, total =    3030 us
    name =   standard, calc =     503 us, render =    1575 us, total =    2078 us
    name =       simd, calc =     310 us, render =     568 us, total =     878 us
[1] https://github.com/mfcc64/showcqt-js


For a website, you probably want to try this example: https://mfcc64.github.io/html5-showcqtbar/


The visualization engine is showcqt-js (https://github.com/mfcc64/showcqt-js). It is a JavaScript port of the ffmpeg showcqt filter (Show Constant Q Transform). A more accurate name would probably be Variable Q Transform.


I've implemented the Constant Q transform before, and it does work well for Western music. Are you using an autocorrelation to boost the fundamental? It can work well, particularly when there's just one note being played.


No, I don't use autocorrelation. Of course, it isn't pitch detection software.


It's a cool project. I'm just curious, because I haven't bumped into anyone else who's dabbled with the CQT before.

So in what way is it variable Q? I thought the innovation of CQT over a regular FFT is that the bins represent each note step with the same 'Q', making notes more distinguishable? (whereas an FFT has more 'Q' in the high frequencies)

I think having an autocorrelation option on the visualization could be cool, as it could reduce the spikes from the overtones (and also show a missing fundamental). But I think you'd need a different convolution for each instrument.
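For what it's worth, the constant-vs-growing 'Q' distinction can be sketched in a few lines of JavaScript (the bin count, start frequency, sample rate, and FFT size here are illustrative, not showcqt-js's actual settings):

```javascript
// CQT: geometrically spaced bin centers, e.g. 12 bins per octave from 55 Hz.
const binsPerOctave = 12;
const f0 = 55;
const fk = (k) => f0 * Math.pow(2, k / binsPerOctave);

// Q = center frequency / bin bandwidth. For geometric spacing it is the
// same for every bin: 1 / (2^(1/12) - 1), about 16.8.
const cqtQ = (k) => fk(k) / (fk(k + 1) - fk(k));

// FFT: linearly spaced bins, so Q grows proportionally with frequency.
const binWidth = 44100 / 4096; // sample rate / FFT size, about 10.8 Hz per bin
const fftQ = (f) => f / binWidth;
```

So an FFT resolves high notes far better than low ones, while the CQT gives every semitone the same relative resolution.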


linear STFT's window length in time domain = k (constant)

CQT's window length in time domain = k/f

showcqt-js window length in time domain = a * b / (a / c + b * f / (1 - c)) + a * b / (b * f / c + a / (1 - c))

where a = 384, b = 0.33, c = 0.17

I also apply an asymmetric window to reduce latency before doing the VQT.
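A minimal sketch of those three rules, using the constants above (units follow the formulas as written; the real showcqt-js scaling may differ):

```javascript
const a = 384, b = 0.33, c = 0.17;

// Linear STFT: window length independent of frequency.
const stftLen = (k) => k;

// Ideal CQT: window length inversely proportional to frequency,
// so it grows without bound as f -> 0.
const cqtLen = (k, f) => k / f;

// showcqt-js VQT: a smooth compromise. It approaches the constant b
// as f -> 0 (bounded latency) and behaves like a / f at high f.
const vqtLen = (f) =>
  a * b / (a / c + b * f / (1 - c)) +
  a * b / (b * f / c + a / (1 - c));
```

The low-frequency limit is exactly b, since the two terms sum to b*c + b*(1 - c); that is what keeps the window, and with it the latency, bounded on low notes.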

> I think having an autocorrelation option on the visualization could be cool, as it could reduce the spikes from the overtones (and also show a missing fundamental). But I think you'd need a different convolution for each instrument.

Research needed. Doing autocorrelation on simple, single-instrument monophonic audio is probably easy, but doing it on complex multi-instrument audio isn't. It probably needs some sort of machine learning.
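For the easy monophonic case, a naive time-domain sketch (not part of showcqt-js) looks like this:

```javascript
// Naive autocorrelation pitch-period estimate for a clean monophonic signal.
// Picks the lag that best correlates the signal with a shifted copy of itself;
// for a sum of harmonics it finds the common period even when the fundamental
// itself is missing. Breaks down on polyphonic / multi-instrument audio.
function estimatePeriod(samples, minLag, maxLag) {
  let bestLag = minLag;
  let bestCorr = -Infinity;
  for (let lag = minLag; lag <= maxLag; lag++) {
    let corr = 0;
    for (let i = 0; i + lag < samples.length; i++) {
      corr += samples[i] * samples[i + lag];
    }
    if (corr > bestCorr) {
      bestCorr = corr;
      bestLag = lag;
    }
  }
  return bestLag; // f0 is roughly sampleRate / bestLag
}
```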


> showcqt-js window length in time domain = a * b / (a / c + b * f / (1 - c)) + a * b / (b * f / c + a / (1 - c))

> where a = 384, b = 0.33, c = 0.17

So this is some sort of compromise to increase speed? Have you thought about implementing in WASM?

> Probably, it needs some sort of machine learning.

Agreed, but for a visualisation, it could just be a parameter for the user to mess with.


> So this is some sort of compromise to increase speed?

Partially. The main purpose is to increase time-domain accuracy at low frequencies. If you want to experiment with window length (in the time domain), use the ffmpeg showcqt filter (https://ffmpeg.org/ffmpeg-filters.html#showcqt). The showcqt-js window length is hardcoded to tc=0.33 attack=0.033 and tlength='st(0,0.17); 384tc / (384 / ld(0) + tcf /(1-ld(0))) + 384tc / (tcf / ld(0) + 384 /(1-ld(0)))' (https://github.com/mfcc64/mpv-scripts/blob/a0cd87eeb974a4602...).

> Have you thought about implementing in WASM?

It is implemented in WASM.

> Agreed, but for a visualisation, it could just be a parameter just for the user to mess with.

If you want to experiment with it, showcqt.js exposes the intermediate color data cqt.color[]. It can be modified arbitrarily, including with autocorrelation. If some day I experiment with it and find the result satisfying, maybe I'll include it in YouTube Musical Spectrum.
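A hypothetical sketch of that hook (the name cqt.color comes from the comment above, but its element layout here is an assumption; check the showcqt-js docs for the real shape):

```javascript
// Hypothetical post-processing hook: treat the exposed buffer as an
// array-like of per-bin values and apply a user-supplied transform in place
// before rendering. An autocorrelation pass could slot in the same way.
function postProcess(color, transform) {
  for (let i = 0; i < color.length; i++) {
    color[i] = transform(color[i], i);
  }
}

// e.g. a simple user-tweakable gain knob:
// postProcess(cqt.color, (v) => Math.min(1, v * 1.5));
```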


> The main purpose is to increase time domain accuracy on low frequency

I thought so. The main issue with regular CQT is latency at low frequencies due to the excessive window length required. That makes it unsuitable for real-time applications, as you know. Thanks for the insights.


Can you detect notes with missing fundamental?


Thanks :).


I'll be glad if someone ports it to Audacity.


> Why does the extension on Firefox block AV1 by default? It seems unrelated to the purpose of the extension.

Because AV1 decoding is slow, and on some low-end hardware it makes the animation stutter.

> Also have you considered coloring the spectrum according to the musical note? So every C is red, every E is green, every G is cyan, etc.

The color is used for left-right channel separation.


cnlohr's colorchord does something similar to what you mentioned.

https://github.com/cnlohr/colorchord


Thanks for the support.


Actually, you can use the mpv player and install visualizer.lua from the mpv-scripts repository. The engine of YouTube Musical Spectrum (showcqt-js) is a JavaScript port of the ffmpeg showcqt filter.


For any Arch-based users, I made an AUR entry for this:

https://aur.archlinux.org/packages/mpv-visualizer/

