Hacker News | onnodigcomplex's comments

What would you say is a good model for audio to midi transcription?


Lilypond (a TeX-inspired music markup language with Scheme integration) is also noteworthy!


Lilypond had such beautiful output, though the amount of time since I have used it can be measured in years--is it still setting the typesetting bar above Finale, Sibelius, etc?


I once spent more time than I want to admit constructing the best possible 10-square in Dutch[1]. There are also some other forms that are fun, like cubes[2], or hypercubes. Or using bigrams. Or hexagonal tiles. And so forth.

Finding solutions for big squares/wordlists in reasonable time is actually not a trivial algorithmic problem. Neither is making good wordlists; I ended up creating what is essentially a Dutch version of the Pile[3] just to collect words. Good fun.

[1] https://old.reddit.com/r/thenetherlands/comments/zgr61e/ik_h... [2] https://i.redd.it/a61yuistbkm61.png [3] http://gigacorpus.nl/
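To illustrate why the search is non-trivial, here is a minimal backtracking sketch. It assumes a symmetric word square (row i equals column i), which may not be the exact form used in the linked post, and the names `build_prefix_index` and `word_squares` are hypothetical. A prefix index prunes partial squares as soon as some remaining column can no longer be completed.

```python
from collections import defaultdict

def build_prefix_index(words, n):
    """Map every prefix (including the empty one) to the n-letter words starting with it."""
    index = defaultdict(list)
    for word in words:
        if len(word) == n:
            for i in range(n + 1):
                index[word[:i]].append(word)
    return index

def word_squares(words, n):
    """Yield every n x n symmetric word square (row i equals column i)."""
    index = build_prefix_index(words, n)

    def extend(rows):
        k = len(rows)
        if k == n:
            yield list(rows)
            return
        # Row k must start with the k-th letters of the rows chosen so far.
        prefix = "".join(row[k] for row in rows)
        for candidate in index.get(prefix, []):
            rows.append(candidate)
            # Prune: every later column must still be the prefix of some word.
            if all("".join(r[j] for r in rows) in index for j in range(k + 1, n)):
                yield from extend(rows)
            rows.pop()

    yield from extend([])
```

Even with this pruning the search space grows very quickly with n and with the size of the wordlist, so a real 10-square search would likely need further tricks beyond this sketch.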


Relevant username! How much effort did it take you?


I spent months collecting the dataset to create a wordlist. And I also spent 100+ hours just judging squares and words. And eventually also training char-level LLMs to do that job for me. In Dutch you can compound and there is quite an active morphology so it can be really tricky to judge whether a given word is any good.

Since I worked on it on and off, and also did a bunch of other related things, I'm not sure, but I would be surprised if I spent any less than 1000 hours on these silly word-games.


It's a fun programming/puzzle challenge. I developed a perfect pangram in Dutch about a teacher's bike that is very lightweight and fast, but not so strong:

Jufs BMX: hypervlot c.q. zwak ding
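A perfect pangram uses each of the 26 letters exactly once, which is easy to check mechanically. The sketch below assumes the plain a-z alphabet and ignores case, spaces and punctuation; the function names are illustrative.

```python
import string
from collections import Counter

def letter_counts(text):
    """Count ASCII letters a-z, ignoring case, spaces and punctuation."""
    return Counter(ch for ch in text.lower() if ch in string.ascii_lowercase)

def is_perfect_pangram(text):
    """True if every letter a-z occurs exactly once."""
    counts = letter_counts(text)
    return len(counts) == 26 and all(c == 1 for c in counts.values())

print(is_perfect_pangram("Jufs BMX: hypervlot c.q. zwak ding"))
```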


"C.Q." is an abbreviation that could mean different things depending on the context. It is not a widely recognized or commonly used abbreviation, and its meaning is not immediately clear. Can you provide some more information or context about it?


It is common in Dutch, short for casu quo (latin).

"A c.q. B" means something like "A or otherwise B" (example translated from Dutch wikipedia page)

https://nl.wikipedia.org/wiki/Casu_quo


sorry didn't see anything in english on that page.

it's strange how a language that absorbs everything it comes across didn't bag this, perhaps there is something else in english that does the same job?


In amateur radio, especially CW/morse code, CQ means "looking for someone to talk to". It's an abbreviated homophone for "seek you".

If you hear "CQ CQ CQ de WB6NOA" being transmitted, it would mean the person who owns call sign WB6NOA is looking to talk to someone on this frequency.


i did know that (in the back of my brain, and watching Ellie in "Contact") but don't see how it applies here.


Apologies, I meant to respond to another message.


In Dutch, c.q. is an abbreviation of the Latin term casu quo, meaning "or instead, alternatively" (lit. in which case). It doesn't really fit here because it suggests that vlot (quick) and zwak (weak) are interchangeable adjectives -- nor does it match the English usage of the same term, where it is used more like a premise/supposition rather than a conjunction.


Another popular one is: "Lynx c.q. vos prikt bh: dag zwemjuf!"


I'm not on my desktop, but I used nanoGPT extensively on my RTX-4090. I trained a GPT2-small but with a small context window (125M -> 90M params) at batch sizes of 1536 using gradient accumulation (3*512). This runs at just a smidge over 1 it/s. Some notes:

- Gradient checkpointing and 2048 batch size in a single go allows ~10% performance improvement on a per sample basis.

- torch.compile doesn't work for me yet (the lowest CUDA version I got my 4090 to run on was 11.8, but the highest CUDA version on which I got the model to compile is 11.7).

- I did the optimizations in https://arxiv.org/abs/2212.14034
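For readers unfamiliar with gradient accumulation: an effective batch of 1536 is built from 3 micro-batches of 512 by averaging their gradients before a single optimizer step. The toy below is not nanoGPT code; it is a pure-Python sketch with a hypothetical 1-D model and synthetic data, only meant to show that the averaged micro-batch gradients match the full-batch gradient when the micro-batches are equal in size.

```python
def mse_grad(w, batch):
    """Gradient of mean squared error for the 1-D model y = w * x over a batch."""
    return sum(2.0 * (w * x - y) * x for x, y in batch) / len(batch)

def accumulated_grad(w, data, micro_batch_size):
    """Average per-micro-batch gradients, as gradient accumulation does."""
    assert len(data) % micro_batch_size == 0, "equal-sized micro-batches assumed"
    steps = len(data) // micro_batch_size
    total = 0.0
    for i in range(steps):
        micro = data[i * micro_batch_size:(i + 1) * micro_batch_size]
        # Scale each micro-batch gradient by 1/steps so the sum is a mean.
        total += mse_grad(w, micro) / steps
    return total

# Synthetic data: 1536 samples, split into 3 micro-batches of 512.
data = [(float(i % 7), 3.0 * float(i % 7) + 1.0) for i in range(1536)]
full = mse_grad(0.5, data)
accum = accumulated_grad(0.5, data, 512)
print(abs(full - accum) < 1e-9)
```

The trade-off is memory versus wall-clock time: only one micro-batch of activations is held at once, at the cost of several forward/backward passes per optimizer step.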


Can you share the code and numbers so that I could compare directly with my 3090?

Do you train in fp16/bf16?

Have you tried fp8?


I have a workflow where I do my composing in Musescore, export to musicxml & convert with musicxml2ly to Lilypond, and finally do the typesetting there.

I basically throw away everything in the generated .ly files except the notes. But saves a lot of effort regardless.

I never found a Lilypond -> Musescore conversion that worked really well. I did write a very simple tool based on python-ly to at least get the 'notes' to musicxml so that I can import something into Musescore.


> I basically throw away everything in the generated .ly files except the notes. But saves a lot of effort regardless.

That seems like a good workflow to me for the Musescore→LilyPond direction, and speaks to the incompatibility in approaches that makes a more seamless integration less likely: LilyPond, as a typesetting engine, is essentially incompatible with a WYSIWYG workflow.

I did a bunch of engraving (probably around 100 works) when I ran a music publishing business. Many of the scores were organ and/or choral works, which are quite complex to engrave (especially compared to solo instrumental pieces). My workflow was to get the notes, words, articulations, etc. into LilyPond with no formatting tweaks, see what it gave me, and then tweak from there. But the defaults provided by the engine were almost always a very good starting point.

I would think that the other direction would be much easier, though, since that's basically the same workflow as copying plain text into Microsoft Word (albeit music notation is more complex than plain text).


Two Greg Egan stories come to mind.

Wang's Carpets is a short story about life evolving within a simulation of a naturally occurring computer (also part of the larger novel Diaspora).

Schild's Ladder is about physicists researching the fundamental 'geometry' of the universe and accidentally creating a quickly expanding geometry that is more stable than a regular vacuum, very unhealthy for all regular matter, but...


To add to all the resources here, Alan Belkin is a composer with many very in-depth and practical videos for fellow composers: https://www.youtube.com/channel/UCUQ0TcIbY_VEk_KC406pRpg is one of the best theory YouTube channels.

And https://musescore.com/groups/counterpoint-and-fugue is one of those well hidden small internet communities centered around contrapuntal writing with many knowledgeable members, original music and in depth essays about contrapuntal details.


If I open 2 tabs of https://www.szynalski.com/tone-generator/ and then listen to 440+880, and then change 880 to 850 it is a world of difference. I would definitely describe that difference as dissonance and consonance.
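The same comparison can be reproduced offline. The sketch below uses only the Python standard library to render two equal-amplitude sine tones into 16-bit mono WAV bytes; the function name `two_tone_wav` and the amplitude scaling are illustrative choices, not anything the tone generator site is known to do.

```python
import io
import math
import struct
import wave

def two_tone_wav(f1, f2, seconds=1.0, rate=44100):
    """Return WAV bytes (16-bit mono) of two equal-amplitude sine tones."""
    n = int(seconds * rate)
    frames = bytearray()
    for t in range(n):
        s = 0.5 * (math.sin(2 * math.pi * f1 * t / rate) +
                   math.sin(2 * math.pi * f2 * t / rate))
        # Scale to the 16-bit range with some headroom to avoid clipping.
        frames += struct.pack("<h", int(s * 0.9 * 32767))
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(rate)
        w.writeframes(bytes(frames))
    return buf.getvalue()
```

Writing `two_tone_wav(440, 880)` and `two_tone_wav(440, 850)` to two files and playing them back to back gives the octave and the detuned pair described above.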

Now the overtone series IS important and is not always 'simple ratios'; a good example in a real instrument is the strong minor third overtone of a carillon, and as expected writing in major for that instrument is hard.


Thank you for that link. I never thought of my browser as a test bench before. (Of course, now I want a DVM, function generator, scope, logic analyser, spectrum analyzer and all the other goodies ;) ).


I'm not sure if this is the same thing as consonance/dissonance, but the graph of sin(x) + sin(2x), an octave, is regular and pretty and the graph of sin(x) + sin(sqrt(2)x), a tritone, is much less so.

https://www.wolframalpha.com/input/?i=sin%28x%29+%2B+sin%282...

https://www.wolframalpha.com/input/?i=sin%28x%29+%2B+sin%28s...


If you plot them as XY it's even more obvious which one is perfect consonance and which one dissonance:

https://www.wolframalpha.com/input/?i=%7B+x+%3D+sin%28t%29%3...

https://www.wolframalpha.com/input/?i=%7B+x+%3D+sin%28t%29%3...


Except that if you use a frequency ratio of 2.01, i.e. sin(x)+sin(2.01 x), which is really very close to an octave and really sounds just as consonant as an octave to almost all people, you get almost the same "dissonant" picture:

https://www.wolframalpha.com/input/?i=%7B+x+%3D+sin%28t%29%3...

The strange thing is, none of these "simple ratio" theories account for the fact that our brains allow a lot of "fuzziness" around these simple ratios, so much that you can't really call them simple ratios as they encompass a whole bunch of not-simple ratios as well.
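The sensitivity of the picture to tiny detuning can be made concrete. For a rational ratio r = p/q in lowest terms, sin(x) + sin(r*x) first repeats after q cycles of sin(x): one cycle for an octave, but 100 cycles for 2.01 = 201/100, and never for an irrational ratio such as sqrt(2). The sketch below (hypothetical helper names) computes q and checks the claim numerically; ratios are passed as strings so that `Fraction` sees the exact decimal rather than a binary float.

```python
import math
from fractions import Fraction

def cycles_until_repeat(ratio):
    """For sin(x) + sin(r*x) with rational r = p/q in lowest terms,
    the sum first repeats after q cycles of sin(x), i.e. at x = 2*pi*q."""
    return Fraction(ratio).denominator

def repeats_at(ratio, cycles, x=0.3):
    """Numerically check that the waveform has the same value `cycles` periods of sin(x) later."""
    r = float(Fraction(ratio))
    f = lambda t: math.sin(t) + math.sin(r * t)
    return math.isclose(f(x), f(x + 2 * math.pi * cycles), abs_tol=1e-9)

print(cycles_until_repeat("2"), cycles_until_repeat("2.01"))
```

So the visual regularity of the plot changes drastically for a change of half a percent in frequency, while the ear barely notices, which is the mismatch the comment above points out.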


That sine wave sounds a bit "fuzzy" to me, maybe the generator adds a small amount of overtones or aliasing. I tried another generator (http://onlinetonegenerator.com/) and the consonance feels weaker.


Interesting, I still hear it the same, dissonant and consonant, perhaps western music ruined me. Thanks for sharing.

Edit: Didn't see the url, makes my old reply obsolete:

Interesting. I tried to avoid clipping/aliasing by using audacity with as high quality audio as my system allows and I can still reproduce pretty much exactly what you hear on those websites. https://vocaroo.com/i/s0Be5CexLgVs is 440hz, then 440hz+880hz, then 440hz+850hz. But I would be interested in any repeatable signal that does not harmonize at all so do share!


You're right and I was overconfident. I tried some more, and the sense of consonance is weak but still there. Amended the comment.


Suckerpinch (Tom) recently had a video[1] about predicting the position of the opponent's pieces from knowing just white. I don't think he's aware of this. Great video anyway.

He trained an NN to predict the pieces and then used Stockfish on the resulting position.

[1] https://www.youtube.com/watch?v=DpXy041BIlA


Yeah, the "pawn tries" rule in particular reminded me a lot of his "blind spycheck" algorithm in that video. If you are allowed to rank every move and perform the first legal one, you can get away with some interesting stuff.
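The "rank every move and perform the first legal one" idea can be sketched in a few lines. This is only an illustration under assumed names: `is_legal` stands in for whatever referee decides legality, and the move strings are made up.

```python
def first_legal(ranked_moves, is_legal):
    """Return the highest-ranked move accepted by the referee, or None if none is legal."""
    for move in ranked_moves:
        if is_legal(move):
            return move
    return None

# Hypothetical referee that knows the true position; the player only ranks guesses.
legal_now = {"e4", "Nf3"}
print(first_legal(["exd5", "e4", "Nf3"], legal_now.__contains__))
```

The player never needs to know why a move was rejected; putting speculative captures first in the ranking is what lets the "interesting stuff" happen.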

