You're assuming that GPT doesn't automatically build a phonological theory of the English language from the rhymes, lyrics, and linguistic texts that are fed into it.
We know experimentally that it doesn't... It can reproduce certain rhymes when it's generating a song, but the original GPT-3 was terrible at this, so I think they've fine-tuned the model specifically for lyrics or poetry.
I just asked ChatGPT: Do "not" and "knot" sound similar?
> "Not" and "knot" do not sound similar. They are two distinct words that are pronounced differently and have different meanings. "Not" is a negation word that is pronounced with a short, sharp "n" sound followed by a long "o" sound and a "t" sound. "Knot" is a noun that refers to a tangled mass of something, such as rope or thread, and is pronounced with a short "k" sound followed by a long "o" sound and a "t" sound. Despite their similar spellings, "not" and "knot" are not homophones, which are words that are pronounced the same but have different meanings.
> Yes, the words "not" and "knot" do sound similar when spoken aloud. They are pronounced the same way, with a short "o" sound and a long "t" sound. However, they have different meanings and spellings. "Not" is a negation or denial, while "knot" is a fastening made by looping a piece of cord, rope, or other material around itself or through itself and then pulling tight.
I tried a few times and got other answers, including one much like yours and another claiming the words were similar but not quite the same.
It's interesting, because even when it "knows" that two words are homophones, it doesn't want to believe they sound the same. I just asked it "Are 'knot' and 'not' homophones?" and it replied:
> Yes, "knot" and "not" are homophones. Homophones are words that are pronounced the same but have different meanings and spellings.
My immediate follow-up: "Do 'knot' and 'not' sound exactly the same?"
> No, "knot" and "not" do not sound exactly the same, although they are pronounced similarly. [...] "Knot" is pronounced with a short, sharp "k" sound at the beginning
I'd love to hear from someone with deep knowledge of how byte-pair encoding works in models like these. I agree that in theory it should be able to build a phonology from the amount of explicitly rhyming material in its training corpus, but for whatever reason it doesn't, or at least doesn't do so consistently.
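For what it's worth, here's a toy sketch of why BPE might get in the way. This is the classic frequency-based merge procedure (Sennrich-style BPE), not OpenAI's actual tokenizer, and the corpus and frequencies are made up for illustration:

```python
from collections import Counter

def get_pair_counts(vocab):
    """Count adjacent symbol pairs across the corpus, weighted by word frequency."""
    pairs = Counter()
    for word, freq in vocab.items():
        symbols = word.split()
        for a, b in zip(symbols, symbols[1:]):
            pairs[(a, b)] += freq
    return pairs

def merge_pair(pair, vocab):
    """Replace every occurrence of the chosen pair with a single merged symbol."""
    a, b = pair
    merged = {}
    for word, freq in vocab.items():
        symbols, out, i = word.split(), [], 0
        while i < len(symbols):
            if i < len(symbols) - 1 and (symbols[i], symbols[i + 1]) == pair:
                out.append(a + b)
                i += 2
            else:
                out.append(symbols[i])
                i += 1
        merged[" ".join(out)] = freq
    return merged

# Made-up corpus: characters separated by spaces, with an end-of-word marker.
corpus = {"n o t </w>": 50, "k n o t </w>": 5, "n o </w>": 40, "h o t </w>": 10}
for _ in range(3):
    pairs = get_pair_counts(corpus)
    if not pairs:
        break
    best = max(pairs, key=pairs.get)
    corpus = merge_pair(best, corpus)

print(corpus)
```

After three merges, "not" ends up as a single token while "knot" is split as a literal "k" prefix plus that token. The merges are driven purely by co-occurrence frequency; nothing in the procedure encodes that the "k" is silent, so the sound relationship between the two words never enters the token vocabulary at all.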
I've spent a long time testing this in ChatGPT, and no matter what I do it still gives results like this (paraphrasing here because it's down right now):
>What words rhyme with coffee?
>
> doff
> happy
> toffee
> snuff
> duff
But how would it? Many poems don't rhyme, and there's no outward way to tell which ones do. And to learn anything from linguistic texts it would need to know the phonetic alphabet, which I assume it doesn't.