With all the recent advancements in LLMs and transformers, has the goal of parsing natural language and representing it as an AST been achieved?
Or is this task still considered to be a hard one?
LLMs seem to understand text much better than any previous technology, so anaphora resolution, complex tenses, part-of-speech ambiguity, rare constructs, and cross-language boundaries no longer seem to be hard problems for them.
There are so many research papers published on LLMs and transformers now, covering all kinds of applications, but they're still not quite there.
An interesting example: I had a project where I needed to parse addresses and dates out of documents, but the address and date formats were not standardized across documents. Using an LLM was way easier than trying to regex or pattern-match across the text.
But if you're trying to take a text document and break it down into some sort of structured output, the results you get from an LLM will be much more variable.