Ah neat. I'd forgotten I did something like this a few years back, using an ADS-B receiver on my roof and integrating it into a Pi-powered digital clock. https://www.behance.net/gallery/42580099/Pi-Clock
It's abandonware now, because it was a PITA to get the flight data from behind a clunky paywall, and an Echo Show could do voice-driven stuff more easily. Maybe I could revisit it now, put it on an e-ink display instead, and use the FlightAware or FR24 API...
I can see the use-case and potential for ML in extracting tables, but I'd be worried about the potential for decision-making mistakes in the environments the author identifies, such as finance.
The example of TableNet using deep learning for table extraction on top of Tesseract for OCR means two layers of ML, either of which could introduce errors on its own that go unnoticed without human oversight. It reminds me of the photocopier that changed numbers for you - https://www.theregister.co.uk/2013/08/06/xerox_copier_flaw_m...
If an ML engine were trained to do things like look for totals and sub-totals in numerical tables and flag errors in summation, then that would clearly add more value in parsing for moderation (the use-case described at the end). But that doesn't seem to be something that's yet... on the table.
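To make the summation check concrete, here's a rough sketch in Python of the kind of thing I mean. It isn't from the article: `check_total` and the sample figures are made up for illustration, and it assumes the extraction step has already handed you the line items and the stated total as strings.

```python
from decimal import Decimal

def check_total(values, stated_total, tolerance=Decimal("0.01")):
    """Return (ok, computed): ok is True if the values sum to the stated total."""
    # Decimal avoids binary floating-point rounding noise on currency-like figures.
    computed = sum((Decimal(v) for v in values), Decimal("0"))
    ok = abs(computed - Decimal(stated_total)) <= tolerance
    return ok, computed

# Hypothetical extracted column: three line items plus the total printed in the table.
items = ["120.50", "79.25", "300.00"]

# Total read correctly by the OCR step.
ok, computed = check_total(items, "499.75")

# Same table, but the total has one digit misread (9 read as 8).
bad_ok, bad_computed = check_total(items, "489.75")

print(ok, computed)          # consistent table
print(bad_ok, bad_computed)  # flagged for a human to look at
```

A check like this can't tell you which cell was misread, only that the table doesn't add up, but that's enough to route it to a person instead of trusting it blindly.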
It looks like it's not quite the same thing, in that it identifies Excel values that should be formulae. It could be used in a pipeline with spreadsheets extracted by ML/OCR to reverse-engineer formulae though, which is an interesting prospect.