We still don't have enough data, and people are still wasting their time trying to extend algorithms instead of making better training data.
I've worked on a dozen ML projects, two of them before AlexNet came out, and I've never gone wrong by spending 80% of my time creating a dataset specific to the problem and then using whatever algorithm is top dog right now.
Personally I am happy to use a model that isn't quite "top dog".
I have a classification task where I can train multiple models to do automated evaluation in about 3 minutes using BERT + classical ML. The models are consistently good.
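Concretely, the setup is roughly this shape: frozen BERT gives a vector per text, and a classical classifier is trained on top. The model name, the [CLS] pooling choice, and the load_my_dataset() helper below are placeholders, not my exact pipeline:

    # Rough sketch of "BERT + classical ML": frozen BERT embeddings as
    # features, a classical classifier on top, AUC as the metric.
    import numpy as np
    import torch
    from transformers import AutoTokenizer, AutoModel
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_val_score

    tok = AutoTokenizer.from_pretrained("bert-base-uncased")
    bert = AutoModel.from_pretrained("bert-base-uncased").eval()

    def embed(texts, batch_size=64):
        """Frozen [CLS] embeddings -- no gradients, no fine-tuning."""
        vecs = []
        for i in range(0, len(texts), batch_size):
            enc = tok(texts[i:i + batch_size], padding=True, truncation=True,
                      max_length=128, return_tensors="pt")
            with torch.no_grad():
                out = bert(**enc)
            vecs.append(out.last_hidden_state[:, 0, :].numpy())
        return np.vstack(vecs)

    texts, labels = load_my_dataset()   # hypothetical helper returning lists
    X, y = embed(texts), np.array(labels)
    clf = LogisticRegression(max_iter=1000)
    print(cross_val_score(clf, X, y, scoring="roc_auc", cv=5).mean())

Because the embeddings only have to be computed once, retraining the classical model on top is nearly free, which is where the rapid cycle time comes from.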
Sometimes you can do better by fine-tuning the BERT model on your training set, but a single round takes 30 minutes. The best fine-tuned models are about as good as my classical ML models, but the results are not consistent. I haven't developed a reliable training procedure, and if I did, it would probably take 3 hours or more because I'd have to train multiple models with different parameters.
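For comparison, a single fine-tuning round looks something like the following. The hyperparameters and names are illustrative, not what I actually run, and this is just one configuration out of the many you'd have to sweep:

    # Hedged sketch: one fine-tuning round with Hugging Face Transformers.
    from transformers import (AutoTokenizer, AutoModelForSequenceClassification,
                              TrainingArguments, Trainer)
    from datasets import Dataset

    tok = AutoTokenizer.from_pretrained("bert-base-uncased")
    model = AutoModelForSequenceClassification.from_pretrained(
        "bert-base-uncased", num_labels=2)

    texts, labels = load_my_dataset()   # same hypothetical helper as above
    ds = Dataset.from_dict({"text": texts, "label": labels})
    ds = ds.map(lambda b: tok(b["text"], truncation=True,
                              padding="max_length", max_length=128),
                batched=True, remove_columns=["text"])
    ds = ds.train_test_split(test_size=0.2, seed=0)

    args = TrainingArguments(output_dir="bert-ft", num_train_epochs=3,
                             per_device_train_batch_size=16, learning_rate=2e-5)
    Trainer(model=model, args=args,
            train_dataset=ds["train"], eval_dataset=ds["test"]).train()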
Even if I could get 82% AUC over 81% AUC, I'm not sure it would be worth the trouble. And if I really felt I needed a better AUC (the number I live by, not the usually useless 'accuracy' or F1), I could develop a stacked model based on my simple classifier, which shouldn't be too hard given the rapid cycle time it makes possible.
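The stacking idea would be roughly this: out-of-fold predictions from several cheap classifiers become features for a meta-learner. The base estimators here are placeholders, and X and y are the embeddings and labels from the earlier sketch:

    # Hedged sketch of stacking on top of the frozen embeddings.
    from sklearn.ensemble import StackingClassifier, RandomForestClassifier
    from sklearn.linear_model import LogisticRegression
    from sklearn.svm import SVC
    from sklearn.model_selection import cross_val_score

    stack = StackingClassifier(
        estimators=[("lr", LogisticRegression(max_iter=1000)),
                    ("rf", RandomForestClassifier(n_estimators=200)),
                    ("svm", SVC(probability=True))],
        final_estimator=LogisticRegression(),
        cv=5)   # out-of-fold predictions feed the final estimator
    print(cross_val_score(stack, X, y, scoring="roc_auc", cv=5).mean())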
My favorite arXiv papers are not the ones where people develop "cutting edge" methods, but the ones where people have a run-of-the-mill problem and apply a variety of run-of-the-mill methods. I find those authors frequently get results like mine, so they're quite helpful.
Labelled data is king.