> If your data doesn't require preprocessing ...

Very often you need to prepare data before you can use xgboost/lightgbm: fill missing values, convert categoricals/dates/text to numeric, and do feature engineering.
What's more, MLJAR AutoML checks much simpler algorithms for you, like dummy models (average response or majority vote), linear models, and simple decision trees, because very often you don't need machine learning at all. xgboost/lightgbm can't do this :)
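A minimal sketch of what that looks like with the mljar-supervised package (the CSV file and the "target" column name are assumptions for illustration):

```python
# Minimal sketch using the mljar-supervised package (pip install mljar-supervised).
# The CSV file and "target" column are hypothetical.
import pandas as pd
from supervised.automl import AutoML

df = pd.read_csv("train.csv")
X = df.drop(columns=["target"])
y = df["target"]

# "Explain" mode trains fast baselines (Baseline, Linear, Decision Tree)
# alongside the boosted models, so you can see whether you need ML at all.
automl = AutoML(mode="Explain")
automl.fit(X, y)
```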
You do not need to fill missing values or encode categorical features with LightGBM.
You can pass the names of the categorical columns when constructing the Dataset (or fitting the sklearn-style estimator), and LightGBM will handle them under the hood, splitting on the raw categories directly instead of requiring one-hot encoding.
LightGBM also handles missing values automatically: by default it learns which side of each split missing values should go to, and it can optionally treat zeros as missing (zero_as_missing).
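A minimal sketch of both points (the column names here are made up for illustration):

```python
# Native categoricals + missing values in LightGBM; toy column names.
import numpy as np
import pandas as pd
import lightgbm as lgb

df = pd.DataFrame({
    "city": ["nyc", "sf", "sf", "nyc", "austin", "austin"],
    "sqft": [800, np.nan, 1200, 950, np.nan, 1100],  # NaNs left as-is
    "price": [700, 900, 1400, 800, 500, 650],
})

# Categorical columns need to be integer-coded or pandas "category" dtype;
# LightGBM then splits on them natively, no one-hot encoding needed.
df["city"] = df["city"].astype("category")

train = lgb.Dataset(
    df[["city", "sqft"]],
    label=df["price"],
    categorical_feature=["city"],
)
params = {"objective": "regression", "min_data_in_leaf": 1, "verbose": -1}
booster = lgb.train(params, train, num_boost_round=10)
```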
All in all, you still need to do feature engineering and the like, but LightGBM removes a lot of the hassle compared to XGBoost.
One newbie question maybe you can answer: can XGBoost/LightGBM handle out-of-band predictions, i.e. extrapolate beyond the range seen in training? The specific regression problem I'm tackling involves price predictions on tabular data (similar to the Kaggle housing price problem), and I know classic random forest / decision trees have trouble with time series predictions. Not sure if those models handle that better.
> classic random forest / decision trees have trouble with time series predictions
This is not true. The trick is that you have to convert your longitudinal data into cross-sectional data via feature engineering of lagged features. There are also related tricks like expanding datetimes into features like day of week, day of month, etc. This can be a lot of work, though there are software tools which help do this.
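A minimal sketch of that kind of feature engineering with pandas (the file and column names are made up; in practice you'd pick lags matching your data's seasonality):

```python
# Turning a time series into a cross-sectional table with lagged
# features and datetime expansions. File/column names are hypothetical.
import pandas as pd

df = pd.read_csv("prices.csv", parse_dates=["date"]).sort_values("date")

# Lagged features: yesterday's price, last week's, four weeks ago.
for lag in (1, 7, 28):
    df[f"price_lag_{lag}"] = df["price"].shift(lag)

# Datetime expansions.
df["dow"] = df["date"].dt.dayofweek
df["dom"] = df["date"].dt.day
df["month"] = df["date"].dt.month

# Drop rows whose lags reach back before the start of the series.
df = df.dropna()
```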
Some general time-oriented feature engineering plus a vanilla random forest is a great second baseline (after LOCF, i.e. last observation carried forward), and then if needed you can spend 10x the time tuning a GBM to beat that.
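For reference, the LOCF baseline is one line: predict that the next value equals the last observed one. Continuing the hypothetical df above:

```python
# LOCF baseline; any model worth tuning should beat this
# on a held-out time slice.
locf_pred = df["price"].shift(1)
mae = (df["price"] - locf_pred).abs().mean()  # NaN from the shift is skipped
print(f"LOCF MAE: {mae:.2f}")
```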
If you have a GPU, check out fastai's tabular learning stuff; easy feature engineering and neural nets with embeddings can do a lot with low effort.
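A minimal sketch with fastai's tabular API, reusing the hypothetical price DataFrame from above (exact arguments may differ between fastai versions):

```python
# fastai v2 tabular sketch; columns come from the hypothetical df above.
from fastai.tabular.all import (
    TabularDataLoaders, tabular_learner, Categorify, FillMissing,
    Normalize, RegressionBlock, rmse,
)

dls = TabularDataLoaders.from_df(
    df,
    procs=[Categorify, FillMissing, Normalize],  # encode cats, fill NaNs, scale
    cat_names=["dow", "dom", "month"],           # learned as embeddings
    cont_names=["price_lag_1", "price_lag_7", "price_lag_28"],
    y_names="price",
    y_block=RegressionBlock(),
)
learn = tabular_learner(dls, metrics=rmse)
learn.fit_one_cycle(5)
```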