dabl takes inspiration from, and builds upon, two main packages: scikit-learn and auto-sklearn. Scikit-learn provides many essential building blocks, but it is built on the idea of doing exactly what the user asks for, which requires specifying every processing step in detail. dabl, on the other hand, has a best-guess philosophy: it tries to do something sensible, and then provides tools for the user to inspect and evaluate the results. auto-sklearn is completely automatic and black-box. It searches a vast space of models and constructs complex ensembles of high accuracy, taking a substantial amount of computation and time in the process. The goal of auto-sklearn is to build the best model possible given the data. dabl, conversely, tries to enable the user to iterate quickly and get a grasp of the properties of the data at hand and of the fitted models. (Source: https://dabl.github.io/dev/index.html)
dabl is meant to support you in the following tasks:
- Data cleaning.
- Exploratory Data Analysis.
- Initial Model Building.
- Enhanced Model Building.
- Explainable Model Building.
- Searching optimal parameters with successive halving.
- Choosing the budget.
- Exhausting the budget.
- Aggressive elimination of candidates.
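The successive halving idea named in the list above can be sketched in a few lines of plain Python. This is a minimal illustration of the general technique, not dabl's implementation: the function `successive_halving`, its parameters `min_resources`, `max_resources` and `factor`, and the toy scoring function `toy_score` are all hypothetical names chosen for this example. The budget is modeled as a number of "resources" (for instance training samples) that grows by `factor` each round, while only the best `1/factor` of the candidates survive.

```python
import math


def successive_halving(candidates, evaluate, min_resources, max_resources, factor=3):
    """Hypothetical sketch: keep the top 1/factor candidates each round
    while multiplying the per-candidate budget by `factor`."""
    remaining = list(candidates)
    resources = min_resources
    history = []  # (resources used, number of candidates evaluated) per round
    while True:
        # Score every surviving candidate with the current budget, best first.
        scored = sorted(
            ((evaluate(c, resources), c) for c in remaining),
            key=lambda pair: pair[0],
            reverse=True,
        )
        history.append((resources, len(remaining)))
        # Stop when one candidate is left or the next round would exceed the budget.
        if len(remaining) == 1 or resources * factor > max_resources:
            return scored[0][1], history
        keep = max(1, math.ceil(len(remaining) / factor))
        remaining = [c for _, c in scored[:keep]]
        resources *= factor


def toy_score(candidate, resources):
    # Assumed toy objective: best at 0.3, with a penalty that shrinks as the budget grows.
    return -abs(candidate - 0.3) - 1.0 / resources


candidates = [i / 10 for i in range(9)]  # 0.0, 0.1, ..., 0.8
best, history = successive_halving(
    candidates, toy_score, min_resources=10, max_resources=90
)
print(best, history)
```

With nine candidates and `factor=3`, the rounds evaluate 9, 3 and 1 candidates using 10, 30 and 90 resources, so the final round uses the whole budget. Choosing `min_resources` too large for the number of candidates would end the search before a single winner remains, which is the situation that a more aggressive elimination of candidates in the early rounds is meant to address.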
Limitations: "Right now dabl does not deal with text data and time series data. It also does not consider neural network models. Image, audio and video data is considered out of scope. All current implementations are quite rudimentary and rely heavily on heuristics. The goal is to replace these with more principled approaches where this provides a benefit."
"This project tries to help make supervised machine learning more accessible for beginners, and reduce boiler plate for common tasks. This library is in very active development, so it’s not recommended for production use." Development happens at github.com/dabl/dabl.
dabl URL: https://dabl.github.io/dev/