Is it possible to fully automate the data science lifecycle? Is it possible to automatically build a machine learning model from a set of data?
Indeed, in recent months, many tools have appeared that claim to automate all or parts of the data science process. How do they work? Could you build one yourself? If you adopt one of these tools, how much work would be necessary to adapt it to your own problem and your own set of data?
Usually, the price to pay for machine learning automation, or AutoML, is the loss of control to a black box kind of model. What you gain in automation, you lose in fine-tuning or interpretability. Although such a price might be acceptable for circumscribed data science problems on well-defined domains, it could become a limitation for more complex problems on a wider variety of domains. In these cases, a certain amount of interaction with the end user is desirable.
At Knime, we take a softer approach to machine learning automation. Our guided automation—a special instance of guided analytics—makes use of a fully automated web application to guide users through the selection, training, testing, and optimization of a number of machine learning models. The workflow was designed for business analysts to easily create predictive analytics solutions by applying their domain knowledge.