Sktime: A Unified Python Library for Time Series Machine Learning


This post was originally published by Alexandra Amidon at Towards Data Science

The “sklearn” for time series forecasting, classification, and regression

Solving data science problems with time series data in Python is challenging.

Why? Existing tools are not well suited to time series tasks and do not integrate easily with one another. Methods in the scikit-learn package assume that data is structured in a tabular format and that each row (observation) is i.i.d., assumptions that do not hold for time series data. Packages that do contain time series learning modules, such as statsmodels, do not integrate well with one another. Further, many essential time series operations, such as splitting data into train and test sets across time, are not readily available in existing Python packages.
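To make the point about splitting across time concrete, here is a minimal sketch in plain Python. The helper name split_by_time and the sample values are hypothetical and are not part of sktime or any other library. The idea is that the test set must be the most recent observations, taken without shuffling.

```python
from typing import List, Sequence, Tuple


def split_by_time(
    series: Sequence[float], test_fraction: float = 0.25
) -> Tuple[List[float], List[float]]:
    """Hold out the most recent observations as the test set (no shuffling)."""
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be strictly between 0 and 1")
    n_test = max(1, round(len(series) * test_fraction))
    cutoff = len(series) - n_test
    if cutoff < 1:
        raise ValueError("series is too short to split")
    # Everything before the cutoff is training data; everything after is test data.
    return list(series[:cutoff]), list(series[cutoff:])


# Made-up observations, ordered oldest first.
monthly_sales = [112, 118, 132, 129, 121, 135, 148, 148]
train, test = split_by_time(monthly_sales, test_fraction=0.25)
print(train)
print(test)
```

A random split, by contrast, would let the model train on observations that come after the ones it is tested on, which leaks future information into training.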

Logo of the sktime library (GitHub: https://github.com/alan-turing-institute/sktime)

Sktime uses a nested data structure for time series in pandas data frames.

Native time series data structure, compatible with sktime.

Time series data structure required by scikit-learn.
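The difference between the two layouts can be sketched with pandas alone. This is an illustration under assumptions rather than sktime's own code: the column name dim_0 and the values are made up, and the example relies only on the description above that each cell of the nested data frame holds a whole series.

```python
import pandas as pd

# Three univariate series of equal length (made-up values for illustration).
series_list = [
    pd.Series([1.0, 2.0, 3.0, 4.0]),
    pd.Series([2.0, 4.0, 6.0, 8.0]),
    pd.Series([5.0, 5.0, 5.0, 5.0]),
]

# Nested layout: one row per instance, one column per variable,
# and each cell holds an entire pd.Series.
nested = pd.DataFrame({"dim_0": series_list})

# Tabular layout: one row per instance, one column per time point.
tabular = pd.DataFrame([s.tolist() for s in nested["dim_0"]])

print(nested.shape)
print(tabular.shape)
```

In the nested layout the time ordering stays inside each cell, whereas the tabular layout spreads it across columns that a tabular learner treats as unrelated features.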

Time Series Forecasting

Time Series Classification

Example code borrowed from https://pypi.org/project/sktime/

The data passed into the TimeSeriesForestClassifier.
