Sktime: The One-Stop Shop for Time Series Analysis in Python

Jun 4, 2023

Introduction

Sktime is an open-source Python library for time series analysis. It provides a unified interface for multiple time series learning tasks, including forecasting, classification, regression, clustering, annotation, and dimensionality reduction. Sktime also provides interfaces to related libraries, such as scikit-learn, statsmodels, tsfresh, PyOD, and Prophet (formerly distributed as fbprophet).

It is designed to be easy to use, flexible, and modular. It offers scikit-learn compatible interfaces and model composition tools, with the goal of making the time series analysis ecosystem more usable and interoperable as a whole. Sktime is built and sustained by an open, diverse, and self-governing community.

Here are some of the features:

Unified interface for multiple time series learning tasks
Scikit-learn compatible interfaces
Model composition tools
Interfaces to related libraries
Easy to use, flexible, and modular
Built and sustained by an open, diverse, and self-governing community

What Can Sktime Be Used For?

Time Series Forecasting: Sktime can be used to forecast future values of time series data. This can be useful for planning, budgeting, and other decision-making tasks.
Time Series Classification: Sktime can be used to classify time series data into different categories. This can be useful for fraud detection, medical diagnosis, and other applications.
Time Series Regression: Sktime can be used to predict a continuous target value from an entire time series, in the same way that classification predicts a category. This can be useful, for example, for estimating a physical quantity from a sensor recording.
Time Series Clustering: Sktime can be used to cluster time series data into groups. This can be useful for identifying patterns in data and for reducing the dimensionality of data.
Time Series Annotation: Sktime can be used to annotate time series data with labels such as anomalies, change points, or segments. This can be useful for understanding the meaning of data and for making it more accessible to others.
Dimensionality Reduction: Sktime can be used to reduce the dimensionality of time series data. This can be useful for making data easier to visualize and for improving the performance of machine learning algorithms.

Examples of Classifiers Available in Sktime:

Sktime covers a wide range of time series tasks through one consistent interface, which makes it a good starting point if you are interested in time series analysis. For classification in particular, it provides several families of algorithms, including:

Distance-based classifiers: These classifiers compare time series using distance measures, such as dynamic time warping. Examples include the Elastic Ensemble (EE), KNN Time Series Classifier, Proximity Forest, Proximity Stump, and Proximity Tree.
Interval-based classifiers: These classifiers extract summary features from intervals (subseries) of each time series and learn from those features. Examples include the Time Series Forest (TSF), Random Interval Spectral Ensemble (RISE), Supervised Time Series Forest (STSF), Canonical Interval Forest (CIF), and Diverse Representation Canonical Interval Forest (DrCIF).
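Dictionary-based classifiers: These classifiers convert subsequences of a time series into discrete "words" and classify based on how often each word occurs. Examples include the Bag of SFA Symbols (BOSS) Ensemble, Contractable BOSS, WEASEL, and the Temporal Dictionary Ensemble (TDE).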
Deep learning classifiers: These classifiers use neural networks to learn features directly from the time series. Examples include the Convolutional Neural Network (CNN) Classifier and the Recurrent Neural Network (RNN) Classifier.

The choice of classifier depends on the specific time series data, the desired accuracy, and the available compute. Simple distance-based classifiers such as KNN are easy to set up and need little or no training, but prediction can be slow on large datasets and they may not be as accurate as interval-based or dictionary-based classifiers. Interval-based and dictionary-based classifiers are often more accurate, but they may be slower to train. Deep learning classifiers can be very accurate, but they may require a large amount of data to train.
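To make the distance-based idea more concrete, here is a minimal sketch in plain Python of a 1-nearest-neighbour classifier that uses dynamic time warping (DTW) as its distance. It is an illustration only and not sktime's implementation: the function names `dtw_distance` and `predict_1nn` and the toy data are made up for this example, and real implementations add warping windows and heavy optimisation.

```python
import math


def dtw_distance(a, b):
    """Dynamic time warping distance between two numeric sequences."""
    n, m = len(a), len(b)
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = (a[i - 1] - b[j - 1]) ** 2
            cost[i][j] = step + min(
                cost[i - 1][j],      # a advances, b stays
                cost[i][j - 1],      # b advances, a stays
                cost[i - 1][j - 1],  # both advance
            )
    return math.sqrt(cost[n][m])


def predict_1nn(train_series, train_labels, query):
    """Return the label of the training series closest to `query` under DTW."""
    distances = [dtw_distance(series, query) for series in train_series]
    nearest = distances.index(min(distances))
    return train_labels[nearest]


# Toy data (made up for illustration): rising vs. falling shapes.
train_series = [
    [0, 1, 2, 3, 4],
    [0, 0, 1, 2, 4],
    [4, 3, 2, 1, 0],
    [4, 4, 2, 1, 0],
]
train_labels = ["rising", "rising", "falling", "falling"]

# The query starts rising later than the training examples;
# warping lets the alignment absorb that shift.
query = [0, 0, 0, 2, 4]
predicted = predict_1nn(train_series, train_labels, query)
print(predicted)
```

Note that there is no training step beyond storing the data, while every prediction computes a DTW distance to each stored series. That is where the cost of this family of methods tends to sit.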

Python Code for Time Series Forecasting

To get started with sktime, simply install it using pip:

pip install sktime

Then head over to your preferred IDE and start experimenting with the powerful library. Here is an example of using a regression algorithm to solve a forecasting task:

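import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sktime.datasets import load_airline
from sktime.forecasting.compose import make_reduction
# note: recent sktime releases expose this function from sktime.split
from sktime.forecasting.model_selection import temporal_train_test_split
from sktime.performance_metrics.forecasting import MeanAbsolutePercentageError

# load the airline passengers series and hold out the last part for testing
y = load_airline()
y_train, y_test = temporal_train_test_split(y)

# forecasting horizon: one step for every observation in the test set
fh = np.arange(1, len(y_test) + 1)

# wrap a tabular regressor as a forecaster that predicts recursively
# from the previous 12 observations
regressor = RandomForestRegressor()
forecaster = make_reduction(
    regressor,
    strategy="recursive",
    window_length=12,
)

forecaster.fit(y_train)

y_pred = forecaster.predict(fh)

# symmetric=True is set explicitly so that the metric is the symmetric MAPE
# regardless of the library default
smape = MeanAbsolutePercentageError(symmetric=True)
smape(y_test, y_pred)
# the author's run printed 0.1261192310833735; the exact value varies
# between runs because the random forest is not seeded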
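If you are curious what the "recursive" reduction strategy does conceptually, the following plain-Python sketch may help. It is not sktime's implementation: `make_windows`, `NearestWindowRegressor`, `recursive_forecast`, and `mape` are hypothetical names invented here, the nearest-neighbour regressor is a simple stand-in for the random forest, and the seasonal series is toy data. The idea is to turn the series into (window, next value) pairs, fit a regressor on them, and then feed each prediction back into the window to forecast the next step.

```python
def make_windows(series, window_length):
    """Turn a series into (window, next value) pairs for supervised learning."""
    X, y = [], []
    for start in range(len(series) - window_length):
        X.append(series[start:start + window_length])
        y.append(series[start + window_length])
    return X, y


class NearestWindowRegressor:
    """Hypothetical stand-in for a tabular regressor: 1-nearest neighbour."""

    def fit(self, X, y):
        self.X_, self.y_ = X, y
        return self

    def predict_one(self, window):
        def squared_distance(row):
            return sum((r - w) ** 2 for r, w in zip(row, window))

        best = min(range(len(self.X_)), key=lambda i: squared_distance(self.X_[i]))
        return self.y_[best]


def recursive_forecast(series, regressor, window_length, steps):
    """Fit on sliding windows, then forecast step by step, reusing predictions."""
    X, y = make_windows(series, window_length)
    regressor.fit(X, y)
    window = list(series[-window_length:])
    forecasts = []
    for _ in range(steps):
        next_value = regressor.predict_one(window)
        forecasts.append(next_value)
        # slide the window: drop the oldest value, append the new prediction
        window = window[1:] + [next_value]
    return forecasts


def mape(y_true, y_pred):
    """Plain (non-symmetric) mean absolute percentage error, as a fraction."""
    return sum(abs(t - p) / abs(t) for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy seasonal series with period 4 (made up for illustration).
series = [10, 20, 30, 20] * 6
y_train, y_test = series[:-4], series[-4:]

y_pred = recursive_forecast(
    y_train, NearestWindowRegressor(), window_length=4, steps=len(y_test)
)
error = mape(y_test, y_pred)
print(y_pred, error)
```

Because every forecast after the first is built on earlier forecasts, errors can compound over long horizons. That trade-off is worth keeping in mind when choosing the window length and the horizon.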

It is as easy as that! So don’t wait for anyone. Forecast away!

The sktime documentation is also very comprehensive and describes each available estimator, including the classifiers mentioned above, in detail. The link to the API reference is given below:
https://www.sktime.net/en/latest/api_reference/base.html

For more exciting ML articles, please support me by following Ilyas Ahmed

Until next time, happy forecasting.