16  Self study - Session 2

In this session, we will revisit Chapter 12, Chapter 13, and Chapter 14.

We have seen that the algorithms for all three tasks fall into more or less the same categories.

We have implemented nearest neighbors (distance-based) for clustering, time series forests (feature-based) for classification, and convolutional neural networks (deep learning-based) for regression.

Let us have a look at one additional algorithm for each task.

Note

You can use sktime’s load_UCR_UEA_dataset function to load the data sets mentioned below, e.g.

import pandas as pd
from sktime.datasets import load_UCR_UEA_dataset

ds_name = "UWaveGestureLibrary"
ret_type = "numpy3d"

X_train, y_train = load_UCR_UEA_dataset(
    name=ds_name,
    split="train",
    return_type=ret_type,
)

X_test, y_test = load_UCR_UEA_dataset(
    name=ds_name,
    split="test",
    return_type=ret_type,
)

# For classification, convert y to categorical
y_train = pd.Series(y_train).astype("category")
y_test = pd.Series(y_test).astype("category")

Exercise 16.1 (Clustering with time series DBSCAN)

  • Load the GesturePebbleZ1 data set.
  • Use the time series \(k\)-means algorithm to cluster the data.
  • Now, try the time series DBSCAN algorithm for clustering.
  • How do the results compare?
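
To see how the two approaches differ, it can help to look at a minimal DBSCAN sketch that works on a precomputed distance matrix. Unlike \(k\)-means, it does not need the number of clusters; it takes a radius `eps` and a minimum neighbourhood size `min_samples`, and it labels isolated series as noise (`-1`). The code below is an illustration under stated assumptions, not sktime's implementation: the function name `dbscan_precomputed`, the toy data, and the use of plain Euclidean distance (standing in for a time series distance such as DTW) are all hypothetical.

```python
import numpy as np


def dbscan_precomputed(dist, eps, min_samples):
    """Label points from a square distance matrix; -1 marks noise."""
    n = dist.shape[0]
    labels = np.full(n, -1)            # everything starts as noise
    visited = np.zeros(n, dtype=bool)
    # Neighbourhoods include the point itself (distance 0 <= eps).
    neighbours = [np.flatnonzero(dist[i] <= eps) for i in range(n)]
    cluster = 0
    for i in range(n):
        if visited[i]:
            continue
        visited[i] = True
        if len(neighbours[i]) < min_samples:
            continue                   # not a core point; stays noise unless reached later
        labels[i] = cluster
        queue = list(neighbours[i])
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster    # border or core point joins the cluster
            if visited[j]:
                continue
            visited[j] = True
            if len(neighbours[j]) >= min_samples:
                queue.extend(neighbours[j])   # expand only from core points
        cluster += 1
    return labels


# Toy data (made up): three sine-like series, three cosine-like series, one outlier.
t = np.linspace(0, 2 * np.pi, 20)
X = np.stack(
    [np.sin(t) + 0.01 * k for k in range(3)]
    + [np.cos(t) + 0.01 * k for k in range(3)]
    + [np.full_like(t, 5.0)]
)

# Pairwise Euclidean distances between all series.
dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
labels = dbscan_precomputed(dist, eps=0.5, min_samples=2)
print(labels.tolist())  # [0, 0, 0, 1, 1, 1, -1]
```

Note that the outlier ends up in no cluster at all, which \(k\)-means cannot express; this is worth keeping in mind when comparing the two results.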

In Chapter 13 and Chapter 14, our input was univariate, meaning each sample consisted of exactly one time series. Now, we will work with multivariate time series, which are more common in real-world scenarios since typically multiple sensors are available.
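
In sktime's `numpy3d` format, a collection of series is stored as an array of shape (instances, variables, time points), so a univariate data set is simply the special case with one variable. A small sketch with made-up sizes (the array dimensions and variable names are hypothetical):

```python
import numpy as np

n_instances, n_timepoints = 4, 50

# Univariate: one variable (channel) per instance.
X_uni = np.zeros((n_instances, 1, n_timepoints))

# Multivariate: e.g. three sensor axes recorded in parallel.
X_multi = np.zeros((n_instances, 3, n_timepoints))

print(X_uni.shape)    # (4, 1, 50)
print(X_multi.shape)  # (4, 3, 50)

# One instance is a 2D array with one row per variable.
first = X_multi[0]
print(first.shape)    # (3, 50)
```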

Exercise 16.2 (Classification of multivariate time series)  

  • Load the UWaveGestureLibrary data set.
  • Use the time series \(k\)-nearest neighbours algorithm to classify the time series.
  • Use the CNN classifier to classify the time series (make sure to have sktime>=0.40.0 installed).
  • How do the results compare?
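
One way to picture how a nearest-neighbour classifier handles several variables at once is to measure the distance over all variables jointly. The sketch below is a plain 1-nearest-neighbour classifier with Euclidean distance on `numpy3d`-shaped arrays. It is an illustration only (hypothetical function `predict_1nn`, synthetic data) and not sktime's implementation, which typically relies on elastic distances such as DTW.

```python
import numpy as np


def predict_1nn(X_train, y_train, X_test):
    """Predict the label of the closest training instance for each test instance."""
    # Flatten (variables, time points) so the distance covers all variables jointly.
    A = X_test.reshape(len(X_test), -1)
    B = X_train.reshape(len(X_train), -1)
    dist = np.linalg.norm(A[:, None, :] - B[None, :, :], axis=-1)
    return y_train[dist.argmin(axis=1)]


# Synthetic two-class data: 2 variables, 30 time points (made up for illustration).
t = np.linspace(0, 1, 30)
up = np.stack([t, t ** 2])              # class "up": both variables increase
down = np.stack([1 - t, 1 - t ** 2])    # class "down": both variables decrease

X_train = np.stack([up, up + 0.05, down, down - 0.05])
y_train = np.array(["up", "up", "down", "down"])
X_test = np.stack([up - 0.02, down + 0.02])
y_test = np.array(["up", "down"])

y_pred = predict_1nn(X_train, y_train, X_test)
accuracy = float(np.mean(y_pred == y_test))
print(y_pred.tolist(), accuracy)  # ['up', 'down'] 1.0
```

Computing the same accuracy for every classifier on the same test split gives a simple basis for the comparison.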

Note that not all of the algorithms have implementations for multivariate time series, e.g. sktime’s TimeSeriesForestClassifier only works for univariate time series (and raises an exception when used with a multivariate series). Luckily, there are other implementations available, e.g. aeon offers a TimeSeriesForestClassifier that is compatible with multivariate time series.
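
The core idea of a time series forest carries over to several variables: summary statistics (mean, standard deviation, slope) are computed over intervals of each series and passed to ordinary decision trees. The sketch below shows only that feature extraction step on a `numpy3d`-shaped array. It is a simplified illustration under stated assumptions (hypothetical function `interval_features`, fixed instead of randomly drawn intervals) and not the aeon or sktime implementation.

```python
import numpy as np


def interval_features(X, intervals):
    """Return mean, std and slope per variable and interval.

    X has shape (instances, variables, time points); the result has shape
    (instances, variables * len(intervals) * 3).
    """
    feats = []
    for start, end in intervals:
        seg = X[:, :, start:end]                 # (instances, variables, length)
        steps = np.arange(end - start)
        centred = steps - steps.mean()
        # Least-squares slope of each segment against the time index.
        slope = (seg * centred).sum(axis=-1) / (centred ** 2).sum()
        feats += [seg.mean(axis=-1), seg.std(axis=-1), slope]
    return np.concatenate(feats, axis=1)


# Made-up data: 2 instances, 3 variables, 20 time points.
X = np.zeros((2, 3, 20))
X[0, 0] = np.arange(20)      # a ramp with slope 1 in the first variable

F = interval_features(X, intervals=[(0, 10), (10, 20)])
print(F.shape)  # (2, 18)
```

The resulting feature table could then be passed to any tabular classifier, which is why this family of methods does not depend on the series being univariate.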

Note

aeon is quite strict about dependencies. If you have problems installing it, consider initializing a new project and installing aeon first, before adding further packages.

Exercise 16.3 (Classification using aeon’s Time Series Forest Classifier)  

  • Load the UWaveGestureLibrary data set.
  • Use aeon’s TimeSeriesForestClassifier to classify the data set.
  • Are you able to achieve better results than with the previous algorithms?

Note

You can use aeon’s load_regression function to load the data set mentioned below, e.g.

from aeon.datasets import load_regression

ds_name = "AppliancesEnergy"

X_train, y_train = load_regression(ds_name, split="train")
X_test, y_test = load_regression(ds_name, split="test")

Exercise 16.4 (Regression of multivariate time series)  

  • Load the AppliancesEnergy data set.
  • Use the time series \(k\)-nearest neighbours algorithm to predict the energy consumption of the appliances.
  • Use the time series forest algorithm to predict the energy consumption of the appliances.
  • Use the CNN regressor to predict the energy consumption of the appliances (make sure to have sktime>=0.40.0 installed).
  • How do the results compare?
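
When comparing regressors, a common choice is the root mean squared error on the test set, ideally alongside a trivial baseline that always predicts the mean of the training targets. A minimal sketch with made-up numbers (the helper `rmse` and all values are hypothetical):

```python
import numpy as np


def rmse(y_true, y_pred):
    """Root mean squared error between two 1D arrays."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


y_train = np.array([10.0, 12.0, 14.0])
y_test = np.array([11.0, 13.0])

# Baseline: always predict the mean of the training targets.
baseline = np.full_like(y_test, y_train.mean())
# Stand-in for the output of some fitted regressor's predict(X_test).
y_pred = np.array([11.5, 12.5])

print(rmse(y_test, baseline))  # 1.0
print(rmse(y_test, y_pred))    # 0.5
```

A model that does not clearly beat the baseline has learned little from the time series, regardless of how it ranks against the other models.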