Support Vector Machines (SVMs) are powerful classifiers for multidimensional data. This paper evaluates the effectiveness of different SVM kernels combined with Dynamic Time Warping (DTW) as a distance metric for time series classification. We compare several kernel functions (Cauchy, Gaussian, Inverse Multiquadric, Laplacian, Log, Rational Quadratic) incorporating DTW and assess their performance on the ECG, FordA, and Human Activity Recognition (HAR) time series datasets.
- Kernel Function
- Support Vector Machine
- Time Series Classification
- Dynamic Time Warping
SVM is a supervised learning method that classifies data by finding an optimal separating hyperplane. Kernel functions transform the input into a higher-dimensional feature space where linear separation becomes possible. For time series, however, the temporal ordering of observations must be taken into account, which makes the choice of kernel function crucial.
DTW is widely used to measure similarity between time series, especially when sequences differ in length or speed. This paper explores integrating DTW with different SVM kernels for improved time series classification.
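The DTW distance can be computed with the classic dynamic-programming recurrence. The sketch below is a minimal illustration using absolute difference as the local cost; it is not necessarily the exact DTW variant used in the experiments.

```python
import numpy as np

def dtw_distance(a, b):
    """Classic O(len(a) * len(b)) dynamic-programming DTW.

    Returns the cumulative alignment cost between two 1-D sequences,
    using absolute difference as the local cost.
    """
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i, j] = d + min(cost[i - 1, j],      # insertion
                                 cost[i, j - 1],      # deletion
                                 cost[i - 1, j - 1])  # match
    return cost[n, m]
```

Because the alignment can stretch one sequence against the other, sequences of different lengths or speeds can still receive a small distance when their shapes match.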
This study evaluates the performance of various SVM kernel functions with DTW in the task of time series classification.
Various studies have explored using DTW for time series classification with SVM. For example, Chen et al. (2019) proposed a new DTW-based kernel for SVM, showing its superior performance on certain datasets. Other distance metrics, such as Euclidean and cosine distance, have also been explored in combination with SVM for time series classification.
The methodology involves:
- Data Preprocessing: Cleaning, normalizing, and balancing the datasets.
- Calculation of DTW Distances: Compute pairwise DTW distances between time series.
- Calculation of Kernel: Replace the distance metric in each kernel function with DTW.
- SVM Model Training: Train the SVM using the DTW-based kernel and optimize hyperparameters via cross-validation.
- Model Evaluation: Assess the models using accuracy and confusion matrix metrics.
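The steps above can be sketched end to end: compute pairwise DTW distances, substitute them for the Euclidean distance inside a kernel (the Gaussian kernel is used here as one example), and pass the resulting Gram matrix to an SVM that accepts precomputed kernels. The helper names and the simple O(n²) loop are illustrative assumptions, not the exact implementation used in the paper.

```python
import numpy as np

def dtw_distance(a, b):
    """Minimal dynamic-programming DTW with absolute-difference cost."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            D[i, j] = abs(a[i - 1] - b[j - 1]) + min(
                D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

def dtw_gaussian_gram(series, sigma=1.0):
    """Gram matrix of a Gaussian kernel with DTW substituted for the
    Euclidean distance: K(x, y) = exp(-dtw(x, y)^2 / (2 * sigma^2))."""
    n = len(series)
    K = np.empty((n, n))
    for i in range(n):
        for j in range(i, n):
            k = np.exp(-dtw_distance(series[i], series[j]) ** 2
                       / (2.0 * sigma ** 2))
            K[i, j] = K[j, i] = k
    return K

# The precomputed Gram matrix can then be handed to an SVM, e.g.
# sklearn.svm.SVC(kernel='precomputed').fit(K_train, y_train).
```

The other kernels in the study (Cauchy, Inverse Multiquadric, Laplacian, Log, Rational Quadratic) follow the same pattern: only the scalar function applied to the DTW distance changes.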
The datasets used in this study are:
- ECG Heartbeat Categorization Dataset: ECG heartbeat signals for normal and arrhythmic cases.
- FordA Dataset: Engine noise measurements for diagnosing vehicle subsystems.
- Human Activity Recognition with Smartphones Dataset: Smartphone sensor data for classifying human activities.
The preprocessing steps involved resampling to handle class imbalance and normalization to ensure consistent data scaling.
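A common choice for the normalization step is z-normalization of each series; the sketch below assumes that scheme, since the paper does not specify its exact normalization.

```python
import numpy as np

def z_normalize(x):
    """Scale a series to zero mean and unit variance so that DTW
    costs are comparable across series recorded at different scales."""
    x = np.asarray(x, dtype=float)
    std = x.std()
    # Guard against constant series, which have zero variance.
    return (x - x.mean()) / std if std > 0 else x - x.mean()
```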
- Accuracy: The percentage of correctly classified instances.
- Confusion Matrix: A matrix summarizing classifier performance, showing correct and incorrect predictions per class.
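Both metrics are straightforward to compute from predictions; the from-scratch sketch below shows their definitions (in practice a library such as scikit-learn would typically be used).

```python
import numpy as np

def accuracy(y_true, y_pred):
    """Fraction of correctly classified instances."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    return float(np.mean(y_true == y_pred))

def confusion_matrix(y_true, y_pred, n_classes):
    """Confusion matrix with rows = true class, columns = predicted class."""
    M = np.zeros((n_classes, n_classes), dtype=int)
    for t, p in zip(y_true, y_pred):
        M[t, p] += 1
    return M
```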
Kernel Function | Hyperparameter | Compute Time (s) |
---|---|---|
Cauchy | sigma = 500000 | 213.1 |
Gaussian | sigma = 1 | 310.7 |
Inverse Multiquadric | c = 0.4 | 346.7 |
Laplacian | sigma = 1.4 | 405.2 |
Log | c = 1 | 444.4 |
Rational Quadratic | c = 1 | 506.4 |
Kernel Function | Hyperparameter | Compute Time (s) |
---|---|---|
Cauchy | sigma = 500000 | 771.9 |
Gaussian | sigma = 4 | 832.5 |
Inverse Multiquadric | c = 1 | 847.0 |
Laplacian | sigma = 18 | 846.9 |
Log | c = 1 | 839.2 |
Rational Quadratic | c = 100 | 826.4 |
Kernel Function | Hyperparameter | Compute Time (s) |
---|---|---|
Cauchy | sigma = 600000 | 1219.1 |
Gaussian | sigma = 90000 | 1323.3 |
Inverse Multiquadric | c = 0.4 | 1363.6 |
Laplacian | sigma = 700 | 1375.3 |
Log | c = 1 | 1398.1 |
Rational Quadratic | c = 9000 | 1390.3 |
The experiments were conducted on a 2020 MacBook Pro with an Apple M1 chip.
Kernel Function | ECG (%) | FordA (%) | HAR (%) |
---|---|---|---|
Cauchy | 89.0 | 60.5 | 70.0 |
Gaussian | 76.0 | 45.0 | 57.1 |
Inverse Multiquadric | 85.0 | 60.0 | 80.3 |
Laplacian | 83.0 | 53.5 | 82.7 |
Log | 38.0 | 53.0 | 85.6 |
Rational Quadratic | 84.0 | 54.0 | 87.0 |
- The Inverse Multiquadric kernel performs well across all three datasets.
- Some kernels, such as Cauchy and Log, perform inconsistently across datasets.
- DTW may struggle with datasets like FordA, where discriminative patterns appear at different frequencies across samples.
DTW, while powerful for time series data, can be computationally expensive and memory-intensive. Approximations such as FastDTW or parallel computing can help alleviate these issues.
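Besides FastDTW, another common mitigation is to constrain the dynamic program to a Sakoe-Chiba band, filling only cells near the diagonal. The sketch below illustrates that idea; it is offered as one example of reducing DTW's cost, not as the method used in the paper.

```python
import numpy as np

def dtw_banded(a, b, window):
    """DTW restricted to a Sakoe-Chiba band of half-width `window`.

    Only cells with |i - j| <= window are filled, cutting the work
    from O(n*m) to roughly O(n*window). The band is widened to cover
    any length difference between the two series.
    """
    n, m = len(a), len(b)
    w = max(window, abs(n - m))  # band must span the length gap
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(max(1, i - w), min(m, i + w) + 1):
            D[i, j] = abs(a[i - 1] - b[j - 1]) + min(
                D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]
```

A narrow band trades some alignment flexibility for speed; when the window covers the whole matrix, the result matches unconstrained DTW.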
Using DTW as a distance metric within SVM kernels allows for more accurate classification of time series with varying lengths and irregular sampling rates. This improves performance, especially on datasets with noisy signals and differing time scales.
Using DTW with various kernel functions for time series classification is effective, especially for datasets with varying sequence lengths. The Inverse Multiquadric kernel showed consistently good performance. Proper hyperparameter tuning is crucial for optimizing model performance.
- Hansheng Lei and Bingyu Sun, A Study on the Dynamic Time Warping in Kernel Machines
- Thomas Hofmann, Bernhard Schölkopf, and Alexander J. Smola, Kernel Methods in Machine Learning
- Jiapu Zhang, A Complete List of Kernels Used in Support Vector Machines
- Vincent Wan and James Carmichael, Polynomial Dynamic Time Warping Kernel Support Vector Machines
- Charu C. Aggarwal, Data Mining
- Jorge L. Reyes-Ortiz et al., Human Activity Recognition Using Smartphones Data Set
- A. Bagnall, FordA Dataset
- Mohammad Kachuee et al., ECG Heartbeat Classification: A Deep Transferable Representation
- NCBI Article