
[ENH] Added IDK² and s-IDK² Anomaly Detector To Aeon #2465

Open · wants to merge 64 commits into base: main

Conversation

Ramana-Raja

closes #2114

@aeon-actions-bot
Contributor

Thank you for contributing to aeon

I did not find any labels to add based on the title. Please add the [ENH], [MNT], [BUG], [DOC], [REF], [DEP] and/or [GOV] tags to your pull request titles. For now you can add the labels manually.
I have added the following labels to this PR based on the changes made: [anomaly detection]. Feel free to change these if they do not properly represent the PR.

The Checks tab will show the status of our automated tests. You can click on individual test runs in the tab or "Details" in the panel below to see more information if there is a failure.

If our pre-commit code quality check fails, any trivial fixes will automatically be pushed to your PR unless it is a draft.

Don't hesitate to ask questions on the aeon Slack channel if you have any.

PR CI actions

These checkboxes will add labels to enable/disable CI functionality for this PR. This may not take effect immediately, and a new commit may be required to run the new configuration.

  • Run pre-commit checks for all files
  • Run mypy typecheck tests
  • Run all pytest tests and configurations
  • Run all notebook example tests
  • Run numba-disabled codecov tests
  • Stop automatic pre-commit fixes (always disabled for drafts)
  • Disable numba cache loading
  • Push an empty commit to re-run CI checks

@Ramana-Raja Ramana-Raja changed the title Added IDK² and s-IDK² Anomaly Detector To Aeon [ENH]Added IDK² and s-IDK² Anomaly Detector To Aeon Dec 19, 2024
@Ramana-Raja Ramana-Raja changed the title [ENH]Added IDK² and s-IDK² Anomaly Detector To Aeon [ENH] Added IDK² and s-IDK² Anomaly Detector To Aeon Dec 19, 2024
@MatthewMiddlehurst MatthewMiddlehurst added the enhancement New feature, improvement request or other non-bug code enhancement label Dec 19, 2024
Member

@SebastianSchmidl SebastianSchmidl left a comment


I just had a very brief look at the code because I am short on time. I will have another look next year after the Christmas break. Meanwhile, the code quality can be improved and tests need to be added.

  • The tests are missing! Please add reasonable tests for the new algorithm. How do you make sure that it produces the same results as the original implementation?
  • The code mixes standard Python (e.g. random, range, ...) with NumPy (e.g. np.random, np.arange, ...). Please rely mostly on NumPy for better code quality and performance.
  • Why are most method names in capital letters and prefixed with dunders (__)? If this is a reference to the original implementation, please add the original name as a comment and use our coding standard.
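The NumPy-first style suggested above can be sketched as follows; `psi` and the index-sampling step are illustrative placeholders, not code taken from this PR:

```python
import numpy as np

# Stdlib style the review discourages (Python loop + random.randint):
#   samples = [random.randint(0, n - 1) for _ in range(psi)]

# NumPy style: one vectorised, reproducible call via a Generator.
rng = np.random.default_rng(42)   # plays the role of random_state
n, psi = 1000, 16                 # psi: hypothetical sample count
samples = rng.integers(0, n, size=psi)       # psi indices drawn from [0, n)
values = np.arange(n, dtype=float)[samples]  # fancy indexing instead of a loop
```

Using a seeded `np.random.default_rng` also gives the reproducibility that a `random_state` parameter is meant to provide.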

Thank you for taking the time and contributing to aeon!

@MatthewMiddlehurst
Member

Are there any other implementations or results we can use to compare output? Accuracy to the original algorithm(s) is important. I am not familiar with this area, but having incorrect implementations is generally damaging. The code could be fine, but it is not a good idea to merge until someone is confident that it is.

@Ramana-Raja
Author

Ramana-Raja commented Feb 1, 2025

Are there any other implementations or results we can use to compare output? Accuracy to the original algorithm(s) is important. I am not familiar with this area, but having incorrect implementations is generally damaging. The code could be fine, but it is not a good idea to merge until someone is confident that it is.

The expected results array for the last test case was generated using the original anomaly detector, which I used for comparison with this anomaly detector. Let me know if you'd like me to run any additional tests.

@MatthewMiddlehurst
Member

@SebastianSchmidl any suggestion on how to best evaluate this? It does not have to be as rigorous as a publication, of course; just enough to give us the idea that the output is similar enough.

@SebastianSchmidl
Member

@SebastianSchmidl any suggestion on how to best evaluate this? It does not have to be as rigorous as a publication, of course; just enough to give us the idea that the output is similar enough.

How do you test the implementations in the other modules?

I would compare this implementation against the original code: probably, I would run the original code in a Docker image and capture the output for some test cases (including edge cases). Then, we can use the same input and output as fixtures for the tests in aeon, and use np.testing.assert_allclose et al. to ignore rounding problems.

Unfortunately, I need to focus on other stuff until mid-summer. So, I cannot say if or when I would be able to do this.
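The fixture-based approach described above might look like the sketch below; `IDKSquared`, its constructor arguments, and all numeric values are placeholders, not the actual aeon API or real captured output:

```python
import numpy as np

# Hypothetical input/output pair captured from the original code
# (e.g. run once in a Docker container and pasted in as a fixture).
FIXTURE_INPUT = np.array([1.0, 1.2, 0.9, 5.0, 1.1, 1.0])
FIXTURE_EXPECTED = np.array([0.10, 0.12, 0.09, 0.95, 0.11, 0.10])

def test_matches_original_fixture():
    # scores = IDKSquared(random_state=0).fit_predict(FIXTURE_INPUT)
    scores = FIXTURE_EXPECTED.copy()  # stand-in so this sketch is runnable
    # Tolerances absorb rounding differences between the two codebases.
    np.testing.assert_allclose(scores, FIXTURE_EXPECTED, rtol=1e-6)

test_matches_original_fixture()
```

Pinning the fixtures to a fixed random_state keeps the comparison deterministic across both codebases.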

@MatthewMiddlehurst
Member

@SebastianSchmidl For classification we would just say run both on a decent number of UCR datasets and provide the accuracy results for all those datasets.

No rush on you doing anything here; the onus is on the code contributor to prove that the output is equivalent, IMO, unless they are trusted by the community or by the algorithm author or similar. If you can get around to doing it eventually that's great, but there is no obligation to do anything at all really.

@Ramana-Raja
Author

Ramana-Raja commented Feb 4, 2025

@SebastianSchmidl For classification we would just say run both on a decent number of UCR datasets and provide the accuracy results for all those datasets.

No rush on you doing anything here; the onus is on the code contributor to prove that the output is equivalent, IMO, unless they are trusted by the community or by the algorithm author or similar. If you can get around to doing it eventually that's great, but there is no obligation to do anything at all really.

Sorry for the delay, I've been caught up with exams. I ran the anomaly detection using np.random.normal 100 times, each with a sample size of 100 (comparing mean and variance with np.testing.assert_allclose). I also added random_state to the original code, since not using it can lead to a lot of variation in the results. Here are the data and results:

data/code: [image]
results: [image]

@MatthewMiddlehurst
Member

This does not really help us determine whether the implementation is accurate to the original algorithm. We would be looking for a comparison against the original implementation (or published results) using actual datasets and evaluation metrics.

@Ramana-Raja
Author

Ramana-Raja commented Feb 28, 2025

This does not really help us determine whether the implementation is accurate to the original algorithm. We would be looking for a comparison against the original implementation (or published results) using actual datasets and evaluation metrics.

I used AUC to evaluate both the original model and the aeon model, and they produced identical results. This is expected, since I set a random state for the original model, ensuring consistency. The scores are relatively low because I used only a width of 1; larger widths reduce the output size of the original model. Here are the results (I also used np.testing.assert_allclose):

code (datasets used: kdd_tsad_135, ecg_diff_count_3): [image]
output: [image]
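The AUC comparison described above can be sketched with NumPy alone; all labels and scores below are placeholders, not results from kdd_tsad_135 or ecg_diff_count_3:

```python
import numpy as np

def roc_auc(labels, scores):
    # Mann-Whitney formulation of ROC AUC: the fraction of
    # (positive, negative) pairs where the positive point scores higher.
    pos, neg = scores[labels == 1], scores[labels == 0]
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (greater + 0.5 * ties) / (pos.size * neg.size)

labels = np.array([0, 0, 0, 1, 1, 0, 0, 0])
scores_original = np.array([0.1, 0.2, 0.1, 0.9, 0.8, 0.2, 0.1, 0.1])
scores_aeon = scores_original.copy()  # identical when random_state matches

# Element-wise agreement plus equal AUC on the same ground truth.
np.testing.assert_allclose(scores_aeon, scores_original)
assert roc_auc(labels, scores_aeon) == roc_auc(labels, scores_original)
```

In practice `sklearn.metrics.roc_auc_score` would do the same job; the hand-rolled version above just keeps the sketch dependency-free.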

Member

@MatthewMiddlehurst MatthewMiddlehurst left a comment


Hi, I'm a little concerned about the licensing of the original code. Please ensure this is an adaptation, and that there are no significant chunks of copied-and-pasted code, if there are any currently.


def _predict(self, X):
    rng = np.random.default_rng(self.random_state)
    if self.sliding or self.width > 1:
Member


Why self.width > 1? This does not seem to match the docstring, where it mentions this is used for both the sliding and fixed windows.

Author

@Ramana-Raja Ramana-Raja Mar 4, 2025


@SebastianSchmidl mentioned that the output should be the same size as the input. If the width becomes greater than 1, it reduces the size of the output, so I thought I would add a reverse window for that case too. But I made a mistake here with the self.width condition: as written, a width greater than 1 would take the square-sliding path. I will fix it.

Member


Hi, that is correct; it should be the same length as the input.

@MatthewMiddlehurst
Member

That is a good base for evaluation, but please clean up the code examples before you post; they are a little difficult to parse. Why have you limited it to a width of 1? That seems a bit odd for windowing.

@Ramana-Raja
Author

That is a good base for evaluation, but please clean up the code examples before you post; they are a little difficult to parse. Why have you limited it to a width of 1? That seems a bit odd for windowing.

I made that change because, in the original model, the output size is reduced based on the window size, whereas the aeon model uses a reverse window to restore the original output size. I didn't use the exact same reverse window, since I felt it would change too much of the original implementation. But if you'd prefer that approach, I'm happy to re-evaluate the tests.
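One common "reverse window" scheme, averaging the scores of every sliding window that covers a point, can be sketched as below. This illustrates the general idea only; it is not necessarily the exact transform aeon applies:

```python
import numpy as np

def reverse_windowing(window_scores, width, n):
    """Expand per-window scores back to n per-point scores by
    averaging the score of every sliding window covering each point."""
    point_scores = np.zeros(n)
    counts = np.zeros(n)
    for i, s in enumerate(window_scores):
        point_scores[i:i + width] += s  # window i covers points i..i+width-1
        counts[i:i + width] += 1
    return point_scores / counts

# 5 points with width 2 give 4 sliding windows; the reverse window
# restores a score series of the original length 5.
restored = reverse_windowing(np.array([0.1, 0.2, 0.9, 0.3]), width=2, n=5)
assert restored.shape == (5,)
```

Edge points are covered by fewer windows, which is why the per-point counts are tracked rather than dividing by a constant.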

@Ramana-Raja
Author

I'm having trouble applying a reverse window on fixed_window when self.width is greater than 1. Since this involves matrix decomposition, I'm wondering if @SebastianSchmidl could provide some insights.

@MatthewMiddlehurst
Member

I wouldn't change the tests, given we want it to work with width > 3; I don't think this will be merged until that is resolved somehow.

I don't know enough about the algorithm and the windowing function to help with that, unfortunately. Does using stride and padding_length not work?

@Ramana-Raja
Author

Ramana-Raja commented Mar 7, 2025

I wouldn't change the tests, given we want it to work with width > 3; I don't think this will be merged until that is resolved somehow.

I don't know enough about the algorithm and the windowing function to help with that, unfortunately. Does using stride and padding_length not work?

Hi, this aeon implementation works just as well as the original implementation, with one key difference: while the original model reduces the input size, aeon attempts to restore it. This approach works well with sliding enabled, but without sliding the output becomes too small for reverse windowing. Even if we apply reverse windowing, the result is just repeated numbers. So I figured it's best to return the output as-is when sliding is set to False.

(used width=2): [image]

Labels
anomaly detection Anomaly detection package enhancement New feature, improvement request or other non-bug code enhancement
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[ENH] Implement (s-)IDK² anomaly detector
3 participants