[BUG]: StratifiedBootstrap can give the same sample on train and test set #254

fraimondo · 2024-04-04T13:12:25Z

Is there an existing issue for this?

I have searched the existing issues

Current Behavior

The bootstrap indices are drawn with replacement and then sliced into train and test, so the same sample can end up in both sets:

julearn/julearn/model_selection/stratified_bootstrap.py

Lines 100 to 102 in 2e30b6e

    bs_inds = np.random.choice(t_inds, len(t_inds), replace=True)
    train.extend(bs_inds[:n_train])
    test.extend(bs_inds[n_train:])
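A minimal sketch of why this leaks, assuming a single stratum: it repeats the draw-then-slice logic quoted above on made-up indices and counts how often train and test share an index. The helper name `sliced_bootstrap`, the stratum size, and `n_train` are hypothetical and not part of julearn.

```python
import numpy as np


def sliced_bootstrap(t_inds, n_train, rng):
    """Hypothetical helper mirroring the quoted lines for one stratum."""
    # Draw with replacement, then slice the same draw into train and test.
    bs_inds = rng.choice(t_inds, len(t_inds), replace=True)
    return bs_inds[:n_train], bs_inds[n_train:]


rng = np.random.default_rng(0)
t_inds = np.arange(20)  # made-up indices of one stratum
n_train = 14            # made-up train size for that stratum

n_reps = 200
n_leaky = 0
for _ in range(n_reps):
    train, test = sliced_bootstrap(t_inds, n_train, rng)
    # Any index present in both slices is a train/test leak.
    if np.intersect1d(train, test).size > 0:
        n_leaky += 1

print(f"{n_leaky} of {n_reps} splits share at least one index")
```

Because both slices come from the same draw with replacement, nothing prevents an index from appearing on both sides of the cut.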

Expected Behavior

Whatever is chosen for the test set should not also be in the train set.

The current behavior does not match the out-of-bag bootstrap definition.

We should resample with replacement, and whatever sample is not in the train set is the test set.
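The expected behavior could be sketched as follows. This is an illustration under assumptions, not the julearn implementation: `stratified_oob_split` is a hypothetical helper, and stratification is done on a plain label array `y`.

```python
import numpy as np


def stratified_oob_split(y, rng):
    """One stratified out-of-bag bootstrap split (hypothetical helper).

    Within each stratum, train indices are drawn with replacement and the
    test set is every index of that stratum that was never drawn.
    """
    y = np.asarray(y)
    train, test = [], []
    for label in np.unique(y):
        t_inds = np.flatnonzero(y == label)
        # Bootstrap sample of the stratum, same size as the stratum.
        bs_inds = rng.choice(t_inds, len(t_inds), replace=True)
        # Out-of-bag indices: in the stratum but not in the bootstrap sample.
        oob_inds = np.setdiff1d(t_inds, bs_inds)
        train.extend(bs_inds)
        test.extend(oob_inds)
    return np.array(train, dtype=int), np.array(test, dtype=int)


rng = np.random.default_rng(0)
y = np.array([0] * 30 + [1] * 20)  # made-up labels
train, test = stratified_oob_split(y, rng)
print("overlap:", np.intersect1d(train, test).size)
```

With this construction the train and test sets are disjoint by definition. Note that a very small stratum can end up with no out-of-bag samples in a given repetition, so a real implementation would need to decide how to handle that case.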

This would also allow us to implement the .632 and .632+ scoring correction methods.
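For reference, the two corrections can be written as small functions of the resubstitution error, the out-of-bag error, and (for .632+) the no-information error rate. This is a sketch of the standard formulas; the function and argument names are hypothetical and nothing here is an existing julearn API.

```python
def err_632(err_train, err_oob):
    """.632 estimate: weighted mix of resubstitution and out-of-bag error."""
    return 0.368 * err_train + 0.632 * err_oob


def err_632_plus(err_train, err_oob, gamma):
    """.632+ estimate; gamma is the no-information error rate."""
    # Cap the out-of-bag error at the no-information rate.
    err_oob = min(err_oob, gamma)
    # Relative overfitting rate, set to 0 when it is not well defined.
    if gamma > err_train and err_oob > err_train:
        r = (err_oob - err_train) / (gamma - err_train)
    else:
        r = 0.0
    # The weight grows from 0.632 (no overfitting) to 1 (maximal overfitting).
    w = 0.632 / (1.0 - 0.368 * r)
    return (1.0 - w) * err_train + w * err_oob
```

Both need an out-of-bag error computed on samples that were not used for training, which is why the train/test overlap has to be fixed first.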

Steps To Reproduce

Latest julearn.

Environment

not relevant

Relevant log output

No response

Anything else?

No response

fraimondo added the bug Something isn't working label Apr 4, 2024
