Imputer

A Python implementation of missing value imputation using k-nearest neighbors (kNN).


Install

git clone https://github.com/bwanglzu/Imputer.py.git
cd Imputer.py
# install dependencies
pip install -r requirements.txt
# install imputer
python setup.py install

Usage

from imputer import Imputer
impute = Imputer()

Default usage (X should be a pandas.DataFrame or np.ndarray; column is the name or index of the column to impute):

X_imputed = impute.knn(X=data, column='age')  # uses the default of k=10 neighbors

Change the number of neighbors k:

X_imputed = impute.knn(X=data, column='age', k=3)
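
To illustrate what kNN imputation of a numerical column does conceptually, here is a minimal pure-Python sketch. It is not this package's implementation: the function name knn_impute_numeric, the list-of-lists input, the Euclidean distance, and the unweighted mean are all assumptions made for the example. It also assumes that only the target column contains missing values (marked as None) and that at least one row has a known target value.

```python
import math


def knn_impute_numeric(rows, target, k=10):
    """Fill None values in column `target` with the mean of the k nearest rows.

    Hypothetical helper for illustration only; not part of the imputer package.
    """
    # Rows whose target value is known serve as candidate neighbors.
    donors = [r for r in rows if r[target] is not None]
    imputed = []
    for row in rows:
        filled = list(row)  # copy so the input is left untouched
        if filled[target] is None:
            # Euclidean distance over every column except the target.
            def distance(other):
                return math.sqrt(sum(
                    (a - b) ** 2
                    for i, (a, b) in enumerate(zip(row, other))
                    if i != target
                ))

            nearest = sorted(donors, key=distance)[:k]
            # Unweighted mean of the neighbors' target values.
            filled[target] = sum(n[target] for n in nearest) / len(nearest)
        imputed.append(filled)
    return imputed


data = [
    [1.0, 20.0],
    [1.1, 22.0],
    [5.0, 60.0],
    [1.05, None],  # missing value in column 1
]
result = knn_impute_numeric(data, target=1, k=2)
print(result[3])
```

With k=2, the two rows closest to the incomplete row are the first two, so the missing value is filled with the mean of their target values. A smaller k follows local structure more closely, while a larger k smooths over more rows.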

By default, the column is imputed as a numerical feature. To impute a categorical feature, pass is_categorical=True:

X_imputed = impute.knn(X=data, column='gender', k=10, is_categorical=True)
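
For a categorical column, averaging neighbor values is not meaningful, so a common approach is a majority vote among the nearest neighbors. The sketch below shows that idea in pure Python. It is an assumption about how such an imputation can work, not a description of this package's internals; the name knn_impute_categorical and the data layout are hypothetical, and the other columns are assumed to be numeric and complete.

```python
from collections import Counter


def knn_impute_categorical(rows, target, k=10):
    """Fill None values in column `target` with the most common neighbor label.

    Hypothetical helper for illustration only; not part of the imputer package.
    """
    # Rows with a known label serve as candidate neighbors.
    donors = [r for r in rows if r[target] is not None]
    imputed = []
    for row in rows:
        filled = list(row)  # copy so the input is left untouched
        if filled[target] is None:
            # Squared Euclidean distance over the non-target columns;
            # the square root is skipped because it does not change the ranking.
            def sq_distance(other):
                return sum(
                    (a - b) ** 2
                    for i, (a, b) in enumerate(zip(row, other))
                    if i != target
                )

            nearest = sorted(donors, key=sq_distance)[:k]
            # Majority vote over the neighbors' labels.
            votes = Counter(n[target] for n in nearest)
            filled[target] = votes.most_common(1)[0][0]
        imputed.append(filled)
    return imputed


samples = [
    [1.0, 'a'],
    [1.2, 'a'],
    [0.9, 'a'],
    [8.0, 'b'],
    [8.3, 'b'],
    [1.1, None],  # missing label in column 1
]
labelled = knn_impute_categorical(samples, target=1, k=3)
print(labelled[5])
```

Here the three nearest rows all carry the label 'a', so the missing entry is filled with 'a'.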

Test

nosetests --with-coverage

Reference

Troyanskaya O, Cantor M, Sherlock G, et al. Missing value estimation methods for DNA microarrays[J]. Bioinformatics, 2001, 17(6): 520-525.
