Skip to content

Shimoku version based on BitPhy Dataframer by Alberto Camara

License

Notifications You must be signed in to change notification settings

shimoku-tech/shimoku-dataframer

Repository files navigation

Shimoku Dataframer

A small library for creating pandas DataFrame fixtures.

This library generates pandas dataframes with prescribed columns and types, and filled up rows. It can therefore be used to generate arbitrary data for fixtures to be used in unit tests.

Installation

pip install shimoku-dataframer

Usage

Shimoku Dataframer allows you, by passing a dictionary mapping column names to data typesand data, to generate a fixture dataframe.

Supported data types:

Data types are to be passed as strings:

  • 'timestamp': np.datetime64 with minute precision.
  • 'date': np.datetime64 with day precision.
  • 'int': np.int64.
  • 'float': np.float64.
  • 'str': strings.
  • 'constant_str': a column of a single repeated constant string.
  • 'constant_int': a column of a single repeated constant integer.
  • 'enum': a column of values ranging from 0 to a small integer.

After fixing a numpy random seed, the generated fixture is constant and can be used for testing purposes.

Examples

If no parameters are passed, a dataframe with a single column named 'id' and containing integers is created.

from shimoku_dataframer import DataFrameMaker

maker = DataFrameMaker(seed=1)  # seed fixes the numpy random seed.
df = maker.make_df(nrows=5)

yields

index id
0 98539
1 77708
2 5192
3 98047
4 50057

In order to use any of the supported types, pass them as a dictionary as follows.

from shimoku_dataframer import DataFrameMaker

columns = {
    'a': 'str',
    'b': 'float',
    'c': 'int'
}

maker = DataFrameMaker(seed=1)
df = maker.make_df(nrows=3, cols=columns)
a b c
LRmijlfpaqbmhT 1.624345 98539
8gzYuLsul8QCDo -0.611756 77708
YexxPX3EGwnPjh -0.528172 5192

License

Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

Credits

By BitPhy and Alberto Camara

About

Shimoku version based on BitPhy Dataframer by Alberto Camara

Resources

License

Stars

Watchers

Forks

Packages

No packages published

Languages