Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added support to download example datasets #24

Open
wants to merge 6 commits into
base: master
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from 2 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
3 changes: 2 additions & 1 deletion concepts/__init__.py
Original file line number Diff line number Diff line change
Expand Up @@ -6,10 +6,11 @@
from ._example import EXAMPLE
from .contexts import Context
from .definitions import Definition
from .examples import load_dataset

__all__ = ['EXAMPLE',
'Context', 'Definition', 'Shape',
'load', 'load_cxt', 'load_csv',
'load', 'load_cxt', 'load_csv', load_dataset,
'make_context']

__title__ = 'concepts'
Expand Down
40 changes: 40 additions & 0 deletions concepts/examples.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,40 @@
"""Loading examplary formal contexts."""

import typing
from urllib.request import urlopen

from .contexts import Context

__all__ = ['load_dataset']

DATASET_SOURCE = "https://raw.githubusercontent.com/fcatools/contexts/main/contexts"


# inspired by https://github.com/mwaskom/seaborn/blob/master/seaborn/utils.py#L524
def load_dataset(name: str, data_src: typing.Optional[str] = DATASET_SOURCE,
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: How about keyword-only arguments for eveything except name (and maybe that one even positional-only)?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea. I implemented this in the latest commits.

encoding: typing.Optional[str] = 'utf-8'):
"""Load an example formal context from the online repository (requires internet).

Args:
name (str):
Name of the dataset (``{name}.cxt`` on
https://github.com/fcatools/contexts/tree/main/contexts).
data_src (str, optional):
Base URL to the repository to download the dataset.
encoding (str, optional):
Encoding of the file (``'utf-8'``, ``'latin1'``, ``'ascii'``, ...).

Returns:
Context: New :class:`.Context` instance.

Example:
>>> import concepts
>>> concepts.load_dataset('livingbeings_en')
<Context object mapping 8 objects to 9 properties [e32693ba] at 0x...>
"""

url = f"{data_src}/{name}.cxt"

# TODO: implement caching here?

return Context.fromstring(urlopen(url).read().decode(encoding), 'cxt')
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we should use a context-manager for urlopen()?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Definitely. Done.

10 changes: 10 additions & 0 deletions tests/test_examples.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,10 @@
import pytest
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unused import

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed.


import concepts


def test_load_dataset():
context = concepts.load_dataset('livingbeings_en')
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should probbaly not depend on internet connectivity in the test.

How about having a mocked test and another one doing the actual thing that would be opt-in with a flag?

Something mildy related here: https://github.com/xflr6/graphviz/blob/e5578d39009469df2b7c6743458970643e228226/tests/conftest.py#L5

Copy link
Author

@rjoberon rjoberon Mar 15, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree but I did not manage to implement this. The examples I found are mainly targeting unittest and not pytest (or pytest with the requests module).


assert len(context.objects) == 8
assert len(context.properties) == 9
Loading