Grizzly is a pandas-based package for working with RDF graphs.
Please check the release notes for detailed changes by version.
Grizzly is a ready-to-install Python package. You need Python 3 (it has not been tested with Python 2) and the packages listed in the requirements file.
pip install -r requirements.txt
You can install Grizzly directly from this repository. On the command line type
pip install git+
To upgrade Grizzly to a new version run:
pip install --upgrade grizzly
Clone the repository to a convenient location.
git clone
Then install Grizzly without transferring it to the Python site-packages directory. This way any change to the source files is immediately reflected when you use the package.
cd grizzly
pip install -e .
Load triples from a CSV file:
import grizzly as gz
graph = gz.read_csv('filename.csv')
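The CSV layout that read_csv expects is not described here. As a minimal sketch, assuming a file with a header row and one triple per row (the column names subject, predicate and object, the helper read_triples and the sample data are illustrative assumptions, not Grizzly's documented format), such a file could be inspected with the standard library before loading it:

```python
import csv
import io

# Hypothetical sample data: one triple per row, prefixed names as values.
SAMPLE_CSV = """subject,predicate,object
ex:Animal,skos:narrower,ex:Bear
ex:Bear,skos:narrower,ex:Grizzly
"""


def read_triples(text):
    """Return a list of (subject, predicate, object) tuples from CSV text."""
    reader = csv.DictReader(io.StringIO(text))
    return [(row["subject"], row["predicate"], row["object"]) for row in reader]


triples = read_triples(SAMPLE_CSV)
print(len(triples), "triples, first:", triples[0])
```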
Load a graph from a remote graph store (replace the text between asterisks):
creds = {'base_url': 'http://localhost:8000',
'name': 'example',
'username': **username**,
'password': **password**}
store = gz.Repository(**creds)
graph = store.query('select * {?s ?p ?o .} limit 100')
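For orientation, an RDF4J-style server accepts SPARQL queries as a GET request on the repository URL with the query passed as a URL parameter. The sketch below only builds such a request and does not send it. Whether Repository constructs its requests this way is an assumption, and build_query_request is a hypothetical helper, not part of Grizzly.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_query_request(base_url, name, query):
    """Build (but do not send) a GET request for a SPARQL query."""
    endpoint = f"{base_url.rstrip('/')}/repositories/{name}"
    url = f"{endpoint}?{urlencode({'query': query})}"
    # Ask for SPARQL results in JSON, the MIME type mentioned for read_json.
    return Request(url, headers={"Accept": "application/sparql-results+json"})


request = build_query_request(
    "http://localhost:8000", "example", "select * {?s ?p ?o .} limit 100"
)
print(request.full_url)
```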
Run a SPARQL query:
select ?sub
where {
    ?sub ?pre ?obj .
}
Run a query using the GraphFrames DSL:
graph.find('(a)-[b]->(c);(c)-[d]->(e)').filter('b = skos:broader')
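To make the motif above concrete: it asks for two edges that share the middle node c, and the filter then keeps only matches whose first edge is skos:broader. The plain-Python sketch below imitates that on a list of tuples. It illustrates the pattern's meaning under assumed sample data and is not how Grizzly evaluates find; find_two_hop is a hypothetical helper.

```python
# Hypothetical sample graph as (subject, predicate, object) tuples.
TRIPLES = [
    ("ex:Grizzly", "skos:broader", "ex:Bear"),
    ("ex:Cub", "skos:related", "ex:Bear"),
    ("ex:Bear", "skos:broader", "ex:Animal"),
    ("ex:Bear", "rdfs:label", "Bear"),
]


def find_two_hop(triples):
    """Match (a)-[b]->(c);(c)-[d]->(e): two edges that share the node c."""
    matches = []
    for a, b, c in triples:
        for c2, d, e in triples:
            if c == c2:
                matches.append({"a": a, "b": b, "c": c, "d": d, "e": e})
    return matches


# Equivalent of .filter('b = skos:broader') on the matched rows.
matches = [m for m in find_two_hop(TRIPLES) if m["b"] == "skos:broader"]
for match in matches:
    print(match)
```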
You can run update queries directly. This returns a new graph with the updates:
updated_graph = graph.query('''
delete {
    ?parent skos:narrower ?child .
}
insert {
    ?parent skos:broader ?child .
}
where {
    ?parent skos:narrower ?child .
}
''')
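The effect of such a delete/insert/where update can be pictured with plain sets: the where clause finds the matches, the delete and insert templates are instantiated for each match, and the result is a new graph while the original stays unchanged. This is a sketch of the semantics under assumed sample data, not Grizzly's implementation; apply_update is a hypothetical helper.

```python
def apply_update(triples):
    """Return a new set with every skos:narrower triple replaced by skos:broader."""
    # where: collect the (?parent, ?child) bindings.
    matches = {(s, o) for s, p, o in triples if p == "skos:narrower"}
    # delete / insert: instantiate both templates for each binding.
    deleted = {(s, "skos:narrower", o) for s, o in matches}
    inserted = {(s, "skos:broader", o) for s, o in matches}
    return (set(triples) - deleted) | inserted


graph = frozenset({
    ("ex:Animal", "skos:narrower", "ex:Bear"),
    ("ex:Bear", "rdfs:label", "Bear"),
})
updated = apply_update(graph)
print(sorted(updated))
```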
It is also possible to delete or insert triples that had previously been generated through a CONSTRUCT query:
new_triples = graph.query('''
construct {
    ?parent skos:broader ?child .
}
where {
    ?parent skos:narrower ?child .
}
''')
updated_graph = graph.insert(new_triples)
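The two-step variant separates building the triples from applying them, so the same constructed triples can be inserted and later deleted again. A rough set-based sketch of that idea follows; the helper names construct_broader, insert and delete are illustrative stand-ins, not the Graph methods themselves.

```python
def construct_broader(triples):
    """Build new triples from the matches, leaving the input untouched."""
    return {(s, "skos:broader", o) for s, p, o in triples if p == "skos:narrower"}


def insert(triples, new_triples):
    """Return a new set containing both the old and the new triples."""
    return set(triples) | set(new_triples)


def delete(triples, old_triples):
    """Return a new set without the given triples."""
    return set(triples) - set(old_triples)


graph = {("ex:Animal", "skos:narrower", "ex:Bear")}
new_triples = construct_broader(graph)
updated = insert(graph, new_triples)
restored = delete(updated, new_triples)
print(len(graph), len(updated), len(restored))
```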
Serialize the graph to plain triples that can be uploaded to a graph store:
triples = graph.to_triples()
read_turtle (string, custom_prefixes={}):
Reads a file or string in ttl format and returns a Graph. You can supply custom_prefixes that should be taken into account.
read_csv (string, *args, custom_prefixes={}, keep_default_na=False, **kwargs):
Reads a file or string in CSV format and returns a Graph. You can supply custom_prefixes that should be taken into account.
read_json (string, custom_prefixes={}):
Reads a file or graph store response in the JSON format. If it is RDF/JSON it returns a typed Graph; if it has the MIME type 'application/sparql-results+json' it returns a typed DataFrame. You can supply custom_prefixes that should be taken into account.
read_ntriples (string, custom_prefixes={}):
Reads a file or string of N-Triples and returns a Graph. This might be slow as it uses the query parser at the moment. You can supply custom_prefixes that should be taken into account.
read_remote (query, endpointURL, repository, custom_prefixes={}, username='', password=''):
Queries a SPARQL endpoint at endpointURL. The preferred way at the moment is to call query on a Repository instead.
parse (query, rule_name='query', parser='SPARQL'):
Parses a string query with the parser indicated in parser. By default it starts with the start rule query, but other rules can be passed as the rule_name argument.
term (query, *args, ast=False, **kwargs):
No description yet
variable (query, *args, ast=False, **kwargs):
No description yet
resource (query, *args, ast=False, **kwargs):
No description yet
literal (query, *args, ast=False, **kwargs):
No description yet
expression (query, *args, ast=False, **kwargs):
No description yet
call (query, *args, ast=False, **kwargs):
No description yet
constrain (constraints, parts):
No description yet
filter (expression):
Filter values according to expression following the GraphFrames pattern.
copy_metadata (source):
No description yet
decode (value=None):
No description yet
encode (value=None):
No description yet
encode_value (value):
No description yet
to_pandas ():
No description yet
to_koalas ():
No description yet
to_csv (*args, index=False, decode=True, **kwargs):
No description yet
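For read_json in the list above, the two JSON flavours differ in their top-level shape: SPARQL query results carry a head key next to results (or boolean), while RDF/JSON maps subjects to predicates to lists of objects. The description says read_json decides by MIME type; the sketch below inspects the content instead, which is a swapped-in technique for illustration. detect_json_flavour is a hypothetical helper.

```python
import json


def detect_json_flavour(text):
    """Return 'sparql-results' or 'rdf-json' based on the top-level keys."""
    data = json.loads(text)
    if "head" in data and ("results" in data or "boolean" in data):
        return "sparql-results"
    return "rdf-json"


# Hypothetical sample documents in the two formats.
SPARQL_RESULTS = json.dumps({
    "head": {"vars": ["s"]},
    "results": {"bindings": [
        {"s": {"type": "uri", "value": "http://example.org/Bear"}}
    ]},
})
RDF_JSON = json.dumps({
    "http://example.org/Bear": {
        "http://www.w3.org/2000/01/rdf-schema#label": [
            {"type": "literal", "value": "Bear"}
        ]
    }
})
print(detect_json_flavour(SPARQL_RESULTS), detect_json_flavour(RDF_JSON))
```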
This class represents an RDF graph in memory.
find (query_string):
Query the graph with the motif patterns used in GraphFrames, e.g. '(a)-[b]->(c);(c)-[d]->(e)'. Not fully implemented at the moment.
query (query_string):
Query the graph with a SPARQL query. Currently supported are
queries. Returns a Graph instance or a Table instance depending on the type of the query.
where (query, *args, ast=False, **kwargs):
No description yet
triple (query, *args, ast=False, **kwargs):
No description yet
delete (triples, name=''):
Deletes triples from the graph. triples must be a DataFrame or an instance of a subclass with three columns.
insert (triples, name=''):
Inserts triples into the graph. triples must be a DataFrame or an instance of a subclass with three columns.
get_graph (name=''):
No description yet
get_encoded_triples (triples):
No description yet
to_table ():
No description yet
to_triples ():
No description yet
to_turtle (filename):
No description yet
to_json (filename=None):
Serializes the graph to RDF/JSON. Returns a string if no filename is given, otherwise saves the result as a file.
to_property_graph (filename=None, property_names={}, directed=False, multigraph=False, graph={}):
No description yet
to_remote (endpointURL, repository, graph, username=None, password=None):
No description yet
select (query, *args, ast=False, **kwargs):
No description yet
construct (query, *args, ast=False, **kwargs):
No description yet
modify_solution (query):
No description yet
groupby (*args, **kwargs):
No description yet
sort_columns (col_a, col_b):
No description yet
to_graph ():
No description yet
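For to_json in the list above: RDF/JSON nests triples as subject, then predicate, then a list of object descriptions with a type and a value. The sketch below serializes URI-only triples and follows the described behaviour of returning a string when no filename is given. to_rdf_json is a hypothetical stand-in with assumed sample data, not the actual method.

```python
import json
import os
import tempfile


def to_rdf_json(triples, filename=None):
    """Serialize (subject, predicate, object) URI triples to RDF/JSON.

    Returns the JSON string when filename is None, otherwise writes the file.
    """
    document = {}
    for s, p, o in triples:
        # All objects are treated as URIs in this sketch; literals would
        # need {"type": "literal", ...} entries instead.
        document.setdefault(s, {}).setdefault(p, []).append(
            {"type": "uri", "value": o}
        )
    text = json.dumps(document, indent=2)
    if filename is None:
        return text
    with open(filename, "w", encoding="utf-8") as handle:
        handle.write(text)
    return None


TRIPLES = [(
    "http://example.org/Bear",
    "http://www.w3.org/2004/02/skos/core#broader",
    "http://example.org/Animal",
)]
as_string = to_rdf_json(TRIPLES)

# Writing to a file produces the same text.
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "graph.json")
    to_rdf_json(TRIPLES, path)
    with open(path, encoding="utf-8") as handle:
        from_file = handle.read()
print(as_string)
```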
This class represents a graph store repository as an abstraction over a REST API. The API is designed to mirror the API of the local Graph class. All methods take instances of Graph as arguments and return instances of Graph or Table. Initialize with the base_url (e.g. http://localhost:7200). In the case of an RDF4J endpoint, either select the repository by providing it as repository_name or include it in the base_url (e.g. http://localhost:7200/repositories/example). Additionally, custom_prefixes can be supplied, which will be used to generate prefix definitions in queries. For authentication you can provide username and password.
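Generating prefix definitions from custom_prefixes could look roughly like the sketch below. It assumes custom_prefixes maps prefix names to namespace IRIs, which is not stated here, and it uses a naive substring check to decide which prefixes a query uses. add_prefixes is a hypothetical helper, not Grizzly's implementation.

```python
def add_prefixes(query, prefixes):
    """Prepend a PREFIX line for each prefix that appears in the query."""
    lines = [
        f"PREFIX {prefix}: <{namespace}>"
        for prefix, namespace in prefixes.items()
        # Naive check: a real implementation would tokenize the query.
        if f"{prefix}:" in query
    ]
    return "\n".join(lines + [query])


# Assumed shape of custom_prefixes: prefix name -> namespace IRI.
CUSTOM_PREFIXES = {
    "skos": "http://www.w3.org/2004/02/skos/core#",
    "ex": "http://example.org/",
}
query = "select ?child { ?parent skos:narrower ?child . }"
full_query = add_prefixes(query, CUSTOM_PREFIXES)
print(full_query)
```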
query (query):
Query the repository with a SPARQL query. Automatically adds known prefix definitions to the query. Currently supported are
queries. Returns a Graph instance or a Table instance depending on the type of the query.
insert (graph, name):
Inserts graph into the repository. graph should be a Graph instance. The graph will be inserted into name, which can either be a named graph or 'default'. Returns True if successful.
get_graph (name):
Gets an entire graph from the repository and returns it as a Graph instance. The graph name can either be a named graph or 'default'.
list_graphs ():
Returns a
containing all named graphs in the repository.
size ():
Returns the number of triples in the repository.
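For insert in the list above, the difference between a named graph and 'default' can be expressed in standard SPARQL Update: triples for a named graph are wrapped in a GRAPH block, triples for the default graph are not. Whether Repository.insert sends a SPARQL update or uses another upload mechanism is not stated here, so the sketch below is an assumption. build_insert_update is a hypothetical helper that handles URI-only triples.

```python
def build_insert_update(triples, name):
    """Build a SPARQL INSERT DATA update for URI-only triples."""
    body = "\n".join(f"  <{s}> <{p}> <{o}> ." for s, p, o in triples)
    if name == "default":
        return f"INSERT DATA {{\n{body}\n}}"
    # Named graph: wrap the triples in a GRAPH block.
    return f"INSERT DATA {{\n GRAPH <{name}> {{\n{body}\n }}\n}}"


TRIPLES = [(
    "http://example.org/Bear",
    "http://www.w3.org/2004/02/skos/core#broader",
    "http://example.org/Animal",
)]
default_update = build_insert_update(TRIPLES, "default")
named_update = build_insert_update(TRIPLES, "http://example.org/graphs/animals")
print(named_update)
```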