nx.Graph CRUD Interface #3

Merged
merged 59 commits into from
May 5, 2024
e86e5a8
cleanup: `DiGraph` & `Graph`
aMahanna May 2, 2024
b92829d
fix: `Digraph`
aMahanna May 2, 2024
e7bb6d1
temp: hide `MultiGraph` & `MultiDiGraph`
aMahanna May 2, 2024
0f5a61e
checkpoint
aMahanna May 3, 2024
0f85094
new: `starter.sh` script for DB
aMahanna May 3, 2024
ad25eb0
skip test if missing `phenolrs`
aMahanna May 3, 2024
39036e5
checkpoint (again)
aMahanna May 3, 2024
7fafe4e
fix: `graph.py`
aMahanna May 3, 2024
6a39dc7
checkpoint (again)
aMahanna May 3, 2024
74737b2
bump
aMahanna May 3, 2024
4cdafa7
update tests
aMahanna May 3, 2024
b89f130
simplify `nx_arangodb` structure, update `dict.py`, cleanup
aMahanna May 3, 2024
f6a60e4
fix: use `orig_func`
aMahanna May 4, 2024
35a5ceb
checkpoint
aMahanna May 4, 2024
fa9263c
remove multigraph
aMahanna May 4, 2024
b8e05a0
update `_nx_arangodb`
aMahanna May 4, 2024
b9e574c
new: `nx.shortest_path`
aMahanna May 4, 2024
45a1e20
update tests
aMahanna May 4, 2024
5ed8afa
checkpoint (CI is failing)
aMahanna May 4, 2024
e38e4b3
remove duplicate file
aMahanna May 4, 2024
8bda427
fix: CI failure
aMahanna May 4, 2024
81ee073
rename: `aql()` instead of `query()`
aMahanna May 4, 2024
11edcaa
cleanup
aMahanna May 4, 2024
d44d6e0
HACK: `from_networkx_arangodb`
aMahanna May 4, 2024
cf6341a
cleanup tests
aMahanna May 4, 2024
20e4cf8
fix: `logger` instead of `print`
aMahanna May 4, 2024
63a1c88
checkpoint (again)
aMahanna May 4, 2024
cacbc95
update: `test_edges_crud`
aMahanna May 4, 2024
d386b9b
remove unused overrides
aMahanna May 4, 2024
a22814a
fix: aql functions
aMahanna May 4, 2024
b401b3a
fix: address edge duplication for `nxadb.Graph`
aMahanna May 4, 2024
14f63d8
add edge duplication test case
aMahanna May 4, 2024
9e89e46
fix: typo
aMahanna May 5, 2024
d70fa1a
more `debug` logs :heart:
aMahanna May 5, 2024
b752650
remove outdated comments
aMahanna May 5, 2024
515eaa9
fix: debugs
aMahanna May 5, 2024
4fe281a
fix: test typo
aMahanna May 5, 2024
880ce41
Update README.md
aMahanna May 5, 2024
069aaee
Update README.md
aMahanna May 5, 2024
2125a57
experimental: `CustomEdgeView`, `CustomEdgeDataView`
aMahanna May 5, 2024
69bd3d0
checkpoint
aMahanna May 5, 2024
9dbf488
cleanup
aMahanna May 5, 2024
7667eec
update readme
aMahanna May 5, 2024
cf1ac75
update readme
aMahanna May 5, 2024
52e1df5
new: `test_readme`
aMahanna May 5, 2024
b4fb7ec
fix: bc
aMahanna May 5, 2024
017658f
fix: shortest_path
aMahanna May 5, 2024
aef36a0
add pass-through classes for `DiGraph`, `MultiGraph`, and `MultiDiGraph`
aMahanna May 5, 2024
d441fe5
fix: `run_nx_tests`
aMahanna May 5, 2024
79f6c9a
cleanup
aMahanna May 5, 2024
3c17c7d
fix: `exceptions.py`
aMahanna May 5, 2024
0c7f0cd
bump
aMahanna May 5, 2024
4cb1305
fix: `create_using`
aMahanna May 5, 2024
5c28290
update readme
aMahanna May 5, 2024
231de65
fix: nxcg
aMahanna May 5, 2024
53b4117
fix: type check
aMahanna May 5, 2024
7cced5a
attempt fix: logger handler
aMahanna May 5, 2024
2be0fca
attempt fix: logger
aMahanna May 5, 2024
d53b1d0
Merge branch 'main' into nxadb-crud
aMahanna May 5, 2024
9 changes: 4 additions & 5 deletions .github/workflows/build.yaml
@@ -21,11 +21,10 @@ jobs:
cache: 'pip'
cache-dependency-path: setup.py

- name: Set up ArangoDB Instance via Docker
run: docker create --name adb -p 8529:8529 -e ARANGO_ROOT_PASSWORD=test arangodb/arangodb

- name: Start ArangoDB Instance
run: docker start adb
- name: Set up ArangoDB
run: |
chmod +x starter.sh
./starter.sh

- name: Setup pip
run: python -m pip install --upgrade pip setuptools wheel
90 changes: 73 additions & 17 deletions README.md
@@ -3,29 +3,85 @@ Development Sandbox:
<a href="https://colab.research.google.com/drive/1gIfJDEumN6UdZou_VlSbG874xGkHwtU2?usp=sharing" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

What's currently possible:
- Algorithm dispatching for GPU & CPU (`betweenness_centrality`, `pagerank`, `louvain_communities`)
- Data Load from ArangoDB to `nx`
- Data Load from ArangoDB to `nxcg`
- ArangoDB CRUD Interface for `nx.Graph`
- Algorithm dispatching to `nx` & `nxcg` (`betweenness_centrality`, `pagerank`, `louvain_communities`)
- Algorithm dispatching to ArangoDB (`shortest_path`)
- Data Load from ArangoDB to `nx` object
- Data Load from ArangoDB to `nxcg` object
- Data Load from ArangoDB via dictionary-based remote connection

Next Milestone:
- NetworkX CRUD Interface for ArangoDB
Next steps:
- Generalize `nxadb`'s support for `nx` & `nxcg` algorithms
- Improve support for `nxadb.DiGraph`
- CRUD Interface Improvements

Planned, but not yet scopped:
- NetworkX Graph Query Method
- Data Write to ArangoDB from `nx`
- Data Write to ArangoDB from `nxcg`
Planned:
- Support for `nxadb.MultiGraph` & `nxadb.MultiDiGraph`
- Data Load from `nx` to ArangoDB
- Data Load from `nxcg` to ArangoDB
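
The "dictionary-based remote connection" item above can be pictured as a mapping whose reads and writes are forwarded to a database instead of being held in local memory. The following is a minimal sketch under stated assumptions, not the code in this PR: `RemoteAttrDict` and `FakeStore` are hypothetical names, and the in-memory `FakeStore` stands in for ArangoDB so the example runs without a server.

```python
from collections.abc import MutableMapping


class FakeStore:
    """Hypothetical stand-in for a database collection (not an ArangoDB API)."""

    def __init__(self):
        self.docs = {}   # document id -> attribute dict
        self.calls = []  # log of operations, to show each access hits the store

    def read(self, doc_id):
        self.calls.append(("read", doc_id))
        return dict(self.docs.get(doc_id, {}))

    def write(self, doc_id, attrs):
        self.calls.append(("write", doc_id))
        self.docs[doc_id] = dict(attrs)


class RemoteAttrDict(MutableMapping):
    """Attribute dict for one node; every operation goes through the store."""

    def __init__(self, store, doc_id):
        self.store = store
        self.doc_id = doc_id

    def __getitem__(self, key):
        return self.store.read(self.doc_id)[key]

    def __setitem__(self, key, value):
        attrs = self.store.read(self.doc_id)
        attrs[key] = value
        self.store.write(self.doc_id, attrs)

    def __delitem__(self, key):
        attrs = self.store.read(self.doc_id)
        del attrs[key]
        self.store.write(self.doc_id, attrs)

    def __iter__(self):
        return iter(self.store.read(self.doc_id))

    def __len__(self):
        return len(self.store.read(self.doc_id))


store = FakeStore()
node = RemoteAttrDict(store, "person/1")
node["name"] = "John Doe"   # read-modify-write against the store
node.update({"age": 40})    # MutableMapping.update calls __setitem__
del node["name"]
```

Because nothing is cached locally in this sketch, each access costs a round trip to the store. A real implementation would have to weigh that cost against the risk of serving stale data.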

```py

import os
import networkx as nx
import nx_arangodb as nxadb

G_1 = nx.karate_club_graph()
os.environ["DATABASE_HOST"] = "http://localhost:8529"
os.environ["DATABASE_USERNAME"] = "root"
os.environ["DATABASE_PASSWORD"] = "password"
os.environ["DATABASE_NAME"] = "_system"

G = nxadb.Graph(graph_name="KarateGraph")

G_nx = nx.karate_club_graph()
assert len(G.nodes) == len(G_nx.nodes)
assert len(G.adj) == len(G_nx.adj)
assert len(G.edges) == len(G_nx.edges)

nx.betweenness_centrality(G)
nx.pagerank(G)
nx.community.louvain_communities(G)
nx.shortest_path(G, "person/1", "person/34")
nx.all_neighbors(G, "person/1")

G.nodes(data='club', default='unknown')
G.edges(data='weight', default=1000)

G.nodes["person/1"]
G.adj["person/1"]
G.edges[("person/1", "person/3")]

G.nodes["person/1"]["name"] = "John Doe"
G.nodes["person/1"].update({"age": 40})
del G.nodes["person/1"]["name"]

G.adj["person/1"]["person/3"]["weight"] = 2
G.adj["person/1"]["person/3"].update({"weight": 3})
del G.adj["person/1"]["person/3"]["weight"]

G.edges[("person/1", "person/3")]["weight"] = 0.5
assert G.adj["person/1"]["person/3"]["weight"] == 0.5

G.add_node("person/35", name="Jane Doe")
G.add_nodes_from(
[("person/36", {"name": "Jack Doe"}), ("person/37", {"name": "Jill Doe"})]
)
G.add_edge("person/1", "person/35", weight=1.5, _edge_type="knows")
G.add_edges_from(
[
("person/1", "person/36", {"weight": 2}),
("person/1", "person/37", {"weight": 3}),
],
_edge_type="knows",
)

G.remove_edge("person/1", "person/35")
G.remove_edges_from([("person/1", "person/36"), ("person/1", "person/37")])
G.remove_node("person/35")
G.remove_nodes_from(["person/36", "person/37"])

G_2 = nxadb.Graph(G_1)
G.clear()

bc_1 = nx.betweenness_centrality(G_1)
bc_2 = nx.betweenness_centrality(G_2)
bc_3 = nx.betweenness_centrality(G_1, backend="arangodb")
bc_4 = nx.betweenness_centrality(G_2, backend="arangodb")
```
assert len(G.nodes) == len(G_nx.nodes)
assert len(G.adj) == len(G_nx.adj)
assert len(G.edges) == len(G_nx.edges)
```
12 changes: 9 additions & 3 deletions _nx_arangodb/__init__.py
@@ -1,5 +1,3 @@
# Copied from nx-cugraph

"""Tell NetworkX about the arangodb backend. This file can update itself:

$ make plugin-info
@@ -31,20 +29,25 @@
"functions": {
# BEGIN: functions
"betweenness_centrality",
"is_partition",
"louvain_communities",
"louvain_partitions",
"modularity",
"pagerank",
"shortest_path",
"to_scipy_sparse_array",
# END: functions
},
"additional_docs": {
# BEGIN: additional_docs

"shortest_path": "limited version of nx.shortest_path",
# END: additional_docs
},
"additional_parameters": {
# BEGIN: additional_parameters
"is_partition": {
"dtype : dtype or None, optional": "The data type (np.float32, np.float64, or None) to use for the edge weights in the algorithm. If None, then dtype is determined by the edge values.",
},
"louvain_communities": {
"dtype : dtype or None, optional": "The data type (np.float32, np.float64, or None) to use for the edge weights in the algorithm. If None, then dtype is determined by the edge values.",
},
@@ -57,6 +60,9 @@
"pagerank": {
"dtype : dtype or None, optional": "The data type (np.float32, np.float64, or None) to use for the edge weights in the algorithm. If None, then dtype is determined by the edge values.",
},
"shortest_path": {
"dtype : dtype or None, optional": "The data type (np.float32, np.float64, or None) to use for the edge weights in the algorithm. If None, then dtype is determined by the edge values.",
},
"to_scipy_sparse_array": {
"dtype : dtype or None, optional": "The data type (np.float32, np.float64, or None) to use for the edge weights in the algorithm. If None, then dtype is determined by the edge values.",
},
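
The metadata in this hunk is stored flat: a set of function names plus separate `additional_docs` and `additional_parameters` dictionaries keyed by function name. The sketch below shows one way such a layout could be regrouped into a single entry per function. It is an illustration only: this hunk does not show how the file performs that step, `to_per_function_info` is a hypothetical helper, and the `backend_name` value is taken from the `backend="arangodb"` argument in the README.

```python
def to_per_function_info(info):
    """Regroup flat backend metadata into one entry per function (hypothetical helper)."""
    info_keys = ("additional_docs", "additional_parameters")
    # Copy every top-level key except the ones being regrouped.
    result = {
        key: value
        for key, value in info.items()
        if key not in info_keys and key != "functions"
    }
    # Attach to each function only the extra info that actually mentions it.
    result["functions"] = {
        func: {key: info[key][func] for key in info_keys if func in info.get(key, {})}
        for func in sorted(info["functions"])
    }
    return result


_info = {
    "backend_name": "arangodb",
    "functions": {"pagerank", "shortest_path"},
    "additional_docs": {"shortest_path": "limited version of nx.shortest_path"},
    "additional_parameters": {},
}

per_function = to_per_function_info(_info)
```

Keeping the stored form flat makes the `# BEGIN` / `# END` sections easy to regenerate by script, while the regrouped form is easier to look up by function name.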
4 changes: 2 additions & 2 deletions nx_arangodb/__init__.py
@@ -1,5 +1,3 @@
# Copied from nx-cugraph

from networkx.exception import *

from . import utils
@@ -13,4 +11,6 @@
from . import algorithms
from .algorithms import *

from .logger import logger

from _nx_arangodb._version import __git_commit__, __version__
3 changes: 2 additions & 1 deletion nx_arangodb/algorithms/__init__.py
@@ -1,4 +1,5 @@
from . import centrality, community, link_analysis
from . import centrality, community, link_analysis, shortest_paths
from .centrality import *
from .community import *
from .link_analysis import *
from .shortest_paths import *
69 changes: 17 additions & 52 deletions nx_arangodb/algorithms/centrality/betweenness.py
@@ -1,24 +1,19 @@
from networkx.algorithms.centrality import betweenness as nx_betweenness
import networkx as nx

from nx_arangodb.convert import _to_nxadb_graph, _to_nxcg_graph
from nx_arangodb.logger import logger
from nx_arangodb.utils import networkx_algorithm

try:
import nx_cugraph as nxcg

GPU_ENABLED = True
print("ANTHONY: GPU is enabled")
except ModuleNotFoundError:
GPU_ENABLED = False
print("ANTHONY: GPU is disabled")


__all__ = ["betweenness_centrality"]

# 1. If GPU is enabled, call nx-cugraph bc() after converting to an ncxg graph (in-memory graph)
# 2. If GPU is not enabled, call networkx bc() after converting to an nxadb graph (in-memory graph)
# 3. If GPU is not enabled, call networkx bc() **without** converting to a nxadb graph (remote graph)


@networkx_algorithm(
is_incomplete=True,
@@ -27,56 +22,26 @@
_plc="betweenness_centrality",
)
def betweenness_centrality(
G, k=None, normalized=True, weight=None, endpoints=False, seed=None, run_on_gpu=True
G,
k=None,
normalized=True,
weight=None,
endpoints=False,
seed=None,
run_on_gpu=True,
pull_graph_on_cpu=True,
):
print("ANTHONY: Calling betweenness_centrality from nx_arangodb")
logger.debug(f"nxadb.betweenness_centrality for {G.__class__.__name__}")

# 1.
if GPU_ENABLED and run_on_gpu:
print("ANTHONY: to_nxcg")
G = _to_nxcg_graph(G, weight)

print("ANTHONY: Using nxcg bc()")
logger.debug("using nxcg.betweenness_centrality")
return nxcg.betweenness_centrality(G, k=k, normalized=normalized, weight=weight)

# 2.
else:

print("ANTHONY: to_nxadb")
G = _to_nxadb_graph(G)

print("ANTHONY: Using nx bc()")

betweenness = dict.fromkeys(G, 0.0) # b[v]=0 for v in G
if k is None:
nodes = G
else:
nodes = seed.sample(list(G.nodes()), k)
for s in nodes:
# single source shortest paths
if weight is None: # use BFS
S, P, sigma, _ = nx_betweenness._single_source_shortest_path_basic(G, s)
else: # use Dijkstra's algorithm
S, P, sigma, _ = nx_betweenness._single_source_dijkstra_path_basic(
G, s, weight
)
# accumulation
if endpoints:
betweenness, _ = nx_betweenness._accumulate_endpoints(
betweenness, S, P, sigma, s
)
else:
betweenness, _ = nx_betweenness._accumulate_basic(
betweenness, S, P, sigma, s
)

betweenness = nx_betweenness._rescale(
betweenness,
len(G),
normalized=normalized,
directed=G.is_directed(),
k=k,
endpoints=endpoints,
)
G = _to_nxadb_graph(G, pull_graph=pull_graph_on_cpu)

return betweenness
logger.debug("using nx.betweenness_centrality")
return nx.betweenness_centrality.orig_func(
G, k=k, normalized=normalized, weight=weight, endpoints=endpoints, seed=seed
)
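
The new body of `betweenness_centrality` reduces to a two-way choice: use the GPU library when it is importable and `run_on_gpu` is set, otherwise fall back to the CPU implementation. A minimal sketch of that pattern follows. It assumes nothing about `nx_cugraph` beyond its module name, and `gpu_available`, `pick_implementation`, `gpu_impl`, and `cpu_impl` are hypothetical names introduced for illustration.

```python
import importlib.util


def gpu_available(module_name="nx_cugraph"):
    """Return True if the optional GPU package can be found on this machine."""
    return importlib.util.find_spec(module_name) is not None


def pick_implementation(gpu_impl, cpu_impl, run_on_gpu=True, gpu_enabled=None):
    """Choose between two callables using the same condition as the diff's if/else."""
    if gpu_enabled is None:
        gpu_enabled = gpu_available()
    return gpu_impl if (gpu_enabled and run_on_gpu) else cpu_impl


def gpu_impl(graph):
    # Placeholder for the GPU code path.
    return "gpu"


def cpu_impl(graph):
    # Placeholder for the CPU code path.
    return "cpu"


# Forcing gpu_enabled makes the outcome independent of the local machine.
chosen = pick_implementation(gpu_impl, cpu_impl, run_on_gpu=True, gpu_enabled=False)
print(chosen({}))
```

In the diff itself the CPU branch calls `nx.betweenness_centrality.orig_func`, which, as the commit message "fix: use `orig_func`" suggests, appears intended to reach the plain NetworkX function instead of going through backend dispatch a second time.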