
Commit 7c3d98e

Merge pull request #951 from CitrineInformatics/data-manager-docs
Data Manager Documentation Update
2 parents d5c2833 + 10acd1e commit 7c3d98e

17 files changed, with 224 additions and 474 deletions
+144
@@ -0,0 +1,144 @@
1+
=============================
2+
Migrating to Use Data Manager
3+
=============================
4+
5+
Summary
6+
=======
7+
8+
This guide provides users of Citrine Python background and instructions for migrating code to
9+
take full advantage of Data Manager features and
10+
prepare for the future removal of endpoints that will occur with Citrine Python v4.0.
11+
12+
The key change will be that :py:class:`Datasets <citrine.resources.dataset.Dataset>` are now assets
13+
of :py:class:`Teams <citrine.resources.team.Team>`,
14+
rather than :py:class:`Projects <citrine.resources.project.Project>`.
15+
The bulk of code changes will be migrating calls that access collections of data objects and Datasets from a Project-based method to a Team- or Dataset-based method.
16+
17+
If you require any additional assistance migrating your Citrine Python code,
18+
do not hesitate to reach out to your Citrine customer support team.
19+
20+
What’s new?
21+
===========
22+
23+
Once Data Manager has been enabled on your deployment of the Citrine Platform,
24+
the primary change that will affect Citrine Python code is that Datasets,
25+
formerly contained within a Project, are now assets of a Team.
26+
In other words, Teams contain both Datasets and Projects.
27+
28+
Projects still contain assets such as GemTables, Predictors, DesignSpaces, etc., but Datasets and their contents are now at the level of a Team.
29+
Data within a Dataset (in the form of GEMD Objects, Attributes, and Templates, as well as files) are only leveraged within a Project by creating a GemTable.
30+
31+
After Data Manager is activated, any new Datasets created,
32+
either via Citrine Python or the Citrine Platform web UI, will be created at a Team level,
33+
and will not be accessible via the typical `project.<Collection>.{method}` endpoints\*.
34+
New collections, at both the Team and Dataset level, will be available in v3.4 of Citrine Python.
35+
36+
\*Newly-registered Datasets can be accessible via Project-based methods if pulled into a project with `project.pull_in_resource(resource=dataset)`.
37+
However, this is not recommended, as the endpoints that list data by Project and the “pull_in” endpoint for Datasets will be removed in 4.0.
38+
39+
How does this change my code?
40+
=============================
41+
42+
The change in behavior is largely confined to two sets of operations on Datasets and their constituent GEMD data objects:
43+
Sharing and Project-based Collections.
44+
45+
Sharing
46+
-------
47+
48+
**Within a Team**
49+
50+
Previously, sharing a Dataset from one Project to another was a 2-step process: first publishing the Dataset to a Team, then pulling the Dataset into the new project.
51+
Now that all Datasets are assets of teams, sharing within a team is unnecessary.
52+
All of the `publish`, `un_publish`, and `pull_in_resource` endpoints, when applied to Datasets, will undergo deprecation.
53+
To be precise, the following calls will emit a deprecation warning in Citrine Python versions 3.4 and above, and will be removed in version 4.0:
54+
55+
.. code-block:: python
56+
57+
# Publishing a Dataset to a Team will do nothing once Data Manager is activated
58+
project.publish(resource=dataset)
59+
60+
# Un-publishing a Dataset will similarly be a no-op with Data Manager activated
61+
project.un_publish(resource=dataset)
62+
63+
# Pulling a Dataset into a Project can still be done with Data Manager activated, but it is not
64+
# recommended and will be deprecated
65+
project.pull_in_resource(resource=dataset)
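
One way to locate lingering calls to these deprecated endpoints is to record the warnings they emit while your code runs. The sketch below uses only Python's standard ``warnings`` module. It assumes the warnings are issued with the standard ``DeprecationWarning`` category, which this guide does not state, and ``legacy_publish`` is a hypothetical stand-in for a call such as ``project.publish(resource=dataset)``.

```python
import warnings


def legacy_publish():
    """Hypothetical stand-in for a deprecated call such as project.publish(resource=dataset)."""
    warnings.warn("publish is deprecated for Datasets", DeprecationWarning, stacklevel=2)
    return "published"


def find_deprecated_calls(func):
    """Run func and return the messages of any DeprecationWarnings it emitted."""
    with warnings.catch_warnings(record=True) as caught:
        # Record every warning, including ones Python would normally show only once.
        warnings.simplefilter("always")
        func()
    return [str(w.message) for w in caught if issubclass(w.category, DeprecationWarning)]


messages = find_deprecated_calls(legacy_publish)
```

An empty list from ``find_deprecated_calls`` would suggest that the wrapped code no longer touches a deprecated endpoint, provided the assumption about the warning category holds.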
66+
67+
**Between Teams**
68+
69+
Previously, sharing a Dataset from one Project to another, where those Projects are in different Teams, was a 3-step process:
70+
publishing to the Team, sharing from one Team to another, then pulling into a Project.
71+
With Data Manager, only the sharing action is needed.
72+
73+
Previous code for sharing My Dataset from Project A in Team A to eventually use in a Training Set
74+
in Project B in Team B:
75+
76+
.. code-block:: python
77+
78+
project_a.publish(resource=my_dataset)
79+
team_a.share(
80+
resource=my_dataset,
81+
target_team_id=team_b.uid,
82+
)
83+
project_b.pull_in_resource(resource=my_dataset)
84+
85+
Is now:
86+
87+
.. code-block:: python
88+
89+
team_a.share(
90+
resource=my_dataset,
91+
target_team_id=team_b.uid,
92+
)
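
If several Datasets need to move between Teams, the single ``share`` call can be wrapped in a loop. The following is a minimal sketch: ``share_datasets`` is a hypothetical helper, and ``FakeTeam`` is a stand-in that only records calls so the example runs without a Citrine deployment. With a real Team, the same keyword arguments shown in this guide (``resource`` and ``target_team_id``) would be passed.

```python
class FakeTeam:
    """Hypothetical stand-in for a citrine Team; it only records share() calls."""

    def __init__(self, uid):
        self.uid = uid
        self.shared = []

    def share(self, *, resource, target_team_id):
        self.shared.append((resource, target_team_id))


def share_datasets(source_team, datasets, target_team):
    """Share each Dataset from source_team to target_team, one share() call per Dataset."""
    for dataset in datasets:
        # With Data Manager, no publish() beforehand or pull_in_resource() afterwards is needed.
        source_team.share(resource=dataset, target_team_id=target_team.uid)


team_a = FakeTeam(uid="team-a-uid")
team_b = FakeTeam(uid="team-b-uid")
share_datasets(team_a, ["dataset-1", "dataset-2"], team_b)
```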
93+
94+
Project-based Collections
95+
-------------------------
96+
97+
As Datasets are now assets of Teams, typical ways to `list()`, `get()`, or otherwise manipulate Datasets or data objects within a Project will undergo a deprecation cycle.
98+
As of v3.4, these endpoints will still work as usual with a deprecation warning, but will be removed in v4.0.
99+
It is therefore recommended to migrate your code away from all Project-based listing endpoints as soon as possible
100+
to adhere to supported patterns and avoid any costly errors.
101+
102+
The following endpoints will emit a deprecation warning in Citrine Python versions 3.4 and above, and will be removed in version 4.0.
103+
Moreover, they will not return Datasets, or the contents of Datasets, that are registered after Data Manager has been activated:
104+
105+
.. code-block:: python
106+
107+
# Listing Datasets or their Contents (such as MaterialSpecs or ProcessTemplates) from a Project
108+
project.datasets.list()
109+
project.gemd.list()
110+
project.process_runs.list()
111+
...
112+
113+
# Getting Datasets or GEMD Assets via their UID and a Project
114+
project.datasets.get(uid)
115+
project.measurement_specs.get(uid)
116+
...
117+
118+
# Doing any operations (updating, deleting, etc.) to Datasets via a Project collection
119+
project.datasets.update()
120+
121+
The following new methods, introduced in Citrine Python v3.4, are preferred:
122+
123+
.. code-block:: python
124+
125+
# Listing Datasets or their Contents
126+
team.datasets.list()
127+
team.gemd.list()
128+
dataset.property_templates.list()
129+
...
130+
131+
# Getting Datasets or GEMD Assets via their UID
132+
team.datasets.get(uid)
133+
team.ingredient_runs.get(uid)
134+
dataset.process_specs.get(uid)
135+
...
136+
137+
# Doing any operations (updating, deleting, dumping, etc.) to Datasets or GEMD Assets
138+
team.datasets.delete(uid)
139+
dataset.condition_templates.update(object)
140+
...
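
As an illustration of the Team-based pattern, a Dataset can be looked up by name through ``team.datasets.list()``. This is a sketch under stated assumptions: ``find_dataset_by_name`` is a hypothetical helper, and the ``SimpleNamespace`` objects are stand-ins for a real Team and its Datasets so the example runs without a Citrine deployment.

```python
from types import SimpleNamespace


def find_dataset_by_name(team, name):
    """Return the first Dataset in the Team whose name matches, or None if there is none."""
    return next((ds for ds in team.datasets.list() if ds.name == name), None)


# Hypothetical stand-ins for Datasets and a Team; only the attributes used above exist.
fake_datasets = [
    SimpleNamespace(name="Strehlow and Cook", uid="uid-1"),
    SimpleNamespace(name="Other", uid="uid-2"),
]
fake_team = SimpleNamespace(datasets=SimpleNamespace(list=lambda: iter(fake_datasets)))

found = find_dataset_by_name(fake_team, "Strehlow and Cook")
missing = find_dataset_by_name(fake_team, "No such dataset")
```

Returning ``None`` for a missing name, rather than raising, leaves the caller free to decide whether an absent Dataset is an error.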
141+
142+
Note again that even though the Project-based endpoints will remain operational until v4.0,
143+
registration of any new Datasets will be at a Team level and thus inaccessible via these Project-based collections,
144+
unless “pulled in” to a specific Project in that Team.

docs/source/FAQ/index.rst

+1-1
@@ -6,5 +6,5 @@ FAQ
66
:maxdepth: 2
77

88
prohibited_data_patterns
9-
team_management_migration
109
v3_migration
10+
data_manager_migration

docs/source/FAQ/team_management_migration.rst

-125
This file was deleted.

docs/source/data_entry.rst

+5-5
@@ -49,7 +49,8 @@ Equivalent behavior is available through the type-agnostic ``gemd`` collection a
4949
dataset.register(ProcessSpec(...))
5050
5151
Note that registration must be performed within the scope of a dataset: the dataset into which the objects are being written.
52-
The data model object collections that are defined with the project scope (such as `project.process_specs`) are read-only and will throw an error if their register method is called.
52+
The data model object collections that are defined with the team scope (such as `team.process_specs`) are read-only.
53+
Attempts to register, update, etc. via those collections will throw an error.
5354

5455
If you are registering several objects at the same time, you can use the ``register_all`` method that is available via the same objects:
5556

@@ -73,20 +74,19 @@ For example:
7374

7475
.. code-block:: python
7576
76-
project.process_templates.get(LinkByUID(scope="standard-templates", id="milling"))
77+
team.process_templates.get(LinkByUID(scope="standard-templates", id="milling"))
7778
7879
If you know the CitrineID, you do not need to specify a scope:
7980

8081
.. code-block:: python
8182
82-
project.process_templates.get(CitrineID)
83-
83+
team.process_templates.get(CitrineID)
8484
8585
If you don't know any of the data model object's unique identifiers, then you can list the data model objects and find your object in that list:
8686

8787
.. code-block:: python
8888
89-
project.process_templates.list()
89+
team.process_templates.list()
9090
9191
These results can be further constrained by dataset:
9292

docs/source/getting_started/basic_functionality.rst

+2-2
@@ -83,11 +83,11 @@ Get
8383
^^^
8484

8585
Get retrieves a specific resource with a known unique identifier string.
86-
If the project ``ceramic_resistors_project`` has a dataset with an id that you have saved as ``special_dataset_id``, then you could retrieve it with:
86+
If the team ``ceramic_resistors_team`` has a dataset with an id that you have saved as ``special_dataset_id``, then you could retrieve it with:
8787

8888
.. code-block:: python
8989
90-
ceramic_resistors_project.datasets.get(special_dataset_id)
90+
ceramic_resistors_team.datasets.get(special_dataset_id)
9191
9292
List
9393
^^^^

docs/source/getting_started/code_examples.rst

+18-18
@@ -21,44 +21,44 @@ Assuming that your Citrine deployment is ``https://matsci.citrine-platform.com``
2121
citrine = Citrine(API_KEY, API_SCHEME, API_HOST, API_PORT)
2222
2323
24-
Create a Project and Dataset
24+
Create a Team and Dataset
2525
----------------------------
2626

27-
One of your first actions might be to create a new Project and a Dataset, which you and your collaborators can populate.
28-
The code below creates a Project and one Dataset associated with it.
29-
It also inspects the newly registered Project to get its unique id.
27+
One of your first actions might be to create a new Team and a Dataset, which you and your collaborators can populate.
28+
The code below creates a Team and one Dataset associated with it.
29+
It also inspects the newly registered Team to get its unique id.
3030
Note that all resources are given descriptive names and summaries.
3131

3232
.. code-block:: python
3333
34-
from citrine.resources.project import Project
34+
from citrine.resources.team import Team
3535
from citrine.resources.dataset import Dataset
36-
band_gaps_project = citrine.projects.register(name="Band gaps",
36+
band_gaps_team = citrine.teams.register(name="Band gaps",
3737
description="Actual and DFT computed band gaps")
38-
print("My new project has name {} and id {}".format(
39-
band_gaps_project.name, band_gaps_project.uid))
38+
print("My new team has name {} and id {}".format(
39+
band_gaps_team.name, band_gaps_team.uid))
4040
4141
Strehlow_Cook_description = "Band gaps for elemental and binary " \
4242
"semiconductors with phase and temperature of measurement. DOI 10.1063/1.3253115"
4343
Strehlow_Cook_dataset = Dataset(name="Strehlow and Cook",
4444
summary="Strehlow and Cook band gaps", description=Strehlow_Cook_description)
45-
Strehlow_Cook_dataset = band_gaps_project.datasets.register(Strehlow_Cook_dataset)
45+
Strehlow_Cook_dataset = band_gaps_team.datasets.register(Strehlow_Cook_dataset)
4646
47-
Find an existing Project and Dataset
47+
Find an existing Team and Dataset
4848
------------------------------------
4949

5050
Often you will work with existing resources.
51-
The code below retrieves a Project with the name "Copper oxides project" and a datset with a known unique id that is stored as ``dataset_A_uid``.
51+
The code below retrieves a Team with the name "Copper oxides team" and a dataset with a known unique id that is stored as ``dataset_A_uid``.
5252
For more information on retrieving resources, see :ref:`Reading Resources <functionality_reading_label>`.
5353

5454
.. code-block:: python
5555
56-
project_name = "Copper oxides project"
57-
all_projects = citrine.projects.list()
58-
copper_oxides_project = next((project for project in all_projects
59-
if project.name == project_name), None)
60-
assert copper_oxides_project is not None
61-
dataset_A = copper_oxides_project.datasets.get(uid=dataset_A_uid)
56+
team_name = "Copper oxides team"
57+
all_teams = citrine.teams.list()
58+
copper_oxides_team = next((team for team in all_teams
59+
if team.name == team_name), None)
60+
assert copper_oxides_team is not None
61+
dataset_A = copper_oxides_team.datasets.get(uid=dataset_A_uid)
6262
6363
Find a template
6464
---------------
@@ -69,7 +69,7 @@ The example below searches for a process template with the tag "Oven_17" and ass
6969

7070
.. code-block:: python
7171
72-
firing_templates = list(band_gaps_project.process_templates.list_by_tag(tag="Oven_17"))
72+
firing_templates = list(band_gaps_team.process_templates.list_by_tag(tag="Oven_17"))
7373
assert len(firing_templates) == 1
7474
firing_template_17 = firing_templates[0]
7575
