CitrineInformatics · jspeerless · Jul 19, 2024 · Jul 18, 2024
@@ -0,0 +1,144 @@
+=============================
+Migrating to Use Data Manager
+=============================
+
+Summary
+=======
+
+This guide gives Citrine Python users the background and instructions needed to migrate code so that it
+takes full advantage of Data Manager features and
+is ready for the removal of endpoints planned for Citrine Python v4.0.
+
+The key change is that :py:class:`Datasets <citrine.resources.dataset.Dataset>` are now assets
+of :py:class:`Teams <citrine.resources.team.Team>`,
+rather than :py:class:`Projects <citrine.resources.project.Project>`.
+Most code changes involve migrating calls that access Datasets and collections of data objects from a Project-based method to a Team- or Dataset-based method.
+
+If you require any additional assistance migrating your Citrine Python code,
+do not hesitate to reach out to your Citrine customer support team.
+
+What’s new?
+===========
+
+Once Data Manager has been enabled on your deployment of the Citrine Platform,
+the primary change that will affect Citrine Python code is that Datasets,
+formerly contained within a Project, are now assets of a Team.
+In other words, Teams contain both Datasets and Projects.
+
+Projects still contain assets such as GemTables, Predictors, DesignSpaces, etc., but Datasets and their contents are now at the level of a Team.
+Data within a Dataset (in the form of GEMD Objects, Attributes, and Templates, as well as files) can be used within a Project only by creating a GemTable.
+
+After Data Manager is activated, any new Datasets created,
+either via Citrine Python or the Citrine Platform web UI, will be created at a Team level,
+and will not be accessible via the typical `project.<Collection>.{method}` endpoints\*.
+New collections, at both the Team and Dataset level, will be available in v3.4 of Citrine Python.
+
+\*Newly registered Datasets can be made accessible via Project-based methods by pulling them into a Project with `project.pull_in_resource(resource=dataset)`.
+However, this is not recommended, because the endpoints that list data by Project and the “pull_in” endpoint for Datasets will be removed in 4.0.
+
+How does this change my code?
+=============================
+
+The change in behavior is mostly confined to two sets of operations on Datasets and their constituent GEMD data objects:
+Sharing and Project-based Collections.
+
+Sharing
+-------
+
+**Within a Team**
+
+Previously, sharing a Dataset from one Project to another was a 2-step process: first publishing the Dataset to a Team, then pulling the Dataset into the new project.
+Now that all Datasets are assets of teams, sharing within a team is unnecessary.
+The `publish`, `un_publish`, and `pull_in_resource` endpoints, when applied to Datasets, will all be deprecated.
+To be precise, the following calls will emit a deprecation warning in Citrine Python versions 3.4 and above, and will be removed in version 4.0:
+
+.. code-block:: python
+
+    # Publishing a Dataset to a Team will do nothing once Data Manager is activated
+    project.publish(resource=dataset)
+
+    # Un-publishing a Dataset will similarly be a no-op with Data Manager activated
+    project.un_publish(resource=dataset)
+
+    # Pulling a Dataset into a Project can still be done with Data Manager activated, but is not
+    # recommended and will be deprecated
+    project.pull_in_resource(resource=dataset)
+
+**Between Teams**
+
+Previously, sharing a Dataset from one Project to another, where those Projects are in different Teams, was a 3-step process:
+publishing to the Team, sharing from one Team to another, then pulling into a Project.
+With Data Manager, only the sharing action is needed.
+
+Previous code for sharing My Dataset from Project A in Team A to eventually use in a Training Set
+in Project B in Team B:
+
+.. code-block:: python
+
+    project_a.publish(resource=my_dataset)
+    team_a.share(
+        resource=my_dataset,
+        target_team_id=team_b.uid,
+    )
+    project_b.pull_in_resource(resource=my_dataset)
+
+Is now:
+
+.. code-block:: python
+
+    team_a.share(
+        resource=my_dataset,
+        target_team_id=team_b.uid,
+    )
+
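As a minimal sketch of the single-step pattern above, the helper below shares one Dataset with several target Teams. The function name ``share_dataset_with_teams`` and the choice to skip the source Team are illustrative assumptions, not part of Citrine Python; the only library call it relies on is ``team.share(resource=..., target_team_id=...)`` as shown above.

```python
def share_dataset_with_teams(source_team, dataset, target_teams):
    """Share ``dataset`` from ``source_team`` with every other team in ``target_teams``.

    Hypothetical helper: it only assumes that each team exposes ``uid`` and that
    ``source_team.share(resource=..., target_team_id=...)`` behaves as in the
    example above. Returns the uids of the teams the dataset was shared with.
    """
    shared_with = []
    for target in target_teams:
        # Sharing within a Team is unnecessary, so skip the source Team itself.
        if target.uid == source_team.uid:
            continue
        source_team.share(resource=dataset, target_team_id=target.uid)
        shared_with.append(target.uid)
    return shared_with
```

For example, ``share_dataset_with_teams(team_a, my_dataset, [team_b])`` would issue the same single ``share`` call as the snippet above.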
+Project-based Collections
+-------------------------
+
+As Datasets are now assets of Teams, typical ways to `list()`, `get()`, or otherwise manipulate Datasets or data objects within a Project will undergo a deprecation cycle.
+As of v3.4, these endpoints will still work as usual with a deprecation warning, but will be removed in v4.0.
+It is therefore recommended to migrate your code from all project-based listing endpoints as soon as possible
+to adhere to supported patterns and avoid any costly errors.
+
+The following endpoints will emit a deprecation warning in Citrine Python versions 3.4 and above, and will be removed in version 4.0.
+Moreover, they will not return Datasets, or the contents of Datasets, that are registered after Data Manager has been activated:
+
+.. code-block:: python
+
+    # Listing Datasets or their Contents (such as MaterialSpecs or ProcessTemplates) from a Project
+    project.datasets.list()
+    project.gemd.list()
+    project.process_runs.list()
+    ...
+
+    # Getting Datasets or GEMD Assets via their UID and a Project
+    project.datasets.get(uid)
+    project.measurement_specs.get(uid)
+    ...
+
+    # Doing any operations (updating, deleting, etc.) to Datasets via a Project collection
+    project.datasets.update(dataset)
+
+The following new methods introduced in Citrine Python v3.4 are preferred:
+
+.. code-block:: python
+
+    # Listing Datasets or their Contents
+    team.datasets.list()
+    team.gemd.list()
+    dataset.property_templates.list()
+    ...
+
+    # Getting Datasets or GEMD Assets via their UID
+    team.datasets.get(uid)
+    team.ingredient_runs.get(uid)
+    dataset.process_specs.get(uid)
+    ...
+
+    # Doing any operations (updating, deleting, dumping, etc.) to Datasets or GEMD Assets
+    team.datasets.delete(uid)
+    dataset.condition_templates.update(object)
+    ...
+
+Note again that even though the Project-based endpoints remain operational in v3.4,
+any new Datasets are registered at the Team level and are thus inaccessible via the Project-based collections
+unless “pulled in” to a specific Project in that Team.
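
One way to locate call sites that still need migrating is to collect deprecation warnings while exercising existing code. The sketch below uses only the Python standard library and assumes that the deprecated Project-based calls emit warnings of category ``DeprecationWarning`` through the ``warnings`` module; that is an assumption about the library's behavior, not something this guide confirms. The name ``collect_deprecations`` is hypothetical.

```python
import warnings
from contextlib import contextmanager


@contextmanager
def collect_deprecations():
    """Collect every DeprecationWarning emitted inside the ``with`` block."""
    found = []
    with warnings.catch_warnings(record=True) as caught:
        # Record each occurrence, even repeats from the same line.
        warnings.simplefilter("always", DeprecationWarning)
        try:
            yield found
        finally:
            found.extend(
                w for w in caught if issubclass(w.category, DeprecationWarning)
            )


# Usage sketch (``project`` is assumed to be an existing Project object):
#
#     with collect_deprecations() as found:
#         list(project.datasets.list())
#     for w in found:
#         print(w.filename, w.lineno, w.message)
```

Each recorded entry carries the file name and line number of the warning, which can help map a warning back to the Project-based call that should be replaced with a Team- or Dataset-based one.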
@@ -6,5 +6,5 @@ FAQ
     :maxdepth: 2
 
     prohibited_data_patterns
-    team_management_migration
     v3_migration
+    data_manager_migration
@@ -49,7 +49,8 @@ Equivalent behavior is available through the type-agnostic ``gemd`` collection a
     dataset.register(ProcessSpec(...))
 
 Note that registration must be performed within the scope of a dataset: the dataset into which the objects are being written.
-The data model object collections that are defined with the project scope (such as `project.process_specs`) are read-only and will throw an error if their register method is called.
+The data model object collections that are defined with the team scope (such as `team.process_specs`) are read-only.
+Attempts to register, update, etc. via those collections will throw an error.
 
 If you are registering several objects at the same time, you can use the ``register_all`` method that is available via the same objects:
 
@@ -73,20 +74,19 @@ For example:
 
 .. code-block:: python
 
-    project.process_templates.get(LinkByUID(scope="standard-templates", id="milling"))
+    team.process_templates.get(LinkByUID(scope="standard-templates", id="milling"))
 
 If you know the CitrineID, you do not need to specify a scope:
 
 .. code-block:: python
 
-    project.process_templates.get(CitrineID)
-
+    team.process_templates.get(CitrineID)
 
 If you don't know any of the data model object's unique identifiers, then you can list the data model objects and find your object in that list:
 
 .. code-block:: python
 
-    project.process_templates.list()
+    team.process_templates.list()
 
 These results can be further constrained by dataset:
 

@@ -83,11 +83,11 @@ Get
 ^^^
 
 Get retrieves a specific resource with a known unique identifier string.
-If the project ``ceramic_resistors_project`` has a dataset with an id that you have saved as ``special_dataset_id``, then you could retrieve it with:
+If the team ``ceramic_resistors_team`` has a dataset with an id that you have saved as ``special_dataset_id``, then you could retrieve it with:
 
 .. code-block:: python
 
-    ceramic_resistors_project.datasets.get(special_dataset_id)
+    ceramic_resistors_team.datasets.get(special_dataset_id)
 
 List
 ^^^^

@@ -21,44 +21,44 @@ Assuming that your Citrine deployment is ``https://matsci.citrine-platform.com``
     citrine = Citrine(API_KEY, API_SCHEME, API_HOST, API_PORT)
 
 
-Create a Project and Dataset
+Create a Team and Dataset
 ----------------------------
 
-One of your first actions might be to create a new Project and a Dataset, which you and your collaborators can populate.
-The code below creates a Project and one Dataset associated with it.
-It also inspects the newly registered Project to get its unique id.
+One of your first actions might be to create a new Team and a Dataset, which you and your collaborators can populate.
+The code below creates a Team and one Dataset associated with it.
+It also inspects the newly registered Team to get its unique id.
 Note that all resources are given descriptive names and summaries.
 
 .. code-block:: python
 
-    from citrine.resources.project import Project
+    from citrine.resources.team import Team
     from citrine.resources.dataset import Dataset
-    band_gaps_project = citrine.projects.register(name="Band gaps",
+    band_gaps_team = citrine.teams.register(name="Band gaps",
         description="Actual and DFT computed band gaps")
-    print("My new project has name {} and id {}".format(
-        band_gaps_project.name, band_gaps_project.uid))
+    print("My new team has name {} and id {}".format(
+        band_gaps_team.name, band_gaps_team.uid))
 
     Strehlow_Cook_description = "Band gaps for elemental and binary " \
         "semiconductors with phase and temperature of measurement. DOI 10.1063/1.3253115"
     Strehlow_Cook_dataset = Dataset(name="Strehlow and Cook",
         summary="Strehlow and Cook band gaps", description=Strehlow_Cook_description)
-    Strehlow_Cook_dataset = band_gaps_project.datasets.register(Strehlow_Cook_dataset)
+    Strehlow_Cook_dataset = band_gaps_team.datasets.register(Strehlow_Cook_dataset)
 
-Find an existing Project and Dataset
+Find an existing Team and Dataset
 ------------------------------------
 
 Often you will work with existing resources.
-The code below retrieves a Project with the name "Copper oxides project" and a datset with a known unique id that is stored as ``dataset_A_uid``.
+The code below retrieves a Team with the name "Copper oxides team" and a dataset with a known unique id that is stored as ``dataset_A_uid``.
 For more information on retrieving resources, see :ref:`Reading Resources <functionality_reading_label>`.
 
 .. code-block:: python
 
-    project_name = "Copper oxides project"
-    all_projects = citrine.projects.list()
-    copper_oxides_project = next((project for project in all_projects
-        if project.name == project_name), None)
-    assert copper_oxides_project is not None
-    dataset_A = copper_oxides_project.datasets.get(uid=dataset_A_uid)
+    team_name = "Copper oxides team"
+    all_teams = citrine.teams.list()
+    copper_oxides_team = next((team for team in all_teams
+        if team.name == team_name), None)
+    assert copper_oxides_team is not None
+    dataset_A = copper_oxides_team.datasets.get(uid=dataset_A_uid)
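
The lookup-by-name pattern above can be wrapped in a small reusable function. This is a sketch, not part of Citrine Python: ``find_by_name`` is a hypothetical helper that only assumes each listed resource has a ``name`` attribute. Raising on duplicate names is a design choice made here, since nothing above states that names are unique.

```python
def find_by_name(resources, name):
    """Return the single resource in ``resources`` whose ``name`` equals ``name``.

    Hypothetical helper: raises LookupError if no resource, or more than one
    resource, carries the requested name.
    """
    matches = [resource for resource in resources if resource.name == name]
    if not matches:
        raise LookupError(f"No resource named {name!r} was found")
    if len(matches) > 1:
        raise LookupError(f"{len(matches)} resources are named {name!r}")
    return matches[0]
```

With it, the Team lookup above could be written as ``find_by_name(citrine.teams.list(), "Copper oxides team")``, which fails with a descriptive error instead of an ``assert``.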
 
 Find a template
 ---------------
@@ -69,7 +69,7 @@ The example below searches for a process template with the tag "Oven_17" and ass
 
 .. code-block:: python
 
-    firing_templates = list(band_gaps_project.process_templates.list_by_tag(tag="Oven_17"))
+    firing_templates = list(band_gaps_team.process_templates.list_by_tag(tag="Oven_17"))
     assert len(firing_templates) == 1
     firing_template_17 = firing_templates[0]