Merge pull request #113 from datalad-handbook/yoda
YODA principles
Showing 12 changed files with 13,539 additions and 2 deletions.
@@ -0,0 +1,23 @@
.. _intromidterm:

A Data Analysis Project with DataLad
------------------------------------


Time flies and the semester rapidly approaches the midterms.
In DataLad-101, students are not given an exam -- instead, they are
asked to complete and submit a data analysis project with DataLad.

The lecturer hands out the requirements: The project

- needs to be prepared in the form of a DataLad dataset
- needs to contain a data analysis performed with Python tools
- should incorporate DataLad whenever possible (data retrieval, publication,
  script execution, general version control) and
- needs to comply with the YODA principles

Luckily, the midterms are still a couple of weeks away, and many of the
requirements of the project will be taught in the upcoming sessions.
Therefore, there's little you can do to prepare for the midterm
other than to be extra attentive in the next lectures on the YODA
principles and DataLad's Python API.
@@ -0,0 +1,53 @@
.. _summary_yoda:

Summary: YODA principles
------------------------

The YODA principles are a small set of guidelines that can make a huge
difference for the reproducibility, comprehensibility, and transparency
of a data analysis project.

These standards are not complex -- quite the opposite, they are very
intuitive. They structure essential components of a data analysis project --
data, code, computational environments, and, lastly, the results --
in a modular and practical way, and use basic principles and commands
of DataLad you are already familiar with.

There are many advantages to this organization of contents.

- Having input data as independent dataset(s) that are not influenced (only
  consumed) by an analysis allows for a modular reuse of pure data datasets,
  and does not conflate the data of an analysis with the results or the code.

- Keeping code within an independent, version-controlled directory, but as a part
  of the analysis dataset, makes sharing code easy and transparent, and helps
  to keep directories neat and organized. Moreover,
  with the data as subdatasets, data and code can be automatically shared together.

- Including the computational environment in an analysis dataset encapsulates
  software and software versions, and thus prevents re-computation failures
  (or sudden differences in the results) once
  software is updated, as well as software conflicts arising on machines
  other than the one the analysis was originally conducted on.

- Having all of these components as part of a DataLad dataset allows you to
  version control all pieces within the analysis regardless of their size, and
  generates provenance for everything, especially if you make use of the tools
  that DataLad provides.

- The yoda procedure is a good starting point on which to build your next
  data analysis project.

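The points above can be sketched in a few lines of Python. This is only an
illustration, not the handbook's own setup: it assumes the ``datalad``
command-line tool is installed and recent enough to provide ``datalad clone``,
and the dataset name ``midterm_project``, the URL, and both function names are
hypothetical. The script assembles two calls, one that creates an analysis
dataset with the yoda procedure and one that registers input data as an
independent subdataset, and by default only prints them.

```python
import shlex
import subprocess


def yoda_setup_commands(analysis_name, input_url, input_dir="input"):
    """Return (working directory, command) pairs for a YODA-style setup.

    All arguments are placeholders chosen for illustration.
    """
    return [
        # Create the analysis dataset and apply the yoda procedure to it.
        (None, ["datalad", "create", "-c", "yoda", analysis_name]),
        # Inside the new dataset, register the input data as a subdataset
        # so that it stays an independent, reusable component.
        (analysis_name, ["datalad", "clone", "-d", ".", input_url, input_dir]),
    ]


def run_commands(commands, dry_run=True):
    """Print each command; execute it only if dry_run is False."""
    for cwd, cmd in commands:
        print(f"[{cwd or '.'}] {shlex.join(cmd)}")
        if not dry_run:
            # check=True raises CalledProcessError if DataLad reports a failure.
            subprocess.run(cmd, cwd=cwd, check=True)


if __name__ == "__main__":
    # Hypothetical names; replace them with your own dataset and data source.
    plan = yoda_setup_commands(
        "midterm_project", "https://example.com/input-data.git"
    )
    run_commands(plan, dry_run=True)
```

Calling ``run_commands(plan, dry_run=False)`` would actually execute the two
commands. Keeping the dry run as the default makes it easy to inspect what
would happen before anything is created on disk.
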
Now what can I do with it?
^^^^^^^^^^^^^^^^^^^^^^^^^^

Using the tools that DataLad provides, you are able to make the most of
your data analysis project. The YODA principles are a guide to accompany
you on your path to reproducibility.

What should have become clear in this section is that you are already
equipped with enough DataLad tools and knowledge that complying with these
standards will feel completely natural and effortless in your next analysis
project.
The next section will add to your existing skills by demonstrating how to
also use DataLad within Python scripts.