YODA principles #113
Merged
Changes from 22 commits
11fa1a4 add placeholder for YODA principles (adswa)
a1b4eb4 add an intro to the book part (adswa)
e65b97f rename yoda file (adswa)
149a911 update contents (adswa)
fa90777 YODA and call the DataLad team geeks (adswa)
7a77819 summarize YODA practices (adswa)
7cfa376 add yoda (adswa)
567d965 include yoda (adswa)
16b022b add svg for yoda and modular datasets (adswa)
13b7423 start restructuring (adswa)
38bd4db add ref to siblings (adswa)
dd17f9f svg instead of png (adswa)
06c94dd add full YODA wf image (adswa)
2b9af3b WIP on P1 & P2 (adswa)
86b1a81 add data_origin.svg for P2 (adswa)
bcb6d78 Merge branch 'master' of github.com:datalad-handbook/book into yoda (adswa)
f775916 Merge branch 'master' of github.com:datalad-handbook/book into yoda (adswa)
2aa1883 start with a bad example (adswa)
e87d789 finalize a first draft (adswa)
8a6603c typos, formatting, tweaks (adswa)
ba07a6d move summary into dedicated page (adswa)
103af4e add yoda procedure (adswa)
69694d2 WIP: attempting an outlook in the introduction (adswa)
aefd539 first round of comments (adswa)
c7fd16d P1: link figures a bit better (adswa)
f52b4a9 finish a high-level overview of what will be learned with the handbook (adswa)
a47fe56 upper case heading (adswa)
8ef19b9 link FAIR website (adswa)
6d93554 explicitly state dataset nesting (adswa)
a58c663 Merge branch 'master' of github.com:datalad-handbook/book into yoda (adswa)
98798df add missing links to yoda elsewhere (adswa)
@@ -0,0 +1,23 @@
.. _intromidterm:

A Data Analysis Project with DataLad
------------------------------------


Time flies and the semester rapidly approaches the midterms.
In DataLad-101, students are not given an exam -- instead, they are
asked to complete and submit a data analysis project with DataLad.

The lecturer hands out the requirements: The project needs to

- be prepared in the form of a DataLad dataset,
- contain a data analysis performed with Python tools,
- incorporate DataLad whenever possible (data retrieval, publication,
  script execution, general version control), and
- comply with the YODA principles.

Luckily, the midterms are only in a couple of weeks, and many of the
requirements of the project will be taught in the upcoming sessions.
Therefore, there's little you can do to prepare for the midterm
other than to be extra attentive in the next lectures on the YODA
principles and DataLad's Python API.
@@ -0,0 +1,53 @@
.. _summary_yoda:

Summary: YODA principles
------------------------

The YODA principles are a small set of guidelines that can make a huge
difference towards reproducibility, comprehensibility, and transparency
in a data analysis project.

These standards are not complex -- quite the opposite, they are very
intuitive. They structure essential components of a data analysis project --
data, code, computational environments, and lastly also the results --
in a modular and practical way, and use basic principles and commands
of DataLad you are already familiar with.

There are many advantages to this organization of contents.

- Having input data as independent dataset(s) that are not influenced (only
  consumed) by an analysis allows for a modular reuse of pure data datasets,
  and does not conflate the data of an analysis with the results or the code.

- Keeping code within an independent, version-controlled directory, but as a part
  of the analysis dataset, makes sharing code easy and transparent, and helps
  to keep directories neat and organized. Moreover,
  with the data as subdatasets, data and code can be automatically shared together.

- Including the computational environment in an analysis dataset encapsulates
  software and software versions. This prevents re-computation failures
  (or sudden differences in the results) once software is updated, as well as
  software conflicts arising on machines other than the one the analysis
  was originally conducted on.

- Having all of these components as part of a DataLad dataset allows version
  controlling all pieces within the analysis regardless of their size, and
  generates provenance for everything, especially if you make use of the tools
  that DataLad provides.

- The yoda procedure is a good starting point to build your next data analysis
  project upon.

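To make the modular layout described in the list above more concrete, here is a
minimal sketch in plain Python that does not use DataLad at all. It only creates
an empty directory skeleton. The names ``myanalysis``, ``inputs/raw_data``, and
``envs``, as well as the helper ``sketch_yoda_layout``, are hypothetical choices
made for illustration. In a real project, the yoda procedure (for instance via
``datalad create -c yoda``) would set up the dataset, and input data would be
linked as subdatasets rather than created as plain directories.

```python
import tempfile
from pathlib import Path


def sketch_yoda_layout(root: Path) -> list:
    """Create an empty skeleton that roughly resembles a YODA-style analysis.

    All directory names are illustrative assumptions, not DataLad requirements.
    """
    # code/            analysis scripts, version-controlled with the dataset
    # inputs/raw_data  placeholder for input data (a subdataset in practice)
    # envs/            placeholder for a captured computational environment
    for sub in ("code", "inputs/raw_data", "envs"):
        (root / sub).mkdir(parents=True, exist_ok=True)

    # Top-level files that describe the project and its history.
    for name in ("README.md", "CHANGELOG.md"):
        (root / name).touch()

    # Return every created entry, relative to the project root.
    return sorted(p.relative_to(root).as_posix() for p in root.rglob("*"))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        for entry in sketch_yoda_layout(Path(tmp) / "myanalysis"):
            print(entry)
```

Running the script prints the created entries one per line. Results of the
analysis would live in the top-level dataset next to ``code``, separate from
everything under ``inputs``.
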
Now what can I do with it?
^^^^^^^^^^^^^^^^^^^^^^^^^^

Using the tools that DataLad provides, you are able to make the most out of
your data analysis project. The YODA principles are a guide to accompany
you on your path to reproducibility.

What should have become clear in this section is that you are already
equipped with enough DataLad tools and knowledge that complying with these
standards will feel completely natural and effortless in your next analysis
project.
The next section will add to your existing skills by demonstrating how to
use DataLad also within Python scripts.
haven't read in full yet, but this wants to tell me that something in here is Python-specific, and I can ignore it for other projects
That's true, I get your point. I will try to reduce that feeling, if we keep this.
Can I get your take on the general idea? It was to put the YODA principles into the context of the narrative, and this I thought was easiest possible in the context of a data analysis. My initial idea was:
all wrapped up in a "midterm project" context in the educational narrative.
However, thinking about this now, it also feels like a lot in a single chapter (Yoda, Python API, datalad publish). The alternative would be to have individual parts in their own chapters or as parts of other chapters, and then combine/apply them in a single section.
I'm undecided yet, so if anyone has preferences...