
Generating a Knowledgebase #645

Open
joshmoore opened this issue Feb 5, 2025 · 7 comments

@joshmoore
Member

At the AHM meeting, a KB was discussed that would guide users to the basic concepts they need to get started with RDM. (What is OME-Zarr? How do I get started with OMERO? etc.)

Would it be possible to generate an initial outline for a KB using the material in search insight and/or the search engine?

@haesleinhuepf
Member

Yes, a starting point might be resources/nfdi4bioimage.yml. We should summarize it into a 3-level hierarchy of important topics in Markdown format. For each topic, we select one relevant URL from the yml file.
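
For illustration, a minimal sketch of how such a summary could be generated with Python, assuming each entry in the yml file carries `name`, `tags`, and `url` fields (these field names are assumptions about the file's schema, not confirmed here):

```python
# Sketch: group entries from resources/nfdi4bioimage.yml by tag and emit a
# Markdown outline with one representative URL per topic.
# Assumes entries carry "name", "tags", and "url" keys; adjust to the real schema.
from collections import defaultdict

import yaml  # pip install pyyaml

with open("resources/nfdi4bioimage.yml") as f:
    data = yaml.safe_load(f)

# The file may be a plain list or a mapping with a top-level "resources" key.
entries = data.get("resources", data) if isinstance(data, dict) else data

by_tag = defaultdict(list)
for entry in entries:
    for tag in entry.get("tags", ["untagged"]):
        by_tag[tag].append(entry)

print("# NFDI4Bioimage Knowledge Base Outline\n")
for tag in sorted(by_tag):
    print(f"## {tag}\n")
    representative = by_tag[tag][0]
    url = representative.get("url")
    if isinstance(url, list):  # some entries may list several URLs
        url = url[0]
    print(f"* [{representative.get('name', 'untitled')}]({url})\n")
```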

git-bob ask gemini-2.0-pro-exp-02-05 to answer this

Contributor

github-actions bot commented Feb 5, 2025

This message was generated by git-bob (version: 0.17.0, model: gemini-2.0-pro-exp-02-05, log), an experimental AI-based assistant. It can make mistakes and has limitations. Check its messages carefully.

Okay, here's a response to the GitHub issue, focusing on generating the requested knowledge base outline:

As suggested by @haesleinhuepf, I will process resources/nfdi4bioimage.yml to generate a 3-level Markdown hierarchy. I will select representative URLs for each topic.

# NFDI4Bioimage Knowledge Base Outline

## Level 1: Core Concepts

### Research Data Management (RDM)

*   **Level 2: RDM Basics**
    *   *Level 3: Getting Started with RDM*
        *   URL: [Forschungsdaten.info](https://forschungsdaten.info/)
*   **Level 2: RDM Planning**
    *   *Level 3: Data Management Plans*
        *   URL: [Euro-BioImaging's Template for Research Data Management Plans](https://zenodo.org/records/11473803)
*   **Level 2: FAIR Principles**
    *   *Level 3: Introduction to FAIR*
        *   URL: [FAIR Principles](https://www.go-fair.org/fair-principles/)
*   **Level 2: Metadata**
    *   *Level 3: Metadata Standards*
        *   URL: [REMBI: Recommended Metadata for Biological Images—enabling reuse of microscopy data in biology](https://www.nature.com/articles/s41592-021-01166-8)
*   **Level 2: Sharing and Licensing**
    *   *Level 3: Sharing Research Data*
        *   URL: [Sharing research data with Zenodo](https://focalplane.biologists.com/2023/02/15/sharing-research-data-with-zenodo/)

### Bioimage Analysis

*   **Level 2: Bioimage Analysis Basics**
    *   *Level 3: Introduction to Bioimage Analysis*
        *   URL: [What is Bioimage Analysis? An Introduction](https://www.dropbox.com/s/5abw3cvxrhpobg4/20220923_DefragmentationTS.pdf?dl=0)
*   **Level 2: Common Tools**
    *   *Level 3: ImageJ/Fiji*
        *   URL: [Introduction to Image Analysis with Fiji](https://github.com/mpicbg-scicomp/CourseIntroToIA)
    *   *Level 3: Python*
        *   URL: [Bio-image Analysis with Python](https://github.com/BiAPoL/Bio-image_Analysis_with_Python)
    *   *Level 3: Napari*
        *   URL: [Bioimage analysis with Napari](https://www.fabriziomusacchio.com/teaching/teaching_bioimage_analysis/)
    *   *Level 3: CellProfiler*
        *   URL: [CellProfiler tutorials](https://github.com/CellProfiler/tutorials)
*   **Level 2: Specific Tasks**
    *   *Level 3: Segmentation*
        *   URL: [Quality assurance of segmentation results](https://focalplane.biologists.com/2023/04/13/quality-assurance-of-segmentation-results/)
    *   *Level 3: Tracking*
        *   URL: [Tracking in napari](https://focalplane.biologists.com/2023/06/01/tracking-in-napari/)
*   **Level 2: Artificial Intelligence**
    *   *Level 3: Deep Learning Basics*
        *   URL: [Deep Learning Based Segmentation For Biologists](https://github.com/tpecot/DeepLearningBasedSegmentationForBiologists/)
    *   *Level 3: Prompt Engineering*
        *   URL: [Prompt Engineering in Bio-image Analysis](https://focalplane.biologists.com/2024/07/18/prompt-engineering-in-bio-image-analysis/)

### Data and Infrastructure

*   **Level 2: Data Repositories**
    *   *Level 3: BioImage Archive*
        *   URL: [BioImage Archive](https://www.ebi.ac.uk/bioimage-archive/)
    *   *Level 3: Image Data Resource (IDR)*
        *   URL: [Image Data Resource (IDR)](https://idr.openmicroscopy.org/)
*   **Level 2: Data Management Platforms**
    *   *Level 3: OMERO*
        *   URL: [OME Documentation](https://www.openmicroscopy.org/docs/)
*   **Level 2: Workflow Engines**
    *   *Level 3: Galaxy*
        *   URL: [Galaxy Training Material](https://github.com/galaxyproject/training-material)
    *   *Level 3: Nextflow*
        *   URL: [Nextflow documentation](https://www.nextflow.io/)

@haesleinhuepf
Member

We need this in more detail (more entries in levels 2 and 3). Also, answer in Markdown without fences. Example:

Research Data Management (RDM)

RDM Basics

Getting Started with RDM

git-bob ask gemini-2.0-pro-exp-02-05 to answer this

Contributor

github-actions bot commented Feb 5, 2025

This message was generated by git-bob (version: 0.17.0, model: gemini-2.0-pro-exp-02-05, log), an experimental AI-based assistant. It can make mistakes and has limitations. Check its messages carefully.

Okay, I'll provide an updated version of the knowledge base outline, addressing the feedback from @haesleinhuepf. The goal is to expand the level 2 and 3 entries, drawing more comprehensively from the nfdi4bioimage.yml file, and format it as simple markdown.

Research Data Management (RDM)

RDM Basics

Getting Started with RDM
Data Management Plans
RDM Policies

FAIR Principles

Introduction to FAIR
FAIR in Bioimaging
Data Stewardship

Metadata

Metadata Standards
Metadata Annotation Tools

Sharing and Licensing

Sharing Research Data
Citing Research Data

Bioimage Analysis

Bioimage Analysis Basics

Introduction to Bioimage Analysis
Bioimage Analysis Workflows

Common Tools

ImageJ/Fiji
Python
Napari
CellProfiler
QuPath

Specific Tasks

Segmentation
Tracking
Registration

Artificial Intelligence

Deep Learning Basics
Prompt Engineering

Data and Infrastructure

Data Repositories

BioImage Archive
Image Data Resource (IDR)
Other Repositories

Data Management Platforms

OMERO

Workflow Engines

Galaxy
Nextflow

This expanded outline provides a more detailed starting point for the NFDI4Bioimage knowledge base, organized into three levels of topics drawn from the provided YAML file. It presents a structured view of essential concepts and tools for RDM and bioimage analysis.

@haesleinhuepf
Member

@joshmoore what do you think about the proposed outline? Feel free to ask git-bob for modifications; you should be able to instruct it directly.

@SaibotMagd

SaibotMagd commented Feb 6, 2025

I would like to add:

Metadata

Metadata Standards

Publishing research code FAIR - a roadmap
git-bob ask gemini-2.0-pro-exp-02-05 to answer this

@SaibotMagd

SaibotMagd commented Feb 6, 2025

> At the AHM meeting, a KB was discussed that would guide users to the basic concepts they need to get started with RDM. (What is OME-Zarr? How do I get started with OMERO? etc.)
>
> Would it be possible to generate an initial outline for a KB using the material in search insight and/or the search engine?

Since I'm working on an institute-wide onboarding process: a start guide should be structured more like a pipeline that you can follow step by step. It should be as simple as possible. You can build on your existing knowledge later to delve deeper into the individual parts.

Some central questions I always have in mind are: “Who should look for which source, for what reason, and when? What does this person want? How many resources are they willing to use?”
Possible answers:

Who?

  • Everyone, including staff workers.

For what reason?

  • People look for help when they need it, i.e. when they struggle; rarely beforehand.

What does this person want?

  • A very short answer to a somewhat precise question (even if they are unable to formulate the question the right way, or don't even know what they're looking for).

How many resources are they willing to use?

  • Usually no more than 1-3 people search for an answer.
  • It is rare that the answer to an urgent question is sought for more than 1-3 days. After that, the question is either given up (if that is possible), the easiest available workaround is used, or the problem is postponed until later.

That's why I designed a sample research process for us. The current design looks like this (very early version):

Enhanced Structure and Workflow for an Empirical Research Project

1. Application for Funding

2. Research of the Research Area

  • Description: Conduct a comprehensive literature review to understand the current state of research. Use Zotero to manage and organize your sources.
  • Tools:
  • Tutorials:
  • Workflow:
    1. Define keywords and search terms.
    2. Search for relevant articles and studies.
    3. Import the found sources into Zotero.
    4. Save the literature research in the /Research folder.

3. Development of a Hypothesis

  • Description: Formulate a clear and testable hypothesis based on your literature review. Document the hypothesis and the underlying assumptions.
  • Tools:
  • Tutorials:
  • Workflow:
    1. Analyze the collected sources.
    2. Identify research gaps.
    3. Formulate a hypothesis.
    4. Save the hypothesis in the /Hypothesis folder.

4. Operationalization of the Idea

  • Description: Document the operationalization of your hypothesis in an electronic lab notebook. Use Jupyter Notebooks to make the process transparent and traceable.
  • Tools:
  • Tutorials:
  • Workflow:
    1. Define the variables and measurement methods.
    2. Create a Jupyter Notebook for documentation.
    3. Describe the experimental setup and execution.
    4. Use Zotero for storing sources and notes.
    5. Use Markdown in LIN-nextcloud or locally with Obsidian for documentation.
    6. Use open documents in LIN-nextcloud or locally with LibreOffice.
    7. Use rspace ELN for daily documentation (includes Collabora).
    8. Use pyrats for planning and documenting animal husbandry, with integration into rspace ELN.
    9. Save the notebook in the /Operationalization folder.

5. Implementation and Documentation

  • Description: Implement the operationalization and document the entire process in an electronic lab notebook. Use Jupyter Notebooks for documentation.
  • Tools:
  • Tutorials:
  • Workflow:
    1. Conduct the experiments.
    2. Document each step in the Jupyter Notebook.
    3. Record all observations and results.
    4. Create project-specific documents in rspace ELN.
    5. Save the notebook in the /Implementation folder.

6. Data Analysis and Visualization

  • Description: Analyze the collected data and visualize the results using open-source tools such as Jupyter Hub, Python, R, or MATLAB (a minimal sketch follows this list).
  • Tools:
  • Tutorials:
  • Workflow:
    1. Clean and prepare the data.
    2. Choose appropriate analysis methods.
    3. Perform the data analysis.
    4. Visualize the results.
    5. Ensure that the programs record the work steps seamlessly.
    6. Use source code from MATLAB, Python, R, or any other language supported by JupyterHub.
    7. Use the Macro Recorder in Fiji/ImageJ for manual tasks.
    8. Document manual tasks with OBS Studio and later extract significant frames to save space.
    9. Document the program and version used for proprietary formats.
    10. Include a link to the program in the documentation.
    11. Convert raw files to non-proprietary formats if programs have access protection.
    12. Save the notebook in the /DataAnalysis folder.
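
A minimal sketch of what such a traceable analysis cell could look like in the /DataAnalysis notebook, assuming the measurements were exported to a hypothetical measurements.csv with `group` and `intensity` columns (the file name and column names are placeholders, not part of the workflow above):

```python
# Sketch of a reproducible analysis cell for the /DataAnalysis notebook.
# "measurements.csv", "group" and "intensity" are placeholder names.
import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_csv("measurements.csv")

# 1. Clean and prepare the data: drop incomplete rows.
df = df.dropna(subset=["group", "intensity"])

# 2./3. Analyze: summary statistics per experimental group.
summary = df.groupby("group")["intensity"].describe()
print(summary)

# 4. Visualize the results and save the figure next to the notebook.
df.boxplot(column="intensity", by="group")
plt.ylabel("intensity (a.u.)")
plt.savefig("intensity_by_group.png", dpi=150)
```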

7. Creation of the Paper

8. Compiling Data According to BIDS Standard

  • Description: Structure the collected data according to the BIDS (Brain Imaging Data Structure) standard to ensure reusability and comparability (a minimal skeleton sketch follows this list).
  • Tools:
  • Tutorials:
  • Workflow:
    1. Study the BIDS specification.
    2. Organize the data according to the BIDS standard.
    3. Check the structure for completeness.
    4. Save the dataset in the /BIDS folder.
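
A minimal sketch of step 2, assuming a single subject with one anatomical scan (the subject label, modality folder, and BIDS version below are illustrative choices, not prescribed by the workflow above):

```python
# Sketch: create a minimal BIDS-like skeleton (illustrative labels only).
import json
from pathlib import Path

root = Path("ResearchProject/BIDS/dataset")

# dataset_description.json is required at the top level of a BIDS dataset.
root.mkdir(parents=True, exist_ok=True)
(root / "dataset_description.json").write_text(json.dumps({
    "Name": "Example dataset",
    "BIDSVersion": "1.9.0",
}, indent=2))

# One anatomical scan for one subject, following the sub-<label>/<datatype>/ pattern.
anat = root / "sub-01" / "anat"
anat.mkdir(parents=True, exist_ok=True)
# The imaging file itself (e.g. sub-01_T1w.nii.gz) would be copied here.
```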

9. Writing to Journals

  • Description: Identify suitable scientific journals and write cover letters to submit your paper.
  • Tools:
  • Tutorials:
  • Workflow:
    1. Identify appropriate journals.
    2. Write a cover letter for each journal.
    3. Review the cover letters for completeness.
    4. Save the cover letters in the /Journals folder.

10. Uploading the Dataset to an Archive

  • Description: Upload the structured dataset to a data archive like Zenodo to ensure long-term availability (an API-based upload sketch follows this list).
  • Tools:
  • Tutorials:
  • Workflow:
    1. Create an account on Zenodo.
    2. Upload the dataset.
    3. Fill in the metadata.
    4. Save the dataset in the /Archive folder.
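
As an alternative to the web upload, Zenodo also exposes a REST deposit API; the following is a sketch of steps 2 and 3, with the access token and file path as placeholders:

```python
# Sketch: create a Zenodo deposition and upload one file via the REST API.
# ZENODO_TOKEN and the file path are placeholders; check the Zenodo docs for details.
import os
import requests

token = os.environ["ZENODO_TOKEN"]
params = {"access_token": token}

# 1. Create an empty deposition.
r = requests.post("https://zenodo.org/api/deposit/depositions", params=params, json={})
r.raise_for_status()
deposition = r.json()

# 2. Upload the dataset file into the deposition's file bucket.
bucket_url = deposition["links"]["bucket"]
with open("ResearchProject/Archive/Dataset_Zenodo.zip", "rb") as fp:
    requests.put(f"{bucket_url}/Dataset_Zenodo.zip", data=fp, params=params).raise_for_status()

# 3. Metadata (title, authors, license, ...) can then be set via a PUT on the
#    deposition URL before publishing through the web interface or the API.
```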

11. Uploading the Source Code to GitHub

  • Description: Upload the source code of your analysis and documentation to GitHub to ensure transparency and reusability.
  • Tools:
  • Tutorials:
  • Workflow:
    1. Create a repository on GitHub.
    2. Upload the source code.
    3. Document the code and its usage.
    4. Save the source code in the /GitHub folder.

12. Uploading the Paper to a Pre-Print Server

  • Description: Upload the paper to a pre-print server like arXiv to receive early feedback and increase visibility.
  • Tools:
  • Tutorials:
  • Workflow:
    1. Create an account on arXiv.
    2. Upload the paper.
    3. Fill in the metadata.
    4. Save the paper in the /PrePrint folder.

13. Linking the Dataset with the Source Code and the Paper

  • Description: Link the dataset, the source code, and the paper together to ensure comprehensive documentation and traceability.
  • Tools:
  • Workflow:
    1. Create links between the platforms.
    2. Document the links in the paper.
    3. Verify the links for functionality.


Recommended Folder Structure on the Hard Drive

```
/ResearchProject
    /Application
        - Application_DFG.pdf
        - Application_EU.pdf
    /Research
        - LiteratureResearch.zotero
    /Hypothesis
        - Hypothesis.pdf
    /Operationalization
        - Operationalization.ipynb
    /Implementation
        - Implementation.ipynb
    /DataAnalysis
        - DataAnalysis.ipynb
    /Paper
        - Paper.tex
        - Paper.md
        - Paper.odt
    /BIDS
        - Dataset_BIDS.zip
    /Journals
        - JournalSubmission.pdf
    /Archive
        - Dataset_Zenodo.zip
    /GitHub
        - SourceCode.zip
    /PrePrint
        - Paper_arXiv.pdf
```
