Home

noWorkflow

What is noWorkflow?

noWorkflow is a tool that automatically traces the provenance of a Python script without requiring changes to the original code. It lets users build and analyze a detailed history of how data was produced and transformed, which supports transparency and reliability in scientific experiments and data processes. Written in Python, noWorkflow collects provenance using software engineering techniques such as abstract syntax tree (AST) analysis, reflection, and profiling, and does not require a version control system or any other external environment.
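To make the techniques named above more concrete, here is a minimal sketch of how AST analysis, reflection, and profiling can be combined to observe an unmodified script using only the Python standard library. It is not noWorkflow's implementation: the names `SCRIPT`, `collect_definitions`, and `trace_calls` are hypothetical and exist only for this illustration.

```python
import ast
import sys

# A small, unmodified "experiment" script, kept as a string for the demo.
SCRIPT = '''
def scale(values, factor):
    return [v * factor for v in values]

def total(values):
    return sum(values)

result = total(scale([1, 2, 3], 10))
'''


def collect_definitions(source):
    """Static step (AST analysis): list defined functions and their parameters."""
    tree = ast.parse(source)
    return {
        node.name: [arg.arg for arg in node.args.args]
        for node in ast.walk(tree)
        if isinstance(node, ast.FunctionDef)
    }


def trace_calls(source, filename="<script>"):
    """Dynamic step (profiling): run the script and record each function call."""
    calls = []

    def profiler(frame, event, arg):
        code = frame.f_code
        # Keep only Python-level calls made by the script itself and skip
        # synthetic frames such as "<module>" or "<listcomp>".
        if (event == "call" and code.co_filename == filename
                and not code.co_name.startswith("<")):
            # Reflection: read parameter names and bound values from the frame.
            params = code.co_varnames[:code.co_argcount]
            calls.append((code.co_name, {p: frame.f_locals[p] for p in params}))

    namespace = {}
    compiled = compile(source, filename, "exec")
    sys.setprofile(profiler)
    try:
        exec(compiled, namespace)
    finally:
        sys.setprofile(None)  # always remove the hook, even on errors
    return calls, namespace


if __name__ == "__main__":
    print(collect_definitions(SCRIPT))
    recorded, _ = trace_calls(SCRIPT)
    for name, arguments in recorded:
        print(name, arguments)
```

The static pass describes what the script defines, while the profile hook records what actually ran and with which argument values. Together they give a small history of how `result` was produced, without adding any instrumentation to the script itself.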

Team

The main noWorkflow team is composed of researchers from Universidade Federal Fluminense (UFF) in Brazil and New York University (NYU) in the USA.

João Felipe Pimentel (UFF) (main developer)
Juliana Freire (NYU)
Leonardo Murta (UFF)
Vanessa Braganholo (UFF)
Arthur Paiva (UFF)

Collaborators

David Koop (University of Massachusetts Dartmouth)
Fernando Chirigati (NYU)
Paolo Missier (Newcastle University)
Vynicius Pontes (UFF)
Henrique Linhares (UFF)
Eduardo Jandre (UFF)
Jessé Lima (Summer of Reproducibility)
Joshua Daniel Talahatu (Google Summer of Code)

History

The project started in 2013, when Leonardo Murta and Vanessa Braganholo were visiting professors at New York University (NYU) with Juliana Freire. At that time, David Koop and Fernando Chirigati also joined the project. They published the initial paper about noWorkflow at IPAW 2014. After returning to their home university, Universidade Federal Fluminense (UFF), Leonardo and Vanessa invited João Felipe Pimentel to join the project in 2014 for his PhD. João, Juliana, Leonardo, and Vanessa integrated noWorkflow with IPython and published a paper about it at TaPP 2015. They also worked on provenance versioning and fine-grained provenance collection and published papers at IPAW 2016. During the same period, David, João, Leonardo, and Vanessa worked with the YesWorkflow team on an integration between noWorkflow and YesWorkflow and published a demo at IPAW 2016. Research and development on noWorkflow continues and is currently the responsibility of João Felipe, in the context of his PhD thesis.

Publications

Why use noWorkflow?

noWorkflow identifies dependencies, parameters, and dataflows, helping you keep a detailed history of script executions. It makes it faster to check results across different versions, clears the way for collaboration and reproducibility, and makes experiments easier to understand and share.

Who uses noWorkflow?

Research scientists, data scientists, and professionals who work with Python scripts and need to keep track of processes and data.

Where does noWorkflow apply?

In scientific research, data analysis projects, and academic environments, such as research labs running complex experiments, where reproducibility is essential and rigorous documentation is required.

When to use noWorkflow?

When you need to capture, analyze, and document data provenance in complex experiments and workflows. It meets the specific needs of PROV, Prolog, and dataflow users.

How to install?

Follow the link to Quick Install

How to use?

In progress; updates coming soon.
