Home
Copyright (c) 2016 Universidade Federal Fluminense (UFF). Copyright (c) 2016 Polytechnic Institute of New York University. All rights reserved.
noWorkflow is a tool that automatically traces the provenance of a Python script without requiring changes to the original code, letting users create and analyze a detailed history of how data was produced and transformed. This history supports transparency and reliability in scientific experiments and data processes. Developed in Python, noWorkflow captures the provenance of scripts using software engineering techniques such as abstract syntax tree (AST) analysis, reflection, and profiling, without requiring a version control system or any other external environment.
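To give a feel for the techniques named above, the following is a minimal sketch using only the Python standard library. It is not noWorkflow's implementation: the `SCRIPT` sample and the helper names `summarize_script` and `record_activations` are hypothetical and chosen only for illustration. The first helper uses the `ast` module to list imports, function definitions, and calls without running the script; the second uses `sys.setprofile` to record which Python functions are actually activated at run time.

```python
import ast
import sys

# Hypothetical script whose provenance we want to inspect.
SCRIPT = """
import math

def area(radius):
    return math.pi * radius ** 2

def report(radius):
    return area(radius)

result = report(2.0)
"""


def summarize_script(source):
    """Static view: collect imports, function definitions, and call names."""
    tree = ast.parse(source)
    summary = {"imports": [], "functions": [], "calls": []}
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            summary["imports"].extend(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom):
            # node.module is None for "from . import x"
            summary["imports"].append(node.module or ".")
        elif isinstance(node, ast.FunctionDef):
            summary["functions"].append(node.name)
        elif isinstance(node, ast.Call):
            func = node.func
            if isinstance(func, ast.Name):
                summary["calls"].append(func.id)
            elif isinstance(func, ast.Attribute):
                summary["calls"].append(func.attr)
    return summary


def record_activations(func, *args):
    """Dynamic view: record names of Python functions called while func runs."""
    activations = []

    def profiler(frame, event, arg):
        # "call" fires for Python-level function calls only.
        if event == "call":
            activations.append(frame.f_code.co_name)

    sys.setprofile(profiler)
    try:
        result = func(*args)
    finally:
        sys.setprofile(None)  # always remove the profiler
    return result, activations


summary = summarize_script(SCRIPT)

# Run the sample script in an isolated namespace, then profile one call.
namespace = {}
exec(compile(SCRIPT, "<sample>", "exec"), namespace)
value, activations = record_activations(namespace["report"], 2.0)
```

The static pass sees what the script could do, while the profiler sees what it did on a particular run; combining both kinds of information is the general idea behind collecting provenance without modifying the script.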
The main noWorkflow team is composed of researchers from Universidade Federal Fluminense (UFF) in Brazil and New York University (NYU) in the USA.
- João Felipe Pimentel (UFF) (main developer)
- Juliana Freire (NYU)
- Leonardo Murta (UFF)
- Vanessa Braganholo (UFF)
- Arthur Paiva (UFF)
Collaborators
- David Koop (University of Massachusetts Dartmouth)
- Fernando Chirigati (NYU)
- Paolo Missier (Newcastle University)
- Vynicius Pontes (UFF)
- Henrique Linhares (UFF)
- Eduardo Jandre (UFF)
- Jessé Lima (Summer of Reproducibility)
- Joshua Daniel Talahatu (Google Summer of Code)
The project started in 2013, when Leonardo Murta and Vanessa Braganholo were visiting professors at New York University (NYU) with Juliana Freire. At that time, David Koop and Fernando Chirigati also joined the project, and the group published the initial noWorkflow paper at IPAW 2014. After returning to their home university, Universidade Federal Fluminense (UFF), Leonardo and Vanessa invited João Felipe Pimentel to join the project in 2014 for his PhD. João, Juliana, Leonardo, and Vanessa integrated noWorkflow with IPython and published a paper about it at TaPP 2015. They also worked on provenance versioning and fine-grained provenance collection and published papers at IPAW 2016. During the same period, David, João, Leonardo, and Vanessa worked with the YesWorkflow team on an integration between noWorkflow and YesWorkflow and published a demo at IPAW 2016. Research and development on noWorkflow continues and is currently the responsibility of João Felipe, in the context of his PhD thesis.
- MURTA, L. G. P.; BRAGANHOLO, V.; CHIRIGATI, F. S.; KOOP, D.; FREIRE, J.; noWorkflow: Capturing and Analyzing Provenance of Scripts. In: International Provenance and Annotation Workshop (IPAW), 2014, Cologne, Germany.
- PIMENTEL, J. F. N.; FREIRE, J.; MURTA, L. G. P.; BRAGANHOLO, V.; Collecting and Analyzing Provenance on Interactive Notebooks: when IPython meets noWorkflow. In: Theory and Practice of Provenance (TaPP), 2015, Edinburgh, Scotland.
- PIMENTEL, J. F.; FREIRE, J.; BRAGANHOLO, V.; MURTA, L. G. P.; Tracking and Analyzing the Evolution of Provenance from Scripts. In: International Provenance and Annotation Workshop (IPAW), 2016, McLean, Virginia.
- PIMENTEL, J. F.; FREIRE, J.; MURTA, L. G. P.; BRAGANHOLO, V.; Fine-grained Provenance Collection over Scripts Through Program Slicing. In: International Provenance and Annotation Workshop (IPAW), 2016, McLean, Virginia.
- PIMENTEL, J. F.; DEY, S.; MCPHILLIPS, T.; BELHAJJAME, K.; KOOP, D.; MURTA, L. G. P.; BRAGANHOLO, V.; LUDÄSCHER, B.; Yin & Yang: Demonstrating Complementary Provenance from noWorkflow & YesWorkflow. In: International Provenance and Annotation Workshop (IPAW), 2016, McLean, Virginia.
- PIMENTEL, J. F.; MURTA, L. G. P.; BRAGANHOLO, V.; FREIRE, J.; noWorkflow: a Tool for Collecting, Analyzing, and Managing Provenance from Python Scripts. In: International Conference on Very Large Data Bases (VLDB), 2017, Munich, Germany.
- OLIVEIRA, E.; Enabling Collaboration in Scientific Experiments. Master's Dissertation, Universidade Federal Fluminense, 2022.
noWorkflow identifies dependencies, parameters, and dataflows, helping to keep a detailed history of script executions. It speeds up checking results across different versions, clears the way for collaboration and reproducibility, and makes experiments easier to understand and share.
noWorkflow is intended for research scientists, data scientists, and other professionals who work with Python scripts and need to keep track of processes and data.
Scientific research, data analysis projects, and academic environments where reproducibility is essential, such as research labs and complex experiments, demand rigorous documentation processes.
Use noWorkflow when you need to capture, analyze, and document data provenance in complex experiments and workflows. It meets the specific needs of PROV, Prolog, and dataflow users.
Follow the link to Quick Install
In progress, updates soon.