Distributed data pre-processing for energy reconstruction resolution improvements in JUNO

niklai99/distributed-juno

Distributed processing of JUNO datasets 

JUNO is a neutrino experiment that aims to measure, with high resolution, the energy of positrons released in neutrino interactions. The structure of the detector data lends itself to ML image-processing techniques. These techniques require a distributed preprocessing scheme, both to handle the very large datasets involved and to build the 2D images they take as input.
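As a rough illustration of what "constructing a 2D image" can mean, the sketch below bins per-hit charges into an angular grid with NumPy. It is only a sketch under stated assumptions: the input layout (polar angle, azimuthal angle, and charge per hit), the function name `hits_to_image`, the grid size, and the random toy data are all hypothetical, and the project's actual image construction may differ.

```python
import numpy as np


def hits_to_image(theta, phi, charge, n_theta=8, n_phi=16):
    """Bin hit charges into a 2D (theta, phi) image.

    Hypothetical helper: the angular projection and the grid size
    are assumptions made for illustration only.
    """
    image, _, _ = np.histogram2d(
        theta,
        phi,
        bins=[n_theta, n_phi],
        range=[[0.0, np.pi], [-np.pi, np.pi]],
        weights=charge,  # each pixel holds the summed charge of its hits
    )
    return image


# Toy event: random hits spread uniformly over a sphere (not real JUNO data).
rng = np.random.default_rng(0)
n_hits = 1000
theta = np.arccos(rng.uniform(-1.0, 1.0, n_hits))  # polar angle in [0, pi]
phi = rng.uniform(-np.pi, np.pi, n_hits)           # azimuthal angle
charge = rng.exponential(1.0, n_hits)              # arbitrary charge units

image = hits_to_image(theta, phi, charge)
print(image.shape)
```

Because every event yields an array of the same shape, a step like this can be applied independently to each event, which is what makes it a natural candidate for distribution.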

This project uses the Dask library to distribute the preprocessing over 5 virtual machines. It investigates the interplay between hyperparameters such as the number of workers, the number of partitions, and the number of threads.
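The three hyperparameters map onto Dask roughly as in the sketch below. It makes several assumptions: an in-process local cluster stands in for the 5 virtual machines, `preprocess` is a hypothetical placeholder for the real per-event step, the data is a toy list, and `dask.bag` is used for simplicity (the project may rely on a different Dask collection).

```python
import time

import dask.bag as db
from dask.distributed import Client


def preprocess(event):
    """Hypothetical per-event step: sum the charges of one event."""
    return sum(event)


def timed_run(events, n_workers, threads_per_worker, npartitions):
    """Run the preprocessing on a local cluster and time the computation."""
    # Assumption: processes=False keeps the workers in this process,
    # as a stand-in for workers running on separate virtual machines.
    with Client(
        n_workers=n_workers,
        threads_per_worker=threads_per_worker,
        processes=False,
        dashboard_address=None,  # no diagnostics dashboard for this sketch
    ):
        # npartitions is the requested number of chunks the data is split into.
        bag = db.from_sequence(events, npartitions=npartitions)
        start = time.perf_counter()
        results = bag.map(preprocess).compute()
        elapsed = time.perf_counter() - start
    return results, elapsed


# Toy dataset: 12 "events" of 4 values each (not real JUNO data).
events = [[float(i + j) for j in range(4)] for i in range(12)]

timings = {}
for n_workers, npartitions in [(1, 2), (2, 4)]:
    results, elapsed = timed_run(
        events, n_workers, threads_per_worker=1, npartitions=npartitions
    )
    timings[(n_workers, npartitions)] = elapsed

print(results[:3])
```

On a dataset this small the timings mostly reflect scheduling overhead; a meaningful scan needs realistic data volumes and a real multi-machine cluster.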

Figure: Partitions vs. Workers, one thread per worker

The file final-notebook.ipynb contains a summary of the work done.


Authors

The project has been developed by

