Final Degree Project in Computer Engineering - UNED

Title: Implementing MapReduce design pattern use cases in Python, on distributed parallel infrastructures based on lightweight containers

This repository is part of the Final Degree Project in Computer Engineering at UNED (Universidad Nacional de Educación a Distancia, Spain's National Distance Education University).

This work covers Big Data and distributed computing, the Hadoop and MapReduce tools used to deploy this kind of infrastructure, the implementation of MapReduce design patterns in Python, and the MRJob library for programming in the MapReduce paradigm.

Author: Manuel Rodríguez Sánchez

Supervisor: Agustín Carlos Caminero Herráez

Defense date: December 18, 2020

Instructions before executing and testing the patterns in the cluster

The patterns can be run either locally or on the Hadoop cluster. Both modes use the same datasets, but to run on the cluster the datasets must first be loaded into HDFS. To do so, follow the steps in the Jupyter notebook "Instrucciones para cargar en HDFS los archivos de datos" ("Instructions for uploading data files to HDFS"). Once those steps are complete, the system is ready to run the prototypes on the cluster; they are not needed for local mode. An example run of each prototype is provided for both local mode and the Hadoop cluster.
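
The two run modes described above can be sketched as the commands they boil down to. This is a minimal illustration, not the procedure from the notebook: the script name, file paths and HDFS directory are hypothetical, and the helper functions are invented for this example. It assumes the standard `hdfs dfs` command-line client and MRJob's `-r hadoop` runner option.

```python
from pathlib import PurePosixPath


def hdfs_upload_commands(local_files, hdfs_dir):
    """Build the `hdfs dfs` commands that create hdfs_dir and copy files into it."""
    # -mkdir -p creates the target directory (and parents) if it is missing.
    commands = [["hdfs", "dfs", "-mkdir", "-p", hdfs_dir]]
    for path in local_files:
        # -put copies a local file into HDFS; -f overwrites an existing copy.
        commands.append(["hdfs", "dfs", "-put", "-f", path, hdfs_dir])
    return commands


def job_command(script, input_file, hdfs_dir=None):
    """Build the command that runs an MRJob script locally or on Hadoop."""
    if hdfs_dir is None:
        # Local mode: the input is read straight from the local filesystem.
        return ["python", script, input_file]
    # Cluster mode: the input is the copy previously uploaded to hdfs_dir.
    file_name = PurePosixPath(input_file).name
    hdfs_input = "hdfs://" + str(PurePosixPath(hdfs_dir) / file_name)
    return ["python", script, "-r", "hadoop", hdfs_input]


# Hypothetical names, used only to show the shape of the commands.
for cmd in hdfs_upload_commands(["data/sample.csv"], "/user/tfg/data"):
    print(" ".join(cmd))
print(" ".join(job_command("pattern_job.py", "data/sample.csv")))
print(" ".join(job_command("pattern_job.py", "data/sample.csv", "/user/tfg/data")))
```

Each returned list can be passed to `subprocess.run(cmd, check=True)` on a machine where the Hadoop client is installed. Building the commands separately from running them keeps the sketch usable on a machine without a cluster.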

Repository contents:

.ipynb_checkpoints
Tema_3_ Ejemplos_Memoria
Tema_4_Ejemplo_ejecucion
Tema_6_1_Patrones_de_resumen
Tema_6_2_Patrones_de_filtrado
Tema_6_3_Patrones_organizacion_datos
Tema_6_4_Patrones_de_union
Instrucciones para cargar en HDFS los archivos de datos.ipynb
README.md
Trabajo fin de Carrera.pdf

Repository: manursanchez/TFG_Manuel_R