diff --git a/docs/_static/drawio/sdlf-architecture-datalake.drawio b/docs/_static/drawio/sdlf-architecture-datalake.drawio
new file mode 100644
index 00000000..a67e8add
--- /dev/null
+++ b/docs/_static/drawio/sdlf-architecture-datalake.drawio
@@ -0,0 +1,142 @@
[... 142 added lines of draw.io diagram XML omitted ...]
diff --git a/docs/_static/drawio/sdlf-architecture-datamesh.drawio b/docs/_static/drawio/sdlf-architecture-datamesh.drawio
new file mode 100644
index 00000000..6757f146
--- /dev/null
+++ b/docs/_static/drawio/sdlf-architecture-datamesh.drawio
diff --git a/docs/_static/sdlf-architecture-datalake.png b/docs/_static/sdlf-architecture-datalake.png
new file mode 100644
index 00000000..35e1bef2
Binary files /dev/null and b/docs/_static/sdlf-architecture-datalake.png differ
diff --git a/docs/_static/sdlf-architecture-datamesh.png b/docs/_static/sdlf-architecture-datamesh.png
new file mode 100644
index 00000000..b724809b
Binary files /dev/null and b/docs/_static/sdlf-architecture-datamesh.png differ
diff --git a/docs/_static/sdlf-architecture.png b/docs/_static/sdlf-pipeline-full.png
similarity index 100%
rename from docs/_static/sdlf-architecture.png
rename to docs/_static/sdlf-pipeline-full.png
diff --git a/docs/architecture.md b/docs/architecture.md
index 7ec33499..ee4a528d 100644
--- a/docs/architecture.md
+++ b/docs/architecture.md
@@ -4,21 +4,29 @@ SDLF supports both a centralized datalake deployment pattern and decentralized d
 
 ## Centralized Data Lake
 
-![Centralized Data Lake Architecture](_static/sdlf-architecture.png)
+![Centralized Data Lake Architecture](_static/sdlf-architecture-datalake.png)
 
 !!! warning
-    We strongly recommend that customers conduct a [Well Architected Review](https://aws.amazon.com/architecture/well-architected/) of their SDLF Implementation
+    We strongly recommend that customers conduct a [Well Architected Review](https://aws.amazon.com/architecture/well-architected/) of their SDLF implementation.
 
 ## Data Mesh
 
-![Data Mesh Architecture](_static/sdlf-architecture.png)
+The Data Mesh pattern is fundamentally about decentralized data ownership, with data owned by specialized domain teams rather than a centralized data team. This usually means:
+- each data domain team has its own dedicated data infrastructure, for production and/or consumption
+- each data domain team is able to deploy the tools and infrastructure it needs - a self-serve data platform
+
+A governance layer is federating data assets in a business catalog to ensure compliance against policies and standards, and ease of data sharing across teams.
+
+As such, it can be seen as a collection of data domain-specific datalakes deployed with SDLF. [Amazon SageMaker Data and AI Governance](https://aws.amazon.com/sagemaker/data-ai-governance/) (built on Amazon DataZone) can be used for the governance layer.
+
+![Data Mesh Architecture](_static/sdlf-architecture-datamesh.png)
 
 !!! warning
-    We strongly recommend that customers conduct a [Well Architected Review](https://aws.amazon.com/architecture/well-architected/) of their SDLF Implementation
+    We strongly recommend that customers conduct a [Well Architected Review](https://aws.amazon.com/architecture/well-architected/) of their SDLF implementation.
 
 ## Transactional Data Lake
 
 Using [Iceberg](https://docs.aws.amazon.com/prescriptive-guidance/latest/apache-iceberg-on-aws/introduction.html).
 
 !!! warning
-    We strongly recommend that customers conduct a [Well Architected Review](https://aws.amazon.com/architecture/well-architected/) of their SDLF Implementation
+    We strongly recommend that customers conduct a [Well Architected Review](https://aws.amazon.com/architecture/well-architected/) of their SDLF implementation.
diff --git a/docs/constructs/pipeline.md b/docs/constructs/pipeline.md
index 30546a8c..94711cf7 100644
--- a/docs/constructs/pipeline.md
+++ b/docs/constructs/pipeline.md
@@ -13,7 +13,7 @@ Each pipeline is divided into stages (i.e. StageA, StageB...), which map to AWS
 
 Each Step Functions is comprised of one or more steps relating to operations in the orchestration process (e.g. Starting an Analytical Job, Running a crawler...).
 
-![SDLF Architecture](../_static/sdlf-architecture.png)
+![SDLF Architecture](../_static/sdlf-pipeline-full.png)
 
 An example architecture for a SDLF pipeline is detailed in the diagram above. The entire process is event-driven.
 