Data Mesh for Architects
A talk that addresses Data Mesh from an Architect's point of view
(This summary is prepared by S Kannan, Editorial Assistant at The Fifth Elephant. Read the summary before proceeding to watch the talk.)
Data Mesh is a self-service data infrastructure platform that also provides federated computation and governance with support for interoperability standards. What are the responsibilities and challenges for an Architect in building such a platform from a practical perspective? How do you design data products and handle analytical data at scale? In this talk, the example of a MediaCity organisation from the content publishing industry is used to demonstrate how to architect the data platform across its various domain boundaries.
Should the architect take a business or a project perspective to get buy-in from the stakeholders? An architect needs to be cognizant of the optimal flow of data across domains while keeping the cost of operations to a minimum. Assigning ownership of data products to domains is also the responsibility of the architect. The definitions of the various data products across domains are important, and the architect needs to apply product thinking when building the platform.
Several practical approaches are possible. The top-down approach requires a huge infrastructure investment, whereas a bottom-up approach starts with the bare minimum of platform layers, which one can then iterate on with incremental changes. A hybrid approach combines the two and applies when one has a centralised data lake as a source. The architect needs to be aware of storage requirements, the provisioning of compute resources, orchestration needs, and the management of input/output operations. The ability to view the lineage of data products and to monitor them is also an important aspect that needs to be addressed.
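To make domain ownership and lineage concrete, here is a minimal Python sketch of a data product catalog. It is an illustration only, not the design presented in the talk: the `DataProduct` and `Catalog` classes, the domain names (`editorial`, `subscriptions`, `advertising`) and the product names are all hypothetical, chosen to resemble a content publishing organisation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataProduct:
    """Hypothetical descriptor for one data product owned by a domain."""
    name: str
    owner_domain: str
    upstream: tuple[str, ...] = ()  # names of the products this one reads from


class Catalog:
    """Hypothetical in-memory registry of data products."""

    def __init__(self) -> None:
        self._products: dict[str, DataProduct] = {}

    def register(self, product: DataProduct) -> None:
        # Reject products whose inputs are not registered yet, so lineage stays complete.
        missing = [u for u in product.upstream if u not in self._products]
        if missing:
            raise ValueError(f"unknown upstream products: {missing}")
        self._products[product.name] = product

    def lineage(self, name: str) -> list[str]:
        """Return every transitive upstream product, nearest first."""
        seen: list[str] = []
        queue = list(self._products[name].upstream)
        while queue:
            current = queue.pop(0)
            if current in seen:
                continue
            seen.append(current)
            queue.extend(self._products[current].upstream)
        return seen

    def cross_domain_edges(self) -> list[tuple[str, str]]:
        """Return (upstream, downstream) pairs whose owner domains differ."""
        edges = []
        for product in self._products.values():
            for up in product.upstream:
                if self._products[up].owner_domain != product.owner_domain:
                    edges.append((up, product.name))
        return edges


# Assumed example domains and products; none of these names come from the talk.
catalog = Catalog()
catalog.register(DataProduct("articles", "editorial"))
catalog.register(DataProduct("subscribers", "subscriptions"))
catalog.register(
    DataProduct("reader_engagement", "editorial", ("articles", "subscribers"))
)
catalog.register(DataProduct("ad_targeting", "advertising", ("reader_engagement",)))

print(catalog.lineage("ad_targeting"))
print(catalog.cross_domain_edges())
```

Listing the cross-domain edges is one simple way to see where data flows between domains, which is where ownership questions and operating costs tend to surface. A real platform would keep this metadata in a catalog service and not in memory.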