
Open Lakehouse Architecture: How to Scale AI to Production
The AI Forecast: Data and AI in the Cloud Era
Open lakehouse architecture is becoming the foundation for production AI and enterprise AI at scale.
In this episode of The AI Forecast, Dipankar Mazumdar, Director of Developer Relations at Cloudera and co-author of the book “Engineering Lakehouses with Open Table Formats,” joins Paul Muller to explain why open lakehouse architecture is critical for moving from AI pilots to production AI.
They break down:
How Apache Iceberg and open table formats decouple storage from compute
How schema evolution enables change without costly data rewrites
How multiple engines can securely access the same data without duplication
How to prevent small-file performance bottlenecks
How to control AI compute costs at scale
How to embed governance, metadata, and data lineage into AI workloads
Production-ready AI requires scalable data architecture and governance built in from day one. AI and GenAI pilots may be everywhere, but your architecture decides which of them survive.
Stay in touch with Dipankar:
Dipankar Mazumdar on LinkedIn: https://www.linkedin.com/in/dipankar-mazumdar/
Dipankar’s website: https://dipankarmazumdar.github.io/
Dipankar’s book on Amazon: https://www.amazon.com/Engineering-Lakehouses-Open-Table-Formats-ebook/dp/B0DKJD39X8
Follow and subscribe to The AI Forecast for more conversations with the innovators shaping the future of enterprise AI.