Open Lakehouse Architecture: How to Scale AI to Production
04 March 2026

The AI Forecast: Data and AI in the Cloud Era

About

Open lakehouse architecture is becoming the foundation for running enterprise AI in production at scale.


In this episode of The AI Forecast, Dipankar Mazumdar, Director of Developer Relations at Cloudera and co-author of the book “Engineering Lakehouses with Open Table Formats,” joins Paul Muller to explain why open lakehouse architecture is critical for moving AI from pilot to production.


They break down: 



    How Apache Iceberg and open table formats decouple storage from compute
    How schema evolution enables change without costly data rewrites
    How multiple engines can securely access the same data without duplication
    How to prevent small-file performance bottlenecks
    How to control AI compute costs at scale
    How to embed governance, metadata, and data lineage into AI workloads 
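
For readers who want a concrete picture of the schema evolution point before listening, here is a minimal toy sketch in Python. It is not the Apache Iceberg API and is not taken from the episode: `ToyTable` and all of its methods are hypothetical names invented for illustration. It assumes the general idea behind open table formats that columns are tracked by stable field IDs in table metadata, so renaming or adding a column changes only metadata and leaves already-written data files untouched.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class ToyTable:
    """Hypothetical toy model of a table format; not a real Iceberg class."""

    # Table metadata: column name -> stable field ID.
    schema: dict[str, int] = field(default_factory=dict)
    # "Data files": immutable tuples of rows keyed by field ID, never by name.
    data_files: list[tuple[dict[int, Any], ...]] = field(default_factory=list)
    next_field_id: int = 1

    def add_column(self, name: str) -> None:
        """Metadata-only change: register a new name with a fresh field ID."""
        if name in self.schema:
            raise ValueError(f"column {name!r} already exists")
        self.schema[name] = self.next_field_id
        self.next_field_id += 1

    def rename_column(self, old: str, new: str) -> None:
        """Metadata-only change: the field ID stays the same, only the name moves."""
        if old not in self.schema:
            raise KeyError(old)
        if new in self.schema:
            raise ValueError(f"column {new!r} already exists")
        self.schema = {
            (new if name == old else name): fid for name, fid in self.schema.items()
        }

    def append(self, rows: list[dict[str, Any]]) -> None:
        """Write a new data file, translating current names to field IDs."""
        data_file = tuple(
            {self.schema[name]: value for name, value in row.items()} for row in rows
        )
        self.data_files.append(data_file)

    def scan(self) -> list[dict[str, Any]]:
        """Read every file through the current schema; missing fields read as None."""
        return [
            {name: row.get(fid) for name, fid in self.schema.items()}
            for data_file in self.data_files
            for row in data_file
        ]


table = ToyTable()
table.add_column("id")
table.add_column("amount")
table.append([{"id": 1, "amount": 9.5}])
files_before = list(table.data_files)

# Evolve the schema: no existing data file is rewritten.
table.rename_column("amount", "total")
table.add_column("currency")
table.append([{"id": 2, "total": 3.0, "currency": "EUR"}])

rows = table.scan()
```

In this sketch the first data file is the same object before and after the rename, and old rows simply read the new `currency` column as `None`. Real table formats add much more (typed fields, manifests, snapshots), so treat this only as an intuition aid.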

Production-ready AI requires scalable data architecture and governance built in from day one. AI and GenAI pilots may be everywhere, but your architecture is what truly decides what survives.  


Stay in touch with Dipankar:  



    Dipankar Mazumdar on LinkedIn: https://www.linkedin.com/in/dipankar-mazumdar/ 


    Dipankar’s website: https://dipankarmazumdar.github.io/ 


    Dipankar’s book on Amazon: https://www.amazon.com/Engineering-Lakehouses-Open-Table-Formats-ebook/dp/B0DKJD39X8 

 




Follow and subscribe to The AI Forecast for more conversations with the innovators shaping the future of enterprise AI.