Open Source Airflow Contributions and Performance Improvements at G-Research with Christos Bisias
19 March 2026

The Data Flowcast: Mastering Apache Airflow ® for Data Engineering and AI

About

Modern Airflow isn’t just orchestration. It’s a contribution.



In this episode, we explore how open source investment drives real performance gains and deeper observability.


We’re joined by Christos Bisias, Open Source Software Engineer, Apache Airflow at G-Research, to discuss how his team uses Airflow for large-scale data transformations, contributes upstream and improves scheduler throughput and OpenTelemetry support. From trace-level observability to CI-enforced metrics governance and a major scheduler optimization, this conversation spans strategy, engineering and community impact.


Key Takeaways:


00:00 Introduction.

01:20 How G-Research applies machine learning and big data to predict financial market movements.

02:15 Contributing to open source is a business decision.

03:10 Maintaining a fork is costly.

04:30 OpenTelemetry collects metrics, logs and traces to provide deep system visibility.

06:10 Custom spans help identify bottlenecks inside tasks and enable performance optimization.

08:05 OpenTelemetry integration works properly in Airflow 3.0 and above.

10:00 A YAML-based metrics registry with CI enforcement ensures consistency between docs and exported metrics.

12:10 Scheduler throughput improved significantly by applying concurrency limits earlier in the database query. 

15:20 Future Task SDK changes may enable language-agnostic DAG authoring beyond Python.
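
To make the 12:10 takeaway concrete, here is a minimal sketch of what applying a concurrency limit inside the query can look like, compared with filtering after the rows are fetched. It is not the actual Airflow scheduler change discussed in the episode. The table layout, the `MAX_ACTIVE_PER_DAG` and `BATCH_SIZE` values, and both function names are assumptions made up for illustration, and it uses an in-memory SQLite database (window functions need SQLite 3.25 or newer).

```python
import sqlite3

MAX_ACTIVE_PER_DAG = 2  # hypothetical per-DAG concurrency limit
BATCH_SIZE = 4          # hypothetical number of rows examined per scheduler loop


def build_db():
    """Create a toy table: six queued tasks for dag_a, three for dag_b."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE task_instance ("
        "id INTEGER PRIMARY KEY, dag_id TEXT, priority INTEGER)"
    )
    rows = [("dag_a", 10)] * 6 + [("dag_b", 5)] * 3
    conn.executemany(
        "INSERT INTO task_instance (dag_id, priority) VALUES (?, ?)", rows
    )
    return conn


def limit_after_query(conn):
    """Fetch a batch first, then drop rows that exceed the per-DAG limit."""
    rows = conn.execute(
        "SELECT id, dag_id FROM task_instance "
        "ORDER BY priority DESC, id LIMIT ?",
        (BATCH_SIZE,),
    ).fetchall()
    picked, per_dag = [], {}
    for task_id, dag_id in rows:
        if per_dag.get(dag_id, 0) < MAX_ACTIVE_PER_DAG:
            per_dag[dag_id] = per_dag.get(dag_id, 0) + 1
            picked.append((task_id, dag_id))
    return picked


def limit_in_query(conn):
    """Apply the per-DAG limit inside the query, before the batch LIMIT."""
    return conn.execute(
        """
        SELECT id, dag_id FROM (
            SELECT id, dag_id, priority,
                   ROW_NUMBER() OVER (
                       PARTITION BY dag_id ORDER BY priority DESC, id
                   ) AS rn
            FROM task_instance
        )
        WHERE rn <= ?
        ORDER BY priority DESC, id
        LIMIT ?
        """,
        (MAX_ACTIVE_PER_DAG, BATCH_SIZE),
    ).fetchall()


conn = build_db()
late = limit_after_query(conn)
early = limit_in_query(conn)
print(len(late), len(early))
```

In this toy data the late filter spends its whole batch on dag_a rows and keeps only two of them, while the in-query limit fills the batch with two tasks from each DAG. The same batch size yields more schedulable work per loop, which is the intuition behind the throughput gain.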


Resources Mentioned:


Christos Bisias

https://www.linkedin.com/in/xbis/


G-Research

https://www.linkedin.com/company/g-research/


Apache Airflow

https://airflow.apache.org/


OpenTelemetry

https://opentelemetry.io/


Prometheus

https://prometheus.io/


Grafana

https://grafana.com/


Jaeger

https://www.jaegertracing.io/





Thanks for listening to “The Data Flowcast: Mastering Apache Airflow® for Data Engineering and AI.” If you enjoyed this episode, please leave a 5-star review to help get the word out about the show. And be sure to subscribe so you never miss any of the insightful conversations.



#AI #Automation #Airflow