|2020||Dec 2020 08:53 AM||14.5||4934|
|2018||Jul 2018 09:04 AM||4.0||4934|
AI Technology Stack
The Databand.ai platform orchestrates ML creation and data processing within organizations and gives the data scientists and engineers involved visibility into the process. The platform streamlines the integration, productization, and testing of ML pipelines, enabling the different stakeholders to work together on ML projects in an efficient, frictionless way.
These are some recent contributions we’ve made to Apache Airflow:
Together with our friends from Polidea, we created a new executor for debugging and DAG development. It runs a single task instance at a time and works with SQLite and sensors.
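To make the "one task instance at a time" idea concrete, here is a minimal toy sketch in plain Python. It is not Airflow's implementation: the `SingleTaskRunner` class, its `submit`/`run` methods, and the `not_run` state are hypothetical names invented for illustration. The assumption it illustrates is that a runner with a single slot cannot let a sensor block while waiting, so a sensor whose condition is not yet met goes back on the queue and the next task gets its turn.

```python
from collections import deque


class SingleTaskRunner:
    """Hypothetical toy runner: executes queued tasks one at a time, in-process."""

    def __init__(self):
        self._queue = deque()
        self.states = {}   # task_id -> final state
        self.order = []    # every execution attempt, in order

    def submit(self, task_id, fn, is_sensor=False):
        self._queue.append((task_id, fn, is_sensor))

    def run(self):
        while self._queue:
            task_id, fn, is_sensor = self._queue.popleft()
            self.order.append(task_id)
            try:
                result = fn()
            except Exception:
                # Stop at the first failure so the error is easy to debug.
                self.states[task_id] = "failed"
                for rest_id, _, _ in self._queue:
                    self.states[rest_id] = "not_run"
                self._queue.clear()
                break
            if is_sensor and not result:
                # Sensor condition not met yet: re-queue instead of blocking
                # the only execution slot.
                self._queue.append((task_id, fn, is_sensor))
            else:
                self.states[task_id] = "success"
        return self.states


if __name__ == "__main__":
    pokes = iter([False, True])  # the sensor succeeds on its second check
    runner = SingleTaskRunner()
    runner.submit("wait_for_file", lambda: next(pokes), is_sensor=True)
    runner.submit("load", lambda: None)
    print(runner.run())
    print(runner.order)
```

Because everything runs sequentially in one process, a breakpoint or stack trace inside a task surfaces directly in the developer's session, which is what makes this style of executor convenient for debugging.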
Working with Polidea, we’ve made major progress in optimizing Airflow scheduler performance. Tests so far show 10x faster query performance and over 2,000 fewer queries. Some of the optimizations that have been pushed so far (with more to come):
[AIRFLOW-6856] Bulk fetch paused_dag_ids
[AIRFLOW-6857] Bulk sync DAGs
[AIRFLOW-6862] Do not check the freshness of fresh DAG
[AIRFLOW-6869] Bulk fetch DAGRuns for _process_task_instances
[AIRFLOW-6881] Bulk fetch DAGRun for create_dag_run
[AIRFLOW-6887] Do not check the state of fresh DAGRun
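Several of the items above share one pattern: replacing a query per object with a single bulk query. The sketch below illustrates that pattern with the standard-library `sqlite3` module. It is not the scheduler's actual code; the `dag` table layout and the function names `paused_one_by_one` and `paused_bulk` are assumptions made for the example.

```python
import sqlite3


def make_db():
    """Build an in-memory table of DAGs; odd-numbered DAGs are paused."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE dag (dag_id TEXT PRIMARY KEY, is_paused INTEGER)")
    conn.executemany(
        "INSERT INTO dag VALUES (?, ?)",
        [(f"dag_{i}", i % 2) for i in range(6)],
    )
    return conn


def paused_one_by_one(conn, dag_ids):
    """Naive approach: one query per DAG, so N round trips for N DAGs."""
    paused, queries = set(), 0
    for dag_id in dag_ids:
        row = conn.execute(
            "SELECT is_paused FROM dag WHERE dag_id = ?", (dag_id,)
        ).fetchone()
        queries += 1
        if row and row[0]:
            paused.add(dag_id)
    return paused, queries


def paused_bulk(conn, dag_ids):
    """Bulk approach: a single query with an IN clause for all DAG ids."""
    dag_ids = list(dag_ids)
    if not dag_ids:
        return set(), 0
    placeholders = ",".join("?" for _ in dag_ids)
    rows = conn.execute(
        f"SELECT dag_id FROM dag WHERE is_paused = 1 AND dag_id IN ({placeholders})",
        dag_ids,
    ).fetchall()
    return {row[0] for row in rows}, 1


if __name__ == "__main__":
    conn = make_db()
    ids = [f"dag_{i}" for i in range(6)]
    print(paused_one_by_one(conn, ids))
    print(paused_bulk(conn, ids))
```

Both functions return the same set of paused DAG ids, but the bulk version issues one query regardless of how many DAGs are checked, which is where the reduction in query count comes from.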