Refactor telemetry to collect events during DAG run and not DAG parsing #300
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## main #300 +/- ##
==========================================
- Coverage 93.38% 92.57% -0.82%
==========================================
Files 8 10 +2
Lines 680 700 +20
==========================================
+ Hits 635 648 +13
- Misses 45 52 +7
View full report in Codecov by Sentry.
Great work, @pankajastro. Happy for us to merge the PR once the docs (https://github.com/astronomer/dag-factory/blob/main/PRIVACY_NOTICE.md) are updated.
Updated the docs.
### Added

- Add support to TaskFlow and improve dynamic task mapping support by @tatiana in #314
- Render YML DAG config as DAG Docs by @pankajastro #305
- Support building DAGs out of topologically unsorted YAML files by @tatiana in #307
- Add support for nested task groups by @glazunov996 and @pankajastro in #292
- Add support for templating `on_failure_callback` by @jroach-astronomer #252

### Fixed

- Fix compatibility with apache-airflow-providers-cncf-kubernetes>=10.0.0 by @tatiana in #311
- Refactor telemetry to collect events during DAG run and not during DAG parsing by @pankajastro #300

### Docs

- Fix reference for HttpSensor in README.md by @pankajastro in #277
- Add example DAG for task group by @pankajastro in #293
- Add CODEOWNERS by @pankajkoti in #270
- Update CODEOWNERS to track all files by @pankajkoti in #276
- Modified Status badge in README by @jaejun #298

### Others

- Refactor dynamic task mapping implementation by @tatiana in #313
- Remove pytest durations from tests by @tatiana in #309
- Remove DAG retries check since many DAGs have different retry values by @tatiana in #310
- Lint fixes after running `pre-commit run --all-files` by @tatiana in #312
- Remove redundant exception code by @pankajastro #294
- Add GitHub issue template for bug reports and feature requests by @pankajkoti in #269

Closes: #223
DAG Factory 0.20 started collecting telemetry in PR #250. One limitation of that initial implementation is that it emitted telemetry every time DAGs were parsed, so the data collected was proportional to the number of times a DAG was parsed rather than to actual usage. This PR addresses the limitation by changing DAG Factory to emit telemetry during DAG runs instead.
The implementation uses Airflow listeners to emit events only after a factory-built DAG has run.
Closes: #282
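As a rough illustration of the listener approach described above, the sketch below shows run-time handlers that emit one event per finished DAG run and skip DAGs that were not built by DAG Factory. It is not the code from this PR. The `dagfactory` tag check, the `emit_event` helper, the `"dag_run"` event type value, and the in-memory `emitted` list are all assumptions made for illustration. In a real Airflow plugin, the two `on_dag_run_*` functions would be decorated with Airflow's `hookimpl` and registered through a plugin's `listeners` attribute; they are plain functions here so the sketch runs without Airflow installed.

```python
from types import SimpleNamespace

# Stand-in for the network call that would send telemetry (hypothetical).
emitted = []


def is_factory_built(dag) -> bool:
    # Assumption: factory-built DAGs carry a "dagfactory" tag.
    tags = getattr(dag, "tags", None) or []
    return "dagfactory" in tags


def emit_event(event_type: str, metrics: dict) -> None:
    # Hypothetical helper: record the event instead of sending it.
    emitted.append({"event_type": event_type, **metrics})


def _handle_dag_run(dag_run, status: str) -> None:
    dag = getattr(dag_run, "dag", None)
    if dag is None or not is_factory_built(dag):
        return  # ignore DAGs that DAG Factory did not build
    emit_event("dag_run", {"status": status, "task_count": len(dag.task_ids)})


# Same names and leading arguments as Airflow's DAG-run listener hooks.
def on_dag_run_success(dag_run, msg: str = "") -> None:
    _handle_dag_run(dag_run, "success")


def on_dag_run_failed(dag_run, msg: str = "") -> None:
    _handle_dag_run(dag_run, "failed")


if __name__ == "__main__":
    # Fake DAG run objects so the sketch can be exercised without Airflow.
    factory_dag = SimpleNamespace(tags=["dagfactory"], task_ids=["a", "b"])
    other_dag = SimpleNamespace(tags=["etl"], task_ids=["x"])
    on_dag_run_success(SimpleNamespace(dag=factory_dag))
    on_dag_run_success(SimpleNamespace(dag=other_dag))
    print(emitted)
```

Because the handlers fire once per completed run, the number of events tracks how often DAGs actually execute, not how often the scheduler re-parses the DAG files.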
With this data, we can get the following insights:
- Airflow Version
- DAG Hash
- DAG Factory Version
- Event Type
- Platform Machine
- Platform System
- Python Version
- DAG Run Status
- Task Count in DAG run
- Telemetry Version
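To make the list above concrete, here is a minimal sketch of how such a payload could be assembled with the Python standard library. The function name `build_metrics`, the dictionary keys, the default telemetry version, and the use of a SHA-256 digest of the DAG ID as the "DAG Hash" are all assumptions for illustration; the PR does not specify how these values are computed or named.

```python
import hashlib
import platform


def build_metrics(
    dag_id: str,
    airflow_version: str,
    dagfactory_version: str,
    event_type: str,
    status: str,
    task_count: int,
    telemetry_version: str = "1",  # hypothetical default
) -> dict:
    """Collect the ten fields listed above into one dictionary."""
    return {
        "airflow_version": airflow_version,
        # Assumption: hash the DAG ID so the raw name is not sent.
        "dag_hash": hashlib.sha256(dag_id.encode("utf-8")).hexdigest(),
        "dagfactory_version": dagfactory_version,
        "event_type": event_type,
        "platform_machine": platform.machine(),
        "platform_system": platform.system(),
        "python_version": platform.python_version(),
        "status": status,
        "task_count": task_count,
        "telemetry_version": telemetry_version,
    }


if __name__ == "__main__":
    # Version strings below are placeholder values, not real releases.
    print(build_metrics("example_dag", "0.0.0", "0.0.0", "dag_run", "success", 3))
```

Hashing the identifier is one way to count distinct DAGs without transmitting their names; whether the project does exactly this is not stated in the PR description.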