Refactor telemetry to collect events during DAG run and not DAG parsing #300

Merged: pankajastro merged 16 commits into main from telemetry_event on Dec 3, 2024

@pankajastro (Contributor) commented Nov 26, 2024

DAG Factory 0.20 started collecting telemetry in PR #250. One limitation of that initial implementation is that it emitted telemetry every time a DAG was parsed, so the volume of data collected was proportional to the number of parses rather than to actual usage. This PR addresses that limitation by changing DAG Factory to emit telemetry during DAG runs instead.

This implementation uses Airflow listeners to emit events only after a DAG built by DAG Factory has run.
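
As a rough illustration of the listener approach, the sketch below shows how DAG-run hooks could build a telemetry payload only for factory-built DAGs. It is not the code merged in this PR: the `FACTORY_TAG` marker, the `EMITTED` list (a stand-in for the real telemetry sink), and the helper names `build_metrics` and `_record` are all hypothetical. Only `hookimpl`, the `on_dag_run_success` / `on_dag_run_failed` hook names, and the `dag_run.get_dag()`, `dag_run.dag_hash`, `dag.tags` and `dag.task_ids` attributes come from Airflow itself.

```python
from types import SimpleNamespace

try:
    # Airflow's marker for listener hook implementations.
    from airflow.listeners import hookimpl
except ImportError:
    # Fallback so this sketch also runs where Airflow is not installed.
    def hookimpl(func):
        return func

FACTORY_TAG = "dagfactory"  # hypothetical marker identifying factory-built DAGs
EMITTED = []                # hypothetical stand-in for the real telemetry sink


def build_metrics(dag_run, status):
    """Return a telemetry payload, or None if the DAG was not factory-built."""
    dag = dag_run.get_dag()
    if dag is None or FACTORY_TAG not in (dag.tags or []):
        return None
    return {
        "dag_hash": dag_run.dag_hash,
        "status": status,
        "task_count": len(dag.task_ids),
    }


def _record(dag_run, status):
    # Emit one event per finished run; DAG parsing never reaches this code.
    metrics = build_metrics(dag_run, status)
    if metrics is not None:
        EMITTED.append(metrics)


@hookimpl
def on_dag_run_success(dag_run, msg):
    _record(dag_run, "success")


@hookimpl
def on_dag_run_failed(dag_run, msg):
    _record(dag_run, "failed")


if __name__ == "__main__":
    # Duck-typed stand-ins for Airflow's DAG and DagRun objects.
    fake_dag = SimpleNamespace(tags=["dagfactory"], task_ids=["extract", "load"])
    fake_run = SimpleNamespace(dag_hash="abc123", get_dag=lambda: fake_dag)
    on_dag_run_success(dag_run=fake_run, msg="")
    print(EMITTED)
```

In a real deployment, Airflow discovers listener modules like this one through the `listeners` attribute of an `AirflowPlugin` subclass, so the hooks fire in the scheduler when a run finishes rather than whenever a DAG file is parsed.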

Closes: #282

With this data, we can derive the following insights:

  • Number of failed DagRuns
  • Number of successful DagRuns
  • Total tasks associated with each DagRun
  • DagRun hash
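
The insights listed above could be derived from the collected events with a small aggregation like the one sketched here. The event shape (`dag_hash`, `status`, `task_count` keys) and the `summarize` helper are assumptions made for illustration, not the schema or tooling used by the project.

```python
from collections import Counter


def summarize(events):
    """Aggregate per-run telemetry events (hypothetical shape) into summary counts."""
    status_counts = Counter(event["status"] for event in events)
    return {
        "successful_runs": status_counts["success"],
        "failed_runs": status_counts["failed"],
        # Task count reported by each run, in arrival order.
        "task_counts": [event["task_count"] for event in events],
        # Distinct DAG hashes seen across all runs.
        "dag_hashes": sorted({event["dag_hash"] for event in events}),
    }


if __name__ == "__main__":
    sample_events = [
        {"dag_hash": "abc123", "status": "success", "task_count": 3},
        {"dag_hash": "abc123", "status": "failed", "task_count": 3},
        {"dag_hash": "def456", "status": "success", "task_count": 5},
    ]
    print(summarize(sample_events))
```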

[Screenshots, dated 2024-12-03: one per telemetry field listed below]

  • Airflow Version
  • DAG Hash
  • DAG Factory Version
  • Event Type
  • Platform Machine
  • Platform System
  • Python Version
  • DAG Run Status
  • Task Count in DAG run
  • Telemetry Version

@pankajastro pankajastro force-pushed the telemetry_event branch 2 times, most recently from 346b6f7 to 97d6cad Compare November 26, 2024 08:43
@codecov-commenter commented Nov 26, 2024

Codecov Report

Attention: Patch coverage is 83.72093% with 7 lines in your changes missing coverage. Please review.

Project coverage is 92.57%. Comparing base (9d2b8f5) to head (6f1d80f).
Report is 1 commit behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| dagfactory/listeners/runtime_event.py | 74.07% | 7 Missing ⚠️ |
@@            Coverage Diff             @@
##             main     #300      +/-   ##
==========================================
- Coverage   93.38%   92.57%   -0.82%     
==========================================
  Files           8       10       +2     
  Lines         680      700      +20     
==========================================
+ Hits          635      648      +13     
- Misses         45       52       +7     


@pankajastro pankajastro marked this pull request as ready for review November 27, 2024 19:49
@pankajastro pankajastro requested a review from a team as a code owner November 27, 2024 19:49
@tatiana changed the title from "Add listener to collect telemetry data" to "Refactor telemetry to collect events during DAG run and not during DAG parsing" Dec 3, 2024

@tatiana (Collaborator) left a comment:

Great work, @pankajastro, happy for us to merge the PR once the docs (https://github.com/astronomer/dag-factory/blob/main/PRIVACY_NOTICE.md) are updated.

@pankajastro (Contributor, Author) replied:

> Great work, @pankajastro, happy for us to merge the PR once the docs (https://github.com/astronomer/dag-factory/blob/main/PRIVACY_NOTICE.md) are updated.

Updated the docs.

@pankajastro pankajastro merged commit 72bc85b into main Dec 3, 2024
67 checks passed
@pankajastro pankajastro deleted the telemetry_event branch December 3, 2024 14:41
@tatiana changed the title from "Refactor telemetry to collect events during DAG run and not during DAG parsing" to "Refactor telemetry to collect events during DAG run and not DAG parsing" Dec 3, 2024
@tatiana tatiana mentioned this pull request Dec 6, 2024
tatiana added a commit that referenced this pull request Dec 6, 2024
### Added

- Add support for TaskFlow and improve dynamic task mapping support by
@tatiana in #314
- Render YML DAG config as DAG Docs by @pankajastro in #305
- Support building DAGs out of topologically unsorted YAML files by
@tatiana in #307
- Add support for nested task groups by @glazunov996 and @pankajastro in
#292
- Add support for templating `on_failure_callback` by @jroach-astronomer
in #252

### Fixed

- Fix compatibility with
apache-airflow-providers-cncf-kubernetes>=10.0.0 by @tatiana in #311
- Refactor telemetry to collect events during DAG run and not during DAG
parsing by @pankajastro in #300

### Docs

- Fix reference for HttpSensor in README.md by @pankajastro in #277
- Add example DAG for task group by @pankajastro in #293
- Add CODEOWNERS by @pankajkoti in #270
- Update CODEOWNERS to track all files by @pankajkoti in #276
- Modified Status badge in README by @jaejun in #298

### Others

- Refactor dynamic task mapping implementation by @tatiana in #313
- Remove pytest durations from tests by @tatiana in #309
- Remove DAG retries check since many DAGs have different retry values
by @tatiana in #310
- Lint fixes after running `pre-commit run --all-files` by @tatiana in
#312
- Remove redundant exception code by @pankajastro in #294
- Add GitHub issue template for bug reports and feature requests by
@pankajkoti in #269

Closes: #223
Linked issue that merging this pull request may close: [telemetry] Identify a better strategy of when to emit telemetry
3 participants