Read DAG from file only later during _run_raw_task #1510

Closed
wants to merge 54 commits into from

Commits on Jun 12, 2024

  1. AIP-64: Add TaskInstance history table (apache#39951)

    * AIP-64: Add TaskInstance history table
    
    This commit adds the taskinstance history table as a first step
    to implementing AIP-64
    
    Co-authored-by: Jed Cunningham
    Co-Authored-By: dstandish
    
    * Remove rel from TI<->TIHistory
    
    * Add a couple comments
    
    * Update the history table at strategic points
    
    * Add history table to db cleanup
    
    * Remove history table update from ti.set_state
    
    * Lazily import the history table
    
    * Fix server default db migration for max_tries & map_index
    
    * Fix Backcompat for provider test in old airflow
    
    * Update the history table only when task completes
    
    * record state as failed when it's upd_for_retry
    
    * Only record the ti history when it's rerunning
    
    * fixup! Only record the ti history when it's rerunning
    
    * Don't use column.copy() since it's deprecated
    
    * Add test for task clearing
    
    * Refactor TI history recording
    
    * Update test
    
    * Use table to use UUID as PK. also removed onupdate cascade for TI &TIH table
    
    * Add unique constraint and use autoincrementing ID for PK
    
    * Add back onupdate cascade
    
    * Remove TaskIntanceHistory from lazy imports in models
    
    * Update comment
    
    ---------
    
    Co-authored-by: Jed Cunningham
    Co-authored-by: dstandish
    4 people authored Jun 12, 2024
    89b32e6
  2. Much smaller CI output for paralell tests (apache#40192)

    The output of parallel tests, especially for lowest-direct tests,
    is very long because we are printing state every 10 seconds and there
    are many parallel test types (90+ for lowest direct). We do not need
    to print progress that often. This was already addressed in apache#39946,
    but one place was missed: the context manager still had
    the default 10-second refresh time.
    
    After this change the output will be printed every 20 seconds in
    the regular tests and every 2 minutes in "lowest-direct" tests
    (controlled by env variable), so it should be much easier
    to find the reasons for issues in the output.
    potiuk authored Jun 12, 2024
    a90c07e
  3. Bump minimum version of google auth (apache#40190)

    apache#39873 added an implicit dependency on google auth >= 2.29.0
    because it uses SubjectTokenSupplier, which was added in that version.
    
    Our "Lowest-direct" tests caught it (yay!) so we should add the
    min requirement to the dependency.
    potiuk authored Jun 12, 2024
    23a0152
  4. 14deaa2
  5. c98cd54
  6. add Coinone (apache#40176)

    jx2lee authored Jun 12, 2024
    1372e10
  7. 794678f
  8. 930db71
  9. f0b51cd
  10. c1ffe45
  11. 28c1419
  12. a84d56d
  13. c2a93ea
  14. 835f28c

Commits on Jun 13, 2024

  1. Ensures DAG params order regardless of backend (apache#40156)

    * Ensures DAG params order regardless of backend
    
    Fixes apache#40154
    
    This change adds an extra attribute to the serialized DAG param objects which helps us decide
    the order of the deserialized params dictionary later even if the backend messes with us.
    
    I decided not to limit this just to MySQL since the operation is inexpensive and may turn
    out to be helpful.
    
    I made sure the new test fails with the old implementation + MySQL. I assume this test will be
    executed with MySQL somewhere in the build actions?
    
    * Removes GitHub reference
    
    Co-authored-by: Jed Cunningham
    
    * Serialize DAG params as array of tuples to ensure ordering
    
    Alternative to previous approach: We serialize the DAG params dict as a list of tuples which _should_ keep their ordering regardless of backend.
    
    Backwards compatibility is ensured because if `encoded_params` is a `dict` (not the expected `list`) then `dict(encoded_params)` still works.
    
    * Make backwards compatibility more explicit
    
    Based on suggestions by @uranusjr with an additional fix to make mypy happy.
    
    ---------
    
    Co-authored-by: Jed Cunningham
    Usiel and jedcunningham authored Jun 13, 2024
    2149b4d
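
The list-of-tuples serialization described in apache#40156 can be illustrated with a minimal standalone sketch. The names `encode_params`, `decode_params`, and `roundtrip_through_backend` are hypothetical and are not Airflow's serializer API; `sort_keys=True` merely stands in for a backend that does not preserve the key order of JSON objects.

```python
import json


def encode_params(params: dict) -> list:
    # Hypothetical helper: store params as [key, value] pairs, because JSON
    # arrays keep their order even when a backend reorders object keys.
    return [[key, value] for key, value in params.items()]


def decode_params(encoded) -> dict:
    # dict() accepts both the new list-of-pairs form and the legacy dict form,
    # which is what keeps old serialized DAGs readable.
    return dict(encoded)


def roundtrip_through_backend(payload):
    # Assumption: sort_keys=True simulates a backend that reorders the keys
    # of JSON objects on storage.
    return json.loads(json.dumps(payload, sort_keys=True))


params = {"zeta": 1, "alpha": 2}
new_format = decode_params(roundtrip_through_backend(encode_params(params)))
legacy_format = decode_params(roundtrip_through_backend(params))
```

In this sketch `new_format` keeps the declared key order, while `legacy_format` still decodes correctly but comes back in whatever order the simulated backend returned.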
  2. local task job: add timeout, to not kill on_task_instance_success listener prematurely (apache#39890)
    
    Signed-off-by: Maciej Obuchowski
    mobuchowski authored Jun 13, 2024
    fa65a20
  3. Resolve deprecations in LatestOnlyOperator tests (apache#40181)

    * Resolve deprecations in `LatestOnlyOperator`
    
    * Use explicit data_interval
    boraberke authored Jun 13, 2024
    feb8307
  4. doc: metrics allow_list complet example (apache#40120)

    Co-authored-by: raphaelauv
    raphaelauv and raphaelauv authored Jun 13, 2024
    205ad57
  5. AIP-64: Add UI endpoint for task instance history (apache#40221)

    * AIP-64: Add UI endpoint for task instance history
    
    This adds UI endpoint for task instance history
    
    Co-authored-by: Jed Cunningham
    Co-Authored-By: dstandish
    Co-Authored-By: Brent Bovenzi
    
    * fixup! AIP-64: Add UI endpoint for task instance history
    
    ---------
    
    Co-authored-by: Jed Cunningham
    Co-authored-by: dstandish
    Co-authored-by: Brent Bovenzi
    4 people authored Jun 13, 2024
    2272ea2
  6. Fix Scheduler restarting due to too many completed pods in cluster (apache#40183)
    
    * Fix Scheduler restarting due to too many completed pods in cluster
    
    Currently, when a pod completes and is not deleted due to the user's configuration,
    the watcher keeps listing these pods and checking their status. We should instead stop
    watching the pod once it succeeds. To do that, pods are created with the executor done
    label set to False, and the label is changed to True when the pod completes. The watcher
    then watches only those pods whose executor done label is False.
    
    closes: apache#22612
    
    * Update airflow/providers/cncf/kubernetes/pod_generator.py
    
    Co-authored-by: Jed Cunningham
    
    * Add back removed section
    
    * Don't add pod key label from get go
    
    * Update airflow/providers/cncf/kubernetes/executors/kubernetes_executor_utils.py
    
    Co-authored-by: Jed Cunningham
    
    ---------
    
    Co-authored-by: Jed Cunningham
    ephraimbuddy and jedcunningham authored Jun 13, 2024
    67798b2
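
The label-based filtering described in apache#40183 can be sketched without a cluster. This is only an illustration of the idea: the label key `executor_done` and all helper names below are hypothetical, and the matching function imitates a single Kubernetes equality selector (`key=value`) rather than calling the Kubernetes client.

```python
DONE_LABEL = "executor_done"  # hypothetical label key, for illustration only


def labels_for_new_pod(base_labels: dict) -> dict:
    # Pods start out with the "done" label set to the string "False".
    return {**base_labels, DONE_LABEL: "False"}


def mark_pod_done(labels: dict) -> dict:
    # When the pod completes, the label is flipped to "True".
    return {**labels, DONE_LABEL: "True"}


def watcher_selector() -> str:
    # Equality-based label selector the watcher would pass when listing pods.
    return f"{DONE_LABEL}=False"


def selector_matches(labels: dict, selector: str) -> bool:
    # Minimal stand-in for server-side matching of one "key=value" selector.
    key, _, value = selector.partition("=")
    return labels.get(key) == value


pods = {
    "running-task": labels_for_new_pod({"dag_id": "example"}),
    "finished-task": mark_pod_done(labels_for_new_pod({"dag_id": "example"})),
}
watched = [
    name for name, labels in pods.items()
    if selector_matches(labels, watcher_selector())
]
```

Because completed pods no longer match the selector, they drop out of the watch even if the user's configuration keeps them in the cluster.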

Commits on Jun 14, 2024

  1. d5a7544
  2. 6f40984
  3. f0bae33
  4. e2b8f68
  5. fix bigquery_to_gcs documentation (apache#40219)

    Currently the documentation says "Importing files" and suggests that the file is imported from GCS to BigQuery. What the operator actually does is export a single BigQuery table to GCS.
    joonvena authored Jun 14, 2024
    d8a3257
  6. Add executor field to the task instance API (apache#40034)

    Return executor as part of TaskInstance queries and also enable
    filtering by executor field.
    
    Also use the changes to display the executor field on the TaskInstance
    Details web page.
    
    Co-authored-by: Vincent
    o-nikolas and vincbeck authored Jun 14, 2024
    c2959c9
  7. 2587295
  8. bffb7b0
  9. 16b17f7
  10. 3e88f47
  11. Databricks: stop including user names in list_jobs (apache#40178)

    * Databricks: stop including user names in `list_jobs`
    
    The user's name is not used on the Airflow side, and this argument saves the lookup, which makes the request faster.
    stephenpurcell-db authored Jun 14, 2024
    a1f9b7d
  12. Handle db isolation for mapped operators and task groups (apache#39259)

    * Handle db isolation for mapped operators and task groups
    
    * Update airflow/models/taskinstance.py
    dstandish authored Jun 14, 2024
    e69ab3a
  13. 1a8d12f
  14. refactor: Added get_extra_dejson method with nested parameter which allows you to specify if you want the nested json as string to be also deserialized. The extra_dejson property uses this method with nested set to False. (apache#39811)
    
    Co-authored-by: David Blain
    dabla and davidblain-infrabel authored Jun 14, 2024
    ca73694
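
The behaviour described in apache#39811 can be sketched as a standalone function. This is a simplified illustration under stated assumptions, not the actual method on Airflow's `Connection` model: here `extra` is passed in as a plain JSON string, and malformed JSON simply yields an empty dict.

```python
import json


def get_extra_dejson(extra, nested=False):
    # Illustrative stand-in: decode the top-level JSON document first.
    if not extra:
        return {}
    try:
        result = json.loads(extra)
    except json.JSONDecodeError:
        return {}
    if nested and isinstance(result, dict):
        for key, value in result.items():
            if isinstance(value, str):
                try:
                    # A string value that is itself valid JSON is decoded too.
                    result[key] = json.loads(value)
                except json.JSONDecodeError:
                    pass  # ordinary strings are left untouched
    return result


extra = json.dumps({"host": "example", "options": json.dumps({"retries": 3})})
flat = get_extra_dejson(extra)                  # what the property would return
deep = get_extra_dejson(extra, nested=True)
```

One consequence of this approach is that any string that happens to be valid JSON, such as `"123"`, is also converted when `nested=True`, which is a reason to keep `nested=False` as the default for the property.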
  15. Chart: set workers.safeToEvict default to False (apache#40229)

    This is a safer default for our workers.
    
    This can be safe to set to true if you have a long enough
    `workers.terminationGracePeriodSeconds` set, but what is
    "long enough" is very situational, so I feel it's better to
    default to not evicting worker pods.
    jedcunningham authored Jun 14, 2024
    4c9f12d
  16. f5079db
  17. Update Dag.test() to run with an executor if desired (apache#40205)

    * Update `Dag.test()` to run with an executor if desired
    
    * Add missing parameter
    
    * Fix typo
    
    * Move `add_logger_if_needed` to local execution
    
    * Add `keep-env-variables` to `breeze testing db-tests`, `breeze testing non-db-tests` and `breeze shell`
    
    * Add documentation
    
    * Fix tests
    
    * Introduce `use-executor` flag
    
    * Update `debug` documentation
    
    * Fix test
    vincbeck authored Jun 14, 2024
    9595357
  18. 3014165
  19. 8eebe2b

Commits on Jun 15, 2024

  1. Change httpx to requests in file_task_handler (apache#39799)

    * Change httpx to requests in file_task_handler
    
    - httpx does not support CIDRs in NO_PROXY
    - simply converting httpx to requests resolves the issue
    - related issue: apache#39794
    
    * Add cidr no_proxy test test_log_handlers.py
    
    * Apply monkeypatch fixture
    
    ---------
    
    Co-authored-by: scott-py
    softyoungha and scott-py authored Jun 15, 2024
    1ddadf5
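
To make concrete what "CIDRs in NO_PROXY" means in apache#39799, here is a minimal standard-library sketch of the matching an HTTP client has to perform. The function `host_bypasses_proxy` is hypothetical; it is not the implementation used by requests or httpx, only an illustration of the check.

```python
import ipaddress


def host_bypasses_proxy(host: str, no_proxy: str) -> bool:
    # Hypothetical helper: True when `host` matches an entry of a
    # comma-separated NO_PROXY value, including CIDR entries.
    try:
        address = ipaddress.ip_address(host)
    except ValueError:
        address = None  # host is a name, not an IP literal

    for entry in (item.strip() for item in no_proxy.split(",")):
        if not entry:
            continue
        if address is not None and "/" in entry:
            try:
                # CIDR entry: check whether the IP falls inside the network.
                if address in ipaddress.ip_network(entry, strict=False):
                    return True
            except ValueError:
                pass  # not a valid CIDR, ignore this entry
        elif host == entry or host.endswith("." + entry.lstrip(".")):
            # Plain entry: exact match or domain-suffix match.
            return True
    return False
```

With `NO_PROXY=10.0.0.0/8`, a log request to `10.1.2.3` would skip the proxy under this check, whereas a client that only compares host strings would not recognise the match.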
  2. Add dependency to httpx >= 0.25.0 everywhere (apache#40256)

    Our "lowest-dependency" tests detected that the weaviate client depends
    implicitly on httpx >= 0.19.0 (it imports USE_CLIENT_DEFAULTS from
    httpx, which is missing in versions < 0.19.0). However, this error is raised
    while importing the weaviate examples in "Always" tests, and a closer look at
    weaviate shows that it actually requires >=0.25.0, so it makes sense for all
    our providers to bump httpx to 0.25.0 as the minimum and to add it to
    weaviate explicitly.
    potiuk authored Jun 15, 2024
    35871f8
  3. Fix typing in telegram provider (apache#40255)

    The python-telegram-bot new version has typing added and we should
    pass the right dict type to it.
    potiuk authored Jun 15, 2024
    bc4ca9d
  4. Working fix for typing in telegram provider (apache#40258)

    The python-telegram-bot new version has typing added and we should
    pass the right dict type to it. The apache#40255 was an unsuccessful
    attempt to fix it, this one actually fixes it.
    potiuk authored Jun 15, 2024
    1451bac
  5. Fix typo when uninstalling weaviate for Pydantic 1 tests (apache#40259)

    Weaviate 4 requires Pydantic v2, and we currently limit weaviate to
    4+. There was already removal of weaviate for Pydantic 1, but it
    had a typo which caused the weaviate-client not to be uninstalled.
    potiuk authored Jun 15, 2024
    cb372e5
  6. Add timeout to base python test (apache#40262)

    The tests for the Python Virtualenv operator can take more than
    60 seconds (the default timeout) when running in parallel. We already
    set the timeout at the Python virtualenv operator level, but it seems
    that the timeout should also be set on the base class for tests
    that are run from the base class.
    
    This should remove flakiness from those tests.
    potiuk authored Jun 15, 2024
    e58c048
  7. Add pytest timeout also to PythonVirtualenvDecorator (apache#40263)

    One more test that often times out when running in parallel; this
    one was also missing the extra timeout.
    potiuk authored Jun 15, 2024
    30f6161
  8. Resolve deprecations in the tests for Google MLEngine operators (apache#40261)
    
    * Resolve deprecations in `MLEngine` operators
    
    * Resolve deprecations in `MLEngine` operator utils
    boraberke authored Jun 15, 2024
    1363043
  9. 161fd55
  10. Remove dependency on special tests for tests finalization (apache#40264)

    Tests finalization is generally run when all tests succeeded in
    the canary run. What the finalization does is:
    
    * updating constraints
    * pushing them
    * updating image cache
    * summarizing warnings
    
    However special tests are really to test some special cases - back
    compatibility, lowest dependencies, latest boto etc. All those tests
    will only be run in a few selected PRs where we upgrade dependencies
    and they will not affect "regular" PRs, so we can safely update
    the constraints and update the cache without waiting for special tests.
    
    This will increase the frequency of updates to constraints, because
    currently they might be quite delayed when some special tests fail, and
    this unnecessarily holds up the constraints update.
    potiuk authored Jun 15, 2024
    1d7ede7

Commits on Jun 16, 2024

  1. 5690439
  2. bfe5fd7
  3. 518a9e4
  4. 60c2d36
  5. 8414c90