Skip to content

Conversation

@collinmcnulty
Copy link
Contributor

@collinmcnulty collinmcnulty commented Aug 18, 2024

The first run behavior for cron based DAGs (and most DAGs really) is quite unintuitive. It is very common for users to want to run a DAG for the first time either immediately, with a logical date of the past run that would have occurred most recently, or to wait until the next run time and then begin. The former is only possible if you set catchup to False and the latter is not possible without knowing ahead of time when that next run would be and hardcoding the start time for the DAG (which is not feasible when doing dev/text/prod environment promotion of the same DAG file).

It is also very common to set the start_date to a dynamic function that calculates a date in the recent past, such as days_ago(1), which can bite you if you're used to doing that (or have coded it into a DAG factory pattern) and then you set up your first monthly DAG. The existing timetables behavior is to silently miss the run, which makes it even worse.

This PR adds the ability to set start_date to None and get the intuitive behavior of running the last run if that run would have happened in the very recent past (10% of the distance between DAG runs) or to wait for the next run otherwise. It also allows using the run_immediately parameter to specify always running the last run, always waiting for the next run, or passing a timedelta to exactly control what "recent past" means for this particular DAG. Setting start_date to None is a much more accurate and clear indication of author intent that days_ago(x).

I hope that if this is widely adopted, we can stop seeing users miss DAG runs or using dynamic start dates.


^ Add meaningful description above
Read the Pull Request Guidelines for more information.
In case of fundamental code changes, an Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in a newsfragment file, named {pr_number}.significant.rst or {issue_number}.significant.rst, in newsfragments.

@collinmcnulty
Copy link
Contributor Author

The "best practice" for having the first run begin at the most recent run is to set the start_date far in the past and catchup=False, but I think having a meaningless date passed to start_date is not a good thing for Airflow to need and doesn't match author intent.

@collinmcnulty
Copy link
Contributor Author

I found a logical bug. Setting to False creates a race condition to actually schedule the first run. We'll need some minimum buffer time where the next run time does not update even when run_immediately=False

@collinmcnulty
Copy link
Contributor Author

@uranusjr We spoke about what CronDataInterval's behavior is when start_date is None, but I found that the DAG object actually restricts you from setting start date to None. This, I believe, is no longer necessary, and actually the CronDataIntervalTimetable's behavior as written seems sensible, it's just not accessible. I added a commit to remove that restriction on start_date

@shahar1 shahar1 added airflow3.0:candidate Potential candidates for Airflow 3.0 area:core labels Aug 24, 2024
@uranusjr
Copy link
Member

Needs to resolve conflicts but otherwise looks good to me.

@github-actions
Copy link

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed in 5 days if no further activity occurs. Thank you for your contributions.

@github-actions github-actions bot added the stale Stale PRs per the .github/workflows/stale.yml policy file label Oct 15, 2024
@collinmcnulty
Copy link
Contributor Author

@uranusjr I don't think this test failure is related to the change. I merged in the changes from main and think this new version should be ready to merge.

@github-actions github-actions bot removed the stale Stale PRs per the .github/workflows/stale.yml policy file label Oct 22, 2024
@potiuk
Copy link
Member

potiuk commented Oct 24, 2024

You need to rebase - there are conflicts. That should also address the issue with unrelated test failing @collinmcnulty

@potiuk
Copy link
Member

potiuk commented Oct 24, 2024

Smth is wrong. There are a LOT of changes

@collinmcnulty
Copy link
Contributor Author

I believe I fixed it. Someone else had made the changes to the dag model to allow None in the meantime while this PR was languishing, so now its only about CronTriggerTimetable itself.

@potiuk potiuk requested a review from uranusjr October 25, 2024 20:38
@potiuk potiuk merged commit 94af07f into apache:main Oct 25, 2024
ellisms pushed a commit to ellisms/airflow that referenced this pull request Nov 13, 2024
 --------- Co-authored-by: Tzu-ping Chung <uranusjr@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

airflow3.0:candidate Potential candidates for Airflow 3.0 area:core

4 participants