Skip to content

Conversation

@michaelpj
Copy link
Collaborator

This uses a action for retrying steps, which is a bit neater, and lets
us more clearly specify what's going on, as well as controlling the
number of retries independently.

I think it's better to specifically use this on the test suites that we
believe to be flaky, rather than adding a lot of noise by doing this on
every test invocation.

Copy link
Collaborator

@fendor fendor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, sounds great!

This uses a action for retrying steps, which is a bit neater, and lets us more clearly specify what's going on, as well as controlling the number of retries independently. I think it's better to specifically use this on the test suites that we believe to be flaky, rather than adding a lot of noise by doing this on every test invocation.
@michaelpj michaelpj marked this pull request as ready for review December 19, 2022 19:36
Copy link
Collaborator

@fendor fendor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@michaelpj michaelpj enabled auto-merge (squash) December 20, 2022 11:15
@wz1000
Copy link
Collaborator

wz1000 commented Dec 21, 2022

Let's fix flaky tests properly. I propose we invert this patch to retry all testsuites thrice unconditionally and fail if any run fails. Then we will have a list of most flaky tests.

I've started a patch to fix flaky testsuite items in ghcide.

@michaelpj
Copy link
Collaborator Author

Let's fix flaky tests properly. I propose we invert this patch to retry all testsuites thrice unconditionally and fail if any run fails.

I'm definitely in favour of fixing flaky tests, I just want our CI to be passing for most people instead of requiring constant restarting as it does right now. Then we can make a ticket for each flaky job to work on it. Having things fail a lot is an incentive to fix the tests for sure, but it also just slows everything down tons.

@michaelpj michaelpj disabled auto-merge December 21, 2022 14:55
@michaelpj
Copy link
Collaborator Author

That said if you think you can actually fix the flakiness soon then go for it!

@andys8
Copy link
Collaborator

andys8 commented Dec 21, 2022

The retry action also seems to add a warning to each run with retries by default (warning_on_retry). So flaky tests can be identified by going through past runs (and not failing pull requests that didn't cause them). Haven't seen this action before, but looks to me like a nice improvement over the current state.

https://github.com/nick-fields/retry#warning_on_retry

@michaelpj
Copy link
Collaborator Author

@wz1000 did you get most of the flaky tests? I'd be happy to just remove the retrying on everything if you think it's not needed now.

@wz1000
Copy link
Collaborator

wz1000 commented Jan 10, 2023

No, #3423 seems like an issue that results in quite some flakiness throughout the testsuite but it will require a bit of work to resolve.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

5 participants