[Github][Bazel] Add Workflow to Run Bazel Build #165071
base: main
Conversation
This patch adds a job to the bazel checks workflow to run the bazel build/test. This patch only tests a couple of projects just to get things going. The plan is to expand to more projects eventually and set up a GCS bucket for caching so jobs complete quickly by using cached artifacts. This should add minimal load to the CI given the low frequency of bazel PRs, especially once we enable GCS-based caching, due to bazel's effective use of caching. Google is also sponsoring the Linux Premerge CI and is interested in having premerge bazel builds, which is why it makes sense to do premerge testing of this alternative build system using those resources.
@llvm/pr-subscribers-github-workflow Author: Aiden Grossman (boomanaiden154). Full diff: https://github.com/llvm/llvm-project/pull/165071.diff (1 file affected):
```diff
diff --git a/.github/workflows/bazel-checks.yml b/.github/workflows/bazel-checks.yml
index 65d51649dd9e7..5cc0342015cf6 100644
--- a/.github/workflows/bazel-checks.yml
+++ b/.github/workflows/bazel-checks.yml
@@ -30,3 +30,28 @@ jobs:
       - name: Run Buildifier
         run: |
           buildifier --mode=check $(find ./utils/bazel -name *BUILD*)
+
+  bazel-build:
+    name: "Bazel Build/Test"
+    runs-on: llvm-premerge-linux-runners
+    if: github.repository == 'llvm/llvm-project'
+    steps:
+      - name: Fetch LLVM sources
+        uses: actions/checkout@08c6903cd8c0fde910a37f88322edcfb5dd907a8 # v5.0.0
+      # TODO(boomanaiden154): We should use a purpose built container for this. Move
+      # over when we have fixed the issues with using custom containers with Github
+      # ARC in GKE.
+      - name: Setup System Dependencies
+        run: |
+          sudo apt-get update
+          sudo apt-get install -y libmpfr-dev libpfm4-dev
+          sudo curl -L https://github.com/bazelbuild/bazelisk/releases/download/v1.27.0/bazelisk-amd64.deb > /tmp/bazelisk.deb
+          sudo apt-get install -y /tmp/bazelisk.deb
+          rm /tmp/bazelisk.deb
+      - name: Build/Test
+        working-directory: utils/bazel
+        run: |
+          bazel test --config=ci --sandbox_base="" \
+            @llvm-project//llvm/... \
+            @llvm-project//clang/... \
+            @llvm-project//mlir/...
```
Only doing a couple of the core projects now to limit what we build before we set up caching. Building lldb is also failing on a missing …
Thanks! FYI @keith @aaronmondal as the other bazel owners.
One notable difference vs the current buildkite setup: this runs bazel test ... instead of bazel query ...| xargs bazel test. The query/xargs is a hack to make sure that targets tagged "manual" will still be built. It comes from the IREE build setup, but AFAIK we don't need it for LLVM -- skipping "manual" targets should be fine (and expected).
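For context, the difference between the two invocation styles can be sketched like this (illustrative shell only, assuming the `--config=ci` setup from this PR):

```shell
# IREE-style hack: enumerate targets explicitly via query, then feed them to
# bazel test. Because each target is named individually, targets tagged
# "manual" are included even though wildcard patterns would skip them.
bazel query '@llvm-project//llvm/...' | xargs bazel test --config=ci

# Simpler form used in this PR: wildcard patterns. Bazel expands these itself
# and skips targets tagged "manual", which is the expected behavior for LLVM.
bazel test --config=ci @llvm-project//llvm/...
```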
```yaml
      - name: Build/Test
        working-directory: utils/bazel
        run: |
          bazel test --config=ci --sandbox_base="" \
```
Should this be bazelisk instead of bazel?
Yes, bazelisk would be better. There is actually this action which might be nice to look into as it's fairly standardized and it seems to me like it could work more or less out of the box:
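The specific action link referenced here did not survive extraction. As an illustrative sketch only, using `bazelbuild/setup-bazelisk` as a stand-in (the actual action the comment linked may differ), the workflow step could look roughly like:

```yaml
      # Hypothetical sketch: install bazelisk via a setup action instead of
      # a manual .deb download. The action and version shown here are
      # stand-ins, not necessarily the one the comment referenced.
      - name: Setup Bazelisk
        uses: bazelbuild/setup-bazelisk@v3
      - name: Build/Test
        working-directory: utils/bazel
        run: bazelisk test --config=ci @llvm-project//llvm/...
```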
```yaml
            @llvm-project//llvm/... \
            @llvm-project//clang/... \
            @llvm-project//mlir/...
```
This patch only tests a couple projects just to get things going. The plan is to expand to more projects eventually and setup a GCS bucket for caching so jobs complete quickly by using cached artifacts.
llvm+clang+mlir is most of the targets already. If you want to get this running with a minimal config and only add other projects once we have caching, should we start smaller? e.g. just @llvm-project//llvm/unittests:adt_tests
I like that this reduces reliance on buildkite, though I'm fairly biased as I have strong interest in this as well.
@jaroeichler and I are generally looking to sponsor infra for bazel workflows in LLVM as well. We've just recently started building that out, so I can't give an exact timeframe for that (maybe around 1-3 months?). But regarding runner images, we might be able to contribute to this effort sooner. (Jaro, WDYT?)
```yaml
      # ARC in GKE.
      - name: Setup System Dependencies
        run: |
          sudo apt-get update
```
This seems risky. Unless this is pinned, it'll be very cache-inefficient, since apt installs are not reproducible by default. Even with pinning, I believe it would still not be possible to reuse the cache like this, since the installs contain timestamps. IIRC tools like the OSSF Scorecard would flag this as a "critical level" risk, as it essentially allows arbitrary tooling to end up on this runner, and AFAICS the llvm-premerge-linux-runners aren't pinned either.
Another thing is that this creates a fairly heavy dependence on Ubuntu and the particular configuration of the runner.
I guess if it's only temporary I don't have too much of an opinion against this, but I think it's worth mentioning that this really shouldn't stay like this forever.
Regarding a more long-term solution, my vote would be for a NixOS image, as that not only freezes deps properly but is also verifiable for external users, i.e. from a configuration file it is possible for third parties to reproduce the runner image bit-by-bit. This gives essentially perfect cache reuse and makes things comparatively easy to verify.
I believe I have such an image lying around somewhere. If there is interest, I can take another look or set up a new one for this use case.
For just the initial implementation, though it seems unintuitive, it might be an option to actually just remove the apt update call. I'm not sure whether this works with the default llvm-premerge runners, but if it does, it would at least pseudo-pin the apt repo to whatever version is on those runners, which might be a bit more stable than updating every time this workflow runs.
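For reference, the pinning the comment asks for would look something like the following sketch (the version strings here are placeholders, not real pins for these packages):

```shell
# Pin exact package versions so the install step is deterministic with
# respect to what apt fetches. Versions below are placeholders only;
# real pins would come from the runner image's snapshot of the apt repo.
sudo apt-get install -y --no-install-recommends \
  libmpfr-dev=4.1.0-3build3 \
  libpfm4-dev=4.11.1+git32-gd0b85fb-1ubuntu0.1
```

Note that even exact pins do not make the result bit-reproducible (per the timestamp issue mentioned above); they only stabilize which package versions land on the runner.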
```yaml
      - name: Build/Test
        working-directory: utils/bazel
        run: |
          bazel test --config=ci --sandbox_base="" \
```
> Yes, bazelisk would be better. There is actually this action which might be nice to look into as it's fairly standardized and it seems to me like it could work more or less out of the box:
Yes, will spend some time in the next few days to work on this.
We can't use custom runner images on the existing premerge infrastructure because we do not have a workaround for kubernetes-sigs/apiserver-network-proxy#748 yet. That shouldn't be a permanent state, but it is where we're at currently. I don't think this is feasible to do on the free GitHub Actions runners.
Ah sorry for the confusion! For Bazel specifically it's possible to use remote execution rather than running locally. I.e. the runner in this workflow just needs to construct the build graph, and can then send that off to one or multiple separate builder machines. This way it doesn't matter too much whether it's a free or a self-hosted runner, because the actual compilation happens elsewhere (e.g. in the same K8s cluster in another container, or somewhere else entirely). The way this works is roughly that the llvm-premerge runner (or a free runner, it doesn't matter too much) invokes the bazel build and then sets the relevant remote execution flags.

The tricky part here is that the entire build needs to be fully hermetic, so that the machine that constructs the compile commands (i.e. the free or the llvm-premerge runner) does so with binary-identical tools that match what the remote builders expect. Otherwise, even with "the same paths", it would lead to cache misses. Fortunately that's a fixable toolchain issue and mostly bazel-side configuration that can be layered on top without affecting other workflows/development flows.

The "image" I was talking about was the image for such remote builders, not for the runner that invokes this workflow.
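The remote execution setup described above is mostly a matter of Bazel configuration; a minimal `.bazelrc` sketch (the endpoint and instance name below are placeholders, not a real service):

```
# .bazelrc sketch: send build/test actions to a remote execution service
# instead of compiling on the runner itself. Endpoint is a placeholder.
build:remote --remote_executor=grpcs://remote.example.com:443
build:remote --remote_instance_name=projects/example/instances/default
# Avoid downloading all intermediate outputs back to the (small) runner.
build:remote --remote_download_minimal
# Allow many actions in flight, since they execute on remote workers.
build:remote --jobs=128
```

With such a config, the workflow would invoke `bazel test --config=remote ...` and the runner only needs enough resources to analyze the build graph.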