[RFC] Enabling the Lit Internal Shell by Default

jh7370 · July 30, 2024, 2:37pm

The “temp” filenames lit uses should be driven by the %t substitution, which means it’s a path name derived from the test. There shouldn’t be any chance of a name clash unless two instances of the same test are being run at the same time.

Harini0924 · August 1, 2024, 10:57pm

Currently, there are 3 tests that use the $() syntax:
HWAddressSanitizer-x86_64 :: TestCases/hwasan_symbolize.cpp
Clang :: Driver/offload-packager.c
MemorySanitizer-X86_64 :: allocator_mapping.cpp
Right now, lit’s parser is unable to understand the $() syntax in any of these, so tests aren’t running as normal.

MaskRay:

The proposal appears to combine two goals: “changing the default for non-Windows platforms” and “implementing additional features.”

I’ve noticed recent patches from @jayfold to enhance the internal lit shell for non-Windows platforms. Could you provide an estimate of the remaining issues or any areas where you might need assistance from the community?

Perhaps the primary problem is the 100+ tests that do REUIQRES: shell. We could migrate them or implement a temporary option in lit that uses the external shell for these specific tests on non-Windows platforms.
% rg 'REQUIRES: .*shell' -l | wc -l 154 
The “Implementation Plan” in this proposal outlines features that I believe aren’t crucial for changing the default.
While supporting some features might improve the developer experience, I’d like to see concrete examples.

Our current plan is to address the failures that we are seeing with the internal shell with the existing set of tests so that it can be used without breakages. We believe that implementing these new features is a part of changing the default - though there are cases like |& that we agree can just be rewritten.

We’ve already made changes to the llvm subproject, and the internal shell is almost ready to be switched on for that subproject.

Here are the links to our patches we have up.

jh7370 · August 19, 2024, 8:58am

The thread has gone a bit stale. From what I’ve read, everybody agrees that the goal of “switching to internal shell by default everywhere” is a worthwhile one, but there doesn’t seem to be much appetite for adding new functionality, except where it is absolutely essential. I think it would be worthwhile raising each new piece of functionality that you want to implement as a separate RFC, so it can be judged on its individual merits. In each of these RFCs, I’d expect to see a count of the number of tests that need it and what would be involved in porting those tests to use existing lit etc functionality. If the impacted test count is small, it would be best to explicitly list the tests. If you’ve got a PR exhibiting the functionality to back the RFC up, that’s fine, but be prepared to close the PR, if the consensus is that the tests should be modified instead.

perry-ca · August 19, 2024, 7:03pm

We have been trying this out on z/OS and have a sizable number of failures to investigate. So far we have identified two main problems.

The issue with sub-shells that has been talked about. We see lit tests with the use of $() as well as the back quotes (``). I think these need to be supported before enabling internal shell or at least an automatic way to fall back to external shell if syntax not supported is found.
Handling automatic code page conversions. On z/OS, the OS will do auto conversion if a process is reading an EBCDIC file from an ASCII application and vice versa. This doesn’t seem to work with the internal shell. We are still investigating so no details yet but tests like LLVM :: tools/split-file/basic.test fail with the internal shell.

We have also been seeing that the internal shell is 50% slower than the external shell. This seems counter intuitive but this is what we are seeing. Still investigating this too.

rnk · August 20, 2024, 9:51pm

I advocate that we should keep the lit shell dialect extremely minimal.

The lit shell dialect is shell-like, but we should not add support for subshells (i.e. $() and backticks), or other features, that encourage extensive processing and computation in the shell. We don’t want to encourage test writers to push substantial amounts of data through the test shell as Python strings. We really just want to hook up the pipelines and create a testing environment where C++ processes talk to each other through native OS files and pipes. We also want to say “no” to the never-ending project of reimplementing all of bash in Python, we have to limit scope somewhere, and it should be really close to where we are today, even if we have to make a few more accommodations for the sake of practicality.

My position is that using advanced bash features in most tests is just not good taste.

Harini0924 · August 21, 2024, 12:22am

jh7370:

The thread has gone a bit stale. From what I’ve read, everybody agrees that the goal of “switching to internal shell by default everywhere” is a worthwhile one, but there doesn’t seem to be much appetite for adding new functionality, except where it is absolutely essential. I think it would be worthwhile raising each new piece of functionality that you want to implement as a separate RFC, so it can be judged on its individual merits. In each of these RFCs, I’d expect to see a count of the number of tests that need it and what would be involved in porting those tests to use existing lit etc functionality. If the impacted test count is small, it would be best to explicitly list the tests. If you’ve got a PR exhibiting the functionality to back the RFC up, that’s fine, but be prepared to close the PR, if the consensus is that the tests should be modified instead.

rnk:

I advocate that we should keep the lit shell dialect extremely minimal.

The lit shell dialect is shell-like, but we should not add support for subshells (i.e. $() and backticks), or other features, that encourage extensive processing and computation in the shell. We don’t want to encourage test writers to push substantial amounts of data through the test shell as Python strings. We really just want to hook up the pipelines and create a testing environment where C++ processes talk to each other through native OS files and pipes. We also want to say “no” to the never-ending project of reimplementing all of bash in Python, we have to limit scope somewhere, and it should be really close to where we are today, even if we have to make a few more accommodations for the sake of practicality.

My position is that using advanced bash features in most tests is just not good taste.

After evaluating the features we were planning on implementing (brackets, command substitution $(), ulimit, unset, for loops), we agree that the tests that rely on these features can just be rewritten without needing to add new functionality. We will be closing the PRs opened previously to implement these features, and will focus on rewriting the tests instead. Thank you @jh7370 and @rnk for the feedback on this matter.

pogo59 · August 21, 2024, 1:56pm

+1. In the really inescapable cases, the test can be rewritten as a Python script directly. There are some tests done this way already.

Topic		Replies	Views
RFC: A cross platform way of using shell commands in lit tests LLVM Dev List Archives	19	159	August 18, 2016
RFC: Improving lit's debug output LLVM Project	57	2364	September 20, 2023
RFC: Dropping DejaGNU LLVM Dev List Archives	13	114	October 20, 2009
lit improvement LLVM Dev List Archives	3	158	August 6, 2015
Is there room for another build system? LLVM Dev List Archives	19	116	August 5, 2008

[RFC] Enabling the Lit Internal Shell by Default

Related topics