@@ -36,13 +36,7 @@ You need to specify the array library to test. It can be specified via the
3636$ export ARRAY_API_TESTS_MODULE=numpy.array_api
3737```
3838
39- Alternately, change the ` array_module ` variable in ` array_api_tests/_array_module.py `
40- line, e.g.
41-
42- ``` diff
43- - array_module = None
44- + import numpy.array_api as array_module
45- ```
39+ Alternatively, import or define the `xp` variable in `array_api_tests/__init__.py`.
4640
4741### Run the suite
4842
@@ -160,6 +154,13 @@ library to fail.
160154
161155### Configuration
162156
157+ #### API version
158+
159+ You can specify the API version to use when testing via the
160+ `ARRAY_API_TESTS_VERSION` environment variable. Currently this defaults to the
161+ array module's `__array_api_version__` value, and if that attribute doesn't
162+ exist then we fall back to `"2021.12"`.
163+
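The lookup order described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `resolve_api_version` and `DEFAULT_API_VERSION` are hypothetical names, not the test suite's actual code, and the environment is passed in as a plain mapping so the example is self-contained.

```python
import os
from types import SimpleNamespace

# Fallback value named in the text above.
DEFAULT_API_VERSION = "2021.12"


def resolve_api_version(xp, environ=os.environ):
    """Hypothetical helper mirroring the documented lookup order."""
    # 1. An explicit ARRAY_API_TESTS_VERSION setting takes priority.
    version = environ.get("ARRAY_API_TESTS_VERSION")
    if version:
        return version
    # 2. Otherwise use the module's __array_api_version__ attribute,
    # 3. and fall back to the default when that attribute is missing.
    return getattr(xp, "__array_api_version__", DEFAULT_API_VERSION)


# Stand-ins for array modules (assumptions for illustration only).
module_with_version = SimpleNamespace(__array_api_version__="2022.12")
module_without_version = SimpleNamespace()

print(resolve_api_version(module_with_version, {}))
print(resolve_api_version(module_without_version, {}))
print(resolve_api_version(module_with_version, {"ARRAY_API_TESTS_VERSION": "2021.12"}))
```

Passing the environment as an argument keeps the sketch easy to exercise without modifying the real process environment.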
163164#### CI flag
164165
165166Use the `--ci` flag to run only the primary and special cases tests. You can
@@ -178,13 +179,22 @@ By default, tests for the optional Array API extensions such as
178179will be skipped if not present in the specified array module. You can purposely
179180skip testing extension(s) via the `--disable-extension` option.
180181
181- #### Skip test cases
182+ #### Skip or XFAIL test cases
183+
184+ Test cases you want to skip or XFAIL can be specified in a skips or xfails
185+ file. The difference between skip and XFAIL is that skipped tests are never
186+ run, whereas XFAIL tests are still run and reported as XPASS if they pass.
187+
188+ By default, the skips and xfails files are `skips.txt` and `xfails.txt` in the root
189+ of this repository, but any file can be specified with the `--skips-file` and
190+ `--xfails-file` command line flags.
182191
183- Test cases you want to skip can be specified in a ` skips.txt ` file in the root
184- of this repository, e.g.:
192+ The files should list the test ids to be skipped/xfailed. Empty lines and
193+ lines starting with `#` are ignored. Each entry can be any substring of the
194+ test ids to skip/xfail.
185195
186196```
187- # ./ skips.txt
197+ # skips.txt or xfails.txt
188198# Line comments can be denoted with the hash symbol (#)
189199
190200# Skip specific test case, e.g. when argsort() does not respect relative order
@@ -200,39 +210,81 @@ array_api_tests/test_add[__iadd__(x, s)]
200210array_api_tests/test_set_functions.py
201211```
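To make the file format concrete, here is a minimal Python sketch of the rules stated above: blank lines and `#` comments are dropped, and an entry selects every test whose id contains it as a substring. The helper names `load_test_ids` and `is_selected` are hypothetical and are not taken from the test suite's implementation.

```python
def load_test_ids(text):
    """Return the entries of a skips/xfails file, ignoring blanks and comments."""
    ids = []
    for line in text.splitlines():
        line = line.strip()
        # Empty lines and lines starting with '#' are ignored.
        if line and not line.startswith("#"):
            ids.append(line)
    return ids


def is_selected(test_id, entries):
    """True if any entry is a substring of the given test id."""
    return any(entry in test_id for entry in entries)


example_file = """\
# Line comments can be denoted with the hash symbol (#)

array_api_tests/test_add[__iadd__(x, s)]
array_api_tests/test_set_functions.py
"""

entries = load_test_ids(example_file)
print(entries)
# A whole-file entry selects every test id that contains that file path.
print(is_selected("array_api_tests/test_set_functions.py::some_test", entries))
```

Because matching is by substring, a file path entry covers every test in that file, while a fully parametrized id narrows the selection to a single case.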
202212
203- For GitHub Actions, you might like to keep everything in the workflow config
204- instead of having a seperate ` skips.txt ` file, e.g.:
213+ Here is an example GitHub Actions workflow file, where the xfails are stored
214+ in `array-api-tests-xfails.txt` in the base of the `your-array-library` repo.
215+
216+ If you want, you can use `-o xfail_strict=True`, which causes XPASS tests (XFAIL
217+ tests that actually pass) to fail the test suite. However, be aware that
218+ XFAILures can be flaky (see below), so this may not be a good idea unless you
219+ use some other mitigation of such flakiness.
220+
221+ If you don't want this behavior, you can remove it, or use `--skips-file`
222+ instead of `--xfails-file`.
205223
206224``` yaml
207225# ./.github/workflows/array_api.yml
208-     ...
209-     ...
210-     - name: Run the test suite
226+ jobs:
227+   tests:
228+     runs-on: ubuntu-latest
229+     strategy:
230+       matrix:
231+         python-version: ['3.8', '3.9', '3.10', '3.11']
232+
233+     steps:
234+     - name: Checkout <your array library>
235+       uses: actions/checkout@v3
236+       with:
237+         path: your-array-library
238+
239+     - name: Checkout array-api-tests
240+       uses: actions/checkout@v3
241+       with:
242+         repository: data-apis/array-api-tests
243+         submodules: 'true'
244+         path: array-api-tests
245+
246+     - name: Run the array API test suite
211247      env:
212248        ARRAY_API_TESTS_MODULE: your.array.api.namespace
213249      run: |
214-         # Skip test cases with known issues
215-         cat << EOF >> skips.txt
216-
217-         # Comments can still work here
218-         array_api_tests/test_sorting_functions.py::test_argsort
219-         array_api_tests/test_add[__iadd__(x1, x2)]
220-         array_api_tests/test_add[__iadd__(x, s)]
221-         array_api_tests/test_set_functions.py
222-
223-         EOF
224-
225-         pytest -v -rxXfE --ci
250+         export PYTHONPATH="${GITHUB_WORKSPACE}/your-array-library"
251+         cd ${GITHUB_WORKSPACE}/array-api-tests
252+         pytest -v -rxXfE --ci --xfails-file ${GITHUB_WORKSPACE}/your-array-library/array-api-tests-xfails.txt array_api_tests/
226253```
227254
255+ > **Warning**
256+ >
257+ > XFAIL tests that use Hypothesis (basically every test in the test suite except
258+ > those in test_has_names.py) can be flaky, due to the fact that Hypothesis
259+ > might not always run the test with an input that causes the test to fail.
260+ > There are several ways to avoid this problem:
261+ >
262+ > - Increase the maximum number of examples, e.g., by adding `--max-examples
263+ > 200` to the test command (the default is `100`, see below). This will
264+ > make it more likely that the failing case will be found, but it will also
265+ > make the tests take longer to run.
266+ > - Don't use `-o xfail_strict=True`. This will make it so that if an XFAIL
267+ > test passes, it will alert you in the test summary but will not cause the
268+ > test run to register as failed.
269+ > - Use skips instead of XFAILS. The difference between XFAIL and skip is that
270+ > a skipped test is never run at all, whereas an XFAIL test is always run
271+ > but ignored if it fails.
272+ > - Save the [Hypothesis examples
273+ > database](https://hypothesis.readthedocs.io/en/latest/database.html)
274+ > persistently on CI. That way as soon as a run finds one failing example,
275+ > it will always re-run future runs with that example. But note that the
276+ > Hypothesis examples database may be cleared when a new version of
277+ > Hypothesis or the test suite is released.
278+
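As a rough back-of-the-envelope illustration of why more examples reduce this flakiness, the sketch below assumes each generated example independently triggers the failure with some fixed probability `p`. That independence is a simplifying assumption made only for this arithmetic (Hypothesis does not draw inputs as independent uniform samples), and `chance_of_hitting_failure` is a hypothetical helper, not part of the test suite or of Hypothesis.

```python
def chance_of_hitting_failure(p, max_examples):
    """Probability that at least one of `max_examples` independent draws fails.

    Assumes every example fails independently with probability `p`,
    which is a simplification of how inputs are really generated.
    """
    return 1 - (1 - p) ** max_examples


# With a failure rate of 1% per example (an illustrative value),
# doubling the number of examples noticeably raises the chance that
# an XFAIL test actually fails as expected.
for n in (100, 200, 1000):
    print(n, round(chance_of_hitting_failure(0.01, n), 3))
```

Under this simplified model, a bug hit by 1% of inputs is missed in roughly a third of runs at 100 examples, which is why an XFAIL test can occasionally report XPASS.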
228279#### Max examples
229280
230281The tests make heavy use of
231282[Hypothesis](https://hypothesis.readthedocs.io/en/latest/). You can configure
232- how many examples are generated using the ` --max-examples` flag, which defaults
233- to 100. Lower values can be useful for quick checks, and larger values should
234- result in more rigorous runs. For example, `--max-examples 10_000` may find bugs
235- where default runs don't but will take much longer to run.
283+ how many examples are generated using the `--max-examples` flag, which
284+ defaults to `100`. Lower values can be useful for quick checks, and larger
285+ values should result in more rigorous runs. For example, `--max-examples
286+ 10_000` may find bugs where default runs don't but will take much longer to
287+ run.
236288
237289
238290## Contributing