-
- Notifications
You must be signed in to change notification settings - Fork 613
Add support for external licenses in scans #2979
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for external licenses in scans #2979
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@KevinJi22 I think this is looking good so far! I left some notes about docstrings and style.
fda3b2d to 93cc339 Compare | There is something in your changes that make the tests fail... |
93cc339 to b1cf4aa Compare | @pombredanne I'm actually combining all the licenses and rules in the input directories into a single index here. In load_licenses_from_multiple_dirs, all the licenses (including the ones in the ScanCode database) are combined into a single license database, and the same is done with all the rules. The _CACHED_DIRECTORIES variable is used to determine whether a new index needs to be built. For example, if we've created an index with additional directories A and B, but now we run a scan with additional directories B and C, we'd need to recreate the combined index. I'll update this code with comments so that this is more clear. |
93e90a4 to ec988c5 Compare There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've added a few nitpicks about formatting. I have an issue trying out the new -dir option. I am running scancode -l -dir tests/licensedcode/data/example_external_licenses/example1 --json-pp ~/Desktop/out.json /home/jono/src/scancode-toolkit-kevinji22/tests/licensedcode/data/plugin_license/external_licenses/scan/, however I am not seeing the custom license being detected in the scan results.
7696b97 to c5afa46 Compare | Thanks for the feedback! This bug was actually due to how This means that running the scan with the same parameters should work this time. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good! After talking to @pombredanne, the next thing we want on top is to be able to install custom licenses using wheels. Something similar to the way we handle providing binaries to scancode via plugins: https://github.com/nexB/scancode-plugins/tree/main/builtins
We also merged the fix for the PyPI test in develop, so be sure to rebase your branch.
65dbef5 to 2243f51 Compare | @KevinJi22 Don't feel compelled to amend your commit and force push it every time you make a change. You can create as many commits as you want on this PR. 🙂 |
60d0c6a to 0db7bab Compare bcede69 to f3839c3 Compare f9e82e9 to 392bc01 Compare Signed-off-by: Ayan Sinha Mahapatra <ayansmahapatra@gmail.com>
| @KevinJi22 we (@pombredanne and me) added the following changes: CLI Options:
Tests:
Misc:
Remaining work:
Otherwise tests are all green and this is ready AFAIK. |
392bc01 to 6412039 Compare Signed-off-by: Ayan Sinha Mahapatra <ayansmahapatra@gmail.com>
Signed-off-by: Ayan Sinha Mahapatra <ayansmahapatra@gmail.com>
1506621 to 044f60d Compare
Modified the code to add Ready to review @pombredanne! Was also wondering if there could be a |
Sure, but remember that each new added option may be a sign of our failure to automatically do the right thing without options :) |
Signed-off-by: Ayan Sinha Mahapatra <ayansmahapatra@gmail.com>
65bb925 to 1ee3b9d Compare There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here some extra feedback. I think we should refactor this a bit more such that indexing licenses and rules just takes a list of directories. Everything else should IMHO just use the things already loaded. And we can simplify the detection of builtins based on a list of builtin licenses... This would help keep the code tidier IMHO. @AyanSinhaMahapatra @KevinJi22 let's chat at your convenience
Signed-off-by: Ayan Sinha Mahapatra <ayansmahapatra@gmail.com>
1ee3b9d to cf417c4 Compare Refactor according to the feedback. Signed-off-by: Ayan Sinha Mahapatra <ayansmahapatra@gmail.com>
cf417c4 to 54fb102 Compare
I created #3137 as a follow up! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All good to go. Merging
Signed-off-by: Philippe Ombredanne <pombredanne@nexb.com>
2c2ba92 to f2b1e13 Compare
This is for the 2022 Google Summer of Code. This PR contains all the work that needs to be done for the project.
This adds
--additional-license-directoryas a command line option in license reindexing. This allows users to specify paths to directories of licenses and rules they'd like to use during license detection, but would not like to add to the ScanCode database of licenses. First, the user must run Scancode with the--reindex-licensesoption and they can use--additional-license-directoryto add directories of licenses or rules to the index.This involves adding a new option in
licensedcode/plugin_license.py, and this option is used as a parameter inscancode/api.py. In this approach, the licenses and rules contained in these additional directories are combined with the existing licenses and rules in the ScanCode database to produce a single index. The code for this is found inlicensedcode/cache.pyand the helper methods for loading these licenses and rules are found inlicensedcode/models.py.This commit also includes unit tests to verify that license detection succeeds with additional directories and installed licenses/rules. Part of the setup for the unit test and future tests involves creating a new directory in
tests/licensedcode/datathat contains sample external licenses used in the unit tests.Fixes #480
Tasks
Run tests locally to check for errors.