feat: add check output to database and implement souffle policy engine #46
e1274f1 to 99f4e58 Compare
0a6c6d8 to 09a2f2d Compare
src/macaron/__main__.py Outdated
| """Verify a provenance against a user defined policy.""" | ||
| prov_file = verify_args.provenance | ||
| policy_file = verify_args.policy | ||
| policy_files = list(filter(lambda path: ".yaml" == os.path.splitext(path)[1], global_config.policy_paths)) |
Extension can be .yml too.
True. I also added a todo under the policy class. I think eventually it would be better to have the Policy class define which file types it can be constructed from (e.g. with a method that is passed the filename), so the configuration can just ask each policy class and pick the first one that says yes. Currently, though, cue and YAML policies both use the same class, and souffle policies are another unrelated class since they need different interfaces, so the design could still be rationalized better.
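The idea in this comment, letting each policy class say which file types it can be constructed from, could look roughly like the sketch below. The class names `YamlCuePolicy` and `SoufflePolicy`, the `can_load` method, and the `.cue` and `.dl` extensions are hypothetical and only illustrate the dispatch; they are not Macaron's actual API.

```python
import os
from typing import Optional


class YamlCuePolicy:
    """Hypothetical stand-in for the class shared by YAML and cue policies."""

    SUFFIXES = {".yaml", ".yml", ".cue"}

    @classmethod
    def can_load(cls, path: str) -> bool:
        # Compare the lower-cased extension so both .yaml and .yml are accepted.
        return os.path.splitext(path)[1].lower() in cls.SUFFIXES


class SoufflePolicy:
    """Hypothetical stand-in for the separate souffle policy class."""

    SUFFIXES = {".dl"}

    @classmethod
    def can_load(cls, path: str) -> bool:
        return os.path.splitext(path)[1].lower() in cls.SUFFIXES


# Order matters: the first class that accepts the file wins.
POLICY_CLASSES = [YamlCuePolicy, SoufflePolicy]


def pick_policy_class(path: str) -> Optional[type]:
    """Return the first policy class that accepts the file, or None."""
    for policy_class in POLICY_CLASSES:
        if policy_class.can_load(path):
            return policy_class
    return None
```

With this shape, the `.yaml`-only filter in `__main__.py` would become a call to `pick_policy_class` per configured path, and supporting `.yml` is a one-line change in the class that owns the format.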
| @@ -1,4 +1,4 @@ | |||
| # Copyright (c) 2022 - 2022, Oracle and/or its affiliates. All rights reserved. | |||
| # Copyright (c) 2022 - 2023, Oracle and/or its affiliates. All rights reserved. | |||
Some of these files seem to have copyright header updates only. Can you please unstage?
These should have been removed in the recent rebase.
| /** | ||
| * The build is verifiably automated and . | ||
| */ |
| /** | |
| * The build is verifiably automated and . | |
| */ | |
| /** | |
| * The build is verifiably automated and deployable. | |
| */ |
Fixed in 7584d19
69bb689 to 7584d19 Compare
| .decl json_path(j: JsonType, a: JsonType, key:symbol) | ||
| | ||
| json_path(a, b, key) :- a = $Object(k, b), json(name,_,a), key=cat(name, cat(".", k)). |
Just a small note on the inconsistent spacing of the attributes in the relations. Perhaps we could create a ticket for resolving it later. Btw, have you encountered any linter for souffle 🤔 ?
> Just a small note on the inconsistent spacing of the attributes in the relations. Perhaps we could create a ticket for resolving it later. Btw, have you encountered any linter for souffle 🤔 ?
I have come across this repo to lint Souffle Datalog: https://github.com/langston-barrett/souffle-lint
It could be something to add as a pre-commit hook locally, but I haven't tried it out yet.
| self.db_man = DatabaseManager(db_path) | ||
| """Set up the database and ensure it is empty.""" | ||
| self.db_path = str(Path(__file__).parent.joinpath("macaron.db")) | ||
| print(self.db_path) |
One more print left here 😅
Signed-off-by: behnazh-w <behnaz.hassanshahi@oracle.com>
| Thank you for your pull request and welcome to our community! To contribute, please sign the Oracle Contributor Agreement (OCA).
To sign the OCA, please create an Oracle account and sign the OCA in Oracle's Contributor Agreement Application. When signing the OCA, please provide your GitHub username. After signing the OCA and getting an OCA approval from Oracle, this PR will be automatically updated. If you are an Oracle employee, please make sure that you are a member of the main Oracle GitHub organization, and your membership in this organization is public. |
#46) Signed-off-by: Alistair Michael <alistair.michael@oracle.com> Signed-off-by: behnazh-w <behnaz.hassanshahi@oracle.com>
Preparing Macaron for a policy engine that uses the Souffle datalog interpreter.
Architecture
The goal is to use souffle to evaluate the policy, while loading the facts directly from the sqlite database.
For this to work, the following requirements from the souffle docs must be met:
To input a view, there needs to be a relation declared that matches the view, and a corresponding input statement.
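As a rough illustration of that requirement, the sketch below derives a souffle declaration and input directive from the schema of an SQLite table or view. The function name, the type mapping, and the `repository` table are hypothetical, and the `IO=sqlite, dbname=...` parameters are written from memory of the souffle I/O documentation, so they should be checked against the souffle version in use. This is not the code in `souffle_code_generator.py`.

```python
import sqlite3

# Assumed mapping from SQLite declared types to souffle primitive types;
# anything not listed here falls back to "symbol".
SQLITE_TO_SOUFFLE = {"INTEGER": "number", "REAL": "float"}


def souffle_import_for(conn: sqlite3.Connection, name: str, db_path: str) -> str:
    """Generate a souffle declaration and input directive for one table or view."""
    # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk) rows.
    columns = conn.execute(f"PRAGMA table_info('{name}')").fetchall()
    if not columns:
        raise ValueError(f"no such table or view: {name}")
    attrs = ", ".join(
        f"{col[1]}: {SQLITE_TO_SOUFFLE.get(col[2].upper(), 'symbol')}"
        for col in columns
    )
    # The input parameters below are an assumption, not verified against souffle.
    return f'.decl {name}({attrs})\n.input {name}(IO=sqlite, dbname="{db_path}")\n'


# Usage with a throwaway in-memory database and a hypothetical table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE repository (id INTEGER PRIMARY KEY, full_name TEXT)")
generated = souffle_import_for(conn, "repository", "macaron.db")
print(generated)
```

Reading the schema from the database rather than hard-coding it keeps the generated declarations in step with whatever tables the checks define.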
So broadly, the way this PR works is:
- `DatabaseManager.create_tables()` creates the database, tables, and views if they don’t exist.
- Checks insert their result tables into `CheckResult["result-tables"]`.
- `Analyzer` stores these to the database after analysis is completed, along with the information Macaron stores, such as the analyzed repositories and dependency tree.
- The policy engine is invoked from a separate script, which is passed the database file and a policy file.
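The last step, invoking souffle from a separate script, might be wrapped along these lines. This is a minimal sketch, not the contents of `policy_engine/souffle.py`: the function names are hypothetical, `program_text` stands for whatever datalog the engine assembles, and it assumes a `souffle` executable on the `PATH` whose `-D` flag sets the output directory.

```python
import subprocess
import tempfile
from pathlib import Path


def souffle_command(program: Path, output_dir: Path, executable: str = "souffle") -> list[str]:
    """Build the command line; -D tells souffle where to write output relations."""
    return [executable, "-D", str(output_dir), str(program)]


def run_policy(program_text: str, executable: str = "souffle") -> dict[str, str]:
    """Run a datalog program in a temporary directory and return its output files by name."""
    with tempfile.TemporaryDirectory() as tmp:
        tmp_path = Path(tmp)
        program = tmp_path / "policy.dl"
        program.write_text(program_text)
        output_dir = tmp_path / "out"
        output_dir.mkdir()
        # check=True raises CalledProcessError if souffle exits with a non-zero status.
        subprocess.run(
            souffle_command(program, output_dir, executable),
            check=True,
            capture_output=True,
        )
        # Read the results before the temporary directory is cleaned up.
        return {path.name: path.read_text() for path in output_dir.iterdir()}
```

Running in a temporary directory keeps souffle's output files out of the working tree, and returning their contents as a dictionary means nothing depends on the directory after it is removed.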
At this stage what this does is:
- a corresponding souffle declaration and import is generated
Changes Summary
- Import SQLAlchemy to manage the database connection
- Refactor `DatabaseManager` to use SQLAlchemy (API change)
- `AnalyzeContext` now returns ORM-mapped tables to be inserted into the database, rather than constructing SQL queries
- `CheckResult` has a new field `"result_tables: list[Table]"`
- `Analyzer` now populates tables to store the analysis, dependency, and slsa-level results, and check_results
- `AnalyzeContext` now stores an ORM-mapped table to represent the repository being analyzed, which is stored to the database by the `Analyzer` object before analysis starts
- `Analyzer` stores all tables which checks insert into `CheckResult["result_tables"]` to the database after analysis
- `base_check.py` defines a table to store check results
- `base_check.py` defines an SQLAlchemy declarative mixin `CheckFactsTable`, which defines check_result id and repository id foreign key fields; when result tables inherit from it, the analyzer populates these fields automatically.
- `provenance_l3_check` is stricter as per pull/29.
- add: `policy_engine/__main__.py` is the entry point for the policy engine
- add: `policy_engine/souffle_code_generator.py` contains the logic for generating the souffle datalog for data import
- add: `policy_engine/souffle.py` contains the wrapper for invoking souffle in a temporary directory
- `policy_engine/policy.py` has some changes due to a manually reverted refactor; it will likely have to be refactored again to integrate the policy engine
To do