@ailrst commented Jan 25, 2023

Replaced by: PR 46

Prepares Macaron for a policy engine that uses the Souffle Datalog interpreter.

Architecture

The goal is to use Souffle to evaluate the policy while loading the facts directly from the SQLite database.

For this to work, the following requirements from the Souffle docs must be met:

The data is expected to be stored in a table matching the relation name prefixed by an underscore and the sqlite3 database is expected to contain a view matching the relation name. For example, for the relation edge, the sqlite3 database should have a table named _edge.

To load the data, the Souffle program must declare a relation that matches the view and include a corresponding input statement.
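The convention quoted above can be sketched as follows. This is a minimal illustration using Python's standard sqlite3 module, not code from this PR: the column names `src` and `dst`, the database path, and the exact `.input` parameters are assumptions and should be checked against the Souffle I/O documentation.

```python
import sqlite3

# Hypothetical relation and columns, following the `edge` example above.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE _edge (src INTEGER, dst INTEGER);   -- the data lives in _edge
    CREATE VIEW edge AS SELECT src, dst FROM _edge;  -- view named after the relation
    INSERT INTO _edge VALUES (1, 2), (2, 3);
    """
)

# Matching Souffle declaration and input directive, kept as a string here.
# The dbname value is a placeholder path, not a value taken from the PR.
SOUFFLE_IMPORT = """\
.decl edge(src: number, dst: number)
.input edge(IO=sqlite, dbname="output/macaron.db")
"""

# The view exposes the same rows as the underlying table.
rows = conn.execute("SELECT src, dst FROM edge ORDER BY src").fetchall()
```

Keeping the data in the underscore-prefixed table and exposing it through a view means the relation name in the policy stays clean while the storage layout remains an implementation detail.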

Broadly, this PR works as follows:

  1. Tables are defined declaratively with SQLAlchemy, both in Macaron itself and in the checks.
  2. From the analyzer class, DatabaseManager.create_tables() creates the database, tables, and views if they don’t exist.
  3. Checks store populated ORM-mapped tables in CheckResult["result_tables"].
  4. The Analyzer stores these to the database after analysis is completed, along with the information Macaron stores, such as the analyzed repositories and dependency tree.

The policy engine is invoked from a separate script which is passed the database file and a policy file

$ python -m macaron.policy_engine -h
usage: policy_engine [-h] -d DATABASE [-f FILE] [-s]

options:
  -h, --help            show this help message and exit
  -d DATABASE, --database DATABASE
                        Database path
  -f FILE, --file FILE  Replace policy file
  -s, --show-preamble   Show preamble
$ python -m macaron.policy_engine -d output/macaron.db -f tests/policy_engine/resources/policies/testpolicy.dl -h
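A command-line interface with these options could be declared with argparse roughly as below. This is a sketch reconstructed from the help text only; the function name `build_parser` is hypothetical and the actual parser in policy_engine/__main__.py may differ.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the options listed in the help text above."""
    parser = argparse.ArgumentParser(prog="policy_engine")
    # -d is shown without brackets in the usage line, so it is treated as required.
    parser.add_argument("-d", "--database", required=True, help="Database path")
    parser.add_argument("-f", "--file", help="Replace policy file")
    parser.add_argument(
        "-s", "--show-preamble", action="store_true", help="Show preamble"
    )
    return parser


# Example: parse an argument list instead of sys.argv.
args = build_parser().parse_args(["-d", "output/macaron.db", "-s"])
```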

At this stage, the script does the following:

  1. The database is opened and the full schema is reflected into the SQLAlchemy metadata
  2. For each table whose name begins with an underscore (_), a corresponding Souffle declaration and import statement are generated
  3. Some helper relations and rules are generated
  4. The prelude is constructed by combining the import statements, helper relations, and some additional non-generated rules
  5. A file is created with the generated prelude prepended to the actual policy file
  6. Souffle is invoked on this file, and the results are printed
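The generation steps above might look roughly like the following sketch. It uses the standard sqlite3 module in place of SQLAlchemy reflection to stay dependency-free, and every name in it (`souffle_type`, `generate_imports`, `build_program`, the type mapping, the sample tables) is an assumption made for illustration rather than the code in souffle_code_generator.py.

```python
import sqlite3


def souffle_type(sqlite_type: str) -> str:
    """Map a declared SQLite column type to a Souffle type (assumed mapping)."""
    declared = sqlite_type.upper()
    if "INT" in declared:
        return "number"
    if any(key in declared for key in ("REAL", "FLOA", "DOUB")):
        return "float"
    return "symbol"


def generate_imports(conn: sqlite3.Connection, db_path: str) -> str:
    """Emit a .decl and .input pair for every table whose name starts with '_'."""
    tables = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        )
        if row[0].startswith("_")
    ]
    lines = []
    for table in tables:
        relation = table[1:]  # "_edge" -> "edge"
        # PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk).
        columns = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
        params = ", ".join(f"{col[1]}: {souffle_type(col[2])}" for col in columns)
        lines.append(f".decl {relation}({params})")
        lines.append(f'.input {relation}(IO=sqlite, dbname="{db_path}")')
    return "\n".join(lines)


def build_program(imports: str, helpers: str, policy: str) -> str:
    """Prepend the generated prelude (imports plus helpers) to the policy text."""
    return "\n\n".join([imports, helpers, policy])


# Demo on an in-memory database with hypothetical tables.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE _edge (src INTEGER, dst INTEGER);
    CREATE TABLE _repository (id INTEGER, name TEXT);
    CREATE TABLE other (x TEXT);  -- no underscore prefix, so it is skipped
    """
)
imports = generate_imports(conn, "macaron.db")
helpers = "// helper relations and hand-written rules would go here"
policy = (
    ".decl reachable(a: number, b: number)\n"
    "reachable(a, b) :- edge(a, b).\n"
    ".output reachable"
)
program = build_program(imports, helpers, policy)
```

The combined `program` string is what would be written to a file and handed to Souffle in the final two steps.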

Changes Summary

  • Import SQLAlchemy to manage database connection

  • Refactor DatabaseManager to use SQLAlchemy (api change)

    • and add corresponding unit tests
  • AnalyzeContext now returns ORM-mapped tables to be inserted into the database, rather than constructing SQL queries

  • CheckResult has a new field "result_tables: list[Table]"

  • Analyzer now populates tables to store the analysis, dependency, and slsa-level results, and check_results

  • AnalyzeContext now stores an ORM-mapped table representing the repository being analyzed, which the Analyzer object stores to the database before analysis starts

  • Analyzer stores all tables which checks insert into CheckResult["result_tables"] to the database after analysis

    • Nearly all checks are modified to define and store result tables
  • base_check.py defines a table to store check results

  • base_check.py defines a SQLAlchemy declarative mixin, CheckFactsTable, which defines check_result id and repository id foreign-key fields; when a result table inherits from it, the analyzer populates these fields automatically.

  • provenance_l3_check is stricter as per pull/29.

  • add: policy_engine/__main__.py is the entry point for the policy engine

  • add: policy_engine/souffle_code_generator.py contains the logic for generating the souffle datalog for data import

  • add: policy_engine/souffle.py contains the wrapper for invoking souffle in a temporary directory

    • and corresponding unit tests
  • policy_engine/policy.py has some changes due to a manually reverted refactor; it will likely have to be refactored again to integrate the policy engine
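A wrapper that invokes Souffle in a temporary directory, as described for policy_engine/souffle.py above, could be sketched like this. The function names, the file name policy.dl, and the choice to collect the CSV outputs are assumptions for illustration; the sketch also assumes a `souffle` executable on the PATH and relies on Souffle's `-D` option for the output directory.

```python
import subprocess
import tempfile
from pathlib import Path


def build_souffle_command(program: Path, output_dir: Path, executable: str = "souffle") -> list:
    """Build the argument list: souffle <program> -D <output directory>."""
    return [executable, str(program), "-D", str(output_dir)]


def run_souffle(program_text: str, executable: str = "souffle") -> dict:
    """Run a Datalog program in a temporary directory and return its outputs.

    Returns a mapping from relation name to the raw text of the file Souffle
    wrote for it. Raises FileNotFoundError if the executable is missing and
    subprocess.CalledProcessError if Souffle exits with an error.
    """
    with tempfile.TemporaryDirectory() as tmp:
        tmp_path = Path(tmp)
        program = tmp_path / "policy.dl"
        program.write_text(program_text, encoding="utf-8")
        output_dir = tmp_path / "out"
        output_dir.mkdir()
        subprocess.run(
            build_souffle_command(program, output_dir, executable),
            check=True,
            capture_output=True,
            text=True,
            cwd=tmp_path,
        )
        # Read results before the temporary directory is removed.
        return {
            path.stem: path.read_text(encoding="utf-8")
            for path in sorted(output_dir.glob("*.csv"))
        }
```

With Souffle installed, `run_souffle(program)` would be called with the generated prelude and policy text combined into one string; running in a temporary directory keeps intermediate files out of the working tree.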

To do

  • Policy engine: validate database version before proceeding
  • Integrate provenance policies using CUE or proof of concept policy engine
  • Have macaron run policy and include result in reports
  • Update check authoring documentation
  • Update build; add souffle to docker
@oracle-contributor-agreement bot added the "OCA Verified" label (All contributors have signed the Oracle Contributor Agreement.) Jan 25, 2023
Alistair Michael added 24 commits January 25, 2023 13:17
Check now displays all provenance artefacts which passed, failed, or were skipped.
- added separate tables for analysis and repository
- breaking: `analyze_result` now has an id referencing the analysis it pertains to
- deleted sql query generation code
no longer stores ORM mapped tables in CheckResult
for build_as_code, provenance_available, trusted_builder_l3
@ailrst changed the title from "Database update and policy engine" to "feat: Database update and policy engine" Jan 26, 2023
@ailrst closed this Jan 31, 2023