Add improver pipeline to flag ghost packages #644 #917 #1395 #1533
Merged
16 commits
9acd345 Add base pipeline (keshav-space)
56dafb2 Add improver pipeline to remove ghost packages (keshav-space)
3a13c78 Add logging config for pipelines (keshav-space)
cba58b8 Use latest pipeline (keshav-space)
7a72929 Use aboutcode.pipeline (keshav-space)
a686c61 Add test for remove_ghost_packages pipeline (keshav-space)
f5ac60a Drop support for Python 3.8 (keshav-space)
d870b4f Add status field to Package model (keshav-space)
539b7f6 Flag ghost packages (keshav-space)
aa0e57c Pin aboutcode.pipeline (keshav-space)
5c8770b Use boolean field to flag ghost package (keshav-space)
d0465cc Use paginated queryset for better memory performance (keshav-space)
b848747 Update package details template to show Ghost tag (keshav-space)
b0f90cb Improve docstring (keshav-space)
75de1e2 Drop version class wrapper (keshav-space)
0f41b18 Add CHANGELOG (keshav-space)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,21 @@ | ||
| # Generated by Django 4.1.13 on 2024-08-23 12:47 | ||
| | ||
| from django.db import migrations, models | ||
| | ||
| | ||
| class Migration(migrations.Migration): | ||
| | ||
| dependencies = [ | ||
| ("vulnerabilities", "0061_alter_packagechangelog_software_version_and_more"), | ||
| ] | ||
| | ||
| operations = [ | ||
| migrations.AddField( | ||
| model_name="package", | ||
| name="is_ghost", | ||
| field=models.BooleanField( | ||
| default=False, | ||
| help_text="True if the package does not exist in the upstream package manager or its repository.", | ||
| ), | ||
| ), | ||
| ] |
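
The migration above adds a boolean `is_ghost` column that defaults to `False`, so existing rows start out unflagged. The following is a rough, standard-library analogy using `sqlite3`, not the SQL that Django actually generates; the table layout and sample rows are assumptions made for illustration.

```python
import sqlite3

# Hypothetical, simplified stand-in for the Package table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE package (id INTEGER PRIMARY KEY, type TEXT, name TEXT, version TEXT)"
)
conn.executemany(
    "INSERT INTO package (type, name, version) VALUES (?, ?, ?)",
    [("pypi", "foo", "2.3.0"), ("pypi", "foo", "3.0.0")],
)

# Adding the column with a default leaves every existing row unflagged.
conn.execute("ALTER TABLE package ADD COLUMN is_ghost BOOLEAN NOT NULL DEFAULT 0")
ghost_flags = [row[0] for row in conn.execute("SELECT is_ghost FROM package ORDER BY id")]

# A later pass (the pipeline, in the PR) flips the flag for missing versions.
conn.execute("UPDATE package SET is_ghost = 1 WHERE version = '3.0.0'")
ghost_count = conn.execute(
    "SELECT COUNT(*) FROM package WHERE is_ghost = 1"
).fetchone()[0]
```

Because the default is `False`, the flag only becomes meaningful once the pipeline has run at least once.
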
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,34 @@ | ||
| # | ||
| # Copyright (c) nexB Inc. and others. All rights reserved. | ||
| # VulnerableCode is a trademark of nexB Inc. | ||
| # SPDX-License-Identifier: Apache-2.0 | ||
| # See http://www.apache.org/licenses/LICENSE-2.0 for the license text. | ||
| # See https://github.com/nexB/vulnerablecode for support or download. | ||
| # See https://aboutcode.org for more information about nexB OSS projects. | ||
| # | ||
| import logging | ||
| from datetime import datetime | ||
| from datetime import timezone | ||
| | ||
| from aboutcode.pipeline import BasePipeline | ||
| | ||
| from vulnerabilities.utils import classproperty | ||
| | ||
| module_logger = logging.getLogger(__name__) | ||
| | ||
| | ||
| class VulnerableCodePipeline(BasePipeline): | ||
| def log(self, message, level=logging.INFO): | ||
| """Log the given `message` to the current module logger and execution_log.""" | ||
| now_local = datetime.now(timezone.utc).astimezone() | ||
| timestamp = now_local.strftime("%Y-%m-%d %H:%M:%S.%f")[:-3] | ||
| message = f"{timestamp} {message}" | ||
| module_logger.log(level, message) | ||
| self.append_to_log(message) | ||
| | ||
| @classproperty | ||
| def qualified_name(cls): | ||
| """ | ||
| Fully qualified name prefixed with the module name of the pipeline used in logging. | ||
| """ | ||
| return f"{cls.__module__}.{cls.__qualname__}" |
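
The base pipeline above prefixes each message with a millisecond-precision local timestamp and exposes a class-level `qualified_name`. Below is a minimal standalone sketch of both ideas. `SketchPipeline`, its `execution_log` list and the `classproperty` descriptor are hypothetical stand-ins written for this example; they are not the `aboutcode.pipeline` or `vulnerabilities.utils` implementations.

```python
import logging
from datetime import datetime, timezone


class classproperty:
    """Hypothetical minimal descriptor: a read-only property on the class."""

    def __init__(self, fget):
        self.fget = fget

    def __get__(self, instance, owner):
        return self.fget(owner)


class SketchPipeline:
    def __init__(self):
        # Assumption: a plain list stands in for the pipeline execution log.
        self.execution_log = []

    def log(self, message, level=logging.INFO):
        now_local = datetime.now(timezone.utc).astimezone()
        # %f gives six microsecond digits; dropping three leaves milliseconds.
        timestamp = now_local.strftime("%Y-%m-%d %H:%M:%S.%f")[:-3]
        line = f"{timestamp} {message}"
        logging.getLogger(__name__).log(level, line)
        self.execution_log.append(line)

    @classproperty
    def qualified_name(cls):
        return f"{cls.__module__}.{cls.__qualname__}"


pipeline = SketchPipeline()
pipeline.log("hello")
```

Slicing off the last three characters of the formatted string is what turns the microsecond field into milliseconds.
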
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,102 @@ | ||
| # | ||
| # Copyright (c) nexB Inc. and others. All rights reserved. | ||
| # VulnerableCode is a trademark of nexB Inc. | ||
| # SPDX-License-Identifier: Apache-2.0 | ||
| # See http://www.apache.org/licenses/LICENSE-2.0 for the license text. | ||
| # See https://github.com/nexB/vulnerablecode for support or download. | ||
| # See https://aboutcode.org for more information about nexB OSS projects. | ||
| # | ||
| | ||
| import logging | ||
| from itertools import groupby | ||
| from traceback import format_exc as traceback_format_exc | ||
| | ||
| from aboutcode.pipeline import LoopProgress | ||
| from fetchcode.package_versions import SUPPORTED_ECOSYSTEMS as FETCHCODE_SUPPORTED_ECOSYSTEMS | ||
| from fetchcode.package_versions import versions | ||
| from packageurl import PackageURL | ||
| | ||
| from vulnerabilities.models import Package | ||
| from vulnerabilities.pipelines import VulnerableCodePipeline | ||
| | ||
| | ||
| class FlagGhostPackagePipeline(VulnerableCodePipeline): | ||
| """Detect and flag packages that do not exist upstream.""" | ||
| | ||
| @classmethod | ||
| def steps(cls): | ||
| return (cls.flag_ghost_packages,) | ||
| | ||
| def flag_ghost_packages(self): | ||
| detect_and_flag_ghost_packages(logger=self.log) | ||
| | ||
| | ||
| def detect_and_flag_ghost_packages(logger=None): | ||
| """Check if packages are available upstream. If not, mark them as ghost packages.""" | ||
| interesting_packages_qs = ( | ||
| Package.objects.order_by("type", "namespace", "name") | ||
| .filter(type__in=FETCHCODE_SUPPORTED_ECOSYSTEMS) | ||
| .filter(qualifiers="") | ||
| .filter(subpath="") | ||
| ) | ||
| | ||
| distinct_packages_count = ( | ||
| interesting_packages_qs.values("type", "namespace", "name") | ||
| .distinct("type", "namespace", "name") | ||
| .count() | ||
| ) | ||
| | ||
| grouped_packages = groupby( | ||
| interesting_packages_qs.paginated(), | ||
| key=lambda pkg: (pkg.type, pkg.namespace, pkg.name), | ||
| ) | ||
| | ||
| ghost_package_count = 0 | ||
| progress = LoopProgress(total_iterations=distinct_packages_count, logger=logger) | ||
| for type_namespace_name, packages in progress.iter(grouped_packages): | ||
| ghost_package_count += flag_ghost_packages( | ||
| base_purl=PackageURL(*type_namespace_name), | ||
| packages=packages, | ||
| logger=logger, | ||
| ) | ||
| | ||
| if logger: | ||
| logger(f"Successfully flagged {ghost_package_count:,d} ghost Packages") | ||
| | ||
| | ||
| def flag_ghost_packages(base_purl, packages, logger=None): | ||
| """ | ||
| Check if `packages` are available upstream. | ||
| If not, update `is_ghost` to `True`. | ||
| Return the number of packages flagged as ghost. | ||
| """ | ||
| known_versions = get_versions(purl=base_purl, logger=logger) | ||
| # Skip if an error was encountered while fetching known versions | ||
| if known_versions is None: | ||
| return 0 | ||
| | ||
| ghost_packages = 0 | ||
| for pkg in packages: | ||
| pkg.is_ghost = False | ||
| if pkg.version.lstrip("vV") not in known_versions: | ||
| pkg.is_ghost = True | ||
| ghost_packages += 1 | ||
| | ||
| if logger: | ||
| logger(f"Flagging ghost package {pkg.purl!s}", level=logging.DEBUG) | ||
| pkg.save() | ||
| | ||
| return ghost_packages | ||
| | ||
| | ||
| def get_versions(purl, logger=None): | ||
| """Return the set of known versions for the given purl, or None if fetching fails.""" | ||
| try: | ||
| return {v.value.lstrip("vV") for v in versions(str(purl))} | ||
| except Exception as e: | ||
| if logger: | ||
| logger( | ||
| f"Error while fetching known versions for {purl!s}: {e!r} \n {traceback_format_exc()}", | ||
| level=logging.ERROR, | ||
| ) | ||
| return | ||
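
The detection logic above can be sketched without Django: sort packages by `(type, namespace, name)`, group them, fetch the upstream versions once per group, and flag any stored version that is missing after stripping a leading `v`/`V`. In this sketch `Pkg`, `find_ghosts` and `fake_fetch` are hypothetical names, and the in-memory `upstream` dict is an assumption standing in for `fetchcode`.

```python
from itertools import groupby
from typing import NamedTuple


class Pkg(NamedTuple):
    type: str
    namespace: str
    name: str
    version: str


def normalize(version):
    # Same normalization as the PR: drop any leading "v" or "V".
    return version.lstrip("vV")


def find_ghosts(packages, fetch_versions):
    """Return the packages whose version is not known upstream."""
    key = lambda p: (p.type, p.namespace, p.name)
    ghosts = []
    # groupby only merges adjacent items, so the input must be sorted first.
    for base, group in groupby(sorted(packages, key=key), key=key):
        try:
            known = {normalize(v) for v in fetch_versions(base)}
        except Exception:
            continue  # upstream lookup failed: leave this group untouched
        ghosts.extend(p for p in group if normalize(p.version) not in known)
    return ghosts


upstream = {("pypi", "", "foo"): ["v2.3.0"]}


def fake_fetch(base):
    return upstream[base]  # raises KeyError for unknown packages


packages = [
    Pkg("pypi", "", "foo", "2.3.0"),
    Pkg("pypi", "", "foo", "v3.0.0"),
    Pkg("npm", "", "bar", "1.0.0"),
]
ghosts = find_ghosts(packages, fake_fetch)
```

Skipping a group when the lookup fails mirrors the `known_versions is None` early return in the PR: a failed fetch does not flag anything as a ghost.
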
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,20 @@ | ||
| # | ||
| # Copyright (c) nexB Inc. and others. All rights reserved. | ||
| # VulnerableCode is a trademark of nexB Inc. | ||
| # SPDX-License-Identifier: Apache-2.0 | ||
| # See http://www.apache.org/licenses/LICENSE-2.0 for the license text. | ||
| # See https://github.com/nexB/vulnerablecode for support or download. | ||
| # See https://aboutcode.org for more information about nexB OSS projects. | ||
| # | ||
| | ||
| import io | ||
| | ||
| | ||
| class TestLogger: | ||
| buffer = io.StringIO() | ||
| | ||
| def write(self, msg, level=None): | ||
| self.buffer.write(msg) | ||
| | ||
| def getvalue(self): | ||
| return self.buffer.getvalue() |
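
In the helper above, `buffer` is a class attribute, so all `TestLogger` instances write to the same `StringIO`. That is harmless while a single test uses it. If isolation between tests ever matters, one possible variant creates the buffer per instance; `PerInstanceLogger` is a hypothetical name used only in this sketch.

```python
import io


class PerInstanceLogger:
    """Sketch of a test logger whose buffer is not shared between instances."""

    def __init__(self):
        self.buffer = io.StringIO()

    def write(self, msg, level=None):
        self.buffer.write(msg)

    def getvalue(self):
        return self.buffer.getvalue()


first = PerInstanceLogger()
second = PerInstanceLogger()
first.write("only in first")
```
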
vulnerabilities/tests/pipelines/test_flag_ghost_packages.py (71 additions, 0 deletions)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,71 @@ | ||
| # | ||
| # Copyright (c) nexB Inc. and others. All rights reserved. | ||
| # VulnerableCode is a trademark of nexB Inc. | ||
| # SPDX-License-Identifier: Apache-2.0 | ||
| # See http://www.apache.org/licenses/LICENSE-2.0 for the license text. | ||
| # See https://github.com/nexB/vulnerablecode for support or download. | ||
| # See https://aboutcode.org for more information about nexB OSS projects. | ||
| # | ||
| | ||
| | ||
| from pathlib import Path | ||
| from unittest import mock | ||
| | ||
| from django.test import TestCase | ||
| from fetchcode.package_versions import PackageVersion | ||
| from packageurl import PackageURL | ||
| | ||
| from vulnerabilities.models import Package | ||
| from vulnerabilities.pipelines import flag_ghost_packages | ||
| from vulnerabilities.tests.pipelines import TestLogger | ||
| | ||
| | ||
| class FlagGhostPackagePipelineTest(TestCase): | ||
| data = Path(__file__).parent.parent / "test_data" | ||
| | ||
| @mock.patch("vulnerabilities.pipelines.flag_ghost_packages.versions") | ||
| def test_flag_ghost_package(self, mock_fetchcode_versions): | ||
| Package.objects.create(type="pypi", name="foo", version="2.3.0") | ||
| Package.objects.create(type="pypi", name="foo", version="3.0.0") | ||
| | ||
| mock_fetchcode_versions.return_value = [ | ||
| PackageVersion(value="2.3.0"), | ||
| ] | ||
| interesting_packages_qs = Package.objects.all() | ||
| base_purl = PackageURL(type="pypi", name="foo") | ||
| | ||
| self.assertEqual(0, Package.objects.filter(is_ghost=True).count()) | ||
| | ||
| flagged_package_count = flag_ghost_packages.flag_ghost_packages( | ||
| base_purl=base_purl, | ||
| packages=interesting_packages_qs, | ||
| ) | ||
| self.assertEqual(1, flagged_package_count) | ||
| self.assertEqual(1, Package.objects.filter(is_ghost=True).count()) | ||
| | ||
| @mock.patch("vulnerabilities.pipelines.flag_ghost_packages.versions") | ||
| def test_detect_and_flag_ghost_packages(self, mock_fetchcode_versions): | ||
| Package.objects.create(type="pypi", name="foo", version="2.3.0") | ||
| Package.objects.create(type="pypi", name="foo", version="3.0.0") | ||
| Package.objects.create( | ||
| type="deb", | ||
| namespace="debian", | ||
| name="foo", | ||
| version="3.0.0", | ||
| qualifiers={"distro": "trixie"}, | ||
| ) | ||
| | ||
| mock_fetchcode_versions.return_value = [ | ||
| PackageVersion(value="2.3.0"), | ||
| ] | ||
| | ||
| self.assertEqual(3, Package.objects.count()) | ||
| self.assertEqual(0, Package.objects.filter(is_ghost=True).count()) | ||
| | ||
| logger = TestLogger() | ||
| | ||
| flag_ghost_packages.detect_and_flag_ghost_packages(logger=logger.write) | ||
| expected = "Successfully flagged 1 ghost Packages" | ||
| | ||
| self.assertIn(expected, logger.getvalue()) | ||
| self.assertEqual(1, Package.objects.filter(is_ghost=True).count()) |
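
The tests above replace the upstream lookup with a mock so that no network call is made. The same pattern can be sketched with only the standard library. Here `get_versions` takes the fetcher as a parameter, which is an assumption of this sketch rather than the PR's signature, and `SimpleNamespace` stands in for `PackageVersion`.

```python
from types import SimpleNamespace
from unittest import mock


def get_versions(purl, fetch):
    """Return the normalized upstream versions reported by `fetch`."""
    return {v.value.lstrip("vV") for v in fetch(purl)}


# The mock returns one known version, like mock_fetchcode_versions in the tests.
fake_versions = mock.Mock(return_value=[SimpleNamespace(value="v2.3.0")])

known = get_versions("pkg:pypi/foo", fetch=fake_versions)
stored_versions = ["2.3.0", "3.0.0"]
ghost_versions = [v for v in stored_versions if v.lstrip("vV") not in known]

# Confirm the fetcher was queried exactly once, with the base purl.
fake_versions.assert_called_once_with("pkg:pypi/foo")
```

With one known upstream version and two stored versions, exactly one version is left over, which matches the count of 1 asserted in both tests.
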