Commit 04a6a5d

refactor: centralize PAT validation, streamline repo checks & housekeeping

Added:
* `.venv*` to `.gitignore`
* `# type: ignore[attr-defined]` hints in `compat_typing.py` for IDE-agnostic imports
* Helpful PAT string in `InvalidGitHubTokenError` for easier debugging

Changed:
* Bump **ruff-pre-commit** hook → `v0.12.1`
* CONTRIBUTING:
  * Require **Python 3.9+**
  * Recommend signed (`-S`) commits
* PAT validation now happens **only** in entry points (`utils.auth.resolve_token` for CLI/lib, `server.process_query` for Web UI)
* Unified `_check_github_repo_exists` into `check_repo_exists`, replacing `curl -I` with `curl --silent --location --write-out %{http_code} -o /dev/null`
* Broaden `_GITHUB_PAT_PATTERN`
* `create_git_auth_header` raises `ValueError` when hostname is missing
* Tests updated to expect raw HTTP-code output

Removed:
* Superfluous “token can be set via `GITHUB_TOKEN`” notes in docstrings
* `.gitingestignore` & `.terraform` from `DEFAULT_IGNORE_PATTERNS`
* Token validation inside `create_git_command`
* Obsolete `test_create_git_command_invalid_token`

Tests:
* Adjust `test_clone.py` and `test_git_utils.py` for new status-code handling
* Consolidate mocks after token-validation relocation

BREAKING CHANGE: `create_git_command` no longer validates GitHub tokens; callers must ensure tokens are valid (via `validate_github_token`) before invoking lower-level git helpers.
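
The switch from `curl -I` to a `--write-out %{http_code}` probe means the caller now receives only the final status code on stdout. The following is a minimal sketch of how such output might be consumed; it is not the project's `check_repo_exists`. The helper names (`build_curl_command`, `status_means_exists`, `repo_exists`) and the mapping of 401/403/404 to "does not exist" are assumptions made for illustration; only the `curl` flags are taken from the commit message.

```python
from __future__ import annotations

import asyncio

# Flags quoted from the commit message; everything else in this sketch is illustrative.
CURL_STATUS_ARGS = ["curl", "--silent", "--location", "--write-out", "%{http_code}", "-o", "/dev/null"]


def build_curl_command(url: str, auth_header: str | None = None) -> list[str]:
    """Return the argv for a status-code probe (hypothetical helper)."""
    cmd = list(CURL_STATUS_ARGS)
    if auth_header:
        cmd += ["-H", auth_header]
    return [*cmd, url]


def status_means_exists(raw: bytes) -> bool:
    """Map curl's raw status-code output to a yes/no answer (assumed policy)."""
    code = int(raw.decode().strip())
    if code == 200:
        return True
    if code in (401, 403, 404):
        return False
    raise RuntimeError(f"Unexpected HTTP status: {code}")


async def repo_exists(url: str) -> bool:
    """Run curl and interpret the status code it prints on stdout."""
    proc = await asyncio.create_subprocess_exec(
        *build_curl_command(url),
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.PIPE,
    )
    stdout, _ = await proc.communicate()
    if proc.returncode != 0:
        raise RuntimeError(f"curl exited with code {proc.returncode}")
    return status_means_exists(stdout)
```

Keeping the parsing step separate from the subprocess call lets tests feed raw status-code bytes directly, which is in line with the commit's note that tests now expect raw HTTP-code output.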
1 parent 2592303 commit 04a6a5d

14 files changed, 90 insertions(+), 138 deletions(-)

.gitignore

Lines changed: 1 addition & 0 deletions
@@ -126,6 +126,7 @@ celerybeat.pid
 # Environments
 .env
 .venv
+.venv*
 env/
 venv/
 ENV/

.pre-commit-config.yaml

Lines changed: 1 addition & 1 deletion
@@ -75,7 +75,7 @@ repos:
         args: ["--disable=line-length"]

   - repo: https://github.com/astral-sh/ruff-pre-commit
-    rev: v0.12.0
+    rev: v0.12.1
     hooks:
       - id: ruff-check
       - id: ruff-format

CONTRIBUTING.md

Lines changed: 13 additions & 3 deletions
@@ -19,6 +19,8 @@ Thanks for your interest in contributing to Gitingest! 🚀 Gitingest aims to be
    cd gitingest
    ```

+   **Note**: To contribute, ensure you have **Python 3.9 or newer** installed, as some of the `pre-commit` hooks (e.g. `pyupgrade`) require Python 3.9+.
+
 3. Set up the development environment and install dependencies:

    ```bash
@@ -31,7 +33,7 @@ Thanks for your interest in contributing to Gitingest! 🚀 Gitingest aims to be
 4. Create a new branch for your changes:

    ```bash
-   git checkout -b your-branch
+   git checkout -S -b your-branch
    ```

 5. Make your changes. Make sure to add corresponding tests for your changes.
@@ -66,10 +68,18 @@ Thanks for your interest in contributing to Gitingest! 🚀 Gitingest aims to be

 9. Confirm that everything is working as expected. If you encounter any issues, fix them and repeat steps 6 to 8.

-10. Commit your changes:
+10. Commit your changes (signed):
+
+    All commits to Gitingest must be [GPG-signed](https://docs.github.com/en/authentication/managing-commit-signature-verification) so that the project can verify the authorship of every contribution. You can either configure Git globally with:
+
+    ```bash
+    git config --global commit.gpgSign true
+    ```
+
+    or pass the `-S` flag as shown below.

     ```bash
-    git commit -m "Your commit message"
+    git commit -S -m "Your commit message"
     ```

     If `pre-commit` raises any issues, fix them and repeat steps 6 to 9.

src/gitingest/cli.py

Lines changed: 0 additions & 1 deletion
@@ -131,7 +131,6 @@ async def _async_main(
         If ``True``, also ingest files matched by ``.gitignore`` or ``.gitingestignore`` (default: ``False``).
     token : str | None
         GitHub personal access token (PAT) for accessing private repositories.
-        Can also be set via the ``GITHUB_TOKEN`` environment variable.
     output : str | None
         The path where the output file will be written (default: ``digest.txt`` in current directory).
         Use ``"-"`` to write to ``stdout``.

src/gitingest/clone.py

Lines changed: 1 addition & 7 deletions
@@ -13,7 +13,6 @@
     ensure_git_installed,
     is_github_host,
     run_command,
-    validate_github_token,
 )
 from gitingest.utils.os_utils import ensure_directory
 from gitingest.utils.timeout_wrapper import async_timeout
@@ -23,7 +22,7 @@


 @async_timeout(DEFAULT_TIMEOUT)
-async def clone_repo(config: CloneConfig, token: str | None = None) -> None:
+async def clone_repo(config: CloneConfig, *, token: str | None = None) -> None:
     """Clone a repository to a local path based on the provided configuration.

     This function handles the process of cloning a Git repository to the local file system.
@@ -36,7 +35,6 @@ async def clone_repo(config: CloneConfig, token: str | None = None) -> None:
         The configuration for cloning the repository.
     token : str | None
         GitHub personal access token (PAT) for accessing private repositories.
-        Can also be set via the ``GITHUB_TOKEN`` environment variable.

     Raises
     ------
@@ -51,10 +49,6 @@ async def clone_repo(config: CloneConfig, token: str | None = None) -> None:
     branch: str | None = config.branch
     partial_clone: bool = config.subpath != "/"

-    # Validate token if provided
-    if token and is_github_host(url):
-        validate_github_token(token)
-
     # Create parent directory if it doesn't exist
     await ensure_directory(Path(local_path).parent)
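
The bare `*` added to the `clone_repo` signature makes `token` keyword-only, so existing callers that passed the token positionally will now fail at call time. A small sketch of that effect follows; the `CloneConfig` dataclass and the function body here are simplified stand-ins, not the project's real definitions.

```python
from __future__ import annotations

import asyncio
from dataclasses import dataclass


@dataclass
class CloneConfig:
    """Simplified stand-in for the real CloneConfig (assumed fields)."""

    url: str
    local_path: str


async def clone_repo(config: CloneConfig, *, token: str | None = None) -> str:
    """Mirror only the new signature; the body is a placeholder."""
    return f"clone {config.url} (token={'set' if token else 'unset'})"


config = CloneConfig(url="https://github.com/user/repo", local_path="/tmp/repo")

# Keyword form: accepted.
print(asyncio.run(clone_repo(config, token="ghp_example")))

# Positional form: rejected with TypeError before any coroutine is created.
try:
    clone_repo(config, "ghp_example")  # type: ignore[misc]
except TypeError as exc:
    print(f"TypeError: {exc}")
```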

src/gitingest/query_parser.py

Lines changed: 0 additions & 3 deletions
@@ -49,7 +49,6 @@ async def parse_query(
         Patterns to ignore. Can be a set of strings or a single string.
     token : str | None
         GitHub personal access token (PAT) for accessing private repositories.
-        Can also be set via the ``GITHUB_TOKEN`` environment variable.

     Returns
     -------
@@ -109,7 +108,6 @@ async def _parse_remote_repo(source: str, token: str | None = None) -> Ingestion
         The URL or domain-less slug to parse.
     token : str | None
         GitHub personal access token (PAT) for accessing private repositories.
-        Can also be set via the ``GITHUB_TOKEN`` environment variable.

     Returns
     -------
@@ -301,7 +299,6 @@ async def try_domains_for_user_and_repo(user_name: str, repo_name: str, token: s
         The name of the repository.
     token : str | None
         GitHub personal access token (PAT) for accessing private repositories.
-        Can also be set via the ``GITHUB_TOKEN`` environment variable.

     Returns
     -------

src/gitingest/utils/auth.py

Lines changed: 6 additions & 1 deletion
@@ -4,6 +4,8 @@

 import os

+from gitingest.utils.git_utils import validate_github_token
+

 def resolve_token(token: str | None) -> str | None:
     """Resolve the token to use for the query.
@@ -19,4 +21,7 @@ def resolve_token(token: str | None) -> str | None:
         The resolved token.

     """
-    return token or os.getenv("GITHUB_TOKEN")
+    token = token or os.getenv("GITHUB_TOKEN")
+    if token:
+        validate_github_token(token)
+    return token
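
To make the resolution order concrete, here is a self-contained sketch of the new `resolve_token` behaviour: an explicit argument wins, the `GITHUB_TOKEN` environment variable is the fallback, and validation runs once on whichever value is chosen. The `validate_github_token` below is a deliberately crude stand-in (a prefix check raising `ValueError`), not the project's validator, and the token strings are made up.

```python
from __future__ import annotations

import os


def validate_github_token(token: str) -> None:
    """Crude stand-in for the real validator in gitingest.utils.git_utils."""
    if not token.startswith(("ghp_", "github_pat_")):
        raise ValueError(f"Invalid GitHub token format: {token!r}")


def resolve_token(token: str | None) -> str | None:
    """Same control flow as the diff above, using the stand-in validator."""
    token = token or os.getenv("GITHUB_TOKEN")
    if token:
        validate_github_token(token)
    return token


os.environ["GITHUB_TOKEN"] = "ghp_from_env"
print(resolve_token("ghp_explicit"))  # explicit argument takes precedence
print(resolve_token(None))  # falls back to the environment variable

os.environ.pop("GITHUB_TOKEN")
print(resolve_token(None))  # nothing to resolve, nothing to validate
```

Because an empty or missing token skips validation entirely, anonymous access to public repositories is unaffected by the relocation.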

compat_typing.py

Lines changed: 4 additions & 4 deletions
@@ -1,13 +1,13 @@
 """Compatibility layer for typing."""

 try:
-    from typing import ParamSpec, TypeAlias  # Py ≥ 3.10
+    from typing import ParamSpec, TypeAlias  # type: ignore[attr-defined]  # Py ≥ 3.10
 except ImportError:
-    from typing_extensions import ParamSpec, TypeAlias  # Py 3.8 / 3.9
+    from typing_extensions import ParamSpec, TypeAlias  # type: ignore[attr-defined]  # Py 3.8 / 3.9

 try:
-    from typing import Annotated  # Py ≥ 3.9
+    from typing import Annotated  # type: ignore[attr-defined]  # Py ≥ 3.9
 except ImportError:
-    from typing_extensions import Annotated  # Py 3.8
+    from typing_extensions import Annotated  # type: ignore[attr-defined]  # Py 3.8

 __all__ = ["Annotated", "ParamSpec", "TypeAlias"]
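
For context on why this shim exists, the sketch below shows the same try/except import pattern feeding a typed decorator. The decorator `logged` and the `calls` list are hypothetical and exist only to give `ParamSpec` something to do; on Python 3.10+ the first import succeeds, and the fallback assumes `typing_extensions` is installed on older interpreters.

```python
import functools
from typing import Callable, List, TypeVar

try:
    from typing import ParamSpec  # available in the stdlib from Python 3.10
except ImportError:
    from typing_extensions import ParamSpec  # backport for older interpreters

P = ParamSpec("P")
R = TypeVar("R")

calls: List[str] = []  # records which wrapped functions were invoked


def logged(func: Callable[P, R]) -> Callable[P, R]:
    """Hypothetical decorator that preserves the wrapped function's signature."""

    @functools.wraps(func)
    def wrapper(*args: P.args, **kwargs: P.kwargs) -> R:
        calls.append(func.__name__)
        return func(*args, **kwargs)

    return wrapper


@logged
def add(a: int, b: int) -> int:
    return a + b


print(add(2, 3))
print(calls)
```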

src/gitingest/utils/exceptions.py

Lines changed: 3 additions & 5 deletions
@@ -41,8 +41,6 @@ def __init__(self, message: str) -> None:
 class InvalidGitHubTokenError(ValueError):
     """Exception raised when a GitHub Personal Access Token is malformed."""

-    def __init__(self) -> None:
-        super().__init__(
-            "Invalid GitHub token format. Token should start with 'github_pat_' or 'ghp_' "
-            "followed by at least 36 characters of letters, numbers, and underscores.",
-        )
+    def __init__(self, token: str) -> None:
+        msg = f"Invalid GitHub token format: {token!r}. To generate a token, see https://github.com/settings/tokens."
+        super().__init__(msg)
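
The new constructor requires the offending token, so every raise site must pass it. The sketch below pairs the exception from this diff with a hypothetical validator. The regular expression is an assumption reconstructed from the removed error message (`ghp_` or `github_pat_` followed by at least 36 letters, digits, or underscores); the actual, broadened `_GITHUB_PAT_PATTERN` is not shown in this commit view and may differ.

```python
import re


class InvalidGitHubTokenError(ValueError):
    """Exception raised when a GitHub Personal Access Token is malformed."""

    def __init__(self, token: str) -> None:
        msg = f"Invalid GitHub token format: {token!r}. To generate a token, see https://github.com/settings/tokens."
        super().__init__(msg)


# Assumed pattern, derived from the old error message; not the project's real one.
_ASSUMED_PAT_PATTERN = re.compile(r"(?:ghp_|github_pat_)[A-Za-z0-9_]{36,}")


def validate_github_token(token: str) -> None:
    """Raise InvalidGitHubTokenError unless the token matches the assumed pattern."""
    if not _ASSUMED_PAT_PATTERN.fullmatch(token):
        raise InvalidGitHubTokenError(token)


validate_github_token("ghp_" + "a" * 36)  # well-formed under the assumed pattern

try:
    validate_github_token("not-a-token")
except InvalidGitHubTokenError as exc:
    print(exc)
```

Since the class still derives from `ValueError`, callers that already catch `ValueError` around token handling keep working after the signature change.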
