- Notifications
You must be signed in to change notification settings - Fork 569
Django support sensitive variables #2399
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…om/vagi8/sentry-python into django-support-sensitive-variables
| Future Task: Django version 5.0 is expected to support coroutine functions as well. This PR does not add support for it. I will deal with it in a separate PR. |
szokeasaurusrex left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@vagi8 Thank you for your contribution!
It appears that some of the Django tests you wrote are currently failing. The output from our CI test run is available here; if you can see it, that's great, but it is possible that it is only available to Sentry employees. If that's the case, let me know, and I can send you the output from the failing tests directly.
You can also run the Django tests locally using the following command: tox -e py3.11-django-v4.1 (this requires Tox to be installed and setup correctly, please let me know if you need guidance on how to do this).
Once you've fixed the failing tests, we will be happy to take a more thorough look at this PR. Please also feel free to reach out to us if you need help setting up the local test environment or any assistance with debugging.
| I did check out this error earlier and I was hoping I could get some help resolving. This is an error in test setup, where django is not able to get the view for the route (as shown in the image). The same works in my local (using pytest command for latest django version as well). All views are present (to the best of my knowledge). Tox test case pass for django-v1.11 and below. Could you help me why the test environment is not able to identify the route ? |
| Hey @vagi8. Could you try removing/commenting out this test https://github.com/getsentry/sentry-python/blob/master/tests/integrations/django/test_basic.py#L1057 and seeing if the problem persists? I found out in an unrelated PR that that test may be altering the urlconf for other tests, leading to failures. I'll see if I can fix that in another PR. |
| @sentrivana |
| @vagi8 Thanks for verifying! I've pushed a small commit to your PR to address the broken test case. |
| @vagi8 Can you please take a look at these?
|
This comment was marked as resolved.
This comment was marked as resolved.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey @vagi8, I have been pretty busy for the past few weeks, but I finally had some time to review your PR again and add my feedback. Please let me know if you have any questions about my feedback!
| Hey @szokeasaurusrex, Thank you for being patient on the PR. Thanks for the feedback. I resolved everything. |
szokeasaurusrex left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the changes! It was much easier to understand the code now, and I have identified some ways we can improve this even more :)
Also, thank you for your patience with my review; my team has been quite busy the past few weeks with some bigger projects, but I finally had some time to take a look at this.
| # MultiValueDicts will have a return value. | ||
| is_multivalue_dict = isinstance(value, MultiValueDict) | ||
| except Exception as e: | ||
| return "{!r} while evaluating {!r}".format(e, value) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should not be returning a string here. Perhaps, we can a warning instead, and set is_multivalue_dict to False? Like so:
| return "{!r} while evaluating {!r}".format(e, value) | |
| logger.warning("{!r} while evaluating {!r}".format(e, value)) | |
| is_multivalue_dict = False |
| def _cleanse_special_types(request, value): | ||
| # type: (WSGIRequest, Any) -> Any |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| def _cleanse_special_types(request, value): | |
| # type: (WSGIRequest, Any) -> Any | |
| if TYPE_CHECKING: | |
| T = TypeVar("T") | |
| def _cleanse_special_types(request, value): | |
| # type: (WSGIRequest, T) -> T |
We might also need to import TypeVar (within an if TYPE_CHECKING block) at the file's beginning
| This mitigates leaking sensitive POST parameters if something like | ||
| request.POST['nonexistent_key'] throws an exception | ||
| """ | ||
| sensitive_post_parameters = getattr(request, "sensitive_post_parameters", []) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can this be anything other than a list?
| exception_idx = 0 | ||
| for _, (_, _, tbs) in enumerate(walk_exception_chain(exc_info)): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I should have clarified; the point of using enumerate here is so that we replace the exception_idx counter. Essentially, the following is equivalent to the current code:
for exception_idx, (_, _, tbs) in enumerate(walk_exception_chain(exc_info)):Applying the above suggestion, we can also remove the exception_idx += 1 line on line 577.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
However, after looking further into what this function is actually doing, the better solution would actually be to not keep track of the indices at all. In other words, you should both get rid of exception_idx and also the enumerate (same with frame_idx in the loop below, too).
Instead, we can simply pass the tb objects to _cleanse_sensitive_vars. That way, we also get rid of the confusing index handling logic in that function. We will also need to pull the error out of the event for this to work (so I would suggest getting rid of the error parameter to this function entirely).
| request, event, sensitive_variables, exception_idx, frame_idx | ||
| ) | ||
| # get sensitve variables from the frame | ||
| if ( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this if statement after the previous one? Would it not make more sense to have it before?
| request, # type: WSGIRequest | ||
| event, # type: Event | ||
| sensitive_variables, # type: Union[dict[str, Any], str] | ||
| reverse_exception_idx, # type: int | ||
| frame_idx, # type: int |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| request, # type: WSGIRequest | |
| event, # type: Event | |
| sensitive_variables, # type: Union[dict[str, Any], str] | |
| reverse_exception_idx, # type: int | |
| frame_idx, # type: int | |
| request, # type: WSGIRequest | |
| frame, # type: ... | |
| sensitive_variables, # type: Union[dict[str, Any], str] |
With my suggestions from above, we can probably change the function signature to the above
| exception_values = event.get("exception", {}).get("values", []) | ||
| if not exception_values: | ||
| return | ||
| | ||
| exception_idx = len(exception_values) - reverse_exception_idx - 1 | ||
| if exception_idx < 0 or exception_idx >= len(exception_values): | ||
| return | ||
| | ||
| exception = exception_values[exception_idx] | ||
| stacktrace = exception.get("stacktrace", {}) | ||
| frames = stacktrace.get("frames", []) | ||
| if frame_idx >= len(frames): | ||
| return | ||
| | ||
| frame = frames[frame_idx] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| exception_values = event.get("exception", {}).get("values", []) | |
| if not exception_values: | |
| return | |
| exception_idx = len(exception_values) - reverse_exception_idx - 1 | |
| if exception_idx < 0 or exception_idx >= len(exception_values): | |
| return | |
| exception = exception_values[exception_idx] | |
| stacktrace = exception.get("stacktrace", {}) | |
| frames = stacktrace.get("frames", []) | |
| if frame_idx >= len(frames): | |
| return | |
| frame = frames[frame_idx] |
Again, with the above changes, we can clean this up
| if sensitive_variables == "__ALL__" or name in sensitive_variables: | ||
| value = SENSITIVE_DATA_SUBSTITUTE | ||
| else: | ||
| value = _cleanse_special_types(request, value) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why does _cleanse_special_types rely on the request to determine what values are sensitive, instead of sensitive_variables?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should also probably modify _cleanse_special_types to modify value in place if possible, rather than copying it
| value = SENSITIVE_DATA_SUBSTITUTE | ||
| else: | ||
| value = _cleanse_special_types(request, value) | ||
| clean_vars[name] = value |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| value = SENSITIVE_DATA_SUBSTITUTE | |
| else: | |
| value = _cleanse_special_types(request, value) | |
| clean_vars[name] = value | |
| new_value = SENSITIVE_DATA_SUBSTITUTE | |
| else: | |
| new_value = _cleanse_special_types(request, value) | |
| clean_vars[name] = new_value |
| How are we on this @vagi8 ? Is there some time or motivation to bring this over the finish line? |
| @antonpirker little busy but I can try to spend some time. Any deadline ? We Django 5 released as well so I might have to add support for coroutines too. |
| We do not have deadlines :-) |
| This issue has gone three weeks without activity. In another week, I will close it. But! If you comment or otherwise update it, I will reset the clock, and if you remove the label "A weed is but an unloved flower." ― Ella Wheeler Wilcox 🥀 |
| This issue has gone three weeks without activity. In another week, I will close it. But! If you comment or otherwise update it, I will reset the clock, and if you remove the label "A weed is but an unloved flower." ― Ella Wheeler Wilcox 🥀 |
3 similar comments
| This issue has gone three weeks without activity. In another week, I will close it. But! If you comment or otherwise update it, I will reset the clock, and if you remove the label "A weed is but an unloved flower." ― Ella Wheeler Wilcox 🥀 |
| This issue has gone three weeks without activity. In another week, I will close it. But! If you comment or otherwise update it, I will reset the clock, and if you remove the label "A weed is but an unloved flower." ― Ella Wheeler Wilcox 🥀 |
| This issue has gone three weeks without activity. In another week, I will close it. But! If you comment or otherwise update it, I will reset the clock, and if you remove the label "A weed is but an unloved flower." ― Ella Wheeler Wilcox 🥀 |
| I don't know what's up with stalebot, closing manually. Please open a new PR if you find some time to continue on this @vagi8 🙏🏻 |

Fix #2356