- Notifications
You must be signed in to change notification settings - Fork 354
Monitor the main stack's stack guard #4207
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
esp-hal/esp_config.yml Outdated
| - '8' | ||
| active: 'chip == "esp32s3"' | ||
| | ||
| # TODO somni doesn't yet support `%` and evalexpr doesn't support `&` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oops
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should be available with somni-expr 0.2.0
esp-hal/src/exception_handler/mod.rs Outdated
| | ||
| #[cfg(xtensa)] | ||
| #[unsafe(no_mangle)] | ||
| #[unsafe(link_section = ".rwtext")] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this have to be in RAM? We should be able to load this from flash, and the performance isn't critical as this will just die miserably if called.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought the same - for now this was just copied, but in this case since we know "the world is about to end" a cache miss is not the worst thing that could happen 👍
7f1c63a to 7aa0d62 Compare esp-hal/src/soc/esp32/cpu_control.rs Outdated
| debug_assert!(!entry.is_null()); | ||
| | ||
| #[cfg(stack_guard_monitoring)] | ||
| crate::soc::enable_main_stack_guard_monitoring(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this will work as is, at least it's not watching the second core's stack.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's not intended for now to monitor the second core's stack - just detect the second core trashing the first core's stack - maybe not too useful/likely - t.b.h. I'm not sure if observing the other core's stack is really helpful 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Observing the second core's stack (that is, the second core guarding its own stack) makes a lot more sense to me, especially since it's much more limited than the first one's. Cross-checking like this is probably not very useful, randomly writing to that one address is very unlikely.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thinking about it - trying to do that will probably open a can of worms - I changed my mind - let's not monitor the other cores stack
89a7f26 to 71b9f54 Compare | That recursive assert failure doesn't look too pretty 😅 |
| // is copied to the core's stack. | ||
| static START_CORE1_FUNCTION: AtomicPtr<()> = AtomicPtr::new(core::ptr::null_mut()); | ||
| static APP_CORE_STACK_TOP: AtomicPtr<u32> = AtomicPtr::new(core::ptr::null_mut()); | ||
| static APP_CORE_STACK_GUARD: AtomicPtr<u32> = AtomicPtr::new(core::ptr::null_mut()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be cfg-gated to not generate warnings - or maybe the whole functionality shouldn't 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah it's not unused because of the always present start_app_core_with_stack_guard_offset - but I wasn't honoring stack_guard_monitoring in start_app_core 👍
efcb57f to c259809 Compare
bugadani left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
S3 tests are broken. Please avoid pushing to this PR without fixing it, the HIL runner can't recover by itself.
Ok - not having an env var set to disable it helps a lot |
| { | ||
| let stack_guard = APP_CORE_STACK_GUARD.load(Ordering::Acquire); | ||
| // setting 0 effectively disables the functionality | ||
| crate::debugger::set_stack_watchpoint(stack_guard as usize); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The stack guard value should be written before enabling the watchpoint on the second core
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we don't write the stack guard yet - but why not might help if someone is manually inspecting the stack
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is currently incidental, esp-rtos may decide to place it's stack canary to the watched address
Co-authored-by: Dániel Buga <bugadani@gmail.com>
* Allow setting a custom idle hook (esp-rs#4209) * Document esp-rtos just a bit (esp-rs#4208) * Slight MG touchup * Document esp-rtos in migration guide * Document things * Apply suggestion from @Copilot Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * Improve esp-rtos docs --------- Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * refactor(ledc): introduce freq unset error and mode conversion (esp-rs#4214) * feat: introduce freq unset error and mode conversion * chore: format * refactor * refactor!: rename api * fix * fix docs * Disable unusable pins for various chips (esp-rs#4202) * ESP32: disable integrated SPI-connected pins * remove unavailable/non-existent pins for esp32c2 and esp32h2 * address reviews, drop more user-unusable pins * edit the changelog * fix psram (esp32) oi * RTOS: Use simpler sync object for executor (esp-rs#4215) * Use simpler sync object internally * Update esp-rtos/src/task/mod.rs Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> --------- Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> * Monitor the main stack's stack guard (esp-rs#4207) * Monitor the main stack's stack guard * CHANGELOG.md * Clippy * Save a bit of RAM * Some some IRAM * Avoid warning * Hint on register names * Protect the main stack on the second core, too * Fix `rt` w/o main stack protection * Use correct cfg gates * Refactor * Improvements * More rt gating * Stack-guard for second core * Adapt examples and tests * Bump somni-expr * CHANGELOG and migration guide * Fix doc-tests * fmt * Use ESP_HAL_CONFIG_STACK_GUARD_OFFSET for the second core * Fix test * Honor stack_guard_monitoring everywhere * Fix rebased migration guide * Fix doc-tests again * Update esp-hal/src/debugger.rs Co-authored-by: Dániel Buga <bugadani@gmail.com> * Write the stack guard value on the second core --------- Co-authored-by: Dániel Buga <bugadani@gmail.com> * Move `multicore` from `esp-storage` to `esp-hal` (esp-rs#4188) * Move multicore from esp-storage to esp-hal * changelog * reviews * address review comments and clean up * remove duplicated park_core() and split is_running() per chip * cleanup * Clean up blocking until deadline (esp-rs#4217) * Hw-based RTOS stack overflow detection (esp-rs#4218) * Respect the stack guard offset in thread stacks * Use the hardware watchpoint to protect rtos thread stacks * Use watchpoints for stack overflow detection * Do not rely on the value of the stack guard (esp-rs#4220) * Make BLE task configurable (esp-rs#4223) * Redo config options (esp-rs#4224) * Remove unused deps (esp-rs#4230) * `esp-phy` the second (esp-rs#4228) * Hopefully fixes it * fmt * Fix documentation link in Cargo.toml --------- Co-authored-by: Dániel Buga <bugadani@gmail.com> * twai: do not abort transmissions on recoverable errors (esp-rs#4227) * twai: do not abort transmissions on recoverable errors The TWAI driver's interrupt handler currently aborts transmissions if certain (error) interrupts were triggered. This causes the transmission of frames to become unreliable, because the `transmit_async` operation does not return any error codes if a transmission was aborted. Additionally, the TWAI controller is designed to be a reliable and, with the exception of the bus-off state, normally would simply try to rentransmit a message on its own. Therefore, this commit removes transmission abortion on errors and, instead, only aborts a transmission if the controller changes into the bus-off state. Fixes esp-rs#4222 * Update esp-hal/CHANGELOG.md Co-authored-by: Juraj Sadel <jurajsadel@gmail.com> --------- Co-authored-by: Juraj Sadel <jurajsadel@gmail.com> * Change default wifi_max_burst_size (esp-rs#4231) * Remove outdated limitation (esp-rs#4232) * feat: Use github actions artifacts to store semver baseline --------- Co-authored-by: Dániel Buga <bugadani@gmail.com> Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> Co-authored-by: Ethan Wu <7086cmd@gmail.com> Co-authored-by: Kirill Mikhailov <62840029+playfulFence@users.noreply.github.com> Co-authored-by: Björn Quentin <bjoernQ@users.noreply.github.com> Co-authored-by: Juraj Sadel <juraj.sadel@espressif.com> Co-authored-by: Simon Neuenhausen <frostie.neuenhausen@gmail.com> Co-authored-by: Harald Böhm <harald@boehm.dev> Co-authored-by: Juraj Sadel <jurajsadel@gmail.com>
Thank you for your contribution!
We appreciate the time and effort you've put into this pull request.
To help us review it efficiently, please ensure you've gone through the following checklist:
Submission Checklist 📝
cargo xtask fmt-packagescommand to ensure that all changed code is formatted correctly.CHANGELOG.mdin the proper section.Extra:
Pull Request Details 📖
Description
This gives us the stack-overflow part from the (unstable) stack-protector feature for free.
We could even make the "save zone" a bit larger - up to 64 bytes on Xtensa and "as much as we like" on RISC-V - but it would create constraints on the alignment - it seems this might already help.
e.g. this example on C6 does funny things without the protection (using rustc 1.91.0-nightly (5eda692e7 2025-09-11))
(make sure to set the guard-offset to approx. 4096)
If we see this working "well enough" we can consider expanding it to esp-rtos tasks.
skip-changelogfor the change in esp-preemptTesting
Manual testing, running existing examples