Skip to content

Conversation

@jackylee-go
Copy link

@jackylee-go jackylee-go commented Oct 21, 2025

There are four modificaions of following commits.

  1. Remove the default configuration "FLASH_LOAD_SIZE" from "Kconfig.defconfig".
    Because of it is only used by "linker.ld" and it will automatically assign the same default value as "Kconfig.defconfig".

  2. Add a "__weak" declaration to "soc_early_init_hook" function.
    This is because "soc_early_init_hook" is already implemented in the et171 SOC.
    By overriding the weak function, applications or external modules can implement their own "soc_early_init_hook".

  3. To fixed the i-cache activation order issue
    Because ANDES's family of SOCs enables cache in reset.S, this execution precedes z_data_copy() (which copies the flash code to RAM).
    That might cause the i-cache to fetch instructions from SRAM before the d-cache writes them back to SRAM.
    To prevent this, executing "fence.i" instruction to force cache synchronization.

  4. Updated boards/index.rst
    Modify index.rst according to the suggestions in #91940

Jacky Lee added 3 commits October 21, 2025 11:47
Remove the default configuration FLASH_LOAD_SIZE, because "linker.ld" will automatically configure FLASH_LOAD_SIZE to the same default value Signed-off-by: Jacky Lee <jacky.lee@egistec.com>
By adding a __weak declaration, the implementation of 'soc_early_init_hook' can be more flexible. For example, some soc init operations that have not yet been upstreamed can be replaced with this method. Signed-off-by: Jacky Lee <jacky.lee@egistec.com>
Since caching is enabled before z_data_copy(), RAM functions may still be cached in the d-cache instead of being written to SRAM. In this case, the i-cache will fetch the wrong content from SRAM. Thus, using "fence.i" to fix it. Signed-off-by: Jacky Lee <jacky.lee@egistec.com>
Updated boards/index.rst to remove unnecessary context and fix statements. Signed-off-by: Jacky Lee <jacky.lee@egistec.com>
@jackylee-go
Copy link
Author

@niedzwiecki-dawid @nordicjm @jimmyzhe @kartben
It seems the CI didn't assign any reviewer.
Could you kindly take a moment to review it? Much appreciated!

__weak void soc_early_init_hook(void)
{
#if CONFIG_XIP && CONFIG_ICACHE
/* Since caching is enabled before z_data_copy(), RAM functions may still be cached
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do you mean arch_data_copy ? there is no z_data_copy

Copy link
Author

@jackylee-go jackylee-go Oct 21, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it named 'arch_data_copy' now.

commit 0f4e1acb

SHA-1: 94e171155ac0830020b8e41d31aa132cb37d2c7b * arch: init: rename z_data_copy -> arch_data_copy Do not use private API prefix and move to architecture interface as those functions are primarily used across arches and can be defined by the architecture. 
{
#if CONFIG_XIP && CONFIG_ICACHE
/* Since caching is enabled before z_data_copy(), RAM functions may still be cached
* in the d-cache instead of being written to SRAM. In this case, the i-cache will
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what section from the linker is the relevant here, because, normally this should only copy data over?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The ramfunc section is also copied through this function.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shouldn't we then do it in kernel/xip.c and use the cache api and implement the icache driver for riscv processors with the Zifencei extention?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because the arch_cache_xxx functions are not yet implemented.
And cache enabling functionality is implemented directly in the assembly code.
Therefore, I performed the fix directly in the assembly code.

Otherwise, we must abandon the implementation in reset.S and fully implement arch_cache_xxx.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for now we can leave it here, but in the long therm we should move over to the cache api

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Otherwise, we must abandon the implementation in reset.S and fully implement arch_cache_xxx.

arch/riscv/custom/andes/reset.S enables I/D Cache for SoC using Andes Custom CSR, which speeds up the boot sequence. I don't prefer to remove this.

I think the main problem is that the RAM function copy operation doesn’t handle I/D-cache synchronization right now.

zephyr/arch/common/xip.c

Lines 31 to 34 in b677e82

#ifdef CONFIG_ARCH_HAS_RAMFUNC_SUPPORT
arch_early_memcpy(&__ramfunc_region_start, &__ramfunc_load_start,
__ramfunc_end - __ramfunc_region_start);
#endif /* CONFIG_ARCH_HAS_RAMFUNC_SUPPORT */

arch_data_copy() runs too early that cache driver (ARCH/EXTERNAL) not initialize yet. Maybe introducing arch_early_ramfunc_memcpy() is an option (execute fence.i in RISC-V port), since the current arch_early_memcpy() cannot handle instruction and data synchronization.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you create a general issue for that?
Then I'm fine with the quick fix.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @jackylee-go for the feedback.
This issue also affects AE350, so I created #98046 to record it.

I am ok with this fix too.

@@ -7,8 +7,16 @@
#include <zephyr/init.h>
#include <zephyr/kernel.h>

void soc_early_init_hook(void)
__weak void soc_early_init_hook(void)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What feature needs to be overridden instead of application appending the initialization via SYS_INIT()?

Copy link
Author

@jackylee-go jackylee-go Oct 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently, not all drivers for this SOC are completed.
For example, setting the root clock is implemented in 'soc_early_init_hook'.
Therefore, if you want to change the clock, you must override soc_early_init_hook.
Changing the root clock at runtime using SYS_INIT() can pose some risks.
(Zephyr does not expect clocks to be changed at runtime.)

Copy link
Author

@jackylee-go jackylee-go Oct 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently, other drivers that haven't issue a PR have been completed and verified as modules.
However, some implementations require to modify 'soc_early_init_hook', which will cause conflicts when using upstream to verify.
Add a 'weak' declaration can avoid those conflicts.

Copy link
Contributor

@jimmyzhe jimmyzhe Oct 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The clock requirement should be solved by implement a clock driver or add Egis Kconfig option to set the clock.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@andyross @peter-mitsis for comment on making soc_early_init_hook weak

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No other vendor has a weak symbol for these and late/early soc hooks are used 174 times in zephyr, if you need it or have these issues I would say your drivers or system setup is badly architected or misdesigned

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is why I emphasize that not all drivers for this SOC are fully implemented.
This declaration is added solely to facilitate smoother development of subsequent drivers.
If the relevant driver has already been developed, this declaration can be removed.

If this approach is not appropriate for upstream , please let me know.
I’ll try to find an alternative solution.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this basically reiterates why this soc should never have been added upstream

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use SYS_INIT with INIT_LEVEL_EARLY, thats even happening before soc_early_init_hook or use board_early_init_hook , that is directly after soc_early_init_hook

In general if you are developing drivers, that you plan to contribute work in-tree

Copy link
Author

@jackylee-go jackylee-go Oct 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My point is that this isn't a order issue of 'SYS_INIT'.
It's that there might be different implementations that conflict with each other.

But based on the current upstream content, even if there is a conflict, there won't be any problem.

{
#if CONFIG_XIP && CONFIG_ICACHE
/* Since caching is enabled before z_data_copy(), RAM functions may still be cached
* in the d-cache instead of being written to SRAM. In this case, the i-cache will
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @jackylee-go for the feedback.
This issue also affects AE350, so I created #98046 to record it.

I am ok with this fix too.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

5 participants