win32/vmem.h change emulated calloc() to OS/CRT's native calloc() #23643
Most OSes/libcs have an optimization where calloc() sometimes, or most of the time, does not call memset() in userland, which would waste CPU zeroing brand new memory blocks/pages obtained fresh from the kernel. The larger the calloc() allocation is, the higher the chance that the memory blocks will be obtained fresh from the kernel. MS CRT's calloc() is a wrapper function that is either thin or heavy (personal opinions), and ultimately forwards to HeapAlloc(hSecretUBHandle, HEAP_ZERO_MEMORY, size).
Whether HeapAlloc in Kernel32.dll has the "don't memset(ptr, 0, size) fresh kernel pages" optimization or not, this author doesn't know, and it is irrelevant. With this change WinPerl has done its part to take advantage of the optimization if it exists inside Microsoft's closed source OS.
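A minimal, portable sketch of the difference described above. The function names `emulated_calloc`, `native_calloc` and `all_zero` are hypothetical and are not taken from vmem.h; the sketch only contrasts "malloc() then always memset()" with "let the libc/CRT decide whether zeroing is needed".

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Emulated calloc(): always pays for a userland memset(), even when the
// allocator just handed back pages the kernel already zeroed.
// Overflow of count * size is deliberately not checked in this sketch.
void *emulated_calloc(std::size_t count, std::size_t size) {
    std::size_t total = count * size;
    void *ptr = std::malloc(total);
    if (ptr != nullptr)
        std::memset(ptr, 0, total);
    return ptr;
}

// Native calloc(): the libc/CRT is free to skip the zeroing pass when it
// knows the memory is already zero-filled.
void *native_calloc(std::size_t count, std::size_t size) {
    return std::calloc(count, size);
}

// Hypothetical helper for checking the result: true if every byte is zero.
bool all_zero(const void *ptr, std::size_t len) {
    const unsigned char *p = static_cast<const unsigned char *>(ptr);
    for (std::size_t i = 0; i < len; ++i) {
        if (p[i] != 0)
            return false;
    }
    return true;
}
```

Both functions return zero-filled memory; the only difference is who decides whether a zeroing pass is required.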
Historically, perlhost.h/vmem.h had perl5xx.dll emulate calloc() because this area of the interp is "unfinished business" from the late 1990s, when Win95 and "Win32s Runtime" on Win 3.11 WFW OS compatibility was critical for WinPerl. WinNT Kernel Win OSes have always been POSIX-like or actually Unix SVR1 1983 compatible from the start (and remained compatible with POSIX/SVR1 1983 until WSL 1).
The alternate, never-used memory allocator in vmem.h doesn't have a Calloc() method, so the nextgen and current "native kernel32.dll malloc()" code couldn't implement a Calloc() method. The DIY malloc() impl doesn't have a Calloc() because in 1993-1997-ish, VirtualAlloc, VirtualProtect and VirtualFree couldn't be used in WinPerl for some reason lost to time.
This author's Win95 Kernel32.dll file exports all 3 functions and they are not stubs that only do "return STATUS_NOT_IMPLEMENTED;".
do_crt_invalid_parameter() was added so the DIY allocator behaves like the native MS CRT calloc(). perlhost.h's design concept is that the library can be copy-pasted without modification into the PHP and Python interps, something like that. Therefore perlhost.h and vmem.h aren't allowed to be aware of the Perl C API. So no croak()/die()/die_noperl().
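One way a DIY Calloc() can report a bad request without touching the Perl C API is sketched below. This is an assumption for illustration only: `diy_calloc` is a hypothetical name, and how the PR's do_crt_invalid_parameter() actually reports the failure is not shown here. The sketch rejects a count * size product that would overflow, sets errno, and returns NULL, using nothing but the C runtime.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <cstring>

// Hypothetical DIY calloc() that knows nothing about the host interpreter:
// failures are reported through errno and a NULL return only.
void *diy_calloc(std::size_t count, std::size_t size) {
    // Reject requests whose byte count does not fit in size_t.
    if (size != 0 && count > SIZE_MAX / size) {
        errno = ENOMEM;
        return nullptr;
    }
    std::size_t total = count * size;
    // Ask for at least 1 byte so a zero-sized request still gets a block.
    void *ptr = std::malloc(total != 0 ? total : 1);
    if (ptr != nullptr)
        std::memset(ptr, 0, total);
    return ptr;
}
```

Because the error path uses only errno and the return value, the same code could sit in any host program, which is the constraint the paragraph above describes.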
-split off the very cold "Free to wrong pool" panic branch into its own
function. Less "dead" machine code for the CPU to skip around in the
perf critical VMemNL::Free() call. VC 2022 -O1 LTO inlined the
DispatchWrongPool() method against our wishes, so override VC 2022's and
GCC's inline criteria. We do not want inlining here.
-move 2 void* writes out of the CS lock inside PerlMemSharedMalloc(),
PerlMemMalloc(), PerlMemParseMalloc() and the Calloc()s; they are writes
of constants to a new mem block and not reads/writes to the head (VMem*)
object, or the first block hanging off the VMem* LL, so it's not needed
to mutex lock those 2 writes
-m_lRefCount assignment in VMem::VMem so CC doesn't need to save var this
around fn call InitializeCriticalSection in this function
-change return NULL; to return ptr; for better codegen on MSVC 2022, since
the optimizer doesn't realize var ptr is a free 0x0 value after the false
test and instead emits xor RAX, RAX;
-reorder the VMem struct so VMemNL m_VMem (the per-my_perl pool) is at the
front
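Two of the items in the list above, the cold "wrong pool" branch split into a never-inlined function and the header writes moved out of the lock, can be sketched together. Everything in this sketch is hypothetical: `Pool`, `BlockHeader`, `SKETCH_NOINLINE`, `kLiveTag` and the field layout are stand-ins rather than the real VMem/VMemNL code, and std::mutex stands in for the Win32 critical section. The cold path here prints and returns false instead of panicking, so the sketch stays testable.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <cstdlib>
#include <mutex>

#if defined(_MSC_VER)
#  define SKETCH_NOINLINE __declspec(noinline)
#else
#  define SKETCH_NOINLINE __attribute__((noinline))
#endif

struct Pool;

struct BlockHeader {
    BlockHeader *prev;
    BlockHeader *next;
    Pool *owner;         // which pool handed out this block
    std::uintptr_t tag;  // constant marker, hypothetical
};

static const std::uintptr_t kLiveTag = 0x4C495645u;

struct Pool {
    std::mutex lock;   // stand-in for a CRITICAL_SECTION
    BlockHeader head;  // sentinel of a circular doubly linked list

    Pool() {
        head.prev = head.next = &head;
        head.owner = this;
        head.tag = kLiveTag;
    }

    void *Malloc(std::size_t size) {
        BlockHeader *hdr = static_cast<BlockHeader *>(
            std::malloc(sizeof(BlockHeader) + size));
        if (hdr == nullptr)
            return nullptr;
        // The new block is private to this thread until it is linked in,
        // so these two writes do not need the lock.
        hdr->owner = this;
        hdr->tag = kLiveTag;
        {
            // Only the shared list itself is touched under the lock.
            std::lock_guard<std::mutex> guard(lock);
            hdr->next = head.next;
            hdr->prev = &head;
            head.next->prev = hdr;
            head.next = hdr;
        }
        return hdr + 1;
    }

    bool Free(void *ptr) {
        if (ptr == nullptr)
            return true;
        BlockHeader *hdr = static_cast<BlockHeader *>(ptr) - 1;
        if (hdr->owner != this)
            return WrongPool(hdr);  // cold path lives out of line
        {
            std::lock_guard<std::mutex> guard(lock);
            hdr->prev->next = hdr->next;
            hdr->next->prev = hdr->prev;
        }
        std::free(hdr);
        return true;
    }

    // Very cold error branch, kept out of the hot Free() body.
    SKETCH_NOINLINE bool WrongPool(BlockHeader *hdr) {
        std::fprintf(stderr, "Free to wrong pool %p not %p\n",
                     static_cast<void *>(this),
                     static_cast<void *>(hdr->owner));
        return false;
    }
};
```

Freeing a block through a different pool takes the out-of-line path and leaves the block untouched; freeing it through its owner unlinks it under the lock.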