Off-by-one memory error in a string fastsearch since 3.11 #105235

Closed
@bacher09

Description

Bug report

This bug occurs at Objects/stringlib/fastsearch.h:589 while matching the last character of the searched buffer. In some cases it causes a crash, but it is hard to reproduce: for a crash to happen, the last character must be the last byte of its memory page, and the next page must be either unreadable or not mapped contiguously after the previous one.
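To make the off-by-one concrete, here is a minimal sketch of a lookahead search that peeks at the character just past the current window without a bounds check. It is a simplified Sunday-style model written for illustration, not the code in fastsearch.h; the function name `max_index_touched` and the skip rule (advance by `m + 1` when the lookahead character is not in the needle, otherwise by 1) are assumptions. Instead of reading out of bounds, it records the largest index it would have read.

```python
def max_index_touched(haystack: bytes, needle: bytes) -> int:
    """Largest haystack index a simplified, unguarded lookahead search reads.

    Hypothetical model for illustration only; not CPython's implementation.
    """
    n, m = len(haystack), len(needle)
    last = m - 1
    touched = -1
    i = 0
    while i <= n - m:
        # The window [i, i + m) is always in bounds.
        touched = max(touched, i + last)
        if haystack[i:i + m] == needle:
            break
        # Unguarded lookahead at the character just past the window.
        # Record the index instead of reading it: Python would raise
        # IndexError where C silently reads past the buffer.
        ahead = i + last + 1
        touched = max(touched, ahead)
        if ahead < n and haystack[ahead] in needle:
            i += 1        # lookahead character may be part of a match
        else:
            i += m + 1    # lookahead character rules out m + 1 alignments
    return touched


if __name__ == "__main__":
    print(max_index_touched(bytes(17), b"fo"))  # 17 == len(haystack): one past the end
    print(max_index_touched(bytes(16), b"fo"))  # 14: stays in bounds
```

In this model the out-of-bounds index appears only when the skips land exactly on the final window position, which is one reason such a bug can depend on the buffer size.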

The simplest script that reproduces the bug for me is:

    import mmap

    def bug():
        with open("file.tmp", "wb") as f:
            # this is the smallest size that triggers bug for me
            f.write(bytes(8388608))

        with open("file.tmp", "rb") as f:
            with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as fm:
                with open("/proc/self/maps", "rt") as f:
                    print(f.read())

                # this triggers bug
                res = fm.find(b"fo")

    if __name__ == "__main__":
        bug()

But since the result of this script depends on the file system, the kernel, and perhaps even the moon phase 😄, here's a much more reliable way to reproduce it:

    import mmap

    def read_maps():
        with open("/proc/self/maps", "rt") as f:
            return f.read()

    def bug():
        prev_map = frozenset(read_maps().split('\n'))
        new_map = None
        for i in range(0, 2049):
            # guard mmap
            with mmap.mmap(0, 4096 * (i + 1), flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS, prot=0) as guard:
                with mmap.mmap(0, 8388608 + 4096 * i, flags=mmap.MAP_ANONYMOUS | mmap.MAP_PRIVATE, prot=mmap.PROT_READ) as fm:
                    new_map = frozenset(read_maps().split('\n'))
                    for diff in new_map.difference(prev_map):
                        print(diff)
                    prev_map = new_map

                    # this crashes
                    fm.find(b"fo")
                    print("---")

    if __name__ == "__main__":
        bug()

This triggers the bug in every Linux environment I've tried. It uses an inaccessible guard mapping to increase the chance that the out-of-bounds read hits unreadable memory, and it avoids files to speed things up.
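One way to check whether the guard trick worked on a given run is to parse the `/proc/self/maps` text the script prints and test whether the address just past the searched buffer is readable. The sketch below assumes the standard `start-end perms offset dev inode [path]` line format; the helper names `parse_maps` and `readable_at` are hypothetical, and the sample lines are illustrative placeholders rather than captured output.

```python
def parse_maps(text):
    """Parse /proc/<pid>/maps text into (start, end, perms) tuples."""
    regions = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        start, end = (int(part, 16) for part in fields[0].split("-"))
        regions.append((start, end, fields[1]))
    return regions


def readable_at(regions, addr):
    """True if addr falls inside a mapping whose permissions start with 'r'."""
    return any(start <= addr < end and perms.startswith("r")
               for start, end, perms in regions)


# Illustrative placeholder lines: a readable buffer followed by a guard
# region with no permissions (prot=0 shows up as '---p').
SAMPLE = """\
7ffff6a00000-7ffff7200000 r--p 00000000 00:00 0
7ffff7200000-7ffff7201000 ---p 00000000 00:00 0
"""

if __name__ == "__main__":
    regions = parse_maps(SAMPLE)
    print(readable_at(regions, 0x7ffff71fffff))  # last byte of the buffer
    print(readable_at(regions, 0x7ffff7200000))  # first byte past the buffer
```

If the address one past the buffer is not readable, an unguarded read at that address would be expected to fault.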
Here's some extra info from GDB:

    Program received signal SIGSEGV, Segmentation fault.
    0x000055555570ba81 in stringlib_default_find (s=0x7ffff6a00000 "", n=8388608, p=0x7ffff745a3e0 "fo", m=2, maxcount=-1, mode=1) at Objects/stringlib/fastsearch.h:589
    589                 if (!STRINGLIB_BLOOM(mask, ss[i+1])) {
    (gdb) pipe info proc mappings | grep -A 1 -B 1 file.tmp
          0x555555cb4000     0x555555d66000    0xb2000        0x0  rw-p   [heap]
          0x7ffff6a00000     0x7ffff7200000   0x800000        0x0  r--s   /home/slava/src/cpython/python_bug/file.tmp
          0x7ffff7400000     0x7ffff7600000   0x200000        0x0  rw-p
    (gdb) p &ss[i]
    $1 = 0x7ffff71fffff ""
    (gdb) p &ss[i + 1]
    $2 = 0x7ffff7200000 <error: Cannot access memory at address 0x7ffff7200000>
    (gdb) p i
    $3 = 8388606
    (gdb) p ss
    $4 = 0x7ffff6a00001 ""
    (gdb) p s
    $5 = 0x7ffff6a00000 ""
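The arithmetic behind the GDB values can be checked directly. The short script below uses only the numbers printed above; the relation `ss = s + (m - 1)` is inferred from those values (`ss` is one byte past `s` and `m` is 2), not quoted from the source file.

```python
# Values copied from the GDB session above.
s = 0x7ffff6a00000            # start of the searched buffer
n = 8388608                   # buffer length in bytes
m = 2                         # needle length (b"fo")
i = 8388606                   # loop index at the time of the crash
mapping_end = 0x7ffff7200000  # end address of the file.tmp mapping

ss = s + (m - 1)              # inferred relation between ss and s

assert ss == 0x7ffff6a00001
assert i == n - m                 # i is at the last valid window position
assert ss + i == 0x7ffff71fffff   # &ss[i]: the last byte of the buffer
assert ss + i + 1 == s + n        # &ss[i + 1]: exactly one past the end
assert s + n == mapping_end       # which is the first unmapped address

print(hex(ss + i + 1))
```

So `ss[i+1]` refers to `s[n]`, one element beyond the `n` valid bytes, and here that address is the first byte after the mapping.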

Your environment

  • CPython 3.11.3
  • OS: Linux 6.1 (but it should be OS independent)

I've also tried a slightly modified version of the script on OS X, and it crashes there as well.

cc @sweeneyde (since you are the author of d01dceb and 6ddb09f).

Labels

  • 3.11: only security fixes
  • 3.12: only security fixes
  • 3.13: bugs and security fixes
  • interpreter-core: (Objects, Python, Grammar, and Parser dirs)
  • type-bug: An unexpected behavior, bug, or error
