Skip to content

Conversation

@zrr1999
Copy link
Member

@zrr1999 zrr1999 commented Mar 31, 2025

PR Category

Execute Infrastructure

PR Types

Performance

Description

实现了guard tree版本的lookup

@zrr1999 zrr1999 requested a review from Copilot March 31, 2025 08:33
@paddle-bot
Copy link

paddle-bot bot commented Mar 31, 2025

你的PR提交成功,感谢你对开源项目的贡献!
请关注后续CI自动化测试结果,详情请参考Paddle-CI手册
Your PR has been submitted. Thanks for your contribution!
Please wait for the result of CI firstly. See Paddle CI Manual for details.

@paddle-bot paddle-bot bot added the contributor External developers label Mar 31, 2025
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR implements a faster guard lookup mechanism by introducing a guard tree approach in the SOT executor. Key changes include:

  • Adding new make_faster_guard methods decorated with check_faster_guard across variables.
  • Modifying the executor and executor cache to propagate and utilize guard_nodes alongside guard_fn.
  • Removing the legacy make_faster_guard helper from the guard module and splitting guard-related properties in the function graph.

Reviewed Changes

Copilot reviewed 8 out of 10 changed files in this pull request and generated no comments.

Show a summary per file
File Description
python/paddle/jit/sot/opcode_translator/executor/variables/iter.py Added make_faster_guard method with check_faster_guard decorator.
python/paddle/jit/sot/opcode_translator/executor/variables/container.py Introduced make_faster_guard methods for container variables.
python/paddle/jit/sot/opcode_translator/executor/variables/callable.py Added make_faster_guard implementations for callable variables.
python/paddle/jit/sot/opcode_translator/executor/variables/basic.py Implemented make_faster_guard across several basic variable classes.
python/paddle/jit/sot/opcode_translator/executor/opcode_executor.py Propagated and stored guard_nodes in addition to guard_fn.
python/paddle/jit/sot/opcode_translator/executor/guard.py Removed the legacy make_faster_guard function.
python/paddle/jit/sot/opcode_translator/executor/function_graph.py Split guard properties into guard_nodes and guard_fn along with event registration.
python/paddle/jit/sot/opcode_translator/executor/executor_cache.py Updated caching logic to include guard_nodes and ensure consistency of lookup.
Files not reviewed (2)
  • paddle/fluid/pybind/jit.cc: Language not supported
  • paddle/fluid/pybind/sot/guards.h: Language not supported
Comments suppressed due to low confidence (3)

python/paddle/jit/sot/opcode_translator/executor/executor_cache.py:182

  • The assertion assumes that the guard_tree lookup always returns the exact index of the matching guard function. Consider handling mismatches more gracefully instead of relying on a strict assertion, which may lead to unexpected cache lookup failures.
assert (index == cache_index), "cache_index is not equal to index" 

python/paddle/jit/sot/opcode_translator/executor/executor_cache.py:338

  • Using a dummy GuardNode along with a TODO comment suggests a placeholder implementation. It's recommended to implement a proper zero-expression constructor for GuardNode or explicitly handle this fallback case to avoid potential runtime issues.
paddle.framework.core.GuardNode( paddle.framework.core.DummyGuard(), paddle.framework.core.ExprNode(), ) 

python/paddle/jit/sot/opcode_translator/executor/variables/basic.py:300

  • [nitpick] Consider providing a descriptive error message in the NotImplementedError (e.g., including the class name) to maintain consistency and aid debugging.
raise NotImplementedError 
@zrr1999 zrr1999 force-pushed the guard_tree_lookup branch from 04341c1 to 096124c Compare April 1, 2025 10:30
@zrr1999 zrr1999 requested review from SigureMo and gouzil as code owners April 1, 2025 10:30
@zrr1999 zrr1999 force-pushed the guard_tree_lookup branch from d9f51cf to 9339a27 Compare April 7, 2025 14:16
@SigureMo SigureMo requested a review from Copilot April 7, 2025 18:45
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot reviewed 14 out of 14 changed files in this pull request and generated no comments.

Comments suppressed due to low confidence (1)

paddle/fluid/pybind/jit.cc:189

  • The py::init binding for ExternVarExprNode appears to be missing a closing parenthesis after listing the constructor arguments. Please verify the syntax to ensure the binding compiles correctly.
py::class_<ExternVarExprNode, ExprNode, std::shared_ptr<ExternVarExprNode>>( 
@zrr1999 zrr1999 force-pushed the guard_tree_lookup branch from 29ae43c to eac2656 Compare April 12, 2025 07:14
def need_guard(self) -> bool:
# TODO(zrr1999): to implement gen_instructions and trace_value_from_frame
# TODO(zrr1999): implemented in #68555
return False
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SymbolicOperationTracker 相关的后面我会看看怎么实现

BinaryOperatorTracker 会在后续 PR 移除

if paddle.framework.use_pir_api():
# TODO(zrr1999): Better log
log(0, "[Warning] Guard Tree Only support PIR")
return []
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

现在我们会在老 IR 跑 make_faster_guard 么?直接 assert 可以吗?如果前两天动转静单测有这个问题,现在可能没了,因为前两个 PR 已经将 SOT+PT 的单测去掉了

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

看来我应该再拖两天再改

],
),
# TODO(zrr1999): add dist_info check
]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这里后面直接用 TensorMetaMatchGuard 应该就不用在这里考虑 dist_info

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Comment on lines 147 to 157
if not enable_strict_guard and enable_guard_tree:
guard_tree = paddle.framework.core.GuardTree(guard_nodes_list)
cache_index = guard_tree.lookup(frame)
if cache_index is not None:
# TODO(zrr1999): add a mapping between custom_code and cache_index
return guarded_fns[cache_index][0]

for custom_code, guard_fn in guarded_fns:
else:
if enable_strict_guard:
mirror_guard_error = None
try:
with EventGuard("try mirror guard"):
mirror_guard_result = guard_fn.mirror_guard(frame)
except Exception as e:
log(2, f"[Cache] Mirror guard error: {e}\n")
mirror_guard_error = e
guard_tree = paddle.framework.core.GuardTree(guard_nodes_list)
cache_index = guard_tree.lookup(frame)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这块逻辑有点复杂呀,为啥不是用 enable_guard_tree or enable_strict_guard 判断是否要跑 guard_tree.lookup,然后用是否有 enable_strict_guard 判断是否直接返回呢?感觉不需要让后续代码整体加一个缩进层级

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

有道理

@SigureMo SigureMo changed the title [SOT][Faster Guard] implement faster lookup [SOT][Faster Guard][3.13] implement faster lookup Apr 12, 2025
@zrr1999 zrr1999 force-pushed the guard_tree_lookup branch from eac2656 to 1bc1c05 Compare April 13, 2025 04:59
@SigureMo
Copy link
Member

奇怪 test_enumerate 为啥会稳定超时,复现下试试?

Copy link
Member

@SigureMo SigureMo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTMeow 🐾

@SigureMo SigureMo merged commit d81a381 into PaddlePaddle:develop Apr 13, 2025
33 of 34 checks passed
@SigureMo SigureMo changed the title [SOT][Faster Guard][3.13] implement faster lookup [SOT][Faster Guard] implement faster lookup Apr 13, 2025
YqGe585 pushed a commit to YqGe585/Paddle that referenced this pull request May 7, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

contributor External developers

2 participants