[WIP] pd: support dpa2/dpa3 inference #4846
Conversation
Pull Request Overview
This PR adds support for DPA2/DPA3 C++ inference with LAMMPS by implementing Paddle-based communication operations. The implementation addresses missing MPI communication functionality required for these models in distributed inference scenarios.
- Implements new communication operations for border exchange between MPI processes
- Adds C++ Paddle extensions with custom operators for message passing
- Updates C++ inference API to handle DPA2/DPA3 model requirements with proper tensor naming
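For reviewers unfamiliar with the pattern, the border exchange the new op implements is a standard halo exchange between MPI ranks. A minimal single-process NumPy stand-in (illustrative names, no MPI, not the PR's actual code) of the data movement looks like:

```python
import numpy as np

def halo_exchange(local_rows, sendlists):
    """Single-process stand-in for an MPI border (halo) exchange.

    local_rows[r]   : (nlocal_r, nfeat) array owned by simulated rank r.
    sendlists[s][d] : indices of rank s's local rows to ship to rank d.
    Returns, per rank, its local rows followed by the ghost rows
    received from every other rank.
    """
    nranks = len(local_rows)
    out = []
    for dst in range(nranks):
        # gather the rows every other rank would send to `dst`
        ghosts = [local_rows[src][sendlists[src][dst]]
                  for src in range(nranks) if src != dst]
        out.append(np.concatenate([local_rows[dst], *ghosts], axis=0))
    return out
```

In the real op the gather/concatenate step is replaced by `MPI_Send`/`MPI_Recv` pairs per swap, with the ghost region appended after the local atoms.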
Reviewed Changes
Copilot reviewed 17 out of 17 changed files in this pull request and generated 9 comments.
Show a summary per file
| File | Description | 
|---|---|
| source/op/pd/comm.cc | New custom Paddle operator implementation for MPI border communication | 
| source/op/pd/CMakeLists.txt | Build configuration for Paddle custom operators with MPI support | 
| source/api_cc/src/DeepPotPD.cc | Updates to C++ inference API for DPA2/DPA3 tensor handling | 
| source/api_cc/include/DeepPotPD.h | Header additions for communication tensor management | 
| source/CMakeLists.txt | Integration of Paddle operator build system | 
| deepmd/pd/model/descriptor/*.py | Python model updates to support communication dictionaries | 
| deepmd/pd/entrypoints/main.py | Model freezing updates for new tensor specifications | 
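Since several of the files above thread a communication dictionary through the descriptors, a small sketch of the invariants such a dict must satisfy may help reviewers. The key names below are hypothetical stand-ins for the send/recv index lists, peer ranks, and counts this PR passes around, not necessarily the implementation's actual tensor names:

```python
import numpy as np

def validate_comm_dict(comm_dict, nswap):
    """Sanity-check a border-communication dictionary before running the op.

    Key names are illustrative (send/recv lists, peer ranks, counts), not
    the exact names used in this PR.
    """
    assert len(comm_dict["send_list"]) == nswap, "one index list per swap"
    for key in ("send_proc", "recv_proc", "send_num", "recv_num"):
        arr = comm_dict[key]
        assert arr.shape == (nswap,) and np.issubdtype(arr.dtype, np.integer)
    # each swap must send exactly as many indices as advertised
    for idx, n in zip(comm_dict["send_list"], comm_dict["send_num"]):
        assert len(idx) == int(n)
    return True
```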
Comments suppressed due to low confidence (1)
source/op/pd/comm.cc (Outdated)

```cpp
#include <mpi-ext.h>
#endif
#endif
// #include <paddle/include/paddle_inference_api.h>
```
     Copilot AI  Jul 22, 2025  
Remove commented-out include directive or add a comment explaining why it's preserved.
```diff
-// #include <paddle/include/paddle_inference_api.h>
+// #include <paddle/include/paddle_inference_api.h> // Preserved for potential future use in inference-related functionality.
```
source/op/pd/comm.cc (Outdated)

```cpp
std::vector<int64_t> g1_tensor_shape,
std::vector<int64_t> communicator_tensor_shape,
std::vector<int64_t> nlocal_tensor_shape,
std::vector<int64_t> nghost_tenso_shape) {
```
     Copilot AI  Jul 22, 2025  
Parameter name has a typo: 'nghost_tenso_shape' should be 'nghost_tensor_shape'.
```diff
-std::vector<int64_t> nghost_tenso_shape) {
+std::vector<int64_t> nghost_tensor_shape) {
```
source/op/pd/comm.cc (Outdated)

```cpp
paddle::DataType g1_tensor_dtype,
paddle::DataType communicator_tensor_dtype,
paddle::DataType nlocal_tensor_dtype,
paddle::DataType nghost_tenso_dtype) {
```
     Copilot AI  Jul 22, 2025  
Parameter name has a typo: 'nghost_tenso_dtype' should be 'nghost_tensor_dtype'.
```diff
-paddle::DataType nghost_tenso_dtype) {
+paddle::DataType nghost_tensor_dtype) {
```
source/op/pd/comm.cc (Outdated)

```cpp
const paddle::Tensor& nghost_tensor,
paddle::Tensor& recv_g1_tensor_grad) {
bool type_flag =
    (sendlist_tensor.dtype() == paddle::DataType::FLOAT64) ? true : false;
```
     Copilot AI  Jul 22, 2025  
Type checking uses sendlist_tensor.dtype() instead of recv_g1_tensor_grad.dtype(), which is inconsistent with the template instantiation logic.
```diff
-    (sendlist_tensor.dtype() == paddle::DataType::FLOAT64) ? true : false;
+    (recv_g1_tensor_grad.dtype() == paddle::DataType::FLOAT64) ? true : false;
```
source/op/pd/CMakeLists.txt (Outdated)

```cmake
file(GLOB OP_SRC comm.cc)
add_library(deepmd_op_pd MODULE ${OP_SRC})

# ========== 新增:Paddle 配置(路径由用户外部传入) ==========
```
     Copilot AI  Jul 22, 2025  
Comments should be in English to maintain consistency with the rest of the codebase.
```diff
-# ========== 新增:Paddle 配置(路径由用户外部传入) ==========
+# ========== New: Paddle configuration (path provided externally by the user) ==========
```

```cmake
# Paddle lib
set(PADDLE_INFER_LIB ${PADDLE_LIB}/paddle/lib/libpaddle_inference.so)
# Math lib(默认用MKL)
```
     Copilot AI  Jul 22, 2025  
Comments should be in English to maintain consistency with the rest of the codebase.
```diff
-# Math lib(默认用MKL)
+# Math lib (default: MKL)
```
deepmd/pt/train/training.py (Outdated)

```python
# load
self.wrapper.load_state_dict(torch.load("./wrapper_dict.pt"))
print("model loaded")
inp = np.load("./input_dict.npz", allow_pickle=True)
for k, v in inp.items():
    if isinstance(v, np.ndarray):
        # print(k, type(v), v.shape, v.dtype)
        try:
            input_dict[k] = torch.tensor(v)
        except TypeError:
            pass
        if isinstance(input_dict[k], torch.Tensor):
            input_dict[k] = input_dict[k].cuda()
print("input_dict loaded")
lab = np.load("./label_dict.npz", allow_pickle=True)
for k, v in lab.items():
    if isinstance(v, np.ndarray):
        # print(k, type(v), v.shape, v.dtype)
        try:
            label_dict[k] = torch.tensor(v)
        except TypeError:
            pass
        if isinstance(label_dict[k], torch.Tensor):
            label_dict[k] = label_dict[k].cuda()
print("label_dict loaded")
```
     Copilot AI  Jul 22, 2025  
Debug code should be removed before merging to production. This includes hard-coded file loading and early exit statements.
```diff
-# load
-self.wrapper.load_state_dict(torch.load("./wrapper_dict.pt"))
-print("model loaded")
-inp = np.load("./input_dict.npz", allow_pickle=True)
-for k, v in inp.items():
-    if isinstance(v, np.ndarray):
-        # print(k, type(v), v.shape, v.dtype)
-        try:
-            input_dict[k] = torch.tensor(v)
-        except TypeError:
-            pass
-        if isinstance(input_dict[k], torch.Tensor):
-            input_dict[k] = input_dict[k].cuda()
-print("input_dict loaded")
-lab = np.load("./label_dict.npz", allow_pickle=True)
-for k, v in lab.items():
-    if isinstance(v, np.ndarray):
-        # print(k, type(v), v.shape, v.dtype)
-        try:
-            label_dict[k] = torch.tensor(v)
-        except TypeError:
-            pass
-        if isinstance(label_dict[k], torch.Tensor):
-            label_dict[k] = label_dict[k].cuda()
-print("label_dict loaded")
+# Load model state
+model_path = config.get("model_path", "./wrapper_dict.pt")
+self.wrapper.load_state_dict(torch.load(model_path))
+log.info("Model state loaded from %s", model_path)
+# Load input dictionary
+input_path = config.get("input_path", "./input_dict.npz")
+inp = np.load(input_path, allow_pickle=True)
+for k, v in inp.items():
+    if isinstance(v, np.ndarray):
+        try:
+            input_dict[k] = torch.tensor(v)
+        except TypeError:
+            pass
+        if isinstance(input_dict[k], torch.Tensor):
+            input_dict[k] = input_dict[k].cuda()
+log.info("Input dictionary loaded from %s", input_path)
+# Load label dictionary
+label_path = config.get("label_path", "./label_dict.npz")
+lab = np.load(label_path, allow_pickle=True)
+for k, v in lab.items():
+    if isinstance(v, np.ndarray):
+        try:
+            label_dict[k] = torch.tensor(v)
+        except TypeError:
+            pass
+        if isinstance(label_dict[k], torch.Tensor):
+            label_dict[k] = label_dict[k].cuda()
+log.info("Label dictionary loaded from %s", label_path)
```
deepmd/pd/train/training.py (Outdated)

```python
with nvprof_context(enable_profiling, "Backward pass"):
    loss.backward()

exit()
```
     Copilot AI  Jul 22, 2025  
Debug code should be removed before merging to production. This early exit will prevent normal training execution.
```diff
-exit()
```

```python
    extended_virial = extended_virial.view(list(extended_virial.shape[:-2]) + [9])  # noqa:RUF005
else:
    extended_virial = None
print(
```
     Copilot AI  Jul 22, 2025  
Debug print statements should be removed before merging to production.
📝 Walkthrough

This update introduces a custom PaddlePaddle operator (`border_op`) for MPI-based border communication, enabling DPA2/DPA3 distributed inference with LAMMPS.

Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Python
    participant Paddle
    participant CustomOp
    participant MPI
    Python->>Paddle: Load model/descriptor with comm_dict (list of tensors)
    Paddle->>CustomOp: Call border_op with send/recv lists, communicator, etc.
    CustomOp->>MPI: Perform send/recv (CPU/GPU, CUDA-aware if possible)
    CustomOp-->>Paddle: Return communicated tensor
    Paddle-->>Python: Continue forward/backward pass with updated data
```

Estimated code review effort: 4 (~90 minutes)
Actionable comments posted: 13
♻️ Duplicate comments (1)
deepmd/pd/model/descriptor/repflows.py (1)
72-75: Improve error message clarity
🧹 Nitpick comments (17)
.pre-commit-config.yaml (2)
68-75: Avoid blanket disabling of the Prettier hook

Commenting out the entire Prettier block removes automated formatting for Markdown/YAML/CSS across the repo, which quickly drifts style consistency, especially with multiple contributors. If the alpha release is the blocker, prefer downgrading/pinning to a stable tag or narrowing the `exclude`/`files` pattern instead of disabling altogether.

Example fix (pin to stable 3.x and keep existing exclusions):

```diff
-  # - repo: https://github.com/pre-commit/mirrors-prettier
-  #   rev: v4.0.0-alpha.8
-  #   hooks:
-  #     - id: prettier
-  #       types_or: [markdown, yaml, css]
-  #       exclude: ^(source/3rdparty|\.github/workflows|\.clang-format)
+  - repo: https://github.com/pre-commit/mirrors-prettier
+    rev: 3.2.5
+    hooks:
+      - id: prettier
+        types_or: [markdown, yaml, css]
+        exclude: ^(source/3rdparty|\.github/workflows|\.clang-format)
```
86-104: Re-evaluate removal of bibtex-tidy

Disabling `bibtex-tidy` means bibliography files will no longer be normalized, increasing the chance of merge conflicts and inconsistent field ordering. Unless the hook is demonstrably blocking work, consider re-enabling with a tighter pattern (e.g., only under `docs/refs/*.bib`) or pinning to a version known to work.

No immediate code change required, but please confirm this is an intentional long-term decision.
deepmd/pd/model/descriptor/repformer_layer.py (1)
113-113: Remove commented debugging code before finalizing the PR.

The commented print statement appears to be leftover debugging code that should be cleaned up before this WIP PR is finalized.

```diff
- # print(g1_ext.shape, index.shape)
```

deepmd/pt/train/wrapper.py (1)
178-178: Remove debugging print statement before finalizing the PR.

This print statement appears to be debugging code that should be removed before this WIP PR is finalized. Debug prints in the forward pass will clutter logs and impact performance.

```diff
- print(self.loss)
```

deepmd/pd/train/wrapper.py (1)
176-176: Remove debugging print statement before finalizing the PR.

This print statement appears to be debugging code that should be removed before this WIP PR is finalized. Debug prints in the forward pass will clutter logs and impact performance.

```diff
- print(self.loss)
```

deepmd/pt/model/model/transform_output.py (1)
99-101: Remove debugging print statement before finalizing the PR.

This print statement outputs detailed tensor statistics and appears to be debugging code. It should be removed before this WIP PR is finalized as it will generate excessive log output and impact performance.

```diff
- print(
-     f"extended_force: {extended_force.min().item():.10f} {extended_force.max().item():.10f} {extended_force.mean().item():.10f} {extended_force.std().item():.10f}"
- )
```

source/op/pd/CMakeLists.txt (6)
4-4: Replace Chinese comments with English for consistency.

Chinese comments are inconsistent with the rest of the codebase, which uses English comments.

```diff
-# ========== 新增:Paddle 配置(路径由用户外部传入) ==========
+# ========== Paddle Configuration (path provided externally by user) ==========
```
19-28: Remove commented-out legacy code.

Large blocks of commented-out code should be removed to improve readability. If this code might be needed later, it should be documented in a separate issue or commit message.

```diff
-# link_directories(${PADDLE_LIB_THIRD_PARTY_PATH}/protobuf/lib)
-# link_directories(${PADDLE_LIB_THIRD_PARTY_PATH}/glog/lib)
-# link_directories(${PADDLE_LIB_THIRD_PARTY_PATH}/gflags/lib)
-# link_directories(${PADDLE_LIB_THIRD_PARTY_PATH}/xxhash/lib)
-# link_directories(${PADDLE_LIB}/paddle/lib) target_link_libraries(deepmd_op_pd
-# PRIVATE ${PADDLE_LIB_THIRD_PARTY_PATH}/glog/lib/libglog.a
-# ${PADDLE_LIB_THIRD_PARTY_PATH}/gflags/lib/libgflags.a
-# ${PADDLE_LIB_THIRD_PARTY_PATH}/xxhash/lib/libxxhash.a
-# ${PADDLE_LIB_THIRD_PARTY_PATH}/protobuf/lib/libprotobuf.a dl pthread)
```
31-31: Replace Chinese comment with English.

```diff
-# Math lib(默认用MKL)
+# Math lib (default MKL)
```
35-45: Remove large commented-out legacy code block.

This extensive commented-out code should be removed to improve maintainability.

```diff
-# ========== 原有逻辑:保持不动 ==========
-# target_link_libraries(deepmd_op_pd PRIVATE ${PADDLE_LIBRARIES})
-# if(${OP_CXX_ABI_PT} EQUAL ${OP_CXX_ABI}) target_link_libraries(deepmd_op_pd
-# PRIVATE ${LIB_DEEPMD}) else() target_link_libraries(deepmd_op_pd PRIVATE
-# ${LIB_DEEPMD}_compat_cxxabi) endif()
-# remove_definitions(-D_GLIBCXX_USE_CXX11_ABI=${OP_CXX_ABI})
-# target_compile_definitions( deepmd_op_pd PUBLIC
-# "$<$<COMPILE_LANGUAGE:CXX>:_GLIBCXX_USE_CXX11_ABI=${OP_CXX_ABI_PT}>")
-# if(APPLE) set_target_properties(deepmd_op_pd PROPERTIES INSTALL_RPATH
-# "@loader_path") else() set_target_properties(deepmd_op_pd PROPERTIES
-# INSTALL_RPATH "$ORIGIN") endif()
```
66-66: Replace Chinese comment with English.

```diff
-# ========== 新增:链接 Paddle 所需库 ==========
+# ========== Link Required Paddle Libraries ==========
```

77-77: Replace Chinese comment with English.

```diff
-# ========== 保持原有安装路径逻辑 ==========
+# ========== Maintain Original Installation Path Logic ==========
```

deepmd/pt/train/training.py (2)
761-764: Use `contextlib.suppress` for cleaner exception handling.

Following the static analysis suggestion to improve code readability.

```diff
+import contextlib
+
-        try:
-            input_dict[k] = torch.tensor(v)
-        except TypeError:
-            pass
+        with contextlib.suppress(TypeError):
+            input_dict[k] = torch.tensor(v)
```
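For reference, `contextlib.suppress(TypeError)` is behaviorally identical to the try/except-pass it replaces. A self-contained sketch with stand-in data (not the trainer's actual dicts):

```python
import contextlib

raw = {"coord": [0.0, 1.0], "comm": object()}  # object() is not iterable
converted = {}
for k, v in raw.items():
    with contextlib.suppress(TypeError):
        # the TypeError raised for the non-iterable entry is silently skipped,
        # just like the original try/except/pass
        converted[k] = list(v)
```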
772-775: Use `contextlib.suppress` for cleaner exception handling.

Following the static analysis suggestion to improve code readability.

```diff
-        try:
-            label_dict[k] = torch.tensor(v)
-        except TypeError:
-            pass
+        with contextlib.suppress(TypeError):
+            label_dict[k] = torch.tensor(v)
```

deepmd/pd/model/descriptor/repformers.py (2)
68-71: Improve error message clarity

The current error message mentions "when freezing the model", which might be confusing. Consider making it clearer about the build requirement.

```diff
  raise NotImplementedError(
-     "border_op is not available since customized Paddle OP library is not built when freezing the model. "
-     "See documentation for DPA3 for details."
+     "border_op is not available because the customized Paddle OP library was not built. "
+     "Please rebuild with Paddle custom op support enabled. "
+     "See documentation for distributed training setup."
  )
```
490-495: Simplify tensor padding logic

The current implementation manually constructs shapes and concatenates tensors. This can be simplified.

```diff
- _shapes = g1.shape[1:]
- _shapes[1] = n_padding
- g1 = paddle.concat(
-     [g1.squeeze(0), paddle.zeros(_shapes, dtype=g1.dtype)],
-     axis=1,
- )
+ g1 = paddle.nn.functional.pad(
+     g1.squeeze(0), [0, 0, 0, n_padding], value=0.0
+ )
```

source/op/pd/comm.cc (1)
56-56: Clarify unusual pointer arithmetic

The expression `(sendlist_tensor + 0).data<int>()` is unusual. If this is intentional, please add a comment explaining why. Otherwise, simplify it.

```diff
- int** sendlist = reinterpret_cast<int**>((sendlist_tensor + 0).data<int>());
+ int** sendlist = reinterpret_cast<int**>(sendlist_tensor.data<int>());
```
📜 Review details
Configuration used: CodeRabbit UI
 Review profile: CHILL
 Plan: Pro
📒 Files selected for processing (17)
- .pre-commit-config.yaml(2 hunks)
- deepmd/pd/entrypoints/main.py(1 hunks)
- deepmd/pd/model/descriptor/dpa2.py(1 hunks)
- deepmd/pd/model/descriptor/repflows.py(8 hunks)
- deepmd/pd/model/descriptor/repformer_layer.py(1 hunks)
- deepmd/pd/model/descriptor/repformers.py(7 hunks)
- deepmd/pd/train/training.py(2 hunks)
- deepmd/pd/train/wrapper.py(1 hunks)
- deepmd/pd/utils/env.py(1 hunks)
- deepmd/pt/model/model/transform_output.py(1 hunks)
- deepmd/pt/train/training.py(1 hunks)
- deepmd/pt/train/wrapper.py(1 hunks)
- source/CMakeLists.txt(1 hunks)
- source/api_cc/include/DeepPotPD.h(1 hunks)
- source/api_cc/src/DeepPotPD.cc(2 hunks)
- source/op/pd/CMakeLists.txt(1 hunks)
- source/op/pd/comm.cc(1 hunks)
🧠 Learnings (9)
📓 Common learnings
Learnt from: njzjz
PR: deepmodeling/deepmd-kit#4144
File: source/api_cc/tests/test_deeppot_dpa_pt.cc:166-246
Timestamp: 2024-10-08T15:32:11.479Z
Learning: Refactoring between test classes `TestInferDeepPotDpaPt` and `TestInferDeepPotDpaPtNopbc` is addressed in PR #3905.

Learnt from: njzjz
PR: deepmodeling/deepmd-kit#4144
File: source/api_cc/tests/test_deeppot_dpa_pt.cc:166-246
Timestamp: 2024-09-19T04:25:12.408Z
Learning: Refactoring between test classes `TestInferDeepPotDpaPt` and `TestInferDeepPotDpaPtNopbc` is addressed in PR #3905.

Learnt from: njzjz
PR: deepmodeling/deepmd-kit#4226
File: deepmd/dpmodel/atomic_model/base_atomic_model.py:202-202
Timestamp: 2024-10-16T21:49:57.401Z
Learning: When reviewing PRs, avoid making refactor suggestions that are not directly related to the PR's changes. For example, in `deepmd/dpmodel/atomic_model/base_atomic_model.py`, do not suggest simplifying `for kk in ret_dict.keys()` to `for kk in ret_dict` unless it's relevant to the PR.

deepmd/pd/model/descriptor/repformer_layer.py (2)
Learnt from: njzjz
 PR: #4160
 File: deepmd/dpmodel/utils/env_mat.py:52-64
 Timestamp: 2024-09-24T01:59:37.973Z
 Learning: Negative indices in nlist are properly handled by masking later in the computation, so they do not cause issues in indexing operations.
Learnt from: njzjz
 PR: #4160
 File: deepmd/dpmodel/utils/env_mat.py:52-64
 Timestamp: 2024-10-08T15:32:11.479Z
 Learning: Negative indices in nlist are properly handled by masking later in the computation, so they do not cause issues in indexing operations.
deepmd/pd/model/descriptor/dpa2.py (2)
Learnt from: njzjz
 PR: #4144
 File: source/api_cc/tests/test_deeppot_dpa_pt.cc:166-246
 Timestamp: 2024-09-19T04:25:12.408Z
 Learning: Refactoring between test classes TestInferDeepPotDpaPt and TestInferDeepPotDpaPtNopbc is addressed in PR #3905.
Learnt from: njzjz
 PR: #4144
 File: source/api_cc/tests/test_deeppot_dpa_pt.cc:166-246
 Timestamp: 2024-10-08T15:32:11.479Z
 Learning: Refactoring between test classes TestInferDeepPotDpaPt and TestInferDeepPotDpaPtNopbc is addressed in PR #3905.
deepmd/pt/train/training.py (3)
Learnt from: njzjz
 PR: #4302
 File: deepmd/pd/infer/inference.py:35-38
 Timestamp: 2024-11-25T07:42:55.735Z
 Learning: In the file deepmd/pd/infer/inference.py, when loading the model checkpoint in the Tester class, it's acceptable to not include additional error handling for loading the model state dictionary.
Learnt from: njzjz
 PR: #4144
 File: source/api_cc/tests/test_deeppot_dpa_pt.cc:166-246
 Timestamp: 2024-09-19T04:25:12.408Z
 Learning: Refactoring between test classes TestInferDeepPotDpaPt and TestInferDeepPotDpaPtNopbc is addressed in PR #3905.
Learnt from: njzjz
 PR: #4144
 File: source/api_cc/tests/test_deeppot_dpa_pt.cc:166-246
 Timestamp: 2024-10-08T15:32:11.479Z
 Learning: Refactoring between test classes TestInferDeepPotDpaPt and TestInferDeepPotDpaPtNopbc is addressed in PR #3905.
deepmd/pd/model/descriptor/repflows.py (2)
Learnt from: njzjz
 PR: #4144
 File: source/api_cc/tests/test_deeppot_dpa_pt.cc:166-246
 Timestamp: 2024-09-19T04:25:12.408Z
 Learning: Refactoring between test classes TestInferDeepPotDpaPt and TestInferDeepPotDpaPtNopbc is addressed in PR #3905.
Learnt from: njzjz
 PR: #4144
 File: source/api_cc/tests/test_deeppot_dpa_pt.cc:166-246
 Timestamp: 2024-10-08T15:32:11.479Z
 Learning: Refactoring between test classes TestInferDeepPotDpaPt and TestInferDeepPotDpaPtNopbc is addressed in PR #3905.
source/api_cc/src/DeepPotPD.cc (4)
Learnt from: njzjz
 PR: #4144
 File: source/api_cc/tests/test_deeppot_dpa_pt.cc:166-246
 Timestamp: 2024-09-19T04:25:12.408Z
 Learning: Refactoring between test classes TestInferDeepPotDpaPt and TestInferDeepPotDpaPtNopbc is addressed in PR #3905.
Learnt from: njzjz
 PR: #4144
 File: source/api_cc/tests/test_deeppot_dpa_pt.cc:166-246
 Timestamp: 2024-10-08T15:32:11.479Z
 Learning: Refactoring between test classes TestInferDeepPotDpaPt and TestInferDeepPotDpaPtNopbc is addressed in PR #3905.
Learnt from: 1azyking
 PR: #4169
 File: deepmd/pt/loss/ener_hess.py:341-348
 Timestamp: 2024-10-08T15:32:11.479Z
 Learning: In deepmd/pt/loss/ener_hess.py, the label uses the key "atom_ener" intentionally to maintain consistency with the forked version.
Learnt from: 1azyking
 PR: #4169
 File: deepmd/pt/loss/ener_hess.py:341-348
 Timestamp: 2024-10-05T03:11:02.922Z
 Learning: In deepmd/pt/loss/ener_hess.py, the label uses the key "atom_ener" intentionally to maintain consistency with the forked version.
deepmd/pd/model/descriptor/repformers.py (2)
Learnt from: njzjz
 PR: #4144
 File: source/api_cc/tests/test_deeppot_dpa_pt.cc:166-246
 Timestamp: 2024-09-19T04:25:12.408Z
 Learning: Refactoring between test classes TestInferDeepPotDpaPt and TestInferDeepPotDpaPtNopbc is addressed in PR #3905.
Learnt from: njzjz
 PR: #4144
 File: source/api_cc/tests/test_deeppot_dpa_pt.cc:166-246
 Timestamp: 2024-10-08T15:32:11.479Z
 Learning: Refactoring between test classes TestInferDeepPotDpaPt and TestInferDeepPotDpaPtNopbc is addressed in PR #3905.
deepmd/pd/train/training.py (2)
Learnt from: njzjz
 PR: #4302
 File: deepmd/pd/infer/inference.py:35-38
 Timestamp: 2024-11-25T07:42:55.735Z
 Learning: In the file deepmd/pd/infer/inference.py, when loading the model checkpoint in the Tester class, it's acceptable to not include additional error handling for loading the model state dictionary.
Learnt from: HydrogenSulfate
 PR: #4414
 File: deepmd/pd/train/training.py:66-66
 Timestamp: 2024-11-29T12:15:22.226Z
 Learning: The function nvprof_context is defined in deepmd/pd/utils/utils.py, so importing it in deepmd/pd/train/training.py is correct.
source/api_cc/include/DeepPotPD.h (2)
Learnt from: njzjz
 PR: #4144
 File: source/api_cc/tests/test_deeppot_dpa_pt.cc:166-246
 Timestamp: 2024-09-19T04:25:12.408Z
 Learning: Refactoring between test classes TestInferDeepPotDpaPt and TestInferDeepPotDpaPtNopbc is addressed in PR #3905.
Learnt from: njzjz
 PR: #4144
 File: source/api_cc/tests/test_deeppot_dpa_pt.cc:166-246
 Timestamp: 2024-10-08T15:32:11.479Z
 Learning: Refactoring between test classes TestInferDeepPotDpaPt and TestInferDeepPotDpaPtNopbc is addressed in PR #3905.
🧬 Code Graph Analysis (2)
deepmd/pt/model/model/transform_output.py (1)
source/api_cc/include/DeepTensor.h (1)
std(139-338)
deepmd/pd/model/descriptor/dpa2.py (2)
source/api_cc/src/DeepPotPD.cc (1)
mapping(245-245)source/api_cc/include/DeepPotPD.h (1)
numel(285-293)
🪛 Ruff (0.12.2)
deepmd/pt/train/training.py
761-764: Use contextlib.suppress(TypeError) instead of try-except-pass
Replace with contextlib.suppress(TypeError)
(SIM105)
772-775: Use contextlib.suppress(TypeError) instead of try-except-pass
Replace with contextlib.suppress(TypeError)
(SIM105)
deepmd/pd/train/training.py
769-773: Use contextlib.suppress(Exception) instead of try-except-pass
(SIM105)
781-785: Use contextlib.suppress(Exception) instead of try-except-pass
(SIM105)
🧰 Additional context used
🧠 Learnings (9)
📓 Common learnings
Learnt from: njzjz PR: deepmodeling/deepmd-kit#4144 File: source/api_cc/tests/test_deeppot_dpa_pt.cc:166-246 Timestamp: 2024-10-08T15:32:11.479Z Learning: Refactoring between test classes `TestInferDeepPotDpaPt` and `TestInferDeepPotDpaPtNopbc` is addressed in PR #3905. Learnt from: njzjz PR: deepmodeling/deepmd-kit#4144 File: source/api_cc/tests/test_deeppot_dpa_pt.cc:166-246 Timestamp: 2024-09-19T04:25:12.408Z Learning: Refactoring between test classes `TestInferDeepPotDpaPt` and `TestInferDeepPotDpaPtNopbc` is addressed in PR #3905. Learnt from: njzjz PR: deepmodeling/deepmd-kit#4226 File: deepmd/dpmodel/atomic_model/base_atomic_model.py:202-202 Timestamp: 2024-10-16T21:49:57.401Z Learning: When reviewing PRs, avoid making refactor suggestions that are not directly related to the PR's changes. For example, in `deepmd/dpmodel/atomic_model/base_atomic_model.py`, do not suggest simplifying `for kk in ret_dict.keys()` to `for kk in ret_dict` unless it's relevant to the PR. deepmd/pd/model/descriptor/repformer_layer.py (2)
Learnt from: njzjz
 PR: #4160
 File: deepmd/dpmodel/utils/env_mat.py:52-64
 Timestamp: 2024-09-24T01:59:37.973Z
 Learning: Negative indices in nlist are properly handled by masking later in the computation, so they do not cause issues in indexing operations.
Learnt from: njzjz
 PR: #4160
 File: deepmd/dpmodel/utils/env_mat.py:52-64
 Timestamp: 2024-10-08T15:32:11.479Z
 Learning: Negative indices in nlist are properly handled by masking later in the computation, so they do not cause issues in indexing operations.
deepmd/pd/model/descriptor/dpa2.py (2)
Learnt from: njzjz
 PR: #4144
 File: source/api_cc/tests/test_deeppot_dpa_pt.cc:166-246
 Timestamp: 2024-09-19T04:25:12.408Z
 Learning: Refactoring between test classes TestInferDeepPotDpaPt and TestInferDeepPotDpaPtNopbc is addressed in PR #3905.
Learnt from: njzjz
 PR: #4144
 File: source/api_cc/tests/test_deeppot_dpa_pt.cc:166-246
 Timestamp: 2024-10-08T15:32:11.479Z
 Learning: Refactoring between test classes TestInferDeepPotDpaPt and TestInferDeepPotDpaPtNopbc is addressed in PR #3905.
deepmd/pt/train/training.py (3)
Learnt from: njzjz
 PR: #4302
 File: deepmd/pd/infer/inference.py:35-38
 Timestamp: 2024-11-25T07:42:55.735Z
 Learning: In the file deepmd/pd/infer/inference.py, when loading the model checkpoint in the Tester class, it's acceptable to not include additional error handling for loading the model state dictionary.
Learnt from: njzjz
 PR: #4144
 File: source/api_cc/tests/test_deeppot_dpa_pt.cc:166-246
 Timestamp: 2024-09-19T04:25:12.408Z
 Learning: Refactoring between test classes TestInferDeepPotDpaPt and TestInferDeepPotDpaPtNopbc is addressed in PR #3905.
Learnt from: njzjz
 PR: #4144
 File: source/api_cc/tests/test_deeppot_dpa_pt.cc:166-246
 Timestamp: 2024-10-08T15:32:11.479Z
 Learning: Refactoring between test classes TestInferDeepPotDpaPt and TestInferDeepPotDpaPtNopbc is addressed in PR #3905.
deepmd/pd/model/descriptor/repflows.py (2)
Learnt from: njzjz
 PR: #4144
 File: source/api_cc/tests/test_deeppot_dpa_pt.cc:166-246
 Timestamp: 2024-09-19T04:25:12.408Z
 Learning: Refactoring between test classes TestInferDeepPotDpaPt and TestInferDeepPotDpaPtNopbc is addressed in PR #3905.
Learnt from: njzjz
 PR: #4144
 File: source/api_cc/tests/test_deeppot_dpa_pt.cc:166-246
 Timestamp: 2024-10-08T15:32:11.479Z
 Learning: Refactoring between test classes TestInferDeepPotDpaPt and TestInferDeepPotDpaPtNopbc is addressed in PR #3905.
source/api_cc/src/DeepPotPD.cc (4)
Learnt from: njzjz
 PR: #4144
 File: source/api_cc/tests/test_deeppot_dpa_pt.cc:166-246
 Timestamp: 2024-09-19T04:25:12.408Z
 Learning: Refactoring between test classes TestInferDeepPotDpaPt and TestInferDeepPotDpaPtNopbc is addressed in PR #3905.
Learnt from: njzjz
 PR: #4144
 File: source/api_cc/tests/test_deeppot_dpa_pt.cc:166-246
 Timestamp: 2024-10-08T15:32:11.479Z
 Learning: Refactoring between test classes TestInferDeepPotDpaPt and TestInferDeepPotDpaPtNopbc is addressed in PR #3905.
Learnt from: 1azyking
 PR: #4169
 File: deepmd/pt/loss/ener_hess.py:341-348
 Timestamp: 2024-10-08T15:32:11.479Z
 Learning: In deepmd/pt/loss/ener_hess.py, the label uses the key "atom_ener" intentionally to maintain consistency with the forked version.
Learnt from: 1azyking
 PR: #4169
 File: deepmd/pt/loss/ener_hess.py:341-348
 Timestamp: 2024-10-05T03:11:02.922Z
 Learning: In deepmd/pt/loss/ener_hess.py, the label uses the key "atom_ener" intentionally to maintain consistency with the forked version.
deepmd/pd/model/descriptor/repformers.py (2)
Learnt from: njzjz
 PR: #4144
 File: source/api_cc/tests/test_deeppot_dpa_pt.cc:166-246
 Timestamp: 2024-09-19T04:25:12.408Z
 Learning: Refactoring between test classes TestInferDeepPotDpaPt and TestInferDeepPotDpaPtNopbc is addressed in PR #3905.
Learnt from: njzjz
 PR: #4144
 File: source/api_cc/tests/test_deeppot_dpa_pt.cc:166-246
 Timestamp: 2024-10-08T15:32:11.479Z
 Learning: Refactoring between test classes TestInferDeepPotDpaPt and TestInferDeepPotDpaPtNopbc is addressed in PR #3905.
deepmd/pd/train/training.py (2)
Learnt from: njzjz
 PR: #4302
 File: deepmd/pd/infer/inference.py:35-38
 Timestamp: 2024-11-25T07:42:55.735Z
 Learning: In the file deepmd/pd/infer/inference.py, when loading the model checkpoint in the Tester class, it's acceptable to not include additional error handling for loading the model state dictionary.
Learnt from: HydrogenSulfate
 PR: #4414
 File: deepmd/pd/train/training.py:66-66
 Timestamp: 2024-11-29T12:15:22.226Z
 Learning: The function nvprof_context is defined in deepmd/pd/utils/utils.py, so importing it in deepmd/pd/train/training.py is correct.
source/api_cc/include/DeepPotPD.h (2)
Learnt from: njzjz
 PR: #4144
 File: source/api_cc/tests/test_deeppot_dpa_pt.cc:166-246
 Timestamp: 2024-09-19T04:25:12.408Z
 Learning: Refactoring between test classes TestInferDeepPotDpaPt and TestInferDeepPotDpaPtNopbc is addressed in PR #3905.
🧬 Code Graph Analysis (2)
deepmd/pt/model/model/transform_output.py (1)
source/api_cc/include/DeepTensor.h (1)
std(139-338)
deepmd/pd/model/descriptor/dpa2.py (2)
source/api_cc/src/DeepPotPD.cc (1)
mapping (245-245)
source/api_cc/include/DeepPotPD.h (1)
numel (285-293)
🪛 Ruff (0.12.2)
deepmd/pt/train/training.py
761-764: Use contextlib.suppress(TypeError) instead of try-except-pass
Replace with contextlib.suppress(TypeError)
(SIM105)
772-775: Use contextlib.suppress(TypeError) instead of try-except-pass
Replace with contextlib.suppress(TypeError)
(SIM105)
deepmd/pd/train/training.py
769-773: Use contextlib.suppress(Exception) instead of try-except-pass
(SIM105)
781-785: Use contextlib.suppress(Exception) instead of try-except-pass
(SIM105)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (29)
- GitHub Check: Test C++ (true)
- GitHub Check: Test C++ (false)
- GitHub Check: Test Python (6, 3.12)
- GitHub Check: Build C library (2.14, >=2.5.0rc0,<2.15, libdeepmd_c_cu11.tar.gz)
- GitHub Check: Test Python (3, 3.9)
- GitHub Check: Test Python (4, 3.9)
- GitHub Check: Build C library (2.18, libdeepmd_c.tar.gz)
- GitHub Check: Test Python (5, 3.12)
- GitHub Check: Test Python (2, 3.9)
- GitHub Check: Test Python (3, 3.12)
- GitHub Check: Test Python (4, 3.12)
- GitHub Check: Test Python (6, 3.9)
- GitHub Check: Test Python (2, 3.12)
- GitHub Check: Test Python (5, 3.9)
- GitHub Check: Test Python (1, 3.9)
- GitHub Check: Test Python (1, 3.12)
- GitHub Check: Build wheels for cp311-macosx_x86_64
- GitHub Check: Analyze (python)
- GitHub Check: Build C++ (cuda, cuda)
- GitHub Check: Build C++ (rocm, rocm)
- GitHub Check: Build wheels for cp311-manylinux_x86_64
- GitHub Check: Build C++ (clang, clang)
- GitHub Check: Build C++ (cpu, cpu)
- GitHub Check: Build C++ (cuda120, cuda)
- GitHub Check: Analyze (c-cpp)
- GitHub Check: Build wheels for cp311-win_amd64
- GitHub Check: Build wheels for cp311-macosx_arm64
- GitHub Check: Build wheels for cp310-manylinux_aarch64
- GitHub Check: Build wheels for cp311-manylinux_x86_64
🔇 Additional comments (6)
source/CMakeLists.txt (1)
529-531: LGTM! The Paddle backend subdirectory is correctly added following the same pattern as the TensorFlow and PyTorch backends.
deepmd/pd/model/descriptor/dpa2.py (1)
801-802: Improved validation logic. Good improvements to the validation:
- Treating an empty comm_dict the same as None is more robust
- Checking mapping.numel() > 0 ensures the tensor contains elements, not just that it exists
source/api_cc/include/DeepPotPD.h (1)
397-404: Clear documentation of comm_vec structure. The comment clearly documents the purpose and contents of comm_vec. This helps future maintainers understand the communication tensor structure.
deepmd/pd/entrypoints/main.py (1)
391-409: Improved input specifications for static graph conversion. The changes enhance clarity and type safety:
- Explicit naming of inputs ("extended_coord", "extended_atype")
- Detailed specification of comm_dict components with proper types and shapes
This makes the interface more explicit and helps with validation.
source/api_cc/src/DeepPotPD.cc (2)
195-196: Input tensor names aligned with Python interface. The tensor names are correctly updated to match the Python interface changes in deepmd/pd/entrypoints/main.py.
Also applies to: 199-200
228-230: Verify the communicator tensor handling. The conditional copy and casting of lmp_list.world needs verification:
- The cast from void* to int* assumes world contains an integer value
- The condition lmp_list.world > 0 treats a pointer as a boolean/integer
Please verify:
- What type is lmp_list.world and what does it contain?
- Should this be checking if the pointer is non-null instead?
assert len(comm_dict) >= 6
has_spin = len(comm_dict) >= 7
if not has_spin:
    n_padding = nall - nloc
    # node_ebd = paddle.nn.functional.pad(
    #     node_ebd.squeeze(0), [0, 0, 0, n_padding], value=0.0
    # )
    # [nframes, nloc, tebd_dim]
    _shapes = node_ebd.shape[1:]
    _shapes[1] = n_padding
    node_ebd = paddle.concat(
        [
            node_ebd.squeeze(0),
            paddle.zeros(_shapes, dtype=node_ebd.dtype),
        ],
        axis=1,
    )
    real_nloc = nloc
    real_nall = nall
else:
    # for spin
    real_nloc = nloc // 2
    real_nall = nall // 2
    real_n_padding = real_nall - real_nloc
    node_ebd_real, node_ebd_virtual = paddle.split(
        node_ebd, [real_nloc, real_nloc], axis=1
    )
    # mix_node_ebd: nb x real_nloc x (n_dim * 2)
    mix_node_ebd = paddle.concat(
        [node_ebd_real, node_ebd_virtual], axis=2
    )
    # nb x real_nall x (n_dim * 2)
    node_ebd = paddle.nn.functional.pad(
        mix_node_ebd.squeeze(0), (0, 0, 0, real_n_padding), value=0.0
    )

assert len(comm_dict) >= 6
# assert "send_list" in comm_dict
# assert "send_proc" in comm_dict
# assert "recv_proc" in comm_dict
# assert "send_num" in comm_dict
# assert "recv_num" in comm_dict
# assert "communicator" in comm_dict
ret = paddle_ops_deepmd_border_op(
    comm_dict[0],
    comm_dict[1],
    comm_dict[2],
    comm_dict[3],
    comm_dict[4],
    node_ebd,
    comm_dict[5],
    paddle.to_tensor(
        real_nloc,
        dtype=paddle.int32,
        place=paddle.CPUPlace(),
    ),  # should be int of c++, placed on cpu
    paddle.to_tensor(
        real_nall - real_nloc,
        dtype=paddle.int32,
        place=paddle.CPUPlace(),
    ),  # should be int of c++, placed on cpu
)
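To make the shape bookkeeping above concrete, here is a pure-Python sketch of the nloc/nall arithmetic (no paddle involved; the function name is invented for illustration):

```python
def comm_shapes(nloc: int, nall: int, has_spin: bool):
    """Mimic the real_nloc/real_nall bookkeeping from the snippet above."""
    if not has_spin:
        real_nloc, real_nall = nloc, nall
    else:
        # Spin models stack real and virtual atoms along the atom axis,
        # so the effective counts are halved before padding.
        real_nloc, real_nall = nloc // 2, nall // 2
    n_padding = real_nall - real_nloc  # ghost-atom rows to pad before the border op
    return real_nloc, real_nall, n_padding

print(comm_shapes(10, 16, has_spin=False))  # → (10, 16, 6)
print(comm_shapes(10, 16, has_spin=True))   # → (5, 8, 3)
```

In both branches the padding count is real_nall - real_nloc, which is exactly the second CPU int tensor passed to the border op.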
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Replace hardcoded indices with named keys for comm_dict
Similar to repformers.py, this implementation uses hardcoded numeric indices to access comm_dict, which is fragile and error-prone.
Apply the same pattern suggested for repformers.py - use named dictionary keys or a dataclass to make the code more maintainable and self-documenting. This will also make it easier to maintain consistency between the two descriptor implementations.
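As a hedged sketch of that suggestion (the class name BorderCommData and the from_legacy_list helper are illustrative assumptions, not the actual deepmd-kit API), the six entries could be bundled in a small dataclass:

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class BorderCommData:
    """Named container for the six border-communication entries.

    Field names mirror the commented-out assertion keys in the snippet
    above; the class itself is hypothetical.
    """

    send_list: Any
    send_proc: Any
    recv_proc: Any
    send_num: Any
    recv_num: Any
    communicator: Any

    @classmethod
    def from_legacy_list(cls, comm_dict):
        # Adapter for the existing positional-list convention (indices 0-5).
        return cls(*comm_dict[:6])


# Usage with the legacy positional layout:
comm = BorderCommData.from_legacy_list(
    ["send_list", "send_proc", "recv_proc", "send_num", "recv_num", "mpi_comm"]
)
print(comm.communicator)  # → mpi_comm
```

Call sites would then pass comm.send_list, comm.communicator, etc., keeping repflows.py and repformers.py consistent without relying on index positions.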
🤖 Prompt for AI Agents
In deepmd/pd/model/descriptor/repflows.py around lines 596 to 657, the code accesses comm_dict using hardcoded numeric indices, which is fragile and reduces readability. Refactor the code to replace these numeric indices with named dictionary keys or use a dataclass to represent comm_dict, similar to the pattern used in repformers.py. This will improve maintainability and make the code more self-documenting by clearly indicating what each accessed element represents.

has_spin = len(comm_dict) >= 7
if not has_spin:
    n_padding = nall - nloc
    # g1 = paddle.nn.functional.pad(
    #     g1.squeeze(0), (0, 0, 0, n_padding), value=0.0
    # )
    _shapes = g1.shape[1:]
    _shapes[1] = n_padding
    g1 = paddle.concat(
        [g1.squeeze(0), paddle.zeros(_shapes, dtype=g1.dtype)],
        axis=1,
    )
    real_nloc = nloc
    real_nall = nall
else:
    # for spin
    real_nloc = nloc // 2
    real_nall = nall // 2
    real_n_padding = real_nall - real_nloc
    g1_real, g1_virtual = paddle.split(
        g1, [real_nloc, real_nloc], axis=1
    )
    # mix_g1: nb x real_nloc x (ng1 * 2)
    mix_g1 = paddle.concat([g1_real, g1_virtual], axis=2)
    # nb x real_nall x (ng1 * 2)
    g1 = paddle.nn.functional.pad(
        mix_g1.squeeze(0), (0, 0, 0, real_n_padding), value=0.0
    )

# assert "send_list" in comm_dict
# assert "send_proc" in comm_dict
# assert "recv_proc" in comm_dict
# assert "send_num" in comm_dict
# assert "recv_num" in comm_dict
# assert "communicator" in comm_dict
# print(f"g1.shape = ", g1.shape)
ret = paddle_ops_deepmd_border_op(
    comm_dict[0],
    comm_dict[1],
    comm_dict[2],
    comm_dict[3],
    comm_dict[4],
    g1,
    comm_dict[5],
    paddle.to_tensor(
🛠️ Refactor suggestion
Replace hardcoded indices with named keys for comm_dict
The current implementation uses hardcoded numeric indices (0-5) to access comm_dict, which is fragile and error-prone. Additionally, the spin detection logic relies on the length of comm_dict (line 484), which is not robust.
Consider using a dictionary with named keys or a dataclass:
- has_spin = len(comm_dict) >= 7
+ # comm_dict should contain named keys
+ has_spin = "spin_comm" in comm_dict if isinstance(comm_dict, dict) else len(comm_dict) >= 7
  if not has_spin:
      # ... existing code ...
- # assert "send_list" in comm_dict
- # assert "send_proc" in comm_dict
- # assert "recv_proc" in comm_dict
- # assert "send_num" in comm_dict
- # assert "recv_num" in comm_dict
- # assert "communicator" in comm_dict
- # print(f"g1.shape = ", g1.shape)
+ # Use named access if comm_dict is a dict
+ if isinstance(comm_dict, dict):
+     send_list = comm_dict["send_list"]
+     send_proc = comm_dict["send_proc"]
+     recv_proc = comm_dict["recv_proc"]
+     send_num = comm_dict["send_num"]
+     recv_num = comm_dict["recv_num"]
+     communicator = comm_dict["communicator"]
+ else:
+     # Legacy list-based access (deprecated)
+     send_list = comm_dict[0]
+     send_proc = comm_dict[1]
+     recv_proc = comm_dict[2]
+     send_num = comm_dict[3]
+     recv_num = comm_dict[4]
+     communicator = comm_dict[5]
  ret = paddle_ops_deepmd_border_op(
-     comm_dict[0],
-     comm_dict[1],
-     comm_dict[2],
-     comm_dict[3],
-     comm_dict[4],
+     send_list,
+     send_proc,
+     recv_proc,
+     send_num,
+     recv_num,
      g1,
-     comm_dict[5],
+     communicator,

This makes the code more maintainable and self-documenting.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
Suggested replacement for the span above:

# comm_dict should contain named keys
has_spin = "spin_comm" in comm_dict if isinstance(comm_dict, dict) else len(comm_dict) >= 7
if not has_spin:
    n_padding = nall - nloc
    # g1 = paddle.nn.functional.pad(
    #     g1.squeeze(0), (0, 0, 0, n_padding), value=0.0
    # )
    _shapes = g1.shape[1:]
    _shapes[1] = n_padding
    g1 = paddle.concat(
        [g1.squeeze(0), paddle.zeros(_shapes, dtype=g1.dtype)],
        axis=1,
    )
    real_nloc = nloc
    real_nall = nall
else:
    # for spin
    real_nloc = nloc // 2
    real_nall = nall // 2
    real_n_padding = real_nall - real_nloc
    g1_real, g1_virtual = paddle.split(
        g1, [real_nloc, real_nloc], axis=1
    )
    # mix_g1: nb x real_nloc x (ng1 * 2)
    mix_g1 = paddle.concat([g1_real, g1_virtual], axis=2)
    # nb x real_nall x (ng1 * 2)
    g1 = paddle.nn.functional.pad(
        mix_g1.squeeze(0), (0, 0, 0, real_n_padding), value=0.0
    )

# Use named access if comm_dict is a dict
if isinstance(comm_dict, dict):
    send_list = comm_dict["send_list"]
    send_proc = comm_dict["send_proc"]
    recv_proc = comm_dict["recv_proc"]
    send_num = comm_dict["send_num"]
    recv_num = comm_dict["recv_num"]
    communicator = comm_dict["communicator"]
else:
    # Legacy list-based access (deprecated)
    send_list = comm_dict[0]
    send_proc = comm_dict[1]
    recv_proc = comm_dict[2]
    send_num = comm_dict[3]
    recv_num = comm_dict[4]
    communicator = comm_dict[5]
ret = paddle_ops_deepmd_border_op(
    send_list,
    send_proc,
    recv_proc,
    send_num,
    recv_num,
    g1,
    communicator,
    paddle.to_tensor(
🤖 Prompt for AI Agents
In deepmd/pd/model/descriptor/repformers.py around lines 484 to 528, the code accesses comm_dict using hardcoded numeric indices and determines spin presence by checking the length of comm_dict, which is fragile and unclear. Refactor comm_dict to be a dictionary with descriptive named keys or use a dataclass to represent its structure. Replace all numeric index accesses with these named keys to improve code clarity and maintainability, and update the spin detection logic to use explicit keys or flags instead of relying on the length of comm_dict.

    "expand_grad",
    "floor_grad",
    "gather_grad",
    # "gather_grad",
💡 Verification agent
🧩 Analysis chain
Document the reason for removing "gather_grad" from the blacklist.
Please add a comment explaining why "gather_grad" was removed from the blacklist and how this change supports the DPA2/DPA3 inference functionality. This helps future maintainers understand the rationale.
Additionally, please verify that this change improves performance or functionality for the target use case:
- # "gather_grad",
+ # "gather_grad",  # Removed to support DPA2/DPA3 distributed communication operations

🏁 Script executed:
#!/bin/bash
# Description: Search for gather_grad usage in the codebase to understand the impact
# Expected: Find references to gather_grad operations that might benefit from this change

# Search for gather_grad usage patterns
rg -A 3 -B 3 "gather_grad"

# Search for gather operations that might be related
rg -A 2 "\.gather\("

# Look for communication or distributed operations that might use gather
rg -A 2 -B 2 "comm_dict|border_op"

Length of output: 55951
Document removal of “gather_grad” from the eager‐mode blacklist
Please add a brief comment in deepmd/pd/utils/env.py (line ~140) explaining why “gather_grad” was removed—namely, to allow the composite‐op implementation of the gather gradient (instead of the kernel fallback) when running distributed inference under DPA2/DPA3. This will help future maintainers understand that it’s intentional and tied to parallel communication support.
• File: deepmd/pd/utils/env.py
 Line: 140
Proposed diff:
- # "gather_grad",
+ # "gather_grad",  # enabled composite gather_grad to support DPA2/DPA3 distributed inference

Also, please add or update a small test or benchmark to verify that using the composite path for gather_grad in eager mode does not regress performance or correctness in distributed setups.
🤖 Prompt for AI Agents
In deepmd/pd/utils/env.py at line 140, add a brief comment explaining that "gather_grad" was removed from the eager-mode blacklist to enable the composite-op implementation of the gather gradient during distributed inference with DPA2/DPA3, supporting parallel communication. Additionally, create or update a small test or benchmark to verify that using the composite path for gather_grad in eager mode maintains performance and correctness in distributed environments.    source/api_cc/src/DeepPotPD.cc  Outdated    
this->comm_vec.emplace_back(sendlist_tensor);
this->comm_vec.emplace_back(sendproc_tensor);
this->comm_vec.emplace_back(recvproc_tensor);
this->comm_vec.emplace_back(sendnum_tensor);
this->comm_vec.emplace_back(recvnum_tensor);
this->comm_vec.emplace_back(communicator_tensor);
Potential lifetime issue with tensor handles.
The tensor handles obtained from GetInputHandle are being stored in comm_vec, but these handles might not persist after the function scope ends. This could lead to undefined behavior.
Consider storing the tensors differently or ensuring their lifetime matches the usage pattern. You might need to use std::unique_ptr<paddle_infer::Tensor> or ensure the handles remain valid.
🤖 Prompt for AI Agents
In source/api_cc/src/DeepPotPD.cc around lines 237 to 242, the tensor handles obtained from GetInputHandle are stored directly in comm_vec, but these handles may become invalid after the function scope ends, causing undefined behavior. To fix this, change comm_vec to store smart pointers such as std::unique_ptr<paddle_infer::Tensor> to manage the lifetime of the tensor handles properly, or otherwise ensure that the tensor handles remain valid for as long as comm_vec uses them by extending their lifetime appropriately. | if (lmp_list.mapping) { | ||
| std::vector<std::int64_t> mapping(nall_real); | ||
| for (size_t ii = 0; ii < nall_real; ii++) { | ||
| mapping[ii] = lmp_list.mapping[fwd_map[ii]]; | ||
| } | ||
| this->mapping_tensor->Reshape({1, nall_real}); | ||
| this->mapping_tensor->CopyFromCpu(mapping.data()); | 
Fix usage of uninitialized mapping_tensor.
This code uses this->mapping_tensor which was problematically initialized in the header file. Combined with the initialization issue in DeepPotPD.h, this will likely crash.
After fixing the initialization in the header, ensure mapping_tensor is properly initialized before use here:
+ if (!this->mapping_tensor) {
+   this->mapping_tensor = predictor_fl->GetInputHandle("mapping");
+ }
  std::vector<std::int64_t> mapping(nall_real);
  for (size_t ii = 0; ii < nall_real; ii++) {
    mapping[ii] = lmp_list.mapping[fwd_map[ii]];
  }
  this->mapping_tensor->Reshape({1, nall_real});
  this->mapping_tensor->CopyFromCpu(mapping.data());

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
Suggested replacement for the span above:

if (lmp_list.mapping) {
  // Ensure mapping_tensor is initialized before use
  if (!this->mapping_tensor) {
    this->mapping_tensor = predictor_fl->GetInputHandle("mapping");
  }
  std::vector<std::int64_t> mapping(nall_real);
  for (size_t ii = 0; ii < nall_real; ii++) {
    mapping[ii] = lmp_list.mapping[fwd_map[ii]];
  }
  this->mapping_tensor->Reshape({1, nall_real});
  this->mapping_tensor->CopyFromCpu(mapping.data());
}
🤖 Prompt for AI Agents
In source/api_cc/src/DeepPotPD.cc around lines 244 to 250, the code uses this->mapping_tensor without ensuring it is properly initialized, which can cause a crash. After correcting the initialization in DeepPotPD.h, add a check or initialization step before this usage to confirm mapping_tensor is allocated and ready. This may involve allocating the tensor if it is null or resetting it appropriately before calling Reshape and CopyFromCpu.

target_link_libraries(
  deepmd_op_pd
  PRIVATE ${PADDLE_LIB_THIRD_PARTY_PATH}/glog/lib/libglog.a
          ${PADDLE_LIB_THIRD_PARTY_PATH}/gflags/lib/libgflags.a
          ${PADDLE_LIB_THIRD_PARTY_PATH}/xxhash/lib/libxxhash.a
          ${PADDLE_LIB_THIRD_PARTY_PATH}/protobuf/lib/libprotobuf.a
          ${PADDLE_INFER_LIB}
          dl
          pthread)
💡 Verification agent
🧩 Analysis chain
Verify hardcoded library paths exist during build.
The build script assumes specific library paths exist but doesn't validate them. Consider adding file existence checks for critical libraries.
🏁 Script executed:
#!/bin/bash
# Check if similar hardcoded library path patterns exist elsewhere in the codebase
rg -A 5 -B 5 "lib/lib.*\.a" --type cmake

Length of output: 2019
Ensure hardcoded library paths are validated in CMakeLists.txt
The target_link_libraries call in source/op/pd/CMakeLists.txt (lines 67–75) links directly to .a files under ${PADDLE_LIB_THIRD_PARTY_PATH} without any existence checks. If one of these paths is missing or incorrect, the build will fail with an obscure linker error. Add explicit guards—either if(EXISTS …) checks or find_library—to surface a clear, early error.
Files to update:
- source/op/pd/CMakeLists.txt (around lines 67–75)
Suggested diff:
Note that CMake generator expressions cannot test for file existence at configure time, so wrapping each library path in a conditional generator expression inside target_link_libraries is not a workable fix. Instead, resolve each library with find_library and fail fast when it is missing: use find_library(GLOG_LIB NAMES glog PATHS ${PADDLE_LIB_THIRD_PARTY_PATH}/glog/lib NO_DEFAULT_PATH) (and similarly for the others) followed by:
if(NOT GLOG_LIB)
  message(FATAL_ERROR "glog not found in ${PADDLE_LIB_THIRD_PARTY_PATH}/glog/lib")
endif()

This will ensure any missing or mis-configured third-party library is caught immediately.
🤖 Prompt for AI Agents
In source/op/pd/CMakeLists.txt around lines 67 to 75, the target_link_libraries command links directly to static library files without verifying their existence, which can cause unclear linker errors if paths are wrong. To fix this, replace the hardcoded paths with find_library calls for each library (glog, gflags, xxhash, protobuf) specifying the appropriate search paths under ${PADDLE_LIB_THIRD_PARTY_PATH}. After each find_library, add an if(NOT ...) check to produce a clear fatal error message if the library is not found. Then link the found library variables in target_link_libraries instead of the hardcoded paths.

if (sendproc[iswap] != me) {
  if (nrecv) {
    MPI_Irecv(recv_g1, nrecv * tensor_size, mpi_type, recvproc[iswap], 0,
              world, &request);
  }
  if (nsend) {
    MPI_Send(send_g1, nsend * tensor_size, mpi_type, sendproc[iswap], 0,
             world);
  }
  if (nrecv) {
    MPI_Wait(&request, MPI_STATUS_IGNORE);
  }
🛠️ Refactor suggestion
Add error handling for MPI operations
MPI operations can fail but the code doesn't check return values. Consider adding error checking to make the code more robust.
Example for MPI_Send:
 if (nsend) {
-  MPI_Send(send_g1, nsend * tensor_size, mpi_type, sendproc[iswap], 0,
-           world);
+  int mpi_err = MPI_Send(send_g1, nsend * tensor_size, mpi_type,
+                         sendproc[iswap], 0, world);
+  if (mpi_err != MPI_SUCCESS) {
+    PD_THROW("MPI_Send failed with error code: ", mpi_err);
+  }
 }

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
Suggested replacement for the span above:

if (sendproc[iswap] != me) {
  if (nrecv) {
    MPI_Irecv(recv_g1, nrecv * tensor_size, mpi_type, recvproc[iswap], 0,
              world, &request);
  }
  if (nsend) {
    int mpi_err = MPI_Send(send_g1, nsend * tensor_size, mpi_type,
                           sendproc[iswap], 0, world);
    if (mpi_err != MPI_SUCCESS) {
      PD_THROW("MPI_Send failed with error code: ", mpi_err);
    }
  }
  if (nrecv) {
    MPI_Wait(&request, MPI_STATUS_IGNORE);
  }
🤖 Prompt for AI Agents
In source/op/pd/comm.cc around lines 120 to 131, the MPI calls (MPI_Irecv, MPI_Send, MPI_Wait) do not check their return values for errors. Modify the code to capture the return value of each MPI call and add conditional checks to handle errors appropriately, such as logging an error message or aborting the operation if an MPI call fails. This will make the code more robust by detecting and responding to MPI communication failures.    source/op/pd/comm.cc  Outdated    
bool type_flag =
    (sendlist_tensor.dtype() == paddle::DataType::FLOAT64) ? true : false;
Fix dtype check to use the correct tensor
The dtype check is performed on sendlist_tensor which contains integer indices, but should check the floating-point gradient tensor instead.
 bool type_flag =
-    (sendlist_tensor.dtype() == paddle::DataType::FLOAT64) ? true : false;
+    (recv_g1_tensor_grad.dtype() == paddle::DataType::FLOAT64) ? true : false;

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
Suggested replacement for the span above:

bool type_flag =
    (recv_g1_tensor_grad.dtype() == paddle::DataType::FLOAT64) ? true : false;
🤖 Prompt for AI Agents
In source/op/pd/comm.cc around lines 355 to 356, the dtype check is incorrectly performed on sendlist_tensor, which holds integer indices. Update the dtype check to use the floating-point gradient tensor instead to correctly determine if the data type is FLOAT64.
85c8ad5 to 38fb926 Compare
Actionable comments posted: 2
♻️ Duplicate comments (1)
source/op/pd/CMakeLists.txt (1)
67-75: Hardcoded library paths need validation. The build script links directly to hardcoded library paths without validating their existence, which can cause unclear linker errors if paths are incorrect.
As noted in previous reviews, consider adding existence checks for these critical libraries:
find_library(GLOG_LIB NAMES glog PATHS ${PADDLE_INFERENCE_DIR_THIRD_PARTY_PATH}/glog/lib NO_DEFAULT_PATH)
if(NOT GLOG_LIB)
  message(FATAL_ERROR "glog library not found in ${PADDLE_INFERENCE_DIR_THIRD_PARTY_PATH}/glog/lib")
endif()

Apply similar checks for gflags, xxhash, and protobuf libraries.
🧹 Nitpick comments (2)
source/op/pd/CMakeLists.txt (2)
23-23: Replace Chinese comments with English for consistency. Several comments are in Chinese and should be translated to English to maintain codebase consistency.
-# Math lib(默认用MKL)
+# Math lib (default: MKL)

-# link: libdeepmd paddle - 添加 CXX ABI 兼容性处理 if(${OP_CXX_ABI_PT} EQUAL
+# link: libdeepmd paddle - Add CXX ABI compatibility handling if(${OP_CXX_ABI_PT} EQUAL

-# 添加 CXX ABI 编译定义 remove_definitions(-D_GLIBCXX_USE_CXX11_ABI=${OP_CXX_ABI})
+# Add CXX ABI compile definitions remove_definitions(-D_GLIBCXX_USE_CXX11_ABI=${OP_CXX_ABI})

-# 添加 RPATH 设置
+# Add RPATH settings

-# 链接 Paddle 所需库
+# Link required Paddle libraries

Also applies to: 28-36, 65-65
28-36: Clean up commented code. Remove or properly handle the commented-out CXX ABI compatibility code. If this functionality is needed, implement it properly; otherwise, remove the dead code.
-# link: libdeepmd paddle - 添加 CXX ABI 兼容性处理 if(${OP_CXX_ABI_PT} EQUAL
-# ${OP_CXX_ABI}) target_link_libraries(deepmd_op_pd PRIVATE ${LIB_DEEPMD})
-# else() target_link_libraries(deepmd_op_pd PRIVATE ${LIB_DEEPMD}_compat_cxxabi)
-# endif()
-
-# 添加 CXX ABI 编译定义 remove_definitions(-D_GLIBCXX_USE_CXX11_ABI=${OP_CXX_ABI})
-# target_compile_definitions( deepmd_op_pd PUBLIC
-# "$<$<COMPILE_LANGUAGE:CXX>:_GLIBCXX_USE_CXX11_ABI=${OP_CXX_ABI_PT}>")
📜 Review details
Configuration used: CodeRabbit UI
 Review profile: CHILL
 Plan: Pro
📒 Files selected for processing (17)
- deepmd/pd/cxx_op.py (1 hunks)
- deepmd/pd/model/atomic_model/base_atomic_model.py (2 hunks)
- deepmd/pd/model/atomic_model/dp_atomic_model.py (1 hunks)
- deepmd/pd/model/descriptor/dpa1.py (1 hunks)
- deepmd/pd/model/descriptor/dpa2.py (3 hunks)
- deepmd/pd/model/descriptor/dpa3.py (1 hunks)
- deepmd/pd/model/descriptor/repflows.py (8 hunks)
- deepmd/pd/model/descriptor/repformers.py (7 hunks)
- deepmd/pd/model/descriptor/se_a.py (1 hunks)
- deepmd/pd/model/descriptor/se_t_tebd.py (1 hunks)
- deepmd/pd/model/model/ener_model.py (1 hunks)
- deepmd/pd/model/model/make_model.py (1 hunks)
- source/CMakeLists.txt (1 hunks)
- source/api_cc/include/DeepPotPD.h (1 hunks)
- source/api_cc/src/DeepPotPD.cc (2 hunks)
- source/op/pd/CMakeLists.txt (1 hunks)
- source/op/pd/comm.cc (1 hunks)
🧠 Learnings (5)
📓 Common learnings
Learnt from: njzjz
 PR: deepmodeling/deepmd-kit#4144
 File: source/api_cc/tests/test_deeppot_dpa_pt.cc:166-246
 Timestamp: 2024-10-08T15:32:11.479Z
 Learning: Refactoring between test classes `TestInferDeepPotDpaPt` and `TestInferDeepPotDpaPtNopbc` is addressed in PR #3905.
Learnt from: njzjz
 PR: deepmodeling/deepmd-kit#4144
 File: source/api_cc/tests/test_deeppot_dpa_pt.cc:166-246
 Timestamp: 2024-09-19T04:25:12.408Z
 Learning: Refactoring between test classes `TestInferDeepPotDpaPt` and `TestInferDeepPotDpaPtNopbc` is addressed in PR #3905.
Learnt from: njzjz
 PR: deepmodeling/deepmd-kit#4226
 File: deepmd/dpmodel/atomic_model/base_atomic_model.py:202-202
 Timestamp: 2024-10-16T21:49:57.401Z
 Learning: When reviewing PRs, avoid making refactor suggestions that are not directly related to the PR's changes. For example, in `deepmd/dpmodel/atomic_model/base_atomic_model.py`, do not suggest simplifying `for kk in ret_dict.keys()` to `for kk in ret_dict` unless it's relevant to the PR.
deepmd/pd/model/model/ener_model.py (2)
Learnt from: 1azyking
 PR: #4169
 File: deepmd/pt/loss/ener_hess.py:341-348
 Timestamp: 2024-10-08T15:32:11.479Z
 Learning: In deepmd/pt/loss/ener_hess.py, the label uses the key "atom_ener" intentionally to maintain consistency with the forked version.
Learnt from: 1azyking
 PR: #4169
 File: deepmd/pt/loss/ener_hess.py:341-348
 Timestamp: 2024-10-05T03:11:02.922Z
 Learning: In deepmd/pt/loss/ener_hess.py, the label uses the key "atom_ener" intentionally to maintain consistency with the forked version.
source/op/pd/CMakeLists.txt (2)
Learnt from: njzjz
 PR: #3875
 File: doc/model/train-fitting-dos.md:107-107
 Timestamp: 2024-06-13T16:32:13.786Z
 Learning: For code blocks in doc/model/train-fitting-dos.md that display commands, use 'txt' as the language specification as per user njzjz's preference.
Learnt from: njzjz
 PR: #3875
 File: doc/model/train-fitting-dos.md:107-107
 Timestamp: 2024-10-08T15:32:11.479Z
 Learning: For code blocks in doc/model/train-fitting-dos.md that display commands, use 'txt' as the language specification as per user njzjz's preference.
deepmd/pd/model/atomic_model/dp_atomic_model.py (1)
Learnt from: njzjz
 PR: #4226
 File: deepmd/dpmodel/atomic_model/base_atomic_model.py:202-202
 Timestamp: 2024-10-16T21:49:57.401Z
 Learning: When reviewing PRs, avoid making refactor suggestions that are not directly related to the PR's changes. For example, in deepmd/dpmodel/atomic_model/base_atomic_model.py, do not suggest simplifying for kk in ret_dict.keys() to for kk in ret_dict unless it's relevant to the PR.
deepmd/pd/model/atomic_model/base_atomic_model.py (1)
Learnt from: njzjz
 PR: #4226
 File: deepmd/dpmodel/atomic_model/base_atomic_model.py:202-202
 Timestamp: 2024-10-16T21:49:57.401Z
 Learning: When reviewing PRs, avoid making refactor suggestions that are not directly related to the PR's changes. For example, in deepmd/dpmodel/atomic_model/base_atomic_model.py, do not suggest simplifying for kk in ret_dict.keys() to for kk in ret_dict unless it's relevant to the PR.
🧬 Code Graph Analysis (1)
deepmd/pd/model/atomic_model/dp_atomic_model.py (1)
deepmd/pd/model/network/network.py (1)
Tensor(30-33)
✅ Files skipped from review due to trivial changes (1)
- deepmd/pd/model/descriptor/se_t_tebd.py
🚧 Files skipped from review as they are similar to previous changes (7)
- source/CMakeLists.txt
- source/api_cc/include/DeepPotPD.h
- source/api_cc/src/DeepPotPD.cc
- deepmd/pd/model/descriptor/dpa2.py
- deepmd/pd/model/descriptor/repflows.py
- deepmd/pd/model/descriptor/repformers.py
- source/op/pd/comm.cc
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (29)
- GitHub Check: Test Python (5, 3.12)
- GitHub Check: Analyze (c-cpp)
- GitHub Check: Analyze (python)
- GitHub Check: Build C++ (clang, clang)
- GitHub Check: Build C++ (cuda120, cuda)
- GitHub Check: Test Python (6, 3.9)
- GitHub Check: Build C++ (cuda, cuda)
- GitHub Check: Test Python (6, 3.12)
- GitHub Check: Build C++ (rocm, rocm)
- GitHub Check: Build C++ (cpu, cpu)
- GitHub Check: Build C library (2.18, libdeepmd_c.tar.gz)
- GitHub Check: Test Python (4, 3.12)
- GitHub Check: Test Python (3, 3.9)
- GitHub Check: Test Python (1, 3.12)
- GitHub Check: Test C++ (false)
- GitHub Check: Test Python (3, 3.12)
- GitHub Check: Test Python (2, 3.12)
- GitHub Check: Test Python (5, 3.9)
- GitHub Check: Test Python (4, 3.9)
- GitHub Check: Build C library (2.14, >=2.5.0rc0,<2.15, libdeepmd_c_cu11.tar.gz)
- GitHub Check: Test Python (2, 3.9)
- GitHub Check: Test Python (1, 3.9)
- GitHub Check: Test C++ (true)
- GitHub Check: Build wheels for cp311-macosx_arm64
- GitHub Check: Build wheels for cp311-manylinux_x86_64
- GitHub Check: Build wheels for cp311-win_amd64
- GitHub Check: Build wheels for cp311-manylinux_x86_64
- GitHub Check: Build wheels for cp310-manylinux_aarch64
- GitHub Check: Build wheels for cp311-macosx_x86_64
🔇 Additional comments (7)
deepmd/pd/model/model/make_model.py (1)
241-241: LGTM! Type annotation updated for Paddle PIR compatibility. The change from `Optional[dict[str, paddle.Tensor]]` to `Optional[list[paddle.Tensor]]` aligns with the PR objective to work around Paddle PIR's lack of Dict support. This is part of a coordinated update across the codebase to standardize communication tensor handling for distributed inference.
deepmd/pd/model/descriptor/dpa1.py (1)
599-599: LGTM! Consistent type annotation update for distributed communication. The `comm_dict` parameter type change from dictionary to list maintains consistency with the broader codebase refactoring for Paddle PIR compatibility and distributed inference support.
deepmd/pd/model/descriptor/se_a.py (1)
263-263: LGTM! Type annotation consistently updated. The `comm_dict` parameter type change aligns with the coordinated refactoring across descriptor classes to support the new communication tensor interface for distributed inference.
deepmd/pd/model/descriptor/dpa3.py (1)
460-460: LGTM! Type annotation updated consistently with other descriptors. The `comm_dict` parameter type change from dictionary to list maintains consistency across descriptor classes and supports the new distributed communication interface for DPA3 inference.
deepmd/pd/model/model/ener_model.py (1)
102-102: LGTM! Final consistent type annotation update. The `comm_dict` parameter type change completes the coordinated refactoring across model classes, ensuring consistent communication tensor handling for distributed energy model inference.
deepmd/pd/model/atomic_model/base_atomic_model.py (1)
194-194: LGTM: Type annotation changes are consistent with the broader refactor. The changes from `Optional[dict[str, paddle.Tensor]]` to `Optional[list[paddle.Tensor]]` for the `comm_dict` parameter align correctly with the codebase-wide transition to support the new distributed communication approach using the custom Paddle operator.
Also applies to: 277-277
deepmd/pd/model/atomic_model/dp_atomic_model.py (1)
253-253: LGTM: Type annotation change is consistent. The change from `Optional[dict[str, paddle.Tensor]]` to `Optional[list[paddle.Tensor]]` for the `comm_dict` parameter correctly aligns with the broader codebase refactor to support the new communication structure.
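The dict-to-list transition these comments describe can be sketched as below. This is an illustrative assumption, not deepmd-kit's actual convention: the key names and their ordering (`COMM_KEYS`) are hypothetical, chosen only to show why a fixed ordering makes the flat list losslessly convertible back to a dict on the consumer side.

```python
# Hypothetical sketch: Paddle PIR export does not support dict-typed inputs,
# so the communication tensors are passed as a flat list in a fixed, agreed
# order. The key order below is illustrative, not deepmd-kit's real order.
COMM_KEYS = ["send_list", "send_proc", "recv_proc", "send_num", "recv_num", "communicator"]


def comm_dict_to_list(comm_dict: dict) -> list:
    """Flatten the communication dict into a list using the fixed key order."""
    return [comm_dict[k] for k in COMM_KEYS]


def comm_list_to_dict(comm_list: list) -> dict:
    """Recover the dict view on the consumer (e.g. C++ inference) side."""
    return dict(zip(COMM_KEYS, comm_list))


# Round trip: the fixed ordering makes the flattening lossless.
d = {k: i for i, k in enumerate(COMM_KEYS)}
assert comm_list_to_dict(comm_dict_to_list(d)) == d
```

Because both producer and consumer share `COMM_KEYS`, the list carries exactly the same information as the dict while staying within what PIR can trace.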
   deepmd/pd/cxx_op.py  Outdated    
```python
def load_library(module_name: str) -> tuple[bool, ModuleType]:
    """Load OP library and return the module if success.

    Parameters
    ----------
    module_name : str
        Name of the module

    Returns
    -------
    bool
        Whether the library is loaded successfully
    ModuleType
        loaded custom operator module
    """
    current_file = Path(__file__).resolve()
    base_dir = current_file.parents[2]

    module_file = base_dir / "source" / "op" / "pd" / "comm.cc"
    paddle_ops_deepmd = load(
        name="deepmd_op_pd",
        sources=[str(module_file)],
    )
    return True, paddle_ops_deepmd
```
🛠️ Refactor suggestion
Address unused parameter and improve error handling.
The `load_library` function has several issues:
- The `module_name` parameter is unused - the function hardcodes `"deepmd_op_pd"`
- Always returns `True` for success, even though `paddle.utils.cpp_extension.load` could fail
- No error handling for compilation failures

```diff
-def load_library(module_name: str) -> tuple[bool, ModuleType]:
+def load_library(module_name: str) -> tuple[bool, Optional[ModuleType]]:
     """Load OP library and return the module if success.

     Parameters
     ----------
     module_name : str
         Name of the module

     Returns
     -------
     bool
         Whether the library is loaded successfully
-    ModuleType
+    Optional[ModuleType]
         loaded custom operator module
     """
+    try:
         current_file = Path(__file__).resolve()
         base_dir = current_file.parents[2]

         module_file = base_dir / "source" / "op" / "pd" / "comm.cc"
         paddle_ops_deepmd = load(
-            name="deepmd_op_pd",
+            name=module_name,
             sources=[str(module_file)],
         )
         return True, paddle_ops_deepmd
+    except Exception as e:
+        import logging
+        logging.getLogger(__name__).warning(f"Failed to load custom operator {module_name}: {e}")
+        return False, None
```

Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In deepmd/pd/cxx_op.py around lines 14 to 37, the load_library function does not use the module_name parameter, always returns True regardless of success, and lacks error handling for compilation failures. Modify the function to use the module_name parameter dynamically when calling load, wrap the load call in a try-except block to catch exceptions, return False and None if loading fails, and only return True and the module if loading succeeds.

```python
    return True, paddle_ops_deepmd


ENABLE_CUSTOMIZED_OP, paddle_ops_deepmd = load_library("deepmd_op_pd")
```
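The pattern suggested in this review can be shown as a minimal, runnable sketch. Here `importlib.import_module` stands in for `paddle.utils.cpp_extension.load` (which JIT-compiles C++ and needs a full Paddle toolchain); the function name and signature mirror the one under review only for illustration.

```python
import importlib
import logging
from types import ModuleType
from typing import Callable, Optional


def load_library(
    module_name: str,
    loader: Callable[[str], ModuleType],
) -> tuple[bool, Optional[ModuleType]]:
    """Load a custom-operator library, returning (success, module).

    Any exception raised by the loader (e.g. a C++ compilation failure)
    is caught and logged instead of breaking the caller's import.
    """
    try:
        # Pass the requested name through instead of hardcoding it.
        module = loader(module_name)
        return True, module
    except Exception as exc:
        logging.getLogger(__name__).warning(
            "Failed to load custom operator %s: %s", module_name, exc
        )
        return False, None


# Stand-in loader: importlib.import_module plays the role of
# paddle.utils.cpp_extension.load in this sketch.
ok, mod = load_library("math", importlib.import_module)
assert ok and mod is not None

ok, mod = load_library("no_such_extension_xyz", importlib.import_module)
assert (ok, mod) == (False, None)
```

Because the failure is reported through the returned flag rather than an exception, module-level code such as `ENABLE_CUSTOMIZED_OP, paddle_ops_deepmd = load_library(...)` degrades gracefully when the extension cannot be built.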
🛠️ Refactor suggestion
Handle potential loading failures gracefully.
The immediate loading on import could cause import failures if Paddle is not properly configured or if the C++ compilation fails.
```diff
-ENABLE_CUSTOMIZED_OP, paddle_ops_deepmd = load_library("deepmd_op_pd")
+try:
+    ENABLE_CUSTOMIZED_OP, paddle_ops_deepmd = load_library("deepmd_op_pd")
+except Exception:
+    ENABLE_CUSTOMIZED_OP, paddle_ops_deepmd = False, None
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```python
try:
    ENABLE_CUSTOMIZED_OP, paddle_ops_deepmd = load_library("deepmd_op_pd")
except Exception:
    ENABLE_CUSTOMIZED_OP, paddle_ops_deepmd = False, None
```
🤖 Prompt for AI Agents
In deepmd/pd/cxx_op.py at line 40, the code loads the custom Paddle library immediately on import, which can cause import failures if Paddle is not configured correctly or if the C++ compilation fails. Modify this to handle loading errors gracefully by wrapping the load_library call in a try-except block, catching exceptions, and providing a clear error message or fallback behavior to prevent the entire module from failing to import. d1369a7 to 03a09d6   Compare   | config->EnableNewIR(true); | ||
| config->EnableCustomPasses({"add_shadow_output_after_dead_parameter_pass"}, | ||
| true); | ||
| // config->SwitchIrOptim(false); | 
Check notice
Code scanning / CodeQL
Commented-out code Note
| config_fl->EnableNewIR(true); | ||
| config_fl->EnableCustomPasses({"add_shadow_output_after_dead_parameter_pass"}, | ||
| true); | ||
| // config_fl->SwitchIrOptim(false); | 
Check notice
Code scanning / CodeQL
Commented-out code Note
Support DPA2/DPA3 C++ inference with LAMMPS
Note
We use a workaround: `Dict` is not supported in Paddle PIR and inference, so it is replaced by `List` when using `comm_dict`.

Other related work:
Summary by CodeRabbit
New Features
Bug Fixes
Refactor
Chores