feat(dpmodel): add NEP descriptor#5597
Conversation
for more information, see https://pre-commit.ci
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Repository UI Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (9)
🚧 Files skipped from review as they are similar to previous changes (9)
📝 WalkthroughWalkthroughAdds NEP descriptor support across dpmodel, JAX, and PyTorch backends, plus schema registration, a GPUMD ChangesNEP Descriptor Feature
Estimated code review effort🎯 5 (Critical) | ⏱️ ~120 minutes Suggested labels
Suggested reviewers
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✨ Finishing Touches🧪 Generate unit tests (beta)
Warning There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure. 🔧 OpenGrep (1.23.0)deepmd/jax/descriptor/nep.py┌──────────────┐ �[32m✔�[39m �[1mOpengrep OSS�[0m [00.13][ERROR]: unable to find a config; path source/tests/common/dpmodel/test_descriptor_nep.py┌──────────────┐ �[32m✔�[39m �[1mOpengrep OSS�[0m [00.20][ERROR]: unable to find a config; path deepmd/pt_expt/descriptor/nep_coeff.py┌──────────────┐ �[32m✔�[39m �[1mOpengrep OSS�[0m [00.21][ERROR]: unable to find a config; path
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 5
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@deepmd/dpmodel/descriptor/nep.py`:
- Around line 509-510: Reject non-positive cutoff values in the NEP descriptor
initialization before they are stored. In the cutoff validation logic around the
rcut_radial and rcut_angular checks in nep.py, add explicit strict-positivity
validation for both values so rcut_radial=0.0 and rcut_angular=0.0 are rejected
early, preventing _nep_cutoff() from dividing by zero on the first forward pass.
- Around line 513-518: The validation in nep.py currently allows 4-body terms
even when l_max is below 2, which later causes the harmonic indexing in
descrpt_nep_args() to fail at runtime. Add an upfront guard in the parameter
validation near l_max_4body/l_max_5body checks that rejects any configuration
with l_max_4body enabled unless l_max >= 2, and make sure the default
l_max_4body path in descrpt_nep_args() is covered by this check. Keep the
existing constraints around l_max_4body, l_max_5body, and the 5-body invariant
consistent with the new l_max requirement.
In `@deepmd/tools/nep_txt.py`:
- Around line 264-265: The backend export path in serialize_hook is being called
with input_file even though Backend.detect_backend_by_model(input_file) already
returns a callable hook; update the logic in nep_txt.py so serialize_hook is
invoked with no filename argument and its return value is assigned to model. Use
the existing Backend.detect_backend_by_model and serialize_hook symbols to
locate the change, and keep the non-.pt export flow for
.pte/.pt2/.jax/.savedmodel consistent with the backend contract.
In `@doc/model/nep.md`:
- Around line 146-148: The storage-scaling claim in the NEP documentation is
incorrect because the dense table used by the expansion coefficients still
scales as O(ntypes ** 2); update the text in the section describing the
`(ntypes, ntypes, n_desc, k_max)` tensor to say memory/storage is quadratic in
`ntypes`, while only the forward/runtime access pattern is neighbor-driven. Keep
the wording aligned with the surrounding explanation of the expansion
coefficients and their use in the backends.
In `@source/tests/consistent/descriptor/test_nep.py`:
- Around line 67-69: The skip_dp property is returning a missing class attribute
from CommonTest, which causes an AttributeError when the skip logic checks it.
Update the skip_dp property in test_nep.py to return an actual boolean value for
this test class instead of referencing CommonTest.skip_dp, and make sure the
value matches the intended skip behavior for the NEP descriptor tests.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 448e5088-af3b-4b12-b64f-5f55831263d0
📒 Files selected for processing (16)
deepmd/dpmodel/descriptor/__init__.pydeepmd/dpmodel/descriptor/nep.pydeepmd/jax/descriptor/__init__.pydeepmd/jax/descriptor/nep.pydeepmd/pt_expt/descriptor/__init__.pydeepmd/pt_expt/descriptor/nep.pydeepmd/pt_expt/descriptor/nep_coeff.pydeepmd/tools/__init__.pydeepmd/tools/nep_txt.pydeepmd/utils/argcheck.pydoc/model/index.rstdoc/model/nep.mdexamples/water/nep/input.jsonsource/tests/common/dpmodel/test_descriptor_nep.pysource/tests/consistent/descriptor/test_nep.pysource/tests/pt_expt/descriptor/test_nep.py
| if rcut_angular > rcut_radial: | ||
| raise ValueError("rcut_angular must not exceed rcut_radial") |
There was a problem hiding this comment.
🩺 Stability & Availability | 🟠 Major | ⚡ Quick win
Reject non-positive cutoffs up front.
These checks only compare the two cutoffs against each other. rcut_radial=0.0 and rcut_angular=0.0 still pass here, then _nep_cutoff() divides by zero on Line 192 during the first forward pass. Please require both cutoffs to be strictly positive before storing them.
Suggested fix
if sel is None:
raise ValueError("sel must be provided for the NEP descriptor")
+ if rcut_radial <= 0.0 or rcut_angular <= 0.0:
+ raise ValueError("rcut_radial and rcut_angular must be positive")
if rcut_angular > rcut_radial:
raise ValueError("rcut_angular must not exceed rcut_radial")📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| if rcut_angular > rcut_radial: | |
| raise ValueError("rcut_angular must not exceed rcut_radial") | |
| if sel is None: | |
| raise ValueError("sel must be provided for the NEP descriptor") | |
| if rcut_radial <= 0.0 or rcut_angular <= 0.0: | |
| raise ValueError("rcut_radial and rcut_angular must be positive") | |
| if rcut_angular > rcut_radial: | |
| raise ValueError("rcut_angular must not exceed rcut_radial") |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@deepmd/dpmodel/descriptor/nep.py` around lines 509 - 510, Reject non-positive
cutoff values in the NEP descriptor initialization before they are stored. In
the cutoff validation logic around the rcut_radial and rcut_angular checks in
nep.py, add explicit strict-positivity validation for both values so
rcut_radial=0.0 and rcut_angular=0.0 are rejected early, preventing
_nep_cutoff() from dividing by zero on the first forward pass.
| if l_max_4body not in (0, 2): | ||
| raise ValueError("l_max_4body must be 0 or 2") | ||
| if l_max_5body not in (0, 1): | ||
| raise ValueError("l_max_5body must be 0 or 1") | ||
| if l_max_5body == 1 and l_max_4body != 2: | ||
| raise ValueError("the 5-body invariant requires l_max_4body == 2") |
There was a problem hiding this comment.
🩺 Stability & Availability | 🟠 Major | ⚡ Quick win
Disallow 4-body invariants when l_max < 2.
l_max_4body=2 currently passes validation even when l_max=1. Line 875 then indexes the missing L=2 harmonic block (sum_s[:, :, 3:8]), so a config as small as {l_max: 1} crashes at runtime because descrpt_nep_args() defaults l_max_4body to 2. Add an upfront guard that requires l_max >= 2 whenever 4-body terms are enabled.
Suggested fix
if l_max < 1 or l_max > 8:
raise ValueError("l_max must be in [1, 8]")
if l_max_4body not in (0, 2):
raise ValueError("l_max_4body must be 0 or 2")
+ if l_max_4body == 2 and l_max < 2:
+ raise ValueError("l_max must be at least 2 when l_max_4body == 2")
if l_max_5body not in (0, 1):
raise ValueError("l_max_5body must be 0 or 1")📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| if l_max_4body not in (0, 2): | |
| raise ValueError("l_max_4body must be 0 or 2") | |
| if l_max_5body not in (0, 1): | |
| raise ValueError("l_max_5body must be 0 or 1") | |
| if l_max_5body == 1 and l_max_4body != 2: | |
| raise ValueError("the 5-body invariant requires l_max_4body == 2") | |
| if l_max < 1 or l_max > 8: | |
| raise ValueError("l_max must be in [1, 8]") | |
| if l_max_4body not in (0, 2): | |
| raise ValueError("l_max_4body must be 0 or 2") | |
| if l_max_4body == 2 and l_max < 2: | |
| raise ValueError("l_max must be at least 2 when l_max_4body == 2") | |
| if l_max_5body not in (0, 1): | |
| raise ValueError("l_max_5body must be 0 or 1") | |
| if l_max_5body == 1 and l_max_4body != 2: | |
| raise ValueError("the 5-body invariant requires l_max_4body == 2") |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@deepmd/dpmodel/descriptor/nep.py` around lines 513 - 518, The validation in
nep.py currently allows 4-body terms even when l_max is below 2, which later
causes the harmonic indexing in descrpt_nep_args() to fail at runtime. Add an
upfront guard in the parameter validation near l_max_4body/l_max_5body checks
that rejects any configuration with l_max_4body enabled unless l_max >= 2, and
make sure the default l_max_4body path in descrpt_nep_args() is covered by this
check. Keep the existing constraints around l_max_4body, l_max_5body, and the
5-body invariant consistent with the new l_max requirement.
| backend = Backend.detect_backend_by_model(input_file)() | ||
| model = backend.serialize_hook(input_file) |
There was a problem hiding this comment.
🎯 Functional Correctness | 🔴 Critical | ⚡ Quick win
Call the backend serialize hook, don't pass the filename to the method.
Line 265 treats serialize_hook as though it accepts input_file, but the backend contract returns a callable. As written, the advertised non-.pt paths (.pte/.pt2/.jax/.savedmodel) will fail with TypeError before any export happens.
Proposed fix
- model = backend.serialize_hook(input_file)
+ model = backend.serialize_hook()(input_file)📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| backend = Backend.detect_backend_by_model(input_file)() | |
| model = backend.serialize_hook(input_file) | |
| backend = Backend.detect_backend_by_model(input_file)() | |
| model = backend.serialize_hook()(input_file) |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@deepmd/tools/nep_txt.py` around lines 264 - 265, The backend export path in
serialize_hook is being called with input_file even though
Backend.detect_backend_by_model(input_file) already returns a callable hook;
update the logic in nep_txt.py so serialize_hook is invoked with no filename
argument and its return value is assigned to model. Use the existing
Backend.detect_backend_by_model and serialize_hook symbols to locate the change,
and keep the non-.pt export flow for .pte/.pt2/.jax/.savedmodel consistent with
the backend contract.
| backends. The expansion coefficients are stored as a single dense tensor of | ||
| shape `(ntypes, ntypes, n_desc, k_max)`, so both the memory footprint and the | ||
| forward cost follow the neighbor count rather than `ntypes ** 2`. |
There was a problem hiding this comment.
🚀 Performance & Scalability | 🟡 Minor | ⚡ Quick win
Fix the storage-scaling claim.
Lines 146-148 say the dense (ntypes, ntypes, n_desc, k_max) table makes memory follow neighbor count, but that storage is still O(ntypes ** 2). Only the runtime access pattern is neighbor-driven.
Suggested wording
-The expansion coefficients are stored as a single dense tensor of
-shape `(ntypes, ntypes, n_desc, k_max)`, so both the memory footprint and the
-forward cost follow the neighbor count rather than `ntypes ** 2`.
+The expansion coefficients are stored as a single dense tensor of
+shape `(ntypes, ntypes, n_desc, k_max)`. This keeps the forward cost tied to the
+actual neighbor pairs visited at runtime, but the coefficient storage itself
+still scales as `ntypes ** 2`.📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| backends. The expansion coefficients are stored as a single dense tensor of | |
| shape `(ntypes, ntypes, n_desc, k_max)`, so both the memory footprint and the | |
| forward cost follow the neighbor count rather than `ntypes ** 2`. | |
| The expansion coefficients are stored as a single dense tensor of | |
| shape `(ntypes, ntypes, n_desc, k_max)`. This keeps the forward cost tied to the | |
| actual neighbor pairs visited at runtime, but the coefficient storage itself | |
| still scales as `ntypes ** 2`. |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@doc/model/nep.md` around lines 146 - 148, The storage-scaling claim in the
NEP documentation is incorrect because the dense table used by the expansion
coefficients still scales as O(ntypes ** 2); update the text in the section
describing the `(ntypes, ntypes, n_desc, k_max)` tensor to say memory/storage is
quadratic in `ntypes`, while only the forward/runtime access pattern is
neighbor-driven. Keep the wording aligned with the surrounding explanation of
the expansion coefficients and their use in the backends.
| @property | ||
| def skip_dp(self) -> bool: | ||
| return CommonTest.skip_dp |
There was a problem hiding this comment.
🎯 Functional Correctness | 🟠 Major | ⚡ Quick win
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "== Locate CommonTest and skip_dp definitions =="
rg -nC2 'class CommonTest|def skip_dp|skip_dp\s*=' \
source/tests/consistent/descriptor/common.py \
source/tests/consistent/descriptor/test_nep.py
echo
echo "== Show the relevant CommonTest section =="
sed -n '1,220p' source/tests/consistent/descriptor/common.py | nl -ba | sed -n '1,220p'
echo
echo "== Show the local override =="
sed -n '60,80p' source/tests/consistent/descriptor/test_nep.py | nl -baRepository: deepmodeling/deepmd-kit
Length of output: 625
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "== Find CommonTest and skip_dp declarations =="
fd -a 'common.py' source/tests/consistent/descriptor
rg -nC3 'class CommonTest|skip_dp\s*[:=]|`@property`\s*$|def skip_dp' source/tests/consistent/descriptor/common.py source/tests/consistent/descriptor/test_nep.py
echo
echo "== Read the relevant CommonTest file section with line numbers =="
python3 - <<'PY'
from pathlib import Path
path = Path("source/tests/consistent/descriptor/common.py")
for i, line in enumerate(path.read_text().splitlines(), 1):
if 1 <= i <= 260:
print(f"{i:4d}: {line}")
PY
echo
echo "== Read the local override with line numbers =="
python3 - <<'PY'
from pathlib import Path
path = Path("source/tests/consistent/descriptor/test_nep.py")
for i, line in enumerate(path.read_text().splitlines(), 1):
if 55 <= i <= 80:
print(f"{i:4d}: {line}")
PYRepository: deepmodeling/deepmd-kit
Length of output: 16257
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "== Search for any skip_dp definition or override across the descriptor tests =="
rg -nC2 'skip_dp\s*[:=]|def\s+skip_dp|`@property`\s*$' source/tests/consistent/descriptor
echo
echo "== Inspect the surrounding CommonTest class around skip flags =="
python3 - <<'PY'
from pathlib import Path
path = Path("source/tests/consistent/descriptor/common.py")
for i, line in enumerate(path.read_text().splitlines(), 1):
if 330 <= i <= 380:
print(f"{i:4d}: {line}")
PYRepository: deepmodeling/deepmd-kit
Length of output: 44186
Return a real boolean here instead of CommonTest.skip_dp. CommonTest does not define skip_dp, so this property raises AttributeError when the skip logic reads it.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@source/tests/consistent/descriptor/test_nep.py` around lines 67 - 69, The
skip_dp property is returning a missing class attribute from CommonTest, which
causes an AttributeError when the skip logic checks it. Update the skip_dp
property in test_nep.py to return an actual boolean value for this test class
instead of referencing CommonTest.skip_dp, and make sure the value matches the
intended skip behavior for the NEP descriptor tests.
| def call( | ||
| self, | ||
| coord_ext: Array, | ||
| atype_ext: Array, | ||
| nlist: Array, | ||
| mapping: Array | None = None, | ||
| fparam: Array | None = None, | ||
| comm_dict: dict | None = None, | ||
| charge_spin: Array | None = None, | ||
| ) -> tuple[Array, None, None, None, Array]: |
| size=(ntypes, ntypes, n_desc, k_max), | ||
| ).astype(prec) | ||
|
|
||
| def call(self, fn: Array, pair_index: Array) -> Array: |
| ClassVar, | ||
| ) | ||
|
|
||
| import deepmd.jax.utils.exclude_mask as _jax_exclude_mask # noqa: F401 |
| def __call__(self, *args: Any, **kwargs: Any) -> Any: | ||
| return torch.nn.Module.__call__(self, *args, **kwargs) | ||
|
|
||
| def __setattr__(self, name: str, value: Any) -> None: |
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #5597 +/- ##
==========================================
- Coverage 82.29% 82.25% -0.05%
==========================================
Files 887 900 +13
Lines 100452 101354 +902
Branches 4056 4056
==========================================
+ Hits 82670 83369 +699
- Misses 16318 16520 +202
- Partials 1464 1465 +1 ☔ View full report in Codecov by Harness. 🚀 New features to boost your workflow:
|
Summary by CodeRabbit
nep.txt.