Add minimal ORCA external optimizer example (closes #228) #262

Open
EricBoittier wants to merge 2 commits into metatensor:main from EricBoittier:orca-external-tools

@EricBoittier


Closes #228.

This is a simple proof-of-concept example, not intended to be pushed directly. Below are the versions of the packages I used, for reference.
Using TorchSim as a persistent server/client to support OMP-capable routines like GOAT would be a nice add-on, but perhaps a separate package like metatomic_orca would be the way to go.

Summary

Adds a minimal ORCA external-tool integration under python/examples/orca/:

  • metatomic-orca-external — standalone wrapper
  • metatomic-orca-server / metatomic-orca-client — persistent server/client
  • orca_common.py — shared extinp/engrad protocol and MetatomicCalculator evaluation
  • water_opt/ — water geometry optimization input template
  • test_protocol.py — smoke tests (no ORCA binary required)

Example-only; no new pip package or engines documentation page.
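
The extinp/engrad exchange listed above can be sketched as follows. This is a minimal illustration, not the contents of orca_common.py: the line order of the input file (XYZ file name, charge, multiplicity, NCores, gradient flag, optional point-charge file) and the layout of the .engrad file follow my reading of the ORCA 6 manual, and parse_extinp / format_engrad are hypothetical names.

```python
def parse_extinp(text):
    """Parse the text of an ORCA external-tool input file (assumed layout).

    Assumption: one value per line, optionally followed by a '#' comment,
    in the order xyz file, charge, multiplicity, NCores, do-gradient flag,
    and an optional point-charge file name.
    """
    values = [line.split("#", 1)[0].strip() for line in text.splitlines()]
    values = [value for value in values if value]
    if len(values) < 5:
        raise ValueError("expected at least 5 entries in the extinp file")
    return {
        "xyz_file": values[0],
        "charge": int(values[1]),
        "multiplicity": int(values[2]),
        "ncores": int(values[3]),
        "do_gradient": bool(int(values[4])),
        "pointcharges_file": values[5] if len(values) > 5 else None,
    }


def format_engrad(energy, gradient):
    """Format an .engrad file body (assumed layout).

    energy: total energy in Eh.
    gradient: flat sequence of 3 * n_atoms values in Eh/bohr.
    """
    if len(gradient) % 3 != 0:
        raise ValueError("gradient length must be a multiple of 3")
    n_atoms = len(gradient) // 3
    lines = [
        "#", "# Number of atoms", "#", str(n_atoms),
        "#", "# Total energy [Eh]", "#", f"{energy:.12f}",
        "#", "# Gradient [Eh/bohr]: Atom1X, Atom1Y, Atom1Z, Atom2X, ...", "#",
    ]
    lines += [f"{component:.12f}" for component in gradient]
    return "\n".join(lines) + "\n"
```

ORCA expects the energy in Eh and the gradient (not the force) in Eh/bohr, so a real wrapper has to convert from whatever units the model returns and flip the sign of the forces.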

Test plan

  • pytest python/examples/orca/test_protocol.py
  • Manual ORCA 6 ExtOpt Opt on water (local)

Contributor checklist

Reviewer checklist

  • CHANGELOG — N/A (no public API changes)

(metatomic-torch) boittier@gpu09:~/metatomic/python/examples/orca/water_opt$ pip list
Package Version


annotated-doc 0.0.4
annotated-types 0.7.0
antlr4-python3-runtime 4.9.3
anyio 4.13.0
ase 3.28.0
asttokens 3.0.1
attrs 26.1.0
blosc2 4.5.0
cachetools 7.1.4
certifi 2026.5.20
charset-normalizer 3.4.7
chemiscope 1.0.4
click 8.4.1
cmake 4.3.2
colorama 0.4.6
comm 0.2.3
contourpy 1.3.3
coverage 7.14.1
cuda-bindings 13.3.1
cuda-pathfinder 1.5.5
cuda-toolkit 13.0.2
cycler 0.12.1
decorator 5.3.1
distlib 0.4.3
executing 2.2.1
filelock 3.29.4
fonttools 4.63.0
fsspec 2026.4.0
h11 0.16.0
h5py 3.16.0
hf-xet 1.5.1
httpcore 1.0.9
httpx 0.28.1
huggingface_hub 1.19.0
idna 3.18
iniconfig 2.3.0
ipython 9.14.1
ipython_pygments_lexers 1.1.1
ipywidgets 8.1.8
jedi 0.20.0
Jinja2 3.1.6
jsonschema 4.26.0
jsonschema-specifications 2025.9.1
jupyterlab_widgets 3.0.16
kiwisolver 1.5.0
linkify-it-py 2.1.0
markdown-it-py 4.2.0
MarkupSafe 3.0.3
matplotlib 3.11.0
matplotlib-inline 0.2.2
mdit-py-plugins 0.6.1
mdurl 0.1.2
metatensor-core 0.2.0
metatensor-learn 0.4.0
metatensor-operations 0.5.0
metatensor-torch 0.9.1
metatomic-ase 0.2.0.dev1105+git.70749a8
metatomic-torch 0.2.0.dev1105+git.70749a8
metatrain 2026.2.1
mpmath 1.3.0
msgpack 1.2.0
ndindex 1.10.1
networkx 3.6.1
numexpr 2.14.1
numpy 2.4.6
nvalchemi-toolkit-ops 0.3.1
nvidia-cublas 13.1.1.3
nvidia-cublas-cu12 12.6.4.1
nvidia-cuda-cupti 13.0.85
nvidia-cuda-cupti-cu12 12.6.80
nvidia-cuda-nvrtc 13.0.88
nvidia-cuda-nvrtc-cu12 12.6.85
nvidia-cuda-runtime 13.0.96
nvidia-cuda-runtime-cu12 12.6.77
nvidia-cudnn-cu12 9.10.2.21
nvidia-cudnn-cu13 9.20.0.48
nvidia-cufft 12.0.0.61
nvidia-cufft-cu12 11.3.0.4
nvidia-cufile 1.15.1.6
nvidia-cufile-cu12 1.11.1.6
nvidia-curand 10.4.0.35
nvidia-curand-cu12 10.3.7.77
nvidia-cusolver 12.0.4.66
nvidia-cusolver-cu12 11.7.1.2
nvidia-cusparse 12.6.3.3
nvidia-cusparse-cu12 12.5.4.2
nvidia-cusparselt-cu12 0.7.1
nvidia-cusparselt-cu13 0.8.1
nvidia-nccl-cu12 2.29.3
nvidia-nccl-cu13 2.29.7
nvidia-nvjitlink 13.0.88
nvidia-nvjitlink-cu12 12.6.85
nvidia-nvshmem-cu12 3.4.5
nvidia-nvshmem-cu13 3.4.5
nvidia-nvtx 13.0.85
nvidia-nvtx-cu12 12.6.77
omegaconf 2.3.1
packaging 26.2
parso 0.8.7
pexpect 4.9.0
pillow 12.2.0
pip 26.1.2
platformdirs 4.10.0
plotext 5.3.2
pluggy 1.6.0
prompt_toolkit 3.0.52
psutil 7.2.2
ptyprocess 0.7.0
pure_eval 0.2.3
py-cpuinfo 9.0.0
pydantic 2.13.4
pydantic_core 2.46.4
Pygments 2.20.0
pyparsing 3.3.2
pyproject-api 1.10.1
pytest 9.1.0
pytest-cov 7.1.0
python-dateutil 2.9.0.post0
python-discovery 1.4.2
python_hostlist 2.3.0
PyYAML 6.0.3
referencing 0.37.0
requests 2.34.2
rich 15.0.0
rpds-py 2026.5.1
ruff 0.15.17
scipy 1.17.1
setuptools 81.0.0
setuptools-scm 10.0.5
shellingham 1.5.4
six 1.17.0
stack-data 0.6.3
sympy 1.14.0
tables 3.11.1
textual 8.2.7
textual-plotext 1.0.1
threadpoolctl 3.6.0
tomli_w 1.2.0
torch 2.12.0
torch-sim-atomistic 0.6.0
tox 4.55.1
tqdm 4.68.2
traitlets 5.15.1
triton 3.7.0
typer 0.25.1
typing_extensions 4.15.0
typing-inspection 0.4.2
uc-micro-py 2.0.0
urllib3 2.7.0
vcs-versioning 1.1.1
vesin 0.5.8
vesin-torch 0.5.8
virtualenv 21.5.0
warp-lang 1.14.0
wcwidth 0.8.1
widgetsnbextension 4.0.15

Provide example scripts that implement the ORCA extinp/engrad file
protocol and evaluate energies/gradients via MetatomicCalculator,
including a persistent server/client setup for repeated calls.

Closes metatensor#228.

Co-authored-by: Cursor <cursoragent@cursor.com>
@bananenpampe


Hey Eric,

Thanks a lot for your contribution. Could you quickly let me know the minimum ORCA version required to make this example work?
On the current cluster I have ORCA 6.1.0 installed.

Best regards,
Matthias

@EricBoittier

Author

Sure thing, it was version 6.0.1 (the version on our cluster).
Please let me know if you have any issues following ~/metatomic/python/examples/orca/README.rst.
I'd be happy to help you get it running.

@bananenpampe


Okay, cool, then I will try to test it!

I think that before merging we should add some instructions on how to match ORCA PAL/NCores with PyTorch CPU threading (or maybe even how it could use a GPU). The wrapper already parses NCores from the extinp file, but that value is not currently used to configure Metatomic/PyTorch threading. Maybe a README section could document PAL, nprocs_group and OMP_NUM_THREADS/MKL_NUM_THREADS, so that resources are handled correctly.

@EricBoittier

EricBoittier commented Jun 15, 2026

Author

Sweet!

Yes, that's a good idea. I will have to brush up on Metatomic's and PyTorch's threading to see if they can play nicely together with ORCA.

With what I assume are the defaults (OMP_NUM_THREADS/MKL_NUM_THREADS=1), this seemed to work:

(metatomic-torch) boittier@gpu09:~/metatomic/python/examples/orca$ python metatomic-orca-server --model /mmhome/boittier/home/metatomic_checkpoints/tests/model-md.pt --extensions-directory /mmhome/boittier/home/metatomic_checkpoints/tests/extensions --device cuda --warmup

I tested with Metatomic (version above) on an NVIDIA GeForce RTX 5090 and checked nvidia-smi during the run; it looked fine:
External energy and gradient ... 1.567 sec (= 0.026 min) 96.9 %
ORCA TERMINATED NORMALLY
TOTAL RUN TIME: 0 days 0 hours 0 minutes 2 seconds 214 msec

I have now also run it with CUDA_VISIBLE_DEVICES="" and without the --device cuda flag:
External energy and gradient ... 1.058 sec (= 0.018 min) 97.9 %
ORCA TERMINATED NORMALLY
TOTAL RUN TIME: 0 days 0 hours 0 minutes 1 seconds 339 msec

The run times are a bit odd (the CPU run is faster here); we probably need to go to bigger systems to see a speedup on GPU.

@EricBoittier

Author

I put an example script for GOAT with multiple threads below.
I've added automatic threading configuration: each evaluation now sets PyTorch and BLAS/OpenMP threads from the NCores value ORCA writes to extinp. There's also a new README section on matching PAL / nprocs_group with PyTorch CPU threading, plus notes on GPU via --device cuda on the server.

For multi-image runs (NEB/GOAT), nprocs_group determines NCores per external call — the README recommends one server per GPU/NCores combo and avoiding oversubscription when several wrappers run in parallel.

Set METATOMIC_DISABLE_THREADING_CONFIG=1 if you prefer manual OMP_NUM_THREADS control.
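
As a rough illustration of the behaviour described above, a per-evaluation threading setup could look like the sketch below. It is not the code in this PR: configure_threads and the exact set of environment variables are assumptions, and only the METATOMIC_DISABLE_THREADING_CONFIG opt-out and the use of NCores come from the description. torch.set_num_threads is called only if PyTorch is importable.

```python
import os

# Assumed set of variables; the wrapper in the PR may set a different list.
THREAD_ENV_VARS = ("OMP_NUM_THREADS", "MKL_NUM_THREADS", "OPENBLAS_NUM_THREADS")


def configure_threads(ncores, environ=None):
    """Set thread counts from the NCores value read from extinp.

    Returns the thread count that was applied, or None when the user
    opted out with METATOMIC_DISABLE_THREADING_CONFIG=1.
    """
    environ = os.environ if environ is None else environ
    if environ.get("METATOMIC_DISABLE_THREADING_CONFIG") == "1":
        return None

    ncores = max(1, int(ncores))
    for name in THREAD_ENV_VARS:
        environ[name] = str(ncores)

    try:
        import torch
    except ImportError:
        # PyTorch is not installed; only the environment was updated.
        return ncores

    # Intra-op parallelism used by PyTorch CPU kernels.
    torch.set_num_threads(ncores)
    return ncores
```

Environment variables such as OMP_NUM_THREADS are generally read when the OpenMP/BLAS runtime starts, so in a long-running server the torch.set_num_threads call is the part most likely to take effect after startup.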


! ExtOpt GOAT PAL4

%maxcore 4000

%pal
nprocs 4
end

%geom
MaxIter 50000
Convergence tight
end

%method
ProgExt "/mmhome/boittier/home/metatomic/python/examples/orca/metatomic-orca-client"
Ext_Params "-b 127.0.0.1:8888"
end

* xyzfile 0 1 waterdimer.xyz

https://github.com/peverati/ACCDB/blob/master/Geometries/02_water-dimer_1p0_dim_A21x12.xyz
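
The ProgExt/Ext_Params pair above hands every external call to a thin client that forwards the work to an already running server, so the model is loaded only once. A minimal sketch of that pattern, assuming a newline-delimited JSON exchange over TCP, is shown below. The wire format, fake_evaluate, start_server and call_server are all hypothetical and are not taken from metatomic-orca-server or metatomic-orca-client.

```python
import json
import socket
import socketserver
import threading


def fake_evaluate(request):
    """Stand-in for the expensive model call (hypothetical)."""
    n_atoms = len(request["positions"])
    return {"energy": 0.0, "gradient": [0.0] * (3 * n_atoms)}


class Handler(socketserver.StreamRequestHandler):
    def handle(self):
        # One JSON request per connection, terminated by a newline.
        request = json.loads(self.rfile.readline())
        reply = self.server.evaluate(request)
        self.wfile.write((json.dumps(reply) + "\n").encode())


def start_server(host="127.0.0.1", port=0, evaluate=fake_evaluate):
    """Start the server in a background thread; port=0 picks a free port."""
    server = socketserver.ThreadingTCPServer((host, port), Handler)
    server.daemon_threads = True
    # The evaluate callable lives for the server's lifetime, which is where
    # a real server would keep the loaded model.
    server.evaluate = evaluate
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def call_server(address, request, timeout=30.0):
    """Client side: send one request and wait for the reply."""
    with socket.create_connection(address, timeout=timeout) as sock:
        sock.sendall((json.dumps(request) + "\n").encode())
        with sock.makefile("r") as stream:
            return json.loads(stream.readline())


if __name__ == "__main__":
    server = start_server()
    positions = [[0.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
    print(call_server(server.server_address, {"positions": positions}))
    server.shutdown()
    server.server_close()
```

With several ORCA workers calling in parallel, each connection is handled in its own thread here, so a real server would also need to decide whether model evaluations may run concurrently or must be serialized.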


Timings for individual modules:

Sum of individual times ... 1003.238 sec (= 16.721 min)
Geometry relaxation ... 0.246 sec (= 0.004 min) 0.0 %
GOAT ... 999.664 sec (= 16.661 min) 99.6 %
External energy and gradient ... 3.329 sec (= 0.055 min) 0.3 %
ORCA TERMINATED NORMALLY
TOTAL RUN TIME: 0 days 0 hours 16 minutes 55 seconds 552 msec

Development

Successfully merging this pull request may close these issues.

ORCA - external optimizer interface