Port Protein to CUDA C by vyeoms · Pull Request #601 · PufferAI/PufferLib

vyeoms · 2026-06-29T16:27:25Z

Summary

This PR builds on #587, so the diff will be updated once the GP port PR is accepted. It ports Protein to pure CUDA C:

src/protein_util.h: Pure C utilities for Protein. Defines the search space types (linear, log, pow2, logit), normalizations, Pareto front utilities, and other numeric helpers (for example, the Nelder-Mead minimizer used here).
src/protein.cu: Core implementation of Protein, following the original as faithfully as I could. It also contains supporting code such as Adam, acquisition scoring, and other device-side numeric operations (e.g. for the classifier); I am not sure whether these would be preferred in a separate file. Currently ~1200 LoC.
tests/test_protein.cu: Unit tests for the components in src/protein.cu. Build with nvcc -o test_protein tests/test_protein.cu -I src/ -lcublas -lcusolver -lcurand and run with ./test_protein.
tests/test_protein_sweep.cu: Replicates the synthetic sweep test from tests/test_sweep.py. Outputs an HTML file with the plot and a CSV file with the results for the record. Build with nvcc -o test_protein_sweep tests/test_protein_sweep.cu -I src/ -lcublas -lcusolver -lcurand and run with ./test_protein_sweep.

Notes

Note that this is a pure CUDA implementation. Unlike the GP port I mentioned in #587, I don't currently have a pure C CPU version of Protein.

This should build natively with PufferLib and run with puffer sweep <env>. If the Protein CUDA build isn't available, it falls back to the original Python implementation.

Numeric and qualitative results

Unit testing

tests/test_protein.cu generally tests the following aspects:

Do we get the actual Pareto-dominant points?
Does the Pareto pruning actually prune bad results and keep the good ones?
Do the cost models converge to the true cost in a toy setting?
Is the logistic regression classifier good enough? Tested on a set of linearly separable points.
Are the samples from the acquisition score within bounds?
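
For context on the first two checks, here is a minimal sketch of a Pareto-dominance filter in plain C. It assumes observations are (cost, score) pairs where lower cost and higher score are better; the Obs struct and the pareto_front function are hypothetical names, not the API in src/protein_util.h, and the O(n^2) scan is chosen for clarity rather than speed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical observation record: lower cost and higher score are better. */
typedef struct {
    double cost;
    double score;
} Obs;

/* Returns 1 if a is at least as good as b on both axes and strictly better on one. */
static int dominates(Obs a, Obs b)
{
    return a.cost <= b.cost && a.score >= b.score
        && (a.cost < b.cost || a.score > b.score);
}

/* Writes the indices of non-dominated observations into out (capacity n),
 * in input order, and returns how many were written. */
size_t pareto_front(const Obs* obs, size_t n, size_t* out)
{
    assert(obs != NULL && out != NULL);
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        int dominated = 0;
        for (size_t j = 0; j < n && !dominated; j++) {
            if (j != i && dominates(obs[j], obs[i]))
                dominated = 1;
        }
        if (!dominated)
            out[count++] = i;
    }
    return count;
}
```

A test in this style passes a handful of points where the dominated ones are known by construction and asserts that exactly the remaining indices come back.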

Plus a small integration test fitting a toy cost function.

Synthetic test

tests/test_protein_sweep.cu replicates the synthetic eval from tests/test_sweep.py. Output from the CUDA C test:

Original implementation with gpytorch:

Breakout

Ran a 20-iteration sweep on Breakout. Sweep result on my laptop with one A100 GPU:

Playing with the top score hyperparameters:

This change defines __setstate__ and __getstate__ in Protein to handle CUDA graph capture in child processes spawned by multiprocessing. The child processes don't handle GP training or updates, so they don't need to call cudaMalloc. Stripping the CUDA-heavy parameters for the child processes reduces the graph capture load.

vyeoms added 11 commits June 29, 2026 13:12

(bugfix) Update test_sweep.py to use current version of pufferl

1b90fc5

Add custom CUDA Gaussian Process implementation

df756d5

Use custom CUDA GP implementation in sweep

a1d0155

Test custom CUDA GP to verify numeric accuracy with gpytorch

725f2e1

clang-format to WebKit standard on GP implementation

4562158

Clean up magic numbers

e09c003

Trimming code, tightening covariance kernel implementation

731807d

Porting Protein to CUDA C

906e2b2

Unit testing CUDA C port for Protein

ae73c45

Replicating tests/test_sweep.py for Protein CUDA C port

93f6e89

vyeoms force-pushed the port/c_protein branch from ec5453d to 046e212 Compare June 30, 2026 09:23

Trimming code

5931952

vyeoms force-pushed the port/c_protein branch from 046e212 to 5931952 Compare June 30, 2026 10:17

vyeoms wants to merge 12 commits into PufferAI:4.0 from vyeoms:port/c_protein
