Summary
OpenSSL::X509::CRL currently offers no way to check whether a given serial is revoked without instantiating the entire revocation list as Ruby objects via #revoked. For large CRLs (millions of entries) this is very slow and uses a lot of memory.
libcrypto already exposes a purpose-built primitive for this: X509_CRL_get0_by_serial(), which does a sorted lookup over the already-parsed CRL. It just isn't bound in this gem. I'd like to check whether you'd be open to exposing it before I put together a PR.
Motivation
I maintain ssl-test (used by updown.io to monitor TLS certificates). For every checked certificate we fetch its CRL (or OCSP) and need to answer a single question: "is this serial in the list?"
The only available API is #revoked, which returns an Array of OpenSSL::X509::Revoked — i.e. it materialises all entries even though we only care about one:
crl = OpenSSL::X509::CRL.new(der)
revoked = crl.revoked.find { |r| r.serial == cert.serial }
For busy CAs a CRL can contain over a million entries, so #revoked builds 1M+ Ruby objects on every check. That dominates both CPU and memory for our workload, and the process RSS climbs over time (even with jemalloc, the freed memory slots can't always be reclaimed, likely due to fragmentation).
Real-world example + benchmark
Here is a self-contained benchmark script to reproduce the following results:
- URL: http://c.cf-i.ssl.com/ae801ed1c55bb579d79208b0d772acfb8cc3a208.crl
- 53.6 MB DER, 1,094,762 revoked entries.
Comparing the two ways to answer "is serial X revoked?" — both parse the CRL, then look the serial up. Times are wall-clock; peak RSS is the process high-water mark (/proc/self/status VmHWM), each approach measured in its own process:
| Approach | parse | lookup | peak RSS |
|---|---|---|---|
| crl.revoked.find { ... } (current) | ~760 ms | ~1280 ms | ~972 MB |
| X509_CRL_get0_by_serial (proposed) | ~690 ms | ~307 ms | ~540 MB |
≈ 4.2× faster lookup and ≈ 1.8× lower peak RSS (~430 MB saved), and that is for just ONE certificate check. The parse cost is essentially the same for both; the saving comes from the ~1.09M OpenSSL::X509::Revoked Ruby objects that #revoked builds but X509_CRL_get0_by_serial never needs. The gain scales with CRL size, so it matters a lot for multi-million-entry CRLs.
Environment: Ruby 3.3.5, openssl gem 3.2.0, OpenSSL 3.5.5.
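For anyone who wants to reproduce the "current" row without the linked script, the measurement could be sketched roughly as below. This is not the benchmark script itself; `timed_ms`, `peak_rss_kb` and `bench_revoked_scan` are hypothetical helper names, and the VmHWM read only works on Linux.

```ruby
require "openssl"

# Wall-clock time of a block in milliseconds, plus the block's return value.
def timed_ms
  t0 = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  result = yield
  [result, (Process.clock_gettime(Process::CLOCK_MONOTONIC) - t0) * 1000.0]
end

# Peak resident set size in kB, read from /proc/self/status (Linux only).
# Returns nil when the file or the VmHWM field is unavailable.
def peak_rss_kb
  status = File.read("/proc/self/status")
  status[/^VmHWM:\s+(\d+)\s+kB/, 1]&.to_i
rescue SystemCallError
  nil
end

# Measures the current approach: parse the CRL, then scan #revoked for one
# serial. `serial` is expected to be an Integer.
def bench_revoked_scan(der, serial)
  crl, parse_ms = timed_ms { OpenSSL::X509::CRL.new(der) }
  hit, lookup_ms = timed_ms { crl.revoked.find { |r| r.serial.to_i == serial } }
  { found: !hit.nil?, parse_ms: parse_ms, lookup_ms: lookup_ms, peak_rss_kb: peak_rss_kb }
end
```

Because VmHWM is a per-process high-water mark, each approach has to run in a fresh process for the peak RSS figures to be comparable.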
How I work around it today
Two hacks, both unsatisfying:
- Raw-DER byte search. Since a CRL is DER and each entry encodes its serial as a canonical INTEGER, I scan the raw body for OpenSSL::ASN1::Integer.new(serial).to_der; a miss means "not revoked" and I skip #revoked entirely (falling back to it only on a match, to read the reason/date). Validated against ~1.6M real revoked serials across ~900 CRLs with zero false negatives. It's very fast but fragile (it relies on strict-DER encoding), so I wouldn't recommend implementing this in OpenSSL (unless you think it's a good idea). I'll probably keep this optimisation in my gem, though.
- Calling X509_CRL_get0_by_serial via Fiddle (how the benchmark above was produced). It works, but requires re-parsing the CRL with d2i outside the Ruby object, hand-managing C pointers, and locating the right libcrypto, all of which a one-line binding would make unnecessary. I'd rather not do that in my code if we can upstream the real method to OpenSSL.
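For reference, the raw-DER pre-filter from the first workaround can be sketched roughly like this. The helper names (`maybe_revoked?`, `find_revoked_entry`, `build_demo_crl_der`) are made up for illustration and are not part of ssl-test or the openssl gem. A byte match can be a false positive (the same bytes may occur elsewhere in the CRL), which is why a match still falls back to #revoked.

```ruby
require "openssl"

# Cheap pre-filter: false means the DER INTEGER encoding of `serial` does not
# occur anywhere in the raw CRL bytes, so the serial cannot be listed.
# true only means "possibly listed".
def maybe_revoked?(crl_der, serial)
  needle = OpenSSL::ASN1::Integer.new(serial).to_der
  crl_der.b.include?(needle)
end

# Full lookup: skip parsing entirely on a miss, and fall back to the slow
# #revoked scan on a match to confirm it and read the reason/date.
def find_revoked_entry(crl_der, serial)
  return nil unless maybe_revoked?(crl_der, serial)

  wanted = serial.to_i
  OpenSSL::X509::CRL.new(crl_der).revoked.find { |r| r.serial.to_i == wanted }
end

# Demo helper: builds a small signed CRL (DER) listing the given serials.
def build_demo_crl_der(serials)
  key = OpenSSL::PKey::RSA.new(2048)
  crl = OpenSSL::X509::CRL.new
  crl.version = 1
  crl.issuer = OpenSSL::X509::Name.parse("/CN=Demo CA")
  crl.last_update = Time.now
  crl.next_update = Time.now + 3600
  serials.each do |s|
    rev = OpenSSL::X509::Revoked.new
    rev.serial = s
    rev.time = Time.now
    crl.add_revoked(rev)
  end
  crl.sign(key, OpenSSL::Digest.new("SHA256"))
  crl.to_der
end

der = build_demo_crl_der([0x1234567890abcdef, 0x0fedcba987654321])
entry = find_revoked_entry(der, 0x1234567890abcdef)
puts entry ? "revoked at #{entry.time}" : "not revoked"
```

The pre-filter never parses the CRL on a miss, which is the common case when most checked certificates are not revoked.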
Proposed API
A method on OpenSSL::X509::CRL that returns the single matching entry (or nil) without building the whole list, e.g.:
crl.find_revoked(serial) # => OpenSSL::X509::Revoked or nil
# and/or a convenience predicate:
crl.revoked?(serial) # => true / false
Implementation would wrap X509_CRL_get0_by_serial(crl, &ret, serial). A couple of notes:
- The function sorts the revoked stack on first call (idempotent, cached on the X509_CRL), so subsequent lookups on the same object are O(log n).
- It's a get0 (no ownership transfer); the returned X509_REVOKED * is owned by the CRL, so the wrapper should return a Revoked that keeps the CRL alive (or dups the entry) to stay memory-safe.
- serial could accept an OpenSSL::BN/Integer and convert to ASN1_INTEGER internally.
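To make the intended semantics concrete, here is a slow reference version written against today's API, which a PR's tests could compare the C binding against. The module name `CRLLookupReference` is hypothetical, and `find_revoked` / `revoked?` are the proposed names, not existing gem methods. It still goes through #revoked, so it shows the behaviour only, not the performance.

```ruby
require "openssl"

# Reference semantics for the proposed lookup, built on the existing #revoked.
# Intended as a behavioural spec only; it still materialises every entry.
module CRLLookupReference
  module_function

  # Returns the matching OpenSSL::X509::Revoked, or nil when not listed.
  # `serial` may be an Integer or an OpenSSL::BN.
  def find_revoked(crl, serial)
    wanted = serial.to_i
    crl.revoked.find { |entry| entry.serial.to_i == wanted }
  end

  # Convenience predicate on top of find_revoked.
  def revoked?(crl, serial)
    !find_revoked(crl, serial).nil?
  end
end

# Usage with an in-memory CRL (unsigned, enough for a lookup demo):
crl = OpenSSL::X509::CRL.new
[0x1001, 0x1002].each do |s|
  rev = OpenSSL::X509::Revoked.new
  rev.serial = s
  rev.time = Time.now
  crl.add_revoked(rev)
end
puts CRLLookupReference.revoked?(crl, 0x1001)
puts CRLLookupReference.revoked?(crl, OpenSSL::BN.new(0x2000))
```

One design point to settle for the predicate: as far as I can tell from the OpenSSL manual page, X509_CRL_get0_by_serial returns 2 rather than 1 when the matching entry has the reason removeFromCRL, so revoked? would need to decide how to treat that case.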
Would you be open to adding this (under whatever name/shape you prefer)? If so I'm happy to prepare the PR with tests. Thanks!