Feature request: efficient single-serial CRL revocation lookup (wrap X509_CRL_get0_by_serial) #1064

@jarthod

Summary

OpenSSL::X509::CRL currently offers no way to check whether a given serial is revoked without instantiating the entire revocation list as Ruby objects via #revoked. For large CRLs (millions of entries) this is very slow and uses a lot of memory.

libcrypto already exposes a purpose-built primitive for this: X509_CRL_get0_by_serial(), which does a sorted lookup over the already-parsed CRL. It just isn't bound in this gem. Before I put together a PR, I'd like to check whether you'd be open to exposing it.

Motivation

I maintain ssl-test (used by updown.io to monitor TLS certificates). For every checked certificate we fetch its CRL (or OCSP) and need to answer a single question: "is this serial in the list?"

The only available API is #revoked, which returns an Array of OpenSSL::X509::Revoked — i.e. it materialises all entries even though we only care about one:

crl = OpenSSL::X509::CRL.new(der)
revoked = crl.revoked.find { |r| r.serial == cert.serial }

For busy CAs a CRL can contain over a million entries, so #revoked builds 1M+ Ruby objects on every check. That dominates both CPU and memory for our workload, and the process RSS climbs over time (even with jemalloc, freed memory slots can't always be reclaimed, likely due to fragmentation).

Real-world example + benchmark

Here is a self-contained benchmark script to reproduce the results below, which use this CRL:

  • URL: http://c.cf-i.ssl.com/ae801ed1c55bb579d79208b0d772acfb8cc3a208.crl
  • 53.6 MB DER, 1,094,762 revoked entries.

Comparing the two ways to answer "is serial X revoked?" — both parse the CRL, then look the serial up. Times are wall-clock; peak RSS is the process high-water mark (/proc/self/status VmHWM), each approach measured in its own process:

| Approach | Parse | Lookup | Peak RSS |
|---|---|---|---|
| crl.revoked.find { ... } (current) | ~760 ms | ~1280 ms | ~972 MB |
| X509_CRL_get0_by_serial (proposed) | ~690 ms | ~307 ms | ~540 MB |

That is a 4.2× faster lookup and about 1.8× lower peak RSS (~430 MB saved), for just ONE certificate check. The parse cost is essentially the same for both; the saving comes from the ~1.09M OpenSSL::X509::Revoked Ruby objects that #revoked builds but get0_by_serial never needs. The gain scales with CRL size, so it matters even more for multi-million-entry CRLs.

Environment: Ruby 3.3.5, openssl gem 3.2.0, OpenSSL 3.5.5.
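For anyone who wants to take comparable measurements, here is a minimal sketch of the two metrics described above (wall-clock time and the VmHWM high-water mark). The helper names `wall_ms` and `peak_rss_kb` are hypothetical and are not taken from the benchmark script; the sketch assumes a Linux /proc filesystem and returns nil for the RSS figure elsewhere.

```ruby
# Hypothetical measurement helpers, not the actual benchmark script.

# Runs the block and returns [block_result, elapsed_milliseconds],
# using the monotonic clock so wall-clock adjustments don't skew the timing.
def wall_ms
  t0 = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  result = yield
  elapsed = (Process.clock_gettime(Process::CLOCK_MONOTONIC) - t0) * 1000.0
  [result, elapsed]
end

# Returns the process peak RSS in kB as reported by the VmHWM line of
# /proc/self/status, or nil when that file is unavailable (non-Linux).
def peak_rss_kb
  status = "/proc/self/status"
  return nil unless File.readable?(status)

  File.foreach(status) do |line|
    return Regexp.last_match(1).to_i if line =~ /\AVmHWM:\s+(\d+)\s+kB/
  end
  nil
end
```

Because VmHWM is a per-process high-water mark, each approach has to run in its own process for the peak values to be comparable.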

How I work around it today

Two hacks, both unsatisfying:

  1. Raw-DER byte search. Since a CRL is DER and each entry encodes its serial as a canonical INTEGER, I scan the raw body for OpenSSL::ASN1::Integer.new(serial).to_der; a miss means "not revoked" and I skip #revoked entirely (falling back to it only on a match, to read the reason/date). Validated against ~1.6M real revoked serials across ~900 CRLs with zero false negatives. It's very fast but it's fragile (relies on strict-DER encoding) so I wouldn't recommend implementing this in OpenSSL (unless you think it's a good idea). But I'll probably keep this optimisation in my gem.

  2. Calling X509_CRL_get0_by_serial via Fiddle (how the benchmark above was produced). It works, but requires re-parsing the CRL with d2i outside the Ruby object, hand-managing C pointers, and locating the right libcrypto, all of which a one-line binding would make unnecessary. I'd rather not carry that in my code if we can upstream the real method to OpenSSL.
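For illustration, the raw-DER prefilter from workaround 1 could be sketched roughly as follows. The method name `find_revoked_with_prefilter` is hypothetical and this is not the exact code from ssl-test; it only assumes what is described above: search the CRL bytes for the DER encoding of the serial, treat a miss as "not revoked", and fall back to #revoked only on a hit.

```ruby
require "openssl"

# Hypothetical helper illustrating the byte-search prefilter.
# crl_der: the raw DER bytes of the CRL (String)
# serial:  the certificate serial (Integer or OpenSSL::BN)
# Returns the matching OpenSSL::X509::Revoked, or nil when not revoked.
def find_revoked_with_prefilter(crl_der, serial)
  # DER encoding of the serial as an ASN.1 INTEGER (tag, length, value).
  needle = OpenSSL::ASN1::Integer.new(serial).to_der

  # Compare as binary so string encodings cannot interfere with the search.
  # A miss means the serial cannot be in the list, so skip parsing entirely.
  return nil unless crl_der.b.include?(needle)

  # A hit may still be a coincidental byte match elsewhere in the CRL,
  # so confirm it (and read reason/date) through the regular, slow API.
  wanted = serial.to_i
  OpenSSL::X509::CRL.new(crl_der).revoked.find { |r| r.serial.to_i == wanted }
end
```

The expensive #revoked call only happens in the rare case where the bytes match, which is why this is fast for the common "not revoked" answer. It still depends on every entry using strict DER encoding for its serial.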

Proposed API

A method on OpenSSL::X509::CRL that returns the single matching entry (or nil) without building the whole list, e.g.:

crl.find_revoked(serial)  # => OpenSSL::X509::Revoked or nil
# and/or a convenience predicate:
crl.revoked?(serial)      # => true / false

Implementation would wrap X509_CRL_get0_by_serial(crl, &ret, serial). A couple of notes:

  • The function sorts the revoked stack on first call (idempotent, cached on the X509_CRL), so subsequent lookups on the same object are O(log n).
  • It's a get0 (no ownership transfer); the returned X509_REVOKED * is owned by the CRL, so the wrapper should return a Revoked that keeps the CRL alive (or dups the entry) to stay memory-safe.
  • serial could accept an OpenSSL::BN/Integer and convert to ASN1_INTEGER internally.
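To make the intended semantics concrete, here is a behavioural sketch of the two proposed methods, emulated in plain Ruby on top of the existing #revoked. It has none of the performance benefit (it still builds the whole list) and is not the proposed implementation; the module name `CRLLookupSketch` is hypothetical, and `find_revoked` / `revoked?` are only the names suggested above. It could serve as a reference for what tests of a real binding might expect.

```ruby
require "openssl"

# Hypothetical pure-Ruby emulation of the proposed API's behaviour.
# A real binding would call X509_CRL_get0_by_serial instead of #revoked.
module CRLLookupSketch
  module_function

  # Returns the OpenSSL::X509::Revoked entry whose serial matches, or nil.
  # Accepts the serial as an Integer or an OpenSSL::BN.
  def find_revoked(crl, serial)
    wanted = serial.to_i
    crl.revoked.find { |entry| entry.serial.to_i == wanted }
  end

  # Convenience predicate: true when the serial is listed in the CRL.
  def revoked?(crl, serial)
    !find_revoked(crl, serial).nil?
  end
end
```

With this shape a caller could write something like `CRLLookupSketch.revoked?(crl, cert.serial)` today and switch to `crl.revoked?(cert.serial)` if such a method were added.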

Would you be open to adding this (under whatever name/shape you prefer)? If so I'm happy to prepare the PR with tests. Thanks!
