Module stream_pool

StreamPool — owned non-blocking CUDA streams indexed by StreamId.

The runtime hands out stable StreamIds to callers and resolves them to live cudarc::driver::CudaStream handles internally. The pool grows on demand: each call to acquire creates a fresh non-blocking stream until max_streams is reached. Streams are never returned to a free-list; they stay alive for the runtime’s lifetime, so StreamId handles remain valid for correlated allocate/launch/deallocate sequences.
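The grow-on-demand behaviour described above can be sketched without a GPU. This is a minimal model, not the crate's implementation: `ToyPool`, `ToyStreamId`, `ToyStream`, `ToyPoolError`, and every method signature are hypothetical stand-ins, and a labelled struct takes the place of a real `cudarc::driver::CudaStream`.

```rust
// Hypothetical stand-ins: none of these names or signatures come from the real crate.

/// Stable handle into the pool (stand-in for the crate's `StreamId`).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct ToyStreamId(pub usize);

/// Placeholder for a live CUDA stream handle.
#[derive(Debug)]
pub struct ToyStream {
    pub label: String,
}

/// Assumed error shape for the capacity case only.
#[derive(Debug, PartialEq, Eq)]
pub enum ToyPoolError {
    Exhausted { max_streams: usize },
}

pub struct ToyPool {
    streams: Vec<ToyStream>,
    max_streams: usize,
}

impl ToyPool {
    pub fn new(max_streams: usize) -> Self {
        Self { streams: Vec::new(), max_streams }
    }

    /// Grows the pool by one stream and returns its id, or fails at capacity.
    pub fn acquire(&mut self) -> Result<ToyStreamId, ToyPoolError> {
        if self.streams.len() >= self.max_streams {
            return Err(ToyPoolError::Exhausted { max_streams: self.max_streams });
        }
        let id = ToyStreamId(self.streams.len());
        self.streams.push(ToyStream { label: format!("stream-{}", id.0) });
        Ok(id)
    }

    /// Resolves an id to its stream. Entries are never removed, so an id
    /// handed out earlier keeps resolving to the same stream.
    pub fn resolve(&self, id: ToyStreamId) -> Option<&ToyStream> {
        self.streams.get(id.0)
    }

    /// Number of streams created so far.
    pub fn len(&self) -> usize {
        self.streams.len()
    }
}
```

Because the backing `Vec` only ever grows, an index-based id stays valid for as long as the pool lives, which is the property the allocate/launch/deallocate sequences rely on.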

§Failure semantics

acquire returns a Result. On capacity exhaustion, or if cudarc::driver::CudaStream::fork fails, the call returns StreamPoolError instead of silently collapsing onto the default stream. That fallback was a footgun: it broke stream-ordered isolation (a “non-default” allocation could end up on the legacy default stream) without surfacing the failure to the caller.
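A caller-side sketch of these semantics follows. It assumes a two-variant error matching the two failure causes named above; `SketchError`, `try_acquire`, and `plan_launch` are hypothetical names, and the driver failure is simulated with a boolean rather than a real cudarc call.

```rust
use std::fmt;

/// Hypothetical two-variant error; the real `StreamPoolError` variants
/// are not spelled out on this page.
#[derive(Debug, PartialEq, Eq)]
pub enum SketchError {
    CapacityExhausted { max_streams: usize },
    CreateFailed(String),
}

impl fmt::Display for SketchError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            SketchError::CapacityExhausted { max_streams } => {
                write!(f, "stream pool exhausted (max_streams = {})", max_streams)
            }
            SketchError::CreateFailed(why) => write!(f, "stream creation failed: {}", why),
        }
    }
}

impl std::error::Error for SketchError {}

/// Stand-in for `acquire`. `live` is the number of streams already created;
/// `driver_ok` simulates whether the driver can create another one.
/// Returned ids start at 1; id 0 plays the role of the default stream here.
pub fn try_acquire(live: usize, max_streams: usize, driver_ok: bool) -> Result<usize, SketchError> {
    if live >= max_streams {
        return Err(SketchError::CapacityExhausted { max_streams });
    }
    if !driver_ok {
        return Err(SketchError::CreateFailed("simulated driver failure".to_string()));
    }
    Ok(live + 1)
}

/// Propagates the failure with `?`. Writing `try_acquire(..).unwrap_or(0)`
/// instead would quietly move the work onto the default stream, which is
/// the fallback the module rejects.
pub fn plan_launch(live: usize, max_streams: usize, driver_ok: bool) -> Result<String, SketchError> {
    let id = try_acquire(live, max_streams, driver_ok)?;
    Ok(format!("launch on stream {}", id))
}
```

Both failure paths reach the caller as an `Err`, so the decision to retry, queue, or abort is made explicitly rather than hidden behind a default-stream substitution.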

Structs§

StreamPool
Pool of owned non-blocking CUDA streams.

Enums§

StreamPoolError
Errors returned by StreamPool::acquire. Both variants are hard failures; callers must not silently substitute StreamId::DEFAULT.

Constants§

DEFAULT_MAX_STREAMS
Default maximum stream count. The executor typically runs one deterministic stream plus a small handful of join/scan helpers, so 16 leaves substantial headroom without spending device state on idle streams.
DEFAULT_POOL_MB_PER_STREAM
ENV_WCOJ_POOL_MB_PER_STREAM

Functions§

configured_pool_bytes_per_stream
configured_pool_mb_per_stream
planned_pool_budget_bytes