Struct CudaStream 

pub struct CudaStream { /* private fields */ }
A wrapper around sys::CUstream that you can schedule work on.

  • Create with [CudaContext::new_stream()], [CudaContext::default_stream()], or CudaStream::fork().

Work done on this is asynchronous with respect to the host.

See:

  • CUDA C/C++ Streams and Concurrency
  • 3. Stream synchronization behavior
  • 6.6. Event Management
  • Out-of-order execution
  • Dependence analysis

Implementations

impl CudaStream

pub fn fork(&self) -> Result<Arc<CudaStream>, DriverError>

Creates a new stream, then makes the new stream wait on self.

pub fn cu_stream(&self) -> *mut CUstream_st

The underlying CUDA stream object.

§Safety

Do not destroy this value.

pub fn context(&self) -> &Arc<CudaContext>

The context the stream belongs to.

pub fn synchronize(&self) -> Result<(), DriverError>

Waits until all work queued on this stream has completed. The CPU thread only blocks while waiting if you called [CudaContext::set_flags()] with sys::CUctx_flags::CU_CTX_SCHED_BLOCKING_SYNC.

See cuda docs

pub fn record_event( &self, flags: Option<CUevent_flags_enum>, ) -> Result<CudaEvent, DriverError>

Creates a new [CudaEvent] and records the current work in the stream to the event.

pub fn wait(&self, event: &CudaEvent) -> Result<(), DriverError>

Waits for the work recorded in [CudaEvent] to be completed.

You can record new work in event after calling this method without affecting this call.

See cuda docs

pub fn join(&self, other: &CudaStream) -> Result<(), DriverError>

Ensures this stream waits for the current workload in other to complete. This is shorthand for recording an event on other with other.record_event() and passing it to self.wait().
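The ordering that wait() and join() set up can be illustrated with a host-only toy model. The sketch below uses only the Rust standard library and does not call cudarc or CUDA; ModelStream, ModelEvent, and submit() are hypothetical names invented for illustration. It assumes an event captures the work submitted up to the moment it is recorded, so work added to other after join() is not part of the dependency.

```rust
/// Host-only toy model of stream ordering. These are NOT cudarc types.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct ModelEvent {
    /// Id of the stream the event was recorded on.
    stream_id: usize,
    /// Number of work items submitted to that stream when the event was recorded.
    upto: usize,
}

#[derive(Debug)]
pub struct ModelStream {
    id: usize,
    /// Number of work items submitted to this stream so far.
    submitted: usize,
    /// Dependencies as (other stream id, work items of it that must finish first).
    waits: Vec<(usize, usize)>,
}

impl ModelStream {
    pub fn new(id: usize) -> Self {
        ModelStream { id, submitted: 0, waits: Vec::new() }
    }

    /// Stand-in for enqueueing a kernel launch or a copy.
    pub fn submit(&mut self) {
        self.submitted += 1;
    }

    /// Snapshot of the work submitted so far.
    pub fn record_event(&self) -> ModelEvent {
        ModelEvent { stream_id: self.id, upto: self.submitted }
    }

    /// Later work on `self` depends on everything captured by `event`.
    pub fn wait(&mut self, event: &ModelEvent) {
        self.waits.push((event.stream_id, event.upto));
    }

    /// Shorthand: record an event on `other`, then wait on it.
    pub fn join(&mut self, other: &ModelStream) {
        let event = other.record_event();
        self.wait(&event);
    }

    /// Dependencies registered so far.
    pub fn waits(&self) -> &[(usize, usize)] {
        &self.waits
    }
}
```

In this model, if stream b has two submitted items when a.join(&b) is called, a depends on exactly those two items; a third item submitted to b afterwards leaves the dependency unchanged.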

impl CudaStream

pub fn null<T>(self: &Arc<CudaStream>) -> Result<CudaSlice<T>, DriverError>

Allocates an empty CudaSlice with 0 length.

pub unsafe fn alloc<T>( self: &Arc<CudaStream>, len: usize, ) -> Result<CudaSlice<T>, DriverError>
where T: DeviceRepr,

Allocates a CudaSlice with len elements of type T.

§Safety

This is unsafe because the memory is unset.

pub fn alloc_zeros<T>( self: &Arc<CudaStream>, len: usize, ) -> Result<CudaSlice<T>, DriverError>

Allocates a CudaSlice with len elements of type T. All values are zeroed out.

pub fn memset_zeros<T, Dst>( self: &Arc<CudaStream>, dst: &mut Dst, ) -> Result<(), DriverError>

Sets all the memory in dst to 0. dst can be a CudaSlice or CudaViewMut.

pub fn memcpy_stod<T, Src>( self: &Arc<CudaStream>, src: &Src, ) -> Result<CudaSlice<T>, DriverError>
where T: DeviceRepr, Src: HostSlice<T> + ?Sized,

Deprecated: use clone_htod.

Copy a [T]/Vec<T>/[PinnedHostSlice<T>] to a new CudaSlice.

pub fn clone_htod<T, Src>( self: &Arc<CudaStream>, src: &Src, ) -> Result<CudaSlice<T>, DriverError>
where T: DeviceRepr, Src: HostSlice<T> + ?Sized,

Copy a [T]/Vec<T>/[PinnedHostSlice<T>] to a new CudaSlice.

pub fn memcpy_htod<T, Src, Dst>( self: &Arc<CudaStream>, src: &Src, dst: &mut Dst, ) -> Result<(), DriverError>
where T: DeviceRepr, Src: HostSlice<T> + ?Sized, Dst: DevicePtrMut<T>,

Copy a [T]/Vec<T>/[PinnedHostSlice<T>] into an existing CudaSlice/CudaViewMut.

pub fn memcpy_dtov<T, Src>( self: &Arc<CudaStream>, src: &Src, ) -> Result<Vec<T>, DriverError>
where T: DeviceRepr, Src: DevicePtr<T>,

Deprecated: use clone_dtoh.

Copy a CudaSlice/CudaView to a new Vec<T>.

pub fn clone_dtoh<T, Src>( self: &Arc<CudaStream>, src: &Src, ) -> Result<Vec<T>, DriverError>
where T: DeviceRepr, Src: DevicePtr<T>,

Copy a CudaSlice/CudaView to a new Vec<T>.

pub fn memcpy_dtoh<T, Src, Dst>( self: &Arc<CudaStream>, src: &Src, dst: &mut Dst, ) -> Result<(), DriverError>
where T: DeviceRepr, Src: DevicePtr<T>, Dst: HostSlice<T> + ?Sized,

Copy a CudaSlice/CudaView to an existing [T]/Vec<T>/[PinnedHostSlice<T>].

pub fn memcpy_dtod<T, Src, Dst>( self: &Arc<CudaStream>, src: &Src, dst: &mut Dst, ) -> Result<(), DriverError>
where Src: DevicePtr<T>, Dst: DevicePtrMut<T>,

Copy a CudaSlice/CudaView to an existing CudaSlice/CudaViewMut.

pub fn clone_dtod<T, Src>( self: &Arc<CudaStream>, src: &Src, ) -> Result<CudaSlice<T>, DriverError>
where T: DeviceRepr, Src: DevicePtr<T>,

Copy a CudaSlice/CudaView to a new CudaSlice.

impl CudaStream

pub unsafe fn upgrade_device_ptr<T>( self: &Arc<CudaStream>, cu_device_ptr: u64, len: usize, ) -> CudaSlice<T>

Creates a CudaSlice from a sys::CUdeviceptr. Useful in conjunction with CudaSlice::leak().

§Safety
  • cu_device_ptr must be a valid allocation
  • cu_device_ptr must have space for len * std::mem::size_of::<T>() bytes
  • The memory may not be valid for type T, so some sort of memset operation should be called on the memory.
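
The size requirement in the list above can be checked on the host before calling upgrade_device_ptr. The following is a minimal sketch using only the standard library; required_bytes and fits are hypothetical helpers, not part of cudarc, and the caller is assumed to know the byte size of the original allocation by some other means.

```rust
use std::mem::size_of;

/// Bytes a device allocation must provide to hold `len` values of `T`.
/// Returns `None` if `len * size_of::<T>()` overflows `usize`.
/// Hypothetical helper, not part of cudarc.
pub fn required_bytes<T>(len: usize) -> Option<usize> {
    len.checked_mul(size_of::<T>())
}

/// True when an allocation of `alloc_bytes` bytes can back `len` values of `T`.
/// Hypothetical helper, not part of cudarc.
pub fn fits<T>(alloc_bytes: usize, len: usize) -> bool {
    matches!(required_bytes::<T>(len), Some(needed) if needed <= alloc_bytes)
}
```

For example, fits::<f32>(1024, 256) holds because 256 values of 4 bytes each need exactly 1024 bytes. A check like this covers only the size bullet; it says nothing about whether the pointer is a valid allocation or whether the bytes are valid for T.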

impl CudaStream

pub fn launch_builder<'a>(&'a self, func: &'a CudaFunction) -> LaunchArgs<'a>

Creates a new kernel launch builder that will launch func on stream self.

Add arguments to the builder using [LaunchArgs::arg()], and submit it to the stream using [LaunchArgs::launch()].

Trait Implementations

impl Debug for CudaStream

fn fmt(&self, f: &mut Formatter<'_>) -> Result<(), Error>

Formats the value using the given formatter.

impl Drop for CudaStream

fn drop(&mut self)

Executes the destructor for this type.

impl PartialEq for CudaStream

fn eq(&self, other: &CudaStream) -> bool

Tests for self and other values to be equal, and is used by ==.

fn ne(&self, other: &Rhs) -> bool

Tests for !=. The default implementation is almost always sufficient, and should not be overridden without very good reason.

impl Eq for CudaStream

impl Send for CudaStream

impl StructuralPartialEq for CudaStream

impl Sync for CudaStream