The CUDA backend (crates/xlog-cuda) can export and import CudaBuffer data using Apache Arrow.
This enables interoperability with the RAPIDS ecosystem (cuDF) and other Arrow-native tools.
Current State
- Export/import is compatible with Arrow and cuDF workflows.
- Arrow IPC export/import is not zero-copy today: export downloads GPU → host; import uploads host → GPU.
- Arrow C Data Interface device export is zero-copy and keeps buffers on GPU.
- Arrow C Data Interface device import is available experimentally (feature-gated) for supported types.
- A zero-copy path exists via DLPack export/import (per-column) from device memory (contiguous 1D columns).
Rust API
- `xlog_cuda::CudaKernelProvider::to_arrow_record_batch`
- `xlog_cuda::CudaKernelProvider::from_arrow_record_batch`
- `xlog_cuda::CudaKernelProvider::to_arrow_ipc_stream`
- `xlog_cuda::CudaKernelProvider::from_arrow_ipc_stream`
- `xlog_cuda::CudaKernelProvider::write_arrow_ipc_stream_file`
- `xlog_cuda::CudaKernelProvider::read_arrow_ipc_stream_file`
- `xlog_cuda::CudaKernelProvider::to_arrow_device_record_batch`
- `xlog_cuda::CudaKernelProvider::from_arrow_device_record_batch` (experimental, requires `--features arrow-device-import`)
- `xlog_cuda::ArrowDeviceArray` / `xlog_cuda::ArrowDeviceArrayOwned`
- `xlog_cuda::CudaKernelProvider::to_dlpack_table` (zero-copy export)
- `xlog_cuda::CudaKernelProvider::from_dlpack_tensors` (zero-copy import, infers schema)
- `xlog_cuda::CudaKernelProvider::from_dlpack_tensors_with_schema` (zero-copy import, checks schema)
Python cuDF Example (via Arrow IPC)
- In Rust, write an Arrow IPC stream file using `write_arrow_ipc_stream_file(...)`.
- In Python, read that file with an Arrow IPC stream reader and hand the resulting table to cuDF.
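
A minimal sketch of the Python side, assuming the file holds the Arrow IPC *stream* format (as the function name suggests) and that `pyarrow` and `cudf` are installed. The helper names `read_ipc_stream` and `to_cudf` and the path `relation.arrows` are hypothetical, not part of xlog.

```python
def read_ipc_stream(path):
    """Read an Arrow IPC stream file into a host-side pyarrow.Table."""
    # Imported lazily so this module still loads where pyarrow is absent.
    import pyarrow as pa

    with pa.OSFile(path, "rb") as source:
        # open_stream reads the streaming format (not the random-access file format).
        return pa.ipc.open_stream(source).read_all()


def to_cudf(table):
    """Upload a host-side Arrow table to the GPU as a cuDF DataFrame."""
    # Imported lazily because cudf needs a CUDA-capable environment.
    import cudf

    return cudf.DataFrame.from_arrow(table)
```

Usage would look like `df = to_cudf(read_ipc_stream("relation.arrows"))`. This path goes through host memory on both sides, consistent with the note above that Arrow IPC export/import is not zero-copy.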
Zero-Copy (DLPack)
DLPack provides a GPU-native interchange path that avoids host copies. The current implementation includes:
- ✅ DLPack export (current): produces DLPack `DLManagedTensor` pointers for each column without copies
- ✅ DLPack import (current): consumes DLPack `DLManagedTensor` pointers and wraps them without copies
- ✅ Python capsule/FFI layer: `crates/pyxlog` builds a `pyxlog` module via `maturin` that:
  - accepts DLPack capsules / `__dlpack__` producers for input relations
  - returns DLPack capsules for query result columns
  - provides a `dlpack_roundtrip(...)` helper for low-level DLPack validation
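
As an illustration of what a `__dlpack__` producer looks like from the caller's side, the sketch below pre-checks a set of columns before they are handed to a DLPack consumer. It relies only on the standard DLPack Python protocol (`__dlpack__` and `__dlpack_device__`, with device type codes `kDLCPU = 1` and `kDLCUDA = 2`). The helper name `check_cuda_producers` is hypothetical, and whether `pyxlog` performs an equivalent check internally is not stated here.

```python
KDL_CPU = 1   # DLPack device type code for host memory
KDL_CUDA = 2  # DLPack device type code for CUDA device memory


def check_cuda_producers(columns):
    """Return {column name: CUDA device id} for a mapping of DLPack producers.

    Raises TypeError if an object does not implement the DLPack protocol and
    ValueError if a column does not live in CUDA device memory.
    """
    device_ids = {}
    for name, col in columns.items():
        if not (hasattr(col, "__dlpack__") and hasattr(col, "__dlpack_device__")):
            raise TypeError(f"column {name!r} is not a DLPack producer")
        device_type, device_id = col.__dlpack_device__()
        # Some libraries return an IntEnum here; int() normalizes both cases.
        if int(device_type) != KDL_CUDA:
            raise ValueError(f"column {name!r} is not on a CUDA device")
        device_ids[name] = int(device_id)
    return device_ids
```

Objects such as CuPy arrays or CUDA-resident PyTorch tensors implement this protocol, so a check of this kind can catch host-resident inputs before any zero-copy hand-off is attempted.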
Zero-Copy (Arrow C Data Interface, device export/import)
The CUDA backend can export and (experimentally) import device-resident Arrow C Data Interface handles without host transfers:
- Export: produces an `ArrowDeviceArray` with CUDA device pointers.
- Import (experimental): consumes an `ArrowDeviceArrayOwned` and wraps device pointers as `CudaColumn` without copies.
- Device descriptor: `device_type = ARROW_DEVICE_CUDA`, `device_id = <cuda device>`.
- Supported types (export): `U32`, `U64`, `I32`, `I64`, `F32`, `F64`, `Bool` (bit-packed), and `Symbol` (exported as `UInt32`).
- Supported types (import): numeric types + `Symbol` (as `UInt32` with `xlog.symbol=true`). Import currently rejects nulls and does not support bit-packed `Bool` yet.
- Symbol metadata: schema fields include `xlog.symbol=true` and `xlog.symbol_encoding=u32`.
- Ownership: `ArrowDeviceArrayOwned` keeps GPU buffers alive; releasing the FFI handle frees keepalive state.
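
The import rules listed above can be summarized as a small decision function. This is an illustrative sketch only, not the crate's implementation: `FieldInfo`, `classify_import_field`, and the lowercase Arrow type spellings are hypothetical, and "numeric types" is assumed to mean the six numeric types from the export list.

```python
from dataclasses import dataclass, field

# Assumed mapping from Arrow type names to xlog column types.
NUMERIC_TYPES = {
    "uint32": "U32", "uint64": "U64",
    "int32": "I32", "int64": "I64",
    "float32": "F32", "float64": "F64",
}


@dataclass
class FieldInfo:
    """Hypothetical summary of one imported Arrow field."""
    name: str
    arrow_type: str
    null_count: int = 0
    metadata: dict = field(default_factory=dict)


def classify_import_field(info):
    """Return the xlog type for a field, or raise ValueError if unsupported."""
    if info.null_count != 0:
        raise ValueError(f"{info.name}: nulls are rejected on import")
    if info.arrow_type == "bool":
        raise ValueError(f"{info.name}: bit-packed Bool is not supported on import yet")
    if info.arrow_type not in NUMERIC_TYPES:
        raise ValueError(f"{info.name}: unsupported type {info.arrow_type}")
    # A UInt32 field tagged xlog.symbol=true is treated as a Symbol column.
    if info.arrow_type == "uint32" and info.metadata.get("xlog.symbol") == "true":
        return "Symbol"
    return NUMERIC_TYPES[info.arrow_type]
```

For example, a `uint32` field carrying `{"xlog.symbol": "true", "xlog.symbol_encoding": "u32"}` classifies as `Symbol`, while the same field without metadata classifies as `U32`.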
Python (experimental)
When built with the `pyxlog` feature `arrow-device-import`, Python exposes:
- `pyxlog.export_arrow_device(...) -> PyCapsule` (name `arrow_device_array`)
- `pyxlog.import_arrow_device(...) -> (dlpack_tensors, names, num_rows)`
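
For debugging, the device descriptor inside such a capsule can be inspected with `ctypes`. The sketch below assumes the capsule pointer refers to a `struct ArrowDeviceArray` laid out as in the Arrow C Device Data Interface specification, which the capsule name suggests but which is not confirmed here. `describe_device_capsule` is a hypothetical helper; it only reads the descriptor and never dereferences GPU pointers or takes ownership.

```python
import ctypes

ARROW_DEVICE_CPU = 1
ARROW_DEVICE_CUDA = 2
CAPSULE_NAME = b"arrow_device_array"


class ArrowArray(ctypes.Structure):
    pass


# Field order follows struct ArrowArray in the Arrow C Data Interface.
ArrowArray._fields_ = [
    ("length", ctypes.c_int64),
    ("null_count", ctypes.c_int64),
    ("offset", ctypes.c_int64),
    ("n_buffers", ctypes.c_int64),
    ("n_children", ctypes.c_int64),
    ("buffers", ctypes.POINTER(ctypes.c_void_p)),
    ("children", ctypes.POINTER(ctypes.POINTER(ArrowArray))),
    ("dictionary", ctypes.POINTER(ArrowArray)),
    ("release", ctypes.c_void_p),
    ("private_data", ctypes.c_void_p),
]


class ArrowDeviceArray(ctypes.Structure):
    # Field order follows struct ArrowDeviceArray in the device interface spec.
    _fields_ = [
        ("array", ArrowArray),
        ("device_id", ctypes.c_int64),
        ("device_type", ctypes.c_int32),
        ("sync_event", ctypes.c_void_p),
        ("reserved", ctypes.c_int64 * 3),
    ]


_api = ctypes.pythonapi
_api.PyCapsule_IsValid.restype = ctypes.c_int
_api.PyCapsule_IsValid.argtypes = [ctypes.py_object, ctypes.c_char_p]
_api.PyCapsule_GetPointer.restype = ctypes.c_void_p
_api.PyCapsule_GetPointer.argtypes = [ctypes.py_object, ctypes.c_char_p]


def describe_device_capsule(capsule):
    """Read device_type, device_id, and row count from a device-array capsule."""
    if not _api.PyCapsule_IsValid(capsule, CAPSULE_NAME):
        raise ValueError("expected a PyCapsule named 'arrow_device_array'")
    addr = _api.PyCapsule_GetPointer(capsule, CAPSULE_NAME)
    dev = ArrowDeviceArray.from_address(addr)  # a view over the struct, no copy
    return {
        "is_cuda": dev.device_type == ARROW_DEVICE_CUDA,
        "device_id": dev.device_id,
        "num_rows": dev.array.length,
    }
```

Passing the capsule returned by `pyxlog.export_arrow_device(...)` to this helper should report `is_cuda` as true and the CUDA device ordinal in `device_id`, provided the layout assumption holds.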