ThreadedScans.jl
ThreadedScans — Module
ThreadedScans: parallel scan implementations
ThreadedScans.jl provides thread-based parallel scan implementations for Julia. The main high-level API is ThreadedScans.scan!(op, xs).
High-level API
ThreadedScans.scan! — Function
ThreadedScans.scan!(op, xs::AbstractVector; [basesize]) -> xs
Compute the inclusive scan of the vector xs in place and in parallel, using an associative binary function op.
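To make "inclusive scan" concrete, the following minimal sketch shows the expected result using only Base's sequential `accumulate`. The commented-out call at the end uses the signature documented above and assumes the package is installed; it is not executed here.

```julia
# Reference semantics of an inclusive scan, using only Base (sequential).
xs = [1, 2, 3, 4, 5]
expected = accumulate(+, xs)        # element i is xs[1] + ... + xs[i]

# `op` only needs to be associative, not commutative.
# String concatenation is one such operator.
words = ["a", "b", "c"]
expected_words = accumulate(*, words)

# Assumed usage, following the documented signature (requires ThreadedScans.jl):
# using ThreadedScans
# ThreadedScans.scan!(+, xs)        # overwrites xs and returns it
# xs == expected
```

Because the scan is inclusive, the last element of the result is the full reduction of the input.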
Low-level API
ThreadedScans.partitioned_hillis_steele! — Function
ThreadedScans.partitioned_hillis_steele!(op, xs::AbstractVector) -> xs
Compute inclusive scan. Intermediate reductions for ntasks chunks are computed in parallel and then merged using the prefix-sum algorithm of Hillis and Steele. It performs marginally better than the other methods, especially when the number of worker threads is large.
Keyword arguments
ntasks = Threads.nthreads(): number of tasks
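The three phases described above (per-chunk scans, a Hillis-Steele scan over the chunk totals, and a final fix-up) can be sketched as follows. This is an illustration under stated assumptions, not the package's implementation: `sketch_partitioned_scan!` is a hypothetical name, the input is assumed to use 1-based indexing, `op` is assumed to return values that can be stored back into `xs`, and the merge over the chunk totals is written as a plain loop even though each step's updates are independent.

```julia
# Illustrative sketch only; `sketch_partitioned_scan!` is a hypothetical helper,
# not a function provided by ThreadedScans.jl.
function sketch_partitioned_scan!(op, xs::AbstractVector; ntasks::Integer = Threads.nthreads())
    Base.require_one_based_indexing(xs)
    n = length(xs)
    n == 0 && return xs
    nchunks = Int(clamp(ntasks, 1, n))
    # Contiguous, nonempty index ranges covering 1:n.
    chunks = [(((k - 1) * n) ÷ nchunks + 1):((k * n) ÷ nchunks) for k in 1:nchunks]

    # Phase 1: scan each chunk independently, one task per chunk.
    @sync for r in chunks
        Threads.@spawn for i in (first(r) + 1):last(r)
            xs[i] = op(xs[i - 1], xs[i])
        end
    end

    # Phase 2: Hillis-Steele inclusive scan over the per-chunk totals.
    # At distance d, every element combines with the one d places to its left.
    totals = [xs[last(r)] for r in chunks]
    d = 1
    while d < nchunks
        prev = copy(totals)
        for i in (d + 1):nchunks          # updates within one step are independent
            totals[i] = op(prev[i - d], prev[i])
        end
        d *= 2
    end

    # Phase 3: prepend the reduction of all earlier chunks to chunks 2..nchunks.
    @sync for k in 2:nchunks
        Threads.@spawn begin
            offset = totals[k - 1]
            for i in chunks[k]
                xs[i] = op(offset, xs[i])
            end
        end
    end
    return xs
end
```

The Hillis-Steele phase takes about log2(ntasks) steps, which is one plausible reason a method of this shape would benefit from a larger number of worker threads.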
ThreadedScans.dac! — Function
ThreadedScans.dac!(op, xs::AbstractVector) -> xs
Compute inclusive scan. Use a divide-and-conquer strategy for spawning the tasks. Intermediate reductions are propagated sequentially. It performs slightly better than ThreadedScans.linear!.
Keyword arguments
ntasks = Threads.nthreads(): number of tasks
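One way to read the description above is sketched below: tasks are spawned by recursively halving the range of chunk indices, while the running reduction across chunks is computed in a plain sequential loop. Both `dac_foreach` and `sketch_dac_scan!` are hypothetical names introduced for illustration, and the chunking scheme and 1-based indexing are assumptions; the package's actual implementation may differ.

```julia
# Hypothetical helper: call f(k) for every k in lo:hi, spawning tasks by
# recursive halving instead of a flat loop.
function dac_foreach(f, lo::Int, hi::Int)
    lo > hi && return nothing
    if lo == hi
        f(lo)
        return nothing
    end
    mid = (lo + hi) >>> 1
    left = Threads.@spawn dac_foreach(f, lo, mid)   # left half in a new task
    dac_foreach(f, mid + 1, hi)                     # right half in this task
    wait(left)
    return nothing
end

# Illustrative sketch only; not the implementation used by ThreadedScans.jl.
function sketch_dac_scan!(op, xs::AbstractVector; ntasks::Integer = Threads.nthreads())
    Base.require_one_based_indexing(xs)
    n = length(xs)
    n == 0 && return xs
    nchunks = Int(clamp(ntasks, 1, n))
    chunks = [(((k - 1) * n) ÷ nchunks + 1):((k * n) ÷ nchunks) for k in 1:nchunks]

    # Local scans, with tasks spawned in a divide-and-conquer pattern.
    dac_foreach(1, nchunks) do k
        r = chunks[k]
        for i in (first(r) + 1):last(r)
            xs[i] = op(xs[i - 1], xs[i])
        end
    end

    # Sequential propagation: offsets[k] is the reduction of everything
    # before chunk k (offsets[1] is never read).
    offsets = Vector{eltype(xs)}(undef, nchunks)
    for k in 2:nchunks
        prev_total = xs[last(chunks[k - 1])]
        offsets[k] = k == 2 ? prev_total : op(offsets[k - 1], prev_total)
    end

    # Apply the offsets, again spawning by recursive halving.
    dac_foreach(2, nchunks) do k
        offset = offsets[k]
        for i in chunks[k]
            xs[i] = op(offset, xs[i])
        end
    end
    return xs
end
```

With recursive halving, no single task has to spawn all the others, so the spawning work itself is spread across tasks.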
ThreadedScans.linear! — Function
ThreadedScans.linear!(op, xs::AbstractVector) -> xs
Compute inclusive scan. Spawning the tasks, waiting for them, and propagating the intermediate reductions are all done sequentially.
Keyword arguments
ntasks = Threads.nthreads(): number of tasks
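A minimal sketch of this fully sequential orchestration might look like the following: one loop spawns the per-chunk tasks, another waits for them, and a third carries the running reduction from chunk to chunk. `sketch_linear_scan!` is a hypothetical name, and the chunking scheme and 1-based indexing are assumptions rather than details taken from the package.

```julia
# Illustrative sketch only; not the implementation used by ThreadedScans.jl.
function sketch_linear_scan!(op, xs::AbstractVector; ntasks::Integer = Threads.nthreads())
    Base.require_one_based_indexing(xs)
    n = length(xs)
    n == 0 && return xs
    nchunks = Int(clamp(ntasks, 1, n))
    chunks = [(((k - 1) * n) ÷ nchunks + 1):((k * n) ÷ nchunks) for k in 1:nchunks]

    # Spawn the local scans one by one, then wait for them one by one.
    tasks = Task[]
    for r in chunks
        t = Threads.@spawn for i in (first(r) + 1):last(r)
            xs[i] = op(xs[i - 1], xs[i])
        end
        push!(tasks, t)
    end
    foreach(wait, tasks)

    # Carry the running reduction from chunk to chunk in a sequential loop.
    carry = xs[last(chunks[1])]
    fixups = Task[]
    for k in 2:nchunks
        r = chunks[k]
        offset = carry
        carry = op(carry, xs[last(r)])   # read the chunk total before it is updated
        t = Threads.@spawn for i in r
            xs[i] = op(offset, xs[i])
        end
        push!(fixups, t)
    end
    foreach(wait, fixups)
    return xs
end
```

Only the element-wise work inside each chunk runs in parallel here; all coordination happens on the calling task.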