ThreadedScans.jl
ThreadedScans — Module
ThreadedScans: parallel scan implementations
ThreadedScans.jl provides threading-based parallel scan implementations for Julia. The main high-level API is ThreadedScans.scan!(op, xs).
High-level API
ThreadedScans.scan! — Function
ThreadedScans.scan!(op, xs::AbstractVector; [basesize]) -> xs
Compute the inclusive scan of a vector xs in parallel, using an associative binary function op.
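To make "inclusive scan" concrete, here is a minimal sequential sketch of the result the docstring describes. The helper name inclusive_scan_reference! is hypothetical and not part of ThreadedScans; going by the description above, ThreadedScans.scan!(op, xs) should leave xs with the same contents whenever op is associative.

```julia
# Sequential reference for an in-place inclusive scan (hypothetical helper,
# not part of ThreadedScans). After the call, xs[i] holds the left fold of
# op over the original elements up to and including position i.
function inclusive_scan_reference!(op, xs::AbstractVector)
    for i in firstindex(xs)+1:lastindex(xs)
        xs[i] = op(xs[i-1], xs[i])
    end
    return xs
end

xs = [1, 2, 3, 4, 5]
inclusive_scan_reference!(+, xs)   # xs is now [1, 3, 6, 10, 15]
```

Associativity matters for the parallel versions because they combine partial results in a different grouping than this left-to-right loop does.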
Low-level API
ThreadedScans.partitioned_hillis_steele! — Function
ThreadedScans.partitioned_hillis_steele!(op, xs::AbstractVector) -> xs
Compute the inclusive scan. Intermediate reductions for ntasks chunks are computed in parallel and then merged using the prefix-sum algorithm of Hillis and Steele. It performs marginally better than the other methods, especially when the number of worker threads is large.
Keyword arguments
ntasks = Threads.nthreads(): number of tasks
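The partition-then-merge idea can be sketched as follows. This is an illustrative sketch under stated assumptions, not the package's implementation: the names hillis_steele! and partitioned_scan_sketch! are hypothetical, the chunking scheme is an assumption, and the Hillis-Steele step over the chunk totals runs in a plain loop here even though every update within one step is independent.

```julia
using Base.Threads: @spawn, nthreads

# Hillis-Steele inclusive scan (hypothetical helper). At each step with
# stride d, element i is combined with the element d positions to its left.
function hillis_steele!(op, ys::AbstractVector)
    n = length(ys)
    d = 1
    while d < n
        prev = copy(ys)               # read from the previous step only
        for i in d+1:n
            ys[i] = op(prev[i-d], prev[i])
        end
        d *= 2
    end
    return ys
end

# Sketch only: assumes a 1-based Vector and an associative op.
function partitioned_scan_sketch!(op, xs::Vector; ntasks::Integer = nthreads())
    n = length(xs)
    n == 0 && return xs
    nchunks = clamp(ntasks, 1, n)
    # Contiguous, non-empty index ranges, one per task.
    bounds = [(k * n) ÷ nchunks for k in 0:nchunks]
    chunks = [bounds[k]+1:bounds[k+1] for k in 1:nchunks]
    # Phase 1: scan each chunk independently, in parallel.
    @sync for r in chunks
        @spawn begin
            for i in first(r)+1:last(r)
                xs[i] = op(xs[i-1], xs[i])
            end
        end
    end
    # Phase 2: merge the per-chunk reductions with Hillis-Steele.
    totals = [xs[last(r)] for r in chunks]
    hillis_steele!(op, totals)
    # Phase 3: shift every chunk after the first by the total of all
    # preceding chunks, again in parallel.
    @sync for k in 2:nchunks
        @spawn begin
            offset = totals[k-1]
            for i in chunks[k]
                xs[i] = op(offset, xs[i])
            end
        end
    end
    return xs
end
```

The offset is always applied as the left operand, so the sketch also works for associative operations that are not commutative, such as string concatenation.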
ThreadedScans.dac! — Function
ThreadedScans.dac!(op, xs::AbstractVector) -> xs
Compute the inclusive scan, using a divide-and-conquer strategy to spawn the tasks. Intermediate reductions are propagated sequentially. It performs slightly better than ThreadedScans.linear!.
Keyword arguments
ntasks = Threads.nthreads(): number of tasks
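A rough sketch of divide-and-conquer task spawning with sequential propagation might look like the following. The names dac_foreach and dac_scan_sketch! are hypothetical, and the chunking and the final parallel fix-up pass are assumptions made for illustration; the package's actual implementation may differ.

```julia
using Base.Threads: @spawn, nthreads

# Run f(k) for every k in lo:hi, spawning tasks by recursive halving
# rather than from a single flat loop (hypothetical helper).
function dac_foreach(f, lo::Integer, hi::Integer)
    if lo == hi
        f(lo)
    elseif lo < hi
        mid = (lo + hi) ÷ 2
        left = @spawn dac_foreach(f, lo, mid)   # left half in a new task
        dac_foreach(f, mid + 1, hi)             # right half in this task
        wait(left)
    end
    return nothing
end

# Sketch only: assumes a 1-based Vector and an associative op.
function dac_scan_sketch!(op, xs::Vector; ntasks::Integer = nthreads())
    n = length(xs)
    n == 0 && return xs
    nchunks = clamp(ntasks, 1, n)
    bounds = [(k * n) ÷ nchunks for k in 0:nchunks]
    chunks = [bounds[k]+1:bounds[k+1] for k in 1:nchunks]
    # Phase 1: local scans, with tasks spawned by divide and conquer.
    dac_foreach(1, nchunks) do k
        r = chunks[k]
        for i in first(r)+1:last(r)
            xs[i] = op(xs[i-1], xs[i])
        end
    end
    # Phase 2: propagate the per-chunk reductions sequentially.
    offsets = [xs[last(r)] for r in chunks]
    for k in 2:nchunks
        offsets[k] = op(offsets[k-1], offsets[k])
    end
    # Phase 3 (assumed): apply each chunk's offset in parallel.
    dac_foreach(2, nchunks) do k
        offset = offsets[k-1]
        for i in chunks[k]
            xs[i] = op(offset, xs[i])
        end
    end
    return xs
end
```

Recursive halving spreads the cost of creating tasks across workers: the spawning chain is about log2(ntasks) deep, whereas a flat loop has one task issue all ntasks spawns.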
ThreadedScans.linear! — Function
ThreadedScans.linear!(op, xs::AbstractVector) -> xs
Compute the inclusive scan. Spawning tasks, waiting on them, and propagating intermediate reductions are all done sequentially.
Keyword arguments
ntasks = Threads.nthreads(): number of tasks
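For comparison, the fully sequential orchestration described above can be sketched like this. The name linear_scan_sketch! is hypothetical, and the chunking and the final fix-up pass are assumptions made for illustration rather than the package's actual code.

```julia
using Base.Threads: @spawn, nthreads

# Sketch only: assumes a 1-based Vector and an associative op.
function linear_scan_sketch!(op, xs::Vector; ntasks::Integer = nthreads())
    n = length(xs)
    n == 0 && return xs
    nchunks = clamp(ntasks, 1, n)
    bounds = [(k * n) ÷ nchunks for k in 0:nchunks]
    chunks = [bounds[k]+1:bounds[k+1] for k in 1:nchunks]
    # Spawn one task per chunk from a single flat loop.
    tasks = map(chunks) do r
        @spawn begin
            for i in first(r)+1:last(r)
                xs[i] = op(xs[i-1], xs[i])
            end
        end
    end
    # Wait for the tasks in order, folding chunk totals as they complete.
    offsets = similar(xs, nchunks)
    for k in 1:nchunks
        wait(tasks[k])
        total = xs[last(chunks[k])]
        offsets[k] = k == 1 ? total : op(offsets[k-1], total)
    end
    # Fix-up pass (assumed): shift each later chunk by its offset.
    fixups = map(2:nchunks) do k
        @spawn begin
            for i in chunks[k]
                xs[i] = op(offsets[k-1], xs[i])
            end
        end
    end
    foreach(wait, fixups)
    return xs
end
```

The per-chunk work still runs in parallel; only the spawning, waiting, and propagation of totals happen one after another on the calling task.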