SplittablesBase.jl
SplittablesBase — ModuleSplittablesBase: a simple API for parallel computation on collections
SplittablesBase.jl defines a simple set of APIs:
halve(collection): splitting givencollectionroughly in half.amount(collection): an "approximation" oflength.
These are the basis of parallel algorithms that can be derived from reduce. Custom containers can support many parallel algorithms by simply defining these functions.
SplittablesTesting.jl provides simple test utility functions SplittablesTesting.test_ordered(examples) and SplittablesTesting.test_unordered(examples) that run some automatable tests with each example container in examples.
See more in the documentation.
Supported collections
halve methods for following collections in Base are implemented in SplittablesBase.jl:
AbstractArrayAbstractStringDictkeys(::Dict)values(::Dict)SetTupleNamedTuplezippairsBase.GeneratorIterators.filterIterators.flattenIterators.partitionIterators.productIterators.enumerateIterators.reverseskipmissing
Packages using SplittablesBase.jl
See also
SplittablesBase.halve — FunctionSplittablesBase.halve(collection) -> (left, right)Split collection (roughly) in half.
Examples
julia> using SplittablesBase: halve
julia> halve([1, 2, 3, 4])
([1, 2], [3, 4])Implementation
Implementations of halve on custom collections must satisfy the following laws.
(1) If the original collection is ordered, concatenating the sub-collections returned by halve must create a collection that is equivalent to the original collection. More precisely,
isequal(
vec(collect(collection)),
vcat(vec(collect(left)), vec(collect(right))),
)must hold.
Similar relationship must hold for unordered collections; i.e., taking union of left and right collections as multiset must create a collection that is equivalent to the original collection as a multiset:
using StatsBase: countmap
isequal(
countmap(collect(collection)),
merge(+, countmap(collect(left)), countmap(collect(right))),
)(2) halve must eventually shorten the collection. More precisely, the following function must terminate:
function recursive_halve(collection)
length(collection) <= 1 && return
left, right = halve(collection)
recursive_halve(left)
recursive_halve(right)
endFurthermore, whenever implementable with cheap operations, length(left) should be close to length(collection) ÷ 2 as much as possible.
SplittablesBase.amount — FunctionSplittablesBase.amount(collection) :: IntReturn the number of elements in collection or rough "estimate" of it.
Examples
julia> using SplittablesBase: amount
julia> amount([1, 2, 3, 4])
4
julia> amount("aえ𝑖∅υ")
12
julia> length("aえ𝑖∅υ") # != `amount`
5Note that amount on strings is not equal to length because the latter cannot be computed in O(1) time.
Implementation
Implementations of amount on a collection must satisfy the following laws.
(1) Any empty collection must have zero amount.
(2) Any operation that increments length on collection must increments amount.
Ideally, the time-complexity of amount should be O(1).
SplittablesTesting.test_ordered — FunctionSplittablesTesting.test_ordered(examples)
SplittablesTesting.test_unordered(examples)Run interface tests on each test case in examples.
examples is an iterator where each element is either:
A container to be tested.
A
NamedTuplewith following keys:label: A label used forTest.@testcase.:data: A container to be tested.
Examples
julia> using SplittablesBase
julia> SplittablesBase.Testing.test_ordered([
(label = "First Test", data = 1:5),
(label = "Second Test", data = (a = 1, b = 2, c = 3)),
zip(1:3, 4:6),
]);
Test Summary: | Pass Total
First Test | 2 2
Test Summary: | Pass Total
Second Test | 2 2
Test Summary: | Pass Total
3 | 2 2SplittablesTesting.test_unordered — FunctionSplittablesTesting.test_unordered(examples)See test_ordered.