SplittablesBase.jl
SplittablesBase
— ModuleSplittablesBase: a simple API for parallel computation on collections
SplittablesBase.jl defines a simple set of APIs:
halve(collection)
: splitting givencollection
roughly in half.amount(collection)
: an "approximation" oflength
.
These are the basis of parallel algorithms that can be derived from reduce
. Custom containers can support many parallel algorithms by simply defining these functions.
SplittablesTesting.jl provides simple test utility functions SplittablesTesting.test_ordered(examples)
and SplittablesTesting.test_unordered(examples)
that run some automatable tests with each example container in examples
.
See more in the documentation.
Supported collections
halve
methods for following collections in Base
are implemented in SplittablesBase.jl:
AbstractArray
AbstractString
Dict
keys(::Dict)
values(::Dict)
Set
Tuple
NamedTuple
zip
pairs
Base.Generator
Iterators.filter
Iterators.flatten
Iterators.partition
Iterators.product
Iterators.enumerate
Iterators.reverse
skipmissing
Packages using SplittablesBase.jl
See also
SplittablesBase.halve
— FunctionSplittablesBase.halve(collection) -> (left, right)
Split collection
(roughly) in half.
Examples
julia> using SplittablesBase: halve
julia> halve([1, 2, 3, 4])
([1, 2], [3, 4])
Implementation
Implementations of halve
on custom collections must satisfy the following laws.
(1) If the original collection is ordered, concatenating the sub-collections returned by halve
must create a collection that is equivalent to the original collection. More precisely,
isequal(
vec(collect(collection)),
vcat(vec(collect(left)), vec(collect(right))),
)
must hold.
Similar relationship must hold for unordered collections; i.e., taking union of left and right collections as multiset must create a collection that is equivalent to the original collection as a multiset:
using StatsBase: countmap
isequal(
countmap(collect(collection)),
merge(+, countmap(collect(left)), countmap(collect(right))),
)
(2) halve
must eventually shorten the collection. More precisely, the following function must terminate:
function recursive_halve(collection)
length(collection) <= 1 && return
left, right = halve(collection)
recursive_halve(left)
recursive_halve(right)
end
Furthermore, whenever implementable with cheap operations, length(left)
should be close to length(collection) ÷ 2
as much as possible.
SplittablesBase.amount
— FunctionSplittablesBase.amount(collection) :: Int
Return the number of elements in collection
or rough "estimate" of it.
Examples
julia> using SplittablesBase: amount
julia> amount([1, 2, 3, 4])
4
julia> amount("aえ𝑖∅υ")
12
julia> length("aえ𝑖∅υ") # != `amount`
5
Note that amount
on strings is not equal to length
because the latter cannot be computed in O(1) time.
Implementation
Implementations of amount
on a collection must satisfy the following laws.
(1) Any empty collection must have zero amount
.
(2) Any operation that increments length
on collection must increments amount
.
Ideally, the time-complexity of amount
should be O(1).
SplittablesTesting.test_ordered
— FunctionSplittablesTesting.test_ordered(examples)
SplittablesTesting.test_unordered(examples)
Run interface tests on each test case in examples
.
examples
is an iterator where each element is either:
A container to be tested.
A
NamedTuple
with following keys:label
: A label used forTest.@testcase
.:data
: A container to be tested.
Examples
julia> using SplittablesBase
julia> SplittablesBase.Testing.test_ordered([
(label = "First Test", data = 1:5),
(label = "Second Test", data = (a = 1, b = 2, c = 3)),
zip(1:3, 4:6),
]);
Test Summary: | Pass Total
First Test | 2 2
Test Summary: | Pass Total
Second Test | 2 2
Test Summary: | Pass Total
3 | 2 2
SplittablesTesting.test_unordered
— FunctionSplittablesTesting.test_unordered(examples)
See test_ordered
.