Folds.jl

Folds.FoldsModule

Folds: sequential, threaded, and distributed fold interface for Julia

Stable Dev GitHub Actions

Folds.jl provides a unified interface for sequential, threaded, and distributed folds.

julia> using Folds

julia> Folds.sum(1:10)
55

julia> Folds.sum(1:10, ThreadedEx())  # equivalent to above
55

julia> Folds.sum(1:10, DistributedEx())
55

Iterator transforms and transducers

Most of the functions can be used with iterator comprehensions:

julia> Folds.sum(y for x in 1:10 if isodd(x) for y in 1:x^2)
4917

and Transducers.jl:

julia> using Transducers

julia> 1:10 |> Filter(isodd) |> MapCat(x -> 1:x^2) |> Folds.sum
4917

Package interop

Folds.jl interoperates with various packages. For example, StructArrays.jl can be used as an input and/or output:

julia> using StructArrays

julia> table = StructVector(
           x = [:a, :a, :b, :a, :b],
           y = [1, 2, 3, 4, 5],
       );

julia> Folds.copy(StructVector, (row for row in table if row.x === :a))
3-element StructArray(::Vector{Symbol}, ::Vector{Int64}) with eltype NamedTuple{(:x, :y), Tuple{Symbol, Int64}}:
 (x = :a, y = 1)
 (x = :a, y = 2)
 (x = :a, y = 4)

It also works with OnlineStats.jl by treating it as a reducing function (or more precisely a monoid):

julia> using OnlineStats

julia> Folds.reduce(Mean(), 1:10)
Mean: n=10 | value=5.5

Extensible execution mechanism

Folds.jl decouples the implementation and the execution mechanism ("executor"). Additional executors can be installed from FoldsThreads.jl, FoldsCUDA.jl (rather WIP), and FoldsDagger.jl (very WIP).

source
Folds.allFunction
Folds.all([f,] collection; [executor_options...])
Folds.all([f,] collection, executor)

Check if all the elements in collection, optionally evaluated by f, are true. Parallel by default.

See Extended help in Folds.reduce for more information.

Examples

julia> using Folds

julia> Folds.all([true, true, true])
true

julia> Folds.all([true, true, false])
false

julia> Folds.all(n -> (ℯ * (n/ℯ)^n ≤ factorial(n)), (big(n) for n in 1:100))
true
source
Folds.anyFunction
Folds.any([f,] collection; [executor_options...])
Folds.any([f,] collection, executor)

Check if any of the elements in collection, optionally evaluated by f, is true. Parallel by default.

See Extended help in Folds.reduce for more information.

Examples

julia> using Folds

julia> Folds.any([true, false, false])
true

julia> Folds.any([false, false, false])
false

julia> Folds.any(a/(b+c) + b/(a+c) + c/(a+b) < 3/2 for a in 1:100, b in 1:100, c in 1:100)
false
source
Folds.collectFunction
Folds.collect(collection; [executor_options...]) :: AbstractArray
Folds.collect(collection, executor) :: AbstractArray

Materialize collection as an array. Parallel by default.

Iterator transformations such as (f(x) for x in xs if p(x)) wrapping parallelizable container(s) xs are executed in parallel. See Extended help in Folds.reduce for more information.

Unlike Base.collect, the output can be an array of type other than Array.

Examples

julia> using Folds

julia> Folds.collect(x^2 for x in 1:4 if isodd(x))
2-element Vector{Int64}:
 1
 9

julia> Folds.collect(i for i in 1:10_000_000 if sin(i) > 1 - 1e-12)
4-element Vector{Int64}:
  573204
 4846147
 7138963
 9119090
source
Folds.copyFunction
Folds.copy(T::Type, collection; [executor_options...]) :: T
Folds.copy(T::Type, collection, executor) :: T

Materialize collection as a collection of type T. Parallel by default.

See Extended help in Folds.reduce for more information.

Examples

julia> using Folds, StructArrays

julia> Folds.copy(StructVector, ((x = x, y = x^2) for x in 1:3))
3-element StructArray(::Vector{Int64}, ::Vector{Int64}) with eltype NamedTuple{(:x, :y), Tuple{Int64, Int64}}:
 (x = 1, y = 1)
 (x = 2, y = 4)
 (x = 3, y = 9)
source
Folds.countFunction
Folds.count([f,] collection; [executor_options...])
Folds.count([f,] collection, executor)

Count the number of true items in collection or items that evaluates to true by f. Parallel by default.

See Extended help in Folds.reduce for more information.

Examples

julia> using Folds

julia> Folds.count([true, true, false])
2

julia> Folds.count(gcd(x, 857142) == 1 for x in 1:10_000_000)
2721603
source
Folds.extremaFunction
Folds.extrema([f,] collection; [executor_options...])
Folds.extrema([f,] collection, executor)

Compute the minimum and the maximum of the items in collection, optionally evaluated by f. Parallel by default.

See Extended help in Folds.reduce for more information.

Examples

julia> using Folds

julia> Folds.extrema([4, 8, 3, 5, 5])
(3, 8)

julia> Folds.extrema(xor(i, (i + one(i))^i) for i in UInt32(1):UInt32(10_000_000))
(0x00000003, 0xfffffa3d)
source
Folds.findallFunction
Folds.findall([f,] collection; [executor_options...])
Folds.findall([f,] collection, executor)

Find all indices for which the item is true or evaluates to true by f. Parallel by default.

See Extended help in Folds.reduce for more information.

Examples

julia> using Folds

julia> let pidigits = string(BigFloat(π; precision = 2^20))[3:end]
           Folds.findall(1:length(pidigits)) do i
               startswith(SubString(pidigits, i), string(i))
           end
       end
3-element Vector{Int64}:
     1
 16470
 44899
source
Folds.findfirstFunction
Folds.findfirst([f,] collection; [executor_options...])
Folds.findfirst([f,] collection, executor)

Find the first index containing true or, if f is given, an item that evaluates to true by f. Parallel by default.

See Extended help in Folds.reduce for more information.

Examples

julia> using Folds

julia> let s1 = string(BigFloat(π; precision = 2^20))[3:end],
           s2 = string(BigFloat(ℯ; precision = 2^20))[3:end],
           w = 4
           Folds.findfirst(1:length(s1)-w; basesize = 10000) do i
               SubString(s1, i, i + w) == SubString(s2, i, i + w)
           end
       end
26548
source
Folds.findlastFunction
Folds.findlast([f,] collection; [executor_options...])
Folds.findlast([f,] collection, executor)

Find the last index containing true or, if f is given, an item that evaluates to true by f. Parallel by default.

See Extended help in Folds.reduce for more information.

Examples

julia> using Folds

julia> let s1 = string(BigFloat(π; precision = 2^20))[3:end],
           s2 = string(BigFloat(ℯ; precision = 2^20))[3:end],
           w = 4
           Folds.findlast(1:length(s1)-w; basesize = 10000) do i
               SubString(s1, i, i + w) == SubString(s2, i, i + w)
           end
       end
303001
source
Folds.foreachFunction
Folds.foreach(f, collections...; [executor_options...])
Folds.foreach(f, collections..., executor)

Call f on the elements of collections in parallel.

This is equivalent to

for x in zip(collections...)
    f(x...)
end

except that f may be applied in parallel and in unspecified order.

Referenceables.jl can be used to update the elements in (a subset of) collections.

See Folds.reduce for more information.

Examples

julia> using Folds, Referenceables

julia> xs = 1:2^20;

julia> ys = ones(length(xs));

julia> Folds.foreach(referenceable(ys), xs) do y, x
           y[] = sin(y[] * x)
       end
source
Folds.issortedFunction
Folds.issorted(collection; [lt,] [by,] [rev,] [order,] [executor_options...])
Folds.issorted(collection, executor; [lt,] [by,] [rev,] [order,])

Check if collection is sorted. Parallel by default.

See Extended help in Folds.reduce for more information.

Examples

julia> using Folds

julia> Folds.issorted(0:9)
true

julia> Folds.issorted([0:1000_0000; [0]])
false
source
Folds.mapFunction
Folds.map(f, collections...; [executor_options...])
Folds.map(f, collections..., executor)

Equivalent to Folds.collect(f(x...) for x in zip(collections...)). Parallel by default.

See Extended help in Folds.reduce for more information.

Examples

julia> using Folds

julia> Folds.map(+, 1:3, 2:2:6)
3-element Vector{Int64}:
 3
 6
 9

julia> Folds.map(i -> floor(Int, i / π), (i for i in 1:10_000_000 if sin(i) > 1 - 1e-12))
4-element Vector{Int64}:
  182456
 1542576
 2272402
 2902696
source
Folds.map!Function
Folds.map!(f, dest, collections...; [executor_options...]) -> dest
Folds.map!(f, dest, collections..., executor) -> dest

Equivalent to dest .= f.(collections...). Parallel by default.

See Extended help in Folds.reduce for more information.

Examples

julia> using Folds

julia> Folds.map!(+, zeros(Int, 3), 1:3, 2:2:6)
3-element Vector{Int64}:
 3
 6
 9

julia> Folds.map!(sin, Vector{Float64}(undef, 2^20), 1:2^20);
source
Folds.mapreduceFunction
Folds.mapreduce(f, op, collections...; [init] [executor_options...])
Folds.mapreduce(f, op, collections..., executor; [init])

Equivalent to Folds.reduce(op, (f(x...) for x in zip(collections...))).

See Folds.reduce for more information.

source
Folds.maximumFunction
Folds.maximum([f,] collection; [executor_options...])
Folds.maximum([f,] collection, executor)

Compute the minimum of the items in collection, optionally evaluated by f. Parallel by default.

See Extended help in Folds.reduce for more information.

Examples

julia> using Folds

julia> Folds.maximum([4, 8, 3, 5, 5])
8

julia> Folds.maximum(xor(i, (i + one(i))^i) for i in UInt32(1):UInt32(10_000_000))
0xfffffa3d
source
Folds.minimumFunction
Folds.maximum([f,] collection; [executor_options...])
Folds.maximum([f,] collection, executor)

Compute the maximum of the items in collection, optionally evaluated by f. Parallel by default.

See Extended help in Folds.reduce for more information.

Examples

julia> using Folds

julia> Folds.minimum([4, 8, 3, 5, 5])
3

julia> Folds.minimum(xor(i, (i + one(i))^i) for i in UInt32(1):UInt32(10_000_000))
0x00000003
source
Folds.prodFunction
Folds.prod([f,] collection; [init,] [executor_options...])
Folds.prod([f,] collection, executor; [init])

Compute f(x₁) * f(x₂) * f(x₃) * ... * f(xₙ) for the elements xᵢ in collection with f defaults to identity. Parallel by default.

init should be an object that behaves like the identity of *.

See Extended help in Folds.reduce for more information.

Examples

julia> using Folds

julia> Folds.prod(1:5)
120

julia> floor(Folds.prod(1 + (sin(n) + 1)/10_000_000 for n in 1:10_000_000); digits = 3)
2.718
source
Folds.reduceFunction
Folds.reduce(op, collection; [init] [executor_options...])
Folds.reduce(op, collection, executor; [init])

Reduce a collection with an associative operator. Parallel by default.

Given a collection with elements x₁, x₂, ..., xₙ, and an associative operator , Folds.reduce(⊗, collection) computes

x₁ ⊗ x₂ ⊗ x₃ ⊗ x₄ ⊗ ... ⊗ xₙ

If no executor is specified, executor_options are passed to the default executor for collection.

Extended help

If no executor is specified, an appropriate executor is chosen automatically based on collection (e.g., CUDAEx for CuArrays if FoldsCUDA.jl is loaded) assuming that the reduction can be parallelized; i.e.,:

  1. iteration over collection and evaluation ofop are data race-free,
  2. binary function op is (at least approximately) associative and init behaves as the identity of op.

For example, consider

Folds.reduce(op, (f(x) for x in xs if p(x))

The first assumption indicates that Folds.reduce requires that op, f, p, and iterate on xs do not have any data races. For example, a stateless function is safe to use. If these functions need to access shared state that can be mutated while invoking Folds.reduce, it must be protected using, e.g., lock or atomic. Note that, for a good performance, it is recommended to restructure the code to avoid requiring locks in these functions.

The second point indicates that Folds.reduce requires op to be associative on the set of all values that f can produce. The default executor and many other executors do not require exact associativity for deterministic result, provided that scheduling parameters (e.g., basesize) are configured. For example, Folds.reduce(+, floats, ThreadedEx(); init = 0.0) may produce slightly different result when julia is started wih different number of threads. For a deterministic result independent of the number of threads in julia, use ThreadedEx(basesize = ...) where ... is a large enough number. Different executor may require different properties of op (e.g., exact associativity, commutativity); check the documentation of the executor for more information.

The default executor is chosen based on collection. If collection is an iterator transformed from another iterator, the innermost iterator is used for determining the executor. Consider the following values for collection:

xs
(f(x) for x in xs)
(f(x) for x in xs if p(x))

In all cases, xs determines the executor to be used. Thus, the reduction

xs :: CuArray
Folds.reduce(+, (f(x) for x in xs if p(x)))

uses CUDAEx executor if FoldsCUDA.jl is loaded. If collection is a zip or Iterators.product, Folds.reduce tries to find an appropriate executor using a promotion mechanism.

It is safe for the operator op to mutate of the first argument if Transducers.OnInit is used for init. It can be used to create mutable accumulators (the object passed to the first argument to op) that can be mutated without a data race. Since the second argument to op can be originated from collection or another output of op, mutating it can lead to unpredictable side-effects although it may not be a problem in some cases (e.g., collection would be thrown away after this computation).

source
Folds.scan!Function
Folds.scan!(op, xs; [executor_options...]) -> xs
Folds.scan!(op, xs, executor) -> xs

Inclusive in-place scan.

Folds.scan!(⊗, xs) computes

[xs[1], xs[1] ⊗ xs[2], xs[1] ⊗ xs[2] ⊗ xs[3], ...]

stores the result in xs, and returns xs.

Examples

julia> using Folds

julia> Folds.scan!(+, [1, 2, 3])
3-element Vector{Int64}:
 1
 3
 6
source
Folds.setFunction
Folds.set(collection; [executor_options...]) :: AbstractSet
Folds.set(collection, executor) :: AbstractSet

Materialize collection as a set. Parallel by default.

See Extended help in Folds.reduce for more information.

Examples

julia> using Folds

julia> pidigits = deleteat!([x - '0' for x in string(BigFloat(π; precision = 2^20))], 2);

julia> @show pidigits[1:5];
pidigits[1:5] = [3, 1, 4, 1, 5]

julia> sort!(collect(Folds.set(x for x in Iterators.partition(pidigits, 8) if issorted(x))))
8-element Vector{SubArray{Int64, 1, Vector{Int64}, Tuple{UnitRange{Int64}}, true}}:
 [0, 0, 1, 1, 4, 4, 4, 6]
 [0, 0, 1, 2, 2, 3, 7, 8]
 [0, 1, 1, 1, 3, 7, 8, 8]
 [0, 1, 3, 4, 5, 6, 6, 8]
 [1, 1, 5, 5, 8, 8, 8, 8]
 [3, 4, 9, 9, 9, 9, 9, 9]
 [4, 4, 5, 5, 6, 8, 8, 9]
 [5, 5, 5, 5, 8, 9, 9, 9]
source
Folds.sumFunction
Folds.sum([f,] collection; [init,] [executor_options...])
Folds.sum([f,] collection, executor; [init])

Compute f(x₁) + f(x₂) + f(x₃) + ... + f(xₙ) for the elements xᵢ in collection with f defaults to identity. Parallel by default.

init should be an object that behaves like the identity of +.

See Extended help in Folds.reduce for more information.

Example

julia> using Folds

julia> f(x) = gcd(x, 42);

julia> Folds.sum(f, 1:1000_000)
4642844

julia> Folds.sum(f, 1:1000_000, SequentialEx())
4642844
source
Folds.uniqueFunction
Folds.unique([f,] collection; [executor_options...])
Folds.unique([f,] collection, executor)

List the unique elements from collection in the order that appears in collection. If f given, the uniqueness is determined by comparing its output. Parallel by default.

See Extended help in Folds.reduce for more information.

Examples

julia> using Folds

julia> Folds.unique([2, 4, 3, 0, 0, 4, 3, 4, 3, 1, 0, 0, 4, 1, 4, 1, 3, 3, 4, 0])
5-element Vector{Int64}:
 2
 4
 3
 0
 1

julia> pidigits = deleteat!([x - '0' for x in string(BigFloat(π; precision = 2^20))], 2);

julia> @show pidigits[1:5];
pidigits[1:5] = [3, 1, 4, 1, 5]

julia> Folds.unique(x for x in pidigits if isodd(x))
5-element Vector{Int64}:
 3
 1
 5
 9
 7
source