How to make your data type reducible
Let's see how to make a vector-of-vector a reducible collection; i.e., a type that can be fed to foldl
.
struct VecOfVec{T}
vectors::Vector{Vector{T}}
end
Method 1: The __foldl__
interface:
We need @next
and complete
to invoke the reducing function rf
.
using Transducers
using Transducers: @next, complete
Supporting foldl
and similar only requires Transducers.__foldl__
:
function Transducers.__foldl__(rf, val, vov::VecOfVec)
for vector in vov.vectors
for x in vector
val = @next(rf, val, x)
end
end
return complete(rf, val)
end
Note that it's often a good idea to implement Base.eltype
:
Base.eltype(::VecOfVec{T}) where {T} = T
It can be then used as the input to the transducers:
vov = VecOfVec(collect.([1:n for n in 1:3]))
collect(Map(identity), vov)
6-element Vector{Int64}: 1 1 2 1 2 3
Macro @next
is used instead of function next
to avoid the boilerplate for supporting early termination (see the details in in @next
documentation). In practice, using @next
means that your __foldl__
supports early termination:
vov |> Take(3) |> collect
3-element Vector{Int64}: 1 1 2
More complex example:
vov |> PartitionBy(isequal(1)) |> Zip(Map(copy), Map(sum)) |> collect
4-element Vector{Tuple{Vector{Int64}, Int64}}: ([1, 1], 2) ([2], 2) ([1], 1) ([2, 3], 5)
Method 2: In terms of existing transducers and reducibles with asfoldable
Transducers.jl has a function Transducers.asfoldable
which can be overloaded to convert your existing type into a foldable before being fed into __foldl__
. This allows us to express a new type in terms of existing structures:
struct VecOfVec2{T}
vectors::Vector{T}
end
Transducers.asfoldable(vov::VecOfVec2) = vov.vectors |> Cat()
This simply tells Transducers.jl that a VecOfVec2
may be folded over by simply applying the Cat
transducer to the .vectors
field of the struct. This is enough to reproduce all the examples from the previous section:
vov2 = VecOfVec2(collect.([1:n for n in 1:3]))
collect(Map(identity), vov2)
6-element Vector{Int64}: 1 1 2 1 2 3
vov2 |> Take(3) |> collect
3-element Vector{Int64}: 1 1 2
vov2 |> PartitionBy(isequal(1)) |> Zip(Map(copy), Map(sum)) |> collect
4-element Vector{Tuple{Vector{Int64}, Int64}}: ([1, 1], 2) ([2], 2) ([1], 1) ([2, 3], 5)
Comparison to defining an Iterator:
Notice that writing Transducers.__foldl__
is very straightforward comparing to how to define an iterator:
function Base.iterate(vov::VecOfVec, state=(1, 1))
Iterator state
is a tuple of an index i
to vov.vectors
and an index j
to vov.vectors[i]
:
i, j = state
If i
is larger than the number of items, we are done:
i > length(vov.vectors) && return nothing
If j
is in bound, we are iterating the same sub-vector:
vi = vov.vectors[i]
if j <= length(vi)
return vi[j], (i, j + 1)
end
Otherwise, find the next non-empty sub-vector and start iterating it:
for k in i + 1:length(vov.vectors)
vk = vov.vectors[k]
if !isempty(vk)
return vk[1], (k, 2) # i=k, j=2
end
end
return nothing
end
Base.length(vov::VecOfVec) = sum(length, vov.vectors)
collect(vov)
6-element Vector{Int64}: 1 1 2 1 2 3
This page was generated using Literate.jl.