# Useful patterns

This page includes some useful patterns using Transducers.jl.

`using Transducers`

## Flattening nested objects using `MapCat`

### Simple `MapCat` usage

Consider a vector of "objects" (here just `NamedTuple`s), each of which in turn contains a vector of objects:

```
nested_objects = [
(a = 1, b = [(c = 2, d = 3), (c = 4, d = 5)]),
(a = 10, b = [(c = 20, d = 30), (c = 40, d = 50)]),
];
```

We can flatten this into a table by using `Map` inside `MapCat`:

```
using TypedTables
astable(xs) = copy(Table, xs) # using `TypedTables` for a nice display
table1 = nested_objects |> MapCat() do x
x.b |> Map() do b # not `MapCat`
(a = x.a, b...)
end
end |> astable
```

    Table with 3 columns and 4 rows:
         a   c   d
       ┌───────────
     1 │ 1   2   3
     2 │ 1   4   5
     3 │ 10  20  30
     4 │ 10  40  50

(Note that the transducer used inside `MapCat` is `Map`, not `MapCat`.)
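As a smaller illustration of the same idea, here is a minimal sketch on plain integers. The variable names (`flat`, `xy_pairs`) are made up for this example. The first call shows that `MapCat(f)` concatenates the iterables returned by `f`. The second shows an inner `Map` closing over the outer variable, as the table example above does with `x.a`.

```julia
using Transducers

# `MapCat(f)` maps each input with `f` and concatenates the resulting iterables.
flat = collect(MapCat(x -> 1:x), 1:3)
@assert flat == [1, 1, 2, 1, 2, 3]

# The inner `Map` closes over the outer variable `x`; it produces exactly one
# output per inner item, so there is nothing left to concatenate at that level.
xy_pairs = 1:2 |> MapCat(x -> (x:2 |> Map(y -> (x, y)))) |> collect
@assert xy_pairs == [(1, 1), (1, 2), (2, 2)]
```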

### Nested `MapCat`

This pattern can handle more nested objects:

```
more_nested_objects = [
(a = 1, b = [(c = 2, d = [(e = 3, f = 4), (e = 4, f = 5)]),
(c = 6, d = [])]),
(a = 10, b = [(c = 20, d = [(e = 30, f = 40), (e = 40, f = 50)])]),
];
```

Such data can be flattened by nesting `MapCat` (except for the innermost processing, which uses `Map` since there is nothing to con**cat**enate):

```
table3 =
more_nested_objects |> MapCat() do x
x.b |> MapCat() do b
b.d |> Map() do d
(a = x.a, c = b.c, d...)
end
end
end |> astable
```

    Table with 4 columns and 4 rows:
         a   c   e   f
       ┌───────────────
     1 │ 1   2   3   4
     2 │ 1   2   4   5
     3 │ 10  20  30  40
     4 │ 10  20  40  50

### Comparison with iterator comprehension

As a comparison, here is how to do it with an iterator comprehension:

```
rows = (
(a = x.a, c = b.c, d...)
for x in more_nested_objects
for b in x.b
for d in b.d
)
@assert Table(collect(rows)) == table3
```

For simple flattening and mapping, an iterator comprehension like the one above is perhaps the simplest solution.

Note that Transducers.jl works well with iterator comprehensions. Transducers.jl-specific entry points like `foldxl` convert iterator comprehensions to transducers internally. `eduction` can be used to do this conversion explicitly:

`@assert astable(eduction(rows)) == table3`
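The same behavior can be sketched on a small generator that does not depend on the table above. The name `squares` is made up for this illustration, and the results assume the usual `foldxl` and `eduction` semantics described in this section.

```julia
using Transducers

squares = (x^2 for x in 1:5 if isodd(x))  # a plain iterator comprehension

# `foldxl` accepts the generator directly.
@assert foldxl(+, squares) == 1 + 9 + 25

# `eduction` makes the conversion explicit; the result can be collected
# or piped into further transducers.
@assert collect(eduction(squares)) == [1, 9, 25]
@assert (eduction(squares) |> Map(x -> x + 1) |> collect) == [2, 10, 26]
```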

### Complex `MapCat` example

For more complex processing that requires intermediate variables, the iterator comprehension does not work well. Fortunately, it is easy to use intermediate variables with transducers:

```
more_nested_objects |>
MapCat() do x
a2 = x.a * 2
x.b |> MapCat() do b
a2_plus_c = a2 + b.c
b.d |> Map() do d
c_plus_e = b.c + d.e
c_plus_f = b.c + d.f
(a2_plus_c = a2_plus_c, c_plus_e = c_plus_e, c_plus_f = c_plus_f)
end
end
end |>
astable
```

    Table with 3 columns and 4 rows:
         a2_plus_c  c_plus_e  c_plus_f
       ┌──────────────────────────────
     1 │ 4          5         6
     2 │ 4          6         7
     3 │ 40         50        60
     4 │ 40         60        70

### `MapCat` with `zip`

Note also that `MapCat` can be combined with arbitrary iterator combinators such as `zip`:

```
[(a = 1:3, b = 'x':'z'), (a = 1:4, b = 'i':'l')] |>
MapCat() do x
zip(x.a, x.b)
end |>
MapSplat((a, b) -> (a = a, b = b)) |>
astable
```

    Table with 2 columns and 7 rows:
         a  b
       ┌─────
     1 │ 1  x
     2 │ 2  y
     3 │ 3  z
     4 │ 1  i
     5 │ 2  j
     6 │ 3  k
     7 │ 4  l

### `MapCat` with `Iterators.product`

... and `product`:

```
[(a = 1:3, b = 'x':'z'), (a = 1:4, b = 'i':'l')] |>
MapCat() do x
Iterators.product(x.a, x.b)
end |>
Enumerate() |>
Filter(x -> x[1] % 5 == 0) |> # keep only every fifth item
MapSplat((n, (a, b)) -> (n = n, a = a, b = b)) |>
astable
```

    Table with 3 columns and 5 rows:
         n   a  b
       ┌─────────
     1 │ 5   2  y
     2 │ 10  1  i
     3 │ 15  2  j
     4 │ 20  3  k
     5 │ 25  4  l

## "Missing value" handling with `KeepSomething`

Transducers.jl has generic filtering transducers such as `Filter` as well as type-based filtering transducers such as `NotA` and `OfType`. These transducers can be used to filter out "missing values" represented as `missing` or `nothing`.

`KeepSomething` is a transducer that is useful for working on `Union{Nothing,Some{T}}`. It filters out `nothing` and yields the remaining items after applying `something`.

`[nothing, 1, Some(nothing), 2, 3] |> KeepSomething(identity) |> collect`

    4-element Vector{Union{Nothing, Int64}}:
     1
      nothing
     2
     3
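The output above follows from how `something` treats its argument. Here is a minimal sketch of that behavior, plus a hypothetical filter-and-map step (the name `positives_times_ten` is invented for this example) in which the function passed to `KeepSomething` returns `nothing` for items to drop.

```julia
using Transducers

# `something` unwraps `Some` and passes other non-`nothing` values through.
@assert something(Some(1)) == 1
@assert something(Some(nothing)) === nothing  # how a wrapped `nothing` survives

# Return `nothing` to drop an item; any other value is kept
# (after unwrapping `Some`, if present).
positives_times_ten =
    [1, -2, 3] |> KeepSomething(x -> x > 0 ? 10x : nothing) |> collect
@assert positives_times_ten == [10, 30]
```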

Thus, `KeepSomething` works well with any tools that operate on `Union{Nothing,Some{T}}`. Here is an example of using it with Maybe.jl. Consider a vector of heterogeneous dictionaries with varying sets of keys:

```
heterogeneous_objects = [
Dict(:a => 1, :b => Dict(:c => 2)),
Dict(:a => 1), # missing key
Dict(:a => 1, :b => Dict()), # missing key
Dict(:b => Dict(:c => 2)), # missing key
Dict(:a => 10, :b => Dict(:ccc => 20)), # alternative key name
];
```

Using the `@something` and `@?` macros from Maybe.jl, we can convert this to a regular table quite easily:

```
using Maybe
using Maybe: @something
heterogeneous_objects |>
KeepSomething() do x
c = @something { # (1)
@? x[:b][:c]; # (2)
@? x[:b][:ccc]; # (3)
return; # (4)
}
@? (a = x[:a], c = c) # (5)
end |>
astable
```

    Table with 2 columns and 2 rows:
         a   c
       ┌───────
     1 │ 1   2
     2 │ 10  20

In this example, for each dictionary `x`, the body of the `do` block works as follows:

- (1) Try to extract the item `c`.
  - (2) First, try to get it from `x[:b][:c]`.
  - (3) If `x[:b][:c]` doesn't exist, try `x[:b][:ccc]` next.
  - (4) If both `x[:b][:c]` and `x[:b][:ccc]` do not exist, return `nothing`. `KeepSomething` will filter out this entry.
- (5) Try to extract the item `a` from `x[:a]`.
  - If this does not exist, the whole expression wrapped by `@?` evaluates to `nothing`. This, in turn, will be filtered out by `KeepSomething`.
  - If `x[:a]` exists, `@? (a = x[:a], c = c)` evaluates to `Some((a = value_of_a, c = value_of_c))`. The `Some` wrapper is unwrapped by `something` called by `KeepSomething`.
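For comparison, the steps above can be written out by hand without Maybe.jl. This is only a sketch: `extract_row` and `objects` are hypothetical names introduced here, and unlike the macro version it treats a stored `nothing` value the same as a missing key.

```julia
using Transducers

# Hypothetical helper (not part of Maybe.jl or Transducers.jl): returns
# `Some(row)` on success and `nothing` when a required key is missing.
function extract_row(x)
    haskey(x, :a) || return nothing               # step (5): `a` is required
    b = get(x, :b, nothing)
    b === nothing && return nothing
    c = get(b, :c, nothing)                       # step (2)
    c === nothing && (c = get(b, :ccc, nothing))  # step (3)
    c === nothing && return nothing               # step (4)
    return Some((a = x[:a], c = c))
end

objects = [
    Dict(:a => 1, :b => Dict(:c => 2)),
    Dict(:a => 1),
    Dict(:a => 1, :b => Dict()),
    Dict(:b => Dict(:c => 2)),
    Dict(:a => 10, :b => Dict(:ccc => 20)),
]

# `KeepSomething` drops the `nothing` results and unwraps the `Some` ones.
rows = objects |> KeepSomething(extract_row) |> collect
@assert rows == [(a = 1, c = 2), (a = 10, c = 20)]
```

The macro-based version is shorter and keeps each fallback on its own line, which is the main reason to prefer it once the number of alternative keys grows.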

For more information, see the tutorial in Maybe.jl documentation.

## Multiple outputs

Usually, reducers like `sum` and `collect` have one output. However, we can use `TeeRF` etc. to "fan out" input items to multiple outputs.

### Multiple output vectors

Here is an example of creating two output vectors of integers and symbols in one go:

```
using BangBang # provides `push!!`
ints, symbols =
[1, :two, missing, 3, 4, :five, 6] |>
Filter(!isequal(6)) |>
foldxl(TeeRF(
OfType(Int)'(push!!), # push integers to a vector
OfType(Symbol)'(push!!), # push symbols to a vector
))
```

([1, 3, 4], [:two, :five])

Here, we use `TeeRF(rf₁, rf₂, ..., rfₙ)` to fan out input items to multiple reducing functions. To compose each reducing function, we use the `OfType` transducer as a reducing function transformation `xf'(rf)`.
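The same combination can be sketched with numeric reducing functions. The names `odd_sum` and `even_sum` are made up for this illustration, and the per-function tuple passed as `init` follows the form used elsewhere on this page.

```julia
using Transducers

# Every input is offered to both reducing functions; the transducer in
# `xf'(rf)` decides which inputs actually reach `rf`.
odd_sum, even_sum = foldxl(
    TeeRF(Filter(isodd)'(+), Filter(iseven)'(+)),
    1:6;
    init = (0, 0),  # one initial value per reducing function
)
@assert (odd_sum, even_sum) == (9, 12)
```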

### Handling empty results

Note that a fold with `push!!` throws when the input is empty. To obtain an empty vector when the input is empty or entirely filtered out, we need to specify `init`. MicroCollections.jl includes a library of collections useful as `init`. Here, we can use `EmptyVector`:

```
using MicroCollections
ints, strings =
[1, :two, missing, 3, 4, :five, 6] |>
Filter(!isequal(6)) |>
foldxl(TeeRF(
OfType(Int)'(push!!), # push integers to a vector
OfType(String)'(push!!), # push strings to a vector (but there is no string)
); init = EmptyVector())
```

([1, 3, 4], Union{}[])
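The effect of `init` can also be seen on a single reducing function. This is a minimal sketch with made-up names (`evens`, `odds`), assuming `push!!` comes from BangBang.jl.

```julia
using Transducers
using BangBang          # provides `push!!`
using MicroCollections  # provides `EmptyVector`

# Everything is filtered out, so the result is just the (empty) initial value.
evens = foldxl(push!!, Filter(iseven), [1, 3, 5]; init = EmptyVector())
@assert isempty(evens)

# With matching items, the same call accumulates them as usual.
odds = foldxl(push!!, Filter(isodd), [1, 3, 5]; init = EmptyVector())
@assert odds == [1, 3, 5]
```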

### Composed transducers with `TeeRF`

Each reducing function passed to `TeeRF` can use arbitrarily complex transducers. Here is an example of filtering in symbols and then mapping them to strings:

```
ints, strings =
[1, :two, missing, 3, 4, :five, 6] |>
Filter(!isequal(6)) |>
foldxl(
TeeRF(
OfType(Int)'(push!!),
opcompose(OfType(Symbol), Map(String))'(push!!), # filter _then_ map
);
init = EmptyVector(),
)
```

([1, 3, 4], ["two", "five"])

### Nested `TeeRF`

Each reducing function passed to `TeeRF` can itself be composed using `TeeRF` (or other reducing function combinators, e.g., `ProductRF`). Here is an example of computing extrema on integers:

```
(imax, imin), strings =
[1, :two, missing, 3, 4, :five, 6] |>
Filter(!isequal(6)) |>
foldxl(
TeeRF(
OfType(Int)'(TeeRF(max, min)), # extrema on integers
opcompose(OfType(Symbol), Map(String))'(push!!), # filter _then_ map
);
init = ((typemin(Int), typemax(Int)), EmptyVector()),
)
```

((4, 1), ["two", "five"])

### When input is a tuple: `ProductRF`

`ProductRF` is like `TeeRF` but it expects that the input is already a tuple:

```
ints, io =
[(1:3, 'x':'z'), nothing, (1:4, 'i':'l')] |>
NotA(Nothing) |>
foldxl(
ProductRF(
opcompose(Cat(), Filter(isodd))'(push!!), # process 1:3 etc.
Cat()'((io, char) -> (write(io, char); io)), # process 'x':'z' etc.
);
init = (EmptyVector(), IOBuffer()),
);
```

`String(take!(io))`

"xyzijkl"

`ints`

    4-element Vector{Int64}:
     1
     3
     1
     3
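A smaller sketch of the same routing, using plain numeric reducing functions: each component of the input tuple goes to the reducing function in the same position. The names `sum_first` and `prod_second` are invented for this example.

```julia
using Transducers

# The first reducing function folds the first components (1 and 3);
# the second folds the second components (2 and 4).
sum_first, prod_second =
    foldxl(ProductRF(+, *), [(1, 2), (3, 4)]; init = (0, 1))
@assert (sum_first, prod_second) == (4, 8)
```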

### When input is a row: `DataTools.oncol`

`oncol` from DataTools.jl is like `ProductRF` but acts on `NamedTuple`s (as well as any Setfield.jl-compatible, possibly nested objects).

```
using DataTools
foldxl(oncol(a = +, b = *), [(a = 1, b = 2), (a = 3, b = 4)])
```

(a = 4, b = 8)

*This page was generated using Literate.jl.*