# Elixir AI
```elixir
Mix.install([
{:kino, "~> 0.9.0"},
{:benchee, "~> 1.1"},
{:nx, "~> 0.5.2"},
{:axon, "~> 0.5.1"},
{:exla, "~> 0.5.2"}
])
```
## Artificial Intelligence With Elixir
Elixir supports the numerical computing needed to build *artificial intelligence* applications. This is accomplished with the following libraries:
* `Nx` - multi-dimensional tensor library with multi-staged compilation to the CPU/GPU
* `Axon` - high-level interface for creating neural network models
## Nx: Multi-dimensional Tensors and Numerical Expressions
[Nx Hexdocs](https://hexdocs.pm/nx/Nx.html)
The following code cells in this section are from, or derivatives of, the [Intro to Nx](https://hexdocs.pm/nx/intro-to-nx.html) guide on HexDocs.
<!-- livebook:{"break_markdown":true} -->
A simple tensor
```elixir
t = Nx.tensor([[1, 0, 0], [0, 1, 0], [0, 0, 1]])
```
Since tensors are used to perform multi-dimensional mathematics, they have a shape associated with them
```elixir
t |> Nx.shape()
```
An `Nx.Tensor` can be created from a `List`
```elixir
t = 1..4 |> Enum.chunk_every(2) |> Nx.tensor(names: [:y, :x])
```
The above `Nx.Tensor` has named dimensions so they can be accessed accordingly
```elixir
%{"first column" => t[x: 0], "first row" => t[y: 0]}
```
Exercise:
* Create a `{3,3}` tensor with named dimensions
* Return a `{2,2}` tensor containing the first two columns of the first two rows
```elixir
t =
1..9
|> Enum.chunk_every(3)
|> Nx.tensor(names: [:j, :i])
t = t[j: 0..1][i: 0..1]
```
### Tensor Aware Functions
```elixir
t
```
The `Nx` module has many functions that apply scalar operations element-wise to an `Nx.Tensor`
```elixir
t |> Nx.cos()
```
You can also call functions that aggregate the contents of a tensor, for example to get the sum of the numbers in an `Nx.Tensor`
```elixir
t |> Nx.sum()
```
```elixir
# Sum along the :i (column) axis, i.e. the sum of each row
t |> Nx.sum(axes: [:i])
```
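For contrast, summing along the `:j` axis collapses the rows instead, giving the sum of each column (a small sketch reusing the `t` from the exercise above):
```elixir
# Sum along the :j (row) axis, i.e. the sum of each column
t |> Nx.sum(axes: [:j])
```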
#### Exercise
* Create a `{2, 2, 2}` tensor
* With values `1..8`
* With dimension names `[:z, :y, :x]`
* Calculate the sums along the `:y` axis
```elixir
1..8
|> Enum.chunk_every(4)
|> Enum.map(&Enum.chunk_every(&1, 2))
|> Nx.tensor(names: [:z, :y, :x])
|> Nx.sum(axes: [:y])
```
Other matrix operations such as subtraction are available via the `Nx` module
```elixir
a = Nx.tensor([[5, 6], [7, 8]])
b = Nx.tensor([[1, 2], [3, 4]])
a |> Nx.subtract(b)
```
### Broadcasting
`Nx.broadcast/2` takes a tensor or a scalar and a target shape, expanding the value to that shape by copying it
```elixir
Nx.broadcast(1, {2, 2})
```
```elixir
a = Nx.tensor([[1, 2], [3, 4]])
# Want to do [[1, 2], [3, 4]] - 1 (subtract 1 from every element in the LHS)
b = Nx.subtract(a, Nx.broadcast(1, {2, 2}))
b == Nx.subtract(a, 1)
# Here we pass a tensor to Nx.broadcast/2 and it extracts its shape to make a compatible operation
b == Nx.subtract(a, Nx.broadcast(1, a))
```
```elixir
# Subtract row (or column) wise
# Want to do [[1, 2], [3, 4]] - [[1, 2]] === [[1, 2], [3, 4]] - [[1, 2], [1, 2]] === [[0, 0], [2, 2]]
a = Nx.tensor([[1, 2], [3, 4]])
b = Nx.tensor([[1, 2]])
c = a |> Nx.subtract(Nx.broadcast(b, {2, 2}))
# The subtraction function will take care of the broadcast implicitly
c2 = a |> Nx.subtract(b)
c == c2
```
### Automatic Differentiation (autograd)
Gradients are critical for solving systems of equations and building probabilistic models. In advanced math, derivatives, or differential equations, are used to take gradients. Nx can compute these derivatives automatically through a feature called automatic differentiation, or autograd
```elixir
defmodule Funs do
  import Nx.Defn

  # f(x) = 3x^2 + 2x + 1
  defn poly(x) do
    3 * Nx.pow(x, 2) + 2 * x + 1
  end

  # f'(x) = 6x + 2, computed via autograd
  defn poly_slope_at(x) do
    grad(&poly/1).(x)
  end

  defn sinwave(x) do
    Nx.sin(x)
  end

  # The derivative of sin(x) is cos(x), computed via autograd
  defn sinwave_slope_at(x) do
    grad(&sinwave/1).(x)
  end
end
```
The function `grad/1` takes a function and returns a function returning the gradient
You can check that the value below is correct by evaluating the derivative, `6x + 2`, at `x = 2`
```elixir
Funs.poly_slope_at(2)
```
You can check that this value is correct by evaluating `cos(x)` at `x = 1`, since the derivative of `sin(x)` is `cos(x)`
```elixir
Funs.sinwave_slope_at(1)
```
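As a rough sanity check, you can also compare the autograd result against a numerical central-difference approximation. A minimal sketch; `h` is an arbitrary small step size:
```elixir
h = 1.0e-4
x = 2.0

# (f(x + h) - f(x - h)) / 2h approximates f'(x)
numerical =
  Funs.poly(x + h)
  |> Nx.subtract(Funs.poly(x - h))
  |> Nx.divide(2 * h)

%{autograd: Funs.poly_slope_at(x), numerical: numerical}
```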
## Axon: High-Level Interface For Neural Network Models
[Hexdocs](https://hexdocs.pm/axon/Axon.html)
Axon is a high-level interface for creating neural network models
Axon is built entirely on top of Nx numerical definitions, so every neural network can be JIT or AOT compiled using any Nx compiler, or even transformed into high-level neural network formats like TensorFlow Lite and ONNX.
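For example, since this notebook installs `EXLA`, you can pass a `:compiler` option to `Axon.build/2` (introduced below) to JIT compile a model's functions. A minimal sketch; the actual speedup depends on the model and hardware:
```elixir
# Build a tiny model's init/predict functions with the EXLA JIT compiler
jit_model = Axon.input("data") |> Axon.dense(8)
{jit_init_fn, jit_predict_fn} = Axon.build(jit_model, compiler: EXLA)
```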
## First Model (Identity Model)
Everything in `Axon` centers around the `%Axon{}` struct which represents an instance of an Axon model
Models are graphs which represent the transformation and flow of input data to a desired output. You can think of a model as representing a single computation or function
All Axon models start with a declaration of input nodes. These are the root nodes of your computation graph, and correspond to the actual input data you want to send to Axon:
```elixir
model = Axon.input("data")
```
This single input node is, technically speaking, already a valid `Axon` model which you can inspect, execute, and initialize
```elixir
template = Nx.template({2, 8}, :f32)
```
```elixir
model |> Axon.Display.as_graph(template)
```
The execution flow is just a single node, because the graph consists of only an input node. You pass data in and the model returns the same data, without any intermediate transformations
You can build the `%Axon{}` struct into its `initialization` and `forward` functions by calling `Axon.build/2`. This pattern of "lowering" or transforming the `%Axon{}` struct into other functions or representations is very common in `Axon`. By traversing the data structure you can create useful functions, execution visualizations, and more
```elixir
{init_fn, predict_fn} = Axon.build(model)
```
`init_fn` returns all of your model's trainable parameters and state. You need to pass a template and any initial parameters you want your model to start with (this is useful for things like transfer learning)
`predict_fn` performs the model's forward pass: given the trainable parameters and some input, it returns the transformed output
```elixir
training_params = init_fn.(template, %{})
```
The `init_fn/2` returned `%{}` because `model` does not have any trainable parameters. This should make sense because it's just an input layer
```elixir
input = 1..8 |> Enum.chunk_every(4) |> Nx.tensor(type: :f32)
predict_fn.(training_params, input)
```
Passing the `training_params` and some `input` to the `predict_fn`, the model can actually be executed, returning the given input as expected
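Since this is an identity model, the output should equal the input exactly. A quick sketch of that check:
```elixir
# Every element of the output should equal the corresponding input element
predict_fn.(training_params, input)
|> Nx.equal(input)
|> Nx.all()
```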
## Sequential Models
Sequential models are named after the sequential nature in which data flows through them. Sequential models transform the input with sequential, successive transformations
Creating a sequential model in `Axon` is the same as writing sequential transformations in "regular" Elixir
Links:
* [ReLU Activation](https://machinelearningmastery.com/rectified-linear-activation-function-for-deep-learning-neural-networks/)
* [Dropout for Regularizing Deep Neural Networks](https://machinelearningmastery.com/dropout-for-regularizing-deep-neural-networks/)
* [Softmax Function](https://en.wikipedia.org/wiki/Softmax_function)
```elixir
model =
Axon.input("data")
# layer with 32 outputs
|> Axon.dense(32)
# layer with an element wise operation of :relu
|> Axon.activation(:relu)
# layer to reduce overfitting, effective regularization method
|> Axon.dropout(rate: 0.5)
# layer with 1 output
|> Axon.dense(1)
# layer with an element wise operation of :softmax
|> Axon.activation(:softmax)
```
Visualizing the model we can see how the data will flow
```elixir
template = Nx.template({4, 8}, :f32)
```
```elixir
Axon.Display.as_graph(model, template)
```
```elixir
{init_fn, predict_fn} = Axon.build(model)
```
```elixir
training_params = init_fn.(template, %{})
```
This model actually has trainable parameters. The parameter map is just a regular Elixir map. Each top-level entry maps to a layer with a key corresponding to that layer's name and a value corresponding to that layer's trainable parameters. Each layer's individual trainable parameters are given layer-specific names and map directly to `Nx` tensors
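For example, you can look up an individual layer's parameters by name. A sketch; the `"dense_0"` key is an assumption based on Axon's default auto-generated layer names, so check `Map.keys(training_params)` first:
```elixir
# List the auto-generated layer names, then inspect one layer's kernel shape
# ("dense_0" assumes the default naming scheme)
IO.inspect(Map.keys(training_params), label: :layer_names)
training_params["dense_0"]["kernel"] |> Nx.shape()
```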
```elixir
input = Nx.iota({4, 8}, type: :f32)
```
```elixir
predict_fn.(training_params, input)
```
## Complex Models
Some models require a more flexible API. Since `Axon` models are just Elixir data structures, you can manipulate them and decompose architectures as you would any other Elixir program
```elixir
input = Axon.input("data")
```
```elixir
x1 = input |> Axon.dense(32)
```
```elixir
x2 = input |> Axon.dense(64) |> Axon.relu() |> Axon.dense(32)
```
```elixir
model = Axon.add(x1, x2)
```
Your model branches `input` into `x1` and `x2`. Each branch performs a different set of transformations. At the end, the branches are merged with `Axon.add/3`. The layer created with `Axon.add/3` is sometimes called a *combinator*: a layer that operates on multiple `Axon` models at once, typically to merge branches together
`model` represents the final `Axon` model
Visualizing this model, you can see the branching structure that was built
```elixir
template = Nx.template({2, 16}, :f32)
```
```elixir
Axon.Display.as_graph(model, template)
```
```elixir
{init_fn, predict_fn} = Axon.build(model)
```
```elixir
training_params = init_fn.(template, %{})
```
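As with the previous models, the built model can be executed by passing the initialized parameters together with an input matching the template shape (a small sketch):
```elixir
input = Nx.iota({2, 16}, type: :f32)
predict_fn.(training_params, input)
```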
As your model's architecture grows in complexity, you might find yourself reaching for better abstractions to organize your model creation code. PyTorch models are often organized into `nn.Module`. If you're translating models from PyTorch to Axon, it's natural to create one Elixir function per `nn.Module`.
You should write your models as you would any other Elixir code
```elixir
defmodule ComplexModel do
  def create() do
    Axon.input("data")
    |> conv_block()
    |> Axon.flatten()
    |> dense_block()
    |> dense_block()
    |> Axon.dense(1)
  end

  defp conv_block(input) do
    input
    |> Axon.conv(3, padding: :same)
    |> Axon.mish()
    |> Axon.add(input)
    |> Axon.max_pool(kernel_size: {2, 3})
  end

  defp dense_block(input) do
    input
    |> Axon.dense(32)
    |> Axon.relu()
  end
end
```
```elixir
model = ComplexModel.create()
```
```elixir
template = Nx.template({1, 28, 28, 3}, :f32)
```
```elixir
Axon.Display.as_graph(model, template)
```
## Multi-Input & Multi-Output Models
##### Multi-Input Model
Sometimes your model needs multiple inputs
```elixir
input1 = Axon.input("input1")
input2 = Axon.input("input2")
model = Axon.add(input1, input2)
```
You can inspect the inputs of your model in two ways: `Axon.Display.as_graph/2` and `Axon.get_inputs/1`
```elixir
Axon.get_inputs(model)
```
```elixir
inputs = %{"input1" => Nx.template({2, 8}, :f32), "input2" => Nx.template({2, 8}, :f32)}
```
```elixir
Axon.Display.as_graph(model, inputs)
```
```elixir
{init_fn, predict_fn} = Axon.build(model)
```
```elixir
training_params = init_fn.(inputs, %{})
```
```elixir
input1 = Nx.iota({2, 8}, type: :f32)
```
```elixir
input2 = Nx.iota({2, 8}, type: :f32)
```
```elixir
inputs = %{"input1" => input1, "input2" => input2}
```
```elixir
predict_fn.(training_params, inputs)
```
##### Multi-Output Models
You also might want to have multiple outputs from your model. `Axon.container/2` can be used to wrap multiple nodes into any supported `Nx` container:
```elixir
data = Axon.input("data")
x1 = data |> Axon.dense(32) |> Axon.relu()
x2 = data |> Axon.dense(64) |> Axon.relu()
model = Axon.container({x1, x2})
```
```elixir
template = Nx.template({2, 8}, :f32)
Axon.Display.as_graph(model, template)
```
```elixir
{init_fn, predict_fn} = Axon.build(model)
params = init_fn.(template, %{})
data = Nx.iota({2, 8}, type: :f32)
predict_fn.(params, data)
```
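Because the model's output is wrapped in a tuple container, `predict_fn` returns a two-element tuple of tensors that you can pattern match on (a small sketch reusing `params` and `data` from above):
```elixir
# Destructure the container output into its two branches
{out1, out2} = predict_fn.(params, data)
{Nx.shape(out1), Nx.shape(out2)}
```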
## Creating Custom Layers
`Axon` has a number of built-in layers such as `Axon.relu/1` and `Axon.dense/2`. As you develop more sophisticated models you will likely need to develop custom layers
A layer in `Axon` is just a `defn` implementation with special `Axon` inputs. Every layer in `Axon` is implemented with the `Axon.layer/3` function. The API of `Axon.layer/3` intentionally mirrors `Kernel.apply/2` to make developing custom layers as close to writing "normal" Elixir code as possible.
A layer implementation looks like any other `defn`, except it must always accept an `opts` keyword list as its final parameter, because every layer implementation receives a `:mode` option indicating whether the model is running in training or inference mode. This allows you to customize the behavior of the layer based on the execution mode.
If you plan on re-using custom layers in many locations, it's recommended that you wrap them in regular Elixir functions as an interface. If you were to use `defn` instead of `def` for the wrapper function, you would receive an error about a `LazyContainer` being used incorrectly
```elixir
defmodule CustomLayers do
  import Nx.Defn

  def my_layer(%Axon{} = input, opts \\ []) do
    opts = Keyword.validate!(opts, [:name])
    alpha = Axon.param("alpha", fn _ -> {} end)

    Axon.layer(&my_layer_impl/3, [input, alpha], name: opts[:name], op_name: :my_layer)
  end

  defnp my_layer_impl(input, alpha, _opts \\ []) do
    input
    |> Nx.sin()
    |> Nx.multiply(alpha)
  end
end
```
With the layer implementation defined, the custom layer can be piped into a model just like any built-in `Axon` layer
```elixir
model =
Axon.input("data")
|> CustomLayers.my_layer()
|> CustomLayers.my_layer()
|> Axon.dense(1)
```
```elixir
Axon.Display.as_graph(model, Nx.template({2, 8}, :f32))
```
```elixir
{init_fn, predict_fn} = Axon.build(model)
```
```elixir
params = init_fn.(Nx.template({2, 8}, :f32), %{})
```
```elixir
predict_fn.(params, Nx.iota({2, 8}, type: :f32))
```
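As a side note, for transformations that don't need trainable parameters you can skip `Axon.layer/3` entirely and wrap a plain function as a layer with `Axon.nx/2`. A minimal sketch:
```elixir
# Wrap a parameter-free transformation as a layer with Axon.nx/2
nx_model =
  Axon.input("data")
  |> Axon.nx(&Nx.sin/1)
  |> Axon.dense(1)

Axon.Display.as_graph(nx_model, Nx.template({2, 8}, :f32))
```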
## Model Hooks
Inspecting or visualizing the values of intermediate layers in your model during the forward or backward pass can be key to understanding its behavior (e.g., visualizing the gradients of activation functions to ensure your model is learning in a stable manner). Axon supports this via *model hooks*.
Model hooks are a mechanism for unidirectional communication with an executing model: you can only receive information from your model, not send information to it.
Hooks attach per layer and can execute at 4 different points in model execution: `initialization`, `pre-forward`, `forward pass`, or `backward pass`
```elixir
input = Nx.iota({2, 4}, type: :f32)
```
```elixir
model =
Axon.input("data")
|> Axon.dense(8)
|> Axon.attach_hook(fn val -> IO.inspect(val, label: :dense_init) end, on: :initialize)
|> Axon.attach_hook(fn val -> IO.inspect(val, label: :dense_forward) end, on: :forward)
|> Axon.relu()
|> Axon.attach_hook(fn val -> IO.inspect(val, label: :relu) end, on: :forward)
{init_fn, predict_fn} = Axon.build(model)
```
```elixir
params = init_fn.(input, %{})
```
```elixir
predict_fn.(params, input)
```
Hooks execute in the order they were attached to a layer. If you attach 2 hooks to the same layer which execute different functions on the same event, they will run in order
```elixir
model =
Axon.input("data")
|> Axon.dense(8)
|> Axon.attach_hook(fn val -> IO.inspect(val, label: :hook1) end, on: :forward)
|> Axon.attach_hook(fn val -> IO.inspect(val, label: :hook2) end, on: :forward)
|> Axon.relu()
{init_fn, predict_fn} = Axon.build(model)
```
```elixir
params = init_fn.(input, %{})
```
```elixir
predict_fn.(params, input)
```
Hooks can also be configured to run on all events
```elixir
model =
Axon.input("data")
|> Axon.dense(8)
|> Axon.attach_hook(&IO.inspect/1, on: :all)
|> Axon.relu()
|> Axon.dense(1)
{init_fn, predict_fn} = Axon.build(model)
```
`initialization` hook
```elixir
params = init_fn.(input, %{})
```
`pre-forward` and `forward` hooks
```elixir
predict_fn.(params, input)
```
`backward` hook
```elixir
Nx.Defn.grad(fn params -> predict_fn.(params, input) end).(params)
```
Hooks can also be configured to only run when the model is built in a certain mode, such as `:train` or `:inference`
```elixir
model =
Axon.input("data")
|> Axon.dense(8)
|> Axon.attach_hook(&IO.inspect/1, on: :forward, mode: :train)
|> Axon.relu()
{init_fn, predict_fn} = Axon.build(model, mode: :train)
```
```elixir
params = init_fn.(input, %{})
```
```elixir
predict_fn.(params, input)
```
Now building with `:inference` mode, the hook will not run
```elixir
{init_fn, predict_fn} = Axon.build(model, mode: :inference)
params = init_fn.(input, %{})
```
```elixir
predict_fn.(params, input)
```