# Mbox Parsing

```elixir
Mix.install([
  {:explorer, "~> 0.10.1"},
  {:kino, "~> 0.15.3"},
  {:kino_explorer, "~> 0.1.24"}
])
```

## Analyze your mailbox to find the junk!

[![Run in Livebook](https://livebook.dev/badge/v1/blue.svg)](https://livebook.dev/run?url=https%3A%2F%2Fgist.github.com%2Fpetermueller%2Fa664ef33f38cb2726bf3e0239798beb7)

Go to Google Takeout and initiate an export. It'll take a little bit. Once the archive is unzipped/untarred, update the `path` variable below.

Experiment with the size passed to `Enum.take` below: this Livebook is not particularly efficient, and a large `.mbox` file can cause timeouts.

```elixir
import Kino.Shorts
alias Explorer.DataFrame, as: DF
:ok
```

```elixir
path =
  "~/Documents/Takeout/Mail/All mail Including Spam and Trash.mbox"
  |> Path.expand()

# Just to confirm it's working :)
first_few =
  path
  |> File.stream!()
  |> Stream.map(&String.trim/1)
  |> Enum.take(10)

tree(first_few)
```

```elixir
# A line starting with "From " (note the space) marks the start of a new message.
chunk_fun = fn
  # First separator in the file: start accumulating.
  <<"From ", _rest::binary>> = line, [] -> {:cont, [line]}
  # Later separators: emit the finished message, start a new one.
  <<"From ", _rest::binary>> = line, acc -> {:cont, Enum.reverse(acc), [line]}
  # Any other line belongs to the current message.
  line, acc -> {:cont, [line | acc]}
end

# Runs once at end of input to flush the last message.
after_fun = fn
  [] -> raise "Won't happen, but let's not hang if we mess up"
  # A trailing separator with no content after it is not emitted.
  [<<"From ", _rest::binary>> = line] -> {:cont, [line]}
  acc -> {:cont, Enum.reverse(acc), []}
end

stream =
  File.stream!(path)
  |> Stream.map(&String.trim_trailing(&1, "\n"))
  |> Stream.chunk_while([], chunk_fun, after_fun)
```

```elixir
# Every message gets all four keys, defaulting to nil when a header is missing.
empty_msg_map = Map.from_keys([:delivered_to, :from, :to, :subject], nil)

# Keep only the headers of interest; every other line contributes nothing.
lines_to_keep = fn
  <<"From ", _rest::binary>> -> []
  <<"Delivered-To: ", rest::binary>> -> [delivered_to: rest]
  <<"From: ", rest::binary>> -> [from: rest]
  <<"To: ", rest::binary>> -> [to: rest]
  <<"Subject: ", rest::binary>> -> [subject: rest]
  _ -> []
end

formatted_stream =
  stream
  |> Stream.flat_map(fn lines -> [Enum.flat_map(lines, lines_to_keep)] end)
  |> Stream.map(&Enum.into(&1, empty_msg_map))

# Only the first 4000 messages are loaded; adjust to taste.
df =
  formatted_stream
  |> Enum.take(4000)
  |> DF.new()
```

<!-- livebook:{"attrs":"eyJhc3NpZ25fdG8iOm51bGwsImNvbGxlY3QiOmZhbHNlLCJkYXRhX2ZyYW1lIjoiZGYiLCJkYXRhX2ZyYW1lX2FsaWFzIjoiRWxpeGlyLkRGIiwiaXNfZGF0YV9mcmFtZSI6dHJ1ZSwibWlzc2luZ19yZXF1aXJlIjoiRWxpeGlyLkV4cGxvcmVyLkRhdGFGcmFtZSIsIm9wZXJhdGlvbnMiOlt7ImFjdGl2ZSI6dHJ1ZSwiY29sdW1ucyI6WyJmcm9tIl0sImRhdGFfb3B0aW9ucyI6eyJkZWxpdmVyZWRfdG8iOiJzdHJpbmciLCJmcm9tIjoic3RyaW5nIiwic3ViamVjdCI6InN0cmluZyIsInRvIjoic3RyaW5nIn0sIm9wZXJhdGlvbl90eXBlIjoiZ3JvdXBfYnkifSx7ImFjdGl2ZSI6dHJ1ZSwiY29sdW1ucyI6WyJmcm9tIl0sImRhdGFfb3B0aW9ucyI6eyJkZWxpdmVyZWRfdG8iOiJzdHJpbmciLCJmcm9tIjoic3RyaW5nIiwic3ViamVjdCI6InN0cmluZyIsInRvIjoic3RyaW5nIn0sIm9wZXJhdGlvbl90eXBlIjoic3VtbWFyaXNlIiwicXVlcnkiOiJjb3VudCJ9LHsiYWN0aXZlIjp0cnVlLCJkYXRhX29wdGlvbnMiOnsiZnJvbSI6InN0cmluZyIsImZyb21fY291bnQiOiJpbnRlZ2VyIn0sImRpcmVjdGlvbiI6ImRlc2MiLCJvcGVyYXRpb25fdHlwZSI6InNvcnRpbmciLCJzb3J0X2J5IjoiZnJvbV9jb3VudCJ9XX0","chunks":null,"kind":"Elixir.KinoExplorer.DataTransformCell","livebook_object":"smart_cell"} -->

```elixir
require Explorer.DataFrame

df
|> DF.lazy()
|> DF.group_by("from")
|> DF.summarise(from_count: count(from))
|> DF.sort_by(desc: from_count)
```
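
To see how the message-splitting step behaves without a real export, the chunking logic can be tried on a small in-memory list. This is a minimal sketch, not part of the original notebook: `sample_lines` and its addresses are made-up data, `Enum.chunk_while/4` stands in for the streamed version, and `after_fun` is simplified so it tolerates empty input instead of raising.

```elixir
# Hypothetical sample: two tiny messages in mbox layout (not from a real export).
sample_lines = [
  "From alice@example.com Mon Jan  1 00:00:00 2024",
  "From: Alice <alice@example.com>",
  "Subject: Hello",
  "",
  "Body of the first message",
  "From bob@example.com Tue Jan  2 00:00:00 2024",
  "From: Bob <bob@example.com>",
  "Subject: Sale!"
]

# Start a new chunk at every "From " separator line; otherwise keep accumulating.
# Note that the "From: " header does not match, because its fifth byte is ":".
chunk_fun = fn
  <<"From ", _rest::binary>> = line, [] -> {:cont, [line]}
  <<"From ", _rest::binary>> = line, acc -> {:cont, Enum.reverse(acc), [line]}
  line, acc -> {:cont, [line | acc]}
end

# Simplified relative to the notebook: emit whatever is left when input ends,
# and emit nothing for empty input.
after_fun = fn
  [] -> {:cont, []}
  acc -> {:cont, Enum.reverse(acc), []}
end

messages = Enum.chunk_while(sample_lines, [], chunk_fun, after_fun)

# Pull the Subject header out of each message, or nil if it has none.
subjects =
  Enum.map(messages, fn lines ->
    Enum.find_value(lines, fn
      <<"Subject: ", rest::binary>> -> rest
      _ -> nil
    end)
  end)

IO.inspect(length(messages), label: "messages")
IO.inspect(subjects, label: "subjects")
```

With this sample, the list splits into two messages, one per `From ` separator line. The accumulator is built in reverse with `[line | acc]` and flipped once per message, which avoids repeatedly appending to the end of a list.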