Test Your Seeds
17 May 2024
When building products with Phoenix and Ecto, a common practice is to include development seeds in a file named priv/repo/seeds.exs. By default, it's expected that this file will include a list of actions to run procedurally upon loading the file. With a few changes, we can instead make a testable module, so that we will know when our refactors will break the seeds for ourselves and for others.
Let me provide a simple example. Let's assume that we have an application with profiles and with orgs in its database. This application might have a Core.People context for creating, getting, and updating profiles and orgs. The create functions might look a little like this:
defmodule Core.People do
  def create_org(attrs \\ []),
    do: attrs |> new_org() |> Core.Repo.insert()

  def create_profile(attrs \\ []),
    do: attrs |> new_profile() |> Core.Repo.insert()

  # ... etc
end
For development, it's helpful to provide ourselves with seeded data, so that when we open up the application locally on our workstations we are able to do things such as logging in and testing new features without having to go through the registration flow each time we reset our local database.
These development seeds might go into priv/repo/seeds.exs as follows.
_alice = Core.People.create_profile(name: "Alice", email: "alice@example.com")
_billy = Core.People.create_profile(name: "Billy", email: "billy@example.com")
_org = Core.People.create_org(name: "My Organization")
These seeds are inserted when running mix tasks in a terminal. These tasks are defined as aliases in mix.exs:
defmodule MyApp.MixProject do
  use Mix.Project

  # ... application, project, deps, etc.

  defp aliases,
    do: [
      # ...
      "ecto.reset": ["ecto.drop", "ecto.setup"],
      "ecto.setup": ["ecto.create", "ecto.migrate", "run priv/repo/seeds.exs"]
    ]
end
Other team members will presumably discover when the seeds have changed, and run mix ecto.reset to drop and recreate their development database.
Refactoring code can break seeds
Now let's presume that after several months of development, the team decides that when new orgs are created, the profile of the person doing the action should be recorded, either in logs or in the database.
defmodule Core.People do
  def create_org(%Schema.Profile{} = creator, attrs \\ []),
    do: creator |> new_org(attrs) |> Core.Repo.insert()

  def create_profile(attrs \\ []),
    do: attrs |> new_profile() |> Core.Repo.insert()

  # ... etc
end
Presumably there will be unit tests for the create_org function that will now fail. Tests for controllers and live views will also fail, highlighting any callers of create_org/1 that require updating to create_org/2. After some updates throughout the codebase, all tests and linters will pass, and one may feel free to ship the changes…
Days or weeks later, someone will try to reset their development database, and discover to their chagrin that the process crashes. Nobody remembered to update the seeds file when the function declarations changed! More importantly, no test failed to highlight the need for updating the file.
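One subtlety in the example above: because attrs has a default value, the refactored function head still defines a one-argument create_org, so the stale seeds call does not point at a missing function. It only fails at runtime, when the keyword list does not match the struct pattern. Here is a minimal sketch of that behaviour, using hypothetical Demo.Profile and Demo.People modules as stand-ins for the real schema and context (no database involved):

```elixir
# Hypothetical stand-ins for Schema.Profile and Core.People.
defmodule Demo.Profile do
  defstruct [:name]
end

defmodule Demo.People do
  # The default argument means this defines both create_org/1 and create_org/2.
  def create_org(%Demo.Profile{} = creator, attrs \\ []) do
    {:ok, %{name: attrs[:name], created_by: creator.name}}
  end
end

# The old seeds call still targets an existing arity, so it only fails once it runs.
old_call =
  try do
    Demo.People.create_org(name: "My Organization")
  rescue
    FunctionClauseError -> :function_clause_error
  end

# The updated call passes the creator first.
alice = %Demo.Profile{name: "Alice"}
{:ok, org} = Demo.People.create_org(alice, name: "My Organization")

IO.inspect(old_call, label: "old call")
IO.inspect(org, label: "new call")
```

Since the failure only shows up when the call is executed, the seeds have to be run by something, ideally a test, for the breakage to be noticed.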
Elixir scripts are not loaded by default
Why did nothing fail or crash when the context changed? Elixir scripts are not automatically loaded by the compiler or the VM. priv/repo/seeds.exs is an Elixir script file, and must be manually required or loaded in order for the Erlang VM to compile and run its contents. Code.require_file/2 can be used to do this… but with procedural code in a seeds file, the act of requiring the file will execute its contents.
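To make that concrete, here is a small sketch that writes a throwaway script to a temporary directory and requires it. The file name and the :seeds_demo application key are made up for illustration; the Application.put_env/3 call stands in for the database inserts a real seeds file would perform.

```elixir
# Hypothetical procedural script: top-level code with a visible side effect.
path =
  Path.join(
    System.tmp_dir!(),
    "procedural_seeds_#{System.unique_integer([:positive])}.exs"
  )

File.write!(path, """
Application.put_env(:seeds_demo, :rows_inserted, 3)
""")

# Writing the file does not load it, so nothing has happened yet.
before_require = Application.get_env(:seeds_demo, :rows_inserted)

# Requiring the file compiles it and immediately runs its top-level code.
Code.require_file(path)
after_require = Application.get_env(:seeds_demo, :rows_inserted)

IO.inspect({before_require, after_require})

File.rm!(path)
```

With a real seeds file, that side effect would be rows written to whichever database the current environment points at, which is why requiring a procedural script from a test is awkward.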
Seeds as a module
Let's start by rewriting the contents of the seeds file as a module:
defmodule Seeds do
  require Logger

  def run do
    Logger.configure(level: :info)
    run_seeds()
  end

  def run_seeds do
    _alice = Core.People.create_profile(name: "Alice", email: "alice@example.com")
    _billy = Core.People.create_profile(name: "Billy", email: "billy@example.com")
    _org = Core.People.create_org(name: "My Organization")
  end
end
Now nothing is executed by just requiring the file. The Seeds module can be utilized by ecto.setup with a minimal change:
defmodule MyApp.MixProject do
  use Mix.Project

  # ... application, project, deps, etc.

  defp aliases,
    do: [
      # ...
      "ecto.reset": ["ecto.drop", "ecto.setup"],
      "ecto.setup": ["ecto.create", "ecto.migrate", "run --require priv/repo/seeds.exs --eval 'Seeds.run()'"]
    ]
end
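One refinement worth considering for run_seeds: Repo.insert returns either {:ok, struct} or {:error, changeset}, so binding the result to an ignored variable lets a failed insert pass silently, both in the terminal and in a test. Matching on {:ok, _} makes the seeds crash as soon as something is rejected. This is a sketch of the difference, not the original codebase's approach; Demo.People is a hypothetical stand-in whose create_profile requires an email, and no database is involved.

```elixir
# Hypothetical stand-in for Core.People, mimicking the return shape of Repo.insert/1.
defmodule Demo.People do
  def create_profile(attrs \\ []) do
    case Keyword.fetch(attrs, :email) do
      {:ok, email} -> {:ok, %{name: attrs[:name], email: email}}
      :error -> {:error, :email_is_required}
    end
  end
end

defmodule Demo.Seeds do
  # Ignoring the result hides the {:error, _} tuple.
  def run_seeds_quietly do
    _alice = Demo.People.create_profile(name: "Alice")
    :ok
  end

  # Matching on {:ok, _} raises MatchError when the insert is rejected.
  def run_seeds_strictly do
    {:ok, _alice} = Demo.People.create_profile(name: "Alice")
    :ok
  end
end

quiet_result = Demo.Seeds.run_seeds_quietly()

strict_result =
  try do
    Demo.Seeds.run_seeds_strictly()
  rescue
    MatchError -> :raised_match_error
  end

IO.inspect({quiet_result, strict_result})
```

With the strict form, a test that simply calls run_seeds will fail on invalid seed data even before any explicit assertions are added.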
Testing the seeds module
Now we can write a simple test that executes the interior bits of the seeds:
defmodule SeedsTest do
  use Test.DataCase, async: true

  test "successfully creates seed data" do
    [{seeds_module, _}] = Code.require_file("priv/repo/seeds.exs")
    seeds_module.run_seeds()

    # assert expected profiles are created
    # assert expected orgs are created
  end
end
Note that we can't refer to Seeds directly in the test, because at the time the VM loads and compiles the test file itself, our module does not exist. Code.require_file/2 returns a list with one tuple for each module defined in the file, where the first element of the tuple is the loaded module.
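The return value can be explored in isolation with a throwaway file. In this sketch the SeedsDemo module and the temporary path are hypothetical; only the shape of what Code.require_file/2 returns matters.

```elixir
# Write a hypothetical module-based script to a temporary location.
path =
  Path.join(
    System.tmp_dir!(),
    "seeds_module_demo_#{System.unique_integer([:positive])}.exs"
  )

File.write!(path, """
defmodule SeedsDemo do
  def run_seeds, do: :seeded
end
""")

# First require: one {module, bytecode} tuple per module defined in the file.
[{seeds_module, bytecode}] = Code.require_file(path)

# Nothing has run yet; the function is called explicitly through the variable.
result = seeds_module.run_seeds()

# Requiring the same file again returns nil, because it was already required.
second_require = Code.require_file(path)

IO.inspect({seeds_module, is_binary(bytecode), result, second_require})

File.rm!(path)
```

The nil on the second call is worth keeping in mind: if another test in the same run has already required priv/repo/seeds.exs, the pattern match on a one-element list would raise, so it is simplest to require the file from a single test module.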
Epilogue
After initially publishing this post, a few people reached out to me with questions. Specifically, where do I put my test file, and why don't I include the Seeds module in lib (in an Elixir file, vs an Elixir script file)?
Q: Where is the test file? In the codebase from which I pulled this example, we put the test file in test/seeds_test.exs. I could also see putting this into test/priv/seeds_test.exs or test/priv/repo/seeds_test.exs. The main point is making it discoverable, but for that we use annotations in our files: Nova plugin, VSCode plugin, Neovim plugin.
Q: Why not put Seed modules in lib? In many cases I don't mind changing production code or including test or development code in packaged releases. In this case, I feel like I don't want this module to ever be accidentally run in production… it should be clear when seeing …@example.com email addresses that no one should run these functions from a release, but I can't predict what will seem obvious to myself or others in future months or years. Things that seem obvious now might not be obvious in the middle of a production incident. So rather than include the Seeds module in a file that's automatically loaded and available in any environment, I would prefer instead to ship clear, concise, and testable context functions (the create_profile and create_org functions, for example), even if those function heads are never used by workflows available from the application.
Q: What about seeds that do need to run in production? Production seeds have very different requirements from development seeds. When production seeds change, data needs to be migrated. When development seeds change, one can drop and recreate the development database. Even with an application that needs production seeds, I would prefer those code paths to be different… and consider running production seeds functions from my development seeds module.
Attribution
- image: Łukasz Rawa @ Unsplash: https://unsplash.com/photos/brown-and-black-dried-leaves-NDro8tjU4e0