As of the latest update, ExVenture can be configured to run across multiple nodes. In this post I’ll show you the basics of how I did that.
What is ExVenture?
ExVenture is a server for multiplayer text-based games, also known as Multi-User Dungeons (MUDs).
You can see more information about ExVenture at exventure.org and GitHub. You can see it running on MidMUD. There is also a public Trello board.
Starting point
ExVenture started out heavily geared towards running as a single node. I used Registry very heavily, which only works within a single node. I also have a few local cache processes that wouldn’t work well out of the gate when spanning multiple nodes.
My first thoughts were about splitting up the app into an umbrella app and figuring out how to boot the web on one node, the telnet connection on another, etc. I started talking about this in the MUD Coders Guild, and I got a question of “why?”
This was a great question and got me thinking about what else I could do.
I eventually settled on trying to get the same application booting on all nodes and a leader node starting the processes that can only exist once in the cluster, e.g. the world.
Clustering
First up was connecting the nodes in an automated fashion. Since my app was heavily geared towards being a single node, this was safe to do on its own: the nodes wouldn’t talk to each other yet, and the entire world would simply be spun up on each node.
This was extremely simple with libcluster. All it took was installing the hex package and adding this to my configuration files:
config :libcluster,
  topologies: [
    local: [
      strategy: Cluster.Strategy.Epmd,
      config: [hosts: [:"world1@host", :"world2@host"]]
    ]
  ]
Then when starting my app I switched to booting with iex to get the sname flag available.
iex --sname world1 -S mix
iex --sname world2 -S mix
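To confirm that the nodes actually found each other, one option is to compare the expected node names against what the running node reports. This is a minimal sketch: ClusterCheck is a hypothetical helper, not part of ExVenture or libcluster, and it relies only on Node.self/0 and Node.list/0.

```elixir
defmodule ClusterCheck do
  @moduledoc "Hypothetical helper, not part of ExVenture or libcluster."

  # Returns the expected nodes that this node is not currently connected to.
  def missing(expected_nodes) do
    connected = [Node.self() | Node.list()]
    Enum.reject(expected_nodes, &(&1 in connected))
  end
end
```

Calling ClusterCheck.missing([:"world1@host", :"world2@host"]) from either iex session should return an empty list once both nodes are connected, assuming "host" matches the short hostname the nodes were started with.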
Picking a Leader
Once the world was clustered, I started on picking a leader. I had heard about Raft before, but never really looked into it.
If you are interested in clustering at all, I’d highly recommend giving the paper a read. It is very simple to follow along and understand, which was the main point of creating Raft: a simple to understand consensus algorithm.
For ExVenture, I went with implementing my own Raft module because I only wanted the leader election part of Raft. I don’t need (at least yet!) the rest of Raft.
You can see that in the Raft module. There is a lot in this module that doesn’t need to be covered here, but it boils down to this: the group picks a leader, and that leader then calls the subscriptions that care about who the leader is.
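For readers who have not read the paper, the core voting rules of Raft's leader election can be sketched as two pure functions. This is a simplified illustration and not ExVenture's Raft module: ElectionSketch and its function names are hypothetical, and it leaves out timeouts, heartbeats, and the log comparison that full Raft performs before granting a vote.

```elixir
defmodule ElectionSketch do
  @moduledoc "Hypothetical, simplified election rules; not ExVenture's Raft module."

  # Per-node election state: the latest term seen and who got our vote in it.
  defstruct term: 0, voted_for: nil

  # A node handling a vote request from `candidate` for `term`.
  # Returns {granted?, new_state}.
  def handle_vote_request(%__MODULE__{} = state, candidate, term) do
    cond do
      # A newer term resets our vote, so it can go to this candidate.
      term > state.term ->
        {true, %__MODULE__{term: term, voted_for: candidate}}

      # Same term: grant only if we have not voted for someone else.
      term == state.term and state.voted_for in [nil, candidate] ->
        {true, %__MODULE__{state | voted_for: candidate}}

      # Stale term, or we already voted for another candidate.
      true ->
        {false, state}
    end
  end

  # A candidate wins once it holds votes from a strict majority of the cluster.
  def won?(votes_received, cluster_size) do
    votes_received > div(cluster_size, 2)
  end
end
```

Because each node votes at most once per term and a win needs a strict majority, two candidates cannot both win the same term.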
A leader is picked
Once a leader is picked, the Game.World.Master process uses pg2 to find all of the other Game.World.Master processes and sees what zones are alive in the cluster.
After finding out which nodes are alive, it spins up the zones that are not yet online across the cluster, using a simple rebalancing algorithm.
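One way such a rebalancing step could look is to hand each offline zone to whichever node currently hosts the fewest zones. This is a sketch under that assumption and not ExVenture's actual algorithm: RebalanceSketch, the shape of the running map, and the tie-break on node name are all hypothetical, and it assumes at least one live node.

```elixir
defmodule RebalanceSketch do
  @moduledoc "Hypothetical zone placement; not ExVenture's actual algorithm."

  # `running` maps each live node to the zone ids it already hosts.
  # `all_zones` is every zone id that should be online somewhere.
  # Returns a list of {zone_id, node} pairs for the zones that need starting.
  def plan(running, all_zones) do
    online = running |> Map.values() |> List.flatten()
    offline = Enum.reject(all_zones, &(&1 in online))

    counts = Map.new(running, fn {node, zones} -> {node, length(zones)} end)

    {assignments, _counts} =
      Enum.reduce(offline, {[], counts}, fn zone, {acc, counts} ->
        # Pick the node hosting the fewest zones; ties go to the lowest name.
        {node, _count} = Enum.min_by(counts, fn {node, count} -> {count, node} end)
        {[{zone, node} | acc], Map.update!(counts, node, &(&1 + 1))}
      end)

    Enum.reverse(assignments)
  end
end
```

The leader would then ask each chosen node to start its assigned zones; how that request is delivered is outside this sketch.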
Global process registry
I also switched to using the global process registry as part of this. I started looking at swarm but it seemed to be something different than what I was looking for.
I will most likely end up changing this in the future, but switching {:via, Registry, {Game.NPC, id}} to {:global, {Game.NPC, id}} was an extremely simple change that worked. So I went with it and haven’t looked back.
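To show what that name change means for a process, here is a minimal sketch of a GenServer registered in the global registry. NPCSketch is a hypothetical stand-in and not ExVenture's Game.NPC; the only part taken from the post is the {:global, {module, id}} name shape.

```elixir
defmodule NPCSketch do
  @moduledoc "Hypothetical stand-in for an NPC process; not ExVenture's Game.NPC."
  use GenServer

  # Register under a cluster-wide name instead of a node-local Registry.
  def start_link(id) do
    GenServer.start_link(__MODULE__, id, name: via(id))
  end

  # Callers address the process by name, whichever node it lives on.
  def id(id), do: GenServer.call(via(id), :id)

  defp via(id), do: {:global, {__MODULE__, id}}

  @impl true
  def init(id), do: {:ok, id}

  @impl true
  def handle_call(:id, _from, id), do: {:reply, id, id}
end
```

Starting a second process with the same id anywhere in the cluster returns {:error, {:already_started, pid}}, which is what keeps each NPC unique across nodes.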
Messages spanning the cluster
The game was now officially multi-node, and you could play on either iex server and see other players and NPCs on the other one.
There was only one final step, and that was setting up pg2 groups for each of my caches and my communication layer.
This is very simple: join the group in the init function, then make a slight change to how you send messages to your GenServers.
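@key :items

def init(_) do
  # Create the group (a no-op if it already exists) and join it.
  :ok = :pg2.create(@key)
  :ok = :pg2.join(@key, self())
  #...
end

def insert(item) do
  # Send the insert to every cache process in the group, on every node.
  members = :pg2.get_members(@key)

  Enum.map(members, fn member ->
    GenServer.call(member, {:insert, item})
  end)
end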
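Note that pg2 was deprecated in Erlang/OTP 23 and removed in OTP 24, so the snippet above only runs on older releases. On newer releases the same pattern can be sketched with the pg module that replaced it. CacheSketch and its get/2 helper are hypothetical, and the sketch assumes the default pg scope has already been started (for example with :pg.start_link() or from a supervisor).

```elixir
defmodule CacheSketch do
  @moduledoc "Hypothetical cache using :pg (OTP 23+) in place of :pg2."
  use GenServer

  @key :items

  def start_link(_opts \\ []), do: GenServer.start_link(__MODULE__, %{})

  # Send the insert to every cache process in the group, on every node.
  def insert(item) do
    Enum.each(:pg.get_members(@key), &GenServer.call(&1, {:insert, item}))
  end

  # Read an item back from one specific cache process.
  def get(pid, id), do: GenServer.call(pid, {:get, id})

  @impl true
  def init(state) do
    # :pg groups are created implicitly on first join, so there is no create step.
    :ok = :pg.join(@key, self())
    {:ok, state}
  end

  @impl true
  def handle_call({:insert, item}, _from, state) do
    {:reply, :ok, Map.put(state, item.id, item)}
  end

  def handle_call({:get, id}, _from, state) do
    {:reply, Map.get(state, id), state}
  end
end
```

Every cache process that has joined the group receives the insert, so each node ends up with its own local copy of the item.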
Next Steps
With this up and running I was able to get MidMUD into a multi-node setup (3 world servers) for production. It has been working out pretty well so far, and I have only found one large bug (not being able to communicate across nodes in channels).
I will post more updates as I continue enhancing the distributed nature of ExVenture.
You can see everything described here in these two pull requests, #37 and #39.