Today’s Erlang Thursday kicks of taking a look at the digraph module.
As I was looking into it, this module didn’t line up with my expectations of Erlang behavior, so I want to focus on that difference before taking a look at the functions in the module.
If we start by browsing through the function signatures in the digraph
module, we can see that the only function that returns a digraph()
are digraph:new/0
and digraph:new/1
.
Thinking this was odd for a Erlang API, I went to the Erlang shell, and added a vertex to the digraph()
, and then inpsected the result of that operation.
G = digraph:new(). % {digraph,69651,73748,77845,true} G2 = digraph:add_vertex(G, foo, bar). % foo
The return value of calling digraph:add_vertex/3
was foo
, which was the second argument, and doesn’t match up with what the representation of a graph looks like.
Okay, time to look at the digraph()
in G
again then to see if that changed.
G. % {digraph,69651,73748,77845,true}
That tuple result of the digraph()
looks the same, so let’s see if that vertex we added is in the graph, since we did get the return value of foo
.
digraph:vertices(G). % [foo] digraph:vertex(G, foo). % {foo,bar}
Hrmm… Okay, looks like that vertex is in there.
Let’s add another vertex to the digraph()
bound to G
.
V = digraph:add_vertex(G). % ['$v'|0] digraph:vertices(G). % [['$v'|0],foo]
That one is added as well.
HC SVNT DRACONES (Here Are Dragons)
So the behavior I want to call out in this post before we start looking at the functions in this module is that these functions exhibit observably mutable behavior on a digraph()
.
I say it is observably mutable, becuase while if it is not being changed under the covers of the implementation, the structure can be changed while the binding of the variable to the reference stays the same.
digraph:vertices(G). % [['$v'|0],foo] Copy = G. % {digraph,69651,73748,77845,true} V2 = digraph:add_vertex(G, wat). % wat digraph:vertices(Copy). % [['$v'|0],foo,wat] digraph:vertices(G). % [['$v'|0],foo,wat]
This even mutates other varaible references as well, so this breaks any convention that I have seen in the Erlang ecosystem about keeping all data immutable.
We will continue looking at the digraph
module in future Erlang Thursday posts, but I wanted to spend some time calling out the mutability inherent in the digraph()
s, so that when you need to use a one, you can be aware that this is not something you want to use in your concurrent parts of your application without great caution.
Updated (October 18th)
As part of the translation into Lisp Flavoured Erlang as part of LFE Fridays, Robert Virding updated me with the reasoning of the digraph()
‘s mutability, which he included in the translation as well.
The Dragons slain
The reason behind the dragons is how a digraph()
is implemented. A digraph
is built of 3 ETS tables, with, in this case, the table ids 8207
, 12304
and 16401
. You can see this by calling ets:i/0 which lists information about all the current tables. You can see that the 3 tables are owned by the LFE shell process:
> self(). <0.28.0> > ets:i(). id name type size mem owner ---------------------------------------------------------------------------- 1 code set 282 10393 code_server 4098 code_names set 64 7713 code_server 8207 vertices set 3 328 <0.28.0> 12304 edges set 0 305 <0.28.0> 16401 neighbours bag 2 319 <0.28.0> ac_tab ac_tab set 6 839 application_controller file_io_servers file_io_servers set 0 305 file_server_2 global_locks global_locks set 0 305 global_name_server global_names global_names set 0 305 global_name_server ... ok
The digraph()
structure itself is just a tagged tuple containing the table ids. As all changes are made to the ETS
tables the structure itself never changes. Data about the tables and their contents can be read with ets:info/1 and ets:i/1.
–Proctor
Erlang is simply being practical here – the graphs are stored as ETS structures and are mutable. Working with immutable graphs in a purely functional manner (especially in the general case, if you want to support large graphs of any kind) is a very hard problem. In fact, there is scarcely any academic / theoretical work done in this area, and practical solutions simply don’t exist. There has been some rather fascinating research on “inductive graphs” – see for example web.engr.oregonstate.edu/~erwig/papers/InductiveGraphs_JFP01.pdf
I have addressed the use of Erlang’s digraph module from Elixir in a blog post here: http://blog.jordan-dimov.com/elixir-graph-data-structures-with-erlangs-digraph/
Based off everything I have seen with the design of the language and libraries, I never doubted there was a good, pragmatic, and well informed reason; the catch is that it is something subtle that doesn’t align with normal data structure behaviors, and that if someone expects the same behavior of immutability as lists, tuples, or others, then they would be in for a surprise. 😉
As I mentioned, I am planning to keep taking a more in depth look at the different functions in the module, but felt that this was something that felt like the documentation at the top of the module was missing:
Thanks for the link to the inductive graphs information, and your blog post was very good read as well.