Category Archives: Erlang

Erlang Thursday – ETS Introduction, Part 1

Today’s Erlang Thursday starts the beginning of an intro to the ets module, and ETS in general.

ETS stands for Erlang Term Storage, and is a in-memory store for Erlang terms, a.k.a pieces of an Erlang data type, that provides constant access time to the data stored.

ETS can be thought of as a key/value store style storage, and it uses the concept of tables as the way of grouping together data.

One of the first things that is useful to know is that ETS tables are created by a process which, unless transfered to another process, is the owner of the table.

When the owner dies, the table gets deleted, and is no longer accessible.

Let’s see what this would look like.

First, after starting a new Erlang shell, we will check the PID (process identifier) of the shell we are in.

self().
% <0.34.0>

We then will create a new ETS table. We will be going into future details about the various ways new tables can be created in future posts, so for now, we will just create a new table by only specifying a name and empty list of options.

TableId = ets:new(table, []).
% 20496

Capturing table id, we will take a look at the info that ETS knows about that table with ets:info/1.

ets:info(TableId).
% [{read_concurrency,false},
%  {write_concurrency,false},
%  {compressed,false},
%  {memory,305},
%  {owner,<0.34.0>},
%  {heir,none},
%  {name,table},
%  {size,0},
%  {node,nonode@nohost},
%  {named_table,false},
%  {type,set},
%  {keypos,1},
%  {protection,protected}]

Time to cause the owning process to crash. In this case we’ll do a bad pattern match to cause a bad match exception.

1 = 2.
% ** exception error: no match of right hand side value 2

And let’s check the PID of the process to double check that the shell has indeed started a new process for us to run in.

self().
% <0.40.0>

And yes, the PID self() returned is different than the PID we got when we called self() the first time.

Time to look at the info for the table we created earlier again and see what we get.

ets:info(TableId).
% undefined

undefined. So we no longer have any table found by ETS for that table id.

We take a secondary look using ets:all/0 to see if we can see if it might be floating around somewhere still but the call to ets:info/1 is just not returning for the table id.

ets:all().
% [8207,file_io_servers,inet_hosts_file_byaddr,
%  inet_hosts_file_byname,inet_hosts_byaddr,inet_hosts_byname,
%  inet_cache,inet_db,global_pid_ids,global_pid_names,
%  global_names_ext,global_names,global_locks,4098,1,ac_tab]

Doesn’t look like it, so let’s create another table with the same table name as before.

Table2Id = ets:new(table, []).
% 24592

That succeeds and doesn’t complain about trying to create a table with the same name as an existing table.

We will call ets:all/0 again, and we can see there is an item in the list with the id that was returned from ets:new/2.

ets:all().
% [24592,8207,file_io_servers,inet_hosts_file_byaddr,
%  inet_hosts_file_byname,inet_hosts_byaddr,inet_hosts_byname,
%  inet_cache,inet_db,global_pid_ids,global_pid_names,
%  global_names_ext,global_names,global_locks,4098,1,ac_tab]

Time to crash the process again.

1 = 2.
% ** exception error: no match of right hand side value 2

We note that we do have a new PID again.

self().
% <0.47.0>

And if we call ets:all/0 one more time, we can see that the table identifier that was previously in the list has gone away.

ets:all().
% [8207,file_io_servers,inet_hosts_file_byaddr,
%  inet_hosts_file_byname,inet_hosts_byaddr,inet_hosts_byname,
%  inet_cache,inet_db,global_pid_ids,global_pid_names,
%  global_names_ext,global_names,global_locks,4098,1,ac_tab]

So with this initial look at ETS, we have demonstrated an owning process crash does remove the table, and we have also gotten an preview of a couple of the functions in the ets module, specifically ets:new/2, ets:info/1, and ets:all/0.

We will continue looking at the overview of ETS for a few posts, while doing some cursory coverage of some of the functions in the ets module, and after that, we will then start to get into the specifics of the different functions in the ets module.

–Proctor

Erlang Thursday – digraph:del_path/3

Today’s Erlang Thursday is on digraph:del_path/3.

We will continue working with the same graph we started with in the previous post on digraph:get_path/3.

Graph = digraph:new().
% {digraph,20498,24595,28692,true}
V1 = digraph:add_vertex(Graph).
% ['$v'|0]
V2 = digraph:add_vertex(Graph).
% ['$v'|1]
V3 = digraph:add_vertex(Graph).
% ['$v'|2]
V4 = digraph:add_vertex(Graph).
% ['$v'|3]
E1 = digraph:add_edge(Graph, V1, V2).
% ['$e'|0]
E2 = digraph:add_edge(Graph, V2, V3).
% ['$e'|1]
E3 = digraph:add_edge(Graph, V3, V4).
% ['$e'|2]
E4 = digraph:add_edge(Graph, V2, V4).
% ['$e'|3]
E5 = digraph:add_edge(Graph, V4, V1).
% ['$e'|4]

digraph:del_path/3 takes three arguments, a Graph, a source vertex, and a destination vertex, and removes all edges in a each path in the graph from the source vertex to the destination vertex, until no path exist between the source and destination vertices.

The return value of digraph:del_path/3 is always a return value of true.

Looking at the picture of the graph above as reference, we are going to call digraph:del_path/3 for the graph with a source vertex of V1, and a destination vertex of V4.

digraph:del_path(Graph, V1, V4).
% true
digraph:vertices(Graph).
% [['$v'|2],['$v'|1],['$v'|0],['$v'|3]]
digraph:edges(Graph).
% [['$e'|1],['$e'|2],['$e'|4]]

Translating the edge names, we see that the edge from V1 to V2 has been removed, as well as the edge from V2 to V4 has been removed.

So how did Erlang come up with this result?

This puzzled me at first, as it wasn’t one of the two scenarios I was expecting to see, which were either: remove all edges but the edge from V4 to V1, or remove only the edge from V1 to V2.

I then opened the Erlang source code on Github for digraph module to clarify my thinking, and looking at the code it then made sense what was happening.

First digraph:del_path/3 calls digraph:get_path/3, and removes all edges in that path, and then recurses until no path is found.

This is when it clicked as to why Erlang was removing only those edges.

If we call digraph:get_path/3 on a fresh version of the graph, we see that it returns the path of V1 -> V2 -> V4.

digraph:get_path(Graph, V1, V4).
[['$v'|0],['$v'|1],['$v'|3]]

Erlang then removes the edges in that path, and the will call digraph:del_path/3, which calls digraph:get_path/3 again, but as we removed the edge between V1 and V2, no path is found so the process is finished.

This is why we see more edges removed if we reset the Graph again (by exiting the shell and recreating it from scratch by pasting the initialization into the shell), and call digraph:del_path/3 between V2 and V4.

digraph:del_path(Graph, V2, V4).
% true
digraph:edges(Graph).
% [['$e'|0],['$e'|4]]

This case there are the paths V2 -> V4 and V2 -> V3 -> V4, and if we remove the path V2 -> V4, the removal of all the edges associated with that path doesn’t break the path of V2 -> V3 -> V4, so it can remove all edges in that path as well.

So we have a win in the case where the documentation wasn’t quite as clear as it could be, but having the Erlang standard library be open source gets us a win because we can always go in and check out what the code is really doing.

–Proctor

Erlang Thursday – digraph:get_cycle/2

Today’s Erlang Thursday is on digraph:get_cycle/2.

We will continue working with the graph from the previous post on digraph:get_path/3.

Graph = digraph:new().
% {digraph,20498,24595,28692,true}
V1 = digraph:add_vertex(Graph).
% ['$v'|0]
V2 = digraph:add_vertex(Graph).
% ['$v'|1]
V3 = digraph:add_vertex(Graph).
% ['$v'|2]
V4 = digraph:add_vertex(Graph).
% ['$v'|3]
E1 = digraph:add_edge(Graph, V1, V2).
% ['$e'|0]
E2 = digraph:add_edge(Graph, V2, V3).
% ['$e'|1]
E3 = digraph:add_edge(Graph, V3, V4).
% ['$e'|2]
E4 = digraph:add_edge(Graph, V2, V4).
% ['$e'|3]
E5 = digraph:add_edge(Graph, V4, V1).
% ['$e'|4]

digraph:get_cycle/2 takes a graph G, and an vertex V, and tries to find a path that creates a cycle between the vertex V in graph G.

digraph:get_cycle(Graph, V1).
% [['$v'|0],['$v'|1],['$v'|3],['$v'|0]]
digraph:get_cycle(Graph, V2).
% [['$v'|1],['$v'|3],['$v'|0],['$v'|1]]

Next, we add a new vertex V5, and a new edge originating from V4 and ending on V5

We then call digraph:get_cycle/2 on V5, and we get back a false as no cyle exists in the graph with vertex V5 in it.

V5 = digraph:add_vertex(Graph).
% ['$v'|4]
E6 = digraph:add_edge(Graph, V4, V5).
% ['$e'|5]
digraph:get_cycle(Graph, V5).
% false

The digraph module also contains the function digraph:get_short_cycle/2.

digraph:get_short_cycle/2 attempts to find the shortest cycle in the graph G for vertex V.

The documentation for digraph:get_short_cycle/2 exact phrasing is:

Tries to find an as short as possible simple cycle through the vertex V of the digraph G.

So depending on how you read that, the shortest cycle might not be guaranteed to be returned, but simply a shorter cycle, which may depend on the overall size and complexity of the graph.

digraph:get_short_cycle(Graph, V1).
% [['$v'|0],['$v'|1],['$v'|3],['$v'|0]]
digraph:get_short_cycle(Graph, V5).
% false

–Proctor

Erlang Thursday – digraph:in_neighbors/2

Today’s Erlang Thursday is on digraph:in_neighbors/2.

digraph:in_neighbors/2 takes a graph G, and a vertex V, and will return a list of all the vertices that have edges originating from them that are directed toward the vertex V.

We will continue working with the graph from last week’s post on digraph:get_path/3.

Graph = digraph:new().
% {digraph,20498,24595,28692,true}
V1 = digraph:add_vertex(Graph).
% ['$v'|0]
V2 = digraph:add_vertex(Graph).
% ['$v'|1]
V3 = digraph:add_vertex(Graph).
% ['$v'|2]
V4 = digraph:add_vertex(Graph).
% ['$v'|3]
E1 = digraph:add_edge(Graph, V1, V2).
% ['$e'|0]
E2 = digraph:add_edge(Graph, V2, V3).
% ['$e'|1]
E3 = digraph:add_edge(Graph, V3, V4).
% ['$e'|2]
E4 = digraph:add_edge(Graph, V2, V4).
% ['$e'|3]
E5 = digraph:add_edge(Graph, V4, V1).
% ['$e'|4]

With that graph setup again, we can now find the in_neighbors of different vertices in our graph.

digraph:in_neighbours(Graph, V4).
% [['$v'|1],['$v'|2]]
digraph:in_neighbours(Graph, V1).
% [['$v'|3]]
digraph:in_neighbours(Graph, V2).
% [['$v'|0]]

So for vertex V4 we see the return value of [['$v'|1],['$v'|2]], which are the vertices V2 and V3. For V1 we have an inbound neighbor of V4, and for V2 we have the inbound neighbor of V1.

digraph:out_neighbors/2

The digraph module also contains the function digraph:out_neighbors/2, which returns a list of the vertices that a the given vertex “points to” with its edges in the directed graph.

digraph:out_neighbours(Graph, V2).
% [['$v'|3],['$v'|2]]
digraph:out_neighbours(Graph, V4).
% [['$v'|0]]
digraph:out_neighbours(Graph, V1).
% [['$v'|1]]

We can see from the picture of our graph that V2 has edges that “point to” the vertices V3 and V4, and if we look at the result of digraph:out_neighbors/2, we get the result of the vertices V3 and V4.

In this case we get the list of vertices where V4 is first and V3 is second, but that may not be the case, as the documentation states that the the edges are “in some unspecified order”, which holds true of digraph:in_neighbors/2 as well.

–Proctor

Erlang Thursday – digraph:get_path/3

Today’s Erlang Thursday is on digraph:get_path/3.

digraph:get_path/3 takes a graph, a starting vertex, and an ending vertex and will attempt to find some path through the graph of length greater than zero, where all vertices in the path are distinct, except allowing for the first and last vertices to be the same.

If a path is found, it returns a list of the vertices visited (in order) to complete the path, if no path is found, false is returned.

First we will setup a new graph that we can traverse.

Graph = digraph:new().
% {digraph,20498,24595,28692,true}
V1 = digraph:add_vertex(Graph).
% ['$v'|0]
V2 = digraph:add_vertex(Graph).
% ['$v'|1]
V3 = digraph:add_vertex(Graph).
% ['$v'|2]
V4 = digraph:add_vertex(Graph).
% ['$v'|3]
E1 = digraph:add_edge(Graph, V1, V2).
% ['$e'|0]
E2 = digraph:add_edge(Graph, V2, V3).
% ['$e'|1]
E3 = digraph:add_edge(Graph, V3, V4).
% ['$e'|2]
E4 = digraph:add_edge(Graph, V2, V4).
% ['$e'|3]
E5 = digraph:add_edge(Graph, V4, V1).
% ['$e'|4]

This will give us a graph that looks like the following:

Now we can get to playing with digraph:get_path/3 and see what the paths are from any sets of nodes.

digraph:get_path(Graph, V2, V3).
% [['$v'|1],['$v'|2]]
digraph:get_path(Graph, V2, V4).
% [['$v'|1],['$v'|3]]
digraph:get_path(Graph, V2, V1).
% [['$v'|1],['$v'|3],['$v'|0]]
digraph:get_path(Graph, V3, V1).
% [['$v'|2],['$v'|3],['$v'|0]]
digraph:get_path(Graph, V1, V4).
% [['$v'|0],['$v'|1],['$v'|3]]
digraph:get_path(Graph, V1, V1).
% [['$v'|0],['$v'|1],['$v'|3],['$v'|0]]

Note that these just happen to be the shortest paths, but this is not guaranteed to return the shortest path, but just the first path found.

And if we add a new vertex, and don’t connect it to any other node in the graph, and we call digraph:get_path/3, we can see it returns false.

V5 = digraph:add_vertex(Graph).
% ['$v'|4]
digraph:get_path(Graph, V1, V5).
% false

–Proctor

Erlang Thursday – digraph:add_edge/4

Today’s Erlang Thursday is on digraph:add_edge/4.

digraph:add_edge/4 takes a graph as its first argument, the originating (eminating) vertex as its second arugment, the destination (incident) vertex as its third argument, and a label.

Graph = digraph:new().
% {digraph,20498,24595,28692,true}
Vertex1 = digraph:add_vertex(Graph, foo).
% foo
Vertex2 = digraph:add_vertex(Graph, bar).
% bar
Edge1 = digraph:add_edge(Graph, Vertex1, Vertex2, {foo, bar}).
% ['$e'|0]
digraph:edges(Graph).
% [['$e'|0]]
Edge2 = digraph:add_edge(Graph, Vertex2, Vertex1, {bar, foo}).
% ['$e'|1]
digraph:edges(Graph).
% [['$e'|1],['$e'|0]]

The digraph module also contains digraph:add_edge/5 which allows you to specify the edge identifier, in this case we want the edge to be myEdge.

digraph:add_edge(Graph, myEdge, Vertex2, Vertex1, myLabel).
% myEdge
digraph:edges(Graph).
% [['$e'|1],['$e'|2],['$e'|3],myEdge,['$e'|0]]

As well as digraph:add_edge/3 which allows you to not specify the edge or the label.

digraph:add_edge(Graph, Vertex2, Vertex1).
% ['$e'|2]
digraph:add_edge(Graph, Vertex2, Vertex1).
% ['$e'|3]
digraph:edges(Graph).
% [['$e'|1],['$e'|2],['$e'|3],['$e'|0]]

And if you note in the examples for digraph:add_edge/3 and digraph:add_edge/5 we added a number of edges with the same eminate and incident vertices, and it was happy to create those edges for us.

We can also create acyclic digraphs by using digraph:new/1, and specifying that we want the digraph() to be acyclic.

Graph2 = digraph:new([acyclic]).
% {digraph,20498,24595,28692,false}
VertexA = digraph:add_vertex(Graph2, foo).
% foo
VertexB = digraph:add_vertex(Graph2, bar).
% bar
EdgeAB = digraph:add_edge(Graph2, VertexA, VertexB, {foo, bar}).
% ['$e'|0]
EdgeBA = digraph:add_edge(Graph2, VertexB, VertexA, {bar, foo}).
% {error,{bad_edge,[foo,bar]}}

When we try to add an edge that will create a cycle in an acyclic directed graph, we get a return of a bad_edge error with the two edges specified.

–Proctor

BEAMing With Joy presentation from ElixirConf 2015 now available

Confreaks has started releases the videos from ElixirConf 2015 a couple of days ago, and just released the recording of my BEAMing With Joy talk yesterday.

The video, slides, favorite response to the talk so far, and a place for any questions you have can be found at http://www.proctor-it.com/beaming-with-joy/.

And if you have any comments, questions, or feedback, I would love to hear from you.

–Proctor

Erlang Thursday – digraph:add_vertex/1

Today’s Erlang Thursday starts to dig into the digraph module, as promised last week, and takes a look at digraph:add_vertex/1.

First we create a new directed graph, so we have something we can add vertices to.

Graph = digraph:new().
% {digraph,20498,24595,28692,true}

We then add some vertices to the graph by using digraph:add_vertex/1.

digraph:add_vertex(Graph).
% ['$v'|0]
digraph:add_vertex(Graph).
% ['$v'|1]
digraph:add_vertex(Graph).
% ['$v'|2]

As we don’t specify any information about the vertex we want to add, Erlang will create a new vertex for us of the format ['$v', I], with an empty list as the label where I is a non-negative integer.

We can also use digraph:add_vertex/2 to add a vertex if we wish to provide the vertex identifer, or provide vertex identifier and label in the case of digraph:add_vertex/3. As with digraph:add_vertex/1, digraph:add_vertex/2 uses the empty list as the label as well.

digraph:add_vertex(Graph, vertex1).
% vertex1
digraph:add_vertex(Graph, vertex2, "Vertex 2").
% vertex2

We have now added 5 vertices, and can check what vertices we have in the digraph() by using digraph:vertices/1.

digraph:vertices(Graph).
% [['$v'|2],['$v'|1],['$v'|0],vertex2,vertex1]

If we decide we want to try to add a vertex ourselves of the format ['$v' | I], we can run into trouble if you call digraph:add_vertex/1 after it.

digraph:add_vertex(Graph, ['$v' | 3]).
% ['$v'|3]
digraph:add_vertex(Graph).
% ['$v'|3]
digraph:vertices(Graph).
% [['$v'|2],['$v'|1],['$v'|0],['$v'|3],vertex2,vertex1]
digraph:add_vertex(Graph, ['$v' | 4]).
% ['$v'|4]
digraph:vertices(Graph).
% [['$v'|4],
%  ['$v'|2],
%  ['$v'|1],
%  ['$v'|0],
%  ['$v'|3],
%  vertex2,vertex1]

So we add a vertex by specifying the vertex() we want to add, and then add a new vertex and let Erlang take care of creating that vertex, and we wind up “losing” a vertex, as one essentially gets overridden when we look at the end state of the digraph().

–Proctor

Erlang Thursday – The digraph module

Today’s Erlang Thursday kicks of taking a look at the digraph module.

As I was looking into it, this module didn’t line up with my expectations of Erlang behavior, so I want to focus on that difference before taking a look at the functions in the module.

If we start by browsing through the function signatures in the digraph module, we can see that the only function that returns a digraph() are digraph:new/0 and digraph:new/1.

Thinking this was odd for a Erlang API, I went to the Erlang shell, and added a vertex to the digraph(), and then inpsected the result of that operation.

G = digraph:new().
% {digraph,69651,73748,77845,true}
G2 = digraph:add_vertex(G, foo, bar).
% foo

The return value of calling digraph:add_vertex/3 was foo, which was the second argument, and doesn’t match up with what the representation of a graph looks like.

Okay, time to look at the digraph() in G again then to see if that changed.

G.
% {digraph,69651,73748,77845,true}

That tuple result of the digraph() looks the same, so let’s see if that vertex we added is in the graph, since we did get the return value of foo.

digraph:vertices(G).
% [foo]
digraph:vertex(G, foo).
% {foo,bar}

Hrmm… Okay, looks like that vertex is in there.

Let’s add another vertex to the digraph() bound to G.

V = digraph:add_vertex(G).
% ['$v'|0]
digraph:vertices(G).
% [['$v'|0],foo]

That one is added as well.

HC SVNT DRACONES (Here Are Dragons)

So the behavior I want to call out in this post before we start looking at the functions in this module is that these functions exhibit observably mutable behavior on a digraph().

I say it is observably mutable, becuase while if it is not being changed under the covers of the implementation, the structure can be changed while the binding of the variable to the reference stays the same.

digraph:vertices(G).
% [['$v'|0],foo]
Copy = G.
% {digraph,69651,73748,77845,true}
V2 = digraph:add_vertex(G, wat).
% wat
digraph:vertices(Copy).
% [['$v'|0],foo,wat]
digraph:vertices(G).
% [['$v'|0],foo,wat]

This even mutates other varaible references as well, so this breaks any convention that I have seen in the Erlang ecosystem about keeping all data immutable.

We will continue looking at the digraph module in future Erlang Thursday posts, but I wanted to spend some time calling out the mutability inherent in the digraph()s, so that when you need to use a one, you can be aware that this is not something you want to use in your concurrent parts of your application without great caution.

Updated (October 18th)

As part of the translation into Lisp Flavoured Erlang as part of LFE Fridays, Robert Virding updated me with the reasoning of the digraph()‘s mutability, which he included in the translation as well.

The Dragons slain

The reason behind the dragons is how a digraph() is implemented. A digraph is built of 3 ETS tables, with, in this case, the table ids 8207, 12304 and 16401. You can see this by calling ets:i/0 which lists information about all the current tables. You can see that the 3 tables are owned by the LFE shell process:

> self().
<0.28.0>
> ets:i().
 id              name              type  size   mem      owner
 ----------------------------------------------------------------------------
 1               code              set   282    10393    code_server
 4098            code_names        set   64     7713     code_server
 8207            vertices          set   3      328      <0.28.0>
 12304           edges             set   0      305      <0.28.0>
 16401           neighbours        bag   2      319      <0.28.0>
 ac_tab          ac_tab            set   6      839      application_controller
 file_io_servers file_io_servers   set   0      305      file_server_2
 global_locks    global_locks      set   0      305      global_name_server
 global_names    global_names      set   0      305      global_name_server
...

ok

The digraph() structure itself is just a tagged tuple containing the table ids. As all changes are made to the ETS tables the structure itself never changes. Data about the tables and their contents can be read with ets:info/1 and ets:i/1.

–Proctor

Erlang Thursday – erl_tar:table/1

Today’s Erlang Thursday is on erl_tar:table/1.

erl_tar:table/1 returns a list of filenames included in the tar file.

erl_tar:table("animal_sounds.tar").
% {ok,["dog.txt","cat.txt","pony.txt","bear.txt"]}

There is also a version erl_tar:table/2 that takes a options list as well.

erl_tar:table("animal_sounds.tar.gz", [compressed]).
% {ok,["dog.txt","cat.txt","pony.txt","bear.txt"]}
erl_tar:table("animal_sounds.tar.gz", [compressed,verbose]).
% {ok,[{"dog.txt",regular,5,
%       {{2015,9,23},{22,18,47}},
%       420,501,20},
%      {"cat.txt",regular,5,{{2015,9,23},{22,18,56}},420,501,20},
%      {"pony.txt",regular,8,{{2015,9,23},{22,19,10}},420,501,20},
%      {"bear.txt",regular,19,
%       {{2015,9,23},{22,21,16}},
%       420,501,20}]}

With the verbose option, instead of just getting a list of filenames, we get a list of tuples when we pass verbose.

The tuple is: Filename, Filetype (regular file/directory or a symbolic link), Bytes of the file, Timestamp tuple, Permissions (expressed in decimal instead of octal), UserId, and GroupId.

The documentation does not specify any of the information about the return type, and the credit for clarification of what the 420,501,20 items represent is all from Robert Virding, from emailing him this post to be translated as part of LFE Fridays.

–Proctor