Erlang Thursday – ETS Introduction Part 3: ETS Table Types

Today’s Erlang Thursday continues the introduction to ETS and takes a look at the different types of storage strategies that ETS supports.

The different types that ETS supports are: set, ordered_set, bag, and duplicate bag.

Each of these different types can be passed in when creating a new ETS table, but let’s see what type of ETS table we get when we don’t specify any of the types.

ETS_Empty = ets:new(ets_empty, []).
% 36886
ets:info(ETS_Empty).
% [{read_concurrency,false},
%  {write_concurrency,false},
%  {compressed,false},
%  {memory,305},
%  {owner,<0.50.0>},
%  {heir,none},
%  {name,ets_empty},
%  {size,0},
%  {node,nonode@nohost},
%  {named_table,false},
%  {type,set},
%  {keypos,1},
%  {protection,protected}]

If we look above, we can see the type tagged tuple has the type of set.

To see how the different types can work we will create three tuples to add to the ETS tables of the different type to see what they store.

Item1 = {1, a}.
% {1,a}
Item2 = {1.0, "a"}.
% {1.0,"a"}
Item3 = {1, "one"}.
% {1,"one"}

We will have a two tuples with the first element of 1, and one tuple whose first element is 1.0 to see how the different types of ETS tables behave when given the “same” key.

Why have key of both 1 and 1.0? Because depending on the comparison of equality used, they may or may not be seen as the same, and therefore the same key.

1 == 1.0.
% true
1 =:= 1.0.
% false

First we will take a look at an ETS table of type set.

ETS_Set = ets:new(ets_set, [set]).
40978

We insert Item1 followed by an insert of Item2, and use ets:tab2list/1 to see what is stored in the ETS table.

ets:insert(ETS_Set, Item1).
% true
ets:insert(ETS_Set, Item2).
% true
ets:tab2list(ETS_Set).
% [{1,a},{1.0,"a"}]

An ETS table of type set sees 1 and 1.0 as different keys. So now let’s add Item3 and see what happens when we do an insert with an already existing key.

ets:insert(ETS_Set, Item3).
% true
ets:tab2list(ETS_Set).
% [{1,"one"},{1.0,"a"}]

The previous tuple with the key of 1 was replaced by the tuple for Item3 which is the last thing we inserted.

Let’s look at what an ordered_set does.

ETS_OrdSet = ets:new(ets_ordset, [ordered_set]).
% 45075

Again we’ll insert Item1 followed by Item2 and use ets:tab2list/1 to check it’s state.

ets:insert(ETS_OrdSet, Item1).
% true
ets:insert(ETS_OrdSet, Item2).
% true
ets:tab2list(ETS_OrdSet).
% [{1.0,"a"}]

In this case, the key of 1.0 was seen the same as the previous 1 that was in there, so it overwrites the first item inserted.

We insert Item3 to the ordered_set, and we can see it gets replaced yet again.

ets:insert(ETS_OrdSet, Item3).
% true
ets:tab2list(ETS_OrdSet).
% [{1,"one"}]

Now lets check an ETS table that is a bag.

ETS_Bag = ets:new(ets_bag, [bag]).
% 49172

And we yet again add Item1 and Item2 to the table.

ets:insert(ETS_Bag, Item1).
% true
ets:insert(ETS_Bag, Item2).
% true
ets:tab2list(ETS_Bag).
% [{1,a},{1.0,"a"}]

Looking at ets:tab2list/1, we can see that for a bag they are treated as two different items.

And again we will see what happens when we insert Item3 into this ETS table.

ets:insert(ETS_Bag, Item3).
% true
ets:tab2list(ETS_Bag).
% [{1,a},{1,"one"},{1.0,"a"}]

In the case of a bag type of ETS table, we have Item2 along with entries Item1 and Item3 even though Item1 and Item3 both have the same key.

The last type of ETS table we have is a duplicate_bag.

ETS_DupBag = ets:new(ets_dupbag, [duplicate_bag]).
% 53269

We insert Item1 followed by Item2 as we did with all of the other types of ETS tables.

ets:insert(ETS_DupBag, Item1).
% true
ets:insert(ETS_DupBag, Item2).
% true
ets:tab2list(ETS_DupBag).
% [{1,a},{1.0,"a"}]

And like all of the other ETS table types, we insert Item3 into the duplicate_bag ETS table type.

ets:insert(ETS_DupBag, Item3).
% true
ets:tab2list(ETS_DupBag).
% [{1,a},{1,"one"},{1.0,"a"}]

And we see we have all three items in the ETS table for a duplicate_bag type.o

If we look at the behavior of bag and duplicate_bag though, we see that the behavior of both seems to be the same.

So what is the difference between the two???

If you dig into the documentation, and look at the description of the types under ets:new/2, it says that a bag will allow duplicate keys, but allow the item to only be added once, a duplicate_bag will allow multiple entries even if they have the same values as well.

To see this in action, we will add Item1 to both the ETS_Bag table and the ETS_DupBag table and see what happens.

First with just the ETs bag type.

ets:insert(ETS_Bag, Item1).
% true
ets:tab2list(ETS_Bag).
% [{1,a},{1,"one"},{1.0,"a"}]

The return value is the same as it was before, so adding an item that is already in a ETS table of type bag will not add it again.

So what does the duplicate_bag type of ETS table do?

ets:insert(ETS_DupBag, Item1).
% true
ets:tab2list(ETS_DupBag).
% [{1,a},{1,"one"},{1,a},{1.0,"a"}]

And we can see the tuple {1, a} shows up twice, because we called ets:insert/2 with that value twice.

–Proctor