Erlang Thursday – erl_tar:extract/1

Today’s Erlang Thursday covers erl_tar:extract/1.

erl_tar:extract/1 takes a tar file, given as a filename, a {binary, Binary} tuple, or a {file, Fd} tuple holding a file descriptor, and extracts its contents into the current directory.

Since we will need to have a tar file to extract, let’s create some files and add them to a new tar file.

$ echo "woof" > dog.txt
$ echo "meow" > cat.txt
$ echo "sparkle" > pony.txt
$ echo 'Wocka Wocka Wocka!' > bear.txt
$ tar -cvf animal_sounds.tar dog.txt cat.txt pony.txt bear.txt
a dog.txt
a cat.txt
a pony.txt
a bear.txt

And while we are at it, let’s create a compressed version as well.

$ tar -cvzf animal_sounds.tar.gz dog.txt cat.txt pony.txt bear.txt
a dog.txt
a cat.txt
a pony.txt
a bear.txt

Since we are going to test out extracting the tar, we will go ahead and clean up the files that we put in the tar.

$ rm dog.txt cat.txt pony.txt bear.txt

With all the ceremony of making sure we have a tar file to experiment with out of the way, it is time to fire up our Erlang shell, and call erl_tar:extract/1.

erl_tar:extract("animal_sounds.tar").
% ok

That seemed straightforward enough, so let’s see if we have our files extracted back out at the command prompt.

$ ls dog.txt cat.txt pony.txt bear.txt
bear.txt cat.txt  dog.txt  pony.txt

And since we saw them, we will go ahead and remove them to get back to a clean state.

$ rm dog.txt cat.txt pony.txt bear.txt
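The filename form is the only one used so far, but the binary tuple form mentioned at the start works the same way. Here is a minimal sketch, assuming the archive has already been read into memory with file:read_file/1. The names owl.tar and owl.txt are made up for this illustration, and the archive is built with erl_tar:create/2 from an in-memory binary so the snippet does not depend on any other files.

```erlang
%% Sketch only: owl.tar and owl.txt are hypothetical names.
%% Build a tiny archive from an in-memory binary so the example stands alone.
ok = erl_tar:create("owl.tar", [{"owl.txt", <<"hoot\n">>}]).

%% Read the whole archive into a binary...
{ok, TarBin} = file:read_file("owl.tar").

%% ...and hand it to extract/1 wrapped in a {binary, Bin} tuple.
ok = erl_tar:extract({binary, TarBin}).

%% owl.txt has been written to the current directory.
{ok, OwlSound} = file:read_file("owl.txt").
```

This form can be handy when the tar arrives over the network or from a database rather than from a file on disk.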

erl_tar:extract/2

Erlang also has an erl_tar:extract/2, which allows us to give options to the extraction process by passing a list as its second argument.

We can have erl_tar:extract/2 extract the files and tell it to be verbose, and then follow that up with another extraction, where we ask it not only to be verbose, but also to leave alone any files that are already there.

erl_tar:extract("animal_sounds.tar", [verbose]).
% x /Users/proctor/tmp/dog.txt
%
% x /Users/proctor/tmp/cat.txt
%
% x /Users/proctor/tmp/pony.txt
%
% x /Users/proctor/tmp/bear.txt
%
% ok
erl_tar:extract("animal_sounds.tar", [verbose, keep_old_files]).
% x /Users/proctor/tmp/dog.txt - exists, not created
%
% x /Users/proctor/tmp/cat.txt - exists, not created
%
% x /Users/proctor/tmp/pony.txt - exists, not created
%
% x /Users/proctor/tmp/bear.txt - exists, not created
%
% ok

And yet again, we swing back to the command prompt to remove the extracted files.

$ rm dog.txt cat.txt pony.txt bear.txt

Next we extract animal_sounds.tar.gz by passing the atom compressed in the list of options.

erl_tar:extract("animal_sounds.tar.gz", [verbose, compressed, keep_old_files]).
% x /Users/proctor/tmp/dog.txt
%
% x /Users/proctor/tmp/cat.txt
%
% x /Users/proctor/tmp/pony.txt
%
% x /Users/proctor/tmp/bear.txt
%
% ok

And sometimes when working with a tar file in your program, you don’t want to have to do all the management of the files on the filesystem just to read the contents of a tar file, so there is even an option to keep it all in memory.

erl_tar:extract("animal_sounds.tar.gz", [verbose, compressed, keep_old_files, memory]).
% {ok,[{"dog.txt",<<"woof\n">>},
%      {"cat.txt",<<"meow\n">>},
%      {"pony.txt",<<"sparkle\n">>},
%      {"bear.txt",<<"Wocka Wocka Wocka!\n">>}]}

When passing the memory option, the return value of erl_tar:extract/2 becomes a tuple of the status and a list of tuples, one for each file extracted from the tar, each composed of the filename and the contents of that file as a binary.
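Because the result is a plain list of {Name, Binary} tuples, the usual list functions apply to it. As a rough sketch, here is one way to pull a single file out of that list with lists:keyfind/3. The archive name sounds_demo.tar.gz is made up for this illustration, and the archive is built from in-memory binaries with erl_tar:create/3 so the snippet stands on its own.

```erlang
%% Sketch only: sounds_demo.tar.gz is a hypothetical name.
%% Build a small compressed archive from in-memory binaries.
ok = erl_tar:create("sounds_demo.tar.gz",
                    [{"dog.txt", <<"woof\n">>}, {"cat.txt", <<"meow\n">>}],
                    [compressed]).

%% Extract to memory instead of to the filesystem.
{ok, Files} = erl_tar:extract("sounds_demo.tar.gz", [compressed, memory]).

%% Files is a list of {Name, Binary} tuples, so look one up by its name,
%% which sits in the first position of each tuple.
{"cat.txt", CatSound} = lists:keyfind("cat.txt", 1, Files).
```

If the name is not in the archive, lists:keyfind/3 returns false, so the match on the last line would fail. Real code would likely want a case expression there.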

If an error occurs on extraction to memory, for example if we forget to pass the compressed option for a compressed tar file, it returns an error tuple.

erl_tar:extract("animal_sounds.tar.gz", [verbose, memory]).
% {error,eof}
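Since both outcomes are tagged tuples, a case expression can handle them in one place. The following is a sketch rather than a recommended pattern: Describe is a made-up helper fun, error_demo.tar.gz is a made-up archive name, and erl_tar:format_error/1 is used to turn the error reason into readable text. The exact reason returned for a missing compressed option may differ between OTP releases, so the sketch does not rely on it being eof.

```erlang
%% Sketch only: Describe and error_demo.tar.gz are hypothetical names.
ok = erl_tar:create("error_demo.tar.gz", [{"dog.txt", <<"woof\n">>}], [compressed]).

%% Extract to memory and return either the file names or a readable message.
Describe = fun(Archive, Opts) ->
    case erl_tar:extract(Archive, [memory | Opts]) of
        {ok, Files} ->
            {ok, [Name || {Name, _Contents} <- Files]};
        {error, Reason} ->
            {error, lists:flatten(erl_tar:format_error(Reason))}
    end
end.

%% With the compressed option we get the list of names back.
Good = Describe("error_demo.tar.gz", [compressed]).

%% Without it we get an {error, Message} tuple instead of a crash.
Bad = Describe("error_demo.tar.gz", []).
```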

There are quite a few more options that erl_tar:extract/2 can take as well, so I highly recommend checking out the documentation for the full list.
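As a small taste of those, here is a sketch of two that I believe are in the documented list: {files, Names}, which limits extraction to the named entries, and {cwd, Dir}, which extracts into Dir instead of the current directory. The names options_demo.tar and options_demo_out are made up for this illustration, and it is worth checking the documentation for your OTP release before relying on either option.

```erlang
%% Sketch only: options_demo.tar and options_demo_out are hypothetical names.
ok = erl_tar:create("options_demo.tar",
                    [{"dog.txt", <<"woof\n">>}, {"cat.txt", <<"meow\n">>}]).

%% {files, Names} extracts only the listed entries.
{ok, OnlyCat} = erl_tar:extract("options_demo.tar",
                                [memory, {files, ["cat.txt"]}]).

%% {cwd, Dir} extracts into Dir; make sure the directory exists first.
%% The trailing slash makes ensure_dir/1 create options_demo_out itself.
ok = filelib:ensure_dir("options_demo_out/").
ok = erl_tar:extract("options_demo.tar", [{cwd, "options_demo_out"}]).
{ok, DogSound} = file:read_file("options_demo_out/dog.txt").
```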

–Proctor