Category Archives: Scripting

Shelling out in Ruby

One of the things that is nice about Ruby is the ability to use it for scripting. Ruby makes it nice and easy to shell out to run standard *nix commands.

The problem is, the nice simple way that seems to be the most common, is also the least safe. I’m looking at you backticks.

#!/usr/env ruby

result = `gerp -e 'some regex' foo`
puts "<time to process result>"
puts "All Good"

And when we run the above, we see that our script completed, even though we really bombed out trying to run gerp instead of grep.

sh: gerp: command not found
<time to process result>
All Good

Oops.

Sure we get a warning gerp: command not found, but the script still proceeds to plow ahead and do any other side effects that it is setup to do, although something went wrong earlier.

This has become one of the biggest thorns in my side at work.

Lucky the technical solution to this is straight forward, so I want to share it with readers of this blog so you can stop making the mistake of using backticks to shell out commands, and use a solution that does not cover up issues in production.

#!/usr/env ruby

require 'open3'

def execute_syscall(cmd)
  results, error, status = Open3.capture3(cmd)
  raise error unless status.success?

  results
end

result = execute_syscall "gerp -e 'some regex' foo"
puts "<time to process result>"
puts "All Good"

First we require open3. Open3 allows us to be able to capture the results, the error stream, and the status. This way, we can check if the status of the command was anything other than success, and if not, we raise the error we get from STDERR.

Now when we run it, the script stops in its tracks. Not only that, but our script returns a failure error code as well.

> ruby script_demo.rb
script_demo.rb:7:in `execute_syscall': sh: gerp: command not found (RuntimeError)
	from script_demo.rb:12:in `&lt;main&gt;'
> echo $?
1

That way we can know that something went wrong in our program, especially when it is setup as a cron job or as some other unsupervised task.

Hope this can save you some headaches and frustration on your end as well.

–Proctor

cronolog and STDERR

At work we use cronolog for automatic rotation of log files for a number of processes since those processes just write to STDOUT and STDERR instead of using a proper logging library. Unfortunately, that means when running the script/program we have to redirect STDERR to STDOUT, and then pipe the results to cronolog, since cronolog reads from STDIN. The result looks something along the lines of the following:

ruby main.rb &2>1 | cronolog /logs/main.log /logs/main-%Y-%m-%d.log

The problem with this is if that errors are few and far between, as one hopes they should be, then it might be really tricky to find the errors amongst the other logging. Ideally, I thought it would be nice to have STDOUT go to one log file, and STDERR get written to a err file for the process.

After some digging into From Bash to Z Shell I found something about process substitution in the Bash shell. After a little experimentation and tweaking, I came up with the following:

ruby main.rb 
     > >(/usr/sbin/cronolog /logs/main.log /logs/main-%Y-%m-%d.log) 
     2> >(/usr/sbin/cronolog /logs/main.err /logs/main-%Y-%m-%d.err)

This allows me to use cronolog with both the STDOUT and STDERR streams. By using cronolog in the process substitution, it allows the output streams to be treated as input streams to cronolog, where as before I had to combine them into one stream and then pipe the single stream to cronolog as in the first example.

Hope this can help someone else, and save some hours of digging.

–Proctor

Log File parsing with Futures in Clojure

As the follow up to my post Running Clojure shell scripts in *nix enviornments, here is how I implemented an example using futures to parse lines read in from standard in as if the input was piped from a tail and writing out the result of parsing the line to standard out.

First due to wanting to run this a script from the command line I add this a the first line of the script:

 
!/usr/bin/env lein exec

As well, I will also be wanting to use the join function from the clojure.string namespace.

 
(use '[clojure.string :only (join)])

When dealing with futures I knew I would need an agent to adapt standard out.

(def out (agent *out*))

I also wanted to separate each line by a new line so I created a function writeln. The function takes a Java Writer and calls write and flush on each line passed in to the function:

(defn writeln [^java.io.Writer w line]
  (doto w
    (.write (str line "n"))
    .flush))

Next I have my function to analyze the line, as well as sending the result of that function to the agent via the send-off function.

(defn analyze-line [line]
  (str line "   " (join "  " (map #(join ":" %) (sort-by val > (frequencies line))))))

(defn process-line [line]
  (send-off out writeln (analyze-line line)))

The analyze-line function is just some sample code to return a string of the line and the frequencies of each character in the line passed in. The process-line function takes a line and calls send-off to the agent out for the function writeln with the results of calling the function analyze-line.

With all of these functions defined I now need to just loop continuously and process lines that are not empty, and call process-line for each line as a future.

(loop []
  (let [line (read-line)]
    (when line
      (future (process-line line)))
      (recur)))

Running Clojure shell scripts in *nix environments

I was recently trying to create a basic piece of Clojure code to play with “real-time” log file parsing by playing with futures. The longer term goal of the experiment is to be able to tail -f a log file pipe that into my Clojure log parser as input.

As I wasn’t sure exactly what I would need to be doing, I wanted an easy way to run some code quickly without having to rebuild the jars through Leiningen every time I wanted to try something, in a manner similar to the way I am thinking I will be using it if the experiment succeeds.

I created a file test_input with the following lines:

1 hello
2 test
3 abacus
4 qwerty
5 what
6 dvorak

With this in place, my goal was to be able to run something like cat test_file | parser_concept. After a bit of searching I found the lein-exec plugin for Leiningen, and after very minor setup I was able to start iterating with input piped in from elsewhere.

The first step was to open my profiles.clj file in my ~/.lein directory. I made sure lein-exec was specified in my user plugins as so:

{:user {:plugins [[lein-exec "0.2.1"]
                  ;other plugins for lein
                 ]}}

With this in place I just put the following line at the top of my script.clj file:

#!/usr/bin/env lein exec

I then changed the permissions of script.clj file to make it executable, I was able to run the following and have my code run against the input.

cat test_input | ./script.clj

I will be posting a follow up entry outlining my next step of experimenting with “processing” each line read in as a future.