guh.me - gustavo's personal blog

Text processing with Ruby and Ru

Piping command-line output into Ruby for processing is as easy as typing ruby -ne [CODE], but it can be made easier (and more useful) with the ru gem. From its README:

Ru brings Ruby’s expressiveness, cleanliness, and readability to the command line.

It lets you avoid looking up pesky options in man pages and Googling how to write a transformation in bash that would take you approximately 1s to write in Ruby.

The best part is that it allows us to use all the Ruby Core and ActiveSupport methods, which makes it possible to write stuff like this:

gustavo@possantinho ~ $ cat /var/log/postgresql/postgresql-9.4-main.log | ru 'grep(/\[unknown\]/).map(:to_date).uniq'
2015-08-31
2015-09-04
2015-09-05

I can easily use grep to select these lines and uniq to remove the duplicate entries, but I have no idea how I’d extract the dates using sed and grep. On the other hand, it’s pretty easy to do in Ruby - the only con is that it’s somewhat slow, so you probably wouldn’t want to use it on a huge dataset.
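
For comparison, here is a rough sketch of the same pipeline in plain Ruby, without ru or ActiveSupport. It assumes every log line starts with a YYYY-MM-DD date, as the output above suggests. The helper name unknown_dates and the sample lines are made up for illustration; they are not taken from a real log.

```ruby
require "date"

# Hypothetical helper: distinct dates of the lines that mention [unknown].
# Assumption: each line begins with an ISO date (YYYY-MM-DD).
def unknown_dates(lines)
  lines
    .grep(/\[unknown\]/)                         # keep only the matching lines
    .map { |line| line[/\A\d{4}-\d{2}-\d{2}/] }  # pull out the leading date string
    .compact                                     # skip lines without a leading date
    .uniq                                        # drop duplicate dates
    .map { |text| Date.iso8601(text) }           # turn the strings into Date objects
end

# Made-up sample lines, for illustration only.
sample = [
  "2015-08-31 10:00:01 UTC [unknown]@[unknown] LOG:  incomplete startup packet",
  "2015-08-31 11:30:12 UTC [unknown]@[unknown] LOG:  incomplete startup packet",
  "2015-09-04 09:15:40 UTC gustavo@app LOG:  statement: SELECT 1",
  "2015-09-04 09:20:00 UTC [unknown]@[unknown] LOG:  incomplete startup packet",
]

puts unknown_dates(sample)
```

The ru version is shorter mostly because ActiveSupport's String#to_date does the date parsing, so no regular expression for the date is needed there.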

You can read more about ru on its GitHub repository.