"I'm not proud of being a congenital pain in the ass. But I will take money for it."

Scary sed tricks

Thu 07 September 2017

sed, the Unix stream editor, is usually only the right tool for a job if the job is a quick-n-dirty hack, but occasionally one needs to do that, or one needs to embed some wacky thing in a shell script on a system with minimal Unix tools, or whatever. In that spirit, here are a few obscure sed tricks that are useful to know.

Line-oriented commands on the command line

sed has a grouping operation, where you can use a single pattern to control a group of commands. It also has commands to let one insert a new line before or after the current one. All of which works neatly when using -f, but almost nobody ever does that. So say you know that with -f you'd write something like:

/wombat/{
  s/bat/bait/
  a\
  fodder
}

How do you do that on the command line? Turns out that each -e is treated as a virtual line, so you do something like:

sed -e '/wombat/{'      \
    -e '  s/bat/bait/'  \
    -e '  a\'           \
    -e '  fodder'       \
    -e '}'

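Indentation of the nested commands is optional in both cases. One caveat: sed implementations differ on whether leading whitespace on the appended text line (fodder here) is kept or stripped, so leave that line unindented if the exact output matters.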
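To see the multi -e form do something, here's a minimal sketch run against three made-up input lines (the sample data is mine, not from any real file). The appended text is left unindented so the output doesn't depend on how a given sed treats leading whitespace there.

```shell
# Made-up sample input: only the middle line matches /wombat/.
out=$(printf 'aardvark\nwombat\nzebra\n' |
  sed -e '/wombat/{'    \
      -e 's/bat/bait/'  \
      -e 'a\'           \
      -e 'fodder'       \
      -e '}')
printf '%s\n' "$out"
# Prints:
#   aardvark
#   wombait
#   fodder
#   zebra
```

The matching line gets both the substitution and the appended line; the other lines pass through untouched.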

Editing in-place and modern regexps

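OK, you can find this one just by reading the man page, but if you've been using sed for so long that you don't read the man page anymore, check out the -i option (edit files in place) and the -E option (extended regular expressions, so no more backslashing every parenthesis).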
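A quick sketch of the two together, using a scratch file created just for the demonstration. The -i.bak spelling (suffix attached, backup kept) is used here on the assumption that you want something that works on both GNU and BSD sed, which disagree about how a bare -i is written.

```shell
# Hypothetical scratch file, created only for this demonstration.
tmp=$(mktemp)
printf 'foo123\nbar\n' > "$tmp"

# -E: extended regexps, so + and ( ) need no backslashes.
# -i.bak: edit the file in place, keeping the original as "$tmp.bak".
sed -i.bak -E 's/([a-z]+)([0-9]+)/\2\1/' "$tmp"

edited=$(cat "$tmp")
printf '%s\n' "$edited"
# Prints:
#   123foo
#   bar

rm -f "$tmp" "$tmp.bak"
```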

Processing the whole file at once

Use this one with caution, as stuffing the entire file into the pattern space is really not something sed is designed to do. But every once in a while it's the least wrong tool for the job, and its apparent inability to deal with the whole file can be infuriating.

Well, not anymore, thanks to a really clever bit of code from https://unix.stackexchange.com/questions/182153/sed-read-whole-file-into-pattern-space-without-failing-on-single-line-input (read the whole answer to find out why you should not do this unless you really must).

Basically, instead of fighting with sed's line-oriented nature, this hack cleverly uses sed's existing behavior and the hold space to do what you want. Although the post doesn't say it, this hack even leaves you a simple way to do both line-oriented and whole-file-oriented processing, so long as you're OK with the line-oriented stuff happening first.

General form of the hack:

sed ' <line-oriented-stuff>  H;1h;$!d;x; <whole-file-oriented-stuff>'

That H;1h;$!d;x; magic is worth picking apart with a copy of the sed man page.
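If you don't have the man page handy, here's my reading of the four commands as comments, with a toy whole-file operation (joining all lines with commas) standing in for the real work. The sample input is made up.

```shell
# H     append the current line to the hold space, after a newline
# 1h    on line 1 only, overwrite the hold space instead, which
#       drops the stray leading newline that H just put there
# $!d   on every line except the last, delete the pattern space
#       and start the next cycle (nothing is printed)
# x     on the last line, swap pattern and hold space: the pattern
#       space now contains the whole file
joined=$(printf 'a\nb\nc\n' | sed 'H;1h;$!d;x; s/\n/,/g')
printf '%s\n' "$joined"    # prints: a,b,c

# Single-line input still works, since 1h and x both run on that line.
single=$(printf 'only\n' | sed 'H;1h;$!d;x; s/\n/,/g')
printf '%s\n' "$single"    # prints: only
```

Anything placed before the H sees one line at a time; anything after the x sees the whole file, embedded newlines included.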

Example of this hack in use:

expand <foo.c | sed 's=//\(.*\)$=/*\1 */=; H;1h;$!d;x; s=\*/ *\(\n *\)/\*=\1 *=g' >bar.c

is a quick hack to change blocks of C++-style // comments into C-style /* */ block comments (yes, you could do this without expand, but the regexps would be more complicated, and in this case I also needed to fix a really weird notion of tab stop settings, not shown, so I needed expand anyway).
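Here's the same sed script run on a tiny made-up input rather than a real foo.c, so you can see what each half does. There are no tabs in the sample, so expand is left out of this sketch.

```shell
# Made-up input: two adjacent // comments followed by a line of code.
converted=$(printf '// one\n// two\nx = 1;\n' |
  sed 's=//\(.*\)$=/*\1 */=; H;1h;$!d;x; s=\*/ *\(\n *\)/\*=\1 *=g')
printf '%s\n' "$converted"
# Prints (the first line ends with a trailing space):
#   /* one 
#    * two */
#   x = 1;
```

The line-oriented substitution first turns each // comment into its own /* ... */ comment; the whole-file substitution then fuses a closing */ with the /* opening the next line, leaving one block comment. The trailing space after "one" is a side effect of the padding the first substitution puts before each */.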