Comments on: And You Can Quote Me On That! http://girtby.net/archives/2006/11/20/and-you-can-quote-me-on-that/ this blog is girtby.net Wed, 30 Sep 2009 01:44:34 -0400

By: Alastair http://girtby.net/archives/2006/11/20/and-you-can-quote-me-on-that/comment-page-1/#comment-1495 Alastair Mon, 20 Nov 2006 05:46:58 +0000 http://girtby.net/2007/02/21/and-you-can-quote-me-on-that#comment-1495

I should point out also that since writing the above I have discovered the -print0 argument to the find tool, and the corresponding -0 delimiter in xargs. In general, this is a useful combination for handling filenames with spaces, but I suspect that in this particular case the above sed script is probably just as simple. Exercise for the reader?
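One possible take on that exercise, as a minimal sketch rather than the author's own solution: it assumes a find that supports -maxdepth and -print0 (GNU and BSD find do) and an xargs that supports -0. The function name rename_bat_print0 and the scratch directory are made up for illustration.

```shell
# rename_bat_print0 is a hypothetical helper, not something from the post.
# It renames every *.bat directly inside directory $1 to *.bak.
rename_bat_print0() {
    # -print0 ends each name with a NUL byte and xargs -0 splits on NUL only,
    # so spaces (and even newlines) inside a name stay part of that name.
    find "$1" -maxdepth 1 -name '*.bat' -print0 |
        xargs -0 sh -c 'for f in "$@"; do mv -- "$f" "${f%.bat}.bak"; done' _
}

# Try it out in a scratch directory.
demo=$(mktemp -d)
touch "$demo/foo.bat" "$demo/I love Unix.bat"
rename_bat_print0 "$demo"
ls "$demo"
```

The inner sh receives the file names as positional parameters, so no name is ever pasted into a command string and re-parsed.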

By: Richard http://girtby.net/archives/2006/11/20/and-you-can-quote-me-on-that/comment-page-1/#comment-1496 Richard Mon, 20 Nov 2006 05:46:58 +0000 http://girtby.net/2007/02/21/and-you-can-quote-me-on-that#comment-1496

Why for when you can find?

$ find . -name "*.bat" -exec bash -c "basename \"{}\" .bat | xargs -i@ mv @.bat @.bak" ";"

is only slightly unreadable ;) (Yes, that semicolon really does have to be there). I’ve no idea why you’d resort to -print0 and the like, since like everything else it can be solved with an extra layer of indirection, err, I mean quotes.

Of course this find is only good enough if you’re ok with it recursively ‘backing up’ innocent .bat files. You have been warned.
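For comparison, here is a hedged sketch of a find -exec variant that avoids the recursion warned about above and does not splice {} into the command string. It assumes -maxdepth is available (a GNU/BSD find extension); backup_bat and the scratch directory are hypothetical names used only for this example.

```shell
# backup_bat is a hypothetical helper, not Richard's command.
# The file name reaches the inner shell as $1 instead of being pasted into
# the command text, so quotes or $ in a name are never re-parsed as code.
backup_bat() {
    # -maxdepth 1 keeps find out of subdirectories.
    find "$1" -maxdepth 1 -type f -name '*.bat' \
        -exec sh -c 'mv -- "$1" "${1%.bat}.bak"' _ {} \;
}

demo_b=$(mktemp -d)
mkdir "$demo_b/sub"
touch "$demo_b/top level.bat" "$demo_b/sub/innocent.bat"
backup_bat "$demo_b"
```

After the call, only the top-level file has been renamed; the one in sub/ is left alone.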

By: Aristotle Pagaltzis http://girtby.net/archives/2006/11/20/and-you-can-quote-me-on-that/comment-page-1/#comment-1497 Aristotle Pagaltzis Mon, 20 Nov 2006 05:46:58 +0000 http://girtby.net/2007/02/21/and-you-can-quote-me-on-that#comment-1497

Alastair:

The problem as I see it is the “for” command itself. What it’s doing is expanding the *.bat bit into a space-delimited list of filenames (eg “foo.bat bar.bat I love Unix.bat”), then iterating over them one word at a time (“foo.bat”, “bar.bat”, “I”, “love”, “Unix.bat”).

Wrong. In fact, “for” is the recommended way of iterating over a glob. What happens is that the shell expands the *.bat glob into a list of filenames, each of which gets passed as a unit to “for”, which then assigns it to f. Therefore “for” will always work, even where your sed script will fail (ie. when face of filenames with newlines in them).
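The newline case can be checked with a small experiment. This is only a sketch: the scratch directory and variable names are invented, and it assumes a filesystem that allows newlines in file names (most Unix ones do).

```shell
demo_c=$(mktemp -d)
# Two files; the second has a newline in the middle of its name.
touch "$demo_c/plain.bat" "$demo_c/two
lines.bat"

# Count names the way a "for" loop sees them: one iteration per file.
loop_count=0
for f in "$demo_c"/*.bat; do
    loop_count=$((loop_count + 1))
done

# Count names the way a line-oriented pipeline sees them: one per line.
line_count=$(printf '%s\n' "$demo_c"/*.bat | wc -l)
line_count=$((line_count))   # drop any padding some wc versions add

echo "for loop: $loop_count, lines: $line_count"
# prints "for loop: 2, lines: 3"
```

The loop sees two files, while anything that reads the listing line by line sees three records.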

Your code still fails because you’re not quoting the variables when using them, causing the shell to expand the variable into a list of IFS-delimited strings. So while the shell passes “I love Unix.bat” as a single string to “for”, and that assigns it to f all right, the $f in your call to mv then expands to “I”, “love”, “Unix.bat”, which blows up. To avoid that, you use doublequotes:

for f in *.bat ; do mv "$i" "${i%.bat}.bak" ; done
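The effect of the quotes can be seen by counting arguments. A minimal sketch, assuming the default IFS (space, tab, newline); count_args is a made-up helper for illustration.

```shell
# count_args is a hypothetical helper: it prints how many arguments it received.
count_args() { echo "$#"; }

f="I love Unix.bat"
unquoted=$(count_args $f)      # word splitting: I / love / Unix.bat
quoted=$(count_args "$f")      # the value stays a single argument

echo "unquoted: $unquoted, quoted: $quoted"
# prints "unquoted: 3, quoted: 1"
```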

Your Linux Zealots weren’t very well versed in the ways of Unix. :)

Btw: notice that I said the shell expands the glob into a list? That means that the ls in your ls *.bat incantation is superfluous; by the time ls is invoked, the actual work has already happened. This is called a useless use of ls *.

Richard:

find -name '*.bat' -print0 | sed 's!\.bat\x0!\x0!g' | xargs -0i mv {}.bat {}.bak
By: Aristotle Pagaltzis http://girtby.net/archives/2006/11/20/and-you-can-quote-me-on-that/comment-page-1/#comment-1498 Aristotle Pagaltzis Mon, 20 Nov 2006 05:46:58 +0000 http://girtby.net/2007/02/21/and-you-can-quote-me-on-that#comment-1498

Err, s/when face of/when faced with/.

By: Aristotle Pagaltzis http://girtby.net/archives/2006/11/20/and-you-can-quote-me-on-that/comment-page-1/#comment-1499 Aristotle Pagaltzis Mon, 20 Nov 2006 05:46:58 +0000 http://girtby.net/2007/02/21/and-you-can-quote-me-on-that#comment-1499

I also just noticed that all your examples say “for f in” (notice the f) but then use $i, not $f, in the body of the loop; and I copied that mistake.
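With the variable name made consistent, the quoted loop would look roughly like this. It is a sketch run in a throwaway directory; the existence check for an unmatched glob is an addition that the thread does not discuss.

```shell
demo_e=$(mktemp -d)
touch "$demo_e/foo.bat" "$demo_e/I love Unix.bat"

(
    cd "$demo_e" || exit 1
    for f in *.bat; do
        # If nothing matches, the shell leaves the literal "*.bat"; skip it.
        [ -e "$f" ] || continue
        mv -- "$f" "${f%.bat}.bak"
    done
)
```

The subshell keeps the cd from affecting the rest of the session.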

By: Alastair http://girtby.net/archives/2006/11/20/and-you-can-quote-me-on-that/comment-page-1/#comment-1500 Alastair Mon, 20 Nov 2006 05:46:58 +0000 http://girtby.net/2007/02/21/and-you-can-quote-me-on-that#comment-1500

Aristotle, thanks for noticing the $f mistake, corrected.

Also I understand what you are trying to say with the “for” statement and verified that the correctly quoted version works correctly. Thanks, and now all I have to internalise is the ${f%.bat} syntax (I’ve read that section in the manpage n times and it still baffles me).
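For anyone else baffled by that syntax, a few worked cases may help. This is a sketch using a made-up file name: % and %% strip a matching pattern from the end of the value, while # and ## strip one from the start.

```shell
f="archive.tar.bat"

echo "${f%.bat}"       # archive.tar      shortest match of ".bat" cut from the end
echo "${f%.*}"         # archive.tar      shortest match of ".*" cut from the end
echo "${f%%.*}"        # archive          longest match of ".*" cut from the end
echo "${f#*.}"         # tar.bat          shortest match of "*." cut from the start
echo "${f##*.}"        # bat              longest match of "*." cut from the start
echo "${f%.bat}.bak"   # archive.tar.bak  strip the old extension, append the new one
```

The variable itself is never modified; each expansion only produces a trimmed copy.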

One minor point: your wording implies that “for” is not part of the shell (ie “the shell passes to”). This is not the case, as “for” is an integral part of the shell, it’s not even a builtin.

By: Alastair http://girtby.net/archives/2006/11/20/and-you-can-quote-me-on-that/comment-page-1/#comment-1501 Alastair Mon, 20 Nov 2006 05:46:58 +0000 http://girtby.net/2007/02/21/and-you-can-quote-me-on-that#comment-1501

Btw: notice that I said the shell expands the glob into a list? That means that the ls in your ls *.bat incantation is superfluous; by the time ls is invoked, the actual work has already happened.

Err, care to explain this? How would you rewrite the following expression to remove the ls?

ls *.bat | sed -e "s/[\\\"$]/\\\\&/g" -e "s/\(.*\)\.bat$/mv \"\1.bat\" \"\1.bak\"/" | sh

(missing backslash corrected)

By: Aristotle Pagaltzis http://girtby.net/archives/2006/11/20/and-you-can-quote-me-on-that/comment-page-1/#comment-1502 Aristotle Pagaltzis Mon, 20 Nov 2006 05:46:58 +0000 http://girtby.net/2007/02/21/and-you-can-quote-me-on-that#comment-1502

“for” is an integral part of the shell, it’s not even a builtin.

I know. What I was referring to is that glob expansion and “for” execution happen in different stages.

How would you rewrite the following expression to remove the ls?

With anything that prints its argument list. In the simplest cases, even echo might suffice. If you’re using bash, it’s very simple: it has a builtin printf, so you can just say “printf '%s\n' *.bat”.

Heh. That suggests an alternative to my reply to Richard:

printf '%s\0' *.bat | sed 's!\.bat\x0!\x0!g' | xargs -0i mv {}.bat {}.bak
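The claim that the shell, not ls, does the expanding can be checked directly. A small sketch with invented names and a scratch directory:

```shell
demo_g=$(mktemp -d)
touch "$demo_g/foo.bat" "$demo_g/I love Unix.bat" "$demo_g/notes.txt"

# The shell turns the glob into separate arguments before any command runs.
set -- "$demo_g"/*.bat
count=$#
echo "the glob produced $count arguments"

# printf reuses its format once per argument, so it lists them without ls.
listing=$(printf '%s\n' "$demo_g"/*.bat)
echo "$listing"
```

Here count is 2, and each matched path appears on its own line of the listing even though one name contains spaces.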
By: marxy http://girtby.net/archives/2006/11/20/and-you-can-quote-me-on-that/comment-page-1/#comment-1503 marxy Mon, 20 Nov 2006 05:46:58 +0000 http://girtby.net/2007/02/21/and-you-can-quote-me-on-that#comment-1503

I find that languages like python are everywhere I want to be and the code is so much more readable than shell.
from os import walk, rename
from os.path import join

directory = "scans for rosie copy"
extfrom = ".jpg"
extto = ".gif"

# extfrom = ".gif"
# extto = ".jpg"

for root, dirs, files in walk(directory):
    for fileName in files:
        if fileName.endswith(extfrom):
            newFileName = fileName.replace(extfrom, extto)
            rename(join(root, fileName), join(root, newFileName))

I put it to you that the idle reader can figure out what’s going on here which is more important than fewer lines of code.

Sorry for the digression.

By: Alastair http://girtby.net/archives/2006/11/20/and-you-can-quote-me-on-that/comment-page-1/#comment-1504 Alastair Mon, 20 Nov 2006 05:46:58 +0000 http://girtby.net/2007/02/21/and-you-can-quote-me-on-that#comment-1504

marxy, in general you are right, it is more important to be clear than clever. However in the case of system admin tasks it really is useful to internalise a number of shell programming idioms which you will find yourself using over and over again. For me, the find command is like this, particularly in combination with -print0 as discussed.

In these cases, optimising for brevity gets you under the threshold of being able to memorise it in its entirety. So the next time you come across the same problem, you are simply able to regurgitate the command and tailor it to the situation. Agree that the python script is clearer, but does it follow you around to every system? If not, what then? Rewrite it? Go download it from another machine?

There’s also a middle ground between uber-shell scripting and using a real language like python. Frequently I simply load up the directory listing in emacs and use a macro function to mould it to my will. Once I’ve got a list of shell commands to run, just hit M-| to run them.

Lastly, I can’t help but recommend python list comprehensions, they’re fantastic:

from os import walk, rename
from os.path import join
extfrom = ".bat"
extto = ".bak"
for dirpath, dirnames, filenames in walk("."):
    # Python 3: lambda can no longer unpack a tuple and map() is lazy,
    # so loop over the (old, new) pairs built by the list comprehension.
    for old, new in [ (name, name.replace(extfrom, extto))
                      for name in filenames if name.endswith(extfrom) ]:
        rename(join(dirpath, old), join(dirpath, new))

Which, come to think of it, doesn’t prove a point about either readability or brevity. :) But it is cool nonetheless…
