[TriLUG] e2fsck under cron gets retcode=8 operational error

Joseph Mack NA3T jmack at wm7d.net
Wed Sep 19 11:34:34 EDT 2012


On Wed, 19 Sep 2012, Aaron Schrab wrote:

> At 05:37 -0700 19 Sep 2012, Joseph Mack NA3T <jmack at wm7d.net> wrote:
>> o output should be machine readable
>
> At least with df, there is an option for that.  I disagree 
> that that should be the default.  Far more people will be 
> looking at the df output themselves than will be writing 
> code to parse it;

this is the problem. It depends whether you think

o computers should require humans as I/O devices in order to 
maintain the machine, or

o computers should run unattended, maintaining themselves and 
each other for their lifetimes, only calling for attention on 
a failure beyond their control (eg a disk needs to be 
physically replaced).

I subscribe to the 2nd view. In this case, anyone running 
`df` is doing so out of their own interest, not to run the 
machine. The output of df should be optimised for a 
computer, not for a human: a human has much greater 
flexibility in parsing output and can handle formats that no 
computer can read.

> the ones writing that code are more likely to read the 
> docs to find options to produce a suitable format; and the 
> option to specify the format only needs to be put into the 
> code once, not remembered.

You have to know that the option is there in the first place, 
and you have to know that the utility was written for 
humans, not for computers. And you have to remember -P every 
time you use df, not just once.
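Here's the sort of thing I mean (a sketch; the output depends 
on whatever filesystems happen to be mounted). Without -P, df 
is allowed to wrap a long device name onto its own line, which 
silently breaks line-by-line parsing; -P (POSIX output format) 
guarantees one filesystem per line, but the script has to 
carry the option forever:

```shell
#!/bin/sh
# Without -P, df may fold a long device name onto a second line,
# breaking naive line-by-line parsing.  -P (POSIX format) guarantees
# one filesystem per line -- but you have to know the option exists
# and remember it in every script you write.
df -P | awk 'NR > 1 { printf "%s used %s\n", $6, $5 }'
```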

There is a large barrier to having machines maintain 
themselves. In the LVS project, a script to monitor services 
and automatically add and delete entries when services come 
up and go down was a nightmare to write, as ipvsadm was 
designed for humans, not to be machine readable.

If the barrier to self-maintaining machines is too high, 
because the coder knows they'll have a nightmare of broken 
utilities to patch together, then computers will always 
require humans as I/O devices. Those jobs will be really 
boring, and the limits on the scalability of computer 
systems will be lower.

>> o make best effort. If you don't have a tty, then do the 
>> things you can do without a tty.
>
> Doing this would result in a program that requires a tty 
> in some cases, but not others.  It may work without a tty 
> in most cases, but require one when something unusual 
> happens.  That would make it more likely that a script 
> using that tool would be deployed before the problem is 
> found.  With the tty always being required, the problem 
> should be found early in development.

but I don't have a tty. e2fsck can work without one, and if 
it can work without a tty, then it should. It shouldn't fail 
for lack of a tty that wasn't even needed for the job it was 
asked to do.
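For what it's worth, e2fsck already half-supports this: -p 
(preen) fixes what it safely can without asking questions, 
and the exit status is a bitmask, documented in e2fsck(8), 
that a cron script can decode instead of a human. A sketch 
(the decode function and the placeholder device are mine, 
not part of e2fsck):

```shell
#!/bin/sh
# Decode e2fsck's exit status bitmask, per e2fsck(8):
#   1 = errors corrected,  2 = reboot needed,
#   4 = errors left uncorrected,  8 = operational error,
#  16 = usage or syntax error
decode_e2fsck() {
    status=$1; msg=""
    [ $((status & 1))  -ne 0 ] && msg="$msg errors-corrected"
    [ $((status & 2))  -ne 0 ] && msg="$msg reboot-needed"
    [ $((status & 4))  -ne 0 ] && msg="$msg errors-left-uncorrected"
    [ $((status & 8))  -ne 0 ] && msg="$msg operational-error"
    [ $((status & 16)) -ne 0 ] && msg="$msg usage-error"
    [ "$status" -eq 0 ]        && msg=" clean"
    echo "e2fsck exit $status:$msg"
}

# Under cron you'd run something like
#   e2fsck -p /dev/sdXN; decode_e2fsck $?
# (/dev/sdXN is a placeholder).  Here, decode the retcode=8
# from the subject line:
decode_e2fsck 8
```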

>> If an unknown option is given, ignore it, with a warning.
>
> Definitely not.  Warnings are too likely to be ignored.

If you ignore warnings, you have been warned and you can't 
say otherwise.

> Also the person may have been expecting that unknown 
> option to completely change the behaviour of the command. 
> Consider if someone typed `rm --dry-run -rf /`, and rm 
> were to ignore the unsupported --dry-run option; that 
> would not be something that's at all helpful.  With a 
> decent shell it's easy enough to go back and fix the 
> problem.

You're talking about a machine with a human I/O device. If 
you're going to be messing with anything (sysadmin, 
gardening, going for a walk), you have to be prepared for 
things to go wrong. You can't second-guess the user and ask 
"are you sure?" every time. You have to expect that they'll 
get it right most of the time, or else a human shouldn't be 
sitting there.

`rm -rf` is the Godwin's law of counter-examples in 
discussions of machines maintaining themselves. In 
potentially catastrophic situations, you should make 
multiple checks that you aren't stepping off a cliff. It's 
not as if `rm -rf /` is used a lot; I've not used it once in 
20 years of unix. No-one should ever run anything that looks 
remotely like `rm -rf /`, and no-one should run `rm -rf /` 
without at least first running `ls /`. In your example, I 
say the machine should scrub your disk. You will restore 
from backups and be back working in an hour or two, after 
learning a lesson in being careful with `rm -rf /`.

When I said that unknown options should be ignored, I was 
thinking of the scripts that maintain my machines, which no 
longer work properly because of upgrades to utilities that 
are broken by design. Recently I found that the new version 
of `find` emits sheets of warnings when run from a script 
I've been using for a decade: it says I have the options in 
the wrong order. Well, tough. I have a life to get on with. 
I'm not going to spend it mopping up after people who decide 
to rewrite something that worked for a decade, so that the 
change is cast as an error on my part.
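The warning in question looks like this (a sketch; GNU 
findutils, with hypothetical paths). -maxdepth is a "global 
option", and newer versions of GNU find complain on stderr if 
it appears after a test like -name, even though the command 
behaves exactly as it always did:

```shell
#!/bin/sh
# Option order that worked silently for years; newer GNU find prints
# "warning: you have specified the global option -maxdepth after the
# argument -name ..." on stderr, but still runs the same search.
find /tmp -name '*.log' -maxdepth 1 >/dev/null

# The order find now wants: same results, no warning.
find /tmp -maxdepth 1 -name '*.log' >/dev/null
```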

A while back, some new functionality was added to iptables. 
Rather than writing a wrapper around the old code and 
leaving the old code still callable, the function was 
rewritten with the order of two of its parameters reversed. 
Now every compile failed. This was Harald's way of informing 
everyone that there'd been a change and that you should go 
find out what had happened and fix your code. Now Harald is 
a great guy who has single-handedly taken a bunch of GPL 
violations to the courts in Germany and won them all. He's 
done a lot more for the world than I ever have, so I can't 
fault him. However, deliberately breaking code to announce 
new functionality is not acceptable.

Joe

-- 
Joseph Mack NA3T EME(B,D), FM05lw North Carolina
jmack (at) wm7d (dot) net - azimuthal equidistant map
generator at http://www.wm7d.net/azproj.shtml
Homepage http://www.austintek.com/ It's GNU/Linux!


