[TriLUG] DMA interrupt recovery
marty at rtmx.net
Wed Feb 11 14:56:39 EST 2004
Good luck, Douglas
I've seen this on my "Wally" system
(AKA Walter, Walton, or Waldo, depending on which OS I bring up...
You'll never guess where I bought this hunk of iron.
The power supply failed within the 1st 6 months...)
I've added a couple of EIDE drives to Wally, and I've seen this same
problem come and go. It's gone for now. My recent attempt at installing
Fedora failed the first time thru while creating the ext3 filesystems,
which is well into the install. I had not seen the "lost interrupt"
for a couple of months at that time.
So I don't know what the source is, but when I have encountered it in the
past on Wally, I have reseated my drive cables. For the Fedora install,
there was nothing on the disk I cared about, so I instituted the
/dev/slash/burn policy on the HD partition table thru the vtty3 console
early in the 2nd (and final) Fedora install. Everything ended up just
So my suspicion is that at some critical period early in the life of
the drive, there was some mildly flaky cable problem, and that, combined
with a bad bit or CRC code or some other firmware "FM" somewhere in the
electronics/firmware, it makes this problem rear its ugly head on occasion
(i.e., due to temp, humidity, stray quarks, phase of moon ;-)
This isn't to say that the described symptoms do not generally indicate a
hard drive failure, but if you are seeing this problem with a frequency
that is not consistent with 5 to 10,000 MTBF, then the root cause is
not hard drive failures. (Presuming, of course, you observe basically
sound ESD precautions and don't go around willy-nilly blasting
with unseen static electricity)
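
A rough way to sanity-check the frequency argument above is to work out how many failures a given MTBF would predict. The sketch below assumes a constant failure rate, so expected failures = drives * hours observed / MTBF. The function name `expected_failures`, the choice of hours as the unit, and the sample numbers are illustrative assumptions, not figures from this thread.

```shell
#!/bin/sh
# expected_failures DRIVES MTBF_HOURS PERIOD_HOURS
# Prints DRIVES * PERIOD_HOURS / MTBF_HOURS, the expected number of
# failures under a constant failure rate.
# The function name and the use of hours are assumptions for illustration.
expected_failures() {
    awk -v n="$1" -v mtbf="$2" -v t="$3" \
        'BEGIN { printf "%.2f\n", n * t / mtbf }'
}

# Hypothetical example: 10 drives, 10,000-hour MTBF, one month (730 hours).
expected_failures 10 10000 730
```

If the number of incidents actually seen is well above what this predicts for the drives in question, intermittent causes such as cabling are worth ruling out before blaming the drives themselves.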
Summary - just saw your last posting...
Yes, buy top grade new cables and see if it goes away. If it does, great!
If a different intermittent problem then recurs, then you found the root
but you'll need to back up your filesystem (tar, not dd), recreate the drive,
My 2 bits
PS I'm really not kidding about the ESD. I once worked with a
human static generator we nicknamed "sparky"
From: trilug-bounces at trilug.org [mailto:trilug-bounces at trilug.org] On
Behalf Of Douglas Kojetin
Sent: Wednesday, February 11, 2004 11:31 AM
To: Triangle Linux Users Group discussion list
Subject: Re: [TriLUG] DMA interrupt recovery
oye! this'll be my second in a month or so (which is significant for
the relatively few numbers of computers we have vs. the # of times i've
seen this happen before -- at least to me!). i just ran WD diag tools
quick test -- it passed. but, i'm going to rsync the data to another
computer (hopefully) and run the extended test to be sure next.
On Feb 11, 2004, at 11:14 AM, Jason Tower wrote:
> probably a failed hard drive, i've seen this on at least four drives in
> the last six months. you might try turning off DMA with
> 'hdparm -d0 /dev/hda' but then performance will be abysmally slow, if it
> works at
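
The DMA suggestion quoted above could be tried roughly as follows. This is a minimal sketch, not a tested procedure: the `DEVICE` and `DRY_RUN` variables and the `run` helper are illustrative names, and `/dev/hda` is simply the device from the quoted message. By default it only prints the commands; running them for real needs root.

```shell
#!/bin/sh
# Sketch: query and toggle DMA on an IDE drive with hdparm.
# DEVICE, DRY_RUN and run() are illustrative names, not from the thread.
DEVICE="${DEVICE:-/dev/hda}"   # device named in the quoted message
DRY_RUN="${DRY_RUN:-1}"        # 1 = only print commands, 0 = execute them

run() {
    # Always print the command; execute it only when DRY_RUN=0.
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

run hdparm -d "$DEVICE"      # show the current using_dma setting
run hdparm -d0 "$DEVICE"     # turn DMA off (expect much lower throughput)
# run hdparm -d1 "$DEVICE"   # turn DMA back on once the fault is found
```

Setting `DRY_RUN=0` executes the commands. If the lost-interrupt messages stop with DMA off, that narrows things down but does not by itself say whether the drive, the cable, or the controller is at fault.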