[TriLUG] general RAID questions

David Brain dbrain at gmail.com
Sun Nov 27 20:40:44 EST 2011


Hi,

Answers in-line, based on having run a fair amount of storage-related
On Sun, Nov 27, 2011 at 12:39 PM, Joseph Mack NA3T <jmack at wm7d.net> wrote:
> I've just worked through the RAID section of Alan Porter's Jan presentation
> this year.
>
> http://trilug.org/~porter/meetings/2011-01-13_AlanPorter_raid-lvm-luks.odp
>
> I've not done any RAID and have these questions.
>
> o is software raid the default in linux?

In what way?  It is available in most (all?) distributions, but of
course you have the option of HW RAID too, as many hardware
controllers are also supported.
>
> with hardware RAID a light on a failed drive comes on. Presumably the
> hardware driver card knows that the disk has failed a r/w, fails out the
> disk, runs in degraded mode and turns on the light. You hotswap the drive
> (presumably the failure is visible with software too, but I assume some poor
> flunkey patrols the racks once a day with a bucket of replacement disks).
> However with software raid, there won't be lights.

From experience here, if the array is busy, the failed drive will be the
one that's blinking 'differently'.  It also helps to know how things are
allocated in the shelf, so when you build the array, test to see which
'end' of the tray is lower alphabetically from the /dev/sd?
standpoint.
>
> o What does a disk failure (eg click of death) look like on s/w raid? If a
> disk gets a r/w fail does mdadm know about it straight away? Do you check
> the output of mdadm every 5 mins with Nagios/rrdtool type scripts and send
> messages to someone to change the disk?

Looks like a few read/write failures on a device, followed by mdadm
declaring the disk dead.  mdadm can send emails on failure - and yes
Nagios etc can be used too.
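
As a rough sketch of what a Nagios-style check could look like: the md
driver marks a failed member with "(F)" in /proc/mdstat, so a script can
scan for that flag. The sample text, device names, and function names
below are made up for illustration, and this is only one possible way to
do it (mdadm's own monitor mode is the more usual route for email).

```python
import re

# Sample text in the layout the Linux md driver uses for /proc/mdstat.
# Array names, device names, and sizes are invented for illustration.
SAMPLE_MDSTAT = """\
Personalities : [raid1] [raid6] [raid5] [raid4]
md0 : active raid1 sdb1[1] sda1[0]
      976630336 blocks super 1.2 [2/2] [UU]

md1 : active raid5 sdm1[3](F) sdl1[1] sdk1[0]
      1953260544 blocks super 1.2 level 5, 512k chunk, algorithm 2 [3/2] [UU_]

unused devices: <none>
"""

# A member looks like "sdm1[3]" optionally followed by flags such as "(F)".
MEMBER = re.compile(r"(\w+)\[\d+\]((?:\(\w\))*)")


def failed_members(mdstat_text):
    """Return {array: [member, ...]} for arrays with members flagged (F)."""
    failed = {}
    for line in mdstat_text.splitlines():
        m = re.match(r"^(md\w+) : (.*)$", line)
        if not m:
            continue  # not an array status line
        array, rest = m.groups()
        bad = [name for name, flags in MEMBER.findall(rest) if "(F)" in flags]
        if bad:
            failed[array] = bad
    return failed


def nagios_status(mdstat_text):
    """Return (exit_code, message): 0 = OK, 2 = CRITICAL (plugin convention)."""
    failed = failed_members(mdstat_text)
    if not failed:
        return 0, "OK: no failed md members"
    detail = "; ".join(
        "%s: %s" % (array, ", ".join(devs)) for array, devs in sorted(failed.items())
    )
    return 2, "CRITICAL: " + detail


# On a real host you would pass open("/proc/mdstat").read() instead.
code, message = nagios_status(SAMPLE_MDSTAT)
print(code, message)
```

A real plugin would exit with the returned code so Nagios can raise the
alert.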
>
> o with s/w raid, if you get a notice that /dev/sdm1 has failed, how do you
> find drive /dev/sdm in a rack if there's no light to turn on? Do you have to
> label all your drive bays at install time?

Oh, labeling the drive bays would be a good idea.
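
One way to make the labels useful, sketched below under the assumption
that the box runs udev and populates /dev/disk/by-id: those symlinks
usually carry the drive model and serial number, so you can map /dev/sdm
back to the serial printed on the drive (or written on the bay). The
function names and the fake device tree are hypothetical, built in a
temp directory so the example runs anywhere.

```python
import os
import tempfile


def labels_for_device(device, by_id_dir="/dev/disk/by-id"):
    """Return the sorted by-id names whose symlink resolves to `device`."""
    target = os.path.realpath(device)
    names = []
    for name in os.listdir(by_id_dir):
        link = os.path.join(by_id_dir, name)
        if os.path.realpath(link) == target:
            names.append(name)
    return sorted(names)


def build_demo_tree(root):
    """Create a fake dev/disk/by-id layout (illustrative names only)."""
    dev = os.path.join(root, "dev")
    by_id = os.path.join(dev, "disk", "by-id")
    os.makedirs(by_id)
    for disk in ("sdl", "sdm"):
        open(os.path.join(dev, disk), "w").close()
    # Relative links, the same shape udev creates under /dev/disk/by-id.
    os.symlink("../../sdl", os.path.join(by_id, "ata-EXAMPLEDISK_SERIAL0001"))
    os.symlink("../../sdm", os.path.join(by_id, "ata-EXAMPLEDISK_SERIAL0002"))
    return dev, by_id


with tempfile.TemporaryDirectory() as root:
    dev, by_id = build_demo_tree(root)
    demo_labels = labels_for_device(os.path.join(dev, "sdm"), by_id)

print(demo_labels)
```

On a real machine you would call labels_for_device("/dev/sdm") with the
default directory and match the serial against your bay labels.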
>
> o can you run
>
> smartctl -t long /dev/sdm

Don't think so - although never tried it.  You can test individual drives.
>
> on each of the RAID devices? (I assume multipath complicates things here. Do
> you test each branch of the multipath separately?)
>
> o s/w raid requires intervention at the keyboard to remove the old drive
> from mdadm, go to the racks to hotswap the drive, return to the keyboard,
> add the new drive, and check that the rebuild succeeds. If you have 100s of
> racks, I assume this process doesn't scale. So wouldn't everyone be running
> hardware RAID?
>

A hot spare drive would be your friend here. Also be sure to run a
consistency check fairly frequently - there's nothing worse than
having an array fail, then fail to rebuild due to previously
undiscovered errors on other drives. Oh and yes, for most 'serious'
things I think I've always run HW RAID (or a SAN).  However, software
RAID is _great_ for making use of the 'spare' disk trays and for
setting up RAID 1 for servers with dual drives for the OS.
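
For the consistency check, a minimal sketch assuming the md sysfs
interface: writing "check" to /sys/block/mdX/md/sync_action starts a
scrub, and mismatch_cnt in the same directory reports what it found. The
function names are hypothetical, and the demo runs against a fake sysfs
tree in a temp directory; on a real host you need root and the default
path.

```python
import os
import tempfile


def start_check(array, sys_block="/sys/block"):
    """Ask md to start a consistency check on `array` (e.g. "md0")."""
    path = os.path.join(sys_block, array, "md", "sync_action")
    with open(path, "w") as f:
        f.write("check\n")


def mismatch_count(array, sys_block="/sys/block"):
    """Return the mismatch count md reports for `array`."""
    path = os.path.join(sys_block, array, "md", "mismatch_cnt")
    with open(path) as f:
        return int(f.read().strip())


# Demo against a fake sysfs tree so this runs without a real array.
with tempfile.TemporaryDirectory() as fake_sys:
    md_dir = os.path.join(fake_sys, "md0", "md")
    os.makedirs(md_dir)
    with open(os.path.join(md_dir, "sync_action"), "w") as f:
        f.write("idle\n")
    with open(os.path.join(md_dir, "mismatch_cnt"), "w") as f:
        f.write("0\n")

    start_check("md0", sys_block=fake_sys)
    with open(os.path.join(md_dir, "sync_action")) as f:
        demo_action = f.read().strip()
    demo_mismatches = mismatch_count("md0", sys_block=fake_sys)

print(demo_action, demo_mismatches)
```

Run from cron (say weekly), this gives you a chance to find bad sectors
before a rebuild depends on them.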


>
> Thanks Joe
>
> --
> Joseph Mack NA3T EME(B,D), FM05lw North Carolina
> jmack (at) wm7d (dot) net - azimuthal equidistant map
> generator at http://www.wm7d.net/azproj.shtml
> Homepage http://www.austintek.com/ It's GNU/Linux!


