[TriLUG] Odd Network problem

Mark Shuford mark at tmhco.com
Sun Feb 13 22:05:36 EST 2005


I've just re-read this. It's a bit of a ramble... much tiredness on my
part the last couple of days. I'm trying to explain the same thing a
couple of different ways. It looks like a mess to me, too...

But, as I said: check the core switch/router, and look to see whether
any box that can't reach the problem node has something funky in its
ARP cache or even its routing table. And, if you've more time than I
had, a sniffer looking for ICMP messages could be helpful.

Intermittents are THE royal pain in the Bu-Tox...



Duplex and link negotiation looked like a good first bet to us, also
("Hmmm... we thought that might be it") when we had similar problems
with our Extreme Black Diamond switch. Hard-setting them didn't help.


Look and see what sort of static routes you may have out of your core
switch. When we were seeing this sort of behavior, I tracked it down to
this: something had happened to the switch's connectivity to the node
in question, and the switch sent an ICMP redirect to the box trying to
reach it. We never determined whether this was a hard failure or a
problem in firmware or running memory.

Since this sent the traffic off into Internet Land instead of our
internal net, the problem persisted until the bad entry in the cache
timed out. And this could affect multiple nodes if they are all
downstream of another switch that connects to the flaky core switch.
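
For anyone who does have time to put a sniffer on it, here is a rough
Python sketch of what to look for. It assumes you already have the raw
ICMP portion of a captured packet (IP header stripped) and just picks
out Redirect messages (ICMP type 5 per RFC 792) and the gateway they
advertise. The function name is made up for illustration, and it is not
what was used at the time.

```python
import socket
import struct

ICMP_REDIRECT = 5  # ICMP type 5 = Redirect (RFC 792)


def parse_icmp_redirect(icmp: bytes):
    """Return (code, gateway_ip) if `icmp` is a Redirect message, else None.

    `icmp` is the ICMP portion of a packet, i.e. the IP header has
    already been stripped by whatever captured it.
    """
    if len(icmp) < 8:
        return None  # too short to hold type, code, checksum, gateway
    # Layout: type (1 byte), code (1), checksum (2), gateway address (4)
    icmp_type, code, _checksum, gateway = struct.unpack("!BBH4s", icmp[:8])
    if icmp_type != ICMP_REDIRECT:
        return None
    return code, socket.inet_ntoa(gateway)
```

With tcpdump, a capture filter along the lines of
"icmp[icmptype] == icmp-redirect" should narrow the capture to these
packets, assuming your libpcap build knows that keyword.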

Other symptoms: Some boxes could ping the trouble node, some couldn't.
The troubled node generally didn't have any trouble seeing anything.
Apparently, whether a given box could or couldn't depended on the
timing of its attempts to contact the node. If the connectivity was
good at that moment, all was well.

This wasn't just shooting in the dark. I could look at the arp cache on
one of the machines not able to hit the trouble node and see that it was
trying to reach it through an alternate router.
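
The check itself is simple enough to sketch in Python. This is only an
illustration: the function name and the addresses in the usage note are
made up, and how you find the next hop is up to you (on a Linux box,
"ip route get <address>" is one way to see it). The idea is that a
destination on your own subnet should be delivered directly, so any
gateway showing up as its next hop is suspect.

```python
import ipaddress


def misrouted_on_link(dest, local_net, next_hop):
    """True if `dest` is inside `local_net` but is being sent via a gateway.

    `next_hop` is the gateway the host is currently using to reach
    `dest`, or None if the host delivers to it directly.
    """
    on_link = ipaddress.ip_address(dest) in ipaddress.ip_network(local_net)
    return on_link and next_hop is not None
```

For example, misrouted_on_link("192.168.1.20", "192.168.1.0/24",
"192.168.1.254") flags a local node that is being reached through a
router, while passing None as the next hop does not.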

The Black Diamond is also aware of Layer III. If your Cisco isn't
running an SRP (Switch Route Processor) then I'm not thinking this will
help any. 

We, too, had a 'flat' network with a few gateways out to the real world.
All the regular traffic was supposed to go through the Black Diamond --
however, for some unknown reason (don't blame me... it was before I got
there) one of the _very_ ancillary gateways was set as the default route
across the network. And its default route was to the Internet gateway.

So when the ICMP snafu came from the core switch, the new direction it
advised was the ancillary router. That router then did another ICMP
redirect, and thus the traffic tried to go to Internet Land.
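
To make the chain concrete, here is a toy Python sketch that follows a
series of redirects from one hop to the next. The device names and the
redirect table are hypothetical stand-ins for the three boxes described
above; nothing here talks to real hardware.

```python
def follow_redirects(start, redirects, limit=8):
    """Follow redirect advice hop by hop and return the path taken.

    `redirects` maps a device name to the device it redirects traffic
    to. Stops at a device that gives no redirect, on a loop, or once
    the path grows past `limit` hops.
    """
    path = [start]
    while path[-1] in redirects and len(path) <= limit:
        nxt = redirects[path[-1]]
        if nxt in path:
            break  # redirect loop, stop here
        path.append(nxt)
    return path


# Hypothetical table: core switch points at the ancillary gateway,
# whose own default route points at the Internet gateway.
redirects = {"core-switch": "ancillary-gw", "ancillary-gw": "internet-gw"}
print(follow_redirects("core-switch", redirects))
```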


M Shuford


On Sun, 13 Feb 2005 07:42:21 -0500
Jeff Groves <jgroves at krenim.org> wrote:

> Here's my guess:
> 
>    Full/Half duplex mis-matches or autosense failure causing the same
> 
> Router is expecting something different than the network card is set
> for..
> 
> 
> Jeff G.
> 
> Chris Knowles wrote:
> > Got a weird one.  
> > 
> > (Oh, regarding that crashing box, further investigation pointed at
> > the motherboard as a culprit.)
> > 
> > I've got a Nagios server in place that's been happily warning us of
> > doom and gloom for over a year.  It's one of the great success
> > stories for Linux at our company.
> > 
> > Until now.
> > 
> > Starting this morning, it has been randomly unable to ping various
> > boxes on our network.  That is, until you ping the nagios server
> > from the "unpingable" server.   Then Nagios can ping that server all
> > it wants. 
> > 
> > This is all local network, no routing involved.  
> > 
> > Any idears as to what could be causing this?  (This is a simple
> > switched network, and other than this seems to be working fine.)
> > 
> > Any help is appreciated.  
> > 
> > CJK
> > 
> 
> -- 
> Law of Procrastination:
>          Procrastination avoids boredom; one never has
>          the feeling that there is nothing important to do.


-- 
Mark Shuford


