[TriLUG] Jihad! (was Remote server monitoring)
william at trilug.org
Thu Sep 1 12:30:55 EDT 2005
The question wasn't entirely theoretical. We have an in-house developed
system monitoring tool at $WORK to make sure that our servers aren't being
bogged down by manufacturing processes (a lot of back-end stuff going on
with databases and so on). We also have a large worldwide VPN where
segments run over hardware we don't own or control. Consequently, fixing
the outages isn't an option...
On Thu, 1 Sep 2005, Tarus Balog wrote:
> On Sep 1, 2005, at 12:14 PM, William Sutton wrote:
> > It seems like a more sensible alternative to polling is to have separate
> > tools for monitoring and data collection/reporting: Place the monitor on
> > the servers, and allow them to queue up reports in event of network
> > problems.
> Depends on what you want to monitor. I can have a program check if
> apache is running on the server, but does that mean that server is
> available in LA? New York? If all you care about is "is there an
> apache process running on this server that I can connect to, from
> this server" then, yeah. If you want to measure service availability,
> you need to measure it from the user's point of view. If Travelocity
> is slow, I go to Orbitz, whether or not the Travelocity server is
> actually up as far as they are concerned. In my case, I want to
> capture the user experience.
> You can also place "agents" on systems, but agent management outside
> of what ships with an O/S can be problematic on an enterprise scale. I
> guess you could write an agent to store performance data, like CPU,
> disk, etc., and then report it up to an NMS, but many people would
> rather spend resources to fix issues with the "spotty" network and
> leave it at that.
> Tarus Balog
> The OpenNMS Group, Inc.
> Main : +1 919 545 2553 Fax: +1 503-961-7746
> Direct: +1 919 647 4749 Skype: tarusb
> Key Fingerprint: 8945 8521 9771 FEC9 5481 512B FECA 11D2 FD82 B45C
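
For what it's worth, the "queue up reports in event of network problems"
idea quoted above could be sketched roughly as follows. This is only an
illustration, not our in-house tool: ReportQueue, the send callable, and
the max_reports limit are names made up for the example, and the transport
is assumed to raise OSError when the link is down.

```python
import json
from collections import deque


class ReportQueue:
    """Buffer monitoring reports locally and flush them when the link is up."""

    def __init__(self, send, max_reports=1000):
        # `send` is a hypothetical transport: any callable that delivers one
        # serialized report and raises OSError on network failure.
        self._send = send
        # Bounded so a long outage cannot exhaust memory; when the queue is
        # full, the oldest report is discarded to make room for the newest.
        self._pending = deque(maxlen=max_reports)

    def submit(self, report):
        """Queue a report, then try to deliver everything pending."""
        self._pending.append(json.dumps(report))
        return self.flush()

    def flush(self):
        """Send queued reports oldest-first; stop at the first failure."""
        sent = 0
        while self._pending:
            try:
                self._send(self._pending[0])
            except OSError:
                break  # link is down; keep the report for the next attempt
            self._pending.popleft()
            sent += 1
        return sent

    def __len__(self):
        return len(self._pending)
```

The agent on each server would call submit() every time it takes a
measurement. While a VPN segment is out, reports simply pile up (up to
max_reports), and the first successful submit() or flush() afterwards
drains the backlog in order. Note that this only covers the data
collection side; it says nothing about availability as seen by a remote
user, which still needs an external poll.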