[TriLUG] System overload issues

Ron Kelley rkelleyrtp at gmail.com
Fri May 24 11:44:49 EDT 2013


Adding more swap will only prolong the inevitable death: swap is already
full and the OOM killer is firing, so the box would just thrash for longer.
Add more RAM or get an SSD...
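
To put numbers on that, here is a rough sketch using the figures from the
dmesg output quoted below.  The 4 kB page size is an assumption (it is the
usual value on x86, but the output does not state it), and the variable
names are only illustrative.

```shell
# Figures copied from the dmesg output quoted further down.
ram_pages=2293760        # "2293760 pages of RAM"
total_swap_kb=4192888    # "Total swap = 4192888kB"
free_swap_kb=0           # "Free swap  = 0kB"

# Assumption: 4 kB pages; `getconf PAGESIZE` on the real box would confirm.
page_kb=4

ram_kb=$((ram_pages * page_kb))
ram_mb=$((ram_kb / 1024))
swap_mb=$((total_swap_kb / 1024))
swap_used_kb=$((total_swap_kb - free_swap_kb))
total_mb=$(( (ram_kb + total_swap_kb) / 1024 ))

echo "RAM:        ${ram_mb} MB"
echo "Swap:       ${swap_mb} MB total, ${swap_used_kb} kB in use"
echo "RAM + swap: ${total_mb} MB"
```

If those figures are right, every kB of swap is in use, so the workload is
asking for more than RAM and swap combined can hold.  A bigger swap file
raises that ceiling but leaves the extra pages on disk, which is where the
thrashing comes from.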



Thanks,


-Ron

On May 24, 2013, at 11:28 AM, Brian McCullough wrote:

> On Fri, May 24, 2013 at 09:56:56AM -0400, Bill Farrow wrote:
>> On Fri, May 24, 2013 at 9:34 AM, Brian McCullough <bdmc at buadh-brath.com> wrote:
>>> Frequently during the day, the system will become ( or the web sites will become )
>>> non-responsive for periods ranging from one minute to well over an hour.
>> 
>> Have you thought about putting limits on processes to prevent them
>> from taking the system to its knees?  I would start by looking at
>> ulimit.  If you can prevent the system from becoming un-responsive,
>> then you can start investigating which process is going haywire and
>> hopefully fix it properly.
> 
> Thank you, Bill.  I hadn't thought of ulimit, since I have only used
> that to limit disk space ( if I remember correctly ) in the past.
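
For what it is worth, ulimit can also cap memory per process.  A minimal
sketch, assuming a shell whose ulimit builtin supports -v (bash and dash
do); the helper name run_limited and the 1 GB figure are made up for
illustration and are not taken from this system.

```shell
# run_limited KB CMD [ARGS...]
# Run CMD in a subshell whose virtual memory is capped at KB kilobytes,
# so the limit does not leak into the calling shell.
run_limited() {
    limit_kb=$1
    shift
    (
        ulimit -v "$limit_kb" || exit 1
        exec "$@"
    )
}

# Example: show the limit a child process would inherit.
run_limited 1048576 sh -c 'ulimit -v'
```

Note that this is a per-process cap.  With many httpd children the total
can still exceed RAM, so it helps contain one runaway process more than it
fixes overall memory pressure.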
> 
> 
>>> Now, other things seem to be showing failure symptoms; for instance, bzip2, which
>>> compresses the MySQL database backup, seems to take hours instead of minutes;
>> 
>> How big is the mysql dump file that is being compressed ?  
> 
> I think it is somewhere about 7.5G; it compresses to 1.1G.  I am in the
> process of unpacking one of the backups to confirm the original size.
> 
>> Time how
>> long it takes when the system is running normally, and compare with
>> when the system is under load.
>> 
>> time bzip2 test-backup
>> 
>> 
>> I'm going to second Ron Kelley's suggestion that it might be a bad
>> hard drive.  Check dmesg and syslog for hard drive error messages.  
> 
> I just took a look at dmesg, which I haven't done for a while, and found
> something that I think is MUCH more interesting.
> 
> My ( gut ) feeling has been that things are thrashing, and I see
> something at the bottom of the current dmesg that suggests that that may
> be ( part of ) the issue.
> 
> What I see is:
> 
> 
> Swap cache: add 17613573, delete 17613356, find 25621613/26574285, race 41+1296
> Free swap  = 0kB
> Total swap = 4192888kB
> Free swap:            0kB
> 2293760 pages of RAM
> 249431 reserved pages
> 311175 pages shared
> 585 pages swap cached
> Out of memory: Killed process 21911, UID 48, (httpd).
> 
> 
> There are more DMA statistics and CPU statistics prior to that, but the
> "Free swap: 0kB" is a red flag to me.
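
To see how often this is happening, and to which commands, a small sketch
that counts OOM kills in kernel log text.  It assumes the message format
shown above ("Out of memory: Killed process ..., (httpd)."); other kernel
versions word this line differently, and count_oom_kills is a hypothetical
helper name.

```shell
# count_oom_kills: read kernel log text on stdin and print how many
# OOM kills each command received, most frequent first.
count_oom_kills() {
    grep 'Out of memory: Killed process' |
        sed 's/.*(\(.*\)).*/\1/' |   # keep only the name in parentheses
        sort | uniq -c | sort -rn
}
```

Usage would be something like `dmesg | count_oom_kills`, or feeding it the
syslog file instead (its path varies by distribution).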
> 
> Am I correct, and should I start by increasing swap space, or should I
> work on reducing the need for it?
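
On reducing the need for it, a sketch for seeing which commands hold the
most resident memory in total.  It assumes a procps-style ps that accepts
`-eo rss=,comm=`; rss_by_command is a hypothetical helper name.

```shell
# rss_by_command: read "RSS_KB COMMAND" lines on stdin and print the
# summed resident set size per command, largest first.
# Only the first word of the command name is used.
rss_by_command() {
    awk '{ kb[$2] += $1 }
         END { for (c in kb) printf "%d %s\n", kb[c], c }' |
        sort -rn
}
```

Something like `ps -eo rss=,comm= | rss_by_command | head` would then list
the heaviest commands.  RSS counts shared pages once per process, so the
totals for forking servers such as httpd will overstate real usage somewhat.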
> 
> 
> Brian
> 



