[TriLUG] Linux(RedHat) kernel question(long/involved)

Leslie(Pete) Boyd lhb at hpcc.epa.gov
Wed Oct 27 14:13:01 EDT 2010


Hello Trilug,

    I have lurked on this list for approximately 10 years and 
    have learned so much from reading it.
    
    Now, I have a question regarding Linux kernel internals.
    
    At this site, I manage approximately 120 Linux servers and
    several SGI Altix servers as well.
    
    Now the question:
     
     How does the Linux kernel handle a multiprocessor box?
     Situation: 1 to N users are logged into the box.
                How are the CPUs allocated to these users?
                How will the I/O be distributed for these users?
                
    My reason for needing this information:
    
    We have 40 Dell R610 servers, each with 8 cores and 48 GB of memory.
    Storage for this cluster consists of two 42 TB SATAbeast units
    attached via two QLogic Fibre Channel controllers, with a dedicated
    server for each controller.
    
    The total configuration for this cluster is 40 Dell R610s, each
    with dual quad-core CPUs (320 cores), and 72 TB of raw disk space.
    
    Torque is used to queue jobs for the cluster and MPICH is used
    to distribute each job across the nodes. The RAID volumes are
    mounted as ext4. NFS with the automounter is used to share the
    disks with each of the individual servers.
    
    Problem: When several jobs are running on the cluster, the load
             average on the disk servers climbs above 8, sometimes as
             high as 12, and the performance of the running jobs
             drops.
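
    For context on those numbers: a load average is easier to judge
    relative to the number of CPUs in the box, and on Linux it also
    counts tasks in uninterruptible sleep (usually waiting on disk or
    NFS I/O), not only tasks that want a CPU. The following is a
    minimal Python sketch of that comparison, not a diagnosis. The
    function names are made up for illustration, and the CPU count of
    the disk servers is not stated here, so any value passed for it
    is an assumption.

```python
import os


def parse_loadavg(text):
    """Return the 1-, 5- and 15-minute load averages from /proc/loadavg text."""
    fields = text.split()
    return tuple(float(f) for f in fields[:3])


def load_per_cpu(load, ncpus):
    """Divide a load average by the number of logical CPUs (hypothetical helper)."""
    if ncpus < 1:
        raise ValueError("ncpus must be at least 1")
    return load / ncpus


if __name__ == "__main__":
    # /proc/loadavg is Linux-specific; the first three fields are the averages.
    with open("/proc/loadavg") as fh:
        one, five, fifteen = parse_loadavg(fh.read())
    ncpus = os.cpu_count() or 1  # fall back to 1 if the count is unknown
    for label, value in (("1 min", one), ("5 min", five), ("15 min", fifteen)):
        print(f"{label}: load {value:.2f}, per CPU {load_per_cpu(value, ncpus):.2f}")
```

    Run on a disk server, this prints the live values. A per-CPU
    figure well above 1 while the CPUs are mostly idle would suggest
    that the load comes from tasks waiting on I/O rather than from
    CPU contention.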
             
    We are in the process of installing and configuring a Lustre
    filesystem; however, the disks will remain attached and I
    still need to solve the load problem.
    
    Thanks in advance for your suggestions and comments.
                         
******************
Leslie(Pete) Boyd         
Contractor:               The rich man isn't the one who has the most,
Vision Technologies       but the one who needs the least. 
Senior Systems Engineer   
US EPA  Rm. E460                --- IN GOD WE TRUST --  
919/541-1438                     
******************




