[TriLUG] Linux(RedHat) kernel question(long/involved)
lhb at hpcc.epa.gov
Wed Oct 27 14:13:01 EDT 2010
I have lurked on this list for approximately 10 years and
have learned so much from reading it.
Now I have a question regarding Linux kernel internals.
At this site, I manage approximately 120 Linux servers and
several SGI Altix servers as well.
Now the question:
How does the LINUX kernel handle a multiprocessor box?
Situation: 1 to N users are logged into the box.
How are the CPUs allocated to these users?
How will the I/O be distributed for these users?
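To make the CPU part of the question concrete, here is a minimal Python
sketch (an illustration only, assuming Linux and Python 3; the function
name allowed_cpus is made up for this example). It lists the CPUs on
which the current process is permitted to run, which is one way to
observe what a user's job has been given on a multiprocessor box.

```python
import os

def allowed_cpus():
    """Return the sorted list of CPU ids this process may run on."""
    if hasattr(os, "sched_getaffinity"):
        # Linux: ask the kernel for this process's CPU affinity mask.
        return sorted(os.sched_getaffinity(0))
    # Fallback for platforms without sched_getaffinity: assume all CPUs.
    return list(range(os.cpu_count() or 1))

if __name__ == "__main__":
    cpus = allowed_cpus()
    print(f"pid {os.getpid()} may run on {len(cpus)} CPU(s): {cpus}")
```

Running it from different login sessions, or under "taskset", shows
whether any of them has been restricted to a subset of the CPUs.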
My reason for needing this information:
We have 40 Dell R610 servers, each with 8 processor cores and 48 GB of memory.
Storage for this cluster consists of two 42Tbyte SATAbeast units
attached via two Qlogic fiber controllers with a dedicated
server for each controller.
The total configuration for this cluster is 40 Dell R610s, each
with dual quad-core CPUs (320 cores in total), and 72 Tbytes of raw
disk space. Torque is used to queue jobs for the cluster and MPICH is
used to distribute each job across the nodes. The RAID volumes are
formatted with ext4. NFS with the automounter is used to make the disks
available to each of the individual servers.
Problem: When several jobs are running on the cluster, the load
average on the disk servers climbs above 8, sometimes as high
as 12, and the performance of the running jobs suffers.
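For anyone who wants to look at the same numbers, below is a minimal
Python sketch of how the load could be sampled on a disk server. It is
an illustration under assumptions (Linux /proc layout, Python 3; all
function names are made up here). On Linux the load average counts
tasks in uninterruptible sleep (state D, typically waiting on I/O) as
well as runnable tasks (state R), so the sketch reports both.

```python
import glob

def parse_loadavg(text):
    """Parse the contents of /proc/loadavg into the 1, 5 and 15 minute averages."""
    fields = text.split()
    return tuple(float(x) for x in fields[:3])

def state_of(stat_text):
    """Extract the one-letter task state from a /proc/<pid>/stat line.

    The command name is wrapped in parentheses and may itself contain
    spaces or parentheses, so split on the last ')' instead of whitespace.
    """
    return stat_text.rsplit(")", 1)[1].split()[0]

def count_states(stat_texts):
    """Count how many tasks are in each state (R, S, D, ...)."""
    counts = {}
    for text in stat_texts:
        state = state_of(text)
        counts[state] = counts.get(state, 0) + 1
    return counts

def read_stats():
    """Read every /proc/<pid>/stat file that is still present."""
    texts = []
    for path in glob.glob("/proc/[0-9]*/stat"):
        try:
            with open(path) as f:
                texts.append(f.read())
        except OSError:
            continue  # the process exited while we were scanning
    return texts

if __name__ == "__main__":
    with open("/proc/loadavg") as f:
        print("load averages:", parse_loadavg(f.read()))
    counts = count_states(read_stats())
    print("runnable (R):", counts.get("R", 0))
    print("uninterruptible (D):", counts.get("D", 0))
```

If most of the load shows up in state D rather than R, the disk servers
are waiting on I/O rather than short of CPU.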
We are in the process of installing and configuring a Lustre
filesystem; however, the disks will remain attached, and I
need to solve the load problem.
Thanks in advance for your suggestions and comments.
Contractor: Vision Technoligies
Senior Systems Engineer
US EPA Rm. E460

The rich man isn't the one who has the most,
but the one who needs the least.
--- IN GOD WE TRUST ---