Project: Symbiotic Applications on SMP Machines
Student Researchers: Annalisa Ruskievicz, Nicole Wolter
Advisors: John ("Jay") R. Boisseau, Ph.D., Greg Johnson, M.S.
Institution: San Diego State University
CREW FINAL REPORT
Goals and purposes of project:
The goal of this project was to co-schedule applications that would
be mutually beneficial, in the true spirit of symbiosis.
The motivation for the project was drawn from the current standard
of building high performance computing (HPC) hardware. As the performance
of HPC hardware increases, usually by increasing the number of processors,
researchers are confronted with the problem of utilizing nodes to their
maximum capacity. Writing codes that are efficient on larger numbers
of processors requires increasing parallelism and shrinking serial
regions. This becomes more difficult over time, however, as codes can
be optimized only to a certain extent: parallelization is limited by I/O
overhead and by memory and bandwidth constraints. Based on tests run on
a simulated machine, we determined that two codes emphasizing
different resource usages could be run together to utilize more of the
individual processors within each node.
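The utilization argument can be made concrete with a toy model. The
compute/stall splits below are hypothetical, chosen for illustration
only, and are not measurements from Blue Horizon or the simulated
machine; the point is simply that one job's idle (I/O or memory-wait)
time can absorb the other job's compute time:

```python
# Toy model: two jobs, each split into processor-busy ("compute") time
# and processor-idle ("stall") time such as I/O or memory waits.
# All numbers are hypothetical, for illustration only.

cpu_a, stall_a = 8.0, 2.0   # compute-bound job
cpu_b, stall_b = 3.0, 7.0   # I/O-bound job

# Run back to back, each job occupies the node for its full duration.
sequential = (cpu_a + stall_a) + (cpu_b + stall_b)

# Co-scheduled, one job's compute can fill the other's stalls; in the
# ideal case the node is busy whenever either job has work, so the wall
# time is bounded below by the total compute and by each job alone.
co_scheduled = max(cpu_a + cpu_b, cpu_a + stall_a, cpu_b + stall_b)

print(f"sequential wall time: {sequential}")    # 20.0
print(f"co-scheduled (ideal): {co_scheduled}")  # 11.0
print(f"throughput gain:      {sequential / co_scheduled:.2f}x")  # 1.82x
```

Real co-scheduled jobs also contend for shared caches, memory
bandwidth, and the I/O subsystem, so the ideal overlap above is an
upper bound rather than an expectation.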
Parallel computing is used mainly for two reasons: crunching large
amounts of data and performing computationally intensive calculations.
The goal of symbiosis is to reduce either the space needed for data
storage or the computational overhead. The large amount of data
collected by a scientific code generally makes little sense without
visualization or interpretation of some kind, so a visualization
program is often run on the data output from the computation. Research
on large fluid hydrodynamics problems is therefore a two-step process:
data is created during computational runs and saved until a later date,
when a visualization program can render it. Creating a one-step process
by simultaneously running the application and visualization codes would
expedite the scientific process, reduce research time, and eventually
enable on-the-fly visualization of an application. In other words, symbiosis
has the potential to enable computational steering, in which researchers
interpret data at run time, draw conclusions from the results as they
are delivered, and, if necessary, modify the application parameters.
Account of the process used in completing research:
We started our project by becoming familiar with Blue Horizon, the clustered
SMP machine at the San Diego Supercomputer Center, upon which we would
be testing our hypothesis. We initially tested the performance of various
scientific codes to determine the best candidates for our symbiotic
experiment. Criteria for selecting the codes included compatibility
with our selected visualization code, MPIRE; availability of the source
code; and current use of the code in solving real science problems at
the San Diego Supercomputer Center or at an NPACI site. Among the applications
tested were two astrophysics codes (HPS and SCF) and a chemistry code
(GAMESS).
Compatibility with our designated visualization code, MPIRE, was the
main criterion to meet; the codes under consideration were selected
for their availability and active use. MPIRE creates a visual rendering
from regularly gridded data points, so the output data from the
scientific codes had to be in a gridded format, or in a form that we
could convert into regularly gridded data points. For example, one
incompatibility we discovered with SCF was questionable input data:
the code was being used as a benchmark application, and its input data
was fabricated, which meant we would not be solving true "science"
problems. We chose the astrophysics code HPS as our scientific code.
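When a code's output is particle-based rather than gridded, the
conversion to regularly gridded data points amounts to a binning pass.
The one-dimensional sketch below is purely illustrative: the function
name, grid resolution, and coordinates are assumptions, not MPIRE's
actual interface or HPS's actual output format.

```python
# Illustrative sketch: binning scattered particle positions onto a
# regular grid, since a renderer expecting regularly gridded data
# cannot consume raw particle lists. Grid size and domain bounds are
# made-up values for demonstration.

def grid_particles(positions, n_cells, lo, hi):
    """Count particles per cell of a regular 1-D grid over [lo, hi)."""
    grid = [0] * n_cells
    width = (hi - lo) / n_cells
    for x in positions:
        if lo <= x < hi:           # drop particles outside the domain
            grid[int((x - lo) / width)] += 1
    return grid

# Hypothetical particle coordinates from a simulation dump.
particles = [0.1, 0.15, 0.4, 0.42, 0.43, 0.9]
print(grid_particles(particles, 4, 0.0, 1.0))  # [2, 3, 0, 1]
```

A production version would bin in three dimensions and deposit a
field quantity (mass, density) rather than a simple count, but the
structure is the same.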
In the meantime, hardware modifications made running co-scheduled jobs
on Blue Horizon difficult. Blue Horizon utilizes a job scheduler
developed at the Maui High Performance Computing Center, which does not
allow processors to be delegated to two or more parallel jobs executing
simultaneously across two or more of the SMP nodes. To alleviate this
dilemma, we designated a set of 8 nodes, each with 8 processors, on
which the Maui scheduler was disabled. The B80 nodes, as the designated
nodes were named, were thus set aside for our project.
We continued to collect timings and found that the most accurate timings
came from LoadLeveler, IBM's batch job management system, which emails
them to the submitting party upon job completion. Because the B80 nodes
are test nodes, however, LoadLeveler does not recognize them, and no
timing results are sent. This notification problem initiated another
change to the Blue Horizon nodes.
Research is continuing through the summer and will include running more
benchmarks, pairing codes to compare timings, determining speedup,
and quantifying the increase in throughput obtained by co-scheduling.
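One plausible way to quantify that benefit is a sum of per-job
speedups: each job runs somewhat slower when sharing a node, but if
the pair's combined progress exceeds that of dedicated, one-at-a-time
scheduling, co-scheduling wins. The timings below are hypothetical,
and this metric is a sketch of a common approach, not a figure
reported by this study:

```python
def symbiotic_speedup(alone, together):
    """Sum of per-job speedups when jobs share a node.

    alone[i]:    runtime of job i on dedicated processors
    together[i]: runtime of job i when co-scheduled
    A result above 1.0 means the pair gets more work done per unit
    time than running the jobs one after the other.
    """
    return sum(a / t for a, t in zip(alone, together))

# Hypothetical timings (seconds): each job slows down when sharing
# the node, but the pair still comes out ahead overall.
alone = [100.0, 100.0]
together = [130.0, 125.0]

print(f"symbiotic speedup: {symbiotic_speedup(alone, together):.2f}")
# prints "symbiotic speedup: 1.57"
```

Here each job individually slows down (130 s and 125 s versus 100 s
alone), yet 100/130 + 100/125 ≈ 1.57 > 1, so total throughput improves.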
Conclusions and results achieved:
Drs. Allan Snavely and Giridhar Chukkapalli, both of SDSC, had some
prior success with co-scheduling. Snavely initiated co-scheduling on a simulated
machine while Chukkapalli tested Blue Horizon capabilities prior to
the hardware modifications mentioned earlier. Results of these previous
tests were combined with our results and presented by our Technical
Lead at the IBM SP Scientific Computing User Group Conference in Barcelona,
May 2001. The abstract of "Node-Level Co-scheduling of Parallel
Jobs on Clustered SMP Machines" can be found at http://www.spscicomp.org/ScicomP3/abstracts.html#johnson.
Further investigation this summer will culminate with a paper submission
to an appropriate HPC conference.
The students involved are Annalisa Ruskievicz and Nicole Wolter, both
of whom graduated from San Diego State University this May with degrees
in Computer Science. Jay Boisseau, Ph.D., was our faculty sponsor, and
Greg Johnson was our Technical Lead.