Project: Symbiotic Applications on SMP Machines
Student Researchers: Annalisa Ruskievicz, Nicole Wolter
Advisors: John ("Jay") R. Boisseau, Ph.D.; Greg Johnson, M.S.
Institution: San Diego State University





CREW FINAL REPORT

Goals and purposes of project:

The goal of this project was to co-schedule applications that would be mutually beneficial to one another, in the true spirit of symbiosis. The motivation for this project was drawn from the current standard of building high performance computing (HPC) hardware. As the performance of HPC hardware increases, usually by increasing the number of processors, researchers are confronted with the problem of utilizing nodes to their maximum capacity. Writing codes that are efficient on a larger number of processors relies upon increasing parallelism and reducing serial regions. This, however, becomes more difficult as codes can only be optimized to a certain extent: parallelization is limited by I/O overhead and by memory and bandwidth constraints. Based on tests run on a simulated machine, we determined that two different codes emphasizing different resource usages could be run together to utilize more of the individual processors within each node.

Parallel computing is used mainly for two reasons: crunching large amounts of data and performing computationally intense calculations. The goal of symbiosis is to reduce either the amount of space needed for data storage or the computational overhead. The large amount of data collected by a scientific code generally makes little sense without visualization or interpretation of some kind, so a visualization program is often run on the output data from the computation. Research on large hydrodynamics problems is therefore a two-step process: data is created during computational runs and saved until a later date, when a visualization program can render it. Creating a one-step process by simultaneously running the application and visualization codes would expedite the scientific process, reduce research time, and eventually enable on-the-fly visualization of an application. In other words, symbiosis has the potential to enable computational steering, in which researchers interpret data at run time, draw conclusions from the results as they are delivered, and, if necessary, modify the application parameters.

Account of the process used in completing research:


We started our project by becoming familiar with Blue Horizon, the clustered SMP machine at the San Diego Supercomputer Center on which we would be testing our hypothesis. We initially tested the performance of various scientific codes to determine the best candidates for our symbiotic experiment. Criteria for selecting the codes included compatibility with our selected visualization code, MPIRE; availability of the source code; and current use of the code in solving real science problems at the San Diego Supercomputer Center or at an NPACI site. Among the applications tested were two astrophysics codes (HPS and SCF) and a chemistry code (GAMESS).

Compatibility with our designated visualization code, MPIRE, was the main criterion to meet; the codes under consideration were selected for their use and availability. MPIRE creates a visual rendering from regularly gridded data points, so the output data from the scientific codes had to be in a gridded format, or in a form that would allow us to modify the data into regularly gridded points. For example, one incompatibility we discovered with SCF was questionable input data: the code was being used as a benchmark application, and its input data was fabricated, meaning we would not be solving true "science" problems. We chose the astrophysics code HPS as our scientific code.
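To illustrate the kind of conversion MPIRE's input requirement implies (this is a hypothetical sketch, not the project's actual conversion code), scattered simulation samples can be binned onto a regular grid by averaging the samples that fall in each cell; the function name, grid resolution, and bounds below are all assumptions for illustration:

```python
def regrid(points, nx, ny, nz, bounds):
    """Bin scattered (x, y, z, value) samples onto a regular nx*ny*nz grid
    by averaging the samples that land in each cell.
    `bounds` is ((xmin, xmax), (ymin, ymax), (zmin, zmax))."""
    sums, counts = {}, {}
    (x0, x1), (y0, y1), (z0, z1) = bounds
    for x, y, z, v in points:
        # Map each coordinate to a cell index, clamping to the grid edges.
        i = min(int((x - x0) / (x1 - x0) * nx), nx - 1)
        j = min(int((y - y0) / (y1 - y0) * ny), ny - 1)
        k = min(int((z - z0) / (z1 - z0) * nz), nz - 1)
        sums[i, j, k] = sums.get((i, j, k), 0.0) + v
        counts[i, j, k] = counts.get((i, j, k), 0) + 1
    # Cells that received no samples default to 0.0.
    return [[[sums.get((i, j, k), 0.0) / counts.get((i, j, k), 1)
              for k in range(nz)] for j in range(ny)] for i in range(nx)]
```

A real pipeline would also have to match MPIRE's expected file format and value scaling; this sketch only shows the regularization step itself.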

In the meantime, hardware modifications made running co-scheduled jobs on Blue Horizon difficult. Blue Horizon uses a job scheduler from the Maui Supercomputer Center, which does not allow processors to be delegated for simultaneous execution of two or more parallel jobs that span two or more of the SMP nodes. To alleviate this dilemma, a set of 8 nodes, each with 8 processors, was designated with the Maui scheduler disabled. These nodes, named the B80 nodes, were thus set aside for our project.

We continued to collect timings and found that the most accurate timings came from LoadLeveler, IBM's batch job scheduling utility, which emails timings to the submitting party upon job completion. Because the B80 nodes are test nodes, LoadLeveler does not recognize them and no timing results are sent. This notification problem initiated another change to the Blue Horizon nodes.

Research is continuing through the summer and will include running more benchmarks and pairing codes to compare timings, determine speedup, and quantify the throughput gained by co-scheduling.
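One simple way to quantify the gain we plan to measure (a sketch under an assumed definition, not a result from our experiments): if two jobs take t_a and t_b seconds when run back-to-back on the same nodes, and t_co seconds when co-scheduled, the throughput gain is (t_a + t_b) / t_co. The numbers in the comment are illustrative only:

```python
def coscheduling_gain(t_a, t_b, t_co):
    """Throughput gain from co-scheduling: the ratio of back-to-back
    wall-clock time (t_a + t_b) to co-scheduled wall-clock time t_co.
    A value above 1.0 means co-scheduling increased throughput."""
    return (t_a + t_b) / t_co

# Illustrative numbers only: a compute-bound job (600 s) and an
# I/O-bound job (500 s) that finish together in 700 s when
# co-scheduled would yield a gain of (600 + 500) / 700, about 1.57.
```

If the two jobs contend for the same resource, t_co can approach t_a + t_b and the gain falls back toward 1.0, which is why pairing codes with complementary resource usage matters.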

Conclusions and results achieved:

Drs. Allan Snavely and Giridhar Chukkapalli, both of SDSC, had some earlier success with co-scheduling: Snavely initiated co-scheduling on a simulated machine, while Chukkapalli tested Blue Horizon's capabilities prior to the hardware modifications mentioned earlier. The results of these previous tests were combined with ours and presented by our Technical Lead at the IBM SP Scientific Computing User Group Conference in Barcelona in May 2001. The abstract of "Node-Level Co-scheduling of Parallel Jobs on Clustered SMP Machines" can be found at http://www.spscicomp.org/ScicomP3/abstracts.html#johnson. Further investigation this summer will culminate in a paper submission to an appropriate HPC conference.

The students involved are Annalisa Ruskievicz and Nicole Wolter, both of whom graduated from San Diego State University this May with degrees in Computer Science. Jay Boisseau, Ph.D., was our faculty sponsor, and Greg Johnson was our Technical Lead.