Enabling World Class Research by High Performance Computing

From PaloDozen

Title Enabling World Class Research by High Performance Computing
Principal Investigator Eko Mursito Budi
Researchers Estiyanti Ekawati, Nugraha
Year 2009
Status Ongoing



This research aims to build a high performance computing system capable of lifting the quality of research at ITB toward that of a world class university. Recent developments in technology, as expressed by the Millennium Development Goals, point to nanotechnology and biotechnology as the future. Research in these fields relies on modern computational techniques such as finite difference methods, classical molecular dynamics, quantum molecular dynamics, and genetic algorithm optimization. However, ITB cannot join world class research without the required computing facility.

Building a high performance computing (HPC) facility requires high competence as well as a large budget. Our team has been involved in this endeavour for more than three years. Step by step, we have built:

  1. A cluster computing system consisting of 9 computers with Pentium IV CPUs (Central Processing Units).
  2. A grid computing system involving 20 AMD dual core computers in a laboratory (40 CPU cores in total, but not dedicated).
  3. A compact cluster computing system of 6 Intel Xeon quad core computers (24 CPU cores in total).

Those endeavours proved two things:

  1. we have the competence to build an HPC system, and
  2. the systems were still not powerful enough to serve highly demanding modern computational science.

In this research, we are ready to take the next step: building a mixed CPU + GPU cluster computing system. The CPU is a general purpose computing engine, useful for sequential stream computing. To raise the bar, modern CPUs such as the Intel i7 or AMD Phenom have multiple cores to support parallel computing. The GPU (Graphics Processing Unit), on the other hand, was designed to accelerate graphics processing using many shader processing cores. The newest GPUs from NVIDIA or ATI have up to 512 cores. Thus, by combining CPUs and GPUs we can build a massively parallel high performance computing system.
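The kind of workload that benefits from many cores can be sketched in a few lines of Python. This is only an illustration, assuming a one-dimensional explicit finite difference step; the names `update_point` and `step`, the coefficient `alpha`, and the fixed boundary values are assumptions made for the example and are not part of our system. Each call to `update_point` reads only the old array, so all interior points could be computed at the same time, one per core.

```python
def update_point(u, i, alpha):
    """New value at interior point i, computed only from the old array u."""
    return u[i] + alpha * (u[i - 1] - 2.0 * u[i] + u[i + 1])


def step(u, alpha=0.25):
    """One explicit finite difference step; the two boundary values stay fixed."""
    # Every interior point is independent of the others, so on a GPU each
    # one could be handled by its own thread. Here they run one after another.
    interior = [update_point(u, i, alpha) for i in range(1, len(u) - 1)]
    return [u[0]] + interior + [u[-1]]


u0 = [0.0, 0.0, 1.0, 0.0, 0.0]  # a single spike in the middle (assumed test data)
u1 = step(u0)                   # the spike spreads to its neighbours
```

A sequential CPU walks through the list comprehension point by point; a massively parallel device would evaluate all of the `update_point` calls together, which is why grid and particle methods are natural candidates for the GPU.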

Researchers quickly realized the potential of the GPU for modern scientific computing. In the USA, NCSA has built a mixed CPU + GPU + FPGA cluster system and opened it to the public [http://www.ncsa.uiuc.edu/Projects/GPUcluster/]. So far, the facility has given rise to much advanced research in various fields (physics, biology, geology, as well as social science). ITB should pioneer the same approach in Indonesia.


Since 2007, Engineering Physics ITB has conducted joint research with Osaka University, Japan. More than 15 of our best students have been put forward to conduct advanced research at Osaka University, especially on material technology under the supervision of Prof. Kasai. In early March 2009, Osaka University held an international workshop on Quantum Simulation and Design (QSD). Several Indonesian students presented their research, including:

  • The Adsorption of Lithium Ion on Montmorillonite: A Density Functional Theory (DFT). This research aims to build more efficient lithium batteries. (Triati Kecanawungu)
  • SOx Adsorption Mechanism on BaO Surface, which aims to find better pollution prevention. (Ferensa Oemry)
  • Theoretical Study of Interaction of Ionic Molecules with Metallic Carbon Nanotubes for Carbon Nanotube-Based Supercapacitor. (Dimas Fajarisandi)
  • A DFT-Based Investigation on Small Water Cluster System, so that water can be easily separated into hydrogen and oxygen for fuel cell systems. (Handoko Setyo Kuncoro)
  • Hydrazine Adsorption and Decomposition on Ni(100): A DFT Investigation. This one aims at more efficient fuel cell generators. (Mohammad Kemal Agusta)

[File: QSD 2009 poster] [File: QSD 2009 participants]

Without doubt, this is very interesting advanced research. However, why must it be done abroad? Simply because ITB does not yet have the computing facility required to support this kind of research.

For such research, Kasai Lab has invested in several high performance computing (HPC) systems. The second generation system, called Sakura, consists of 2 servers and 64 compute nodes. Each node has an Intel Xeon dual core processor, for a total of 128 cores in the cluster. Yet, according to the students, a single quantum simulation still takes about 2 weeks to run!


In comparison, our team has been working to build an HPC system at TF ITB using low cost computers and open source software. Due to the limited budget, the development has progressed step by step as follows:

  • 2003: Built a cluster computing system consisting of 9 computers with Pentium IV CPUs. The system ran well, although its performance was not satisfactory.
  • 2007: Integrated a grid computing system involving 20 AMD dual core computers in a laboratory. However, the computers were not dedicated, so unpredictable interruptions from public usage held back research progress.
  • 2008: Built a cluster computing system consisting of 6 Intel Xeon quad core computers. This is currently our main system.

So far, our experience shows good progress in understanding how to build a high performance system. ITB researchers have conducted some advanced research in computational material design and industrial optimization, though still on a very limited scale. We are highly confident that further refinement of the HPC facility will enable world class research at ITB.


The proposed high performance computing system is a heterogeneous CPU + GPU cluster computer system [P3]. The use of the GPU as a massively parallel computer system has been promoted by NVIDIA [P1,P2,P3]. Some well known laboratories have built this kind of cluster computer system, for example the Maryland CPU-GPU cluster and the NCSA multi-accelerator cluster.

Both facilities were built with the most advanced HPC technology, using recent multi core CPUs and GPUs. The most notable component is the InfiniBand network technology, capable of very low latency, high bandwidth data transfer.

[File: Maryland CPU-GPU Cluster] [File: NCSA Multi Accelerator Cluster]

For us, the most challenging aspect of building an HPC system is the cost. Some estimates are:

Component                      Estimated cost
Rack computer server system    $500 / unit
Server class motherboard       $600
Top notch CPU                  $1500
Top notch GPU                  $600
InfiniBand network             $1000 / connection

Thus our approach for a low cost system is to focus on the CPU and GPU while economizing on the other components. We also use open source software for the operating system and parallel computing middleware.
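As a rough illustration of this trade-off, the estimates above can be tallied per node. The sketch assumes one unit of each component and one InfiniBand connection per node; the dictionary keys and the `node_cost` helper are hypothetical names chosen for the example. The low cost figure counts only the CPU and GPU, because the prices of the cheaper replacement parts are not listed here.

```python
# Estimated prices in USD, copied from the cost list above.
PRICES = {
    "rack_server": 500,         # per unit
    "server_motherboard": 600,
    "cpu": 1500,
    "gpu": 600,
    "infiniband": 1000,         # per connection
}


def node_cost(components):
    """Sum the estimated price of the named components (one of each assumed)."""
    return sum(PRICES[name] for name in components)


# Assumption: a "full" node uses every listed component once.
full_node = node_cost(PRICES)

# Assumption: the low cost node keeps the CPU and GPU; its replacement
# chassis, board and network are not priced in the list, so they are omitted.
cpu_gpu_only = node_cost(["cpu", "gpu"])

print(f"full node: ${full_node}, CPU + GPU only: ${cpu_gpu_only}")
```

Under these assumptions the CPU and GPU account for half of the full node estimate, which is why savings have to come from the rack, motherboard and network.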


Our current architecture is shown above. The architecture has the following novel features:

  • The cluster has a modular tree topology, allowing many nodes to be attached to the system while avoiding a network bandwidth bottleneck. This configuration also suits algorithms whose workload grows unpredictably, such as artificial intelligence and optimization.
  • Each node has a multi core CPU + GPU system in a bus topology, matching massively parallel algorithms such as the finite element method or molecular dynamics.
  • The system can be accessed from the Internet, allowing many researchers to use the facility.
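The effect of the modular tree topology can be sketched with a small model. This is a simplified illustration, not a description of our actual network: it assumes a regular tree in which every switch has the same number of children (`fan_out`) and the compute nodes are the leaves, numbered from 0. Both function names are hypothetical.

```python
def count_compute_nodes(fan_out, levels):
    """Number of leaves in a regular tree with `levels` layers of switches."""
    return fan_out ** levels


def hops_between(a, b, fan_out):
    """Number of network links on the path between leaf a and leaf b."""
    hops = 0
    while a != b:
        # Move both endpoints up to their parent switch; each move
        # adds one link on each side of the path.
        a //= fan_out
        b //= fan_out
        hops += 2
    return hops


# Assumed example: 4 ports per switch, 2 switch levels.
total_nodes = count_compute_nodes(4, 2)
same_module = hops_between(0, 1, 4)    # both nodes under the same switch
cross_module = hops_between(0, 4, 4)   # nodes under different switches
```

In this model, traffic between nodes of the same module never reaches the upper switch, so adding more modules does not load the links inside existing ones.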

So far, we have implemented the CPU cluster. In this proposal, we aim to add GPUs to the cluster. The easiest part of the research is to buy GPU cards and plug them into the existing computers. The more challenging part is to reconfigure our software system to accept heterogeneous computing, and to adjust the network for the higher bandwidth usage of each node.
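One way to picture the software side of this work is a dispatcher that sends each task to the kind of processor it suits. The sketch below is purely hypothetical: the `Task` fields, the `GPU_THRESHOLD` cut-off, the task names and the device labels are assumptions made for illustration and do not describe our middleware.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    parallel_items: int  # number of independent work items in the task


# Assumed cut-off: below this, moving data to the GPU is not worth it.
GPU_THRESHOLD = 1000


def choose_device(task, node_has_gpu):
    """Pick "gpu" for large data-parallel tasks on GPU nodes, else "cpu"."""
    if node_has_gpu and task.parallel_items >= GPU_THRESHOLD:
        return "gpu"
    return "cpu"


tasks = [
    Task("molecular_dynamics_step", 100_000),  # many independent particles
    Task("job_bookkeeping", 1),                # inherently sequential
]
plan = {t.name: choose_device(t, node_has_gpu=True) for t in tasks}
```

A real scheduler would also have to weigh data transfer time and network load, which is where the bandwidth adjustment mentioned above comes in.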


  1. NVIDIA, Technical Brief: NVIDIA Quadro vs. GeForce GPUs, Features and Benefits, White Paper, 2008.
  2. Geer, David, Taking the Graphics Processor beyond Graphics, IEEE Computer Graphics & Application, September 2005.
  3. Berillo, Alexey, NVIDIA CUDA: Non-graphic computing with graphics processors, October 21, 2008, http://ixbtlabs.com/articles3/video/cuda-1-p1.html
  4. Showerman, M. et al., QP: A Heterogeneous Multi-Accelerator Cluster, 10th LCI International Conference on High-Performance Clustered Computing, March 10-12, 2009, Boulder, Colorado.
  5. Wong, Henry Ting-Hei, Architectures and Limits of GPU-CPU Heterogeneous Systems, Master's Thesis, Electrical and Computer Engineering, The University of British Columbia, October 2008.
  6. Nurdiyanto, E., Eko M. Budi, “Pengembangan Perakit Distro Otomatis”, Seminar Nasional Ilmu Komputer dan Aplikasinya, UNPAR, Bandung.
  7. Budi, Eko M., Hermawan K. Dipojono, “Extended Distributed Scheduling Taxonomy with Computability Level Criterion”, International Seminar on Rural-ICT, ITB.

Contributor: Mursito