5. Final Remarks

This work compared the most common approaches for increasing the computational performance of Python code on SDumont, using a single test problem. The test problem is a 2D heat transfer problem modeled by the Poisson partial differential equation and solved by a finite difference method, which requires computing a 5-point stencil over the discretized domain grid. Its serial and parallel implementations in F90 were taken as the reference against which serial and parallel implementations of the same algorithm available in the Python environment were compared: F2Py, Cython, Numba, Numba-GPU and standard Python itself. Processing times, speedups and parallel efficiencies were presented and discussed for these implementations, considering a problem size specific to the test problem.

The Python environment is a high-level interactive tool for rapid prototyping and development of computer code, allowing the integration of modules written in F90 and the use of several third-party libraries. This work intends to show that Python can also be used for HPC through APIs and libraries such as F2Py, Cython, Numba and Numba-GPU. Faster implementations are obtained by porting the time-consuming parts of the Python code to a new function that is called from the Python program, except in the case of F2Py, which reuses existing F90 code. The performance results of serial and parallel processing for the given test problem show that such an approach is feasible.

Therefore, Python allows the programmer not only to readily develop and test the intended code, but also to port parts of it to obtain high performance using multiple cores and/or GPUs. The resulting code retains the portability of Python and provides another level of modularity, since a function implemented in Cython, for example, can be exchanged for another one in Numba-GPU according to the available computer architecture.
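The modularity argument can be illustrated with a minimal sketch. It is not the code used in this work: the function names (`stencil_loops`, `stencil_numpy`, `solve`), the `BACKENDS` table and the Jacobi-style update for the discretized equation -∇²u = f with grid spacing `h` are all assumptions made for illustration. Two interchangeable implementations of a 5-point stencil sweep share one signature, so the driver can select either by name. A Cython, Numba or Numba-GPU variant could presumably be registered in the same table without changing the caller.

```python
import numpy as np


def stencil_loops(u, f, h):
    """One Jacobi-style sweep with explicit Python loops (slow reference)."""
    new = u.copy()  # boundary values are kept unchanged
    n, m = u.shape
    for i in range(1, n - 1):
        for j in range(1, m - 1):
            # 5-point stencil: four neighbours plus the source term
            new[i, j] = 0.25 * (u[i - 1, j] + u[i + 1, j]
                                + u[i, j - 1] + u[i, j + 1]
                                + h * h * f[i, j])
    return new


def stencil_numpy(u, f, h):
    """The same sweep written with NumPy slicing instead of loops."""
    new = u.copy()
    new[1:-1, 1:-1] = 0.25 * (u[:-2, 1:-1] + u[2:, 1:-1]
                              + u[1:-1, :-2] + u[1:-1, 2:]
                              + h * h * f[1:-1, 1:-1])
    return new


# Hypothetical registry: any function with the same (u, f, h) signature,
# e.g. one compiled with Cython or Numba, could be added here.
BACKENDS = {"loops": stencil_loops, "numpy": stencil_numpy}


def solve(u0, f, h, steps, backend="numpy"):
    """Apply `steps` sweeps using the implementation selected by name."""
    step = BACKENDS[backend]
    u = u0
    for _ in range(steps):
        u = step(u, f, h)
    return u


if __name__ == "__main__":
    n = 16
    u0 = np.zeros((n, n))
    f = np.zeros((n, n))
    f[n // 2, n // 2] = 1.0  # single point heat source (illustrative)
    a = solve(u0, f, 1.0, 10, backend="loops")
    b = solve(u0, f, 1.0, 10, backend="numpy")
    print("backends agree:", np.allclose(a, b))
```

Because the driver depends only on the shared signature, the choice of implementation reduces to a single argument, which is the kind of exchange between implementations described above.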