# CE Seminar

Together with the Computational Engineering Research Center of TU Darmstadt, we organize a joint seminar each semester featuring talks in the field of CE. If you are interested in these seminars and would like to receive invitations, please subscribe to the corresponding mailing list.

## 2015

## Efficient strategies to reduce the memory footprint and CPU usage in parallel Large Eddy Simulations of turbulent combustion with large flamelet-based chemistry tables

### Prof. Dr. Christian Hasse, Technische Universität Bergakademie Freiberg

**15 Dec 2015, 17:00–18:30; Location: S4|10-1**

Fast chemical reactions in combustion take place in thin layers, which are usually called flamelets. The flamelet regime is the most prominent one in technical applications. Thus, flamelet-based models have been used very successfully and are actively developed for the simulation of turbulent combustion by the scientific community. Instead of solving these flamelets during the simulation, the flamelet structures are computed beforehand and stored in a look-up table. Then, the thermochemical state (temperature, species composition etc.) is retrieved during the simulation. One drawback of this approach is that the size of these tables can become quite large due to the large number of look-up parameters (dimension of the table leading to multi-dimensional interpolation during look-up) and the number of stored solution variables (thermochemical state). Especially for complex configurations and fuels, both the dimension and the solution size can increase significantly.
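As a purely illustrative sketch (the axis names, grid values, and the bilinear case are invented for this example; real flamelet tables are higher-dimensional), the table look-up amounts to multilinear interpolation of pre-computed thermochemical data:

```python
def find_cell(axis, x):
    """Index i with axis[i] <= x <= axis[i+1] (clamped to the table range)."""
    i = 0
    while i < len(axis) - 2 and x > axis[i + 1]:
        i += 1
    return i

def bilinear_lookup(z_axis, c_axis, table, z, c):
    """Retrieve a stored quantity from a 2-D table by bilinear interpolation."""
    i, j = find_cell(z_axis, z), find_cell(c_axis, c)
    tz = (z - z_axis[i]) / (z_axis[i + 1] - z_axis[i])
    tc = (c - c_axis[j]) / (c_axis[j + 1] - c_axis[j])
    return ((1 - tz) * (1 - tc) * table[i][j]
            + tz * (1 - tc) * table[i + 1][j]
            + (1 - tz) * tc * table[i][j + 1]
            + tz * tc * table[i + 1][j + 1])

# toy "temperature" table over mixture fraction Z and progress variable C
z_axis = [0.0, 0.5, 1.0]
c_axis = [0.0, 1.0]
table = [[300.0, 1200.0], [300.0, 2200.0], [300.0, 1500.0]]
T = bilinear_lookup(z_axis, c_axis, table, 0.25, 0.5)
```

With more look-up parameters the interpolation becomes multilinear in each dimension, which is exactly why both table size and look-up cost grow quickly.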

In an MPI-parallelized application, each process needs an individual copy of the table in RAM. Given the increasing table sizes explained above and the general hardware trend of decreasing RAM per core, the table size has become a limiting factor in parallel simulations of turbulent combustion.

This talk addresses the problem in two ways. First, a novel memory management strategy with an additional abstraction level is presented, which loads only the necessary parts of the table into physical memory and allows for in-memory compression and stripping. Various parallel extensions using the new MPI-3 standard and the `mmap` system call are discussed. The second strategy decomposes the local structure of the table and fits each solution variable to multidimensional polynomials of adaptive degree. After merging the different fits across all solution variables, automatic source code generation is employed; finally, instead of the table itself, the compiled fitting functions are stored in a shared library. During the simulation, the interpolation is replaced by the retrieval of an ID from a simple region database; the ID is associated with a function pointer in the library, which is then called for the final calculation of the thermochemical state.
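The second strategy can be caricatured in a few lines: fit a polynomial to tabulated data offline, generate source code for it, and compile the result so that look-up becomes a plain function call. This toy (entirely my own construction, not the speaker's implementation) uses a 1-D quadratic and Python's `exec` in place of multidimensional fits compiled into a shared library:

```python
def quadratic_through(points):
    """Coefficients (a, b, c) of the parabola through three (x, y) points."""
    (x0, y0), (x1, y1), (x2, y2) = points
    # Lagrange interpolation expanded to monomial coefficients
    d0 = (x0 - x1) * (x0 - x2)
    d1 = (x1 - x0) * (x1 - x2)
    d2 = (x2 - x0) * (x2 - x1)
    a = y0 / d0 + y1 / d1 + y2 / d2
    b = -(y0 * (x1 + x2) / d0 + y1 * (x0 + x2) / d1 + y2 * (x0 + x1) / d2)
    c = y0 * x1 * x2 / d0 + y1 * x0 * x2 / d1 + y2 * x0 * x1 / d2
    return a, b, c

def generate_fit_function(name, coeffs):
    """Emit source code for the fitted polynomial and compile it."""
    a, b, c = coeffs
    src = f"def {name}(x):\n    return ({a!r} * x + {b!r}) * x + {c!r}\n"
    namespace = {}
    exec(compile(src, "<generated>", "exec"), namespace)
    return namespace[name]

# hypothetical temperature samples along a mixture-fraction line
samples = [(0.0, 300.0), (0.5, 2100.0), (1.0, 1500.0)]
T_fit = generate_fit_function("T_of_Z", quadratic_through(samples))
```

The fitted function reproduces the table samples exactly and replaces per-query interpolation by a single (compiled) polynomial evaluation.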

The application of both methods is discussed with respect to both memory usage and CPU requirements for the LES of turbulent combustion. Finally, it is shown that the proposed strategy can also be applied to other physical problems, and an example from radiation modeling is presented.

## Hierarchical Matrices: A Universal Concept for the Treatment of Electromagnetic Problems

### Prof. Dr. Mario Bebendorf, University of Bayreuth

**27 Nov 2015, 15:30–17:00; Location: S2|17-103**

The talk addresses the fast numerical solution of elliptic boundary value problems by means of hierarchical matrices. The first part of the talk examines non-local operators, which arise, for example, from the discretization of integral operators. The resulting system matrices are dense, so that already their generation makes an efficient treatment of large-scale problems impossible. We present a surprisingly simple method, adaptive cross approximation, which generates an approximation with almost linear complexity using only a few of the original entries. An enormous advantage of hierarchical matrices over established fast methods such as the multipole method is that (approximate) arithmetic operations such as addition and multiplication can be defined. Using these operations, approximations of inverses and LU factorizations with reduced accuracy can be constructed and used as preconditioners. The broad applicability of these methods is demonstrated by numerical results from industrial collaborations in the fields of acoustics, elasticity, and electromagnetics. High-frequency problems can also be treated efficiently with a variant of the method. Furthermore, we will show that the stray-field energy in micromagnetics can be computed efficiently with these methods, which enables the simulation of transitions between stationary states.

The principles of adaptive cross approximation can be exploited particularly well to increase efficiency when applied to spatially high-dimensional problems, and thus form a basis for the efficient solution of partial differential equations with stochastic coefficients.
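The principle of adaptive cross approximation, building a low-rank factorization from a few matrix rows and columns, can be sketched as follows (a simplified toy with a basic pivoting rule, not a production ACA implementation):

```python
def aca(entry, n, m, tol=1e-10, max_rank=20):
    """Cross approximation of an n x m matrix given entrywise access.

    Returns rank-1 factors (us, vs) with A[i][j] ~ sum_k us[k][i] * vs[k][j].
    """
    us, vs = [], []
    used_rows = set()
    i_piv = 0
    for _ in range(max_rank):
        # residual row i_piv of A minus the current low-rank approximation
        row = [entry(i_piv, j) - sum(u[i_piv] * v[j] for u, v in zip(us, vs))
               for j in range(m)]
        used_rows.add(i_piv)
        j_piv = max(range(m), key=lambda j: abs(row[j]))
        if abs(row[j_piv]) < tol:
            break
        # residual column j_piv, scaled by the pivot entry
        col = [entry(i, j_piv) - sum(u[i] * v[j_piv] for u, v in zip(us, vs))
               for i in range(n)]
        us.append([c / row[j_piv] for c in col])
        vs.append(row)
        candidates = [i for i in range(n) if i not in used_rows]
        if not candidates:
            break
        i_piv = max(candidates, key=lambda i: abs(col[i]))
    return us, vs

# smooth kernel (Hilbert matrix) -- numerically low rank, ideal for ACA
hilbert = lambda i, j: 1.0 / (i + j + 1)
us, vs = aca(hilbert, 12, 12)
```

Each step touches only one row and one column of the original matrix, which is the source of the almost linear complexity mentioned above.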

In the second part of the talk, hierarchical matrices are applied to finite element discretizations. In this case the resulting stiffness matrices are sparse. Their inverses and the factors of their LU decomposition, however, are in general dense, so that hierarchical matrices lend themselves to constructing approximate preconditioners. Here the difficulty lies more on the theoretical side: although hierarchical matrices allow any matrix to be approximated with arbitrary accuracy, the approximation need not have almost linear complexity. We will present approximation results showing that both the inverse and the factors of the LU decomposition can be approximated with almost linear complexity. Interestingly, in contrast to multigrid methods, the complexity depends only weakly on the smoothness of the coefficients of the differential operator, so that these preconditioners can be applied to the entire class of elliptic problems without having to adapt them to the specific problem.

## Adjoint Methods for Efficient Optimization and Control in CFD and CAA

### Prof. Dr. Nicolas Gauger, TU Kaiserslautern

**5 Nov 2015, 17:00–18:30; Location: S4|10-1**

For efficient detailed aerodynamic or aeroacoustic designs as well as optimal active flow control, the use of adjoint approaches is an essential ingredient. Using adjoint methods, one is able to compute the gradients needed for sensitivity-based optimization and control methods, with a numerical effort independent from the number of design or control variables. The principal ideas underlying adjoint approaches will be presented and their efficiency will be demonstrated by several design and control examples in CFD and CAA. In this context we also discuss the development and application of our in-house tool for algorithmic differentiation (AD), called CoDiPack (Code Differentiation Package).
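In standard notation (a textbook summary, not specific to the speaker's codes), the adjoint construction reads as follows: for a state equation R(u, a) = 0 with design variables a and objective J(u, a),

```latex
\left(\frac{\partial R}{\partial u}\right)^{\!T}\lambda
  = -\left(\frac{\partial J}{\partial u}\right)^{\!T},
\qquad
\frac{\mathrm{d}J}{\mathrm{d}a}
  = \frac{\partial J}{\partial a} + \lambda^{T}\,\frac{\partial R}{\partial a}.
```

A single adjoint solve for λ yields the complete gradient, which is why the cost is independent of the number of design or control variables.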

## Model-Order Reduction for Nonlinear Eddy Current Problems via Quadratic-Bilinear Modeling

### Dipl.-Ing. Daniel Klis, Universität des Saarlandes

**2 Nov 2015, 16:15–17:45; Location: S2|17-103**

The finite-element time-domain simulation of nonlinear systems requires the iterative solution of a large, sparse system of equations at every time-step. Model-order reduction is a powerful tool for reducing the computational effort for this task. State-of-the-art methods compute an affine approximation to the nonlinearity. In contrast, the quadratic-bilinear approach treats the nonlinearity without simplification, by rewriting the original equations as a quadratic-bilinear differential algebraic system.
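The lifting step behind quadratic-bilinear modeling can be illustrated on a scalar example of my own (not the eddy current system from the talk): introducing an auxiliary variable for the nonlinearity turns a nonlinear ODE into a quadratic one without approximation.

```python
# Original nonlinear ODE: x' = -exp(x).
# Lifting with the auxiliary variable w = exp(x) gives the quadratic system
#     x' = -w,   w' = exp(x) * x' = -w**2,
# which contains only quadratic terms and is treated without simplification.
import math

def euler_original(x0, h, n):
    """Explicit Euler on the original nonlinear ODE."""
    x = x0
    for _ in range(n):
        x += h * (-math.exp(x))
    return x

def euler_lifted(x0, h, n):
    """Explicit Euler on the lifted quadratic system."""
    x, w = x0, math.exp(x0)
    for _ in range(n):
        x, w = x + h * (-w), w + h * (-w * w)
    return x

x_a = euler_original(0.0, 1e-4, 10_000)   # integrate to t = 1
x_b = euler_lifted(0.0, 1e-4, 10_000)
```

Both trajectories converge to the exact solution x(t) = -ln(1 + t); the lifted system simply exposes the nonlinearity in a form that quadratic-bilinear model reduction can exploit.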

In this talk, an adaptive order-reduction methodology with error control is presented for the nonlinear eddy current problem.

## Performance Analysis for the Exascale Era: From Measurements to Insights

### Dr. Martin Schulz, Lawrence Livermore National Laboratory, USA

**17 Jul 2015, 14:00–15:30; Location: S4|10-1**

**Abstract**

With rising system and application complexity, performance analysis is more important, but also more difficult, than ever. Traditional performance tools are capable of measuring large volumes of performance data that can help with this process, but the interpretation of this data is increasingly becoming the bottleneck. Simply comparing individual performance metrics, such as Flop/s and cache misses, and relating them to source code is no longer sufficient. Instead we require novel approaches that allow a deeper correlation of performance data with application and communication structures, take the system environment into account, provide multiple views on measured data, and offer intuitive visualizations to enable actionable insight.

In this talk I will discuss a general methodology that enables such tools, discuss infrastructure elements that enable instrumentation across the software stack, and present two novel performance visualization tools as case studies: MemAxes, a memory analysis tool to gather and display data movement within NUMA systems, and Ravel, a trace visualizer based on logical time. Both tools provide the developer with an application-centric view of performance data, which aids in capturing the performance behavior of the application and thereby enables optimization.

**Biography**

Martin Schulz is a Computer Scientist at the Center for Applied Scientific Computing (CASC) at Lawrence Livermore National Laboratory (LLNL). He earned his Doctorate in Computer Science in 2001 from the Technische Universität München and also holds a Master of Science in Computer Science from the University of Illinois at Urbana-Champaign. He has published over 175 peer-reviewed papers. He currently serves as the chair of the MPI Forum, the standardization body for the Message Passing Interface. He is the PI for the Office of Science X-Stack project “Performance Insights for Programmers and Exascale Runtimes” (PIPER) as well as for the ASC/CCE project on Open|SpeedShop, and is involved in the DOE/Office of Science exascale projects CESAR, ExMatEx, and ARGO. Martin's research interests include parallel and distributed architectures and applications; performance monitoring, modeling and analysis; memory system optimization; parallel programming paradigms; tool support for parallel programming; power-aware parallel computing; and fault tolerance at the application and system level. Martin was a recipient of the IEEE/ACM Gordon Bell Award in 2006 and an R&D100 award in 2011.

## Development of a discontinuous Galerkin method for variable viscosity Stokes problems: Applications in geodynamics

### Dominic E. Charrier, ETH Zurich

**16 Jul 2015, 10:00–11:30; Location: S4|10-314**

Traditionally, the geodynamics community has used staggered-grid finite difference schemes and mixed Finite Elements (FE) to discretize the variable viscosity Stokes problem. While these methods are considered sufficiently robust and accurate for a wide range of variable viscosity problems, they tend to have inf-sup constants that depend strongly on the cell aspect ratio. Discontinuous Galerkin (DG) methods alleviate this issue, which motivates their investigation in the context of geodynamics, where they have received little attention so far.

Specifically, we rigorously evaluate the applicability of two Interior Penalty Discontinuous Galerkin methods, namely the Nonsymmetric and Symmetric Interior Penalty Galerkin methods (NIPG and SIPG) for compressible elasticity and incompressible, variable viscosity Stokes problems. Numerical evidence is presented that the NIPG scheme is stable and convergent for Pk spaces even in the incompressible limit on conforming quadrilateral meshes. Both formulations are investigated for their convergence properties regarding velocity and pressure in the context of the Stokes problem. Numerical tests indicate that no spurious features emerge in either the pressure or velocity field.
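For orientation, NIPG and SIPG differ only in the sign θ of the symmetrization term in the interior penalty bilinear form (standard textbook notation with averages {{·}} and jumps [[·]], not taken from the talk; θ = +1 gives NIPG, θ = −1 gives SIPG):

```latex
a_\theta(u,v) = \sum_{K} \int_{K} \nabla u \cdot \nabla v
 \;-\; \sum_{e} \int_{e} \{\!\{\nabla u\}\!\} \cdot [\![v]\!]
 \;+\; \theta \sum_{e} \int_{e} \{\!\{\nabla v\}\!\} \cdot [\![u]\!]
 \;+\; \sum_{e} \frac{\sigma}{h_e} \int_{e} [\![u]\!]\,[\![v]\!]
```

The nonsymmetric choice θ = +1 yields coercivity for any positive penalty σ, while the symmetric choice θ = −1 is adjoint-consistent, which is the usual trade-off between the two schemes.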

In order to consider geodynamical applications the time-dependency of the mantle and lithosphere composition is taken into account through an auxiliary advection equation. Realistic simulations require computing on large clusters and the application of robust and scalable preconditioners, which will also be addressed.

## Software-Defined & High-Fidelity Visualization

### Dr. Johannes Guenther, Intel

**15 Jul 2015, 14:00–15:30; Location: S4|10-1**

**Abstract**

Supercomputers help to analyze huge data sets and to simulate complex theories in different science domains. Today, the results of these analyses and simulations are visualized using GPUs in workstations or small, dedicated visualization clusters. However, visualization on GPUs becomes harder and harder due to the ever increasing data size to transport and process. As an alternative we recommend software-defined visualization, i.e. using the CPUs of HPC clusters also for visualization. One project of our group in this direction is OpenSWR, a fast software rasterizer implementing the OpenGL API, which can be used as a drop-in replacement for many visualization applications. Another project I would like to present is OSPRay, a scalable and flexible ray-tracing-based rendering library. In this talk I will show that software-defined visualization can not only compete with GPU-based visualization, but is often actually the better choice because of the high-fidelity renderings it offers.

**Biography**

After completing his studies in physics and computer science in 2003, Johannes Guenther obtained his PhD from the MPI-I in Saarbrücken in 2008. Afterwards, he spent six years as a Senior Researcher and Software Architect at RTT AG in Munich before joining Intel Munich as a Senior Software Graphics Engineer in the area of ray tracing in 2014. Johannes currently works in the field of software-defined visualization, where he pushes the usage of CPUs where GPUs are traditionally employed. He participated in Intel's Embree project – a ray tracer used in many popular movies – and other projects.

## A Plane Wave Virtual Element Method for the Helmholtz Problem

### Prof. Ilaria Perugia, University of Vienna

**6 Jul 2015, 17:00–18:30; Location: S4|10-1**

The Virtual Element Method (VEM) is a generalisation of the finite element method introduced by Beirao da Veiga, Brezzi, Cangiani, Manzini, Marini and Russo in 2013, which takes inspiration from modern mimetic finite difference schemes and allows the use of very general polygonal/polyhedral meshes.

This talk is concerned with a new method that inserts plane wave basis functions within the VEM framework in order to construct a conforming, high-order method for the discretisation of the Helmholtz problem. The main ingredients of this plane wave VEM (PW-VEM) are: i) a low-frequency space, whose basis functions are not explicitly computed in the element interiors; ii) a proper local projection operator onto a high-frequency space, which has to provide good approximation properties for Helmholtz solutions and allow exact computation of the bilinear form whenever one of the two entries belongs to that space; iii) an approximate stabilisation term.
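For reference, plane wave bases of the kind used in such Trefftz-type methods consist of functions that solve the homogeneous Helmholtz equation exactly (standard notation, not taken from the talk):

```latex
\varphi_\ell(\mathbf{x}) = e^{\, i k\, \mathbf{d}_\ell \cdot \mathbf{x}},
\qquad |\mathbf{d}_\ell| = 1,
\qquad -\Delta \varphi_\ell - k^2 \varphi_\ell = 0 .
```

Because each basis function already satisfies the PDE, comparatively few directions d_ℓ per element can capture highly oscillatory Helmholtz solutions.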

The PW-VEM will be derived, and an outline of its convergence analysis will be presented, as well as some numerical testing.

These results have been obtained in collaboration with Paola Pietra (IMATI-CNR "E. Magenes'', Pavia, Italy) and Alessandro Russo (Università di Milano Bicocca, Milano, Italy).

## High Performance Scientific Computing @cs.pub.ro

### Prof. Emil Slusanschi, University Politehnica of Bucharest, Bucharest

**24 Jun 2015, 17:00–18:30; Location: S4|10-314**

In this talk we will give an overview of the High Performance Scientific Computing activities currently taking place at the Computer Science and Engineering Department of the University Politehnica of Bucharest, Romania. In particular, we will present applications from the areas of meteorology, aerospace, astrophysics, image processing, and seismology.

## Problem-suited numerical methods for some complex flows

### Prof. Dr. Maria Lukacova, University of Mainz

**23 Jun 2015, 13:00–14:30; Location: L1|01-K328**

In this talk we will present efficient numerical schemes for two classes of complex flows:

i) Fluid-Structure interaction problems with application in hemodynamics and

ii) Singular limit flows with application in meteorology.

We derive problem-suited finite volume and/or Discontinuous Galerkin approximations and analyse the schemes with respect to stability and accuracy, both theoretically and numerically.

Extensive numerical tests confirm the reliability and robustness of our new numerical methods.

## H-matrix accelerated second moment analysis for potentials with rough correlation

### Jürgen Dölz, University of Basel

**2 Jun 2015, 17:00–18:30; Location: S4|10-1**

After a brief introduction to the boundary element method and H-matrices we consider the efficient solution of strongly elliptic potential problems with stochastic Dirichlet data. The computation of the solution’s two-point correlation is well understood if the two-point correlation of the Dirichlet data is known and sufficiently smooth. Unfortunately, the problem becomes much more involved in case of rough data. We will show that the concept of the H-matrix arithmetic provides a powerful tool to cope with this problem. By employing a parametric surface representation, we end up with an H-matrix arithmetic based on balanced cluster trees. This considerably simplifies the implementation and improves the performance of the H-matrix arithmetic. Numerical experiments are presented to validate and quantify the presented methods and algorithms.

## Energy stable integration of Allen-Cahn and Cahn-Hilliard equations

### Prof. Dr. Bülent Karasözen, Middle East Technical University, Ankara (Turkey)

**23 Apr 2015, 16:00–17:30; Location: S4|10-314**

We consider Allen-Cahn and Cahn-Hilliard equations as phase field models with constant and variable mobility. Both equations are discretized in space using symmetric interior penalty discontinuous Galerkin (SIPG) finite elements. Time discretization is performed by the energy stable average vector field (AVF) method for gradient systems like the Allen-Cahn equation. We show that the fully discrete scheme satisfies the energy decreasing property. The numerical results for one- and two-dimensional Allen-Cahn and Cahn-Hilliard equations using adaptive stepping confirm that the discrete energy decreases monotonically, the phase separation and metastability phenomena can be observed, and the ripening time is detected correctly for convex double-well and non-convex logarithmic energy functionals.
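The energy-decrease property of the AVF method can be checked on a toy example (my own scalar gradient flow, not the SIPG-discretized PDE from the talk). For polynomial f, the average of f along the segment between two states is available in closed form, so the implicit AVF step can be solved by fixed-point iteration:

```python
# Gradient flow u' = f(u) = u - u**3 for the double-well energy
# E(u) = (u**2 - 1)**2 / 4.  The AVF step is
#     u_new = u + h * avg(u, u_new),
# where avg(a, b) is the exact average of f on the segment from a to b:
#     avg(a, b) = (a + b)/2 - (a**3 + a**2*b + a*b**2 + b**3)/4

def avf_step(a, h, iters=200):
    """One implicit AVF step, solved by plain fixed-point iteration."""
    b = a
    for _ in range(iters):
        avg = (a + b) / 2.0 - (a**3 + a**2 * b + a * b**2 + b**3) / 4.0
        b = a + h * avg
    return b

def energy(u):
    return (u * u - 1.0) ** 2 / 4.0

u, h = 2.0, 0.1
energies = [energy(u)]
for _ in range(60):
    u = avf_step(u, h)
    energies.append(energy(u))
```

For gradient systems, E(u_new) - E(u) = -h * avg(u, u_new)**2 <= 0, so the discrete energy decreases in every step regardless of the step size, which is the property the talk establishes for the fully discrete scheme.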

## Fast Galerkin Methods for Parabolic Boundary Integral Equations

### Prof. Johannes Tausch, Southern Methodist University Dallas, Texas (USA)

**9 Mar 2015, 17:00–18:30; Location: S4|10-1**

The boundary element method is a widely used technique for solving problems governed by elliptic PDEs. On the other hand, the application of integral equation methods to time dependent problems is much less developed and a topic of current research.

Time dependence is reflected in the fact that boundary integral operators involve integrals over time in addition to integrals over the boundary surface. For the numerical solution this means that a time step involves a summation over space and the complete time history. Thus the naive approach has order N²M² complexity, where N is the number of unknowns in the spatial discretization and M is the number of time steps. We discuss a space-time version of the fast multipole method which reduces the complexity to nearly NM.

A critical aspect of the success of boundary element methods is the choice of a proper discretization method. Since the thermal single layer operator is elliptic, the Galerkin method is unconditionally stable and optimally convergent. Each time step involves the solution of a linear system whose condition number is bounded with appropriate mesh refinement.

We also discuss the application of the methodology to three-dimensional transient Stokes flow. Perhaps surprisingly, this is not straightforward, because of different properties of the fundamental solutions of the heat and Stokes equations.

## Non-Uniform Rational B-Splines in Fluid Flow Simulations

### Dr.-Ing. Stefanie Elgeti, RWTH Aachen

**9 Feb 2015, 16:15–17:45; Location: S2|17-103**

Two important influences on the accuracy of simulation results are the approximation properties of the numerical scheme with respect to, on the one hand, the unknown solution, and on the other hand, the geometry representation. The latter aspect poses a particular challenge in the context of deforming computational domains.

These deformations can either be externally prescribed, as for example in fluid-structure interaction or shape optimization, or occur in the frame of free-boundary problems, where the computational domain itself is part of the solution. In the context of fluid flow, free-boundary problems can, e.g., be found in free-surface flows. Considering the coupled problem of flow solution and domain deformation, our solution approach is based on the Deforming-Spatial-Domain/Stabilized Space-Time (DSD/SST) finite element method. In DSD/SST, the variational form is written over the complete space-time domain, thus easily incorporating deforming domains into the formulation. The deformation itself is treated with a boundary-conforming interface tracking scheme. In order to further enhance the boundary conformation, the scheme employs Non-Uniform Rational B-Splines (NURBS) to support the standard finite element representation of geometry and flow solution. As the basis of CAD systems, NURBS are closely connected to any engineering application, particularly since the concept of Isogeometric Analysis (IGA) [1] introduced NURBS to numerical analysis. However, the generation of complex three-dimensional grids suited for IGA is still a challenge, hindering its use in the area of fluid mechanics. Nevertheless, methods for fluid simulation can profit immensely from the use of NURBS as a boundary description. Several stages of NURBS usage (here, all in the context of the finite element method) are possible:

1. Certain information needed for the computation (e.g., curvature or normals) is computed from a NURBS representing the boundary [2].

2. The computational domain is represented exactly using NURBS, but the solution is still interpolated using polynomials. This idea is, for example, realized in the NURBS-enhanced finite element method [3].

3. The NURBS represent both the geometry and the unknown solution (IGA).
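As a minimal illustration of why NURBS are attractive as a boundary description (my own example, not from the talk): a rational quadratic B-spline represents a circular arc exactly, so geometric quantities derived from it, such as normals and curvature, are exact as well.

```python
def bspline_basis(i, p, t, knots):
    """Cox-de Boor recursion for the B-spline basis function N_{i,p}(t)."""
    if p == 0:
        return 1.0 if knots[i] <= t < knots[i + 1] else 0.0
    left = right = 0.0
    if knots[i + p] != knots[i]:
        left = ((t - knots[i]) / (knots[i + p] - knots[i])
                * bspline_basis(i, p - 1, t, knots))
    if knots[i + p + 1] != knots[i + 1]:
        right = ((knots[i + p + 1] - t) / (knots[i + p + 1] - knots[i + 1])
                 * bspline_basis(i + 1, p - 1, t, knots))
    return left + right

def nurbs_point(t, ctrl, weights, knots, p):
    """Evaluate a NURBS curve as a weighted rational combination of control points."""
    ns = [bspline_basis(i, p, t, knots) for i in range(len(ctrl))]
    wsum = sum(n * w for n, w in zip(ns, weights))
    return tuple(sum(n * w * c[d] for n, w, c in zip(ns, weights, ctrl)) / wsum
                 for d in range(len(ctrl[0])))

# exact quarter circle from (1, 0) to (0, 1): rational quadratic with
# middle weight cos(45 deg) = sqrt(2)/2
ctrl = [(1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
weights = [1.0, 2 ** 0.5 / 2, 1.0]
knots = [0.0, 0.0, 0.0, 1.0, 1.0, 1.0]
x, y = nurbs_point(0.5, ctrl, weights, knots, 2)
```

Every evaluated point lies on the unit circle to machine precision, whereas any polynomial boundary representation of the same degree would only approximate it.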

Stemming from both the exact boundary description and the superior approximation properties of NURBS, such approaches can have a variety of advantages. These concern both the computational accuracy achieved at a given computational cost and the efficiency and accuracy of coupling schemes. The advantages of the discussed approaches will be demonstrated through a variety of engineering applications. Furthermore, the use of NURBS in shape optimization problems connected to fluid flow, especially from the area of production engineering, will be presented. Here, the focus is on the incorporation of shape constraints imposed by the manufacturing processes.

References

[1] T. J. R. Hughes, J. A. Cottrell,Y. Bazilevs. Isogeometric analysis: CAD, finite elements, NURBS, exact geometry and mesh refinement. Computer Methods in Applied Mechanics and Engineering 194 (2005), 4135–4195.

[2] S. Elgeti, H. Sauerland, L. Pauli, M. Behr. On the usage of NURBS as interface representation in free-surface flows. International Journal for Numerical Methods in Fluids 69 (2012), 73–87.

[3] R. Sevilla, S. Fernandez-Mendez, A. Huerta. NURBS-Enhanced Finite Element Method (NEFEM): A Seamless Bridge Between CAD and FEM. Archives of Computational Methods in Engineering 18 (2011), 441–484.

## Multirate Simulation by the MPDE technique for Radio Frequency Circuits

### FH-Prof. Dr.-Ing. habil. Hans Georg Brachtendorf, Upper Austria University of Applied Sciences (FHO), Hagenberg

**12 Jan 2015, 16:15–17:45; Location: S2|17-103**

In this talk we give an overview of the FHO activities in the FP7 research projects ICESTARS and nanoCOPS, the ENIAC project ARTEMOS, and the Austrian basic research project “Spline/wavelet based methods in circuit simulation”.

In Radio Frequency (RF) circuits one observes slowly varying envelope or baseband signals modulated onto a carrier at a carrier frequency fc, forming the bandpass signal. Communication engineers are mainly interested in distortions of the baseband signals during transmission and have developed the Equivalent Complex Baseband (ECB) technique, which splits the envelope from the carrier waveform. This method circumvents the bottleneck caused by the sampling theorem and speeds up the run-time significantly.

This technique is, however, not applicable to nonlinear circuits described by the Modified Nodal Analysis (MNA). The Multirate PDE (MPDE) approach instead reformulates the ordinary DAE system as a system of PDEs posed as a mixed boundary/initial value problem. The formulation of the PDE depends on the specific problem under test. The solution of the ordinary DAE is obtained along a characteristic curve of the PDE.
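The core idea of the MPDE reformulation, separating a slow envelope time from a fast carrier time and recovering the one-dimensional signal along the characteristic t1 = t2 = t, can be illustrated on a purely algebraic two-timescale signal (all names and values here are invented for illustration):

```python
import math

fc = 1.0e6   # carrier frequency in Hz (illustrative value)
fe = 1.0e3   # envelope frequency in Hz (illustrative value)

def envelope(t1):
    """Slowly varying baseband envelope, a function of the slow time t1 only."""
    return 1.0 + 0.5 * math.cos(2 * math.pi * fe * t1)

def x_hat(t1, t2):
    """Bivariate (MPDE-style) representation: slow time t1, fast carrier time t2."""
    return envelope(t1) * math.cos(2 * math.pi * fc * t2)

def x(t):
    """Original univariate signal, recovered along the characteristic t1 = t2 = t."""
    return x_hat(t, t)
```

Because x_hat is periodic in t2 with the short carrier period 1/fc while varying only slowly in t1, each axis can be sampled on its own time scale, which is precisely what lets multirate methods sidestep the sampling-theorem bottleneck of the univariate signal.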

The boundary/initial value problem can be solved by standard techniques such as the well-known Harmonic Balance (HB) method based on trigonometric basis functions, multistep integration formulas (e.g., BDF methods), etc. A severe disadvantage of trigonometric basis functions is their lack of compact support. BDF methods, on the other hand, exhibit numerical dissipation of energy. In recent research projects, spline/wavelet methods with adaptive grids have therefore been developed instead.

In a current research project the multirate circuit simulator is coupled with an EM/device simulator from MAGWEL. Preliminary results are presented.

The research presented was performed in cooperation with Kai Bittner.