# CE Seminar

Together with the Computational Engineering Research Center of TU Darmstadt, a joint seminar with talks in the field of CE is organized every semester. If you are interested in these seminars and would like to receive invitations, please subscribe to the corresponding mailing list.

## 2012

## Functional A Posteriori Error Estimates for Static Maxwell Type Problems

### Prof. Dr. Dirk Pauly, University Duisburg-Essen

**6 Dec 2012, 17:00; Location: S4|10-1**

This talk is concerned with the derivation of computable and guaranteed upper and lower bounds of the difference between the exact and the approximate solution of boundary value problems for static Maxwell type equations. Our analysis is based upon purely functional argumentation and does not rely on specific properties of an approximation method. Therefore, the presented estimates are applicable to any approximate solution which is square integrable. In particular, our results hold for non-conforming approximations. Such estimates (also called error majorants of functional type) have been derived earlier, e.g., for elliptic problems.
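For orientation, the following is a sketch of the well-known functional error majorant for the elliptic model problem $-\Delta u = f$ in $\Omega$, $u = 0$ on $\partial\Omega$. It is not the Maxwell-type estimate of the talk. The symbols $v$ (any conforming approximation in $H^1_0(\Omega)$), $y$ (an arbitrary vector field in $H(\operatorname{div},\Omega)$), and $C_F$ (the Friedrichs constant of $\Omega$) are notation assumed here for illustration.

$$
\|\nabla (u - v)\|_{L^2(\Omega)} \;\le\; \|y - \nabla v\|_{L^2(\Omega)} + C_F \,\|\operatorname{div} y + f\|_{L^2(\Omega)}
$$

The right-hand side contains only known quantities, so it is computable, and it makes no reference to how $v$ was obtained. Choosing $y = \nabla u$ makes both sides equal, so the bound is sharp.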

## Nitsche-based finite element methods for the Stokes problem on cut and composite meshes – analysis, implementation and applications

### Andre Massing, PhD, Simula Research Laboratory AS, Fornebu (Norway)

**22 Nov 2012, 17:00; Location: S4|10-1**

Multi-domain and multi-physics problems with moving interfaces and parameter studies with changing geometric domains can be severely limited by the use of conforming meshes when complex geometries in three spatial dimensions are involved. To overcome these limitations, several fixed-grid methods based on XFEM, Nitsche's method and related approaches have been investigated and have shown promising results in recent years.

Nitsche's method is a general approach to formulate boundary and interface conditions in a weak sense. It presents a promising alternative to existing Lagrange multiplier and penalty methods due to its analytical properties. Consequently, Nitsche's method has recently been employed to formulate new fictitious domain approaches and domain decomposition methods based on composite meshes and arbitrary cutting interfaces.
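As a minimal sketch of what "weak" imposition means here, consider the Poisson model problem $-\Delta u = f$ in $\Omega$ with $u = g$ on $\Gamma = \partial\Omega$, rather than the Stokes problem treated in the talk. The symbols $V_h$ (finite element space without built-in boundary values), $h$ (local mesh size), and $\gamma$ (a penalty parameter assumed to be sufficiently large) are notation introduced for this illustration. The symmetric Nitsche formulation then reads: find $u_h \in V_h$ such that for all $v_h \in V_h$

$$
(\nabla u_h, \nabla v_h)_\Omega
- (\partial_n u_h, v_h)_\Gamma
- (u_h, \partial_n v_h)_\Gamma
+ \frac{\gamma}{h}\,(u_h, v_h)_\Gamma
= (f, v_h)_\Omega
- (g, \partial_n v_h)_\Gamma
+ \frac{\gamma}{h}\,(g, v_h)_\Gamma .
$$

The second term comes from integration by parts and makes the method consistent, the third makes it symmetric, and the penalty term provides coercivity. The exact solution satisfies this identity, which distinguishes the approach from a pure penalty method.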

In the first part of this talk, we review how sophisticated algorithms and data structures from the field of computational geometry can be employed to efficiently implement schemes based on fictitious and overlapping domains. These techniques are part of an emerging framework for Nitsche-type methods based on the FEniCS project. In the second part, we present some very recent results for a Nitsche-based formulation of the Stokes problem on both fictitious and overlapping domains. Using so-called ghost-penalties, optimal a priori estimates are obtained. Moreover, the condition number of the stiffness matrix can be bounded independently of the location of the interface and the domain boundary. The talk concludes by demonstrating various applications of the given formulations.

References

[1] A. Massing, M. G. Larson, A. Logg, and M. E. Rognes, “A stabilized Nitsche fictitious domain method for the Stokes problem”, submitted for publication, 2012 (available as arXiv preprint arXiv:1206.1933)

[2] A. Massing, M. G. Larson, A. Logg, and M. E. Rognes, “A stabilized Nitsche overlapping mesh method for the Stokes problem”, submitted for publication, 2012 (available as arXiv preprint arXiv:1205.6317)

[3] A. Massing, M. G. Larson, and A. Logg, “Efficient implementation of finite element methods on non-matching and overlapping meshes in 3D”, SIAM J. Sci. Comput., accepted, 2012 (available as arXiv preprint arXiv:1210.7076)

[4] A. Massing, M. G. Larson and A. Logg, “Towards an Implementation of Nitsche’s Method on Overlapping Meshes in 3D”, AIP Conference Proceedings 1281, 2010

## High-Quality Software Development in Academia: (Un-)Feasible?

### Dr.-Ing. Klaus Iglberger, University Erlangen-Nürnberg

**8 Nov 2012, 17:00; Location: S4|10-1**

Software development in academia is subject to a set of special circumstances that make the development of sustainable, maintainable, and high-quality software particularly difficult. In this talk I address these problems and explain why academia might be facing substantial problems in years to come, but also why the creation of maintainable software is especially important in academia. Based on these requirements, I will also present possible strategies to mitigate these problems. By means of the waLBerla project, today one of the leading CFD tools based on the lattice Boltzmann method, I will demonstrate how sustainable software development in academia can be made feasible.

## Uncertainty Quantification in Flow-Structures Interaction

### Prof. Fernando A. Rochinha, Federal University of Rio de Janeiro, Brazil

**11 Oct 2012, 17:00; Location: S4|10-1**

Consistently replacing prototypes or physical experiments by computations has been pursued within the Engineering and Applied Science community in the last decades, to a large extent due to the impressive computer power available nowadays. But the use of computer simulations as an effective tool still faces conceptual and technical challenges. Chief among these is the reliability of the predictions regarding the real physical response of the systems. Computational analysis in general is intended to provide reliable predictions of particular events, which are used as the basis for crucial decisions. In that sense, Uncertainty Quantification (UQ), which is a critical element in experimental exploration, has recently been considered an important conceptual basis for improving the reliability and wide acceptance of the predictive capacity of computer simulation.

In recent years there has been significant progress in quantifying and modeling the effect of input uncertainties on the response of partial differential equations (PDEs). The presence of uncertainties is incorporated by transforming the PDEs representing the system into a set of stochastic PDEs (SPDEs). The spectral representation of the stochastic dimension has led to the Generalized Polynomial Chaos Expansion, which has shown its utility and efficiency in different domains. As a drawback, this approach requires significant recoding, which is not very attractive, especially when legacy codes are to be used. To avoid that barrier, collocation methods have been developed, which also rely on an explicit representation of the stochastic dimension but recover the decoupled nature and non-intrusive implementation of sampling methods like the Monte Carlo method.
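The non-intrusive character of collocation can be illustrated with a small sketch. It assumes a single standard normal input and a hypothetical black-box `model` standing in for a legacy solver; neither is taken from the talk. The collocation estimate evaluates the model only at the three Gauss-Hermite nodes, just as Monte Carlo evaluates it only at random samples, so the solver code itself is never modified.

```python
import math
import random


def model(xi):
    """Hypothetical black-box solver: maps one uncertain input to a scalar response."""
    return math.exp(0.3 * xi)


# Three-point Gauss-Hermite rule for a standard normal input.
# Nodes and weights are the classical values; the weights sum to one.
NODES = (-math.sqrt(3.0), 0.0, math.sqrt(3.0))
WEIGHTS = (1.0 / 6.0, 2.0 / 3.0, 1.0 / 6.0)


def collocation_mean(f):
    """Estimate E[f(xi)] from deterministic solver runs at the collocation nodes."""
    return sum(w * f(x) for x, w in zip(NODES, WEIGHTS))


def monte_carlo_mean(f, n_samples, seed=0):
    """Estimate E[f(xi)] from solver runs at random samples of xi."""
    rng = random.Random(seed)
    return sum(f(rng.gauss(0.0, 1.0)) for _ in range(n_samples)) / n_samples


# For this particular model the mean is known in closed form: exp(0.3**2 / 2).
exact_mean = math.exp(0.3 ** 2 / 2.0)
colloc_mean = collocation_mean(model)
mc_mean = monte_carlo_mean(model, n_samples=3)

print(f"exact       {exact_mean:.6f}")
print(f"collocation {colloc_mean:.6f}  (3 solver runs)")
print(f"Monte Carlo {mc_mean:.6f}  (3 solver runs)")
```

Both estimators wrap `model` from the outside. The difference is that collocation exploits the smoothness of the response in the stochastic variable, which is why a handful of nodes can suffice for a smooth model like this one.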

Some theoretical results involving the convergence of collocation methods have been developed for elliptic linear problems, so the performance of such methods outside this realm is still to be proved, which has already been done for some nonlinear problems. Here we apply the collocation method to fluid-structure interaction problems. The first example deals with Fluid-Structure Interaction (FSI) in the context of Vortex-Induced Vibrations (VIV), which is central to the design of risers and floating structures in offshore engineering. Here, the VIV phenomenon is described by a simple model, often used by engineers in the initial design stages, that, despite its simplicity, is capable of tracking important aspects of the dynamics. A long-term response is a key ingredient for understanding fatigue failure mechanisms of structures. As the long-term statistics lead to a great amount of correlated data, these can be obtained and handled with the help of our enabling computational infrastructure.

The second example is devoted to a critical assessment of Large Eddy Simulation (LES) models. The inherent complexity of turbulent flows demands the use of refined grids in time and space for representing the multiscale character of the involved phenomena, especially the dissipation mechanisms. The use of direct numerical simulation (DNS) often leads to prohibitive computational costs that scale with the Reynolds number, in that case, inversely proportional to the smallest scale to be captured within the simulation. Despite some recent improvements in DNS schemes, LES models, which rely on adding extra dissipation, are often used in practical engineering applications. Here, a sensitivity analysis with respect to an LES parameter, similar to the one proposed in , is pursued in a benchmark problem. In this example, emphasis is placed on UQ performed within a stabilized finite element high performance computing code coordinated by a scientific workflow.

## Fluid-Structure Interactions in Turbines

### Prof. Romuald Rzadkowski, PhD, Polish Academy of Sciences, Gdańsk

**20 Sep 2012, 17:00; Location: S4|10-1**

A three-dimensional nonlinear time-marching method and numerical analysis for the aeroelastic behaviour of an oscillating blade row is presented. The approach is based on the solution of the coupled fluid-structure problem in which the aerodynamic and structural equations are integrated simultaneously in time. This provides the correct formulation of the coupled problem, since the interblade phase angle at which stability (or instability) occurs is part of the solution. The ideal gas flow through multiple interblade passages (with periodicity on the whole annulus) is described by the unsteady Euler equations in the form of conservation laws, which are integrated using the explicit monotone second-order accurate Godunov-Kolgan finite-volume scheme and a moving hybrid H-H (or H-O) grid. The structural analysis uses the modal approach and a 3D finite element model of the blade. The blade motion is assumed to be a linear combination of mode shapes with the modal coefficients depending on time. The influence of the natural frequencies on the aerodynamic coefficient and aeroelastic coupled oscillations for the Fourth Standard Configuration is shown. The stability (instability) areas for the modes are obtained. It has been shown that interaction between modes plays an important role in the aeroelastic blade response. This interaction has an essentially nonlinear character and leads to blade limit cycle oscillations.
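The modal approach mentioned above can be written down in its generic textbook form. The following is a sketch under assumed notation, not the speaker's exact formulation: $U_i(x)$ denotes the $i$-th mode shape from the finite element model, $q_i(t)$ its time-dependent modal coefficient, $\omega_i$ the natural frequency, $h_i$ a modal damping coefficient, and $\lambda_i(t)$ the modal force obtained by projecting the unsteady aerodynamic load onto mode $i$.

$$
u(x,t) = \sum_{i=1}^{N} U_i(x)\, q_i(t),
\qquad
\ddot q_i(t) + 2 h_i\, \dot q_i(t) + \omega_i^2\, q_i(t) = \lambda_i(t), \quad i = 1,\dots,N .
$$

The structural equations decouple into $N$ scalar oscillators, but the modes remain coupled through the flow, because each $\lambda_i$ depends on the blade motion produced by all modes. This is where the mode interaction described in the abstract enters.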

Numerical simulations of 3D viscous flutter were performed and compared with the available experimental results. The calculations were carried out for bending oscillations of the cascade known as the Eleventh Standard Configuration. The developed numerical algorithm solves the 3D Reynolds-averaged Navier-Stokes equations together with the Baldwin-Lomax turbulence model, using the explicit monotone second-order accurate Godunov-Kolgan finite-volume scheme and a moving hybrid H-O structured grid. Comparison of the calculated and the experimental results for the Eleventh Standard Configuration has shown sufficient quantitative and qualitative agreement for local performances (unsteady pressure amplitude and phase distribution) at off-design conditions. Benchmark solutions are provided for various values of the inter-blade phase angle.

Numerical calculations of the 3D transonic flow of an ideal gas through turbomachinery blade rows moving relative to one another, taking into account the blade oscillations, are presented. The proposed algorithm allows the calculation of turbine stages with an arbitrary pitch ratio of stator and rotor blades, taking into account the blade oscillations under the action of unsteady loads caused by both outer flow nonuniformity and blade motion. The calculation was performed for a turbine stage with rotor blades of 0.765 m. The numerical results for unsteady aerodynamic forces due to stator-rotor interaction are compared with results obtained when the blade oscillations are taken into account.

## preCICE – a multiphysics coupling environment for flexible coupling of black-box solvers

### M.Sc. Bernhard Gatzhammer, TU München

**9 Aug 2012, 10:00; Location: S4|10-1**

## Productivity and Performance with Multi-Core Programming

### Christian Terboven, Center for Computing and Communication, RWTH Aachen University

**4 Jul 2012, 17:00; Location: S4|10-1**

The multicore era has led to a renaissance for shared memory parallel programming models. Moreover, the introduction of task-level parallelization raises the level of abstraction compared to thread-centric expression of parallelism. However, shared memory parallel applications may exhibit poor performance on NUMA systems if non-local data is accessed. Furthermore, increasingly complex simulation codes as well as the need to employ multiple levels of parallelism in order to exploit today's multicore clusters need to be taken into account in the software development process. This work presents solutions for designing shared memory parallel applications targeting current and future system architectures by following a methodical approach as well as building on successful strategies from the software engineering discipline, such as the introduction of abstractions.

## Recent Advances in PETSc Scalable Solvers

### Dr. Lois Curfman McInnes, Argonne National Laboratory, U.S.A.

**2 Jul 2012, 14:00; Location: S4|10-1**

We will discuss recent advances in the Portable, Extensible Toolkit for Scientific Computation (PETSc) that enable application scientists to develop composable linear, nonlinear, and timestepping solvers for multiphysics and multilevel methods on emerging extreme-scale architectures. Examples of usage include lithosphere dynamics, subduction and mantle convection, ice sheet dynamics, subsurface reactive flow, fusion, mesoscale materials modeling, and power networks. We will also discuss how solver composability enables the development of hierarchical Krylov methods, which overcome well-known limitations in scaling of conventional Krylov methods due to global reductions.

## Autotuning: search, specialization, and multiple objectives

### Dr. Paul Hovland, Argonne National Laboratory, U.S.A.

**6 Jun 2012, 17:00; Location: S4|10-1**

We describe recent and ongoing work in the area of automatic performance tuning. By employing specialization, we can develop tuned variants for the most common usage scenarios in a given application. By posing the empirical search problem as a mathematical optimization problem, we can draw upon the large body of research in derivative-free optimization algorithms. Finally, by extending the optimization formulation to the multiple-objective case, we can develop techniques to explore the tradeoffs among competing criteria, such as execution time, energy consumption, resilience, and memory footprint.
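The multiple-objective case can be made concrete with a small sketch of Pareto filtering, one standard way to expose tradeoffs among competing criteria. The variant names and the (runtime, energy) measurements below are hypothetical and serve only as an illustration; they are not results from the talk. Both objectives are assumed to be minimized.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and strictly better in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(points):
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical measurements per code variant: (runtime in seconds, energy in joules).
variants = {
    "unroll2": (1.9, 40.0),
    "unroll4": (1.5, 46.0),
    "unroll8": (1.6, 52.0),
    "tiled": (1.2, 60.0),
}

front = pareto_front(list(variants.values()))
front_names = sorted(name for name, m in variants.items() if m in front)
print(front_names)
```

In this made-up data set, `unroll8` is slower and more energy-hungry than `unroll4` and is therefore discarded. The remaining variants each represent a different compromise, and picking among them is a decision the tuner leaves to the user or to a further weighting of the objectives.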

## The GSI Green-IT Cube, a highly energy efficient data center

### Prof. Dr. Volker Lindenstruth, Goethe-Universität, Frankfurt/Main

**31 May 2012, 17:00; Location: S4|10-1**

At GSI a new data center is being built, which sets new standards in energy efficiency and cost. The computer racks are cooled directly with heat exchangers and mounted on a steel structure much like a high-rack warehouse. This architecture enables power densities exceeding 20 kW/m² and floor or 100 kW/m². The entire data center will support computers up to 16 MW. The power required for the cooling infrastructure corresponds to 7% of the computer power. Groundbreaking is scheduled for 2012 and completion for early 2014. A small version of such an architecture has been completed, hosting up to 100 19-inch racks. This data center will host the computers for the FAIR facility at GSI. These systems are a mix of high performance servers, augmented with GPGPUs. An overview of the cooling architecture and the latest performance data will be given. In addition, an overview of FAIR computing will be presented.
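As a rough worked example of what the 7% figure implies, assume it applies at the full 16 MW computer load (the abstract does not say at which load it was determined). With $P_\text{IT}$ denoting the computer power and $P_\text{cool}$ the cooling power, both symbols introduced here:

$$
P_\text{cool} = 0.07 \times 16\,\text{MW} = 1.12\,\text{MW},
\qquad
\frac{P_\text{IT} + P_\text{cool}}{P_\text{IT}} = 1.07 .
$$

This ratio counts cooling only. Other overheads such as power distribution losses are not covered by the figure quoted in the abstract, so the ratio should not be read as a complete facility efficiency metric.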

## Substrate integrated waveguide integration with active components

### Farzaneh Taringou, University of Victoria, Canada

**14 May 2012, 09:00; Location: S2|17-114**

Substrate integrated waveguide (SIW) technology has been largely explored as a low-cost, low-profile transmission line in passive microwave and millimeter-wave circuits and devices. Yet, little research has been done to assess SIW performance in integration with active elements, e.g. Power Amplifiers (PAs) and Low Noise Amplifiers (LNAs). The long-term objective of this research is to develop techniques for the integration of SIW with such active devices and with surface-mount components on a single substrate layer, whereby low-cost and low-profile active integrated SIW-based systems could replace bulky RWG configurations, thus yielding an ultra-compact alternative for the receiver’s front-end circuitry. Since a large number of SIW-based receiver cards will be integrated to form a two-dimensional, and eventually dual-polarized, phased array feed, it is important that the individual SIW circuits be mass-producible and tolerance-insensitive. This is especially true for current radio-astronomy applications, but it will also be beneficial to all other commercial technologies involving SIW circuitry in the millimetre-wave frequency regime.

## Control of fluid flow using electromagnetic body forces

### Dr. Thomas Albrecht, Forschungszentrum Dresden Rossendorf

**3 May 2012, 17:00; Location: S4|10-1**

In many engineering applications, the way natural fluid flows behave leaves some room for improvement. While geometric optimizations, such as streamlined shapes, require no additional energy input, they might not always be possible, feasible, or sufficient.

Another option is active flow control, where a suitable actuator more or less directly alters flow structures. Of the variety of such devices proposed for flow control applications, we focus on Lorentz force actuators. They consist of (permanent) magnets and electrodes, generating a body force near the wall they are attached to. The momentum added to the flow is linearly driven by an electric current.
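The linear dependence on the current follows from the definition of the Lorentz force density. A brief sketch, with symbols assumed here rather than taken from the talk: $\mathbf{j}$ is the current density, $\mathbf{B}$ the magnetic flux density of the magnets, $\mathbf{E}$ the electric field applied through the electrodes, $\sigma$ the electrical conductivity of the fluid, and $\mathbf{u}$ the flow velocity.

$$
\mathbf{f} = \mathbf{j} \times \mathbf{B},
\qquad
\mathbf{j} = \sigma \left( \mathbf{E} + \mathbf{u} \times \mathbf{B} \right).
$$

Under the additional assumption of a weakly conducting fluid, in which the induced term $\mathbf{u} \times \mathbf{B}$ is negligible compared with the applied field $\mathbf{E}$, the force reduces to $\mathbf{f} \approx \sigma\, \mathbf{E} \times \mathbf{B}$. With fixed magnets it then scales linearly with the applied current and is independent of the flow itself.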

The actuator can be applied to prevent transition from laminar to turbulent flow, a process that would otherwise lead to a rapid increase in drag. Its linear response is also advantageous when suppressing flow separation at inclined airfoils to prevent the loss of lift. The talk will cover both applications, and include numerical as well as experimental results.

## Design by Transformation – Application to Dense Linear Algebra Libraries

### Prof. Dr. Robert van de Geijn, University of Texas at Austin, U.S.A.

**17 Apr 2012, 17:00; Location: S4|10-1**

The FLAME project has yielded modern alternatives to LAPACK and related efforts. An attractive feature of this work is the complete vertical integration of the entire software stack, starting with low level kernels that support the BLAS and finishing with a new distributed memory library, Elemental. In between are layers that target single-core, multicore, and multi-GPU architectures. What this now enables is a new approach where libraries are viewed not as instantiations in code but instead as a repository of algorithms, knowledge about those algorithms, and knowledge about target architectures. Representations in code are then mechanically generated by a tool that performs optimizations for a given architecture by applying high-level transformations much like a human expert would. We discuss how this has been used to mechanically generate tens of thousands of different distributed memory implementations given a single sequential algorithm. By attaching cost functions to the component operations, a highly optimized implementation is chosen by the tool. The chosen optimization invariably matches or exceeds the performance of implementations by human experts. We call the underlying approach Design by Transformation (DxT).

Biography:

Robert van de Geijn is a Professor of Computer Science and member of the Institute for Computational Engineering and Sciences at UT-Austin. He received his Ph.D. in Applied Mathematics from the University of Maryland. His interests are in linear algebra libraries, scientific computing, parallel computing, and formal derivation of programs. His FLAME project pursues how fundamental techniques from computer science support high-performance linear algebra libraries. He has written more than a hundred refereed articles and several books on this subject.

This work is in collaboration with Bryan Marker, Don Batory, Jack Poulson, and Andy Terrell.

## Some recent developments of a non-dissipative DGTD method for time-domain electromagnetics

### Dr. Stéphane Lanteri, INRIA Sophia Antipolis-Méditerranée

**8 Mar 2012, 17:00; Location: S4|10-1**

Nowadays, a variety of modeling strategies exist for the computer simulation of electromagnetic wave propagation in the time domain. Despite many advances in numerical methods able to deal accurately and flexibly with complex geometries through the use of unstructured (non-uniform) discretization meshes, the FDTD (Finite Difference Time Domain) method is still the prominent modeling approach for realistic time domain computational electromagnetics, in particular due to the straightforward implementation of the algorithm and the availability of computational power. In the FDTD method, the whole computational domain is discretized using a structured (Cartesian) grid. This greatly simplifies the discretization process but also represents the main limitation of the method when complicated geometrical objects come into play. Besides, the last 10 years have witnessed an increased interest in so-called DGTD (Discontinuous Galerkin Time Domain) methods. Thanks to the use of discontinuous finite element spaces, DGTD methods can easily handle elements of various types and shapes, irregular non-conforming meshes, and even locally varying polynomial degree, and hence offer great flexibility in the mesh design. They also lead to (block-) diagonal mass matrices and therefore yield fully explicit, inherently parallel methods when coupled with explicit time stepping. Moreover, continuity is weakly enforced across mesh interfaces by adding suitable bilinear forms (often referred to as numerical fluxes) to the standard variational formulations. In this talk, we will describe some recent developments aiming at improving the accuracy and the performance of a non-dissipative DGTD method for the simulation of time-domain electromagnetic wave propagation problems involving general domains and heterogeneous media. The common objective of the associated studies is to bring the method to a level of computational efficiency and flexibility that allows tackling realistic applications of practical interest.

## Mechanoenzymatics: Atomistic Simulation of Biomolecular Nanomachines

### Prof. Dr. Helmut Grubmüller, Max-Planck-Institut für biophysikalische Chemie, Göttingen

**3 Feb 2012, 17:15; Location: S2|14-024**

Proteins are biological nanomachines. Virtually every function in the cell is carried out by proteins, including protein synthesis, ATP synthesis, molecular binding and recognition, selective transport, sensor functions, mechanical stability, and many more.

The combined interdisciplinary efforts of the past years have revealed how many of these functions are effected on the molecular level.

Computer simulations of the atomistic dynamics play a pivotal role in this enterprise, as they offer both unparalleled temporal and spatial resolution. With state-of-the-art examples, this talk will explain the basics of this high performance computing method, the type of questions that can (and cannot) be addressed, and its current limitations.

The examples include the mechanical force sensor titin kinase, mechanics of FATP synthase, and the flexible recognition by nuclear pore transporters.

This talk is provided together with the Physics Colloquium at TU Darmstadt.