A convergence of AI, HPC, and big-data analytics is being accelerated by the proliferation of modern compute workflows that combine different methodologies and techniques to solve complex problems. These domains now arguably run the same kinds of data- and compute-intensive workloads on HPC hardware, whether on niche supercomputers, small institutional clusters, or in the cloud.
Distributed scaling, occupancy, and bandwidth issues plague all of these domains as well. Currently, there are four major trends in this converged domain. First, the average size of application datasets is rapidly increasing: read-only input matrices that used to be on the order of megabytes or low single-digit gigabytes are growing into the double-digit gigabyte range and beyond. Second, applications are required to be ever more accurate; this leads to larger working-set sizes in memory as the resolution of stored and computed data becomes finer. Third, no matter how close accelerators sit to the CPU, their memory address spaces remain incoherent, and automated memory-management systems have not yet reached the performance of hand-crafted solutions for HPC/AI applications. Fourth, while the physical memory of accelerators is growing, it fails to grow at the same rate as application working sets. The conclusion is that future HPC systems will rely heavily on efficient accelerator memory management to handle future working-set sizes, and that considerable research will be essential in this field. A confluence of HW-SW co-design choices optimized for these converged scenarios will therefore be necessary; memory management and mapping form a crucial part of this.
The requirement for dynamic memory-mapping strategies (in unsupervised and semi-supervised online training, dynamic graph analytics, data analytics, sparse linear algebra, and databases) only compounds the issues mentioned above. In conventional HPC systems, the memory-management subsystem runs as a separate service, or as part of the runtime-management subsystem on a service node, and controls memory allocation on the compute nodes.
Almost all accelerator/GPU-level memory managers offer the standard malloc/free interface and operate on a block of memory of configurable size. They also follow a similar approach: splitting the available memory into large blocks (mostly of fixed size) and using these to serve individual allocation requests. Managing these resources involves lists, queues, or even hashing, and such schemes are far from optimal. Several approaches have been proposed over the last decade; these need to be evaluated on a level playing field, and on state-of-the-art hardware, to answer the question of whether dynamic memory mapping and management is really as slow as commonly believed. This also involves thoroughly evaluating compute-resource allocation (task/process-based, thread-based, and warp/wave-based), performance scaling, fragmentation, and real-world performance on custom and synthetic workloads as well as standard benchmarks, where available.
Following this, novel memory-management strategies must be proposed for these converged domains, with a particular emphasis on mapping. This work must result in guidelines for the best usage scenario of each strategy, as well as insights into the infrastructure interfaces required to integrate any of the tested or proposed memory managers into an application and to switch between them for benchmarking purposes.
This project is an initiative of the Compute Systems Architecture unit (CSA). CSA is researching emerging workloads and their performance on large-scale supercomputer architectures for next-generation artificial intelligence (AI) and high-performance computing (HPC) applications. The team is responsible for algorithm research, runtime-management innovations, performance modeling, architecture simulation, and prototyping for these future applications and the future systems that will execute them, aiming for multiple orders of magnitude better performance, energy efficiency, and total cost of ownership.

What we do for you
We offer you the opportunity to join one of the world's premier research centers in nanotechnology at its headquarters in Leuven, Belgium. With your talent, passion and expertise, you'll become part of a team that makes the impossible possible. Together, we shape the technology that will determine the society of tomorrow.
We are committed to being an inclusive employer and proud of our open, multicultural, and informal working environment with ample possibilities to take initiative and show responsibility. We commit to supporting and guiding you in this process; not only with words but also with tangible actions. Through imec.academy, 'our corporate university', we actively invest in your development to further your technical and personal growth.
We are aware that your valuable contribution makes imec a top player in its field. Your energy and commitment are therefore appreciated by means of a competitive salary with many fringe benefits.

Who you are
This postdoctoral position is funded by imec through KU Leuven. Because of the specific financing statute which targets international mobility for postdocs, only candidates who did not stay or work/study in Belgium for more than 24 months in the past 3 years can be considered for the position (short stays such as holiday, participation in conferences, etc. are not taken into account).