Curriculum Vitae
Professional experience
Chief Architect | Advanced Computing | HiSilicon Turing Department, Huawei Technologies, Germany / January 2021 — Present
- Improve the Arm and AI European ecosystem (weather simulation, life sciences and computer-aided engineering) and contribute to next-generation SoC and low-level software technologies
- Combine HPC and AI technologies seamlessly into one common platform
- Research and development of graph neural network (GNN) models for Huawei’s artificial intelligence portfolio
- Analyse hardware and software architectural optimisations for sparse linear algebra
- Develop SVE-enabled libraries and typical European HPC application benchmarks
- Represent Huawei in standardisation and industry activities, including EU-level strategy-shaping bodies such as the European Technology Platform for HPC (ETP4HPC)
- Drive technical academia engagement programs and cooperation projects
- Hire additional staff for the HiSilicon advanced computing team, including screening candidate profiles, supporting interviews, refining job descriptions and coordinating with recruiters
Chief Architect | HiSilicon Turing Department, Huawei Technologies, Germany / January 2019 — Present
- Enable the Arm for HPC / Arm+AI ecosystem with cooperation partners to foster the use of HiSilicon solutions, and gather requirements for next-generation server systems
- Support HQ HiSilicon R&D team in designing efficient SoC architectures for advanced computing and autonomous driving
- Represent Huawei in standardisation and industry activities, including EU-level strategy-shaping bodies such as the European Technology Platform for HPC (ETP4HPC)
- Drive technical academia engagement programs and cooperation projects
- Hire for the HiSilicon advanced computing team, including screening candidate profiles, supporting interviews, refining job descriptions and coordinating with recruiters
Chief Architect | Central Hardware Department, Huawei Technologies, Germany / January 2015 — December 2019
- Led and hired a team of >7 experts with a focus on computation (micro-architecture), interconnect and memory technologies
- Successfully orchestrated and organised a project charter for an Arm-based advanced computing prototype, including the definition of project scope, key technologies and milestones
- Defined architectural elements and improvements for Arm-based computing systems in the areas of computation, memory system, storage, switching and interconnect
- Identified and researched new key technologies that contributed to Huawei’s system strategies
- Facilitated the architectural concept for in-memory processing technologies for use in Arm-based systems
- Collaborated with three significant research and data centres in Europe to evaluate Arm-based systems and carry out experimental hardware co-design
- Established partnerships for Horizon 2020 projects on advanced computing
- Principal representative to the Cache Coherent Interconnect for Accelerators (CCIX) standard, implemented in HiSilicon Kunpeng 920 SoC
- Defined high-impact libraries and performance tools for the Arm architecture, generated requirements and executed performance optimisation
- Contributed to the incubation of a team focusing on autonomous driving
Senior Engineer | Systems Optimisations Competency Center, IBM Research & Development, Germany / January 2014 — December 2014
- Led the technical and performance team for the SAP HANA in-memory, column-oriented, relational database management system on the IBM POWER architecture
- Ensured that the performance result is on par with competitive hardware configurations
- Provided support to achieve performance objectives by leading POWER-specific code development and executing performance evaluations
- Analysed utilisation of hardware resources (memory bandwidth, threads, cores and sockets) during intensive scaling tests
- Defined the scale-out and scale-up system architecture and executed the corresponding performance measurements
- Investigated hot functions at the micro-architecture level
- Improved vector code coverage in SAP HANA by 6%, leading to 15% more performance
Senior Engineer | Blue Gene Active Storage, IBM Research & Development, Germany / July 2011 — December 2013
- Led the technical engineering team of >8 people, which was responsible for the development and delivery of the Blue Gene Active Storage (BGAS) architecture
- Successfully contributed to the BGAS architecture to achieve a balanced integration of solid-state storage, computation and a cost-scalable network
- Leveraged Blue Gene/Q as a vehicle for rapid prototyping for active storage concepts
- Executed proofs-of-concept on computing-in-storage for applications in neuroscience and middleware software packages including GPFS and DB2
- Responsible for the architecture and development of software packages (including peripheral image, device driver, FPGA image and middleware frameworks) for a hybrid scalable solid-state storage device, which targeted research explorations
- Decomposed acceleration functions for industrial and scientific application scenarios
- Responsible for the development of a software-based RDMA network interface controller
- Coordinated research engagements with three customers in Germany, Switzerland and the United Kingdom
- Mentored bachelor and master students
Senior Engineer | Blue Gene/Q, IBM Research & Development, Germany / April 2009 — June 2011
- Led a global PCI Express verification and performance team for the Blue Gene/Q ASIC, completing ahead of schedule and within the promised performance targets
- Led and executed the hardware bring-up of the PCI Express core of the Blue Gene/Q ASIC
- Created the verification and performance plan and regularly reported status to executive management and customers
- Developed proxy applications to simulate parallel file-system traffic tunnelled via InfiniBand over PCI Express
- Created a hardware simulation environment to imitate the operation of the Blue Gene/Q ASIC in combination with PCI Express attached devices (including physical, link and transport layer)
- Implemented an automatic regression framework for I/O traffic on ASICs
Resident Engineer | Open Systems Development, IBM Research & Development, Germany / January 2008 — March 2009
- Led the bring-up of the QPACE project and coordinated >15 developers from industrial and academic partners
- Responsible for the architecture and the development of the firmware for the compute node of the QPACE project
- Developed the bring-up plan for the QPACE project
Resident Engineer | Open Systems Development, IBM Research & Development, Germany / September 2006 — December 2007
- Contributed to the firmware development of blade servers using the PowerPC 970 and Cell/B.E. processor
- Responsible for the PCI Express device discovery algorithm
- Ensured the hardware bring-up, compatibility and performance of PCI Express-based InfiniBand adapters
- Accountable for the PCI Express compliance testing with a focus on the physical layer, configuration space, link and transport layers and platform configuration
- Led the AbiCell and NICOLL projects
Staff Engineer | I/O Firmware Development, IBM Research & Development, Germany / September 2004 — August 2006
- Developed the Linux kernel framework and interrupt processing routines for the IBM System p InfiniBand and Ethernet device driver
- Ensured compatibility and performance towards upper-level protocols such as the Message Passing Interface (MPI) standard, Sockets Direct Protocol (SDP) and communications using TCP/IP over InfiniBand
- Coordinated the open-source and release process with the Linux kernel development community and Linux distributors
- Led the technical bring-up and development of a parallel cluster based on the PowerPC 970 processors and ultra-low latency InfiniBand network components
- Contributed to software optimisations and performance tests to make QPACE one of the most energy-efficient supercomputers of June 2010