Available Thesis Topics

When applying for a thesis topic, follow the procedure described here and CC the advisor(s) in your email.

Note: Some topics are marked as “B.Sc.” level, but we might make an extended version for M.Sc. students if you are really interested.


Object storage services such as Amazon S3 use a REST API for accessing data. This means that the objects are never seen by the operating system as cacheable objects, e.g., in the page cache.

Goal of this thesis: Design an operating system component that transparently detects accesses to an object storage service and locally caches objects for future use. There are multiple challenges to solve:

  • Intercept REST calls (through a library or in the kernel)
  • Implement the local cache (either in user space or in the kernel)
  • Ensure the coherence of the data: when an object is modified in the object storage, the cached object must be invalidated
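
The local-cache side of the problem can be sketched in user space as a small store keyed by object name. The following is a minimal sketch under stated assumptions: the cache directory `/tmp/objcache` and the helpers `cache_store`/`cache_lookup` are illustrative names, and a real design would also handle eviction and the invalidation listed above.

```c
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

#define CACHE_DIR "/tmp/objcache"   /* hypothetical cache location */

/* Map an object key (e.g. "bucket/key") to a local file path.
 * '/' is replaced so the whole key stays in one directory. */
static void cache_path(const char *key, char *out, size_t len)
{
    snprintf(out, len, "%s/", CACHE_DIR);
    size_t off = strlen(out);
    for (size_t i = 0; key[i] && off + 1 < len; i++)
        out[off++] = (key[i] == '/') ? '_' : key[i];
    out[off] = '\0';
}

/* Store a fetched object body in the cache; returns 0 on success. */
int cache_store(const char *key, const void *data, size_t len)
{
    char path[512];
    mkdir(CACHE_DIR, 0700);          /* may already exist; that is fine */
    cache_path(key, path, sizeof path);
    FILE *f = fopen(path, "wb");
    if (!f)
        return -1;
    size_t n = fwrite(data, 1, len, f);
    fclose(f);
    return n == len ? 0 : -1;
}

/* Look up an object; returns the number of bytes read, or -1 on a miss. */
long cache_lookup(const char *key, void *buf, size_t len)
{
    char path[512];
    cache_path(key, path, sizeof path);
    FILE *f = fopen(path, "rb");
    if (!f)
        return -1;
    long n = (long)fread(buf, 1, len, f);
    fclose(f);
    return n;
}
```

An interception layer (e.g. an LD_PRELOAD shim or a kernel hook) would call `cache_lookup` before issuing the REST request and `cache_store` after a miss.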

Target: B.Sc./M.Sc. students

Prerequisites:

  • Programming language: C

Advisor: Jérôme Coquisart

In a NUMA system, there is a single copy of the kernel text, shared by all nodes. This causes latency overhead when userspace applications enter the kernel, because the kernel text may be located on another node. A solution has been proposed on the Linux Kernel Mailing List to tackle this problem: replicating the kernel text across all NUMA nodes.

Goal of this thesis: In this thesis, you’ll work directly inside the Linux Kernel. You will apply and test a provided kernel patchset. You’ll use virtualisation tools and a real NUMA server to execute and benchmark the new kernel. Based on your observations, you will propose improvements to the solution.

  • Evaluate the patch with a set of defined benchmarks (or develop new benchmarks)
  • Find ways to extend and improve the patch

Target: B.Sc./M.Sc. students

Prerequisites:

  • Programming language: C
  • Linux Kernel Programming

Advisor: Jérôme Coquisart

When I/O-intensive applications run on Linux, the system page cache can fill up within seconds, leading to high resource usage for page reclamation. Linux kernel developers recently implemented a new feature called Uncached Buffered I/O, which uses the page cache for I/O but does not keep the pages in memory after the I/O has completed. This results in approximately a 65% performance improvement, but the option has to be set manually by application developers.
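
The feature is exposed to applications through the per-call RWF_DONTCACHE flag of preadv2()/pwritev2() (merged around Linux 6.14; older uapi headers may not define it, and the patch series originally called it RWF_UNCACHED). Below is a minimal sketch of a read helper with a fallback for older kernels; `read_uncached` is an illustrative name, not an existing API.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <sys/uio.h>
#include <unistd.h>

/* RWF_DONTCACHE is recent; older headers may lack it. The value below
 * is assumed to match the kernel's uapi definition. */
#ifndef RWF_DONTCACHE
#define RWF_DONTCACHE 0x00000080
#endif

/* Read `len` bytes at `off` through the page cache, dropping the pages
 * once the read completes. Falls back to a normal buffered read on
 * kernels or filesystems that do not support the flag. */
ssize_t read_uncached(int fd, void *buf, size_t len, off_t off)
{
    struct iovec iov = { .iov_base = buf, .iov_len = len };
    ssize_t n = preadv2(fd, &iov, 1, off, RWF_DONTCACHE);
    if (n < 0 && (errno == EOPNOTSUPP || errno == EINVAL))
        n = preadv2(fd, &iov, 1, off, 0);   /* plain buffered read */
    return n;
}
```

A dynamic policy, as targeted by this thesis, would choose between flags 0 and RWF_DONTCACHE at runtime based on the observed page-cache and reclamation state.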

Goal of this thesis: In this thesis, you will trigger Uncached Buffered I/O automatically depending on the state of the system. You will detect when the system’s page cache is full and when page reclamation consumes excessive resources, then dynamically switch from standard page-cache-based I/O to Uncached Buffered I/O. Finally, you will evaluate your solution.

Target: B.Sc./M.Sc. students

Prerequisites:

  • Programming language: C
  • Linux Kernel Programming

Advisor: Jérôme Coquisart

Most operating systems provide a page cache that stores data accessed from storage devices in memory for faster future access. The page cache is an important component for the performance of IO-intensive applications. However, on modern systems, not all memory is equal: depending on where the page cache is located, performance can vary. This is the case on NUMA (Non-Uniform Memory Access) systems, where accesses to the local NUMA node are faster than accesses to remote NUMA nodes.

Goals of this thesis: In the context of an ongoing project in the group, where a NUMA-aware page cache is being developed, you will propose and implement an evaluation framework for this system. You will design and implement profiling tools to measure the performance, memory usage, and access locality of the page cache, as well as micro-benchmarks to evaluate specific components and features of the memory subsystem.
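
As one building block for such micro-benchmarks, per-access memory read latency can be timed with clock_gettime(); on a NUMA machine, pinning the thread and the buffer to different nodes (e.g. with numactl) would expose the locality gap. A minimal sketch, with an illustrative function name:

```c
#define _POSIX_C_SOURCE 199309L
#include <stdlib.h>
#include <time.h>

/* Time `iters` pseudo-random reads over a `buf_size`-byte buffer and
 * return the average latency per access in nanoseconds. */
double avg_read_latency_ns(size_t buf_size, long iters)
{
    volatile char *buf = malloc(buf_size);
    if (!buf)
        return -1.0;
    for (size_t i = 0; i < buf_size; i++)   /* fault all pages in */
        buf[i] = (char)i;

    struct timespec t0, t1;
    unsigned long x = 12345;
    volatile char sink = 0;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (long i = 0; i < iters; i++) {
        x = x * 6364136223846793005UL + 1;  /* cheap PRNG for the offset */
        sink ^= buf[(x >> 16) % buf_size];
    }
    clock_gettime(CLOCK_MONOTONIC, &t1);
    (void)sink;
    free((void *)buf);

    double ns = (t1.tv_sec - t0.tv_sec) * 1e9 + (t1.tv_nsec - t0.tv_nsec);
    return ns / iters;
}
```

Measuring the page cache itself would replace the malloc'd buffer with file-backed reads, but the timing skeleton stays the same.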

Target: B.Sc. students

Prerequisites:

  • Programming language: C

Advisor: Jérôme Coquisart

Most operating systems provide a page cache that stores data accessed from storage devices in memory for faster future access. The page cache is an important component in terms of performance for IO-intensive applications. However, the utilization of the page cache is not optimal in a virtualized environment:

  • Different VMs could be caching the same data, duplicating cached pages across VMs.
  • The host and the guest could be caching the same data, resulting in double-caching.
  • The data cached by a VM cannot be reclaimed by the host operating system in case of memory pressure.

Goals of this thesis: Find a way to bypass the guest’s page cache, so that only the host system caches data from the disk, and compare this solution with different caching strategies.
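
One common way to bypass a page cache from inside the guest is to open files with O_DIRECT, which moves data directly between the device and user buffers. A minimal sketch, with illustrative helper names and an assumed 512-byte alignment requirement; not every filesystem supports direct I/O (tmpfs, for instance, may reject it), so the sketch falls back to a buffered open:

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/* Open `path` for reading with O_DIRECT so reads bypass the (guest)
 * page cache. Falls back to a normal open if the filesystem does not
 * support direct I/O. */
int open_nocache(const char *path)
{
    int fd = open(path, O_RDONLY | O_DIRECT);
    if (fd < 0 && errno == EINVAL)
        fd = open(path, O_RDONLY);   /* fallback: buffered I/O */
    return fd;
}

/* O_DIRECT requires the buffer, offset, and length to be aligned to the
 * device's logical block size; 512 bytes is a common safe value. */
void *alloc_aligned(size_t len)
{
    void *buf = NULL;
    if (posix_memalign(&buf, 512, len) != 0)
        return NULL;
    return buf;
}
```

The thesis would compare such guest-side bypassing against host-side caching strategies (e.g. different cache modes of the hypervisor).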

Target: B.Sc./M.Sc. students

Prerequisites:

  • Programming language: C

Advisor: Jérôme Coquisart

In the last few years, the dominance of the x86 CPU architecture has been challenged by ARM and RISC-V. In this context, executing legacy applications is not trivial: binaries are not compatible across architectures, and source code is not always available to recompile programs. One notable technique for executing these legacy programs is binary translation, which directly translates binary instructions from one architecture to another.

Goals of the thesis: In the context of an ongoing project where a hybrid binary translator is being implemented, you will add support for new x86 instructions (translating their behaviour correctly into our intermediate representation), ensuring proper testing of your code. You will also optimise the code generated for already supported instructions.

Target: B.Sc. students

Prerequisites:

  • Programming language: C++

Advisor: Redha Gouicem

In the last few years, ARM-based CPUs have greatly improved in terms of performance, challenging x86’s dominance while also offering better efficiency. However, legacy applications need to be ported to ARM in order to work properly for end users. To avoid this work, one can use an emulator to execute x86 binaries on an ARM system.

Goals of the thesis: You will study the design of various x86-to-ARM emulators (QEMU, FEX, box64) and evaluate their performance. In addition to this, you will also study the quality of the translated code and optimisation techniques, and compare them.

Target: B.Sc./M.Sc. students

Prerequisites:

  • Previous knowledge of emulation/computer architecture is appreciated
  • Previous experience in reverse engineering is appreciated

Advisor: Redha Gouicem

SSD caching is a cost-efficient solution to design high-performance data storage systems. By employing SSDs as the caching layer for HDD-based main storage, users can benefit from lower response times, thanks to SSDs’ higher performance. However, flash-based SSDs suffer from limited endurance, leading to the need for lifetime-aware caching schemes. Recent studies propose ML-based solutions to address the challenges of SSD caching, and this thesis aims to explore the effectiveness of ML-based solutions.

Goal & Steps: In this thesis, you will investigate the impact of recent ML-based solutions from the Quality of Service (QoS) perspective:

  • Investigating the capabilities of a publicly available I/O caching engine in terms of admission & eviction policies
  • Reviewing recent studies focused on QoS-aware I/O caching
  • Benchmarking the state-of-the-art ML-based I/O caching solutions & comparing them against heuristic solutions from a QoS point-of-view
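
As a baseline for the admission & eviction study, a heuristic policy such as LRU is easy to state precisely. Below is a minimal sketch of a tiny fixed-capacity LRU over block IDs, with illustrative names; production caching engines track recency with linked lists or clock bits rather than shifting an array:

```c
#include <string.h>

#define LRU_WAYS 4   /* illustrative tiny capacity */

/* Entries are kept in recency order; index 0 is the most recent. */
struct lru {
    long ids[LRU_WAYS];
    int used;
};

void lru_init(struct lru *c) { c->used = 0; }

/* Record an access to block `id`. Returns 1 on a hit, 0 on a miss;
 * on a miss with a full cache, the least-recently used entry is evicted. */
int lru_access(struct lru *c, long id)
{
    int pos = -1;
    for (int i = 0; i < c->used; i++)
        if (c->ids[i] == id) { pos = i; break; }
    int hit = (pos >= 0);
    if (!hit) {
        if (c->used < LRU_WAYS)
            pos = c->used++;          /* fill an empty way */
        else
            pos = LRU_WAYS - 1;       /* evict the least-recently used */
    }
    /* Shift entries [0, pos) down one slot and put `id` at the front. */
    memmove(&c->ids[1], &c->ids[0], (size_t)pos * sizeof(long));
    c->ids[0] = id;
    return hit;
}
```

An ML-based policy would replace the fixed "evict the last slot" rule with a learned decision, which is exactly the comparison this thesis targets.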

Target: B.Sc.

Prerequisites:

  • Proficiency in C/C++ programming
  • Familiarity with machine learning concepts (reinforcement learning, decision trees)
  • Problem solving & research capability

Advisor: Mostafa Hadizadeh

Flash-based SSDs offer considerably higher performance compared to conventional HDDs, thanks to their non-mechanical design. However, SSDs face several challenges due to intrinsic drawbacks of flash memory technology, such as limited lifetime and the disparity between read and write latency. To address these challenges, SSDs are equipped with several architectural features and numerous management techniques, such as internal buffers, garbage collection, and the flash translation layer.

Goal & Steps: In this thesis, you will explore recent optimizations in SSD design and evaluate their impact on overall performance and lifetime:

  • Exploring industrial standards & designs, with a focus on garbage collection and buffer management
  • Setting up an SSD emulator & analyzing the impact of state-of-the-art buffering algorithms on lifetime and performance
  • Implementing recent garbage collection algorithms and analyzing them with comprehensive benchmarking

Target: B.Sc.

Prerequisites:

  • Proficiency in C/C++ programming
  • Problem solving & research capability

Advisor: Mostafa Hadizadeh

Computational Storage Devices (CSDs) are emerging devices that aim to improve the performance of computer systems by reducing data movement between the storage and computation subsystems. Contrary to the von Neumann paradigm, where data is delivered to the central processing unit for computation, recent computational paradigms such as in-memory processing and computational storage take an alternative approach. By offloading part of the computation to the CSD, the system avoids extra data movement and, accordingly, achieves higher performance.

Goal & Steps: In this thesis, you will explore CSD and the opportunities provided by this new paradigm:

  • Setting up a CSD emulation/simulation framework: 1) exploring existing frameworks, 2) evaluating them in terms of compliance with standards such as SNIA and NVMe
  • Literature review with a focus on application domains that suit the CSD paradigm
  • Evaluating the impact of advanced compression algorithms on overall application performance and device lifetime

Target: M.Sc.

Prerequisites:

  • Proficiency in C/C++ programming
  • Problem solving & research capability

Advisor: Mostafa Hadizadeh