HPDA – High Performance Data Analytics track

Master of computer science

Research project

During the master's program, students learn research by doing research. Over the two years, each student spends one to two days per week in a research group, working on research projects with professors and PhD students of IP Paris.

Research project schedule

When | Where | What
13/09/2022 at 10h00 | Telecom Paris (room 1A312) and online | Presentation of project proposals
20/09/2022 | | Project starts
09/02/2023 | TBD | Mid-project review (for M1 students), or final evaluation (for M2 students)
17/02/2023 | | Deadline for submitting the research report
22/06/2023 | TBD | Project evaluation (for M1 students)
30/06/2023 | | Deadline for submitting the final research report (for M1 students)

Project Evaluation

Research projects will be evaluated on a research report and a presentation.

Research report

The report is expected to be a 5- to 8-page research paper formatted in the IEEE conference style. It should present the context of the work, its contribution, and its positioning with respect to related work.

Both M1 and M2 students are expected to send their research report by email before Friday, 17/02/2023. M1 students are expected to update their report and resubmit it in June (deadline: 30/06/2023).

Project defense

The project defense is a 20-minute presentation of the research work followed by 5 to 10 minutes of questions. The presentation must explain the context of the work and the problem it addresses, and describe the contribution of the research project.

Proposed projects

The following project proposals may be shared among several master's tracks, including DataAI, HPDA, PDS, and Cybersecurity.

Id | Title | Advisor | Description | Student
1 | Forensic performance analysis | François Trahay | project description |
2 | High Performance Serverless Computing | François Trahay | project description | Etienne Devaux
3 | Memory allocation tracker | François Trahay | The goal of this project is to design a tool able to identify the memory objects that consume the most memory, and how they evolve/grow during the execution (e.g., average growth per second). |
4 | Dynamic Frequency Scaling on GPU for Deep Learning | François Trahay | Large-scale deep learning models that require thousands of GPU hours consume a lot of power. This is because they rely on power-hungry GPUs that each consume hundreds of watts. The goal of this project is to design and implement a dynamic frequency scaling runtime for GPUs. Such a runtime system would monitor the GPUs and adapt their frequency in order to reduce their energy consumption with a limited impact on the application performance. | Thomas Collignon
5 | FAASIN: Bringing UNIX pipes to serverless cloud function pipelines | Mathieu Bacou | project description | Sahar Boussoukaya, Mohamed Iyed El Baouab
6 | Identifying the source of Linux kernel scheduler regressions | Julia Lawall | project description |
7 | Frugal AI based Neural Decoding and Neurofeedback for Cognitive Training Acceleration | Van-Tam Nguyen | project description |
8 | Edge infrastructure reconfiguration of IoT systems using AI planning | Georgios Bouloukakis | project description | Ewa Turska
9 | Balancing Energy Consumption and Comfort in Smart Spaces using the IoT | Georgios Bouloukakis | project description | Dimitrije Panić
10 | Optical network simulator with cross-layer handling of optical functionalities. | Cédric Ware, Mounia Lourdiane | project description |
11 | Disaggregated memory management for a managed runtime | Gaël Thomas, Adam Chader, Mathieu Bacou, Yohan Pipereau | project description |
12 | Evaluation of a privacy-preserving embedded language for C/Intel SGX | Gaël Thomas, Subashiny Tanigassalame | project description |
13 | Towards a scalable multi-kernel for heterogeneous processors | Gaël Thomas, Mathieu Bacou | project description |
14 | Design and implementation of a hardened hypervisor | Gaël Thomas, Mathieu Bacou | project description |
15 | Scheduler-aware locks | Jean-Pierre Lozi | project description |
16 | State of the Art of micro-services/containers energy consumption | Chantal Taconet, Sophie Chabridon | project description |
17 | Performance Evaluation of in situ Applications through Simulation using SimGrid | Valentin Honoré | project description |
18 | Scaling Blockchain with Byzantine Leaderless State-Machine Replication | Pierre Sutra | project description |
19 | Embedded High Power Computing for Low Power Autonomous Robotics | Omar Hammami | project description |
20 | High Performance Computing Low Power accelerators for data analytics: Beating Energy Consumption While Still Doing the Job | Omar Hammami | project description |
21 | Parallel Algorithm and Implementation of graph processing for social networks | Omar Hammami | project description |
22 | Permissionless asset transfer with shared accounts | Petr Kuznetsov | Most modern asset transfer systems use consensus to maintain a totally ordered chain of transactions. It was recently shown that consensus is not always necessary for implementing asset transfer. More efficient, asynchronous solutions can be built using reliable broadcast instead of consensus. This approach was originally used in the closed (permissioned) setting. The goal of this project is to implement a permissionless asset transfer system that uses consensus only when necessary, based on our recent asynchronous proposal [1] for single-user accounts. [1] https://arxiv.org/abs/2105.04966 |
23 | Coding for communication-efficient broadcast | Petr Kuznetsov | The goal of the project is to explore the potential of different encoding strategies for implementing communication-efficient dissemination of information in distributed systems. This is part of a bigger project on weakly consistent data synchronization in large-scale systems, with applications to cryptocurrencies and blockchains. |
24 | Practical linearizability | Petr Kuznetsov | Linearizability [1] is the gold standard of consistency in distributed systems. In essence, a linearizable distributed implementation creates the illusion of an atomic object. Creating this illusion is notoriously costly. The goal of this project is to understand the benefits and downsides of relaxed notions of linearizability, which allow well-specified non-atomic behavior under contention. The idea is to start with the recently proposed "intermediate-value" linearizability, generalize it to generic lattice agreement data types [2,3], and see if it pays off in practice. [1] https://drops.dagstuhl.de/opus/volltexte/2020/13080/ [2] https://www.microsoft.com/en-us/research/wp-content/uploads/2016/02/Generalized20Lattice20Agreement20-20PODC12.pdf [3] https://drops.dagstuhl.de/opus/volltexte/2020/11817/pdf/LIPIcs-OPODIS-2019-31.pdf |
25 | Stochastic accountability in distributed systems | Petr Kuznetsov | There are two major ways to deal with failures in distributed computing: fault-tolerance and accountability. Fault-tolerance intends to anticipate failures by investing in replication and synchronization, so that the system’s correctness is not affected by faulty components. In contrast, accountability enables detecting failures a posteriori and raising undeniable evidence against faulty components. Instead of aiming to detect all observable failures [1,2], which might be extremely costly, we intend to explore the potential of stochastic accountability in generic distributed systems, already addressed in the networking context [3]. The approach is to randomly sample a subset of events in an execution with the goal of detecting faulty behavior. As a first step, we intend to focus on gossip-based broadcast algorithms [4] where a malicious source may "equivocate" in order to make correct processes disagree on the messages they deliver. [1] Andreas Haeberlen, Petr Kouznetsov, Peter Druschel: PeerReview: practical accountability for distributed systems. SOSP 2007: 175-188 [2] Andreas Haeberlen, Petr Kuznetsov: The Fault Detection Problem. OPODIS 2009: 99-114 [3] Kashyap Thimmaraju, Liron Schiff, Stefan Schmid: Preacher: Network Policy Checker for Adversarial Environments. SRDS 2019: 32-41 [4] Rachid Guerraoui, Petr Kuznetsov, Matteo Monti, Matej Pavlovic, Dragos-Adrian Seredinschi: Scalable Byzantine Reliable Broadcast. DISC 2019: 22:1-22:16 |
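
To give a feel for the sampling idea behind project 25, here is a toy Python sketch, not the method from the cited papers. It assumes a source that sends value "A" to most processes and a conflicting value "B" to a few "victims", and an auditor that compares the messages received by a random sample of processes. The function names, the message values, and the audit model are all hypothetical, chosen only for illustration; signatures and networking are left out.

```python
import random


def sample_detects_equivocation(received, sample_size, rng):
    """Audit a random sample of processes (sample_size >= 2 assumed).

    `received` maps each process id to the message it got from the source.
    Returns two conflicting (process, message) pairs if the sample exposes
    an equivocation, and None otherwise.
    """
    sampled = rng.sample(sorted(received), sample_size)
    first = sampled[0]
    for p in sampled[1:]:
        if received[p] != received[first]:
            return (first, received[first]), (p, received[p])
    return None


def estimate_detection_rate(n, n_victims, sample_size, trials, seed=0):
    """Estimate how often one random audit catches an equivocating source.

    In each trial the (hypothetical) source sends "B" to `n_victims`
    randomly chosen processes and "A" to the other n - n_victims.
    """
    rng = random.Random(seed)
    detected = 0
    for _ in range(trials):
        victims = set(rng.sample(range(n), n_victims))
        received = {p: ("B" if p in victims else "A") for p in range(n)}
        if sample_detects_equivocation(received, sample_size, rng) is not None:
            detected += 1
    return detected / trials


if __name__ == "__main__":
    # Larger samples cost more but expose equivocation more often.
    for k in (2, 5, 10, 20):
        rate = estimate_detection_rate(n=100, n_victims=10,
                                       sample_size=k, trials=2000)
        print(f"sample size {k:2d}: detection rate ~ {rate:.2f}")
```

In this toy model an honest source is never flagged, and auditing every process always catches a source that sent both values. The question the project raises is how small the sample can be while the detection probability stays acceptable.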