Research Seminar Informatica

contact info

Secretariat Faculty of Science
Open Universiteit
Valkenburgerweg 177
6419 AT Heerlen
Netherlands
 
phone: +31 (0)45 576 2877
email: secretariaat.bw@ou.nl
www: http://cs.ou.nl/
twitter: @ou_informatica

OUrsi stands for "Open University Research Seminar Informatica". The OUrsi is the research seminar of the Department of Computer Science of the Open University of the Netherlands. OUrsi talks typically take place on the last Tuesday of the month at 11:00 in Heerlen, and are normally streamed online for the benefit of interested parties elsewhere. The OUrsi features presentations on ongoing research in all areas of Computer Science.

If you are interested in giving a presentation, or attending one (live in Heerlen or online), please contact Daniel Stanley Tan. Below are listed the scheduled presentations, with the upcoming OUrsi highlighted.

Presentations

No. Date Presenter Title  

108. 23 July Stef Joosten Schema-changing data migrations abstract

Schema-changing data migrations
Stef Joosten

Once information systems are in production, they can last for decades. During that time, data migration is sometimes unavoidable. And what if that data migration also involves a changing database schema? Can you take those kinds of risks with production data?

OU colleague Stef Joosten has investigated this issue. To what extent can such data migrations be automated? In a scientific article, he and his son, Sebastiaan, have developed a theory that allows schema-changing data migrations (SCDMs) to be performed in a manageable way. In this meeting, he will explain how it works and give a short demonstration.

107. 2 July Rick Koenders Output-decomposed learning of Mealy Machines abstract

Output-decomposed learning of Mealy Machines
Rick Koenders

Model learning can be used to infer the behaviour of black-box systems. However, models of software systems can be very large, which makes them difficult to learn with existing algorithms. We present an active automata learning algorithm which learns a decomposition of a finite state machine, based on projecting onto individual outputs. When projecting the outputs to a smaller set, the model itself is reduced in size. By having several such projections, we do not lose any information and the full system can be reconstructed. Depending on the structure of the system, this reduces the number of queries drastically, as shown by a preliminary evaluation of the algorithm.
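The projection idea can be sketched concretely (a toy Python illustration, not the paper's algorithm; the example machine and projection maps are invented):

```python
# A Mealy machine maps (state, input) -> (next_state, output). A
# projection keeps the transitions but maps each output through h,
# shrinking the output alphabet; jointly injective projections lose
# no information, so the full machine can be reconstructed.

def project(machine, h):
    """Return a copy of `machine` with outputs mapped through h."""
    return {k: (s, h(o)) for k, (s, o) in machine.items()}

def run(machine, start, word):
    """Execute an input word, collecting the produced outputs."""
    state, outs = start, []
    for a in word:
        state, o = machine[(state, a)]
        outs.append(o)
    return outs

# Original machine over outputs {0, 1, 2, 3}.
m = {
    ("q0", "a"): ("q1", 1), ("q0", "b"): ("q0", 2),
    ("q1", "a"): ("q0", 3), ("q1", "b"): ("q1", 0),
}

# Two projections: low bit and high bit of each output. Each component
# is smaller than m, yet together they determine the original outputs.
lo = project(m, lambda o: o & 1)
hi = project(m, lambda o: o >> 1)

word = ["a", "b", "a", "b", "a"]
recombined = [h * 2 + l for h, l in zip(run(hi, "q0", word),
                                        run(lo, "q0", word))]
assert recombined == run(m, "q0", word)
```

Each projected machine can be learned independently and may have far fewer states than the original, which is where the query savings come from.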
106. 25 June Ashish Sai Advances in blockchain energy and environmental footprint research abstract

Advances in blockchain energy and environmental footprint research
Ashish Sai

Interest in the energy and environmental impact of cryptocurrencies like Bitcoin and Ethereum, and alternatives like proof-of-stake (PoS) and proof-of-spacetime (PoSt), is growing, yet existing studies often inaccurately estimate energy consumption due to flawed methodologies and assumptions. This work evaluates previous models' reliability by leveraging literature from social energy sciences and information systems. We conduct a systematic review, propose a quality assessment framework, and introduce refined models for PoS-based Ethereum and PoSt-based Filecoin, improving data quality, transparency, and documentation. Our goal is to enhance scientific rigor in blockchain energy studies, offering a code of conduct for research methodologies to standardize the assessment of blockchain systems' energy and environmental impacts.
105. 11 June Jesse Heyninck LogicLM: Combining Logic Programs with Language Models abstract

LogicLM: Combining Logic Programs with Language Models
Jesse Heyninck

In this talk I'll present the project proposal LogicLM (joint work with Stefano Bromuri).

(Large) language models ((L)LMs) like ChatGPT are becoming frontier technologies, but their reasoning is opaque and unreliable. The goal of this project is to combine LLMs with symbolic AI, ensuring verifiable and transparent reasoning. This is done by fine-tuning an LM to convert natural language into formal logic and vice versa, and by integrating this LM with LM-based approaches for information retrieval and generation. This results in an architecture that takes advantage of the strengths of LMs while bypassing their weaknesses by leaning on symbolic AI.

104. 28 May Jos Craaijo libLISA: Instruction Discovery and Analysis on x86-64 abstract slides

libLISA: Instruction Discovery and Analysis on x86-64
Jos Craaijo

Even though heavily researched, a full formal model of the x86-64 instruction set is still not available. We present libLISA, a tool for automated discovery and analysis of the ISA of a CPU. This produces the most extensive formal x86-64 model to date, with over 118000 different instruction groups. The process requires as little human specification as possible: specifically, we do not rely on a human-written (dis)assembler to dictate which instructions are executable on a given CPU, or what their in- and outputs are.

The generated model is CPU-specific: behavior that is "undefined" is synthesized for the current machine. Producing models for five different x86-64 machines, we mutually compare them, discover undocumented instructions, and generate instruction sequences that are CPU-specific. Experimental evaluation shows that we enumerate virtually all instructions within scope, that the instructions' semantics are correct w.r.t. existing work, and that we improve existing work by exposing bugs in their handwritten models.

103. 14 May Daniel Stanley Tan ConCoNet: Class-agnostic counting with positive and negative exemplars abstract slides

ConCoNet: Class-agnostic counting with positive and negative exemplars
Daniel Stanley Tan

Class-agnostic counting is usually phrased as a matching problem between a user-defined exemplar patch and a query image. The count is derived based on the number of objects similar to the exemplar patch. However, defining a target class using only positive exemplar patches inevitably miscounts unintended objects that are visually alike to the exemplar. In this paper, we propose to include negative exemplars that define what not to count. This allows the model to calibrate its notion of what is similar based on both positive and negative exemplars. It effectively disentangles visually similar negatives, leading to a more discriminative definition of the target object. We designed our method such that it can be incorporated with other class-agnostic counting models. Moreover, application-wise, our model can be used into a semi-automatic labeling tool to simplify the job of the annotator.
102. 16 April Stijn de Gouw Analysis of Java's BitSet library

101. Nico Naus Translation validation for decompilation abstract

Translation validation for decompilation
Nico Naus

Translation of code is essential to all that we do as computer scientists. When you compile code, source code is translated multiple times before being turned into a binary file. It is obviously very important that these translations are correct; otherwise the meaning of a program changes after translation. In this talk, we will take a look at a way of proving these kinds of translations correct. As a use case, we will look at the translation from x86-64 binaries to an intermediate representation. This translation is one of the very first steps in decompilation, a process that can be seen as the (partial) inverse of compilation. In doing so, we will discuss how we assign meaning to a program in the first place, what is meant by a "correct translation", and finally how we can show this correctness. The context of disassembly adds a few challenges, which will also be addressed.

Since this is OUrsi 101, it was suggested I should keep the talk entry level, so no prior knowledge of any of the above will be required!

100. 5 March Luc Edixhoven Expressive specification and verification of choreographies abstract slides

Expressive specification and verification of choreographies
Luc Edixhoven

Choreographies describe possible sequences of interactions among a set of agents. They can be used to specify and analyse these sequences, and they guarantee certain safety properties by construction. As a drawback, choreographic languages typically have limitations on their expressiveness, which limits their applicability. This talk will give a high-level overview of our results in raising the expressiveness of choreographies. It was initially prepared as a 20-minute talk for PROMIS in early November 2023; in the time remaining, I can delve further into whichever of the presented topics the audience deems most interesting.
99. 20 February Hugo Jonker Fighting scientific fraud by looking at people, not papers abstract slides

Fighting scientific fraud by looking at people, not papers
Hugo Jonker

Incentives in science push scientists to publish more and be cited more often. Some scientists use nefarious means to increase their publication metrics, for example by abusing their power as reviewers or editors to solicit references or even authorships. Most traditional approaches to fighting scientific fraud focus on identifying malfeasance in the contents: plagiarism, perturbed images, etc. Instead, we propose to look for identifying characteristics of the person committing fraud, not at the contents.
This opens up a new way to help fight scientific fraud, taking a more generic view of the process. On the one hand this entails that concrete evidence of specific cases of fraud is out of scope; on the other hand, the new approach offers a way to investigate people, not papers.
98. 6 February Bea Waelbers (OU, Faculty of Management) What the Ear Cannot Hear: Predicting Customer Satisfaction in Voice-To-Voice Service Interactions From Text and Audio Signals abstract slides

What the Ear Cannot Hear: Predicting Customer Satisfaction in Voice-To-Voice Service Interactions From Text and Audio Signals
Bea Waelbers (OU, Faculty of Management)

In today's business environment centered around customer satisfaction, our interdisciplinary research pioneers the exploration of predicting satisfaction by analyzing voice-to-voice service interactions using machine learning. Integrating textual and auditory signals from over 1,100 real service interactions, our study breaks from the tradition of text-focused analyses and recognizes the crucial role of auditory features like tone and intonation. This work not only advances the methodology for predicting customer satisfaction but also serves as a vital bridge between consumer psychology, service management, and machine learning, offering practical strategies for service firms to enhance customer interactions and informed decision-making.
97. 23 January Tom Dooney cDVGAN: Improved Conditional GANs for Generalized Gravitational Wave Transient Generation abstract slides

cDVGAN: Improved Conditional GANs for Generalized Gravitational Wave Transient Generation
Tom Dooney

We present Conditional Derivative GAN (cDVGAN), a novel conditional GAN framework for simulating multiple classes of gravitational wave (GW) transients in the time domain. cDVGAN can also generate generalized hybrid waveforms that span the variation between waveform classes through class-interpolation in the conditioned class vector. cDVGAN transforms the typical 2-player adversarial game of GANs into a 3-player game with an auxiliary critic analyzing the derivatives of time series signals. Our results show that this provides synthetic data that better captures the features of the original waveforms. cDVGAN conditions on three waveform classes, two preprocessed from LIGO blip and tomte glitch classes from its 3rd observing run (O3), and the third representing binary black hole (BBH) mergers. Our proposed cDVGAN outperforms 4 different baseline GAN models in replicating the features of the three waveform classes. Specifically, our experiments show that training convolutional neural networks (CNNs) with our cDVGAN generated data improves the detection of waveforms embedded in detector noise beyond the synthetic data from other state-of-the-art GAN models. Our best synthetic dataset yields as much as a 7% increase in area-under-the-curve (AUC) performance compared to the next best synthetic dataset from baseline GANs. Moreover, training the CNN with hybrid waveforms from our cDVGAN outperforms CNNs trained only on the standard waveform classes when identifying real signals embedded in LIGO detector background at signal-to-noise ratios (SNRs) ranging from 1 to 16. Lastly, we illustrate cDVGAN as a viable data augmentation method that can surpass traditional data augmentation approaches.
 

2023

96. 19 December Ali Ozbakir Unlocking Knowledge: Building a Multi-Document Q&A Bot with LLM, Embeddings, and Conversational Memory abstract slides

Unlocking Knowledge: Building a Multi-Document Q&A Bot with LLM, Embeddings, and Conversational Memory
Ali Ozbakir

Conducting a comprehensive literature review in research domains often involves navigating an extensive corpus of PDF documents, leading to challenges in information retrieval and synthesis. Contradictions and nuanced perspectives within various sources can complicate the process, demanding a more efficient and interactive approach.
In this initial attempt, I explore a Large Language Model solution that integrates LangChain, OpenAI, and ChromaDB to streamline literature review workflows. Acknowledging token limitations, I address bottlenecks by embedding multiple PDF documents into queries, employing techniques like splitting and chunking. To facilitate this, a Python script processes the research-oriented documents, extracting their text and splitting the content into manageable, partially overlapping chunks. These chunks are then embedded using OpenAI's embedding model ("text-embedding-ada-002") and stored in a vector database for persistence, creating a foundation for real-time interaction and dynamic information synthesis within the research corpus.
Despite token limitations, this retrieval augmented approach demonstrates promising strides in enhancing the efficiency and effectiveness of literature reviews in research applications. By integrating multiple PDF documents into queries and employing advanced chunking and embedding techniques, researchers can better manage, understand, and synthesize information from diverse sources, paving the way for further advancements in scientific research communication.
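The splitting-and-chunking step described above can be sketched as follows (a minimal illustration; the chunk size and overlap are invented, and the embedding and vector-database calls are omitted so the snippet stays self-contained):

```python
def chunk_text(text, size=200, overlap=50):
    """Split `text` into chunks of at most `size` characters, with
    `overlap` characters shared between consecutive chunks so that
    sentences cut at a boundary survive into the next chunk."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than the chunk size")
    step = size - overlap
    return [text[start:start + size]
            for start in range(0, max(len(text) - overlap, 1), step)]

# Stand-in for text extracted from a PDF document.
document = "".join(str(i % 10) for i in range(500))
chunks = chunk_text(document)
# Each chunk would then be embedded (e.g. with "text-embedding-ada-002")
# and written to the vector database for later retrieval.
```

The overlap trades a little storage for robustness: a fact straddling a chunk boundary still appears whole in at least one chunk.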
95. 12 December Clara Maathuis, Stefano Bromuri, Greg Alpár Evidence-based Software Engineering with LLMs: a case study about inclusivity abstract slides

Evidence-based Software Engineering with LLMs: a case study about inclusivity
Clara Maathuis, Stefano Bromuri, Greg Alpár

Software affects nearly all aspects of our lives. Software engineering, encompassing the entire software lifecycle from conception through design and development to maintenance and continuous improvement, is significantly influenced by emerging technologies such as large language models (LLMs) and generative artificial intelligence (GenAI), often epitomized by ChatGPT. These advancements have the potential to substantially change various aspects of the software development process. Given their impact, there is a growing consideration within the industry to delegate tasks to LLMs and GenAI rather than relying solely on human experts. However, the reliance on these tools raises concerns about addressing historical biases embedded in their training data. Our research focuses on assessing the impact of LLMs on software engineering, particularly emphasizing inclusivity. More specifically, we explore how dyslexia affects learning to read with software assistance. Leveraging literature reviews and comprehensive LLM prompt experiments, we share our current findings and invite audience participation to advance our research agenda. We extend our invitation to this OUrsi presentation not only to computer scientists, artificial intelligence researchers, and software engineers but also to experts in psychology and other fields interested in dyslexia and various learning difficulties and disabilities.
94. 21 November, 11:30 - 12:30 Natasha Alechina Reinforcement learning to logical specifications abstract slides

Reinforcement learning to logical specifications
Natasha Alechina

I will talk about an approach to safe reinforcement learning, which we call Pure Past Action Masking (PPAM). In PPAM, actions are disallowed ("masked") according to specifications expressed in Pure-Past Linear Temporal Logic (PPLTL). PPAM can enforce non-Markovian constraints, i.e., constraints based on the history of the system, rather than just the current state of the (possibly hidden) MDP. The features used in the safety constraint need not be the same as those used by the learning agent, allowing a clear separation of concerns between the safety constraints and reward specifications of the (learning) agent. We prove that an agent trained with PPAM can learn any optimal policy that satisfies the safety constraints, and that PPAMs are as expressive as shields, another approach to enforcing non-Markovian constraints in RL. This talk is based on joint work with Giovanni Varricchione, Giuseppe De Giacomo, Mehdi Dastani, Brian Logan, and Giuseppe Perelli.
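The masking mechanism can be sketched in miniature (an illustrative toy, not the paper's PPLTL machinery; the constraint here is a hand-written pure-past predicate):

```python
# A pure-past constraint is a predicate over the history of actions
# taken so far; before each step, any action whose extended history
# would violate the constraint is masked out, so the learner only ever
# sees safe choices.

def no_repeat(history, action):
    """Pure-past constraint: never execute the same action twice in a row."""
    return not history or history[-1] != action

def allowed(actions, history, constraint):
    """Return the subset of actions the mask permits in this step."""
    return [a for a in actions if constraint(history, a)]

history = []
for _ in range(3):
    mask = allowed(["left", "right"], history, no_repeat)
    action = mask[0]  # a learning agent would choose among `mask`
    history.append(action)
# history alternates: no action ever repeats twice in a row
```

Note the constraint inspects only the past, never the future, which is exactly what makes it enforceable online during training.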
93. 7 November Gideon Mailette de Buy Wenniger MultiSChuBERT: Effective Multimodal Fusion for Scholarly Document Quality Prediction abstract slides

MultiSChuBERT: Effective Multimodal Fusion for Scholarly Document Quality Prediction
Gideon Mailette de Buy Wenniger

Automatic assessment of the quality of scholarly documents is a difficult task with high potential impact. Multimodality, in particular the addition of visual information next to text, has been shown to improve the performance on scholarly document quality prediction (SDQP) tasks. We propose the multimodal predictive model MultiSChuBERT. It combines a textual model based on chunking full paper text and aggregating computed BERT chunk-encodings (SChuBERT), with a visual model based on Inception V3. Our work contributes to the current state-of-the-art in SDQP in three ways. First, we show that the method of combining visual and textual embeddings can substantially influence the results. Second, we demonstrate that gradual unfreezing of the weights of the visual sub-model reduces its tendency to overfit the data, improving results. Third, we show the retained benefit of multimodality when replacing standard BERTBASE embeddings with more recent state-of-the-art text embedding models.
Using BERTBASE embeddings, on the (log) number of citations prediction task with the ACL-BiblioMetry dataset, our MultiSChuBERT (text+visual) model obtains an R2 score of 0.454 compared to 0.432 for the SChuBERT (text only) model. Similar improvements are obtained on the PeerRead accept/reject prediction task. In our experiments using SciBERT, scincl, SPECTER and SPECTER2.0 embeddings, we show that each of these tailored embeddings adds further improvements over the standard BERTBASE embeddings, with the SPECTER2.0 embeddings performing best.

92. 23 October Anshuman Mohan Formal Abstractions for Packet Scheduling abstract slides

Formal Abstractions for Packet Scheduling
Anshuman Mohan

Early programming models for software-defined networking (SDN) focused on basic features for controlling network-wide forwarding paths, but more recent work has considered richer features, such as packet scheduling and queueing, that affect performance. In particular, PIFO trees, proposed by Sivaraman et al., offer a flexible and efficient primitive for programmable packet scheduling. Prior work has shown that PIFO trees can express a wide range of practical algorithms including strict priority, weighted fair queueing, and hierarchical schemes. However, the semantic properties of PIFO trees are not well understood.
This paper studies PIFO trees from a programming language perspective. We formalize the syntax and semantics of PIFO trees in an operational model that decouples the scheduling policy running on a tree from the topology of the tree. Building on this formalization, we develop compilation algorithms that allow the behavior of a PIFO tree written against one topology to be realized using a tree with a different topology. Such a compiler could be used to optimize an implementation of a PIFO tree, or realize a logical PIFO tree on a target with a fixed topology baked into the hardware. To support experimentation, we develop a software simulator for PIFO trees, and we present case studies illustrating its behavior on standard and custom algorithms.
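The scheduling primitive at the heart of this work can be sketched in a few lines (a toy sketch; the class name and ranks are invented, and a real PIFO tree composes such queues according to a topology):

```python
import heapq
from itertools import count

# A PIFO (push-in, first-out) queue: push inserts an element at an
# arbitrary rank, pop always releases the lowest-ranked element. A
# sequence number breaks ties so equal-rank elements leave in FIFO
# order, matching the "first-out" half of the name.

class PIFO:
    def __init__(self):
        self._heap = []
        self._seq = count()

    def push(self, item, rank):
        heapq.heappush(self._heap, (rank, next(self._seq), item))

    def pop(self):
        return heapq.heappop(self._heap)[2]

# Strict priority falls out directly: lower rank leaves first.
q = PIFO()
q.push("bulk", rank=2)
q.push("voice", rank=0)
q.push("video", rank=1)
order = [q.pop() for _ in range(3)]
```

Schemes like weighted fair queueing arise from the same primitive by computing ranks differently at push time, which is what makes the abstraction programmable.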

91. 10 October Mina Alishahi Privacy-preserving data analysis abstract slides

Privacy-preserving data analysis
Mina Alishahi

Privacy and fairness are foundational principles in the realm of trustworthy AI. In the context of machine learning, privacy forms the theoretical and methodological bedrock, facilitating the development of algorithms and methodologies that empower data insights while upholding the rights and expectations of data subjects. Fairness, on the other hand, represents equity and justice, ensuring that AI algorithms treat all individuals impartially and without discrimination. In a world where AI plays a pivotal role in decisions spanning from loans to hiring, fairness ensures that these decisions remain unbiased, transparent, and just. In this presentation, we delve into the integration of privacy in clustering algorithms and feature selection, and we explore emerging avenues in privacy-preserving machine learning focusing on the aspect of fairness.
90. 4 July Parsa Karimi The AUTOLINK project abstract

The AUTOLINK project
Parsa Karimi

In this presentation, I will introduce myself, touching upon my academic and industrial background. Following that, I will discuss the role I will occupy for the next few years: as a PhD student in the AUTOLINK project. I will discuss the overall project goals and objectives, and give some initial ideas of how Testar can be of value in this project. Finally, I will focus on the details of my contributions to the project and discuss some initial ideas.
89. 20 June Niels Doorn Enhancing Software Testing Education abstract slides

Understanding the Sensemaking Process in Test Case Design: Enhancing Software Testing Education
Niels Doorn

Software testing is a widely used technique for quality assurance in the software industry. However, in computer science education, software testing is often neglected, and students struggle to effectively test their software. Teaching software testing is challenging, as it requires students to allocate multiple cognitive resources simultaneously. Despite various attempts to address this issue, progress in improving software testing education has been limited. To enhance pedagogical approaches in software testing, it is crucial to gain a deeper understanding of the sensemaking process that occurs when students design test cases. In an initial exploratory study, we identified four distinct sensemaking approaches employed by students during test model creation. Building upon these findings, we conducted a follow-up study with 50 students from a large university in Spain. In this study, we provided the participants with a specialized web-based tool for modeling test cases. They were tasked with creating test models based on given descriptions of test problems. We evaluated the fit between the models and the test problems, examined the sensemaking processes employed by students, and gathered their perspectives on the assignment. To capture comprehensive data, we collected textual, graphical, and video data, which were analyzed using an iterative inductive analysis process. The insights gained from our study shed light on the sensemaking processes involved in test case modeling. We refined our previous findings and identified new sensemaking approaches. These results have significant implications for influencing the sensemaking process in software testing education. By addressing potential misconceptions and fostering desired mental models for test case design, we can improve the effectiveness of software testing education. Our findings provide a foundation for further research and exploration in this domain. 
By gaining a deeper understanding of the sensemaking process, we can develop interventions and pedagogical strategies to enhance software testing education and equip students with the necessary skills for effective software testing.
88. 13 June Lianne Hufkens Grammar-based action selection rules for scriptless testing abstract slides

Grammar-based action selection rules for scriptless testing
Lianne Hufkens

Scriptless testing at the GUI level involves generating test sequences on the fly. These test sequences consist of user interactions on the GUI. The creation of these sequences is based on action selection. The most commonly used action selection rules are based on stochastic methods. This presentation covers a new approach based on a grammar that allows for the specification of action selection rules, which enables mimicking user behaviour when interacting with web applications. While grammars have been used in software testing to generate input data for test cases, this approach utilizes grammars to specify action selection rules to generate test sequences. The effectiveness of this approach has been evaluated by applying it to filling web forms when randomly testing with the TESTAR tool.
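The idea of deriving test sequences from a grammar can be sketched as follows (a toy illustration with an invented grammar; TESTAR's actual rule language is richer):

```python
import random

# Nonterminals expand via randomly chosen productions until only
# concrete GUI actions remain, yielding a test sequence that mimics
# user behaviour such as form-filling.

GRAMMAR = {
    "session": [["fill_form", "submit"]],
    "fill_form": [["type_name"], ["type_name", "type_email"]],
}

def generate(symbol, rng):
    """Expand `symbol` into a flat sequence of concrete actions."""
    if symbol not in GRAMMAR:           # terminal: a concrete action
        return [symbol]
    rule = rng.choice(GRAMMAR[symbol])  # pick a production at random
    return [a for part in rule for a in generate(part, rng)]

seq = generate("session", random.Random(0))
```

Unlike purely stochastic action selection, every generated sequence respects the grammar's ordering constraints (here: a form is always filled before it is submitted).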
87. 23 May Joshua & Tobias Veni interview Thunderdome abstract

Veni interview Thunderdome
Joshua & Tobias

Joshua and Tobias will both present their 5-minute presentations and welcome excruciating audience scrutiny in a 15-minute interview round, all in order to help prepare them for their Veni interviews.
86. 9 May Daniel Stanley Tan Mask-controlled generative adversarial network for image retargeting abstract slides

Mask-controlled generative adversarial network for image retargeting
Daniel Stanley Tan

Image retargeting aims to resize a photograph without distorting its content. Recent solutions leverage generative adversarial networks (GANs) to learn an image's internal distribution. Patches are then replicated and generated to fill the target aspect ratio to preserve the internal distribution. Although these approaches are somewhat successful, they do not have a sense of image semantics or the image's contents, causing semantic errors during generation (e.g., a person with two heads). Our model addresses this by allowing user intervention. The users can preserve the objects they desire through user-defined masks. Our model enforces the masked object to appear at the user-defined location. It also utilizes a de-association regularizer to loosen the association of the selected object with its surroundings. Our method prevents object distortion and enables the user to remove or relocate the object from the input image while allowing the user to set its target size.
85. 18 April Jesse Heyninck Algebraic foundations of hybrid artificial intelligence abstract

Algebraic foundations of hybrid artificial intelligence
Jesse Heyninck

For the robustness and trustability of AI, the presence of an internal model of the world as the basis of reasoning and decision making is essential. For example, the absence of such a model is arguably the root of many reported problems of recent show ponies of AI such as ChatGPT and BERT. These approaches have been dubbed model-agnostic AI, as they essentially leverage computation and large datasets to memorize training examples and interpolate patterns within these datasets to very similar cases without any notion of a model of the studied domain. This approach is often contrasted with model-based or symbolic AI, where a formal representation of knowledge is at the core of any formalism. A lot of the research in model-based AI is, not surprisingly, dependent on the modelling language. What this means is that many important advances made in this domain have stayed confined to specific formalisms. Examples include important contributions to the scalability, dynamism and integration of sub-symbolic approaches. I will present a proposal for a top-down methodology for combining symbolic with sub-symbolic AI. This gives a general answer to the challenge of how to combine data-driven machine-learning approaches in AI with knowledge-based approaches while retaining the strength of these complementary methodologies. In order to ensure high-quality AI-components, we will also develop techniques to ensure scalability and explainability.
84. 28 March Tobias Kappé Completeness and the Finite Model Property for Kleene Algebra, Reconsidered abstract slides

Completeness and the Finite Model Property for Kleene Algebra, Reconsidered
Tobias Kappé

Kleene Algebra (KA) is the algebra of regular expressions. Central to the study of KA is Kozen's (1994) completeness result, which says that any equivalence valid in the language model of KA follows from the axioms of KA. Also of interest is the finite model property (FMP), which says that false equivalences always have a finite counterexample. Palka (2005) showed that, for KA, the FMP is equivalent to completeness.
We provide a unified and elementary proof of both properties. In contrast with earlier completeness proofs, this proof does not rely on minimality or bisimilarity techniques for deterministic automata. Instead, our approach avoids deterministic automata altogether, and uses Antimirov's derivatives and the well-known transition monoid construction.
Our results are fully verified in the Coq proof assistant.
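Antimirov's partial derivatives can be illustrated concretely (a toy Python sketch with an invented tuple encoding of regular expressions; the actual development is carried out and verified in Coq):

```python
# Regular expressions as tuples: ("0",) empty, ("1",) epsilon,
# ("c", a) a letter, ("+", e, f) union, (".", e, f) concatenation,
# ("*", e) star.

def nullable(e):
    """Does e accept the empty word?"""
    tag = e[0]
    if tag in ("0", "c"):
        return False
    if tag == "+":
        return nullable(e[1]) or nullable(e[2])
    if tag == ".":
        return nullable(e[1]) and nullable(e[2])
    return True  # "1" and "*"

def pderiv(a, e):
    """Partial derivatives of e by letter a: a *set* of expressions
    whose languages jointly form the left quotient of e by a."""
    tag = e[0]
    if tag in ("0", "1"):
        return set()
    if tag == "c":
        return {("1",)} if e[1] == a else set()
    if tag == "+":
        return pderiv(a, e[1]) | pderiv(a, e[2])
    if tag == ".":
        ds = {(".", d, e[2]) for d in pderiv(a, e[1])}
        return ds | pderiv(a, e[2]) if nullable(e[1]) else ds
    return {(".", d, e) for d in pderiv(a, e[1])}  # "*"

def matches(e, word):
    """Run the nondeterministic derivative automaton on a word."""
    states = {e}
    for a in word:
        states = set().union(*(pderiv(a, s) for s in states))
    return any(nullable(s) for s in states)

# (a|b)*a: all words over {a, b} ending in 'a'.
e = (".", ("*", ("+", ("c", "a"), ("c", "b"))), ("c", "a"))
```

Because partial derivatives return sets of expressions rather than a single one, they build a nondeterministic automaton directly, which is why the proof can avoid determinization and minimality arguments.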

83. 14 March Simone Tummers (OU, Health Psychology) Investigating behaviour change using a Bayesian network approach abstract slides

Investigating behaviour change using a Bayesian network approach
Simone Tummers

BACKGROUND: Physical activity (PA) is known to be beneficial for health, but adherence to international PA guidelines is low across different subpopulations. Interventions have been designed to stimulate PA of different target groups by influencing relevant psycho-social determinants, essentially based on a combination of the Integrated Model for Change, the Theory of Planned Behaviour, its successor the Reasoned Action Approach, and the self-determination theory. The current study investigates the pathways through which interventions influence PA. Further, gender differences in pathways of change are studied.

METHODS: An integrated dataset of five different randomised controlled trial intervention studies is analysed by estimating a Bayesian network. The data include measurements, at baseline and at 3, 6 (short-term), and 12 (long-term) months after the baseline, of important socio-cognitive determinants of PA, demographic factors, and PA outcomes. A fragment is extracted from the Bayesian network consisting of paths between the intervention variable, determinants, and short- and long-term PA outcomes. For each relationship between variables, a stability indicator and its mutual information are computed. Such a model is estimated for the full dataset, and in addition such models are estimated based only on male and female participants' data to investigate gender differences.

RESULTS: The general model (for the full dataset) shows complex paths, indicating that the intervention affects short-term PA via the direct determinants of intention and habit and that self-efficacy, attitude, intrinsic motivation, social influence concepts, planning and commitment have an indirect influence. The model also shows how effects are maintained in the long term and that previous PA behaviour, intention and attitude pros are direct determinants of long-term PA. The gender-specific models show similarities as well as important differences between the structures of paths for the male and female subpopulations. For both subpopulations, intention and habit play an important role for short-term effects and maintenance of effects in the long term. Differences are found in the role of self-efficacy in paths of behaviour change and in the fact that attitude is relevant for males, whereas planning plays a crucial role for females. The average of these differences in subpopulation mechanisms appears to be presented in the general model.

CONCLUSIONS: While previous research provided limited insight into how interventions influence PA through relevant determinants, the Bayesian network analyses show the relevance of the determinants mentioned by the theoretical framework. The model clarifies the role that different determinants play, especially in interaction with each other. The Bayesian network provides new knowledge about the complex working mechanism of interventions to change PA by giving an insightful overview of influencing paths. Furthermore, by presenting subpopulation-specific networks, the difference between the influence structure of males and females is illustrated. These new insights can be used to improve interventions in order to enhance their effects. To accomplish this, we have developed a new methodology based on a Bayesian network analysis which may be applicable in various other studies.
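The per-edge mutual information mentioned in the methods can be sketched for discrete variables (a minimal illustration; the study itself used a Bayesian-network toolchain, and these toy observations are invented):

```python
from collections import Counter
from math import log2

def mutual_information(xs, ys):
    """Empirical mutual information (in bits) between two discrete
    variables, estimated from paired observations."""
    n = len(xs)
    px, py = Counter(xs), Counter(ys)
    pxy = Counter(zip(xs, ys))
    # Sum over observed joint outcomes: p(x,y) * log2(p(x,y) / (p(x)p(y))),
    # rewritten with counts so only one division by n is needed per term.
    return sum((c / n) * log2(c * n / (px[x] * py[y]))
               for (x, y), c in pxy.items())

# Perfectly dependent binary variables share one bit of information;
# independent ones share none.
dependent = mutual_information([0, 0, 1, 1], [0, 0, 1, 1])
independent = mutual_information([0, 0, 1, 1], [0, 1, 0, 1])
```

A high mutual information between a determinant and a PA outcome flags a strong (not necessarily causal) statistical link along an edge of the network.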
82. 21 February Ton Poppe GraphCase: Structural Node Embeddings beyond Node Properties abstract slides

GraphCase: Structural Node Embeddings beyond Node Properties
Ton Poppe

Graph embeddings typically focus on the graph structure and thereby miss out on other aspects of graphs, such as edge attributes. However, many real-world settings can be modeled as graphs with edge attributes, e.g., roles in an organizational hierarchy or bank transfers in the financial industry. In this paper, we present GraphCase, a novel role-oriented node embedding algorithm that is inductive and scalable for attributed directed graphs. GraphCase effectively includes edge information in node embeddings and is designed to capture the structural similarity of nodes. We test GraphCase in an artificial setting (barbell and ring graphs) as well as apply GraphCase to real-world data (Enron e-mail dataset and BZR chemical compounds dataset) showing its superior performance. The derived abstract embeddings can be made transparent with local sensitivity analysis allowing for explainable AI in downstream tasks.
81. 12 January Alaaeddin Swidan Hedy in Arabic abstract slides

Hedy in Arabic
Alaaeddin Swidan

Hedy is a gradual programming language that aims to teach kids programming by building up concepts and syntax step by step. Hedy was launched in early 2020 and since then over 2 million Hedy programs have been created worldwide. From its inception Hedy was multilingual in its UI, for example in buttons and error messages. But to be truly inclusive, kids must be able to program in their own natural language! Thus, since early 2022, Hedy also supports keywords in multiple languages, including Dutch, French, Spanish, Hindi and Arabic. In this presentation we provide an overview of the challenges we met and the changes that were necessary to make Hedy work multilingually, with a special emphasis on Arabic, which included tricky changes to the website, the parser, and the syntax highlighter.
80. 12 January Murat Firat Embedding ML models into Linear Optimization abstract slides

Embedding ML models into Linear Optimization
Murat Firat

In this presentation, we describe a case study on embedding machine learning into linear optimization in order to optimize expected project scheduling.
 

2022

79. 13 December Marloes Venema (RU) Attaining Basically Everything in Attribute-Based Encryption -- Simplifying the Design of Practical Schemes via Pair Encodings abstract slides

Attaining Basically Everything in Attribute-Based Encryption -- Simplifying the Design of Practical Schemes via Pair Encodings
Marloes Venema (RU)

Attribute-based encryption (ABE) cryptographically implements fine-grained access control on data. In particular, data can be stored by an entity that is not necessarily trusted to enforce access control. Instead, access control can be externally enforced by a trusted authority. For this reason, pairing-based ABE has been considered for various use cases, e.g., cloud settings, which require the ABE scheme to satisfy several desirable properties. Additionally, the scheme should be efficient enough for the setting that is considered. However, measuring the efficiency of schemes is difficult, and may provide an inaccurate view on which schemes are efficient. Furthermore, schemes that can satisfy all desirable properties may be generally less efficient than schemes that satisfy fewer properties. In this talk, I will give an overview of some use cases and the properties that are desirable. I will also give an overview of the pair encodings framework, which is a common abstraction of many pairing-based ABE schemes. We use it to set up a framework, called ABE Squared, for analyzing the efficiency of schemes more accurately than was previously done, and eventually, to design schemes that can satisfy all properties with the required efficiency.
78. 15 November Tobias Kappé Reasoning about program equivalence using (Prob)GKAT abstract slides

Reasoning about program equivalence using (Prob)GKAT
Tobias Kappé

Which refactoring operations never change the meaning of an imperative program? Is there a set of equivalences that can be used to prove such correctness claims? Can we automatically decide whether a certain equivalence is valid? This talk will give a tour of Guarded Kleene Algebra with Tests (GKAT), a (co)algebraic framework that presents positive answers to these questions, as well as a probabilistic extension called ProbGKAT.
This is joint work with Nate Foster, Dexter Kozen, Justin Hsu, Wojciech Rozowski, Todd Schmid, Steffen Smolka, and Alexandra Silva.
77. 1 November Hugo Jonker Are some prices more equal than others? Spoiler: yes. abstract slides

Are some prices more equal than others? Spoiler: yes.
Hugo Jonker

Is the price you're seeing fair? That's a perennial question. One factor that may undermine price fairness is price differentiation -- asking different prices for the same item. This need not violate fairness, e.g., prices go up when the item becomes more scarce (e.g., flight tickets).
Price differentiation neither requires nor enforces fairness. Indeed, it is nowadays common for online shops to offer different views on their items (e.g., desktop site, mobile site, country-specific sites, etc.). Whether these preserve price fairness has hitherto not been investigated. Our goal is to evaluate whether using a different platform affects item price.
We devise a method to detect such platform-based price differentiation and perform a case study of platform effects on flight pricing. We acquire pricing data of flights by synchronised scraping of multiple platforms (mobile app, desktop site for different countries, mobile site) over 45 days, gathering extra item attributes to be able to determine item equivalence across different platforms. We analyse the collected data in various ways, comparing between platforms but also providing a holistic view on the acquired data.
Our results show no differences between platforms for most investigated companies, though for some, using a mobile phone is frequently more expensive.

76. 18 October Diego Velázquez (UNAM) Pattern models: A Dynamic Epistemic Logic For Distributed System Analysis abstract slides

Pattern models: A Dynamic Epistemic Logic For Distributed System Analysis
Diego Velázquez (UNAM)

We introduce pattern models, a dynamic epistemic logic designed for analyzing distributed systems. First, we present a version of pattern models in which the full-information protocol, deeply studied in distributed computability, is fixed in the product definition of pattern models. Later, we parametrize this logic to add the capability to model the dynamics of arbitrary protocols. We give a systematic construction of pattern models for a large variety of distributed-computing models called dynamic-network models. For a proper subclass of dynamic-network models called oblivious, the communication pattern model remains the same throughout the computation, so that an oblivious distributed-computing model can be described using constant space.
75. 4 October Ella Roubtsova Constraint Formalization for Automated Assessment of Enterprise Models abstract slides

Constraint Formalization for Automated Assessment of Enterprise Models
Ella Roubtsova

Enterprises always do their business within some restrictions. In a team of enterprise architects, the restrictions are transformed into modelling conventions and the corresponding modelling constraints that should be consistently applied across all enterprise models. This paper presents an approach for refining and formalizing modelling conventions into modelling constraints and using them for assessment of enterprise models by a software component called ArchiChecker. What is specific to the proposed approach is that the modelling conventions are first visualized and formalized using the types of elements and relationships of the ArchiMate modelling language, which is also used for modelling enterprise views. The ArchiMate elements and relationships serve as types to formulate constraints. The elements and relationships in an ArchiMate model are instances of the ArchiMate elements and relationships. Using these types and instances, the ArchiChecker automatically generates lists of violations of modelling conventions in the enterprise models. Each violation shows how a specific enterprise view deviates from a given modelling convention. The paper reports a case study applying the proposed approach to enterprise modelling views and modelling conventions used in a medical center. The case study is used to discuss the added value of formalization and automated assessment of modelling constraints in enterprise modelling.
74. 20 September Stijn de Gouw Verifying JDK's identity hashmap implementation abstract slides

Verifying JDK's identity hashmap implementation
Stijn de Gouw

Hash maps are a common and important data structure in efficient algorithm implementations. Despite their widespread use, real-world implementations are not regularly verified. We present the first such case study: the IdentityHashMap class in the Java JDK. We specified its behavior using the Java Modeling Language (JML) and proved correctness for the main insertion and lookup methods with KeY, a semi-interactive theorem prover for JML-annotated Java programs. Furthermore, we report how unit testing and bounded model checking can be leveraged to find a suitable specification more quickly. We also investigated where the bottlenecks in the verification of hash maps lie for KeY by comparing the required automatic proof effort for different hash map implementations, and draw conclusions for the choice of hash map implementations regarding their verifiability.
73. 24 June Ashish Sai Promoting rigour in Blockchain's energy & environmental footprint research abstract slides

Promoting rigour in Blockchain's energy & environmental footprint research
Ashish Sai

Harald made the national news with his research on the energy consumption of bitcoin. Various other researchers also investigated this issue, ending up with wildly different estimates. In this talk, I investigate the underlying assumptions and discuss how some assumptions strongly impact these estimates: small changes in an assumption leading to huge changes in the resulting estimate. Finally, I propose a set of guidelines to promote rigour in scientific estimates of environmental footprint research.
72. 24 June Daniel Stanley Tan Anomaly detection, image forensics, and creative AI abstract slides

Anomaly detection, image forensics and creative AI
Daniel Stanley Tan

In this talk, I present an overview of my research in three AI-related areas: anomaly detection, image forensics, and creative AI.
Anomaly detection
I will discuss defect detection, where small changes from the expected baseline in images need to be automatically recognised. A second topic in this vein is the automated detection of crop pests and diseases from images taken by drones. This would allow large farms to far better ascertain the current health status of their crops and take appropriate targeted measures.
Image forensics
We dive into detection of faked images. The underlying principle is that for each image, its processing pipeline results in a spatial signature. When combining (parts of) multiple images, these spatial signatures are misaligned with respect to each other, indicating a forgery.
Creative AI
We discuss ways to learn the style of images (e.g., Van Gogh's "Starry Night") and then apply that to other images (e.g., photos of your holiday).
71. 14 June Jesse Heyninck Lexicographic Entailment, Syntax Splitting and the Drowning Problem abstract slides

Lexicographic Entailment, Syntax Splitting and the Drowning Problem
Jesse Heyninck

Lexicographic inference [Lehmann, 1995] is a well-known and popular approach to reasoning with non-monotonic conditionals. It is a logic of very high quality, as it extends rational closure and avoids the so-called drowning problem. It seems, however, that this high quality comes at a cost, as reasoning on the basis of lexicographic inference is of high computational complexity. In this paper, we show that lexicographic inference satisfies syntax splitting [Kern-Isberner et al., 2020], which means that we can restrict our attention to parts of the belief base that share atoms with a given query, thus seriously restricting the computational costs for many concrete queries. Furthermore, we make some observations on the relationship between c-representations and lexicographic inference, and reflect on the relation between syntax splitting and the drowning problem.
70. 31 May Stefano Schivo Convinces biologists that formal verification is actually a thing, hilarity ensues abstract slides

Convinces biologists that formal verification is actually a thing, hilarity ensues
Stefano Schivo

We often hear people talk about "Multidisciplinary Research": regrettably, sometimes this leads us to think "buzz-word". But what happens in practice when people from different disciplines (try to) collaborate? What does this look like? What kind of challenges and rewards can we expect? With this talk I want to provide some reflection points on multidisciplinary collaborations. And maybe even give you some confidence to try working with "outsiders" yourself, who knows? This presentation is centered around my personal experience in collaborating with biologists. I will show off some interesting results, and reflect on what we learned from each other during this time. Potential generic advice and considerations are mainly left to the public.
69. 17 May Freek Verbeek Formally Verified Lifting of C-compiled x86-64 Binaries abstract slides

Formally Verified Lifting of C-compiled x86-64 Binaries
Freek Verbeek

Lifting binaries to a higher-level representation is an essential step for decompilation, binary verification, patching and security analysis. In this paper, we present the first approach to provably overapproximative x86-64 binary lifting. A stripped binary is verified for certain sanity properties such as return address integrity and calling convention adherence. Establishing these properties allows the binary to be lifted to a representation that contains an overapproximation of all possible execution paths of the binary. The lifted representation contains disassembled instructions, reconstructed control flow, invariants and proof obligations that are sufficient to prove the sanity properties as well as correctness of the lifted representation. We apply this approach to Linux Foundation and Intel's Xen hypervisor, covering about 400K instructions. This demonstrates that our approach is the first provably overapproximative binary lifting approach that scales to commercial off-the-shelf systems. The lifted representation is exportable to the Isabelle/HOL theorem prover, allowing formal verification of its correctness. If our technique succeeds and the proof obligations are proven true, then -- under the generated assumptions -- the lifted representation is correct.
68. 10 May Harald Vranken Detection of Botnets via DGA-Generated Domain Names abstract slides

Detection of Botnets via DGA-Generated Domain Names
Harald Vranken

Botnets often apply domain name generation algorithms (DGAs) to evade detection by generating large numbers of pseudo-random domain names, of which only a few are registered by cybercriminals. In this paper, we address how DGA-generated domain names can be detected by means of machine learning and deep learning. We first present an extensive literature review of recent prior work in which machine learning and deep learning have been applied for detecting DGA-generated domain names. We observe that a common methodology is still missing and that, due to the use of different datasets, experimental results can hardly be compared. We next propose the use of TF-IDF to measure the frequencies of the most relevant n-grams in domain names, and use these as features in learning algorithms. We perform experiments with various machine-learning and deep-learning models using TF-IDF features, of which a deep MLP model yields the best results. For comparison, we also apply an LSTM model with an embedding layer to convert domain names from a sequence of characters into a vector representation. The performance of our LSTM and MLP models is rather similar, achieving AUCs of 0.994 and 0.995, and average F1-scores of 0.907 and 0.891, respectively.
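The TF-IDF feature extraction described above can be sketched in a few lines. The domain names and the fixed n-gram length below are illustrative assumptions; real pipelines typically combine several n-gram sizes and use smoothed IDF variants:

```python
from collections import Counter
from math import log

def char_ngrams(s, n=3):
    """All overlapping character n-grams of a string."""
    return [s[i:i + n] for i in range(len(s) - n + 1)]

def tfidf(domains, n=3):
    """Per-domain TF-IDF weights of character n-grams (minimal, unsmoothed sketch)."""
    docs = [Counter(char_ngrams(d, n)) for d in domains]
    df = Counter(g for doc in docs for g in doc)  # document frequency per n-gram
    ndocs = len(docs)
    out = []
    for doc in docs:
        total = sum(doc.values())
        out.append({g: (c / total) * log(ndocs / df[g]) for g, c in doc.items()})
    return out

# Hypothetical examples: a readable domain vs. a DGA-looking one.
feats = tfidf(["openuniversiteit", "xq3kzv9f2p"])
```

The resulting per-domain weight dictionaries could then be turned into fixed-length vectors over the most relevant n-grams and fed to a classifier such as an MLP.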
67. 19 April Luc Edixhoven Balanced ω-regular languages abstract slides

Balanced ω-regular languages
Luc Edixhoven

Last time around I gave a talk about balanced regular languages: regular languages over an alphabet of different types of opening and closing parentheses, where the parentheses must be properly paired in some way (cf. the Dyck language). I showed how we can express precisely all these languages by enriching regular expressions with a parameterised shuffle operator. This time I will talk about how we adapted our previous constructions and proofs to languages of infinite words, formally known as omega-languages. While undoubtedly fun, infinity (rather expectedly) introduces additional complications. The visual analogy using jigsaw puzzle pieces makes a triumphant return.
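For readers unfamiliar with the starting point, the classic Dyck condition over several types of parentheses (strict nesting, a special case of proper pairing) can be checked as follows; the parameterised shuffle from the talk admits more general pairings than this small sketch covers:

```python
def is_balanced(word, pairs=(("(", ")"), ("[", "]"), ("{", "}"))):
    """Check strict Dyck-style nesting over several parenthesis types.
    Non-parenthesis characters are ignored."""
    opens = dict(pairs)
    closers = set(opens.values())
    stack = []
    for ch in word:
        if ch in opens:
            stack.append(opens[ch])          # remember the required closer
        elif ch in closers:
            if not stack or stack.pop() != ch:
                return False                 # wrong closer, or nothing open
    return not stack                         # everything opened was closed
```

For example, `([]{})` is balanced, while `([)]` is not under strict nesting (though interleavings like it are exactly what shuffle-based definitions can permit).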
66. 15 March Sung-Shik Jongmans A Predicate Transformer for Choreographies abstract slides

A Predicate Transformer for Choreographies
Sung-Shik Jongmans

Construction and analysis of distributed systems is difficult; choreographic programming is a deadlock-freedom-by-construction approach to simplify it. In this talk, I present a new theory of choreographic programming. It supports for the first time: construction of distributed systems that require decentralised decision making (i.e., if/while-statements with multiparty conditions); analysis of distributed systems to provide not only deadlock freedom but also functional correctness (i.e., pre/postcondition reasoning). Both contributions are enabled by a single new technique, namely a predicate transformer for choreographies. This is joint work with Petra van den Bos (University of Twente); our paper has been accepted for publication in ESOP 2022.
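To illustrate what a predicate transformer is, without any of the choreographic or multiparty machinery of the talk, the following sketch defines the weakest precondition semantically for deterministic, terminating statements; all names here are hypothetical:

```python
# Statements are modelled as deterministic state transformers (dict -> dict),
# predicates as functions from states to booleans.
def wp(stmt, post):
    """Weakest precondition: wp(S, Q) holds in a state exactly when Q holds
    after running S from that state (deterministic, terminating S)."""
    return lambda state: post(stmt(state))

def assign(var, expr):
    """The statement 'var := expr(state)'."""
    return lambda st: {**st, var: expr(st)}

# Example: S is x := x + 1 and Q is x > 0, so wp(S, Q) is x + 1 > 0.
S = assign("x", lambda st: st["x"] + 1)
Q = lambda st: st["x"] > 0
pre = wp(S, Q)
```

Pre/postcondition reasoning as in the talk then amounts to checking that a desired precondition implies the computed `wp`.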
65. 22 February Frouke Hermens Predicting choice from eye movements abstract slides

Predicting choice from eye movements
Frouke Hermens

This talk will discuss ongoing work on whether eye movements reveal a decision maker's final decision. The dataset under consideration examined decisions on whether or not to get screened for Chlamydia (for women, a disease with few symptoms, but with a risk of serious complications when undetected). The set-up is relevant for current acute questions such as whether to get a booster vaccine against Covid-19. Various machine learning (classification) models were fitted on information about what was shown on the screen, the regions fixated, and what was shown in the regions fixated. Classification performance was not improved by including eye tracking data and was at a relatively low 71% correct, only marginally above the base rate of 69.5% accept decisions. I would like to discuss what these results mean for the publication of machine learning outcomes (only publish when you achieve good classification?) and how many attempts at fitting models are needed before concluding there may be insufficient information in the data for accurate classification.
64. 11 January Hugo Jonker Ethics in CS research: a tale of two failures abstract slides

Ethics in CS research: a tale of two failures
Hugo Jonker

In 2021, two different CS research efforts stirred up quite some hullabaloo for what was considered to be ethics failures on part of the researchers and their institutions.

In this talk, I discuss both cases (the "Hypocrite Commits" S&P'21 paper and the Princeton & Radboud CCPA study), identifying the aspects under dispute thanks to the power of hindsight. Moreover, I (briefly) discuss the 1978 US Belmont Report on ethical guidelines for Human Subject research, and the Menlo report describing an ethical framework for research involving ICT. Finally, I discuss deception research, a type of research where subjects are deliberately misinformed about the nature of the research. In other fields, this type of research has already been recognised as requiring extra care for the subjects, and as something preferably to be avoided.

 

2021

63. 2 November Mo Liu, University of Lorraine Expressivity of some versions of APAL abstract slides

Expressivity of some versions of APAL
Mo Liu, University of Lorraine

Arbitrary public announcement logic (APAL) is a logic of change of knowledge with modalities representing quantification over announcements. It extends public announcement logic (PAL). We present three rather different versions of APAL: FSAPAL only quantifies over announcements containing a finite subset of all propositional variables. SCAPAL only quantifies over announcements containing variables occurring in the formula bound by the quantifier. IPAL quantifies over announcements implying a given formula. We determine the relative expressivity of FSAPAL, SCAPAL and IPAL in relation to APAL and PAL.
62. 2 November Marta Gawek, University of Lorraine Dynamic Epistemic Separation Logic abstract slides

Dynamic Epistemic Separation Logic
Marta Gawek, University of Lorraine

We investigate extensions of separation logic with epistemic and dynamic epistemic modalities. Separation logics are based on the intuitionistic logic of bunched implications (BI) or its classical counterpart Boolean BI. These logics combine additive and multiplicative connectives in the language, expressing the notions of resource composition and resource decomposition. Dynamic Epistemic Separation Logic (DESL) is a generalization of the Public Announcement Separation Logic (PASL) of Courtault et al. We present the syntax and semantics of DESL as well as reduction axioms for the elimination of dynamic modalities.
61. 2 November Krisztina Fruzsa, TU Wien Knowledge-based analysis of the Firing Rebels problem abstract slides

Knowledge-based analysis of the Firing Rebels problem
Krisztina Fruzsa, TU Wien

We provide an epistemic analysis of a simple variant of the fundamental consistent broadcasting primitive for byzantine fault-tolerant asynchronous distributed systems. Our Firing Rebels with Relay (FRR) primitive enables agents in the system to trigger an action (FIRE) at all correct agents, in an all-or-nothing fashion. By using our epistemic reasoning framework developed for byzantine fault-tolerant distributed systems, we discover the necessary and sufficient state of knowledge that needs to be acquired by an individual correct agent in order to FIRE according to the specification of the problem. It involves eventual common hope (a modality related to belief), which we show to be attained already by achieving eventual mutual hope in the case of FRR.
60. 5 October Hugo Jonker HLISA: a Human-Like Simulation API for webpage interaction abstract slides

HLISA: a Human-Like Simulation API for webpage interaction
Hugo Jonker

Automated browsers (web bots) are an invaluable tool for studying the web. However, research has shown that web bots can be distinguished from regular browsers, and that they may be served different content as a consequence. This undermines their utility as a measurement tool. So far, three methods have been used to detect web bots: browser fingerprints, order of site traversal, and aspects of page interaction. While site traversal depends on the study being executed, the other two aspects can be controlled in a generic fashion. The identifiability of web bot fingerprints has been extensively studied, but how to alter the fingerprint has received less attention. In this paper, we study which method of altering the fingerprint incurs the fewest side effects. Secondly, we provide an initial investigation of how the interaction API of Selenium differs from human interaction. We incorporate these results into HLISA, an API that simulates interaction like humans. Finally, we discuss the conceptual arms race between simulators and detectors and find that, conceptually, detecting HLISA requires modelling human interaction.
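One small ingredient of simulating human-like interaction is timing. The sketch below generates per-character typing delays from a clipped normal distribution; the distribution and its parameters are assumptions for illustration, not HLISA's actual timing model:

```python
import random

def humanlike_delays(text, rng=None, mean=0.12, sd=0.04, floor=0.03):
    """One delay (in seconds) per character, drawn from a normal distribution
    clipped from below. Parameters are illustrative assumptions only; real
    human inter-keystroke timing is more structured (e.g. key-pair dependent)."""
    rng = rng or random.Random()
    return [max(floor, rng.gauss(mean, sd)) for _ in text]

# Deterministic example with a fixed seed.
delays = humanlike_delays("hello world", random.Random(42))
```

A bot driver could sleep for each generated delay between keystrokes instead of sending the whole string at once, which is one of the tell-tale differences between raw automation APIs and human input.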
59. 7 September Fabian van den Broek Getting attribute-based signatures in the legal framework abstract slides

Getting attribute-based signatures in the legal framework
Fabian van den Broek

Digital signatures were invented in the 70s. In The Netherlands, digital signatures have been recognised by law since the 2000s. A special variant of digital signatures, attribute-based signatures, was first proposed in 2010. Since then, various studies have explored the possibilities of attribute signatures and enriched their application area. For example, attribute-based signatures are incorporated into the IRMA framework to enable non-interactive proofs of attributes. This has brought attribute-based signatures (literally) into the hands of many ordinary citizens. However, to date, attribute-based signatures are not specifically mentioned in the law. In this presentation, we discuss the advantages and limits of using attribute-based signatures by means of several case studies. Based on these case studies, we analyze whether and to what extent attribute-based signatures fit within current legal regulations. We consider both regular attribute-based signatures and anonymous attribute-based signatures. The latter are of particular interest, as the law defines signatures as containing an identifying aspect -- which is absent in these anonymous signatures.
58. 7 July Gideon Maillette de Buy Wenniger Efficient deep learning using richer context, with applications to handwriting recognition and scholarly document quality prediction abstract slides

Efficient deep learning using richer context, with applications to handwriting recognition and scholarly document quality prediction
Gideon Maillette de Buy Wenniger

Using richer context is one of the driving forces behind many advances in the performance of deep learning models, as it was for other machine learning approaches before the advent of deep learning. My interest in richer context for machine learning models goes back to my PhD [1,2,3], in which reordering context labels were applied to improve translations produced by a hierarchical statistical machine translation model. More recently I worked on neural handwriting recognition [4], in which multi-dimensional long short-term memories (MDLSTMs) are applied to encode images while considering the full context of all other pixels in the image. In this application computational efficiency is crucial, and I developed various new strategies to increase it. The second application I will cover is scholarly document quality prediction. In this task, some property indicative of or correlated with quality is predicted from the textual and/or visual contents of a scholarly document alone. This property can be 1) the acceptance/rejection of a scientific paper in a journal or major conference, or 2) the log of the number of citations. The latter property, while imperfect as a notion of quality, is attractive because of its availability for huge amounts of scientific publications. I will discuss two new approaches presented in my recent work that leverage ways to increase context for quality prediction models. The first approach [5] does so by incorporating document structure information into the prediction, and the second approach [6] by full-text encoding, facilitated by the use of a BERT+GRU model in combination with chunking of the full document text. Using these approaches, new state-of-the-art results were achieved for the accept/reject prediction and log-number-of-citations prediction tasks.
I will end my talk with an outlook on my current ongoing work to make scholarly document quality prediction more explainable using saliency maps and other explainable deep learning methods. The work on hierarchical statistical machine translation, only mentioned briefly, is joint work together with my former PhD supervisor Khalil Sima'an. The work on handwriting recognition is joint work with Lambert Schomaker and Andy Way. Finally, the work on scholarly document quality prediction is joint work with Thomas van Dongen, Lambert Schomaker, Elerei Aedmaa, Herbert Kruitbosch, and Edwin Valentijn. See the below list of references for more information about the research.
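The chunking of the full document text mentioned above can be sketched as follows; the chunk size and overlap are illustrative values, not those used in the cited papers:

```python
def chunk_tokens(tokens, size=512, overlap=50):
    """Split a long token sequence into overlapping fixed-size chunks so that
    every token appears in at least one chunk. Chunks are at most `size` long
    and consecutive chunks share `overlap` tokens of context."""
    step = size - overlap
    return [tokens[i:i + size]
            for i in range(0, max(1, len(tokens) - overlap), step)]

# Example: a 1000-token "document" becomes a handful of overlapping chunks.
chunks = chunk_tokens(list(range(1000)))
```

In a BERT+GRU-style pipeline, each chunk would be encoded separately by the transformer and the sequence of chunk encodings aggregated by the recurrent layer.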
57. 22 June Hans van Ditmarsch Gossip and Knowledge abstract slides

Gossip and knowledge
Hans van Ditmarsch

Optimal schedules to distribute information by one-to-one communication between nodes connected in a network have been a well-studied phenomenon in network theory since the 1970s. One can take these communicative actions to be telephone calls, and protocols to spread information this way are known as gossip protocols or epidemic protocols. A common abstraction is to call the information of each agent its secret, and to take as the goal of information dissemination that all agents know all secrets: that is the termination condition of gossip protocols. Following early investigations assuming a global scheduler, it is now typically assumed that gossip protocols are distributed in some way, where the only role of the environment is to ensure randomization. Statistical approaches to gossip have flourished since then, wherein network topology is an important parameter. In epistemic gossip protocols, an agent (node) will call another agent not because it is so instructed by a scheduler, or at random, but based on its knowledge or ignorance of the distribution of secrets over the network, and of other agents' knowledge or ignorance of that. One such protocol requires that an agent may only call another agent if it does not know the other agent's secret. Epistemic features of gossip protocols may affect their termination, the (order of complexity of the) expectation of termination, their reachability (what distributions of secrets may occur before all agents know all secrets), and so on. Variations involve agents exchanging telephone numbers in addition to secrets (which results in network expansion), or agents exchanging knowledge about secrets; we may also assume common knowledge of the protocol; further generalizations would involve multi-casts. We present a survey of distributed epistemic gossip protocols.
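The protocol mentioned above, where an agent may only call another agent if it does not know that agent's secret, can be simulated in a few lines. This is a sketch with random selection among permitted calls; real analyses consider all possible schedules:

```python
import random

def lns_gossip(n, seed=0):
    """Simulate the epistemic gossip protocol in which agent a may call
    agent b only if a does not yet know b's secret. In a call, both agents
    exchange all secrets they know. Returns the number of calls until every
    agent knows every secret, or None if no permitted call remains first
    (the protocol can terminate unsuccessfully)."""
    rng = random.Random(seed)
    secrets = [{i} for i in range(n)]  # agent i starts knowing only secret i
    calls = 0
    while any(len(s) < n for s in secrets):
        permitted = [(a, b) for a in range(n) for b in range(n)
                     if a != b and b not in secrets[a]]
        if not permitted:
            return None
        a, b = rng.choice(permitted)
        merged = secrets[a] | secrets[b]
        secrets[a], secrets[b] = merged, set(merged)
        calls += 1
    return calls
```

For two agents a single call always suffices; for larger networks, any successful execution needs at least 2n-4 calls (n >= 4), a classic result from the gossip literature.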
56. 15 June Joshua Moerman Weighted Register Automata abstract slides

Weighted Register Automata
Joshua Moerman

Since I am new at the Open Universiteit, I will first introduce myself. After that I will present recent research, which will be presented at LICS 2021. Register automata are an extension of automata that deal with data, possibly from an infinite domain. Many algorithms from classical automata theory continue to work on deterministic register automata. However, nondeterministic register automata are strictly more expressive and some properties become undecidable. Motivated by a recent result that the subclass of unambiguous register automata has decidable equivalence, we investigate weighted register automata, the common generalisation of register automata and weighted automata. These include unambiguous register automata. We show that equivalence is also decidable for these automata. In our view, the main contribution is the development of the mathematical theory of so-called orbit-finitely spanned vector spaces, on which our decision procedure is based. This is joint work with Mikołaj Bojańczyk and Bartek Klin.
55. 25 May Nico Naus Exploit Logic abstract slides

Exploit Logic
Nico Naus

Automatic exploit generation is a relatively new area of research. Work in this area aims to automate the manual and labour-intensive task of finding exploits in software. With Exploit Logic, we aim to lay the foundation for a new automatic exploit generation system. Exploit Logic formally defines the relation between exploits and the preconditions which allow them to occur. This relation is used to calculate the search space of preconditions for a specific program and exploit. This presentation is an introduction to Exploit Logic and the current state of my research.
54. 20 April Frouke Hermens Data driven group comparisons of eye fixations to dynamic stimuli abstract slides

Data driven group comparisons of eye fixations to dynamic stimuli
Frouke Hermens

Recent advances in software and hardware have allowed eye tracking to move away from static images to more ecologically relevant video streams. The analysis of eye tracking data for such dynamic stimuli, however, is not without challenges. The frame-by-frame coding of regions of interest (ROIs) is labour intensive, and computer vision techniques to automatically code such ROIs are not yet mainstream, restricting the use of such stimuli. Combined with the more general problem of defining relevant ROIs for video frames, methods are needed that facilitate data analysis. Here we present a first evaluation of an easy-to-implement data-driven method with the potential to address these issues. To test the new method, we examined the differences in eye movements of self-reported politically left- or right-wing leaning participants to video clips of left- and right-wing politicians. The results show that our method can accurately predict group membership on the basis of eye movement patterns, isolate video clips which best distinguish people on the political left-right spectrum and reveal the section of each video clip with the largest group differences. Our methodology thereby aids the understanding of group differences in gaze behaviour, and the identification of critical stimuli for follow-up studies or for use in saccade diagnosis.
53. 23 March Vincent van der Meer Deducing File History from NTFS timestamps abstract slides

Deducing File History from NTFS timestamps
Vincent van der Meer

Proper attribution is a key factor in digital forensics. In this paper, we investigate one such area of attribution: file timestamp history. This is crucial for determining whether a file could have been accessible to a suspect. In particular, we determine file history given a file's current NTFS timestamps. That is, we infer all possible sequences of file system operations which culminate in the file's current NTFS timestamps. To this end, we provide an exhaustive overview of such operations and their effect on timestamps, clarifying inconsistencies in the literature. Based upon this overview, we propose a method for reconstructing possible histories of operations, accounting for various forms of timestamp forgery. We provide an implementation of this method, which depicts possible histories graphically.
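The idea of inferring possible operation histories from timestamps can be sketched with a heavily simplified model. The operation catalogue below is hypothetical (the paper's overview is far more detailed), but it shows the shape of the search: enumerate operation sequences and keep those whose final timestamps satisfy the observed pattern.

```python
from itertools import product

# Hypothetical, heavily simplified effects of file system operations on
# the NTFS timestamps (C)reated, (M)odified, (A)ccessed; each operation
# happens at a strictly later time than the previous one.
EFFECTS = {
    "create": ("C", "M", "A"),
    "modify": ("M", "A"),
    "access": ("A",),
    "copy":   ("C", "A"),      # copying resets Created but keeps Modified
}

def possible_histories(pattern, max_len=3):
    """All operation sequences up to max_len, starting with a single
    'create', whose resulting timestamps satisfy the given pattern."""
    found = []
    for length in range(1, max_len + 1):
        for ops in product(EFFECTS, repeat=length):
            if ops[0] != "create" or "create" in ops[1:]:
                continue
            ts = {}
            for time, op in enumerate(ops):   # apply the effects in order
                for field in EFFECTS[op]:
                    ts[field] = time
            if pattern(ts):
                found.append(ops)
    return found

# Which short histories leave Created newer than Modified -- the classic
# hint that a file was copied after it was last edited?
copied_after_edit = possible_histories(lambda ts: ts["C"] > ts["M"])
```

The real method must also account for timestamp forgery, which this toy enumeration does not model.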
52. 9 March Stef Joosten Programming with relation algebras abstract slides

Programming with relation algebras
Stef Joosten

Why use relation algebra as a programming language? Well, besides being cool for using formal specifications directly for programming, it gives neat language properties such as declarative semantics, strong typing, reactive programming, and calculating with programs. And it solves real engineering problems that will help address the software crisis (Fred Brooks, The Mythical Man-Month: Essays on Software Engineering, Addison-Wesley, 1975). This presentation gives an overview of the language Ampersand, its formal foundations and its applications in industry. It contains a demonstration. We end by discussing some research challenges that are still open.
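To give a flavour of calculating with relations (plain set-based relation algebra, not Ampersand's own syntax), composition, converse, and checking a rule as an inclusion can be written directly. The example rule and relation names are our own.

```python
def compose(r, s):
    """Relation composition r;s."""
    return {(a, c) for (a, b1) in r for (b2, c) in s if b1 == b2}

def converse(r):
    """Relation converse r~."""
    return {(b, a) for (a, b) in r}

def holds(lhs, rhs):
    """A rule 'lhs |- rhs' holds when lhs is included in rhs."""
    return lhs <= rhs

# Example rule: owner~;owner |- I, i.e. 'owner' is univalent
# (each account relates to at most one owning person).
owner = {("acct1", "alice"), ("acct2", "bob")}
identity = {(p, p) for (_, p) in owner}
univalent = holds(compose(converse(owner), owner), identity)
```

In Ampersand, rules like this are stated declaratively and the system enforces them as invariants of the information system; here the inclusion check merely illustrates the algebra.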
51. 23 February Olivia Rodríguez Valdés Towards a testing tool that learns to test abstract slides

Towards a testing tool that learns to test
Olivia Rodríguez Valdés

GUI testing is a type of testing that verifies software as a whole through its GUI, interacting with widgets and testing the System Under Test (SUT) from the user's perspective. Guiding the testing process of the SUT in a smarter way should lead to better error detection and recognition. We will study the application of Reinforcement Learning techniques in automated GUI testing. Using the scriptless GUI testing tool TESTAR as a vehicle, we will focus our research on improving the action selection mechanisms that guide the learning process. We will tailor rewards towards the goal of enhancing the effectiveness and efficiency of testing.
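One of the simplest action selection mechanisms from reinforcement learning that could serve as a baseline in such a setting is epsilon-greedy selection over learned action values. The sketch below is a generic illustration, not TESTAR's actual mechanism; all names are our own.

```python
import random

def select_action(q_values, actions, epsilon=0.1, rng=random):
    """Epsilon-greedy action selection: with probability epsilon pick a
    random action (exploration), otherwise pick the action with the
    highest learned value (exploitation). Unseen actions default to 0.0."""
    if rng.random() < epsilon:
        return rng.choice(actions)
    return max(actions, key=lambda a: q_values.get(a, 0.0))

# Hypothetical learned values for two GUI actions on the current screen.
q = {"click_ok": 1.0, "type_text": 0.2}
```

The research question is precisely how to shape the rewards behind such values so that exploration of the GUI finds faults efficiently.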
50. 26 January Luc Edixhoven Expressing safe communication protocols abstract slides

Expressing safe communication protocols
Luc Edixhoven

Parallelisation is becoming ever more important in modern-day computing and, consequently, also proper communication between different threads, processes and machines. However, existing static methods to verify the safety of message passing-based communication protocols are generally limited in their expressiveness. As a result, many protocol implementations which are, in fact, safe, cannot be formally verified as such. Our aim is to find an expressive formal basis which is able to support, if possible, any safe protocol. Our first step in this adventure deals with regular protocols; we base our approach on formal language theory, by enriching regular expressions with a parameterised interleaving operator.
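The interleaving (shuffle) operator at the heart of this approach can be made concrete for two fixed strings: a word is in their shuffle iff it can be split into two subsequences equal to them. A standard dynamic program decides this; the code is our own illustration, not the parameterised operator from the research.

```python
from functools import lru_cache

def is_interleaving(w, x, y):
    """Is w an interleaving (shuffle) of x and y, preserving the
    relative order of letters within each? Dynamic programming over
    the pair of prefix lengths consumed from x and y."""
    if len(w) != len(x) + len(y):
        return False

    @lru_cache(maxsize=None)
    def dp(i, j):
        if i == len(x) and j == len(y):
            return True
        k = i + j   # next position of w to match
        return (i < len(x) and w[k] == x[i] and dp(i + 1, j)) or \
               (j < len(y) and w[k] == y[j] and dp(i, j + 1))

    return dp(0, 0)
```

Lifting this from two strings to whole (regular) languages, with a parameter bounding the interleaving, is where the expressiveness questions of the research begin.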
 

2020

49. 8 December Bastiaan Heeren Software technology for automated feedback generation abstract slides

Software technology for automated feedback generation
Bastiaan Heeren

Intelligent Tutoring Systems (ITSs) that offer well-designed feedback have the potential to enhance education in many problem domains. However, the cost of developing such systems is known to be high, with an estimated 200 hours of development for one hour of practice. The Ideas research project aims at simplifying ITS construction by using software technology for automated feedback generation. Examples of such techniques are embedded domain-specific languages, combinators and their algebraic laws, (datatype-)generic programming, and rewriting. Over the past 14 years, our approach has been tested on many problem domains, and has supported the exploration of new feedback opportunities. In this presentation, I will explain two techniques: the embedding of light-weight rewrite rules in problem-solving procedures, and the use of generic traversals for controlling rule application. The techniques will be illustrated by a simple problem-solving procedure for rewriting propositions into negation normal form.
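The negation normal form procedure mentioned at the end can be expressed as a handful of rewrite rules. The plain recursive sketch below (with our own tuple encoding of formulas) applies double-negation elimination and De Morgan's laws until every negation sits directly on a variable.

```python
def nnf(f):
    """Rewrite a proposition into negation normal form. Formulas are
    nested tuples: ('var', name), ('not', f), ('and', f, g), ('or', f, g)."""
    op = f[0]
    if op == 'var':
        return f
    if op in ('and', 'or'):
        return (op, nnf(f[1]), nnf(f[2]))
    # op == 'not': push the negation inward
    g = f[1]
    if g[0] == 'var':
        return f                          # a literal is already in NNF
    if g[0] == 'not':
        return nnf(g[1])                  # ¬¬p  =>  p
    dual = 'or' if g[0] == 'and' else 'and'
    return (dual, nnf(('not', g[1])), nnf(('not', g[2])))  # De Morgan
```

In the Ideas approach, rules like these would be light-weight rewrite rules embedded in a problem-solving procedure, with generic traversals controlling where they are applied; the sketch above hard-wires one traversal order.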
48. 24 November Alaaeddin Swidan Learning programming for school-age children: challenges and opportunities abstract slides

Learning programming for school-age children: challenges and opportunities
Alaaeddin Swidan

There are currently a variety of programming environments and tools meant to teach children programming, as one aspect of a larger goal of developing their computational thinking. Block-based programming is one of the popular choices to introduce programming to children. Due to the low ceiling that such environments provide, there comes a point where students are introduced to other "traditional" programming languages. With this learning path in mind, in addition to the idea that learning programming is difficult, there arise many challenges that children and educators face. In this presentation we will focus on two of them: challenges to achieve higher conceptual development in block-based languages and the efforts needed for a successful transfer of knowledge from block based to textual programming languages.
47. 3 November Sung-Shik Jongmans Discourje: Runtime Verification of Communication Protocols in Clojure abstract slides

Discourje: Runtime Verification of Communication Protocols in Clojure
Sung-Shik Jongmans

This talk presents Discourje: a runtime verification framework for communication protocols in Clojure. Discourje guarantees safety of protocol implementations relative to specifications, based on an expressive new version of multiparty session types. The framework has a formal foundation and is itself implemented in Clojure to offer a seamless specification-implementation experience. Benchmarks show Discourje's overhead can be less than 5% for real/existing concurrent programs.
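The essence of such runtime verification can be sketched in a few lines (in Python rather than Clojure, and with a plain finite-state machine instead of multiparty session types): every send is checked against the specification before it is allowed to proceed. All names below are our own.

```python
class Monitor:
    """A minimal runtime monitor for a communication protocol: the
    specification is a finite-state machine over (sender, receiver,
    label) messages, and each send is checked before it proceeds."""
    def __init__(self, transitions, start):
        self.transitions = transitions  # {(state, msg): next_state}
        self.state = start

    def send(self, sender, receiver, label):
        key = (self.state, (sender, receiver, label))
        if key not in self.transitions:
            raise RuntimeError(f"protocol violation in state {self.state}: "
                               f"{sender}->{receiver}:{label}")
        self.state = self.transitions[key]

# A two-party ping/pong protocol: the client asks, the server replies.
spec = {(0, ("client", "server", "ping")): 1,
        (1, ("server", "client", "pong")): 0}
m = Monitor(spec, start=0)
m.send("client", "server", "ping")
m.send("server", "client", "pong")
```

Discourje's session-type specifications are far more expressive than a flat state machine, but the safety guarantee has the same flavour: an implementation that deviates from the specification is stopped at run time.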
46. 14 October Stefano Bromuri Deep Learning applied to finance and logistics abstract slides

Deep Learning applied to finance and logistics
Stefano Bromuri

This presentation is meant to be a highlight concerning the activities of the CAROU team in the Heerlen campus for all computer scientists that may be interested in participating. Specifically, this presentation starts by revisiting the concepts of machine learning (ML) and deep learning (DL). After discussing the building blocks of ML and DL, the presentation then focuses on three projects in the AI domain currently carried out by CAROU with and for industrial partners in the Heerlen campus. The three domains considered are emotion analytics (APG), logistics (DHL) and customer segmentation (APG). Ethical considerations are discussed for each of the three projects.
45. 29 September Marloes Venema (Radboud University) Systemizing Attribute-Based Encryption -- In Search of Practical Decentralized ABE abstract slides

Systemizing Attribute-Based Encryption -- In Search of Practical Decentralized ABE
Marloes Venema

Attribute-based encryption (ABE) cryptographically implements fine-grained access control on data. As such, data can be stored by an entity that is not necessarily trusted to enforce access control, or an entity that is not even trusted to have access to the plaintext data at all. Instead, access control can be externally enforced by a trusted entity. For instance, you can store your personal data in the cloud without the storage servers getting actual access to it. You can also manage access to the data, and share it with the people with whom you want to share that data. Additionally, some multi-authority variants of ABE -- which do not have a central authority -- can effectively and securely implement access control in multiple-domain settings. We show that, despite the amount of work that has been done in the area of ABE, a practical and privacy-friendly scheme does not exist yet. To make ABE more practical and privacy friendly, we outline important open problems.
44. 15 September Christof Ferreira Torres (Université du Luxembourg) Ægis: Shielding Vulnerable Smart Contracts Against Attacks abstract slides

Ægis: Shielding Vulnerable Smart Contracts Against Attacks
Christof Ferreira Torres (Université du Luxembourg)

In recent years, smart contracts have suffered major exploits, costing millions of dollars. Unlike traditional programs, smart contracts are deployed on a blockchain. As such, they cannot be modified once deployed. Though various tools have been proposed to detect vulnerable smart contracts, the majority fails to protect vulnerable contracts that have already been deployed on the blockchain. Only very few solutions have been proposed so far to tackle the problem of post-deployment. However, these solutions suffer from low precision and are not generic enough to prevent any type of attack. In this work, we introduce Ægis, a dynamic analysis tool that protects smart contracts from being exploited during runtime. Its capability of detecting new vulnerabilities can easily be extended through so-called attack patterns. These patterns are written in a domain-specific language that is tailored to the execution model of Ethereum smart contracts. The language enables the description of malicious control and data flows. In addition, we propose a novel mechanism to streamline and speed up the process of managing attack patterns. Patterns are voted upon and stored via a smart contract, thus leveraging the benefits of tamper-resistance and transparency provided by the blockchain. We compare Ægis to current state-of-the-art tools and demonstrate that our solution achieves higher precision in detecting attacks. Finally, we perform a large-scale analysis on the first 4.5 million blocks of the Ethereum blockchain, thereby confirming the occurrences of well-reported attacks, as well as uncovering unreported attacks in the wild.
43. 12 May Stijn de Gouw Verifying OpenJDK's LinkedList using KeY abstract slides

Verifying OpenJDK's LinkedList using KeY
Stijn de Gouw

As a particular case study of the formal verification of state-of-the-art, real software, we discuss the specification and verification of a corrected version of the implementation of a linked list as provided by the Java Collection framework.
42. 11 May Clara Maathuis Social Manipulation Campaigns abstract slides

Social Manipulation Campaigns
Clara Maathuis

The increased media attention for the growing number of social manipulation campaigns aimed at influencing or damaging perceptions, processes, and systems has, in recent years, rerouted and rewritten the thoughts and agendas of academic researchers, professionals, and decision makers. This implied that, after a long struggle, seeing through offensive lenses came to be perceived as the right approach to advancing effective, efficient, and adaptive intelligent solutions, programs, and strategies for defence, deterrence, and resilience when fighting such threats. However, these measures should embed socio-technical constraints, complexities, requirements, and values in their design phase, and should further consider hybrid implementation and evaluation methods. Along these lines, this presentation aims at i) introducing this phenomenon, and ii) discussing some incidents and solutions for countering such threats.
41. 14 April 2020 Freek Verbeek Formal Proofs of Return Address Integrity abstract slides

Formal Proofs of Return Address Integrity
Freek Verbeek

We present a methodology for generating a characterization of the memory used by an assembly program, as well as a formal proof that the assembly is bounded to the generated memory regions. A formal proof of memory usage is required for compositional reasoning over assembly programs. Moreover, it can be used to prove low-level security properties, such as integrity of the return address of a function. Our verification method is based on interactive theorem proving, but provides automation by generating pre- and postconditions, invariants, control-flow, and assumptions on memory layout. As a case study, three binaries of the Xen hypervisor are disassembled. These binaries are the result of a complex build-chain compiling production code, and contain various complex and nested loops, large and compound data structures, and functions with over 100 basic blocks. The methodology has been successfully applied to 251 functions, covering 12,252 assembly instructions.
 

2019

40. 5 November Sung-Shik Jongmans Distributed programming using role-parametric session types in Go abstract slides

Distributed programming using role-parametric session types in Go
Sung-Shik Jongmans

In this talk, I'll present a framework for the static specification and safe programming of message passing protocols where the number and kinds of participants are dynamically instantiated. More precisely, we developed the first theory of distributed multiparty session types (MPST) to support parameterised protocols with indexed roles; our framework statically infers the different kinds of participants induced by a protocol definition as role variants, and produces decoupled endpoint projections of the protocol onto each variant. This enables safe MPST-based programming of the parameterised endpoints in distributed settings: each endpoint can be implemented separately by different programmers, using different techniques (or languages). We proved the decidability of role variant inference and well-formedness checking, and the correctness of projection. We implemented our theory as a toolchain for programming such role-parametric MPST protocols in Go. Our approach is to generate API families of lightweight, protocol- and variant-specific type wrappers for I/O. The APIs ensure a well-typed Go endpoint program (by native Go type checking) will perform only compliant I/O actions w.r.t. the source protocol. We leveraged the abstractions of MPST to support the specification and implementation of Go applications involving multiple channels, possibly over mixed transports (e.g., Go channels, TCP), and channel passing via a unified programming interface. We evaluated the applicability and run-time performance of our generated APIs using microbenchmarks and real-world applications. This is joint work with David Castro, Raymond Hu, Nicholas Ng, and Nobuko Yoshida; the talk is based on a paper that was published in POPL 2019 (https://dx.doi.org/10.1145/3290342).
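The core notion of endpoint projection can be illustrated on a toy global protocol (our own encoding, far simpler than the role-parametric theory in the talk): keep, for one role, only the interactions it participates in, as sends or receives.

```python
def project(global_type, role):
    """Project a global protocol (a list of (sender, receiver, label)
    interactions) onto one role, keeping only the send/receive actions
    that role is involved in."""
    local = []
    for sender, receiver, label in global_type:
        if role == sender:
            local.append(("send", receiver, label))
        elif role == receiver:
            local.append(("recv", sender, label))
    return local

# A toy three-role request/response protocol through a proxy.
g = [("client", "proxy", "req"),
     ("proxy", "server", "req"),
     ("server", "proxy", "resp"),
     ("proxy", "client", "resp")]
```

In the MPST framework of the talk, such projections are inferred per role variant and turned into generated Go APIs, so that the native type checker rejects non-compliant I/O; the sketch above only shows the projection step itself.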
39. 24 September Hugo Jonker Fingerprint surface-based detection of web bot detectors abstract slides

Fingerprint surface-based detection of web bot detectors
Hugo Jonker

Web bots are used to automate client interactions with websites, which facilitates large-scale web measurements. However, websites may employ web bot detection. When they do, their response to a bot may differ from responses to regular browsers. The discrimination can result in deviating content, restriction of resources or even the exclusion of a bot from a website. This places strict restrictions upon studies: the more bot detection takes place, the more results must be manually verified to confirm the bot's findings. To investigate the extent to which bot detection occurs, we reverse-analysed commercial bot detection. We found that in part, bot detection relies on the values of browser properties and the presence of certain objects in the browser's DOM model. This part strongly resembles browser fingerprinting. We leveraged this for a generic approach to detect web bot detection: we identify what part of the browser fingerprint of a web bot uniquely identifies it as a web bot by contrasting its fingerprint with those of regular browsers. This leads to the fingerprint surface of a web bot. Any website accessing the fingerprint surface is then accessing a part unique to bots, and thus engaging in bot detection. We provide a characterisation of the fingerprint surface of 14 web bots. We show that the vast majority of these frameworks are uniquely identifiable through well-known fingerprinting techniques. We design a scanner to detect web bot detection based on the reverse analysis, augmented with the found fingerprint surfaces. In a scan of the Alexa Top 1 Million, we find that 12.8% of websites show indications of web bot detection.
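The notion of a fingerprint surface can be captured as a set difference: the attribute/value pairs in the bot's browser fingerprint that no regular browser exhibits. The attribute names below are illustrative examples, not the paper's actual characterisation.

```python
def fingerprint_surface(bot_fp, browser_fps):
    """The fingerprint surface of a web bot: the attribute/value pairs
    in the bot's fingerprint that occur in no regular browser's
    fingerprint. A site reading any of these attributes and values is,
    in effect, engaging in bot detection."""
    regular = set().union(*(fp.items() for fp in browser_fps))
    return dict(set(bot_fp.items()) - regular)

# Hypothetical fingerprints: one bot, two regular browsers.
bot = {"webdriver": True, "plugins": 0, "userAgent": "Mozilla/5.0"}
browsers = [{"webdriver": False, "plugins": 3, "userAgent": "Mozilla/5.0"},
            {"webdriver": False, "plugins": 5, "userAgent": "Mozilla/5.0"}]
surface = fingerprint_surface(bot, browsers)
```

A scanner can then flag a website as performing bot detection when its scripts read any attribute in the computed surface.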
38. 5 July Christof Ferreira Torres (Université du Luxembourg) The Art of The Scam: Demystifying Honeypots in Ethereum Smart Contracts abstract slides

The Art of The Scam: Demystifying Honeypots in Ethereum Smart Contracts
Christof Ferreira Torres

Modern blockchains, such as Ethereum, enable the execution of so-called smart contracts -- programs that are executed across a decentralised network of nodes. As smart contracts become more popular and carry more value, they become more of an interesting target for attackers. In the past few years, several smart contracts have been found to be vulnerable and thus exploited by attackers. However, a new trend towards a more proactive approach seems to be on the rise, where attackers do not search for vulnerable contracts anymore. Instead, they try to lure their victims into traps by deploying vulnerable-looking contracts that contain hidden traps. This new type of contracts is commonly referred to as honeypots. In this paper, we present the first systematic analysis of honeypot smart contracts, by investigating their prevalence, behaviour and impact on the Ethereum blockchain. We develop a taxonomy of honeypot techniques and use this to build HONEYBADGER: a tool that employs symbolic execution and well defined heuristics to expose honeypots. We perform a large-scale analysis on more than 2 million smart contracts and show that our tool not only achieves high precision, but also high scalability. We identify 690 honeypot smart contracts as well as 240 victims in the wild, with an accumulated profit of more than $90,000 for the honeypot creators. Our manual validation shows that 87% of the reported contracts are indeed honeypots.
37. 18 June Greg Alpár How can resistance turn into curiosity? -- The Open Maths approach abstract, slides

How can resistance turn into curiosity? -- The Open Maths approach
Greg Alpár

Open Maths is an initiative, based on state-of-the-art research results, to make mathematics accessible and enjoyable for students in higher education and in life-long learning. In this talk, I will give an overview of the first two Open Maths courses and their impact on students. Be prepared for some surprises as we delve deeper into one particular aspect of this new approach!
36. 28 May Vincent van der Meer Investigating File Fragmentation abstract slides

Investigating File Fragmentation
Vincent van der Meer

In digital forensics, data recovery tools are used to find lost or deleted evidence. One popular approach is file carving, which recovers files based on their contents instead of metadata (such as a file system). To recover fragmented files, file carvers must make assumptions about fragmentation of files on the examined devices. There has been one large-scale study on file fragmentation, on devices from 1998-2006. Significant advances in hard- and software since then more than merit a new study. In this paper, we examine file fragmentation of 220 systems (334 drives) that serve as the primary computer of their owners. To this end, we have devised a privacy-friendly approach to acquiring the data. Only meta-information of the file system was collected. Moreover, meta-information was pro-actively filtered to avoid any potential privacy-sensitive issues. Our results indicate that the average level of fragmentation is lower compared to previous studies. Interestingly, we find that fragmentation not only concerns gaps between blocks of the same file, but also the order of these blocks. In particular, we find that a type of fragmentation where a later block is stored on disk before an earlier block of the same file occurs in nearly half of all fragmented files. This type of out-of-order fragmentation is rarely discussed in the literature, yet its high rate of occurrence should impact the practice of file carving.
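The out-of-order fragmentation discussed here is easy to state precisely in terms of a file's block list (logical position mapped to physical block number); the classifier below is a sketch under that simplified model, with our own category names.

```python
def classify_fragmentation(blocks):
    """Classify a file's on-disk layout from its list of physical block
    numbers in logical (file) order. 'contiguous' means no gaps;
    'out-of-order' means a later part of the file is stored at a lower
    physical address than an earlier part."""
    if all(b2 == b1 + 1 for b1, b2 in zip(blocks, blocks[1:])):
        return "contiguous"
    if any(b2 < b1 for b1, b2 in zip(blocks, blocks[1:])):
        return "out-of-order"
    return "fragmented"
```

A file carver that assumes fragments always appear on disk in file order would miss exactly the "out-of-order" cases, which is why their high rate of occurrence matters.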
35. 29 January Frank Tempelman The Semi-Direct Control System - a vehicle for AI research abstract slides

The Semi-Direct Control System - a vehicle for AI research
Frank Tempelman

In my talk next Tuesday, after a brief initial introduction of myself, I will give an overview of my possible research topics, particularly in the field of artificial intelligence and, given my background and contacts, especially in combination with aviation. I will discuss the Semi-Direct Control System (SDCS), an alternative control system for unmanned aerial vehicles (UAVs), mainly applicable in long-distance so-called 'dog fights', in which aircraft engage each other at close mutual range. The problem here is that a direct control system, in which a pilot who is often thousands of kilometres away steers the vehicle directly with control equipment on the basis of the UAV's sensor information, is too slow. This could be solved by giving the UAV a much greater degree of autonomy: give it more intelligence and you no longer need to supply it with a constant stream of commands, because it can manage on its own. We are not there yet, however, and the SDCS offers a middle ground, in which the frequency of commands forms a trade-off with the intelligence required on board the UAV. The SDCS thereby offers an attractive and practically applicable research topic for (increasing) autonomy in aircraft and other vehicles, in areas including knowledge acquisition, formalisation and representation, image recognition, and decision logic. It would be very interesting to map out this trade-off between the degree of control and the intelligence required in the aircraft.
 

2018

34. 11 December Bernard van Gastel On the security of self-encrypting SSDs abstract

On the security of self-encrypting SSDs
Bernard van Gastel

Various SSDs offer an option to enable encryption at a firmware level. The security goals should be similar to those of disk encryption handled by software. And indeed, Bitlocker, the disk encryption built into Windows, fully relies on firmware-level encryption if it is present. However, when Bernard and Carlo Meijer (RU) investigated drives from Samsung and Crucial, they found several issues with the afforded security, such as a master password set to the empty string.
33. 16 October Johan Jeuring Intelligent Programming Tutors abstract, slides

Intelligent Programming Tutors
Johan Jeuring, OU & University of Utrecht

Intelligent programming tutors (IPTs) are used to support students who learn how to program. There exist IPTs for almost any programming language. In this talk I will give a brief introduction to IPTs. In particular, I will discuss what kind of feedback and hints these tutors give, and what technologies they use to calculate feedback.
32. 25 Sept Josje Lodder Developing an e-learning tool for inductive proofs abstract, slides

Developing an e-learning tool for inductive proofs
Josje Lodder

Proving is a central subject of most logic courses. Students attending such courses not only have to learn how to prove consequences in a formal system, but also have to reason about formal systems. A typical example of an exercise that occurs in many textbooks is the following: prove that the number of left parentheses in a logical formula is equal to the number of right parentheses. These kinds of properties can be proved by structural induction, where the structure of the proof follows the structure of the inductive definition of the logical language. Students have conceptual as well as technical problems with inductive proofs. In this talk I will discuss our ideas and first results concerning the development of an e-learning tool that helps students practice inductive proofs.
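The textbook property from the abstract (every fully parenthesised formula has as many left as right parentheses) can also be checked mechanically; the recursive renderer below mirrors the inductive definition of the language (the encoding is our own), which is exactly the structure the induction proof follows.

```python
def render(f):
    """Render a propositional formula, given as a variable name (string)
    or a nested tuple (op, left, right), with full parenthesisation."""
    if isinstance(f, str):
        return f
    op, left, right = f
    return "(" + render(left) + " " + op + " " + render(right) + ")"

def parens_balanced(f):
    """The property proved by structural induction: #'(' equals #')'.
    Base case: a variable contributes zero of each. Step case: a
    compound formula adds one of each to two subformulas that balance
    by the induction hypothesis."""
    s = render(f)
    return s.count("(") == s.count(")")
```

The recursion in `render` visits formulas exactly as the induction proof does, which is the intuition the e-learning tool must get across.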
31. 26 June Hugo Jonker Investigating Adblock wars abstract, slides

Investigating Adblock wars
Hugo Jonker

We investigate the extent to which websites react to filter list updates. We collected data on the Alexa Top-10K for 120 days, and on specific sites newly added to filter lists for 140 days. By evaluating how long a filter rule triggers on a website, we can gauge how long it remains effective. We matched websites with both regular adblocking filter lists (Easylist) and with filter lists that remove anti-adblocking (primarily Ublock Protector / Nano Protector). From our data, we observe that the effectiveness of the Easylist adblocking filter decays initially, but after ~80 days seems to stabilize. However, we found no evidence for any significant decay in effectiveness of the more specialized, but less widely used, Ublock Protector / Nano Protector list.
30. 24 April Stijn de Gouw Human-in-the-Loop Simulation of Cloud Services abstract, slides

Human-in-the-Loop Simulation of Cloud Services
Stijn de Gouw

We present an integrated tool-suite for the simulation of software services which are offered on the Cloud. The tool-suite uses the Abstract Behavioral Specification (ABS) language for modeling the software services and their Cloud deployment. For the real-time execution of the ABS models we use a Haskell backend which is based on a source-to-source translation of ABS into Haskell. The tool-suite then allows Cloud engineers to interact in real time with the execution of the model by deploying and managing service instances. The resulting human-in-the-loop simulation of Cloud services can be used both for training purposes and for (semi-)automated support for the real-time monitoring and management of the actual service instances.
29. 27 March Benjamin Krumnow The Shepherd Project: Automated security audits of website login processes abstract, slides

The Shepherd Project: Automated security audits of website login processes
Benjamin Krumnow

HTTP is stateless: the protocol itself does not remember earlier requests. HTTP cookies were created to address this. Cookies allow websites to remember things, such as whether you are logged in or not. After logging in, the cookies authenticate the user, so that the user does not have to type his password with every click. If an attacker steals these cookies, he can thus impersonate a user. In 2010, the Firesheep plugin for Firefox demonstrated how easy it was to steal such cookies for many popular websites (Gmail, Facebook, etc.). The problem that Firesheep exploited was that cookies were sent over an insecure channel (HTTP instead of HTTPS). Such an attack is called a session hijacking attack.

In the Shepherd project, we investigate the adoption of countermeasures against session hijacking attacks. Over the last year, we have developed an automatic selenium-based scanner. This scanner performs an automated audit of the security of login processes of web sites.
In this OUrsi talk, I will give you an overview of the current scanning framework, explain how it assesses login security and talk about our current study, where we analyse over 65K websites. I hope that we will have lively discussion to identify possible unclear points and future directions.
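One concrete check such a login-security audit can perform, sketched here in plain Python (Shepherd itself is Selenium-based; this is not its actual code): inspect the Set-Cookie header of the session cookie for the attributes that blunt session hijacking.

```python
def audit_set_cookie(header):
    """Flag session-hijacking risks in a Set-Cookie header: without
    'Secure' the cookie may travel over plain HTTP (the Firesheep
    scenario), and without 'HttpOnly' it can be read by injected
    scripts."""
    attrs = [part.strip().lower() for part in header.split(";")[1:]]
    issues = []
    if "secure" not in attrs:
        issues.append("missing Secure")
    if "httponly" not in attrs:
        issues.append("missing HttpOnly")
    return issues
```

An automated scanner logs in, captures the session cookie, and applies checks like this one at scale across thousands of sites.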

28. 27 February Vincent van der Meer Future-proofing recovery of deleted files abstract, slides

Maintaining the ability to recover deleted files: on the further development and future-proofing of this digital forensic specialization
Vincent van der Meer

Reconstructing deleted files, known as file carving, is a crucial part of digital forensic work, often concerning child pornography cases. The explosive growth of data makes file carving increasingly time-consuming and therefore increasingly inadequate, so that evidence remains hidden. File carving is still set up separately per device and per app, resulting in considerable delays in developing file carving for new apps. This research will investigate how the app-specific part can be made as small as possible through abstraction and generalization. A spin-off will be a widely applicable, modular, maintainable and expandable framework with a corresponding benchmark to compare the results of new techniques.
27. 19 February Hassan Alizadeh Intrusion Detection and Traffic Classification using Application-aware Traffic Profiles abstract, slides

Intrusion Detection and Traffic Classification using Application-aware Traffic Profiles
Hassan Alizadeh

Along with the growing number of applications and end-users, online network attacks and advanced generations of malware have continuously proliferated. Many studies have addressed the issue of intrusion detection by inspecting aggregated network traffic with no knowledge of the responsible applications/services. Such systems fail to detect intrusions in applications whenever their abnormal traffic fits into the network normality profiles. This research addresses the problem of detecting intrusions in (known) applications when their traffic exhibits anomalies. To do so, a network monitoring tool needs to: (1) identify the source application responsible for the traffic; (2) have per-application traffic profiles; and (3) detect deviations from profiles given a set of traffic samples. This work is mainly devoted to the last two topics. Following an initial survey of traffic classification techniques, this research proposes the use of three kinds of Gaussian Mixture Model (GMM) learning approaches, namely classic, unsupervised, and Universal Background Model (UBM), to build per-application profiles. The effectiveness of the proposed approaches has been demonstrated by conducting different sets of experiments on two public datasets collected from real networks, where the source of each traffic flow is provided. Experimental results indicate that while the proposed approaches are effective for most applications, the UBM approach significantly outperforms the others.
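A drastically reduced version of a per-application profile, a single Gaussian over one flow feature standing in for the Gaussian mixtures used in this work, already shows the detection principle: flag traffic that deviates too far from the application's learned normality. All names and numbers below are illustrative.

```python
from statistics import mean, stdev

class AppProfile:
    """A per-application traffic profile reduced to a single Gaussian
    over one flow feature (e.g. packet size) -- a toy stand-in for the
    GMM/UBM profiles used in the research."""
    def __init__(self, samples):
        self.mu = mean(samples)
        self.sigma = stdev(samples)

    def is_anomalous(self, value, k=3.0):
        """Flag values more than k standard deviations from the mean."""
        return abs(value - self.mu) > k * self.sigma

# Hypothetical packet sizes observed for one application's normal traffic.
web = AppProfile([500, 520, 480, 510, 495])
```

A real GMM profile would model several modes per application and score whole sets of flows, but the decision, "does this traffic fit the application's profile?", has the same shape.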
26. 19 February Fenia Aivaloglou How Kids Code and How We Know abstract, slides

How Kids Code and How We Know
Fenia Aivaloglou

Programming education nowadays begins as early as elementary school age. Block-based programming languages like Scratch are increasingly popular, and there is substantial research showing that they are suitable for early programming education. However, how do children program, and are they really learning? In this talk we will focus on Scratch and explore the programs that children write. By scraping the Scratch public repository we have downloaded and analyzed 250K projects in terms of complexity, used abstractions and programming concepts, and code smells. We will further discuss the effect of code smells on novice Scratch programmers and our experiments in Dutch classrooms. Focusing on how to teach good programming practices to children, we designed the edX MOOC "Scratch: Programmeren voor kinderen" and analyzed the results of its first run, in which more than 2K children actively participated. We found factors that can predict successful course completion, as well as age-related differences in the students' learning performance. To further examine children's motivation and self-efficacy and their relation to learning performance, we ran a Scratch course in three schools; we will present our initial findings.
25. 30 January Greg Alpár Blockchain for identity management, seriously? abstract, slides

Blockchain for identity management, seriously?
Greg Alpár

Right now (Dec. 11, 2017, 15:45) 1 Bitcoin is worth €14,228.71. This amazing success of a cryptocurrency has made blockchain, the technology underlying Bitcoin, so exciting for businesses that it is often called the next big revolution after the internet. Blockchains are proposed to solve all kinds of problems in our digital world. Besides the financial context, identity is considered another promising area for blockchains. Is it a good idea? In this OUrsi session, I would like to share with you some thoughts on the application of blockchains in identity management. Since there are no big results in our study yet, this talk will hopefully turn into a fruitful discussion.
 

2017

24. 30 November Pekka Aho Automated Model Extraction and Testing of Graphical User Interface Applications abstract, slides

Automated Model Extraction and Testing of Graphical User Interface Applications
Pekka Aho

The industry practice for test automation through graphical user interfaces is script-based. In some tools the scripts can be recorded from the user (capture & replay tools); in others, the scripts are written manually. The main challenge with script-based test automation is the maintenance effort required when the GUI changes. One attempt to reduce this effort has been model-based testing, which aims to generate test cases from manually created models. When the system under test (SUT) changes, changing the models and re-generating the test cases is supposed to reduce the maintenance effort. However, creating formal models that allow test case generation requires very specialized expertise. Random testing (monkey testing) is a maintenance-free and scriptless approach to automating GUI testing. In random GUI testing, the tool emulates an end-user by randomly interacting with the GUI under test: pressing buttons and menus and providing input to other types of GUI widgets. More recent research aims to extract GUI models automatically through runtime analysis while automatically exploring and interacting with the GUI (or during random testing). Test case generation based on extracted models is problematic, but by comparing extracted GUI models of consecutive versions it is possible to detect changes in the GUI. This approach aims to address the maintenance problem with scriptless regression testing and change analysis.
23. 30 November Twan van Laarhoven Local network community detection: K-means clustering and conductance abstract, slides

Local network community detection: K-means clustering and conductance
Twan van Laarhoven
Joint work with Elena Marchiori

Local network community detection is the task of finding a single community of nodes in a graph, concentrated around a few given seed nodes. The aim is to perform this search in a localized way, without having to look at the entire graph. Conductance is a popular objective function used in many algorithms for local community detection. In this talk I will introduce a continuous relaxation of conductance and show that continuous optimization of this objective still leads to discrete communities. I will investigate the relation of conductance with weighted kernel k-means for a single community, which leads to the introduction of a new objective function, σ-conductance; conductance is obtained by setting σ to 0. I will present two algorithms for optimizing σ-conductance, which show promising results on benchmark datasets.
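For readers unfamiliar with the objective, here is a minimal sketch (illustrative, not the talk's algorithm) of discrete conductance — the fraction of a community's edge volume that crosses its boundary; the example graph is invented:

```python
def conductance(edges, community):
    """Conductance of a node set S in an undirected graph:
    cut(S, V\\S) / min(vol(S), vol(V\\S)),
    where vol() is the sum of node degrees on one side."""
    S = set(community)
    cut = vol_S = vol_rest = 0
    for u, v in edges:
        in_u, in_v = u in S, v in S
        if in_u != in_v:           # edge crosses the boundary
            cut += 1
        # each endpoint contributes 1 to its side's volume
        vol_S += in_u + in_v
        vol_rest += (not in_u) + (not in_v)
    denom = min(vol_S, vol_rest)
    return cut / denom if denom else 0.0

# Two triangles joined by a single bridge edge: each triangle is a
# natural community with low conductance.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
print(conductance(edges, {0, 1, 2}))  # 1/7 ≈ 0.143
```

A lower value means a better community: few boundary edges relative to internal volume. Local algorithms try to minimize this quantity while only touching nodes near the seeds.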
22. 27 June Jeroen Keiren An O(m log n) algorithm for stuttering equivalence abstract, slides

An O(m log n) algorithm for stuttering equivalence
Jeroen Keiren

In this presentation, I will discuss a new algorithm to determine stuttering equivalence with time complexity O(m log n), where n is the number of states and m is the number of transitions of a Kripke structure. Theoretically, this algorithm substantially improves upon existing algorithms, which all have time complexity of the order O(mn) at best. Moreover, when compared with other algorithms, the new algorithm has better or equal space complexity. Practical results confirm these findings: they show that the algorithm can outperform existing algorithms by several orders of magnitude, especially when the Kripke structures are large.
21. 13 June Stef Joosten Software development in relation algebra with Ampersand abstract, slides

Software development in relation algebra with Ampersand
Stef Joosten

Relation algebra can be used as a programming language for building information systems. We will demonstrate this principle by discussing a database-application for legal reasoning. In this case study, we will focus on the mechanisms of programming in relation algebra. Besides being declarative, relation algebra comes with attractive promises for developing big software. The case study was compiled with Ampersand, a compiler whose design and development is the focus of a research collaboration between Ordina and the Open University.
20. 9 May,
13.30 - 14.30
Hieke Keuning Code Quality Issues in Student Programs abstract, slides

Code Quality Issues in Student Programs
Hieke Keuning

Because low-quality code can cause serious problems in software systems, students learning to program should pay attention to code quality early. Although many studies have investigated mistakes that students make during programming, we do not know much about the quality of their code. We have examined the presence of quality issues related to program flow, choice of programming constructs and functions, clarity of expressions, decomposition and modularization in a large set of student Java programs. We investigated which issues occur most frequently, whether students are able to solve these issues over time, and whether the use of code analysis tools has an effect on issue occurrence. We found that students hardly fix issues, in particular issues related to modularization, and that the use of tooling does not have much effect on the occurrence of issues.
19. 4 April Greg Alpár The Alphabet of ABCs abstract, slides

The Alphabet of ABCs
Greg Alpár

The digital world is becoming more and more a part of the physical world. Data, bound to our identity, relationships and communications, is processed all around the world. This may result in privacy harms, including unwanted advertising, data creep, exclusion, insecurity, etc. Although much of this data could be used without linking it to you personally, currently everybody seems to accept that we have to give up control over our privacy. A fundamental technological paradigm today is that you almost always have to identify yourself in order to use a given digital service. What if we could avoid the need for this? For instance, you could just prove that you have a train ticket without a central party keeping an account about you and all your trips. Or you could merely show that you are over 18 years of age, without having to show (and let be stored) your name and credential number, so you are eligible to buy a bottle of 'jenever'. Or you could shop online while giving away as little personal information as you used to in a department store in the 70s. Attribute-based credentials (ABCs) enable a wide variety of scenarios in which you can prove that you are entitled to do things that you can otherwise only do by using identifiers. In the last OUrsi meeting Fabian presented the IRMA technology, one of the most important practical ABC projects, and its applications. In this OUrsi meeting, I will talk about the neat cryptographic and mathematical techniques that allow for the above-mentioned high level of flexibility and practical privacy. (Don't get scared, you won't need much background in mathematics to understand my presentation!)
18. 7 March Rolando Trujillo
(University of Luxembourg)
Counteracting active attacks in social network graphs abstract, slides

Counteracting active attacks in social network graphs
Rolando Trujillo

Active privacy attacks in social networks attempt to de-anonymize a social graph by using social links previously established between the attackers and their victims. Malicious users are hard to identify a priori, so active attacks are typically prevented by anonymizing (perturbing) the social graph prior to publication. This talk is an introduction to this type of privacy attack and the anonymization techniques that have been proposed to counteract it. Towards the end of the talk we will discuss drawbacks of current approaches and possible improvements.
17. 7 February Fabian van den Broek IRMA, as simple as ABC abstract, slides

IRMA, as simple as ABC
Fabian van den Broek

I will present the IRMA technology. IRMA (I Reveal My Attributes) is an attribute-based credential system in which users can obtain credentials stating attributes about themselves (e.g. I'm an employee of the Open University, with employee number 1234). These attributes can be disclosed selectively to verifiers (e.g. I'm an employee of the Open University, xxxxxxxxxxxxxxxxxxxxx), offering a very flexible and potentially privacy-friendly way of authentication. This presentation will mostly focus on the progress we have made in the last two years and the progress we hope to achieve in the coming years.
 

2016

16. 25 October Jeroen Keiren Game-based approach to branching bisimulation abstract, slides

Game-based approach to branching bisimulation
Jeroen Keiren

Branching bisimilarity and branching bisimilarity with explicit divergences are typically used in process algebras with silent steps when relating implementations to specifications. When an implementation fails to conform to its specification, i.e., when both are not related by branching bisimilarity (with explicit divergence), pinpointing the root causes can be challenging. In this work, we provide characterisations of branching bisimilarity (with explicit divergence) as games between Spoiler and Duplicator, offering an operational understanding of both relations. Moreover, we show how such games can be used to assist in diagnosing non-conformance between implementation and specification.
15. 4 October Arjen Hommersom Flexible Probabilistic Logic for Solving the Problem of Multimorbidity abstract, slides

Flexible Probabilistic Logic for Solving the Problem of Multimorbidity
Arjen Hommersom

One of the central questions in artificial intelligence is how to combine knowledge representation principles with machine learning methods. Effectively dealing with structure and relations in data opens up a wealth of applications in areas such as economics, biology, and healthcare. In this talk I will discuss a recent project proposal, which aims to develop a new methodology that provides powerful methods for deriving knowledge from structured data using state-of-the-art probabilistic logics. These methods will be applied to deal with a highly significant and challenging problem in modern medicine: managing patients with multimorbidity, i.e., patients that have multiple chronic conditions. The main purpose of this talk is to obtain feedback as this project will likely soon be presented to an evaluation committee in the exact sciences.

14. 28 June Hugo Jonker Gaming the H-index abstract, slides

Gaming the H-index
Hugo Jonker

In this talk, we discuss some early results looking into formally modelling the scientific process, and elucidating what attacks can be performed on bibliographic metrics. Specifically, we distinguish between hacking, fraud, and gaming, provide initial notions of these three terms and initial results towards identifying which gaming attacks are possible on the H-index.
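For reference, the metric under discussion is easy to compute; the sketch below (illustrative, not from the talk) also shows why small, targeted citation bumps can move it:

```python
def h_index(citations):
    """The H-index: the largest h such that at least h papers
    have h or more citations each."""
    counts = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(counts, start=1):
        if cites >= rank:
            h = rank
        else:
            break
    return h

# Five papers with citation counts [10, 8, 5, 4, 4] give h = 4.
print(h_index([10, 8, 5, 4, 4]))  # 4
# One 'gaming' move: pushing the two papers with 4 citations to 5
# (e.g. via self-citations) lifts the index to 5.
print(h_index([10, 8, 5, 5, 5]))  # 5
```

The papers just below the current threshold are the cheapest targets, which is exactly the kind of manipulation a formal model of bibliographic metrics can make precise.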
13. 31 May Josje Lodder Strategies for axiomatic Hilbert-style proofs abstract, slides

Strategies for axiomatic Hilbert-style proofs
Josje Lodder

One of the topics students encounter in a logic course is a derivation system such as natural deduction or axiomatic (Hilbert-style) derivation. Finding derivations is not always easy. Students need a lot of practice with this, and an e-learning system can help. For natural deduction various supporting tools exist, but for axiomatic derivation there are no tools that help students construct these derivations. To be able to give targeted feedback and feedforward, we have developed strategies with which we can automatically generate axiomatic proofs, and we will also use these strategies to track students' solutions, so that we can give targeted feedback at any moment.
12. 26 April Bastiaan Heeren Ask-Elle: an Adaptable Programming Tutor for Haskell Giving Automated Feedback abstract, slides

Ask-Elle: an Adaptable Programming Tutor for Haskell Giving Automated Feedback
Bastiaan Heeren

Ask-Elle is a tutor for learning the higher-order, strongly typed functional programming language Haskell. It supports the stepwise development of Haskell programs by verifying the correctness of incomplete programs and by providing hints. Programming exercises are added to Ask-Elle by providing a task description for the exercise, one or more model solutions, and properties that a solution should satisfy. The properties and model solutions can be annotated with feedback messages, and the amount of flexibility that is allowed in student solutions can be adjusted. The main contribution of our work is the design of a tutor that combines (1) the incremental development of different solutions in various forms to a programming exercise with (2) automated feedback and (3) teacher-specified programming exercises, solutions, and properties. The main functionality is obtained by means of strategy-based model tracing and property-based testing. We have tested the feasibility of our approach in several experiments, in which we analyse, among other things, both intermediate and final student solutions to programming exercises.
11. 29 March Hugo Jonker MitM attacks evolved abstract, slides

MitM attacks evolved
Hugo Jonker

The security community seems to be thoroughly familiar with man-in-the-middle attacks. However, the common perception of this type of attack is outdated. It originates from when network connections were fixed, not mobile, before 24/7 connectivity became ubiquitous. The common perception of this attack stems from an era before the vulnerability of the protocol's context was realised. Thanks to revelations by Snowden and by currently available man-in-the-middle tools focused on protocol meta-data (such as so-called `Stingrays' for cellphones), this view is no longer tenable. Security protocols that only protect the contents of their messages are insufficient. Contemporary security protocols must also take steps to protect their context: who is talking to whom, where is the sender located, etc. In short: the attacker has evolved. It's high time for our security models and requirements to catch up.
10. 26 February
13.45-14.30, Utrecht
Tanja Vos TESTAR: Automated Testing at the User Interface Level abstract, slides

TESTAR: Automated Testing at the User Interface Level
Tanja Vos

Testing applications at the user interface level is an important yet expensive and labour-intensive activity. Several tools exist to automate UI-level testing; these tools are based on capture & replay or visual image recognition. We present TESTAR, a tool for automated GUI testing that takes a totally different approach and has been demonstrated to be highly useful in practice. Using TESTAR, you can start testing immediately and you do not need to specify test cases in advance. TESTAR automatically generates and executes random test sequences based on a structure that is automatically derived from the UI through the accessibility API. Defects that can be found this way are crashes or suspicious outputs. For defects related to functionality, expected outputs need to be added to the tool; this can be done incrementally. The structure on which testing is based is built automatically during testing, so the UI is not assumed to be fixed and tests will keep running even as the UI evolves, which reduces the maintenance problem that threatens the approaches mentioned earlier.
9. 23 February Freek Verbeek Cross-layer invariants for Network-on-Chips abstract, slides

Cross-layer invariants for Network-on-Chips
Freek Verbeek

One of the major challenges in the design and verification of manycore systems is cache coherency. In bus-based architectures, this is a well-studied problem. When replacing the bus by a communication network, however, new problems arise. Cross-layer deadlocks can occur even when the protocol and the network are both deadlock-free when considered in isolation. To counter this problem, we propose a methodology for deriving cross-layer invariants for Network-on-Chips. These invariants relate the state of the protocols run by the cores to the state of the communication network. We show how they can be used to prove the absence of cross-layer deadlocks. Our approach is generally applicable and shows promising scalability.
8. 26 January Harald Vranken About a recently submitted research proposal
 

2015

7. 27 October Bernard van Gastel Estimating the energy consumption of software by means of a type system abstract, slides

Estimating the energy consumption of software by means of a type system
Bernard van Gastel

Energy consumption plays an increasingly decisive role in ICT. In many application areas it is important to predict energy consumption and to be able to optimize for it. Such methods must be practical: they should give fast feedback and be applicable to large programs and libraries. Because energy consumption strongly depends on the hardware used, it must be possible to configure and swap the targeted hardware quickly. In this context we propose a type system that can be used statically to derive energy consumption compositionally. This type system works on a small theoretical programming language in which hardware interactions are made explicit.
6. 29 Sept
10.00-11.00, Eindhoven
Sung-Shik Jongmans Protocol Programming with Automata: (Why and) How? abstract, slides

Protocol Programming with Automata: (Why and) How?
Sung-Shik Jongmans

With the advent of multicore processors, exploiting concurrency has become essential in writing scalable programs on commodity hardware. One important aspect of exploiting concurrency is programming protocols. In this talk, I explain my recent work on programming protocols based on a theory of automata.
5. 25 August Sylvia Stuurman Setup for an experiment on teaching refactoring slides
4. 30 June Jeroen Keiren State Space Explosion: Facing the Challenge abstract, slides

State Space Explosion: Facing the Challenge
Jeroen Keiren

When processes run in parallel, the total number of states in a system grows exponentially; this is typically referred to as the state space explosion problem. The holy grail in model checking is finding a way in which we can effectively cope with this state space explosion. In this talk I discuss a line of research in which we use an equational fixed point logic and parity games as a basis for verification. I will focus on the latter, and discuss ideas that can be used to reduce these parity games. For those wondering whether this general theory has any applications at all I discuss industrial applications of this technology. If you want to know what these applications are, come listen to my talk!
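The exponential growth mentioned above is easy to see in a toy sketch (illustrative, not from the talk): under interleaving parallel composition, the global state space is the Cartesian product of the local state spaces.

```python
from itertools import product

def global_states(local_state_sets):
    """Interleaving parallel composition: a global state picks one
    local state per process, so the global state space is the
    Cartesian product of the local state spaces."""
    return list(product(*local_state_sets))

# Two two-state processes already give 2 * 2 = 4 global states:
print(len(global_states([("idle", "busy")] * 2)))  # 4

# The count grows exponentially with the number of processes:
for n in (5, 10, 20, 30):
    print(n, 2 ** n)  # global states of n parallel two-state processes
```

Thirty two-state processes already yield over a billion global states, which is why model checkers rely on reductions (such as the parity game reductions discussed in the talk) rather than naive enumeration.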
3. 26 May Arjen Hommersom Probabilistic Models for Understanding Health Care Data abstract, slides

Probabilistic Models for Understanding Health Care Data
Arjen Hommersom

In the first part of this talk, I will give an overview of my recent research in using probabilistic models for health care applications. In the second part, I will zoom into recent results from the NWO CAREFUL project, where we attempt to reconcile probabilistic automata learning with Bayesian network modelling. In particular, the aim is to identify the probabilistic structure of states within a probabilistic automaton. We propose a new expectation-maximisation algorithm that exploits the state-based structure of these models. This model is currently being applied to learn models from psychiatry data to provide novel insights into the treatment of psychotic depressions. Preliminary results will be discussed.
2. 28 April Hugo Jonker FP-Block: usable web privacy by controlling browser fingerprinting abstract, slides

FP-Block: usable web privacy by controlling browser fingerprinting
Hugo Jonker

Online tracking of users is used for benign goals, such as detecting fraudulent logins, but also to invade user privacy. We posit that for non-oppressed users, tracking within one website does not have a substantial negative impact on privacy, while it enables legitimate benefits. In contrast, cross-domain tracking negatively impacts user privacy, while being of little benefit to the user. Existing methods to counter tracking treat any and all tracking similarly: client-side storage is blocked, or all sites are fed random characteristics to hamper re-identification. We develop a tool, FP-Block, that counters cross-domain tracking while allowing intra-domain tracking. For each visited site, FP-Block generates a unique, consistent fingerprint: a "web identity". This web identity is then used for any interaction with that site, including for third-party embedded content. This ensures that ubiquitously embedded parties will see different, unrelatable fingerprints from different sites, and can thus no longer track the user across the web.
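A toy sketch of the per-site "web identity" idea (an illustrative assumption about the mechanism, not FP-Block's actual implementation): derive fingerprint attributes deterministically from a local secret and the visited site, so each site sees a consistent but site-specific fingerprint.

```python
import hashlib

# Tiny illustrative attribute pools; a real tool would draw from a
# much larger space of fingerprintable characteristics.
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64)",
    "Mozilla/5.0 (X11; Linux x86_64)",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 13_0)",
]
SCREENS = ["1920x1080", "1366x768", "2560x1440"]

def web_identity(secret, site):
    """Derive a deterministic per-site fingerprint: the same site
    always sees the same attributes, but two sites cannot correlate
    their views through the fingerprint alone."""
    digest = hashlib.sha256((secret + "|" + site).encode()).digest()
    return {
        "user_agent": USER_AGENTS[digest[0] % len(USER_AGENTS)],
        "screen": SCREENS[digest[1] % len(SCREENS)],
    }

secret = "local-per-user-secret"
# Consistent within a site (intra-domain tracking still works)...
assert web_identity(secret, "shop.example") == web_identity(secret, "shop.example")
# ...while different sites generally receive different identities.
print(web_identity(secret, "shop.example"))
print(web_identity(secret, "news.example"))
```

With only a handful of attribute values, two sites can of course still collide by chance; the point of the sketch is the derivation pattern: per-site determinism from a secret that never leaves the client.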
1. 31 March Christoph Bockisch A design method for modular energy-aware software abstract, slides

A design method for modular energy-aware software
Christoph Bockisch

Nowadays, achieving green software by reducing the overall energy consumption of a system is becoming more and more important. A well-known solution is to make the software energy-aware by extending its functionality with energy optimizers, which monitor the energy consumption of the software and adapt it accordingly. Modular design of energy-aware software is necessary to make the extensions manageable and to cope with the complexity of the software. To this end, we propose a model of energy utilization -- or, more generally, resource utilization -- and an accompanying design method for components in energy-intensive systems. Optimizer components can analyze such Resource Utilization Models (RUMs) of other components and adapt their behavior accordingly. We show how compact RUMs can be extracted from existing implementations. Using the Counterexample-Guided Abstraction Refinement (CEGAR) approach, along with model-checking tools, abstract models can be generated that help establish key properties relating to energy consumption. The approach is illustrated by the concrete example of a network manager sub-system.

 
