phone: +31 (0)45 576 2143
email: hugo.jonker@ou.nl
www: http://www.open.ou.nl/hjo
twitter: @hugojonker
For students with a background in security and/or formal methods, I am happy to supervise thesis projects on security and privacy, under the broad theme "Security and Privacy in Modern Times". Concrete projects will align with my current research; the list below gives an idea of the type of subjects I am happy to supervise.
Projects can be tailored to MSc as well as BSc students. Other topics exist and can be discussed - contact me if you're interested.
Note. In general, I expect the thesis to be written in English, and to provide a solid basis for a (later to be written) publication. See also the page about my approach to supervision.
In general, I'm interested in research that identifies the "bad guys", research that helps to identify security or privacy weaknesses (or the impact of such weaknesses), and research to mitigate security and privacy issues. More concretely, here is an incomplete list of project categories I'm happy to supervise. If you're looking for a project in one of these categories, do contact me.
This is a (partially outdated) list of project ideas. Contact me to discuss specific subjects of interest to you.
In Vincent van der Meer's study into contemporary patterns of
file fragmentation [DFRWS-APAC20], we
found that about a third of fragmented files are fragmented out
of order, i.e., at least one fragment is stored on disk after the
fragment that follows it in the file. This has so far not been
addressed in file carvers.
Vincent and Jeroen van den Bos have created a modern file
carving framework that can support carving of out-of-order
fragmented files. This carver makes use of an exceptionally
strong and exhaustively tested JPEG validator [DFRWS-EU24] to identify fragmentation
points. The goal of this project is to use this framework to
design and implement carving strategies, to benchmark their
recovery performance (how much is recovered, and how fast), and,
optionally, to compare their recovery performance to that of
other file carvers.
Skills used: Java programming.
Co-supervisor: Vincent van der Meer.
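To make the notion of out-of-order fragmentation concrete (this is only an illustration; the actual carving framework is written in Java), a file is out of order if some fragment is stored on disk after the fragment that follows it in the file:

```python
def is_out_of_order(fragment_offsets):
    """Given the on-disk start offsets of a file's fragments, listed in
    file order, report whether at least one fragment is stored on disk
    after the fragment that follows it in the file."""
    return any(prev > nxt for prev, nxt in zip(fragment_offsets, fragment_offsets[1:]))

# Example: the third fragment is stored on disk before the second one.
print(is_out_of_order([4096, 65536, 32768]))  # True
```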
Many apps created specifically for children are free. They
generate income for their creators by displaying advertisements.
There are strict rules for advertising to children (the Children's
Online Privacy Protection Act in the USA, the Digital Services Act
in the EU). Both prohibit tracking children online, which
should rule out behavioural advertising. The core idea of this
project is to use real-world devices, and automate interaction
(e.g., using the method developed by Meesters in his thesis) to
check on what basis advertisements in children's apps are shown.
Skills used: Appium, programming.
Co-supervisor: Fabian van den Broek (OU).
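As a rough sketch of what such automation could look like, the snippet below uses the Appium Python client to tap its way through a hypothetical children's app and list the WebViews that commonly host advertisements. The package name, launch activity, and tap strategy are placeholders; the actual interaction would follow Meesters' method, and relating the ads shown to any transmitted identifiers would additionally require capturing network traffic (e.g., via an intercepting proxy).

```python
import time

from appium import webdriver
from appium.options.android import UiAutomator2Options
from appium.webdriver.common.appiumby import AppiumBy

# Capabilities for a real test device; the package name and activity are placeholders.
options = UiAutomator2Options()
options.platform_name = "Android"
options.app_package = "com.example.kidsgame"   # hypothetical app under test
options.app_activity = ".MainActivity"         # hypothetical launch activity
options.no_reset = True

# Default Appium 2 server endpoint.
driver = webdriver.Remote("http://127.0.0.1:4723", options=options)
try:
    size = driver.get_window_size()
    for _ in range(20):
        # Crude stand-in for proper automated interaction: tap the centre of the screen.
        driver.tap([(size["width"] // 2, size["height"] // 2)])
        time.sleep(2)
        # WebViews commonly host the advertisements shown in free apps.
        for view in driver.find_elements(AppiumBy.CLASS_NAME, "android.webkit.WebView"):
            print("possible ad container at", view.rect)
finally:
    driver.quit()
```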
In federated learning, each participant trains a model locally, and
the locally trained models are then aggregated into a global model. This
allows for privacy in training ML models. However, it also allows a
malicious agent to submit an incorrectly trained model. The goal of this
project is to measure the reliability of submitted model parameters.
Skills used: machine learning, deep learning.
Co-supervisor: Mina Alishahi (OU).
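A minimal sketch of the setting, assuming plain federated averaging over flat parameter vectors: one client submits a deliberately skewed update, and a simple distance-to-median heuristic (with an arbitrary threshold) flags it. Measuring reliability properly is the actual research question; this only illustrates the shape of the problem.

```python
import numpy as np

def federated_average(updates):
    """Plain federated averaging of client parameter vectors."""
    return np.mean(np.stack(updates), axis=0)

def flag_suspect_updates(updates, threshold=3.0):
    """Flag clients whose update lies unusually far from the element-wise median.
    The threshold (in median absolute deviations) is purely illustrative."""
    stacked = np.stack(updates)
    median = np.median(stacked, axis=0)
    distances = np.linalg.norm(stacked - median, axis=1)
    mad = np.median(np.abs(distances - np.median(distances))) + 1e-12
    return [i for i, d in enumerate(distances)
            if (d - np.median(distances)) / mad > threshold]

rng = np.random.default_rng(0)
honest = [rng.normal(0.0, 0.1, size=10) for _ in range(9)]
poisoned = [rng.normal(5.0, 0.1, size=10)]   # one deliberately bad update
updates = honest + poisoned

print("suspect clients:", flag_suspect_updates(updates))   # likely [9]
print("global model:", federated_average(updates))
```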
Online social reputation, gained via reviews, has become an
important factor for companies. However, malicious actors can
manipulate this reputation. For example, a seller of goods could
gather positive reviews and then perform a "bait-and-switch":
update the product description and photo, and sell something
completely different.
The goal of this project is
to detect manipulation of social reputation by matching
reviews to the reviewed item / service.
Skills used: NLP, web scraping.
Co-supervisor: Mina Alishahi (OU).
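One possible signal, sketched below with made-up texts: compare each review to the current product description using TF-IDF cosine similarity, and flag reviews that do not match the item they are attached to. The threshold is illustrative; the project would need to go well beyond this.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

description = "Quadcopter camera drone with 4K video, GPS and 30 minutes of flight time."
reviews = [
    "Great drone, stable in wind and the 4K camera footage looks amazing.",
    "Lovely honey, very tasty on toast and the jar arrived well packed.",  # stale review?
]

vectoriser = TfidfVectorizer(stop_words="english")
matrix = vectoriser.fit_transform([description] + reviews)
similarities = cosine_similarity(matrix[0], matrix[1:]).flatten()

for review, score in zip(reviews, similarities):
    flag = "MISMATCH?" if score < 0.1 else "ok"   # threshold is illustrative
    print(f"{score:.2f} {flag}  {review[:50]}")
```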
Online abuse has become a rampant phenomenon. In this project,
we focus on collecting data from social media that allows
analysis of the origins of online abuse -- either the development of
victims, or the development of abusers. How does abuse start? When
does abuse cross a line? Do victims retaliate? Do abusers become
increasingly abusive? How much of this can be automatically
identified at a large scale?
Skills used: scraping, text analysis, security
assessment, predictive modelling.
Co-supervisor: Clara Maathuis (OU).
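A toy illustration of the kind of longitudinal signal involved: score the messages of one account over time and look for escalation. A real study would use a trained abuse/toxicity classifier rather than the tiny placeholder lexicon used here, and the data would come from actual scraping.

```python
from datetime import datetime

# Tiny placeholder lexicon; a real study would use a trained abuse classifier.
ABUSIVE_TERMS = {"idiot", "loser", "worthless"}

def abuse_score(message):
    words = message.lower().split()
    return sum(word.strip(".,!?") in ABUSIVE_TERMS for word in words) / max(len(words), 1)

# Hypothetical scraped timeline for one account: (timestamp, message).
timeline = [
    (datetime(2024, 1, 1), "Interesting thread, thanks for sharing."),
    (datetime(2024, 2, 1), "You clearly have no idea what you are talking about."),
    (datetime(2024, 3, 1), "Shut up, you worthless idiot."),
]

for ts, msg in timeline:
    print(ts.date(), round(abuse_score(msg), 2), msg)
# A rising score over time would be one (crude) signal of escalating behaviour.
```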
Scientific fraud is an increasing problem. Several classes of
fraud (such as plagiarism) can be automatically detected. However,
such detection methods focus on the results of one specific type
of fraud, rather than on the underlying incentives behind fraud.
In this project, we build upon previous work that developed
methods to identify outliers in publication metrics. This project
focuses on "secondary" or derived publication metrics, such as
number of co-authors, average number of papers with co-authors,
etc. The goal is to identify which of such "derived" or secondary
publication metrics are useful as indicators for scientific
fraud.
Skills used in project: basics of set theory, basic Python
programming.
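A small sketch of the idea, using made-up publication records: compute derived metrics per author and flag authors whose values deviate strongly from the rest. The metrics, data, and threshold are purely illustrative.

```python
import statistics

# Hypothetical publication records: author -> list of co-author sets, one per paper.
records = {
    "author_a": [{"x", "y"}, {"y"}, {"x", "z"}],
    "author_b": [{"p"}, {"q"}, {"p", "q"}, {"r"}],
    "author_c": [{"m%d" % i for i in range(25)} for _ in range(40)],  # suspicious pattern?
}

def derived_metrics(papers):
    co_author_counts = [len(s) for s in papers]
    return {
        "papers": len(papers),
        "mean_co_authors": statistics.mean(co_author_counts),
        "distinct_co_authors": len(set().union(*papers)),
    }

metrics = {name: derived_metrics(papers) for name, papers in records.items()}

# Flag authors whose mean co-author count deviates strongly from the rest (z-score).
means = [m["mean_co_authors"] for m in metrics.values()]
mu, sigma = statistics.mean(means), statistics.pstdev(means) or 1.0
for name, m in metrics.items():
    z = (m["mean_co_authors"] - mu) / sigma
    print(name, m, "z=%.1f" % z, "OUTLIER?" if abs(z) > 1.0 else "")  # threshold illustrative
```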
Online marketplaces where small companies
and individuals can sell goods have become commonplace.
Examples of such marketplaces include those for traditional
goods (Amazon.com, bol.com, etc.), but also marketplaces for
digital goods such as e-books.
Typically, such marketplaces offer some quality controls, such as
reviews, to help users gauge sellers. However, these
controls may be gamed. On Amazon, it is possible to swap out a
specific item (e.g., a jar of honey) for a much more expensive
item (e.g., a drone) while keeping the reviews and user
ratings(!). In e-books, a boom in spammy books has occurred
(see the "fighting ebook spam" project below).
The goal of this project is to semi-formalise this problem and to
design and develop a way towards automated detection of such
bait-and-switch shenanigans.
Skills used: programming, web scraping.
Possible directions: formalisation, machine learning,
theoretical generic approach, practical market-specific solution.
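A minimal, market-agnostic sketch of one practical direction: compare snapshots of the same listing over time and flag large changes to the description while the review history is retained. The data and thresholds below are made up.

```python
from difflib import SequenceMatcher

# Hypothetical snapshots of the same listing (same product ID, same review history).
snapshots = [
    {"date": "2024-01-10", "title": "Organic wildflower honey, 450g jar", "reviews": 312},
    {"date": "2024-06-02", "title": "4K camera drone with GPS and carry case", "reviews": 317},
]

def listing_similarity(a, b):
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

for before, after in zip(snapshots, snapshots[1:]):
    sim = listing_similarity(before["title"], after["title"])
    kept_reviews = after["reviews"] >= before["reviews"]
    if sim < 0.4 and kept_reviews:   # both thresholds purely illustrative
        print(f"{after['date']}: possible bait-and-switch (title similarity {sim:.2f}, "
              f"reviews kept: {before['reviews']} -> {after['reviews']})")
```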
Web bots (scrapers) automatically traverse
the internet to gather data from and measure aspects of web
sites. Web bots may be used for benign as well as nefarious
purposes. To combat nefarious bots, web sites sometimes employ
bot detection methods. Unfortunately, anti-bot measures affect the
reliability of studies performed using web bots. Thanks to
preliminary work, some lower bounds on the prevalence of web bot
detection are known. However, previous work uses two orthogonal
approaches to identify bot detection. As such, a comprehensive
picture is missing. The goal of this project is to construct
a classifier, train it to recognise bot detection, and determine
how often web bot detection is used on the internet. The input
for the recognition is to come from two orthogonal approaches:
fingerprint-surface based detection of web bots and behavioural
detection of web bots.
Skills used: web scraping, machine learning.
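A sketch of the intended pipeline on purely synthetic data: features from the two detection approaches are concatenated into one feature vector per site and fed to an off-the-shelf classifier. In the actual project, the features and labels would come from the fingerprint-surface and behavioural measurements mentioned above.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Hypothetical feature matrix per visited site: the first columns stand in for
# fingerprint-surface signals (e.g. scripts probing navigator.webdriver), the
# remaining columns for behavioural signals. Labels here are synthetic.
rng = np.random.default_rng(42)
X = rng.random((500, 8))
y = (X[:, 0] + X[:, 5] > 1.0).astype(int)   # synthetic label: "site uses bot detection"

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)
clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_train, y_train)
print("held-out accuracy:", clf.score(X_test, y_test))
```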
Privacy is a hot research topic. Many
papers analyse the privacy of systems. To do so, they have to specify
what privacy (in their specific case) actually is. This has led to
a handful of different formalisations of privacy. The goal of the
project is to establish a formal framework in which at least three
approaches to privacy definitions (observational equivalence,
unlinkability and quantified privacy) can be formalised and
compared. Are they equivalent? If not, in which cases do they
differ? Etc.
Expected prior knowledge: basic formal modelling, basic formal
analysis (trace equivalences, observational equivalences).
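To give a flavour of the three styles of definition (notation is illustrative, not taken from any specific paper):

```latex
% Observational equivalence: the system running with Alice's data in the
% role x is indistinguishable from the system running with Bob's data.
\[ S\{a/x\} \;\approx_{\mathit{obs}}\; S\{b/x\} \]

% Unlinkability: an observer cannot tell whether two sessions were run by
% the same user or by two different users.
\[ S(u_1) \mid S(u_1) \;\approx_{\mathit{obs}}\; S(u_1) \mid S(u_2) \]

% Quantified privacy, e.g. differential privacy: for neighbouring inputs
% D_1, D_2 and every set of outputs O,
\[ \Pr[M(D_1) \in O] \;\le\; e^{\varepsilon}\,\Pr[M(D_2) \in O]. \]
```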
A phone can read out its own vibration using its
accelerometer. This vibration pattern is unique to each phone and
cannot be imitated: a physically unclonable function, or PUF.
However, many apps can trigger the buzz function and read out
accelerometer values. The goal of this project is to develop an
app that allows for authentication using this PUF functionality
in a secure way.
Skills used in project: Android programming, security
analysis.
Co-supervisor: dr. Fabian van den Broek.
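At its core, the verification step could look like the sketch below: compare a freshly measured accelerometer trace against the enrolled one and accept above a similarity threshold. Everything here (correlation as the metric, the threshold, the synthetic traces) is illustrative; a secure design also has to deal with alignment, noise, and replay.

```python
import numpy as np

def similarity(enrolled, fresh):
    """Pearson correlation between an enrolled accelerometer trace and a fresh one.
    Real traces would first need alignment and filtering."""
    enrolled = (enrolled - enrolled.mean()) / enrolled.std()
    fresh = (fresh - fresh.mean()) / fresh.std()
    return float(np.mean(enrolled * fresh))

rng = np.random.default_rng(1)
device_signature = rng.normal(size=2048)   # stand-in for the phone's vibration response
same_device = device_signature + rng.normal(scale=0.2, size=2048)
other_device = rng.normal(size=2048)

THRESHOLD = 0.8   # acceptance threshold is illustrative
print("same device: ", similarity(device_signature, same_device) > THRESHOLD)
print("other device:", similarity(device_signature, other_device) > THRESHOLD)
```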
This project builds upon the work of the
Pwitter project in developing a privacy layer for Twitter. In this
project, the concept is extended to a more generic framework,
beyond the simple structure of Twitter (where only a follow
relation exists). You will build a layer on top of an existing,
complex social network that enables a user to privately
communicate over the social network, while retaining privacy
against other users and the social network. The specific privacy
guarantees enabled by your layer will be formally analysed.
Skills used in this project: browser plugin programming,
formal security analysis.
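The core mechanism, stripped of the browser-plugin and formal-analysis parts, is that posts are encrypted client-side so the social network only ever stores ciphertext. A minimal sketch using symmetric encryption (key distribution is assumed to happen out of band):

```python
from cryptography.fernet import Fernet

group_key = Fernet.generate_key()   # shared out of band with authorised friends
layer = Fernet(group_key)

post = "Meeting at the usual place at 20:00."
ciphertext = layer.encrypt(post.encode())   # this is what would be posted publicly

# An authorised friend's client decrypts what it fetches from the network.
print(layer.decrypt(ciphertext).decode())
```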
Spam is not only an email problem. There are
ebooks that are copy-pasted or flung together quickly, with
no regard for quality of content, only to provide a revenue
stream for their authors. There are various schemes related to
ebook spam. In one scheme, the books themselves
provide the revenue. This type of scheme relies on selling many
books to turn a profit, and thus is more likely to use fraudulent
means to promote the book (fake reviews, etc.). Another type of
scheme relies on Amazon's Kindle Direct Publishing
programme, in which Amazon pays out money depending on the number
of pages read.
The goal of this project is to investigate the current state of
ebook spam schemes, and devise countermeasures.
Skills used in the project: web scraping, programming.
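One simple detection signal for copy-pasted books, sketched with made-up excerpts: near-duplicate detection via word shingles and Jaccard similarity. In practice, the texts would come from scraped samples or full books, and this would be only one signal among several.

```python
def shingles(text, k=5):
    """Set of overlapping word k-grams ("shingles") of a text."""
    words = text.lower().split()
    return {" ".join(words[i:i + k]) for i in range(max(len(words) - k + 1, 1))}

def jaccard(a, b):
    return len(a & b) / len(a | b)

# Hypothetical ebook excerpts.
original = "the quick brown fox jumps over the lazy dog near the quiet river bank at dawn"
suspected_copy = "the quick brown fox jumps over the lazy dog near the quiet river bank at night"
unrelated = "a practical introduction to sourdough baking with seasonal dutch ingredients"

print("copy vs original:     ", round(jaccard(shingles(original), shingles(suspected_copy)), 2))
print("unrelated vs original:", round(jaccard(shingles(original), shingles(unrelated)), 2))
```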