Student project ideas (science)

Dear student, if you are enrolled at the University of Amsterdam or the Amsterdam University College and would like to do a project/thesis/capstone with me, these are a few topics that I or members of my team are interested in. If you like one of them, or would like to propose something else, get in touch (please include your program, year and grades).

Participation in shared tasks is encouraged. Students from previous years have participated, for example, in challenges such as SemEval 2020 unsupervised lexical semantic change (task 1) and CLEF HIPE 2020 (Named Entity Recognition and Linking on multilingual historical corpora).

I am currently looking for students interested in participating in the SemEval 2021 NLPContributionGraph (task 11), focused on building knowledge bases from unstructured scientific publications.

AI for cultural heritage

A variety of topics under this general theme are of interest, as heritage collections have been widely digitized over recent years. Themes include the use of active learning to annotate and retrieve information (this is related to human-AI interaction or hybrid AI), the use of transfer learning (for example, language models or visual features), and the automated enrichment of collections via information extraction (for example, named entity recognition and linking).

Several of these topics are developed within the context of the Creative Amsterdam initiative (CREATE) at UvA, and in collaboration with the main Dutch heritage organizations.

To know more: Fiorucci et al., Machine Learning for Cultural Heritage: A Survey, 2020.

Computational art history

The availability of digitized (or born-digital) art collections allows scholars to use modern machine learning techniques to study artistic influences, styles and visual patterns recurring over time.

To know more: Shen et al., Discovering Visual Patterns in Art Collections with Spatially-consistent Feature Learning, 2019.

Crypto art market prediction

Crypto art is born-digital art tokenized and exchanged on blockchains, typically Ethereum. Crypto art is currently growing very rapidly, and is at the forefront of experimentation on many art-related questions. Many project ideas can focus on crypto art, and in particular on predicting art prices using a variety of signals (both for first sales and for re-sales). If you are into machine learning and finance, this project is for you.

To know more: Aubry et al., Biased Auctioneers, 2020.

The language of science

How do scientists use natural language to communicate? How does their use differ across disciplines (say, mathematics and philosophy)? This project aims at measuring the relative difference or similarity in the use of natural language across scientific communities and over time.
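As a purely illustrative starting point, one simple way to quantify such differences is to compare the word distributions of two corpora with the Jensen-Shannon divergence. The sketch below is an assumption about how a first baseline could look, not a prescribed method for the project: the function names, the whitespace tokenization and the toy abstracts are all hypothetical.

```python
import math
from collections import Counter


def word_distribution(texts):
    """Relative word frequencies over a list of texts (naive whitespace tokenization)."""
    counts = Counter(word for text in texts for word in text.lower().split())
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


def jensen_shannon_divergence(p, q):
    """Jensen-Shannon divergence (base 2) between two word distributions.

    Returns a value between 0 (identical distributions) and 1 (no shared words).
    """
    divergence = 0.0
    for word in set(p) | set(q):
        p_w = p.get(word, 0.0)
        q_w = q.get(word, 0.0)
        m_w = 0.5 * (p_w + q_w)  # mixture distribution
        if p_w > 0:
            divergence += 0.5 * p_w * math.log2(p_w / m_w)
        if q_w > 0:
            divergence += 0.5 * q_w * math.log2(q_w / m_w)
    return divergence


# Toy, made-up abstracts standing in for two disciplines (hypothetical data).
mathematics = ["we prove that the theorem holds for every compact manifold"]
philosophy = ["we argue that the concept of knowledge requires justification"]

score = jensen_shannon_divergence(
    word_distribution(mathematics), word_distribution(philosophy)
)
print(f"Jensen-Shannon divergence: {score:.3f}")
```

A real project would likely replace the whitespace tokenizer with proper preprocessing, use much larger corpora, and compute the score per time slice to follow change over time.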

To know more: Ramage et al., Mapping Three Decades of Intellectual Change in Academia, 2020.

Public understanding of science (case study on COVID-19)

Scientific results are increasingly communicated and debated publicly: they are mediatized. This happens via traditional media (e.g., newspapers), online media (e.g., Wikipedia), and social media (e.g., Twitter). Understanding how these debates happen and how scientific results are framed (and by whom) is very important to clarify how science is perceived and understood. Projects here can combine several data sources to further our understanding of this topic.

Work on COVID-19-related research is of particular interest.

AI for science (case study on COVID-19)

Under this general umbrella there are many exciting opportunities that consist in applying machine learning to improve scientific communication. Examples include the modelling of citations (citation intention, reliability), the extraction of structured information from scientific literature, the automatic summarization of scientific articles, and representation learning of experts and expertise.

Work on COVID-19-related research is of particular interest.