Transfer learning for historical collections

Transfer learning is a popular machine learning technique that makes it possible to apply (‘transfer’) a model trained on a resource-rich task or domain to another, related but resource-poor task or domain. Language models are a notable example: popular methods such as BERT have improved performance on a variety of tasks. Historical collections, both visual and textual, could benefit from transfer learning, yet applying it is not straightforward. For text, for example, challenges include linguistic change and poor OCR quality. This project aims to systematically assess the applicability of transfer learning across collections and languages, and to develop specific approaches to improve its use.

This project is a collaboration with Konstantin Todorov and is funded by the NWO.