AmsterdamAI

23 November 2023

Amsterdam AI Highlights Event

Amsterdam AI will host its annual Highlights Event in person on Thursday, November 23 at Pakhuis de Zwijger! The event will also feature highlights from our academic and industry partners and the winners of the fourth ADS Thesis Awards.

Get your ticket below!

Programme

16:15 Welcome
16:30 Introduction
16:40 Keynote Speaker
17:15 Q&A
17:25 Amsterdam AI Thesis Awards
17:40 Closing Remarks
17:45 Drinks

Piek Vossen

Title: Meaning in a cross-lingual language model

Large Language Models (LLMs) have two major components: the vocabulary and the attention model. The vocabulary is derived from the training data by tokenising the textual input and inferring which tokens are most frequent. Vocabularies are limited in size, which necessarily means that words that represent concepts are broken down into smaller parts. When that happens, a word loses meaning or gets misrepresented in the model. When the model is initialised, the meaning needs to be repaired using the context. For example, my name is not in the vocabulary of an English model and will be broken down into “pie” and “k”, and similarly the town where I live, “weesp”, is broken down into “we” and “esp”. Obviously, these tokens have obtained very different meanings, so the model will be off after initialisation with my name and my town. Still, models can recover that “pie”, “k”, “we” and “esp” actually represent the name of a person and a town in e.g. “My name is Piek and I live in Weesp”, but only because the context is clear. This raises the question of what happens in a cross-lingual language model in which up to a hundred languages need to share the vocabulary. These models can still transfer information very well from one language to another even though they share tokens with different meanings, such as the tokens “star” and “ster” in English and Dutch, or do not share any tokens at all, in the case of languages with unique scripts. How can these models share associations and attention, and how do they transfer conceptual relations or impose linguistic structures? In this talk, I will present some of our research to answer these questions, which also sheds light on the nature of these models.
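To illustrate the splitting described in the abstract, here is a minimal sketch of a greedy longest-match subword tokeniser. It is an assumption for illustration only: the toy vocabulary `TOY_VOCAB` and the function `tokenize` are hypothetical, and real models use learned vocabularies and may use a different segmentation algorithm.

```python
def tokenize(word, vocab):
    """Split a word into vocabulary tokens, always taking the longest match first."""
    word = word.lower()
    tokens = []
    start = 0
    while start < len(word):
        end = len(word)
        # Shrink the candidate substring until it is found in the vocabulary.
        while end > start and word[start:end] not in vocab:
            end -= 1
        if end == start:
            # No vocabulary entry covers this position: give up on the word.
            return ["[UNK]"]
        tokens.append(word[start:end])
        start = end
    return tokens


# Hypothetical toy vocabulary: "piek" and "weesp" are deliberately missing.
TOY_VOCAB = {"my", "name", "is", "and", "i", "live", "in", "pie", "k", "we", "esp"}

print(tokenize("Piek", TOY_VOCAB))   # ['pie', 'k']
print(tokenize("Weesp", TOY_VOCAB))  # ['we', 'esp']
```

With this toy vocabulary, the name and the town are each split into two unrelated pieces, while a word such as “name” stays whole.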

Bio: Prof. Dr. P.Th.J.M. (Piek) Vossen (1960) is a full Professor of Computational Lexicology at the Vrije Universiteit Amsterdam, Head of the Computational Lexicology & Terminology Lab (CLTL), co-founder and co-president of the Global WordNet Association (GWA), and Dean of Research of the Board of the Faculty of Humanities of the Vrije Universiteit Amsterdam. His research interests are WordNets, Computational Lexicon, Ontologies, Computational Linguistics, Language Technology and Computer Applications, both within a single language and from a multilingual perspective. Vossen is interested in the relation between lexicons and ontologies, both from a theoretical point of view and in terms of their usage in computer applications in which meaning and interpretation play a role. He sees the lexicon as a fundamental resource to anchor meaning and interpretation in useful computer behaviour. Computer behaviour can make use of communicative models and insights from communication science. The organisation of the lexicon and the knowledge stored in it need to take that usage as a starting point. He combines linguistics and computer science to model the understanding of natural language texts by computers.

In 2013 he won the prestigious Spinoza Award of the Netherlands Organisation for Scientific Research (NWO), and in 2015 he was honoured by the Dutch Royal House as a “Knight in the Order of the Dutch Lion”. He is also a member of the Koninklijke Nederlandse Akademie van Wetenschappen (KNAW) in three domains and a member of the Koninklijke Hollandsche Maatschappij der Wetenschappen (KHMW).

The Amsterdam AI Thesis Awards

The Amsterdam AI Thesis Awards aim to promote excellence in AI and Data Science among bachelor's and master's students at the Amsterdam AI university partners in Amsterdam (HvA, UvA and VU). The goals of the Amsterdam AI Thesis Awards are:

1. Reward and advocate for high-quality thesis work;
2. Encourage women and underrepresented minorities to further their education;
3. Encourage diversity in AI and Data Science;
4. Promote Amsterdam and the Amsterdam AI ecosystem as an innovation hub by highlighting excellent theses.

The winners will present their thesis work during the Highlights Event and receive their prize.
