13 December 2023

GPT-NL: Secure and ethical AI to strengthen Dutch society

The Netherlands is to develop its own open language model: GPT-NL. Non-profit parties TNO, NFI and SURF are jointly developing the model to take an important step towards transparent, fair and verifiable use of AI in line with Dutch and European values and guidelines and with respect for data ownership.
The consortium, facilitated by the Netherlands AI Coalition (NL AIC) and Security Delta (HSD), recently received funding of EUR 13.5 million from the Ministry of Economic Affairs and Climate Policy/RVO to implement this project.
 
With the launch of ChatGPT in 2022, the power of AI and Large Language Models (LLMs) became clear to the general public for the first time. Many discovered the benefits of the technology, but several issues regarding companies like OpenAI and the technology behind their solutions call for care. For example, they are not transparent about the algorithms and datasets used, making it impossible to monitor them or hold them accountable for possible unethical or harmful results. It is also unclear what happens to the information we enter into the model and who has access to it, so we cannot assume that our privacy is respected.
 
Moreover, the quality of output depends not only on the quality of the datasets on which a model is trained, but also on the amount of data. This is a problem for languages like Dutch, which is spoken by about 22 million people worldwide. Most, if not all, LLMs are trained on datasets that contain very little Dutch data, which affects the quality of Dutch output. What the Netherlands does have is a strong research and knowledge base in AI on which to build, an excellent network structure with relevant public, private and academic partners and a solid digital infrastructure. In addition, there is a growing need for a strong Dutch-language LLM that complies with Dutch and European privacy and ethics regulations, is transparent about the algorithms and datasets used, and adheres to Dutch cultural norms. This led to the GPT-NL project.

Limitations of current language models
 
The Netherlands Forensic Institute (NFI), the initiator of the project, has a long history of using LLMs. It uses these models for various purposes, such as analysing large amounts of data for evidence of criminal activity. "Language models have been indispensable in investigation work for years," says Erwin van Eijk, head of the Digital and Biometric Traces Department at the NFI. "It is impossible for humans to analyse the huge amounts of data within the limited time frame our work requires. Moreover, AI is used to protect investigators from unnecessary exposure to traumatising content. But our language models have limitations because we do not have sufficient resources to develop more comprehensive technology, which is especially needed as messaging in criminal circuits becomes increasingly cryptic. However, we do have a solid base of available data, algorithms, expertise and experience that we can build on for the GPT-NL project," he continues.

Connecting the AI ecosystem
 
Using language models like ChatGPT is practically impossible for the NFI: the results are used in criminal cases, so the models need to be transparent in their operation and comply with legal requirements. But concerns about existing LLMs apply to a much wider range of organisations and applications. Erwin therefore sees the potential for many organisations in the Netherlands, from the public, private and academic sectors, to benefit from an expanded Dutch language model.

"To access the resources needed for this project, we had to join forces with other organisations and define a common goal," he says. "Security Delta (HSD), the Dutch security cluster, and the Netherlands AI Coalition (NL AIC) saw the urgency and potential of a Dutch AI language model from the beginning. They have excellent connections and helped get the relevant organisations on board to make this project a reality."

Snellius: The Dutch National Supercomputer
 
LLMs require very high computing power and sophisticated hardware infrastructure. "As a security cluster, we knew the perfect partner to facilitate that infrastructure," says Joris den Bruinen, head of Security Delta (HSD) and the NL AIC's Security, Peace and Law working group. "In SURF, educational institutions and research institutes join forces to develop and procure digital services. It is a public organisation built around the need for shared access to digital infrastructure and research data. SURF has the Dutch National Supercomputer Snellius on the one hand, and on the other the trust needed to find a wide range of partners willing to share their datasets on the platform," says Joris.

How Dutch society will benefit
 
GPT-NL offers numerous potential benefits for Dutch society. "As Erwin mentioned, there are a large number of potential applications for GPT-NL. To be clear, the project does not involve developing models for specific applications; it focuses on building the structural foundation on which an infinite number of customised models can be built," says Saskia Lensink, NLP specialist at TNO. "Multiple government organisations can benefit from GPT-NL, if only to align their communication with the language used by their citizens," adds Joris den Bruinen. "The language model developed for the GPT-NL project will be made available under a licensing structure, with different rates for academic, non-commercial and commercial use," says Joris. "This allows companies, including start-ups, to develop commercial applications on top of it. This ensures sovereignty in Dutch products and services, resulting in economic added value," he continues.
 
Some examples can be found in healthcare, where such a model could support medical professionals by, for example, summarising transcripts of conversations with patients, which requires the data to be stored securely according to European privacy laws. In education, we see that current AI models offer an American context and American values in their solutions, something we may not want for our children. While the current models may suffice for now, when GPT-NL becomes available, it may offer a valuable alternative in this segment. "We cannot really predict this, but we have seen with ChatGPT the power of AI and how it can elicit a wide variety of commercial and public applications," Joris concludes.

More information?
 
The full article is published on the NL AIC website (in Dutch).


© NL AIC