Polish Large Language Model (PLLuM) is a family of AI models created with the Polish language in mind, specifically to support "Polish AI" in the public and private sectors. It stands out from other language models in that it is adapted to the specifics of the Polish language and the terminology of the local public administration.
Its development process employs comprehensive data collection and quality assessment procedures. The model is built on ethically collected data and primarily uses organic data – hand-crafted by people rather than generated by other language models. Trained on Polish resources, it handles the challenges of inflection and complex syntax very well, generating precise content.
Large language models are one of the most spectacular developments in generative artificial intelligence. Thanks to projects like PLLuM, we are not only fulfilling tasks assigned to us by the government, but also rapidly learning the scientific and practical foundations of this technology.
– says dr hab. Piotr Pęzik, Associate Professor at the Faculty of Philology, University of Lodz, who leads the team of UniLodz scientists in the PLLuM project.
The development of PLLuM is an opportunity to strengthen Poland's competitiveness in the IT sector and the economy as a whole. Investments in artificial intelligence contribute to the emergence of new AI-based companies and products, fuelling economic growth. The use of PLLuM in various areas – from education and government to the private sector – promotes the creation of modern solutions that strengthen Poland's position among the leaders in AI development.
While co-creating this new version of PLLuM, we are focusing on leveraging extensive, legally acquired and responsibly developed language resources. These resources do not infringe on copyright and meet the highest quality standards. It is precisely the transparent data and expertise developed in Poland that foster technological sovereignty. If we want PLLuM to truly strengthen the Polish public and private sectors, we must have control over the data and principles upon which it is built.
– underlines dr hab. Piotr Pęzik, Associate Professor at the University of Lodz.
The development of PLLuM is coordinated and financed by the Ministry of Digital Affairs, which commissioned the project partly so that the model can serve as a virtual assistant in the mObywatel application, which supports Polish citizens in obtaining public information. Developing the model's linguistic competences is the responsibility of scientists, while IBM provides the underlying model and the technology.
Source: IBM, UniLodz
Edit: Press Office, University of Lodz