Source: duarteocarmo.com

AMÁLIA: Advancing European Portuguese LLMs

AMÁLIA is an open-source LLM for European Portuguese, focusing on data utilization and benchmarking in NLP.

Summary

AMÁLIA is a large language model (LLM) developed for European Portuguese, backed by a significant investment from the Portuguese government. The initiative aims to improve the representation of European Portuguese in natural language processing.

Key features:

  • Open Source - AMÁLIA is designed to be fully open source, although not all components are publicly accessible yet.
  • Data Utilization - The model focuses on using a substantial amount of European Portuguese data, primarily sourced from Arquivo.pt.
  • Benchmarking - The team has created four new benchmarks specifically for evaluating the model's performance in European Portuguese.
  • Collaboration - AMÁLIA is the result of a collaboration between several prestigious Portuguese universities and research labs.

Despite these promising features, the article raises concerns about how much European Portuguese data was used in training: only 5.5% of the training tokens are clearly identified as European Portuguese. It discusses the implications of this and the importance of transparency in the model's development.
