Natural Language Processing Platform The Árni Magnússon Institute for Icelandic Studies

PoS-tag and Lemmatize

Tagger: ABLTagger

A Part-of-Speech (PoS) tagger reads in text and tags each token with a text string that indicates the word's part of speech and other grammatical features, such as case, grammatical gender and tense.

The PoS tagger used here is ABLTagger 3.0. It is maintained by the Language and Voice Technology Lab at the University of Reykjavík (CADIA-LVL). It builds on the Bi-LSTM tagger ABLTagger 1.0, which was originally developed by Steinþór Steingrímsson, Örvar Kárason and Hrafn Loftsson in the spring of 2019.

Lemmatizer: Nefnir

On this platform, the PoS-tagged text is passed to a lemmatizer, which records the base form (lemma) of each word (e.g. hestur for hests).

Word lemmas are retrieved using Nefnir, developed by Jón Friðrik Daðason.
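
The two-step flow described above (tag first, then lemmatize using the tag) can be illustrated with a small sketch. This is not the ABLTagger or Nefnir API: the function names, the toy lexicons and the readable tag strings such as "noun-masc-sg-gen" are all hypothetical, chosen only to show how a lemmatizer can rely on the tag produced in the previous step.

```python
from typing import Dict, List, Tuple

# Hypothetical toy lexicon: word form -> illustrative tag string.
# Real taggers predict tags with a trained model and use their own tagset.
TOY_TAGS: Dict[str, str] = {
    "hests": "noun-masc-sg-gen",
    "hestur": "noun-masc-sg-nom",
}

# Hypothetical toy lexicon: (word form, tag) -> lemma.
TOY_LEMMAS: Dict[Tuple[str, str], str] = {
    ("hests", "noun-masc-sg-gen"): "hestur",
    ("hestur", "noun-masc-sg-nom"): "hestur",
}


def toy_tag(tokens: List[str]) -> List[Tuple[str, str]]:
    """Step 1: attach a tag to every token ("unknown" if not in the toy lexicon)."""
    return [(tok, TOY_TAGS.get(tok.lower(), "unknown")) for tok in tokens]


def toy_lemmatize(tagged: List[Tuple[str, str]]) -> List[Tuple[str, str, str]]:
    """Step 2: look up the lemma from the (form, tag) pair; fall back to the form."""
    return [
        (form, tag, TOY_LEMMAS.get((form.lower(), tag), form))
        for form, tag in tagged
    ]


def pipeline(text: str) -> List[Tuple[str, str, str]]:
    """Whitespace-tokenize, tag, then lemmatize."""
    return toy_lemmatize(toy_tag(text.split()))


print(pipeline("hests"))  # [('hests', 'noun-masc-sg-gen', 'hestur')]
```

The point of the sketch is the ordering: the lemmatizer receives the tag along with the word form, so the same surface form could in principle map to different lemmas depending on its tag.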
