Croatian Language Repository: Documentation


Dunja Brozović Rončević and Damir Ćavar (19 June 2006) Das Korpus der kroatischen Sprache: Hrvatska jezična mrežna riznica, University of Graz, Austria.
Ivo-Pavao Jazbec and Tomislav Stojanov (16 September 2006) The Croatian Language Corpus, Conference on empirical and computational linguistics 2006 (CECL 2006), University of Zadar.

The Croatian Language Repository is a project at the Institute of Croatian Language and Linguistics that has been funded by the Ministry of Science, Education, and Sports (project no. 0212010) since May 2005. The basic task of the project is to create publicly available resources for the Croatian language and to provide crucial information about the Croatian language standard, as well as other information related to the Croatian language. Within the project, several corpora covering different phases in the development of the Croatian language are being created, including digitized manuscripts of Croatian dictionaries. We collect and digitize texts for a representative corpus of the Croatian standard language, which serves as the basis not only for compiling the Comprehensive Croatian Dictionary but also for other linguistic research.

The Croatian Language Corpus is assembled from selected Croatian-language texts covering various functional domains and genres. It includes literature and other written sources from the period in which the final shaping of the Croatian standard language began, i.e. from the second half of the 19th century onward.

The Croatian Language Corpus consists of:

• fundamental Croatian literature (e.g. novels, short stories, drama, poetry)
• non-fiction
• scientific publications from various domains and university textbooks
• school books
• translated literature by outstanding Croatian translators
• online journals and newspapers
• books from the pre-standardization period of the Croatian language that have been adapted to present-day standard Croatian

In cooperation with:
Školska knjiga d.d.
Croatian Academy of Sciences and Arts (HAZU)
Stoljeća hrvatske književnosti, Matica hrvatska