NEGRA Corpus: 350,879 words (in 20,602 sentences) of
syntactically and morphologically annotated Frankfurter
Rundschau newspaper text (with STTS POS tags)
TIGER Corpus: 879,367 words (in 50,474 sentences) of
syntactically and morphologically annotated Frankfurter
Rundschau newspaper text (with STTS POS tags and lemmatisation)
Tiger2Dep: a tool written by Wolfgang Seeker that converts
TIGER trees into dependency parses
TüBa-D/Z Corpus: 1.3M words of syntactically annotated "taz"
newspaper articles (with STTS POS tags)
SALSA Corpus: semantic role annotations added on top of TIGER
deWaC Corpus: 1B words from the web, automatically POS-tagged
and lemmatised
SdeWaC Corpus: a subset of deWaC containing parsable sentences
(884,367,144 words in 45,400,446 sentences)