Parsing is the process of structuring a linear representation in accordance with a given grammar (Grune and Jacobs, 1990).



The basic idea of parsing evaluation is to measure the similarity between the tree structure generated by the parser (also called a labelled bracketing) and a manually constructed tree structure for the same sentence.
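As an illustration of this idea, the sketch below compares two labelled bracketings by counting the labelled spans they share and reporting precision, recall and F1. It is a simplified, PARSEVAL-style measure written for this page, not the implementation of any particular tool: the nested-tuple tree encoding and the function names `labelled_brackets` and `bracket_scores` are assumptions, and evaluation tools such as Evalb apply further normalisation that is omitted here.

```python
from collections import Counter

def labelled_brackets(tree, start=0):
    """Return (Counter of (label, start, end) spans, end position).

    Assumed encoding: a tree is a (label, child, child, ...) tuple
    and a leaf (a word) is a plain string.
    """
    label, *children = tree
    spans = Counter()
    pos = start
    for child in children:
        if isinstance(child, str):
            pos += 1  # a word covers exactly one position
        else:
            child_spans, pos = labelled_brackets(child, pos)
            spans += child_spans
    spans[(label, start, pos)] += 1
    return spans, pos

def bracket_scores(gold_tree, test_tree):
    """Labelled bracketing precision, recall and F1 of test against gold."""
    gold, _ = labelled_brackets(gold_tree)
    test, _ = labelled_brackets(test_tree)
    matched = sum((gold & test).values())  # multiset intersection
    precision = matched / sum(test.values())
    recall = matched / sum(gold.values())
    f1 = 2 * precision * recall / (precision + recall) if matched else 0.0
    return precision, recall, f1

# Toy example: the two trees differ only in where the PP attaches.
gold = ("S", ("NP", "I"),
        ("VP", "saw",
         ("NP", "the", "man", ("PP", "with", ("NP", "a", "telescope")))))
test = ("S", ("NP", "I"),
        ("VP", "saw",
         ("NP", "the", "man"),
         ("PP", "with", ("NP", "a", "telescope"))))

precision, recall, f1 = bracket_scores(gold, test)
print(precision, recall, f1)
```

In this toy example each tree has six labelled spans and five of them coincide, so a single attachment error costs one bracket on each side.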

Adequacy evaluation determines how fit a parsing system is for a particular task. Efficiency evaluation compares the parse time of a given parser with that of a reference parser on a common test data set.
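The efficiency comparison can be sketched as a simple timing ratio. The code below is only a minimal sketch under stated assumptions: a parser is modelled as any callable that takes a sentence, the two toy parsers are hypothetical stand-ins, and `total_parse_time` and `relative_parse_time` are names invented for this example. A real comparison would also need to control for hardware, warm-up and repeated runs.

```python
import time

def total_parse_time(parse, sentences):
    """Wall-clock seconds spent parsing every sentence once."""
    start = time.perf_counter()
    for sentence in sentences:
        parse(sentence)
    return time.perf_counter() - start

def relative_parse_time(parse, reference_parse, sentences):
    """Parse time of `parse` divided by that of `reference_parse`
    on the same test data; values below 1 mean `parse` was faster."""
    reference_time = total_parse_time(reference_parse, sentences)
    if reference_time == 0:
        raise ValueError("reference parse time too small to measure")
    return total_parse_time(parse, sentences) / reference_time

# Hypothetical stand-ins for real parsers (assumptions for illustration).
def toy_parser(sentence):
    return sentence.split()

def toy_reference_parser(sentence):
    return [word.lower() for word in sentence.split()]

test_set = ["The cat sat on the mat ."] * 1000
ratio = relative_parse_time(toy_parser, toy_reference_parser, test_set)
print(f"relative parse time: {ratio:.2f}")
```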



Carroll et al. (1998) distinguished between evaluation methods that are useful for guiding the development of a parsing system (intrinsic evaluation) and those that are appropriate for comparing different systems (comparative evaluation). They also divided parser evaluation methods into non-corpus-based and corpus-based methods:

- Intrinsic evaluation

  • Listing linguistic constructions covered (no corpus)
  • Grammatical coverage (unannotated corpus)
  • Average parse base (unannotated corpus)
  • Structural consistency (annotated corpus)
  • Best-first/Ranked consistency (annotated corpus)

- Comparative evaluation

  • Entropy/Perplexity (unannotated corpus)
  • Part-of-speech assignment accuracy (annotated corpus)
  • Tree similarity (annotated corpus)
  • Grammar Evaluation Interest Group (GEIG) scheme (annotated corpus)
  • Dependency structure-based scheme (annotated corpus)
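To make one of the listed measures concrete, the sketch below computes grammatical coverage on an unannotated corpus, taken here to mean the proportion of sentences that receive at least one analysis. This is a minimal sketch, not the procedure of any specific campaign: the `coverage` function, the toy lexicon and the toy parser are all assumptions introduced for illustration.

```python
def coverage(parse_all, sentences):
    """Fraction of sentences for which the parser returns
    at least one analysis. `parse_all` is assumed to return
    a (possibly empty) list of analyses for a sentence."""
    if not sentences:
        raise ValueError("empty corpus")
    covered = sum(1 for sentence in sentences if parse_all(sentence))
    return covered / len(sentences)

# Hypothetical toy parser: it "parses" a sentence only if every
# word is in its small lexicon (an assumption for illustration).
LEXICON = {"the", "cat", "dog", "sleeps", "barks"}

def toy_parse_all(sentence):
    words = sentence.lower().split()
    if all(word in LEXICON for word in words):
        return [words]  # one dummy analysis
    return []           # no analysis found

corpus = ["The cat sleeps", "The dog barks", "The platypus sleeps"]
score = coverage(toy_parse_all, corpus)
print(score)
```

Because no annotation is needed, the same function can be run on any raw text, which is why coverage is listed among the unannotated-corpus methods.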




- PASSAGE, French evaluation campaign for syntactic parsing (2007-2009).

- The Parsing Task of EVALITA 2009


- The Parsing Task of EVALITA 2007

- EASY, Evaluation campaign for syntactic parsing organized by French Technolangue action EVALDA (2003-2006).

- XTAG, wide-coverage grammar development project for English using a lexicalized Tree Adjoining Grammar (TAG) formalism (1998).

- SPARKLE, Shallow Parsing and Knowledge extraction for Language Engineering, European project (1997-2000).

- GRACE, Grammars and Resources for Analyzers of Corpora and their Evaluation, part of the French CCIIL program (1994-1997).




- Workshop on Parsing with Categorial Grammars

- 11th International Conference on Parsing Technologies (IWPT’09)

- TLT 7, The 7th International Workshop on Treebanks and Linguistic Theories (2009).

- CoNLL Shared Task 2009: Syntactic and Semantic Dependencies in Multiple Languages (2009).

- EVALITA 2009, Parsing task.

- COLING 2008, workshop on "Cross-Framework and Cross-Domain Parser Evaluation".

- LREC 2008, workshop on "Partial Parsing Between Chunking and Deep Parsing".

- ACL 2008, workshop on "Parsing German".

- IJCAI, workshop on "Shallow Parsing in South Asian Languages".

- EVALITA 2007, Parsing task.

- COLING-ACL 2006, tutorial on "Dependency Parsing".

- MSPIL-06, First National Symposium on Modeling and Shallow Parsing of Indian Languages.

- LREC 2002, workshop on "Beyond PARSEVAL Towards Improved Evaluation Measures for Parsing Systems".

- COLING 2000 Workshop on "Efficiency in Large-scale Parsing Systems".

- LREC 1998, workshop on "The Evaluation of Parsing Systems".




- Freeling
- TreeTagger




- Evalb



More about evaluation measures.