Installation

No installation is needed; just download it from GitHub:
https://github.com/slavpetrov/berkeleyparser

Usage

java -jar BerkeleyParser-1.7.jar -gr <grammar>

The parser then starts and reads from standard input: type a sentence and it prints the parse of that sentence.
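The same interaction can be scripted instead of typed by hand. Below is a minimal Python sketch that sends sentences to the parser on STDIN and collects what it prints. The function name parse_sentences, the default paths, and the runner parameter (there only so the call can be swapped out when no JVM is available) are my own assumptions, not part of the parser. It also assumes the parser prints one tree per input line.

```python
import subprocess

def parse_sentences(sentences, jar="BerkeleyParser-1.7.jar", grammar="eng_sm6.gr",
                    runner=subprocess.run):
    """Send sentences to the parser on STDIN, one per line, and return its output lines.

    `runner` is a hypothetical hook for this sketch; by default it is subprocess.run.
    """
    cmd = ["java", "-jar", jar, "-gr", grammar]
    # One sentence per line; the parser expects already tokenized text by default.
    text = "\n".join(sentences) + "\n"
    result = runner(cmd, input=text, capture_output=True, text=True, check=True)
    return result.stdout.splitlines()

# Example call (needs Java, the jar, and a grammar file at the given paths):
# print(parse_sentences(["The cat sat on the mat ."]))
```

Passing the command as a list avoids shell quoting issues if the jar or grammar path contains spaces.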

More options

java -jar BerkeleyParser-1.7.jar

-render Write rendered tree to image file. (Default: false)
-maxLength Maximum sentence length (Default = 200).
-binarize Output binarized trees. (Default: false)
-variational Use variational rule score approximation instead of max-rule (Default: false)
-useGoldPOS Read data in CoNLL format, including gold part of speech tags.
-dumpPosteriors Dump max-rule posteriors to disk.
-ec_format Use Eugene Charniak's input and output format.
-nThreads Parse in parallel using n threads (Default: 1).
-modelScore Output effective model score (max rule score for max rule parser) (Default: false)
-keepFunctionLabels Retain predicted function labels. Model must have been trained with function labels. (Default: false)
-nGrammars Use a product model based on that many grammars
-chinese Enable some Chinese specific features in the lexicon.
-scores Output inside scores (only for binarized viterbi trees). (Default: false)
-tokenize Tokenize input first. (Default: false=text is already tokenized)
-substates Output subcategories (only for binarized viterbi trees). (Default: false)
-outputFile Store output in this file instead of printing it to STDOUT.
-confidence Output confidence measure, i.e. likelihood of tree given words: P(T|w) (Default: false)
-gr Grammarfile (Required)
-accurate Set thresholds for accuracy. (Default: set thresholds for efficiency)
-inputFile Read input from this file instead of reading it from STDIN.
-kbest Output the k best parse max-rule trees (Default: 1).
-tree_likelihood Output joint likelihood of tree and words: P(t,w) (Default: false)
-viterbi Compute viterbi derivation instead of max-rule tree (Default: max-rule)
-sentence_likelihood Output sentence likelihood, i.e. summing out all parse trees: P(w) (Default: false)
-h help message
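To keep these options manageable from a script, a small helper can assemble the command line. The sketch below is only an illustration: build_command and its keyword names are hypothetical, it covers just a few of the options listed above, and it assumes boolean options such as -tokenize and -accurate are passed as bare flags without a value.

```python
def build_command(jar, grammar, input_file=None, output_file=None,
                  kbest=1, n_threads=1, tokenize=False, accurate=False):
    """Build the argument list for a BerkeleyParser run (hypothetical helper)."""
    cmd = ["java", "-jar", jar, "-gr", grammar]  # -gr is the only required option
    if input_file:
        cmd += ["-inputFile", input_file]    # otherwise the parser reads STDIN
    if output_file:
        cmd += ["-outputFile", output_file]  # otherwise the parser writes STDOUT
    if kbest != 1:
        cmd += ["-kbest", str(kbest)]
    if n_threads != 1:
        cmd += ["-nThreads", str(n_threads)]
    if tokenize:
        cmd.append("-tokenize")              # assumed to be a bare flag
    if accurate:
        cmd.append("-accurate")              # assumed to be a bare flag
    return cmd
```

The returned list can be handed directly to subprocess.run.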

How I use it

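import subprocess

def berkeley_parse(input_file, output_file, berkeley_jar='./berkeleyparser/BerkeleyParser-1.7.jar', gr='./berkeleyparser/eng_sm6.gr'):
    # Run the parser on input_file and write the trees to output_file; check=True raises if java exits with an error.
    subprocess.run(['java', '-jar', berkeley_jar, '-gr', gr, '-inputFile', input_file, '-outputFile', output_file], check=True)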
berkeley_parse(input_file, output_file)
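Once the output file exists, the trees still have to be read back. The sketch below assumes each output line is a single bracketed tree in Penn Treebank style, for example ( (S (NP (DT The) (NN cat)) (VP (VBD sat))) ), and pulls out the (word, tag) pairs with a regular expression. The helper names tagged_words and read_tagged_sentences are my own, and this is not a full tree parser.

```python
import re

# Matches a preterminal such as "(DT The)": a tag and a word with no nested brackets.
_LEAF = re.compile(r"\(([^\s()]+)\s+([^\s()]+)\)")

def tagged_words(tree_line):
    """Return the (word, tag) pairs found in one bracketed tree line."""
    return [(word, tag) for tag, word in _LEAF.findall(tree_line)]

def read_tagged_sentences(lines):
    """Yield the (word, tag) pairs for every non-empty line, e.g. from an open file."""
    for line in lines:
        line = line.strip()
        if line:
            yield tagged_words(line)
```

For example, with open(output_file) as f, list(read_tagged_sentences(f)) gives one list of pairs per parsed sentence.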