Question
Is there a ready-to-use English grammar that I can just load and use in NLTK? I've searched for examples of parsing with NLTK, but it seems that I have to manually specify a grammar before parsing a sentence.
How-To
From 5. Categorizing and Tagging Words, you can do POS (Part-of-Speech) tagging this way:
- In [1]: import nltk
- In [4]: nltk.download('punkt')
- [nltk_data] Downloading package punkt to
- [nltk_data] C:\Users\johnlee\AppData\Roaming\nltk_data...
- [nltk_data] Unzipping tokenizers\punkt.zip.
- Out[4]: True
- In [5]: text = nltk.word_tokenize('And now for something completely different')
- In [7]: nltk.download('averaged_perceptron_tagger')
- In [8]: for w,pos in nltk.pos_tag(text):
- ...: print('{}/{} '.format(w, pos))
- ...:
- And/CC
- now/RB
- for/IN
- something/NN
- completely/RB
- different/JJ
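To find out what a tag such as CC or RB stands for, NLTK has a built-in tag set lookup; the sketch below assumes the 'tagsets' data package, downloaded the same way as 'punkt' above:
- In [9]: nltk.download('tagsets')
- In [10]: nltk.help.upenn_tagset('CC')  # prints the definition and example words for the CC tag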
If you encounter a conflict with an already-installed package while installing spaCy, you can sometimes bypass the issue with the --ignore-installed argument. Then let's see how to use this library (linguistic features):
Choi et al. (2015) found spaCy to be the fastest dependency parser available. It processes over 13,000 sentences a second, on a single thread. On the standard WSJ evaluation it scores 92.7%, over 1% more accurate than any of CoreNLP's models.
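Here is a minimal sketch of those linguistic features, assuming spaCy and its small English model en_core_web_sm are already installed (e.g. pip install -U spacy followed by python -m spacy download en_core_web_sm):
- import spacy
- nlp = spacy.load('en_core_web_sm')  # model name is an assumption; any installed English pipeline works
- doc = nlp('And now for something completely different')
- for token in doc:  # each token carries a POS tag and a dependency label out of the box
-     print('{}/{} ({}, head: {})'.format(token.text, token.pos_, token.dep_, token.head.text))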
Supplement
* Natural Language Processing Made Easy – using SpaCy (in Python)