stanford-nlp – Make Me Engineer

How to train the Stanford NLP Sentiment Analysis tool

May 31, 2023 by Tarik

What is the significance of, and difference between, each file (Train.txt/Dev.txt/Test.txt)? This is standard machine learning terminology. The train set is used to (surprise surprise) train a model. The development set is used to tune any parameters the model might have. What you would normally do is pick a parameter value, train a model on … Read more
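The train/dev/test roles described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the toy threshold "model", the `margin` parameter, and the tiny data sets are all hypothetical and have nothing to do with the Stanford sentiment trainer itself. The point is the workflow: train once per candidate parameter value, compare the candidates on the dev set, and touch the test set only once at the end.

```python
def train(train_set, margin):
    """Toy 'model' (hypothetical): a threshold halfway between the class
    means, shifted by the tunable parameter `margin`."""
    pos = [x for x, y in train_set if y == 1]
    neg = [x for x, y in train_set if y == 0]
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2 + margin


def accuracy(threshold, data):
    """Fraction of examples where 'x > threshold' agrees with the label."""
    correct = sum(1 for x, y in data if (x > threshold) == (y == 1))
    return correct / len(data)


def tune(train_set, dev_set, candidates):
    """Train one model per candidate value and keep the one that scores
    best on the dev set. The test set is never consulted here."""
    best = max(candidates, key=lambda m: accuracy(train(train_set, m), dev_set))
    return best, train(train_set, best)


# Made-up (feature, label) pairs standing in for Train.txt / Dev.txt / Test.txt.
train_set = [(1.0, 0), (2.0, 0), (6.0, 1), (7.0, 1)]
dev_set = [(3.0, 0), (4.5, 0), (5.5, 1), (8.0, 1)]
test_set = [(2.5, 0), (4.0, 0), (5.2, 1), (9.0, 1)]

best_margin, model = tune(train_set, dev_set, candidates=[-1.0, 0.0, 1.0])
print("chosen margin:", best_margin)
print("test accuracy:", accuracy(model, test_set))  # reported once, at the end
```

Keeping the test set out of `tune` is the design choice that matters: a score measured on data that influenced the parameter choice would be optimistic.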

Extract list of Persons and Organizations using Stanford NER Tagger in NLTK

May 28, 2023 by Tarik

Thanks to the link discovered by @Vaulstein, it is clear that the trained Stanford tagger, as distributed (at least in 2012), does not chunk named entities. From the accepted answer: Many NER systems use more complex labels such as IOB labels, where codes like B-PERS indicate where a person entity starts. The CRFClassifier class and … Read more
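Because the tagger labels each token separately rather than emitting chunks, one common workaround is to merge runs of adjacent tokens that share a label. The sketch below assumes the tagger output is a list of `(token, label)` pairs with `"O"` marking non-entities; the sample list is hand-written for illustration, not real tagger output, and `chunk_entities` is a hypothetical helper name.

```python
from itertools import groupby


def chunk_entities(tagged_tokens, outside_label="O"):
    """Merge consecutive tokens that carry the same label into one chunk.

    Returns a list of (entity_text, label) pairs and drops tokens
    tagged with `outside_label`.
    """
    chunks = []
    for label, group in groupby(tagged_tokens, key=lambda pair: pair[1]):
        if label != outside_label:
            text = " ".join(token for token, _ in group)
            chunks.append((text, label))
    return chunks


# Hand-written sample in the (token, label) shape assumed above.
tagged = [
    ("Rami", "PERSON"), ("Eid", "PERSON"), ("is", "O"), ("studying", "O"),
    ("at", "O"), ("Stony", "ORGANIZATION"), ("Brook", "ORGANIZATION"),
    ("University", "ORGANIZATION"), ("in", "O"), ("NY", "LOCATION"),
]

entities = chunk_entities(tagged)
persons = [text for text, label in entities if label == "PERSON"]
organizations = [text for text, label in entities if label == "ORGANIZATION"]
print(persons)
print(organizations)
```

Without IOB-style begin markers, this heuristic cannot separate two different entities of the same type that happen to sit next to each other; they are merged into a single chunk.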

How can I split a text into sentences using the Stanford parser?

November 20, 2022 by Tarik

You can check the DocumentPreprocessor class. Below is a short snippet. I think there may be other ways to do what you want. String paragraph = "My 1st sentence. “Does it work for questions?” My third sentence."; Reader reader = new StringReader(paragraph); DocumentPreprocessor dp = new DocumentPreprocessor(reader); List<String> sentenceList = new ArrayList<String>(); for (List<HasWord> sentence … Read more

Java Stanford NLP: Part of Speech labels?

October 6, 2022 by Tarik

The Penn Treebank Project. Look at the Part-of-speech tagging ps. JJ is adjective. NNS is noun, plural. VBP is verb present tense. RB is adverb. That’s for English. For Chinese, it’s the Penn Chinese Treebank. And for German it’s the NEGRA corpus. CC: Coordinating conjunction; CD: Cardinal number; DT: Determiner; EX: Existential there; FW: Foreign … Read more
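A small lookup table is often enough to make tagger output readable. The sketch below covers only the tags spelled out in the excerpt above, not the full Penn Treebank tag set; `PENN_TAGS`, `describe`, and the sample tagged words are hypothetical names and data chosen for illustration.

```python
# Partial Penn Treebank tag table: only the tags listed in the excerpt.
PENN_TAGS = {
    "CC": "Coordinating conjunction",
    "CD": "Cardinal number",
    "DT": "Determiner",
    "EX": "Existential there",
    "JJ": "Adjective",
    "NNS": "Noun, plural",
    "RB": "Adverb",
    "VBP": "Verb, non-3rd person singular present",
}


def describe(tagged_words):
    """Attach a human-readable description to each (word, tag) pair.

    Tags missing from the partial table are reported as 'unknown tag'.
    """
    return [(word, tag, PENN_TAGS.get(tag, "unknown tag"))
            for word, tag in tagged_words]


# Hand-written sample, standing in for real tagger output.
sample = [("quick", "JJ"), ("dogs", "NNS"), ("run", "VBP"), ("fast", "RB")]
for word, tag, meaning in describe(sample):
    print(f"{word}\t{tag}\t{meaning}")
```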

How to use Stanford Parser in NLTK using Python

May 2, 2022 by Tarik

Note that this answer applies to NLTK v 3.0, and not to more recent versions. Sure, try the following in Python: import os from nltk.parse import stanford os.environ['STANFORD_PARSER'] = '/path/to/standford/jars' os.environ['STANFORD_MODELS'] = '/path/to/standford/jars' parser = stanford.StanfordParser(model_path="/location/of/the/englishPCFG.ser.gz") sentences = parser.raw_parse_sents(("Hello, My name is Melroy.", "What is your name?")) print sentences # GUI for line in sentences: … Read more