Java API
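The BabelNet Java API requires JRE 1.7 or above. Since the BabelNet 3.5 release, data retrieved through the API is by default returned only in the search language. To retrieve other languages as well, pass the Collection<Language> filterLangs parameter, which the most important API methods accept. At most three languages besides the search language can be retrieved. The usage example below that retrieves synsets in English, Italian and French shows this parameter in use.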
The javadoc for the API is available online. To work with the Java API, first unpack the zip file, which is available in the download section.
Configuration
The first step is to configure the properties files located inside the BabelNet API folder. For instance, assume you received the following key by email (see key & limits): abcdefghilmnopqrstuvz
Then edit the babelnet.var.properties file inside BabelNet-API-3.7/config/ and add a line that sets this key.
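The exact line to add did not survive on this page. As a rough sketch, assuming the property is named babelnet.key (an assumption, so check it against the comments in your own babelnet.var.properties), the entry is an ordinary Java properties pair. The snippet below only uses java.util.Properties to show how such a line is parsed; KeyConfigSketch and readKey are hypothetical names and not part of the BabelNet API.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Sketch only: parses properties text the way java.util.Properties would read
// babelnet.var.properties. The property name "babelnet.key" is an assumption.
final class KeyConfigSketch {

    // Returns the value of the (assumed) babelnet.key property, or null if absent.
    static String readKey(String propertiesText) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(propertiesText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props.getProperty("babelnet.key");
    }

    public static void main(String[] args) {
        // The key below is the placeholder used in this post, not a real key.
        String text = "babelnet.key=abcdefghilmnopqrstuvz\n";
        System.out.println(readKey(text));
    }
}
```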
Running the BabelNet demo
You can now test that everything works with the sample code below:
- BabelNetDemo.groovy
- package demo
- import it.uniroma1.lcl.babelnet.BabelImage;
- import it.uniroma1.lcl.babelnet.BabelNet;
- import it.uniroma1.lcl.babelnet.BabelNetUtils;
- import it.uniroma1.lcl.babelnet.BabelSense;
- import it.uniroma1.lcl.babelnet.BabelSenseComparator;
- import it.uniroma1.lcl.babelnet.BabelSynset;
- import it.uniroma1.lcl.babelnet.BabelSynsetComparator;
- import it.uniroma1.lcl.babelnet.BabelSynsetID;
- import it.uniroma1.lcl.babelnet.BabelSynsetIDRelation;
- import it.uniroma1.lcl.babelnet.InvalidBabelSynsetIDException;
- import it.uniroma1.lcl.babelnet.data.BabelGloss;
- import it.uniroma1.lcl.babelnet.data.BabelPOS;
- import it.uniroma1.lcl.babelnet.data.BabelSenseSource;
- import it.uniroma1.lcl.jlt.util.Language;
- import it.uniroma1.lcl.jlt.util.ScoredItem;
- import java.io.IOException;
- import java.util.Arrays;
- import java.util.Collections;
- import java.util.List;
- def mainTest() throws IOException
- {
- BabelNet bn = BabelNet.getInstance();
- String word = "bank";
- System.out.println("SYNSETS WITH English word: \""+word+"\"");
- List<BabelSynset> synsets = bn.getSynsets(word, Language.EN);
- Collections.sort(synsets, new BabelSynsetComparator(word));
- for (BabelSynset synset : synsets)
- {
- // synset.getId()
- printf(String.format('=>(%s)SOURCE: %s;TYPE: %s; WN SYNSET: %s \n\tMAIN LEMMA: %s \n',
- synset.getId(), synset.getSynsetSource(), synset.getSynsetType(), synset.getWordNetOffsets(),
- synset.getMainSense(Language.EN)))
- printf("\t=== IMAGES ===:\n")
- for(Object o:synset.getImages())
- {
- try
- {
- System.out.printf(String.format('\t%s\n', o)) // it.uniroma1.lcl.babelnet.BabelImage
- }
- catch(Exception e){}
- }
- printf("\t=== CATEGORY ===:\n")
- for(Object o:synset.getCategories())
- {
- try{
- printf(String.format("\t$o\n"))
- }
- catch(Exception e){}
- }
- printf("\t===SENSES (Italian) ===: {\n")
- for (BabelSense sense : synset.getSenses(Language.IT))
- {
- try{
- printf(String.format("\t%s (%s)\n", sense.toString(), sense.getPronunciations()))
- } catch(Exception e){printf("\t%s\n", e.getMessage())}
- }
- System.out.println("}")
- System.out.println()
- }
- }
- try
- {
- BabelNet bn = BabelNet.getInstance();
- System.out.println("=== DEMO ===");
- BabelSynset synset = bn.getSynset(new BabelSynsetID("bn:03083790n"));
- System.out.println("Welcome on "+synset.getMainSense(Language.EN).getLemma().replace("_", " "));
- System.out.println(synset.getMainGloss(Language.EN).getGloss());
- mainTest()
- }
- catch(Exception e)
- {
- e.printStackTrace()
- }
To look up a Chinese word instead, replace the word with "銀行" (bank), change Language.EN to Language.ZH (a value of the Language enum), and run the sample code again.
Explaining the code
Let's now look at the main classes in the BabelNet API. For more details, see the API javadoc.
Main classes
The main classes of BabelNet are BabelNet, BabelSynset and BabelSense.
BabelNet
The BabelNet class is used as the entry point to access all the content available in BabelNet. The class is implemented through the Singleton pattern, where we restrict the instantiation of the BabelNet class to one object. You can obtain a reference to the only instance of the BabelNet class with the following line:
- BabelNet bn = BabelNet.getInstance();
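For readers unfamiliar with the Singleton pattern mentioned above, here is a minimal sketch of the idea. CatalogSketch is a hypothetical stand-in class; how BabelNet implements its own singleton internally is not shown in this post, so treat this only as an illustration of the pattern.

```java
// Minimal Singleton sketch. CatalogSketch is a hypothetical stand-in,
// not the BabelNet class itself.
final class CatalogSketch {
    // The single instance, created once when the class is initialized.
    private static final CatalogSketch INSTANCE = new CatalogSketch();

    // Private constructor: no code outside this class can call new.
    private CatalogSketch() { }

    // Every caller receives the same object.
    static CatalogSketch getInstance() {
        return INSTANCE;
    }

    public static void main(String[] args) {
        CatalogSketch a = CatalogSketch.getInstance();
        CatalogSketch b = CatalogSketch.getInstance();
        System.out.println(a == b); // prints true: both refer to the same object
    }
}
```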
BabelSynset
A BabelSynset is a set of multilingual lexicalizations that are synonyms expressing a given concept or named entity. For instance, one synset groups the lexicalizations of car in the motorcar sense. After creating the BabelNet object, which we call bn, we can use its methods to retrieve one or many BabelSynset objects. For instance, to retrieve all the synsets containing car we can call the BabelNet#getSynsets method:
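- // Given a word in a certain language,
- // returns the concepts (`BabelSynsets`) denoted by the word.
- List<BabelSynset> byl = bn.getSynsets("car", Language.EN);
We can also specify a part of speech and obtain only the synsets for it. The following example retrieves all the verbal synsets containing the English lexicalization run:
- // Given a word in a certain language and pos (part of speech),
- // returns the concepts denoted by the word.
- List<BabelSynset> byl = bn.getSynsets("run", Language.EN, BabelPOS.VERB);
The full list of BabelPOS values is: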
- public enum BabelPOS
- {
- NOUN,
- ADJECTIVE,
- VERB,
- ADVERB,
- INTERJECTION,
- PREPOSITION,
- ARTICLE,
- DETERMINER,
- CONJUNCTION,
- PRONOUN;
- }
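As of version 3.0, however, only the open-class parts of speech (i.e., the first four) are available. Because of the nature of BabelNet, a BabelSynset may contain lexicalizations from different sources, and you can restrict your search to the sources you are interested in. For instance:
- // Given a word in a certain language, returns the concepts
- // for the word available in the given sense sources.
- List<BabelSynset> byl = bn.getSynsets("run", Language.EN, BabelPOS.NOUN, BabelSenseSource.WIKI, BabelSenseSource.OMWIKI);
The full list of BabelSenseSource values is: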
- public enum BabelSenseSource
- {
- BABELNET, // BabelNet senses, not available as of version 3.0
- WN, // WordNet senses
- OMWN, // Open Multilingual WordNet
- WONEF, // WordNet du Francais
- WIKI, // Wikipedia page
- WIKIDIS, // Wikipedia disambiguation pages
- WIKIDATA, // Wikidata senses
- OMWIKI, // OmegaWiki senses
- WIKICAT, // Wikipedia category, not available as of version 3.0
- WIKIRED, // Wikipedia redirections
- WIKT, // Wiktionary senses
- WIKIQU, // Wikiquote page
- WIKIQUREDI, // Wikiquote redirections
- WIKTLB, // Wiktionary translation label
- VERBNET, // VerbNet senses
- FRAMENET, // FrameNet senses
- MSTERM, // Microsoft Terminology items
- GEONM, // GeoNames items
- WNTR, // Translations of WordNet senses
- WIKITR // Translations of Wikipedia links
- }
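Each BabelSynset has an ID that uniquely identifies it and that can be obtained via the BabelSynset#getId method. If we have an ID and want to retrieve the corresponding synset, we can use the overloaded BabelNet#getSynset method. For instance:
- // Gets a BabelSynset from a concept identifier (Babel synset ID).
- BabelSynset by = bn.getSynset(new BabelSynsetID("bn:03083790n"));
This returns the BabelSynset with ID bn:03083790n, that is, the synset about BabelNet.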
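Judging only from the sample IDs in this post (bn:03083790n, bn:00000356n and so on), a Babel synset ID appears to be the prefix bn:, eight digits, and a trailing part-of-speech letter. The sketch below checks a string against that shape before it is handed to the API. The pattern and the set of letters (n, v, a, r) are assumptions inferred from the examples, and SynsetIdSketch is a hypothetical helper, not the library's own validation.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Sketch only: the ID shape below is inferred from the sample IDs in this post
// (e.g. bn:03083790n); it is an assumption, not the library's own validation.
final class SynsetIdSketch {
    // "bn:" + eight digits + one part-of-speech letter (n, v, a or r assumed).
    private static final Pattern BABEL_ID = Pattern.compile("bn:(\\d{8})([nvar])");

    // True if the whole string matches the assumed ID shape.
    static boolean looksLikeBabelSynsetId(String id) {
        return id != null && BABEL_ID.matcher(id).matches();
    }

    // Returns the trailing part-of-speech letter, or '?' if the ID does not match.
    static char posLetter(String id) {
        if (id == null) {
            return '?';
        }
        Matcher m = BABEL_ID.matcher(id);
        return m.matches() ? m.group(2).charAt(0) : '?';
    }

    public static void main(String[] args) {
        System.out.println(looksLikeBabelSynsetId("bn:03083790n"));
        System.out.println(looksLikeBabelSynsetId("Q4837690"));
        System.out.println(posLetter("bn:03083790n"));
    }
}
```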
If we want to retrieve the BabelSynset corresponding to a given WordNet 3.0 ID, we can use another overloaded version of the BabelNet#getSynset method:
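- // Gets the BabelSynset corresponding to an input WordNet offset.
- BabelSynset by = bn.getSynset(new WordNetSynsetID("wn:06879521n"));
If we want to retrieve the BabelSynset corresponding to an ID from an older WordNet version, we can pass the version as well:
- // Gets the BabelSynset corresponding to an input WordNet offset of version 2.1.
- BabelSynset by = bn.getSynset(new WordNetSynsetID("wn:06303048n", WordNetVersion.WN_21));
If we want to retrieve the BabelSynset corresponding to a given Wikidata page ID, we can use another overloaded version of the BabelNet#getSynset method:
- // Gets the BabelSynset corresponding to an input Wikidata page ID.
- BabelSynset by = bn.getSynset(new WikidataID("Q4837690"));
If we want to retrieve the BabelSynsets containing a given Wikipedia page title, we can use the BabelNet#getSynsets method:
- // Given a Wikipedia title, returns the BabelSynsets which contain it.
- List<BabelSynset> byl = bn.getSynsets(new WikipediaID("Men in Black (film 1997)", Language.IT, BabelPOS.NOUN));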
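BabelSense
A BabelSense is a term (either a word or a multi-word expression) in a given language occurring in a certain BabelSynset. Each occurrence of the same term (e.g., car) in different synsets is therefore a different BabelSense of that term. Let's look at the methods for retrieving BabelSense objects using the bn object we created earlier:
- // Returns the senses for the word in a certain language.
- List<BabelSense> senses = bn.getSenses("home", Language.EN);
- // Returns the senses for the word in a certain language and Part-Of-Speech.
- List<BabelSense> senses = bn.getSenses("run", Language.EN, BabelPOS.VERB);
- // Returns the senses for the word with the given constraints.
- List<BabelSense> senses = bn.getSenses("run", Language.EN, BabelPOS.VERB, BabelSenseSource.WIKI, BabelSenseSource.OMWIKI);
Once we have a BabelSense, we can go back to the synset it belongs to using the BabelSense#getSynset method:
- // sense is any BabelSense, e.g. one element of the senses list above.
- BabelSynset by = sense.getSynset();
We can view the BabelSynset as a container of BabelSense objects, i.e., the lexicalizations in the various languages contained in the synset that express its concept or named entity.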
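The relationship between the two classes, a synset holding its senses and each sense pointing back to its synset, can be pictured with a small stand-in model. The Synset and Sense classes below are hypothetical simplifications written for this sketch; they do not reproduce the fields or behaviour of the real BabelSynset and BabelSense.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Stand-in model, not the BabelNet classes: a synset holds its senses,
// and every sense keeps a reference back to the synset that contains it.
final class SynsetModelSketch {

    static final class Synset {
        final String id;
        private final List<Sense> senses = new ArrayList<>();

        Synset(String id) {
            this.id = id;
        }

        // Creates a sense for the given lemma and language and registers it here.
        Sense addSense(String lemma, String language) {
            Sense sense = new Sense(lemma, language, this);
            senses.add(sense);
            return sense;
        }

        // Read-only view of the senses contained in this synset.
        List<Sense> getSenses() {
            return Collections.unmodifiableList(senses);
        }
    }

    static final class Sense {
        final String lemma;
        final String language;
        private final Synset synset;

        Sense(String lemma, String language, Synset synset) {
            this.lemma = lemma;
            this.language = language;
            this.synset = synset;
        }

        // Navigates from the sense back to its containing synset.
        Synset getSynset() {
            return synset;
        }
    }

    public static void main(String[] args) {
        Synset car = new Synset("example:car"); // made-up identifier
        Sense english = car.addSense("car", "EN");
        car.addSense("automobile", "IT");
        System.out.println(car.getSenses().size());
        System.out.println(english.getSynset() == car);
    }
}
```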
Some methods of BabelSynset and BabelSense
We now go into detail about the important methods of the BabelSynset and BabelSense classes.
Main BabelSynset methods
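A BabelSynset is composed of various elements, which we describe below. Furthermore, a BabelSynset is connected to other BabelSynset objects. The main components of a BabelSynset are objects of the following types: BabelSense, BabelGloss, BabelExample, BabelImage, BabelSynsetIDRelation (the edges to other synsets) and BabelCategory.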
Let's take a look at the main methods of a BabelSynset object which we call by. Note: to obtain BabelSynset objects we can also use the above examples.
- // Gets a BabelSynset from a concept identifier (Babel synset ID).
- BabelSynset by = bn.getSynset(new BabelSynsetID("bn:03083790n"));
- // Most relevant BabelSense to this BabelSynset for a given language.
- BabelSense bs = by.getMainSense(Language.EN);
- // Gets the part of speech of this BabelSynset.
- BabelPOS pos = by.getPOS();
- // True if the BabelSynset is a key concept
- boolean iskeyConcept = by.isKeyConcept();
- // Gets the senses contained in this BabelSynset.
- List<BabelSense> senses = by.getSenses();
- // Collects all BabelGlosses for this BabelSynset.
- List<BabelGloss> glosses = by.getGlosses();
- // Collects all BabelExamples for this BabelSynset.
- List<BabelExample> examples = by.getExamples();
- // Gets the images (BabelImages) of this BabelSynset.
- List<BabelImage> images = by.getImages();
- // Collects all the edges incident on this BabelSynset.
- List<BabelSynsetIDRelation> edges = by.getEdges();
- // Gets the BabelCategory objects of this BabelSynset.
- List<BabelCategory> cats = by.getCategories();
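Main BabelSense methods
We now have a look at the BabelSense methods. The main components of a BabelSense are its language, part of speech, lemma (full and simple), pronunciations and source.
Some code retrieving the above information follows: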
- BabelSense bs = by.getMainSense(Language.EN);
- // Gets the language of this BabelSense
- Language lang = bs.getLanguage();
- // Gets the part-of-speech tag of this BabelSense
- BabelPOS pos = bs.getPOS();
- // True if the BabelSense is a key concept
- boolean iskeyConcept = bs.isKeyConcept();
- // Gets the lemma of this BabelSense
- String lemma = bs.getLemma();
- // Gets the simple lemma of this sense (i.e., without parentheses, etc.)
- String simpleLemma = bs.getSimpleLemma();
- // Gets the pronunciations of this sense
- BabelSensePhonetics pronunciations = bs.getPronunciations();
- // Gets the source of this sense, e.g. Wikipedia, WordNet, etc.
- BabelSenseSource source = bs.getSource();
Usage examples
Here are complete examples that show how to use the BabelNet API to accomplish several tasks.
Retrieve all BabelSynset objects for a specific word
- import it.uniroma1.lcl.babelnet.BabelNet;
- import it.uniroma1.lcl.babelnet.BabelSynset;
- import it.uniroma1.lcl.jlt.util.Language;
- import java.io.IOException;
- public class Example {
- public static void main(String[] args) throws IOException {
- BabelNet bn = BabelNet.getInstance();
- for (BabelSynset synset : bn.getSynsets("home", Language.EN)) {
- System.out.println("Synset ID: " + synset.getId());
- }
- }
- }
Retrieve all BabelSynset objects for a specific word in English, Italian and French
- import it.uniroma1.lcl.babelnet.BabelNet;
- import it.uniroma1.lcl.babelnet.BabelSynset;
- import it.uniroma1.lcl.jlt.util.Language;
- import java.io.IOException;
- import java.util.Arrays;
- public class Example {
- public static void main(String[] args) throws IOException {
- BabelNet bn = BabelNet.getInstance();
- for (BabelSynset synset : bn.getSynsets("home", Language.EN, Arrays.asList(Language.IT, Language.FR))) {
- System.out.println("Synset ID: " + synset.getId());
- }
- }
- }
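The API returns at most three languages besides the search language, so it can help to check the filter list before making a call. The helper below is a hypothetical client-side check written for this post. It uses plain strings instead of the Language enum so that it runs without the BabelNet jar, and it says nothing about how the library itself reacts to a longer list.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical client-side check; it is not part of the BabelNet API.
final class FilterLangsSketch {
    // Limit described for the filterLangs parameter.
    static final int MAX_EXTRA_LANGUAGES = 3;

    // Returns the distinct extra languages (search language excluded),
    // or throws if more than three would be requested.
    static Set<String> checkFilterLangs(String searchLang, List<String> filterLangs) {
        Set<String> extra = new LinkedHashSet<>(filterLangs);
        extra.remove(searchLang);
        if (extra.size() > MAX_EXTRA_LANGUAGES) {
            throw new IllegalArgumentException(
                    "At most " + MAX_EXTRA_LANGUAGES + " languages besides " + searchLang
                    + " can be requested, got " + extra.size());
        }
        return extra;
    }

    public static void main(String[] args) {
        System.out.println(checkFilterLangs("EN", Arrays.asList("IT", "FR")));
    }
}
```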
Retrieve all BabelSense objects for a specific BabelSynset object
- import it.uniroma1.lcl.babelnet.BabelNet;
- import it.uniroma1.lcl.babelnet.BabelSynsetID;
- import it.uniroma1.lcl.babelnet.InvalidBabelSynsetIDException;
- import it.uniroma1.lcl.babelnet.data.BabelAudio;
- import it.uniroma1.lcl.babelnet.BabelSense;
- import it.uniroma1.lcl.babelnet.data.BabelSensePhonetics;
- import java.io.IOException;
- public class Example {
- public static void main(String[] args) throws IOException, InvalidBabelSynsetIDException {
- BabelNet bn = BabelNet.getInstance();
- for (BabelSense sense : bn.getSynset(new BabelSynsetID("bn:00000356n"))) {
- System.out.println("Sense: " + sense.getLemma()
- + "\tLanguage: " + sense.getLanguage().toString()
- + "\tSource: " + sense.getSource().toString());
- BabelSensePhonetics phonetic = sense.getPronunciations();
- for (BabelAudio audio : phonetic.getAudioItems()) {
- System.out.println("Audio URL " + audio.getValidatedUrl());
- }
- }
- }
- }
Retrieve all BabelSense objects for a specific Wikidata page id
- import java.io.IOException;
- import it.uniroma1.lcl.babelnet.BabelNet;
- import it.uniroma1.lcl.babelnet.BabelSense;
- import it.uniroma1.lcl.babelnet.data.BabelAudio;
- import it.uniroma1.lcl.babelnet.data.BabelSensePhonetics;
- import it.uniroma1.lcl.babelnet.resources.WikidataID;
- public class Example {
- public static void main(String[] args) throws IOException {
- BabelNet bn = BabelNet.getInstance();
- for (BabelSense sense : bn.getSynset(new WikidataID("Q4837690"))) {
- System.out.println("Sense: " + sense.getLemma()
- + "\tLanguage: " + sense.getLanguage().toString()
- + "\tSource: " + sense.getSource().toString());
- BabelSensePhonetics phonetic = sense.getPronunciations();
- for (BabelAudio audio : phonetic.getAudioItems()) {
- System.out.println("Audio URL " + audio.getValidatedUrl());
- }
- }
- }
- }
Retrieve Wikidata id for each BabelSense in a BabelSynset
- import it.uniroma1.lcl.babelnet.BabelNet;
- import it.uniroma1.lcl.babelnet.BabelSense;
- import it.uniroma1.lcl.babelnet.BabelSynset;
- import it.uniroma1.lcl.babelnet.BabelSynsetID;
- import it.uniroma1.lcl.babelnet.InvalidBabelSynsetIDException;
- import it.uniroma1.lcl.babelnet.data.BabelSenseSource;
- import java.io.IOException;
- public class Example {
- public static void main(String[] args) throws IOException, InvalidBabelSynsetIDException {
- BabelNet bn = BabelNet.getInstance();
- BabelSynset by = bn.getSynset(new BabelSynsetID("bn:00000288n"));
- for (BabelSense sense : by.getSenses(BabelSenseSource.WIKIDATA)) {
- String sensekey = sense.getSensekey();
- System.out.println(sense.getLemma() + "\t" + sense.getLanguage() + "\t" + sensekey);
- }
- }
- }
Retrieve neighbors of a BabelSynset object
- import it.uniroma1.lcl.babelnet.BabelNet;
- import it.uniroma1.lcl.babelnet.BabelSynset;
- import it.uniroma1.lcl.babelnet.BabelSynsetID;
- import it.uniroma1.lcl.babelnet.BabelSynsetIDRelation;
- import it.uniroma1.lcl.babelnet.InvalidBabelSynsetIDException;
- import it.uniroma1.lcl.jlt.util.Language;
- import java.io.IOException;
- public class Example {
- public static void main(String[] args) throws IOException, InvalidBabelSynsetIDException {
- BabelNet bn = BabelNet.getInstance();
- BabelSynset by = bn.getSynset(new BabelSynsetID("bn:00044492n"));
- for(BabelSynsetIDRelation edge : by.getEdges()) {
- System.out.println(by.getId()+"\t"+by.getMainSense(Language.EN).getLemma()+" - "
- + edge.getPointer()+" - "
- + edge.getBabelSynsetIDTarget());
- }
- }
- }
Retrieve all hypernyms (上位詞) of a BabelSynset object
- import it.uniroma1.lcl.babelnet.BabelNet;
- import it.uniroma1.lcl.babelnet.BabelSynset;
- import it.uniroma1.lcl.babelnet.BabelSynsetID;
- import it.uniroma1.lcl.babelnet.BabelSynsetIDRelation;
- import it.uniroma1.lcl.babelnet.InvalidBabelSynsetIDException;
- import it.uniroma1.lcl.babelnet.data.BabelPointer;
- import it.uniroma1.lcl.jlt.util.Language;
- import java.io.IOException;
- public class Example {
- public static void main(String[] args) throws IOException, InvalidBabelSynsetIDException {
- BabelNet bn = BabelNet.getInstance();
- BabelSynset by = bn.getSynset(new BabelSynsetID("bn:00015556n"));
- for(BabelSynsetIDRelation edge : by.getEdges(BabelPointer.ANY_HYPERNYM)) {
- System.out.println(by.getId()+"\t"+by.getMainSense(Language.EN).getLemma()+" - "
- + edge.getPointer()+" - "
- + edge.getBabelSynsetIDTarget());
- }
- }
- }
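Conceptually, passing a pointer type to getEdges narrows a synset's edges to one family of relations. The sketch below mimics that filtering step with stand-in types so that it runs without the BabelNet jar. Relation, Edge and targetsOf are hypothetical names, and the real BabelPointer covers many more relation types than the three listed here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Stand-in types for illustration only; they are not the BabelNet classes.
final class EdgeFilterSketch {
    enum Relation { HYPERNYM, HYPONYM, MERONYM }

    static final class Edge {
        final Relation relation;
        final String targetId;

        Edge(Relation relation, String targetId) {
            this.relation = relation;
            this.targetId = targetId;
        }
    }

    // Returns the target IDs of the edges whose relation matches the wanted one.
    static List<String> targetsOf(List<Edge> edges, Relation wanted) {
        List<String> targets = new ArrayList<>();
        for (Edge edge : edges) {
            if (edge.relation == wanted) {
                targets.add(edge.targetId);
            }
        }
        return targets;
    }

    public static void main(String[] args) {
        // Made-up target identifiers.
        List<Edge> edges = Arrays.asList(
                new Edge(Relation.HYPERNYM, "t1"),
                new Edge(Relation.HYPONYM, "t2"),
                new Edge(Relation.HYPERNYM, "t3"));
        System.out.println(targetsOf(edges, Relation.HYPERNYM));
    }
}
```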
Retrieve all surface forms of a BabelSynset object
- import it.uniroma1.lcl.babelnet.BabelNet;
- import it.uniroma1.lcl.babelnet.BabelSynset;
- import it.uniroma1.lcl.babelnet.BabelSynsetID;
- import it.uniroma1.lcl.babelnet.InvalidBabelSynsetIDException;
- import it.uniroma1.lcl.jlt.util.Language;
- import java.io.IOException;
- public class Example {
- public static void main(String[] args) throws IOException, InvalidBabelSynsetIDException {
- BabelNet bn = BabelNet.getInstance();
- // Placeholder synset: the ID is reused from the hypernym example above;
- // substitute the synset you need.
- BabelSynset by = bn.getSynset(new BabelSynsetID("bn:00015556n"));
- // Print every other surface form of the synset in English.
- for(String form : by.getOtherForms(Language.EN)) {
- System.out.println(by.getId()+"\t"+by.getMainSense(Language.EN).getLemma()+" - "
- + form);
- }
- }
- }
Retrieve the distribution of relationships (frequency of each BabelPointer type) for a specific word
- import java.io.IOException;
- import java.util.List;
- import java.util.Map;
- import java.util.function.Function;
- import java.util.stream.Collectors;
- import it.uniroma1.lcl.babelnet.BabelNet;
- import it.uniroma1.lcl.babelnet.BabelSynset;
- import it.uniroma1.lcl.babelnet.InvalidBabelSynsetIDException;
- import it.uniroma1.lcl.jlt.util.Language;
- public class Example {
- public static void main(String[] args) throws IOException, InvalidBabelSynsetIDException {
- BabelNet bn = BabelNet.getInstance();
- List<BabelSynset> synsets = bn.getSynsets("car", Language.EN);
- Map<String, Long> totalByDept = synsets.parallelStream()
- .flatMap( synset -> synset.getEdges().parallelStream())
- .map( edge -> edge.getPointer().getSymbol())
- .collect(Collectors.groupingByConcurrent(Function.identity(), Collectors.counting()));
- totalByDept.forEach((pointer, freq) -> System.out.println(pointer + "\t" + freq));
- }
- }
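The counting step in the example above relies only on the standard stream collectors, so it can be tried without BabelNet at all. The sketch below applies the same grouping-and-counting idea to a list of made-up pointer symbols. It uses the sequential groupingBy with a TreeMap, rather than groupingByConcurrent, so that the printed order is stable; PointerCountSketch and countSymbols are hypothetical names.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Function;
import java.util.stream.Collectors;

// Sketch of the grouping-and-counting step on plain strings.
final class PointerCountSketch {
    // Counts how often each symbol occurs; the TreeMap keeps the keys sorted.
    static Map<String, Long> countSymbols(List<String> symbols) {
        return symbols.stream()
                .collect(Collectors.groupingBy(Function.identity(), TreeMap::new, Collectors.counting()));
    }

    public static void main(String[] args) {
        // Made-up pointer symbols standing in for edge.getPointer().getSymbol().
        List<String> symbols = Arrays.asList("@", "r", "@", "~", "r", "@");
        countSymbols(symbols).forEach((symbol, freq) -> System.out.println(symbol + "\t" + freq));
    }
}
```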