A Complete Guide to Using WordNET in NLP Applications – Analytics India Magazine

Natural language processing covers a variety of tasks, such as automatic text classification, sentiment analysis, and text summarization. These tasks depend partly on sentence patterns and partly on the meaning of words in different contexts. Two different words can be similar to some degree: for example, ‘jog’ and ‘run’ differ in some respects and overlap in others. To perform such tasks, a model needs to capture what words mean in different positions and how similar they are to one another. This is where WordNET comes into the picture, helping to solve these linguistic problems for NLP models.
WordNET is a lexical database of semantic relations between words in more than 200 languages. In this article, we will discuss WordNET in detail, covering its structure, how it works, and how to implement it.
What is WordNET
WordNET is a lexical database of words in more than 200 languages in which adjectives, adverbs, nouns, and verbs are grouped into sets of cognitive synonyms, each expressing a distinct concept. These sets, called synsets, are interlinked in the database through lexical and semantic relations. WordNET is publicly available for download, and its network of related words and concepts can also be explored in a browser.
The Distinction Between WordNET and Thesaurus
A thesaurus helps us find the synonyms and antonyms of words, while WordNET does more than that. WordNET interlinks specific senses of words, whereas a thesaurus links words by their meaning only. In WordNET, words that sit in close proximity to each other in the network are semantically disambiguated. A thesaurus groups words at a single level when they have similar meanings, whereas WordNET labels the semantic relations between words, which is a better way of grouping them.
Structure of WordNET
The main relation between words in the WordNET network is synonymy, as between sad and unhappy, or benefit and profit. Such words denote the same concept and can be interchanged in similar contexts. They are grouped into synsets, which are unordered sets, and each synset is linked to other synsets through a small number of conceptual relations. Every synset in the network has its own brief definition, and many are illustrated with an example of how to use the words in a sentence. These definitions and examples are part of what sets WordNET apart.
For example, the synset for benefit holds its synonyms together with a definition and an example of how the word benefit is used. This synset is related to another synset in which the words benefit and profit have exactly the same meaning.
In this way, the synsets in the network are interlinked through the conceptual relations between their words.
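To make the structure concrete, here is a minimal Python sketch of a synset as a data structure. It is a toy model, not WordNET's actual storage format: the class name ToySynset, its field names, and the sample definitions and example sentence are all hypothetical, chosen only to mirror the parts described above (an unordered set of synonyms, a brief definition, optional examples, and links to related synsets).

```python
from dataclasses import dataclass, field


@dataclass
class ToySynset:
    """A toy model of one synset (hypothetical, not WordNET's real format)."""
    lemmas: frozenset                               # unordered set of synonyms
    definition: str                                 # brief gloss of the concept
    examples: list = field(default_factory=list)    # optional usage sentences
    related: dict = field(default_factory=dict)     # relation name -> synsets

    def link(self, relation, other):
        """Record a labeled conceptual relation to another synset."""
        self.related.setdefault(relation, []).append(other)


# Illustrative content only; the wording below is invented for this sketch.
gain = ToySynset(
    lemmas=frozenset({"benefit", "profit"}),
    definition="to derive an advantage from something",
    examples=["She benefited from the extra practice."],
)
sad = ToySynset(lemmas=frozenset({"sad", "unhappy"}), definition="feeling sorrow")
glad = ToySynset(lemmas=frozenset({"glad", "happy"}), definition="feeling joy")
sad.link("antonym", glad)

print(sorted(gain.lemmas))                                   # ['benefit', 'profit']
print("profit" in gain.lemmas)                               # True
print([sorted(s.lemmas) for s in sad.related["antonym"]])    # [['glad', 'happy']]
```

Storing the lemmas in a frozenset reflects the point that a synset is unordered: membership matters, position does not.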
Relations in the WordNET
Hypernymy and hyponymy: In linguistics, a hypernym (also known as a superordinate) is a word with a broad meaning that constitutes a category into which words with more specific meanings fall. For example, colour is a hypernym of red. A hyponym is a word or phrase whose semantic field is more specific than that of its hypernym, and hyponymy is the relationship between a hyponym and its hypernym. The semantic field of a hypernym is broader than that of its hyponyms.
Figure: an example of the relationship between hyponyms and a hypernym.
These terms matter because the most frequent relation between synsets in WordNET is the hypernym-hyponym relation. It is useful for linking a general synset such as (paper, piece of paper) to more specific ones. Taking purple and violet as an example, in WordNET the category colour includes purple, which in turn includes violet. Every noun hierarchy ultimately ends at the root node. The hyponymy relation is transitive: if violet is a kind of purple and purple is a kind of colour, then violet is a kind of colour.
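The transitivity described above can be sketched with a small hand-made hierarchy. This is an illustration only: the HYPERNYM table and the function names are hypothetical, and the table skips the intermediate levels that the real WordNET hierarchy has between colour and its root.

```python
# Toy child -> parent links. A hand-made fragment, not WordNET's real data;
# "entity" stands in for the root node that every noun chain ends at.
HYPERNYM = {
    "violet": "purple",
    "purple": "colour",
    "red": "colour",
    "colour": "entity",
}


def hypernym_chain(word):
    """Walk upward from a word and return every ancestor, nearest first."""
    chain = []
    while word in HYPERNYM:
        word = HYPERNYM[word]
        chain.append(word)
    return chain


def is_a(word, category):
    """True if `category` is reachable by following hypernym links upward."""
    return category in hypernym_chain(word)


print(hypernym_chain("violet"))   # ['purple', 'colour', 'entity']
print(is_a("violet", "colour"))   # True: violet -> purple -> colour
print(is_a("red", "purple"))      # False: red and purple are siblings
```

Because is_a follows the whole chain rather than a single link, the "kind of" relation holds across any number of levels.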
Meronymy: WordNET also encodes meronymy, the part-whole relation between synsets. For example, a bike has two wheels, a handle, and a petrol tank. Parts are inherited from superordinates: if a bike has two wheels, then a sports bike has two wheels as well. Inheritance runs only downward, because a part may be characteristic of specific kinds of things rather than of the whole class: all bikes and kinds of bikes have two wheels, but not all kinds of vehicles have two wheels.
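A short sketch of this downward-only inheritance, using the bike example. The tables and the function name are hypothetical and exist only for illustration; they are not taken from WordNET itself.

```python
# Toy data: kind-of links (child -> parent) and the parts declared per concept.
HYPERNYM = {"sports bike": "bike", "bike": "vehicle", "car": "vehicle"}
PARTS = {"bike": {"two wheels", "handle", "petrol tank"}}


def inherited_parts(word):
    """Collect parts declared on the word itself and on all its ancestors."""
    parts = set()
    while word is not None:
        parts |= PARTS.get(word, set())
        word = HYPERNYM.get(word)   # move one level up, or stop at the top
    return parts


print(sorted(inherited_parts("sports bike")))  # ['handle', 'petrol tank', 'two wheels']
print(inherited_parts("vehicle"))              # set(): parts do not flow upward
print(inherited_parts("car"))                  # set(): a sibling inherits nothing
```

The lookup only ever moves from a word toward its ancestors, so a part declared on bike reaches sports bike but never vehicle or car.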
Troponymy: In linguistics, troponymy is the presence of a ‘manner’ relation between two lexemes. Verb synsets in WordNET are arranged into hierarchies in which verbs towards the bottom express increasingly specific manners of an event, as in communicate-talk-whisper. Verbs describing events that necessarily and unidirectionally entail one another are also linked: {buy}-{pay}, {succeed}-{try}, {show}-{see}, etc.
Antonymy: Words in WordNET, adjectives in particular, are arranged in antonym pairs such as wet and dry, or smile and cry. Each member of a pair is in turn linked to a set of semantically similar words. For example, cry is linked to weep, shed tears, sob, wail, etc., so all of these can be considered indirect antonyms of smile.
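The idea of indirect antonyms can be sketched as a two-step lookup: find the direct antonym, then collect the words similar to it. The tables below are hypothetical and contain only the examples from the paragraph above.

```python
# Toy data built from the examples in the text; not WordNET's real contents.
DIRECT_ANTONYM = {"smile": "cry", "cry": "smile", "wet": "dry", "dry": "wet"}
SIMILAR = {"cry": ["weep", "shed tears", "sob", "wail"]}


def indirect_antonyms(word):
    """Words similar to the direct antonym of `word` (empty if none are known)."""
    opposite = DIRECT_ANTONYM.get(word)
    if opposite is None:
        return []
    return list(SIMILAR.get(opposite, []))


print(DIRECT_ANTONYM["smile"])       # cry
print(indirect_antonyms("smile"))    # ['weep', 'shed tears', 'sob', 'wail']
print(indirect_antonyms("wet"))      # []: no similar words listed for dry here
```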
Cross-PoS Relations
Most relations in WordNET connect words from the same part of speech. WordNET can therefore be divided into four subnets, one each for nouns, verbs, adjectives, and adverbs. There are also a few cross-PoS pointers in the network, including morphosemantic links that hold between semantically similar words sharing a stem. For example, in many noun-verb pairs such as (reader, read), the semantic role of the noun with respect to the verb has been specified.
Implementation of WordNET
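We can use WordNET in just a few lines of Python code with the NLTK library.
Importing libraries: import nltk, then import wordnet from nltk.corpus.
Downloading the wordnet: call nltk.download('wordnet') once to fetch the corpus.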
Trying out WordNET by checking the synonyms and antonyms of a word:
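The article's original snippet is not reproduced here, so the following is a minimal sketch of how such a lookup might be written with NLTK's WordNet corpus reader. The function name and the optional wn parameter are assumptions made for this example; wn defaults to nltk.corpus.wordnet, which requires NLTK to be installed and the corpus to have been downloaded.

```python
def synonyms_and_antonyms(word, wn=None):
    """Collect lemma names and antonym names across all synsets of `word`."""
    if wn is None:
        # Assumes NLTK is installed and nltk.download("wordnet") has been run.
        from nltk.corpus import wordnet as wn
    synonyms, antonyms = set(), set()
    for synset in wn.synsets(word):          # every sense of the word
        for lemma in synset.lemmas():        # every word form in that sense
            synonyms.add(lemma.name())
            for antonym in lemma.antonyms():  # antonyms hang off lemmas
                antonyms.add(antonym.name())
    return synonyms, antonyms


# Usage, once the corpus is available:
# synonyms, antonyms = synonyms_and_antonyms("evil")
# print(synonyms)
# print(antonyms)
```

Antonyms are looked up on each lemma rather than on the synset, because in NLTK's interface antonymy is a relation between individual word forms.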
The output lists the synonyms of the word evil and shows that, in the network, good and goodness are its opposites.
Checking the word similarity feature:
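The article does not say which similarity measure it used. The sketch below assumes Wu-Palmer similarity, which NLTK exposes as Synset.wup_similarity, and assumes the synset identifiers man.n.01 and boy.n.01 for the two words. The function name and the optional wn parameter are likewise hypothetical.

```python
def similarity_percent(name_a, name_b, wn=None):
    """Wu-Palmer similarity between two named synsets, as a percentage."""
    if wn is None:
        # Assumes NLTK is installed and nltk.download("wordnet") has been run.
        from nltk.corpus import wordnet as wn
    first = wn.synset(name_a)
    second = wn.synset(name_b)
    score = first.wup_similarity(second)   # a value in (0, 1], or None
    return None if score is None else round(score * 100, 2)


# Usage, once the corpus is available (synset names are assumptions):
# print(similarity_percent("man.n.01", "boy.n.01"))
```

Wu-Palmer similarity scores two senses by how deep their closest shared ancestor sits in the hypernym hierarchy, so it builds directly on the hypernym relation described earlier.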
Since grown-up boys are men, the two words are closely related. When we asked for the similarity between man and boy, the result was around 66%, which is a reasonable estimate of their similarity.
Final words
In this article, we had an overview of WordNET and the basic structure of its network and synsets. We discussed how it relates words to one another, since a manageable representation of the data can make a model more accurate and workable. We saw which lexical relations the database uses to store rich information about words, and how to work with it using Python and NLTK. The same can be done using TextBlob and R. Try WordNET with these tools as well, and apply it carefully in your models for better accuracy.