Semantic Representations for NLP Using VerbNet and the Generative Lexicon
The need for deeper semantic processing of human language by our natural language processing systems is evidenced by their still-unreliable performance on inferencing tasks, even using deep learning techniques. These tasks require the detection of subtle interactions between participants in events, of sequencing of subevents that are often not explicitly mentioned, and of changes to various participants across an event. Human beings can perform this detection even when sparse lexical items are involved, suggesting that linguistic insights into these abilities could improve NLP performance. In this article, we describe new, hand-crafted semantic representations for the lexical resource VerbNet that draw heavily on the linguistic theories about subevent semantics in the Generative Lexicon (GL).
VerbNet defines classes of verbs based on both their semantic and syntactic similarities, paying particular attention to shared diathesis alternations. For each class of verbs, VerbNet provides common semantic roles and typical syntactic patterns. For each syntactic pattern in a class, VerbNet defines a detailed semantic representation that traces the event participants from their initial states, through any changes and into their resulting states. We applied the GL model of subevent structure to VerbNet semantic representations, using a class’s semantic roles and a set of predicates defined across classes as components in each subevent. We will describe in detail the structure of these representations, the underlying theory that guides them, and the definition and use of the predicates. We will also evaluate the effectiveness of this resource for NLP by reviewing efforts to use the semantic representations in NLP tasks.
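To make the shape of such a resource concrete, here is a minimal Python sketch of a verb class with roles, syntactic patterns, and a subevent-ordered semantic representation. The class id, member verbs, role names, and predicate names are hypothetical and chosen only for illustration; they are not copied from VerbNet, and this is not VerbNet's actual file format or API.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Predicate:
    name: str      # predicate label, e.g. "has_location"
    subevent: str  # subevent variable, e.g. "e1"
    args: tuple    # semantic roles (or constants) the predicate relates


@dataclass
class Frame:
    syntax: str                                    # one syntactic pattern of the class
    semantics: list = field(default_factory=list)  # Predicates in temporal order


@dataclass
class VerbClass:
    class_id: str
    members: list
    roles: list
    frames: list = field(default_factory=list)


def trace(frame, role):
    """Return the predicates that mention `role`, in subevent order."""
    return [p for p in frame.semantics if role in p.args]


# Hypothetical class and predicates, invented for illustration only.
frame = Frame(
    syntax="NP V PP.destination",
    semantics=[
        Predicate("has_location", "e1", ("Theme", "Initial_Location")),
        Predicate("motion", "e2", ("Theme",)),
        Predicate("has_location", "e3", ("Theme", "Destination")),
    ],
)
example = VerbClass(
    class_id="example-0.0",
    members=["roll", "slide"],
    roles=["Theme", "Initial_Location", "Destination"],
    frames=[frame],
)

# Follow the Theme from its initial state to its resulting state.
for p in trace(frame, "Theme"):
    print(p.subevent, p.name, p.args)
```

Keeping the predicates in an ordered list per syntactic pattern is one simple way to let a caller follow a participant through the event; a real consumer of the resource would read the structure from the distributed files instead.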
Syntactic and Semantic Analysis
Semantic analysis is the process of understanding the meaning and interpretation of words, signs and sentence structure. It can also guide translators in selecting crucial core conceptual words more judiciously during the translation process. Semantic analysis is one of the toughest parts of natural language processing, and it is not fully solved yet. Natural language processing (NLP) is an area of computer science and artificial intelligence concerned with the interaction between computers and humans in natural language. The ultimate goal of NLP is to help computers understand language as well as we do. It is the driving force behind things like virtual assistants, speech recognition, sentiment analysis, automatic text summarization, machine translation and much more.
- This formal structure that is used to understand the meaning of a text is called meaning representation.
- Deep learning, however, falls short for phenomena involving lower frequency vocabulary or less common language constructions, as well as in domains without vast amounts of data.
- A dictionary-based approach improves recall without introducing incorrect matches.
- Verb-specific features are incorporated in the semantic representations where possible.
For example, FrameNet’s Ingestion frame is defined as “An Ingestor consumes food or drink (Ingestibles), which entails putting the Ingestibles in the mouth for delivery to the digestive system.” VerbNet is also somewhat similar to PropBank and Abstract Meaning Representations (AMRs). PropBank defines semantic roles for individual verbs and eventive nouns, and these are used as a base for AMRs, which are semantic graphs for individual sentences. These representations show the relationships between arguments in a sentence, including peripheral roles like Time and Location, but do not make explicit any sequence of subevents or changes in participants across the timespan of the event. VerbNet’s explicit subevent sequences allow the extraction of preconditions and postconditions for many of the verbs in the resource and the tracking of any changes to participants. In addition, VerbNet allows users to abstract away from individual verbs to more general categories of eventualities.
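The idea of reading preconditions and postconditions off an explicit subevent sequence can be sketched in a few lines of Python. This is an illustrative simplification under one stated assumption: predicates attached to the first subevent are treated as preconditions and those attached to the last subevent as postconditions. The predicate and role names below are hypothetical, not quoted from VerbNet.

```python
def pre_and_post(predicates):
    """Split an ordered subevent sequence into preconditions and postconditions.

    `predicates` is a list of (subevent, name, args) tuples in temporal order.
    Assumption: predicates on the first subevent hold before the event, and
    predicates on the last subevent hold after it.
    """
    if not predicates:
        return [], []
    first, last = predicates[0][0], predicates[-1][0]
    pre = [p for p in predicates if p[0] == first]
    post = [p for p in predicates if p[0] == last]
    return pre, post


# Hypothetical representation of a transfer of possession.
give = [
    ("e1", "has_possession", ("Agent", "Theme")),
    ("e2", "transfer", ("Agent", "Theme", "Recipient")),
    ("e3", "has_possession", ("Recipient", "Theme")),
]

pre, post = pre_and_post(give)
print("before:", pre)
print("after:", post)
```

Comparing the two lists shows which participant's state changed across the event; here the Theme moves from the Agent's possession to the Recipient's. With a single subevent the two lists coincide, which is one reason a real implementation would need more than this positional heuristic.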
Building Blocks of Semantic System
Subsequently, this study applied Word2Vec, GloVe, and BERT to quantify the semantic similarities among these translations. The similarities and dissimilarities among these five translations were evaluated based on the resulting similarity scores. The Jennings’ translation considered the readability of the text and restructured the original text, which was a very reader-friendly innovation at the time. Despite this structural change slightly impacting the semantic similarity with other translations, it did not significantly affect the semantic representation of the main body of The Analects when considering the overall data analysis.
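Embedding-based similarity scores of this kind are usually computed as the cosine of the angle between two vectors. The sketch below shows that computation in plain Python, assuming each translation has already been turned into a list of word vectors. The tiny three-dimensional vectors are made up for illustration and do not come from Word2Vec, GloVe, BERT, or the study's data, and averaging word vectors is only one possible way to obtain a sentence vector.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_u * norm_v)


def mean_vector(vectors):
    """Average a non-empty list of word vectors into one sentence vector."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


# Made-up word vectors standing in for two aligned sentences.
sentence_a = [[0.2, 0.1, 0.7], [0.4, 0.3, 0.1]]
sentence_b = [[0.3, 0.1, 0.6], [0.5, 0.2, 0.2]]

score = cosine_similarity(mean_vector(sentence_a), mean_vector(sentence_b))
print(round(score, 3))
```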
At this point, we worked only with the most prototypical examples of changes of location, state, and possession that involved a minimum of participants, usually Agents, Patients, and Themes. VerbNet’s semantic representations, however, have suffered from several deficiencies that have made them difficult to use in NLP applications. To unlock the potential in these representations, we have made them more expressive and more consistent across classes of verbs. We have grounded them in the linguistic theory of the Generative Lexicon (GL) (Pustejovsky, 1995, 2013; Pustejovsky and Moszkowicz, 2011), which provides a coherent structure for expressing the temporal and causal sequencing of subevents. Explicit pre- and post-conditions, aspectual information, and well-defined predicates all enable the tracking of an entity’s state across a complex event. This study employs sentence alignment to construct a parallel corpus based on five English translations of The Analects.
Probability distribution of dependency distance and dependency type in translational language
In Classic VerbNet, the basic predicate structure consisted of a time stamp (Start, During, or End of E) and an often inconsistent number of semantic roles. Some predicates could appear with or without a time stamp, and the order of semantic roles was not fixed. For example, the Battle-36.4 class included the predicate manner(MANNER, Agent), where a constant that describes the manner of the Agent fills in for MANNER. While manner did not appear with a time stamp in this class, it did in others, such as Bully-59.5, where it was given as manner(E, MANNER, Agent). Using the Generative Lexicon subevent structure to revise the existing VerbNet semantic representations resulted in several new standards in the representations’ form.
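The inconsistency between manner(MANNER, Agent) and manner(E, MANNER, Agent) is easy to see when the two forms are parsed and brought to a single shape. The Python sketch below does this for predicate strings written as in the examples above. The rule it applies, that every predicate should carry an event variable as its first argument, is an assumption made for the illustration; the function names are hypothetical and are not part of any VerbNet tooling.

```python
import re

# Matches a flat predicate such as "manner(E, MANNER, Agent)".
PREDICATE_RE = re.compile(r"^\s*(\w+)\(([^()]*)\)\s*$")


def parse_predicate(text):
    """Split 'name(arg1, arg2)' into ('name', ['arg1', 'arg2'])."""
    match = PREDICATE_RE.match(text)
    if match is None:
        raise ValueError(f"not a flat predicate: {text!r}")
    name = match.group(1)
    args = [a.strip() for a in match.group(2).split(",") if a.strip()]
    return name, args


def normalize(text, event="E"):
    """Return the predicate with the event variable as its first argument."""
    name, args = parse_predicate(text)
    if not args or args[0] != event:
        args = [event] + args
    return f"{name}({', '.join(args)})"


print(normalize("manner(MANNER, Agent)"))     # form without a time stamp
print(normalize("manner(E, MANNER, Agent)"))  # form that already has one
```

After normalization both inputs have the same argument order, so downstream code can rely on the position of each argument rather than special-casing individual classes.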
Understanding human language is considered a difficult task due to its complexity. For example, there are an infinite number of different ways to arrange words in a sentence. Also, words can have several meanings and contextual information is necessary to correctly interpret sentences. Just take a look at the following newspaper headline “The Pope’s baby steps on gays.” This sentence clearly has two very different interpretations, which is a pretty good example of the challenges in natural language processing.
For example, capitalizing the first word of a sentence helps us quickly see where sentences begin. Likewise, requiring a user to type a query in exactly the same format as the matching words in a record is unfair and unproductive. In sentiment analysis, our aim is to classify the emotion expressed in a text as positive, negative, or neutral, which can also help flag urgency.
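Both points, tolerating differences in letter case and labelling a text as positive, negative, or neutral, can be illustrated with a very small dictionary-based classifier in Python. The word lists are toy examples invented for this sketch, and counting lexicon hits is only a baseline; it does not handle negation, sarcasm, or context.

```python
import re

# Toy lexicons, invented for illustration only.
POSITIVE = {"good", "great", "helpful"}
NEGATIVE = {"bad", "broken", "slow"}


def tokens(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.casefold())


def label(text):
    """Label a text by counting positive and negative lexicon hits."""
    score = sum((t in POSITIVE) - (t in NEGATIVE) for t in tokens(text))
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


for example in ("GREAT service, very Helpful",
                "The app is slow and broken",
                "It arrived on Tuesday"):
    print(example, "->", label(example))
```

Because the text is case-folded before lookup, "GREAT" and "great" match the same lexicon entry, so the user is not required to type words in exactly the stored format.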
