
Extracting Semantic Networks from Text via Relational Clustering

Stanley Kok and Pedro Domingos
Department of Computer Science and Engineering
University of Washington, Seattle WA 98195-2350, USA


Abstract. Extracting knowledge from text has long been a goal of AI. Initial approaches were purely logical and brittle. More recently, the availability of large quantities of text on the Web has led to the development of machine learning approaches. However, to date these have mainly extracted ground facts, as opposed to general knowledge. Other learning approaches can extract logical forms, but require supervision and do not scale. In this paper we present an unsupervised approach to extracting semantic networks from large volumes of text. We use the TextRunner system [1] to extract tuples from text, and then induce general concepts and relations from them by jointly clustering the objects and relational strings in the tuples. Our approach is defined in Markov logic using four simple rules. Experiments on a dataset of two million tuples show that it outperforms three other relational clustering approaches, and extracts meaningful semantic networks.

1 Introduction

A long-standing goal of AI is to build an autonomous agent that can read and understand text. The natural language processing (NLP) community attempted to achieve this goal in the 1970's and 1980's by building systems for understanding and answering questions about simple stories [3, 13, 23, 6]. These systems parsed text into a network of predefined concepts, and created a knowledge base from which inferences could be made. However, they required a large amount of manual engineering, only worked on small texts, and were not robust enough to perform well on unrestricted naturally occurring text. Gradually, research in this direction petered out.

Interest in the goal has recently been rekindled [16, 7] by the abundance of easily accessible Web text, and by the substantial progress over the last few years in machine learning and NLP. The confluence of these three developments led to efforts to extract facts and knowledge bases from the Web [4]. Two recent steps in this direction are a system by Pasca et al. [18] and TextRunner [1].

Both systems extract facts on a large scale from Web corpora in an unsupervised manner. Pasca et al.'s system derives relation-specific extraction patterns from a starting set of seed facts, acquires candidate facts using the patterns, adds high-scoring facts to the seeds, and iterates until some convergence criterion is met. TextRunner uses a domain-independent approach to extract a large set of relational tuples of the form r(x, y), where x and y are strings denoting objects, and r is a string denoting a relation between the objects. It uses a lightweight noun phrase chunker to identify objects, and heuristically treats the text between objects as relations. These are good first steps, but they still fall short of the goal. While they can quickly acquire a large database of ground facts in an unsupervised manner, they are not able to learn general knowledge that is embedded in the facts.
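To make the tuple format concrete, here is a minimal Python sketch of the r(x, y) triples that downstream clustering consumes; the example strings are invented for illustration and are not actual TextRunner output.

```python
from collections import namedtuple

# A TextRunner-style extraction is a triple (relation, object1, object2),
# where all three fields are raw strings from the text.
Triple = namedtuple("Triple", ["r", "x", "y"])

triples = [
    Triple("is headquartered in", "Microsoft", "Redmond"),
    Triple("is based in", "Google", "Mountain View"),
    Triple("acquired", "Google", "YouTube"),
]

# Co-referent strings ("is headquartered in" vs. "is based in") are common;
# clustering must discover that they denote the same high-level relation.
for t in triples:
    print(f"{t.r}({t.x}, {t.y})")
```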

Another line of recent research takes the opposite approach. Semantic parsing [26, 17, 29] is the task of mapping a natural language sentence into logical form. The logical statements constitute a knowledge base that can be used to perform some task like answering questions. Semantic parsing systems require a training corpus of sentences annotated with their associated logical forms (i.e., they are supervised). These systems are then trained to induce a parser that can convert novel sentences to their logical forms. Even though these systems can create knowledge bases directly, their need for annotated training data prevents them from scaling to large corpora like the Web.

In this paper, we present SNE, a scalable, unsupervised, and domain-independent system that simultaneously extracts high-level relations and concepts, and learns a semantic network [20] from text. It first uses TextRunner to extract ground facts as triples from text, and then extracts knowledge from the triples. TextRunner's triples are noisy, sparse, and contain many co-referent objects and relations. Our system has to overcome these challenges in order to extract meaningful high-level relations and concepts from the triples in an unsupervised manner. It does so with a probabilistic model that clusters objects by the objects that they are related to, and that clusters relations by the objects they relate. This allows information to propagate between clusters of relations and clusters of objects as they are created. Each cluster represents a high-level relation or concept. A concept cluster can be viewed as a node in a graph, and a relation cluster can be viewed as links between the concept clusters that it relates. Together the concept clusters and relation clusters define a simple semantic network. Figure 1 illustrates part of a semantic network that our approach learns. SNE is short for Semantic Network Extractor.

SNE is based on Markov logic [22], and is related to the Multiple Relational Clusterings (MRC) model [12] we recently proposed. SNE is our first step towards creating a system that can extract an arbitrary semantic network directly from text. Ultimately, we want to tightly integrate the information extraction TextRunner component and the knowledge learning SNE component to form a self-contained knowledge extraction system. This tight integration will enable information to flow between both tasks, allowing them to be solved jointly for better performance [14].

We begin by briefly reviewing Markov logic in the next section. Then we describe our model in detail (Section 3). Next we describe related work (Section 4). After that, we report our experiments comparing our model with three alternative approaches (Section 5). We conclude with a discussion of future work (Section 6).

2 Markov Logic

Markov logic combines first-order logic with Markov networks.

In first-order logic [9], formulas are constructed using four types of symbols: constants, variables, functions, and predicates. (In this paper we use only function-free logic.) Constants represent objects in the domain of discourse (e.g., people: Anna, Bob, etc.). Variables (e.g., x, y) range over the objects in the domain. Predicates represent relations among objects (e.g., Friends), or attributes of objects (e.g., Student). Variables and constants may be typed. An atom is a predicate symbol applied to a list of arguments, which may be variables or constants (e.g., Friends(Anna, x)). A ground atom is an atom all of whose arguments are constants (e.g., Friends(Anna, Bob)). A world is an assignment of truth values to all possible ground atoms. A database is a partial specification of a world; each atom in it is true, false or (implicitly) unknown.

A Markov network or Markov random field [19] is a model for the joint distribution of a set of variables X = (X_1, X_2, ..., X_n) ∈ \mathcal{X}. It is composed of an undirected graph G and a set of potential functions \phi_k. The graph has a node for each variable, and the model has a potential function for each clique in the graph. A potential function is a non-negative real-valued function of the state of the corresponding clique. The joint distribution represented by a Markov network is given by

    P(X = x) = \frac{1}{Z} \prod_k \phi_k(x_{\{k\}})

where x_{\{k\}} is the state of the kth clique (i.e., the state of the variables that appear in that clique). Z, known as the partition function, is given by

    Z = \sum_{x \in \mathcal{X}} \prod_k \phi_k(x_{\{k\}})

Markov networks are often conveniently represented as log-linear models, with each clique potential replaced by an exponentiated weighted sum of features of the state, leading to

    P(X = x) = \frac{1}{Z} \exp\Big( \sum_j w_j f_j(x) \Big)

A feature may be any real-valued function of the state. This paper will focus on binary features, f_j(x) ∈ {0, 1}. In the most direct translation from the potential-function form, there is one feature corresponding to each possible state x_{\{k\}} of each clique, with its weight being \log \phi_k(x_{\{k\}}). This representation is exponential in the size of the cliques. However, we are free to specify a much smaller number of features (e.g., logical functions of the state of the clique), allowing for a more compact representation than the potential-function form, particularly when large cliques are present. Markov logic takes advantage of this.
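As an illustration of the log-linear form, the following Python sketch computes P(X = x) by brute-force enumeration over binary variables. The features and weights are invented toy values; this is not an inference algorithm from the paper.

```python
import itertools
import math

def unnormalized(x, features, weights):
    """exp(sum_j w_j * f_j(x)) for one state x."""
    return math.exp(sum(w * f(x) for f, w in zip(features, weights)))

def joint_prob(x, n_vars, features, weights):
    """P(X = x) = (1/Z) exp(sum_j w_j f_j(x)), with Z computed
    by enumerating all 2^n_vars states (feasible only for toy models)."""
    Z = sum(unnormalized(s, features, weights)
            for s in itertools.product([0, 1], repeat=n_vars))
    return unnormalized(x, features, weights) / Z

# Example: two binary variables, one feature rewarding agreement x0 == x1.
features = [lambda x: 1 if x[0] == x[1] else 0]
weights = [1.5]
print(joint_prob((1, 1), 2, features, weights))  # higher than disagreeing states
```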

A Markov logic network (MLN) is a set of weighted first-order formulas. Together with a set of constants representing objects in the domain, it defines a Markov network with one node per ground atom and one feature per ground formula. The weight of a feature is the weight of the first-order formula that originated it. The probability distribution over possible worlds x specified by the ground Markov network is given by

    P(X = x) = \frac{1}{Z} \exp\Big( \sum_{i \in F} \sum_{j \in G_i} w_i g_j(x) \Big)

where Z is the partition function, F is the set of all first-order formulas in the MLN, G_i is the set of groundings of the ith first-order formula, and g_j(x) = 1 if the jth ground formula is true and g_j(x) = 0 otherwise. Markov logic enables us to compactly represent complex models in non-i.i.d. domains. General algorithms for inference and learning in Markov logic are discussed in [22].

3 Semantic Network Extraction

We call our model SNE, for Semantic Network Extractor. SNE simultaneously clusters objects and relations in an unsupervised manner, without requiring the number of clusters to be specified in advance. The object clusters and relation clusters respectively form the nodes and links of a semantic network. A link exists between two nodes if and only if a true ground fact can be formed from the symbols in the corresponding relation and object clusters. SNE can cluster objects of different types, and relations of any arity.
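As a concrete toy illustration of this node/link correspondence, the sketch below builds the network from a set of cluster combinations that contain true ground facts; the cluster names are invented.

```python
# Cluster combinations known to contain at least one true ground fact.
# Cluster names are invented for illustration.
true_combinations = {
    ("HEADQUARTERED_IN", "COMPANY", "CITY"),
    ("ACQUIRED", "COMPANY", "COMPANY"),
}

# Nodes are object clusters; links are labeled by relation clusters.
nodes = {c for _, x, y in true_combinations for c in (x, y)}
links = {(x, r, y) for r, x, y in true_combinations}
print(nodes)  # {'COMPANY', 'CITY'} (set order may vary)
print(links)  # {('COMPANY', 'HEADQUARTERED_IN', 'CITY'), ('COMPANY', 'ACQUIRED', 'COMPANY')}
```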

When faced with the task of extracting knowledge from noisy and sparse data like that used in our experiments, we have to glean every bit of useful information from the data to form coherent clusters. SNE does this by jointly clustering objects and relations: at each step of its algorithm, information from the object clusters it has created is used in forming relation clusters, and vice versa. As we shall see in our experimental results, this joint clustering approach does better than clustering objects and relations separately.

SNE is defined using a form of finite second-order Markov logic in which variables can range over relations (predicates) as well as objects (constants). Extending Markov logic to second order involves simply grounding atoms with all possible predicate symbols as well as all constant symbols, and allows us to represent some models much more compactly than first-order Markov logic.
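A small sketch of what second-order grounding means in practice, on invented symbols: ground atoms are enumerated by substituting every predicate symbol for the relation variable as well as every constant for the object variables.

```python
import itertools

# Second-order grounding on toy symbols: the relation variable ranges over
# predicate symbols, and the object variables range over constants.
predicates = ["BornIn", "LivesIn"]
constants = ["Anna", "Paris"]

ground_atoms = [(r, x, y)
                for r in predicates
                for x, y in itertools.product(constants, repeat=2)]
print(len(ground_atoms))  # 2 predicates x 4 constant pairs = 8 ground atoms
```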

For simplicity, we assume that relations are binary in our definition of SNE, i.e., relations are of the form r(x, y) where r is a relation symbol, and x and y are object symbols. (Extending the definition to relations of arbitrary arity is straightforward.) We use γi and Γi to respectively denote a cluster and a clustering (i.e., a partitioning) of symbols of type i. If r, x, and y are respectively in clusters γr, γx, and γy, we say that r(x, y) is in the cluster combination (γr, γx, γy).
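The following sketch shows this bookkeeping on invented symbols: a clustering maps each symbol to a cluster, and each ground atom falls into the cluster combination determined by its three symbols.

```python
# Cluster-combination bookkeeping on invented symbols. A clustering maps
# each symbol to a cluster id; an atom r(x, y) falls into the combination
# (cluster of r, cluster of x, cluster of y).
gamma_r = {"is based in": 0, "is headquartered in": 0, "acquired": 1}
gamma_x = {"Microsoft": 0, "Google": 0}
gamma_y = {"Redmond": 0, "Mountain View": 0, "YouTube": 1}

def combination(r, x, y):
    return (gamma_r[r], gamma_x[x], gamma_y[y])

# The two location facts share combination (0, 0, 0): pooled evidence
# that one high-level relation links the same two concepts.
print(combination("is based in", "Google", "Mountain View"))       # (0, 0, 0)
print(combination("is headquartered in", "Microsoft", "Redmond"))  # (0, 0, 0)
print(combination("acquired", "Google", "YouTube"))                # (1, 0, 1)
```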

The learning problem in SNE consists of finding the cluster assignment Γ = (Γr, Γx, Γy) that maximizes the posterior probability P(Γ|R) ∝ P(Γ, R) = P(Γ)P(R|Γ), where R is a vector of truth assignments to the observable r(x, y) ground atoms.

We define one MLN for the likelihood P(R|Γ) component and one MLN for the prior P(Γ) component of the posterior probability, using just four simple rules in total.

The MLN for the likelihood component contains only one rule, stating that the truth value of an atom is determined by the cluster combination it belongs to:

    ∀r, x, y, +γr, +γx, +γy    r ∈ γr ∧ x ∈ γx ∧ y ∈ γy ⇒ r(x, y)

This rule is soft. The "+" notation is syntactic sugar that signifies that there is an instance of this rule, with a separate weight, for each cluster combination (γr, γx, γy). This rule predicts the probability of query atoms given the cluster memberships of the symbols in them. This is known as the atom prediction rule.

As shown in [12], given a cluster assignment, the MAP weight wk of an instance of the atom prediction rule is given by log(tk /fk ), where tk is the empirical number of true atoms in cluster combination k, and fk is the number of false atoms. Adding smoothing parameters α and β, we estimate the MAP weight as log((tk + α)/(fk + β)).
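In code, the smoothed MAP weight estimate is a one-liner; the default α and β values below are placeholders, not the settings used in the paper's experiments.

```python
import math

def atom_prediction_weight(t_k, f_k, alpha=1.0, beta=1.0):
    """Smoothed MAP weight for one instance of the atom prediction rule:
    log((t_k + alpha) / (f_k + beta))."""
    return math.log((t_k + alpha) / (f_k + beta))

print(atom_prediction_weight(50, 2))  # mostly-true combination: large positive weight
print(atom_prediction_weight(1, 40))  # mostly-false combination: negative weight
```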

Three rules are defined in the MLN for the prior component. The first rule states that each symbol belongs to exactly one cluster:

    ∀x ∃¹γ  x ∈ γ

This rule is hard, i.e., it has infinite weight and cannot be violated.

The second rule imposes an exponential prior on the number of cluster combinations. This rule combats the proliferation of cluster combinations and consequent overfitting, and is represented by the formula

    ∀γr, γx, γy  ∃r, x, y  r ∈ γr ∧ x ∈ γx ∧ y ∈ γy

with negative weight −λ. The parameter λ is fixed during learning, and is the penalty in log-posterior incurred by adding a cluster combination to the model. Thus larger λs lead to fewer cluster combinations being formed. This rule represents the complexity of the model in terms of the number of instances of the atom prediction rule (which is equal to the number of cluster combinations).

The last rule encodes the belief that most symbols tend to be in different clusters. It is represented by the formula

    ∀x, x′, γx, γx′  x ∈ γx ∧ x′ ∈ γx′ ∧ x ≠ x′ ⇒ γx ≠ γx′

with positive weight µ. The parameter µ is also fixed during learning. We expect there to be many concepts and high-level relations in a large heterogeneous body of text. The tuple extraction process samples instances of these concepts and relations sparsely, and we expect each concept or relation to have only a few instances sampled, in many cases only one. Thus we expect most pairs of symbols to be in different concept and relation clusters.
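Putting the four rules together, the sketch below assembles an illustrative score for a candidate clustering: the atom prediction weights reward combinations dense in true atoms, λ penalizes each cluster combination, and µ penalizes same-cluster symbol pairs. The hyperparameter values and counts are invented, and the MLN's partition function is omitted, so this is a rough sketch of the objective, not the exact model.

```python
import math
from collections import Counter

# Invented hyperparameters; the real system sets its own.
ALPHA, BETA = 1.0, 1.0   # smoothing for the atom prediction weights
LAMBDA = 2.0             # per-cluster-combination penalty (second prior rule)
MU = 0.5                 # per-same-cluster-pair penalty (third prior rule)

def score(true_counts, total_counts, cluster_sizes):
    """true_counts: true atoms per cluster combination k (t_k).
    total_counts: possible atoms per combination (t_k + f_k).
    cluster_sizes: sizes of all clusters, for the mu penalty."""
    s = 0.0
    for k, t_k in true_counts.items():
        f_k = total_counts[k] - t_k
        w_k = math.log((t_k + ALPHA) / (f_k + BETA))  # atom prediction weight
        s += w_k * t_k              # each true atom contributes w_k
    s -= LAMBDA * len(true_counts)  # exponential prior on combinations
    # mu penalty for every pair of distinct symbols sharing a cluster
    s -= MU * sum(n * (n - 1) // 2 for n in cluster_sizes)
    return s

# Toy call: one combination with 8 true atoms out of 10 possible.
print(score(Counter({("r0", "c0", "c1"): 8}), {("r0", "c0", "c1"): 10}, [2, 2, 2]))
```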
