On the Use of Functionals on Boundaries

in Hierarchical Models of

Object Recognition


Ian Hyla Jermyn

A dissertation submitted in partial fulfillment

of the requirements for the degree of

Doctor of Philosophy

Department of Computer Science

New York University

September 2000

Davi Geiger

© Ian Jermyn

All Rights Reserved, 2000

If there were any real proof that the sun is in the centre of the universe and the earth in the third heaven, and that the sun does not go around the earth, but the earth round the sun, then we would have to proceed with great circumspection in explaining passages of scripture which appear to teach the contrary, and rather admit that we did not understand them, than declare an opinion to be false which is proved to be true. As for myself, I shall not believe that there are such proofs until they are shown to me. Nor is it proof that, if the sun be supposed to be at the centre of the universe and the earth in the third heaven, everything works out the same as if it were the other way around.

Cardinal Roberto Bellarmino, Master of Controversial Questions at the Collegio Romano, in a letter of 12 April 1615 to Paolo Antonio Foscarini, Carmelite monk, replying to an enquiry about the truth of the Copernican system (Opere, Vol. 12, p.


∗∗∗∗∗∗∗∗∗

Indeed, someone who does philosophy or psychology will perhaps say “I feel that I think in my head”. But what that means he won’t be able to say. For he will not be able to say what kind of feeling that is; but merely to use the expression that he ‘feels’; as if he were saying “I feel this stitch here”. Thus he is unaware that it remains to be investigated what his expression “I feel” means here, that is to say: what consequences we are permitted to draw from this utterance. Whether we may draw the same ones as we would from the utterance “I feel a stitch here”.

Ludwig Wittgenstein, Remarks on the Philosophy of Psychology, Vol. 1, # 350, Basil Blackwell, Oxford, 1980. Translated by G. E. M. Anscombe.


To Leslie, who has shown me what is truly important, and In Memory of my Beloved Grandmother, Elizabeth Norah Jermyn.

Acknowledgements

My greatest thanks go to Davi Geiger who, after indulging my inclination to other things, showed me the computer vision light. As well as all the advice, encouragement, psychotherapy, and stern admonishments that he has given so well and so freely over the years, we have also become friends. I thank him very much for everything.

Hiroshi Ishikawa is my great and good friend. We have shared many things, particularly our adventure in Rio, during which, fuelled by cafezinhos and caipirinhas, we began the collaboration that produced much of the work in this thesis. I thank him very much for his friendship, his humour, and his thought.

I would like to thank Pete Wyckoff, who befriended me when I was a novice Englishman in New York, who introduced me to Leslie, and who has been a dear and steadfast friend ever since. My life now would be quite different without him.

Thank you to Fabian Monrose for being my fellow un-American and for sharing my love of music. When New York becomes too much, I know I can always rely on him for sympathy and great chicken.

When I visited India with Leslie in 1995, Laxmi Parida invited us into her home and showed us the beauty of her country. A growing friendship was sealed. Thank you so much to her for being a listening ear, a stimulus to thought and a deadly pool opponent.

Thank you to Ken Pao for his friendship and gentle company in the office and outside it.

Au revoir to all these good friends. Remember that Antibes is a good place to visit.

Thank you very much to David Jacobs, who besides being my kindly boss during the summer at NECI, has acted as a second advisor. His calm support has been a great help.

Thank you to Ernie Davis for his perhaps unknowing encouragement to me when I needed it. I apologize to him for the lack of real AI in this thesis.

Thank you to Nava Rubin, for serving on my thesis committee, and for providing me with food for thought through her work.

Thank you to Alan Siegel for his attempt to convince me of the error of my ways.

Thank you to everyone at Courant, particularly Anina, Rosemary and Lourdes, who are continually friendly and helpful and kind.

I would like to thank the Instituto de Matemática Pura e Aplicada in Rio de Janeiro for their generous hospitality during the above-mentioned sojourn.

I will never forget the monkeys and butterflies and the steamy sounds of the tropical forest floating through my office window.

Finally, thank you to my family, my parents Richard and Leonie, and my brother and sister Phil and Anna, for always being there, quietly supporting and encouraging me.

New York City, August 7, 2000

Abstract

Object recognition is a central problem in computer vision. Typically it is assumed to follow a sequential model in which successively more specific hypotheses are generated about the image. This is a rather simplistic model, allowing as it does no margin for error at any point. We follow a more general approach in which the various representations involved are allowed to influence one another from the outset. As a guide and ultimate goal, we study the problem of finding the region occupied by human beings in images, and the separation of the region into arms, legs and head. We approach the problem as that of defining a functional on the space of boundaries in images whose minimum specifies the region occupied by the human figure.

Previous work that uses such functionals suffers from a number of difficulties. These include an uncontrollable dependence on scale, an inability to find the global minimum for boundaries in polynomial time, and the inability to include region as well as boundary information. We present a new form of functional on boundaries in a manifold that solves these problems, and is also the unique form of functional in a specific class that possesses a nontrivial, efficiently computable global minimum. We describe applications of the model to single images and to the extraction of boundaries from stereo pairs and motion sequences.

In addition, the functionals used in previous work could not include information about the shape of the region sought. We develop a model for the part structures of boundaries that extends previous work to the case of real images, thus including shape information in the functional framework. We show that such part structures are hyperpaths in a hypergraph. An ‘optimal hyperpath’ algorithm is developed that globally minimizes the functional under some conditions.

We show how to use exemplars of a shape to construct a functional that includes specific information about the topology of the part structure sought.

An algorithm is developed that globally minimizes such functionals in the case of a fixed boundary. The behaviour of the functional mimics an aspect of human shape comparison.


A background is drawn for the work. The study of vision is difficult both philosophically and practically, but the notion of seeing machines clarifies the issues somewhat. A definition of a visual system as a module of a seeing machine is given, and this necessitates a discussion of image semantics as the appropriate output of a visual system. The ideas discussed are formalized using probability theory and working assumptions used to render the problem tractable. We then consider briefly what it means to test a visual system empirically.

THE nature of vision is obscure. To a great extent this reflects the difficulties associated with any discussion of mental phenomena, whether in the biological/psychological sciences or in computer science. Indeed the very use of the word phenomena here is misleading. What we refer to as mental phenomena are exclusively experiences of ourselves, unless we count particular physical and chemical measurements that may be made on our brains and whose connection to the first kind of mental phenomena is largely unknown.

These experiences are not phenomena in the same sense that the behavior of a falling object is a phenomenon. Others do not observe my ‘mental phenomena’. They may hear me speak as if I have observed something, but we do not observe ourselves as we observe a physical event or even as we observe others, except metaphorically. It is not clear what we mean when we say that we ‘see’ something or that we ‘recognize’ an object, once we step outside the normal realms of discourse and attempt to analyze such statements in the abstract. For example, what does it mean to ask the questions “do we recognize every object in our field of view?” or “do we see every object in our field of view?”? Avoiding the dilemmas and confusions raised by these issues is not always easy.

By way of contrast, computer vision is the attempt to construct seeing machines. In full generality, a seeing machine is any machine that uses images to help accomplish a task. Such tasks are extremely varied. They range over almost all of human and animal activity: counting widgets passing by on a conveyor belt; navigating through a complex environment; extracting the region corresponding to a human being in an image; animation; copying a design; handwriting recognition; and on and on. Human beings allegedly devote a third of the volumes of their brains to visual processing, which gives some indication of the problems facing computer vision. Nevertheless, by approaching the study of vision in this operational way, it is to be hoped that we can avoid the philosophical concerns mentioned in the first paragraph, and eventually shed some light on what we are talking about when we discuss human vision, as well as constructing useful technology along the way.

The first thing we will do, however, is to make a simplifying assumption that reduces the operational content of our model. We will postulate a separation between those parts of the machine that deal with the images themselves and those parts that perform other tasks such as planning or locomotion. The picture is of a ‘module’ (called the visual system) that takes images as input (the images are made available according to a plan formulated elsewhere in the machine), and that produces as output statements about the image. Such a picture has advantages and disadvantages. On the positive side, it is a useful abstraction since we are not forced to contemplate general intelligent behaviour in addition to the already formidable difficulties of image understanding, and it opens the possibility of discovering task-independent methods. On the negative side, the separation means that we must now test the performance of the visual system independently of a specific task. In what could such a test consist? We are forced to refer the notion of image understanding to human performance, since that is the only visual system to whose output we have access.

1. IMAGE SEMANTICS

In performing a given task, the images used by the seeing machine will be endowed with a semantics. This semantics encodes what the seeing machine as a whole does with the images it acquires: what consequences it can draw from these images. A semantics can be thought of as a collection of statements about the image that are true. In general the semantics will clearly depend on the task. The job of the visual system is to output a statement from the semantics on receiving an image as input.

In order for the semantics to be testable in any meaningful way, the relevant people must agree on the statements in it: ground truth is established by human consensus. This may be because the semantics is agreed upon for a specific type of image and task, for example a blueprint, but often this is not the case. For example, the statement that there is a black rectangle at such and such a location in figure 1 is unlikely to produce disagreement among observers. On the other hand, the statement that this image is a picture of a black book might well, and yet it is not an unreasonable interpretation of the image. While this may seem to subjectivize the notion of the meaning of an image, in practice it is all that we have once we separate visual understanding from task performance. In the future, given a theory as to why we divide the world into the objects and concepts that we use (such a theory is not inconceivable: perhaps there is an informationally optimal way to do this, to which human understanding is an approximation), this situation might be changed. In the meantime, human consensus is what we mean by image understanding.

FIGURE 1. A black rectangle or a book?

In order to compare two visual systems, we must have not only the notion of ground truth provided by human consensus, but also a notion of how ‘close’ to correct a given statement is. Given the output of a visual system on a particular image, this latter notion (an evaluation function) will compare the statement to the image semantics and output a real number, the evaluation of the output. Two visual systems can then be compared by, for example, using a probability distribution of possible inputs and computing the mean evaluations. The evaluation function is not given a priori. It too must be agreed upon, and will in general be task-dependent. In fact, in a typical task the evaluation function will depend upon a number of other factors that only logically become available to us once we consider the task itself. For example, the resources needed for the visual system to output its statement might be extremely important in reality, and may offset the accuracy of the result. These factors are completely task-dependent and we do not consider them further except to ensure that they are not prohibitive (for example, an algorithm that takes time exponential in the size of the input).
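The comparison of two visual systems by mean evaluation can be illustrated with a minimal Python sketch. Everything in it is a hypothetical toy rather than anything taken from the thesis: the image labels, the statements, the 0/1 evaluation function, the input distribution, and the names `mean_evaluation`, `system_a` and `system_b` are all assumptions chosen only to make the idea concrete.

```python
import math
from typing import Callable, Dict, Set

# Toy image semantics (assumed): for each image, the set of statements
# that human consensus accepts as true of it.
SEMANTICS: Dict[str, Set[str]] = {
    "image_1": {"black rectangle"},
    "image_2": {"black rectangle", "black book"},
}


def evaluate(statement: str, image: str) -> float:
    """Assumed evaluation function: 1.0 if the statement is in the
    image's semantics, 0.0 otherwise. A real one would be graded and
    task-dependent."""
    return 1.0 if statement in SEMANTICS[image] else 0.0


def mean_evaluation(
    system: Callable[[str], str],
    evaluation: Callable[[str, str], float],
    input_distribution: Dict[str, float],
) -> float:
    """Expected evaluation of a visual system under a discrete
    probability distribution over input images."""
    if not math.isclose(sum(input_distribution.values()), 1.0):
        raise ValueError("input probabilities must sum to 1")
    return sum(
        probability * evaluation(system(image), image)
        for image, probability in input_distribution.items()
    )


def system_a(image: str) -> str:
    """Hypothetical visual system that always reports a rectangle."""
    return "black rectangle"


def system_b(image: str) -> str:
    """Hypothetical visual system that always reports a book."""
    return "black book"


# Assumed distribution over possible inputs.
distribution = {"image_1": 0.75, "image_2": 0.25}

score_a = mean_evaluation(system_a, evaluate, distribution)
score_b = mean_evaluation(system_b, evaluate, distribution)
print(score_a, score_b)  # prints: 1.0 0.25
```

In this toy setting the first system scores higher because its statement belongs to the semantics of every input. With a different evaluation function or input distribution the ranking could change, which is the sense in which the evaluation must itself be agreed upon.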

It is hard to give a clearly defined semantics for many images. For example, depictions of real scenes can be given a semantics by making statements about possible scenes prefixed by “If a real scene had generated the image, then in that scene... ”. The problem is that in some cases there may not be enough consensus to render such statements free of their dependence on the speaker.
