3D MODELING WITH DATA-DRIVEN SUGGESTIONS
SUBMITTED TO THE DEPARTMENT OF COMPUTER SCIENCE
AND THE COMMITTEE ON GRADUATE STUDIES
OF STANFORD UNIVERSITY
IN PARTIAL FULFILLMENT OF THE REQUIREMENTS
FOR THE DEGREE OF
DOCTOR OF PHILOSOPHY
Siddhartha Chaudhuri
August 2011
© 2011 by Siddhartha Chaudhuri. All Rights Reserved.
Re-distributed by Stanford University under license with the author.
This work is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 United States License.
http://creativecommons.org/licenses/by-nc-nd/3.0/us/
This dissertation is online at: http://purl.stanford.edu/vq766tr8762
I certify that I have read this dissertation and that, in my opinion, it is fully adequate in scope and quality as a dissertation for the degree of Doctor of Philosophy.
Vladlen Koltun, Primary Adviser
I certify that I have read this dissertation and that, in my opinion, it is fully adequate in scope and quality as a dissertation for the degree of Doctor of Philosophy.
Leonidas Guibas
I certify that I have read this dissertation and that, in my opinion, it is fully adequate in scope and quality as a dissertation for the degree of Doctor of Philosophy.
Marc Levoy
Approved for the Stanford University Committee on Graduate Studies.
Patricia J. Gumport, Vice Provost Graduate Education
This signature page was generated electronically upon submission of this dissertation in electronic format. An original signed hard copy of the signature page is on file in University Archives.
Abstract
Creating detailed three-dimensional shapes on the computer is hard. The standard tools for the task are complex and require long training and familiarization. As a result, 3D modeling is typically the domain of the professional artist and not the casual user. Professionals invest the time to master their tools, but such tools are usually restricted to low-level sculpting operations. High-level reasoning and geometric manipulation, of which computers are well capable, are not used to help the artist reach her goals more efficiently or creatively.
In this dissertation, I propose techniques by which computers, endowed with a greater understanding of the structure of shapes, can both support the creative pursuits of professionals and significantly ease the burden of 3D modeling for the casual user. To this end, I describe methods for generating “suggestions” during the 3D modeling process: component shapes that may be directly used to augment the currently-modeled shape, or to inspire directions for its further development. These suggestions are drawn from a large library of previously-modeled shapes. Also, I discuss the construction of an assembly-based modeling tool that enables casual users to rapidly construct shapes from suggested components, with minimal training. Experiments with both professional and casual users suggest that this approach successfully supports rapid, creative 3D modeling.
Significant portions of this work appear in publications in the SIGGRAPH Asia and SIGGRAPH conferences, and a supplementary video for the second paper may be viewed at http://www.youtube.com/watch?v=7Abki79WIOY.
Acknowledgements
The difficulty in writing an acknowledgements section is that there are always too many people to thank. A comprehensive list of all the people I’m grateful to for support, advice, company and love during my stay at Stanford would run to many tens of pages. This section is necessarily an abbreviated version.
My PhD advisor, Vladlen Koltun, was a principal source of support and encouragement throughout my stay at Stanford. He was always available for discussions, and invested great time and effort in the projects we worked on together. His commitment to tackling significant problems was inspiring, as was his skill at presenting research work in a lucid and accessible fashion. My research interests, and any success I have achieved in pursuing them, owe a great deal to his vision and insight.
I would also like to thank Leo Guibas, Pat Hanrahan, Scott Klemmer and Marc Levoy for helpful discussions, advice and comments at various stages. I truly appreciated the opportunity to learn from wonderful teachers at Stanford. I also enjoyed my time in the Stanford Theory Group, my first home as a graduate student.
I’ve been fortunate to have colleagues who are uniformly better than me, both as computer scientists and as human beings. In particular, I’m grateful to Ewen Cheslack-Postava, Jared Duke, Daniel Gibson, Daniel Horn, Ming Jiang, Vangelis Kalogerakis, Philipp Krähenbühl, Ranjitha Kumar, Joni Laserson, Steve Lesser, Sergey Levine, Yu Lou, Alex Mattos, Paul Merrell, Eric Schkufza, Jerry Talton, Jack Wang, TongKe Xue and Lingfeng Yang. Avi Robinson-Mosher, early partner-in-crime in weekly rock-climbing sessions, remains a ready source of great conversation and fresh perspectives on research. Niels Joubert, in addition to being a good friend, was also an outstanding TA for the CS148 course I taught. I’ve always enjoyed hanging out with Chris Platz, digital artist par excellence and down-to-earth nice guy. Lynda Harris, Melissa Rivera and Monica Niemiec, my lovely admins, made every logistical issue as painless as possible.
Many other friends collectively helped me preserve my sanity during these six years. I can only name a few here: thank you Abhirup, Archan, Corina, Jad, Jean-Gab, Jen, John, Mike, Priya, Samantak, Shrestha, Stephanie and Suchi for all the great times. Chaitanya Mishra, always up for a coffee run to Philz, also introduced me to the worst nightmare of jockeys on the Serengeti: The Artist Formerly Known as Prince. Avisek Das, my first friend (and roommate) at Stanford, rapidly moved from ‘friend’ to ‘brother’ status. He remains one of the nicest and most dependable people I know. Of the many other people who’ve shared apartments with me, Albert Brothers and Brian Watanabe have remained particularly good friends.
Dipanjan Das and Vikas Yendluri, wonderful musicians and friends both, kept my love of Hindustani classical music alive. I have treasured Fridays spent in Steven Baigel’s film studio in Berkeley, watching his documentary on the sitarist Nikhil Banerjee take shape and listening to great music. It was always a pleasure to catch up with Madhur Tulsiani during those visits. Ravindra Vishnoi and I have made valiant efforts to deprive ourselves of oxygen, vegetation and civilization (some would say we didn’t have to try too hard on the last front). Most of all, I’m lucky to have had the same set of best friends for the last 15 or more years: thank you Amarttya, Amitabha, Anand, Gaurav, Ritoban, Satyaki and Shrawan.
Last but certainly not least, my family, who have given me unstinted love and support over the years. A special shout-out goes to my cousins Sugato, Sunanda, Vinayak and Sukanya, my nephews Rohan and Nikhil, and my grandmothers Sushmita Das Gupta and Sujata Chaudhuri. Kanak Maitra continues to shower me with affection every time I go home. My sister Aparna has carefully avoided saying anything positive about me all her life; I’m delighted to note that she shows no signs of change. My parents, Supriya and Sukanta Chaudhuri, in addition to giving me life, have taught me everything I know about it. This dissertation is for them.
2.6 (a) A database shape with wingtip stabilizers and (b) a query shape with small wingtip lights are approximately aligned, as shown in (c) a closeup of their aligned wings. The light has points with SDF signatures comparable to that of the stabilizer. Nevertheless, (d) our multi-scale contextual correspondence score is coherent on the database shape, blue indicating a good match and yellow/red indicating a low match. By contrast, four other matching measures, namely (e) distance to nearest neighbor on the aligned query shape, (f) unthresholded and (g) thresholded similarity of the nearest neighbor’s signature, and (h) presence of a neighbor with a similar signature, are not robust to the presence of the light or the proximity of the wing and mark parts of the stabilizer as matched.
2.7 Detecting matched and unmatched parts of a database shape with pre-alignment. (a) A neighborhood similarity function is computed on its surface, and (b) the query shape is aligned to it. (c) A per-point correspondence score is then computed. (d) After thresholding the average correspondence score for each segment, and applying shape symmetries, unmatched portions (red) yield candidate suggestions.
2.8 Data-driven suggestions (red) help an artist create an imaginative aircraft design for virtual environments. The starting shape is shown in green, followed by the top 5 suggestions made by our prototype system with the pre-alignment approach, and three additional suggestions identified by the artist as stimulating. The source database models automatically identified by the system are indicated in wireframe. The resulting model, incorporating four of the presented suggestions, is shown on the right.
2.9 Data-driven suggestions (red) help an artist create a fantastical creature from a crude initial shape (green). The top 5 suggestions generated by the system with the pre-alignment approach are shown in red (with source database models in wireframe). The suggestions were used to concretize the design and create the final chimeric creature (left).
2.10 Automatically generated suggestions (red) for the queries in green, using the pre-alignment approach. These suggestions were not, in general, the highest ranked ones, but they demonstrate the potential of data-driven suggestions.
2.11 Top-ranked suggestions with contextual local signatures. Query shapes (green) and the four top-ranked suggestions (red) generated for each. The automatically retrieved database shapes that yielded the suggestions are indicated in wireframe.
2.12 Suggestions (red) for an ambiguous shape (green) generated with contextual local signatures.
2.13 InspireMe interface, showing a query shape (green) and suggestions for it (red). One suggestion has been selected and added to the mockup (blue).
2.14 Models created by artists for the aircraft task (top) and the creature task (bottom). For each model, the figure shows the initial query shape (top row), the mockup created with suggestions from InspireMe (middle row), and the final textured model created from the mockup (bottom row). All novel components in the mockups are derived from data-driven suggestions. For each task, every model was created by a different artist.
2.15 Activity time in modeling programs (blue) and InspireMe (red) during the creation of the models in Figure 2.14.
3.1 Purely geometric comparison between query (left) and database (right) shapes yields suggestions (red) that are not consistent with the semantic category of the query.
3.2 Overview of our approach. The preprocessing stage (top) begins with a library of models, segmented and labeled using the technique of Kalogerakis et al. The components extracted from the models are further clustered by geometric style. A Bayesian network is then learned that encodes probabilistic dependencies between labels, geometric styles, part adjacencies, number of parts from each category, and symmetries. The figure shows a subset from a real network learned from a library of creature models. The runtime stage (bottom) performs probabilistic inference in the learned Bayesian network to generate ranked lists of category labels and components within each category, customized for the currently assembled model.
3.3 Modeling interface. The model is assembled from presented parts.
3.4 Clusters of head parts, based on a Gaussian mixture model. The feature space is visualized by projection onto the two principal axes.
3.5 Results for the Toy task (top) and Creature task (bottom). The plots on the left show the cumulative distribution of categories from which parts were chosen, as a function of the ranking of the category at the time of selection. The plots on the right show the cumulative distribution of components used by participants, as a function of the ranking of the component within its category at the time of selection. The probabilistic model presented more relevant categories and components than the static ordering or the geometric approach.
3.6 Number of components used in a single assembled model (top) and number of library models that the employed components originate from (bottom).
4.1 Annotated screenshot of our prototype application for assembly-based modeling with data-driven suggestions.
4.2 Gluing two components together. (a) The components to be glued; (b) slots are identified; (c) the slots are capped; (d) slot properties, such as attachment frames, are computed; (e) the slot of the source component (green) is mapped to an exponential map computed on sample points on the target surface (including the capping mesh), and neighboring regions deformed for a smooth join; (f) the finished join. Note that the source may be glued anywhere on the target, not necessarily in the region of a corresponding slot.
4.3 The exponential map at a point p takes geodesic curves originating at p in (a) to rays with the same origin in the tangent plane Tp in (b).