Deep Generative Priors for View Synthesis at Scale.

Detailed Information

Material Type
    Dissertation (Foreign)
Main Entry - Personal Name
    Gao, Hang.
Title and Statement of Responsibility
    Deep Generative Priors for View Synthesis at Scale.
Publication, Distribution, etc.
    [S.l.] : University of California, Berkeley, 2025
Publication, Distribution, etc.
    Ann Arbor : ProQuest Dissertations & Theses, 2025
Physical Description
    123 p.
General Note
    Source: Dissertations Abstracts International, Volume: 87-04, Section: A.
General Note
    Advisor: Kanazawa, Angjoo.
Dissertation Note
    Thesis (Ph.D.)--University of California, Berkeley, 2025.
Summary, etc.
    View synthesis, the task of generating photorealistic images of a scene from novel camera viewpoints, is a cornerstone of computer vision, underpinning graphics, immersive reality, and embodied AI. Yet despite its importance, view synthesis has not demonstrated scaling properties comparable to those in language or 2D generation, even when provided with more data and compute: reconstruction-based methods collapse under sparse views or scene motion, while generative models struggle with 3D consistency and precise camera control. This thesis shows that deep generative priors, instantiated as diffusion models conditioned on camera poses, bridge this gap. We proceed in three steps. First, we reveal that state-of-the-art dynamic view-synthesis benchmarks quietly rely on multi-view cues; removing those cues triggers steep performance drops and exposes the brittleness of reconstruction-based models. Then, we present a working solution that injects learned monocular depth and long-range tracking priors into a dynamic 3D Gaussian scene representation, recovering globally consistent geometry and motion from a single video. Finally, we abandon explicit reconstruction altogether, coupling camera-conditioned diffusion with a two-pass sampling strategy to synthesize minute-long, camera-controlled videos from as little as one input image. From diagnosing the limits of reconstruction, to augmenting it with data-driven regularizers, to replacing it with a fully generative pipeline, our results trace a clear progression that delivers state-of-the-art fidelity, temporal coherence, and camera-control precision while requiring orders of magnitude less input signal. We conclude by outlining open challenges and future directions for scaling view synthesis to truly world-scale 3D environments.
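    The "dynamic 3D Gaussian scene representation" named in the abstract stores a scene as a set of anisotropic Gaussian primitives. A minimal illustrative sketch of one such primitive follows; the field names and the simplified axis-aligned density are assumptions for illustration, not the thesis's actual implementation.

    ```python
    # Illustrative sketch of a single primitive in a Gaussian-splatting-style
    # scene representation. Names and the axis-aligned density are assumed
    # for illustration only.
    from dataclasses import dataclass
    import math

    @dataclass
    class Gaussian3D:
        mean: tuple[float, float, float]               # center in world space
        scale: tuple[float, float, float]              # per-axis std. deviations
        rotation: tuple[float, float, float, float]    # unit quaternion (w, x, y, z)
        color: tuple[float, float, float]              # RGB radiance
        opacity: float                                 # in (0, 1)

        def density(self, x: float, y: float, z: float) -> float:
            """Unnormalized density at a point, ignoring rotation to keep
            the sketch short (i.e., an axis-aligned Gaussian)."""
            d = 0.0
            for p, m, s in zip((x, y, z), self.mean, self.scale):
                d += ((p - m) / s) ** 2
            return self.opacity * math.exp(-0.5 * d)
    ```

    In a dynamic variant of this representation, `mean` (and possibly `rotation`) would additionally be a function of time, which is where the learned depth and tracking priors described above would constrain the motion.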
Subject Added Entry - Topical Term
    Computer science.
Subject Added Entry - Topical Term
    Computer engineering.
Subject Added Entry - Topical Term
    Information science.
Uncontrolled Index Term
    Deep generative
Uncontrolled Index Term
    Camera control
Uncontrolled Index Term
    3D Gaussian scene
Uncontrolled Index Term
    Graphics
Added Entry - Corporate Name
    University of California Berkeley Electrical Engineering & Computer Sciences
Host Item Entry
      Dissertations Abstracts International. 87-04A.
Electronic Location and Access
    View Full Text

      MARC

      008260219s2025        us  ||||||||||||||c||eng  d
      ■001000017359367
      ■00520260202105109
      ■006m          o    d                
      ■007cr#unu||||||||
      ■020    ▼a9798297601284
      ■035    ▼a(MiAaPQ)AAI32236843
      ■040    ▼aMiAaPQ▼cMiAaPQ
      ■0820  ▼a004
      ■1001  ▼aGao, Hang.
      ■24510▼aDeep Generative Priors for View Synthesis at Scale.
      ■260    ▼a[S.l.]▼bUniversity of California, Berkeley.▼c2025
      ■260  1▼aAnn Arbor▼bProQuest Dissertations & Theses▼c2025
      ■300    ▼a123 p.
      ■500    ▼aSource: Dissertations Abstracts International, Volume: 87-04, Section: A.
      ■500    ▼aAdvisor: Kanazawa, Angjoo.
      ■5021  ▼aThesis (Ph.D.)--University of California, Berkeley, 2025.
      ■520    ▼aView synthesis-the task of generating photorealistic images of a scene from novel camera viewpoints-is a cornerstone of computer vision, underpinning graphics, immersive reality, and embodied AI. Yet despite its importance, view synthesis has not demonstrated scaling properties comparable to those in language or 2D generation, even when provided with more data and compute: reconstruction-based methods collapse under sparse views or scene motion, while generative models struggle with 3D consistency and precise camera control. This thesis shows that deep generative priors-instantiated as diffusion models conditioned on camera poses-bridge this gap. We proceed in three steps. First, we start by revealing that state-of-the-art dynamic view-synthesis benchmarks quietly rely on multi-view cues; removing those cues triggers steep performance drops and exposes the brittleness of reconstruction-based models. Then, we present a working solution that injects learned monocular depth and long-range tracking priors into a dynamic 3D Gaussian scene representation, recovering globally consistent geometry and motion from a single video. Finally, we abandon explicit reconstruction altogether, coupling camera-conditioned diffusion with a two-pass sampling strategy to synthesize minute-long, camera-controlled videos from as little as one input image. From diagnosing the limits of reconstruction, to augmenting it with data-driven regularizers, to replacing it with a fully generative pipeline, our results trace a clear progression that delivers state-of-the-art fidelity, temporal coherence, and camera control precision while requiring orders-of-magnitude less input signal. We conclude by outlining open challenges and future directions for scaling view synthesis to truly world-scale 3D environments.
      ■590    ▼aSchool code: 0028.
      ■650  4▼aComputer science.
      ■650  4▼aComputer engineering.
      ■650  4▼aInformation science.
      ■653    ▼aDeep generative
      ■653    ▼aCamera control
      ■653    ▼a3D Gaussian scene
      ■653    ▼aGraphics
      ■690    ▼a0984
      ■690    ▼a0464
      ■690    ▼a0723
      ■71020▼aUniversity of California, Berkeley▼bElectrical Engineering & Computer Sciences.
      ■7730  ▼tDissertations Abstracts International▼g87-04A.
      ■790    ▼a0028
      ■791    ▼aPh.D.
      ■792    ▼a2025
      ■793    ▼aEnglish
      ■85640▼uhttp://www.riss.kr/pdu/ddodLink.do?id=T17359367▼nKERIS▼zThe full text of this material is provided by KERIS (Korea Education and Research Information Service).
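      The MARC display above opens each field with '■' (a 3-digit tag followed by indicator characters) and prefixes each subfield code with '▼'. A small sketch of how such display lines could be split into tag, indicators, and subfields; these delimiter conventions are inferred from this page's rendering, not an official MARC serialization, and control fields (tags 001-009) are not handled specially here.

      ```python
      # Sketch of a parser for this catalog's MARC display lines, assuming
      # '■' opens a field and '▼' prefixes each one-letter subfield code.
      def parse_marc_line(line: str):
          """Split one display line into (tag, indicators, [(code, value), ...]).

          Note: for control fields 001-009 the returned 'indicators' string is
          actually the field's data, since those fields have no subfields.
          """
          body = line.lstrip().lstrip("■")
          tag, rest = body[:3], body[3:]
          # Everything before the first subfield delimiter is indicator space.
          head, _, sub = rest.partition("▼")
          indicators = head.strip()
          subfields = []
          if sub:
              for chunk in ("▼" + sub).split("▼")[1:]:
                  subfields.append((chunk[0], chunk[1:].strip()))
          return tag, indicators, subfields
      ```

      For example, the title line yields tag `245`, indicators `10`, and a single `a` subfield holding the title string.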


        Holdings

        Registration No.  Call No.  Location  Availability  Loan Info
        EM179484  TD  Materials Loan Room (3F)  In processing  In processing
