Learning across different 3D representations

  • High-resolution dense voxel grids are impractical: memory and compute both grow cubically with the resolution
OctNet: use an octree as the input to the network
Octree Generating Networks: have the network produce an octree as output
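
To make the memory argument concrete, here is a small back-of-the-envelope sketch. It assumes one float32 value per voxel; the function name `dense_grid_bytes` is made up for this note.

```python
def dense_grid_bytes(resolution: int, channels: int = 1, bytes_per_value: int = 4) -> int:
    """Memory needed by a dense cubic voxel grid (assumption: float32 values)."""
    return resolution ** 3 * channels * bytes_per_value


# Doubling the resolution multiplies the memory by 8.
for res in (32, 128, 512, 1024):
    gib = dense_grid_bytes(res) / 2 ** 30
    print(f"{res}^3 grid: {gib:.4f} GiB")
```

A single-channel 1024^3 grid already needs 4 GiB before any feature maps are stored, which is why hierarchical structures such as octrees are attractive.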

Submanifold sparse convolutions #

  • Only apply convolutions where there is geometry -> saves compute

  • Can be combined with hash-based structures such as spatial hashing to store only the occupied positions -> saves memory

  • Together: allows processing the scene at a higher resolution

  • Sparse Generative Neural Networks
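
A minimal sketch of the idea, not the implementation of any particular library: active voxels live in a Python dict (standing in for a spatial hash), features are single scalars, and the kernel is a 3x3x3 table of weights. All names (`submanifold_conv`, `OFFSETS`) are illustrative assumptions.

```python
from itertools import product

OFFSETS = list(product((-1, 0, 1), repeat=3))  # 3x3x3 neighbourhood


def submanifold_conv(active, weights, bias=0.0):
    """active: {(x, y, z): feature}, weights: {(dx, dy, dz): weight}.

    Output is computed only at sites that are active in the input, so the
    set of active sites does not grow from layer to layer.
    """
    out = {}
    for (x, y, z) in active:
        acc = bias
        for (dx, dy, dz) in OFFSETS:
            neighbour = (x + dx, y + dy, z + dz)
            if neighbour in active:  # hash lookup; empty space is skipped
                acc += weights[(dx, dy, dz)] * active[neighbour]
        out[(x, y, z)] = acc
    return out


# Three occupied voxels in an otherwise empty, arbitrarily large grid.
active = {(0, 0, 0): 1.0, (1, 0, 0): 2.0, (100, 100, 100): 5.0}
weights = {off: 1.0 for off in OFFSETS}  # box filter, for illustration only
out = submanifold_conv(active, weights)
```

The cost depends on the number of occupied voxels rather than on the grid resolution, and the dict stores nothing for empty space.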

From multi-view #

MVCNN: multi-view images -> classification
  • Render point clouds as spheres and run MVCNN on them

  • Multi-view considerations

    1. What viewpoints to use?
    2. How many viewpoints?
    3. How to handle noisy/incomplete data?

Convolutions on points #

PointNet++ already introduced hierarchical, multi-resolution processing of points to capture local neighborhood structure.

Convolutions on meshes #

  • Meshes are graphs
  • How to define convolutions over graphs?
    • Should be agnostic to number of vertices / faces

Geometric Operators #

Spectral graph convolution #

  • Apply the convolutions in the frequency domain
  • “Convolution in the spatial domain == multiplication in the frequency domain”
  1. Convert the signal on the mesh to the frequency domain using the eigenvectors of the mesh Laplacian operator.
  2. Multiply with the kernel
  • No guarantee that filters will have local support on the graph
  • No shift invariance
  • No pooling
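
The two steps above can be sketched with NumPy. This is a simplified illustration under stated assumptions: it uses the combinatorial graph Laplacian L = D - A as a stand-in for a proper mesh Laplacian, and a hand-picked low-pass kernel where a network would learn the filter values. The function names are made up for this note.

```python
import numpy as np


def graph_laplacian(num_vertices, edges):
    """Combinatorial Laplacian L = D - A of an undirected graph."""
    A = np.zeros((num_vertices, num_vertices))
    for i, j in edges:
        A[i, j] = A[j, i] = 1.0
    return np.diag(A.sum(axis=1)) - A


def spectral_conv(signal, L, spectral_filter):
    """Filter a per-vertex signal in the Laplacian eigenbasis."""
    eigvals, U = np.linalg.eigh(L)                 # L is symmetric, U is orthonormal
    coeffs = U.T @ signal                          # 1. transform to the frequency domain
    filtered = spectral_filter(eigvals) * coeffs   # 2. multiply with the kernel
    return U @ filtered                            # transform back to the vertices


# A 4-cycle standing in for a tiny mesh (hypothetical example data).
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
L = graph_laplacian(4, edges)
signal = np.array([1.0, 0.0, 0.0, 0.0])

# Hypothetical low-pass kernel; a learned filter would take its place.
smoothed = spectral_conv(signal, L, lambda lam: np.exp(-lam))
```

Note that the filter is tied to the eigenbasis of this particular graph, and the eigendecomposition is global, which is where the missing locality guarantee comes from.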

Message Passing Graph Neural Networks #

  • See Graph Neural Network

  • Vertices have features associated with them

  • Messages get passed from vertices to their neighbors

    • Messages get aggregated at the vertices (sum or average)
  • This is run iteratively a few times over the whole graph

  • Optionally, edges also have features that get aggregated

  • See: Scan2Mesh
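
A minimal sketch of one message passing round in plain Python. The update rule (average of the vertex's own feature and the aggregated message) is a made-up placeholder; in an actual GNN the message and update functions are learned. Edge features are left out for brevity.

```python
def message_passing_step(features, edges, aggregate="mean"):
    """One round: each vertex aggregates the features of its neighbours.

    features: list of per-vertex feature vectors (lists of floats)
    edges: undirected edges as (i, j) pairs
    """
    n = len(features)
    dim = len(features[0])
    neighbours = [[] for _ in range(n)]
    for i, j in edges:
        neighbours[i].append(j)
        neighbours[j].append(i)

    updated = []
    for v in range(n):
        agg = [0.0] * dim
        for u in neighbours[v]:  # collect messages from neighbours
            for k in range(dim):
                agg[k] += features[u][k]
        if aggregate == "mean" and neighbours[v]:
            agg = [a / len(neighbours[v]) for a in agg]
        # Placeholder update rule (assumption): mix own feature with the message.
        updated.append([0.5 * (own + msg) for own, msg in zip(features[v], agg)])
    return updated


# Path graph 0 - 1 - 2 with a one-dimensional feature per vertex.
features = [[1.0], [0.0], [0.0]]
edges = [(0, 1), (1, 2)]
for _ in range(3):  # a few rounds let information travel further than one hop
    features = message_passing_step(features, edges)
```

The function does not depend on the number of vertices or on a fixed neighbourhood size, which is the property asked for above.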

Attempts #

Geodesic CNN: works with geodesic patches; no pooling
MeshCNN: edges have features; supports convolutions and pooling

Combining Representations #

  • Leverage benefits of multiple representations

  • Convert a point cloud to a graph based on local neighborhoods and use graph convolutions on it

  • Joint 3D and multi-view learning to predict semantic labels in a scene
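
Turning a point cloud into a graph can be sketched with a brute-force k-nearest-neighbour search. This is only an illustration: the function name `knn_graph` is made up, and for large clouds a spatial index (for example a k-d tree) would replace the O(n^2) loop.

```python
def knn_graph(points, k):
    """Connect each point to its k nearest neighbours (brute force)."""
    edges = set()
    for i, p in enumerate(points):
        # Squared Euclidean distance is enough for ranking neighbours.
        dists = sorted(
            (sum((a - b) ** 2 for a, b in zip(p, q)), j)
            for j, q in enumerate(points)
            if j != i
        )
        for _, j in dists[:k]:
            edges.add((min(i, j), max(i, j)))  # store each undirected edge once
    return sorted(edges)


points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (5.0, 5.0, 5.0)]
edges = knn_graph(points, k=1)
```

The resulting edge list can be fed to any graph convolution or message passing scheme that operates on vertices and edges.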

October 22, 2023