Meet Object Capture for iOS


Intro to Object Capture for iOS – it uses computer vision to create lifelike 3D models. The Object Capture API previously ran only on the Mac, but now you can do on-device reconstruction on iOS. There is a sample application to learn how to do this. To me this looks like an update to the app shared earlier this year / late last year.

The code is available in the developer documentation, but you should certainly bookmark this page – https://developer.apple.com/documentation/Updates/wwdc2023

More objects with LiDAR

  • Object Capture still performs best on objects with plenty of surface detail, but low-texture objects are now improved by using LiDAR – the system augments the model based on the LiDAR point cloud.
  • Still avoid transparent or reflective objects

Guided Capture

  • Guided capture automatically takes images (with LiDAR data) and provides feedback and guidance on how to capture.
  • The capture dial indicates which areas already have images – kinda like when you scan your face for Face ID.
  • You will get notified if there is not enough light; use diffused light to minimize reflections.
  • Keep a consistent distance while scanning and keep the object within the frame.
  • Don’t forget to flip the object if it is rigid; if the object is repetitive, flipping may be problematic.
  • There is now an API that tells you whether the object has been captured well enough for flipping, and it will recommend which way you should flip (see the sketch below).
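Here is a rough sketch of how an app might surface this guidance. The feedback case names (.environmentTooDark, .objectTooFar) are my assumptions, and the observation details may differ from Apple's sample app – treat this as a sketch only.

```swift
import SwiftUI
import RealityKit

// Hypothetical guidance overlay for an ObjectCaptureSession.
// Assumptions: the session can be read directly from SwiftUI, and Feedback has
// cases along the lines of .environmentTooDark / .objectTooFar – check the
// ObjectCaptureSession docs for the real names.
struct CaptureGuidanceOverlay: View {
    var session: ObjectCaptureSession

    var body: some View {
        VStack(spacing: 8) {
            if session.feedback.contains(.environmentTooDark) {
                Text("Not enough light – try a brighter, diffused light source.")
            }
            if session.feedback.contains(.objectTooFar) {
                Text("Move closer and keep a consistent distance.")
            }
            if session.userCompletedScanPass {
                // Enough coverage for this pass – offer a flip for the next one.
                Button("Flip object and start a new pass") {
                    session.beginNewScanPassAfterFlip()
                }
            }
        }
        .padding()
    }
}
```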

iOS API

  • The image capture API (ObjectCaptureSession) is what you want to look up for more information. It is basically a state machine that moves through ready, detecting, capturing, finishing, and finally completed (see the state-machine sketch after this list).
  • The APIs are in RealityKit and SwiftUI (https://developer.apple.com/documentation/realitykit/objectcapturesession)
  • You specify a location where the captured images are stored during the initializing phase.
  • Your app will need to provide its own UI to control the capture session for the user.
  • The detecting phase identifies the bounding box of the object so that the session knows what you’d like to capture.
  • Capturing generates a point cloud to show you progress. Once you are finished, you will need your own UI to complete the capture, run additional capture passes, or flip the object.
  • The finishing state waits for all data to be saved and then automatically moves to the completed state.
  • The completed state then allows you to do on-device reconstruction; if the session fails instead, you will have to create a new session.
  • Creating the 3D model is the “Reconstruction API” – a PhotogrammetrySession that is pointed at the captured images to process them and generate a USDZ model file (see the reconstruction sketch after this list). More on this in the WWDC21 session.
  • The Mac will also use the LiDAR data and supports higher-resolution reconstruction than an iOS device. You can just use Reality Composer Pro on the Mac and won’t have to write any code.
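To make the state flow above concrete, here is a minimal sketch of driving the session from SwiftUI. This is not the sample app's code – the directory location, the bare-bones buttons, and the assumption that ObjectCaptureSession can be observed directly from a view are mine.

```swift
import SwiftUI
import RealityKit

// Minimal sketch of the capture state machine:
// initializing → ready → detecting → capturing → finishing → completed.
// Error handling, checkpoints, and extra scan passes are omitted.
struct CaptureFlowView: View {
    @State private var session = ObjectCaptureSession()

    var body: some View {
        ZStack {
            // System-provided capture UI (camera feed, capture dial, point cloud).
            ObjectCaptureView(session: session)

            VStack {
                Spacer()
                switch session.state {
                case .ready:
                    Button("Detect object") { _ = session.startDetecting() }
                case .detecting:
                    Button("Start capture") { session.startCapturing() }
                case .capturing:
                    Button("Finish") { session.finish() }
                default:
                    EmptyView()
                }
            }
        }
        .onAppear {
            // You choose where the captured images are stored; the session writes here.
            let imagesDirectory = FileManager.default.temporaryDirectory
                .appendingPathComponent("ObjectCaptureImages", isDirectory: true)
            try? FileManager.default.createDirectory(at: imagesDirectory,
                                                     withIntermediateDirectories: true)
            session.start(imagesDirectory: imagesDirectory)
        }
    }
}
```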
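And here is a hedged sketch of the on-device reconstruction step that follows, using the PhotogrammetrySession API from WWDC21. The paths, the .reduced detail level (the one I'd expect to be supported on device), and the minimal output handling are my choices, not the sample's.

```swift
import RealityKit

// Sketch: point a PhotogrammetrySession at the folder of captured images and
// request a USDZ model, watching progress and completion on the output stream.
func reconstructModel(imagesDirectory: URL, outputURL: URL) async throws {
    let session = try PhotogrammetrySession(input: imagesDirectory)

    try session.process(requests: [
        .modelFile(url: outputURL, detail: .reduced)
    ])

    for try await output in session.outputs {
        switch output {
        case .requestProgress(_, let fractionComplete):
            print("Reconstruction progress: \(Int(fractionComplete * 100))%")
        case .requestComplete(_, .modelFile(let url)):
            print("Model written to \(url)")
        case .requestError(_, let error):
            print("Reconstruction failed: \(error)")
        default:
            break
        }
    }
}
```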

Reconstruction enhancements 

  • Mac performance has been improved, and the API now provides an estimated processing time.
  • You can also request poses for the images – pre-computed, optimized camera poses for each shot.
  • You can also customize the number of triangles with a new custom detail level (a hedged sketch of these options follows below).
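These options would presumably hang off PhotogrammetrySession's configuration and request types. The sketch below is speculative: customDetailSpecification, maximumPolygonCount, the .custom detail level, the .poses request, and the progress-info output are my guesses at the new names, so check them against the macOS 14 documentation before relying on any of it.

```swift
import RealityKit

// Speculative sketch of the Mac-side enhancements – every symbol marked
// "assumed" is a guess at the new API surface, not a verified name.
func reconstructWithCustomDetail(imagesDirectory: URL, outputURL: URL) async throws {
    var configuration = PhotogrammetrySession.Configuration()

    // Assumed: a custom detail specification that caps the triangle count of
    // the generated mesh when the custom detail level is requested.
    configuration.customDetailSpecification.maximumPolygonCount = 100_000

    let session = try PhotogrammetrySession(input: imagesDirectory,
                                            configuration: configuration)

    try session.process(requests: [
        .modelFile(url: outputURL, detail: .custom), // assumed: new custom detail level
        .poses                                       // assumed: optimized per-image poses
    ])

    for try await output in session.outputs {
        switch output {
        case .requestProgressInfo(_, let info):      // assumed: carries the time estimate
            print("Estimated time remaining: \(String(describing: info.estimatedRemainingTime))")
        case .requestComplete(_, let result):
            print("Finished: \(result)")
        case .processingComplete:
            print("All requests complete.")
        default:
            break
        }
    }
}
```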