A Beginner’s Guide to ARKit Part-1

Ravneet Kaur
Jul 12, 2021

Overview

This article is an introduction to AR and ARKit. If you are already familiar with both, jump to Part-2.

Introduction

The basic requirement for any AR experience is the ability to create and track a correspondence between the real-world space that a user inhabits and the virtual space where a user can place visual content. When the app displays 2D or 3D content together with the real world through the camera, the user perceives the virtual content as though it were part of the real world.

ARKit

ARKit is Apple’s mobile AR development framework. Augmented reality frameworks aren’t new. Vuforia, for example, has been around for many years. What makes ARKit more efficient to work with is its markerless tracking ability.

Markerless tracking means ARKit doesn’t need AR markers such as QR codes. It builds an understanding of the world around the device and uses that to place virtual content in it.

Frameworks behind ARKit

AVFoundation

Combines six major technology areas that together encompass a wide range of tasks for capturing, processing, synthesizing, controlling, importing and exporting audiovisual media on Apple platforms.

CoreMotion

Reports motion- and environment-related data from the onboard hardware of iOS devices, including from the accelerometers and gyroscopes, and from the pedometer, magnetometer, and barometer.

CoreML

Helps integrate machine learning models for object, surface, or image detection and classification.

ARKit Fundamentals

World Tracking

To create a correspondence between the real and virtual worlds, ARKit uses a technique called Visual-Inertial Odometry (VIO). It combines the phone’s camera images with motion sensor data to keep track of the device’s position and orientation in the real world, together called its pose.
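As a rough illustration of what the pose looks like in code, here is a minimal sketch that runs a world-tracking session and reads the camera transform from each frame. The `PoseLogger` class and the `position(from:)` helper are names made up for this example; the sketch also assumes the app has camera permission (an `NSCameraUsageDescription` entry in Info.plist).

```swift
import ARKit

/// Extracts the translation part of a 4x4 transform.
/// (Hypothetical helper for this example, not an ARKit API.)
func position(from transform: simd_float4x4) -> SIMD3<Float> {
    let t = transform.columns.3
    return SIMD3<Float>(t.x, t.y, t.z)
}

/// Hypothetical class that runs world tracking and prints the device pose.
final class PoseLogger: NSObject, ARSessionDelegate {
    let session = ARSession()

    func start() {
        // World tracking is not available on every device, so check first.
        guard ARWorldTrackingConfiguration.isSupported else { return }
        session.delegate = self
        session.run(ARWorldTrackingConfiguration())
    }

    // ARKit calls this each time it produces a new camera frame.
    func session(_ session: ARSession, didUpdate frame: ARFrame) {
        // The camera transform describes the device pose in world space:
        // the last column holds the position, the rest encodes orientation.
        let devicePosition = position(from: frame.camera.transform)
        let orientation = frame.camera.eulerAngles // pitch, yaw, roll in radians
        print("position:", devicePosition, "orientation:", orientation)
    }
}
```

Calling `PoseLogger().start()` from a view controller (and keeping a strong reference to the logger) should print a pose for every frame while the session runs.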

Scene Analysis

To manage a world map while a session is in progress, ARKit continuously interprets the input from the camera and motion sensors to detect and track the following:

Topology: planes, walls, and other surfaces

Objects: 2D images or 3D objects in the real world

Light: real-world lighting, used to influence the brightness and rendering of virtual objects

Face: tracks faces to enable experiences like face filters and real-time virtual animations

Body: replicates a person’s movement in a virtual representation, like a skeleton
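Several of the items above are switched on through the session configuration. The following is a minimal sketch, assuming a world-tracking session; the function name, the `AnchorLogger` class, and the asset catalog group name "AR Resources" are placeholders for this example. Face and body tracking are not shown because they use their own configuration classes (`ARFaceTrackingConfiguration` and `ARBodyTrackingConfiguration`).

```swift
import ARKit

/// Builds a configuration that detects planes, known images, and lighting.
/// (Hypothetical helper for this example.)
func makeSceneAnalysisConfiguration() -> ARWorldTrackingConfiguration {
    let configuration = ARWorldTrackingConfiguration()
    // Topology: look for both horizontal and vertical surfaces.
    configuration.planeDetection = [.horizontal, .vertical]
    // Light: estimate real-world lighting for each frame.
    configuration.isLightEstimationEnabled = true
    // Objects: detect 2D reference images from an asset catalog group.
    // "AR Resources" is an assumed group name; use whatever your project defines.
    if let images = ARReferenceImage.referenceImages(inGroupNamed: "AR Resources",
                                                     bundle: nil) {
        configuration.detectionImages = images
    }
    return configuration
}

/// Hypothetical delegate that logs what the session detects.
final class AnchorLogger: NSObject, ARSessionDelegate {
    // Called when ARKit adds anchors for newly detected planes or images.
    func session(_ session: ARSession, didAdd anchors: [ARAnchor]) {
        for anchor in anchors {
            if let plane = anchor as? ARPlaneAnchor {
                let kind = plane.alignment == .horizontal ? "horizontal" : "vertical"
                print("Detected a \(kind) plane")
            } else if let image = anchor as? ARImageAnchor {
                print("Detected image:", image.referenceImage.name ?? "unnamed")
            }
        }
    }

    // Called for every frame; the light estimate is nil if estimation is off.
    func session(_ session: ARSession, didUpdate frame: ARFrame) {
        if let light = frame.lightEstimate {
            // Roughly 1000 corresponds to a neutrally lit environment.
            print("Ambient intensity:", light.ambientIntensity)
        }
    }
}
```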

Scene Interaction

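To let users interact with virtual content placed in the real world through the camera, ARKit provides classes for ray-casting. A ray-cast projects a ray from a point on the screen into the scene and reports where it hits a real-world surface.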
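A minimal sketch of ray-casting, assuming the app renders with SceneKit through an `ARSCNView` and targets iOS 13 or later. The function name `placeSphere(at:in:)` is made up for this example; it would typically be called from a tap gesture handler with the tap location.

```swift
import ARKit
import SceneKit

/// Places a small sphere where a ray from the given screen point
/// hits a real-world surface. (Hypothetical helper for this example.)
func placeSphere(at screenPoint: CGPoint, in sceneView: ARSCNView) {
    // Build a ray from the screen point and ask the session what it hits.
    guard let query = sceneView.raycastQuery(from: screenPoint,
                                             allowing: .estimatedPlane,
                                             alignment: .any),
          let result = sceneView.session.raycast(query).first else {
        return // No surface found along the ray.
    }

    // Put a 2 cm sphere at the hit location, expressed in world space.
    let node = SCNNode(geometry: SCNSphere(radius: 0.02))
    node.simdTransform = result.worldTransform
    sceneView.scene.rootNode.addChildNode(node)
}
```

The `.estimatedPlane` target lets ARKit return a hit even before a plane has been fully detected; other targets trade that flexibility for accuracy.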

Scene Persistence

ARKit allows users to save the world map (including the anchors that virtual content is attached to) and load it again later, or send the map to other devices to create a shared experience such as an AR multiplayer game.
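Saving and restoring a world map could look roughly like the sketch below. The two function names are placeholders for this example, and the error handling is deliberately simple.

```swift
import ARKit

/// Archives the session's current world map to a file.
/// (Hypothetical helper for this example.)
func saveWorldMap(from session: ARSession, to url: URL) {
    session.getCurrentWorldMap { worldMap, error in
        guard let worldMap = worldMap else {
            print("World map unavailable:", error?.localizedDescription ?? "unknown error")
            return
        }
        do {
            // ARWorldMap supports secure coding, so it can be archived to Data.
            let data = try NSKeyedArchiver.archivedData(withRootObject: worldMap,
                                                        requiringSecureCoding: true)
            try data.write(to: url, options: .atomic)
        } catch {
            print("Could not save world map:", error.localizedDescription)
        }
    }
}

/// Loads a world map from a file and restarts the session from it.
/// (Hypothetical helper for this example.)
func restoreWorldMap(from url: URL, into session: ARSession) throws {
    let data = try Data(contentsOf: url)
    guard let worldMap = try NSKeyedUnarchiver.unarchivedObject(ofClass: ARWorldMap.self,
                                                                from: data) else {
        return
    }
    let configuration = ARWorldTrackingConfiguration()
    configuration.initialWorldMap = worldMap
    session.run(configuration, options: [.resetTracking, .removeExistingAnchors])
}
```

For a shared experience, the same archived `Data` can be sent to another device (for example over a peer-to-peer connection) and restored there in the same way.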

People Occlusion

AR experiences are now much more immersive, thanks to people occlusion. This is a green-screen-style effect made possible by machine learning. ARKit accomplishes the occlusion by identifying the regions in the camera feed where people appear and preventing virtual content from being drawn into those regions’ pixels.
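People occlusion is enabled through the configuration's frame semantics. A minimal sketch, assuming a world-tracking session; the function name is made up for this example. Because the feature needs recent hardware, the code checks for support before turning it on.

```swift
import ARKit

/// Builds a world-tracking configuration with people occlusion
/// enabled where the device supports it. (Hypothetical helper.)
func makePeopleOcclusionConfiguration() -> ARWorldTrackingConfiguration {
    let configuration = ARWorldTrackingConfiguration()
    // Person segmentation with depth lets virtual content pass both
    // behind and in front of people, depending on their distance.
    if ARWorldTrackingConfiguration.supportsFrameSemantics(.personSegmentationWithDepth) {
        configuration.frameSemantics.insert(.personSegmentationWithDepth)
    }
    return configuration
}
```

On unsupported devices the function returns a plain world-tracking configuration, so the session still runs, just without occlusion.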

Rendering Integration with ARKit

ARKit doesn’t have its own graphics API, so its capabilities are limited to world tracking and scene understanding. It therefore needs a graphics framework to render the AR experience.

SpriteKit: Apple’s node-based framework for creating and rendering 2D games and 2D graphics. You can use SpriteKit as a standalone API or together with SceneKit and ARKit. Its main feature is the ability to draw sprites with physics, 2D text and shapes, images, and video.

SceneKit: Apple’s 3D framework for manipulating and rendering 3D objects. SceneKit gives you high-quality rendering, although for AR projects it only works in combination with ARKit.

RealityKit: High-quality rendering with up-to-date AR capabilities out of the box, including support for the LiDAR Scanner. You can use it alone or with ARKit and MetalKit. Its main advantage is that it can complement, change, or customize scenes coming from the Reality Composer app. It can be a powerful extension for ARKit, although it shines as a standalone AR SDK as well.

Metal: Strictly speaking, Metal is not a rendering framework like the ones above but a low-level API for GPU-accelerated graphics and compute. Developers usually use Metal to get high-quality GPU rendering for games with sophisticated 3D environments.
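To show how ARKit and a renderer fit together, here is a minimal sketch that uses SceneKit through `ARSCNView`, which draws the camera feed and the 3D scene while ARKit drives the tracking. The `ARViewController` class name and the box placement are choices made for this example only.

```swift
import UIKit
import ARKit
import SceneKit

/// Hypothetical view controller that shows a box in AR using SceneKit.
final class ARViewController: UIViewController {
    private let sceneView = ARSCNView()

    override func viewDidLoad() {
        super.viewDidLoad()
        // The AR view fills the screen and follows size changes.
        sceneView.frame = view.bounds
        sceneView.autoresizingMask = [.flexibleWidth, .flexibleHeight]
        sceneView.autoenablesDefaultLighting = true
        view.addSubview(sceneView)

        // A 10 cm box, half a metre in front of where the session starts.
        let box = SCNNode(geometry: SCNBox(width: 0.1, height: 0.1,
                                           length: 0.1, chamferRadius: 0))
        box.position = SCNVector3(0, 0, -0.5)
        sceneView.scene.rootNode.addChildNode(box)
    }

    override func viewWillAppear(_ animated: Bool) {
        super.viewWillAppear(animated)
        // ARKit handles tracking; SceneKit handles drawing.
        sceneView.session.run(ARWorldTrackingConfiguration())
    }

    override func viewWillDisappear(_ animated: Bool) {
        super.viewWillDisappear(animated)
        sceneView.session.pause()
    }
}
```

The same split applies to the other renderers: `ARSKView` plays this role for SpriteKit and `ARView` for RealityKit.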
