60 FPS Camera Module

Overview

Anvil's camera module offers a great starting point for Physical AI developers. The main features worth noting are:

  • A set of swappable lenses for each camera module, ranging from a 180° fisheye to a standard lens. This matters because Physical AI models are biased towards the images they were pre-trained on, so the lens used at inference time should match the training data (fisheye in = fisheye out)

  • A rugged USB screw-lock shell that prevents disconnection during movement or impact

  • A frame rate of 60 frames per second, which supports the inference-time configurations of most state-of-the-art models

Media

Packing List

This camera module is pre-integrated in all of Anvil's robots. It is also sold separately as a replacement or upgrade for existing robots.

Technical Specs

Resolution and Maximum Frame-rate

  • 1080p at 60 fps

  • 720p at 60 fps (recommended) [1]

  • 480p at 60 fps

  • 4K UHD at 30 fps (unlikely to be needed for Physical AI workloads)
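
To compare the modes listed above by how much pixel data each one produces, here is a minimal Python sketch. The pixel dimensions are assumptions, since this page lists modes by name only: 480p is taken as 640x480, and the other modes use their conventional 16:9 dimensions. The names `MODES`, `pixels_per_second`, and `relative_load` are illustrative and not part of any Anvil software.

```python
# Assumed pixel dimensions and the maximum frame rate listed for each mode.
# The spec sheet names the modes only, so these sizes are assumptions.
MODES = {
    "480p": (640, 480, 60),      # assumes 4:3 VGA; some cameras expose 854x480
    "720p": (1280, 720, 60),
    "1080p": (1920, 1080, 60),
    "4K UHD": (3840, 2160, 30),
}


def pixels_per_second(width: int, height: int, fps: int) -> int:
    """Number of pixels the sensor delivers per second in a given mode."""
    return width * height * fps


def relative_load(mode: str, baseline: str = "480p") -> float:
    """Pixel rate of `mode` divided by the pixel rate of `baseline`."""
    return pixels_per_second(*MODES[mode]) / pixels_per_second(*MODES[baseline])


for name in MODES:
    print(f"{name:>7}: {relative_load(name):.2f}x the pixel rate of 480p")
```

Under these assumptions, 720p at 60 fps carries 3 times the pixel rate of 480p at 60 fps, and 1080p at 60 fps carries 6.75 times. Any pixels beyond what the model consumes have to be transferred, decoded, and then discarded by the resize step.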

Encoding Options

MJPEG and YUV422 [2]

Connectivity

Screw-lock USB 2.0

UVC Camera Controls

  • Resolution and FPS

  • Exposure

  • White Balance

  • Brightness

  • Contrast

  • Hue

  • Saturation

  • Sharpness

  • Gamma

  • Zoom / Pan / Tilt
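
As an illustration of setting the resolution, frame rate, and encoding from software, here is a minimal sketch that uses OpenCV's `VideoCapture` properties. It assumes the module enumerates as a standard UVC device at index 0 and that the `opencv-python` package is installed. The helper names `fourcc` and `open_camera` are hypothetical. Whether a given property is honoured depends on the operating system's capture backend, so the sketch reads back the values the driver actually accepted.

```python
def fourcc(code: str) -> int:
    """Pack a four-character code such as 'MJPG' into the little-endian
    integer form that OpenCV expects for CAP_PROP_FOURCC."""
    if len(code) != 4:
        raise ValueError("a FOURCC must be exactly four characters")
    return sum(ord(ch) << (8 * i) for i, ch in enumerate(code))


def open_camera(index: int = 0, width: int = 1280, height: int = 720, fps: int = 60):
    """Open a UVC camera and request MJPEG at the given size and frame rate.

    The device index and the 720p / 60 fps defaults are assumptions based on
    the recommended mode; adjust them for your setup.
    """
    import cv2  # imported lazily so fourcc() is usable without OpenCV

    cap = cv2.VideoCapture(index)
    if not cap.isOpened():
        raise RuntimeError(f"could not open camera at index {index}")

    # Request the pixel format first: on some backends the available
    # resolutions and frame rates depend on the selected format.
    cap.set(cv2.CAP_PROP_FOURCC, fourcc("MJPG"))
    cap.set(cv2.CAP_PROP_FRAME_WIDTH, width)
    cap.set(cv2.CAP_PROP_FRAME_HEIGHT, height)
    cap.set(cv2.CAP_PROP_FPS, fps)

    # The driver may silently fall back to another mode, so report
    # what was actually negotiated instead of trusting the request.
    actual = (
        int(cap.get(cv2.CAP_PROP_FRAME_WIDTH)),
        int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT)),
        cap.get(cv2.CAP_PROP_FPS),
    )
    print("negotiated mode:", actual)
    return cap
```

Calling `open_camera()` returns a capture object whose `read()` method yields frames. Call `release()` on it when finished. Image controls such as exposure or brightness can be requested the same way through the matching `cv2.CAP_PROP_*` constants, subject to backend support.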

Interchangeable Lenses

  • 180° Fisheye

  • 150° Pseudo Fisheye

  • 130° Wide Angle

  • Standard

  • In addition, most high-quality M12 lenses can be used directly with this module. To check a lens's compatibility with Anvil's camera module, just send us its specs.

[1] Physical AI models such as ACT and Diffusion Policy usually train on low-resolution images such as 480p or 360p, since higher resolutions would require orders of magnitude more computing power. Most image data must therefore be resized (down-sampled) to fit model expectations. For this reason, we recommend that users select a lower resolution with MJPEG encoding to reduce communication bandwidth requirements and system load.

[2] YUV422 is a RAW (uncompressed) pixel format that requires significantly more USB bandwidth for little to no benefit in the Physical AI use case. RAW pixels primarily preserve minor texture details (such as subtle gradients from shadows), which are thrown away when images are downsized for training. We do not recommend that users record in RAW. All frame rates and resolutions listed above are based on the recommended MJPEG encoding.
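
The bandwidth cost of uncompressed YUV422 can be checked with simple arithmetic. The sketch below assumes 8-bit 4:2:2 sampling (16 bits per pixel on average) and compares the resulting data rate with the 480 Mbit/s nominal signaling rate of USB 2.0 High Speed. Usable throughput is lower than the nominal rate, so this comparison is optimistic. The function name `yuv422_mbps` is illustrative only.

```python
USB2_NOMINAL_MBPS = 480      # USB 2.0 High Speed signaling rate, in Mbit/s
YUV422_BITS_PER_PIXEL = 16   # 8-bit 4:2:2: full-rate luma, half-rate chroma


def yuv422_mbps(width: int, height: int, fps: int) -> float:
    """Uncompressed YUV422 data rate in Mbit/s for a given mode."""
    return width * height * YUV422_BITS_PER_PIXEL * fps / 1e6


rate = yuv422_mbps(1280, 720, 60)
print(f"720p60 YUV422 needs about {rate:.1f} Mbit/s")
print("exceeds USB 2.0 nominal rate:", rate > USB2_NOMINAL_MBPS)
```

Under these assumptions, uncompressed 720p at 60 fps needs roughly 885 Mbit/s, well beyond what a USB 2.0 link can carry.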
