AI Robot Head Tracking Using Browser Vision (No Training, No Server)

Download FreeCAD project and STL files

Introduction

Most “AI robot face” demos rely on heavy machine-learning models, cloud APIs, or large datasets.
This project explores a different approach: using simple geometry and browser-based vision to create an expressive robot head that reacts to human movement in real time.

The robot head tracks left, right, up, and down head motion and mirrors it using animated eyes displayed on a small OLED screen — all controlled directly from a web browser.

No dataset.
No model training.
No server.

What This Project Does

  • Detects a human face using the browser camera
  • Estimates head orientation (yaw & pitch)
  • Sends motion data to an Arduino using Web Serial
  • Animates eye movement on a 0.96″ OLED display
  • Allows direction inversion (mirror correction) at runtime

The result is a small robot head that feels responsive and alive.

System Architecture

Browser Camera
      ↓
Face Geometry (MediaPipe)
      ↓
Yaw & Pitch Calculation
      ↓
Web Serial (USB)
      ↓
Arduino UNO
      ↓
OLED Eye Animation

All vision processing happens locally in the browser; the Arduino only receives the motion data and draws the eyes.
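The Web Serial step of the pipeline above can be sketched as follows. This is a minimal sketch, not the project's actual protocol: the wire format (`"<yaw>,<pitch>\n"` with integers in [-100, 100]), the function names `encodePose`, `openSerialWriter`, and `sendPose`, and the 115200 baud rate are all assumptions made for illustration. Only `navigator.serial.requestPort()`, `port.open()`, and `port.writable.getWriter()` come from the Web Serial API itself.

```javascript
// Hypothetical wire format: "<yaw>,<pitch>\n", each an integer in [-100, 100].
// Inputs are assumed to be unitless values where -1..1 covers the useful range.
function encodePose(yaw, pitch) {
  const toInt = (v) =>
    Number.isFinite(v) ? Math.round(Math.max(-1, Math.min(1, v)) * 100) : 0;
  return `${toInt(yaw)},${toInt(pitch)}\n`;
}

// Opens a serial port chosen by the user and returns a writer for it.
// Must be called from a user gesture (e.g. a button click) in a browser
// that supports Web Serial. The baud rate is an assumption and has to
// match the value passed to Serial.begin() on the Arduino.
async function openSerialWriter(baudRate = 115200) {
  const port = await navigator.serial.requestPort();
  await port.open({ baudRate });
  return port.writable.getWriter();
}

// Sends one pose update as UTF-8 text through the given writer.
async function sendPose(writer, yaw, pitch) {
  await writer.write(new TextEncoder().encode(encodePose(yaw, pitch)));
}
```

A newline-terminated text format is chosen here because it is easy to inspect in a serial monitor and easy to parse on the Arduino side; a binary format would be more compact but harder to debug.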

Hardware Used

  • Arduino UNO
  • 0.96″ OLED (SSD1306, I2C)
  • USB cable
  • 3D-printed enclosure

Software Stack

  • HTML + JavaScript
  • MediaPipe Face Mesh
  • Web Serial API
  • Arduino (C++)

Key Design Decisions

1. Pupils Move, Not Eyeballs

Moving only the pupils inside fixed eyeballs makes the face feel more natural and expressive.
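One way to keep the pupil inside a fixed eyeball is to clamp its offset to the eyeball radius minus the pupil radius. The sketch below assumes yaw and pitch arrive as unitless values in roughly [-1, 1]; the function name `pupilOffset` and the default radii (14 px and 6 px) are hypothetical and not taken from the project. It is written in JavaScript for illustration, but the same arithmetic ports directly to the Arduino sketch.

```javascript
// Maps yaw/pitch to a pixel offset for the pupil, measured from the
// eyeball centre. The radii are illustrative defaults, not project values.
function pupilOffset(yaw, pitch, eyeRadius = 14, pupilRadius = 6) {
  // Farthest the pupil centre can travel while the pupil stays inside the eye.
  const maxOffset = eyeRadius - pupilRadius;
  let dx = yaw * maxOffset;
  let dy = pitch * maxOffset;

  // Clamp to a circle so diagonal looks do not leave the eyeball.
  const len = Math.hypot(dx, dy);
  if (len > maxOffset) {
    dx *= maxOffset / len;
    dy *= maxOffset / len;
  }

  // Truncate toward zero so integer conversion never pushes the pupil
  // past the rim.
  return { dx: Math.trunc(dx), dy: Math.trunc(dy) };
}
```

Clamping to a circle rather than to a square keeps the pupil from poking out of the eye on diagonal looks such as up-and-left.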

2. Face-Relative Geometry

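Head motion is measured relative to facial landmarks, not camera pixels.
Because the offsets are normalized by the size of the face itself, the readings stay symmetric and stable when the user sits closer to the camera, farther away, or off-center.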
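A face-relative estimate can be sketched as below. This is one possible formulation, not necessarily the one the project uses: it measures how far the nose tip sits from the midpoint of the eyes (for yaw) and from the midpoint of the forehead and chin (for pitch), each divided by a face dimension. The landmark indices are the ones commonly cited for MediaPipe Face Mesh and are an assumption here, as is the function name `estimateYawPitch`. The outputs are unitless ratios, not angles in degrees.

```javascript
// Commonly cited MediaPipe Face Mesh landmark indices (assumed here; verify
// against the mesh map for the model version you load).
const NOSE_TIP = 1;
const EYE_OUTER_A = 33;
const EYE_OUTER_B = 263;
const FOREHEAD = 10;
const CHIN = 152;

// landmarks: array of { x, y } points in normalized image coordinates.
// Returns unitless ratios: yaw > 0 when the nose shifts toward larger x,
// pitch > 0 when the nose shifts toward larger y (downward in the image).
function estimateYawPitch(landmarks) {
  const nose = landmarks[NOSE_TIP];
  const eyeA = landmarks[EYE_OUTER_A];
  const eyeB = landmarks[EYE_OUTER_B];
  const top = landmarks[FOREHEAD];
  const chin = landmarks[CHIN];

  // Face dimensions used as the measuring stick instead of pixels.
  const eyeWidth = Math.hypot(eyeB.x - eyeA.x, eyeB.y - eyeA.y);
  const faceHeight = Math.hypot(chin.x - top.x, chin.y - top.y);
  if (eyeWidth === 0 || faceHeight === 0) return { yaw: 0, pitch: 0 };

  const yaw = (nose.x - (eyeA.x + eyeB.x) / 2) / eyeWidth;
  const pitch = (nose.y - (top.y + chin.y) / 2) / faceHeight;
  return { yaw, pitch };
}
```

Dividing by the eye width and face height is what removes the dependence on distance from the camera: scaling or shifting every landmark by the same amount leaves both ratios unchanged.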

3. Runtime Direction Flip

A toggle button allows instant correction for mirrored cameras without changing code.
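A runtime flip like this can be as small as a boolean that negates the horizontal value before it is sent. The sketch below assumes that only yaw needs inverting (mirroring is a left-right effect); the names `createDirectionFlip` and `flip-button` are hypothetical.

```javascript
// Holds the mirror-correction state and applies it to each reading.
function createDirectionFlip() {
  let invertYaw = false;
  return {
    // Flips the state and returns the new value (true = inverted).
    toggle() {
      invertYaw = !invertYaw;
      return invertYaw;
    },
    // Negates yaw when inversion is on; pitch passes through unchanged.
    apply(yaw, pitch) {
      return { yaw: invertYaw ? -yaw : yaw, pitch };
    },
  };
}

const flip = createDirectionFlip();

// Browser-only wiring; "flip-button" is a hypothetical element id.
if (typeof document !== "undefined") {
  const button = document.getElementById("flip-button");
  if (button) button.addEventListener("click", () => flip.toggle());
}
```

Each new reading would pass through `flip.apply(yaw, pitch)` just before being sent, so a click takes effect on the very next frame.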


Educational Value

This project can be used to teach:

  • Coordinate systems
  • Geometry-based tracking
  • Browser ↔ hardware communication
  • Human-centered design

It is suitable for classrooms, labs, and exhibitions.


Conclusion

This robot head demonstrates that intelligence is not just about models, but about understanding interaction.

By combining browser vision, simple math, and embedded hardware, we can build systems that feel responsive, expressive, and intuitive without a training pipeline or a backend server.


Try It Live

Allow camera access, connect the Arduino, and move your head left, right, up, and down.
