Computer Vision

Introduction

Most “AI robot face” demos rely on heavy machine-learning models, cloud APIs, or large datasets.
This project explores a different approach:

Using simple geometry and browser-based vision to create an expressive robot head that reacts to human movement in real time.

The robot head tracks left, right, up, and down head motion and mirrors it with animated eyes on a small OLED screen, all controlled directly from a web browser.

No dataset.
No model training.
No server.

What This Project Does

Detects a human face using the browser camera

Estimates head orientation (yaw & pitch)

Sends motion data to an Arduino using Web Serial

Animates eye movement on a 0.96″ OLED display

Allows direction inversion (mirror correction) at runtime

The result is a small robot head that feels responsive and alive.
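The hand-off from browser to Arduino can be pictured as one short text line per frame. The sketch below is only an illustration: the wire format (two integers from -100 to 100, comma-separated, newline-terminated) and the name `encode_gaze` are assumptions, since the post does not document the actual protocol. It is written in Python for brevity; in the project this step would run in the browser before the bytes are written to the Web Serial port.

```python
def encode_gaze(yaw: float, pitch: float) -> bytes:
    """Encode normalized yaw/pitch (-1.0 to 1.0) as a single ASCII line.

    Assumed format for this sketch: "<x>,<y>" followed by a newline,
    with each value scaled to the range -100..100.
    """
    def to_percent(value: float) -> int:
        # Clamp first so a noisy estimate can never overflow the range.
        clamped = max(-1.0, min(1.0, value))
        return round(clamped * 100)

    line = f"{to_percent(yaw)},{to_percent(pitch)}\n"
    return line.encode("ascii")
```

For example, `encode_gaze(0.5, -0.25)` produces `b"50,-25\n"`. A newline-terminated text line is easy to parse on a microcontroller and easy to read in a serial monitor while debugging.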

Key Design Decisions

1. Pupils Move, Not Eyeballs

Moving only the pupils inside fixed eyeballs makes the face feel more natural and expressive.
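One way to keep a pupil inside a fixed eyeball is to limit how far its center may travel from the eye's center. The following is a minimal sketch under that assumption; `clamp_pupil` and its parameters are hypothetical names, and the actual Arduino drawing code is not shown in the post.

```python
import math
from typing import Tuple


def clamp_pupil(dx: float, dy: float,
                eye_radius: float, pupil_radius: float) -> Tuple[float, float]:
    """Limit a pupil offset so the pupil stays fully inside the eyeball.

    dx, dy are the requested offsets of the pupil center from the eye
    center, in pixels. The pupil center may move at most
    (eye_radius - pupil_radius) pixels in any direction.
    """
    max_offset = max(0.0, eye_radius - pupil_radius)
    distance = math.hypot(dx, dy)
    if distance <= max_offset:
        return (float(dx), float(dy))
    # Shrink the offset along the same direction until it touches the limit.
    scale = max_offset / distance
    return (dx * scale, dy * scale)
```

Clamping along the direction of motion, rather than clamping x and y separately, keeps the pupil on a circular boundary so diagonal glances look the same as horizontal ones.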

2. Face-Relative Geometry

Head motion is measured relative to facial landmarks, not camera pixels.
This makes movement symmetric and stable.
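A plausible reading of "face-relative" is that offsets are divided by a distance measured on the face itself, so the result does not change when the person moves closer to or farther from the camera. The sketch below assumes three landmarks (both eyes and the nose tip) and normalizes by the distance between the eyes. The landmark choice, the function name, and the `neutral_pitch` parameter are all assumptions for illustration, not the project's confirmed method.

```python
import math
from typing import Tuple

Point = Tuple[float, float]


def estimate_head_offsets(left_eye: Point, right_eye: Point, nose_tip: Point,
                          neutral_pitch: float = 0.0) -> Tuple[float, float]:
    """Return (yaw, pitch) as unitless ratios relative to eye spacing.

    yaw   > 0 means the nose sits to the right of the eye midpoint in the image.
    pitch > 0 means the nose sits lower than its assumed neutral position.
    """
    eye_mid_x = (left_eye[0] + right_eye[0]) / 2
    eye_mid_y = (left_eye[1] + right_eye[1]) / 2
    eye_distance = math.hypot(right_eye[0] - left_eye[0],
                              right_eye[1] - left_eye[1])
    if eye_distance == 0:
        raise ValueError("eye landmarks coincide; cannot normalize")

    # Dividing by eye spacing removes the dependence on face size in pixels.
    yaw = (nose_tip[0] - eye_mid_x) / eye_distance
    pitch = (nose_tip[1] - eye_mid_y) / eye_distance - neutral_pitch
    return yaw, pitch
```

With this normalization, a face whose landmarks are all scaled by the same factor (for instance, the same pose seen from twice the distance) yields the same yaw and pitch values, which is one way to get the stability described above.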

3. Runtime Direction Flip

A toggle button allows instant correction for mirrored cameras without changing code.
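The toggle can be as small as one boolean that flips the sign of the horizontal value before it is sent. This is a sketch only: the class name `DirectionFlip` is hypothetical, and in the project the state would live in the browser page and be switched by the button's click handler.

```python
from dataclasses import dataclass


@dataclass
class DirectionFlip:
    """Holds the mirror-correction state and applies it to yaw values."""
    inverted: bool = False

    def toggle(self) -> None:
        # Called each time the user presses the flip button.
        self.inverted = not self.inverted

    def apply(self, yaw: float) -> float:
        # Only the horizontal axis is mirrored by a flipped camera image.
        return -yaw if self.inverted else yaw
```

Because the flag is read on every frame, pressing the button takes effect on the next value sent, with no reload or re-upload of the Arduino sketch.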

Computer vision is the field concerned with enabling machines to interpret, analyze, and understand digital images and video. The libraries below are widely used for building computer vision applications.

OpenCV

OpenCV is a widely-used open-source computer vision library that provides developers with a range of tools for image and video analysis, object detection, face recognition, and more. OpenCV is written in C++ and supports multiple programming languages such as Python, Java, and MATLAB.

Official website: https://opencv.org/
User-friendliness: Easy to use with extensive documentation and tutorials.
Community support: Large and active community with frequent updates and contributions.

TensorFlow

TensorFlow is an open-source machine learning framework that includes a range of tools for image recognition, object detection, and classification. TensorFlow supports multiple programming languages, including Python, C++, and Java.

Official website: https://www.tensorflow.org/
User-friendliness: Easy to use with extensive documentation and tutorials.
Community support: Large and active community with frequent updates and contributions.

PyTorch

PyTorch is an open-source machine-learning library that includes a range of tools for image recognition, object detection, and segmentation. Its primary interface is Python, with a C++ frontend (LibTorch) and more limited Java bindings.

Official website: https://pytorch.org/
User-friendliness: Easy to use with extensive documentation and tutorials.
Community support: Large and active community with frequent updates and contributions.

Caffe

Caffe is a deep learning framework that includes tools for image classification, segmentation, and detection. Caffe is written in C++ and supports multiple programming languages such as Python and MATLAB.

Official website: http://caffe.berkeleyvision.org/
User-friendliness: Moderate difficulty with a learning curve.
Community support: Medium-sized community; the project is no longer under active development, so updates are rare.

Keras

Keras is an open-source deep-learning library that provides a high-level API for tasks such as image recognition, object detection, and classification. Keras itself is a Python library; an R interface is available through a separate package.

Official website: https://keras.io/
User-friendliness: Easy to use with extensive documentation and tutorials.
Community support: Large and active community with frequent updates and contributions.

scikit-image

scikit-image is a Python library that provides tools for image processing, including filtering, segmentation, and feature extraction.

Official website: https://scikit-image.org/
User-friendliness: Easy to use with extensive documentation and tutorials.
Community support: Large and active community with frequent updates and contributions.

These computer vision libraries offer a wide range of tools and functionalities for developers to work with. Choosing the right library largely depends on the requirements and specific use cases of the project.

AI Robot Head Tracking Using Browser Vision (No Training, No Server)

Download FreeCAD project and STL files

What are Computer vision libraries?