How PhasorLab Achieved Accurate Human Depth Detection up to 10 Metres with Mask RCNN and a Single CCTV Camera

Ajackus partnered with PhasorLab — a high-precision positioning technology company using sub-nanosecond synchronisation — to develop a custom computer vision solution that accurately detects human bodies and measures their depth up to 10 metres using a single CCTV camera, enhancing PhasorLab’s centimetre-level indoor and outdoor positioning capabilities.

Technologies

Ionic
Laravel
OpenCV
Python

Depth Detection Range: 10 m
Synchronisation Precision Supported: Sub-nanosecond
CCTV Cameras Required: 1

Executive Summary

The Problem

PhasorLab’s Hyper Sync Net technology delivers sub-nanosecond time synchronisation for centimetre-level positioning accuracy — but the system lacked the computer vision capability to reliably detect human bodies and measure their depth within a scene, limiting the precision of human tracking in both indoor and outdoor environments.

The Solution

Ajackus designed and built a two-component computer vision solution: a body detection module using Mask RCNN for precise human identification within a continuous image stream, paired with a depth estimation model using the DenseDepth algorithm to calculate distances between detected humans and the camera up to 10 metres — all operating from a single CCTV camera input.

The Result

PhasorLab now accurately detects human presence and measures depth using a single camera, enhancing the positioning capability of its network-based systems in both indoor and outdoor environments. The solution operates reliably within the 10-metre range specified by PhasorLab’s use case requirements.

Client

PhasorLab is a deep technology company specialising in high-precision network-based positioning. Its core technology — Hyper Sync Net — achieves sub-nanosecond time synchronisation and sub-parts-per-billion frequency synchronisation, enabling centimetre-level target tracking accuracy in positioning systems suitable for both indoor and outdoor deployment. PhasorLab engaged Ajackus to extend its capabilities into computer vision, adding human body detection and depth measurement to complement the positioning infrastructure the company had already built.

Industry: Electronics and Communication / Positioning Technology
Core Technology: Hyper Sync Net (sub-nanosecond synchronisation)
Positioning Accuracy: Centimetre-level indoor and outdoor tracking
Engagement Model: Managed Delivery (custom computer vision model development)

Challenge

The Bottom Line

PhasorLab needed a computer vision system that could reliably identify human bodies and calculate their depth within a single camera’s field of view — without requiring multi-camera setups or specialised hardware beyond a standard CCTV unit.

Building a Custom Depth Model for a Specialised Use Case

Off-the-shelf depth estimation libraries were not designed for PhasorLab’s specific requirements: accurate distance measurement up to 10 metres, operating from a 2D image stream produced by a standard CCTV camera, in real-world environments with variable lighting and occlusion. Developing a model that met these requirements demanded significant research into depth detection algorithms and careful selection from available approaches — with the wrong choice potentially introducing measurement errors that would undermine the centimetre-level accuracy PhasorLab’s positioning technology promised.

Human Body Detection in a Continuous Image Stream

Identifying individual humans within a continuous video stream — rather than in static images — introduced challenges around consistent detection across frames, robustness to partial occlusion, and accurate instance segmentation (distinguishing individual people rather than detecting a group as a single mass). Body shape recognition at regular intervals required a library capable of object segmentation at instance level, not just object classification.
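To make the frame-consistency challenge concrete, the sketch below pairs instance masks from two consecutive frames by their overlap (intersection over union). This is one common approach, shown for illustration only; the source does not say how the project handled cross-frame consistency. The function names, the 0.5 overlap threshold, and the representation of a mask as a set of (row, col) pixels are all assumptions.

```python
from typing import Dict, FrozenSet, Sequence, Tuple

Mask = FrozenSet[Tuple[int, int]]  # pixels (row, col) belonging to one person


def mask_iou(a: Mask, b: Mask) -> float:
    """Intersection over union of two pixel masks (0.0 when both are empty)."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def match_instances(previous: Sequence[Mask], current: Sequence[Mask],
                    min_iou: float = 0.5) -> Dict[int, int]:
    """Greedily map each current mask index to the best-overlapping unused previous index."""
    matches: Dict[int, int] = {}
    used = set()
    for cur_idx, cur in enumerate(current):
        best_idx, best_iou = None, min_iou
        for prev_idx, prev in enumerate(previous):
            if prev_idx in used:
                continue
            iou = mask_iou(prev, cur)
            if iou >= best_iou:
                best_idx, best_iou = prev_idx, iou
        if best_idx is not None:
            matches[cur_idx] = best_idx
            used.add(best_idx)
    return matches


# Two people who each moved by one pixel and were reported in a different order.
previous_frame = [
    frozenset({(0, 0), (0, 1), (1, 0), (1, 1)}),
    frozenset({(5, 5), (5, 6)}),
]
current_frame = [
    frozenset({(5, 5), (5, 6), (5, 7)}),
    frozenset({(0, 1), (1, 0), (1, 1), (1, 2)}),
]
matches = match_instances(previous_frame, current_frame)
```

A current mask with no sufficiently overlapping predecessor is simply left out of the result, which is how a newly arrived or previously occluded person would appear in this sketch.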

Single CCTV Camera Constraint

PhasorLab’s integration requirements specified that the solution must function with a single standard CCTV camera. This constraint ruled out stereo-vision approaches that rely on multiple cameras for depth calculation and required the depth estimation model to infer 3D information from a 2D image — a more technically demanding problem that required careful algorithm selection.
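The reason a single camera rules out stereo vision can be seen in the standard triangulation formula, depth = focal length × baseline / disparity, where the baseline is the distance between the two cameras. The sketch below is a minimal illustration of that formula; the function name and the example numbers are assumptions and are not taken from the PhasorLab project.

```python
def stereo_depth_m(focal_length_px: float, baseline_m: float, disparity_px: float) -> float:
    """Depth from stereo triangulation: Z = f * B / d.

    focal_length_px: focal length in pixels
    baseline_m:      distance between the two cameras in metres
    disparity_px:    horizontal shift of the same point between the two images
    """
    if baseline_m <= 0:
        # With one camera there is no second viewpoint, so there is no baseline.
        raise ValueError("stereo triangulation needs two cameras a known distance apart")
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_length_px * baseline_m / disparity_px


# Illustrative values only: 800 px focal length, cameras 12.5 cm apart, 20 px disparity.
depth = stereo_depth_m(focal_length_px=800.0, baseline_m=0.125, disparity_px=20.0)
```

With only one camera there is no baseline and no disparity to measure, so depth has to be inferred from the image content itself, which is what a monocular model does.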

Goals

The project focused on building a computer vision solution for human detection and depth measurement compatible with PhasorLab’s existing positioning infrastructure.

Goal: Outcome Required
Human body detection: Reliable identification of individual humans in a continuous image stream
Depth measurement: Accurate distance calculation between humans and camera, up to 10 metres
Single camera compatibility: Full functionality with one standard CCTV camera input
Integration with positioning system: Seamless output compatible with PhasorLab’s existing network-based positioning pipeline
Indoor and outdoor reliability: Consistent performance across variable environmental conditions

Journey

Ajackus developed PhasorLab’s computer vision solution across four structured phases, from algorithm research through integration testing.

Phase 1: Algorithm Research and Library Evaluation

The Ajackus team conducted a structured research phase before committing to any specific algorithm or library. For body detection, multiple object detection and segmentation libraries were evaluated against PhasorLab’s requirements — including performance on human-specific shapes, instance segmentation capability, and frame-rate compatibility with continuous video streams. For depth estimation, depth detection algorithms were assessed for their ability to produce accurate distance measurements from monocular (single camera) 2D images in real-world conditions.
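One way such a comparison of depth estimators could be scored is to run each candidate on scenes where the true distances are known and compare the errors. The sketch below shows two common error measures; the metric choice, the function names, and the sample numbers are assumptions for illustration, and the source does not state which metrics the team used.

```python
from typing import Sequence


def _check(predicted: Sequence[float], measured: Sequence[float]) -> None:
    """Both series must be non-empty and the same length."""
    if not measured or len(predicted) != len(measured):
        raise ValueError("need equally sized, non-empty series")


def mean_absolute_error_m(predicted: Sequence[float], measured: Sequence[float]) -> float:
    """Average absolute difference in metres between predicted and measured distances."""
    _check(predicted, measured)
    return sum(abs(p - m) for p, m in zip(predicted, measured)) / len(measured)


def mean_relative_error(predicted: Sequence[float], measured: Sequence[float]) -> float:
    """Average of |predicted - measured| / measured; measured distances must be positive."""
    _check(predicted, measured)
    if any(m <= 0 for m in measured):
        raise ValueError("measured distances must be positive")
    return sum(abs(p - m) / m for p, m in zip(predicted, measured)) / len(measured)


# Hypothetical readings for three people at tape-measured distances.
predicted = [2.0, 5.5, 9.0]
measured = [2.0, 5.0, 10.0]
mae = mean_absolute_error_m(predicted, measured)
rel = mean_relative_error(predicted, measured)
```

The relative error is useful alongside the absolute error because a half-metre miss matters more at 2 metres than at 10 metres.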

Phase 2: Body Detection Module with Mask RCNN

The Ajackus team selected and deployed the Mask RCNN library for the body detection component. Mask RCNN was chosen for its superior instance segmentation capability — the ability to identify and isolate individual humans within an image, rather than simply detecting the presence of people. This precision was essential for PhasorLab’s use case, where distinguishing individual subjects within a scene is required for accurate per-person positioning. The module was implemented to process continuous image streams at regular intervals, providing consistent detection output to the depth measurement layer.
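As a rough illustration of this stage, the sketch below samples a stream at a fixed interval and keeps only confident person detections. The `Detection` record, the `"person"` label string, the 0.7 score threshold, and the every-fifth-frame interval are assumptions made for the example, not details from the PhasorLab project. In a real pipeline the detections would come from a Mask RCNN inference call rather than being written by hand.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, Iterator, List, Tuple


@dataclass(frozen=True)
class Detection:
    """One segmented instance (hypothetical shape for a segmentation model's output)."""
    label: str
    score: float
    mask: FrozenSet[Tuple[int, int]]  # pixels (row, col) belonging to the instance


def sample_frames(frames: Iterable, every_n: int) -> Iterator[tuple]:
    """Yield (index, frame) for every n-th frame of a continuous stream."""
    if every_n < 1:
        raise ValueError("every_n must be at least 1")
    for index, frame in enumerate(frames):
        if index % every_n == 0:
            yield index, frame


def keep_people(detections: Iterable[Detection], min_score: float = 0.7) -> List[Detection]:
    """Keep confident 'person' instances; drop other classes and weak detections."""
    return [d for d in detections if d.label == "person" and d.score >= min_score]


# Stand-in data: integers play the role of frames, and detections are hand-written.
sampled = [index for index, _ in sample_frames(range(12), every_n=5)]
detections = [
    Detection("person", 0.95, frozenset({(0, 0), (0, 1)})),
    Detection("chair", 0.99, frozenset({(5, 5)})),
    Detection("person", 0.40, frozenset({(9, 9)})),
]
people = keep_people(detections)
```

Each surviving `Detection` carries its own pixel mask, which is what lets a later stage measure depth per person rather than per scene.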

Phase 3: Depth Estimation with DenseDepth

For the distance measurement component, the Ajackus team selected the DenseDepth model after a comparative evaluation of available monocular depth estimation approaches. DenseDepth was chosen for its accuracy in estimating depth from single 2D images, producing distance measurements compatible with PhasorLab’s known area map data. Integrating DenseDepth with the Mask RCNN output allowed the system to calculate the distance between each detected human and the camera, completing the depth measurement pipeline up to the 10-metre requirement.
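The combination of a per-pixel depth map with per-person masks can be sketched as follows. This is a minimal illustration under stated assumptions: the depth map is taken to be already expressed in metres, the distance for a person is taken as the median depth over that person's mask pixels, and readings beyond the 10-metre range are reported as `None`. The function name and these choices are hypothetical; the source does not describe how the two outputs were actually combined.

```python
from statistics import median
from typing import Iterable, Optional, Sequence, Tuple

MAX_RANGE_M = 10.0  # range stated in the project requirements


def person_distance_m(depth_map: Sequence[Sequence[float]],
                      mask: Iterable[Tuple[int, int]],
                      max_range_m: float = MAX_RANGE_M) -> Optional[float]:
    """Median depth (metres) over one person's mask pixels.

    Returns None when the mask is empty or the result is beyond max_range_m.
    """
    depths = [depth_map[row][col] for row, col in mask]
    if not depths:
        return None
    distance = median(depths)
    return distance if distance <= max_range_m else None


# A tiny 3x4 depth map in metres; 9.0 marks the far background.
depth_map = [
    [9.0, 9.0, 2.1, 2.0],
    [9.0, 4.0, 2.0, 2.2],
    [4.1, 3.9, 9.0, 2.0],
]
person_a = [(0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
person_b = [(1, 1), (2, 0), (2, 1)]

# Each detected person is measured independently against the same depth map.
distances = [person_distance_m(depth_map, mask) for mask in (person_a, person_b)]
```

The median is used here rather than the mean so that a few background pixels caught at the edge of a mask do not pull the estimate away from the person.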

Phase 4: Integration and Testing

The combined body detection and depth estimation pipeline was integrated into PhasorLab’s existing system architecture and tested across both indoor and outdoor environments. Edge cases — including partial occlusion, multiple simultaneous subjects, and variable lighting conditions — were addressed during testing to ensure reliable performance in real-world deployment conditions.

Results

PhasorLab’s positioning system now incorporates human body detection and depth measurement, extending the practical scope of its centimetre-level positioning technology to real-world human tracking applications.

Accurate Depth Detection Range: 10 m
Standard CCTV Cameras Required: 1
Environments Validated: 2 (Indoor + Outdoor)

What went well:

Technical Achievements

  • Human body detection implemented using Mask RCNN, providing accurate instance-level segmentation of individual humans within a continuous CCTV image stream
  • Depth measurement implemented using the DenseDepth algorithm, enabling accurate distance calculation between detected humans and the camera up to 10 metres from a monocular 2D image input
  • Full solution operating from a single standard CCTV camera — no specialised hardware or multi-camera setup required
  • System integrated with PhasorLab’s existing network-based positioning pipeline and validated in both indoor and outdoor environments

Business Impact

  • The single-camera architecture significantly reduces hardware requirements for deployment, lowering the cost and complexity of installing PhasorLab’s positioning system in new environments
  • The solution is compatible with standard CCTV infrastructure that is already widely deployed in commercial and industrial settings, enabling retrofit adoption without new camera hardware

Why It Worked

Research Before Commitment

Rather than defaulting to the most popular available libraries, the Ajackus team conducted a structured evaluation of multiple depth estimation and body detection approaches before making any selection. For a precision technology client like PhasorLab, where sub-optimal algorithm choice would directly degrade positioning accuracy, this research investment was the foundation of a reliable solution.

Instance-Level Precision

Mask RCNN was selected specifically for its instance segmentation capability — the ability to distinguish individual people, not just detect human presence. For a positioning system that must track specific individuals rather than groups, detection-level granularity is insufficient.

Monocular-First Architecture

The DenseDepth-based depth estimation approach was designed from the start to operate from a single monocular camera. Rather than designing a solution that would have required expensive hardware modifications, the Ajackus team matched the technical approach to the deployment constraint — producing a solution that works with infrastructure clients already have.

Frequently Asked Questions

How does the PhasorLab computer vision solution detect human depth from a single camera?

The Ajackus team implemented the DenseDepth algorithm, which estimates depth information from a single 2D monocular image without requiring stereo cameras or specialised depth sensors. DenseDepth is a convolutional encoder-decoder network that predicts a per-pixel depth map from image features. Those predictions are compared against known area map data provided by PhasorLab's positioning infrastructure, producing distance measurements up to 10 metres from a standard CCTV camera.

Why was Mask RCNN selected over other object detection libraries for body detection?

Mask RCNN was chosen for its instance segmentation capability — the ability to identify and isolate individual humans within an image at pixel level, rather than simply detecting the presence of people. For PhasorLab's positioning use case, distinguishing individual subjects is essential. Other detection libraries that operate at bounding-box or class-detection level would not provide the per-person precision the system required.

Can the system handle multiple people simultaneously in the camera's field of view?

Yes. Mask RCNN's instance segmentation architecture supports the simultaneous detection and isolation of multiple individual humans within a single frame. Each detected person is processed independently by the depth estimation module, allowing the system to provide distance measurements for multiple subjects concurrently.

How quickly can Ajackus develop custom computer vision models for specialised technical requirements?

The timeline for custom computer vision development depends on the complexity of the detection task, the availability of training data, and the specificity of the performance requirements. For the PhasorLab engagement, the Ajackus team completed algorithm research, model selection, implementation, and integration testing within the project timeline. Contact hello@ajackus.com to discuss your specific requirements and a realistic delivery estimate.

Does Ajackus have experience building AI solutions for electronics and communication technology companies?

Yes. Ajackus has built AI-powered solutions across several technically demanding verticals, including computer vision, data analytics, and agentic AI systems. The Ajackus AI engineering team has experience selecting and implementing established models (such as Mask RCNN and DenseDepth) and adapting them to client-specific constraints, including hardware limitations, accuracy requirements, and integration architecture.
