
Estimating Depth

The following function tracks depth, which is the distance between the user and the camera. Depth can be estimated from the focal length of the camera, the average real-world distance between a person's eyes, and the distance between the eyes as measured in the image.
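The relation behind this estimate can be written out with similar triangles (pinhole camera model). In this sketch, the symbols are our own notation: W is the real-world distance between the eyes, f is the focal length in pixels, w is the distance between the eyes in the image in pixels, and d is the depth. The numbers in the worked line are illustrative assumptions, not values taken from the project.

```latex
\frac{w}{f} = \frac{W}{d}
\quad\Longrightarrow\quad
d = \frac{W \cdot f}{w}

% Worked example with assumed values W = 6.3 cm, f = 840 px, w = 120 px:
d = \frac{6.3 \times 840}{120} = 44.1 \text{ cm}
```

The depth comes out in the same unit as W, and it shrinks as the eyes appear further apart in the image.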

estimate_depth(landmarks)

Estimate the Z-coordinate (depth) for a detected face.

This function calculates the depth, which is the distance between the camera and the user, from the distance between the eyes. It combines the focal length and the average real-world distance between the eyes with the distance between the detected eye landmarks to estimate the depth.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| landmarks | list[list[int]] | A list of 468 landmarks, each given as the x and y position of that landmark. | required |

Returns:

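| Type | Description |
| --- | --- |
| int | The distance between the user and the camera, or None if the depth cannot be estimated (for example, when the landmark list does not contain exactly 468 entries). |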

Source code in src/depth.py
def estimate_depth(landmarks: list[list[int]]):
    """Estimate the Z-coordinate (depth) for a detected face.

    This function calculates the depth, which is the distance between the camera and the user, from the distance between the eyes.
    It combines the focal length and the average real-world distance between the eyes with the distance between the detected eye landmarks to estimate the depth.

    Parameters:
        landmarks (list[list[int]]): A list of arrays, each array representing a landmark with x and y position of that landmark.

    Returns:
        int: The distance between the user and the camera, or None if the depth cannot be estimated.

    """
    # Check that the list has the 468 landmarks
    if len(landmarks) != 468:
        print(f"ERROR: Invalid length of landmark list, expected 468, was {len(landmarks)}")
        return None

    # Retrieve the eye landmarks using their indexes
    left_eye = landmarks[EYE_DISTANCE_INDEX['left_eye']]
    right_eye = landmarks[EYE_DISTANCE_INDEX['right_eye']]

    # Calculate the distance between the eyes in pixels
    w, _ = CVZONE_DETECTOR_MAX_ONE.findDistance(left_eye, right_eye)

    # Avoid dividing by zero if both eye landmarks coincide
    if w == 0:
        return None

    # Estimate depth
    return int((INTEROCULAR_DISTANCE * FOCAL_LENGTH) / w)
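To try the same calculation without the project's module-level constants or the cvzone detector, a minimal self-contained sketch could look like the following. Everything prefixed with ASSUMED_ is a hypothetical stand-in chosen for illustration (the real INTEROCULAR_DISTANCE, FOCAL_LENGTH and EYE_DISTANCE_INDEX values are defined elsewhere in the project), and math.hypot replaces the detector's findDistance call.

```python
import math

# Hypothetical stand-ins for the project's constants (illustration only).
ASSUMED_INTEROCULAR_CM = 6.3      # assumed average distance between the eyes
ASSUMED_FOCAL_LENGTH_PX = 840.0   # assumed focal length in pixels
ASSUMED_LEFT_EYE_INDEX = 145      # assumed landmark index for the left eye
ASSUMED_RIGHT_EYE_INDEX = 374     # assumed landmark index for the right eye


def estimate_depth_sketch(landmarks):
    """Return the estimated depth as an int, or None if it cannot be estimated."""
    # The face mesh is expected to provide exactly 468 landmarks.
    if len(landmarks) != 468:
        return None

    x1, y1 = landmarks[ASSUMED_LEFT_EYE_INDEX]
    x2, y2 = landmarks[ASSUMED_RIGHT_EYE_INDEX]

    # Euclidean distance between the two eye landmarks, in pixels.
    w = math.hypot(x2 - x1, y2 - y1)
    if w == 0:
        return None  # coinciding landmarks would cause a division by zero

    # Similar triangles: depth = (real eye distance * focal length) / pixel distance.
    return int((ASSUMED_INTEROCULAR_CM * ASSUMED_FOCAL_LENGTH_PX) / w)


# Usage: a synthetic landmark list with the eyes 120 pixels apart.
landmarks = [[0, 0] for _ in range(468)]
landmarks[ASSUMED_LEFT_EYE_INDEX] = [260, 240]
landmarks[ASSUMED_RIGHT_EYE_INDEX] = [380, 240]
depth = estimate_depth_sketch(landmarks)
print(depth)
```

With these assumed values the result is in centimetres because the assumed interocular distance is given in centimetres; the unit of the real function's result depends on how INTEROCULAR_DISTANCE is defined.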