Face Landmarks - Python
This guide will help you implement real-time face landmark detection on video frames using the VideoSDK. We will leverage the Mediapipe library to detect facial landmarks and draw them on the video frames.
Prerequisites
- Install the necessary libraries:

```sh
pip install videosdk python-dotenv opencv-python av mediapipe
```
- Create a `.env` file and add your VideoSDK token, meeting ID, and name:

```
VIDEOSDK_TOKEN=your_token
MEETING_ID=your_meeting_id
NAME=your_name
```
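If a variable is missing from the `.env` file, the script will fail later with an unhelpful error. A small, hypothetical helper (not part of the guide's script) can fail fast instead; the `check_env` name and its behavior are assumptions for illustration:

```python
import os

# Hypothetical helper: fail fast if any required
# setting is missing from the environment.
REQUIRED = ("VIDEOSDK_TOKEN", "MEETING_ID", "NAME")

def check_env() -> None:
    missing = [name for name in REQUIRED if not os.getenv(name)]
    if missing:
        raise SystemExit("Missing required settings: " + ", ".join(missing))
```

Call `check_env()` right after `load_dotenv()` so a misconfigured environment is reported immediately.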
Code Breakdown
Imports and Constants
We start by importing the necessary libraries and loading environment variables in the face_landmarks.py file:
```python
import asyncio
import os

from videosdk import (
    MeetingConfig, VideoSDK, Participant, Stream, MeetingEventHandler,
    ParticipantEventHandler, CustomVideoTrack, Meeting,
)
import mediapipe as mp
import cv2
from av import VideoFrame
from dotenv import load_dotenv

load_dotenv()

VIDEOSDK_TOKEN = os.getenv("VIDEOSDK_TOKEN")
MEETING_ID = os.getenv("MEETING_ID")
NAME = os.getenv("NAME")

loop = asyncio.get_event_loop()

# Initialize Mediapipe face landmarks
mp_face_mesh = mp.solutions.face_mesh
mp_drawing = mp.solutions.drawing_utils

meeting: Meeting = None
```
Face Landmarks Processor
This processor runs face landmark detection on each video frame, draws the detected landmarks, and returns the annotated frame:
```python
class FaceLandMarkProcessor:
    def __init__(self) -> None:
        print("Processor initialized")
        self.face_mesh = mp_face_mesh.FaceMesh(
            static_image_mode=False,
            max_num_faces=1,
            min_detection_confidence=0.5,
            min_tracking_confidence=0.5,
        )

    def process(self, frame: VideoFrame) -> VideoFrame:
        # Convert the frame to a BGR image
        img = frame.to_ndarray(format="bgr24")

        # Mediapipe expects RGB input
        img_rgb = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)

        # Perform face landmark detection
        results = self.face_mesh.process(img_rgb)

        # Draw the face landmarks on the original BGR image
        if results.multi_face_landmarks:
            for face_landmarks in results.multi_face_landmarks:
                mp_drawing.draw_landmarks(
                    image=img,
                    landmark_list=face_landmarks,
                    connections=mp_face_mesh.FACEMESH_TESSELATION,
                    landmark_drawing_spec=None,
                    connection_drawing_spec=mp_drawing.DrawingSpec(
                        color=(0, 255, 0), thickness=1, circle_radius=1
                    ),
                )

        # Rebuild a VideoFrame, preserving timing information
        new_frame = VideoFrame.from_ndarray(img, format="bgr24")
        new_frame.pts = frame.pts
        new_frame.time_base = frame.time_base
        return new_frame
```
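Copying `pts` and `time_base` onto the rebuilt frame is what keeps playback timing intact: a frame rebuilt from a raw ndarray carries no timing of its own. A minimal sketch of this pattern, using a hypothetical `MockFrame` stand-in rather than a real `av.VideoFrame`:

```python
from dataclasses import dataclass
from fractions import Fraction
from typing import Optional

@dataclass
class MockFrame:
    """Hypothetical stand-in for av.VideoFrame (pixel data plus timing)."""
    data: list
    pts: Optional[int] = None
    time_base: Optional[Fraction] = None

def process(frame: MockFrame) -> MockFrame:
    # Stand-in for the drawing step: work on a copy of the pixel data
    new_frame = MockFrame(data=list(frame.data))
    # Preserve timing information, as FaceLandMarkProcessor.process does
    new_frame.pts = frame.pts
    new_frame.time_base = frame.time_base
    return new_frame

src = MockFrame(data=[0, 1, 2], pts=9000, time_base=Fraction(1, 90000))
out = process(src)
assert out.pts == src.pts and out.time_base == src.time_base
```

If the timing fields are dropped, the SDK has no way to pace the outgoing frames, which typically shows up as stuttering or frozen video on the receiving side.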
CustomVideoTrack
Define a custom video track that runs the processor above whenever a new frame is received:
```python
class ProcessedVideoTrack(CustomVideoTrack):
    """
    A video stream track that transforms frames from another track.
    """

    kind = "video"

    def __init__(self, track):
        super().__init__()  # don't forget this!
        self.track = track
        self.processor = FaceLandMarkProcessor()

    async def recv(self):
        frame = await self.track.recv()
        new_frame = self.processor.process(frame)
        return new_frame
```
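The `recv` override is a simple decorator pattern: pull a frame from the wrapped track, transform it, hand it on. The flow can be sketched with plain asyncio and hypothetical mock classes standing in for the SDK types:

```python
import asyncio

class MockSourceTrack:
    """Hypothetical stand-in for a remote participant's video track."""
    def __init__(self, frames):
        self._frames = list(frames)

    async def recv(self):
        # Deliver the next queued "frame"
        return self._frames.pop(0)

class WrappingTrack:
    """Mirrors ProcessedVideoTrack: pull a frame, transform it, return it."""
    def __init__(self, track, transform):
        self.track = track
        self.transform = transform

    async def recv(self):
        frame = await self.track.recv()
        return self.transform(frame)

async def demo():
    source = MockSourceTrack(frames=["frame-1", "frame-2"])
    wrapped = WrappingTrack(source, transform=str.upper)
    print(await wrapped.recv())  # FRAME-1
    print(await wrapped.recv())  # FRAME-2

asyncio.run(demo())
```

Because each `recv` awaits the upstream track, the wrapper adds per-frame processing without changing how the SDK pulls frames.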
Process on stream available
This function applies the ProcessedVideoTrack to an available video track:
```python
def process_video(track: CustomVideoTrack):
    global meeting
    meeting.add_custom_video_track(
        track=ProcessedVideoTrack(track=track)
    )
```
Event Handlers
Define event handlers to handle meeting and participant events:
```python
class MyMeetingEventHandler(MeetingEventHandler):
    def __init__(self):
        super().__init__()

    def on_meeting_left(self, data):
        print("on_meeting_left")

    def on_participant_joined(self, participant: Participant):
        participant.add_event_listener(
            MyParticipantEventHandler(participant_id=participant.id)
        )

    def on_participant_left(self, participant: Participant):
        print("on_participant_left")


class MyParticipantEventHandler(ParticipantEventHandler):
    def __init__(self, participant_id: str):
        super().__init__()
        self.participant_id = participant_id

    def on_stream_enabled(self, stream: Stream):
        print("on_stream_enabled: " + stream.kind)
        if stream.kind == "video":
            process_video(track=stream.track)

    def on_stream_disabled(self, stream: Stream):
        print("on_stream_disabled")
```
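The handler chain works in two stages: the meeting-level handler attaches a per-participant handler when someone joins, and that handler reacts to stream events, processing only video streams. The dispatch logic can be sketched with hypothetical stand-ins for the SDK's event plumbing:

```python
class MockStream:
    """Hypothetical stand-in for videosdk.Stream."""
    def __init__(self, kind, track):
        self.kind = kind
        self.track = track

class MockParticipant:
    """Hypothetical stand-in: stores listeners and fires events to them."""
    def __init__(self, participant_id):
        self.id = participant_id
        self._listeners = []

    def add_event_listener(self, handler):
        self._listeners.append(handler)

    def emit_stream_enabled(self, stream):
        for handler in self._listeners:
            handler.on_stream_enabled(stream)

class VideoOnlyHandler:
    """Mirrors MyParticipantEventHandler: only video streams get processed."""
    def __init__(self, participant_id):
        self.participant_id = participant_id
        self.processed = []

    def on_stream_enabled(self, stream):
        if stream.kind == "video":
            self.processed.append(stream.track)

participant = MockParticipant("p1")
handler = VideoOnlyHandler(participant.id)
participant.add_event_listener(handler)

participant.emit_stream_enabled(MockStream(kind="audio", track="a-track"))
participant.emit_stream_enabled(MockStream(kind="video", track="v-track"))
print(handler.processed)  # ['v-track']
```

The `kind == "video"` check matters because participants also publish audio streams, which the face landmark pipeline should ignore.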
Main Function
Initialize the meeting and start the event loop:
```python
def main():
    global meeting

    meeting_config = MeetingConfig(
        meeting_id=MEETING_ID,
        name=NAME,
        mic_enabled=False,
        webcam_enabled=False,
        token=VIDEOSDK_TOKEN,
    )
    meeting = VideoSDK.init_meeting(**meeting_config)

    print("adding event listener...")
    meeting.add_event_listener(MyMeetingEventHandler())

    print("joining into meeting...")
    meeting.join()


if __name__ == "__main__":
    main()
    loop.run_forever()
```
Running the Code
To run the code, simply execute the script:

```sh
python face_landmarks.py
```
This script joins the meeting specified by MEETING_ID using the provided VIDEOSDK_TOKEN and NAME, and performs real-time face landmark detection on incoming video frames using Mediapipe.
Feel free to modify the face landmark detection logic inside the FaceLandMarkProcessor
class to adjust the detection parameters or apply additional processing.
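As an example of such a modification, the FaceMesh constructor and the drawing call expose several tuning knobs. The snippet below is a sketch of some alternatives, not a complete script: it assumes the `mp_face_mesh`, `mp_drawing`, `img`, and `face_landmarks` names from the code above, and the chosen values are illustrative.

```python
# Track up to four faces and refine landmarks around eyes and lips
face_mesh = mp_face_mesh.FaceMesh(
    static_image_mode=False,
    max_num_faces=4,
    refine_landmarks=True,
    min_detection_confidence=0.6,
    min_tracking_confidence=0.6,
)

# Draw only the face contours instead of the full tessellation
mp_drawing.draw_landmarks(
    image=img,
    landmark_list=face_landmarks,
    connections=mp_face_mesh.FACEMESH_CONTOURS,
    landmark_drawing_spec=None,
    connection_drawing_spec=mp_drawing.DrawingSpec(
        color=(255, 0, 0), thickness=2, circle_radius=1
    ),
)
```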
Output
Stuck anywhere? Check out this example code on GitHub.
API Reference
The API references for all the methods and events utilized in this guide are provided below.
Got a question? Ask us on Discord.