Building a face detector with OpenCV in C++


Hi everyone! In this blog post, I will explain how to build a face detection algorithm with the machine learning components in OpenCV. We will use OpenCV to read an image from a camera and detect faces in it. The result will look like this.

Face detection algorithm drawing rectangles around faces

You can find all the code for this blog post on my GitHub.

Installing OpenCV

We will use some rather new parts of OpenCV and its opencv_contrib modules. The most convenient way to make sure you have access to these modules is to build OpenCV from source. I used OpenCV version 4.2.0 on Ubuntu 16.04. For your convenience, I included a bash script that installs the correct OpenCV version together with all necessary dependencies. The script is in the accompanying GitHub repo.

The dnn module, which provides the cv::dnn::Net class we will be using, has been part of the main OpenCV repository since version 3.3, so versions older than 4.2.0 might also work. However, I did not test this.

CMake setup

We will build our code with CMake. For this, we create a CMake project with a single executable and set the C++ standard to 14.

# The cxx_std_14 compile feature requires CMake 3.8 or newer
cmake_minimum_required(VERSION 3.8)
project(OpenCVFaceDetector LANGUAGES CXX)

add_executable(${PROJECT_NAME} main.cpp)
target_compile_features(${PROJECT_NAME} PUBLIC cxx_std_14)
target_include_directories(${PROJECT_NAME} PRIVATE include)

Then we take care of the OpenCV dependency. We find the OpenCV package and link our executable against it.

# OpenCV setup
find_package(OpenCV REQUIRED)
target_link_libraries(${PROJECT_NAME} ${OpenCV_LIBS})

The whole CMakeLists.txt file should look like this.

# The cxx_std_14 compile feature requires CMake 3.8 or newer
cmake_minimum_required(VERSION 3.8)
project(OpenCVFaceDetector LANGUAGES CXX)

add_executable(${PROJECT_NAME} main.cpp)
target_compile_features(${PROJECT_NAME} PUBLIC cxx_std_14)
target_include_directories(${PROJECT_NAME} PRIVATE include)

# OpenCV setup
find_package(OpenCV REQUIRED)
target_link_libraries(${PROJECT_NAME} ${OpenCV_LIBS})

Getting an image from the camera

The first thing we have to do is get a camera image to work with. Luckily, the cv::VideoCapture class makes this easy.

We include the OpenCV header to have access to OpenCV’s functionality. Next, we create a cv::VideoCapture object and try to open the first camera we can find.

#include <opencv2/opencv.hpp>

int main(int argc, char **argv) {

    cv::VideoCapture video_capture;
    if (!video_capture.open(0)) {
        // Signal failure to the caller if no camera could be opened
        return 1;
    }

Afterwards, we create a cv::Mat to hold the frame and display it in an infinite loop. If the user presses ‘Esc’ we break the loop, destroy the display window and release the video capture.

    cv::Mat frame;
    while (true) {
        video_capture >> frame;
        // Stop if the camera did not deliver a frame
        if (frame.empty()) {
            break;
        }

        cv::imshow("Image", frame);
        const int esc_key = 27;
        if (cv::waitKey(10) == esc_key) {
            break;
        }
    }

    cv::destroyAllWindows();
    video_capture.release();

    return 0;
}
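
The exit check can be pulled into a small helper, which makes it testable without a camera. The sketch below is plain C++. The names is_exit_key and kEscKey are made up for this illustration and are not part of OpenCV. It assumes that cv::waitKey returns a negative value when no key was pressed, and it compares only the lowest byte of the key code, a common precaution because some GUI backends have been reported to set additional high bits.

```cpp
// Hypothetical helper for illustration, not part of OpenCV.
constexpr int kEscKey = 27;  // ASCII code of the Esc key

// Returns true if a value returned by cv::waitKey means "Esc was pressed".
inline bool is_exit_key(int key_code) {
    if (key_code < 0) {
        // Assumption: a negative value means no key was pressed in time
        return false;
    }
    // Compare only the lowest byte of the key code
    return (key_code & 0xFF) == kEscKey;
}
```

Inside the loop, the check would then read if (is_exit_key(cv::waitKey(10))) { break; }.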

So far the main.cpp file will look like the following.

#include <opencv2/opencv.hpp>

int main(int argc, char **argv) {

    cv::VideoCapture video_capture;
    if (!video_capture.open(0)) {
        // Signal failure to the caller if no camera could be opened
        return 1;
    }

    cv::Mat frame;
    while (true) {
        video_capture >> frame;
        // Stop if the camera did not deliver a frame
        if (frame.empty()) {
            break;
        }

        cv::imshow("Image", frame);
        const int esc_key = 27;
        if (cv::waitKey(10) == esc_key) {
            break;
        }
    }

    cv::destroyAllWindows();
    video_capture.release();

    return 0;
}

We can now display images captured from the camera :-)

Captured camera image

Using the cv::dnn::Net class to load a pre-trained SSD face detection network

Now we’ll start building a face detector. We use the cv::dnn::Net class and load weights from a pre-trained Caffe model.

Since it’s nice to have all functionality in one place, we create a class FaceDetector for the model. First, we create two new files, src/FaceDetector.cpp and include/FaceDetector.h. To make sure our code still builds, we add the implementation file to our CMake target. That is, go to your CMakeLists.txt and change the line containing add_executable(...) to look like this (note that main.cpp now lives in src/ as well):

add_executable(${PROJECT_NAME} src/main.cpp src/FaceDetector.cpp)

In include/FaceDetector.h we define this class. The class has a constructor in which we will load the model weights. Additionally, it has a method

std::vector<cv::Rect> detect_face_rectangles(const cv::Mat &frame)

that takes an input image and gives us a vector of detected faces.

#ifndef VISUALS_FACEDETECTOR_H
#define VISUALS_FACEDETECTOR_H
#include <vector>
#include <opencv2/dnn.hpp>

class FaceDetector {
public:
    explicit FaceDetector();

    /// Detect faces in an image frame
    /// \param frame Image to detect faces in
    /// \return Vector of detected faces
    std::vector<cv::Rect> detect_face_rectangles(const cv::Mat &frame);

We save the actual network in a private member variable. In addition to the model, we will also save:

  • input_image_width_ and input_image_height_: the size the image is resized to before it is fed to the network
  • scale_factor_: the factor the pixel values are multiplied by when converting the image to a data blob
  • mean_values_: the mean values for each channel the network was trained with. These values will be subtracted from the image when transforming the image to a data blob.
  • confidence_threshold_: the confidence threshold to use when detecting faces. The model will supply a confidence value for each detected face. Faces with a confidence value >= confidence_threshold_ will be kept. All other faces are discarded.

private:
    /// Face detection network
    cv::dnn::Net network_;
    /// Input image width
    const int input_image_width_;
    /// Input image height
    const int input_image_height_;
    /// Scale factor when creating image blob
    const double scale_factor_;
    /// Mean normalization values network was trained with
    const cv::Scalar mean_values_;
    /// Face detection confidence threshold
    const float confidence_threshold_;

};


#endif //VISUALS_FACEDETECTOR_H

The full header file is here.
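
To make the roles of scale_factor_ and mean_values_ more concrete, here is a minimal sketch of the arithmetic applied to a single pixel when the blob is created: the mean is subtracted first, then the result is multiplied by the scale factor. The helper normalize_pixel is a hypothetical name used only for this illustration. The sketch also assumes the channels are in OpenCV's default BGR order.

```cpp
#include <array>
#include <cstddef>

// Hypothetical helper, for illustration only. It applies
// (value - mean) * scale to each channel of one pixel.
// Assumption: channels are stored in BGR order.
inline std::array<double, 3> normalize_pixel(const std::array<double, 3> &bgr,
                                             const std::array<double, 3> &mean,
                                             double scale) {
    std::array<double, 3> result{};
    for (std::size_t c = 0; c < bgr.size(); ++c) {
        result[c] = (bgr[c] - mean[c]) * scale;
    }
    return result;
}
```

With the mean values (104, 177, 123) and a scale factor of 1.0, a pixel with exactly these channel values maps to (0, 0, 0).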

Next, let’s implement the functions we declared above. We start with the constructor, which initializes the member variables: a 300x300 network input, a scale factor of 1.0, per-channel mean values of (104, 177, 123), and a confidence threshold of 0.5.

#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>
#include "FaceDetector.h"
#include <opencv2/opencv.hpp>

// Initializers are listed in the order the members are declared in the header
FaceDetector::FaceDetector() :
    input_image_width_(300),
    input_image_height_(300),
    scale_factor_(1.0),
    mean_values_(104.0, 177.0, 123.0),
    confidence_threshold_(0.5) {

Inside the constructor we will use cv::dnn::readNetFromCaffe to load the model into our network_ variable. cv::dnn::readNetFromCaffe takes two files to construct the model. The first (deploy.prototxt) is the model configuration, which describes the model architecture. The second (res10_300x300_ssd_iter_140000_fp16.caffemodel) is the binary data for the model weights.

We could move these files to the directory that contains our binary after building. But this solution is rather fragile, because it breaks when the binary moves. Thus we pass in the file location via CMake.

A quick jump back to our CMake configuration

In this Stack Overflow post I found a nice way to pass a file path to C++. The answer recommends passing the path as a compile definition on the target. That way CMake resolves the full path of the file and hands it to the compiler as a preprocessor macro, which we can then use in our C++ code.

That is, we add the following lines to our CMakeLists.txt.

# Introduce preprocessor variables to keep paths of asset files
set(FACE_DETECTION_CONFIGURATION
"${PROJECT_SOURCE_DIR}/assets/deploy.prototxt")

set(FACE_DETECTION_WEIGHTS
"${PROJECT_SOURCE_DIR}/assets/res10_300x300_ssd_iter_140000_fp16.caffemodel")

target_compile_definitions(${PROJECT_NAME} PRIVATE 
FACE_DETECTION_CONFIGURATION="${FACE_DETECTION_CONFIGURATION}")

target_compile_definitions(${PROJECT_NAME} PRIVATE 
FACE_DETECTION_WEIGHTS="${FACE_DETECTION_WEIGHTS}")
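
On the C++ side, these definitions arrive as ordinary preprocessor macros that expand to string literals. The following sketch shows one way to consume them. The #ifndef fallbacks and the helper names configuration_path and weights_path are assumptions added for this illustration. The fallbacks only keep the snippet compilable outside the CMake project.

```cpp
#include <string>

// Fallbacks so that this sketch compiles on its own. In the real build,
// CMake defines both macros via target_compile_definitions.
#ifndef FACE_DETECTION_CONFIGURATION
#define FACE_DETECTION_CONFIGURATION "assets/deploy.prototxt"
#endif
#ifndef FACE_DETECTION_WEIGHTS
#define FACE_DETECTION_WEIGHTS "assets/res10_300x300_ssd_iter_140000_fp16.caffemodel"
#endif

// Hypothetical helpers that wrap the macros in std::string objects.
inline std::string configuration_path() {
    return std::string(FACE_DETECTION_CONFIGURATION);
}

inline std::string weights_path() {
    return std::string(FACE_DETECTION_WEIGHTS);
}
```

Wrapping the macros in functions keeps the preprocessor usage in one place, so the rest of the code only deals with regular strings.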

Finishing the methods in FaceDetector.cpp

Now that we found a way to access the necessary files, we can construct the model.

FaceDetector::FaceDetector() :
    input_image_width_(300),
    input_image_height_(300),
    scale_factor_(1.0),
    mean_values_(104.0, 177.0, 123.0),
    confidence_threshold_(0.5) {
    // Note: The macros FACE_DETECTION_CONFIGURATION
    // and FACE_DETECTION_WEIGHTS are passed in via CMake
    network_ = cv::dnn::readNetFromCaffe(FACE_DETECTION_CONFIGURATION,
            FACE_DETECTION_WEIGHTS);

    if (network_.empty()) {
        std::ostringstream ss;
        ss << "Failed to load network with the following settings:\n"
           << "Configuration: " + std::string(FACE_DETECTION_CONFIGURATION) + "\n"
           << "Binary: " + std::string(FACE_DETECTION_WEIGHTS) + "\n";
        throw std::invalid_argument(ss.str());
    }
}
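
A missing or misspelled asset path is one likely reason for the load to fail, so it can help to check the files before handing them to OpenCV. This is an optional sketch in plain C++. The helper file_is_readable is a hypothetical name and not an OpenCV function.

```cpp
#include <fstream>
#include <string>

// Hypothetical helper: returns true if the file at `path`
// can be opened for reading.
inline bool file_is_readable(const std::string &path) {
    std::ifstream file(path, std::ios::binary);
    return file.good();
}
```

Calling it for FACE_DETECTION_CONFIGURATION and FACE_DETECTION_WEIGHTS at the start of the constructor would let us report which of the two files is missing.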

The next step is to implement detect_face_rectangles. We start by transforming the input image into a data blob. The function cv::dnn::blobFromImage takes care of rescaling the image to the correct input size for the network. It also subtracts the mean value in each color channel.

std::vector<cv::Rect> FaceDetector::detect_face_rectangles(const cv::Mat &frame) {
    cv::Mat input_blob = cv::dnn::blobFromImage(frame,
            scale_factor_,
            cv::Size(input_image_width_, input_image_height_),
            mean_values_,
            false,   // swapRB: do not swap the red and blue channels
            false);  // crop: resize without cropping

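Next, we forward our data through the network. The output is a four-dimensional blob whose last two dimensions hold one row per detection, so we wrap it in a two-dimensional matrix, detection_matrix, without copying the data.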

    network_.setInput(input_blob, "data");
    cv::Mat detection = network_.forward("detection_out");
    cv::Mat detection_matrix(detection.size[2],
            detection.size[3],
            CV_32F,
            detection.ptr<float>());

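We iterate through the rows of the matrix. Each row contains one detection: column 2 holds the confidence value, and columns 3 to 6 hold the corners of the bounding box as fractions of the frame width and height. While iterating, we check if the confidence value reaches our threshold. If so, we construct a cv::Rect and save it in the result vector faces.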

    std::vector<cv::Rect> faces;

    for (int i = 0; i < detection_matrix.rows; i++) {
        float confidence = detection_matrix.at<float>(i, 2);

        if (confidence < confidence_threshold_) {
            continue;
        }
        // The corners are given as fractions of the frame size, so we
        // scale them to pixels. The origin is the top-left corner of the frame.
        int x_left_top = static_cast<int>(
                detection_matrix.at<float>(i, 3) * frame.cols);

        int y_left_top = static_cast<int>(
                detection_matrix.at<float>(i, 4) * frame.rows);

        int x_right_bottom = static_cast<int>(
                detection_matrix.at<float>(i, 5) * frame.cols);

        int y_right_bottom = static_cast<int>(
                detection_matrix.at<float>(i, 6) * frame.rows);

        faces.emplace_back(x_left_top,
                y_left_top,
                (x_right_bottom - x_left_top),
                (y_right_bottom - y_left_top));
    }

    return faces;
}
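
The row layout is easier to see without the OpenCV types. The following standalone sketch assumes that each detection consists of seven consecutive floats (image id, class id, confidence, x1, y1, x2, y2) with the corner coordinates given as fractions of the frame size, which matches the indices used above. Box, to_pixel and decode_detections are hypothetical names. Clamping the corners to the frame is an addition that the method above does not perform.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for cv::Rect: top-left corner plus size, in pixels.
struct Box {
    int x;
    int y;
    int width;
    int height;
};

// Scales a normalized coordinate to pixels and clamps it to [0, limit].
inline int to_pixel(float normalized, int limit) {
    const int pixel = static_cast<int>(normalized * limit);
    return std::max(0, std::min(limit, pixel));
}

// Decodes a flat buffer with 7 floats per detection into pixel boxes.
// Detections with a confidence below the threshold are skipped.
inline std::vector<Box> decode_detections(const std::vector<float> &data,
                                          int frame_width,
                                          int frame_height,
                                          float confidence_threshold) {
    const std::size_t row_length = 7;
    std::vector<Box> boxes;
    for (std::size_t row = 0; row + row_length <= data.size(); row += row_length) {
        if (data[row + 2] < confidence_threshold) {
            continue;
        }
        const int x1 = to_pixel(data[row + 3], frame_width);
        const int y1 = to_pixel(data[row + 4], frame_height);
        const int x2 = to_pixel(data[row + 5], frame_width);
        const int y2 = to_pixel(data[row + 6], frame_height);
        boxes.push_back(Box{x1, y1, x2 - x1, y2 - y1});
    }
    return boxes;
}
```

Clamping keeps boxes that extend slightly past the image border inside the frame, which can be useful if the rectangles are later used to crop the image.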

That concludes our implementation of FaceDetector. Click this link for the full .cpp file.

Visualizing detected faces

Since we implemented the face detector as a class, visualizing the rectangles is easy. First, we include the FaceDetector.h header file. Then, we create a FaceDetector object and call the detect_face_rectangles method on every frame. Finally, we use OpenCV’s cv::rectangle function to draw a rectangle around each detected face.

#include <opencv2/opencv.hpp>
#include "FaceDetector.h"

int main(int argc, char **argv) {

    cv::VideoCapture video_capture;
    if (!video_capture.open(0)) {
        // Signal failure to the caller if no camera could be opened
        return 1;
    }

    FaceDetector face_detector;

    cv::Mat frame;
    while (true) {
        video_capture >> frame;
        // Stop if the camera did not deliver a frame
        if (frame.empty()) {
            break;
        }

        auto rectangles = face_detector.detect_face_rectangles(frame);
        cv::Scalar color(0, 105, 205);
        int frame_thickness = 4;
        for (const auto &r : rectangles) {
            cv::rectangle(frame, r, color, frame_thickness);
        }
        cv::imshow("Image", frame);
        const int esc_key = 27;
        if (cv::waitKey(10) == esc_key) {
            break;
        }
    }

    cv::destroyAllWindows();
    video_capture.release();

    return 0;
}

If we run this, we see a rectangle around Beethoven’s face!

Face detection algorithm in action

Wrap-up

This concludes our post about face detection in OpenCV. We saw how we can grab the camera image and find faces in it using a pre-trained SSD network in OpenCV.

Follow me on Twitter (@bewagner_) for more content on C++ and machine learning!

See you soon!
