Sum of absolute differences


In digital image processing, the sum of absolute differences (SAD) is a
measure of the similarity between image blocks.
It is calculated by taking the absolute difference between each pixel in the
original block and the corresponding pixel in the block being used for
comparison.
These differences are summed to create a simple metric of block similarity:
the \(L^{1}\) norm of the difference image,
or equivalently the Manhattan distance between
the two image blocks.
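
Written out, the definition is \(\mathrm{SAD}(A, B) = \sum_{i,j} |A_{ij} - B_{ij}|\).
A minimal C++ sketch of that sum follows, assuming each block is stored as a
flat, row-major list of integer pixel values; the name blockSad is a
hypothetical helper chosen for illustration, not part of any library.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <vector>

// Sum of absolute differences between two equally sized blocks,
// each stored as a flat, row-major list of pixel values.
// blockSad is a hypothetical helper name used only for this illustration.
long blockSad(const std::vector<int>& a, const std::vector<int>& b) {
    assert(a.size() == b.size());  // both blocks must have the same shape
    long sum = 0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        sum += std::abs(a[i] - b[i]);  // |a_i - b_i|, one term of the L1 norm
    }
    return sum;
}
```

A result of 0 means the blocks are identical; larger values mean the blocks
differ more.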

The sum of absolute differences may be used for a variety of purposes,
such as object recognition,
the generation of disparity maps for stereo images,
and motion estimation for video compression.

This example uses the sum of absolute differences to identify which part of a
search image is most similar to a template image.
In this example, the template image is 3 by 3 pixels in size,
while the search image is 3 by 5 pixels in size.
Each pixel is represented by a single integer from 0 to 9.

Template    Search image
 2 5 5       2 7 5 8 6
 4 0 7       1 7 4 2 7
 7 5 9       8 4 6 8 5

There are exactly three unique locations within the search image where the
template may fit: the left side of the image, the center of the image,
and the right side of the image. To calculate the SAD values,
the absolute value of the difference between each corresponding pair of pixels
is used: the difference between 2 and 2 is 0, 4 and 1 is 3, 7 and 8 is 1,
and so forth.

Calculating the values of the absolute differences for each pixel,
for the three possible template locations, gives the following:

Left    Center   Right
0 2 0   5 0 3    3 3 1
3 7 3   3 4 5    0 2 0
1 1 3   3 1 1    1 3 4

For each of these three image patches,
the 9 absolute differences are added together,
giving SAD values of 20, 25, and 17, respectively.
From these SAD values,
the right side of the search image is the most
similar to the template image,
because it has the lowest sum of absolute differences of the
three locations.

#include <algorithm>
#include <iostream>

#include <gtest/gtest.h>
#include <opencv2/core.hpp>

// Reproduces the worked example above with OpenCV and GoogleTest.
TEST(SAD, example)
{
    cv::Mat const templateImage = (cv::Mat_<int>(3, 3) <<
        2, 5, 5,
        4, 0, 7,
        7, 5, 9);
    std::cout << "templateImage " << templateImage << "\n";
    cv::Mat const searchImage = (cv::Mat_<int>(3, 5) <<
        2, 7, 5, 8, 6,
        1, 7, 4, 2, 7,
        8, 4, 6, 8, 5);
    std::cout << "searchImage " << searchImage << "\n";
    // The three 3x3 patches where the template fits (column ranges are half-open).
    cv::Mat const left = searchImage.colRange(0, 3);
    std::cout << "left " << left << "\n";
    cv::Mat const center = searchImage.colRange(1, 4);
    std::cout << "center " << center << "\n";
    cv::Mat const right = searchImage.colRange(2, 5);
    std::cout << "right " << right << "\n";
    // Per-pixel absolute differences between the template and the left patch.
    cv::Mat const leftSad0 = cv::abs(templateImage - left);
    // cv::sum returns one total per channel; channel 0 holds the SAD.
    double const leftSad = cv::sum(leftSad0)[0];
    std::cout << "leftSad " << leftSad << "\n";
    EXPECT_DOUBLE_EQ(20, leftSad);
    cv::Mat const centerSad0 = cv::abs(templateImage - center);
    double const centerSad = cv::sum(centerSad0)[0];
    std::cout << "centerSad " << centerSad << "\n";
    EXPECT_DOUBLE_EQ(25, centerSad);
    cv::Mat const rightSad0 = cv::abs(templateImage - right);
    double const rightSad = cv::sum(rightSad0)[0];
    std::cout << "rightSad " << rightSad << "\n";
    EXPECT_DOUBLE_EQ(17, rightSad);
    // The lowest SAD marks the most similar patch: the right one.
    double const theMostSimilar = std::min(
        std::min(leftSad, centerSad), rightSad);
    EXPECT_DOUBLE_EQ(rightSad, theMostSimilar);
}
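
The same search can be generalized to any template and search image sizes.
The following is a sketch in plain C++ without OpenCV; the names Image, Match,
and findBestMatch are hypothetical and chosen only for this illustration.
It tries every position where the template fits and keeps the one with the
lowest SAD, preferring the first position on ties.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <limits>
#include <vector>

using Image = std::vector<std::vector<int>>;  // row-major grid of pixel values

struct Match {
    std::size_t row;  // top-left row of the best-matching patch
    std::size_t col;  // top-left column of the best-matching patch
    long sad;         // SAD between the template and that patch
};

// Slide `tmpl` over every position where it fits inside `search`
// and return the position with the lowest SAD (first one wins on ties).
Match findBestMatch(const Image& search, const Image& tmpl) {
    assert(!tmpl.empty() && !tmpl[0].empty());
    assert(search.size() >= tmpl.size() && search[0].size() >= tmpl[0].size());
    Match best{0, 0, std::numeric_limits<long>::max()};
    for (std::size_t r = 0; r + tmpl.size() <= search.size(); ++r) {
        for (std::size_t c = 0; c + tmpl[0].size() <= search[0].size(); ++c) {
            long sad = 0;
            for (std::size_t i = 0; i < tmpl.size(); ++i) {
                for (std::size_t j = 0; j < tmpl[0].size(); ++j) {
                    sad += std::abs(tmpl[i][j] - search[r + i][c + j]);
                }
            }
            if (sad < best.sad) {
                best = Match{r, c, sad};
            }
        }
    }
    return best;
}
```

Applied to the template and search image from the example, this returns
row 0, column 2 with a SAD of 17, which is the right-hand patch.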

Comparison to other metrics

Object recognition

The sum of absolute differences provides a simple way to automate the
search for objects inside an image,
but may be unreliable due to the effects of contextual factors
such as changes in lighting, color, viewing direction, size, or shape.
The SAD may be used in conjunction with other object recognition methods,
such as edge detection,
to improve the reliability of results.
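
The sensitivity to lighting can be illustrated with a small sketch.
It assumes a uniform brightness offset as a crude stand-in for a lighting
change and does not clamp pixel values; flatSad and brighten are hypothetical
helper names. Adding an offset k to every pixel of an otherwise identical
block of N pixels raises the SAD from 0 to k times N.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <vector>

// SAD of two equally sized blocks stored as flat pixel lists.
long flatSad(const std::vector<int>& a, const std::vector<int>& b) {
    assert(a.size() == b.size());
    long sum = 0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        sum += std::abs(a[i] - b[i]);
    }
    return sum;
}

// Return a copy of `block` with `offset` added to every pixel,
// an assumed, simplified model of a uniform change in lighting.
std::vector<int> brighten(std::vector<int> block, int offset) {
    for (int& pixel : block) {
        pixel += offset;
    }
    return block;
}
```

With the 3 by 3 template from the earlier example, brightening it by 3 gives
a SAD of 27 against the original, which is worse than the SAD of 17 scored by
the unrelated right-hand patch.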

Video compression

SAD is an extremely fast metric due to its simplicity;
it is effectively the simplest possible metric that takes into account every
pixel in a block.
Therefore it is very effective for a wide motion search of many different
blocks.
SAD is also easily parallelizable since it analyzes each pixel separately,
making it easily implementable with SIMD instruction sets such as ARM NEON or x86 SSE2.
For example, SSE has a packed sum of absolute differences instruction (PSADBW)
specifically for this purpose.
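
As a hedged illustration, PSADBW is exposed to C and C++ compilers as the
SSE2 intrinsic _mm_sad_epu8. The sketch below computes the SAD of one row of
16 unsigned 8-bit pixels; sad16 is a hypothetical helper name, and the scalar
fallback is used whenever the compiler does not define __SSE2__.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>
#if defined(__SSE2__)
#include <emmintrin.h>  // SSE2 intrinsics, including _mm_sad_epu8 (PSADBW)
#endif

// SAD of two rows of 16 unsigned 8-bit pixels.
// sad16 is a hypothetical helper name used only for this illustration.
unsigned sad16(const std::uint8_t* a, const std::uint8_t* b) {
    assert(a != nullptr && b != nullptr);
#if defined(__SSE2__)
    // Unaligned loads, so the rows may start at any address.
    const __m128i va = _mm_loadu_si128(reinterpret_cast<const __m128i*>(a));
    const __m128i vb = _mm_loadu_si128(reinterpret_cast<const __m128i*>(b));
    // PSADBW yields two partial sums, one per 8-byte half,
    // stored in the low 16 bits of each 64-bit lane.
    const __m128i partial = _mm_sad_epu8(va, vb);
    return static_cast<unsigned>(_mm_cvtsi128_si32(partial)) +
           static_cast<unsigned>(_mm_extract_epi16(partial, 4));
#else
    // Portable scalar fallback with the same result.
    unsigned sum = 0;
    for (int i = 0; i < 16; ++i) {
        sum += static_cast<unsigned>(std::abs(int(a[i]) - int(b[i])));
    }
    return sum;
#endif
}
```

A block SAD would call such a helper once per row and add up the results.
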
Once candidate blocks are found,
the final refinement of the motion estimation process is often done with other
slower but more accurate metrics,
which better take into account human perception.
These include the
sum of absolute transformed differences (SATD),
the sum of squared differences (SSD),
and rate-distortion optimization.
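
For comparison, here is a sketch of SSD and SATD on 4 by 4 blocks.
It assumes an unnormalized 4x4 Hadamard transform for SATD; real encoders
typically apply their own scaling, so absolute values may differ from theirs.
Block4, ssd4x4, and satd4x4 are hypothetical names chosen for illustration.

```cpp
#include <array>
#include <cassert>
#include <cstdlib>

using Block4 = std::array<std::array<int, 4>, 4>;  // 4x4 block of pixel values

// Sum of squared differences: like SAD, but each difference is squared,
// so large errors are penalized more heavily than small ones.
long ssd4x4(const Block4& a, const Block4& b) {
    long sum = 0;
    for (int i = 0; i < 4; ++i) {
        for (int j = 0; j < 4; ++j) {
            const long d = a[i][j] - b[i][j];
            sum += d * d;
        }
    }
    return sum;
}

// Unnormalized SATD: transform the difference block D with a 4x4 Hadamard
// matrix H (computing H * D * H^T), then sum the absolute coefficients.
long satd4x4(const Block4& a, const Block4& b) {
    static const int H[4][4] = {{1, 1, 1, 1},
                                {1, -1, 1, -1},
                                {1, 1, -1, -1},
                                {1, -1, -1, 1}};
    long t[4][4];  // t = H * D
    for (int i = 0; i < 4; ++i) {
        for (int j = 0; j < 4; ++j) {
            t[i][j] = 0;
            for (int k = 0; k < 4; ++k) {
                t[i][j] += H[i][k] * static_cast<long>(a[k][j] - b[k][j]);
            }
        }
    }
    long sum = 0;
    for (int i = 0; i < 4; ++i) {
        for (int j = 0; j < 4; ++j) {
            long c = 0;  // coefficient (i, j) of t * H^T
            for (int k = 0; k < 4; ++k) {
                c += t[i][k] * H[j][k];
            }
            sum += std::labs(c);
        }
    }
    return sum;
}
```

The extra multiplications and the transform are what make these metrics
slower than SAD, which needs only a subtraction, an absolute value, and an
addition per pixel.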