Posts

Document Orientation Correction

Another application of the approach explained in the article “Finding a Small Image Within a Larger Image Using Python and OpenCV” is to find a particular sign in a scanned ID card and make changes to the scan based on where that sign appears.

For example, here we gave the program a sample certificate that was scanned in the wrong orientation. The program looks for the sign that belongs in the upper right corner of the card and rotates the image in 90-degree steps until the sign is found there, so that the certificate is shown the right way up.

Sample code:

import cv2
import numpy as np


def find_image_in_larger_image(small_image, large_image):
    # Check that both images were loaded successfully
    if small_image is None or large_image is None:
        print("Error: Unable to load images.")
        return None

    # Get the dimensions of the large image
    large_height, large_width = large_image.shape

    # Find the template (small image) within the larger image
    result = cv2.matchTemplate(large_image, small_image, cv2.TM_CCOEFF_NORMED)

    # Define a threshold to consider a match
    threshold = 0.6

    # Find locations where the correlation coefficient reaches the threshold
    # (cast the boolean mask to uint8, a type findNonZero accepts)
    locations = cv2.findNonZero((result >= threshold).astype(np.uint8))

    # If no match is found
    if locations is None:
        return None

    # Classify each match by the quadrant of the large image it falls in
    matched_positions = []
    for loc in locations:
        x, y = loc[0]
        position_x = "left" if x < large_width / 2 else "right"
        position_y = "top" if y < large_height / 2 else "bottom"
        matched_positions.append((position_x, position_y))

    return matched_positions

# Example usage
small_image_path = "mark.jpg"
large_image_path = "card.jpg"

small_image = cv2.imread(small_image_path, cv2.IMREAD_GRAYSCALE)
large_image = cv2.imread(large_image_path, cv2.IMREAD_GRAYSCALE)
if small_image is None or large_image is None:
    raise SystemExit("Error: Unable to load images.")

rotated_large_image = large_image
max_rotations = 4  # Four 90-degree steps cover every orientation

for rotation_count in range(max_rotations):
    positions = find_image_in_larger_image(small_image, rotated_large_image)
    if positions:
        position_x, position_y = positions[0]
        print("Position: {}, {}".format(position_x, position_y))
        if position_x == "right" and position_y == "top":
            # The mark is in the upper right corner, so the card is upright
            cv2.imshow("Mark", small_image)
            cv2.imshow("Original", large_image)
            cv2.imshow("Result", rotated_large_image)
            cv2.waitKey(0)
            cv2.destroyAllWindows()
            break
    else:
        print("No match found after {} rotations.".format(rotation_count))
    # Try the next orientation
    rotated_large_image = cv2.rotate(rotated_large_image, cv2.ROTATE_90_CLOCKWISE)
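If the corner that holds the mark is already known, the number of quarter turns can also be looked up directly instead of rotating and searching again. The sketch below is only an illustration under that assumption: the lookup table and the function name are hypothetical, it uses NumPy's rot90 rather than cv2.rotate, and it presumes the mark can be located in a rotated scan at all (for example because the mark is symmetric or because rotated copies of the template are matched too).

```python
import numpy as np

# Hypothetical lookup: clockwise quarter turns that move each corner to the top right
CLOCKWISE_TURNS_TO_TOP_RIGHT = {
    ("right", "top"): 0,
    ("left", "top"): 1,
    ("left", "bottom"): 2,
    ("right", "bottom"): 3,
}


def rotate_mark_to_top_right(image, position):
    """Rotate `image` clockwise so that a mark found at `position` ends up top right."""
    turns = CLOCKWISE_TURNS_TO_TOP_RIGHT[position]
    # np.rot90 turns counterclockwise for positive k, so negate k for clockwise turns
    return np.rot90(image, k=-turns)


# Tiny demonstration: a bright pixel in the bottom left corner of a 4x6 "scan"
scan = np.zeros((4, 6), dtype=np.uint8)
scan[3, 0] = 255
upright = rotate_mark_to_top_right(scan, ("left", "bottom"))
print(upright[0, -1])  # the bright pixel is now in the top right corner
```

The position tuples use the same ("left"/"right", "top"/"bottom") format that find_image_in_larger_image returns.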

Click here to view on GitHub.

Finding a Small Image Within a Larger Image Using Python and OpenCV

In this post, we’ll explore an interesting application of image processing using the Python programming language and the OpenCV library. The topic of this post is finding a small image within a larger image. This application can be used for pattern recognition, image matching, or even face detection.

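First, make sure that OpenCV is installed in your Python environment. If you haven’t installed it yet, you can do so with the following command:

pip install opencv-python

On a server without a display you can install opencv-python-headless instead, but install only one of the two packages. The headless build has no GUI support, so the cv2.imshow call in the code below will not work with it.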

Now that OpenCV is installed, let’s delve into the Python code for finding an image within an image. The following code accomplishes this task:

import cv2
import numpy as np


def find_image_in_larger_image(small_image_path, large_image_path):
    # Read both images in grayscale
    small_image = cv2.imread(small_image_path, cv2.IMREAD_GRAYSCALE)
    large_image = cv2.imread(large_image_path, cv2.IMREAD_GRAYSCALE)

    # Check if images are loaded successfully
    if small_image is None or large_image is None:
        print("Error: Unable to load images.")
        return None

    # Get the dimensions of the small image (used to size the rectangles)
    small_height, small_width = small_image.shape

    # Find the template (small image) within the larger image
    result = cv2.matchTemplate(large_image, small_image, cv2.TM_CCOEFF_NORMED)

    # Define a threshold to consider a match
    threshold = 0.8

    # Find locations where the correlation coefficient reaches the threshold
    # (cast the boolean mask to uint8, a type findNonZero accepts)
    locations = cv2.findNonZero((result >= threshold).astype(np.uint8))

    # If no match is found
    if locations is None:
        print("No match found.")
        return None

    # Draw green rectangles around the matching areas on a color copy,
    # since a green outline cannot be shown on a single-channel image
    output_image = cv2.cvtColor(large_image, cv2.COLOR_GRAY2BGR)
    for loc in locations:
        x, y = (int(v) for v in loc[0])
        cv2.rectangle(output_image, (x, y), (x + small_width, y + small_height), (0, 255, 0), 2)

    # Display the result (optional)
    cv2.imshow("Result", output_image)
    cv2.waitKey(0)
    cv2.destroyAllWindows()

    return locations


# Example usage
small_image_path = "small_image.jpg"
large_image_path = "large_image.jpg"

find_image_in_larger_image(small_image_path, large_image_path)

This code finds occurrences of the small image within the larger image and draws rectangles around the matching areas.

To try it yourself, replace "small_image.jpg" and "large_image.jpg" with the paths to your own images and run the code.
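To get a feeling for what the score behind the threshold means, here is a slow, brute-force sketch of the normalized correlation coefficient that the TM_CCOEFF_NORMED mode is based on, written with NumPy only. It is meant as an illustration, not as a replacement for cv2.matchTemplate; the function name and the random test image are made up for this example.

```python
import numpy as np


def match_template_ccoeff_normed(image, template):
    """Brute-force map of normalized correlation coefficients (illustrative, slow)."""
    image = image.astype(np.float64)
    template = template.astype(np.float64)
    t_h, t_w = template.shape
    i_h, i_w = image.shape
    # Subtract the mean so that overall brightness does not affect the score
    t_centered = template - template.mean()
    t_norm = np.sqrt((t_centered ** 2).sum())
    result = np.zeros((i_h - t_h + 1, i_w - t_w + 1))
    for y in range(result.shape[0]):
        for x in range(result.shape[1]):
            patch = image[y:y + t_h, x:x + t_w]
            p_centered = patch - patch.mean()
            denom = t_norm * np.sqrt((p_centered ** 2).sum())
            # Flat patches have zero variance; their score is left at 0
            if denom > 0:
                result[y, x] = (p_centered * t_centered).sum() / denom
    return result


rng = np.random.default_rng(0)
large = rng.integers(0, 256, size=(40, 50)).astype(np.uint8)
small = large[12:20, 30:40].copy()  # template cut out at x=30, y=12
scores = match_template_ccoeff_normed(large, small)
best_y, best_x = np.unravel_index(np.argmax(scores), scores.shape)
print("best match at x={}, y={}".format(best_x, best_y))
```

Each score lies between -1 and 1, and an exact copy of the template scores 1. That is why a threshold such as 0.8 keeps only locations that resemble the template closely.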

Face Detection with HOG and MTCNN

Exploring Face Detection Techniques: HOG vs. MTCNN

Face detection is a fundamental task in computer vision with applications ranging from security systems to social media. Two popular methods for face detection are Histogram of Oriented Gradients (HOG) and Multi-task Cascaded Convolutional Networks (MTCNN). In this blog post, we’ll delve into both techniques, providing an overview of their principles and showcasing Python code for each.

Histogram of Oriented Gradients (HOG)

Understanding HOG

HOG is a feature descriptor widely used for object detection. It works by analyzing the distribution of gradients in an image, making it particularly effective for detecting objects with distinct shapes and textures.

How it Works

The HOG algorithm divides an image into small cells, computes the gradient orientation at each pixel, and builds a histogram of these orientations for every cell. The cell histograms are normalized over larger, overlapping blocks of cells and then concatenated to form the final feature vector, which is used for training a support vector machine (SVM) or another classifier.
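The steps above can be sketched in a few lines of NumPy. This is a simplified illustration, not the detector used in the accompanying code: it assigns each pixel to a single orientation bin (full implementations usually interpolate between neighboring bins), and the function names, the 8-pixel cells, the 9 bins, and the 2x2 blocks are assumptions chosen for the example.

```python
import numpy as np


def hog_cell_histograms(image, cell_size=8, n_bins=9):
    """Per-cell histograms of unsigned gradient orientation, weighted by magnitude."""
    image = image.astype(np.float64)
    # Finite-difference gradients along rows (gy) and columns (gx)
    gy, gx = np.gradient(image)
    magnitude = np.hypot(gx, gy)
    # Unsigned orientation in degrees, folded into the range 0 to 180
    orientation = np.degrees(np.arctan2(gy, gx)) % 180.0

    n_cells_y = image.shape[0] // cell_size
    n_cells_x = image.shape[1] // cell_size
    bin_width = 180.0 / n_bins
    histograms = np.zeros((n_cells_y, n_cells_x, n_bins))
    for cy in range(n_cells_y):
        for cx in range(n_cells_x):
            rows = slice(cy * cell_size, (cy + 1) * cell_size)
            cols = slice(cx * cell_size, (cx + 1) * cell_size)
            # Hard assignment of each pixel to one orientation bin
            bins = (orientation[rows, cols] // bin_width).astype(int) % n_bins
            histograms[cy, cx] = np.bincount(
                bins.ravel(), weights=magnitude[rows, cols].ravel(), minlength=n_bins
            )
    return histograms


def hog_feature_vector(image, cell_size=8, n_bins=9, block_size=2, eps=1e-6):
    """Concatenate L2-normalized histograms from overlapping blocks of cells."""
    cells = hog_cell_histograms(image, cell_size, n_bins)
    blocks = []
    for by in range(cells.shape[0] - block_size + 1):
        for bx in range(cells.shape[1] - block_size + 1):
            block = cells[by:by + block_size, bx:bx + block_size].ravel()
            blocks.append(block / np.sqrt((block ** 2).sum() + eps ** 2))
    return np.concatenate(blocks)


window = np.random.default_rng(0).random((128, 64))  # random stand-in for an image window
features = hog_feature_vector(window)
print(features.shape)  # (3780,)
```

With these settings, a 128x64 window gives 16x8 cells and 15x7 overlapping blocks of 36 values each. The resulting vector is what would be passed to the classifier.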

Python Code Example

We’ll begin by exploring HOG-based face detection using OpenCV in Python. The provided code loads an image, applies the HOG detector, and draws bounding boxes around detected faces.

View on GitHub

Multi-task Cascaded Convolutional Networks (MTCNN)

Understanding MTCNN

MTCNN is a deep learning-based face detection model designed to handle various face orientations and scales. It is trained jointly on three tasks (face classification, bounding box regression, and facial landmark localization) and runs as a cascade of three networks.

How it Works

MTCNN operates in a cascaded manner, with each stage refining the results of the previous one. The first stage (P-Net) proposes candidate face regions, the second stage (R-Net) rejects false positives and refines the bounding boxes, and the third stage (O-Net) produces the final boxes together with the facial landmarks. The combined information provides accurate face detection.

Python Code Example

Next, we’ll explore face detection using MTCNN with the help of the `mtcnn` library in Python. The code loads an image, applies the MTCNN detector, and displays the image with bounding boxes around detected faces.
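As a rough sketch of what such a script might look like (this is not the code from the repository), the example below filters the detections by confidence and converts each box to corner coordinates. It assumes the output format of the `mtcnn` package, where detect_faces returns a list of dictionaries with "box" as [x, y, width, height] and a "confidence" value. The function names, the 0.9 cutoff, and the image path are hypothetical.

```python
def detect_face_boxes(image_rgb, detector, min_confidence=0.9):
    """Return (x1, y1, x2, y2) corners for detections above `min_confidence`."""
    boxes = []
    for detection in detector.detect_faces(image_rgb):
        if detection["confidence"] < min_confidence:
            continue
        x, y, width, height = detection["box"]
        x2, y2 = x + width, y + height
        # Reported corners can fall slightly outside the image, so clamp at 0
        boxes.append((max(0, x), max(0, y), x2, y2))
    return boxes


def show_faces(image_path):
    """Load an image, run MTCNN on it, and display the detected boxes."""
    # Imported here so the helper above can be used without these packages
    import cv2
    from mtcnn import MTCNN

    image_bgr = cv2.imread(image_path)
    if image_bgr is None:
        raise FileNotFoundError(image_path)
    # The detector works on RGB arrays, while OpenCV loads images as BGR
    image_rgb = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2RGB)
    for x1, y1, x2, y2 in detect_face_boxes(image_rgb, MTCNN()):
        cv2.rectangle(image_bgr, (x1, y1), (x2, y2), (0, 255, 0), 2)
    cv2.imshow("Faces", image_bgr)
    cv2.waitKey(0)
    cv2.destroyAllWindows()
```

Calling show_faces("people.jpg") with a path to one of your own images opens a window with the detected faces outlined. Raising min_confidence trades missed faces for fewer false detections.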

View on GitHub

Choosing the Right Method

While both HOG and MTCNN are effective for face detection, each has its strengths and limitations. HOG is robust and computationally efficient, making it suitable for real-time applications. On the other hand, MTCNN excels in handling diverse face orientations and is well-suited for scenarios where faces may appear at different scales and angles.

In the accompanying Python code, we showcase how to implement both techniques and provide insights into their usage. Feel free to experiment with the code and explore which method best fits your specific use case.

Continue reading for a detailed walkthrough of the code, usage instructions, and a discussion on factors influencing detection accuracy.