-
Obtain binary image. Load the image, convert to grayscale, then Otsu’s threshold to obtain a binary black/white image.
-
Detect and remove horizontal lines. To detect horizontal lines, we create a special horizontal kernel and morph open to detect horizontal contours. From here we find contours on the mask and “fill in”
the detected horizontal contours with white to effectively remove the lines -
Repair image. At this point the image may have gaps if the horizontal lines intersected through characters. To repair the text, we create a vertical kernel and morph close to reverse the damage
After converting to grayscale, we Otsu’s threshold to obtain a binary image
image = cv2.imread('1.png')
gray = cv2.cvtColor(image,cv2.COLOR_BGR2GRAY)
thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]
Next we create a special horizontal kernel to detect horizontal lines. We draw these lines onto a mask and then find contours on the mask. To remove the lines, we fill in the contours with white
Detected lines
Mask
Filled in contours
# Remove horizontal
horizontal_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (25,1))
detected_lines = cv2.morphologyEx(thresh, cv2.MORPH_OPEN, horizontal_kernel, iterations=2)
cnts = cv2.findContours(detected_lines, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]
for c in cnts:
cv2.drawContours(image, [c], -1, (255,255,255), 2)
The image currently has gaps. To fix this, we construct a vertical kernel to repair the image
# Repair image
repair_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (1,6))
result = 255 - cv2.morphologyEx(255 - image, cv2.MORPH_CLOSE, repair_kernel, iterations=1)
Note depending on the image, the size of the kernel will change. You can think of the kernel as
(horizontal, vertical)
. For instance, to detect longer lines, we could use a(50,1)
kernel instead. If we wanted thicker lines, we could increase the 2nd parameter to say(50,2)
.
Here’s the results with the other images
Detected lines
Original ->
Removed
Detected lines
Original ->
Removed
Full code
import cv2
image = cv2.imread('1.png')
gray = cv2.cvtColor(image,cv2.COLOR_BGR2GRAY)
thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]
# Remove horizontal
horizontal_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (25,1))
detected_lines = cv2.morphologyEx(thresh, cv2.MORPH_OPEN, horizontal_kernel, iterations=2)
cnts = cv2.findContours(detected_lines, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]
for c in cnts:
cv2.drawContours(image, [c], -1, (255,255,255), 2)
# Repair image
repair_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (1,6))
result = 255 - cv2.morphologyEx(255 - image, cv2.MORPH_CLOSE, repair_kernel, iterations=1)
cv2.imshow('thresh', thresh)
cv2.imshow('detected_lines', detected_lines)
cv2.imshow('image', image)
cv2.imshow('result', result)
cv2.waitKey()