img2txt

I stumbled on a picture drawn with Braille Unicode characters on the internet. It looked fun, so I decided to make one myself.

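Plan

Keep only the outlines of the image.
Adjust the resolution so the output does not get cut off.
Represent each 4x2 block of pixels (4 rows by 2 columns) as one Braille Unicode character (U+2800 to U+28FF).

Of course, I did not actually start with a plan..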

Doraemon

I'm going to turn our blue raccoon dog into Braille.

Step one

This step extracts only the outlines of the image. There are several ways to do it, and I picked the most intuitive one.

import cv2 as cv

# Open image file
img_path = './doraemon.jpeg'
img = cv.imread(img_path, cv.IMREAD_GRAYSCALE)

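# Extract only edges
# pyrDown halves the image and pyrUp doubles it again, which blurs it.
# dstsize keeps the blurred copy the same size as img even when img has odd dimensions.
blurred_img = cv.pyrUp(cv.pyrDown(img), dstsize=(img.shape[1], img.shape[0]))
edged_img = cv.subtract(img, blurred_img)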

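If you forcibly enlarge a photo in Paint, it sometimes loses its sharpness and looks blurred. The edge extraction used here relies on the same principle: `cv.pyrDown` shrinks the image and `cv.pyrUp` scales it back up, which leaves a blurred copy.

In the original `img`, every outline is sharp. In `blurred_img` the outlines smear, so the pixel values along the outlines drop below the original. Meanwhile, areas that were filled with a single color barely differ from the original, because blurring only mixes each pixel with neighbors that already had the same value.

Subtracting `blurred_img` from `img` element-wise therefore gives a difference greater than 0 only at the outlines, where the pixel values changed a lot, so only the outlines remain. (`cv.subtract` saturates, so negative differences are clipped to 0.)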
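To make the idea concrete, here is a minimal sketch on a single row of pixels. It is not the OpenCV pipeline: a plain 3-tap mean blur (the hypothetical helper `box_blur`) stands in for `cv.pyrUp(cv.pyrDown(img))`, and `saturating_subtract` imitates how `cv.subtract` clips at 0 on 8-bit images.

```python
def box_blur(row):
    """Blur a 1-D row with a 3-tap mean, repeating the edge pixels."""
    padded = [row[0]] + list(row) + [row[-1]]
    return [sum(padded[i:i + 3]) // 3 for i in range(len(row))]


def saturating_subtract(a, b):
    """Element-wise a - b, clipped at 0 like cv.subtract on uint8 images."""
    return [max(x - y, 0) for x, y in zip(a, b)]


# A flat bright area followed by a flat dark area: one outline in the middle.
row = [200, 200, 200, 200, 50, 50, 50, 50]
blurred = box_blur(row)
edges = saturating_subtract(row, blurred)

print(blurred)  # [200, 200, 200, 150, 100, 50, 50, 50]
print(edges)    # [0, 0, 0, 50, 0, 0, 0, 0]
```

The flat areas cancel out completely, and only the pixel on the bright side of the outline survives the subtraction.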

Figure 1 and Figure 2 show the intermediate results of each processing step.

Step two

Braille Unicode

A Braille Unicode character holds two dots across and four dots down.

Braille Table - https://www.unicode.org/charts/PDF/U2800.pdf

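The link above is the character table from U+2800 onward, provided by unicode.org. The character changes depending on whether each of the 8 dots is filled or not. Naturally, there are `2⁸ = 256` combinations.

The rule for which dots get filled is fixed as well. Following the numbering in Figure 3, dots 1 through 8 correspond to 2⁰, 2¹, 2², 2³, 2⁴, 2⁵, 2⁶, 2⁷, and adding the values of the filled dots to U+2800 gives the code point of the character. For example:

To get the Braille character with dots 1, 2, 4, 6 and 7 filled, as above, add 2⁰, 2¹, 2³, 2⁵ and 2⁶ to `U+2800`.

That is, `2⁰ + 2¹ + 2³ + 2⁵ + 2⁶ = 1 + 2 + 8 + 32 + 64 = 107`, which is `6B` in hexadecimal, and adding it to `2800` gives `286B`, the code point of the character we wanted.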
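The same arithmetic can be checked in a few lines of Python. This is just a sketch: `dots_to_braille` and `BRAILLE_BASE` are names I made up for illustration, and the function takes dot numbers 1 to 8 as numbered in the Unicode chart.

```python
BRAILLE_BASE = 0x2800  # code point of the empty Braille pattern


def dots_to_braille(dots):
    """Return the Braille character whose filled dots are the given numbers (1 to 8)."""
    code = BRAILLE_BASE
    for dot in dots:
        if not 1 <= dot <= 8:
            raise ValueError(f"dot number must be 1..8, got {dot}")
        code |= 1 << (dot - 1)  # dot n contributes 2^(n-1)
    return chr(code)


char = dots_to_braille([1, 2, 4, 6, 7])
print(hex(ord(char)))  # 0x286b
```

Passing an empty list gives U+2800 (no dots), and passing all eight dot numbers gives U+28FF.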

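Adjusting the image resolution

So each Braille character covers a block 4 pixels tall and 2 pixels wide. In other words, the width x height of the finished Braille text, measured in pixels, will be (a multiple of 2) x (a multiple of 4). If the width of the source image is not a multiple of 2, or its height is not a multiple of 4, part of the image may get cut off. Resizing the image so that the width and height are multiples of 2 and 4 respectively lets us keep all of the pixel information.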

# cv.resize takes (width, height), while img.shape is (rows, cols)
resized_shape = round(img.shape[1] / 2) * 2, round(img.shape[0] / 4) * 4
resized_img = cv.resize(edged_img, resized_shape)

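Dividing by 2 and 4, rounding, and then multiplying by 2 and 4 again gives a resolution that stays close to the original size while being a multiple of 2 and 4.

In Figure 4 the resolution has been, well, adjusted. The width and height of the original were already multiples of 2 and 4, so the size did not actually change. Either way, keep in mind that this step lets other pictures be handled cleanly too.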
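Here is a small sketch of the size calculation on its own, without OpenCV. `snap_size` is a hypothetical helper name, the example sizes are invented, and the `max(1, ...)` guard is my own addition so that a tiny image never ends up with a size of 0.

```python
def snap_size(width, height):
    """Return (width, height) rounded to the nearest multiples of 2 and 4."""
    cols = max(1, round(width / 2))   # number of Braille characters per line
    rows = max(1, round(height / 4))  # number of text lines
    return cols * 2, rows * 4


print(snap_size(640, 480))  # (640, 480)
print(snap_size(101, 150))  # (100, 152)
```

Note that Python's `round` rounds exact halves to the nearest even integer, so 50.5 becomes 50 and 37.5 becomes 38. For this purpose either direction is fine, since the result is a multiple of 2 and 4 in both cases.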

Step three

Loop over the image, check whether each pixel of a 4x2 block exceeds a threshold and assign it 0 or 1, then find the Braille code that matches the block, as seen in Figure 3.

def get_number(window, threshold):
    ct = 0
    if window[0,0] > threshold:
        ct += 1
    if window[1,0] > threshold:
        ct += 2
    if window[2, 0] > threshold:
        ct += 4
    if window[0, 1] > threshold:
        ct += 8
    if window[1, 1] > threshold:
        ct += 16
    if window[2, 1] > threshold:
        ct += 32
    if window[3, 0] > threshold:
        ct += 64
    if window[3, 1] > threshold:
        ct += 128
    return chr(0x2800 + ct)

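If the pixel at a dot position from Figure 3 is greater than `threshold`, the dot is treated as present (assigned 1), and the function returns the Braille Unicode character that matches the 4x2 block of pixels (called `window` in the code).

It feels somewhat inefficient, but anyway the code works fine.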
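If the chain of `if` statements bothers you, one possible alternative is a table-driven version. This is a sketch, not the code used for the results in this post: `DOT_POSITIONS` and `window_to_braille` are names I introduced, and the window is indexed as `window[r][c]` so that it works with plain nested lists as well as NumPy arrays.

```python
# (row, column) of each dot inside the window, ordered from bit 0 (dot 1) to bit 7 (dot 8)
DOT_POSITIONS = [(0, 0), (1, 0), (2, 0), (0, 1), (1, 1), (2, 1), (3, 0), (3, 1)]


def window_to_braille(window, threshold):
    """Map a 4x2 window of pixel values to one Braille character."""
    code = 0
    for bit, (r, c) in enumerate(DOT_POSITIONS):
        if window[r][c] > threshold:
            code |= 1 << bit
    return chr(0x2800 + code)


window = [
    [255, 255],  # dots 1 and 4
    [255, 0],    # dot 2
    [0, 255],    # dot 6
    [255, 0],    # dot 7
]
print(hex(ord(window_to_braille(window, 10))))  # 0x286b
```

The table keeps the dot layout in one place, so the mapping is easier to check against the Unicode chart.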

result=''

for row in range(0, resized_img.shape[0], 4):
    for col in range(0, resized_img.shape[1], 2):
        result += get_number(resized_img[row:row+4, col:col+2], 10) # 10 is threshold
    result += '\n'

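Each 4x2 window is passed to `get_number`, and the Braille character it returns is appended to `result`.

A newline character is appended at the end of each row so that the output moves on to the next line.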
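To see which windows the two loops visit, and how large the resulting text gets, here is a small sketch. `window_origins` and `text_size` are hypothetical helpers, and the 640x480 size is only an example, not the size of the image used in this post.

```python
def window_origins(height, width):
    """Yield the top-left (row, col) of every 4x2 window, line by line."""
    for row in range(0, height, 4):
        for col in range(0, width, 2):
            yield row, col


def text_size(height, width):
    """Return (lines, characters per line) of the Braille text."""
    return height // 4, width // 2


print(list(window_origins(8, 4)))  # [(0, 0), (0, 2), (4, 0), (4, 2)]
print(text_size(480, 640))         # (120, 320)
```

An image of 480 rows by 640 columns would already produce 120 lines of 320 characters each.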

Result

..? The output is way too big;; Adding a little code to the resolution adjustment in step two lets us change the size of the output as well.

smaller = 5

resized_shape =  round(img.shape[1] / (2 * smaller)) * 2, round(img.shape[0] / (4 * smaller)) * 4
resized_img = cv.resize(edged_img, resized_shape)

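Introducing an adjustment variable called `smaller` shrinks the output further.

As the resolution drops, the image gets smeared..

The real result

Reversing the order from `[extract lines from the image → resize the image]` to `[resize the image → extract lines from the image]` gives a somewhat better result.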
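I did not dig into exactly why the order matters, but one plausible explanation can be sketched on a single row of pixels. Everything here is an assumption made for illustration: a 3-tap mean blur stands in for `cv.pyrUp(cv.pyrDown(...))`, and block averaging (the hypothetical `shrink`) stands in for `cv.resize`, whose default interpolation behaves differently.

```python
def box_blur(row):
    """Blur a 1-D row with a 3-tap mean, repeating the edge pixels."""
    padded = [row[0]] + list(row) + [row[-1]]
    return [sum(padded[i:i + 3]) // 3 for i in range(len(row))]


def edges(row):
    """Original minus blurred, clipped at 0 like cv.subtract on uint8 images."""
    return [max(x - y, 0) for x, y in zip(row, box_blur(row))]


def shrink(row, factor):
    """Downscale by averaging each group of `factor` pixels."""
    return [sum(row[i:i + factor]) // factor for i in range(0, len(row), factor)]


row = [200] * 10 + [50] * 10  # one outline in the middle
threshold = 10

edge_then_shrink = shrink(edges(row), 5)
shrink_then_edge = edges(shrink(row, 5))

print(edge_then_shrink)  # [0, 10, 0, 0]
print(shrink_then_edge)  # [0, 50, 0, 0]
print(any(v > threshold for v in edge_then_shrink))  # False
print(any(v > threshold for v in shrink_then_edge))  # True
```

In this toy case the one-pixel-wide outline is diluted when it is shrunk after extraction and no longer passes the threshold, while extracting the outline from the already shrunk row keeps it strong.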

img_path = './doraemon.jpeg'
img = cv.imread(img_path, cv.IMREAD_GRAYSCALE)

smaller = 5
resized_shape = round(img.shape[1] / (2 * smaller)) * 2, round(img.shape[0] / (4 * smaller)) * 4
resized_img = cv.resize(img, resized_shape)

edged_img = cv.subtract(resized_img, cv.pyrUp(cv.pyrDown(resized_img)))

...
### Convert to Braille ###

The lines are a bit thick, but it is recognizable as Doraemon. Tuning the value of `threshold` or `smaller` could probably lead to a better result.

# Final code

import cv2 as cv

# Global parameter
smaller = 5
threshold = 3

# Open image file
img_path = './doraemon.jpeg'
img = cv.imread(img_path, cv.IMREAD_GRAYSCALE)

# Resize image
resized_shape = round(img.shape[1] / (2 * smaller)) * 2, round(img.shape[0] / (4 * smaller)) * 4
resized_img = cv.resize(img, resized_shape)

# Extract only edges
edged_img = cv.subtract(resized_img, cv.pyrUp(cv.pyrDown(resized_img)))

# Find braille unicode
def get_number(window, threshold):
    ct = 0
    if window[0,0] > threshold:
        ct += 1
    if window[1,0] > threshold:
        ct += 2
    if window[2, 0] > threshold:
        ct += 4
    if window[0, 1] > threshold:
        ct += 8
    if window[1, 1] > threshold:
        ct += 16
    if window[2, 1] > threshold:
        ct += 32
    if window[3, 0] > threshold:
        ct += 64
    if window[3, 1] > threshold:
        ct += 128
    return chr(0x2800 + ct)

# Make text result
result=''
for row in range(0, edged_img.shape[0], 4):
    for col in range(0, edged_img.shape[1], 2):
        result += get_number(edged_img[row:row+4, col:col+2], threshold)
    result += '\n'

print()
print(result)

# The code I actually wrote while testing various things

import cv2 as cv
import numpy as np
from matplotlib import pyplot as plt

# Open image file
img_path = './doraemon.jpeg'
img = cv.imread(img_path, cv.IMREAD_GRAYSCALE)

smaller = 5

hehe = round(img.shape[1] / (2 * smaller)) * 2, round(img.shape[0] / (4 * smaller)) * 4
img = cv.resize(img, hehe)

# Extract only edges
edged_img = cv.subtract(img, cv.pyrUp(cv.pyrDown(img)))

blurred_img = cv.pyrUp(cv.pyrDown(img))

# shape index :: 0: row, 1: column

# resized_shape = round(img.shape[1] / (2 * smaller)) * 2, round(img.shape[0] / (4 * smaller)) * 4
# resized_img = cv.resize(edged_img, resized_shape)
resized_img = edged_img

def get_number(window, threshold):
    ct = 0
    if window[0,0] > threshold:
        ct += 1
    if window[1,0] > threshold:
        ct += 2
    if window[2, 0] > threshold:
        ct += 4
    if window[0, 1] > threshold:
        ct += 8
    if window[1, 1] > threshold:
        ct += 16
    if window[2, 1] > threshold:
        ct += 32
    if window[3, 0] > threshold:
        ct += 64
    if window[3, 1] > threshold:
        ct += 128
    return chr(0x2800 + ct)

result=''

for row in range(0, resized_img.shape[0], 4):
    for col in range(0, resized_img.shape[1], 2):
        result += get_number(resized_img[row:row+4, col:col+2], 4)
    result += '\n'

print()
print(f"Resolution of original image: {img.shape}")
print(f"Resolution of resized image: {resized_img.shape}")
print(result)


# plt.subplot(1,3,1)
# plt.xticks([],[])
# plt.yticks([],[])
# plt.title('img')
# plt.imshow(img, cmap='gray')

# plt.subplot(1,3,2)
# plt.imshow(blurred_img, cmap='gray')
# plt.title('blurred_img')
# plt.xticks([],[])
# plt.yticks([],[])

# plt.subplot(1,3,3)
# plt.imshow(edged_img, cmap='gray')
# plt.title('edged_img')
# plt.xticks([],[])
# plt.yticks([],[])

plt.subplot(1,2,1)
plt.imshow(edged_img, cmap='gray')
plt.title('edged_img')
plt.xticks([],[])
plt.yticks([],[])

plt.subplot(1,2,2)
plt.imshow(resized_img, cmap='gray')
plt.title('resized_img')
plt.xticks([],[])
plt.yticks([],[])


plt.show()

⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⣠⣴⡾⠿⣛⣿⡛⠛⠿⢷⣦⣄⠀⠀⠀⠀⠀⠀⠀⠀⠀
⠀⠀⠀⠀⠀⠀⠀⠀⢀⣤⣾⢟⣡⣤⣼⣿⠋⡣⢾⣿⣆⠈⠻⣷⣄⠀⠀⠀⠀⠀⠀⠀
⠀⠀⠀⠀⠀⠀⠀⠀⣾⢳⣿⣫⢷⣽⠋⠒⢱⡻⣿⡿⣃⠀⠀⠈⢻⣧⠀⠀⠀⠀⠀⠀
⠀⠀⠀⠀⠀⠀⠀⠀⢻⣾⡿⢿⣿⣾⣗⣶⣾⣿⠿⣿⣯⣭⣆⠀⠀⢿⡇⠀⠀⠀⠀⠀
⠀⠀⠀⠀⠀⠀⠀⠀⢻⣸⣧⠀⠀⠀⠈⠉⠙⠋⠛⠿⢾⣯⣿⠀⠀⢸⣿⠀⠀⠀⠀⠀
⠀⠀⠀⠀⠀⠀⠀⠀⢸⣟⣿⣆⠀⣀⣤⣤⣄⣀⡀⠀⣠⣿⡿⠀⠀⣾⡏⠀⠀⠀⠀⠀
⠀⠀⠀⠀⠀⠀⠀⠀⢀⣿⠽⠿⣆⢽⣥⣤⠈⢉⣡⣾⣿⡿⠁⠀⣼⡟⠀⠀⠀⠀⠀⠀
⠀⠀⠀⠀⢀⣴⡶⢶⣾⣟⣔⣤⣴⡎⣁⠒⠺⠿⣿⠿⠋⠀⣠⣾⠟⠀⠀⠀⠀⠀⠀⠀
⠀⠀⠀⣰⣟⣵⡿⠿⣮⢜⣸⣏⣿⡿⠀⠐⠶⣤⡄⡀⣶⡿⠛⠁⠀⠀⠀⠀⠀⠀⠀⠀
⠀⠀⠀⠻⡼⠋⠀⠀⣿⢯⣿⣄⣿⣦⣄⠀⠀⠀⠀⢸⣿⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀
⠀⠀⠀⣤⢶⣄⡀⣼⡟⠀⠙⠹⠿⠿⠽⠋⠈⠠⡔⠉⢻⡇⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀
⠀⠀⠀⠙⢿⣽⠍⠫⣶⣦⣄⡀⠀⠀⠀⠀⠀⠀⢰⣦⣻⡇⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀
⠀⠀⠀⠀⠀⠀⠀⠀⠀⠈⠙⠻⢷⣦⣄⠀⠀⠀⣸⣿⡏⣿⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀
⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠈⢻⡷⣦⡶⢿⣿⣻⡇⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀
⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠈⠻⠯⠟⠻⠵⠟⠀⠀⠀⠀⠀⠀

The text above was adjusted with the CSS `line-height` property so that it looks natural.
