OCR with Tesseract

lejpower 2021. 11. 5. 17:49

The staging server I use for testing runs Ubuntu, so everything below was tested on Ubuntu.

tesseract install

sudo apt install tesseract-ocr
sudo apt install libtesseract-dev

pip install

pip install Pillow
pip install pytesseract
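Before moving on, it can help to confirm that the binary and both Python packages are actually visible. A minimal sketch using only the standard library; the function name check_ocr_setup is made up for this example and is not part of pytesseract.

```python
import importlib.util
import shutil


def check_ocr_setup():
    """Report which pieces of the OCR stack are visible from this interpreter."""
    return {
        # Path to the tesseract executable on PATH, or None if it is missing
        "tesseract_binary": shutil.which("tesseract"),
        # True when the package can be imported (Pillow is imported as PIL)
        "Pillow": importlib.util.find_spec("PIL") is not None,
        "pytesseract": importlib.util.find_spec("pytesseract") is not None,
    }


if __name__ == "__main__":
    for name, status in check_ocr_setup().items():
        print(f"{name}: {status}")
```

If tesseract_binary comes back as None, the apt install step above probably did not finish, or the binary is not on PATH.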

download the trained datafile

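sudo apt-get install 'tesseract-ocr-*'

This installs every available language pack; for Japanese only, tesseract-ocr-jpn is enough. The trained data files themselves are hosted on GitHub: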

https://github.com/tesseract-ocr/tessdata
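To see whether the Japanese data file was picked up, one option is to ask the binary itself with its --list-langs flag. The sketch below assumes the usual output layout (a header line followed by one language code per line); the helper names parse_language_list and installed_languages are hypothetical.

```python
import shutil
import subprocess


def parse_language_list(output):
    """Extract language codes from `tesseract --list-langs` output."""
    langs = []
    for line in output.splitlines():
        line = line.strip()
        # Skip blank lines and the "List of available languages ..." header;
        # language codes such as jpn or chi_sim contain no spaces.
        if line and " " not in line:
            langs.append(line)
    return langs


def installed_languages(tesseract_cmd="tesseract"):
    """Ask the tesseract binary which trained data files it can see."""
    if shutil.which(tesseract_cmd) is None:
        return []  # binary not found, so nothing to list
    proc = subprocess.run(
        [tesseract_cmd, "--list-langs"],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge both streams so the list is captured either way
        text=True,
    )
    return parse_language_list(proc.stdout)


if __name__ == "__main__":
    print("jpn available:", "jpn" in installed_languages())
```

If jpn is missing from the list, image_to_string with lang='jpn' will fail, so this is a cheap check to run first.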

Python test code

The test photo happened to be a Japanese airline ticket, so I ran the test in Japanese, lol.

import pytesseract
from PIL import Image

# Path to the tesseract binary installed by apt
pytesseract.pytesseract.tesseract_cmd = r'/usr/bin/tesseract'

# Load the test image and run OCR with the Japanese trained data
image = Image.open('/home/ubuntu/DEVEOPMENT_AWS_STEP_SERVER/Python_AWS_STEPSERVER/python_OCR_test/test_ocr.jpg')
result = pytesseract.image_to_string(image, lang='jpn')

print(result)

Result

BOAHUING PASS

 

保誠栓査坦と搭来口で2次元バーコードをタッチして《ださい。

Please touch the barcode at security check and the boarding 92te

ムSTAH ALLIANCE MEABER ぷと

東京/:                      、 沖縄

TOKYO/HANEDム                                  Le         OKINAW和A

09 : 20 発           ” 11:55着

 

2053556 1096791

 

 

指乗口 / 指乗順       - 指乗締切時刻

GATE / GROUP             Boarding Closs Tims

58 /Group4     09:10       ままーー
)

(LSN: 8834
DAF      g/7 8:49 BP 2 PNR:NQ5CF          FARE: INTOW    BN: 338
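
Several of the English strings in this output are misread, possibly because only the Japanese model was loaded. Here is a minimal sketch of one thing to try, not something verified on this image: clean up the picture with Pillow first, then pass lang='jpn+eng' so both models are used. The helper names preprocess and ocr_boarding_pass, the scale factor of 2, and the --psm 6 page segmentation mode are all assumptions chosen for illustration.

```python
from PIL import Image, ImageOps


def preprocess(image, scale=2):
    """Grayscale, stretch contrast, and upscale an image before OCR."""
    gray = ImageOps.grayscale(image)
    gray = ImageOps.autocontrast(gray)
    width, height = gray.size
    return gray.resize((width * scale, height * scale), Image.LANCZOS)


def ocr_boarding_pass(path, lang="jpn+eng"):
    """Run Tesseract on a preprocessed copy of the image at `path`."""
    import pytesseract  # imported lazily so preprocess() works without it

    with Image.open(path) as img:
        prepared = preprocess(img)
    # --psm 6 asks Tesseract to treat the page as one uniform block of text
    return pytesseract.image_to_string(prepared, lang=lang, config="--psm 6")
```

Calling ocr_boarding_pass with the same test_ocr.jpg path as in the test code would allow a side-by-side comparison with the result above. Whether it actually improves recognition depends on the photo.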

License: Attribution, NonCommercial, NoDerivatives
