Wednesday, March 1, 2023

Extract Text From Image Using Command Line

The tool to use for this task is called tesseract.


To install tesseract on Ubuntu or any of its derivatives, run the command below:
$ sudo apt install tesseract-ocr -y
To extract text from an image file called 007.png, run the command below:
$ tesseract 007.png 007output
007output is the output base name. By default tesseract appends the .txt extension to it, so the extracted text is written to 007output.txt.
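The same command can be repeated for a whole folder of images. Below is a minimal sketch, assuming all images are PNG files in one directory; ocr_dir is a hypothetical helper name made up for this example, not something that ships with tesseract.

```shell
#!/bin/sh
# Sketch: run tesseract on every PNG in a directory.
# ocr_dir is a hypothetical helper name, not part of tesseract.
ocr_dir() {
    dir="${1:-.}"    # directory to scan, defaults to the current one
    for img in "$dir"/*.png; do
        # If no PNG exists the pattern stays unexpanded, so skip it.
        [ -e "$img" ] || continue
        # Strip the .png suffix to get the output base name;
        # tesseract adds .txt to it by itself.
        tesseract "$img" "${img%.png}"
    done
}
```

Calling ocr_dir ~/scans (the path is just an example) should leave a .txt file next to each .png file in that folder.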

To view the output, use cat as shown below:
$ cat 007output.txt
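If the text is only needed on screen, the intermediate file can be skipped, because tesseract accepts stdout as the output base name. Below is a small sketch of that; ocr_print is a hypothetical wrapper name used only for illustration.

```shell
#!/bin/sh
# Sketch: print the recognized text directly instead of writing a .txt file.
# ocr_print is a hypothetical wrapper name, not part of tesseract.
ocr_print() {
    # "stdout" as the output base makes tesseract write the text
    # to standard output.
    tesseract "$1" stdout
}
```

For example, ocr_print 007.png | less would page through the extracted text without creating 007output.txt.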
