brew info tesseract
The recognition results for Chinese - Simplified are pretty terrible.
I noticed that Tesseract added a new LSTM-based neural network engine in 4.0.0 and later.
But it needs to be built from source on macOS.
Thankfully, the instructions in their README.md are quite specific.
brew install automake autoconf autoconf-archive libtool
git clone https://github.com/tesseract-ocr/tesseract/
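Once the build from the README is installed, it may be worth confirming that the binary on the PATH really is 4.x or newer, since an older Homebrew copy could still shadow it. A minimal sketch, assuming the first line of `tesseract --version` looks like `tesseract 4.1.1`; the helper names `tess_major` and `has_lstm_engine` are made up for illustration:

```shell
# Hypothetical helper: extract the major version from a line like "tesseract 4.1.1".
tess_major() {
    printf '%s\n' "$1" | sed -n '1s/^tesseract[^0-9]*\([0-9][0-9]*\)\..*/\1/p'
}

# Hypothetical helper: succeed when the given version line is 4.x or newer,
# i.e. new enough to include the LSTM engine.
has_lstm_engine() {
    major=$(tess_major "$1")
    [ -n "$major" ] && [ "$major" -ge 4 ]
}

# Usage (commented out so the sketch can be sourced without tesseract installed):
# has_lstm_engine "$(tesseract --version 2>&1 | head -n 1)" && echo "LSTM engine available"
```

The version line is passed in as an argument rather than read inside the function, so the check can be tried on sample strings first.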
From their best trained models, download the language file
chi_sim.traineddata and put it under
tesseract image.png image -l chi_sim
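Before running the command above, a quick check that the model file is actually in place can save some confusion. A minimal sketch, assuming the data directory is passed in explicitly; `has_traineddata` and `ocr_image` are hypothetical helper names, while `--tessdata-dir` is tesseract's option for pointing at a data directory:

```shell
# Hypothetical helper: succeed only if <dir>/<lang>.traineddata exists.
has_traineddata() {
    dir=$1
    lang=$2
    [ -f "$dir/$lang.traineddata" ]
}

# Hypothetical wrapper: check for the model first, then run the same
# recognition command as above against the given data directory.
ocr_image() {
    image=$1
    dir=$2
    lang=${3:-chi_sim}
    if ! has_traineddata "$dir" "$lang"; then
        echo "missing $dir/$lang.traineddata" >&2
        return 1
    fi
    # The output base is the image name without its extension (image.png -> image.txt).
    tesseract "$image" "${image%.*}" --tessdata-dir "$dir" -l "$lang"
}
```

Calling `ocr_image image.png /path/to/tessdata` (the path here is a placeholder) would then either report the missing file or run the recognition.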
OK, it is still terrible with the Song typeface. I would need to train a new model myself.
Finally, setting tesseract aside, I found that dragging the image into OneNote and using
Ctrl + Click ->
Copy Text from Picture gives more accurate results. 😓