
Using Tesseract-OCR on Ubuntu 20.04

Installing Tesseract-OCR

On Ubuntu 20.04, we install it the simplest way, as the official documentation recommends:

sudo apt install tesseract-ocr

If you need to develop against the library or train your own models, also install the development package:

sudo apt install libtesseract-dev 

After installation, check the version. Here it is 4.1.1:

tesseract -v
tesseract 4.1.1
 leptonica-1.79.0
  libgif 5.1.4 : libjpeg 8d (libjpeg-turbo 2.0.3) : libpng 1.6.37 : libtiff 4.1.0 : zlib 1.2.11 : libwebp 0.6.1 : libopenjp2 2.3.1
 Found AVX512BW
 Found AVX512F
 Found AVX2
 Found AVX
 Found FMA
 Found SSE
 Found libarchive 3.4.0 zlib/1.2.11 liblzma/5.2.4 bz2lib/1.0.8 liblz4/1.9.2 libzstd/1.4.4

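If you want the latest version, such as 5.1.0, you need to download the source from the official repository and build and install it yourself. That is skipped here for now.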

GitHub - tesseract-ocr/tesseract: Tesseract Open Source OCR Engine (main repository): https://github.com/tesseract-ocr/tesseract

Installing qtcreator

Because we will use a simple Qt interface here, we need to install the qtcreator development tool.

sudo apt install qt5-default

Then download the installer from here:

Download Offline Installers | Source Package Offline Installer | Qt

For example, I downloaded version 6.0.2:

https://download.qt.io/official_releases/qtcreator/6.0/6.0.2/qt-creator-opensource-linux-x86_64-6.0.2.run

chmod +x qt-creator-opensource-linux-x86_64-6.0.2.run
./qt-creator-opensource-linux-x86_64-6.0.2.run

With that, the installation is complete.

In qtcreator, choose New File or Project -> Non-Qt Project -> Plain C++ Application, then enter the following source code in main.cpp:

#include <stdio.h>
#include <stdlib.h>
#include <iostream>
#include <string>
#include <opencv2/opencv.hpp>
#include <opencv2/imgproc.hpp>
#include <leptonica/allheaders.h>
#include <tesseract/baseapi.h>
#include <tesseract/publictypes.h>

int main()
{
    //std::cout << "Hello World!" << std::endl;
    std::string image_name = "/home/mc/ocr/testimg/testocr.png"; //"/home/mc/ocr/testimg/eurotext.png";
    cv::Mat imageMat;
    imageMat = cv::imread(image_name);
    if (imageMat.data == nullptr)
    {
        printf("No image data \n");
        return -1;
    }
    //cv::Mat blurMat;
    //cv::medianBlur(imageMat, blurMat, 5); // blur the image
    cv::Mat z1, g_grayImage;
    cv::cvtColor(imageMat, z1, cv::COLOR_BGR2GRAY);            // convert to grayscale
    // cv::threshold(z1, z2, 214, 255, cv::THRESH_BINARY);     // fixed global threshold
    cv::adaptiveThreshold(z1, g_grayImage, 255, cv::ADAPTIVE_THRESH_MEAN_C, cv::THRESH_BINARY, 7, 25);  // adaptive threshold: binarize and suppress noise


    cv::namedWindow("Image1", cv::WINDOW_AUTOSIZE);
    cv::imshow("Image1", g_grayImage);

    cv::waitKey(0);

    //std::system("chcp 65001");
    char* outText;
    tesseract::TessBaseAPI api;
    //if (api.Init(NULL, "chi_sim"))  // for Chinese
    // Init returns 0 on success and a non-zero value on failure
    if(api.Init("/home/mc/ocr/tesseract/tessdata_best-main", "eng", tesseract::OEM_DEFAULT))
    {
        std::cerr << "Could not initialize tesseract." << std::endl;
        exit(1);
    }
    // Pix *image = pixRead("3.jpg");
    // Arguments: pixel buffer, width, height, bytes per pixel, bytes per line
    api.SetImage((uchar*)g_grayImage.data, g_grayImage.cols, g_grayImage.rows, 1, static_cast<int>(g_grayImage.step));

    outText = api.GetUTF8Text();
    if (outText == nullptr)
    {
        std::cout << "No Data" << std::endl;
        api.End();
        return -1;
    }
    std::cout << outText << std::endl;
    // Destroy used object and release memory
    api.End();           // delete api;
    delete[] outText;    // pixDestroy(&image);

    return 0;
}
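To make the preprocessing step more concrete, here is a minimal sketch of what a mean adaptive threshold computes, written in plain C++ without OpenCV. The function name adaptiveMeanThreshold is hypothetical and not part of OpenCV. The sketch assumes a row-major 8-bit grayscale buffer and handles borders by clamping coordinates. The real cv::adaptiveThreshold is far faster and may differ slightly in rounding, so treat this only as an illustration of the idea.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper (not an OpenCV function): mean adaptive threshold on a
// row-major 8-bit grayscale buffer. A pixel becomes 255 when it is brighter
// than (mean of its blockSize x blockSize neighbourhood) - C, otherwise 0.
std::vector<std::uint8_t> adaptiveMeanThreshold(
    const std::vector<std::uint8_t>& src, int width, int height,
    int blockSize, double C)
{
    assert(blockSize > 1 && blockSize % 2 == 1);
    assert(static_cast<int>(src.size()) == width * height);

    const int r = blockSize / 2;
    std::vector<std::uint8_t> dst(src.size());
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            long sum = 0;
            for (int dy = -r; dy <= r; ++dy) {
                for (int dx = -r; dx <= r; ++dx) {
                    // Clamp coordinates so that border pixels are replicated.
                    const int yy = std::max(0, std::min(y + dy, height - 1));
                    const int xx = std::max(0, std::min(x + dx, width - 1));
                    sum += src[yy * width + xx];
                }
            }
            const double mean =
                static_cast<double>(sum) / (blockSize * blockSize);
            dst[y * width + x] = (src[y * width + x] > mean - C) ? 255 : 0;
        }
    }
    return dst;
}
```

Calling adaptiveMeanThreshold(pixels, width, height, 7, 25.0) mirrors the block size 7 and constant 25 passed to cv::adaptiveThreshold in the listing: a pixel turns black only when it is more than 25 levels darker than its local 7x7 average, which keeps dark text and discards mild background variation.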

The qtcreator project file is shown below (note that I did not use the default gcc-9 but gcc-8). For background, see the previous article:

qtcreator error: fatal error: stdlib.h: No such file or directory (高精度计算机视觉's blog, CSDN)

TEMPLATE = app
CONFIG += console c++11
CONFIG -= app_bundle
CONFIG -= qt

SOURCES += \
        main.cpp

unix:!macx: LIBS += -L$$PWD/../../../../usr/local/lib/ -lopencv_world

INCLUDEPATH += $$PWD/../../../../usr/local/include/opencv4
DEPENDPATH += $$PWD/../../../../usr/local/include/opencv4

unix:!macx: LIBS += -L$$PWD/../../../../usr/lib/x86_64-linux-gnu/ -ltesseract

#INCLUDEPATH += $$PWD/../../../../usr/include
#DEPENDPATH += $$PWD/../../../../usr/include

INCLUDEPATH += /usr/include/c++/8

Reference: the official installation instructions

Ubuntu

If they are not already installed, you need the following libraries (Ubuntu 16.04/14.04):

sudo apt-get install g++ # or clang++ (presumably)
sudo apt-get install autoconf automake libtool
sudo apt-get install pkg-config
sudo apt-get install libpng-dev
sudo apt-get install libjpeg8-dev
sudo apt-get install libtiff5-dev
sudo apt-get install zlib1g-dev

If you plan to install the training tools, you also need the following libraries:

sudo apt-get install libicu-dev
sudo apt-get install libpango1.0-dev
sudo apt-get install libcairo2-dev

Leptonica

You also need to install Leptonica. Ensure that the development headers for Leptonica are installed before compiling Tesseract.

Tesseract versions and the minimum version of Leptonica required:

Tesseract   Leptonica   Ubuntu
4.00        1.74.2      Ubuntu 18.04
3.05        1.74.0      Must build from source
3.04        1.71        Ubuntu 16.04
3.03        1.70        Ubuntu 14.04
3.02        1.69        Ubuntu 12.04
3.01        1.67
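As an illustration of how the table can be applied, the following sketch compares an installed Leptonica version against the minimum listed for a given Tesseract release. All function and variable names here are hypothetical. The table only covers the releases shown, so the lookup simply reports false for any release it does not list (for example 4.1.1 or 5.1.0) rather than guessing a requirement.

```cpp
#include <algorithm>
#include <cstddef>
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Split a dotted version such as "1.74.2" into {1, 74, 2}.
// std::stoi throws if a field is empty or not numeric.
std::vector<int> parseVersion(const std::string& text)
{
    std::vector<int> parts;
    std::istringstream in(text);
    std::string field;
    while (std::getline(in, field, '.')) {
        parts.push_back(std::stoi(field));
    }
    return parts;
}

// True when `installed` is at least `required`. Shorter versions are padded
// with zeros so that "1.74" and "1.74.0" compare as equal.
bool atLeast(const std::string& installed, const std::string& required)
{
    std::vector<int> a = parseVersion(installed);
    std::vector<int> b = parseVersion(required);
    const std::size_t n = std::max(a.size(), b.size());
    a.resize(n, 0);
    b.resize(n, 0);
    return a >= b;  // std::vector compares lexicographically
}

// Minimum Leptonica version per Tesseract release, copied from the table.
const std::map<std::string, std::string> kMinLeptonica = {
    {"4.00", "1.74.2"}, {"3.05", "1.74.0"}, {"3.04", "1.71"},
    {"3.03", "1.70"},   {"3.02", "1.69"},   {"3.01", "1.67"},
};

// Returns false for releases that the table does not list.
bool leptonicaIsNewEnough(const std::string& tesseractRelease,
                          const std::string& installedLeptonica)
{
    const auto it = kMinLeptonica.find(tesseractRelease);
    return it != kMinLeptonica.end() &&
           atLeast(installedLeptonica, it->second);
}
```

For instance, atLeast("1.79.0", "1.74.2") is true, so the leptonica-1.79.0 reported by tesseract -v earlier satisfies the minimum listed for the 4.00 row.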

One option is to install the distro’s Leptonica package:

sudo apt-get install libleptonica-dev

but if you are using an oldish version of Linux, the Leptonica version may be too old, so you will need to build from source.

The sources are at https://github.com/DanBloomberg/leptonica . The instructions for building are given in Leptonica README.

Note that if building Leptonica from source, you may need to ensure that /usr/local/lib is in your library path. This is a standard Linux bug, and the information at Stackoverflow is very helpful.

Installing Tesseract from Git

Please follow instructions in Compiling–GitInstallation

Also read Install Instructions

Install elsewhere / without root

Tesseract can be configured to install anywhere, which makes it possible to install it without root access.

To install it in $HOME/local:

./autogen.sh
./configure --prefix=$HOME/local/
make
make install

To install it in $HOME/local using Leptonica libraries also installed in $HOME/local:

./autogen.sh
LIBLEPT_HEADERSDIR=$HOME/local/include ./configure \
  --prefix=$HOME/local/ --with-extra-libraries=$HOME/local/lib
make
make install

On some systems, you might also need to specify the pkg-config path before running the configure script:

export PKG_CONFIG_PATH=$HOME/local/lib/pkgconfig
Added: 2022-03-21 20:30:29  Updated: 2022-03-21 20:31:24
 