Detect text in an image using the Tess4J library on Linux

Tesseract is a widely used open-source OCR engine for detecting text in images, and Tess4J is its Java wrapper. Let's use it in a Java Spring Boot project.

– Step 1: Download the template Spring Boot project from: https://github.com/habogay/spring-boot-gcp

– Step 2: Install Tesseract OCR:

sudo apt-get install tesseract-ocr

– Step 3: Set this environment variable in your IDE's run configuration (I use Eclipse): TESSDATA_PREFIX=/usr/share/tesseract-ocr/tessdata/
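
If the variable points to the wrong folder, OCR fails at runtime, so it can help to check it first. Here is a minimal sketch of such a check using only the JDK. The class name TessdataLocator and its find method are hypothetical helpers of mine, not part of Tess4J, and the tessdata folder location can differ between distributions and Tesseract versions, so treat the path as an assumption to verify on your machine.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Optional;

// Hypothetical helper (not part of Tess4J): checks that a tessdata folder
// actually contains the trained data file for a given language.
final class TessdataLocator {

    private TessdataLocator() {}

    // Returns the folder if it contains <language>.traineddata, otherwise empty.
    static Optional<Path> find(String prefix, String language) {
        if (prefix == null || prefix.trim().isEmpty()) {
            return Optional.empty();
        }
        Path dir = Paths.get(prefix);
        Path model = dir.resolve(language + ".traineddata");
        return Files.isRegularFile(model) ? Optional.of(dir) : Optional.empty();
    }

    public static void main(String[] args) {
        // Read the same variable that Step 3 sets in the run configuration.
        String prefix = System.getenv("TESSDATA_PREFIX");
        Optional<Path> dir = find(prefix, "eng");
        System.out.println(dir.map(p -> "tessdata found at " + p)
                .orElse("eng.traineddata not found, check TESSDATA_PREFIX=" + prefix));
    }
}
```

Running main from the same Eclipse run configuration shows whether the variable is really visible to the application.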

– Step 4: Use Tesseract:

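// Imports needed at the top of the controller class:
// import java.io.File;
// import net.sourceforge.tess4j.ITesseract;
// import net.sourceforge.tess4j.Tesseract;

String rs = "";
ITesseract tess = new Tesseract();
try {
    // Optionally specify the trained data folder explicitly
    // tess.setDatapath("./tessdata");

    // Specify the language to detect
    tess.setLanguage("eng");

    // Run OCR on the image file
    File img = new File("/home/habogay/Desktop/lh.png");
    rs = tess.doOCR(img);

    // Pass the recognized text to the view ("model" is assumed to be the
    // Spring MVC Model of the handler method)
    model.addAttribute("rs", rs);
    System.out.println(rs);
} catch (Exception e) {
    System.out.println(e.getMessage());
}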
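
The snippet above only prints the exception message when something goes wrong, which can be hard to read if the image path is wrong or the format is not supported. As a rough sketch, the image can be loaded and checked with the JDK's ImageIO before it goes to OCR. The class OcrInput and its load method are hypothetical names I made up for illustration; they are not part of Tess4J.

```java
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import javax.imageio.ImageIO;

// Hypothetical helper (not part of Tess4J): loads an image and fails
// with a clear message when the file is missing or not a known format.
final class OcrInput {

    private OcrInput() {}

    static BufferedImage load(File file) throws IOException {
        if (!file.isFile() || !file.canRead()) {
            throw new IOException("Image not found or not readable: " + file);
        }
        // ImageIO.read returns null when no registered reader understands the format.
        BufferedImage image = ImageIO.read(file);
        if (image == null) {
            throw new IOException("Unsupported image format: " + file);
        }
        return image;
    }
}
```

Tess4J's doOCR also accepts a BufferedImage, so the result of OcrInput.load(img) can be passed to tess.doOCR(...) in place of the File.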

Source code: https://github.com/habogay/fb-controller

Done :d
