How to Use the npm Package @tessdata/por

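What is @tessdata/por

@tessdata/por is an npm package for recognizing text in images (OCR). It is built on the Tesseract OCR engine, supports multiple languages, and returns the recognized content as plain text.

Installing @tessdata/por

Install @tessdata/por with npm: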

npm install @tessdata/por

Using @tessdata/por

@tessdata/por exposes a small interface: pass the path of an image to the recognize method and it returns the recognition result:

const { recognize } = require('@tessdata/por')

recognize('./images/image1.jpg').then((text) => {
  console.log(text)
}).catch((error) => {
  console.error(error)
})

The recognize method returns a Promise. It resolves with the recognized text on success and rejects with an error on failure.
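The same Promise can also be consumed with async/await. The following is a minimal sketch, not part of the package: safeRecognize is a hypothetical helper name, and the recognize function is passed in as an argument so the helper only assumes the call shape described in this article.

```javascript
// Sketch: await the Promise returned by a recognize-style function and
// turn success or failure into a plain result object.
// `safeRecognize` is a hypothetical name, not part of @tessdata/por.
// `recognizeFn` is injected, so this helper depends on no package.
async function safeRecognize(recognizeFn, imagePath, options) {
  try {
    const text = await recognizeFn(imagePath, options)
    // Trim the surrounding whitespace that OCR output often carries.
    return { ok: true, text: String(text).trim() }
  } catch (error) {
    // A rejected Promise ends up here instead of in a .catch() callback.
    return { ok: false, error }
  }
}

// Usage, assuming the export described in this article:
// const { recognize } = require('@tessdata/por')
// safeRecognize(recognize, './images/image1.jpg').then((result) => {
//   console.log(result)
// })
```

Returning a result object instead of throwing keeps the calling code free of nested try/catch blocks when many images are processed in a loop.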

@tessdata/por also lets you set the recognition language and other options. Pass them to recognize as an options object in the second argument:

const { recognize } = require('@tessdata/por')

recognize('./images/image1.jpg', { lang: 'eng', psm: 7 }).then((text) => {
  console.log(text)
}).catch((error) => {
  console.error(error)
})

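In the code above, lang is set to 'eng' to recognize English text, and psm is set to 7, the Tesseract page segmentation mode that treats the image as a single line of text.

For the language codes that can be used (for example eng for English or por for Portuguese), see the Tesseract OCR documentation.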
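Because a wrong language code or page segmentation mode usually only shows up as a failed or poor recognition, it can help to validate the options before calling recognize. The sketch below assumes the { lang, psm } option shape used in this article. buildOptions is a hypothetical helper, and the language check is deliberately loose: it only accepts codes shaped like eng or chi_sim. The psm range of 0 to 13 and the default of 3 follow Tesseract's documented page segmentation modes.

```javascript
// Tesseract defines page segmentation modes 0 through 13.
const MIN_PSM = 0
const MAX_PSM = 13

// Hypothetical helper (not part of @tessdata/por): validates the values
// and returns an options object in the { lang, psm } shape used above.
function buildOptions({ lang = 'eng', psm = 3 } = {}) {
  // Loose shape check only: three letters, optionally followed by
  // an underscore suffix such as chi_sim.
  if (typeof lang !== 'string' || !/^[a-z]{3}(_[a-z]+)?$/i.test(lang)) {
    throw new TypeError(`invalid language code: ${String(lang)}`)
  }
  if (!Number.isInteger(psm) || psm < MIN_PSM || psm > MAX_PSM) {
    throw new RangeError(`psm must be an integer from ${MIN_PSM} to ${MAX_PSM}`)
  }
  return { lang, psm }
}

// Usage, assuming the export described in this article:
// const { recognize } = require('@tessdata/por')
// recognize('./images/image1.jpg', buildOptions({ lang: 'por', psm: 7 }))
```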

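Notes

Keep the following in mind when using @tessdata/por:

The Tesseract OCR engine must be installed; see the Tesseract OCR documentation for instructions.
The image path must be either an absolute path or a path relative to the current working directory.
If the recognition result is inaccurate, try adjusting the options.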
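The first two notes can be checked in code before any recognition is attempted. The following is a rough sketch using only Node.js built-in modules. resolveImagePath and isTesseractInstalled are hypothetical helper names, and the second one assumes the engine is exposed as a tesseract executable on PATH, which may not hold for every installation.

```javascript
const fs = require('fs')
const path = require('path')
const { spawnSync } = require('child_process')

// Resolve a possibly relative image path against the current working
// directory and fail early if the file does not exist.
function resolveImagePath(imagePath, cwd = process.cwd()) {
  const absolute = path.resolve(cwd, imagePath)
  if (!fs.existsSync(absolute)) {
    throw new Error(`image not found: ${absolute}`)
  }
  return absolute
}

// Returns true if a `tesseract` executable on PATH answers `--version`
// with exit code 0, and false otherwise (including when it is missing).
function isTesseractInstalled() {
  const result = spawnSync('tesseract', ['--version'], { stdio: 'ignore' })
  return !result.error && result.status === 0
}

// Usage:
// if (!isTesseractInstalled()) console.error('Tesseract OCR not found on PATH')
// const imagePath = resolveImagePath('./images/image1.jpg')
```

Resolving the path up front produces a clear "image not found" error that includes the absolute path, which is easier to debug than a failure raised from inside the OCR call.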

Example code

The following complete example shows how to use @tessdata/por to recognize the text in an image:

const { recognize } = require('@tessdata/por')

recognize('./images/image1.jpg', { lang: 'eng', psm: 7 }).then((text) => {
  console.log(text)
}).catch((error) => {
  console.error(error)
})

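In the code above:

./images/image1.jpg is the path to the image;
{ lang: 'eng', psm: 7 } selects English as the recognition language and the single-text-line page segmentation mode.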
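One way to act on the earlier advice to adjust the options when results are poor is to try several page segmentation modes in turn. The sketch below is only an illustration under that assumption. recognizeWithFallback is a hypothetical helper, the recognize function is injected, and the candidate order of 7 (single line), 6 (single uniform block) and 3 (fully automatic) is an arbitrary choice, not a recommendation from the package.

```javascript
// Hypothetical helper: call a recognize-style function once per PSM value
// and return the first non-empty result together with the mode that
// produced it.
async function recognizeWithFallback(recognizeFn, imagePath, lang, psmCandidates = [7, 6, 3]) {
  let lastError = null
  for (const psm of psmCandidates) {
    try {
      const text = String(await recognizeFn(imagePath, { lang, psm })).trim()
      if (text.length > 0) {
        return { psm, text }
      }
    } catch (error) {
      // Remember the failure and move on to the next mode.
      lastError = error
    }
  }
  // No mode produced text: surface the last error if there was one,
  // otherwise report an empty result.
  if (lastError) throw lastError
  return { psm: null, text: '' }
}

// Usage, assuming the export described in this article:
// const { recognize } = require('@tessdata/por')
// recognizeWithFallback(recognize, './images/image1.jpg', 'eng')
//   .then(({ psm, text }) => console.log(psm, text))
```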

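Summary

@tessdata/por is a convenient npm package for recognizing text in images. With the steps in this article you can call it to extract text from an image. Getting reliable results on your own images will take some practice and experimentation with the options.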

Source: JavaScript中文网. Please credit the source when republishing: https://www.javascriptcn.com/post/6005625881e8991b448df96a