How to Use the npm Package meta-scraper

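Overview

meta-scraper is a Node.js module for scraping metadata from websites. It works with many kinds of sites and follows chains of redirects automatically. Its most common use is to fetch a page's title, description, and image so that they can be shown together in a preview card.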

This article explains how to use meta-scraper in your application.

Installation

meta-scraper can be installed with npm:

npm install meta-scraper

Basic usage

First, load the meta-scraper module:

const Metascraper = require('meta-scraper');

We can then use it to scrape metadata from a page. The following example fetches the page title and description from https://github.com/:

[Code listing not preserved in the source. It created a Metascraper instance for https://github.com/ and logged the scraped title and description.]

That example prints output like the following; the exact text depends on what the page currently serves:

GitHub: Where the world builds software · GitHub
GitHub is where over 65 million developers shape the future of software, together. Contribute to the open source community, manage your Git repositories, review code like a pro, track bugs and features, power your CI/CD and DevOps workflows, and secure code before you commit it.
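
To make concrete what a title-and-description scrape involves, here is a minimal, dependency-free sketch. It does not use meta-scraper and makes no claim about how that module works internally: the function name extractBasicMeta and the sample HTML are hypothetical, and the simple regexes assume the name attribute comes before content.

```javascript
// Hypothetical helper, not part of meta-scraper: pulls the <title> text and
// the description meta tag out of an HTML string with simple regexes.
// Assumes name="description" appears before content="..." in the tag.
function extractBasicMeta(html) {
  const titleMatch = html.match(/<title[^>]*>([\s\S]*?)<\/title>/i);
  const descMatch = html.match(
    /<meta\s+name=["']description["']\s+content=["']([^"']*)["']/i
  );
  return {
    title: titleMatch ? titleMatch[1].trim() : null,
    description: descMatch ? descMatch[1] : null,
  };
}

// Sample markup standing in for a fetched page.
const sampleHtml = `
<html><head>
  <title> Example Page </title>
  <meta name="description" content="A short summary of the page.">
</head><body></body></html>`;

const basicMeta = extractBasicMeta(sampleHtml);
console.log(basicMeta.title);       // Example Page
console.log(basicMeta.description); // A short summary of the page.
```

A real scraper would use an HTML parser rather than regexes, since attribute order and quoting vary between sites.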

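Custom selectors

By default, meta-scraper uses a built-in set of selectors to decide which metadata to scrape. You can also supply custom selectors to target specific metadata. The following example fetches the page image from https://github.com/: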

[Code listing not preserved in the source. It scraped https://github.com/ with a custom selector for the page image and logged the result.]

That example prints https://github.githubassets.com/images/modules/site/home-illustration-desktop.svg.
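
The idea of a custom selector can be sketched without the library: map each output field to the meta tag it should be read from. The helper below is hypothetical and is not meta-scraper's API. The name extractBySelectors, the shape of the selectors object, and the sample HTML are all assumptions made for illustration.

```javascript
// Hypothetical helper, not meta-scraper's API: for each requested field, find
// the <meta property="..."> tag named in `selectors` and return its content.
function extractBySelectors(html, selectors) {
  const result = {};
  for (const [field, property] of Object.entries(selectors)) {
    // Escape regex metacharacters so the property name is matched literally.
    const escaped = property.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
    const pattern = new RegExp(
      `<meta\\s+property=["']${escaped}["']\\s+content=["']([^"']*)["']`,
      'i'
    );
    const match = html.match(pattern);
    result[field] = match ? match[1] : null;
  }
  return result;
}

// Sample markup standing in for a fetched page.
const ogHtml = `
<head>
  <meta property="og:title" content="Example">
  <meta property="og:image" content="https://example.com/cover.png">
</head>`;

const picked = extractBySelectors(ogHtml, { image: 'og:image' });
console.log(picked.image); // https://example.com/cover.png
```

Keeping the selectors in a plain object means callers can request only the fields they need; a field whose tag is missing comes back as null rather than throwing.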

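Custom page requests

If you need custom options for the page request, pass them through the opts property. The following example requests the page through a proxy: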

[Code listing not preserved in the source. It passed a proxy setting through the opts property when requesting the page.]

That example requests the page through the proxy.
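
One way to picture how request options such as a proxy reach the HTTP layer is an options object merged over defaults and handed to the request function. The sketch below is an assumption-laden illustration, not meta-scraper's implementation: the default values, the names buildRequestOptions and scrapeWith, the proxy URL, and the stand-in request function are all hypothetical.

```javascript
// Hypothetical defaults; the module's real defaults are not given in this article.
const DEFAULT_REQUEST_OPTIONS = { timeout: 10000, maxRedirects: 5, proxy: null };

// Merge caller-supplied options over the defaults without mutating either.
function buildRequestOptions(url, opts = {}) {
  return { ...DEFAULT_REQUEST_OPTIONS, ...opts, url };
}

// Takes the HTTP function as a parameter, so a real client can be plugged in
// and the options can be inspected here without touching the network.
async function scrapeWith(requestFn, url, opts) {
  const requestOptions = buildRequestOptions(url, opts);
  const html = await requestFn(requestOptions);
  const match = html.match(/<title[^>]*>([\s\S]*?)<\/title>/i);
  return { title: match ? match[1].trim() : null, requestOptions };
}

// Stand-in request function that returns canned HTML instead of fetching.
const fakeRequest = async (options) =>
  `<title>Fetched via ${options.proxy}</title>`;

scrapeWith(fakeRequest, 'https://github.com/', { proxy: 'http://localhost:8080' })
  .then((result) => console.log(result.title)) // Fetched via http://localhost:8080
  .catch((error) => console.error(error));
```

Injecting the request function keeps the option handling testable; in real use it would wrap an HTTP client that understands the proxy setting.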

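Summary

This article showed how to use meta-scraper to scrape metadata such as the page title, description, and image from websites, and how to supply custom selectors and custom page request options. These techniques cover a common task in web development: building link preview cards from a page's metadata.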

Source: JavaScript中文网. Please credit the source when reposting: https://www.javascriptcn.com/post/60055aeb81e8991b448d891f