IT数码 购物 网址 头条 软件 日历 阅读 图书馆
TxT小说阅读器
↓语音阅读,小说下载,古典文学↓
图片批量下载器
↓批量下载图片,美女图库↓
图片自动播放器
↓图片自动播放器↓
一键清除垃圾
↓轻轻一点,清除系统垃圾↓
开发: C++知识库 Java知识库 JavaScript Python PHP知识库 人工智能 区块链 大数据 移动开发 嵌入式 开发工具 数据结构与算法 开发测试 游戏开发 网络协议 系统运维
教程: HTML教程 CSS教程 JavaScript教程 Go语言教程 JQuery教程 VUE教程 VUE3教程 Bootstrap教程 SQL数据库教程 C语言教程 C++教程 Java教程 Python教程 Python3教程 C#教程
数码: 电脑 笔记本 显卡 显示器 固态硬盘 硬盘 耳机 手机 iphone vivo oppo 小米 华为 单反 装机 图拉丁
 
   -> 人工智能 -> paddle nlp taskflow 全家桶 包括代码生成与AIGC图片生成 一起探索paddle nlp 开包即用的能力吧 -> 正文阅读

[人工智能]paddle nlp taskflow 全家桶 包括代码生成与AIGC图片生成 一起探索paddle nlp 开包即用的能力吧

paddle nlp作为自然语言处理领域的全家桶,具有很多的不错的开箱即用的nlp能力。今天我们来一起看看基于paddle nlp中taskflow开箱即用的能力有哪些。
第一步先升级aistudio中的paddlenlp 保持最新版本。

pip install -U paddlenlp
Looking in indexes: https://pypi.tuna.tsinghua.edu.cn/simple
Requirement already satisfied: paddlenlp in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (2.4.2)
Collecting paddlenlp
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/ff/93/f383925fdd45e9ea1c3dbb2ea7b6a7521c6a0169dd2f797252eb81712700/paddlenlp-2.4.5-py3-none-any.whl (1.9 MB)
     l     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 0.0/1.9 MB ? eta -:--:--━━?━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 0.1/1.9 MB 4.1 MB/s eta 0:00:01━━━━━━━━?━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 0.4/1.9 MB 6.3 MB/s eta 0:00:01━━━━━━━━━━━━━━━━━?━━━━━━━━━━━━━━━━━━━━━━ 0.8/1.9 MB 8.2 MB/s eta 0:00:01━━━━━━━━━━━━━━━━━━━━━━━━━━━?━━━━━━━━━━━━ 1.3/1.9 MB 9.8 MB/s eta 0:00:01━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━? 1.9/1.9 MB 11.4 MB/s eta 0:00:01━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━? 1.9/1.9 MB 10.1 MB/s eta 0:00:01━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.9/1.9 MB 9.1 MB/s eta 0:00:00
[?25hRequirement already satisfied: tqdm in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from paddlenlp) (4.64.1)
Requirement already satisfied: multiprocess<=0.70.12.2 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from paddlenlp) (0.70.11.1)
Requirement already satisfied: paddle2onnx in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from paddlenlp) (1.0.0)
Requirement already satisfied: jieba in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from paddlenlp) (0.42.1)
Requirement already satisfied: huggingface-hub>=0.10.1 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from paddlenlp) (0.11.0)
Collecting fastapi
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/d8/09/ce090f6d53ce8b6335954488087210fa1e054c4a65f74d5f76aed254c159/fastapi-0.88.0-py3-none-any.whl (55 kB)
     l     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 0.0/55.5 kB ? eta -:--:--━━━━━━━?━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 10.2/55.5 kB ? eta -:--:--━━━━━━━?━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 10.2/55.5 kB ? eta -:--:--━━━━━━━?━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 10.2/55.5 kB ? eta -:--:--━━━━━━━?━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 10.2/55.5 kB ? eta -:--:--━━━━━━━?━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 10.2/55.5 kB ? eta -:--:--━━━━━━━?━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 10.2/55.5 kB ? eta -:--:--━━━━━━━?━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 10.2/55.5 kB ? eta -:--:--━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 55.5/55.5 kB 176.9 kB/s eta 0:00:00
[?25hRequirement already satisfied: dill<0.3.5 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from paddlenlp) (0.3.3)
Requirement already satisfied: seqeval in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from paddlenlp) (1.2.2)
Requirement already satisfied: rich in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from paddlenlp) (12.6.0)
Requirement already satisfied: datasets>=2.0.0 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from paddlenlp) (2.7.0)
Requirement already satisfied: protobuf<=3.20.0,>=3.1.0 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from paddlenlp) (3.20.0)
Requirement already satisfied: colorama in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from paddlenlp) (0.4.4)
Requirement already satisfied: colorlog in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from paddlenlp) (4.1.0)
Requirement already satisfied: paddlefsl in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from paddlenlp) (1.1.0)
Requirement already satisfied: sentencepiece in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from paddlenlp) (0.1.96)
Collecting typer
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/0d/44/56c3f48d2bb83d76f5c970aef8e2c3ebd6a832f09e3621c5395371fe6999/typer-0.7.0-py3-none-any.whl (38 kB)
Collecting uvicorn
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/96/f3/f39ac8ac3bdf356b4934b8f7e56173e96681f67ef0cd92bd33a5059fae9e/uvicorn-0.20.0-py3-none-any.whl (56 kB)
     l     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 0.0/56.9 kB ? eta -:--:--━━━━━━━━━━━━━━━━━━━━━?━━━━━━━━━━━━━━━━━━ 30.7/56.9 kB 1.5 MB/s eta 0:00:01━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 56.9/56.9 kB 875.6 kB/s eta 0:00:00
[?25hRequirement already satisfied: visualdl in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from paddlenlp) (2.4.0)
Requirement already satisfied: pyyaml>=5.1 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from datasets>=2.0.0->paddlenlp) (5.1.2)
Requirement already satisfied: pyarrow>=6.0.0 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from datasets>=2.0.0->paddlenlp) (10.0.0)
Requirement already satisfied: xxhash in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from datasets>=2.0.0->paddlenlp) (3.1.0)
Requirement already satisfied: importlib-metadata in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from datasets>=2.0.0->paddlenlp) (4.2.0)
Requirement already satisfied: responses<0.19 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from datasets>=2.0.0->paddlenlp) (0.18.0)
Requirement already satisfied: packaging in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from datasets>=2.0.0->paddlenlp) (21.3)
Requirement already satisfied: aiohttp in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from datasets>=2.0.0->paddlenlp) (3.8.3)
Requirement already satisfied: pandas in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from datasets>=2.0.0->paddlenlp) (1.1.5)
Requirement already satisfied: fsspec[http]>=2021.11.1 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from datasets>=2.0.0->paddlenlp) (2022.11.0)
Requirement already satisfied: requests>=2.19.0 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from datasets>=2.0.0->paddlenlp) (2.24.0)
Requirement already satisfied: numpy>=1.17 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from datasets>=2.0.0->paddlenlp) (1.19.5)
Requirement already satisfied: filelock in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from huggingface-hub>=0.10.1->paddlenlp) (3.0.12)
Requirement already satisfied: typing-extensions>=3.7.4.3 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from huggingface-hub>=0.10.1->paddlenlp) (4.3.0)
Collecting starlette==0.22.0
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/1d/4e/30eda84159d5b3ad7fe663c40c49b16dd17436abe838f10a56c34bee44e8/starlette-0.22.0-py3-none-any.whl (64 kB)
     l     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 0.0/64.3 kB ? eta -:--:--━━━━━━?━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 10.2/64.3 kB ? eta -:--:--━━━━━━━━━━━━━━━━━━━━━━━━?━━━━━━━━━━━━━━ 41.0/64.3 kB 835.3 kB/s eta 0:00:01━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━?━ 61.4/64.3 kB 497.5 kB/s eta 0:00:01━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 64.3/64.3 kB 488.3 kB/s eta 0:00:00
[?25hCollecting pydantic!=1.7,!=1.7.1,!=1.7.2,!=1.7.3,!=1.8,!=1.8.1,<2.0.0,>=1.6.2
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/22/53/196c9a5752e30d682e493d7c00ea0a02377446578e577ae5e085010dc0bd/pydantic-1.10.2-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (11.8 MB)
     l     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 0.0/11.8 MB ? eta -:--:--━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 0.0/11.8 MB ? eta -:--:--━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 0.0/11.8 MB 490.3 kB/s eta 0:00:25━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 0.1/11.8 MB 897.5 kB/s eta 0:00:14━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 0.1/11.8 MB 881.4 kB/s eta 0:00:14?━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 0.2/11.8 MB 1.1 MB/s eta 0:00:11?━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 0.3/11.8 MB 1.2 MB/s eta 0:00:10?━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 0.3/11.8 MB 1.2 MB/s eta 0:00:10━?━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 0.4/11.8 MB 1.3 MB/s eta 0:00:09━?━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 0.4/11.8 MB 1.4 MB/s eta 0:00:09━?━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 0.5/11.8 MB 1.5 MB/s eta 0:00:08━━?━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 0.6/11.8 MB 1.6 MB/s eta 0:00:08━━?━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 0.7/11.8 MB 1.7 MB/s eta 0:00:07━━?━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 0.8/11.
[?25hRequirement already satisfied: anyio<5,>=3.4.0 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from starlette==0.22.0->fastapi->paddlenlp) (3.6.1)
Requirement already satisfied: commonmark<0.10.0,>=0.9.0 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from rich->paddlenlp) (0.9.1)
Requirement already satisfied: pygments<3.0.0,>=2.6.0 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from rich->paddlenlp) (2.13.0)
Requirement already satisfied: scikit-learn>=0.21.3 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from seqeval->paddlenlp) (0.24.2)
Requirement already satisfied: click<9.0.0,>=7.1.1 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from typer->paddlenlp) (8.0.4)
Collecting h11>=0.8
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/95/04/ff642e65ad6b90db43e668d70ffb6736436c7ce41fcc549f4e9472234127/h11-0.14.0-py3-none-any.whl (58 kB)
     l     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 0.0/58.3 kB ? eta -:--:--━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 58.3/58.3 kB 2.2 MB/s eta 0:00:00
[?25hRequirement already satisfied: six>=1.14.0 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from visualdl->paddlenlp) (1.16.0)
Requirement already satisfied: Pillow>=7.0.0 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from visualdl->paddlenlp) (8.2.0)
Requirement already satisfied: matplotlib in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from visualdl->paddlenlp) (2.2.3)
Requirement already satisfied: Flask-Babel>=1.0.0 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from visualdl->paddlenlp) (1.0.0)
Requirement already satisfied: bce-python-sdk in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from visualdl->paddlenlp) (0.8.53)
Requirement already satisfied: flask>=1.1.1 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from visualdl->paddlenlp) (1.1.1)
Requirement already satisfied: itsdangerous>=0.24 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from flask>=1.1.1->visualdl->paddlenlp) (1.1.0)
Requirement already satisfied: Werkzeug>=0.15 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from flask>=1.1.1->visualdl->paddlenlp) (0.16.0)
Requirement already satisfied: Jinja2>=2.10.1 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from flask>=1.1.1->visualdl->paddlenlp) (3.0.0)
Requirement already satisfied: Babel>=2.3 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from Flask-Babel>=1.0.0->visualdl->paddlenlp) (2.8.0)
Requirement already satisfied: pytz in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from Flask-Babel>=1.0.0->visualdl->paddlenlp) (2019.3)
Requirement already satisfied: multidict<7.0,>=4.5 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from aiohttp->datasets>=2.0.0->paddlenlp) (6.0.2)
Requirement already satisfied: yarl<2.0,>=1.0 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from aiohttp->datasets>=2.0.0->paddlenlp) (1.7.2)
Requirement already satisfied: asynctest==0.13.0 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from aiohttp->datasets>=2.0.0->paddlenlp) (0.13.0)
Requirement already satisfied: charset-normalizer<3.0,>=2.0 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from aiohttp->datasets>=2.0.0->paddlenlp) (2.1.1)
Requirement already satisfied: frozenlist>=1.1.1 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from aiohttp->datasets>=2.0.0->paddlenlp) (1.3.0)
Requirement already satisfied: aiosignal>=1.1.2 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from aiohttp->datasets>=2.0.0->paddlenlp) (1.2.0)
Requirement already satisfied: attrs>=17.3.0 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from aiohttp->datasets>=2.0.0->paddlenlp) (22.1.0)
Requirement already satisfied: async-timeout<5.0,>=4.0.0a3 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from aiohttp->datasets>=2.0.0->paddlenlp) (4.0.2)
Requirement already satisfied: pyparsing!=3.0.5,>=2.0.2 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from packaging->datasets>=2.0.0->paddlenlp) (3.0.9)
Requirement already satisfied: idna<3,>=2.5 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from requests>=2.19.0->datasets>=2.0.0->paddlenlp) (2.8)
Requirement already satisfied: urllib3!=1.25.0,!=1.25.1,<1.26,>=1.21.1 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from requests>=2.19.0->datasets>=2.0.0->paddlenlp) (1.25.11)
Requirement already satisfied: chardet<4,>=3.0.2 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from requests>=2.19.0->datasets>=2.0.0->paddlenlp) (3.0.4)
Requirement already satisfied: certifi>=2017.4.17 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from requests>=2.19.0->datasets>=2.0.0->paddlenlp) (2019.9.11)
Requirement already satisfied: scipy>=0.19.1 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from scikit-learn>=0.21.3->seqeval->paddlenlp) (1.6.3)
Requirement already satisfied: threadpoolctl>=2.0.0 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from scikit-learn>=0.21.3->seqeval->paddlenlp) (2.1.0)
Requirement already satisfied: joblib>=0.11 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from scikit-learn>=0.21.3->seqeval->paddlenlp) (0.14.1)
Requirement already satisfied: future>=0.6.0 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from bce-python-sdk->visualdl->paddlenlp) (0.18.0)
Requirement already satisfied: pycryptodome>=3.8.0 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from bce-python-sdk->visualdl->paddlenlp) (3.9.9)
Requirement already satisfied: zipp>=0.5 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from importlib-metadata->datasets>=2.0.0->paddlenlp) (3.8.1)
Requirement already satisfied: python-dateutil>=2.1 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from matplotlib->visualdl->paddlenlp) (2.8.2)
Requirement already satisfied: cycler>=0.10 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from matplotlib->visualdl->paddlenlp) (0.10.0)
Requirement already satisfied: kiwisolver>=1.0.1 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from matplotlib->visualdl->paddlenlp) (1.1.0)
Requirement already satisfied: sniffio>=1.1 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from anyio<5,>=3.4.0->starlette==0.22.0->fastapi->paddlenlp) (1.3.0)
Requirement already satisfied: MarkupSafe>=2.0.0rc2 in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from Jinja2>=2.10.1->flask>=1.1.1->visualdl->paddlenlp) (2.0.1)
Requirement already satisfied: setuptools in /opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages (from kiwisolver>=1.0.1->matplotlib->visualdl->paddlenlp) (56.2.0)
Installing collected packages: pydantic, h11, starlette, uvicorn, typer, fastapi, paddlenlp
  Attempting uninstall: paddlenlp
    Found existing installation: paddlenlp 2.4.2
    Uninstalling paddlenlp-2.4.2:
      Successfully uninstalled paddlenlp-2.4.2
Successfully installed fastapi-0.88.0 h11-0.14.0 paddlenlp-2.4.5 pydantic-1.10.2 starlette-0.22.0 typer-0.7.0 uvicorn-0.20.0

[notice] A new release of pip available: 22.1.2 -> 22.3.1
[notice] To update, run: pip install --upgrade pip
Note: you may need to restart the kernel to use updated packages.

第一个能力非常的惊艳,是一个代码生成模型。上来就来了一个超必杀。

pip install regex
Looking in indexes: https://pypi.tuna.tsinghua.edu.cn/simple
Collecting regex
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/42/d8/8a7131e7d0bf237f7bcd3191541a4bf21863c253fe6bee0796900a1a9a29/regex-2022.10.31-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (757 kB)
     l     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 0.0/757.1 kB ? eta -:--:--━━━━━━?━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 122.9/757.1 kB 3.5 MB/s eta 0:00:01━━━━━━━━?━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 174.1/757.1 kB 2.6 MB/s eta 0:00:01━━━━━━━━━━━━━━━?━━━━━━━━━━━━━━━━━━━━━━━ 307.2/757.1 kB 2.9 MB/s eta 0:00:01━━━━━━━━━━━━━━━━━━━━?━━━━━━━━━━━━━━━━━━ 399.4/757.1 kB 2.9 MB/s eta 0:00:01━━━━━━━━━━━━━━━━━━━━━━━━━━?━━━━━━━━━━━━ 522.2/757.1 kB 3.0 MB/s eta 0:00:01━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━?━━━━━━ 634.9/757.1 kB 3.0 MB/s eta 0:00:01━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━?━━━ 696.3/757.1 kB 2.9 MB/s eta 0:00:01━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━? 747.5/757.1 kB 2.9 MB/s eta 0:00:01━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━? 747.5/757.1 kB 2.9 MB/s eta 0:00:01━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━? 747.5/757.1 kB 2.9 MB/s eta 0:00:01━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━? 747.5/757.1 kB 2.9 MB/s eta 0:00:01━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━? 747.5/757.1 kB 2.9 MB/s eta 0:00:01━━━━━━━━━━━━━━━━━━━━━━
[?25hInstalling collected packages: regex
Successfully installed regex-2022.10.31

[notice] A new release of pip available: 22.1.2 -> 22.3.1
[notice] To update, run: pip install --upgrade pip
Note: you may need to restart the kernel to use updated packages.
           from paddlenlp import Taskflow 

           codegen = Taskflow("code_generation")
           codegen("def hello_world():")
        #    '''
        #    ['\n    print("Hello world")']
        #    '''
[2022-12-23 11:07:33,283] [    INFO] - Already cached /home/aistudio/.paddlenlp/models/Salesforce/codegen-350M-mono/vocab.json
[2022-12-23 11:07:33,286] [    INFO] - Already cached /home/aistudio/.paddlenlp/models/Salesforce/codegen-350M-mono/merges.txt
[2022-12-23 11:07:33,288] [    INFO] - Already cached /home/aistudio/.paddlenlp/models/Salesforce/codegen-350M-mono/added_tokens.json
[2022-12-23 11:07:33,291] [    INFO] - Already cached /home/aistudio/.paddlenlp/models/Salesforce/codegen-350M-mono/special_tokens_map.json
[2022-12-23 11:07:33,293] [    INFO] - Already cached /home/aistudio/.paddlenlp/models/Salesforce/codegen-350M-mono/tokenizer_config.json
[2022-12-23 11:07:33,373] [    INFO] - Adding                                 to the vocabulary
[2022-12-23 11:07:33,375] [    INFO] - Adding                                to the vocabulary
[2022-12-23 11:07:33,377] [    INFO] - Adding                               to the vocabulary
[2022-12-23 11:07:33,379] [    INFO] - Adding                              to the vocabulary
[2022-12-23 11:07:33,381] [    INFO] - Adding                             to the vocabulary
[2022-12-23 11:07:33,383] [    INFO] - Adding                            to the vocabulary
[2022-12-23 11:07:33,385] [    INFO] - Adding                           to the vocabulary
[2022-12-23 11:07:33,387] [    INFO] - Adding                          to the vocabulary
[2022-12-23 11:07:33,389] [    INFO] - Adding                         to the vocabulary
[2022-12-23 11:07:33,391] [    INFO] - Adding                        to the vocabulary
[2022-12-23 11:07:33,393] [    INFO] - Adding                       to the vocabulary
[2022-12-23 11:07:33,394] [    INFO] - Adding                      to the vocabulary
[2022-12-23 11:07:33,398] [    INFO] - Adding                     to the vocabulary
[2022-12-23 11:07:33,400] [    INFO] - Adding                    to the vocabulary
[2022-12-23 11:07:33,402] [    INFO] - Adding                   to the vocabulary
[2022-12-23 11:07:33,404] [    INFO] - Adding                  to the vocabulary
[2022-12-23 11:07:33,406] [    INFO] - Adding                 to the vocabulary
[2022-12-23 11:07:33,408] [    INFO] - Adding                to the vocabulary
[2022-12-23 11:07:33,410] [    INFO] - Adding               to the vocabulary
[2022-12-23 11:07:33,412] [    INFO] - Adding              to the vocabulary
[2022-12-23 11:07:33,414] [    INFO] - Adding             to the vocabulary
[2022-12-23 11:07:33,416] [    INFO] - Adding            to the vocabulary
[2022-12-23 11:07:33,418] [    INFO] - Adding           to the vocabulary
[2022-12-23 11:07:33,419] [    INFO] - Adding          to the vocabulary
[2022-12-23 11:07:33,421] [    INFO] - Adding         to the vocabulary
[2022-12-23 11:07:33,423] [    INFO] - Adding        to the vocabulary
[2022-12-23 11:07:33,425] [    INFO] - Adding       to the vocabulary
[2022-12-23 11:07:33,427] [    INFO] - Adding      to the vocabulary
[2022-12-23 11:07:33,430] [    INFO] - Adding     to the vocabulary
[2022-12-23 11:07:33,432] [    INFO] - Adding    to the vocabulary
[2022-12-23 11:07:33,434] [    INFO] - Adding 									 to the vocabulary
[2022-12-23 11:07:33,436] [    INFO] - Adding 								 to the vocabulary
[2022-12-23 11:07:33,439] [    INFO] - Adding 							 to the vocabulary
[2022-12-23 11:07:33,441] [    INFO] - Adding 						 to the vocabulary
[2022-12-23 11:07:33,443] [    INFO] - Adding 					 to the vocabulary
[2022-12-23 11:07:33,445] [    INFO] - Adding 				 to the vocabulary
[2022-12-23 11:07:33,447] [    INFO] - Adding 			 to the vocabulary
[2022-12-23 11:07:33,450] [    INFO] - Adding 		 to the vocabulary
[2022-12-23 11:07:33,452] [    INFO] - Already cached /home/aistudio/.paddlenlp/models/Salesforce/codegen-350M-mono/model_state.pdparams
[2022-12-23 11:07:33,454] [    INFO] - Already cached /home/aistudio/.paddlenlp/models/Salesforce/codegen-350M-mono/model_config.json





['\n    """Prints out a greeting."""\n    print("Hello World!")']

句法依存分析
需要安装LAC

pip install LAC --upgrade
Looking in indexes: https://pypi.tuna.tsinghua.edu.cn/simple
Collecting LAC
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/c3/88/966e99c95cac93730a7f3cdf92a17e2a0e924bea61b9a86ae7995feaa4fe/LAC-2.1.2.tar.gz (64.8 MB)
     l     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 0.0/64.8 MB ? eta -:--:--━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 0.2/64.8 MB 5.6 MB/s eta 0:00:12━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 0.3/64.8 MB 4.5 MB/s eta 0:00:15━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 0.5/64.8 MB 4.4 MB/s eta 0:00:15━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 0.6/64.8 MB 4.3 MB/s eta 0:00:15━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 0.7/64.8 MB 4.2 MB/s eta 0:00:16?━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 0.9/64.8 MB 4.2 MB/s eta 0:00:16?━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.0/64.8 MB 4.2 MB/s eta 0:00:16?━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.1/64.8 MB 4.2 MB/s eta 0:00:16?━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.3/64.8 MB 4.1 MB/s eta 0:00:16?━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.4/64.8 MB 4.1 MB/s eta 0:00:16?━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.5/64.8 MB 4.0 MB/s eta 0:00:16?━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.6/64.8 MB 3.9 MB/s eta 0:00:17━?━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.8/64
[?25h  Preparing metadata (setup.py) ... [?25ldone
[?25hBuilding wheels for collected packages: LAC
  Building wheel for LAC (setup.py) ... [?25ldone
[?25h  Created wheel for LAC: filename=LAC-2.1.2-py2.py3-none-any.whl size=64814683 sha256=651514f67476935e9c56b3793243ab9d4549695b1941ecc3f5c9edb4aff5d1f0
  Stored in directory: /home/aistudio/.cache/pip/wheels/7b/db/c0/9ca0a499e40c78935cd8afd72372cc4a85d986ba2265285d20
Successfully built LAC
Installing collected packages: LAC
Successfully installed LAC-2.1.2

[notice] A new release of pip available: 22.1.2 -> 22.3.1
[notice] To update, run: pip install --upgrade pip
Note: you may need to restart the kernel to use updated packages.
           from paddlenlp import Taskflow 

           ddp = Taskflow("dependency_parsing")
           ddp("三亚是一座美丽的城市")
           '''
           [{'word': ['三亚', '是', '一座', '美丽', '的', '城市'], 'head': [2, 0, 6, 6, 4, 2], 'deprel': ['SBV', 'HED', 'ATT', 'ATT', 'MT', 'VOB']}]
           '''
           ddp(["三亚是一座美丽的城市", "他送了一本书"])
           '''
           [{'word': ['三亚', '是', '一座', '美丽', '的', '城市'], 'head': [2, 0, 6, 6, 4, 2], 'deprel': ['SBV', 'HED', 'ATT', 'ATT', 'MT', 'VOB']}, {'word': ['他', '送', '了', '一本', '书'], 'head': [2, 0, 2, 5, 2], 'deprel': ['SBV', 'HED', 'MT', 'ATT', 'VOB']}]
           '''       

           ddp = Taskflow("dependency_parsing", prob=True, use_pos=True)
           ddp("三亚是一座美丽的城市")
           '''
           [{'word': ['三亚', '是', '一座', '美丽的城市'], 'head': [2, 0, 4, 2], 'deprel': ['SBV', 'HED', 'ATT', 'VOB'], 'postag': ['LOC', 'v', 'm', 'n'], 'prob': [1.0, 1.0, 1.0, 1.0]}]
           '''

           ddp = Taskflow("dependency_parsing", model="ddparser-ernie-1.0")
           ddp("三亚是一座美丽的城市")
           '''
           [{'word': ['三亚', '是', '一座', '美丽', '的', '城市'], 'head': [2, 0, 6, 6, 4, 2], 'deprel': ['SBV', 'HED', 'ATT', 'ATT', 'MT', 'VOB']}]
           '''

           ddp = Taskflow("dependency_parsing", model="ddparser-ernie-gram-zh")
           ddp("三亚是一座美丽的城市")
           '''
           [{'word': ['三亚', '是', '一座', '美丽', '的', '城市'], 'head': [2, 0, 6, 6, 4, 2], 'deprel': ['SBV', 'HED', 'ATT', 'ATT', 'MT', 'VOB']}]
           '''

           # 已分词输入
           ddp = Taskflow("dependency_parsing", segmented=True)
           ddp.from_segments([["三亚", "是", "一座", "美丽", "的", "城市"]])
           '''
           [{'word': ['三亚', '是', '一座', '美丽', '的', '城市'], 'head': [2, 0, 6, 6, 4, 2], 'deprel': ['SBV', 'HED', 'ATT', 'ATT', 'MT', 'VOB']}]
           '''
           ddp.from_segments([['三亚', '是', '一座', '美丽', '的', '城市'], ['他', '送', '了', '一本', '书']])
           '''
           [{'word': ['三亚', '是', '一座', '美丽', '的', '城市'], 'head': [2, 0, 6, 6, 4, 2], 'deprel': ['SBV', 'HED', 'ATT', 'ATT', 'MT', 'VOB']}, {'word': ['他', '送', '了', '一本', '书'], 'head': [2, 0, 2, 5, 2], 'deprel': ['SBV', 'HED', 'MT', 'ATT', 'VOB']}]
           '''   
       
W1223 11:15:20.603538   215 analysis_predictor.cc:2160] Deprecated. Please use CreatePredictor instead.
[2022-12-23 11:15:20,775] [    INFO] - Converting to the inference model cost a little time.
[2022-12-23 11:15:22,356] [    INFO] - The inference model save in the path:/home/aistudio/.paddlenlp/taskflow/dependency_parsing/ddparser/static/inference
[2022-12-23 11:15:23,832] [    INFO] - Downloading model_state.pdparams from https://bj.bcebos.com/paddlenlp/taskflow/dependency_parsing/ddparser-ernie-1.0/model_state.pdparams
100%|██████████| 386M/386M [00:15<00:00, 26.5MB/s] 
[2022-12-23 11:15:40,157] [    INFO] - Downloading word_vocab.json from https://bj.bcebos.com/paddlenlp/taskflow/dependency_parsing/ddparser-ernie-1.0/word_vocab.json
100%|██████████| 612k/612k [00:00<00:00, 2.14MB/s]
[2022-12-23 11:15:40,588] [    INFO] - Downloading rel_vocab.json from https://bj.bcebos.com/paddlenlp/taskflow/dependency_parsing/ddparser-ernie-1.0/rel_vocab.json
100%|██████████| 526/526 [00:00<00:00, 374kB/s]
[2022-12-23 11:15:40,708] [    INFO] - Downloading SourceHanSansCN-Regular.ttf from https://bj.bcebos.com/paddlenlp/taskflow/dependency_parsing/SourceHanSansCN-Regular.ttf
100%|██████████| 8.28M/8.28M [00:00<00:00, 15.2MB/s]
[2022-12-23 11:15:41,430] [    INFO] - We are using <class 'paddlenlp.transformers.ernie.modeling.ErnieModel'> to load 'ernie-1.0'.
[2022-12-23 11:15:41,433] [    INFO] - Downloading https://bj.bcebos.com/paddlenlp/models/transformers/ernie/ernie_v1_chn_base.pdparams and saved to /home/aistudio/.paddlenlp/models/ernie-1.0
[2022-12-23 11:15:41,435] [    INFO] - Downloading ernie_v1_chn_base.pdparams from https://bj.bcebos.com/paddlenlp/models/transformers/ernie/ernie_v1_chn_base.pdparams
100%|██████████| 383M/383M [00:18<00:00, 21.2MB/s] 
[2022-12-23 11:16:01,458] [    INFO] - Weights from pretrained model not used in ErnieModel: ['cls.predictions.layer_norm.weight', 'cls.predictions.decoder_bias', 'cls.predictions.transform.bias', 'cls.predictions.transform.weight', 'cls.predictions.layer_norm.bias']
[2022-12-23 11:16:02,480] [    INFO] - Converting to the inference model cost a little time.
[2022-12-23 11:16:16,645] [    INFO] - The inference model save in the path:/home/aistudio/.paddlenlp/taskflow/dependency_parsing/ddparser-ernie-1.0/static/inference
[2022-12-23 11:16:19,060] [    INFO] - Downloading model_state.pdparams from https://bj.bcebos.com/paddlenlp/taskflow/dependency_parsing/ddparser-ernie-gram-zh/model_state.pdparams
100%|██████████| 386M/386M [00:14<00:00, 27.1MB/s] 
[2022-12-23 11:16:35,036] [    INFO] - Downloading word_vocab.json from https://bj.bcebos.com/paddlenlp/taskflow/dependency_parsing/ddparser-ernie-gram-zh/word_vocab.json
100%|██████████| 590k/590k [00:00<00:00, 5.34MB/s]
[2022-12-23 11:16:35,329] [    INFO] - Downloading rel_vocab.json from https://bj.bcebos.com/paddlenlp/taskflow/dependency_parsing/ddparser-ernie-gram-zh/rel_vocab.json
100%|██████████| 526/526 [00:00<00:00, 522kB/s]
[2022-12-23 11:16:35,439] [    INFO] - Downloading SourceHanSansCN-Regular.ttf from https://bj.bcebos.com/paddlenlp/taskflow/dependency_parsing/SourceHanSansCN-Regular.ttf
100%|██████████| 8.28M/8.28M [00:00<00:00, 20.0MB/s]
[2022-12-23 11:16:36,016] [    INFO] - We are using <class 'paddlenlp.transformers.ernie_gram.modeling.ErnieGramModel'> to load 'ernie-gram-zh'.
[2022-12-23 11:16:36,019] [    INFO] - Downloading https://bj.bcebos.com/paddlenlp/models/transformers/ernie_gram_zh/ernie_gram_zh.pdparams and saved to /home/aistudio/.paddlenlp/models/ernie-gram-zh
[2022-12-23 11:16:36,021] [    INFO] - Downloading ernie_gram_zh.pdparams from https://bj.bcebos.com/paddlenlp/models/transformers/ernie_gram_zh/ernie_gram_zh.pdparams
100%|██████████| 570M/570M [00:17<00:00, 34.8MB/s] 
[2022-12-23 11:16:57,735] [    INFO] - Converting to the inference model cost a little time.
[2022-12-23 11:17:09,064] [    INFO] - The inference model save in the path:/home/aistudio/.paddlenlp/taskflow/dependency_parsing/ddparser-ernie-gram-zh/static/inference





"\n[{'word': ['三亚', '是', '一座', '美丽', '的', '城市'], 'head': [2, 0, 6, 6, 4, 2], 'deprel': ['SBV', 'HED', 'ATT', 'ATT', 'MT', 'VOB']}, {'word': ['他', '送', '了', '一本', '书'], 'head': [2, 0, 2, 5, 2], 'deprel': ['SBV', 'HED', 'MT', 'ATT', 'VOB']}]\n"
taskflow中基于plato-mini的
           from paddlenlp import Taskflow 

           # 非交互模式
           dialogue = Taskflow("dialogue")
           dialogue(["吃饭了吗"])
           '''
           ['刚吃完饭,你在干什么呢?']
           '''
           dialogue(["你好", "吃饭了吗"], ["你是谁?"])
           '''
           ['吃过了,你呢', '我是李明啊']
           '''

           dialogue = Taskflow("dialogue")
           # 进入交互模式 (输入exit退出)
           dialogue.interactive_mode(max_turn=3)
[2022-12-23 11:36:39,441] [    INFO] - Downloading model_state.pdparams from https://bj.bcebos.com/paddlenlp/taskflow/dialogue/plato-mini/model_state.pdparams
100%|██████████| 518M/518M [00:12<00:00, 42.3MB/s] 
[2022-12-23 11:36:53,635] [    INFO] - Downloading model_config.json from https://bj.bcebos.com/paddlenlp/taskflow/dialogue/plato-mini/model_config.json
100%|██████████| 518/518 [00:00<00:00, 296kB/s]
[2022-12-23 11:36:53,726] [    INFO] - Downloading https://bj.bcebos.com/paddlenlp/models/transformers/unified_transformer/plato-mini-vocab.txt and saved to /home/aistudio/.paddlenlp/models/plato-mini
[2022-12-23 11:36:53,729] [    INFO] - Downloading plato-mini-vocab.txt from https://bj.bcebos.com/paddlenlp/models/transformers/unified_transformer/plato-mini-vocab.txt
100%|██████████| 410k/410k [00:00<00:00, 3.24MB/s]
[2022-12-23 11:36:53,952] [    INFO] - Downloading https://bj.bcebos.com/paddlenlp/models/transformers/unified_transformer/plato-mini-spm.model and saved to /home/aistudio/.paddlenlp/models/plato-mini
[2022-12-23 11:36:53,955] [    INFO] - Downloading plato-mini-spm.model from https://bj.bcebos.com/paddlenlp/models/transformers/unified_transformer/plato-mini-spm.model
100%|██████████| 711k/711k [00:00<00:00, 6.92MB/s]
[2022-12-23 11:36:54,234] [    INFO] - tokenizer config file saved in /home/aistudio/.paddlenlp/models/plato-mini/tokenizer_config.json
[2022-12-23 11:36:54,237] [    INFO] - Special tokens file saved in /home/aistudio/.paddlenlp/models/plato-mini/special_tokens_map.json
Building prefix dict from the default dictionary ...
Dumping model to file cache /tmp/jieba.cache
Loading model cost 0.791 seconds.
Prefix dict has been built successfully.
[2022-12-23 11:37:03,058] [    INFO] - Already cached /home/aistudio/.paddlenlp/models/plato-mini/plato-mini-vocab.txt
[2022-12-23 11:37:03,060] [    INFO] - Already cached /home/aistudio/.paddlenlp/models/plato-mini/plato-mini-spm.model
[2022-12-23 11:37:03,119] [    INFO] - tokenizer config file saved in /home/aistudio/.paddlenlp/models/plato-mini/tokenizer_config.json
[2022-12-23 11:37:03,122] [    INFO] - Special tokens file saved in /home/aistudio/.paddlenlp/models/plato-mini/special_tokens_map.json


[Human]: 对话实现


[Bot]:你是个很有爱心的人,我很高兴认识你


[Human]: 你是一个什么样的人


[Bot]:我是一个很有爱心的人,你有什么事情吗?


[Human]: 你哪里有爱心了


[Bot]:这不,我说的是你的事情,我只是一个很有爱心的人


[Human]: exit



---------------------------------------------------------------------------

NameError                                 Traceback (most recent call last)

/tmp/ipykernel_215/731744024.py in <module>
     14 dialogue = Taskflow("dialogue")
     15 # 进入交互模式 (输入exit退出)
---> 16 dialogue.interactive_mode(max_turn=3)


/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddlenlp/taskflow/taskflow.py in interactive_mode(self, max_turn)
    610                 human = input("[Human]:").strip()
    611                 if human.lower() == "exit":
--> 612                     exit()
    613                 robot = self.task_instance(human)[0]
    614                 print("[Bot]:%s" % robot)


NameError: name 'exit' is not defined
taskflow实现的基于uie模型的信息抽取
            from paddlenlp import Taskflow

            # 实体抽取 Entity Extraction
            schema = ['时间', '选手', '赛事名称'] # Define the schema for entity extraction
            ie = Taskflow('information_extraction', schema=schema)
            ie("2月8日上午北京冬奥会自由式滑雪女子大跳台决赛中中国选手谷爱凌以188.25分获得金牌!")
            '''
            [{'时间': [{'text': '2月8日上午', 'start': 0, 'end': 6, 'probability': 0.9857378532924486}], '选手': [{'text': '谷爱凌', 'start': 28, 'end': 31, 'probability': 0.8981548639781138}], '赛事名称': [{'text': '北京冬奥会自由式滑雪女子大跳台决赛', 'start': 6, 'end': 23, 'probability': 0.8503089953268272}]}]
            '''

            # 关系抽取 Relation Extraction
            schema = [{"歌曲名称":["歌手", "所属专辑"]}] # Define the schema for relation extraction
            ie.set_schema(schema) # Reset schema
            ie("《告别了》是孙耀威在专辑爱的故事里面的歌曲")
            '''
            [{'歌曲名称': [{'text': '告别了', 'start': 1, 'end': 4, 'probability': 0.6296155977145546, 'relations': {'歌手': [{'text': '孙耀威', 'start': 6, 'end': 9, 'probability': 0.9988381005599081}], '所属专辑': [{'text': '爱的故事', 'start': 12, 'end': 16, 'probability': 0.9968462078543183}]}}, {'text': '爱的故事', 'start': 12, 'end': 16, 'probability': 0.2816869478191606, 'relations': {'歌手': [{'text': '孙耀威', 'start': 6, 'end': 9, 'probability': 0.9951415104192272}]}}]}]
            '''

            # 事件抽取 Event Extraction
            schema = [{'地震触发词': ['地震强度', '时间', '震中位置', '震源深度']}] # Define the schema for event extraction
            ie.set_schema(schema) # Reset schema
            ie('中国地震台网正式测定:5月16日06时08分在云南临沧市凤庆县(北纬24.34度,东经99.98度)发生3.5级地震,震源深度10千米。')
            '''
            [{'地震触发词': [{'text': '地震', 'start': 56, 'end': 58, 'probability': 0.9977425555988333, 'relations': {'地震强度': [{'text': '3.5级', 'start': 52, 'end': 56, 'probability': 0.998080217831891}], '时间': [{'text': '5月16日06时08分', 'start': 11, 'end': 22, 'probability': 0.9853299772936026}], '震中位置': [{'text': '云南临沧市凤庆县(北纬24.34度,东经99.98度)', 'start': 23, 'end': 50, 'probability': 0.7874012889740385}], '震源深度': [{'text': '10千米', 'start': 63, 'end': 67, 'probability': 0.9937974422968665}]}}]}]
            '''

            # 观点提取Opinion Extraction
            schema = [{'评价维度': ['观点词', '情感倾向[正向,负向]']}] # Define the schema for opinion extraction
            ie.set_schema(schema) # Reset schema
            ie("地址不错,服务一般,设施陈旧")
            '''
            [{'评价维度': [{'text': '地址', 'start': 0, 'end': 2, 'probability': 0.9888139270606509, 'relations': {'观点词': [{'text': '不错', 'start': 2, 'end': 4, 'probability': 0.9927847072459528}], '情感倾向[正向,负向]': [{'text': '正向', 'probability': 0.998228967796706}]}}, {'text': '设施', 'start': 10, 'end': 12, 'probability': 0.9588297379365116, 'relations': {'观点词': [{'text': '陈旧', 'start': 12, 'end': 14, 'probability': 0.9286753967902683}], '情感倾向[正向,负向]': [{'text': '负向', 'probability': 0.9949389795770394}]}}, {'text': '服务', 'start': 5, 'end': 7, 'probability': 0.9592857070501211, 'relations': {'观点词': [{'text': '一般', 'start': 7, 'end': 9, 'probability': 0.9949359182521675}], '情感倾向[正向,负向]': [{'text': '负向', 'probability': 0.9952498258302498}]}}]}]
            '''

            #句子级别文本分类 Sentence-level Sentiment Classification
            schema = ['情感倾向[正向,负向]'] # Define the schema for sentence-level sentiment classification
            ie.set_schema(schema) # Reset schema
            ie('这个产品用起来真的很流畅,我非常喜欢')
            '''
            [{'情感倾向[正向,负向]': [{'text': '正向', 'probability': 0.9990024058203417}]}]
            '''

            # 英文模型 English Model
            schema = [{'Person': ['Company', 'Position']}]
            ie_en = Taskflow('information_extraction', schema=schema, model='uie-base-en')
            ie_en('In 1997, Steve was excited to become the CEO of Apple.')
            '''
            [{'Person': [{'text': 'Steve', 'start': 9, 'end': 14, 'probability': 0.999631971804547, 'relations': {'Company': [{'text': 'Apple', 'start': 48, 'end': 53, 'probability': 0.9960158209451642}], 'Position': [{'text': 'CEO', 'start': 41, 'end': 44, 'probability': 0.8871063806420736}]}}]}]
            '''

            schema = ['Sentiment classification [negative, positive]']
            ie_en.set_schema(schema)
            ie_en('I am sorry but this is the worst film I have ever seen in my life.')
            '''
            [{'Sentiment classification [negative, positive]': [{'text': 'negative', 'probability': 0.9998415771287057}]}]
            '''

            schema = [{'Comment object': ['Opinion', 'Sentiment classification [negative, positive]']}]
            ie_en.set_schema(schema)
            ie_en("overall i 'm happy with my toy.")
[2022-12-23 11:46:16,047] [    INFO] - Downloading model_state.pdparams from https://bj.bcebos.com/paddlenlp/taskflow/information_extraction/uie_base_v1.0/model_state.pdparams
100%|██████████| 450M/450M [00:08<00:00, 54.8MB/s] 
[2022-12-23 11:46:25,850] [    INFO] - Downloading model_config.json from https://bj.bcebos.com/paddlenlp/taskflow/information_extraction/uie_base/model_config.json
100%|██████████| 377/377 [00:00<00:00, 238kB/s]
[2022-12-23 11:46:25,917] [    INFO] - Downloading vocab.txt from https://bj.bcebos.com/paddlenlp/taskflow/information_extraction/uie_base/vocab.txt
100%|██████████| 182k/182k [00:00<00:00, 2.11MB/s]
[2022-12-23 11:46:26,088] [    INFO] - Downloading special_tokens_map.json from https://bj.bcebos.com/paddlenlp/taskflow/information_extraction/uie_base/special_tokens_map.json
100%|██████████| 112/112 [00:00<00:00, 93.7kB/s]
[2022-12-23 11:46:26,162] [    INFO] - Downloading tokenizer_config.json from https://bj.bcebos.com/paddlenlp/taskflow/information_extraction/uie_base/tokenizer_config.json
100%|██████████| 172/172 [00:00<00:00, 115kB/s]
[2022-12-23 11:46:27,262] [    INFO] - Converting to the inference model cost a little time.
[2022-12-23 11:46:38,138] [    INFO] - The inference model save in the path:/home/aistudio/.paddlenlp/taskflow/information_extraction/uie-base/static/inference
[2022-12-23 11:46:40,277] [    INFO] - We are using <class 'paddlenlp.transformers.ernie.tokenizer.ErnieTokenizer'> to load '/home/aistudio/.paddlenlp/taskflow/information_extraction/uie-base'.
[2022-12-23 11:46:40,749] [    INFO] - Downloading model_state.pdparams from https://bj.bcebos.com/paddlenlp/taskflow/information_extraction/uie_base_en_v1.1/model_state.pdparams
100%|██████████| 418M/418M [00:11<00:00, 37.7MB/s] 
[2022-12-23 11:46:53,499] [    INFO] - Downloading model_config.json from https://bj.bcebos.com/paddlenlp/taskflow/information_extraction/uie_base_en/model_config.json
100%|██████████| 347/347 [00:00<00:00, 256kB/s]
[2022-12-23 11:46:53,599] [    INFO] - Downloading vocab.txt from https://bj.bcebos.com/paddlenlp/taskflow/information_extraction/uie_base_en/vocab.txt
100%|██████████| 226k/226k [00:00<00:00, 2.52MB/s]
[2022-12-23 11:46:53,890] [    INFO] - Downloading special_tokens_map.json from https://bj.bcebos.com/paddlenlp/taskflow/information_extraction/uie_base_en/special_tokens_map.json
100%|██████████| 112/112 [00:00<00:00, 96.5kB/s]
[2022-12-23 11:46:53,972] [    INFO] - Downloading tokenizer_config.json from https://bj.bcebos.com/paddlenlp/taskflow/information_extraction/uie_base_en/tokenizer_config.json
100%|██████████| 172/172 [00:00<00:00, 151kB/s]
[2022-12-23 11:46:54,977] [    INFO] - Converting to the inference model cost a little time.
[2022-12-23 11:47:05,378] [    INFO] - The inference model save in the path:/home/aistudio/.paddlenlp/taskflow/information_extraction/uie-base-en/static/inference
[2022-12-23 11:47:07,383] [    INFO] - We are using <class 'paddlenlp.transformers.ernie.tokenizer.ErnieTokenizer'> to load '/home/aistudio/.paddlenlp/taskflow/information_extraction/uie-base-en'.





[{}]
基于WordTag的知识标注
          from paddlenlp import Taskflow 

          # 默认使用WordTag词类知识标注工具
          wordtag = Taskflow("knowledge_mining", model="wordtag")
          wordtag("《孤女》是2010年九州出版社出版的小说,作者是余兼羽")
          '''
          [{'text': '《孤女》是2010年九州出版社出版的小说,作者是余兼羽', 'items': [{'item': '《', 'offset': 0, 'wordtag_label': 'w', 'length': 1}, {'item': '孤女', 'offset': 1, 'wordtag_label': '作品类_实体', 'length': 2}, {'item': '》', 'offset': 3, 'wordtag_label': 'w', 'length': 1}, {'item': '是', 'offset': 4, 'wordtag_label': '肯定词', 'length': 1, 'termid': '肯定否定词_cb_是'}, {'item': '2010年', 'offset': 5, 'wordtag_label': '时间类', 'length': 5, 'termid': '时间阶段_cb_2010年'}, {'item': '九州出版社', 'offset': 10, 'wordtag_label': '组织机构类', 'length': 5, 'termid': '组织机构_eb_九州出版社'}, {'item': '出版', 'offset': 15, 'wordtag_label': '场景事件', 'length': 2, 'termid': '场景事件_cb_出版'}, {'item': '的', 'offset': 17, 'wordtag_label': '助词', 'length': 1, 'termid': '助词_cb_的'}, {'item': '小说', 'offset': 18, 'wordtag_label': '作品类_概念', 'length': 2, 'termid': '小说_cb_小说'}, {'item': ',', 'offset': 20, 'wordtag_label': 'w', 'length': 1}, {'item': '作者', 'offset': 21, 'wordtag_label': '人物类_概念', 'length': 2, 'termid': '人物_cb_作者'}, {'item': '是', 'offset': 23, 'wordtag_label': '肯定词', 'length': 1, 'termid': '肯定否定词_cb_是'}, {'item': '余兼羽', 'offset': 24, 'wordtag_label': '人物类_实体', 'length': 3}]}]
          '''

          wordtag= Taskflow("knowledge_mining", batch_size=2)
          wordtag(["热梅茶是一道以梅子为主要原料制作的茶饮",
                   "《孤女》是2010年九州出版社出版的小说,作者是余兼羽"])
          '''
          [{'text': '热梅茶是一道以梅子为主要原料制作的茶饮', 'items': [{'item': '热梅茶', 'offset': 0, 'wordtag_label': '饮食类_饮品', 'length': 3}, {'item': '是', 'offset': 3, 'wordtag_label': '肯定词', 'length': 1, 'termid': '肯定否定词_cb_是'}, {'item': '一道', 'offset': 4, 'wordtag_label': '数量词', 'length': 2}, {'item': '以', 'offset': 6, 'wordtag_label': '介词', 'length': 1, 'termid': '介词_cb_以'}, {'item': '梅子', 'offset': 7, 'wordtag_label': '饮食类', 'length': 2, 'termid': '饮食_cb_梅'}, {'item': '为', 'offset': 9, 'wordtag_label': '肯定词', 'length': 1, 'termid': '肯定否定词_cb_为'}, {'item': '主要原料', 'offset': 10, 'wordtag_label': '物体类', 'length': 4, 'termid': '物品_cb_主要原料'}, {'item': '制作', 'offset': 14, 'wordtag_label': '场景事件', 'length': 2, 'termid': '场景事件_cb_制作'}, {'item': '的', 'offset': 16, 'wordtag_label': '助词', 'length': 1, 'termid': '助词_cb_的'}, {'item': '茶饮', 'offset': 17, 'wordtag_label': '饮食类_饮品', 'length': 2, 'termid': '饮品_cb_茶饮'}]}, {'text': '《孤女》是2010年九州出版社出版的小说,作者是余兼羽', 'items': [{'item': '《', 'offset': 0, 'wordtag_label': 'w', 'length': 1}, {'item': '孤女', 'offset': 1, 'wordtag_label': '作品类_实体', 'length': 2}, {'item': '》', 'offset': 3, 'wordtag_label': 'w', 'length': 1}, {'item': '是', 'offset': 4, 'wordtag_label': '肯定词', 'length': 1, 'termid': '肯定否定词_cb_是'}, {'item': '2010年', 'offset': 5, 'wordtag_label': '时间类', 'length': 5, 'termid': '时间阶段_cb_2010年'}, {'item': '九州出版社', 'offset': 10, 'wordtag_label': '组织机构类', 'length': 5, 'termid': '组织机构_eb_九州出版社'}, {'item': '出版', 'offset': 15, 'wordtag_label': '场景事件', 'length': 2, 'termid': '场景事件_cb_出版'}, {'item': '的', 'offset': 17, 'wordtag_label': '助词', 'length': 1, 'termid': '助词_cb_的'}, {'item': '小说', 'offset': 18, 'wordtag_label': '作品类_概念', 'length': 2, 'termid': '小说_cb_小说'}, {'item': ',', 'offset': 20, 'wordtag_label': 'w', 'length': 1}, {'item': '作者', 'offset': 21, 'wordtag_label': '人物类_概念', 'length': 2, 'termid': '人物_cb_作者'}, {'item': '是', 'offset': 23, 'wordtag_label': '肯定词', 'length': 1, 'termid': '肯定否定词_cb_是'}, {'item': '余兼羽', 'offset': 24, 'wordtag_label': '人物类_实体', 'length': 3}]}]
          '''

          # 使用WordTag-IE进行信息抽取
          wordtag = Taskflow("knowledge_mining", model="wordtag", with_ie=True)
          '''
          [[{'text': '《忘了所有》是一首由王杰作词、作曲并演唱的歌曲,收录在专辑同名《忘了所有》中,由波丽佳音唱片于1996年08月31日发行。', 'items': [{'item': '《', 'offset': 0, 'wordtag_label': 'w', 'length': 1}, {'item': '忘了所有', 'offset': 1, 'wordtag_label': '作品类_实体', 'length': 4}, {'item': '》', 'offset': 5, 'wordtag_label': 'w', 'length': 1}, {'item': '是', 'offset': 6, 'wordtag_label': '肯定词', 'length': 1}, {'item': '一首', 'offset': 7, 'wordtag_label': '数量词_单位数量词', 'length': 2}, {'item': '由', 'offset': 9, 'wordtag_label': '介词', 'length': 1}, {'item': '王杰', 'offset': 10, 'wordtag_label': '人物类_实体', 'length': 2}, {'item': '作词', 'offset': 12, 'wordtag_label': '场景事件', 'length': 2}, {'item': '、', 'offset': 14, 'wordtag_label': 'w', 'length': 1}, {'item': '作曲', 'offset': 15, 'wordtag_label': '场景事件', 'length': 2}, {'item': '并', 'offset': 17, 'wordtag_label': '连词', 'length': 1}, {'item': '演唱', 'offset': 18, 'wordtag_label': '场景事件', 'length': 2}, {'item': '的', 'offset': 20, 'wordtag_label': '助词', 'length': 1}, {'item': '歌曲', 'offset': 21, 'wordtag_label': '作品类_概念', 'length': 2}, {'item': ',', 'offset': 23, 'wordtag_label': 'w', 'length': 1}, {'item': '收录', 'offset': 24, 'wordtag_label': '场景事件', 'length': 2}, {'item': '在', 'offset': 26, 'wordtag_label': '介词', 'length': 1}, {'item': '专辑', 'offset': 27, 'wordtag_label': '作品类_概念', 'length': 2}, {'item': '同名', 'offset': 29, 'wordtag_label': '场景事件', 'length': 2}, {'item': '《', 'offset': 31, 'wordtag_label': 'w', 'length': 1}, {'item': '忘了所有', 'offset': 32, 'wordtag_label': '作品类_实体', 'length': 4}, {'item': '》', 'offset': 36, 'wordtag_label': 'w', 'length': 1}, {'item': '中', 'offset': 37, 'wordtag_label': '词汇用语', 'length': 1}, {'item': ',', 'offset': 38, 'wordtag_label': 'w', 'length': 1}, {'item': '由', 'offset': 39, 'wordtag_label': '介词', 'length': 1}, {'item': '波丽佳音', 'offset': 40, 'wordtag_label': '人物类_实体', 'length': 4}, {'item': '唱片', 'offset': 44, 'wordtag_label': '作品类_概念', 'length': 2}, {'item': '于', 'offset': 46, 'wordtag_label': '介词', 'length': 1}, {'item': '1996年08月31日', 'offset': 47, 'wordtag_label': '时间类_具体时间', 'length': 11}, {'item': '发行', 'offset': 58, 'wordtag_label': '场景事件', 'length': 2}, {'item': '。', 'offset': 60, 'wordtag_label': 'w', 'length': 1}]}], [[{'HEAD_ROLE': {'item': '王杰', 'offset': 10, 'type': '人物类_实体'}, 'TAIL_ROLE': [{'item': '忘了所有', 'type': '作品类_实体', 'offset': 1}], 'GROUP': '创作', 'TRIG': [{'item': '作词', 'offset': 12}, {'item': '作曲', 'offset': 15}, {'item': '演唱', 'offset': 18}], 'SRC': 'REVERSE'}, {'HEAD_ROLE': {'item': '忘了所有', 'type': '作品类_实体', 'offset': 1}, 'TAIL_ROLE': [{'item': '王杰', 'offset': 10, 'type': '人物类_实体'}], 'GROUP': '创作者', 'SRC': 'HTG', 'TRIG': [{'item': '作词', 'offset': 12}, {'item': '作曲', 'offset': 15}, {'item': '演唱', 'offset': 18}]}, {'HEAD_ROLE': {'item': '忘了所有', 'type': '作品类_实体', 'offset': 1}, 'TAIL_ROLE': [{'item': '歌曲', 'offset': 21, 'type': '作品类_概念'}], 'GROUP': '类型', 'SRC': 'TAIL'}, {'HEAD_ROLE': {'item': '忘了所有', 'offset': 32, 'type': '作品类_实体'}, 'TAIL_ROLE': [{'item': '忘了所有', 'type': '作品类_实体', 'offset': 1}], 'GROUP': '收录', 'TRIG': [{'item': '收录', 'offset': 24}], 'SRC': 'REVERSE'}, {'HEAD_ROLE': {'item': '忘了所有', 'type': '作品类_实体', 'offset': 1}, 'TAIL_ROLE': [{'item': '忘了所有', 'offset': 32, 'type': '作品类_实体'}], 'GROUP': '收录于', 'SRC': 'HGT', 'TRIG': [{'item': '收录', 'offset': 24}]}, {'HEAD_ROLE': {'item': '忘了所有', 'offset': 32, 'type': '作品类_实体'}, 'TAIL_ROLE': [{'item': '王杰', 'type': '人物类_实体', 'offset': 10}], 'GROUP': '创作者', 'TRIG': [{'item': '专辑', 'offset': 27}], 'SRC': 'REVERSE'}, {'HEAD_ROLE': {'item': '王杰', 'type': '人物类_实体', 'offset': 10}, 'TAIL_ROLE': [{'item': '忘了所有', 'offset': 32, 'type': '作品类_实体'}], 'GROUP': '创作', 'SRC': 'HGT', 'TRIG': [{'item': '专辑', 'offset': 27}]}, {'HEAD_ROLE': {'item': '忘了所有', 'type': '作品类_实体', 'offset': 32}, 'TAIL_ROLE': [{'item': '唱片', 'offset': 44, 'type': '作品类_概念'}], 'GROUP': '类型', 'SRC': 'TAIL'}]]]
          '''
          
          # 切换为NPTag名词短语标注工具
          nptag = Taskflow("knowledge_mining", model="nptag")
          nptag("糖醋排骨")
          '''
          [{'text': '糖醋排骨', 'label': '菜品'}]
          '''

          nptag(["糖醋排骨", "红曲霉菌"])
          '''
          [{'text': '糖醋排骨', 'label': '菜品'}, {'text': '红曲霉菌', 'label': '微生物'}]
          '''

          # 输出粗粒度类别标签`category`,即WordTag的词汇标签。
          nptag = Taskflow("knowledge_mining", model="nptag", linking=True)
          nptag(["糖醋排骨", "红曲霉菌"])
          '''
          [{'text': '糖醋排骨', 'label': '菜品', 'category': '饮食类_菜品'}, {'text': '红曲霉菌', 'label': '微生物', 'category': '生物类_微生物'}]
          '''
[2022-12-23 11:53:56,452] [    INFO] - Downloading model_state.pdparams from https://bj.bcebos.com/paddlenlp/taskflow/knowledge_mining/wordtag_v1.3/model_state.pdparams
100%|██████████| 367M/367M [00:06<00:00, 59.1MB/s] 
[2022-12-23 11:54:03,922] [    INFO] - Downloading model_config.json from https://bj.bcebos.com/paddlenlp/taskflow/knowledge_mining/wordtag_v1.1/model_config.json
100%|██████████| 482/482 [00:00<00:00, 334kB/s]
[2022-12-23 11:54:04,019] [    INFO] - Downloading termtree_type.csv from https://bj.bcebos.com/paddlenlp/taskflow/knowledge_mining/wordtag/termtree_type.csv
100%|██████████| 9.03k/9.03k [00:00<00:00, 7.59MB/s]
[2022-12-23 11:54:04,113] [    INFO] - Downloading termtree_data from https://bj.bcebos.com/paddlenlp/taskflow/knowledge_mining/wordtag/termtree_data
100%|██████████| 354M/354M [00:07<00:00, 51.9MB/s] 
[2022-12-23 11:54:12,229] [    INFO] - Downloading tags.txt from https://bj.bcebos.com/paddlenlp/taskflow/knowledge_mining/wordtag_v1.1/tags.txt
100%|██████████| 7.58k/7.58k [00:00<00:00, 5.64MB/s]
[2022-12-23 11:54:12,304] [    INFO] - Downloading spo_config.pkl from https://bj.bcebos.com/paddlenlp/taskflow/knowledge_mining/wordtag_v1.1/spo_config.pkl
100%|██████████| 1.62M/1.62M [00:00<00:00, 5.83MB/s]
[2022-12-23 11:54:54,760] [    INFO] - Downloading https://bj.bcebos.com/paddlenlp/models/transformers/ernie_ctm/vocab.txt and saved to /home/aistudio/.paddlenlp/models/wordtag
[2022-12-23 11:54:54,764] [    INFO] - Downloading vocab.txt from https://bj.bcebos.com/paddlenlp/models/transformers/ernie_ctm/vocab.txt
100%|██████████| 89.6k/89.6k [00:00<00:00, 3.66MB/s]
[2022-12-23 11:54:54,889] [    INFO] - tokenizer config file saved in /home/aistudio/.paddlenlp/models/wordtag/tokenizer_config.json
[2022-12-23 11:54:54,892] [    INFO] - Special tokens file saved in /home/aistudio/.paddlenlp/models/wordtag/special_tokens_map.json
[2022-12-23 11:54:56,057] [    INFO] - Converting to the inference model cost a little time.
[2022-12-23 11:55:06,051] [    INFO] - The inference model save in the path:/home/aistudio/.paddlenlp/taskflow/wordtag/static/inference
[2022-12-23 11:55:53,046] [    INFO] - Already cached /home/aistudio/.paddlenlp/models/wordtag/vocab.txt
[2022-12-23 11:55:53,070] [    INFO] - tokenizer config file saved in /home/aistudio/.paddlenlp/models/wordtag/tokenizer_config.json
[2022-12-23 11:55:53,073] [    INFO] - Special tokens file saved in /home/aistudio/.paddlenlp/models/wordtag/special_tokens_map.json
[2022-12-23 11:56:34,541] [    INFO] - Already cached /home/aistudio/.paddlenlp/models/wordtag/vocab.txt
[2022-12-23 11:56:34,564] [    INFO] - tokenizer config file saved in /home/aistudio/.paddlenlp/models/wordtag/tokenizer_config.json
[2022-12-23 11:56:34,566] [    INFO] - Special tokens file saved in /home/aistudio/.paddlenlp/models/wordtag/special_tokens_map.json
[2022-12-23 11:56:38,055] [    INFO] - Downloading model_state.pdparams from https://bj.bcebos.com/paddlenlp/taskflow/knowledge_mining/nptag/model_state.pdparams
100%|██████████| 377M/377M [00:19<00:00, 19.9MB/s] 
[2022-12-23 11:56:59,048] [    INFO] - Downloading model_config.json from https://bj.bcebos.com/paddlenlp/taskflow/knowledge_mining/nptag/model_config.json
100%|██████████| 463/463 [00:00<00:00, 92.5kB/s]
[2022-12-23 11:56:59,206] [    INFO] - Downloading name_category_map.json from https://bj.bcebos.com/paddlenlp/taskflow/knowledge_mining/nptag/name_category_map.json
100%|██████████| 84.2k/84.2k [00:00<00:00, 1.99MB/s]
[2022-12-23 11:56:59,368] [    INFO] - Downloading https://bj.bcebos.com/paddlenlp/models/transformers/ernie_ctm/vocab.txt and saved to /home/aistudio/.paddlenlp/models/nptag
[2022-12-23 11:56:59,370] [    INFO] - Downloading vocab.txt from https://bj.bcebos.com/paddlenlp/models/transformers/ernie_ctm/vocab.txt
100%|██████████| 89.6k/89.6k [00:00<00:00, 2.60MB/s]
[2022-12-23 11:56:59,497] [    INFO] - tokenizer config file saved in /home/aistudio/.paddlenlp/models/nptag/tokenizer_config.json
[2022-12-23 11:56:59,500] [    INFO] - Special tokens file saved in /home/aistudio/.paddlenlp/models/nptag/special_tokens_map.json
[2022-12-23 11:57:01,623] [    INFO] - Converting to the inference model cost a little time.
[2022-12-23 11:57:17,786] [    INFO] - The inference model save in the path:/home/aistudio/.paddlenlp/taskflow/knowledge_mining/nptag/static/inference
[2022-12-23 11:57:19,289] [    INFO] - Already cached /home/aistudio/.paddlenlp/models/nptag/vocab.txt
[2022-12-23 11:57:19,307] [    INFO] - tokenizer config file saved in /home/aistudio/.paddlenlp/models/nptag/tokenizer_config.json
[2022-12-23 11:57:19,310] [    INFO] - Special tokens file saved in /home/aistudio/.paddlenlp/models/nptag/special_tokens_map.json





"\n[{'text': '糖醋排骨', 'label': '菜品', 'category': '饮食类_菜品'}, {'text': '红曲霉菌', 'label': '微生物', 'category': '生物类_微生物'}]\n"
基于lac的文本分词taskflow
           from paddlenlp import Taskflow 

           lac = Taskflow("lexical_analysis")
           lac("LAC是个优秀的分词工具")
           '''
           [{'text': 'LAC是个优秀的分词工具', 'segs': ['LAC', '是', '个', '优秀', '的', '分词', '工具'], 'tags': ['nz', 'v', 'q', 'a', 'u', 'n', 'n']}]
           '''

           lac(["LAC是个优秀的分词工具", "三亚是一个美丽的城市"])
           '''
           [{'text': 'LAC是个优秀的分词工具', 'segs': ['LAC', '是', '个', '优秀', '的', '分词', '工具'], 'tags': ['nz', 'v', 'q', 'a', 'u', 'n', 'n']}, 
            {'text': '三亚是一个美丽的城市', 'segs': ['三亚', '是', '一个', '美丽', '的', '城市'], 'tags': ['LOC', 'v', 'm', 'a', 'u', 'n']}
           ]
           '''
[2022-12-23 11:57:21,066] [    INFO] - Downloading model_state.pdparams from https://bj.bcebos.com/paddlenlp/taskflow/lexical_analysis/lac/model_state.pdparams
100%|██████████| 32.3M/32.3M [00:01<00:00, 26.6MB/s]
[2022-12-23 11:57:22,566] [    INFO] - Downloading tag.dic from https://bj.bcebos.com/paddlenlp/taskflow/lexical_analysis/lac/tag.dic
100%|██████████| 425/425 [00:00<00:00, 207kB/s]
[2022-12-23 11:57:22,666] [    INFO] - Downloading q2b.dic from https://bj.bcebos.com/paddlenlp/taskflow/lexical_analysis/lac/q2b.dic
100%|██████████| 44.1k/44.1k [00:00<00:00, 2.24MB/s]
[2022-12-23 11:57:22,776] [    INFO] - Downloading word.dic from https://bj.bcebos.com/paddlenlp/taskflow/lexical_analysis/lac/word.dic
100%|██████████| 745k/745k [00:00<00:00, 7.45MB/s]
[2022-12-23 11:57:23,164] [    INFO] - Converting to the inference model cost a little time.
[2022-12-23 11:57:23,580] [    INFO] - The inference model save in the path:/home/aistudio/.paddlenlp/taskflow/lac/static/inference





"\n[{'text': 'LAC是个优秀的分词工具', 'segs': ['LAC', '是', '个', '优秀', '的', '分词', '工具'], 'tags': ['nz', 'v', 'q', 'a', 'u', 'n', 'n']}, \n {'text': '三亚是一个美丽的城市', 'segs': ['三亚', '是', '一个', '美丽', '的', '城市'], 'tags': ['LOC', 'v', 'm', 'a', 'u', 'n']}\n]\n"
命名实体识别
          from paddlenlp import Taskflow 

          # WordTag精确模式
          ner = Taskflow("ner")
          ner("《孤女》是2010年九州出版社出版的小说,作者是余兼羽")
          '''
          [('《', 'w'), ('孤女', '作品类_实体'), ('》', 'w'), ('是', '肯定词'), ('2010年', '时间类'), ('九州出版社', '组织机构类'), ('出版', '场景事件'), ('的', '助词'), ('小说', '作品类_概念'), (',', 'w'), ('作者', '人物类_概念'), ('是', '肯定词'), ('余兼羽', '人物类_实体')]
          '''

          ner(["热梅茶是一道以梅子为主要原料制作的茶饮", "《孤女》是2010年九州出版社出版的小说,作者是余兼羽"])
          '''
          [[('热梅茶', '饮食类_饮品'), ('是', '肯定词'), ('一道', '数量词'), ('以', '介词'), ('梅子', '饮食类'), ('为', '肯定词'), ('主要原料', '物体类'), ('制作', '场景事件'), ('的', '助词'), ('茶饮', '饮食类_饮品')], [('《', 'w'), ('孤女', '作品类_实体'), ('》', 'w'), ('是', '肯定词'), ('2010年', '时间类'), ('九州出版社', '组织机构类'), ('出版', '场景事件'), ('的', '助词'), ('小说', '作品类_概念'), (',', 'w'), ('作者', '人物类_概念'), ('是', '肯定词'), ('余兼羽', '人物类_实体')]]
          '''

          # 只返回实体/概念词
          ner = Taskflow("ner", entity_only=True)
          ner("《孤女》是2010年九州出版社出版的小说,作者是余兼羽")
          '''
          [('孤女', '作品类_实体'), ('2010年', '时间类'), ('九州出版社', '组织机构类'), ('出版', '场景事件'), ('小说', '作品类_概念'), ('作者', '人物类_概念'), ('余兼羽', '人物类_实体')]
          '''

          # 使用快速模式,只返回实体词
          ner = Taskflow("ner", mode="fast", entity_only=True)
          ner("三亚是一个美丽的城市")
          '''
          [('三亚', 'LOC')]
          '''
[2022-12-23 13:35:58,923] [    INFO] - Already cached /home/aistudio/.paddlenlp/models/wordtag/vocab.txt
[2022-12-23 13:35:58,941] [    INFO] - tokenizer config file saved in /home/aistudio/.paddlenlp/models/wordtag/tokenizer_config.json
[2022-12-23 13:35:58,943] [    INFO] - Special tokens file saved in /home/aistudio/.paddlenlp/models/wordtag/special_tokens_map.json
[2022-12-23 13:36:02,014] [    INFO] - Already cached /home/aistudio/.paddlenlp/models/wordtag/vocab.txt
[2022-12-23 13:36:02,032] [    INFO] - tokenizer config file saved in /home/aistudio/.paddlenlp/models/wordtag/tokenizer_config.json
[2022-12-23 13:36:02,034] [    INFO] - Special tokens file saved in /home/aistudio/.paddlenlp/models/wordtag/special_tokens_map.json





"\n[('三亚', 'LOC')]\n"

paddle nlp taskflow 生成古诗
采用gpt-cpm-large-cn模型进行古诗生成
文本生成比较占用显存。在推理过程中占用了11.9Gb的显存。这个使用的时候一定要注意。

           from paddlenlp import Taskflow 

           poetry = Taskflow("poetry_generation")
           poetry("林密不见人")
           '''
           [{'text': '林密不见人', 'answer': ',但闻人语响。'}]
           '''

           poetry(["林密不见人", "举头邀明月"])
           '''
           [{'text': '林密不见人', 'answer': ',但闻人语响。'}, {'text': '举头邀明月', 'answer': ',低头思故乡。'}]
           '''
           poetry(["夜岚风不止", "岂止万千关"])

[2022-12-23 14:09:18,841] [    INFO] - Already cached /home/aistudio/.paddlenlp/models/gpt-cpm-large-cn/gpt-cpm-cn-sentencepiece.model
[2022-12-23 14:09:18,884] [    INFO] - Adding <bod> to the vocabulary
[2022-12-23 14:09:18,886] [    INFO] - Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
[2022-12-23 14:09:18,890] [    INFO] - tokenizer config file saved in /home/aistudio/.paddlenlp/models/gpt-cpm-large-cn/tokenizer_config.json
[2022-12-23 14:09:18,892] [    INFO] - Special tokens file saved in /home/aistudio/.paddlenlp/models/gpt-cpm-large-cn/special_tokens_map.json
[2022-12-23 14:09:18,894] [    INFO] - added tokens file saved in /home/aistudio/.paddlenlp/models/gpt-cpm-large-cn/added_tokens.json
Building prefix dict from the default dictionary ...
Loading model from cache /tmp/jieba.cache
Loading model cost 0.755 seconds.
Prefix dict has been built successfully.





[{'text': '夜岚风不止', 'answer': ', '}, {'text': '岂止万千关', 'answer': '山,'}]
           poetry(["刘郎他乡遇故知", "山外青山楼外楼"])

[{'text': '刘郎他乡遇故知', 'answer': ', '},
 {'text': '山外青山楼外楼', 'answer': ',西湖歌舞几时休? '}]

taskflow 问答生成模型
这个问答实现的模型也是和诗歌基于同样的模型 gpt cpm large cn模型

           from paddlenlp import Taskflow 

           qa = Taskflow("question_answering")
           qa("中国的国土面积有多大?")
[2022-12-23 17:40:19,423] [    INFO] - Already cached /home/aistudio/.paddlenlp/models/gpt-cpm-large-cn/gpt-cpm-cn-sentencepiece.model
[2022-12-23 17:40:19,470] [    INFO] - Adding <bod> to the vocabulary
[2022-12-23 17:40:19,475] [    INFO] - Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
[2022-12-23 17:40:19,478] [    INFO] - tokenizer config file saved in /home/aistudio/.paddlenlp/models/gpt-cpm-large-cn/tokenizer_config.json
[2022-12-23 17:40:19,480] [    INFO] - Special tokens file saved in /home/aistudio/.paddlenlp/models/gpt-cpm-large-cn/special_tokens_map.json
[2022-12-23 17:40:19,482] [    INFO] - added tokens file saved in /home/aistudio/.paddlenlp/models/gpt-cpm-large-cn/added_tokens.json
Building prefix dict from the default dictionary ...
Loading model from cache /tmp/jieba.cache
Loading model cost 0.769 seconds.
Prefix dict has been built successfully.





[{'text': '中国的国土面积有多大?', 'answer': '960万平方公里。'}]
           qa(["中国国土面积有多大?", "中国的首都在哪里?"])

[{'text': '中国国土面积有多大?', 'answer': '960万平方公里。'},
 {'text': '中国的首都在哪里?', 'answer': '北京。'}]

文本纠错模型 基于 ernie,这点让我很惊讶,ernie 1.0版本竟然可以自带文本纠错能力。 需要安装pypinyin

pip install pypinyin
Looking in indexes: https://pypi.tuna.tsinghua.edu.cn/simple
Collecting pypinyin
  Downloading https://pypi.tuna.tsinghua.edu.cn/packages/6f/b4/44e3270db224eede315b7b1edea982d4d90fbabf18c49a843f3f20d8c730/pypinyin-0.47.1-py2.py3-none-any.whl (1.4 MB)
     l     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 0.0/1.4 MB ? eta -:--:--━━━━━━━━━?━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 0.3/1.4 MB 10.3 MB/s eta 0:00:01━━━━━━━━━━━━━━━━━━━━━━━━━━━━━?━━━━━━━━━━ 1.0/1.4 MB 15.0 MB/s eta 0:00:01━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━? 1.4/1.4 MB 15.1 MB/s eta 0:00:01━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.4/1.4 MB 12.7 MB/s eta 0:00:00
[?25hInstalling collected packages: pypinyin
Successfully installed pypinyin-0.47.1

[notice] A new release of pip available: 22.1.2 -> 22.3.1
[notice] To update, run: pip install --upgrade pip
Note: you may need to restart the kernel to use updated packages.
           from paddlenlp import Taskflow

           text_correction = Taskflow("text_correction")

[2022-12-23 17:50:34,183] [    INFO] - Already cached /home/aistudio/.paddlenlp/models/ernie-1.0/vocab.txt
[2022-12-23 17:50:34,198] [    INFO] - tokenizer config file saved in /home/aistudio/.paddlenlp/models/ernie-1.0/tokenizer_config.json
[2022-12-23 17:50:34,201] [    INFO] - Special tokens file saved in /home/aistudio/.paddlenlp/models/ernie-1.0/special_tokens_map.json





[{'source': '遇到逆竟时,我们必须勇于面对,而且要愈挫愈勇,这样我们才能朝著成功之路前进。',
  'target': '遇到逆境时,我们必须勇于面对,而且要愈挫愈勇,这样我们才能朝著成功之路前进。',
  'errors': [{'position': 3, 'correction': {'竟': '境'}}]}]
#    推理一条
text_correction('遇到逆竟时,我们必须勇于面对,而且要愈挫愈勇,这样我们才能朝著成功之路前进。')
[{'source': '遇到逆竟时,我们必须勇于面对,而且要愈挫愈勇,这样我们才能朝著成功之路前进。',
  'target': '遇到逆境时,我们必须勇于面对,而且要愈挫愈勇,这样我们才能朝著成功之路前进。',
  'errors': [{'position': 3, 'correction': {'竟': '境'}}]}]
#    批量推理多条数据
text_correction(['遇到逆竟时,我们必须勇于面对,而且要愈挫愈勇,这样我们才能朝著成功之路前进。',
                            '人生就是如此,经过磨练才能让自己更加拙壮,才能使自己更加乐观。'])
[{'source': '遇到逆竟时,我们必须勇于面对,而且要愈挫愈勇,这样我们才能朝著成功之路前进。',
  'target': '遇到逆境时,我们必须勇于面对,而且要愈挫愈勇,这样我们才能朝著成功之路前进。',
  'errors': [{'position': 3, 'correction': {'竟': '境'}}]},
 {'source': '人生就是如此,经过磨练才能让自己更加拙壮,才能使自己更加乐观。',
  'target': '人生就是如此,经过磨练才能让自己更加茁壮,才能使自己更加乐观。',
  'errors': [{'position': 18, 'correction': {'拙': '茁'}}]}]

paddle nlp taskflow 文本相似度模型 基于simbert-base-chinese模型进行

         from paddlenlp import Taskflow

         similarity = Taskflow("text_similarity", model='simbert-base-chinese')
[2022-12-23 18:46:22,446] [    INFO] - loading configuration file /home/aistudio/.paddlenlp/taskflow/text_similarity/simbert-base-chinese/model_config.json
[2022-12-23 18:46:22,452] [    INFO] - Model config BertConfig {
  "architectures": [
    "BertModel"
  ],
  "attention_probs_dropout_prob": 0.1,
  "fuse": false,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 768,
  "initializer_range": 0.02,
  "intermediate_size": 3072,
  "layer_norm_eps": 1e-12,
  "max_position_embeddings": 512,
  "model_type": "bert",
  "num_attention_heads": 12,
  "num_hidden_layers": 12,
  "pad_token_id": 0,
  "paddlenlp_version": null,
  "pool_act": "linear",
  "type_vocab_size": 2,
  "vocab_size": 13685
}

[2022-12-23 18:46:22,455] [    INFO] - Configuration saved in /home/aistudio/.paddlenlp/taskflow/text_similarity/simbert-base-chinese/config.json
W1223 18:46:22.461328  3190 gpu_resources.cc:61] Please NOTE: device: 0, GPU Compute Capability: 7.0, Driver API Version: 11.2, Runtime API Version: 11.2
W1223 18:46:22.466092  3190 gpu_resources.cc:91] device: 0, cuDNN Version: 8.2.
[2022-12-23 18:46:28,173] [ WARNING] - Some weights of the model checkpoint at /home/aistudio/.paddlenlp/taskflow/text_similarity/simbert-base-chinese were not used when initializing BertModel: ['cls.predictions.transform.weight', 'cls.predictions.transform.LayerNorm.bias', 'bert.embeddings.position_ids', 'cls.seq_relationship.bias', 'cls.predictions.transform.LayerNorm.weight', 'cls.predictions.decoder_bias', 'cls.predictions.decoder_weight', 'cls.predictions.decoder.bias', 'cls.seq_relationship.weight', 'cls.predictions.transform.bias']
- This IS expected if you are initializing BertModel from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertModel from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
[2022-12-23 18:46:28,176] [    INFO] - All the weights of BertModel were initialized from the model checkpoint at /home/aistudio/.paddlenlp/taskflow/text_similarity/simbert-base-chinese.
If your task is similar to the task the model of the checkpoint was trained on, you can already use BertModel for predictions without further training.
[2022-12-23 18:46:28,180] [    INFO] - Converting to the inference model cost a little time.
[2022-12-23 18:46:38,656] [    INFO] - The inference model save in the path:/home/aistudio/.paddlenlp/taskflow/text_similarity/simbert-base-chinese/static/inference
[2022-12-23 18:46:42,772] [    INFO] - Downloading https://bj.bcebos.com/paddlenlp/models/transformers/simbert/vocab.txt and saved to /home/aistudio/.paddlenlp/models/simbert-base-chinese
[2022-12-23 18:46:42,846] [    INFO] - Downloading vocab.txt from https://bj.bcebos.com/paddlenlp/models/transformers/simbert/vocab.txt
100%|██████████| 63.4k/63.4k [00:00<00:00, 5.19MB/s]
[2022-12-23 18:46:42,968] [    INFO] - tokenizer config file saved in /home/aistudio/.paddlenlp/models/simbert-base-chinese/tokenizer_config.json
[2022-12-23 18:46:42,970] [    INFO] - Special tokens file saved in /home/aistudio/.paddlenlp/models/simbert-base-chinese/special_tokens_map.json
# 推理一条数据
similarity([["世界上什么东西最小", "世界上什么东西最小?"]])

[{'text1': '世界上什么东西最小', 'text2': '世界上什么东西最小?', 'similarity': 0.992725}]
# 推理多条数据
similarity([["光眼睛大就好看吗", "眼睛好看吗?"], ["小蝌蚪找妈妈怎么样", "小蝌蚪找妈妈是谁画的"]])

[{'text1': '光眼睛大就好看吗', 'text2': '眼睛好看吗?', 'similarity': 0.7450272},
 {'text1': '小蝌蚪找妈妈怎么样', 'text2': '小蝌蚪找妈妈是谁画的', 'similarity': 0.8192149}]

task flow 中最重磅的应用 AIGC 基于pai-painter-painting-base-zh进行文本生成图片

           from paddlenlp import Taskflow 

           text_to_image = Taskflow("text_to_image")
           images = text_to_image("风阁水帘今在眼,且来先看早梅红")

[2022-12-23 18:49:27,321] [    INFO] - We are using <class 'paddlenlp.transformers.artist.tokenizer.ArtistTokenizer'> to load 'pai-painter-painting-base-zh'.
[2022-12-23 18:49:27,324] [    INFO] - Already cached /home/aistudio/.paddlenlp/models/pai-painter-painting-base-zh/vocab.txt
[2022-12-23 18:49:27,340] [    INFO] - tokenizer config file saved in /home/aistudio/.paddlenlp/models/pai-painter-painting-base-zh/tokenizer_config.json
[2022-12-23 18:49:27,343] [    INFO] - Special tokens file saved in /home/aistudio/.paddlenlp/models/pai-painter-painting-base-zh/special_tokens_map.json
[2022-12-23 18:49:27,346] [    INFO] - We are using <class 'paddlenlp.transformers.artist.modeling.ArtistForImageGeneration'> to load 'pai-painter-painting-base-zh'.
[2022-12-23 18:49:27,348] [    INFO] - Already cached /home/aistudio/.paddlenlp/models/pai-painter-painting-base-zh/model_state.pdparams
images = text_to_image("刘孟水帘今在眼,且来先看早梅红")
# images[0].save("figure.png")
for image in images:
    for image_one in image:
        image_one.show()
![根据文本生成的图片](https://img-blog.csdnimg.cn/c2a36d0eb98845358c1a94cd3bffcd41.png#pic_center)

在这里插入图片描述

不知道是不是我的版本问题,今天好几个应用的结果看起来并不是很对。后来在paddlenlp的工作人员的帮助下解决了很多的问题。在此感谢paddlenlp的工作人员对于问题的解答。

  人工智能 最新文章
2022吴恩达机器学习课程——第二课(神经网
第十五章 规则学习
FixMatch: Simplifying Semi-Supervised Le
数据挖掘Java——Kmeans算法的实现
大脑皮层的分割方法
【翻译】GPT-3是如何工作的
论文笔记:TEACHTEXT: CrossModal Generaliz
python从零学(六)
详解Python 3.x 导入(import)
【答读者问27】backtrader不支持最新版本的
上一篇文章      下一篇文章      查看所有文章
加:2022-12-25 11:10:19  更:2022-12-25 11:12:19 
 
开发: C++知识库 Java知识库 JavaScript Python PHP知识库 人工智能 区块链 大数据 移动开发 嵌入式 开发工具 数据结构与算法 开发测试 游戏开发 网络协议 系统运维
教程: HTML教程 CSS教程 JavaScript教程 Go语言教程 JQuery教程 VUE教程 VUE3教程 Bootstrap教程 SQL数据库教程 C语言教程 C++教程 Java教程 Python教程 Python3教程 C#教程
数码: 电脑 笔记本 显卡 显示器 固态硬盘 硬盘 耳机 手机 iphone vivo oppo 小米 华为 单反 装机 图拉丁

360图书馆 购物 三丰科技 阅读网 日历 万年历 2024年11日历 -2024/11/25 20:44:17-

图片自动播放器
↓图片自动播放器↓
TxT小说阅读器
↓语音阅读,小说下载,古典文学↓
一键清除垃圾
↓轻轻一点,清除系统垃圾↓
图片批量下载器
↓批量下载图片,美女图库↓
  网站联系: qq:121756557 email:121756557@qq.com  IT数码