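Text Summarization (News Headline Generation) with EasyNLP
Summary: This article gives a technical walkthrough of PEGASUS and shows how to use PEGASUS-related text summarization (news headline generation) models in the EasyNLP framework.
Authors: Ming Wang (王明), Jun Huang (黄俊)
Overview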
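Text generation is an important research direction in natural language processing, with many practical applications and considerable research value. Abstractive text summarization is one of its key subtasks; in practice it takes forms such as news headline generation, summary generation, and keyword generation. Pre-trained language models such as BERT, MASS, and UniLM have achieved impressive results in NLU scenarios, but the word-level and subword-level masked language modeling objectives they use are poorly suited to text generation, and to abstractive summarization in particular. The reason is that abstractive summarization usually requires coarser-grained semantic understanding, at the sentence or paragraph level, in order to produce a summary. To address this, PEGASUS (PEGASUS: Pre-training with Extracted Gap-sentences for Abstractive Summarization) introduces an unsupervised pre-training task designed for summarization, Gap Sentence Generation (GSG): several complete sentences in a document are masked, and the model is trained to generate them. This pre-training task closely matches the actual summarization task, so the pre-trained model reaches good summarization quality after only light fine-tuning. We have therefore integrated the PEGASUS algorithm and models into the EasyNLP framework, so that users can easily train and run prediction for summarization-related tasks.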
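EasyNLP (https://github.com/alibaba/EasyNLP) is an easy-to-use, feature-rich Chinese NLP algorithm framework developed on PyTorch by the Alibaba Cloud Machine Learning PAI team. It supports commonly used Chinese pre-trained models and techniques for putting large models into production, and offers a one-stop NLP development experience from training to deployment. EasyNLP provides concise interfaces for developing NLP models, including the NLP application AppZoo and the pre-trained ModelZoo, together with technology that helps users bring very large pre-trained models into business use efficiently. Text generation is a major subtask of NLP with many practical applications, including headline generation, text summarization, machine translation, question answering, and dialogue systems. EasyNLP is therefore gradually expanding its support for text generation subtasks, in the hope of serving more NLP and NLG algorithm developers and researchers and of working with the community to advance NLG technology and its adoption.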
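PEGASUS Model in Detail
Before PEGASUS, pre-trained text generation models such as T5 and BART delivered clear gains on many generation tasks, but their pre-training objectives still differ substantially from the objective of text summarization. As a result, when these models are transferred to summarization tasks in different domains, they still need a fairly large amount of training data for fine-tuning to perform well. To mitigate this, PEGASUS adds a whole-sentence masking loss on top of the original subword masking loss: several complete sentences in the input document are masked, and the model is asked to restore them.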
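Concretely, as shown in the figure above, PEGASUS uses an encoder-decoder architecture (a standard Transformer). The model applies two kinds of masking to the input. The first is the subword masking used by BERT, denoted [MASK2], where the encoder recovers the masked subwords. (Ablation experiments showed that this loss brings no gain on downstream tasks, so it is not used in the final PEGASUS model.) The second is GSG, denoted [MASK1], where the decoder generates the complete sentences that were masked in the input. For this loss the authors propose three sentence selection options: Random (select m sentences at random), Lead (select the first m sentences), and Ind-Orig (select m sentences by importance score). The importance score of a sentence is its ROUGE score against the set of all other sentences in the document, so this strategy can be seen as masking the sentences that best represent the rest of the document. The figure below shows an example of the three selection schemes, with the selected sentences marked in green, reddish brown, and blue respectively. Experiments show that the model trained with the third strategy performs best.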
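The importance-based selection described above can be sketched in a few lines of Python. This is a minimal illustration and not the PEGASUS or EasyNLP implementation: the function names are hypothetical, the score is a hand-written ROUGE-1 F1, and tokens are obtained by whitespace splitting (Chinese text would need a different tokenizer, for example character-level).

```python
from collections import Counter


def rouge1_f1(candidate, reference):
    """ROUGE-1 F1 between two token lists, using clipped unigram overlap."""
    if not candidate or not reference:
        return 0.0
    overlap = sum((Counter(candidate) & Counter(reference)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(candidate)
    recall = overlap / len(reference)
    return 2 * precision * recall / (precision + recall)


def select_gap_sentences(sentences, m):
    """Return sorted indices of the m sentences that best match the rest of the document.

    Each sentence is scored independently against all other sentences.
    Ties are broken in favor of the earlier sentence.
    """
    scores = []
    for i, sent in enumerate(sentences):
        # Assumption: whitespace tokenization is good enough for this sketch.
        rest = [tok for j, other in enumerate(sentences) if j != i for tok in other.split()]
        scores.append((rouge1_f1(sent.split(), rest), i))
    top = sorted(scores, key=lambda pair: (-pair[0], pair[1]))[:m]
    return sorted(i for _, i in top)


def build_gsg_example(sentences, m, mask_token="[MASK1]"):
    """Mask the selected sentences in the source; the target is the masked sentences."""
    chosen = set(select_gap_sentences(sentences, m))
    source = " ".join(mask_token if i in chosen else s for i, s in enumerate(sentences))
    target = " ".join(sentences[i] for i in sorted(chosen))
    return source, target
```

Calling `build_gsg_example(sentences, m)` on a list of sentences returns the masked document for the encoder and the concatenated masked sentences as the decoder target. Swapping `select_gap_sentences` for a random or first-m selection gives the Random and Lead variants.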
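Text Summarization Model Tutorial
Below is a brief introduction to using PEGASUS and other text summarization models in the EasyNLP framework.
Installing EasyNLP
Users can install the EasyNLP framework by following the instructions on GitHub (https://github.com/alibaba/EasyNLP).
Data Preparation
In a concrete text summarization scenario, users need to provide training and validation data for the downstream task as tsv files. For text summarization, each file contains two columns separated by a tab (\t): the first column is the summary and the second is the source text. A sample row: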
湖北:“四上企业”复工率已达93.8% 央视网消息:4月1日,记者从湖北省新冠肺炎疫情防控工作新闻发布会上获悉,在各方面共同努力下,湖北省复工复产工作取得了阶段性成效。截至3月31日,湖北省“四上企业”包括规模以上工业、规模以上服务业法人单位等的复工率已达93.8%,复岗率69.3%。武汉市的复工率、复岗率也分别达到了85.4%、40.4%。责任编辑:王诗尧
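The following archive contains preprocessed training and validation data for news headline generation, which can be used for testing: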
https://atp-modelzoo-sh.oss-cn-shanghai.aliyuncs.com/release/tutorials/generation/title_gen.zip
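Before training on your own data, it can help to confirm that every row really has the expected number of tab-separated, non-empty columns. The check below is a small sketch; `check_summary_tsv` is a hypothetical helper, not part of EasyNLP, and the default of two columns follows the summary/source layout described above (pass a different count for files with more columns).

```python
def check_summary_tsv(path, expected_columns=2):
    """Count well-formed rows and collect 1-based line numbers of malformed rows.

    A row is well-formed when it has exactly `expected_columns` tab-separated
    cells and none of them is empty or whitespace-only.
    """
    good, bad_lines = 0, []
    with open(path, encoding="utf-8") as f:
        for line_no, line in enumerate(f, start=1):
            cells = line.rstrip("\r\n").split("\t")
            if len(cells) == expected_columns and all(cell.strip() for cell in cells):
                good += 1
            else:
                bad_lines.append(line_no)
    return good, bad_lines
```

For example, `check_summary_tsv("./cn_train.tsv")` returns the number of usable rows together with the line numbers that need fixing.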
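Chinese News Headline Generation
Because the models released with the original PEGASUS paper support only English, we pre-trained an mT5 model for Chinese news headline summarization, based on the mT5 architecture, for the convenience of Chinese community users, and integrated it into EasyNLP's model zoo. We also integrated Randeng, a Chinese text summarization model pre-trained by IDEA (it can be regarded as a Chinese version of PEGASUS), so that users can explore how different models perform. The table below lists the models available in EasyNLP and compares their performance on the dataset above. We recommend the first two models for text summarization and the last three for news headline generation.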
| Chinese | News headline (Rouge1/2/L) | Paper title summarization (Rouge1/2/L) |
| --- | --- | --- |
| hfl/randeng-238M-Summary-Chinese | 59.66/46.26/55.95 | 54.55/39.37/50.69 |
| hfl/randeng-523M-Summary-Chinese | 62.86/49.67/58.89 | 53.83/39.17/49.92 |
| alibaba-pai/mt5-title-generation-zh-275m | 62.35/48.63/58.96 | 54.28/40.26/50.55 |
| alibaba-pai/randeng-238M-Summary-Chinese-tuned | 64.31/51.80/60.97 | 58.83/45.28/55.72 |
| alibaba-pai/randeng-523M-Summary-Chinese-tuned | 64.76/51.65/61.06 | 59.27/45.58/55.92 |
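For the news headline generation task, we train the model with the following command. The hyperparameter 'save_checkpoint_steps' sets how many steps elapse between checkpoints; at each of these points the framework evaluates the model being trained and decides, based on its performance on the validation set, whether to update the saved model parameters. The main.py file being run is in the EasyNLP/examples/appzoo_tutorials/sequence_generation directory, and the training and validation data must be placed in that directory as well. Any model from the table above can be specified through 'pretrain_model_name_or_path' under the 'user_defined_parameters' hyperparameter.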
python main.py \
--mode train \
--app_name=sequence_generation \
--worker_gpu=1 \
--tables=./cn_train.tsv,./cn_dev.tsv \
--input_schema=title_tokens:str:1,content_tokens:str:1 \
--first_sequence=content_tokens \
--second_sequence=title_tokens \
--label_name=title_tokens \
--checkpoint_dir=./finetuned_zh_model \
--micro_batch_size=8 \
--sequence_length=512 \
--epoch_num=1 \
--save_checkpoint_steps=150 \
--export_tf_checkpoint_type none \
--user_defined_parameters 'pretrain_model_name_or_path=alibaba-pai/mt5-title-generation-zh language=zh copy=false max_encoder_length=512 min_decoder_length=12 max_decoder_length=32 no_repeat_ngram_size=2 num_beams=5 num_return_sequences=5'
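The value passed to 'user_defined_parameters' in the command above is a single string of space-separated key=value pairs. When launching runs from a script, it may be convenient to build that string from a dictionary. The helpers below are a hypothetical sketch with made-up names; they only mirror the string layout visible in the command and make no claim about how EasyNLP parses it internally.

```python
def format_kv_string(params):
    """Join a dict into a space-separated 'key=value' string.

    Booleans are written in lowercase (false/true), matching 'copy=false' above.
    """
    parts = []
    for key, value in params.items():
        text = str(value).lower() if isinstance(value, bool) else str(value)
        if "=" in key or any(ch.isspace() for ch in key + text):
            raise ValueError(f"cannot encode {key!r}={value!r} without quoting")
        parts.append(f"{key}={text}")
    return " ".join(parts)


def parse_kv_string(text):
    """Split a 'key=value key=value' string back into a dict of strings."""
    return dict(item.split("=", 1) for item in text.split())
```

For instance, `format_kv_string({"language": "zh", "copy": False, "num_beams": 5})` yields `language=zh copy=false num_beams=5`, which can be placed inside the quoted argument.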
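Users can also run the following command to generate summaries with the model; the model path is given by 'checkpoint_dir'. 'append_cols' specifies which input columns to add to the output file; set it to none if no columns should be added.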
python main.py \
--mode=predict \
--app_name=sequence_generation \
--worker_gpu=1 \
--tables=./cn_dev.tsv \
--outputs=./cn.preds.txt \
--input_schema=title:str:1,content:str:1,title_tokens:str:1,content_tokens:str:1,tag:str:1 \
--output_schema=predictions,beams \
--append_cols=content,title,tag \
--first_sequence=content_tokens \
--checkpoint_dir=./finetuned_zh_model \
--micro_batch_size=32 \
--sequence_length=512 \
--user_defined_parameters 'language=zh copy=false max_encoder_length=512 min_decoder_length=12 max_decoder_length=32 no_repeat_ngram_size=2 num_beams=5 num_return_sequences=5'
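The decoding parameters in the command include 'no_repeat_ngram_size=2'. Assuming it carries the meaning it has in common beam-search implementations (no n-gram of that size may appear twice in one generated sequence), the rule can be sketched as follows. `banned_next_tokens` is a hypothetical helper written for illustration and is not taken from EasyNLP.

```python
def banned_next_tokens(generated, ngram_size):
    """Return the tokens that would repeat an n-gram already present in `generated`.

    With ngram_size=2, a token is banned if appending it would recreate a
    bigram that has already been produced.
    """
    if ngram_size <= 0 or len(generated) < ngram_size:
        return set()
    # The last (ngram_size - 1) tokens form the prefix of the next n-gram.
    prefix = tuple(generated[len(generated) - (ngram_size - 1):])
    banned = set()
    for i in range(len(generated) - ngram_size + 1):
        ngram = tuple(generated[i:i + ngram_size])
        if ngram[:-1] == prefix:
            banned.add(ngram[-1])
    return banned
```

During decoding, the scores of the banned tokens would be set to negative infinity before the next token is chosen, which is why a larger value permits more repetition and a smaller one forbids even short repeated phrases.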
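Below are a few sample predictions on recent trending events. Each sample has 5 tab-separated (\t) columns: the predicted summary (news headline), the 5 beam search candidates (separated by ||), and three columns copied directly from the corresponding input data (the original headline, the source text, and the news tag). Because the news text is long, only the first four columns of each sample are shown.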
**费德勒告别信:未来我还会打更多的网球** 费德勒告别信:未来我还会打更多的网球||费德勒告别信:未来我还会打更多网球||费德勒告别信:未来我还会打更多网球但不是在大满贯或巡回赛||费德勒告别信:未来我还会打更多的网球||详讯:费德勒宣布退役,并告别信 **一代传奇落幕!网球天王费德勒宣布退役** 央视网消息:北京时间9月15日晚,网球天王罗杰-费德勒在个人社媒上宣布退役。41岁的费德勒是男子网坛历史最伟大球员之一,曾103次斩获单打冠军,大满贯单打夺冠20次(澳网6冠、法网1冠、温网8冠、美网5冠),共计310周位于男单世界第一。附费德勒告别信:在这些年网球给我的所有礼物中,最棒的毫无疑问是我一路上所遇到的人:我的朋友、我的竞争对手、以及最重要的球迷,是他们给予了这项运动生命。今天,我想和大家分享一些消息。正如你们中的许多人所知道的,过去三年中,我遇到了受伤和手术的挑战。......
**台风“梅花”将在大连沿海登陆将逐步变性为温带气旋** 台风“梅花”将在大连沿海登陆将逐步变性为温带气旋||台风“梅花”将在大连沿海登陆后逐渐变性为温带气旋||台风“梅花”将在大连沿海登陆将逐渐变性为温带气旋||台风“梅花”将在大连沿海登陆后变性为温带气旋||台风“梅花”将在大连沿海登陆后逐渐变性 **台风“梅花”将于16日傍晚前后在辽宁大连沿海登陆** 记者9月16日从辽宁省大连市气象部门获悉,今年第12号台风“梅花”将于16日傍晚前后在大连市旅顺口区至庄河市一带沿海登陆,之后逐渐变性为温带气旋。 受台风“梅花”影响,14日8时至16日10时,大连全市平均降雨量为132毫米,最大降雨量出现在金普新区大李家街道正明寺村,为283.6毫米;一小时最大降雨量出现在长海县广鹿岛镇,为49.4毫米......
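A prediction file with this layout can be loaded for inspection with a short parser. The sketch below is illustrative only: `parse_prediction_line` is a hypothetical helper, and the names and order of the appended columns are assumptions that should be adjusted to match the actual 'append_cols' setting and output file.

```python
def parse_prediction_line(line, appended_columns=("content", "title", "tag"), beam_sep="||"):
    """Split one output row into the prediction, its beam candidates, and appended columns.

    Assumes the first two columns are the prediction and the '||'-joined beams,
    followed by the appended input columns in the given order.
    """
    cells = line.rstrip("\r\n").split("\t")
    expected = 2 + len(appended_columns)
    if len(cells) != expected:
        raise ValueError(f"expected {expected} columns, got {len(cells)}")
    record = {"prediction": cells[0], "beams": cells[1].split(beam_sep)}
    record.update(zip(appended_columns, cells[2:]))
    return record
```

Iterating over the lines of the output file and calling this function on each one gives a list of dictionaries, which makes it easy to compare the top prediction against the other beam candidates or the original headline.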
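English Text Summarization
The EasyNLP model zoo also includes English text summarization models, namely PEGASUS and BRIO. The table below shows the performance of the two models on English text summarization data. Users can train and run prediction with these models using the same code as above. Note that EasyNLP processes Chinese by default, so when working with English text you need to set language to en in 'user_defined_parameters'; if it is not provided, it defaults to Chinese (zh).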
| English | Text summarization (Rouge1/2/L) |
| --- | --- |
| alibaba-pai/pegasus-summary-generation-en | 37.79/18.69/35.44 |
| hfl/brio-cnndm-uncased | 41.46/23.34/38.91 |
The training procedure is as follows:
wget http://atp-modelzoo-sh.oss-cn-shanghai.aliyuncs.com/release/tutorials/generation/en_train.tsv
wget http://atp-modelzoo-sh.oss-cn-shanghai.aliyuncs.com/release/tutorials/generation/en_dev.tsv
python main.py \
--mode train \
--app_name=sequence_generation \
--worker_gpu=1 \
--tables=./en_train.tsv,./en_dev.tsv \
--input_schema=title:str:1,content:str:1 \
--first_sequence=content \
--second_sequence=title \
--label_name=title \
--checkpoint_dir=./finetuned_en_model \
--micro_batch_size=1 \
--sequence_length=512 \
--epoch_num 1 \
--save_checkpoint_steps=500 \
--export_tf_checkpoint_type none \
--user_defined_parameters 'language=en pretrain_model_name_or_path=alibaba-pai/pegasus-summary-generation-en copy=false max_encoder_length=512 min_decoder_length=64 max_decoder_length=128 no_repeat_ngram_size=2 num_beams=5 num_return_sequences=5'
The prediction procedure is as follows:
python main.py \
--mode=predict \
--app_name=sequence_generation \
--worker_gpu=1 \
--tables=./en_dev.tsv \
--outputs=./en.preds.txt \
--input_schema=title:str:1,content:str:1 \
--output_schema=predictions,beams \
--append_cols=title,content \
--first_sequence=content \
--checkpoint_dir=./finetuned_en_model \
--micro_batch_size 32 \
--sequence_length 512 \
--user_defined_parameters 'language=en copy=false max_encoder_length=512 min_decoder_length=64 max_decoder_length=128 no_repeat_ngram_size=2 num_beams=5 num_return_sequences=5'
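The following shows the models' predicted summaries for a trending technology news article, starting with the article itself: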
With the image generator Stable Diffusion, you can conjure within seconds a potrait of Beyoncé as if painted by Vincent van Gogh, a cyberpunk cityscape in the style of 18th century Japanese artist Hokusai and a complex alien world straight out of science fiction. Released to the public just two weeks ago, it’s become one of several popular AI-powered text-to-image generators, including DALL-E 2, that have taken the internet by storm. Now, the company behind Stable Diffusion is in discussions to raise $100 million from investors, according to three people with knowledge of the matter. Investment firm Coatue expressed initial interest in a deal that would value the London-based startup Stability AI at $500 million, according to two of the people. Lightspeed Venture Partners then entered talks — which are still underway — to invest at a valuation up to $1 billion, two sources said. Stability AI, Coatue and Lightspeed declined requests for comment. The London-based startup previously raised at least $10 million in SAFE notes (a form of convertible security popular among early-stage startups) at a valuation of up to $100 million, according to one of the sources. An additional fourth source with direct knowledge confirmed Stability AI’s previous round. Much of the company’s funds came directly from founder and CEO Emad Mostaque, a former hedge fund manager. News of the prior financing was previously unreported. By nature of being open source, Stability AI’s underlying technology is free to use. So far, the company does not have a clear business model in place, according to three of the sources. However, Mostaque said in an interview last month with Yannic Kilcher, a machine learning engineer and YouTube personality, that he has already penned partnerships with “governments and leading institutions” to sell the technology. “We’ve negotiated massive deals so we’d be profitable at the door versus most money-losing big corporations,” he claims. 
The first version of Stable Diffusion itself cost just $600,000 to train, he wrote on Twitter — a fraction of the company’s total funding. Mostaque, 39, hails from Bangladesh and grew up in England. He received a master’s degree in mathematics and computer science from Oxford University in 2005 and spent 13 years working at U.K. hedge funds. In 2019, he launched Symmitree, a startup that aimed to reduce the cost of technology for people in poverty; it shuttered after one year, according to his LinkedIn profile. He then founded Stability AI in late 2020 with the mission of building open-source AI projects. According to its website, text-to-image generation is only one component of a broader apparatus of AI-powered offerings that the company is helping to build. Other open-source research groups it backs are developing tools for language, audio and biology. Stable Diffusion — created in collaboration with RunwayML, a video editing startup also backed by Coatue, and researchers at the Ludwig Maximilian University of Munich — has generated by far the most buzz among the company’s projects. It comes as AI image generators entered the zeitgeist this year, with the release of OpenAI’s DALL-E 2 in April and independent research lab Midjourney’s eponymous product in July. Google also revealed a text-to-image system, Imagen, in May, though it is not available to the public. Mostaque and his peers have said that the existing technology only represents the tip of the iceberg of what AI art is capable of creating: Future use cases could include drastically improved photorealism, video and animation. These image generators are already facing controversy: Many of them have been trained by processing billions of images on the internet without the consent of the copyright holder, prompting debate over ethics and legality. Last week, a testy debate broke out online after a Colorado fine arts competition awarded a top prize to an AI-generated work of art. 
Moreover, unlike DALL-E and Midjourney, which have restrictions in place to prevent the generation of gory or pornographic images, Stable Diffusion’s open source nature allows users to bypass such a block. On 4chan, numerous threads have appeared with AI-generated deepfakes of celebrity nudes, while Reddit has banned at least four communities that were dedicated to posting “not safe for work” AI imagery made using Stable Diffusion. It’s a double-edged sword for Stability AI, which has accumulated community goodwill precisely due to its open source approach that gives its users full access to its code. The company’s website states that the company is “building open AI tools,” a mission that mirrors the initial intent of OpenAI to democratize access to artificial intelligence. OpenAI was launched as a nonprofit research organization by prominent technologists including Sam Altman and Elon Musk, but upon accepting a $1 billion investment from Microsoft in 2019, it became a for-profit business. The move led it to focus on commercializing its technology rather than making it more widely available, drawing criticism from the AI community — and Musk himself. Stability AI has been a for-profit corporation from its inception, which Mostaque has said is meant to allow the open source research to reach more people. In an interviewwith TechCrunch last month, he said that the company was fully independent. “Nobody has any voting rights except our 75 employees — no billionaires, big funds, governments or anyone else with control of the company or the communities we support,” he said. At a $1 billion valuation, Mostaque would be ceding up to 10% of the company to the new financiers. Venture capital investors who take significant stakes in startups typically ask for board positions so they can influence the decisions the company is making using their money. 
Lightspeed, which manages $10 billion of assets, and Coatue, which is in charge of $73 billion, both have a track record of taking board seats, though it’s unclear if that will be the case with Stability AI. Follow me on Twitter. Send me a secure tip.
For the article above, the summaries generated by the two latest models are as follows:
stable Diffusion is in discussions to raise $100 million from investors, three people say. The image generator is one of several popular AI-powered text-to-image generators.
company behind the popular image generator Stable Diffusion is in talks to raise $100 million from investors, according to sources
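That covers the full process of training and running prediction with text summarization models in EasyNLP. For a more detailed tutorial, you can join the following course: Clickbait Crash Course: Chinese News Headline Generation with Machine Learning PAI EasyNLP (标题党速成班:基于机器学习PAI EasyNLP的中文新闻标题生成)
Future Outlook
Going forward, we plan to integrate knowledge-oriented Chinese pre-trained models into the EasyNLP framework, covering the common Chinese NLU and NLG domains; stay tuned. We will also integrate more SOTA models (especially Chinese models) into EasyNLP to support a variety of NLP and multimodal tasks. In addition, the Alibaba Cloud Machine Learning PAI team continues to push forward its in-house work on Chinese text generation and Chinese multimodal models. We welcome users to keep following our work and to join our open-source community to build Chinese NLP and multimodal algorithm libraries together.
GitHub: https://github.com/alibaba/EasyNLP
References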
- Chengyu Wang, Minghui Qiu, Taolin Zhang, Tingting Liu, Lei Li, Jianing Wang, Ming Wang, Jun Huang, Wei Lin. EasyNLP: A Comprehensive and Easy-to-use Toolkit for Natural Language Processing. arXiv
- Zhang, Jingqing, et al. "Pegasus: Pre-training with extracted gap-sentences for abstractive summarization." International Conference on Machine Learning. PMLR, 2020.
- Xue, Linting, et al. "mT5: A massively multilingual pre-trained text-to-text transformer." arXiv preprint arXiv:2010.11934 (2020).
- Lewis, Mike, et al. "Bart: Denoising sequence-to-sequence pre-training for natural language generation, translation, and comprehension." arXiv preprint arXiv:1910.13461 (2019).
- Song, Kaitao, et al. "Mass: Masked sequence to sequence pre-training for language generation." arXiv preprint arXiv:1905.02450 (2019).
- Dong, Li, et al. "Unified language model pre-training for natural language understanding and generation." Advances in Neural Information Processing Systems 32 (2019).
- Yixin Liu, Pengfei Liu, Dragomir Radev, and Graham Neubig. 2022. BRIO: Bringing Order to Abstractive Summarization. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 2890–2903, Dublin, Ireland. Association for Computational Linguistics.
Original link: https://click.aliyun.com/m/1000360659/
This article is original content from Alibaba Cloud and may not be reproduced without permission.