Search results for "词" ("word")

Sensitive-word filtering

Author: Richard Zhang (89205975@qq.com).

This library filters sensitive phrases according to the user's configuration. It currently supports only UTF-8 and ANSI-encoded strings.

The matching rule is maximum-length matching: the library tries to match the longest possible sensitive phrase. For example, if "damn fucker" and "damn" are both in the sensitive dictionary, the sentence "he's a damn fucker" is processed to "he's a ***********".

The library can also cope with spaces or non-letter characters inserted between sensitive words. For example, if "Bad boy" is in the sensitive dictionary, then "Bad.boy", "Bad boy", and "Bad/boy" are also filtered. If "你去死" is in the sensitive dictionary, then "你去死", "你/去死", and "你去 .死" are also filtered.

Compile requirements:
1. C++11 STL
2. Boost multi_index_container

Performance test conditions:
1. A sentence of about 100 bytes (mixed English and Chinese)
2. About 10,000 dirty phrases
3. 1,000 loop iterations
4. Intel i7 CPU

Test result: each loop iteration takes about 100 µs.

2022-04-02 17:47:14 4KB profanity, sensitive words, chat filtering
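
The maximum-length matching and separator skipping described in this entry could be sketched as follows. This is a minimal Python illustration under stated assumptions, not the library's actual C++ code: the nested-dict trie and the names `build_trie`, `mask`, and `END` are hypothetical, and treating every non-alphanumeric character as a skippable separator is an assumption.

```python
END = ""  # hypothetical terminal marker; cannot collide with a one-character key


def build_trie(phrases):
    """Build a nested-dict trie, dropping separators from each phrase."""
    root = {}
    for phrase in phrases:
        node = root
        for ch in phrase:
            if ch.isalnum():  # assumption: only letters and digits are significant
                node = node.setdefault(ch, {})
        node[END] = True
    return root


def mask(text, trie, fill="*"):
    """Replace the longest dictionary match at each position with fill characters."""
    out = []
    i, n = 0, len(text)
    while i < n:
        node, j, last_end = trie, i, -1
        while j < n:
            ch = text[j]
            if not ch.isalnum():
                if j == i:
                    break  # never start a match on a separator
                j += 1     # skip separators inside a candidate match
                continue
            node = node.get(ch)
            if node is None:
                break
            j += 1
            if END in node:
                last_end = j  # remember the longest match seen so far
        if last_end == -1:
            out.append(text[i])
            i += 1
        else:
            out.append(fill * (last_end - i))
            i = last_end
    return "".join(out)


trie = build_trie(["Bad boy", "你去死"])
print(mask("Say Bad/boy now", trie))  # Say ******* now
print(mask("你/去死", trie))           # ****
```

The sketch ignores word boundaries, so a dictionary phrase embedded in a longer word would also be masked; a real filter would need a policy for that case.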


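Chinese Wikipedia word vectors (维基百科中文词向量.zip)

Wikipedia word vectors. After decompression, sgns.wiki.char.bz2 yields a file with the extension .char, which can be converted to a .txt file. It contains more than 350,000 words, characters, and symbols, each represented as a 300-dimensional vector. Using the vectors as a word embedding layer requires loading all of them into memory, and a machine without enough RAM will simply run out of memory. For that reason, vectors for only 8,000 or 20,000 words are extracted, so the data can also be used on commonly configured machines. The project provides more than 100 sets of Chinese word vectors (embeddings) trained with different representations (dense and sparse), context features (word, ngram, character, and so on), and corpora. Pretrained vectors with different properties are easy to obtain and can be used for downstream tasks.

2022-04-02 15:34:26 336.39MB Wikipedia, Chinese Wikipedia word vectors, Chinese word vectors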
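
Reading only a limited number of vectors, as this entry describes, might look like the sketch below. It assumes the converted file uses the word2vec text layout (an optional `<count> <dim>` header line, then one `word v1 v2 ...` line per entry) and that taking the leading entries is acceptable; the function name `load_top_vectors` and its parameters are hypothetical.

```python
def load_top_vectors(path, limit, encoding="utf-8"):
    """Read at most `limit` word vectors from a word2vec-style text file."""
    vectors = {}
    with open(path, encoding=encoding) as f:
        first = f.readline().split()
        if len(first) == 2 and all(tok.isdigit() for tok in first):
            dim = int(first[1])   # header line: "<count> <dim>"
        else:
            dim = len(first) - 1  # assumption: no header, first line is a vector
            f.seek(0)
        for line in f:
            if len(vectors) >= limit:
                break             # stop early so the rest is never loaded
            parts = line.rstrip().split(" ")
            if len(parts) != dim + 1:
                continue          # skip malformed lines
            vectors[parts[0]] = [float(x) for x in parts[1:]]
    return vectors
```

Because the file is read line by line and the loop stops at `limit`, memory use is bounded by the number of vectors kept rather than by the size of the file.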


HowNet sentiment word set (Hownet情感词语集.zip)

HowNet sentiment word set.

2022-04-01 16:05:36 83KB sentiment words


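Kingsoft PowerWord's screen word-capture technique, built with DLL programming, well worth studying (金山词霸的屏幕取词技术，利用了 DLL编程，非常值得学习.rar)

The screen word-capture technique used by Kingsoft PowerWord (金山词霸). It relies on DLL programming and is well worth studying.

2022-03-31 22:20:19 49KB Kingsoft PowerWord screen word capture, DLL programming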


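Screen word-capture C++ source code (check the comments before deciding to download)

C++ source code for screen word capture, implemented with nhw32.dll. Provided for reference.

2022-03-31 22:14:25 3.47MB screen word capture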


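HMM isolated-word algorithm and source code

An implementation of the HMM algorithm for isolated words. It works fairly well and is shared here for everyone.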

2022-03-30 22:33:45 14KB HMM


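Sensitive-word filtering in Java using the DFA algorithm

Sensitive-word filtering implemented in Java with the DFA algorithm, which the uploader bills as the most efficient approach. A sensitive-word dictionary is included, so it can handle sensitive-word filtering for forum sites with little effort.

2022-03-30 13:14:46 1.39MB sensitive-word filtering, DFA, Java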


1,208 stop words for Chinese text classification

A list of 1,208 stop words for Chinese text classification.

2022-03-30 11:47:56 3KB stop words


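Python: sentence similarity based on Cilin (同义词词林), HowNet, SimHash fingerprints, character and word vectors, and the vector space model

A self-implemented sentence similarity calculation based on the cilin, hownet, simhash, wordvector, and vsm models, that is, the Tongyici Cilin synonym thesaurus, HowNet, SimHash fingerprints, character and word vectors, and the vector space model.

2022-03-29 17:13:03 7.51MB Python development, natural language processing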
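
Of the models this entry lists, the vector space model is the simplest to illustrate: each sentence becomes a term-count vector, and similarity is the cosine of the angle between the two vectors. The sketch below is not taken from the package. The names `tokenize` and `cosine_similarity` are hypothetical, and splitting text into single characters is an assumption made to keep the example free of a word-segmentation dependency.

```python
import math
from collections import Counter


def tokenize(sentence):
    """Character-level tokens; assumption: a real system would segment words."""
    return [ch for ch in sentence if ch.isalnum()]


def cosine_similarity(a, b):
    """Cosine similarity between the term-count vectors of two sentences."""
    va, vb = Counter(tokenize(a)), Counter(tokenize(b))
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm = math.sqrt(sum(c * c for c in va.values())) * math.sqrt(
        sum(c * c for c in vb.values())
    )
    return dot / norm if norm else 0.0


print(cosine_similarity("我喜欢猫", "我喜欢狗"))  # 0.75
```

Raw counts treat every token as equally important; weighting schemes such as TF-IDF are a common refinement when frequent tokens dominate the score.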


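Cidaren (词达人) answer-option hint helper script (词达人答题选项提示辅助答题脚本.zip)

A helper script for answering questions in 词达人 (Cidaren) on WeChat. While studying words in 词达人, the answer options are shown with related hints. The operating procedure is described in the Word document inside the archive folder.

2022-03-29 13:47:27 6.34MB 词达人, script, answer helper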
