Patent9 Patents Online

7 patents match the query "中电科大数据研究院有限公司" (CETC Big Data Research Institute Co., Ltd.). Search time: 0.3906039 seconds.

Invention patents: 7  Utility models: 0  Designs: 0

7 results in total, showing results 1-7

1: [Invention] Image description generation method combined with a BERT model (一种结合BERT模型的图像描述生成方法)

Application No.: 201911025320.1  Publication No.: CN110852331A  Main classification: G06K9/46

Applicant: 中电科大数据研究院有限公司 (CETC Big Data Research Institute Co., Ltd.)  Filing date: 2019.10.25  Publication date: 2020.02.28

Inventors: Song Rongwei (宋荣伟); Liu Wangyang (刘汪洋); Cao Yang (曹扬)

Abstract: The invention provides an image description generation method combined with a BERT model. First, a feature vector is extracted from the image, then compressed and expanded in dimension. Second, the dictionary is expanded with external corpus data. Third, the feature vector and the dictionary are fed into an end-to-end image description generation model with an attention mechanism, which produces a weak-semantic description sentence A. Finally, the BERT model adjusts the semantics of sentence A to yield the complete image description sentence. Compressing and dimension-expanding the feature vector strengthens the feature expressiveness of the image data; the end-to-end attention-based model generates the weak-semantic description sentence; and expanding the dictionary with external corpus data addresses insufficient vocabulary and enriches semantic meaning, so that the generated description represents the image content more accurately and with richer semantics.

2: [Invention] Protection mechanism for private data and privacy data in a consortium chain (一种联盟链中的私有与隐私数据保护机制)

Application No.: 201911055361.5  Publication No.: CN110851862A  Main classification: G06F21/62

Applicant: 中电科大数据研究院有限公司 (CETC Big Data Research Institute Co., Ltd.)  Filing date: 2019.10.31  Publication date: 2020.02.28

Inventors: Wu Jiangzhou (吴江洲); Yu Qinyong (余秦勇); Li Zesong (李泽松); Liu Wei (刘伟); Xu Chenfu (徐辰福)

Abstract: The invention provides a protection mechanism for private data and privacy data in a consortium chain, comprising nodes and an ordering service, where each node communicates with the ordering service through a channel. Each node stores a ledger file, a private database, a privacy database, and a general database; the three databases are connected to the ledger file through a peer node. The ledger file stores private data, privacy data, and general data. The beneficial effect is that private data and privacy data in the consortium chain are effectively protected and are not disclosed to all members of the chain.

3: [Invention] Punctuation prediction model training method and text punctuation determination method (一种标点预测模型训练方法及文本标点确定方法)

Application No.: 201911072366.9  Publication No.: CN110852040A  Main classification: G06F40/117

Applicant: 中电科大数据研究院有限公司 (CETC Big Data Research Institute Co., Ltd.)  Filing date: 2019.11.05  Publication date: 2020.02.28

Inventors: Liu Yanzhi (刘彦志); Cao Yang (曹扬)

Abstract: The invention provides a punctuation prediction model training method and a text punctuation determination method. The training method comprises: (1) obtaining a character-segmented text training set for training the punctuation prediction model; (2) generating training samples from the training set with a data augmentation method; (3) obtaining the trained punctuation prediction model. The text punctuation determination method comprises: (1) obtaining an unpunctuated target text; (2) obtaining the predicted punctuation mark that follows each character in the target text; (3) inserting each predicted punctuation mark after the corresponding character in the target text to obtain the punctuated text. The two methods optimize the training of the punctuation prediction model so that it reaches its best performance, which improves the correctness of the punctuation prediction results.
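
The text punctuation determination steps above can be illustrated with a minimal Python sketch. It assumes the model emits one label per character, with an empty string meaning "no punctuation"; the function names `to_char_labels` and `insert_punctuation` and the punctuation set `PUNCT` are hypothetical and do not come from the patent. The model itself and the data augmentation step are not shown, since the abstract does not specify them.

```python
# Hypothetical set of punctuation marks treated as labels (an assumption).
PUNCT = set("，。？！、；：")


def to_char_labels(punctuated):
    """Split punctuated text into (plain_text, labels).

    labels[i] is the punctuation mark that follows plain_text[i],
    or "" when no mark follows it. Leading punctuation is dropped,
    and for consecutive marks only the last one is kept.
    """
    chars, labels = [], []
    for ch in punctuated:
        if ch in PUNCT:
            if chars:
                labels[-1] = ch
        else:
            chars.append(ch)
            labels.append("")
    return "".join(chars), labels


def insert_punctuation(text, predicted):
    """Insert predicted[i] after text[i] (step 3 of the determination method)."""
    if len(text) != len(predicted):
        raise ValueError("need exactly one predicted label per character")
    return "".join(ch + mark for ch, mark in zip(text, predicted))


if __name__ == "__main__":
    plain, labels = to_char_labels("你好，世界。")
    print(plain, labels)
    print(insert_punctuation(plain, labels))
```

Here `to_char_labels` shows one way character-level training pairs could be derived from punctuated text, and `insert_punctuation` is its inverse, so a round trip reproduces the original sentence.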

4: [Invention] Automatically updated lexical analysis system (一种自动化更新的词法分析系统)

Application No.: 201911060395.3  Publication No.: CN110866400A  Main classification: G06F40/295

Applicant: 中电科大数据研究院有限公司 (CETC Big Data Research Institute Co., Ltd.)  Filing date: 2019.11.01  Publication date: 2020.03.06

Inventors: Yan Yuting (晏玉珽); Yin Zhongwen (印忠文); Chang Bing (常兵); Cao Yang (曹扬)

Abstract: The invention provides an automatically updated lexical analysis system comprising a user lexicon, a system control module, and a sub-module control module connected to the system control module. The sub-module control module is connected to both the data acquisition and processing module and the user lexicon update module. The data acquisition and processing module is connected in sequence to the new word discovery module and the lexical analysis module. The user lexicon is connected to the user lexicon update module, the lexical analysis module, and the new word analysis module. The invention addresses the domain adaptability problem common to existing lexical analysis systems and, by acquiring text data automatically and updating the dictionary automatically, meets the challenge that constantly changing word usage and new terms on the Internet pose to lexical analysis accuracy. It supports upper-layer Chinese natural language processing tasks such as semantic understanding, information retrieval, and machine translation.

5: [Invention] Cross-media retrieval method based on a cross-media unified representation model (一种基于跨媒体统一表征模型的跨媒体检索方法)

Application No.: 201911061277.4  Publication No.: CN110866129A  Main classification: G06F16/48

Applicant: 中电科大数据研究院有限公司 (CETC Big Data Research Institute Co., Ltd.)  Filing date: 2019.11.01  Publication date: 2020.03.06

Inventors: Wang Jin (王进); Liu Wangyang (刘汪洋); Cao Yang (曹扬); Zhang Qiuyue (张秋悦); Yan Yingying (闫盈盈); Song Rongwei (宋荣伟); Kan Danhui (阚丹会)

Abstract: To address the cross-media retrieval problem, the invention proposes a cross-media retrieval method based on a cross-media unified representation model, comprising the following steps: (1) cross-media database construction: building a large cross-media database for the government affairs news domain; (2) cross-media data preprocessing: preprocessing the input text, image, video, and audio data; (3) original-domain feature extraction: extracting the original-domain feature vectors of the cross-media data; (4) unified representation: extracting the feature vectors of the cross-media data in a common representation space; (5) semantic similarity calculation and ranking: computing the semantic similarity between the retrieval target and the data in the cross-media database, then ranking and outputting the results. The invention provides a mutual retrieval method that supports four media types together with a unified representation model for multiple media types, improves cross-media semantic retrieval precision, and has broad application prospects.
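
Step (5) above, similarity calculation and ranking in the common representation space, can be sketched in Python as follows. The abstract does not name the similarity measure, so cosine similarity is an assumption here, and the names `cosine_similarity`, `rank_by_similarity`, and the toy item identifiers are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_by_similarity(query_vec, database, top_k=None):
    """Return (item_id, score) pairs sorted by descending similarity.

    `database` maps item ids to vectors that are assumed to already
    live in the common representation space, whatever their media type.
    """
    scored = [(item_id, cosine_similarity(query_vec, vec))
              for item_id, vec in database.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


if __name__ == "__main__":
    # Toy vectors standing in for text, image, and audio items.
    db = {"text_1": [1.0, 0.0], "image_7": [0.9, 0.1], "audio_3": [0.0, 1.0]}
    for item_id, score in rank_by_similarity([1.0, 0.0], db):
        print(item_id, round(score, 4))
```

Because every item is compared in the same vector space, a text query can rank image, video, and audio items with the same function.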

6: [Invention] Knowledge representation method for cross-media knowledge reasoning tasks (一种面向跨媒体知识推理任务的知识表示方法)

Application No.: 201911061280.6  Publication No.: CN110909881A  Main classification: G06N5/02

Applicant: 中电科大数据研究院有限公司 (CETC Big Data Research Institute Co., Ltd.)  Filing date: 2019.11.01  Publication date: 2020.03.24

Inventors: Chang Pan (昌攀); Cao Yang (曹扬); Wang Jin (王进); Liu Wangyang (刘汪洋)

Abstract: The invention provides a knowledge representation method for cross-media knowledge reasoning tasks. The method extracts the RDF triple information of a cross-media knowledge graph and represents the RDF triple data as initial low-dimensional vectors; trains the vector representations of positive and negative triple samples with a max-margin cost function; and mines the similarity (or difference) between positive and negative triple samples and adds it to the max-margin cost function, improving the model's ability to identify similar entities during knowledge reasoning. The invention can perform knowledge representation and knowledge reasoning on cross-media knowledge graph triples built on RDF; using the learned knowledge reasoning model for entity linking and knowledge classification improves the accuracy of link prediction and triple classification in the cross-media knowledge graph.
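
The max-margin cost function mentioned above can be illustrated with a small Python sketch. The abstract does not give the scoring function, so a translation-style L1 distance (head + relation compared with tail) is assumed here purely for illustration; the extra similarity (or difference) term the patent adds is not specified in the abstract and is left out. The names `triple_distance` and `margin_loss` are hypothetical.

```python
def triple_distance(head, relation, tail):
    """L1 distance between head + relation and tail (assumed scoring function)."""
    return sum(abs(h + r - t) for h, r, t in zip(head, relation, tail))


def margin_loss(positive, negative, margin=1.0):
    """Max-margin loss for one positive and one negative (h, r, t) triple.

    The loss is zero once the negative triple is at least `margin`
    farther from satisfying h + r = t than the positive triple is.
    """
    return max(0.0, margin + triple_distance(*positive) - triple_distance(*negative))


if __name__ == "__main__":
    h, r, t = [0.0, 0.0], [1.0, 0.0], [1.0, 0.0]
    far_negative = (h, r, [0.0, 3.0])    # clearly wrong tail
    near_negative = (h, r, [1.0, 0.5])   # nearly plausible tail
    print(margin_loss((h, r, t), far_negative))
    print(margin_loss((h, r, t), near_negative))
```

A negative triple that is already far from the positive one contributes no loss, while a near miss still incurs a penalty, which is what pushes the embeddings of true and corrupted triples apart during training.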

7: [Invention] Illegal ("black") tour guide detection method based on a gradient boosting algorithm (一种基于梯度提升算法的黑导游检测方法)

Application No.: 201911173486.8  Publication No.: CN110909545A  Main classification: G06F40/289

Applicants: 电子科技大学 (University of Electronic Science and Technology of China); 中电科大数据研究院有限公司 (CETC Big Data Research Institute Co., Ltd.)  Filing date: 2019.11.26  Publication date: 2020.03.24

Inventors: Zhan Jinyu (詹瑾瑜); Yu Jiayu (余佳雨); Jiang Wei (江维); Li Xiang (李响); Yang Rui (杨瑞); Liu Changshu (刘昌澍); Li Bozhi (李博智); Cai Yushu (蔡玉舒); Zhou Qiaoyu (周巧瑜)

Abstract: The invention discloses an illegal tour guide detection method based on a gradient boosting algorithm, applied in the field of data detection. To address lagging supervision in the tourism industry, the method obtains website news URL data and trains a word vector model through word embedding; based on that word vector model, it trains an illegal tour guide category prediction model with a gradient boosting algorithm; finally, a complaint text is input into the prediction model to obtain its predicted category. Compared with existing manual data detection, detection efficiency is significantly improved.
