Found 5 patents related to the query "Dong Panshan"; search time: 0.3754697 seconds.
Invention patents: 5 Utility models: 0 Designs: 0
5 results; showing 1-5
Application No.: 201911365765.4 Publication No.: CN111046841A Main classification: G06K9/00
Abstract: The invention provides a method, system, terminal, and storage medium for extracting text from a PowerPoint file. The method comprises: selecting text tags from the PowerPoint Document data stream; selecting text sub-tags by traversing the sub-tag types of each text tag; extracting data from the text sub-tags and converting it into text segments; and concatenating the text segments extracted from all text tags to produce a text file. The method has good compatibility: it does not depend on Office PowerPoint or Kingsoft DPS components, does not require installing Office or extracting Office component files, and calls only system API functions. It is efficient: the file is read in binary form and processed at precisely located positions, which significantly improves execution efficiency. The program package is small: everything is implemented with hand-written code and system API calls, with no dependency on third-party program files.
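The tag-filtering steps in this abstract can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: it assumes the record layout of the public [MS-PPT] specification (an 8-byte header holding version/instance, type, and length), treats `TextCharsAtom` (0x0FA0) and `TextBytesAtom` (0x0FA8) as the "text sub-tags", and runs on a synthetic byte string rather than a stream read from a real .ppt container. The names `iter_text`, `extract_text`, and `record` are hypothetical helpers introduced for the example.

```python
import struct

# Record types taken from the public [MS-PPT] specification.
RT_TEXT_CHARS_ATOM = 0x0FA0  # text stored as UTF-16LE
RT_TEXT_BYTES_ATOM = 0x0FA8  # text stored as one byte per character


def iter_text(stream, start=0, end=None):
    """Yield text segments found in a run of PowerPoint-style records."""
    end = len(stream) if end is None else end
    pos = start
    while pos + 8 <= end:
        ver_inst, rec_type, rec_len = struct.unpack_from("<HHI", stream, pos)
        body_start = pos + 8
        body_end = min(body_start + rec_len, end)
        if ver_inst & 0x000F == 0x000F:
            # Container record: its body is a sequence of child records.
            yield from iter_text(stream, body_start, body_end)
        elif rec_type == RT_TEXT_CHARS_ATOM:
            yield stream[body_start:body_end].decode("utf-16-le", errors="replace")
        elif rec_type == RT_TEXT_BYTES_ATOM:
            # Each byte is the low byte of a character whose high byte is 0.
            yield stream[body_start:body_end].decode("latin-1")
        pos = body_start + rec_len


def extract_text(stream):
    """Concatenate all text segments, one per line."""
    return "\n".join(iter_text(stream))


def record(rec_type, body, container=False):
    """Build one record for the demo (hypothetical helper)."""
    ver_inst = 0x000F if container else 0x0000
    return struct.pack("<HHI", ver_inst, rec_type, len(body)) + body


# Synthetic demo data: one container holding two text atoms.
# 0x03EE is used here only as an example container type.
demo = record(
    0x03EE,
    record(RT_TEXT_CHARS_ATOM, "Title".encode("utf-16-le"))
    + record(RT_TEXT_BYTES_ATOM, b"Body text"),
    container=True,
)
print(extract_text(demo))
```

Reading the "PowerPoint Document" stream out of the compound file is deliberately left out of the sketch; a real extractor would first locate that stream and then pass its bytes to `extract_text`.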

Application No.: 202010031025.3 Publication No.: CN111209723A Main classification: G06F40/149
Abstract: The invention provides a method and system for parsing the Office binary format and extracting document property text. By analyzing the binary data of an Office document and following the way properties are stored in it, the method extracts all property text from the document. Compared with extracting document property text through secondary development interfaces or the JAVA OPI interface, analyzing the binary file directly is cross-platform, supporting Linux and other systems as well as Windows. It is efficient: the file is read in binary form and processed at precisely located positions, which significantly improves execution efficiency. The program package is small: everything is implemented with hand-written code and system API calls, with no dependency on third-party program files. The invention is not limited to extracting text from Office files; it applies to any file that uses the Office storage scheme, such as Kingsoft Office files.

Application No.: 202010013792.1 Publication No.: CN111241096A Main classification: G06F16/22
Abstract: The invention provides a method, system, terminal, and storage medium for extracting text from an Excel document. The method comprises: selecting the SST tag from the WorkBook data stream of the Excel document and extracting the text index library from it; selecting all Sheet tags in the WorkBook data stream by filtering on the tag Type value, and reading the Sheet tags to build a sheet index library; reading the offset of each sheet from the sheet index library, using that offset to pick out the LabelSst tags among all tags of the sheet, and reading the index information of each LabelSst tag; and looking up the corresponding text in the text index library by that index information, then concatenating the text according to the position in the sheet index library of the LabelSst tag that the index information belongs to. The invention has good compatibility. The file is read in binary form and processed at precisely located positions, which significantly improves execution efficiency. Everything is implemented with hand-written code and system API calls, with no dependency on third-party program files, so the program package is small.
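The SST / Sheet / LabelSst lookup described in this abstract can be sketched as follows. This is a simplified illustration, not the patented implementation: it assumes the BIFF8 record layout of the public [MS-XLS] specification (2-byte type, 2-byte length, then the body), ignores Continue records, and runs on a synthetic workbook stream built by the hypothetical helpers `rec`, `sst_string`, and `demo_stream` (whose BOF records are left empty for brevity).

```python
import struct

# BIFF8 record types taken from the public [MS-XLS] specification.
RT_EOF = 0x000A
RT_BOUNDSHEET = 0x0085
RT_SST = 0x00FC
RT_LABELSST = 0x00FD
RT_BOF = 0x0809


def records(stream, pos=0):
    """Yield (type, body) for each BIFF record from pos onward."""
    while pos + 4 <= len(stream):
        rec_type, rec_len = struct.unpack_from("<HH", stream, pos)
        yield rec_type, stream[pos + 4 : pos + 4 + rec_len]
        pos += 4 + rec_len


def parse_sst(body):
    """Decode the shared string table (simplified: no Continue records)."""
    _total, unique = struct.unpack_from("<II", body, 0)
    pos, strings = 8, []
    for _ in range(unique):
        cch, flags = struct.unpack_from("<HB", body, pos)
        pos += 3
        runs = ext = 0
        if flags & 0x08:  # rich-text run count precedes the characters
            (runs,) = struct.unpack_from("<H", body, pos)
            pos += 2
        if flags & 0x04:  # extended-data size precedes the characters
            (ext,) = struct.unpack_from("<I", body, pos)
            pos += 4
        if flags & 0x01:  # two bytes per character
            text = body[pos : pos + 2 * cch].decode("utf-16-le")
            pos += 2 * cch
        else:             # one byte per character, high byte implied 0
            text = body[pos : pos + cch].decode("latin-1")
            pos += cch
        pos += 4 * runs + ext  # skip the run and extended data themselves
        strings.append(text)
    return strings


def extract_text(stream):
    """Collect cell strings sheet by sheet, in record order."""
    strings, sheet_offsets = [], []
    for rec_type, body in records(stream):
        if rec_type == RT_EOF:  # end of the workbook globals
            break
        if rec_type == RT_SST:
            strings = parse_sst(body)
        elif rec_type == RT_BOUNDSHEET:
            sheet_offsets.append(struct.unpack_from("<I", body, 0)[0])
    out = []
    for offset in sheet_offsets:
        for rec_type, body in records(stream, offset):
            if rec_type == RT_EOF:  # end of this sheet
                break
            if rec_type == RT_LABELSST:
                (isst,) = struct.unpack_from("<I", body, 6)
                out.append(strings[isst])
    return "\n".join(out)


def rec(rec_type, body=b""):
    return struct.pack("<HH", rec_type, len(body)) + body


def sst_string(text):
    return struct.pack("<HB", len(text), 0) + text.encode("latin-1")


def demo_stream():
    """Build a tiny synthetic workbook stream with one sheet."""
    sst = rec(RT_SST, struct.pack("<II", 2, 2) + sst_string("Name") + sst_string("Total"))

    def globals_part(sheet_pos):
        sheet_ref = rec(RT_BOUNDSHEET, struct.pack("<IHBB", sheet_pos, 0, 1, 0) + b"S")
        return rec(RT_BOF) + sheet_ref + sst + rec(RT_EOF)

    sheet = (
        rec(RT_BOF)
        + rec(RT_LABELSST, struct.pack("<HHHI", 0, 0, 0, 1))  # cell -> "Total"
        + rec(RT_LABELSST, struct.pack("<HHHI", 0, 1, 0, 0))  # cell -> "Name"
        + rec(RT_EOF)
    )
    return globals_part(len(globals_part(0))) + sheet


print(extract_text(demo_stream()))
```

The two passes mirror the abstract: the first builds the text index (SST) and the sheet index (BoundSheet offsets), and the second jumps to each sheet offset and resolves every LabelSst index against the text index.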

Application No.: 202010031347.8 Publication No.: CN111241787A Main classification: G06F40/126
Abstract: The invention provides a method and system for parsing the Word binary format and extracting the text of a document. By analyzing the binary data of a Word document and following the way text is stored in it, the method extracts all text from the document. Compared with extracting document text through secondary development interfaces, the JAVA OPI interface, or conversion tools, analyzing the binary format directly offers good compatibility, high efficiency, and a small program package. It does not depend on Office PowerPoint or Kingsoft WPS components, does not require installing Office or extracting Office component files, and calls only system API functions. The file is read in binary form and processed at precisely located positions, which significantly improves execution efficiency. Everything is implemented with hand-written code and system API calls, with no dependency on third-party program files.

Application No.: 201911109539.X Publication No.: CN111005940A Main classification: F16C33/46
Abstract: The invention provides a double-row cylindrical brass solid cage with double-center arc-shaped pockets. The cage consists of a cage body with N cylindrical roller pockets. Each cylindrical roller pocket is formed by a pair of opposing arcs that are centered on the bearing's center of rotation and defined by a radius R. The cage is characterized in that the lower part of each arc is joined to an arc of the same curvature but a different center. Its advantages include improved bearing stability and a longer bearing service life.
