JP7543339B2

JP7543339B2 - Document mapping display device, document mapping display method, and document mapping display program

Info

Publication number: JP7543339B2
Application number: JP2022071856A
Authority: JP
Inventors: 晃太穴井; 祐一四方; 俊介山本; 萌津田; 梢足立
Original assignee: Toyota Technical Development Corp
Current assignee: Toyota Technical Development Corp
Priority date: 2021-05-28
Filing date: 2022-04-25
Publication date: 2024-09-02
Anticipated expiration: 2042-04-25
Also published as: JP2022183023A

Description

The present invention relates to a document mapping display device, a document mapping display method, and a document mapping display program, and in particular to a device, method, and program for visualizing, for documents such as papers and patent publications, the fields in which the number of documents is increasing.

For published documents such as papers and patent publications, the positions and number of documents are sometimes expressed as a distribution map on a two-dimensional plane, focusing on predetermined text present in each document. For example, visual displays such as so-called heat maps have been used, in which the coloring on the two-dimensional plane is darkened according to the number of information elements (see Patent Document 1).

However, although arranging multiple documents on a two-dimensional plane makes it possible to display the similarity of documents visually, the result depends on the experience and intuition of the person interpreting the arrangement, so its objectivity cannot be called sufficient. Furthermore, when the set of documents changes, the arrangement of the documents on the two-dimensional plane also changes, which makes it difficult to judge how the documents have shifted over time. Note that conventional display methods organize multiple documents visually on a two-dimensional plane with reference to predetermined text in the documents, and have therefore been used to survey document trends, such as which fields are attracting attention.

Patent Document 1: Japanese Unexamined Patent Application Publication No. H9-114859

The present invention has been made in view of the above points. It provides a document mapping display device that ensures objectivity when published documents such as papers and patent publications are expressed as a distribution map on a two-dimensional plane, focusing on predetermined text present in each document, and that makes the shift of documents over time visible. It also provides a document mapping display method and a document mapping display program.

That is, the document mapping display device of the embodiment comprises: a document acquisition unit that acquires multiple documents; a text acquisition unit that acquires predetermined text from the multiple documents; a document arrangement unit that arranges the multiple documents on a two-dimensional plane according to the similarity of the predetermined text contained in each of the documents; an extraction unit that extracts the time-series change in the number of documents in a predetermined area of the two-dimensional plane; a calculation unit that calculates a growth degree based on that time-series change; and an output unit that outputs the growth degree.

In addition, the document mapping display device of the embodiment comprises: a document acquisition unit that acquires multiple documents; a text acquisition unit that acquires predetermined text from the multiple documents; a document arrangement unit that arranges the multiple documents on a two-dimensional plane according to the similarity of the predetermined text contained in each of the documents; an extraction unit that extracts the time-series change in the number of documents in a predetermined area of the two-dimensional plane; a calculation unit that calculates a growth degree based on that time-series change; an accumulation detection unit that detects an accumulation area based on the multiple documents in the predetermined area of the two-dimensional plane; and a time-series change output unit that displays, on the two-dimensional plane, the growth degree in the predetermined area and the time-series change points of the accumulation area.

Furthermore, the document acquisition unit may include a crawling unit and acquire the multiple documents via an Internet line.

Furthermore, tag information indicating the characteristics of a document may be attached to each of the multiple documents, and the document acquisition unit may acquire the multiple documents based on the tag information.

Furthermore, the arrangement on the two-dimensional plane performed by the document arrangement unit according to the similarity between the predetermined texts may be based on natural language processing.

Furthermore, the document arrangement unit may include a vectorization unit that generates a feature vector for each of the multiple documents by natural language processing.

Furthermore, the extraction unit may divide the two-dimensional plane into a grid, treat one of the resulting sections as the predetermined area, and extract the number of documents present in that area.

Furthermore, the extraction unit may extract the time-series change in the number of documents based on a comparison of the numbers of documents in the predetermined area of the two-dimensional plane.

Furthermore, the extraction unit may extract a change in the set of predetermined texts on the two-dimensional plane, and may extract the difference in that change based on the change in the number of documents in the predetermined area.

Furthermore, the calculation unit may calculate the growth degree of one section from the time-series change in the number of documents present in the other sections adjacent to that section.

Furthermore, the output unit may indicate the magnitude of the growth degree on the two-dimensional plane by the type of arrow. The output unit may also display the growth degree as a numerical value for each document.

Furthermore, the accumulation detection unit may detect the accumulation area based on the density of the multiple documents in the predetermined area of the two-dimensional plane, and an arbitrary number of accumulation areas may be specified. The time-series change output unit may display the time-series change points of the accumulation areas on the two-dimensional plane as circles joined by connecting lines. The time-series change output unit may also indicate the magnitude of the growth degree on the two-dimensional plane by the type of arrow.

The document mapping display device of the present invention includes a document acquisition unit that acquires multiple documents, a text acquisition unit that acquires predetermined text from the multiple documents, a document arrangement unit that arranges the multiple documents on a two-dimensional plane according to the similarity of the predetermined text contained in each of the documents, an extraction unit that extracts the time-series change in the number of documents in a predetermined area of the two-dimensional plane, a calculation unit that calculates a growth degree based on that time-series change, and an output unit that outputs the growth degree. It can therefore ensure objectivity when published documents such as papers and patent publications are expressed as a distribution map on a two-dimensional plane, focusing on predetermined text present in each document, and can make the shift of documents over time visible. The same effect can be obtained with the document mapping display method and the document mapping display program.

In addition, the document mapping display device of the present invention includes a document acquisition unit that acquires multiple documents, a text acquisition unit that acquires predetermined text from the multiple documents, a document arrangement unit that arranges the multiple documents on a two-dimensional plane according to the similarity of the predetermined text contained in each of the documents, an extraction unit that extracts the time-series change in the number of documents in a predetermined area of the two-dimensional plane, a calculation unit that calculates a growth degree based on that time-series change, an accumulation detection unit that detects an accumulation area based on the multiple documents in the predetermined area, and a time-series change output unit that displays, on the two-dimensional plane, the growth degree in the predetermined area and the time-series change points of the accumulation area. It can therefore ensure objectivity when published documents such as papers and patent publications are expressed as a distribution map on a two-dimensional plane, focusing on predetermined text present in each document, and can make the shift of documents over time visible. The same effect can be obtained with the document mapping display method and the document mapping display program.

FIG. 1 is a schematic configuration diagram showing an overview of the document mapping display device common to the first and second embodiments.
FIG. 2 is a block diagram showing the internal configuration of the document mapping display device common to the first and second embodiments.
FIG. 3 is a block diagram showing the functional units in the document mapping display device of the first embodiment.
FIG. 4 is a schematic diagram of a two-dimensional plane displaying multiple documents in the first embodiment.
FIG. 5(A) is a schematic diagram of the two-dimensional plane, and FIG. 5(B) is a schematic diagram showing the grid-like sections.
FIG. 6 is a first schematic diagram showing an example of calculating the growth degree in the grid-like sections, where (A) is a calculation example in the horizontal-axis direction, (B) is a calculation example in the vertical-axis direction, and (C) is a calculation example for the overall growth degree of a section.
FIG. 7 is a second schematic diagram showing an example of calculating the growth degree in the grid-like sections, where (A) is a tabulation example in the horizontal-axis direction, (B) is a tabulation example in the vertical-axis direction, and (C) is a tabulation example of the overall growth degree of a section.
FIG. 8 shows display examples of the growth degree, where (A) is an example of arrow display and (B) is an example of numerical display.
FIG. 9 is a first flowchart showing the main control of the document mapping display device of the first embodiment.
FIG. 10 is a second flowchart showing the main control of the document mapping display device of the first embodiment.
FIG. 11 is a third flowchart showing the main control of the document mapping display device of the first embodiment.
FIG. 12 is a block diagram showing the functional units in the document mapping display device of the second embodiment.
FIG. 13 is a schematic diagram of a two-dimensional plane displaying multiple documents in the second embodiment.
FIG. 14 is a schematic diagram in which circles for document accumulation areas and lines connecting the circles are superimposed on the two-dimensional plane of FIG. 13.
FIG. 15 is a partially enlarged schematic diagram of FIG. 14.
FIG. 16 is a first flowchart showing the main control of the document mapping display device of the second embodiment.
FIG. 17 is a second flowchart showing the main control of the document mapping display device of the second embodiment.
FIG. 18 is a third flowchart showing the main control of the document mapping display device of the second embodiment.

The document mapping display device of the first and second embodiments represents, on a two-dimensional plane, how similar multiple documents are to one another based on predetermined text present in them (that is, how close the documents are), extracts the change in the number of documents on the two-dimensional plane, and visually shows the direction in which the number of documents is growing.

The multiple documents are, for example, papers (research papers, conference reports), technical reports, published patent applications, and patent publications issued in Japan or abroad. Any other kind of document written in text is also acceptable, such as newspaper and magazine articles, laws, regulations, and notices published by legislative, administrative, or judicial bodies, and announcements from corporations.

The predetermined text present in the documents is text that conveys the specific content of a document, for example its title, summary, or abstract. The number of texts is of course not limited to one per document and may be any suitable number.

In addition, tag information indicating the characteristics of the document is attached to each of the multiple documents. When the document is a paper, the tag information includes the summary and keywords of the document, its year of publication, and bibliographic items such as the author and affiliation. For patent publications and the like, bibliographic items such as the International Patent Classification (IPC), inventor, and applicant are also added to the tag information. Tag information can also be entered by the user of the document mapping display device.
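As an illustration of how tag information might be held alongside each document and used for selection, the following is a minimal sketch. The `Document` schema, the tag keys (`"ipc"`), and the `select_by_tag` helper are hypothetical and are not taken from the patent.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    """One document with its predetermined text and tag information (hypothetical schema)."""
    doc_id: str
    text: str   # predetermined text, e.g. the title or abstract
    year: int   # publication year, treated here as one of the tags
    tags: dict = field(default_factory=dict)  # e.g. {"ipc": "G06F"}


def select_by_tag(documents, key, value):
    """Return the documents whose tag `key` equals `value`."""
    return [d for d in documents if d.tags.get(key) == value]


# Illustrative data only
docs = [
    Document("D1", "Battery cooling structure", 2020, {"ipc": "H01M"}),
    Document("D2", "Document search device", 2021, {"ipc": "G06F"}),
    Document("D3", "Text classification method", 2021, {"ipc": "G06F"}),
]
g06f_docs = select_by_tag(docs, "ipc", "G06F")
```

Keeping the year and classification next to the text is what later allows documents to be counted per year and colored per classification.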

FIG. 1 is a schematic diagram showing the configuration of the document mapping display device 1 common to the first and second embodiments. The documents subject to mapping are either stored on fixed media 2 such as a CD-ROM or DVD-ROM, or obtainable via an Internet line 3. Documents obtained from the fixed media 2 or via the Internet line 3 are acquired by the document mapping display device 1. The document mapping display device 1 is any of various electronic computers (computing resources), such as a personal computer (PC), tablet terminal, or smartphone. A server (not shown) for data storage is also connected to the document mapping display device 1.

FIG. 2 is a block diagram showing the internal configuration of the document mapping display device 1 common to the first and second embodiments. As the block diagram shows, in hardware terms the device consists of a CPU 11, RAM 12, ROM 13, a storage unit 14, and an I/O (input/output interface) 15. It also includes a main memory, LSIs, and the like. In software terms, it is realized by a document mapping display program and the like loaded into the main memory.

When the functional units of the document mapping display device 1 are realized in software, the device is realized by executing the instructions of a program, that is, software that implements each function. The recording medium that stores this program can be a "non-transitory tangible medium" such as a CD, DVD, semiconductor memory, or programmable logic circuit. The program may also be supplied to the document mapping display device 1 (computer) via any transmission medium capable of transmitting it (a communication network, broadcast waves, and so on).

The storage units in the document mapping display device 1 are the RAM 12, the ROM 13, and the storage unit 14, which is a storage device such as an HDD or SSD. Each functional unit that performs arithmetic processing is an arithmetic element such as the CPU 11. As shown in the block diagram of FIG. 3, the document mapping display device 1 includes functional units such as a document acquisition unit 110, a text acquisition unit 120, a document arrangement unit 130, an extraction unit 140, a calculation unit 150, an output unit 160, a crawling unit 111, and a vectorization unit 131.

The I/O 15 is an interface, buffer, and the like for communication (transmission and reception). It is used to connect to the Internet line, to receive input signals from a reading unit 4 (reader) for CD-ROMs, DVD-ROMs, and the like, and to send output signals to a display unit 7 and the like, and it works in conjunction with the CPU 11. The display unit 7 is a known display (a liquid crystal display device, an organic EL display device, or the like). The display unit 7 may also be a device with an image display function, such as a tablet terminal or smartphone. Input devices such as a keyboard 5 and a mouse 6 are also connected to the I/O 15.

As described later, the document mapping display device 1 (computer) common to the first and second embodiments has the function of expressing multiple documents as a distribution map on a two-dimensional plane based on their predetermined text, and of visualizing and displaying the growth degree.

First, the individual functional units of the document mapping display device 1 (computer) of the first embodiment are described in order with reference to the block diagram of FIG. 3 and other figures.

The document acquisition unit 110 acquires multiple documents. Documents stored on fixed media 2 such as a CD-ROM or DVD-ROM can be acquired as data through the reading unit 4. Alternatively, if the device is connected to an Internet line, the target documents can be received from an external server (not shown). Acquired documents are temporarily stored in the storage unit 14.

The document acquisition unit 110 may include a crawling unit 111. The crawling unit 111 acquires information from websites on the Internet and creates a search database index. The crawling unit 111 then automatically accesses the websites where the target documents exist and acquires them.

For crawling, the user of the document mapping display device 1 enters information about the target documents (the title of a paper, a publication such as a patent gazette of a given government office). The document acquisition unit 110 then crawls websites based on that information, obtains the HTML information of the matching documents, and acquires the documents.

The text acquisition unit 120 acquires the predetermined text from the multiple documents.

The document arrangement unit 130 arranges the multiple documents on a two-dimensional plane according to the similarity of the predetermined text contained in each document. This arrangement is based on natural language processing. Text in documents contains variations and inconsistencies of expression that are specific to the language, so natural language processing is desirable to make the comparison of similarity between predetermined texts go smoothly.

The similarity between predetermined texts referred to here means how close the texts are in meaning.

The document arrangement unit 130 includes a vectorization unit 131. The vectorization unit 131 generates a feature vector for each of the multiple documents by natural language processing, so each document holds a feature vector. The feature vector can be displayed in two dimensions after its number of dimensions is reduced. Each document is therefore placed on the two-dimensional plane at a position determined, relative to a reference point (not shown), by its dimension-reduced feature vector. Feature vectors are generated using, for example, a family of models for producing word embeddings, such as Word2vec.
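One common way to turn word embeddings into a per-document feature vector is to average the vectors of the words in the text. The patent does not state how the vectorization unit 131 combines word vectors, so the averaging below is an assumption, and the three-dimensional `TOY_EMBEDDINGS` table is invented purely for illustration (real embeddings would come from a trained model and have hundreds of dimensions).

```python
import math

# Toy word embeddings (hypothetical values, not from a trained model)
TOY_EMBEDDINGS = {
    "battery": [1.0, 0.0, 0.0],
    "cooling": [0.8, 0.2, 0.0],
    "search": [0.0, 1.0, 0.0],
    "text": [0.0, 0.8, 0.2],
}


def document_vector(text, embeddings=TOY_EMBEDDINGS):
    """Average the embeddings of the known words in `text` (assumed pooling rule)."""
    dim = len(next(iter(embeddings.values())))
    known = [embeddings[w] for w in text.lower().split() if w in embeddings]
    if not known:
        return [0.0] * dim  # no known word: fall back to the zero vector
    return [sum(vec[i] for vec in known) / len(known) for i in range(dim)]


def cosine_similarity(a, b):
    """Cosine similarity; returns 0.0 when either vector is all zeros."""
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)


v_cooling = document_vector("Battery cooling")
v_battery = document_vector("battery")
v_search = document_vector("text search")
```

With these toy values, `cosine_similarity(v_cooling, v_battery)` is larger than `cosine_similarity(v_cooling, v_search)`, which is the sense in which texts with closer meaning end up closer together.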

The feature vectors generated by the document arrangement unit 130 are high-dimensional, with several hundred dimensions. Vectors of such high dimensionality are very cumbersome to illustrate and visualize, so their number of dimensions is reduced to two (dimensionality reduction). For this reduction, a known dimensionality reduction method is used that supports pre-training and arranges the distribution based on the pre-trained result, such as UMAP. Because the reduction is pre-trained and the trained result is reused on each run, the distribution on the reduced two-dimensional plane does not change significantly from run to run. This makes it possible to capture the amount of change over the years. For pre-training, it is desirable to cover the predetermined text of the documents comprehensively. For domestic patents, for example, pre-training may be performed on all gazettes as of a certain point in time. The pre-training may also be updated every few years, as with a United Nations survey, for example.
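The reason pre-training matters can be sketched as a "fit once, transform many times" pattern: the mapping to two dimensions is learned from a reference corpus and then frozen, so a given document lands at the same position no matter which other documents are processed with it. The sketch below does not implement UMAP. As a deliberately crude stand-in it keeps the two coordinates with the largest variance in the reference corpus; the class name `FixedProjection` and all data are hypothetical.

```python
class FixedProjection:
    """Stand-in for a pre-trained reducer: keeps the two highest-variance coordinates."""

    def fit(self, vectors):
        n = len(vectors)
        dim = len(vectors[0])
        # Per-coordinate mean and variance of the reference corpus
        self.mean = [sum(v[i] for v in vectors) / n for i in range(dim)]
        variances = [
            sum((v[i] - self.mean[i]) ** 2 for v in vectors) / n for i in range(dim)
        ]
        # Indices of the two coordinates with the largest variance
        self.axes = sorted(range(dim), key=lambda i: variances[i], reverse=True)[:2]
        return self

    def transform(self, vectors):
        """Map vectors to 2-D using only what was learned in fit()."""
        return [tuple(v[i] - self.mean[i] for i in self.axes) for v in vectors]


# Pre-train once on a reference corpus (illustrative 3-D vectors)
reference = [[0.0, 1.0, 6.0], [2.0, 1.1, 0.0], [4.0, 0.9, 3.0]]
reducer = FixedProjection().fit(reference)

# The same document appears in two yearly batches of different composition
batch_2020 = [[1.0, 1.0, 2.0]]
batch_2021 = [[1.0, 1.0, 2.0], [3.0, 1.0, 5.0]]
pos_2020 = reducer.transform(batch_2020)[0]
pos_2021 = reducer.transform(batch_2021)[0]
```

Here `pos_2020` and `pos_2021` are identical, so differences between yearly maps reflect changes in the documents rather than changes in the layout. A reducer that is refitted on every batch would not give this guarantee.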

The schematic diagram of FIG. 4 is an example in which the multiple documents of the first embodiment are arranged on a two-dimensional plane 20. The documents subject to mapping are Japanese patents. In the figure, one point corresponds to one document.

In the schematic diagram of FIG. 4, multiple documents are displayed on the two-dimensional plane and the set of documents is visualized. At this stage, however, the diagram shows only the set of documents at a single point in time.

The extraction unit 140 extracts the change in the number of documents in a predetermined area of the two-dimensional plane. Based on that change, it also extracts the change in the set of predetermined texts on the two-dimensional plane, or the difference in that change. The change in the number of documents referred to here may be the difference between the number of documents in the predetermined area in a given year and the number in the same area in the preceding year. The calculation unit 150 calculates the growth degree of the predetermined area based on the change in the number of documents in that area. The growth degree is a feature that includes a vector quantity (for example, a gradient) obtained from the change in the number of documents.

In the embodiment, the extraction unit 140 divides the two-dimensional plane 20 (see FIG. 4) into a grid, treats one of the resulting sections as the predetermined area, and extracts the change in the number of documents based on the number of documents in that area and their tag information. This is shown in the schematic diagram of FIG. 5.

The predetermined area mentioned above is a portion obtained by defining a range to be analyzed within the two-dimensional plane 20 shown in FIG. 5 and elsewhere and partitioning it, and is one section of the grid (lattice) shown in FIG. 6, described later.

In FIG. 4, each document is drawn as a gray point. The tag information described above can be reflected in each document. For example, documents can be color-coded by tag information, with one International Patent Classification shown in blue, another in orange, and yet another in green. The tag information then serves as a clue for grasping trends in the distribution of documents.

図５（Ａ）から理解されるように、二次元平面２０の中から分析対象とする領域２１が選定される。当該二次元平面２０の領域２１は、均等な所定間隔３１を有するグリッド３０によりグリッド状（格子状）に区画される。こうして分析対象とする領域２１はグリッド３０により複数の区画３２（いわゆる升目）に区画される。そして、個々の区画３２（升目）に存在する文献数の変化量が抽出される。 As can be seen from FIG. 5(A), an area 21 to be analyzed is selected from a two-dimensional plane 20. The area 21 on the two-dimensional plane 20 is partitioned into a grid (lattice) shape by a grid 30 having a uniform, predetermined spacing 31. In this way, the area 21 to be analyzed is partitioned into a plurality of sections 32 (so-called squares) by the grid 30. Then, the amount of change in the number of documents present in each of the sections 32 (squares) is extracted.

図５（Ｂ）では、領域２１は横方向をｘ軸、縦方向をｙ軸とするｘ－ｙ平面として表現される。図中の区画３２に存在する数字は、具体的な文献数の変化量である。なお、領域２１に対する所定間隔３１は任意に設定可能である。所定間隔３１が広くなると、マクロ的な把握が可能となり、所定間隔３１が狭くなると、ミクロ的な把握が可能となる。 In FIG. 5(B), the region 21 is represented as an x-y plane with the horizontal direction being the x-axis and the vertical direction being the y-axis. The numbers in the sections 32 in the figure represent the specific change in the number of documents. The predetermined interval 31 for the region 21 can be set arbitrarily. A wider predetermined interval 31 allows for a macroscopic understanding, and a narrower predetermined interval 31 allows for a microscopic understanding.

After the number of documents in each of the sections 32 (cells) formed by the grid 30 has been extracted, the calculation unit 150 calculates the growth degree of a given section from the document counts of the sections adjacent to it. That is, along the horizontal and vertical axes, the feature of the given section is calculated from the differences between its neighbors on either side and its neighbors above and below.

For example, when the documents of a given year are arranged on the two-dimensional plane (see FIG. 4 and FIG. 5(A) above), the number of documents of that year in a predetermined region can be obtained (see FIG. 5(B)). That is, the document count of the predetermined region is extracted based on the year tag information. Because the documents of the preceding year can likewise be arranged on the two-dimensional plane, the number of documents of the preceding year in the same region can also be obtained (see FIG. 5(B)). In other words, the amount of change in the number of documents is obtained year by year based on the tag information. For each section 32, the document counts of a given year and of the preceding year can therefore be compared by taking their difference, and the size of that difference makes it easy to grasp how much the number of documents has changed.
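The per-section counting and year-over-year difference described above can be sketched as follows. This is a minimal illustration, not the implementation of the embodiment: the function names, the `(x, y, year)` tuple layout, and the grid parameters (`x_min`, `y_min`, `spacing`, `n_cols`, `n_rows`) are all assumptions made for the example.

```python
from collections import Counter


def count_per_cell(points, x_min, y_min, spacing, n_cols, n_rows):
    """Count documents per grid section; points are (x, y) map coordinates."""
    counts = Counter()
    for x, y in points:
        col = int((x - x_min) // spacing)
        row = int((y - y_min) // spacing)
        # Documents outside the analyzed region are ignored.
        if 0 <= col < n_cols and 0 <= row < n_rows:
            counts[(row, col)] += 1
    return counts


def change_per_cell(docs, year, **grid):
    """Change in document count per section between `year` and `year - 1`.

    docs is a list of (x, y, year) tuples; the year plays the role of the
    tag information (hypothetical data layout).
    """
    this_year = count_per_cell([(x, y) for x, y, t in docs if t == year], **grid)
    last_year = count_per_cell([(x, y) for x, y, t in docs if t == year - 1], **grid)
    return [[this_year[(r, c)] - last_year[(r, c)]
             for c in range(grid["n_cols"])]
            for r in range(grid["n_rows"])]


# Hypothetical documents on a 2 x 2 grid with spacing 1.0.
docs = [(0.2, 0.3, 2019), (0.4, 0.1, 2020), (0.6, 0.7, 2020),
        (1.5, 0.5, 2020), (1.2, 1.8, 2019)]
grid = dict(x_min=0.0, y_min=0.0, spacing=1.0, n_cols=2, n_rows=2)
delta = change_per_cell(docs, 2020, **grid)
print(delta)  # [[1, 1], [0, -1]]
```

A larger `spacing` gives the coarser, macro-level view mentioned for the predetermined interval 31, and a smaller one gives the finer view.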

The change in the number of documents derived from the difference described above captures the change from one year to the next. In addition, the difference between a given year and the preceding year can be compared with the difference over several years (the average difference). Besides tracking the yearly change in the number of documents, it is thus possible to calculate how far the latest difference deviates, upward or downward, from the change over the past several years. This extraction is performed by the extraction unit 140. The calculation unit 150 then calculates the feature of a given section from the differences between the adjacent sections on either side and above and below it, along the horizontal and vertical axes.
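One possible reading of the comparison between the latest year-over-year difference and the multi-year average difference is sketched below. The description does not say how the average is formed, so taking the mean of the earlier differences (excluding the latest one) is an assumption, as are the function name and the sample counts.

```python
def deviation_from_trend(counts_by_year):
    """Compare the latest yearly difference with the average earlier difference.

    counts_by_year: chronological list of document counts for one section.
    Returns (latest difference, average of earlier differences, deviation).
    """
    diffs = [b - a for a, b in zip(counts_by_year, counts_by_year[1:])]
    if len(diffs) < 2:
        raise ValueError("need at least three years of counts")
    latest = diffs[-1]
    # Assumption: the baseline is the mean of all differences before the latest.
    baseline = sum(diffs[:-1]) / len(diffs[:-1])
    return latest, baseline, latest - baseline


# Hypothetical counts for four consecutive years in one section.
latest, baseline, deviation = deviation_from_trend([10, 12, 15, 22])
print(latest, baseline, deviation)  # 7 2.5 4.5
```

A positive deviation corresponds to an upward swing relative to recent years, and a negative one to a downward swing.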

図６の模式図は一の区画における勾配の計算の仕方を示す。図中のそれぞれの区画内の数値は文献数の変化量である。図６（Ａ）は横方向となるｘ軸の計算を示す。計算対象の区画４１の文献数の変化量は「７」であり、左に隣接する区画４２の文献数の変化量は「５」、右に隣接する区画４３の文献数の変化量は「９」である。実施形態の場合、区画４３の文献数の変化量「９」から区画４２の文献数の変化量「５」が引かれて、差分「４」が得られる。差分に「１／２」が掛けられて「２」が得られる。この「２」が、計算対象の区画４１の横方向となるｘ軸方向の成長度（ｄｘ）である。なお、両端は前方差分または後方差分となる（図示せず）。 The schematic diagram in FIG. 6 shows how to calculate the gradient in one section. The numerical values in each section in the diagram are the amount of change in the number of documents. FIG. 6(A) shows the calculation of the x-axis, which is the horizontal direction. The amount of change in the number of documents in section 41 being calculated is "7", the amount of change in the number of documents in section 42 adjacent to the left is "5", and the amount of change in the number of documents in section 43 adjacent to the right is "9". In the case of the embodiment, the amount of change in the number of documents in section 42, "5", is subtracted from the amount of change in the number of documents in section 43, "9", to obtain the difference "4". The difference is multiplied by "1/2" to obtain "2". This "2" is the growth rate (dx) in the x-axis direction, which is the horizontal direction of section 41 being calculated. Note that both ends are forward or backward differences (not shown).

FIG. 6(B) shows the calculation along the y-axis, the vertical direction. The amount of change in the number of documents in the section 41 under calculation is "7", that in the section 44 adjacent above is "2", and that in the section 45 adjacent below is "0". In the embodiment, the "2" of the section 44 is subtracted from the "0" of the section 45 to give a difference of "-2". Multiplying the difference by 1/2 gives "-1". This "-1" is the growth degree (dy) of the section 41 in the y-axis direction, the vertical direction. At the two ends of the grid, a forward or backward difference is used instead (not shown).

FIG. 6(C) is an example of calculating the overall growth degree of a section. The overall growth degree (G) of a section is calculated from its growth degree in the horizontal (x) direction (dx) and its growth degree in the vertical (y) direction (dy) as the square root of the sum of their squares: $G = \sqrt{dx^2 + dy^2}$. In the example of FIG. 6, $G = \sqrt{2^2 + (-1)^2} = \sqrt{5} \approx 2.24$. The calculation of the growth degree disclosed in FIG. 6 and elsewhere is only an example, and the calculation is not limited to the method illustrated and described.

計算部１５０は、図６にて説明の計算を全ての区画に対して実行する。図７の模式図は各区画における成長度を示す例である。図７（Ａ）は全ての区画における横方向となるｘ軸における成長度（ｄｘ）の表示であり、図７（Ｂ）は全ての区画における縦方向となるｙ軸における成長度（ｄｙ）の表示である。図７（Ｃ）は全ての区画における成長度（Ｇ）の一覧である。 The calculation unit 150 performs the calculations described in FIG. 6 for all sections. The schematic diagram in FIG. 7 is an example showing the growth degree in each section. FIG. 7(A) shows the growth degree (dx) on the x-axis, which is the horizontal direction, in all sections, and FIG. 7(B) shows the growth degree (dy) on the y-axis, which is the vertical direction, in all sections. FIG. 7(C) is a list of the growth degrees (G) in all sections.
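The calculation of FIG. 6 applied to every section can be sketched as below, using a central difference in the interior and a forward or backward difference at the ends of each axis. The function names are hypothetical, the corner values of the sample grid are made up, and indexing rows from top to bottom is an assumption chosen so that the center section reproduces the values dx = 2 and dy = -1 given for the section 41.

```python
import math


def axis_difference(values, i):
    """Central difference in the interior, one-sided difference at the ends."""
    n = len(values)
    if n == 1:
        return 0.0
    if i == 0:
        return float(values[1] - values[0])      # forward difference
    if i == n - 1:
        return float(values[-1] - values[-2])    # backward difference
    return (values[i + 1] - values[i - 1]) / 2   # central difference


def growth_map(delta):
    """Return (dx, dy, G) grids for a grid of document-count changes."""
    n_rows, n_cols = len(delta), len(delta[0])
    dx = [[axis_difference(delta[r], c) for c in range(n_cols)]
          for r in range(n_rows)]
    dy = [[axis_difference([delta[k][c] for k in range(n_rows)], r)
           for c in range(n_cols)]
          for r in range(n_rows)]
    g = [[math.hypot(dx[r][c], dy[r][c]) for c in range(n_cols)]
         for r in range(n_rows)]
    return dx, dy, g


# Center 7 with left 5, right 9, above 2, below 0, as in FIG. 6;
# the four corner values are hypothetical.
delta = [[1, 2, 3],
         [5, 7, 9],
         [4, 0, 6]]
dx, dy, g = growth_map(delta)
print(dx[1][1], dy[1][1], round(g[1][1], 2))  # 2.0 -1.0 2.24
```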

既述のとおり、（ｉ）二次元平面の中からの分析対象とする領域の選定、（ｉｉ）その領域へのグリッドの設定による個々の区画の作成、（ｉｉｉ）各区画の成長度の算出の３段階が順に実行される。そうすると、ある特定の区画（例えば前出の区画４１）について、その区画における横方向となるｘ軸方向の成長度（ｄｘ）及び縦方向となるｙ軸方向の成長度（ｄｙ）が算出可能となる。 As mentioned above, three steps are executed in sequence: (i) selecting an area to be analyzed from within a two-dimensional plane, (ii) creating individual sections by setting a grid in that area, and (iii) calculating the growth rate of each section. Then, for a particular section (e.g. section 41 mentioned above), it becomes possible to calculate the growth rate (dx) in the x-axis direction, which is the horizontal direction, and the growth rate (dy) in the y-axis direction, which is the vertical direction, of that section.

そして、出力部１６０は成長度を出力する。成長度の出力は、図１の表示部７（ディスプレイ）への画像として表示される。図８は表示部７における表示例であり、図８（Ａ）では、成長度は二次元平面において矢印として表示される。 Then, the output unit 160 outputs the growth degree. The output of the growth degree is displayed as an image on the display unit 7 (display) of FIG. 1. FIG. 8 is an example of a display on the display unit 7, and in FIG. 8(A), the growth degree is displayed as an arrow on a two-dimensional plane.

図８（Ａ）の例では、矢印は２種類用意され、所定の閾値以上の成長度の場合には黒い矢印２２、別の所定の閾値以上の成長度の場合には白抜きの矢印２３として表示されている。図８（Ｂ）では、成長度は個々の文献のそれぞれについて、図示の例では文献番号、発明者との関係で数値表示されている。 In the example of FIG. 8(A), two types of arrows are provided, a black arrow 22 for a growth level above a given threshold, and a hollow arrow 23 for a growth level above another given threshold. In FIG. 8(B), the growth level is displayed numerically for each individual document, in the illustrated example in relation to the document number and the inventor.
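The choice between the two arrow types can be illustrated with a short sketch. The description only states that each arrow type has its own threshold, so the threshold values, the assumption that the black arrow uses the higher threshold, and the use of the (dx, dy) direction as the arrow angle are all hypothetical.

```python
import math


def arrow_for(dx, dy, black_threshold=3.0, hollow_threshold=1.5):
    """Pick an arrow style and angle for one section, or None for no arrow.

    Assumption: the black arrow marks the higher of the two thresholds.
    """
    growth = math.hypot(dx, dy)
    if growth >= black_threshold:
        style = "black"
    elif growth >= hollow_threshold:
        style = "hollow"
    else:
        return None
    # Angle of the (dx, dy) vector in degrees, measured from the x-axis.
    return style, math.degrees(math.atan2(dy, dx))


print(arrow_for(2.0, -1.0))   # hollow arrow, growth of about 2.24
print(arrow_for(3.0, 4.0))    # black arrow, growth of 5.0
print(arrow_for(0.5, 0.5))    # None, below both thresholds
```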

図８（Ａ）の場合、矢印２２，２３の存在箇所、色を通じて二次元平面における成長度の高い領域の客観的な把握が可能となる。また、図８（Ｂ）の場合、文献毎に成長度は数値として具体的に把握することができる。 In the case of FIG. 8(A), the locations and colors of the arrows 22 and 23 make it possible to objectively grasp areas with high growth rates on a two-dimensional plane. In addition, in the case of FIG. 8(B), the growth rate for each document can be specifically grasped as a numerical value.

続いて、第１実施形態の文献マッピング表示方法を文献マッピング表示プログラムとともに説明する。 Next, the document mapping display method of the first embodiment will be explained together with the document mapping display program.

The document mapping display method of the first embodiment is executed by the CPU 11 of the document mapping display device 1 based on the document mapping display program of the first embodiment. The method causes the CPU 11 to execute a document acquisition function, a text acquisition function, a document arrangement function, an extraction function, a calculation function, and an output function, as well as a crawling function. Since each function duplicates the description given above, the details are omitted.

図９、図１０、及び図１１のフローチャートは第１実施形態の文献マッピング表示装置１のＣＰＵ１１における文献マッピング表示方法の全体の流れであり、図９では文献取得ステップ（Ｓ１１０）、文章取得ステップ（Ｓ１２０）、文献配置ステップ（Ｓ１３０）、抽出ステップ（Ｓ１４０）、計算ステップ（Ｓ１５０）、出力ステップ（Ｓ１６０）が実行され、図１０ではクローリングステップ（Ｓ１１１）が実行される。図１１ではベクトル化ステップ（Ｓ１３１）が実行される。 The flowcharts in Figures 9, 10, and 11 show the overall flow of the document mapping display method in the CPU 11 of the document mapping display device 1 of the first embodiment. In Figure 9, a document acquisition step (S110), a text acquisition step (S120), a document placement step (S130), an extraction step (S140), a calculation step (S150), and an output step (S160) are executed, while in Figure 10, a crawling step (S111) is executed. In Figure 11, a vectorization step (S131) is executed.

文献取得機能は、複数の文献を取得する（Ｓ１１０；文献取得ステップ）。文章取得機能は、複数の文献から所定の文章を取得する（Ｓ１２０；文章取得ステップ）。文献配置機能は、複数の文献同士を、複数の文献のそれぞれに含まれる所定の文章の類似性に従い二次元平面に配置する（Ｓ１３０；文献配置ステップ）。抽出機能は、二次元平面に存在する所定領域における文献数の変化量を抽出する（Ｓ１４０；抽出ステップ）。さらに、抽出機能は、所定領域における文献数の変化量に基づいて二次元平面における所定の文章の集合の変化、または所定の文章の集合の変化の差分を抽出する。計算機能は、二次元平面に存在する所定領域における文献数の変化量に基づいて成長度を計算する（Ｓ１５０；計算ステップ）。出力機能は、成長度を出力する（Ｓ１６０；出力ステップ）。また、クローリング機能は、インターネット回線を通じて複数の文献を取得する（Ｓ１１１；クローリングステップ）（図１０参照）。ベクトル化機能は、自然言語処理により複数の文献のそれぞれに特徴ベクトルを生成する（Ｓ１３１；ベクトル化ステップ）（図１１参照）。 The document acquisition function acquires a plurality of documents (S110; document acquisition step). The text acquisition function acquires a predetermined text from the plurality of documents (S120; text acquisition step). The document arrangement function arranges the plurality of documents on a two-dimensional plane according to the similarity of the predetermined text contained in each of the plurality of documents (S130; document arrangement step). The extraction function extracts the amount of change in the number of documents in a predetermined area present on the two-dimensional plane (S140; extraction step). Furthermore, the extraction function extracts the change in the set of predetermined texts in the two-dimensional plane, or the difference in the change in the set of predetermined texts, based on the amount of change in the number of documents in the predetermined area. The calculation function calculates the growth rate based on the amount of change in the number of documents in a predetermined area present on the two-dimensional plane (S150; calculation step). The output function outputs the growth rate (S160; output step). The crawling function acquires a plurality of documents through an Internet line (S111; crawling step) (see FIG. 10). The vectorization function generates a feature vector for each of the plurality of documents by natural language processing (S131; vectorization step) (see FIG. 11).

続いて第２実施形態の文献マッピング表示装置１（コンピュータ）の個々の機能部について、図１２のブロック図等を参照して順に説明する。第２実施形態の文献マッピング表示装置１の機械構成については第１実施形態の文献マッピング表示装置１と共通であるため説明を省略する。第２実施形態の文献マッピング表示装置１は、図１２のブロック図のとおり、文献取得部１１０、文章取得部１２０、文献配置部１３０、抽出部１４０、計算部１５０、集積検出部１７０、時系列変化出力部１８０、クローリング部１１１、ベクトル化部１３１等の機能部を備える。なお、第２実施形態において、第１実施形態の文献マッピング表示装置１と共通する構成については同じ符号が用いられ、重複説明は省略される。 Next, the individual functional units of the document mapping display device 1 (computer) of the second embodiment will be described in order with reference to the block diagram of FIG. 12 and the like. The mechanical configuration of the document mapping display device 1 of the second embodiment is the same as that of the document mapping display device 1 of the first embodiment, and therefore the description will be omitted. As shown in the block diagram of FIG. 12, the document mapping display device 1 of the second embodiment includes functional units such as a document acquisition unit 110, a text acquisition unit 120, a document arrangement unit 130, an extraction unit 140, a calculation unit 150, an accumulation detection unit 170, a time series change output unit 180, a crawling unit 111, and a vectorization unit 131. Note that in the second embodiment, the same reference numerals are used for components common to the document mapping display device 1 of the first embodiment, and duplicated descriptions will be omitted.

図１２のブロック図において、文献取得部１１０は、複数の文献を取得する。文献取得部１１０の機能は第１実施形態と同様である。文献取得部１１０に含まれるクローリング部１１１の機能は第１実施形態と同様である。文章取得部１２０は、複数の文献から所定の文章を取得する。文章取得部１２０の機能は第１実施形態と同様である。 In the block diagram of FIG. 12, the document acquisition unit 110 acquires multiple documents. The function of the document acquisition unit 110 is the same as in the first embodiment. The function of the crawling unit 111 included in the document acquisition unit 110 is the same as in the first embodiment. The text acquisition unit 120 acquires a predetermined text from multiple documents. The function of the text acquisition unit 120 is the same as in the first embodiment.

第２実施形態の文献配置部１３０は、複数の文献同士を、複数の文献のそれぞれに含まれる所定の文章の類似性に従い二次元平面に配置する。第２実施形態の文献配置部１３０における文献に含まれる所定の文章同士の類似性に伴う二次元平面への配置は自然言語処理に基づく。文献の文章中に存在する文言等には言語特有の表現上の揺らぎ、ぶれ等が存在する。そのため、所定の文章同士の類似度の比較を円滑にするため、自然言語処理の利用が望ましい。ここで言う所定の文献に含まれる所定の文章同士の類似性とは、文献相互における意味内容の近さを示す。 The document arrangement unit 130 of the second embodiment arranges multiple documents on a two-dimensional plane according to the similarity of specific sentences contained in each of the multiple documents. The arrangement on the two-dimensional plane in accordance with the similarity of specific sentences contained in the documents by the document arrangement unit 130 of the second embodiment is based on natural language processing. Words and the like present in the sentences of the documents have fluctuations and inconsistencies in expression that are specific to the language. Therefore, in order to facilitate a comparison of the similarity between the specific sentences, it is desirable to use natural language processing. The similarity between specific sentences contained in the specific documents referred to here refers to the closeness of the meaning of the documents.

そして、文献配置部１３０はベクトル化部１３１を備える。このベクトル化部１３１は、自然言語処理により複数の文献のそれぞれに特徴ベクトルを生成する。ベクトル化部１３１の機能は第１実施形態と同様である。文献配置部１３０が生成する特徴ベクトルについても、第１実施形態と同様に二次元にまで次元数が削減される（次元圧縮）。そこで、図１３の模式図のように二次元平面への表示を可能としている。 The document arrangement unit 130 also includes a vectorization unit 131. This vectorization unit 131 generates a feature vector for each of the multiple documents by natural language processing. The function of the vectorization unit 131 is the same as in the first embodiment. The feature vectors generated by the document arrangement unit 130 also have their number of dimensions reduced to two dimensions (dimensional compression) as in the first embodiment. Therefore, it is possible to display them on a two-dimensional plane as shown in the schematic diagram of FIG. 13.

図１３の模式図は、第２実施形態の複数の文献を二次元平面２５に配置した一例である。図中の灰色部分は個々の文献の集合に相当する。なお、灰色の濃淡による区分けは技術分野のおおまかな境界を示している。実際の表示は複数の異なる色のカラー表示であり、点の集合とされる。図示は便宜上異なる濃淡の灰色としている。 The schematic diagram in FIG. 13 is an example of a plurality of documents in the second embodiment arranged on a two-dimensional plane 25. The gray parts in the diagram correspond to collections of individual documents. The division by the shade of gray indicates the rough boundaries of technical fields. The actual display is a color display of multiple different colors, and is considered as a collection of points. For convenience, the illustration shows different shades of gray.

抽出部１４０は、二次元平面２５に存在する所定領域における文献数の変化量を抽出する。計算部１５０は、二次元平面に存在する所定領域における文献数の変化量に基づいて成長度を計算する。第２実施形態の文献マッピング表示装置１における抽出部１４０及び計算部１５０の機能は、第１実施形態と同様であり、前述の図５、図６、図７、図８における説明と同様の処理が実行される。 The extraction unit 140 extracts the amount of change in the number of documents in a specified area on the two-dimensional plane 25. The calculation unit 150 calculates the growth rate based on the amount of change in the number of documents in a specified area on the two-dimensional plane. The functions of the extraction unit 140 and the calculation unit 150 in the document mapping display device 1 of the second embodiment are the same as those of the first embodiment, and the same processing as that described above in Figures 5, 6, 7, and 8 is executed.

The accumulation detection unit 170 detects accumulation areas (clustering) based on the plurality of documents in a predetermined region of the two-dimensional plane. More specifically, it detects the accumulation areas based on the density of those documents. For the detection, a density-based clustering method, the k-means method, the k-nearest neighbor method, or the like is applied to the documents arranged on the two-dimensional plane. The number of accumulation areas can be set to any value by the user of the document mapping display device 1, so the analysis can be performed with the number of accumulation areas the user requires.
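Of the methods named above, the k-means method with a user-specified number of accumulation areas can be sketched as follows. This is a plain Lloyd iteration written for illustration only: the function name, the deterministic choice of the first `k` points as initial centers, and the sample coordinates are assumptions, and a practical system would more likely rely on an established clustering library.

```python
import math


def k_means(points, k, iterations=50):
    """Cluster 2-D points into k groups; returns (labels, centers)."""
    # Assumption for the sketch: the first k points seed the centers.
    centers = [points[i] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assign every document to its nearest center.
        labels = [min(range(k), key=lambda j: math.dist(p, centers[j]))
                  for p in points]
        # Move each center to the mean of its members.
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centers.append((sum(p[0] for p in members) / len(members),
                                    sum(p[1] for p in members) / len(members)))
            else:
                new_centers.append(centers[j])  # keep an empty cluster in place
        if new_centers == centers:
            break
        centers = new_centers
    return labels, centers


# Two hypothetical groups of documents on the map.
points = [(0.0, 0.0), (0.2, 0.1), (0.1, 0.3),
          (5.0, 5.0), (5.2, 4.9), (4.9, 5.3)]
labels, centers = k_means(points, k=2)
print(labels)  # [0, 0, 0, 1, 1, 1]
```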

時系列変化出力部１８０は、所定領域における成長度と集積領域の二次元平面における時系列の変化点を二次元平面に表示する。さらに言うと、二次元平面２５に存在する所定領域における成長度と複数の文献の集積領域の二次元平面２５における時系列の変化点を二次元平面２５に表示する。表示形態は次述の図１４、図１５となる。 The time series change output unit 180 displays on a two-dimensional plane the degree of growth in a specified area and the time series change points in the two-dimensional plane of the accumulation area. More specifically, it displays on the two-dimensional plane 25 the degree of growth in a specified area present on the two-dimensional plane 25 and the time series change points in the two-dimensional plane 25 of the accumulation area of multiple documents. The display format is as shown in Figures 14 and 15 below.

時系列変化出力部１８０では、集積検出部１７０にて検出した集積領域毎（クラスタ毎）に時系列の変化点を計算した後に出力される。集積領域毎（クラスタ毎）の時系列の変化点は、二次元平面２５に存在する所定領域における複数の文献のうち、集積検出部１７０にて検出されたそれぞれの集積領域毎（クラスタ毎）に属する複数の文献が使用される。 The time series change output unit 180 calculates and then outputs the time series change points for each accumulation area (each cluster) detected by the accumulation detection unit 170. The time series change points for each accumulation area (each cluster) are calculated using the multiple documents belonging to each accumulation area (each cluster) detected by the accumulation detection unit 170 out of the multiple documents in a specific area present on the two-dimensional plane 25.

さらに時系列変化出力部１８０では、それぞれの集積領域（クラスタ）に属する複数の文献を期間毎に当該文献の密となる位置が算出され、複数期間がつなげられて時系列の変化点が表示される。当該文献の密となる位置の算出に際しては、ガウシアン分布等が用いられる。 Furthermore, the time series change output unit 180 calculates the dense positions of the documents belonging to each cluster for each period, and connects the multiple periods to display the change points in the time series. A Gaussian distribution or the like is used to calculate the dense positions of the documents.

具体的には、図１４の模式図のとおり、文献が数多く密集する（密となる）位置を強調するため、円等の図形が用いられる。これらの円は二次元平面２５の随所に表示されている。 Specifically, as shown in the schematic diagram of FIG. 14, shapes such as circles are used to emphasize locations where many documents are concentrated (dense). These circles are displayed at various points on a two-dimensional plane 25.

図１４は３年分の表示態様を例示している。それぞれの集積領域（灰色の濃淡により区分けされる領域）には、円が３個含まれる。例えば、各円は、二次元平面２５の中の集積領域における今年、１年前、２年前の複数の文献の密となる位置に相当する。図示は３年間分の例であるため円を３個としている。そこで、５年分の累積調査ならば５個の円に数は増やされる。また、年毎（期間毎）に円以外の図形（四角等）が用いられるようにしても良い。さらに、各円は時系列の順に線でつながれる。図１４の表示とすると、複数の文献の密となる位置と、当該位置の時系列を伴う変化の両方が一括して二次元平面２５に表示可能となり、視覚的な把握が容易となる。むろん、表示の期間は図示に限らず適宜である。例えば２年毎としてもよい。 Figure 14 shows an example of a display format for three years. Each accumulation area (area divided by shades of gray) contains three circles. For example, each circle corresponds to a location where multiple documents are densely packed this year, one year ago, and two years ago in the accumulation area in the two-dimensional plane 25. Since the figure shows an example for three years, there are three circles. Therefore, if it is a cumulative survey for five years, the number of circles is increased to five. Also, shapes other than circles (such as squares) may be used for each year (period). Furthermore, each circle is connected by a line in chronological order. With the display shown in Figure 14, both the locations where multiple documents are densely packed and the changes in the locations over time can be displayed together on the two-dimensional plane 25, making it easy to visually grasp. Of course, the period of display is not limited to the one shown in the figure and can be any appropriate period. For example, it may be every two years.

FIG. 15, an enlarged schematic view of part of FIG. 14, shows this in more detail. FIG. 15 shows the change over the three years 2018, 2019, and 2020. A circle 28a is displayed at the position where the documents of 2018 are dense, a circle 28b at the dense position for 2019, and a circle 28c at the dense position for 2020. To make the change over time clearer, the gray becomes progressively darker in the order of the circles 28a, 28b, and 28c. The positions of these circles make the movement (the trajectory of the dense position) on the two-dimensional plane 25 easy to follow.

さらに、複数の文献の密となる位置を示す円２８ａと２８ｂの間は線２９ｐによりつながれ、円２８ｂと２８ｃの間は線２９ｑによりつながれる。線を配置することにより、当該線の長さ（丸同士の距離）が明確化するため、二次元平面２５における移動の量（大きく動いているのか、その位置に留まっているのか）の把握が容易となる。また、図示では、線自体も経時変化の表示を明確化するため、線２９ｐ、線２９ｑの順に灰色の程度が濃くなるようにしている。 Furthermore, circles 28a and 28b, which indicate the location where multiple documents are densely packed, are connected by line 29p, and circles 28b and 28c are connected by line 29q. By arranging the lines, the length of the lines (the distance between the circles) is made clear, making it easier to grasp the amount of movement in two-dimensional plane 25 (whether there is a large movement or whether the position remains the same). In the illustration, the lines themselves are also made to be darker in gray in the order of line 29p, then line 29q, in order to clearly show the change over time.
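The per-period dense positions and the connecting lines can be sketched as below. The description says only that a Gaussian distribution or the like is used to find the dense position. Taking the mean of the documents, which is the center of a Gaussian fitted to them, is an assumption made for this example, as are the function names and the sample coordinates.

```python
import math


def dense_position(points):
    """Center of a Gaussian fitted to the points, i.e. their mean (assumption)."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def trajectory(points_by_period):
    """Return per-period centers and the length of each connecting line.

    points_by_period maps a period (e.g. a year) to the (x, y) positions of
    the documents of one accumulation area in that period.
    """
    periods = sorted(points_by_period)
    centers = [(t, dense_position(points_by_period[t])) for t in periods]
    segments = [(t0, t1, math.dist(c0, c1))
                for (t0, c0), (t1, c1) in zip(centers, centers[1:])]
    return centers, segments


# Hypothetical documents of one accumulation area over three years.
cluster = {
    2018: [(0.0, 0.0), (2.0, 0.0)],
    2019: [(3.0, 3.0), (5.0, 5.0)],
    2020: [(4.0, 5.0), (4.0, 3.0)],
}
centers, segments = trajectory(cluster)
print(segments)  # [(2018, 2019, 5.0), (2019, 2020, 0.0)]
```

In this example the first segment is long (the dense position moved substantially) and the second has zero length (the dense position stayed put), which is the distinction the connecting lines are meant to make visible.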

図１５の例では、矢印は２種類用意され、所定の閾値以上の成長度の場合には黒い矢印２６、別の所定の閾値以上の成長度の場合には白抜きの矢印２７として表示されている。第１実施形態と同様に、矢印２６，２７の存在箇所、色を通じて二次元平面２５における成長度の高い領域の客観的な把握が可能となる。なお、矢印の種類は図示の２種類には限られない。図示では矢印２６，２７は三角形として示されている。これは二次元平面２５中の表示の簡略化の便宜である。 In the example of FIG. 15, two types of arrows are provided, and when the growth level is above a predetermined threshold, a black arrow 26 is displayed, and when the growth level is above another predetermined threshold, a white arrow 27 is displayed. As in the first embodiment, the locations and colors of the arrows 26 and 27 make it possible to objectively grasp areas of high growth on the two-dimensional plane 25. Note that the types of arrows are not limited to the two types shown. In the illustration, the arrows 26 and 27 are shown as triangles. This is for the convenience of simplifying the display on the two-dimensional plane 25.

続いて、第２実施形態の文献マッピング表示方法を文献マッピング表示プログラムとともに説明する。 Next, the literature mapping display method of the second embodiment will be explained together with the literature mapping display program.

The document mapping display method of the second embodiment is executed by the CPU 11 of the document mapping display device 1 based on the document mapping display program of the second embodiment. The method causes the CPU 11 to execute a document acquisition function, a text acquisition function, a document arrangement function, an extraction function, a calculation function, an accumulation detection function, and a time series change output function, as well as a crawling function. Since each function duplicates the description given above, the details are omitted.

図１６、図１７、及び図１８のフローチャートは第２実施形態の文献マッピング表示装置１のＣＰＵ１１における文献マッピング表示方法の全体の流れであり、図１６では文献取得ステップ（Ｓ１１０）、文章取得ステップ（Ｓ１２０）、文献配置ステップ（Ｓ１３０）、抽出ステップ（Ｓ１４０）、計算ステップ（Ｓ１５０）、集積検出ステップ（Ｓ１７０）、時系列変化出力ステップ（Ｓ１８０）が実行され、図１７ではクローリングステップ（Ｓ１１１）が実行される。図１８ではベクトル化ステップ（Ｓ１３１）が実行される。 The flowcharts in Figures 16, 17, and 18 show the overall flow of the document mapping display method in the CPU 11 of the document mapping display device 1 of the second embodiment. In Figure 16, a document acquisition step (S110), a text acquisition step (S120), a document arrangement step (S130), an extraction step (S140), a calculation step (S150), an accumulation detection step (S170), and a time series change output step (S180) are executed, and in Figure 17, a crawling step (S111) is executed. In Figure 18, a vectorization step (S131) is executed.

The document acquisition function acquires a plurality of documents (S110; document acquisition step). The text acquisition function acquires a predetermined text from the plurality of documents (S120; text acquisition step). The document arrangement function arranges the plurality of documents on a two-dimensional plane according to the similarity of the predetermined text contained in each of the documents (S130; document arrangement step). The extraction function extracts the amount of change in the number of documents in a predetermined region of the two-dimensional plane (S140; extraction step). Based on that amount of change, the extraction function also extracts the change in the set of predetermined texts on the two-dimensional plane, or the difference between such changes. The calculation function calculates the growth degree based on the amount of change in the number of documents in the predetermined region of the two-dimensional plane (S150; calculation step). The accumulation detection function detects an accumulation area based on the plurality of documents (based on the density of the documents) in the predetermined region of the two-dimensional plane (S170; accumulation detection step). The time series change output function displays, on the two-dimensional plane, the growth degree in the predetermined region and the time series change points of the accumulation area on the two-dimensional plane (S180; time series change output step). The crawling function acquires a plurality of documents via an Internet line (S111; crawling step) (see FIG. 17). The vectorization function generates a feature vector for each of the plurality of documents through natural language processing (S131; vectorization step) (see FIG. 18).

上述した本発明のコンピュータプログラムは、プロセッサが読み取り可能な記録媒体に記録されていてよく、記録媒体としては、「一時的でない有形の媒体」、例えば、ディスク、カード、半導体メモリ、プログラマブルな論理回路などを用いることができる。 The computer program of the present invention described above may be recorded on a processor-readable recording medium, and the recording medium may be a "non-transitory tangible medium" such as a disk, card, semiconductor memory, or programmable logic circuit.

なお、上記コンピュータプログラムは、例えば、ActionScript、JavaScript（登録商標）などのスクリプト言語、Objective-C、Java（登録商標）などのオブジェクト指向プログラミング言語、HTML5などのマークアップ言語などを用いて実装できる。 The computer program can be implemented using, for example, a scripting language such as ActionScript or JavaScript (registered trademark), an object-oriented programming language such as Objective-C or Java (registered trademark), or a markup language such as HTML5.

1 Document mapping display device
2 Fixed media
3 Internet line
4 Reading unit
5 Keyboard
6 Mouse
7 Display unit (display)
11 CPU
12 RAM
13 ROM
14 Storage unit
15 I/O (input/output interface)
20, 25 Two-dimensional plane
22, 23, 26, 27 Arrows
28a, 28b, 28c Circles
29p, 29q Lines
30 Grid
31 Grid spacing
32, 41, 42, 43, 44, 45 Sections
110 Document acquisition unit
111 Crawling unit
120 Text acquisition unit
130 Document arrangement unit
131 Vectorization unit
140 Extraction unit
150 Calculation unit
160 Output unit
170 Accumulation detection unit
180 Time series change output unit

Claims

A document mapping display device comprising:
a document acquisition unit that acquires a plurality of documents;
a text acquisition unit that acquires a predetermined text from the plurality of documents;
a document arrangement unit that arranges the plurality of documents on a two-dimensional plane according to the similarity of the predetermined text included in each of the plurality of documents;
an extraction unit that extracts a time-series change in the number of documents in a predetermined area on the two-dimensional plane;
a calculation unit that calculates a growth degree based on the time-series change in the number of documents in the predetermined area on the two-dimensional plane; and
an output unit that outputs the growth degree.

A document mapping display device comprising:
a document acquisition unit that acquires a plurality of documents;
a text acquisition unit that acquires a predetermined text from the plurality of documents;
a document arrangement unit that arranges the plurality of documents on a two-dimensional plane according to the similarity of the predetermined text included in each of the plurality of documents;
an extraction unit that extracts a time-series change in the number of documents in a predetermined area on the two-dimensional plane;
a calculation unit that calculates a growth degree based on the time-series change in the number of documents in the predetermined area on the two-dimensional plane;
an accumulation detection unit that detects an accumulation area based on the plurality of documents in the predetermined area on the two-dimensional plane; and
a time-series change output unit that displays, on the two-dimensional plane, the growth degree in the predetermined area and time-series change points of the accumulation area on the two-dimensional plane.

The document mapping display device according to claim 1 or 2, wherein the document acquisition unit includes a crawling unit and acquires the plurality of documents via an Internet line.

The document mapping display device according to claim 3, wherein tag information indicating characteristics of the document is attached to each of the plurality of documents, and the document acquisition unit acquires the plurality of documents based on the tag information.

The document mapping display device according to claim 1 or 2, wherein the document arrangement unit arranges the plurality of documents on the two-dimensional plane according to the similarity of the predetermined text, the similarity being determined by natural language processing.

The document mapping display device according to claim 5, wherein the document arrangement unit includes a vectorization unit that generates a feature vector for each of the plurality of documents by the natural language processing.

The document mapping display device according to claim 1 or 2, wherein the extraction unit extracts the number of documents present within the predetermined area, the predetermined area being one of the sections generated by dividing the two-dimensional plane into a grid.

The document mapping display device according to claim 1 or 2, wherein the extraction unit extracts a time-series change in the number of documents based on a comparison of the number of documents in a predetermined area present on the two-dimensional plane.

The document mapping display device according to claim 1 or 2, wherein the extraction unit extracts a change in the set of predetermined texts on the two-dimensional plane.

The document mapping display device according to claim 7, wherein the calculation unit calculates the growth degree of the one section from a time-series change in the number of documents present in another section adjacent to the one section.

The document mapping display device according to claim 1, wherein the output unit displays the growth degree on the two-dimensional plane using the type of arrow.

The document mapping display device according to claim 1, wherein the output unit displays the growth degree for each document as a numerical value.

The document mapping display device according to claim 2, wherein the accumulation detection unit detects an accumulation area based on the density of the plurality of documents in a predetermined area present on the two-dimensional plane.

The document mapping display device according to claim 2, wherein any number of accumulation areas can be specified.

The document mapping display device according to claim 2, wherein the time-series change output unit displays the time-series change points of the accumulation area of the plurality of documents on the two-dimensional plane as circles and connects the circles with lines.

The document mapping display device according to claim 2, wherein the time-series change output unit displays the growth degree on the two-dimensional plane using the type of arrow.

A document mapping display method in which a computer executes:
a document acquisition step of acquiring a plurality of documents;
a text acquisition step of acquiring a predetermined text from the plurality of documents;
a document arrangement step of arranging the plurality of documents on a two-dimensional plane according to the similarity of the predetermined text included in each of the plurality of documents;
an extraction step of extracting a time-series change in the number of documents in a predetermined area on the two-dimensional plane;
a calculation step of calculating a growth degree based on the time-series change in the number of documents in the predetermined area on the two-dimensional plane; and
an output step of outputting the growth degree.

A document mapping display program that causes a computer to implement:
a document acquisition function of acquiring a plurality of documents;
a text acquisition function of acquiring a predetermined text from the plurality of documents;
a document arrangement function of arranging the plurality of documents on a two-dimensional plane according to the similarity of the predetermined text included in each of the plurality of documents;
an extraction function of extracting a time-series change in the number of documents in a predetermined area on the two-dimensional plane;
a calculation function of calculating a growth degree based on the time-series change in the number of documents in the predetermined area on the two-dimensional plane; and
an output function of outputting the growth degree.

A document mapping display method in which a computer executes:
a document acquisition step of acquiring a plurality of documents;
a text acquisition step of acquiring a predetermined text from the plurality of documents;
a document arrangement step of arranging the plurality of documents on a two-dimensional plane according to the similarity of the predetermined text included in each of the plurality of documents;
an extraction step of extracting a time-series change in the number of documents in a predetermined area on the two-dimensional plane;
a calculation step of calculating a growth degree based on the time-series change in the number of documents in the predetermined area on the two-dimensional plane;
an accumulation detection step of detecting an accumulation area based on the plurality of documents in the predetermined area on the two-dimensional plane; and
a time-series change output step of displaying, on the two-dimensional plane, the growth degree in the predetermined area and time-series change points of the accumulation area on the two-dimensional plane.

A document mapping display program that causes a computer to implement:
a document acquisition function of acquiring a plurality of documents;
a text acquisition function of acquiring a predetermined text from the plurality of documents;
a document arrangement function of arranging the plurality of documents on a two-dimensional plane according to the similarity of the predetermined text included in each of the plurality of documents;
an extraction function of extracting a time-series change in the number of documents in a predetermined area on the two-dimensional plane;
a calculation function of calculating a growth degree based on the time-series change in the number of documents in the predetermined area on the two-dimensional plane;
an accumulation detection function of detecting an accumulation area based on the plurality of documents in the predetermined area on the two-dimensional plane; and
a time-series change output function of displaying, on the two-dimensional plane, the growth degree in the predetermined area and time-series change points of the accumulation area on the two-dimensional plane.