JP4829655B2

JP4829655B2 - Literature information analysis system and literature information analysis program

Info

Publication number: JP4829655B2
Application number: JP2006088965A
Authority: JP
Inventors: 裕二宗
Original assignee: Nomura Research Institute Ltd
Current assignee: Nomura Research Institute Ltd
Priority date: 2006-03-28
Filing date: 2006-03-28
Publication date: 2011-12-07
Anticipated expiration: 2026-03-28
Also published as: JP2007265009A

Description

The present invention relates to a document information analysis system and a document information analysis program, and in particular to a technique that extracts important keywords from an enormous volume of digitized document information such as patent documents and academic literature, analyzes the relationships among the keywords and among the documents containing them using statistical methods, and presents the results visually.

By closely examining a large number of patent documents (published applications, registered patent gazettes, and the like), one can grasp the filing activity of each company and the direction of technical development in a specific technical field, and can also identify unexplored areas. For this reason, many companies collect and analyze patent documents.
It is also common practice to map the analysis results onto a plane so that the overall trends in the technical field can be grasped visually.

Various procedures and presentation methods have been proposed for creating such so-called patent maps, but they generally share the following steps.
(1) Narrow an enormous number of digitized patent documents down to several tens to several hundreds using the International Patent Classification (IPC), F-terms, applicant name, publication year, free words, and so on.
(2) Check the title, abstract, and representative drawing of each patent document and exclude unneeded documents.
(3) Examine the remaining documents in detail and write a brief summary of each invention (purpose, configuration, effects, and so on).
(4) Copy the number, filing date, classification, applicant, content, and so on of each patent document onto a single card, and place the cards on a plane whose vertical axis is the technical classification and whose horizontal axis is, for example, the time period.
Instead of step (4), characteristic keywords (technical terms) are sometimes extracted from each patent document and arranged on a plane, with the numbers, applicant names, and so on of the related patent documents listed alongside. Another approach represents the number of patent documents related to each keyword by the area of a circle, giving an at-a-glance view of the relationship (relative importance) among keywords.

In any case, creating a patent map takes an enormous amount of time and relies mainly on manual work, so the number of target documents must be strictly narrowed down in advance.
As a tool for supporting the creation of such patent maps, Patent Document 1 below discloses a patent map creation system in which the outer dimensions of the patent map, the display items, the items used for the vertical and horizontal axes, and so on are set in advance on a layout condition setting screen, and the coordinates of the two-dimensional array data to be mapped (a CSV file containing data such as filing date, classification, publication number, and comments) are calculated according to these layout conditions.

The system of Patent Document 1 automates conventional step (4) and makes patent map creation somewhat more efficient.
However, because it presupposes that humans perform the content analysis and evaluation of the patent documents, which are the most time-consuming part of creating a patent map, the labor savings remain limited.
In addition, the number of patent documents to be analyzed must still be strictly narrowed down beforehand, and there is a high risk that important patent documents are overlooked in the process.
Furthermore, as long as the evaluation and content analysis of patent documents are done manually, the quality of the result depends heavily on the subjectivity, ability, and experience of the worker and therefore lacks consistency.
Moreover, the patent map obtained with that system takes the form of cards of a fixed area arranged on a plane, so it cannot easily be taken in at a glance and is unsuited to intuitively grasping the overall trends and direction of the technical field.

To solve these problems, the applicant has already filed an application for the document information analysis system described in Patent Document 2.
This system automatically extracts keywords or keyword candidates from an enormous body of document information and analyzes them using statistical methods, thereby placing a symbol corresponding to each keyword on a two-dimensional plane while also representing the presence of each document as a distribution density on the same plane.
Thus, by using patent documents as the document information, a patent map that gives an overview of the overall spread, concentration, and filing direction of a large number of patent documents can be generated automatically.
Moreover, because the evaluation and analysis of each document are essentially automated, objective analysis results are obtained and, unlike before, there is no need to force the number of target documents down, so trends across a relatively broad range can be surveyed.
Patent Document 1: JP 2001-222536 A
Patent Document 2: JP 2005-149346 A

The technique of Patent Document 2 thus made it possible to process an enormous volume of patent documents automatically, without manual work, and to recognize visually the trends and overall spread of technical development. However, when a company weighs the merits of a direction of technical development, knowing how dense or sparse patent filings are in each technical field is not enough; the balance against future economic return is an important criterion.
Accordingly, if information indicating the potential for economic return could be displayed appropriately on the patent map obtained by the analysis system of Patent Document 2, its usefulness could be expected to increase further.

The object of the present invention is therefore to realize a technique that can easily and appropriately display information indicating the potential for future economic return on the patent map obtained by the document information analysis system of Patent Document 2.

To achieve the above object, the document information analysis system according to claim 1 comprises: means for extracting a plurality of keywords from a plurality of pieces of digitized document information according to a predetermined criterion; means for calculating the coordinates of each document on a two-dimensional plane by applying principal component analysis to the combinations and numbers of occurrences of the keywords contained in each piece of document information; means for calculating the distribution density of each region by tallying the document coordinates for each region of a fixed area; means for assigning to each region a display pattern corresponding to its distribution density; means for generating a figure corresponding to the distribution density of the documents by reflecting the display pattern of each region on the two-dimensional plane; means for calculating the coordinates of each keyword on the two-dimensional plane by applying principal component analysis to the total number of documents containing the keywords and the total number of occurrences of each keyword; means for placing each keyword on the two-dimensional plane according to its coordinates; means for generating a document information analysis map containing the figure corresponding to the distribution density of the documents and the keywords; means for displaying the document information analysis map on a display; means for displaying on the display a form for entering a market size amount when a keyword displayed on the document information analysis map is selected via input means and display of the market size is requested; means for calculating the dimension of a market-size figure (a figure corresponding to the market size) by applying a predetermined conversion ratio to the amount entered in the form via the input means; and means for generating a predetermined market-size figure having that dimension and displaying it near the keyword selected via the input means.
The term “document information” above includes patent documents such as published patent applications and registered patent gazettes, as well as academic literature such as papers, magazine articles, and the like (the same applies hereinafter).
“Assigning a display pattern corresponding to the distribution density” means, for example, assigning different colors, patterns, or shades according to the distribution density (the same applies hereinafter).
The term “keyword” includes, in addition to the character string representing a keyword, abbreviations and initials of the keyword (the same applies hereinafter).
“Principal component analysis” means a statistical method that represents the values of many variables by a small number of composite indices (principal components) with as little loss of information as possible (the same applies hereinafter).
When this document information analysis system is implemented in a so-called client-server configuration, the “means for displaying on a display” refers to the server-side function that delivers display information (a Web file or the like) to a client terminal and causes the necessary information (the figure corresponding to the distribution density, symbols indicating the presence of keywords, and so on) to be displayed on the display of the client terminal (the same applies hereinafter).
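The density-related means of claim 1 (tallying document coordinates per fixed-area region, then assigning a display pattern per region) can be illustrated with a minimal sketch. It assumes that the two-dimensional document coordinates have already been obtained by principal component analysis, that regions are square cells of side `cell_size`, and that the display pattern is a shade level chosen by hypothetical thresholds; none of these specifics come from the publication.

```python
from collections import Counter

def density_grid(doc_coords, cell_size):
    """Tally documents per square cell (region of fixed area).

    doc_coords: iterable of (x, y) document coordinates, assumed to come
    from a prior principal component analysis step (not shown here).
    """
    counts = Counter()
    for x, y in doc_coords:
        # Floor division maps each coordinate to the index of its cell.
        cell = (int(x // cell_size), int(y // cell_size))
        counts[cell] += 1
    return counts

def display_pattern(count, thresholds=(1, 3, 5)):
    """Return a shade level from 0 (empty) up to len(thresholds) (densest).

    The thresholds are illustrative assumptions only.
    """
    return sum(count >= t for t in thresholds)

if __name__ == "__main__":
    coords = [(0.1, 0.2), (0.4, 0.9), (1.5, 0.5), (-0.2, 0.3)]
    for cell, n in sorted(density_grid(coords, 1.0).items()):
        print(cell, n, display_pattern(n))
```

A renderer would then paint each cell with the color, pattern, or shade associated with its level.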

The document information analysis system according to claim 2 is based on the system of claim 1 and further comprises means for determining the center coordinates among a plurality of keywords selected via the input means by computing a weighted average of their coordinates based on the number of documents in which each keyword appears, and means for displaying the market-size figure at those center coordinates.
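The weighted average described for claim 2 can be sketched as follows. The function name and the tuple layout `(x, y, doc_count)` are assumptions made for illustration; the publication only states that the average is weighted by the number of documents in which each keyword appears.

```python
def weighted_center(keywords):
    """Return the center (cx, cy) of the selected keywords.

    keywords: list of (x, y, doc_count) tuples, where doc_count is the
    number of documents in which the keyword appears (the weight).
    """
    total = sum(w for _, _, w in keywords)
    if total <= 0:
        raise ValueError("at least one keyword must have a positive document count")
    cx = sum(x * w for x, _, w in keywords) / total
    cy = sum(y * w for _, y, w in keywords) / total
    return cx, cy

if __name__ == "__main__":
    # The keyword appearing in three documents pulls the center toward itself.
    print(weighted_center([(0.0, 0.0, 1), (4.0, 0.0, 3)]))
```

Weighting by document count places the figure closer to the keywords that dominate the selected set, rather than at their plain geometric midpoint.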

The document information analysis system according to claim 3 is based on the system of claim 1 or 2, wherein the form has a field for entering a market category, and the system further comprises means for displaying the market category entered via the input means near the market-size figure.

The document information analysis system according to claim 4 is based on the system of any of claims 1 to 3, wherein the form has a field for selectively entering whether the market size amount is on an expanding or a shrinking trend, and the system further comprises means for generating a market-size figure in a color corresponding to the trend selected via the input means.

The document information analysis system according to claim 5 comprises: means for extracting a plurality of keywords from a plurality of pieces of digitized document information according to a predetermined criterion; means for calculating the coordinates of each document on a two-dimensional plane by applying principal component analysis to the combinations and numbers of occurrences of the keywords contained in each piece of document information; means for calculating the distribution density of each region by tallying the document coordinates for each region of a fixed area; means for assigning to each region a display pattern corresponding to its distribution density; means for generating a figure corresponding to the distribution density of the documents by reflecting the display pattern of each region on the two-dimensional plane; means for calculating the coordinates of each keyword on the two-dimensional plane by applying principal component analysis to the total number of documents containing the keywords and the total number of occurrences of each keyword; means for placing each keyword on the two-dimensional plane according to its coordinates; means for generating a document information analysis map containing the figure corresponding to the distribution density of the documents and the keywords; means for displaying the document information analysis map on a display; market-keyword correspondence storage means that defines the correspondence between a plurality of keywords and specific market categories; market size amount storage means that stores market size amount data for each predetermined period for each market category; means for identifying one or more market categories by comparing each keyword displayed on the display with the keywords in the market-keyword correspondence storage means; means for identifying the market size amount associated with each market category by referring to the market size amount storage means; means for calculating the dimension of a market-size figure by applying a predetermined conversion ratio to that amount; and means for generating a market-size figure having that dimension and displaying it near the keywords belonging to that market category.
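The two lookups that distinguish claim 5 (matching displayed keywords against the market-keyword correspondence, then fetching a market size amount) might look like the sketch below. The table contents, the category names, and the choice of the most recent period are all hypothetical; the publication does not say which period's amount is used.

```python
# Hypothetical stand-ins for the market-keyword correspondence storage
# and the market size amount storage (amounts per period, arbitrary units).
MARKET_KEYWORDS = {
    "fuel cell vehicles": {"fuel cell", "hydrogen tank"},
    "rechargeable batteries": {"battery", "lithium ion"},
}
MARKET_SIZE = {
    "fuel cell vehicles": {2004: 120.0, 2005: 150.0},
    "rechargeable batteries": {2004: 900.0, 2005: 870.0},
}

def categories_for(displayed_keywords):
    """Map each matching market category to the displayed keywords it covers."""
    shown = set(displayed_keywords)
    return {
        category: sorted(keywords & shown)
        for category, keywords in MARKET_KEYWORDS.items()
        if keywords & shown
    }

def latest_market_size(category):
    """Return the amount for the most recent period (an assumed policy)."""
    by_period = MARKET_SIZE[category]
    return by_period[max(by_period)]

if __name__ == "__main__":
    for category, keywords in categories_for(["fuel cell", "battery", "valve"]).items():
        print(category, keywords, latest_market_size(category))
```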

The document information analysis system according to claim 6 is based on the system of claim 5 and further comprises means for determining whether the market size is on an expanding or a shrinking trend according to the increase or decrease of the market size amounts over a plurality of periods stored in the market size amount storage means, and means for generating a market-size figure in a color corresponding to that trend.
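One simple way to judge the trend of claim 6 is to compare the earliest and latest stored amounts, as in this sketch. The comparison rule, the "flat" case, and the color names are assumptions for illustration; the publication only requires that the trend be judged from increases or decreases across periods and that the color correspond to it.

```python
# Hypothetical trend-to-color mapping.
TREND_COLORS = {"expanding": "red", "shrinking": "blue", "flat": "gray"}

def market_trend(amounts_by_period):
    """Classify the trend by comparing the earliest and latest periods."""
    periods = sorted(amounts_by_period)
    first = amounts_by_period[periods[0]]
    last = amounts_by_period[periods[-1]]
    if last > first:
        return "expanding"
    if last < first:
        return "shrinking"
    return "flat"

if __name__ == "__main__":
    trend = market_trend({2003: 100.0, 2004: 120.0, 2005: 150.0})
    print(trend, TREND_COLORS[trend])
```

A production rule could instead fit a slope over all periods; the endpoint comparison is used here only because it is the shortest correct instance of "increase or decrease over a plurality of periods".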

The document information analysis system according to claim 7 is based on the system of claim 5 or 6 and further comprises means for determining the center coordinates among the keywords belonging to the market category by computing a weighted average of their coordinates based on the number of documents in which each keyword appears, and means for placing the market-size figure at those center coordinates.

The document information analysis system according to claim 8 is based on the system of claim 2 or 7, wherein the market-size figure is a circle whose center point is the center coordinates and whose radius or diameter corresponds to the calculated dimension.
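The circle of claim 8 can be derived from a market size amount with a few lines of arithmetic. In this sketch the conversion ratio value, the dictionary layout of the result, and the `as_diameter` switch are hypothetical; the publication states only that a predetermined conversion ratio is applied to the amount and that the resulting dimension serves as the radius or diameter.

```python
def market_circle(center, amount, ratio, as_diameter=False):
    """Build a circle description for a market size amount.

    center: (x, y) center coordinates, e.g. a weighted keyword center.
    amount: market size amount (non-negative).
    ratio:  conversion ratio from amount to on-screen dimension.
    as_diameter: treat the converted dimension as a diameter instead of a radius.
    """
    if amount < 0:
        raise ValueError("market size amount must be non-negative")
    dimension = amount * ratio
    radius = dimension / 2 if as_diameter else dimension
    return {"cx": center[0], "cy": center[1], "radius": radius}

if __name__ == "__main__":
    print(market_circle((3.0, 0.0), 200, 0.25))
    print(market_circle((3.0, 0.0), 200, 0.25, as_diameter=True))
```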

The document information analysis program according to claim 9 causes a computer to function as: means for extracting a plurality of keywords from a plurality of pieces of digitized document information according to a predetermined criterion; means for calculating the coordinates of each document on a two-dimensional plane by applying principal component analysis to the combinations and numbers of occurrences of the keywords contained in each piece of document information; means for calculating the distribution density of each region by tallying the document coordinates for each region of a fixed area; means for assigning to each region a display pattern corresponding to its distribution density; means for generating a figure corresponding to the distribution density of the documents by reflecting the display pattern of each region on the two-dimensional plane; means for calculating the coordinates of each keyword on the two-dimensional plane by applying principal component analysis to the total number of documents containing the keywords and the total number of occurrences of each keyword; means for placing each keyword on the two-dimensional plane according to its coordinates; means for generating a document information analysis map containing the figure corresponding to the distribution density of the documents and the keywords; means for displaying the document information analysis map on a display; means for displaying on the display a form for entering a market size amount when a keyword displayed on the document information analysis map is selected via input means and display of the market size is requested; means for calculating the dimension of a market-size figure by applying a predetermined conversion ratio to the amount entered in the form via the input means; and means for generating a predetermined market-size figure having that dimension and displaying it near the keyword selected via the input means.

According to the document information analysis system of claim 1 and the document information analysis program of claim 9, entering a market size amount through the form causes a figure whose dimension is proportional to the market size to be generated automatically and placed near the selected keyword.
As a result, information indicating the size of the market related to each selected keyword, that is, the potential for economic return, can be placed on the patent map very easily.

According to the document information analysis system of claim 2, the center coordinates of the selected keywords are calculated automatically and the market-size figure is placed there, so the figure can be positioned at the most appropriate location.

According to the document information analysis system of claim 3, entering a market category through the form causes that category to be displayed automatically near the market-size figure, so the market to which the selected keywords belong can be identified clearly.

According to the document information analysis system of claim 4, selecting an expanding or shrinking trend of the market size amount through the form allows the color of the market-size figure to differ according to the trend, so the future prospects of the market to which the selected keywords belong can be grasped clearly.

According to the document information analysis system of claim 5, even a user who does not know the correspondence between keywords and market categories or the market size amounts can have the system automatically generate a figure whose dimension is proportional to the market size and place it near the keywords belonging to that market.

According to the document information analysis system of claim 6, even a user who does not know whether the market size amount is expanding or shrinking can have the system determine the trend and automatically display a market-size figure in the color corresponding to it.

According to the document information analysis system of claim 7, the center coordinates of the keywords belonging to a market category are calculated automatically and the market-size figure is placed there, so the figure can be positioned at the most appropriate location.

According to the document information analysis system of claim 8, a circle centered on the center coordinates among the keywords and having a radius or diameter proportional to the market size amount is generated automatically as the market-size figure and shown on the display, so the size of the market to which the keywords relate can be expressed more clearly.

FIG. 1 is a block diagram showing the configuration of a document information analysis system 10 according to the present invention, which comprises a server group 12 and a plurality of client terminals 14 connected to the server group 12 via a communication network such as the Internet 13 or an intranet.
The server group 12 comprises a Web server 16, an AP (Application) server 18, and a DB (Database) server 20. The AP server 18 is provided with a search processing unit 22, a market-keyword correspondence table 23, a drawing processing unit 24, a market size amount table 25, a coordinate calculation unit 26, a language analysis preprocessing unit 28, a keyword extraction unit 30, a display data storage unit 32, a replacement/deletion word database 34, and a synonym/required word database 36.
The DB server 20 has a document database 38.

The server group 12 can be built by connecting, over a LAN, independent computers that function as the Web server 16, the AP server 18, and the DB server 20 respectively, but it can also be built by installing programs that implement each server function on a single computer.
The search processing unit 22, drawing processing unit 24, coordinate calculation unit 26, language analysis preprocessing unit 28, and keyword extraction unit 30 of the AP server 18 are realized by the CPU of the AP server 18 executing the necessary processing under the OS and a dedicated application program. The market-keyword correspondence table 23, market size amount table 25, display data storage unit 32, replacement/deletion word database 34, and synonym/required word database 36 are stored on the hard disk of the AP server 18.

Each client terminal 14 is a computer such as a PC equipped with input devices such as a mouse and keyboard and with a display, and has at least an OS and a Web browser program installed.

Next, the processing procedure in the document information analysis system 10 is described with reference to the flowcharts of FIGS. 2 and 3.
First, the user accesses the Web server 16 from a client terminal 14 via the Internet 13. Once the user passes an authentication step that requires entering an ID and password, the Web server 16 sends a search condition input form to the client terminal 14 (S10), and the form is shown on the display (S12).

The user then enters search conditions that specify the scope of analysis via the input device and sends them to the Web server 16 (S14).
For example, in the search condition input form the user enters search conditions in whichever of the fields “document type (published patent application, registered patent, published utility model application, registered utility model, etc.)”, “year range (e.g., 1992 to 2003)”, “International Patent Classification (IPC)”, “F-term”, and “applicant/right holder” are needed, and clicks the execute button.

On receiving these conditions, the search processing unit 22 of the AP server 18 extracts, via the DB server 20, the patent documents matching the search conditions from the document database 38 (S16) and sends a search result file showing the number of hits to the client terminal 14 via the Web server 16 (S18).
When the user, satisfied with the search result shown on the display, clicks the analysis start button and sends an analysis request (S20), the search processing unit 22 extracts the electronic data (document information) of each patent document from the document database 38 (S22).
FIG. 4 shows an example of the extracted data, which has the data items “application number”, “filing date”, “applicant”, “inventor”, “title of the invention”, and “abstract”. The data is in the general-purpose CSV format.
The data extracted by the search processing unit 22 is stored in the display data storage unit 32.

つぎに言語解析事前処理部28が起動し、表示データ蓄積部32に格納された抽出データに対して整形処理を施す（Ｓ24）。
まず、言語解析事前処理部28は、テキストマイニングのアルゴリズムに従い抽出データを単語レベルに分解する。
つぎに言語解析事前処理部28は、要置換・削除単語データベース34を参照し、抽出データに含まれる不要な単語を削除すると共に、必要な置換処理を実行する。
ここで不要な単語とは、例えば抄録中に含まれる「課題」や「解決手段」、「効果」などの定型的な段落タイトル、あるいは「ところで」や「しかしながら」、「そこで」などの接続詞、「である。」、「この場合」など技術的な意味を有さない言葉が該当する。要置換・削除単語データベース34内には、予め分析対象外とすべき多数の要削除単語が登録されている。
また必要な置換処理としては、例えば出願番号中に含まれる「特願昭55-」の和暦表示を、「特願1980-」の西暦表示に変換することが該当する。このため、要置換・削除単語データベース34内には、予め和暦と西暦との対応データが格納されている。
言語解析事前処理部28によって整形された抽出データは、表示データ蓄積部32に格納される。 Next, the language analysis pre-processing unit 28 is activated, and the extraction data stored in the display data storage unit 32 is shaped (S24).
First, the language analysis preprocessing unit 28 decomposes the extracted data into word levels according to a text mining algorithm.
Next, the language analysis pre-processing unit 28 refers to the replacement / deletion word database 34, deletes unnecessary words included in the extracted data, and executes necessary replacement processing.
Unnecessary words here include boilerplate paragraph headings found in abstracts such as “problem”, “means for solving”, and “effect”; conjunctions such as “by the way”, “however”, and “therefore”; and expressions with no technical meaning such as “is” or “in this case”. A large number of such words to be deleted, which should be excluded from analysis, are registered in advance in the replacement / deletion word database 34.
The necessary replacement processing is, for example, converting the Japanese-era notation “特願昭55-” (Showa 55) contained in an application number into the Western-calendar notation “特願1980-”. For this purpose, correspondence data between the Japanese calendar and the Western calendar is stored in advance in the replacement / deletion word database 34.
The extracted data shaped by the language analysis preprocessing unit 28 is stored in the display data storage unit 32.
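The shaping step (S24) described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the names `DELETE_WORDS`, `ERA_OFFSETS`, `to_western`, and `drop_unneeded` are hypothetical, and the two small tables stand in for the contents of the replacement / deletion word database 34. Tokenization is assumed to have been done already.

```python
import re

# Hypothetical excerpt of the words to be deleted (database 34 in the text).
DELETE_WORDS = {"課題", "解決手段", "効果", "ところで", "しかしながら", "そこで", "である。", "この場合"}

# Hypothetical era table: Western year = offset + era year (Showa 55 -> 1980).
ERA_OFFSETS = {"昭": 1925, "平": 1988}

ERA_PATTERN = re.compile(r"特願(昭|平)(\d{1,2})-")


def to_western(application_number: str) -> str:
    """Rewrite an era-based application number prefix to a Western year."""
    def replace(match: re.Match) -> str:
        year = ERA_OFFSETS[match.group(1)] + int(match.group(2))
        return f"特願{year}-"
    return ERA_PATTERN.sub(replace, application_number)


def drop_unneeded(tokens: list[str]) -> list[str]:
    """Remove tokens registered as having no technical meaning."""
    return [token for token in tokens if token not in DELETE_WORDS]
```

Numbers that already use a Western year pass through `to_western` unchanged, because the pattern only matches an era character after “特願”.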

つぎにキーワード抽出部30が起動し、整形済みの抽出データからキーワード候補を抽出すると共に（Ｓ26）、各キーワード候補を出現文献数順に集計する。
まず、キーワード抽出部30は類義語・必要語データベース36を参照し、実質的に同義である複数の技術用語を一つの用語（代表語）に統一させる。例えば、ある特許文献中に「炭素繊維」とあり、他の特許文献中に「カーボンファイバ」の記載があった場合、キーワード抽出部はそれぞれについて「炭素繊維」の記載ありと認定し、それぞれを当該キーワード候補の出現文献としてカウントする。 Next, the keyword extraction unit 30 is activated to extract keyword candidates from the shaped extracted data (S26), and the keyword candidates are tabulated in the order of the number of appearance documents.
First, the keyword extraction unit 30 refers to the synonym / necessary word database 36 and unifies multiple technical terms that are substantially synonymous into a single term (representative word). For example, if one patent document contains “炭素繊維” (carbon fiber) and another contains the loanword “カーボンファイバ” (carbon fiber), the keyword extraction unit 30 treats both as containing “炭素繊維” and counts each as a document in which that keyword candidate appears.

ここで、無数の単語の中からどの程度の数のキーワード候補を抽出するかについては、予めシステム10において設定しておくこともできるが、分析要求時にユーザの側で設定することもできる。
例えば、抽出件数の上限を100件と設定されていた場合、キーワード抽出部30は出現文献数の上位100位内の単語をキーワード候補として選定する。 Here, the number of keyword candidates to be extracted from the countless words can be set in the system 10 in advance, or can be set by the user when requesting an analysis.
For example, when the upper limit on the number of extractions is set to 100, the keyword extraction unit 30 selects the top 100 words ranked by number of documents of appearance as keyword candidates.

ただし、類義語・必要語データベース36内に特定の技術用語が必要語として設定されていた場合、キーワード抽出部30は当該必要語に該当する単語については例え出現文献数に基づく順位が100位以下であっても、キーワード候補として選定する。
この必要語の設定は、予めシステム運用者の側で準備して類義語・必要語データベース36に格納しておく他に、分析要求時にユーザが指定することもできる。 However, if a specific technical term is set as a necessary word in the synonym / necessary word database 36, the keyword extraction unit 30 selects any word corresponding to that necessary word as a keyword candidate even if its rank by number of documents of appearance is 100th or lower.
In addition to being prepared in advance by the system operator and stored in the synonym / necessary word database 36, necessary words can also be specified by the user when requesting an analysis.
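The candidate selection described above (synonym unification, ranking by number of documents of appearance, an upper limit, and necessary words that bypass the limit) can be sketched as follows. The function name `select_candidates` and its parameters are hypothetical, and breaking ties by word order is an assumption the text does not address.

```python
from collections import Counter


def select_candidates(docs, synonyms, required, limit):
    """Return (selected keyword candidates, document-frequency counter).

    docs     -- list of token lists, one list per patent document
    synonyms -- maps a variant term to its representative word
    required -- necessary words kept even when ranked below the limit
    limit    -- upper limit on the number of ranked candidates
    """
    doc_freq = Counter()
    for tokens in docs:
        # A set ensures each document is counted once per representative word.
        unified = {synonyms.get(token, token) for token in tokens}
        doc_freq.update(unified)

    # Rank by descending document frequency; ties by word (an assumption).
    ranked = [w for w, _ in sorted(doc_freq.items(), key=lambda kv: (-kv[1], kv[0]))]
    selected = ranked[:limit]
    selected += [w for w in ranked[limit:] if w in required]
    return selected, doc_freq
```

With a synonym table mapping “カーボンファイバ” to “炭素繊維”, a document containing either spelling adds one to the count for “炭素繊維”.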

図５は、キーワード候補の抽出結果リスト50を例示するものであり、出現件数１位の「表面」というキーワード候補は、105件の特許文献中に合計で212回出現しており、全文献数に占める割合（出現頻度）が2.16％であることを示している。
このキーワード候補の抽出結果は、Webサーバ16からクライアント端末14に送信され（Ｓ28）、ディスプレイに表示される（Ｓ30）。
これに対しユーザは、上記リストの中から必要なキーワード候補のチェックボックスにレ点を入力し、「選択」ボタン（図示省略）をクリックする（Ｓ32）。 FIG. 5 exemplifies a keyword candidate extraction result list 50, and the keyword candidate “surface” having the highest number of occurrences appears 212 times in total in 105 patent documents. It shows that the ratio (appearance frequency) is 2.16%.
The keyword candidate extraction result is transmitted from the Web server 16 to the client terminal 14 (S28) and displayed on the display (S30).
In response to this, the user enters a check mark in the check box for the necessary keyword candidate from the above list, and clicks a “select” button (not shown) (S32).

これを受けたキーワード抽出部30は、ユーザが指定したキーワード候補を正式なキーワードとして抽出し（Ｓ34）、表示データ蓄積部32に格納する。
この際、各特許文献の出願番号と抽出したキーワード、及び各キーワードの出現数との対応関係も登録される。 Receiving this, the keyword extraction unit 30 extracts keyword candidates designated by the user as formal keywords (S34) and stores them in the display data storage unit 32.
At this time, the correspondence between the application number of each patent document, the extracted keyword, and the number of appearances of each keyword is also registered.

つぎに座標算出部26が起動し、各特許文献の二次元平面上における位置座標を算出する（Ｓ36）。
すなわち、座標算出部26は、各特許文献に関連付けられたキーワードの組合せ及びそれぞれの出現数のデータを多変量解析の主成分分析用アルゴリズムに入力することにより、図６に示すように、各特許文献のＸ軸座標（第１主成分）及びＹ軸座標（第２主成分）が算出される。
この算出結果は、表示データ蓄積部32に格納される。 Next, the coordinate calculation unit 26 is activated to calculate the position coordinates on the two-dimensional plane of each patent document (S36).
That is, the coordinate calculation unit 26 inputs the combination of keywords associated with each patent document, together with the number of occurrences of each, into a principal component analysis algorithm (a multivariate analysis technique), and thereby calculates the X-axis coordinate (first principal component) and Y-axis coordinate (second principal component) of each patent document, as shown in FIG. 6.
The calculation result is stored in the display data storage unit 32.
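As a rough illustration of step S36, the sketch below projects documents onto their first two principal components. The source only states that a principal component analysis algorithm is used; the document-by-keyword count matrix as input and the use of power iteration with deflation (chosen here to avoid external libraries) are assumptions, and all function names are hypothetical.

```python
import math


def center_columns(rows):
    """Subtract the column mean from every entry."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[v - m for v, m in zip(row, means)] for row in rows]


def covariance(centered):
    """Sample covariance matrix of already centered rows."""
    n, d = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
            for i in range(d)]


def top_eigenvector(mat, iterations=500):
    """Dominant eigenpair of a symmetric matrix by power iteration."""
    d = len(mat)
    vec = [1.0 / (i + 1) for i in range(d)]  # fixed start for repeatability
    for _ in range(iterations):
        nxt = [sum(mat[i][j] * vec[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in nxt))
        if norm == 0.0:
            break
        vec = [x / norm for x in nxt]
    value = sum(vec[i] * sum(mat[i][j] * vec[j] for j in range(d)) for i in range(d))
    return value, vec


def document_coordinates(counts):
    """Map each row (keyword counts of one document) to an (x, y) pair."""
    centered = center_columns(counts)
    cov = covariance(centered)
    ev1, pc1 = top_eigenvector(cov)
    # Deflation removes the first component so the second can be found.
    d = len(cov)
    deflated = [[cov[i][j] - ev1 * pc1[i] * pc1[j] for j in range(d)] for i in range(d)]
    _, pc2 = top_eigenvector(deflated)
    return [(sum(a * b for a, b in zip(row, pc1)),
             sum(a * b for a, b in zip(row, pc2))) for row in centered]
```

The sign of each axis is arbitrary in principal component analysis, so a map produced this way may appear mirrored relative to one from another implementation.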

つぎに座標算出部26は、各特許文献の座標データを基に、二次元平面上における特許文献の分布密度を算出する（Ｓ38）。
すなわち、座標算出部26は、座標軸上の一定の面積に含まれる文献数を当該の領域の分布密度に変換する。
図７はその変換処理のイメージを示すものであり、二次元平面52上に各特許文献を座標通りにプロットした後（図中の黒点が各特許文献の位置を示している）、Ｘ軸及びＹ軸を所定の間隔で仕切ることによって複数の領域に区分し、各領域内に含まれる文献数を当該領域の分布密度として集計する。
例えば、αの領域には４件の特許文献が含まれているため、分布頻度は「４」とカウントされる。これに対し、βの領域には１件の特許文献も含まれていないため、分布頻度は「０」となる。
この分布密度データは、表示データ蓄積部32に格納される。 Next, the coordinate calculation unit 26 calculates the distribution density of the patent documents on the two-dimensional plane based on the coordinate data of each patent document (S38).
That is, the coordinate calculation unit 26 converts the number of documents contained in a fixed area of the coordinate plane into the distribution density of that region.
FIG. 7 illustrates this conversion. After each patent document is plotted on the two-dimensional plane 52 at its coordinates (the black dots in the figure indicate the positions of the patent documents), the X axis and the Y axis are partitioned at predetermined intervals to divide the plane into a plurality of regions, and the number of documents contained in each region is tallied as the distribution density of that region.
For example, since region α contains four patent documents, its distribution density is counted as “4”. In contrast, since region β contains no patent documents, its distribution density is “0”.
The distribution density data is stored in the display data storage unit 32.
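The tallying in step S38 amounts to a two-dimensional histogram. A minimal sketch, assuming square cells of a fixed size (the text only says the axes are partitioned at predetermined intervals); `distribution_density` and `cell_size` are hypothetical names:

```python
import math
from collections import Counter


def distribution_density(points, cell_size):
    """Count documents per grid cell.

    points    -- iterable of (x, y) document coordinates
    cell_size -- width and height of one square region
    Returns a Counter keyed by (column, row); absent cells read as 0.
    """
    density = Counter()
    for x, y in points:
        cell = (math.floor(x / cell_size), math.floor(y / cell_size))
        density[cell] += 1
    return density
```

Because a `Counter` returns 0 for missing keys, an empty region such as β in FIG. 7 needs no explicit entry.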

つぎに座標算出部26は、各キーワードを含む特許文献の件数、及び出現数のデータを主成分分析用アルゴリズムに入力することにより、図８に示すように、各キーワードのＸ軸座標（第１主成分）及びＹ軸座標（第２主成分）を算出する（Ｓ40）。
この算出結果は、表示データ蓄積部32に格納される。 Next, the coordinate calculation unit 26 inputs the number of patent documents containing each keyword and the occurrence-count data into the principal component analysis algorithm, and thereby calculates the X-axis coordinate (first principal component) and Y-axis coordinate (second principal component) of each keyword, as shown in FIG. 8 (S40).
The calculation result is stored in the display data storage unit 32.

つぎに描画処理部24が起動し、座標算出部26による算出結果に基づき、二次元平面上に分布密度データ及びキーワードの座標データを反映させた特許マップを生成する（Ｓ42）。
Webサーバ16は、この特許マップ表示用のWebファイルを生成し、クライアント端末14に送信することにより（Ｓ44）、クライアント端末14のディスプレイに表示させる（Ｓ46）。
図９は特許マップ54の一例を示すものであり、各キーワードの存在を示す点が該当の座標上にプロットされると共に、各点の近傍には対応のキーワード（文字列）が表示されている。 Next, the drawing processing unit 24 is activated to generate a patent map reflecting the distribution density data and the keyword coordinate data on the two-dimensional plane based on the calculation result by the coordinate calculation unit 26 (S42).
The web server 16 generates the patent map display web file and transmits it to the client terminal 14 (S44), thereby displaying it on the display of the client terminal 14 (S46).
FIG. 9 shows an example of the patent map 54. A point indicating each keyword is plotted at the corresponding coordinates, and the corresponding keyword (character string) is displayed near each point.

また、特許マップ54上には、特許文献の分布密度に対応した図形（紋様）55が表示されている。
すなわち、上記のように特許マップ54を構成する細分化された領域には分布密度が予め関連付けられており、描画処理部24がその分布密度に応じて異なった色彩（表示パターン）を当該領域に割り当てることにより、特許マップ54上に分布密度を反映した図形（紋様）55が描画されることとなる。
各領域に対する色彩の割当て方に限定はないが、一例を挙げれば以下のようになる。
(1) 分布密度が５０以上の領域 →赤色
(2) 分布密度が４０〜４９の領域→橙色
(3) 分布密度が３０〜３９の領域→黄色
(4) 分布密度が２０〜２９の領域→黄緑色
(5) 分布密度が１０〜１９の領域→水色
(6) 分布密度が１〜９の領域 →青色
(7) 分布密度が０の領域 →藍色 Further, on the patent map 54, a figure (pattern) 55 corresponding to the distribution density of the patent document is displayed.
That is, as described above, the distribution density is associated in advance with the subdivided areas constituting the patent map 54, and the drawing processing unit 24 assigns different colors (display patterns) to the areas according to the distribution densities. By assigning, a graphic (pattern) 55 reflecting the distribution density is drawn on the patent map 54.
There is no limitation on how colors are assigned to each region, but an example is as follows.
(1) Regions with a distribution density of 50 or more → red
(2) Regions with a distribution density of 40 to 49 → orange
(3) Regions with a distribution density of 30 to 39 → yellow
(4) Regions with a distribution density of 20 to 29 → yellow-green
(5) Regions with a distribution density of 10 to 19 → light blue
(6) Regions with a distribution density of 1 to 9 → blue
(7) Regions with a distribution density of 0 → indigo
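The example color assignment listed above can be expressed as a small lookup. This sketch simply encodes the seven bands from the list; the names `COLOR_BANDS` and `density_color` are hypothetical, and the text itself notes that the assignment is not limited to this scheme.

```python
# (lower bound, color) pairs in descending order, taken from the example list.
COLOR_BANDS = [
    (50, "red"),
    (40, "orange"),
    (30, "yellow"),
    (20, "yellow-green"),
    (10, "light blue"),
    (1, "blue"),
    (0, "indigo"),
]


def density_color(density: int) -> str:
    """Return the display color for a region's distribution density."""
    if density < 0:
        raise ValueError("distribution density cannot be negative")
    for lower_bound, color in COLOR_BANDS:
        if density >= lower_bound:
            return color
    raise AssertionError("unreachable: the last band starts at 0")
```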

図９の特許マップ54は白黒で表現されているため分布密度を直感的に把握することは難しいが、実際には上記の色彩によってヒートマップのように分布密度が鮮やかに描画されているため、ユーザはディスプレイ上に表示された特許マップ54を一目見ただけで特許文献数の粗密具合を認識することができる。
分布密度の変化を色彩によって表現する代わりに、他の表示パターンによって表現することもできる。例えば、同一色彩における濃淡に差を付けることによって分布密度の変化を表現することが該当する。あるいは、各領域に分布密度に応じて異なった模様（斜線、網線等）を割り当てることにより、分布密度の変化を表現してもよい。 Since the patent map 54 in FIG. 9 is expressed in black and white, it is difficult to intuitively grasp the distribution density. However, since the distribution density is actually drawn vividly like the heat map by the above color, The user can recognize the density of the number of patent documents only by looking at the patent map 54 displayed on the display.
Instead of expressing the change in distribution density by color, it can also be expressed by other display patterns. For example, it corresponds to expressing a change in distribution density by giving a difference between shades in the same color. Alternatively, a change in distribution density may be expressed by assigning different patterns (hatched lines, mesh lines, etc.) to each region according to the distribution density.

以下、図９の特許マップを観察することにより、ユーザはどのような情報を読み取ることができるのかについて説明する。
(1) まずユーザは、特定のキーワードの周辺に広がる分布密度を参照することにより、当該キーワード（技術テーマ）に関連する特許出願の多寡を認識することができる。例えば、「光励起」のキーワードの周辺が赤色で取り囲まれている場合、「光励起」に関連している特許出願件数が多いことを意味している。
(2) つぎにユーザは、各キーワードのマップ上における位置により、当該キーワードのユニーク度を確認することができる。すなわち、特許マップ54の中心に近い位置に配置されたキーワードは比較的オーソドックスであることを意味し、中心から外れるほどユニークな技術要素であることを読み取ることができる。
(3) またユーザは、複数のキーワード間の距離や組合せに基づき、各技術要素間の関係や位置付けを推察することもできる。 Hereinafter, what information the user can read by observing the patent map of FIG. 9 will be described.
(1) First, the user can recognize the number of patent applications related to the keyword (technical theme) by referring to the distribution density spreading around the specific keyword. For example, if the keyword “photoexcitation” is surrounded by a red color, it means that the number of patent applications related to “photoexcitation” is large.
(2) Next, the user can check the uniqueness of the keyword based on the position of each keyword on the map. That is, the keyword arranged at a position close to the center of the patent map 54 means that it is relatively orthodox, and it can be read that it is a unique technical element as it deviates from the center.
(3) The user can also infer the relationship and positioning between the technical elements based on the distances and combinations between the keywords.

例えば、図１０に示すように、「光励起」というキーワードの近傍に効果を表す「吸水性」、「防曇性」、「防露性」、「防汚性」のキーワードが配置されている場合、光励起作用を利用することによって吸水性、防曇性、防露性、防汚性といった効果を企図した出願傾向を読み取ることができる。
あるいは、「吸水性」、「防曇性」、「防露性」、「防汚性」の近傍で分布密度がゼロの領域には、光励起作用を利用して他の効果を実現する技術が配置されるべきことが予想できるため、つぎの技術開発の方向性を探る際の一助となる。 For example, as shown in FIG. 10, when the keywords “water absorption”, “anti-fogging”, “dew-proof”, and “anti-fouling” are arranged in the vicinity of the keyword “photoexcitation”. By utilizing the photoexcitation action, it is possible to read the tendency of applications intended for effects such as water absorption, antifogging, dewproofing and antifouling.
Alternatively, a region near “water absorption”, “antifogging”, “dewproofing”, and “antifouling” where the distribution density is zero can be expected to be where technologies that use photoexcitation to achieve other effects would be placed, which helps in exploring the direction of future technological development.

上記のキーワードのシンボル（点または文字列）は、特許マップ54上においてクリッカブルに表示されており、ユーザがマウスポインタを特定のキーワードに近づけて左クリックすると、図１１に示すように、描画処理部24によって別ウィンドウ56が開かれ、当該キーワードに関連付けられた特許文献の出願番号及び出願人名がリスト表示される。
さらに、このリストの中の特定文献をユーザがクリックすると、描画処理部24によって他のウィンドウ58が開かれ、表示データ蓄積部32内に格納された当該文献の抄録データが表示される。 The symbols (points or character strings) of the above keywords are displayed in a clickable manner on the patent map 54. When the user moves the mouse pointer to a specific keyword and left-clicks, the drawing processing unit 24 opens a separate window 56, as shown in FIG. 11, listing the application numbers and applicant names of the patent documents associated with that keyword.
Further, when the user clicks on a specific document in the list, the drawing processing unit 24 opens another window 58, and the abstract data of the document stored in the display data storage unit 32 is displayed.

またユーザは、特許マップ54の右横に設けられた年次指定欄58において特定の年次を指定し、「年次表示」ボタン60をクリックすることにより、当該年次における特許マップの表示を要求することができる。
すなわち、図９の特許マップ54は、最初の検索要求時にユーザが指定した年次範囲（例えば１０年分）の特許出願に基づいて生成されたものであるが、この中からユーザが特定の年次を指定した場合、描画処理部24は各領域の分布密度から指定された年次以外の文献数を間引いて分布密度を再計算し、その結果を特許マップ54上に反映させる（図示省略）。
この際、キーワードのシンボル（点及び文字列）はそのままの位置で表示されるため、ユーザは分布密度を表す図形55を観察することにより、特定年次における出願傾向を読み取ることができる。 In addition, the user can request display of the patent map for a specific year by designating that year in the year designation field 58 provided to the right of the patent map 54 and clicking the “annual display” button 60.
That is, the patent map 54 in FIG. 9 is generated from the patent applications in the year range (for example, 10 years) designated by the user at the time of the first search request. When the user designates a specific year within that range, the drawing processing unit 24 recalculates the distribution density by subtracting, from the distribution density of each region, the documents from years other than the designated one, and reflects the result on the patent map 54 (not shown).
At this time, since the keyword symbols (points and character strings) are displayed as they are, the user can read the application tendency in a specific year by observing the graphic 55 representing the distribution density.

さらにユーザは、「アニメ表示」ボタン62をクリックすることにより、最初に指定した年次範囲内において、分布密度の変化を一定間隔をおいて年次単位で連続表示（アニメーション表示）させることもできる（図示省略）。
この場合も、キーワードのシンボル（点及び文字列）はそのままの位置で表示されるため、ユーザは分布密度を表す図形55の変化を観察することにより、出願傾向の推移を読み取ることができる。 In addition, the user can click the “animation display” button 62 to continuously display (animation display) the change in distribution density at regular intervals within the annual range specified first. (Not shown).
Also in this case, since the keyword symbols (points and character strings) are displayed as they are, the user can read the transition of the application tendency by observing the change of the graphic 55 representing the distribution density.

ユーザは、出願人選択欄64において特定の出願人を指定することにより、当該出願人に係る特許マップの表示を要求することができる。
すなわち、最初の検索要求時にユーザが出願人を限定しなかった場合、図９の特許マップ54は当該技術分野における全特許出願に基づいて生成されたこととなり、出願人選択欄には各特許文献の出願人がリスト表示される。
この中からユーザが特定の出願人を指定した場合、座標算出部26は各領域の分布密度から指定された出願人以外の件数を間引いて分布密度を再計算し、描画処理部24はその結果を特許マップ54上に反映させる（図示省略）。
この場合も、キーワードのシンボル（点及び文字列）はそのままの位置で表示されるため、ユーザは分布密度を表す図形55を観察することにより、特定出願人による出願傾向を読み取ることができる。
この結果、例えばＭ＆Ａに際して合併候補となる複数企業の得意分野を順に俯瞰し、最も補完効果の高い企業を選択することが可能となる。 The user can request a display of a patent map relating to the applicant by designating a specific applicant in the applicant selection field 64.
That is, if the user did not restrict the applicant at the time of the first search request, the patent map 54 of FIG. 9 is generated from all patent applications in the technical field, and the applicants of the patent documents are listed in the applicant selection field.
When the user designates a specific applicant from this list, the coordinate calculation unit 26 recalculates the distribution density by subtracting, from the distribution density of each region, the documents of applicants other than the designated one, and the drawing processing unit 24 reflects the result on the patent map 54 (not shown).
Also in this case, since the keyword symbols (points and character strings) remain displayed at the same positions, the user can read the filing trend of the specific applicant by observing the graphic 55 representing the distribution density.
As a result, for example, in M&A it becomes possible to survey in turn the fields of strength of several companies that are merger candidates and select the company with the highest complementary effect.

さらにユーザは、出願人表示欄64において複数の出願人を指定し、「和集合」ボタン66にチェックを入れて「選択」ボタン70をクリックすることにより、共同出願を含め各出願人に係る全ての特許文献に基づいた分布密度を特許マップ54上に表示させることができる。 Furthermore, by designating a plurality of applicants in the applicant display field 64, checking the “union” button 66, and clicking the “select” button 70, the user can display on the patent map 54 a distribution density based on all patent documents of those applicants, including joint applications.

これに対し、出願人表示欄64において複数の出願人を指定し、「差集合」ボタン68にチェックを入れて「選択」ボタン70をクリックすると、各出願人間の得意分野を明示する分布密度を特許マップ54上に表示させることができる。
例えば、ユーザが出願人選択欄64においてＡ社とＢ社を選択し、「差集合」ボタン68にチェックを入れて「選択」ボタン70をクリックすると、座標算出部26は各領域の分布密度から両社以外の出願人に係る文献数を除外すると共に、一方の出願人に係る文献数と他方の出願人に係る文献数とを比較し、その差を何れかの出願人に係る文献数として当該領域に関連付ける。 On the other hand, when multiple applicants are specified in the applicant display field 64, the “Difference set” button 68 is checked and the “Select” button 70 is clicked, the distribution density that clearly indicates the field of expertise of each applicant is displayed. It can be displayed on the patent map 54.
For example, when the user selects Company A and Company B in the applicant selection field 64, checks the “difference set” button 68, and clicks the “select” button 70, the coordinate calculation unit 26 excludes from the distribution density of each region the documents of applicants other than the two companies, compares the number of documents of one applicant with that of the other, and associates the difference with the region as the number of documents of whichever applicant has more.

具体例を示すと、ある領域においてＡ社の文献数が８でＢ社の文献数が２の場合、当該領域にはＡ社の文献数として６が計上される。
また、他の領域においてＡ社の文献数が１でＢ社の文献数が５の場合、当該領域にはＢ社の文献数として４が計上される。
これに対し、Ａ社の文献のみが存在している領域には、当該文献数がそのままＡ社の文献数として当該領域に計上され、Ｂ社の文献のみが存在している領域には、当該文献数がＢ社の文献数として当該領域に計上される。
なお、Ａ社及びＢ社の共同出願に関しては、何れの文献数としても計上されない。 As a specific example, when the number of documents of company A is 8 and the number of documents of company B is 2 in a certain area, 6 is counted as the number of documents of company A in the area.
Further, when the number of documents of company A is 1 and the number of documents of company B is 5 in other areas, 4 is counted as the number of documents of company B in this area.
On the other hand, in the area where only the documents of company A exist, the number of the documents is included in the area as the number of documents of company A as it is, and in the area where only the documents of company B exist, The number of documents is included in this area as the number of documents of Company B.
Joint applications by Company A and Company B are not counted toward either company's number of documents.
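The difference-set calculation described above can be sketched per region as follows. The data layout (each region holding one set of applicant names per document) and the name `difference_density` are hypothetical. Two behaviors are assumptions the text does not settle: a tie is reported as belonging to neither company, and a document filed jointly with a third party still counts for the selected company.

```python
def difference_density(cell_docs, company_a, company_b):
    """Return {cell: (leading company or None, difference in documents)}.

    cell_docs -- {cell: [set of applicant names for each document]}
    """
    result = {}
    for cell, docs in cell_docs.items():
        # Joint A+B applications are counted for neither company.
        count_a = sum(1 for who in docs if company_a in who and company_b not in who)
        count_b = sum(1 for who in docs if company_b in who and company_a not in who)
        if count_a > count_b:
            result[cell] = (company_a, count_a - count_b)
        elif count_b > count_a:
            result[cell] = (company_b, count_b - count_a)
        else:
            result[cell] = (None, 0)  # tie handling is an assumption
    return result
```

A region with 8 documents from Company A and 2 from Company B yields `("A", 6)`, matching the worked example in the text.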

つぎに描画処理部24は、各領域に出願人及び分布密度に応じて異なった色彩を割り当てることにより、特許マップ54上に両社の差集合を反映した図形を描画する。
例えば、Ａ社に係る領域には赤系の色彩を割当て、その濃度によって文献数の多寡を表現すると共に、Ｂ社に係る領域には青系の色彩を割当て、その濃度によって文献数の多寡を表現する。
この結果ユーザは、一目で両社の優位な領域を把握することが可能となる。 Next, the drawing processing unit 24 draws a graphic reflecting the difference set of the two companies on the patent map 54 by assigning different colors to each region according to the applicant and the distribution density.
For example, red-based colors are assigned to Company A's regions and blue-based colors to Company B's regions, with the depth of the color expressing the number of documents in each case.
As a result, the user can grasp the superior areas of both companies at a glance.

ここで、特許マップ54上に市場規模対応図形の追加を希望するユーザは、図１２に示すように、任意のキーワードを複数反転選択した状態で右クリックし、表示されるメニューから「市場規模の表示」を選択する（図３のＳ48）。
この結果、カテゴリ、市場規模額、トレンド、図形指定の入力項目を備えた市場規模入力フォーム72が特許マップ54上に表示される（Ｓ50）。
これに対しユーザは、カテゴリ欄に市場のタイトルを入力すると共に、市場規模額欄に金額を入力する（Ｓ52）。
また、市場規模が拡大傾向を示している場合にはトレンド欄の「拡大傾向」にチェックを入れ、縮小傾向を示している場合には「縮小傾向」にチェックを入れる。
さらに、図形指定欄に希望の図形（円または棒グラフ）を指定し、「ＯＫ」ボタンをクリックする。 Here, as shown in FIG. 12, a user who wishes to add a market-size graphic to the patent map 54 right-clicks with a plurality of arbitrary keywords highlighted and selects “Display market size” from the menu that appears (S48 in FIG. 3).
As a result, a market size input form 72 having input items for category, market size, trend, and graphic designation is displayed on the patent map 54 (S50).
On the other hand, the user inputs the market title in the category column and inputs the amount in the market scale column (S52).
Also, if the market size shows an expansion trend, check the “Expansion Trend” in the trend column, and if it indicates a reduction trend, check the “Reduction Trend”.
Further, a desired figure (circle or bar graph) is designated in the figure designation column, and the “OK” button is clicked.

これを受けたAPサーバ18の描画処理部24は、まずユーザが選択した各キーワード間の中心座標を算出する（Ｓ54）。
具体的には、各キーワードのＸ軸座標値及びＹ軸座標値に対し、出現文献数に基づく加重平均をとり、それぞれの算出結果を中心座標とする。
例えば、図１３に示すように、五つの選択キーワードＡ〜ＥのそれぞれのＸ軸座標、Ｙ軸座標、出現文献数が表の通りである場合、中心座標のＸ軸の値は数式(1)より−0.182となり、Ｙ軸の値は数式(2)より−1.994となる。
つぎに描画処理部24は、ユーザが入力した金額を所定の金額／ドット換算スケールに当てはめて円の半径を導く（Ｓ56）。例えば、10億円＝１ドットという換算スケールである場合、ユーザが入力した市場規模額が800億円であれば、円の半径は80ドットとなる。ユーザが図形指定で棒グラフを選択した場合には、金額を所定の金額／ドット換算スケールに当てはめることにより、棒グラフの長さがドット数として求められる。 Receiving this, the drawing processing unit 24 of the AP server 18 first calculates center coordinates between the keywords selected by the user (S54).
Specifically, a weighted average based on the number of appearance documents is taken for the X-axis coordinate value and the Y-axis coordinate value of each keyword, and the respective calculation results are set as the central coordinates.
For example, as shown in FIG. 13, when the X-axis coordinates, Y-axis coordinates, and numbers of documents of appearance of the five selected keywords A to E are as shown in the table, the X-axis value of the center coordinates is −0.182 from formula (1), and the Y-axis value is −1.994 from formula (2).
Next, the drawing processing unit 24 applies the amount input by the user to a predetermined amount / dot conversion scale to derive the radius of the circle (S56). For example, when the conversion scale is 1 billion yen = 1 dot, if the market scale amount input by the user is 80 billion yen, the radius of the circle is 80 dots. When the user selects a bar graph by designating a figure, the length of the bar graph is obtained as the number of dots by applying the amount to a predetermined amount / dot conversion scale.
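Steps S54 and S56 can be sketched as a weighted average followed by a linear scale conversion. The function names are hypothetical, and the sample values used with them should be chosen by the reader, since the table of FIG. 13 is not reproduced in this text. The default of 1 billion yen per dot follows the example above.

```python
def weighted_center(keywords):
    """Center coordinates weighted by number of documents of appearance.

    keywords -- iterable of (x, y, document_count) tuples
    """
    keywords = list(keywords)
    total = sum(count for _, _, count in keywords)
    if total == 0:
        raise ValueError("at least one keyword must appear in a document")
    center_x = sum(x * count for x, _, count in keywords) / total
    center_y = sum(y * count for _, y, count in keywords) / total
    return center_x, center_y


def size_in_dots(amount_yen, yen_per_dot=1_000_000_000):
    """Convert a market size to a circle radius or bar length in dots."""
    return amount_yen / yen_per_dot
```

For a market size of 80 billion yen, `size_in_dots(80_000_000_000)` gives 80, the radius used for circle Y in the text.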

つぎに描画処理部24は、図１４に示すように、特許マップ54上に中心座標を示す星印Ｘと、この星印Ｘを中心点とする半径80ドットの円Ｙを描画する（Ｓ58）。
この円Ｙの上方には、当該市場のカテゴリ「コーティング材」と、市場規模額「800億円」が表示されている。
この円Ｙは拡大傾向を反映する赤色で描かれているが、ユーザがフォーム72において縮小傾向を選択した場合にはこれを反映する青色で描かれることとなる。
なお、ユーザが図形指定で棒グラフを選択した場合には、上記星印Ｘから対応の長さを備えた赤色の棒グラフが伸びる図形が、描画処理部24によって特許マップ54上に加えられる。
このようにして描画処理部24によって市場規模対応図形が追加された新たな特許マップ54は、Webサーバ16経由でクライアント端末14に送信され（Ｓ60）、ディスプレイに表示される（Ｓ62）。 Next, as shown in FIG. 14, the drawing processing unit 24 draws an asterisk X indicating the center coordinates and a circle Y having a radius of 80 dots centered on the asterisk X on the patent map 54 (S58). .
Above the circle Y, the category “coating material” of the market and the market size “80 billion yen” are displayed.
This circle Y is drawn in red that reflects the enlargement tendency, but when the user selects a reduction tendency in the form 72, it is drawn in blue that reflects this.
When the user selects a bar graph by designating the graphic, a graphic in which a red bar graph having a corresponding length extends from the star X is added to the patent map 54 by the drawing processing unit 24.
The new patent map 54 to which the graphic corresponding to the market size is added by the drawing processing unit 24 in this way is transmitted to the client terminal 14 via the Web server 16 (S60) and displayed on the display (S62).

上記のＳ48〜Ｓ62の操作を繰り返すことにより、図１５に示すように、特許マップ54上に市場カテゴリ毎に市場規模額を示す複数の円Ｙが描かれることとなり、ユーザは各キーワードと市場カテゴリとの関連性、市場規模額の多寡（円の面積）と出願密度との関連性、当該市場のトレンド（拡大／縮小）等を視覚的に認識することが可能となる。 By repeating the above operations S48 to S62, a plurality of circles Y, each indicating the market size of a market category, are drawn on the patent map 54 as shown in FIG. 15, and the user can visually recognize the relationship between each keyword and the market categories, the relationship between market size (the area of the circle) and filing density, the trend of each market (expanding / shrinking), and so on.

上記においては、ユーザの側で予め各キーワードと市場カテゴリとの関連性、市場規模額に関する統計データ、過去から現在までのトレンドを認識している必要があるが、一定の条件を満たす場合には市場規模対応図形の自動表示を実現することもできる。
以下、図１６のフローチャートに従い、市場規模対応図形の自動表示に係る処理手順を説明する。 In the above, it is necessary for the user to recognize the relationship between each keyword and the market category, statistical data on the market size, and trends from the past to the present. Automatic display of figures corresponding to the market size can also be realized.
In the following, a processing procedure related to automatic display of a figure corresponding to the market size will be described with reference to the flowchart of FIG.

まず、ユーザが特許マップ54上で右クリックし、表示されるメニューから「市場規模の自動表示」をクリックすると、これを受けた描画処理部24は（Ｓ70）、市場−キーワード対応テーブル23に登録されたキーワードと特許マップ54上に表示された各キーワードとを比較し、対応の市場カテゴリが存在するか否かを判定する（Ｓ72、Ｓ74）。
すなわち、市場−キーワード対応テーブル23には、図１７に示すように、各市場カテゴリと複数のキーワードとの対応関係が予め多数定義されている。ここで例えば、特許マップ54上にハイブリッド、発電機、エンジン、変速のキーワードが所定の距離内に存在していた場合、描画処理部24は「ハイブリッド車」の市場カテゴリを特定すると共に、ハイブリッド、発電機、エンジン、変速を関連キーワードと認定する（Ｓ76）。
これに対し、対応の市場カテゴリが存在しない場合には、「対応データなし」の表示がクライアント端末14に返される（Ｓ78）。 First, when the user right-clicks on the patent map 54 and clicks “automatic display of market size” in the menu that appears, the drawing processing unit 24 receives this request (S70), compares the keywords registered in the market-keyword correspondence table 23 with the keywords displayed on the patent map 54, and determines whether a corresponding market category exists (S72, S74).
That is, as shown in FIG. 17, a large number of correspondences between market categories and sets of keywords are defined in advance in the market-keyword correspondence table 23. For example, when the keywords hybrid, generator, engine, and transmission are present within a predetermined distance on the patent map 54, the drawing processing unit 24 identifies the “hybrid vehicle” market category and recognizes hybrid, generator, engine, and transmission as related keywords (S76).
On the other hand, if there is no corresponding market category, a display of “no corresponding data” is returned to the client terminal 14 (S78).

なお、市場−キーワード対応テーブル23の各市場カテゴリに関連付けられた複数のキーワードの全てが特許マップ54上に表示されている場合はもちろん、所定比率以上（例えば70％以上）一致している場合には当該市場カテゴリを認定するというように、柔軟に運用することもできる。 Note that the system can also be operated flexibly: a market category is recognized not only when all of the keywords associated with it in the market-keyword correspondence table 23 are displayed on the patent map 54, but also when at least a predetermined proportion of them (for example, 70% or more) match.
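The category matching of steps S72 to S76, including the flexible proportion rule, can be sketched as follows. The table contents and the name `match_categories` are hypothetical, and the check that the keywords lie within a predetermined distance of one another on the map is omitted for brevity.

```python
def match_categories(table, map_keywords, min_ratio=0.7):
    """Return {category: matched keywords} for categories meeting the ratio.

    table        -- {market category: list of associated keywords}
    map_keywords -- set of keywords currently shown on the patent map
    min_ratio    -- minimum share of a category's keywords that must match
    """
    matches = {}
    for category, keywords in table.items():
        if not keywords:
            continue  # a category without keywords can never match
        present = [k for k in keywords if k in map_keywords]
        if len(present) / len(keywords) >= min_ratio:
            matches[category] = present
    return matches
```

An empty result corresponds to the “no corresponding data” response returned in S78.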

つぎに描画処理部24は、上記と同様のロジックにより、関連キーワード間の中心座標を算出する（Ｓ80）。
つぎに描画処理部24は、市場規模額テーブル25を参照し、ハイブリッド車に関する最新の市場規模額と拡大／縮小傾向を取得する（Ｓ82）。
図１８に示すように、市場規模額テーブル25には、各市場カテゴリと過去数年における市場規模額が格納されているため、各年の市場規模額の推移によって市場規模が拡大傾向にあるのか縮小傾向にあるのかを判断できる。例えば、ハイブリッド車の場合には2004年度の市場規模額よりも2005年度の市場規模額の方が大きいため、描画処理部24は拡大傾向と判定する。
つぎに描画処理部24は、上記と同様の手法により、この2005年度の市場規模額に対応した円の半径を算出する（Ｓ84）。 Next, the drawing processing unit 24 calculates center coordinates between related keywords by the same logic as described above (S80).
Next, the drawing processing unit 24 refers to the market size table 25 and obtains the latest market size and the expansion / reduction tendency regarding the hybrid vehicle (S82).
As shown in FIG. 18, the market size table 25 stores each market category together with its market size for the past several years, so whether a market is expanding or shrinking can be judged from the year-to-year change in market size. For example, for hybrid vehicles the market size in fiscal 2005 is larger than that in fiscal 2004, so the drawing processing unit 24 determines that the market is expanding.
Next, the drawing processing unit 24 calculates the radius of the circle corresponding to the market size amount in fiscal 2005 by the same method as described above (S84).
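The trend judgment in step S82 compares the two most recent years in the market size table 25. A minimal sketch, with the hypothetical names `latest_size_and_trend` and `trend_color`; the "flat" result for equal values is an assumption, since the text only covers expansion and shrinkage. The red and blue colors follow the description of circle Y.

```python
def latest_size_and_trend(yearly_sizes):
    """Return (latest year, latest market size, trend label).

    yearly_sizes -- {fiscal year: market size} for one market category
    """
    years = sorted(yearly_sizes)
    if len(years) < 2:
        raise ValueError("at least two years are needed to judge a trend")
    latest, previous = years[-1], years[-2]
    if yearly_sizes[latest] > yearly_sizes[previous]:
        trend = "expanding"
    elif yearly_sizes[latest] < yearly_sizes[previous]:
        trend = "shrinking"
    else:
        trend = "flat"  # not covered by the text; an assumption
    return latest, yearly_sizes[latest], trend


def trend_color(trend):
    """Red for an expanding market, blue for a shrinking one."""
    return {"expanding": "red", "shrinking": "blue"}.get(trend, "gray")
```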

つぎに描画処理部24は、特許マップ54上に上記の中心座標を中心点とする上記半径の円を描画する（Ｓ86）。この際、円の色彩は市場規模の拡大傾向を反映させた赤色に着色される。
最後に、円の近傍に市場タイトル及び市場規模額を挿入した特許マップ54が生成され、クライアント端末14に送信される（Ｓ88）。
以上の結果、クライアント端末14のディスプレイには、図１４に示したのと同様、市場規模額を示す円Ｙが表示されることとなる。
もちろん、Ｓ74で該当する市場カテゴリが複数存在する場合、描画処理部24はＳ76〜Ｓ86を繰り返すことにより、特許マップ54上に複数の円を描く（図１５参照）。 Next, the drawing processing unit 24 draws on the patent map 54 a circle of the above radius centered on the above center coordinates (S86). The circle is colored red to reflect the expanding trend of the market.
Finally, a patent map 54 in which the market title and the market size amount are inserted in the vicinity of the circle is generated and transmitted to the client terminal 14 (S88).
As a result, as shown in FIG. 14, a circle Y indicating the market scale amount is displayed on the display of the client terminal 14.
Of course, when there are a plurality of corresponding market categories in S74, the drawing processing unit 24 draws a plurality of circles on the patent map 54 by repeating S76 to S86 (see FIG. 15).

In the above, a plurality of circles indicating market sizes are displayed automatically when the user right-clicks on the patent map 54 and selects "Automatic display of market size" from the display menu. Alternatively, when the user right-clicks on the patent map 54, the closest market category may be identified based on the coordinates of the right-clicked position on the patent map 54 or on the keyword displayed at that position, and a circle indicating the market size of the identified market category may be displayed.
This allows the user to find out very easily the size of the market related to a technical area or keyword that has intuitively caught the user's interest.
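One way to identify the closest market category from a right-click is sketched below. The description does not say how closeness is measured, so the Euclidean distance to each category's center coordinates is an assumption, as are the function names and the sample data.

```python
import math


def nearest_category_by_position(click, centers):
    """Pick the category whose center is closest to the clicked point.

    `centers` maps category -> (x, y). Euclidean distance is an assumption.
    """
    return min(centers, key=lambda cat: math.dist(click, centers[cat]))


def category_by_keyword(keyword, table):
    """Pick the first category whose keyword set contains the clicked keyword.

    `table` maps category -> set of keywords. Returns None if nothing matches.
    """
    for category, keywords in table.items():
        if keyword in keywords:
            return category
    return None
```

For example, with centers at (3.0, 1.5) for one category and (10.0, 10.0) for another, a click at (9.0, 8.0) selects the second category.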

The literature information analysis system 10 described above has a client-server configuration comprising the client terminal 14 and the server group 12, but the present invention is not limited to this. It can of course also be realized as a stand-alone system by setting up a dedicated application program on a computer such as a personal computer.
In this case, the literature database 38 may be stored on the hard disk of the personal computer or the like, or the necessary literature information may be extracted from a literature database stored in a DB server connected via a network.

In the above, an example was shown in which the keyword extraction unit 30 extracts nouns as keyword candidates, but the present invention is not limited to this. Combinations of a noun and a noun, an adjective and a noun, or a noun and a verb can also be extracted as keyword candidates.
Examples include "water contamination" (noun + noun), "high hydrophilicity" (adjective + noun), and "irradiate with ultraviolet rays" (noun + verb).
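Extraction of such part-of-speech combinations can be sketched as follows. The sketch assumes that the text has already been split into tokens and tagged by a morphological analyzer, with particles removed, and that a combination means two adjacent tokens. The tag names and the function name are hypothetical.

```python
# Part-of-speech pairs accepted as keyword candidates (from the description).
PATTERNS = {("noun", "noun"), ("adjective", "noun"), ("noun", "verb")}


def keyword_candidates(tagged):
    """Return single nouns plus adjacent pairs matching PATTERNS.

    `tagged` is a list of (surface, part_of_speech) tuples, assumed to come
    from a morphological analyzer with particles already removed.
    """
    candidates = [word for word, pos in tagged if pos == "noun"]
    for (w1, p1), (w2, p2) in zip(tagged, tagged[1:]):
        if (p1, p2) in PATTERNS:
            candidates.append(f"{w1} {w2}")
    return candidates
```

Given the tagged tokens `[("high", "adjective"), ("hydrophilicity", "noun")]`, the function returns the noun "hydrophilicity" and the pair "high hydrophilicity".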

As described above, the keyword extraction unit 30 first lists keyword candidates, and those selected by the user from the list are adopted as keywords. Instead, the system 10 can also be operated so that the keyword extraction unit 30 determines the keywords automatically.
In addition, an example was shown above in which the keyword extraction unit 30 extracts keyword candidates from the abstracts of patent documents, but it is of course also possible to extract keywords from the full text of the patent specifications.

In the above, an example was shown in which the coordinates and distribution density of the literature information and the coordinates of each keyword are calculated based on Japanese keywords extracted from Japanese patent documents, and a patent map in Japanese notation is generated and displayed based on these results. However, the system 10 can also handle literature information written in languages other than Japanese, such as English, Chinese, German, and French.
That is, patent document information in each language is accumulated in the literature database 38, and words are registered for each language in the replacement/deletion word database 34 and the synonym/required word database 36.
The search and extraction of patent document information by the search processing unit 22, the shaping by the language analysis pre-processing unit 28, the extraction of keyword candidates by the keyword extraction unit 30, the various calculations by the coordinate calculation unit 26, and the generation of the patent map by the drawing processing unit 24 are likewise multilingual.

Because the starting point of this invention was to make patent map creation more efficient, the description above has focused on the generation of patent maps, but the invention can also be applied to other fields.
For example, if electronic data of academic literature (such as research papers published in academic society journals and specialized journals) is stored in the literature database as literature information instead of patent documents, a research map giving an overview of research trends in academia can be generated and displayed.
Patent documents are subject to the constraint that their information is not disclosed until one year and six months after filing, whereas research papers tend to be published promptly, so the research map can be expected to serve as a tool for anticipating patent application trends.
In this case as well, displaying on the research map a figure showing the relationship between a research field and the market size amount makes it possible to visually recognize the relationship between research themes and economic returns.

FIG. 1 is a block diagram showing the configuration of the literature information analysis system according to the present invention.
FIG. 2 is a flowchart showing a processing procedure in the literature information analysis system.
FIG. 3 is a flowchart showing a processing procedure in the literature information analysis system.
FIG. 4 is an explanatory diagram showing an example of data extracted from the literature database.
FIG. 5 is an explanatory diagram illustrating the extraction result list of keyword candidates.
FIG. 6 is an explanatory diagram illustrating the calculated X-axis and Y-axis coordinates of each patent document.
FIG. 7 is an explanatory diagram showing how the distribution density is calculated from the coordinates of the patent documents.
FIG. 8 is an explanatory diagram illustrating the calculated X-axis and Y-axis coordinates of each keyword.
FIG. 9 is a layout diagram showing an example of a patent map.
FIG. 10 is a partially enlarged view of the patent map.
FIG. 11 is a layout diagram showing a state in which two separate windows are displayed on the patent map.
FIG. 12 is an explanatory diagram showing a state in which the market size input form is displayed on the patent map.
FIG. 13 is an explanatory diagram showing the method of calculating the center coordinates between the selected keywords.
FIG. 14 is an explanatory diagram showing a state in which a market size corresponding figure (circle) and the like are displayed on the patent map.
FIG. 15 is an explanatory diagram showing a state in which a plurality of market size corresponding figures (circles) and the like are displayed on the patent map.
FIG. 16 is a flowchart explaining the processing procedure for the automatic display of market size corresponding figures.
FIG. 17 is an explanatory diagram showing the structure of the market-keyword correspondence table.
FIG. 18 is an explanatory diagram showing the structure of the market size amount table.

Explanation of symbols

10 Literature information analysis system
12 Server group
13 Internet
14 Client terminal
16 Web server
18 AP server
20 DB server
22 Search processing unit
23 Market-keyword correspondence table
24 Drawing processing unit
25 Market size amount table
26 Coordinate calculation unit
28 Language analysis pre-processing unit
30 Keyword extraction unit
32 Display data storage unit
34 Replacement/deletion word database
36 Synonym/required word database
38 Literature database
50 Extraction result list of keyword candidates
52 Two-dimensional plane
54 Patent map
55 Figure corresponding to the distribution density of patent documents
56 Separate window
58 Other window
60 "Annual display" button
62 "Animation display" button
64 Applicant designation field
66 "Union" button
68 "Difference" button
70 "Select" button
72 Market size input form
X Star mark indicating the center coordinates
Y Market size corresponding circle

Claims

A literature information analysis system characterized by comprising:
Means for extracting a plurality of keywords from a plurality of pieces of digitized document information according to a predetermined standard;
Means for calculating the coordinates of each document on a two-dimensional plane by performing a principal component analysis on the combination of keywords included in each piece of document information and their numbers of occurrences;
Means for calculating the distribution density of each region by counting the coordinates of the documents for each region having a certain area;
Means for assigning a display pattern corresponding to the distribution density to each region;
Means for generating a graphic corresponding to the distribution density of the documents by reflecting the display pattern of each region on the two-dimensional plane;
Means for calculating the coordinates of each keyword on the two-dimensional plane by performing a principal component analysis on the total number of documents including each keyword and the total number of occurrences of each keyword;
Means for placing each keyword on the two-dimensional plane according to its coordinates;
Means for generating a literature information analysis map including the graphic corresponding to the distribution density of the documents and the keywords;
Means for displaying the literature information analysis map on a display;
Means for displaying on the display a form for inputting a market size amount when a keyword displayed on the literature information analysis map is selected via input means and a selection to display the market size is made;
Means for calculating the dimensions of a market size corresponding figure by applying a predetermined conversion ratio to the amount input in the form via the input means; and
Means for generating a predetermined market size corresponding figure having the dimensions and displaying it in the vicinity of the keyword selected via the input means.

The literature information analysis system according to claim 1, further comprising:
Means for obtaining a weighted average, based on the number of documents in which each keyword appears, of the coordinates of a plurality of keywords selected via the input means, and thereby specifying a center coordinate between the keywords; and
Means for displaying the market size corresponding figure on the center coordinate.

The literature information analysis system according to claim 1, wherein the form has a field for entering a market category,
the system further comprising means for displaying the market category input via the input means in the vicinity of the market size corresponding figure.

The literature information analysis system according to any one of claims 1 to 3, wherein the form has a field for selectively entering an increasing or decreasing trend of the market size,
the system further comprising means for generating a market size corresponding figure having a color corresponding to the trend selected and input via the input means.

A literature information analysis system characterized by comprising:
Means for extracting a plurality of keywords from a plurality of pieces of digitized document information according to a predetermined standard;
Means for calculating the coordinates of each document on a two-dimensional plane by performing a principal component analysis on the combination of keywords included in each piece of document information and their numbers of occurrences;
Means for calculating the distribution density of each region by counting the coordinates of the documents for each region having a certain area;
Means for assigning a display pattern corresponding to the distribution density to each region;
Means for generating a graphic corresponding to the distribution density of the documents by reflecting the display pattern of each region on the two-dimensional plane;
Means for calculating the coordinates of each keyword on the two-dimensional plane by performing a principal component analysis on the total number of documents including each keyword and the total number of occurrences of each keyword;
Means for placing each keyword on the two-dimensional plane according to its coordinates;
Means for generating a literature information analysis map including the graphic corresponding to the distribution density of the documents and the keywords;
Means for displaying the literature information analysis map on a display;
Market-keyword correspondence storage means for defining a correspondence relationship between a plurality of keywords and a specific market category;
Market size amount storage means for storing market size amount data for each predetermined period for each market category;
Means for comparing each keyword displayed on the display with the keywords in the market-keyword correspondence storage means to identify one or a plurality of market categories;
Means for referring to the market size amount storage means and identifying the market size amount associated with each market category;
Means for calculating the dimensions of a market size corresponding figure by applying a predetermined conversion ratio to the amount; and
Means for generating a market size corresponding figure having the dimensions and displaying it in the vicinity of each keyword included in the market category.

The literature information analysis system according to claim 5, further comprising:
Means for determining an expanding or shrinking trend of the market size according to the increase or decrease of the market size amounts over a plurality of periods stored in the market size amount storage means; and
Means for generating a market size corresponding figure having a color corresponding to the trend.

The literature information analysis system according to claim 6, further comprising:
Means for obtaining a weighted average, based on the number of documents in which each keyword appears, of the coordinates of the keywords related to the market category, and thereby specifying a center coordinate between the keywords; and
Means for arranging the market size corresponding figure on the center coordinate.

The literature information analysis system according to claim 2 or 7, wherein the figure corresponding to the market size is a circle having a radius or a diameter corresponding to the dimension with the center coordinate as a center point.

A literature information analysis program characterized by causing a computer to function as:
Means for extracting a plurality of keywords from a plurality of pieces of digitized document information according to a predetermined standard;
Means for calculating the coordinates of each document on a two-dimensional plane by performing a principal component analysis on the combination of keywords included in each piece of document information and their numbers of occurrences;
Means for calculating the distribution density of each region by counting the coordinates of the documents for each region having a certain area;
Means for assigning a display pattern corresponding to the distribution density to each region;
Means for generating a graphic corresponding to the distribution density of the documents by reflecting the display pattern of each region on the two-dimensional plane;
Means for calculating the coordinates of each keyword on the two-dimensional plane by performing a principal component analysis on the total number of documents including each keyword and the total number of occurrences of each keyword;
Means for placing each keyword on the two-dimensional plane according to its coordinates;
Means for generating a literature information analysis map including the graphic corresponding to the distribution density of the documents and the keywords;
Means for displaying the literature information analysis map on a display;
Means for displaying on the display a form for inputting a market size amount when a keyword displayed on the literature information analysis map is selected via input means and a selection to display the market size is made;
Means for calculating the dimensions of a market size corresponding figure by applying a predetermined conversion ratio to the amount input in the form via the input means; and
Means for generating a predetermined market size corresponding figure having the dimensions and displaying it in the vicinity of the keyword selected via the input means.