JP6809338B2

JP6809338B2 - Information processing device, information processing method, and information processing program

Info

Publication number: JP6809338B2
Application number: JP2017069251A
Authority: JP
Inventors: 貴之福谷; 山本　雅洋; 雅洋山本; 禎秀尾芦
Original assignee: NEC Corp
Current assignee: NEC Corp
Priority date: 2017-03-30
Filing date: 2017-03-30
Publication date: 2021-01-06
Anticipated expiration: 2037-03-30
Also published as: JP2018169981A

Description

The present invention relates to an information processing device, an information processing method, and an information processing program, and in particular to ones having a keyword search function.

For example, there are Data Aware document databases that store related document files, such as files sharing the same keyword, in association with one another (Patent Documents 1 to 3). There is also a technique for searching such a document database for document files containing a search keyword entered by a user (Patent Document 4).

Patent Document 1: Japanese Patent No. 6002832
Patent Document 2: Japanese Unexamined Patent Publication No. 2009-277147
Patent Document 3: Japanese Unexamined Patent Publication No. 2005-339156
Patent Document 4: Japanese Unexamined Patent Publication No. 2003-178055

The following analysis is made from the viewpoint of the present invention. The disclosures of the prior art documents listed above are incorporated herein by reference.

The techniques disclosed in Patent Documents 1 and 2 have the problem that a file search takes a long time.

Specifically, in the document database of Patent Document 1, related document files are, for example, given the same tag and stored in a primary storage medium. The primary storage medium usually manages data in units of areas, and one area holds a plurality of document files. In the search technique of Patent Document 2, the tag corresponding to the search keyword is identified, and the area containing the document files carrying that tag is read from the primary storage medium into a cache memory.

The area read from the primary storage medium into the cache memory also contains document files unrelated to the search keyword. In the techniques disclosed in Patent Documents 1 and 2, transferring the data of these unrelated document files therefore delays the file search.

An object of the present invention is therefore to provide an information processing device, an information processing method, and an information processing program capable of searching for files in a short time.

According to a first aspect of the present invention, there is provided an information processing device including:
a storage medium that stores a plurality of files divided among a plurality of file system areas;
a cache memory that temporarily holds files stored in the storage medium;
a keyword management table that manages each file system area in association with keywords in the files stored in that file system area;
a keyword extraction unit that, when a file is written to the storage medium, extracts keywords from the file to be written;
a write processing unit that collates the keywords extracted by the keyword extraction unit with the keyword management table, determines the file system area corresponding to an extracted keyword as the write destination of the file to be written, and writes the file to the determined file system area;
a read processing unit that, when files containing a search keyword are searched for, refers to the keyword management table to detect the file system area corresponding to the search keyword, and reads the files stored in the detected file system area into the cache memory; and
a file identification unit that identifies, from among the files read into the cache memory, the files containing the search keyword.

According to a second aspect of the present invention, there is provided an information processing method including:
a keyword extraction step of extracting keywords from a file to be written when the file is written to a storage medium that stores a plurality of files divided among a plurality of file system areas;
a writing step of collating the keywords extracted in the keyword extraction step with a keyword management table that manages each file system area in association with keywords in the files stored in that file system area, determining the file system area corresponding to an extracted keyword as the write destination of the file to be written, and writing the file to the determined file system area;
a reading step of, when files containing a search keyword are searched for, referring to the keyword management table to detect the file system area corresponding to the search keyword, and reading the files stored in the detected file system area into a cache memory; and
a file identification step of identifying, from among the files read into the cache memory, the files containing the search keyword.

According to a third aspect of the present invention, there is provided a program that causes a computer to execute:
a keyword extraction process of extracting keywords from a file to be written when the file is written to a storage medium that stores a plurality of files divided among a plurality of file system areas;
a writing process of collating the keywords extracted by the keyword extraction process with a keyword management table that manages each file system area in association with keywords in the files stored in that file system area, determining the file system area corresponding to an extracted keyword as the write destination of the file to be written, and writing the file to the determined file system area;
a reading process of, when files containing a search keyword are searched for, referring to the keyword management table to detect the file system area corresponding to the search keyword, and reading the files stored in the detected file system area into a cache memory; and
a file identification process of identifying, from among the files read into the cache memory, the files containing the search keyword.

According to each aspect of the present invention, an information processing device, an information processing method, and an information processing program capable of searching for files in a short time are provided.

FIG. 1 is a diagram for explaining an outline of one embodiment.
FIG. 2 is a diagram for explaining an outline of one embodiment.
FIG. 3 is a diagram for explaining an outline of one embodiment.
FIG. 4 is a diagram showing an example of the configuration of the information processing device 10.
FIG. 5 is a diagram showing the information stored in the keyword management table 220.
FIG. 6 is a diagram showing an example of the keyword list 230.
FIG. 7 is a flowchart showing the flow of processing when a file is written.
FIG. 8 is a flowchart showing the flow of the file search process.

Preferred embodiments of the present invention are described in detail below with reference to the drawings. The reference numerals attached to the following description are added to each element for convenience, as an aid to understanding, and are not intended to limit the present invention to the illustrated forms. The connecting lines between blocks in each figure include both bidirectional and unidirectional connections, and are shown or omitted as appropriate for clarity of the drawings.

As shown in FIG. 1, the information processing device 10 has a primary storage medium 100 that stores a plurality of document files divided among a plurality of file system areas; for example, a separate file system area (FS#1 to FS#n) is assigned to each of a plurality of physical memories. In this disclosure, a file system area is an area of a storage medium that has been initialized with a particular file system, that is, a so-called logical volume.

The information processing device 10 also has a cache memory 210 and a keyword management table 220. The cache memory 210 is a secondary storage medium that temporarily holds document files stored in the primary storage medium 100. The keyword management table 220 is a table that manages each file system area in association with the keywords in the document files stored in that file system area.

The information processing device 10 further has a keyword extraction unit 310, a write processing unit 320, a read processing unit 330, and a file identification unit 340. The keyword extraction unit 310 extracts keywords from a file to be written when the file is written to the primary storage medium 100. The write processing unit 320 collates the extracted keywords with the keyword management table 220, determines the file system area corresponding to an extracted keyword as the write destination of the file, and writes the file to the determined file system area.

When files containing a search keyword are searched for, the read processing unit 330 refers to the keyword management table 220, detects the file system area associated with the search keyword, and reads the files stored in that area into the cache memory 210. The file identification unit 340 identifies, from among the files read into the cache memory 210, the files containing the search keyword.

Conceptually, as shown in FIGS. 2 and 3, one file system area stores a plurality of files, and these are associated, by tags, with the keywords stored in the keyword management table 220. When a file is written, as shown in FIG. 2, keywords are extracted from the file to be written, and the file is written to the file system area associated with the extracted keywords. When files are searched for, as shown in FIG. 3, all the files stored in the file system area associated with the search keyword are first read into the cache memory 210, and the files containing the search keyword are then identified from among the files read.

As a result, during a file search the information processing device 10 reads into the cache memory 210 the data of files that are highly relevant to the search keyword. In other words, because the information processing device 10 does not transfer the data of document files unrelated to the search keyword, the time required for the file search is shortened accordingly.

Preferred embodiments of the present invention are described in detail below with reference to the drawings.

[First Embodiment]
As shown in FIG. 4, the information processing device 10 is connected to a host terminal 20 and a search terminal 30, and has a primary storage medium 100, a file management memory 110, a storage unit 200, and a processing unit 300. The host terminal 20 is a terminal that sends a file to be written to the information processing device 10 and issues a write request. The search terminal 30 is a terminal that sends a search keyword to the information processing device 10 and issues a search request.

The primary storage medium 100 is a storage medium that stores a plurality of files divided among a plurality of file system areas (FS#1 to FS#n). Each file system area may be stored in its own physical memory, or a plurality of file system areas may be stored in a single physical memory. The file management memory 110 is a storage medium for temporarily managing files to be written that have been received from the host terminal 20.

The storage unit 200 is a storage medium that stores data, programs, and the like for executing the various processes in the information processing device 10, and has a cache memory 210, a keyword management table 220, and a keyword list 230. The cache memory 210 is, for example, an SRAM (Static Random Access Memory), and temporarily holds files stored in the primary storage medium 100.

The keyword management table 220 is a table that manages each file system area in association with the keywords in the files stored in that file system area. As a specific example, as shown in FIG. 5, the keyword management table 220 stores a keyword (term A, etc.) and a threshold for that keyword in association with the identifier of a file system area (FS#1, etc.). The threshold is, for example, the number of times the keyword appears in, or is extracted from, a single file, and is used in the file write processing and the file search processing as described below.

As shown in FIG. 6, the keyword list 230 is a list that associates each keyword with a threshold for that keyword, and is registered in advance by the administrator of the information processing device 10.
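The two structures described above can be pictured as plain mappings. The following is only a minimal sketch: the threshold values, the dictionary layout, and the helper name areas_for_keyword are assumptions made for illustration, since the actual contents of FIGS. 5 and 6 are not reproduced in this text.

```python
# Keyword list 230: keyword -> threshold, registered in advance by an administrator.
# The numeric thresholds are hypothetical.
keyword_list = {"term A": 5, "term b": 3}

# Keyword management table 220: file system area identifier -> {keyword: threshold}.
# The layout is an assumption; the source only says the three items are associated.
keyword_table = {
    "FS#1": {"term A": 5},
    "FS#2": {"term b": 3},
}


def areas_for_keyword(table, keyword):
    """Return the identifiers of the file system areas linked to a keyword."""
    return [fs for fs, keywords in table.items() if keyword in keywords]
```

With this layout, areas_for_keyword(keyword_table, "term A") yields the areas that a write or a search for "term A" would target.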

The processing unit 300 executes the processing of the information processing device 10 by reading data, programs, and the like from the storage unit 200 and executing them. Specifically, as shown in FIG. 4, the processing unit 300 has a keyword extraction unit 310, a write processing unit 320, a read processing unit 330, and a file identification unit 340.

The keyword extraction unit 310 extracts keywords from a file to be written when the file is written to the primary storage medium 100. Specifically, when a file to be written is stored in the file management memory 110, the keyword extraction unit 310 extracts the keywords enumerated in the keyword list 230 from that file and identifies the keywords whose extraction count is equal to or greater than their threshold.
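The extraction step can be sketched as a count-and-compare loop. This is an illustrative sketch only: the function name extract_keywords is hypothetical, and plain substring counting is assumed as the way of measuring the "extraction count", which the source does not specify.

```python
def extract_keywords(text, keyword_list):
    """Return {keyword: count} for listed keywords whose count meets the threshold.

    keyword_list maps each keyword to its threshold (keyword list 230).
    Counting uses str.count, i.e. non-overlapping substring matches (an assumption).
    """
    hits = {}
    for keyword, threshold in keyword_list.items():
        count = text.count(keyword)
        if count >= threshold:
            hits[keyword] = count
    return hits
```

For a file in which "term A" appears five times and "term b" once, with thresholds 5 and 3, only "term A" would be identified.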

The write processing unit 320 collates the extracted keywords with the keyword management table 220, determines the file system area to which the file is to be written, and writes the file to the determined file system area. Specifically, the write processing unit 320 determines whether or not a keyword identified by the keyword extraction unit 310 is stored in the keyword management table 220.

If the keyword is stored, the write processing unit 320 determines the file system area corresponding to that keyword as the write destination of the file. The write processing unit 320 then writes the file held in the file management memory 110 to the file system area determined as the write destination.

If the keyword is not stored, the write processing unit 320 creates an entry for a new file system area in the keyword management table 220 and reserves that file system area in the primary storage medium 100. The write processing unit 320 then registers, in the created entry, the keyword whose extraction count is equal to or greater than its threshold together with that threshold, and writes the file held in the file management memory 110 to the newly reserved file system area.

When files containing a search keyword are searched for, the read processing unit 330 refers to the keyword management table 220, detects the file system area associated with the search keyword, and reads the files stored in that area into the cache memory 210. Specifically, on receiving a search keyword from the search terminal 30, the read processing unit 330 refers to the keyword management table 220 and identifies the file system area associated with the search keyword. The read processing unit 330 then reads the files stored in the identified file system area from the primary storage medium 100 in a batch and stores them in the cache memory 210.

The file identification unit 340 identifies, from among the files read into the cache memory 210, the files containing the search keyword. Specifically, once files have been stored in the cache memory 210, the file identification unit 340 identifies the files containing the search keyword from among them and outputs information about the identified files (for example, their file names) to the search terminal 30.

The overall flow of processing by the information processing device 10, and the processing performed by each functional unit, are described below with reference to the drawings.

FIG. 7 is a flowchart showing the flow of processing when a file is written. When a file to be written is stored in the file management memory 110 (step S11, YES), the keyword extraction unit 310 extracts keywords from the file and identifies the keywords whose extraction count is equal to or greater than their threshold (step S12). The write processing unit 320 then determines whether or not an identified keyword is stored in the keyword management table 220 (step S13).

If the keyword is stored (step S13, YES), the write processing unit 320 writes the file held in the file management memory 110 to the file system area corresponding to that keyword (step S14). If the keyword is not stored (step S13, NO), the write processing unit 320 reserves a new file system area and writes the file to the reserved file system area (step S15).
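Steps S12 to S15 can be sketched as follows. This is a simplified illustration, not the patented implementation: the function name write_file, the dictionary used as a stand-in for the primary storage medium 100, the "FS#n" naming of new areas, and the handling of a file with no qualifying keyword (it also receives a new area here) are all assumptions.

```python
def write_file(name, text, keyword_list, table, storage):
    """Write a file to the area linked to one of its keywords; return the area id.

    keyword_list: keyword -> threshold (keyword list 230)
    table:        area id -> {keyword: threshold} (keyword management table 220)
    storage:      area id -> {file name: text} (stand-in for primary storage 100)
    """
    # S12: keywords whose count in the file meets their threshold.
    hits = [kw for kw, th in keyword_list.items() if text.count(kw) >= th]

    # S13 / S14: if an identified keyword is already in the table, reuse its area.
    for kw in hits:
        for fs, keywords in table.items():
            if kw in keywords:
                storage.setdefault(fs, {})[name] = text
                return fs

    # S15: otherwise reserve a new area and register the keywords with thresholds.
    n = len(table) + 1
    while f"FS#{n}" in table:
        n += 1
    new_fs = f"FS#{n}"
    table[new_fs] = {kw: keyword_list[kw] for kw in hits}
    storage[new_fs] = {name: text}
    return new_fs
```

In this sketch the first matching keyword decides the destination; how to choose among several matching areas is a separate design decision that the third embodiment discusses.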

FIG. 8 is a flowchart showing the flow of the file search process. As shown in FIG. 8, on receiving a search keyword (step S21, YES), the read processing unit 330 reads the files stored in the file system area associated with the search keyword and stores them in the cache memory 210 (step S22).

The file identification unit 340 then identifies the files containing the search keyword from among the files stored in the cache memory 210 (step S23), and outputs information about the identified files to the search terminal 30 (step S24).
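The search flow of steps S21 to S24 can be sketched in the same style. The function name search and the dictionaries standing in for the keyword management table 220, the primary storage medium 100, and the cache memory 210 are assumptions for illustration; substring matching is assumed as the test for "contains the search keyword".

```python
def search(keyword, table, storage):
    """Return the names of files containing the keyword, reading only linked areas.

    table:   area id -> {keyword: threshold} (keyword management table 220)
    storage: area id -> {file name: text} (stand-in for primary storage 100)
    """
    # S22: read every file of each area linked to the keyword into the cache.
    cache = {}
    for fs, keywords in table.items():
        if keyword in keywords:
            for name, text in storage.get(fs, {}).items():
                cache[(fs, name)] = text

    # S23 / S24: identify the cached files that contain the keyword.
    return sorted(name for (fs, name), text in cache.items() if keyword in text)
```

Note that areas not linked to the keyword are never read, which is the source of the claimed time saving; it also means a file stored in an unlinked area is not found even if it happens to contain the keyword.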

As described above, in the information processing device 10 of the first embodiment, files related to one another by a keyword are stored in the same file system area. During a file search, the files in the file system area corresponding to the search keyword are read into the cache memory 210. In other words, in the information processing device 10 of the first embodiment, many of the files read into the cache memory 210 contain the search keyword, so wasteful data transfer is suppressed and the time required for the file search is shortened accordingly.

[Second Embodiment]
In the first embodiment described above, keywords are registered in the keyword list 230 in advance by the administrator of the information processing device 10. Keywords can also be extracted from the files to be written that are received from the host terminal 20 and added to the keyword list 230.

For example, when a file to be written is stored in the file management memory 110, the keyword extraction unit 310 identifies, as new keywords, terms that appear in the file at least a predetermined number of times. The keyword extraction unit 310 then creates an entry for each new keyword in the keyword list 230 and registers the new keyword and its threshold. The threshold of a new keyword may be a preset value or may be determined from the number of occurrences. The administrator of the information processing device 10 may also be allowed to choose whether or not each new keyword is registered. Naturally, the administrator of the information processing device 10 may also be allowed to delete keywords registered in the keyword list 230.

The condition for identifying a new keyword may also be that the term appears at least the predetermined number of times and is among the top x terms. New keywords may also be extracted by search methods using known techniques, such as text search based on natural language processing, as in Patent Document 1 above, or image metadata search.
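The frequency-based identification of new keywords, including the optional "top x" condition, can be sketched as below. This is a minimal sketch under stated assumptions: the function name new_keywords is hypothetical, terms are obtained by a simple word-character split (Japanese text would need a morphological analyzer instead), and each new keyword's threshold is set to its own occurrence count, which is only one of the options the text allows.

```python
import re
from collections import Counter


def new_keywords(text, known, min_count, top_x=None):
    """Return {term: count} for unknown terms occurring at least min_count times.

    known:  keywords already in the keyword list 230
    top_x:  if given, keep only the x most frequent qualifying terms
    """
    counts = Counter(re.findall(r"\w+", text.lower()))
    candidates = [
        (term, count)
        for term, count in counts.most_common()  # most frequent first
        if count >= min_count and term not in known
    ]
    if top_x is not None:
        candidates = candidates[:top_x]
    return dict(candidates)
```

The returned mapping could be merged into the keyword list directly, or first shown to an administrator for approval, as the embodiment permits.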

[Third Embodiment]
Various modifications of the first and second embodiments are conceivable. These modifications are described below as a third embodiment.

(Adding keywords to the keyword management table 220)
In the first embodiment, a plurality of keywords may be extracted from a file to be written with extraction counts equal to or greater than their thresholds. In that case, even if a certain file system area is associated with one of the extracted keywords in the keyword management table 220, that file system area may not be associated with the other extracted keywords. In such a case, the other extracted keywords are added to the keyword management table 220 in association with that file system area.

As a specific example, when term A and term Z are extracted from a file to be written, term Z is added in association with FS#1 in the keyword management table 220 shown in FIG. 5. In this way, keywords can be added to the keyword management table 220 automatically.
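The automatic addition described above can be sketched as a small table update. The function name add_cooccurring_keywords and the table layout (area identifier mapped to a keyword-to-threshold dictionary) are assumptions for illustration, as are the threshold values.

```python
def add_cooccurring_keywords(table, extracted):
    """Link every extracted keyword to each area already linked to one of them.

    table:     area id -> {keyword: threshold} (keyword management table 220)
    extracted: keyword -> threshold, for keywords found in the file being written
    """
    for keywords in table.values():
        if any(kw in keywords for kw in extracted):
            for kw, threshold in extracted.items():
                # setdefault keeps an existing entry and adds only missing ones.
                keywords.setdefault(kw, threshold)
    return table
```

With a table that links FS#1 to term A, extracting term A and term Z from one file leaves FS#1 linked to both.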

(Selecting a file system area when writing a file)
In the first embodiment, a plurality of keywords may be extracted from a file to be written with extraction counts equal to or greater than their thresholds, and these keywords may correspond to different file system areas. As a specific example, term A and term b may each be extracted from a file with an extraction count equal to or greater than its respective threshold.

In this case, the write processing unit 320, referring to the keyword management table 220 of FIG. 5, determines the file system area (FS#1) corresponding to term A, which has the higher threshold, as the write destination of the file. Alternatively, the write processing unit 320 determines both file system areas (FS#1 and FS#2), corresponding to term A and term b, as write destinations of the file.
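The two selection policies can be sketched with a single function and a flag. The function name choose_areas, the write_to_all flag, and the threshold values are hypothetical; the source states only that either the area of the keyword with the higher threshold or both areas may be chosen.

```python
def choose_areas(extracted, table, write_to_all=False):
    """Pick write destinations when extracted keywords map to several areas.

    extracted: set of keywords identified in the file being written
    table:     area id -> {keyword: threshold} (keyword management table 220)
    """
    matches = [
        (threshold, fs)
        for fs, keywords in table.items()
        for kw, threshold in keywords.items()
        if kw in extracted
    ]
    if not matches:
        return []
    if write_to_all:
        # Alternative policy: write the file to every matching area.
        return sorted({fs for _, fs in matches})
    # Default policy: the area of the keyword with the highest threshold.
    return [max(matches)[1]]
```

Writing to every matching area trades extra storage for the ability to find the file from either keyword's area.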

(Selecting a file system area when searching for files)
In the first embodiment, the same keyword may be stored in the keyword management table 220 in association with a plurality of file system areas, so that a plurality of file system areas are identified during a file search. In this case, weights may be assigned to each file system area or keyword in advance and used to determine the order in which the file system areas have their files read into the cache memory 210.
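One way to picture the weighted ordering is to sort the matching areas by a per-area weight. This is only a sketch: the function name read_order, the per-area weights mapping, and the rule that a higher weight is read first are all assumptions, since the source does not say how the weights are defined or applied.

```python
def read_order(keyword, table, weights):
    """Return the areas linked to a keyword, highest weight first.

    table:   area id -> {keyword: threshold} (keyword management table 220)
    weights: area id -> weight; areas without a weight default to 0 (assumption)
    """
    areas = [fs for fs, keywords in table.items() if keyword in keywords]
    return sorted(areas, key=lambda fs: weights.get(fs, 0), reverse=True)
```

The read processing could then load the areas into the cache in this order.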

The various processes of the information processing device 10 are realized, for example, by a computer equipped with a processing module. The processing module is realized, for example, by a CPU executing a program stored in a memory. The program can be downloaded via a network, or updated using a storage medium on which the program is stored. The processing module may also be realized by a semiconductor chip. That is, it suffices that the functions performed by the processing module can be realized by some form of hardware and/or software.

Some or all of the above embodiments may also be described as in the following appendices, but are not limited to them.

（付記１）
複数のファイルを複数のファイルシステム領域に区分して記憶する記憶媒体と、
前記記憶媒体に記憶されたファイルを一時的に保持するキャッシュメモリと、
前記ファイルシステム領域と、前記ファイルシステム領域に格納されたファイルにおけるキーワードとを対応付けて管理するキーワード管理テーブルと、
前記記憶媒体へのファイルの書き込みの際に、当該書き込みファイルからキーワードを抽出するキーワード抽出部と、
前記キーワード抽出部によって抽出された抽出キーワードと、前記キーワード管理テーブルとを突き合わせて、前記抽出キーワードに対応するファイルシステム領域を前記書き込みファイルの書き込み先に決定し、決定したファイルシステム領域へ前記書き込みファイルを書き込む書込処理部と、
検索キーワードを含むファイルを検索する際に、前記キーワード管理テーブルを参照して、前記検索キーワードに対応するファイルシステム領域を検出し、検出したファイルシステム領域に格納されたファイルを前記キャッシュメモリに読み出す読出処理部と、
前記キャッシュメモリに読み出されたファイルの中から前記検索キーワードを含むファイルを特定するファイル特定部と、
を含む情報処理装置。
(Appendix 1)
An information processing device comprising:
a storage medium that stores a plurality of files divided among a plurality of file system areas;
a cache memory that temporarily holds files stored in the storage medium;
a keyword management table that manages each file system area in association with keywords in the files stored in that file system area;
a keyword extraction unit that extracts keywords from a write file when the file is written to the storage medium;
a write processing unit that checks the keywords extracted by the keyword extraction unit against the keyword management table, determines the file system area corresponding to the extracted keywords as the write destination of the write file, and writes the write file to the determined file system area;
a read processing unit that, when a file containing a search keyword is searched for, refers to the keyword management table, detects the file system area corresponding to the search keyword, and reads the files stored in the detected file system area into the cache memory; and
a file identification unit that identifies, from among the files read into the cache memory, a file containing the search keyword.
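To show how the units of Appendix 1 fit together, here is a toy Python model. It is a sketch under stated assumptions, not the patented implementation: the class name `KeywordStore`, the substring-based keyword matching, the in-memory dictionaries standing in for the storage medium and cache memory, and the rule of opening a new area when nothing matches are all hypothetical.

```python
class KeywordStore:
    """Toy model of the device in Appendix 1; all names are illustrative."""

    def __init__(self, keywords):
        self.keywords = list(keywords)  # keywords the extractor looks for
        self.areas = {}          # storage medium: area -> {file name: text}
        self.keyword_table = {}  # keyword management table: keyword -> [areas]
        self.cache = {}          # cache memory: file name -> text

    def extract_keywords(self, text):
        """Keyword extraction unit: listed keywords that occur in the text."""
        return [k for k in self.keywords if k in text]

    def write(self, name, text):
        """Write processing unit: route the file to a matching keyword's area."""
        extracted = self.extract_keywords(text)
        area = next((self.keyword_table[k][0] for k in extracted
                     if k in self.keyword_table), None)
        if area is None:
            # Assumption: with no matching area, open a new one.
            area = f"FS#{len(self.areas) + 1}"
        self.areas.setdefault(area, {})[name] = text
        for k in extracted:
            registered = self.keyword_table.setdefault(k, [])
            if area not in registered:
                registered.append(area)
        return area

    def search(self, keyword):
        """Read processing unit, then file identification unit."""
        self.cache = {}
        # Read only the areas registered for the keyword (file names assumed unique).
        for area in self.keyword_table.get(keyword, []):
            self.cache.update(self.areas[area])
        return sorted(n for n, t in self.cache.items() if keyword in t)


store = KeywordStore(["cache", "network"])
store.write("a.txt", "cache tuning notes")
store.write("b.txt", "network setup")
store.write("c.txt", "cache and network")
hits = store.search("network")
```

The point of the design is visible in `search`: only the areas the table associates with the keyword are loaded into the cache, rather than every area on the storage medium.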

（付記２）
キーワードを列挙するキーワードリストを更に備え、
前記キーワード抽出部は、前記書き込みファイルから所定回数以上の頻度で抽出された用語を新規キーワードとして特定し、前記キーワードリストに書き込む付記１に記載の情報処理装置。
(Appendix 2)
The information processing device according to Appendix 1, further comprising a keyword list that enumerates keywords, wherein the keyword extraction unit identifies, as a new keyword, a term extracted from the write file at least a predetermined number of times, and writes it to the keyword list.

（付記３）
前記キーワード抽出部は、前記キーワードリストに列挙されたキーワードを書き込みファイルから抽出する付記２に記載の情報処理装置。
(Appendix 3)
The information processing device according to Appendix 2, wherein the keyword extraction unit extracts, from the write file, the keywords enumerated in the keyword list.
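The keyword list behavior of Appendices 2 and 3 might be sketched as follows in Python. The function names, the `min_count` parameter, and the whitespace-style tokenization with `\w+` are assumptions for illustration; real Japanese text would need a proper word segmenter, which the patent does not specify.

```python
import re
from collections import Counter


def count_terms(text):
    """Count lowercase word tokens (a simplifying assumption for tokenization)."""
    return Counter(re.findall(r"\w+", text.lower()))


def update_keyword_list(text, keyword_list, min_count=3):
    """Appendix 2: add terms seen at least min_count times as new keywords."""
    counts = count_terms(text)
    new_keywords = [t for t, c in counts.items()
                    if c >= min_count and t not in keyword_list]
    keyword_list.extend(new_keywords)
    return new_keywords


def extract_listed_keywords(text, keyword_list):
    """Appendix 3: return listed keywords found in the text with their counts."""
    counts = count_terms(text)
    return {k: counts[k] for k in keyword_list if counts[k] > 0}


keyword_list = ["disk"]
text = "cache cache cache disk disk"
added = update_keyword_list(text, keyword_list)
found = extract_listed_keywords(text, keyword_list)
```

Here the term that reaches the assumed frequency of three is appended to the list, and a later extraction pass reports counts for every listed keyword present in the file.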

（付記４）
前記キーワード管理テーブルは、各キーワードに対応付けて抽出回数に関する閾値を記憶し、
前記書込処理部は、前記キーワード抽出部による抽出回数が前記閾値以上のキーワードに対応するファイルシステム領域を前記書き込みファイルの書き込み先として決定する付記１〜３のいずれか１つに記載の情報処理装置。
(Appendix 4)
The information processing device according to any one of Appendices 1 to 3, wherein the keyword management table stores, in association with each keyword, a threshold for the number of extractions, and the write processing unit determines, as the write destination of the write file, the file system area corresponding to a keyword whose number of extractions by the keyword extraction unit is equal to or greater than the threshold.

（付記５）
前記書込処理部は、前記キーワード抽出部による抽出回数が前記閾値以上のキーワードに対応するファイルシステム領域が存在しない場合には、前記記憶媒体内に新たなるファイルシステム領域を確保し、当該ファイルシステム領域を前記書き込みファイルの書き込み先として決定する付記４に記載の情報処理装置。
(Appendix 5)
The information processing device according to Appendix 4, wherein, when no file system area exists that corresponds to a keyword whose number of extractions by the keyword extraction unit is equal to or greater than the threshold, the write processing unit secures a new file system area in the storage medium and determines that file system area as the write destination of the write file.
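The threshold check of Appendix 4 and the fallback of Appendix 5 can be sketched together in Python. The function `decide_write_area`, the table layout, and the sequential `FS#n` naming of newly secured areas are hypothetical; the patent does not describe how a new area is named or sized.

```python
def decide_write_area(counts, table, existing_areas):
    """Pick a write destination, securing a new area if no keyword qualifies.

    counts: mapping of keyword -> extraction count for the write file.
    table: mapping of keyword -> (area label, threshold).
    existing_areas: list of area labels; a new label is appended when needed.
    Returns (area label, True if the area was newly secured).
    """
    for keyword, (area, threshold) in table.items():
        if counts.get(keyword, 0) >= threshold:
            return area, False
    # No keyword met its threshold: secure a new area (naming is an assumption).
    new_area = f"FS#{len(existing_areas) + 1}"
    existing_areas.append(new_area)
    return new_area, True


table = {"term A": ("FS#1", 10)}
areas = ["FS#1"]
matched = decide_write_area({"term A": 12}, table, areas)
fallback = decide_write_area({"term A": 3}, table, areas)
```

In this sketch the first call reuses the existing area because the count meets the threshold, while the second call falls through and secures a new area.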

（付記６）
複数のファイルを複数のファイルシステム領域に区分して記憶する記憶媒体へのファイルの書き込みの際に、当該書き込みファイルからキーワードを抽出するキーワード抽出ステップと、
前記ファイルシステム領域と前記ファイルシステム領域に格納されたファイルにおけるキーワードとを対応付けて管理するキーワード管理テーブルと、前記キーワード抽出ステップによって抽出された抽出キーワードとを突き合わせて前記抽出キーワードに対応するファイルシステム領域を前記書き込みファイルの書き込み先に決定し、決定したファイルシステム領域へ前記書き込みファイルを書き込む書込ステップと、
検索キーワードを含むファイルを検索する際に、前記キーワード管理テーブルを参照して、前記検索キーワードに対応するファイルシステム領域を検出し、検出したファイルシステム領域に格納されたファイルをキャッシュメモリに読み出す読出ステップと、
前記キャッシュメモリに読み出されたファイルの中から前記検索キーワードを含むファイルを特定するファイル特定ステップと、
を含む情報処理方法。
(Appendix 6)
An information processing method comprising:
a keyword extraction step of extracting keywords from a write file when the file is written to a storage medium that stores a plurality of files divided among a plurality of file system areas;
a write step of checking the keywords extracted in the keyword extraction step against a keyword management table that manages each file system area in association with keywords in the files stored in that file system area, determining the file system area corresponding to the extracted keywords as the write destination of the write file, and writing the write file to the determined file system area;
a read step of, when a file containing a search keyword is searched for, referring to the keyword management table, detecting the file system area corresponding to the search keyword, and reading the files stored in the detected file system area into a cache memory; and
a file identification step of identifying, from among the files read into the cache memory, a file containing the search keyword.

（付記７）
複数のファイルを複数のファイルシステム領域に区分して記憶する記憶媒体へのファイルの書き込みの際に、当該書き込みファイルからキーワードを抽出するキーワード抽出処理と、
前記ファイルシステム領域と前記ファイルシステム領域に格納されたファイルにおけるキーワードとを対応付けて管理するキーワード管理テーブルと、前記キーワード抽出処理によって抽出された抽出キーワードとを突き合わせて前記抽出キーワードに対応するファイルシステム領域を前記書き込みファイルの書き込み先に決定し、決定したファイルシステム領域へ前記書き込みファイルを書き込む書込処理と、
検索キーワードを含むファイルを検索する際に、前記キーワード管理テーブルを参照して、前記検索キーワードに対応するファイルシステム領域を検出し、検出したファイルシステム領域に格納されたファイルをキャッシュメモリに読み出す読出処理と、
前記キャッシュメモリに読み出されたファイルの中から前記検索キーワードを含むファイルを特定するファイル特定処理と、
をコンピュータに実行させるプログラム。
(Appendix 7)
A program that causes a computer to execute:
a keyword extraction process of extracting keywords from a write file when the file is written to a storage medium that stores a plurality of files divided among a plurality of file system areas;
a write process of checking the keywords extracted by the keyword extraction process against a keyword management table that manages each file system area in association with keywords in the files stored in that file system area, determining the file system area corresponding to the extracted keywords as the write destination of the write file, and writing the write file to the determined file system area;
a read process of, when a file containing a search keyword is searched for, referring to the keyword management table, detecting the file system area corresponding to the search keyword, and reading the files stored in the detected file system area into a cache memory; and
a file identification process of identifying, from among the files read into the cache memory, a file containing the search keyword.

なお、上記の特許文献の開示を、本書に引用をもって繰り込むものとする。本発明の全開示（請求の範囲を含む）の枠内において、さらにその基本的技術思想に基づいて、実施形態ないし実施例の変更・調整が可能である。また、本発明の請求の範囲の枠内において種々の開示要素（各請求項の各要素、各実施形態ないし実施例の各要素、各図面の各要素等を含む）の多様な組み合わせ、ないし選択が可能である。すなわち、本発明は、請求の範囲を含む全開示、技術的思想にしたがって当業者であればなし得るであろう各種変形、修正を含むことは勿論である。 The disclosures of the above patent documents are incorporated herein by reference. Within the framework of the entire disclosure of the present invention (including the claims), and based on its basic technical idea, the embodiments and examples may be changed and adjusted. Various combinations and selections of the disclosed elements (including each element of each claim, each element of each embodiment or example, and each element of each drawing) are also possible within the scope of the claims of the present invention. That is, the present invention naturally includes the variations and modifications that a person skilled in the art could make in accordance with the entire disclosure, including the claims, and the technical idea.

１０情報処理装置 10 Information processing device
２０ホスト端末 20 Host terminal
３０検索用端末 30 Search terminal
１００一次記憶媒体 100 Primary storage medium
１１０ファイル管理メモリ 110 File management memory
２００記憶部 200 Storage unit
２１０キャッシュメモリ 210 Cache memory
２２０キーワード管理テーブル 220 Keyword management table
２３０キーワードリスト 230 Keyword list
３００処理部 300 Processing unit
３１０キーワード抽出部 310 Keyword extraction unit
３２０書込処理部 320 Write processing unit
３３０読出処理部 330 Read processing unit
３４０ファイル特定部 340 File identification unit

Claims

1. An information processing device comprising:
a storage medium that stores a plurality of files divided among a plurality of file system areas;
a cache memory that temporarily holds files stored in the storage medium;
a keyword management table that manages each file system area in association with keywords in the files stored in that file system area;
a keyword extraction unit that extracts keywords from a write file when the file is written to the storage medium;
a write processing unit that checks the keywords extracted by the keyword extraction unit against the keyword management table, determines the file system area corresponding to the extracted keywords as the write destination of the write file, and writes the write file to the determined file system area;
a read processing unit that, when a file containing a search keyword is searched for, refers to the keyword management table, detects the file system area corresponding to the search keyword, and reads the files stored in the detected file system area into the cache memory; and
a file identification unit that identifies, from among the files read into the cache memory, a file containing the search keyword.

2. The information processing device according to claim 1, further comprising a keyword list that enumerates keywords, wherein the keyword extraction unit identifies, as a new keyword, a term extracted from the write file at least a predetermined number of times, and writes it to the keyword list.

3. The information processing device according to claim 2, wherein the keyword extraction unit extracts, from the write file, the keywords enumerated in the keyword list.

4. The information processing device according to any one of claims 1 to 3, wherein the keyword management table stores, in association with each keyword, a threshold for the number of extractions, and the write processing unit determines, as the write destination of the write file, the file system area corresponding to a keyword whose number of extractions by the keyword extraction unit is equal to or greater than the threshold.

5. The information processing device according to claim 4, wherein, when no file system area exists that corresponds to a keyword whose number of extractions by the keyword extraction unit is equal to or greater than the threshold, the write processing unit secures a new file system area in the storage medium and determines that file system area as the write destination of the write file.

6. An information processing method comprising:
a keyword extraction step of extracting keywords from a write file when the file is written to a storage medium that stores a plurality of files divided among a plurality of file system areas;
a write step of checking the keywords extracted in the keyword extraction step against a keyword management table that manages each file system area in association with keywords in the files stored in that file system area, determining the file system area corresponding to the extracted keywords as the write destination of the write file, and writing the write file to the determined file system area;
a read step of, when a file containing a search keyword is searched for, referring to the keyword management table, detecting the file system area corresponding to the search keyword, and reading the files stored in the detected file system area into a cache memory; and
a file identification step of identifying, from among the files read into the cache memory, a file containing the search keyword.

7. A program that causes a computer to execute:
a keyword extraction process of extracting keywords from a write file when the file is written to a storage medium that stores a plurality of files divided among a plurality of file system areas;
a write process of checking the keywords extracted by the keyword extraction process against a keyword management table that manages each file system area in association with keywords in the files stored in that file system area, determining the file system area corresponding to the extracted keywords as the write destination of the write file, and writing the write file to the determined file system area;
a read process of, when a file containing a search keyword is searched for, referring to the keyword management table, detecting the file system area corresponding to the search keyword, and reading the files stored in the detected file system area into a cache memory; and
a file identification process of identifying, from among the files read into the cache memory, a file containing the search keyword.