JP7397766B2

JP7397766B2 - Information processing device, information processing method and program

Info

Publication number: JP7397766B2
Application number: JP2020106490A
Authority: JP
Inventors: 駿侍新田; 諒也前沢; 剛光上野
Original assignee: 株式会社オービック
Priority date: 2020-06-19
Filing date: 2020-06-19
Publication date: 2023-12-13
Anticipated expiration: 2040-06-19
Also published as: JP2022002005A

Description

The present invention relates to an information processing device, an information processing method, and a program.

In recent years, it has become possible to automatically link electronic data (for example, data digitized by OCR (Optical Character Recognition), or data imported from outside, such as data loaded into a database via EDI (Electronic Data Interchange)) to application software for ERP (Enterprise Resource Planning). When external data is imported into ERP, "name matching" (an information processing operation) is required, which assigns to the external data the codes used within ERP (specifically, the values of primary key items in the various masters used within ERP). In other words, demand for name matching is increasing.

Conventional name matching systems are designed for use in one specific name matching situation. Examples include: a system that matches receivables information in order to search for receivables belonging to the same person; a system that matches software names in order to manage software correctly by absorbing notation variations caused by differing versions; a system that matches company names in CRM (Customer Relationship Management) in order to find a target company correctly from its name and surrounding information; a company-name matching system that assigns, to a company name on a form, the customer code (the value of the primary key item) in the customer master used within ERP; and a product-name matching system that assigns, to a product name on a purchase order, the product code (the value of the primary key item) in the product master used within ERP.

Note that Patent Document 1 discloses an insurance premium calculation system that allows insurance premium calculation formulas to be added, changed, and deleted without modifying source code.

Japanese Patent Application Publication No. 2013-065077

However, because ERP handles a wide variety of data, name matching situations are equally varied. Designing and developing a name matching system for each situation, as has been done conventionally, is therefore considered inefficient.

The present invention has been made in view of the above. Its object is to provide an information processing device, an information processing method, and a program that can create and execute, without coding, a name matching flow suited to a wide variety of name matching situations (specifically, combinations of name matching source data (e.g., form data) and name matching destination data (e.g., a master)), without a flow having to be designed and developed for each situation.

To solve the above problem and achieve the object, an information processing device according to the present invention is an information processing device including a control unit. The device can access a storage unit in which a method combination of a cleansing method and a similarity calculation method is stored in association with a destination item, which is an item of the name matching destination data, and in which an item combination set used when executing name matching, containing one or more item combinations of a source item (an item of the name matching source data) and a destination item, is stored in association with a template of the name matching source data. The control unit includes: receiving means for receiving name matching source data and name matching destination data; acquisition means for acquiring, from the storage unit, the item combination set associated with the template of the received name matching source data and, for each item combination in the acquired item combination set, acquiring from the storage unit the method combination associated with the destination item in that item combination; and name matching execution means for 1) executing, for each pairing of an item combination in the acquired item combination set with the acquired method combination, 11) a cleansing process, using the cleansing method in that method combination, on the value of the source item in that item combination held by the received name matching source data and on the value of the destination item in that item combination held by the received name matching destination data, and 12) a similarity calculation process, using the similarity calculation method in that method combination, on the two values after cleansing, 2) aggregating the similarities thus obtained, and 3) outputting, as a name matching result, information based on the aggregate value thus obtained together with the value, held by the name matching destination data, of an identification destination item, which is the destination item that uniquely identifies the name matching destination data.

Further, in the information processing device according to the present invention, the storage unit further stores a weighting value in association with each destination item; the acquisition means further acquires from the storage unit, for each item combination in the acquired item combination set, the weighting value associated with the destination item in that item combination; and the name matching execution means aggregates the similarities after multiplying each similarity by the corresponding acquired weighting value.

Further, in the information processing device according to the present invention, the name matching execution means compares the aggregate value with one threshold or with a plurality of mutually different thresholds, and outputs content corresponding to the comparison result as the information based on the aggregate value.
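The threshold comparison described above can be sketched as follows. This is only an illustration: the function name `grade`, the threshold values, and the labels "high", "medium", and "low" are assumptions, since the text does not specify the thresholds or the content that is output.

```python
def grade(total, thresholds, default):
    """Map an aggregate similarity to a label using one or more thresholds.

    thresholds: list of (lower_limit, label) pairs (hypothetical values).
    The highest limit that the total reaches decides the label.
    """
    for limit, label in sorted(thresholds, reverse=True):
        if total >= limit:
            return label
    return default


# Hypothetical thresholds and labels, not taken from the source.
THRESHOLDS = [(0.9, "high"), (0.6, "medium")]
```

With a single threshold the same function yields a two-way result; with several it yields graded content such as the degree-of-match indication.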

Further, in the information processing device according to the present invention, the name matching execution means further outputs, as a name matching result, the values of the destination items in the acquired item combination set that are held by the received name matching destination data.

Further, in the information processing device according to the present invention, the storage unit stores: 1) a method combination set containing a plurality of method combinations; 2) a destination item/method combination set family containing a plurality of destination item/method combination sets, one per name matching destination data, each set containing as many destination item/method combinations (each pairing a destination item with the identification information of a method combination) as there are destination items in that name matching destination data; 3) an item combination set family containing one or more item combination sets; 4) a first combination set family containing one or more first combination sets, each containing one or more combinations of the identification information of an item combination set and the identification information of a destination item/method combination set; and 5) a second combination set containing one or more combinations of the identification information of a first combination set and the identification information of a template of name matching source data. The acquisition means 1) acquires, from the second combination set, the identification information of the first combination set associated with the template identification information in the received name matching source data; 2) acquires, from the first combination set family, the first combination set specified by the acquired identification information; 3) acquires, from the item combination set family, the item combination set specified by the identification information of the item combination set in the acquired first combination set, and acquires, from the destination item/method combination set family, the destination item/method combination set specified by the identification information of the destination item/method combination set in the acquired first combination set; and 4) acquires, from the acquired destination item/method combination set, the method combinations associated with the destination items in the acquired item combination set.
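The four-step acquisition above amounts to a chain of lookups by identification information. The sketch below models each stored set as a plain dictionary; every identifier, item name, and method name in it ("T1", "F1", "customer_name", "nfkc", and so on) is a hypothetical placeholder, not a value from the source.

```python
# 1) method combination set: id -> (cleansing method, similarity method)
METHOD_COMBINATIONS = {
    "M1": ("nfkc", "sequence_ratio"),
    "M2": ("remove_spaces", "exact"),
}
# 2) destination item/method combination set family: set id -> {destination item: method id}
DEST_ITEM_METHOD_SETS = {"D1": {"customer_name": "M1", "address": "M2"}}
# 3) item combination set family: set id -> [(source item, destination item)]
ITEM_COMBINATION_SETS = {"I1": [("company", "customer_name"), ("addr", "address")]}
# 4) first combination set family: id -> [(item combination set id, dest/method set id)]
FIRST_COMBINATION_SETS = {"F1": [("I1", "D1")]}
# 5) second combination set: template id -> first combination set id
SECOND_COMBINATION_SET = {"T1": "F1"}


def resolve(template_id):
    """Return [(source item, destination item, method combination)] for a template."""
    first_id = SECOND_COMBINATION_SET[template_id]               # step 1
    plan = []
    for item_set_id, dest_set_id in FIRST_COMBINATION_SETS[first_id]:  # step 2
        item_pairs = ITEM_COMBINATION_SETS[item_set_id]          # step 3
        dest_methods = DEST_ITEM_METHOD_SETS[dest_set_id]        # step 3
        for src_item, dst_item in item_pairs:                    # step 4
            plan.append((src_item, dst_item, METHOD_COMBINATIONS[dest_methods[dst_item]]))
    return plan
```

Because methods hang off destination items only, the source items receive their methods indirectly through the item combinations.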

Further, in the information processing device according to the present invention, the control unit further includes: first setting means for having an operator set a method combination via a first setting screen that includes an area for setting a cleansing method and an area for setting a similarity calculation method; second setting means for having the operator set a destination item/method combination set via a second setting screen that includes an area for setting name matching destination data, an area for displaying the destination items in the set name matching destination data, and an area for setting the method combination to be applied to each destination item; and third setting means for having the operator set the association among an item combination set, a template of name matching source data, and a destination item/method combination set via a third setting screen that includes an area for setting a template of name matching source data, an area for setting name matching destination data, an area for setting a destination item/method combination set, and an area that displays the destination items in the set name matching destination data and the source items in the set template and in which an item combination set is set.

Further, in the information processing device according to the present invention, the storage unit further stores dictionary data in which a value of a source item in an item combination set, held by name matching source data, is registered in association with a value of the identification destination item held by name matching destination data. The control unit further includes dictionary-based name matching execution means that, when the value of the source item in the acquired item combination set held by the received name matching source data is the same as one registered in the dictionary data, outputs as a name matching result the value of the identification destination item registered in the dictionary data together with information indicating that the value is one registered in the dictionary data. The name matching execution means executes the processes 1) to 3) above when the value of the source item in the acquired item combination set held by the received name matching source data is not the same as any registered in the dictionary data.

Further, in the information processing device according to the present invention, the control unit further includes proposal information output means that, when a specific combination of a value of a source item in the acquired item combination set held by the received name matching source data and a value of the identification destination item held by the received name matching destination data has been recorded a predetermined number of times or more in the name matching execution means, outputs information proposing that the specific combination be registered in the dictionary data.

Further, in the information processing device according to the present invention, the name matching source data and the name matching destination data are transferred from application software for ERP (Enterprise Resource Planning), and the output is directed to the ERP application software from which the source data and destination data were transferred.

Further, in the information processing device according to the present invention, the name matching destination data is a master set up in application software for ERP.

Further, in the information processing device according to the present invention, the name matching source data is electronic data that has been digitized by OCR (Optical Character Recognition) or imported from outside.

Further, an information processing method according to the present invention is executed by an information processing device including a control unit. The device can access a storage unit in which a method combination of a cleansing method and a similarity calculation method is stored in association with a destination item, which is an item of the name matching destination data, and in which an item combination set used when executing name matching, containing one or more item combinations of a source item (an item of the name matching source data) and a destination item, is stored in association with a template of the name matching source data. The method includes the following steps executed by the control unit: a receiving step of receiving name matching source data and name matching destination data; an acquisition step of acquiring, from the storage unit, the item combination set associated with the template of the received name matching source data and, for each item combination in the acquired item combination set, acquiring from the storage unit the method combination associated with the destination item in that item combination; and a name matching execution step of 1) executing, for each pairing of an item combination in the acquired item combination set with the acquired method combination, 11) a cleansing process, using the cleansing method in that method combination, on the value of the source item in that item combination held by the received name matching source data and on the value of the destination item in that item combination held by the received name matching destination data, and 12) a similarity calculation process, using the similarity calculation method in that method combination, on the two values after cleansing, 2) aggregating the similarities thus obtained, and 3) outputting, as a name matching result, information based on the aggregate value thus obtained together with the value, held by the name matching destination data, of an identification destination item, which is the destination item that uniquely identifies the name matching destination data.

Further, a program according to the present invention is a program to be executed by an information processing device including a control unit. The device can access a storage unit in which a method combination of a cleansing method and a similarity calculation method is stored in association with a destination item, which is an item of the name matching destination data, and in which an item combination set used when executing name matching, containing one or more item combinations of a source item (an item of the name matching source data) and a destination item, is stored in association with a template of the name matching source data. The program causes the control unit to execute: a receiving step of receiving name matching source data and name matching destination data; an acquisition step of acquiring, from the storage unit, the item combination set associated with the template of the received name matching source data and, for each item combination in the acquired item combination set, acquiring from the storage unit the method combination associated with the destination item in that item combination; and a name matching execution step of 1) executing, for each pairing of an item combination in the acquired item combination set with the acquired method combination, 11) a cleansing process, using the cleansing method in that method combination, on the value of the source item in that item combination held by the received name matching source data and on the value of the destination item in that item combination held by the received name matching destination data, and 12) a similarity calculation process, using the similarity calculation method in that method combination, on the two values after cleansing, 2) aggregating the similarities thus obtained, and 3) outputting, as a name matching result, information based on the aggregate value thus obtained together with the value, held by the name matching destination data, of an identification destination item, which is the destination item that uniquely identifies the name matching destination data.

The present invention has the effect that a name matching flow suited to a wide variety of name matching situations (specifically, combinations of name matching source data (e.g., form data) and name matching destination data (e.g., a master)) can be created and executed without coding, and without a flow having to be designed and developed for each situation.

FIG. 1 is a diagram showing an overview of this embodiment.
FIG. 2 is a diagram showing an overview of this embodiment.
FIG. 3 is a diagram showing an overview of this embodiment.
FIG. 4 is a diagram showing an overview of this embodiment.
FIG. 5 is a diagram showing an overview of this embodiment.
FIG. 6 is a diagram showing an overview of this embodiment.
FIG. 7 is a diagram showing an overview of this embodiment.
FIG. 8 is a diagram showing an example of the configuration of the information processing device 100.
FIG. 9 is a diagram showing an example of the processing step data 106a.
FIG. 10 is a diagram showing an example of the cleansing application data 106b.
FIG. 11 is a diagram showing an example of the cleansing method master 106c.
FIG. 12 is a diagram showing an example of the similarity calculation method master 106d.
FIG. 13 is a diagram showing an example of the master-processing step mapping data 106e.
FIG. 14 is a diagram showing an example of the master-processing step mapping detail data 106f.
FIG. 15 is a diagram showing an example of the master list master 106g.
FIG. 16 is a diagram showing an example of the master item master 106h.
FIG. 17 is a diagram showing an example of the name matching method data 106i.
FIG. 18 is a diagram showing an example of the code assignment setting data 106j.
FIG. 19 is a diagram showing an example of the column mapping data 106k.
FIG. 20 is a diagram showing an example of the data template master 106m.
FIG. 21 is a diagram showing an example of the data template item master 106n.
FIG. 22 is a diagram showing an example of the code assignment dictionary condition data 106p.
FIG. 23 is a diagram showing an example of the code assignment dictionary assignment data 106q.
FIG. 24 is a diagram showing an example of the operation log data 106r.
FIG. 25 is a diagram showing an example of the modification log header data 106s.
FIG. 26 is a diagram showing an example of the modification log detail data 106t.
FIG. 27 is a diagram showing an example of the processing step setting screen MA.
FIG. 28 is a diagram showing an example of the master-processing step mapping setting screen MB.
FIG. 29 is a diagram showing an example of the name matching method setting screen MC.
FIG. 30 is a diagram showing an example of the name matching results.

Embodiments of the information processing device, information processing method, and program according to the present invention are described in detail below with reference to the drawings. Note that the present invention is not limited by these embodiments.

[1. Overview]
Here, an overview of this embodiment is described with reference to FIGS. 1 to 7.

Conventionally, a name matching system was created for each name matching situation. Because ERP involves a wide range of such situations, creating a system for each one was inefficient.

In this embodiment, an optimal name matching processing step is defined in advance for each item of each master. The operator therefore no longer needs to be aware of the internal processing of name matching; simply mapping items causes appropriate name matching to be implemented automatically. A processing step consists of cleansing and similarity calculation, and the cleansing method and similarity calculation method of each processing step can be selected according to the characteristics of the values it processes. This keeps development costs down and allows an appropriate name matching system to be created without coding.

FIG. 1 shows an abstracted flow of the name matching process performed in this embodiment. In this embodiment, the name matching process in each situation was abstracted and the processing flow was defined, which clarified the parts that must be customized for each situation. In the name matching process, item mapping is first performed between the received name matching source data and the received master; next, cleansing and similarity calculation are performed for each item; finally, the similarities are aggregated and a code is assigned.
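The three stages of this flow (item mapping, per-item cleansing and similarity calculation, aggregation and code assignment) can be sketched roughly as below. The concrete choices are assumptions made for illustration only: NFKC normalization with trimming and lowercasing stands in for a cleansing method, Python's `difflib.SequenceMatcher` ratio stands in for a similarity calculation method, and the item names and records are invented.

```python
import unicodedata
from difflib import SequenceMatcher


def cleanse(value: str) -> str:
    """Hypothetical cleansing method: fold full-width forms, trim, lowercase."""
    return unicodedata.normalize("NFKC", value).strip().lower()


def similarity(a: str, b: str) -> float:
    """Hypothetical similarity method: difflib ratio in [0, 1]."""
    return SequenceMatcher(None, a, b).ratio()


def match(source, master, item_pairs, key_item):
    """Return (code, aggregate similarity) of the best-scoring master record."""
    best = None
    for record in master:
        # Cleanse both values of each mapped item pair, score them, and sum.
        total = sum(
            similarity(cleanse(source[src]), cleanse(record[dst]))
            for src, dst in item_pairs
        )
        if best is None or total > best[1]:
            best = (record[key_item], total)
    return best


# Invented sample data: one source row and a two-record master.
source = {"company": "ＯＢＩＣ ", "addr": "Tokyo"}
master = [
    {"code": "A0001", "name": "ABC", "address": "Osaka"},
    {"code": "A0002", "name": "obic", "address": "tokyo"},
]
pairs = [("company", "name"), ("addr", "address")]  # item mapping
code, total = match(source, master, pairs, "code")
```

Here the second master record wins because both of its items become identical to the source values after cleansing.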

FIG. 2 shows an example of how processing steps are grouped and mapped. Because cleansing and similarity calculation differ by data type (for example, personal names or addresses), this embodiment groups the two together under the name "processing step" and creates a processing step for each data type. Mapping an appropriate processing step to each master item makes it possible to realize name matching suited to the master through mapping settings alone. Because the name matching system holds the master's item data, the mapping between the master and processing steps can be stored in the system, so the mapping does not have to be set each time name matching is executed. In addition, since mapping processing steps to the master also determines the processing steps for the source data, no mapping between source data and processing steps is needed.

FIG. 3 shows an example of how cleansing and similarity calculation are made into components. In this embodiment, componentization makes it easy to change what a processing step does, and allows processing steps to be added, cleansing and similarity calculation methods to be added, and processing steps to be customized efficiently. The components needed for each processing step (a cleansing method and a similarity calculation method) can be selected according to the usage situation.
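One way to picture this componentization is a pair of registries from which a processing step is assembled by name. The sketch below is an assumption about how such components could be organized, not the embodiment's actual structure; the registry names, the method names, and the helper `make_step` are all hypothetical.

```python
import unicodedata
from difflib import SequenceMatcher

# Hypothetical cleansing components, selectable by name.
CLEANSING = {
    "nfkc": lambda s: unicodedata.normalize("NFKC", s),
    "remove_spaces": lambda s: "".join(s.split()),
    "lowercase": str.lower,
}
# Hypothetical similarity components, selectable by name.
SIMILARITY = {
    "exact": lambda a, b: 1.0 if a == b else 0.0,
    "sequence_ratio": lambda a, b: SequenceMatcher(None, a, b).ratio(),
}


def make_step(cleansing_names, similarity_name):
    """Assemble a processing step from named components."""
    cleaners = [CLEANSING[name] for name in cleansing_names]
    score = SIMILARITY[similarity_name]

    def step(src: str, dst: str) -> float:
        # Apply every cleansing component to both values, then score them.
        for clean in cleaners:
            src, dst = clean(src), clean(dst)
        return score(src, dst)

    return step


# A step for company names; swapping a component only changes the names passed in.
company_step = make_step(["nfkc", "remove_spaces", "lowercase"], "exact")
```

Adding a new cleansing or similarity method then means adding one registry entry, and customizing a processing step means changing the list of names.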

FIG. 4 shows processing that uses the code assignment dictionary. Some data has notation variations peculiar to a situation, so that the correct code cannot be assigned, and running the name matching process on such data is inefficient. This embodiment therefore uses a code assignment dictionary whose records are pairs of an input data value (the value of an item of the name matching source data) and the code to assign (the value of the master's primary key item). This ensures that the correct code is assigned. Moreover, because a code can be assigned without running the name matching process, processing is faster. In other words, registering situation-specific notation variations in the dictionary enables accurate, fast name matching.
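The dictionary acts as a short circuit in front of the similarity-based process. A minimal sketch, assuming the dictionary is keyed by the tuple of source item values and that `fallback` is whatever similarity-based matcher is in use (both are illustrative assumptions, as are the sample values):

```python
def assign_code(source_values, dictionary, fallback):
    """Return (code, origin); the origin records how the code was obtained."""
    if source_values in dictionary:
        # Registered notation variation: assign directly, skip name matching.
        return dictionary[source_values], "dictionary"
    return fallback(source_values), "matched"


# Hypothetical dictionary record: input values -> code to assign.
CODE_DICTIONARY = {("OBIC", "Chuo-ku, Tokyo", "SIer"): "A0002"}

calls = []


def slow_matcher(values):
    """Stand-in for the similarity-based process; records that it was invoked."""
    calls.append(values)
    return "A0001"


hit = assign_code(("OBIC", "Chuo-ku, Tokyo", "SIer"), CODE_DICTIONARY, slow_matcher)
miss = assign_code(("ABC", "Osaka", "Retail"), CODE_DICTIONARY, slow_matcher)
```

Returning the origin alongside the code lets the result display distinguish dictionary-assigned codes from codes found by similarity.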

FIG. 5 shows an example of name matching results for name matching source data, and of the symbols that indicate the degree of match. In this embodiment, the name matching result displays the assigned code, the degree of match, and peripheral information corresponding to the code. The degree of match of the assigned code is displayed using the symbols shown in FIG. 5. Displaying dictionary-assigned codes differently in this way improves the interpretability of the name matching results.

FIG. 6 shows an example of the weighting values introduced into similarity aggregation. Name matching uses multiple items of the master and of the source data, but the items differ in importance. This embodiment therefore introduces weighting values and multiplies each similarity by its weighting value during aggregation, so that codes are assigned with the importance of each column (item) taken into account, which improves accuracy.
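As a small worked example of the weighted aggregation, the sketch below multiplies each item's similarity by its weighting value and sums the products. The item names, similarity values, and weights are invented, and a plain weighted sum is an assumption: the text does not say whether the total is normalized.

```python
def weighted_total(similarities, weights):
    """Sum of similarity x weighting value over all items."""
    return sum(similarities[item] * weights[item] for item in similarities)


# Hypothetical per-item similarities and weighting values.
similarities = {"name": 0.9, "address": 0.5}
weights = {"name": 2.0, "address": 1.0}
total = weighted_total(similarities, weights)  # 0.9 * 2.0 + 0.5 * 1.0
```

With these numbers a strong name match contributes 1.8 of the 2.3 total, so a candidate that matches only on address cannot outrank it.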

図７には、修正ログの一例が示されている。本実施形態では、操作ログと修正ログの２種類のログが取得できる。特定の入力と修正後付与コードの組のログが複数回記録された場合、この組をコード付与辞書に登録するよう自動的に提案する。この提案に対し、辞書登録の可否をオペレータが判断できる。例えば、図７に示すように、入力が「ＯＢＩＣ、東京都中央区、ＳＩｅｒ」であり、修正後付与コードが「Ａ０００２」という組のログが数回記録されている場合、「入力が『ＯＢＩＣ、東京都中央区、ＳＩｅｒ』であった場合は『Ａ０００２』というコードを付与する」というルールを提案し、登録の可否をオペレータに判断させる。また、処理工程を作成する際に既存の処理工程との精度比較を行う場合、比較の基準として修正ログを用いる。修正ログにより、既存の処理工程では修正が必要であった入力に対して、新しい処理工程では正確にコード付与できるかを評価する。 FIG. 7 shows an example of a modification log. In this embodiment, two types of logs can be acquired: operation logs and modification logs. When a pair consisting of a specific input and a corrected assigned code has been logged multiple times, the system automatically proposes registering that pair in the code assignment dictionary. The operator decides whether to accept the proposal. For example, as shown in FIG. 7, if the pair with the input "OBIC, Chuo-ku, Tokyo, SIer" and the corrected assigned code "A0002" has been logged several times, the system proposes the rule "if the input is 'OBIC, Chuo-ku, Tokyo, SIer', assign the code 'A0002'" and lets the operator decide whether to register it. In addition, when a new processing step is created and its accuracy is compared with that of an existing processing step, the modification log serves as the basis for comparison: it is used to evaluate whether the new processing step assigns the correct code to inputs that required correction under the existing one.
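The proposal rule can be illustrated with a short sketch that counts identical (input, corrected code) pairs in a modification log. The log representation, the function name `propose_dictionary_entries`, and the `min_count` parameter are hypothetical; the specification does not fix the number of occurrences that triggers a proposal.

```python
from collections import Counter
from typing import Iterable, List, Sequence, Tuple

# One modification log entry: (input values, corrected assigned code).
LogEntry = Tuple[Sequence[str], str]


def propose_dictionary_entries(
    log: Iterable[LogEntry],
    min_count: int = 2,
) -> List[Tuple[Tuple[str, ...], str]]:
    """Return the pairs logged at least `min_count` times.

    Each returned pair is only a candidate; the operator still decides
    whether it is registered in the code assignment dictionary.
    """
    counts = Counter((tuple(inputs), code) for inputs, code in log)
    return [pair for pair, n in counts.items() if n >= min_count]
```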

本実施形態によれば、データの種類をまとめた処理工程を保持することで、幅広い使用場面に対して名寄せを行うことができる。また、データの種類毎の処理工程の設定をマッピングのみで行え、コストを抑えることができる。また、「データ種類ごとに適した処理工程を設定すること」、「使用場面ごとに処理工程の部品を取り換えられること」および「表記ゆれをコード付与辞書で対応すること」により、精度を保った名寄せを行うことができる。また、「コード付与辞書による結果の表示を変えること」により、修正作業の補助として結果を役立てることができる。これらにより、人の手による修正作業を効率よく行うことができる。 According to this embodiment, holding processing steps organized by data type allows name matching to be performed in a wide range of usage situations. In addition, the processing step for each data type can be configured by mapping alone, which keeps costs down. Furthermore, name matching can be performed with its accuracy maintained by "setting a processing step suited to each data type", "allowing the components of a processing step to be swapped for each usage situation", and "handling notation variations with the code assignment dictionary". Moreover, "displaying results assigned by the code assignment dictionary differently" lets the results assist correction work. Together, these make manual correction work more efficient.

［２．構成・処理］
ここでは、本実施形態に係る情報処理装置の構成と処理の一例について、図８から図３０を参照して説明する。 [2. Configuration/Processing]
Here, an example of the configuration and processing of the information processing apparatus according to this embodiment will be described with reference to FIGS. 8 to 30.

図８は、情報処理装置（名寄せ処理装置）１００の構成の一例を示すブロック図である。情報処理装置１００は、市販のデスクトップ型パーソナルコンピュータである。なお、情報処理装置１００は、デスクトップ型パーソナルコンピュータのような据置型情報処理装置に限らず、市販されているノート型パーソナルコンピュータ、ＰＤＡ（ＰｅｒｓｏｎａｌＤｉｇｉｔａｌＡｓｓｉｓｔａｎｔｓ）、スマートフォン、タブレット型パーソナルコンピュータなどの携帯型情報処理装置であってもよい。 FIG. 8 is a block diagram showing an example of the configuration of the information processing device (name matching processing device) 100. Information processing device 100 is a commercially available desktop personal computer. Note that the information processing device 100 is not limited to a stationary information processing device such as a desktop personal computer, but may also be a portable type such as a commercially available notebook personal computer, a PDA (Personal Digital Assistant), a smartphone, or a tablet personal computer. It may also be an information processing device.

情報処理装置１００は、制御部１０２と通信インターフェース部１０４と記憶部１０６と入出力インターフェース部１０８と、を備えている。情報処理装置１００が備えている各部は、任意の通信路を介して通信可能に接続されている。 The information processing device 100 includes a control section 102, a communication interface section 104, a storage section 106, and an input/output interface section 108. Each unit included in the information processing device 100 is communicably connected via an arbitrary communication path.

通信インターフェース部１０４は、ルータ等の通信装置および専用線等の有線または無線の通信回線を介して、情報処理装置１００をネットワーク３００に通信可能に接続する。通信インターフェース部１０４は、他の装置と通信回線を介してデータを通信する機能を有する。ここで、ネットワーク３００は、情報処理装置１００とＥＲＰシステム２００（ＥＲＰに係るアプリケーションソフトウェアが導入された情報処理装置）とを相互に通信可能に接続する機能を有し、例えばインターネットやＬＡＮ（ＬｏｃａｌＡｒｅａＮｅｔｗｏｒｋ）等である。なお、情報処理装置１００は、ＥＲＰに係るアプリケーションソフトウェアが導入されたものであってもよい。 The communication interface unit 104 communicatively connects the information processing device 100 to the network 300 via a communication device such as a router and a wired or wireless communication line such as a dedicated line. The communication interface unit 104 has a function of communicating data with other devices via a communication line. Here, the network 300 has a function of connecting the information processing device 100 and the ERP system 200 (information processing device into which application software related to ERP has been installed) so that they can communicate with each other, for example, the Internet or LAN (Local Area Network) etc. Note that the information processing device 100 may be one in which application software related to ERP is installed.

入出力インターフェース部１０８には、入力装置１１２および出力装置１１４が接続されている。出力装置１１４には、モニタ（家庭用テレビを含む）の他、スピーカやプリンタを用いることができる。入力装置１１２には、キーボード、マウス、及びマイクの他、マウスと協働してポインティングデバイス機能を実現するモニタを用いることができる。なお、以下では、出力装置１１４をモニタ１１４とし、入力装置１１２をキーボード１１２またはマウス１１２として記載する場合がある。 An input device 112 and an output device 114 are connected to the input/output interface unit 108 . As the output device 114, in addition to a monitor (including a home television), a speaker or a printer can be used. As the input device 112, in addition to a keyboard, a mouse, and a microphone, a monitor that cooperates with the mouse to realize a pointing device function can be used. Note that in the following description, the output device 114 may be referred to as a monitor 114, and the input device 112 may be referred to as a keyboard 112 or a mouse 112.

記憶部１０６には、各種のデータベース、テーブルおよびファイルなどが格納される。記憶部１０６には、ＯＳ（ＯｐｅｒａｔｉｎｇＳｙｓｔｅｍ）と協働してＣＰＵ（ＣｅｎｔｒａｌＰｒｏｃｅｓｓｉｎｇＵｎｉｔ）に命令を与えて各種処理を行うためのコンピュータプログラムが記録される。記憶部１０６として、例えば、ＲＡＭ（ＲａｎｄｏｍＡｃｃｅｓｓＭｅｍｏｒｙ）・ＲＯＭ（ＲｅａｄＯｎｌｙＭｅｍｏｒｙ）等のメモリ装置、ハードディスクのような固定ディスク装置、フレキシブルディスク、および光ディスク等を用いることができる。 The storage unit 106 stores various databases, tables, files, and the like. The storage unit 106 stores a computer program for giving instructions to a CPU (Central Processing Unit) to perform various processes in cooperation with an OS (Operating System). As the storage unit 106, for example, a memory device such as a RAM (Random Access Memory) or a ROM (Read Only Memory), a fixed disk device such as a hard disk, a flexible disk, an optical disk, etc. can be used.

［２－１．記憶部１０６の構成の概要］
ここでは、記憶部１０６の構成の概要について、説明する。記憶部１０６には、大まかに、クレンジング手法と類似度計算手法との手法組み合わせが、名寄せ先データの項目である先項目と紐付けて格納され、且つ、名寄せ元データの項目である元項目と先項目との項目組み合わせを１つまたは複数含む名寄せ実行時に使用する項目組み合わせ集合が、名寄せ元データのテンプレートと紐付けて格納されている。 [2-1. Overview of configuration of storage unit 106]
Here, an overview of the configuration of the storage unit 106 will be explained. Broadly, the storage unit 106 stores a method combination, which pairs a cleansing method with a similarity calculation method, in association with a destination item, which is an item of the name matching destination data. It also stores an item combination set used when executing name matching, which contains one or more item combinations of a source item (an item of the name matching source data) and a destination item, in association with a template of the name matching source data.

ここで、記憶部１０６は、以下の［１１］から［１５］のデータが格納されたものでもよい。なお、記憶部１０６には、重み付け値が先項目と紐付けてさらに格納されてもよい。また、記憶部１０６には、以下の［１６］のデータがさらに格納されてもよい。
［１１］手法組み合わせを複数含む手法組み合わせ集合
［１２］先項目と手法組み合わせの識別情報との先項目・手法組み合わせを、名寄せ先データ中の先項目の個数分含む先項目・手法組み合わせ集合を、名寄せ先データ別に複数含む先項目・手法組み合わせ集合族
［１３］項目組み合わせ集合を１つまたは複数含む項目組み合わせ集合族
［１４］項目組み合わせ集合の識別情報と先項目・手法組み合わせ集合の識別情報との組み合わせを１つまたは複数含む第一組み合わせ集合を１つまたは複数含む第一組み合わせ集合族
［１５］第一組み合わせ集合の識別情報と名寄せ元データのテンプレートの識別情報との組み合わせを１つまたは複数含む第二組み合わせ集合
［１６］名寄せ元データが保持する、項目組み合わせ集合中の元項目に係る値と、名寄せ先データが保持する、名寄せ先データを一意に識別するための先項目である識別先項目（例：主キー項目）に係る値と、を紐付けて登録した辞書データ Here, the storage unit 106 may store the following data [11] to [15]. Note that the storage unit 106 may further store the weighting value in association with the previous item. Further, the storage unit 106 may further store the following data [16].
[11] A method combination set containing multiple method combinations
[12] A destination item/method combination set family containing multiple destination item/method combination sets, one for each name matching destination data; each set contains as many destination item/method combinations (pairs of a destination item and the identification information of a method combination) as there are destination items in that name matching destination data
[13] An item combination set family containing one or more item combination sets
[14] A first combination set family containing one or more first combination sets, each of which contains one or more combinations of the identification information of an item combination set and the identification information of a destination item/method combination set
[15] A second combination set containing one or more combinations of the identification information of a first combination set and the identification information of a template of the name matching source data
[16] Dictionary data in which the values of the source items in an item combination set, held by the name matching source data, are registered in association with the value of an identification destination item (e.g., a primary key item), held by the name matching destination data, which is the destination item that uniquely identifies the name matching destination data
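To make the relationships among [11] to [16] easier to follow, the sketch below models them as plain Python containers. The class names, field names, and sample identifiers (`P01`, `M01`, `C01`, `N01`, `T01`) are all hypothetical, and the choice of dictionaries keyed by identification information is an assumption of this example rather than the storage layout of the embodiment.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class MethodCombination:
    """One cleansing/similarity method combination (an element of [11])."""
    cleansing_methods: Tuple[str, ...]
    similarity_method: str


@dataclass
class Storage:
    # [11] method combination id -> method combination
    method_combinations: Dict[str, MethodCombination] = field(default_factory=dict)
    # [12] set id -> {destination item -> method combination id}
    dest_item_method_sets: Dict[str, Dict[str, str]] = field(default_factory=dict)
    # [13] set id -> list of (source item, destination item)
    item_combination_sets: Dict[str, List[Tuple[str, str]]] = field(default_factory=dict)
    # [14] first combination set id -> list of ([13] set id, [12] set id)
    first_combination_sets: Dict[str, List[Tuple[str, str]]] = field(default_factory=dict)
    # [15] template id -> first combination set id
    second_combination_set: Dict[str, str] = field(default_factory=dict)
    # [16] source item values -> identification destination item value
    dictionary: Dict[Tuple[str, ...], str] = field(default_factory=dict)


storage = Storage()
storage.method_combinations["P01"] = MethodCombination(("nfkc", "trim"), "edit_distance")
storage.dest_item_method_sets["M01"] = {"customer_name": "P01"}
storage.item_combination_sets["C01"] = [("company", "customer_name")]
storage.first_combination_sets["N01"] = [("C01", "M01")]
storage.second_combination_set["T01"] = "N01"
storage.dictionary[("OBIC",)] = "A0002"
```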

ここで、名寄せ元データは、例えば、発注書データ、見積書データ、その他の帳票データ（例えばＯＣＲにより電子化された、または外部取込された（例えばＥＤＩによりデータベースへ取り込まれた）電子データなど）である。名寄せ先データは、例えば、得意先マスタ、商品マスタ、その他のＥＲＰシステム２００内に設定されたマスタである。 Here, the source data includes, for example, purchase order data, quotation data, other form data (for example, electronic data digitized by OCR or externally imported (for example, imported into a database by EDI)), etc. ). The name identification data is, for example, a customer master, a product master, or another master set in the ERP system 200.

［２－２．記憶部１０６の構成の具体例］
ここでは、記憶部１０６の構成の具体例について、図９から図２６を参照して説明する。具体的には、記憶部１０６は、以下のデータおよびマスタが格納されたものである。
・処理工程データ１０６ａ
・クレンジング適用データ１０６ｂ
・クレンジング手法マスタ１０６ｃ
・類似度計算手法マスタ１０６ｄ
・マスタ－処理工程マッピングデータ１０６ｅ
・マスタ－処理工程マッピング明細データ１０６ｆ
・マスタ一覧マスタ１０６ｇ
・マスタ項目マスタ１０６ｈ
・名寄せ手法データ１０６ｉ
・コード付与設定データ１０６ｊ
・列マッピングデータ１０６ｋ
・データテンプレートマスタ１０６ｍ
・データテンプレート項目マスタ１０６ｎ
・コード付与辞書条件データ１０６ｐ
・コード付与辞書付与データ１０６ｑ
・操作ログデータ１０６ｒ
・修正ログヘッダデータ１０６ｓ
・修正ログ明細データ１０６ｔ [2-2. Specific example of configuration of storage unit 106]
Here, a specific example of the configuration of the storage unit 106 will be described with reference to FIGS. 9 to 26. Specifically, the storage unit 106 stores the following data and masters.
・Processing process data 106a
・Cleansing application data 106b
・Cleansing method master 106c
・Similarity calculation method master 106d
・Master-processing process mapping data 106e
・Master-processing process mapping detail data 106f
・Master list master 106g
・Master item master 106h
・Name identification method data 106i
・Code assignment setting data 106j
・Column mapping data 106k
・Data template master 106m
・Data template item master 106n
・Code assignment dictionary condition data 106p
・Code assignment dictionary assignment data 106q
・Operation log data 106r
・Correction log header data 106s
・Correction log detail data 106t

図９は、処理工程データ１０６ａの一例を示す図である。処理工程データ１０６ａは、処理工程識別情報（手法組み合わせの識別情報に相当）、類似度計算手法識別情報（手法組み合わせ中の類似度計算手法の識別情報に相当）およびデータ生成日付を含む。 FIG. 9 is a diagram showing an example of the processing process data 106a. The processing process data 106a includes processing process identification information (corresponding to the identification information of the method combination), similarity calculation method identification information (corresponding to the identification information of the similarity calculation method in the method combination), and data generation date.

図１０は、クレンジング適用データ１０６ｂの一例を示す図である。クレンジング適用データ１０６ｂは、処理工程識別情報（手法組み合わせの識別情報に相当）、実行順、クレンジング手法識別情報（手法組み合わせ中のクレンジング手法の識別情報に相当）、辞書データへのパスおよびデータ生成日付を含む。 FIG. 10 is a diagram showing an example of cleansing application data 106b. The cleansing application data 106b includes processing step identification information (corresponding to the identification information of the method combination), execution order, cleansing method identification information (corresponding to the identification information of the cleansing method in the method combination), the path to dictionary data, and the data generation date.

図１１は、クレンジング手法マスタ１０６ｃの一例を示す図である。クレンジング手法マスタ１０６ｃは、クレンジング手法識別情報（クレンジング手法の識別情報に相当）および辞書フラグを含む。 FIG. 11 is a diagram showing an example of the cleansing method master 106c. The cleansing method master 106c includes cleansing method identification information (corresponding to cleansing method identification information) and a dictionary flag.

図１２は、類似度計算手法マスタ１０６ｄの一例を示す図である。類似度計算手法マスタ１０６ｄは、類似度計算手法識別情報（類似度計算手法の識別情報に相当）およびオプションフラグを含む。 FIG. 12 is a diagram showing an example of the similarity calculation method master 106d. The similarity calculation method master 106d includes similarity calculation method identification information (corresponding to similarity calculation method identification information) and option flags.

図１３は、マスタ－処理工程マッピングデータ１０６ｅの一例を示す図である。マスタ－処理工程マッピングデータ１０６ｅは、マスタ処理工程マッピング識別情報（先項目・手法組み合わせ集合の識別情報に相当）、マスタ識別情報（名寄せ先データの識別情報に相当）およびデータ生成日付を含む。 FIG. 13 is a diagram showing an example of the master-processing process mapping data 106e. The master processing step mapping data 106e includes master processing step mapping identification information (corresponding to the identification information of the previous item/method combination set), master identification information (corresponding to the identification information of the collation destination data), and data generation date.

図１４は、マスタ－処理工程マッピング明細データ１０６ｆの一例を示す図である。マスタ－処理工程マッピング明細データ１０６ｆは、先項目・手法組み合わせ集合族に相当する。マスタ－処理工程マッピング明細データ１０６ｆは、マスタ処理工程マッピング識別情報（先項目・手法組み合わせ集合の識別情報に相当）、マスタ項目識別情報（先項目・手法組み合わせ中の先項目の識別情報に相当）、処理工程識別情報（先項目・手法組み合わせ中の手法組み合わせの識別情報に相当）、重み付け値およびデータ生成日付を含む。 FIG. 14 is a diagram showing an example of master-processing process mapping detail data 106f. The master-processing process mapping detail data 106f corresponds to the previous item/method combination set family. The master processing process mapping detail data 106f includes master processing process mapping identification information (corresponding to the identification information of the previous item/method combination set), master item identification information (corresponding to the identification information of the previous item in the previous item/method combination) , processing step identification information (corresponding to the identification information of the method combination in the previous item/method combination), weighting value, and data generation date.

図１５は、マスタ一覧マスタ１０６ｇの一例を示す図である。マスタ一覧マスタ１０６ｇは、マスタ識別情報（名寄せ先データの識別情報に相当）を含む。 FIG. 15 is a diagram showing an example of the master list master 106g. The master list master 106g includes master identification information (corresponding to identification information of the name identification data).

図１６は、マスタ項目マスタ１０６ｈの一例を示す図である。マスタ項目マスタ１０６ｈは、マスタ項目識別情報（先項目の識別情報に相当）、マスタ識別情報（名寄せ先データの識別情報に相当）および主キー設定情報を含む。 FIG. 16 is a diagram showing an example of the master item master 106h. The master item master 106h includes master item identification information (corresponding to the identification information of the previous item), master identification information (corresponding to the identification information of the data to be collated), and primary key setting information.

図１７は、名寄せ手法データ１０６ｉの一例を示す図である。名寄せ手法データ１０６ｉは、第二組み合わせ集合に相当する。名寄せ手法データ１０６ｉは、名寄せ手法識別情報（第一組み合わせ集合の識別情報に相当）、データテンプレート識別情報（名寄せ元データのテンプレートの識別情報に相当）およびデータ生成日付を含む。 FIG. 17 is a diagram showing an example of the name matching method data 106i. The name matching method data 106i corresponds to a second combination set. The name matching method data 106i includes name matching method identification information (corresponding to the identification information of the first combination set), data template identification information (corresponding to the identification information of the template of the name matching source data), and data generation date.

図１８は、コード付与設定データ１０６ｊの一例を示す図である。コード付与設定データ１０６ｊは、第一組み合わせ集合族に相当する。コード付与設定データ１０６ｊは、名寄せ手法データ識別情報（項目組み合わせ集合の識別情報に相当）、名寄せ手法識別情報（第一組み合わせ集合の識別情報に相当）、マスタ処理工程マッピング識別情報（先項目・手法組み合わせ集合の識別情報に相当）およびデータ生成日付を含む。 FIG. 18 is a diagram showing an example of the code assignment setting data 106j. The code assignment setting data 106j corresponds to the first combination set family. The code assignment setting data 106j includes name matching method data identification information (corresponding to the identification information of the item combination set), name matching method identification information (corresponding to the identification information of the first combination set), master processing process mapping identification information (previous item/method (equivalent to the identification information of the combination set) and the data generation date.

図１９は、列マッピングデータ１０６ｋの一例を示す図である。列マッピングデータ１０６ｋは、項目組み合わせ集合族に相当する。名寄せ手法データ識別情報（項目組み合わせ集合の識別情報に相当）、マスタ項目識別情報（項目組み合わせ中の先項目の識別情報に相当）およびデータ項目識別情報（項目組み合わせ中の元項目の識別情報に相当）を含む。 FIG. 19 is a diagram showing an example of column mapping data 106k. The column mapping data 106k corresponds to the item combination set family. It includes name matching method data identification information (corresponding to the identification information of an item combination set), master item identification information (corresponding to the identification information of the destination item in an item combination), and data item identification information (corresponding to the identification information of the source item in an item combination).

図２０は、データテンプレートマスタ１０６ｍの一例を示す図である。データテンプレートマスタ１０６ｍは、データテンプレート識別情報（名寄せ元データのテンプレートの識別情報に相当）を含む。 FIG. 20 is a diagram showing an example of the data template master 106m. The data template master 106m includes data template identification information (corresponding to identification information of the template of the name matching source data).

図２１は、データテンプレート項目マスタ１０６ｎの一例を示す図である。データテンプレート項目マスタ１０６ｎは、データ項目識別情報（元項目の識別情報に相当）およびデータテンプレート識別情報（名寄せ元データのテンプレートの識別情報に相当）を含む。 FIG. 21 is a diagram showing an example of the data template item master 106n. The data template item master 106n includes data item identification information (corresponding to the identification information of the original item) and data template identification information (corresponding to the identification information of the template of the source data).

図２２は、コード付与辞書条件データ１０６ｐの一例を示す図である。コード付与辞書条件データ１０６ｐは、コード付与辞書レコード識別情報、データ項目識別情報および当該データ項目識別情報で特定される項目に係るコード値を含む。 FIG. 22 is a diagram showing an example of the code assignment dictionary condition data 106p. The code assignment dictionary condition data 106p includes code assignment dictionary record identification information, data item identification information, and a code value related to the item specified by the data item identification information.

図２３は、コード付与辞書付与データ１０６ｑの一例を示す図である。コード付与辞書付与データ１０６ｑは、コード付与辞書レコード識別情報、マスタ識別情報および当該マスタ識別情報で特定されるマスタが保持する主キー項目に係るコード値を含む。 FIG. 23 is a diagram showing an example of code assignment dictionary assignment data 106q. The code assignment dictionary assignment data 106q includes code assignment dictionary record identification information, master identification information, and a code value related to a primary key item held by a master specified by the master identification information.

図２４は、操作ログデータ１０６ｒの一例を示す図である。操作ログデータ１０６ｒは、操作ログ識別情報、データ名識別情報、名寄せ手法識別情報（第一組み合わせ集合の識別情報に相当）およびデータ生成日付を含む。 FIG. 24 is a diagram showing an example of the operation log data 106r. The operation log data 106r includes operation log identification information, data name identification information, name matching method identification information (corresponding to identification information of the first combination set), and data generation date.

図２５は、修正ログヘッダデータ１０６ｓの一例を示す図である。修正ログヘッダデータ１０６ｓは、修正ログ識別情報、操作ログ識別情報、名寄せ手法データ識別情報（項目組み合わせ集合の識別情報に相当）、修正前コード値、修正後コード値およびデータ生成日付を含む。 FIG. 25 is a diagram showing an example of the modified log header data 106s. The modification log header data 106s includes modification log identification information, operation log identification information, name identification method data identification information (corresponding to identification information of an item combination set), a pre-modification code value, a post-modification code value, and a data generation date.

図２６は、修正ログ明細データ１０６ｔの一例を示す図である。修正ログ明細データ１０６ｔは、修正ログ識別情報、データ項目識別情報（項目組み合わせ中の元項目の識別情報に相当）、当該データ項目識別情報で特定される項目に係るコード値およびデータ生成日付を含む。 FIG. 26 is a diagram showing an example of modification log detail data 106t. The modification log detail data 106t includes modification log identification information, data item identification information (corresponding to the identification information of the source item in an item combination), a code value of the item specified by the data item identification information, and a data generation date.

以上で、記憶部１０６の構成の具体例についての説明を終了する。 This concludes the description of the specific example of the configuration of the storage unit 106.

図８に戻り、制御部１０２は、情報処理装置１００を統括的に制御するＣＰＵ等である。制御部１０２は、ＯＳ等の制御プログラム・各種の処理手順等を規定したプログラム・所要データなどを格納するための内部メモリを有し、格納されているこれらのプログラムに基づいて種々の情報処理を実行する。 Returning to FIG. 8, the control unit 102 is a CPU or the like that centrally controls the information processing device 100. The control unit 102 has an internal memory for storing control programs such as an OS, programs specifying various processing procedures, required data, etc., and performs various information processing based on these stored programs. Execute.

制御部１０２は、機能概念的に、第一設定部１０２ａ、第二設定部１０２ｂ、第三設定部１０２ｃ、受取部１０２ｄ、取得部１０２ｅ、辞書使用名寄せ実行部１０２ｆ、名寄せ実行部１０２ｇおよび提案情報出力部１０２ｈなどを備える。 Functionally and conceptually, the control unit 102 includes a first setting unit 102a, a second setting unit 102b, a third setting unit 102c, a receiving unit 102d, an acquisition unit 102e, a dictionary-using name matching execution unit 102f, a name matching execution unit 102g, a proposal information output unit 102h, and the like.

［２－３．制御部１０２が備える各処理部が実行する処理の概要］
ここでは、制御部１０２が備える各処理部が実行する処理の概要について、説明する。 [2-3. Overview of processing executed by each processing unit included in the control unit 102]
Here, an overview of the processing executed by each processing unit included in the control unit 102 will be explained.

第一設定部１０２ａは、第一の設定画面を介して、オペレータに、手法組み合わせを設定させる。第一の設定画面は、クレンジング手法を設定させるための領域と、類似度計算手法を設定させるための領域とを含む。 The first setting unit 102a allows the operator to set the method combination via the first setting screen. The first setting screen includes an area for setting a cleansing method and an area for setting a similarity calculation method.

第二設定部１０２ｂは、第二の設定画面を介して、オペレータに、先項目・手法組み合わせ集合を設定させる。第二の設定画面は、名寄せ先データを設定させるための領域と、設定された名寄せ先データ中の先項目を表示させるための領域と、先項目に適用する手法組み合わせを設定させるための領域と、を含む。 The second setting unit 102b lets the operator set a destination item/method combination set via a second setting screen. The second setting screen includes an area for setting the name matching destination data, an area for displaying the destination items in the set name matching destination data, and an area for setting the method combination to be applied to a destination item.

第三設定部１０２ｃは、第三の設定画面を介して、オペレータに、項目組み合わせ集合、名寄せ元データのテンプレートおよび先項目・手法組み合わせ集合の紐付けを設定させる。第三の設定画面は、名寄せ元データのテンプレートを設定させるための領域と、名寄せ先データを設定させるための領域と、先項目・手法組み合わせ集合を設定させるための領域と、設定された名寄せ先データ中の先項目と設定されたテンプレート中の元項目とを表示させ、項目組み合わせ集合を設定させるための領域と、を含む。 The third setting unit 102c lets the operator set, via a third setting screen, the association among an item combination set, a template of the name matching source data, and a destination item/method combination set. The third setting screen includes an area for setting the template of the name matching source data, an area for setting the name matching destination data, an area for setting the destination item/method combination set, and an area that displays the destination items in the set name matching destination data alongside the source items in the set template and lets the operator set the item combination set.

受取部１０２ｄは、名寄せ元データと名寄せ先データを受け取る。受取部１０２ｄは、ＥＲＰシステム２００から転送された名寄せ元データと名寄せ先データを受け取ってもよい。 The receiving unit 102d receives the name matching source data and the name matching destination data. The receiving unit 102d may receive name matching source data and name matching destination data transferred from the ERP system 200.

取得部１０２ｅは、受取部１０２ｄで受け取った名寄せ元データのテンプレートに紐付く項目組み合わせ集合を、記憶部１０６から取得し、取得した項目組み合わせ集合中の項目組み合わせごとに、当該項目組み合わせ中の先項目に紐付く手法組み合わせを、記憶部１０６から取得する。 The acquisition unit 102e acquires from the storage unit 106 the item combination set associated with the template of the name matching source data received by the receiving unit 102d and, for each item combination in the acquired item combination set, acquires from the storage unit 106 the method combination associated with the destination item in that item combination.

取得部１０２ｅは、取得した項目組み合わせ集合中の項目組み合わせごとに、当該項目組み合わせ中の先項目に紐付く重み付け値を、記憶部１０６からさらに取得してもよい。 The acquisition unit 102e may further acquire from the storage unit 106, for each item combination in the acquired item combination set, the weighting value associated with the destination item in that item combination.

取得部１０２ｅは、以下の［２１］から［２４］の処理を実行してもよい。
［２１］受取部１０２ｄで受け取った名寄せ元データ中のテンプレートの識別情報に紐付く第一組み合わせ集合の識別情報を、第二組み合わせ集合から取得する。
［２２］［２１］で取得した第一組み合わせ集合の識別情報で特定される第一組み合わせ集合を、第一組み合わせ集合族から取得する。
［２３］［２２］で取得した第一組み合わせ集合中の項目組み合わせ集合の識別情報で特定される項目組み合わせ集合を、項目組み合わせ集合族から取得するとともに、当該取得した第一組み合わせ集合中の先項目・手法組み合わせ集合の識別情報で特定される先項目・手法組み合わせ集合を、先項目・手法組み合わせ集合族から取得する。
［２４］［２３］で取得した先項目・手法組み合わせ集合から、［２３］で取得した項目組み合わせ集合中の先項目に紐付く手法組み合わせを取得する。 The acquisition unit 102e may execute the following processes [21] to [24].
[21] Acquire, from the second combination set, the identification information of the first combination set associated with the template identification information in the name matching source data received by the receiving unit 102d.
[22] Acquire, from the first combination set family, the first combination set specified by the identification information acquired in [21].
[23] Acquire, from the item combination set family, the item combination set specified by the identification information of the item combination set in the first combination set acquired in [22], and acquire, from the destination item/method combination set family, the destination item/method combination set specified by the identification information of the destination item/method combination set in that first combination set.
[24] From the destination item/method combination set acquired in [23], acquire the method combination associated with each destination item in the item combination set acquired in [23].
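Steps [21] to [24] amount to a chain of lookups, which can be sketched as below. This is an illustration only: the function name `acquire`, its parameter names, and the use of dictionaries keyed by identification information are assumptions, not the data layout of the embodiment.

```python
from typing import Dict, List, Tuple


def acquire(
    template_id: str,
    second_combination_set: Dict[str, str],
    first_combination_family: Dict[str, List[Tuple[str, str]]],
    item_combination_family: Dict[str, List[Tuple[str, str]]],
    dest_method_family: Dict[str, Dict[str, str]],
) -> List[Tuple[str, str, str]]:
    """Return (source item, destination item, method combination id) triples."""
    # [21] template id -> identification information of the first combination set
    first_id = second_combination_set[template_id]
    # [22] first combination set: pairs of (item combination set id,
    #      destination item/method combination set id)
    first_set = first_combination_family[first_id]

    triples = []
    for item_set_id, dest_method_set_id in first_set:
        # [23] resolve both sets from their families
        item_combinations = item_combination_family[item_set_id]
        dest_methods = dest_method_family[dest_method_set_id]
        # [24] method combination associated with each destination item
        for source_item, dest_item in item_combinations:
            triples.append((source_item, dest_item, dest_methods[dest_item]))
    return triples
```

A missing key raises `KeyError` in this sketch; how the embodiment handles an unregistered template is not stated in the text.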

辞書使用名寄せ実行部１０２ｆは、「受取部１０２ｄで受け取った名寄せ元データが保持する、取得部１０２ｅが取得した項目組み合わせ集合（具体的には［２３］で取得した項目組み合わせ集合）中の元項目に係る値」が、辞書データに登録されたものと同じである場合に、辞書データに登録された識別先項目（例：主キー項目）に係る値と、当該値が辞書データに登録されたものであることを示す情報（例：「辞」という文字情報）を、名寄せ結果として、例えば名寄せ元データと名寄せ先データの転送元のＥＲＰシステム２００を出力先として出力する。 The dictionary-using name matching execution unit 102f operates when "the values of the source items in the item combination set acquired by the acquisition unit 102e (specifically, the item combination set acquired in [23]), held by the name matching source data received by the receiving unit 102d" are the same as those registered in the dictionary data. In that case, it outputs, as the name matching result, the value of the identification destination item (e.g., a primary key item) registered in the dictionary data together with information indicating that the value was registered in the dictionary data (e.g., the character "辞", the first character of 辞書, "dictionary"), for example to the ERP system 200 from which the name matching source data and the name matching destination data were transferred.

名寄せ実行部１０２ｇは、以下の［３１］から［３３］の処理を実行する。
［３１］「取得部１０２ｅが取得した項目組み合わせ集合（具体的には［２３］で取得した項目組み合わせ集合）中の項目組み合わせ」と「取得部１０２ｅが取得した手法組み合わせ（具体的には［２４］で取得した手法組み合わせ）」との組み合わせごとに、以下の［３１１］の処理と［３１２］の処理を実行する。
［３１１］「受取部１０２ｄで受け取った名寄せ元データが保持する、当該項目組み合わせ中の元項目に係る値」と「受取部１０２ｄで受け取った名寄せ先データが保持する、当該項目組み合わせ中の先項目に係る値」とに対する、当該手法組み合わせ中のクレンジング手法によるクレンジング処理
［３１２］クレンジング処理後の両値に対する、当該手法組み合わせ中の類似度計算手法による類似度計算処理
［３２］［３１］で得られた各類似度を集計する。
［３３］［３２］で得られた集計値に基づく情報と、当該名寄せ先データが保持する、識別先項目（例：主キー項目）に係る値を、名寄せ結果として、例えば名寄せ元データと名寄せ先データの転送元のＥＲＰシステム２００を出力先として出力する。 The name matching execution unit 102g executes the following processes [31] to [33].
[31] For each combination of "an item combination in the item combination set acquired by the acquisition unit 102e (specifically, the item combination set acquired in [23])" and "the method combination acquired by the acquisition unit 102e (specifically, the method combination acquired in [24])", execute the following processes [311] and [312].
[311] Cleansing processing, using the cleansing method in the method combination, on "the value of the source item in the item combination, held by the name matching source data received by the receiving unit 102d" and "the value of the destination item in the item combination, held by the name matching destination data received by the receiving unit 102d"
[312] Similarity calculation processing, using the similarity calculation method in the method combination, on the two values after cleansing
[32] Aggregate the similarities obtained in [31].
[33] Output, as the name matching result, information based on the aggregate value obtained in [32] and the value of the identification destination item (e.g., a primary key item) held by the name matching destination data, for example to the ERP system 200 from which the name matching source data and the name matching destination data were transferred.
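Steps [31] to [33] can be sketched as a small pipeline. The cleansing method (`nfkc`, Unicode NFKC normalization) and the similarity method (`ratio`, based on `difflib.SequenceMatcher`) are stand-ins chosen only for illustration; the specification does not name concrete methods here. Selecting the master row with the highest aggregate value is likewise an assumption of this sketch, as are all function and parameter names.

```python
import unicodedata
from difflib import SequenceMatcher
from typing import Callable, Dict, List, Optional, Sequence, Tuple

Cleanser = Callable[[str], str]
Similarity = Callable[[str, str], float]
# (source item, destination item, cleansing methods in execution order, similarity method)
PlanEntry = Tuple[str, str, Sequence[Cleanser], Similarity]


def nfkc(text: str) -> str:
    """Example cleansing method: fold full-width and half-width variants."""
    return unicodedata.normalize("NFKC", text)


def ratio(a: str, b: str) -> float:
    """Example similarity method (difflib's ratio, used only for illustration)."""
    return SequenceMatcher(None, a, b).ratio()


def match_record(
    source: Dict[str, str],
    master_rows: List[Dict[str, str]],
    plan: Sequence[PlanEntry],
    key_item: str,
) -> Optional[Tuple[str, float]]:
    """Return (primary key value, aggregate similarity) of the best master row."""
    best: Optional[Tuple[str, float]] = None
    for row in master_rows:
        total = 0.0
        for source_item, dest_item, cleansers, similarity in plan:
            a, b = source[source_item], row[dest_item]
            for cleanse in cleansers:      # [311] cleanse both values
                a, b = cleanse(a), cleanse(b)
            total += similarity(a, b)      # [312] similarity, [32] aggregation
        if best is None or total > best[1]:
            best = (row[key_item], total)
    return best                            # [33] basis of the name matching result
```

With a plan such as `[("company", "name", [nfkc, str.strip], ratio)]`, a full-width source value like `ＯＢＩＣ` is normalized before it is compared with the master value `OBIC`.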

名寄せ実行部１０２ｇは、各類似度に取得した各重み付け値を掛け合わせて各類似度を集計してもよい。 The name matching execution unit 102g may aggregate each similarity degree by multiplying each similarity degree by each acquired weighting value.

名寄せ実行部１０２ｇは、集計値と１つの閾値または互いに異なる複数の閾値との大小を比較し、比較結果に応じたコンテンツ（例：名寄せの精度が高いことを意味する記号（例：○）、名寄せの精度が中程度であることを意味する記号（例：△）、名寄せの精度が低いことを意味する記号（例：×））を、集計値に基づく情報として出力してもよい。 The name matching execution unit 102g may compare the aggregate value with one threshold or with a plurality of mutually different thresholds and output, as the information based on the aggregate value, content corresponding to the comparison result (e.g., a symbol meaning that the name matching accuracy is high (e.g., ○), a symbol meaning that it is medium (e.g., △), or a symbol meaning that it is low (e.g., ×)).
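A minimal sketch of the threshold comparison follows, assuming two thresholds. The function name `accuracy_symbol` and the threshold values 0.8 and 0.5 are hypothetical; the specification gives no concrete values.

```python
def accuracy_symbol(aggregate: float, high: float = 0.8, mid: float = 0.5) -> str:
    """Map an aggregate similarity to a matching degree symbol.

    The thresholds are illustrative assumptions and would be tuned
    for each usage situation.
    """
    if aggregate >= high:
        return "○"  # high name matching accuracy
    if aggregate >= mid:
        return "△"  # medium name matching accuracy
    return "×"      # low name matching accuracy
```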

名寄せ実行部１０２ｇは、「受取部１０２ｄで受け取った名寄せ先データが保持する、取得部１０２ｅで取得した項目組み合わせ集合（具体的には［２３］で取得した項目組み合わせ集合）中の先項目に係る値」を、名寄せ結果としてさらに出力してもよい。 The name matching execution unit 102g may further output, as part of the name matching result, "the values of the destination items in the item combination set acquired by the acquisition unit 102e (specifically, the item combination set acquired in [23]), held by the name matching destination data received by the receiving unit 102d".

名寄せ実行部１０２ｇは、「受取部１０２ｄで受け取った名寄せ元データが保持する、取得部１０２ｅで取得した項目組み合わせ集合（具体的には［２３］で取得した項目組み合わせ集合）中の元項目に係る値」が、辞書データに登録されたものと同じでなかった場合に、［３１］から［３３］の処理を実行してもよい。 The name matching execution unit 102g may execute the processes [31] to [33] when "the values of the source items in the item combination set acquired by the acquisition unit 102e (specifically, the item combination set acquired in [23]), held by the name matching source data received by the receiving unit 102d" are not the same as those registered in the dictionary data.

提案情報出力部１０２ｈは、「受取部１０２ｄで受け取った名寄せ元データが保持する、取得部１０２ｅで取得した項目組み合わせ集合（具体的には［２３］で取得した項目組み合わせ集合）中の元項目に係る値」と「受取部１０２ｄで受け取った名寄せ先データが保持する識別先項目（例：主キー項目）に係る値」とに関する特定の組み合わせが、名寄せ実行部１０２ｇにおいて所定回数以上記録された場合、当該特定の組み合わせの辞書データへの登録を提案するための提案情報（例えば、当該提案に関するテキスト情報）を、例えば名寄せ元データと名寄せ先データの転送元のＥＲＰシステム２００を出力先として出力する。 When a specific combination of "the values of the source items in the item combination set acquired by the acquisition unit 102e (specifically, the item combination set acquired in [23]), held by the name matching source data received by the receiving unit 102d" and "the value of the identification destination item (e.g., a primary key item) held by the name matching destination data received by the receiving unit 102d" has been recorded a predetermined number of times or more in the name matching execution unit 102g, the proposal information output unit 102h outputs proposal information for proposing registration of that combination in the dictionary data (e.g., text information about the proposal), for example to the ERP system 200 from which the name matching source data and the name matching destination data were transferred.

This concludes the overview of the processing executed by each processing unit included in the control unit 102.

[2-4. Specific example of processing executed by each processing unit included in the control unit 102]
Here, specific examples of the processing executed by each processing unit included in the control unit 102 will be described with reference to FIGS. 27 to 30 and the like.

The first setting unit 102a has the operator set, via the processing step setting screen MA, the cleansing method and the similarity calculation method on which the processing step to be registered is based.

FIG. 27 is a diagram showing an example of the processing step setting screen MA. The processing step setting screen MA includes a registration button MA1, a cancel button MA2, a back button MA3, a setting area MA4, a setting area MA5, and a setting area MA6.

The setting area MA4 is an area for setting the processing step identification information of the processing step to be registered. The setting area MA5 is an area for setting the cleansing method identification information of the cleansing method on which the processing step to be registered is based. The setting area MA6 is an area for setting the similarity calculation method identification information of the similarity calculation method on which the processing step to be registered is based.

The operator sets the processing step to be registered in the setting area MA4, and sets the cleansing method and the similarity calculation method on which that processing step is based in the setting areas MA5 and MA6. The operator may press the + button to set a plurality of cleansing methods. The cleansing methods and similarity calculation methods that can be set are those in the cleansing method master 106c in FIG. 11 and the similarity calculation method master 106d in FIG. 12. When the registration button MA1 is pressed, the information on the processing step setting screen MA is registered in the processing step data 106a in FIG. 9 and the cleansing application data 106b in FIG. 10. The information shown in FIGS. 9 and 10 is registered before the name matching process is executed.
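The registration described above can be pictured as building one processing step out of one or more cleansing methods and a similarity calculation method. The following is a minimal sketch under stated assumptions: the method names (`nfkc`, `strip_spaces`, `lower`, `exact`, `ratio`), the `ProcessingStep` class, and `register_step` are hypothetical illustrations and do not appear in the specification, and applying the cleansing methods in the order they were set is also an assumption.

```python
import unicodedata
from dataclasses import dataclass
from difflib import SequenceMatcher

# Hypothetical entries standing in for the cleansing method master 106c.
CLEANSING_METHODS = {
    "nfkc": lambda s: unicodedata.normalize("NFKC", s),  # e.g. full-width to half-width
    "strip_spaces": lambda s: "".join(s.split()),        # drop all whitespace
    "lower": str.lower,
}

# Hypothetical entries standing in for the similarity calculation method master 106d.
SIMILARITY_METHODS = {
    "exact": lambda a, b: 1.0 if a == b else 0.0,
    "ratio": lambda a, b: SequenceMatcher(None, a, b).ratio(),
}


@dataclass(frozen=True)
class ProcessingStep:
    step_id: str          # corresponds to setting area MA4
    cleansing_ids: tuple  # corresponds to setting area MA5 (one or more)
    similarity_id: str    # corresponds to setting area MA6

    def cleanse(self, value):
        # Assumption: cleansing methods are applied in the order they were set.
        for cleansing_id in self.cleansing_ids:
            value = CLEANSING_METHODS[cleansing_id](value)
        return value

    def similarity(self, a, b):
        # Cleanse both values first, then compare them.
        method = SIMILARITY_METHODS[self.similarity_id]
        return method(self.cleanse(a), self.cleanse(b))


def register_step(step_id, cleansing_ids, similarity_id):
    """Accept only methods that exist in the two masters, then build the step."""
    unknown = [c for c in cleansing_ids if c not in CLEANSING_METHODS]
    if unknown or similarity_id not in SIMILARITY_METHODS:
        raise ValueError(f"unknown method: {unknown or similarity_id}")
    return ProcessingStep(step_id, tuple(cleansing_ids), similarity_id)
```

Restricting the selectable methods to those already present in the two masters mirrors the constraint stated for the screen MA; everything else is illustrative.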

Returning to FIG. 8, the second setting unit 102b has the operator set, via the master-processing step mapping setting screen MB, the master, master items, processing steps, and weighting values on which the master-processing step mapping to be registered is based.

FIG. 28 is a diagram showing an example of the master-processing step mapping setting screen MB. The master-processing step mapping setting screen MB includes a registration button MB1, a cancel button MB2, a back button MB3, a setting area MB4, a setting area MB5, and a setting area MB6.

The setting area MB4 is an area for setting the master identification information of the master on which the master-processing step mapping to be registered is based. The setting area MB5 is an area for setting the master-processing step mapping identification information of the master-processing step mapping to be registered. The setting area MB6 is an area for displaying the master item identification information of the master items in the master identified by the master identification information set in the setting area MB4, and for setting the processing step identification information and the weighting value of the processing step to be applied to each master item.

The operator sets the master-processing step mapping to be registered in the setting area MB5, and sets the master on which that mapping is based in the setting area MB4. The masters that can be set are those in the master list master 106g in FIG. 15. When a master is set in the setting area MB4, the master items linked to that master in the master item master 106h in FIG. 16 are displayed in the setting area MB6. The operator sets, in the setting area MB6, the processing step and the weighting value to be applied to each displayed master item. When the registration button MB1 is pressed, the information on the master-processing step mapping setting screen MB is registered in the master-processing step mapping data 106e in FIG. 13 and the master-processing step mapping detail data 106f in FIG. 14. The information shown in FIGS. 13 and 14 is registered before the name matching process is executed.

Returning to FIG. 8, the third setting unit 102c has the operator set, via the name matching method setting screen MC, the data template, name matching method data, master-processing step mapping, master items, and data items on which the name matching method to be registered is based.

FIG. 29 is a diagram showing an example of the name matching method setting screen MC. The name matching method setting screen MC includes a registration button MC1, a cancel button MC2, a back button MC3, a setting area MC4, a setting area MC5, a setting area MC6, a setting area MC7, and a setting area MC8.

The setting area MC4 is an area for setting the name matching method identification information of the name matching method to be registered. The setting area MC5 is an area for setting the data template identification information of the data template on which the name matching method to be registered is based. The setting area MC6 is an area for setting a master. The setting area MC7 is an area for setting the master-processing step mapping identification information of the master-processing step mapping on which the name matching method to be registered is based. The setting area MC8 is an area for displaying the master item identification information of the master items in the master identified by the master identification information set in the setting area MC6, together with the data item identification information of the data items linked to the data template identified by the data template identification information set in the setting area MC5, and for setting the name matching method data used to assign one code.

The operator sets the name matching method to be registered in the setting area MC4, and sets the data template on which that name matching method is based in the setting area MC5. The data templates that can be set are those in the data template master 106m in FIG. 20. When a data template is set in the setting area MC5, the data items linked to that data template in the data template item master 106n in FIG. 21 are displayed in the area titled "Name-matching source data items" in the setting area MC8. The operator sets, in the setting area MC6, the master used for assigning one code. The masters that can be set are those in the master list master 106g in FIG. 15. When a master is set in the setting area MC6, the master items linked to that master in the master item master 106h in FIG. 16 are displayed in the area titled "Master columns" in the setting area MC8. The operator sets, in the setting area MC7, the master-processing step mapping on which the name matching method to be registered is based. From the data items and master items displayed in the setting area MC8, the operator sets the combinations of data item and master item to be used when assigning one code, for example by connecting them with lines as shown in FIG. 28. When the registration button MC1 is pressed, the information on the name matching method setting screen MC is registered in the name matching method data 106i in FIG. 17, the code assignment setting data 106j in FIG. 18, and the column mapping data 106k in FIG. 19. The information shown in FIGS. 17, 18, and 19 is registered before the name matching process is executed.

Returning to FIG. 8, the receiving unit 102d receives form data (a table; an example of name-matching source data) and a master (a table; an example of name-matching destination data) transferred from the ERP system 200.

The acquisition unit 102e executes the following processes [41] to [46].
[41] From the name matching method data 106i, acquire the name matching method identification information linked to the template identification information given to the form data received by the receiving unit 102d.
[42] From the code assignment setting data 106j, acquire one or more "combinations of name matching method data identification information and master-processing step mapping identification information" linked to the name matching method identification information acquired in [41].
[43] For each piece of name matching method data identification information acquired in [42], acquire from the column mapping data 106k one or more "combinations of master item identification information and data item identification information" linked to that name matching method data identification information.
[44] For each piece of master-processing step mapping identification information acquired in [42], acquire from the master-processing step mapping data 106e and the master-processing step mapping detail data 106f the master identification information linked to that mapping identification information and a plurality of "combinations of master item identification information, processing step identification information, and weighting value".
[45] For each piece of master item identification information acquired in [43], acquire the processing step identification information and the weighting value linked to that master item identification information from the plurality of "combinations of master item identification information, processing step identification information, and weighting value" acquired in [44].
[46] For each piece of processing step identification information acquired in [45], acquire from the processing step data 106a and the cleansing application data 106b the "combination of one or more pieces of cleansing method identification information and one or more pieces of similarity calculation method identification information" linked to that processing step identification information.
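Steps [41] to [46] amount to a chain of keyed lookups across the stored tables. The sketch below illustrates that chain with plain dictionaries; all table contents and identifiers (`TPL_ORDER`, `NM1`, `STEP_NAME`, and so on) are made-up sample data, the dictionary layout is an assumption about how the tables might be keyed, and each processing step is simplified to a single similarity calculation method.

```python
# Sample data only; the comments name the table each dictionary loosely stands in for.
NAME_MATCHING_METHOD = {"TPL_ORDER": "NM1"}             # 106i: template id -> method id
CODE_ASSIGNMENT = {"NM1": [("NMD1", "MAP_CUSTOMER")]}   # 106j: method id -> (method data id, mapping id)
COLUMN_MAPPING = {                                      # 106k: method data id -> (master item, data item)
    "NMD1": [("customer_name", "company"), ("customer_tel", "tel")],
}
MAPPING_HEADER = {"MAP_CUSTOMER": "customer_master"}    # 106e: mapping id -> master id
MAPPING_DETAIL = {                                      # 106f: mapping id -> master item -> (step id, weight)
    "MAP_CUSTOMER": {
        "customer_name": ("STEP_NAME", 0.7),
        "customer_tel": ("STEP_TEL", 0.3),
    },
}
PROCESSING_STEP = {"STEP_NAME": "ratio", "STEP_TEL": "exact"}  # 106a: step id -> similarity method id
CLEANSING_APPLICATION = {                                      # 106b: step id -> cleansing method ids
    "STEP_NAME": ["nfkc", "strip_spaces"],
    "STEP_TEL": ["digits_only"],
}


def acquire_settings(template_id):
    """Resolve everything needed to match one form, following [41] to [46]."""
    method_id = NAME_MATCHING_METHOD[template_id]                       # [41]
    plans = []
    for method_data_id, mapping_id in CODE_ASSIGNMENT[method_id]:       # [42]
        master_id = MAPPING_HEADER[mapping_id]                          # [44]
        details = MAPPING_DETAIL[mapping_id]                            # [44]
        columns = []
        for master_item, data_item in COLUMN_MAPPING[method_data_id]:   # [43]
            step_id, weight = details[master_item]                      # [45]
            columns.append({
                "master_item": master_item,
                "data_item": data_item,
                "weight": weight,
                "cleansing": CLEANSING_APPLICATION[step_id],            # [46]
                "similarity": PROCESSING_STEP[step_id],                 # [46]
            })
        plans.append({"master": master_id, "columns": columns})
    return plans
```

Each element of the returned list corresponds to one code assignment: it names the master to match against and, per mapped column, the weight and the methods to apply.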

The dictionary-using name matching execution unit 102f i) acquires, based on the code assignment dictionary record identification information, the data item identification information and code values that serve as the matching conditions from the code assignment dictionary condition data 106p, and acquires the code value to be assigned from the code assignment dictionary assignment data 106q, and ii) determines whether each piece of data item identification information acquired by the acquisition unit 102e in [43], together with the value of the item identified by that data item identification information held by the form data received by the receiving unit 102d, all match the data item identification information and code values acquired in i).

When the dictionary-using name matching execution unit 102f determines that they match, it outputs, as a name matching result, the code value to be assigned acquired in i) and information indicating that the value is registered in the dictionary data (e.g., the character "辞", from 辞書, "dictionary"), for example to the ERP system 200 that is the transfer source of the form data and the master.
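The dictionary check described above can be sketched as an all-conditions-match lookup. This is a minimal illustration, not the patented implementation: the record identifier `D001`, the item names, the sample values, and the function `dictionary_lookup` are hypothetical, and the two dictionaries only loosely stand in for the code assignment dictionary condition data 106p and the code assignment dictionary assignment data 106q.

```python
# Stand-in for 106p: dictionary record id -> {data item id: required value}.
DICT_CONDITIONS = {
    "D001": {"company": "ABC Trading", "tel": "03-0000-0000"},
}
# Stand-in for 106q: dictionary record id -> code value to assign.
DICT_CODES = {"D001": "C0042"}


def dictionary_lookup(form_row, data_items):
    """Return the registered code when every mapped data item matches a record."""
    observed = {item: form_row.get(item) for item in data_items}
    for record_id, conditions in DICT_CONDITIONS.items():
        # Dict equality requires the same item ids and the same values for all of them.
        if observed == conditions:
            return {"code": DICT_CODES[record_id], "mark": "辞"}
    return None  # no dictionary hit; similarity-based matching would run instead
```

A `None` result corresponds to the case in which the similarity-based processing of the name matching execution unit 102g is used.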

When, for example, the dictionary-using name matching execution unit 102f has not determined that there is a match, the name matching execution unit 102g executes the following processes [51] to [53].
[51] For each pairing of a "combination of master item identification information and data item identification information" acquired by the acquisition unit 102e in [43] with a "combination of one or more pieces of cleansing method identification information and one or more pieces of similarity calculation method identification information" acquired by the acquisition unit 102e in [46], execute the cleansing process [511] and the similarity calculation process [512] below.
[511] A cleansing process, using the cleansing method identified by the cleansing method identification information in that "combination of one or more pieces of cleansing method identification information and one or more pieces of similarity calculation method identification information", applied to "the value of the item identified by the data item identification information in that 'combination of master item identification information and data item identification information', held by the form data received by the receiving unit 102d" and to "the value of the item identified by the master item identification information in that 'combination of master item identification information and data item identification information', held by the master received by the receiving unit 102d".
[512] A similarity calculation process, using the similarity calculation method identified by the similarity calculation method identification information in that "combination of one or more pieces of cleansing method identification information and one or more pieces of similarity calculation method identification information", applied to both values after the cleansing process.
[52] Multiply each similarity obtained in [51] by the corresponding weighting value acquired by the acquisition unit 102e in [45], and aggregate the similarities.
[53] Compare the aggregate value obtained in [52] with one threshold or with a plurality of mutually different thresholds, and output the following information as a name matching result, for example to the ERP system 200 that is the transfer source of the form data table and the master table.
・Content according to the comparison result (e.g., a symbol meaning that the name matching accuracy is high (e.g., ○), a symbol meaning that it is medium (e.g., △), or a symbol meaning that it is low (e.g., ×))
・The primary key item of the master received by the receiving unit 102d, and the value (code) of that primary key item
・The master item identification information in the "combination of master item identification information and data item identification information" acquired by the acquisition unit 102e in [43], and the value of the item identified by that master item identification information, held by the master received by the receiving unit 102d
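Steps [51] to [53] can be illustrated as cleansing, per-column similarity, a weighted sum, and a threshold comparison. The sketch below makes several assumptions that the specification does not state: the cleansing is NFKC normalization plus whitespace removal and lowercasing, the similarity is `difflib.SequenceMatcher.ratio`, the weights sum to 1, the thresholds are 0.9 and 0.6, and the master row with the highest aggregate value is the one reported. All function and item names are hypothetical.

```python
import unicodedata
from difflib import SequenceMatcher


def cleanse(value):
    # [511] Assumed cleansing: NFKC normalization, whitespace removal, lowercasing.
    return "".join(unicodedata.normalize("NFKC", value).split()).lower()


def similarity(a, b):
    # [512] Assumed similarity method: difflib ratio in the range 0.0 to 1.0.
    return SequenceMatcher(None, a, b).ratio()


def aggregate(form_row, master_row, columns):
    # [52] Weighted sum over (data item, master item, weight) triples.
    return sum(
        weight * similarity(cleanse(form_row[data_item]), cleanse(master_row[master_item]))
        for data_item, master_item, weight in columns
    )


def grade(total, high=0.9, mid=0.6):
    # [53] Compare against two assumed thresholds and pick a symbol.
    if total >= high:
        return "○"
    if total >= mid:
        return "△"
    return "×"


def match_row(form_row, master_rows, columns, key_item):
    # Assumption: report the master row with the highest aggregate value.
    best = max(master_rows, key=lambda row: aggregate(form_row, row, columns))
    total = aggregate(form_row, best, columns)
    return {
        "grade": grade(total),
        "key_item": key_item,
        "code": best[key_item],
        "master_values": {m: best[m] for _, m, _ in columns},
    }
```

The returned dictionary carries the three kinds of output listed in [53]: the symbol for the comparison result, the primary key item with its code, and the master-side values of the mapped items.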

FIG. 30 is a diagram showing an example of the name matching results and the like output by the name matching execution unit 102g.

When a specific combination of "the values of the items identified by the data item identification information in each 'combination of master item identification information and data item identification information' acquired by the acquisition unit 102e in [43], held by the form data received by the receiving unit 102d" and "the value of the primary key item held by the master received by the receiving unit 102d" has been recorded a predetermined number of times or more in the name matching execution unit 102g, the proposal information output unit 102h outputs proposal information for proposing that the specific combination be registered in the code assignment dictionary data (specifically, the code assignment dictionary condition data 106p and the code assignment dictionary assignment data 106q) (for example, text information regarding the proposal), for example to the ERP system 200 that is the transfer source of the form data and the master.
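The proposal logic above can be sketched as a counter keyed by the combination of source values and assigned code. This is only an illustration under assumptions: the class `ProposalTracker`, the default count of 3, the wording of the proposal text, and the choice to propose each combination only once are all hypothetical and not taken from the specification.

```python
from collections import Counter


class ProposalTracker:
    """Counts recorded (source values, code) pairs and proposes frequent ones."""

    def __init__(self, min_count=3):
        self.min_count = min_count  # stands in for the "predetermined number of times"
        self.counts = Counter()
        self.proposed = set()

    def record(self, source_values, code):
        # Sort the items so the same values in a different order count as one key.
        key = (tuple(sorted(source_values.items())), code)
        self.counts[key] += 1
        if self.counts[key] >= self.min_count and key not in self.proposed:
            # Assumption: each combination is proposed only once.
            self.proposed.add(key)
            return f"Suggest registering {dict(key[0])} -> {code} in the code assignment dictionary"
        return None
```

Calling `record` each time a name matching result is recorded returns proposal text once the count reaches the configured minimum, and `None` otherwise.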

[3. Summary of this embodiment]
As described above, according to this embodiment, name matching can be performed at low cost in a wide range of usage situations. In addition, a suitable processing step can be set for each type of data, processing steps can be configured in detail for each usage situation, and results obtained from the code assignment dictionary are displayed differently, which reduces the burden of manually correcting name matching results.

[4. Other embodiments]
In addition to the embodiment described above, the present invention may be implemented in various different embodiments within the scope of the technical idea described in the claims.

For example, of the processes described in the embodiment, all or part of the processes described as being performed automatically may be performed manually, and all or part of the processes described as being performed manually may be performed automatically by known methods.

In addition, the processing procedures, control procedures, specific names, information including parameters such as registered data and search conditions for each process, screen examples, and database configurations shown in this specification and the drawings may be changed arbitrarily unless otherwise specified.

Further, regarding the information processing device 100, each illustrated component is functionally conceptual and does not necessarily need to be physically configured as illustrated.

For example, all or any part of the processing functions of the information processing device 100, in particular the processing functions performed by the control unit, may be realized by a CPU and a program interpreted and executed by the CPU, or may be realized as hardware using wired logic. The program is recorded on a non-transitory computer-readable recording medium containing programmed instructions for causing an information processing device to execute the processing described in this embodiment, and is mechanically read by the information processing device 100 as necessary. That is, a storage unit such as a ROM or an HDD (Hard Disk Drive) stores a computer program that cooperates with an OS to give instructions to the CPU and perform various processes. This computer program is executed by being loaded into RAM, and constitutes the control unit in cooperation with the CPU.

This computer program may also be stored in an application program server connected to the information processing device 100 via an arbitrary network, and all or part of it may be downloaded as necessary.

Further, a program for executing the processing described in this embodiment may be stored in a non-transitory computer-readable recording medium, or may be configured as a program product. Here, the "recording medium" includes any "portable physical medium" such as a memory card, a USB (Universal Serial Bus) memory, an SD (Secure Digital) card, a flexible disk, a magneto-optical disk, a ROM, an EPROM (Erasable Programmable Read Only Memory), an EEPROM (registered trademark) (Electrically Erasable and Programmable Read Only Memory), a CD-ROM (Compact Disk Read Only Memory), an MO (Magneto-Optical disk), a DVD (Digital Versatile Disk), and a Blu-ray (registered trademark) Disc.

A "program" is a data processing method written in any language or description method, regardless of format such as source code or binary code. A "program" is not necessarily limited to a single unit, and includes one that is distributed across multiple modules or libraries and one that achieves its functions in cooperation with a separate program such as an OS. Well-known configurations and procedures can be used for the specific configuration and procedure for reading the recording medium in each device shown in the embodiment, and for the installation procedure after reading.

The various databases and the like stored in the storage unit are storage means such as memory devices (RAM, ROM, and the like), fixed disk devices such as hard disks, flexible disks, and optical disks, and store various programs, tables, databases, web page files, and the like used for various processes and for providing websites.

Further, the information processing device 100 may be configured as an information processing device such as a known personal computer or workstation, or as such an information processing device with arbitrary peripheral devices connected. The information processing device 100 may also be realized by installing, on such a device, software (including programs, data, and the like) that implements the processing described in this embodiment.

Furthermore, the specific form of distribution and integration of the devices is not limited to that illustrated, and all or part of them may be functionally or physically distributed or integrated in arbitrary units according to various additions or functional loads. That is, the embodiments described above may be implemented in any combination, or may be implemented selectively.

The present invention is particularly useful in name matching processing.

100 Information processing device
102 Control unit
102a First setting unit
102b Second setting unit
102c Third setting unit
102d Receiving unit
102e Acquisition unit
102f Dictionary-using name matching execution unit
102g Name matching execution unit
102h Proposal information output unit
104 Communication interface unit
106 Storage unit
106a Processing step data
106b Cleansing application data
106c Cleansing method master
106d Similarity calculation method master
106e Master-processing step mapping data
106f Master-processing step mapping detail data
106g Master list master
106h Master item master
106i Name matching method data
106j Code assignment setting data
106k Column mapping data
106m Data template master
106n Data template item master
106p Code assignment dictionary condition data
106q Code assignment dictionary assignment data
106r Operation log data
106s Correction log header data
106t Correction log detail data
108 Input/output interface unit
112 Input device
114 Output device
200 ERP system
300 Network

Claims

An information processing device comprising a control unit,
A method combination of a cleansing method and a similarity calculation method is stored in association with a destination item that is an item of the source data, and one or more item combinations of the source item and destination item that are the items of the source data are stored. It is possible to access a storage unit in which a set of item combinations including multiple items used when performing name matching is stored in association with a template of the name matching source data having a predetermined pattern or layout ;
The control unit includes:
a receiving means for receiving the reference data and the reference data;
The item combination set linked to the template identification information given to the received name reference data is obtained from the storage unit, and for each item combination in the obtained item combination set, the item combination set is linked to the previous item in the item combination. an acquisition means for acquiring a combination of methods from the storage unit;
1) For each combination of the item combination in the acquired item combination set and the acquired method combination, 11) The value related to the original item in the item combination held by the received name matching source data and the value of the received item combination. 12) Cleansing processing by the cleansing method in the method combination for the value related to the previous item in the item combination held by the name identification data held by the name matching destination data, and 12) similarity in the method combination for both values after the cleansing process. 2) aggregate each of the obtained similarities, and 3) collect information based on the obtained aggregate value and the name reference data held by the name reference data. a name matching execution means that outputs a value related to an identification destination item that is a destination item for uniquely identifying data as a name matching result;
and further,
the storage unit stores
dictionary data in which a value of the source item in the item combination set held by the name matching source data is registered in association with a value of the identification destination item held by the name matching destination data,
the control unit comprises
dictionary-using name matching execution means that, when the value of the source item in the acquired item combination set held by the received name matching source data is the same as a value registered in the dictionary data, outputs, as a name matching result, the value of the identification destination item registered in the dictionary data and information indicating that the value is registered in the dictionary data, and
the name matching execution means executes processes 1) through 3) above when the value of the source item in the acquired item combination set held by the received name matching source data is not the same as any value registered in the dictionary data;
An information processing device characterized by:
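
The flow recited in this claim (dictionary lookup first, otherwise cleansing, similarity calculation, and aggregation per item combination) can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration and does not come from the claim: the item names (`company`, `customer_name`, and so on), the whitespace and case cleansing, the use of `difflib.SequenceMatcher` as the similarity calculation method, the plain sum as the aggregation, and the choice to key the dictionary data by the tuple of source item values.

```python
from difflib import SequenceMatcher

def cleanse(value: str) -> str:
    """Hypothetical cleansing method: trim, lowercase, remove spaces."""
    return value.strip().lower().replace(" ", "")

def similarity(a: str, b: str) -> float:
    """Hypothetical similarity calculation method: difflib ratio in [0, 1]."""
    return SequenceMatcher(None, a, b).ratio()

# Item combination set: (source item, destination item) pairs (assumed names).
ITEM_COMBINATIONS = [("company", "customer_name"), ("addr", "address")]

# Method combination (cleansing, similarity) linked to each destination item.
METHOD_COMBINATIONS = {
    "customer_name": (cleanse, similarity),
    "address": (cleanse, similarity),
}

def match_names(source, destinations, dictionary, id_item="customer_code"):
    # Dictionary-using path: an exact hit returns the registered identifier.
    key = tuple(source[src] for src, _ in ITEM_COMBINATIONS)
    if key in dictionary:
        return [{"id": dictionary[key], "from_dictionary": True}]
    # Otherwise run processes 1) to 3) against every destination record.
    results = []
    for dest in destinations:
        total = 0.0
        for src_item, dst_item in ITEM_COMBINATIONS:
            clean, sim = METHOD_COMBINATIONS[dst_item]
            total += sim(clean(source[src_item]), clean(dest[dst_item]))
        results.append({"id": dest[id_item], "score": total, "from_dictionary": False})
    # Best-scoring candidates first.
    return sorted(results, key=lambda r: r["score"], reverse=True)
```

Sorting the candidates by score is a presentation choice of this sketch; the claim only requires that information based on the aggregate value be output together with the value of the identification destination item.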

The storage unit further stores a weighting value in association with the destination item,
the acquisition means further acquires from the storage unit, for each item combination in the acquired item combination set, the weighting value linked to the destination item in that item combination, and
the name matching execution means aggregates the similarities after multiplying each similarity by the corresponding acquired weighting value;
The information processing device according to claim 1, characterized in that:
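
As a rough illustration of the weighted aggregation in this claim, the following Python sketch multiplies each similarity by the weighting value of its destination item and sums the products. The function name `weighted_total`, the item names, and the use of a sum as the aggregation are assumptions of this sketch; the claim itself only states that each similarity is multiplied by its weighting value before aggregation.

```python
def weighted_total(similarities: dict, weights: dict) -> float:
    """Sum of similarity x weighting value, keyed by destination item (hypothetical)."""
    return sum(similarities[item] * weights[item] for item in similarities)

# Assumed example values: the customer name counts twice as much as the address.
similarities = {"customer_name": 0.75, "address": 0.5}
weights = {"customer_name": 2.0, "address": 1.0}
total = weighted_total(similarities, weights)  # 0.75 * 2.0 + 0.5 * 1.0
```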

The name matching execution means compares the aggregate value with one threshold value or with a plurality of mutually different threshold values, and outputs content corresponding to the comparison result as the information based on the aggregate value;
The information processing device according to claim 1 or 2, characterized in that:
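
The threshold comparison in this claim could look like the sketch below. The threshold values and the labels `match`, `candidate`, and `no match` are purely illustrative assumptions; the claim does not specify what content is output for each comparison result.

```python
def classify(total: float, thresholds=((1.5, "match"), (1.0, "candidate"))) -> str:
    """Return the label of the first threshold the aggregate value reaches.

    `thresholds` is assumed to be ordered from highest to lowest.
    """
    for limit, label in thresholds:
        if total >= limit:
            return label
    return "no match"
```

With a single threshold the same function degenerates to a two-way decision, for example `classify(total, thresholds=((1.5, "match"),))`.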

The name matching execution means further outputs, as a name matching result, the value of the destination item in the acquired item combination set held by the received name matching destination data;
The information processing device according to any one of claims 1 to 3, characterized in that:

The storage unit stores:
1) a method combination set including a plurality of method combinations;
2) a destination item/method combination set family including a plurality of destination item/method combination sets, one for each piece of name matching destination data, each set containing as many destination item/method combinations, each pairing a destination item with the identification information of a method combination, as there are destination items in that name matching destination data;
3) an item combination set family including one or more item combination sets;
4) a first combination set family including one or more first combination sets, each including one or more combinations of the identification information of an item combination set and the identification information of a destination item/method combination set; and
5) a second combination set including one or more combinations of the identification information of a first combination set and the identification information of a template of the name matching source data,
and
the acquisition means:
1) acquires, from the second combination set, the identification information of the first combination set linked to the identification information of the template of the received name matching source data;
2) acquires, from the first combination set family, the first combination set specified by the acquired identification information of the first combination set;
3) acquires, from the item combination set family, the item combination set specified by the identification information of the item combination set in the acquired first combination set, and acquires, from the destination item/method combination set family, the destination item/method combination set specified by the identification information of the destination item/method combination set in the acquired first combination set; and
4) acquires, from the acquired destination item/method combination set, the method combination linked to the destination item in the acquired item combination set;
The information processing device according to any one of claims 1 to 4.
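
The four acquisition steps in this claim amount to a chain of lookups by identification information. The Python sketch below models each stored set as a dictionary; all identifiers (`M1`, `D1`, `I1`, `F1`, `invoice_template_A`) and the method names inside the method combinations are hypothetical placeholders, not values taken from the patent.

```python
# 1) Method combination set: id -> (cleansing method, similarity calculation method).
METHOD_COMBINATIONS = {
    "M1": ("strip_spaces", "edit_distance"),
    "M2": ("normalize_width", "exact"),
}
# 2) Destination item/method combination set family: set id -> {destination item: method id}.
DEST_ITEM_METHOD_SETS = {"D1": {"customer_name": "M1", "phone": "M2"}}
# 3) Item combination set family: set id -> [(source item, destination item)].
ITEM_COMBINATION_SETS = {"I1": [("company", "customer_name"), ("tel", "phone")]}
# 4) First combination set family: id -> [(item combination set id, destination set id)].
FIRST_COMBINATION_SETS = {"F1": [("I1", "D1")]}
# 5) Second combination set: template id -> first combination set id.
SECOND_COMBINATION_SET = {"invoice_template_A": "F1"}

def resolve(template_id: str):
    """Follow acquisition steps 1) to 4) for one template of the source data."""
    first_id = SECOND_COMBINATION_SET[template_id]                     # step 1
    resolved = []
    for item_set_id, dest_set_id in FIRST_COMBINATION_SETS[first_id]:  # step 2
        item_set = ITEM_COMBINATION_SETS[item_set_id]                  # step 3
        dest_methods = DEST_ITEM_METHOD_SETS[dest_set_id]              # step 3
        for src_item, dst_item in item_set:                            # step 4
            method = METHOD_COMBINATIONS[dest_methods[dst_item]]
            resolved.append((src_item, dst_item, method))
    return resolved
```

Keeping the sets separate and linking them only by identification information means a method combination or a destination item/method combination set can be reused across several templates, which appears to be the point of the layered structure.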

The control unit includes:
a first setting means for causing an operator to set a method combination via a first setting screen including an area for setting a cleansing method and an area for setting a similarity calculation method;
second setting means for causing an operator to set a destination item/method combination set via a second setting screen including an area for setting the name matching destination data, an area for displaying the destination items in the set name matching destination data, and an area for setting the method combination to be applied to each destination item; and
third setting means for causing an operator to set an item combination set, and to set the linkage among the item combination set, the template of the name matching source data, and the destination item/method combination set, via a third setting screen including an area for setting the template of the name matching source data, an area for setting the name matching destination data, an area for setting the destination item/method combination set, and an area for displaying the destination items in the set name matching destination data and the source items in the set template and for setting the item combination set;
are further provided;
The information processing device according to claim 5, characterized in that:

The control unit further comprises
suggestion information output means that, when the name matching execution means has recorded a specific combination of a value of the source item in the acquired item combination set held by the received name matching source data and a value of the identification destination item held by the received name matching destination data a predetermined number of times or more, outputs information proposing registration of that specific combination in the dictionary data;
The information processing device according to claim 1, characterized in that:
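
One way to picture the suggestion information output means is a counter over (source value, identification value) pairs, as in the sketch below. The class name `SuggestionTracker`, the default of three occurrences, and the wording of the suggestion message are assumptions of this sketch; the claim only requires that a proposal be output once the specific combination has been recorded a predetermined number of times or more.

```python
from collections import Counter

class SuggestionTracker:
    """Counts recorded (source value, identification value) pairs (hypothetical helper)."""

    def __init__(self, min_count: int = 3):
        self.min_count = min_count  # the "predetermined number of times"
        self.counts = Counter()

    def record(self, source_value: str, id_value: str):
        """Record one match; return a suggestion once the pair is frequent enough."""
        pair = (source_value, id_value)
        self.counts[pair] += 1
        if self.counts[pair] >= self.min_count:
            return f"Suggest registering {source_value!r} -> {id_value!r} in the dictionary data"
        return None
```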

The name matching source data and the name matching destination data are transferred from application software related to ERP (Enterprise Resource Planning), and
the output destination of the name matching result is the ERP application software from which the name matching source data and the name matching destination data were transferred;
The information processing device according to any one of claims 1 to 7.

The name matching destination data is a master set in the application software related to ERP;
The information processing device according to claim 8, characterized in that:

The name matching source data is electronic data that has been digitized by OCR (Optical Character Recognition) or imported from an external source;
The information processing device according to claim 9, characterized by:

An information processing method executed by an information processing device comprising a control unit capable of accessing a storage unit in which a method combination of a cleansing method and a similarity calculation method is stored in association with a destination item, which is an item of name matching destination data, and in which an item combination set used when executing name matching, the set including one or more item combinations of a source item, which is an item of name matching source data, and a destination item, is stored in association with a template of the name matching source data having a predetermined pattern or a predetermined layout, the method comprising:
executed by the control unit,
a receiving step of receiving the name matching source data and the name matching destination data;
an acquisition step of acquiring, from the storage unit, the item combination set linked to the identification information of the template assigned to the received name matching source data, and acquiring from the storage unit, for each item combination in the acquired item combination set, the method combination linked to the destination item in that item combination;
a name matching execution step of 1) for each combination of an item combination in the acquired item combination set and the acquired method combination, performing 11) cleansing processing, by the cleansing method in the method combination, on the value of the source item in the item combination held by the received name matching source data and on the value of the destination item in the item combination held by the received name matching destination data, and 12) calculation of a similarity, by the similarity calculation method in the method combination, between the two values after the cleansing processing, 2) aggregating the obtained similarities, and 3) outputting, as a name matching result, information based on the obtained aggregate value and the value of an identification destination item, which is a destination item held by the name matching destination data for uniquely identifying that name matching destination data;
and further,
the storage unit stores
dictionary data in which a value of the source item in the item combination set held by the name matching source data is registered in association with a value of the identification destination item held by the name matching destination data,
the method further comprising,
executed by the control unit,
a dictionary-using name matching execution step of outputting, as a name matching result, when the value of the source item in the acquired item combination set held by the received name matching source data is the same as a value registered in the dictionary data, the value of the identification destination item registered in the dictionary data and information indicating that the value is registered in the dictionary data,
and
in the name matching execution step, processes 1) through 3) above are executed when the value of the source item in the acquired item combination set held by the received name matching source data is not the same as any value registered in the dictionary data;
An information processing method characterized by:

A program to be executed by an information processing device comprising a control unit capable of accessing a storage unit in which a method combination of a cleansing method and a similarity calculation method is stored in association with a destination item, which is an item of name matching destination data, and in which an item combination set used when executing name matching, the set including one or more item combinations of a source item, which is an item of name matching source data, and a destination item, is stored in association with a template of the name matching source data having a predetermined pattern or a predetermined layout,
the program causing the control unit to execute:
a receiving step of receiving the name matching source data and the name matching destination data;
an acquisition step of acquiring, from the storage unit, the item combination set linked to the identification information of the template assigned to the received name matching source data, and acquiring from the storage unit, for each item combination in the acquired item combination set, the method combination linked to the destination item in that item combination;
a name matching execution step of 1) for each combination of an item combination in the acquired item combination set and the acquired method combination, performing 11) cleansing processing, by the cleansing method in the method combination, on the value of the source item in the item combination held by the received name matching source data and on the value of the destination item in the item combination held by the received name matching destination data, and 12) calculation of a similarity, by the similarity calculation method in the method combination, between the two values after the cleansing processing, 2) aggregating the obtained similarities, and 3) outputting, as a name matching result, information based on the obtained aggregate value and the value of an identification destination item, which is a destination item held by the name matching destination data for uniquely identifying that name matching destination data;
and further,
the storage unit stores
dictionary data in which a value of the source item in the item combination set held by the name matching source data is registered in association with a value of the identification destination item held by the name matching destination data,
the program further
causing the control unit to execute
a dictionary-using name matching execution step of outputting, as a name matching result, when the value of the source item in the acquired item combination set held by the received name matching source data is the same as a value registered in the dictionary data, the value of the identification destination item registered in the dictionary data and information indicating that the value is registered in the dictionary data,
and
in the name matching execution step, processes 1) through 3) above are executed when the value of the source item in the acquired item combination set held by the received name matching source data is not the same as any value registered in the dictionary data;
A program characterized by the above.