JP4010843B2 - Action amount generation device, action amount generation method, and action amount generation program

Info

Publication number: JP4010843B2
Application number: JP2002090287A
Authority: JP
Inventors: 義典柳沼
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 2002-03-28
Filing date: 2002-03-28
Publication date: 2007-11-21
Anticipated expiration: 2022-03-28
Also published as: JP2003288580A

Description

【０００１】
【発明の属する技術分野】
本発明は、過去に蓄積された既知データを用いて予測対象データの目的変数値の予測を行う行動量生成装置に関し、特に予測結果が期待されたものではない場合に、希望する結果へと変更するためには、どのように、そしてどのくらい予測対象データの説明変数値を変更すれば良いかを計算することで、予測結果改善のための行動自動生成を行うことができる行動量生成装置に関するものである。
【０００２】
【従来の技術】
近年、コンピュータやインターネットの発達により、遠隔地も含めての様々かつ大量な情報を容易に入手可能となった。また、記憶装置の高密度化、低価格化により、それら得られた情報の蓄積も容易となった。例えば、流通業におけるPOS(Point Of Sale)システムでは全国各地の小売店の売り上げ内容を本社コンピュータなどに集めることが可能であり、時間と販売された商品の関係として刻々と蓄積されている。この他にも、製造業における各種製造装置の条件と生成された製品の歩留まりデータ、金融業における個人のクレジットカード使用状況や、保険業における保険使用者の個人データと使用状況の情報など、大量の情報が蓄積される分野は多岐にわたる。
【０００３】
そして、現在これらの大量の蓄積データ内に内在する因果関係や規則と言った価値ある情報を、自動的かつ効率的に抽出し、ビジネスに役立たせたいという要望が高まっている。
【０００４】
ところで、以前より、蓄積された既知事例データを利用して、統計的処理やAI（Artificial Intelligence）、ニューラルネットワーク等を用いて、未知のデータ（予測対象データ）を予測することが行われてきた。しかし、実応用の場面では、単に予測対象データの予測を行うだけでなく、予測結果から次に何をすべきか、行動を指示してくれることが望まれる場合がある。
【０００５】
例えば、製造業の場合、各種製造装置の条件データから生成された製品が「不良」と予測された場合、どの製造条件をどう変更すれば、「良」へと変化するかを調べたいという希望が多い。また、CRM（Customer Relationship Management）分野においては、企業のもつ顧客リストの中からあまりアクティブでない顧客を分類した場合に、その顧客と優良顧客との相違点を明確にし、各顧客ごとに優良顧客へと導くための戦略を見つけることで、例えば各顧客に最も効果的なキャンペーンを行うといったよりきめ細かな対応が可能となる。
【０００６】
さらに、保険業における保険使用者の個人データと使用状況の情報からリスクを調べる問題でも、例えば「危険」と予測された人に対して、その人の状況に最も似ており、かつ「安全」と予測されるような条件を求めることは、顧客データ理解の上で大変重要である。
【０００７】
【発明が解決しようとする課題】
しかしながら、従来の予測装置ではこのような情報は提供していないか、あるいは感度分析を用いた各説明変数の目的変数に対する敏感さを測定する程度にとどまっている。また、既に本出願人により特許出願されている「予測分析装置とその装置の実現に用いられるプログラム記録媒体（出願番号：H12-325213)」では、予測器として、ニューラルネットワークや決定木などによる全体モデルを生成し、生成された全体モデルを利用して変更値を計算する方法と事例に基づき変更値を探索する方法が提案されている。しかし、全体モデルを生成する方法では、まず十分な全体モデルを学習しなければならず、十分に学習できない場合には適用できないという問題がある。また、十分な学習を行わせるための試行錯誤や学習が十分かどうかの確認を行うために計算時間や手間がかかり、コスト高になるという問題があった。また、事例に基づき変更値を探索する方法では、探索方法としてはユーザが指定した範囲を全数探索することのみが行われており、大量データに対して実施する場合には時間がかかるという問題もあった。
【０００８】
本発明は、上述した問題点を解決するためになされたものであり、予測結果を希望する値へと変更するために、どのように、そしてどのくらい予測対象データの説明変数の変更を行うべきかを効率良くが求めることができる行動量生成装置、行動量生成方法、行動量生成プログラムを提供することを目的としている。
【０００９】
【課題を解決するための手段】
上述した課題を解決するため、本発明は、モデルを生成しない予測手法として、事例に基づく推論（Memory-Based Reasoning: 以下MBR）から得られるデータ情報を用い、予測対象データの目的変数の予測値に対して異なる結果を希望する場合に、効率的に計算を行う変更値計算部を備えることで、最もその予測対象データの各説明変数値に近く、かつ希望する値が予測値として得られるような予測対象データの変更量（行動量）が得られるようにしたものである。さらに、変更対象としたい説明変数と、変更せず固定としたい説明変数を指定可能とすることで、そもそも制御不可能な説明変数を計算対象から外すことができ、現実問題への適用範囲を広げることができる。加えて、各説明変数に対して優先度や値変更方向の指定も可能とすることで、ユーザが重視している説明変数から変更を行うといったきめ細かな戦略決定が可能となる。このように、本発明を用いることにより、予測値を希望する結果へと変更するための予測対象データの行動量を求めることが可能となる。
【００１０】
すなわち、本発明は、既知事例データに基づいて予測対象データに対する予測値及びデータ特徴を出力することができる予測部と、前記予測値に対する希望値を入力することができる希望値入力部（希望結果入力部４）と、前記希望値と前記予測値が入力され、前記予測値と前記希望値とを比較判断する比較判断部（予測結果判断部５）と、前記比較判断部の出力に基づいて、前記予測値と前記予測対象データと前記データ特徴を用いて、予測対象データの行動量を算出する変更値計算部とを備えてなるものである。
【００１１】
なお、本発明の行動量生成装置において、前記比較判断部は、前記予測値が前記希望値に不一致、あるいは誤差が許容値以上又は許容値より大きい場合に、前記変更値計算部に行動量を算出させるようにすることができる。また、本発明の行動量生成装置において、前記変更値計算部が算出する行動量は、予測値を希望値へと変更するために可能な最少の予測対象データの行動量であることを特徴とすることもできる。
【００１２】
更に、本発明の行動量生成装置において、探索条件の指定を可能とする探索条件入力部を備え、前記変更値計算部は、更に前記探索条件に基づいて予測対象データの行動量を算出することを特徴とするものである。
【００１３】
ここに、前記探索条件入力部により指定される探索条件として、制御可能な１つあるいは複数の説明変数を変更対象説明変数として指定できることを特徴とすることもできる。また、前記探索条件入力部により指定される探索条件として、変更対象説明変数ごとの変更値範囲を指定できることを特徴とすることもでき、さらには、前記探索条件入力部における探索条件として、変更対象説明変数の優先順位、及び数値属性の場合には優先探索方向が含まれることを特徴とすることもできる。
【００１４】
また、本発明は、前記予測部として事例に基づく推論を用い、前記予測部より得られるデータ特徴として前記事例に基づく推論により出力される影響度を用いることを特徴とするものである。
【００１５】
また、本発明の行動量生成装置において、前記予測部として事例に基づく推論を用い、前記予測部より得られるデータ特徴として、前記事例に基づく推論により出力される予測対象データの類似事例データおよびその類似度を用いることを特徴とすることができる。また、本発明において、前記変更値計算部は、前記予測対象データに最も類似し、かつ希望値を持つデータを類似事例データ中より探索し、探索されたデータと前記予測対象データとの相違をもって行動量とすることを特徴とすることもできる。さらに、前記変更値計算部は、前記予測対象データに最も類似し、かつ希望値を持つデータを類似事例データ中より探索し、探索されたデータを予測対象データとみなし、かつ既知事例データから除いて再度、前記予測部により予測を行わせ、得られた再予測値に対する前記比較判断部における判断結果に基づいて、前記予測対象データの行動量を算出することを特徴とすることもできる。
【００１６】
また、前記変更値計算部は、前記再予測値が希望値と一致、あるいは誤差が許容値以下又は許容値より小さい場合には、その探索されたデータと予測対象データとの相違をもって前記予測対象データの行動量とすることを特徴とすることもできる。さらに、前記変更値計算部は、前記再予測値が希望値と不一致、あるいは誤差が許容値以上又は許容値より大きい場合には、前記変更値計算を繰り返すことを特徴とすることもできる。
【００１７】
なお、前記予測部として事例に基づく推論を用い、前記予測部より得られるデータ特徴として前記事例に基づく推論により出力される影響度を用いることを特徴とする行動量生成装置と、前記予測対象データに最も類似し、かつ希望値を持つデータを類似事例データ中より探索し、探索されたデータを予測対象データとみなし、かつ既知事例データから除いて再度、前記予測部により予測を行わせ、得られた再予測値に対する前記比較判断部における判断結果に基づいて、前記予測対象データの行動量を算出する変更値計算部を備えた行動量生成装置とを並列に備えてなると共に、これら行動量生成装置による予測値に基づいて前記予測対象データの行動量を判断する判断部（最少行動判断部８）を備えてなる行動量生成装置を構成することもできる。
【００１８】
また、本発明における行動量生成方法は、既知事例データに基づいて予測対象データに対する予測値及びデータ特徴を出力する予測ステップと、前記予測値に対する希望値を入力する希望値入力ステップと、前記希望値と前記予測値が入力され、前記予測値と前記希望値とを比較判断する比較判断ステップと、前記比較判断結果に基づいて、前記予測値と前記予測対象データと前記データ特徴を用いて、予測対象データの行動量を算出する変更値計算ステップとを備えてなるものである。
【００１９】
また、本発明に係る行動量生成プログラムは、既知事例データに基づいて予測対象データに対する予測値及びデータ特徴を出力する予測ステップと、前記予測値に対する希望値を入力する希望値入力ステップと、前記希望値と前記予測値が入力され、前記予測値と前記希望値とを比較判断する比較判断ステップと、前記比較判断結果に基づいて、前記予測値と前記予測対象データと前記データ特徴を用いて、予測対象データの行動量を算出する変更値計算ステップとをコンピュータに実行させるものである。
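[0020]
[Embodiments of the invention]
Embodiments of the present invention are described below with reference to the drawings.
First, the basic configuration of the embodiments is described.
FIG. 1 is a block diagram showing the basic configuration of the embodiments. The basic configuration comprises a known case data input unit 1 for inputting known case data; a prediction target data input unit 2 for inputting prediction target data; a prediction unit 3 that receives the known case data and the prediction target data and outputs a predicted value; a desired result input unit 4 for inputting a desired value; a prediction result determination unit 5 that receives the desired value and the predicted value; and a change value calculation unit 6 that receives, as appropriate, the outputs of the prediction target data input unit 2, the prediction unit 3, and the prediction result determination unit 5, and generates an action amount.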
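[0021]
The known case data input through the known case data input unit 1 is accumulated past data in which both the explanatory variable values (conditions) and the associated objective variable value (result) are known; it is the basis for prediction. FIG. 2 shows manufacturing data as an example of known case data. The prediction target data input through the prediction target data input unit 2 is unknown data whose objective variable value (result) is not known. FIG. 3 shows an example of prediction target data. The present invention is applicable whether the explanatory variables (conditions) and the objective variable (result) are numerical or categorical attributes.
[0022]
The prediction unit 3 predicts the objective variable value of the prediction target data from the known case data and outputs it as the predicted value. The user inputs a desired value through the desired result input unit 4, and the prediction result determination unit 5 compares the desired value with the predicted value from the prediction unit 3. If the two match, or the error is within the allowable value, it outputs a termination request; otherwise it outputs a change value calculation request. The allowable value is input by the user in advance.
[0023]
In response to the change value calculation request and the desired value from the prediction result determination unit 5, the change value calculation unit 6 calculates changed data using the prediction target data together with the predicted value and data features passed from the prediction unit 3. The calculated change result is passed to the prediction unit 3 and used for another round of prediction and determination. In response to a termination request from the prediction result determination unit 5, the change value calculation unit 6 calculates the action to be taken from the prediction target data and the change result, performs action amount generation 6a, and outputs the result.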
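[0024]
FIG. 4 is a conceptual diagram of case-based reasoning (MBR), which is used as the prediction unit 3. The basic principle of MBR prediction is to search the accumulated known case data for the k data items closest to the prediction target data and to determine the predicted value from their weighted sum.
[0025]
The principle of the case-based reasoning used in the prediction unit 3 is first explained with reference to FIG. 4. FIG. 4 represents a multidimensional space in which each dimension is an explanatory variable of the data, and + and - denote objective variable values. In other words, each case is represented as a single point in the multidimensional space whose coordinates are its explanatory variable values.
[0026]
MBR prediction proceeds in the following order.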
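[0027]
(1) Calculate the influence with respect to a given objective variable from the accumulated known case data.
(2) Calculate the similarity between the prediction target data and each item of known case data.
(3) Using the similarities obtained in (2), select from the known case data the k items most similar to the prediction target data and take them as the similar case data.
(4) Determine the prediction result from the k similar case data items obtained.
(5) Calculate the confidence of the prediction result obtained.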
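[0028]
Each step is described in detail below using formulas and example data.
(1) Calculation of influence
The influence is the degree to which each explanatory variable, or each explanatory variable value, affects the objective variable value, obtained from the known case data by probability calculation. When the influence is wanted per explanatory variable value, methods such as CCF (Cross-Category Feature importance) and newCCF (new Cross-Category Feature importance) are used.
[0029]
The influence is calculated according to the following formula.
(For CCF)
w_i(v) = Σ p(c|v)^2   (a1)
Here c denotes each objective variable value, and the sum Σ runs over the objective variable values c. i is the index of the explanatory variable, v is the value of the explanatory variable, and p(c|v) is the probability that the objective variable takes value c when the explanatory variable takes value v.
[0030]
In other words, CCF squares the probability that the objective variable takes value c when the explanatory variable takes value v, and sums the result over all classes. Consequently, when an explanatory variable value v always leads to a single objective variable value c, the influence is 1; when every objective variable value c is equally likely given v, the influence takes its minimum value, 1/N_c (N_c is the number of objective variable values).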
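As an illustration of formula (a1), the following minimal Python sketch estimates p(c|v) by simple counting over the known cases and returns the CCF influence for each value of one explanatory variable. The function name `ccf_influence` and the toy weather data are assumptions made for this example and do not come from the patent.

```python
from collections import Counter, defaultdict


def ccf_influence(values, targets):
    """Return {v: sum over c of p(c|v)^2} for one explanatory variable.

    values  -- explanatory variable value of each known case
    targets -- objective variable value (class) of each known case
    """
    counts = defaultdict(Counter)
    for v, c in zip(values, targets):
        counts[v][c] += 1

    influence = {}
    for v, by_class in counts.items():
        n = sum(by_class.values())  # number of cases that take value v
        influence[v] = sum((k / n) ** 2 for k in by_class.values())
    return influence


# Hypothetical toy data: "sunny" always gives OK, "rain" is split evenly.
w = ccf_influence(
    ["sunny", "sunny", "rain", "rain"],
    ["OK", "OK", "OK", "NG"],
)
print(w)
```

With two objective variable values, the value that always leads to the same result gets influence 1 and the evenly split value gets the minimum 1/N_c = 0.5, matching the two limiting cases described above.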
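[0031]
(For newCCF) [The equation is not reproduced in this text.]
[0032]
Here the Σ outside the parentheses sums over c, and the Σ inside the parentheses sums over d. c denotes each objective variable value, i the index of the explanatory variable, v the value of the explanatory variable, N_c the number of objective variable values, p(c) the probability that the objective variable value is c, and p(c|v) the probability that the objective variable takes value c when the explanatory variable takes value v. q_v(c) is the ratio of the probability of objective variable value c given explanatory variable value v to the overall probability that the objective variable value is c.
[0033]
The newCCF method improves on CCF in two respects: 1) it takes skew in the distribution of objective variable values into account, and 2) it sets the influence to 0 when an explanatory variable value v does not contribute to determining the objective variable value. That is, when the distribution of objective variable values c among the data taking explanatory variable value v matches the overall distribution of objective variable values, the numerator becomes 0, so the influence is 0 (the minimum) and the effect of that explanatory variable value on the prediction is eliminated. On the other hand, when an explanatory variable value v leads to only a single objective variable value c, the influence is 1.0, the same as with CCF.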
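The newCCF equation itself is missing from this text, so the sketch below is only one form that is consistent with the properties stated above, not necessarily the patent's formula. It assumes q_v(c) = p(c|v) / p(c) and w_i(v) = Σ_c | q_v(c) / Σ_d q_v(d) - 1/N_c | / (2 - 2/N_c). The function name and the toy data are likewise assumptions.

```python
from collections import Counter


def new_ccf_influence(values, targets):
    """Assumed newCCF-style influence per explanatory variable value.

    Uses q_v(c) = p(c|v) / p(c), normalises q_v over the classes, and
    measures how far the result is from the uniform distribution 1/N_c.
    This form is an assumption; the source equation is not available.
    """
    class_counts = Counter(targets)
    n_c = len(class_counts)
    if n_c < 2:
        raise ValueError("at least two objective variable values are needed")
    total = len(targets)
    prior = {c: k / total for c, k in class_counts.items()}  # p(c)

    by_value = {}
    for v, c in zip(values, targets):
        by_value.setdefault(v, Counter())[c] += 1

    influence = {}
    for v, cnt in by_value.items():
        n = sum(cnt.values())
        q = {c: (cnt[c] / n) / prior[c] for c in prior}  # q_v(c)
        q_sum = sum(q.values())
        numerator = sum(abs(q[c] / q_sum - 1 / n_c) for c in prior)
        influence[v] = numerator / (2 - 2 / n_c)
    return influence


# Hypothetical toy data: "a" is pure OK, "b" mirrors the overall 50/50 split.
w_new = new_ccf_influence(
    ["a", "a", "b", "b", "c", "c"],
    ["OK", "OK", "OK", "NG", "NG", "NG"],
)
print(w_new)
```

Under this assumed form, a value whose class distribution equals the overall distribution gets influence 0, and a value that always leads to one class gets influence 1.0, which are the two properties the text attributes to newCCF.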
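[0034]
FIG. 5 shows an example of newCCF influences for the explanatory variables "pressure" and "concentration", and FIG. 6 shows the influence example for pressure alone.
In FIGS. 5 and 6, the objective variable value toward which each explanatory variable value pushes (the "direction" column in FIG. 5) is derived from how the proportion of objective variable values among the data taking that explanatory variable value differs from the overall proportion. FIG. 6 shows, for the explanatory variable "pressure", the influence of each explanatory variable value on the objective variable values (OK, NG). The magnitude of the influence is shown by the length of the bar and its direction by the fill: solid black indicates influence toward OK and hatching indicates influence toward NG. In this example, pressure values near the middle of the range (0.75 to 2.0) push toward OK.
[0035]
For both CCF and newCCF, if the known case data contains missing values, the affected records are deleted before the influence is calculated. When the explanatory variable is a numerical attribute, v denotes a numerical range rather than a single value.
FIG. 7 illustrates the influence and the direction of influence of each explanatory variable value of one item of prediction target data. In the figure, pressure (1.9) and weather (sunny) have influence that contributes toward OK, whereas temperature 1 (13.8 degrees) and temperature 2 (8.9 degrees) have influence that contributes toward NG.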
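[0036]
(2) Calculation of the similarity between the prediction target data and the known case data
Similarity is a measure of how alike two cases are. The similarity between the prediction target data and each item of known case data is obtained by equations (2) and (3).
[0037]
[Equations (2) and (3) are not reproduced in this text.]
[0038]
Here Σ sums over i, and w_i(v) is the influence obtained in (1). That is, the similarity is the reciprocal of the influence-weighted distance between cases, so the more alike two cases are, the higher the similarity. When the known case data contains a missing value, the single-attribute distance for that attribute is taken to be 1; when the prediction target data contains a missing value, the single-attribute distance for that attribute is taken to be 0.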
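Equations (2) and (3) are not available in this text, so the following sketch only illustrates the verbal description: a similarity that is the reciprocal of an influence-weighted distance, with the two missing-value rules applied per attribute. The weighted Euclidean form, the function names, and the 0/1 attribute distance are all assumptions.

```python
import math


def similarity(target, case, weights, attr_distance):
    """Reciprocal of an influence-weighted distance (assumed form).

    target, case  -- dicts of attribute name -> value (None means missing)
    weights       -- attribute name -> influence w_i(v) of the target's value
    attr_distance -- hypothetical function (name, a, b) -> distance in [0, 1]
    """
    total = 0.0
    for name, t_val in target.items():
        c_val = case.get(name)
        if t_val is None:
            d = 0.0  # missing in the prediction target data
        elif c_val is None:
            d = 1.0  # missing in the known case data
        else:
            d = attr_distance(name, t_val, c_val)
        total += weights.get(name, 0.0) * d ** 2
    return math.inf if total == 0.0 else 1.0 / math.sqrt(total)


def same_or_not(name, a, b):
    """Simplest possible attribute distance: 0 if equal, 1 otherwise."""
    return 0.0 if a == b else 1.0


# Hypothetical values for illustration only.
target = {"pressure": 1.9, "weather": "sunny"}
weights = {"pressure": 0.7, "weather": 0.25}
s_diff = similarity(target, {"pressure": 1.9, "weather": "rain"}, weights, same_or_not)
s_same = similarity(target, {"pressure": 1.9, "weather": "sunny"}, weights, same_or_not)
s_miss = similarity(target, {"pressure": None, "weather": "sunny"}, weights, same_or_not)
print(s_diff, s_same, s_miss)
```

Returning infinity for an exact match is a design choice of this sketch; a practical implementation might instead cap the similarity or add a small constant to the distance.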
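[0039]
(3) Selection of the k known cases most similar to the prediction target data
Using the similarity between the prediction target data and each item of known case data obtained in (2), the k items with the highest similarity are selected from the known case data and taken as the similar case data. Here k is determined in one of the following ways.
a) It is specified by the user in advance.
b) Part of the known case data is treated as prediction target data, prediction is repeated with several values of k, and the k value giving the highest prediction accuracy is automatically chosen as the optimum.
The prediction procedure itself is described in the next item, (4).
FIG. 8 shows, for k = 10, the reciprocal of the similarity of each similar case selected for the prediction target data shown in FIG. 3.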
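[0040]
(4) Determination of the prediction result from the k similar case data items
a) When the objective variable is a categorical attribute
The total similarity T_c for each objective variable value is obtained from the similar case data by equation (a2). Here S_j is the similarity between the prediction target data and the j-th of the k similar case data items selected in (3).
[0041]
Total similarity for each objective variable value: T_c = Σ S_j   (a2)
The sum runs over the similar cases j whose objective variable value is c.
[0042]
According to equation (a3), the objective variable value c that gives the largest T_c among the values obtained is taken as the predicted value C_predict.
Predicted value C_predict = [c | max(T_c)]   (a3)
[0043]
b) When the objective variable is a numerical attribute
The predicted value c_predict is determined by equation (a4). Here k is the number of similar case data items, S_j is the similarity between the j-th similar case and the prediction target data, and c_j is the objective variable value of the j-th similar case.
[0044]
Predicted value c_predict = Σ S_j · c_j / Σ S_j   (a4)
Here Σ runs from j = 1 to k. That is, the predicted value is the similarity-weighted sum of the objective variable values.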
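Equations (a2) to (a4) translate almost directly into code. The sketch below takes the k similar cases as a list of (similarity, objective value) pairs; the function names and the sample numbers are assumptions for illustration.

```python
from collections import defaultdict


def predict_categorical(neighbors):
    """Equations (a2) and (a3): class with the largest similarity total.

    neighbors -- list of (similarity S_j, class label c_j)
    Returns (predicted class, {class: T_c}).
    """
    totals = defaultdict(float)
    for s, c in neighbors:
        totals[c] += s  # T_c = sum of S_j over cases whose class is c
    predicted = max(totals, key=totals.get)
    return predicted, dict(totals)


def predict_numeric(neighbors):
    """Equation (a4): similarity-weighted mean of the objective values."""
    weighted = sum(s * c for s, c in neighbors)
    return weighted / sum(s for s, _ in neighbors)


# Hypothetical similar cases.
label, totals = predict_categorical([(2.0, "OK"), (1.0, "NG"), (0.5, "NG")])
value = predict_numeric([(1.0, 10.0), (3.0, 20.0)])
print(label, totals, value)
```

Note that a single very similar case can outvote several weaker ones, which is the intended effect of weighting by similarity rather than counting votes.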
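[0045]
(5) Calculation of the confidence of the prediction result
Confidence is a measure of the reliability of the prediction result. It is obtained as follows.
a) When the objective variable is a categorical attribute
It is obtained by equation (a5) as the ratio of the total similarity of the predicted result c_predict to the sum of the total similarities of all objective variable values.
[0046]
Confidence P = T_{c_predict} / Σ T_c   (a5)
The sum runs over c.
b) When the objective variable is a numerical attribute
It is obtained similarly by equation (a6), where σ_c is the standard deviation of the objective variable.
Confidence P = 1 / Σ S_j · (c_j - c_predict)^2 / (σ_c^2 · Σ S_j + 1)   (a6)
Here Σ runs from j = 1 to k.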
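The following sketch computes both confidence measures. Equation (a5) is unambiguous. The grouping in (a6) is ambiguous as printed, so the numeric version below assumes the reading P = 1 / ( Σ S_j (c_j - c_predict)^2 / (σ_c^2 Σ S_j) + 1 ), which keeps P in the range (0, 1] and gives P = 1 when every similar case agrees with the prediction. That reading, the function names, and the sample numbers are assumptions.

```python
def confidence_categorical(totals, predicted):
    """Equation (a5): share of the similarity total held by the prediction."""
    return totals[predicted] / sum(totals.values())


def confidence_numeric(neighbors, predicted, sigma):
    """Assumed reading of (a6); neighbors is a list of (S_j, c_j)."""
    s_sum = sum(s for s, _ in neighbors)
    spread = sum(s * (c - predicted) ** 2 for s, c in neighbors)
    return 1.0 / (spread / (sigma ** 2 * s_sum) + 1.0)


# Hypothetical inputs.
p_cat = confidence_categorical({"OK": 2.0, "NG": 1.5}, "OK")
p_exact = confidence_numeric([(1.0, 10.0), (1.0, 10.0)], 10.0, sigma=2.0)
p_half = confidence_numeric([(1.0, 8.0), (1.0, 12.0)], 10.0, sigma=2.0)
print(p_cat, p_exact, p_half)
```

In the last call the similarity-weighted spread of the neighbors equals the variance of the objective variable, so the confidence under this reading is 0.5.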
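[0047]
As an example, FIG. 9 shows the prediction result and its confidence for the prediction target data of FIG. 3, obtained from the similar data of FIG. 8.
In this way, the prediction target data can be predicted from the known case data and the confidence of the prediction can be presented.
[0048]
The MBR-based prediction method described here has already been filed by the present applicant under the title "Prediction device based on similar cases and method thereof" (publication number: JP 2000-155681 A).
[0049]
FIG. 10 is a conceptual diagram showing how, in MBR prediction, an item of known case data is removed from the known case data when that item is itself re-predicted as prediction target data. When the change value is calculated from similar cases, the change value obtained is itself one of the known cases, so its predicted value is always the desired value. However, the change value may have been affected by noise. The change value is therefore removed from the known case data and predicted from the similar cases around it; if the predicted value is the same as the result value of the original case, the change value can be judged to be stable.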
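[0050]
The operation of the basic configuration of this embodiment is described below. FIG. 11 is a flowchart of the overall operation. In step S1 of FIG. 11, the known case data and the prediction target data are first input to the prediction unit 3 through the known case data input unit 1 and the prediction target data input unit 2 (steps S1a, S1b, S1c), and prediction is performed (step S2). Next, the predicted value obtained by the prediction unit 3 and the desired value input through the desired value input unit 4 are input to the prediction result determination unit 5, which determines whether a change calculation is necessary (step S3).
[0051]
If the two do not match, or the error is equal to or greater than the allowable value, and a change value calculation is judged necessary (step S3, Y), the change value calculation unit 6 calculates the change value using the prediction target data, the predicted value and data features obtained from the prediction unit 3, and the desired value obtained through the prediction result determination unit 5 (step S6). In step S6, when case-based reasoning (MBR) is used as the prediction unit 3, the change value is calculated efficiently using one of the following as the data feature.
[0052]
1) The influence of the known case data, output by MBR.
2) The similar case data of the prediction target data and their similarities, output by MBR.
[0053]
The change value obtained is treated as prediction target data and input to the prediction unit 3, which performs prediction again (step S2). The prediction result determination unit 5 again determines from the re-predicted value whether a change value calculation is necessary (step S3), and this cycle is repeated until it is judged unnecessary. However, if the change value calculation unit 6 determines that no applicable change value exists, processing ends at that point (step S7).
[0054]
When the change value is calculated using the similar case data and similarities, the change value obtained is always contained in the original known case data. In that case there are two options: the action amount may be derived directly from that change value, or, to guard against the effect of noise, the change value may be treated as prediction target data and removed from the known case data, predicted again by the prediction unit 3, and left to the judgment of the prediction result determination unit 5.
[0055]
When the prediction result determination unit 5 judges that no change value calculation is necessary (step S3, N), the change value calculation unit 6 obtains the difference between the prediction target data and the changed data (step S4), outputs it as the action amount, and ends (step S5).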
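The control flow of FIG. 11 (predict, compare, calculate a change value, predict again, and finally output the difference) can be sketched as a small loop. This is a minimal sketch, not the patent's implementation: the callables `predict`, `propose_change`, and `is_satisfied`, the `max_rounds` safety limit, and the toy data are all assumptions.

```python
def generate_action(target, desired, predict, propose_change, is_satisfied,
                    max_rounds=100):
    """Loop of steps S2, S3, S6 and the final difference of steps S4, S5.

    Returns {variable: (old, new)} for the changed variables, an empty dict
    if the first prediction already satisfies the desired value, or None if
    no applicable change value exists (step S7).
    """
    candidate = dict(target)
    for _ in range(max_rounds):  # max_rounds is a safety limit added here
        predicted = predict(candidate)                      # step S2
        if is_satisfied(predicted, desired):                # step S3, N
            return {k: (target[k], v)                       # steps S4, S5
                    for k, v in candidate.items() if v != target[k]}
        candidate = propose_change(candidate, predicted, desired)  # step S6
        if candidate is None:                               # step S7
            return None
    return None


# Hypothetical stand-ins for the prediction unit and change value calculation.
target = {"pressure": 2.3, "concentration": 18.8}
proposals = iter([{"pressure": 2.0, "concentration": 18.8}])

action = generate_action(
    target,
    "OK",
    predict=lambda x: "OK" if x["pressure"] <= 2.0 else "NG",
    propose_change=lambda cand, pred, want: next(proposals, None),
    is_satisfied=lambda pred, want: pred == want,
)
print(action)
```

For a numerical objective variable, `is_satisfied` would compare the error against the user-supplied allowable value instead of testing equality.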
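[0056]
Embodiments based on the basic configuration described above are now described in more detail.
Embodiment 1.
FIG. 12 is a block diagram showing Embodiment 1. It adds a search condition input unit 7 to the basic configuration of FIG. 1, allowing the user to specify search conditions. The change value calculation unit 6 calculates the change value so that the specified search conditions are also satisfied. FIG. 13 shows an example of prediction target data after search conditions have been specified.
[0057]
FIG. 13 shows the following.
1) The first row holds a mark for each explanatory variable indicating whether it is to be changed or fixed.
2) The second row gives the search priority, and the third row gives the search direction.
3) The fourth and fifth rows give the search range of each explanatory variable.
[0058]
When an explanatory variable is a categorical attribute, the possible values are listed in the fourth row only as the search range, and no search direction can be specified; the third and fifth rows are therefore blank. The user specifies these items freely through a GUI (Graphical User Interface) or the like.
[0059]
When the search condition input unit 7 is present, the change value calculation unit 6 calculates the change value in light of the search conditions entered. The search conditions that can be specified through the search condition input unit 7 are as follows.
[0060]
1) Designation of the explanatory variables to be changed.
2) Designation of the range of change for each explanatory variable to be changed.
3) Designation of the priority among the explanatory variables to be changed and, for numerical attributes, the direction of change.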
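One way to hold the search conditions listed above is a small record per explanatory variable, mirroring the rows described for FIG. 13 (change or fix mark, priority, direction, range). The class and field names below, and the concrete ranges, are assumptions for illustration; FIG. 13 itself is not reproduced in this text.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class SearchCondition:
    """Search condition for one explanatory variable (hypothetical layout)."""
    name: str
    changeable: bool = True                  # change target or fixed
    priority: Optional[int] = None           # 1 = searched first
    direction: Optional[str] = None          # "up" or "down", numeric only
    lower: Optional[float] = None            # numeric search range
    upper: Optional[float] = None
    allowed: Optional[Sequence[str]] = None  # categorical search range

    def permits(self, value) -> bool:
        """True if the variable may be changed to this value."""
        if not self.changeable:
            return False
        if self.allowed is not None:
            return value in self.allowed
        if self.lower is not None and value < self.lower:
            return False
        if self.upper is not None and value > self.upper:
            return False
        return True


def search_order(conditions):
    """Changeable variables, prioritised ones first in priority order."""
    changeable = [c for c in conditions if c.changeable]
    return sorted(changeable, key=lambda c: (c.priority is None, c.priority or 0))


# Hypothetical conditions: concentration first, pressure limited to a range,
# temperature 1 held fixed.
conditions = [
    SearchCondition("pressure", priority=2, direction="down", lower=1.0, upper=2.0),
    SearchCondition("temperature1", changeable=False),
    SearchCondition("concentration", priority=1),
]
order = [c.name for c in search_order(conditions)]
print(order)
```

Keeping fixed variables in the same list, rather than dropping them, lets later steps check that a candidate change leaves them untouched.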
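[0061]
According to Embodiment 1, the user can specify search conditions, which makes prediction more efficient.
[0062]
Embodiment 2.
Embodiment 2 is described next.
Embodiment 2 uses case-based reasoning (MBR) as the prediction unit 3 and the influence output by MBR as the data feature obtained from the prediction unit 3. The change value calculation unit 6 uses the given influences to calculate, efficiently and automatically, an action amount (change amount) of the prediction target data that is as small as possible while changing the predicted value to the desired value. The block diagram of FIG. 12, for example, can be used.
[0063]
FIG. 14 is a flowchart of the change value calculation method in the change value calculation unit 6 that makes use of the MBR influences. In FIG. 14, the necessary data are first input (step S11). Next, when the search condition input unit 7 is present and change target explanatory variables have been specified (step S12, Y), only the influences of those explanatory variables are extracted (step S13). Then, if a search range has been specified (step S14, Y), only the influences within that search range are extracted (step S15). FIG. 15 shows the influences extracted according to the example search conditions of FIG. 13. FIG. 15 reflects the fact that a range is specified for the explanatory variable "pressure" in FIG. 13.
[0064]
Next, if priorities have been specified for the explanatory variables (step S16, Y), the search order is set to the priority order (step S17). Otherwise (step S16, N), the extracted influence values and their directions are examined: the explanatory variables whose influence pushes toward something other than the desired value come first, in descending order of influence, followed by the explanatory variables whose influence pushes toward the desired value, in ascending order of influence (step S18).
[0065]
In the example of FIG. 13 the search order is specified, so the search proceeds from concentration to pressure. If the search order had not been specified, pressure (1.25-1.50: 0.70, OK) in FIG. 15 has a higher influence than concentration (25.0-27.5: 0.65, OK) among the influences pushing toward the desired value, so the search would proceed from pressure to concentration.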
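[0066]
The search then proceeds in the search order as follows. This corresponds to step S19; FIG. 16 is a flowchart of that part alone. In FIG. 16, once the search order has been set (step S21), it is determined whether an explanatory variable remains to be searched (step S22). If none remains (step S22, N), "not applicable" is output and processing ends (steps S33, S20).
[0067]
If an explanatory variable remains to be searched (step S22, Y) and it is a categorical attribute (step S23, Y), the category value with the largest influence toward the desired value is searched for (step S24). If the explanatory variable is a numerical attribute (step S23, N), it is determined whether a preferred direction has been specified (step S30). If so (step S30, Y), the value is moved in the preferred direction and the nearest value (boundary value) whose influence pushes toward the desired value is searched for (step S31). If no preferred direction has been specified (step S30, N), the value is moved step by step toward whichever side, larger or smaller, has the nearer boundary value, and the nearest value (boundary value) whose influence pushes toward the desired value is searched for (step S32).
[0068]
Next, it is determined whether the search found a result (step S25). For both numerical and categorical attributes, if no result exists (step S25, N), that explanatory variable offers no applicable value and the process moves on to the next explanatory variable to be searched (step S22). If a result exists (step S25, Y) and the current value of that explanatory variable does not have influence toward the desired value (S26, N), the value found is output as is (S29, S20).
[0069]
If, in step S26, the current value of the explanatory variable already has influence toward the desired value (step S26, Y), it is determined whether the influence of the value found is larger than that of the current value (step S27), and the value found is output as the search result only if it is larger (step S27, Y; steps S29, S20). If it is not larger (step S27, N), the process moves on to the next explanatory variable to be searched (step S22).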
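For a numerical explanatory variable, steps S30 to S32 and S25 to S27 amount to finding the nearest range boundary whose influence pushes toward the desired value. The sketch below is one possible reading: ranges are treated as closed intervals, and the function name, the argument layout, and all influence values in the example are illustrative assumptions rather than data from the figures.

```python
def nearest_favorable_value(current, bins, desired, direction=None,
                            current_influence=None):
    """Find the closest boundary value that pushes toward the desired value.

    bins              -- list of (low, high, influence, leaning)
    direction         -- optional preferred direction, "up" or "down"
    current_influence -- influence of the current value if it already pushes
                         toward the desired value (steps S26, S27), else None
    Returns (new_value, influence) or None when nothing applies.
    """
    best = None
    for low, high, influence, leaning in bins:
        if leaning != desired or low <= current <= high:
            continue  # wrong direction of influence, or not a change
        if current_influence is not None and influence <= current_influence:
            continue  # step S27: must beat the current influence
        side, boundary = ("down", high) if high < current else ("up", low)
        if direction is not None and side != direction:
            continue  # step S31: only search in the preferred direction
        gap = abs(boundary - current)
        if best is None or gap < best[0]:
            best = (gap, boundary, influence)
    return None if best is None else (best[1], best[2])


# Hypothetical influence tables; the numbers are for illustration only.
concentration_bins = [
    (20.0, 22.5, 0.40, "NG"),
    (22.5, 25.0, 0.55, "OK"),
    (25.0, 27.5, 0.65, "OK"),
]
pressure_bins = [(0.75, 2.0, 0.70, "OK"), (2.0, 3.0, 0.60, "NG")]

conc_change = nearest_favorable_value(18.8, concentration_bins, "OK")
pres_change = nearest_favorable_value(2.3, pressure_bins, "OK")
pres_blocked = nearest_favorable_value(2.3, pressure_bins, "OK", direction="up")
print(conc_change, pres_change, pres_blocked)
```

Returning the boundary of the nearest favorable range, rather than its midpoint, keeps the proposed change as small as possible.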
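[0070]
The change value produced in step S29 (step S20) is output in the form of the prediction target data with the relevant explanatory variable value replaced by the value found. The change value is input to the prediction unit 3 for re-prediction, and the prediction result determination unit 5 checks whether the prediction result for the changed data is the desired value. If it does not match the desired value, or the error is equal to or greater than the allowable value, and another change calculation is needed, the same explanatory variable is first searched for a value with a larger influence; if one exists, it is output as the change value. If none exists, the search moves on to the next explanatory variable in the search order, either keeping the change value so far or using a stepwise method or the like. When no further change calculation is needed, the difference between the original prediction target data and the changed data is output. FIG. 17 shows an example of the changes.
[0071]
In the example of FIG. 17, for prediction target data No. 101, the process follows the priority order and first looks, for "concentration" = 18.8, for the nearest range whose influence contributes to OK (concentration 22.5-25.0), changes the value to 22.5, and performs MBR prediction on the changed data. The result does not satisfy the desired value, so the same explanatory variable "concentration" is searched for a value with a larger influence toward the desired value, and the range (concentration 25.0-27.5) is found. "Concentration" is therefore changed to 25.0 and MBR prediction is performed on the changed data in the same way. The prediction result still does not satisfy the desired value, and "concentration" has no value with a larger influence toward the desired value, so the explanatory variable next in priority, "pressure", is changed from 2.3 to 2.0. MBR prediction is performed again on the changed data, and if the desired value OK is satisfied, the difference between the changed data and the original values (No. 101) is calculated and output as the action. In the example of FIG. 17, the final output is as follows.
[0072]
Example: "Change the concentration from 18.8 to 22.5 and the pressure from 2.3 to 2.0."
[0073]
If change values have been calculated for every variable in the search order but no change value is obtained that satisfies the search range and makes the predicted value equal to the desired value, "not applicable" is output and processing ends.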
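[0074]
Embodiment 3.
Embodiment 3 is described next. Embodiment 3 uses case-based reasoning (MBR) as the prediction unit 3 and, as the data features obtained from the prediction unit 3, the similar case data of the prediction target data and their similarities, which MBR can output. The change value calculation unit 6 uses the given similar case data and similarities to calculate, efficiently and automatically, an action amount (change amount) of the prediction target data that is as small as possible while changing the predicted value to the desired value.
[0075]
In this case, as an efficient way for the change value calculation unit 6 to calculate the action amount (change amount) from the similar cases and their similarities, the data that is most similar to the prediction target data and has the desired value is searched for among the similar case data, and the action amount is derived from the difference between the retrieved data and the prediction target data.
[0076]
Furthermore, the retrieved data may be treated as prediction target data and removed from the known case data, predicted again by the prediction unit 3, and the result judged by the prediction result determination unit 5. If the re-predicted value matches the desired value, or the error is within the allowable value, the difference between the retrieved data and the prediction target data is taken as the action amount. If it does not match, or the error is equal to or greater than the allowable value, the change value calculation unit 6 repeats the change value calculation. This yields a stable action amount that is robust to noise.
[0077]
FIG. 8 shows the similar case data of the prediction target data obtained by MBR prediction, together with the reciprocals of their similarities. As already described, the similar case data are the k items of known case data most similar to the prediction target data in MBR prediction. The similarity is a measure of how alike each similar case and the prediction target data are, and is obtained by equations (2) and (3) above.
[0078]
For the case where the predicted value does not match the desired value, FIG. 18 shows the error between the desired value and the objective variable value of each similar case when the objective variable is a categorical attribute, and FIG. 19 shows the same when the objective variable is a numerical attribute. In both figures each point represents one similar case. In FIG. 18 the error is 0 when the desired value and the objective variable value of the similar case match and 1 when they do not; the figure shows that the cases with high similarity do not match, but matching similar cases appear further along. In FIG. 19 the objective variable is a numerical attribute, so the error is a continuous value.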
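[0079]
FIG. 20 is a flowchart of the change value calculation method in the change value calculation unit 6 that makes use of the MBR similar cases and similarities. In FIG. 20, once the required data have been input (step S41), only the similar cases whose objective variable value is the desired value, or within a fixed error of it, are extracted and set as the new similar case data (step S42). Next, when the search condition input unit 7 is present and change target explanatory variables have been specified (step S43, Y), the similar cases that differ from the prediction target data only in those explanatory variables are extracted and set as the new similar case data (step S44). Then, if the explanatory variables have priorities (step S45, Y), the similar cases are re-sorted by similarity within groups, with the cases that differ only in the higher-priority explanatory variable placed first (step S46). If a preferred direction has also been specified, it is taken into account in the sorting.
[0080]
FIG. 21 shows an example of the similar case data of FIG. 8 re-sorted according to the search conditions of FIG. 13. In FIG. 21 the re-sorted similar case data are produced as follows.
[0081]
1) The similar cases that do not have the desired value (No. 68, No. 11, No. 45, No. 83, No. 92, No. 73) are deleted.
2) No. 08 has the desired value but is deleted because the fixed explanatory variables in the search conditions, temperature 1 and temperature 2, differ from those of the prediction target data.
3) No. 62 has a lower similarity than No. 32, but because concentration has the higher priority among the explanatory variables, No. 62, which differs only in concentration, is moved ahead.
4) The similar cases are then examined in the re-sorted order. If a search range has been specified, it is determined whether the retrieved similar case lies within the range; if it lies outside, the search moves on to the next similar case in the re-sorted order (step S47).
5) When a similar case with the desired value as its result is found within the search range (step S48, Y), one of the following is done.
a) The difference between the original prediction target data and the retrieved data is output as the action (step S49).
b) The retrieved data is treated as prediction target data and removed from the known case data, re-predicted by the prediction unit 3, and checked for stability (step S49) (FIG. 10).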
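[0082]
In b) above, if the prediction result determination unit 5 finds that the re-predicted value does not match the desired value, or that the error is equal to or greater than the allowable value, and judges that another change value calculation is necessary, the search resumes with the next similar case in the re-sorted order, and this cycle is repeated until no change value calculation is needed. When the desired value and the predicted value match, or the error is within the allowable value, and no further change calculation is needed, the difference between the original prediction target data and the retrieved data is output as the action amount. FIG. 22 shows an example from change to output.
[0083]
In the example of FIG. 22, the similar case with the highest similarity, No. 62, is first treated as the changed data. With No. 62 removed from the known case data, No. 62 is predicted by MBR and checked against the desired value. If it does not satisfy the desired value, the next most similar case, No. 34, is treated as the changed data, predicted by MBR with No. 34 removed from the known case data, and checked against the desired value. If the desired value is satisfied, the search ends there, and the difference between the changed data (the similar case treated as such) and the original data (No. 101) is output as the action. In the example of FIG. 22, the final output is as follows.
[0084]
Example: "Change the concentration from 18.8 to 22.0 and the pressure from 2.3 to 1.8."
[0085]
If, after all similar cases have been searched, no similar case exists that satisfies the search range and has the desired value as its result, "not applicable" is output and processing ends (step S49).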
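The search of Embodiment 3 can be sketched as a filter, a sort, and a leave-one-out check. This is a simplified sketch under stated assumptions: it sorts by similarity only (the priority-based re-sorting of step S46 is omitted), the `repredict` callable stands in for the prediction unit 3 with one case excluded, and the case values and similarities are hypothetical apart from the concentration and pressure figures quoted in the text.

```python
def action_from_similar_cases(target, similar, desired, fixed, repredict):
    """Pick the most similar stable case that has the desired value.

    similar   -- list of (case_id, similarity, features, outcome)
    fixed     -- names of explanatory variables that must not change
    repredict -- hypothetical callable (features, exclude=case_id) -> value,
                 predicting with the excluded case removed from the known data
    Returns {variable: (old, new)} or None if nothing applies.
    """
    candidates = [
        c for c in similar
        if c[3] == desired and all(c[2][k] == target[k] for k in fixed)
    ]
    candidates.sort(key=lambda c: c[1], reverse=True)  # most similar first
    for case_id, _, features, _ in candidates:
        if repredict(features, exclude=case_id) == desired:  # stability check
            return {k: (target[k], v)
                    for k, v in features.items() if v != target[k]}
    return None


# Hypothetical data loosely following the FIG. 22 walk-through.
target = {"concentration": 18.8, "pressure": 2.3, "temperature1": 13.8}
similar = [
    (68, 0.95, {"concentration": 18.5, "pressure": 2.3, "temperature1": 13.8}, "NG"),
    (62, 0.90, {"concentration": 21.0, "pressure": 2.3, "temperature1": 13.8}, "OK"),
    (8, 0.85, {"concentration": 19.0, "pressure": 2.3, "temperature1": 15.0}, "OK"),
    (34, 0.80, {"concentration": 22.0, "pressure": 1.8, "temperature1": 13.8}, "OK"),
]

action = action_from_similar_cases(
    target, similar, "OK", fixed=["temperature1"],
    repredict=lambda features, exclude: "NG" if exclude == 62 else "OK",
)
print(action)
```

Here case 62 fails the leave-one-out check, so the search falls through to case 34, which reproduces the example output quoted above.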
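[0086]
Embodiment 4.
Embodiment 4 of the present invention is described below.
FIGS. 23 and 24 are block diagrams of Embodiment 4. The basic configuration of FIG. 23 is the same as that of FIG. 12, but FIGS. 23 and 24 differ in that the part consisting of the prediction unit 3, the prediction result determination unit 5, and the change value calculation unit 6 is provided more than once, as parallel execution parts 100 (100A, 100B), and in that a minimum action determination unit 8 for determining the minimum action is provided. One execution part, 100A, corresponds to Embodiment 2, and the other, 100B, corresponds to Embodiment 3. In Embodiment 4, the change value calculation using influences and the change value calculation using similar case data and similarities are carried out in parallel on the same prediction target data, and the output is as follows.
[0087]
1) If both execution parts output "not applicable", "not applicable" is output.
2) If only one of the execution parts outputs a change value, the difference between that change value and the prediction target data is output as the action amount.
3) If both execution parts output a change value, the minimum action determination unit 8 determines which is more similar to the prediction target data, and the difference between the more similar change value and the prediction target data is output as the action amount.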
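[0088]
The minimum action determination unit 8 can make its determination as follows.
a) When a priority order or preferred direction has been specified through the search condition input unit 7
The two change values are compared with the prediction target data on the highest-priority explanatory variable, and the result that is closer to the prediction target data and changed toward the preferred direction is selected. If the two show no difference, the explanatory variable with the next highest priority is examined, and this is repeated until a difference appears. Finally, the difference between the closer result and the prediction target data is output as the action.
[0089]
b) When no priority order or preferred direction has been specified
The distance between the prediction target data and each change value is calculated by equation (4), and the difference between the closer change value and the prediction target data is output as the action.
Distance between data d = Σ (d_i)^2   (4)
Here the sum Σ runs over i.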
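The three output rules of Embodiment 4 and the distance of equation (4) fit in a few lines. The per-attribute distance d_i is not defined in this part of the text, so the sketch takes it as a caller-supplied function; that function, the scaling constants, and the names below are assumptions.

```python
def squared_distance(target, changed, attr_distance):
    """Equation (4): d = sum over i of (d_i)^2."""
    return sum(attr_distance(k, target[k], changed[k]) ** 2 for k in target)


def closer_change(target, change_a, change_b, attr_distance):
    """Choose between the change values of the two parallel parts.

    Either argument may be None, meaning that part reported "not applicable".
    Returns the selected change value, or None if both are None.
    """
    if change_a is None:
        return change_b
    if change_b is None:
        return change_a
    d_a = squared_distance(target, change_a, attr_distance)
    d_b = squared_distance(target, change_b, attr_distance)
    return change_a if d_a <= d_b else change_b


# Hypothetical per-attribute distance: absolute difference over a fixed scale.
scale = {"concentration": 10.0, "pressure": 1.0}


def scaled_gap(name, a, b):
    return abs(a - b) / scale[name]


target = {"concentration": 18.8, "pressure": 2.3}
by_influence = {"concentration": 22.5, "pressure": 2.0}   # Embodiment 2 result
by_similarity = {"concentration": 22.0, "pressure": 1.8}  # Embodiment 3 result

chosen = closer_change(target, by_influence, by_similarity, scaled_gap)
print(chosen)
```

With these assumed scales the influence-based change is closer to the original data, so it would be the one reported as the action; a different choice of d_i could reverse that outcome.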
[0090]
According to Embodiment 4, an optimal prediction can be made quickly and efficiently.
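[0091]
(Appendix 1) An action amount generation device comprising: a prediction unit that can output a predicted value and data features for prediction target data based on known case data; a desired value input unit through which a desired value for the predicted value can be input; a comparison determination unit that receives the desired value and the predicted value and compares them; and a change value calculation unit that, based on the output of the comparison determination unit, calculates an action amount of the prediction target data using the predicted value, the prediction target data, and the data features.
(Appendix 2) The action amount generation device according to Appendix 1,
wherein the comparison determination unit causes the change value calculation unit to calculate the action amount when the predicted value does not match the desired value, or when the error is equal to or greater than, or greater than, an allowable value.
(Appendix 3) The action amount generation device according to Appendix 1 or 2,
wherein the action amount calculated by the change value calculation unit is the minimum action amount of the prediction target data that makes it possible to change the predicted value to the desired value.
(Appendix 4) The action amount generation device according to any one of Appendices 1 to 3,
further comprising a search condition input unit that allows search conditions to be specified, wherein the change value calculation unit additionally calculates the action amount of the prediction target data based on the search conditions.
(Appendix 5) The action amount generation device according to Appendix 4,
wherein the search conditions specified through the search condition input unit allow one or more controllable explanatory variables to be designated as change target explanatory variables.
(Appendix 6) The action amount generation device according to Appendix 4 or 5,
wherein the search conditions specified through the search condition input unit allow a range of change values to be specified for each change target explanatory variable.
(Appendix 7) The action amount generation device according to any one of Appendices 4 to 6,
wherein the search conditions in the search condition input unit include the priority order of the change target explanatory variables and, for numerical attributes, a preferred search direction.
(Appendix 8) The action amount generation device according to any one of Appendices 1 to 7,
wherein case-based reasoning is used as the prediction unit, and the influence output by the case-based reasoning is used as the data feature obtained from the prediction unit.
(Appendix 9) The action amount generation device according to any one of Appendices 1 to 7,
wherein case-based reasoning is used as the prediction unit, and the similar case data of the prediction target data output by the case-based reasoning, together with their similarities, are used as the data features obtained from the prediction unit.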
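(Appendix 10) The action amount generation device according to Appendix 9,
wherein the change value calculation unit searches the similar case data for the data that is most similar to the prediction target data and has the desired value, and takes the difference between the retrieved data and the prediction target data as the action amount.
(Appendix 11) The action amount generation device according to Appendix 9,
wherein the change value calculation unit searches the similar case data for the data that is most similar to the prediction target data and has the desired value, treats the retrieved data as prediction target data while removing it from the known case data, has the prediction unit predict again, and calculates the action amount of the prediction target data based on the comparison determination unit's judgment of the resulting re-predicted value.
(Appendix 12) The action amount generation device according to Appendix 11,
wherein, when the re-predicted value matches the desired value, or the error is equal to or less than, or less than, the allowable value, the change value calculation unit takes the difference between the retrieved data and the prediction target data as the action amount of the prediction target data.
(Appendix 13) The action amount generation device according to Appendix 11,
wherein, when the re-predicted value does not match the desired value, or the error is equal to or greater than, or greater than, the allowable value, the change value calculation unit repeats the change value calculation.
(Appendix 14) An action amount generation device comprising, in parallel, the action amount generation device according to Appendix 8 and the action amount generation device according to any one of Appendices 11 to 13, and further comprising a determination unit that determines the action amount of the prediction target data based on the predicted values from these action amount generation devices.
(Appendix 15) An action amount generation method comprising: a prediction step of outputting a predicted value and data features for prediction target data based on known case data;
a desired value input step of inputting a desired value for the predicted value;
a comparison determination step of receiving the desired value and the predicted value and comparing them; and
a change value calculation step of calculating, based on the result of the comparison, an action amount of the prediction target data using the predicted value, the prediction target data, and the data features.
(Appendix 16) An action amount generation program that causes a computer to execute: a prediction step of outputting a predicted value and data features for prediction target data based on known case data;
a desired value input step of inputting a desired value for the predicted value;
a comparison determination step of receiving the desired value and the predicted value and comparing them; and
a change value calculation step of calculating, based on the result of the comparison, an action amount of the prediction target data using the predicted value, the prediction target data, and the data features.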
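[0092]
[Effects of the invention]
As described above, the present invention makes it possible to calculate efficiently how, and by how much, the prediction target data should be changed in order to change the predicted value to the desired value.
[Brief description of the drawings]
[FIG. 1] A block diagram showing the basic configuration of the embodiments.
[FIG. 2] A diagram showing an example of case data.
[FIG. 3] A diagram showing prediction target data (No. 101).
[FIG. 4] A conceptual diagram of MBR.
[FIG. 5] A diagram showing an example of influences obtained by newCCF.
[FIG. 6] A diagram showing an explanatory variable (influence example for pressure).
[FIG. 7] A diagram showing an example of the influences of prediction target data.
[FIG. 8] A diagram showing the reciprocals of the similarities of the similar case data for prediction target data (No. 101).
[FIG. 9] A diagram showing the prediction result for prediction target data (No. 101).
[FIG. 10] A conceptual diagram of MBR in which the prediction target data is deleted from the known case data.
[FIG. 11] A flowchart showing the overall operation.
[FIG. 12] A block diagram showing Embodiment 1.
[FIG. 13] A diagram showing an example of specifying change target explanatory variables, search ranges, search priorities, and search directions.
[FIG. 14] A flowchart showing the operation of the change value calculation unit using influences.
[FIG. 15] A diagram showing the influences of pressure and concentration extracted according to the search conditions.
[FIG. 16] A flowchart showing the operation of the change value calculation unit using influences.
[FIG. 17] A diagram showing an example of change value calculation using influences for prediction target data (No. 101).
[FIG. 18] A diagram showing the relationship between similarity and the error between objective variable value and desired value for a categorical attribute.
[FIG. 19] A diagram showing the relationship between similarity and the error between objective variable value and desired value for a numerical attribute.
[FIG. 20] A flowchart showing the operation of the change value calculation unit using similar cases and similarities.
[FIG. 21] A diagram showing the similar case data for prediction target data (No. 101) re-sorted according to the search conditions of FIG. 20.
[FIG. 22] A diagram showing an example of change value calculation using similar case data and similarities for prediction target data (No. 101).
[FIG. 23] A first block diagram showing Embodiment 4.
[FIG. 24] A second block diagram showing Embodiment 4.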
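[Explanation of symbols]
1 known case data input unit; 2 prediction target data input unit; 3, 3A, 3B prediction unit; 4 desired result input unit (desired value input unit); 5, 5A, 5B prediction result determination unit (comparison determination unit); 6, 6A, 6B change value calculation unit; 7 search condition input unit; 8 minimum action determination unit (determination unit).
[0001]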
[Technical field of the invention]
The present invention relates to an action amount generation device that predicts the objective variable value of prediction target data using known data accumulated in the past. In particular, it relates to an action amount generation device that, when the prediction result is not the expected one, can automatically generate actions for improving the prediction result by calculating how, and by how much, the explanatory variable values of the prediction target data should be changed in order to reach the desired result.
[0002]
[Prior art]
In recent years, the spread of computers and the Internet has made it easy to obtain large amounts of varied information, including from remote locations. Higher-density, lower-cost storage devices have also made it easy to accumulate that information. For example, a POS (Point Of Sale) system in the distribution industry can collect the sales records of retail stores across the country on a head-office computer, where they accumulate continuously as relationships between time and the products sold. Large amounts of information are accumulated in many other fields as well: the conditions of manufacturing equipment and the yield data of the resulting products in manufacturing, individuals' credit card usage in finance, and policyholders' personal data and usage records in insurance.
[0003]
There is now growing demand to extract valuable information, such as the causal relationships and rules latent in this accumulated data, automatically and efficiently, and to put it to use in business.
[0004]
Predicting unknown data (prediction target data) from accumulated known case data by means of statistical processing, AI (Artificial Intelligence), neural networks, and the like has long been practiced. In real applications, however, it is sometimes desirable not only to predict the prediction target data but also to be told what action to take next on the basis of the prediction result.
[0005]
For example, in manufacturing, when a product is predicted to be "defective" from the condition data of the manufacturing equipment, users often want to find out which manufacturing conditions should be changed, and how, for the product to become "good". In the CRM (Customer Relationship Management) field, when less active customers are picked out of a company's customer list, clarifying how each differs from good customers and finding a strategy for leading each one toward becoming a good customer enables a finer-grained response, such as running the most effective campaign for each customer.
[0006]
Likewise, in the insurance problem of assessing risk from policyholders' personal data and usage records, finding, for a person predicted to be "dangerous", the conditions that are most similar to that person's situation and yet are predicted to be "safe" is very important for understanding customer data.
[0007]
[Problems to be solved by the invention]
However, conventional prediction devices either do not provide such information or go no further than using sensitivity analysis to measure how sensitive the objective variable is to each explanatory variable. The present applicant's earlier patent application, "Predictive analysis device and program recording medium used to realize the device" (application number: H12-325213), proposes two methods: one generates an overall model such as a neural network or decision tree as the predictor and calculates change values using the generated model, and the other searches for change values based on cases. The overall-model method, however, first requires a sufficiently well-trained model and cannot be applied when such training is not possible. The trial and error needed for sufficient training, and the checks on whether training is sufficient, also take computation time and effort and raise costs. The case-based method searches only by exhaustively enumerating the range specified by the user, so it is slow when applied to large amounts of data.
[0008]
The present invention has been made to solve the problems described above. Its object is to provide an action amount generation device, an action amount generation method, and an action amount generation program that can efficiently determine how, and by how much, the explanatory variables of the prediction target data should be changed in order to change the prediction result to a desired value.
[0009]
[Means for Solving the Problems]
To solve the problems described above, the present invention uses, as a prediction method that does not generate a model, the data information obtained from case-based reasoning (Memory-Based Reasoning; hereinafter MBR), and includes a change value calculation unit that performs the calculation efficiently when a result different from the predicted value of the objective variable of the prediction target data is desired. This yields the change amount (action amount) of the prediction target data that is closest to the current explanatory variable values of that data and for which the desired value is obtained as the predicted value. Furthermore, by allowing the user to specify which explanatory variables are to be changed and which are to be held fixed, explanatory variables that cannot be controlled in the first place can be excluded from the calculation, which widens the range of real problems to which the invention applies. In addition, by allowing a priority and a direction of value change to be specified for each explanatory variable, fine-grained strategy decisions become possible, such as changing first the explanatory variables the user considers most important. In this way, the invention makes it possible to obtain the action amount of the prediction target data needed to change the predicted value to the desired result.
[0010]
That is, the present invention comprises: a prediction unit that can output a predicted value and data features for prediction target data based on known case data; a desired value input unit (desired result input unit 4) through which a desired value for the predicted value can be input; a comparison determination unit (prediction result determination unit 5) that receives the desired value and the predicted value and compares them; and a change value calculation unit that, based on the output of the comparison determination unit, calculates the action amount of the prediction target data using the predicted value, the prediction target data, and the data features.
[0011]
In the action amount generation device of the present invention, the comparison determination unit can cause the change value calculation unit to calculate the action amount when the predicted value does not match the desired value, or when the error is equal to or greater than, or greater than, an allowable value. The action amount calculated by the change value calculation unit can also be characterized as the minimum action amount of the prediction target data that makes it possible to change the predicted value to the desired value.
[0012]
Furthermore, the action amount generation device of the present invention includes a search condition input unit that allows search conditions to be specified, and the change value calculation unit additionally calculates the action amount of the prediction target data based on those search conditions.
[0013]
Here, the search conditions specified through the search condition input unit may allow one or more controllable explanatory variables to be designated as change target explanatory variables. They may also allow a range of change values to be specified for each change target explanatory variable, and may further include the priority order of the change target explanatory variables and, for numerical attributes, a preferred search direction.
[0014]
Further, the present invention is characterized in that an inference based on a case is used as the prediction unit, and an influence level output by the inference based on the case is used as a data feature obtained from the prediction unit.
[0015]
Further, in the behavior amount generation device of the present invention, as the prediction unit, inference based on a case is used, and as a data feature obtained from the prediction unit, similar case data of prediction target data output by the inference based on the case and its data It can be characterized by using similarity. In the present invention, the change value calculation unit searches the similar case data for data that is most similar to the prediction target data and has a desired value, and has a difference between the searched data and the prediction target data. It can also be characterized by an action amount. Further, the change value calculation unit searches the similar case data for data that is most similar to the prediction target data and has a desired value, regards the searched data as prediction target data, and removes it from the known case data Then, the prediction unit may perform prediction again, and the behavior amount of the prediction target data may be calculated based on the determination result in the comparison determination unit with respect to the obtained re-prediction value.
[0016]
Further, when the re-predicted value matches the desired value, or the error is equal to or smaller than the allowable value or smaller than the allowable value, the change value calculation unit determines the prediction target with a difference between the searched data and the prediction target data. It can also be characterized by the action amount of data. Further, the change value calculation unit may repeat the change value calculation when the re-predicted value does not match the desired value or when the error is greater than or equal to a tolerance value or greater than the tolerance value.
[0017]
Further, an action amount generation device may be configured that includes both an action amount generation device in which case-based inference is used as the prediction unit and the degree of influence output by the case-based inference is used as the data feature obtained from the prediction unit, and an action amount generation device including a change value calculation unit that searches the similar case data for the data most similar to the prediction target data and having the desired value, regards the retrieved data as prediction target data, excludes it from the known case data, has the prediction unit perform prediction again, and calculates the action amount of the prediction target data based on the determination result of the comparison determination unit for the re-predicted value, together with a determination unit (minimum action determination unit 8) that determines the action amount of the prediction target data based on the predicted values from these generation devices.
[0018]
Further, the action amount generation method according to the present invention includes: a prediction step of outputting a predicted value and a data feature for prediction target data based on known case data; a desired value input step of inputting a desired value for the predicted value; a comparison determination step of receiving the desired value and the predicted value and comparing them; and a change value calculation step of calculating an action amount of the prediction target data, based on the comparison result, using the predicted value, the prediction target data, and the data feature.
[0019]
Moreover, the action amount generation program according to the present invention causes a computer to execute: a prediction step of outputting a predicted value and a data feature for prediction target data based on known case data; a desired value input step of inputting a desired value for the predicted value; a comparison determination step of receiving the desired value and the predicted value and comparing them; and a change value calculation step of calculating an action amount of the prediction target data, based on the comparison result, using the predicted value, the prediction target data, and the data feature.
[0020]
DETAILED DESCRIPTION OF THE INVENTION
Hereinafter, embodiments of the present invention will be described with reference to the drawings.
First, a basic configuration in the embodiment of the present invention will be described.
FIG. 1 is a block diagram showing the basic configuration of an embodiment of the present invention. This basic configuration includes: a known case data input unit 1 for inputting known case data; a prediction target data input unit 2 for inputting prediction target data; a prediction unit 3 that receives the known case data and the prediction target data and outputs a predicted value; a desired result input unit 4 for inputting a desired value; a prediction result determination unit 5 that receives the desired value and the predicted value; and a change value calculation unit 6 that receives, as appropriate, the outputs of the prediction target data input unit 2, the prediction unit 3, and the prediction result determination unit 5, and generates an action amount.
[0021]
The known case data input by the known case data input unit 1 is accumulated past data for which both the values of the explanatory variables (conditions) and the value of the associated objective variable (result) are known; it serves as the basis for prediction. An example of manufacturing data as known case data is shown in FIG. The prediction target data input by the prediction target data input unit 2 is unknown data whose objective variable value (result) is not known. FIG. 3 shows an example of prediction target data. In the present invention, both the explanatory variables (conditions) and the objective variable (result) may be either numeric or categorical attributes.
[0022]
The prediction unit 3 predicts the objective variable value of the prediction target data based on the known case data and outputs it as a predicted value. The user inputs a desired value through the desired result input unit 4, and the prediction result determination unit 5 compares the desired value with the predicted value from the prediction unit 3. If the two match, or if the error is less than the allowable value, an end request is issued; otherwise, a change value calculation request is output. The allowable value is input in advance by the user.
[0023]
In accordance with the change value calculation request and the desired value from the prediction result determination unit 5, the change value calculation unit 6 calculates change data using the prediction target data and the predicted value and data feature passed from the prediction unit 3. The calculated change result is passed to the prediction unit 3 and used for re-prediction and determination. In response to an end request from the prediction result determination unit 5, the change value calculation unit 6 calculates the action to be taken from the prediction target data and the change result, and outputs it as the action amount (action amount generation 6a).
[0024]
FIG. 4 is a conceptual diagram of case-based reasoning (MBR) used as the prediction unit 3. The basic principle of MBR prediction is to search the accumulated known case data for the k cases closest to the prediction target data and to determine the predicted value as their weighted sum.
[0025]
First, the principle of the case-based reasoning used in the prediction unit 3 will be described with reference to FIG. 4. FIG. 4 shows a multidimensional space in which each dimension represents an explanatory variable of the data, and + and − represent objective variable values. That is, each case is represented as a single point whose coordinates are its explanatory variable values.
[0026]
As stated above, MBR prediction searches the accumulated known case data for the k cases closest to the prediction target data and determines the predicted value as their weighted sum. Prediction by MBR proceeds in the following order.
[0027]
(1) The degree of influence on the objective variable is calculated from the accumulated known case data.
(2) The similarity between the prediction target data and each known case is calculated.
(3) Using the similarity obtained in (2), the k cases most similar to the prediction target data are selected from the known case data as similar case data.
(4) The prediction result is determined using the k similar case data obtained.
(5) A certainty factor is calculated for the obtained prediction result.
[0028]
In the following, details will be described using equations and data examples.
(1) Calculation of the degree of influence
The degree of influence of each explanatory variable, or of each explanatory variable value, on the objective variable value is calculated from the known case data. When the degree of influence is to be obtained for each explanatory variable value, a method such as CCF (Cross-Category Feature importance) or newCCF (new Cross-Category Feature importance) is used.
[0029]
The degree of influence is calculated by the following formula.
(CCF)
$$w_i(v) = \sum_{c} p(c \mid v)^2 \qquad \text{(a1)}$$
Here, c denotes each objective variable value, and the sum Σ is taken over all objective variable values c. In addition, i is the number of the explanatory variable, v is the value of the explanatory variable, and p(c | v) is the probability of taking the objective variable value c when the explanatory variable takes the value v.
[0030]
That is, CCF squares the probability of taking the objective variable value c when the explanatory variable takes the value v, and sums the result over all classes. As a result, if a single objective variable value c is always taken when the explanatory variable takes a certain value v, the degree of influence is 1. If every objective variable value c is equally probable for the explanatory variable value v, the degree of influence takes its minimum, 1/N_c (N_c is the number of objective variable values).
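As an illustration only, the CCF formula (a1) can be sketched in Python as follows. The function name `ccf_influence` and the toy weather data are hypothetical and are not taken from the patent; probabilities are estimated as simple relative frequencies, which is an assumption.

```python
from collections import Counter, defaultdict

def ccf_influence(values, classes):
    """Return {v: sum over c of p(c|v)^2} for one explanatory variable."""
    counts = defaultdict(Counter)
    for v, c in zip(values, classes):
        counts[v][c] += 1                      # co-occurrence counts of (v, c)
    influence = {}
    for v, per_class in counts.items():
        total = sum(per_class.values())
        # p(c|v) estimated as a relative frequency (assumption)
        influence[v] = sum((n / total) ** 2 for n in per_class.values())
    return influence

# Hypothetical toy data: one explanatory variable and an OK/NG result.
weather = ["sunny", "sunny", "rain", "rain"]
result = ["OK", "OK", "OK", "NG"]
w = ccf_influence(weather, result)
```

In this toy data "sunny" always leads to OK, so its degree of influence is 1, while "rain" splits evenly between the two classes and takes the minimum 1/N_c = 0.5.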
[0031]

[0032]
In the newCCF formula, the Σ outside the parentheses is the sum over c, and the Σ inside the parentheses is the sum over d. Here, c is each objective variable value, i is the number of the explanatory variable, v is the value of the explanatory variable, N_c is the number of objective variable values, p(c) is the probability that the objective variable value is c, and p(c | v) is the probability of taking the objective variable value c when the explanatory variable takes the value v. Q_v(c) is the ratio of p(c | v) to p(c), that is, the ratio of the probability of taking the objective variable value c when the explanatory variable takes the value v to the probability that the objective variable value is c in the first place.
[0033]
The newCCF method improves on the CCF method in two respects: 1) it takes the bias in the distribution of objective variable values into account, and 2) when an explanatory variable value v does not contribute to determining the objective variable value, the degree of influence is set to 0. That is, when the distribution of objective variable values c for a given explanatory variable value v matches the overall distribution of objective variable values, the numerator is 0 and the degree of influence is 0 (the minimum), so the effect of that value on the prediction is eliminated. Conversely, when only a single objective variable value c occurs for a given explanatory variable value v, the degree of influence is 1.0, as in CCF.
[0034]
FIG. 5 shows an example of the newCCF degree of influence for the explanatory variables “pressure” and “concentration”, and FIG. 6 shows an example of the degree of influence for pressure.
In FIG. 5 and FIG. 6, the direction in which each explanatory variable value pushes the objective variable value (the “direction” column in FIG. 5) is obtained from the change in the proportion of each objective variable value among the data taking that explanatory variable value. FIG. 6 shows the degree of influence of each explanatory variable value on the objective variable value (OK, NG) for the explanatory variable “pressure”. The magnitude of the influence is indicated by the length of each bar and its direction by the fill: black indicates influence in the OK direction, and hatching indicates influence in the NG direction. In this example, pressure values near the center (0.75 to 2.0) can be seen to have influence in the OK direction.
[0035]
With either CCF or newCCF, if the known case data contains a missing value, the degree of influence is calculated after deleting the corresponding record. When the explanatory variable is a numeric attribute, v denotes a numeric range rather than a single value.
FIG. 7 illustrates the degree of influence and its direction for each explanatory variable value of a given prediction target data. In the figure, pressure (1.9) and weather (sunny) of the prediction target data have a degree of influence contributing in the OK direction, whereas temperature 1 (13.8 degrees) and temperature 2 (8.9 degrees) have a degree of influence contributing in the NG direction.
[0036]
(2) Calculation of similarity between prediction target data and known case data
Similarity is a measure of similarity between case data. Using equations (2) and (3), the degree of similarity between the prediction target data and each known case data is obtained.
[0037]

[0038]
Here, the sum Σ is taken over i, and w_i(v) is the degree of influence obtained in (1). That is, the similarity is the reciprocal of the distance between cases weighted by the degree of influence, and it increases as the cases become more alike. When the known case data contains a missing value, the per-attribute distance for the corresponding attribute is taken as 1; when the prediction target data contains a missing value, the per-attribute distance for the corresponding attribute is taken as 0, and the similarity is calculated accordingly.
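Equations (2) and (3) are not reproduced in this text, so the following Python sketch rests on assumptions: it uses a 0/1 mismatch distance per attribute and an influence-weighted Euclidean form, neither of which is confirmed by the source. Only the missing-value rules (distance 1 for a missing known-case value, distance 0 for a missing prediction-target value) and the "reciprocal of the weighted distance" idea come from the description above. The names `attribute_distance` and `similarity` are hypothetical.

```python
import math

def attribute_distance(target_value, case_value):
    # Missing-value rules from the text: a missing value in the prediction
    # target counts as distance 0, a missing value in the known case as 1.
    if target_value is None:
        return 0.0
    if case_value is None:
        return 1.0
    # Assumption: simple 0/1 mismatch distance for a single attribute.
    return 0.0 if target_value == case_value else 1.0

def similarity(target, case, weights):
    # Assumption: influence-weighted Euclidean distance; the similarity is
    # its reciprocal (infinite when the two cases coincide).
    squared = sum(w * attribute_distance(t, c) ** 2
                  for t, c, w in zip(target, case, weights))
    distance = math.sqrt(squared)
    return math.inf if distance == 0 else 1.0 / distance

# Hypothetical two-attribute cases with degrees of influence 0.5 and 0.25.
s_mismatch = similarity(["a", "b"], ["a", "c"], [0.5, 0.25])
s_case_missing = similarity(["a", "b"], ["a", None], [0.5, 0.25])
s_target_missing = similarity(["a", None], ["a", "c"], [0.5, 0.25])
```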
[0039]
(3) Selection of the k cases most similar to the prediction target data from the known case data
Using the similarity between each known case and the prediction target data obtained in (2), the k cases with the highest similarity are selected from the known case data as similar case data. Here, k is determined by one of the following methods.
a) It is designated by the user in advance.
b) A part of the known case data is regarded as prediction target data, prediction is repeated with a plurality of k values, and the k value with the highest rate of correct predictions is automatically adopted as the optimum value.
The prediction procedure is described in the next section (4).
FIG. 8 shows the similar case data selected with k = 10 for the prediction target data shown in FIG. 3, together with the reciprocal of the similarity of each.
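Method b) can be sketched as follows, as a minimal illustration under stated assumptions. The functions `predict` and `choose_k`, the one-dimensional toy cases, and the similarity function `sim` are all hypothetical; ties between k values are broken in favor of the first candidate tried, which the text does not specify.

```python
from collections import defaultdict

def predict(target, cases, labels, k, sim):
    # Similarity-weighted vote among the k most similar known cases.
    ranked = sorted(range(len(cases)),
                    key=lambda j: sim(target, cases[j]), reverse=True)[:k]
    totals = defaultdict(float)
    for j in ranked:
        totals[labels[j]] += sim(target, cases[j])
    return max(totals, key=totals.get)

def choose_k(cases, labels, candidates, sim, holdout):
    # holdout: indices of known cases treated as prediction target data.
    best_k, best_rate = None, -1.0
    for k in candidates:
        correct = 0
        for h in holdout:
            rest = [j for j in range(len(cases)) if j != h]  # exclude the case itself
            guess = predict(cases[h], [cases[j] for j in rest],
                            [labels[j] for j in rest], k, sim)
            correct += guess == labels[h]
        rate = correct / len(holdout)
        if rate > best_rate:          # keep the first k with the best rate
            best_k, best_rate = k, rate
    return best_k

# Hypothetical one-dimensional known cases.
cases = [0.0, 0.1, 1.0, 1.1, 1.2, 1.3]
labels = ["A", "A", "B", "B", "B", "B"]
sim = lambda a, b: 1.0 / (1.0 + abs(a - b))
best = choose_k(cases, labels, [1, 3, 5], sim, holdout=range(len(cases)))
```

With this toy data the two "A" cases are outvoted once k grows beyond 1, so the smallest candidate gives the highest rate of correct predictions.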
[0040]
(4) Prediction result determination using k similar case data obtained
a) When the objective variable is a categorical value attribute
Using the similar case data, the total similarity T_c for each objective variable value is obtained by the following equation (a2), where S_j is the similarity between the prediction target data and the j-th of the k similar case data selected in (3).
[0041]
$$T_c = \sum_{j \in c} S_j \qquad \text{(a2)}$$
The sum is taken over the similar cases j whose objective variable value is c.
[0042]
According to the following equation (a3), the objective variable value c that gives the largest T_c among those obtained is determined as the predicted value c_predict.
$$c_{\text{predict}} = \arg\max_{c} T_c \qquad \text{(a3)}$$
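Equations (a2) and (a3) amount to a similarity-weighted vote. A minimal Python sketch follows; the function name `predict_category` and the four neighbor values are hypothetical.

```python
from collections import defaultdict

def predict_category(similar_cases):
    """similar_cases: list of (similarity S_j, objective variable value c_j)."""
    totals = defaultdict(float)
    for s_j, c_j in similar_cases:
        totals[c_j] += s_j                      # T_c, equation (a2)
    predicted = max(totals, key=totals.get)     # largest T_c, equation (a3)
    return predicted, dict(totals)

# Hypothetical similar case data with k = 4.
neighbors = [(0.5, "OK"), (0.25, "NG"), (0.5, "NG"), (0.125, "OK")]
label, totals = predict_category(neighbors)
```

Here T_OK = 0.625 and T_NG = 0.75, so the predicted value is NG even though the single most similar cases are tied between the two classes.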
[0043]
b) When the objective variable has a numeric attribute
The predicted value c_predict is determined using the following equation (a4). Here, k is the number of similar case data, S_j is the similarity between the j-th similar case data and the prediction target data, and c_j is the objective variable value of the j-th similar case data.
[0044]
$$c_{\text{predict}} = \frac{\sum_{j=1}^{k} S_j \, c_j}{\sum_{j=1}^{k} S_j} \qquad \text{(a4)}$$
Here, each Σ is taken over j = 1 to k. That is, the predicted value is determined as a similarity-weighted average.
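For a numeric objective variable, equation (a4) can be sketched in the same way. The function name `predict_numeric` and the neighbor values are hypothetical.

```python
def predict_numeric(similar_cases):
    """similar_cases: list of (similarity S_j, objective variable value c_j)."""
    weighted = sum(s_j * c_j for s_j, c_j in similar_cases)  # numerator of (a4)
    total = sum(s_j for s_j, _ in similar_cases)             # denominator of (a4)
    return weighted / total

# Hypothetical similar case data with k = 3.
neighbors = [(0.5, 10.0), (0.25, 20.0), (0.25, 40.0)]
value = predict_numeric(neighbors)
```

The most similar case (similarity 0.5) pulls the result toward its value of 10.0, giving 20.0 rather than the unweighted mean of about 23.3.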
[0045]
(5) Calculate certainty factor for the obtained prediction result
The certainty factor is a scale representing the reliability of the prediction result. It is obtained as follows.
a) When the objective variable is a categorical value attribute
The ratio of the total similarity of the prediction result c_predict to the sum of the total similarities of all objective variable values is calculated by the following equation (a5).
[0046]
$$P = \frac{T_{c_{\text{predict}}}}{\sum_{c} T_c} \qquad \text{(a5)}$$
The sum in the denominator is taken over c.
b) When the objective variable has a numeric attribute
Similarly, it is obtained by the following equation (a6), where σ_c is the standard deviation of the objective variable.
$$P = \frac{1}{\dfrac{\sum_{j=1}^{k} S_j \,(c_j - c_{\text{predict}})^2}{\sigma_c^2 \sum_{j=1}^{k} S_j} + 1} \qquad \text{(a6)}$$
Here, each Σ is taken over j = 1 to k.
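The two certainty factors can be sketched as below. Equation (a5) is implemented as written. For (a6), the grouping of terms in the source is ambiguous, so the sketch assumes the reading P = 1 / (weighted squared deviation / (σ_c² ΣS_j) + 1), which yields P = 1 when all similar cases agree with the predicted value; this reading is an assumption. The function names and sample values are hypothetical.

```python
def certainty_category(totals, predicted):
    # Equation (a5): share of the winning class in the total similarity.
    return totals[predicted] / sum(totals.values())

def certainty_numeric(similar_cases, predicted, sigma_c):
    # Assumed reading of equation (a6); see the note above.
    total = sum(s_j for s_j, _ in similar_cases)
    spread = sum(s_j * (c_j - predicted) ** 2 for s_j, c_j in similar_cases)
    return 1.0 / (spread / (sigma_c ** 2 * total) + 1.0)

# Hypothetical values.
p_cat = certainty_category({"OK": 0.625, "NG": 0.75}, "NG")
p_agree = certainty_numeric([(0.5, 10.0), (0.5, 10.0)], 10.0, sigma_c=2.0)
p_spread = certainty_numeric([(0.5, 8.0), (0.5, 12.0)], 10.0, sigma_c=2.0)
```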
[0047]
As an example, FIG. 9 shows the prediction result and the certainty factor for the prediction target data of FIG. 3, obtained from the similar case data of FIG. 8.
As described above, prediction target data is predicted using known case data, and the certainty factor can be presented.
[0048]
Note that the prediction method using MBR described here has already been filed by the present applicant under the title “prediction apparatus and method based on similar cases” (Application Publication No. 2000-155681).
[0049]
FIG. 10 is a conceptual diagram of removing the relevant data from the known case data when a known case is re-predicted as prediction target data in MBR prediction. When the change value is calculated from similar cases, the obtained change value is itself one of the known cases, so predicting it without removal always returns the desired value. The obtained change value may, however, be affected by noise. The change value is therefore excluded from the known case data and predicted from the similar cases around it; if the predicted value does not differ from the result value of the original case, the change value can be judged to be stable.
[0050]
The operation of the basic configuration of the present embodiment is described below. FIG. 11 is a flowchart of the overall operation. In FIG. 11, first, in step S1, the known case data and the prediction target data are input to the prediction unit 3 through the known case data input unit 1 and the prediction target data input unit 2 (steps S1a, S1b, S1c), and prediction is performed (step S2). Next, the predicted value obtained by the prediction unit 3 and the desired value input through the desired result input unit 4 are input to the prediction result determination unit 5, which determines whether a change value calculation is necessary (step S3).
[0051]
When the two do not match, or the error is greater than or equal to the allowable value, and it is determined that a change value calculation is necessary (step S3, Y), the change value calculation unit 6 calculates the change value using the prediction target data, the predicted value and data feature obtained from the prediction unit 3, and the desired value obtained through the prediction result determination unit 5 (step S6). In step S6, when case-based reasoning (MBR) is used as the prediction unit 3, the change value is calculated efficiently using either of the following as the data feature.
[0052]
(1) The degree of influence of known case data output by MBR.
(2) Similar case data and similarity of prediction target data output by MBR.
[0053]
The obtained change value is regarded as data to be predicted and is input to the prediction unit 3, and the prediction unit 3 performs prediction again (step S2). With respect to the re-predicted value by the prediction unit 3, the prediction result determination unit 5 determines again whether or not a change value calculation is necessary (step S3), and this cycle is repeated until it is determined that it is not necessary. However, if the change value calculation unit 6 determines that there is no corresponding change value, the process ends there (step S7).
[0054]
Here, when the change value is calculated using the similar case data and the similarity, the obtained change value is always contained in the original known case data. In that case, to guard against the influence of noise when determining the action amount from the change value, the change value may be regarded as prediction target data and excluded from the known case data, after which the prediction unit 3 performs prediction again and the judgment is left to the prediction result determination unit 5.
[0055]
When the prediction result determination unit 5 determines that a change value calculation is unnecessary (step S3, N), the change value calculation unit 6 obtains the difference between the prediction target data and the change data (step S4), outputs it as the action amount, and ends the process (step S5).
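The overall cycle of FIG. 11 (predict, compare, calculate a change value, re-predict) can be sketched as follows for a categorical objective variable. This is a simplified illustration: `predict` and `propose_change` are hypothetical callbacks standing in for the prediction unit 3 and the change value calculation unit 6, and the `max_rounds` guard is an assumption that does not appear in the source.

```python
def generate_action(target, desired, predict, propose_change, max_rounds=100):
    current = dict(target)
    for _ in range(max_rounds):
        if predict(current) == desired:           # step S3: no change needed
            # steps S4, S5: difference between original and changed data
            return {name: (target[name], value)
                    for name, value in current.items() if target[name] != value}
        current = propose_change(current)         # step S6: next change value
        if current is None:                       # step S7: no applicable change
            return None
    return None

# Hypothetical stand-ins: OK whenever pressure is at most 2.0, and a change
# rule that lowers pressure in steps of 0.1.
predict = lambda d: "OK" if d["pressure"] <= 2.0 else "NG"
lower_pressure = lambda d: {**d, "pressure": round(d["pressure"] - 0.1, 1)}
action = generate_action({"pressure": 2.3}, "OK", predict, lower_pressure)
no_action = generate_action({"pressure": 2.3}, "OK", predict, lambda d: None)
```

The returned dictionary maps each changed explanatory variable to its (original, changed) pair, which corresponds to the action amount output in step S5.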
[0056]
Hereinafter, embodiments based on the basic configuration of the present invention described above will be described in more detail.
Embodiment 1.
FIG. 12 is a block diagram showing the first embodiment. In FIG. 12, a search condition input unit 7 is added to the basic configuration shown in FIG. 1, so that search conditions can be specified by the user. The change value calculation unit 6 calculates the change value so as to satisfy the specified search conditions. FIG. 13 shows an example of prediction target data after the search conditions have been specified.
[0057]
In FIG. 13, the following items are shown.
(1) The first row holds a mark indicating whether each explanatory variable is to be changed or fixed.
(2) The second row gives the search priority, and the third row gives the search direction.
(3) The fourth and fifth rows give the search range of each explanatory variable.
[0058]
When the explanatory variable is a categorical value attribute, the possible values are listed as the search range in the fourth row only, and no search direction can be specified; the third and fifth rows are therefore blank. The user specifies these settings freely through a GUI (Graphical User Interface) or the like.
[0059]
When the search condition input unit 7 exists, the change value calculation unit 6 calculates the change value in light of the input search condition. At this time, search conditions that can be specified in the search condition input unit 7 are as follows.
[0060]
(1) Designation of change target explanatory variables.
(2) Specify the change range of each change target explanatory variable.
(3) Priority among each change target explanatory variable, and change direction in the case of numerical attribute.
[0061]
According to the first embodiment, the search condition can be specified by the user, and the prediction efficiency can be improved.
[0062]
Embodiment 2.
Next, a second embodiment will be described.
In the second embodiment, case-based reasoning (MBR) is used as the prediction unit 3, and the degree of influence output by the MBR is used as the data feature obtained from the prediction unit 3. Using this degree of influence, the change value calculation unit 6 automatically and efficiently calculates the smallest possible action amount (change amount) of the prediction target data that changes the predicted value to the desired value. FIG. 12, for example, can be used as the block diagram.
[0063]
FIG. 14 is a flowchart of the change value calculation method in the change value calculation unit 6 using the MBR degree of influence. In FIG. 14, the necessary data is first input (step S11). Next, when the search condition input unit 7 is present and explanatory variables to be changed are specified (step S12, Y), only the degrees of influence of those explanatory variables are extracted (step S13). Then, when a search range is specified (step S14, Y), only the degrees of influence within that search range are extracted (step S15). FIG. 15 shows the degrees of influence extracted according to the search condition example shown in FIG. 13. FIG. 15 reflects the range specified for the explanatory variable “pressure” in FIG. 13.
[0064]
Subsequently, when a priority order is specified for the explanatory variables (step S16, Y), that priority order is used as the search order (step S17). Otherwise (step S16, N), the search order is set by examining the extracted degrees of influence and their directions: explanatory variables whose degree of influence acts toward a value other than the desired value are placed first, in order of their degree of influence, followed by explanatory variables whose degree of influence acts toward the desired value, in ascending order of their degree of influence (step S18).
[0065]
In the example of FIG. 13, since the search order is specified, the search is performed in the order concentration → pressure. If the search order were not specified, the search would be performed in the order pressure → concentration, because, among the degrees of influence acting toward the desired value in FIG. 15, that of pressure (1.25-1.50: 0.70, OK) is higher than that of concentration (25.0-27.5: 0.65, OK).
[0066]
Next, the search is performed as follows according to the search order. The following corresponds to step S19, and FIG. 16 is a flowchart of that portion alone. In FIG. 16, once the search order has been set (step S21), it is determined whether there is an explanatory variable to be searched (step S22). If there is none (step S22, N), the change value is output as not applicable (steps S33, S20).
[0067]
If there is an explanatory variable to be searched (step S22, Y) and it is a categorical value attribute (step S23, Y), the category value with the largest degree of influence acting toward the desired value is searched for (step S24). If the explanatory variable is a numeric attribute (step S23, N), it is determined whether a priority direction is specified (step S30). If a priority direction is specified (step S30, Y), the numeric value is changed in the priority direction and the nearest numeric value (boundary value) having a degree of influence acting toward the desired value is searched for (step S31). If no priority direction is specified (step S30, N), the numeric value is changed step by step toward both larger and smaller boundary values, and the nearest numeric value (boundary value) having a degree of influence acting toward the desired value is searched for (step S32).
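The boundary value search for a numeric attribute (steps S30 to S32) can be illustrated with the following Python sketch. The function `nearest_boundary`, the tuple layout of `ranges`, and the range values are hypothetical; the sketch assumes that the degree of influence is given per numeric range and that the change value is the edge of the nearest suitable range.

```python
def nearest_boundary(current, ranges, desired, direction=None):
    """Return (boundary value, degree of influence) or None.

    ranges: list of (low, high, influence, label), where label is the
    objective variable value the range pushes toward.
    direction: None (search both ways), "up", or "down".
    """
    best = None
    for low, high, influence, label in ranges:
        if label != desired or low <= current <= high:
            continue  # acts the wrong way, or the current value is already inside
        boundary = low if current < low else high   # nearest edge of the range
        if direction == "up" and boundary < current:
            continue
        if direction == "down" and boundary > current:
            continue
        if best is None or abs(boundary - current) < abs(best[0] - current):
            best = (boundary, influence)
    return best

# Hypothetical degrees of influence per range.
concentration = [(20.0, 22.5, 0.3, "NG"), (22.5, 25.0, 0.4, "OK"),
                 (25.0, 27.5, 0.65, "OK")]
pressure = [(0.75, 2.0, 0.7, "OK"), (2.0, 3.0, 0.5, "NG")]
c_hit = nearest_boundary(18.8, concentration, "OK")
p_hit = nearest_boundary(2.3, pressure, "OK")
p_up = nearest_boundary(2.3, pressure, "OK", direction="up")
```

With no priority direction the nearest edge on either side is returned; restricting pressure to the upward direction finds nothing, which corresponds to the "no search result" branch of step S25.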
[0068]
Next, it is determined whether there is a search result (step S25). For both numeric and categorical value attributes, if there is no search result (step S25, N), no applicable value exists in that explanatory variable, so the process moves to the next search target explanatory variable (step S22). If there is a search result (step S25, Y) and the current value of the explanatory variable has a degree of influence that does not act toward the desired value (step S26, N), the search result is output as it is (steps S29, S20).
[0069]
If, in step S26, the current value of the explanatory variable has a degree of influence acting toward the desired value (step S26, Y), it is determined whether the retrieved degree of influence is larger than the current one (step S27). If it is larger (step S27, Y), it is output as the search result (steps S29, S20). If it is smaller (step S27, N), the process moves to the next search target explanatory variable (step S22).
[0070]
As the change value in step S29 (step S20), the prediction target data is output with the corresponding explanatory variable value replaced by the retrieved value. The change value is input to the prediction unit 3 for re-prediction, and the prediction result determination unit 5 checks whether the prediction result for the change data is the desired value. If it does not match the desired value, or the error is at or above the allowable value, and the change calculation must be performed again, a numeric value with a larger degree of influence within the same explanatory variable is searched for first; if one exists, it is output as the change value. If none exists, the search moves on to the explanatory variable next in the search order, either keeping the changed value or using a stepwise method or the like. When no further change calculation is required, the difference between the original prediction target data and the change data is output. An example of the change is shown in FIG. 17.
[0071]
In the example of FIG. 17, for prediction target data No. 101, following the priority order, the range of “concentration” that is closest to the current value 18.8 and whose degree of influence contributes to OK (concentration 22.5-25.0) is found first; the value is changed to 22.5 and MBR prediction is performed on the changed data. Since the result does not satisfy the desired value, the same explanatory variable “concentration” is searched for a value with a larger degree of influence acting toward the desired value, and one is found (25.0-27.5). “Concentration” is therefore changed to 25.0 and MBR prediction is performed on the changed data in the same way. The prediction result still does not satisfy the desired value, and “concentration” has no further value with a larger degree of influence acting toward the desired value, so the explanatory variable next in the priority order, “pressure”, is changed from 2.3 to 2.0. MBR prediction is performed again on the changed data, and if the desired value OK is satisfied, the difference between the changed data and the original data (No. 101) is calculated and output as the action. In the example of FIG. 17, the final output is as follows.
[0072]
Example: “Change the concentration from 18.8 to 22.5 and the pressure from 2.3 to 2.0.”
[0073]
If change values have been calculated for every explanatory variable in the search order and no change value is found that both lies within the search range and yields the desired value as the predicted value, the change value is output as not applicable and the process ends.
[0074]
Embodiment 3.
Next, a third embodiment will be described. In the third embodiment, case-based reasoning (MBR) is used as the prediction unit 3, and the similar case data of the prediction target data output by the MBR, together with its similarity, is used as the data feature obtained from the prediction unit 3. Using the similar case data and similarity, the change value calculation unit 6 automatically and efficiently calculates the smallest possible action amount (change amount) of the prediction target data that changes the predicted value to the desired value.
[0075]
In this case, as an efficient way of calculating the action amount (change amount) of the prediction target data from its similar cases and their similarities, the change value calculation unit 6 searches the similar case data for the data that is most similar to the prediction target data and has the desired value, and takes the difference between the retrieved data and the prediction target data as the action amount.
[0076]
Further, the retrieved data may be regarded as prediction target data and excluded from the known case data, after which the prediction unit 3 performs prediction again and the prediction result determination unit 5 judges the result. If the re-predicted value matches the desired value, or the error is at or below the allowable value, the difference between the retrieved data and the prediction target data is taken as the action amount. If it does not match, or the error is at or above the allowable value, the change value calculation unit 6 repeats the change value calculation. In this way a stable action amount that is robust to noise can be derived.
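The similar-case search with re-prediction described above can be sketched as follows for a categorical objective variable. The function `action_from_similar_cases` and the callback `repredict` are hypothetical; `repredict` stands in for the prediction unit 3 predicting a case after that case has been excluded from the known case data. The case numbers and attribute values are illustrative only.

```python
def action_from_similar_cases(target, ranked_cases, desired, repredict):
    """ranked_cases: (case id, attributes, objective value), most similar first."""
    for case_id, attributes, value in ranked_cases:
        if value != desired:
            continue                      # case does not have the desired value
        if repredict(case_id, attributes) != desired:
            continue                      # unstable, possibly noise: try the next case
        # Action amount: attributes that differ, as (original, changed) pairs.
        return {name: (target[name], new)
                for name, new in attributes.items() if target[name] != new}
    return None                           # no applicable change value

# Hypothetical prediction target and similar cases.
target = {"concentration": 18.8, "pressure": 2.3}
ranked = [
    (68, {"concentration": 18.0, "pressure": 2.3}, "NG"),
    (32, {"concentration": 22.5, "pressure": 2.3}, "OK"),
    (62, {"concentration": 22.5, "pressure": 2.0}, "OK"),
]
# Hypothetical re-prediction: case 32 turns out to be unstable.
repredict = lambda case_id, attributes: "NG" if case_id == 32 else "OK"
action = action_from_similar_cases(target, ranked, "OK", repredict)
```

Case 32 is the most similar case with the desired value, but it fails re-prediction, so the next candidate (case 62) supplies the action amount.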
[0077]
FIG. 8 shows the similar case data of the prediction target data obtained by MBR prediction and the reciprocal of the similarity. Here, the similar case data means k pieces of data that are most similar to the prediction target data in the known case data in the MBR prediction as described above. The similarity is a measure of the similarity between each similar case data and the prediction target data, and is obtained by the above-described equations (2) and (3).
[0078]
FIG. 18 and FIG. 19 show, for the case where the predicted value does not match the desired value, the error between the desired value and the objective variable value of each similar case: FIG. 18 when the objective variable is a categorical value attribute, and FIG. 19 when it is a numeric attribute. In both figures, each point represents one similar case. FIG. 18 plots 0 when the desired value and the objective variable value of a similar case match and 1 when they do not; it shows that the cases with high similarity do not match, but that matching similar cases also appear. In FIG. 19, since the objective variable is a numeric attribute, the error is a continuous value.
[0079]
FIG. 20 shows a flowchart of the change value calculation method in the change value calculation unit 6 utilizing the MBR similar cases and the similarity. In FIG. 20, when predetermined data is input (step S41), only the similar case data whose objective variable value equals the desired value, or lies within a certain error of the desired value, is extracted and set anew as the similar case data (step S42). Subsequently, when the search condition input unit 7 exists and change target explanatory variables are designated (step S43, Y), only the similar cases that differ from the prediction target data solely in the change target explanatory variables are extracted from the similar case data and set anew as the similar case data (step S44). Next, when the explanatory variables have a priority order (step S45, Y), the similarity order is rearranged so that similar case data differing only in the explanatory variable with the highest priority takes precedence (step S46). At this time, if a priority direction is also designated, the rearrangement takes the priority direction into account as well.
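The two extraction steps S42 and S44 can be sketched as below. This is a simplified illustration under stated assumptions: the function name, the dictionary layout, the key `"result"` for the objective variable, and the sample data are all hypothetical; only exact matching on a categorical desired value is shown, and the priority-based rearrangement of steps S45 and S46 is omitted.

```python
from typing import Dict, List, Set

Case = Dict[str, object]


def filter_similar_cases(
    similar_cases: List[Case],
    target: Case,
    desired_value: object,
    change_vars: Set[str],
    objective: str = "result",
) -> List[Case]:
    """Keep cases that have the desired value and differ only in change_vars.

    similar_cases is assumed to be sorted by descending similarity already;
    the relative order of the surviving cases is preserved.
    """
    # Explanatory variables that must stay fixed at the target's values.
    fixed = [v for v in target if v != objective and v not in change_vars]
    kept = []
    for case in similar_cases:
        # Step S42: keep only cases whose objective variable is the desired value.
        if case[objective] != desired_value:
            continue
        # Step S44: every fixed explanatory variable must equal the target's value.
        if all(case[v] == target[v] for v in fixed):
            kept.append(case)
    return kept


# Hypothetical data for illustration; not taken from the figures.
target = {"temperature": 50, "pressure": 2.3, "concentration": 18.8}
cases = [
    {"id": "A", "temperature": 50, "pressure": 2.0, "concentration": 19.0, "result": "bad"},
    {"id": "B", "temperature": 60, "pressure": 2.1, "concentration": 20.0, "result": "good"},
    {"id": "C", "temperature": 50, "pressure": 1.8, "concentration": 22.0, "result": "good"},
]
kept = filter_similar_cases(cases, target, "good", {"pressure", "concentration"})
print([c["id"] for c in kept])
```

In this sample, case A is dropped because it does not have the desired value, and case B is dropped because a fixed variable (temperature) differs from the prediction target data.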
[0080]
FIG. 21 shows an example of similar case data in which the similar case data shown in FIG. 8 is rearranged according to the search conditions of FIG. In FIG. 21, rearranged similar case data is generated by the following operation.
[0081]
(1) Similar cases (No. 68, No. 11, No. 45, No. 83, No. 92, No. 73) that are not desired values are deleted.
(2) No. 08 has a desired value, but is deleted because fixed explanatory variables: temperature 1 and temperature 2 are different from the prediction target data.
(3) Although No. 62 is lower in similarity than No. 32, No. 62 is placed ahead because concentration is given priority in the priority order of the explanatory variables.
(4) Then, the similar case data is examined in the order of similarity after the rearrangement. When the search designation range is designated, it is determined whether the searched similar case data is within the search designation range. The search proceeds to the similar case data in the order of similarity after the rearrangement (step S47).
(5) When similar case data having a desired value as a result value within the search range is found (step S48, Y),
a) The difference between the original prediction target data and the search data is output as an action (step S49).
b) The search data is regarded as prediction target data and excluded from the known case data, and the prediction unit 3 performs re-prediction to check the stability (step S49) (FIG. 10).
[0082]
In b) above, when the prediction result determination unit 5 determines that the predicted value does not match the desired value, or that the error is larger than the allowable value, and that the change value therefore needs to be calculated again, the search for similar cases resumes in the order of similarity after the rearrangement, and this cycle is repeated until no further change value calculation is needed. If the predicted value matches the desired value, or the error is less than or equal to the allowable value, so that no further change value calculation is needed, the difference between the original prediction target data and the search data is output as the action amount. An example from change to output is shown in FIG.
[0083]
In the example of FIG. 22, first, similar case data No. 62, which has the highest similarity, is regarded as the change data. No. 62 is excluded from the known case data and predicted by MBR, and it is judged whether the desired value is satisfied. If the desired value is not satisfied, No. 34, which has the next highest similarity, is regarded as the change data and predicted by MBR with No. 34 excluded from the known case data, and it is again judged whether the desired value is satisfied. If the desired value is satisfied, the search ends there, and the difference between the change data (the similar case data regarded as the prediction target data) and the original data (No. 101) is output as the action. In the example of FIG. 22, the final output is as follows.
[0084]
Example: “Change the concentration from 18.8 to 22.0 and the pressure from 2.3 to 1.8.”
[0085]
If, after all similar cases have been searched, no similar case is found that satisfies the search range and has the desired value as its result value, "Not applicable" is output and the process ends (step S49).
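The search with the stability check described in [0081] to [0085] can be sketched as follows. This is an illustration under stated assumptions, not the embodiment itself: a plain k-nearest-neighbour majority vote with an unweighted squared distance stands in for the MBR prediction and its similarity of equations (2) and (3); all function names and the key `"result"` are hypothetical; and the case table is invented, with only the final from/to values echoing the example in [0084].

```python
from collections import Counter
from typing import Dict, List, Optional, Tuple

Case = Dict[str, object]


def squared_distance(a: Case, b: Case, variables: List[str]) -> float:
    # Stand-in for the MBR similarity: unweighted squared distance (assumption).
    return sum((float(a[v]) - float(b[v])) ** 2 for v in variables)


def knn_predict(query: Case, known: List[Case], variables: List[str],
                k: int = 3, objective: str = "result") -> object:
    """Stand-in predictor: majority vote among the k nearest known cases."""
    nearest = sorted(known, key=lambda c: squared_distance(query, c, variables))[:k]
    votes = Counter(c[objective] for c in nearest)
    return votes.most_common(1)[0][0]


def search_action(target: Case, candidates: List[Case], known: List[Case],
                  desired: object, variables: List[str],
                  k: int = 3) -> Optional[Dict[str, Tuple[object, object]]]:
    """Return {variable: (from, to)} for the first stable candidate, else None.

    candidates are assumed to be already filtered and ordered by similarity.
    """
    for cand in candidates:
        # Exclude the candidate itself from the known cases, then re-predict it.
        rest = [c for c in known if c is not cand]
        if knn_predict(cand, rest, variables, k) == desired:
            # Stable: the action is the difference from the original target.
            return {v: (target[v], cand[v]) for v in variables if cand[v] != target[v]}
    return None  # corresponds to "Not applicable"


# Invented case table for illustration only.
variables = ["pressure", "concentration"]
target = {"pressure": 2.3, "concentration": 18.8}
known = [
    {"pressure": 2.2, "concentration": 19.5, "result": "good"},  # isolated "good"
    {"pressure": 2.3, "concentration": 19.0, "result": "bad"},
    {"pressure": 2.4, "concentration": 19.2, "result": "bad"},
    {"pressure": 2.1, "concentration": 19.4, "result": "bad"},
    {"pressure": 1.8, "concentration": 22.0, "result": "good"},
    {"pressure": 1.7, "concentration": 22.3, "result": "good"},
    {"pressure": 1.9, "concentration": 21.8, "result": "good"},
    {"pressure": 1.8, "concentration": 21.5, "result": "good"},
]
candidates = [known[0], known[4]]  # desired-value cases, most similar first
action = search_action(target, candidates, known, "good", variables)
print(action)
```

In this sample the first candidate is surrounded by "bad" cases, so its re-prediction fails and it is skipped as unstable; the second candidate is re-predicted as "good" and its difference from the target is returned.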
[0086]
Embodiment 4
Embodiment 4 of the present invention will be described below.
FIGS. 23 and 24 are block diagrams according to the fourth embodiment. FIG. 23 has the same basic configuration as FIG. 12, but FIGS. 23 and 24 differ in that the part including the prediction unit 3, the prediction result determination unit 5, and the change value calculation unit 6 is provided as parallel execution parts 100 (100A, 100B), and in that a minimum action determination unit 8 for determining the minimum action is provided. One execution part 100A corresponds to the second embodiment, and the other execution part 100B corresponds to the third embodiment. In the fourth embodiment, a change value calculation using the influence degree and a change value calculation using the similar case data and the similarity are performed in parallel for the same prediction target data, and the output is as follows.
[0087]
(1) If “Not applicable” is output by both execution parts, “Not applicable” is output.
(2) When only one of the execution parts outputs a change value, the difference between that change value and the prediction target data is output as the action amount.
(3) When both execution parts output change values, the minimum action determination unit 8 determines which change value is more similar to the prediction target data, and the difference between the more similar change value and the prediction target data is output as the action amount.
[0088]
Here, the determination method in the minimum action determination unit 8 includes the following.
a) When a priority order and a priority direction are specified in the search condition input unit 7
For the explanatory variable with the highest priority, the difference between each change value and the prediction target data is compared, and the result that is closer to the prediction target data and is changed in the priority direction is selected. If there is no difference between the two, the explanatory variable with the next highest priority is checked, and the determination is repeated until a difference appears. Finally, the difference between the closer change value and the prediction target data is output as the action.
[0089]
b) When priority order or priority direction is not specified
From the following equation (4), the distance between the prediction target data and each change value is calculated, and the difference between the closer change value and the prediction target data is output as an action.
Distance between data: $d = \sum_{i} (d_i)^2$    (4)
Here, the sum $\sum$ is taken over $i$.
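The selection in case b) can be sketched as follows. This is a minimal illustration under stated assumptions: d_i in equation (4) is taken to be the difference between the change value and the prediction target data in the i-th explanatory variable, which is not spelled out in this passage; the function names are hypothetical; `None` stands for "Not applicable"; and explanatory variables are assumed to be numerical.

```python
from typing import Dict, Optional

Data = Dict[str, float]


def distance_eq4(target: Data, changed: Data) -> float:
    # Equation (4): d = sum over i of (d_i)^2, with d_i assumed to be the
    # per-variable difference between the change value and the target.
    return sum((changed[v] - target[v]) ** 2 for v in target)


def select_minimum_action(target: Data,
                          change_a: Optional[Data],
                          change_b: Optional[Data]) -> Optional[Data]:
    """Pick the change value closer to the target and return it as a difference."""
    candidates = [c for c in (change_a, change_b) if c is not None]
    if not candidates:
        return None  # neither execution part produced a change value
    # With one candidate this simply returns it; with two, the closer one wins.
    best = min(candidates, key=lambda c: distance_eq4(target, c))
    return {v: best[v] - target[v] for v in target}


# Hypothetical values for illustration.
target = {"pressure": 2.0, "concentration": 20.0}
from_influence = {"pressure": 2.0, "concentration": 23.0}  # d = 9
from_similar = {"pressure": 1.0, "concentration": 21.0}    # d = 2
print(select_minimum_action(target, from_influence, from_similar))
```

Here the change value obtained from the similar cases is closer to the prediction target data, so its difference is the one returned as the action.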
[0090]
According to the fourth embodiment, it is possible to quickly and efficiently perform optimal prediction.
[0091]
(Supplementary note 1) A behavior amount generation device comprising: a prediction unit capable of outputting a predicted value and a data feature for prediction target data based on known case data; a desired value input unit capable of inputting a desired value for the predicted value; a comparison determination unit that receives the desired value and the predicted value and compares and determines the predicted value and the desired value; and a change value calculation unit that, based on the output of the comparison determination unit, calculates a behavior amount of the prediction target data using the predicted value, the prediction target data, and the data feature.
(Supplementary note 2) In the behavior amount generation device according to supplementary note 1,
The comparison determination unit causes the change value calculation unit to calculate a behavior amount when the predicted value does not match the desired value, or when the error is greater than or equal to, or greater than, an allowable value.
(Supplementary note 3) In the behavior amount generation device according to supplementary note 1 or supplementary note 2,
The behavior amount generated by the change value calculation unit is the minimum behavior amount of the prediction target data that can change the predicted value to the desired value.
(Supplementary note 4) In the behavior amount generation device according to any one of supplementary notes 1 to 3,
A behavior amount generation device further comprising a search condition input unit that enables a search condition to be specified, wherein the change value calculation unit calculates the behavior amount of the prediction target data based further on the search condition.
(Supplementary note 5) In the behavior amount generating device according to supplementary note 4,
One or more controllable explanatory variables can be specified as change target explanatory variables as search conditions specified by the search condition input unit.
(Supplementary note 6) In the behavior amount generating device according to supplementary note 4 or supplementary note 5,
An action amount generating apparatus characterized in that a change value range for each change target explanatory variable can be specified as a search condition specified by the search condition input unit.
(Supplementary note 7) In the behavior amount generating device according to any one of supplementary notes 4 to 6,
The action amount generation device characterized in that the search condition in the search condition input unit includes a priority order of change target explanatory variables and a priority search direction in the case of a numerical attribute.
(Supplementary note 8) In the behavior amount generation device according to any one of supplementary notes 1 to 7,
A behavior amount generating apparatus using inference based on a case as the prediction unit, and using an influence degree output by inference based on the case as a data feature obtained from the prediction unit.
(Supplementary note 9) In the behavior amount generation device according to any one of supplementary notes 1 to 7,
A behavior amount generation device characterized by using inference based on cases as the prediction unit, and using, as the data feature obtained from the prediction unit, the similar case data of the prediction target data output by the inference based on cases and its similarity.
(Supplementary note 10) In the behavior amount generation device according to supplementary note 9,
A behavior amount generation device characterized in that the change value calculation unit searches the similar case data for the data that is most similar to the prediction target data and has the desired value, and determines the behavior amount based on the difference between the searched data and the prediction target data.
(Supplementary note 11) In the behavior amount generation device according to supplementary note 9,
A behavior amount generation device characterized in that the change value calculation unit searches the similar case data for the data that is most similar to the prediction target data and has the desired value, regards the searched data as prediction target data, excludes it from the known case data, causes the prediction unit to perform prediction again, and calculates the behavior amount of the prediction target data based on the determination result of the comparison determination unit for the obtained re-predicted value.
(Supplementary note 12) In the behavior amount generating device according to supplementary note 11,
A behavior amount generation device characterized in that, when the re-predicted value matches the desired value, or when the error is less than or equal to, or smaller than, the allowable value, the change value calculation unit sets the difference between the searched data and the prediction target data as the behavior amount.
(Supplementary note 13) In the behavior amount generating device according to supplementary note 11,
The change value calculation unit repeats the change value calculation when the re-predicted value does not match the desired value or when the error is greater than or equal to an allowable value or greater than an allowable value.
(Supplementary note 14) A behavior amount generation device in which the behavior amount generation device according to supplementary note 8 and the behavior amount generation device according to any one of supplementary notes 11 to 13 are provided in parallel, the device including a determination unit that determines the behavior amount of the prediction target data based on the predicted values obtained by these behavior amount generation devices.
(Supplementary Note 15) A prediction step of outputting a prediction value and data characteristics for prediction target data based on known case data;
A desired value input step of inputting a desired value for the predicted value;
A comparison determination step for inputting the desired value and the predicted value, and comparing and determining the predicted value and the desired value;
A behavior amount generation method comprising: a change value calculation step of calculating a behavior amount of the prediction target data using the prediction value, the prediction target data, and the data feature based on the comparison determination result.
(Supplementary Note 16) A prediction step of outputting a prediction value and data characteristics for prediction target data based on known case data;
A desired value input step of inputting a desired value for the predicted value;
A comparison determination step for inputting the desired value and the predicted value, and comparing and determining the predicted value and the desired value;
A behavior amount generation program that causes a computer to execute a change value calculation step of calculating a behavior amount of prediction target data using the prediction value, the prediction target data, and the data feature based on the comparison determination result.
[0092]
[Effect of the invention]
As described above, using the present invention makes it possible to efficiently calculate how, and by how much, the prediction target data should be changed in order to change the predicted value to the desired value.
[Brief description of the drawings]
FIG. 1 is a block diagram showing a basic configuration of an embodiment.
FIG. 2 is a diagram illustrating an example of case data.
FIG. 3 is a diagram showing prediction target data (No. 101).
FIG. 4 is an MBR conceptual diagram.
FIG. 5 is a diagram illustrating an example of the degree of influence by newCCF.
FIG. 6 is a diagram showing an example of the influence degree of an explanatory variable (pressure).
FIG. 7 is a diagram illustrating an example of the degree of influence of prediction target data.
FIG. 8 is a diagram showing the reciprocal of the similarity of similar case data with respect to prediction target data (No. 101).
FIG. 9 is a diagram illustrating a prediction result of prediction target data (No. 101).
FIG. 10 is an MBR conceptual diagram for deleting prediction target data from known case data.
FIG. 11 is a flowchart showing an overall operation.
FIG. 12 is a block diagram showing the first embodiment.
FIG. 13 is a diagram illustrating an example of change target explanatory variables, search ranges, search priorities, and search direction designations.
FIG. 14 is a flowchart showing an operation of a change value calculation unit using an influence degree.
FIG. 15 is a diagram showing the influence of pressure and concentration extracted according to search conditions.
FIG. 16 is a flowchart showing an operation of a change value calculation unit using an influence degree.
FIG. 17 is a diagram illustrating a change value calculation example using an influence degree with respect to prediction target data (No. 101).
FIG. 18 is a diagram showing a relationship between a similarity in a category value attribute and an objective variable value / desired value error.
FIG. 19 is a diagram illustrating a relationship between similarity in a numerical attribute and an objective variable value / desired value error.
FIG. 20 is a flowchart illustrating an operation of a change value calculation unit using similar cases and similarities.
FIG. 21 is a diagram showing similar case data for the prediction target data (No. 101) rearranged according to the search conditions of FIG. 20.
FIG. 22 is a diagram illustrating a change value calculation example using similar case data and similarity for prediction target data (No. 101).
FIG. 23 is a first block diagram showing a fourth embodiment.
FIG. 24 is a second block diagram showing the fourth embodiment.
[Explanation of symbols]
1 known data input unit, 2 prediction target data input unit, 3, 3A, 3B prediction unit, 4 desired result input unit (desired value input unit), 5, 5A, 5B prediction result determination unit (comparison determination unit), 6, 6A, 6B change value calculation unit, 7 search condition input unit, 8 minimum action determination unit (determination unit).

Claims

A behavior amount generation device comprising:
A prediction unit that, based on known case data having explanatory variable values and the accompanying objective variable values, predicts, using inference based on cases (MBR), prediction target data whose objective variable value for the explanatory variables is unknown, outputs a predicted value, and is capable of outputting, as feature data, either the degree of influence that the explanatory variable values of the known case data have on the objective variable value, or the similar case data of the prediction target data and its similarity;
A desired value input unit capable of inputting a desired value for the predicted value;
A comparison determination unit for comparing and determining the predicted value and the desired value; and
A change value calculation unit that, when the determination result of the comparison determination unit shows that the comparison result does not satisfy the allowable value, changes the explanatory variable values of the prediction target data using the predicted value and the feature data, causes the prediction unit to output a predicted value from the changed explanatory variable values and the comparison determination unit to compare it, and, when the comparison result satisfies the allowable value, sets the amount of change in the explanatory variable values as the behavior amount of the prediction target data.

The behavior amount generation device according to claim 1,
Further comprising a search condition input unit capable of specifying a search condition for the explanatory variable values used by the change value calculation unit, wherein the change value calculation unit changes the explanatory variable values of the prediction target data based on the search condition.

The behavior amount generation device according to claim 1 or 2,
Wherein the process by which the prediction unit outputs the predicted value by inference based on cases is as follows:
The degree of influence on a predetermined objective variable is derived from the known case data; the individual similarity between the prediction target data and each item of known case data is calculated based on the degree of influence; a predetermined number of items of known case data most similar to the prediction target data are extracted based on the calculated similarity; and the predicted value is output as a weighted sum of the extracted known case data.

A behavior amount generation method executed by a computer of a behavior amount generation device that generates a behavior amount, the method comprising:
A prediction step of predicting, using inference based on cases (MBR) and based on known case data having explanatory variable values and the accompanying objective variable values, prediction target data whose objective variable value for the explanatory variables is unknown, outputting a predicted value, and outputting, as feature data, either the degree of influence that the explanatory variable values of the known case data have on the objective variable value, or the similar case data of the prediction target data and its similarity;
A desired value input step capable of inputting a desired value for the predicted value;
A comparison determination step of comparing and determining the predicted value and the desired value; and
A change value calculation step of, when the determination result of the comparison determination step shows that the comparison result does not satisfy the allowable value, changing the explanatory variable values of the prediction target data using the predicted value and the feature data, causing the prediction step to output a predicted value from the changed explanatory variable values and the comparison determination step to compare it, and, when the comparison result satisfies the allowable value, setting the amount of change in the explanatory variable values as the behavior amount of the prediction target data.

A behavior amount generation program for causing a computer of a behavior amount generation device to generate a behavior amount, the program causing the computer to execute:
A prediction step of predicting, using inference based on cases (MBR) and based on known case data having explanatory variable values and the accompanying objective variable values, prediction target data whose objective variable value for the explanatory variables is unknown, outputting a predicted value, and outputting, as feature data, either the degree of influence that the explanatory variable values of the known case data have on the objective variable value, or the similar case data of the prediction target data and its similarity;
A desired value input step capable of inputting a desired value for the predicted value;
A comparison determination step of comparing and determining the predicted value and the desired value; and
A change value calculation step of, when the determination result of the comparison determination step shows that the comparison result does not satisfy the allowable value, changing the explanatory variable values of the prediction target data using the predicted value and the feature data, causing the prediction step to output a predicted value from the changed explanatory variable values and the comparison determination step to compare it, and, when the comparison result satisfies the allowable value, setting the amount of change in the explanatory variable values as the behavior amount of the prediction target data.