JP6760138B2 - Dialogue corpus creation program, dialogue corpus creation method, and information processing device

Info

Publication number: JP6760138B2
Application number: JP2017040681A
Authority: JP
Inventors: 哲朗高橋
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 2017-03-03
Filing date: 2017-03-03
Publication date: 2020-09-23
Anticipated expiration: 2037-03-03
Also published as: JP2018146720A

Description

The present invention relates to a dialogue corpus creation program, a dialogue corpus creation method, and an information processing apparatus.

In recent years, dialogue systems in which a machine responds to a person with natural response sentences have been developed and commercialized. One known technique obtains, from sentence patterns of response sentences accumulated in advance for human input sentences, the response sentence pattern that corresponds to an input sentence, and then generates a response sentence to the input sentence from the obtained pattern and from words highly related to the feature words extracted from the input sentence.

In addition, so that a machine can output more natural sentences in dialogue with people, techniques have been proposed that accept dialogue sentences from two or more users, apply natural language processing to the conversational sentences of at least one of them, and acquire and accumulate conversation linkage data to generate a dialogue corpus.

Japanese Unexamined Patent Publication No. 2015-153261
Japanese Unexamined Patent Publication No. 2008-299754
Japanese Unexamined Patent Publication No. 2007-102104
Japanese Unexamined Patent Publication No. 2001-166785

The techniques described above are inefficient when two people work together, because one must wait while the other is working. To have two people conduct a dialogue efficiently, it is necessary to find the two people who will form a pair and to coordinate their schedules, which places a burden on the administrator.

Therefore, in one aspect, an object is to create a dialogue corpus efficiently.

According to one aspect, there is provided a dialogue corpus creation program that causes a computer to execute, one utterance at a time, an allocation process that: takes out, from a storage unit that stores the worker IDs of a plurality of workers in the order in which they become able to participate in a dialogue, the worker ID of a worker who is to create utterance data representing one utterance based on dialogue information relating to a dialogue between two parties; refers to records, managed in the storage unit, of the worker IDs of the workers who participated in each of a plurality of dialogues in chronological order; and selects the dialogue to which the worker who creates the utterance data is assigned, in accordance with a worker allocation procedure based on the past participation of the worker having the taken-out worker ID. The program further causes the computer to execute a process of obtaining dialogue data in which the dialogue has been completed by the plurality of workers.

A dialogue corpus creation method and an information processing apparatus may also be provided as means for solving the above problem.

A dialogue corpus can be created efficiently.

FIG. 1 is a diagram for explaining creation of a dialogue corpus through a dialogue between a pair.
FIG. 2 is a diagram showing an example of corpus creation that decomposes the task and uses crowdsourcing.
FIG. 3 is a diagram for explaining how a shared understanding of the dialogue is established and how dialogue quality is maintained.
FIG. 4 is a diagram showing an example of a dialogue screen.
FIG. 5 is a diagram for explaining examples of the utterance tree structure depending on whether the inappropriate check is made.
FIG. 6 is a diagram for explaining utterance scoring when the multiplicity is "1".
FIG. 7 is a diagram for explaining utterance scoring when the multiplicity is "3".
FIG. 8 is a diagram for explaining worker scoring.
FIG. 9 is a diagram for explaining dialogue scoring.
FIG. 10 is a diagram for explaining an outline of the worker allocation procedure.
FIG. 11 is a diagram showing an example of the worker allocation process when only one worker ID has the highest selection probability.
FIG. 12 is a diagram showing a first example of the worker allocation process when a plurality of highest selection probabilities exist.
FIG. 13 is a diagram showing a second example of the worker allocation process when a plurality of highest selection probabilities exist.
FIG. 14 is a diagram showing a third example of the worker allocation process when a plurality of highest selection probabilities exist.
FIG. 15 is a diagram showing an example of the system configuration in this embodiment.
FIG. 16 is a diagram for explaining an outline of the overall processing in the system of this embodiment.
FIG. 17 is a diagram showing an example of the functional configuration of the server device.
FIG. 18 is a diagram showing an example of the data structure of the worker list.
FIG. 19 is a diagram showing an example of the data structure of the dialogue task table.
FIG. 20 is a diagram showing an example of the data structure of the utterance table.
FIG. 21 is a diagram showing an example of the data structure of the utterance data DB.
FIG. 22 is a diagram showing an example of the data structure of the worker score table.
FIG. 23 is a diagram showing an example of the data structure of the scored dialogue data.
FIG. 24 is a flowchart for explaining the initialization process.
FIG. 25 is a flowchart for explaining the worker registration process and the worker allocation process.
FIG. 26 is a flowchart for explaining the dialogue task selection process in step S222 of FIG. 25.
FIG. 27 is a flowchart for explaining the dialogue task generation process in step S237 of FIG. 26.
FIG. 28 is a flowchart for explaining the utterance creation process.
FIG. 29 is a flowchart for explaining the utterance information update process in step S252 of FIG. 28.
FIG. 30 is a flowchart for explaining the dialogue data extraction process.
FIG. 31 is a flowchart for explaining the worker score calculation process in step S271 of FIG. 30.
FIG. 32 is a flowchart for explaining the dialogue score calculation process in step S272 of FIG. 30.

Hereinafter, an embodiment of the present invention will be described with reference to the drawings. First, a case where a dialogue corpus is created by a pair of two people will be described. FIG. 1 is a diagram for explaining creation of a dialogue corpus through a dialogue between a pair.

In FIG. 1, subjects α and β who form a dialogue pair are prepared in advance, the pair is asked to hold a dialogue, and the content of the dialogue is recorded to create a dialogue corpus 1a. The roles are explained to subjects α and β in advance, for example, a travel agency clerk and a customer who books transportation, a hotel, and the like.

FIG. 1 shows an example dialogue in which subject α speaks as a travel agency clerk and subject β speaks as a customer. Subject α asks with the first utterance, "Where are you going?", and subject β replies with the second utterance, "I'm going to Chicago."

Subject α then asks a further question with the third utterance, "How many nights are you planning?", subject β replies with the fourth utterance, "I'm thinking about 3 nights", and the conversation between subject α and subject β continues in this way. The dialogue ends when subject α, playing the travel agency clerk, has obtained enough information from subject β to plan the trip.

By performing machine learning on the dialogue corpus 1a collected in this way from many pairs of subjects, a machine can advance a dialogue appropriately when its counterpart is a person. Machine learning using the dialogue corpus 1a makes it unnecessary to write dialogue rules by hand.

To advance dialogues accurately, a large-scale dialogue corpus 1a must be created. A large-scale dialogue corpus 1a enables a dialogue system based on machine learning to be developed with high accuracy.

However, creating the dialogue corpus 1a in this way generally requires preparing subjects α and β as a dialogue pair, providing a place or a communication means, having them start the dialogue at a set time, and recording the content of the dialogue as it proceeds. With this method, a subject works jointly with another person, so while one subject is working, the other member of the pair must wait.

A simple solution to this problem is conceivable: rather than having two fixed people carry out a dialogue recorded in the dialogue corpus 1a from start to finish, the dialogue is divided into a plurality of individual tasks, and the tasks are carried out independently by many participants using crowdsourcing or the like.

Dividing a dialogue into a plurality of individual tasks means that a single person does not play one role throughout; instead, a person to play the role is selected for each exchange in the conversation, and the conversation is continued. As an example, in the dialogue between subject α and subject β in FIG. 1, the person playing the role is changed every time either subject α or β speaks. In this case, producing one utterance is regarded as one individual task.

Crowdsourcing means decomposing a target main task into independent, fine-grained tasks (hereinafter referred to as small tasks) and having an unspecified large number of workers carry out the small tasks via a network.

Small tasks are distributed and carried out mainly through web browsers, smartphones, and the like, so an advantage for workers is that they can work at any place and at any time. Because each small task is a small, simple piece of work, it often requires no expertise.

In addition, having an unspecified large number of workers carry out the small tasks allows a large main task to be completed quickly and at low cost. As a simple example, a dialogue consisting of a plurality of utterances can therefore be treated as the main task, with each utterance as a small task.

FIG. 2 is a diagram showing an example of corpus creation that decomposes the task and uses crowdsourcing. FIG. 2 shows an example of how a dialogue progresses among a plurality of workers 9 including workers A, B, C, and D.

The conversation starts when worker A asks with the first utterance, "Where are you going?". Worker B replies to this question with the second utterance, "I'm going to Chicago."

In the example of FIG. 1, worker A, who made the first utterance, would ask the next question in response to worker B's reply. In the example of FIG. 2, however, worker C, who is different from worker A, asks a further question with the third utterance, "How many nights are you planning?"

The worker 9 who answers worker C's question is not worker B but worker D. Worker D replies with the fourth utterance, "I'm thinking about 3 nights." In this way, the two parties to the dialogue are not fixed; a worker 9 is selected for each utterance input, and many workers 9 each carry out their part of the conversation individually and independently.

Before this embodiment is described, terms are defined below.

An utterance is a single statement written by one worker.

A dialogue is a sequence of utterances consisting of a plurality of utterances.

A worker is an individual person who creates an utterance, and may also be referred to as a speaker.

In a corpus creation method using crowdsourcing as described with reference to FIG. 2, decomposing the dialogue flow into individual small tasks as described above may fail to produce consistent utterance examples. Therefore, this embodiment performs:
(I) consistency checking
(II) scoring of utterances, workers, and dialogues
(III) efficient allocation of workers

(I) Consistency check
First, to give the workers 9 a shared understanding and to maintain dialogue quality by presenting the context:
- The worker 9 is presented with the context from the start of the dialogue.
- By reading and understanding that context, the worker 9 can share the same context as the previous speakers (workers 9), which preserves quality comparable to that of utterances created by a single fixed pair.

Next, to ensure the consistency of the context:
- Each time the dialogue advances by one utterance, the worker 9 checks whether the content of the dialogue so far is appropriate. This can be done while understanding the context: the worker reads everything to understand the context, and judges the content inappropriate if it cannot be understood.
- One problem with having independent workers 9 create a dialogue corpus is that a consistent utterance sequence may not be produced; this check ensures the consistency of the dialogue.
- When the worker 9 judges the immediately preceding part of the dialogue to be inappropriate, the worker 9 creates the utterance that should have been made at that point.

(II) Scoring of utterances, workers, and dialogues
First, scoring of utterances and of the reliability of the workers 9 will be described.
- The skill level of the workers 9 generally varies, and as a result inappropriate utterances may be created.
- For each utterance, the proportion of judgments that found the utterance correct, under a multiplicity given as a parameter, is used as the utterance score.
- After a branch, the utterance with the higher utterance score is continued. This may be expressed as extending the branch of the utterance after the branching.
- The average of the utterance scores of all utterances created by a worker 9 is used as that worker's worker score.
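
The worker score described in the last item can be sketched as follows. This is a minimal illustration, not the implementation of the embodiment: the function name `worker_score` and the sample scores are assumptions introduced here, and utterance scores are represented as exact fractions.

```python
from fractions import Fraction
from statistics import mean


def worker_score(utterance_scores):
    """Average the utterance scores of every utterance one worker created."""
    if not utterance_scores:
        raise ValueError("worker has no scored utterances")
    return mean(utterance_scores)


# Hypothetical data: one worker wrote three utterances scored 1, 1/2 and 0.
scores = [Fraction(1), Fraction(1, 2), Fraction(0)]
print(worker_score(scores))
```

Using `Fraction` keeps scores such as 1/3 and 2/3 exact; plain floats would work equally well if exactness is not needed.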

Next, scoring of a dialogue using the worker scores will be described.
- A score is calculated for each dialogue.
- If there are branches, the most supported path is selected, so that a single path is extracted.
- The geometric mean of the worker scores of the workers 9 appearing on that path is calculated and used as the score of the dialogue, hereinafter referred to as the dialogue score.
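
As a rough sketch of the last step, the geometric mean can be computed as below. It assumes that the single path has already been extracted and that the worker scores on it are available as floats; the name `dialogue_score` and the sample values are illustrative only.

```python
import math


def dialogue_score(worker_scores):
    """Geometric mean of the worker scores of the workers on the selected path."""
    if not worker_scores:
        raise ValueError("the path contains no workers")
    return math.prod(worker_scores) ** (1 / len(worker_scores))


# Hypothetical path with two workers whose worker scores are 1.0 and 0.25.
print(dialogue_score([1.0, 0.25]))
```

Because the geometric mean multiplies the scores, a single worker with a score of 0 on the path drives the dialogue score to 0, which an arithmetic mean would not do.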

(III) Efficient allocation of workers
As the utterance sequence of a dialogue grows longer, understanding the context and checking its content take more time. To skip this check as far as possible and improve efficiency, the same task is preferentially assigned to the same worker 9.
- To this end, tasks are assigned using probabilities based on the task history of each dialogue.
A task here corresponds to a dialogue to be created, and may be referred to as a dialogue task in the following description. Dialogues to be created include dialogues currently in progress and dialogues that have not yet been started.

(I) to (III) above will be described using examples. First, (I) the consistency check will be described with reference to FIGS. 3 to 5.

FIG. 3 is a diagram for explaining how a shared understanding of the dialogue is established and how dialogue quality is maintained. FIG. 3 shows an example in which a plurality of different workers 9, including workers A, B, C, and D, are asked to imagine a scene of booking transportation, accommodation, and the like at a travel agency and to carry out a series of exchanges.

In this embodiment:
- Pairs of workers 9 are not managed; each worker 9 is accepted independently.
- Each worker 9 is presented with the context of the dialogue so far (the sequence of utterances generated) and is asked to create the next utterance.

The plurality of workers 9 connect to the service according to this embodiment from terminals such as mobile phones via a network. Each worker 9 is presented with at least the utterances up to the immediately preceding one. A description of the scene in which the dialogue takes place may also be presented. Alternatively, the worker 9 may be left to infer the scene from the utterances so far.

In this example, a plurality of workers 9 are asked to carry out a dialogue between a clerk and a customer about a booking at a travel agency. After grasping the situation, each worker 9 inputs, as the next utterance, the most likely response to the immediately preceding utterance. Alternatively, if the worker judges the immediately preceding utterance to be inappropriate, the worker inputs an utterance that corrects it.

Among the plurality of workers 9, worker A, who is the first to input, judges that he or she is the clerk and inputs the first utterance. For example, "Where are you going?" is input as the first utterance.

Worker B, who is presented with the first utterance, judges that he or she is the customer and inputs the second utterance. For example, "I'm going to Chicago" is input as the second utterance.

Worker C, who is presented with the first and second utterances, judges that he or she is the clerk and inputs the third utterance. For example, "How many nights are you planning?" is input as the third utterance.

Worker D, who is presented with the first to third utterances, judges that he or she is the customer and inputs the fourth utterance. For example, "I'm thinking about 3 nights" is input as the fourth utterance.

By providing each worker 9 with the utterances so far in this way, the worker 9 can grasp the situation and his or her own role even though the worker 9 changes with every utterance. Furthermore, in this embodiment, a screen such as that shown in FIG. 4 is displayed on the terminal of the worker 9 so that the worker 9 checks the consistency of the dialogue so far.

FIG. 4 is a diagram showing an example of a dialogue screen. The dialogue screen G70 illustrated in FIG. 4 has an utterance history display area 71, an utterance input area 72, an inappropriate check area 73, and an end check area 74. On the dialogue screen G70, worker C reads the two utterances displayed in the utterance history display area 71, which run from the beginning of the dialogue to the immediately preceding utterance, and grasps the state of the dialogue.

Worker C first judges, from the utterances displayed in the utterance history display area 71,
1. Where are you going?
2. I'm going to Chicago
whether the last (second) utterance is appropriate.

If worker C judges the last utterance to be appropriate, worker C creates an appropriate utterance following it and inputs it in the utterance input area 72. If worker C judges it to be inappropriate, worker C checks the inappropriate check area 73. When the inappropriate check area 73 is checked, worker C is asked to make a correction by inputting, in the utterance input area 72, an utterance considered appropriate in place of the last utterance.

When the inappropriate check area 73 is checked, it is desirable to make this correction mandatory. Requiring the worker to write the corrected utterance encourages worker C to judge the appropriateness of the last utterance conscientiously. In this embodiment, the next worker 9 judges the appropriateness of the last utterance at every utterance in this way, so the utterances before the last utterance can be regarded as already judged appropriate.

In this example, worker C leaves the inappropriate check area 73 unchecked and creates the utterance "How many nights are you planning?" in the utterance input area 72. That is, worker C has judged the utterance "I'm going to Chicago" to be appropriate and has created the utterance that follows it.

Worker C checks the end check area 74 upon judging that the dialogue has ended. The dialogue ends when the purpose of the dialogue is judged to have been achieved. When the purpose of the dialogue is to book transportation, a hotel, and the like for a stay, the dialogue ends when it is judged that information such as the departure point, destination, place of stay, and number of days has been obtained. Besides obtaining such necessary information, the dialogue also ends when, for example, a question has been answered or the other party has been satisfied.
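
The inputs collected on the dialogue screen G70 could be modeled roughly as below. This is only a sketch under stated assumptions: the class `Submission`, its field names, and the function `validate` are hypothetical and do not appear in the embodiment. The only rule encoded is the one stated above, namely that a correction is mandatory when the inappropriate check area 73 is checked.

```python
from dataclasses import dataclass


@dataclass
class Submission:
    worker_id: str
    text: str                    # contents of the utterance input area 72
    inappropriate: bool = False  # state of the inappropriate check area 73
    finished: bool = False       # state of the end check area 74


def validate(sub):
    """Reject an inappropriate check that comes without a correcting utterance."""
    if sub.inappropriate and not sub.text.strip():
        raise ValueError("a correcting utterance is required when "
                         "the inappropriate check is made")
    return sub


# Worker C accepts the last utterance and continues the dialogue.
ok = validate(Submission("C", "How many nights are you planning?"))
print(ok.inappropriate, ok.finished)
```

How an empty utterance combined with the end check should be handled is not stated in the text, so the sketch leaves that case unconstrained.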

In this embodiment, when the inappropriate check area 73 described above is checked, the progress of a dialogue consisting of a plurality of utterances is represented by a tree structure in which the progression of utterances branches (hereinafter referred to as an utterance tree structure), as shown in FIG. 5.

FIG. 5 is a diagram for explaining examples of the utterance tree structure depending on whether the inappropriate check is made. FIG. 5(A) shows the dialogue screen G70 at the time when, after the next worker B has created the second utterance "I'm going to Chicago" in response to the first utterance "Where are you going?" by worker A, this dialogue is assigned to worker C. It is the same as the dialogue screen G70 of FIG. 4.

In this example, worker C judges that worker B's utterance was an appropriate response to worker A's utterance, and creates the utterance "How many nights are you planning?" without making the inappropriate check.

The progress of this dialogue is represented by the utterance tree structure 8a. In the utterance tree structure 8a, each utterance is shown concisely as a node labeled with the letter representing the worker 9 who created it. The same applies to FIGS. 6 to 9.

The utterance tree structure 8a starts from the first utterance A, and its branch extends downward to the second utterance B and further down to the third utterance C. The utterance tree structure 8a has no branching, which indicates that no inappropriate check has been made up to this point.

FIG. 5(B) shows the dialogue screen G70 in a state where the next worker D has created the second utterance "Anywhere" in response to the first utterance "Where are you going?" by worker A.

Worker E judges that the immediately preceding utterance "Anywhere" by worker D is inappropriate and creates a correcting utterance. The utterance tree structure 8b shows the case where utterance E is created with the inappropriate check made. The utterance tree structure 8b starts from the first utterance A and extends downward to the second utterance D, but because of the inappropriate check against utterance D, utterance E is placed directly under utterance A.
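
The way a new utterance is attached to the utterance tree structure can be sketched as follows. The class `Node`, the function `add_utterance`, and the wording of worker E's correction are assumptions made for illustration; the text only specifies where the new utterance is placed.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)
class Node:
    worker: str
    text: str
    parent: Optional["Node"] = field(default=None, repr=False)
    children: List["Node"] = field(default_factory=list)


def add_utterance(last, worker, text, inappropriate=False):
    """Attach a new utterance relative to `last`, the last utterance shown.

    Without the inappropriate check the new utterance continues after `last`.
    With the check it becomes a sibling of `last`, directly under its parent.
    """
    if inappropriate:
        if last.parent is None:
            raise ValueError("the first utterance has no parent to branch from")
        anchor = last.parent
    else:
        anchor = last
    node = Node(worker, text, parent=anchor)
    anchor.children.append(node)
    return node


# FIG. 5(B): D answers A, then E rejects D and writes a replacement.
a = Node("A", "Where are you going?")
d = add_utterance(a, "D", "Anywhere")
e = add_utterance(d, "E", "I'm going to Chicago", inappropriate=True)  # hypothetical text
print([child.worker for child in a.children])
```

In this sketch the rejected utterance D stays in the tree as a sibling of E, which matches the branch drawn in the utterance tree structure 8b.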

In the following description, the utterance tree structures 8a, 8b, and so on may be referred to collectively as the utterance tree structure 8 unless a particular one is specified.

Next, (II) the scoring of utterances, workers, and dialogues will be described with reference to FIGS. 6 to 9. First, utterance scoring will be described. In this embodiment:
- A multiplicity for checking, used when an utterance branches, is given as a parameter.
- The multiplicity is an odd number.
- When a branch occurs, workers are asked to evaluate whether the branch should occur, up to the number given by the multiplicity. When the multiplicity is 1, only the utterance after the branch is used once the branch occurs. The multiplicity is preferably 3 or more.
- The proportion of support given by the workers 9 is used as the utterance score.

Utterance scoring will now be described with reference to FIGS. 6 and 7. FIG. 6 is a diagram for explaining utterance scoring when the multiplicity is "1". FIG. 6(A) shows the utterance score of each utterance in the utterance tree structure 8a of FIG. 5(A). Because the utterance tree structure 8a has no branching, the utterance scores US of utterances A, B, and C are all "1".

FIG. 6(B) explains the utterance scores US before and after a branch in the utterance tree structure 8c. At the time utterance D is created, before the branch, the utterance scores US of both utterance A and utterance D are "1". When a branch from utterance A occurs on creation of utterance C, that is, when utterance D is judged inappropriate, then with a multiplicity of 1 the utterance score US of utterance D, judged inappropriate, becomes "0", and the correcting utterance C is given an utterance score US of "1".

図７は、多重度「３」の場合の発話スコアリングを説明するための図である。図７（Ａ）では、発話Ａに応答して発話Ｄが作成された状態を示す。この状態では、発話Ａの発話スコアＵＳは「１」であり、発話Ｄの発話スコアＵＳも「１」である。 FIG. 7 is a diagram for explaining utterance scoring when the multiplicity is “3”. FIG. 7A shows a state in which utterance D is created in response to utterance A. In this state, the utterance score US of utterance A is "1", and the utterance score US of utterance D is also "1".

図７（Ｂ）では、発話Ｄが不適切と判断され、修正用に発話Ｃが作成される。この時点では、本当に発話Ｄが不適切であるのかは断定できない。この状態での発話木構造８ｄでは、発話Ｄの発話スコアＵＳが「１」から「１／２」に変更され、発話Ｃには発話スコアＵＳ「１／２」が与えられる。図７（Ｂ）の発話木構造８ｄに対して、図７（Ｃ）又は図７（Ｄ）の状態に遷移することが考えられる。 In FIG. 7B, the utterance D is determined to be inappropriate, and the utterance C is created for correction. At this point, it cannot be determined whether utterance D is really inappropriate. In the utterance tree structure 8d in this state, the utterance score US of the utterance D is changed from "1" to "1/2", and the utterance C is given the utterance score US "1/2". It is conceivable that the utterance tree structure 8d of FIG. 7 (B) transitions to the state of FIG. 7 (C) or FIG. 7 (D).

図７（Ｃ）では、発話Ｄを不適切と判断した、別の作業者Ｅにより発話Ｅが生成された状態を示している。発話Ｅの追加により、発話Ｄが不適切でないことは理解できるが、発話Ｃと発話Ｅとではいずれが適切かはまだ判定できない。この場合、発話Ｄが削除され、発話Ｃと発話Ｅの発話スコアＵＳが「１／２」ずつになる。 FIG. 7C shows a state in which utterance E has been created by another worker E, who also judged utterance D to be inappropriate. With the addition of utterance E, it can be understood that utterance D is not appropriate, but it cannot yet be determined which of utterance C and utterance E is the more appropriate. In this case, utterance D is deleted, and utterances C and E each receive an utterance score US of "1/2".

図７（Ｄ）では、発話Ｄに対して不適切と判断されずに、発話Ｆが追加された状態を示している。したがって、発話Ｄの発話スコアＵＳは「１／２」から「２／３」へと変更され、発話Ｃの発話スコアＵＳは「１／２」から「１／３」へと変更される。 FIG. 7D shows a state in which the utterance F is added without being judged to be inappropriate for the utterance D. Therefore, the utterance score US of utterance D is changed from "1/2" to "2/3", and the utterance score US of utterance C is changed from "1/2" to "1/3".
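The values in FIGS. 6 and 7 can be reproduced by a simple vote count. The Python sketch below is one possible reading, not a rule stated in the source. It assumes that the creator of the contested utterance and every worker who continues from it count as supporters, that every worker who branches away counts as an objector and contributes exactly one alternative utterance, and that the contested utterance is dropped once the objectors form a majority of the multiplicity. The function name `score_branch` and its parameters are hypothetical.

```python
from fractions import Fraction


def score_branch(supporters, objectors, multiplicity):
    """Return (score of the contested utterance, score of each alternative).

    Assumed reading of FIGS. 6 and 7, not the patent's literal rule:
    - supporters: creator of the contested utterance plus workers who continued from it
    - objectors: workers who branched away, each adding one alternative utterance
    """
    if objectors == 0:
        # No branch has occurred, so the utterance keeps its full score.
        return Fraction(1), None
    if 2 * objectors > multiplicity:
        # Objectors hold a majority of the multiplicity: the contested
        # utterance is dropped and the alternatives share the score equally.
        return Fraction(0), Fraction(1, objectors)
    total = supporters + objectors
    # Still undecided: every utterance is scored by its share of the votes.
    return Fraction(supporters, total), Fraction(1, total)


print(score_branch(1, 1, 1))  # FIG. 6B: D -> 0, C -> 1
print(score_branch(1, 1, 3))  # FIG. 7B: D -> 1/2, C -> 1/2
print(score_branch(1, 2, 3))  # FIG. 7C: D dropped, C and E -> 1/2 each
print(score_branch(2, 1, 3))  # FIG. 7D: D -> 2/3, C -> 1/3
```

Under these assumptions an odd multiplicity always allows a majority to form, which may be the reason the multiplicity is required to be odd.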

次に、上述したように生成され変化する発話木構造８を用いて、作業者スコアリングが行われる。作業者スコアリングでは、作業者９が作成した全ての発話の発話スコアの平均値を算出し、作業者スコアとする。 Next, worker scoring is performed using the utterance tree structure 8 that is generated and changed as described above. In the worker scoring, the average value of the utterance scores of all the utterances created by the worker 9 is calculated and used as the worker score.

図８は、作業者スコアリングを説明するための図である。図８では、３つの発話木構造８ｅ、８ｆ、及び８ｇにおける各発話Ａ〜Ｇの発話スコアＵＳを用いて、作業者Ａ〜Ｇの作業者スコアＷＳを算出する場合で説明する。 FIG. 8 is a diagram for explaining worker scoring. FIG. 8 describes a case where the worker scores WS of the workers A to G are calculated by using the utterance scores US of the utterances A to G in the three utterance tree structures 8e, 8f, and 8g.

この例では、
作業者Ａについて、
作業者スコアＷＳ_Ａ＝（１＋１＋１）／３＝１を取得し、
作業者Ｂについて、
作業者スコアＷＳ_Ｂ＝（１／３＋１＋１）／３＝０．７８を取得し、
作業者Ｃについて、
作業者スコアＷＳ_Ｃ＝（１＋２／３）／２＝０．８３を取得し、
作業者Ｄについて、
作業者スコアＷＳ_Ｄ＝（０＋１／３）／２＝０．１７を取得し、
作業者Ｅについて、
作業者スコアＷＳ_Ｅ＝（１＋１＋１）／３＝１を取得し、
作業者Ｆについて、
作業者スコアＷＳ_Ｆ＝（１＋１）／２＝１を取得し、
作業者Ｇについて、
作業者スコアＷＳ_Ｇ＝（２／３）／１＝０．６７
を取得する。 In this example, the following worker scores WS are obtained:
- Worker A: WS_A = (1 + 1 + 1) / 3 = 1
- Worker B: WS_B = (1/3 + 1 + 1) / 3 = 0.78
- Worker C: WS_C = (1 + 2/3) / 2 = 0.83
- Worker D: WS_D = (0 + 1/3) / 2 = 0.17
- Worker E: WS_E = (1 + 1 + 1) / 3 = 1
- Worker F: WS_F = (1 + 1) / 2 = 1
- Worker G: WS_G = (2/3) / 1 = 0.67

作業者スコアＷＳが低い程、不適切な発話を作成する作業者９であることを示す。この作業者スコアＷＳに対して閾値を用いてフィルタリングすることで、不適切な発話を作成する作業者９を除外してもよい。この場合、不適切な発話の作成を抑制することができる。 The lower the worker score WS, the more the worker 9 creates an inappropriate utterance. By filtering the worker score WS using a threshold value, the worker 9 who creates an inappropriate utterance may be excluded. In this case, it is possible to suppress the creation of inappropriate utterances.
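Worker scoring and the optional threshold filter can be sketched in a few lines of Python. The utterance scores below are the ones used in the worked example for workers A to G. The function names and the threshold value of 0.5 are assumptions made for illustration; the source does not specify a threshold.

```python
def worker_scores(utterances):
    """Average the utterance scores of each worker.

    utterances: iterable of (worker_id, utterance_score) pairs.
    """
    by_worker = {}
    for worker_id, score in utterances:
        by_worker.setdefault(worker_id, []).append(score)
    return {w: sum(s) / len(s) for w, s in by_worker.items()}


def filter_workers(scores, threshold):
    """Keep only workers whose score reaches the (hypothetical) threshold."""
    return {w for w, s in scores.items() if s >= threshold}


# Utterance scores taken from the worked example for workers A to G.
example = [
    ("A", 1), ("A", 1), ("A", 1),
    ("B", 1 / 3), ("B", 1), ("B", 1),
    ("C", 1), ("C", 2 / 3),
    ("D", 0), ("D", 1 / 3),
    ("E", 1), ("E", 1), ("E", 1),
    ("F", 1), ("F", 1),
    ("G", 2 / 3),
]

ws = worker_scores(example)
kept = filter_workers(ws, threshold=0.5)  # 0.5 is an assumed value
print({w: round(s, 2) for w, s in ws.items()})
print(sorted(kept))
```

With the assumed threshold of 0.5, only worker D (score 0.17) is excluded.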

次に、対話スコアリングについて説明する。対話スコアリングは、最も支持されたパス上の発話を作成した作業者９の作業者スコアＷＳの相乗平均によって行われる。最も支持されたパスとは、発話の数が最も多い枝を選択することで定まる。 Next, dialogue scoring will be described. Dialogue scoring is performed by the geometric mean of the worker scores WS of worker 9 who created the utterances on the most favored path. The most favored path is determined by selecting the branch with the highest number of utterances.

図９は、対話スコアリングを説明するための図である。図９では、図８と同様の３つの発話木構造８ｅ、８ｆ、及び８ｇで表現されるそれぞれの対話の対話スコアＤＳを算出する場合で説明する。 FIG. 9 is a diagram for explaining dialogue scoring. In FIG. 9, the case of calculating the dialogue score DS of each dialogue represented by the three utterance tree structures 8e, 8f, and 8g similar to those in FIG. 8 will be described.

発話木構造８ｅで最も支持されたパスＰｅは、発話Ａ、Ｇ、Ｃ、Ｅ、及びＦの順の文脈である。したがって、パスＰｅを構成する発話を作成した作業者９の作業者スコアＷＳの相乗平均は、
（１×０．６７×０．８３×１×１）＾（１／５）＝０．８９
であり、対話スコアＤＳｅ＝０．８９を得る。 The most favored path Pe in the utterance tree structure 8e is the context formed by utterances A, G, C, E, and F, in that order. Therefore, the geometric mean of the worker scores WS of the workers 9 who created the utterances on the path Pe is
(1 × 0.67 × 0.83 × 1 × 1)^(1/5) = 0.89,
and the dialogue score DSe = 0.89 is obtained.

発話木構造８ｆで最も支持されたパスＰｆは、発話Ｅ、Ｆ、Ａ、及びＢの順の文脈である。したがって、パスＰｆを構成する発話を作成した作業者９の作業者スコアＷＳの相乗平均は、
（１×１×１×０．７８）＾（１／４）＝０．９４
であり、対話スコアＤＳｆ＝０．９４を得る。 The most favored path Pf in the utterance tree structure 8f is the context formed by utterances E, F, A, and B, in that order. Therefore, the geometric mean of the worker scores WS of the workers 9 who created the utterances on the path Pf is
(1 × 1 × 1 × 0.78)^(1/4) = 0.94,
and the dialogue score DSf = 0.94 is obtained.

発話木構造８ｇで最も支持されたパスＰｇは、発話Ｂ、Ｅ、Ｃ、及びＡの順の文脈である。したがって、パスＰｇを構成する発話を作成した作業者９の作業者スコアＷＳの相乗平均は、
（０．７８×１×０．８３×１）＾（１／４）＝０．９０
であり、対話スコアＤＳｇ＝０．９０を得る。 The most favored path Pg in the utterance tree structure 8g is the context formed by utterances B, E, C, and A, in that order. Therefore, the geometric mean of the worker scores WS of the workers 9 who created the utterances on the path Pg is
(0.78 × 1 × 0.83 × 1)^(1/4) = 0.90,
and the dialogue score DSg = 0.90 is obtained.

対話スコアＤＳが高い値、即ち、１に近い程、適切な文脈で表される対話であることを示す。この例では、対話スコアＤＳｆが最も高い値「０．９４」を示すため、パスＰｆによる文脈は適切な発話の連続を示すと考えられる。 The higher the dialogue score DS, that is, the closer it is to 1, the more appropriate the context is for the dialogue. In this example, since the dialogue score DSf shows the highest value "0.94", the context by the path Pf is considered to indicate a proper sequence of utterances.
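The three dialogue scores can be checked with a short Python sketch that takes the geometric mean of the worker scores along each most favored path. The function name `dialogue_score` is hypothetical; the input values are the rounded worker scores used in the text.

```python
import math


def dialogue_score(path_worker_scores):
    """Geometric mean of the worker scores along one path."""
    n = len(path_worker_scores)
    return math.prod(path_worker_scores) ** (1 / n)


scores = {
    "Pe": dialogue_score([1, 0.67, 0.83, 1, 1]),  # utterances A, G, C, E, F
    "Pf": dialogue_score([1, 1, 1, 0.78]),        # utterances E, F, A, B
    "Pg": dialogue_score([0.78, 1, 0.83, 1]),     # utterances B, E, C, A
}
best = max(scores, key=scores.get)

print({path: round(s, 2) for path, s in scores.items()})
print(best)
```

Because a geometric mean is used, a single low worker score on a path pulls the dialogue score down more strongly than it would under an arithmetic mean.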

次に、（ＩＩＩ）作業者の割り当ての効率化について図１０から図１４で説明する。図１０は、作業者割当手順の概要を説明するための図である。 Next, (III) efficiency improvement of worker allocation will be described with reference to FIGS. 10 to 14. FIG. 10 is a diagram for explaining an outline of the worker allocation procedure.

図１０において、作業者リスト４１は、作業者９の作業者ＩＤを追加した順で取り出すキュー構造を持つリストであり、一例として、キュー構造はＦＩＦＯ（First-In，First-Out）方式であればよい。この例では、作業者９のＩＤを右端から追加し、左端から作業者９のＩＤが取り出される。 In FIG. 10, the worker list 41 is a list with a queue structure from which the worker IDs of the workers 9 are taken out in the order in which they were added. As an example, the queue may be a FIFO (First-In, First-Out) queue. In this example, worker IDs are added at the right end and taken out from the left end.

対話タスクテーブル４２は、対話毎に発話を作成した作業者９を順に示したテーブルである。各対話は、対話タスクとして対話タスクテーブル４２に登録され、以後、対話タスクテーブル４２で管理される。対話の進行状況は、対話タスクテーブル４２に記録された作業者９を特定する作業者ＩＤで示される。 The dialogue task table 42 is a table in which the workers 9 who created the utterances for each dialogue are shown in order. Each dialogue is registered in the dialogue task table 42 as a dialogue task, and is subsequently managed in the dialogue task table 42. The progress of the dialogue is indicated by a worker ID that identifies the worker 9 recorded in the dialogue task table 42.

この例では、対話に参加可能な複数の作業者９として、作業者リスト４１を参照すると、作業者ｗ４、ｗ９、ｗ２１、ｗ１０、ｗ５が既に待ち状態である。更に、先頭の作業者ｗ４を対話に割り当て、最後の作業者ｗ５に続いて、作業者ｗ７が作業者リスト４１に追加される状況を示している。 In this example, as a plurality of workers 9 who can participate in the dialogue, referring to the worker list 41, the workers w4, w9, w21, w10, and w5 are already in the waiting state. Further, the first worker w4 is assigned to the dialogue, and the worker w7 is added to the worker list 41 following the last worker w5.
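The state of the worker list 41 described here maps directly onto a double-ended queue. The following Python sketch uses `collections.deque`; the variable names are hypothetical.

```python
from collections import deque

# Waiting workers, head of the queue on the left.
worker_list = deque(["w4", "w9", "w21", "w10", "w5"])

worker_list.append("w7")             # a new worker joins at the right end
next_worker = worker_list.popleft()  # the head worker is taken from the left end

print(next_worker)        # w4
print(list(worker_list))  # ['w9', 'w21', 'w10', 'w5', 'w7']
```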

対話タスクテーブル４２には、対話タスクｄ１、・・・、ｄ５等が登録され、対話タスクｄ１では、作業者ｗ４が最初の発話を作成後、最初の発話に対して、作業者ｗ１１が２番目の発話を作成した状態を示している。対話タスクｄ２、ｄ３、及びｄ４等についても同様に、発話が作成された順（発話作成順）に、作業者ＩＤが示される。対話タスクｄ５は、登録済みであるが、未だ、発話が作成されていない状態を示している。 Dialogue tasks d1, ..., d5, and so on are registered in the dialogue task table 42. For the dialogue task d1, the table shows that the worker w4 created the first utterance and the worker w11 then created the second utterance in response to it. Similarly, for the dialogue tasks d2, d3, d4, and so on, the worker IDs are listed in the order in which the utterances were created (utterance creation order). The dialogue task d5 has been registered, but no utterance has been created for it yet.

作業者９の対話への割り当ては、以下のような作業者割当手順に従って行う。
・手順（１）作業者リスト４１（図１０）の先頭（左）から作業者ＩＤを取り出す。
・手順（２）対話タスクテーブル４２に登録されているそれぞれの対話タスクに対して選択確率を求める。
選択確率＝作業者ＩＤの出現回数／発話個数
上式にて、発話個数には、取り出した作業者ＩＤが最後となる対話タスクはカウントしない。
・手順（３）選択確率が最大の対話タスクを選択する。
最大の対話タスクが複数ある場合、発話個数が最も少ない対話タスクを選択する。最も少ない対話タスクが複数存在する場合、ランダムに１つの対話タスクを選択する。 The workers 9 are assigned to dialogues according to the following worker allocation procedure.
- Procedure (1): Take a worker ID from the head (left end) of the worker list 41 (FIG. 10).
- Procedure (2): Obtain a selection probability for each dialogue task registered in the dialogue task table 42.
Selection probability = number of occurrences of the worker ID / number of utterances
In this formula, a dialogue task whose last utterance was created by the extracted worker ID is not counted; no selection probability is calculated for it.
- Procedure (3): Select the dialogue task with the highest selection probability.
If several dialogue tasks share the highest selection probability, select the one with the fewest utterances. If several dialogue tasks share the fewest utterances, select one of them at random.
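Procedures (2) and (3) can be sketched as a single Python function. This is an illustrative sketch, not the patented implementation: the function name `select_task` and the table layout (a dict mapping each dialogue task to its worker IDs in utterance creation order) are assumptions. In the sample table, the entries for d1, d3, and d5 follow the description; the IDs w2, w3, and w8 are hypothetical fillers, since the text only states how often w4 and w5 appear in d2 and d4.

```python
import random
from fractions import Fraction


def select_task(worker_id, tasks, rng=random):
    """Pick a dialogue task for worker_id following procedures (2) and (3)."""
    probabilities = {}
    for task_id, creators in tasks.items():
        if creators and creators[-1] == worker_id:
            # The worker created the last utterance: no probability is computed.
            continue
        if creators:
            probabilities[task_id] = Fraction(creators.count(worker_id), len(creators))
        else:
            # A task without utterances is given probability 0.
            probabilities[task_id] = Fraction(0)
    if not probabilities:
        return None
    best = max(probabilities.values())
    tied = [t for t, p in probabilities.items() if p == best]
    # First tie-break: fewest utterances. Second tie-break: random choice.
    fewest = min(len(tasks[t]) for t in tied)
    tied = [t for t in tied if len(tasks[t]) == fewest]
    return rng.choice(tied)


tasks = {
    "d1": ["w4", "w11"],
    "d2": ["w5", "w2", "w5", "w8"],  # w2 and w8 are hypothetical fillers
    "d3": ["w4"],
    "d4": ["w5", "w3"],              # w3 is a hypothetical filler
    "d5": [],
}

print(select_task("w4", tasks))  # d1
print(select_task("w5", tasks))  # d4
```

Exact fractions are used so that ties such as 2/4 and 1/2 compare as equal without floating-point rounding.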

図１０に示すデータ例を用いて、上述した作業者割当手順に従った作業者割当処理について図１１から図１４で説明する。図１１は、選択確率が最も高い作業者ＩＤが１つのみの場合の作業者割当処理の例を示す図である。 Using the data example shown in FIG. 10, the worker allocation process according to the worker allocation procedure described above will be described with reference to FIGS. 11 to 14. FIG. 11 is a diagram showing an example of a worker allocation process when there is only one worker ID having the highest selection probability.

図１１における作業者割当処理では、手順（１）により、作業者リスト４１から作業者ＩＤ「ｗ４」が取り出され、手順（２）により、対話タスクテーブル４２が参照されて、取り出した作業者ＩＤ「ｗ４」の選択確率を算出する。 In the worker allocation process in FIG. 11, the worker ID "w4" is taken out of the worker list 41 by procedure (1), and by procedure (2) the dialogue task table 42 is referred to and the selection probabilities for the extracted worker ID "w4" are calculated.

この例の場合、対話タスクｄ１では、作成済みの２つの発話のうち１つが作業者ＩＤ「ｗ４」によるものであるため、選択確率「１／２」を得る。対話タスクｄ２では、作成済みの４つの発話のうち作業者ＩＤ「ｗ４」による発話は存在しないため、選択確率「０」を得る。 In the case of this example, in the dialogue task d1, since one of the two created utterances is due to the worker ID “w4”, the selection probability “1/2” is obtained. In the dialogue task d2, since there is no utterance by the worker ID “w4” among the four created utterances, the selection probability “0” is obtained.

対話タスクｄ３では、作業者ＩＤ「ｗ４」が最後に発話を作成しているため、選択確率の算出は行われない。選択確率の算出は抑止される。 In the dialogue task d3, since the worker ID "w4" creates the utterance at the end, the selection probability is not calculated. The calculation of the selection probability is suppressed.

対話タスクｄ４では、作成済みの２つの発話のうち作業者ＩＤ「ｗ４」による発話は存在しないため、選択確率「０」を得る。対話タスクｄ５では、作成済みの発話は未だ存在しないため、選択確率「０」を得る。 In the dialogue task d4, since there is no utterance by the worker ID “w4” among the two created utterances, the selection probability “0” is obtained. In the dialogue task d5, since the created utterance does not exist yet, the selection probability “0” is obtained.

次に、手順（３）により、選択確率が最も高い対話タスクｄ１を選択し、発話作成順の最後に「ｗ４」を追加する。対話タスクｄ１において、作業者ＩＤ「ｗ１１」の次に「ｗ４」が追加される。作業者リスト４１より、作業者ＩＤ「ｗ４」の次は作業者ＩＤ「ｗ９」である。作業者ＩＤ「ｗ９」に対する作業者割当処理は図１２で説明する。 Next, by procedure (3), the dialogue task d1, which has the highest selection probability, is selected, and "w4" is added at the end of its utterance creation order. In the dialogue task d1, "w4" is added after the worker ID "w11". In the worker list 41, the worker ID "w4" is followed by the worker ID "w9". The worker allocation process for the worker ID "w9" will be described with reference to FIG. 12.

図１２は、最も高い選択確率が複数存在する場合の作業者割当処理の第１例を示す図である。図１２において、手順（１）により、作業者リスト４１から作業者ＩＤ「ｗ９」が取り出され、手順（２）により、対話タスクテーブル４２が参照されて、取り出した作業者ＩＤ「ｗ９」の選択確率を算出する。 FIG. 12 is a diagram showing a first example of the worker allocation process when several dialogue tasks share the highest selection probability. In FIG. 12, the worker ID "w9" is taken out of the worker list 41 by procedure (1), and by procedure (2) the dialogue task table 42 is referred to and the selection probabilities for the extracted worker ID "w9" are calculated.

この例の場合、対話タスクｄ１〜ｄ５において、作業者ＩＤ「ｗ９」による発話は存在しないため、全ての対話タスクｄ１〜ｄ５に対して選択確率「０」を得る。この場合、最大の対話タスクが複数ある場合に相当し、手順（３）により、発話個数の最も少ない対話タスクを選択する。 In this example, none of the dialogue tasks d1 to d5 contains an utterance by the worker ID "w9", so the selection probability is "0" for all of the dialogue tasks d1 to d5. This corresponds to the case where several dialogue tasks share the highest selection probability, so by procedure (3) the dialogue task with the fewest utterances is selected.

対話タスクｄ１の発話個数は「３」であり、対話タスクｄ２の発話個数は「４」であり、対話タスクｄ３の発話個数は「１」であり、対話タスクｄ４の発話個数は「２」であり、対話タスクｄ５の発話個数は「０」である。 The number of utterances is "3" for the dialogue task d1, "4" for the dialogue task d2, "1" for the dialogue task d3, "2" for the dialogue task d4, and "0" for the dialogue task d5.

従って、作業者ＩＤ「ｗ９」に対して、対話タスクｄ５が選択される。発話個数が最も少ない対話タスクｄ５に作業者ＩＤ「ｗ９」が割り当てられる。この例では、対話タスクｄ５対して、最初の発話として「ｗ９」が記録される。そして、作業者リスト４１より、作業者ＩＤ「ｗ９」の次は作業者ＩＤ「ｗ２１」である。作業者ＩＤ「ｗ２１」に対する作業者割当処理は図１３で説明する。 Therefore, the dialogue task d5 is selected for the worker ID "w9"; that is, the worker ID "w9" is assigned to the dialogue task d5, which has the fewest utterances. In this example, "w9" is recorded as the creator of the first utterance of the dialogue task d5. In the worker list 41, the worker ID "w9" is followed by the worker ID "w21". The worker allocation process for the worker ID "w21" will be described with reference to FIG. 13.

図１３は、最も高い選択確率が複数存在する場合の作業者割当処理の第２例を示す図である。図１３において、手順（１）により、作業者リスト４１から作業者ＩＤ「ｗ２１」が取り出され、手順（２）により、対話タスクテーブル４２が参照されて、取り出した作業者ＩＤ「ｗ２１」の選択確率を算出する。 FIG. 13 is a diagram showing a second example of the worker allocation process when several dialogue tasks share the highest selection probability. In FIG. 13, the worker ID "w21" is taken out of the worker list 41 by procedure (1), and by procedure (2) the dialogue task table 42 is referred to and the selection probabilities for the extracted worker ID "w21" are calculated.

この例の場合、対話タスクｄ１〜ｄ５において、作業者ＩＤ「ｗ２１」による発話は存在しないため、全ての対話タスクｄ１〜ｄ５に対して選択確率「０」を得る。この場合、最大の対話タスクが複数ある場合に相当し、手順（３）により、発話個数の最も少ない対話タスクを選択する。 In this example, none of the dialogue tasks d1 to d5 contains an utterance by the worker ID "w21", so the selection probability is "0" for all of the dialogue tasks d1 to d5. This corresponds to the case where several dialogue tasks share the highest selection probability, so by procedure (3) the dialogue task with the fewest utterances is selected.

対話タスクｄ１の発話個数は「３」であり、対話タスクｄ２の発話個数は「４」であり、対話タスクｄ３の発話個数は「１」であり、対話タスクｄ４の発話個数は「２」であり、対話タスクｄ５の発話個数は「１」である。従って、発話個数の最も少ない対話タスクが２つ存在する。 The number of utterances is "3" for the dialogue task d1, "4" for the dialogue task d2, "1" for the dialogue task d3, "2" for the dialogue task d4, and "1" for the dialogue task d5. Therefore, two dialogue tasks, d3 and d5, share the fewest utterances.

この場合、手順（３）により、発話個数の最も少ない複数の対話タスクからランダムにいずれか１つを選択する。ランダム選択の方法は、特に限定しない。既存の関数などを用いればよい。 In this case, according to the procedure (3), one of the plurality of dialogue tasks with the smallest number of utterances is randomly selected. The random selection method is not particularly limited. An existing function or the like may be used.

作業者ＩＤ「ｗ２１」に対して、対話タスク「ｄ３」と「ｄ５」のうち対話タスク「ｄ３」が選択されたものとする。対話タスクｄ３において、既に割り当てた作業者ＩＤ「ｗ４」の次に作業者ＩＤ「ｗ２１」を追加する。 It is assumed that the dialogue task "d3" is selected from the dialogue tasks "d3" and "d5" for the worker ID "w21". In the dialogue task d3, the worker ID "w21" is added next to the already assigned worker ID "w4".

作業者リスト４１より、作業者ＩＤ「ｗ２１」の次は作業者ＩＤ「ｗ１０」である。作業者ＩＤ「ｗ１０」は、対話タスクテーブル４２に存在しない。作業者ＩＤ「ｗ９」と同様の状況を示す。従って、最も発話個数の少ない対話タスクｄ５に追加される。そして、作業者リスト４１より、作業者ＩＤ「ｗ１０」の次は作業者ＩＤ「ｗ５」である。作業者ＩＤ「ｗ５」に対する作業者割当処理は図１４で説明する。 In the worker list 41, the worker ID "w21" is followed by the worker ID "w10". The worker ID "w10" does not appear in the dialogue task table 42, which is the same situation as for the worker ID "w9". Therefore, "w10" is added to the dialogue task d5, which has the fewest utterances. In the worker list 41, the worker ID "w10" is followed by the worker ID "w5". The worker allocation process for the worker ID "w5" will be described with reference to FIG. 14.

図１４は、最も高い選択確率が複数存在する場合の作業者割当処理の第３例を示す図である。図１４において、手順（１）により、作業者リスト４１から作業者ＩＤ「ｗ５」が取り出され、手順（２）により、対話タスクテーブル４２が参照されて、取り出した作業者ＩＤ「ｗ５」の選択確率を算出する。 FIG. 14 is a diagram showing a third example of the worker allocation process when several dialogue tasks share the highest selection probability. In FIG. 14, the worker ID "w5" is taken out of the worker list 41 by procedure (1), and by procedure (2) the dialogue task table 42 is referred to and the selection probabilities for the extracted worker ID "w5" are calculated.

この例の場合、対話タスクｄ１では、作成済みの３つの発話のうち作業者ＩＤ「ｗ５」による発話は存在しないため、選択確率「０」を得る。対話タスクｄ２では、作成済みの４つの発話のうち２つが作業者ＩＤ「ｗ５」によるものであるため、選択確率「１／２」（＝２／４）を得る。 In the case of this example, in the dialogue task d1, since there is no utterance by the worker ID “w5” among the three created utterances, the selection probability “0” is obtained. In the dialogue task d2, since two of the four created utterances are due to the worker ID "w5", the selection probability "1/2" (= 2/4) is obtained.

対話タスクｄ３では、対話タスクｄ１と同様に、作業者ＩＤ「ｗ５」による発話は存在しないため、選択確率「０」を得る。対話タスクｄ４では、作成済みの２つの発話のうち１つが作業者ＩＤ「ｗ５」によるものであるため、選択確率「１／２」を得る。対話タスクｄ５も、対話タスクｄ１と同様に、作業者ＩＤ「ｗ５」による発話は存在しないため、選択確率「０」を得る。 In the dialogue task d3, as in the dialogue task d1, since there is no utterance by the worker ID “w5”, the selection probability “0” is obtained. In the dialogue task d4, since one of the two created utterances is due to the worker ID "w5", the selection probability "1/2" is obtained. Similar to the dialogue task d1, the dialogue task d5 also has a selection probability of “0” because there is no utterance by the worker ID “w5”.

この例の場合、対話タスクｄ１〜ｄ５において、作業者ＩＤ「ｗ５」に関して、選択確率が最も高い対話タスクは「ｄ２」と「ｄ４」であり複数存在する。従って、手順（３）により、発話個数の最も少ない対話タスクを選択する。 In the case of this example, in the dialogue tasks d1 to d5, the dialogue tasks having the highest selection probability with respect to the worker ID "w5" are "d2" and "d4", and there are a plurality of dialogue tasks. Therefore, according to the procedure (3), the dialogue task with the smallest number of utterances is selected.

発話個数は、対話タスクｄ２とｄ４に対してカウントすればよい。対話タスクｄ２の発話個数は「４」であり、対話タスクｄ４の発話個数は「２」である。従って、発話個数の最も少ない対話タスクｄ４が選択される。 The number of utterances may be counted for the dialogue tasks d2 and d4. The number of utterances of the dialogue task d2 is "4", and the number of utterances of the dialogue task d4 is "2". Therefore, the dialogue task d4 with the smallest number of utterances is selected.

上述したような作業者割当処理を作業者リスト４１から作業者ＩＤを順に取り出すごとに繰り返す。一方、作業者リスト４１へは、作業者割当処理とは独立して随時作業者ＩＤを追加する処理を行う。 The worker allocation process as described above is repeated every time the worker ID is sequentially taken out from the worker list 41. On the other hand, the worker ID is added to the worker list 41 at any time independently of the worker allocation process.

本実施例に係る種々の処理は、図１５に示すようなシステム構成において行われる。図１５は、本実施例におけるシステム構成例を示す図である。図１５に示すシステム１０００は、サーバ装置１００と、携帯端末、ＰＣ（Personal Computer）等を含む複数の作業者端末３と、１以上の利用者端末６とを有し、サーバ装置１００と複数の作業者端末３及び１以上の利用者端末とはネットワーク２を介して接続される。 The various processes according to this embodiment are performed in a system configuration such as that shown in FIG. 15. FIG. 15 is a diagram showing a system configuration example in this embodiment. The system 1000 shown in FIG. 15 includes a server device 100, a plurality of worker terminals 3 including mobile terminals, PCs (Personal Computers), and the like, and one or more user terminals 6. The server device 100 is connected to the plurality of worker terminals 3 and the one or more user terminals 6 via the network 2.

サーバ装置１００は、本実施例に係る対話コーパス作成処理を実行する装置であり、不特定多数の作業者９に、複数の対話のそれぞれの目的に応じた発話を作成させることで、より利用価値の高い対話事例を蓄積して対話コーパスを作成する。 The server device 100 is a device that executes the dialogue corpus creation process according to this embodiment. By having an unspecified number of workers 9 create utterances suited to the purpose of each of a plurality of dialogues, it accumulates dialogue examples of higher utility value and creates a dialogue corpus.

複数の作業者端末３は、作業者９により利用される端末であり、各作業者９は、作業者端末３をサーバ装置１００にネットワーク２を介して接続することで、ネットワーク２を介して行われる対話に参加するこができる。 The plurality of worker terminals 3 are terminals used by the workers 9. By connecting a worker terminal 3 to the server device 100 via the network 2, each worker 9 can participate in a dialogue conducted over the network 2.

１以上の利用者端末６は、サーバ装置１００が累積した、対話に係る一連の会話データ（対話コーパス）を利用する利用者７により利用される端末であり、利用者７は、利用者端末６をサーバ装置１００にネットワーク２を介して接続し、サーバ装置１００から会話データを取得する。 The one or more user terminals 6 are terminals used by the user 7 who uses a series of conversation data (dialogue corpus) related to the dialogue accumulated by the server device 100, and the user 7 is the user terminal 6. Is connected to the server device 100 via the network 2, and conversation data is acquired from the server device 100.

サーバ装置１００は、コンピュータによって制御される情報処理装置であって、ＣＰＵ（Central Processing Unit）１１と、主記憶装置１２と、補助記憶装置１３と、入力装置１４と、表示装置１５と、通信Ｉ／Ｆ（インターフェース）１７と、ドライブ装置１８とを有し、バスＢ１に接続される。 The server device 100 is an information processing device controlled by a computer, and includes a CPU (Central Processing Unit) 11, a main storage device 12, an auxiliary storage device 13, an input device 14, a display device 15, a communication I/F (interface) 17, and a drive device 18, which are connected to a bus B1.

ＣＰＵ１１は、主記憶装置１２に格納されたプログラムに従ってサーバ装置１００を制御するプロセッサに相当する。主記憶装置１２には、ＲＡＭ（Random Access Memory）、ＲＯＭ（Read Only Memory）等が用いられ、ＣＰＵ１１にて実行されるプログラム、ＣＰＵ１１での処理に必要なデータ、ＣＰＵ１１での処理にて得られたデータ等を記憶又は一時保存する。 The CPU 11 corresponds to a processor that controls the server device 100 according to a program stored in the main storage device 12. A RAM (Random Access Memory), a ROM (Read Only Memory), or the like is used as the main storage device 12, which stores or temporarily holds the program executed by the CPU 11, data required for processing by the CPU 11, data obtained by processing in the CPU 11, and the like.

補助記憶装置１３には、ＨＤＤ（Hard Disk Drive）等が用いられ、各種処理を実行するためのプログラム等のデータを格納する。補助記憶装置１３に格納されているプログラムの一部が主記憶装置１２にロードされ、ＣＰＵ１１に実行されることによって、各種処理が実現される。 An HDD (Hard Disk Drive) or the like is used in the auxiliary storage device 13, and data such as a program for executing various processes is stored in the auxiliary storage device 13. Various processes are realized by loading a part of the program stored in the auxiliary storage device 13 into the main storage device 12 and executing the program in the CPU 11.

入力装置１４は、マウス、キーボード等を有し、管理者がサーバ装置１００による処理に必要な各種情報を入力するために用いられる。表示装置１５は、ＣＰＵ１１の制御のもとに必要な各種情報を表示する。入力装置１４と表示装置１５とは、一体化したタッチパネル等によるユーザインタフェースであってもよい。通信Ｉ／Ｆ１７は、有線又は無線などのネットワークを通じて通信を行う。通信Ｉ／Ｆ１７による通信は無線又は有線に限定されるものではない
サーバ装置１００によって行われる処理を実現するプログラムは、例えば、ＣＤ−ＲＯＭ（Compact Disc Read-Only Memory）等の記憶媒体１９によってサーバ装置１００に提供される。 The input device 14 has a mouse, a keyboard, and the like, and is used by an administrator to input various information necessary for processing by the server device 100. The display device 15 displays various necessary information under the control of the CPU 11. The input device 14 and the display device 15 may be a user interface formed by an integrated touch panel or the like. The communication I/F 17 communicates through a wired or wireless network; communication by the communication I/F 17 is not limited to either wired or wireless. A program that realizes the processing performed by the server device 100 is provided to the server device 100 by, for example, a storage medium 19 such as a CD-ROM (Compact Disc Read-Only Memory).

ドライブ装置１８は、ドライブ装置１８にセットされた記憶媒体１９（例えば、ＣＤ−ＲＯＭ等）とサーバ装置１００とのインターフェースを行う。 The drive device 18 interfaces the storage medium 19 (for example, a CD-ROM or the like) set in the drive device 18 with the server device 100.

また、記憶媒体１９に、後述される本実施の形態に係る種々の処理を実現するプログラムを格納し、この記憶媒体１９に格納されたプログラムは、ドライブ装置１８を介してサーバ装置１００にインストールされる。インストールされたプログラムは、サーバ装置１００により実行可能となる。 Further, a program that realizes the various processes according to the present embodiment, described later, is stored in the storage medium 19, and the program stored in the storage medium 19 is installed in the server device 100 via the drive device 18. The installed program can then be executed by the server device 100.

尚、プログラムを格納する記憶媒体１９はＣＤ−ＲＯＭに限定されず、コンピュータが読み取り可能な、データとしての構造（structure）を有する１つ以上の非一時的（non-transitory）な、有形（tangible）な媒体であればよい。コンピュータ読取可能な記憶媒体として、ＣＤ−ＲＯＭの他に、ＤＶＤディスク、ＵＳＢメモリ等の可搬型記録媒体、フラッシュメモリ等の半導体メモリであっても良い。 The storage medium 19 that stores the program is not limited to a CD-ROM; it may be any one or more non-transitory, tangible, computer-readable media having a structure as data. Besides a CD-ROM, the computer-readable storage medium may be a portable recording medium such as a DVD disk or a USB memory, or a semiconductor memory such as a flash memory.

複数の作業者端末３は、コンピュータによって制御されるタブレット型、携帯電話等の情報処理端末であって、ＣＰＵ１１ａと、主記憶装置１２ａと、ユーザＩ／Ｆ（インターフェース）１６ａと、通信Ｉ／Ｆ１７ａと、ドライブ装置１８ａとを有し、バスＢ１に接続される。 The plurality of worker terminals 3 are information processing terminals such as tablet-type and mobile phones controlled by a computer, and include a CPU 11a, a main storage device 12a, a user I / F (interface) 16a, and a communication I / F 17a. And a drive device 18a, which are connected to the bus B1.

ＣＰＵ１１ａは、主記憶装置１２ａに格納されたプログラムに従って作業者端末３を制御するプロセッサに相当する。主記憶装置１２ａには、ＲＡＭ（Random Access Memory）、ＲＯＭ（Read Only Memory）等が用いられ、ＣＰＵ１１ａにて実行されるプログラム、ＣＰＵ１１ａでの処理に必要なデータ、ＣＰＵ１１ａでの処理にて得られたデータ等を記憶又は一時保存する。主記憶装置１２ａに格納されているプログラムが、ＣＰＵ１１ａに実行されることによって、各種処理が実現される。 The CPU 11a corresponds to a processor that controls the worker terminal 3 according to a program stored in the main storage device 12a. A RAM (Random Access Memory), a ROM (Read Only Memory), or the like is used as the main storage device 12a, which stores or temporarily holds the program executed by the CPU 11a, data required for processing by the CPU 11a, data obtained by processing in the CPU 11a, and the like. Various processes are realized by the CPU 11a executing the program stored in the main storage device 12a.

ユーザＩ／Ｆ１６ａは、ＣＰＵ１１ａの制御のもとに必要な各種情報を表示し、また、ユーザによる操作入力を可能とするタッチパネル等である。通信Ｉ／Ｆ１７ａによる通信は無線又は有線に限定されるものではない。 The user I / F 16a is a touch panel or the like that displays various information required under the control of the CPU 11a and enables the user to input operations. Communication by communication I / F17a is not limited to wireless or wired.

作業者端末３によって行われる処理を実現するプログラムは、ネットワーク２を介して外部装置からダウンロードされる。或いは、予め作業者端末３の主記憶装置１２ａ又は記憶媒体１９ａに記憶されていても良い。主記憶装置１２ａ及び／又は記憶媒体１９ａが記憶部１３０ａに相当する。 The program that realizes the processing performed by the worker terminal 3 is downloaded from the external device via the network 2. Alternatively, it may be stored in advance in the main storage device 12a or the storage medium 19a of the worker terminal 3. The main storage device 12a and / or the storage medium 19a corresponds to the storage unit 130a.

ドライブ装置１８ａは、ドライブ装置１８ａにセットされた記憶媒体１９ａ（例えば、ＳＤ（Secure Digital）メモリカード等）と作業者端末３とのインターフェースを行う。尚、記憶媒体１９ａは、コンピュータが読み取り可能な、データとしての構造（structure）を有する１つ以上の非一時的（non-transitory）な、有形（tangible）な媒体であればよい。 The drive device 18a interfaces the storage medium 19a (for example, SD (Secure Digital) memory card or the like) set in the drive device 18a with the worker terminal 3. The storage medium 19a may be one or more non-transitory, tangible media having a structure as data that can be read by a computer.

利用者端末６のハードウェア構成は、作業者端末３と同様であるためその詳細な説明を省略する。また、作業者端末３及び利用者端末６は、デスクトップ型、ノートブック型、ラップトップ型等の情報処理端末であっても良く、そのハードウェア構成は、サーバ装置１００のハードウェア構成と同様であるので、その説明を省略する。 Since the hardware configuration of the user terminal 6 is the same as that of the worker terminal 3, a detailed description is omitted. The worker terminal 3 and the user terminal 6 may also be information processing terminals of a desktop, notebook, or laptop type; in that case the hardware configuration is the same as that of the server device 100, so its description is omitted.

図１６は、本実施例のシステムにおける全体処理の概要を説明するための図である。図１６では、本実施例のシステムにおける全体処理の概要を処理シーケンスを用いて説明する。 FIG. 16 is a diagram for explaining an outline of the overall processing in the system of this embodiment. In FIG. 16, an outline of the overall processing in the system of this embodiment will be described using a processing sequence.

作業者９が作業者端末３から対話参加要求３ａをサーバ装置１００へ送信する（ステップＳ１）。対話参加要求３ａには作業者ＩＤが含まれている。作業者ＩＤは、対話参加要求３ａを行う前にサーバ装置１００から取得しておいても良いし、初めての対話参加要求３ａに応じてサーバ装置１００が作業者ＩＤを付与し、作業者端末３に通知してもよい。以後は、サーバ装置１００から与えられた作業者ＩＤを指定して、対話参加要求３ａをサーバ装置１００へと送信する。 The worker 9 transmits a dialogue participation request 3a from the worker terminal 3 to the server device 100 (step S1). The dialogue participation request 3a includes a worker ID. The worker ID may be obtained from the server device 100 before the dialogue participation request 3a is made, or the server device 100 may assign a worker ID in response to the first dialogue participation request 3a and notify the worker terminal 3 of it. Thereafter, the worker ID given by the server device 100 is specified when a dialogue participation request 3a is transmitted to the server device 100.

サーバ装置１００では、対話参加要求３ａで指定される作業者ＩＤを用いて、作業者ＩＤを割り当てる対話を特定する作業者割当処理を行って（ステップＳ２）、対話参加要求３ａの要求元である作業者端末３に対話タスク提示を行って、対話タスクを作業者端末３に表示させる（ステップＳ４）。対話タスク提示３ｂには、対話前提情報、発話履歴情報等が含まれる。対話前提情報は、対話がなされる場面（前提）の説明を含み、作業者９に自身の役割を理解させるための情報である。発話履歴情報は、対話画面Ｇ７０に表示される既になされた対話の内容に相当する。 The server device 100 uses the worker ID specified in the dialogue participation request 3a to perform a worker allocation process for specifying the dialogue for which the worker ID is assigned (step S2), and is the request source of the dialogue participation request 3a. The dialogue task is presented to the worker terminal 3 and the dialogue task is displayed on the worker terminal 3 (step S4). The dialogue task presentation 3b includes dialogue premise information, utterance history information, and the like. The dialogue premise information includes an explanation of the scene (premise) in which the dialogue is performed, and is information for making the worker 9 understand his / her role. The utterance history information corresponds to the content of the already performed dialogue displayed on the dialogue screen G70.

作業者端末３では、対話画面Ｇ７０がユーザＩ／Ｆ１６ａに表示され、作業者９により発話が作成されると（ステップＳ５）、発話作成情報３ｃがサーバ装置１００へ送信される（ステップＳ６）。発話作成情報３ｃには、不適切チェックの有無、終了有無、作業者９が対話画面Ｇ７０の発話入力域７２に入力した発話を表わす発話データ等を含む。 On the worker terminal 3, the dialogue screen G70 is displayed on the user I / F16a, and when the utterance is created by the worker 9 (step S5), the utterance creation information 3c is transmitted to the server device 100 (step S6). The utterance creation information 3c includes the presence / absence of an inappropriate check, the presence / absence of termination, utterance data representing the utterance input by the worker 9 in the utterance input area 72 of the dialogue screen G70, and the like.
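The three messages exchanged in steps S1 to S6 can be pictured as simple records. The Python dataclasses below are only an illustrative sketch: the class and field names are hypothetical, and the source lists the contents of each message without defining a concrete format.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DialogueParticipationRequest:
    """Dialogue participation request 3a (step S1)."""
    worker_id: str


@dataclass
class DialogueTaskPresentation:
    """Dialogue task presentation 3b: premise of the dialogue and utterance history."""
    premise: str
    utterance_history: List[str] = field(default_factory=list)


@dataclass
class UtteranceCreationInfo:
    """Utterance creation information 3c (step S6)."""
    flagged_inappropriate: bool  # whether the inappropriate check was set
    dialogue_finished: bool      # whether the worker marked the dialogue as ended
    utterance: str               # text entered in the utterance input area 72


request = DialogueParticipationRequest(worker_id="w4")
presentation = DialogueTaskPresentation(premise="(hypothetical premise text)")
info = UtteranceCreationInfo(
    flagged_inappropriate=False,
    dialogue_finished=False,
    utterance="(hypothetical utterance text)",
)
print(request, presentation, info, sep="\n")
```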

サーバ装置１００は、発話作成情報３ｃの受信に応じて、発話作成処理を行う（ステップＳ７）。発話作成処理において、サーバ装置１００は、発話データを記憶部１３０に蓄積し、作業者９に割り当てた対話タスクに対応付けて発話データを特定する発話ＩＤを追加し、対話タスクにおける発話スコアを更新する（ステップＳ７）。このようなステップＳ１〜Ｓ７が適宜繰り返される。 The server device 100 performs the utterance creation process in response to the reception of the utterance creation information 3c (step S7). In the utterance creation process, the server device 100 stores the utterance data in the storage unit 130, adds an utterance ID that identifies the utterance data in association with the dialogue task assigned to the worker 9, and updates the utterance score in the dialogue task. (Step S7). Such steps S1 to S7 are repeated as appropriate.

一方、利用者７が利用者端末６から対話データ取得要求６ａを行うと（ステップＳ８）、サーバ装置１００は、対話データ取出処理を行う（ステップＳ９）。対話データ取出処理において、サーバ装置１００は、全ての対話を対象にして、発話スコアに基づいて作業者９に対する作業者スコアを算出する。また、サーバ装置１００は、対話毎に対話に関連付けられた発話毎の発話スコアに基づいて対話スコアを算出する。 On the other hand, when the user 7 makes the dialogue data acquisition request 6a from the user terminal 6 (step S8), the server device 100 performs the dialogue data retrieval process (step S9). In the dialogue data retrieval process, the server device 100 calculates the worker score for the worker 9 based on the utterance score for all the dialogues. Further, the server device 100 calculates the dialogue score for each dialogue based on the utterance score for each utterance associated with the dialogue.

そして、所定条件に基づくスコア付き対話データ６ｂを提供する（ステップＳ１０）。所定条件は、例えば、利用者７が所望する対話スコアの閾値等であり、対話データ取得要求６ａで指定されてもよいし、対話データ取出処理にて予め設定されたデフォルトの閾値であってもよい。 Then, the server device 100 provides scored dialogue data 6b that satisfies a predetermined condition (step S10). The predetermined condition is, for example, a threshold for the dialogue score desired by the user 7. It may be specified in the dialogue data acquisition request 6a, or it may be a default threshold preset in the dialogue data retrieval process.

図１７は、サーバ装置の機能構成例を示す図である。図１７において、サーバ装置１００は、主に、初期化部５０と、作業者登録部５１と、作業者割当部５２と、発話作成部５５と、対話データ取出部５７と、ユーザＩ／Ｆ６０とを有する。初期化部５０と、作業者登録部５１と、作業者割当部５２と、発話作成部５５と、対話データ取出部５７と、ユーザＩ／Ｆ６０とは、対応するプログラムをＣＰＵ１１ａが実行することで行われる処理に相当する。 FIG. 17 is a diagram showing an example of the functional configuration of the server device. In FIG. 17, the server device 100 mainly includes an initialization unit 50, a worker registration unit 51, a worker allocation unit 52, an utterance creation unit 55, a dialogue data extraction unit 57, and a user I/F 60. Each of these units corresponds to processing performed when the CPU 11a executes the corresponding program.

記憶部１３０には、作業者リスト４１、対話タスクテーブル４２、発話テーブル４３、発話データＤＢ４４、作業者スコアテーブル４５、スコア付き対話データ６ｂ等が記憶される。 The storage unit 130 stores the worker list 41, the dialogue task table 42, the utterance table 43, the utterance data DB 44, the worker score table 45, the dialogue data 6b with a score, and the like.

初期化部５０は、本実施例に係るシステムの運用開始時、対話コーパスの作成をリセット等する際に作業者リスト４１、対話タスクテーブル４２等を初期化する。 The initialization unit 50 initializes the worker list 41, the dialogue task table 42, and the like when operation of the system according to the present embodiment starts, or when creation of the dialogue corpus is reset.

作業者登録部５１は、ユーザＩ／Ｆ６０を介して対話参加要求３ａの受信に応じて、対話参加要求３ａで指定される作業者ＩＤを作業者リスト４１に登録する処理部である。 The worker registration unit 51 is a processing unit that registers the worker ID specified in the dialogue participation request 3a in the worker list 41 in response to the reception of the dialogue participation request 3a via the user I / F60.

作業者割当部５２は、ユーザＩ／Ｆ６０を介して対話参加要求３ａの受信に応じて、１発話ごとに作業者９を特定し、特定した作業者９に発話を作成してもらう対話タスクを選択し、対話タスク提示３ｂを作業者端末３に行う処理部である。１発話ごとに作業者９を特定する処理は、作業者割当部５２により行われる処理の一部であり、作業者ＩＤ取出部に相当する。作業者割当部５２は、更に、対話タスク生成部５４を含む対話タスク選択部５３を有する。 The worker allocation unit 52 is a processing unit that, in response to receiving the dialogue participation request 3a via the user I/F 60, identifies a worker 9 for each utterance, selects the dialogue task for which the identified worker 9 is to create an utterance, and sends the dialogue task presentation 3b to the worker terminal 3. The process of identifying a worker 9 for each utterance is part of the processing performed by the worker allocation unit 52 and corresponds to the worker ID extraction unit. The worker allocation unit 52 further has a dialogue task selection unit 53, which includes a dialogue task generation unit 54.

対話タスク選択部５３は、作業者割当部５２から呼び出される処理部であり、選択部に相当する。対話タスク選択部５３は、上述した作業者割当手順に従って、作業者９に割り当てる対話タスクを選択する。対話タスク選択部５３は、対話タスク生成部５４を呼び出して、対話タスクを生成させ、生成した対話タスクをユーザＩ／Ｆ６０を介して作業者端末３に送信して表示させる。 The dialogue task selection unit 53 is a processing unit called from the worker allocation unit 52, and corresponds to the selection unit. The dialogue task selection unit 53 selects the dialogue task to be assigned to the worker 9 according to the worker allocation procedure described above. The dialogue task selection unit 53 calls the dialogue task generation unit 54 to generate a dialogue task, and transmits the generated dialogue task to the worker terminal 3 via the user I / F60 to display it.

発話作成部５５は、ユーザＩ／Ｆ６０を介して作業者端末３から発話作成情報３ｃを受信すると、対話タスクテーブル４２を更新する。発話作成情報３ｃから取得される発話データは、新たに付与された発話ＩＤと共に発話データＤＢ４４に蓄積される。発話データＤＢ４４は、対話コーパス１ａに相当する。 When the utterance creation unit 55 receives the utterance creation information 3c from the worker terminal 3 via the user I / F60, the utterance creation unit 55 updates the dialogue task table 42. The utterance data acquired from the utterance creation information 3c is accumulated in the utterance data DB 44 together with the newly assigned utterance ID. The utterance data DB 44 corresponds to the dialogue corpus 1a.

発話作成部５５は、更に、発話情報更新部５６を有し、発話情報更新部５６を呼び出すことにより、発話データを追加し、対話タスクの発話スコアを再計算させ更新させる。 The utterance creation unit 55 further has an utterance information update unit 56, and by calling the utterance information update unit 56, the utterance data is added, and the utterance score of the dialogue task is recalculated and updated.

発話情報更新部５６は、発話作成情報３ｃに基づき、発話テーブル４３で管理される発話木構造を更新し、発話スコアを更新する。 The utterance information update unit 56 updates the utterance tree structure managed in the utterance table 43 based on the utterance creation information 3c, and updates the utterance score.

対話データ取出部５７は、ユーザＩ／Ｆ６０を介して利用者端末６から対話データ取得要求６ａを受信すると、発話テーブル４３を参照して、作業者スコアと、対話スコアとを算出し、スコア付き対話データ６ｂを出力する。対話データ６ｂの一部又は全部を利用者端末６に提供してもよい。利用者７がサーバ装置１００の管理者である場合、入力装置１１４から対話データ取得要求６ａを行ってもよい。 When the dialogue data extraction unit 57 receives the dialogue data acquisition request 6a from the user terminal 6 via the user I/F 60, it refers to the utterance table 43, calculates the worker scores and the dialogue scores, and outputs the scored dialogue data 6b. Part or all of the dialogue data 6b may be provided to the user terminal 6. When the user 7 is the administrator of the server device 100, the dialogue data acquisition request 6a may be made from the input device 114.

対話データ取出部５７は、更に、作業者スコア算出部５８と、発話スコア算出部５９とを有する。対話データ取出部５７は、作業者スコア算出部５８を呼び出して、発話テーブル４３を用いて作業者９それぞれの作業者スコアを算出させる。作業者スコア算出部５８によって作業者スコアテーブル４５が記憶部１３０に出力され記憶される。 The dialogue data extraction unit 57 further includes a worker score calculation unit 58 and an utterance score calculation unit 59. The dialogue data extraction unit 57 calls the worker score calculation unit 58 to calculate the worker score of each worker 9 using the utterance table 43. The worker score calculation unit 58 outputs and stores the worker score table 45 in the storage unit 130.

対話データ取出部５７は、発話スコア算出部５９を呼び出して、作業者スコアテーブル４５を用いて対話タスクごとの対話スコアを算出する。対話データ取出部５７によってスコア付き対話データ６ｂが記憶部１３０に出力され記憶される。 The dialogue data extraction unit 57 calls the utterance score calculation unit 59 and calculates the dialogue score for each dialogue task using the worker score table 45. The dialogue data extraction unit 57 outputs the scored dialogue data 6b to the storage unit 130 and stores it.

ユーザＩ／Ｆ６０は、サーバ装置１００と作業者端末３の作業者９とのインタフェース、及びサーバ装置１００と利用者端末６の利用者７とインタフェースを、通信Ｉ／Ｆ１１７を介して行う。 The user I / F 60 performs an interface between the server device 100 and the worker 9 of the worker terminal 3 and an interface between the server device 100 and the user 7 of the user terminal 6 via the communication I / F 117.

次に、記憶部１３０に記憶される種々のデータ構成について図１８から図２３について説明する。 Next, the various data structures stored in the storage unit 130 will be described with reference to FIGS. 18 to 23.

図１８は、作業者リストのデータ構成例を示す図である。図１８に示す作業者リスト４１は、図１０で説明したように、作業者９の作業者ＩＤを追加した順で取り出すキュー構造を持つリストである。 FIG. 18 is a diagram showing an example of data structure of the worker list. As described with reference to FIG. 10, the worker list 41 shown in FIG. 18 is a list having a queue structure in which the worker IDs of the workers 9 are taken out in the order of addition.

この例では、作業者リスト４１の左から順に追加順に、ｗ４、ｗ９、ｗ２１、ｗ１０、ｗ５、ｗ７、ｗ８、そしてｗ４０が示されている。作業者リスト４１の長さは、作業者ＩＤの追加及び取り出しにより伸縮し、作業者リスト４１は拡張可能な記憶領域である。 In this example, w4, w9, w21, w10, w5, w7, w8, and w40 are shown in order of addition from the left of the worker list 41. The length of the worker list 41 is expanded and contracted by adding and taking out the worker ID, and the worker list 41 is an expandable storage area.
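The first-in, first-out behaviour of the worker list 41 can be illustrated with a short sketch. This is not the patent's implementation: it assumes an in-memory `collections.deque`, and the function names `register_worker` and `take_next_worker` are hypothetical. The initial contents are the worker IDs from the example above, and `"w12"` is an invented ID used only to show an append.

```python
from collections import deque

# Hypothetical in-memory stand-in for the worker list 41 (FIFO queue of worker IDs).
worker_list = deque(["w4", "w9", "w21", "w10", "w5", "w7", "w8", "w40"])


def register_worker(queue, worker_id):
    """Append a worker ID at the tail, as when a participation request arrives."""
    queue.append(worker_id)


def take_next_worker(queue):
    """Remove and return the oldest worker ID, or None when the list is empty."""
    return queue.popleft() if queue else None


register_worker(worker_list, "w12")   # "w12" is an illustrative ID, not from the source
first = take_next_worker(worker_list)  # the ID that was added earliest
```

Because a deque grows and shrinks as IDs are appended and removed, it mirrors the description of the list as an expandable storage area.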

図１９は、対話タスクテーブルのデータ構成例を示す図である。図１９において、対話タスクテーブル４２は、対話タスク毎に、発話を作成した順で作業者ＩＤを記録したテーブルであり、対話ＩＤ、作業者ＩＤのリスト等の項目を有する。 FIG. 19 is a diagram showing an example of the data structure of the dialogue task table. In FIG. 19, the dialogue task table 42 is a table that records, for each dialogue task, the worker IDs in the order in which the utterances were created, and has items such as a dialogue ID and a list of worker IDs.

対話ＩＤは、対話タスクを特定する識別情報である。全ての対話タスクはタスクテーブル４２に登録され管理される。作業者ＩＤのリストは、発話作成毎に、作業者ＩＤを作成順に記録した一覧である。 The dialogue ID is identification information that identifies a dialogue task. All dialogue tasks are registered and managed in the dialogue task table 42. The list of worker IDs records, each time an utterance is created, the worker ID of its creator in order of creation.

この例では、対話ＩＤ「ｄ１」に対して、作業者ＩＤ「ｗ４」及び「Ｗ１１」が記録されている。対話ＩＤ「ｄ２」に対して、作業者ＩＤ「ｗ５」、「ｗ７」、「ｗ５」、及び「Ｗ８」が記録されている。 In this example, the worker IDs "w4" and "W11" are recorded for the dialogue ID "d1". Worker IDs "w5", "w7", "w5", and "W8" are recorded for the dialogue ID "d2".

図２０は、発話テーブルのデータ構成例を示す図である。図２０において、発話テーブル４３は、対話タスクに対して作成された発話を発話木構造８で記録し管理するテーブルである。発話テーブル４３は、対話ＩＤ、発話ＩＤ、親発話、作業者ＩＤ、発話スコア（ＵＳ）、支持数等の項目を有する。 FIG. 20 is a diagram showing an example of the data structure of the utterance table. In FIG. 20, the utterance table 43 is a table that records and manages the utterances created for a dialogue task as an utterance tree structure 8. The utterance table 43 has items such as a dialogue ID, an utterance ID, a parent utterance, a worker ID, an utterance score (US), and a support count.

対話ＩＤは、対話を特定する識別情報を示し、対話タスクテーブル４２に登録された識別情報である。発話ＩＤは、作成された発話を表わす発話データと関連付けられて発話データＤＢ４４で管理されている識別情報である。 The dialogue ID indicates identification information that identifies the dialogue, and is identification information registered in the dialogue task table 42. The utterance ID is identification information that is associated with the utterance data representing the created utterance and is managed in the utterance data DB 44.

親発話は、発話木構造８において、親の関係にある発話ＩＤを示す。作業者ＩＤは、発話を作成した作業者９を特定する識別情報を示す。発話スコア（ＵＳ）は、支持数に基づいて算出された値を示し、発話の追加毎に更新される。支持数は、発話の追加毎に加算又は減算された値を示し、発話スコアの算出時に参照される。 The parent utterance indicates the utterance ID of the parent in the utterance tree structure 8. The worker ID is identification information that identifies the worker 9 who created the utterance. The utterance score (US) is a value calculated from the support count and is updated each time an utterance is added. The support count is a value that is incremented or decremented each time an utterance is added, and is referenced when the utterance score is calculated.

このデータ例では、対話ＩＤ「ｄ２」に関して、発話ＩＤ「ｕ２１」〜「ｕ２７」が記録されている。発話ＩＤ「ｕ２１」のレコードでは、自身が親発話に相当するため、親発話ＩＤは存在せず、発話作成者は作業者ＩＤ「ｗ１」の作業者９であり、発話スコアは「１」であり、支持数「１」を示している。 In this data example, utterance IDs "u21" to "u27" are recorded for the dialogue ID "d2". In the record for utterance ID "u21", no parent utterance ID exists because this utterance is itself the root (parent) utterance; the utterance was created by the worker 9 with worker ID "w1", the utterance score is "1", and the support count is "1".

発話ＩＤ「ｕ２２」のレコードでは、親発話ＩＤは「ｕ２１」であり、発話作成者は作業者ＩＤ「ｗ４」の作業者９であり、発話スコアは「０」であり、支持数「０」を示している。発話ＩＤ「ｕ２３」のレコードでは、親発話ＩＤは「ｕ２１」であり、発話作成者は作業者ＩＤ「ｗ７」の作業者９であり、発話スコアは「２／３」であり、支持数「２」を示している。発話ＩＤ「ｕ２４」のレコードでは、親発話ＩＤは「ｕ２１」であり、発話作成者は作業者ＩＤ「ｗ２」の作業者９であり、発話スコアは「１／３」であり、支持数「１／３」を示している。 In the record for utterance ID "u22", the parent utterance ID is "u21", the creator is the worker 9 with worker ID "w4", the utterance score is "0", and the support count is "0". In the record for utterance ID "u23", the parent utterance ID is "u21", the creator is the worker 9 with worker ID "w7", the utterance score is "2/3", and the support count is "2". In the record for utterance ID "u24", the parent utterance ID is "u21", the creator is the worker 9 with worker ID "w2", the utterance score is "1/3", and the support count is "1/3".

発話ＩＤ「ｕ２５」のレコードでは、親発話ＩＤは「ｕ２３」であり、発話作成者は作業者ＩＤ「ｗ３」の作業者９であり、発話スコアは「１」であり、支持数「１」を示している。発話ＩＤ「ｕ２６」のレコードでは、親発話ＩＤは「ｕ２５」であり、発話作成者は作業者ＩＤ「ｗ５」の作業者９であり、発話スコアは「１」であり、支持数「１」を示している。発話ＩＤ「ｕ２７」のレコードでは、親発話ＩＤは「ｕ２６」であり、発話作成者は作業者ＩＤ「ｗ６」の作業者９であり、発話スコアは「１」であり、支持数「１」を示している。 In the record for utterance ID "u25", the parent utterance ID is "u23", the creator is the worker 9 with worker ID "w3", the utterance score is "1", and the support count is "1". In the record for utterance ID "u26", the parent utterance ID is "u25", the creator is the worker 9 with worker ID "w5", the utterance score is "1", and the support count is "1". In the record for utterance ID "u27", the parent utterance ID is "u26", the creator is the worker 9 with worker ID "w6", the utterance score is "1", and the support count is "1".

このようなデータにより、発話木構造８ｋを表わすことができる。発話木構造８ｋは、対話ｄ２を構成する発話作成の系図を表わしている。ノードは発話を示し、発話ＩＤがノード内に示されている。また、枝に作業者ＩＤを示し、ノード近傍に発話スコアを示している。 With such data, the utterance tree structure 8k can be represented. The utterance tree structure 8k represents a genealogy of utterance creation that constitutes dialogue d2. The node indicates an utterance, and the utterance ID is indicated in the node. In addition, the worker ID is shown on the branch, and the utterance score is shown near the node.
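To make the relationship between the flat records and the utterance tree concrete, the following sketch rebuilds the tree for dialogue d2 from its parent links. It is an illustration only: the tuple layout, the names `ROWS`, `build_children`, and `path_from_root`, and the use of `fractions.Fraction` for the scores are assumptions, and the support count column is omitted for brevity. The record values are those listed above.

```python
from collections import defaultdict
from fractions import Fraction

# (utterance ID, parent utterance ID, worker ID, utterance score) for dialogue d2.
ROWS = [
    ("u21", None, "w1", Fraction(1)),
    ("u22", "u21", "w4", Fraction(0)),
    ("u23", "u21", "w7", Fraction(2, 3)),
    ("u24", "u21", "w2", Fraction(1, 3)),
    ("u25", "u23", "w3", Fraction(1)),
    ("u26", "u25", "w5", Fraction(1)),
    ("u27", "u26", "w6", Fraction(1)),
]


def build_children(rows):
    """Map each parent utterance ID to its child utterance IDs, in record order."""
    children = defaultdict(list)
    for utterance_id, parent_id, _worker_id, _score in rows:
        children[parent_id].append(utterance_id)
    return children


def path_from_root(rows, utterance_id):
    """Follow parent links upward and return the path from the root to the given node."""
    parent_of = {uid: parent for uid, parent, _worker_id, _score in rows}
    path = []
    node = utterance_id
    while node is not None:
        path.append(node)
        node = parent_of[node]
    return path[::-1]


children = build_children(ROWS)
```

Here `children[None]` holds the root utterance, and `path_from_root(ROWS, "u27")` walks the branch that passes through u23, u25, and u26.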

図２１は、発話データＤＢのデータ構成例を示す図である。図２１において、発話データＤＢ４４は、作業者端末３から受信した発話データを発話ＩＤに対応付けて記憶し管理するデータベースであり、発話ＩＤ、発話データ等の項目を有する。 FIG. 21 is a diagram showing a data configuration example of the utterance data DB. In FIG. 21, the utterance data DB 44 is a database that stores and manages utterance data received from the worker terminal 3 in association with the utterance ID, and has items such as the utterance ID and the utterance data.

発話ＩＤは、作業者端末３から受信した発話データ毎に一意に付与される識別情報であり、発話データＤＢ４４で全ての発話データが発話ＩＤにより管理される。発話ＩＤにより、発話データＤＢ４４と発話テーブル４３とが関連付けされる。 The utterance ID is identification information uniquely assigned to each utterance data received from the worker terminal 3, and all the utterance data is managed by the utterance ID in the utterance data DB 44. The utterance ID associates the utterance data DB 44 with the utterance table 43.

発話データは、作業者端末３から受信した発話作成情報３ｃから取得した、作業者９が作成した発話を表わすデータである。一例として、発話データは、テキスト、音声データ、対話状況を表現した画像や動画等である。 The utterance data is data representing the utterance created by the worker 9 acquired from the utterance creation information 3c received from the worker terminal 3. As an example, the utterance data is text, voice data, an image or a moving image expressing a dialogue situation, or the like.

このデータ例では、発話ＩＤ「ｕ１１」の発話データは「どちらへおでかけですか」であり、発話ＩＤ「ｕ１２」の発話データは「シカゴに行きます」であり、発話ＩＤ「ｕ１３」の発話データは「楽しんできてください」等が記録されている。他の発話ＩＤについても同様に、作業者９が作成した内容そのものが示される。 In this data example, the utterance data for utterance ID "u11" is "Where are you going?", the utterance data for utterance ID "u12" is "I'm going to Chicago", and the utterance data for utterance ID "u13" is "Have a good time". For the other utterance IDs as well, the content created by the worker 9 is recorded as is.

図２２は、作業者スコアテーブルのデータ構成例を示す図である。図２２において、作業者スコアテーブル４５は、作業者９毎の作業者スコアを記録し管理するテーブルであり、作業者ＩＤ、作業者スコア等の項目を有する。 FIG. 22 is a diagram showing an example of data structure of the worker score table. In FIG. 22, the worker score table 45 is a table for recording and managing the worker score for each worker 9, and has items such as a worker ID and a worker score.

作業者ＩＤは、作業者リスト４１に１度以上追加されたことのある作業者ＩＤを示す。作業者ＩＤによって、発話テーブル４３と作業者スコアテーブル４５とが関連付けされる。 The worker ID indicates a worker ID that has been added to the worker list 41 at least once. The worker ID associates the utterance table 43 with the worker score table 45.

作業者スコアは、発話テーブル４３の作業者ＩＤに対応付けられている発話スコアを用いて算出されたスコアを示し、作業者９の発話作成に対する誠実な貢献度（作成した発話の内容の信頼度）を示す。 The worker score is a score calculated using the utterance scores associated with the worker ID in the utterance table 43, and indicates how sincerely the worker 9 has contributed to utterance creation (the reliability of the content of the utterances the worker created).

図２３は、スコア付き対話データのデータ構成例を示す図である。図２３において、スコア付き対話データ６ｂは、対話タスクを評価した結果を示すテーブルであり、対話ＩＤ、対話スコア、発話１、発話２、発話３等の項目を有する。 FIG. 23 is a diagram showing a data structure example of dialogue data with a score. In FIG. 23, the scored dialogue data 6b is a table showing the result of evaluating the dialogue task, and has items such as dialogue ID, dialogue score, utterance 1, utterance 2, and utterance 3.

対話ＩＤは、対話タスクを特定する識別情報であり、対話タスクテーブル４２に存在する対話ＩＤのいずれかを示す。対話スコアは、発話テーブル４３の発話スコアに基づいて、最も支持されたパスを特定し、特定したパス上で示される発話スコアを用いて算出された値を示し、対話タスクの完成度を示す。対話スコアが「１」の場合に最も完成度が高く、「０」に近付くほど完成度が低いことを示す。発話１、発話２、発話３等は、最も支持されたパス上でなされた発話データを示す。 The dialogue ID is identification information that identifies the dialogue task, and indicates any of the dialogue IDs existing in the dialogue task table 42. The dialogue score identifies the most supported path based on the utterance score of the utterance table 43, indicates a value calculated using the utterance score shown on the identified path, and indicates the degree of completion of the dialogue task. When the dialogue score is "1", the degree of completion is the highest, and the closer it is to "0", the lower the degree of completion. Utterance 1, utterance 2, utterance 3, etc. indicate utterance data made on the most supported path.

このデータ例では、対話ＩＤ「ｄ１」について、対話スコアは「１」を示し、最も支持されたパス上でなわれた発話は、発話１「どちらへおでかけですか」、発話２「シカゴに行きます」、発話３「楽しんできてください」等であったことが示されている。 In this data example, the dialogue score for dialogue ID "d1" is "1", and the utterances made on the most supported path are utterance 1 "Where are you going?", utterance 2 "I'm going to Chicago", utterance 3 "Have a good time", and so on.

対話ＩＤ「ｄ３」について、対話スコアは「０．７５」を示し、最も支持されたパス上でなわれた発話は、発話１「どこへ行きますか」、発話２「見送りです」、発話３「[終了]」が示されている。作業者９がこの対話は直前の発話で完結したと判断し、対話画面Ｇ７０の終了チェック領域７４をチェックした場合に、発話３のように[終了]が示される。 For the dialogue ID "d3", the dialogue score is "0.75", and the utterances made on the most supported path are utterance 1 "where are you going", utterance 2 "see off", utterance 3 "[End]" is shown. When the worker 9 determines that this dialogue is completed in the immediately preceding utterance and checks the end check area 74 of the dialogue screen G70, [End] is indicated as in utterance 3.
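The threshold-based extraction mentioned for step S10 can be sketched against the two example rows above. This is a hedged illustration, not the disclosed implementation: the dictionary layout, the function name `extract_dialogues`, the default value `0.5`, and the use of a greater-than-or-equal comparison are all assumptions, since the text only says that the condition is, for example, a dialogue score threshold.

```python
# Rows modelled on the scored dialogue data 6b example (dialogue ID and dialogue score only).
SCORED_DIALOGUES = [
    {"dialogue_id": "d1", "dialogue_score": 1.0},
    {"dialogue_id": "d3", "dialogue_score": 0.75},
]

DEFAULT_THRESHOLD = 0.5  # hypothetical default; the source does not give a value


def extract_dialogues(rows, threshold=None):
    """Return the rows whose dialogue score is at least the threshold (assumed comparison)."""
    limit = DEFAULT_THRESHOLD if threshold is None else threshold
    return [row for row in rows if row["dialogue_score"] >= limit]


# A request that specifies its own threshold; omitting it falls back to the default.
selected = extract_dialogues(SCORED_DIALOGUES, threshold=0.8)
```

Passing the threshold as an optional argument reflects the two alternatives in the text: a value carried in the dialogue data acquisition request 6a, or a preset default.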

次に、各処理部５０〜５８における処理について説明する。以下の図において、点線はデータの流れを示し、実線は処理の流れを示す。先ず、初期化部５０による初期化処理について説明する。 Next, the processing in each processing unit 50 to 58 will be described. In the figure below, the dotted line shows the data flow and the solid line shows the processing flow. First, the initialization process by the initialization unit 50 will be described.

図２４は、初期化処理を説明するためのフローチャート図である。図２４において、初期化部５０は、対話数を取得する（ステップＳ２０１）。対話数の取得は、記憶部１３０に予め設定された対話数を読み込むことでもよいし、入力画面を表示装置１１５に表示して、サーバ装置１００の管理者等により初期設定として入力させてもよい。初期化部５０は、作業者リスト４１を空にし（ステップＳ２０２）、対話数に基づいて、対話タスクテーブル４２を初期化する（ステップＳ２０３）。 FIG. 24 is a flowchart for explaining the initialization process. In FIG. 24, the initialization unit 50 acquires the number of dialogues (step S201). The number of dialogues may be acquired by reading a value preset in the storage unit 130, or by displaying an input screen on the display device 115 and having the administrator of the server device 100 or the like enter it as an initial setting. The initialization unit 50 then empties the worker list 41 (step S202) and initializes the dialogue task table 42 based on the number of dialogues (step S203).

対話タスクテーブル４２の初期化の一例として、対話数「５」が指定された場合、対話タスクを識別する対話ＩＤを５個生成し、対話タスクテーブル４２に登録する。対話ＩＤ「ｄ１」、「ｄ２」、「ｄ３」、「ｄ４」、及び「ｄ５」が生成された場合、対話ＩＤ毎にレコードが対話タスクテーブル４２に生成される。 As an example of initialization of the dialogue task table 42, when the number of dialogues "5" is specified, five dialogue IDs that identify the dialogue tasks are generated and registered in the dialogue task table 42. When the dialogue IDs "d1", "d2", "d3", "d4", and "d5" are generated, a record is generated in the dialogue task table 42 for each dialogue ID.
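The initialization in steps S201 to S203 can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `initialize` is hypothetical, the tables are plain in-memory Python objects, and the `d<n>` ID format simply follows the example IDs above rather than any format the source prescribes.

```python
def initialize(num_dialogues):
    """Return an empty worker list and a dialogue task table with one record per dialogue ID."""
    worker_list = []  # step S202: the worker list starts empty
    # Step S203: one record per dialogue ID, each with an empty list of worker IDs.
    dialogue_task_table = {f"d{n}": [] for n in range(1, num_dialogues + 1)}
    return worker_list, dialogue_task_table


# Step S201 is represented here by passing the number of dialogues directly.
worker_list, dialogue_task_table = initialize(5)
```

Each record begins with an empty worker ID list, which later grows by one entry per created utterance.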

次に、作業者登録部５１による作業者登録処理と、作業者割当部５２による作業者割当処理とについて説明する。図２５は、作業者登録処理と作業者割当処理とを説明するためのフローチャート図である。 Next, the worker registration process by the worker registration unit 51 and the worker allocation process by the worker allocation unit 52 will be described. FIG. 25 is a flowchart for explaining the worker registration process and the worker allocation process.

図２５において、作業者登録部５１は、対話参加要求３ａをユーザＩ／Ｆ６０を介して受信すると、対話参加要求３ａから作業者ＩＤを取得して、作業者リスト４１へ追加する（ステップＳ２１１）。作業者登録部５１は、対話参加要求３ａの受信毎に、同様の処理を行う。 In FIG. 25, when the worker registration unit 51 receives the dialogue participation request 3a via the user I/F 60, it acquires the worker ID from the dialogue participation request 3a and adds it to the worker list 41 (step S211). The worker registration unit 51 performs the same process each time a dialogue participation request 3a is received.

一方、作業者割当部５２は、所定間隔で、作業者リスト４１から順に作業者ＩＤを１つ取り出して（ステップＳ２２１）、対話タスク選択部５３を呼び出して作業者ＩＤ及び対話タスクテーブル４２を通知して、対話タスク選択部５３により対話タスク選択処理を行う（ステップＳ２２２）。作業者ＩＤに対して割り当てる対話タスクが対話ＩＤにより特定される。作業者割当部５２によるステップＳ２２１は作業者ＩＤ取出部に相当し、対話タスク選択部５３によるステップＳ２２２は選択部に相当する。 On the other hand, the worker allocation unit 52 takes out one worker ID in order from the worker list 41 at predetermined intervals (step S221), calls the dialogue task selection unit 53, and notifies the worker ID and the dialogue task table 42. Then, the dialogue task selection unit 53 performs the dialogue task selection process (step S222). The dialogue task to be assigned to the worker ID is specified by the dialogue ID. Step S221 by the worker allocation unit 52 corresponds to the worker ID extraction unit, and step S222 by the dialogue task selection unit 53 corresponds to the selection unit.

タスク選択処理により対話タスクが特定されると、作業者割当部５２は、作業者リスト４１は空か否かを判定する（ステップＳ２２４）。作業者リスト４１が空でない場合（ステップＳ２２４のＮＯ）、作業者割当部５２は、ステップＳ２２１へと戻り、上記同様の処理を繰り返す。一方、作業者リスト４１が空の場合（ステップＳ２２４のＹＥＳ）、作業者割当部５２は、この作業者割当処理を終了する。 When the dialogue task has been identified by the dialogue task selection process, the worker allocation unit 52 determines whether the worker list 41 is empty (step S224). If the worker list 41 is not empty (NO in step S224), the worker allocation unit 52 returns to step S221 and repeats the same processing. If the worker list 41 is empty (YES in step S224), the worker allocation unit 52 ends the worker allocation process.
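The loop formed by steps S221 to S224 can be sketched as below. This is an illustration rather than the disclosed implementation: `allocate_workers` and the `select_task` callback are hypothetical names, the worker list is assumed to be a `collections.deque`, and the "predetermined interval" between iterations is omitted. Unlike the flowchart, the sketch tests for an empty list before taking an ID, so that it also behaves sensibly when called on an empty list.

```python
from collections import deque


def allocate_workers(worker_list, select_task):
    """Take worker IDs in FIFO order and pair each with the task chosen for it."""
    assignments = []
    while worker_list:                          # step S224: stop once the list is empty
        worker_id = worker_list.popleft()       # step S221: take the oldest worker ID
        dialogue_id = select_task(worker_id)    # step S222: dialogue task selection
        assignments.append((worker_id, dialogue_id))
    return assignments


# Illustrative run with a stub selector that always answers "d1".
queue = deque(["w4", "w9"])
result = allocate_workers(queue, lambda worker_id: "d1")
```

Passing the selector in as a callback keeps the queue handling separate from the selection rule, which matches the way the worker allocation unit 52 calls the dialogue task selection unit 53.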

図２６は、図２５のステップＳ２２２における対話タスク選択処理を説明するためのフローチャート図である。図２６において、対話タスク選択部５３は、作業者割当部５２から作業者ＩＤ及び対話タスクテーブル４２を受け取ると、対話タスクテーブル４２の全ての対話ＩＤに対して選択確率を算出する（ステップＳ２３１）。 FIG. 26 is a flowchart for explaining the dialogue task selection process in step S222 of FIG. 25. In FIG. 26, when the dialogue task selection unit 53 receives the worker ID and the dialogue task table 42 from the worker allocation unit 52, it calculates a selection probability for every dialogue ID in the dialogue task table 42 (step S231).

上述した作業者割当手順に従って、対話ＩＤごとに選択確率が求められる。対話タスクが、通知された作業者ＩＤで終わっている場合には、
選択確率＝０
とする。それ以外では、
選択確率＝作業者ＩＤの出現回数／発話個数
により算出する。
The selection probability is obtained for each dialogue ID according to the worker allocation procedure described above. If the dialogue task ends with the notified worker ID, then
Selection probability = 0
Otherwise, it is calculated as
Selection probability = (number of occurrences of the worker ID) / (number of utterances)

対話タスク選択部５３は、選択確率の最大値を持つ対話ＩＤを対話タスクテーブル４２から選択し（ステップＳ２３２）、選択した対話ＩＤの数は１つか否かを判断する（ステップＳ２３３）。選択した対話ＩＤの数は１つの場合（ステップＳ２３３のＹＥＳ）、対話タスク選択部５３は、ステップＳ２３７へと進む。 The dialogue task selection unit 53 selects the dialogue ID having the maximum selection probability from the dialogue task table 42 (step S232), and determines whether or not the number of selected dialogue IDs is one (step S233). When the number of selected dialogue IDs is one (YES in step S233), the dialogue task selection unit 53 proceeds to step S237.

一方、選択した対話ＩＤの数が複数の場合（ステップＳ２３３のＮＯ）、対話タスク選択部５３は、発話個数が最小の対話ＩＤを選択し（ステップＳ２３４）、選択した対話ＩＤの数は１つか否かを判断する（ステップＳ２３５）。選択した対話ＩＤの数は１つの場合（ステップＳ２３５のＹＥＳ）、対話タスク選択部５３は、ステップＳ２３７へと進む。 On the other hand, when the number of selected dialogue IDs is plural (NO in step S233), the dialogue task selection unit 53 selects the dialogue ID having the smallest number of utterances (step S234), and the number of selected dialogue IDs is one. It is determined whether or not (step S235). When the number of selected dialogue IDs is one (YES in step S235), the dialogue task selection unit 53 proceeds to step S237.

一方、選択した対話ＩＤの数が複数の場合（ステップＳ２３５のＮＯ）、対話タスク選択部５３は、選択した複数の対話ＩＤからランダムに１つを選択し（ステップＳ２３６）、対話タスク生成部５４を呼び出して、対話ＩＤ及び発話テーブル４３を通知し、対話タスク生成部５４により対話タスク生成処理を行う（ステップＳ２３７）。対話ＩＤに対応する対話タスクが生成される。 On the other hand, when the number of selected dialogue IDs is plural (NO in step S235), the dialogue task selection unit 53 randomly selects one from the selected plurality of dialogue IDs (step S236), and the dialogue task generation unit 54 Is called to notify the dialogue ID and the utterance table 43, and the dialogue task generation unit 54 performs the dialogue task generation process (step S237). A dialogue task corresponding to the dialogue ID is generated.

対話タスク選択部５３は、対話タスクを取得すると、作業者割当部５２への戻り値として設定し、この対話タスク選択処理を終了する。 When the dialogue task selection unit 53 acquires the dialogue task, it sets it as a return value to the worker allocation unit 52 and ends the dialogue task selection process.
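The selection rule in steps S231 to S236 can be sketched as follows. This is a minimal sketch, not the patented implementation: the function names, the dictionary representation of the dialogue task table 42, and the optional `rng` parameter are assumptions. The source does not say what happens for a dialogue with no utterances yet, so the sketch assumes a probability of 0 in that case to avoid dividing by zero. The sample table uses the d1 and d2 worker ID lists from the FIG. 19 example, written in lower case.

```python
import random


def selection_probability(worker_ids, worker_id):
    """Selection probability of one dialogue for the given worker (step S231)."""
    if not worker_ids:
        return 0.0  # assumption: the source does not define the empty case
    if worker_ids[-1] == worker_id:
        return 0.0  # the dialogue currently ends with this worker's utterance
    return worker_ids.count(worker_id) / len(worker_ids)


def select_dialogue(task_table, worker_id, rng=random):
    """Pick a dialogue ID: highest probability, then fewest utterances, then random."""
    probs = {d: selection_probability(ids, worker_id) for d, ids in task_table.items()}
    best = max(probs.values())
    candidates = [d for d, p in probs.items() if p == best]            # step S232
    if len(candidates) > 1:                                            # step S233: NO
        fewest = min(len(task_table[d]) for d in candidates)
        candidates = [d for d in candidates if len(task_table[d]) == fewest]  # step S234
    if len(candidates) > 1:                                            # step S235: NO
        return rng.choice(candidates)                                  # step S236
    return candidates[0]


TASK_TABLE = {"d1": ["w4", "w11"], "d2": ["w5", "w7", "w5", "w8"]}
choice_for_w5 = select_dialogue(TASK_TABLE, "w5")
```

With this table, worker w5 has taken part in d2 twice out of four utterances and did not write the last one, so d2 has the higher probability. For worker w8, both probabilities are 0 and the tie is broken in favour of d1, which has fewer utterances.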

図２７は、図２６のステップＳ２３７における対話タスク生成処理を説明するためのフローチャート図である。図２７では、対話ＩＤで特定された対話タスクが発話の作成毎に、＜ケース１＞→＜ケース２＞→＜ケース３＞→＜ケース４＞の順に遷移した場合に対応付けて説明する。対話タスク生成部５４から指定される対話ＩＤの発話木構造８は、＜ケース１＞、＜ケース２＞、＜ケース３＞、又は＜ケース４＞のいずれかに相当する。 FIG. 27 is a flowchart for explaining the dialogue task generation process in step S237 of FIG. 26. In FIG. 27, a case where the dialogue task specified by the dialogue ID transitions in the order of <case 1> → <case 2> → <case 3> → <case 4> for each utterance creation will be described. The utterance tree structure 8 of the dialogue ID designated by the dialogue task generation unit 54 corresponds to any one of <Case 1>, <Case 2>, <Case 3>, or <Case 4>.

対話タスク生成部５４は、対話タスク選択部５３から受け付けた発話テーブル４３と対話ＩＤとを用いて、発話テーブル４３における指定された対話ＩＤに対応付けられる発話ＩＤを発話木構造８に従ってたどり、末端の発話ＩＤのうち、発話スコアが０でない発話ＩＤを取得する（ステップＳ２４１）。 Using the utterance table 43 and the dialogue ID received from the dialogue task selection unit 53, the dialogue task generation unit 54 traces the utterance IDs associated with the designated dialogue ID in the utterance table 43 along the utterance tree structure 8, and acquires those terminal utterance IDs whose utterance score is not 0 (step S241).

対話タスク生成部５４は、取得した発話ＩＤの数が１であったか２であったかを判断する（ステップＳ２４２）。 The dialogue task generation unit 54 determines whether the number of acquired utterance IDs is 1 or 2 (step S242).

取得した発話ＩＤは１つであった場合、対話タスク生成部５４は、発話テーブル４３から取得した発話ＩＤを用いて発話データＤＢ４４を参照し発話履歴情報を取得して、対話タスクを作成する（ステップＳ２４３）。 When there is only one utterance ID acquired, the dialogue task generation unit 54 refers to the utterance data DB 44 using the utterance ID acquired from the utterance table 43, acquires utterance history information, and creates a dialogue task ( Step S243).

この場合は＜ケース１＞に相当する。親の発話ＩＤ「Ａ」に対して子の発話ＩＤは「Ｄ」のみである。従って、ステップＳ２４３では、発話データＤＢ４４から発話ＩＤ「Ａ」で特定される発話データと、発話ＩＤ「Ｄ」で特定される発話データとにより、＜ケース１＞の発話木構造８に従って時系列に並べることで発話履歴情報を取得する。この発話履歴情報を含む対話タスクが作成される。 In this case, it corresponds to <Case 1>. The utterance ID of the child is only "D" with respect to the utterance ID "A" of the parent. Therefore, in step S243, the utterance data specified by the utterance ID “A” from the utterance data DB 44 and the utterance data specified by the utterance ID “D” are arranged in chronological order according to the utterance tree structure 8 of <Case 1>. Get utterance history information by arranging them. A dialogue task containing this utterance history information is created.

その後、対話タスク生成部５４は、作成した対話タスクを戻り値として設定し、この対話タスク生成処理を終了する。 After that, the dialogue task generation unit 54 sets the created dialogue task as a return value, and ends the dialogue task generation process.

ステップＳ２４２において取得した発話ＩＤは２つであると判断した場合、対話タスク生成部５４は、発話テーブル４３から発話ＩＤのそれぞれの支持数を取得し、取得した支持数の合計値が２を示すか３を示すかを判断する（ステップＳ２４４）。 If it is determined in step S242 that two utterance IDs were acquired, the dialogue task generation unit 54 acquires the support count of each utterance ID from the utterance table 43 and determines whether the total of the acquired support counts is 2 or 3 (step S244).

ステップＳ２４４において支持数の合計値が２を示すと判断した場合、対話タスク生成部５４は、最後に追加された発話ＩＤを除く発話ＩＤを用いて発話データＤＢ４４を参照し発話履歴情報を取得して、対話タスクを作成する（ステップＳ２４５）。 If it is determined in step S244 that the total support count is 2, the dialogue task generation unit 54 refers to the utterance data DB 44 using the utterance IDs other than the most recently added one, acquires the utterance history information, and creates a dialogue task (step S245).

この場合は＜ケース２＞又は＜ケース３＞に相当する。＜ケース２＞の場合、最後に追加された発話ＩＤは「Ｃ」であるため、この発話ＩＤ「Ｃ」が除かれる。従って、発話データＤＢ４４から発話ＩＤ「Ａ」で特定される発話データと、発話ＩＤ「Ｄ」で特定される発話データとにより、＜ケース２＞の発話木構造８に従って時系列に並べることで発話履歴情報を取得する。この発話履歴情報を含む対話タスクが作成される。 In this case, it corresponds to <Case 2> or <Case 3>. In the case of <Case 2>, since the last added utterance ID is "C", this utterance ID "C" is excluded. Therefore, the utterance data specified by the utterance ID "A" from the utterance data DB 44 and the utterance data specified by the utterance ID "D" are arranged in chronological order according to the utterance tree structure 8 of <Case 2>. Get history information. A dialogue task containing this utterance history information is created.

＜ケース３＞の場合、最後に追加された発話ＩＤは「Ｅ」であるため、この発話ＩＤ「Ｅ」が除かれる。更に、発話ＩＤ「Ｄ」の発話は不適切であると判断されているため考慮せず、発話データＤＢ４４から発話ＩＤ「Ａ」で特定される発話データと、発話ＩＤ「Ｃ」で特定される発話データとにより、＜ケース３＞の発話木構造８に従って時系列に並べることで発話履歴情報を取得する。この発話履歴情報を含む対話タスクが作成される。 In the case of <Case 3>, since the last added utterance ID is "E", this utterance ID "E" is excluded. Further, since it is determined that the utterance of the utterance ID "D" is inappropriate, it is not considered, and the utterance data specified by the utterance ID "A" and the utterance ID "C" are specified from the utterance data DB44. Based on the utterance data, the utterance history information is acquired by arranging them in chronological order according to the utterance tree structure 8 of <Case 3>. A dialogue task containing this utterance history information is created.

ステップＳ２４４において取得した支持数の合計値が３を示すと判断した場合、対話タスク生成部５４は、発話テーブル４３を参照して、発話スコアの最も高い発話を選択して、対話タスクを作成する（ステップＳ２４６）。 If it is determined in step S244 that the total of the acquired support counts is 3, the dialogue task generation unit 54 refers to the utterance table 43, selects the utterance with the highest utterance score, and creates a dialogue task (step S246).

この場合は＜ケース４＞に相当する。＜ケース４＞の場合、発話スコアの最も高い発話ＩＤは「Ｆ」である。発話ＩＤ「Ｆ」を有するパスに従って、発話データＤＢ４４から発話ＩＤ「Ａ」で特定される発話データと、発話ＩＤ「Ｄ」で特定される発話データと、発話ＩＤ「Ｆ」で特定される発話データとにより、＜ケース４＞の発話木構造８に従って時系列に並べることで発話履歴情報を取得する。この発話履歴情報を含む対話タスクが作成される。 In this case, it corresponds to <Case 4>. In the case of <Case 4>, the utterance ID having the highest utterance score is "F". According to the path having the utterance ID "F", the utterance data specified by the utterance ID "A" from the utterance data DB44, the utterance data specified by the utterance ID "D", and the utterance specified by the utterance ID "F". Based on the data, the utterance history information is acquired by arranging them in chronological order according to the utterance tree structure 8 of <Case 4>. A dialogue task containing this utterance history information is created.

Next, the utterance creation process performed by the utterance creation unit 55 will be described. FIG. 28 is a flowchart for explaining the utterance creation process. In FIG. 28, upon receiving the utterance creation information 3c via the user I/F 60, the utterance creation unit 55 assigns an utterance ID to the utterance data in the utterance creation information 3c, records the utterance data in the utterance data DB 44, and adds the utterance ID to the dialogue task table 42 (step S251).

The utterance creation unit 55 then acquires the presence or absence of the inappropriate check from the utterance creation information 3c, passes the acquired check state, the utterance ID, and the utterance table 43 to the utterance information update unit 56, and has the utterance information update process performed (step S252). The utterance information update process updates the utterance table 43, and the utterance creation unit 55 ends the utterance creation process.

次に、発話情報更新部５６による発話情報更新処理について説明する。図２９は、図２８のステップＳ２５２における発話情報更新処理を説明するためのフローチャート図である。 Next, the utterance information update process by the utterance information update unit 56 will be described. FIG. 29 is a flowchart for explaining the utterance information update process in step S252 of FIG. 28.

図２９において、発話情報更新部５６は、発話ＩＤを新しく追加するのか否かを判断する（ステップＳ２６１）。発話木構造８ｐの最後の発話ＩＤ「Ｄ」を親発話ＩＤとする発話ＩＤが他になければ、新たに追加する発話であると判断する。 In FIG. 29, the utterance information updating unit 56 determines whether or not to newly add the utterance ID (step S261). If there is no other utterance ID having the last utterance ID "D" of the utterance tree structure 8p as the parent utterance ID, it is determined that the utterance is newly added.

一例として、図２８では、割り当てた対話が発話木構造８ｐで表現される場合は、新たな発話ＩＤの追加であると判断する。一方、割り当てた対話が発話木構造８ｑで表現される場合は、新たな発話ＩＤは新たな追加ではないと判断する。 As an example, in FIG. 28, when the assigned dialogue is represented by the utterance tree structure 8p, it is determined that a new utterance ID is added. On the other hand, when the assigned dialogue is represented by the utterance tree structure 8q, it is determined that the new utterance ID is not a new addition.

If it is determined in step S261 that the utterance ID is to be newly added, the utterance information update unit 56 further determines whether the last utterance is inappropriate (step S262). If the inappropriate check is absent, the utterance information update unit 56 determines that the last utterance is not inappropriate, additionally registers the new utterance ID in the utterance table 43, and sets its support count to 1 (step S263).

A record for the new utterance ID, with the last utterance ID "D" as its parent utterance ID and a support count of "1", is added to the utterance table 43. The utterance tree structure 8p transitions to the utterance tree structure 8p-1. The utterance information update unit 56 then proceeds to step S268.

On the other hand, if the inappropriate check is present in step S262, the utterance information update unit 56 determines that the last utterance is inappropriate, additionally registers the new utterance ID in the utterance table 43, sets its parent utterance ID to the same parent utterance ID as that of the last utterance ID, and sets its support count to 1 (step S264).

A record for the new utterance ID, with the same parent utterance ID as the last utterance ID and a support count of "1", is added to the utterance table 43. The utterance tree structure 8p transitions to the utterance tree structure 8p-2. The utterance information update unit 56 then proceeds to step S268.

If it is determined in step S261 that the utterance ID is not to be newly added, the utterance information update unit 56 further determines whether the last utterance is inappropriate (step S265). If the inappropriate check is absent, the utterance information update unit 56 determines that the last utterance is not inappropriate and, in the utterance table 43, increases the support count of the parent utterance ID by 1, additionally registers the new utterance ID, and sets the support count of the new utterance ID to 1 (step S266).

The support count of the record of the parent utterance ID is incremented by 1, and a record for the new utterance ID with a support count of "1" is added to the utterance table 43. The utterance tree structure 8q transitions to the utterance tree structure 8q-1. The utterance information update unit 56 then proceeds to step S268.

On the other hand, if the inappropriate check is present in step S265, the utterance information update unit 56 determines that the last utterance is inappropriate. The utterance information update unit 56 then, in the utterance table 43, sets the support count of the last utterance ID to 0, additionally registers the new utterance ID, sets the parent utterance ID of the new utterance ID to the same parent utterance ID as that of the last utterance ID, and sets the support count of the new utterance ID to 1 (step S267).

The support count of the record of the last utterance ID is set to 0, and a record for the new utterance ID, with the same parent utterance ID as the last utterance ID and a support count of "1", is added to the utterance table 43. The utterance tree structure 8q transitions to the utterance tree structure 8q-2. The utterance information update unit 56 then proceeds to step S268.
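The four branches of steps S261 to S267 can be sketched as follows. This is an illustration under stated assumptions, not the implementation of the utterance information update unit 56. The names `UtteranceRecord` and `update_utterance_table` are hypothetical, the table is assumed to hold a single dialogue, and the branch condition is read as "the last utterance already has a sibling", which is the reading consistent with the examples 8p and 8q and with Appendix 3.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class UtteranceRecord:
    """One row of a hypothetical, simplified utterance table."""
    parent_id: Optional[str]
    support: int = 1


def update_utterance_table(
    table: Dict[str, UtteranceRecord],
    last_id: str,
    new_id: str,
    inappropriate: bool,
) -> None:
    """Register new_id after last_id, following the branches of S261 to S267."""
    last = table[last_id]
    # Assumed reading of S261: does the last utterance already have a sibling?
    has_sibling = any(
        uid != last_id and rec.parent_id == last.parent_id
        for uid, rec in table.items()
    )
    if has_sibling:
        if inappropriate:
            last.support = 0   # S267: the last utterance loses its support
        else:
            last.support += 1  # S266: the last utterance gains one supporter
    # Check absent (S263, S266): the new utterance continues from the last one.
    # Check present (S264, S267): the new utterance branches off as a sibling.
    parent = last.parent_id if inappropriate else last_id
    table[new_id] = UtteranceRecord(parent_id=parent, support=1)


if __name__ == "__main__":
    tree_8q = {
        "A": UtteranceRecord(None),
        "C": UtteranceRecord("A"),
        "D": UtteranceRecord("A"),
    }
    update_utterance_table(tree_8q, last_id="D", new_id="N", inappropriate=True)
    print(tree_8q)
```

Keeping the support update separate from the registration of the new record makes the four outcomes (8p-1, 8p-2, 8q-1, 8q-2) fall out of two independent decisions.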

In step S268, the utterance information update unit 56 calculates an utterance score for the node whose support count has changed and for each of its sibling nodes, and sets the score in the corresponding record. The utterance score is obtained as
support count ÷ total support count of the sibling nodes,
where the total includes the node's own support count.

発話木構造８ｐ−１では、新たな発話ＩＤ「Ｎ」の兄弟ノードは存在しないため、発話スコアは「１」となる。 In the utterance tree structure 8p-1, since there is no sibling node of the new utterance ID “N”, the utterance score is “1”.

また、発話木構造８ｐ−２では、新たな発話ＩＤ「Ｎ」の兄弟ノードは発話ＩＤ「Ｄ」である。従って、発話ＩＤ「Ｎ」の発話スコアに「１／２」が設定され、発話ＩＤ「Ｄ」の発話スコアに「１／２」が設定される。 Further, in the utterance tree structure 8p-2, the sibling node of the new utterance ID “N” is the utterance ID “D”. Therefore, "1/2" is set in the utterance score of the utterance ID "N", and "1/2" is set in the utterance score of the utterance ID "D".

On the other hand, in the utterance tree structure 8q-1, the utterance ID "D", whose support count has changed, has the utterance ID "C" as a sibling node, so "2/3" is set as the utterance score of the utterance ID "D" and "1/3" as that of the utterance ID "C". The utterance ID "N" has no sibling, so its utterance score is "1".

In the utterance tree structure 8q-2, the utterance ID "D", whose support count has changed, has the utterance ID "C" and the newly added utterance ID "N" as sibling nodes. The utterance score of the utterance ID "D" is "0", which indicates a deleted state. For the sibling nodes, "1/2" is set as the utterance score of the utterance ID "C" and "1/2" as that of the utterance ID "N".
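The four worked examples above can be reproduced with a short sketch of the step S268 calculation. The function name `utterance_scores` and its input shape (a mapping from utterance ID to support count for one group of siblings) are assumptions made for illustration. Returning 0 when the group has no support at all is also an assumption, since the description does not cover that case.

```python
from fractions import Fraction
from typing import Dict


def utterance_scores(sibling_supports: Dict[str, int]) -> Dict[str, Fraction]:
    """Score each node as its support count over the group's total support count."""
    total = sum(sibling_supports.values())
    if total == 0:
        # Assumed handling: no supported node in the group.
        return {uid: Fraction(0) for uid in sibling_supports}
    return {uid: Fraction(count, total) for uid, count in sibling_supports.items()}


if __name__ == "__main__":
    # Sibling groups taken from the examples 8p-1, 8p-2, 8q-1, and 8q-2.
    for group in ({"N": 1}, {"N": 1, "D": 1}, {"D": 2, "C": 1}, {"D": 0, "C": 1, "N": 1}):
        print(group, utterance_scores(group))
```

Exact fractions are used so that the results match the values "1/2", "2/3", and "1/3" quoted in the text without rounding.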

Next, the dialogue data extraction process performed by the dialogue data extraction unit 57 will be described. FIG. 30 is a flowchart for explaining the dialogue data extraction process. In FIG. 30, upon being notified of the dialogue data acquisition request 6a from the user I/F 60, the dialogue data extraction unit 57 calls the worker score calculation unit 58, passes it the utterance table 43, and has the worker score calculation process performed (step S271). As the return value, the dialogue data extraction unit 57 obtains from the worker score calculation unit 58 the worker score table 45, which indicates, for each worker 9, the reliability of the utterances that the worker created.

そして、対話データ取出部５７は、対話スコア算出部５９を呼び出して、作業者スコア算出部５８から得た作業者スコアテーブル４５を通知し、対話スコア算出処理を行う（ステップＳ２７２）。対話データ取出部５７は、対話スコア算出部５９から、戻り値として、対話毎に文脈の適切さを示したスコア付き対話データ６ｂを取得する。 Then, the dialogue data extraction unit 57 calls the dialogue score calculation unit 59, notifies the worker score table 45 obtained from the worker score calculation unit 58, and performs the dialogue score calculation process (step S272). The dialogue data extraction unit 57 acquires scored dialogue data 6b indicating the appropriateness of the context for each dialogue as a return value from the dialogue score calculation unit 59.

次に、作業者スコア算出部５８による作業者スコア算出処理について説明する。図３１は、図３０のステップＳ２７１における作業者スコア算出処理を説明するためのフローチャート図である。 Next, the worker score calculation process by the worker score calculation unit 58 will be described. FIG. 31 is a flowchart for explaining the worker score calculation process in step S271 of FIG. 30.

In FIG. 31, the worker score calculation unit 58 refers to the utterance table 43 and obtains, for each worker ID, the number of utterances and the sum of the utterance scores (step S281).

そして、作業者スコア算出部５８は、作業者ＩＤ毎に、求めた発話数と発話スコアの合計とから平均値を算出し、作業者ＩＤと算出した平均値とを対応付けた作業者スコアテーブル４５を作成する（ステップＳ２８２）。作業者スコア算出部５８は、作成した作業者スコアテーブル４５を戻り値に設定し、この作業者スコア算出処理を終了する。 Then, the worker score calculation unit 58 calculates an average value from the obtained number of utterances and the total of the utterance scores for each worker ID, and the worker score table in which the worker ID and the calculated average value are associated with each other. 45 is created (step S282). The worker score calculation unit 58 sets the created worker score table 45 as a return value, and ends the worker score calculation process.
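Steps S281 and S282 amount to a per-worker average of utterance scores, which can be sketched as below. The function name `worker_scores` and the row format (pairs of worker ID and utterance score drawn from the utterance table) are hypothetical simplifications.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def worker_scores(rows: Iterable[Tuple[str, float]]) -> Dict[str, float]:
    """Average the utterance scores of each worker ID."""
    counts: Dict[str, int] = defaultdict(int)
    totals: Dict[str, float] = defaultdict(float)
    for worker_id, utterance_score in rows:
        counts[worker_id] += 1                 # S281: number of utterances
        totals[worker_id] += utterance_score   # S281: sum of utterance scores
    # S282: average per worker ID, i.e. the content of a worker score table
    return {worker_id: totals[worker_id] / counts[worker_id] for worker_id in counts}


if __name__ == "__main__":
    rows = [("w1", 1.0), ("w1", 0.5), ("w2", 0.0)]
    print(worker_scores(rows))
```

A worker whose utterances are repeatedly flagged as inappropriate ends up with a low average, which is what allows workers to be filtered by worker score.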

次に、対話スコア算出部５９による対話スコア算出処理について説明する。図３２は、図３０のステップＳ２７２における対話スコア算出処理を説明するためのフローチャート図である。 Next, the dialogue score calculation process by the dialogue score calculation unit 59 will be described. FIG. 32 is a flowchart for explaining the dialogue score calculation process in step S272 of FIG. 30.

In FIG. 32, the dialogue score calculation unit 59 refers to the utterance table 43 and, for each dialogue ID, identifies the most supported path by selecting utterance IDs on the basis of their utterance scores while following the parent-child relationships of the utterance tree structure 8 (step S291). When the same parent utterance ID is associated with two or more utterance IDs, the dialogue score calculation unit 59 selects the utterance ID with the highest utterance score.

The dialogue score calculation unit 59 refers to the utterance table 43 and acquires, for each dialogue ID, the worker IDs associated with the utterance IDs on the identified path (step S292). The dialogue score calculation unit 59 then refers to the worker score table 45 and acquires, for each dialogue ID, the worker scores associated with those worker IDs (step S293).

In addition, the dialogue score calculation unit 59 calculates, for each path, a geometric mean using the sum of the worker scores and the total number of dialogue IDs (step S294).

The dialogue score calculation unit 59 then acquires, for each dialogue ID, the utterance data from the utterance data DB 44 in the order of the utterance IDs on the most supported path, and creates the scored dialogue data 6b in which the dialogue ID, the dialogue score, and the utterance data are associated with one another (step S295). The dialogue score calculation unit 59 sets the created scored dialogue data 6b as the return value and ends the dialogue score calculation process.
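A minimal sketch of steps S291 to S294 for a single dialogue follows. The names `Row`, `most_supported_path`, and `dialogue_score` and the table layout are hypothetical. The description names a geometric mean but states its inputs as a sum and a count. The sketch therefore adopts one possible reading as an explicit assumption: the dialogue score is the n-th root of the product of the worker scores on the path.

```python
import math
from typing import Dict, List, Optional, Tuple

# Hypothetical row: utterance ID -> (parent utterance ID, worker ID, utterance score)
Row = Tuple[Optional[str], str, float]


def most_supported_path(table: Dict[str, Row]) -> List[str]:
    """S291: from the root, repeatedly pick the child with the highest utterance score."""
    children: Dict[Optional[str], List[str]] = {}
    for uid, (parent, _worker, _score) in table.items():
        children.setdefault(parent, []).append(uid)
    path: List[str] = []
    current: Optional[str] = None  # None stands for "above the root"
    while current in children:
        current = max(children[current], key=lambda uid: table[uid][2])
        path.append(current)
    return path


def dialogue_score(table: Dict[str, Row], worker_score: Dict[str, float]) -> float:
    """S292 to S294 under the assumed reading: geometric mean of worker scores on the path."""
    path = most_supported_path(table)
    scores = [worker_score[table[uid][1]] for uid in path]
    if not scores:
        return 0.0  # assumed handling of an empty dialogue
    return math.prod(scores) ** (1.0 / len(scores))


if __name__ == "__main__":
    table = {
        "A": (None, "w1", 1.0),
        "C": ("A", "w2", 1 / 3),
        "D": ("A", "w3", 2 / 3),
        "N": ("D", "w2", 1.0),
    }
    print(most_supported_path(table), dialogue_score(table, {"w1": 1.0, "w2": 0.25, "w3": 0.5}))
```

With a geometric mean, a single worker with a score of 0 on the path drives the dialogue score to 0, whereas an arithmetic mean would only lower it. Which behaviour is intended cannot be determined from the text alone.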

上述より、本実施例では、作業者９のペアを作る必要がないため、ペアを作って時間を調整する等の作業コストを削減することができる。ペアを作る場合には、ペアの管理コストの負担や高い作業単価のために、一定の費用を用いて作成できる対話コーパスのサイズが小さくなる。そのため、小規模な対話コーパスしか作ることが出来なかった。 From the above, in this embodiment, since it is not necessary to make a pair of workers 9, it is possible to reduce the work cost such as making a pair and adjusting the time. When creating a pair, the size of the dialogue corpus that can be created at a fixed cost becomes small due to the burden of managing the pair and the high unit cost of work. Therefore, only a small dialogue corpus could be created.

一方、本実施例では、個別の作業者９による対話の発話作業が可能となるため、作業者９は、作業者端末３を用いて、任意の場所や時間に作業を行える。そのため不特定多数の作業者９が作業にあたることができ、大規模な対話コーパスの作成を可能とする。また、不特定多数の個別の作業者９に作業を発注する方法であるクラウドソーシングを利用できるようになる。 On the other hand, in the present embodiment, since the individual worker 9 can perform the utterance work of the dialogue, the worker 9 can perform the work at an arbitrary place and time by using the worker terminal 3. Therefore, an unspecified number of workers 9 can perform the work, and a large-scale dialogue corpus can be created. In addition, crowdsourcing, which is a method of ordering work from an unspecified number of individual workers 9, can be used.

更に、不適切な発話を作成する作業者９を作業者スコアに基づいて特定することが可能であり、作業者スコアにより作業者９をフィルタリングすることにより不適切な発話の作成を抑制できる。また、発話の適切さを示すスコアを算出可能であるため、質の良い対話を特定でき、対話の内容（対話の流れの発話データのセット）を提供可能である。 Further, it is possible to identify the worker 9 who creates an inappropriate utterance based on the worker score, and by filtering the worker 9 by the worker score, the creation of an inappropriate utterance can be suppressed. In addition, since it is possible to calculate a score indicating the appropriateness of utterance, it is possible to identify a high-quality dialogue and provide the content of the dialogue (a set of utterance data in the flow of the dialogue).

また、選択確率を用いた作業者９の対話への割り当てにより、作業者９の負荷を下げることができ、これにより作業単価を低く抑えることができる。 Further, by assigning the worker 9 to the dialogue using the selection probability, the load on the worker 9 can be reduced, and thus the work unit price can be kept low.

クラウドソーシングでは、作業可能な作業者９の数は時系列上で一定ではないが、その数の如何によらず適切な作業の割り当てを行うことができる。 In crowdsourcing, the number of workers 9 who can work is not constant in time series, but appropriate work can be assigned regardless of the number.

The present invention is not limited to the specifically disclosed examples, and various modifications and changes are possible without departing from the scope of the claims.

The following additional notes are further disclosed with respect to the embodiments described above, including the present example.
(Appendix 1)
A dialogue corpus creation program that causes a computer to execute a process of:
retrieving, for each utterance, the worker ID of a worker who is to create utterance data representing one utterance based on dialogue information on a dialogue between two parties, from a storage unit that stores the worker IDs of a plurality of workers in the order in which the workers are able to participate in the dialogue;
selecting the dialogue to which the worker who creates the utterance data is to be assigned, according to a worker allocation procedure based on the past participation of the worker having the retrieved worker ID, with reference to the records, managed in the storage unit, of the worker IDs of the workers who participated in each of a plurality of dialogues in chronological order; and
obtaining dialogue data in which the dialogue has been completed by the plurality of workers.
(Appendix 2)
The dialogue corpus creation program according to Appendix 1, which causes the computer to execute a process of:
displaying, on the worker terminal of the worker to whom the dialogue is assigned, a creation screen having a display of the dialogue information of the dialogue, a check to be set when the content of the immediately preceding utterance shown in the dialogue information is inappropriate, and an input area for the utterance data;
upon receiving at least the utterance data and the presence or absence of the inappropriate check from the worker terminal, managing the received utterance data in a tree structure having parent-child relationships that follow the progress of the dialogue and, when the inappropriate check is present, recording the received utterance data in the storage unit as a branch from the parent utterance data, in a sibling relationship with the immediately preceding utterance data; and
determining whether the immediately preceding utterance data is inappropriate according to the presence or absence of the inappropriate check received together with the immediately following utterance data.
(Appendix 3)
The dialogue corpus creation program according to Appendix 2, which causes the computer to execute a process of:
assigning, in the tree structure, a support count to the received utterance data; and
when other utterance data in a sibling relationship with the immediately preceding utterance data exists, increasing or decreasing the support count of the immediately preceding utterance data according to the presence or absence of the inappropriate check received together with the received utterance data.
(Appendix 4)
The dialogue corpus creation program according to Appendix 3, which causes the computer to execute a process of calculating, for the utterance data whose support count has been increased or decreased and for the utterance data in a sibling relationship with that utterance data, an utterance score as the ratio of the support count of each utterance data to the total of the support counts, and storing the utterance score in the storage unit.
(Appendix 5)
The dialogue corpus creation program according to Appendix 4, which causes the computer to execute a process of:
acquiring, for each worker ID, the number of utterance data items created and the total of the utterance scores, with reference to the storage unit; and
calculating, for each worker ID, the average of the utterance scores and storing the calculated average in the storage unit in association with the worker ID.
(Appendix 6)
The dialogue corpus creation program according to Appendix 5, which causes the computer to execute a process of:
identifying, with reference to the storage unit, the path formed by the most supported series of connected utterance data in each dialogue, based on the parent-child relationships of the tree structure of the utterance data for the dialogue and on the utterance scores; and
for each path, acquiring from the storage unit the averages associated with the worker IDs of the workers who created the utterance data, summing them, calculating a geometric mean using the sum and the total number of paths, and storing the calculated geometric mean in the storage unit in association with the dialogue ID identifying the dialogue.
(Appendix 7)
The dialogue corpus creation program according to Appendix 6, wherein the computer is made to execute a process of arranging utterance data in order according to a parent-child relationship on the path and creating dialogue data with a score associated with the dialogue ID together with the geometric mean.
(Appendix 8)
The dialogue corpus creation program according to any one of Appendices 1 to 7, which causes the computer to:
calculate, for each of the plurality of dialogues, the ratio at which the retrieved worker ID appears in the record of the worker IDs of the workers who participated in the dialogue; and
append the retrieved worker ID to the end of the dialogue with the highest ratio and assign the worker identified by the appended worker ID to that dialogue.
(Appendix 9)
The dialogue corpus creation program according to Appendix 8, which causes the computer to select, when a plurality of dialogues have the highest ratio, the dialogue in which the retrieved worker ID appears the greatest number of times from among those dialogues.
(Appendix 10)
The dialogue corpus creation program according to Appendix 8, which causes the computer to select one dialogue at random from among the plurality of dialogues when a plurality of dialogues have the highest ratio and the retrieved worker ID appears the same number of times in each of them.
(Appendix 11)
A dialogue corpus creation method in which a computer performs a process of:
retrieving, for each utterance, the worker ID of a worker who is to create utterance data representing one utterance based on dialogue information on a dialogue between two parties, from a storage unit that stores the worker IDs of a plurality of workers in the order in which the workers are able to participate in the dialogue;
selecting the dialogue to which the worker who creates the utterance data is to be assigned, according to a worker allocation procedure based on the past participation of the worker having the retrieved worker ID, with reference to the records, managed in the storage unit, of the worker IDs of the workers who participated in each of a plurality of dialogues in chronological order; and
obtaining dialogue data in which the dialogue has been completed by the plurality of workers.
(Appendix 12)
An information processing device comprising:
a worker ID extraction unit that retrieves, for each utterance, the worker ID of a worker who is to create utterance data representing one utterance based on dialogue information on a dialogue between two parties, from a storage unit that stores the worker IDs of a plurality of workers in the order in which the workers are able to participate in the dialogue;
a selection unit that selects the dialogue to which the worker who creates the utterance data is to be assigned, according to a worker allocation procedure based on the past participation of the worker having the retrieved worker ID, with reference to the records, managed in the storage unit, of the worker IDs of the workers who participated in each of a plurality of dialogues in chronological order; and
an utterance creation unit that has the plurality of workers create the utterance data and collects the created utterance data.
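The dialogue selection described in Appendices 8 to 10 can be sketched as follows. The function name `choose_dialogue`, the `records` layout (dialogue ID mapped to the chronological list of participating worker IDs), and the optional `rng` parameter are hypothetical. Treating an empty record as a ratio of 0 is an assumption, and the sketch expects at least one dialogue to exist.

```python
import random
from fractions import Fraction
from typing import Dict, List, Optional


def choose_dialogue(
    records: Dict[str, List[str]],
    worker_id: str,
    rng: Optional[random.Random] = None,
) -> str:
    """Pick a dialogue for worker_id and append the worker to that dialogue's record."""
    rng = rng or random.Random()

    def ratio(worker_ids: List[str]) -> Fraction:
        # Appendix 8: share of the record occupied by this worker ID.
        if not worker_ids:
            return Fraction(0)  # assumed handling of a dialogue with no participants yet
        return Fraction(worker_ids.count(worker_id), len(worker_ids))

    best_ratio = max(ratio(ids) for ids in records.values())
    candidates = [d for d, ids in records.items() if ratio(ids) == best_ratio]
    if len(candidates) > 1:
        # Appendix 9: break the tie by the number of appearances of the worker ID.
        most = max(records[d].count(worker_id) for d in candidates)
        candidates = [d for d in candidates if records[d].count(worker_id) == most]
    # Appendix 10: if still tied, choose one dialogue at random.
    chosen = candidates[0] if len(candidates) == 1 else rng.choice(candidates)
    records[chosen].append(worker_id)
    return chosen


if __name__ == "__main__":
    records = {"d1": ["w1", "w2"], "d2": ["w2", "w3", "w2", "w1"], "d3": ["w3"]}
    print(choose_dialogue(records, "w2"), records)
```

Exact fractions are used for the ratio so that, for example, 1/2 and 2/4 compare as equal and the tie-breaking rules are actually reached.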

3 Worker terminal
3a Dialogue participation request
3b Dialogue task presentation
6 User terminal
6a Dialogue data acquisition request
6b Scored dialogue data
7 User
9 Worker
41 Worker list
42 Dialogue task table
43 Utterance table
44 Utterance data DB
45 Worker score table
50 Initialization unit
51 Worker registration unit
52 Worker allocation unit
53 Dialogue task selection unit
54 Dialogue task generation unit
55 Utterance creation unit
56 Utterance information update unit
57 Dialogue data extraction unit
58 Worker score calculation unit
59 Dialogue score calculation unit
60 User I/F
100 Server device

Claims

A dialogue corpus creation program that causes a computer to execute a process of:
retrieving, for each utterance, the worker ID of a worker who is to create utterance data representing one utterance based on dialogue information on a dialogue between two parties, from a storage unit that stores the worker IDs of a plurality of workers in the order in which the workers are able to participate in the dialogue;
selecting the dialogue to which the worker who creates the utterance data is to be assigned, according to a worker allocation procedure based on the past participation of the worker having the retrieved worker ID, with reference to the records, managed in the storage unit, of the worker IDs of the workers who participated in each of a plurality of dialogues in chronological order; and
obtaining dialogue data in which the dialogue has been completed by the plurality of workers.

The dialogue corpus creation program according to claim 1, which causes the computer to execute a process of:
displaying, on the worker terminal of the worker to whom the dialogue is assigned, a creation screen having a display of the dialogue information of the dialogue, a check to be set when the content of the immediately preceding utterance shown in the dialogue information is inappropriate, and an input area for the utterance data;
upon receiving at least the utterance data and the presence or absence of the inappropriate check from the worker terminal, managing the received utterance data in a tree structure having parent-child relationships that follow the progress of the dialogue and, when the inappropriate check is present, recording the received utterance data in the storage unit as a branch from the parent utterance data, in a sibling relationship with the immediately preceding utterance data; and
determining whether the immediately preceding utterance data is inappropriate according to the presence or absence of the inappropriate check received together with the immediately following utterance data.

The dialogue corpus creation program according to claim 2, which causes the computer to execute a process of:
assigning, in the tree structure, a support count to the received utterance data; and
when other utterance data in a sibling relationship with the immediately preceding utterance data exists, increasing or decreasing the support count of the immediately preceding utterance data according to the presence or absence of the inappropriate check received together with the received utterance data.

The dialogue corpus creation program according to claim 3, which causes the computer to execute a process of calculating, for the utterance data whose support count has been increased or decreased and for the utterance data in a sibling relationship with that utterance data, an utterance score as the ratio of the support count of each utterance data to the total of the support counts, and storing the utterance score in the storage unit.

A dialogue corpus creation method in which a computer performs a process of:
retrieving, for each utterance, the worker ID of a worker who is to create utterance data representing one utterance based on dialogue information on a dialogue between two parties, from a storage unit that stores the worker IDs of a plurality of workers in the order in which the workers are able to participate in the dialogue;
selecting the dialogue to which the worker who creates the utterance data is to be assigned, according to a worker allocation procedure based on the past participation of the worker having the retrieved worker ID, with reference to the records, managed in the storage unit, of the worker IDs of the workers who participated in each of a plurality of dialogues in chronological order; and
obtaining dialogue data in which the dialogue has been completed by the plurality of workers.

An information processing device comprising:
a worker ID extraction unit that retrieves, for each utterance, the worker ID of a worker who is to create utterance data representing one utterance based on dialogue information on a dialogue between two parties, from a storage unit that stores the worker IDs of a plurality of workers in the order in which the workers are able to participate in the dialogue;
a selection unit that selects the dialogue to which the worker who creates the utterance data is to be assigned, according to a worker allocation procedure based on the past participation of the worker having the retrieved worker ID, with reference to the records, managed in the storage unit, of the worker IDs of the workers who participated in each of a plurality of dialogues in chronological order; and
an utterance creation unit that has the plurality of workers create the utterance data and collects the created utterance data.