JP7689541B2 - Information processing method, model training method, device, equipment, medium, and program product

Info

Publication number: JP7689541B2
Application number: JP2023048430A
Authority: JP
Inventors: ファルー; スーチーバオ; ファンフア; ファンワン; ファウー; シューウェイファン
Original assignee: Beijing Baidu Netcom Science and Technology Co Ltd
Current assignee: Beijing Baidu Netcom Science and Technology Co Ltd
Priority date: 2022-08-10
Filing date: 2023-03-24
Publication date: 2025-06-06
Anticipated expiration: 2043-03-24
Also published as: CN115292467A; JP2023078411A; CN115292467B

Description

The present disclosure relates to the field of computer technology, in particular to artificial intelligence and speech technology, and specifically to an information processing method, a model training method, an apparatus, a device, a medium, and a program product.

With the development of natural language processing technology, machine learning models can be used for intelligent dialogue: a dialogue model replies to the sentences a user inputs and thereby converses with the user.

At present, dialogue models have low dialogue accuracy and poor dialogue quality.

The present disclosure provides an information processing method, a model training method, an apparatus, a device, a medium, and a program product.

According to one aspect of the present disclosure, there is provided an information processing method, the method comprising:
obtaining an initial dialogue sentence; and
inputting the initial dialogue sentence into a trained dialogue model to obtain a target response sentence,
wherein the dialogue model is obtained by training based on a corrected response sample sentence, a second candidate response sample sentence, and a recall response sample sentence; a plurality of candidate response sample sentences are obtained by inputting an initial dialogue sample sentence into an initial dialogue model; the second candidate response sample sentence is any one of the plurality of candidate response sample sentences; the corrected response sample sentence is obtained by correcting a first candidate response sample sentence among the plurality of candidate response sample sentences; and the recall response sample sentence is a sample sentence, among the training sample sentences, other than the initial dialogue sample sentence and the plurality of candidate response sample sentences.

According to another aspect of the present disclosure, there is provided a model training method, the method comprising:
obtaining an initial dialogue sample sentence;
inputting the initial dialogue sample sentence into an initial dialogue model to obtain a plurality of candidate response sample sentences;
correcting a first candidate response sample sentence among the plurality of candidate response sample sentences to obtain a corrected response sample sentence; and
training the initial dialogue model based on the corrected response sample sentence, a second candidate response sample sentence among the plurality of candidate response sample sentences, and a recall response sample sentence to obtain a dialogue model,
wherein the recall response sample sentence is a sample sentence, among the training sample sentences, other than the initial dialogue sample sentence and the plurality of candidate response sample sentences.

According to another aspect of the present disclosure, there is provided an information processing device, the device comprising:
an acquisition module for acquiring an initial dialogue sentence; and
an input module for inputting the initial dialogue sentence into a trained dialogue model to obtain a target response sentence,
wherein the dialogue model is obtained by training based on a corrected response sample sentence, a second candidate response sample sentence, and a recall response sample sentence; a plurality of candidate response sample sentences are obtained by inputting an initial dialogue sample sentence into an initial dialogue model; the second candidate response sample sentence is any one of the plurality of candidate response sample sentences; the corrected response sample sentence is obtained by correcting a first candidate response sample sentence among the plurality of candidate response sample sentences; and the recall response sample sentence is a sample sentence, among the training sample sentences, other than the initial dialogue sample sentence and the plurality of candidate response sample sentences.

According to another aspect of the present disclosure, there is provided a model training apparatus, the apparatus comprising:
a sentence acquisition module for acquiring an initial dialogue sample sentence;
a sentence input module for inputting the initial dialogue sample sentence into an initial dialogue model to obtain a plurality of candidate response sample sentences;
a correction module for correcting a first candidate response sample sentence among the plurality of candidate response sample sentences to obtain a corrected response sample sentence; and
a training module for training the initial dialogue model based on the corrected response sample sentence, a second candidate response sample sentence among the plurality of candidate response sample sentences, and a recall response sample sentence to obtain a dialogue model,
wherein the recall response sample sentence is a sample sentence, among the training sample sentences, other than the initial dialogue sample sentence and the plurality of candidate response sample sentences.

According to another aspect of the present disclosure, there is provided an electronic device, the electronic device comprising:
at least one processor; and
a memory communicatively connected to the at least one processor,
wherein the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor so that the at least one processor can perform the above methods.

According to another aspect of the present disclosure, there is provided a non-transitory computer-readable storage medium storing computer instructions, wherein the computer instructions cause a computer to perform the above methods.

According to another aspect of the present disclosure, there is provided a computer program which, when executed by a processor, implements the steps of the above methods.

In some embodiments of the present disclosure, a dialogue model is obtained by training based on a corrected response sample sentence, a second candidate response sample sentence, and a recall response sample sentence. A plurality of candidate response sample sentences are obtained by inputting an initial dialogue sample sentence into an initial dialogue model. The second candidate response sample sentence is any one of the plurality of candidate response sample sentences; the corrected response sample sentence is a sentence of high dialogue quality obtained by correcting a first candidate response sample sentence among the candidate response sample sentences; and the recall response sample sentence is a sample sentence, among the training sample sentences, other than the initial dialogue sample sentence and the plurality of candidate response sample sentences. By continuing to train the initial dialogue model on the corrected response sample sentence, the second candidate response sample sentence, and the recall response sample sentence, a dialogue model with high dialogue accuracy is obtained, and inputting an initial dialogue sentence into this dialogue model yields a target response sentence of high dialogue quality.

It should be understood that the content described in this section is not intended to identify key or important features of the embodiments of the present disclosure, nor is it intended to limit the scope of the present disclosure. Other features of the present disclosure will be readily understood from the following description.

The drawings are provided for a better understanding of the present technical solution and do not limit the present disclosure.
FIG. 1 is a schematic flowchart of an information processing method provided by Embodiment 1 of the present disclosure.
FIG. 2 is a schematic flowchart of a model training method provided by Embodiment 2 of the present disclosure.
FIG. 3 is a flowchart of an information processing method provided by Embodiment 3 of the present disclosure.
FIG. 4 is a schematic structural diagram of an information processing device provided by an exemplary embodiment of the present disclosure.
FIG. 5 is a schematic structural diagram of a model training apparatus provided by an exemplary embodiment of the present disclosure.
FIG. 6 is a schematic block diagram of an exemplary electronic device for implementing embodiments of the present disclosure.

Exemplary embodiments of the present disclosure are described below with reference to the drawings. Various details of the embodiments are included to aid understanding and should be regarded as merely exemplary. Those skilled in the art should therefore recognize that various changes and modifications can be made to the embodiments described herein without departing from the scope and spirit of the present disclosure. Likewise, for clarity and conciseness, descriptions of well-known functions and structures are omitted from the following description.

In the technical solution of the present disclosure, the collection, storage, use, processing, transmission, provision, and disclosure of any user personal information involved all comply with the relevant laws and regulations and do not violate public order and good morals.

Artificial intelligence is the discipline that studies how to make computers simulate certain human thought processes and intelligent behaviors (such as learning, reasoning, thinking, and planning), and it involves both hardware-level and software-level technologies. Artificial intelligence hardware technologies generally include sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, and big data processing. Artificial intelligence software technologies mainly include computer vision, speech recognition, natural language processing, machine learning/deep learning, big data processing, and knowledge graph technologies.

In the field of dialogue systems, large-scale dialogue models trained on social media comment data are appearing one after another. However, because social media comment scenarios differ from real human dialogue scenarios, the generation ability of such models is poor.

A generative dialogue model generates multiple candidate replies at inference time and then uses a generation score to evaluate and sort them. However, sorting by generation score cannot reliably place high-quality replies at the front.
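The ranking step described above can be illustrated with a minimal sketch. The disclosure does not define the generation score, so the length-normalised log-likelihood used here is an assumption, and the per-token probabilities are made-up numbers for illustration only.

```python
import math
from typing import List, Tuple

Candidate = Tuple[str, List[float]]  # (reply text, per-token probabilities)


def generation_score(token_probs: List[float]) -> float:
    """Average log-probability of the tokens (assumed definition of the score)."""
    return sum(math.log(p) for p in token_probs) / len(token_probs)


def rank_by_generation_score(candidates: List[Candidate]) -> List[Candidate]:
    """Sort candidate replies from highest to lowest generation score."""
    return sorted(candidates, key=lambda c: generation_score(c[1]), reverse=True)


# Hypothetical token probabilities: the short generic reply is "easier" to generate.
candidates = [
    ("Yes, it's a problem that I can't go out", [0.6, 0.5, 0.7, 0.6, 0.5, 0.6]),
    ("Rainy days are nice", [0.9, 0.8, 0.9, 0.8]),
]
ranked = rank_by_generation_score(candidates)
```

With these illustrative numbers the generic reply is ranked first even though the other reply engages more with the user's complaint, which is the kind of mismatch between generation score and reply quality that the passage points to.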

To address the above technical problems, in some embodiments of the present disclosure a dialogue model is obtained by training based on a corrected response sample sentence, a second candidate response sample sentence, and a recall response sample sentence. A plurality of candidate response sample sentences are obtained by inputting an initial dialogue sample sentence into an initial dialogue model. The second candidate response sample sentence is any one of the plurality of candidate response sample sentences; the corrected response sample sentence is a sentence of high dialogue quality obtained by correcting a first candidate response sample sentence among the candidate response sample sentences; and the recall response sample sentence is a sample sentence, among the training sample sentences, other than the initial dialogue sample sentence and the plurality of candidate response sample sentences. By continuing to train the initial dialogue model on the corrected response sample sentence, the second candidate response sample sentence, and the recall response sample sentence, a dialogue model with high dialogue accuracy is obtained, and inputting an initial dialogue sentence into this dialogue model yields a target response sentence of high dialogue quality.

The technical solutions provided by the embodiments of the present disclosure are described in detail below with reference to the drawings.

FIG. 1 is a schematic flowchart of an information processing method provided by Embodiment 1 of the present disclosure. As shown in FIG. 1, the method includes the following steps S101 to S102.

S101: Obtain an initial dialogue sentence.
S102: Input the initial dialogue sentence into a trained dialogue model to obtain a target response sentence.
The dialogue model is obtained by training based on a corrected response sample sentence, a second candidate response sample sentence, and a recall response sample sentence. A plurality of candidate response sample sentences are obtained by inputting an initial dialogue sample sentence into an initial dialogue model. The second candidate response sample sentence is any one of the plurality of candidate response sample sentences; the corrected response sample sentence is obtained by correcting a first candidate response sample sentence among the candidate response sample sentences; and the recall response sample sentence is a sample sentence, among the training sample sentences, other than the initial dialogue sample sentence and the plurality of candidate response sample sentences.

In this embodiment, the above method may be executed by a server or by a terminal device.

When the method is executed by a server, the implementation form of the server is not limited. For example, the server may be a server device such as a conventional server, a cloud server, a cloud host, or a virtual center. The server mainly comprises a processor, a hard disk, a memory, a system bus, and the like, and is similar to a general-purpose computer architecture.

When the method is executed by a terminal device, the implementation form of the terminal device is not limited. The terminal device includes, but is not limited to, any of a personal computer, a tablet computer, a smartphone, and a smart wearable device.

In this embodiment, a dialogue model is obtained by training based on a corrected response sample sentence, a second candidate response sample sentence, and a recall response sample sentence. A plurality of candidate response sample sentences are obtained by inputting an initial dialogue sample sentence into an initial dialogue model. The second candidate response sample sentence is any one of the plurality of candidate response sample sentences; the corrected response sample sentence is a sentence of high dialogue quality obtained by correcting a first candidate response sample sentence among the candidate response sample sentences; and the recall response sample sentence is a sample sentence, among the training sample sentences, other than the initial dialogue sample sentence and the plurality of candidate response sample sentences. By continuing to train the initial dialogue model on the corrected response sample sentence, the second candidate response sample sentence, and the recall response sample sentence, a dialogue model with high dialogue accuracy is obtained. An initial dialogue sentence is then obtained and input into the dialogue model to obtain a target response sentence of high dialogue quality.

The technical solution of the present disclosure is described below with reference to application scenarios.

Application scenario 1: In response to the initial dialogue sentence "How is the weather today?" input by the user by voice, the smartphone uploads the initial dialogue sentence to a server. The server inputs the initial dialogue sentence into the trained dialogue model to obtain the target response sentence "It's sunny today" and sends the target response sentence down to the smartphone. The smartphone plays the target response sentence "It's sunny today" by voice.

Application scenario 2: In response to the initial dialogue sentence "How is the weather today?" input by the user by voice, the smartphone inputs the initial dialogue sentence into a locally integrated dialogue model to obtain the target response sentence "It's sunny today", and plays the target response sentence "It's sunny today" by voice.

Before the dialogue model can be used, the initial dialogue model must be trained to obtain the dialogue model. The process of training the dialogue model is described below.

FIG. 2 is a schematic flowchart of a model training method provided by Embodiment 2 of the present disclosure. As shown in FIG. 2, the method includes the following steps S201 to S204.

S201: Obtain an initial dialogue sample sentence.

S202: Input the initial dialogue sample sentence into an initial dialogue model to obtain a plurality of candidate response sample sentences.

S203: Correct a first candidate response sample sentence among the plurality of candidate response sample sentences to obtain a corrected response sample sentence.

S204: Train the initial dialogue model based on the corrected response sample sentence, a second candidate response sample sentence among the plurality of candidate response sample sentences, and a recall response sample sentence to obtain a dialogue model.
The recall response sample sentence is a sample sentence, among the training sample sentences, other than the initial dialogue sample sentence and the plurality of candidate response sample sentences.

The training apparatus used to train the above dialogue model may be any type of computer device, and the embodiments of the present disclosure place no limitation on it.

Note that the initial dialogue model may be an already-trained model whose accuracy is low, so that dialogue conducted with the initial dialogue model is of poor quality.

An initial dialogue sample sentence is obtained and input into the initial dialogue model to obtain a plurality of candidate response sample sentences. A first candidate response sample sentence among the plurality of candidate response sample sentences is corrected to obtain a corrected response sample sentence; a second candidate response sample sentence is randomly selected from the plurality of candidate response sample sentences; and a recall response sample sentence is selected from the training sample sentences other than the initial dialogue sample sentence and the plurality of candidate response sample sentences. The corrected response sample sentence, the second candidate response sample sentence, and the recall response sample sentence constitute one group of training data. The above steps are repeated to obtain the training dataset used for model training.

To increase the coverage of the dataset, the initial dialogue sample sentences are drawn from datasets in as many different fields as possible, for example news, social media, literature, and real human dialogue.

In the above embodiment, a first candidate response sample sentence among the plurality of candidate response sample sentences is corrected to obtain a corrected response sample sentence. For example, an operation such as copying, correcting, or creating is performed on the first candidate response sample sentence to obtain the corrected response sample sentence.

For example, in response to an operation of inputting an initial dialogue sample sentence in a labeling interface, the initial dialogue sample sentence "It rains every day and makes me feel bad" is obtained and input into the initial dialogue model to obtain a plurality of candidate response sample sentences: "Music and chocolate go well with rainy days", "Rainy days are perfect for sleeping", "I feel bad too because no one will join me", "Rainy days are nice", "Me too! I don't like rainy days", "Yes, it's a problem that I can't go out", and "Yes, I don't like rainy days either".

The first candidate response sample sentence "Music and chocolate go well with rainy days" is corrected to obtain the corrected response sample sentence "I think music and chocolate go well with rainy days"; the second candidate response sample sentence "Rainy days are perfect for sleeping" is randomly selected from the plurality of candidate response sample sentences; and the recall response sample sentence "It's sunny today" is selected from the training sample sentences other than the initial dialogue sample sentence and the plurality of candidate response sample sentences. The corrected response sample sentence "I think music and chocolate go well with rainy days", the second candidate response sample sentence "Rainy days are perfect for sleeping", and the recall response sample sentence "It's sunny today" constitute one group of training data.
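The construction of one group of training data described above can be sketched as follows. This is only an illustration under stated assumptions: the names `TrainingTriple` and `build_triple` are hypothetical, the corrected sentence is assumed to come from a human annotator and is passed in as an argument, and the training corpus is reduced to the sentences that appear in the example.

```python
import random
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class TrainingTriple:
    corrected: str         # corrected response sample sentence
    second_candidate: str  # randomly selected candidate response sample sentence
    recall: str            # recall response sample sentence


def build_triple(initial: str, candidates: List[str], corrected: str,
                 corpus: List[str], rng: random.Random) -> TrainingTriple:
    """Assemble one (corrected, second candidate, recall) group of training data."""
    second_candidate = rng.choice(candidates)
    # Recall sentences exclude the initial sentence and every candidate reply.
    excluded = {initial, *candidates}
    recall_pool = [s for s in corpus if s not in excluded]
    if not recall_pool:
        raise ValueError("no sentence left to use as a recall response")
    return TrainingTriple(corrected, second_candidate, rng.choice(recall_pool))


initial = "It rains every day and makes me feel bad"
candidates = [
    "Music and chocolate go well with rainy days",
    "Rainy days are perfect for sleeping",
    "I feel bad too because no one will join me",
    "Rainy days are nice",
    "Me too! I don't like rainy days",
    "Yes, it's a problem that I can't go out",
    "Yes, I don't like rainy days either",
]
corpus = [initial, *candidates, "It's sunny today"]
triple = build_triple(
    initial, candidates,
    corrected="I think music and chocolate go well with rainy days",
    corpus=corpus, rng=random.Random(0),
)
```

Repeating `build_triple` over many initial dialogue sample sentences would yield the training dataset; how the recall sentence is actually chosen from the remaining corpus is not specified in the text, so a uniform random draw is used here as a placeholder.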

In the above embodiment, the initial dialogue model is trained based on the corrected response sample sentence, the second candidate response sample sentence among the plurality of candidate response sample sentences, and the recall response sample sentence to obtain the dialogue model. In one possible implementation, the corrected response sample sentence, the second candidate response sample sentence, and the recall response sample sentence are input into an initial sentence generation model of the initial dialogue model to obtain an actual response sentence, the probability of the corrected response sample sentence, the probability of the second candidate response sample sentence, and the probability of the recall response sample sentence. Based on the actual response sentence and these three probabilities, the initial sentence generation model and an initial sentence determination model of the initial dialogue model are jointly trained to obtain the dialogue model.

In one embodiment, the joint training proceeds as follows. A loss function is determined based on the actual response sentence and the corrected response sample sentence. Based on the loss function, the initial sentence generation model and the initial sentence determination model are jointly trained to obtain the dialogue model, with the following training targets: the probability of the corrected response sample sentence is greater than the probability of the second candidate response sample sentence; the probability of the corrected response sample sentence is greater than the probability of the recall response sample sentence; and the probability of the second candidate response sample sentence is greater than the probability of the recall response sample sentence.
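One way to picture the training targets above is as a generation loss combined with pairwise ranking penalties. The sketch below is an assumption, not the formulation of the disclosure: the text does not give the loss, so a negative log-likelihood of the corrected reply stands in for the generation loss, a hinge (margin) penalty stands in for each ordering constraint, and the function names and the `margin` parameter are hypothetical.

```python
import math
from typing import List


def generation_loss(corrected_token_probs: List[float]) -> float:
    """Negative log-likelihood the generation model assigns to the corrected reply."""
    return -sum(math.log(p) for p in corrected_token_probs)


def pairwise_ranking_loss(higher: float, lower: float, margin: float = 0.1) -> float:
    """Zero when `higher` exceeds `lower` by at least `margin`, positive otherwise."""
    return max(0.0, margin - (higher - lower))


def ranking_loss(p_corrected: float, p_candidate: float, p_recall: float,
                 margin: float = 0.1) -> float:
    """Penalise violations of p_corrected > p_candidate > p_recall."""
    return (pairwise_ranking_loss(p_corrected, p_candidate, margin)
            + pairwise_ranking_loss(p_corrected, p_recall, margin)
            + pairwise_ranking_loss(p_candidate, p_recall, margin))


def joint_loss(corrected_token_probs: List[float], p_corrected: float,
               p_candidate: float, p_recall: float, margin: float = 0.1) -> float:
    """Sum of the generation term and the ranking term (equal weights assumed)."""
    return (generation_loss(corrected_token_probs)
            + ranking_loss(p_corrected, p_candidate, p_recall, margin))


# Ordering satisfied: only the generation term contributes.
ok = ranking_loss(p_corrected=0.9, p_candidate=0.5, p_recall=0.1)
# Ordering reversed: every pairwise constraint is violated.
bad = ranking_loss(p_corrected=0.1, p_candidate=0.5, p_recall=0.9)
```

Minimising such a joint loss would push the sentence determination model toward scoring the corrected reply above the unmodified candidate and the candidate above the unrelated recall sentence, while the generation term keeps the generated reply close to the corrected one.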

In conjunction with the description of the above embodiments, FIG. 3 is a flowchart of an information processing method provided by Embodiment 3 of the present disclosure. As shown in FIG. 3, the method includes the following steps S301 to S304.

S301: In response to a voice input operation, the terminal device obtains an initial dialogue sentence.

S302: The terminal device sends the initial dialogue sentence to the server.

S303: The server receives the initial dialogue sentence, inputs it into the dialogue model to obtain a target response sentence, and sends the target response sentence down to the terminal device.

S304: The terminal device receives the target response sentence and plays it by voice.
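Steps S301 to S304 can be sketched as a small in-process simulation. Everything here is illustrative: the `Server` and `Terminal` classes, the direct method call standing in for the network exchange, the list standing in for audio playback, and the canned stub model are all assumptions rather than parts of the disclosure.

```python
from typing import Callable, List

DialogueModel = Callable[[str], str]


class Server:
    def __init__(self, dialogue_model: DialogueModel) -> None:
        self.dialogue_model = dialogue_model

    def handle(self, initial_sentence: str) -> str:
        # S303: run the dialogue model and return the target response sentence.
        return self.dialogue_model(initial_sentence)


class Terminal:
    def __init__(self, server: Server) -> None:
        self.server = server
        self.played: List[str] = []  # stand-in for a speech synthesis output

    def on_voice_input(self, recognized_text: str) -> str:
        initial_sentence = recognized_text             # S301: obtain the sentence
        target = self.server.handle(initial_sentence)  # S302: send; S303: reply
        self.played.append(target)                     # S304: "play" the reply
        return target


def stub_dialogue_model(sentence: str) -> str:
    """Canned replies in place of a trained dialogue model."""
    replies = {"How is the weather today?": "It's sunny today"}
    return replies.get(sentence, "Sorry, I did not catch that")


terminal = Terminal(Server(stub_dialogue_model))
reply = terminal.on_voice_input("How is the weather today?")
```

Swapping the `Server` for an object that wraps a locally integrated model would correspond to running the whole method on the terminal device.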

In this embodiment, the implementation form of the server is not limited. For example, the server may be a server device such as a general-purpose server, a cloud server, a cloud host, or a virtual center. The server mainly includes a processor, a hard disk, a memory, a system bus, and the like, similar to a general-purpose computer architecture.

In this embodiment, the implementation form of the terminal device is not limited. The terminal device may be, but is not limited to, any of a personal computer, a tablet computer, a smartphone, or a smart wearable device.

For the implementation of each step in this embodiment, reference may be made to the description of the above embodiments, which is not repeated here. This embodiment also obtains the beneficial effects of the corresponding parts of the above embodiments.

FIG. 4 is a schematic structural diagram of an information processing device 40 provided by an exemplary embodiment of the present disclosure. The information processing device 40 includes an acquisition module 41 and an input module 42.

The acquisition module 41 acquires an initial dialogue sentence.

The input module 42 inputs the initial dialogue sentence into the trained dialogue model to obtain a target response sentence.
The dialogue model is a model obtained by training based on a corrected response sample sentence, a second candidate response sample sentence, and a recall response sample sentence. A plurality of candidate response sample sentences are obtained by inputting an initial dialogue sample sentence into an initial dialogue model; the second candidate response sample sentence is any one of the plurality of candidate response sample sentences; the corrected response sample sentence is a sentence obtained by correcting a first response sample sentence among the candidate response sample sentences; and the recall response sample sentence is a sample sentence, among the training sample sentences, other than the initial dialogue sample sentence and the plurality of candidate response sample sentences.

Optionally, when inputting the initial dialogue sentence into the trained dialogue model to obtain the target response sentence, the input module 42:
inputs, within the dialogue model, the initial dialogue sentence into a sentence generation model of the dialogue model to obtain a plurality of candidate response sentences and a probability of each candidate response sentence; and
inputs the plurality of candidate response sentences and the probability of each candidate response sentence into a sentence determination model of the dialogue model to obtain the target response sentence.

Optionally, when inputting the plurality of candidate response sentences and the probability of each candidate response sentence into the sentence determination model of the dialogue model to obtain the target response sentence, the input module 42:
inputs the plurality of candidate response sentences and the probability of each candidate response sentence into the sentence determination model, and selects, from the plurality of candidate response sentences, the candidate response sentence with the highest probability as the target response sentence.
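
The two-stage inference described for the input module 42 can be sketched as follows. The function names and the fixed candidate list are hypothetical stand-ins; a real sentence generation model would produce candidates and probabilities from the input, whereas the stub here ignores it.

```python
from typing import List, Tuple

Candidate = Tuple[str, float]  # (candidate response sentence, probability)


def sentence_generation_model(initial_sentence: str) -> List[Candidate]:
    """Stub generator: returns fixed candidates with made-up probabilities."""
    return [
        ("I am fine, thank you.", 0.62),
        ("Fine.", 0.27),
        ("What?", 0.11),
    ]


def sentence_determination_model(candidates: List[Candidate]) -> str:
    """Select the candidate response sentence with the highest probability."""
    sentence, _probability = max(candidates, key=lambda pair: pair[1])
    return sentence


def dialogue_model(initial_sentence: str) -> str:
    """Chain the two sub-models: generate candidates, then pick one."""
    candidates = sentence_generation_model(initial_sentence)
    return sentence_determination_model(candidates)


target = dialogue_model("How are you?")
```

Keeping generation and selection as separate functions mirrors the split between the sentence generation model and the sentence determination model, so either can be replaced without touching the other.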

FIG. 5 is a schematic structural diagram of a model training device 50 provided by an exemplary embodiment of the present disclosure. The model training device 50 includes a sentence acquisition module 51, a sentence input module 52, a correction module 53, and a training module 54, wherein:
the sentence acquisition module 51 acquires an initial dialogue sample sentence;
the sentence input module 52 inputs the initial dialogue sample sentence into an initial dialogue model to obtain a plurality of candidate response sample sentences;
the correction module 53 corrects a first candidate response sample sentence among the plurality of candidate response sample sentences to obtain a corrected response sample sentence;
the training module 54 trains the initial dialogue model based on the corrected response sample sentence, a second candidate response sample sentence among the plurality of candidate response sample sentences, and a recall response sample sentence to obtain a dialogue model; and
the recall response sample sentence is a sample sentence, among the training sample sentences, other than the initial dialogue sample sentence and the plurality of candidate response sample sentences.
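
How the three kinds of training sentences relate can be shown with a short sketch. The function names, the toy corpus, and the choice of `candidates[1]` as the second candidate response sample sentence are assumptions for illustration; the disclosure only requires that the second candidate be one of the candidates and that the recall sentence come from the remaining training sample sentences.

```python
import random
from typing import Sequence, Tuple


def pick_recall_sentence(training_sentences: Sequence[str],
                         initial_sentence: str,
                         candidates: Sequence[str],
                         rng: random.Random) -> str:
    """Pick a training sentence that is neither the prompt nor a candidate."""
    excluded = {initial_sentence, *candidates}
    pool = [s for s in training_sentences if s not in excluded]
    if not pool:
        raise ValueError("no sentence left to use as a recall sample")
    return rng.choice(pool)


def build_training_triple(initial_sentence: str,
                          candidates: Sequence[str],
                          corrected: str,
                          training_sentences: Sequence[str],
                          rng: random.Random) -> Tuple[str, str, str]:
    """Return (corrected, second candidate, recall) for one training example."""
    # Assumption: candidates[0] is the sentence that was corrected, and the
    # next candidate in the list serves as the second candidate.
    second_candidate = candidates[1]
    recall = pick_recall_sentence(training_sentences, initial_sentence,
                                  candidates, rng)
    return corrected, second_candidate, recall


corpus = ["How are you?", "I am fine.", "Fine.", "The train leaves at noon."]
triple = build_training_triple(
    initial_sentence="How are you?",
    candidates=["I am fine.", "Fine."],
    corrected="I am fine, thank you.",
    training_sentences=corpus,
    rng=random.Random(0),
)
```

Because the recall sentence is drawn from unrelated training data, it acts as the lowest-ranked example when the three sentences are compared during training.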

Optionally, when training the initial dialogue model based on the corrected response sample sentence, the second candidate response sample sentence among the plurality of candidate response sample sentences, and the recall response sample sentence to obtain the dialogue model, the training module 54:
inputs the corrected response sample sentence, the second candidate response sample sentence, and the recall response sample sentence into an initial sentence generation model of the initial dialogue model to obtain an actual response sentence, a probability of the corrected response sample sentence, a probability of the second candidate response sample sentence, and a probability of the recall response sample sentence; and
jointly trains the initial sentence generation model and an initial sentence determination model of the initial dialogue model, based on the actual response sentence, the probability of the corrected response sample sentence, the probability of the second candidate response sample sentence, and the probability of the recall response sample sentence, to obtain the dialogue model.

Optionally, when jointly training the initial sentence generation model and the initial sentence determination model of the initial dialogue model based on the actual response sentence, the probability of the corrected response sample sentence, the probability of the second candidate response sample sentence, and the probability of the recall response sample sentence to obtain the dialogue model, the training module 54:
determines a loss function based on the actual response sentence and the corrected response sample sentence; and
based on the loss function, jointly trains the initial sentence generation model and the initial sentence determination model to obtain the dialogue model, with the training targets that the probability of the corrected response sample sentence is greater than the probability of the second candidate response sample sentence, the probability of the corrected response sample sentence is greater than the probability of the recall response sample sentence, and the probability of the second candidate response sample sentence is greater than the probability of the recall response sample sentence.

For the devices of the above embodiments, the specific manner in which each module performs its operations has already been described in detail in the embodiments of the corresponding methods, and is not described in detail again here.

According to an embodiment of the present disclosure, the present disclosure further provides an electronic device and a readable storage medium.
According to an embodiment of the present disclosure, the present disclosure further provides a computer program which, when executed by a processor, implements the information processing method or the model training method provided by the present disclosure.

FIG. 6 is a schematic block diagram of an exemplary electronic device 600 for implementing embodiments of the present disclosure. The electronic device is intended to represent various forms of digital computers, such as laptop computers, desktop computers, workstations, personal digital assistants, servers, blade servers, mainframe computers, and other suitable computers. The electronic device may also represent various forms of mobile devices, such as personal digital assistants, mobile phones, smartphones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions are merely examples and are not intended to limit the implementations of the present disclosure described and/or claimed herein.

As shown in FIG. 6, the electronic device 600 includes a computing unit 601 that can perform various appropriate operations and processes according to a computer program stored in a read-only memory (ROM) 602 or a computer program loaded from a storage unit 608 into a random access memory (RAM) 603. The RAM 603 may also store various programs and data required for the operation of the electronic device 600. The computing unit 601, the ROM 602, and the RAM 603 are connected to each other via a bus 604. An input/output (I/O) interface 605 is also connected to the bus 604.

A plurality of components of the electronic device 600 are connected to the I/O interface 605, including: an input unit 606 such as a keyboard or a mouse; an output unit 607 such as various types of displays and speakers; a storage unit 608 such as a magnetic disk or an optical disk; and a communication unit 609 such as a network card, a modem, or a wireless communication transceiver. The communication unit 609 enables the electronic device 600 to exchange information/data with other devices via a computer network such as the Internet and/or various telecommunication networks.

The computing unit 601 may be any of various general-purpose and/or dedicated processing components having processing and computing capabilities. Some examples of the computing unit 601 include, but are not limited to, a central processing unit (CPU), a graphics processing unit (GPU), various dedicated artificial intelligence (AI) computing chips, various computing units that execute machine learning model algorithms, a digital signal processor (DSP), and any suitable processor, controller, or microcontroller. The computing unit 601 executes the methods and processes described above, such as the information processing method and the model training method. For example, in some embodiments, the information processing method and the model training method can be implemented as a computer software program tangibly embodied in a machine-readable medium such as the storage unit 608. In some embodiments, part or all of the computer program can be loaded and/or installed on the electronic device 600 via the ROM 602 and/or the communication unit 609. When the computer program is loaded into the RAM 603 and executed by the computing unit 601, one or more steps of the information processing method and the model training method described above may be performed. Alternatively, in other embodiments, the computing unit 601 may be configured in any other suitable manner (for example, by means of firmware) to perform the information processing method and the model training method.

Various embodiments of the systems and techniques described above may be implemented in digital electronic circuit systems, integrated circuit systems, field programmable gate arrays (FPGAs), application specific integrated circuits (ASICs), application specific standard products (ASSPs), systems on chip (SOCs), complex programmable logic devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include implementation in one or more computer programs that can be executed and/or interpreted on a programmable system including at least one programmable processor. The programmable processor may be a special-purpose or general-purpose programmable processor, and can receive data and instructions from a storage system, at least one input device, and at least one output device, and transmit data and instructions to the storage system, the at least one input device, and the at least one output device.

Program code for carrying out the methods of the present disclosure may be written in any combination of one or more programming languages. The program code may be provided to a processor or controller of a general-purpose computer, a special-purpose computer, or another programmable data processing apparatus, such that, when executed by the processor or controller, the program code causes the functions/operations specified in the flowcharts and/or block diagrams to be performed. The program code may be executed entirely on a machine, partly on the machine, as a stand-alone software package partly on the machine and partly on a remote machine, or entirely on the remote machine or server.

In the context of the present disclosure, a machine-readable medium may be a tangible medium that can contain or store a program for use by, or in combination with, an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. The machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of the machine-readable storage medium include an electrical connection based on one or more wires, a portable computer disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.

To provide interaction with a user, the systems and techniques described herein can be implemented on a computer having a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to the user, and a keyboard and a pointing device (e.g., a mouse or trackball) through which the user can provide input to the computer. Other types of devices can also provide interaction with the user. For example, the feedback provided to the user may be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback), and input from the user may be received in any form (including acoustic input, speech input, or tactile input).

The systems and techniques described herein may be implemented in a computing system that includes a back-end component (e.g., a data server), or a computing system that includes a middleware component (e.g., an application server), or a computing system that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with embodiments of the systems and techniques described herein), or a computing system that includes any combination of such back-end, middleware, and front-end components. The components of the system may be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include a local area network (LAN), a wide area network (WAN), the Internet, and a blockchain network.

The computer system may include a client and a server. The client and the server are generally remote from each other and typically interact via a communication network. The client-server relationship arises from computer programs that run on the respective computers and have a client-server relationship with each other. The server may be a cloud server, also called a cloud computing server or cloud host, which is a host product in a cloud computing service system that addresses the drawbacks of difficult management and weak business scalability found in traditional physical hosts and VPS ("Virtual Private Server") services. The server may also be a server of a distributed system or a server incorporating a blockchain.

It should be understood that the various forms of flows shown above may be used with steps reordered, added, or deleted. For example, the steps described in the present disclosure may be performed in parallel, sequentially, or in a different order, and no limitation is imposed herein as long as the desired results of the technical solutions disclosed in the present disclosure can be achieved.

The above specific embodiments do not limit the scope of protection of the present disclosure. Those skilled in the art should understand that various modifications, combinations, sub-combinations, and substitutions can be made according to design requirements and other factors. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present disclosure shall fall within the scope of protection of the present disclosure.

Claims

1. An information processing method executed by an information processing device, comprising:
obtaining an initial dialogue sentence; and
inputting the initial dialogue sentence into a trained dialogue model to obtain a target response sentence,
wherein the dialogue model is a model obtained by training based on a corrected response sample sentence, a second candidate response sample sentence, and a recall response sample sentence; a plurality of candidate response sample sentences are obtained by inputting an initial dialogue sample sentence into an initial dialogue model; the second candidate response sample sentence is any one of the plurality of candidate response sample sentences; the corrected response sample sentence is a sentence obtained by correcting a first response sample sentence among the candidate response sample sentences; and the recall response sample sentence is a sample sentence, among the training sample sentences, other than the initial dialogue sample sentence and the plurality of candidate response sample sentences, and
wherein the dialogue model is obtained by: inputting the corrected response sample sentence, the second candidate response sample sentence, and the recall response sample sentence into an initial sentence generation model of the initial dialogue model to obtain a probability of the corrected response sample sentence, a probability of the second candidate response sample sentence, and a probability of the recall response sample sentence; generating, through an initial sentence determination model, an actual response sentence according to the probability of the corrected response sample sentence, the probability of the second candidate response sample sentence, and the probability of the recall response sample sentence generated by the initial sentence generation model; and jointly training the initial sentence generation model and the initial sentence determination model of the initial dialogue model based on the actual response sentence, the probability of the corrected response sample sentence, the probability of the second candidate response sample sentence, and the probability of the recall response sample sentence.

2. The information processing method according to claim 1, wherein inputting the initial dialogue sentence into the trained dialogue model to obtain the target response sentence comprises:
inputting, within the dialogue model, the initial dialogue sentence into a sentence generation model of the dialogue model to obtain a plurality of candidate response sentences and a probability of each of the candidate response sentences; and
inputting the plurality of candidate response sentences and the probability of each of the candidate response sentences into a sentence determination model of the dialogue model to obtain the target response sentence.

3. The information processing method according to claim 2, wherein inputting the plurality of candidate response sentences and the probability of each of the candidate response sentences into the sentence determination model of the dialogue model to obtain the target response sentence comprises:
inputting the plurality of candidate response sentences and the probability of each of the candidate response sentences into the sentence determination model, and selecting, from the plurality of candidate response sentences, the candidate response sentence with the highest probability as the target response sentence.

4. A model training method executed by a model training device, comprising:
obtaining an initial dialogue sample sentence;
inputting the initial dialogue sample sentence into an initial dialogue model to obtain a plurality of candidate response sample sentences;
correcting a first candidate response sample sentence among the plurality of candidate response sample sentences to obtain a corrected response sample sentence; and
training the initial dialogue model based on the corrected response sample sentence, a second candidate response sample sentence among the plurality of candidate response sample sentences, and a recall response sample sentence to obtain a dialogue model,
wherein the recall response sample sentence is a sample sentence, among the training sample sentences, other than the initial dialogue sample sentence and the plurality of candidate response sample sentences, and
wherein training the initial dialogue model based on the corrected response sample sentence, the second candidate response sample sentence among the plurality of candidate response sample sentences, and the recall response sample sentence to obtain the dialogue model comprises:
inputting the corrected response sample sentence, the second candidate response sample sentence, and the recall response sample sentence into an initial sentence generation model of the initial dialogue model to obtain a probability of the corrected response sample sentence, a probability of the second candidate response sample sentence, and a probability of the recall response sample sentence;
generating, through an initial sentence determination model, an actual response sentence according to the probability of the corrected response sample sentence, the probability of the second candidate response sample sentence, and the probability of the recall response sample sentence generated by the initial sentence generation model; and
jointly training the initial sentence generation model and the initial sentence determination model of the initial dialogue model based on the actual response sentence, the probability of the corrected response sample sentence, the probability of the second candidate response sample sentence, and the probability of the recall response sample sentence to obtain the dialogue model.

5. The model training method according to claim 4, wherein jointly training the initial sentence generation model and the initial sentence determination model of the initial dialogue model based on the actual response sentence, the probability of the corrected response sample sentence, the probability of the second candidate response sample sentence, and the probability of the recall response sample sentence to obtain the dialogue model comprises:
determining a loss function based on the actual response sentence and the corrected response sample sentence; and
jointly training, based on the loss function, the initial sentence generation model and the initial sentence determination model to obtain the dialogue model, with the training targets that the probability of the corrected response sample sentence is greater than the probability of the second candidate response sample sentence, the probability of the corrected response sample sentence is greater than the probability of the recall response sample sentence, and the probability of the second candidate response sample sentence is greater than the probability of the recall response sample sentence.

An information processing device,
an acquisition module for acquiring an initial dialogue;
an input module for inputting the initial dialogue sentence into a trained dialogue model to obtain a target response sentence;
The dialogue model is a model obtained by training based on a corrected response sample sentence, a second candidate response sample sentence, and a recall response sample sentence, and an initial dialogue sample sentence is input to an initial dialogue model to obtain a plurality of candidate response sample sentences, the second candidate response sample sentence being any one of the plurality of candidate response sample sentences, the corrected response sample sentence being a sentence obtained by correcting a first reply sample sentence among the candidate response sample sentences, and the recall response sample sentence being a sample sentence other than the initial dialogue sample sentence and the plurality of candidate response sample sentences among the training sample sentences, and the corrected response sample sentence, the second candidate response sample sentence, and the recall response sample sentence are inputting sample sentences into an initial sentence generation model of the initial dialogue model to obtain probabilities of corrected response sample sentences, probabilities of second candidate response sample sentences and probabilities of recall response sample sentences; generating actual response sentences according to the probabilities of the corrected response sample sentences, the probabilities of the second candidate response sample sentences and the probabilities of the recall response sample sentences generated by the initial sentence generation model through an initial sentence determination model; and acquiring the dialogue model by jointly training the initial sentence generation model and the initial sentence determination model of the initial dialogue model according to the probabilities of the actual response sentences, the probabilities of the corrected response sample sentences, the probabilities of the second candidate response sample sentences and the probabilities of the recall response sample sentences;
An information processing apparatus characterized by the above.

wherein, when inputting the initial dialogue sentence into the trained dialogue model to obtain the target response sentence, the input module performs:
inputting the initial dialogue sentence into a sentence generation model of the dialogue model to obtain a plurality of candidate response sentences and a probability of each of the candidate response sentences; and
inputting the plurality of candidate response sentences and the probability of each of the candidate response sentences into a sentence determination model of the dialogue model to obtain the target response sentence.
7. The information processing apparatus according to claim 6.

wherein, when inputting the plurality of candidate response sentences and the probability of each of the candidate response sentences into the sentence determination model of the dialogue model to obtain the target response sentence, the input module performs:
inputting the plurality of candidate response sentences and the probability of each of the candidate response sentences into the sentence determination model, and selecting, from the plurality of candidate response sentences, the candidate response sentence with the highest probability as the target response sentence.
8. The information processing apparatus according to claim 7.
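
For illustration only, and not as part of the claims, the two-stage inference recited in claims 7 and 8 might be sketched as below. The names `generate_candidates`, `select_target`, and `respond`, as well as the toy lookup table, are hypothetical stand-ins; the claims do not specify how the sentence generation model computes its probabilities.

```python
from typing import Callable, List, Tuple

# (candidate response sentence, probability assigned by the generation model)
Candidate = Tuple[str, float]


def generate_candidates(initial_sentence: str) -> List[Candidate]:
    """Hypothetical stand-in for the sentence generation model.

    A real model would score generated text; here a fixed table is used
    so that the example stays self-contained.
    """
    toy_table = {
        "How is the weather today?": [
            ("It is sunny today.", 0.6),
            ("I like music.", 0.1),
            ("It may rain later.", 0.3),
        ]
    }
    return toy_table.get(initial_sentence, [("Sorry, I did not catch that.", 1.0)])


def select_target(candidates: List[Candidate]) -> str:
    """Stand-in for the sentence determination model of claim 8:
    pick the candidate response sentence with the highest probability."""
    if not candidates:
        raise ValueError("at least one candidate response sentence is required")
    best_sentence, _ = max(candidates, key=lambda c: c[1])
    return best_sentence


def respond(
    initial_sentence: str,
    generator: Callable[[str], List[Candidate]] = generate_candidates,
) -> str:
    """Generation model first, determination model second (claim 7)."""
    return select_target(generator(initial_sentence))
```

Keeping generation and selection as separate functions mirrors the separation between the sentence generation model and the sentence determination model in the claims; either stage can then be replaced independently.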

A model training device,
a sentence acquisition module for acquiring an initial dialogue sample sentence;
a sentence input module for inputting the initial dialogue sample sentence into an initial dialogue model to obtain a plurality of candidate response sample sentences;
a correction module for correcting a first candidate response sample sentence among the plurality of candidate response sample sentences to obtain a corrected response sample sentence; and
a training module for training the initial dialogue model based on the corrected response sample sentence, a second candidate response sample sentence among the plurality of candidate response sample sentences, and a recall response sample sentence to obtain a dialogue model;
the recall response sample sentence is a sample sentence other than the initial dialogue sample sentence and the plurality of candidate response sample sentences among the training sample sentences,
when training the initial dialogue model based on the corrected response sample sentence, the second candidate response sample sentence among the plurality of candidate response sample sentences, and the recall response sample sentence to obtain the dialogue model, the training module performs:
inputting the corrected response sample sentence, the second candidate response sample sentence, and the recall response sample sentence into an initial sentence generation model of the initial dialogue model to obtain the probability of the corrected response sample sentence, the probability of the second candidate response sample sentence, and the probability of the recall response sample sentence;
generating, using an initial sentence determination model, an actual response sentence according to the probability of the corrected response sample sentence, the probability of the second candidate response sample sentence, and the probability of the recall response sample sentence generated by the initial sentence generation model; and
jointly training the initial sentence generation model and the initial sentence determination model of the initial dialogue model based on the actual response sentence, the probability of the corrected response sample sentence, the probability of the second candidate response sample sentence, and the probability of the recall response sample sentence to obtain the dialogue model;
A model training device characterized by the above.
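
As a rough sketch, and not as part of the claims, the three kinds of training sentences recited above could be assembled as follows. The names `TrainingTriple`, `pick_recall_samples`, and `build_triple` are hypothetical. The corrected sentence is passed in by the caller because the claim does not say how the correction is produced, and taking the first remaining sentence as the recall sample is an assumption made only to keep the example short.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class TrainingTriple:
    corrected: str         # corrected response sample sentence
    second_candidate: str  # one of the candidate response sample sentences
    recall: str            # training sentence outside the initial/candidate set


def pick_recall_samples(
    training_sentences: Sequence[str],
    initial_sample: str,
    candidates: Sequence[str],
) -> List[str]:
    """Return the training sample sentences other than the initial dialogue
    sample sentence and the candidate response sample sentences."""
    excluded = {initial_sample, *candidates}
    return [s for s in training_sentences if s not in excluded]


def build_triple(
    training_sentences: Sequence[str],
    initial_sample: str,
    candidates: Sequence[str],
    corrected: str,
    second_index: int = 0,
) -> TrainingTriple:
    """Combine the corrected sentence, a chosen second candidate, and a
    recall sentence into one training example."""
    recall_pool = pick_recall_samples(training_sentences, initial_sample, candidates)
    if not recall_pool:
        raise ValueError("no recall response sample sentence is available")
    # Assumption: use the first available recall sentence.
    return TrainingTriple(corrected, candidates[second_index], recall_pool[0])
```

For example, with the training sentences `["hi", "hello there", "nice day", "bye"]`, the initial sample `"hi"`, and the candidates `["hello there", "nice day"]`, the only recall sentence available is `"bye"`.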

wherein, when jointly training the initial sentence generation model and the initial sentence determination model of the initial dialogue model according to the actual response sentence, the probability of the corrected response sample sentence, the probability of the second candidate response sample sentence, and the probability of the recall response sample sentence to obtain the dialogue model, the training module performs:
determining a loss function based on the actual response sentence and the corrected response sample sentence; and
jointly training, based on the loss function, the initial sentence generation model and the initial sentence determination model to obtain the dialogue model, with the following training targets: the probability of the corrected response sample sentence is greater than the probability of the second candidate response sample sentence, the probability of the corrected response sample sentence is greater than the probability of the recall response sample sentence, and the probability of the second candidate response sample sentence is greater than the probability of the recall response sample sentence.
10. The model training device according to claim 9.
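
The ordering targets of claim 10 (corrected above second candidate above recall) can be pictured with a pairwise margin penalty, sketched below. This is one common way to encode such an ordering and is an assumption, not the loss function of the disclosure; in particular, it does not model the term determined from the actual response sentence and the corrected response sample sentence. The names `targets_met` and `ranking_penalty` and the `margin` parameter are hypothetical.

```python
def targets_met(p_corrected: float, p_second: float, p_recall: float) -> bool:
    """True when all three ordering targets of claim 10 hold at once."""
    return p_corrected > p_second > p_recall


def ranking_penalty(
    p_corrected: float,
    p_second: float,
    p_recall: float,
    margin: float = 0.1,
) -> float:
    """Assumed pairwise hinge penalty over the three ordered pairs.

    Each pair (higher, lower) contributes max(0, margin - (higher - lower)),
    so the penalty is zero only when every target holds by at least `margin`.
    """
    pairs = [
        (p_corrected, p_second),  # corrected > second candidate
        (p_corrected, p_recall),  # corrected > recall
        (p_second, p_recall),     # second candidate > recall
    ]
    return sum(max(0.0, margin - (hi - lo)) for hi, lo in pairs)
```

A penalty of this kind could be added to the loss so that minimizing it pushes the probabilities produced by the initial sentence generation model toward the required order.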

An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor, wherein
the memory stores instructions executable by the at least one processor, and the instructions, when executed by the at least one processor, cause the at least one processor to perform the method according to any one of claims 1 to 3, or claim 4 or 5.

A non-transitory computer-readable storage medium having computer instructions stored thereon, wherein
the computer instructions cause a computer to carry out the method according to any one of claims 1 to 3, or claim 4 or 5.

A computer program which, when executed by a processor, implements the steps of the method according to any one of claims 1 to 3, or claim 4 or 5.