JP7687589B2 - Question creation device, question creation method, and program - Google Patents

Info

Publication number
JP7687589B2
JP7687589B2 JP2021081813A JP2021081813A JP7687589B2 JP 7687589 B2 JP7687589 B2 JP 7687589B2 JP 2021081813 A JP2021081813 A JP 2021081813A JP 2021081813 A JP2021081813 A JP 2021081813A JP 7687589 B2 JP7687589 B2 JP 7687589B2
Authority
JP
Japan
Prior art keywords
word
sentence
question
confidence
masked
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
JP2021081813A
Other languages
Japanese (ja)
Other versions
JP2022175437A (en)
Inventor
倫太 今井
匠哉 松森
哲平 吉野
遼一 柴田
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Keio University
Original Assignee
Keio University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Keio University
Priority to JP2021081813A
Publication of JP2022175437A
Application granted
Publication of JP7687589B2
Active
Anticipated expiration

Landscapes

  • Machine Translation (AREA)

Description

This disclosure relates to a question creation device, a question creation method, and a program.

With the recent development of information technology, information technology has been applied to a wide range of fields. For example, various learning support tools that use information technology have been developed. With one such tool, a learner operates a smartphone, tablet, personal computer, or similar device and studies English or another subject by solving questions displayed in a learning support app.

One question format offered by such learning support apps is the fill-in-the-blank question, and question creation methods for automatically generating such questions have been proposed. A fill-in-the-blank question requires the learner to infer the word or term that belongs in a blank from the surrounding context, or to select it from a list of options.

JP 2018-17904 A

In recent years, artificial intelligence has been applied to fields such as natural language processing. Various natural language processing models, such as Google's BERT (Bidirectional Encoder Representations from Transformers), have been developed, and their high performance on natural language processing tasks has attracted attention.

An object of the present disclosure is to provide a question creation technique for fill-in-the-blank questions that uses a natural language processing model.

To solve the above problem, one aspect of the present invention relates to a question creation device having: a sentence acquisition unit that acquires, from a dataset, a first sentence including a word to be learned; a confidence calculation unit that masks the word in the first sentence and uses a prediction model to calculate the confidence of the word at the masked position of the first sentence and the confidence of a candidate word, which is another candidate that may fill the masked position; and a question output unit that outputs the masked sentence as a fill-in-the-blank question if the confidence of the word is greater than the confidence of the candidate word by a predetermined value or more.

According to the present disclosure, a question creation technique for fill-in-the-blank questions that uses a natural language processing model can be provided.

FIG. 1 is a schematic diagram showing a question creation device according to an embodiment of the present disclosure.
FIG. 2 is a schematic diagram showing a learner's learning with a learning support system according to an embodiment of the present disclosure.
FIG. 3 is a schematic diagram showing a question creation process according to an embodiment of the present disclosure.
FIG. 4 is a block diagram showing a hardware configuration of a question creation device according to an embodiment of the present disclosure.
FIG. 5 is a block diagram showing a functional configuration of a question creation device according to an embodiment of the present disclosure.
FIG. 6 is a schematic diagram showing a question creation process according to another embodiment of the present disclosure.
FIG. 7 is a schematic diagram showing a question creation process according to another embodiment of the present disclosure.
FIG. 8 is a block diagram showing a functional configuration of a question creation device according to another embodiment of the present disclosure.
FIG. 9 is a flowchart showing a question creation process according to an embodiment of the present disclosure.

<Terminology>
In this specification, confidence is the probability of a correct answer (the prediction probability) in a mask prediction task (a word fill-in-the-blank task), and the confidence of a word is the likelihood that the word belongs at the given position in a sentence. The confidence of a word is calculated based on one or more other words in the sentence. It may also be calculated based on one or more words in the sentences before and after that sentence. A higher confidence value indicates that, judging from the other words, the word is more often used at that position. Any method may be used as long as the confidence can be calculated; no specific calculation method is required. This embodiment is described using the confidence calculated by BERT.
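The specification leaves the calculation method open. As one possible illustration, a masked language model typically produces a raw score (logit) for each vocabulary word at the masked position, and a softmax turns those scores into probabilities that can serve as confidences. The sketch below assumes this softmax formulation; the function name and the logit values are hypothetical and are not taken from the specification.

```python
import math
from typing import Dict


def confidences_from_logits(logits: Dict[str, float]) -> Dict[str, float]:
    """Convert raw per-word scores at the masked position into probabilities."""
    # Subtracting the maximum logit keeps exp() numerically stable
    # without changing the resulting probabilities.
    peak = max(logits.values())
    exps = {word: math.exp(score - peak) for word, score in logits.items()}
    total = sum(exps.values())
    return {word: value / total for word, value in exps.items()}


# Hypothetical logits for three vocabulary words at one masked position.
example = confidences_from_logits({"continue": 6.0, "cease": 3.0, "begin": 0.0})
```

The resulting values sum to 1, so the gap between the top two entries can be compared directly against a fixed threshold.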

The following embodiments disclose a question creation device for fill-in-the-blank questions.

As shown in FIG. 1, a question creation device 100 according to an embodiment of the present disclosure receives as input a word to be learned and a sentence containing that word, and uses any suitable natural language processing model, such as BERT, to output a fill-in-the-blank question for the input word. For example, as shown in FIG. 2, when learner A studies English using a learning app operated by a learning support system, the learning support system obtains information about the English words learner A has already mastered from learner A's learning history and the like, and creates fill-in-the-blank questions for English words not yet mastered from English content obtained from a database or the web.

Specifically, as shown in FIG. 3, the question creation device 100 masks the English word to be learned (S13 in FIG. 3, S33 in FIG. 7) in a sentence (S12 in FIG. 3, S32 in FIG. 7) acquired from a dataset or the like (S11 in FIG. 3, S31 in FIG. 7, described later), inputs the sentence containing the masked position into a natural language processing model such as BERT (S14 in FIG. 3, S34 in FIG. 7), and estimates candidate English words for the masked position. According to the present disclosure, the question creation device 100 estimates the confidence of each candidate English word produced by the natural language processing model. When the difference between the highest (first) confidence, that is, the confidence of the masked English word to be learned, and the next highest (second) confidence is equal to or greater than a predetermined value, the device outputs the sentence containing the masked position as a fill-in-the-blank question for the English word having the first confidence (S15 in FIG. 3, S35 in FIG. 7). As a result, only sentences in which the English word for the masked position has a confidence significantly higher than that of every other candidate are output as questions, which makes it possible to create fill-in-the-blank questions for which the English word to be learned is the only correct answer.

As described later, the question creation device 100 may be a server or the like that provides questions to a learner's smartphone, tablet, personal computer, or similar device via a learning app. For example, the question creation device 100 may be a server of a company or vendor that operates a learning app; it acquires the sentence to be processed from a database that stores sentences held by that company or vendor as a dataset, and creates a question as described later. Preferably, the dataset is composed of copyright-managed sentences.

As shown in FIG. 4, for example, the question creation device 100 may have a hardware configuration including a processor 101 such as a CPU (Central Processing Unit), a memory 102 such as a RAM (Random Access Memory) or a flash memory, a storage 103 such as a hard disk, and an input/output (I/O) interface 104.

The processor 101 executes the various processes of the question creation device 100 described later.

The memory 102 stores various data and programs of the question creation device 100, and in particular functions as a working memory for working data, running programs, and the like. Specifically, the memory 102 stores programs loaded from the storage 103 for executing and controlling the various processes described later, and functions as a working memory while the processor 101 executes those programs.

The storage 103 stores various data and programs of the question creation device 100.

The I/O interface 104 is an interface for receiving commands, input data, and the like from a user, displaying or playing back output results, and exchanging data with external devices. For example, the I/O interface 104 may be a device for inputting and outputting various types of data, such as USB (Universal Serial Bus), a communication line, a keyboard, a mouse, a display, a microphone, or a speaker.

However, the question creation device 100 according to the present disclosure is not limited to the hardware configuration described above and may have any other suitable hardware configuration. For example, one or more of the processes performed by the question creation device 100 may be implemented by a processing circuit or electronic circuit hardwired to perform them.

Next, the question creation device 100 according to an embodiment of the present disclosure is described in more detail with reference to FIGS. 5 to 9. The question creation device 100 receives as input a word to be learned (for example, an English word, a term, a person's name, or a place name) and a sentence, and outputs a fill-in-the-blank question in which that word is left blank. In the following embodiments, the word to be learned is an English word and the output is a fill-in-the-blank question in which that English word is masked, but the present disclosure is not limited to this and may be applied to learning terms in any subject other than English.

FIG. 5 is a block diagram showing the functional configuration of the question creation device 100 according to an embodiment of the present disclosure. As shown in FIG. 5, the question creation device 100 has a sentence acquisition unit 110, a confidence calculation unit 120, and a question output unit 130.

The sentence acquisition unit 110 acquires, from a dataset, a sentence including the word to be learned. In this specification, a sentence consists of one or more words and ends with a Japanese full stop "。" or a period ".", and a passage is a series of consecutive sentences that together form a coherent meaning. The sentence or passage to be acquired may be obtained, for example, from a database or the web. Preferably, it is extracted from a dataset in a copyright-managed database.

The confidence calculation unit 120 masks the word to be learned in the acquired sentence and uses a prediction model to calculate the confidence of the word to be learned at the masked position and the confidence of each candidate word, that is, each other candidate that may fill the masked position. Specifically, the confidence calculation unit 120 masks the word to be learned in the sentence or passage that contains it, converting it into a sentence or passage with a blank at the masked position. The confidence calculation unit 120 then inputs the sentence or passage containing the masked position into any suitable natural language processing model, such as BERT. As shown in FIG. 3, BERT outputs the candidate words for the masked position together with the confidence of each candidate word.
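The masking step can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `mask_word` is hypothetical, "[MASK]" is used because it is the placeholder token BERT's tokenizer expects (the figures in the specification show the blank as "( )"), and only the first whole-word occurrence is masked.

```python
import re


def mask_word(sentence: str, word: str, mask_token: str = "[MASK]") -> str:
    """Replace the first whole-word occurrence of `word` with a mask token."""
    pattern = re.compile(r"\b" + re.escape(word) + r"\b", flags=re.IGNORECASE)
    # A callable replacement avoids any escape processing of the mask token.
    masked, count = pattern.subn(lambda _match: mask_token, sentence, count=1)
    if count == 0:
        raise ValueError(f"{word!r} does not occur in the sentence")
    return masked


masked = mask_word("This city will cease to be and will continue to grow!", "continue")
```

Matching on word boundaries avoids masking a longer word that merely contains the target, such as "continued" when the target is "continue".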

In the illustrated example, for the masked word "continue", BERT receives the input sentence "This city will cease to be and will ( ) to grow!" containing the masked position and outputs the candidate words and confidences "continue 0.9373...", "cease 0.0506...", and "begin 0.0025...". Here, the confidence indicates the probability that the word belongs at the masked position. In this example the masked word has the highest confidence, but the masked word does not always have the highest confidence.

Similarly, for the masked word "hurt", BERT receives the input sentence "I apologize if my actions ( ) your pride." containing the masked position and outputs the candidate words and confidences "hurt 0.9465...", "wounded 0.0167...", and "offended 0.0062...".
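As a worked example, the gap between the top two candidates can be computed directly from the confidences listed above. The helper name `top_two_margin` is hypothetical, and the values are the truncated four-decimal figures from the examples, so the margins are approximate.

```python
from typing import Dict, Tuple


def top_two_margin(scores: Dict[str, float]) -> Tuple[str, float]:
    """Return the highest-confidence word and its lead over the runner-up."""
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    (best_word, best), (_, runner_up) = ranked[0], ranked[1]
    return best_word, best - runner_up


# Confidence values as listed in the examples, truncated to four decimals.
continue_case = {"continue": 0.9373, "cease": 0.0506, "begin": 0.0025}
hurt_case = {"hurt": 0.9465, "wounded": 0.0167, "offended": 0.0062}

for case in (continue_case, hurt_case):
    word, margin = top_two_margin(case)
    print(f"{word}: margin {margin:.4f}")
```

The margins come to roughly 0.887 for "continue" and 0.930 for "hurt", so both sentences would pass any threshold below those values.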

If the masked word (for example, "continue" or "hurt" in the examples above) is not inferred as a candidate for the masked position, the question creation device 100 treats this as an error and acquires another sentence.

In the embodiment described above, a single sentence is input to BERT, but the present disclosure is not limited to this; in one embodiment, a passage consisting of multiple consecutive sentences may be input to BERT. For example, as shown in FIG. 6, the confidence calculation unit 120 may take a passage consisting of the sentence that includes the word to be learned and the sentences before and after it (S21, S22 in FIG. 6), mask the word to be learned (S23 in FIG. 6), and input the passage containing the masked position to BERT (S24 in FIG. 6). In the same way, BERT outputs the candidate words for the masked position and their confidences from the input containing the masked position (S25 in FIG. 6).

The question output unit 130 outputs the masked sentence as a fill-in-the-blank question if the difference between the confidence of the word and the confidence of the candidate word is equal to or greater than a predetermined value. Specifically, the question output unit 130 compares the confidences of the candidate words for the masked position and calculates the difference between the highest confidence (for example, the confidence of the masked word) and the next highest confidence. If this difference is equal to or greater than a predetermined threshold, the question output unit 130 determines that the input sentence is suitable as a fill-in-the-blank question for the word and outputs the masked sentence as a fill-in-the-blank question for that word. If the difference is less than the predetermined threshold, the question output unit 130 determines that the input sentence is not suitable and does not output the masked sentence as a fill-in-the-blank question for that word.
That is, when there is no significant difference between these confidences, the correct answer cannot be uniquely identified from the input sentence, so the sentence is considered unsuitable as a fill-in-the-blank question. Likewise, when the highest confidence does not belong to the masked word, the input sentence is considered unsuitable. In short, the question output unit 130 outputs the masked sentence as a fill-in-the-blank question only when the masked word has the highest confidence and the difference between the confidence of the masked word and the confidence of the candidate word is equal to or greater than the predetermined value.
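The output decision can be sketched as a single predicate. The function name and the threshold values used in the checks are assumptions for illustration; the specification only says the predetermined value should be large enough to distinguish the masked word from the other candidates.

```python
from typing import Dict


def is_suitable_question(scores: Dict[str, float], target: str, threshold: float) -> bool:
    """Decide whether a masked sentence qualifies as a fill-in-the-blank question."""
    if target not in scores:
        # The masked word was not proposed at all, so the sentence is rejected.
        return False
    # Highest confidence among every candidate other than the masked word.
    best_rival = max(
        (score for word, score in scores.items() if word != target),
        default=0.0,
    )
    # With a positive threshold this also rejects cases where a rival outranks
    # the masked word, because the difference is then negative.
    return scores[target] - best_rival >= threshold
```

Comparing the masked word against its strongest rival, rather than against the top two entries overall, folds both conditions (highest confidence and sufficient margin) into one check.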

In one embodiment, if the difference between the confidence of the word and the confidence of the candidate word is less than the predetermined value, the sentence acquisition unit 110 acquires a sentence adjacent to the acquired sentence (that is, the sentence from the dataset that includes the word to be learned). The confidence calculation unit 120 then uses the prediction model to calculate the confidence of the word to be learned and the confidence of each candidate word at the masked position of the passage formed by the acquired sentence and its adjacent sentence, and the question output unit 130 may output the masked passage as a fill-in-the-blank question for the word if the difference between the confidence of the word to be learned and the confidence of each candidate word is equal to or greater than the predetermined value.
For example, if the sentence input to BERT is not suitable as a fill-in-the-blank question, the sentences before and after it may be retrieved, and the passage formed by the input sentence and those surrounding sentences may be input to BERT. In general, widening the context is expected to narrow down the candidate words for the masked position and therefore to increase the difference between the confidence of the word to be learned and the next highest confidence. Accordingly, when the input sentence is not suitable as a fill-in-the-blank question, the sentence acquisition unit 110 may extract the sentences before and after the input sentence from the dataset, and the confidence calculation unit 120 may mask the word to be learned in the extracted passage, input the passage containing the masked position to BERT, and calculate the candidate words and their confidences. If the difference between the confidence of the word to be learned, as the highest confidence, and the next highest confidence then reaches the predetermined value, the question creation device 100 may output the passage as a fill-in-the-blank question.
In other words, if the difference between the confidence of the word and the confidence of the candidate word in the n-th sentence (n is an integer equal to or greater than 1) is less than the predetermined value, the sentence acquisition unit 110 acquires the (n+1)-th sentence adjacent to the n-th sentence; the confidence calculation unit 120 uses the prediction model to calculate the confidence of the word to be learned and the confidence of each candidate word at the masked position of the passage formed by the acquired sentence (the n-th sentence) and its adjacent sentence (the (n+1)-th sentence); and the question output unit 130 can output the masked passage as a fill-in-the-blank question for the word if the difference between the confidence of the word to be learned and the confidence of each candidate word is equal to or greater than the predetermined value. The acquisition of adjacent sentences may be repeated, for example, up to a specified number of times or until that difference reaches the predetermined value.
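The context-widening loop can be sketched as below. Several details are assumptions made for illustration: the window grows symmetrically by one sentence on each side per step, `score` is a hypothetical callable standing in for the prediction model (it maps masked text to a word-to-confidence mapping), and masking uses a plain first-occurrence string replacement.

```python
from typing import Callable, Dict, List, Optional

Scorer = Callable[[str], Dict[str, float]]


def expand_context(
    sentences: List[str],
    index: int,
    target: str,
    score: Scorer,
    threshold: float,
    max_steps: int = 3,
    mask_token: str = "[MASK]",
) -> Optional[str]:
    """Widen the passage around sentences[index] until the margin is large enough."""
    masked_sentence = sentences[index].replace(target, mask_token, 1)
    for step in range(max_steps + 1):
        lo = max(0, index - step)
        hi = min(len(sentences), index + step + 1)
        passage = " ".join(
            sentences[lo:index] + [masked_sentence] + sentences[index + 1:hi]
        )
        scores = score(passage)
        if target in scores:
            rival = max(
                (s for w, s in scores.items() if w != target), default=0.0
            )
            if scores[target] - rival >= threshold:
                return passage
        if lo == 0 and hi == len(sentences):
            break  # No more adjacent sentences to add.
    return None  # No passage within the step limit was suitable.
```

Capping the number of steps corresponds to the "specified number of times" mentioned above and keeps the passage from growing without bound.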

In one embodiment, the question output unit 130 may output the word and selection candidate words associated with it as answer options for the fill-in-the-blank question. That is, the question output unit 130 may automatically generate the options. For example, the question output unit 130 may select, as options for the blank, the word to be learned together with its derivatives or words of a different part of speech. Specifically, when the word to be learned is a verb, the question output unit 130 may refer to a dictionary database or the like and choose the noun, adjective, adverb, or other forms corresponding to that verb as options.
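Option generation can be sketched with a small in-memory lookup table standing in for the dictionary database. The table contents, the names `RELATED_FORMS` and `build_choices`, and the shuffling of the options are all assumptions for illustration.

```python
import random
from typing import Dict, List

# Hypothetical stand-in for a dictionary database of related word forms.
RELATED_FORMS: Dict[str, List[str]] = {
    "continue": ["continuation", "continuous", "continuously"],
}


def build_choices(target: str, rng: random.Random) -> List[str]:
    """Return the correct word plus its related forms, in shuffled order."""
    choices = [target] + RELATED_FORMS.get(target, [])
    rng.shuffle(choices)
    return choices


choices = build_choices("continue", random.Random(0))
```

Passing in the random generator makes the option order reproducible when a fixed seed is used.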

In one embodiment, the question output unit 130 may display one or more letters of the word in the fill-in-the-blank question. For example, the question output unit 130 may display the first letter of the correct word as a hint in the blank (for example, "c" when the correct word is "continue").
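A hinted blank can be produced with a one-line helper. The function name and the underscore filler are assumptions; the specification only states that one or more letters of the answer may be shown.

```python
def blank_with_hint(answer: str, revealed: int = 1, filler: str = "_") -> str:
    """Show the first `revealed` letters of the answer and hide the rest."""
    if not 0 <= revealed <= len(answer):
        raise ValueError("revealed must be between 0 and the answer length")
    return answer[:revealed] + filler * (len(answer) - revealed)


hint = blank_with_hint("continue")
```

For "continue" this yields the letter "c" followed by seven filler characters, which also tells the learner the length of the answer.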

In one embodiment, the question output unit 130 may display the sentence together with an accompanying sentence in the fill-in-the-blank question. For example, as shown in FIG. 7, in an English fill-in-the-blank question the question output unit 130 may display the English sentence together with its translation.

In one embodiment, the question output unit 130 may display the sentence together with an accompanying image in the fill-in-the-blank question. For example, for the fill-in-the-blank question "Carbon dioxide consists of one carbon atom bonded to two ( ).", a diagram "O=C=O" showing a carbon atom bonded to two oxygen atoms may be displayed alongside it.

In one embodiment, as shown in FIG. 8, the question creation device 100 may further include a word providing unit 140 that provides the word to be learned. For example, the word providing unit 140 may determine the word to be learned based on the learner's learning history and proficiency level, and provide it to the sentence acquisition unit 110, the confidence calculation unit 120, and/or the question output unit 130.
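One way such a word providing unit could pick targets is sketched below. The selection rule (unlearned words at or below the learner's level, easiest first), the integer level scale, and all names are hypothetical; the specification does not define how history and level are combined.

```python
from typing import Dict, List, Set


def select_target_words(
    word_levels: Dict[str, int],
    learned: Set[str],
    learner_level: int,
    limit: int,
) -> List[str]:
    """Pick unlearned words at or below the learner's level, easiest first."""
    eligible = [
        word
        for word, level in word_levels.items()
        if word not in learned and level <= learner_level
    ]
    # Sort by difficulty, then alphabetically so the order is deterministic.
    eligible.sort(key=lambda word: (word_levels[word], word))
    return eligible[:limit]


targets = select_target_words(
    {"continue": 2, "cease": 3, "hurt": 1, "grow": 1},
    learned={"grow"},
    learner_level=2,
    limit=5,
)
```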

The embodiments described above may be combined as appropriate.

FIG. 9 is a flowchart showing a question creation process according to an embodiment of the present disclosure. The question creation process is executed by the question creation device 100 described above, and in particular may be implemented by the processor of the question creation device 100 executing a program.

As shown in FIG. 9, in step S101 the question creation device 100 acquires, from a dataset, a sentence including the word to be learned. For example, the question creation device 100 extracts a sentence or passage including the word to be learned from a copyright-managed database. Specifically, if the words to be learned are English words for high school entrance exams, the question creation device 100 extracts English sentences containing those words from the database.

In step S102, the question creation device 100 masks the word to be learned in the acquired sentence. For example, on acquiring a sentence that contains the word to be learned, the question creation device 100 replaces that word in the sentence with a blank.

In step S103, the question creation device 100 uses the prediction model to calculate the confidence of the word and the confidence of each candidate word at the masked position of the masked sentence. For example, the question creation device 100 inputs the sentence containing the masked position into BERT and obtains the candidate words that may fill the masked position together with the confidence of each. In a typical case, the masked word has the highest confidence.

In step S104, if the difference between the confidence of the masked word and the confidence of the other candidate words is equal to or greater than a predetermined value, the question creation device 100 outputs the masked sentence as a fill-in-the-blank question. The predetermined value may be set to any suitable value that allows the masked word to be meaningfully distinguished from the candidate words.
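The decision in step S104 reduces to a margin test. The sketch below assumes the comparison is made against the strongest competing candidate, which is one reasonable reading of "the other candidate words"; the function name and the confidence values are hypothetical.

```python
def should_output_question(confidences, target_word, threshold):
    """Return True if the target word leads the best other candidate by at least threshold."""
    target_conf = confidences[target_word]
    other_confs = [c for word, c in confidences.items() if word != target_word]
    best_other = max(other_confs, default=0.0)  # no competitors: compare against zero
    return target_conf - best_other >= threshold


# Illustrative confidence values, not real model output.
clear_case = {"borrow": 0.80, "use": 0.15, "take": 0.05}
ambiguous_case = {"borrow": 0.45, "use": 0.40, "take": 0.15}

accept_clear = should_output_question(clear_case, "borrow", threshold=0.5)
accept_ambiguous = should_output_question(ambiguous_case, "borrow", threshold=0.5)
```

With a threshold of 0.5, the first sentence would be emitted as a question and the second would be rejected because "use" fits the blank almost as well as the intended answer.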

Although embodiments of the present invention have been described in detail above, the present invention is not limited to the specific embodiments described, and various modifications and changes are possible within the scope of the gist of the present invention as set forth in the claims.

100 Question creation device
110 Sentence acquisition unit
120 Confidence calculation unit
130 Question output unit
140 Word providing unit

Claims (10)

A question creation device comprising:
a sentence acquisition unit that acquires, from a dataset, a first sentence including a word to be learned;
a confidence calculation unit that masks the word in the first sentence and uses an inference model to calculate the confidence of the word at the masked position of the first sentence and the confidence of a candidate word, the candidate word being another candidate that can fill the masked position; and
a question output unit that sets a predetermined value for distinguishing the word from the candidate word and, when the confidence of the word exceeds the confidence of the candidate word by at least the predetermined value, outputs the masked sentence as a fill-in-the-blank question.
The question creation device according to claim 1, wherein, when the difference between the confidence of the word and the confidence of the candidate word is less than the predetermined value:
the sentence acquisition unit acquires a second sentence adjacent to the first sentence in a passage in the dataset;
the confidence calculation unit uses the inference model to calculate the confidence of the word and the confidence of the candidate word at the masked position of the passage composed of the first sentence and the second sentence; and
the question output unit outputs the masked passage as a fill-in-the-blank question for the word when the confidence of the word exceeds the confidence of the candidate word by at least the predetermined value.
The question creation device according to claim 1, wherein, when the difference between the confidence of the word and the confidence of the candidate word is less than the predetermined value:
the sentence acquisition unit acquires an (n+1)-th sentence adjacent to an n-th sentence in a passage in the dataset, where n is an integer equal to or greater than 1;
the confidence calculation unit uses the inference model to calculate the confidence of the word and the confidence of the candidate word at the masked position of the passage composed of the n-th sentence and the (n+1)-th sentence; and
the question output unit outputs the masked passage as a fill-in-the-blank question for the word when the confidence of the word exceeds the confidence of the candidate word by at least the predetermined value.
The question creation device according to any one of claims 1 to 3, wherein the question output unit outputs, as answer options in the fill-in-the-blank question, the word and a selection candidate word associated with the word.
The question creation device according to any one of claims 1 to 4, wherein the question output unit displays one or more characters of the word in the fill-in-the-blank question.
The question creation device according to any one of claims 1 to 5, wherein the question output unit displays, in the fill-in-the-blank question, the masked sentence and a sentence associated with the masked sentence.
The question creation device according to any one of claims 1 to 5, wherein the question output unit displays, in the fill-in-the-blank question, the masked sentence and an image associated with the masked sentence.
The question creation device according to any one of claims 1 to 7, further comprising a word providing unit that provides the word to be learned.
A question creation method in which a computer executes:
a step of acquiring, from a dataset, a first sentence including a word to be learned;
a step of masking the word in the first sentence and using an inference model to calculate the confidence of the word at the masked position of the first sentence and the confidence of a candidate word, the candidate word being another candidate that can fill the masked position; and
a step of setting a predetermined value for distinguishing the word from the candidate word and, when the confidence of the word exceeds the confidence of the candidate word by at least the predetermined value, outputting the masked sentence as a fill-in-the-blank question.
A program that causes a computer to execute:
a process of acquiring, from a dataset, a first sentence including a word to be learned;
a process of masking the word in the first sentence and using an inference model to calculate the confidence of the word at the masked position of the first sentence and the confidence of a candidate word, the candidate word being another candidate that can fill the masked position; and
a process of setting a predetermined value for distinguishing the word from the candidate word and, when the confidence of the word exceeds the confidence of the candidate word by at least the predetermined value, outputting the masked sentence as a fill-in-the-blank question.
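The fallback recited in the dependent claims, in which adjacent sentences are added when the confidence margin is too small, can be sketched as a loop. This is an illustration under several assumptions: only following sentences are appended (the claims say "adjacent", which could also mean preceding ones), the expansion stops after `max_extra` sentences, and `score_fn` is a hypothetical stand-in for the inference model that returns the target's confidence and the best competing confidence.

```python
def build_question(sentences, index, target_word, score_fn, threshold, max_extra=3):
    """Grow the context with following sentences until the margin test passes.

    score_fn(masked_text, target_word) -> (target_confidence, best_other_confidence)
    Returns the masked passage, or None if no tested context is unambiguous enough.
    """
    context = [sentences[index].replace(target_word, "[MASK]", 1)]
    for extra in range(max_extra + 1):
        passage = " ".join(context)
        target_conf, best_other = score_fn(passage, target_word)
        if target_conf - best_other >= threshold:
            return passage
        next_index = index + extra + 1
        if next_index >= len(sentences):
            break  # no further adjacent sentence is available
        context.append(sentences[next_index])
    return None


def toy_score(passage, target_word):
    # Toy rule standing in for a model: the answer becomes clear once "library" is in context.
    return (0.9, 0.05) if "library" in passage else (0.4, 0.35)


sentences = ["May I borrow this?", "I will return it to the library tomorrow."]
question = build_question(sentences, 0, "borrow", toy_score, threshold=0.5)
```

In this toy run the first sentence alone fails the margin test, so the second sentence is appended and the two-sentence passage is returned as the question.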
JP2021081813A 2021-05-13 2021-05-13 Question creation device, question creation method, and program Active JP7687589B2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP2021081813A JP7687589B2 (en) 2021-05-13 2021-05-13 Question creation device, question creation method, and program

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
JP2021081813A JP7687589B2 (en) 2021-05-13 2021-05-13 Question creation device, question creation method, and program

Publications (2)

Publication Number Publication Date
JP2022175437A JP2022175437A (en) 2022-11-25
JP7687589B2 true JP7687589B2 (en) 2025-06-03

Family

ID=84145304

Family Applications (1)

Application Number Title Priority Date Filing Date
JP2021081813A Active JP7687589B2 (en) 2021-05-13 2021-05-13 Question creation device, question creation method, and program

Country Status (1)

Country Link
JP (1) JP7687589B2 (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2002268536A (en) 2001-03-07 2002-09-20 Tsubota Toru Problem forming system, problem forming method and program for making computer execute problem forming processing
JP2008233553A (en) 2007-03-20 2008-10-02 Fujitsu Ltd Learning support device, learning support method and program thereof
JP2010151922A (en) 2008-12-24 2010-07-08 Fujitsu Ltd Question creating program, question creating device, question creating method
JP2019097670A (en) 2017-11-29 2019-06-24 株式会社 学研ホールディングス Puzzle game device, puzzle game method, and program for the same
JP2019211796A (en) 2019-09-09 2019-12-12 atama plus株式会社 Learning support device and problem setting method
JP2020160158A (en) 2019-03-25 2020-10-01 Tis株式会社 QA generator, QA generator and program

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH10307724A (en) * 1997-05-09 1998-11-17 Nippon Telegr & Teleph Corp <Ntt> Insect eating solution generation apparatus and method, and recording medium storing a program for implementing the method
JPH1115360A (en) * 1997-06-24 1999-01-22 Matsushita Electric Ind Co Ltd Correction instruction method

Also Published As

Publication number Publication date
JP2022175437A (en) 2022-11-25

Legal Events

Date Code Title Description
A621 Written request for application examination

Free format text: JAPANESE INTERMEDIATE CODE: A621

Effective date: 20240509

A977 Report on retrieval

Free format text: JAPANESE INTERMEDIATE CODE: A971007

Effective date: 20250122

A131 Notification of reasons for refusal

Free format text: JAPANESE INTERMEDIATE CODE: A131

Effective date: 20250128

RD01 Notification of change of attorney

Free format text: JAPANESE INTERMEDIATE CODE: A7426

Effective date: 20250214

A521 Request for written amendment filed

Free format text: JAPANESE INTERMEDIATE CODE: A523

Effective date: 20250227

A521 Request for written amendment filed

Free format text: JAPANESE INTERMEDIATE CODE: A821

Effective date: 20250214

TRDD Decision of grant or rejection written
A01 Written decision to grant a patent or to grant a registration (utility model)

Free format text: JAPANESE INTERMEDIATE CODE: A01

Effective date: 20250415

A61 First payment of annual fees (during grant procedure)

Free format text: JAPANESE INTERMEDIATE CODE: A61

Effective date: 20250513

R150 Certificate of patent or registration of utility model

Ref document number: 7687589

Country of ref document: JP

Free format text: JAPANESE INTERMEDIATE CODE: R150