JP7564074B2

JP7564074B2 - Adversarial data detection device and adversarial data detection method

Info

Publication number: JP7564074B2
Application number: JP2021165477A
Authority: JP
Inventors: 恭平山本; 雅之吉野; 由美子横張; のん川名
Original assignee: Hitachi Ltd
Current assignee: Hitachi Ltd
Priority date: 2021-10-07
Filing date: 2021-10-07
Publication date: 2024-10-08
Anticipated expiration: 2041-10-07
Also published as: JP2023056241A; WO2023058569A1

Description

本発明は、敵対的データ検知装置及び敵対的データ検知方法に関する。 The present invention relates to an adversarial data detection device and an adversarial data detection method.

金融、医療、及び製造などの様々な分野でＡＩ（ＡｒｔｉｆｉｃｉａｌＩｎｔｅｌｌｉｇｅｎｃｅ）の普及が進み、ＡＩが機微情報を扱う場面が増加している。一方、ＡＩに特有の脆弱性をついた攻撃が報告されている。 AI (Artificial Intelligence) is becoming more widespread in various fields such as finance, medicine, and manufacturing, and there are an increasing number of situations where AI handles sensitive information. Meanwhile, attacks that exploit vulnerabilities specific to AI have been reported.

ＡＩ特有の脆弱性をついた攻撃の一例として、人が気づかない程の微小のノイズがデータに加えられ、ＡＩに誤判定を引き起こさせるＡＥ（ＡｄｖｅｒｓａｒｉａｌＥｘａｍｐｌｅ）攻撃がある。以下では、微小のノイズが加えられたデータを敵対的（Ａｄｖｅｒｓａｒｉａｌ）データ、敵対的データをＡＩへ投じることをＡＥ攻撃とも呼ぶ。 One example of an attack that exploits vulnerabilities specific to AI is the Adversarial Example (AE) attack, in which minute noise that is unnoticeable to humans is added to data, causing the AI to make a false judgment. In what follows, data with minute noise added will be referred to as adversarial data, and throwing adversarial data at an AI will be referred to as an AE attack.

ＡＥ攻撃への対策として、敵対的データを生成するまでの過程や敵対的データの持つ特徴を捉えることでＡＥ攻撃を検知する技術が提案されている。ＡＥ攻撃を検知する背景技術として、非特許文献１がある。非特許文献１に記載の技術は、入力データにランダムな微小ノイズを加えた際の、出力結果の変化を利用してＡＥ攻撃を検知する。 As a countermeasure against AE attacks, technology has been proposed that detects AE attacks by capturing the process leading up to the generation of adversarial data and the characteristics of the adversarial data. Non-Patent Document 1 is a background technology for detecting AE attacks. The technology described in Non-Patent Document 1 detects AE attacks by utilizing changes in the output results when random minute noise is added to input data.

Jingyi Wang and 3 others, "Detecting Adversarial Samples for Deep Neural Networks through Mutation Testing," [online], NIPS, May 2018, [retrieved March 26, 2021], Internet <https://arxiv.org/pdf/1805.05010.pdf>

非特許文献１のＡＥ攻撃を検知する技術は、攻撃者が敵対的データの検知手法についての情報を知っている状況下では、安全性が十分でなく、攻撃者は非特許文献１に記載の検知をすり抜ける敵対的データを意図的に生成できる。そこで、本発明の一態様は、攻撃者が敵対的データの検知手法についての情報を知っている状況下においても高精度に敵対的データを検知して、ひいては安全性を保つ。 The technology for detecting AE attacks in Non-Patent Document 1 is not sufficiently secure in a situation where an attacker knows information about a method for detecting adversarial data, and the attacker can intentionally generate adversarial data that evades the detection described in Non-Patent Document 1. Therefore, one aspect of the present invention detects adversarial data with high accuracy even in a situation where an attacker knows information about a method for detecting adversarial data, thereby maintaining security.

To solve the above problem, one aspect of the present invention employs the following configuration. An adversarial data detection device has a processor and a memory. The memory holds evaluation target data and a model that outputs an evaluation value when data is input. The processor inputs the evaluation target data to the model to calculate an evaluation value of the evaluation target data, adds different noise to the evaluation target data to generate a plurality of pieces of extended evaluation target data, and executes a first judgment process based on the plurality of pieces of extended evaluation target data. In the first judgment process, for each piece of extended evaluation target data, the processor updates that piece by adding noise until its evaluation value no longer matches the evaluation value of the evaluation target data, and then judges whether the evaluation target data is adversarial data based on the bias of the evaluation values of the updated pieces of extended evaluation target data.

According to one aspect of the present invention, adversarial data can be detected with high accuracy even in a situation where an attacker knows information about the method used to detect adversarial data, and security can thereby be maintained.

上記した以外の課題、構成及び効果は、以下の実施形態の説明により明らかにされる。 Issues, configurations, and advantages other than those described above will become clear from the description of the embodiments below.

FIG. 1 is a block diagram showing a configuration example of the adversarial data detection system in the first embodiment.
FIG. 2 is a block diagram showing a hardware configuration example of the noise addition server in the first embodiment.
FIG. 3 is a block diagram showing a hardware configuration example of the evaluation server in the first embodiment.
FIG. 4 is a block diagram showing a hardware configuration example of the detection server in the first embodiment.
FIG. 5 is a block diagram showing a hardware configuration example of the data transmission terminal in the first embodiment.
FIG. 6 is a sequence diagram showing an example of the adversarial data detection process in the first embodiment.
FIG. 7 is a flowchart showing an example of the extended evaluation target data update process in the first embodiment.
FIG. 8 is a flowchart showing an example of the adversarial data determination process in the first embodiment.
FIG. 9 is an explanatory diagram showing an example of an overview of the adversarial data detection process in the first embodiment.
FIG. 10 is a sequence diagram showing an example of the adversarial data detection process in the second embodiment.
FIG. 11 is a flowchart showing an example of the adversarial data determination process in the second embodiment.
FIG. 12 is an explanatory diagram showing an example of input and output data of the adversarial data detection system in the third embodiment.

以下、添付図面を参照して本発明の実施形態を説明する。本実施形態において、同一の構成には原則として同一の符号を付け、繰り返しの説明は省略する。なお、本実施形態は本発明を実現するための一例に過ぎず、本発明の技術的範囲を限定するものではないことに注意すべきである。本実施形態は、データ送信者から受信したデータが、ＡＥ（ＡｄｖｅｒｓａｒｉａｌＥｘａｍｐｌｅ）攻撃により生成されたデータであるか否かを判定し、判定結果に応じてデータ受信者に処理結果を送信するシステムを説明する。 Below, an embodiment of the present invention will be described with reference to the attached drawings. In this embodiment, the same components are generally given the same reference numerals, and repeated description will be omitted. It should be noted that this embodiment is merely one example for realizing the present invention, and does not limit the technical scope of the present invention. This embodiment describes a system that determines whether data received from a data sender is data generated by an AE (Adversarial Example) attack, and transmits a processing result to the data receiver according to the determination result.

図１は、敵対的データ検知システムの構成例を示すブロック図である。敵対的データ検知システムは、例えば、評価対象データを保持するデータ送信端末４００、評価対象データにノイズを付加するノイズ付加サーバ１００、評価対象データに対する評価値を計算する評価サーバ２００、評価対象データがＡＥ攻撃を受けたデータであるかを判定する検知サーバ３００、及び評価対象データに対する評価値を受信するデータ受信端末５００を含む。 Figure 1 is a block diagram showing an example of the configuration of an adversarial data detection system. The adversarial data detection system includes, for example, a data transmission terminal 400 that holds evaluation target data, a noise addition server 100 that adds noise to the evaluation target data, an evaluation server 200 that calculates an evaluation value for the evaluation target data, a detection server 300 that determines whether the evaluation target data is data that has been subjected to an AE attack, and a data receiving terminal 500 that receives the evaluation value for the evaluation target data.

データ送信端末４００、ノイズ付加サーバ１００、評価サーバ２００、検知サーバ３００、及びデータ受信端末５００の詳細な構成については後述する。データ送信端末４００とノイズ付加サーバ１００と評価サーバ２００と検知サーバ３００とデータ受信端末５００は、インターネット等のネットワーク６００を介して相互に情報を送受信する。 Detailed configurations of the data transmission terminal 400, the noise addition server 100, the evaluation server 200, the detection server 300, and the data receiving terminal 500 will be described later. The data transmission terminal 400, the noise addition server 100, the evaluation server 200, the detection server 300, and the data receiving terminal 500 transmit and receive information to and from each other via a network 600 such as the Internet.

なお、敵対的データ検知システムに含まれる装置の一部又は全部が一体化していてもよい。例えば、ノイズ付加サーバ１００、評価サーバ２００、及び検知サーバ３００の少なくとも一部が一体化していてもよいし、データ送信端末４００とデータ受信端末５００とが一体化していてもよい。 In addition, some or all of the devices included in the adversarial data detection system may be integrated. For example, at least some of the noise addition server 100, the evaluation server 200, and the detection server 300 may be integrated, or the data transmission terminal 400 and the data receiving terminal 500 may be integrated.

図２は、ノイズ付加サーバ１００のハードウェア構成例を示すブロック図である。ノイズ付加サーバ１００は、例えば、互いにバス等の内部信号線１０４で接続された、プロセッサ（ＣＰＵ）１０１、補助記憶装置１０２、メモリ１０３、表示装置１０５、入出力インターフェース１０６、及び通信インターフェース１０７を有する計算機によって構成される。 Figure 2 is a block diagram showing an example of the hardware configuration of the noise addition server 100. The noise addition server 100 is composed of a computer having a processor (CPU) 101, auxiliary storage device 102, memory 103, display device 105, input/output interface 106, and communication interface 107, which are connected to each other by an internal signal line 104 such as a bus.

The processor 101 executes programs stored in the memory 103. The memory 103 includes a ROM, which is a non-volatile, non-transitory storage element, and a RAM, which is a volatile storage element. The ROM stores immutable programs (e.g., BIOS) and the like. The RAM is a high-speed, volatile storage element such as a DRAM (Dynamic Random Access Memory), and temporarily stores the programs executed by the processor 101 and the data used when the programs are executed.

The auxiliary storage device 102 is a large-capacity, non-volatile, non-transitory storage device such as a magnetic storage device (HDD) or a flash memory (SSD), and stores the programs executed by the processor 101 and the data used when the programs are executed. That is, a program is read from the auxiliary storage device 102, loaded into the memory 103, and executed by the processor 101.

入出力インターフェース１０６は、キーボードやマウスなどが接続され、オペレータからの入力を受けるインターフェースである。また、入出力インターフェース１０６は、表示装置１０５やプリンタなどが接続され、プログラムの実行結果をオペレータが視認可能な形式で出力するインターフェースでもある。表示装置１０５は、入出力インターフェース１０６から出力されたプログラムの実行結果を表示する。 The input/output interface 106 is an interface to which a keyboard, mouse, etc. are connected and which receives input from an operator. The input/output interface 106 is also an interface to which a display device 105, a printer, etc. are connected and which outputs the results of program execution in a format that can be viewed by the operator. The display device 105 displays the results of program execution output from the input/output interface 106.

通信インターフェース１０７は、所定のプロトコルに従って、他の装置との通信を制御するネットワークインターフェース装置である。また、通信インターフェース１０７は、例えば、ＵＳＢ（ＵｎｉｖｅｒｓａｌＳｅｒｉａｌＢｕｓ）等のシリアルインターフェースを含んでもよい。 The communication interface 107 is a network interface device that controls communication with other devices according to a specific protocol. The communication interface 107 may also include a serial interface such as a Universal Serial Bus (USB).

プロセッサ１０１が実行するプログラムの一部又は全部は、計算機が読み取り可能な非一時的記憶媒体であるリムーバブルメディア（ＣＤ－ＲＯＭ、フラッシュメモリなど）から、又はネットワーク６００を介して接続された非一時的記憶装置を備える外部計算機から、ノイズ付加サーバ１００に提供され、非一時的記憶媒体である不揮発性の補助記憶装置１０２に格納されてもよい。このため、ノイズ付加サーバ１００は、リムーバブルメディアからデータを読み込むインターフェースを有するとよい。これは、評価サーバ２００、検知サーバ３００、データ送信端末４００、及びデータ受信端末５００についても同様である。 A part or all of the program executed by the processor 101 may be provided to the noise addition server 100 from removable media (CD-ROM, flash memory, etc.) which is a non-transitory storage medium readable by the computer, or from an external computer equipped with a non-transitory storage device connected via the network 600, and stored in the non-volatile auxiliary storage device 102 which is a non-transitory storage medium. For this reason, the noise addition server 100 may have an interface for reading data from removable media. This also applies to the evaluation server 200, the detection server 300, the data transmission terminal 400, and the data receiving terminal 500.

ノイズ付加サーバ１００は、物理的に一つの計算機上で、又は、論理的又は物理的に構成された複数の計算機上で構成される計算機システムであり、同一の計算機上で別個のスレッドで動作してもよく、複数の物理的計算機資源上に構築された仮想計算機上で動作してもよい。これは、評価サーバ２００、検知サーバ３００、データ送信端末４００、及びデータ受信端末５００についても同様である。 The noise addition server 100 is a computer system configured on one physical computer or on multiple logically or physically configured computers, and may operate in separate threads on the same computer, or may operate on a virtual computer constructed on multiple physical computer resources. The same applies to the evaluation server 200, detection server 300, data transmission terminal 400, and data reception terminal 500.

プロセッサ１０１は、例えば、データ拡張部１１１、及びノイズ付加部１１２を含む。例えば、プロセッサ１０１はメモリ１０３にロードされたデータ拡張プログラムに従って動作することで、データ拡張部１１１として機能し、メモリ１０３にロードされたノイズ付加プログラムに従って動作することで、ノイズ付加部１１２として機能する。 The processor 101 includes, for example, a data expansion unit 111 and a noise addition unit 112. For example, the processor 101 functions as the data expansion unit 111 by operating according to a data expansion program loaded into the memory 103, and functions as the noise addition unit 112 by operating according to a noise addition program loaded into the memory 103.

The data expansion unit 111, for example, adds noise to the evaluation target data to generate extended evaluation target data. The noise addition unit 112 updates the extended evaluation target data by adding noise to it.

図３は、評価サーバ２００のハードウェア構成例を示すブロック図である。評価サーバ２００は例えば、互いにバス等の内部信号線２０４で接続された、プロセッサ（ＣＰＵ）２０１、補助記憶装置２０２、メモリ２０３、表示装置２０５、入出力インターフェース２０６、及び通信インターフェース２０７を有する計算機によって構成される。 Figure 3 is a block diagram showing an example of the hardware configuration of the evaluation server 200. The evaluation server 200 is composed of a computer having a processor (CPU) 201, an auxiliary storage device 202, a memory 203, a display device 205, an input/output interface 206, and a communication interface 207, which are connected to each other by an internal signal line 204 such as a bus.

プロセッサ２０１、補助記憶装置２０２、メモリ２０３、内部信号線２０４、表示装置２０５、入出力インターフェース２０６、及び通信インターフェース２０７のハードウェアとしての説明は、プロセッサ１０１、補助記憶装置１０２、メモリ１０３、内部信号線１０４、表示装置１０５、入出力インターフェース１０６、及び通信インターフェース１０７のハードウェアとしての説明と同様であるため説明を省略する。 The hardware description of the processor 201, auxiliary storage device 202, memory 203, internal signal line 204, display device 205, input/output interface 206, and communication interface 207 is the same as the hardware description of the processor 101, auxiliary storage device 102, memory 103, internal signal line 104, display device 105, input/output interface 106, and communication interface 107, so the description will be omitted.

The processor 201 includes an evaluation unit 211. For example, the processor 201 functions as the evaluation unit 211 by operating according to an evaluation program loaded into the memory 203. The evaluation unit 211 calculates evaluation values for the evaluation target data and the extended evaluation target data.

図４は、検知サーバ３００のハードウェア構成例を示すブロック図である。検知サーバ３００は、例えば、互いにバス等の内部信号線３０４で接続された、プロセッサ（ＣＰＵ）３０１、補助記憶装置３０２、メモリ３０３、表示装置３０５、入出力インターフェース３０６、及び通信インターフェース３０７を有する計算機によって構成される。 Figure 4 is a block diagram showing an example of the hardware configuration of the detection server 300. The detection server 300 is configured, for example, by a computer having a processor (CPU) 301, auxiliary storage device 302, memory 303, display device 305, input/output interface 306, and communication interface 307, all of which are connected to each other by an internal signal line 304 such as a bus.

プロセッサ３０１、補助記憶装置３０２、メモリ３０３、内部信号線３０４、表示装置３０５、入出力インターフェース３０６、及び通信インターフェース３０７のハードウェアとしての説明は、プロセッサ１０１、補助記憶装置１０２、メモリ１０３、内部信号線１０４、表示装置１０５、入出力インターフェース１０６、及び通信インターフェース１０７のハードウェアとしての説明と同様であるため説明を省略する。 The hardware description of the processor 301, auxiliary storage device 302, memory 303, internal signal line 304, display device 305, input/output interface 306, and communication interface 307 is the same as the hardware description of the processor 101, auxiliary storage device 102, memory 103, internal signal line 104, display device 105, input/output interface 106, and communication interface 107, so the description will be omitted.

プロセッサ３０１は第１検知部３１１と第２検知部３１２を含む。例えば、プロセッサ３０１は、メモリ３０３にロードされた第１検知プログラムに従って動作することで、第１検知部３１１として機能し、メモリ３０３にロードされた第２検知プログラムに従って動作することで、第２検知部３１２として機能する。 The processor 301 includes a first detection unit 311 and a second detection unit 312. For example, the processor 301 functions as the first detection unit 311 by operating according to a first detection program loaded into the memory 303, and functions as the second detection unit 312 by operating according to a second detection program loaded into the memory 303.

The first detection unit 311 makes the final determination as to whether the evaluation target data is adversarial data (that is, data subjected to an AE attack). The second detection unit 312 determines whether the evaluation target data is adversarial data or may be adversarial data, and when it determines that the data may be adversarial data, it requests the first detection unit 311 to make the final determination. Note that the processor 301 of the first embodiment need not include the second detection unit 312.

図５は、データ送信端末４００のハードウェア構成例を示すブロック図である。データ送信端末４００は例えば、互いにバス等の内部信号線４０４で連結された、プロセッサ（ＣＰＵ）４０１、補助記憶装置４０２、メモリ４０３、表示装置４０５、入出力インターフェース４０６、及び通信インターフェース４０７を有する計算機によって構成される。 Figure 5 is a block diagram showing an example of the hardware configuration of the data transmission terminal 400. The data transmission terminal 400 is composed of a computer having a processor (CPU) 401, an auxiliary storage device 402, a memory 403, a display device 405, an input/output interface 406, and a communication interface 407, which are connected to each other by an internal signal line 404 such as a bus.

プロセッサ４０１、補助記憶装置４０２、メモリ４０３、内部信号線４０４、表示装置４０５、入出力インターフェース４０６、及び通信インターフェース４０７のハードウェアとしての説明は、プロセッサ１０１、補助記憶装置１０２、メモリ１０３、内部信号線１０４、表示装置１０５、入出力インターフェース１０６、及び通信インターフェース１０７のハードウェアとしての説明と同様であるため説明を省略する。 The hardware description of the processor 401, auxiliary storage device 402, memory 403, internal signal line 404, display device 405, input/output interface 406, and communication interface 407 is the same as the hardware description of the processor 101, auxiliary storage device 102, memory 103, internal signal line 104, display device 105, input/output interface 106, and communication interface 107, so the description will be omitted.

データ送信端末４００の補助記憶装置４０２は、評価対象データ４２１を保持する。なお、評価対象データ４２１は、データ送信端末４００のメモリ４０３に格納されていてもよい。 The auxiliary storage device 402 of the data transmission terminal 400 holds the evaluation target data 421. The evaluation target data 421 may be stored in the memory 403 of the data transmission terminal 400.

The hardware configuration example of the data receiving terminal 500 is the same as that of the data transmission terminal 400 in Figure 5, so its description is omitted.

Figure 6 is a sequence diagram showing an example of the adversarial data detection process. First, the data transmission terminal 400 transmits the evaluation target data to the noise addition server 100 (S601). Image data is one example of evaluation target data.

データ拡張部１１１は、受信した評価対象データに対してデータ拡張を行うことで複数の拡張評価対象データを生成する（Ｓ６０２）。ステップＳ６０２の処理の一例について説明する。例えば、拡張評価対象データの生成数を示すパラメータｎが予め設定されてメモリ１０３に格納されている。データ拡張部１１１は、例えば、ｎ種類の異なるノイズを生成し、１つの評価対象データにｎ種類のノイズそれぞれを付加することで、ｎ個の拡張評価対象データを生成する。データ拡張部１１１が生成するノイズは、例えば、評価対象データと同じ次元数の数値ベクトルである。また、データ拡張部１１１は、例えば、ガウス分布等の所定の確率分布に従って、ノイズを生成する。 The data expansion unit 111 generates multiple pieces of extended evaluation target data by performing data expansion on the received evaluation target data (S602). An example of the processing of step S602 will be described. For example, a parameter n indicating the number of extended evaluation target data to be generated is set in advance and stored in the memory 103. The data expansion unit 111 generates n different types of noise, for example, and adds each of the n types of noise to one piece of evaluation target data to generate n pieces of extended evaluation target data. The noise generated by the data expansion unit 111 is, for example, a numerical vector with the same number of dimensions as the evaluation target data. In addition, the data expansion unit 111 generates noise according to a predetermined probability distribution, such as a Gaussian distribution.
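Step S602 can be sketched as follows. This is a minimal illustration only, assuming the evaluation target data is a flat list of floats and the noise is zero-mean Gaussian; the function name `expand_data` and the parameters `sigma` and `seed` are hypothetical and do not appear in the embodiment.

```python
import random

def expand_data(data, n, sigma=0.01, seed=None):
    """Generate n pieces of extended evaluation target data from one input.

    `sigma` (noise standard deviation) and `seed` are illustrative
    assumptions, not values given in the embodiment.
    """
    rng = random.Random(seed)
    extended = []
    for _ in range(n):
        # Each noise vector has the same number of dimensions as the data.
        noise = [rng.gauss(0.0, sigma) for _ in data]
        extended.append([x + e for x, e in zip(data, noise)])
    return extended

# Example: n = 4 extended copies of a 3-dimensional input.
extended = expand_data([0.2, 0.5, 0.9], n=4, seed=0)
```

Because each copy receives an independently drawn noise vector, the n pieces of extended evaluation target data differ from one another while staying close to the original input.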

ノイズ付加部１１２と評価部２１１は相互に通信して、評価対象データに対する評価値の計算と、拡張評価対象データに対する評価値の計算と、ステップＳ６０２で生成された拡張評価対象データに対するノイズ付加と、ノイズが付加された拡張評価対象データに対する評価値の計算と、を行う（Ｓ６０３）。 The noise addition unit 112 and the evaluation unit 211 communicate with each other to calculate an evaluation value for the evaluation target data, calculate an evaluation value for the extended evaluation target data, add noise to the extended evaluation target data generated in step S602, and calculate an evaluation value for the extended evaluation target data with noise added (S603).

For example, the noise addition unit 112 updates each of the n pieces of extended evaluation target data by generating a different noise for each piece and adding it, and the evaluation unit 211 calculates an evaluation value for each of the n updated pieces of extended evaluation target data. The noise addition unit 112 may generate noise according to a predetermined probability distribution such as a Gaussian distribution, a uniform distribution, or a Laplace distribution, or may generate noise that intentionally changes the evaluation value calculated by the evaluation unit 211. As noise that intentionally changes the evaluation value, for example, noise generated by the same kind of method as that used to generate noise when creating adversarial data is used.
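The choice among probability distributions mentioned above can be illustrated with a small sampler. This is a sketch under stated assumptions: the function name `sample_noise`, the `dist` keyword, and the `scale` parameter are hypothetical, and the intentionally evaluation-changing noise (the adversarial-style option) is not covered here.

```python
import random

def sample_noise(dim, dist="gaussian", scale=0.01, rng=None):
    """Return a noise vector of length `dim` drawn from the chosen distribution.

    `dist` and `scale` are illustrative parameters, not part of the embodiment.
    """
    rng = rng or random.Random()
    if dist == "gaussian":
        return [rng.gauss(0.0, scale) for _ in range(dim)]
    if dist == "uniform":
        return [rng.uniform(-scale, scale) for _ in range(dim)]
    if dist == "laplace":
        # The difference of two independent Exp(1) variables follows Laplace(0, 1).
        return [scale * (rng.expovariate(1.0) - rng.expovariate(1.0))
                for _ in range(dim)]
    raise ValueError(f"unknown distribution: {dist}")

noise = sample_noise(3, dist="laplace", rng=random.Random(0))
```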

In this embodiment, the evaluation unit 211, for example, inputs each of the n updated pieces of extended evaluation target data into a trained model generated using deep learning, and calculates the resulting class as the evaluation value. The trained model may be one whose robustness has been improved by training on data obtained by applying processing such as resizing or padding to the training data. The trained model is, for example, stored in advance in the memory 203.

なお、ステップＳ６０３におけるｎ個の拡張評価対象データそれぞれの更新処理と更新後のｎ個の拡張評価対象データそれぞれの評価値算出処理は、所定条件が満たされるまで繰り返し実行される。評価部２１１は、ステップＳ６０３で最終的に算出された拡張評価対象データそれぞれの評価値を検知サーバ３００へ送信する（Ｓ６０４）。なお、ステップＳ６０３とステップＳ６０４の処理の詳細は図７を用いて後述する。 The update process for each of the n pieces of extended evaluation target data in step S603 and the calculation process for the evaluation value for each of the n pieces of extended evaluation target data after the update are repeated until a predetermined condition is satisfied. The evaluation unit 211 transmits the evaluation value for each piece of extended evaluation target data finally calculated in step S603 to the detection server 300 (S604). Details of the processes in steps S603 and S604 will be described later with reference to FIG. 7.

The first detection unit 311 uses the evaluation values received in step S604 to determine whether the evaluation target data transmitted in step S601 is adversarial data (S605). Details of the processing in step S605 will be described later with reference to Figure 8.

When the first detection unit 311 determines that the evaluation target data is adversarial data (S605: Yes), it does not transmit the evaluation value of the evaluation target data to the data transmission terminal 400, and instead transmits to the data transmission terminal 400 a notification indicating that transmission of the evaluation value is refused (determination result: reject) (S606). When the first detection unit 311 determines that the evaluation target data is not adversarial data (S605: No), it transmits the evaluation value of the evaluation target data to the data receiving terminal 500 (S607), and the adversarial data detection process ends.

Note that when the first detection unit 311 determines in step S803 that the evaluation target data is adversarial data, it may, in step S606, transmit the evaluation values of the extended evaluation target data finally calculated in step S603 to the data receiving terminal 500, instead of transmitting to the data transmission terminal 400 the notification indicating that transmission of the evaluation value is refused (determination result: reject).

図７は、ステップＳ６０３及びステップＳ６０４における、拡張評価対象データ更新処理の一例を示すフローチャートである。評価部２１１は、ステップＳ６０１で受け取った評価対象データＤの評価値Ｅｖａｌを計算する（Ｓ７０１）。つまり、評価部２１１は、評価対象データＤを上記した学習済みモデルに入力して得られた評価対象データＤのクラスを評価値Ｅｖａｌとして計算する。 Figure 7 is a flowchart showing an example of the extended evaluation target data update process in steps S603 and S604. The evaluation unit 211 calculates an evaluation value Eval of the evaluation target data D received in step S601 (S701). In other words, the evaluation unit 211 calculates the class of the evaluation target data D obtained by inputting the evaluation target data D into the trained model described above as the evaluation value Eval.

The evaluation unit 211 executes steps S702 to S705 below for each of i = 1, ..., n, in order from i = 1. The evaluation unit 211 calculates the evaluation value AEEval_i of D_i, which is included in the n pieces of extended evaluation target data {D_1, ..., D_n} (S702). The evaluation unit 211 determines whether Eval = AEEval_i (S703).

When the evaluation unit 211 determines that Eval = AEEval_i (S703: Yes), the noise addition unit 112 updates D_i by adding noise to D_i (S704), and the process returns to step S702, where the evaluation value is recalculated for the updated D_i. The method of adding noise in step S704 may be the same as or different from the method of adding noise in step S602.

When the evaluation unit 211 determines that Eval = AEEval_i does not hold (S703: No), it stores the evaluation value AEEval_i calculated in the most recent step S702 in, for example, the memory 203 (S705), increments the value of i, and returns to step S702.

Note that Eval = AEEval_i indicates that the class of the evaluation target data D matches the class of D_i, that is, the class has not changed even though D_i was generated by adding noise to D. Conversely, when Eval = AEEval_i does not hold, the class of the evaluation target data D does not match the class of D_i, that is, adding noise to D has changed the class.

Note that if the evaluation unit 211 still determines in step S703 that Eval = AEEval_i even after step S704 has been performed on D_i a predetermined number of times or more, it may end steps S702 to S704 for this D_i (that is, without calculating a final evaluation value for this D_i) and increment i, or it may send an error to the data transmission terminal 400 and end the adversarial data detection process.

When step S705 has been completed for each of i = 1, ..., n, the evaluation unit 211 transmits AEEval_1, ..., AEEval_n stored in the memory 203 to the detection server 300 (S604), and the extended evaluation target data update process ends.
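The loop of steps S701 to S705 can be sketched as follows. This is an illustrative outline rather than the embodiment's implementation: `model` and `add_noise` are hypothetical callables standing in for the trained model and the noise addition unit 112, `max_updates` stands in for the "predetermined number of times", and `toy_model` is an invented two-class classifier used only to make the example runnable. Of the two options described for a D_i whose class never changes, the sketch takes the one that skips that D_i.

```python
import random

def update_until_class_changes(model, data, extended, add_noise, max_updates=1000):
    """Return (Eval, [AEEval_i ...]) following the flow of steps S701 to S705."""
    eval_d = model(data)                      # S701: class of the original data D
    final_evals = []
    for d_i in extended:                      # i = 1, ..., n
        updates = 0
        ae_eval = model(d_i)                  # S702
        while ae_eval == eval_d:              # S703: Yes, class unchanged
            if updates >= max_updates:
                ae_eval = None                # give up on this D_i (assumed option)
                break
            d_i = add_noise(d_i)              # S704
            updates += 1
            ae_eval = model(d_i)              # S702 again
        if ae_eval is not None:
            final_evals.append(ae_eval)       # S705: keep the changed class
    return eval_d, final_evals

def toy_model(x):
    # Hypothetical classifier: class 1 if the feature sum exceeds 1.0, else 0.
    return 1 if sum(x) > 1.0 else 0

rng = random.Random(0)

def gaussian_step(x):
    return [v + rng.gauss(0.0, 0.05) for v in x]

data = [0.6, 0.5]
extended = [gaussian_step(data) for _ in range(5)]
eval_d, final_evals = update_until_class_changes(toy_model, data, extended, gaussian_step)
```

Every value in `final_evals` differs from `eval_d` by construction, which is the property the subsequent determination process relies on.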

Figure 8 is a flowchart showing an example of the adversarial data determination process in steps S604 and S605. The first detection unit 311 receives the evaluation values {AEEval_1, ..., AEEval_n} (S604). The first detection unit 311 obtains the frequency Num of the mode (the most frequent value) in the group of evaluation values AEEval_1, ..., AEEval_n (S801).

The first detection unit 311 determines whether the value obtained by dividing Num by the number of data n is greater than a predetermined threshold q (S802). The threshold q is, for example, stored in advance in the memory 303. When the first detection unit 311 determines that Num divided by n is greater than the threshold q (S802: Yes), it determines that the evaluation target data is adversarial data (S803) and ends the adversarial data determination process.

If the first detection unit 311 determines that Num divided by n is equal to or less than the threshold q (S802: No), it determines that the evaluation target data is not adversarial data (S804) and ends the adversarial data determination process.
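Steps S801 to S804 amount to a mode-frequency test, which can be sketched as below. The function name `is_adversarial_by_mode` and the use of string labels as evaluation values are assumptions made for illustration; only the comparison Num / n > q comes from the description.

```python
from collections import Counter
from typing import Hashable, Sequence


def is_adversarial_by_mode(ae_evals: Sequence[Hashable], q: float) -> bool:
    """Return True when Num / n > q, where Num is the frequency of the mode."""
    if not ae_evals:
        raise ValueError("at least one evaluation value is required")
    # S801: frequency of the most frequent evaluation value.
    num = Counter(ae_evals).most_common(1)[0][1]
    # S802: compare the ratio Num / n with the threshold q.
    return num / len(ae_evals) > q


# Four updated evaluation values, three of which agree, with q = 0.5.
verdict = is_adversarial_by_mode(["cat", "cat", "cat", "dog"], q=0.5)
```

Because the comparison is strict, a ratio exactly equal to q falls on the "not adversarial" side, matching the S802: No branch.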

In the example of FIG. 8, the first detection unit 311 uses Num, the frequency of the mode in the evaluation value group consisting of AEEval_1, ..., AEEval_n, to detect adversarial data. However, Num is merely one example of an index for detecting adversarial data, and a different index may be used as long as it indicates the bias of AEEval_1, ..., AEEval_n. For example, the first detection unit 311 may use as Num the sum of the frequencies of the evaluation values ranked in the top n by frequency in the evaluation value group, or may use, for detecting adversarial data, the variance of the number of data items that output each of AEEval_1, ..., AEEval_n.

An example of the above variance is described below. The first detection unit 311 calculates the frequency of each evaluation value in the evaluation value group consisting of AEEval_1, ..., AEEval_n, and calculates the variance of those frequencies. For example, suppose that there are 10 extended evaluation target data, that the evaluation value of an extended evaluation target data can be either 0 or 1, and that AEEval_1 to AEEval_10 are all 1. In this case, the frequency of evaluation value 0 is 0 and the frequency of evaluation value 1 is 10. The first detection unit 311 then calculates, for example, (mean) = (number of extended evaluation target data) / (number of values the evaluation value can take), that is, 10 / 2 = 5. For each value the evaluation value can take, it calculates ((mean) - (frequency))^2, that is, (5 - 0)^2 = 25 and (5 - 10)^2 = 25. Finally, it calculates as the variance the sum of these squared terms divided by the number of extended evaluation target data, that is, (25 + 25) / 10 = 5. If the first detection unit 311 determines that the calculated variance is equal to or less than a predetermined threshold r, it determines that the evaluation target data is not adversarial data; if it determines that the calculated variance is greater than the threshold r, it determines that the evaluation target data is adversarial data.
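The variance computation in this example can be reproduced with a short Python sketch. The function name `frequency_variance` and the explicit `possible_values` argument are assumptions for illustration; the arithmetic follows the worked example, including division by the number of extended evaluation target data and not by the number of possible values.

```python
from collections import Counter
from typing import Hashable, Sequence


def frequency_variance(
    ae_evals: Sequence[Hashable], possible_values: Sequence[Hashable]
) -> float:
    """Variance of evaluation-value frequencies as in the worked example."""
    n = len(ae_evals)
    counts = Counter(ae_evals)
    # (mean) = (number of data) / (number of values the evaluation value can take)
    mean = n / len(possible_values)
    # Sum of ((mean) - (frequency))^2 over every possible value.
    total = sum((mean - counts[v]) ** 2 for v in possible_values)
    # Divide by the number of extended evaluation target data.
    return total / n


variance = frequency_variance([1] * 10, possible_values=[0, 1])
print(variance)  # 5.0, matching (25 + 25) / 10
```

With an evenly split group such as five 0s and five 1s, the same function returns 0.0, so a larger value indicates a stronger bias toward a single evaluation value.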

FIG. 9 is an explanatory diagram showing an example of an overview of the adversarial data detection process. In the example of FIG. 9, the evaluation target data is an image. A human looking at the image recognizes it as an image of a cat, but when the evaluation unit 211 inputs the image into the model, the model misrecognizes it as an image of a horse. In other words, the evaluation target data in FIG. 9 is adversarial data.

In this case, the image that is the evaluation target data looks like a cat to a human, so if noise is added to that image, the model is likely to classify the noise-added image as a cat.

The data expansion unit 111 generates a plurality of images (extended evaluation target data) by adding different noises to the image (S602). The noise addition unit 112 adds noise to each of the plurality of images until the evaluation value given by the evaluation unit 211 (that is, the classification result output when the image is input to the model) changes (S603).

The first detection unit 311 calculates an index (Num) indicating the bias of the evaluation values after the change, and if it determines that the bias is large, it determines that the evaluation target data is adversarial data. This is because, when various noises are added to the adversarial data classified as a horse, the data is expected to return to the classification of the original image, which is a cat.

In the example of FIG. 9, if the threshold q is 0.5, three of the four evaluation values after the change are "cat", so Num/n = 3/4 > 0.5, and the first detection unit 311 determines that the evaluation target data is adversarial data.

As described above, the adversarial data detection system of this embodiment can detect adversarial data with high accuracy and a small amount of computation through the processing described above. When adversarial data is input to the model, the model outputs an evaluation value different from that of normal data, but adversarial data is highly likely to have been created by adding deliberate minute noise to normal data. Therefore, if noise continues to be added to adversarial data, the data can be expected to tend to return to data for which the model outputs the evaluation value of the original normal data. In other words, if a large proportion of the extended evaluation target data have the same evaluation value after the change, the extended evaluation target data having that evaluation value are considered to have returned to normal data, and the adversarial data detection system of this embodiment uses this property to determine whether the evaluation target data is adversarial data.

Specifically, as described above, the adversarial data detection system of this embodiment continues to add noise to the extended evaluation target data until their evaluation values change, and determines whether the evaluation target data is adversarial data based on the bias of the evaluation values after the change. Therefore, even if an attacker has information about the detection method of this embodiment, and even if an attacker who has information about the model generates adversarial data for which the output of the model does not change when minute noise is added a small number of times, the system can detect with high accuracy whether the evaluation target data is adversarial data, and security can thus be maintained.

In step S603, the noise addition unit 112 may add noise generated by the same algorithm to every extended evaluation target data, for example by generating noise according to a Gaussian distribution each time, or it may add noise generated by different algorithms to different extended evaluation target data, for example by adding noise following a Gaussian distribution to D_1 and noise following a uniform distribution to D_2. In addition, the calculation of the evaluation value of the evaluation target data that is transmitted from the detection server 300 to the data receiving terminal 500 may be performed in advance in step S602, or may be performed immediately before step S613 (it may be performed at any timing after step S601 and before step S613).
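The choice between a single noise algorithm and per-data algorithms can be illustrated as follows. This is a hedged sketch: the helper names `gaussian_noise` and `uniform_noise`, the parameters `sigma` and `width`, the fixed seed, and the representation of data as lists of floats are all assumptions that do not appear in the description.

```python
import random
from typing import Callable, List

Vector = List[float]
NoiseAdder = Callable[[Vector], Vector]


def gaussian_noise(rng: random.Random, sigma: float) -> NoiseAdder:
    """Noise following a Gaussian distribution with mean 0 and std sigma."""
    def add(v: Vector) -> Vector:
        return [x + rng.gauss(0.0, sigma) for x in v]
    return add


def uniform_noise(rng: random.Random, width: float) -> NoiseAdder:
    """Noise following a uniform distribution on [-width, width]."""
    def add(v: Vector) -> Vector:
        return [x + rng.uniform(-width, width) for x in v]
    return add


rng = random.Random(0)  # fixed seed only to make the sketch repeatable
adders = [gaussian_noise(rng, 0.1), uniform_noise(rng, 0.1)]

# D_1 receives Gaussian noise and D_2 receives uniform noise.
extended = [[0.0, 0.0], [0.0, 0.0]]
noised = [adders[i % len(adders)](d) for i, d in enumerate(extended)]
```

Passing a list with a single adder gives the "same algorithm for every data" variant without changing the rest of the code.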

This embodiment describes another example of the adversarial data detection process. In the adversarial data detection process of this embodiment, the adversarial data detection system determines whether the evaluation target data is adversarial data based on the evaluation values of the extended evaluation target data first generated by adding noise to the evaluation target data (first determination), and if it determines that the evaluation target data is adversarial data, it further executes the determination process of steps S603 to S607 described in the first embodiment (second determination).

FIG. 10 is a sequence diagram showing an example of the adversarial data detection process. The following mainly describes the differences from FIG. 6. Following step S602, the data expansion unit 111 transmits the generated extended evaluation target data and the evaluation target data to the evaluation server 200 (S1001).

The evaluation unit 211 calculates an evaluation value for each of the evaluation target data and the extended evaluation target data received in step S1001 (S1002). The evaluation unit 211 transmits the evaluation values calculated in step S1002 to the detection server 300 (S1003). The second detection unit 312 uses the evaluation values received in step S1003 to determine whether the evaluation target data is adversarial data or may not be adversarial data (S1004). Details of the processing in step S1004 will be described later with reference to FIG. 11.

If the second detection unit 312 determines that the evaluation target data is adversarial data (S1004: Yes), it does not transmit the evaluation value of the evaluation target data to the data transmission terminal 400, but instead transmits to the data transmission terminal 400 a notification indicating that transmission of the evaluation value is refused (determination result: reject) (S1005), and the adversarial data detection process ends. If the second detection unit 312 determines that the evaluation target data may not be adversarial data (S1004: No), it transmits to the noise addition server 100 a notification indicating a transition to the second determination (and/or that the evaluation target data may not be adversarial data) (determination result: accept) (S1006), the processing of steps S603 to S607 (second determination) is performed, and the adversarial data detection process ends.

Note that the evaluation value of D_i calculated in the first execution of step S702 for each D_i (i = 1, ..., n) is the same as the evaluation value of the corresponding extended evaluation target data calculated in step S1002, so the first execution of step S702 for each D_i (i = 1, ..., n) may be omitted.

FIG. 11 is a flowchart showing an example of the adversarial data determination process (second determination) in step S1004. The second detection unit 312 receives the evaluation value Eval of the evaluation target data and the evaluation values {Eval_1, ..., Eval_n} of the extended evaluation target data (S1003). The second detection unit 312 obtains Num, the number of extended evaluation target data for which Eval ≠ Eval_i (i = 1, ..., n) (S1101). The second detection unit 312 determines whether Num divided by the number n of extended evaluation target data is smaller than a predetermined threshold p (S1102). The threshold p is, for example, stored in advance in the memory 303.

If the second detection unit 312 determines that Num divided by the number n of extended evaluation target data is smaller than the threshold p (S1102: Yes), it determines that the evaluation target data is adversarial data (S1103) and ends the adversarial data determination process. If the second detection unit 312 determines that Num divided by n is equal to or greater than the threshold p (S1102: No), it determines that the evaluation target data may not be adversarial data (S1104) and ends the adversarial data determination process.
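The flow of steps S1101 to S1104 can be written as a short Python sketch. It follows the comparison exactly as the flowchart text states it (Num / n < p leads to the "adversarial" branch); the function name `second_detection` and the string labels are assumptions introduced only for the example.

```python
from typing import Hashable, Sequence


def second_detection(
    eval_original: Hashable, evals: Sequence[Hashable], p: float
) -> bool:
    """Return True for the S1102: Yes branch (judged adversarial),
    False for the S1102: No branch (may not be adversarial)."""
    if not evals:
        raise ValueError("at least one extended evaluation value is required")
    # S1101: Num = number of extended data with Eval != Eval_i.
    num = sum(1 for e in evals if e != eval_original)
    # S1102: compare Num / n with the threshold p, as stated in the text.
    return num / len(evals) < p


# One of four extended evaluation values differs from Eval: 1/4 < 0.5.
result = second_detection("horse", ["horse", "horse", "cat", "horse"], p=0.5)
```

A ratio exactly equal to p falls on the "may not be adversarial" side, since S1102: No covers values equal to or greater than p.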

Note that in step S1102 the second detection unit 312 uses the proportion of extended evaluation target data whose evaluation values differ, but another index may be used for detection. For example, in step S1102, the second detection unit 312 may count the amount of change of the evaluation value of each extended evaluation target data from the evaluation value of the evaluation target data, and use the average of those amounts of change instead of Num.

Since adversarial data is highly likely to have been created by adding deliberate minute noise to normal data, it is highly likely to lie close to a boundary at which the evaluation value output by the model changes. In other words, when the evaluation target data is adversarial data, the evaluation value of extended evaluation target data obtained by adding noise to the evaluation target data is highly likely to differ from the evaluation value of the evaluation target data. Therefore, the adversarial data detection system of this embodiment can detect adversarial data efficiently by determining, in the process of FIG. 11, whether the evaluation target data is adversarial data based on the proportion of extended evaluation target data whose evaluation value differs from that of the evaluation target data.

In the above example, when the adversarial data detection system determines in the first determination that the evaluation target data is adversarial data, it makes the final determination of whether the evaluation target data is adversarial data by the second determination. However, when it determines in the second determination that the evaluation target data is adversarial data, it may instead make the final determination by the first determination.

As described above, the adversarial data detection system of this embodiment applies to the evaluation target data two detection methods that capture different properties, and can thereby detect adversarial data with higher security and higher accuracy.

This embodiment describes an example of a use case of the adversarial data detection system. FIG. 12 is an explanatory diagram showing an example of input/output data of the adversarial data detection system. In the example of FIG. 12, the evaluation target data is a medical image, and the adversarial data detection system determines, based on the medical image, whether the person who is the subject of the medical image is ill and, if so, the type of illness that person has.

In step S601, the data transmission terminal 400 transmits a medical image to the noise addition server 100. If the detection server 300 determines in step S605 that the medical image is adversarial data, it transmits a report to that effect to the data transmission terminal 400 in step S606.

If the detection server 300 determines in step S605 that the medical image is not adversarial data, it transmits, in step S607, the evaluation result for the medical image (a classification result indicating whether the person who is the subject is ill and, if so, the type of illness) to the data receiving terminal 500.

The present invention is not limited to the embodiments described above and includes various modifications. For example, the embodiments above have been described in detail in order to explain the present invention clearly, and the invention is not necessarily limited to configurations having all of the elements described. Part of the configuration of one embodiment may be replaced with the configuration of another embodiment, and the configuration of another embodiment may be added to the configuration of one embodiment. For part of the configuration of each embodiment, other configurations may also be added, deleted, or substituted.

Some or all of the configurations, functions, processing units, processing means, and the like described above may be realized in hardware, for example by designing them as integrated circuits. The configurations, functions, and the like described above may also be realized in software, with a processor interpreting and executing programs that realize the respective functions. Information such as the programs, tables, and files that realize each function can be stored in a memory, in a recording device such as a hard disk or SSD (Solid State Drive), or on a recording medium such as an IC card, SD card, or DVD.

The control lines and information lines shown are those considered necessary for the explanation, and not all control lines and information lines in a product are necessarily shown. In practice, almost all of the components may be considered to be interconnected.

100 noise addition server, 200 evaluation server, 300 detection server, 400 data transmission terminal, 500 data receiving terminal, 101 processor, 102 auxiliary storage device, 103 memory, 107 communication interface, 111 data expansion unit, 112 noise addition unit, 201 processor, 202 auxiliary storage device, 203 memory, 207 communication interface, 211 evaluation unit, 301 processor, 302 auxiliary storage device, 303 memory, 307 communication interface, 311 first detection unit, 312 second detection unit, 401 processor, 402 auxiliary storage device, 403 memory, 407 communication interface

Claims

1. An adversarial data detection device, comprising:
A processor and a memory,
The memory holds evaluation target data and a model that outputs an evaluation value when the data is input,
The processor,
inputting the evaluation target data into the model to calculate an evaluation value of the evaluation target data;
Adding different noises to the evaluation target data to generate a plurality of extended evaluation target data;
Executing a first determination process based on the plurality of extended evaluation target data;
In the first determination process,
updating the extended evaluation target data by adding noise to each of the plurality of extended evaluation target data until the evaluation value of the extended evaluation target data does not match the evaluation value of the evaluation target data;
An adversarial data detection device that determines whether the evaluation target data is adversarial data based on the bias of the evaluation values of each of the updated extended evaluation target data.

2. The adversarial data detection device of claim 1,
The processor determines that the evaluation target data is adversarial data when the ratio of the frequency of the most frequent evaluation value among the evaluation values of the updated extended evaluation target data to the number of the plurality of extended evaluation target data is greater than a predetermined threshold.

3. The adversarial data detection device of claim 1,
The processor,
inputting each of the plurality of extended evaluation target data into the model, and calculating an evaluation value for each of the plurality of extended evaluation target data;
executing a second determination process for determining whether the evaluation target data is likely to be non-adversarial data based on a rate of agreement between the evaluation value of each of the plurality of extended evaluation target data and the evaluation value of the evaluation target data;
The adversarial data detection device executes the first determination process when it determines that the evaluation target data is likely not adversarial data.

4. The adversarial data detection device of claim 3,
The processor determines that the evaluation target data is likely not adversarial data when the rate at which the evaluation values of each of the plurality of extended evaluation target data match the evaluation value of the evaluation target data is less than a predetermined threshold.

5. The adversarial data detection device of claim 1,
An adversarial data detection device, wherein the noise added in generating the plurality of extended evaluation target data and the noise added in updating the plurality of extended evaluation target data include at least one of noise following a predetermined probability distribution and noise generated to manipulate the evaluation value.

6. The adversarial data detection device of claim 1,
An adversarial data detection device, wherein the model is a trained model generated by deep learning, or a trained model obtained by training on data that has been subjected to processing including at least one of resizing and padding when deep learning is performed.

7. An adversarial data detection method by an adversarial data detection device, comprising:
The adversarial data detection device includes a processor and a memory;
The memory holds evaluation target data and a model that outputs an evaluation value when the data is input,
The adversarial data detection method includes:
The processor inputs the evaluation target data into the model and calculates an evaluation value of the evaluation target data;
The processor adds different noises to the evaluation target data to generate a plurality of extended evaluation target data;
The processor executes a first determination process based on the plurality of extended evaluation target data;
In the first determination process,
The processor adds noise to each of the plurality of extended evaluation target data to update the extended evaluation target data until the evaluation value of the extended evaluation target data does not match the evaluation value of the evaluation target data;
The adversarial data detection method, wherein the processor determines whether the evaluation target data is adversarial data based on a bias in the evaluation value of each piece of updated extended evaluation target data.

8. The method of claim 7, further comprising:
The adversarial data detection method, wherein the processor determines that the evaluation target data is adversarial data when the ratio of the frequency of the most frequent evaluation value among the evaluation values of the updated extended evaluation target data to the number of the plurality of extended evaluation target data is greater than a predetermined threshold.

9. The method of claim 7, further comprising:
The processor inputs each of the plurality of extended evaluation target data into the model and calculates an evaluation value for each of the plurality of extended evaluation target data;
The processor executes a second determination process to determine whether the evaluation target data is likely to be non-adversarial data based on a rate at which the evaluation value of each of the plurality of extended evaluation target data matches the evaluation value of the evaluation target data;
The method for detecting adversarial data, wherein the processor executes the first determination process when the processor determines that the evaluation target data is likely not adversarial data.

10. The method of claim 9, further comprising:
The adversarial data detection method, wherein the processor determines that the evaluation target data is likely not adversarial data when the rate at which the evaluation values of each of the plurality of extended evaluation target data match the evaluation value of the evaluation target data is less than a predetermined threshold.

11. The method of claim 7, further comprising:
An adversarial data detection method, wherein the noise added in generating the multiple extended evaluation target data and the noise added in updating the multiple extended evaluation target data include at least one of noise following a predetermined probability distribution and noise generated to manipulate the evaluation value.

12. The method of claim 7, further comprising:
The adversarial data detection method, wherein the model is a trained model generated by deep learning, or a trained model obtained by training data that has been subjected to processing including at least one of resizing and padding when deep learning is performed.