JP7833962B2

JP7833962B2 - Similarity region detection device and similarity region detection program

Info

Publication number: JP7833962B2
Application number: JP2022081737A
Authority: JP
Inventors: 翔平森; 敏西村
Original assignee: Japan Broadcasting Corp
Current assignee: Japan Broadcasting Corp
Priority date: 2022-05-18
Filing date: 2022-05-18
Publication date: 2026-03-23
Anticipated expiration: 2042-05-18
Also published as: JP2023170185A

Description

The present invention relates to a similarity region detection device and a similarity region detection program for detecting similar regions in images.

Methods based on features, such as color features and local features, are widely known for detecting similar regions in images. One example of a method based on color features is the method of Non-Patent Document 1, which detects regions whose color histogram is highly similar to that of the input image. It is not suited to detecting minute differences in object shape, but it is robust to changes in object shape.
One example of a method based on local features is the method of Non-Patent Document 2, which builds Visual Words (a dictionary) from local features and detects regions whose histogram of Visual Word occurrence frequencies is highly similar. It is robust to shape changes such as rotation and scaling of the object, and to brightness variation.

Non-Patent Document 1: M. J. Swain, D. H. Ballard, "Color indexing", International Journal of Computer Vision, Vol. 7, No. 1, 1991, pp. 11-32
Non-Patent Document 2: G. Csurka, C. Dance, L. Fan, J. Willamowski, C. Bray, "Visual categorization with bags of keypoints", ECCV Workshop on Statistical Learning in Computer Vision, 2004

For example, when detecting the position in an input image that has the greatest similarity to a template (the target image), a known conventional approach shifts a search region (search window) step by step and compares, at each step, whether the features of the search window resemble those of the template.
With this approach, if the object to be detected appears in the input image at the same size and orientation as the template, the position in the input image with the greatest similarity to the template can be found simply by shifting the search window.
However, if the size and orientation of the object in the input image are unknown relative to the template (for example, twice the size and tilted), the search had to consider search windows of various sizes (for example, 1/2, 1, 3/2, 2, ... times the template size) and various orientations (for example, the template rotated by 0, 30, 60, ... degrees). Conventional search therefore had the problem of requiring an enormous number of matching operations, because parameters for the shape of the search window, such as its size and orientation, must be varied in addition to the parameters for its position.
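To give a sense of the scale of this problem, the following sketch counts the window comparisons an exhaustive search would need. The image size, template size, stride, and the lists of scales and angles are illustrative assumptions, not values from the specification, and rotated windows are approximated by their unrotated footprint.

```python
def count_window_comparisons(img_w, img_h, tmpl_w, tmpl_h, scales, angles, stride=1):
    """Count template comparisons for an exhaustive sliding-window search.

    Assumption: a rotated window occupies the same footprint as the
    unrotated one, which keeps the count simple.
    """
    total = 0
    for s in scales:
        win_w, win_h = int(tmpl_w * s), int(tmpl_h * s)
        if win_w == 0 or win_h == 0 or win_w > img_w or win_h > img_h:
            continue  # the window does not fit at this scale
        positions_x = (img_w - win_w) // stride + 1
        positions_y = (img_h - win_h) // stride + 1
        # every position is tested once per candidate orientation
        total += positions_x * positions_y * len(angles)
    return total


# Position only: one scale, one orientation.
fixed = count_window_comparisons(640, 480, 100, 100, [1.0], [0])

# Position, four scales, and six orientations.
varied = count_window_comparisons(
    640, 480, 100, 100, [0.5, 1.0, 1.5, 2.0], [0, 30, 60, 90, 120, 150]
)
```

The count grows with the product of the number of positions, scales, and orientations, which is the cost the invention aims to avoid.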

The present invention has been made in view of these problems of the prior art. Its object is to provide a similarity region detection device and program that can detect, within an input image, a region similar to a target image by detecting a region whose histogram of image features is similar, with an amount of computation that is constant regardless of the position or shape of the search region.

The similarity region detection device according to the present invention comprises: an input unit that receives a target image and an input image; a feature extraction unit that extracts features from the target image and from each of the divided regions obtained by dividing the input image at a preset size; a mathematical modeling unit that generates a mathematical model, whose variables indicate which of the divided regions are selected, based on the features extracted from the target image and the features extracted from each divided region of the input image; a model variable calculation unit that obtains a solution of the variables that minimizes the mathematical model so that the value of the mathematical model satisfies a predetermined threshold condition; and a region detection unit that determines the similar region identified by the solution of the variables obtained by the model variable calculation unit. The device outputs the similar region determined by the region detection unit.

The mathematical modeling unit may set in advance the target size of the similar region to be determined by the region detection unit.

The mathematical model may comprise a histogram similarity model, a region size model, and a region clumping model. Minimizing the histogram similarity model acts to bring the feature histogram of the selected divided regions closer to the feature histogram extracted from the target image. Minimizing the region size model acts to make the number of divided regions selected satisfy a predetermined condition. Minimizing the region clumping model acts to increase the degree to which adjacent divided regions are selected together.

The model variable calculation unit may obtain the solution of the variables that minimizes an overall function so that it satisfies a predetermined threshold condition, where the overall function is the sum of the histogram similarity model, the region size model, and the region clumping model, each weighted by its own balance coefficient.

The model variable calculation unit may use the solution of the variables obtained by minimizing the overall function under the predetermined threshold condition to compute the values of the histogram similarity model, the region size model, and the region clumping model, and may adjust the balance coefficients so that the value of each model is at or below its own predetermined threshold.

The region detection unit may determine the similar region by post-processing the divided regions identified by the solution of the variables obtained by the model variable calculation unit.

The post-processing may be any one of the following: a process that takes the largest of the 4-connected components formed by the divided regions as the similar region; a process that takes the largest of the 8-connected components formed by the divided regions as the similar region; a closing process that applies dilation to the binary image several times and then applies erosion the same number of times; or an opening process that applies erosion to the binary image several times and then applies dilation the same number of times.
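The first two post-processing options can be sketched as a breadth-first search over a binary grid of selected divided regions. This is a minimal illustration under assumptions: the grid layout (a list of rows of 0/1 values), the function name `largest_component`, and the tie-breaking rule (the first component found wins) are not taken from the specification.

```python
from collections import deque


def largest_component(mask, connectivity=4):
    """Return the set of (row, col) cells in the largest connected component.

    mask is a list of rows holding 1 (selected) or 0 (not selected).
    On a tie, the component found first in row-major order is returned.
    """
    if connectivity == 4:
        steps = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    elif connectivity == 8:
        steps = [(dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1)
                 if (dr, dc) != (0, 0)]
    else:
        raise ValueError("connectivity must be 4 or 8")

    rows = len(mask)
    cols = len(mask[0]) if rows else 0
    seen = set()
    best = set()
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] != 1 or (r, c) in seen:
                continue
            # grow one component from this seed cell
            component = {(r, c)}
            seen.add((r, c))
            queue = deque([(r, c)])
            while queue:
                cr, cc = queue.popleft()
                for dr, dc in steps:
                    nr, nc = cr + dr, cc + dc
                    if (0 <= nr < rows and 0 <= nc < cols
                            and mask[nr][nc] == 1 and (nr, nc) not in seen):
                        seen.add((nr, nc))
                        component.add((nr, nc))
                        queue.append((nr, nc))
            if len(component) > len(best):
                best = component
    return best


selected = [
    [1, 1, 0, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 1, 1],
]
four_connected = largest_component(selected, connectivity=4)
eight_connected = largest_component(selected, connectivity=8)
```

With 8-connectivity, diagonal neighbors count as connected, so the staircase in this example forms a single component, whereas with 4-connectivity it splits into three.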

The similarity region detection program according to the present invention causes a computer to function as the similarity region detection device.

The present invention can provide a similarity region detection device and program that can detect, within an input image, a region similar to a target image by detecting a region whose histogram of image features is similar, with an amount of computation that is constant regardless of the position or shape of the search region.

Diagram showing the configuration of the similarity region detection system in the embodiment.
Functional block diagram showing a configuration example of the mathematical modeling unit of the similarity region detection device in the embodiment.
Diagram schematically showing the divided regions and adjacent regions used by the mathematical modeling unit of the similarity region detection device in the embodiment.
Diagram showing a processing example of the region clumping model formulation unit of the mathematical modeling unit in the embodiment.
Diagram showing a processing example of the region clumping model formulation unit of the mathematical modeling unit in the embodiment.
Diagram showing a processing example of the region clumping model formulation unit of the mathematical modeling unit in the embodiment.
Diagram showing a processing example of the region clumping model formulation unit of the mathematical modeling unit in the embodiment.
Flowchart showing the operation of the similarity region detection device in the embodiment.

An example of an embodiment of the present invention is described below.
Figure 1 shows the configuration of the similarity region detection system 100 in this embodiment.
The similarity region detection system 100 comprises a similarity region detection device 1 and a computer 2.

The similarity region detection device 1 receives a target image and an input image from the input device 50, detects in the input image a similar region that resembles the target region (the region of the target image containing the image to be matched), and outputs the detected similar region to the output device 60. To this end, the similarity region detection device 1 formulates a mathematical model of a combinatorial optimization problem. The variables of the model indicate which divided regions are selected from the set of divided regions obtained by dividing the input image in advance, and the model takes its minimum value when the selected subset of divided regions resembles the target region.

The mathematical model of the combinatorial optimization problem is briefly explained here. In this embodiment, the mathematical model H is constructed in the QUBO (Quadratic Unconstrained Binary Optimization) representation shown in Equation 1 below.

$$H = \sum_i \sum_j Q(i,j)\, q_i q_j = q^{T} Q q$$
(Equation 1)
Here, q_i and q_j are the i-th and j-th elements of the vector q and are binary variables taking the value 0 or 1.
Σ_i and Σ_j are operators that sum over all elements of the vector q.
The superscript T on a vector denotes its transpose.
Q(i,j) is the element in row i, column j of the parameter matrix Q and is the coefficient of the product of binary variables q_i q_j.
Instead of the QUBO representation, the mathematical model H may use the equivalent Ising model representation.
The similarity region detection device 1 formulates the mathematical model H of the combinatorial optimization problem, for example in the QUBO representation of Equation 1, with variables indicating which of the predetermined candidate divided regions are selected.
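As an illustration of the QUBO form, the sketch below evaluates H for a given binary vector and finds the minimizer of a three-variable toy problem by exhaustive enumeration. The matrix values and the function names are made up for the example, and brute force stands in for a dedicated solver only because the problem is tiny.

```python
from itertools import product


def qubo_energy(Q, q):
    """Evaluate H = sum_i sum_j Q[i][j] * q[i] * q[j] for a binary vector q."""
    n = len(q)
    return sum(Q[i][j] * q[i] * q[j] for i in range(n) for j in range(n))


def brute_force_minimum(Q):
    """Try all 2^n assignments; feasible only for very small n."""
    n = len(Q)
    best = min(product((0, 1), repeat=n), key=lambda q: qubo_energy(Q, q))
    return list(best), qubo_energy(Q, best)


# Toy parameter matrix: each variable is rewarded (-1 on the diagonal),
# but selecting neighbouring variables together is penalised (+2).
Q = [
    [-1, 2, 0],
    [0, -1, 2],
    [0, 0, -1],
]
solution, energy = brute_force_minimum(Q)
```

Because the variables are binary, q_i * q_i equals q_i, so the diagonal of Q carries the linear terms of the model and the off-diagonal entries carry the pairwise terms.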

The computer 2 computes the solution of the mathematical model generated by the similarity region detection device 1. Specifically, the computer 2 computes the set of values q in Equation 1. The computer 2 may be any computer capable of computing the solution of the mathematical model generated by the similarity region detection device 1. For example, the computer 2 is a computer such as an Ising machine that solves QUBO or Ising models. The computer 2 need not be a separate component and may instead be provided inside the similarity region detection device 1.
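The equivalence between the QUBO and Ising representations can be sketched as follows. The specification does not state a spin convention; this example assumes the common substitution q_i = (1 + s_i) / 2 with spins s_i in {-1, +1}, and the names `qubo_to_ising`, `J`, `h`, and `offset` are illustrative.

```python
from itertools import product


def qubo_energy(Q, q):
    """H = sum_i sum_j Q[i][j] * q[i] * q[j] with q[i] in {0, 1}."""
    n = len(q)
    return sum(Q[i][j] * q[i] * q[j] for i in range(n) for j in range(n))


def qubo_to_ising(Q):
    """Convert a QUBO matrix to Ising couplings J, fields h, and a constant.

    Assumes q_i = (1 + s_i) / 2, so that
    H = sum_{i != j} J[i][j] s_i s_j + sum_i h[i] s_i + offset.
    """
    n = len(Q)
    J = [[Q[i][j] / 4 if i != j else 0.0 for j in range(n)] for i in range(n)]
    h = [sum(Q[i][j] + Q[j][i] for j in range(n)) / 4 for i in range(n)]
    # s_i * s_i = 1, so the diagonal quadratic terms become constants
    offset = (sum(map(sum, Q)) + sum(Q[i][i] for i in range(n))) / 4
    return J, h, offset


def ising_energy(J, h, offset, s):
    """Evaluate the Ising energy for spins s[i] in {-1, +1}."""
    n = len(s)
    pair = sum(J[i][j] * s[i] * s[j] for i in range(n) for j in range(n))
    field = sum(h[i] * s[i] for i in range(n))
    return pair + field + offset


Q = [
    [-1, 2, 0],
    [0, -1, 2],
    [0, 0, -1],
]
J, h, offset = qubo_to_ising(Q)
```

Under this substitution every binary assignment and its corresponding spin assignment have the same energy, so a solver for either representation yields the same selection.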

Figure 1 shows the functional configuration of the similarity region detection device 1 of the similarity region detection system 100 in this embodiment.
The similarity region detection device 1 is an information processing device comprising a control unit 10 and a storage unit 20. It may be implemented on a general-purpose device such as a server or personal computer, or as dedicated hardware. By reading and executing the similarity region detection program and other software stored in the storage unit 20, the control unit 10 functions as the input unit 11, feature extraction unit 12, mathematical modeling unit 13, model variable calculation unit 14, region detection unit 15, and output unit 16 described below.
With these functional units, the similarity region detection device 1 can detect regions whose histograms of image features are similar, with an amount of computation that is constant regardless of the position or shape of the search region.
Each function of the similarity region detection device 1 is described next.

The input unit 11 receives the target image and the input image from the input device 50, which is communicably connected to the similarity region detection device 1. The target image is an image containing the target region, that is, the image to be matched. The input image is the image in which regions similar to the image of the target region are to be detected. The input unit 11 outputs the received target image (the image of the target region) and input image to the feature extraction unit 12.
The target image and the input image may each have any shape. For simplicity, this embodiment sets an object of arbitrary shape in the target image as the target region and uses an ordinary rectangular image as the input image, but neither image is limited to this.

The feature extraction unit 12 extracts features from the target image (the image of the target region) received by the input unit 11 and from each of the divided regions obtained by dividing the input image (the input image region) at a preset size. This embodiment uses color features extracted from pixel values as an example, but the type of feature is not limited. For example, local features extracted from luminance gradients (that is, shape features), or local features computed with HOG (Histogram of Oriented Gradients) or SIFT (Scale-Invariant Feature Transform), may be used.
For the target image, the feature extraction unit 12 extracts the image features of the target region containing the object (image) to be matched. For the input image, it extracts features from each divided region obtained by dividing the input image region at the preset size.
The feature extraction unit 12 outputs the extracted features of the target region and of each divided region of the input image to the mathematical modeling unit 13.

The mathematical modeling unit 13 comprises a histogram similarity model formulation unit 131, a region size model formulation unit 132, and a region clumping model formulation unit 133.

The histogram similarity model formulation unit 131 generates an evaluation function for bringing the feature histogram of the divided regions selected in the input image as close as possible to the feature histogram of the target region in the target image. Specifically, minimizing the value of the evaluation function generated by the histogram similarity model formulation unit 131 acts, during the selection of divided regions, to bring the feature histogram of the selected divided regions closer to that of the target region.

The region size model formulation unit 132 generates an evaluation function for determining the number of divided regions selected from the input image. Specifically, it acts either to maximize the number of selected divided regions or to bring that number to the target number of divided regions to be detected. As a result, the total size of the divided regions selected from the input image is either maximized or set to the target number of divided regions.

The region clumping model formulation unit 133 generates an evaluation function for determining the clumping degree, which expresses the degree to which adjacent divided regions of the input image are selected together. Specifically, the function is formulated so that it acts as a reward term when two adjacent regions are both selected or both unselected, and as a penalty term when only one of them is selected. In other words, minimizing the value of the evaluation function generated by the region clumping model formulation unit 133 acts, during the selection of divided regions, to favor selecting many mutually adjacent regions.
Before the mathematical modeling unit 13 is described in detail, the symbols used throughout are defined.

The target region of the target image, which contains the object (image) to be matched, is denoted V^*.
In this embodiment, as described above, the input image is taken to be a rectangular image W pixels wide and H pixels high. Specifically, let V be the vertex set of the divided regions obtained by dividing the input image at a preset size, and let each divided region be v(x,y) (x = 1, 2, ..., X; y = 1, 2, ..., Y).
The binary function that takes the value 1 when the divided region v(x,y) is selected and 0 when it is not is denoted v^#(x,y).
That is, dividing the input image generates the divided regions and creates the set
$$V = \{\, v(x,y) \;;\; x = 1, 2, \ldots, X,\ y = 1, 2, \ldots, Y \,\}$$
X and Y are the numbers of divided regions in the horizontal and vertical directions, respectively. For example, with W = 3000, H = 2000, X = 30, and Y = 20, each divided region is 100 pixels × 100 pixels and there are 30 × 20 = 600 divided regions.
As another example, with W = 3000, H = 2000, X = 3000, and Y = 2000, each divided region coincides with a pixel, its size is 1 pixel × 1 pixel, and there are 3000 × 2000 = 6,000,000 divided regions.
The size of the divided regions (for example, their height and width in pixels) can be set independently of the size of the target region V^*.
The size of the divided regions determines how coarse or fine the detection of the similar region is, and therefore affects detection accuracy. For example, making the divided regions larger than the target region V^* makes detection coarse, so the divided regions are preferably no larger than the target region V^*.
When high-precision detection is required, a finer division is preferable. However, the finer the division, the more variables there are and the longer the computation takes. The specific pixel counts should therefore be set according to the image and the purpose.
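The division described above can be sketched as a function that maps each region index (x, y) to its pixel bounds. The function name, the 1-based indexing of the returned dictionary, the half-open bounds, and the requirement that the image divide evenly are assumptions made for this illustration.

```python
def divide_into_regions(width, height, num_x, num_y):
    """Map each region index (x, y), 1-based, to (left, top, right, bottom).

    Bounds are half-open pixel ranges: left <= w < right, top <= h < bottom.
    Assumes width and height divide evenly into num_x and num_y regions.
    """
    if width % num_x or height % num_y:
        raise ValueError("image size must be a multiple of the region count")
    region_w, region_h = width // num_x, height // num_y
    return {
        (x, y): ((x - 1) * region_w, (y - 1) * region_h,
                 x * region_w, y * region_h)
        for x in range(1, num_x + 1)
        for y in range(1, num_y + 1)
    }


# The first example from the text: W=3000, H=2000, X=30, Y=20.
regions = divide_into_regions(3000, 2000, 30, 20)
region_count = len(regions)
```

Each divided region corresponds to one binary variable, so the number of regions is also the number of variables in the optimization problem.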

For each divided region v(x,y), the binary function v^#(x,y) is defined, taking the value 1 if the region is selected as part of the similar region and 0 otherwise. As described later, deciding which divided regions v(x,y) to select thereby reduces to computing the solution of the binary variables v^#(x,y) that minimizes the overall function H_ALL. H_ALL is obtained by weighting the histogram similarity model H_1, the region size model H_2, and the region clumping model H_3, each expressed as a quadratic form in the binary variables v^#(x,y), with the balance coefficients λ_1, λ_2, and λ_3 and adding them.

$$H_{ALL} = \lambda_1 H_1 + \lambda_2 H_2 + \lambda_3 H_3$$
(Equation 2)
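If each model is held as a QUBO parameter matrix over the same variables, the weighted sum of the models is the element-wise weighted sum of their matrices. The sketch below shows this with placeholder 2 × 2 matrices; the matrices, the coefficient values, and the function names are illustrative assumptions and do not represent the actual models H_1 to H_3.

```python
def combine_models(models, weights):
    """Return the matrix sum_k weights[k] * models[k], element by element."""
    n = len(models[0])
    return [
        [sum(w * Qk[i][j] for Qk, w in zip(models, weights)) for j in range(n)]
        for i in range(n)
    ]


def energy(Q, v):
    """Evaluate v^T Q v for a binary vector v."""
    n = len(v)
    return sum(Q[i][j] * v[i] * v[j] for i in range(n) for j in range(n))


# Placeholder matrices standing in for the three models.
Q1 = [[1.0, 0.0], [0.0, 2.0]]
Q2 = [[0.0, 1.0], [1.0, 0.0]]
Q3 = [[-1.0, 0.0], [0.0, 0.0]]
lambdas = (1.0, 0.5, 2.0)

Q_all = combine_models([Q1, Q2, Q3], lambdas)
```

Any constant term in an individual model shifts the value of the overall function but not its minimizer, so it can be tracked separately from the matrix.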

The histogram similarity model formulation unit 131 is described next. It generates an evaluation function for bringing the feature histogram F(x,y) of the divided regions v(x,y) selected in the input image as close as possible to the feature histogram h^* of the target region V^* in the target image. Color features extracted from pixel values are used here as the example feature.
Specifically, when using a 64-color feature with 4 levels for each of R, G, and B, the feature of each pixel (w,h) is represented by a 64-dimensional one-hot vector f(w,h) whose element corresponding to the pixel's color number is 1 and whose other elements are 0. As noted above, any feature may be used instead, such as local features extracted from luminance gradients (that is, shape features). Whatever the type of feature, it can be represented as a vector whose dimension equals the number of feature values.
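The 64-color one-hot feature can be sketched as follows. The specification does not say how the color number is derived; this example assumes 8-bit channels quantized by integer division by 64 and a color number ordered as R, then G, then B, and the function names are illustrative.

```python
LEVELS = 4                 # quantization levels per channel
NUM_COLORS = LEVELS ** 3   # 4 * 4 * 4 = 64 colors


def color_number(r, g, b):
    """Map an 8-bit RGB pixel to a color number in 0..63.

    Assumption: each channel is quantized to 4 levels by value // 64,
    and the levels are combined in R, G, B order.
    """
    step = 256 // LEVELS
    return (r // step) * LEVELS * LEVELS + (g // step) * LEVELS + (b // step)


def one_hot_feature(r, g, b):
    """Return the 64-dimensional one-hot vector f(w, h) for one pixel."""
    feature = [0] * NUM_COLORS
    feature[color_number(r, g, b)] = 1
    return feature


f_white = one_hot_feature(255, 255, 255)
```

Summing these one-hot vectors over the pixels of a region gives the count of each color in that region, which is the basis of the histograms defined next.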

The histogram h^* of the features of the target region V^* can be defined, as a relative frequency normalized by the total occurrence count, by Equation 3 ([Math 1]) below.
Equations in this specification are numbered and cited as Equation n (n a natural number). Equations rendered as images must additionally carry a number in lenticular brackets, so the equation number and the bracketed number differ.
(Equation 3)
Here, |V^*| is an operator giving the number of pixels (w,h) in the target region V^*, and Σ is an operator that sums f(w,h) over the pixels (w,h) in V^*.
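The image of Equation 3 is not reproduced in this text. Based only on the definitions given for |V^*|, Σ, and f(w,h), it presumably has the following form; this is a reconstruction, not a copy of the original equation.

```latex
h^{*} = \frac{1}{|V^{*}|} \sum_{(w,h) \in V^{*}} f(w,h)
```

Because each f(w,h) is a one-hot vector, the elements of h^* under this form sum to 1.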

The feature histogram F(x,y) of each divided region v(x,y) in the input image can likewise be defined, as a relative frequency normalized by the total occurrence count, by Equation 4 below.
(Equation 4)
Here, |v(x,y)| is an operator giving the number of pixels (w,h) in the divided region v(x,y), and Σ is an operator that sums f(w,h) over the pixels (w,h) in the divided region v(x,y).
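The image of Equation 4 is not reproduced in this text; from the definitions above it presumably reads F(x,y) = (1 / |v(x,y)|) Σ f(w,h), with the sum over the pixels of v(x,y). Under that assumption, a region histogram is the count of each color number divided by the number of pixels. The function name and the input format (a flat list of color numbers) are illustrative.

```python
def region_histogram(color_numbers, num_colors=64):
    """Relative frequency of each color number within one region.

    color_numbers holds one color number (0..num_colors-1) per pixel.
    Summing one-hot vectors is equivalent to counting occurrences.
    """
    if not color_numbers:
        raise ValueError("a region must contain at least one pixel")
    counts = [0] * num_colors
    for c in color_numbers:
        counts[c] += 1
    total = len(color_numbers)
    return [count / total for count in counts]


# A four-pixel region: two pixels of color 0, one of color 5, one of color 63.
F_xy = region_histogram([0, 0, 5, 63])
```

The same function applies to the target region V^*, since the target histogram h^* is defined in the same way over the pixels of V^*.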

Next, the XY-dimensional vector F whose elements are the feature histograms F(x,y) of the divided regions v(x,y) is defined as

$$F = \{F(1,1)\ \ F(1,2)\ \cdots\ F(X,Y-1)\ \ F(X,Y)\}$$
(Equation 5)
and the XY-dimensional vector v^# whose elements are the binary function values v^#(x,y) of the divided regions is defined as
$$v^{\#} = \{v^{\#}(1,1)\ \ v^{\#}(1,2)\ \cdots\ v^{\#}(X,Y-1)\ \ v^{\#}(X,Y)\}$$
(Equation 6)
Then, when the target number |V_target| of divided regions v(x,y) to be selected in the input image is set in advance (hereinafter also "when the target number of divided regions is set"), the feature histogram h over the selected divided regions can be defined by Equation 7 below, as the relative frequency obtained by normalizing the sum of the feature histograms F(x,y) of the selected divided regions by the target number |V_target|.
$$h = \frac{1}{|V_{target}|} F v^{\#}$$
(Equation 7)
"When the target number of divided regions is set" refers to cases such as the following. Suppose a carrot contained in the target image is the object to be matched. If the number of pixels occupied by carrots in the input image can be known in advance, the target size of the similar region can be set in advance, and the target number |V_target| can then be computed from that target size.
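Equation 7 can be sketched by treating F as a list of per-region histograms and v^# as a list of 0/1 values in the same order. The function name, the data layout, and the numbers are assumptions chosen for illustration.

```python
def selected_histogram(F, v, target_count):
    """Compute h = (1 / |V_target|) * F v# as in Equation 7.

    F is a list of per-region histograms (each a list of equal length),
    v is a list of 0/1 selection values in the same region order.
    """
    num_bins = len(F[0])
    return [
        sum(F[n][k] * v[n] for n in range(len(F))) / target_count
        for k in range(num_bins)
    ]


# Three regions with two-bin histograms, and a target of two regions.
F = [
    [1.0, 0.0],
    [0.0, 1.0],
    [0.5, 0.5],
]
h_two_selected = selected_histogram(F, [1, 0, 1], target_count=2)
h_three_selected = selected_histogram(F, [1, 1, 1], target_count=2)
```

Note that h sums to 1 only when the number of selected regions equals |V_target|; selecting three regions with a target of two gives a vector that sums to 1.5.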

When a target number of divided regions is set, the histogram similarity model H_11 can be defined by Formula 8.
(Formula 8)
Here, || ||_2^2 denotes the square of the L2 norm || ||_2, and the superscript T on a vector denotes its transpose.
Substituting the histogram h of Formula 7 (the histogram of the selected divided regions when a target number is set) into Formula 8, the histogram similarity model H_11 is expressed by Formula 9.
(Formula 9)
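The images for Formulas 8 and 9 are not reproduced in this text, so the following is only a minimal sketch under a stated assumption: that Formula 8 is the squared L2 distance between the target histogram h^* and the selected-region histogram h of Formula 7. The function names and the toy data are hypothetical and are not taken from the patent.

```python
def selected_histogram(F, v, n_target):
    """Formula 7: h = (1/|V_target|) * F v^#.

    F is a list of per-region histograms (one list of bin values per region);
    v is the list of binary selections v^#(x, y) in the same region order.
    """
    bins = len(F[0])
    return [sum(F[i][b] * v[i] for i in range(len(F))) / n_target
            for b in range(bins)]


def h11(h_star, F, v, n_target):
    """Assumed form of Formula 8: squared L2 distance between h^* and h."""
    h = selected_histogram(F, v, n_target)
    return sum((a - b) ** 2 for a, b in zip(h_star, h))


# Hypothetical toy data: 4 divided regions, 2 histogram bins each.
F = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
h_star = [1.0, 0.0]

good = h11(h_star, F, [1, 1, 0, 0], n_target=2)  # the two regions that match h^*
bad = h11(h_star, F, [0, 0, 1, 1], n_target=2)   # the two regions that do not
```

Under this assumption the matching selection scores lower than the non-matching one, which is the behaviour the text attributes to minimizing H_11.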

On the other hand, when no target number of divided regions is set, the feature histogram h of the selected divided regions can be defined by Formula 10 as the relative frequency obtained by normalizing the sum of the histograms F(x,y) of the selected divided regions by the total number of selected divided regions.
(Formula 10)
Here, Σ is the operator that sums over all divided regions v(x,y) of the input image. Because v^#(x,y) is 1 only for a selected region, the value of the sum is the total number of selected divided regions.
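As a small illustration of the normalization described for Formula 10 (whose image is not reproduced here), the sketch below divides the summed histograms of the selected regions by the number of selected regions instead of by a preset target. The function name and the sample values are hypothetical.

```python
def selected_histogram_free(F, v):
    """Sum of the selected regions' histograms divided by the number selected."""
    n_selected = sum(v)  # equals the sum of v^#(x, y) over all divided regions
    if n_selected == 0:
        raise ValueError("no divided region is selected")
    bins = len(F[0])
    return [sum(F[i][b] * v[i] for i in range(len(F))) / n_selected
            for b in range(bins)]


# Hypothetical toy data: 3 divided regions, 2 histogram bins each.
F = [[2.0, 0.0], [0.0, 2.0], [1.0, 1.0]]
h = selected_histogram_free(F, [1, 0, 1])
```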

When no target number of divided regions is set, the histogram similarity model H_12 can be defined by Formula 11.
(Formula 11)
Substituting the histogram h of Formula 10 (the histogram of the selected divided regions when no target number is set) into Formula 11, the histogram similarity model H_12 is expressed by Formula 12.
(Formula 12)
Here, 1_XY is the XY-dimensional vector whose elements are all 1, and 1_{(XY)×(XY)} is the (XY)×(XY)-dimensional matrix whose elements are all 1. In Formula 12, the symbol consisting of an X inside a circle (⊗) denotes the tensor product.
That is, the tensor product of 1_XY and (h^{*T} F) is a matrix of XY rows, each of which is (h^{*T} F), and (h^{*T} F) 1_{(XY)×(XY)} is an (XY)×(XY)-dimensional matrix whose elements are all (h^{*T} F).
As described above, the histogram similarity model formulation unit 131 generates the histogram similarity model H_11 of Formula 9 when a target number of divided regions is set, and the histogram similarity model H_12 of Formula 12 when it is not. The histogram similarity model formulation unit 131 then outputs H_11 or H_12 to the model variable calculation unit 14.
As noted earlier, minimizing the value of the evaluation function generated by the histogram similarity model formulation unit 131 drives the selection so that the feature histogram of the selected divided regions approaches the feature histogram of the target region.

As described above, the region size model formulation unit 132 generates an evaluation function for determining the number of divided regions to select from the input image. Specifically, it acts either to maximize the number of selected divided regions or to bring that number to the target number of divided regions for detection. In this way, the total size of the divided regions selected from the input image is made as large as possible.

Like the histogram similarity model formulation unit 131, the region size model formulation unit 132 generates the region size model H_21 when a target number of divided regions is set and the region size model H_22 when it is not, and then outputs the generated model to the model variable calculation unit 14.
First, the case in which a target number of divided regions is set is described.

When a target number of divided regions is set, the region size model H_21 is expressed by Formula 13.
(Formula 13)
The region size model H_21 of Formula 13 is the square of the difference between the target number (target size) of divided regions in the input image and the number of divided regions selected in the input image. Minimizing it therefore brings the number of selected divided regions as close as possible to the target number (target size).

On the other hand, when no target number of divided regions is set, the region size model H_22 is expressed by Formula 14.
(Formula 14)
The region size model H_22 of Formula 14 is the total number of divided regions selected from the input image with a negative sign attached. Minimizing it therefore maximizes the number (size) of selected divided regions as far as possible.
The region size model formulation unit 132 outputs to the model variable calculation unit 14 the region size model H_21 of Formula 13 when a target number of divided regions is set, and the region size model H_22 of Formula 14 when it is not.
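The two region size models can be sketched directly from their verbal descriptions above (the formula images themselves are not reproduced in this text). The function names are hypothetical, and the selection is represented simply as a list of 0/1 values v^#(x,y).

```python
def h21(v, n_target):
    """Formula 13 as described: (target count - selected count) squared."""
    return (n_target - sum(v)) ** 2


def h22(v):
    """Formula 14 as described: the selected count with a negative sign."""
    return -sum(v)


v = [1, 0, 1, 1, 0]          # three divided regions selected
on_target = h21(v, 3)        # selected count equals the target
off_target = h21(v, 5)       # two regions short of the target
no_target = h22(v)           # decreases as more regions are selected
```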

Finally, the region clumping model formulation unit 133 is described. As described above, the region clumping model formulation unit 133 generates an evaluation function for determining the clumping degree, which represents the degree to which adjacent divided regions of the input image are selected together. Specifically, the function is formulated so that a pair of adjacent regions contributes a reward term when both are selected or neither is selected, and a penalty term when only one of them is selected. Minimizing the value of this evaluation function therefore favors selections in which many adjacent regions are chosen.
To explain the clumping degree, adjacent divided regions are described first.
Let E be the set of edges connecting two adjacent divided regions, and let each pair of adjacent divided regions be written (v(x,y), v(x',y')), where v(x,y) ≠ v(x',y'). A pair of adjacent regions corresponds one-to-one with the edge the two regions share (also called the "edge connecting the adjacent regions").
The set of adjacent-region pairs is therefore identified with the edge set E.
A divided region on the outermost edge of the image is regarded as adjacent to the image border, and for convenience its outer edge is defined as belonging to the edge set E. In this case, v(x',y') is assumed to lie outside the image. Although v(x',y') is not part of the image, the value of v^#(x',y') is taken to be the same as the value v^#(x,y) of the adjacent region v(x,y).

Figure 3 shows the correspondence between pairs of adjacent divided regions and the edges connecting them. In Figure 3 there are 20 divided regions v(x,y) (x = 1, 2, 3, 4, 5; y = 1, 2, 3, 4). Regarding the outermost divided regions as adjacent to the image border, the border is represented for convenience by v(0,y) and v(6,y) (y = 1, 2, 3, 4) and by v(x,0) and v(x,5) (x = 1, 2, 3, 4, 5). For a fixed value of y there are 6 (= 5 + 1) edges (indicated by arrows) connecting two adjacent regions, and for a fixed value of x there are 5 (= 4 + 1). The number of adjacent-region pairs is therefore (5 + 1) * 4 + 5 * (4 + 1) = 49.
In general, the number of adjacent-region pairs is X * (Y + 1) + (X + 1) * Y.
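The edge count can be checked by enumerating the edge set E explicitly. The sketch below is only an illustration: the function name is hypothetical, and border cells are represented by the coordinates 0, X + 1, and Y + 1, following the convention used for Figure 3.

```python
def edge_set(X, Y):
    """Return E as pairs ((x, y), (x2, y2)) of adjacent cells, border included."""
    edges = []
    for y in range(1, Y + 1):        # horizontal edges: X + 1 for each fixed y
        for x in range(0, X + 1):
            edges.append(((x, y), (x + 1, y)))
    for x in range(1, X + 1):        # vertical edges: Y + 1 for each fixed x
        for y in range(0, Y + 1):
            edges.append(((x, y), (x, y + 1)))
    return edges


E = edge_set(5, 4)
edge_count = len(E)                  # compare with X * (Y + 1) + (X + 1) * Y
```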

The region clumping model formulation unit 133 can define the region clumping model H_3 by Formula 15.
(Formula 15)
Here, Σ is the operator that sums over all pairs of adjacent divided regions of the input image.
The value of v^#(0,y) is taken to be the same as the value v^#(1,y) of the adjacent region v(1,y). Similarly, the value of v^#(X+1,y) is taken to be the same as the value v^#(X,y) of the adjacent region v(X,y). The same applies to v^#(x,0) and v^#(x,Y+1).
Then, when both adjacent regions are selected or neither is selected,
[-(2v^#(x,y)-1) * (2v^#(x',y')-1)] = -1, which acts as a reward term under minimization.
On the other hand, when only one of the adjacent regions is selected,
[-(2v^#(x,y)-1) * (2v^#(x',y')-1)] = 1, which acts as a penalty term under minimization. The region clumping model H_3 therefore works, when minimized, to select adjacent divided regions as far as possible. When many adjacent regions are selected, the clumping degree is high and the value of H_3 is smaller. When many non-adjacent regions are selected, the clumping degree is low and the value of H_3 is larger.

An example of the processing of the region clumping model formulation unit 133 is described with reference to Figures 4A to 4D, for the case X = 5, Y = 4. The selected divided regions are indicated by hatching. White arrows mark pairs of adjacent divided regions in which both or neither is selected, and black arrows mark pairs in which only one is selected.
When the divided regions are selected as in Figures 4A, 4B, 4C, and 4D, the value of H_3 is calculated as follows.

In Figure 4A there are 39 pairs in which both or neither adjacent region is selected (white arrows) and 10 pairs in which only one is selected (black arrows), so
H_3 = -1 * 39 + 1 * 10 = -29.
In Figure 4B there are 37 pairs in which both or neither adjacent region is selected (white arrows) and 12 pairs in which only one is selected (black arrows), so
H_3 = -1 * 37 + 1 * 12 = -25.
In Figure 4C there are 29 pairs in which both or neither adjacent region is selected (white arrows) and 20 pairs in which only one is selected (black arrows), so
H_3 = -1 * 29 + 1 * 20 = -9.
In Figure 4D there are 25 pairs in which both or neither adjacent region is selected (white arrows) and 24 pairs in which only one is selected (black arrows), so
H_3 = -1 * 25 + 1 * 24 = -1.
The value of H_3 is smallest for Figure 4A and largest for Figure 4D. The region clumping model H_3 thus expresses the clumping degree, that is, the degree to which adjacent divided regions of the input image are selected, and minimizing it favors selections in which many adjacent regions are chosen.
This concludes the description of the mathematical modeling unit 13.

The histogram similarity models H_11 and H_12, the region size models H_21 and H_22, and the region clumping model H_3 generated by the mathematical modeling unit 13 are all quadratic in the binary variables v^#(x,y), so they can be converted by algebraic rearrangement into the QUBO representation of Formula (1) described earlier.
It is evident to a person skilled in the art that an expression of degree two or lower in binary variables can be converted to a QUBO representation. For a binary variable q taking the value 0 or 1, q and q squared are equal (0 and 1 are unchanged by squaring), so first-order and second-order terms can be converted into each other. Constant terms can be ignored because they do not affect which assignment of the binary variables minimizes the evaluation function. Since the histogram similarity models H_11 and H_12, the region size models H_21 and H_22, and the region clumping model H_3 are all of degree two or lower in the binary variables v^#(x,y), they can be converted to a QUBO representation.
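As a worked illustration of this conversion, the sketch below expands the region size model (n_target - Σ q_i)^2 into an upper-triangular matrix Q plus a constant, using q_i^2 = q_i to fold the linear terms into the diagonal. The upper-triangular layout Σ_{i≤j} Q_ij q_i q_j is an assumption made for the example and may differ from the layout of Formula (1); the function names are hypothetical.

```python
from itertools import product


def size_model_qubo(n, n_target):
    """Return (Q, c) such that sum_{i<=j} Q[i][j] q_i q_j + c == (n_target - sum(q))**2."""
    Q = [[0] * n for _ in range(n)]
    for i in range(n):
        # q_i**2 == q_i, so the terms q_i**2 - 2 * n_target * q_i share one diagonal entry.
        Q[i][i] = 1 - 2 * n_target
        for j in range(i + 1, n):
            Q[i][j] = 2              # cross terms 2 * q_i * q_j
    return Q, n_target ** 2


def qubo_value(Q, q):
    """Evaluate the upper-triangular quadratic form for a 0/1 assignment q."""
    n = len(q)
    return sum(Q[i][j] * q[i] * q[j] for i in range(n) for j in range(i, n))


Q, c = size_model_qubo(4, 2)
matches = all(qubo_value(Q, q) + c == (2 - sum(q)) ** 2
              for q in product((0, 1), repeat=4))
```

The constant c shifts every assignment by the same amount, which is why it can be dropped without changing the minimizer.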
Next, the model variable calculation unit 14 will be described.

The model variable calculation unit 14 uses the computer 2 to calculate the values of the variables that give the optimal divided regions under the mathematical models generated by the mathematical modeling unit 13. It outputs to the computer 2 the overall function obtained by adding those mathematical models, and obtains the resulting solution for the variables.
When a target number of divided regions is set, the model variable calculation unit 14 instructs the computer 2 to calculate the solution of the binary variables v^#(x,y) that minimizes the overall function H_ALL of Formula 16, obtained by adding the histogram similarity model H_11, the region size model H_21, and the region clumping model H_3 weighted by the balance coefficients λ_1, λ_2, and λ_3.
H_ALL = λ_1 H_11 + λ_2 H_21 + λ_3 H_3
(Formula 16)
Similarly, when no target number of divided regions is set, the model variable calculation unit 14 instructs the computer 2 to calculate the solution of the binary variables v^#(x,y) that minimizes the overall function H_ALL of Formula 17, obtained by adding the histogram similarity model H_12, the region size model H_22, and the region clumping model H_3 weighted by the balance coefficients λ_1, λ_2, and λ_3.
H_ALL = λ_1 H_12 + λ_2 H_22 + λ_3 H_3
(Formula 17)
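To make Formula 16 concrete, the sketch below minimizes H_ALL by exhaustive search on a tiny 3 × 2 grid. Everything specific in it is an assumption for illustration: the grid size, the two-bin histograms, the coefficient values, the squared-L2 form used for H_11, and the brute-force search that stands in for the computer 2. It is not the solver the document describes.

```python
from itertools import product

X, Y = 3, 2                                   # hypothetical tiny grid
cells = [(x, y) for y in range(1, Y + 1) for x in range(1, X + 1)]
# Hypothetical 2-bin histograms: the two left columns resemble the target.
F = {c: ([1.0, 0.0] if c[0] <= 2 else [0.0, 1.0]) for c in cells}
h_star, n_target = [1.0, 0.0], 4


def h11(v):
    # Assumed squared L2 distance between h^* and h = (1/|V_target|) F v^#.
    h = [sum(F[c][b] * v[c] for c in cells) / n_target for b in range(2)]
    return sum((a - b) ** 2 for a, b in zip(h_star, h))


def h21(v):
    return (n_target - sum(v.values())) ** 2


def h3(v):
    def val(x, y):                            # border cells copy their neighbour
        return v[(min(max(x, 1), X), min(max(y, 1), Y))]
    pairs = [((x, y), (x + 1, y)) for y in range(1, Y + 1) for x in range(X + 1)]
    pairs += [((x, y), (x, y + 1)) for x in range(1, X + 1) for y in range(Y + 1)]
    return sum(-(2 * val(*a) - 1) * (2 * val(*b) - 1) for a, b in pairs)


def h_all(v, lam=(1.0, 1.0, 0.1)):
    # Formula 16: weighted sum of the three models.
    return lam[0] * h11(v) + lam[1] * h21(v) + lam[2] * h3(v)


best = min((dict(zip(cells, bits)) for bits in product((0, 1), repeat=len(cells))),
           key=h_all)
selected = sorted(c for c in cells if best[c])
```

With these toy values the minimizer selects the four cells in the two left columns, which match the target histogram, meet the target count, and form one compact block.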

Here, the balance coefficients λ_1, λ_2, and λ_3 are the weights applied to the three models H_1 (H_11 or H_12), H_2 (H_21 or H_22), and H_3, respectively; the larger a coefficient, the more weight is placed on its model. Initial values are set for λ_1, λ_2, and λ_3 in advance, and the computer is made to run the calculation to obtain the solution of the variables v^#(x,y). The initial values are set according to, for example, how closely the detected region should match the histogram of the target region, or how high the clumping degree of the detected region should be.
Using the obtained solution, the three models H_1 (H_11 or H_12), H_2 (H_21 or H_22), and H_3 are evaluated, and the corresponding balance coefficients are adjusted so that the value of each model becomes less than or equal to the threshold set in advance for that model. While the balance coefficients λ_1, λ_2, and λ_3 are being adjusted, the computer repeatedly recalculates (re-solves) the overall function H_ALL. For example, if the value of any of the three models is not less than or equal to its threshold, the balance coefficient corresponding to that model is increased, placing more weight on the model. This can reduce the value of that model.
Adjusting the balance coefficients λ_1, λ_2, and λ_3 in this way allows the overall function H_ALL to be optimized.
An upper limit may be set on the number of times H_ALL is recalculated, so that the calculation terminates when the limit is reached even if the value of one of the three models has not fallen to or below its threshold.
The model variable calculation unit 14 then outputs the solution of the variables v^#(x,y) to the region detection unit 15.

The region detection unit 15 determines the region to be detected as the similar region on the basis of the solution of the variables v^#(x,y) calculated by the model variable calculation unit 14. When the solution is used directly, the divided regions v(x,y) corresponding to it may be taken as the similar region.
Alternatively, the similar region may be obtained by post-processing the divided regions v(x,y) corresponding to the solution. For example, if the set of divided regions corresponding to the solution forms several 4-connected components, the largest 4-connected component may be taken as the similar region. 8-connected components may be used instead of 4-connected components. The similar region may also be obtained by applying to the divided regions corresponding to the solution a closing or opening operation, which uses the dilation and erosion of binary image processing.
The similar region may be represented as data in terms of v(x,y), as data converted to pixel information, or as image data indicating the similar region with a border or the like.
Determining the similar region in this way allows it to be displayed, for example, in an output form suited to the user's purpose.
The region detection unit 15 then outputs the similar region to the output unit 16.
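The largest-4-connected-component post-processing can be sketched with a breadth-first search over the selected cells. The function name and the sample solution are hypothetical; when two components have the same size, this sketch keeps whichever it finds first, a tie-breaking rule the document does not specify.

```python
from collections import deque


def largest_4_connected(selected):
    """Return the largest 4-connected component of a set of (x, y) cells."""
    remaining = set(selected)
    best = set()
    while remaining:
        start = remaining.pop()
        component, queue = {start}, deque([start])
        while queue:
            x, y = queue.popleft()
            # 4-connectivity: only the left, right, upper and lower neighbours.
            for nxt in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                if nxt in remaining:
                    remaining.remove(nxt)
                    component.add(nxt)
                    queue.append(nxt)
        if len(component) > len(best):
            best = component
    return best


# Hypothetical solution: one 3-cell component and two cells that touch only diagonally.
solution = {(1, 1), (2, 1), (2, 2), (5, 4), (4, 3)}
region = largest_4_connected(solution)
```

Switching to 8-connected components would only require adding the four diagonal neighbours to the inner loop.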

The output unit 16 outputs the similar region determined by the region detection unit 15 to the output device 60. The output unit 16 is not limited to the output device 60 and may also output the similar region to a communicably connected device or system (not shown).
This concludes the description of the functions of the similarity region detection system 100. Next, the operation of the similarity region detection device 1 is described.

Figure 5 is a flowchart showing the operation of the similarity region detection device 1.
Referring to Figure 5, in step S01, the input unit 11 receives the target image and the input image via the input device 50.

In step S02, the feature extraction unit 12 extracts features from both the target image and the input image received in step S01. Specifically, it extracts from the target image the image features of the target region containing the object (image) to be detected. It also extracts features from the input image for each divided region v(x,y) obtained by dividing the input image region at a preset size.

In step S03, the mathematical modeling unit 13 generates, on the basis of the features extracted in step S02, a histogram similarity model H_1, a region size model H_2, and a region clumping model H_3 as the mathematical models that determine which divided regions are detected as the similar region. Specifically, when a target number of divided regions is set, the mathematical modeling unit 13 generates the histogram similarity model H_11, the region size model H_21, and the region clumping model H_3. When no target number of divided regions is set, it generates the histogram similarity model H_12, the region size model H_22, and the region clumping model H_3.

In step S04, the model variable calculation unit 14 sets initial values for the balance coefficients λ_1, λ_2, and λ_3 used to calculate the overall function H_ALL from the three models H_1 (H_11 or H_12), H_2 (H_21 or H_22), and H_3 generated in step S03.

In step S05, the model variable calculation unit 14 uses the computer 2 to calculate the solution of the binary variables v^#(x,y) that minimizes the overall function H_ALL, using the three models H_1 (H_11 or H_12), H_2 (H_21 or H_22), and H_3 generated in step S03 and the balance coefficients λ_1, λ_2, and λ_3 set in step S04 or adjusted in step S08 described later.

In step S06, the model variable calculation unit 14 determines whether the number of times the variables minimizing the overall function H_ALL have been calculated is less than or equal to a preset threshold. If the number of calculations is less than or equal to the threshold (Yes), the operation proceeds to step S07. If it exceeds the threshold (No), the operation proceeds to step S09.

In step S07, the model variable calculation unit 14 uses the solution obtained in step S05 to evaluate the three models H_1 (H_11 or H_12), H_2 (H_21 or H_22), and H_3, and determines whether the value of each model is less than or equal to the threshold set in advance for that model. The thresholds for the three models can be set freely according to the desired performance. If the values of all three models are less than or equal to their thresholds (Yes), the operation proceeds to step S09. If the value of any of the three models is not less than or equal to its threshold (No), the operation proceeds to step S08.

In step S08, the model variable calculation unit 14 either increases the balance coefficient of each model whose value is not less than or equal to its threshold by adding a preset positive real number to it, or decreases the balance coefficient of each model whose value is less than or equal to its threshold by adding a preset negative real number to it. For example, if only the value of the histogram similarity model H_1 (H_11 or H_12) is not less than or equal to its threshold, the balance coefficient λ_1 is increased, or the balance coefficients λ_2 and λ_3 are decreased. The operation then returns to step S05.

In step S09, the region detection unit 15 determines the similar region on the basis of the solution of the binary variables v^#(x,y) obtained in step S05.

In step S10, the output unit 16 outputs the similar region determined by the region detection unit 15 to the output device 60.
This concludes the description of the operation of the similarity region detection device 1.
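The loop formed by steps S04 to S08 can be sketched as follows. This is a simplified illustration under several assumptions: a brute-force search stands in for the computer 2, only the "increase the coefficient of a failing model" branch of step S08 is implemented, and the function name, step size, toy models, and thresholds are all hypothetical.

```python
from itertools import product


def adjust_and_solve(models, thresholds, lambdas, n_vars, step=0.5, max_solves=20):
    """Re-solve H_ALL while raising the weight of every model above its threshold."""
    lambdas = list(lambdas)                               # S04: initial coefficients
    solution = None
    for _ in range(max_solves):                           # S06: cap on the number of solves
        solution = min(                                   # S05: minimise H_ALL
            product((0, 1), repeat=n_vars),
            key=lambda v: sum(lam * m(v) for lam, m in zip(lambdas, models)),
        )
        values = [m(solution) for m in models]            # S07: evaluate each model
        over = [val > t for val, t in zip(values, thresholds)]
        if not any(over):
            break                                         # every model meets its threshold
        for k, is_over in enumerate(over):                # S08: raise failing models' weights
            if is_over:
                lambdas[k] += step
    return solution, lambdas


# Toy models on 3 binary variables: a size model with target 2 and a selection cost.
models = [lambda v: (2 - sum(v)) ** 2, lambda v: sum(v)]
solution, lambdas = adjust_and_solve(models, thresholds=[0, 2],
                                     lambdas=[0.1, 1.0], n_vars=3)
```

In this toy run the size model initially misses its threshold because its weight is too small, and two increases of its coefficient are enough for the solver to select exactly two variables.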

According to this embodiment, in the similarity region detection device 1, the feature extraction unit 12 receives a target image V^* and an input image, extracts the feature h^* from the target image V^*, and extracts a feature from each of the divided regions v(x,y) obtained by dividing the input image at a preset size. The mathematical modeling unit 13 generates, from the feature extracted from the target image and the features extracted from the divided regions of the input image, a mathematical model whose variables indicate which of the divided regions are selected. The model variable calculation unit 14 obtains the solution of the variables v^#(x,y) that minimizes the mathematical model H such that the value of H satisfies a preset threshold condition, and the region detection unit 15 determines the similar region identified by that solution.
As a result, the similarity region detection device 1 detects regions whose image-feature histograms are similar with a constant amount of computation regardless of the position or shape of the search region, and can thereby detect the region of the input image that is similar to the target image.

The mathematical modeling unit 13 may set in advance the target size of the similar region to be determined by the region detection unit 15. If the size of the detection target in the input image is known, this reduces the amount of computation.

The mathematical model H comprises a histogram similarity model H_1, a region size model H_2, and a region clumping model H_3. Minimizing the histogram similarity model H_1 brings the feature histogram h of the selected divided regions closer to the histogram of the feature h^* extracted from the target image. Minimizing the region size model H_2 makes the number of divided regions selected satisfy a preset condition. Minimizing the region clumping model H_3 raises the degree to which adjacent divided regions are selected. Together these allow an appropriate similar region to be detected.

The model variable calculation unit 14 may obtain a solution for the variables v^#(x, y) that minimizes an overall function H_ALL so that it satisfies a predetermined threshold condition, where H_ALL is calculated by weighting the histogram similarity model H_1, the region size model H_2, and the region clumping model H_3 by separate balance coefficients λ_1, λ_2, and λ_3, respectively, and summing the results.
In this way, the overall function H_ALL, which is built from the balance coefficients λ_1, λ_2, and λ_3, can be optimized.
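The weighted sum can be sketched as follows. This is a minimal illustration, not the device's solver: the reference sign list names an Ising machine as computer 2, whereas the sketch simply tries every binary assignment, which is feasible only for a handful of variables. The function names `overall` and `minimize_exhaustively`, and the representation of each model as a Python callable, are assumptions.

```python
from itertools import product


def overall(v, terms, lambdas):
    """H_ALL(v) = sum_i lambda_i * H_i(v) for the given term functions."""
    return sum(lam * term(v) for lam, term in zip(lambdas, terms))


def minimize_exhaustively(n_vars, terms, lambdas):
    """Return the binary tuple v that minimizes H_ALL, with its value.

    Exhaustive search is a stand-in for whatever optimizer is actually
    used; its cost doubles with every added variable.
    """
    best_v, best_value = None, float("inf")
    for v in product((0, 1), repeat=n_vars):
        value = overall(v, terms, lambdas)
        if value < best_value:
            best_v, best_value = v, value
    return best_v, best_value
```

For example, with four blocks in a row, a toy mismatch term `sum(m * s for m, s in zip([3, 0, 0, 2], v))`, a size term `(sum(v) - 2) ** 2`, and a clumping term that subtracts one for each adjacent selected pair, equal coefficients select the two middle blocks.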

The model variable calculation unit may further calculate the values of the histogram similarity model, the region size model, and the region clumping model using the solution for the variables obtained by minimizing the overall function so as to satisfy the predetermined threshold condition, and may adjust the balance coefficients so that the value of each model becomes less than or equal to its own predetermined threshold.
Adjusting the balance coefficients λ_1, λ_2, and λ_3 in this way allows the overall function H_ALL to be optimized.
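One possible reading of this adjustment is the loop sketched below: solve, evaluate each model at the solution, and increase the balance coefficient of any model whose value exceeds its threshold (the claims state that the corresponding coefficient is increased). The multiplicative `factor`, the `max_rounds` cap, and the `solve` callable are assumptions introduced for illustration; the text does not specify how much a coefficient is increased or when the loop stops.

```python
def tune_balance_coefficients(solve, terms, lambdas, thresholds,
                              factor=2.0, max_rounds=10):
    """Re-solve while any model value exceeds its threshold.

    solve(terms, lambdas) is a hypothetical callable returning the
    minimizing assignment v for the current coefficients.
    """
    lambdas = list(lambdas)
    v = None
    for _ in range(max_rounds):
        v = solve(terms, lambdas)
        values = [term(v) for term in terms]
        violated = [i for i, (value, limit) in enumerate(zip(values, thresholds))
                    if value > limit]
        if not violated:
            break  # every model value is within its threshold
        for i in violated:
            lambdas[i] *= factor  # assumed update rule: scale up the weight
    return v, lambdas
```

Raising only the coefficients of the violated models shifts the next minimization toward satisfying them, at the possible cost of the other terms, which is why the check is repeated after each solve.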

The region detection unit may determine the similar region by post-processing the divided regions identified by the solution for the variables obtained by the model variable calculation unit.
Specifically, the post-processing may be any of the following: a process that takes the largest of the 4-connected components formed by the divided regions as the similar region; a process that takes the largest of the 8-connected components formed by the divided regions as the similar region; a closing process that applies dilation to the binary image several times and then applies erosion the same number of times; or an opening process that applies erosion to the binary image several times and then applies dilation the same number of times.
Determining the similar region in this way makes it possible, for example, to display the similar region in an output form that suits the user's purpose.
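The four post-processing options can be sketched in plain Python on a binary grid in which 1 marks a selected divided region. This is a minimal sketch under stated assumptions: the 3x3 structuring element and the rule that cells outside the grid are ignored are not specified in the text, and all function names are hypothetical.

```python
from collections import deque


def largest_component(mask, connectivity=4):
    """Keep only the largest 4- or 8-connected component of a binary grid."""
    if connectivity == 4:
        steps = [(1, 0), (-1, 0), (0, 1), (0, -1)]
    else:
        steps = [(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1)
                 if (dy, dx) != (0, 0)]
    rows, cols = len(mask), len(mask[0])
    seen, best = set(), []
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] and (r, c) not in seen:
                seen.add((r, c))
                queue, component = deque([(r, c)]), []
                while queue:  # breadth-first flood fill from (r, c)
                    y, x = queue.popleft()
                    component.append((y, x))
                    for dy, dx in steps:
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and mask[ny][nx] and (ny, nx) not in seen):
                            seen.add((ny, nx))
                            queue.append((ny, nx))
                if len(component) > len(best):
                    best = component
    keep = set(best)
    return [[1 if (r, c) in keep else 0 for c in range(cols)]
            for r in range(rows)]


def _morph(mask, hit):
    """One 3x3 pass: dilation when hit=1, erosion when hit=0.
    Cells outside the grid are ignored (an assumed border rule)."""
    rows, cols = len(mask), len(mask[0])
    out = []
    for r in range(rows):
        row = []
        for c in range(cols):
            window = [mask[y][x]
                      for y in range(max(0, r - 1), min(rows, r + 2))
                      for x in range(max(0, c - 1), min(cols, c + 2))]
            row.append(hit if hit in window else 1 - hit)
        out.append(row)
    return out


def closing(mask, n=1):
    """n dilations followed by n erosions; fills small gaps."""
    for _ in range(n):
        mask = _morph(mask, 1)
    for _ in range(n):
        mask = _morph(mask, 0)
    return mask


def opening(mask, n=1):
    """n erosions followed by n dilations; removes small isolated blocks."""
    for _ in range(n):
        mask = _morph(mask, 0)
    for _ in range(n):
        mask = _morph(mask, 1)
    return mask
```

The choice between 4- and 8-connectivity matters when selected blocks touch only at corners: a diagonal chain is a single component under 8-connectivity but separate single-block components under 4-connectivity.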

Furthermore, the similarity region detection program according to the present invention causes a computer to function as the similarity region detection device.

Although embodiments of the present invention have been described above, the present invention is not limited to those embodiments. Furthermore, the effects described in these embodiments merely list the most preferable effects arising from the present invention, and the effects of the present invention are not limited to those described here.

In this embodiment, the model variable calculation unit 14 calculates the histogram similarity model H_1 using color features extracted from pixel values. However, as mentioned above, the features may instead be, for example, local features extracted from luminance gradients (that is, shape features), or local features calculated using HOG (Histogram of Oriented Gradients) or SIFT (Scale-Invariant Feature Transform).
In other words, two or more histogram similarity models H_{1-1}, H_{1-2}, and so on can be generated as the histogram similarity model H_1, each based on a different feature.
In that case, the variables v^#(x, y) may be obtained by minimizing the overall function H_ALL shown in Equation 18, in which the two or more histogram similarity models H_{1-1}, H_{1-2}, and so on, the region size model H_2, and the region clumping model H_3 are summed using the balance coefficients λ_{1-1}, λ_{1-2}, λ_2, and λ_3.
H_ALL = λ_{1-1} H_{1-1} + λ_{1-2} H_{1-2} + λ_2 H_2 + λ_3 H_3
(Equation 18)
A similar region computed with an overall function that uses histogram similarity models H_{1-1} and H_{1-2} built from two or more types of features therefore takes all of those feature types into account.

This embodiment has mainly described the configuration and operation of the similarity region detection device 1, but the present invention is not limited to this. It may also be configured as a method or a program that comprises the respective components and outputs the similar region to the output device 60.

Furthermore, the functions of the similarity region detection device 1 may be realized by recording a program that implements them on a computer-readable recording medium, and having a computer system read and execute the program recorded on that medium.

Here, "computer system" includes an OS and hardware such as peripheral devices. A "computer-readable recording medium" refers to a portable medium such as a flexible disk, a magneto-optical disk, a ROM, or a CD-ROM, or to a storage device such as a hard disk built into a computer system.

Furthermore, a "computer-readable recording medium" may also include a medium that holds a program dynamically for a short time, such as a communication line used when a program is transmitted over a network such as the Internet or over a communication line such as a telephone line, and a medium that holds a program for a certain period of time, such as volatile memory inside the computer system that serves as the server or client in that case. The program may implement only some of the functions described above, or may implement those functions in combination with a program already recorded in the computer system.

100 Similarity region detection system
1 Similarity region detection device
10 Control unit
11 Input unit
12 Feature extraction unit
13 Mathematical modeling unit
131 Histogram similarity model formulation unit
132 Region size model formulation unit
133 Region clumping model formulation unit
14 Model variable calculation unit
15 Region detection unit
16 Output unit
20 Storage unit
50 Input device
60 Output device
2 Computer (Ising machine)

Claims

A similarity region detection device comprising:
an input unit that receives a target image and an input image;
a feature extraction unit that extracts feature quantities from the target image and from each of the divided regions obtained by dividing the input image into regions of a preset size;
a mathematical modeling unit that generates a mathematical model, whose variables indicate which of the divided regions are selected, based on the feature quantities extracted by the feature extraction unit from the target image and from each divided region of the input image;
a model variable calculation unit that obtains a solution for the variables that minimizes the mathematical model so that the value of the mathematical model generated by the mathematical modeling unit satisfies a predetermined threshold condition; and
a region detection unit that determines a similar region identified by the solution for the variables obtained by the model variable calculation unit,
wherein the similarity region detection device outputs the similar region determined by the region detection unit.

The similarity region detection device according to claim 1, wherein the mathematical modeling unit sets in advance a target size for the similar region to be determined by the region detection unit.

The similarity region detection device according to claim 1 or claim 2, wherein the mathematical model comprises a histogram similarity model, a region size model, and a region clumping model,
the histogram similarity model, when minimized, acts so that the histogram of the feature quantities of the divided regions approaches the histogram of the feature quantities extracted from the target image,
the region size model, when minimized, acts so that the number of divided regions selected from among the divided regions satisfies a predetermined condition, and
the region clumping model, when minimized, acts to increase the degree to which adjacent divided regions are selected from among the divided regions.

The similarity region detection device according to claim 3, wherein the model variable calculation unit obtains a solution for the variables that minimizes, so as to satisfy a predetermined threshold condition, an overall function calculated by weighting the histogram similarity model, the region size model, and the region clumping model by respective, mutually different balance coefficients and summing the results.

The similarity region detection device according to claim 4, wherein the model variable calculation unit further calculates the values of the histogram similarity model, the region size model, and the region clumping model using the solution for the variables obtained by minimizing the overall function so as to satisfy the predetermined threshold condition, and adjusts the balance coefficients so that the value of each model becomes less than or equal to a respective predetermined threshold.

The similarity region detection device according to claim 5, wherein the model variable calculation unit calculates the values of the histogram similarity model, the region size model, and the region clumping model using the solution for the variables obtained by minimizing the overall function so as to satisfy the predetermined threshold condition, and, if the value of any of the models is not less than or equal to its threshold, increases the balance coefficient corresponding to that model.

The similarity region detection device according to claim 1 or claim 2, wherein the region detection unit determines the similar region by post-processing the divided regions identified by the solution for the variables obtained by the model variable calculation unit.

The similarity region detection device according to claim 7, wherein the post-processing is any of the following: a process that takes the largest of the 4-connected components formed by the divided regions as the similar region; a process that takes the largest of the 8-connected components formed by the divided regions as the similar region; a closing process that applies dilation to the binary image several times and then applies erosion the same number of times; or an opening process that applies erosion to the binary image several times and then applies dilation the same number of times.

A similarity region detection program for causing a computer to function as the similarity region detection device according to claim 1 or claim 2.