JP7637563B2

JP7637563B2 - Recognition processing method and recognition processing device

Info

Publication number: JP7637563B2
Application number: JP2021088017A
Authority: JP
Inventors: 大佑萩原; 宣隆木村; 泰樹矢野; 信博知原
Original assignee: Hitachi Ltd
Current assignee: Hitachi Ltd
Priority date: 2021-05-25
Filing date: 2021-05-25
Publication date: 2025-02-28
Anticipated expiration: 2041-05-25
Also published as: WO2022249911A1; JP2022181142A

Description

The present invention relates to a recognition processing method and a recognition processing device, and in particular to a technique for determining parameters in recognition processing of many types of items.

A picking robot is a kind of intelligent robot that has a robot arm with a gripper at its tip and that sorts items and palletizes or depalletizes cases based on information acquired by sensors. Such robots are used, for example, in logistics warehouses. The performance of a picking robot depends largely on the object recognition processing used to analyze the sensor information.

Object recognition commonly extracts features from image data acquired by a sensor and estimates the position of the target object in the image data from those features. The design of these features typically involves several adjustable parameters that affect recognition performance. Appropriate parameter values for a group of target items can be calculated automatically by a computer, provided that a model of each target item is available together with image data in which the item appears and its position is known (hereinafter, teaching data). However, the models and teaching data must be prepared manually. Teaching data in particular is needed in large quantities for each item to ensure the robustness of the calculated parameter values. Each item must therefore be photographed and annotated with its position, which requires considerable manpower and labor cost. In a logistics warehouse, for example, the number of items that require parameter determination is enormous, so the parameter setting work must be carried out efficiently.

To perform parameter setting efficiently for many types of items, one approach is to classify the items into several classes and set appropriate parameter values for each class. For example, Patent Document 1 discloses a parameter setting method that identifies the class into which a recognition target falls and adopts the parameter values associated with that class as the appropriate parameter values for the target, thereby enabling highly accurate recognition.

Patent Document 1: JP 2020-68008 A

The parameter setting described in Patent Document 1 assumes highly abstract patterns such as people and automobiles. It is therefore relatively easy to decide which patterns to prepare and how to classify the recognition targets. However, when the diversity within a single pattern is large, it becomes difficult to achieve high recognition performance with a single set of parameter values. For example, cardboard boxes handled in a logistics warehouse vary widely in size and printed design, so the patterns must be subdivided further. As the subdivision progresses, however, an appropriate pattern classification drifts away from human intuition, and it becomes non-obvious which patterns to prepare and how to classify the items. In particular, to make use of pattern information when determining appropriate parameter values for a new item, it is important to determine correctly which pattern the item belongs to.

An object of the present invention is therefore to make it possible to determine item recognition parameters for many types of items using a small amount of teaching data.

A preferred example of the recognition processing method according to the present invention is a recognition processing method in which a computer executes recognition processing of items, the method comprising:
storing, in a storage unit, a plurality of patterns, each of which includes a plurality of registered items and has a representative parameter value;
receiving information on a new item, the information including at least a model of the new item and a plurality of scene images;
a recognition processing step of performing recognition processing on the plurality of scene images of the new item using the representative parameter value of each of the plurality of patterns;
selecting, as the most similar pattern for the new item, the pattern corresponding to the representative parameter value judged best as a result of the recognition processing; and
when the result of the recognition processing of the new item using the representative parameter value of the selected most similar pattern is good, determining that representative parameter value as a parameter value with which the new item can be recognized.

A preferred example of the recognition processing device according to the present invention comprises:
a storage unit that stores a plurality of patterns, each of which includes a plurality of registered items and has a representative parameter value;
a reception unit that receives information on a new item, the information including at least a model of the new item and a plurality of scene images;
a recognition processing unit that performs recognition processing on the plurality of scene images of the new item using the representative parameter value of each of the plurality of patterns;
a determination unit that selects, as the most similar pattern for the new item, the pattern corresponding to the representative parameter value judged best as a result of the recognition processing by the recognition processing unit; and
an output unit that, when the result of the recognition processing of the new item using the representative parameter value of the selected most similar pattern is good, outputs that representative parameter value as a parameter value with which the new item can be recognized.
The present invention can also be understood as a program that causes a computer to execute the above recognition processing method.

According to the present invention, item recognition parameters for many types of items can be determined using a small amount of teaching data.

FIG. 1 is a diagram showing an overview of item recognition processing according to Embodiment 1.
FIG. 2 is a diagram showing an example of parameter calculation.
FIG. 3 is a diagram showing an example of pattern classification of a group of items.
FIG. 4 is a diagram showing the functional blocks of the item recognition processing device.
FIG. 5 is a diagram showing the parameter determination processing operation in the item recognition processing device.
FIG. 6 is a diagram showing an example of recognition processing using representative parameters.
FIG. 7 is a diagram showing an example of judging the result of recognition processing using representative parameters.
FIG. 8 is a diagram showing an example of parameter determination based on the representative parameter value of the most similar pattern.
FIG. 9 is a diagram showing an example of a method for adding item position information to a scene image.
FIG. 10 is a diagram showing an example of simultaneous optimization.
FIG. 11 is a diagram showing an example of parameter determination based on the result of simultaneous optimization.
FIG. 12 is a diagram showing an example of parameter determination based on the result of individual optimization.
FIG. 13 is a diagram showing an example of the hardware configuration of the item recognition processing device.
FIG. 14 is a diagram showing the parameter determination processing operation according to Embodiment 2.
FIG. 15 is a diagram showing an example of parameter determination and pattern update based on the representative parameter value of the most similar pattern according to Embodiment 2.
FIG. 16 is a diagram showing an example of parameter determination and pattern update based on the result of simultaneous optimization according to Embodiment 2.
FIG. 17 is a diagram showing an example of parameter determination and pattern update based on the result of individual optimization according to Embodiment 2.
FIG. 18 is a diagram showing the parameter determination processing operation according to Embodiment 3.
FIG. 19 is a perspective view showing an application example of the item recognition processing device.

Preferred embodiments of the present invention are described below with reference to the drawings. In the description of the embodiments, the same parts are given the same reference numerals across the drawings, and duplicate descriptions may be omitted.

The parameter determination processing in the item recognition processing device according to Embodiment 1 is described below.

FIG. 1 shows an overview of item recognition processing realized by computer processing.
The computer executes the recognition processing with three inputs: a model of the item to be recognized (hereinafter simply the target item), a scene image containing the target item acquired by an imaging means such as a camera, and the parameter values of the recognition processing. It outputs estimated position information of the target item in the scene image. The processing procedure and the details of each piece of data depend on the recognition method used. The item model is data that characterizes the item, such as CAD data of the item or features extracted from images of the item. The scene image is, for example, an RGB image, a depth image, or a pair of the two. A scene image generally contains one or more target items, but the recognition processing is not impaired if it contains none.

The parameter value θ of the recognition processing determines performance measures such as recognition rate and recognition speed, and is generally expressed as a multidimensional vector. The estimated position information that is output may be, for example, a set of pixel values of the image when the item position on a two-dimensional image is of interest, or a set of coordinate values in three-dimensional space when the item position and orientation in three-dimensional space are of interest. These are numerical data, but in some cases they can be drawn on the scene image as shown in FIG. 1.

When item position information is attached to the scene image, whether recognition succeeded or failed can be judged by comparing that item position information with the estimated position information that was output. Possible evaluation indices include IoU (Intersection over Union) when the item position on a two-dimensional image is of interest, and mean squared error when the item position and orientation in three-dimensional space are of interest. The value of the evaluation index is compared with a threshold.
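The success judgment described above can be sketched as follows for the two-dimensional case. This is a minimal illustration, not the method of the patent: it assumes that both the annotated and the estimated positions are axis-aligned boxes given as (x_min, y_min, x_max, y_max), and the function names and the threshold of 0.5 are hypothetical.

```python
def iou(box_a, box_b):
    """Intersection over Union of two axis-aligned boxes.

    Each box is (x_min, y_min, x_max, y_max). The box representation is an
    assumption made for this sketch.
    """
    ax0, ay0, ax1, ay1 = box_a
    bx0, by0, bx1, by1 = box_b
    # Width and height of the overlap region (zero if the boxes do not overlap).
    inter_w = max(0.0, min(ax1, bx1) - max(ax0, bx0))
    inter_h = max(0.0, min(ay1, by1) - max(ay0, by0))
    inter = inter_w * inter_h
    union = (ax1 - ax0) * (ay1 - ay0) + (bx1 - bx0) * (by1 - by0) - inter
    return inter / union if union > 0 else 0.0


def is_recognition_success(estimated_box, annotated_box, threshold=0.5):
    """Judge success by comparing the IoU with a threshold (0.5 is hypothetical)."""
    return iou(estimated_box, annotated_box) >= threshold
```

For a three-dimensional pose, the same structure would apply with the mean squared error in place of IoU and with the comparison reversed, since a smaller error is better.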

On the other hand, when no item position information is attached to the scene image, an operator can judge success or failure by looking at the display, provided that the estimated position information can be shown on a display device in a form visible to humans. In this case the item position information of the scene image is not needed, which saves manpower and labor cost.

An example of a recognition method is described here. To obtain the three-dimensional position and orientation of a target item contained in a scene image, a pair of an RGB image and a depth image is used as the scene image. To create the model of the target item, SIFT feature points and SIFT features are first calculated from two-dimensional images of the target item taken from various directions. SIFT (Scale-Invariant Feature Transform) is a type of image feature known as a local feature. A SIFT feature point is expressed as a pixel position in the two-dimensional image, and a SIFT feature is expressed as a real-valued vector. A CAD model of the target item is prepared, and the model of the target item is formed by linking each SIFT feature to the data point on the CAD model that corresponds to its SIFT feature point.

When the computer executes the recognition processing, SIFT feature points and SIFT features are calculated from the RGB scene image, and the three-dimensional coordinates corresponding to those feature points are obtained from the depth scene image. The correspondence between the three-dimensional feature points of the model and those of the scene image is then determined from the feature values, and a transformation matrix connecting the two is calculated from that correspondence. This transformation matrix corresponds to the estimated position information described above. In this recognition method, the parameters of the recognition processing include those related to the calculation of SIFT feature points and features and those related to deriving the transformation matrix from the correspondence of three-dimensional feature points. The processing procedure and effects of this embodiment are not limited to the above recognition method.
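The step of finding correspondences from feature values can be illustrated with a nearest-neighbour match over descriptor vectors. The text does not specify a matching rule, so the sketch below assumes a distance-ratio test of the kind often used with SIFT; `match_features` and the default `ratio` of 0.8 are hypothetical, and `ratio` is an example of a tunable value that would belong to the parameter vector θ.

```python
import math


def match_features(model_descriptors, scene_descriptors, ratio=0.8):
    """Return (model_index, scene_index) pairs of matched descriptors.

    A model descriptor is matched to its nearest scene descriptor only if that
    nearest distance is clearly smaller than the second-nearest distance.
    The ratio test and its default value are assumptions of this sketch.
    """
    matches = []
    for i, desc in enumerate(model_descriptors):
        # Distances from this model descriptor to every scene descriptor.
        dists = sorted(
            (math.dist(desc, s), j) for j, s in enumerate(scene_descriptors)
        )
        if not dists:
            continue
        best_dist, best_j = dists[0]
        # Accept an unambiguous nearest neighbour; reject near-ties.
        if len(dists) == 1 or best_dist < ratio * dists[1][0]:
            matches.append((i, best_j))
    return matches
```

Given such index pairs and the three-dimensional coordinates of the matched points, a transformation between model and scene could then be estimated; that estimation step is outside this sketch.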

FIG. 2 shows an example of calculating appropriate parameter values for the recognition processing.
The computer executes an optimization calculation on input data consisting of a model of the target item, multiple scene images containing the item, and the item position information on those scene images, and it outputs parameter values for the recognition processing. The optimization calculation searches for the parameter values that maximize an evaluation function, which returns a value depending on the parameter values, following the procedure of the optimization method in use. Parameter values suited to a given purpose can thus be obtained through the design of the evaluation function. For example, if the evaluation function is the recognition rate over the input scene images, the resulting parameter values can be expected to give as high a recognition rate as possible on those images. Furthermore, by adding a term that depends on the recognition processing speed, parameter values that allow recognition to run as fast as possible can be calculated.
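One possible shape for such an evaluation function is sketched below. The text says only that a speed-dependent term can be added; the linear penalty on mean processing time, the name `evaluation_value`, and the `time_weight` argument are assumptions made for illustration.

```python
def evaluation_value(outcomes, time_weight=0.0):
    """Evaluation value for one parameter setting.

    outcomes: one (success, seconds) pair per scene image, where success is a
    bool and seconds is the recognition time for that scene.
    time_weight: hypothetical weight of the speed term; 0.0 gives the plain
    recognition rate.
    """
    if not outcomes:
        raise ValueError("at least one scene result is required")
    n = len(outcomes)
    recognition_rate = sum(1 for ok, _ in outcomes if ok) / n
    mean_seconds = sum(seconds for _, seconds in outcomes) / n
    # Higher is better: reward successful recognition, penalize slow processing.
    return recognition_rate - time_weight * mean_seconds
```

With `time_weight=0.0` the function reduces to the recognition rate assumed in the rest of the description.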

The following description assumes an evaluation function whose purpose is to obtain parameter values that make the recognition rate over the input scene images as high as possible. Note that, in general, the greater the diversity of the input scene images, the wider the range over which the resulting parameter values are applicable (that is, the more robust they are). In other words, obtaining robust parameter values requires a large number of scene images annotated with item position information (teaching data), which incurs considerable manpower and labor cost. As described later, this embodiment includes a processing procedure that makes it possible to determine highly robust parameters even with a small amount of teaching data.

Examples of optimization methods are described here. They include grid search and Bayesian optimization. Grid search, for example, divides the parameter space into a grid of suitable size, calculates the evaluation function value at the parameter values of each grid point, and selects the parameter values that give the maximum. Bayesian optimization uses the parameter values already evaluated and their evaluation function values to search for parameter values that are expected to yield a larger evaluation function value. When optimizing recognition parameters, calculating the evaluation function value usually requires running the recognition processing with the given parameter values, so the computational cost is relatively high. An optimization method that is as efficient as possible is therefore preferable. The processing procedure and effects of this embodiment are not limited to the above optimization methods.
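The grid search described above can be sketched in a few lines. The names `grid_search`, `evaluate`, and `grid` are hypothetical; `evaluate` stands in for running the recognition processing on the teaching data with a given parameter setting and returning the evaluation function value.

```python
import itertools


def grid_search(evaluate, grid):
    """Exhaustively evaluate every grid point and keep the best one.

    evaluate: callable taking a dict of parameter values and returning a score
              (a stand-in for the evaluation function; higher is better).
    grid:     dict mapping each parameter name to its list of candidate values.
    Returns (best_params, best_score).
    """
    names = list(grid)
    best_params, best_score = None, float("-inf")
    for values in itertools.product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        score = evaluate(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score
```

The number of evaluations is the product of the candidate counts per parameter, which is why the text prefers more efficient methods when each evaluation involves a full recognition run.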

The example of FIG. 2 describes input data for a single type of target item, but two or more types of items may be used. In the examples described later, the above processing is used to find parameter values with which a group of different item types can all be recognized.

FIG. 3 shows an example of pattern classification of a group of items. In FIG. 3, the horizontal axis lists the individual items of the group along with their patterns, and the vertical axis represents the parameter value. The vertical bar above each item represents the range of parameter values that achieves the required recognition performance for that item. For simplicity a one-dimensional parameter is assumed here, but the same reasoning applies in two or more dimensions. Note that the items shown are examples chosen to make the explanation easy to follow and do not accurately reflect real items. The parameter values that satisfy the required recognition performance vary with that requirement, but they generally span a finite range. Different items can therefore sometimes achieve the required recognition performance with the same parameter value. This embodiment focuses on this point and treats a group of items for which a common parameter value achieving the required recognition performance can be found as one pattern. In the example of FIG. 3, the five items are classified into two patterns: pattern A consisting of three items and pattern B consisting of two items. The common parameter value found for each pattern is called its representative parameter value. This embodiment proposes a concrete way of using these patterns to determine appropriate parameter values for items.
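For the one-dimensional case in FIG. 3, the idea of grouping items that share an acceptable parameter value can be illustrated as follows. The text does not prescribe a grouping procedure, and in practice the acceptable ranges are not known in advance; this sketch assumes each item's range is given as an interval (lo, hi) and uses a greedy sweep that yields the fewest groups for intervals on a line. The function name and the choice of the midpoint as representative value are hypothetical.

```python
def classify_patterns(ranges):
    """Group items whose acceptable parameter intervals share a common value.

    ranges: dict mapping item name to (lo, hi), the interval of parameter
            values assumed to meet the required recognition performance.
    Returns a list of patterns, each a dict with the member items, the common
    interval [lo, hi], and a representative value (midpoint, an assumption).
    """
    patterns = []
    # Sweep items in order of increasing upper bound.
    for item, (lo, hi) in sorted(ranges.items(), key=lambda kv: kv[1][1]):
        last = patterns[-1] if patterns else None
        if last is not None and lo <= last["hi"]:
            # The interval still overlaps the pattern's common range: join it.
            last["items"].append(item)
            last["lo"] = max(last["lo"], lo)
        else:
            patterns.append({"items": [item], "lo": lo, "hi": hi})
    for pattern in patterns:
        pattern["representative"] = (pattern["lo"] + pattern["hi"]) / 2
    return patterns
```

With five items whose intervals are (1, 4), (2, 5), (3, 6), (7, 9), and (8, 10), the sketch produces one pattern of three items and one of two, mirroring the split into patterns A and B in the figure.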

Some further remarks on this pattern classification are in order. First, the pattern classification in this embodiment generally does not match human intuition. Because its criterion is that items for which a parameter value simultaneously achieving the required recognition performance exists belong to the same pattern, the classification is one that suits the purpose of parameter determination. Moreover, because this criterion is explicit, it is possible to determine appropriately which pattern a target item with newly set parameters belongs to, as described later. Second, the pattern classification is not unique. It will become clear later that keeping the number of patterns as small as possible is important, but the lack of uniqueness does not diminish the effect of this embodiment. Third, even for the same pattern classification there is freedom in choosing the representative parameter values. This likewise does not diminish the effect of this embodiment.

FIG. 4 shows the functional blocks of the item recognition processing device.
The reception unit 401 receives input of information on a new item (for example, model information and scene images). The reception unit may also be called an input unit. The recognition processing unit 402 performs item recognition processing on the information received by the reception unit 401. The target item, scene images, and parameter values depend on the unit from which control was transferred. The representative parameter value evaluation unit 403 evaluates the performance of each representative parameter value for the new item. For each representative parameter value, recognition processing is executed and a recognition rate is calculated from the results of the recognition success/failure signal reception unit 405.

The most similar pattern determination unit 404 compares the recognition rates of the representative parameter values of the multiple patterns and determines the pattern whose representative parameter value gives the highest recognition rate as the most similar pattern. If that recognition rate is at or above the required value, the result is passed to the optimal parameter value output unit 409; if it is below the required value, control moves to the annotation signal reception unit 406.

The recognition success/failure signal reception unit 405 receives a signal indicating whether the recognition result from the recognition processing unit 402 is a success or a failure. In one example, an image of the recognition result is shown on a display device, an operator looks at the image and judges whether the recognition succeeded, and the judgment is received through the input unit.

The annotation signal reception unit 406 receives annotation signals for some of the scenes in which recognition of the new item failed with the representative parameter value of the most similar pattern. When control returns from the recognition performance evaluation unit 408 because the result of simultaneous optimization by the optimization unit 407 fell below the required value, scenes in which the new item was recognized successfully are also included as targets. If failed scenes of the new item that have not yet been given correct-answer information remain at this stage, the annotation signal reception unit 406 can receive annotation signals for some of them. In this description, simultaneous optimization means that, when the recognition rate of the new item does not satisfy the required performance, the most similar pattern for the new item is selected, the new item is added to that pattern, and the representative parameter value is replaced (updated); details are given later.

The optimization unit 407 executes simultaneous optimization using the scene images of the new item to which correct-answer information was added via the annotation signal reception unit 406 together with the information of the most similar pattern. Alternatively, it executes individual optimization based on the information of the new item alone. The calculated parameter values are evaluated by the recognition performance evaluation unit 408. In this description, individual optimization means creating a new pattern that satisfies the required performance based on the information of the new item; details are given later.

The recognition performance evaluation unit 408 evaluates the recognition rate that the parameter values calculated by the optimization unit 407 achieve for the new item and checks whether it satisfies the required value. If the result of simultaneous optimization does not satisfy the required value, control returns to the optimization unit 407, which then executes individual optimization.

The output unit 409 outputs the calculated optimal parameter values of the new item, for example by showing them on a display device. If the recognition rate is below the required value, the item is judged unrecognizable and control returns to the optimization unit 407. The pattern information DB 410 is a storage unit that stores information on the registered items (for example, model information and scene images) and the pattern information extracted from it. The hardware configuration of the item recognition processing device is described later with reference to FIG. 13.

FIG. 5 is a flowchart of the parameter determination processing operation in the item recognition processing device. The parameters of a new item are determined by the procedure shown in FIG. 5. In this embodiment, input information including at least a model and scene images of the new item is required in order to execute the item recognition processing. The following description assumes that parameters have already been determined for one or more items and that the items have been classified into patterns based on that information.

まず、受付部４０１が受け付けた入力情報に対して、認識処理部４０２が、複数のパターンの各代表パラメータ値を用いて、新規物品のシーン画像に対する新規物品の認識を行う（Ｓ０５）。認識処理部４０２の認識結果に基づき、認識成否信号受付部４０５からの信号に応じて、代表パラメータ評価部４０３が各代表パラメータ値に対する認識の結果を判断して、最も良いとみなされた代表パラメータ値を選択する（Ｓ１０）。なお、代表パラメータ値に対応するパターンを最類似パターンという。 First, the recognition processing unit 402 uses the representative parameter values of multiple patterns to recognize the new item in the scene image of the new item for the input information received by the reception unit 401 (S05). Based on the recognition result of the recognition processing unit 402 and in response to a signal from the recognition success/failure signal reception unit 405, the representative parameter evaluation unit 403 judges the recognition result for each representative parameter value and selects the representative parameter value that is deemed to be the best (S10). The pattern corresponding to the representative parameter value is called the most similar pattern.

ここでステップＳ０５－Ｓ１０の処理について、図６を参照して具体的に述べる。新規物品の各シーン画像に対して、複数のパターンＡ、Ｂ等の各代表パラメータ値による新規物品の認識処理を実行して、図６に示すような、推定位置情報の出力データが得られたとする。なお理解し易くするために、各推定位置情報を対応するシーン画像上に表わしている。なお、用意した新規物品の全てのシーン画像の認識処理を行っているが、必要に応じて一部のみに制限しても認識処理を行っても構わない。各推定位置情報に基づいて認識が成功したか失敗したかを判断して、各代表パラメータ値の認識性能の評価（すなわち認識率）が得られる。各パターンに対する評価を比較して評価が最も良いものを最類似パターンとして選択する。図６の例では、認識率０．９５のパターンＡが最類似パターンとして選択されている。 The processing of steps S05-S10 will now be described in detail with reference to FIG. 6. Assume that the recognition processing of the new item is performed for each scene image of the new item using each representative parameter value of multiple patterns A, B, etc., and output data of estimated position information as shown in FIG. 6 is obtained. For ease of understanding, each estimated position information is shown on the corresponding scene image. Note that the recognition processing is performed for all scene images of the prepared new item, but the recognition processing may be limited to only a portion as necessary. The recognition is judged to be successful or unsuccessful based on each estimated position information, and an evaluation of the recognition performance of each representative parameter value (i.e., the recognition rate) is obtained. The evaluations for each pattern are compared and the one with the best evaluation is selected as the most similar pattern. In the example of FIG. 6, pattern A with a recognition rate of 0.95 is selected as the most similar pattern.
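The selection in steps S05 to S10 can be sketched as follows. This is a minimal illustration, not the patented implementation: the names `recognition_rate`, `select_most_similar_pattern`, `recognize`, and `judge_success` are hypothetical, and the success judgment is abstracted into a callback that could stand for either an automatic threshold check or an operator's input.

```python
from typing import Callable, Dict, Sequence, Tuple


def recognition_rate(outcomes: Sequence[bool]) -> float:
    """Fraction of scene images whose recognition was judged successful."""
    return sum(outcomes) / len(outcomes) if outcomes else 0.0


def select_most_similar_pattern(
    representative_params: Dict[str, dict],
    scene_images: Sequence[object],
    recognize: Callable[[dict, object], object],
    judge_success: Callable[[object, object], bool],
) -> Tuple[str, float]:
    """Run recognition with every pattern's representative parameters (S05)
    and return the pattern with the highest recognition rate (S10)."""
    rates = {}
    for name, params in representative_params.items():
        outcomes = [
            judge_success(image, recognize(params, image))
            for image in scene_images
        ]
        rates[name] = recognition_rate(outcomes)
    best = max(rates, key=rates.get)
    return best, rates[best]
```

With 20 scene images of which pattern A's representative parameters recognize 19, the function would return pattern A with a rate of 0.95, as in the FIG. 6 example.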

ここで、新規物品のシーン画像に物品位置情報が付与されていれば、代表パラメータ評価部４０３が、適切に設定された閾値を用いて認識成否の判断を自動で行うことができる。しかし、この段階では、新規物品のシーン画像に対して物品位置情報は必ずしも要求していない。そこで、本実施例では、例えば図７に示すような判断を採用する。すなわち、各推定位置情報を対応するシーン画像上に表現して、表示装置に表示する。オペレータがその表示を見て認識成否を判断して、その判断結果（成功か失敗）を入力部から入力する。 Here, if item position information is added to the scene image of the new item, the representative parameter evaluation unit 403 can automatically determine whether recognition has been successful or not using an appropriately set threshold value. However, at this stage, item position information is not necessarily required for the scene image of the new item. Therefore, in this embodiment, for example, a determination such as that shown in FIG. 7 is adopted. That is, each piece of estimated position information is expressed on the corresponding scene image and displayed on the display device. The operator looks at the display and determines whether recognition has been successful or not, and inputs the result of the determination (success or failure) from the input unit.

認識の結果が成功と判断された新規物品のシーン画像については、出力された推定位置情報を記憶部に保存しておき、以降そのシーン画像の物品位置情報として利用することができる。この処理により、そのシーン画像に対する新規物品の認識の結果の判断を自動で行うことが可能となる。なお、この時点で物品位置情報は最適なものとなっている保証はないため、以下の処理が重要となる。 For scene images of new items for which the recognition result is judged to be successful, the output estimated position information is saved in the memory unit and can be used thereafter as the item position information for that scene image. This process makes it possible to automatically judge the result of the recognition of new items for that scene image. Note that there is no guarantee that the item position information at this point is optimal, so the following process is important.

図５に戻って、次に、最類似パターン決定部４０４が、ステップＳ１０で決定された最類似パターンの代表パラメータ値を用いた認識の結果が良好かを判断する（Ｓ２０）。これは、例えば予め設定された閾値と比較することで判断することができる。 Returning to FIG. 5, the most similar pattern determination unit 404 then determines whether the recognition results using the representative parameter values of the most similar pattern determined in step S10 are good (S20). This can be determined, for example, by comparing with a preset threshold value.

ステップＳ２０で良好と判断された場合（Ｓ２０：ＹＥＳ）、新規物品を認識可能なパラメータ値を最類似パターンの代表パラメータ値として決定して、出力部４０９に出力する（Ｓ２１）。この場合、新規物品の適切なパラメータ値を、図２に示すような処理で算出する必要はなく、新規物品のシーン画像に物品位置情報を付与することなく決定できていることになる。図８はこの処理の一例を示しており、新規物品の最類似パターンとしてパターンＡが選択されている。図８に示すように、パターンＡの代表パラメータ値は、パターンＡの物品群の要求認識性能を満たすパラメータ範囲と新規物品の要求認識性能を満たすパラメータ範囲の両方に含まれる値となっている。 If it is judged to be good in step S20 (S20: YES), the parameter values capable of recognizing the new item are determined as the representative parameter values of the most similar pattern and output to the output unit 409 (S21). In this case, it is not necessary to calculate appropriate parameter values for the new item using the process shown in FIG. 2, and the values can be determined without adding item position information to the scene image of the new item. FIG. 8 shows an example of this process, in which pattern A is selected as the most similar pattern to the new item. As shown in FIG. 8, the representative parameter values of pattern A are values that are included in both the parameter range that satisfies the required recognition performance of the item group of pattern A and the parameter range that satisfies the required recognition performance of the new item.

この段階で、仮に図５の以降の処理を行わない場合でも、既に少ない教示データ量でパラメータ決定を行うことができるという効果がある。以降の処理はその効果をさらに高めることができる。 At this stage, even if the subsequent processing in Figure 5 is not performed, the effect is that parameters can be determined with a small amount of teaching data. Subsequent processing can further enhance this effect.

ステップＳ２０で良好でないと判断された場合（Ｓ２０:ＮＯ）、アノテーション信号受付部４０６が、最類似パターンの代表パラメータ値を用いた認識処理において認識失敗と判断された新規物品のシーン画像の一部に対して、物品位置情報の付与を行う（Ｓ３０）。例えば図９に、シーン画像へ物品位置情報を付与する例を示す。図９に示すように、シーン画像を表示装置に表示して、オペレータがその表示を見て、適切な物品位置情報を入力部から入力することができる。例えば、シーン画像上のバウンディングボックスを考えている場合は、シーン画像上の４点を選択するだけでよい。 If it is determined in step S20 that the result is not satisfactory (S20: NO), the annotation signal receiving unit 406 assigns item position information to a portion of the scene image of the new item that was determined to have failed to be recognized in the recognition process using the representative parameter value of the most similar pattern (S30). For example, FIG. 9 shows an example of assigning item position information to a scene image. As shown in FIG. 9, the scene image is displayed on a display device, and the operator can view the display and input appropriate item position information from the input unit. For example, if considering a bounding box on the scene image, it is only necessary to select four points on the scene image.

次に、最適化部４０７が新規物品のシーン画像と最類似パターンを用いてパラメータの算出を行う（Ｓ４０）。例えば、図１０に示すように、入力データとして、新規物品のモデルと物品位置情報が付与されているシーン画像に加えて、最類似パターンに関する情報を用いる。最類似パターンに関する情報として、具体的には、その最類似パターンに分類される物品のモデルやシーン画像、そこに付与されている物品位置情報、その物品のパラメータ決定の際に得られる中間データ、さらに最類似パターンの代表パラメータ値があげられる。この時点で、一般にパラメータ算出に利用可能な新規物品の物品位置情報が付与されているシーン画像は少ないため、それを補うために最も性質の近い最類似パターンの情報を援用する。この結果、新規物品の教示データが少なくても、算出されるパラメータ値は新規物品が十分に認識可能となるものとなっていることが期待できる。新規物品と最類似パターンの類似度が高いほど（例えば、ステップＳ１０で得られる最類似パターンの代表パラメータ値での新規物品の認識率が高いほど）、その可能性も高くなると考えられる。以下、このパラメータ算出を同時最適化という。 Next, the optimization unit 407 calculates parameters using the scene image of the new item and the most similar pattern (S40). For example, as shown in FIG. 10, in addition to the scene image to which the model of the new item and the item position information are attached, information on the most similar pattern is used as input data. Specifically, the information on the most similar pattern includes the model and scene image of the item classified into the most similar pattern, the item position information attached thereto, intermediate data obtained when the parameters of the item are determined, and the representative parameter value of the most similar pattern. At this point, since there are generally few scene images to which the item position information of the new item is attached that can be used for parameter calculation, information on the most similar pattern with the closest properties is used to compensate for this. As a result, even if there is little teaching data for the new item, it is expected that the calculated parameter values will be such that the new item can be sufficiently recognized. The higher the similarity between the new item and the most similar pattern (for example, the higher the recognition rate of the new item with the representative parameter value of the most similar pattern obtained in step S10), the higher the possibility is considered to be. Hereinafter, this parameter calculation is referred to as simultaneous optimization.

ここで、同時最適化の幾つかの具体例を述べる。第１の例は教示データの混合の例である。この例で想定している最適化処理では、入力したシーン画像の認識性能をできるだけ高めるようなパラメータ値を探索する。最類似パターンに分類される物品の教示データは、最類似パターンの代表パラメータ値を算出するための情報を含んでいると考えられる。一方、その代表パラメータ値を用いた認識の結果が成功と判断された新規物品のシーン画像は、もし適切な物品位置情報が与えられていれば、その代表パラメータ値を算出するための情報を少なくとも部分的には含んでいると考えられる。つまり、適切な物品位置情報が付与されていないそれらの新規物品のシーン画像の代わりに、最類似パターンに分類される物品の教示データを用いても、パラメータ算出においては同様の効果があると期待できる。最類似パターンの代表パラメータ値を用いた認識の結果が失敗と判断された新規物品のシーン画像の一部には、ステップＳ３０の処理で物品位置情報が付与されている。そこで、その物品位置情報と最類似パターンに分類される物品の教示データをパラメータ算出の入力データとすることで、新規物品の認識性能を十分に高くするパラメータ値が得られる。 Some concrete examples of simultaneous optimization follow. The first example mixes teaching data. The optimization process assumed in this example searches for parameter values that raise the recognition performance on the input scene images as far as possible. The teaching data of the items classified into the most similar pattern is considered to contain the information needed to calculate the representative parameter value of that pattern. Meanwhile, the scene images of the new item that were judged to be recognized successfully with that representative parameter value would, if appropriate item position information were given, contain at least part of the information needed to calculate that representative parameter value. In other words, using the teaching data of the items classified into the most similar pattern in place of those new-item scene images, which lack appropriate item position information, can be expected to have a similar effect on parameter calculation. Some of the new-item scene images judged to have failed recognition with the representative parameter value of the most similar pattern have been given item position information in step S30. Therefore, by using that item position information together with the teaching data of the items classified into the most similar pattern as input data for parameter calculation, parameter values that make the recognition performance for the new item sufficiently high can be obtained.
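The teaching-data mixing of the first example might look like the sketch below. It assumes, purely for illustration, a finite list of candidate parameter values and a callback `is_recognized(param, sample)`; the function names and the exhaustive search are hypothetical stand-ins for whatever optimizer is actually used.

```python
from typing import Callable, Iterable, List, Sequence, Tuple, TypeVar

P = TypeVar("P")  # parameter value
S = TypeVar("S")  # teaching sample (scene image plus item position)


def mix_teaching_data(
    new_item_annotated: Iterable[S], pattern_teaching_data: Iterable[S]
) -> List[S]:
    """Combine the few annotated new-item images (from S30) with the
    teaching data of the items in the most similar pattern."""
    return list(new_item_annotated) + list(pattern_teaching_data)


def optimize_over_candidates(
    candidates: Sequence[P],
    teaching_data: Sequence[S],
    is_recognized: Callable[[P, S], bool],
) -> Tuple[P, float]:
    """Return the candidate with the highest recognition rate on the mixed data."""

    def rate(param: P) -> float:
        hits = sum(1 for sample in teaching_data if is_recognized(param, sample))
        return hits / len(teaching_data)

    best = max(candidates, key=rate)
    return best, rate(best)
```

Because the mixed set is dominated by the pattern's teaching data when the new item has few annotated images, the search is pulled toward values that also work for the most similar pattern, which is the effect the paragraph above relies on.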

同時最適化の第２の例は、パラメータ空間上の探索の範囲を最類似パターンの代表パラメータ値の近傍に制限する例である。この場合、入力する教示データは、最類似パターンの代表パラメータ値を用いた認識の結果が失敗と判断された新規物品のシーン画像である。シーン画像はステップＳ３０の処理で物品位置情報が付与されても構わない。最類似パターンの代表パラメータ値の近傍のパラメータ値は、最類似パターンの代表パラメータ値と同様に、最類似パターンの代表パラメータ値を用いた認識の結果が成功と判断された新規物品のシーン画像を正しく認識できることが期待される。そこで、成功と判断されたシーン画像の範囲内で入力教示データを正しく認識できるパラメータ値が見つかれば、それは新規物品の認識性能を十分に高くするパラメータ値である。このように同時最適化により探索範囲を制限でき、演算コストを削減することができる。 The second example of simultaneous optimization limits the search range in the parameter space to the neighborhood of the representative parameter value of the most similar pattern. In this case, the input teaching data consists of the scene images of the new item that were judged to have failed recognition with the representative parameter value of the most similar pattern. The item position information for these scene images may be assigned in the process of step S30. Parameter values near the representative parameter value of the most similar pattern are expected, like the representative value itself, to correctly recognize the new-item scene images that were judged successful with the representative value. Therefore, if a parameter value that correctly recognizes the input teaching data is found within that range, where the scene images judged successful remain recognizable, it is a parameter value that makes the recognition performance for the new item sufficiently high. In this way, simultaneous optimization restricts the search range and reduces the computational cost.
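A restricted search of this kind can be sketched for a single scalar parameter as follows. The evenly spaced grid, the `radius` and `steps` arguments, and the tie-breaking toward the representative value are all assumptions made for illustration; the source does not specify how the neighborhood is sampled.

```python
from typing import Callable, List, Sequence, Tuple, TypeVar

S = TypeVar("S")  # annotated scene image that failed with the representative value


def neighborhood(center: float, radius: float, steps: int) -> List[float]:
    """Evenly spaced candidates in [center - radius, center + radius]."""
    if steps < 2:
        return [center]
    return [center - radius + 2 * radius * i / (steps - 1) for i in range(steps)]


def search_near_representative(
    center: float,
    radius: float,
    steps: int,
    failed_samples: Sequence[S],
    is_recognized: Callable[[float, S], bool],
) -> Tuple[float, float]:
    """Search only near the representative value; among equally good
    candidates, keep the one closest to the representative value."""
    best, best_rate = center, -1.0
    for param in sorted(neighborhood(center, radius, steps), key=lambda p: abs(p - center)):
        hits = sum(1 for s in failed_samples if is_recognized(param, s))
        rate = hits / len(failed_samples) if failed_samples else 0.0
        if rate > best_rate:
            best, best_rate = param, rate
    return best, best_rate
```

Only `steps` candidates are evaluated regardless of the size of the full parameter space, which is where the reduction in computational cost comes from.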

次に、認識性能評価部４０８が、算出されたパラメータ値で新規物品の認識を十分にできるかを判断する（Ｓ５０）。ここでもし、物品位置情報が付与されていない新規物品のシーン画像に対する認識も考慮して判断する場合は、例えば図７に示したような方法を用いることができる。 Next, the recognition performance evaluation unit 408 judges whether the calculated parameter values are sufficient to recognize the new item (S50). If the judgment is to be made while also taking into consideration the recognition of scene images of new items to which no item position information is attached, a method such as that shown in FIG. 7 can be used.

ステップＳ５０で認識可能と判断された場合（Ｓ５０：ＹＥＳ）、算出されたパラメータ値を新規物品の認識可能なパラメータ値として決定する（Ｓ５１）。このように代表パラメータ値がそのままでは流用できない場合でも、適切なパラメータ値を算出する上で最も有用なパターンの情報を活用することで、新規物品のシーン画像に人手で付与する物品位置情報が少量であっても適切なパラメータ値を決定することができる。図１１はこの処理の一例を示しており、新規物品の最類似パターンとしてパターンＡが選択されている。図１１に示すように、パターンＡの代表パラメータ値は、新規物品の要求認識性能を満たすパラメータ範囲に含まれていないが、パターンＢの代表パラメータ値よりもパターンＡの代表パラメータ値の範囲に近くなっている。そして、同時最適化によって新規物品が認識可能なパラメータ値が、パターンＡの代表パラメータ値の比較的近傍に見出される。この段階でのパラメータ値で必ずしもパターンＡに分類される物品群が認識可能となる必要はない。 If it is determined in step S50 that the item is recognizable (S50: YES), the calculated parameter value is determined as a parameter value with which the new item can be recognized (S51). Thus, even when the representative parameter value cannot be reused as is, utilizing the information of the pattern that is most useful for calculating appropriate parameter values makes it possible to determine appropriate parameter values even when only a small amount of item position information is manually added to the scene images of the new item. FIG. 11 shows an example of this process, in which pattern A is selected as the most similar pattern for the new item. As shown in FIG. 11, the representative parameter value of pattern A is not included in the parameter range that satisfies the required recognition performance of the new item, but it lies closer to that range than the representative parameter value of pattern B does. Simultaneous optimization then finds a parameter value with which the new item can be recognized relatively close to the representative parameter value of pattern A. The parameter value at this stage does not necessarily have to make the item group classified into pattern A recognizable.

一方、ステップＳ５０で認識可能でないと判断された場合（Ｓ５０：ＮＯ）、新規物品のシーン画像で物品位置情報が付与されていないものの一部に対して、物品位置情報を付与する（Ｓ６０）。この実現方法として例えば、図９に示したような方法を用いることができる。 On the other hand, if it is determined in step S50 that the item is not recognizable (S50: NO), item position information is assigned to some of the scene images of the new item that do not have item position information assigned (S60). This can be achieved, for example, by using the method shown in FIG. 9.

次にパラメータ算出を行う（Ｓ７０）。ここで、入力データとしては新規物品のモデルと物品位置情報が付与されている新規物品のシーン画像で十分である。この時点では、ステップＳ４０の処理時よりも利用可能な新規物品の教示データが多く、新規物品の認識性能を十分に高くするパラメータ値が算出される可能性は高くなっている。ここでは、この最適化を同時最適化と区別して個別最適化ということにする。 Next, parameter calculation is performed (S70). Here, a model of the new item and a scene image of the new item with item position information are sufficient as input data. At this point, there is more teaching data for the new item available than at the time of processing in step S40, and it is highly likely that parameter values that will sufficiently improve the recognition performance of the new item will be calculated. Here, this optimization is referred to as individual optimization to distinguish it from simultaneous optimization.

そして、認識性能評価部４０８が、算出されたパラメータ値で新規物品の認識を十分にできるかを判断する（Ｓ８０）。ここでもし、物品位置情報が付与されていない新規物品のシーン画像に対する認識も考慮して判断する場合は、例えば図７に示したような方法を用いることができる。 Then, the recognition performance evaluation unit 408 judges whether the calculated parameter values are sufficient to recognize the new item (S80). Here, if the judgment is made taking into consideration the recognition of scene images of new items that do not have item position information, for example, a method such as that shown in FIG. 7 can be used.

ステップＳ８０で認識可能と判断された場合（Ｓ８０：ＹＥＳ）、算出されたパラメータ値を新規物品の認識可能なパラメータ値として決定する（Ｓ８１）。図１２はこの処理の一例を示しており、新規物品の最類似パターンとしてパターンＡが選択されている。しかし、同時最適化で算出されたパラメータ値の性能が良くないため、新規物品の個別最適化で算出されたパラメータ値が新規物品を認識可能なパラメータ値として決定されている。 If it is determined in step S80 that the new item is recognizable (S80: YES), the calculated parameter values are determined as parameter values with which the new item can be recognized (S81). FIG. 12 shows an example of this process, in which pattern A is selected as the most similar pattern for the new item. However, because the parameter values calculated by simultaneous optimization do not perform well, the parameter values calculated by individual optimization of the new item are determined as the parameter values with which the new item can be recognized.

一方、ステップＳ８０で認識可能でないと判断された場合（Ｓ８０：ＮＯ）、その新規物品は現状想定している認識手法では十分な性能を達成できないこととなり、例えばパラメータ設定対象から除外して、一連の処理を終了する。 On the other hand, if it is determined in step S80 that the new item cannot be recognized (S80: NO), the currently assumed recognition method will not be able to achieve sufficient performance for the new item, and the item will be excluded from the parameter setting targets, for example, and the process will end.
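The overall decision flow of FIG. 5 from step S20 onward can be summarized in a short sketch. It assumes a single required recognition rate shared by steps S20, S50, and S80, and it folds annotation and optimization into the hypothetical callbacks `simultaneous` (S30 and S40) and `individual` (S60 and S70); none of these names come from the source.

```python
from typing import Callable, Optional, Tuple, TypeVar

P = TypeVar("P")  # parameter value(s)


def determine_parameters(
    representative_params: P,
    representative_rate: float,
    required_rate: float,
    simultaneous: Callable[[], P],
    individual: Callable[[], P],
    evaluate: Callable[[P], float],
) -> Tuple[Optional[P], str]:
    """Return the chosen parameters and the stage that produced them."""
    # S20: is the most similar pattern's representative value good enough?
    if representative_rate >= required_rate:
        return representative_params, "representative"  # S21
    # S30 + S40: annotate some failed images, then simultaneous optimization.
    params = simultaneous()
    if evaluate(params) >= required_rate:  # S50
        return params, "simultaneous"  # S51
    # S60 + S70: annotate more images, then individual optimization.
    params = individual()
    if evaluate(params) >= required_rate:  # S80
        return params, "individual"  # S81
    # S80: NO. The item is excluded from parameter setting.
    return None, "excluded"
```

Each later stage is reached only when the cheaper one before it fails, so the manual annotation effort grows only as far as the new item requires.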

上記のように、本実施例に係るパラメータ決定の処理によれば、新規物品の適切なパラメータ値を決定する上で最も有用と期待される登録済み物品群内のパターンを選択し、その情報を活用することで、新規物品の教示データ量が少なくても適切なパラメータ決定が可能となる。 As described above, the parameter determination process of this embodiment selects a pattern from the group of registered items that is expected to be most useful in determining appropriate parameter values for a new item, and by utilizing this information, it becomes possible to determine appropriate parameters even if the amount of teaching data for the new item is small.

＜ハードウェア構成＞ <Hardware Configuration>
図１３を参照して、物品認識処理装置のハードウェア構成例について説明する。物品認識処理装置１は、プロセッサ１０１、メモリ１０２、補助記憶装置１０３、入力装置１０４、出力装置１０５、及び通信IF（Interface）１０６が、内部バス１０７に接続されて構成されるコンピュータである。 An example of the hardware configuration of the item recognition processing device will be described with reference to FIG. 13. The item recognition processing device 1 is a computer in which a processor 101, a memory 102, an auxiliary storage device 103, an input device 104, an output device 105, and a communication IF (Interface) 106 are connected to an internal bus 107.

プロセッサ１０１は、メモリ１０２に格納されているプログラムを実行して、適切なパラメータ算出処理を実現する。本実施例では、パラメータ算出処理に関する諸機能（図４参照）を実行するプログラムを想定するが、その他の種々のプログラムも実行されてよい。プログラム実行のための入力データや出力データは主にメモリ１０２に格納される。メモリ１０２は、例えば、変更する必要のないプログラムを格納するための不揮発性の記憶素子（例えば、ROM（Read Only Memory））と、実行するプログラム及びプログラム実行時に使用するデータを一時的に格納するための揮発性の記憶素子（例えば、RAM（Random Access Memory））とで構成されてよい。補助記憶装置１０３は、例えば磁気記憶装置（HDD：Hard Disk Drive）のような不揮発性の大容量な記憶装置を含み、プロセッサ１０１が実行するプログラム及びプログラム実行時に使用されるデータ（ＤＢ等を含むデータ）を記憶する。本実施例に係るパラメータ算出処理のプログラムは、予め補助記憶装置１０３に記憶されていて、そこから読みだされて、メモリ１０２にロードされて、プロセッサ１０１によって実行される。 The processor 101 executes a program stored in the memory 102 to realize an appropriate parameter calculation process. In this embodiment, a program that executes various functions related to the parameter calculation process (see FIG. 4) is assumed, but various other programs may also be executed. Input data and output data for executing the program are mainly stored in the memory 102. The memory 102 may be composed of, for example, a non-volatile storage element (e.g., a ROM (Read Only Memory)) for storing a program that does not need to be changed, and a volatile storage element (e.g., a RAM (Random Access Memory)) for temporarily storing the program to be executed and data used when the program is executed. The auxiliary storage device 103 includes a non-volatile large-capacity storage device such as a magnetic storage device (HDD: Hard Disk Drive), and stores the program executed by the processor 101 and data used when the program is executed (data including DB, etc.). The program for the parameter calculation process according to this embodiment is stored in the auxiliary storage device 103 in advance, read out from there, loaded into the memory 102, and executed by the processor 101.

入力装置１０４は、キーボードやマウスのような、オペレータからの入力を受け付ける入力部であり、プログラムの実行や入力データの指定などを可能とする。出力装置１０５は、情報を表示する表示装置や情報を印字するプリンタのような、プログラムの実行結果を可視的に出力する装置である。本実施例では、表示装置にオペレータが認識可能な形式で表示して、図８や図１０で示す処理を実現する。通信IF１０６は、認識処理装置１と他の装置とを接続して通信を制御するネットワークインターフェース装置である。 The input device 104 is an input unit such as a keyboard or mouse that accepts input from an operator, and enables program execution and input data specification. The output device 105 is a device that visually outputs the results of program execution, such as a display device that displays information or a printer that prints information. In this embodiment, the display device displays information in a format that can be recognized by the operator, thereby realizing the processing shown in Figures 8 and 10. The communication IF 106 is a network interface device that connects the recognition processing device 1 to other devices and controls communication.

なお、このハードウェアの構成は、単体のコンピュータで構成してもよいし、あるいは、プロセッサ１０１、メモリ１０２、補助記憶装置１０３、入力装置１０４、出力装置１０５、通信IF１０６の、1または複数の構成部位が、ネットワークで接続された1または複数のコンピュータで構成されてもよい。 This hardware configuration may be configured as a single computer, or one or more of the components, processor 101, memory 102, auxiliary storage device 103, input device 104, output device 105, and communication IF 106, may be configured as one or more computers connected via a network.

実施例２は、実施例１におけるパラメータ決定対象物品のパラメータ決定と同時にパターンの更新を行う。実施例１では、新規物品の登録時に既に利用可能なパターン分類の情報があることを前提としているが、実施例２ではその適切なパターン分類の情報を構築する具体的な方法を提示する。 In Example 2, the patterns are updated at the same time as the parameters are determined for the item targeted by the parameter determination of Example 1. Example 1 assumes that usable pattern classification information already exists when a new item is registered, whereas Example 2 presents a specific method for constructing that pattern classification information appropriately.

図１４は、実施例２によるパラメータ決定処理手順の一例を示す。実施例1に係る図５（パラメータ決定処理手順）に比べて、ステップＳ５００、Ｓ２１０，Ｓ５１０，Ｓ８１０が相違している。これらのステップの処理により、新規物品のパラメータ決定に加えて、新規物品の登録に伴う適切なパターン更新が可能となる。以下では、図５と同一の処理についてはその説明を省略して、図１４の特徴的な上記処理ステップによるパターン更新処理動作とその作用効果について述べる。 FIG. 14 shows an example of a parameter determination processing procedure according to Example 2. Compared with FIG. 5 (the parameter determination processing procedure) according to Example 1, steps S500, S210, S510, and S810 differ. The processing of these steps enables, in addition to parameter determination for the new item, appropriate pattern updating accompanying registration of the new item. In the following, the description of the processing that is the same as in FIG. 5 is omitted, and the pattern update operation performed by the above steps characteristic of FIG. 14 and its effects are described.

ステップＳ２１０では、新規物品を認識可能なパラメータ値として最類似パターンの代表パラメータ値が選択されるだけでなく、新規物品が分類されるパターンを最類似パターンとする。図１５にこの処理の一例を示す。新規物品の最類似パターンとしてパターンＡが選択されているため、新規物品を認識可能なパラメータ値としてパターンＡの代表パラメータ値が選択され、新規物品はパターンＡに追加される。 In step S210, not only are representative parameter values of the most similar pattern selected as parameter values with which the new item can be recognized, but the pattern into which the new item is classified is designated as the most similar pattern. An example of this process is shown in FIG. 15. Because pattern A is selected as the most similar pattern for the new item, the representative parameter values of pattern A are selected as parameter values with which the new item can be recognized, and the new item is added to pattern A.

次にステップＳ５００では、ステップＳ４０の同時最適化により算出されたパラメータ値が新規物品に加えて最類似パターンに分類される物品群も同時に認識可能かどうかを判断する。ここで、物品位置情報が付与されていないシーン画像に対する認識も考慮して判断する場合は、例えば、図７に示す方法で実現することができる。 Next, in step S500, it is determined whether the parameter values calculated by the simultaneous optimization in step S40 can recognize not only the new item but also the group of items classified into the most similar pattern. If this determination also takes into account recognition of scene images to which no item position information is attached, it can be realized, for example, by the method shown in FIG. 7.

ステップＳ５１０では、ステップＳ５００で算出されたパラメータ値が新規物品に加えて最類似パターンに分類される物品群も同時に認識可能と判断された場合に行われる。そのパラメータ値を新規物品の認識可能なパラメータ値として決定するだけでなく、新規物品を最類似パターンに分類し、最類似パターンの代表パラメータ値を算出されたパラメータ値で置き換える。図１６にこの処理の一例を示す。新規物品の最類似パターンとしてパターンＡが選択されているが、その代表パラメータ値による新規物品の認識が良好ではないと判断されたため、同時最適化が行われ新たなパラメータ値が算出されている。その値は、パターンＡの物品群の要求認識性能を満たすパラメータ範囲と新規物品の要求認識性能を満たすパラメータ範囲の両方に含まれている値となっており、パターンＡの新しい代表パラメータ値とすることができる。そして、新規物品はパターンＡに分類される。 Step S510 is performed when it is determined in step S500 that the calculated parameter values can recognize not only the new item but also the group of items classified into the most similar pattern. In addition to determining those parameter values as parameter values with which the new item can be recognized, the new item is classified into the most similar pattern, and the representative parameter value of the most similar pattern is replaced with the calculated parameter values. FIG. 16 shows an example of this process. Pattern A is selected as the most similar pattern for the new item, but because recognition of the new item with its representative parameter value was judged not to be good, simultaneous optimization was performed and a new parameter value was calculated. That value lies in both the parameter range that satisfies the required recognition performance of the item group of pattern A and the parameter range that satisfies the required recognition performance of the new item, so it can serve as the new representative parameter value of pattern A. The new item is then classified into pattern A.

ステップＳ５００の判定により、このパターン更新は実施例２におけるパターン分類の基準を満たしている。さらに、このパターン更新は、パターンの数を変えずに新規物品を取り込むことができている。パターン数が増えると、ステップＳ２０やステップＳ５００における処理が増大するため、このパターン更新は好ましいといえる。また、代表パラメータ値も更新できるので、柔軟にパターン更新ができる。その際の代表パラメータ値は同時最適化により新規物品の教示データ量が少なくても算出できるという効果を奏する。 The determination in step S500 shows that this pattern update meets the criteria for pattern classification in Example 2. Furthermore, this pattern update is able to incorporate new items without changing the number of patterns. This pattern update is preferable because an increase in the number of patterns increases the amount of processing in steps S20 and S500. Furthermore, the representative parameter values can also be updated, allowing for flexible pattern updating. This has the advantage that the representative parameter values can be calculated by simultaneous optimization even when the amount of teaching data for the new item is small.

最後にステップＳ８１０では、ステップＳ７０の個別最適化で算出されたパラメータ値が新規物品を認識可能なパラメータ値として決定されるだけでなく、新規物品のみを所属物品とし、代表パラメータ値が算出されたパラメータ値である新規パターンが生成される。図１７にこの処理の一例を示す。新規物品の最類似パターンとしてパターンＡが選択されているが、その代表パラメータ値による新規物品の認識が良好ではないとみなされたため、同時最適化が行われ新たなパラメータ値が算出されている。同時最適化で算出されたパラメータ値は新規物品を十分に認識できている。しかし、パターンＡに分類される全ての物品を十分に認識することはできていないため、実施例２では個別最適化が行われる。個別最適化で算出されたパラメータ値は新規物品を十分に認識できているため、そのパラメータ値が新規物品を認識可能なパラメータ値として決定される。さらに、新規パターンＣが生成され、新規物品がそこに分類され、代表パラメータ値は個別最適化で算出されたものとなる。 Finally, in step S810, the parameter values calculated by the individual optimization in step S70 are determined as parameter values with which the new item can be recognized, and in addition a new pattern is generated whose only member is the new item and whose representative parameter value is the calculated parameter value. FIG. 17 shows an example of this process. Pattern A is selected as the most similar pattern for the new item, but because recognition of the new item with its representative parameter value was deemed not to be good, simultaneous optimization was performed and new parameter values were calculated. The parameter values calculated by simultaneous optimization recognize the new item sufficiently. However, they do not sufficiently recognize all of the items classified into pattern A, so in Example 2 individual optimization is performed. Because the parameter values calculated by individual optimization recognize the new item sufficiently, they are determined as the parameter values with which the new item can be recognized. Furthermore, a new pattern C is generated, the new item is classified into it, and its representative parameter value is the one calculated by individual optimization.

ステップＳ８１０のパターン更新により、どのパターンにも分類されないと判断された新規物品は新しく生成したパターンに分類されることになる。特に、登録済み物品が一つもない状況からでも、パターンを構築することが可能となる。 By updating the patterns in step S810, new items that are determined not to be classified into any patterns will be classified into the newly generated pattern. In particular, it is possible to construct a pattern even in a situation where there are no registered items.
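The three pattern updates of Example 2 (steps S210, S510, and S810) can be sketched with a simple in-memory registry. The dictionary layout, the `outcome` labels, and the `new_pattern_name` argument are hypothetical; the source only says that this information is held in the pattern information DB 410.

```python
from typing import Any, Dict, Optional

Pattern = Dict[str, Any]  # {"representative": params, "items": [item names]}


def update_patterns(
    patterns: Dict[str, Pattern],
    new_item: str,
    outcome: str,
    most_similar: Optional[str],
    params: Any,
    new_pattern_name: str = "C",
) -> Dict[str, Pattern]:
    """Update the pattern registry according to how the parameters were found."""
    if outcome == "representative":
        # S210: the representative value already works; just add the item.
        patterns[most_similar]["items"].append(new_item)
    elif outcome == "simultaneous":
        # S510: the new value also works for the pattern's existing items,
        # so add the item and replace the representative value.
        patterns[most_similar]["items"].append(new_item)
        patterns[most_similar]["representative"] = params
    elif outcome == "individual":
        # S810: create a new pattern containing only the new item.
        patterns[new_pattern_name] = {"representative": params, "items": [new_item]}
    else:
        raise ValueError(f"unknown outcome: {outcome}")
    return patterns
```

Starting from an empty registry, the `"individual"` branch creates the first pattern, which corresponds to building patterns from a state with no registered items.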

ここで、上記パターン更新処理を行うための機能について説明する。図４を参照して、実施例１との相違する機能について述べる。認識性能評価部４０８は、最適化部で算出されたパラメータ値の新規物品に対する認識率を評価し、要求値を満たすかどうかを確かめる。同時最適化の場合は、新規物品だけでなく、最類似パターンに属する複数の物品（物品群）についても認識率を評価する。同時最適化の結果が要求値を満たさない場合は、アノテーション信号受付部４０６に戻る。 Now, the function for performing the above pattern update process will be described. The functions that differ from those in the first embodiment will be described with reference to FIG. 4. The recognition performance evaluation unit 408 evaluates the recognition rate for the new item of the parameter values calculated by the optimization unit, and checks whether it satisfies the required value. In the case of simultaneous optimization, the recognition rate is evaluated not only for the new item, but also for multiple items (item groups) that belong to the most similar pattern. If the result of simultaneous optimization does not satisfy the required value, the process returns to the annotation signal reception unit 406.

パターン情報ＤＢ４１０は、登録済み物品群の情報(モデル情報およびシーン画像)とそれらから抽出されるパターン情報を格納する。実施例２ではそれらの情報を、新規物品の最適パラメータを出力する際に適宜変更する。 The pattern information DB 410 stores information on registered item groups (model information and scene images) and pattern information extracted from them. In the second embodiment, this information is appropriately changed when outputting optimal parameters for a new item.

以上、実施例２のパラメータ決定および更新によれば、新規物品の適切なパラメータ値を決定する上で最も有用と期待される登録済み物品群内のパターンを選択し、その情報を活用することで、新規物品の教示データ量が少なくても適切なパラメータ決定が可能となる。さらにその後のパラメータ決定対象物品にとって有用なパターンとなるようにパターンを更新するが可能となる。 As described above, according to the parameter determination and update of the second embodiment, by selecting a pattern from the group of registered items that is expected to be most useful in determining appropriate parameter values for a new item, and utilizing that information, it is possible to determine appropriate parameters for the new item even if the amount of teaching data is small. Furthermore, it is possible to update the pattern so that it becomes a useful pattern for the item for which parameters are to be determined subsequently.

実施例３は、実施例１のパラメータ決定において同時最適化の処理を繰り返すことができるパラメータ決定を提示する。図１８は、実施例３によるパラメータ決定処理手順の一例を示す。実施例１と対比すると、係る図４の処理手順に、新たにステップＳ５２が追加され、それによりループ処理を可能としている。ステップＳ５２以外およびそれを実現させる機能は、実施例１と実質的に同じであるためそれらの説明を省略する。 Example 3 presents parameter determination that can repeat the simultaneous optimization process in the parameter determination of Example 1. FIG. 18 shows an example of a parameter determination process procedure according to Example 3. In comparison with Example 1, step S52 is newly added to the process procedure of FIG. 4, which enables loop processing. Steps other than step S52 and the functions that realize them are substantially the same as those of Example 1, so their description will be omitted.

本実施例では、ステップＳ４０の同時最適化により算出されたパラメータ値がステップＳ５０で新規物品を十分には認識できないと判断された場合（Ｓ５０:ＮＯ）、新規物品のステップＳ２０で最類似パターンの代表パラメータ値による認識で認識失敗と判断された新規物品のシーン画像について、その全てに物品位置情報が付与されているかを判断する（Ｓ５２）。この判断の結果、全てに付与済みの場合（Ｓ５２：ＹＥＳ）、ステップＳ６０に移り、以後は実施例１と同様の処理手順となる。 In this embodiment, if it is determined in step S50 that the parameter values calculated by the simultaneous optimization in step S40 are not sufficient to recognize the new item (S50: NO), it is determined whether item position information has been added to all of the scene images of the new item that were determined to have failed to be recognized using the representative parameter values of the most similar pattern in step S20 (S52). If the result of this determination is that item position information has been added to all of the images (S52: YES), the process proceeds to step S60, and thereafter the same processing procedure as in embodiment 1 is followed.

On the other hand, if some of those scene images have not been assigned item position information (S52: NO), the process returns to step S30, where item position information can be assigned to scene images that do not yet have it. Simultaneous optimization is then performed again in step S40; the difference from the previous simultaneous optimization is that more scene images of the new item now have item position information. The calculated parameter values are therefore more likely to be judged in step S50 as sufficient to recognize the new item. If they are still not sufficient and scene images to which item position information can be assigned remain, the process returns from step S52 to step S30 and the above processing is repeated. The loop can end in the following cases: when it is determined in step S50 that parameter values capable of sufficiently recognizing the new item have been calculated; when it is determined that item position information has already been assigned to all of the failed scene images; or when it is determined in step S51 that no further simultaneous optimization is necessary.

In this way, compared with Example 1, Example 3 allows item position information to be assigned to the new item gradually until the result of the simultaneous optimization becomes favorable. This has the effect that parameter values capable of sufficiently recognizing the new item are obtained at as low a teaching-data creation cost as possible.
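The loop of Example 3 can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the embodiment: the function name `incremental_parameter_search`, the callables `optimize` and `is_good_enough`, and the `batch_size` argument are hypothetical stand-ins for steps S30 to S52, and the operator's annotation work in step S30 is modeled simply as adding scene-image identifiers to a set.

```python
from typing import Callable, Optional, Sequence, Set, Tuple, TypeVar

P = TypeVar("P")  # type of a parameter value set (hypothetical)


def incremental_parameter_search(
    failed_scene_ids: Sequence[int],
    optimize: Callable[[Set[int]], P],
    is_good_enough: Callable[[P], bool],
    batch_size: int = 1,
) -> Tuple[Optional[P], int]:
    """Annotate failed scene images a few at a time and re-optimize.

    Returns (parameter values, number of annotated images). The first
    element is None when every failed image has been annotated without
    reaching sufficient recognition (S52: YES), in which case the caller
    would continue with step S60.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")

    annotated: Set[int] = set()
    remaining = list(failed_scene_ids)

    while remaining:
        # S30: assign item position information to the next batch of images.
        batch, remaining = remaining[:batch_size], remaining[batch_size:]
        annotated.update(batch)

        # S40: simultaneous optimization using all images annotated so far.
        params = optimize(annotated)

        # S50: stop as soon as the new item is recognized sufficiently.
        if is_good_enough(params):
            return params, len(annotated)

        # S52: the loop condition checks whether unannotated images remain.

    return None, len(annotated)
```

Returning the number of annotated images makes the cost saving visible: the loop stops at the first batch that yields sufficient recognition, so the remaining failed images never need to be annotated.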

Example 4 is an example in which a plurality of representative parameter values (referred to as a representative parameter value group) are considered as the representative parameter values.

[Application to Example 1]
The recognition process of Example 1 (see FIG. 5) assumes that each pattern has exactly one representative parameter value. However, it is possible to find a plurality of parameter values that bring the recognition performance for the items belonging to a pattern up to or above the required performance, so in this example a plurality of representative parameter values can be assigned to each pattern.

For example, in selecting the most similar pattern, the recognition performance for the new item is evaluated for every representative parameter value of every pattern, and the pattern to which the representative parameter value with the best recognition performance belongs is selected as the most similar pattern. Because the number of parameter value evaluations increases, the computational cost and the cost of judging the recognition results increase, but a finer-grained determination becomes possible. In this case, the representative parameter value used in determining the optimal parameter value of the new item can be the one with the highest recognition rate among the representative parameter values of the most similar pattern, that is, the value that was the basis for selecting that pattern.
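The selection described above can be illustrated with a short sketch. The names `select_most_similar_pattern` and `recognition_rate` are hypothetical, each parameter value is simplified to a single float, and `recognition_rate` stands in for running the recognition process on the new item's scene images and scoring the result.

```python
from typing import Callable, Dict, List, Tuple


def select_most_similar_pattern(
    pattern_value_groups: Dict[str, List[float]],
    recognition_rate: Callable[[float], float],
) -> Tuple[str, float, float]:
    """Evaluate every representative value of every pattern.

    Returns (pattern name, best representative value, its recognition rate).
    Ties keep the first candidate encountered.
    """
    best_name = None
    best_value = 0.0
    best_rate = float("-inf")

    for name, values in pattern_value_groups.items():
        for value in values:
            # One recognition run per representative value: this is the
            # extra evaluation cost mentioned in the text.
            rate = recognition_rate(value)
            if rate > best_rate:
                best_name, best_value, best_rate = name, value, rate

    if best_name is None:
        raise ValueError("no representative parameter values to evaluate")
    return best_name, best_value, best_rate
```

The returned value is also the natural starting point for determining the new item's optimal parameter value, since it is the representative value that caused its pattern to be selected.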

[Application to Example 2]
The application to Example 1 above described considering a plurality of representative parameter values for each pattern in order to select the most similar pattern at a finer granularity and to determine optimal parameter values with higher recognition performance. The same can be done for Example 2. In that case, in addition to the changes described for Example 1, the processing related to updating the pattern information must be changed. For example, when the representative parameter value of the most similar pattern is changed in step S510, a plurality of representative parameter values are changed, and when a new pattern is added in step S810, a plurality of representative parameter values are obtained. For the former, steps S40 and S500 are repeated until the required number of representative parameter values is obtained, and these replace the existing representative parameter values of the most similar pattern. For the latter, steps S70 and S80 are likewise repeated until the required number of representative parameter values is obtained. Once at least one new representative parameter value has been found, other values that can serve as representative parameter values can be expected to lie near it, so the search range can be restricted to its neighborhood to limit the increase in computational cost to some extent.
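As a rough illustration of restricting the search to the neighborhood of an already found value, the sketch below samples candidates uniformly within a fixed radius of the first value. The text does not specify a search method, so the uniform random sampling, the scalar parameter, and the names `find_nearby_representatives`, `meets_requirement`, `radius`, and `max_trials` are all assumptions made for this example.

```python
import random
from typing import Callable, List, Optional


def find_nearby_representatives(
    seed_value: float,
    meets_requirement: Callable[[float], bool],
    needed: int,
    radius: float,
    max_trials: int = 1000,
    rng: Optional[random.Random] = None,
) -> List[float]:
    """Collect up to `needed` values, searching only near `seed_value`.

    `seed_value` is the representative value already obtained and is
    always the first element of the result. `meets_requirement` stands in
    for checking that all items of the pattern are recognized at or above
    the required performance with the candidate value.
    """
    rng = rng or random.Random(0)  # fixed seed keeps the sketch repeatable
    found = [seed_value]
    trials = 0

    while len(found) < needed and trials < max_trials:
        trials += 1
        # Candidates are drawn only from [seed - radius, seed + radius].
        candidate = seed_value + rng.uniform(-radius, radius)
        if candidate not in found and meets_requirement(candidate):
            found.append(candidate)

    return found
```

The `max_trials` bound caps the number of recognition evaluations, so the result may contain fewer than `needed` values when few candidates near the seed satisfy the requirement.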

With reference to FIG. 19, an example in which the item recognition processing device according to Examples 1 to 4 is applied to item recognition by a picking robot will be described. In FIG. 19, the computer that implements the item recognition processing device 1 has a mouse as the input device 104 and a display as the output device 105. A robot 1902 and an image sensor 1903 are connected to the communication IF 106 of the item recognition processing device 1 via a communication line 1901. The robot 1902 receives the recognition result from the item recognition processing device 1 and performs the desired operation. The image sensor 1903 captures an image of an object 1931 that is the target of the operation of the robot 1902, or of a work scene involving the object. The sensor and the computer may be part of the robot.

The flow up to the robot operation shown in FIG. 19 is, for example, as follows. First, the item recognition processing device 1 executes recognition processing on an image acquired by the image sensor 1903 installed on or around the body of the robot 1902 and obtains a recognition result. The computer then analyzes the recognition result with an appropriate program to obtain information that induces an appropriate robot operation. The robot 1902 is made to operate based on that information, for example by grasping the object 1931 to be recognized and moving the grasped object to a predetermined position. Although the work scene in FIG. 19 is depicted as fixed, the work scene can be changed as appropriate. In particular, a situation that changes from moment to moment, such as an actual production line, can also be used.

Applying Example 1 to this application example provides the following effect: the success or failure of a recognition result, which is judged in step S10 and elsewhere, can be determined from the operation of the robot 1902. When the recognition result is displayed on a display device as shown in FIG. 7, it cannot be ruled out that the operator judges recognition to have succeeded even though the accuracy is actually insufficient to achieve the purpose, or conversely judges it to have failed even though the accuracy is sufficient. In this application example, having the robot perform the actual intended operation enables a more appropriate judgment.

1: Item recognition processing device
101: Processor
102: Memory
103: Auxiliary storage device
104: Input device
105: Output device
106: Communication IF
107: Internal bus
401: Reception unit
402: Recognition success/failure signal reception unit
403: Representative parameter value evaluation unit
404: Most similar pattern determination unit
405: Pattern information DB
406: Annotation signal reception unit
407: Optimization unit
408: Recognition performance evaluation unit
409: Output unit
410: Storage unit

Claims

A recognition processing method in which a computer executes an article recognition process, comprising the steps of:
storing, in a storage unit, a plurality of patterns including a plurality of registered articles, each pattern having a representative parameter value;
receiving new article information having at least a model of a new article and a plurality of scene images;
a recognition processing step of performing a recognition process on the plurality of scene images of the new article using the representative parameter value of each of the plurality of patterns;
selecting the pattern corresponding to the representative parameter value determined to be the best as a result of the recognition process as the most similar pattern to the new article; and
if a result of a recognition process for the new article using the representative parameter value of the selected most similar pattern is good, determining the representative parameter value as a recognizable parameter value of the new article.

If the result of the recognition process (re-recognition process) of the new article using the representative parameter value of the most similar pattern is not good,
selecting at least one of the plurality of scene images of the new article that is determined to have failed as a result of the recognition process;
adding item position information to the selected scene image;
calculating new parameter values based on the selected scene image, the item position information, the model of the new article, and the representative parameter values of the most similar pattern or information for calculating the representative parameter values of the most similar pattern;
performing a recognition process for the new article on the plurality of scene images of the new article using the model of the new article and the new parameter values;
if the result of the recognition process is good, determining the new parameter values as recognizable parameter values of the new article;
The recognition processing method according to claim 1.

Each of the plurality of patterns has, as pattern information, at least models of all the registered articles classified into the pattern and a plurality of scene images in which all the registered articles classified into the pattern are captured;
if a result of the recognition process using the representative parameter value of the most similar pattern is good, the model of the new article and the plurality of scene images in which the new article is captured are added to the pattern information of the most similar pattern, and the new article is classified into the most similar pattern;
if the result of the recognition process (re-recognition process) of the new article is good for the plurality of scene images of the new article using the model of the new article and the new parameter values, for each of all registered articles classified into the most similar pattern, a recognition process of the registered article is performed for the plurality of scene images in which the registered article is captured using the model of the registered article and the new parameter values;
if the results of the recognition process for all of the registered articles classified into the most similar pattern are good, add the model of the new article and the plurality of scene images in which the new article is captured to the pattern information of the most similar pattern;
replacing the representative parameter value of the most similar pattern with the new parameter value, and classifying the new article into the most similar pattern;
The recognition processing method according to claim 2.

If the results of all of the recognition processes are not good,
selecting a portion of the plurality of scene images of the new article;
adding item position information to each of the selected scene images;
calculating new parameter values based on the portion of the plurality of scene images of the new article to which item position information is assigned, the item position information, and the model of the new article;
performing a recognition process (re-recognition process) of the new article on the plurality of scene images of the new article using the model of the new article and the new parameter values;
if the result of the re-recognition process is good, determining the new parameter values as recognizable parameter values of the new article;
retaining at least the model of the new article and the plurality of scene images in which the new article is captured as pattern information, creating a new pattern in which the new parameter value is a representative parameter value, and classifying the new article into the new pattern;
The recognition processing method according to claim 3.

Displaying, on a display device, information on an estimated position of the article relative to a scene image of the article together with the scene image;
inputting a result of the recognition processing based on the display through an input device;
The recognition processing method according to claim 1.

The method is directed to determining parameters of an object recognition process of a robot;
the robot executes a predetermined operation based on estimated position information of the object estimated by object recognition for a scene image of the object, the object recognition being executed using the recognition process, a model of the object, and arbitrary parameter values;
determining a result of the recognition process based on the operation;
The recognition processing method according to claim 1.

When the result of the re-recognition process is not good and, among the scene images of the new article for which the recognition process using the representative parameter value of the most similar pattern was determined to have failed, there is a scene image to which no item position information has been assigned, the process according to claim 2 is executed again;
The recognition processing method according to claim 2.

Each of the plurality of patterns has a representative parameter value group having a plurality of parameter values that enable recognition of a plurality of the registered articles classified into the corresponding pattern;
performing a recognition process for the new article on the plurality of scene images of the new article by using the representative parameter value group of each of the plurality of patterns;
selecting the pattern corresponding to the representative parameter value group having the best parameter value as a result of the recognition process as the most similar pattern to the new article;
determining, when a result of the recognition process using the parameter values belonging to the representative parameter value group of the most similar pattern is good, the parameter values belonging to the representative parameter value group of the most similar pattern as recognizable parameter values of the new article;
The recognition processing method according to claim 1.

A storage unit that stores a plurality of patterns including a plurality of registered articles, each of the patterns having a representative parameter value;
a receiving unit that receives new article information having at least a model of a new article and a plurality of scene images;
a recognition processing unit that performs a recognition process on the plurality of scene images of the new article by using the representative parameter value of each of the plurality of patterns;
a determination unit that selects the pattern corresponding to the representative parameter value that is determined to be the best as a result of the recognition processing by the recognition processing unit as the most similar pattern of the new article;
an output unit that outputs the representative parameter value as a recognizable parameter value of the new article when a result of a recognition process of the new article using the representative parameter value of the selected most similar pattern is good;
A recognition processing device having the above configuration.

If the result of the recognition process of the new article using the representative parameter value of the most similar pattern is not good,
selecting at least one of the plurality of scene images of the new article that is determined to have failed as a result of the recognition process;
adding item position information to the selected scene image;
the determination unit calculates new parameter values based on the selected scene image, the item position information, the model of the new article, and the representative parameter values of the most similar pattern or information for calculating the representative parameter values of the most similar pattern;
the recognition processing unit performs a recognition process of the new article on the plurality of scene images of the new article by using the model of the new article and the new parameter values;
when a result of the recognition process is good, the determination unit determines the new parameter values as recognizable parameter values of the new article;
The recognition processing device according to claim 9.

Each of the plurality of patterns has, as pattern information, at least models of all the registered articles classified into the pattern and a plurality of scene images in which all the registered articles classified into the pattern are captured;
when a result of the recognition process using the representative parameter value of the most similar pattern is good, the determination unit adds the model of the new article and the plurality of scene images in which the new article is captured to the pattern information of the most similar pattern, and classifies the new article into the most similar pattern;
the recognition processing unit, when a result of a recognition process (re-recognition process) of the new article for the plurality of scene images of the new article using the model of the new article and the new parameter values is good, performs a recognition process of the registered article for each of all registered articles classified into the most similar pattern, using the model of the registered article and the new parameter values for the plurality of scene images in which the registered article is captured;
when a result of the recognition process of all the registered articles classified into the most similar pattern is good, the determination unit adds the model of the new article and the plurality of scene images in which the new article is photographed to the pattern information of the most similar pattern;
replacing the representative parameter value of the most similar pattern with the new parameter value, and classifying the new article into the most similar pattern;
The recognition processing device according to claim 10.

If the results of all of the recognition processes are not good,
selecting a portion of the plurality of scene images of the new article;
adding item position information to each of the selected scene images;
the determination unit calculates new parameter values based on the portion of the plurality of scene images of the new article to which item position information is assigned, the item position information, and the model of the new article;
the recognition processing unit executes a recognition process (re-recognition process) of the new article on the plurality of scene images of the new article using the model of the new article and the new parameter values;
the determination unit determines the new parameter values as recognizable parameter values of the new article when a result of the re-recognition process is good,
retaining at least the model of the new article and the plurality of scene images in which the new article is captured as pattern information, creating a new pattern in which the new parameter value is a representative parameter value, and classifying the new article into the new pattern;
The recognition processing device according to claim 11.