JP7759038B2

JP7759038B2 - Information Processing Systems

Info

Publication number: JP7759038B2
Application number: JP2024106989A
Authority: JP
Inventors: 成吉谷井
Original assignee: Marketvision Co Ltd
Current assignee: Marketvision Co Ltd
Priority date: 2023-08-16
Filing date: 2024-07-02
Publication date: 2025-10-23
Anticipated expiration: 2043-08-16
Also published as: JP2025027688A; JP7644914B2; JP2025027980A

Description

The present invention relates to an information processing system that avoids degrading recognition accuracy when identifying objects from image information.

In convenience stores, supermarkets, and other retail stores, products and other objects are generally placed on display shelves for sale. Objects may be displayed several abreast in the horizontal direction so that they catch the shopper's eye, or lined up one behind another so that, even after one is purchased, another shopper can purchase the same object. Managing where on the display shelves each object is displayed, and in what quantity, is important to the sales strategy for those objects.

Therefore, to grasp how objects are actually displayed in a store, there are methods that photograph the display shelves with an imaging device and automatically identify the displayed objects from the captured image information. For example, one method applies image recognition technology to an image of the store's display shelves, based on a specimen image of each object. Examples of such prior art include Patent Documents 1 and 2 listed below.

Patent Document 1: Japanese Unexamined Patent Application Publication No. H05-342230
Patent Document 2: Japanese Unexamined Patent Application Publication No. H05-334409

The invention of Patent Document 1 is a system that helps even a person without specialist knowledge decide on which display shelf an object should be placed. It can therefore indicate where an object should be displayed, but it does not identify the objects that are actually on display. Patent Document 2 is a system that supports the input of object images in a shelf allocation support system, which in turn supports the display of objects. However, the system of Patent Document 2 only supports the input of object images when the shelf allocation support system is used, so even with this system the actual display status of objects cannot be grasped.

Apart from Patent Documents 1 and 2, there is also technology that uses image recognition processing to identify displayed objects from image information obtained by photographing display shelves. This is useful in that it makes it possible to grasp the actual display status in a store.

In the conventional technology, when an object is identified by image recognition processing, a rectangular region in which an object is presumed to exist is detected from image information obtained by photographing a display shelf. The object is then identified either by matching that rectangular region against specimen images of objects, or by running deep learning processing with the rectangular region as the input.

However, the shape (contour) of an object is not necessarily rectangular. When the region presumed to contain an object is detected as a rectangle, part of another object may be captured in that rectangular region, and so may the background. Consequently, using such a rectangular region as training data for deep learning, as an image to be processed, or as the input to image matching processing lowers the accuracy of object identification.

Furthermore, when an object is unknown, for example because it has not been learned, any subsequent identification processing for that object results in misrecognition. Running identification processing that is bound to misrecognize is wasted processing, so such objects should preferably be excluded as early as possible. Conventionally, however, this has not been possible. It is also difficult to detect that an object is an unlearned object in the first place.

In view of the above problems, the inventor has devised an information processing system that improves the accuracy of identifying objects shown in image information.

A first invention is an information processing system that identifies an object shown in image information, comprising: a first learning processing unit that creates a first learning model by machine learning using first annotation data, which associates data indicating the outline of an object with an attribute; and a recognition processing unit that identifies object identification information of the object shown in the image information, wherein the recognition processing unit: uses the image information and the first learning model to specify the outline of the object shown in the image information as an outline region; outputs a reliability for the attribute corresponding to the specified outline region; if the reliability of the attribute corresponding to the specified outline region is not equal to or less than a predetermined threshold, identifies one or more pieces of object identification information for the object shown in the specified outline region, using the image information of the specified outline region together with either a second learning model trained by machine learning on second annotation data, which associates image data of an object with object identification information, or image matching processing; and if the reliability of the attribute corresponding to the specified outline region is equal to or less than the predetermined threshold, identifies the object shown in the specified outline region as an unknown object.

As in the present invention, identifying object identification information with a learning model trained by machine learning on the first annotation data allows the region cut out of the image information to follow the outline of the object, instead of being a rectangular region as in the conventional technology. This reduces the amount of other objects and background captured in the region, and so improves the accuracy of object identification.
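The difference between a rectangular crop and a crop along the outline can be illustrated with a small Python sketch. The mask representation (nested lists of 0 and 1) and the function names are assumptions made for this example only; the description does not prescribe them.

```python
def bounding_box(mask):
    """Return (top, bottom, left, right) of the smallest rectangle enclosing the mask."""
    rows = [r for r, row in enumerate(mask) if any(row)]
    if not rows:
        raise ValueError("mask contains no object pixels")
    cols = [c for c in range(len(mask[0])) if any(row[c] for row in mask)]
    return rows[0], rows[-1], cols[0], cols[-1]


def crop_along_outline(image, mask, fill=0):
    """Crop the bounding box, but blank every pixel that lies outside the outline."""
    top, bottom, left, right = bounding_box(mask)
    return [
        [image[r][c] if mask[r][c] else fill for c in range(left, right + 1)]
        for r in range(top, bottom + 1)
    ]


def non_object_fraction(mask):
    """Fraction of the rectangular crop that is NOT part of the object."""
    top, bottom, left, right = bounding_box(mask)
    area = (bottom - top + 1) * (right - left + 1)
    inside = sum(mask[r][c] for r in range(top, bottom + 1)
                 for c in range(left, right + 1))
    return (area - inside) / area
```

For a non-rectangular object, `non_object_fraction` gives a rough measure of how much of a conventional rectangular crop would be background or neighbouring objects, which is the material that `crop_along_outline` removes.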

In addition, if the attribute corresponding to the outline region of a specified object satisfies a predetermined condition, for example its reliability being lower than a predetermined threshold, the outline region can be identified as a predetermined object, such as an unknown object (for example, an object that has not been learned), without performing identification processing of object identification information for that outline region. This makes it possible to detect unknown objects and also reduces the processing load.
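The gating described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the `OutlineRegion` fields, the `identify` callback standing in for the second learning model or image matching processing, and the `UNKNOWN` marker are all hypothetical names, not taken from the description.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence

UNKNOWN = "unknown"  # hypothetical marker for an unknown object


@dataclass
class OutlineRegion:
    """One outline region output by the first learning model (illustrative fields)."""
    pixels: Sequence      # image information cut out along the outline
    attribute: str        # e.g. "can" or "bottle"
    reliability: float    # reliability of that attribute


def recognize(regions: List[OutlineRegion],
              identify: Callable[[OutlineRegion], str],
              threshold: float) -> List[str]:
    """Run the second stage only for regions whose reliability exceeds the threshold."""
    results = []
    for region in regions:
        if region.reliability <= threshold:
            # Equal to or less than the threshold: treat as unknown and
            # skip the second-stage identification entirely.
            results.append(UNKNOWN)
        else:
            results.append(identify(region))
    return results
```

Because `identify` is never called for low-reliability regions, the cost of the second stage is paid only for regions that are likely to be known objects.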

In the above invention, the information processing system can be configured such that the first learning processing unit creates the first learning model by machine learning based on image segmentation using the first annotation data.

When performing the machine learning, machine learning based on an image segmentation method is preferable.

In the above invention, the information processing system can be configured to have a second learning processing unit that creates the second learning model by machine learning using the second annotation data.

Because the outline region consists of a region that follows the outline of the object, it is preferable to identify the object identification information by processing such as that of these inventions. Furthermore, if the object identification information identified for the object satisfies a predetermined condition, for example its reliability being lower than a predetermined threshold, the outline region is identified as a predetermined object, such as an unknown object, regardless of the identified object identification information. This avoids adopting object identification information of low reliability.
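The second-stage check described above can be sketched in Python as follows. The candidate format (a list of pairs of object identification information and reliability), the function name, and the `UNKNOWN` marker are assumptions made for illustration.

```python
from typing import List, Tuple

UNKNOWN = "unknown"  # hypothetical marker for an unknown object


def finalize_identification(candidates: List[Tuple[str, float]],
                            threshold: float) -> str:
    """Return the most reliable candidate, or UNKNOWN if even that one is too weak."""
    if not candidates:
        return UNKNOWN
    best_id, best_reliability = max(candidates, key=lambda c: c[1])
    # A reliability lower than the threshold overrides the identified result.
    if best_reliability < threshold:
        return UNKNOWN
    return best_id
```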

In the above invention, the information processing system can be configured such that the recognition processing unit identifies the object identification information of the object shown in the outline region by performing image matching processing between the image information of the specified outline region and specimen information of objects stored in a specimen information storage unit.

As in the present invention, image matching processing may be used to identify the object identification information of the object shown in the outline region.
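One simple way to picture image matching against stored specimen information is the Python sketch below. It is an illustration only: cosine similarity over flattened pixel values is a stand-in chosen for brevity, not the matching method of the invention, and the function names and the dictionary layout of the specimen information are hypothetical.

```python
import math
from typing import Dict, Optional, Sequence, Tuple


def similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equally sized flat pixel lists."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def match_specimen(region_pixels: Sequence[float],
                   specimens: Dict[str, Sequence[float]]
                   ) -> Tuple[Optional[str], float]:
    """Return (object identification information, score) of the best-matching specimen."""
    best_id, best_score = None, -1.0
    for object_id, pixels in specimens.items():
        if len(pixels) != len(region_pixels):
            continue  # this sketch only compares images of the same size
        score = similarity(region_pixels, pixels)
        if score > best_score:
            best_id, best_score = object_id, score
    return best_id, best_score
```

The returned score can then serve as the reliability that is compared against a threshold when deciding whether to treat the object as unknown.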

A fifth invention is an information processing system that identifies an object shown in image information, comprising a first learning processing unit that creates a first learning model by machine learning using first annotation data, which associates data indicating the outline of an object with an attribute, wherein the information processing system: causes the outline of the object shown in the image information to be specified as an outline region using the image information and the first learning model; causes a reliability for the attribute corresponding to the specified outline region to be output; if the reliability of the attribute corresponding to the specified outline region is not equal to or less than a predetermined threshold, causes object identification information of the object shown in the specified outline region to be identified, using the image information of the specified outline region together with either a second learning model trained by machine learning on second annotation data, which associates image data of an object with object identification information, or image matching processing; and if the reliability of the attribute corresponding to the specified outline region is equal to or less than the predetermined threshold, causes the object shown in the specified outline region to be identified as an unknown object.

A sixth invention is an information processing system that identifies an object shown in image information, comprising: an image information input reception processing unit that receives input of image information obtained by photographing an object; and an object recognition processing unit that identifies object identification information of the object shown, from the received image information or from image information obtained by rectifying that image information, wherein the object recognition processing unit: uses a first learning model, created by machine learning using first annotation data that associates data indicating the outline of an object with an attribute, together with the received image information or the rectified image information, to specify the outline of the object shown as an outline region; outputs a reliability for the attribute corresponding to the specified outline region; if the reliability of the attribute corresponding to the specified outline region is not equal to or less than a predetermined threshold, identifies object identification information of the object shown in the specified outline region, using the image information of the specified outline region together with either a second learning model trained by machine learning on second annotation data, which associates image data of an object with object identification information, or image matching processing; and if the reliability of the attribute corresponding to the specified outline region is equal to or less than the predetermined threshold, identifies the object shown in the specified outline region as an unknown object.

Configurations such as these inventions also provide the same technical effects as the first invention.

The first invention can be realized by loading the program of the present invention into a computer and executing it. That is, it is an information processing program that causes a computer to function as: a first learning processing unit that creates a first learning model by machine learning using first annotation data, which associates data indicating the outline of an object with an attribute; and a recognition processing unit that identifies object identification information of an object shown in image information, wherein the recognition processing unit: uses the image information and the first learning model to specify the outline of the object shown in the image information as an outline region; outputs a reliability for the attribute corresponding to the specified outline region; if the reliability of the attribute corresponding to the specified outline region is not equal to or less than a predetermined threshold, identifies one or more pieces of object identification information for the object shown in the specified outline region, using the image information of the specified outline region together with either a second learning model trained by machine learning on second annotation data, which associates image data of an object with object identification information, or image matching processing; and if the reliability of the attribute corresponding to the specified outline region is equal to or less than the predetermined threshold, identifies the object shown in the specified outline region as an unknown object.

The fifth invention can be realized by loading the program of the present invention into a computer and executing it. That is, it is an information processing program that causes a computer to function as a first learning processing unit that creates a first learning model by machine learning using first annotation data, which associates data indicating the outline of an object with an attribute, wherein the program: causes the outline of an object shown in image information to be specified as an outline region using the image information and the first learning model; causes a reliability for the attribute corresponding to the specified outline region to be output; if the reliability of the attribute corresponding to the specified outline region is not equal to or less than a predetermined threshold, causes object identification information of the object shown in the specified outline region to be identified, using the image information of the specified outline region together with either a second learning model trained by machine learning on second annotation data, which associates image data of an object with object identification information, or image matching processing; and if the reliability of the attribute corresponding to the specified outline region is equal to or less than the predetermined threshold, causes the object shown in the specified outline region to be identified as an unknown object.

The sixth invention can be realized by loading the program of the present invention into a computer and executing it. That is, it is an information processing program that causes a computer to function as: an image information input reception processing unit that receives input of image information obtained by photographing an object; and an object recognition processing unit that identifies object identification information of the object shown, from the received image information or from image information obtained by rectifying that image information, wherein the object recognition processing unit: uses a first learning model, created by machine learning using first annotation data that associates data indicating the outline of an object with an attribute, together with the received image information or the rectified image information, to specify the outline of the object shown as an outline region; outputs a reliability for the attribute corresponding to the specified outline region; if the reliability of the attribute corresponding to the specified outline region is not equal to or less than a predetermined threshold, identifies object identification information of the object shown in the specified outline region, using the image information of the specified outline region together with either a second learning model trained by machine learning on second annotation data, which associates image data of an object with object identification information, or image matching processing; and if the reliability of the attribute corresponding to the specified outline region is equal to or less than the predetermined threshold, identifies the object shown in the specified outline region as an unknown object.

Using the information processing system of the present invention makes it possible to improve identification accuracy when identifying objects shown in image information.

FIG. 1 is a block diagram schematically showing an example of the configuration of the information processing system of the present invention.
FIG. 2 is a block diagram schematically showing an example of the configuration of the object recognition processing unit in the information processing system of the present invention.
FIG. 3 is a block diagram schematically showing an example of the hardware configuration of a computer used in the information processing system of the present invention.
FIG. 4 is a flowchart showing an example of the learning process in the information processing system of the present invention.
FIG. 5 is a flowchart showing an example of the recognition process in the information processing system of the present invention.
FIG. 6 is a diagram schematically showing an example of the first annotation data.
FIG. 7 is a diagram schematically showing another example of the first annotation data.
FIG. 8 is a diagram schematically showing an example of the second annotation data.
FIG. 9 is a diagram schematically showing another example of the second annotation data.
FIG. 10 is a diagram showing an example of captured image information.
FIG. 11 is a diagram showing another example of captured image information.
FIG. 12 is a diagram showing an example of image information obtained by rectifying the captured image information of FIG. 10.
FIG. 13 is a diagram showing an example of image information obtained by rectifying the captured image information of FIG. 11.
FIG. 14 is a block diagram schematically showing an example of the configuration of the information processing system in Embodiment 2.
FIG. 15 is a diagram schematically showing a state in which input designating shelf tier regions has been received for rectified image information, obtained by rectifying image information of a photographed display shelf on which objects are displayed.
FIG. 16 is a diagram schematically showing a state in which input designating shelf tier regions has been received for rectified image information, obtained by rectifying image information of a photographed display shelf on which objects are hung for display.
FIG. 17 is a diagram showing an example in which outline regions have been specified from the image information of a shelf tier region.
FIG. 18 is a flowchart showing an example of the recognition process in the information processing system of Embodiment 2.
FIG. 19 is a diagram showing an example of captured image information.
FIG. 20 is a diagram showing an example of rectified image information obtained by performing rectification processing on the captured image information of FIG. 19.
FIG. 21 is a block diagram schematically showing an example of the configuration of the information processing system in Embodiment 3.
FIG. 22 is a diagram showing an example of the specimen information stored in the specimen information storage unit.
FIG. 23 is a block diagram schematically showing an example of the object recognition processing unit in Embodiment 4.

An example of the processing functions of the information processing system 1 of the present invention is shown as block diagrams in FIGS. 1 and 2. The information processing system 1 uses a management terminal 2 and an image information input terminal 3. FIG. 1 is a block diagram showing the overall functions of the information processing system 1, and FIG. 2 is a block diagram showing the functions of the object recognition processing unit 213, which is described later.

The management terminal 2 is a computer used by an organization, such as a company, that operates the information processing system 1. The image information input terminal 3 is a terminal used to input image information containing objects to be identified, for example image information obtained by photographing a store display shelf on which products to be identified are displayed.

The management terminal 2 and the image information input terminal 3 in the information processing system 1 are realized using computers. FIG. 3 schematically shows an example of the hardware configuration of such a computer. The computer has an arithmetic device 70, such as a CPU, that executes the arithmetic processing of programs; a storage device 71, such as RAM or a hard disk, that stores information; a display device 72, such as a display, that displays information; an input device 73, such as a keyboard or mouse, through which information can be input; and a communication device 74 that sends and receives the processing results of the arithmetic device 70 and the information stored in the storage device 71 over a network such as the Internet or a LAN.

If the computer has a touch panel display, the display device 72 and the input device 73 may be configured as a single unit. Touch panel displays are often used in portable communication terminals such as tablet computers and smartphones, but are not limited to these.

A touch panel display integrates the functions of the display device 72 and the input device 73, in that input can be made directly on the display with a designated input device (such as a touch panel pen) or a finger.

In addition to the devices described above, the image information input terminal 3 may have an imaging device such as a camera. A portable communication terminal such as a mobile phone, smartphone, or tablet computer can also be used as the image information input terminal 3.

The means of the present invention are distinguished from one another only logically, by function, and may physically or in practice occupy the same area. The order of the processing performed by each means may be changed as appropriate, and part of the processing may be omitted. For example, the rectification processing described later may be omitted, in which case processing is performed on image information that has not been rectified.

The information processing system 1 has a learning processing unit 20, a recognition processing unit 21, and an object information storage unit 22. The learning processing unit 20 has a first learning processing unit 201 and a second learning processing unit 202.

The first learning processing unit 201 uses the first annotation data to perform learning processing by machine learning, preferably by an image segmentation method, on image information containing objects to be identified, for example image information obtained by photographing a display shelf on which products to be identified are displayed. This learning processing is learning processing in machine learning; for example, learning processing by image segmentation is executed to create a learning model that uses deep learning.

The first annotation data is data that associates two items: data in which the contour of an object that may be subject to identification is taken as its outline and the inside of the closed region formed by that contour is masked, and a tag (label) that classifies the attribute of the object's contour. The outline may be the contour of the object itself, or it may be a shape given some margin from the contour, such as a shape that contains the contour and follows it, or a shape that indicates the contour. In other words, the outline of an object is not limited to a rectangular region and may be a non-rectangular region that reflects the object's contour.

たとえば物体が陳列棚に陳列されている商品の場合、第１のアノテーションデータとしては、陳列棚に陳列される可能性のある物体の輪郭を外形とし、その輪郭による閉領域の内側をマスク処理したデータと、その物体の輪郭の属性を分類したタグ（ラベル）とを対応づけたデータである。 For example, if the object is a product displayed on a display shelf, the first annotation data is data in which the outline of the object that may be displayed on the display shelf is used as the outer shape, and the inside of the closed area formed by the outline is masked, and a tag (label) that classifies the attributes of the object's outline is associated with the data.

第１のアノテーションデータは、一つの物体に一つでなくてもよく、一つの物体に複数あってもよい。すなわち、複数の方向から物体の輪郭を外形としてその内側をマスク処理したデータと属性とを対応づけて、それぞれを当該物体の第１のアノテーションデータとしてもよい。属性とは物体の種別である。物体の種別としては、動物の種別（象、虎、ライオンなど）、鳥類の種別（ニワトリ、インコ、クジャクなど）、昆虫の種別（カブトムシ、クワガタ、蝶など）、魚類の種別（マグロ、サバ、イワシなど）、植物（樹木、花卉など）の種別（桜、梅、百合、薔薇、菊、チューリップなど）、移動体（乗り物、飛行体など）の種別（自動車、自転車、自動二輪車、飛行機、ヘリコプター、ドローン、ＵＡＶ（Unmanned Aircraft Vehicle）、船舶、ボートなど）のほか、物体の容器の分類や物体の物体識別情報（ＪＡＮコードなど）など、各種の種別が含まれる。容器の分類としては、缶、ビン、箱、パウチ容器など容器の種類であってもよいし、洗剤容器のように用途に応じたさらに細分化されたものであってもよい。すなわち、属性とは、当該物体の輪郭による閉領域がどのように分類されるかを示すものであればよい。第１のアノテーションデータの一例を図６および図７に示す。図６および図７では、物体として陳列商品の場合を示しているが、それに限定されないことは上述のとおりである。物体識別情報としては、ＪＡＮコードに限られるものではなく、物体を一意に識別できる情報であれば如何なる情報であってもよい。 There does not have to be exactly one piece of first annotation data per object; a single object may have several. In other words, data obtained by taking the object's contour as the outer shape from each of multiple directions and masking its interior can be associated with an attribute, and each such pairing can serve as first annotation data for that object. An attribute is the type of the object. Object types include animal types (e.g., elephant, tiger, lion), bird types (e.g., chicken, parakeet, peacock), insect types (e.g., rhinoceros beetle, stag beetle, butterfly), fish types (e.g., tuna, mackerel, sardine), plant (e.g., tree, flower) types (e.g., cherry blossom, plum, lily, rose, chrysanthemum, tulip), and mobile object (e.g., vehicle, aircraft) types (e.g., automobile, bicycle, motorcycle, airplane, helicopter, drone, UAV (Unmanned Aircraft Vehicle), ship, boat), as well as various other types, such as the classification of the object's container and the object's object identification information (e.g., JAN code). Container classifications may be container types, such as cans, bottles, boxes, or pouches, or may be further subdivided according to use, such as detergent containers. In other words, an attribute only needs to indicate how the closed region defined by the object's contour is classified. Examples of the first annotation data are shown in Figures 6 and 7. Figures 6 and 7 show the case of displayed products as objects, but as mentioned above, this is not a limitation. Object identification information is not limited to JAN codes, and may be any information that can uniquely identify an object.

図６は、缶の輪郭とその輪郭による閉領域の内側をマスク処理したデータと、属性として「缶」を対応づけて第１のアノテーションデータとした場合を示しており、図７は、詰め替え用シャンプーの輪郭とその輪郭による閉領域の内側をマスク処理したデータと、属性として「パウチ容器」を対応づけて第１のアノテーションデータとした場合を示している。第１のアノテーションデータにおける属性としては、物体の輪郭自体から物体を同定できるような場合には、容器の分類ではなく、ＪＡＮコードなどの物体の識別情報を用いてもよい。 Figure 6 shows the first annotation data obtained by associating the outline of a can and the masked area inside the closed area formed by the outline with the attribute "can," while Figure 7 shows the first annotation data obtained by associating the outline of a shampoo refill and the masked area inside the closed area formed by the outline with the attribute "pouch container." As an attribute in the first annotation data, object identification information such as a JAN code may be used instead of the container classification if the object can be identified from the object's outline itself.
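The structure of the first annotation data described above can be illustrated with a minimal sketch. It assumes the outer shape is stored as a polygon of pixel coordinates and that the mask is rasterized by sampling pixel centers; the names `FirstAnnotation`, `point_in_polygon`, and `rasterize_mask` are hypothetical and do not come from the specification.

```python
from dataclasses import dataclass
from typing import List, Tuple

Point = Tuple[float, float]


@dataclass
class FirstAnnotation:
    """One first-annotation record: an outer-shape polygon plus an attribute tag."""
    contour: List[Point]  # vertices of the closed outer shape, in pixel coordinates
    attribute: str        # tag such as "can" or "pouch container" (assumed encoding)


def point_in_polygon(x: float, y: float, polygon: List[Point]) -> bool:
    """Even-odd ray casting: count edge crossings of a ray going in +x direction."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge spans the horizontal line through y
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def rasterize_mask(annotation: FirstAnnotation, width: int, height: int) -> List[List[int]]:
    """Return a height x width grid: 1 inside the closed region, 0 outside."""
    return [
        [1 if point_in_polygon(col + 0.5, row + 0.5, annotation.contour) else 0
         for col in range(width)]
        for row in range(height)
    ]
```

Because the outer shape is a polygon rather than a bounding box, the same record type can hold the non-rectangular shapes mentioned above, such as the pouch container of Figure 7.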

第１の学習処理部２０１での第１のアノテーションデータを用いて機械学習用の学習処理を実行することで、同定の対象となる可能性のある物体を撮影した画像情報から、物体の輪郭の領域を特定するための学習モデル（第１の学習モデル）を作成する。たとえば、陳列棚を撮影した画像情報から、物体の輪郭の領域を特定するための学習モデル（第１の学習モデル）を作成する。なお、第１の学習モデルは、物体の輪郭の領域のほか、物体の属性を特定可能な学習モデルであってもよい。 By performing a learning process for machine learning using the first annotation data in the first learning processing unit 201, a learning model (first learning model) for identifying the contour area of an object is created from image information of an object that may be the target of identification. For example, a learning model (first learning model) for identifying the contour area of an object is created from image information of an image of a display shelf. Note that the first learning model may be a learning model that can identify the attributes of an object in addition to the contour area of the object.

第２の学習処理部２０２は、第２のアノテーションデータを用いて、機械学習の学習処理を実行することで、所定の画像情報、好ましくは後述する外形領域の画像情報から、その領域にある物体の物体識別情報を同定するための学習モデル（第２の学習モデル）を作成する。この際の学習処理としては、好ましくは画像分類（Image Classification）の方法による学習処理を実行するとよいが、物体検出（Object Detection）、画像分類・物体位置特定（Image Classification・Localization）などの方法であってもよい。なお、第２の学習モデルは、外形領域の画像情報ほか、その外形領域に対応する属性を第２の学習モデルの入力値として入力可能な学習モデルであってもよい。 The second learning processing unit 202 performs machine learning training using the second annotation data, thereby creating a learning model (second learning model) that identifies, from given image information, preferably the image information of an outer shape region described below, the object identification information of the object in that region. The training in this case is preferably performed using an image classification method, but methods such as object detection or image classification with object localization may also be used. The second learning model may also be a model that accepts, in addition to the image information of the outer shape region, the attribute corresponding to that region as an input value.

第２のアノテーションデータとは、同定の対象となる可能性のある物体を撮影した画像情報と、その物体の物体識別情報をタグ（ラベル）として対応づけたデータである。たとえば物体が陳列商品の場合、陳列棚に陳列される可能性のある商品の画像情報と、その物体の物体識別情報をタグ（ラベル）として対応づけたデータである。また物体が動物の場合には、同定の対象となる可能性のある動物の画像情報と、動物の名称などの識別情報（学名、通称名など）を物体識別情報のタグ（ラベル）として対応づけたデータである。第２のアノテーションデータの一例を図８および図９に示す。図８および図９では、物体として陳列商品の場合を示しているが、それに限定されないことは上述のとおりである。第２のアノテーションデータも第１のアノテーションデータと同様に、一物体に一つでなくてもよく、複数あってもよい。すなわち、複数の方向から物体を撮影し、各方向からの物体の画像情報と物体識別情報を対応づけて第２のアノテーションデータとしてもよい。また、第２の学習モデルで入力値として外形領域に対応する属性も入力可能とする場合には、第２のアノテーションデータとして、同定の対象となる可能性のある物体を撮影した画像情報とその物体の属性と、その物体の物体識別情報をタグ（ラベル）として対応づけたデータとしてもよい。 The second annotation data is data that associates image information of an object that may be identified with the object identification information of that object as a tag (label). For example, if the object is a displayed product, the data associates image information of a product that may be displayed on a shelf with the object identification information of that object as a tag (label). If the object is an animal, the data associates image information of an animal that may be identified with the animal's name or other identification information (scientific name, common name, etc.) as the object identification information tag (label). Examples of second annotation data are shown in Figures 8 and 9. While Figures 8 and 9 show the case of a displayed product as the object, as mentioned above, this is not a limitation. Like the first annotation data, the second annotation data does not have to be one per object; there may be several. In other words, the object may be photographed from multiple directions, and the image information of the object from each direction may be associated with the object identification information to create the second annotation data. Furthermore, if the second learning model also accepts the attribute corresponding to the outer shape region as an input value, the second annotation data may be data that associates image information of an object that may be the subject of identification and the attribute of that object with the object identification information of that object as a tag (label).

図８は、缶の画像情報と、物体識別情報とを対応づけて第２のアノテーションデータとした場合を示しており、図９は、詰め替え用シャンプーの画像情報と、物体識別情報とを対応づけて第２のアノテーションデータとした場合を示している。図８の第２のアノテーションデータは、図６の第１のアノテーションデータに対応し、図９の第２のアノテーションデータは、図７の第１のアノテーションデータに対応する。 Figure 8 shows the case where image information of a can is associated with object identification information to form second annotation data, and Figure 9 shows the case where image information of a shampoo refill is associated with object identification information to form second annotation data. The second annotation data in Figure 8 corresponds to the first annotation data in Figure 6, and the second annotation data in Figure 9 corresponds to the first annotation data in Figure 7.

認識処理部２１は、画像情報入力受付処理部２１０と画像情報記憶部２１１と画像情報正置化処理部２１２と物体認識処理部２１３とを有する。 The recognition processing unit 21 has an image information input reception processing unit 210, an image information storage unit 211, an image information alignment processing unit 212, and an object recognition processing unit 213.

画像情報入力受付処理部２１０は、画像情報入力端末３で撮影した、同定対象となる物体を含む画像情報（撮影画像情報）の入力を受け付け、後述する画像情報記憶部２１１に記憶させる。画像情報入力端末３からは、撮影画像情報のほか、撮影日時、画像情報を識別する画像情報識別情報などをあわせて入力を受け付けるとよい。物体が陳列棚に陳列される陳列商品の場合の撮影画像情報の一例を図１０、図１１に示す。図１０、図１１では、陳列棚に３段の棚段があり、そこに物体が陳列されている撮影画像情報である。なお、本発明においては特にその処理を明記はしないが、陳列棚や棚段は横方向に長いことが多い。そのため、その処理においては、一定の幅で区切り、各処理の処理対象としてもよい。 The image information input reception processing unit 210 receives input of image information (captured image information) that includes the objects to be identified and was captured by the image information input terminal 3, and stores it in the image information storage unit 211 (described below). Along with the captured image information, it is preferable to also receive from the image information input terminal 3 the date and time of capture and image information identification information that identifies the image information. Examples of captured image information for the case where the objects are products displayed on a display shelf are shown in Figures 10 and 11. Figures 10 and 11 show captured image information of a display shelf that has three shelf levels on which objects are displayed. Although this processing is not explicitly described for the present invention, display shelves and shelf levels are often long in the horizontal direction. The image may therefore be divided into sections of a fixed width, and each section may be treated as the target of each process.
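Dividing a horizontally long shelf image into sections of a fixed width can be sketched as follows. This is only an illustration under the assumption that sections are non-overlapping pixel ranges and that the last section may be narrower; the function name `split_into_sections` is hypothetical.

```python
from typing import List, Tuple


def split_into_sections(image_width: int, section_width: int) -> List[Tuple[int, int]]:
    """Return (left, right) pixel ranges covering the image; right is exclusive."""
    if section_width <= 0:
        raise ValueError("section_width must be positive")
    return [
        (left, min(left + section_width, image_width))
        for left in range(0, image_width, section_width)
    ]
```

Each returned range can then be passed to the later processing steps as its own processing target.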

画像情報記憶部２１１は、画像情報入力端末３から受け付けた撮影画像情報、撮影日時、画像情報識別情報などを対応づけて記憶する。撮影画像情報とは、本発明の処理対象となる画像情報であればよい。一般的には、単に撮影した場合、撮影対象物を正対した状態で撮影することが困難であることから、それを正対した状態に補正する補正処理、たとえば台形補正処理などを実行することがよい。一つの撮影の対象を複数枚で撮影した場合に、それが一つの画像情報として合成された画像情報も含まれる。また、歪み補正処理が実行された後の画像情報も撮影画像情報に含まれる。 The image information storage unit 211 stores captured image information, shooting date and time, image information identification information, etc. received from the image information input terminal 3, in association with each other. Captured image information may be any image information that is the subject of processing in the present invention. Generally, when simply capturing an image, it is difficult to capture the subject facing forward, so it is advisable to perform a correction process to correct the image to a forward facing state, such as keystone correction. This also includes image information in which multiple images of a single subject are taken and then combined into a single image. Image information after distortion correction has been performed is also included in captured image information.

画像情報正置化処理部２１２は、画像情報記憶部２１１に記憶した撮影画像情報に対して、撮影対象物が正対した状態になるように補正する処理（正置化処理）、たとえば台形補正処理を実行した正置画像情報を生成する。台形補正処理は、撮影画像情報に写っている物体を正対した状態となるように行う補正処理である。たとえば、物体が陳列商品の場合、陳列商品を陳列する棚段が水平になるように行う補正処理である。正置化とは、撮影装置のレンズの光軸を撮影対象である平面の垂線方向に沿って、十分に遠方から撮影した場合と同じになるように画像情報を変形させることであり、たとえば台形補正処理があるが、それに限定するものではない。 The image information alignment processing unit 212 generates upright image information by performing, on the captured image information stored in the image information storage unit 211, a process that corrects the image so that the photographed subject appears to be viewed squarely from the front (alignment processing), for example keystone correction. Keystone correction is a correction process performed so that the objects in the captured image information appear to be viewed squarely from the front. For example, if the objects are displayed products, the correction makes the shelf levels on which the products are displayed horizontal. Alignment means transforming the image information so that it matches what would be captured from a sufficiently distant position with the optical axis of the camera lens along the normal to the plane being photographed; keystone correction is one example, but the process is not limited to it.

画像情報正置化処理部２１２が実行する台形補正処理は、撮影画像情報において４頂点の指定の入力を受け付け、その各頂点を用いて台形補正処理を実行する。指定を受け付ける４頂点としては、物体が陳列棚に陳列する商品の場合、陳列棚の棚段の４頂点であってもよいし、陳列棚の棚位置の４頂点であってもよい。また、２段、３段の棚段のまとまりの４頂点であってもよい。４頂点としては任意の４点を指定できる。図１２に図１０の撮影画像情報を、図１３に図１１の撮影画像情報をそれぞれ正置化した撮影画像情報（正置画像情報）の一例を示す。 For the keystone correction, the image information alignment processing unit 212 accepts input designating four vertices in the captured image information and performs the correction using those vertices. If the objects are products displayed on a display shelf, the four designated vertices may be the four vertices of a shelf level of the display shelf, or the four vertices of a shelf position on the display shelf. They may also be the four vertices of a group of two or three shelf levels. Any four points can be designated as the four vertices. Figure 12 shows an example of captured image information after alignment (upright image information) for the captured image information of Figure 10, and Figure 13 shows the same for the captured image information of Figure 11.
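One common way to realize keystone correction from four designated vertices is to compute a projective transform (homography) that maps them to the corners of a rectangle. The specification does not state which method is used, so the following is a sketch under that assumption, written in plain Python. The function names are hypothetical, and the last homography coefficient is fixed to 1, which assumes the image origin does not map to infinity.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def solve_linear_system(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate point configuration")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def homography_from_four_points(src: Sequence[Point], dst: Sequence[Point]) -> List[float]:
    """Return [h11, h12, h13, h21, h22, h23, h31, h32] with h33 fixed to 1."""
    a: List[List[float]] = []
    b: List[float] = []
    for (x, y), (u, v) in zip(src, dst):
        # u = (h11 x + h12 y + h13) / (h31 x + h32 y + 1), and likewise for v
        a.append([x, y, 1.0, 0.0, 0.0, 0.0, -u * x, -u * y])
        b.append(u)
        a.append([0.0, 0.0, 0.0, x, y, 1.0, -v * x, -v * y])
        b.append(v)
    return solve_linear_system(a, b)


def warp_point(h: Sequence[float], p: Point) -> Point:
    """Apply the homography to one point of the captured image."""
    x, y = p
    w = h[6] * x + h[7] * y + 1.0
    return ((h[0] * x + h[1] * y + h[2]) / w, (h[3] * x + h[4] * y + h[5]) / w)
```

In practice an image library would resample every pixel with the resulting transform; the sketch only shows how the four designated vertices determine it.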

物体認識処理部２１３は、画像情報、好ましくは撮影画像情報若しくは正置画像情報に写っている画像情報からそこに写っている物体を認識する処理を実行する。 The object recognition processing unit 213 performs processing to recognize objects captured in image information, preferably captured image information or upright image information.

物体認識処理部２１３は、外形特定処理部２１３１と物体同定処理部２１３２とを有する。 The object recognition processing unit 213 has an outer shape identification processing unit 2131 and an object identification processing unit 2132.

外形特定処理部２１３１は、正置画像情報に写っている物体の外形の領域（外形領域）を特定する。 The outer shape identification processing unit 2131 identifies the region of the outer shape (outer shape region) of each object captured in the upright image information.

外形特定処理部２１３１は、第１の学習処理部２０１において学習させた学習モデル（第１の学習モデル）に、撮影画像情報、正置画像情報を入力値として入力し、入力した画像情報から外形領域を特定する。すなわち、外形特定処理部２１３１は、第１の学習処理部２０１において学習させた学習モデル（中間層が多数の層からなるニューラルネットワークの各層のニューロン間の重み付け係数が最適化された学習モデル）に対して、処理対象とする領域、たとえば正置画像情報を入力し、その出力値に基づいて、外形の領域を特定する。特定した外形の領域については、外形領域を識別する外形識別情報を割り当てて、撮影画像情報、正置画像情報における位置情報（たとえば画像情報における座標）とともに物体識別情報記憶部２２に記憶させる。 The contour identification processing unit 2131 inputs the captured image information and upright image information as input values to the learning model (first learning model) trained by the first learning processing unit 201, and identifies the contour region from the input image information. That is, the contour identification processing unit 2131 inputs the region to be processed, such as upright image information, to the learning model trained by the first learning processing unit 201 (a learning model in which the weighting coefficients between neurons in each layer of a neural network consisting of multiple intermediate layers are optimized), and identifies the contour region based on the output value. The identified contour region is assigned contour identification information that identifies the contour region, and stored in the object identification information storage unit 22 together with position information in the captured image information and upright image information (for example, coordinates in the image information).

外形特定処理部２１３１で出力する出力値としては、その外形領域が対応する属性のタグとそれに対する信頼度（確率）も合わせて出力するとよい。この際に、もっとも高い信頼度の属性のタグを当該外形領域のタグとして特定するが、その信頼度が一定の閾値以下であれば、特定した外形領域について、未知の物体（学習していないなどの理由によって同定できない物体）などの所定の物体であると同定をする。 The output value output by the contour identification processing unit 2131 should also include the attribute tag corresponding to the contour region and its reliability (probability). In this case, the attribute tag with the highest reliability is identified as the tag for the contour region, and if the reliability is below a certain threshold, the identified contour region is identified as a specified object, such as an unknown object (an object that cannot be identified for reasons such as not being learned).

物体同定処理部２１３２は、外形領域に表示されている物体の物体識別情報を、第２の学習処理部２０２において学習させた学習モデル（第２の学習モデル）に、外形領域の画像情報を入力値として入力し、入力された画像情報からその領域にある物体の識別情報を同定する。すなわち、物体同定処理部２１３２は、第２の学習処理部２０２において学習させた学習モデル（中間層が多数の層からなるニューラルネットワークの各層のニューロン間の重み付け係数が最適化された学習モデル）に対して、処理対象とする外形領域の画像情報を入力し、その出力値に基づいて、外形領域にある物体の物体識別情報を同定する。この際に、入力値として外形領域の画像情報のほか、外形特定処理部２１３１で特定したその外形領域が対応する属性のタグの情報を入力値として入力してもよい。なお、この際には、外形領域について一つの物体識別情報を同定してもよいし、複数の物体識別情報を同定してもよい。複数の物体識別情報を同定する場合には、同定する物体識別情報の候補を出力することとなる。 The object identification processing unit 2132 inputs the image information of the outline region as an input value to a learning model (second learning model) trained by the second learning processing unit 202, and identifies the identification information of the object in that region from the input image information. That is, the object identification processing unit 2132 inputs the image information of the outline region to be processed into the learning model trained by the second learning processing unit 202 (a learning model in which the weighting coefficients between neurons in each layer of a neural network consisting of multiple intermediate layers are optimized), and identifies the object identification information of the object in the outline region based on the output value. At this time, in addition to the image information of the outline region, tag information of the attribute corresponding to the outline region identified by the outline identification processing unit 2131 may also be input as an input value. Note that at this time, one object identification information or multiple object identification information may be identified for the outline region. If multiple object identification information is identified, candidates for the object identification information to be identified are output.

物体同定処理部２１３２で出力する出力値としては、その外形領域が同定した物体識別情報の信頼度（確率）も合わせて出力する。この際に、もっとも高い信頼度の物体識別情報を当該外形領域の物体識別情報として同定するが、もっとも高い信頼度が一定の閾値以下であれば、入力した外形領域について、未知の物体と同定をする。 The output value output by the object identification processing unit 2132 also includes the reliability (probability) of the object identification information identified by that outline region. At this time, the object identification information with the highest reliability is identified as the object identification information for that outline region, but if the highest reliability is below a certain threshold, the input outline region is identified as an unknown object.
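The threshold rule described for both processing units (take the label with the highest confidence, but treat the region as an unknown object when that confidence is at or below a threshold) can be sketched as follows. The function name `select_label`, the `UNKNOWN` marker, and the dictionary representation of the model output are assumptions made for illustration.

```python
from typing import Dict, Tuple

UNKNOWN = "unknown"  # hypothetical marker for an object that cannot be identified


def select_label(confidences: Dict[str, float], threshold: float) -> Tuple[str, float]:
    """Pick the highest-confidence label; return UNKNOWN at or below the threshold."""
    if not confidences:
        return UNKNOWN, 0.0
    label, score = max(confidences.items(), key=lambda item: item[1])
    if score <= threshold:
        return UNKNOWN, score
    return label, score
```

The same function applies to attribute tags from the first learning model and to object identification information from the second, each with its own threshold.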

物体認識処理部２１３は、外形特定処理部２１３１、物体同定処理部２１３２の処理をまとめて深層学習などによって実行してもよい。 The object recognition processing unit 213 may perform the processing of the outer shape identification processing unit 2131 and the object identification processing unit 2132 together, using deep learning or the like.

物体識別情報記憶部２２は、撮影画像情報、正置画像情報に写っている物体の物体識別情報を示す情報を記憶する。たとえば、物体識別情報に対応付けて、撮影日時情報、撮影画像情報の画像情報識別情報、正置画像情報の画像識別情報、外形を識別するための外形識別情報に対応づけて物体識別情報記憶部２２に記憶する。 The object identification information storage unit 22 stores information indicating the object identification information of objects shown in the photographed image information and upright image information. For example, the object identification information is associated with the photographing date and time information, image information identification information of the photographed image information, image identification information of the upright image information, and external shape identification information for identifying the external shape, and these are stored in the object identification information storage unit 22.

つぎに本発明の情報処理システム１の処理プロセスの一例を図４および図５のフローチャートを用いて説明する。 Next, an example of the processing process of the information processing system 1 of the present invention will be explained using the flowcharts in Figures 4 and 5.

まず、本発明の情報処理システム１の認識処理部２１で用いる学習モデルを学習するための、学習処理を、図４のフローチャートを用いて説明する。 First, the learning process for learning the learning model used in the recognition processing unit 21 of the information processing system 1 of the present invention will be explained using the flowchart in Figure 4.

第１の学習処理部２０１における学習モデルの教師データとして、第１のアノテーションデータを作成する（Ｓ１００）。第１のアノテーションデータは、同定対象となる可能性のある物体の輪郭により外形を形成し、その外形の内側の閉領域をマスク処理した画像データとする。この画像データに、属性をタグとして対応づけて作成する。 First annotation data is created as training data for the learning model in the first learning processing unit 201 (S100). The first annotation data is image data in which the outline of an object that may be the target of identification is formed, and the closed area inside that outline is masked. Attributes are associated with this image data as tags.

同様に、第２の学習処理部２０２における学習モデルの教師データとして、第２のアノテーションデータを作成する（Ｓ１１０）。第２のアノテーションデータは、同定対象となる可能性のある物体の画像に、その物体の物体識別情報をタグとして対応づけて作成する。この際の物体の画像情報は、第１のアノテーションデータに対応しているとよく、物体の輪郭を外形とした画像情報であるとよい。 Similarly, second annotation data is created as training data for the learning model in the second learning processing unit 202 (S110). The second annotation data is created by associating an image of an object that may be the target of identification with the object identification information of that object as a tag. The image information of the object in this case preferably corresponds to the first annotation data, and is preferably image information that shows the outline of the object.

そして、作成した第１のアノテーションデータを教師データとして入力し、第１の学習処理部２０１において機械学習用の学習処理を実行し、物体の輪郭の領域を特定するための学習モデル（第１の学習モデル）を作成する（Ｓ１２０）。 Then, the created first annotation data is input as training data, and a learning process for machine learning is performed in the first learning processing unit 201, creating a learning model (first learning model) for identifying the contour area of the object (S120).

また、作成した第２のアノテーションデータを教師データとして入力し、第２の学習処理部２０２において機械学習用の学習処理を実行し、画像情報、好ましくは外形領域からその領域にある物体の物体識別情報を同定するための学習モデル（第２の学習モデル）を作成する（Ｓ１３０）。 Likewise, the created second annotation data is input as training data, and the second learning processing unit 202 performs machine learning training to create a learning model (second learning model) that identifies, from image information, preferably an outer shape region, the object identification information of the object in that region (S130).

以上のような処理を実行することで、各学習モデルを作成することができる。 By performing the above process, each learning model can be created.

つぎに、同定対象となる物体を撮影した画像情報から、その画像情報に写っている物体の物体識別情報を同定するための認識処理を、図５のフローチャートを用いて説明する。 Next, the recognition process for identifying object identification information of an object captured in image information of the object to be identified will be explained using the flowchart in Figure 5.

同定対象となる物体を撮影した撮影画像情報は、画像情報入力端末３から入力され、管理端末２の画像情報入力受付処理部２１０でその入力を受け付ける（Ｓ２００）。また、撮影日時、撮影画像情報の画像情報識別情報の入力を受け付ける。そして、画像情報入力受付処理部２１０は、入力を受け付けた撮影画像情報、撮影日時、撮影画像情報の画像情報識別情報を対応づけて画像情報記憶部２１１に記憶させる。 Photographed image information of an object to be identified is input from the image information input terminal 3, and this input is accepted by the image information input acceptance processing unit 210 of the management terminal 2 (S200). It also accepts input of the shooting date and time and image information identification information of the photographed image information. The image information input acceptance processing unit 210 then associates the accepted input photographed image information, shooting date and time, and image information identification information of the photographed image information and stores them in the image information storage unit 211.

管理端末２において所定の操作入力を受け付けると、画像情報正置化処理部２１２は、画像情報記憶部２１１に記憶する撮影画像情報を抽出し、台形補正処理などの正置化処理を行うための４点の入力を受け付け、正置化処理を実行する（Ｓ２１０）。 When a specified operation input is received on the management terminal 2, the image information alignment processing unit 212 extracts the captured image information stored in the image information storage unit 211, receives input of four points for performing alignment processing such as keystone correction processing, and executes the alignment processing (S210).

そして、正置画像情報に対して、管理端末２において所定の操作入力を受け付けることで、外形を特定する処理を実行する（Ｓ２２０）。すなわち、外形特定処理部２１３１は、第１の学習処理部２０１において学習させた学習モデル（第１の学習モデル）に、撮影画像情報または正置画像情報における一部または全部の領域を入力値として入力し、入力した画像情報から外形領域を特定する。 Then, the management terminal 2 receives a predetermined operational input for the upright image information, and executes a process to identify the outer shape (S220). That is, the outer shape identification processing unit 2131 inputs some or all of the area in the captured image information or upright image information as input values into the learning model (first learning model) trained by the first learning processing unit 201, and identifies the outer shape area from the input image information.

以上のように正置画像情報に写っている各物体の各外形領域を特定し、それぞれの属性（タグ）を同定する。そして、同定したタグの信頼度が一定の閾値以下の場合には（Ｓ２３０）、その外形領域にある物体は未知の物体であると同定をする（Ｓ２４０）。 As described above, the contour regions of each object captured in the upright image information are identified, and their attributes (tags) are identified. If the reliability of the identified tag is below a certain threshold (S230), the object in that contour region is identified as an unknown object (S240).

一方、同定したタグの信頼度が一定の閾値より大きい場合には（Ｓ２３０）、物体同定処理部２１３２は、第２の学習処理部２０２において学習させた学習モデル（第２の学習モデル）に、外形領域の画像情報を入力値として入力し、外形領域に写っている物体の物体識別情報を同定する（Ｓ２５０）。そして同定した物体識別情報の信頼度が一定の閾値以下の場合には（Ｓ２６０）、その外形領域にある物体は未知の物体であると同定をする（Ｓ２４０）。 On the other hand, if the reliability of the identified tag is greater than a certain threshold (S230), the object identification processing unit 2132 inputs the image information of the outline region as an input value into the learning model (second learning model) trained in the second learning processing unit 202, and identifies the object identification information of the object appearing in the outline region (S250). If the reliability of the identified object identification information is equal to or less than a certain threshold (S260), the object in the outline region is identified as an unknown object (S240).

一方、同定した物体識別情報の信頼度が一定の閾値より大きい場合には（Ｓ２６０）、同定した物体識別情報は、撮影日時、撮影画像情報の画像情報識別情報、正置画像情報の画像情報識別情報、外形識別情報に対応づけて物体識別情報記憶部２２に記憶させる。 On the other hand, if the reliability of the identified object identification information is greater than a certain threshold (S260), the identified object identification information is stored in the object identification information storage unit 22 in association with the shooting date and time, image information identification information of the captured image information, image information identification information of the upright image information, and external shape identification information.
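The branch structure of steps S220 to S260 can be summarized in a short sketch. It assumes the outer shape regions have already been extracted and that each learning model is wrapped in a callable returning a (label, confidence) pair; `recognize`, `RegionResult`, `attribute_of`, and `identify` are hypothetical names, and the models themselves are stand-ins.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

Prediction = Tuple[str, float]  # (label, confidence)


@dataclass
class RegionResult:
    region_id: int             # stands in for the outer shape identification information
    attribute: Optional[str]   # None when the attribute tag was not trusted
    object_id: Optional[str]   # None means "unknown object"


def recognize(
    regions: List[object],
    attribute_of: Callable[[object], Prediction],   # wraps the first learning model
    identify: Callable[[object, str], Prediction],  # wraps the second learning model
    attribute_threshold: float,
    object_threshold: float,
) -> List[RegionResult]:
    results: List[RegionResult] = []
    for region_id, region in enumerate(regions):
        attribute, attr_conf = attribute_of(region)            # S220
        if attr_conf <= attribute_threshold:                   # S230
            results.append(RegionResult(region_id, None, None))       # S240
            continue
        object_id, obj_conf = identify(region, attribute)      # S250
        if obj_conf <= object_threshold:                       # S260
            results.append(RegionResult(region_id, attribute, None))  # S240
            continue
        results.append(RegionResult(region_id, attribute, object_id))
    return results
```

Only results whose `object_id` is set would be stored with the shooting date and time and the image identification information; passing the attribute to `identify` reflects the optional case in which the second learning model also takes the attribute as input.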

なお、同定した物体識別情報の修正処理についての入力を受け付けてもよい。 In addition, input regarding the process of correcting the identified object identification information may also be accepted.

以上のような処理を行うことで、撮影画像情報に写っている物体の物体識別情報を同定することができる。また従来のシステムのように、外形領域を矩形領域とせず、物体の輪郭の外形に沿って物体の同定を行うので、外形領域に含まれる不要な情報、たとえば他の物体などのノイズが除外されるので、認識精度が向上することとなる。 By performing the above processing, it is possible to identify the object identification information of objects captured in captured image information. Furthermore, unlike conventional systems, the outer region is not a rectangular region, but rather the object is identified along the outline of the object's contour. This eliminates unnecessary information contained in the outer region, such as noise from other objects, thereby improving recognition accuracy.
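The difference between a rectangular crop and a crop that follows the outer shape can be made concrete with a small sketch. It assumes a grayscale image stored as nested lists and a binary mask of the same size; the name `crop_with_mask` and the choice of filling excluded pixels with a constant are illustrative assumptions, not the method stated in the specification.

```python
from typing import List

Image = List[List[int]]  # grayscale pixels, row-major
Mask = List[List[int]]   # 1 inside the outer shape region, 0 outside


def crop_with_mask(image: Image, mask: Mask, fill: int = 0) -> Image:
    """Crop to the mask's bounding box and blank every pixel outside the mask."""
    rows = [r for r, row in enumerate(mask) if any(row)]
    cols = [c for c in range(len(mask[0])) if any(row[c] for row in mask)]
    if not rows or not cols:
        return []  # empty mask: nothing to crop
    top, bottom, left, right = rows[0], rows[-1], cols[0], cols[-1]
    return [
        [image[r][c] if mask[r][c] else fill for c in range(left, right + 1)]
        for r in range(top, bottom + 1)
    ]
```

A plain bounding-box crop would keep the pixels that the mask blanks out here, which is where parts of neighboring objects would otherwise enter the input to the second learning model.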

なお、第１のアノテーションデータにおける属性として物体識別情報を用いている場合（物体の外形から物体が同定できる場合）には、画像情報を第１の学習モデルに入力して外形領域を特定することで、当該物体の物体識別情報を同定できる。その場合には、第２の学習処理部２０２、物体同定処理部２１３２による処理を実行せずともよく、外形特定処理部２１３１で外形領域を特定すると、その外形領域に写っている物体の物体識別情報を、第１の学習モデルによる出力結果としての属性の物体識別情報で同定してもよい。 Note that when object identification information is used as an attribute in the first annotation data (when an object can be identified from its external shape), the object identification information of the object can be identified by inputting the image information into the first learning model and identifying the external shape region. In this case, it is not necessary to perform processing by the second learning processing unit 202 and the object identification processing unit 2132; once the external shape region is identified by the external shape identification processing unit 2131, the object identification information of the object depicted in that external shape region can be identified using the object identification information of the attribute as an output result from the first learning model.

上述の実施例１の情報処理システム１において、物体として陳列棚に陳列している陳列商品の場合、さらに、棚段の領域（棚段領域）ごとにその処理を実行してもよい。この場合の情報処理システム１の処理機能の一例をブロック図で図１４に示す。 In the information processing system 1 of Example 1 described above, in the case of displayed products displayed on a display shelf as objects, the processing may be further performed for each shelf area (shelf area). An example of the processing functions of the information processing system 1 in this case is shown in a block diagram in Figure 14.

本実施例２の情報処理システム１では、認識処理部２１にさらに棚段特定処理部２１４を有している。 In the information processing system 1 of this embodiment 2, the recognition processing unit 21 further includes a shelf level identification processing unit 214.

棚段特定処理部２１４は、画像情報正置化処理部２１２において撮影画像情報に対して台形補正処理を実行した正置画像情報のうち、物体が配置される可能性のある、陳列棚における棚段の領域（棚段領域）を特定する。撮影画像情報および正置画像情報には陳列棚が写っているが、陳列棚には、物体が陳列される棚段領域がある。そのため、正置画像情報から棚段領域を特定する。棚段領域の特定としては、管理端末２の操作者が手動で棚段領域を指定し、それを棚段特定処理部２１４が受け付けてもよいし、初回に手動で入力を受け付けた棚段領域の情報に基づいて、二回目以降は自動で棚段領域を特定してもよい。 The shelf level identification processing unit 214 identifies the area of a shelf on a display shelf (shelf level area) where an object may be placed, from the upright image information obtained by performing keystone correction processing on the captured image information in the image information alignment processing unit 212. The captured image information and upright image information show a display shelf, but the display shelf has a shelf level area where objects are displayed. Therefore, the shelf level area is identified from the upright image information. To identify the shelf level area, the operator of the management terminal 2 may manually specify the shelf level area, which the shelf level identification processing unit 214 may accept, or the shelf level area may be identified automatically from the second time onwards based on the shelf level area information manually input the first time.

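図１５に、飲料缶などの物体が陳列されている陳列棚を撮影した画像情報を正置化した正置画像情報に対して、棚段領域の指定の入力を受け付けた状態を模式的に示す。また、図１６に、歯ブラシなどの物体が吊り下げられて陳列されている陳列棚を撮影した画像情報を正置化した正置画像情報に対して、棚段領域の指定の入力を受け付けた状態を模式的に示す。 Figure 15 schematically shows a state in which input designating shelf level areas has been received for upright image information obtained by aligning captured image information of a display shelf on which objects such as beverage cans are displayed. Figure 16 schematically shows a state in which input designating shelf level areas has been received for upright image information obtained by aligning captured image information of a display shelf on which objects such as toothbrushes are hung for display.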

なお、棚段特定処理部２１４は、棚段領域を特定する際に、深層学習（ディープラーニング）を用いて棚段領域を特定してもよい。この場合、中間層が多数の層からなるニューラルネットワークの各層のニューロン間の重み付け係数が最適化された学習モデルに対して、上記正置画像情報を入力し、その出力値に基づいて、棚段領域を特定してもよい。また学習モデルとしては、さまざまな正置画像情報に棚段領域を正解データとして与えたものを用いることができる。 When identifying shelf level areas, the shelf level identification processing unit 214 may use deep learning to identify shelf level areas. In this case, the above-mentioned upright image information may be input into a learning model in which the weighting coefficients between neurons in each layer of a neural network consisting of multiple intermediate layers are optimized, and the shelf level area may be identified based on the output values. Furthermore, the learning model may be one in which shelf level areas are assigned as correct answer data to various upright image information.

棚段特定処理部２１４で特定した棚段領域は、その画像情報を棚段領域画像情報として特定する。棚段特定処理部２１４は、実際に、画像情報として切り出してもよいし、実際には画像情報としては切り出さずに、領域の画像情報を座標などで特定するなどによって、仮想的に切り出すのでもよい。なお、陳列棚に棚段が複数ある場合には、それぞれが棚段領域画像情報として切り出される。また棚段の領域を示す座標としては、その領域を特定するための頂点の座標であり、正置画像情報におけるたとえば４点、右上と左下、左上と右下の２点の座標などでよい。また、正置画像情報における陳列棚など、画像情報における所定箇所（たとえば陳列棚の左上の頂点）を基準とした相対座標である。なお、本明細書において画像情報を切り出すとは、棚段特定処理部２１４における切り出しと同様に、実際に、画像情報として切り出してもよいし、実際には画像情報としては切り出さずに、領域の画像情報を座標などで特定するなどによって、仮想的に切り出すのでもよい。 The image information of each shelf level area identified by the shelf level identification processing unit 214 is identified as shelf level area image information. The shelf level identification processing unit 214 may actually cut out the area as image information, or it may cut it out virtually, without actually extracting the image information, by specifying the area with coordinates or the like. If the display shelf has multiple shelf levels, each is cut out as its own shelf level area image information. The coordinates indicating a shelf level area are the coordinates of vertices that specify the area; in the upright image information these may be, for example, four points, or two points such as the upper right and lower left, or the upper left and lower right. They are relative coordinates with respect to a predetermined location in the image information, such as the display shelf in the upright image information (for example, the upper-left vertex of the display shelf). In this specification, cutting out image information likewise means either actually cutting it out as image information or cutting it out virtually by specifying the area with coordinates or the like, as with the cutting out by the shelf level identification processing unit 214.
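The distinction between virtual and actual cut-out, and the use of coordinates relative to a reference point such as the upper-left vertex of the display shelf, can be sketched as follows. The sketch assumes an axis-aligned area described by two points (upper left and lower right); `Region`, `relative_to`, and `cut_out` are hypothetical names.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Region:
    """A virtually cut-out area: only coordinates are kept, no pixels are copied."""
    left: int
    top: int
    right: int   # exclusive
    bottom: int  # exclusive

    def relative_to(self, origin: Tuple[int, int]) -> "Region":
        """Re-express the area relative to a reference point (e.g. shelf upper left)."""
        ox, oy = origin
        return Region(self.left - ox, self.top - oy, self.right - ox, self.bottom - oy)


def cut_out(image: List[List[int]], region: Region) -> List[List[int]]:
    """Actual cut-out: copy the pixels of the area out of the image."""
    return [row[region.left:region.right] for row in image[region.top:region.bottom]]
```

Keeping only a `Region` avoids copying pixels until a later step actually needs the shelf level area image information.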

本実施例の情報処理システム１における物体認識処理部２１３における外形特定処理部２１３１、物体同定処理部２１３２は、実施例１の処理のほか、以下のような処理を実行してもよい。 The contour specification processing unit 2131 and object identification processing unit 2132 in the object recognition processing unit 213 in the information processing system 1 of this embodiment may perform the following processing in addition to the processing of Example 1.

外形特定処理部２１３１は、正置画像情報における棚段領域における棚段ごとに、外形の領域（外形領域）を特定する。外形とは物体が置かれる領域であって、その物体が置かれているか否かは問わない。外形領域の大きさは、そこに置かれるべき物体と同一または略同一の大きさである。 The outer shape identification processing unit 2131 identifies the outer shape area (outer shape area) for each shelf in the shelf area in the upright image information. The outer shape is the area where an object is placed, regardless of whether the object is placed there or not. The size of the outer shape area is the same as or approximately the same as the size of the object to be placed there.

外形特定処理部２１３１は、第１の学習処理部２０１において学習させた学習モデル（第１の学習モデル）に、撮影画像情報、正置画像情報若しくは棚段領域の画像情報を入力値として入力し、入力した画像情報から外形領域を特定する。すなわち、外形特定処理部２１３１は、第１の学習処理部２０１において学習させた学習モデル（中間層が多数の層からなるニューラルネットワークの各層のニューロン間の重み付け係数が最適化された学習モデル）に対して、処理対象とする領域、たとえば棚段領域の画像情報を入力し、その出力値に基づいて、外形の領域を特定する。特定した外形の領域については、外形領域を識別する外形識別情報を割り当てて、撮影画像情報、正置画像情報若しくは棚段領域の画像情報における位置情報（たとえば画像情報における座標）とともに物体識別情報記憶部２２に記憶させる。 The contour identification processing unit 2131 inputs the captured image information, upright image information, or image information of the shelf area as input values to the learning model (first learning model) trained by the first learning processing unit 201, and identifies the contour area from the input image information. That is, the contour identification processing unit 2131 inputs image information of the area to be processed, for example, the shelf area, to the learning model trained by the first learning processing unit 201 (a learning model in which the weighting coefficients between neurons in each layer of a neural network consisting of multiple intermediate layers are optimized), and identifies the contour area based on the output value. The identified contour area is assigned contour identification information that identifies the contour area, and stored in the object identification information storage unit 22 together with position information (for example, coordinates in the image information) in the captured image information, upright image information, or image information of the shelf area.

外形特定処理部２１３１で出力する出力値としては、その外形領域が対応する属性のタグ、たとえば、缶、ビン、パウチ容器などとそれに対する信頼度（確率）も合わせて出力する。この際に、もっとも高い信頼度の属性のタグを当該外形領域のタグとして特定するが、その信頼度が一定の閾値以下であれば、特定した外形領域について、未知の物体と同定をする。 The output value output by the contour identification processing unit 2131 includes the attribute tag corresponding to the contour region, for example, can, bottle, pouch container, etc., along with its reliability (probability). At this time, the attribute tag with the highest reliability is identified as the tag for the contour region, and if that reliability is below a certain threshold, the identified contour region is identified as an unknown object.
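The tag selection rule described above can be sketched as follows. This is an illustration under assumptions: the function name `select_tag`, the dictionary of per-tag confidences, and the threshold value 0.5 are hypothetical, since the specification only states that a fixed threshold is used.

```python
UNKNOWN = "unknown"


def select_tag(scores, threshold=0.5):
    """Pick the attribute tag with the highest confidence.

    If that confidence is at or below the threshold, the contour region
    is treated as an unknown object instead.
    """
    tag, confidence = max(scores.items(), key=lambda item: item[1])
    if confidence <= threshold:
        return UNKNOWN, confidence
    return tag, confidence


confident = select_tag({"can": 0.82, "bottle": 0.11, "pouch": 0.07})
uncertain = select_tag({"can": 0.40, "bottle": 0.35, "pouch": 0.25})
```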

図１７に、棚段領域の画像情報から外形領域を特定した場合の一例を示す。図１７（ａ）は第１の学習処理部２０１により学習させた学習モデルに対して入力する棚段領域の画像情報の一例であり、図１７（ｂ）は図１７（ａ）で入力値とした棚段領域の画像情報において、上記学習モデルを用いて外形領域を特定した状態の一例を示す図である。図１７（ｂ）では棚段領域において外形領域を特定した状態を重畳して示しているが、特定した外形領域の画像情報をそのまま切り出して出力をしてもよい。 Figure 17 shows an example of an outline region identified from image information of a shelf level region. Figure 17(a) is an example of image information of a shelf level region input to a learning model trained by the first learning processing unit 201, and Figure 17(b) is a diagram showing an example of an outline region identified using the learning model in the image information of the shelf level region used as an input value in Figure 17(a). Figure 17(b) shows an example of an outline region identified in a shelf level region superimposed, but the image information of the identified outline region may also be cut out and output as is.

物体同定処理部２１３２は、外形領域に表示されている物体の物体識別情報を、第２の学習処理部２０２において学習させた学習モデル（第２の学習モデル）に、外形領域の画像情報を入力値として入力し、入力された画像情報からその領域にある物体の識別情報を同定する。すなわち、物体同定処理部２１３２は、第２の学習処理部２０２において学習させた学習モデル（中間層が多数の層からなるニューラルネットワークの各層のニューロン間の重み付け係数が最適化された学習モデル）に対して、処理対象とする外形領域の画像情報を入力し、その出力値に基づいて、外形領域にある物体の物体識別情報を同定する。この際に、入力値として外形領域の画像情報のほか、外形特定処理部２１３１で特定したその外形領域が対応する属性のタグ、たとえば、缶、ビン、パウチ容器の情報を入力してもよい。なお、この際には、外形領域について一つの物体識別情報を同定してもよいし、複数の物体識別情報を同定してもよい。複数の物体識別情報を同定する場合には、同定する物体識別情報の候補を出力することとなる。 The object identification processing unit 2132 inputs the image information of the outline region as an input value to a learning model (second learning model) trained by the second learning processing unit 202, and identifies the identification information of the object in that region from the input image information. That is, the object identification processing unit 2132 inputs the image information of the outline region to be processed into the learning model trained by the second learning processing unit 202 (a learning model in which the weighting coefficients between neurons in each layer of a neural network consisting of multiple intermediate layers are optimized), and identifies the object identification information of the object in the outline region based on the output value. At this time, in addition to the image information of the outline region, attribute tags corresponding to the outline region identified by the outline identification processing unit 2131, such as information on cans, bottles, and pouch containers, may be input as input values. Note that one or more object identification information may be identified for the outline region. If multiple object identification information is identified, candidates for object identification information to be identified are output.

実施例１と同様に、物体認識処理部２１３は、外形特定処理部２１３１、物体同定処理部２１３２の処理をまとめて深層学習などによって実行してもよい。 Similar to Example 1, the object recognition processing unit 213 may perform the processing of the outer shape specification processing unit 2131 and the object identification processing unit 2132 together using deep learning or the like.

物体識別情報記憶部２２は、陳列棚の棚段の各外形に表示されている物体の物体識別情報を示す情報を記憶する。たとえば、物体識別情報に対応付けて、撮影日時情報、店舗情報、撮影画像情報の画像情報識別情報、正置画像情報の画像識別情報、外形を識別するための外形識別情報に対応づけて物体識別情報記憶部２２に記憶する。 The object identification information storage unit 22 stores information indicating the object identification information of the objects displayed on the outer shape of each shelf of the display shelf. For example, the object identification information is associated with the photographing date and time information, store information, image information identification information of the photographed image information, image identification information of the upright image information, and outer shape identification information for identifying the outer shape, and these are stored in the object identification information storage unit 22.

つぎに本実施例における情報処理システム１の処理プロセスの一例を図４および図１８のフローチャートを用いて説明する。なお、本実施例における以下の説明では、撮影画像情報から陳列している物体の物体識別情報を同定する場合を説明する。 Next, an example of the processing process of the information processing system 1 in this embodiment will be described using the flowcharts in Figures 4 and 18. Note that the following description of this embodiment will focus on the case where object identification information for displayed objects is identified from captured image information.

なお、本実施例における情報処理システム１の認識処理部２１で用いる学習モデルを学習するための学習処理（図４）は、実施例１と同様であるので説明を省略する。 Note that the learning process (Figure 4) for learning the learning model used by the recognition processing unit 21 of the information processing system 1 in this embodiment is the same as in Example 1, so a description thereof will be omitted.

陳列棚を撮影した画像情報から、陳列棚に陳列されている物体の物体識別情報を同定するための認識処理を、図１８のフローチャートを用いて説明する。 The recognition process for identifying object identification information of objects displayed on a display shelf from image information captured of the display shelf is explained using the flowchart in Figure 18.

店舗の陳列棚が撮影された撮影画像情報は、画像情報入力端末３から入力され、管理端末２の画像情報入力受付処理部２１０でその入力を受け付ける（Ｓ３００）。図１９に、撮影画像情報の一例を示す。また、撮影日時、店舗識別情報、撮影画像情報の画像情報識別情報の入力を受け付ける。そして、画像情報入力受付処理部２１０は、入力を受け付けた撮影画像情報、撮影日時、店舗識別情報、撮影画像情報の画像情報識別情報を対応づけて画像情報記憶部２１１に記憶させる。 Photographed image information of store display shelves is input from the image information input terminal 3, and this input is accepted by the image information input acceptance processing unit 210 of the management terminal 2 (S300). Figure 19 shows an example of photographed image information. It also accepts input of the date and time of photography, store identification information, and image information identification information of the photographed image information. The image information input acceptance processing unit 210 then associates the accepted input photographed image information, date and time of photography, store identification information, and image information identification information of the photographed image information, and stores them in the image information storage unit 211.

管理端末２において所定の操作入力を受け付けると、正置画像情報正置化処理部２１２は、画像情報記憶部２１１に記憶する撮影画像情報を抽出し、台形補正処理などの正置化処理を行うための頂点である棚位置（陳列棚の位置）の４点の入力を受け付け、正置化処理を実行する（Ｓ３１０）。このようにして正置化処理が実行された撮影画像情報（正置画像情報）の一例が、図２０である。 When a predetermined operation input is received at the management terminal 2, the image information alignment processing unit 212 extracts the captured image information stored in the image information storage unit 211, accepts input of the four points of the shelf position (the position of the display shelf) that serve as the vertices for uprighting processing such as keystone correction, and executes the uprighting processing (S310). Figure 20 shows an example of captured image information that has been processed in this way (upright image information).

そして、正置画像情報に対して、管理端末２において所定の操作入力を受け付けることで、棚段特定処理部２１４は、棚段位置領域を特定する（Ｓ３２０）。すなわち、正置画像情報における棚段領域の入力を受け付ける。図１５、図１６が、正置画像情報から棚段領域が特定された状態を示す図である。 Then, upon receiving a predetermined operation input at the management terminal 2 for the upright image information, the shelf level identification processing unit 214 identifies the shelf level area (S320). In other words, it accepts input of the shelf level area in the upright image information. Figures 15 and 16 show the state in which the shelf level area has been identified from the upright image information.

以上のようにして、棚段領域を特定すると、正置画像情報から棚段領域の画像情報を切り出す。そして、棚段領域画像情報における棚段ごとに、外形を特定する処理を実行する（Ｓ３３０）。すなわち、外形特定処理部２１３１は、第１の学習処理部２０１において学習させた学習モデル（第１の学習モデル）に、棚段領域の画像情報を入力値として入力し、入力した画像情報から外形領域を特定する。 Once the shelf level area is identified in this manner, image information of the shelf level area is extracted from the upright image information. Then, a process is performed to identify the outline for each shelf level in the shelf level area image information (S330). That is, the outline identification processing unit 2131 inputs the image information of the shelf level area as an input value into the learning model (first learning model) trained by the first learning processing unit 201, and identifies the outline area from the input image information.

以上のように正置画像情報に写っている各物体の各外形領域を特定し、それぞれの属性（タグ）を同定する。そして、同定したタグの信頼度が一定の閾値以下の場合には（Ｓ３４０）、その外形領域にある物体は未知の物体であると同定をする（Ｓ３５０）。 As described above, the contour regions of each object captured in the upright image information are identified, and their attributes (tags) are identified. If the reliability of the identified tag is below a certain threshold (S340), the object in that contour region is identified as an unknown object (S350).

一方、同定したタグの信頼度が一定の閾値より大きい場合には（Ｓ３４０）、物体同定処理部２１３２は、第２の学習処理部２０２において学習させた学習モデル（第２の学習モデル）に、外形領域の画像情報を入力値として入力し、外形領域に写っている物体の物体識別情報を同定する（Ｓ３６０）。そして同定した物体識別情報の信頼度が一定の閾値以下の場合には（Ｓ３７０）、その外形領域にある物体は未知の物体であると同定をする（Ｓ３５０）。 On the other hand, if the reliability of the identified tag is greater than a certain threshold (S340), the object identification processing unit 2132 inputs the image information of the outline region as an input value into the learning model (second learning model) trained in the second learning processing unit 202, and identifies the object identification information of the object appearing in the outline region (S360). If the reliability of the identified object identification information is equal to or less than a certain threshold (S370), the object in the outline region is identified as an unknown object (S350).

一方、同定した物体識別情報の信頼度が一定の閾値より大きい場合には（Ｓ３７０）、同定した物体識別情報は、撮影日時、店舗識別情報、撮影画像情報の画像情報識別情報、正置画像情報の画像情報識別情報、外形識別情報に対応づけて物体識別情報記憶部２２に記憶させる。 On the other hand, if the reliability of the identified object identification information is greater than a certain threshold (S370), the identified object identification information is stored in the object identification information storage unit 22 in association with the photographing date and time, store identification information, image information identification information of the photographed image information, image information identification information of the upright image information, and external shape identification information.
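The two-stage decision in steps S330 to S370 can be summarized in a short sketch. The two models are passed in as plain callables returning a (label, confidence) pair; that interface, the function name `recognize_region`, and both threshold values are assumptions made for illustration, not details given in the specification.

```python
UNKNOWN = "unknown"


def recognize_region(region_image, tag_model, id_model,
                     tag_threshold=0.5, id_threshold=0.5):
    """Run the tag check first, then the object identification check."""
    tag, tag_confidence = tag_model(region_image)             # S330
    if tag_confidence <= tag_threshold:                       # S340
        return {"tag": None, "object_id": UNKNOWN}            # S350
    object_id, id_confidence = id_model(region_image, tag)    # S360
    if id_confidence <= id_threshold:                         # S370
        return {"tag": tag, "object_id": UNKNOWN}             # S350
    return {"tag": tag, "object_id": object_id}


# Stand-in models used only to exercise the control flow.
def demo_tag_model(image):
    return "can", 0.9


def demo_id_model(image, tag):
    return "beer_A", 0.8


result = recognize_region("region pixels", demo_tag_model, demo_id_model)
```

A region rejected at either stage ends up as an unknown object, which is the case that the manual input described next is meant to cover.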

なお、すべての外形領域の物体識別情報を同定できるとは限らない。そこで、同定できない外形領域については、物体識別情報の入力を受け付け、入力を受け付けた物体識別情報を、撮影日時、店舗識別情報、撮影画像情報の画像情報識別情報、正置画像情報の画像情報識別情報、外形識別情報に対応づけて物体識別情報記憶部２２に記憶する。また、同定した物体識別情報の修正処理についても同様に、入力を受け付けてもよい。 Note that it is not always possible to identify object identification information for all outer shape regions. Therefore, for outer shape regions that cannot be identified, input of object identification information is accepted, and the accepted input object identification information is stored in the object identification information storage unit 22 in association with the shooting date and time, store identification information, image information identification information for the captured image information, image information identification information for the upright image information, and outer shape identification information. Input may also be accepted similarly for the process of correcting identified object identification information.

以上のような処理を行うことで、撮影画像情報に写っている陳列棚の棚段に陳列されている物体の物体識別情報を同定することができる。また従来のシステムのように、外形領域を矩形領域とせず、物体の輪郭の外形に沿って物体の同定を行うので、外形領域に含まれる不要な情報、たとえば他の物体などのノイズが除外されるので、認識精度が向上することとなる。 By performing the above processing, it is possible to identify the object identification information of the objects displayed on the shelves of the display shelf captured in the captured image information. Furthermore, unlike conventional systems, the outer shape region is not a rectangular region, but rather the object is identified along the outline of the object's contour. This eliminates unnecessary information contained in the outer shape region, such as noise from other objects, thereby improving recognition accuracy.

実施例１および実施例２では、外形領域の特定と、外形領域から物体識別情報の同定の２つの処理で機械学習を用いる構成を説明したが、外形領域から物体識別情報を同定する処理については、画像マッチング処理を用いてもよい。この場合の情報処理システム１の構成の一例を図２１に示す。なお、図２１では、実施例２の場合の情報処理システム１の構成に基づく場合を示しているが、実施例１に基づいてもよいことは当然である。 In Examples 1 and 2, configurations were described in which machine learning was used in two processes: specifying the outline region and identifying object identification information from the outline region. However, image matching processing may be used for the process of identifying object identification information from the outline region. An example of the configuration of information processing system 1 in this case is shown in Figure 21. Note that Figure 21 shows a case based on the configuration of information processing system 1 in Example 2, but it goes without saying that it may also be based on Example 1.

本実施例における情報処理システム１では、学習処理部２０では第２の学習処理部２０２は設ける必要はない。また、認識処理部２１では、画像マッチング処理に用いる標本情報を記憶する標本情報記憶部２１５を備える。 In the information processing system 1 of this embodiment, the learning processing unit 20 does not need to have a second learning processing unit 202. In addition, the recognition processing unit 21 is equipped with a specimen information storage unit 215 that stores specimen information used in the image matching process.

標本情報記憶部２１５は、画像情報に写っている陳列棚の棚段に陳列されている物体がどの物体であるかを識別するための標本情報を記憶する。標本情報は、陳列棚に陳列される可能性のある物体を、上下、左右、斜めなど複数の角度から撮影をした画像情報である。図２２に標本情報記憶部２１５に記憶される標本情報の一例を示す。図２２では、標本情報として、缶ビールをさまざまな角度から撮影をした場合を示しているが、缶ビールに限られない。標本情報記憶部２１５は、標本情報と、物体識別情報とを対応付けて記憶する。 The specimen information storage unit 215 stores specimen information for identifying the object displayed on the shelf of the display shelf shown in the image information. The specimen information is image information of objects that may be displayed on the display shelf, photographed from multiple angles, such as from above, below, left, right, or diagonally. Figure 22 shows an example of specimen information stored in the specimen information storage unit 215. Figure 22 shows specimen information of canned beer photographed from various angles, but is not limited to canned beer. The specimen information storage unit 215 stores specimen information in association with object identification information.

なお、標本情報記憶部２１５には、標本情報とともに、または標本情報に代えて、標本情報から抽出された、類似性の算出に必要となる情報、たとえば画像特徴量とその位置のペアの情報を記憶していてもよい。標本情報には、類似性の算出に必要となる情報も含むとする。この場合、物体認識処理部２１３は、後述する外形領域の画像情報と、標本情報とのマッチング処理を行う際に、標本情報について毎回、画像特徴量を算出せずともよくなり、計算時間を短縮することができる。 The specimen information storage unit 215 may store, together with or instead of the specimen information, information extracted from the specimen information that is required for calculating similarity, such as information on pairs of image features and their positions. The specimen information also includes information required for calculating similarity. In this case, when performing matching processing between image information of the outline region (described below) and the specimen information, the object recognition processing unit 213 does not need to calculate image features for the specimen information every time, thereby reducing calculation time.
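The time saving from storing extracted features rather than recomputing them can be sketched as below. The class `SpecimenStore` and the injected `extract_features` callable are hypothetical; the specification does not prescribe a storage layout or a particular feature extractor.

```python
class SpecimenStore:
    """Keeps (object_id, features) pairs, extracting features once on add."""

    def __init__(self, extract_features):
        self._extract = extract_features
        self._entries = []

    def add(self, object_id, specimen_image):
        # Features are computed here, not at matching time.
        self._entries.append((object_id, self._extract(specimen_image)))

    def entries(self):
        # Return a copy so callers cannot modify the stored list.
        return list(self._entries)


extraction_log = []


def toy_extractor(image):
    """Stand-in extractor: records each call and returns the image length."""
    extraction_log.append(image)
    return len(image)


store = SpecimenStore(toy_extractor)
store.add("beer_A", "front")
store.add("beer_A", "side")
first_read = store.entries()
second_read = store.entries()  # no further extraction happens here
```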

また標本情報記憶部に記憶する標本情報は、第１の学習処理部２０１の学習処理の際に用いた第１のアノテーションデータにおける物体の輪郭の外形をマスク処理した物体の画像情報を用いてもよい。すなわち、第１のアノテーションデータを作成する際に、物体を一または複数の方向から撮影した物体の画像情報若しくはその画像特徴量を標本情報とする。そして、当該撮影した物体の画像情報のうち、輪郭を外形として、その閉領域の内側をマスク処理するとともに、属性をタグ付けして第１のアノテーションデータを作成する。このような処理によって、標本情報と第１のアノテーションデータをまとめて作成することができる。 The specimen information stored in the specimen information storage unit may also be image information of an object in which the outer shape of the object's contour in the first annotation data used during the learning process by the first learning processing unit 201 has been masked. That is, when creating the first annotation data, image information of the object photographed from one or more directions or its image features is used as specimen information. Then, from the image information of the photographed object, the contour is used as the outer shape, and the inside of the closed area is masked, and attributes are tagged to create the first annotation data. Through this process, specimen information and first annotation data can be created together.

本実施例における物体同定処理部２１３２は、外形特定処理部２１３１で特定した外形領域の画像情報と、標本情報記憶部２１５に記憶する標本情報とのマッチング処理を実行し、その外形領域に表示されている物体の物体識別情報を同定する。すなわち、ある棚段の外形領域（この外形の領域の外形識別情報をＸとする）における画像情報と、標本情報記憶部に記憶する各標本情報とから、それぞれの画像特徴量を算出し、特徴点のペアを求めることで、類似性を判定する。そして、もっとも類似性の高い標本情報を特定し、そのときの類似性があらかじめ定められた閾値以上であれば、その標本情報に対応する物体識別情報を標本情報記憶部２１５に基づいて同定する。そして、同定した物体識別情報を、その外形識別情報Ｘの外形に表示されている物体の物体識別情報とする。なお、いずれの標本情報とも類似ではないと判定した外形については、その外形識別情報について「空」であることを示す情報（物体がないことを示す情報）を付する。物体同定処理部２１３２は、同定した物体識別情報または「空」であることを示す情報を、撮影日時、店舗識別情報、撮影画像情報の画像情報識別情報、正置画像情報の画像情報識別情報、外形識別情報に対応づけて物体識別情報記憶部２２に記憶する。 In this embodiment, the object identification processing unit 2132 performs a matching process between the image information of the contour region identified by the contour identification processing unit 2131 and the specimen information stored in the specimen information storage unit 215, and identifies the object identification information of the object displayed in that contour region. That is, image features are calculated both from the image information of a contour region on a certain shelf level (let the contour identification information of this contour region be X) and from each item of specimen information stored in the specimen information storage unit, and similarity is determined by finding pairs of feature points. The specimen information with the highest similarity is then identified, and if that similarity is equal to or greater than a predetermined threshold, the object identification information corresponding to that specimen information is identified based on the specimen information storage unit 215. The identified object identification information is then taken as the object identification information of the object displayed in the contour with contour identification information X. For contours determined not to be similar to any specimen information, information indicating "empty" (that is, that no object is present) is attached to the contour identification information. The object identification processing unit 2132 stores the identified object identification information, or the information indicating "empty", in the object identification information storage unit 22 in association with the image capture date and time, store identification information, image information identification information of the captured image information, image information identification information of the upright image information, and contour identification information.
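The "best match above a threshold, otherwise empty" rule can be sketched as follows. The similarity function is injected, and the toy example uses a Jaccard score over sets of integers purely as a stand-in for the feature-point comparison; both that choice and the name `identify_by_matching` are assumptions for illustration.

```python
EMPTY = "empty"


def identify_by_matching(region_features, specimens, similarity, threshold):
    """Return the object id of the most similar specimen, or EMPTY.

    specimens is a list of (object_id, features); one object may appear
    several times, once per viewing angle.
    """
    best_id, best_score = EMPTY, float("-inf")
    for object_id, specimen_features in specimens:
        score = similarity(region_features, specimen_features)
        if score > best_score:
            best_id, best_score = object_id, score
    return best_id if best_score >= threshold else EMPTY


def jaccard(a, b):
    """Toy similarity between two sets of feature labels."""
    return len(a & b) / len(a | b)


specimens = [("beer_A", {1, 2, 3, 4}),
             ("beer_A", {3, 4, 5, 6}),
             ("tea_B", {7, 8, 9})]
matched = identify_by_matching({3, 4, 5}, specimens, jaccard, threshold=0.5)
unmatched = identify_by_matching({10, 11}, specimens, jaccard, threshold=0.5)
```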

物体同定処理部２１３２は、一例として、具体的には以下のような処理を実行する。まず、処理対象となる外形領域の座標で構成される画像情報と、標本情報記憶部２１５に記憶する標本情報との類似性を判定し、その類似性がもっとも高い標本情報に対応する物体識別情報を特定し、特定した類似性があらかじめ定めた閾値以上であれば、上記座標で構成される外形領域に表示されている物体の物体識別情報として同定をする。 As an example, the object identification processing unit 2132 specifically executes the following process. First, it determines the similarity between the image information composed of the coordinates of the outline region to be processed and the specimen information stored in the specimen information storage unit 215, identifies the object identification information corresponding to the specimen information with the highest similarity, and if the identified similarity is equal to or greater than a predetermined threshold, identifies it as the object identification information of the object displayed in the outline region composed of the coordinates.

ここで外形の画像情報と標本情報との類似性を判定するには、以下のような処理を行う。まず、物体同定処理部２１３２における物体識別情報の同定処理の前までの処理において、正置画像情報の棚段における外形の領域の画像情報と、標本情報との方向が同じ（横転や倒立していない）となっており、また、それぞれの画像情報の大きさが概略同じとなっている（所定範囲以上で画像情報の大きさが異なる場合には、類似性の判定の前にそれぞれの画像情報の大きさが所定範囲内となるようにサイズ合わせをしておく）。 To determine the similarity between the image information of the outline and the specimen information, the following processing is performed. First, in the processing prior to the identification processing of the object identification information in the object identification processing unit 2132, the image information of the outline region on the shelf in the upright image information and the specimen information are oriented in the same direction (not turned sideways or upside down), and the sizes of the respective image information are roughly the same (if the sizes of the image information differ by more than a predetermined range, the sizes of the respective image information are adjusted so that they are within the predetermined range before determining the similarity).

物体同定処理部２１３２は、外形領域の画像情報と、標本情報との類似性を判定するため、外形の画像情報の画像特徴量（たとえば局所特徴量）に基づく特徴点と、標本情報との画像特徴量（たとえば局所特徴量）に基づく特徴点を、それぞれ抽出する。そして、外形の画像情報の特徴点と、標本情報の特徴点とでもっとも類似性が高いペアを検出し、それぞれで対応する点の座標の差を求める。そして、差の平均値を求める。差の平均値は、外形領域の画像情報と、標本情報との全体の平均移動量を示している。そして、すべての特徴点のペアの座標差を平均の座標差と比較し、外れ度合いの大きなペアを除外する。そして、残った対応点の数で類似性を順位付ける。 To determine the similarity between the image information of the outline region and the specimen information, the object identification processing unit 2132 extracts feature points based on the image features (e.g., local features) of the image information of the outline and feature points based on the image features (e.g., local features) of the specimen information. It then detects the most similar pair of feature points in the image information of the outline and feature points in the specimen information, and calculates the difference in coordinates of corresponding points between them. It then calculates the average difference. The average difference indicates the overall average amount of movement between the image information of the outline region and the specimen information. It then compares the coordinate differences of all feature point pairs with the average coordinate difference, and excludes pairs with a high degree of discrepancy. It then ranks the similarity based on the number of remaining corresponding points.
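The outlier exclusion step can be sketched in a few lines, assuming the feature-point pairs have already been found by some matcher. The function name `count_consistent_pairs` and the parameter `max_deviation` (the cutoff for "a high degree of discrepancy") are hypothetical, since the specification does not give a concrete criterion.

```python
from math import hypot


def count_consistent_pairs(pairs, max_deviation):
    """Count matched pairs whose displacement is close to the mean displacement.

    pairs is a list of ((x_region, y_region), (x_specimen, y_specimen)).
    """
    if not pairs:
        return 0
    # Coordinate difference of each corresponding pair.
    diffs = [(sx - rx, sy - ry) for (rx, ry), (sx, sy) in pairs]
    # Mean difference: the overall average shift between the two images.
    mean_dx = sum(dx for dx, _ in diffs) / len(diffs)
    mean_dy = sum(dy for _, dy in diffs) / len(diffs)
    # Keep only pairs that do not deviate strongly from the mean shift.
    return sum(1 for dx, dy in diffs
               if hypot(dx - mean_dx, dy - mean_dy) <= max_deviation)


pairs = [((0, 0), (2, 1)), ((5, 5), (7, 6)),
         ((10, 0), (12, 1)), ((3, 3), (30, 40))]
inliers = count_consistent_pairs(pairs, max_deviation=15)
```

Specimens can then be ranked by this count, with a larger count indicating higher similarity. Note that a plain mean is itself pulled toward outliers, so the cutoff has to be chosen with that in mind.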

以上のような方法で外形領域の画像情報と、標本情報との類似性を算出できる。また、その精度を向上させるため、さらに、色ヒストグラム同士のＥＭＤ（ＥａｒｔｈＭｏｖｅｒｓＤｉｓｔａｎｃｅ）を求め、類似性の尺度としてもよい。これによって、撮影された画像情報の明度情報等の環境変化に比較的強い類似性の比較を行うことができ、高精度で特定をすることができる。 The above method can be used to calculate the similarity between the image information of the outline region and the specimen information. To further improve accuracy, the EMD (Earth Mover's Distance) between the color histograms may also be calculated and used as a measure of similarity. This allows a similarity comparison that is relatively robust to environmental changes, such as changes in the brightness of the captured image information, so that objects can be identified with high accuracy.
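For one-dimensional histograms with equally spaced bins, the Earth Mover's Distance reduces to the sum of absolute differences between the cumulative distributions. The sketch below uses that simplification; treating a color histogram as a single non-circular axis is an assumption made for brevity, and the name `emd_1d` is hypothetical.

```python
def emd_1d(hist_a, hist_b):
    """Earth Mover's Distance between two 1-D histograms, in units of bins.

    Both histograms are normalized to total mass 1 before comparison.
    """
    if len(hist_a) != len(hist_b):
        raise ValueError("histograms must have the same number of bins")
    total_a, total_b = sum(hist_a), sum(hist_b)
    if total_a == 0 or total_b == 0:
        raise ValueError("histograms must not be empty")
    distance = 0.0
    carried = 0.0  # mass that still has to be moved to later bins
    for a, b in zip(hist_a, hist_b):
        carried += a / total_a - b / total_b
        distance += abs(carried)
    return distance


far = emd_1d([1, 0, 0], [0, 0, 1])
near = emd_1d([1, 0, 0], [0, 1, 0])
same_shape = emd_1d([2, 2], [1, 1])
```

Unlike a bin-by-bin comparison, this distance grows with how far the mass has to move, so a small shift in color produces a small distance.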

類似性の判定としては、ほかにも、各外形領域の画像情報のシグネチャ（画像特徴量と重みの集合）同士のＥＭＤを求め、類似性の尺度としてもよい。シグネチャの画像特徴量としては、たとえば外形領域の画像情報のＨＳＶ色空間内の頻度分布を求め、色相と彩度に関してグルーピングを行って、特徴の個数とＨＳＶ色空間内の領域による画像特徴量とすることができる。色相と彩度についてグルーピングを行うのは、撮影条件に大きく左右されないように、明度への依存度を下げるためである。 Another way to determine similarity is to calculate the EMD between the signatures (a collection of image features and weights) of the image information of each contour region and use this as a measure of similarity. For example, the image feature of the signature can be determined by calculating the frequency distribution in the HSV color space of the image information of the contour region, grouping it based on hue and saturation, and using this as the image feature based on the number of features and the area in the HSV color space. Grouping based on hue and saturation is done to reduce dependence on lightness so that it is not significantly affected by shooting conditions.
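A signature built from hue and saturation only might be computed as in the following sketch, which uses the standard library `colorsys` module. The bin counts, the function name `hue_saturation_signature`, and the representation of pixels as (r, g, b) tuples in the range 0 to 255 are assumptions for illustration.

```python
import colorsys
from collections import Counter


def hue_saturation_signature(pixels, hue_bins=8, sat_bins=4):
    """Map each (hue bin, saturation bin) to its share of the pixels.

    The value (brightness) channel is deliberately ignored so that the
    signature depends less on lighting conditions.
    """
    counts = Counter()
    for r, g, b in pixels:
        h, s, _v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
        h_bin = min(int(h * hue_bins), hue_bins - 1)
        s_bin = min(int(s * sat_bins), sat_bins - 1)
        counts[(h_bin, s_bin)] += 1
    total = sum(counts.values())
    return {key: n / total for key, n in counts.items()}


# Bright red and dark red fall into the same group; blue forms another.
signature = hue_saturation_signature(
    [(255, 0, 0), (128, 0, 0), (0, 0, 255), (0, 0, 255)])
```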

また、処理の高速化のため、シグネチャとＥＭＤの代わりに、適宜の色空間内での画像情報の色コリログラムや色ヒストグラムなどの画像特徴量間のＬ２距離等の類似性を用いることもできる。 In addition, to speed up processing, similarities such as L2 distance between image features such as color correlograms and color histograms of image information within an appropriate color space can be used instead of signatures and EMD.
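As a minimal illustration of the faster alternative, the L2 (Euclidean) distance between two feature vectors such as color histograms can be computed directly. The function name `l2_distance` is hypothetical.

```python
from math import sqrt


def l2_distance(features_a, features_b):
    """Euclidean distance between two equally long feature vectors."""
    if len(features_a) != len(features_b):
        raise ValueError("feature vectors must have the same length")
    return sqrt(sum((a - b) ** 2 for a, b in zip(features_a, features_b)))


distance = l2_distance([3, 0], [0, 4])
```

A smaller distance means higher similarity, so a threshold on this value is an upper bound rather than a lower bound.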

類似性の判定は、上述に限定をするものではない。同定した物体識別情報は、撮影日時情報、店舗情報、撮影画像情報の画像情報識別情報、正置画像情報の画像識別情報、外形識別情報に対応づけて物体識別情報記憶部２２に記憶する。 The similarity determination is not limited to the above. The identified object identification information is stored in the object identification information storage unit 22 in association with the photographing date and time information, store information, image information identification information of the photographed image information, image identification information of the upright image information, and external shape identification information.

なお、物体識別情報が同定できなかった外形は、物体識別情報記憶部２２においてその外形領域が「空」であることを示す情報（物体が欠品などないことを示す情報）が記憶される。 For contours for which object identification information could not be identified, information indicating that the contour area is "empty" (information indicating that no object is present, for example because it is out of stock) is stored in the object identification information storage unit 22.

以上のように、外形領域の画像情報から物体を同定する場合において画像マッチング処理を用いた場合であっても、外形領域が矩形領域ではないので、精度よく画像マッチング処理を実行することができる。 As described above, even when using image matching processing to identify an object from image information of its outline region, the image matching processing can be performed with high accuracy because the outline region is not a rectangular region.

上述の実施例２または実施例３の変形例として、棚段単位での変化を検出する棚段比較処理部２１３３を設け、棚段単位で変化がない場合には、前回の認識結果をそのまま用いることもできる。この場合の物体認識処理部２１３の一例を図２３に示す。 As a modification of Example 2 or Example 3 described above, a shelf level comparison processing unit 2133 that detects changes on a per-shelf-level basis may be provided, and if a shelf level has not changed, the previous recognition results can be used as is. An example of the object recognition processing unit 213 in this case is shown in Figure 23.

棚段比較処理部２１３３は、前回（Ｎ－１回目）の正置画像情報における棚段の領域の画像情報と、今回（Ｎ回目）の正置画像情報における棚段の領域の画像情報とに基づいて、その類似性が高ければその棚段における各外形の物体識別情報または「空」は同一と判定する。この類似性の判定処理は、上述のように、前回（Ｎ－１回目）の正置画像情報における棚段の領域の画像情報の画像特徴量と、今回（Ｎ回目）の正置画像情報における棚段の領域の画像情報とに基づく類似性の判定でもよいし、色ヒストグラム同士のＥＭＤを用いたものであってもよい。また、それらに限定するものではない。そして、物体同定処理部２１３２における外形単位ごとの特定処理ではなく、物体同定処理部２１３２に、Ｎ回目の正置画像情報におけるその棚段における各外形の物体識別情報を、Ｎ－１回目の同一の棚段における各外形の物体識別情報と同一として、物体識別情報記憶部２２に記憶させる。これによって、あまり物体の動きがない棚段や逆にきわめて短いサイクルで管理される棚段など、変化がほとんど生じない棚段についての処理を省略することができる。 The shelf level comparison processing unit 2133 compares the image information of a shelf level area in the previous ((N-1)th) upright image information with the image information of the same shelf level area in the current (Nth) upright image information and, if their similarity is high, determines that the object identification information or "empty" state of each contour on that shelf level is unchanged. As described above, this similarity determination may be based on the image features of the shelf level area in the previous ((N-1)th) and current (Nth) upright image information, or may use the EMD between color histograms; it is not limited to these methods. In that case, instead of performing the identification process for each contour, the object identification processing unit 2132 stores in the object identification information storage unit 22 the object identification information of each contour on that shelf level in the Nth upright image information as being the same as that of each contour on the same shelf level in the (N-1)th upright image information. This makes it possible to skip processing for shelf levels where almost no change occurs, such as shelf levels where objects rarely move or, conversely, shelf levels that are managed in extremely short cycles.
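The reuse of previous results for an unchanged shelf level can be sketched as below. The function name `recognize_shelf_level`, the injected `similarity` and `recognize` callables, and the threshold are assumptions for illustration; any of the similarity measures discussed earlier could be supplied.

```python
def recognize_shelf_level(current_image, previous_image, previous_results,
                          similarity, threshold, recognize):
    """Reuse the (N-1)th results when the shelf level looks unchanged.

    Falls back to full per-contour recognition when there is no previous
    image or the similarity is below the threshold.
    """
    if (previous_image is not None
            and similarity(previous_image, current_image) >= threshold):
        return list(previous_results)
    return recognize(current_image)


recognized_images = []


def toy_recognize(image):
    """Stand-in for per-contour recognition; records that it was called."""
    recognized_images.append(image)
    return ["tea_B"]


def toy_similarity(a, b):
    return 1.0 if a == b else 0.0


unchanged = recognize_shelf_level("shot_1", "shot_1", ["beer_A", "empty"],
                                  toy_similarity, 0.9, toy_recognize)
changed = recognize_shelf_level("shot_2", "shot_1", ["beer_A", "empty"],
                                toy_similarity, 0.9, toy_recognize)
```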

上述の実施例１乃至実施例４の処理を、適宜、組み合わせることもできる。またその各処理については、本発明の明細書に記載した順序に限定するものではなく、その目的を達成する限度において適宜、変更することが可能である。また、物体認識処理部２１３における処理は、撮影画像情報に対して正置化処理を実行した正置画像情報に対して実行したが、撮影画像情報に対して実行をしてもよい。その場合、正置画像情報を、撮影画像情報と読み替えればよい。 The processes of Examples 1 to 4 described above can also be combined as appropriate. Furthermore, the order of each process is not limited to the order described in the present specification, and can be changed as appropriate to the extent that the objective is achieved. Furthermore, although the processing in the object recognition processing unit 213 is performed on upright image information obtained by performing a normalization process on the photographed image information, it may also be performed on the photographed image information. In this case, upright image information should be read as photographed image information.

また、実施例２および実施例４においては、認識処理部２１において棚段領域を特定してそこから後述の外形領域を特定する処理とせずに、棚段領域を特定せずに撮影画像情報、正置画像情報若しくは棚段領域の画像情報の全体から後述の外形領域を特定するように構成することもできる。その場合には、棚段特定処理部２１４は設けずともよく、その処理を実行しないように構成してもよい。 In addition, in Examples 2 and 4, instead of identifying the shelf level area in the recognition processing unit 21 and then identifying the outline area (described below) from that, it is also possible to configure the recognition processing unit 21 to identify the outline area (described below) from the entire captured image information, upright image information, or image information of the shelf level area without identifying the shelf level area. In this case, the shelf level identification processing unit 214 does not need to be provided, and the processing may be configured not to be performed.

上述の各実施例では、コンビニエンスストアやスーパーなどの陳列棚について例示して説明をしたが、それに限定するものではなく、たとえば調剤薬局の医薬品を陳列する陳列棚（医薬品棚）に陳列される医薬品（物体）に適用することもできる。同様に、倉庫の陳列棚に陳列される物体に適用することもできる。 In the above-described embodiments, examples have been given of display shelves in convenience stores, supermarkets, etc., but the present invention is not limited to these. For example, the present invention can be applied to medicines (objects) displayed on display shelves (medicine shelves) in pharmacies. Similarly, the present invention can be applied to objects displayed on display shelves in warehouses.

The object need not be merchandise displayed on a display shelf. It may be any kind of object to be identified, such as an animal, a bird, an insect, a fish, a plant, or a moving body. The invention may therefore be applied to any image information in which such objects are captured, not only to image information of merchandise displayed on a display shelf.

Using the information processing system 1 of the present invention makes it possible to improve the accuracy with which displayed objects are identified from image information.

Description of Symbols

1: Information processing system
2: Management terminal
3: Image information input terminal
20: Learning processing unit
21: Recognition processing unit
22: Object identification information storage unit
70: Arithmetic unit
71: Storage device
72: Display device
73: Input device
74: Communication device
201: First learning processing unit
202: Second learning processing unit
210: Image information input reception processing unit
211: Image information storage unit
212: Image information alignment processing unit
213: Object recognition processing unit
214: Shelf level identification processing unit
2131: Outline identification processing unit
2132: Object identification processing unit
2133: Shelf level comparison processing unit

Claims

An information processing system for identifying an object appearing in image information, comprising:
a first learning processing unit that performs machine learning using first annotation data, in which data indicating the outline of an object is associated with an attribute, to create a first learning model; and
a recognition processing unit that identifies object identification information of an object appearing in the image information,
wherein the recognition processing unit:
uses the image information and the first learning model to identify the outline of an object appearing in the image information as an outline region, and outputs a reliability of the attribute corresponding to the identified outline region;
when the reliability of the attribute corresponding to the identified outline region is not at or below a predetermined threshold, identifies one or more pieces of object identification information of the object appearing in the identified outline region, using the image information of the identified outline region together with either a second learning model machine-learned using second annotation data that associates image data of an object with object identification information of the object, or image matching processing; and
when the reliability of the attribute corresponding to the identified outline region is at or below the predetermined threshold, identifies the object appearing in the identified outline region as an unknown object.
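As an illustration only, the branching on the reliability recited in the claim above can be sketched in Python. This is a minimal sketch under stated assumptions, not the claimed implementation: `OutlineRegion`, `identify_objects`, `UNKNOWN`, and the default threshold of 0.5 are hypothetical, the output of the first learning model is represented by precomputed values, and the second-stage identification (second learning model or image matching) is a caller-supplied function.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

UNKNOWN = "unknown"  # hypothetical marker for an unknown object


@dataclass
class OutlineRegion:
    """One outline region as output by the first learning model (assumed shape)."""
    bbox: Tuple[int, int, int, int]  # x, y, width, height within the image
    attribute: str                   # attribute predicted for the region
    reliability: float               # reliability of that attribute


def identify_objects(
    regions: List[OutlineRegion],
    identify_fn: Callable[[OutlineRegion], List[str]],
    threshold: float = 0.5,
) -> List[Tuple[OutlineRegion, List[str]]]:
    """Gate second-stage identification on the reliability of each region."""
    results = []
    for region in regions:
        if region.reliability <= threshold:
            # At or below the threshold: treat the object as unknown
            results.append((region, [UNKNOWN]))
        else:
            # Above the threshold: second learning model or image matching
            results.append((region, identify_fn(region)))
    return results
```

A region whose reliability equals the threshold falls into the unknown branch, matching the "at or below" wording of the claim.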

2. The information processing system according to claim 1, wherein
the first learning processing unit performs machine learning by image segmentation using the first annotation data to create the first learning model.

3. The information processing system according to claim 1, further comprising
a second learning processing unit that performs machine learning using the second annotation data to create the second learning model.

4. The information processing system according to claim 1 or 2, wherein
the recognition processing unit performs image matching processing between the image information of the identified outline region and specimen information of objects stored in a specimen information storage unit, thereby identifying the object identification information of the object appearing in the outline region.
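For illustration, matching an outline region against stored specimen information can be sketched as a nearest-match search. This is a minimal Python sketch under stated assumptions: the claim does not specify how the matching is computed, so cosine similarity over feature vectors is an assumed stand-in, and `cosine_similarity`, `match_specimens`, and the dictionary used as the specimen store are hypothetical. Feature extraction from the image is outside the scope of the sketch.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length feature vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def match_specimens(
    region_features: Sequence[float],
    specimens: Dict[str, Sequence[float]],
    top_k: int = 1,
) -> List[Tuple[str, float]]:
    """Return the top_k (object identification information, score) pairs."""
    scored = [
        (object_id, cosine_similarity(region_features, features))
        for object_id, features in specimens.items()
    ]
    # Highest similarity first; ties keep the insertion order of the specimen store
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]
```

Setting `top_k` above 1 returns several candidates, which corresponds to identifying one or more pieces of object identification information for a region.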

An information processing system for identifying an object appearing in image information, comprising
a first learning processing unit that performs machine learning using first annotation data, in which data indicating the outline of an object is associated with an attribute, to create a first learning model,
wherein:
using the image information and the first learning model, the outline of an object appearing in the image information is identified as an outline region, and a reliability of the attribute corresponding to the identified outline region is output;
when the reliability of the attribute corresponding to the identified outline region is not at or below a predetermined threshold, the object identification information of the object appearing in the identified outline region is identified using the image information of the identified outline region together with either a second learning model machine-learned using second annotation data that associates image data of an object with object identification information, or image matching processing; and
when the reliability of the attribute corresponding to the identified outline region is at or below the predetermined threshold, the object appearing in the identified outline region is identified as an unknown object.

An information processing system for identifying an object appearing in image information, comprising:
an image information input reception processing unit that receives input of image information obtained by photographing an object; and
an object recognition processing unit that identifies object identification information of an object appearing in the image information, from the received image information or from upright image information obtained by applying normalization processing to that image information,
wherein the object recognition processing unit:
uses a first learning model, created by machine learning using first annotation data in which data indicating the outline of an object is associated with an attribute, together with the received image information or the upright image information, to identify the outline of the object as an outline region, and outputs a reliability of the attribute corresponding to the identified outline region;
when the reliability of the attribute corresponding to the identified outline region is not at or below a predetermined threshold, identifies the object identification information of the object appearing in the identified outline region, using the image information of the identified outline region together with either a second learning model machine-learned using second annotation data that associates image data of an object with object identification information, or image matching processing; and
when the reliability of the attribute corresponding to the identified outline region is at or below the predetermined threshold, identifies the object appearing in the identified outline region as an unknown object.

An information processing program that causes a computer to function as:
a first learning processing unit that performs machine learning using first annotation data, in which data indicating the outline of an object is associated with an attribute, to create a first learning model; and
a recognition processing unit that identifies object identification information of an object appearing in image information,
wherein the recognition processing unit:
uses the image information and the first learning model to identify the outline of an object appearing in the image information as an outline region, and outputs a reliability of the attribute corresponding to the identified outline region;
when the reliability of the attribute corresponding to the identified outline region is not at or below a predetermined threshold, identifies one or more pieces of object identification information of the object appearing in the identified outline region, using the image information of the identified outline region together with either a second learning model machine-learned using second annotation data that associates image data of an object with object identification information of the object, or image matching processing; and
when the reliability of the attribute corresponding to the identified outline region is at or below the predetermined threshold, identifies the object appearing in the identified outline region as an unknown object.

An information processing program that causes a computer to function as
a first learning processing unit that performs machine learning using first annotation data, in which data indicating the outline of an object is associated with an attribute, to create a first learning model,
wherein:
using image information and the first learning model, the outline of an object appearing in the image information is identified as an outline region, and a reliability of the attribute corresponding to the identified outline region is output;
when the reliability of the attribute corresponding to the identified outline region is not at or below a predetermined threshold, the object identification information of the object appearing in the identified outline region is identified using the image information of the identified outline region together with either a second learning model machine-learned using second annotation data that associates image data of an object with object identification information, or image matching processing; and
when the reliability of the attribute corresponding to the identified outline region is at or below the predetermined threshold, the object appearing in the identified outline region is identified as an unknown object.

An information processing program that causes a computer to function as:
an image information input reception processing unit that receives input of image information obtained by photographing an object; and
an object recognition processing unit that identifies object identification information of an object appearing in the image information, from the received image information or from upright image information obtained by applying normalization processing to that image information,
wherein the object recognition processing unit:
uses a first learning model, created by machine learning using first annotation data in which data indicating the outline of an object is associated with an attribute, together with the received image information or the upright image information, to identify the outline of the object as an outline region, and outputs a reliability of the attribute corresponding to the identified outline region;
when the reliability of the attribute corresponding to the identified outline region is not at or below a predetermined threshold, identifies the object identification information of the object appearing in the identified outline region, using the image information of the identified outline region together with either a second learning model machine-learned using second annotation data that associates image data of an object with object identification information, or image matching processing; and
when the reliability of the attribute corresponding to the identified outline region is at or below the predetermined threshold, identifies the object appearing in the identified outline region as an unknown object.