JP6941331B2

JP6941331B2 - Image recognition system

Info

Publication number: JP6941331B2
Application number: JP2020115230A
Authority: JP
Inventors: 成吉谷井
Original assignee: Marketvision Co Ltd
Current assignee: Marketvision Co Ltd
Priority date: 2018-10-16
Filing date: 2020-07-02
Publication date: 2021-09-29
Anticipated expiration: 2037-08-29
Also published as: JP2020161196A

Description

The present invention relates to an image recognition system for identifying products displayed in stores and similar locations.

In the retail industry, it is known that how products are displayed affects their sales. A manufacturer or seller can therefore inform its product development and sales strategies by knowing which of its own and its competitors' products are displayed in stores.

To achieve this, however, the products displayed in the store must be identified accurately. One approach is to photograph the store's display shelves and have a person manually identify the displayed products from the image information. This allows products to be identified almost without error. However, tracking the display status continuously requires checking it at regular intervals, and having a person identify the products from shelf images every time is burdensome and inefficient.

It is therefore desirable to automatically identify the displayed products from image information of a store's display shelves and thereby grasp the display status. One method applies image recognition technology to an image of the display shelf, using a sample image of each product. As conventional techniques of this kind, there are systems that manage the display status of products using the techniques shown in Patent Documents 1 to 3 below.

Patent Document 1: Japanese Unexamined Patent Publication No. H05-342230
Patent Document 2: Japanese Unexamined Patent Publication No. H05-334409
Patent Document 3: International Publication No. 2012/029548

The invention of Patent Document 1 is a system that helps even a person without specialized knowledge decide which display shelf a product should be placed on. It therefore cannot determine which products are actually on display. Patent Document 2 is a system that supports the input of product images in a shelf allocation support system, which in turn supports the display of products. However, the system of Patent Document 2 only supports product image input when the shelf allocation support system is used; even with this system, the products actually on display cannot be determined.

Patent Document 3 is an invention that, when a display shelf has empty space, identifies the product that should be displayed there, and that issues a notification when a product has been placed in the wrong position on the shelf. That invention also identifies the products on the display shelf by image matching between product images and the products shown on the shelf, but in practice its recognition accuracy is low.

When image recognition technology based on sample images of products is applied to an image of a display shelf, the accuracy and the processing load of the recognition become problems. For example, in stores where display shelves are installed the lighting is not uniform, and recognition accuracy drops when the lighting differs greatly from that of the sample images. Identifying the displayed products from an image of the display shelf may therefore be difficult. In addition, image recognition imposes a heavy processing load and requires substantial computing resources, which increases the capital investment required.

The present inventor focused on the product tag, which displays the price and other information for a product, as one means of identifying the products shown in an image of a display shelf, and invented an image recognition system that can accurately identify the displayed products. The inventor also invented an image recognition system that recognizes the products shown on the display shelf, recognizes the information written on the product tags, and collates the two, thereby identifying the displayed products accurately.

The first invention is an image recognition system comprising: a first rectification processing unit that performs a first rectification process on first image information showing a display shelf on which products are displayed, thereby generating second image information; a second rectification processing unit that performs a second rectification process on a region of the second image information that includes a product tag placement area; a product tag identification processing unit that identifies a product tag region from the image information that has undergone the second rectification process; and an in-tag information identification processing unit that identifies the information written on the product tag by performing OCR recognition processing on the identified product tag region.

With the image recognition system of the present invention, the information written on a product tag can be recognized accurately, so the products displayed in correspondence with that tag can be identified accurately.

In the above invention, the in-tag information identification processing unit identifies boxes by binarizing the identified product tag region and performing labeling processing, groups boxes whose height, width, and baseline satisfy predetermined conditions into blocks, and executes the OCR recognition processing on the identified blocks.
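The box and block steps described above can be illustrated with a minimal sketch. It assumes the binarized tag region is a 2D list of 0/1 values, uses 4-connected labeling, and treats the bottom edge of a box as its baseline. The function names, the tolerances (`height_tol`, `width_tol`, `baseline_tol`), and the `max_gap` adjacency condition are all hypothetical; this section does not state the actual predetermined conditions.

```python
from collections import deque


def label_boxes(binary):
    """Return bounding boxes (x0, y0, x1, y1) of 4-connected foreground components."""
    h, w = len(binary), len(binary[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if not binary[y][x] or seen[y][x]:
                continue
            seen[y][x] = True
            queue = deque([(x, y)])
            x0 = x1 = x
            y0 = y1 = y
            while queue:  # breadth-first flood fill of one component
                cx, cy = queue.popleft()
                x0, x1 = min(x0, cx), max(x1, cx)
                y0, y1 = min(y0, cy), max(y1, cy)
                for nx, ny in ((cx + 1, cy), (cx - 1, cy), (cx, cy + 1), (cx, cy - 1)):
                    if 0 <= nx < w and 0 <= ny < h and binary[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            boxes.append((x0, y0, x1, y1))
    return boxes


def group_blocks(boxes, height_tol=1, width_tol=1, baseline_tol=1, max_gap=2):
    """Chain boxes left to right into blocks when size and baseline are similar.

    All thresholds are illustrative assumptions, not values from the patent.
    """
    blocks = []
    for box in sorted(boxes):
        x0, y0, x1, y1 = box
        if blocks:
            px0, py0, px1, py1 = blocks[-1][-1]
            same_height = abs((y1 - y0) - (py1 - py0)) <= height_tol
            same_width = abs((x1 - x0) - (px1 - px0)) <= width_tol
            same_baseline = abs(y1 - py1) <= baseline_tol
            adjacent = x0 - px1 - 1 <= max_gap
            if same_height and same_width and same_baseline and adjacent:
                blocks[-1].append(box)
                continue
        blocks.append([box])
    return blocks
```

Each resulting block would then be passed to the OCR step as one unit, which keeps isolated specks and differently sized text (for example a large price next to a small product name) from being recognized together.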

The first invention can be realized by loading the computer program of the present invention into a computer and executing it. That is, it is an image recognition program that causes a computer to function as: a first rectification processing unit that performs a first rectification process on first image information showing a display shelf on which products are displayed, thereby generating second image information; a second rectification processing unit that performs a second rectification process on a region of the second image information that includes a product tag placement area; a product tag identification processing unit that identifies a product tag region from the image information that has undergone the second rectification process; and an in-tag information identification processing unit that identifies the information written on the product tag by performing OCR recognition processing on the identified product tag region.

By using the image recognition system of the present invention, the products displayed on display shelves can be identified accurately.

FIG. 1 is a block diagram schematically showing an example of the system configuration of the image recognition system of the present invention.
FIG. 2 is a block diagram schematically showing an example of the hardware configuration of a computer used in the image recognition system of the present invention.
FIG. 3 is a flowchart showing an example of the overall processing in the image recognition system of the present invention.
FIG. 4 is a flowchart showing an example of the rectification processing of the product tag placement area.
FIG. 5 is a flowchart showing an example of the processing for identifying product tag image information.
FIG. 6 is a flowchart showing an example of the processing for identifying the information in a product tag.
FIG. 7 is a diagram showing an example of rectified image information obtained by rectifying captured image information.
FIG. 8 is a diagram showing another example of rectified image information obtained by rectifying captured image information.
FIG. 9 is a diagram showing an example of the state in which product tag placement areas have been designated on the rectified image information of FIG. 7.
FIG. 10 is a diagram showing an example of the state in which product tag placement areas have been designated on the rectified image information of FIG. 8.
FIG. 11 is a diagram showing an example of image information obtained by rectifying the product tag placement areas in FIG. 9.
FIG. 12 is a diagram showing the state in which the vertical straight lines serving as the reference for keystone correction have been detected in the product tag placement area rectification processing unit.
FIG. 13 is a diagram schematically showing the processing in the product tag identification processing unit for identifying the top and bottom edge positions of a product tag region.
FIG. 14 is a diagram schematically showing the processing in the product tag identification processing unit for identifying the horizontal position of a product tag region.
FIG. 15 is a diagram schematically showing the state in which product tag regions have been identified from the rectified product tag placement area image information of FIG. 11.
FIG. 16 is a diagram schematically showing image information in which a product tag region has been binarized in the processing of the in-tag information identification processing unit.
FIG. 17 is a diagram schematically showing the state in which boxes have been generated in the processing of the in-tag information identification processing unit.
FIG. 18 is a diagram schematically showing the state in which blocks have been generated in the processing of the in-tag information identification processing unit.
FIG. 19 is a diagram showing an example of the product name written on a product tag, the product name with the smallest edit distance to the OCR recognition result among the product names stored in the product dictionary, and that edit distance.
FIG. 20 is a diagram showing an example of the correspondence between the number of characters in the final candidate and the edit distance within which the candidate may be confirmed.
FIG. 21 is a diagram schematically showing an example of captured image information.
FIG. 22 is a diagram schematically showing an example of captured image information.
FIG. 23 is a block diagram schematically showing an example of the system configuration of the image recognition system in Example 2.
FIG. 24 is a block diagram schematically showing an example of the configuration of the product tag recognition processing unit.
FIG. 25 is a block diagram schematically showing an example of the configuration of the displayed product recognition processing unit.
FIG. 26 is a flowchart showing an example of the overall processing in the image recognition system in Example 2.
FIG. 27 is a diagram schematically showing the Nth processing in the product identification information identification processing unit.
FIG. 28 shows an example of the sample image information stored in the sample image information storage unit.
FIG. 29 is a diagram showing the state in which shelf tier areas and product tag placement areas have been identified.
FIG. 30 is a diagram showing the state in which shelf tier areas and product tag placement areas have been identified.
FIG. 31 shows an example of the state in which feature extraction regions have been set on the (N-1)th captured image information.
FIG. 32 is a diagram showing an example of the state in which feature extraction regions have been set on the Nth captured image information.
FIG. 33 is a diagram showing the pairing between the (N-1)th feature extraction regions and the Nth feature extraction regions.
FIG. 34 is a diagram showing the state in which shelf position C in the (N-1)th captured image information is projected by a function F as shelf position D in the Nth captured image information.
FIG. 35 is a diagram showing an example of captured image information.
FIG. 36 is a diagram showing the state in which feature extraction regions have been set on the (N-1)th captured image information.
FIG. 37 is a diagram showing the state in which feature extraction regions have been set on the Nth captured image information.
FIG. 38 is a diagram showing an example of the state in which shelf positions D1 to D4 have been identified in the Nth captured image information.

FIG. 1 shows an example of the system configuration of the image recognition system 1 of the present invention. The image recognition system 1 uses a management terminal 2 and a captured image information input terminal 4.

The management terminal 2 is a computer used by the organization, such as a company, that operates the image recognition system 1. The captured image information input terminal 4 is a terminal used to input image information obtained by photographing a store's display shelves.

The management terminal 2 and the captured image information input terminal 4 of the image recognition system 1 are implemented using computers. FIG. 2 schematically shows an example of the hardware configuration of such a computer. The computer has an arithmetic unit 70 such as a CPU that executes the arithmetic processing of programs, a storage device 71 such as RAM or a hard disk that stores information, a display device 72 such as a display that shows information, an input device 73 such as a keyboard or mouse through which information can be entered, and a communication device 74 that sends and receives the processing results of the arithmetic unit 70 and the information stored in the storage device 71 over a network such as the Internet or a LAN.

When the computer has a touch panel display, the display device 72 and the input device 73 may be integrated. Touch panel displays are commonly used in portable communication terminals such as tablet computers and smartphones, but are not limited to these.

A touch panel display integrates the functions of the display device 72 and the input device 73 in that input can be made directly on the display with a dedicated input device (such as a touch panel pen) or a finger.

In addition to the devices above, the captured image information input terminal 4 may include a photographing device such as a camera. A portable communication terminal such as a mobile phone, smartphone, or tablet computer can also be used as the captured image information input terminal 4.

The means in the present invention are distinguished only logically by function; physically or in practice they may occupy the same area. The order of processing in each means may be changed as appropriate, and part of the processing may be omitted. For example, the rectification process may be omitted, in which case the subsequent processing is executed on image information that has not been rectified.

The management terminal 2 of the image recognition system 1 can exchange information with the captured image information input terminal 4 over a network.

The image recognition system 1 has a captured image information input reception processing unit 20, a captured image information storage unit 21, a captured image information rectification processing unit 22, a position identification processing unit 23, a product tag placement area cutout processing unit 24, a product tag placement area rectification processing unit 25, a product tag identification processing unit 26, and an in-tag information identification processing unit 27.

The captured image information input reception processing unit 20 accepts input of image information of a store's display shelf photographed with the captured image information input terminal 4 (captured image information) and stores it in the captured image information storage unit 21 described below. Along with the captured image information, it is preferable to also accept from the captured image information input terminal 4 the shooting date and time, store identification information such as the store name, and image identification information that identifies the image information.

The captured image information storage unit 21 stores, in association with one another, the captured image information, shooting date and time, store identification information, image identification information, and so on accepted by the captured image information input reception processing unit 20. Captured image information means any image information that is to be subjected to keystone correction. It includes image information composited into a single image when one display shelf is photographed in multiple shots, as well as image information to which distortion correction has already been applied.

The captured image information rectification processing unit 22 executes keystone correction on the captured image information stored in the captured image information storage unit 21 to generate rectified image information. The keystone correction is performed so that the shelf tiers of the display shelf shown in the captured image information become horizontal and the product tags for the products displayed there become vertical.

For the keystone correction, the captured image information rectification processing unit 22 accepts input designating four vertices in the captured image information and executes the correction using those vertices. The four designated vertices may be the four corners of a shelf tier of the display shelf or the four corners of a shelf position on the display shelf. They may also be the four corners of a group of two or three shelf tiers. Any four points can be designated as the vertices.
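A four-point keystone correction of this kind is commonly expressed as a projective transform (homography) that maps the four designated vertices to the corners of an upright rectangle. The sketch below is one way to compute that transform in plain Python; the function names and the choice of target rectangle are assumptions for illustration, and the text above does not say how the system itself computes the correction.

```python
def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate point configuration")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= f * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]


def homography(src, dst):
    """Return the 3x3 matrix H (with H[2][2] = 1) mapping four src points to four dst points."""
    a, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        # u = (h0 x + h1 y + h2) / (h6 x + h7 y + 1), and likewise for v
        a.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        a.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    h = solve_linear(a, b)
    return [h[0:3], h[3:6], h[6:8] + [1.0]]


def apply_homography(hm, x, y):
    """Map one point through H, including the perspective divide."""
    w = hm[2][0] * x + hm[2][1] * y + hm[2][2]
    return ((hm[0][0] * x + hm[0][1] * y + hm[0][2]) / w,
            (hm[1][0] * x + hm[1][1] * y + hm[1][2]) / w)
```

To rectify an image with this matrix, one would typically iterate over the pixels of the output rectangle and sample the source image through the inverse mapping (obtained by calling `homography(dst, src)`), so that every output pixel receives a value.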

FIGS. 7 and 8 show examples of captured image information after rectification (rectified image information). FIG. 7 is rectified image information of a display shelf on which canned beverages such as beer are displayed on two shelf tiers, upper and lower. FIG. 8 is rectified image information of a hanging shelf, on which products such as toothbrushes are hung for display, with products displayed in two tiers, upper and lower.

The position identification processing unit 23 identifies, within the rectified image information produced by the keystone correction in the captured image information rectification processing unit 22, the areas where product tags may be attached (product tag placement areas). The captured image information and the rectified image information both show the display shelf, and the display shelf contains shelf tier areas where products are displayed and product tag placement areas where tags for those products may be attached. The product tag placement areas are therefore identified from the rectified image information. The operator of the management terminal 2 may designate a product tag placement area manually, with the position identification processing unit 23 accepting that designation; alternatively, the area may be identified automatically from the second time onward, based on the product tag placement area information entered manually the first time.

FIG. 9 schematically shows the state in which designation of product tag placement areas has been accepted for the rectified image information of FIG. 7. FIG. 10 schematically shows the state in which designation of product tag placement areas has been accepted for the rectified image information of FIG. 8.

The product tag placement area cutout processing unit 24 cuts out the image information of the product tag placement area identified by the position identification processing unit 23 as product tag placement area image information. The cutout may be an actual cutout that produces separate image information, or a virtual cutout that does not. Cutting out image information virtually means executing subsequent processing with only the identified region, for example the extent of the product tag placement area, as the processing target.
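The difference between an actual and a virtual cutout can be shown with a small sketch. It assumes a grayscale image held as a list of rows; `Region`, `cut_out`, and `brightness_sum` are hypothetical names used only for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    left: int
    top: int
    right: int   # exclusive
    bottom: int  # exclusive


def cut_out(image, region):
    """Actual cutout: copy the pixels of the region into a new image."""
    return [row[region.left:region.right] for row in image[region.top:region.bottom]]


def brightness_sum(image, region=None):
    """Virtual cutout: read only the pixels inside the region, without copying."""
    if region is None:
        region = Region(0, 0, len(image[0]), len(image))
    return sum(image[y][x]
               for y in range(region.top, region.bottom)
               for x in range(region.left, region.right))
```

Both routes give the same result for any per-region computation; the virtual form simply carries the region coordinates alongside the original image instead of allocating a second one.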

The product tag placement area rectification processing unit 25 executes keystone correction to rectify the product tag placement area image information cut out by the product tag placement area cutout processing unit 24. Whereas the face of the display shelf is vertical, the face of a product tag is often tilted upward from vertical so that customers can read it easily. Rectifying the image information of the product tag placement area therefore improves recognition accuracy. FIG. 11 shows an example of the rectified image information obtained from the product tag placement areas of FIG. 9. FIG. 11(a) is the rectified image information of the product tag placement area of the upper shelf tier in FIG. 9, and FIG. 11(b) is that of the lower shelf tier in FIG. 9.

The product tag placement area rectification processing unit 25 rectifies the image information of the product tag placement area as follows. Edge detection is performed on the image information of the product tag placement area, and near-vertical contour lines of at least a certain length are identified at locations close to the left and right ends. A near-vertical contour line is one whose angle lies within a predetermined range around vertical (90 degrees), for example from 70 to 110 degrees. Extracting contour lines near the left and right ends is preferable but not required. For the image information of the product tag placement areas of FIG. 9, lines L1 to L4 are identified as shown in FIG. 12. FIG. 12(a) is the product tag placement area of the upper shelf tier in FIG. 9, and FIG. 12(b) is that of the lower shelf tier. The identified contour lines L1 to L4 are not actually drawn on the image information of the product tag placement area. Keystone correction is then executed on the image information of each product tag placement area so that L1 and L2 in FIG. 12(a), and L3 and L4 in FIG. 12(b), each become vertical lines.

This processing rectifies the image information of the product tag placement area and yields the rectified image information shown in FIG. 11. Because the processing of the product tag placement area rectification processing unit 25 improves the accuracy of the product tag identification processing unit 26 and the in-tag information identification processing unit 27, executing it is preferable, but it can be omitted. In that case, the product tag identification processing unit 26 and the in-tag information identification processing unit 27 operate on the product tag placement area cut out by the product tag placement area cutout processing unit 24.
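The selection of the reference lines can be sketched as a filter over line segments. The sketch assumes that a separate edge and line detection step has already produced segments as `(x1, y1, x2, y2)` tuples; the function names and the `min_length` parameter are hypothetical, while the 70 to 110 degree range is the example given above.

```python
import math


def segment_angle_deg(seg):
    """Angle of a segment in degrees, folded into [0, 180) so direction does not matter."""
    x1, y1, x2, y2 = seg
    return math.degrees(math.atan2(y2 - y1, x2 - x1)) % 180.0


def near_vertical(segments, min_length, lo=70.0, hi=110.0):
    """Keep segments that are long enough and within [lo, hi] degrees."""
    kept = []
    for seg in segments:
        x1, y1, x2, y2 = seg
        length = math.hypot(x2 - x1, y2 - y1)
        if length >= min_length and lo <= segment_angle_deg(seg) <= hi:
            kept.append(seg)
    return kept


def pick_reference_lines(segments, min_length):
    """Return the leftmost and rightmost near-vertical segments, or None if fewer than two exist."""
    candidates = near_vertical(segments, min_length)
    if len(candidates) < 2:
        return None
    by_x = sorted(candidates, key=lambda s: (s[0] + s[2]) / 2)
    return by_x[0], by_x[-1]
```

The end points of the two returned segments give four points that can serve as the source quadrilateral of a keystone correction whose target has both lines exactly vertical.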

The product tag identification processing unit 26 identifies the region of each product tag (product tag region) from the rectified product tag placement area image information. Two main methods can be used for this. The first identifies product tag regions based on contour lines. The second identifies product tag regions by matching overall features, such as the overall distribution of light and dark, against the image information of a product tag template. Methods other than these two may also be used.

The first method exploits the fact that the ground (background) color of a product tag is usually white, and therefore brighter than its surroundings. First, a histogram is generated by accumulating the brightness information of the normalized image information of the product tag placement area in the horizontal direction. The rising and falling positions of this histogram are then identified to obtain the upper edge position A and the lower edge position B of the product tags. FIG. 13 shows this processing schematically. A rise is a point where the histogram increases steeply (by at least a predetermined ratio) in the black-to-white direction, and a fall is a point where it decreases steeply in the white-to-black direction.
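A minimal sketch of the horizontal accumulation and rise/fall detection, assuming the image is a list of rows of grayscale values (0 to 255). The function names and the `ratio` threshold are illustrative assumptions; the source only says that the change must be at least a predetermined ratio.

```python
def row_profile(gray):
    """Accumulate brightness along each row (horizontal direction)."""
    return [sum(row) for row in gray]

def find_edges(profile, ratio=1.5):
    """Return (rises, falls) as lists of indices into the profile.

    A rise at i means profile[i] is at least `ratio` times profile[i - 1];
    a fall at i means profile[i - 1] is at least `ratio` times profile[i].
    """
    rises, falls = [], []
    for i in range(1, len(profile)):
        prev = profile[i - 1] + 1  # +1 avoids division by zero on black rows
        cur = profile[i] + 1
        if cur / prev >= ratio:
            rises.append(i)
        elif prev / cur >= ratio:
            falls.append(i)
    return rises, falls
```

With this convention the first rise gives the upper edge position A (the first bright row) and the following fall gives the lower edge position B (the first dark row after the tag).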

Next, the part of the normalized image information between the upper edge position A and the lower edge position B is cut out, and a histogram is generated by accumulating its brightness information in the vertical direction. The rising and falling positions of this histogram are identified, each rise is paired with a fall located within a predetermined distance range to its right, and the pair is taken as the left edge position and right edge position of a product tag area. For a rise that could not be paired, a product tag area is identified on its right side, and for a fall that could not be paired, on its left side, provided that no product tag area already exists within a predetermined distance. FIG. 14 shows this processing schematically.
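The pairing of rises and falls might look like the sketch below. The source does not say how wide a region created from an unpaired edge should be, so `default_width` is an assumption, as are the function names and the overlap test used to decide whether a product tag area already exists nearby.

```python
def pair_edges(rises, falls, min_width, max_width):
    """Pair each rise U with the first unused fall D to its right within range."""
    regions, used, unpaired_rises = [], set(), []
    for u in rises:
        match = next(
            (d for d in falls if d not in used and min_width <= d - u <= max_width),
            None,
        )
        if match is None:
            unpaired_rises.append(u)
        else:
            used.add(match)
            regions.append((u, match))
    unpaired_falls = [d for d in falls if d not in used]
    return regions, unpaired_rises, unpaired_falls

def add_unpaired(regions, unpaired_rises, unpaired_falls, default_width):
    """Add a region to the right of each unpaired rise and to the left of each
    unpaired fall, unless it would overlap a region that already exists."""
    out = list(regions)
    candidates = [(u, u + default_width) for u in unpaired_rises]
    candidates += [(d - default_width, d) for d in unpaired_falls]
    for cand in candidates:
        if not any(cand[0] < r[1] and r[0] < cand[1] for r in out):
            out.append(cand)
    return sorted(out)
```

Each resulting `(left, right)` pair, combined with the positions A and B, bounds one rectangular product tag area.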

FIG. 15 shows the product tag areas identified by the first method from the normalized image information of the product tag placement area in FIG. 11. Each rectangular area bounded by the upper edge position A, the lower edge position B, a left edge position (rise) U, and a right edge position (fall) D is an identified product tag area.

The second method is so-called template matching. Image information of a product tag is registered in advance as a template, and product tag areas are identified by matching the template against the normalized image information of the product tag placement area.

A product tag contains the tax-excluded price, the tax-included price, product identification information (such as the product name), the manufacturer name, the rating, and so on. If the template image contains specific digits or characters for the price or the product identification information, those parts also take part in the image matching judgment. It is therefore preferable to neutralize or exclude those parts from the judgment, for example by mosaicking or deleting them. Neutralization means that the part contributes neither a high nor a low score whatever the input is; exclusion means that the part is left out of the matching target during image matching.
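Excluding the variable parts of the template can be illustrated with a masked matching score. The sketch below uses a mean absolute difference over unmasked pixels and an exhaustive search; both are assumptions chosen for brevity, since the source does not name a specific matching measure. The images are lists of rows of grayscale values, and `mask` holds 1 for pixels that take part in matching and 0 for excluded pixels (at least one pixel must be unmasked).

```python
def masked_difference(image, template, mask, top, left):
    """Mean absolute difference over the unmasked template pixels."""
    total, count = 0, 0
    for y, row in enumerate(template):
        for x, value in enumerate(row):
            if mask[y][x]:
                total += abs(image[top + y][left + x] - value)
                count += 1
    return total / count

def best_match(image, template, mask):
    """Return (score, top, left) of the placement with the lowest score."""
    th, tw = len(template), len(template[0])
    best = None
    for top in range(len(image) - th + 1):
        for left in range(len(image[0]) - tw + 1):
            score = masked_difference(image, template, mask, top, left)
            if best is None or score < best[0]:
                best = (score, top, left)
    return best
```

Because the masked column is ignored, a tag whose price digits differ from those in the template still matches with a score of zero.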

The product tag information identification processing unit 27 identifies the information written in each product tag area identified by the product tag identification processing unit 26, using OCR or a similar technique. OCR may be applied to all or part of the identified product tag area. Applying it to the whole area is likely to cause misrecognition due to noise and the like, so it is preferable to limit the area subjected to OCR. In that case, the processing corresponds to whichever of the first or second method the product tag identification processing unit 26 used.

When the product tag identification processing unit 26 has used the first method, the identified product tag area is first binarized, and labeling is then performed on the binarized image information. Labeling assigns the same number (identification information) to connected white or black pixels in the binary image, grouping connected pixels into one island. A rectangular area (box) containing each island detected by labeling is generated, and the height, width, and baseline of each box are obtained. The box is preferably, but not necessarily, the smallest rectangle with vertical and horizontal sides that encloses the region labeled with the same number. Islands that do not satisfy predetermined height and width thresholds are treated as noise: no box is generated for them and they are removed from further processing. For example, an island that is too short may be a horizontal rule or dirt in the image, and an island that is too wide may be a logo; both are removed as noise.
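A minimal labeling sketch under stated assumptions: the binary image is a list of rows of 0/1 values with 1 for character pixels, connectivity is 4-neighbor, the baseline of a box is taken to be its bottom row, and the thresholds `min_height` and `max_width` are illustrative. None of these choices is specified in the source.

```python
from collections import deque

def label_boxes(binary, min_height=3, max_width=40):
    """Group connected 1-pixels into islands and return their bounding boxes."""
    h, w = len(binary), len(binary[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if binary[y][x] != 1 or seen[y][x]:
                continue
            # Breadth-first flood fill over one island.
            queue = deque([(y, x)])
            seen[y][x] = True
            top, bottom, left, right = y, y, x, x
            while queue:
                cy, cx = queue.popleft()
                top, bottom = min(top, cy), max(bottom, cy)
                left, right = min(left, cx), max(right, cx)
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if 0 <= ny < h and 0 <= nx < w and binary[ny][nx] == 1 and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            height, width = bottom - top + 1, right - left + 1
            if height < min_height or width > max_width:
                continue  # treated as noise: rule lines, dirt, logos
            boxes.append({"left": left, "top": top, "width": width,
                          "height": height, "baseline": bottom})
    return boxes
```

Boxes are returned in scan order (top to bottom, left to right by first pixel encountered).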

Characters on product tags are usually printed in a bold face such as Gothic. Therefore, even if the image is slightly out of focus, the characters forming one character string can be detected as islands with a common baseline and height.

The product tag information identification processing unit 27 then merges adjacent boxes that have a predetermined similarity into blocks. Specifically, consecutive boxes whose baselines and heights agree within a predetermined range, and whose heights and widths fall within certain thresholds, are merged into one block. Small boxes lying between the parts being merged are included in the same block, so that dakuten, handakuten, hyphens, and the like are also taken into the block. A block is a region subjected to OCR. The tallest block is presumed to be the price area (tax-excluded price area) and OCR is performed on it; OCR is likewise performed on the other blocks. This processing allows character recognition to be performed more accurately than OCR designed for documents laid out freely over multiple lines within the product tag area. FIGS. 16 to 18 show this processing schematically: FIG. 16 is the binarized image information, FIG. 17 shows the generated boxes (areas drawn with broken lines), and FIG. 18 shows the generated blocks (areas drawn with broken lines).
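The merging step can be sketched as a left-to-right sweep. The box dictionaries (keys `left`, `top`, `width`, `height`, `baseline`), the tolerance `tol`, and the gap limit `max_gap` are assumptions for illustration; absorbing small in-between boxes such as dakuten is omitted to keep the sketch short, and a merged block is only widened, keeping the height of its first box.

```python
def merge_boxes(boxes, tol=2, max_gap=6):
    """Merge boxes with similar baseline and height into blocks."""
    blocks = []
    for box in sorted(boxes, key=lambda b: b["left"]):
        for blk in blocks:
            gap = box["left"] - (blk["left"] + blk["width"])
            if (abs(box["baseline"] - blk["baseline"]) <= tol
                    and abs(box["height"] - blk["height"]) <= tol
                    and gap <= max_gap):
                right = max(blk["left"] + blk["width"], box["left"] + box["width"])
                blk["width"] = right - blk["left"]  # widen the block only
                break
        else:
            blocks.append(dict(box))  # start a new block
    return blocks

def tallest_block(blocks):
    """The tallest block is presumed to be the tax-excluded price area."""
    return max(blocks, key=lambda b: b["height"])
```

If OCR on the tallest block yields no numeric string, the same selection can be repeated on the remaining blocks.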

In this way, the product tag information identification processing unit 27 can recognize the characters written on the product tag.

When the product tag identification processing unit 26 has used the second method, the position and size (height and width) of each character frame, in which the tax-excluded price, the tax-included price, the manufacturer name, product identification information such as the product name, and the rating are written, are set in advance in the template image of the product tag. The image information at each of those locations is therefore cut out from the product tag area identified by the product tag identification processing unit 26 and subjected to OCR. Specifying as a constraint the character types used for each field (for example digits, Roman letters, symbols, or character strings) improves the accuracy of OCR.

The product tag information identification processing unit 27 further checks the consistency of the information it has read. Two kinds of consistency check are preferably performed: a check by dictionary matching and a logical check.

The consistency check by dictionary matching is performed, for example, as follows. The image recognition system 1 has a product dictionary (not shown) that stores product identification information, such as the product names of products that may be displayed on the display shelves, in association with the corresponding code information (for example, JAN codes). For each character string recognized by the product tag information identification processing unit 27, other than the string read from the price area, the edit distance (Levenshtein distance) to the product identification information of every product registered in the product dictionary is computed. If exactly one entry attains the minimum edit distance, that entry becomes the final candidate. An allowable edit distance is defined in advance according to the length of the final candidate's character string; if the minimum edit distance is within the allowable distance, the product identification information is identified, and if it exceeds the allowable distance, the read string is left undetermined. The read string is also left undetermined if several entries share the minimum edit distance.
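The dictionary check can be sketched with a standard dynamic-programming Levenshtein distance. The product names in the usage note and the `allowed_distance` rule (one edit per four characters) are hypothetical; the actual tolerances are given by the table in FIG. 20, which is not reproduced here.

```python
def levenshtein(a, b):
    """Minimum number of single-character insertions, deletions, substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution or match
        prev = cur
    return prev[-1]

def allowed_distance(length):
    """Hypothetical tolerance: one edit per four characters of the candidate."""
    return length // 4

def match_product(text, product_names):
    """Return the identified name, or None if the string stays undetermined."""
    scored = [(levenshtein(text, name), name) for name in product_names]
    best = min(d for d, _ in scored)
    winners = [name for d, name in scored if d == best]
    if len(winners) != 1:
        return None  # several entries share the minimum distance
    name = winners[0]
    return name if best <= allowed_distance(len(name)) else None
```

For example, with a hypothetical dictionary `["GREEN TEA 500", "GREEN TEA 350", "BLACK TEA 500"]`, the OCR output `"GREEN TEA 50O"` (letter O misread for zero) resolves to `"GREEN TEA 500"`, while `"GREEN TEA 550"` ties between two entries and is left undetermined.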

The edit distance is a measure of how much two character strings differ: it is the minimum number of single-character insertions, deletions, and substitutions needed to transform one string into the other. FIG. 19 shows examples of the product name written on the product tag, the OCR result, the product name in the product dictionary with the smallest edit distance, and that edit distance. FIG. 20 shows a table relating the number of characters of the final candidate to the edit distance at which it may be confirmed. Although this specification describes processing based on the edit distance, a distance function in which substitutions are assigned a shorter distance may be used instead. Here a distance function means any function that computes a distance indicating how much two character strings differ, and it includes the edit distance described above.

After the product identification information with the minimum edit distance has been obtained, the corresponding part is removed from the OCR result, and for the remaining string the edit distances to each rating stored in a separately provided rating dictionary (not shown) and to each manufacturer name stored in a manufacturer name dictionary (not shown) are computed. In the example "のどごし生350ml" in FIG. 19, the part "350ml" matches "350ml" in the rating dictionary with an edit distance of 0 and is identified as the rating part. Likewise, edit distances are computed for manufacturer names, and the string with the minimum edit distance is identified as the manufacturer name. The rating part and the manufacturer name part are then removed from the OCR result, the entry in the product name dictionary with the shortest edit distance to the remaining string is found, and it is judged whether that distance is allowable. If it is, the product identification information, the manufacturer name, and the rating are confirmed. This processing allows correct confirmation even when the product identification information includes a rating or a manufacturer name. The rating dictionary stores the ratings (such as volume) of products that may be displayed on the display shelves, and the manufacturer name dictionary stores the manufacturer names of such products.

When the recognition results are presented to the user for selection, confirmed and undetermined strings should be displayed so that they can be told apart, for example by using different colors or by attaching information indicating confirmed or undetermined status to one or both kinds of string. For an undetermined string, if several candidates share the minimum edit distance, each candidate may be displayed.

The logical check (judging the logical consistency of the recognized information) in the product tag information identification processing unit 27 can be performed as follows. If two prices are read from price areas, one is the tax-excluded price and the other the tax-included price; if the tax-excluded price is higher than the tax-included price, the two are swapped. If the value obtained by applying the consumption tax rate to the tax-excluded price does not match the tax-included price, one or both prices are judged to have been misrecognized. When product identification information such as a product name has been recognized, it is also judged whether the price falls within the normal price range of that product or product category. It may further be judged whether the product identification information, the manufacturer name, and the rating are mutually consistent.
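The price part of the logical check can be sketched as below. The 10% tax rate and the one-unit rounding tolerance are assumptions; the source states neither value, and the function name is illustrative.

```python
def check_prices(price_a, price_b, tax_rate=0.10, tolerance=1):
    """Order two recognized prices and test them against the tax rate.

    The lower price is taken as tax-excluded and the higher as tax-included,
    which implements the swap rule. `consistent` is False when the expected
    tax-included price differs from the recognized one by more than
    `tolerance`, meaning one or both values were probably misrecognized.
    """
    excluded, included = sorted((price_a, price_b))
    expected = excluded * (1 + tax_rate)
    return {
        "tax_excluded": excluded,
        "tax_included": included,
        "consistent": abs(expected - included) <= tolerance,
    }
```

A result with `consistent` set to False would be flagged for review rather than silently corrected.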

By checking the consistency of the information in the product tag in this way, the information contained in the product tags shown in the captured image information can be confirmed. The confirmed information can be output, for example, in tabular form.

Next, an example of the processing of the image recognition system 1 of the present invention is described with reference to the flowcharts of FIGS. 3 to 6.

Captured image information showing the display shelves of a store is input from the captured image information input terminal 4 and accepted by the captured image information input reception processing unit 20 of the management terminal 2 (S100). FIGS. 21 and 22 show examples of captured image information. The shooting date and time, the store identification information, and the image information identification information of the captured image information are also accepted. The captured image information input reception processing unit 20 stores these items in the captured image information storage unit 21 in association with one another.

When the management terminal 2 accepts a predetermined operation input, the normalized image information normalization processing unit extracts the captured image information stored in the captured image information storage unit 21, accepts input of the four points of the shelf position (the position of the display shelf) that serve as the vertices for keystone correction, and executes the keystone correction (S110). FIGS. 7 and 8 show examples of captured image information after keystone correction (normalized image information).

When the management terminal 2 then accepts a predetermined operation input on the normalized image information, the position specifying processing unit 23 specifies the product tag placement areas (S120); that is, it accepts input of the product tag placement areas in the normalized image information. FIGS. 9 and 10 show the state in which product tag placement areas have been specified. The product tag placement area cutout processing unit 24 cuts out the image information of the product tag placement areas specified in S120 (S130), and the product tag placement area normalization processing unit 25 normalizes that image information by executing keystone correction (S140).

To normalize the image information of a product tag placement area, edge detection is first performed on it. Among the detected edges, contour lines that are at least a certain length and whose angle is within a predetermined range of vertical are identified (S200). Of these, the leftmost and rightmost contour lines are identified (S210). FIG. 12 shows examples of contour lines identified in this way. Keystone correction is then applied to the image information of the product tag placement area so that the identified contour lines (L1 and L2 in FIG. 12(a), L3 and L4 in FIG. 12(b)) become vertical lines (S220). This yields the normalized image information of the product tag placement area shown in FIG. 11.

When normalization of the image information of the product tag placement area is completed in S140, the product tag identification processing unit 26 identifies the individual product tag areas from that image information by the first or second method (S150).

In the first method, a histogram is generated by accumulating the brightness information of the normalized image information of the product tag placement area in the horizontal direction (S300), and its rising and falling positions are identified. The rising position is taken as the upper edge position A of the product tags and the falling position as the lower edge position B (S310).

Next, the part between the upper edge position A and the lower edge position B is cut out of the normalized image information of the product tag placement area, and a histogram is generated by accumulating its brightness information in the vertical direction (S320).

In the generated histogram, rising positions U and falling positions D are identified. Each rising position U (left edge position) is paired with a falling position D (right edge position) within a predetermined distance range to its right, and each pair is identified as a product tag area (S330).

For a rising position U that could not be paired, a product tag area is identified on its right side, and for a falling position D that could not be paired, on its left side, provided that no product tag area exists within a predetermined distance (S340).

FIG. 15 shows the product tag areas identified by the above processing.

In the second method, the product tag identification processing unit 26 identifies product tag areas by image matching between the image information of the product tag template registered in advance and the normalized image information of the product tag placement area.

Once the product tag identification processing unit 26 has identified the product tag areas, the product tag information identification processing unit 27 identifies the information in each product tag (S160).

In the first method of the product tag information identification processing unit 27, the identified product tag area is binarized to obtain binarized image information (S400), and boxes are identified by labeling the binarized image information (S410). Islands that do not satisfy predetermined height and width thresholds are treated as noise: no box is generated for them and they are removed from further processing.

The height, width, and baseline of each generated box are then obtained. Adjacent boxes whose baselines and heights agree within a predetermined range, and whose heights and widths fall within certain thresholds, are merged into blocks (S420). The tallest block is presumed to be the price area (tax-excluded price area) and OCR is performed on it (S430). If OCR yields no price information, that is, no numeric string is recognized (S440), the next tallest block is presumed to be the price area (tax-excluded price area) and OCR is performed in the same way.

In the OCR of S430, adding the constraint that only characters making up a price display, such as digits and commas, are to be recognized improves the accuracy of price reading.

When price information has been obtained by OCR from the block presumed to be the price area (tax-excluded price area) (S440), the blocks other than the one from which that price information was obtained in S430 are identified (S450), and OCR is performed on each of them (S460). Two kinds of OCR are preferably performed here: ordinary standard OCR, and OCR constrained to the characters that make up a price display.

The blocks identified in S450 include the block of the price area (tax-included price area) and blocks containing information such as the manufacturer name, product identification information such as the product name, and the rating. Both kinds of OCR are executed on each block. For a block containing a manufacturer name, product identification information, or a rating, standard OCR recognizes the character string, whereas constrained OCR mostly produces a string full of errors. It is therefore judged whether the two recognition results differ by a predetermined value or more; if they do, the block is judged not to be the price area (tax-included price area) and the string from standard OCR is adopted as the recognition result. For the block of the price area (tax-included price area), both standard OCR and constrained OCR recognize a price string. If the two results do not differ by the predetermined value or more, the block is judged to be the price area (tax-included price area), and the string from constrained OCR is adopted as the price information.
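The comparison of the two OCR results can be sketched as follows, taking the two recognized strings as inputs (the OCR calls themselves are outside the sketch). Measuring the difference with `difflib.SequenceMatcher` and the `min_similarity` threshold of 0.8 are assumptions; the source only says the results are compared against a predetermined value.

```python
from difflib import SequenceMatcher

def classify_block(standard_text, constrained_text, min_similarity=0.8):
    """Decide whether a block is a price area from two OCR results.

    standard_text:    result of ordinary OCR
    constrained_text: result of OCR restricted to price characters
    Returns ("price", text) when the two results agree closely,
    otherwise ("other", text) with the standard OCR result.
    """
    similarity = SequenceMatcher(None, standard_text, constrained_text).ratio()
    if similarity >= min_similarity:
        return ("price", constrained_text)
    return ("other", standard_text)
```

A block of product-name text typically produces digit-like garbage under the price constraint, so its similarity is low and the standard result is kept.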

When the product tag identification processing unit 26 has used the second method, the position and size of each character frame in which the tax-excluded price, the tax-included price, the manufacturer name, product identification information such as the product name, and the rating are written are set in advance in the template product tag. It therefore suffices to cut out the image information at those locations from the product tag area identified by the product tag identification processing unit 26 and perform OCR on it.

そして商品タグ内情報特定処理部２７は，特定した商品名等との辞書照合処理を実行する（Ｓ４７０）。すなわち，読み取った文字列と，商品辞書における各商品名などの商品識別情報との編集距離を求め，最小の編集距離の商品名などの商品識別情報を特定し，それがあらかじめ定めた所定距離内であれば商品名などの商品識別情報として同定する。そして，読み取った文字列から最短の編集距離の商品名部分を取り除き，残りの部分文字列に対し，定格辞書における各定格との編集距離を求め，最小の編集距離が所定距離内であるかを判定して，所定距離内にあればその部分を定格の文字列として同定する。同様に，読み取った文字列から最小の編集距離の商品名部分と定格部分を取り除き，残りの部分文字列に対し，メーカー名辞書における各メーカー名との編集距離を求め，最小の編集距離が所定距離内であるかを判定して，所定距離内にあればその部分をメーカー名の文字列として同定する。 The product tag information specifying processing unit 27 then executes dictionary collation processing against the identified product name and the like (S470). That is, the edit distance between the read character string and each piece of product identification information, such as a product name, in the product dictionary is obtained, the product identification information with the minimum edit distance is found, and if that distance is within a predetermined distance, it is identified as the product identification information such as the product name. Next, the product name portion with the minimum edit distance is removed from the read character string, the edit distance between the remaining substring and each rating in the rating dictionary is obtained, and if the minimum edit distance is within the predetermined distance, that portion is identified as the rating character string. Similarly, the product name portion and the rating portion are removed from the read character string, the edit distance between the remaining substring and each manufacturer name in the manufacturer name dictionary is obtained, and if the minimum edit distance is within the predetermined distance, that portion is identified as the manufacturer name character string.
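The core of the dictionary collation in S470, finding the closest dictionary entry and accepting it only within a predetermined distance, can be sketched as below. The sketch assumes the Levenshtein distance as the edit distance; the function names and the `max_distance` parameter are hypothetical, and the removal of the matched product name portion before the rating and manufacturer look-ups is not shown.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, computed with a rolling row."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            substitution = previous[j - 1] + (char_a != char_b)
            current.append(min(previous[j] + 1,      # deletion
                               current[j - 1] + 1,   # insertion
                               substitution))
        previous = current
    return previous[-1]


def match_dictionary(text, dictionary, max_distance):
    """Return the closest dictionary entry, or None if it is too far away."""
    if not dictionary:
        return None
    best = min(dictionary, key=lambda entry: edit_distance(text, entry))
    return best if edit_distance(text, best) <= max_distance else None
```

The same helper could be applied in turn to the product dictionary, the rating dictionary, and the manufacturer name dictionary, each time on the substring that remains after the previously identified portion is removed.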

さらに，商品タグ内情報特定処理部２７は，文字認識した文字列に対するロジカルチェックの処理を実行する（Ｓ４８０）。すなわち文字認識した文字列が論理的に矛盾しないか，などを判定する。 Further, the product tag information identification processing unit 27 executes a logical check process for the character-recognized character string (S480). That is, it is judged whether or not the character strings recognized as characters are logically inconsistent.

ロジカルチェックの結果，矛盾がないようであれば，それぞれ認識した文字列について，税抜価格，税込価格，商品名などの商品識別情報，メーカー，定格を特定し，それらを，撮影日時，店舗識別情報，撮影画像情報の画像情報識別情報と対応づけて所定の記憶領域に記憶，出力をする。たとえば表形式で出力をする。 If the logical check finds no inconsistency, the tax-excluded price, the tax-included price, the product identification information such as the product name, the manufacturer, and the rating are identified from the recognized character strings, and these are stored in a predetermined storage area and output in association with the shooting date and time, the store identification information, and the image information identification information of the photographed image information. The output is, for example, in tabular format.

実施例１においては，撮影画像情報から商品タグに記載される情報を読み取る場合を説明したが，さらに，陳列棚に陳列する商品の画像認識と対応づけるようにしてもよい。すなわち，陳列棚に陳列する商品の商品名などの商品識別情報を，商品タグに記載される情報からも特定し，それらを照合するようにしてもよい。 In the first embodiment, the case where the information described in the product tag is read from the photographed image information has been described, but further, it may be associated with the image recognition of the product displayed on the display shelf. That is, the product identification information such as the product name of the product displayed on the display shelf may be specified from the information described in the product tag and collated with each other.

この場合，実施例１の画像認識システム１は，本実施例２における画像認識システム１の一部の機能を構成する。実施例２における画像認識システム１は，撮影画像情報入力受付処理部２０と，撮影画像情報記憶部２１と，撮影画像情報記憶部２１と，撮影画像情報正置化処理部２２と，位置特定処理部２３と，商品タグ認識処理部２８と，陳列商品認識処理部２９と，標本画像情報記憶部３０と，商品識別情報記憶部３１と，整合性判定処理部３２と，認識結果照合処理部３３とを有する。図２３に，実施例２における画像認識システム１のシステム構成の一例を示す。また，図２４に商品タグ認識処理部２８の構成の一例を，図２５に陳列商品認識処理部２９の構成の一例を示す。 In this case, the image recognition system 1 of the first embodiment constitutes part of the functions of the image recognition system 1 of the second embodiment. The image recognition system 1 of the second embodiment has a photographed image information input reception processing unit 20, a photographed image information storage unit 21, a photographed image information emplacement processing unit 22, a position specifying processing unit 23, a product tag recognition processing unit 28, a display product recognition processing unit 29, a sample image information storage unit 30, a product identification information storage unit 31, a consistency determination processing unit 32, and a recognition result collation processing unit 33. FIG. 23 shows an example of the system configuration of the image recognition system 1 in the second embodiment. FIG. 24 shows an example of the configuration of the product tag recognition processing unit 28, and FIG. 25 shows an example of the configuration of the display product recognition processing unit 29.

撮影画像情報入力受付処理部２０，撮影画像情報記憶部２１，撮影画像情報正置化処理部２２は，実施例１と同様である。 The photographed image information input reception processing unit 20, the photographed image information storage unit 21, and the photographed image information emplacement processing unit 22 are the same as those in the first embodiment.

位置特定処理部２３は，実施例１における位置特定処理部２３の機能に加え，棚段の領域(棚段領域）を特定する。すなわち，撮影画像情報および正置画像情報に写っている陳列棚のうち，商品が陳列される棚段領域と，そこに陳列される商品に対する商品タグが取り付けられている商品タグ配置領域とがある。そのため，正置画像情報から商品タグ配置領域と棚段領域とを特定する。商品タグ配置領域，棚段領域の特定としては，管理端末２の操作者が手動で商品タグ配置領域，棚段領域を指定し，それを位置特定処理部２３が受け付けてもよいし，初回に手動で入力を受け付けた商品タグ配置領域，棚段領域の情報に基づいて，二回目以降は自動で商品タグ配置領域，棚段領域を特定してもよい。 In addition to the functions of the position specifying processing unit 23 of the first embodiment, the position specifying processing unit 23 specifies the shelf areas. That is, a display shelf shown in the photographed image information and the normal image information contains shelf areas, in which products are displayed, and product tag placement areas, to which the product tags for the products displayed there are attached. The product tag placement areas and the shelf areas are therefore specified from the normal image information. The operator of the management terminal 2 may designate the product tag placement areas and shelf areas manually, with the position specifying processing unit 23 accepting that input, or, from the second time onward, the areas may be specified automatically based on the information on the areas whose input was accepted manually the first time.

商品タグ認識処理部２８は，商品タグの認識処理を実行する。すなわち，実施例１の画像認識システム１における商品タグ配置領域切出処理部２４，商品タグ配置領域正置化処理部２５，商品タグ特定処理部２６，商品タグ内情報特定処理部２７の各処理を実行する。各処理部における処理は，実施例１の画像認識システム１の場合と同様である。 The product tag recognition processing unit 28 executes the product tag recognition processing. That is, it executes the processes of the product tag placement area cutting processing unit 24, the product tag placement area normalization processing unit 25, the product tag specifying processing unit 26, and the product tag information specifying processing unit 27 of the image recognition system 1 of the first embodiment. The processing in each unit is the same as in the image recognition system 1 of the first embodiment.

陳列商品認識処理部２９は，撮影画像情報に写っている陳列棚における棚段に陳列されている商品を認識する処理を実行する。陳列商品認識処理部２９は，棚段領域切出処理部２９１とフェイス特定処理部２９２と商品識別情報特定処理部２９３と棚段画像マッチング処理部２９４とを有する。 The display product recognition processing unit 29 executes a process of recognizing the products displayed on the shelves in the display shelf shown in the photographed image information. The display product recognition processing unit 29 includes a shelf stage area cutting processing unit 291, a face identification processing unit 292, a product identification information identification processing unit 293, and a shelf stage image matching processing unit 294.

棚段領域切出処理部２９１は，位置特定処理部２３で特定した棚段の領域の画像情報を棚段領域画像情報として切り出す。棚段領域切出処理部２９１は，実際に，画像情報として切り出してもよいし，実際には画像情報としては切り出さずに，仮想的に切り出すのでもよい。なお，陳列棚に棚段が複数ある場合には，それぞれが棚段領域画像情報として切り出される。 The shelf area cutout processing unit 291 cuts out the image information of the shelf area specified by the position specifying processing unit 23 as the shelf area image information. The shelf area cutout processing unit 291 may actually cut out as image information, or may cut out virtually without actually cutting out as image information. If there are multiple shelves on the display shelf, each is cut out as shelf area image information.

フェイス特定処理部２９２は，正置画像情報における棚段領域における棚段ごとに，商品が置かれているフェイス（商品が置かれている領域）を特定する。フェイス特定処理部２９２は，初回のフェイスの特定処理と，二回目以降のフェイスの特定処理とに分かれる。 The face specifying processing unit 292 identifies the face on which the product is placed (the area on which the product is placed) for each shelf in the shelf area in the normal image information. The face identification processing unit 292 is divided into a first face identification process and a second and subsequent face identification processes.

フェイス特定処理部２９２における初回のフェイスの特定処理は，位置特定処理部２３で特定した棚段の座標で構成される領域（好ましくは矩形領域）の範囲内において，商品が置かれている領域（フェイス）を特定する。具体的には，商品と商品との間に生じる細く狭い陰影を特定する，画像の繰り返しパターンを特定する，パッケージの上辺の段差を特定する，商品幅が同一であるなどの制約に基づいて区切り位置を特定する，などによって，フェイスの領域を特定する。フェイスの特定処理としては，商品のカテゴリや商品の形態によって，任意の方法を採用可能であり，上記に限定するものではない。また，自動的に特定したフェイスに対して，担当者による修正入力を受け付けてもよい。さらに，担当者からフェイスの位置の入力を受け付けるのでもよい。特定したフェイスを構成する領域の座標は，正置画像情報におけるフェイスの領域の座標に，撮影日時情報，店舗情報，撮影画像情報の画像情報識別情報，正置画像情報の画像識別情報，フェイスを識別するためのフェイス識別情報とを対応づけて管理する。またフェイスの領域を示す座標としては，矩形領域を特定するために必要な頂点の座標であり，たとえば４点，または右上と左下，左上と右下の２点の座標でよい。 The first face identification process in the face identification processing unit 292 identifies the areas in which products are placed (faces) within the area (preferably a rectangular area) formed by the shelf coordinates specified by the position specifying processing unit 23. Specifically, the face areas are identified by, for example, detecting the thin, narrow shadows that form between products, detecting repeating patterns in the image, detecting steps along the upper edges of the packages, or determining the dividing positions under constraints such as the product widths being equal. Any method may be adopted for the face identification process depending on the product category and the product form, and the process is not limited to the above. Correction input from the person in charge may also be accepted for the automatically identified faces, or the face positions may be input by the person in charge. The coordinates of the area of each identified face in the normal image information are managed in association with the shooting date and time information, the store information, the image information identification information of the photographed image information, the image identification information of the normal image information, and the face identification information for identifying the face. The coordinates indicating a face area are the coordinates of the vertices needed to specify the rectangular area, and may be, for example, the coordinates of four points, or of two points such as the upper right and lower left, or the upper left and lower right.

フェイス特定処理部２９２における二回目以降のフェイスの特定処理は，同一の陳列棚の同一の棚段について，前回（Ｎ−１回目）の正置画像情報で特定したフェイスの領域の座標を今回（Ｎ回目）の正置画像情報で特定したフェイスの領域の座標とする。 In the second and subsequent face identification processes in the face identification processing unit 292, for the same shelf of the same display shelf, the coordinates of the face areas identified in the previous ((N-1)th) normal image information are used as the coordinates of the face areas in the current (Nth) normal image information.

フェイスの領域の座標は，棚段の位置の座標と同様に，正置画像情報における陳列棚など，画像情報における所定箇所（たとえば陳列棚の左上の頂点Ｃ１）を基準とした相対座標である。 Similar to the coordinates of the position of the shelf, the coordinates of the face area are relative coordinates based on a predetermined position in the image information (for example, the upper left vertex C1 of the display shelf) such as the display shelf in the vertical image information.
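As an illustration of the relative coordinates described above, the sketch below stores a face as a two-corner rectangle and converts it to and from coordinates relative to a reference point such as the upper-left vertex C1 of the display shelf. The class and function names are hypothetical, and the two-corner representation is just one of the options the description allows.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FaceRegion:
    """Rectangular face area given by its upper-left and lower-right corners."""
    face_id: str
    left: int
    top: int
    right: int
    bottom: int


def to_relative(face: FaceRegion, origin_x: int, origin_y: int) -> FaceRegion:
    """Express the face relative to a reference point (e.g. shelf vertex C1)."""
    return FaceRegion(face.face_id,
                      face.left - origin_x, face.top - origin_y,
                      face.right - origin_x, face.bottom - origin_y)


def to_absolute(face: FaceRegion, origin_x: int, origin_y: int) -> FaceRegion:
    """Inverse of to_relative: map relative coordinates back into the image."""
    return FaceRegion(face.face_id,
                      face.left + origin_x, face.top + origin_y,
                      face.right + origin_x, face.bottom + origin_y)
```

Because the stored coordinates are relative to the shelf, the face areas from the (N-1)th image can be reused on the Nth image even if the shelf sits at a slightly different position in the frame.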

商品識別情報特定処理部２９３は，陳列棚の棚段ごとに，フェイスに表示されている商品の商品識別情報を特定する。商品識別情報としては，商品名のほか，その商品に対して割り当てられているＪＡＮコードなどがあるが，それに限定されない。商品を識別することができる情報であればいかなるものでもよい。 The product identification information identification processing unit 293 specifies the product identification information of the product displayed on the face for each shelf of the display shelf. The product identification information includes, but is not limited to, the product name and the JAN code assigned to the product. Any information that can identify the product may be used.

商品識別情報特定処理部２９３は，以下のような処理を実行する。すなわち，フェイスごとに，フェイスの画像情報と，後述する標本画像情報記憶部３０に記憶する商品の標本画像情報とマッチングすることで，そのフェイスに表示されている商品の商品識別情報を特定する。具体的には，まず，処理対象となるフェイスの座標で構成される領域の画像情報と，標本画像情報記憶部３０に記憶する標本画像情報との類似性を判定し，その類似性がもっとも高い標本画像情報に対応する商品識別情報を，上記座標で構成されるフェイスに表示されている商品の商品識別情報として特定をする。 The product identification information identification processing unit 293 executes the following processing. That is, by matching the image information of the face with the sample image information of the product stored in the sample image information storage unit 30 described later for each face, the product identification information of the product displayed on the face is specified. Specifically, first, the similarity between the image information of the region composed of the coordinates of the face to be processed and the sample image information stored in the sample image information storage unit 30 is determined, and the similarity is the highest. The product identification information corresponding to the sample image information is specified as the product identification information of the product displayed on the face composed of the above coordinates.

ここでフェイスの画像情報と標本画像情報との類似性を判定するには，以下のような処理を行う。まず，商品識別情報特定処理部２９３における商品識別情報の特定処理の前までの処理において，正置画像情報の棚段におけるフェイスの領域の画像情報と，標本画像情報との方向が同じ（横転や倒立していない）となっており，また，それぞれの画像情報の大きさがおおよそ同じとなっている（所定範囲以上で画像情報の大きさが異なる場合には，類似性の判定の前にそれぞれの画像情報の大きさが所定範囲内となるようにサイズ合わせをしておく）。そして，フェイスの画像情報と，標本画像情報との類似性は，フェイスの画像情報の画像特徴量（たとえば局所特徴量）に基づく特徴点と，標本画像情報との画像特徴量（たとえば局所特徴量）に基づく特徴点を，それぞれ抽出する。そして，フェイスの画像情報の特徴点と，標本画像情報の特徴点とでもっとも類似性が高いペアを検出し，それぞれで対応する点の座標の差を求める。そして，差の平均値を求める。差の平均値は，フェイスの画像情報と，標本画像情報との全体の平均移動量を示している。そして，すべての特徴点のペアの座標差を平均の座標差と比較し，外れ度合いの大きなペアを除外する。そして，残った対応点の数で類似性を順位付ける。 The similarity between the face image information and the sample image information is determined as follows. By the time the product identification information identification processing unit 293 performs its identification process, the preceding processing has given the image information of the face area on the shelf in the normal image information the same orientation as the sample image information (neither is turned sideways or upside down) and approximately the same size (if the sizes differ by more than a predetermined range, they are adjusted to fall within the predetermined range before the similarity is determined). To determine the similarity, feature points based on image feature amounts (for example, local feature amounts) are extracted from the face image information and from the sample image information. The most similar pairs of feature points between the face image information and the sample image information are detected, and the difference between the coordinates of the corresponding points is obtained for each pair. The average of these differences is then obtained; it represents the overall average displacement between the face image information and the sample image information. The coordinate difference of every feature point pair is compared with the average coordinate difference, and pairs that deviate greatly are excluded. The similarity is then ranked by the number of remaining corresponding points.
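The filtering and ranking steps can be sketched as follows, assuming the feature points have already been extracted and paired by some local-feature matcher (not shown). The Euclidean deviation measure, the `max_deviation` threshold, and the function names are assumptions made for illustration.

```python
from math import hypot


def count_consistent_matches(pairs, max_deviation):
    """Count matched pairs whose shift stays close to the average shift.

    pairs -- list of ((x_face, y_face), (x_sample, y_sample)) matched points
    max_deviation -- hypothetical limit on the distance from the mean shift
    """
    if not pairs:
        return 0
    # Coordinate difference of each matched pair.
    shifts = [(xs - xf, ys - yf) for (xf, yf), (xs, ys) in pairs]
    # Average difference = overall average displacement between the images.
    mean_dx = sum(dx for dx, _ in shifts) / len(shifts)
    mean_dy = sum(dy for _, dy in shifts) / len(shifts)
    # Keep only pairs that do not deviate strongly from the average.
    return sum(1 for dx, dy in shifts
               if hypot(dx - mean_dx, dy - mean_dy) <= max_deviation)


def rank_samples(matches_by_sample, max_deviation=5.0):
    """Order sample ids by the number of remaining corresponding points."""
    scores = {sample_id: count_consistent_matches(pairs, max_deviation)
              for sample_id, pairs in matches_by_sample.items()}
    return sorted(scores, key=scores.get, reverse=True)
```

Note that a plain mean is itself pulled toward strong outliers, so a practical threshold has to be chosen with that in mind.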

以上のような方法でフェイスの画像情報と，標本画像情報との類似性を算出できる。また，その精度を向上させるため，さらに，色ヒストグラム同士のＥＭＤ（ＥａｒｔｈＭｏｖｅｒｓＤｉｓｔａｎｃｅ）を求め，類似性の尺度としてもよい。これによって，撮影された画像情報の明度情報等の環境変化に比較的強い類似性の比較を行うことができ，高精度で特定をすることができる。なお，類似性の判定は，上述に限定をするものではない。特定した商品識別情報は，撮影日時情報，店舗情報，撮影画像情報の画像情報識別情報，正置画像情報の画像識別情報，フェイスを識別するためのフェイス識別情報に対応づけて商品識別情報記憶部３１に記憶する。 The similarity between the face image information and the sample image information can be calculated by the above method. To improve accuracy, the EMD (Earth Mover's Distance) between color histograms may additionally be obtained and used as a measure of similarity. This allows a similarity comparison that is relatively robust to environmental changes, such as changes in the brightness of the photographed image information, so products can be identified with high accuracy. The determination of similarity is not limited to the above. The identified product identification information is stored in the product identification information storage unit 31 in association with the shooting date and time information, the store information, the image information identification information of the photographed image information, the image identification information of the normal image information, and the face identification information for identifying the face.
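For one-dimensional histograms with equally spaced bins, the EMD reduces to the sum of absolute differences of the cumulative distributions. The sketch below uses that special case; treating the color histogram as a single 1-D histogram (for example, one channel) is an assumption, and the description does not specify the color space or binning.

```python
def normalize(histogram):
    """Scale a histogram so that its bins sum to 1."""
    total = sum(histogram)
    if total == 0:
        raise ValueError("histogram must contain at least one count")
    return [value / total for value in histogram]


def emd_1d(hist_a, hist_b):
    """Earth Mover's Distance between two 1-D histograms with the same bins.

    The distance is measured in bin widths: moving all mass by one bin costs 1.
    """
    if len(hist_a) != len(hist_b):
        raise ValueError("histograms must have the same number of bins")
    carried = 0.0   # mass that still has to be moved to later bins
    distance = 0.0
    for mass_a, mass_b in zip(normalize(hist_a), normalize(hist_b)):
        carried += mass_a - mass_b
        distance += abs(carried)
    return distance
```

A smaller value means more similar color distributions, so it would be combined with the feature-point ranking as an additional score rather than used as a likelihood directly.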

以上のようにして特定した商品識別情報は，撮影日時情報，店舗情報，撮影画像情報の画像情報識別情報，正置画像情報の画像識別情報，フェイスを識別するためのフェイス識別情報に対応づけて商品識別情報記憶部３１に記憶する。 The product identification information identified as described above is associated with the shooting date / time information, the store information, the image information identification information of the shooting image information, the image identification information of the orthodox image information, and the face identification information for identifying the face. It is stored in the product identification information storage unit 31.

棚段画像マッチング処理部２９４は，前回（Ｎ−１回目）の正置画像情報における棚段の領域の画像情報と，今回（Ｎ回目）の正置画像情報における棚段の領域の画像情報とに基づいて，その類似性が高ければその棚段における各フェイスの商品識別情報は同一と判定する。この類似性の判定処理は，上述のように，前回（Ｎ−１回目）の正置画像情報における棚段の領域の画像情報の画像特徴量と，今回（Ｎ回目）の正置画像情報における棚段の領域の画像情報とに基づく類似性の判定でもよいし，色ヒストグラム同士のＥＭＤを用いたものであってもよい。また，それらに限定するものではない。そして，商品識別情報特定処理部２９３におけるフェイス単位ごとの特定処理ではなく，商品識別情報特定処理部２９３に，Ｎ回目の正置画像情報におけるその棚段における各フェイスの商品識別情報を，Ｎ−１回目の同一の棚段における各フェイスの商品識別情報と同一として，商品識別情報記憶部３１に記憶させる。これによって，あまり商品の動きがない棚段や逆にきわめて短いサイクルで管理される棚段など，変化がほとんど生じない棚段についての処理を省略することができる。なお，棚段画像マッチング処理部２９４による処理は設けなくてもよい。 The shelf image matching processing unit 294 compares the image information of a shelf area in the previous ((N-1)th) normal image information with the image information of the same shelf area in the current (Nth) normal image information, and if their similarity is high, determines that the product identification information of each face on that shelf is unchanged. As described above, this similarity may be determined from the image feature amounts of the shelf area image information in the previous ((N-1)th) and current (Nth) normal image information, or by using the EMD between color histograms, and is not limited to these. In that case, instead of the face-by-face identification process, the product identification information identification processing unit 293 is made to store in the product identification information storage unit 31 the product identification information of each face on that shelf in the Nth normal image information as being identical to that of each face on the same shelf in the (N-1)th. This makes it possible to omit processing for shelves on which almost no change occurs, such as shelves whose products move little or, conversely, shelves that are managed in very short cycles. The processing by the shelf image matching processing unit 294 need not be provided.
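The shortcut taken by the shelf image matching processing unit 294 can be sketched as a simple branch. The shelf-level similarity score, the `threshold` value, and the `identify_face` callback standing in for the face-by-face matching are all hypothetical.

```python
def resolve_shelf_products(face_images, previous_products, shelf_similarity,
                           identify_face, threshold=0.9):
    """Return {face_id: product_id} for the current (N-th) image of one shelf.

    face_images       -- {face_id: image} for the current shelf
    previous_products -- {face_id: product_id} stored for the (N-1)-th image
    shelf_similarity  -- similarity between the two shelf images (assumed 0..1)
    identify_face     -- callback performing the per-face sample matching
    """
    if previous_products and shelf_similarity >= threshold:
        # Shelf looks unchanged: carry over the previous results and
        # skip the per-face identification entirely.
        return dict(previous_products)
    # Otherwise identify every face individually.
    return {face_id: identify_face(image)
            for face_id, image in face_images.items()}
```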

標本画像情報記憶部３０は，正置画像情報に写っている陳列棚の棚段における各フェイスの商品がどの商品であるかを識別するための標本画像情報を記憶する。標本画像情報は，陳列棚に陳列される可能性のある商品を，上下，左右，斜めなど複数の角度から撮影をした画像情報である。図２８に標本画像情報記憶部３０に記憶される標本画像情報の一例を示す。図２８では，標本画像情報として，缶ビールをさまざまな角度から撮影をした場合を示しているが，缶ビールに限られない。標本画像情報記憶部３０は，標本画像情報と，商品識別情報とを対応付けて記憶する。 The sample image information storage unit 30 stores the sample image information for identifying which product is the product of each face on the shelf of the display shelf shown in the normal image information. Specimen image information is image information obtained by photographing products that may be displayed on a display shelf from a plurality of angles such as up and down, left and right, and diagonally. FIG. 28 shows an example of the sample image information stored in the sample image information storage unit 30. FIG. 28 shows the case where the canned beer is photographed from various angles as the sample image information, but the sample image information is not limited to the canned beer. The sample image information storage unit 30 stores the sample image information and the product identification information in association with each other.

なお，標本画像情報記憶部３０には，標本画像情報とともに，または標本画像情報に代えて，標本画像情報から抽出された，類似性の算出に必要となる情報，たとえば画像特徴量とその位置のペアの情報を記憶していてもよい。標本画像情報には，類似性の算出に必要となる情報も含むとする。この場合，商品識別情報特定処理部２９３は，フェイスの領域の画像情報と，標本画像情報とのマッチング処理を行う際に，標本画像情報について毎回，画像特徴量を算出せずともよくなり，計算時間を短縮することができる。 The sample image information storage unit 30 may store, together with or instead of the sample image information, information extracted from the sample image information that is needed to calculate the similarity, for example pairs of image feature amounts and their positions. The term sample image information is taken to include such information. In this case, the product identification information identification processing unit 293 does not need to calculate the image feature amounts of the sample image information each time it matches the image information of a face area against the sample image information, which shortens the computation time.

商品識別情報記憶部３１は，陳列棚の棚段の各フェイスに表示されている商品の商品識別情報を記憶する。たとえば，商品識別情報に対応付けて，撮影日時情報，店舗情報，撮影画像情報の画像情報識別情報，正置画像情報の画像識別情報，フェイスを識別するためのフェイス識別情報に対応づけて商品識別情報記憶部３１に記憶する。 The product identification information storage unit 31 stores the product identification information of the product displayed on each face of the shelf of the display shelf. For example, product identification is associated with product identification information, such as shooting date / time information, store information, image information identification information of shot image information, image identification information of orthodox image information, and face identification information for identifying faces. It is stored in the information storage unit 31.

整合性判定処理部３２は，商品タグ認識処理部２８による商品タグに表示される情報の認識結果と，陳列商品認識処理部２９による商品（商品識別情報）の認識結果について，各棚や各棚段に含まれている可能性の高い商品かどうかの整合性を判定する。たとえば，商品タグに表示される情報の認識結果，または商品の認識結果において，定格が３５０ｍｌの商品と認識しているが，その商品の陳列棚または棚段には５００ｍｌの商品が陳列されていること定められている場合には，同一の商品名の定格を「５００ｍｌ」に変更する。陳列棚または棚段に載置される商品については，あらかじめ設定されており，撮影画像情報記憶部２１に記憶する撮影画像情報に対応づけられていることが好ましい。 The consistency determination processing unit 32 determines, for the recognition result of the information displayed on the product tags by the product tag recognition processing unit 28 and the recognition result of the products (product identification information) by the display product recognition processing unit 29, whether each recognized product is one that is likely to be on the shelf or shelf stage in question. For example, if the recognition result of the information displayed on a product tag, or the recognition result of a product, gives a rating of 350 ml, but the display shelf or shelf for that product is defined as displaying 500 ml products, the rating for the same product name is changed to "500 ml". The products to be placed on each display shelf or shelf are preferably set in advance and associated with the photographed image information stored in the photographed image information storage unit 21.
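The 350 ml to 500 ml example can be sketched as a small correction rule. The record layout and the idea of configuring a set of allowed ratings per shelf are assumptions; the sketch only corrects the rating when the shelf configuration names exactly one rating.

```python
def enforce_shelf_rating(recognized, allowed_ratings):
    """Correct a recognized rating that contradicts the shelf configuration.

    recognized      -- dict with at least "product_name" and "rating"
    allowed_ratings -- set of ratings configured for the shelf (hypothetical)
    """
    if recognized["rating"] in allowed_ratings or len(allowed_ratings) != 1:
        # Already consistent, or the configuration is ambiguous: leave as is.
        return recognized
    corrected = dict(recognized)
    corrected["rating"] = next(iter(allowed_ratings))
    return corrected
```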

認識結果照合処理部３３は，陳列商品認識処理部２９において認識したフェイスごとの商品の商品識別情報と，商品タグ認識処理部２８において認識した商品の情報（商品識別情報）とを突合し，認識結果が一致しているかを照合する。 The recognition result collation processing unit 33 compares the product identification information of the product for each face recognized by the display product recognition processing unit 29 with the product information (product identification information) recognized by the product tag recognition processing unit 28, and checks whether the recognition results match.

具体的には，まず陳列商品認識処理部２９による認識処理の結果，類似性の高いフェイスが並んでいる区画を一群として，一つの棚段に何群あるかを特定する。また，それぞれの群の棚段の左右位置がどこかを特定する。そして，各群と左右位置が一致している，商品タグ認識処理部２８による商品タグの認識結果の情報を，各群に対応づける。 Specifically, first, as a result of the recognition process by the display product recognition processing unit 29, the number of groups on one shelf is specified as a group of sections in which faces having high similarity are lined up. Also, identify where the left and right positions of the shelves in each group are. Then, the information of the product tag recognition result by the product tag recognition processing unit 28, whose left and right positions are the same as those of each group, is associated with each group.
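The grouping of adjacent faces and the association of product tags by horizontal position can be sketched as below. For simplicity the sketch treats consecutive faces recognized as the same product as one group, and it attaches a tag to the group whose horizontal span contains the tag's center; both rules are assumptions standing in for the similarity-based grouping and the left-right position matching described above.

```python
def group_faces(faces):
    """Merge consecutive faces with the same product into groups.

    faces -- list of (left, right, product_id), sorted from left to right
    """
    groups = []
    for left, right, product_id in faces:
        if groups and groups[-1]["product_id"] == product_id:
            groups[-1]["right"] = right          # extend the current group
        else:
            groups.append({"product_id": product_id,
                           "left": left, "right": right, "tag": None})
    return groups


def attach_tags(groups, tags):
    """Attach each tag to the group whose span contains the tag center.

    tags -- list of (center_x, tag_text) from the product tag recognition
    """
    for center_x, text in tags:
        for group in groups:
            if group["left"] <= center_x <= group["right"]:
                group["tag"] = text
                break
    return groups
```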

フェイスによる群と商品タグとが対応づけている場合，以下の処理を実行する。まず，商品タグ認識処理部２８による商品タグの認識の結果，読み取った商品名（商品識別情報）を，尤度付きの候補商品リストに変換をする。この場合，商品Ａ１である確率をｐ１，商品Ａ２である確率をｐ２といったように確率に対応させて変換をする。なお，商品タグの認識結果から，メーカー名や定格情報が得られている場合には，そのメーカーの商品，その定格が存在する商品の尤度を高く設定する。また，商品タグの認識結果から価格情報が得られている場合には，その価格帯を売価としてもつ商品の尤度を高くする。さらに，税抜価格と税込価格の２つの価格が読み取られ，それらの間の比率がちょうど消費税の有無に一致しているなど，ロジカルチェックと一致している場合には，尤度を一層，高く設定する。加えて，このとき，陳列棚の棚段に陳列されている商品のジャンルなどがわかっている場合には，それらのジャンルに属する商品の尤度を高くまたは低く設定する。 When the group by face and the product tag are associated, the following processing is executed. First, as a result of product tag recognition by the product tag recognition processing unit 28, the read product name (product identification information) is converted into a candidate product list with likelihood. In this case, the probability of being the product A1 is converted to p1, the probability of being the product A2 is p2, and so on. If the manufacturer name and rating information are obtained from the recognition result of the product tag, the likelihood of the product of that manufacturer and the product having the rating is set high. If price information is obtained from the recognition result of the product tag, the likelihood of the product having that price range as the selling price is increased. Furthermore, if two prices, the tax-excluded price and the tax-included price, are read and the ratio between them exactly matches the presence or absence of consumption tax, and if it matches the logical check, the likelihood is further increased. Set high. In addition, at this time, if the genres of the products displayed on the shelves of the display shelves are known, the likelihood of the products belonging to those genres is set high or low.

そして陳列商品認識処理部２９による認識の結果，認識した商品の情報についても，同様に，画像類似性の程度に基づいて，尤度付きの候補商品リストを与える。この場合，商品Ｂ１である確率をＰｂ１，商品Ｂ２である確率をＰｂ２といったように確率に対応させて変換をする。 Then, as a result of the recognition by the display product recognition processing unit 29, the information of the recognized product is also given a candidate product list with a likelihood based on the degree of image similarity. In this case, the probability of being the product B1 is converted to Pb1, the probability of being the product B2 is Pb2, and so on.

そして，商品タグの認識結果による商品Ａ１，Ａ２などの各候補商品のリストと，陳列商品認識処理部２９の認識結果による商品Ｂ１，Ｂ２などの各候補商品のリストとを比較し，商品が両方に現れる（ケース１），Ａ群のみに現れる（ケース２），Ｂ群のみに現れる（ケース３）のいずれかに分類をし，ケース１についてはそれぞれのＡ群，Ｂ群の商品の尤度を合成した，いずれよりも高い尤度とする。また，ケース２，ケース３については，商品タグ認識処理部２８による商品タグの認識結果の精度と，陳列商品認識処理部２９による商品情報の認識結果の精度の総意を反映させた合成関数を適用することで，最終的な尤度付きの候補商品リストを生成する。 The list of candidate products A1, A2, and so on from the product tag recognition result is then compared with the list of candidate products B1, B2, and so on from the recognition result of the display product recognition processing unit 29, and each product is classified as appearing in both lists (Case 1), only in group A (Case 2), or only in group B (Case 3). For Case 1, the likelihoods of the product in group A and group B are combined into a likelihood higher than either. For Cases 2 and 3, a composite function that reflects the consensus of the accuracy of the product tag recognition result by the product tag recognition processing unit 28 and the accuracy of the product information recognition result by the display product recognition processing unit 29 is applied. The final candidate product list with likelihoods is generated in this way.
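One possible way to combine the two candidate lists is sketched below. The description does not give the combining functions, so both are assumptions: Case 1 uses a noisy-OR, which yields a value above both inputs whenever each lies strictly between 0 and 1, and Cases 2 and 3 scale the single available likelihood by a hypothetical per-recognizer accuracy weight.

```python
def fuse_candidates(tag_candidates, image_candidates,
                    tag_weight=0.5, image_weight=0.5):
    """Merge two {product_id: likelihood} lists into one ranked list.

    tag_weight / image_weight -- hypothetical accuracy weights applied when a
    product appears in only one of the two lists (Cases 2 and 3).
    """
    fused = {}
    for product in set(tag_candidates) | set(image_candidates):
        in_tags = product in tag_candidates
        in_images = product in image_candidates
        if in_tags and in_images:
            # Case 1: noisy-OR, never lower than either input.
            p_a, p_b = tag_candidates[product], image_candidates[product]
            fused[product] = 1.0 - (1.0 - p_a) * (1.0 - p_b)
        elif in_tags:
            # Case 2: only the product tag recognizer proposed it.
            fused[product] = tag_weight * tag_candidates[product]
        else:
            # Case 3: only the displayed-product recognizer proposed it.
            fused[product] = image_weight * image_candidates[product]
    return sorted(fused.items(), key=lambda item: item[1], reverse=True)
```

The first entry of the returned list corresponds to the top-ranked candidate; the first few entries could equally be shown to an operator for visual confirmation.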

以上のように最終的な尤度付きの候補商品リストを生成することで，候補となる商品を順位づけて特定することができるので，たとえば最上位（１位）の候補となる商品を商品として確定してもよいし，１位から所定順位までの候補となる商品を表示させ，目視の判断結果の選択入力を受け付けてもよい。 By generating the final list of candidate products with likelihood as described above, the candidate products can be ranked and specified. Therefore, for example, the highest (first) candidate product is used as the product. It may be confirmed, or the candidate products from the first place to the predetermined order may be displayed and the selection input of the visual judgment result may be accepted.

そして，確定した商品について，再度，読み取られた価格の尤度を算出する。すなわち，税抜価格と税込価格の比率，商品の売価の範囲内か，同商品の頻出価格との一致性または乖離性を判定し，価格の尤度を決定する。そして，この尤度があらかじめ定められた閾値よりも高ければその価格を自動的に確定し，低ければその旨を表示に反映させ，選択による入力を受け付けてもよい。 Then, the likelihood of the read price is calculated again for the confirmed product. That is, the ratio of the tax-excluded price to the tax-included price, whether it is within the range of the selling price of the product, or the consistency or divergence with the frequent price of the product is determined, and the likelihood of the price is determined. Then, if this likelihood is higher than a predetermined threshold value, the price may be automatically determined, and if it is lower, that fact may be reflected in the display and input by selection may be accepted.
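The price likelihood could be scored along the three criteria named above. In the sketch below, the candidate tax rates, the one-yen rounding tolerance, the score weights, and the parameter names are all assumptions chosen for illustration.

```python
def price_likelihood(price_excl, price_incl, tax_rates=(0.08, 0.10),
                     price_range=None, frequent_prices=()):
    """Score a read price pair between 0 and 1 (weights are hypothetical).

    price_excl / price_incl -- tax-excluded and tax-included prices as read
    tax_rates               -- assumed candidate consumption tax rates
    price_range             -- optional (low, high) selling price range
    frequent_prices         -- optional prices frequently seen for the product
    """
    score = 0.0
    # 1. The two prices should differ by a plausible tax rate (1 yen tolerance).
    if any(abs(price_excl * (1 + rate) - price_incl) <= 1.0 for rate in tax_rates):
        score += 0.5
    # 2. The price should lie within the product's selling price range.
    if price_range is not None and price_range[0] <= price_excl <= price_range[1]:
        score += 0.3
    # 3. The price should match one of the product's frequent prices.
    if price_excl in frequent_prices:
        score += 0.2
    return score
```

A score above a chosen threshold would confirm the price automatically; a lower score would flag it for manual selection.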

つぎに本実施例２における画像認識システム１の処理プロセスの一例を図２６のフローチャートを用いて説明する。なお，実施例１と同様の処理は説明を省略する。 Next, an example of the processing of the image recognition system 1 in the second embodiment will be described with reference to the flowchart of FIG. 26. Description of processing that is the same as in the first embodiment is omitted.

店舗の陳列棚が撮影された撮影画像情報（図２１，図２２）は，撮影画像情報入力端末４から入力され，管理端末２の撮影画像情報入力受付処理部２０でその入力を受け付ける（Ｓ１００）。また，撮影日時，店舗識別情報，撮影画像情報の画像情報識別情報の入力を受け付ける。そして，撮影画像情報入力受付処理部２０は，入力を受け付けた撮影画像情報，撮影日時，店舗識別情報，撮影画像情報の画像情報識別情報を対応づけて撮影画像情報記憶部２１に記憶させる。 The photographed image information of a store's display shelf (FIGS. 21 and 22) is input from the photographed image information input terminal 4 and accepted by the photographed image information input reception processing unit 20 of the management terminal 2 (S100). Input of the shooting date and time, the store identification information, and the image information identification information of the photographed image information is also accepted. The photographed image information input reception processing unit 20 then stores the accepted photographed image information, shooting date and time, store identification information, and image information identification information in association with one another in the photographed image information storage unit 21.

管理端末２において所定の操作入力を受け付けると，撮影画像情報正置化処理部２２は，撮影画像情報記憶部２１に記憶する撮影画像情報を抽出し，台形補正処理を行うための頂点である棚位置（陳列棚の位置）の４点の入力を受け付け，台形補正処理を実行する（Ｓ１１０）。このようにして台形補正処理が実行された撮影画像情報（正置画像情報）の一例が，図７，図８である。 When the management terminal 2 accepts a predetermined operation input, the photographed image information emplacement processing unit 22 extracts the photographed image information stored in the photographed image information storage unit 21, accepts input of the four points of the shelf position (the position of the display shelf) that serve as the vertices for keystone correction, and executes the keystone correction process (S110). FIGS. 7 and 8 show examples of photographed image information on which the keystone correction process has been executed in this way (normal image information).
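Keystone correction from four input points is commonly implemented as a projective transform (homography) that maps the four shelf corners onto an upright rectangle. The sketch below computes such a transform with plain Python; it is an assumption that the system uses a homography, and the function names are hypothetical. Resampling the image pixels through the transform is not shown.

```python
def solve_linear(matrix, rhs):
    """Solve matrix * x = rhs by Gauss-Jordan elimination with pivoting."""
    n = len(rhs)
    rows = [list(map(float, row)) + [float(value)]
            for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(rows[r][col]))
        if abs(rows[pivot][col]) < 1e-12:
            raise ValueError("degenerate corner configuration")
        rows[col], rows[pivot] = rows[pivot], rows[col]
        for r in range(n):
            if r != col:
                factor = rows[r][col] / rows[col][col]
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[col])]
    return [rows[i][n] / rows[i][i] for i in range(n)]


def homography(src, dst):
    """Return the 9 coefficients mapping four src points onto four dst points."""
    matrix, rhs = [], []
    for (x, y), (u, v) in zip(src, dst):
        matrix.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        rhs.append(u)
        matrix.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        rhs.append(v)
    return solve_linear(matrix, rhs) + [1.0]


def warp_point(h, x, y):
    """Apply the homography to a single point."""
    w = h[6] * x + h[7] * y + h[8]
    return ((h[0] * x + h[1] * y + h[2]) / w,
            (h[3] * x + h[4] * y + h[5]) / w)
```

Here `src` would hold the four shelf corners entered by the operator and `dst` the corners of the desired upright rectangle, both in the same order.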

そして，正置画像情報に対して，管理端末２において所定の操作入力を受け付けることで，位置特定処理部２３は，棚段領域および商品タグ配置領域を特定する（Ｓ１２０）。すなわち，正置画像情報における棚段領域，商品タグ配置領域の入力を受け付ける。図２９および図３０が，棚段領域および商品タグ配置領域が特定された状態を示す図である。 Then, upon receiving a predetermined operation input at the management terminal 2 for the normal image information, the position specifying processing unit 23 specifies the shelf area and the product tag arrangement area (S120). That is, it accepts input of the shelf area and the product tag arrangement area in the normal image information. FIGS. 29 and 30 show a state in which the shelf area and the product tag arrangement area have been specified.

以上のようにして，棚段領域，商品タグ配置領域を特定すると，棚段領域における陳列商品の認識処理を陳列商品認識処理部２９が，商品タグ配置領域における商品タグ認識処理を商品タグ認識処理部２８がそれぞれ実行する。なお陳列商品認識処理部２９における陳列商品の認識処理，商品タグ認識処理部２８による商品タグ認識処理は，並行して行ってもよいし，異なるタイミングで行ってもよい。 Once the shelf area and the product tag arrangement area have been specified as described above, the display product recognition processing unit 29 executes recognition of the displayed products in the shelf area, and the product tag recognition processing unit 28 executes product tag recognition in the product tag arrangement area. The displayed product recognition by the display product recognition processing unit 29 and the product tag recognition by the product tag recognition processing unit 28 may be performed in parallel or at different timings.

商品タグ認識処理部２８における商品タグ認識処理（Ｓ１３０乃至Ｓ１６０）は，実施例１と同様である。すなわち，商品タグ認識処理部２８における商品タグ配置領域切出処理部２４はＳ１２０で特定した商品タグ配置領域の画像情報を切り出し（Ｓ１３０），商品タグ配置領域正置化処理部２５が台形補正処理を実行することで，正置化処理を実行する（Ｓ１４０）。 The product tag recognition process (S130 to S160) in the product tag recognition processing unit 28 is the same as in the first embodiment. That is, the product tag placement area cutting processing unit 24 in the product tag recognition processing unit 28 cuts out the image information of the product tag placement area specified in S120 (S130), and the product tag placement area normalization processing unit 25 normalizes it by executing keystone correction (S140).

Ｓ１４０において商品タグ配置領域の正置化処理が終了すると，商品タグ認識処理部２８における商品タグ特定処理部２６が，正置化した商品タグ配置領域の画像情報から，個々の商品タグ領域を特定する（Ｓ１５０）。 When the product tag placement area normalization processing is completed in S140, the product tag identification processing unit 26 in the product tag recognition processing unit 28 identifies each product tag area from the image information of the product tag placement area. (S150).

商品タグ特定処理部２６が商品タグ領域を特定すると，商品タグ内情報特定処理部２７が，商品タグ内における情報を特定する（Ｓ１６０）。この特定によって，商品タグに記載した情報，たとえば税抜価格，税込価格，商品名（商品識別情報），定格などの情報を文字認識することができる。 When the product tag specifying processing unit 26 specifies the product tag area, the information specifying processing unit 27 in the product tag specifies the information in the product tag (S160). By this identification, the information described in the product tag, for example, the information such as the tax-excluded price, the tax-included price, the product name (product identification information), and the rating can be recognized in characters.

つぎに陳列商品認識処理部２９による陳列商品の認識処理を説明する。 Next, the display product recognition processing by the display product recognition processing unit 29 will be described.

棚段領域切出処理部２９１は，Ｓ１２０で入力を受け付けた棚段の領域に基づいて，正置画像情報から棚段領域の画像情報を切り出す（Ｓ１７０）。そして，棚段領域画像情報における棚段ごとに，フェイスを特定する処理を実行する（Ｓ１８０）。具体的には，棚段領域における棚段について，４点の座標で構成される矩形領域の範囲内において，商品と商品との間に生ずる細く狭い陰影を特定する，画像の繰り返しパターンを特定する，パッケージの上辺の段差を特定する，商品幅が同一であるなどの制約に基づいて区切り位置を特定する，などによって，フェイスを特定する。特定したフェイスには，フェイスを識別するためのフェイス識別情報を付す。そして，特定した各フェイスの座標は，撮影日時，店舗識別情報，撮影画像情報の画像情報識別情報，正置画像情報の画像情報識別情報，フェイスを識別するためのフェイス識別情報と対応付けて記憶させる。なお，フェイスの座標は４点を記憶せずとも，矩形領域を特定可能な２点であってもよい。 The shelf area cutout processing unit 291 cuts out the image information of the shelf area from the normal image information, based on the shelf area whose input was accepted in S120 (S170). Then, a process of specifying faces is executed for each shelf level in the shelf area image information (S180). Specifically, for each shelf level in the shelf area, within the rectangular area defined by the coordinates of four points, faces are specified by, for example, identifying the thin, narrow shadows that arise between adjacent products, identifying repeating patterns in the image, identifying steps along the upper edges of packages, or identifying dividing positions under constraints such as equal product widths. Each specified face is given face identification information for identifying it. The coordinates of each specified face are stored in association with the shooting date and time, the store identification information, the image information identification information of the captured image information, the image information identification information of the normal image information, and the face identification information. Instead of storing four points, the face coordinates may be stored as two points that suffice to define the rectangular area.

以上のように正置画像情報の棚段位置領域画像情報における各棚段の各フェイスを特定すると，商品識別情報特定処理部２９３は，フェイスごとに，標本画像情報記憶部３０に記憶する標本画像情報とマッチング処理を実行し，そのフェイスに表示されている商品の商品識別情報を特定する（Ｓ１９０）。すなわち，ある棚段のフェイスの矩形領域（この領域のフェイスのフェイス識別情報をＸとする）における画像情報と，標本画像情報記憶部３０に記憶する各標本画像情報とから，それぞれの画像特徴量を算出し，特徴点のペアを求めることで，類似性を判定する。そして，もっとも類似性の高い標本画像情報を特定し，そのときの類似性があらかじめ定められた閾値以上であれば，その標本画像情報に対応する商品識別情報を標本画像情報記憶部３０に基づいて特定する。そして，特定した商品識別情報を，そのフェイス識別情報Ｘのフェイスに表示されている商品の商品識別情報とする。そして商品識別情報特定処理部２９３は，特定した商品識別情報を，撮影日時，店舗識別情報，撮影画像情報の画像情報識別情報，正置画像情報の画像情報識別情報，フェイス識別情報に対応づけて商品識別情報記憶部３１に記憶する（Ｓ２００）。 When each face of each shelf in the shelf position area image information of the normal image information is specified as described above, the product identification information identification processing unit 293 stores the sample image in the sample image information storage unit 30 for each face. The information and the matching process are executed, and the product identification information of the product displayed on the face is specified (S190). That is, from the image information in the rectangular region of the face of a certain shelf (the face identification information of the face in this region is X) and each sample image information stored in the sample image information storage unit 30, each image feature amount is obtained. Is calculated and a pair of feature points is obtained to determine the similarity. Then, the sample image information having the highest similarity is specified, and if the similarity at that time is equal to or higher than a predetermined threshold value, the product identification information corresponding to the sample image information is obtained based on the sample image information storage unit 30. Identify. Then, the specified product identification information is used as the product identification information of the product displayed on the face of the face identification information X. 
Then, the product identification information identification processing unit 293 associates the identified product identification information with the shooting date / time, store identification information, image information identification information of the shot image information, image information identification information of the normal image information, and face identification information. It is stored in the product identification information storage unit 31 (S200).
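The selection rule in S190 (take the most similar sample image, and accept it only when its similarity reaches the threshold) can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `identify_product`, the dictionary of scores, and the product IDs are hypothetical, and the similarity values are assumed to have been computed beforehand from feature-point pairs.

```python
from typing import Dict, Optional


def identify_product(similarities: Dict[str, float], threshold: float) -> Optional[str]:
    """Return the product ID of the most similar sample image.

    `similarities` maps a (hypothetical) product ID to the similarity between
    the face image and that product's sample image. None is returned when no
    sample reaches the threshold, i.e. the face stays unidentified.
    """
    if not similarities:
        return None
    best_id = max(similarities, key=similarities.get)
    return best_id if similarities[best_id] >= threshold else None


# Illustrative scores for one face (face identification information X).
scores = {"cola_350ml": 0.62, "cola_500ml": 0.81, "tea_500ml": 0.15}
accepted = identify_product(scores, threshold=0.7)
rejected = identify_product(scores, threshold=0.9)
```

A face for which `None` is returned corresponds to the case where the product identification information has to be entered manually.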

なお，すべてのフェイスの商品識別情報を特定できるとは限らない。そこで，特定できていないフェイスについては，商品識別情報の入力を受け付け，入力を受け付けた商品識別情報を，撮影日時，店舗識別情報，撮影画像情報の画像情報識別情報，正置画像情報の画像情報識別情報，フェイス識別情報に対応づけて商品識別情報記憶部３１に記憶する。また，特定した商品識別情報の修正処理についても同様に，入力を受け付けてもよい。 Not all face product identification information can be specified. Therefore, for faces that have not been identified, the input of product identification information is accepted, and the product identification information for which the input is accepted is used as the shooting date and time, store identification information, image information identification information of shot image information, and image information of orthodox image information. It is stored in the product identification information storage unit 31 in association with the identification information and the face identification information. Similarly, input may be accepted for the correction process of the specified product identification information.

以上のような処理を行うことで，撮影画像情報に写っている陳列棚の棚段に陳列されている商品の商品識別情報を特定することができる。 By performing the above processing, it is possible to identify the product identification information of the product displayed on the shelf of the display shelf shown in the photographed image information.

このように商品タグ認識処理部２８による商品タグの認識結果，陳列商品認識処理部２９による陳列商品の認識結果，整合性判定処理部３２が各棚，各棚段に含まれている可能性の高い商品かどうかの整合性を判定する（Ｓ２１０）。たとえば，商品タグの認識結果，または商品の認識結果において，定格が３５０ｍｌの商品と認識しているが，その商品の陳列棚または棚段には５００ｍｌの商品が陳列されていること定められている場合には，同一の商品名の定格を「５００ｍｌ」に変更する。 Based on the product tag recognition result from the product tag recognition processing unit 28 and the displayed product recognition result from the display product recognition processing unit 29, the consistency determination processing unit 32 determines whether each recognized product is one that is likely to be on the shelf or shelf level in question (S210). For example, if the product tag recognition result or the product recognition result indicates a product rated at 350 ml, but the display shelf or shelf level is defined as holding 500 ml products, the rating for that product name is changed to "500 ml".

また，認識結果照合処理部３３は，陳列商品認識処理部２９において認識したフェイスごとの商品の商品識別情報と，商品タグ認識処理部２８において認識した商品の情報（商品識別情報）とを突合し，認識結果が一致しているかを照合する（Ｓ２２０）。 Further, the recognition result collation processing unit 33 collates the product identification information of the product for each face recognized by the display product recognition processing unit 29 with the product information (product identification information) recognized by the product tag recognition processing unit 28. It is collated whether the recognition results match (S220).

すなわち，認識結果照合処理部３３は，陳列商品認識処理部２９による認識処理の結果，類似性の高いフェイスが並んでいる区画を一群として，一つの棚段に何群あるかを特定する。また，それぞれの群の棚段の左右位置がどこかを特定する。そして，各群と左右位置が一致している，商品タグ認識処理部２８による商品タグの認識結果の情報を，群に対応づける。 That is, as a result of the recognition processing by the display product recognition processing unit 29, the recognition result collation processing unit 33 specifies how many groups are on one shelf, with the sections in which faces having high similarity are lined up as a group. Also, identify where the left and right positions of the shelves in each group are. Then, the information of the product tag recognition result by the product tag recognition processing unit 28, whose left and right positions match with each group, is associated with the group.
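The grouping and association step can be sketched as below. This is only an illustration under stated assumptions: faces are represented as `(x_left, x_right, product_id)` tuples for one shelf level, "highly similar faces lined up" is approximated by consecutive faces sharing the same product ID, and a tag is treated as matching a group when its horizontal center falls inside the group's span. None of these representations comes from the specification.

```python
from itertools import groupby


def group_faces(faces):
    """Merge consecutive faces with the same product ID into groups.

    `faces` is a list of (x_left, x_right, product_id) for one shelf level.
    """
    ordered = sorted(faces, key=lambda face: face[0])
    groups = []
    for product_id, run in groupby(ordered, key=lambda face: face[2]):
        run = list(run)
        groups.append({
            "product_id": product_id,
            "left": run[0][0],     # left edge of the first face in the run
            "right": run[-1][1],   # right edge of the last face in the run
            "tag": None,
        })
    return groups


def attach_tags(groups, tags):
    """Attach to each group the first tag whose center lies within its span.

    `tags` is a list of (x_center, tag_text) recognized on the same shelf level.
    """
    for group in groups:
        for x_center, text in tags:
            if group["left"] <= x_center < group["right"]:
                group["tag"] = text
                break
    return groups


faces = [(0, 50, "A"), (50, 100, "A"), (100, 160, "B")]
tags = [(48, "A 120 yen"), (130, "B 98 yen")]
groups = attach_tags(group_faces(faces), tags)
```

Here the two adjacent "A" faces form one group spanning 0 to 100, so the shelf level holds two groups, each linked to the tag beneath it.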

このようにフェイスによる群と商品タグとの対応付け後，商品識別情報記憶部３１に記憶するフェイスまたは群に対応する商品識別情報（商品名）と，商品タグ認識処理部２８による商品タグの認識結果とを比較し，それらの認識結果が一致するかを特定し，また読み取った価格を確定する。 After associating the group with the product tag by the face in this way, the product identification information (product name) corresponding to the face or group stored in the product identification information storage unit 31 and the product tag recognition by the product tag recognition processing unit 28. Compare the results, identify if those recognition results match, and determine the read price.

以上のような処理によって，陳列棚に陳列されている商品を画像認識処理によって認識した結果と，商品タグによる文字認識処理によって認識した結果とを比較して照合することができる。 By the above processing, it is possible to compare and collate the result of recognizing the product displayed on the display shelf by the image recognition processing and the result of recognizing by the character recognition processing by the product tag.

上述した実施例１，実施例２では，４点を指定することで台形補正処理を実行することとしたが，その基準となる頂点を毎回，指定して入力することは負担が大きい。そこで，台形補正処理の基準となる頂点を自動的に特定するように構成してもよい。この場合の処理を説明する。 In the above-mentioned Examples 1 and 2, the keystone correction process is executed by designating four points, but it is burdensome to specify and input the reference vertex each time. Therefore, it may be configured to automatically identify the vertex that is the reference of the keystone correction process. The processing in this case will be described.

この場合の撮影画像情報正置化処理部２２は，初回の台形補正処理と，二回目以降の台形補正処理とに分かれる。なお，初回とは一回目のほか，頂点を自動的に特定する際のずれを修正するため，任意のタイミングで手動で行う場合も含まれる。二回目以降とは初回以外である。 In this case, the captured image information normalization processing unit 22 is divided into the first keystone correction process and the second and subsequent keystone correction processes. In addition to the first time, the first time includes the case of manually performing at an arbitrary timing in order to correct the deviation when automatically identifying the vertices. The second and subsequent times are other than the first time.

撮影画像情報正置化処理部２２における初回の台形補正処理は，実施例１と同様に，陳列棚の長方形の領域の４頂点の指定の入力を受け付ける。陳列棚の長方形の領域の４頂点としては，陳列棚の棚位置の４頂点であってもよいし，棚段の４頂点や商品タグを取り付ける領域の４頂点であってもよい。また，２段，３段の棚段のまとまりの４頂点であってもよい。ここで指定を受け付けた４頂点は，撮影日時情報，店舗情報，撮影画像情報の画像情報識別情報と対応づけて記憶させる。そして撮影画像情報正置化処理部２２は，指定を受け付けた４頂点の座標に基づいて，撮影画像情報に対して台形補正処理を実行し，正置画像情報とする。 In the first keystone correction process, the captured image information normalization processing unit 22 accepts input designating the four vertices of a rectangular area of the display shelf, as in the first embodiment. The four vertices of the rectangular area may be the four vertices of the shelf position of the display shelf, the four vertices of a shelf level, or the four vertices of the area to which product tags are attached. They may also be the four vertices of a group of two or three shelf levels. The four designated vertices are stored in association with the shooting date and time information, the store information, and the image information identification information of the captured image information. Then, the captured image information normalization processing unit 22 executes the keystone correction process on the captured image information based on the coordinates of the four designated vertices, producing the normal image information.

撮影画像情報は，一定期間ごとに，同じような領域を同じような角度で撮影がされることが望ましい。しかし完全に同じ領域を同じ角度で撮影をすることはできない。そこで，撮影画像情報正置化処理部２２は，二回目以降の台形補正処理を以下のように実行をする。 It is desirable that the captured image information be captured in a similar area at a similar angle at regular intervals. However, it is not possible to shoot the exact same area at the same angle. Therefore, the captured image information normalization processing unit 22 executes the second and subsequent keystone correction processing as follows.

まず，撮影画像情報正置化処理部２２は，Ｎ回目の撮影画像情報に対応する同じ（ほぼ同じ）領域を撮影したＮ−１回目の撮影画像情報の頂点座標を，前回の処理の際に記憶した情報から特定する。Ｎ回目の撮影画像情報に対応する同じ（ほぼ同じ）領域を撮影したＮ−１回目の撮影画像情報の頂点座標は，撮影画像情報に対応する店舗識別情報，画像識別情報，撮影日時情報などに基づいて特定をする。そして，Ｎ−１回目の撮影画像情報に対して，特定をした４頂点の頂点座標を含む所定の大きさの矩形領域，たとえば棚段の幅の１／５程度の正方形を特徴量採取領域２２０として設定をする。Ｎ−１回目の撮影画像情報に対して，特徴量採取領域２２０を設定した状態の一例を図３１に示す。特徴量採取領域２２０は，頂点座標を含む矩形領域であればよい。一方，陳列棚の背景同士がマッチングをしてしまうと，撮影位置が少しずれるだけで背景が大きくずれてしまう。そこで，特徴量採取領域２２０は，なるべく陳列棚の内側を多く含む位置に設定することが好ましい。つまり，頂点座標は，特徴量採取領域２２０において，特徴量採取領域２２０の中心点よりも陳列棚の外側方向に位置していることが好ましい。 First, the captured image information normalization processing unit 22 specifies, from the information stored during the previous processing, the vertex coordinates of the (N-1)th captured image information, which shows the same (or almost the same) area as the Nth captured image information. These vertex coordinates are specified based on the store identification information, image identification information, shooting date and time information, and the like corresponding to the captured image information. Then, on the (N-1)th captured image information, a rectangular area of a predetermined size containing the vertex coordinates of the four specified vertices, for example a square with sides about 1/5 of the shelf width, is set as a feature amount collection area 220. FIG. 31 shows an example of the (N-1)th captured image information with the feature amount collection areas 220 set. The feature amount collection area 220 may be any rectangular area that contains the vertex coordinates. However, if the matching is made between the backgrounds behind the display shelf, even a slight shift in shooting position causes the background to shift significantly. 
Therefore, it is preferable to set the feature quantity collection area 220 at a position that includes as much of the inside of the display shelf as possible. That is, it is preferable that the apex coordinates are located outside the display shelf in the feature collection area 220 with respect to the center point of the feature collection area 220.

たとえば，頂点座標の４点は左上，右上，左下，右下に位置する。そして，特徴量採取領域２２０の矩形領域を縦横の中心でそれぞれ２分割した合計４領域に分割すると，左上の頂点座標を含む特徴量採取領域２２０では，その頂点座標が矩形領域のうち左上の領域に位置するように特徴量採取領域２２０を設定する。同様に，右上の頂点座標を含む特徴量採取領域２２０では，その頂点座標が矩形領域のうち右上の領域に位置するように特徴量採取領域２２０を設定し，左下の頂点座標を含む特徴量採取領域２２０では，その頂点座標が矩形領域のうち左下の領域に位置するように特徴量採取領域２２０を設定し，右下の頂点座標を含む特徴量採取領域２２０では，その頂点座標が矩形領域のうち右下の領域に位置するように特徴量採取領域２２０を設定する。これによって，頂点座標は，特徴量採取領域２２０において，特徴量採取領域２２０の中心点よりも陳列棚の外側方向に位置することとなる。 For example, the four points of vertex coordinates are located at the upper left, upper right, lower left, and lower right. Then, when the rectangular area of the feature amount collection area 220 is divided into two areas at the vertical and horizontal centers, a total of four areas are divided. In the feature amount collection area 220 including the upper left vertex coordinates, the vertex coordinates are the upper left area of the rectangular area. The feature amount collection area 220 is set so as to be located at. Similarly, in the feature amount collection area 220 including the upper right vertex coordinates, the feature amount collection area 220 is set so that the vertex coordinates are located in the upper right area of the rectangular area, and the feature amount collection including the lower left vertex coordinates is set. In the area 220, the feature amount collection area 220 is set so that the vertex coordinates are located in the lower left area of the rectangular area, and in the feature amount collection area 220 including the lower right vertex coordinates, the vertex coordinates are in the rectangular area. The feature amount collection area 220 is set so as to be located in the lower right area. As a result, the apex coordinates are located in the feature collection area 220 in the direction outside the display shelf from the center point of the feature collection area 220.
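One way to realize this placement rule is sketched below. It assumes image coordinates with y increasing downward, a square area 220 of side `side` (for instance one fifth of the shelf width), and a vertex placed at the center of the quadrant that matches its corner. The quarter-side offset and the names `collection_area` and `OFFSETS` are illustrative choices, not values from the specification.

```python
# Offsets of the region's top-left corner from the vertex, as fractions of the side.
# The vertex sits in the quadrant matching its corner (y grows downward).
OFFSETS = {
    "top_left": (0.25, 0.25),
    "top_right": (0.75, 0.25),
    "bottom_left": (0.25, 0.75),
    "bottom_right": (0.75, 0.75),
}


def collection_area(vertex, corner, side):
    """Return (left, top, right, bottom) of the square area 220 for one vertex."""
    vx, vy = vertex
    fx, fy = OFFSETS[corner]
    left = vx - fx * side
    top = vy - fy * side
    return (left, top, left + side, top + side)


# Shelf 200 px wide, so the side is about 1/5 of that width.
side = 200 / 5
area_tl = collection_area((100, 100), "top_left", side)
area_br = collection_area((500, 400), "bottom_right", side)
```

With this offset, the center of each area lies toward the inside of the shelf relative to its vertex, so most of the area covers shelf content instead of the background.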

つぎに，撮影画像情報正置化処理部２２は，Ｎ回目の撮影画像情報において，Ｎ−１回目の撮影画像情報に設定した特徴量採取領域２２０を内包し，Ｎ−１回目の撮影画像情報の特徴量採取領域２２０以上の大きさの特徴量採取領域２２１を設定する。Ｎ回目の撮影画像情報に設定する特徴量採取領域２２１は，短辺の１／２の大きさは超えない。さらに，撮影画像情報よりも外側に出る場合には，その範囲をトリミングする。Ｎ回目の撮影画像情報に対して特徴量採取領域２２１を設定した状態の一例を図３２に示す。 Next, on the Nth captured image information, the captured image information normalization processing unit 22 sets a feature amount collection area 221 that contains the feature amount collection area 220 set on the (N-1)th captured image information and is at least as large as that area 220. The feature amount collection area 221 set on the Nth captured image information does not exceed 1/2 of the short side. Furthermore, any part that extends beyond the captured image information is trimmed off. FIG. 32 shows an example of the Nth captured image information with the feature amount collection areas 221 set.
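The three constraints on area 221 (contain area 220, stay within half of the short side, and be trimmed to the image) can be sketched as follows. The uniform `margin` parameter is an assumption made for illustration, and "short side" is read here as the short side of the captured image, which the text does not state explicitly.

```python
def search_area(area_220, image_w, image_h, margin):
    """Return area 221 as (left, top, right, bottom) for one area 220.

    The area is grown by `margin` on every side, but never beyond half of the
    image's short side, and is then trimmed to the image bounds.
    """
    left, top, right, bottom = area_220
    max_side = min(image_w, image_h) / 2

    def grow(low, high):
        # Limit the margin so that the grown extent does not exceed max_side.
        allowed = max(0.0, min(margin, (max_side - (high - low)) / 2))
        return low - allowed, high + allowed

    left, right = grow(left, right)
    top, bottom = grow(top, bottom)
    # Trim whatever falls outside the captured image.
    return (max(0, left), max(0, top), min(image_w, right), min(image_h, bottom))


inner = search_area((90, 90, 130, 130), image_w=1000, image_h=800, margin=50)
near_edge = search_area((10, 10, 50, 50), image_w=1000, image_h=800, margin=50)
capped = search_area((0, 0, 390, 390), image_w=1000, image_h=800, margin=50)
```

The second call shows trimming at the image border, and the third shows the margin being reduced so the side stays at or below half of the short side.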

そして撮影画像情報正置化処理部２２は，Ｎ−１回目の撮影画像情報に対して設定した各特徴量採取領域２２０において，局所特徴量を採取し，局所特徴量による特徴点とその座標のセットとを記憶する。また，Ｎ回目の撮影画像情報に対して設定した各特徴量採取領域２２１において，局所特徴量を採取し，局所特徴量による特徴点とその座標のセットとを記憶する。 Then, the captured image information normalization processing unit 22 collects local feature amounts in each feature amount collection area 220 set on the (N-1)th captured image information, and stores the resulting feature points together with their coordinates. Likewise, it collects local feature amounts in each feature amount collection area 221 set on the Nth captured image information, and stores the resulting feature points together with their coordinates.

撮影画像情報正置化処理部２２は，Ｎ−１回目の撮影画像情報の特徴量採取領域２２０における特徴点の局所特徴量と，Ｎ−１回目の撮影画像情報の特徴量採取領域２２０に対応する位置にあるＮ回目の撮影画像情報の特徴量採取領域２２１における特徴点の局所特徴量とを比較する。そして，Ｎ−１回目の撮影画像情報の各特徴点の各局所特徴量にもっとも近い，Ｎ回目の撮影画像情報の各局所特徴量の特徴点を特定する。そしてもっとも近い局所特徴量同士の特徴点をペアとし，ペアとなる局所特徴量による特徴点の座標を対応づける。なお，この際に，局所特徴量同士の近さ（類似性）があらかじめ定められた閾値未満のペアは除外をする。これによって，Ｎ−１回目の特徴量採取領域２２０における局所特徴量の特徴点と，Ｎ回目の特徴量採取領域２２１におけるもっとも近い局所特徴量の特徴点同士のペアを特定できる。Ｎ−１回目の特徴量採取領域２２０の局所特徴量の特徴点と，Ｎ回目の特徴量採取領域２２１の局所特徴量の特徴点とのペアの関係を図３３に示す。図３３では，Ｎ−１回目の特徴量採取領域２２０における局所特徴量による特徴点の点群をＡ，Ｎ回目の特徴量採取領域２２１における局所特徴量による特徴点の点群をＢ，Ｎ−１回目の台形補正処理に用いた頂点をＣで示している。 The captured image information normalization processing unit 22 compares the local feature amounts of the feature points in the feature amount collection area 220 of the (N-1)th captured image information with the local feature amounts of the feature points in the feature amount collection area 221 of the Nth captured image information located at the position corresponding to that area 220. For each local feature amount of each feature point of the (N-1)th captured image information, it specifies the feature point of the Nth captured image information whose local feature amount is closest. The feature points with the closest local feature amounts are paired, and the coordinates of the paired feature points are associated with each other. Pairs whose closeness (similarity) between local feature amounts is below a predetermined threshold are excluded. In this way, pairs can be specified between the feature points in the (N-1)th feature amount collection area 220 and the feature points with the closest local feature amounts in the Nth feature amount collection area 221. FIG. 33 shows the pairing relationship between the feature points of the (N-1)th feature amount collection area 220 and the feature points of the Nth feature amount collection area 221. In FIG. 33, A denotes the point cloud of feature points based on local feature amounts in the (N-1)th feature amount collection area 220, B denotes the point cloud of feature points based on local feature amounts in the Nth feature amount collection area 221, and C denotes the vertex used for the (N-1)th keystone correction process.
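The pairing step can be sketched as a nearest-neighbor search over descriptors. This is a simplified illustration: descriptors are plain lists of floats, closeness is measured by Euclidean distance, and the similarity threshold of the text is expressed as a maximum allowed distance `max_distance`. The specification does not fix the descriptor type or the distance measure, so these are assumptions.

```python
import math


def match_features(points_a, points_b, max_distance):
    """Pair each feature point of area 220 with its nearest neighbor in area 221.

    Each point is ((x, y), descriptor). Pairs whose descriptor distance exceeds
    `max_distance` (i.e. whose similarity is too low) are dropped.
    Returns a list of (coord_a, coord_b) pairs.
    """
    pairs = []
    for coord_a, desc_a in points_a:
        best = min(points_b, key=lambda p: math.dist(desc_a, p[1]), default=None)
        if best is not None and math.dist(desc_a, best[1]) <= max_distance:
            pairs.append((coord_a, best[0]))
    return pairs


# Point cloud A (previous image) and point cloud B (current image).
cloud_a = [((10, 10), [0.0, 1.0]), ((20, 15), [1.0, 0.0]), ((30, 30), [5.0, 5.0])]
cloud_b = [((13, 12), [0.1, 0.9]), ((23, 17), [0.9, 0.1])]
pairs = match_features(cloud_a, cloud_b, max_distance=0.5)
```

The third point of cloud A has no sufficiently close descriptor in cloud B, so it is excluded, leaving two coordinate pairs.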

Ｎ−１回目の特徴量採取領域２２０における局所特徴量による特徴点の点群Ａの座標と，点群Ａに対応するＮ回目の特徴量採取領域２２１における局所特徴量による特徴点の点群Ｂの座標とに基づいて，点群Ａを点群Ｂに射影する関数Ｆ（アフィン変換）を求める。関数Ｆは，サンプリング推定を反復する，ロバスト推定の一種であるＯｐｅｎＣＶのＲＡＮＳＡＣを利用するなどの方法があるが，それらに限定しない。なお，射影の関係にある関係線からずれが大きいペアは処理対象から除外をする。 The coordinates of the point cloud A of the feature points due to the local feature amount in the N-1st feature amount collection area 220 and the point cloud B of the feature points due to the local feature amount in the Nth feature amount collection area 221 corresponding to the point group A. Based on the coordinates of, the function F (affine transformation) that projects the point cloud A onto the point cloud B is obtained. The function F has methods such as repeating sampling estimation and using RANSAC of OpenCV, which is a kind of robust estimation, but is not limited thereto. Pairs with a large deviation from the relational line that is in the projection relationship are excluded from the processing target.

撮影画像情報正置化処理部２２において関数Ｆを求めたのち，撮影画像情報正置化処理部２２は，Ｎ−１回目の台形補正処理で用いた頂点Ｃの座標を，関数Ｆに基づいてＮ回目の撮影画像情報に射影し，Ｎ回目の台形補正処理のための頂点Ｄの座標として特定する。これを模式的に示すのが図３４である。 After the function F is obtained by the captured image information normalization processing unit 22, the captured image information normalization processing unit 22 determines the coordinates of the vertices C used in the N-1th keystone correction processing based on the function F. It is projected onto the Nth captured image information and specified as the coordinates of the vertex D for the Nth trapezoidal correction process. FIG. 34 schematically shows this.
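The estimation of F and the projection of vertex C onto vertex D can be sketched in plain Python as below. This is a hand-rolled, simplified RANSAC-style loop written for illustration only: it fits an exact affine map to three randomly sampled pairs, keeps the model with the most inliers, and does not refit on the inliers. It is not OpenCV's implementation, and the names `solve_affine`, `ransac_affine`, the tolerance, and the iteration count are all assumptions.

```python
import math
import random


def solve_affine(src, dst):
    """Exact affine map from three point pairs, or None if src is collinear."""
    (x1, y1), (x2, y2), (x3, y3) = src
    det = x1 * (y2 - y3) - y1 * (x2 - x3) + (x2 * y3 - x3 * y2)
    if abs(det) < 1e-9:
        return None

    def row(u1, u2, u3):
        # Cramer's rule for u = a*x + b*y + c at the three source points.
        a = (u1 * (y2 - y3) - y1 * (u2 - u3) + (u2 * y3 - u3 * y2)) / det
        b = (x1 * (u2 - u3) - u1 * (x2 - x3) + (x2 * u3 - x3 * u2)) / det
        c = (x1 * (y2 * u3 - y3 * u2) - y1 * (x2 * u3 - x3 * u2)
             + u1 * (x2 * y3 - x3 * y2)) / det
        return a, b, c

    return row(*(q[0] for q in dst)) + row(*(q[1] for q in dst))


def apply_affine(model, point):
    a, b, c, d, e, f = model
    x, y = point
    return (a * x + b * y + c, d * x + e * y + f)


def ransac_affine(pairs, tolerance=2.0, iterations=200, seed=0):
    """Return the sampled affine model that agrees with the most pairs."""
    rng = random.Random(seed)
    best_model, best_inliers = None, -1
    for _ in range(iterations):
        sample = rng.sample(pairs, 3)
        model = solve_affine([p for p, _ in sample], [q for _, q in sample])
        if model is None:
            continue
        inliers = sum(1 for p, q in pairs
                      if math.dist(apply_affine(model, p), q) <= tolerance)
        if inliers > best_inliers:
            best_model, best_inliers = model, inliers
    return best_model


# Point cloud A shifted by (+3, +2) gives point cloud B; the last pair is an outlier.
pairs = [((0, 0), (3, 2)), ((10, 0), (13, 2)), ((0, 10), (3, 12)),
         ((10, 10), (13, 12)), ((5, 5), (8, 7)), ((7, 3), (40, 40))]
function_f = ransac_affine(pairs)
vertex_c = (2, 4)
vertex_d = apply_affine(function_f, vertex_c)  # approximately (5.0, 6.0)
```

The outlier pair plays the role of a pair that deviates strongly from the projection relationship and is therefore ignored when choosing F. In practice a library routine for robust affine estimation would normally replace this loop.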

以上の処理を各特徴量採取領域２２０，２２１に対して行うことで，Ｎ回目の撮影画像情報における台形補正処理のための棚位置の４頂点を特定する。そして，撮影画像情報正置化処理部２２は，特定した４頂点に基づいて，Ｎ回目の撮影画像情報に対する台形補正処理を実行して正置化し，正置画像情報を生成し，記憶する。この際に，撮影画像情報正置化処理部２２は，正置画像情報に対応付けて，撮影日時情報，店舗情報，撮影画像情報の画像情報識別情報，正置画像情報の画像識別情報と対応づけて記憶をさせる。特定したＮ回目の撮影画像情報に対応する頂点の座標は，撮影日時情報，店舗情報，撮影画像情報の画像情報識別情報と対応づけて記憶させる。 By performing the above processing for each pair of feature amount collection areas 220 and 221, the four vertices of the shelf position for keystone correction of the Nth captured image information are specified. Then, based on the four specified vertices, the captured image information normalization processing unit 22 executes the keystone correction process on the Nth captured image information to normalize it, and generates and stores the normal image information. At this time, the captured image information normalization processing unit 22 stores the normal image information in association with the shooting date and time information, the store information, the image information identification information of the captured image information, and the image identification information of the normal image information. The coordinates of the vertices specified for the Nth captured image information are stored in association with the shooting date and time information, the store information, and the image information identification information of the captured image information.

なお，撮影画像情報正置化処理部２２における台形補正処理で用いる頂点の特定処理は，本発明のように陳列棚を撮影した画像情報から商品を特定する場合に限らず，同一の撮影対象物を撮影した複数の画像情報を正置化し，正置画像情報を生成する画像認識システム１にも適用することができる。これによって，同一の撮影対象物を撮影した複数の画像情報について，それぞれ正置化して，その撮影対象物の正置画像情報を生成することができる。 Note that the vertex identification process used in the trapezoidal correction processing in the captured image information emplacement processing unit 22 is not limited to the case of identifying a product from the image information obtained by photographing the display shelf as in the present invention, and is the same object to be photographed. It can also be applied to the image recognition system 1 in which a plurality of image information obtained by capturing the image is rectified and the rectified image information is generated. As a result, it is possible to generate the orthodox image information of the same object to be photographed by rectifying each of a plurality of image information of the same object to be photographed.

また，Ｎ回目の撮影画像情報における台形補正処理のための棚位置の４頂点を特定するため，上述では，Ｎ−１回目の撮影画像情報における特徴量採取領域２２０での局所特徴量による特徴点の点群Ａと，Ｎ回目の撮影画像情報における特徴量採取領域２２１での局所特徴量による特徴点の点群Ｂとを用いて関数Ｆを求め，Ｎ−１回目の台形補正処理で用いた頂点Ｃの座標を，関数ＦによりＮ回目の撮影画像情報に射影し，Ｎ回目の台形補正処理のための頂点Ｄの座標として特定する処理を説明した。しかし，かかる処理では，Ｎ−１回目の撮影画像情報と，Ｎ回目の撮影画像情報とにおいて，類似する画像情報の対応点の座標（位置）を見つければよいので，上記の方法にするものではなく，画像情報内の箇所を特定するタイプの特徴量であればいかなるものであってもよい。たとえば，画像情報内における尖った箇所，ハイライトのポイントなどがある。本明細書では，局所特徴量などの，画像情報内の箇所を特定する特徴量を画像特徴量（位置特定型画像特徴量）という。なお，本明細書の説明では，画像特徴量として，上述のように局所特徴量を用いる場合を説明する。 Further, to specify the four vertices of the shelf position for keystone correction of the Nth captured image information, the above description obtained the function F from the point cloud A of feature points based on local feature amounts in the feature amount collection area 220 of the (N-1)th captured image information and the point cloud B of feature points based on local feature amounts in the feature amount collection area 221 of the Nth captured image information, projected the coordinates of the vertex C used in the (N-1)th keystone correction process onto the Nth captured image information using the function F, and specified the result as the coordinates of the vertex D for the Nth keystone correction process. However, this processing only needs to find the coordinates (positions) of corresponding points of similar image content in the (N-1)th and Nth captured image information, so it is not limited to the above method; any type of feature amount that identifies a location within the image information may be used. Examples include sharp points and highlight points in the image information. In this specification, a feature amount that identifies a location in image information, such as a local feature amount, is referred to as an image feature amount (position-specifying image feature amount). 
In the description of the present specification, the case where the local feature amount is used as the image feature amount as described above will be described.

つぎに，実施例３における台形補正処理を行うための頂点の特定処理を説明する。この場合，任意の陳列棚を撮影した撮影画像情報において台形補正処理を行うための頂点がすでに特定されており，所定期間（たとえば一週間）経過後に，同一の陳列棚について，同じような領域を同じような角度で撮影した撮影画像情報について行う場合を説明する。 Next, the vertex identification process for performing the keystone correction process in the third embodiment will be described. In this case, the apex for performing the keystone correction processing has already been specified in the photographed image information obtained by photographing an arbitrary display shelf, and after a predetermined period (for example, one week) has elapsed, a similar area is formed for the same display shelf. A case will be described in which the captured image information captured at the same angle is used.

店舗の陳列棚が撮影された撮影画像情報は，撮影画像情報入力端末４から入力され，管理端末２の撮影画像情報入力受付処理部２０でその入力を受け付ける。図３５に，撮影画像情報の一例を示す。また，撮影日時，店舗識別情報，撮影画像情報の画像情報識別情報の入力を受け付ける。そして，撮影画像情報入力受付処理部２０は，入力を受け付けた撮影画像情報，撮影日時，店舗識別情報，画像情報識別情報を対応づけて撮影画像情報記憶部２１に記憶させる。 The photographed image information photographed on the display shelf of the store is input from the photographed image information input terminal 4, and the input is accepted by the photographed image information input reception processing unit 20 of the management terminal 2. FIG. 35 shows an example of captured image information. It also accepts input of image information identification information such as shooting date and time, store identification information, and shooting image information. Then, the photographed image information input reception processing unit 20 stores the photographed image information, the shooting date and time, the store identification information, and the image information identification information that received the input in the photographed image information storage unit 21 in association with each other.

管理端末２において所定の操作入力を受け付けると，撮影画像情報正置化処理部２２は撮影画像情報記憶部２１に記憶する撮影画像情報を抽出し，台形補正処理を実行するための，棚位置の頂点Ｄ（Ｄ１乃至Ｄ４）を特定する処理を実行する。 When the management terminal 2 receives a predetermined operation input, the captured image information normalization processing unit 22 extracts the captured image information stored in the captured image information storage unit 21 and executes the keystone correction processing at the shelf position. A process for identifying the vertices D (D1 to D4) is executed.

今回（Ｎ回目）の撮影画像情報（図３５）に対応する同じまたはほぼ同じ領域を撮影した前回（Ｎ−１回目）の撮影画像情報（図３６）の頂点座標（たとえば頂点座標Ｃ１乃至Ｃ４とする）を特定する。前回の撮影画像情報の頂点座標は，撮影日時，店舗識別情報，撮影画像情報の画像識別情報などに基づいて特定をすればよい。 The vertex coordinates (for example, vertex coordinates C1 to C4) of the previous ((N-1)th) captured image information (FIG. 36), which shows the same or almost the same area as the current (Nth) captured image information (FIG. 35), are specified. The vertex coordinates of the previous captured image information may be specified based on the shooting date and time, the store identification information, the image identification information of the captured image information, and the like.

撮影画像情報正置化処理部２２は，撮影画像情報記憶部２１からＮ−１回目の撮影画像情報を抽出し，それぞれの頂点Ｃ１乃至Ｃ４について，頂点を一つずつ含む所定の大きさの矩形領域を特徴量採取領域２２０として，Ｎ−１回目の撮影画像情報に設定する。Ｎ−１回目の撮影画像情報に特徴量採取領域２２０を設定した状態を図３６に示す。 The captured image information normalization processing unit 22 extracts the (N-1)th captured image information from the captured image information storage unit 21 and, for each of the vertices C1 to C4, sets on the (N-1)th captured image information a rectangular area of a predetermined size containing that one vertex as a feature amount collection area 220. FIG. 36 shows the (N-1)th captured image information with the feature amount collection areas 220 set.

The captured image information rectification processing unit 22 also extracts the Nth captured image information (FIG. 35) from the captured image information storage unit 21 and sets, in the Nth captured image information, feature collection areas 221 that each cover a wider range than the (N-1)th feature collection areas 220. FIG. 37 shows the Nth captured image information with the feature collection areas 221 set. Each feature collection area 221 in the Nth captured image information contains one (N-1)th feature collection area 220. FIG. 37 illustrates this inclusion relationship by drawing the (N-1)th feature collection areas 220 inside the Nth feature collection areas 221.
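As a non-limiting illustration, the two sets of feature collection areas could be generated as in the following Python sketch. The sizes (half_220, half_221), the choice of squares centered on each vertex, the clipping to the image bounds, the assumption that both images have the same size, and all names are assumptions made for this example. The description only requires that each area 220 contain one vertex and that each area 221 contain the corresponding area 220.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    left: int
    top: int
    right: int   # exclusive
    bottom: int  # exclusive

    def contains(self, other: "Rect") -> bool:
        """True if `other` lies entirely inside this rectangle."""
        return (self.left <= other.left and self.top <= other.top
                and self.right >= other.right and self.bottom >= other.bottom)


def region_around(vertex, half_size, image_size):
    """Square region centered on a vertex, clipped to the image bounds."""
    x, y = vertex
    width, height = image_size
    return Rect(max(0, x - half_size), max(0, y - half_size),
                min(width, x + half_size), min(height, y + half_size))


def collection_regions(prev_vertices, image_size, half_220=40, half_221=80):
    """Return (areas 220, areas 221), one pair per previous vertex C1..C4.

    Assumption: areas 221 are placed at the same coordinates as the
    previous vertices, only larger, so each one contains its area 220.
    """
    areas_220 = [region_around(v, half_220, image_size) for v in prev_vertices]
    areas_221 = [region_around(v, half_221, image_size) for v in prev_vertices]
    return areas_220, areas_221
```

Because half_221 is larger than half_220 and both regions are clipped to the same bounds, every area 221 contains its area 220 by construction.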

The captured image information rectification processing unit 22 then extracts local features in each feature collection area 220 set for the (N-1)th captured image information and stores the resulting feature points together with their coordinates. Likewise, it extracts local features in each feature collection area 221 set for the Nth captured image information and stores the resulting feature points together with their coordinates.

For each local feature of each feature point in a feature collection area 220 of the (N-1)th captured image information, the captured image information rectification processing unit 22 identifies the feature point in the feature collection area 221 of the Nth captured image information whose local feature is closest, treats the two as a pair of feature points, and associates their coordinates. FIG. 33 shows the pairing between the (N-1)th feature collection area 220 and the Nth feature collection area 221.
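The nearest-feature pairing can be sketched as follows. The sketch assumes that the local features are fixed-length numeric descriptor vectors compared by Euclidean distance; the description does not name a particular feature type or distance measure, and the function name match_nearest is hypothetical.

```python
import numpy as np


def match_nearest(desc_prev, pts_prev, desc_curr, pts_curr):
    """Pair each previous feature point with the closest current one.

    desc_*: (n, d) descriptor arrays; pts_*: (n, 2) coordinate arrays.
    Returns (A, B): coordinates in area 220 and their partners in area 221.
    """
    desc_prev = np.asarray(desc_prev, dtype=float)
    desc_curr = np.asarray(desc_curr, dtype=float)
    # Pairwise Euclidean distances, shape (n_prev, n_curr).
    diff = desc_prev[:, None, :] - desc_curr[None, :, :]
    dist = np.linalg.norm(diff, axis=2)
    nearest = dist.argmin(axis=1)  # index of the closest current descriptor
    points_a = np.asarray(pts_prev, dtype=float)
    points_b = np.asarray(pts_curr, dtype=float)[nearest]
    return points_a, points_b
```

The brute-force distance matrix is adequate for the small areas involved here; a practical system might also discard ambiguous pairs, which this sketch does not do.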

Let A be the set of feature points obtained from local features in the (N-1)th feature collection area 220, B be the set of feature points obtained from local features in the Nth feature collection area 221, and C (C1 to C4) be the vertices used in the (N-1)th keystone correction processing. Based on the coordinates of point set A and point set B, the captured image information rectification processing unit 22 obtains a function F (an affine transformation) that maps point set A onto point set B.

The captured image information rectification processing unit 22 then maps the coordinates of the vertices C (C1 to C4) used in the (N-1)th keystone correction processing with the obtained function F and takes the results as the coordinates of the vertices D (D1 to D4) for the Nth keystone correction processing.
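One possible way to obtain F and a vertex D for one pair of areas is a least-squares affine fit, sketched below. The least-squares fit is an assumption; the description states only that F is an affine transformation obtained from the coordinates of point sets A and B. The names fit_affine and apply_affine are hypothetical.

```python
import numpy as np


def fit_affine(points_a, points_b):
    """Least-squares 2x3 affine matrix M with M @ [x, y, 1] ~= [x', y']."""
    a = np.asarray(points_a, dtype=float)
    b = np.asarray(points_b, dtype=float)
    if len(a) < 3:
        raise ValueError("at least three point pairs are needed")
    design = np.hstack([a, np.ones((len(a), 1))])          # (n, 3)
    solution, *_ = np.linalg.lstsq(design, b, rcond=None)  # (3, 2)
    return solution.T                                      # (2, 3)


def apply_affine(matrix, points):
    """Map (n, 2) points through the 2x3 affine matrix."""
    pts = np.asarray(points, dtype=float)
    return pts @ matrix[:, :2].T + matrix[:, 2]


# Usage for one area pair: A and B are the paired coordinates, c is the
# previous vertex inside area 220, and d is the estimated current vertex.
#   f = fit_affine(A, B)
#   d = apply_affine(f, [c])[0]
```

A plain least-squares fit is sensitive to wrong pairs, so a robust estimator could be substituted without changing the rest of the procedure.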

Performing the above processing for each pair of feature collection areas 220 and 221 automatically identifies the four vertices D (D1 to D4) for the Nth keystone correction processing. The coordinates of the identified Nth vertices D (D1 to D4) are stored in association with the shooting date and time, the store identification information, and the image information identification information of the captured image information. FIG. 38 shows the identified vertices D1 to D4.

Once the vertices D (D1 to D4) of the shelf position for keystone correction of the Nth captured image information have been identified as described above, the captured image information rectification processing unit 22 executes keystone correction processing on the Nth captured image information based on the vertices D (D1 to D4).
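This passage does not state how the keystone correction itself is computed. A common approach, assumed here purely for illustration, is to map the quadrilateral D1 to D4 onto an upright rectangle with a projective transformation. The vertex order (clockwise from the upper left), the nearest-neighbor sampling, and all function names are assumptions of this sketch.

```python
import numpy as np


def homography_from_quad(src, dst):
    """3x3 projective matrix H mapping four src points onto four dst points."""
    rows, rhs = [], []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        rows.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        rhs.extend([u, v])
    h = np.linalg.solve(np.array(rows, dtype=float), np.array(rhs, dtype=float))
    return np.append(h, 1.0).reshape(3, 3)


def map_points(h, points):
    """Apply H to (n, 2) points, including the perspective divide."""
    pts = np.hstack([np.asarray(points, dtype=float), np.ones((len(points), 1))])
    mapped = pts @ h.T
    return mapped[:, :2] / mapped[:, 2:3]


def keystone_correct(image, vertices_d, out_width, out_height):
    """Warp the quadrilateral D1..D4 (clockwise from top-left) to a rectangle."""
    target = [(0, 0), (out_width - 1, 0),
              (out_width - 1, out_height - 1), (0, out_height - 1)]
    # Inverse mapping: for each output pixel, look up its source pixel.
    h_inv = homography_from_quad(target, vertices_d)
    ys, xs = np.mgrid[0:out_height, 0:out_width]
    src = map_points(h_inv, np.column_stack([xs.ravel(), ys.ravel()]))
    sx = np.clip(np.rint(src[:, 0]).astype(int), 0, image.shape[1] - 1)
    sy = np.clip(np.rint(src[:, 1]).astype(int), 0, image.shape[0] - 1)
    return image[sy, sx].reshape(out_height, out_width, *image.shape[2:])
```

Sampling by inverse mapping guarantees that every output pixel receives a value; interpolation could replace the nearest-neighbor lookup where image quality matters.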

With the above processing, the corresponding vertices can be identified automatically for the second and subsequent keystone correction processing without anyone having to specify the four vertices used in the keystone correction, which reduces the burden on the person in charge.

A modification of the face identification processing in the face identification processing unit 292 of Embodiment 2 is described next. In this embodiment, the face identification processing of Embodiment 2 may be used as the initial processing, and processing that identifies faces automatically may be used for the second and subsequent face identification. The processing in this case is described below.

Here, "initial" covers not only the very first run but also any run in which the processing of Embodiment 2 is performed at an arbitrary time to correct drift in the automatic identification. "Second and subsequent" means any run other than an initial one.

The face identification processing unit 292 executes the same processing as in Embodiment 2 as the initial face identification processing. For the second and subsequent face identification processing, the face identification processing unit 292 extracts, for the same shelf tier of the same display shelf, the coordinates of the face areas identified in the previous ((N-1)th) rectified image information and uses them as the coordinates of the face areas in the current (Nth) rectified image information.

As with the coordinates of the shelf tier positions, the coordinates of a face area are relative coordinates in the rectified image information, measured from a predetermined point in the display shelf (for example, the upper-left vertex C1 of the display shelf).

As a further modification of Embodiment 2, in the processing by which the product identification information identification processing unit 293 identifies, for each shelf tier of the display shelf, the product identification information of the product shown in each face, the processing of Embodiment 2 is used as the initial product identification processing, and the following processing is executed as the second and subsequent product identification processing.

To identify the product identification information of a face in the Nth rectified image information, the product identification information identification processing unit 293 first identifies the face identification information of the face to be processed. Let this face identification information be X. The unit then compares the image information of the area of face identification information X in the Nth rectified image information with the image information of the area at the corresponding position in the (N-1)th rectified image information. For the similarity determination, it is preferable, though not required, to compute the EMD (earth mover's distance) between the color histograms and use it as the measure of similarity. If the similarity is at or above a certain threshold, the product identification information corresponding to the face in that area of the (N-1)th rectified image information is extracted from the product identification information storage unit 31 and used as the product identification information of face identification information X in the Nth rectified image information. This identifies the product identification information of face identification information X in the Nth rectified image information being processed. If the similarity is below the threshold, then, as in the initial case, the image information of the area of face identification information X in the Nth rectified image information is compared with the sample image information stored in the sample image information storage unit 30, and the product identification information whose similarity is at or above a predetermined threshold and is the highest is identified as the product identification information of the face with face identification information X in the Nth rectified image information.
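The carry-over decision with its fallback to the sample images can be sketched as follows. This is a simplified illustration under stated assumptions: the EMD is computed per color channel on 1-D histograms and averaged, rather than on a joint color histogram; the distance is converted into a similarity in [0, 1] so that "at or above a threshold" applies; and the bin count, both threshold values, and all names are hypothetical.

```python
import numpy as np

BINS = 16  # assumed number of histogram bins per channel


def channel_histograms(image):
    """Normalized per-channel histograms of an (H, W, 3) uint8 image."""
    hists = []
    for c in range(3):
        hist, _ = np.histogram(image[..., c], bins=BINS, range=(0, 256))
        hists.append(hist / hist.sum())
    return np.array(hists)


def emd_1d(p, q):
    """Earth mover's distance between 1-D histograms with unit bin spacing."""
    return np.abs(np.cumsum(p) - np.cumsum(q)).sum()


def similarity(img_a, img_b):
    """Similarity in [0, 1]; 1 means identical per-channel histograms."""
    ha, hb = channel_histograms(img_a), channel_histograms(img_b)
    emd = np.mean([emd_1d(a, b) for a, b in zip(ha, hb)])
    return 1.0 - emd / (BINS - 1)  # BINS - 1 is the largest possible 1-D EMD


def identify_product(face_now, face_prev, prev_product_id, samples,
                     carry_threshold=0.9, sample_threshold=0.8):
    """Reuse the previous product id if the face looks unchanged; otherwise
    fall back to the best-matching sample image, or None if none qualifies."""
    if similarity(face_now, face_prev) >= carry_threshold:
        return prev_product_id
    best_id, best_score = None, sample_threshold
    for product_id, sample in samples.items():
        score = similarity(face_now, sample)
        if score >= best_score:
            best_id, best_score = product_id, score
    return best_id
```

Comparing against a single previous face is much cheaper than searching every sample image, which is the practical reason for trying the carry-over first.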

In the comparison with the image information of face areas in the (N-1)th rectified image information, the comparison targets may include not only the face at the corresponding position but also faces within a predetermined range. For example, when comparing against the image information of the area of face identification information X in the Nth rectified image information, the comparison targets may include, in addition to the area of face identification information X in the (N-1)th rectified image information, face areas within a predetermined range of that area, such as faces one or more positions away to the left or right and faces on the shelf tiers above and below. Furthermore, in addition to the area of face identification information X in the (N-1)th rectified image information, the areas of several adjacent faces, such as face identification information X-2, X-1, X, X+1, and X+2, may be included.

In this case, the image information of the area of face identification information X in the Nth rectified image information is compared with the image information of each face area within the comparison range in the (N-1)th rectified image information, and the face identification information of the most similar face in the (N-1)th rectified image information is identified. The similarity may additionally be required to be at or above a certain threshold. The product identification information corresponding to the identified face identification information of the (N-1)th rectified image information is then extracted from the product identification information storage unit 31 and used as the product identification information of face identification information X in the Nth rectified image information. FIG. 27 schematically shows this processing. FIG. 27(a) is the previous ((N-1)th) rectified image information and FIG. 27(b) is the current (Nth) rectified image information. The figure shows that the image information of each face area on shelf tier 1 of the Nth rectified image information is compared with the image information of each face area on shelf tier 1 of the (N-1)th rectified image information to determine similarity, and that the product identification information of the most similar face on shelf tier 1 of the (N-1)th rectified image information is identified as the product identification information of the face on shelf tier 1 of the Nth rectified image information. FIG. 27 shows the case where each face of the Nth rectified image information is compared with the face at the corresponding position in the (N-1)th rectified image information and with the two faces on each side of it. The comparison may also cover the image information of face positions on the shelf tiers above and below, not only on the same shelf tier. In the case of FIG. 27, for example, when identifying the product identification information of the center face position on shelf tier 2 of the Nth rectified image information, the similarity comparison may be made not only with the image information of the center face and the two faces on each side of it on shelf tier 2 of the (N-1)th rectified image information, but also with the image information of the center face and the two faces on each side of it on shelf tier 1 and on shelf tier 3 of the (N-1)th rectified image information.
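The neighborhood search can be sketched as follows. The sketch assumes that shelf tiers are numbered from 1 as in FIG. 27, that faces on a tier are indexed from 0, and that a similarity function is supplied by the caller; the reach of two faces on each side follows the figure, while the data layout and all names are hypothetical.

```python
def candidate_faces(shelf, index, faces_per_shelf, reach=2,
                    include_adjacent_shelves=True):
    """(shelf, index) positions in the previous image to compare against.

    faces_per_shelf maps a shelf tier number to its number of faces.
    """
    shelves = [shelf]
    if include_adjacent_shelves:
        shelves += [shelf - 1, shelf + 1]
    candidates = []
    for s in shelves:
        if s not in faces_per_shelf:
            continue  # no tier above the top or below the bottom
        for i in range(index - reach, index + reach + 1):
            if 0 <= i < faces_per_shelf[s]:
                candidates.append((s, i))
    return candidates


def best_previous_match(face_now, prev_faces, candidates, similarity, threshold):
    """Product id of the most similar candidate face, or None if no
    candidate reaches the threshold.

    prev_faces maps (shelf, index) to (face image, product id).
    """
    best_id, best_score = None, threshold
    for key in candidates:
        image, product_id = prev_faces[key]
        score = similarity(face_now, image)
        if score >= best_score:
            best_id, best_score = product_id, score
    return best_id
```

A None result corresponds to the case where the product identification information cannot be determined from the previous image and the sample images must be consulted instead.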

If the comparison with the face image information of the (N-1)th rectified image information does not yield the product identification information, for example because the similarity does not satisfy the threshold, the image information of the area of face identification information X in the Nth rectified image information is compared with the sample image information stored in the sample image information storage unit 30, and the product identification information whose similarity is at or above a predetermined threshold and is the highest is identified as the product identification information of the face with face identification information X in the Nth rectified image information. The similarity determination in this case can be performed in the same way as in the initial product identification processing.

In Embodiments 1 to 5 of the image recognition system 1 of the present invention, accuracy can be improved by applying preprocessing such as rectification processing to the image information handled by each processing unit, and each of the embodiments above was described on that basis. However, the image information to be processed does not necessarily have to be rectified; in that case, each processing unit processes image information on which rectification processing has not been executed. For example, the position identification processing unit 23, the product tag arrangement area cutout processing unit 24, the product tag arrangement area rectification processing unit 25, the product tag identification processing unit 26, the in-product-tag information identification processing unit 27, the product tag recognition processing unit 28, the displayed product recognition processing unit 29, the shelf tier area cutout processing unit 291, the face identification processing unit 292, the product identification information identification processing unit 293, and the shelf tier image matching processing unit 294 may process image information on which rectification processing has not been executed. In this case, even without rectification processing, it suffices to execute the processing on image information showing a display shelf on which products are displayed. Image information showing a display shelf on which products are displayed includes both image information that has been rectified and image information that has not.

By using the image recognition system 1 of the present invention, products displayed on a display shelf can be identified with high accuracy.

1: Image recognition system
2: Management terminal
4: Captured image information input terminal
20: Captured image information input reception processing unit
21: Captured image information storage unit
22: Captured image information rectification processing unit
23: Position identification processing unit
24: Product tag arrangement area cutout processing unit
25: Product tag arrangement area rectification processing unit
26: Product tag identification processing unit
27: In-product-tag information identification processing unit
28: Product tag recognition processing unit
29: Displayed product recognition processing unit
30: Sample image information storage unit
31: Product identification information storage unit
32: Consistency determination processing unit
33: Recognition result collation processing unit
220: Feature collection area in the (N-1)th image information
221: Feature collection area in the Nth image information
291: Shelf tier area cutout processing unit
292: Face identification processing unit
293: Product identification information identification processing unit
294: Shelf tier image matching processing unit
70: Arithmetic unit
71: Storage device
72: Display device
73: Input device
74: Communication device

Claims

An image recognition system comprising:
a first rectification processing unit that performs first rectification processing on first image information showing a display shelf on which products are displayed and generates second image information;
a second rectification processing unit that performs second rectification processing on an area of the second image information that includes a product tag arrangement area;
a product tag identification processing unit that identifies a product tag area from the image information that has undergone the second rectification processing; and
an in-product-tag information identification processing unit that identifies information written on a product tag by performing OCR recognition processing in the identified product tag area.

The image recognition system according to claim 1, wherein the in-product-tag information identification processing unit:
identifies boxes by binarizing the identified product tag area and performing labeling processing;
identifies as blocks those boxes whose height, width, and baseline satisfy predetermined conditions; and
executes the OCR recognition processing on the identified blocks.

An image recognition program that causes a computer to function as:
a first rectification processing unit that performs first rectification processing on first image information showing a display shelf on which products are displayed and generates second image information;
a second rectification processing unit that performs second rectification processing on an area of the second image information that includes a product tag arrangement area;
a product tag identification processing unit that identifies a product tag area from the image information that has undergone the second rectification processing; and
an in-product-tag information identification processing unit that identifies information written on a product tag by performing OCR recognition processing in the identified product tag area.