JP6548306B2

JP6548306B2 - Image analysis apparatus, program and method for tracking a person appearing in a captured image of a camera

Info

Publication number: JP6548306B2
Application number: JP2016031624A
Authority: JP
Inventors: 小林 達也; 加藤 晴久
Original assignee: KDDI Corp
Current assignee: KDDI Corp
Priority date: 2016-02-23
Filing date: 2016-02-23
Publication date: 2019-07-24
Anticipated expiration: 2036-02-23
Also published as: JP2017151582A

Description

The present invention relates to an image analysis technique for tracking a person appearing in an image captured by a camera.

Conventionally, there are techniques that analyze an image captured by a camera and detect and track a person appearing in that image. Such techniques are used, for example, to survey how crowded a predetermined imaging range is, to analyze flow lines, and to detect suspicious persons for surveillance. For example, there is a technique that detects and tracks a person in the video of a monocular camera by matching against registered person area candidates (see, for example, Patent Document 1). When the whole body of a person appears clearly in the captured image, the person area can be detected relatively easily.

In actual use, however, it is difficult to detect a person when an occlusion in which several persons overlap one another (inter-person occlusion) occurs in the captured image. In that case, tracking of a person may be interrupted, or the flow lines of the persons being tracked may be swapped. With the technique described in Patent Document 1, inter-person occlusion degrades tracking accuracy.

In response, there is a technique that estimates a person area three-dimensionally from the foreground images of images captured by a plurality of cameras, by a visual volume intersection method using parallax (see, for example, Patent Document 2).
There is also a technique that prevents tracking failures and swapped flow lines by integrating person flow lines tracked individually by a plurality of cameras (see, for example, Patent Document 3).
Furthermore, there is a technique that resolves inter-person occlusion without using a plurality of cameras, by using continuous tracking results to divide a foreground image in which occlusion is occurring into individual person areas (see, for example, Patent Document 4).
Furthermore, there is a technique that divides an occlusion area into regions on the basis of luminance changes, thereby separating it into individual person areas and counting the number of persons accurately (see, for example, Patent Document 5). This technique assumes a camera installed on a ceiling and detects only the head regions of persons from the foreground image extracted from the captured image. Non-person noise regions are thereby excluded, and person areas can be detected with high accuracy.

Patent Document 1: JP 2010-257441 A
Patent Document 2: JP 2014-164525 A
Patent Document 3: JP 2010-063001 A
Patent Document 4: JP 2013-206262 A
Patent Document 5: JP 2014-229068 A

With the techniques described in Patent Documents 2 and 3, a person cannot be tracked robustly against inter-person occlusion in any part of the imaging range that is not covered by overlapping views from a plurality of cameras. In addition, the wider the imaging range, the higher the cost of installing enough cameras to provide overlapping coverage.

The techniques described in Patent Documents 4 and 5 can resolve inter-person occlusion with a single camera. However, because they detect multiple persons by dividing the foreground image into regions, the region division fails when complete occlusion occurs, that is, when one person is entirely hidden by another.

FIG. 1 is an image showing the occurrence of inter-person occlusion.

In FIG. 1, a panoramic camera is installed on the ceiling of a room and captures the entire floor. Four persons can be detected from the foreground image of frame t-4 obtained from the captured image. In the foreground images of frames t-3 to t+1, however, inter-person occlusion occurs as the persons move. In the foreground image of frame t, complete inter-person occlusion occurs, and it is extremely difficult to detect the person areas.

FIG. 2 is an explanatory diagram showing a tracking failure caused by complete inter-person occlusion.

In FIG. 2, persons A, B, and C can be detected in the foreground image of frame t-1. In the foreground image of frame t, however, person C is completely lost. It can be inferred that person C has been completely hidden by another person, A or B, through inter-person occlusion. It cannot be determined, however, whether person C was hidden by person A or by person B.

FIG. 3 is an explanatory diagram showing the swapping of persons caused by partial inter-person occlusion.

In FIG. 3, matching against person area candidates in the foreground image of frame t makes it possible to recognize that inter-person occlusion is occurring between person B and person C. During tracking, however, person B and person C may be recognized with their identities swapped. Person tracking thus has the problem that, after inter-person occlusion occurs, the identifiers of the persons involved may be exchanged.

An object of the present invention is therefore to provide an image analysis apparatus, program, and method that can continue tracking robustly against inter-person occlusion among a plurality of persons moving at different speeds, even with images captured by a single camera.

According to the present invention, there is provided an image analysis apparatus for tracking a person in images captured continuously by a camera, the apparatus comprising:
foreground image extraction means for extracting, from the continuously captured images, a plurality of foreground images i (i = 1 to n, n > 1) for a frame at the same time by background subtraction using a plurality of background images updated at different frequencies;
person area detection means for detecting person areas from all of the foreground images, from the foreground image i = 1 with the highest update frequency to the foreground image i = n with the lowest update frequency;
person area tracking means for connecting, with a flow line over the passage of frames, the foreground image i with the highest update frequency in which the person area appears; and
person area identification means for, when occlusion occurs among a plurality of person areas, masking the foreground image i of a frame at the same time with the person areas already detected in the preceding foreground images (< i), which have higher update frequencies than the foreground image i, and identifying each person by the difference in flow line between person areas.

According to another embodiment of the image analysis apparatus of the present invention,
it is also preferable that the person area tracking means excludes the flow line of a person area when that person area is not detected in the foreground image i = n with the lowest update frequency.

According to another embodiment of the image analysis apparatus of the present invention,
it is also preferable that the person area tracking means starts tracking the flow line of a person area when that person area first appears in the foreground image i = n with the lowest update frequency.

According to another embodiment of the image analysis apparatus of the present invention,
it is also preferable that the person area identification means identifies the persons involved in an occlusion on the basis that:
when a stationary person is hidden behind a moving person, the flow line of the person area of the moving person differs from the flow line of the person area of the stationary person; and
in the frame at the next time, the flow line of the person area of the stationary person, whose occlusion by the moving person has been resolved, appears in the foreground image i = 1 with the highest update frequency.

According to another embodiment of the image analysis apparatus of the present invention,
it is also preferable that the person area identification means identifies the persons involved in an occlusion on the basis that:
when a moving person is hidden behind a stationary person, the flow line of the person area of the moving person coincides with the flow line of the person area of the stationary person; and
in the frame at the next time, the flow line of the person area of the moving person, whose occlusion by the stationary person has been resolved, appears in the foreground image i = 1 with the highest update frequency.

According to another embodiment of the image analysis apparatus of the present invention,
it is also preferable that the person area detection means detects person areas for the frame at the next time by matching, in order from the foreground image i = 1 with the highest update frequency to the foreground image i = n with the lowest update frequency, using the image features of the person area that appears in the foreground image connected by the flow line in the frame at the preceding time.

According to another embodiment of the image analysis apparatus of the present invention,
it is also preferable that the foreground image extraction means varies the number of stages n of the foreground image i = n with the lowest update frequency according to:
the number of flow lines detected by the person area tracking means (the number of persons being tracked);
the maximum number of persons in an occlusion area detected by the person area identification means; or
the maximum area of an occlusion area detected by the person area identification means.

According to another embodiment of the image analysis apparatus of the present invention,
it is also preferable that the person area detection means excludes from a foreground image i, as an afterimage area, any portion other than the person areas that appear in the foreground image i = n.

According to the present invention, there is provided an image analysis program that causes a computer mounted on an apparatus for identifying a person in images captured continuously by a camera to function as:
foreground image extraction means for extracting, from the continuously captured images, a plurality of foreground images i (i = 1 to n, n > 1) for a frame at the same time by background subtraction using a plurality of background images updated at different frequencies;
person area detection means for detecting person areas from all of the foreground images, from the foreground image i = 1 with the highest update frequency to the foreground image i = n with the lowest update frequency;
person area tracking means for connecting, with a flow line over the passage of frames, the foreground image i with the highest update frequency in which the person area appears; and
person area identification means for, when occlusion occurs among a plurality of person areas, masking the foreground image i of a frame at the same time with the person areas already detected in the preceding foreground images (< i), which have higher update frequencies than the foreground image i, and identifying each person by the difference in flow line between person areas.

According to the present invention, there is provided an image analysis method for an apparatus that identifies a person in images captured continuously by a camera, in which the apparatus executes:
a first step of extracting, from the continuously captured images, a plurality of foreground images i (i = 1 to n, n > 1) for a frame at the same time by background subtraction using a plurality of background images updated at different frequencies;
a second step of detecting person areas from all of the foreground images, from the foreground image i = 1 with the highest update frequency to the foreground image i = n with the lowest update frequency;
a third step of connecting, with a flow line over the passage of frames, the foreground image i with the highest update frequency in which the person area appears; and
a fourth step of, when occlusion occurs among a plurality of person areas, masking the foreground image i of a frame at the same time with the person areas already detected in the preceding foreground images (< i), which have higher update frequencies than the foreground image i, and identifying each person by the difference in flow line between person areas.

According to the image analysis apparatus, program, and method of the present invention, tracking can be continued robustly against inter-person occlusion among a plurality of persons moving at different speeds, even with images captured by a single camera. Specifically, because each person is recognized in one of a plurality of foreground images according to that person's movement speed or stationary state, the individual persons in an occlusion area can be detected with high accuracy, provided the persons are not moving together as a group at the same time.

FIG. 1 is an image showing the occurrence of inter-person occlusion.
FIG. 2 is an explanatory diagram showing a tracking failure caused by complete inter-person occlusion.
FIG. 3 is an explanatory diagram showing the swapping of persons caused by partial inter-person occlusion.
FIG. 4 is a functional block diagram of the image analysis apparatus of the present invention.
FIG. 5 is an explanatory diagram showing partial inter-person occlusion.
FIG. 6 is an explanatory diagram showing complete inter-person occlusion.
FIG. 7 is a first explanatory diagram showing person tracking according to the present invention.
FIG. 8 is a second explanatory diagram showing person tracking according to the present invention.

Embodiments of the present invention are described in detail below with reference to the drawings.

FIG. 4 is a functional block diagram of the image analysis apparatus of the present invention.

The image analysis apparatus 1 detects persons in images captured by a single camera. The captured images may be recorded in advance, or may be input from outside in time series through an interface (for example, live video). The interface may be a communication interface connected to a network or an input interface from the camera.

The camera is not limited in any way and may be an existing one. For tracking persons in a room, it is preferably a panoramic camera that captures a wide angle of view in a single frame. Specifically, it may be an omnidirectional camera that can capture the entire floor from the ceiling of the room.

The internal parameters A and external parameters W of the camera are obtained in advance by calibration and are assumed, as a rule, not to change during capture. However, by preparing several sets of parameters in advance or by using a known dynamic calibration technique, the invention can also be applied to dynamic changes of the internal parameters (such as pan, tilt, and zoom) and to a moving camera.

The image analysis apparatus of the present invention can continue robust tracking against inter-person occlusion (including complete occlusion) between persons moving at different speeds, even with images captured by a single camera. The image analysis apparatus 1 includes a foreground image extraction unit 11, a person area detection unit 12, a person area tracking unit 13, and a person area identification unit 14. This series of functional units is executed for each frame as time passes, with results fed back between units.
These functional units are realized by executing a program that causes a computer mounted on the image analysis apparatus to function. The flow of processing performed by these functional units can also be understood as an image analysis method of the apparatus.

[Foreground image extraction unit 11]
The foreground image extraction unit 11 extracts foreground images from the continuously captured images by background subtraction using a plurality of background models updated at different frequencies. Known techniques such as background subtraction or inter-frame differencing may be used, as may the technique described in Patent Document 5 mentioned above. Among background subtraction methods, when a mixture of Gaussians (MoG) is used, for example, the "learning rate" and the "sampling rate" correspond to the parameters that adjust how quickly a stationary foreground object comes to be regarded as background.

"Background subtraction" is a technique that compares the image captured at the current time with an image captured at a past time and extracts the objects that do not appear in the past image. The image captured at the past time is called the background image. The region occupied by objects that are absent from the background image is called the "foreground region", and the rest is called the "background region". Specifically, it is also preferable to create a background image in advance from several past captured images and to extract the foreground image as the difference from it. Stationary objects are excluded from the foreground image, and only moving objects such as persons appear in it. A foreground image is generally represented as an image in which pixels estimated to be foreground have a luminance value of 255 and pixels estimated to be background have a luminance value of 0. Concrete foreground images are shown in the lower row of FIG. 1 described above.
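The definition above can be sketched in a few lines. This is a minimal illustration rather than the patented implementation: images are plain nested lists of grayscale values, and the function name `background_subtraction` and the threshold of 30 are assumptions made for the example.

```python
def background_subtraction(frame, background, threshold=30):
    """Return a binary foreground image.

    A pixel is set to 255 (foreground) when it differs from the background
    image by more than `threshold`, and to 0 (background) otherwise.
    The threshold value is an illustrative assumption.
    """
    return [
        [255 if abs(p - b) > threshold else 0 for p, b in zip(frow, brow)]
        for frow, brow in zip(frame, background)
    ]


# A 2x3 scene: the middle column changes because an object has entered.
background = [[10, 10, 10], [10, 10, 10]]
frame = [[10, 200, 10], [10, 210, 12]]
foreground = background_subtraction(frame, background)
```

Small differences such as 12 versus 10 stay below the threshold, so sensor noise does not turn into foreground.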

The foreground image extraction unit 11 extracts, from the continuously captured images, a plurality of foreground images i (i = 1 to n, n > 1) for a frame at the same time by background subtraction using a plurality of background images updated at different frequencies. Here, i is the index of the foreground image.
The foreground image i = 1 is extracted using the background image that is updated most frequently (at the extreme, every frame).
Similarly, the foreground image i = n is extracted using the background image that is updated least frequently (at the extreme, never updated during execution).

In the prior art, one foreground image is generated per captured image, whereas in the present invention a plurality of foreground images are generated.
The foreground image i = 1 is expected to contain only the person areas of moving persons (a stationary person becomes part of the background image and is not detected). The foreground image i = 1 takes the frame difference (d = d_1) from the image d_1 frames earlier (d_1 > 0). For example, d_1 = 1.
The foreground image i = 2 takes the frame difference (d = d_2) from the image d_2 frames earlier (d_2 > d_1). For example, d_2 = 2.
...
The foreground image i = n is expected to contain the person areas of all persons, both stationary and moving. The foreground image n takes the frame difference (d = d_n, d_n > d_{n-1}) from a past background image (d_n frames earlier) in which no person is present. Alternatively, when the background model is not updated, d_n = ts may be used (where ts is the time elapsed since the initial frame), or the difference may be taken from a separately input background image (preferably obtained by capturing the unoccupied scene in advance).
For example, by adjusting the update frequency of the background models or the frame-difference intervals d_1, d_2, ..., d_n [frames], the range of movement times of the persons extracted into each foreground image can be adjusted. The intervals d_1, d_2, ..., d_n can also be adjusted dynamically during execution.
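The multi-interval frame differencing described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the patent's implementation: the class name `MultiIntervalForeground`, the threshold, and the one-row test frames are hypothetical, and while fewer than d frames have been seen the oldest stored frame is used as the reference, which mirrors the d_n = ts case.

```python
from collections import deque


def frame_difference(frame, reference, threshold=30):
    """Binary difference image: 255 where the two frames differ, else 0."""
    return [
        [255 if abs(p - r) > threshold else 0 for p, r in zip(frow, rrow)]
        for frow, rrow in zip(frame, reference)
    ]


class MultiIntervalForeground:
    """Produce n foreground images per frame, one per interval d_1 < ... < d_n."""

    def __init__(self, intervals, threshold=30):
        if intervals[0] < 1 or any(a >= b for a, b in zip(intervals, intervals[1:])):
            raise ValueError("intervals must satisfy 0 < d_1 < d_2 < ... < d_n")
        self.intervals = list(intervals)
        self.threshold = threshold
        # Only the last d_n frames are ever needed as references.
        self.history = deque(maxlen=self.intervals[-1])

    def extract(self, frame):
        foregrounds = []
        for d in self.intervals:
            if not self.history:
                reference = frame              # first frame: nothing to compare
            elif len(self.history) >= d:
                reference = self.history[-d]   # the frame d frames earlier
            else:
                reference = self.history[0]    # oldest frame available so far
            foregrounds.append(frame_difference(frame, reference, self.threshold))
        self.history.append(frame)
        return foregrounds


# One-row scene: a walker moves right from x=0, another person stands at x=4.
extractor = MultiIntervalForeground(intervals=[1, 3])
frames = [
    [[0, 0, 0, 0, 0, 0]],        # empty room
    [[100, 0, 0, 0, 100, 0]],    # both persons enter
    [[0, 100, 0, 0, 100, 0]],
    [[0, 0, 100, 0, 100, 0]],
]
for frame in frames:
    fg_fast, fg_slow = extractor.extract(frame)
```

After the last frame, `fg_fast` (d_1 = 1) contains the walker at x=2 plus an afterimage at x=1 and no trace of the standing person, while `fg_slow` (d_2 = 3, referenced to the empty room) contains both persons.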

In the present invention, the number of foreground images (the number of background model types) n is at least 2, but is otherwise arbitrary.
However, the processing load increases with the number of foreground images, which matters when real-time operation is required. The number of foreground images n is therefore preferably kept to the necessary minimum. Naturally, when only one person appears in the captured image, no occlusion occurs, so during that period it is preferable to process with n = 1.

<Embodiment in which the number of stages n of the foreground image i = n with the lowest update frequency is varied>
It is also preferable that the foreground image extraction unit 11 varies the number of stages n of the foreground image i = n with the lowest update frequency according to the following three embodiments. This balances processing load against robustness.
(Stage control 1) The number of stages n is varied according to the number of flow lines (the number of persons being tracked) fed back from the person area tracking unit 13. For example, the fewer persons are being tracked, the fewer stages are used.
(Stage control 2) The number of stages n is varied according to the maximum number of persons in an occlusion area fed back from the person area identification unit 14. For example, the smaller that maximum, the fewer stages are used. The number of stages n may, for example, be set equal to the number of persons being tracked.
(Stage control 3) The number of stages n is varied according to the maximum area of an occlusion area fed back from the person area identification unit 14. For example, the smaller that maximum area, the fewer stages are used.
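As a rough illustration of stage control 1, the sketch below picks n from the number of persons being tracked. The function name `choose_num_stages` and the upper bound `n_max` are assumptions introduced for the example; the source only states that fewer tracked persons should lead to fewer stages, that n may equal the number of tracked persons, and that n = 1 suffices while a single person is present.

```python
def choose_num_stages(tracked_people, n_max=4):
    """Pick the number of foreground images n from the tracked-person count.

    n_max is a hypothetical cap standing in for the processing budget.
    """
    if tracked_people <= 1:
        return 1  # no inter-person occlusion is possible with one person
    # Otherwise use one stage per tracked person, within [2, n_max].
    return max(2, min(tracked_people, n_max))


stages = [choose_num_stages(k) for k in (0, 1, 2, 3, 10)]
```

Stage controls 2 and 3 could follow the same pattern with the occlusion head count or occlusion area as the input, using thresholds chosen for the scene.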

The foreground image extraction unit 11 then outputs the extracted foreground images i = 1 to n to the person area detection unit 12.

[Person area detection unit 12]
The person area detection unit 12 detects person areas by matching in all of the foreground images, from the foreground image i = 1 with the highest update frequency to the foreground image i = n with the lowest update frequency.

The person area detection unit 12 may, for example, store in advance, as "person area candidates", contour images that a person is expected to occupy in the captured image. The person area candidates may be created in advance by projecting a three-dimensional model of a person placed in three-dimensional space into the captured image using the camera parameters. The unit then searches the foreground image for the person area with the highest similarity by matching against the outer-edge region of each person area candidate. The technique described in Patent Document 5, for example, may of course be used instead. The similarity may be based on image features of the person area, movement time, movement amount, or a combination of these.

For a frame at the same time, the person area detection unit 12 masks the foreground image i with the person areas already detected in the preceding foreground images (< i), which have higher update frequencies than the foreground image i. Concretely, masking sets the luminance value to 0. The foreground image in which person areas are detected therefore no longer contains the person areas detected in the preceding foreground images (< i). This prevents the same person area from being detected twice.
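The masking cascade can be sketched as follows. This is an illustration under assumptions, not the patent's detector: the matching against person area candidates is replaced by a stand-in that groups foreground pixels into 4-connected regions, and the names `connected_regions` and `detect_cascade` are hypothetical.

```python
def connected_regions(image):
    """Stand-in detector: group non-zero pixels into 4-connected regions.

    Each region is returned as a set of (y, x) pixel coordinates.
    """
    h, w = len(image), len(image[0])
    seen, regions = set(), []
    for y in range(h):
        for x in range(w):
            if image[y][x] == 0 or (y, x) in seen:
                continue
            stack, region = [(y, x)], set()
            seen.add((y, x))
            while stack:
                cy, cx = stack.pop()
                region.add((cy, cx))
                for ny, nx in ((cy + 1, cx), (cy - 1, cx), (cy, cx + 1), (cy, cx - 1)):
                    if (0 <= ny < h and 0 <= nx < w
                            and image[ny][nx] != 0 and (ny, nx) not in seen):
                        seen.add((ny, nx))
                        stack.append((ny, nx))
            regions.append(region)
    return regions


def detect_cascade(foregrounds):
    """Detect regions from i = 1 (highest update frequency) to i = n.

    Before detection in foreground image i, every region already found in
    a foreground image with a smaller index is masked to luminance 0.
    Returns a list of (i, region) pairs.
    """
    detected = []
    for i, foreground in enumerate(foregrounds, start=1):
        masked = [row[:] for row in foreground]
        for _, region in detected:
            for y, x in region:
                masked[y][x] = 0
        detected.extend((i, region) for region in connected_regions(masked))
    return detected


# Partial occlusion: a moving person covers x=1..2, a standing person x=2..3.
fg1 = [[0, 255, 255, 0, 0, 0]]     # i = 1: only the moving person
fg2 = [[0, 255, 255, 255, 0, 0]]   # i = 2: both persons merged into one blob
result = detect_cascade([fg1, fg2])
```

Without the mask, the merged blob in `fg2` would be reported as a single region; with it, the visible remainder of the standing person is detected separately in i = 2.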

FIG. 5 is an explanatory diagram showing partial inter-person occlusion.
FIG. 6 is an explanatory diagram showing complete inter-person occlusion.

In FIGS. 5 and 6, person areas are detected in order from the foreground image with the highest update frequency, and each detected person area is masked out of the foreground images with lower update frequencies. In FIG. 5, which shows partial inter-person occlusion, part of the person area of person B still appears in the foreground image i = n = 2 after masking, so person B can be detected. In FIG. 6, which shows complete inter-person occlusion, masking removes person B entirely from the foreground image i = n = 2. Even in this state, the person area tracking unit 13 and the person identification unit 14 described later distinguish persons B and C and make it possible to track them.

＜Embodiment excluding afterimage areas＞
When a moving person appears in the captured image, the more frequently a foreground image is updated, the more it shows both a real-image area, where the person actually is, and an afterimage area, where no person actually is. An afterimage area is detected as foreground even though no person is there, because a person contained in the background image has since moved.

When an afterimage area appears in a foreground image, the accuracy of person identification may deteriorate. It is therefore also preferable that the person area detection unit 12 exclude, as afterimage areas, the parts of the foreground image i that lie outside the person areas appearing in the foreground image i = n. Excluding afterimage areas as far as possible improves the accuracy of person identification.
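As a minimal sketch of this exclusion, assuming each foreground image is represented as a set of foreground pixel coordinates (a representation chosen for the example, not specified in the patent), the pixels kept in the foreground image i are those that also belong to the foreground image i = n:

```python
from typing import Set, Tuple

Pixel = Tuple[int, int]  # (row, column)

def exclude_afterimage(foreground_i: Set[Pixel], foreground_n: Set[Pixel]) -> Set[Pixel]:
    """Keep only the pixels of foreground image i that also appear in the
    lowest-update-frequency foreground image i = n; the rest is treated as
    afterimage and dropped."""
    return foreground_i & foreground_n

def afterimage_pixels(foreground_i: Set[Pixel], foreground_n: Set[Pixel]) -> Set[Pixel]:
    """Pixels of foreground image i that are regarded as afterimage."""
    return foreground_i - foreground_n
```

For a person who has moved two pixels to the right, the pixels at the old position appear only in the frequently updated image and are therefore removed.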

Note that because the background image may itself contain the person, the person area in each foreground image i can be partially missing (through the influence of the person's own image in the background). Therefore, when a person is determined to be present, that foreground area may be replaced with the foreground area in the foreground image i = n that contains it.

＜Embodiment of matching person areas against a foreground image＞
Person area candidates may be matched against the foreground image by associating the pair with the smallest three-dimensional distance, or on the basis of the overlap rate of the areas given by the following equation.
$$S_{12} = \frac{A_1 \cap A_2}{A_1 \cup A_2}$$
$S_{12}$: degree of coincidence (overlap rate) between areas $A_1$ and $A_2$
$A_1 \cap A_2$: area of the region where $A_1$ and $A_2$ overlap
$A_1 \cup A_2$: area of the region covered by $A_1$ and $A_2$ together
That is, matching means searching, within the set of person areas extracted from the foreground image, for the person area candidate with the highest overlap rate.
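The overlap rate above can be computed directly when areas are represented as sets of pixel coordinates. The sketch below assumes that representation; the names `overlap_rate` and `best_candidate` are illustrative and not taken from the patent.

```python
from typing import List, Set, Tuple

Pixel = Tuple[int, int]  # (row, column)

def overlap_rate(a1: Set[Pixel], a2: Set[Pixel]) -> float:
    """S12 = area(A1 ∩ A2) / area(A1 ∪ A2), with areas counted in pixels."""
    union = a1 | a2
    if not union:
        return 0.0  # two empty areas: define the rate as 0 to avoid dividing by zero
    return len(a1 & a2) / len(union)

def best_candidate(region: Set[Pixel], candidates: List[Set[Pixel]]) -> int:
    """Index of the candidate with the highest overlap rate.
    Assumes `candidates` is not empty."""
    return max(range(len(candidates)), key=lambda k: overlap_rate(region, candidates[k]))
```

Two areas of two pixels each that share one pixel cover three pixels in total, so their overlap rate is 1/3.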

The present invention does not prescribe a specific matching method, but a common greedy method may be used.
First, within the sets of person areas, the pair of person areas with the highest overlap rate is associated. The selected pair is then removed from the sets, and the pair with the highest overlap rate among the remaining person areas is associated. This is repeated until no pair has an overlap rate exceeding a predetermined threshold, or until every person area in the set for one of the frames has been selected. Person areas can thereby be tracked (associated) between frames.
Besides the greedy method, there are also methods, such as the Hungarian method, that match so as to maximize the sum of the overlap rates of all selected pairs, and methods that match so as to maximize the number of selected pairs.
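The greedy procedure can be sketched as below. The input format (a matrix of precomputed overlap rates between the person areas of two frames) and the function name are assumptions made for this example. Sorting all pairs once and skipping areas that are already used gives the same result as repeatedly picking the best remaining pair.

```python
from typing import List, Set, Tuple

def greedy_match(scores: List[List[float]], threshold: float) -> List[Tuple[int, int]]:
    """Greedily pair person areas of two frames.

    scores[a][b] is the overlap rate between area a of the previous frame
    and area b of the current frame. Only pairs whose rate exceeds
    `threshold` are considered.
    """
    ranked = sorted(
        (
            (score, a, b)
            for a, row in enumerate(scores)
            for b, score in enumerate(row)
            if score > threshold
        ),
        reverse=True,  # highest overlap rate first
    )
    used_a: Set[int] = set()
    used_b: Set[int] = set()
    pairs: List[Tuple[int, int]] = []
    for _score, a, b in ranked:
        if a in used_a or b in used_b:
            continue  # one side of this pair has already been matched
        pairs.append((a, b))
        used_a.add(a)
        used_b.add(b)
    return pairs
```

For the scores `[[0.9, 0.8], [0.85, 0.1]]` with a threshold of 0.3, the greedy method pairs area 0 with area 0 and leaves area 1 unmatched, whereas a method that maximizes the total overlap rate would choose the pairs (0, 1) and (1, 0), whose sum of 1.65 exceeds 1.0. This illustrates the difference between the greedy method and the alternatives mentioned above.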

＜Embodiment of matching person areas using the foreground image from the previous frame＞
Using the image feature amount of the person area in the foreground image to which the flow line was connected in the frame t-1 at the previous time, the person area detection unit 12 detects the person area in the frame t at the next time by matching, in order from the foreground image i = 1 with the highest update frequency to the foreground image i = n with the lowest update frequency. Here, the image feature amount may be the outer edge shape or the area of the person area.

＜Occlusion determination＞
For example, in FIG. 6, the person area of person B is not detected by matching in the foreground image i = n = 2. In this case, occlusion is determined by considering the possibility that the person has been completely hidden by another person. Specifically, the determination uses the person area of person B that appeared in a higher-order foreground image, for example with the following equation.
$$C_{12} = \frac{A_1}{A_1 \cup A_2}$$
$C_{12}$: inclusion rate of areas $A_1$ and $A_2$
$A_1$: area of region $A_1$ (for example, the person area of person B appearing in a higher-order foreground image)
$A_1 \cup A_2$: area of the region covered by $A_1$ and $A_2$ together (the occlusion area appearing in the foreground image n)
For example, in FIG. 6, the person area of the already matched person C contains the person area of person B, so person B is determined to be completely hidden by person C.

＜Matching method using a cost＞
When the region in which complete occlusion occurs consists of $n_p$ persons, one possible matching is written as follows.
$$M_i = [m_1, \dots, m_j, \dots, m_{n_p}] \quad (i = 1, \dots, N_M)$$
Here, the matching that maximizes the total similarity (for example, the total product) can be selected by minimizing the cost given by the following equation.
$$\mathrm{cost} = \sum_{j=1}^{n_p} D(m_j)$$
$D(m_j)$ represents the distance of one matched pair. The Euclidean distance between image features, the absolute value of the difference in moving time, the norm of the difference in moving amount, and the like can be used. Known techniques such as HOG features and color histograms can be used to extract the image features of a person.
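One way to read the cost minimization is as an exhaustive search over all candidate matchings. The sketch below follows that reading: it assumes that each person and each candidate is described by a feature vector and that D is the Euclidean distance, one of the options named above. The function name and the feature values in the example are hypothetical.

```python
import math
from itertools import permutations
from typing import List, Sequence, Tuple

Feature = Sequence[float]

def best_matching(
    tracked: List[Feature],
    candidates: List[Feature],
) -> Tuple[List[int], float]:
    """Return (matching, cost) with the smallest total distance.

    matching[j] is the index of the candidate assigned to tracked person j.
    Every assignment of distinct candidates to the tracked persons is
    enumerated, so this is only practical for a small number of persons.
    If there are fewer candidates than persons, ([], inf) is returned.
    """
    best: List[int] = []
    best_cost = math.inf
    for assignment in permutations(range(len(candidates)), len(tracked)):
        # cost = sum over j of D(m_j), with D as the Euclidean distance
        cost = sum(
            math.dist(tracked[j], candidates[k]) for j, k in enumerate(assignment)
        )
        if cost < best_cost:
            best, best_cost = list(assignment), cost
    return best, best_cost
```

For larger groups, an assignment solver such as the Hungarian method mentioned earlier would avoid enumerating every permutation.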

Here, the image features of the hidden person at the rear cannot be extracted accurately because of the occluding person in front. The person area detection unit 12 can therefore achieve highly accurate matching by using only the image features of the occluding person in front.
Alternatively, an image taken before the occlusion occurred may be stored as a template, and for the hidden person at the rear, the similarity between the image features of the template and those of the person area candidate may be evaluated.

[Person area tracking unit 13]
The person area tracking unit 13 starts tracking the flow line of a person area when that person area first appears in the foreground image i = n with the lowest update frequency. A person area in the foreground image i = n that was not matched in the foreground images with higher update frequencies (that is, one that remains) is likely to be a person who has newly entered the captured image. Such areas are treated as new persons, and tracking of their flow lines starts. To improve accuracy, a separate, highly reliable person determination process may also be executed, and only areas determined to be persons may be detected as person areas.

Then, after a person area has appeared in the foreground image i = n with the lowest update frequency, the person area tracking unit 13 connects with a flow line, as frames pass, the foreground images i with the highest update frequency in which that person area appears. The flow line of a person represents the tracking of the same person from the previous frame to the current frame. For the person area of each person, the person area tracking unit 13 stores which foreground image i the flow line was connected to in the previous frame.

When a person area is no longer detected in the foreground image i = n with the lowest update frequency, the person area tracking unit 13 removes the flow line of that person area, because the person no longer appears in the captured image.
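The three rules of the tracking unit (start, connect, remove) can be sketched as a single per-frame update. The data layout is an assumption for this example: `seen` maps a person label, assumed to come from the matching step, to the set of foreground image indices in which that person's area appears in the current frame, and each flow line is stored as the list of foreground image indices it passed through.

```python
from typing import Dict, List, Set

def update_flow_lines(
    tracks: Dict[str, List[int]],
    seen: Dict[str, Set[int]],
    n: int,
) -> Dict[str, List[int]]:
    """Return the flow lines after processing one frame.

    - A person first appearing in foreground image n starts a flow line at n.
    - A tracked person is connected to the most frequently updated
      foreground image (smallest index) in which the person area appears.
    - A person who does not appear in foreground image n is dropped.
    """
    updated: Dict[str, List[int]] = {}
    for person, images in seen.items():
        if n not in images:
            continue  # not in the lowest-update-frequency image: no flow line
        if person in tracks:
            updated[person] = tracks[person] + [min(images)]
        else:
            updated[person] = [n]
    return updated
```

Feeding in the appearances described for frames t-5 to t-2 of FIGS. 7 and 8 reproduces the flow lines given in the text, for example 3, 1, 2, 3 for person A.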

[Person area identification unit 14]
When occlusion occurs among a plurality of person areas, the person area identification unit 14 identifies the persons by the difference between the flow lines of the person areas. That is, even if inter-person occlusion occurs, the flow line of each person is identified as long as the flow lines differ.

The person area identification unit 14 identifies persons at the time of occlusion according to the following two conditions.
(Condition 1) A stationary person is hidden behind a moving person.
The flow line of the person area of the moving person differs from the flow line of the person area of the stationary person. In this case, in the frame at the next time, the flow line of the person area of the stationary person, who is no longer hidden by the moving person, appears in the foreground image i = 1 with the highest update frequency.
(Condition 2) A moving person is hidden behind a stationary person.
The flow line of the person area of the moving person coincides with the flow line of the person area of the stationary person. In this case, in the frame at the next time, the flow line of the person area of the moving person, who is no longer hidden by the stationary person, appears in the foreground image i = 1 with the highest update frequency.

According to the prior art, when a complete occlusion ended, the person identifiers assigned to the persons could be swapped. In contrast, according to the present invention, persons can be clearly identified and robustly tracked according to whether a person area after the occlusion appears in the foreground image i = 1 with the highest update frequency or in the foreground image i = n with the lowest update frequency.

FIG. 7 is a first explanatory diagram showing person tracking according to the present invention. FIG. 7 shows Condition 1 for the frame t, in which the stationary person A is hidden behind the moving person C.

FIG. 8 is a second explanatory diagram showing person tracking according to the present invention. It shows Condition 2 for the frame t, in which the moving person C is hidden behind the stationary person A.

Even when complete inter-person occlusion occurs, the person area identification unit 14 can make the following determinations.
According to FIG. 7, when a stationary person is hidden behind a moving person, the flow line of the person area of the moving person differs from the flow line of the person area of the stationary person.
According to FIG. 8, when a moving person is hidden behind a stationary person, the flow line of the person area of the moving person coincides with the flow line of the person area of the stationary person.

FIGS. 7 and 8 show an example with n = 3 foreground images in which three persons A, B, and C move through the captured image.
In FIGS. 7 and 8, the person tracking is identical for frames t-5 through t-1.

[Frame t-5] Only person A appears in the captured image.
(Foreground image i = 1) The person area of person A and, near it, an afterimage area of person A appear. The foreground area corresponding to the afterimage of person A can be identified as an afterimage area because it does not appear in i = 3, so it is not regarded as a person area and is excluded. In the description below as well, afterimage areas are excluded without being regarded as person areas.
(Foreground image i = 2) Only the person area of person A appears.
(Foreground image i = 3) Only the person area of person A appears. Because the person is detected for the first time in the foreground image i = 3 with the lowest update frequency, tracking of the flow line of person A starts from here.

[Frame t-4] Only person A, moving from left to right, appears in the captured image.
(Foreground image i = 1) The person area of person A and, near it, an afterimage area appear.
(Foreground image i = 2) The person area of person A and, a little further away, an afterimage area appear. Here, the person area appearing in the foreground image i = 1 is masked out of the foreground image i = 2.
(Foreground image i = 3) Only the person area of person A appears, but the person area appearing in the foreground image i = 1 is masked out of the foreground image i = 3. At this point, the flow line is connected to the foreground image i = 1, the one with the highest update frequency in which the person area appears. For person A, the flow line is connected from the foreground image i = 3 at t-5 to the foreground image i = 1 at t-4.

[Frame t-3] Person A stops, and person B newly appears in the captured image.
(Foreground image i = 1) The person area of the new person B is detected, but the person area of person A cannot be detected because, being stationary, it coincides with the background image.
(Foreground image i = 2) The person areas of persons A and B are detected, and an afterimage area is also detected a little further away. Because the foreground image i = 2 is updated less frequently than the foreground image i = 1, the stationary person A can still be detected here.
(Foreground image i = 3) The person area of person A and the person area of person B appear. The person area of person A appearing in the foreground image i = 2 is then masked out of the foreground image i = 3.
For person A, the flow line is connected from the foreground image i = 1 at t-4 to the foreground image i = 2 at t-3.
For person B, the person area appears for the first time in the foreground image i = 3 with the lowest update frequency, so tracking of the flow line of person B starts from here.

[Frame t-2] Person A remains stationary, person B moves from left to right, and person C newly appears in the captured image.
(Foreground image i = 1) The person areas of persons B and C are detected, but the person area of person A cannot be detected because, being stationary, it coincides with the background image. An afterimage area of person B appears immediately to the left of the person area of person B.
(Foreground image i = 2) The person areas of persons B and C are detected. The person area of person A cannot be detected because, being stationary, it coincides with the background image. The person area of person B appearing in the foreground image i = 1 is then masked out of the foreground image i = 2.
(Foreground image i = 3) The person areas of persons A, B, and C appear. The person area of person B appearing in the foreground image i = 1 is then masked out of the foreground image i = 3.
Because the foreground image i = 3 is updated less frequently than the foreground image i = 2, the person area of the stationary person A can still be detected.
For person A, the flow line is connected from the foreground image i = 2 at t-3 to the foreground image i = 3 at t-2.
For person B, the flow line is connected from the foreground image i = 3 at t-3 to the foreground image i = 1 at t-2.
For person C, the person area appears for the first time in the foreground image i = 3 with the lowest update frequency, so tracking of the flow line of person C starts from here.

[Frame t-1] Persons A and B are stationary, and person C has moved to the lower right.
(Foreground image i = 1) The person area of person C can be detected, but the person areas of persons A and B cannot be detected because, being stationary, they coincide with the background image. An afterimage area of person C appears to the upper left of person C.
(Foreground image i = 2) The person areas of persons B and C are detected. The person area of person A cannot be detected because, being stationary, it coincides with the background image. An afterimage area of person B appears immediately to the left of the person area of person B. The person area of person C appearing in the foreground image i = 1 is then masked out of the foreground image i = 2.
(Foreground image i = 3) The person areas of persons A, B, and C appear. The person areas of persons B and C appearing in the foreground images i = 1 and 2 are then masked out of the foreground image i = 3.
Because the foreground image i = 3 is updated less frequently than the foreground image i = 2, the person area of the stationary person A can still be detected.
For person A, the flow line is connected from the foreground image i = 3 at t-2 to the foreground image i = 3 at t-1.
For person B, the flow line is connected from the foreground image i = 1 at t-2 to the foreground image i = 2 at t-1.
For person C, the flow line is connected from the foreground image i = 3 at t-2 to the foreground image i = 1 at t-1.
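The flow lines stated for frames t-5 through t-1 can be collected into a small lookup table, which makes it easier to see how each person drifts toward the foreground image i = 3 once stationary. The values below are taken from the frame descriptions above; the data structure and helper names are illustrative assumptions.

```python
from typing import Dict, List, Optional

FRAMES: List[str] = ["t-5", "t-4", "t-3", "t-2", "t-1"]

# Foreground image index i to which each person's flow line is connected.
FLOW_LINES: Dict[str, Dict[str, int]] = {
    "A": {"t-5": 3, "t-4": 1, "t-3": 2, "t-2": 3, "t-1": 3},
    "B": {"t-3": 3, "t-2": 1, "t-1": 2},
    "C": {"t-2": 3, "t-1": 1},
}

def image_at(person: str, frame: str) -> Optional[int]:
    """Foreground image index of the person's flow line, or None if untracked."""
    return FLOW_LINES[person].get(frame)

def format_table() -> str:
    """Render the flow lines as a plain-text table, '-' meaning not yet tracked."""
    lines = ["person " + " ".join(f"{frame:>4}" for frame in FRAMES)]
    for person, line in FLOW_LINES.items():
        cells = " ".join(f"{line.get(frame, '-'):>4}" for frame in FRAMES)
        lines.append(f"{person:<6} {cells}")
    return "\n".join(lines)

if __name__ == "__main__":
    print(format_table())
```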

The faster a person moves, the more the person area appears in the foreground images in order from the highest update frequency, and the more stationary the person is, the more the person area appears only in the foreground images with low update frequency. With masking as in the present invention, the faster a person moves, the more the person area appears only in the foreground images with high update frequency.

＜FIG. 7: A stationary person is hidden behind a moving person＞
[Frame t] The stationary person A is hidden behind the moving person C.
(Foreground image i = 1) The person area of person C can be detected, but the person areas of persons A and B cannot be detected because, being stationary, they coincide with the background image. An afterimage area of person C appears immediately to the left of the person area of person C.
(Foreground image i = 2) The person area of person C is detected. The person areas of persons A and B cannot be detected because, being stationary, they coincide with the background image. An afterimage area of person C appears a little away from the person area of person C. The person area of person C appearing in the foreground image i = 1 is then masked out of the foreground image i = 2.
(Foreground image i = 3) The person areas of persons A, B, and C appear, but persons A and C are in complete occlusion. Here, the person area of person C appearing in the foreground image i = 1 is masked out of the foreground image i = 3.
For person A, because complete occlusion with person C has been recognized, the flow line is connected from the foreground image i = 3 at t-1 to the occlusion portion of the foreground image i = 3 at t.
For person B, the flow line is connected from the foreground image i = 2 at t-1 to the foreground image i = 3 at t.
For person C, the flow line is connected from the foreground image i = 1 at t-1 to the foreground image i = 1 at t.

[Frame t+1] Person C has moved to the right.
(Foreground image i = 1) The person area of person C is detected, the occlusion ends, and the person area of the stationary person A, which had been hidden, newly appears. That is, at time t+1 after the occlusion has ended, both person A and person C, who were involved in the occlusion, appear in the foreground image i = 1.
(Foreground image i = 2) The person area of person C is detected. The person areas of persons A and B cannot be detected because, being stationary, they coincide with the background image. An afterimage area of person C appears a little away from the person area of person C. The person area of person C appearing in the foreground image i = 1 is then masked out of the foreground image i = 2.
(Foreground image i = 3) The person areas of persons A, B, and C appear. Here, the person areas of persons A and C appearing in the foreground image i = 1 are masked out of the foreground image i = 3.
For person A, the flow line is connected from the foreground image i = 3 at t to the foreground image i = 1 at t+1.
For person B, the flow line is connected from the foreground image i = 3 at t to the foreground image i = 3 at t+1.
For person C, the flow line is connected from the foreground image i = 1 at t to the foreground image i = 1 at t+1.
In this way, persons A and C, who were in complete occlusion, can be identified and tracked from frame t and frame t+1.

[Frame t+2] Person C has moved out of the captured image. Persons A and B remain stationary.
(Foreground image i = 1) The person area of person C is not detected. The person areas of persons A and B cannot be detected because, being stationary, they coincide with the background image. Only an afterimage area of person C appears.
(Foreground image i = 2) Only the person area of person A is detected. The person area of person B cannot be detected because, being stationary, it coincides with the background image.
(Foreground image i = 3) The person areas of persons A and B appear. The person area of person A appearing in the foreground image i = 2 is then masked out of the foreground image i = 3.
For person A, the flow line is connected from the foreground image i = 1 at t+1 to the foreground image i = 2 at t+2.
For person B, the flow line is connected from the foreground image i = 3 at t+1 to the foreground image i = 3 at t+2.
Person C does not appear even in the foreground image i = 3 at t+2, so tracking of person C ends.

[Frame t+3] Persons A and B remain stationary.
(Foreground image i = 1) The person areas of persons A and B cannot be detected because, being stationary, they coincide with the background image.
(Foreground image i = 2) The person areas of persons A and B cannot be detected because, being stationary, they coincide with the background image. Only an afterimage area of person C appears.
(Foreground image i = 3) The person areas of persons A and B appear.
For person A, the flow line is connected from the foreground image i = 2 at t+2 to the foreground image i = 3 at t+3.
For person B, the flow line is connected from the foreground image i = 3 at t+2 to the foreground image i = 3 at t+3.

＜FIG. 8: A moving person is hidden behind a stationary person＞
[Frame t] The moving person C is hidden behind the stationary person A.
(Foreground image i = 1) The person area of person C cannot be detected. The person areas of persons A and B cannot be detected because, being stationary, they coincide with the background image. Only an afterimage area of person C appears.
(Foreground image i = 2) The person area of person C cannot be detected. The person areas of persons A and B cannot be detected because, being stationary, they coincide with the background image. Only an afterimage area of person C appears.
(Foreground image i = 3) The person areas of persons A, B, and C appear, but persons A and C are in complete occlusion.
For person A, the flow line is connected from the foreground image i = 3 at t-1 to the foreground image i = 3 at t.
For person B, the flow line is connected from the foreground image i = 2 at t-1 to the foreground image i = 3 at t.
For person C, because complete occlusion with person A has been recognized, the flow line is connected from the foreground image i = 3 at t-1 to the occlusion portion of the foreground image i = 3 at t.

[Frame t + 1] Suppose that person C moves to the right.
(Foreground image i = 1) The occlusion is resolved, and the person area of person C, who had been occluded, is detected. The person areas of persons A and B cannot be detected, because these persons are stationary and match the background image. That is, at time t + 1, after the occlusion is resolved, only person C, who had been hidden behind the occluder, appears in the foreground image i = 1.
(Foreground image i = 2) The person areas of persons A and B cannot be detected, because these persons are stationary and match the background image. The afterimage area of person C appears slightly apart from the person area of person C. The person area of person C that appears in the foreground image i = 1 is then image-wise masked from the foreground image i = 2.
(Foreground image i = 3) The person areas of persons A, B, and C appear. Here, the person area of person C that appears in the foreground image i = 1 is image-wise masked from the foreground image i = 3.
For person A, a flow line is connected from the foreground image i = 3 at t to the foreground image i = 3 at t + 1.
For person B, a flow line is connected from the foreground image i = 3 at t to the foreground image i = 3 at t + 1.
For person C, a flow line is connected from the foreground image i = 3 at t to the foreground image i = 1 at t + 1.
In this way, persons A and C, who were in complete occlusion, can be distinguished and tracked from frame t and frame t + 1.
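
The masking at frame t + 1 can be sketched as follows. This is a simplified illustration, assuming that each person area is a set of pixel coordinates and that areas carry person labels purely so the result is easy to read. The function names `mask_higher_frequency_regions` and `flow_line_layer` and the pixel layout are hypothetical.

```python
from typing import Dict, Optional, Set, Tuple

Pixel = Tuple[int, int]       # (row, column)
Region = Set[Pixel]           # pixels of one detected person area
Layer = Dict[str, Region]     # label -> person area in one foreground image


def mask_higher_frequency_regions(layers: Dict[int, Layer]) -> Dict[int, Layer]:
    """For each foreground image i, remove pixels of person areas already
    detected in foreground images with a higher update frequency (< i)."""
    masked: Dict[int, Layer] = {}
    claimed: Region = set()   # pixels detected in higher-frequency images so far
    for i in sorted(layers):
        masked[i] = {}
        for person, region in layers[i].items():
            remaining = region - claimed
            if remaining:
                masked[i][person] = remaining
        for region in layers[i].values():
            claimed |= region
    return masked


def flow_line_layer(masked: Dict[int, Layer], person: str) -> Optional[int]:
    """Foreground image index to which the person's flow line is connected."""
    hits = [i for i, layer in masked.items() if person in layer]
    return min(hits) if hits else None


# Toy version of frame t + 1 in FIG. 8 (afterimage areas omitted).
a: Region = {(0, 2), (1, 2)}
b: Region = {(0, 0), (1, 0)}
c: Region = {(0, 4), (1, 4)}
frame_t1: Dict[int, Layer] = {1: {"C": c}, 2: {"C": c}, 3: {"A": a, "B": b, "C": c}}

masked = mask_higher_frequency_regions(frame_t1)
targets = {p: flow_line_layer(masked, p) for p in ("A", "B", "C")}
```

After masking, person C remains only in the foreground image i = 1, while persons A and B remain only in the foreground image i = 3, which matches the flow-line targets listed for frame t + 1.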

[Frame t + 2] Suppose that person C has moved out of the captured image. Persons A and B remain stationary.
(Foreground image i = 1) The person area of person C is not detected. The person areas of persons A and B cannot be detected either, because these persons are stationary and match the background image. Only the afterimage area of person C appears.
(Foreground image i = 2) The person areas of persons A and B cannot be detected, because these persons are stationary and match the background image.
(Foreground image i = 3) The person areas of persons A and B appear.
For person A, a flow line is connected from the foreground image i = 3 at t + 1 to the foreground image i = 3 at t + 2.
For person B, a flow line is connected from the foreground image i = 3 at t + 1 to the foreground image i = 3 at t + 2.
Person C does not appear even in the foreground image i = 3 at t + 2, so tracking of person C is ended.
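
The start and end of tracking in this sequence can be sketched as below. It is a minimal illustration, assuming that the set of persons visible in the foreground image i = n (here n = 3) is already known for each frame. The function name `update_tracks` and the label sets are hypothetical.

```python
from typing import Dict, List, Set, Tuple


def update_tracks(active: Set[str], in_lowest: Set[str]) -> Tuple[Set[str], Set[str], Set[str]]:
    """Update the set of tracked persons from foreground image i = n only.

    A track starts when a person area first appears in i = n and ends
    when the person area is no longer detected in i = n.
    Returns (new_active, started, ended)."""
    started = in_lowest - active
    ended = active - in_lowest
    return set(in_lowest), started, ended


# Persons visible in foreground image i = 3 for each frame of FIG. 8.
frames: List[Tuple[str, Set[str]]] = [
    ("t", {"A", "B", "C"}),      # C is inside the occlusion portion with A
    ("t+1", {"A", "B", "C"}),
    ("t+2", {"A", "B"}),         # C has left the captured image
    ("t+3", {"A", "B"}),
]

active: Set[str] = {"A", "B", "C"}   # all three were tracked at t - 1
log: Dict[str, Tuple[Set[str], Set[str]]] = {}
for name, people in frames:
    active, started, ended = update_tracks(active, people)
    log[name] = (started, ended)
```

With these inputs the only change in the sequence is that the track of person C ends at frame t + 2. The tracks of persons A and B continue even though they never appear in the foreground images i = 1 and i = 2.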

[Frame t + 3] Persons A and B remain stationary.
(Foreground image i = 1) The person areas of persons A and B cannot be detected, because these persons are stationary and match the background image.
(Foreground image i = 2) The person areas of persons A and B cannot be detected, because these persons are stationary and match the background image. Only the afterimage area of person C appears.
(Foreground image i = 3) The person areas of persons A and B appear.
For person A, a flow line is connected from the foreground image i = 3 at t + 2 to the foreground image i = 3 at t + 3.
For person B, a flow line is connected from the foreground image i = 3 at t + 2 to the foreground image i = 3 at t + 3.

As described above in detail, according to the image processing apparatus, program, and method of the present invention, tracking can be continued robustly against inter-person occlusion among a plurality of persons moving at different speeds, even in images captured by a single camera. Specifically, because each person is distributed across a plurality of foreground images according to moving speed and stationary state, the individual persons within an occlusion area can be detected with high accuracy, as long as the persons do not move together as a group at the same time.
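
The way persons are distributed across foreground images can be illustrated with a small sketch of background subtraction against several background images. It assumes that each background image is a running average with its own learning rate, used here as a stand-in for the update frequency. The rates in `ALPHAS`, the value of `THRESHOLD`, and the one-dimensional "image" are hypothetical values chosen only for the demonstration.

```python
from typing import List

# Learning rates for background images i = 1, 2, 3 (highest to lowest
# update frequency). Assumed values, not taken from the specification.
ALPHAS: List[float] = [0.5, 0.1, 0.001]
THRESHOLD: float = 30.0   # assumed difference threshold for foreground pixels


def step(backgrounds: List[List[float]], frame: List[float]) -> List[List[bool]]:
    """Extract one foreground mask per background image, then update each
    background image in place with its own learning rate."""
    masks: List[List[bool]] = []
    for idx, alpha in enumerate(ALPHAS):
        bg = backgrounds[idx]
        masks.append([abs(f - b) > THRESHOLD for f, b in zip(frame, bg)])
        backgrounds[idx] = [(1 - alpha) * b + alpha * f for f, b in zip(frame, bg)]
    return masks


# A person (brightness 200) enters pixel 2 of an empty scene and stands still.
empty: List[float] = [0.0] * 5
person: List[float] = [0.0, 0.0, 200.0, 0.0, 0.0]
backgrounds: List[List[float]] = [list(empty) for _ in ALPHAS]
history: List[List[List[bool]]] = [step(backgrounds, person) for _ in range(30)]

# Presence of the person at pixel 2 in foreground images i = 1, 2, 3.
first = [m[2] for m in history[0]]     # frame 0: just arrived
middle = [m[2] for m in history[5]]    # frame 5: briefly stationary
last = [m[2] for m in history[29]]     # frame 29: stationary for a long time
```

In this toy run the stationary person first disappears from the foreground image i = 1, then from i = 2, and remains only in i = 3, which is the behavior of the stationary persons A and B in the figures above.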

Those skilled in the art can readily make various changes, modifications, and omissions to the embodiments of the present invention described above within the scope of its technical idea and viewpoint. The above description is merely an example and is not intended to be limiting in any way. The present invention is limited only as defined in the claims and their equivalents.

1 image analysis apparatus, image analysis server
11 foreground image extraction unit
12 person area detection unit
13 person area tracking unit
14 person area identification unit
2 omnidirectional camera

Claims

1. An image analysis apparatus for tracking a person in continuous captured images from a camera, comprising:
foreground image extraction means for extracting, from the continuous captured images, a plurality of foreground images i (i = 1 to n, n > 1) for a frame at the same time by background subtraction using a plurality of background images updated at different frequencies;
person area detection means for detecting a person area from all foreground images, from the foreground image i = 1 with the highest update frequency to the foreground image i = n with the lowest update frequency;
person area tracking means for connecting, with a flow line following the passage of time of the frames, the foreground images i of the highest update frequency in which the person area appears; and
person area identification means for, when occlusion occurs among a plurality of person areas, image-wise masking from the foreground image i, for a frame at the same time, the person area already detected in a foreground image (< i) with a higher update frequency than the foreground image i, and identifying each person by the difference in flow line for each person area.

2. The image analysis apparatus according to claim 1, wherein the person area tracking means excludes the flow line of a person area when that person area is not detected in the foreground image i = n with the lowest update frequency.

3. The image analysis apparatus according to claim 1 or 2, wherein the person area tracking means starts tracking the flow line of a person area when that person area first appears in the foreground image i = n with the lowest update frequency.

4. The image analysis apparatus according to claim 3, wherein, when a stationary person is occluded behind a moving person, the flow line of the person area of the moving person differs from the flow line of the person area of the stationary person, and,
in the frame at the next time, the person area identification means identifies the persons at the time the occlusion occurred because the flow line of the person area of the stationary person, whose occlusion by the moving person has been resolved, appears in the foreground image i = 1 with the highest update frequency.

5. The image analysis apparatus according to claim 3 or 4, wherein, when a moving person is occluded behind a stationary person, the flow line of the person area of the moving person coincides with the flow line of the person area of the stationary person, and,
in the frame at the next time, the person area identification means identifies the persons at the time the occlusion occurred because the flow line of the person area of the moving person, whose occlusion by the stationary person has been resolved, appears in the foreground image i = 1 with the highest update frequency.

6. The image analysis apparatus according to any one of claims 1 to 5, wherein the person area detection means detects a person area in the frame at the next time by matching, using the image feature amount of the person area in the foreground image to which the flow line was connected in the frame at the previous time, in order from the foreground image i = 1 with the highest update frequency to the foreground image i = n with the lowest update frequency.

7. The image analysis apparatus according to any one of claims 1 to 6, wherein the foreground image extraction means varies the number of stages n of the foreground image i = n with the lowest update frequency according to:
the number of flow lines detected by the person area tracking means (the number of persons being tracked);
the maximum number of persons in an occlusion area detected by the person area identification means; or
the maximum area of an occlusion area detected by the person area identification means.

8. The image analysis apparatus according to any one of claims 1 to 7, wherein the person area detection means excludes, as an afterimage area, any portion of the foreground image i other than the person area appearing in the foreground image i = n.

9. An image analysis program that causes a computer mounted on an apparatus for identifying a person in continuous captured images from a camera to function as:
foreground image extraction means for extracting, from the continuous captured images, a plurality of foreground images i (i = 1 to n, n > 1) for a frame at the same time by background subtraction using a plurality of background images updated at different frequencies;
person area detection means for detecting a person area from all foreground images, from the foreground image i = 1 with the highest update frequency to the foreground image i = n with the lowest update frequency;
person area tracking means for connecting, with a flow line following the passage of time of the frames, the foreground images i of the highest update frequency in which the person area appears; and
person area identification means for, when occlusion occurs among a plurality of person areas, image-wise masking from the foreground image i, for a frame at the same time, the person area already detected in a foreground image (< i) with a higher update frequency than the foreground image i, and identifying each person by the difference in flow line for each person area.

10. An image analysis method of an apparatus for identifying a person in continuous captured images from a camera, wherein the apparatus executes:
a first step of extracting, from the continuous captured images, a plurality of foreground images i (i = 1 to n, n > 1) for a frame at the same time by background subtraction using a plurality of background images updated at different frequencies;
a second step of detecting a person area from all foreground images, from the foreground image i = 1 with the highest update frequency to the foreground image i = n with the lowest update frequency;
a third step of connecting, with a flow line following the passage of time of the frames, the foreground images i of the highest update frequency in which the person area appears; and
a fourth step of, when occlusion occurs among a plurality of person areas, image-wise masking from the foreground image i, for a frame at the same time, the person area already detected in a foreground image (< i) with a higher update frequency than the foreground image i, and identifying each person by the difference in flow line for each person area.