JP3616355B2

JP3616355B2 - Image processing method and image processing apparatus by computer

Info

Publication number: JP3616355B2
Application number: JP2001188416A
Authority: JP
Inventors: バラージュ・バーグボルジ; 信也山内; ラースロー・ツニ; タマーシュ・グレグシュ; タマーシュ・シラニイ
Original assignee: Tateyama Kagaku Kogyo Co Ltd
Current assignee: Tateyama Kagaku Kogyo Co Ltd
Priority date: 2001-06-21
Filing date: 2001-06-21
Publication date: 2005-02-02
Anticipated expiration: 2021-06-21
Also published as: JP2003006659A

Description

【０００１】
【発明の属する技術分野】
本発明は、光学的に取得された動画データ上で動体と背景とを識別する為の画像処理方法と画像処理装置に関する。
【０００２】
【従来の技術】
ビデオ連続画像は、ビデオフレーム（以下、フレームと記す。）と呼ばれる連続した静止画像から成り立っている。時間軸で前後するフレームは、互いによく似通っているものの僅かな相違点（前後差）が存在し、当該相違点の連続によって被写体の動きが表現される。
【０００３】
画像処理装置の一態様である監視システムの主目的は、監視中の空間に侵入する動体を全て検知することにあり、監視システムの根幹たる動体捕捉では、この様なビデオ連続画像のリアルタイムでの動作分析が基本となる。例えば、パノラマ画像ブロック（以下、ＰＡＬと記す。）を用いて取得した環状画像による全方向監視システムなど固定カメラを用いた監視システムでは、原則として背景自体が連続画像上で動作せず、動体として検知する対象は、風や光による外乱或いは動きを伴う機械・器具等を除けば、外部からの侵入物だけである。この様に、固定カメラ式の監視システムは、リアルタイムの動作分析におけるコスト面や設備面で有利であることに着眼し、固定カメラ式の監視システムに適した動体検出処理の開発を図ることとした。
【０００４】
従来から用いられている第一の動体検出処理として、単純にビデオ画像の連続する二つのフレームの引き算をするものが挙げられる。この方法は、視野内に存在する動体を検出するには良い方法であるが、検出対象たる動体はもとより照明ノイズ、自然光の変化、電気的ノイズなど不必要な変化をも全て捉えてしまう。また、照度の違いは検出できても色彩的に背景に埋もれた動体は検出できず、たとえ色彩的に背景に埋もれる事が無いとしても色彩的な変化を多く含む周辺部以外が検出できない。
【０００５】
その結果、例えば、図１０（イ）の如く、単一の色調の服を着た人物が画像中を左から右へ移動する場合、上記動体検出処理では、図１０（ロ）の如く、当該人物の右側だけを検出し、左側の背景の一部は検出できても残りは前のフレーム内の人物の左側で隠されてしまいフレーム毎の背景認識ができない。当該移動体検出処理は、上記不正確さにより商用の物体追跡用途には応用できないが、比較的遅いコンピュータでもフレームレートで処理できることから、単純に動きだけを検出する用途には理想的と言える。
【０００６】
フレーム毎の背景認識を可能とする第２の動体検出処理として、ある画面の背景だけを基準背景画像として何らかの方法で記録し、その後各フレームから当該基準背景画像を差し引く方法が挙げられる。この方法を用いれば一様な動体をみつけたり小さなノイズを除去するのも容易となる。ところが、差し引かれる対象の背景自体は、明るさや色彩が刻々と変化する動画であるために、時間の経過に従って新たに取得した最新画像の背景と予め取得した基本背景画像の背景との差が増大し、次第にノイズが増加するという欠点もある。
【０００７】
この問題を解消すべく、時間の経過と共に背景画像を動的に更新する方法、具体的には、連続するビデオ画像中で、各画素につき最新フレームから過去に向かう幾つかのフレームのピクセル値の平均をとって背景を造る平均背景抽出法を採用し、アナログのＶＬＳＩハードウエアを併用する事によって実用的なフレームレートで高速に且つ簡単に行える方法を試みた。
【０００８】
【発明が解決しようとする課題】
しかしながら、上記の如く動体検出処理に用いる基本背景画像としていくつかのフレームの平均値をとるとすると、平均値の性格上、各フレームは当該平均の結果として得られる基本背景画像が背景と動体との合成画像（図１２（イ））となってゴースト効果（図１２（ロ）の当該合成画像と最新フレーム画像との閾値差による画像参照）が生じるという問題が残る。
【０００９】
本発明は、この様な実情に鑑みてなされたもので、変化する環境に対して正確に適応し、動体と背景とを低コストで正確に且つ高速に識別し得るコンピュータによる画像処理方法と画像処理装置の提供を目的とする。
【００１０】
【課題を解決するための手段】
上記課題を解決するために成された本発明によるコンピュータによる画像処理方法は、光学的に取得された動画データ上で動体と背景とを識別する為の画像処理方法において、前記動画のフレームを構成する画素それぞれのピクセル値について最新フレームと直前フレーム間の前後差を求め、最新フレームを構成する各画素に対し当該前後差の絶対値に反比例した重み付けを施して得た評価値より導出した動体成分値から、所定の閾値に基づいて動体構成画素を認定する動体検出処理を演算手段により実行することを特徴とする。
【００１１】
尚、動画のフレームとは、ＣＣＤ等の撮像素子から入力した時点のフレームでも良いし、受像器のディスプレイに表示される時点のフレームでも良い。ピクセル値とは、１出力画素の値で、輝度や色彩等の情報を数値化したものである。また、重み付けとは、例えば、前記前後差の絶対値に反比例させると言う様な計算式等の一定の規則に照らしてウエイトの大小を前記評価値に反映させることを意味する。閾値に基づいて動体構成画素を認定するとは、例えば、前記ピクセル値から評価値を差し引いた動体成分値について、閾値より高い値の画素については動体の一部を構成する動体構成画素と認定し、閾値より低い値の部分については背景成分画素として判断することを意味する。
【００１２】
光の外乱による不具合を回避すべく、フレームを構成する画素毎に前記評価値の累積平均値及び制限時間内の最大変化量Ｄを最新フレームを取得する毎に導出し、最新フレームにおける各画素の評価値と前記累積平均値との差の絶対値たる累積平均格差が、前記最大変化量Ｄに基づく基準値を上回る画素を、前記動体構成画素として擬制する環境的フィルタリング処理を演算手段により実行する場合もある。尚、前記最大変化量Ｄに基づく基準値とは、当該最大変化量に対して一定の割合を増減させた値のことである。
【００１３】
前記環境的フィルタリング処理については前記最新フレームにおける各画素のピクセル値と前記累積平均値との差の絶対値が、前記最大変化量Ｄに基づく基準値を下回る場合には、例えば、基準値を下回った量に相当する分だけ最大変化量Ｄを減少させる形で更新し、逆に前記最新フレームにおける各画素のピクセル値と前記累積平均値との差の絶対値が、前記最大変化量Ｄを上回る場合には、例えば、基準値を上回った量に相当する分だけ最大変化量Ｄを増加させる形で更新する学習処理を含む場合もある。
【００１４】
又、前記動画のフレーム上において捕捉した相分離する動体間の距離が所定の近接距離未満の場合にそれら動体を相連結した一個の動体として認定すると共に、当該相連結した動体の大きさを算出し、所定の閾値を下回る大きさの動体を検出対象から除外する処理を含んだ物体的フィルタリング処理を演算手段により実行する場合もある。
【００１５】
更に、前記動画のフレーム上において捕捉した動体に形状及び大きさが最も近似した動体を、その直前のフレーム上で検索し、当該動体の移動方向が当該画像処理の目的とは無関係な方向である動体を前記特定フレームにおける検出対象から除外する処理を含む物体的フィルタリング処理を演算手段により実行する場合もある。また更に、前記動画のフレーム上において捕捉した動体に形状及び大きさが最も近似した動体を、その直前のフレーム上で検索し、当該動体の移動速度が当該画像処理の目的とは無関係な移動速度である動体を前記特定フレームにおける検出対象から除外する処理を含む物体的フィルタリング処理を演算手段により実行する場合もある。
【００１６】
上記課題を解決するために成された本発明によるコンピュータによる画像処理装置は、ＰＡＬ光学系により撮像位置周辺３６０°に亘る三次元空間を環状画像として結像させる固定カメラと、当該固定カメラから取得したビデオ画像中から、上記いずれかのコンピュータによる画像処理方法によって動体を検出する為の演算手段を含む固定画像制御装置と、当該固定画像制御装置により認識した精査すべき動体の存在方向を向く単一指向性のズーム光学系スキャンカメラと、当該スキャンカメラで取得したビデオ画像の保存を行うコンピュータシステムを含む捕捉画像制御装置を具備したことを特徴とする。
【００１７】
【発明の実施の形態】
以下、本発明によるコンピュータによる画像処理方法（以下、画像処理方法と記す。）の実施の形態を画像処理装置たる監視システムの一例を示しつつ図１のブロック図に基づき説明する。
【００１８】
当該例では、図８に示すＰＡＬ光学系４により撮像位置周辺３６０°に亘る三次元空間を環状画像５として結像させる固定カメラ（以下、プレスキャンカメラと６記す。）と、当該プレスキャンカメラ６で取得したビデオ画像を表示する固定画像受像器８（図７参照）、及び当該プレスキャンカメラ６から取得したビデオ画像中から動体を検出する為のコンピュータシステムによる演算手段１を含む固定画像制御装置１３と、当該固定画像制御装置１３により認識した精査すべき動体の存在方向を向く単一指向性ではあるが画像分解能が高いズーム光学系で当該動体を自動的に拡大撮影するＰＺＴスキャンカメラ（以下、スキャンカメラ７と記す。）と、当該スキャンカメラ７で取得したビデオ画像を表示する受像器（図示省略）、及び当該ビデオ画像の保存及び処理を行うコンピュータシステムを含む捕捉画像制御装置１４を具備して構成される（図２及び図９参照）。当該構成によってＣＣＤカメラによる分解能の制約というＰＡＬシステムの問題点が解消される。
【００１９】
ＰＡＬ光学系４は、垂直方向の視野角が約５０度から７０度程度に限定されるものの、水平視野が大きいため広範囲を撮影するに際して複数のカメラを設置したり、カメラを機械的に動かす必要がなく、例え小さなサイズのＰＡＬであっても鮮明な広角画像が得られるという利点がある。この様なＰＡＬの特性から侵入者の検知を目的とした監視システムに広く用いられているところである。尚、パノラマ画像ブロック９とは、前面中央の前遮光面ａと、前面周縁部の前透光面ｂと、後面周縁部の後遮光面ｃと、後面中央部の後透光面ｄを具備した光透過性素材より成る回転体であって、前記後遮光面ｃは、前透光面ｂからの入射光を当該回転体を通過して前遮光面ａへ集め得る反射鏡とされ、前遮光面ａは、後遮光面ｃからの反射光を当該回転体内を通過して後透光面ｄへ集め得る反射鏡とされたものである（図８参照）。
【００２０】
ＰＡＬで三次元空間を二次元平面に投影すると、距離情報はすべて失われるため、ＰＡＬ光学系４によるプレスキャンカメラ６の視野だけから物体の正確な位置を特定することはできないが、環状画像５を結んだ受光素子１０のピクセルの位置から、その平面的な入射角度だけは特定できる。即ち、前記プレスキャンカメラ６のＰＡＬ光学系４の採光部とスキャンカメラ７を出来る限り近づけて設置する事でＰＡＬ光学系４により検知した動体の方向へ正確にスキャンカメラ７を向ける事が出来る。
【００２１】
経験上、動体がスキャンカメラ７や、プレスキャンカメラ６のＰＡＬ光学系４の採光部から４ｍ以上離れている場合には、図４の如く前記スキャンカメラ７と、プレスキャンカメラ６のＰＡＬ光学系４とが４０ｃｍ程度離れていても実用上問題は無いが、動体の捕捉精度は、相互の距離が近ければ近いほどが高まるので、理想的には、図５の如くスキャンカメラ７の上に、当該スキャンカメラ７の旋回軸をＰＡＬ光学系４の光軸ｅとして配置することでスキャンカメラ７はＰＡＬ光学系４によるプレスキャンカメラ６の視野に入らず、又、環状画像５上の方位をそのままスキャンカメラ７の方位として用いることができ、前記動体検出処理２の効率を悪化させることもない。
【００２２】
上記の如く適正な配設を行った上でシステムの初期設定を行う。初期設定としては、スキャンカメラ７の原点角度とＰＡＬの０度位相との角度差（図６参照。理想的には０度）を測定して固定画像制御装置１３及び捕捉画像制御装置１４に登録する。尚、ＰＡＬ光学系に換えて通常レンズによる光学系を使用した場合にあっても（図３参照）、狭い視野角による動体の検出という点及び直角座標系の水平軸を角度換算する点が異なるだけでほぼ同様のシステム構成が可能となる。この様な構成を採った場合には、マスキング処理を行ったと同様に処理効率の向上が望めることとなる。
【００２３】
以下、上記監視システムに採用するものとして以下に例示する画像処理方法は、ビデオ画像を対象とした統計的な変化検知をおこなって変化のある部分を抽出する動体検出処理（見方を変えれば背景検出処理として見る事もできる。）２と、当該結果から照明の変化や各種ノイズによる影響を除去する環境的フィルタリング処理３と、更に、予定しない大きさや動作特性を持った対象を除去する物体的フィルタリング処理１１とで構成された画像処理方法である（図１参照）。
【００２４】
当該画像処理方法における動体検出処理２は、背景の一部に生じた外乱によるピクセル値の変化は動体の一部のピクセル値の変化に比べて頻度も大きさも小さいという経験上の事実に基づき、結果として背景上で動きの少ない部分に大きな重みを与え、フレーム毎に変化する動きの激しい部分に小さい重みを与えて動体検出処理を行うものである。
【００２５】
具体的には、先ず、前記プレスキャンカメラ６で撮影した画像の各フレームにおける各画素のピクセル値Ｐ１を、当該プレスキャンカメラ６の受光素子（ＣＣＤ等）１０からＡ／Ｄコンバータ等の物理量検出インターフェース１２（図９参照）を介して採取し、演算手段１によって当該最新フレームのピクセル値Ｐ１と直前フレームのピクセル値Ｐ０の前後差の絶対値｜Ｐ１−Ｐ０｜を画素毎に求め、最新フレームを構成する各画素について、当該前後差の絶対値｜Ｐ１−Ｐ０｜に、フレーム毎の変化の反映度を決める平均ウエイトｒを次式（α）の如く掛け合わせて動体に対するウエイトｗを得る。そして、画素毎に、そのウエイトｗを含んだ次式（β）を演算手段１により実行し、最新フレームのピクセル値Ｐ１に対して当該前後差の絶対値｜Ｐ１−Ｐ０｜に反比例した重み付けが施された評価値Ｔ１を得る。
【００２６】
ｗ＝ｒ＊｜Ｐ１−Ｐ０｜・・・・・・・・・（α）
Ｔ１＝（ｗ＊Ｔ０＋Ｐ１）／（ｗ＋１）・・（β）
ここで
ｒは、前記重み付けを行う際の比例定数（平均重み）、
Ｐ０は、直前フレームの座標（ｘ、ｙ）に存在する画素のピクセル値、
Ｐ１は、最新フレームの座標（ｘ、ｙ）に存在する画素のピクセル値、
Ｔ０は、直前フレームにおける座標（ｘ、ｙ）に存在する画素の評価値、
Ｔ１は、最新フレームにおける座標（ｘ、ｙ）に存在する画素の評価値、
ｘは、フレームの横軸、
ｙは、フレームの縦軸、
である。
【００２７】
【表１】

【００２８】
新規フレームを取得する毎に直前のフレームにおける重み付けの更新が行われる（表の上段から下段へ）上記演算を演算手段１を以て実行すれば、重み付けを受けたことにより、動きの頻繁な部分と動きの希薄な実質的背景部分との格差が高められた評価値Ｔ１が最新フレームを取得する毎に各画素について与えられる。そこで、演算手段１を以て最新フレームのピクセル値Ｐ１から当該評価値Ｔ１を減じてその絶対値を取ることにより、背景成分のほとんどが除去された前記最新フレームの動体成分値Ｅ１を画素毎に得ることができる。そして、得られた動体成分値Ｅ１を所定の閾値と比較し（表１、図１４参照）、それより大きい画素を動体構成画素１６として抽出し、その位置座標に基づいて一個のフレーム上に配置した状態を模擬した仮想フレームを演算手段１のメモリー１５内に表現し、当該仮想フレームにおける動体ブロック（それぞれの位置座標に基づく前記動体構成画素の集合により構成されたブロック）の座標計算を演算装置１により行うことによって動体のみの位置及び大きさを割り出すことができる。
【００２９】
この様に動作の頻度に応じた重み付けを行う上記方法によれば、動体が背景と同化することによって生じるゴーストの影響（図１２（イ）（ロ））を排除することができるのみならず、従来の動体検出処理では検出できなかった物体、即ち、暫くの間動かなくなった物体やゆっくり動く物体であっても、静止時間を予め設定し当該設定時間に適したフレームサンプリング時間に設定しておくことで、動作の頻度に応じた重み付けが行われ、動体を問題なく検出することができることとなる。
【００３０】
上記適正な設定時間による動体検出処理が行われ、適正な閾値が設定されると、前記動体成分Ｅ１によって抽出される一塊りの動体ブロックは一単位の物体として認識され、処理過程においても一単位の物体として処理されることとなる。そして、動体検出に係る処理時間の短縮を阻害する背景成分のほとんどが除去された前記動体成分値Ｅ１を用いることによって、処理対象が整理され処理効率と実用性が大きく高められることとなる。
【００３１】
次に、当該例に用いたフィルタリング処理のうち、我々の周囲に存在する照度の変化（局地的輝度変化）に起因したノイズを除去する環境的フィルタリング処理３について説明する。当該処理は、例えば、直前の複数フレームの背景の平均輝度と最新フレームの背景の輝度とをフレーム上の画素毎に比較し、それらの輝度差が所定の閾値を超えるか否かで当該領域で物理的に重大な変化が生じたか否かを検出するものである。一日の照度変化の遷移函数はほぼ線形であるので、前記動体検出処理２等で抽出されるなどして得られた平均的輝度に基づき顕著な変化に対する閾値を適正に定めることは可能である。
【００３２】
具体的には、ビデオ画像におけるフレーム取得開始から最新フレーム取得に至るまでの、各フレームを構成する全画素について、評価値Ｔ１に関する累積値（表２では累積評価値）、平均値（表２では平均評価値ＴＡ）、前後フレームの変化量（表２では直前変化量）、及びその間の最大変化量（表２では評価値最大変化量Ｄ）を演算手段１で導き出し、当該システムの記録手段１７に保存する。
【００３３】
そこで、演算手段１では、最新フレームの評価値Ｔ１と、当該最新フレームを含む平均評価値ＴＡとの差の絶対値（表中では平均評価値格差）が、予め設定された検知感度ｓ％を前記評価値最大変化量Ｄに割増した割増変化量ＴＳ以内であれば、当該画素は、物理的に重大な変化を含む動体の一部である蓋然性が高い画素（擬制画素）には当たらないとして、純然たる背景成分として動体検出に係る処理対象から省くと共に、もし、当該平均評価値格差が前記割増変化量ＴＳより大きい場合には、その時の変化がそれまでのフレーム取得履歴で得た評価値最大変化量Ｄよりも格段に大きいと判断して、動体の一部である蓋然性が高い擬制画素としてマークする（表２、図１５参照）。
【００３４】
【表２】

【００３５】
擬制画素としてマークされた全ての画素は、演算手段１のメモリー１５内に設定された仮想フレームに基づく動体ブロックの座標計算の際、同じフレームにおいて前記動体検出処理２により一塊りの動体ブロックとして認識された画素と共に、動体を構成する画素として取り扱われ、相分離する物体や擬制画素相互の間隔が所定の距離（設定調整可能）未満の場合に、それらの検出領域に対しぬりつぶし法や二進形態閉塞法等の連結・一体化処理を演算手段１で施すことにより、一塊りの繋がった動体ブロックとして取り扱われることとなる。
【００３６】
上記環境的フィルタリング処理３に際し、演算手段１は、もし前記評価値Ｔ１についての前記平均評価値格差（｜Ｔ１−ＴＡ｜）がそれまでの評価値最大変化量Ｄより大きい場合には、当該最大変化量Ｄを、予め設定された感度係数（当該例においては“１”）に最大変化量Ｄの増加分を乗じた値だけ増加させて更新するという学習処理を行い、もし前記平均評価値格差の絶対値が前記最大変化量Ｄよりも小さい場合には、当該最大変化量Ｄを、予め設定された感度係数に最大変化量Ｄの増加分を乗じた値だけ減少させて更新するという学習処理（当該例においては、単位時間に一定の割合（或いは量）だけ減少させるといった定期的な処理として行われている。）を行う。当該最大変化量Ｄを増加或いは減少させるスピードは、前記感度係数を増減することにより適正なものに調整することができ、当該最大変化量Ｄを増加させるか否かの判断を生じさせる感度は、前記検知感度ｓ％を適宜増減することによって調整することができる（表２及び図１５参照）。
【００３７】
上記環境的フィルタリング処理３は、動体成分値Ｅ１についても行うことが出来る。その場合、演算手段１は、もし前記動体成分値Ｅ１と平均動体成分値ＥＡの差たる平均動体成分値格差（｜Ｅ１−ＥＡ｜）がそれまでの動体成分値最大変化量Ｌより大きい場合には、当該最大変化量Ｌを、予め設定された感度係数（当該例においては“１”）に最大変化量Ｌの増加分を乗じた値だけ増加させて更新するという学習処理を行い、もし前記平均動体成分値格差（｜Ｅ１−ＥＡ｜）が前記最大変化量Ｌよりも小さい場合には、当該最大変化量Ｌを、予め設定された感度係数に最大変化量Ｌの減少分を乗じた値だけ減少させて更新するという学習処理を行う。この場合も、当該最大変化量Ｌを増加或いは減少させるスピードは、前記感度係数を増減することにより適正なものに調整することができ、当該最大変化量Ｌを増加させるか否かの判断を生じさせる感度は、前記検知感度ｓ％を適宜増減することによって調整することができる（表３及び図１６参照）。
【００３８】
【表３】

【００３９】
この様な環境フィルタリング処理３をフレームを構成する全画素に対して行い、擬制画素としてマークされなかった画素から成る領域の照度変化等をノイズであると判断して動体ブロックを構成する画素としての認識対象から除外することによって、処理内容の合理化を図ることが出来る。また、上記学習処理によってＤ又はＬが自動的に実際の状況に即した値に設定され、常に適正な基準値が与えられて正確なノイズ除去が可能となる。
【００４０】
この実施の形態では、当該物体の持つ空間的、或いは時間的特性により、更に幾つかの物体的フィルタリング処理が演算手段１によって行われる。
【００４１】
前記演算手段１のメモリー１５内に設定した仮想フレームで、動体ブロックの拾い出し１８を行い、動体ブロックの座標計算による連結・一体化処理２０を施した後、当該座標計算によって一つの物体の大きさを算出し、更に、基準以下の小さな物体は動体ブロックを構成する画素としての認識対象から除外する大きさフィルタ１９を含んだ物体的フィルタリング処理１１を当該演算手段１によって行う。当該物体的フィルタリング処理１１によって、小さな動物や遠方の物体がフレームを横切るという場合であっても、検出対象たる物体より極端に小さい物体を動体検出処理２の対象から除くことができる。
【００４２】
上記演算手段１による座標計算によって一旦動体ブロックを認識すると、その進行方向を導き運動の軌跡を予測することができるが、同動体ブロックについては進行方向の導出に間に合うようにその存在を検出して追跡しなければならない。そこで、当該例においては、直前のフレームで得られた動体成分値Ｅ０を所定の閾値と比較して（図１４参考）それより大きい画素を抽出し、一個のフレームとして配置した仮想フレームを演算手段１のメモリー１５内に設定し、ブロックマッチング（二値相関、パターン・マッチング或いはテンプレートマッチングと呼ばれる場合もある。）を行って進行方向導出の対象となる動体ブロックに最も近似した動体ブロックを、最新のフレーム上で探し出し、前後の位置座標の変化に基づく同演算手段１による座標計算を以て動体ブロックの移動ベクトルを算出し、当該画像処理の目的（監視システム等）とは無関係（無効）な方向（例えば、右が有効な方向であれば左が無効な方向）へ進む物体を検出対象から除く方向フィルタ２２を含んだ処理が当該演算手段１によって行われる。
【００４３】
上記の如く動体の移動方向が判明すると、当該動体ブロックの相対速度を導出することができる。よって、当該速度に基づき検出対象の速度として予測される速度から大きく外れる動作対象を除去する速度フィルタ２３を含んだ物体的フィルタリング処理１１を演算手段１により行うことが可能となる。尚、表１、表２、表３は、フレーム取得開始から５０フレーム分の値を関係処理毎に分けて示したものである。
【００４４】
当該画像処理においては、前記演算手段１により、上記フィルタリング処理に加えてカメラの視野に係るフレーム全体のうちの特定領域だけを検出の関心領域（ＲＯＩ）とするマスキング処理を行うことによってより効率の良い画像処理が可能となる。
【００４５】
【発明の効果】
上記の如く構成されたコンピュータによる画像処理方法と画像処理装置によれば、最新フレームを構成する各画素に対し当該前後差の絶対値に反比例した重み付けを施して得た評価値より導出した動体成分値から、所定の閾値に基づいて動体構成画素を認定する処理を演算手段を以て行うことにより、背景自体の明るさや色彩が刻々と変化する野外であっても、時間の経過に従って背景成分が評価値として更新されるので、経時的ノイズが増加するという欠点が解消されると共に、動体検出処理に用いる基本背景画像としていくつかのフレームの平均値をとるという手法で問題となっていたゴースト効果も解消される。
【００４６】
また、フレームを構成する画素毎に前記評価値の累積平均値及び制限時間内の最大変化量Ｄを最新フレームを取得する毎に導出し、最新フレームにおける各画素の評価値と前記累積平均値との差の絶対値たる累積平均格差が、前記最大変化量Ｄに基づく基準値を上回る画素を、前記動体構成画素として擬制する環境的フィルタリング処理を演算手段を以て実行することで、影や雲の動きなどの影響で光が徐々に変化する状態、或いは屋外においては揺れた木を、屋内においてはモニタ画面のちらつきやカーテンの揺れなどを侵入者と明確に区別でき、その結果として動体と背景とを正確に識別し誤報を回避することができる低コストなシステムを提供することができる（図１３参照）。
【００４７】
又、前記自動的に閾値を学習する環境的フィルタリング処理を実行することによって、常に現場でのパラメータを実際の状況に合わせた環境的フィルタリングが可能となり、更に、前記種々の物体的フィルタリング処理を実行することによってより実状に即した効果的な画像処理が可能となる。そして、これらの具体的な効果を以て、変化する環境に対して正確に適応し、動体と背景とを低コストで正確に且つ高速に識別し得るコンピュータによる画像処理方法の提供が可能となる。
【図面の簡単な説明】
【図１】本発明によるコンピュータによる画像処理方法の処理手続きの一例を示すブロック図である。
【図２】本発明によるコンピュータによる画像処理方法を採用した監視システムの一例を示す概略説明図である。
【図３】本発明によるコンピュータによる画像処理方法を採用した監視システムの一例を示す概略説明図である。
【図４】本発明によるコンピュータによる画像処理方法を採用した監視システムのスキャンカメラとプレスキャンカメラの配設例を示す説明図である。
【図５】本発明によるコンピュータによる画像処理方法を採用した監視システムのスキャンカメラとプレスキャンカメラの配設例を示す説明図である。
【図６】本発明によるコンピュータによる画像処理方法を採用した監視システムのスキャンカメラとプレスキャンカメラの初期設定の一例を示す説明図である。
【図７】本発明によるコンピュータによる画像処理方法を採用した監視システムのプレスキャンカメラから取得した画像を映す受像器の一例を示す説明図である。
【図８】パノラマ画像ブロック光学系により撮像位置周辺の三次元空間を環状画像として結像させる一例を示した説明図である。
【図９】本発明によるコンピュータによる画像処理方法を実現するハードウェア構成の一例を示すブロック図である。
【図１０】（イ）（ロ）従来のコンピュータによる画像処理方法の問題点の一例を示す最新原画像及び直前原画像との閾値差を表現した画像である。
【図１１】（イ）（ロ）従来のコンピュータによる画像処理方法である背景減算法による最新原画像及び直前原画像との閾値差を表現した画像である。
【図１２】（イ）（ロ）従来のコンピュータによる画像処理方法である背景減算法による直前の複数フレームを平均して得たゴースト効果を含む背景画像、及び当該背景画像と図１１（イ）の最新原画像との閾値差を表現した画像である。
【図１３】（イ）（ロ）（ハ）本発明によるコンピュータによる画像処理方法で採用された環境的フィルタリング処理の効果を示す最新原画像、動体の全てを検出したフィルタリング前の画像及びフィルタリング後の画像の一例を示したものである。
【図１４】本発明によるコンピュータによる画像処理方法の動体検出処理の一例を説明するグラフである。
【図１５】本発明によるコンピュータによる画像処理方法の学習処理の一例を説明するグラフである。
【図１６】本発明によるコンピュータによる画像処理方法の学習処理の一例を説明するグラフである。
【符号の説明】
１演算手段，２動体検出処理，３環境的フィルタリング処理，
４パノラマ画像ブロック光学系，５環状画像，
６プレスキャンカメラ，７スキャンカメラ，８受像器，
９パノラマ画像ブロック，１０受光素子，
１１物体的フィルタリング処理，１２物理量検出インターフェース，
１３固定画像制御装置，１４捕捉画像制御装置，１５メモリー，
１６動体構成画素，１７記録手段，１８動体ブロックの拾い出し，
１９連結・一体化処理，２０大きさフィルタ，
２１動きの方向と速度の演算，２２方向フィルタ，２３速度フィルタ，

[0001]
[Technical Field of the Invention]
The present invention relates to an image processing method and an image processing apparatus for distinguishing a moving object from the background in optically acquired video data.
[0002]
[Prior art]
A continuous video image consists of a sequence of still images called video frames (hereinafter, frames). Frames adjacent on the time axis closely resemble each other but differ slightly (the inter-frame difference), and the succession of these differences expresses the motion of the subject.
[0003]
The main purpose of a monitoring system, which is one form of image processing apparatus, is to detect every moving object that enters the monitored space, and moving object capture, the core of such a system, rests on real-time motion analysis of continuous video images of this kind. For example, in a monitoring system using a fixed camera, such as an omnidirectional monitoring system based on an annular image acquired with a panoramic image block (hereinafter, PAL), the background itself in principle does not move in the continuous image; apart from disturbances caused by wind or light and machinery or instruments that move, the only objects to be detected as moving objects are intruders from outside. Noting that a fixed-camera monitoring system is therefore advantageous in cost and equipment for real-time motion analysis, the inventors set out to develop a moving object detection process suited to fixed-camera monitoring systems.
[0004]
A first conventional moving object detection process simply subtracts two consecutive frames of a video image. This method is effective for detecting moving objects in the field of view, but it captures not only the moving object of interest but also every unwanted change, such as illumination noise, changes in natural light, and electrical noise. Moreover, although it can detect differences in illuminance, it cannot detect a moving object whose color blends into the background, and even when the object does not blend into the background, only its periphery, where color changes are concentrated, is detected.
[0005]
As a result, when a person wearing clothes of a single color moves from left to right in the image as in FIG. 10 (a), this process detects only the right side of the person, as in FIG. 10 (b). Part of the background on the left is detected, but the rest is hidden by the left side of the person in the previous frame, so the background cannot be recognized frame by frame. Because of this inaccuracy the process cannot be applied to commercial object tracking, but since it runs at frame rate even on a relatively slow computer, it is well suited to applications that only need to detect motion.
[0006]
A second moving object detection process, which makes frame-by-frame background recognition possible, records only the background of a scene as a reference background image by some means and then subtracts that reference background image from each frame. This makes it easy to find uniform moving objects and to remove small noise. However, because the background being subtracted is itself a moving image whose brightness and color change from moment to moment, the difference between the background of the newly acquired latest image and that of the previously acquired reference background image grows over time, and noise gradually increases.
[0007]
To solve this problem, a method of dynamically updating the background image over time was tried. Specifically, an average background extraction method was adopted in which, for each pixel of the continuous video image, the background is formed by averaging the pixel values of several frames going back from the latest frame, and analog VLSI hardware was used alongside it so that the method could run simply and quickly at a practical frame rate.
[0008]
[Problems to be solved by the invention]
However, if the average of several frames is taken as the reference background image for moving object detection as described above, then by the nature of averaging the resulting reference background image becomes a composite of the background and the moving object (FIG. 12 (a)), and a ghost effect arises (see FIG. 12 (b), the thresholded difference between that composite image and the latest frame). This problem remains.
[0009]
The present invention has been made in view of these circumstances, and its object is to provide a computer-based image processing method and image processing apparatus that adapt accurately to a changing environment and can distinguish moving objects from the background accurately, quickly, and at low cost.
[0010]
[Means for Solving the Problems]
The computer-based image processing method according to the present invention, made to solve the above problem, is an image processing method for distinguishing a moving object from the background in optically acquired video data, characterized in that a computing means executes a moving object detection process which obtains, for the pixel value of each pixel constituting a frame of the video, the inter-frame difference between the latest frame and the immediately preceding frame, and which identifies moving-object pixels, on the basis of a predetermined threshold, from a moving-object component value derived from an evaluation value obtained by applying to each pixel of the latest frame a weighting inversely proportional to the absolute value of that inter-frame difference.
[0011]
The frame of the video may be the frame as input from an image sensor such as a CCD, or the frame as displayed on the display of the receiver. A pixel value is the value of one output pixel, a numerical representation of information such as luminance and color. Weighting means reflecting a larger or smaller weight in the evaluation value according to a fixed rule, for example a formula that makes the weight inversely proportional to the absolute value of the inter-frame difference. Identifying moving-object pixels on the basis of a threshold means, for example, that for the moving-object component value obtained by subtracting the evaluation value from the pixel value, pixels whose value is above the threshold are identified as moving-object pixels forming part of a moving object, and pixels whose value is below the threshold are judged to be background pixels.
[0012]
To avoid problems caused by light disturbances, the computing means may also execute an environmental filtering process in which, for each pixel constituting the frame, the cumulative average of the evaluation values and the maximum change amount D within a time limit are derived each time the latest frame is acquired, and any pixel whose cumulative average deviation (the absolute value of the difference between its evaluation value in the latest frame and the cumulative average) exceeds a reference value based on the maximum change amount D is deemed to be a moving-object pixel. The reference value based on the maximum change amount D is the value obtained by increasing or decreasing the maximum change amount by a fixed proportion.
[0013]
The environmental filtering process may also include a learning process. When the absolute value of the difference between the pixel value of each pixel in the latest frame and the cumulative average falls below the reference value based on the maximum change amount D, the maximum change amount D is updated by decreasing it, for example by an amount corresponding to the shortfall below the reference value. Conversely, when that absolute value exceeds the maximum change amount D, the maximum change amount D is updated by increasing it, for example by an amount corresponding to the excess over the reference value.
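The flagging rule and the learning of D described in the two paragraphs above can be sketched per pixel as follows. This is a minimal illustration, not the patent's implementation: the function name, the parameter names `sensitivity_pct`, `gain` and `decay`, and their default values are assumptions, and the decrease of D is modeled here as a fixed fractional decay.

```python
def environmental_step(d, deviation, sensitivity_pct=20.0, gain=1.0, decay=0.01):
    """One per-pixel step of a hypothetical environmental filter.

    d          -- current maximum change amount D for this pixel
    deviation  -- |value - cumulative average| for the latest frame
    Returns (flagged, new_d).
    """
    # Reference value: D raised by a fixed proportion (assumed to be a surcharge).
    reference = d * (1.0 + sensitivity_pct / 100.0)
    # A pixel whose deviation exceeds the reference is deemed a moving-object pixel.
    flagged = deviation > reference
    if deviation > d:
        # Learning: grow D by (a multiple of) the excess over the old D.
        d = d + gain * (deviation - d)
    else:
        # Learning: shrink D slowly when the scene is quieter than D suggests.
        d = d * (1.0 - decay)
    return flagged, d
```

Called once per pixel and per frame, the function lets D follow the noise level of the scene, so the flagging threshold adapts without manual tuning.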
[0014]
In addition, the computing means may execute an object filtering process that, when the distance between separate moving objects captured in a frame of the video is less than a predetermined proximity distance, treats them as a single connected moving object, calculates the size of the connected moving object, and excludes from detection any moving object whose size falls below a predetermined threshold.
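The merge-then-size-filter idea in the paragraph above can be sketched with bounding boxes. The representation of a moving object as an inclusive box `(x0, y0, x1, y1)`, the gap measure, and all function names are assumptions made for illustration; the patent does not prescribe them.

```python
def box_gap(a, b):
    """Gap in pixels between two boxes (0 if they touch or overlap)."""
    dx = max(a[0] - b[2], b[0] - a[2], 0)
    dy = max(a[1] - b[3], b[1] - a[3], 0)
    return max(dx, dy)


def merge_close_boxes(boxes, proximity):
    """Repeatedly merge any two boxes whose gap is below `proximity`."""
    boxes = list(boxes)
    merged = True
    while merged:
        merged = False
        for i in range(len(boxes)):
            for j in range(i + 1, len(boxes)):
                if box_gap(boxes[i], boxes[j]) < proximity:
                    a, b = boxes[i], boxes[j]
                    # Replace the pair with the box that encloses both.
                    boxes[i] = (min(a[0], b[0]), min(a[1], b[1]),
                                max(a[2], b[2]), max(a[3], b[3]))
                    del boxes[j]
                    merged = True
                    break
            if merged:
                break
    return boxes


def size_filter(boxes, min_area):
    """Keep only boxes whose pixel area is at least `min_area`."""
    return [b for b in boxes
            if (b[2] - b[0] + 1) * (b[3] - b[1] + 1) >= min_area]
```

For example, `size_filter(merge_close_boxes(boxes, 3), 10)` first joins fragments of one object and only then discards objects that remain too small, which avoids throwing away a large object that happened to be detected in pieces.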
[0015]
Further, the computing means may execute an object filtering process that searches the immediately preceding frame for the moving object closest in shape and size to a moving object captured in a frame of the video, and excludes from detection in that frame any moving object whose direction of movement is unrelated to the purpose of the image processing. Likewise, the computing means may execute an object filtering process that performs the same search and excludes from detection in that frame any moving object whose speed of movement is unrelated to the purpose of the image processing.
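Once a moving object has been matched to its counterpart in the preceding frame, the direction and speed tests reduce to simple vector arithmetic. The sketch below assumes the match has already been made and that each object is summarized by its center coordinates; the function names, the dot-product test for direction, and the speed limits are all illustrative assumptions.

```python
import math


def motion_vector(prev_center, curr_center):
    """Displacement of an object's center between two consecutive frames."""
    return (curr_center[0] - prev_center[0], curr_center[1] - prev_center[1])


def passes_direction(vec, valid_direction):
    """True if the motion has a positive component along the valid direction."""
    return vec[0] * valid_direction[0] + vec[1] * valid_direction[1] > 0


def passes_speed(vec, frame_interval, min_speed, max_speed):
    """True if the speed (pixels per unit time) lies within the expected range."""
    speed = math.hypot(vec[0], vec[1]) / frame_interval
    return min_speed <= speed <= max_speed
```

An object would be kept as a detection target only if both tests pass, for instance rightward motion at a walking-pace speed when leftward motion is irrelevant to the monitoring purpose.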
[0016]
The computer-based image processing apparatus according to the present invention, made to solve the above problem, is characterized by comprising: a fixed camera that images the three-dimensional space extending 360° around the imaging position as an annular image through a PAL optical system; a fixed image control device including a computing means for detecting moving objects in the video image acquired from the fixed camera by any of the above computer-based image processing methods; a unidirectional zoom-optics scan camera that points toward a moving object identified by the fixed image control device as requiring closer inspection; and a captured image control device including a computer system that stores the video image acquired by the scan camera.
[0017]
DETAILED DESCRIPTION OF THE INVENTION
Hereinafter, an embodiment of an image processing method by a computer according to the present invention (hereinafter referred to as an image processing method) will be described based on a block diagram of FIG. 1 while showing an example of a monitoring system as an image processing apparatus.
[0018]
In this example, the system comprises: a fixed camera (hereinafter, prescan camera 6) that images the three-dimensional space extending 360° around the imaging position as an annular image 5 through the PAL optical system 4 shown in FIG. 8; a fixed image control device 13 including a fixed image receiver 8 (see FIG. 7) that displays the video image acquired by the prescan camera 6 and a computing means 1, implemented as a computer system, for detecting moving objects in the video image acquired from the prescan camera 6; a PZT scan camera (hereinafter, scan camera 7) that points toward a moving object identified by the fixed image control device 13 as requiring closer inspection and automatically captures an enlarged image of it through a zoom optical system that is unidirectional but has high image resolution; and a captured image control device 14 including a receiver (not shown) that displays the video image acquired by the scan camera 7 and a computer system that stores and processes that video image (see FIGS. 2 and 9). This configuration overcomes a weakness of PAL systems, namely the resolution limit imposed by the CCD camera.
[0019]
Although the vertical viewing angle of the PAL optical system 4 is limited to roughly 50 to 70 degrees, its horizontal field of view is large, so there is no need to install multiple cameras or to move a camera mechanically when imaging a wide area, and even a small PAL yields a clear wide-angle image. Because of these characteristics, PALs are widely used in monitoring systems intended to detect intruders. The panoramic image block 9 is a body of revolution made of light-transmitting material and having a front light-shielding surface a at the center of the front face, a front light-transmitting surface b at the periphery of the front face, a rear light-shielding surface c at the periphery of the rear face, and a rear light-transmitting surface d at the center of the rear face. The rear light-shielding surface c is a reflecting mirror that directs light entering through the front light-transmitting surface b through the body onto the front light-shielding surface a, and the front light-shielding surface a is a reflecting mirror that directs the light reflected from the rear light-shielding surface c through the body onto the rear light-transmitting surface d (see FIG. 8).
[0020]
When a PAL projects three-dimensional space onto a two-dimensional plane, all distance information is lost, so the exact position of an object cannot be determined from the field of view of the prescan camera 6 alone. However, the planar angle of incidence can be determined from the position of the pixel on the light receiving element 10 on which the annular image 5 is formed. Accordingly, by installing the light-collecting part of the PAL optical system 4 of the prescan camera 6 as close as possible to the scan camera 7, the scan camera 7 can be pointed accurately toward a moving object detected through the PAL optical system 4.
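Recovering a direction from a pixel position on the annular image amounts to taking the polar angle of the pixel about the image center. The sketch below is an illustration under stated assumptions: the image center `(cx, cy)` is known, angles are measured from the positive x axis of the image, and `offset_deg` is a hypothetical calibration offset between the image's zero direction and the scan camera's origin angle.

```python
import math


def pixel_to_azimuth(x, y, cx, cy, offset_deg=0.0):
    """Azimuth in degrees [0, 360) of pixel (x, y) about the annular image center."""
    angle = math.degrees(math.atan2(y - cy, x - cx))
    return (angle + offset_deg) % 360.0
```

The radial distance of the pixel from the center is deliberately ignored, since only the planar angle is needed to decide where the scan camera should turn.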
[0021]
Experience shows that when the moving object is 4 m or more from the scan camera 7 and from the light-collecting part of the PAL optical system 4 of the prescan camera 6, a separation of about 40 cm between the scan camera 7 and the PAL optical system 4 of the prescan camera 6, as in FIG. 4, causes no practical problem. Capture accuracy nevertheless improves as the two are brought closer together, so ideally, as in FIG. 5, the PAL optical system 4 is placed on top of the scan camera 7 with the rotation axis of the scan camera 7 coinciding with the optical axis e of the PAL optical system 4. In that arrangement the scan camera 7 does not enter the field of view of the prescan camera 6, the azimuth on the annular image 5 can be used directly as the azimuth of the scan camera 7, and the efficiency of the moving object detection process 2 is not degraded.
[0022]
After the cameras have been arranged appropriately as described above, the system is initialized. In the initial setup, the angular difference between the origin angle of the scan camera 7 and the 0-degree phase of the PAL (see FIG. 6; ideally 0 degrees) is measured and registered in the fixed image control device 13 and the captured image control device 14. Even when an ordinary lens optical system is used in place of the PAL optical system (see FIG. 3), almost the same system configuration is possible; the only differences are that moving objects are detected within a narrow viewing angle and that the horizontal axis of the rectangular coordinate system is converted to an angle. With such a configuration, an improvement in processing efficiency can be expected, just as when masking is applied.
[0023]
The image processing method described below as an example for use in the above monitoring system consists of: a moving object detection process 2, which performs statistical change detection on the video image and extracts the parts that change (seen from another angle, it can also be regarded as a background detection process); an environmental filtering process 3, which removes from that result the effects of illumination changes and various kinds of noise; and an object filtering process 11, which removes objects whose size or motion characteristics are not of interest (see FIG. 1).
[0024]
The moving object detection process 2 in this image processing method rests on the empirical fact that pixel value changes caused by disturbances in part of the background are smaller in both frequency and magnitude than pixel value changes in part of a moving object. As a result, the process gives a large weight to parts of the background with little motion and a small weight to parts with vigorous motion that change from frame to frame.
[0025]
Specifically, the pixel value P1 of each pixel in each frame captured by the prescan camera 6 is first sampled from the light receiving element (CCD or the like) 10 of the prescan camera 6 through a physical quantity detection interface 12 such as an A/D converter (see FIG. 9). The computing means 1 obtains, for each pixel, the absolute value |P1−P0| of the difference between the pixel value P1 of the latest frame and the pixel value P0 of the immediately preceding frame and, for each pixel of the latest frame, multiplies |P1−P0| by an average weight r, which determines how strongly frame-to-frame changes are reflected, as in equation (α), to obtain the weight w for the moving object. The computing means 1 then evaluates equation (β), which contains the weight w, for each pixel and obtains the evaluation value T1, in which the pixel value P1 of the latest frame is weighted in inverse proportion to |P1−P0|.
[0026]
w = r * |P1 − P0| ... (α)
T1 = (w * T0 + P1) / (w + 1) ... (β)
where
r is the proportionality constant (average weight) used for the weighting,
P0 is the pixel value of the pixel at coordinates (x, y) in the immediately preceding frame,
P1 is the pixel value of the pixel at coordinates (x, y) in the latest frame,
T0 is the evaluation value of the pixel at coordinates (x, y) in the immediately preceding frame,
T1 is the evaluation value of the pixel at coordinates (x, y) in the latest frame,
x is the horizontal axis of the frame, and
y is the vertical axis of the frame.
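Equations (α) and (β) translate directly into a per-pixel update. The following minimal sketch uses plain Python lists for frames; the function names and the default value of r are assumptions, since the text does not fix a value for r.

```python
def update_evaluation(t0, p0, p1, r=0.5):
    """Return the new evaluation value T1 for one pixel."""
    w = r * abs(p1 - p0)            # equation (alpha)
    return (w * t0 + p1) / (w + 1)  # equation (beta)


def update_evaluation_frame(t_prev, p_prev, p_curr, r=0.5):
    """Apply the per-pixel update to whole frames given as lists of rows."""
    return [
        [update_evaluation(t0, p0, p1, r)
         for t0, p0, p1 in zip(t_row, p0_row, p1_row)]
        for t_row, p0_row, p1_row in zip(t_prev, p_prev, p_curr)
    ]
```

When a pixel does not change, w is 0 and T1 simply equals P1. When it jumps, w becomes large and T1 stays close to T0, so the new value contributes with weight 1/(w + 1). This is the sense in which the weighting is inversely related to |P1 − P0|.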
[0027]
[Table 1]

[0028]
Each time a new frame is acquired, the weighting from the preceding frame is updated (from the upper rows to the lower rows of the table). When the computing means 1 performs the above calculation, the weighting yields, for each pixel and each newly acquired frame, an evaluation value T1 in which the contrast between frequently moving parts and the essentially static background is heightened. The computing means 1 then subtracts the evaluation value T1 from the pixel value P1 of the latest frame and takes the absolute value, giving for each pixel the moving-object component value E1 of the latest frame, from which most of the background component has been removed. The moving-object component value E1 is compared with a predetermined threshold (see Table 1 and FIG. 14), and pixels exceeding it are extracted as moving-object pixels 16. A virtual frame that simulates placing those pixels on a single frame according to their position coordinates is represented in the memory 15 of the computing means 1, and the computing means 1 performs coordinate calculations on the moving-object blocks in that virtual frame (blocks formed by sets of moving-object pixels at their respective position coordinates), which yields the position and size of the moving objects alone.
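The thresholding step and the position and size calculation can be sketched as follows. The code assumes frames stored as lists of rows and, for simplicity, treats all moving-object pixels as one block summarized by a bounding box; the function names and that simplification are assumptions, not part of the described method.

```python
def moving_pixels(frame, evaluation, threshold):
    """Coordinates (x, y) where E1 = |P1 - T1| exceeds the threshold."""
    return [
        (x, y)
        for y, (p_row, t_row) in enumerate(zip(frame, evaluation))
        for x, (p, t) in enumerate(zip(p_row, t_row))
        if abs(p - t) > threshold
    ]


def bounding_box(pixels):
    """Inclusive bounding box (x0, y0, x1, y1) of a non-empty pixel set."""
    xs = [x for x, _ in pixels]
    ys = [y for _, y in pixels]
    return (min(xs), min(ys), max(xs), max(ys))
```

The bounding box gives both the position of the block and, through its width and height, a simple measure of its size.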
[0029]
Thus, the above method of weighting according to the frequency of movement not only eliminates the influence of the ghost (FIGS. 12A and 12B) caused by assimilation of the moving object into the background. Even an object that conventional moving object detection could not detect, that is, an object that has not moved for a while or an object that moves slowly, can be detected without difficulty: a stationary time is set in advance, a frame sampling time suited to that set time is chosen, and weighting according to the frequency of movement is performed accordingly.
[0030]
When the moving object detection process is performed with an appropriate set time and an appropriate threshold, a group of moving object blocks extracted on the basis of the moving body component value E1 is recognized as one unit object and is also handled as one unit object in the subsequent processing. In addition, because the moving body component value E1 excludes most of the background components that would otherwise hinder any reduction of the processing time for moving object detection, the objects to be processed are narrowed down, and processing efficiency and practicality are greatly improved.
[0031]
Next, among the filtering processes used in this example, the environmental filtering process 3, which removes noise caused by changes in ambient illuminance (local brightness changes), will be described. For example, this process compares, for each pixel of the frame, the average background luminance over the immediately preceding frames with the background luminance of the latest frame, and detects whether a physically significant change has occurred in that area according to whether the luminance difference exceeds a predetermined threshold. Since the transition function of daily illuminance change is almost linear, a threshold for a significant change can be set appropriately on the basis of the average luminance extracted in the moving object detection process 2 or the like.
[0032]
Specifically, for every pixel constituting each frame of the video image, from the start of frame acquisition to the acquisition of the latest frame, the computing means 1 derives the cumulative value of the evaluation value (cumulative evaluation value in Table 2), its average (average evaluation value TA in Table 2), the amount of change between consecutive frames (immediately preceding change amount in Table 2), and the maximum change amount over that period (evaluation value maximum change amount D in Table 2), and saves them in the recording means 17 of the system.
[0033]
Then, in the computing means 1, if the absolute value of the difference between the evaluation value T1 of the latest frame and the average evaluation value TA including the latest frame (the average evaluation value difference in Table 2) is within the augmented change amount TS, which is obtained by increasing the evaluation value maximum change amount D by a predetermined detection sensitivity s%, the pixel is not regarded as a pixel having a high probability of being part of a moving object involving a physically significant change (pseudo pixel). If the average evaluation value difference is larger than the augmented change amount TS, the change at that time is judged to be much larger than the evaluation value maximum change amount D obtained from the frame acquisition history so far, and the pixel is marked as a pseudo pixel having a high probability of being part of a moving object (see Table 2 and FIG. 15).
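The marking rule above can be sketched for one pixel as follows. The reading TS = D × (1 + s / 100) is an assumption about how the detection sensitivity s% is applied to D, and the function names `cumulative_average` and `is_pseudo_pixel`, as well as the sample values, are hypothetical.

```python
def cumulative_average(history, t1):
    """Average evaluation value TA over all frames so far, latest included."""
    return (sum(history) + t1) / (len(history) + 1)


def is_pseudo_pixel(t1, ta, d, s_percent):
    """Mark the pixel when |T1 - TA| exceeds the threshold TS.

    Assumption: TS is the maximum change amount D increased by s percent.
    """
    ts = d * (1.0 + s_percent / 100.0)
    return abs(t1 - ta) > ts


history = [100.0, 101.0, 99.0, 100.0]  # earlier evaluation values of a pixel

# Small fluctuation: |100.5 - 100.1| = 0.4 is within TS = 2.4.
ta_quiet = cumulative_average(history, 100.5)
quiet = is_pseudo_pixel(100.5, ta_quiet, d=2.0, s_percent=20.0)

# Abrupt change: |130 - 106| = 24 exceeds TS = 2.4.
ta_jump = cumulative_average(history, 130.0)
jump = is_pseudo_pixel(130.0, ta_jump, d=2.0, s_percent=20.0)
```

With these values only the second pixel is marked as a pseudo pixel.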
[0034]
[Table 2]

[0035]
In the coordinate calculation of moving object blocks based on the virtual frame set in the memory 15 of the computing means 1, all pixels marked as pseudo pixels are handled together with the objects recognized as moving object blocks in the same frame by the moving object detection process 2. When the distance between mutually separated objects or pseudo pixels is less than a predetermined distance (an adjustable setting), the computing means 1 applies connection/integration processing to these detection areas, such as a fill method or a binary morphological closing method, and handles them as a single connected moving object block.
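As one possible illustration of the connection/integration step, the sketch below applies a binary morphological closing (dilation followed by erosion) to a small mask. The 3 × 3 neighbourhood stands in for the adjustable distance setting and is an assumption, as are the function names; pixels outside the frame are treated as background, so a block touching the frame border may shrink in this simplified version.

```python
def _neighbours(y, x):
    """Coordinates of the 3 x 3 neighbourhood centred on (y, x)."""
    return [(ny, nx) for ny in (y - 1, y, y + 1) for nx in (x - 1, x, x + 1)]


def dilate(mask):
    """Set a pixel if any pixel in its 3 x 3 neighbourhood is set."""
    h, w = len(mask), len(mask[0])
    return [[int(any(0 <= ny < h and 0 <= nx < w and mask[ny][nx]
                     for ny, nx in _neighbours(y, x)))
             for x in range(w)] for y in range(h)]


def erode(mask):
    """Keep a pixel only if its whole 3 x 3 neighbourhood is set."""
    h, w = len(mask), len(mask[0])
    return [[int(all(0 <= ny < h and 0 <= nx < w and mask[ny][nx]
                     for ny, nx in _neighbours(y, x)))
             for x in range(w)] for y in range(h)]


def close_mask(mask):
    """Binary closing: dilation followed by erosion."""
    return erode(dilate(mask))


# Two 2 x 2 detection areas separated by a one-pixel gap.
mask = [
    [0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 1, 1, 0],
    [0, 1, 1, 0, 1, 1, 0],
    [0, 0, 0, 0, 0, 0, 0],
]
closed = close_mask(mask)
```

After closing, the gap at column 3 is filled and the two areas form one connected block, while the surrounding background is unchanged.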
[0036]
In the environmental filtering process 3, when the computing means 1 determines that the average evaluation value difference (|T1 - TA|) for the evaluation value T1 is larger than the evaluation value maximum change amount D so far, it performs a learning process in which the maximum change amount D is updated by increasing it by a value obtained by multiplying the increase in the maximum change amount D by a preset sensitivity coefficient ("1" in this example). When the absolute value of the average evaluation value difference is less than the maximum change amount D, the maximum change amount D is updated by decreasing it by a value obtained by multiplying the decrease in the maximum change amount D by a preset sensitivity coefficient (in this example, this is performed as a regular process of decreasing D by a fixed rate (or amount) per unit time). The speed at which the maximum change amount D is increased or decreased can be tuned by raising or lowering the sensitivity coefficient, and the sensitivity of the decision whether to increase the maximum change amount D can be adjusted by raising or lowering the detection sensitivity s% (see Table 2 and FIG. 15).
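A minimal sketch of this learning process for one pixel follows. It assumes that the "increase" is the excess of |T1 - TA| over D, that the decrease is a fixed amount per frame, and that D is not allowed to fall below zero; the name `learn_max_change` and the parameters `gain` and `decay_step` are hypothetical.

```python
def learn_max_change(d, diff, gain=1.0, decay_step=0.1):
    """Update the maximum change amount D from the latest difference.

    d:          current maximum change amount D
    diff:       average evaluation value difference |T1 - TA|
    gain:       sensitivity coefficient (1 in the example of the text)
    decay_step: assumed fixed decrease per frame
    """
    if diff > d:
        # Assumed reading: raise D by the excess, scaled by the coefficient.
        return d + gain * (diff - d)
    if diff < d:
        # Regular decrease, clamped at zero (the clamp is an assumption).
        return max(0.0, d - gain * decay_step)
    return d


raised = learn_max_change(2.0, 5.0)   # difference above D: D rises to 5.0
lowered = learn_max_change(2.0, 0.5)  # difference below D: D decays to 1.9
```

With a coefficient of 1 the raised value equals the observed difference, and a larger or smaller coefficient changes how quickly D follows the observations.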
[0037]
The environmental filtering process 3 can also be applied to the moving body component value E1. In this case, when the computing means 1 determines that the average moving body component value difference (|E1 - EA|), which is the difference between the moving body component value E1 and the average moving body component value EA, is larger than the moving body component value maximum change amount L so far, it performs a learning process in which the maximum change amount L is updated by increasing it by a value obtained by multiplying the increase in the maximum change amount L by a preset sensitivity coefficient ("1" in this example). When the average moving body component value difference (|E1 - EA|) is smaller than the maximum change amount L, it performs a learning process in which the maximum change amount L is updated by decreasing it by a value obtained by multiplying the decrease in the maximum change amount L by a preset sensitivity coefficient. Here too, the speed at which the maximum change amount L is increased or decreased can be tuned by raising or lowering the sensitivity coefficient, and the sensitivity of the decision whether to increase the maximum change amount L can be adjusted by raising or lowering the detection sensitivity s% (see Table 3 and FIG. 16).
[0038]
[Table 3]

[0039]
The environmental filtering process 3 is performed on all pixels constituting the frame. The illuminance change of an area composed of pixels that are not marked as pseudo pixels is judged to be noise, and those pixels are excluded from recognition as pixels constituting a moving object block, which streamlines the processing. In addition, D or L is automatically set by the learning process to a value that matches actual conditions, so that an appropriate reference value is always available and accurate noise removal is possible.
[0040]
In this embodiment, some object filtering processes are further performed by the computing means 1 depending on the spatial or temporal characteristics of the object.
[0041]
After moving object blocks are picked up 18 from the virtual frame set in the memory 15 of the computing means 1 and subjected to connection/integration processing 20 through the moving object block coordinate calculation, the size of each object is calculated by the coordinate calculation. The computing means 1 then performs an object filtering process 11 including a size filter 19, which excludes objects smaller than a reference size from recognition as pixels constituting a moving object block. Even when a small animal or a distant object crosses the frame, the object filtering process 11 can thus remove objects that are far smaller than the detection target from the moving object detection process 2.
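The size filter can be sketched as below, assuming that each moving object block is summarized as a tuple (x, y, width, height) and that the reference is a minimum bounding-box area in pixels. Both the representation and the name `size_filter` are assumptions for this example.

```python
def size_filter(blocks, min_area):
    """Keep only blocks whose bounding-box area reaches the reference size."""
    return [b for b in blocks if b[2] * b[3] >= min_area]


blocks = [
    (10, 20, 12, 30),  # person-sized block, area 360
    (40, 55, 2, 3),    # small animal or distant object, area 6
]
kept = size_filter(blocks, min_area=50)
```

Only the first block survives, so the small object never reaches the later direction and speed checks.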
[0042]
Once a moving object block has been recognized through the coordinate calculation by the computing means 1, its traveling direction can be derived and its trajectory predicted. To derive the traveling direction, however, the presence of the moving object block must be tracked over time. Therefore, in this example, the moving body component value E0 obtained for the immediately preceding frame is compared with the predetermined threshold (see FIG. 14), the pixels exceeding it are extracted, and a virtual frame in which they are arranged on one frame is set in the memory 15 of the computing means 1. Block matching (sometimes called binary correlation, pattern matching, or template matching) is then performed to find the moving object block that most closely approximates the moving object block whose traveling direction is to be derived, and the movement vector of the moving object block is calculated through the coordinate calculation by the computing means 1 from the change in position coordinates between the immediately preceding frame and the latest frame. The computing means 1 then performs the object filtering process 11 including a direction filter 22, which excludes from the detection targets any object moving in a direction irrelevant to the purpose of the image processing (monitoring system or the like), that is, an invalid direction (for example, leftward if rightward is the valid direction).
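The following sketch illustrates the movement vector and the direction filter. Note that it replaces the block matching of the text with a simpler nearest-centroid match, purely to keep the example short; blocks are assumed to be (x, y, width, height) tuples, and all function names are hypothetical.

```python
import math


def centroid(block):
    """Centre point of a block given as (x, y, width, height)."""
    x, y, w, h = block
    return (x + w / 2.0, y + h / 2.0)


def nearest_block(block, candidates):
    """Candidate whose centroid is closest (stand-in for block matching)."""
    cx, cy = centroid(block)
    return min(candidates,
               key=lambda c: math.hypot(centroid(c)[0] - cx,
                                        centroid(c)[1] - cy))


def movement_vector(prev_block, latest_block):
    """Displacement of the centroid between the two frames."""
    (px, py), (lx, ly) = centroid(prev_block), centroid(latest_block)
    return (lx - px, ly - py)


def passes_direction_filter(vector, valid_direction):
    """True when the vector has a positive component along the valid direction."""
    return vector[0] * valid_direction[0] + vector[1] * valid_direction[1] > 0


prev_blocks = [(10, 20, 4, 8), (50, 5, 3, 3)]
latest = (14, 20, 4, 8)

match = nearest_block(latest, prev_blocks)
vector = movement_vector(match, latest)
rightward_ok = passes_direction_filter(vector, (1, 0))
leftward_ok = passes_direction_filter(vector, (-1, 0))
```

Here the block has moved four pixels to the right, so it passes when rightward is the valid direction and is excluded when leftward is.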
[0043]
Once the moving direction of the moving object has been found as described above, the relative speed of the moving object block can also be derived. The computing means 1 can therefore perform the object filtering process 11 including a speed filter 23, which removes moving objects whose speed deviates greatly from the speed predicted for the detection target. Tables 1, 2 and 3 show the values for 50 frames from the start of frame acquisition, separately for each related process.
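A speed filter of this kind might look like the sketch below. The frame interval, the expected speed, and the tolerance are assumed parameters with illustrative values (pixels and seconds), and the function names are hypothetical.

```python
import math


def block_speed(vector, frame_interval):
    """Speed in pixels per second from a per-frame movement vector."""
    return math.hypot(vector[0], vector[1]) / frame_interval


def passes_speed_filter(vector, frame_interval, expected_speed, tolerance):
    """True when the speed lies within the tolerance of the expected speed."""
    speed = block_speed(vector, frame_interval)
    return abs(speed - expected_speed) <= tolerance


# 4 px per frame at 0.1 s per frame is about 40 px/s: close to the expected 35.
walker_ok = passes_speed_filter((4.0, 0.0), 0.1, expected_speed=35.0, tolerance=10.0)

# 30 px per frame is about 300 px/s: far from the expected speed.
flash_ok = passes_speed_filter((30.0, 0.0), 0.1, expected_speed=35.0, tolerance=10.0)
```

The first object is kept and the second is removed from the detection targets.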
[0044]
In the image processing, in addition to the filtering processes, the computing means 1 may perform a masking process in which only a specific region of the entire frame covered by the camera's field of view is set as a region of interest (ROI) for detection, which enables better image processing.
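The masking process can be illustrated as follows, assuming a rectangular ROI given as (x0, y0, x1, y1) with the upper bounds excluded. The rectangular shape and the name `apply_roi` are assumptions; the text does not restrict the shape of the region.

```python
def apply_roi(moving_pixels, roi):
    """Keep only the moving pixels that fall inside the rectangular ROI."""
    x0, y0, x1, y1 = roi
    return [(x, y) for x, y in moving_pixels
            if x0 <= x < x1 and y0 <= y < y1]


detected = [(1, 1), (5, 5), (9, 2)]
inside = apply_roi(detected, roi=(0, 0, 6, 6))
```

Pixels outside the region, such as (9, 2) here, are dropped before any block coordinate calculation, which reduces the amount of data handled by the later steps.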
[0045]
[Effects of the invention]
According to the image processing method and the image processing apparatus by computer configured as described above, moving object constituent pixels are identified, on the basis of a predetermined threshold, from the moving body component value derived from the evaluation value obtained by applying to each pixel constituting the latest frame a weighting inversely proportional to the absolute value of the difference between consecutive frames. As a result, even outdoors, where the brightness and color of the background itself change from moment to moment, the background component is evaluated in step with the passage of time, and the ghost effect that was a problem with the method of using the average of several frames as the basic background image for moving object detection is also eliminated.
[0046]
Further, for each pixel constituting the frame, the cumulative average value of the evaluation value and the maximum change amount D within a time limit are derived every time the latest frame is acquired, and an environmental filtering process is executed that deems as a moving object constituent pixel any pixel whose cumulative average disparity, that is, the absolute value of the difference between the evaluation value of that pixel in the latest frame and the cumulative average value, exceeds a reference value based on the maximum change amount D. This makes it possible to clearly distinguish moving objects from, outdoors, the movement of shadows and clouds under the influence of light and swaying trees, and, indoors, flickering monitor screens and swaying curtains, so that a low-cost system that identifies moving objects accurately and avoids false alarms can be provided (see FIG. 13).
[0047]
In addition, by executing the environmental filtering process that automatically learns the threshold, environmental filtering matched to the actual parameters on site can always be performed, and by further executing the various object filtering processes, effective image processing that better reflects actual conditions can be performed. With these effects, it is possible to provide a computer-based image processing method that adapts accurately to a changing environment and distinguishes moving objects from the background accurately, at low cost and at high speed.
[Brief description of the drawings]
FIG. 1 is a block diagram showing an example of a processing procedure of a computer image processing method according to the present invention.
FIG. 2 is a schematic explanatory diagram showing an example of a monitoring system employing a computer image processing method according to the present invention.
FIG. 3 is a schematic explanatory diagram showing an example of a monitoring system employing a computer image processing method according to the present invention.
FIG. 4 is an explanatory diagram showing an example of arrangement of a scan camera and a pre-scan camera of a monitoring system adopting an image processing method by a computer according to the present invention.
FIG. 5 is an explanatory diagram showing an arrangement example of a scan camera and a pre-scan camera of a monitoring system adopting an image processing method by a computer according to the present invention.
FIG. 6 is an explanatory diagram showing an example of initial settings of a scan camera and a pre-scan camera of a monitoring system adopting an image processing method by a computer according to the present invention.
FIG. 7 is an explanatory diagram showing an example of a receiver that displays an image acquired from a pre-scan camera of a monitoring system that employs an image processing method by a computer according to the present invention.
FIG. 8 is an explanatory diagram showing an example in which a panoramic image block optical system forms an image of a three-dimensional space around an imaging position as an annular image.
FIG. 9 is a block diagram showing an example of a hardware configuration for realizing an image processing method by a computer according to the present invention.
FIGS. 10A and 10B are images representing a threshold difference between the latest original image and the immediately preceding original image, showing an example of the problem of the conventional image processing method by a computer.
FIGS. 11A and 11B are images representing threshold differences between the latest original image and the immediately preceding original image by a background subtraction method, which is a conventional computer image processing method.
FIGS. 12A and 12B are, respectively, a background image including a ghost effect, obtained by averaging a plurality of immediately preceding frames in a background subtraction method, which is a conventional computer image processing method, and an image expressing the threshold difference between that background image and the latest original image.
FIGS. 13A and 13B show examples of the latest original image, the image before filtering in which all moving objects are detected, and the image after filtering, illustrating the effect of the environmental filtering process employed in the computer image processing method according to the present invention.
FIG. 14 is a graph illustrating an example of a moving object detection process of an image processing method by a computer according to the present invention.
FIG. 15 is a graph illustrating an example of learning processing of an image processing method by a computer according to the present invention.
FIG. 16 is a graph illustrating an example of learning processing of an image processing method by a computer according to the present invention.
[Explanation of symbols]
1 computing means, 2 moving object detection processing, 3 environmental filtering processing,
4 panoramic image block optical system, 5 ring image,
6 pre-scan camera, 7 scan camera, 8 receiver,
9 panoramic image block, 10 light receiving element,
11 object filtering process, 12 physical quantity detection interface,
13 fixed image control device, 14 capture image control device, 15 memory,
16 moving object constituent pixels, 17 recording means, 18 picking up moving object blocks,
19 connection / integration processing, 20 size filter,
21 calculation of motion direction and speed, 22 direction filter, 23 speed filter

Claims

In an image processing method for identifying a moving object and a background on optically acquired moving image data,
a computer-based image processing method characterized in that a computing means (1) executes a moving object detection process (2) which: obtains, for each pixel value of the pixels constituting a moving image frame taken from the light receiving element of a fixed camera, the difference between the latest frame and the immediately preceding frame; derives a moving body component value by subtracting, from the pixel value of the latest frame, an evaluation value obtained by applying to each pixel value constituting the latest frame a weighting inversely proportional to the absolute value of that difference; extracts moving object constituent pixels by comparing the obtained moving body component value with a predetermined threshold; and represents in a memory a virtual frame in which the extracted pixels are arranged on one frame based on their position coordinates.

In an image processing method for identifying a moving object and a background on optically acquired moving image data,
a computer-based image processing method characterized in that a computing means (1) executes a moving object detection process (2) which: obtains, for each pixel value of the pixels constituting a moving image frame taken from the light receiving element of a fixed camera, the difference between the latest frame and the immediately preceding frame; obtains, using the following formula (α) and formula (β), an evaluation value by applying to each pixel value constituting the latest frame a weighting inversely proportional to the absolute value of that difference; derives a moving body component value by subtracting the evaluation value from the pixel value of the latest frame; extracts moving object constituent pixels by comparing the obtained moving body component value with a predetermined threshold; and represents in a memory a virtual frame in which the extracted pixels are arranged on one frame based on their position coordinates.
w = r * |P1 - P0| ... (α)
T1 = (w * T0 + P1) / (w + 1) ... (β)
Here, r is a proportionality constant (average weight) for performing the weighting,
P0 is the pixel value of the pixel at the coordinates (x, y) of the immediately preceding frame,
P1 is the pixel value of the pixel at the coordinates (x, y) of the latest frame,
T0 is the evaluation value of the pixel at the coordinates (x, y) in the immediately preceding frame,
T1 is the evaluation value of the pixel at the coordinates (x, y) in the latest frame,
x is the horizontal coordinate in the frame, and
y is the vertical coordinate in the frame.

The computer-based image processing method according to claim 1 or claim 2, wherein the computing means (1) executes an environmental filtering process (3) in which, for each pixel constituting the frame, the cumulative average value of the evaluation values and the maximum change amount D within a time limit are derived every time the latest frame is acquired, and a pixel whose cumulative average disparity, which is the absolute value of the difference between the evaluation value of that pixel in the latest frame and the cumulative average value, exceeds a reference value based on the maximum change amount D is deemed to be a moving object constituent pixel.

The computer-based image processing method according to claim 3, wherein the computing means (1) executes an environmental filtering process (3) including a learning process in which the maximum change amount D is updated so as to decrease when the cumulative average disparity is less than the reference value based on the maximum change amount D, and conversely is updated so as to increase when the cumulative average disparity exceeds the reference value based on the maximum change amount D.

An image processing apparatus comprising: a fixed camera that forms an image of the three-dimensional space extending 360° around the imaging position as a ring image (5) by means of a PAL optical system (4); a fixed image control device (13) including a computing means (1) that performs moving object detection on the video image acquired from the fixed camera by the computer-based image processing method according to any one of claims 1 to 4; a scan camera (7) with a unidirectional zoom optical system that is directed toward the moving object to be examined recognized by the fixed image control device (13); and a captured image control device (14) including a computer system that stores the video image acquired by the scan camera (7).