JP4764362B2

JP4764362B2 - Event discrimination device and event discrimination program

Info

Publication number: JP4764362B2
Application number: JP2007034236A
Authority: JP
Inventors: 俊彦三須; 昌秀苗村; 真人藤井; 伸行八木
Original assignee: Japan Broadcasting Corp
Current assignee: Japan Broadcasting Corp
Priority date: 2007-02-15
Filing date: 2007-02-15
Publication date: 2011-08-31
Anticipated expiration: 2027-02-15
Also published as: JP2008198038A

Description

The present invention relates to an event discrimination device and an event discrimination program for discriminating events in video data.

Conventionally, as a method for detecting an event, that is, a specific scene, in sports video (video data obtained by shooting a sport), there is a method that detects composition and camera work, such as pan and zoom operations, from the sports video and detects events indirectly based on the angle of view, the shooting direction, and transitions in camera work (for example, Non-Patent Document 1).

Other conventional event detection methods use color, cut points, and the like in addition to composition to determine the importance of sports video, or to determine the genre of the video (news, sports broadcast, and so on) (for example, Patent Documents 1 and 2).

Furthermore, conventional event detection methods include a method that detects the type and timing of an event from feature values relating to the players in sports video, by analyzing the instantaneous state (formation) of player positions and player velocities (for example, Patent Document 3).
There is also a method that modularizes the event estimation means and handles event detection in various fields by selecting among the modules (for example, Patent Document 4).
Patent Document 1: JP H09-65287 A
Patent Document 2: JP 2006-251885 A
Patent Document 3: JP 2006-285878 A
Patent Document 4: JP 2000-123184 A
Non-Patent Document 1: "A feature analysis method for sports broadcast video database based on camera work information," Database Society of Japan Letters, Vol. 1, No. 2, pp. 32-35.

However, the conventional event detection methods (Non-Patent Document 1, Patent Document 1, and Patent Document 2) discriminate scenes indirectly from camera work. Depending on differences in skill among the camera operators who shot the sports video, or on directorial intent, they may therefore output unintended event detection results.

Incidentally, the conventional method disclosed in Patent Document 1 can also refer to video switching and to colors in the image, but it was not designed to analyze motion on a per-subject basis within the image, so it is not suited to discriminating detailed events in sports video.

In contrast, the conventional method disclosed in Patent Document 3 discriminates scenes from the arrangement and motion of subjects. It is therefore unaffected by differences in camera operator skill, and it can analyze information directly related to events in sports video, namely the arrangement and motion of subjects, to discriminate those events. However, applying this method to various sports requires a rule base to be prepared subjectively for each sport. As a result, the accuracy of event detection depends entirely on how much the person who built the rule base knows about the sport.

Furthermore, the conventional method disclosed in Patent Document 4 aims at generality by modularizing the event estimation means that serve as the rule base. Because the rule base is still constructed subjectively, however, the same problem remains: detection accuracy depends entirely on the sports knowledge of the person who built the rule base.

Accordingly, an object of the present invention is to solve the above problems and to provide an event discrimination device and an event discrimination program that can be used generally across sports and can discriminate events based on objective discrimination criteria.

To solve the above problems, the event discrimination device according to claim 1 is an event discrimination device that discriminates an event, which is a specific scene in sports video data, and comprises silhouette image extraction means, silhouette feature extraction means, positional relationship feature vector generation means, conversion means, and event discrimination means.

With this configuration, the event discrimination device uses the silhouette image extraction means to distinguish background regions from non-background regions, using a predetermined pixel value of the pixels in the images constituting the video data as a reference, and extracts a silhouette image in which the non-background regions are treated as silhouette regions of persons. Next, the silhouette feature extraction means extracts, for each silhouette region contained in the extracted silhouette image, a silhouette feature vector, which is a vector representing the features of that silhouette region and containing at least coordinate information. The positional relationship feature vector generation means then uses the coordinate information in the extracted silhouette feature vectors to generate a positional relationship feature vector, which is a vector quantifying the positional relationship of the persons.
Here, the silhouette feature extraction means includes in the silhouette feature vector, as the coordinate information, a coordinate value in a coordinate space whose coordinate values have been associated in advance with the space of the ground on which the sport is played. Outputting this coordinate value in the silhouette feature vector reveals where in the ground space each subject is located.
In addition, the positional relationship feature vector generation means counts, based on the coordinate values in the silhouette feature vectors, the persons present in each of the local regions into which the ground space is divided, and outputs the counts as part of the positional relationship feature vector. This reveals how many subjects are present in each local region.
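
The per-region counting described above can be illustrated with a minimal Python sketch. It assumes a rectangular ground divided into a uniform grid; the function name `count_per_region`, the ground dimensions, and the grid size are hypothetical choices for illustration and are not specified in this description.

```python
def count_per_region(positions, field_w, field_h, nx, ny):
    """Count ground-coordinate positions falling in each cell of an nx-by-ny grid.

    The result is a flat list of nx * ny counts (row-major), which plays the
    role of the count part of a positional relationship feature vector.
    The uniform grid is an assumption made for this sketch.
    """
    counts = [0] * (nx * ny)
    for x, y in positions:
        if not (0 <= x <= field_w and 0 <= y <= field_h):
            continue  # ignore positions outside the ground
        # Clamp so that points exactly on the far edge fall in the last cell.
        ix = min(int(x / field_w * nx), nx - 1)
        iy = min(int(y / field_h * ny), ny - 1)
        counts[iy * nx + ix] += 1
    return counts


# Hypothetical player positions in ground coordinates (metres).
players = [(5.0, 10.0), (52.0, 34.0), (55.0, 30.0), (100.0, 60.0)]
formation = count_per_region(players, field_w=105.0, field_h=68.0, nx=3, ny=2)
print(formation)  # [1, 1, 0, 0, 1, 1]
```

A finer grid gives a more detailed formation description at the cost of a longer vector.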

The conversion means then outputs a discrimination feature vector, which is the vector obtained by multiplying the positional relationship feature vector generated by the positional relationship feature vector generation means by a preset matrix that weights the features. The event discrimination means uses this discrimination feature vector and discriminates the event by referring to probability models that model in advance, for each event type, the probability of the event occurring when that discrimination feature vector is obtained.
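
The conversion step is a matrix-vector product, which the following Python sketch illustrates. The function name `to_discrimination_vector` and the numeric values of the weight matrix are hypothetical; how the matrix is obtained is not covered by this sketch.

```python
def to_discrimination_vector(weights, formation):
    """Multiply a weight matrix (a list of rows) by a feature vector."""
    if any(len(row) != len(formation) for row in weights):
        raise ValueError("each matrix row must match the feature vector length")
    return [sum(w * f for w, f in zip(row, formation)) for row in weights]


# Hypothetical 2x4 weight matrix and 4-element positional relationship vector.
W = [[0.5, 0.5, 0.0, 0.0],
     [0.0, 0.0, 1.0, -1.0]]
f = [2, 4, 3, 1]
y = to_discrimination_vector(W, f)
print(y)  # [3.0, 2.0]
```

With fewer rows than columns, the product also reduces the dimension of the vector passed to the event discrimination step.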

Discriminating the event by referring to the probability model may mean any of the following: the occurrence probability of an event exceeds a threshold set in the probability model; a probability density function obtained from the occurrence probabilities approximates a density distribution preset in the probability model; or the occurrence probabilities of the events are sorted by magnitude and a predetermined number of events are extracted in descending order of probability.
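
Two of these decision rules, the threshold test and the extraction of a predetermined number of most probable events, can be sketched in Python as follows. The event names and probability values are hypothetical and serve only as an illustration.

```python
def events_above_threshold(probs, threshold):
    """Return the event types whose occurrence probability exceeds the threshold."""
    return sorted(event for event, p in probs.items() if p > threshold)


def top_k_events(probs, k):
    """Return the k event types with the highest occurrence probability."""
    ranked = sorted(probs.items(), key=lambda item: item[1], reverse=True)
    return [event for event, _ in ranked[:k]]


# Hypothetical occurrence probabilities for one discrimination feature vector.
probs = {"corner_kick": 0.62, "goal_kick": 0.25, "free_kick": 0.08, "throw_in": 0.05}

print(events_above_threshold(probs, 0.2))  # ['corner_kick', 'goal_kick']
print(top_k_events(probs, 1))              # ['corner_kick']
```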

The event discrimination device according to claim 2 is an event discrimination device that discriminates an event, which is a specific scene in sports video data, and comprises silhouette image extraction means, silhouette feature extraction means, positional relationship feature vector generation means, conversion means, and event discrimination means.

With this configuration, the event discrimination device uses the silhouette image extraction means to distinguish background regions from non-background regions, using a predetermined pixel value of the pixels in the images constituting the video data as a reference, and extracts a silhouette image in which the non-background regions are treated as silhouette regions of persons. Next, the silhouette feature extraction means extracts, for each silhouette region contained in the extracted silhouette image, a silhouette feature vector, which is a vector representing the features of that silhouette region and containing at least coordinate information. The positional relationship feature vector generation means then uses the coordinate information in the extracted silhouette feature vectors to generate a positional relationship feature vector, which is a vector quantifying the positional relationship of the persons.
Here, the silhouette feature extraction means includes two items in the silhouette feature vector and outputs it: a coordinate value in a coordinate space whose coordinate values have been associated in advance with the space of the ground on which the sport is played, and team information obtained by evaluating color vector statistics for each silhouette region in the silhouette image to determine the team to which the subject in that region belongs. Outputting the coordinate value and the team information in the silhouette feature vector reveals, for each team, where in the ground space the subjects are located.
In addition, the positional relationship feature vector generation means counts, for each team, the number of subjects present in each of the local regions into which the ground space is divided, based on the coordinate values and the team information in the silhouette feature vectors, and outputs the counts as part of the positional relationship feature vector. This reveals the number of subjects of each team present in each local region.
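
As a rough illustration of the team-wise counting, the Python sketch below assigns each silhouette to a team by comparing the mean color of its pixels with a reference color per team, then counts subjects per team in strips along the length of the ground. The use of the mean as the color statistic, the nearest-reference rule, the strip-shaped local regions, and all names and values are assumptions made for this sketch.

```python
def mean_color(pixels):
    """Mean RGB color of a silhouette's pixels (one possible color statistic)."""
    n = len(pixels)
    return tuple(sum(p[c] for p in pixels) / n for c in range(3))


def classify_team(pixels, team_colors):
    """Pick the team whose reference color is nearest to the mean color."""
    m = mean_color(pixels)
    return min(team_colors,
               key=lambda team: sum((a - b) ** 2 for a, b in zip(m, team_colors[team])))


def count_per_team(silhouettes, team_colors, field_w, nx):
    """Count subjects per team in nx strips along the x axis of the ground.

    Assumes every x coordinate lies within [0, field_w].
    """
    counts = {team: [0] * nx for team in team_colors}
    for s in silhouettes:
        team = classify_team(s["pixels"], team_colors)
        ix = min(int(s["x"] / field_w * nx), nx - 1)
        counts[team][ix] += 1
    return counts


# Hypothetical reference uniform colors and silhouettes (x in metres, RGB pixels).
team_colors = {"A": (200, 30, 30), "B": (30, 30, 200)}
silhouettes = [
    {"x": 10.0, "pixels": [(210, 40, 35), (190, 25, 30)]},
    {"x": 60.0, "pixels": [(20, 35, 190), (40, 30, 210)]},
    {"x": 90.0, "pixels": [(205, 20, 25)]},
    {"x": 95.0, "pixels": [(25, 25, 205)]},
]
team_counts = count_per_team(silhouettes, team_colors, field_w=105.0, nx=3)
print(team_counts)  # {'A': [1, 0, 1], 'B': [0, 1, 1]}
```

Concatenating the per-team lists gives one vector that records both where subjects are and which team they belong to.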

The event discrimination device according to claim 3 is the event discrimination device according to claim 1 or claim 2, wherein the silhouette feature extraction means detects the movement over time of each silhouette region contained in the silhouette image and outputs the amount of movement of the silhouette region as a velocity vector included in the silhouette feature vector, and the positional relationship feature vector generation means computes the ensemble mean of the velocity vectors contained in the silhouette feature vectors and outputs the result as part of the positional relationship feature vector.

With this configuration, the silhouette feature extraction means outputs the velocity vector as part of the silhouette feature vector, which reveals the speed at which each subject is moving. In addition, the positional relationship feature vector generation means outputs the ensemble mean of the velocity vectors as part of the positional relationship feature vector, which reveals how fast the group of subjects is moving (its average velocity).
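
A minimal Python sketch of the velocity features follows. It assumes that each silhouette region has already been matched between two time instants, and it uses a simple finite difference over an interval `dt`; the function names and numeric values are hypothetical.

```python
def velocity(prev_pos, cur_pos, dt):
    """Finite-difference velocity vector of one silhouette region."""
    return ((cur_pos[0] - prev_pos[0]) / dt, (cur_pos[1] - prev_pos[1]) / dt)


def mean_velocity(velocities):
    """Mean of the velocity vectors over all subjects (zero vector if empty)."""
    n = len(velocities)
    if n == 0:
        return (0.0, 0.0)
    return (sum(v[0] for v in velocities) / n, sum(v[1] for v in velocities) / n)


# Hypothetical positions of two subjects at two instants 0.5 s apart.
prev = [(10.0, 20.0), (30.0, 40.0)]
cur = [(11.0, 20.0), (33.0, 38.0)]
vels = [velocity(p, c, dt=0.5) for p, c in zip(prev, cur)]
avg = mean_velocity(vels)
print(vels)  # [(2.0, 0.0), (6.0, -4.0)]
print(avg)   # (4.0, -2.0)
```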

The event discrimination device according to claim 4 is the event discrimination device according to any one of claims 1 to 3, wherein the conversion means outputs the discrimination feature vector by multiplying the positional relationship feature vector by a weight matrix composed of a group of principal components obtained by principal component analysis.

With this configuration, the conversion means converts the positional relationship feature vector into the discrimination feature vector using a weight matrix composed of the principal components obtained by principal component analysis. In other words, the conversion means replaces a vector representing the features of the persons' positional relationship with a vector indicating what kind of event the scene corresponds to.
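
To make the role of the principal components concrete, the Python sketch below estimates only the first principal component of a set of training vectors by power iteration on their covariance matrix and projects a vector onto it. A full weight matrix would stack several such components as rows. The mean-centering, the power iteration method, and the sample data are assumptions of this sketch.

```python
import math


def top_principal_component(samples, iterations=100):
    """Return (mean, unit vector) for the first principal component."""
    n, d = len(samples), len(samples[0])
    mean = [sum(s[j] for s in samples) / n for j in range(d)]
    centered = [[s[j] - mean[j] for j in range(d)] for s in samples]
    # Sample covariance matrix (normalized by n).
    cov = [[sum(c[i] * c[j] for c in centered) / n for j in range(d)]
           for i in range(d)]
    v = [1.0] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break  # no variance in the data
        v = [x / norm for x in w]
    return mean, v


def project(x, mean, component):
    """Project a mean-centered vector onto one principal component."""
    return sum((xi - mi) * ci for xi, mi, ci in zip(x, mean, component))


# Hypothetical training vectors lying along the direction (1, 2).
training = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
mean, pc = top_principal_component(training)
score = project((3.0, 6.0), mean, pc)
print(pc, score)
```

Here the component comes out parallel to (1, 2), the only direction in which the sample data vary.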

The event discrimination device according to claim 5 is the event discrimination device according to any one of claims 1 to 3, wherein the conversion means outputs the discrimination feature vector by multiplying the positional relationship feature vector by a weight matrix based on the Fisher discriminant criterion.

With this configuration, the conversion means converts the positional relationship feature vector into the discrimination feature vector using a weight matrix based on the Fisher discriminant criterion.
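
For intuition about a Fisher-based weight, the Python sketch below computes the classical two-class Fisher discriminant direction w = Sw^-1 (m_a - m_b) for two-dimensional feature vectors, where Sw is the within-class scatter matrix. Restricting the example to two classes and two dimensions is a simplification made here, and the sample data are hypothetical.

```python
def fisher_direction(class_a, class_b):
    """Two-class Fisher discriminant direction for 2-D samples."""
    def mean(points):
        n = len(points)
        return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)

    def scatter(points, m):
        sxx = sum((p[0] - m[0]) ** 2 for p in points)
        sxy = sum((p[0] - m[0]) * (p[1] - m[1]) for p in points)
        syy = sum((p[1] - m[1]) ** 2 for p in points)
        return sxx, sxy, syy

    ma, mb = mean(class_a), mean(class_b)
    axx, axy, ayy = scatter(class_a, ma)
    bxx, bxy, byy = scatter(class_b, mb)
    # Within-class scatter matrix Sw = [[sxx, sxy], [sxy, syy]].
    sxx, sxy, syy = axx + bxx, axy + bxy, ayy + byy
    det = sxx * syy - sxy * sxy
    if det == 0:
        raise ValueError("within-class scatter matrix is singular")
    dx, dy = ma[0] - mb[0], ma[1] - mb[1]
    # Apply the closed-form inverse of the 2x2 matrix Sw to (dx, dy).
    return [(syy * dx - sxy * dy) / det, (-sxy * dx + sxx * dy) / det]


# Hypothetical feature vectors for two event types.
class_a = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
class_b = [(4.0, 0.0), (5.0, 0.0), (4.0, 1.0), (5.0, 1.0)]
w = fisher_direction(class_a, class_b)
proj_a = [w[0] * x + w[1] * y for x, y in class_a]
proj_b = [w[0] * x + w[1] * y for x, y in class_b]
print(w)  # [-2.0, 0.0]
```

Projecting onto `w` keeps the two classes apart: every projected value of `class_b` lies below every projected value of `class_a`.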

The event discrimination device according to claim 6 is the event discrimination device according to any one of claims 1 to 5, wherein the event discrimination means refers to probability models that model, for each event type, the occurrence probability of the discrimination feature vector, and outputs the event type with the highest probability value.

With this configuration, the event discrimination means discriminates the event type objectively by referring to the probability models.

The event discrimination device according to claim 7 is the event discrimination device according to claim 6, wherein a Gaussian mixture model is used as the probability model.
With this configuration, the event discrimination means discriminates the event type objectively by referring to the Gaussian mixture model.
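
The selection of the most probable event type under per-event Gaussian mixture models can be sketched in Python as follows. The sketch assumes diagonal covariances and mixture parameters that have already been estimated; the event names, weights, means, and variances are hypothetical, and parameter estimation is not shown.

```python
import math


def diag_gaussian(x, mean, var):
    """Density of a Gaussian with diagonal covariance at point x."""
    density = 1.0
    for xi, mi, vi in zip(x, mean, var):
        density *= math.exp(-((xi - mi) ** 2) / (2.0 * vi)) / math.sqrt(2.0 * math.pi * vi)
    return density


def gmm_density(x, components):
    """Mixture density; components is a list of (weight, mean, variance)."""
    return sum(weight * diag_gaussian(x, mean, var) for weight, mean, var in components)


def discriminate(x, models):
    """Return the event type whose mixture model gives x the highest density."""
    return max(models, key=lambda event: gmm_density(x, models[event]))


# Hypothetical mixture models, one per event type.
models = {
    "event_a": [(0.5, [0.0, 0.0], [1.0, 1.0]), (0.5, [1.0, 1.0], [1.0, 1.0])],
    "event_b": [(1.0, [5.0, 5.0], [1.0, 1.0])],
}
print(discriminate([0.2, 0.1], models))  # event_a
print(discriminate([4.8, 5.3], models))  # event_b
```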

The event discrimination program according to claim 8 causes a computer to function as silhouette image extraction means, silhouette feature extraction means, positional relationship feature vector generation means, conversion means that outputs a discrimination feature vector, and event discrimination means, in order to discriminate an event, which is a specific scene in sports video data.

With this configuration, the event discrimination program uses the silhouette image extraction means to distinguish background regions from non-background regions, using a predetermined pixel value of the pixels in the images constituting the video data as a reference, and extracts a silhouette image in which the non-background regions are treated as silhouette regions of persons. The silhouette feature extraction means then extracts, for each silhouette region contained in the extracted silhouette image, a silhouette feature vector, which is a vector representing the features of that silhouette region and containing at least coordinate information. Next, the positional relationship feature vector generation means uses the coordinate information in the extracted silhouette feature vectors to generate a positional relationship feature vector, which is a vector quantifying the positional relationship of the persons, and the conversion means outputs a discrimination feature vector, which is the vector obtained by multiplying the positional relationship feature vector by a preset matrix that weights the features. The event discrimination means then uses this discrimination feature vector and discriminates the event by referring to probability models that model in advance, for each event type, the probability of the event occurring when that discrimination feature vector is obtained.
Here, the event discrimination device uses the silhouette feature extraction means to include two items in the silhouette feature vector and output it: a coordinate value in a coordinate space whose coordinate values have been associated in advance with the space of the ground on which the sport is played, and team information obtained by evaluating color vector statistics for each silhouette region in the silhouette image to determine the team to which the subject in that region belongs. Outputting the coordinate value and the team information in the silhouette feature vector reveals, for each team, where in the ground space the subjects are located.
In addition, the event discrimination device uses the positional relationship feature vector generation means to count, for each team, the number of subjects present in each of the local regions into which the ground space is divided, based on the coordinate values and the team information in the silhouette feature vectors, and outputs the counts as part of the positional relationship feature vector. This reveals the number of subjects of each team present in each local region.

The event discrimination program according to claim 9 causes a computer to function as silhouette image extraction means, silhouette feature extraction means, positional relationship feature vector generation means, conversion means that outputs a discrimination feature vector, and event discrimination means, in order to discriminate an event, which is a specific scene in sports video data.

The event discrimination program according to claim 10 is the event discrimination program according to claim 8 or claim 9, wherein the silhouette feature extraction means detects the movement over time of each silhouette region contained in the silhouette image and outputs the amount of movement of the silhouette region as a velocity vector included in the silhouette feature vector, and the positional relationship feature vector generation means computes the ensemble mean of the velocity vectors contained in the silhouette feature vectors and outputs the result as part of the positional relationship feature vector.

According to the inventions of claims 1 and 8, a silhouette image is extracted from images of a sport, and a silhouette feature vector, a positional relationship feature vector, and a discrimination feature vector are then obtained. This makes the invention generally usable across sports and allows events to be discriminated based on objective discrimination criteria.

In addition, including the coordinate value in the coordinate space in the silhouette feature vector makes the position of each silhouette region accurately known, so that objective discrimination criteria can be adopted for event discrimination.
Furthermore, including the count of subjects in each local region in the positional relationship feature vector makes the number of subjects accurately known, so that objective discrimination criteria can be adopted for event discrimination.

According to the inventions of claims 2 and 9, a silhouette image is extracted from images of a sport, and a silhouette feature vector, a positional relationship feature vector, and a discrimination feature vector are then obtained. This makes the invention generally usable across sports and allows events to be discriminated based on objective discrimination criteria.

In addition, including the coordinate value in the coordinate space and the team information in the silhouette feature vector makes the position of each silhouette region and the team to which it belongs accurately known, so that objective discrimination criteria can be adopted for event discrimination.
Furthermore, including the per-team count of subjects in each local region in the positional relationship feature vector makes the number of subjects of each team accurately known, so that objective discrimination criteria can be adopted for event discrimination.

According to the inventions of claims 3 and 10, including the velocity vector in the silhouette feature vector makes the amount of movement of each silhouette region accurately known, so that objective discrimination criteria can be adopted for event discrimination. In addition, including the ensemble mean of the velocity vectors in the positional relationship feature vector makes it accurately known how fast the group of subjects is moving (its average velocity), so that objective discrimination criteria can be adopted for event discrimination.

According to the invention of claim 4, the positional relationship feature vector is converted into the discrimination feature vector using a weight matrix composed of the principal components obtained by principal component analysis, so that objective discrimination criteria can be adopted for event discrimination.

According to the invention of claim 5, the positional relationship feature vector is converted into the discrimination feature vector using a weight matrix based on the Fisher discriminant criterion, so that objective discrimination criteria can be adopted for event discrimination.

According to the invention of claim 6, using a probability model for event discrimination allows the event type to be discriminated objectively.

According to the invention of claim 7, using a Gaussian mixture model for event discrimination allows the event type to be discriminated objectively.

Next, embodiments of the present invention will be described in detail with reference to the drawings as appropriate.
(Configuration of the event discrimination device)
FIG. 1 is a block diagram of the event discrimination device. As shown in FIG. 1, the event discrimination device 1 discriminates the event type of input video data (the image of each shot (input image I)) and comprises silhouette extraction means 3, silhouette feature extraction means 5, formation quantification means (positional relationship feature vector generation means) 7, conversion means 9, and event discrimination means 11.

The silhouette extraction means 3 distinguishes background regions from non-background regions by comparing the pixel values of the pixels in the image of each shot constituting the video data (input image I) against a predetermined pixel value, and thereby extracts a silhouette image showing the silhouettes of the subjects (here, persons, that is, players), which are the non-background regions. Extraction of a silhouette image B from an input image I is described here with reference to FIG. 6.

Here, the silhouette extraction means 3 uses the background subtraction method to extract the silhouette image B from the input image I, and includes background difference calculation means 3a for that purpose. As illustrated in FIG. 6, it classifies the input image I into non-background and background regions and outputs the silhouette image B, which is a binary image.

In this case, the silhouette extraction means 3 has storage means (not shown) that stores a background image J containing no subjects. The background difference calculation means 3a evaluates the per-pixel difference between the input image I and the background image J, and the silhouette extraction means 3 thereby generates the silhouette image B.

The calculation performed by the background difference calculation means 3a is as follows. Let I(X, Y), J(X, Y), and B(X, Y) denote the pixel values of the input image I, the background image J, and the silhouette image B at pixel position (X, Y). The pixel values I(X, Y) and J(X, Y) are one-dimensional luminance values for a monochrome image, and vector values of two or more dimensions for a color or multiband image (for example, a three-dimensional color vector of red, green, and blue). The pixel value B(X, Y) takes either 0 or 1: the value 0 is set for a background pixel and the value 1 for a non-background pixel.

The background difference calculation means 3a can then obtain the silhouette image B by, for example, evaluating formula (1) with a threshold b of 0 or more.
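Formula (1) is not reproduced in this text. As a minimal sketch of the thresholding it describes, the following Python code assumes that a pixel is marked non-background when the magnitude of I(X, Y) − J(X, Y) exceeds b. The function names, the use of the Euclidean norm for vector pixels, and the strict inequality are assumptions made for illustration.

```python
import math

def pixel_difference(p, q):
    """Magnitude of the difference between two pixel values.

    Scalars (monochrome) use the absolute difference; tuples (color or
    multiband) use the Euclidean norm. The choice of norm is an assumption.
    """
    if isinstance(p, (int, float)):
        return abs(p - q)
    return math.sqrt(sum((a - c) ** 2 for a, c in zip(p, q)))

def silhouette_image(I, J, b):
    """Return B with B[Y][X] = 1 where |I - J| > b (non-background), else 0."""
    return [
        [1 if pixel_difference(pi, pj) > b else 0 for pi, pj in zip(row_i, row_j)]
        for row_i, row_j in zip(I, J)
    ]

# Monochrome example: only the bright pixel differs from the background.
B = silhouette_image([[10, 200], [12, 11]], [[10, 10], [10, 10]], b=5)
```

A larger b suppresses noise and shadows at the cost of missing low-contrast parts of a player.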

Alternatively, the silhouette extraction means 3 may extract the silhouette image B by the chroma key method instead of the background subtraction method. In that case, hardware chroma keying can be applied using a commercially available general-purpose chroma key device. Returning to FIG. 1.

The silhouette feature extraction means 5 extracts each silhouette region (simply connected region) S_p contained in the silhouette image B extracted by the silhouette extraction means 3, and extracts for each silhouette region S_p a feature vector (silhouette feature vector) g_p that includes at least coordinate information.

The subscript p attached to the silhouette region S_p is a number that distinguishes the simply connected regions contained in the silhouette image B from one another. The coordinate information is expressed as coordinate values (real coordinates) in a coordinate space whose values have been associated in advance with the space of the ground where the sport is played.

In addition to the real coordinates serving as coordinate information, the silhouette feature vector g_p may include zero or more of the following: the velocity, as the time derivative or time difference of the real coordinates; the estimated number of persons contained in the silhouette region S_p; and a team identifier (team affiliation information) that classifies the team to which the persons in the silhouette region S_p belong.

The detailed configuration of the silhouette feature extraction means 5 is now described with reference to FIGS. 2 and 3.
As shown in FIG. 2, the silhouette feature extraction means 5 takes the silhouette image B and the input image I as inputs and outputs the silhouette feature vectors {g_p}. It is composed of labeling means 13, centroid calculation means 15, back projection means 17, velocity estimation means 19, people-count estimation means 21, and color discrimination means 23.

The labeling means 13 labels each simply connected region contained in the silhouette image B so that the regions can be distinguished from one another. FIG. 7 illustrates this labeling: each simply connected region in the silhouette image B (the regions shown in white in FIG. 7) receives its own label. The simply connected regions obtained by the labeling means 13 are denoted S_p, with p = 1, 2, ..., P, where P is the total number of simply connected regions in the silhouette image B. Returning to FIG. 2.

The centroid calculation means 15 computes the centroid (ξ_p, η_p) of each simply connected region S_p labeled by the labeling means 13, using formula (2).

The centroid here is a single point that represents the simply connected region S_p on the image plane, namely the centroid of a plane figure. Another representative point may be used instead of the centroid, such as the center of a circle, ellipse, or polygon circumscribing the simply connected region S_p.
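The labeling and centroid steps can be sketched as follows. Formula (2) is not reproduced in this text, so the sketch assumes the centroid is the arithmetic mean of the pixel coordinates in the region. The use of 4-connectivity and the function names `label_regions` and `centroid` are likewise assumptions made for illustration.

```python
from collections import deque

def label_regions(B):
    """Split a binary image into connected regions of pixels with value 1.

    Returns a list of regions, each a list of (X, Y) pixel positions.
    4-connectivity is assumed here.
    """
    h, w = len(B), len(B[0])
    seen = [[False] * w for _ in range(h)]
    regions = []
    for y in range(h):
        for x in range(w):
            if B[y][x] != 1 or seen[y][x]:
                continue
            queue, region = deque([(x, y)]), []
            seen[y][x] = True
            while queue:  # breadth-first flood fill from the seed pixel
                cx, cy = queue.popleft()
                region.append((cx, cy))
                for nx, ny in ((cx + 1, cy), (cx - 1, cy), (cx, cy + 1), (cx, cy - 1)):
                    if 0 <= nx < w and 0 <= ny < h and B[ny][nx] == 1 and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            regions.append(region)
    return regions

def centroid(region):
    """Mean pixel position (xi_p, eta_p) of one region."""
    n = len(region)
    return (sum(x for x, _ in region) / n, sum(y for _, y in region) / n)

B = [[1, 1, 0, 0],
     [1, 1, 0, 1],
     [0, 0, 0, 1]]
centroids = [centroid(S_p) for S_p in label_regions(B)]
```

Here the image contains P = 2 regions, a 2×2 block and a vertical pair of pixels.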

The back projection means 17 determines which position in real space (real coordinates) corresponds to the centroid (ξ_p, η_p) obtained by the centroid calculation means 15. It considers a half-line r_p that starts at the first optical principal point of the camera that captured the video data and passes through the centroid (ξ_p, η_p) of the simply connected region S_p, finds the point where r_p intersects a surface Π placed virtually in real space in advance, and takes the coordinates of that intersection as the real coordinates (x_p, y_p, z_p).

The surface Π can be, for example, a surface at a height of H/2 above the ground (a plane if the ground is flat). Here H is the average height of a person (an athlete who may appear in the target image), for example H = 1.8 meters.
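For flat ground, the back projection reduces to intersecting a ray with a horizontal plane. The sketch below assumes a world frame whose z axis points up with z = 0 on the ground, and assumes the ray direction through the centroid has already been expressed in that frame (this requires the camera calibration, which is not covered here). The function name and argument layout are illustrative.

```python
def back_project(camera_pos, ray_dir, H=1.8):
    """Intersect the half-line from the camera with the plane z = H / 2.

    camera_pos: (x, y, z) of the first optical principal point.
    ray_dir:    direction of the half-line r_p in world coordinates.
    Returns the real coordinates (x_p, y_p, z_p), or None when the
    half-line never reaches the plane.
    """
    cx, cy, cz = camera_pos
    dx, dy, dz = ray_dir
    plane_z = H / 2.0
    if dz == 0:
        return None  # ray parallel to the plane
    t = (plane_z - cz) / dz
    if t <= 0:
        return None  # intersection lies behind the camera
    return (cx + t * dx, cy + t * dy, plane_z)

# Camera 11 m above the ground, looking forward and down (H = 2.0 for round numbers).
point = back_project((0.0, 0.0, 11.0), (1.0, 0.5, -1.0), H=2.0)
```

Placing the plane at half the average body height, rather than on the ground, matches the fact that the centroid of a standing player's silhouette sits near mid-body.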

The velocity estimation means 19 calculates the real velocity (u_p, v_p, w_p) from the history of the real coordinates (x_p, y_p, z_p) obtained by the back projection means 17. For example, it can calculate the real velocity (u_p, v_p, w_p) by formula (3) from the current real coordinates (x_p(t), y_p(t), z_p(t)) at time t and the stored real coordinates (x_p(t−ΔT), y_p(t−ΔT), z_p(t−ΔT)) from a time ΔT earlier.
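Formula (3) is not reproduced in this text. Given that the velocity is described as a time difference of the real coordinates, a plausible reading is a backward difference over ΔT, sketched below. The function name is illustrative.

```python
def real_velocity(pos_now, pos_prev, dT):
    """Backward-difference velocity (u_p, v_p, w_p).

    pos_now:  real coordinates at time t.
    pos_prev: real coordinates at time t - dT.
    """
    if dT <= 0:
        raise ValueError("dT must be positive")
    return tuple((a - b) / dT for a, b in zip(pos_now, pos_prev))

# A player who moved 2 m in x and -0.5 m in y over half a second.
velocity = real_velocity((10.0, 5.0, 0.9), (8.0, 5.5, 0.9), dT=0.5)
```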

Instead of using the back projection means 17 and the velocity estimation means 19, an extended Kalman filter 25 may be used, as shown in FIG. 3, to estimate the real coordinates (x_p, y_p, z_p) and the real velocity (u_p, v_p, w_p) from the centroid (ξ_p, η_p) (silhouette feature extraction means 5A).
In this case, the extended Kalman filter 25 of the silhouette feature extraction means 5A first defines the state vector s_p(t) at time t using formula (4).

The extended Kalman filter 25 of the silhouette feature extraction means 5A then models the state transition from the state vector s_p(t) at time t to the state vector s_p(t+ΔT) at time t+ΔT using formula (5).

In formula (5), φ(t, s_p(t)) is a vector function giving the nominal value of the state transition (the nominal state transition) from the state vector s_p(t) at time t to the state vector s_p(t+ΔT) at time t+ΔT, and ν(t) is the process noise added at time t.
For example, when the nominal state transition is modeled as uniform linear motion, formula (6) is used.
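A uniform linear motion model can be sketched as follows. Formulas (4) and (6) are not reproduced in this text, so the sketch assumes the state vector stacks position and velocity as s_p = (x, y, z, u, v, w); the ordering and the function name are assumptions.

```python
def nominal_transition(s, dT):
    """Constant-velocity prediction of the state after a time step dT.

    s = (x, y, z, u, v, w): position advances by velocity * dT,
    velocity is carried over unchanged.
    """
    x, y, z, u, v, w = s
    return (x + u * dT, y + v * dT, z + w * dT, u, v, w)

predicted = nominal_transition((0.0, 0.0, 0.9, 2.0, -1.0, 0.0), dT=0.5)
```

Deviations from this nominal motion, such as a player accelerating or turning, are absorbed by the process noise ν(t).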

The process noise is modeled, for example, as Gaussian noise with mean 0 and covariance matrix Σ_ν, using formula (7).

In formula (7), E[·] denotes the ensemble average, and the superscript T denotes the transpose of a vector or matrix.

Next, the extended Kalman filter 25 of the silhouette feature extraction means 5A writes the centroid of the simply connected region S_p at time t as o_p(t) = [ξ_p(t), η_p(t)]^T and defines the observation model using, for example, formula (8).

In formula (8), h(t, s) represents the ideal (noise-free) transformation that yields the centroid o_p(t) from the state s_p(t). For example, as shown in FIG. 9, h(t, s) can be the sequence of operations that considers a half-line r'_p running from the real coordinates corresponding to the state s_p(t) to the first optical principal point and takes the point where r'_p intersects the image plane as the centroid o_p(t).

In formula (8), μ(t) is the observation noise. For example, the observation noise μ(t) is modeled as Gaussian noise with mean 0 and covariance matrix Σ_μ, using formula (9).

Based on the models of formulas (4) through (9), the silhouette feature extraction means 5A configures the extended Kalman filter 25 and can thereby estimate the real coordinates (x_p, y_p, z_p) and the real velocity (u_p, v_p, w_p) from the centroid (ξ_p, η_p). Returning to FIG. 2.
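Formulas (5) and (7) through (9) are not reproduced in this text. Read together with the surrounding definitions, and assuming both noise terms enter additively as is usual for an extended Kalman filter, the state-space model would take roughly the following form. This is a reconstruction from the prose, not the patent's own typesetting.

```latex
\begin{align}
s_p(t+\Delta T) &= \phi\bigl(t,\, s_p(t)\bigr) + \nu(t), &
E\bigl[\nu(t)\,\nu(t)^{T}\bigr] &= \Sigma_{\nu},\\
o_p(t) &= h\bigl(t,\, s_p(t)\bigr) + \mu(t), &
E\bigl[\mu(t)\,\mu(t)^{T}\bigr] &= \Sigma_{\mu},
\end{align}
```

with ν(t) and μ(t) both zero-mean Gaussian, and o_p(t) = [ξ_p(t), η_p(t)]^T the observed centroid.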

The people-count estimation means 21 estimates, for each simply connected region S_p, the number of persons n_p contained in that region. Here, it can estimate n_p from the area A_p of the simply connected region S_p and the real coordinates (x_p, y_p, z_p). With the area A_p of the simply connected region S_p defined by formula (10), the number of persons n_p can be estimated using formula (11).

In formula (11), α is a positive constant, chosen so that the formula yields the number of persons n_p contained in the simply connected region S_p.

The color discrimination means 23 evaluates the colors of the input image I within the simply connected region S_p and determines the team to which the person corresponding to that region belongs. Here, it determines the team using the color histogram γ_p(c) of the input image I within the simply connected region S_p. With c denoting a color vector, the color histogram γ_p(c) can be expressed as in formula (12).

In formula (12), δ_{c,I} is the Kronecker delta, which takes the value 1 when c = I and the value 0 when c ≠ I.
In sports such as soccer and basketball, where the teams play on the left and right sides of the court, the identifier k_p of the team whose own half is the left side of the court is set to −1, and the identifier k_p of the team whose own half is the right side is set to +1.

The color discrimination means 23 holds, in storage means (not shown), a color template registered in advance for each uniform type ψ. For example, a color histogram Γ_ψ(c) can be used as the template for uniform type ψ.

The color discrimination means 23 then determines, by formula (13), which uniform type ψ the observed color histogram γ_p(c) most resembles, and denotes the result Ψ.

In formula (13), the expression in braces {} represents the distance between histograms; for example, the Bhattacharyya distance can be used.
The color discrimination means 23 outputs k_p = −1 if the resulting uniform type Ψ belongs to the left team (in soccer, the uniform of a left-team field player or the left-team goalkeeper), and outputs k_p = +1 if Ψ belongs to the right team (in soccer, the uniform of a right-team field player or the right-team goalkeeper).

The color discrimination means 23 may also take into account the number of persons n_p obtained by the people-count estimation means 21. For example, when n_p is about one person (for example, at least 0.5 and less than 1.5), it outputs k_p = −1 or k_p = +1 as described above, and otherwise it outputs k_p = 0. In other words, the color discrimination means 23 deliberately does not classify by team a simply connected region S_p in which several people overlap, and can treat such a region as containing persons of both teams in equal proportion. Returning to FIG. 1.
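The team decision described above can be sketched as a nearest-template search over color histograms. Formulas (12) and (13) are not reproduced in this text, so the sketch makes several assumptions: histograms are normalized to sum to 1, the Bhattacharyya distance is taken as −ln of the Bhattacharyya coefficient, and the template names (`left_field`, `right_field`) and the `team_of` lookup table are hypothetical.

```python
import math
from collections import Counter

def color_histogram(pixels):
    """Normalized histogram gamma_p(c) of the color vectors in one region."""
    counts = Counter(pixels)
    total = sum(counts.values())
    return {c: n / total for c, n in counts.items()}

def bhattacharyya_distance(h1, h2):
    """-ln of the Bhattacharyya coefficient of two normalized histograms."""
    bc = sum(math.sqrt(h1[c] * h2.get(c, 0.0)) for c in h1)
    return float("inf") if bc == 0 else -math.log(bc)

def team_identifier(pixels, templates, team_of, n_p):
    """Return k_p: -1 (left team), +1 (right team), or 0 (not classified)."""
    if not (0.5 <= n_p < 1.5):
        return 0  # several overlapping people: deliberately left unclassified
    gamma = color_histogram(pixels)
    best = min(templates, key=lambda psi: bhattacharyya_distance(gamma, templates[psi]))
    return team_of[best]

# Hypothetical templates: one pure-red and one pure-blue uniform.
templates = {"left_field": {(255, 0, 0): 1.0}, "right_field": {(0, 0, 255): 1.0}}
team_of = {"left_field": -1, "right_field": +1}
mostly_red = [(255, 0, 0)] * 9 + [(0, 0, 255)]
k_single = team_identifier(mostly_red, templates, team_of, n_p=1.0)
k_crowd = team_identifier(mostly_red, templates, team_of, n_p=2.3)
```

In practice the color vectors would be quantized into coarse bins before building the histograms, so that small lighting changes do not scatter a uniform's color across many bins.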

The silhouette feature extraction means 5 forms, for each region, a vector consisting of one or more of the components obtained above (the real coordinates (x_p, y_p, z_p), the real velocity (u_p, v_p, w_p), the estimated number of persons n_p, and the team identifier k_p), concatenates these vectors for all P simply connected regions, and outputs the result to the formation quantification means 7 as the silhouette feature vector σ.

The formation quantification means 7 obtains, from the silhouette feature vector σ produced by the silhouette feature extraction means 5, a formation feature vector f, which is a vector that quantifies the features of the formation, that is, the positional relationship of the persons.

For example, the formation feature vector f can list the number of persons present in each of the partial regions {D_q} into which the ground (playing field) is divided (q is a number distinguishing the partial regions from one another).

The formation feature vector f can also list the number of players of a specific team present in each of the partial regions {D_q} into which the ground (playing field) is divided. In the following description, the number of players belonging to team κ (κ = −1 for the left team, κ = +1 for the right team) present in partial region D_q is written N(q, κ).

As for the partial regions D_q, the playing field (in soccer, the soccer pitch, optionally including some out-of-bounds area) can be divided into a grid of Q cells as shown in FIG. 10, and the q-th cell (numbered in raster-scan order, for example with main scanning from left to right and sub-scanning from front to back) can be taken as D_q. The number of persons N(q, κ) present in partial region D_q can then be counted using formula (14).

In formula (14), d(k_p, κ) can be defined as a function such as formula (15).

The formation quantification means 7 has the function of extracting, from the simply connected regions {S_p}, only those S_p whose team identifier k_p in the silhouette feature vector σ matches the team κ being counted. In addition, by the definition of d(k, κ), when the team identifier k_p in the silhouette feature vector σ takes the value 0 (for example, when the region contains several people), the region is counted with a weight of 1/2.
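The counting rule can be sketched as follows. Formulas (14) and (15) are not reproduced in this text, so the sketch assumes d(k, κ) is 1 for a matching team, 1/2 for k = 0, and 0 otherwise, and that each region contributes its estimated head count n_p multiplied by that weight. Cells are indexed from q = 0 in raster order with the origin at the front-left corner of the field. All of these conventions, and the function names, are assumptions.

```python
def d(k, kappa):
    """Weight of a region with team identifier k when counting team kappa."""
    if k == kappa:
        return 1.0
    if k == 0:
        return 0.5  # unclassified region: half counted for each team
    return 0.0

def cell_index(x, y, E_x, E_y, n_x, n_y):
    """Raster-scan index q of the grid cell containing (x, y).

    Assumes 0 <= x <= E_x and 0 <= y <= E_y.
    """
    col = min(int(x / (E_x / n_x)), n_x - 1)
    row = min(int(y / (E_y / n_y)), n_y - 1)
    return row * n_x + col

def count_players(regions, kappa, E_x, E_y, n_x, n_y):
    """N(q, kappa) for every cell; regions are (x_p, y_p, n_p, k_p) tuples."""
    N = [0.0] * (n_x * n_y)
    for x, y, n_p, k_p in regions:
        N[cell_index(x, y, E_x, E_y, n_x, n_y)] += n_p * d(k_p, kappa)
    return N

regions = [(10, 10, 1, -1), (60, 10, 1, +1), (60, 40, 2, 0)]
N_left = count_players(regions, -1, E_x=100, E_y=50, n_x=2, n_y=2)
N_right = count_players(regions, +1, E_x=100, E_y=50, n_x=2, n_y=2)
```

The two-person unclassified region in the last cell contributes one player to each team's count.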

The formation quantification means 7 can also treat the partial regions D_q as fuzzy sets. In that case, with m_q(x, y) denoting the membership function of partial region D_q at coordinates (x, y), it can count the number of persons N(q, κ) present in partial region D_q using formula (16).

An example of how to set the partial regions D_q follows. For example, taking the ground (playing field) as a plane, as with the partial regions D_q shown in FIG. 11, a fuzzy set can be defined by formula (17) using a membership function m_q(x, y) centered at coordinates (G_x(q), G_y(q)) on that plane.

For example, when a region (playing field) of width E_x and depth E_y is covered by partial regions D_q formed by a total of Q = Δ_x × Δ_y fuzzy sets arranged in a grid of Δ_x cells across and Δ_y cells deep, the parameters given by formula (18) can be used.

In formula (18), q mod Δ_x denotes the remainder when q is divided by Δ_x, and ⌊x⌋ denotes the greatest integer not exceeding x.

For example, when the region obtained by enlarging a 105 m × 68 m soccer pitch to 125% along each side (E_x = 131.25, E_y = 85.00) is covered by partial regions D_q formed by a total of Q = 25 fuzzy sets, Δ_x = 5 across and Δ_y = 5 deep, the parameters can be expressed as in formula (19).

Formula (19) can be summarized as formula (20).

In what follows, to simplify the description, the index q of partial region D_q is expressed as the pair (q_x, q_y) using formula (21).

In formula (21), q_x takes integer values when Δ_x is odd and half-integer values when Δ_x is even; in either case q_x takes values from −(Δ_x − 1)/2 to +(Δ_x − 1)/2 inclusive.

Likewise, q_y takes integer values when Δ_y is odd and half-integer values when Δ_y is even; in either case q_y takes values from −(Δ_y − 1)/2 to +(Δ_y − 1)/2 inclusive.

FIG. 12 illustrates the relationship between the pair (q_x, q_y) and q for Δ_x = 5 and Δ_y = 5, and FIG. 13 illustrates it for Δ_x = 4 and Δ_y = 4. Although not illustrated, Δ_x may be odd with Δ_y even, or Δ_x even with Δ_y odd.
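Formulas (18) and (21) and FIGS. 12 and 13 are not reproduced in this text. One mapping consistent with the stated ranges (integers for odd Δ, half-integers for even Δ, spanning ±(Δ − 1)/2) is sketched below. It assumes q runs from 0 to Q − 1 in raster order and that the cell centers are measured from the center of the covered region; both conventions, and the function names, are assumptions.

```python
def centered_index(q, n_x, n_y):
    """Map raster index q (0-based) to the centered pair (q_x, q_y)."""
    q_x = (q % n_x) - (n_x - 1) / 2
    q_y = (q // n_x) - (n_y - 1) / 2
    return q_x, q_y

def cell_center(q, E_x, E_y, n_x, n_y):
    """Center (G_x(q), G_y(q)) of cell q, origin at the middle of the region."""
    q_x, q_y = centered_index(q, n_x, n_y)
    return q_x * E_x / n_x, q_y * E_y / n_y

# 105 m x 68 m pitch enlarged to 125% per side, covered by a 5 x 5 grid.
E_x, E_y = 105 * 1.25, 68 * 1.25
front_left = cell_center(0, E_x, E_y, 5, 5)
middle = cell_center(12, E_x, E_y, 5, 5)
```

With these numbers each cell is 26.25 m wide and 17 m deep, and the middle cell (q = 12) is centered on the center spot.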

The formation quantification means 7 can also include, as components of the formation feature vector f, some or all of the average velocity vector (U, V, W) of all players and the average velocity vectors (U_κ, V_κ, W_κ) of the players belonging to a specific team κ (κ ∈ {−1, +1}). In this case, the average velocity vector (U_κ, V_κ, W_κ) is defined using formula (22) or formula (23).

The formation quantification means 7 combines some or all of the quantities obtained above (the number of players N(q, κ) of team κ present in each partial region D_q, the average velocity vector (U, V, W) of all players, and the average velocity vectors (U_κ, V_κ, W_κ) of the players belonging to a specific team κ (κ ∈ {−1, +1})) into a single vector, the formation feature vector f.
For example, the formation feature vector f can be expressed as in formula (24) using all of these quantities, or as in formula (25) using some of them.

The conversion means 9 multiplies the formation feature vector f generated by the formation quantification means 7 by a preset matrix that weights the features (the weight matrix described below) and outputs the result as the discrimination feature vector χ.

Here, the conversion means 9 computes the product of the weight matrix Φ and the formation feature vector f and outputs the result as the discrimination feature vector χ. The discrimination feature vector χ can be expressed using formula (26).
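Since the text states that χ is the product of Φ and f, the conversion is a plain matrix-vector product, sketched below with Φ stored as a list of rows. The function name and the example weights are illustrative.

```python
def discrimination_vector(Phi, f):
    """Compute chi = Phi f, where Phi is an M x N matrix given as M rows."""
    if any(len(row) != len(f) for row in Phi):
        raise ValueError("each row of Phi must have the same length as f")
    return [sum(w * x for w, x in zip(row, f)) for row in Phi]

# Illustrative weights: row 0 averages four components, row 1 contrasts two.
Phi = [[0.25, 0.25, 0.25, 0.25],
       [1.0, 0.0, 0.0, -1.0]]
chi = discrimination_vector(Phi, [2.0, 0.0, 1.0, 1.0])
```

Each row of Φ produces one component of χ, so M rows reduce an N-dimensional f to an M-dimensional χ.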

For example, the weight matrix Φ can be determined in advance by principal component analysis of the formation feature vectors {f_i} obtained from a set of training input images {I_i}.
In this case, the training input images {I_i} (i ∈ {1, 2, ..., T_L}) are processed by the silhouette extraction means 3, the silhouette feature extraction means 5, and the formation quantification means 7 to obtain the formation feature vectors {f_i}.

Next, the covariance matrix Σ_f of the formation feature vectors {f_i} is obtained using formula (27).

Then the M eigenvectors of the covariance matrix Σ_f with the largest eigenvalues are obtained. Let β_m be the m-th eigenvector (m ∈ {1, 2, ..., M}). Finally, the weight matrix Φ is defined as the matrix whose rows are the transposes of the eigenvectors β_m, for example using formula (28).
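The principal component analysis step can be sketched in plain Python as follows. Formulas (27) and (28) are not reproduced in this text, so the sketch assumes the covariance is normalized by the number of samples. It finds the leading eigenvectors by power iteration with deflation, which is a stand-in chosen to keep the example dependency-free; a linear algebra library's symmetric eigensolver would normally be used. The function names are illustrative.

```python
import math

def covariance(vectors):
    """Sample covariance matrix (normalized by the number of samples)."""
    n, dim = len(vectors), len(vectors[0])
    mean = [sum(v[j] for v in vectors) / n for j in range(dim)]
    return [[sum((v[a] - mean[a]) * (v[b] - mean[b]) for v in vectors) / n
             for b in range(dim)] for a in range(dim)]

def pca_weight_matrix(vectors, M, iters=500):
    """Rows of the result are the M leading eigenvectors of the covariance."""
    S = covariance(vectors)
    dim = len(S)
    Phi = []
    for _ in range(M):
        # Fixed, non-uniform start vector; power iteration fails only if it
        # happens to be orthogonal to the eigenvector being sought.
        b = [float(j + 1) for j in range(dim)]
        norm = math.sqrt(sum(x * x for x in b))
        b = [x / norm for x in b]
        for _ in range(iters):
            nb = [sum(S[a][j] * b[j] for j in range(dim)) for a in range(dim)]
            norm = math.sqrt(sum(x * x for x in nb))
            if norm == 0:
                break
            b = [x / norm for x in nb]
        lam = sum(b[a] * sum(S[a][j] * b[j] for j in range(dim)) for a in range(dim))
        Phi.append(b)
        # Deflation: remove the component just found before the next pass.
        S = [[S[a][j] - lam * b[a] * b[j] for j in range(dim)] for a in range(dim)]
    return Phi

# Training vectors that vary only along the diagonal direction (1, 1).
Phi = pca_weight_matrix([(0.0, 0.0), (1.0, 1.0), (2.0, 2.0), (3.0, 3.0)], M=1)
```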

The weight matrix Φ can also be defined using a Fisher weight map (reference: Shinohara and Otsu, "Facial expression recognition from face images using Fisher weight maps," IEICE Technical Report PRMU2003-269, vol. 103, no. 737, pp. 79-84, 2003).

The procedure for determining the weight matrix Φ used by the conversion means 9 is now described with reference to FIG. 4. The device shown in FIG. 4, referred to as event discrimination device 1A, is the event discrimination device 1 of FIG. 1 with added means for determining the weight matrix Φ (classification means 27, variance calculation means 29, and generalized eigenvalue problem calculation means 31). Components identical to those of the event discrimination device 1 are given the same reference numerals, and their description is omitted.

First, as shown in FIG. 4, the event discrimination device 1A inputs the training video data (training input images {I_i}) to the silhouette extraction means 3, processes them with the silhouette extraction means 3 and the silhouette feature extraction means 5, and obtains the formation feature vectors {f_i}.

The classification means 27 sorts the formation feature vectors {f_i} by event type e, based on the input event-type ground-truth data {e_i} and the formation feature vectors {f_i}, and obtains the set F_e of formation feature vectors for each event type e using formula (29).

The variance calculation means 29 obtains the within-class variance Σ_W (among samples of the same event) and the between-class variance Σ_B (between different events) using formula (30).

In formula (30), |F| denotes the number of elements of a set F.
The generalized eigenvalue problem calculation means 31 determines the weight matrix Φ by solving formula (31), the generalized eigenvalue problem based on Fisher's discriminant criterion.

In formula (31), β is an eigenvector and λ is its eigenvalue. Letting β_m be the eigenvector with the m-th largest eigenvalue λ, the weight matrix Φ can be defined by formula (32).
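Formulas (30) through (32) are not reproduced in this text. For orientation, the textbook form of Fisher discriminant analysis that matches the quantities named above is given below. The normalization and the notation for the class means are assumptions, and the patent's own formulas may differ in such details.

```latex
\begin{align}
\bar f_e &= \frac{1}{|F_e|} \sum_{f \in F_e} f, \qquad
\bar f = \frac{1}{\sum_e |F_e|} \sum_e \sum_{f \in F_e} f,\\
\Sigma_W &= \frac{1}{\sum_e |F_e|} \sum_e \sum_{f \in F_e}
            (f - \bar f_e)(f - \bar f_e)^{T},\\
\Sigma_B &= \frac{1}{\sum_e |F_e|} \sum_e |F_e|\,
            (\bar f_e - \bar f)(\bar f_e - \bar f)^{T},\\
\Sigma_B\, \beta &= \lambda\, \Sigma_W\, \beta,\\
\Phi &= \begin{bmatrix} \beta_1 & \beta_2 & \cdots & \beta_M \end{bmatrix}^{T}.
\end{align}
```

Directions β with large λ spread the event classes far apart relative to the scatter within each class, which is why the eigenvectors are taken in descending order of eigenvalue.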

The weight matrix Φ can also be set without relying on the principal component analysis or Fisher's discriminant criterion described above.
For example, the designer can subjectively set a weight matrix that makes events easier to identify through weighting and combination of the components of the formation feature vector f. For example, a weight vector β_m1 that takes a linear combination of the relevant components of N(q, κ) so as to give the simple average of the number of players in the partial regions D_q at the four corners of the ground (playing field) can be defined using formulas (33), (34), and (35). Here the indices q of the four corner regions are denoted q_A for front left, q_B for front right, q_C for back left, and q_D for back right.

Equation (35) defines weight vectors β_m2 and β_m3 whose signs differ depending on the location on the ground (playing field).
These can then be combined as in the following equation (36) to define the weight matrix Φ.
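A hand-designed weight vector of the kind described above can be sketched as follows. Equations (33) to (36) are not reproduced in this text, so the flat layout of f (component index `q * num_teams + k` holding N(q, k)), the helper name `corner_average_weight`, and the example indices are hypothetical.

```python
import numpy as np

def corner_average_weight(num_regions, num_teams, corner_indices, team):
    """Weight vector that averages N(q, k) over the given corner regions.

    Assumes f is laid out so that f[q * num_teams + k] == N(q, k).
    """
    beta = np.zeros(num_regions * num_teams)
    for q in corner_indices:
        beta[q * num_teams + team] = 1.0 / len(corner_indices)
    return beta

# Hypothetical 3x2 grid of regions with corners q_A=0, q_B=2, q_C=3, q_D=5.
corners = [0, 2, 3, 5]
beta_team0 = corner_average_weight(6, 2, corners, team=0)
beta_team1 = corner_average_weight(6, 2, corners, team=1)
# Stack the designed weight vectors as rows to form a weight matrix.
phi_designed = np.vstack([beta_team0, beta_team1])
```

Each row of `phi_designed` then yields one component of the discrimination feature vector when multiplied by f.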

Furthermore, weight vectors β_● obtained by various methods (or designed subjectively), such as equation (27), equation (32), and equations (33) to (35), may be combined freely.

Further, the conversion means 9 may obtain the discrimination feature vector χ by a nonlinear transformation φ of the formation feature vector f, using the following equation (37).

In this case, letting f_n denote the n-th component of the N-dimensional formation feature vector f and χ_m the m-th component of the M-dimensional discrimination feature vector, the conversion means 9 can, for example, apply a nonlinear transformation that combines a quadratic form and a linear form of f, using the following equation (38).

In equation (38), a_● and b_● are tensors of order 3 and order 2, respectively, and each of their components is a constant. The description now returns to FIG. 1.
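The combination of a quadratic form and a linear form can be sketched in NumPy as follows. Equation (38) itself is not reproduced in this text, so the index order of the tensors (`a[m, i, j]`, `b[m, n]`) and the function name `quadratic_transform` are assumptions.

```python
import numpy as np

def quadratic_transform(f, a, b):
    """chi_m = sum_ij a[m, i, j] * f[i] * f[j] + sum_n b[m, n] * f[n].

    f: (N,) formation feature vector
    a: (M, N, N) constant order-3 tensor (quadratic part)
    b: (M, N) constant order-2 tensor (linear part)
    """
    return np.einsum("mij,i,j->m", a, f, f) + b @ f
```

Setting `a` to zero reduces this to the linear transformation by a weight matrix.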

The event discrimination means 11 discriminates the event type by referring to the discrimination feature vector χ obtained by the conversion means 9 and to a probability model of the occurrence probability of each event type into which events have been classified in advance, and outputs the result (discrimination result E).

Here, the event discrimination means 11 refers to a probability model of the occurrence probability P(e|χ) of event type e given the discrimination feature vector χ, obtains the most probable event type E using the following equation (39), and outputs it. Discriminating an event by referring to a probability model means performing an operation that selects a specific event based on the magnitudes of the occurrence probabilities of the events or on the shape of their distribution.

In equation (39), the occurrence probability P(e|χ) may be represented, for example, by a Gaussian mixture model for each event type e, using the following equation (40).

In equation (40), K_e is the number of mixture components of the Gaussian mixture model for event type e, w_{e,k} is the weight of the k-th Gaussian distribution in that model, μ_{e,k} is the mean vector of the k-th Gaussian distribution, and Σ_{e,k} is the covariance matrix of the k-th Gaussian distribution.
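The selection of the most probable event type from per-event Gaussian mixture models can be sketched as follows. Equations (39) and (40) are not reproduced in this text, so the score below (a weighted sum of Gaussian densities, maximized over e) is an assumed reading of them, and the names `gaussian_pdf`, `event_score`, and `discriminate` are hypothetical.

```python
import numpy as np

def gaussian_pdf(x, mean, cov):
    """Density of a multivariate Gaussian at x."""
    d = len(mean)
    diff = x - mean
    norm = np.sqrt((2 * np.pi) ** d * np.linalg.det(cov))
    return float(np.exp(-0.5 * diff @ np.linalg.solve(cov, diff)) / norm)

def event_score(x, weights, means, covs):
    """Weighted sum over the K_e Gaussians of one event type."""
    return sum(w * gaussian_pdf(x, m, c)
               for w, m, c in zip(weights, means, covs))

def discriminate(x, models):
    """models: dict mapping event type e -> (weights, means, covs).

    Returns the event type with the largest score.
    """
    return max(models, key=lambda e: event_score(x, *models[e]))
```

Here `models` would hold one triple (w_{e,k}, μ_{e,k}, Σ_{e,k}) per event type.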

The number of mixture components K_e may be a constant that differs by event type e, or a constant common to all event types (for example, 5).
The weights w_{e,k}, mean vectors μ_{e,k}, and covariance matrices Σ_{e,k} can be determined from training video data by the following procedure. In what follows, the weight w_{e,k} is expressed, using the following equation (41), as the product of a parameter ω_{e,k} and a parameter p_e.

The detailed configuration of the event discrimination means 11 is now described with reference to FIG. 5. As shown in FIG. 5, the event discrimination means 11 comprises EM algorithm execution means 33, prior probability calculation means 35, and multiplication means 37. It is assumed that a discrimination feature vector set X_e for each event type e is input to the event discrimination means 11.
The discrimination feature vector set X_e for each event type e is created by having the conversion means 9 convert every formation feature vector f in the formation feature vector set F_e for that event type into a discrimination feature vector χ.

For example, when the conversion means 9 creates the discrimination feature vector set X_e for each event type e by a linear transformation with the weight matrix Φ, it uses the following equation (42).

The EM algorithm execution means 33 determines the parameters ω_{e,k}, μ_{e,k}, and Σ_{e,k} so that the probability density function p(χ|e) defined by the following equation (43) approximates the density distribution of the set X_e.

In equation (43), the parameters ω_{e,k}, μ_{e,k}, and Σ_{e,k} can be obtained by the EM (Expectation-Maximization) algorithm.
The prior probability calculation means 35 determines the parameter p_e using the following equation (44).

The multiplication means 37 executes the calculation of equation (41) to obtain the weight w_{e,k} of the k-th Gaussian distribution in the Gaussian mixture model of event type e.

By performing the operations of equations (41) to (44) for all event types e, the event discrimination means 11 can determine all of the weights w_{e,k}, mean vectors μ_{e,k}, and covariance matrices Σ_{e,k} of the Gaussian mixture models shown in equation (40).
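The learning procedure of means 33, 35, and 37 can be sketched as follows. Equations (41) to (44) are not reproduced in this text: the plain EM loop, the random initialization, the regularization term `reg`, and the use of the relative frequency |X_e| / Σ|X_e'| as the prior p_e are assumptions, and the names `fit_gmm_em` and `train_event_models` are hypothetical.

```python
import numpy as np

def fit_gmm_em(x, k, iters=50, seed=0, reg=1e-6):
    """Fit a k-component Gaussian mixture to the rows of x by EM.

    Returns (omega, mu, sigma) with shapes (k,), (k, d), (k, d, d).
    """
    rng = np.random.default_rng(seed)
    n, d = x.shape
    mu = x[rng.choice(n, size=k, replace=False)].astype(float)
    base = np.cov(x, rowvar=False).reshape(d, d) + reg * np.eye(d)
    sigma = np.stack([base] * k)
    omega = np.full(k, 1.0 / k)
    for _ in range(iters):
        # E-step: responsibility of each component for each sample.
        resp = np.empty((n, k))
        for j in range(k):
            diff = x - mu[j]
            maha = np.einsum("ni,ni->n", diff @ np.linalg.inv(sigma[j]), diff)
            norm = np.sqrt((2 * np.pi) ** d * np.linalg.det(sigma[j]))
            resp[:, j] = omega[j] * np.exp(-0.5 * maha) / norm
        resp /= np.maximum(resp.sum(axis=1, keepdims=True), 1e-300)
        # M-step: re-estimate mixing ratios, means, and covariances.
        nk = resp.sum(axis=0)
        omega = nk / n
        mu = (resp.T @ x) / nk[:, None]
        for j in range(k):
            diff = x - mu[j]
            sigma[j] = (resp[:, j, None] * diff).T @ diff / nk[j] + reg * np.eye(d)
    return omega, mu, sigma

def train_event_models(sets, k):
    """sets: dict mapping event type e -> (n_e, d) array X_e."""
    total = sum(len(x) for x in sets.values())
    params = {}
    for e, x in sets.items():
        omega, mu, sigma = fit_gmm_em(x, k)
        p_e = len(x) / total          # assumed form of the prior p_e
        params[e] = (omega * p_e, mu, sigma)  # w_{e,k} = omega_{e,k} * p_e
    return params
```

Because each ω_{e,·} sums to one, the weights w_{e,·} of one event type sum to its prior p_e under this reading.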

According to the event discrimination device 1, a silhouette image B is extracted from a sports input image I, and the silhouette feature vector g, the formation feature vector f, and the discrimination feature vector χ are then obtained. This allows events to be discriminated by an objective criterion that is applicable to sports in general.

(Operation of the event discrimination device)
Next, the general operation of the event discrimination device 1 is described with reference to FIG. 14 (see also FIG. 1 as appropriate).
First, the event discrimination device 1 extracts a silhouette image with the silhouette image extraction means 3 (step S1). The silhouette feature amount extraction means 5 then extracts silhouette regions (simply connected regions) from the silhouette image and extracts silhouette feature vectors (step S2).

Subsequently, the event discrimination device 1 generates a formation feature vector with the formation quantification means 7 (step S3), and the conversion means 9 multiplies the formation feature vector by the weight matrix to output a discrimination feature vector (step S4). The event discrimination device 1 then discriminates the event type with the event discrimination means 11 on the basis of the discrimination feature vector (step S5).
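Steps S3 and S4 can be sketched in plain Python as follows, assuming that player positions are already available as ground coordinates and that the partial regions form a regular grid. The field size, the grid shape, and the names `count_players` and `apply_weight_matrix` are hypothetical and are not taken from the embodiment.

```python
def count_players(positions, field_w, field_h, nx, ny):
    """Count players in each cell of an nx-by-ny grid over the field (step S3).

    positions: iterable of (x, y) ground coordinates.
    Returns a flat list of nx * ny counts; points off the field are ignored.
    """
    counts = [0] * (nx * ny)
    for x, y in positions:
        if not (0 <= x <= field_w and 0 <= y <= field_h):
            continue
        qx = min(int(x / field_w * nx), nx - 1)
        qy = min(int(y / field_h * ny), ny - 1)
        counts[qy * nx + qx] += 1
    return counts

def apply_weight_matrix(phi, f):
    """Multiply the formation feature vector f by the weight matrix phi (step S4)."""
    return [sum(w * v for w, v in zip(row, f)) for row in phi]
```

The resulting vector would then be passed to the event discrimination step S5.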

An embodiment of the present invention has been described above, but the present invention is not limited to this embodiment. For example, although the embodiment has been described as the event discrimination device 1, it may also be configured as an event discrimination program, written in a computer language, that implements the processing of each component of the device 1. The same effects are obtained in that case.

FIG. 1 is a block diagram of an event discrimination device according to an embodiment of the present invention.
FIG. 2 is a block diagram showing the detailed configuration of the silhouette feature amount extraction means.
FIG. 3 is a block diagram showing another detailed configuration of the silhouette feature amount extraction means (the configuration using an extended Kalman filter).
FIG. 4 is a block diagram of the event discrimination device with added means for obtaining the weight matrix by learning.
FIG. 5 is a block diagram showing the detailed configuration of the event discrimination means (including the means for obtaining the parameters of the Gaussian mixture model by learning).
FIG. 6 is a diagram explaining the operation of the silhouette extraction means using the background subtraction method.
FIG. 7 is a diagram explaining the operation of the labeling means.
FIG. 8 is a diagram explaining the operation of the back projection means.
FIG. 9 is a diagram explaining the observation operation in the observation model of the extended Kalman filter.
FIG. 10 is a diagram showing an example of the arrangement of the partial regions.
FIG. 11 is a diagram showing an example of a membership function when the partial regions are fuzzy sets.
FIG. 12 is a diagram showing an example of the relationship between combinations of (q_x, q_y) and q.
FIG. 13 is a diagram showing another example of the relationship between combinations of (q_x, q_y) and q.
FIG. 14 is a flowchart showing the general overall operation of the event discrimination device.

Explanation of symbols

1 Event discrimination device
3 Silhouette extraction means
3a Background difference calculation means
5 Silhouette feature amount extraction means
7 Formation quantification means (positional relationship feature vector generation means)
9 Conversion means
11 Event discrimination means
13 Labeling means
15 Centroid calculation means
17 Back projection means
19 Velocity estimation means
21 Number-of-persons estimation means
23 Color discrimination means
25 Extended Kalman filter
27 Sorting means
29 Variance calculation means
31 Generalized eigenvalue problem calculation means
33 EM algorithm execution means
35 Prior probability calculation means
37 Multiplication means

Claims

An event discrimination device for discriminating an event that is a specific scene in sports video data, the device comprising:
silhouette image extraction means for distinguishing a background region from a non-background region on the basis of predetermined pixel values of the pixels in the images constituting the video data, and extracting a silhouette image in which the non-background region is treated as a silhouette region of a person;
silhouette feature amount extraction means for extracting a silhouette feature vector, which is a vector indicating feature amounts of the silhouette region and including at least coordinate information on the silhouette region contained in the silhouette image extracted by the silhouette image extraction means;
positional relationship feature vector generation means for generating, using the coordinate information of the silhouette feature vector extracted by the silhouette feature amount extraction means, a positional relationship feature vector, which is a vector quantifying the positional relationship of the persons;
conversion means for outputting a discrimination feature vector, which is a vector obtained by multiplying the positional relationship feature vector generated by the positional relationship feature vector generation means by a preset matrix that weights the feature amounts; and
event discrimination means for discriminating the event by using the discrimination feature vector output by the conversion means and referring to a probability model in which the occurrence probability of the event given the discrimination feature vector is modeled in advance for each event type,
wherein
the silhouette feature amount extraction means includes in the silhouette feature vector, as the coordinate information, coordinate values in a coordinate space in which the space of the ground where the sport is played is associated with coordinate values in advance, and
the positional relationship feature vector generation means counts, on the basis of the coordinate values included in the silhouette feature vector, the persons present in each of the local regions into which the ground space is divided, and outputs the count results included in the positional relationship feature vector.

An event discrimination device for discriminating an event that is a specific scene in sports video data, the device comprising:
silhouette image extraction means for distinguishing a background region from a non-background region on the basis of predetermined pixel values of the pixels in the images constituting the video data, and extracting a silhouette image in which the non-background region is treated as a silhouette region of a person;
silhouette feature amount extraction means for extracting a silhouette feature vector, which is a vector indicating feature amounts of the silhouette region and including at least coordinate information on the silhouette region contained in the silhouette image extracted by the silhouette image extraction means;
positional relationship feature vector generation means for generating, using the coordinate information of the silhouette feature vector extracted by the silhouette feature amount extraction means, a positional relationship feature vector, which is a vector quantifying the positional relationship of the persons;
conversion means for outputting a discrimination feature vector, which is a vector obtained by multiplying the positional relationship feature vector generated by the positional relationship feature vector generation means by a preset matrix that weights the feature amounts; and
event discrimination means for discriminating the event by using the discrimination feature vector output by the conversion means and referring to a probability model in which the occurrence probability of the event given the discrimination feature vector is modeled in advance for each event type,
wherein
the silhouette feature amount extraction means includes in the silhouette feature vector, and outputs, coordinate values in a coordinate space in which the space of the ground where the sport is played is associated with coordinate values in advance,
together with team information that identifies the team to which the person in each silhouette region belongs, obtained by evaluating color vector statistics for each silhouette region included in the silhouette image, and
the positional relationship feature vector generation means counts, for each team and on the basis of the coordinate values and team information included in the silhouette feature vector, the number of persons present in each of the local regions into which the ground space is divided, and outputs the count results included in the positional relationship feature vector.

The event discrimination device according to claim 1 or claim 2, wherein
the silhouette feature amount extraction means detects the movement of the silhouette region included in the silhouette image over elapsed time and outputs the amount of movement of the silhouette region as a velocity vector included in the silhouette feature vector, and
the positional relationship feature vector generation means obtains the ensemble average of the velocity vectors included in the silhouette feature vectors and outputs the result included in the positional relationship feature vector.

The event discrimination device according to any one of claims 1 to 3, wherein the conversion means outputs the discrimination feature vector by multiplying the positional relationship feature vector by a weight matrix composed of the group of principal components obtained by principal component analysis.

The event discrimination device according to any one of claims 1 to 3, wherein the conversion means outputs the discrimination feature vector by multiplying the positional relationship feature vector by a weight matrix based on Fisher's discriminant criterion.

The event discrimination device according to any one of claims 1 to 5, wherein the event discrimination means refers to a probability model in which the occurrence probability for the discrimination feature vector is modeled for each event type, and outputs the event type with the highest probability value.

The event discrimination device according to claim 6, wherein a Gaussian mixture model is used as the probability model.

An event discrimination program for causing a computer, in order to discriminate an event that is a specific scene in sports video data, to function as:
silhouette image extraction means for distinguishing a background region from a non-background region on the basis of predetermined pixel values of the pixels in the images constituting the video data, and extracting a silhouette image in which the non-background region is treated as a silhouette region of a person;
silhouette feature amount extraction means for extracting a silhouette feature vector, which is a vector indicating feature amounts of the silhouette region and including at least coordinate information on the silhouette region contained in the silhouette image extracted by the silhouette image extraction means;
positional relationship feature vector generation means for generating, using the coordinate information of the silhouette feature vector extracted by the silhouette feature amount extraction means, a positional relationship feature vector, which is a vector quantifying the positional relationship of the persons;
conversion means for outputting a discrimination feature vector, which is a vector obtained by multiplying the formation feature vector generated by the formation quantification means by a preset matrix that weights the feature amounts; and
event discrimination means for discriminating the event by using the discrimination feature vector output by the conversion means and referring to a probability model in which the occurrence probability of the event given the discrimination feature vector is modeled in advance for each event type,
wherein
the silhouette feature amount extraction means includes in the silhouette feature vector, as the coordinate information, coordinate values in a coordinate space in which the space of the ground where the sport is played is associated with coordinate values in advance, and
the positional relationship feature vector generation means counts, on the basis of the coordinate values included in the silhouette feature vector, the persons present in each of the local regions into which the ground space is divided, and outputs the count results included in the positional relationship feature vector.

An event discrimination program for causing a computer, in order to discriminate an event that is a specific scene in sports video data, to function as:
silhouette image extraction means for distinguishing a background region from a non-background region on the basis of predetermined pixel values of the pixels in the images constituting the video data, and extracting a silhouette image in which the non-background region is treated as a silhouette region of a person;
silhouette feature amount extraction means for extracting a silhouette feature vector, which is a vector indicating feature amounts of the silhouette region and including at least coordinate information on the silhouette region contained in the silhouette image extracted by the silhouette image extraction means;
positional relationship feature vector generation means for generating, using the coordinate information of the silhouette feature vector extracted by the silhouette feature amount extraction means, a positional relationship feature vector, which is a vector quantifying the positional relationship of the persons;
conversion means for outputting a discrimination feature vector, which is a vector obtained by multiplying the formation feature vector generated by the formation quantification means by a preset matrix that weights the feature amounts; and
event discrimination means for discriminating the event by using the discrimination feature vector output by the conversion means and referring to a probability model in which the occurrence probability of the event given the discrimination feature vector is modeled in advance for each event type,
wherein
the silhouette feature amount extraction means includes in the silhouette feature vector, and outputs, coordinate values in a coordinate space in which the space of the ground where the sport is played is associated with coordinate values in advance,
together with team information that identifies the team to which the person in each silhouette region belongs, obtained by evaluating color vector statistics for each silhouette region included in the silhouette image, and
the positional relationship feature vector generation means counts, for each team and on the basis of the coordinate values and team information included in the silhouette feature vector, the number of persons present in each of the local regions into which the ground space is divided, and outputs the count results included in the positional relationship feature vector.

The event discrimination program according to claim 8 or claim 9, wherein
the silhouette feature amount extraction means detects the movement of the silhouette region included in the silhouette image over elapsed time and outputs the amount of movement of the silhouette region as a velocity vector included in the silhouette feature vector, and
the positional relationship feature vector generation means obtains the ensemble average of the velocity vectors included in the silhouette feature vectors and outputs the result included in the positional relationship feature vector.