JP5565737B2 - Video event detection apparatus and operation method thereof

Info

Publication number: JP5565737B2
Application number: JP2012022970A
Authority: JP
Inventors: 聡嶌田; 豪東野; 啓樹靜谷; 美徳早川; 大三石; 文子今野
Original assignee: Tohoku University NUC; Nippon Telegraph and Telephone Corp; NTT Inc USA
Current assignee: Tohoku University NUC; NTT Inc; NTT Inc USA
Priority date: 2012-02-06
Filing date: 2012-02-06
Publication date: 2014-08-06
Anticipated expiration: 2032-02-06
Also published as: JP2013162344A

Description

The present invention relates to a video event detection apparatus, and a method of operating the same, that detect with high accuracy the timing at which a line starts to be written and/or the timing at which a line is erased at a location shown in a video file.

Non-Patent Document 1 is a prior-art document that detects the action of "writing on the board" in a video of a lecture from the amount of change in the characters and figures written on the blackboard.

The method of Non-Patent Document 1 is as follows.

First, frame images are sampled from the video at regular intervals, and the lecturer region in frame image F(n) is obtained by binarizing the difference between the (n−1)-th frame image F(n−1) and the n-th frame image F(n). This is done for every frame image, so that a lecturer region is detected for each.

Next, to obtain a lecturer-erased image G(n), in which the lecturer region has been erased from frame image F(n), the pixels of F(n) outside the lecturer region are copied into G(n), and the pixels of the previous lecturer-erased image G(n−1) that correspond to the lecturer region of F(n) are copied into G(n).

Next, a character difference image is obtained by binarizing the difference between the lecturer-erased images G(n) and G(n−1). That is, the character region that changed between frame image F(n−1) and frame image F(n) is detected.

Finally, the amount of change in the number of character pixels in the character difference image is obtained, and whether board writing is taking place is determined from that amount.

Yudai Shinogi, Hironobu Fujiyoshi: Automatic Generation of Lecture Video Considering Viewer's Attention from High Resolution Video, Journal of the Institute of Image Information and Television Engineers, Vol. 62, No. 2, pp. 240-246, 2008.

In the prior art, because the region that changes most in a lecture video is the lecturer region, the moving region obtained from the difference between frame images is detected as the lecturer region. However, although inter-frame differences can roughly detect the moving region (lecturer region), detecting it accurately is difficult. In particular, when the lecturer moves little, detecting the lecturer region is extremely difficult. Moreover, after the lecturer region is detected, a lecturer-erased image is generated, and the region that changed is obtained from the difference between that image and the previous lecturer-erased image and used as the character difference image; this step is susceptible to noise such as illumination changes and faint characters. Because of these problems, the amount of change in the number of character pixels sometimes cannot be obtained correctly, and the detection accuracy, for example for the start time of board writing, is low.

The present invention has been made in view of the above problems, and its object is to provide a video event detection apparatus, and a method of operating the same, that detect with high accuracy the timing at which a line starts to be written and/or the timing at which a line is erased at a location shown in a video file.

To solve the above problems, a first aspect of the present invention is a video event detection apparatus that detects the timing at which a line starts to be written and/or the timing at which a line is erased at a location shown in a video file, the apparatus comprising: a line erasure processing unit that, for each frame image sampled from the video file in playback order, erases lines from the frame image to generate a line-erased image; a foreground region detection unit that, for each sampled frame image, generates a foreground region image indicating the foreground region of the location shown in the frame image; a line image generation unit that, for each sampled frame image, generates a difference image between the frame image and the line-erased image generated from it, and generates a line image, which is an image containing the difference-image background region, that is, the part of the difference image outside the region indicated by the corresponding foreground region image; and a line image storage unit in which each line image is stored. For each frame image, when a line image generated from an earlier-played frame image is in the line image storage unit, the line image generation unit reads out that line image, extracts from it the line-image foreground region, that is, the part corresponding to the region indicated by the foreground region image of the later-played frame image, and includes that line-image foreground region in the line image for the later-played frame image. The apparatus further comprises an event detection unit that calculates a line amount indicating the amount of lines in each line image and, based on the change in the line amount over the frame images in playback order, detects the timing at which a line starts to be written and/or the timing at which a line is erased at the location shown in the video file.

For example, in the first aspect, the event detection unit calculates, for each of a plurality of successively played frame images that has an immediately preceding frame image, a line-amount increment obtained by subtracting the line amount of the immediately preceding frame image from the line amount of the frame image in question. If the line-amount increment of the last-played frame image of the plurality is greater than a preset threshold, the line-amount increments of all the other frame images are at or below that threshold, and the time difference between the playback timing of the first-played frame image and that of the last-played frame image is greater than a preset threshold, then the playback timing of the last-played frame image is taken as the timing at which a line starts to be written. If the line-amount decrement obtained by subtracting, from the line amount of a given frame image, the line amount of the frame image played immediately after it is greater than a preset threshold, then the playback timing of that immediately following frame image is taken as the timing at which the line is erased.
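The increment-based rule above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function and parameter names (`detect_events`, `inc_threshold`, `dec_threshold`, `quiet_time`) are hypothetical, and the per-frame line amount is assumed to be a simple count of line pixels in a binary line image, which the text does not specify.

```python
def line_amount(line_image):
    """Assumed measure: number of line pixels (1s) in a binary line image."""
    return sum(sum(row) for row in line_image)


def detect_events(times, amounts, inc_threshold, dec_threshold, quiet_time):
    """Return (write_start_times, erase_times).

    times[i]   : playback time of the i-th sampled frame
    amounts[i] : line amount of the i-th line image
    """
    write_starts, erases = [], []
    # Time of the first frame of the current run of small increments.
    # The very first frame has no increment, so it can open a run.
    quiet_since = times[0] if times else None
    for i in range(1, len(amounts)):
        delta = amounts[i] - amounts[i - 1]
        if delta > inc_threshold:
            # Large increment that ends a sufficiently long quiet run.
            if quiet_since is not None and times[i] - quiet_since > quiet_time:
                write_starts.append(times[i])
            quiet_since = None
        elif quiet_since is None:
            quiet_since = times[i]
        # Large decrement: the later frame marks the erasure.
        if -delta > dec_threshold:
            erases.append(times[i])
    return write_starts, erases


times = [0, 5, 10, 15, 20, 25, 30, 35]
amounts = [0, 0, 0, 0, 40, 90, 92, 10]
starts, erases = detect_events(times, amounts,
                               inc_threshold=20, dec_threshold=30, quiet_time=10)
```

In this toy sequence the line amount is flat for the first 15 seconds, jumps at t = 20, keeps growing, and collapses at t = 35, so one write start and one erasure are reported. Requiring a quiet run before the jump keeps continuous writing from being reported as repeated starts.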

For example, in the first aspect, the event detection unit calculates a cumulative line amount by accumulating the line amount over the frame images in playback order. If the rate of increase of the cumulative line amount over a plurality of successively played frame images is greater than a preset threshold, the cumulative line-amount increment obtained by subtracting, from the cumulative line amount of the first-played frame image of the plurality, the cumulative line amount of the frame image played immediately before it is at or below a preset threshold, and the cumulative line-amount increment obtained by subtracting the cumulative line amount of that first-played frame image from the cumulative line amount of the frame image played immediately after it is greater than that threshold, then the playback timing of the frame image played immediately after the first-played frame image is taken as the timing at which a line starts to be written. If the cumulative line-amount decrement obtained by subtracting, from the cumulative line amount of a given frame image, the cumulative line amount of the frame image played immediately after it is greater than a preset threshold, then the playback timing of that immediately following frame image is taken as the timing at which the line is erased.

A second aspect of the present invention is a method of operating a video event detection apparatus that detects the timing at which a line starts to be written and/or the timing at which a line is erased at a location shown in a video file. In the method, a line erasure processing unit of the apparatus, for each frame image sampled from the video file in playback order, erases lines from the frame image to generate a line-erased image; a foreground region detection unit of the apparatus, for each sampled frame image, generates a foreground region image indicating the foreground region of the location shown in the frame image; and a line image generation unit of the apparatus, for each sampled frame image, generates a difference image between the frame image and the line-erased image generated from it, generates a line image, which is an image containing the difference-image background region, that is, the part of the difference image outside the region indicated by the corresponding foreground region image, and stores the line image in a line image storage unit provided in the apparatus. For each frame image, when a line image generated from an earlier-played frame image is in the line image storage unit, the line image generation unit reads out that line image, extracts from it the line-image foreground region, that is, the part corresponding to the region indicated by the foreground region image of the later-played frame image, and includes that line-image foreground region in the line image for the later-played frame image. An event detection unit of the apparatus calculates a line amount indicating the amount of lines in each line image and, based on the change in the line amount over the frame images in playback order, detects the timing at which a line starts to be written and/or the timing at which a line is erased at the location shown in the video file.

For example, in the second aspect, the event detection unit calculates, for each of a plurality of successively played frame images that has an immediately preceding frame image, a line-amount increment obtained by subtracting the line amount of the immediately preceding frame image from the line amount of the frame image in question. If the line-amount increment of the last-played frame image of the plurality is greater than a preset threshold, the line-amount increments of all the other frame images are at or below that threshold, and the time difference between the playback timing of the first-played frame image and that of the last-played frame image is greater than a preset threshold, then the playback timing of the last-played frame image is taken as the timing at which a line starts to be written. If the line-amount decrement obtained by subtracting, from the line amount of a given frame image, the line amount of the frame image played immediately after it is greater than a preset threshold, then the playback timing of that immediately following frame image is taken as the timing at which the line is erased.

For example, in the second aspect, the event detection unit calculates a cumulative line amount by accumulating the line amount over the frame images in playback order. If the rate of increase of the cumulative line amount over a plurality of successively played frame images is greater than a preset threshold, the cumulative line-amount increment obtained by subtracting, from the cumulative line amount of the first-played frame image of the plurality, the cumulative line amount of the frame image played immediately before it is at or below a preset threshold, and the cumulative line-amount increment obtained by subtracting the cumulative line amount of that first-played frame image from the cumulative line amount of the frame image played immediately after it is greater than that threshold, then the playback timing of the frame image played immediately after the first-played frame image is taken as the timing at which a line starts to be written. If the cumulative line-amount decrement obtained by subtracting, from the cumulative line amount of a given frame image, the cumulative line amount of the frame image played immediately after it is greater than a preset threshold, then the playback timing of that immediately following frame image is taken as the timing at which the line is erased.

According to the present invention, the timing at which a line starts to be written and/or the timing at which a line is erased at a location shown in a video file can be detected with high accuracy.

FIG. 1 is a functional block diagram showing the configuration of the video event detection apparatus according to the present embodiment.

FIG. 2 is a flowchart showing the operation method of the video event detection apparatus.

FIG. 3(a) is an example of a frame image, and FIG. 3(b) is an example of a line-erased image.

FIG. 4(a) is an example of a binarized image, and FIG. 4(b) shows the position of the blackboard in the line-erased image.

FIG. 5 shows an example of how a line image is generated.

FIG. 6 is a diagram used to explain Example 1 of the event detection unit 16.

FIG. 7 is a diagram used to explain Example 2 of the event detection unit 16.

Embodiments of the present invention are described below with reference to the drawings.

FIG. 1 is a functional block diagram showing the configuration of the video event detection apparatus according to the present embodiment. FIG. 2 is a flowchart showing the operation method of the video event detection apparatus.

In FIG. 1, the video event detection apparatus 1 detects the timing at which a line starts to be written and the timing at which a line is erased at a location shown in a video file. The location is, for example, a classroom, and the lines are written on a blackboard in chalk, for example by a lecturer as characters and figures. The lines are erased with, for example, a blackboard eraser.

The video event detection apparatus 1 comprises: a video acquisition unit 11 that reads a video file from a storage device or the like (not shown); a line erasure processing unit 12 that erases lines from a frame image of the video file to generate a line-erased image; a foreground region detection unit 13 that generates a foreground region image indicating the foreground region of the location shown in the frame image; a line image generation unit 14 that uses the frame image, the line-erased image, and the foreground region image to generate an image called a line image; a line image storage unit 15 that stores line images; and an event detection unit 16 that calculates a line amount indicating the amount of lines in each line image and, based on the change in the line amount over the frame images in playback order, detects the timing at which a line starts to be written and the timing at which a line is erased. As the detection result, the event detection unit 16 sets, for example, an index in the video file. The detection result may also be output in another way, for example by displaying it on a display device (not shown).

The video acquisition unit 11 reads a video file from a storage device or the like (not shown), samples frame images from the video file in playback order, and outputs each sampled frame image to the line erasure processing unit 12, the foreground region detection unit 13, and the line image generation unit 14. The sampling rate, which determines which frame images are sampled, is set in advance, and the video acquisition unit 11 samples according to it.

In step 101 of FIG. 2, the video acquisition unit 11 samples and outputs the n-th frame image.

Returning to FIG. 1, each time a frame image is output, that is, at each sampling, the line erasure processing unit 12 erases the lines from the frame image to generate the resulting image (called a line-erased image) and outputs the line-erased image to the foreground region detection unit 13.

In step 102 of FIG. 2, the line erasure processing unit 12 generates and outputs the line-erased image corresponding to the n-th frame image, which is the frame image currently being processed.

A line-erased image can be generated (that is, lines can be erased) by, for example, morphological smoothing. FIG. 3(a) is an example of a frame image in which a teacher is writing lines such as characters and figures on a blackboard. Applying a minimum filter to this frame image and then a maximum filter yields the line-erased image shown in FIG. 3(b), from which lines such as characters have been removed. FIG. 3 shows the result for high-lightness characters written on a low-lightness blackboard; when characters are written on a whiteboard with a black pen, the maximum filter should be applied first, followed by the minimum filter.
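The filter sequence described above (a minimum filter followed by a maximum filter, which is a grayscale morphological opening, or the reverse order, a closing) can be sketched in plain Python. This is only an illustration under assumptions: images are lists of rows of gray values, the window size of 3 is arbitrary, and the names `erase_lines` and `_window_filter` are hypothetical. A practical implementation would more likely use an image-processing library.

```python
def _window_filter(img, size, pick):
    """Apply `pick` (min or max) over a size x size window around each pixel."""
    h, w = len(img), len(img[0])
    r = size // 2
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            # The window is clipped at the image border.
            window = [img[j][i]
                      for j in range(max(0, y - r), min(h, y + r + 1))
                      for i in range(max(0, x - r), min(w, x + r + 1))]
            row.append(pick(window))
        out.append(row)
    return out


def erase_lines(frame, size=3, bright_lines=True):
    """Return a line-erased image.

    bright_lines=True  : bright chalk on a dark board -> min filter, then max filter
    bright_lines=False : dark pen on a whiteboard     -> max filter, then min filter
    """
    first, second = (min, max) if bright_lines else (max, min)
    return _window_filter(_window_filter(frame, size, first), size, second)


# Dark board (value 20) with a one-pixel-wide bright stroke (value 200).
blackboard = [[20] * 7 for _ in range(7)]
for x in range(1, 6):
    blackboard[3][x] = 200
erased_black = erase_lines(blackboard)

# Bright whiteboard (value 230) with a one-pixel-wide dark stroke (value 30).
whiteboard = [[230] * 7 for _ in range(7)]
for x in range(1, 6):
    whiteboard[3][x] = 30
erased_white = erase_lines(whiteboard, bright_lines=False)
```

Structures thinner than the window, such as chalk strokes, disappear, while larger regions such as the board or the lecturer survive. The window size therefore has to exceed the stroke width in the sampled frames.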

Returning to FIG. 1, each time a frame image and a line-erased image are output, that is, at each sampling, the foreground region detection unit 13 generates a foreground region image indicating the foreground region of the location shown in the frame image, and outputs the line-erased image and the foreground region image to the line image generation unit 14.

The foreground region image is, for example, a binarized image showing the outline of the lecturer and the like, as shown in FIG. 4(a).

Here, when the position of the blackboard is known in advance, for example, the foreground region detection unit 13 stores that position, detects the color of the line-erased image at that position (the region labeled a1 in FIG. 4(b)), detects regions whose color is not similar to that color as the foreground region, and generates a foreground region image indicating that region. Alternatively, the color of the blackboard may be stored in advance.

Information indicating the color of the blackboard (color information) may be obtained and stored as follows. For example, the frequency distribution of colors at several places on the blackboard is obtained, and the distribution of colors whose frequency is at or above a predetermined value is used as the color information. Alternatively, a normal distribution, or a mixture of normal distributions, of the blackboard color in color space is used as the color information. As a further alternative, the colors at several places on the blackboard are clustered, and information indicating the color at the cluster center is used as the color information.

Whether two colors are similar may be determined from the distance between the stored blackboard color (color information) and the color (color information) at the position being examined.
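A distance-based similarity test of this kind might look like the following sketch. It makes several assumptions the text leaves open: colors are RGB triples, the board color is the mean over a known board region, the distance is Euclidean, and the threshold `max_distance` is an arbitrary illustrative value. All function names are hypothetical.

```python
def mean_color(image, region):
    """Average color inside region = (top, left, bottom, right), end-exclusive."""
    top, left, bottom, right = region
    pixels = [image[y][x] for y in range(top, bottom) for x in range(left, right)]
    return tuple(sum(p[c] for p in pixels) / len(pixels) for c in range(3))


def color_distance(c1, c2):
    """Euclidean distance between two colors."""
    return sum((a - b) ** 2 for a, b in zip(c1, c2)) ** 0.5


def foreground_mask(image, board_color, max_distance=60.0):
    """True where the pixel color is NOT similar to the board color."""
    return [[color_distance(px, board_color) > max_distance for px in row]
            for row in image]


board = (30, 80, 50)        # dark green board
person = (200, 170, 150)    # lecturer
image = [
    [board, board, board, board],
    [board, (35, 85, 55), person, board],
    [board, board, person, board],
]
board_color = mean_color(image, (0, 0, 1, 4))   # known board region: top row
mask = foreground_mask(image, board_color)
```

The slightly different board pixel `(35, 85, 55)` stays in the background because its distance to the board color is small, while the lecturer pixels are marked as foreground.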

If the blackboard is expected to occupy a larger area than the foreground, the blackboard region may be detected automatically or set manually, and the color (or color distribution) that occurs most frequently over that whole region may be taken as the blackboard color (or color distribution).

The above describes separation based on color information, but a color space that includes lightness, such as Lab or HSV, may also be used.

In step 103 of FIG. 2, the foreground region detection unit 13 determines whether the n-th frame image is the first sampled frame image. If it is (YES), the unit detects the color at the predetermined position of the line-erased image (the blackboard color) (step 104), generates a foreground region image based on that color, and outputs it together with the line-erased image (step 105). If the n-th frame image is not the first sampled frame image (step 103: NO), the unit generates a foreground region image based on the color detected and stored in step 104 and outputs it together with the line-erased image (step 105).

Returning to FIG. 1, each time a frame image, a line-erased image, and a foreground region image are output, that is, at each sampling of a frame image, the line image generation unit 14 generates a line image and outputs it to the line image storage unit 15 and the event detection unit 16. The line image storage unit 15 stores the line image.

Specifically, as shown in FIG. 5, the line image generation unit 14 first generates a difference image between the output frame image (frame image F(n) in the figure) and the line-erased image generated from F(n) (the output line-erased image, M(n) in the figure).

Next, the line image generation unit 14 binarizes the difference image to generate a binarized image (B(n) in the figure).

Next, the line image generation unit 14 extracts from the binarized image B(n) the region outside the region indicated by the output foreground region image (called the difference-image background region).
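The three steps just described (difference image, binarization, extraction of the difference-image background region) can be sketched as follows. The absolute difference, the fixed threshold of 50, and the function names are assumptions for illustration; the text does not fix how the difference or the binarization threshold is chosen.

```python
def difference_image(frame, erased):
    """Per-pixel absolute difference between F(n) and the line-erased image M(n)."""
    return [[abs(f - e) for f, e in zip(frow, erow)]
            for frow, erow in zip(frame, erased)]


def binarize(diff, threshold=50):
    """Binarized image B(n): 1 where the difference exceeds the threshold."""
    return [[1 if v > threshold else 0 for v in row] for row in diff]


def background_part(binary, fg_mask):
    """Keep B(n) outside the foreground region; zero inside it."""
    return [[0 if fg else b for b, fg in zip(brow, frow)]
            for brow, frow in zip(binary, fg_mask)]


F, T = False, True
frame = [            # F(n): chalk stroke in column 1, lecturer in column 3
    [20, 200, 20, 20],
    [20, 200, 20, 180],
    [20, 20, 20, 100],
]
erased = [           # M(n): stroke removed, thin detail on the lecturer smoothed
    [20, 20, 20, 20],
    [20, 20, 20, 100],
    [20, 20, 20, 100],
]
fg_mask = [          # foreground region image for F(n)
    [F, F, F, F],
    [F, F, F, T],
    [F, F, F, T],
]
binary = binarize(difference_image(frame, erased))
background = background_part(binary, fg_mask)
```

The thin bright detail on the lecturer survives binarization but is dropped by the foreground mask, so only the chalk stroke on the board remains.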

Next, if there is no frame image played before the current frame image, that is, if the current frame image is the first sampled frame image, the line image generation unit 14 generates a line image (C(n) in the figure) that contains the difference-image background region and has no characters or the like in the region corresponding to the foreground region image.

Next, the line image generation unit 14 stores the line image C(n) in the line image storage unit 15.

On the other hand, if there is a frame image played before the current frame image, that is, if the current frame image is not the first sampled frame image, the line image generation unit 14 reads from the line image storage unit 15 the line image generated from the previously sampled, earlier-played frame image (C(n−1) in the figure). In other words, the line image storage unit 15 outputs the line image C(n−1) to the line image generation unit 14 in response to a request from it.

Next, the line image generation unit 14 extracts from the line image C(n−1) the region corresponding to the region indicated by the output foreground region image, that is, the foreground region image for frame image F(n) (this region is called the line-image foreground region).

Next, the line image generation unit 14 generates a line image (C(n) in the figure) consisting of the difference-image background region and the line-image foreground region. The line image C(n) is stored in the line image storage unit 15 and is used as the line image C(n−1) when the next line image is generated.
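The composition of C(n) from B(n), the foreground region image, and C(n−1) can be sketched as below. This is a hedged illustration with a hypothetical function name; images are binary lists of rows, and the foreground region image is represented as a boolean mask.

```python
def make_line_image(binary, fg_mask, prev_line_image=None):
    """Build C(n) from B(n), the foreground mask for F(n), and C(n-1) if any."""
    line_image = []
    for y, (brow, frow) in enumerate(zip(binary, fg_mask)):
        row = []
        for x, (b, fg) in enumerate(zip(brow, frow)):
            if not fg:
                row.append(b)                      # difference-image background region
            elif prev_line_image is not None:
                row.append(prev_line_image[y][x])  # line-image foreground region of C(n-1)
            else:
                row.append(0)                      # first frame: no lines behind foreground
        line_image.append(row)
    return line_image


F, T = False, True

# First sampled frame: a stroke in columns 0..2, nobody in front of the board.
c1 = make_line_image([[1, 1, 1, 0]], [[F, F, F, F]])

# Next frame: the lecturer covers columns 1..2 while adding a stroke in column 3.
c2 = make_line_image([[1, 0, 0, 1]], [[F, T, T, F]], prev_line_image=c1)
```

Because the occluded columns are taken from C(n−1), the strokes hidden behind the lecturer remain in C(n), so the line amount does not drop merely because the lecturer steps in front of the writing.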

さて、フレーム画像が最初にサンプリングされたフレーム画像である場合において、線画像は、２値化画像Ｂ（ｎ）の一部である差分画像背景領域を含んでいる。 When the frame image is a frame image sampled first, the line image includes a difference image background region that is a part of the binarized image B (n).

一方、フレーム画像が最初にサンプリングされたフレーム画像でない場合においては、差分画像背景領域および線画像前景領域とからなる画像が線画像である。線画像は、差分画像背景領域を含んでいる。 On the other hand, when the frame image is not the first sampled frame image, the image composed of the difference image background region and the line image foreground region is a line image. The line image includes a difference image background area.

つまり、線画像生成部１４は、各フレーム画像のサンプリングごとに、当該フレーム画像および当該フレーム画像から生成された線消去画像の差分画像を生成し、当該差分画像の、対応する前景領域画像により示される領域以外に該当する差分画像背景領域を含む画像である線画像を生成する。また、線画像生成部１４は、特に、当該フレーム画像より先に再生されるフレーム画像を用いて生成された線画像が線画像記憶部１５にある場合は、当該線画像を読み出し、当該線画像から、当該後に再生されるフレーム画像に対応する前景領域画像により示される領域に該当する線画像前景領域を抜き出し、当該後に再生されるフレーム画像に対応する線画像に当該線画像前景領域を含めるのである。 In short, for each sampled frame image, the line image generation unit 14 generates a difference image between the frame image and the line erased image generated from it, and generates a line image, which is an image including the difference image background region, that is, the part of the difference image outside the region indicated by the corresponding foreground region image. In particular, when the line image storage unit 15 holds a line image generated using a frame image reproduced earlier, the line image generation unit 14 reads that line image, extracts from it the line image foreground region corresponding to the region indicated by the foreground region image of the later frame image, and includes that line image foreground region in the line image corresponding to the later frame image.
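The composition rule for the line image C (n) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: images are represented as nested lists of truthy/falsy pixel values, and the function name `compose_line_image` and its parameter names are hypothetical, not taken from the patent.

```python
def compose_line_image(binarized_diff, foreground_mask, prev_line_image=None):
    """Build line image C(n) from B(n), the foreground mask, and C(n-1).

    binarized_diff  : B(n), truthy where the frame differs from its line-erased image
    foreground_mask : truthy inside the foreground region (e.g. the lecturer)
    prev_line_image : C(n-1), or None for the first sampled frame (assumed convention)
    """
    rows = len(binarized_diff)
    cols = len(binarized_diff[0]) if rows else 0
    line_image = [[False] * cols for _ in range(rows)]
    for y in range(rows):
        for x in range(cols):
            if foreground_mask[y][x]:
                # Occluded pixel: reuse C(n-1) if available, otherwise leave it empty.
                if prev_line_image is not None:
                    line_image[y][x] = bool(prev_line_image[y][x])
            else:
                # Visible background pixel: take it from the binarized difference image.
                line_image[y][x] = bool(binarized_diff[y][x])
    return line_image
```

For the first sampled frame, the foreground region simply stays empty; for later frames, whatever was recorded there in C (n−1) is carried over, so lines hidden behind the foreground are not counted as erased.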

図２のステップ１０６で、線画像生成部１４は、図５における差分画像および２値化画像Ｂ（ｎ）を生成する。ステップ１０７で、線画像生成部１４は、例えば、図５における線画像Ｃ（ｎ）の生成、出力を行う。これらは、フレーム画像が最初にサンプリングされたフレーム画像でない場合の処理であり、そうでない場合については、図示省略する。 In step 106 of FIG. 2, the line image generation unit 14 generates the difference image and the binarized image B (n) shown in FIG. 5. In step 107, the line image generation unit 14 generates and outputs, for example, the line image C (n) shown in FIG. 5. These steps are the processing for a frame image that is not the first sampled frame image; the case of the first sampled frame image is not illustrated.

図１に戻り、イベント検出部１６は、サンプリングごとに出力される各線画像内の線の量を示す線量を計算し、各フレーム画像の再生順での当該線量の変化に基づいて、映像ファイルに映った場所で線が書かれ始めるタイミングおよび線が消去されるタイミングを検出する。 Returning to FIG. 1, the event detection unit 16 calculates a dose indicating the amount of lines in each line image output at each sampling and, based on the change in the dose in the reproduction order of the frame images, detects the timing at which a line starts to be written and the timing at which a line is erased at the location shown in the video file.

つまり、イベント検出部１６は、例えば、教室での授業が開始され、黒板に文字などが書かれ始めるタイミング、授業が終了し、黒板から文字などが消去されるタイミングを検出する。 In other words, the event detection unit 16 detects, for example, the timing at which a lesson in the classroom is started and characters and the like start to be written on the blackboard, and the timing at which the lesson is completed and characters and the like are erased from the blackboard.

なお、イベント検出部１６は、一方のタイミングを検出するように構成してもよい。 Note that the event detection unit 16 may be configured to detect only one of the two timings.

図２のステップ１０８で、イベント検出部１６は、上記のように、線画像の線量を計算する。ステップ１０９では、イベント検出部１６は、各フレーム画像の再生順での線量の変化に基づいて、該当のフレーム画像の再生のタイミングは、線が書かれ始めるタイミングか否かを判定し、線が書かれ始めるタイミングならば（ＹＥＳ）、ステップ１１０で、その旨のインデクスを映像ファイルに設定する。線が書かれ始めるタイミングでなければ（ＮＯ）、ステップ１１１で、イベント検出部１６は、該当のフレーム画像の再生のタイミングは、線が消去されるタイミングか否かを判定する。イベント検出部１６は、線が消去されるタイミングならば（ＹＥＳ）、ステップ１１２で、その旨のインデクスを映像ファイルに設定する。ステップ１１２の処理が終了したなら、または、ステップ１１１で、該当のフレーム画像の再生のタイミングは、線が消去されるタイミングでないと判定されたなら（ＮＯ）、制御は、ステップ１０１に戻り、次のフレーム画像等について、同様の処理が行われる。なお、図２のフローチャートの処理は、最後にサンプリングされたフレーム画像等の処理が終わった際に終了する。 In step 108 of FIG. 2, the event detection unit 16 calculates the dose of the line image as described above. In step 109, based on the change in dose in the reproduction order of the frame images, the event detection unit 16 determines whether the reproduction timing of the current frame image is the timing at which a line starts to be written; if so (YES), it sets an index to that effect in the video file in step 110. If not (NO), the event detection unit 16 determines in step 111 whether the reproduction timing of the current frame image is the timing at which a line is erased; if so (YES), it sets an index to that effect in the video file in step 112. When step 112 is completed, or when step 111 determines that the reproduction timing of the current frame image is not the timing at which a line is erased (NO), control returns to step 101 and the same processing is performed for the next frame image. The processing of the flowchart in FIG. 2 ends when the last sampled frame image has been processed.

ここで、タイミングの検出についての一例を説明する。 Here, an example of timing detection will be described.

つまり、イベント検出部１６は、順次に再生される複数のフレーム画像のそれぞれ（対象のフレーム画像）につき、当該対象のフレーム画像の直前に再生されるフレーム画像がある場合には当該対象のフレーム画像に対応する線量から当該直前に再生されるフレーム画像に対応する線量を減じた線量増分を計算する。イベント検出部１６は、当該複数のフレーム画像で最も後に再生されるフレーム画像に対応する線量増分が予め設定されたしきい値より大きく且つ、当該最も後に再生されるフレーム画像以外のフレーム画像に対応する線量増分が当該しきい値以下であり且つ、当該最も先に再生されるフレーム画像の再生のタイミングと当該最も後に再生されるフレーム画像の再生のタイミングとの間の時間差が予め設定されたしきい値より大きいなら、当該最も後に再生されるフレーム画像の再生のタイミングを、線が書かれ始めるタイミングとする。 That is, for each of a plurality of sequentially reproduced frame images (the target frame image), when there is a frame image reproduced immediately before the target frame image, the event detection unit 16 calculates a dose increment by subtracting the dose corresponding to that immediately preceding frame image from the dose corresponding to the target frame image. If the dose increment corresponding to the frame image reproduced last among the plurality of frame images is larger than a preset threshold value, the dose increments corresponding to the frame images other than the one reproduced last are equal to or less than that threshold value, and the time difference between the reproduction timing of the frame image reproduced first and the reproduction timing of the frame image reproduced last is larger than a preset threshold value, the event detection unit 16 takes the reproduction timing of the frame image reproduced last as the timing at which a line starts to be written.

また、イベント検出部１６は、あるフレーム画像に対応する線量から当該フレーム画像の直後に再生されるフレーム画像に対応する線量を減じた線量減分が予め設定されたしきい値より大きいなら、当該直後に再生されるフレーム画像の再生のタイミングを線が消去されるタイミングとする。 Further, the event detection unit 16 determines that the dose decrement obtained by subtracting the dose corresponding to the frame image reproduced immediately after the frame image from the dose corresponding to a certain frame image is larger than a preset threshold value. The reproduction timing of the frame image reproduced immediately after is set as the timing at which the line is deleted.
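The increment and decrement rules above can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `detect_events`, the parameter names (`th1`, `th2`, `t1`), and the handling of a "quiet run" are assumptions. In particular, it assumes that the window of frames is the maximal run of consecutive frames whose dose increment is at or below the threshold (or undefined, for the first frame), and that a frame is tested for erasure only when it is not a write start.

```python
def detect_events(times, doses, th1, th2, t1):
    """Return (write_start_times, erase_times).

    times : reproduction time of each sampled frame, in playback order
    doses : dose (amount of lines) of the line image for each frame
    th1   : threshold on the dose increment
    th2   : threshold on the dose decrement
    t1    : minimum duration of the quiet period before a write start
    """
    write_starts, erases = [], []
    quiet_start = 0  # index of the earliest frame in the current quiet run
    for k in range(1, len(doses)):
        increment = doses[k] - doses[k - 1]
        if increment > th1:
            # Writing is in progress. It is a *start* only if it follows
            # a quiet run that lasted longer than t1.
            if quiet_start is not None and times[k] - times[quiet_start] > t1:
                write_starts.append(times[k])
            quiet_start = None
        else:
            if quiet_start is None:
                quiet_start = k
            # The dose decrement is the negated increment.
            if -increment > th2:
                erases.append(times[k])
    return write_starts, erases
```

With hypothetical data in which the dose jumps at the sixth frame after a long quiet period, jumps again at the eleventh frame after only a short pause, and collapses at the fifteenth frame, this sketch reports one write start (the sixth frame) and one erasure (the fifteenth frame).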

線量は、例えば、線画像内の線の領域の画素数である。線量は、例えば、当該線量を求める線画像と、当該線画像に対応するフレーム画像の直前に再生されるフレーム画像に対応する線画像とでの変化があった画素の数である。変化の有無は、対象の画素の近傍にある画素を用いた類似度どうしを比較して判定される。 The dose is, for example, the number of pixels in the line regions of the line image. Alternatively, the dose may be the number of pixels that changed between the line image for which the dose is being calculated and the line image corresponding to the frame image reproduced immediately before. Whether a pixel has changed is determined by comparing similarity measures computed from the pixels in the neighborhood of that pixel.
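The two dose definitions can be illustrated with a short Python sketch. The function names are hypothetical, images are assumed to be nested lists of truthy/falsy pixel values, and the second function is a deliberate simplification: it compares pixels one-to-one and does not reproduce the neighborhood-similarity comparison mentioned in the text.

```python
def line_pixel_count(line_image):
    """Dose as the number of line pixels (truthy cells) in a line image."""
    return sum(1 for row in line_image for px in row if px)


def changed_pixel_count(line_image, prev_line_image):
    """Dose as the number of pixels that differ from the previous line image.

    Simplification (assumption): plain pixel-by-pixel comparison instead of
    a similarity measure over each pixel's neighborhood.
    """
    return sum(
        1
        for row, prev_row in zip(line_image, prev_line_image)
        for px, prev_px in zip(row, prev_row)
        if bool(px) != bool(prev_px)
    )
```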

図６は、実施例１の説明に用いるための図である。 FIG. 6 is a diagram for explaining the first embodiment.

図６の棒グラフは、線量増分および線量減分をフレーム画像ごとに示すグラフである。線量減分は正の数であるが、便宜上、ここでは、負の方向のグラフとしている。 The bar graph of FIG. 6 is a graph showing the dose increment and the dose decrement for each frame image. Although the dose decrement is a positive number, for the sake of convenience, a graph in the negative direction is used here.

しきい値Ｔｈ１は、線量増分について予め設定されたしきい値であり、しきい値Ｔｈ２は、線量減分について予め設定されたしきい値である。 The threshold value Th1 is a preset threshold value for dose increment, and the threshold value Th2 is a preset threshold value for dose decrement.

映像ファイルは、例えば、線が書かれ始めるタイミングまでの部分であるチャプタＣＨ１と、線が書かれ始めてから消去されるまでの部分であるチャプタＣＨ２と、線が消去されてからの部分であるチャプタＣＨ３とからなる。しかし、映像ファイルは、チャプタＣＨ２の開始のタイミング、つまり、前者のタイミングを示すインデクスＩＤ２を含んでいないこととする。また、映像は、チャプタＣＨ３の開始のタイミング、つまり、後者のタイミングを示すインデクスＩＤ３を含んでいないこととする。 The video file consists of, for example, chapter CH1, the part up to the timing at which a line starts to be written; chapter CH2, the part from when a line starts to be written until it is erased; and chapter CH3, the part after the line is erased. Assume, however, that the video file does not contain the index ID2 indicating the start timing of chapter CH2, that is, the former timing, nor the index ID3 indicating the start timing of chapter CH3, that is, the latter timing.

チャプタＣＨ１の最初のフレーム画像（ｎ＝１）については、これより先に再生されるフレーム画像がないので、線量増分および線量減分は計算できず、グラフは表示されていない。 For the first frame image of chapter CH1 (n = 1), there is no frame image to be reproduced earlier, so the dose increment and dose decrement cannot be calculated and the graph is not displayed.

チャプタＣＨ１においては、後続のフレーム画像（ｎ＝２）に対応する線量からフレーム画像（ｎ＝１）に対応する線量を減じた線量増分は、例えば、線が書かれ始める前であるから、しきい値Ｔｈ１より小さい。同様に、各フレーム画像（ｎ＝３、４、５）についても、線量増分は、しきい値Ｔｈ１以下である。 In chapter CH1, the dose increment obtained by subtracting the dose corresponding to the frame image (n = 1) from the dose corresponding to the subsequent frame image (n = 2) is, for example, before the line starts to be written. It is smaller than the threshold value Th1. Similarly, for each frame image (n = 3, 4, 5), the dose increment is equal to or less than the threshold value Th1.

しかし、チャプタＣＨ２の最初のフレーム画像（ｎ＝６）については、線が書かれ始めるので、線量増分は、例えば、しきい値Ｔｈ１より大きくなる。 However, for the first frame image (n = 6) of chapter CH2, the line starts to be written, so that the dose increment becomes larger than the threshold value Th1, for example.

つまり、順次に再生される複数のフレーム画像（ｎ＝１、２、３、４、５、６）からフレーム画像（ｎ＝１）を除いた各フレーム画像につき、当該フレーム画像に対応する線量から当該フレーム画像の直前に再生されるフレーム画像に対応する線量を減じた線量増分が計算される。 That is, for each frame image obtained by removing a frame image (n = 1) from a plurality of sequentially reproduced frame images (n = 1, 2, 3, 4, 5, 6), the dose corresponding to the frame image is used. A dose increment is calculated by subtracting the dose corresponding to the frame image reproduced immediately before the frame image.

また、フレーム画像（ｎ＝１、２、３、４、５、６）で最も後に再生されるフレーム画像（ｎ＝６）に対応する線量増分がしきい値Ｔｈ１より大きい。 Further, the dose increment corresponding to the frame image (n = 6) reproduced most recently in the frame image (n = 1, 2, 3, 4, 5, 6) is larger than the threshold value Th1.

また、フレーム画像（ｎ＝２、３、４、５）に対応する線量増分はしきい値Ｔｈ１以下である。 The dose increment corresponding to the frame image (n = 2, 3, 4, 5) is equal to or less than the threshold value Th1.

さらに、ここでは、フレーム画像（ｎ＝１）の再生のタイミングと最も後に再生されるフレーム画像（ｎ＝６）の再生のタイミングとの間の時間差が予め設定されたしきい値Ｔ１より大きいなら、イベント検出部１６は、最も後に再生されるフレーム画像（ｎ＝６）の再生のタイミングを線を書かれ始めるタイミングとし、例えば、インデクスＩＤ２を映像ファイルに設定する。 Further, here, if the time difference between the playback timing of the frame image (n = 1) and the playback timing of the most recently played frame image (n = 6) is greater than a preset threshold value T1. The event detection unit 16 sets the reproduction timing of the frame image (n = 6) reproduced most recently as the timing at which a line starts to be written, and sets, for example, the index ID2 to the video file.

さて、その後、フレーム画像（ｎ＝７、８、１１、１２）については、例えば、線量増分がしきい値Ｔｈ１より大きくなるが、フレーム画像（ｎ＝７、８、１２）については、直前のフレーム画像に対応する線量増分はしきい値Ｔｈ１より大きいので、フレーム画像（ｎ＝７、８、１２）の再生のタイミングは、線が書かれ始めるタイミングとならない。フレーム画像（ｎ＝１１）については、直前のフレーム画像（ｎ＝９、１０）に対応する線量増分はしきい値Ｔｈ１以下であるものの、フレーム画像（ｎ＝９）の再生のタイミングとフレーム画像（ｎ＝１１）の再生のタイミングとの間の時間差はしきい値Ｔ１以下である。よって、フレーム画像（ｎ＝１１）の再生のタイミングは、線が書かれ始めるタイミングとはならない。 Now, for the frame image (n = 7, 8, 11, 12), for example, the dose increment becomes larger than the threshold Th1, but for the frame image (n = 7, 8, 12), Since the dose increment corresponding to the frame image is larger than the threshold value Th1, the reproduction timing of the frame image (n = 7, 8, 12) is not the timing at which the line starts to be written. For the frame image (n = 11), the dose increment corresponding to the immediately preceding frame image (n = 9, 10) is equal to or less than the threshold Th1, but the reproduction timing of the frame image (n = 9) and the frame image The time difference from the reproduction timing of (n = 11) is equal to or less than the threshold value T1. Therefore, the reproduction timing of the frame image (n = 11) is not the timing at which the line starts to be written.

その後、チャプタＣＨ３の最初のフレーム画像（ｎ＝１５）については、線が消去されるので、線量減分は、例えば、しきい値Ｔｈ２より大きくなる。よって、イベント検出部１６は、フレーム画像（ｎ＝１５）の再生のタイミングを線が消去されるタイミングとし、例えば、インデクスＩＤ３を映像ファイルに設定する。 Thereafter, for the first frame image (n = 15) of chapter CH3, the line is deleted, so that the dose decrement is larger than the threshold value Th2, for example. Therefore, the event detection unit 16 sets the reproduction timing of the frame image (n = 15) as the timing at which the line is deleted, and sets, for example, the index ID3 to the video file.

次に、タイミングの検出についての別な一例を説明する。 Next, another example of timing detection will be described.

イベント検出部１６は、まず、各フレーム画像の再生順で線量を累積した累積線量を計算する。 The event detection unit 16 first calculates a cumulative dose obtained by accumulating doses in the order of reproduction of each frame image.

また、イベント検出部１６は、順次に再生される複数のフレーム画像に対応する累積線量の増加率が予め設定されたしきい値より大きく且つ、当該複数のフレーム画像で最も先に再生されるフレーム画像に対応する累積線量から当該フレーム画像の直前に再生されるフレーム画像に対応する累積線量を減じた累積線量増分が予め設定されたしきい値以下であり且つ、当該最も先に再生されるフレーム画像の直後に再生されるフレーム画像に対応する累積線量から当該最も先に再生されるフレーム画像に対応する累積線量を減じた累積線量増分が当該しきい値より大きいなら、当該最も先に再生されるフレーム画像の直後に再生されるフレーム画像の再生のタイミングを、線が書かれ始めるタイミングとする。 The event detection unit 16 then takes the reproduction timing of the frame image reproduced immediately after the first-reproduced frame image of a plurality of sequentially reproduced frame images as the timing at which a line starts to be written, if all of the following hold: the rate of increase of the cumulative dose over those frame images is larger than a preset threshold value; the cumulative dose increment obtained by subtracting, from the cumulative dose corresponding to the first-reproduced frame image, the cumulative dose corresponding to the frame image reproduced immediately before it is equal to or less than a preset threshold value; and the cumulative dose increment obtained by subtracting the cumulative dose corresponding to the first-reproduced frame image from the cumulative dose corresponding to the frame image reproduced immediately after it is larger than that threshold value.

また、イベント検出部１６は、あるフレーム画像に対応する累積線量から当該フレーム画像の直後に再生されるフレーム画像に対応する累積線量を減じた累積線量減分が予め設定されたしきい値より大きいなら当該直後に再生されるフレーム画像の再生のタイミングを、線が消去されるタイミングとする。 Further, the event detection unit 16 has a cumulative dose decrement obtained by subtracting a cumulative dose corresponding to a frame image reproduced immediately after the frame image from a cumulative dose corresponding to a certain frame image, which is larger than a preset threshold value. Then, the reproduction timing of the frame image that is reproduced immediately after that is the timing at which the line is erased.

線量は、例えば、実施例１と同様に計算される。 For example, the dose is calculated in the same manner as in the first embodiment.
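The cumulative-dose variant can be sketched in the same way. This is an illustration under several assumptions: the cumulative dose series is taken as given input (how it is accumulated is not modeled here), the "plurality of sequentially reproduced frame images" is a sliding window of fixed length `window`, the rate of increase is the average slope across that window, and frame times are strictly increasing. The names `detect_events_cumulative`, `rate_th`, `th3`, `th4`, and `window` are hypothetical.

```python
def detect_events_cumulative(times, cumulative, rate_th, th3, th4, window=4):
    """Return (write_start_times, erase_times) from a cumulative dose series.

    rate_th : threshold on the average rate of increase over the window
    th3     : threshold on the cumulative dose increment
    th4     : threshold on the cumulative dose decrement
    window  : number of frames in each examined window (assumed, must be >= 2)
    """
    write_starts, erases = [], []
    n = len(cumulative)
    # i is the first frame of the window; it needs a predecessor (i >= 1).
    for i in range(1, n - window + 1):
        last = i + window - 1
        rate = (cumulative[last] - cumulative[i]) / (times[last] - times[i])
        before = cumulative[i] - cumulative[i - 1]   # increment at the first frame
        after = cumulative[i + 1] - cumulative[i]    # increment at the next frame
        if rate > rate_th and before <= th3 and after > th3:
            write_starts.append(times[i + 1])
    for k in range(1, n):
        # Cumulative dose decrement between consecutive frames.
        if cumulative[k - 1] - cumulative[k] > th4:
            erases.append(times[k])
    return write_starts, erases
```

Requiring both a steep average slope and a small-then-large pair of increments means a single noisy jump inside an otherwise flat stretch is less likely to be reported as a write start than with the increment test alone.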

図７は、実施例２の説明に用いるための図であり、この棒グラフは、累積線量をフレーム画像ごとに示すものである。 FIG. 7 is a diagram for use in explaining the second embodiment, and this bar graph shows the cumulative dose for each frame image.

図７（ａ）に示すように、最初にサンプリングされたフレーム画像（ｎ＝１）については、例えば、線が書かれ始める前であるから、累積線量は０である。 As shown in FIG. 7A, for the first sampled frame image (n = 1), for example, before the line starts to be written, the accumulated dose is zero.

フレーム画像（ｎ＝２）に対応する累積線量は、すなわち、フレーム画像（ｎ＝１）に対応する累積線量（０）に対し、フレーム画像（ｎ＝２）に対応する線量を加えたものある。例えば、累積線量増分（フレーム画像（ｎ＝２）についての累積線量からフレーム画像（ｎ＝１）についての累積線量（０）を減じたもの）は、線が書かれ始める前であるから、累積線量増分についてのしきい値Ｔｈ３以下である。よって、フレーム画像（ｎ＝３、４、５）についても、同様に累積線量増分はしきい値Ｔｈ３以下である。 The cumulative dose corresponding to the frame image (n = 2) is obtained by adding the dose corresponding to the frame image (n = 2) to the cumulative dose (0) corresponding to the frame image (n = 1). . For example, the cumulative dose increment (accumulated dose for frame image (n = 2) minus the cumulative dose (0) for frame image (n = 1)) is before the line begins to be written. It is below the threshold value Th3 for the dose increment. Therefore, also for the frame images (n = 3, 4, 5), the cumulative dose increment is similarly equal to or less than the threshold value Th3.

また、直線７１１の傾きは、これらの累積線量の平均の増加率を示すものであり、予め設定された増加率についてのしきい値以下となっている。 The slope of the straight line 711 indicates the average increase rate of these accumulated doses, and is equal to or less than a threshold value for a preset increase rate.

よって、各フレーム画像の再生のタイミングは、線が書かれ始めるタイミングとはされない。 Therefore, the reproduction timing of each frame image is not the timing at which a line starts to be written.

図７（ｂ）に示すように、フレーム画像（ｎ＝６）については、例えば、線が書かれ始めた後であるから、累積線量は大きくなる。例えば、フレーム画像（ｎ＝６）についての累積線量増分は、しきい値Ｔｈ３より大きくなる。 As shown in FIG. 7B, for the frame image (n = 6), for example, after the line starts to be written, the accumulated dose increases. For example, the cumulative dose increment for the frame image (n = 6) is greater than the threshold value Th3.

同様に、フレーム画像（ｎ＝７、８）についても、例えば、線が書かれ始めた後であるから、累積線量増分は、しきい値Ｔｈ３より大きくなる。 Similarly, for the frame image (n = 7, 8), for example, after the line starts to be written, the cumulative dose increment becomes larger than the threshold value Th3.

また、直線７１２の傾きは、これらの累積線量の平均の増加率を示すものであり、前述の増加率についてのしきい値より大きくなっている。 Further, the slope of the straight line 712 indicates the average increase rate of these accumulated doses, and is larger than the threshold value for the increase rate described above.

また、前述のように、フレーム画像（ｎ＝５）についての累積線量増分はしきい値Ｔｈ３以下であり、フレーム画像（ｎ＝６）についての累積線量増分はしきい値Ｔｈ３より大きい。 Further, as described above, the cumulative dose increment for the frame image (n = 5) is less than or equal to the threshold Th3, and the cumulative dose increment for the frame image (n = 6) is greater than the threshold Th3.

よって、イベント検出部１６は、フレーム画像（ｎ＝６）の再生のタイミングを、書かれ始めるタイミングとし、例えば、実施例１と同様にインデクスを映像ファイルに設定する。 Therefore, the event detection unit 16 sets the playback timing of the frame image (n = 6) as the timing to start writing, and sets the index to the video file, for example, as in the first embodiment.

図７（ｃ）に示すように、その後、例えば、フレーム画像（ｎ＝１５）については、線が消去されるので、累積線量減分（フレーム画像（ｎ＝１４）に対応する累積線量からフレーム画像（ｎ＝１５）に対応する累積線量を減じたもの）は、例えば、累積線量減分についてのしきい値Ｔｈ４より大きくなる。よって、イベント検出部１６は、フレーム画像（ｎ＝１５）の再生のタイミングを線が消去されるタイミングとし、例えば、実施例１と同様にインデクスを映像ファイルに設定する。 As shown in FIG. 7C, for the subsequent frame image (n = 15), for example, the line is erased, so the cumulative dose decrement (the cumulative dose corresponding to the frame image (n = 14) minus the cumulative dose corresponding to the frame image (n = 15)) becomes larger than, for example, the threshold value Th4 for the cumulative dose decrement. The event detection unit 16 therefore takes the reproduction timing of the frame image (n = 15) as the timing at which the line is erased and, for example, sets an index in the video file as in the first embodiment.

以上説明したように、本実施の形態によれば、映像イベント検出装置１は、映像ファイルに映った場所で線が書かれ始めるタイミングおよび／または線が消去されるタイミングを検出する映像イベント検出装置であって、映像ファイルから再生順にサンプリングされた各フレーム画像からサンプリングごとに線を消去し線消去画像を生成する線消去処理部１２と、各フレーム画像のサンプリングごとに、当該フレーム画像に映った場所の前景の領域を示す前景領域画像を生成する前景領域検出部１３と、各フレーム画像のサンプリングごとに、当該フレーム画像および当該フレーム画像から生成された線消去画像の差分画像を生成し、当該差分画像の、対応する前景領域画像により示される領域以外に該当する差分画像背景領域を含む画像である線画像を生成する線画像生成部１４と、各線画像が記憶される線画像記憶部１５とを備え、線画像生成部１４は、各フレーム画像について、当該フレーム画像より先に再生されるフレーム画像を用いて生成された線画像が線画像記憶部１５にある場合は、当該線画像を読み出し、当該線画像から、当該後に再生されるフレーム画像に対応する前景領域画像により示される領域に該当する線画像前景領域を抜き出し、当該後に再生されるフレーム画像に対応する線画像に当該線画像前景領域を含めるものであり、映像イベント検出装置１は、さらに、各線画像内の線の量を示す線量を計算し、各フレーム画像の再生順での当該線量の変化に基づいて、映像ファイルに映った場所で線が書かれ始めるタイミングおよび／または線が消去されるタイミングを検出するイベント検出部１６を備える。 As described above, according to the present embodiment, the video event detection device 1 detects the timing at which a line starts to be written and/or the timing at which a line is erased at a location shown in a video file. It includes: a line erasure processing unit 12 that, for each frame image sampled from the video file in reproduction order, erases lines from the frame image and generates a line erased image; a foreground region detection unit 13 that, for each sampled frame image, generates a foreground region image indicating the foreground region of the location shown in the frame image; a line image generation unit 14 that, for each sampled frame image, generates a difference image between the frame image and the line erased image generated from it, and generates a line image, which is an image including the difference image background region, that is, the part of the difference image outside the region indicated by the corresponding foreground region image; and a line image storage unit 15 in which each line image is stored. For each frame image, when the line image storage unit 15 holds a line image generated using a frame image reproduced earlier, the line image generation unit 14 reads that line image, extracts from it the line image foreground region corresponding to the region indicated by the foreground region image of the later frame image, and includes that line image foreground region in the line image corresponding to the later frame image. The video event detection device 1 further includes an event detection unit 16 that calculates a dose indicating the amount of lines in each line image and, based on the change in the dose in the reproduction order of the frame images, detects the timing at which a line starts to be written and/or the timing at which a line is erased at the location shown in the video file.

例えば、イベント検出部１６は、順次に再生される複数のフレーム画像のそれぞれにつき、当該対象のフレーム画像の直前に再生されるフレーム画像がある場合には当該対象のフレーム画像に対応する線量から当該直前に再生されるフレーム画像に対応する線量を減じた線量増分計算し、当該複数のフレーム画像で最も後に再生されるフレーム画像に対応する線量増分が予め設定されたしきい値より大きく且つ、当該最も後に再生されるフレーム画像以外のフレーム画像に対応する線量増分が当該しきい値以下であり且つ、当該最も先に再生されるフレーム画像の再生のタイミングと当該最も後に再生されるフレーム画像の再生のタイミングとの間の時間差が予め設定されたしきい値より大きいなら、当該最も後に再生されるフレーム画像の再生のタイミングを、線が書かれ始めるタイミングとし、あるフレーム画像に対応する線量から当該フレーム画像の直後に再生されるフレーム画像に対応する線量を減じた線量減分が予め設定されたしきい値より大きいなら、当該直後に再生されるフレーム画像の再生のタイミングを、線が消去されるタイミングとするので、映像ファイルに映った場所で線が書かれ始めるタイミングおよび／または線が消去されるタイミングを高い精度で検出することができる。 For example, for each of a plurality of sequentially reproduced frame images (the target frame image), when there is a frame image reproduced immediately before the target frame image, the event detection unit 16 calculates a dose increment by subtracting the dose corresponding to that immediately preceding frame image from the dose corresponding to the target frame image. If the dose increment corresponding to the frame image reproduced last among the plurality of frame images is larger than a preset threshold value, the dose increments corresponding to the other frame images are equal to or less than that threshold value, and the time difference between the reproduction timings of the frame images reproduced first and last is larger than a preset threshold value, the event detection unit 16 takes the reproduction timing of the frame image reproduced last as the timing at which a line starts to be written. If a dose decrement, obtained by subtracting from the dose corresponding to a given frame image the dose corresponding to the frame image reproduced immediately after it, is larger than a preset threshold value, the event detection unit 16 takes the reproduction timing of that immediately following frame image as the timing at which the line is erased. The timing at which a line starts to be written and/or the timing at which a line is erased at the location shown in the video file can therefore be detected with high accuracy.

また、例えば、イベント検出部１６は、各フレーム画像の再生順で線量を累積した累積線量を計算し、順次に再生される複数のフレーム画像に対応する累積線量の増加率が予め設定されたしきい値より大きく且つ、当該複数のフレーム画像で最も先に再生されるフレーム画像に対応する累積線量から当該フレーム画像の直前に再生されるフレーム画像に対応する累積線量を減じた累積線量増分が予め設定されたしきい値以下であり且つ、当該最も先に再生されるフレーム画像の直後に再生されるフレーム画像に対応する累積線量から当該最も先に再生されるフレーム画像に対応する累積線量を減じた累積線量増分が当該しきい値より大きいなら、当該最も先に再生されるフレーム画像の直後に再生されるフレーム画像の再生のタイミングを、線が書かれ始めるタイミングとし、あるフレーム画像に対応する累積線量から当該フレーム画像の直後に再生されるフレーム画像に対応する累積線量を減じた累積線量減分が予め設定されたしきい値より大きいなら、当該直後に再生されるフレーム画像の再生のタイミングを、線が消去されるタイミングとするので、映像ファイルに映った場所で線が書かれ始めるタイミングおよび／または線が消去されるタイミングを高い精度で検出することができる。 As another example, the event detection unit 16 calculates a cumulative dose by accumulating the dose in the reproduction order of the frame images. If the rate of increase of the cumulative dose over a plurality of sequentially reproduced frame images is larger than a preset threshold value, the cumulative dose increment obtained by subtracting, from the cumulative dose corresponding to the frame image reproduced first among them, the cumulative dose corresponding to the frame image reproduced immediately before it is equal to or less than a preset threshold value, and the cumulative dose increment obtained by subtracting the cumulative dose corresponding to that first-reproduced frame image from the cumulative dose corresponding to the frame image reproduced immediately after it is larger than that threshold value, the event detection unit 16 takes the reproduction timing of the frame image reproduced immediately after the first-reproduced frame image as the timing at which a line starts to be written. If a cumulative dose decrement, obtained by subtracting from the cumulative dose corresponding to a given frame image the cumulative dose corresponding to the frame image reproduced immediately after it, is larger than a preset threshold value, the event detection unit 16 takes the reproduction timing of that immediately following frame image as the timing at which the line is erased. The timing at which a line starts to be written and/or the timing at which a line is erased at the location shown in the video file can therefore be detected with high accuracy.

なお、映像イベント検出装置１としてコンピュータを機能させるためのコンピュータプログラムは、半導体メモリ、磁気ディスク、光ディスク、光磁気ディスク、磁気テープなどのコンピュータ読み取り可能な記録媒体に記録でき、また、インターネットなどの通信網を介して伝送させて、広く流通させることができる。 A computer program for causing a computer to function as the video event detection device 1 can be recorded on a computer-readable recording medium such as a semiconductor memory, a magnetic disk, an optical disk, a magneto-optical disk, or a magnetic tape, and can also be transmitted over a communication network such as the Internet and thereby widely distributed.

DESCRIPTION OF SYMBOLS
１ 映像イベント検出装置 (video event detection device)
１１ 映像取得部 (video acquisition unit)
１２ 線消去処理部 (line erasure processing unit)
１３ 前景領域検出部 (foreground region detection unit)
１４ 線画像生成部 (line image generation unit)
１５ 線画像記憶部 (line image storage unit)
１６ イベント検出部 (event detection unit)

Claims

A video event detection device for detecting when a line starts to be written and / or when a line is erased in a place reflected in a video file,
A line erasure processing unit that erases a line for each sampling from each frame image sampled in the reproduction order from the video file and generates a line erasure image;
A foreground region detection unit that generates a foreground region image indicating a foreground region of a place reflected in the frame image for each sampling of the frame image;
A line image generation unit that, for each sampling of the frame images, generates a difference image between the frame image and the line erased image generated from that frame image, and generates a line image, which is an image including a difference image background region, that is, the part of the difference image outside the region indicated by the corresponding foreground region image;
A line image storage unit for storing each line image,
The line image generation unit
For each frame image, when a line image generated using a frame image reproduced before that frame image is in the line image storage unit, reads that line image, extracts from it a line image foreground area corresponding to the area indicated by the foreground area image of the frame image reproduced later, and includes the line image foreground area in the line image corresponding to the frame image reproduced later,
The video event detection device further includes:
An event detection unit that calculates a dose indicating the amount of lines in each line image and, based on the change in the dose in the reproduction order of the frame images, detects the timing at which a line starts to be written and/or the timing at which a line is erased at the location shown in the video file.

The video event detection device according to claim 1, wherein the event detection unit:
for each of a plurality of sequentially reproduced frame images (the target frame image), when there is a frame image reproduced immediately before the target frame image, calculates a dose increment by subtracting the dose corresponding to that immediately preceding frame image from the dose corresponding to the target frame image; if the dose increment corresponding to the frame image reproduced last among the plurality of frame images is larger than a preset threshold value, the dose increments corresponding to the frame images other than the one reproduced last are equal to or less than that threshold value, and the time difference between the reproduction timing of the frame image reproduced first and the reproduction timing of the frame image reproduced last is larger than a preset threshold value, sets the reproduction timing of the frame image reproduced last as the timing at which a line starts to be written; and
if a dose decrement obtained by subtracting, from the dose corresponding to a given frame image, the dose corresponding to the frame image reproduced immediately after it is larger than a preset threshold value, sets the reproduction timing of that immediately following frame image as the timing at which the line is erased.

The video event detection device according to claim 1, wherein the event detection unit:
calculates a cumulative dose by accumulating the dose in the reproduction order of the frame images;
if the rate of increase of the cumulative dose over a plurality of sequentially reproduced frame images is larger than a preset threshold value, the cumulative dose increment obtained by subtracting, from the cumulative dose corresponding to the frame image reproduced first among the plurality of frame images, the cumulative dose corresponding to the frame image reproduced immediately before it is equal to or less than a preset threshold value, and the cumulative dose increment obtained by subtracting the cumulative dose corresponding to that first-reproduced frame image from the cumulative dose corresponding to the frame image reproduced immediately after it is larger than that threshold value, sets the reproduction timing of the frame image reproduced immediately after the first-reproduced frame image as the timing at which a line starts to be written; and
if a cumulative dose decrement obtained by subtracting, from the cumulative dose corresponding to a given frame image, the cumulative dose corresponding to the frame image reproduced immediately after it is larger than a preset threshold value, sets the reproduction timing of that immediately following frame image as the timing at which the line is erased.

An operation method of a video event detection device for detecting a timing at which a line starts to be written and/or a timing at which a line is erased at a location shown in a video file, wherein
the line erasure processing unit of the video event detection device generates, for each frame image sampled from the video file in reproduction order, a line erased image by erasing the lines from that frame image,
the foreground area detection unit of the video event detection device generates, for each sampled frame image, a foreground area image indicating the foreground area of the location shown in that frame image,
the line image generation unit of the video event detection device generates, for each sampled frame image, a difference image between the frame image and the line erased image generated from that frame image, generates a line image that is an image including a difference image background region, that is, the region of the difference image other than the region indicated by the corresponding foreground area image, and stores the line image in a line image storage unit provided in the video event detection device,
for each frame image, when a line image generated using a frame image reproduced earlier than that frame image is in the line image storage unit, the line image generation unit reads that line image, extracts from it a line image foreground area corresponding to the area indicated by the foreground area image corresponding to the later-reproduced frame image, and includes the line image foreground area in the line image corresponding to the later-reproduced frame image, and
the event detection unit of the video event detection device calculates a dose indicating the amount of line in each line image and, based on the change in the dose in the reproduction order of the frame images, detects the timing at which a line starts to be written and/or the timing at which a line is erased at the location shown in the video file.
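
The line image generation and dose calculation steps of this method can be sketched in Python as follows. This is an illustration under stated assumptions, not the claimed implementation: images are plain nested lists of grayscale values, the line erased images and foreground masks are supplied by the caller, the difference image is binarized with a hypothetical threshold `diff_thr`, and the dose is assumed to be the count of line pixels. All function names are hypothetical.

```python
from typing import List, Optional

Image = List[List[int]]


def make_line_image(frame: Image, erased: Image, foreground: Image,
                    prev_line: Optional[Image], diff_thr: int = 30) -> Image:
    """Build one line image from a frame, its line erased image and its foreground mask."""
    h, w = len(frame), len(frame[0])
    line = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if foreground[y][x]:
                # Foreground (e.g. the lecturer) hides the board here: carry the
                # pixel over from the previously stored line image, if there is one.
                line[y][x] = prev_line[y][x] if prev_line is not None else 0
            else:
                # Background: a large frame vs. line erased difference marks a line pixel.
                # Binarizing with diff_thr is an assumption of this sketch.
                line[y][x] = 1 if abs(frame[y][x] - erased[y][x]) > diff_thr else 0
    return line


def build_line_images(frames: List[Image], erased: List[Image],
                      foregrounds: List[Image], diff_thr: int = 30) -> List[Image]:
    """Process frames in reproduction order; the list plays the role of the line image storage."""
    store: List[Image] = []
    for f, e, fg in zip(frames, erased, foregrounds):
        prev = store[-1] if store else None
        store.append(make_line_image(f, e, fg, prev, diff_thr))
    return store


def dose(line: Image) -> int:
    """Assumed dose: the number of line pixels in a line image."""
    return sum(sum(row) for row in line)
```

Carrying foreground pixels over from the stored line image keeps the dose from dropping merely because the foreground temporarily covers existing lines, so a drop in dose is more likely to reflect an actual erasure.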

The operation method of the video event detection apparatus according to claim 4, wherein the event detection unit
calculates, for each of a plurality of sequentially reproduced frame images for which a frame image reproduced immediately before it exists, a dose increment obtained by subtracting the dose corresponding to the frame image reproduced immediately before from the dose corresponding to that frame image,
takes the reproduction timing of the last-reproduced frame image among the plurality of frame images as the timing at which the line starts to be written, if the dose increment corresponding to the last-reproduced frame image is larger than a preset threshold value, the dose increments corresponding to the frame images other than the last-reproduced frame image are equal to or less than the threshold value, and the time difference between the reproduction timing of the first-reproduced frame image and the reproduction timing of the last-reproduced frame image is greater than a preset threshold value, and
takes the reproduction timing of the frame image reproduced immediately after a given frame image as the timing at which the line is erased, if the dose decrement obtained by subtracting the dose corresponding to that immediately following frame image from the dose corresponding to the given frame image is larger than a preset threshold value.
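
The dose-increment test of this claim can be sketched in Python as below. It is a hedged illustration only: the function and parameter names and the fixed window length are hypothetical, and the reading that the increments of all frames other than the last one must stay at or below the threshold is an assumption about the intended condition.

```python
from typing import List, Optional, Sequence, Tuple


def detect_events_by_increment(
    doses: Sequence[float],   # dose per sampled frame, in reproduction order
    times: Sequence[float],   # reproduction timing of each frame (e.g. seconds)
    window: int,              # assumed number of frames examined together (>= 2)
    inc_thr: float,           # threshold on a dose increment
    gap_thr: float,           # threshold on the first-to-last time difference
    dec_thr: float,           # threshold on a dose decrement
) -> Tuple[List[int], List[int]]:
    """Return (write_start_indices, erasure_indices) as frame indices."""
    # The first frame has no predecessor, so it has no increment.
    incs: List[Optional[float]] = [None] + [
        doses[i] - doses[i - 1] for i in range(1, len(doses))
    ]
    write_starts: List[int] = []
    erasures: List[int] = []

    for last in range(window - 1, len(doses)):
        first = last - window + 1
        last_inc = incs[last]
        if last_inc is None or last_inc <= inc_thr:
            continue
        # Assumption: every earlier frame in the window shows no significant increase.
        quiet = all(
            incs[i] is None or incs[i] <= inc_thr for i in range(first, last)
        )
        if quiet and times[last] - times[first] > gap_thr:
            write_starts.append(last)

    # Erasure: the dose falls by more than dec_thr from one frame to the next.
    for n in range(len(doses) - 1):
        if doses[n] - doses[n + 1] > dec_thr:
            erasures.append(n + 1)

    return write_starts, erasures
```

With the made-up doses `[0, 0, 0, 0, 4, 8, 8, 2]` sampled once per second, a window of 4, and thresholds of 1, 2 and 1, only frame 4 is reported as a write start, because frame 5 follows a frame that already showed an increase. Frame 7 is reported as an erasure.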

The operation method of the video event detection apparatus according to claim 4, wherein the event detection unit
calculates a cumulative dose by accumulating the dose in the reproduction order of the frame images,
takes the reproduction timing of the frame image reproduced immediately after the earliest-reproduced frame image among a plurality of sequentially reproduced frame images as the timing at which the line starts to be written, if the rate of increase in the cumulative dose over the plurality of frame images is greater than a preset threshold value, the cumulative dose increment obtained by subtracting the cumulative dose corresponding to the frame image reproduced immediately before the earliest-reproduced frame image from the cumulative dose corresponding to the earliest-reproduced frame image is equal to or less than a preset threshold value, and the cumulative dose increment obtained by subtracting the cumulative dose corresponding to the earliest-reproduced frame image from the cumulative dose corresponding to the frame image reproduced immediately after it is greater than the threshold value, and
takes the reproduction timing of the frame image reproduced immediately after a given frame image as the timing at which the line is erased, if the cumulative dose decrement obtained by subtracting the cumulative dose corresponding to that immediately following frame image from the cumulative dose corresponding to the given frame image is larger than a preset threshold value.

A computer program for causing a computer to function as the video event detection device according to claim 1.