JP6906273B2

JP6906273B2 - Programs, devices and methods that depict the trajectory of displacement of the human skeleton position from video data

Info

Publication number: JP6906273B2
Application number: JP2018115828A
Authority: JP
Inventors: 和之田坂
Original assignee: KDDI Corp
Current assignee: KDDI Corp
Priority date: 2018-06-19
Filing date: 2018-06-19
Publication date: 2021-07-21
Anticipated expiration: 2038-06-19
Also published as: JP2019219836A

Description

The present invention relates to a technique for estimating a person's behavior from video data.

Conventionally, there is an image processing technique that achieves robust, highly accurate person tracking regardless of whether occlusion between persons is occurring (see, for example, Patent Document 1). According to this technique, when the tracking results of past frames are matched against the detection results of person regions, a similarity criterion that treats a region as being in an inter-person occlusion state is used in addition to the similarity criterion that treats a region as a single person. This allows robust matching even when inter-person occlusion occurs.

There is also a technique that tracks persons in video by dividing the image into group regions, where persons are densely packed, and individual regions elsewhere (see, for example, Patent Document 2). According to this technique, person tracking is performed only in individual regions and not in group regions. That is, tracking is suspended when a person is absorbed from an individual region into a group region, and it resumes when the person separates from the group region back into an individual region. In particular, the trajectory of the person's coordinates within the group region is determined from the person's coordinates in the individual region and the number shown on the person's clothing.

Further, there is a technique that tracks the movement of objects rather than persons (see, for example, Patent Document 3). According to this technique, time-series consecutive images are input to a neural network, and the features of each input image are compared with the features extracted by the neural network to check their similarity. The identification information and position information of one or more objects in the later image that match one or more tracking-candidate objects in the earlier image are then output as the identification result. The tracking of the object is thereby shown on a display.

Further, there are techniques that display the trajectory of each skeleton of a person (see, for example, Non-Patent Document 1 and Patent Document 4). There is also a technique that displays the trajectory of a fingertip, as a specific body part, in a virtual space (see, for example, Non-Patent Document 2). These techniques make it possible to display the movement trajectories of the extracted skeleton points.

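Patent Document 1: Japanese Unexamined Patent Publication No. 2017-182295
Patent Document 2: Japanese Domestic Re-publication of PCT International Application No. 2016-139906
Patent Document 3: Japanese Unexamined Patent Publication No. 2018-026108
Patent Document 4: Japanese Unexamined Patent Publication No. 2015-061579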

Non-Patent Document 1: Zhe Cao et al., "Realtime Multi-Person 2D Pose Estimation using Part Affinity Fields", Carnegie Mellon University, [online], [retrieved June 16, 2018], Internet <URL: https://arxiv.org/pdf/1611.08050.pdf>
Non-Patent Document 2: Yuki Tanaka, Yasuyuki Kono, "Aerial drawing system of fingertip trajectory by tracking head position / posture from camera image and sensor", Kwansei Gakuin University, Information Processing Society of Japan Research Report, Vol.2015-HCI-162 No.7, [online], [retrieved June 16, 2018], Internet <URL: https://ipsj.ixsq.nii.ac.jp/ej/?action=repository_uri&item_id=141307&file_id=1&file_no=1>
Non-Patent Document 3: "I tried OpenPose, which can detect bones from videos and photos", [online], [retrieved June 16, 2018], Internet <URL: http://hackist.jp/?p=8285>

According to the techniques described in Non-Patent Documents 1 and 2 and Patent Document 4 above, the movement trajectories of all skeletons can be displayed.
However, depending on the person's behavior (for example, vigorous movement), the trajectories of different skeletons overlap heavily, and the displayed trajectories become hard for the user to read.
The inventor of the present application therefore considered that the skeletons whose trajectories the user wants to focus on may differ depending on the person's behavior.

Therefore, an object of the present invention is to provide a program, a device, and a method that estimate a person's behavior from video data and draw the trajectory of displacement of the skeleton positions corresponding to that behavior.

According to the present invention, there is provided a program that causes a computer to function so as to draw the trajectory of displacement of a person's skeleton positions from video data, the program causing the computer to function as:
a skeleton setting table in which one or more skeletons whose trajectories are to be drawn are associated in advance with each behavior of a person;
skeleton recognition means for extracting a plurality of skeleton positions in time series from the video data;
behavior estimation means for estimating the person's behavior from the displacement amounts of the time-series skeleton positions in a first predetermined period;
skeleton identification means for identifying, using the skeleton setting table, the one or more skeletons corresponding to the behavior recognized by the behavior estimation means; and
skeleton displacement depiction means for drawing, as a trajectory, the displacement of the time-series skeleton positions in a second predetermined period, extracted by the skeleton recognition means, for the skeletons identified by the skeleton identification means.

According to another embodiment of the program of the present invention, it is also preferable that the computer be caused to function such that
the skeleton displacement depiction means draws the displacement of the time-series skeleton positions in the second predetermined period as a trajectory superimposed on the video data.

According to another embodiment of the program of the present invention, it is also preferable that the computer be caused to function such that
the first predetermined period of the behavior estimation means runs from a pre-estimation time point t-n to the estimation time point t, and
the second predetermined period of the skeleton displacement depiction means is either
the post-estimation time points t+1 to t+k, or
the span from a pre-estimation time point t-m through the estimation time point t to a post-estimation time point t+k.

According to another embodiment of the program of the present invention, it is also preferable that the computer be caused to function such that
the program further has a first skeleton displacement table in which, for each behavior of a person, the displacement amounts of each skeleton position in the first predetermined period are arranged in time series, and
the behavior estimation means uses the first skeleton displacement table for each behavior to search for a behavior similar to the time-series skeleton displacement in the first predetermined period.

According to another embodiment of the program of the present invention, it is also preferable that the computer be caused to function such that
the program further has a second skeleton displacement table in which, for each behavior of a person, the displacement amounts of each skeleton position in the second predetermined period are arranged in time series, and
the skeleton setting table is set, using the second skeleton displacement table for the estimated behavior, with the skeletons whose time-series displacement amounts satisfy a predetermined rule.

According to another embodiment of the program of the present invention, it is also preferable that the computer be caused to function such that
the predetermined rule of the skeleton setting table is set in advance by a user.

According to another embodiment of the program of the present invention, it is also preferable that the computer be caused to function such that
the predetermined rule of the skeleton setting table sets a skeleton whose displacement amount in a predetermined period is larger than a predetermined condition, or a skeleton whose displacement amount in the predetermined period is smaller than the predetermined condition.

According to the present invention, there is provided a device that draws the trajectory of displacement of a person's skeleton positions from video data, the device having:
a skeleton setting table in which one or more skeletons whose trajectories are to be drawn are associated in advance with each behavior of a person;
skeleton recognition means for extracting a plurality of skeleton positions in time series from the video data;
behavior estimation means for estimating the person's behavior from the displacement amounts of the time-series skeleton positions in a first predetermined period;
skeleton identification means for identifying, using the skeleton setting table, the one or more skeletons corresponding to the behavior recognized by the behavior estimation means; and
skeleton displacement depiction means for drawing, as a trajectory, the displacement of the time-series skeleton positions in a second predetermined period, extracted by the skeleton recognition means, for the skeletons identified by the skeleton identification means.

According to the present invention, there is provided a skeleton trajectory drawing method of a device that draws the trajectory of displacement of a person's skeleton positions from video data, wherein
the device has a skeleton setting table in which one or more skeletons whose trajectories are to be drawn are associated in advance with each behavior of a person, and
the device executes:
a first step of extracting a plurality of skeleton positions in time series from the video data;
a second step of estimating the person's behavior from the displacement amounts of the time-series skeleton positions in a first predetermined period;
a third step of identifying, using the skeleton setting table, the one or more skeletons corresponding to the behavior recognized in the second step; and
a fourth step of drawing, as a trajectory, the displacement of the time-series skeleton positions in a second predetermined period, extracted in the first step, for the skeletons identified in the third step.

According to the program, device, and method of the present invention, a person's behavior can be estimated from video data, and the trajectory of displacement of the skeleton positions corresponding to that behavior can be drawn.

FIG. 1 is a functional configuration diagram of the recognition device according to the present invention.
FIG. 2 is an explanatory diagram showing the skeleton positions recognized by the skeleton recognition unit.
FIG. 3 is an explanatory diagram showing a first embodiment of the skeleton recognition unit and the behavior estimation unit in the present invention.
FIG. 4 is an explanatory diagram showing a second embodiment of the skeleton recognition unit and the behavior estimation unit in the present invention.
FIG. 5 is an explanatory diagram showing the time-series displacement of the skeleton position in the present invention.
FIG. 6 is an explanatory diagram showing a first embodiment for determining the skeleton setting table.
FIG. 7 is an explanatory diagram showing a second embodiment for determining the skeleton setting table.

Hereinafter, embodiments of the present invention will be described in detail with reference to the drawings.

FIG. 1 is a functional configuration diagram of the recognition device according to the present invention.

According to FIG. 1, the recognition device 1 of the present invention can draw the trajectory of displacement of a person's skeleton positions from video data.
The recognition device 1 may be a smartphone or mobile terminal equipped with a camera, and takes as input video data in which a person has been captured. The recognition device 1 may also receive video data via an access network such as a mobile phone network or a wireless LAN.
Of course, the recognition device 1 is not limited to a smartphone or the like and may be, for example, a fixed web camera. Alternatively, video data captured by a web camera may be recorded on an SD card, and the recorded video data may be input to the recognition device 1.
In the following, the recognition device 1 is described as a smartphone equipped with a camera, but it may instead function as, for example, a server connected to the Internet.

According to FIG. 1, the recognition device 1 has a skeleton setting table 10, a skeleton recognition unit 11, a behavior estimation unit 12, a skeleton identification unit 13, and a skeleton displacement depiction unit 14. These functional components can be realized by executing a program that causes a computer mounted on the device to function. The processing flow of these functional components can also be understood as the skeleton trajectory drawing method of the device.

[Skeleton recognition unit 11]
The skeleton recognition unit 11 extracts a plurality of skeleton positions of a person in time series from the video data. The video data from which the skeleton is extracted is two-dimensional and may be captured with an ordinary web camera. The extracted skeleton positions are output to the behavior estimation unit 12.

FIG. 2 is an explanatory diagram showing the skeleton positions recognized by the skeleton recognition unit.

Specifically, the skeleton recognition unit 11 uses a skeleton model such as OpenPose (registered trademark) to extract the feature points of a person's skeleton (see, for example, Non-Patent Document 3). OpenPose is software that can detect the body, hand, and face keypoints of multiple people in an image in real time, and it is publicly available on GitHub. For the whole body of a person appearing in captured video, it can detect, for example, 18 keypoints.
According to FIG. 2, one person appears in the video data. In the case of OpenPose, the two-dimensional coordinates and the confidence of each of the 18 skeleton positions (Nose, Neck, RShoulder, RElbow, ...) are associated with each frame.
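The per-frame output described above can be sketched as a small data structure. The following is a minimal illustration, not the format of any particular OpenPose release: the `Keypoint` class, the `Frame` alias, the `displacement` helper, and the `min_confidence` cutoff are all hypothetical names and choices introduced here.

```python
from dataclasses import dataclass
from math import hypot


@dataclass(frozen=True)
class Keypoint:
    """One skeleton position in one frame: 2D coordinates plus confidence."""
    x: float
    y: float
    confidence: float


# A frame maps a skeleton number (0 to 17) to its detected keypoint.
Frame = dict[int, Keypoint]


def displacement(prev: Frame, curr: Frame, min_confidence: float = 0.1) -> dict[int, float]:
    """Euclidean displacement of each skeleton position between two frames.

    Skeletons that are missing, or whose confidence is below min_confidence
    in either frame, are skipped (an assumption of this sketch).
    """
    result = {}
    for number, keypoint in curr.items():
        before = prev.get(number)
        if before is None:
            continue
        if min(keypoint.confidence, before.confidence) < min_confidence:
            continue
        result[number] = hypot(keypoint.x - before.x, keypoint.y - before.y)
    return result
```

Computing the displacement frame by frame yields the time series of displacement amounts that the behavior estimation unit 12 works from.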

[Behavior estimation unit 12]
The behavior estimation unit 12 estimates the person's behavior from the displacement amounts of the time-series skeleton positions in the first predetermined period. Specifically, the behavior estimation unit 12 is a deep learning model built in advance from training data in which a "behavior" is associated with the displacement amounts of the time-series skeleton positions in video data.
Using this model, the behavior estimation unit 12 recognizes the "behavior" from the time-series skeleton positions in the video data. For example, it recognizes behaviors such as "kicking", "sitting", and "dancing" from the person's time-series skeleton positions.

FIG. 3 is an explanatory diagram showing a first embodiment of the skeleton recognition unit and the behavior estimation unit in the present invention.
According to FIG. 3, both the skeleton recognition unit 11 and the behavior estimation unit 12 may be implemented with, for example, OpenPose. In the case of OpenPose, a score is calculated for each behavior by classification. That is, inputting the "video data" makes it possible to estimate the "behavior" with the highest score.

FIG. 4 is an explanatory diagram showing a second embodiment of the skeleton recognition unit and the behavior estimation unit in the present invention.

According to FIG. 4, only the skeleton recognition unit 11 is implemented with, for example, OpenPose, which recognizes the plurality of skeleton positions. The behavior estimation unit 12 is associated with a first skeleton displacement table 121.

[First skeleton displacement table 121]
The first skeleton displacement table 121 arranges in time series, for each "behavior" of a person, the displacement amounts of each skeleton position in the first predetermined period, from which that behavior can be identified. In FIG. 4, for example, the displacement from the previous time point in three-dimensional (x, y, z) coordinates is shown.
According to FIG. 4, the first predetermined period runs from the pre-estimation time point t-n to the estimation time point t ([t-n] is a time point an arbitrary time n before the estimation time point t).
From the change in the displacement amount of each skeleton position in the video data up to the estimation time point t, the behavior estimation unit 12 uses the first skeleton displacement table 121 to select the "behavior" most similar to the time-series skeleton displacement in the first predetermined period. Of course, the selection is not limited to one behavior; several behaviors may be selected in descending order of similarity.
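The table lookup described above can be sketched as a nearest-neighbor search. The description does not say how similarity is measured, so the sum of squared differences used below is an assumption, and the function names and the numbers in `FIRST_TABLE` are hypothetical illustration values rather than data from FIG. 4.

```python
def series_distance(a, b):
    """Sum of squared differences between two displacement series.

    Each series is a list of time steps; each time step is a list holding
    one displacement amount per skeleton. A smaller value means more similar.
    """
    return sum(
        (x - y) ** 2
        for step_a, step_b in zip(a, b)
        for x, y in zip(step_a, step_b)
    )


def rank_behaviors(observed, table, top_k=1):
    """Return the top_k behaviors whose stored series are closest to observed."""
    ranked = sorted(table, key=lambda name: series_distance(observed, table[name]))
    return ranked[:top_k]


# Hypothetical table: 3 time steps (t-2, t-1, t) x 2 skeletons.
FIRST_TABLE = {
    "kicking": [[0.1, 2.0], [0.2, 5.0], [0.1, 8.0]],
    "sitting": [[0.1, 0.2], [0.1, 0.1], [0.0, 0.1]],
    "dancing": [[1.0, 1.5], [1.5, 1.0], [1.0, 1.5]],
}

observed = [[0.0, 2.5], [0.3, 4.5], [0.2, 7.0]]
print(rank_behaviors(observed, FIRST_TABLE, top_k=2))
```

Passing `top_k` greater than 1 corresponds to selecting several behaviors in descending order of similarity.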

According to the first skeleton displacement table in FIG. 4, for the behavior "kicking", the displacement amount of the position of skeleton 10 "right ankle (RAnkle)" is the largest.
For the behavior "sitting", the displacement amounts of the skeleton positions are relatively small overall.
For the behavior "dancing", the displacement amount of the position of skeleton 8 "right hip (Hip)" is relatively large.

[Skeleton setting table 10]
The skeleton setting table 10 associates in advance, with each behavior of a person, one or more skeletons whose trajectories are to be drawn.
According to FIG. 1 described above, the skeleton setting table 10 associates a skeleton number with each behavior of a person as follows.
Behavior "kicking" -> skeleton number 10 "right ankle"
Behavior "sitting" -> skeleton number 4 "right wrist"
Behavior "dancing" -> skeleton number 0 "nose"

[Skeleton identification unit 13]
The skeleton identification unit 13 uses the skeleton setting table 10 to identify the one or more skeletons corresponding to the behavior recognized by the behavior estimation unit 12.
For example, when the behavior estimation unit 12 estimates the behavior "kicking", skeleton number 10 "right ankle" is selected using the skeleton setting table 10.
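The lookup performed by the skeleton identification unit 13 amounts to a dictionary access. The sketch below uses the three associations listed for the skeleton setting table 10; the names `SKELETON_SETTING_TABLE` and `identify_skeletons`, and the choice to return an empty list for an unregistered behavior, are assumptions made for illustration.

```python
# Behavior -> skeleton numbers whose trajectories are to be drawn,
# taken from the associations listed for the skeleton setting table 10.
SKELETON_SETTING_TABLE = {
    "kicking": [10],  # right ankle
    "sitting": [4],   # right wrist
    "dancing": [0],   # nose
}


def identify_skeletons(behavior, table=SKELETON_SETTING_TABLE):
    """Return the skeleton numbers registered for a behavior.

    An unregistered behavior yields an empty list (an assumption here).
    """
    return list(table.get(behavior, []))


print(identify_skeletons("kicking"))
```

Each entry is a list because the table may associate more than one skeleton with a behavior.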

[Skeleton displacement depiction unit 14]
For the skeletons identified by the skeleton identification unit 13, the skeleton displacement depiction unit 14 draws, as a trajectory, the displacement of the time-series skeleton positions in the second predetermined period extracted by the skeleton recognition unit 11.
The skeleton displacement depiction unit 14 also draws the time-series displacement of the skeleton positions as a trajectory superimposed on the video data.

Here, the second predetermined period may be either of the following.
(1) Post-estimation time points t+1 to t+k ([t+k] is a time point an arbitrary time k after the estimation time point t)
In this case, the "behavior" is estimated from the pre-estimation time point t-n to the estimation time point t, and for the skeleton number identified from that behavior, the trajectory of displacement of the skeleton position is drawn from the post-estimation time point t+1 to the time point t+k. That is, the displacement of the skeleton position is drawn from the stage after the behavior has been estimated.
(2) Pre-estimation time point t-m to estimation time point t to post-estimation time point t+k
In this case, the "behavior" is estimated from the pre-estimation time point t-n to the estimation time point t, and for the skeleton number identified from that behavior, the trajectory of displacement of the skeleton position is drawn from the pre-estimation time point t-m, going back into the past, to the post-estimation time point t+k. That is, the displacement of the skeleton position is drawn from the stage before the behavior is estimated ([t-m] is a time point an arbitrary time m before the estimation time point t).
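The two options for the second predetermined period differ only in where the window starts. As a minimal sketch, assuming time points are integer frame indices starting at 0, a hypothetical helper `drawing_window` could compute the indices to draw:

```python
def drawing_window(t, k, m=None):
    """Frame indices of the second predetermined period.

    With m omitted, this is option (1): t+1 .. t+k.
    With m given, this is option (2): t-m .. t+k, clamped at frame 0.
    """
    start = t + 1 if m is None else max(t - m, 0)
    return range(start, t + k + 1)


print(list(drawing_window(5, 3)))        # option (1)
print(list(drawing_window(5, 3, m=2)))   # option (2)
```

Option (2) requires that the skeleton positions from t-m onward were buffered before the behavior was known, since the window reaches back before the estimation time point.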

FIG. 5 is an explanatory diagram showing the time-series displacement of the skeleton position in the present invention.

(1) Between the pre-estimation time point t-2 and the estimation time point t, the estimated score of the behavior "kicking" gradually rises based on the displacement amounts of the skeleton positions. At time point t it exceeds the estimation threshold, and the behavior "kicking" is output. At this point, skeleton 10 "right ankle", which corresponds to the behavior "kicking", is identified using the skeleton setting table 10.
(2) Between the post-estimation time points t+1 and t+3, the trajectory of the position of skeleton 10 "right ankle" is recorded. The drawn trajectory of skeleton 10 "right ankle" may be updated progressively, or it may be displayed at an arbitrary time point such as t+4.
According to FIG. 5, the displacement of the skeleton position is drawn in a two-dimensional image, but it may of course be a three-dimensional image.
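The sequence in FIG. 5 (score rises, crosses a threshold at t, then the trajectory is recorded from t+1) can be sketched as follows. The input layouts, the function names, and the sample scores and coordinates are hypothetical; the description does not specify them.

```python
def first_detection(scores, threshold):
    """Return (t, behavior) for the first time step whose best score
    exceeds threshold, or None if no score ever does.

    scores is a list over time of dicts mapping behavior -> score.
    """
    for t, frame_scores in enumerate(scores):
        if not frame_scores:
            continue
        behavior = max(frame_scores, key=frame_scores.get)
        if frame_scores[behavior] > threshold:
            return t, behavior
    return None


def record_trajectory(positions, skeleton, t, k):
    """Collect one skeleton's positions for time steps t+1 .. t+k.

    positions is a list over time of dicts mapping skeleton number -> (x, y).
    Frames in which the skeleton was not detected are skipped.
    """
    return [frame[skeleton] for frame in positions[t + 1 : t + k + 1] if skeleton in frame]


scores = [
    {"kicking": 0.2, "sitting": 0.1},
    {"kicking": 0.5, "sitting": 0.1},
    {"kicking": 0.9, "sitting": 0.05},
]
positions = [{10: (i, i * i)} for i in range(7)]

t, behavior = first_detection(scores, threshold=0.8)
print(behavior, record_trajectory(positions, 10, t, k=3))
```

Returning the recorded points as a list leaves open whether they are drawn progressively or all at once at a later time point.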

<Setting skeletons in the skeleton setting table 10>
The skeleton setting table 10 is set, using the second skeleton displacement table 101 for the identified behavior, with the skeletons whose time-series displacement amounts satisfy a predetermined rule.

FIG. 6 is an explanatory diagram showing a first embodiment for determining the skeleton setting table.

FIG. 6 shows the second skeleton displacement table 101 and the predetermined rule that are used to determine the skeleton setting table 10.
[Second skeleton displacement table 101]
The second skeleton displacement table 101 arranges in time series, for each behavior of a person, the displacement amounts of each skeleton position in the second predetermined period.
The second skeleton displacement table 101 is past statistical data that gives, for each behavior, the displacement amount of each skeleton position as a function of elapsed time. According to FIG. 6, for each of the behaviors "kicking", "sitting", and "dancing", the displacement amount of each of the person's skeleton positions is recorded against elapsed time.
[Predetermined rule]
The predetermined rule sets a skeleton whose displacement amount in a predetermined period is larger than a predetermined condition, or a skeleton whose displacement amount in the predetermined period is smaller than the predetermined condition.
In FIG. 6, the predetermined rule is assumed to be defined as, for example, "the skeleton with the largest displacement amount within the predetermined period".

In this case, the skeleton with the largest displacement amount within the predetermined period is registered in the skeleton setting table 10 for each behavior.
For the behavior "kicking", skeleton 10 "right ankle", which has the largest displacement amount within the predetermined period, is registered.
For the behavior "sitting", skeleton 4 "right wrist", which has the largest displacement amount within the predetermined period, is registered.
For the behavior "dancing", skeleton 0 "nose", which has the largest displacement amount within the predetermined period, is registered.
In this way, the skeleton setting table 10 is registered according to the predetermined rule.
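Deriving the skeleton setting table from the second skeleton displacement table under the rule "largest displacement amount within the predetermined period" can be sketched as below. Summing the per-step displacements over the period is one possible reading of "displacement amount within the period" and is an assumption of this sketch, as are the function names and the sample numbers.

```python
def total_displacement(series, start, end):
    """Sum each skeleton's displacement over time steps start..end inclusive.

    series is a list over time of dicts mapping skeleton number -> displacement.
    """
    totals = {}
    for step in series[start : end + 1]:
        for skeleton, amount in step.items():
            totals[skeleton] = totals.get(skeleton, 0.0) + amount
    return totals


def build_setting_table(second_table, start, end, pick=max):
    """Register, per behavior, the skeleton chosen by the rule.

    pick=max selects the largest total displacement, pick=min the smallest.
    """
    table = {}
    for behavior, series in second_table.items():
        totals = total_displacement(series, start, end)
        if totals:
            table[behavior] = [pick(totals, key=totals.get)]
    return table


# Hypothetical statistics: two time steps per behavior.
SECOND_TABLE = {
    "kicking": [{1: 0.1, 10: 3.0}, {1: 0.2, 10: 6.0}],
    "sitting": [{1: 0.1, 4: 0.5}, {1: 0.1, 4: 0.4}],
}

print(build_setting_table(SECOND_TABLE, 0, 1))
```

Passing `pick=min` gives the opposite rule, selecting the skeleton that moves least within the period.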

Here, the "predetermined period" over which the predetermined rule is evaluated is arbitrary.
For example, it may be the post-estimation time points t+1 to t+2, which follow the behavior estimation time point t.
Alternatively, it may be t-1 to t+2, which begins before the behavior is estimated.
In other words, the displacement amounts of each skeleton position from the pre-estimation time point t-n to the estimation time point t in the second skeleton displacement table 101 may be exactly the same as those in the first skeleton displacement table.

FIG. 7 is an explanatory diagram showing a second embodiment for determining the skeleton setting table.

According to FIG. 7, in contrast to FIG. 6, the predetermined rule of the skeleton setting table 10 is set in advance by the user. That is, the skeleton setting table 10 is registered under a different rule for each user.
The second skeleton displacement table 101 is the same as in FIG. 6.

The predetermined rules for each user in FIG. 7 are set as follows.
(User A)
"Kicking": the skeleton with the largest displacement in the first predetermined period
"Sitting": the skeleton with the smallest displacement in the first predetermined period
"Dancing": the skeleton with the largest displacement in the second predetermined period
(User B)
"Kicking": the skeleton with the smallest displacement in the first predetermined period
"Sitting": the skeleton with the largest displacement in the first predetermined period
"Dancing": the skeleton with the largest displacement in the second predetermined period
The predetermined period can also be set arbitrarily. This makes it possible to detect skeleton numbers based on intense or quiet behavior not only over a short time but also over a long time.

Under User A's rule, when the action "kicking" is estimated, skeleton 10 "right ankle", which has the largest displacement in the first predetermined period, is selected. In this case, the displacement trajectory of skeleton 10 "right ankle" is depicted.
Under User B's rule, when the action "kicking" is estimated, skeleton 1 "neck", which has the smallest displacement in the first predetermined period, is selected. In this case, the displacement trajectory of skeleton 1 "neck" is depicted.
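The per-user rules of FIG. 7 can be sketched as a lookup from (user, action) to a period and a largest/smallest mode. The data structure, the function name and the displacement totals below are hypothetical and chosen only for illustration. The rules themselves, and the expected selections for "kicking" (skeleton 10 for User A, skeleton 1 for User B), follow the text.

```python
# Per-user rules from the text: action -> (which predetermined period, which extreme).
USER_RULES = {
    "A": {"kicking": ("first", "largest"),
          "sitting": ("first", "smallest"),
          "dancing": ("second", "largest")},
    "B": {"kicking": ("first", "smallest"),
          "sitting": ("first", "largest"),
          "dancing": ("second", "largest")},
}


def select_skeleton(user, action, totals_by_period):
    """Return the skeleton number chosen by the user's rule for this action.

    totals_by_period: dict of period name -> dict of skeleton number -> total
    displacement in that period (assumed layout for this sketch).
    """
    period, mode = USER_RULES[user][action]
    totals = totals_by_period[period]
    pick = max if mode == "largest" else min
    return pick(totals, key=totals.get)


# Made-up totals for "kicking": 1 = neck, 4 = right wrist, 10 = right ankle.
kicking_totals = {
    "first": {1: 0.2, 4: 1.0, 10: 5.0},
    "second": {1: 0.5, 4: 2.0, 10: 9.0},
}
print(select_skeleton("A", "kicking", kicking_totals))  # 10 (right ankle)
print(select_skeleton("B", "kicking", kicking_totals))  # 1 (neck)
```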

As described in detail above, the program, device, and method of the present invention make it possible to estimate a person's behavior from video data and to depict the displacement trajectory of the skeleton position corresponding to that behavior.
Further, according to the present invention, the skeleton position corresponding to the estimated behavior can be identified automatically based on a predetermined rule. That is, the displacement trajectory of the skeleton position that the user should pay attention to can be depicted according to the predetermined rule.
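The last two steps, looking up the skeleton for an estimated action and collecting its positions as a trajectory, can be sketched as follows. This is a simplified illustration under stated assumptions: the function name, the keypoint layout (one list of (x, y) pairs per frame), the single skeleton per action, and the coordinates are hypothetical. Behavior estimation itself is not shown, and the action is simply passed in.

```python
def trajectory_for_action(keypoints, action, skeleton_setting_table, start, end):
    """Return the (x, y) positions of the skeleton registered for `action`
    over frames start..end (inclusive), in time order.

    keypoints: list of frames; each frame is a list of (x, y) tuples indexed
    by skeleton number (assumed layout for this sketch).
    """
    skeleton_id = skeleton_setting_table[action]
    return [frame[skeleton_id] for frame in keypoints[start:end + 1]]


# Made-up data: 5 frames, 11 skeletons; only skeleton 10 (right ankle) moves.
keypoints = [[(float(f), 0.0) if k == 10 else (0.0, 0.0) for k in range(11)]
             for f in range(5)]
skeleton_setting_table = {"kicking": 10}  # hypothetical one-entry table

trajectory = trajectory_for_action(keypoints, "kicking", skeleton_setting_table, 2, 4)
print(trajectory)  # [(2.0, 0.0), (3.0, 0.0), (4.0, 0.0)]
```

The resulting list of points could then be drawn as a polyline over the video frames. The drawing step is left out here because it depends on the rendering library in use.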

Those skilled in the art can readily make various changes, modifications, and omissions to the embodiments of the present invention described above within the scope of its technical idea and viewpoint. The above description is merely an example and is not intended to be limiting in any way. The present invention is limited only as defined by the claims and their equivalents.

1 Recognition device
10 Skeleton setting table
101 Second skeleton displacement table
11 Skeleton recognition unit
12 Behavior estimation unit
121 First skeleton displacement table
13 Skeleton specifying unit
14 Skeleton displacement depiction unit

Claims

A program that causes a computer to function so as to depict the trajectory of displacement of a person's skeleton position from video data, the program causing the computer to function as:
a skeleton setting table in which one or more skeletons whose trajectory should be depicted are associated in advance with each action of a person;
a skeleton recognition means for extracting a plurality of skeleton positions in time series from the video data;
a behavior estimation means for estimating the person's behavior from the displacement amounts of the skeleton positions in time series in a first predetermined period;
a skeleton specifying means for identifying, using the skeleton setting table, one or more skeletons corresponding to the behavior recognized by the behavior estimation means; and
a skeleton displacement depiction means for depicting, as a trajectory, the time-series displacement in a second predetermined period of the skeleton position extracted by the skeleton recognition means for the skeleton identified by the skeleton specifying means.

The program according to claim 1, wherein the skeleton displacement depiction means causes the computer to function so as to superimpose, on the video data, the time-series displacement of the skeleton position in the second predetermined period as a trajectory.

The program according to claim 1 or 2, which causes the computer to function such that:
the first predetermined period of the behavior estimation means is from the pre-estimation time point t-n to the estimation time point t; and
the second predetermined period of the skeleton displacement depiction means is either
the post-estimation time points t+1 to t+k, or
the period from the pre-estimation time point t-m, through the estimation time point t, to the post-estimation time point t+k.

The program according to claim 3, further comprising a first skeleton displacement table in which, for each action of a person, the displacement amounts of each skeleton position in the first predetermined period are arranged in time series,
wherein the behavior estimation means causes the computer to function so as to search, using the first skeleton displacement table for each action, for an action similar to the time-series skeleton displacement in the first predetermined period.

The program according to claim 3 or 4, further comprising a second skeleton displacement table in which, for each action of a person, the displacement amounts of each skeleton position in the second predetermined period are arranged in time series,
wherein the skeleton setting table causes the computer to function so as to set, using the second skeleton displacement table for the estimated action, a skeleton whose time-series displacement amount is based on a predetermined rule.

The program according to claim 5, which causes the computer to function such that the predetermined rule of the skeleton setting table is preset by a user.

The program according to claim 6, which causes the computer to function such that the predetermined rule of the skeleton setting table sets a skeleton whose displacement amount in a predetermined period is larger than a predetermined condition, or a skeleton whose displacement amount in a predetermined period is smaller than the predetermined condition.

A device that depicts the trajectory of displacement of a person's skeleton position from video data, the device comprising:
a skeleton setting table in which one or more skeletons whose trajectory should be depicted are associated in advance with each action of a person;
a skeleton recognition means for extracting a plurality of skeleton positions in time series from the video data;
a behavior estimation means for estimating the person's behavior from the displacement amounts of the skeleton positions in time series in a first predetermined period;
a skeleton specifying means for identifying, using the skeleton setting table, one or more skeletons corresponding to the behavior recognized by the behavior estimation means; and
a skeleton displacement depiction means for depicting, as a trajectory, the time-series displacement in a second predetermined period of the skeleton position extracted by the skeleton recognition means for the skeleton identified by the skeleton specifying means.

A skeleton trajectory depiction method for a device that depicts the trajectory of displacement of a person's skeleton position from video data, wherein
the device has a skeleton setting table in which one or more skeletons whose trajectory should be depicted are associated in advance with each action of a person, and
the device executes:
a first step of extracting a plurality of skeleton positions in time series from the video data;
a second step of estimating the person's behavior from the displacement amounts of the skeleton positions in time series in a first predetermined period;
a third step of identifying, using the skeleton setting table, one or more skeletons corresponding to the behavior recognized in the second step; and
a fourth step of depicting, as a trajectory, the time-series displacement in a second predetermined period of the skeleton position extracted in the first step for the skeleton identified in the third step.