JP6922768B2

JP6922768B2 - Information processing device

Info

Publication number: JP6922768B2
Application number: JP2018017236A
Authority: JP
Inventors: 純平松永; 清明田中; 信二高橋
Original assignee: Omron Corp
Current assignee: Omron Corp
Priority date: 2018-02-02
Filing date: 2018-02-02
Publication date: 2021-08-18
Anticipated expiration: 2038-02-02
Also published as: JP2019133566A; WO2019150954A1

Description

本発明は、撮像された画像に写っている人物を識別する情報処理装置に関する。 The present invention relates to an information processing device that identifies a person in a captured image.

撮像された画像に写っている人物を識別する技術として、撮像された画像から人物の顔特徴量を取得し、取得した顔特徴量から当該人物を識別する技術がある。 As a technique for identifying a person in a captured image, there is a technique that acquires the person's facial feature amount from the captured image and identifies the person from the acquired facial feature amount.

特許文献１に開示の技術では、一の画像の一の顔の特徴量と他の画像の一の顔の特徴量とが近似しており且つ一の画像の一の顔の特徴量と他の画像の一の顔以外のすべての他の顔の特徴量とが近似していないような関係が複数存在するかどうかが判定される。そして、そのような関係が複数存在する場合には、一の画像の一の顔と他の画像の一の顔とが同一人物の顔であると判定される。 In the technique disclosed in Patent Document 1, it is determined whether there are a plurality of relationships in which the feature amount of one face in one image is close to the feature amount of one face in another image, while the feature amount of the one face in the one image is not close to the feature amount of any face in the other image other than that one face. When a plurality of such relationships exist, the one face in the one image and the one face in the other image are determined to be faces of the same person.

特開２０１５−２２５５６７号公報 Japanese Unexamined Patent Publication No. 2015-225567

識別結果をドアの施解錠などのために利用する場合には、人物がカメラに対して積極的に顔を向けるため、撮像された画像から当該人物の正確な顔特徴量が得られる。しかしながら、識別結果が他の目的で利用される場合には、人物がカメラに対して顔を向けないことがあり、撮像された画像から当該人物の正確な顔特徴量が得られないことがある。そのため、顔特徴量から人物を識別する従来技術では、人物を高精度に識別できないことがある。 When the identification result is used for locking or unlocking a door or the like, the person actively turns their face toward the camera, so an accurate facial feature amount of the person can be obtained from the captured image. However, when the identification result is used for other purposes, the person may not turn their face toward the camera, and an accurate facial feature amount of the person may not be obtained from the captured image. Therefore, the conventional technique of identifying a person from the facial feature amount may fail to identify the person with high accuracy.

本発明は、上記実情に鑑みなされたものであって、顔が撮像されていない場合でも撮像された画像に写っている人物を識別できる情報処理装置を提供することを目的とする。 The present invention has been made in view of the above circumstances, and an object of the present invention is to provide an information processing device capable of identifying a person appearing in a captured image even when the person's face is not captured.

上記目的を達成するために、本発明では、撮像された画像に写っている人物の姿勢、仕草、シルエット、動線、及び、滞在場所の少なくともいずれかに基づいて当該人物を識別する、という方法を採用する。 In order to achieve the above object, the present invention adopts a method of identifying a person appearing in a captured image based on at least one of the person's posture, gestures, silhouette, flow line, and place of stay.

具体的には、本発明の第一態様は、撮像された画像を取得する画像取得手段と、前記画像から、当該画像に写っている人物の姿勢、仕草、シルエット、動線、及び、滞在場所の少なくともいずれかを当該人物の特徴として示す特徴情報を取得する情報取得手段と、前記特徴情報に基づいて、前記人物を識別する識別手段と、を有することを特徴とする情報処理装置を提供する。 Specifically, a first aspect of the present invention provides an information processing device comprising: image acquisition means for acquiring a captured image; information acquisition means for acquiring, from the image, feature information indicating at least one of the posture, gestures, silhouette, flow line, and place of stay of a person appearing in the image as a feature of the person; and identification means for identifying the person based on the feature information.

この構成によれば、顔が撮像されていない場合でも撮像された画像に写っている人物を識別できる。具体的には、撮像された画像に写っている人物の姿勢、仕草、シルエット、動線、滞在場所、等は、当該人物の顔が撮像されていなくても取得できる。そして、それらの特徴は人物固有のものであるため、それらの特徴から人物を識別できる。 According to this configuration, it is possible to identify the person in the captured image even when the face is not captured. Specifically, the posture, gesture, silhouette, flow line, place of stay, etc. of the person in the captured image can be acquired even if the face of the person is not captured. And since those characteristics are unique to a person, the person can be identified from those characteristics.

前記情報取得手段は、前記人物の骨格を示す骨格情報を前記画像から取得し、当該骨格情報に基づいて前記特徴情報を取得してもよい。複数の人物のそれぞれについて、その人物の特徴を示す参照情報を記憶する記憶手段をさらに有し、前記識別手段は、前記特徴情報と各参照情報を比較して、前記画像に写っている前記人物を識別してもよい。前記画像を撮像する撮像手段、をさらに有してもよい。 The information acquisition means may acquire skeleton information indicating the skeleton of the person from the image and acquire the feature information based on the skeleton information. The device may further have storage means for storing, for each of a plurality of persons, reference information indicating the features of that person, and the identification means may identify the person appearing in the image by comparing the feature information with each piece of reference information. The device may further have imaging means for capturing the image.

前記画像に２人以上の人物が写っている場合に、前記識別手段は、前記２人以上の人物のうちの一部の人物を、その人物に対応する特徴情報に基づいて識別し、前記２人以上の人物のうちの残りの人物を、その人物に対応する特徴情報と、前記一部の人物の識別結果とに基づいて識別するとよい。前記記憶手段は、前記複数の人物のそれぞれについて、その人物と他の人物との２つ以上の組み合わせにそれぞれ対応する２つ以上の参照情報を記憶し、前記画像に２人以上の人物が写っている場合に、前記識別手段は、前記２人以上の人物のうちの一部の人物を、その人物に対応する特徴情報と、前記各参照情報とを比較して識別し、前記２人以上の人物のうちの残りの人物を、その人物に対応する特徴情報と、識別された前記一部の人物との組み合わせに対応する各参照情報とを比較して識別してもよい。 When two or more persons appear in the image, the identification means preferably identifies some of the two or more persons based on the feature information corresponding to each of those persons, and identifies the remaining persons based on the feature information corresponding to each remaining person and the identification results for the persons already identified. The storage means may store, for each of the plurality of persons, two or more pieces of reference information respectively corresponding to two or more combinations of that person with other persons. In that case, when two or more persons appear in the image, the identification means may identify some of them by comparing the feature information corresponding to each such person with each piece of reference information, and may identify the remaining persons by comparing the feature information corresponding to each remaining person with each piece of reference information corresponding to a combination with the persons already identified.

一の人物が他の人物と一緒にいる場合には、一の人物の特徴（姿勢、仕草、動線、滞在場所、等）が他の人物に依存して変わることがある。例えば、他の人物に特定の人物（父、母、兄、弟、姉、妹、上司、部下、等）が含まれている場合とそうでない場合との間で、一の人物の姿勢や仕草が異なることがある。そのため、他の人物の識別結果を考慮することにより、一の人物をより高精度に識別できる。 When one person is with another person, the features of the one person (posture, gestures, flow line, place of stay, etc.) may change depending on the other person. For example, a person's posture and gestures may differ depending on whether or not the others present include a specific person (father, mother, older brother, younger brother, older sister, younger sister, boss, subordinate, etc.). Therefore, taking the identification results of the other persons into account allows the one person to be identified with higher accuracy.
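
The two-stage identification described above can be sketched as follows. This is only an illustration under stated assumptions: the key layout `(person, companion)`, the `distance` function, the acceptance `threshold`, and all numeric values are hypothetical and do not appear in the specification, which does not say how the "some persons" of the first stage are chosen.

```python
from typing import Dict, List, Optional, Tuple

Features = Dict[str, float]
# Hypothetical key: (person, companion). companion=None means "no companion assumed".
RefKey = Tuple[str, Optional[str]]


def distance(a: Features, b: Features) -> float:
    """Sum of absolute differences over the feature names both sides share."""
    return sum(abs(a[k] - b[k]) for k in a.keys() & b.keys())


def identify_group(observations: List[Features],
                   refs: Dict[RefKey, Features],
                   threshold: float) -> List[Optional[str]]:
    results: List[Optional[str]] = [None] * len(observations)

    # Stage 1: accept only confident matches against companion-independent references.
    for i, obs in enumerate(observations):
        scores = {p: distance(obs, f) for (p, c), f in refs.items() if c is None}
        if scores:
            best = min(scores, key=scores.get)
            if scores[best] <= threshold:
                results[i] = best

    # Stage 2: match the rest against references conditioned on identified companions.
    known = {p for p in results if p is not None}
    for i, obs in enumerate(observations):
        if results[i] is not None:
            continue
        best_name, best_score = None, float("inf")
        for (p, c), f in refs.items():
            if c in known and p not in known:
                score = distance(obs, f)
                if score < best_score:
                    best_name, best_score = p, score
        results[i] = best_name
    return results


# Illustrative values only (stride ratio = stride length / shoulder width).
REFS: Dict[RefKey, Features] = {
    ("father", None): {"stride_ratio": 1.5},
    ("mother", None): {"stride_ratio": 1.0},
    ("child", None): {"stride_ratio": 1.2},
    ("mother", "father"): {"stride_ratio": 1.1},
    ("child", "father"): {"stride_ratio": 0.9},
}
print(identify_group([{"stride_ratio": 1.5}, {"stride_ratio": 0.9}], REFS, threshold=0.05))
# ['father', 'child']
```

In this toy data the second person would be closest to "mother" if judged alone, but is matched to "child" once the companion is known to be "father", which is the effect the paragraph above describes.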

なお、本発明は、上記構成ないし機能の少なくとも一部を有する情報処理システムとして捉えることができる。また、本発明は、上記処理の少なくとも一部を含む、情報処理方法又は情報処理システムの制御方法や、これらの方法をコンピュータに実行させるためのプログラム、又は、そのようなプログラムを非一時的に記録したコンピュータ読取可能な記録媒体として捉えることもできる。上記構成及び処理の各々は技術的な矛盾が生じない限り互いに組み合わせて本発明を構成することができる。 The present invention can be regarded as an information processing system having at least a part of the above configurations or functions. The present invention can also be regarded as an information processing method or a method for controlling an information processing system that includes at least a part of the above processing, as a program for causing a computer to execute such a method, or as a computer-readable recording medium on which such a program is recorded non-transitorily. The above configurations and processes can be combined with one another to constitute the present invention as long as no technical contradiction arises.

本発明によれば、顔が撮像されていない場合でも撮像された画像に写っている人物を識別できる。 According to the present invention, it is possible to identify a person appearing in an image captured even when the face is not imaged.

図１は、本発明が適用された情報処理装置の構成例を示すブロック図である。 FIG. 1 is a block diagram showing a configuration example of an information processing device to which the present invention is applied.
図２は、第１の実施形態に係る情報処理システムの構成例を示すブロック図である。 FIG. 2 is a block diagram showing a configuration example of the information processing system according to the first embodiment.
図３は、第１の実施形態に係る情報処理装置の処理フロー例を示すフローチャートである。 FIG. 3 is a flowchart showing an example of a processing flow of the information processing device according to the first embodiment.
図４は、第１の実施形態に係る監視カメラによって撮像された画像の一例を示す図である。 FIG. 4 is a diagram showing an example of an image captured by the surveillance camera according to the first embodiment.
図５は、第１の実施形態に係る姿勢の違いの一例を示す図である。 FIG. 5 is a diagram showing an example of a difference in posture according to the first embodiment.
図６は、第１の実施形態に係る参照情報の一例を示す図である。 FIG. 6 is a diagram showing an example of reference information according to the first embodiment.
図７は、第２の実施形態に係るコミュニケーションロボットの構成例を示すブロック図である。 FIG. 7 is a block diagram showing a configuration example of the communication robot according to the second embodiment.
図８は、第３の実施形態に係る情報処理装置の処理フロー例を示すフローチャートである。 FIG. 8 is a flowchart showing an example of a processing flow of the information processing device according to the third embodiment.
図９（Ａ）は、第３の実施形態に係る参照情報の一例を示す図であり、図９（Ｂ）は、第３の実施形態に係るマスク画像によって示されたシルエットの一例を示す図である。 FIG. 9(A) is a diagram showing an example of reference information according to the third embodiment, and FIG. 9(B) is a diagram showing an example of the silhouette indicated by the mask image according to the third embodiment.
図１０は、第３の実施形態に係るシルエット画像とマスク画像の一例を示す図である。 FIG. 10 is a diagram showing an example of a silhouette image and a mask image according to the third embodiment.
図１１は、第４の実施形態に係る情報処理装置の処理フロー例を示すフローチャートである。 FIG. 11 is a flowchart showing an example of a processing flow of the information processing device according to the fourth embodiment.
図１２は、第４の実施形態に係る滞在マップ（特徴情報）の一例を示す図である。 FIG. 12 is a diagram showing an example of a stay map (feature information) according to the fourth embodiment.
図１３（Ａ）は、第４の実施形態に係る参照情報の一例を示す図であり、図１３（Ｂ）は、第４の実施形態に係る滞在マップ（参照情報）の一例を示す図である。 FIG. 13(A) is a diagram showing an example of reference information according to the fourth embodiment, and FIG. 13(B) is a diagram showing an example of a stay map (reference information) according to the fourth embodiment.
図１４は、本発明の変形例に係る動線の一例を示す図である。 FIG. 14 is a diagram showing an example of a flow line according to a modification of the present invention.
図１５は、第５の実施形態に係る情報処理装置の処理フロー例を示すフローチャートである。 FIG. 15 is a flowchart showing an example of a processing flow of the information processing device according to the fifth embodiment.
図１６は、第６の実施形態に係る情報処理装置の処理フロー例を示すフローチャートである。 FIG. 16 is a flowchart showing an example of a processing flow of the information processing device according to the sixth embodiment.
図１７は、第６の実施形態に係る参照情報の一例を示す図である。 FIG. 17 is a diagram showing an example of reference information according to the sixth embodiment.

＜適用例＞ <Application example>
本発明の適用例について説明する。撮像された画像に写っている人物を識別する従来技術では、撮像された画像から人物の顔特徴量が取得され、取得した顔特徴量から当該人物が識別される。識別結果をドアの施解錠などのために利用する場合には、人物がカメラに対して積極的に顔を向けるため、撮像された画像から当該人物の正確な顔特徴量が得られる。しかしながら、識別結果が他の目的で利用される場合には、人物がカメラに対して顔を向けないことがあり、撮像された画像から当該人物の正確な顔特徴量が得られないことがある。そのため、上記従来技術では、人物を高精度に識別できないことがある。 An application example of the present invention will be described. In the conventional technique for identifying a person appearing in a captured image, the facial feature amount of the person is acquired from the captured image, and the person is identified from the acquired facial feature amount. When the identification result is used for locking or unlocking a door or the like, the person actively turns their face toward the camera, so an accurate facial feature amount of the person can be obtained from the captured image. However, when the identification result is used for other purposes, the person may not turn their face toward the camera, and an accurate facial feature amount of the person may not be obtained from the captured image. Therefore, the above conventional technique may fail to identify the person with high accuracy.

図１は、本発明が適用された情報処理装置１００の構成例を示すブロック図である。情報処理装置１００は、画像入力部１０１、制御部１０２、記憶部１０３、及び、出力部１０４を有する。制御部１０２は、情報取得部１１１と識別部１１２を有する。 FIG. 1 is a block diagram showing a configuration example of an information processing apparatus 100 to which the present invention is applied. The information processing device 100 includes an image input unit 101, a control unit 102, a storage unit 103, and an output unit 104. The control unit 102 has an information acquisition unit 111 and an identification unit 112.

画像入力部１０１は、撮像された画像（画像データ）を取得する。例えば、画像入力部１０１は、画像データが入力される入力端子である。画像入力部１０１は、本発明の画像取得手段の一例である。 The image input unit 101 acquires the captured image (image data). For example, the image input unit 101 is an input terminal into which image data is input. The image input unit 101 is an example of the image acquisition means of the present invention.

制御部１０２は、ＣＰＵ（ＣｅｎｔｒａｌＰｒｏｃｅｓｓｉｎｇＵｎｉｔ）、ＲＡＭ（ＲａｎｄｏｍＡｃｃｅｓｓＭｅｍｏｒｙ）、ＲＯＭ（ＲｅａｄＯｎｌｙＭｅｍｏｒｙ）、等を含み、各種情報処理や各構成要素の制御を行う。 The control unit 102 includes a CPU (Central Processing Unit), a RAM (Random Access Memory), a ROM (Read Only Memory), and the like, and performs various information processing and control of each component.

情報取得部１１１は、画像入力部１０１によって取得された画像から、当該画像に写っている人物の姿勢、仕草、シルエット、動線、及び、滞在場所の少なくともいずれかを当該人物の特徴として示す特徴情報を取得する。情報取得部１１１は、本発明の情報取得手段の一例である。 The information acquisition unit 111 acquires, from the image acquired by the image input unit 101, feature information indicating at least one of the posture, gestures, silhouette, flow line, and place of stay of the person appearing in the image as a feature of the person. The information acquisition unit 111 is an example of the information acquisition means of the present invention.

識別部１１２は、情報取得部１１１によって取得された特徴情報に基づいて、画像入力部１０１によって取得された画像に写っている人物を識別する。識別部１１２は、本発明の識別手段の一例である。 The identification unit 112 identifies the person in the image acquired by the image input unit 101 based on the feature information acquired by the information acquisition unit 111. The identification unit 112 is an example of the identification means of the present invention.

記憶部１０３は、制御部１０２で実行されるプログラム、制御部１０２で使用される各種データ、等を記憶する。例えば、記憶部１０３は、ハードディスクドライブ、ソリッドステートドライブ、等の補助記憶装置である。記憶部１０３は、本発明の記憶手段の一例である。 The storage unit 103 stores programs executed by the control unit 102, various data used by the control unit 102, and the like. For example, the storage unit 103 is an auxiliary storage device such as a hard disk drive or a solid state drive. The storage unit 103 is an example of the storage means of the present invention.

出力部１０４は、識別部１１２の識別結果を、外部装置、情報処理装置１００の不図示の構成要素、等へ出力する。出力部１０４は、例えば、識別結果のデータを出力する出力端子である。 The output unit 104 outputs the identification result of the identification unit 112 to an external device, a component (not shown) of the information processing device 100, and the like. The output unit 104 is, for example, an output terminal that outputs identification result data.

情報処理装置１００の上記構成によれば、顔が撮像されていない場合でも撮像された画像に写っている人物を識別できる。具体的には、撮像された画像に写っている人物の姿勢、仕草、シルエット、動線、滞在場所、等は、当該人物の顔が撮像されていなくても取得できる。そして、それらの特徴は人物固有のものであるため、それらの特徴から人物を識別できる。 According to the above configuration of the information processing device 100, it is possible to identify a person in the captured image even when the face is not captured. Specifically, the posture, gesture, silhouette, flow line, place of stay, etc. of the person in the captured image can be acquired even if the face of the person is not captured. And since those characteristics are unique to a person, the person can be identified from those characteristics.
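
The data flow through the information processing device 100 described above (image input, feature acquisition, identification, output) can be sketched as below. The class name, the callable signatures, and the stub behaviour are assumptions introduced for illustration; the specification defines the units only functionally.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Optional

Features = Dict[str, Any]


@dataclass
class InformationProcessingDevice:
    """Hypothetical stand-in for device 100; each field plays the role of one unit."""
    acquire_features: Callable[[Any], Features]      # information acquisition unit 111
    identify: Callable[[Features], Optional[str]]    # identification unit 112
    output: Callable[[Optional[str]], None]          # output unit 104

    def process(self, image: Any) -> Optional[str]:
        """Run one image (as received by image input unit 101) through the pipeline."""
        features = self.acquire_features(image)
        person = self.identify(features)
        self.output(person)
        return person


# Usage with stub units; a real system would plug in actual detectors and references.
sent = []
device = InformationProcessingDevice(
    acquire_features=lambda image: {"posture_back": "normal"},
    identify=lambda features: "A" if features["posture_back"] == "normal" else None,
    output=sent.append,
)
device.process(image=None)
```

Keeping the three units as interchangeable callables mirrors the way the later embodiments reuse the same structure while changing which features are acquired and where the result is sent.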

＜第１の実施形態＞ <First embodiment>
本発明の第１の実施形態について説明する。従業員や住人（住民）などを管理したり、職場、地域、家庭、等におけるセキュリティを向上させたりするために、監視カメラが利用されることがある。人物が監視カメラに対して顔を向けることは少ない。第１の実施形態では、監視カメラによって撮像された画像に写っている人物を識別する例を説明する。 The first embodiment of the present invention will be described. Surveillance cameras may be used to manage employees, residents, and the like, and to improve security in workplaces, communities, homes, and so on. People rarely turn their faces toward a surveillance camera. The first embodiment describes an example of identifying a person appearing in an image captured by a surveillance camera.

図２は、第１の実施形態に係る情報処理システムの構成例を示すブロック図である。図２の情報処理システムでは、図１の情報処理装置１００に監視カメラ２００と管理装置３００とが接続されている。監視カメラ２００と管理装置３００の少なくとも一方は情報処理装置１００の一部であってもよい。 FIG. 2 is a block diagram showing a configuration example of the information processing system according to the first embodiment. In the information processing system of FIG. 2, the surveillance camera 200 and the management device 300 are connected to the information processing device 100 of FIG. At least one of the surveillance camera 200 and the management device 300 may be a part of the information processing device 100.

監視カメラ２００は、撮像を行い、撮像した画像を情報処理装置１００へ出力する撮像装置である。監視カメラ２００の撮像範囲は、固定であってもよいし、変化してもよい。情報処理装置１００の画像入力部１０１には、監視カメラ２００によって撮像された画像が入力される。監視カメラ２００によって撮像された画像が画像入力部１０１に入力されると、当該画像に写っている人物が、情報処理装置１００の制御部１０２（情報取得部１１１と識別部１１２）の処理によって識別される。そして、情報処理装置１００の出力部１０４が、制御部１０２（識別部１１２）の識別結果を、管理装置３００へ出力する。管理装置３００は、情報処理装置１００から出力された識別結果を用いて各種処理を行う。例えば、管理装置３００は、所定期間における複数の識別結果の統計データを生成したり、統計データや識別結果を管理者（管理装置３００のユーザ）に通知したりする。管理者への通知は、例えば、液晶モニタなどの表示部を用いた画面表示、スピーカを用いた音声出力、等によって実現される。第１の実施形態では、情報取得部１１１は、姿勢と仕草を示す特徴情報を取得する。 The surveillance camera 200 is an imaging device that captures images and outputs the captured images to the information processing device 100. The imaging range of the surveillance camera 200 may be fixed or variable. An image captured by the surveillance camera 200 is input to the image input unit 101 of the information processing device 100. When the image is input to the image input unit 101, the person appearing in the image is identified by the processing of the control unit 102 (information acquisition unit 111 and identification unit 112) of the information processing device 100. The output unit 104 of the information processing device 100 then outputs the identification result of the control unit 102 (identification unit 112) to the management device 300. The management device 300 performs various processes using the identification result output from the information processing device 100. For example, the management device 300 generates statistical data on a plurality of identification results over a predetermined period, and notifies the administrator (the user of the management device 300) of the statistical data and the identification results. Notification to the administrator is realized by, for example, screen display on a display unit such as a liquid crystal monitor or audio output from a speaker. In the first embodiment, the information acquisition unit 111 acquires feature information indicating posture and gestures.

図３は、第１の実施形態に係る情報処理装置１００の処理フロー例を示すフローチャートである。 FIG. 3 is a flowchart showing an example of a processing flow of the information processing apparatus 100 according to the first embodiment.

まず、画像入力部１０１が、監視カメラ２００によって撮像された画像を監視カメラ２００から取得する（ステップＳ３０１）。図４は、監視カメラ２００によって撮像された画像の一例を示す。図４では、監視カメラ２００によって撮像された画像４００に人物４０１が写っている。 First, the image input unit 101 acquires an image captured by the surveillance camera 200 from the surveillance camera 200 (step S301). FIG. 4 shows an example of an image captured by the surveillance camera 200. In FIG. 4, the person 401 is shown in the image 400 captured by the surveillance camera 200.

次に、情報取得部１１１が、ステップＳ３０１で取得された画像から、当該画像に写っている人物の骨格を示す骨格情報を取得する（ステップＳ３０２）。骨格情報は、例えば、ＯｐｅｎＰｏｓｅなどを使って取得される。骨格情報は、人体を示す情報でもあるし、人体の部位（頭、首、肩、肘、手、腰、膝、足首、目、耳、指先、等）を示す情報でもある。そのため、骨格情報の取得は「人体検出」や「部位検出」などとも言える。図４では、画像４００から、人物４０１の骨格（骨格情報）４０２が検出されている。 Next, the information acquisition unit 111 acquires, from the image acquired in step S301, skeleton information indicating the skeleton of the person appearing in the image (step S302). The skeleton information is acquired using, for example, OpenPose. The skeleton information indicates the human body as a whole and also indicates parts of the human body (head, neck, shoulders, elbows, hands, hips, knees, ankles, eyes, ears, fingertips, etc.). Acquiring skeleton information can therefore also be called "human body detection" or "body part detection". In FIG. 4, the skeleton (skeleton information) 402 of the person 401 is detected from the image 400.

そして、情報取得部１１１が、ステップＳ３０２で取得された骨格情報に基づいて、ステップＳ３０１で取得された画像に写っている人物の姿勢と仕草を検出する（ステップＳ３０３，Ｓ３０４）。ステップＳ３０３の検出結果（姿勢）とステップＳ３０４の検出結果（仕草）との組み合わせが、ステップＳ３０１で取得された画像に写っている人物の特徴情報である。ステップＳ３０３の処理（姿勢検出）とステップＳ３０４の処理（仕草検出）とは、並列に行われてもよいし、順番に行われてもよい。姿勢検出と仕草検出の順番は特に限定されない。 Then, the information acquisition unit 111 detects the posture and gesture of the person in the image acquired in step S301 based on the skeleton information acquired in step S302 (steps S303 and S304). The combination of the detection result (posture) of step S303 and the detection result (gesture) of step S304 is the characteristic information of the person shown in the image acquired in step S301. The process of step S303 (posture detection) and the process of step S304 (gesture detection) may be performed in parallel or in order. The order of posture detection and gesture detection is not particularly limited.

ステップＳ３０３では、姿勢の検出結果として、例えば、直立、猫背、Ｏ脚、Ｘ脚、等を示す情報が得られる。図５に示すように、猫背の場合と正常の場合との間で、骨格の形状は異なる。このように、骨格の形状は姿勢に依存する。そのため、骨格情報に基づいて、骨格の形状から姿勢を検出できる。 In step S303, information indicating, for example, upright posture, stooped back, bow legs (O-legs), or knock knees (X-legs) is obtained as the posture detection result. As shown in FIG. 5, the shape of the skeleton differs between a stooped back and a normal back. The shape of the skeleton thus depends on the posture, so the posture can be detected from the shape of the skeleton based on the skeleton information.
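
One way posture might be read off the skeleton shape is sketched below, using the forward lean of the neck relative to the hip in a side view as a crude proxy for a stooped back. The keypoint names, the side-view assumption, and the 15-degree threshold are all hypothetical choices for illustration and are not taken from the specification.

```python
import math
from typing import Dict, Tuple

Point = Tuple[float, float]  # (x, y) in image pixels; y grows downward


def trunk_lean_degrees(keypoints: Dict[str, Point]) -> float:
    """Angle between the hip-to-neck segment and the vertical, in degrees."""
    neck_x, neck_y = keypoints["neck"]
    hip_x, hip_y = keypoints["hip"]
    # hip_y - neck_y is the upward extent of the trunk in image coordinates.
    return math.degrees(math.atan2(abs(neck_x - hip_x), hip_y - neck_y))


def classify_back(keypoints: Dict[str, Point], stoop_threshold_deg: float = 15.0) -> str:
    """Label the back posture; the threshold is an assumed tuning parameter."""
    return "stooped" if trunk_lean_degrees(keypoints) > stoop_threshold_deg else "normal"


upright = {"neck": (100.0, 50.0), "hip": (100.0, 150.0)}
leaning = {"neck": (140.0, 60.0), "hip": (100.0, 150.0)}
print(classify_back(upright), classify_back(leaning))
```

A practical detector would likely use more keypoints (for example shoulders and ears) and account for camera angle; the sketch only shows that a posture label can be derived from keypoint geometry.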

ステップＳ３０４では、仕草の検出結果として、例えば、歩行、屈伸、寝転び、腕組み、等を示す情報が得られる。腕組みの場合と腕組みでない場合との間で、上腕と前腕の間の位置関係などは異なる。このように、各部位の位置関係は仕草に依存する。そのため、骨格情報に基づいて、各部位の位置関係から仕草を検出できる。歩行や屈伸などの動きを伴う仕草は、互いに異なる時間に撮像された複数の画像にそれぞれ対応する複数の骨格情報を用いて検出されてもよい。歩行については、歩幅と肩幅の比率を示す情報が得られてもよい。 In step S304, as a result of detecting the gesture, information indicating, for example, walking, bending and stretching, lying down, arms folded, and the like can be obtained. The positional relationship between the upper arm and the forearm differs between the case where the arms are folded and the case where the arms are not folded. In this way, the positional relationship of each part depends on the gesture. Therefore, the gesture can be detected from the positional relationship of each part based on the skeleton information. Gestures accompanied by movements such as walking and bending and stretching may be detected by using a plurality of skeletal information corresponding to a plurality of images captured at different times. For walking, information indicating the ratio of stride length to shoulder width may be obtained.
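
The stride-to-shoulder-width ratio mentioned above could be estimated from skeleton information over several frames, for instance as follows. Taking the largest ankle separation as the stride and the median shoulder separation as the shoulder width is an assumption made for this sketch, as are the keypoint names; perspective and camera angle are ignored.

```python
import math
from statistics import median
from typing import Dict, List, Tuple

Point = Tuple[float, float]
Skeleton = Dict[str, Point]  # hypothetical keypoint names, pixel coordinates


def stride_ratio(frames: List[Skeleton]) -> float:
    """Stride length expressed in units of shoulder width."""
    shoulder_width = median(
        math.dist(f["l_shoulder"], f["r_shoulder"]) for f in frames
    )
    # Assumed estimate: the stride is the widest ankle separation seen while walking.
    stride = max(math.dist(f["l_ankle"], f["r_ankle"]) for f in frames)
    return stride / shoulder_width


frames = [
    {"l_shoulder": (90.0, 50.0), "r_shoulder": (130.0, 50.0),
     "l_ankle": (100.0, 200.0), "r_ankle": (120.0, 200.0)},
    {"l_shoulder": (90.0, 50.0), "r_shoulder": (130.0, 50.0),
     "l_ankle": (80.0, 200.0), "r_ankle": (140.0, 200.0)},
]
print(stride_ratio(frames))  # 1.5
```

Normalizing by shoulder width makes the value roughly independent of how far the person is from the camera, which is presumably why a ratio rather than a raw length is used.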

なお、骨格情報を用いない他の方法で姿勢や仕草が検出されてもよい。例えば、パターンマッチングなどを用いた姿勢検出や仕草検出が行われてもよい。 The posture and gesture may be detected by another method that does not use skeletal information. For example, posture detection or gesture detection using pattern matching or the like may be performed.

次に、識別部１１２が、ステップＳ３０３，Ｓ３０４の処理によって得られた特徴情報に基づいて、ステップＳ３０１で取得された画像に写っている人物を識別する（ステップＳ３０５）。例えば、記憶部１０３は、複数の人物のそれぞれについて、その人物の特徴を示す参照情報を予め記憶する。そして、識別部１１２は、特徴情報と各参照情報を比較して、画像に写っている人物を識別する。参照情報は、情報処理装置１００に対して手動で登録されてもよいし、監視カメラ２００によって撮像された画像を用いて自動で登録されてもよい。 Next, the identification unit 112 identifies the person in the image acquired in step S301 based on the feature information obtained by the processes of steps S303 and S304 (step S305). For example, the storage unit 103 stores in advance reference information indicating the characteristics of each of the plurality of persons. Then, the identification unit 112 compares the feature information with each reference information to identify the person in the image. The reference information may be manually registered in the information processing apparatus 100, or may be automatically registered using the image captured by the surveillance camera 200.

図６は、第１の実施形態に係る参照情報の一例を示す。図６では、参照情報６０１〜６０３が予め登録されている。参照情報６０１は、「Ａさん」の特徴として、「姿勢（背）：正常」、「姿勢（脚）：正常」、及び、「歩幅比（肩幅を１とした時の歩幅の比率）：１．５」を示す。参照情報６０２は、「Ｂさん」の特徴として、「姿勢（背）：猫背」、「姿勢（脚）：Ｏ脚」、及び、「歩幅比：１．３」を示す。そして、参照情報６０３は、「Ｃさん」の特徴として、「姿勢（背）：猫背」、「姿勢（脚）：Ｏ脚」、及び、「歩幅比：１．０」を示す。 FIG. 6 shows an example of reference information according to the first embodiment. In FIG. 6, reference information 601 to 603 is registered in advance. Reference information 601 indicates, as the features of "Mr. A", "posture (back): normal", "posture (legs): normal", and "stride ratio (ratio of stride length when the shoulder width is taken as 1): 1.5". Reference information 602 indicates, as the features of "Mr. B", "posture (back): stooped", "posture (legs): bow legs (O-legs)", and "stride ratio: 1.3". Reference information 603 indicates, as the features of "Mr. C", "posture (back): stooped", "posture (legs): bow legs (O-legs)", and "stride ratio: 1.0".

ここで、参照情報６０１〜６０３が予め登録されており、且つ、ステップＳ３０３，Ｓ３０４の処理によって「姿勢（背）：正常」、「姿勢（脚）：正常」、及び、「歩幅比：１．４」を示す特徴情報が取得された場合を考える。この場合には、参照情報６０１〜６０３のうち特徴情報に最も類似する情報は参照情報６０１である。そのため、識別部１１２は、ステップＳ３０１で取得された画像に写っている人物が「Ａさん」であると判定する。 Consider a case where reference information 601 to 603 is registered in advance and the processing of steps S303 and S304 yields feature information indicating "posture (back): normal", "posture (legs): normal", and "stride ratio: 1.4". In this case, of reference information 601 to 603, the one most similar to the feature information is reference information 601. The identification unit 112 therefore determines that the person appearing in the image acquired in step S301 is "Mr. A".
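
The comparison in this example can be sketched as a nearest-reference lookup. The reference values are those of FIG. 6 as described in the text; the dissimilarity measure (one point per mismatching categorical item plus the absolute difference of numeric items) is an assumption, since the specification only says that the most similar reference information is selected.

```python
from typing import Dict, Union

Value = Union[str, float]
Features = Dict[str, Value]

# Reference information 601 to 603 from FIG. 6 (dictionary keys are illustrative).
REFERENCES: Dict[str, Features] = {
    "A": {"back": "normal", "legs": "normal", "stride_ratio": 1.5},
    "B": {"back": "stooped", "legs": "bow", "stride_ratio": 1.3},
    "C": {"back": "stooped", "legs": "bow", "stride_ratio": 1.0},
}


def dissimilarity(features: Features, reference: Features) -> float:
    """Assumed measure: categorical mismatch counts 1, numeric items add |difference|."""
    score = 0.0
    for key, ref_value in reference.items():
        value = features[key]
        if isinstance(ref_value, str):
            score += 0.0 if value == ref_value else 1.0
        else:
            score += abs(value - ref_value)
    return score


def identify(features: Features, references: Dict[str, Features]) -> str:
    """Return the registered person whose reference information is most similar."""
    return min(references, key=lambda name: dissimilarity(features, references[name]))


observed = {"back": "normal", "legs": "normal", "stride_ratio": 1.4}
print(identify(observed, REFERENCES))  # A
```

With this measure the observed features score about 0.1 against A, 2.1 against B, and 2.4 against C, so A is selected, matching the outcome stated in the text.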

次に、出力部１０４が、ステップＳ３０５の識別結果を管理装置３００へ出力する（ステップＳ３０６）。 Next, the output unit 104 outputs the identification result of step S305 to the management device 300 (step S306).

以上述べたように、第１の実施形態によれば、監視カメラによって顔が撮像されていない場合でも、監視カメラによって撮像された画像に写っている人物の姿勢と仕草を示す特徴情報に基づいて、当該人物を識別できる。 As described above, according to the first embodiment, even when the face is not captured by the surveillance camera, the person appearing in the image captured by the surveillance camera can be identified based on feature information indicating the person's posture and gestures.

＜第２の実施形態＞ <Second embodiment>
本発明の第２の実施形態について説明する。ユーザ（従業員、顧客、住人、等）に有意義な情報を提供したり、ユーザとの会話を行ったりするコミュニケーションロボットでは、ユーザとの適切なコミュニケーションのために、ユーザの識別が行われることが好ましい。例えば、コミュニケーションロボットは、自身の撮像部で撮像された画像から人物の顔特徴量を取得し、取得した顔特徴量から当該人物を識別する。 A second embodiment of the present invention will be described. In a communication robot that provides meaningful information to users (employees, customers, residents, etc.) or holds conversations with them, it is preferable to identify the user so that the robot can communicate with the user appropriately. For example, the communication robot acquires the facial feature amount of a person from an image captured by its own imaging unit and identifies the person from the acquired facial feature amount.

しかしながら、識別前に人物がコミュニケーションロボット（撮像部）に対して顔を向けるとは限らない。特に、コミュニケーションロボットが自発的にユーザとコミュニケーションをとる場合には、当該ユーザが識別前にコミュニケーションロボットに対して顔を向ける可能性は低い。また、遺伝などの影響により、家族における人物間で顔が似ていることがある。そのため、家族における或る人物がコミュニケーションロボットに対して顔を向けており、当該人物の正確な顔特徴量が得られたとしても、コミュニケーションロボットは、当該人物を顔特徴量から特定できないことがある。 However, a person does not necessarily turn their face toward the communication robot (imaging unit) before being identified. In particular, when the communication robot initiates communication with the user, the user is unlikely to turn their face toward the robot before identification. In addition, family members may have similar faces because of heredity and other factors. Therefore, even if a family member faces the communication robot and an accurate facial feature amount is obtained, the communication robot may be unable to determine which family member it is from the facial feature amount.

第２の実施形態では、コミュニケーションロボットに本発明を適用した例を説明する。 In the second embodiment, an example in which the present invention is applied to a communication robot will be described.

図７は、第２の実施形態に係るロボット（コミュニケーションロボット）７００の構成例を示すブロック図である。ロボット７００は、撮像部７０１、画像入力部１０１、制御部１０２、記憶部１０３、出力部１０４、及び、コミュニケーション部７０２を有する。制御部１０２は、情報取得部１１１と識別部１１２を有する。 FIG. 7 is a block diagram showing a configuration example of the robot (communication robot) 700 according to the second embodiment. The robot 700 includes an image pickup unit 701, an image input unit 101, a control unit 102, a storage unit 103, an output unit 104, and a communication unit 702. The control unit 102 has an information acquisition unit 111 and an identification unit 112.

撮像部７０１は、撮像を行い、撮像した画像を画像入力部１０１へ出力する。例えば、撮像部７０１は、ＣＣＤやＣＭＯＳセンサなどの撮像センサである。画像入力部１０１、制御部１０２、記憶部１０３、及び、出力部１０４については、第１の実施形態で述べたとおりである。但し、画像入力部１０１は、撮像部７０１によって撮像された画像を撮像部７０１から取得し、出力部１０４は、識別結果をコミュニケーション部７０２へ出力する。コミュニケーション部７０２は、出力部１０４から出力された識別結果に基づいて、撮像部７０１によって撮像された画像に写っている人物とのコミュニケーションのための処理を行う。例えば、コミュニケーション部７０２は、液晶モニタなどの表示部、スピーカ、等を含む。そして、コミュニケーション部７０２は、識別結果の人物への情報の提供、識別結果の人物との会話、等のために、識別結果に基づいて、液晶モニタなどの表示部を用いた画面表示、スピーカを用いた音声出力、等を行う。 The imaging unit 701 captures an image and outputs the captured image to the image input unit 101. For example, the imaging unit 701 is an imaging sensor such as a CCD or CMOS sensor. The image input unit 101, the control unit 102, the storage unit 103, and the output unit 104 are as described in the first embodiment. However, the image input unit 101 acquires the image captured by the imaging unit 701 from the imaging unit 701, and the output unit 104 outputs the identification result to the communication unit 702. Based on the identification result output from the output unit 104, the communication unit 702 performs processing for communicating with the person appearing in the image captured by the imaging unit 701. For example, the communication unit 702 includes a display unit such as a liquid crystal monitor, a speaker, and the like. In order to provide information to the identified person, hold a conversation with the identified person, and so on, the communication unit 702 performs screen display on the display unit, audio output from the speaker, and the like based on the identification result.

ロボット７００の処理フロー例は、第１の実施形態（図３）と同様である。但し、ステップＳ３０１にて、画像入力部１０１は、撮像部７０１によって撮像された画像を撮像部７０１から取得する。ステップＳ３０６にて、出力部１０４は、ステップＳ３０５の識別結果をコミュニケーション部７０２へ出力する。 The processing flow example of the robot 700 is the same as that of the first embodiment (FIG. 3). However, in step S301, the image input unit 101 acquires the image captured by the image pickup unit 701 from the image pickup unit 701. In step S306, the output unit 104 outputs the identification result of step S305 to the communication unit 702.

以上述べたように、第２の実施形態によれば、コミュニケーションロボットにおいて、顔が撮像されていない場合でも、撮像された画像に写っている人物の姿勢と仕草を示す特徴情報に基づいて、当該人物を識別できる。また、実年齢の差や精神年齢の差などにより、家族における人物間であっても姿勢や仕草が異なる可能性は高い。そのため、姿勢や仕草を考慮することにより、家族における各人物も高精度に特定できる。同様に、家族における人物間であってもシルエット、動線、滞在場所、等が異なる可能性は高い。シルエット、動線、滞在場所、等を考慮する例については後述する。 As described above, according to the second embodiment, even when the face is not captured, the communication robot can identify the person appearing in the captured image based on feature information indicating the person's posture and gestures. In addition, posture and gestures are likely to differ even among family members because of differences in actual age, mental age, and the like. Taking posture and gestures into account therefore allows each family member to be identified with high accuracy. Similarly, silhouettes, flow lines, places of stay, and the like are likely to differ even among family members. Examples that take the silhouette, flow line, place of stay, and the like into account are described later.

<Third Embodiment>
A third embodiment of the present invention will be described. The first and second embodiments described examples in which feature information indicating posture and gesture is acquired. The third embodiment describes an example in which the acquired feature information additionally indicates the silhouette of the person shown in the captured image.

The configuration of the information processing device according to the third embodiment is the same as that of the information processing device 100 according to the first embodiment (FIGS. 1 and 2) or that of the robot 700 according to the second embodiment (FIG. 7).

FIG. 8 is a flowchart showing an example of the processing flow of the information processing device according to the third embodiment.

First, as in the first and second embodiments, the image input unit 101 acquires a captured image (step S301), and the information acquisition unit 111 acquires skeleton information (step S302).

Next, based on the image acquired in step S301 and the skeleton information acquired in step S302, the information acquisition unit 111 detects the posture, gesture, and silhouette of the person shown in the image acquired in step S301 (steps S303, S304, and S800). The combination of the detection result of step S303 (posture), the detection result of step S304 (gesture), and the detection result of step S800 (silhouette) is the feature information of the person shown in the image acquired in step S301. The processing of step S303 (posture detection), step S304 (gesture detection), and step S800 (silhouette detection) may be performed in parallel or sequentially. The order of posture detection, gesture detection, and silhouette detection is not particularly limited.

The processing of step S303 (posture detection) and step S304 (gesture detection) is as described in the first embodiment. In step S800, the silhouette of the person is detected from the image using, for example, Mask R-CNN.

Next, as in the first and second embodiments, the identification unit 112 identifies the person shown in the image acquired in step S301 based on the obtained feature information (specifically, the feature information obtained by the processing of steps S303, S304, and S800) (step S305).

FIG. 9(A) shows an example of reference information according to the third embodiment. In FIG. 9(A), reference information 901 to 903 is registered in advance. Reference information 901 indicates "posture (back): normal", "stride ratio: 1.5", and "mask image: I1" as the features of "A". Reference information 902 indicates "posture (back): stooped", "stride ratio: 1.3", and "mask image: I2" as the features of "B". Reference information 903 indicates "posture (back): stooped", "stride ratio: 1.3", and "mask image: I3" as the features of "C".

The mask images I1 to I3 are images that represent silhouettes. FIG. 9(B) shows an example of the silhouettes corresponding to the mask images I1 to I3. "A", "B", and "C" have different body shapes. Therefore, as shown in FIG. 9(B), the silhouettes differ among the mask images I1 to I3.

Consider a case in which reference information 901 to 903 is registered in advance and the processing of steps S303, S304, and S800 yields feature information indicating "posture (back): stooped", "stride ratio: 1.3", and "silhouette image: Ip". The silhouette image is an image representing the silhouette detected in step S800. In this case, "posture (back): stooped" and "stride ratio: 1.3" match the corresponding entries of both reference information 902 and reference information 903. Therefore, "posture (back): stooped" and "stride ratio: 1.3" alone cannot determine whether the person shown in the image acquired in step S301 is "B" or "C".

In the third embodiment, the identification unit 112 compares, for example, the silhouette image Ip of the feature information with the mask image I2 of reference information 902 and the mask image I3 of reference information 903. The identification unit 112 can thereby determine that the silhouette detected in step S800 is more similar to the silhouette of "B" than to the silhouette of "C". As a result, the identification unit 112 can determine that the person shown in the image acquired in step S301 is "B". In this way, by additionally considering the silhouette, the third embodiment makes it possible to identify a person who cannot be identified from posture and gesture alone.
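The two-stage decision described above can be sketched as follows. This is only an illustrative sketch: the record layout, the tiny 2x3 binary grids, the tolerance `tol`, and the helper names `match_ratio` and `identify` are assumptions made for the example and do not appear in the specification.

```python
# Hypothetical reference records mirroring FIG. 9(A). The masks are made-up
# binary grids (1 = silhouette, 0 = background), not data from the patent.
REFERENCES = [
    {"name": "A", "posture": "normal", "stride_ratio": 1.5,
     "mask": [[0, 1, 0], [0, 1, 0]]},
    {"name": "B", "posture": "stooped", "stride_ratio": 1.3,
     "mask": [[1, 1, 0], [1, 1, 0]]},
    {"name": "C", "posture": "stooped", "stride_ratio": 1.3,
     "mask": [[0, 1, 1], [0, 1, 1]]},
]


def match_ratio(a, b):
    """Fraction of cells whose binary attribute agrees (assumed measure)."""
    cells = [(x, y) for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b)]
    return sum(x == y for x, y in cells) / len(cells)


def identify(feature, references, tol=0.05):
    """Return the best-matching name, or None if nothing matches."""
    # Stage 1: keep references whose posture and stride ratio match.
    candidates = [
        r for r in references
        if r["posture"] == feature["posture"]
        and abs(r["stride_ratio"] - feature["stride_ratio"]) <= tol
    ]
    if not candidates:
        return None
    # Stage 2: among the remaining candidates, pick the most similar silhouette.
    best = max(candidates,
               key=lambda r: match_ratio(feature["silhouette"], r["mask"]))
    return best["name"]


feature = {"posture": "stooped", "stride_ratio": 1.3,
           "silhouette": [[1, 1, 0], [1, 0, 0]]}
print(identify(feature, REFERENCES))  # prints B
```

Posture and stride ratio narrow the candidates to "B" and "C"; the silhouette comparison then settles on "B", as in the example in the text.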

Note that the size of a person's silhouette depends on the distance between the camera (imaging unit) and the person. Therefore, the detected silhouette is size-normalized before the silhouette image is generated, and mask images corresponding to size-normalized silhouettes are prepared in advance. The silhouette is normalized so that, for example, the size from the top of the head to the tips of the feet equals a predetermined value.
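One possible way to perform such a normalization is sketched below. The specification does not state how the rescaling is done; cropping to the bounding box and nearest-neighbour resampling are assumptions chosen for simplicity, and `normalize_height` is a hypothetical helper name.

```python
def normalize_height(mask, target_height):
    """Crop a binary mask to the silhouette's bounding box and rescale it
    (nearest neighbour) so the head-to-foot extent is target_height rows.
    The width is scaled by the same factor to preserve the body shape."""
    rows = [i for i, row in enumerate(mask) if any(row)]
    if not rows:
        raise ValueError("mask contains no silhouette pixels")
    cols = [j for j in range(len(mask[0])) if any(row[j] for row in mask)]
    top, bottom, left, right = rows[0], rows[-1], cols[0], cols[-1]
    cropped = [row[left:right + 1] for row in mask[top:bottom + 1]]

    src_h, src_w = len(cropped), len(cropped[0])
    scale = target_height / src_h
    target_width = max(1, round(src_w * scale))
    return [
        [cropped[min(src_h - 1, int(y / scale))][min(src_w - 1, int(x / scale))]
         for x in range(target_width)]
        for y in range(target_height)
    ]


# A person far from the camera occupies only two rows of this 4x4 frame.
small = [[0, 0, 0, 0],
         [0, 1, 0, 0],
         [0, 1, 0, 0],
         [0, 0, 0, 0]]
normalized = normalize_height(small, 4)
print(len(normalized))  # prints 4
```

After normalization, silhouettes captured at different distances have the same height and can be compared cell by cell with the registered mask images.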

An example of how the silhouette similarity is obtained will now be described in more detail. FIG. 10 shows an example of a silhouette image and a mask image. Although FIG. 10 shows a mask image corresponding to part of an arm, a mask image may correspond to the whole person or to a part of the person. As shown in FIG. 10, a mask image consists of a plurality of regions (silhouette determination regions). Each silhouette determination region that corresponds to the silhouette represented by the mask image is assigned the attribute "1", which denotes the silhouette. Each silhouette determination region that corresponds to the background represented by the mask image is assigned the attribute "0", which denotes the background.

For each silhouette determination region, the identification unit 112 determines whether the attribute of the silhouette image (silhouette or background) matches the attribute of the mask image (1 or 0, that is, silhouette or background). The identification unit 112 then calculates, as the silhouette similarity, a value such as the ratio of the number of matching regions (silhouette determination regions judged to match) to the total number of silhouette determination regions, or the number of matching regions itself.
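The two similarity measures named above (number of matching regions, and its ratio to the total) can be computed as in the following sketch. The grid representation as nested lists and the function name `silhouette_similarity` are assumptions for illustration.

```python
def silhouette_similarity(silhouette, mask):
    """Compare two equally sized binary grids region by region.

    1 marks a silhouette region and 0 a background region.
    Returns (number of matching regions, matching regions / total regions).
    """
    same_shape = len(silhouette) == len(mask) and all(
        len(s_row) == len(m_row) for s_row, m_row in zip(silhouette, mask))
    if not same_shape:
        raise ValueError("silhouette image and mask image must have the same shape")

    total = sum(len(row) for row in mask)
    matches = sum(s == m
                  for s_row, m_row in zip(silhouette, mask)
                  for s, m in zip(s_row, m_row))
    return matches, matches / total


count, ratio = silhouette_similarity([[1, 1, 0], [1, 0, 0]],
                                     [[1, 1, 0], [1, 1, 0]])
print(count)  # prints 5 (five of the six regions agree)
```

The ratio is independent of the number of regions, which makes it easier to compare against a fixed threshold than the raw count.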

The size of a silhouette determination region is not particularly limited. A silhouette determination region may be a single pixel or may span a plurality of pixels. When a silhouette determination region spans a plurality of pixels, it may contain both the silhouette represented by the silhouette image and the background. In that case, the attribute of the silhouette image in that region may be determined based on, for example, the size of the silhouette (the silhouette represented by the silhouette image) within the region. For example, the attribute "silhouette" may be assigned to a silhouette determination region in which the size of the silhouette is equal to or greater than a threshold, and the attribute "background" may be assigned to every other silhouette determination region.
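The threshold rule for multi-pixel regions might look like the sketch below. Expressing the "size of the silhouette" as the fraction of silhouette pixels in the region, the default threshold of 0.5, and the name `block_attributes` are all assumptions; the specification only says that a threshold on the silhouette size is used.

```python
def block_attributes(pixels, block, threshold=0.5):
    """Reduce a binary pixel grid to block x block determination regions.

    A region gets attribute 1 (silhouette) when the fraction of silhouette
    pixels inside it is at least `threshold`, otherwise 0 (background).
    Regions at the right and bottom edges may be smaller than `block`.
    """
    height, width = len(pixels), len(pixels[0])
    result = []
    for top in range(0, height, block):
        row = []
        for left in range(0, width, block):
            cells = [pixels[y][x]
                     for y in range(top, min(top + block, height))
                     for x in range(left, min(left + block, width))]
            row.append(1 if sum(cells) / len(cells) >= threshold else 0)
        result.append(row)
    return result


pixels = [[1, 1, 0, 0],
          [1, 0, 0, 0],
          [0, 0, 1, 1],
          [0, 0, 1, 1]]
print(block_attributes(pixels, block=2))  # prints [[1, 0], [0, 1]]
```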

Next, as in the first and second embodiments, the output unit 104 outputs the identification result of step S305 (step S306).

As described above, according to the third embodiment, using feature information that additionally indicates the silhouette allows a person to be identified with higher accuracy than in the first and second embodiments.

<Fourth Embodiment>
A fourth embodiment of the present invention will be described. The fourth embodiment describes an example in which feature information indicating the posture, gesture, and place of stay of the person shown in the captured image is acquired.

The configuration of the information processing device according to the fourth embodiment is the same as that of the information processing device 100 according to the first embodiment (FIGS. 1 and 2) or that of the robot 700 according to the second embodiment (FIG. 7).

FIG. 11 is a flowchart showing an example of the processing flow of the information processing device according to the fourth embodiment.

After the image and the skeleton information have been acquired (steps S301 and S302), the information acquisition unit 111 detects, based on the skeleton information acquired in step S302, the posture, gesture, and place of stay of the person shown in the image acquired in step S301 (steps S303, S304, and S1100). The combination of the detection result of step S303 (posture), the detection result of step S304 (gesture), and the detection result of step S1100 (place of stay) is the feature information of the person shown in the image acquired in step S301. The processing of step S303 (posture detection), step S304 (gesture detection), and step S1100 (place-of-stay detection) may be performed in parallel or sequentially. The order of posture detection, gesture detection, and place-of-stay detection is not particularly limited.

The processing of step S303 (posture detection) and step S304 (gesture detection) is as described in the first embodiment. In step S1100, the information acquisition unit 111 detects the person's place of stay based on, for example, the position of the skeleton indicated by the skeleton information (that is, the person's position). Specifically, a plurality of stay determination regions that make up the imaging range are defined in advance. Based on how the person's position changes over a predetermined period, such as the past several minutes, the information acquisition unit 111 calculates, for each stay determination region, the ratio of the stay time to the length of the predetermined period (the stay rate). As the detection result for the place of stay, this yields a stay map (heat map) indicating the stay rate of each stay determination region, for example the stay map Mp shown in FIG. 12.
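The stay-rate calculation can be sketched as follows, under the assumption that the person's position is sampled at a fixed interval so that the number of samples in a region is proportional to the stay time. The rectangular grid of equal-sized regions, the parameter `cell_size`, and the function name `stay_map` are also assumptions for the example.

```python
def stay_map(positions, grid_rows, grid_cols, cell_size):
    """Build a stay map from person positions sampled at a fixed interval.

    positions: list of (x, y) coordinates covering the observation period.
    Returns a grid_rows x grid_cols grid whose entries are stay rates, i.e.
    the fraction of the period spent in each stay determination region.
    """
    if not positions:
        raise ValueError("no position samples")
    counts = [[0] * grid_cols for _ in range(grid_rows)]
    for x, y in positions:
        # Clamp so that samples on the outer edge still fall into a region.
        col = min(grid_cols - 1, max(0, int(x // cell_size)))
        row = min(grid_rows - 1, max(0, int(y // cell_size)))
        counts[row][col] += 1
    total = len(positions)
    return [[c / total for c in row] for row in counts]


samples = [(0.5, 0.5), (0.6, 0.4), (1.5, 0.5), (0.2, 1.7)]
print(stay_map(samples, grid_rows=2, grid_cols=2, cell_size=1.0))
# prints [[0.5, 0.25], [0.25, 0.0]]
```

Because every sample lands in exactly one region, the stay rates of one map sum to 1 when the person stays inside the imaging range for the whole period.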

Next, as in the first and second embodiments, the identification unit 112 identifies the person shown in the image acquired in step S301 based on the obtained feature information (specifically, the feature information obtained by the processing of steps S303, S304, and S1100) (step S305).

FIG. 13(A) shows an example of reference information according to the fourth embodiment. In FIG. 13(A), reference information 1301 to 1303 is registered in advance. Reference information 1301 indicates "posture (back): normal", "stride ratio: 1.5", and "stay map: M1" as the features of the "father". Reference information 1302 indicates "posture (back): stooped", "stride ratio: 1.3", and "stay map: M2" as the features of the "mother". Reference information 1303 indicates "posture (back): stooped", "stride ratio: 1.3", and "stay map: M3" as the features of the "sister".

The stay maps M1 to M3 of reference information 1301 to 1303 indicate the stay rates over a predetermined period, such as the past month. FIG. 13(B) shows an example of the stay maps M1 to M3. The place of stay (stay rate) depends on the person: for example, the father is often on the aisle side of the living room, the mother is often in the kitchen, and the sister is often on the wall side of the living room. Therefore, as shown in FIG. 13(B), the distribution of stay rates differs among the stay maps M1 to M3.

Consider a case in which reference information 1301 to 1303 is registered in advance and the processing of steps S303, S304, and S1100 yields feature information indicating "posture (back): stooped", "stride ratio: 1.3", and "stay map: Mp (FIG. 12)". In this case, "posture (back): stooped" and "stride ratio: 1.3" match the corresponding entries of both reference information 1302 and reference information 1303. Therefore, "posture (back): stooped" and "stride ratio: 1.3" alone cannot determine whether the person shown in the image acquired in step S301 is the "mother" or the "sister".

In the fourth embodiment, the identification unit 112 compares, for example, the stay map Mp of the feature information with the stay map M2 of reference information 1302 (mother) and the stay map M3 of reference information 1303 (sister). The identification unit 112 can thereby determine that the stay map Mp is more similar to the stay map M3 of the "sister" than to the stay map M2 of the "mother". As a result, the identification unit 112 can determine that the person shown in the image acquired in step S301 is the "sister". In this way, by considering the place of stay in addition to posture and gesture, the fourth embodiment makes it possible to identify a person who cannot be identified from posture and gesture alone.
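The specification does not say how the similarity between two stay maps is measured. As one possibility, the sketch below uses histogram intersection (the sum of element-wise minima), which is an assumption swapped in purely for illustration; the 2x2 maps are made-up values, not the contents of FIG. 12 or FIG. 13(B).

```python
def stay_map_similarity(a, b):
    """Histogram intersection of two stay maps of the same shape.

    Each map holds stay rates per region; the result is the sum of the
    element-wise minima, so identical normalized maps score 1.0.
    """
    return sum(min(x, y)
               for row_a, row_b in zip(a, b)
               for x, y in zip(row_a, row_b))


# Hypothetical maps: Mp is the detected map, M2 and M3 are registered maps.
Mp = [[0.1, 0.6], [0.1, 0.2]]
REGISTERED = {
    "mother": [[0.7, 0.1], [0.1, 0.1]],  # stands in for M2
    "sister": [[0.1, 0.5], [0.2, 0.2]],  # stands in for M3
}

best = max(REGISTERED, key=lambda name: stay_map_similarity(Mp, REGISTERED[name]))
print(best)  # prints sister
```

With these values the detected map overlaps the sister's map far more than the mother's, which reproduces the outcome described in the text.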

The processing relating to the place of stay is not limited to the above. For example, in step S1100, the information acquisition unit 111 may detect the stay determination region with the highest stay rate as the place of stay. In the reference information, the stay determination region with the highest stay rate, a stay determination region designated by the user, or the like may be indicated as the place of stay. Then, in step S305, the identification unit 112 may determine whether the place of stay detected in step S1100 matches the place of stay in the reference information.
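This simpler variant reduces each stay map to a single region and checks for an exact match, as in the sketch below. The (row, column) region indices, the registered places, and the name `top_region` are hypothetical.

```python
def top_region(stay_map):
    """Return (row, col) of the stay determination region with the highest
    stay rate; on ties, the first such region in row-major order is returned."""
    return max(((r, c) for r, row in enumerate(stay_map) for c in range(len(row))),
               key=lambda rc: stay_map[rc[0]][rc[1]])


# Hypothetical registered places of stay, one region per person.
REGISTERED_PLACE = {"mother": (0, 0), "sister": (0, 1)}

detected = top_region([[0.1, 0.6], [0.1, 0.2]])
matches = [name for name, place in REGISTERED_PLACE.items() if place == detected]
print(matches)  # prints ['sister']
```

A match/mismatch test of this kind needs much less storage than a full stay map, at the cost of discarding the rest of the distribution.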

As described above, according to the fourth embodiment, using feature information that indicates the place of stay in addition to posture and gesture allows a person to be identified with higher accuracy than in the first and second embodiments.

The person's flow line (movement path) may also be taken into account. The flow line is detected, for example, in the same manner as the stay map. FIG. 14 shows an example of the flow lines of the father, the mother, and the sister. Just as the place of stay (stay rate) depends on the person, so does the flow line. Therefore, as shown in FIG. 14, the flow lines differ among the father, the mother, and the sister.

<Fifth Embodiment>
A fifth embodiment of the present invention will be described. The fifth embodiment describes an example in which feature information indicating the posture, gesture, silhouette, and place of stay of the person shown in the captured image is acquired. In other words, the fifth embodiment is an example that combines the third and fourth embodiments.

The configuration of the information processing device according to the fifth embodiment is the same as that of the information processing device 100 according to the first embodiment (FIGS. 1 and 2) or that of the robot 700 according to the second embodiment (FIG. 7).

FIG. 15 is a flowchart showing an example of the processing flow of the information processing device according to the fifth embodiment.

After the image and the skeleton information have been acquired (steps S301 and S302), the information acquisition unit 111 detects, based on the skeleton information acquired in step S302, the posture, gesture, silhouette, and place of stay of the person shown in the image acquired in step S301 (steps S303, S304, S800, and S1100). The combination of the detection result of step S303 (posture), the detection result of step S304 (gesture), the detection result of step S800 (silhouette), and the detection result of step S1100 (place of stay) is the feature information of the person shown in the image acquired in step S301. The processing of step S303 (posture detection), step S304 (gesture detection), step S800 (silhouette detection), and step S1100 (place-of-stay detection) may be performed in parallel or sequentially. The order of posture detection, gesture detection, silhouette detection, and place-of-stay detection is not particularly limited.

The processing of step S303 (posture detection) and step S304 (gesture detection) is as described in the first embodiment. The processing of step S800 (silhouette detection) is as described in the third embodiment. The processing of step S1100 (place-of-stay detection) is as described in the fourth embodiment.

Next, as in the first and second embodiments, the identification unit 112 identifies the person shown in the image acquired in step S301 based on the obtained feature information (specifically, the feature information obtained by the processing of steps S303, S304, S800, and S1100) (step S305).

As described in the third embodiment, considering the silhouette in addition to posture and gesture improves the accuracy of identifying a person. As described in the fourth embodiment, considering the place of stay in addition to posture and gesture also improves the accuracy. Therefore, by considering both the silhouette and the place of stay in addition to posture and gesture, a person can be identified with higher accuracy than in the third or fourth embodiment. For example, even if identification based on some of posture, gesture, silhouette, and place of stay fails, it suffices for identification based on the remaining features to succeed. The probability of successful identification therefore increases.

As described above, according to the fifth embodiment, using feature information that indicates posture, gesture, silhouette, and place of stay allows a person to be identified with higher accuracy than in the first to fourth embodiments.

<Sixth Embodiment>
A sixth embodiment of the present invention will be described. The sixth embodiment describes an example in which two or more persons are shown in the captured image.

The configuration of the information processing device according to the sixth embodiment is the same as that of the information processing device 100 according to the first embodiment (FIGS. 1 and 2) or that of the robot 700 according to the second embodiment (FIG. 7).

FIG. 16 is a flowchart showing an example of the processing flow of the information processing device according to the sixth embodiment.

First, as in the first and second embodiments, the image input unit 101 acquires a captured image (step S301), and the information acquisition unit 111 acquires skeleton information (step S302). When two or more persons are shown in the captured image, skeleton information is acquired for each person.

Next, as in the first to fifth embodiments, the information acquisition unit 111 acquires feature information based on the skeleton information acquired in step S302 (step S1601). In step S1601, processing such as steps S303, S304, S800, and S1100 of FIG. 15 is performed, for example. When two or more persons are shown in the captured image, feature information is acquired for each person.

Then, as in the first to fifth embodiments, the identification unit 112 identifies the person shown in the image acquired in step S301 based on the feature information acquired in step S1601 (step S305). When two or more persons are shown in the captured image, each person is identified. In step S305, the person corresponding to a piece of feature information may not be identifiable, for example because pieces of reference information corresponding to different persons are all similar to that one piece of feature information. Here, it is assumed that only some of the two or more persons shown in the captured image are identified in step S305 and that the rest are not. The remaining persons may be persons for whom identification failed in step S305, or persons who were not subjected to identification in step S305.

Next, the identification unit 112 identifies each remaining person based on the feature information corresponding to that person (the feature information acquired in step S1601) and the identification result for the already-identified persons (the identification result of step S305) (step S1602). For example, the storage unit 103 stores in advance, for each of a plurality of persons, two or more pieces of reference information that respectively correspond to two or more combinations of that person with other persons. The identification unit 112 then compares the feature information corresponding to the remaining person with each piece of reference information that corresponds to a combination with an already-identified person, and thereby identifies the remaining person.

FIG. 17 shows an example of reference information according to the sixth embodiment. Here, it is assumed that reference information 1701 to 1713 of FIG. 17 is registered in advance and that feature information of two persons A and B is acquired in step S1601. The feature information of person A indicates "posture (back): normal", "stride ratio: 1.5", and "stay map: M1", and the feature information of person B indicates "posture (back): stooped", "stride ratio: 1.3", and "stay map: M2".

In this case, the feature information of person A is similar to reference information 1701 to 1704 of the "father", so person A can be determined to be the "father" in step S305. The feature information of person B, on the other hand, is similar both to reference information 1705 to 1709 of the "mother" and to reference information 1710 of the "sister", so step S305 cannot determine whether person B is the "mother" or the "sister".

In the sixth embodiment, once person A is determined to be the "father" in step S305, step S1602 refers, among reference information 1701 to 1713, to reference information 1706 of the "mother" when together with the "father" and reference information 1711 of the "sister" when together with the "father". The feature information of person B is more similar to reference information 1706 of the "mother" than to reference information 1711 of the "sister", so person B can be determined to be the "mother" in step S1602.
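The companion-dependent lookup of steps S305 and S1602 can be sketched as follows. The contents of FIG. 17 are not reproduced here: the four records, the `companion` field, the field-count score, and the function names are all hypothetical and serve only to illustrate the idea of restricting the comparison to reference information registered for the already-identified person.

```python
FIELDS = ("posture", "stride_ratio", "stay_map")

# Hypothetical records: how each person behaves alone (companion=None)
# or together with a given companion. Stay maps are named by label only.
REFERENCES = [
    {"person": "mother", "companion": None,
     "posture": "stooped", "stride_ratio": 1.3, "stay_map": "M2"},
    {"person": "mother", "companion": "father",
     "posture": "stooped", "stride_ratio": 1.3, "stay_map": "M2"},
    {"person": "sister", "companion": None,
     "posture": "stooped", "stride_ratio": 1.3, "stay_map": "M2"},
    {"person": "sister", "companion": "father",
     "posture": "stooped", "stride_ratio": 1.3, "stay_map": "M3"},
]


def score(feature, ref):
    """Number of fields on which the feature and the reference agree."""
    return sum(feature[k] == ref[k] for k in FIELDS)


def identify(feature, references):
    """Return the single best-matching person, or None when ambiguous."""
    best = {}
    for ref in references:
        best[ref["person"]] = max(best.get(ref["person"], 0), score(feature, ref))
    if not best:
        return None
    top = max(best.values())
    winners = [person for person, s in best.items() if s == top]
    return winners[0] if len(winners) == 1 else None


def identify_remaining(feature, identified_person, references):
    """Step S1602 analogue: use only references registered for being
    together with the person who has already been identified."""
    together = [r for r in references if r["companion"] == identified_person]
    return identify(feature, together)


person_b = {"posture": "stooped", "stride_ratio": 1.3, "stay_map": "M2"}
print(identify(person_b, REFERENCES))                      # prints None
print(identify_remaining(person_b, "father", REFERENCES))  # prints mother
```

Against all references, person B ties between "mother" and "sister"; restricting the comparison to the records registered for being with the "father" breaks the tie.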

Although FIG. 17 shows an example in which the place of stay (stay map) changes depending on who the person is with, the posture, gesture, flow line, and the like may also change depending on who the person is with.

Next, as in the first to fifth embodiments, the output unit 104 outputs the identification results of steps S305 and S1602 (step S306).

As described above, according to the sixth embodiment, taking into account a person who has already been identified based on feature information allows other persons to be identified with higher accuracy than in the first to fifth embodiments. For example, a person who could not otherwise be identified becomes identifiable.

<Others>
Each of the embodiments described above is merely an example of the present invention. The present invention is not limited to the specific forms described above, and various modifications are possible within the scope of its technical idea. The configurations described above and those described below may also be combined as appropriate. For example, as long as the feature information indicates at least one of posture, gesture, silhouette, flow line, and place of stay, the features indicated by the feature information are not particularly limited. For example, the feature information indicates one, two, three, four, or all five of posture, gesture, silhouette, flow line, and place of stay. The feature information may also indicate features other than posture, gesture, silhouette, flow line, and place of stay. The same applies to the reference information.

<Additional notes>
An information processing device (100) comprising:
image acquisition means (101) for acquiring a captured image;
information acquisition means (111) for acquiring, from the image, feature information indicating at least one of the posture, gesture, silhouette, flow line, and place of stay of a person shown in the image as a feature of the person; and
identification means (112) for identifying the person based on the feature information.

100: Information processing device
101: Image input unit
102: Control unit
103: Storage unit
104: Output unit
111: Information acquisition unit
112: Identification unit
200: Surveillance camera
300: Management device
700: Communication robot
701: Imaging unit
702: Communication unit
400: Image
401: Person
402: Skeleton (skeleton information)
601 to 603, 901 to 903: Reference information
1301 to 1303, 1701 to 1713: Reference information
I1 to I3, Ip: Mask image
M1 to M3, Mp: Stay map (heat map)

Claims

An information processing device comprising:
image acquisition means for acquiring a captured image;
information acquisition means for acquiring, from the image, feature information indicating the place of stay of a person shown in the image as a feature of the person; and
identification means for identifying the person based on the feature information,
wherein the feature information includes, as information on the place of stay, the stay rate of the person in each of the plurality of images to be imaged, and
the stay rate is the ratio of the stay time to the length of a predetermined period.

The information processing device according to claim 1, wherein the feature information further indicates at least one of the posture, gesture, silhouette, and flow line of the person as a feature of the person.

The information processing device according to claim 1 or 2, wherein the information acquisition means acquires skeleton information indicating the skeleton of the person from the image and acquires the feature information based on the skeleton information.

The information processing device according to any one of claims 1 to 3, further comprising storage means for storing, for each of a plurality of persons, reference information indicating the features of that person,
wherein the identification means identifies the person in the image by comparing the feature information with each piece of reference information.

The information processing device according to any one of claims 1 to 4, wherein, when two or more persons are shown in the image, the identification means
identifies some of the two or more persons based on the feature information corresponding to those persons, and
identifies the remaining persons among the two or more persons based on the feature information corresponding to those persons and the identification result for said some of the persons.

The information processing device according to claim 4, wherein the storage means stores, for each of the plurality of persons, two or more pieces of reference information respectively corresponding to two or more combinations of that person and other persons, and
when two or more persons are shown in the image, the identification means
identifies some of the two or more persons by comparing the feature information corresponding to those persons with the reference information, and
identifies the remaining persons among the two or more persons by comparing the feature information corresponding to those persons with the reference information corresponding to the combination with said some of the persons that have been identified.

The information processing device according to any one of claims 1 to 6, further comprising imaging means for capturing the image.