JP7564064B2

JP7564064B2 - Video monitoring system and video monitoring method

Info

Publication number: JP7564064B2
Application number: JP2021111778A
Authority: JP
Inventors: 健一森田; 敦廣池; 智明吉永; 良起伊藤
Original assignee: Hitachi Ltd
Current assignee: Hitachi Ltd
Priority date: 2021-07-05
Filing date: 2021-07-05
Publication date: 2024-10-08
Anticipated expiration: 2041-07-05
Also published as: JP2023008315A; WO2023281897A1

Description

本開示は、映像監視システム及び映像監視方法に関する。 This disclosure relates to a video surveillance system and a video surveillance method.

警備業務及び監視業務などでは、監視場所に設置されたカメラなどから送信されてきた監視映像を表示する映像監視システムが使用されており、監視員は、監視映像を確認することで、特定の監視事象である特定事象を察知することが重要である。なお、特定事象は、例えば、犯罪又は事故などのインシデント、及び、警備、警察又は救急による対処が必要な事象などである。 In security and surveillance work, video surveillance systems are used that display surveillance footage transmitted from cameras installed at surveillance locations, and it is important for security personnel to detect specific events, which are specific surveillance events, by checking the surveillance footage. Specific events include, for example, incidents such as crimes or accidents, and events that require response by security, police, or emergency services.

特定事象を察知するためには、監視員及び警備員などの管理者は、場所及び時間などを考慮しながら、監視映像に写されている人物及び物体などの対象に関する外見及び動作に違和があるか否かを判断している。このため、特定事象の察知には、管理者の経験又は勘などが必要であり、言語化及び伝承が難しいという問題がある。 To detect specific events, supervisors such as monitors and security guards must determine whether there is anything unusual about the appearance and behavior of people, objects, and other subjects captured on surveillance footage, taking into account factors such as location and time. For this reason, the experience or intuition of the supervisor is required to detect specific events, which can be difficult to verbalize and communicate.

これに対して特許文献１には、インシデントを写した過去の監視映像に対して、インシデントの発生可能レベルを属性情報としてユーザが登録する映像監視システムが開示されている。この映像監視システムは、登録された属性情報に基づいて、新たな監視映像からインシデントを検出している。 In response to this, Patent Document 1 discloses a video surveillance system in which a user registers the likelihood of an incident occurring as attribute information for past surveillance footage that shows an incident. This video surveillance system detects incidents from new surveillance footage based on the registered attribute information.

特開２０１９－１９８０００号公報 Patent Document 1: JP 2019-198000 A

特許文献１に記載の技術では、監視映像を確認したユーザがインシデントに関する適切な属性情報を登録できることが前提となっているため、監視業務についての専門的な知見のあるユーザがいない場合、インシデントを正確に検出できないという問題がある。 The technology described in Patent Document 1 is based on the premise that the user who reviews the surveillance footage is able to register appropriate attribute information related to the incident, so if there is no user with specialized knowledge of surveillance work, there is a problem that incidents cannot be detected accurately.

本開示の目的は、専門的な知見を有するユーザがいなくても特定事象を検知することが可能な映像監視システム及び映像監視方法を提供することにある。 The objective of this disclosure is to provide a video surveillance system and a video surveillance method that are capable of detecting specific events without the need for a user with specialized knowledge.

本開示の一態様に従う映像監視システムは、複数の第１の映像データのそれぞれに写る被写体の特徴を示す複数の特徴データから、所望の前記第１の映像データを検索するための前記特徴に関する検索クエリに応じて抽出した特定事象データを格納する格納部と、第２の映像データに写る被写体の特徴を示す解析結果データを生成する解析部と、前記解析結果データと前記特定事象データとに基づいて、前記第２の映像データに写る特定の監視事象である特定事象の有無を検知する検知部と、を有する。 A video surveillance system according to one aspect of the present disclosure includes a storage unit that stores specific event data extracted from a plurality of feature data indicating the features of a subject captured in each of a plurality of first video data in response to a search query related to the features for searching for desired first video data, an analysis unit that generates analysis result data indicating the features of a subject captured in second video data, and a detection unit that detects the presence or absence of a specific event, which is a specific surveillance event captured in the second video data, based on the analysis result data and the specific event data.

本発明によれば、専門的な知見を有するユーザがいなくても特定事象を察知することが可能になる。 The present invention makes it possible to detect specific events even without a user with specialized knowledge.

本開示の第１の実施形態の映像監視システムの一例を示す図である。 FIG. 1 is a diagram illustrating an example of a video surveillance system according to a first embodiment of the present disclosure.
検索クエリテーブルの一例を示す図である。 FIG. 2 is a diagram illustrating an example of a search query table.
フレーム別テーブルの一例を示す図である。 FIG. 3 is a diagram illustrating an example of a frame-specific table.
追跡ＩＤ別テーブルの一例を示す図である。 FIG. 4 is a diagram illustrating an example of a tracking ID-specific table.
映像検索処理の一例について説明するためのフローチャートである。 FIG. 5 is a flowchart illustrating an example of a video search process.
調整処理の一例を説明するためのフローチャートである。 FIG. 6 is a flowchart illustrating an example of an adjustment process.
本開示の第１の実施形態の映像監視システムの別の例を示す図である。 FIG. 7 is a diagram illustrating another example of the video surveillance system according to the first embodiment of the present disclosure.
特定事象検知処理の一例を説明するためのフローチャートである。 FIG. 8 is a flowchart illustrating an example of a specific event detection process.
特定事象判定処理の一例を説明するためのフローチャートである。 FIG. 9 is a flowchart illustrating an example of a specific event determination process.
特定事象判定処理の一例を説明するための図である。 FIG. 10 is a diagram for explaining an example of the specific event determination process.
本開示の第１の実施形態の映像監視システムの別の例を示す図である。 FIG. 11 is a diagram illustrating another example of the video surveillance system according to the first embodiment of the present disclosure.
本開示の第２の実施形態の映像監視システムの一例を示す図である。 FIG. 12 is a diagram illustrating an example of a video surveillance system according to a second embodiment of the present disclosure.
本開示の第２の実施形態の映像監視システムの別の例を示す図である。 FIG. 13 is a diagram illustrating another example of the video surveillance system according to the second embodiment of the present disclosure.
特定事象検知処理の別の例を説明するためのフローチャートである。 FIG. 14 is a flowchart illustrating another example of the specific event detection process.

以下、本開示の実施形態について図面を参照して説明する。 Embodiments of the present disclosure will be described below with reference to the drawings.

以下で説明する本実施形態の映像監視システムは、映像データに写る所定の被写体の特徴を示す特徴データから、特定の監視事象である特定事象に関するデータである特定事象データを抽出して蓄積し、その蓄積された特定事象データに基づいて、監視対象の映像データに写る特定事象を検知するシステムである。特定事象は、例えば、犯罪又は事故などのインシデント、及び、警備、警察又は救急による対処が必要な事象、又は、それらのインシデント又は事象が発生する予兆となる事象などである。 The video surveillance system of this embodiment described below is a system that extracts and accumulates specific event data, which is data related to a specific event that is a specific surveillance event, from feature data that indicates the characteristics of a specific subject captured in the video data, and detects a specific event captured in the video data of the surveillance target based on the accumulated specific event data. Specific events are, for example, incidents such as crimes or accidents, events that require response by security, police, or emergency services, or events that are a sign of the occurrence of such incidents or events.

図１は、本開示の第１の実施形態の映像監視システムにおける特定事象データを蓄積するための構成を示す図である。図１において、映像監視システムは、映像配信装置１と、映像処理装置２と、ＦＤＢ（Feature Database：特徴データベース）サーバ３と、ユーザ端末４と、映像検索装置５と、特定事象ＤＢサーバ６とを有する。各装置１～６は、例えば、不図示の通信ネットワークを介して、相互に通信可能に接続される。 Figure 1 is a diagram showing a configuration for accumulating specific event data in a video surveillance system according to a first embodiment of the present disclosure. In Figure 1, the video surveillance system includes a video distribution device 1, a video processing device 2, an FDB (Feature Database) server 3, a user terminal 4, a video search device 5, and a specific event DB server 6. The devices 1 to 6 are connected so that they can communicate with one another, for example, via a communication network (not shown).

映像配信装置１は、映像データを配信する装置である。映像配信装置１は、本実施形態では、所定の監視場所を写した映像データを取得して配信するカメラであるが、レコーダ又はＶＭＳ（Video Management System）などでもよい。映像データは、本実施形態では、複数のフレームから構成される動画像データである。また、映像配信装置１は、映像データのみを配信するか、映像データに対してその映像データに関するメタ情報を付与して配信する。メタ情報は、例えば、当該映像配信装置１を識別する識別情報であるカメラＩＤ、及び、映像データを取得した日時である取得日時などである。 Video distribution device 1 is a device that distributes video data. In this embodiment, video distribution device 1 is a camera that acquires and distributes video data showing a specified monitoring location, but it may also be a recorder or a VMS (Video Management System). In this embodiment, the video data is moving image data consisting of multiple frames. Furthermore, video distribution device 1 distributes only the video data, or distributes the video data with meta information about the video data added to it. The meta information is, for example, a camera ID, which is identification information that identifies the video distribution device 1, and an acquisition date and time, which is the date and time when the video data was acquired.

映像処理装置２は、映像配信装置１にて配信された映像データに対して所定の映像解析処理を行う装置である。映像処理装置２は、具体的には、映像入力部２１と、映像解析部２２と、ＦＤＢ登録部２３とを有する。 The video processing device 2 is a device that performs a predetermined video analysis process on the video data distributed by the video distribution device 1. Specifically, the video processing device 2 has a video input unit 21, a video analysis unit 22, and an FDB registration unit 23.

映像入力部２１は、映像配信装置１にて配信された映像データを受け付ける。映像データは、図１の例では、特定事象データを生成するための第１の映像データである。 The video input unit 21 accepts video data distributed by the video distribution device 1. In the example of FIG. 1, the video data is the first video data for generating specific event data.

映像解析部２２は、映像入力部２１が受け付けた映像データに対して所定の映像解析処理を行う。映像解析処理は、映像データに写る所定の被写体の特徴を推定（取得）する特徴推定処理を含む。所定の被写体は、本実施形態では、人物であるが、人物以外の動物又は物体などでもよい。 The video analysis unit 22 performs a predetermined video analysis process on the video data received by the video input unit 21. The video analysis process includes a feature estimation process that estimates (acquires) the features of a predetermined subject that appears in the video data. In this embodiment, the predetermined subject is a person, but it may also be an animal or object other than a person.

特徴推定処理は、所定の被写体である人物を検知する人物検知処理、人物検知処理にて検知された検知人物の属性を推定する属性推定処理、及び、検知人物の行動を推定する行動推定処理などである。この場合、検知人物の属性及び行動が特徴となる。属性は、外見的な属性（外見から判定される属性）であり、例えば、年齢（年齢区分）、性別、及び、衣服の色などである。行動は、例えば、走る、及び、キョロキョロする（周囲を見回す）などである。なお、映像解析処理は、映像データの背景を認識する背景認識処理、及び、検知人物の骨格を推定する骨格推定処理のような、特徴推定処理を行うための処理を含む。また、特徴推定処理は、映像データのフレームごとに行われ、複数のフレームに共通する人物が写っている場合には、その共通人物を特定する追跡処理を含む。また、特徴推定処理は、映像データのフレームの人物が映っている領域の画像特徴量を人物画像特徴量として算出する処理を含んでもよい。 The feature estimation process includes a person detection process that detects a person, who is the predetermined subject; an attribute estimation process that estimates the attributes of the person detected by the person detection process; and a behavior estimation process that estimates the behavior of the detected person. In this case, the attributes and behavior of the detected person are the features. The attributes are appearance-based attributes (attributes determined from appearance), such as age (age category), sex, and clothing color. The behaviors are, for example, running and looking about restlessly (scanning the surroundings). Note that the video analysis process also includes processes that support the feature estimation process, such as a background recognition process that recognizes the background of the video data and a skeleton estimation process that estimates the skeleton of the detected person. The feature estimation process is performed for each frame of the video data and, when the same person appears in multiple frames, includes a tracking process that identifies that person across the frames. The feature estimation process may also include a process of calculating, as a person image feature amount, the image feature amount of the region of a frame in which a person appears.

ＦＤＢ登録部２３は、映像入力部２１が受け付けた映像データと、映像解析部２２による映像データに対する映像解析処理の処理結果を示す特徴データとをＦＤＢサーバ３に登録する。特徴データは、例えば、映像データのフレームごとに、カメラＩＤと、取得日時と、そのフレームに写る人物の特徴を示す特徴情報と対応付けたデータである。映像データにメタ情報が付与されていない場合、ＦＤＢ登録部２３がカメラＩＤ及び取得日時を生成してもよい。特徴情報は、特徴の種類のそれぞれに対応する複数の推定項目を有する。推定項目には、人物の行動に関する特徴を示す行動項目と、人物の属性に関する特徴を示す属性項目と、人物画像特徴量を示す画像特徴項目とがある。行動項目及び属性項目は、それぞれ複数あってもよい。 The FDB registration unit 23 registers in the FDB server 3 the video data received by the video input unit 21 and feature data indicating the results of the video analysis process performed by the video analysis unit 22 on the video data. The feature data is, for example, data that associates, for each frame of video data, a camera ID, an acquisition date and time, and feature information indicating the features of the person appearing in that frame. If meta-information is not attached to the video data, the FDB registration unit 23 may generate the camera ID and acquisition date and time. The feature information has multiple estimated items corresponding to each type of feature. The estimated items include behavior items that indicate features related to a person's behavior, attribute items that indicate features related to a person's attributes, and image feature items that indicate person image features. There may be multiple behavior items and attribute items.

行動項目の項目値は、行動項目に対応する行動の有無を示す。属性項目の項目値は、属性項目に対応する属性を示す。属性が年齢区分の場合、項目値は、例えば、１５歳以下であれが「０」、１５～４５歳であれば「１」、４５歳～６０歳であれば「２」、６０歳以上であれば「３」を示す。 The item value of a behavior item indicates whether the behavior corresponding to that behavior item is present. The item value of an attribute item indicates the attribute corresponding to that attribute item. If the attribute is an age category, the item value is, for example, "0" for age 15 or younger, "1" for ages 15 to 45, "2" for ages 45 to 60, and "3" for age 60 or older.
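
As a rough illustration of the age-category encoding above, the mapping could be sketched in Python as follows. The function name `age_category` is hypothetical, and because the text lists overlapping boundaries (15, 45, 60), the way the boundary ages are assigned here is an assumption, not something the disclosure specifies.

```python
def age_category(age: int) -> int:
    """Map an estimated age to the item value 0-3 of the age-category attribute.

    Assumption (not stated in the text): 15 falls into category 0
    ("15 or younger"), 45 into category 1, and 60 into category 3
    ("60 or older").
    """
    if age <= 15:
        return 0
    if age <= 45:
        return 1
    if age < 60:
        return 2
    return 3


if __name__ == "__main__":
    for sample in (10, 30, 50, 75):
        print(sample, "->", age_category(sample))
```

A different boundary convention would only change the comparison operators; the four item values themselves come from the text.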

ＦＤＢサーバ３は、映像データ及び特徴データを格納するストレージ部であるＦＤＢ３１を有するサーバである。 The FDB server 3 is a server that has an FDB 31, which is a storage unit that stores video data and feature data.

ユーザ端末４は、映像監視システムを利用するユーザにて操作される端末装置である。ユーザ端末４は、入力部４１と、表示部４２とを有する。入力部４１は、ユーザから所望の映像データを検索するための検索クエリ（検索条件）である映像検索クエリのような種々の情報を受け付けて映像検索装置５に送信する。表示部４２は、映像検索クエリに対する応答である検索結果のような種々の情報を受け付けて表示する。映像検索クエリは、例えば、カメラＩＤ、取得日時及び特徴情報に関する検索条件を含む。 The user terminal 4 is a terminal device operated by a user who uses the video surveillance system. The user terminal 4 has an input unit 41 and a display unit 42. The input unit 41 accepts various information such as a video search query, which is a search query (search condition) for searching desired video data from the user, and transmits it to the video search device 5. The display unit 42 accepts and displays various information such as a search result, which is a response to the video search query. The video search query includes search conditions related to, for example, a camera ID, acquisition date and time, and feature information.

映像検索装置５は、ユーザ端末４からの映像検索クエリに基づいて、ＦＤＢ３１から映像データを検索する装置である。映像検索装置５は、映像検索部５１と、特定事象ＤＢ登録部５２とを有する。 The video search device 5 is a device that searches for video data from the FDB 31 based on a video search query from the user terminal 4. The video search device 5 has a video search unit 51 and a specific event DB registration unit 52.

映像検索部５１は、ユーザ端末４からの映像検索クエリに適合する映像データをＦＤＢ３１から検索し、その検索結果をユーザ端末４に送信する。映像検索部５１は、具体的には、ＦＤＢ３１に格納されている特徴データに基づいて、映像検索クエリの検索条件に適合する特徴を有する人物である追跡対象人物を写した映像データを検索する。 The video search unit 51 searches the FDB 31 for video data that matches the video search query from the user terminal 4, and transmits the search results to the user terminal 4. Specifically, the video search unit 51 searches for video data that shows a tracked person who has characteristics that match the search conditions of the video search query, based on the characteristic data stored in the FDB 31.

特定事象ＤＢ登録部５２は、ＦＤＢサーバ３のＦＤＢ３１に格納された特徴データから、映像検索クエリに応じた特徴データを、特定事象に関する特定事象データとして抽出して特定事象ＤＢサーバ６に登録する登録部である。具体的には、特定事象ＤＢ登録部５２は、映像検索部５１にて映像検索クエリに応じて検索された映像データである検索映像データに対応する特徴データを、特定事象データとして特定事象ＤＢサーバ６に登録する。特定事象データは、特徴データをフレーム事に集約したフレーム別テーブルと、追跡対象人物ごとに特徴データを集約した追跡ＩＤ別テーブルとを含む。追跡ＩＤ別テーブルは、フレーム別テーブルから生成される。また、特定事象ＤＢ登録部５２は、映像検索クエリを特定事象ＤＢサーバ６に登録してもよい。このとき、特定事象ＤＢ登録部５２は、例えば、映像検索クエリを集めた検索クエリテーブルの形式で登録してもよいし、フレーム別テーブルに追加する形式で登録してもよいし、その両方の形式で登録してもよい。 The specific event DB registration unit 52 is a registration unit that extracts feature data corresponding to a video search query from the feature data stored in the FDB 31 of the FDB server 3 as specific event data related to a specific event and registers it in the specific event DB server 6. Specifically, the specific event DB registration unit 52 registers feature data corresponding to searched video data, which is video data searched by the video search unit 51 in response to a video search query, in the specific event DB server 6 as specific event data. The specific event data includes a frame-specific table that aggregates feature data for each frame, and a tracking ID-specific table that aggregates feature data for each tracking target person. The tracking ID-specific table is generated from the frame-specific table. The specific event DB registration unit 52 may also register a video search query in the specific event DB server 6. In this case, the specific event DB registration unit 52 may register, for example, in the form of a search query table that collects video search queries, or in the form of adding to the frame-specific table, or in both forms.

特定事象ＤＢサーバ６は、映像検索クエリと特定事象データとを格納する格納部である特定事象ＤＢ６１を有するサーバである。 The specific event DB server 6 is a server having a specific event DB 61, which is a storage unit that stores video search queries and specific event data.

図２は、検索クエリテーブルの一例を示す図である。図２に示す検索クエリテーブル２００は、各映像検索クエリをレコードとして有するテーブル情報であり、フィールド２０１～２０４を含む。 Figure 2 is a diagram showing an example of a search query table. The search query table 200 shown in Figure 2 is table information having each video search query as a record, and includes fields 201 to 204.

フィールド２０１は、映像データを取得した映像配信装置１のカメラＩＤを格納する。フィールド２０２は、映像データを取得した時間帯を格納する。フィールド２０３は、特徴情報の行動項目ごとに設けられ、その行動項目が検索対象として指定されたか否かを示す行動特徴指定情報を格納する。図２の例では、行動特徴指定情報は、行動項目が指定された場合、「１」を示し、行動項目が指定されていない場合、「０」を示す。フィールド２０４は、特徴情報の属性項目ごとに設けられ、その属性項目の取り得る項目値のいずれかを指定する指定値を示す属性特徴指定情報を格納する。なお、映像検索クエリは、フィールド２０１～２０４の全てに対して値を有する必要はない。 Field 201 stores the camera ID of the video distribution device 1 that acquired the video data. Field 202 stores the time period in which the video data was acquired. Field 203 is provided for each behavior item of the feature information, and stores behavior feature designation information indicating whether that behavior item is designated as a search target. In the example of Figure 2, the behavior feature designation information is "1" if the behavior item is designated and "0" if it is not. Field 204 is provided for each attribute item of the feature information, and stores attribute feature designation information indicating a designated value, which is one of the item values that the attribute item can take. Note that a video search query does not need to have values for all of fields 201 to 204.
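
One way to picture a record of the search query table is a structure whose fields are all optional, so that unset conditions are simply ignored. The following Python sketch is illustrative only: the class `VideoSearchQuery`, the function `matches`, the item names, and the "HH:MM" string representation of the time period are all assumptions, not part of the disclosure.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass
class VideoSearchQuery:
    """Hypothetical record of the search query table (fields 201 to 204)."""
    camera_id: Optional[str] = None                 # field 201
    time_range: Optional[Tuple[str, str]] = None    # field 202, "HH:MM" strings (assumed)
    behaviors: Dict[str, int] = field(default_factory=dict)   # field 203: 1 = designated, 0 = not
    attributes: Dict[str, int] = field(default_factory=dict)  # field 204: designated item value


def matches(query: VideoSearchQuery, camera_id: str, time: str,
            behaviors: Dict[str, int], attributes: Dict[str, int]) -> bool:
    """Return True if one frame's feature data satisfies every condition set in the query."""
    if query.camera_id is not None and query.camera_id != camera_id:
        return False
    if query.time_range is not None:
        start, end = query.time_range
        if not (start <= time <= end):  # zero-padded "HH:MM" strings compare correctly
            return False
    for name, designated in query.behaviors.items():
        # Only behavior items designated with 1 constrain the search.
        if designated == 1 and behaviors.get(name, 0) != 1:
            return False
    for name, value in query.attributes.items():
        if attributes.get(name) != value:
            return False
    return True


if __name__ == "__main__":
    q = VideoSearchQuery(behaviors={"running": 1}, attributes={"age_category": 1})
    print(matches(q, "cam-01", "10:30", {"running": 1}, {"age_category": 1}))
```

Leaving `camera_id` and `time_range` unset mirrors the note that a query need not carry values for every field.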

図３は、特徴事象データに含まれるフレーム別テーブルの一例を示す図である。図３に示すフレーム別テーブル３００は、特徴データをレコードとするテーブル情報であり、フィールド３０１～３０８を含む。 Figure 3 shows an example of a frame-specific table included in the specific event data. The frame-specific table 300 shown in Figure 3 is table information in which each record is feature data, and includes fields 301 to 308.

フィールド３０１は、検索映像データを取得した映像配信装置１のカメラＩＤを格納する。フィールド３０２は、検索映像データのフレームを識別するフレームＩＤを格納する。フィールド３０３は、当該フレームが取得された時刻を格納する。フィールド３０４は、当該フレームに写る各人物を識別する人物ＩＤのうち追跡対象人物を識別する人物ＩＤを格納する。フィールド３０５は、当該フレームに写る追跡対象人物を識別する追跡ＩＤを格納する。なお、人物ＩＤは、フレーム内における人物をするＩＤであり、追跡ＩＤは、フレーム間で同一人物に割り当てられるＩＤである。 Field 301 stores the camera ID of the video distribution device 1 that acquired the searched video data. Field 302 stores a frame ID that identifies a frame of the searched video data. Field 303 stores the time when the frame was acquired. Field 304 stores a person ID that identifies a person to be tracked among the person IDs that identify each person appearing in the frame. Field 305 stores a tracking ID that identifies a person to be tracked appearing in the frame. Note that a person ID is an ID that identifies a person in a frame, and a tracking ID is an ID that is assigned to the same person between frames.

フィールド３０６は、行動項目ごとに設けられ、当該フレームに写る追跡対象人物に対する当該行動項目の項目値及び確信度を格納する。フィールド３０７は、属性項目ごとに設けられ、当該フレームに写る追跡対象人物に対する当該属性項目の項目値及び確信度を格納する。本実施形態では、フィールド３０６及び３０７は、「１，０．９」のような「項目値，確信度」で表される数値列の形式で項目値及び確信度を格納している。フィールド３０８は、当該フレームの人物画像特徴量を格納する。 Field 306 is provided for each behavior item, and stores the item value and confidence of that behavior item for the tracked person appearing in the frame. Field 307 is provided for each attribute item, and stores the item value and confidence of that attribute item for the tracked person appearing in the frame. In this embodiment, fields 306 and 307 store the item value and confidence as a numeric string of the form "item value, confidence", such as "1, 0.9". Field 308 stores the person image feature amount for the frame.
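
The "item value, confidence" string format of fields 306 and 307 can be handled with a small parser. This is a minimal Python sketch; the names `Estimate`, `parse_estimate`, and `format_estimate` are hypothetical, and integer item values are an assumption based on the examples in the text.

```python
from typing import NamedTuple


class Estimate(NamedTuple):
    """One cell of field 306 or 307: an item value with its confidence."""
    value: int
    confidence: float


def parse_estimate(cell: str) -> Estimate:
    """Parse a cell such as "1,0.9" or "1, 0.9" into an Estimate."""
    value_text, confidence_text = cell.split(",")
    return Estimate(int(value_text.strip()), float(confidence_text.strip()))


def format_estimate(estimate: Estimate) -> str:
    """Serialize an Estimate back into the "item value,confidence" form."""
    return f"{estimate.value},{estimate.confidence}"


if __name__ == "__main__":
    print(parse_estimate("1, 0.9"))
```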

なお、映像検索クエリをフレーム別テーブルに付与する場合には、例えば、フレーム別テーブル３００に映像検索クエリを格納するフィールドが追加される。そのフィールドには、例えば、映像検索クエリで指定された項目（例えば、運動項目Ａ及びＢ、年齢区分など）が格納される。 When a video search query is attached to the frame-specific table, a field for storing the video search query is added to the frame-specific table 300, for example. That field stores, for example, the items specified in the video search query (e.g., behavior items A and B, the age category, etc.).

図４は、特徴事象データに含まれる追跡ＩＤ別テーブルの一例を示す図である。図４に示す追跡ＩＤ別テーブル４００は、フィールド４０１～４０６を含む。 Figure 4 shows an example of a tracking ID-specific table included in the specific event data. The tracking ID-specific table 400 shown in Figure 4 includes fields 401 to 406.

フィールド４０１は、追跡対象人物を写したフレームを有する映像データを取得した映像配信装置１のカメラＩＤを格納する。フィールド４０２は、追跡対象人物を写した各フレームの取得時間を格納する。フィールド４０３は、追跡対象人物を識別する追跡ＩＤを格納する。フィールド４０４は、行動項目ごとに設けられ、追跡対象人物に対する当該行動項目の項目値を集約した集約値を格納する。フィールド４０５は、属性項目ごとに設けられ、追跡対象人物に対する当該属性項目の項目値を集約した集約値を格納する。フィールド４０６は、追跡対象人物を写した各フレームの人物画像特徴量を集約した集約値を格納する。 Field 401 stores the camera ID of the video distribution device 1 that acquired the video data containing frames showing the tracked person. Field 402 stores the acquisition time of each frame showing the tracked person. Field 403 stores the tracking ID that identifies the tracked person. Field 404 is provided for each behavior item, and stores an aggregate of the item values of that behavior item for the tracked person. Field 405 is provided for each attribute item, and stores an aggregate of the item values of that attribute item for the tracked person. Field 406 stores an aggregate of the person image feature amounts of the frames showing the tracked person.

図５は、映像監視システムの映像検索処理の一例について説明するためのフローチャートである。 Figure 5 is a flowchart illustrating an example of a video search process in a video surveillance system.

先ず、ユーザ端末４の入力部４１は、ユーザから映像検索クエリを受け付け、その映像検索クエリを映像検索装置５に送信する。映像検索装置５の映像検索部５１は、ユーザ端末４からの映像検索クエリを受信すると、映像検索クエリに適合する特徴を有する人物である追跡対象人物を写した映像データをＦＤＢサーバ３から検索する（ステップＳ１０１）。そして、特定事象ＤＢ登録部５２は、映像検索クエリを検索クエリテーブルのレコードとして特定事象ＤＢサーバ６の特定事象ＤＢ６１に登録する（ステップＳ１０２）。 First, the input unit 41 of the user terminal 4 accepts a video search query from a user and transmits the video search query to the video search device 5. When the video search unit 51 of the video search device 5 receives the video search query from the user terminal 4, it searches the FDB server 3 for video data showing a tracked person who has characteristics matching the video search query (step S101). Then, the specific event DB registration unit 52 registers the video search query in the specific event DB 61 of the specific event DB server 6 as a record in the search query table (step S102).

映像検索部５１は、検索した映像データである検索映像データのリストを検索結果としてユーザ端末４に送信する。ユーザ端末４の表示部４２は、検索結果を受信し、その検索結果を表示する（ステップＳ１０３）。なお、検索結果は、検索映像データのサムネイル画像などを含んでもよい。 The video search unit 51 transmits a list of the searched video data (the video data found by the search) to the user terminal 4 as the search result. The display unit 42 of the user terminal 4 receives and displays the search result (step S103). Note that the search result may include thumbnail images of the searched video data.

入力部４１は、ユーザから、検索結果にて示される検索映像データのいずれかを選択する選択要求を受け付け、その選択要求を映像検索装置５に送信する。映像検索装置５は、ユーザ端末４からの選択要求にて選択された検索映像データである選択映像データをユーザ端末４に送信する。ユーザ端末４の表示部４２は、選択映像データを受信し、その選択映像データを表示（再生）する（ステップＳ１０４）。 The input unit 41 receives a selection request from the user to select one of the searched video data displayed in the search results, and transmits the selection request to the video search device 5. The video search device 5 transmits selected video data, which is the searched video data selected by the selection request from the user terminal 4, to the user terminal 4. The display unit 42 of the user terminal 4 receives the selected video data and displays (plays) the selected video data (step S104).

また、映像検索装置５の特定事象ＤＢ登録部５２は、選択映像データから映像検索クエリと最も精度よく適合する人物を追跡対象人物として特定する（ステップＳ１０５）。特定事象ＤＢ登録部５２は、ＦＤＢ３１に格納された選択映像データに対応する特徴データから対象人物を識別する追跡ＩＤを抽出する（ステップＳ１０６）。 The specific event DB registration unit 52 of the video search device 5 identifies the person who most accurately matches the video search query from the selected video data as the person to be tracked (step S105). The specific event DB registration unit 52 extracts a tracking ID that identifies the target person from the feature data corresponding to the selected video data stored in the FDB 31 (step S106).

特定事象ＤＢ登録部５２は、抽出した追跡ＩＤを含む特徴データ（追跡対象人物を写した全てのフレームのそれぞれに対応する特徴データ）をＦＤＢ３１から抽出し、その特徴データをフレーム別テーブルとして特定事象ＤＢサーバ６の特定事象ＤＢ６１に登録する（ステップＳ１０７）。 The specific event DB registration unit 52 extracts feature data including the extracted tracking ID (feature data corresponding to each of all frames showing the person to be tracked) from the FDB 31, and registers the feature data as a frame-specific table in the specific event DB 61 of the specific event DB server 6 (step S107).

その後、特定事象ＤＢ登録部５２は、フレーム別テーブルに基づいて、追跡ＩＤ別テーブルを生成して特定事象ＤＢサーバ６の特定事象ＤＢ６１に登録し（ステップＳ１０８）、検索処理を終了する。 Then, the specific event DB registration unit 52 generates a tracking ID table based on the frame table, registers it in the specific event DB 61 of the specific event DB server 6 (step S108), and ends the search process.

追跡ＩＤ別テーブルの各推定項目の項目値は、例えば、以下の３つの第１～第３の方法のいずれかにて決定される。 The item values of each estimated item in the tracking ID table are determined, for example, by one of the following three methods, methods 1 to 3.

第１の方法では、特定事象ＤＢ登録部５２は、推定項目ごとに、フレーム別テーブルにおいて確信度が判定閾値以上の項目値の中で最も出現頻度の高い項目値を、追跡ＩＤ別テーブルの項目値とする。 In the first method, the specific event DB registration unit 52 sets, for each estimated item, the item value that appears most frequently among the item values in the frame-specific table whose confidence level is equal to or greater than the judgment threshold, as the item value in the tracking ID-specific table.
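
The first method amounts to a mode taken over confidence-filtered values. A minimal Python sketch follows; the function name, the `(item value, confidence)` tuple representation, and the behavior when no value passes the threshold (returning `None`) are assumptions. Ties are broken by first occurrence, which the text does not address.

```python
from collections import Counter
from typing import Iterable, Optional, Tuple


def aggregate_by_thresholded_mode(
    estimates: Iterable[Tuple[int, float]], threshold: float
) -> Optional[int]:
    """Return the most frequent item value among per-frame estimates whose
    confidence is at or above the judgment threshold.

    Returns None when no estimate passes the threshold (an assumption;
    the text does not cover this case).
    """
    counts = Counter(value for value, confidence in estimates
                     if confidence >= threshold)
    if not counts:
        return None
    # most_common(1) yields the value with the highest count; ties go to
    # the value encountered first.
    return counts.most_common(1)[0][0]


if __name__ == "__main__":
    frames = [(1, 0.9), (2, 0.95), (2, 0.4), (2, 0.3), (1, 0.8)]
    print(aggregate_by_thresholded_mode(frames, threshold=0.5))
```

In the usage example, value 2 appears most often overall, but only one of its estimates passes the threshold, so value 1 is selected.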

第２の方法では、特定事象ＤＢ登録部５２は、推定項目ごとに、フレーム別テーブルにおいて最も出現頻度の高い項目値を最頻推定値として特定し、行動項目については、最頻推定値にその最頻推定値の出現頻度を乗算した値を追跡ＩＤ別テーブルの項目値、属性項目については、最頻推定値を追跡ＩＤ別テーブルの項目値とする。ここで、出現頻度は、追跡ＩＤの対象人物が写るフレームの総数である総フレーム数に対する推定値に対応するフレームの数の割合である。例えば、総フレーム数が１０００であり、年齢区分が「１」のフレーム数が１００、年齢区分が「２」のフレーム数が３００の場合、年齢区分が「１」の出現頻度は、１００／１０００＝０．１となり、年齢区分が「２」の出現頻度は、３００／１０００＝０．３となる。 In the second method, for each estimated item, the specific event DB registration unit 52 identifies the item value that appears most frequently in the frame-specific table as the most frequent estimated value. For a behavior item, the item value in the tracking ID-specific table is the most frequent estimated value multiplied by its occurrence frequency; for an attribute item, it is the most frequent estimated value itself. Here, the occurrence frequency is the ratio of the number of frames having a given estimated value to the total number of frames in which the person with the tracking ID appears. For example, if the total number of frames is 1000, the number of frames with age category "1" is 100, and the number of frames with age category "2" is 300, the occurrence frequency of age category "1" is 100/1000 = 0.1, and that of age category "2" is 300/1000 = 0.3.
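
The second method and its occurrence-frequency arithmetic can be sketched as below. The function names and the `is_behavior` flag are hypothetical; the total frame count is passed in separately because, as in the example from the text, not every frame needs to carry a value for the item.

```python
from collections import Counter
from typing import Dict, Sequence, Union


def occurrence_frequencies(values: Sequence[int], total_frames: int) -> Dict[int, float]:
    """Ratio of frames showing each item value to the total number of frames
    in which the tracked person appears."""
    counts = Counter(values)
    return {value: n / total_frames for value, n in counts.items()}


def aggregate_method2(values: Sequence[int], total_frames: int,
                      is_behavior: bool) -> Union[int, float]:
    """Aggregate per-frame item values following the second method.

    Behavior items: most frequent value multiplied by its occurrence frequency.
    Attribute items: the most frequent value itself.
    Assumes `values` is non-empty.
    """
    mode, count = Counter(values).most_common(1)[0]
    if is_behavior:
        return mode * (count / total_frames)
    return mode


if __name__ == "__main__":
    # Figures from the text: 1000 frames, 100 with category 1, 300 with category 2.
    age_values = [1] * 100 + [2] * 300
    print(occurrence_frequencies(age_values, total_frames=1000))
```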

第３の方法では、特定事象ＤＢ登録部５２は、推定項目ごとに、フレーム別テーブルにおける各フレームに対応する項目値を繋いだベクトル値を追跡ＩＤ別テーブルの項目値として算出する。このとき、特定事象ＤＢ登録部５２は、ベクトル値の要素を間引くなどして、所定の次元数のベクトル値に正規化してもよい。 In the third method, the specific event DB registration unit 52 calculates, for each estimated item, a vector value that connects item values corresponding to each frame in the frame-specific table as an item value in the tracking ID-specific table. At this time, the specific event DB registration unit 52 may normalize the vector value to a vector value with a predetermined number of dimensions, for example by thinning out the elements of the vector value.
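
For the third method, the per-frame item values are kept as a sequence and optionally thinned to a fixed number of dimensions. The sketch below uses evenly spaced sampling as the thinning rule, which is only one possible choice; the function name is hypothetical, and leaving shorter sequences unchanged is an assumption, since the text does not say how they are handled.

```python
from typing import List, Sequence


def to_fixed_length_vector(values: Sequence[int], dims: int) -> List[int]:
    """Join per-frame item values into a vector and thin it to at most `dims` elements.

    Thinning picks evenly spaced frames (an assumed rule). Sequences that
    are already short enough are returned unchanged (also an assumption).
    """
    if dims <= 0:
        raise ValueError("dims must be positive")
    n = len(values)
    if n <= dims:
        return list(values)
    # Integer arithmetic selects indices 0, n/dims, 2n/dims, ... (floored).
    return [values[(i * n) // dims] for i in range(dims)]


if __name__ == "__main__":
    per_frame = [0, 0, 1, 1, 1, 1, 0, 0, 1, 1]
    print(to_fixed_length_vector(per_frame, dims=5))
```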

以上説明した検索処理に係る構成及び動作では、特定事象ＤＢ登録部５２は、映像検索クエリに応じた特徴データを特定事象データとして登録していたが、特定事象データとして登録する特徴データはユーザ端末４を介してユーザから直接指定されてもよい。この場合、特定事象ＤＢ登録部５２は、特徴データを指定するためのインタフェースをユーザ端末４に提供してもよい。 In the configuration and operation related to the search process described above, the specific event DB registration unit 52 registered feature data corresponding to the video search query as specific event data, but the feature data to be registered as specific event data may be specified directly by the user via the user terminal 4. In this case, the specific event DB registration unit 52 may provide the user terminal 4 with an interface for specifying feature data.

また、映像検索クエリでは、推定項目の項目値のみが指定されていたが、推定項目の項目値だけでなく、項目値の確信度に対する閾値である検索閾値をさらに含んでもよい。この場合、例えば、図２に示した検索クエリテーブルのフィールド２０３及び２０４は、「１，０．８」のような「指定値，検索閾値」で表される数値列を格納する。また、行動Ａ（走る）に対応する行動項目が「１，０．８」となる場合、映像検索部５１は、確信度が０．８以上で「１」と判断された映像データを検索する。検索閾値は、特定事象ＤＢ登録部５２にて調整されてもよい。 In the description above, the video search query specifies only the item values of the estimated items, but it may additionally include a search threshold, which is a threshold on the confidence of an item value. In this case, for example, fields 203 and 204 of the search query table shown in Figure 2 store a numeric string of the form "specified value, search threshold", such as "1, 0.8". For example, when the behavior item corresponding to behavior A (running) is "1, 0.8", the video search unit 51 searches for video data in which the item value was determined to be "1" with a confidence of 0.8 or more. The search threshold may be adjusted by the specific event DB registration unit 52.
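
The effect of a search threshold on matching can be illustrated with a short Python sketch. The function names and the tuple representation of a condition and of a per-frame estimate are assumptions made for the example.

```python
from typing import Tuple


def parse_condition(cell: str) -> Tuple[int, float]:
    """Parse a query cell of the form "specified value, search threshold"."""
    value_text, threshold_text = cell.split(",")
    return int(value_text.strip()), float(threshold_text.strip())


def satisfies(condition: Tuple[int, float], estimate: Tuple[int, float]) -> bool:
    """True if the estimated item value equals the specified value and its
    confidence is at or above the search threshold."""
    specified_value, search_threshold = condition
    estimated_value, confidence = estimate
    return estimated_value == specified_value and confidence >= search_threshold


if __name__ == "__main__":
    running = parse_condition("1, 0.8")      # behavior A (running), from the text
    print(satisfies(running, (1, 0.85)))     # running detected with confidence 0.85
    print(satisfies(running, (1, 0.79)))     # confidence below the threshold
```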

図６は、検索閾値を調整する調整処理の一例を説明するためのフローチャートである。調整処理は、映像検索装置５の映像検索部５１がユーザ端末４からの映像検索クエリを受信した際に実行される。あるいは、調整処理は、映像検索装置５の映像検索部５１がユーザ端末４からのユーザの操作に基づく調整指示を受信した際に実行されてもよい。 Figure 6 is a flowchart for explaining an example of an adjustment process for adjusting a search threshold. The adjustment process is executed when the video search unit 51 of the video search device 5 receives a video search query from the user terminal 4. Alternatively, the adjustment process may be executed when the video search unit 51 of the video search device 5 receives an adjustment instruction based on a user operation from the user terminal 4.

調整処理では、先ず、特定事象ＤＢ登録部５２は、特定事象ＤＢサーバ６の特定事象ＤＢ６１に登録されてるフレーム別テーブルを参照する（ステップＳ２０１）。そして、特定事象ＤＢ登録部５２は、映像検索クエリに含まれる推定項目ごとに、その推定項目の各項目値の確信度をフレーム別テーブルから取得する（ステップＳ２０２）。 In the adjustment process, the specific event DB registration unit 52 first refers to the frame-specific table registered in the specific event DB 61 of the specific event DB server 6 (step S201). Then, for each estimated item included in the video search query, the specific event DB registration unit 52 obtains the confidence level of each item value of the estimated item from the frame-specific table (step S202).

特定事象ＤＢ登録部５２は、推定項目ごとに、各項目値の確信度の統計値として最小値を算出する（ステップＳ２０３）。統計値は、最小値に限らず、例えば、平均値、又は、平均値から標準偏差を減算した値などでもよい。 The specific event DB registration unit 52 calculates the minimum value as a statistical value of the confidence of each item value for each estimated item (step S203). The statistical value is not limited to the minimum value, and may be, for example, the average value or a value obtained by subtracting the standard deviation from the average value.

特定事象ＤＢ登録部５２は、検索閾値の調整を行うか否かの選択を要求する要求画面をユーザ端末４に送信する。ユーザ端末４の表示部４２は、要求画面を受信し、その要求画面を表示する（ステップＳ２０４）。その後、入力部４１は、検索閾値の調整を行う旨の調整要求を受け付けると、その調整要求を映像検索装置５に送信する。映像検索装置５の特定事象ＤＢ登録部５２は、調整要求を受け付けると、推定項目ごとに、各項目値の検索閾値を上記の確信度の最小値に設定し（ステップＳ２０５）、調整処理を終了する。なお、特定事象ＤＢ登録部５２は、要求画面を表示せずに、自動的に項目値の検索閾値を上記の確信度の最小値に設定してもよい。 The specific event DB registration unit 52 sends a request screen to the user terminal 4, requesting the user to select whether or not to adjust the search threshold. The display unit 42 of the user terminal 4 receives the request screen and displays the request screen (step S204). After that, when the input unit 41 receives an adjustment request to adjust the search threshold, it sends the adjustment request to the video search device 5. When the specific event DB registration unit 52 of the video search device 5 receives the adjustment request, it sets the search threshold of each item value to the minimum value of the above-mentioned confidence level for each estimated item (step S205), and ends the adjustment process. Note that the specific event DB registration unit 52 may automatically set the search threshold of the item value to the minimum value of the above-mentioned confidence level without displaying the request screen.
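
Steps S201 to S205 can be condensed into a small function that derives a search threshold from the confidences already registered in the frame-specific table. This Python sketch is illustrative: the row layout (a dict from item name to an `(item value, confidence)` tuple), the function name, and the use of the population standard deviation are assumptions, and for brevity one threshold is computed per estimated item rather than per item value.

```python
import statistics
from typing import Dict, List, Sequence, Tuple

FrameRow = Dict[str, Tuple[int, float]]  # item name -> (item value, confidence)


def adjusted_thresholds(frame_rows: List[FrameRow], query_items: Sequence[str],
                        statistic: str = "min") -> Dict[str, float]:
    """Derive a search threshold per estimated item from registered confidences.

    statistic: "min" (the default in the text), "mean", or "mean_minus_std".
    """
    thresholds: Dict[str, float] = {}
    for item in query_items:
        # S202: collect the confidences of this item across registered frames.
        confidences = [row[item][1] for row in frame_rows if item in row]
        if not confidences:
            continue  # nothing registered for this item; leave its threshold unset
        # S203: compute the chosen statistic.
        if statistic == "min":
            thresholds[item] = min(confidences)
        elif statistic == "mean":
            thresholds[item] = statistics.fmean(confidences)
        elif statistic == "mean_minus_std":
            thresholds[item] = (statistics.fmean(confidences)
                                - statistics.pstdev(confidences))
        else:
            raise ValueError(f"unknown statistic: {statistic}")
    return thresholds  # S205: values to set as the search thresholds


if __name__ == "__main__":
    rows = [{"running": (1, 0.9)}, {"running": (1, 0.7)}, {"running": (1, 0.8)}]
    print(adjusted_thresholds(rows, ["running"]))
```

Using the minimum means every already-registered frame would still match the adjusted query, whereas the mean-based variants trade some of that recall for a stricter threshold.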

図７は、映像監視システムにおける特定事象を検知するための構成が示されている。図７に示した映像監視システムは、図１に示した映像監視システムと比較して、映像検索装置５の代わりに、特定事象検索装置７を有する点で異なる。ただし、映像監視システムは、映像検索装置５と特定事象検索装置７との両方を備えるものでもよいし、映像検索装置５と特定事象検索装置７とが物理的に同一の装置で構成されてもよい。 Figure 7 shows a configuration for detecting a specific event in a video surveillance system. The video surveillance system shown in Figure 7 differs from the video surveillance system shown in Figure 1 in that it has a specific event search device 7 instead of a video search device 5. However, the video surveillance system may be equipped with both the video search device 5 and the specific event search device 7, or the video search device 5 and the specific event search device 7 may be configured as the same physical device.

なお、図７の例では、映像処理装置２の映像入力部２１が映像配信装置１から受け付ける映像データは、特定事象が写っているか否かを検知する検知対象の第２の映像データであり、映像解析部２２は、第２の映像データに対して映像解析処理を行い、その処理結果を解析結果データとして特定事象検索装置７に送信する。映像解析処理は、図１などで説明した処理と同様であり、解析結果データは特徴データと同様なデータでよい。 In the example of FIG. 7, the video data received by the video input unit 21 of the video processing device 2 from the video distribution device 1 is second video data to be detected to detect whether or not a specific event is captured, and the video analysis unit 22 performs video analysis processing on the second video data and transmits the processing result to the specific event search device 7 as analysis result data. The video analysis processing is the same as the processing described in FIG. 1 etc., and the analysis result data may be data similar to the feature data.

特定事象検索装置７は、映像データに特定事象が写っているか否かを検知する検知装置である。特定事象検索装置７は、特定事象検索部７１と、統合表示部７２とを有する。 The specific event search device 7 is a detection device that detects whether a specific event is captured in video data. The specific event search device 7 has a specific event search unit 71 and an integrated display unit 72.

特定事象検索部７１は、映像処理装置２の映像解析部２２による映像データに対する映像解析処理の処理結果である解析結果データと、特定事象ＤＢサーバ６の特定事象ＤＢ６１に格納されている特定事象データとに基づいて、映像データに特定事象が写っているか否かを検知する。 The specific event search unit 71 detects whether or not a specific event is captured in the video data based on the analysis result data, which is the result of the video analysis process performed on the video data by the video analysis unit 22 of the video processing device 2, and the specific event data stored in the specific event DB 61 of the specific event DB server 6.

統合表示部７２は、特定事象検索部７１にて特定事象が検知された場合、特定事象を示すアラート情報と映像データとを統合してユーザ端末４に送信することで、アラート情報及び映像データをユーザ端末４の表示部４２に表示する。アラート情報は、例えば、ポップアップ情報として映像データに統合される。 When a specific event is detected by the specific event search unit 71, the integrated display unit 72 integrates alert information indicating the specific event with the video data and transmits the integrated information to the user terminal 4, thereby displaying the alert information and the video data on the display unit 42 of the user terminal 4. The alert information is integrated with the video data as, for example, pop-up information.

なお、特定事象ＤＢ６１に格納されている特定事象データは、図１で示した映像監視システムにて蓄積されたものに加えて、例えば、他の映像監視システムで取得されてクラウド上に格納された特徴データ及び特定事象データなどに基づいて追加されてもよい。また、映像監視システムがクラウドに特徴データ及び特定事象データをアップロードしてもよい。 The specific event data stored in the specific event DB 61 may be added based on, for example, feature data and specific event data acquired by another video surveillance system and stored on the cloud, in addition to the data accumulated in the video surveillance system shown in FIG. 1. The video surveillance system may also upload feature data and specific event data to the cloud.

図８は、映像監視システムによる特定事象を検知する特定事象検知処理の一例を説明するためのフローチャートである。 Figure 8 is a flowchart illustrating an example of a specific event detection process that detects a specific event using a video surveillance system.

先ず、映像処理装置２の映像入力部２１が映像データを受け付けると（ステップＳ３０１）、映像解析部２２は、その映像データに対して映像解析処理を行い、映像解析処理の結果である解析結果データと映像データとを特定事象検索装置７に送信する（ステップＳ３０２）。 First, when the video input unit 21 of the video processing device 2 receives video data (step S301), the video analysis unit 22 performs video analysis processing on the video data and transmits the analysis result data, which is the result of the video analysis processing, and the video data to the specific event search device 7 (step S302).

特定事象検索装置７の特定事象検索部７１は、映像処理装置２からの解析結果データに基づいて、特定事象データを検索するための検索クエリである事象検索クエリを生成する。特定事象検索部７１は、事象検索クエリに基づいて、特定事象ＤＢ６１に格納されている特定事象データを検索する（ステップＳ３０３）。 The specific event search unit 71 of the specific event search device 7 generates an event search query, which is a search query for searching for specific event data, based on the analysis result data from the video processing device 2. The specific event search unit 71 searches for specific event data stored in the specific event DB 61 based on the event search query (step S303).

特定事象検索部７１は、検索結果を解析して映像データに特定事象が写っているか否かを判断する（ステップＳ３０４）。 The specific event search unit 71 analyzes the search results and determines whether a specific event is captured in the video data (step S304).

特定事象が写っていない場合、統合表示部７２は、映像データをユーザ端末４に送信する。ユーザ端末４の表示部４２は、映像データを受け付け、その映像データを表示し（ステップＳ３０５）、処理を終了する。 If the specific event is not captured, the integrated display unit 72 transmits the video data to the user terminal 4. The display unit 42 of the user terminal 4 receives the video data, displays the video data (step S305), and ends the process.

一方、特定事象が写っている場合、統合表示部７２は、アラート情報を生成し、映像データ及びアラート情報をユーザ端末４に送信する。ユーザ端末４の表示部４２は、映像データ及びアラート情報を受け付け、その映像データ及びアラート情報を表示し（ステップＳ３０６）、処理を終了する。 On the other hand, if a specific event is captured, the integrated display unit 72 generates alert information and transmits the video data and the alert information to the user terminal 4. The display unit 42 of the user terminal 4 receives the video data and the alert information, displays the video data and the alert information (step S306), and ends the process.
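The branch at steps S304 to S306 can be sketched as follows. This is only an illustration: the function name `build_display_payload`, the dictionary layout, and the alert text are hypothetical, and the patent does not specify how the alert is encoded beyond saying it may be integrated as pop-up information.

```python
def build_display_payload(video_ref, event_detected,
                          alert_text="specific event detected"):
    """Assemble what the integrated display unit would send to the terminal."""
    payload = {"video": video_ref}  # S305: the video is always sent
    if event_detected:              # S306: attach the alert only on detection
        payload["alert"] = {"type": "popup", "text": alert_text}
    return payload
```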

図９は、図８のステップＳ３０３～Ｓ３０６の処理である特定事象判定処理をより詳細に説明するためのフローチャートである。図１０は、特定事象判定処理の一例を説明するための図である。 Figure 9 is a flowchart for explaining in more detail the specific event determination process, which is the process of steps S303 to S306 in Figure 8. Figure 10 is a diagram for explaining an example of the specific event determination process.

先ず、特定事象検索部７１は、映像処理装置２からの解析結果データを取得する（ステップＳ４０１）。解析結果データは、映像データにおける最新のフレームに対応する処理結果だけでなく、その最新のフレームに写っている各人物の追跡ＩＤに紐づいた処理結果を含む。 First, the specific event search unit 71 acquires analysis result data from the video processing device 2 (step S401). The analysis result data includes not only the processing result corresponding to the latest frame in the video data, but also the processing result linked to the tracking ID of each person appearing in the latest frame.

特定事象検索部７１は、解析結果データを、追跡ＩＤごとに、その追跡ＩＤの追跡対象人物に関する特徴データを集約した追跡ＩＤ別のデータに変換する（ステップＳ４０２）。変換方法は、フレーム別テーブルから追跡ＩＤ別テーブルを変換する方法と同様である。図１０では、追跡ＩＤ別の解析結果データの一例として解析結果データ５００が示されている。 The specific event search unit 71 converts the analysis result data into data by tracking ID that aggregates, for each tracking ID, feature data related to the tracking target person of that tracking ID (step S402). The conversion method is the same as the method for converting a table by frame into a table by tracking ID. In FIG. 10, analysis result data 500 is shown as an example of analysis result data by tracking ID.
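The conversion in step S402 can be sketched as a group-by over tracking IDs. The aggregation rule is not spelled out in this section, so the sketch assumes that each item is summarized by its most frequent value per tracking ID; that rule, the function name `aggregate_by_tracking_id`, and the row keys are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def aggregate_by_tracking_id(frame_rows):
    """Collapse per-frame estimates into one record per tracking ID.

    frame_rows: iterable of dicts with the hypothetical keys
                'tracking_id', 'item', 'value'.
    """
    grouped = defaultdict(lambda: defaultdict(Counter))
    for row in frame_rows:
        grouped[row["tracking_id"]][row["item"]][row["value"]] += 1
    # Assumption: summarize each item by its most frequent value.
    return {
        tid: {item: counts.most_common(1)[0][0] for item, counts in items.items()}
        for tid, items in grouped.items()
    }


frames = [
    {"tracking_id": 1, "item": "action", "value": "walking"},
    {"tracking_id": 1, "item": "action", "value": "walking"},
    {"tracking_id": 1, "item": "action", "value": "running"},
    {"tracking_id": 2, "item": "action", "value": "running"},
    {"tracking_id": 2, "item": "gender", "value": "female"},
]
by_id = aggregate_by_tracking_id(frames)
```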

特定事象検索部７１は、追跡ＩＤ別の解析結果データのうち、第１の項目である行動項目（図１０の項目５０１）の項目値で構成されるベクトルを検索条件とする事象検索クエリを生成し、特定事象ＤＢ６１に格納されている追跡ＩＤ別テーブルを検索する（ステップＳ４０３）。ここでは、特定事象検索部７１は、行動項目において、事象検索クエリと追跡別テーブル（図１０のテーブル５５０）の各レコードとの類似度を算出し、類似度が高い方から降順に各レコードをソートする。類似度としては、例えば、ユークリッド距離のような特徴量間の距離の逆数を使用することができる。また、行動項目が複数ある場合、特定事象検索部７１は、各行動項目の類似度の総和又は平均値などの統計値を事象検索クエリと追跡別テーブルの各レコードとの類似度として算出してもよい。 The specific event search unit 71 generates an event search query whose search condition is a vector consisting of the item values of the action item (item 501 in FIG. 10), which is the first item, from the analysis result data by tracking ID, and searches the table by tracking ID stored in the specific event DB 61 (step S403). Here, for the action item, the specific event search unit 71 calculates the similarity between the event search query and each record of the table by tracking ID (table 550 in FIG. 10), and sorts the records in descending order of similarity. As the similarity, for example, the reciprocal of a distance between feature quantities, such as the Euclidean distance, can be used. When there are multiple action items, the specific event search unit 71 may use a statistic such as the sum or average of the per-item similarities as the similarity between the event search query and each record of the table by tracking ID.
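The similarity ranking of step S403 can be sketched as follows, using the reciprocal of the Euclidean distance as the text suggests. The small constant `eps` that guards against division by zero for identical vectors is an assumption of this sketch, as are the function names and the `action` key holding each record's action vector.

```python
import math


def similarity(query_vec, record_vec, eps=1e-9):
    """Reciprocal of the Euclidean distance; eps (an assumption) avoids 1/0."""
    return 1.0 / (math.dist(query_vec, record_vec) + eps)


def rank_records(query_vec, records):
    """Return (similarity, record) pairs sorted from most to least similar."""
    scored = [(similarity(query_vec, r["action"]), r) for r in records]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


records = [
    {"id": "a", "action": (0.8, 0.2)},
    {"id": "b", "action": (0.1, 0.9)},
    {"id": "c", "action": (0.5, 0.5)},
]
ranked = rank_records((0.9, 0.1), records)
```

With several action items, the per-item similarities could be summed or averaged before sorting, as the paragraph above allows.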

特定事象検索部７１は、追跡別テーブルから類似度が所定類似度未満のレコードを除去して、類似度が所定類似度以上のデータを抽出する（ステップＳ４０４）。なお、抽出されたデータの数が設定値以下の場合、特定事象検索部７１は、追跡別テーブルから類似度が最も高いデータを抽出してもよい。設定値は、例えば、０である。 The specific event search unit 71 removes records whose similarity is less than a predetermined similarity from the table by tracking ID and extracts the data whose similarity is equal to or greater than the predetermined similarity (step S404). If the number of extracted data items is equal to or less than a set value, the specific event search unit 71 may extract the data with the highest similarity from the table by tracking ID. The set value is, for example, 0.
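The filtering of step S404, including the fallback to the single best record, can be sketched as follows. The function and parameter names are hypothetical, and the sketch reads the fallback as "make sure the most similar record is included"; the text leaves open exactly what is extracted when the set value is greater than 0.

```python
def extract_similar(scored, min_similarity, set_value=0):
    """Keep (similarity, record) pairs at or above min_similarity.

    If the number kept is <= set_value, make sure the single most
    similar pair is included (one reading of the fallback).
    """
    kept = [pair for pair in scored if pair[0] >= min_similarity]
    if len(kept) <= set_value and scored:
        best = max(scored, key=lambda pair: pair[0])
        if best not in kept:
            kept.append(best)
    return kept
```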

特定事象検索部７１は、行動項目以外の第２の項目（属性項目、カメラＩＤ及び取得時刻など）のそれぞれについて、項目値の分布を解析して、項目値の分布の偏り（局在性）を評価した評価値を算出する（ステップＳ４０５）。評価値は、例えば、標準偏差などである。 The specific event search unit 71 analyzes the distribution of item values for each of the second items other than the action items (attribute items, camera ID, acquisition time, etc.) and calculates an evaluation value that evaluates the bias (localization) of the distribution of the item values (step S405). The evaluation value is, for example, a standard deviation.

特定事象検索部７１は、評価値が所定評価値以上の項目を偏り項目として抽出する（ステップＳ４０６）。 The specific event search unit 71 extracts items whose evaluation value is equal to or greater than a predetermined evaluation value as biased items (step S406).

特定事象検索部７１は、偏り項目について、追跡別テーブルにおいて最も頻度の高い項目値である最頻値と事象検索クエリの項目値とを比較する（ステップＳ４０７）。 The specific event search unit 71 compares the most frequent item value in the tracking table for the biased item with the item value in the event search query (step S407).

特定事象検索部７１は、それらの項目値が一致する場合、映像データに特定事象が写っていると判断し、アラート情報を生成する（ステップＳ４０８）。本実施形態では、特定事象検索部７１は、全ての偏り項目について項目値が一致している場合、アラート情報として特定事象が高確率で写っていることを示す警告アラートを生成し、偏り項目の一部について項目値が一致している場合、アラート情報として特定事象が写っている可能性があることを示す注意アラートを生成する。 If the item values match, the specific event search unit 71 determines that a specific event is captured in the video data and generates alert information (step S408). In this embodiment, if the item values match for all biased items, the specific event search unit 71 generates a warning alert as alert information indicating that there is a high probability that a specific event is captured, and if the item values match for some of the biased items, it generates a caution alert as alert information indicating that a specific event may be captured.
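Steps S405 to S408 can be sketched as follows. The text gives the standard deviation as one example of an evaluation value; because the second items here are categorical, this sketch substitutes the relative frequency of the most frequent value as the localization score. That substitution, the threshold `min_bias`, and all names are assumptions made for illustration.

```python
from collections import Counter


def judge_specific_event(query, records, second_items, min_bias=0.6):
    """Return 'warning', 'caution', or None for one query.

    query        : dict of item -> value from the analysis result data.
    records      : extracted records (dicts) from the table by tracking ID.
    second_items : items other than the action items to evaluate.
    """
    if not records:
        return None
    biased = {}
    for item in second_items:
        counts = Counter(r[item] for r in records)
        mode, freq = counts.most_common(1)[0]
        if freq / len(records) >= min_bias:  # S405/S406: biased item
            biased[item] = mode
    # S407: compare the mode of each biased item with the query value.
    matches = [item for item, mode in biased.items() if query.get(item) == mode]
    if biased and len(matches) == len(biased):
        return "warning"   # S408: every biased item matches
    if matches:
        return "caution"   # S408: only some biased items match
    return None


records = [
    {"camera_id": "C1", "time_band": "night", "gender": "male"},
    {"camera_id": "C1", "time_band": "night", "gender": "female"},
    {"camera_id": "C1", "time_band": "night", "gender": "male"},
    {"camera_id": "C2", "time_band": "night", "gender": "female"},
]
items = ["camera_id", "time_band", "gender"]
```

In this example `camera_id` and `time_band` are biased (shares of 0.75 and 1.0), while `gender` is evenly split and is therefore ignored in the comparison.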

なお、特定事象検索部７１は、アラート情報と、特定事象が写っていると判断した映像データを特定する特定事象ＩＤとを対応づけて特定事象ＤＢ６１に格納してもよい。また、全ての偏り項目について項目値が一致しなかった場合、特定事象検索部７１は、特定事象が写っている可能性が低い旨の低アラート情報を特定事象ＩＤと対応づけて特定事象ＤＢ６１に格納してもよい。また、低アラート情報の代わりに注意アラートが用いられてもよい。 The specific event search unit 71 may store the alert information in the specific event DB 61 in association with a specific event ID that identifies the video data determined to show a specific event. If the item values match for none of the biased items, the specific event search unit 71 may store low alert information, indicating that a specific event is unlikely to be shown, in the specific event DB 61 in association with the specific event ID. A caution alert may be used instead of the low alert information.

図１１は、映像監視システムにおける特定事象のリストを表示する構成が示されている。図１１に示した映像監視システムは、図７に示した映像監視システムと比較して、特定事象検索装置７が特定事象検索部７１及び統合表示部７２を備える代わりに、コンテンツ生成部７３を備える点で異なる。ただし、特定事象検索装置７は、特定事象検索部７１、統合表示部７２及びコンテンツ生成部７３を備えるものでもよい。 Figure 11 shows a configuration for displaying a list of specific events in a video surveillance system. The video surveillance system shown in Figure 11 differs from the video surveillance system shown in Figure 7 in that the specific event search device 7 includes a content generation unit 73 instead of a specific event search unit 71 and an integrated display unit 72. However, the specific event search device 7 may also include a specific event search unit 71, an integrated display unit 72, and a content generation unit 73.

コンテンツ生成部７３は、特定事象ＤＢ６１に格納されている特定事象ＩＤのリストを特定事象リストとして生成し、その特定事象ＩＤのいずれかを選択する事象選択画面を表示部４２に表示する。その後、入力部４１が特定事象ＩＤのいずれかを選択した場合、コンテンツ生成部７３は、その選択された特定事象ＩＤに対応する映像データをＦＤＢ３１などから取得してユーザ端末４に送信することで、表示部４２に表示する。また、コンテンツ生成部７３は、その映像データに関連する関連映像データを、例えば、ＦＤＢ３１、特定事象ＤＢ６１又は映像配信装置１（レコーダの場合）から取得して表示部４２に表示してもよい。関連映像データは、例えば、映像データに写っている人物と同じ人物が写っている映像データなどである。 The content generation unit 73 generates a list of specific event IDs stored in the specific event DB 61 as a specific event list, and displays an event selection screen on the display unit 42 for selecting one of the specific event IDs. Thereafter, when the input unit 41 selects one of the specific event IDs, the content generation unit 73 acquires video data corresponding to the selected specific event ID from the FDB 31 or the like, and transmits it to the user terminal 4, thereby displaying it on the display unit 42. The content generation unit 73 may also acquire related video data related to the video data, for example, from the FDB 31, the specific event DB 61, or the video distribution device 1 (in the case of a recorder), and display it on the display unit 42. The related video data is, for example, video data showing the same person as the person shown in the video data.

なお、特定事象ＤＢ６１は、コンテンツ生成部７３にて生成されたコンテンツ（映像データ及び関連映像データ）を保持してもよい。また、本コンテンツはクラウドなどにアップロードされてもよい。 The specific event DB 61 may store the content (video data and related video data) generated by the content generation unit 73. This content may also be uploaded to the cloud or the like.

以上説明したように本実施形態によれば、特定事象ＤＢ６１は、複数の第１の映像データのそれぞれに写る被写体の特徴を示す複数の特徴データから、所望の第１の映像データを検索するための特徴に関する検索クエリに応じて抽出した特定事象データを格納する。映像解析部２２は、第２の映像データに写る被写体の特徴を示す解析結果データを生成する。特定事象検索部７１は、解析結果データと特定事象データとに基づいて、第２の映像データに特定の監視事象である特定事象が写っているか否かを検知する。したがって、ユーザが特定事象の写っている可能性があると考えて検索した過去の検索実績に基づいて特定事象が写っているか否かが検知されるので、専門的な知見を有するユーザがいなくても特定事象を検知することが可能となる。 As described above, according to this embodiment, the specific event DB 61 stores specific event data extracted from a plurality of feature data indicating the features of the subject appearing in each of a plurality of first video data in response to a search query regarding features for searching for desired first video data. The video analysis unit 22 generates analysis result data indicating the features of the subject appearing in the second video data. The specific event search unit 71 detects whether or not a specific event, which is a specific surveillance event, is shown in the second video data based on the analysis result data and the specific event data. Therefore, whether or not a specific event is shown is detected based on past search results in which a user searched for a specific event that may be shown, so that it is possible to detect a specific event even without a user with specialized knowledge.

また、本実施形態では、特定事象検索部７１は、解析結果データにて示される特徴と被写体別データ（追跡ＩＤ別テーブルの各レコード）にて示される特徴とを比較し、その比較結果に基づいて、特定事象が写っているか否かを検知する。このため、過去に検索された画像の特徴に基づいて特定事象が検知されるため、特定事象が写っているか否かをより精度よく検知することが可能となる。 In addition, in this embodiment, the specific event search unit 71 compares the features indicated in the analysis result data with the features indicated in the subject-specific data (each record in the tracking ID-specific table), and detects whether or not a specific event is captured based on the comparison result. Therefore, since a specific event is detected based on the features of images previously searched, it becomes possible to more accurately detect whether or not a specific event is captured.

また、本実施形態では、特定事象検索部７１は、特徴の第１の項目において、解析結果データにて示される特徴との類似度が所定類似度以上の特徴を示す被写体別データを抽出し、前記解析結果データにて示される特徴と当該抽出した被写体別データのそれぞれにて示される特徴とを比較する。第１の項目は、行動項目であることが好ましい。この場合、特定事象が写っているか否かをより精度よく検知することが可能となる。 In addition, in this embodiment, the specific event search unit 71 extracts subject-specific data that indicates a feature in the first feature item that has a similarity to the feature indicated in the analysis result data that is equal to or greater than a predetermined similarity, and compares the feature indicated in the analysis result data with the feature indicated in each of the extracted subject-specific data. It is preferable that the first feature item is an action item. In this case, it becomes possible to more accurately detect whether or not a specific event is captured.

また、本実施形態では、特定事象検索部７１は、特徴の第２の項目において、複数の被写体別データのそれぞれの特徴の偏りを評価した評価値が所定値以上の場合、当該項目の値のうち最も頻度の高い値と解析結果データの当該項目の値とを比較する。そして、特定事象検索部７１は、各項目の値が一致している場合、特定事象が写っていると検知する。第２の項目は、属性項目を含むことが好ましい。この場合、特定事象が写っているか否かをより精度よく検知することが可能となる。 In addition, in this embodiment, when the evaluation value that evaluates the bias of the features across the multiple subject-specific data in a second item of the features is equal to or greater than a predetermined value, the specific event search unit 71 compares the most frequent value of that item with the value of that item in the analysis result data. If the values match for each item, the specific event search unit 71 detects that a specific event is captured. It is preferable that the second item includes an attribute item. In this case, it becomes possible to detect more accurately whether or not a specific event is captured.

また、本実施形態では、特定事象ＤＢ登録部５２は、特徴データから映像検索クエリに応じて特定事象データを抽出して特定事象ＤＢ６１に登録する。このため、特定事象データを更新していくことが可能となるため、特定事象が写っているか否かをより精度よく検知することが可能となる。 In addition, in this embodiment, the specific event DB registration unit 52 extracts specific event data from the feature data in response to the video search query and registers the specific event data in the specific event DB 61. This makes it possible to update the specific event data, making it possible to more accurately detect whether or not a specific event is captured.

また、本実施形態では、特定事象ＤＢ登録部５２は、映像検索クエリに適合した特徴データのうちユーザにて選択された特徴データを特定事象データとして抽出する。このため、ユーザが特定事象の写っている可能性が高いと考えて検索した過去の検索実績に基づいて特定事象が写っているか否かが検知されるので、特定事象が写っているか否かをより精度よく検知することが可能となる。 In addition, in this embodiment, the specific event DB registration unit 52 extracts feature data selected by the user from among feature data that matches the video search query as specific event data. Therefore, whether or not a specific event is captured is detected based on past search results in which the user performed searches that were deemed highly likely to contain a specific event, making it possible to more accurately detect whether or not a specific event is captured.

また、本実施形態では、統合表示部７２は、特定事象が写っていた場合、特定事象を示すアラート情報と第２の映像データとを表示する。このため、ユーザに特定事象が写っていることを適切に通知することが可能になる。 In addition, in this embodiment, if a specific event is captured, the integrated display unit 72 displays alert information indicating the specific event and the second video data. This makes it possible to properly notify the user that a specific event is captured.

（第２の実施形態） (Second Embodiment)
本実施形態では、機械学習を用いて特定事象を特定する場合における映像監視システムについて説明する。 In this embodiment, a video surveillance system that identifies a specific event using machine learning will be described.

図１２は、本実施形態の映像監視システムにおける機械学習を行う構成を示す図である。図１２に示す映像監視システムは、図１に示した映像監視システムと比較して、機械学習装置８をさらに有する点で異なる。なお、機械学習装置８は、映像検索装置５と物理的に同一の装置で構成されてもよい。 Figure 12 is a diagram showing the configuration for performing machine learning in the video surveillance system of this embodiment. The video surveillance system shown in Figure 12 differs from the video surveillance system shown in Figure 1 in that it further includes a machine learning device 8. Note that the machine learning device 8 may be physically the same device as the video search device 5.

機械学習装置８は、特定事象ＤＢサーバ６の特定事象ＤＢ６１に格納されている特定事象データに基づいて、特定事象を検知するための機械学習モデルである特定事象モデルを構築する。機械学習装置８は、特定事象学習部８１と、特定事象モデル部８２とを有する。 The machine learning device 8 constructs a specific event model, which is a machine learning model for detecting a specific event, based on the specific event data stored in the specific event DB 61 of the specific event DB server 6. The machine learning device 8 has a specific event learning unit 81 and a specific event model unit 82.

特定事象学習部８１は、特定事象ＤＢサーバ６の特定事象ＤＢ６１に格納されている特定事象データを学習データとして用いて、映像解析部２２の解析結果データを特定事象の有無でクラス分けする特定事象モデルを構築する。例えば、特定事象学習部８１は、特定事象データのカメラＩＤ及び取得時刻などをデータ、行動項目及び属性項目の項目値（推定クラス）及び確信度を教師ラベルとした、マルチラベル機械学習などを行うことで、特定事象モデルを構築する。機械学習の手法は、特に限定されないが、例えば、重回帰分析などである。 The specific event learning unit 81 uses the specific event data stored in the specific event DB 61 of the specific event DB server 6 as learning data to construct a specific event model that classifies the analysis result data of the video analysis unit 22 according to the presence or absence of a specific event. For example, the specific event learning unit 81 constructs a specific event model by performing multi-label machine learning using the camera ID and acquisition time of the specific event data as data, and the item values (estimated classes) and confidence levels of the action items and attribute items as teacher labels. The machine learning method is not particularly limited, but may be, for example, multiple regression analysis.
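One way to read the multi-label regression mentioned above is sketched below: the camera ID and acquisition hour are the inputs, and the confidence of each label is a separate regression target. This is a toy illustration under several assumptions, not the patented model: the one-hot and hour encoding, the fitting by batch gradient descent, the sample values, and all names are hypothetical, and the sketch does not cover how the model output is turned into a decision on whether a specific event is present.

```python
def encode(camera_id, hour, cameras):
    """One-hot camera ID followed by the hour scaled to [0, 1)."""
    return [1.0 if camera_id == c else 0.0 for c in cameras] + [hour / 24.0]


def fit(samples, cameras, labels, lr=0.1, epochs=2000):
    """Fit one linear regressor per label with batch gradient descent."""
    xs = [encode(cam, hour, cameras) for cam, hour, _ in samples]
    weights = {label: [0.0] * len(xs[0]) for label in labels}
    for _ in range(epochs):
        for label in labels:
            w = weights[label]
            grad = [0.0] * len(w)
            for x, (_, _, target) in zip(xs, samples):
                err = sum(wi * xi for wi, xi in zip(w, x)) - target[label]
                for j, xj in enumerate(x):
                    grad[j] += 2.0 * err * xj / len(samples)
            weights[label] = [wi - lr * g for wi, g in zip(w, grad)]
    return weights


def predict(weights, camera_id, hour, cameras):
    """Return the predicted confidence for each label."""
    x = encode(camera_id, hour, cameras)
    return {label: sum(wi * xi for wi, xi in zip(w, x))
            for label, w in weights.items()}


cameras = ["C1", "C2"]
labels = ["loitering", "running"]
samples = [  # made-up training rows: (camera ID, hour, label confidences)
    ("C1", 22, {"loitering": 0.9, "running": 0.1}),
    ("C1", 23, {"loitering": 0.8, "running": 0.2}),
    ("C2", 9, {"loitering": 0.1, "running": 0.7}),
    ("C2", 10, {"loitering": 0.2, "running": 0.9}),
]
weights = fit(samples, cameras, labels)
night = predict(weights, "C1", 22, cameras)
morning = predict(weights, "C2", 9, cameras)
```

In practice a closed-form least-squares solver from a numerical library would be the more usual choice; gradient descent is used here only to keep the sketch free of dependencies.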

特定事象モデル部８２は、特定事象学習部８１にて構築された特定事象モデルを保持する。 The specific event model unit 82 holds the specific event model constructed by the specific event learning unit 81.

図１３は、本実施形態の映像監視システムにおける機械学習モデルを用いて特定事象を検知する構成を示す図である。図１３に示す映像監視システムは、図１２に示した映像監視システムと比較して、映像検索装置５の代わりに、特定事象検索装置７及び機械学習装置８を有する点で異なる。特定事象検索装置７は、図７に示した特定事象検索装置７と比較して、特定事象検索部７１の代わりに特定事象推定部７４を備える点で異なる。なお、機械学習装置８は、図１３では、特定事象モデル部８２のみ示している。 Figure 13 is a diagram showing a configuration for detecting a specific event using a machine learning model in the video surveillance system of this embodiment. The video surveillance system shown in Figure 13 differs from the video surveillance system shown in Figure 12 in that it has a specific event search device 7 and a machine learning device 8 instead of the video search device 5. The specific event search device 7 differs from the specific event search device 7 shown in Figure 7 in that it has a specific event estimation unit 74 instead of the specific event search unit 71. Note that in Figure 13, only the specific event model unit 82 of the machine learning device 8 is shown.

特定事象推定部７４は、特定事象モデル部８２に設定された特定事象モデルを用いて、解析結果データと特定事象データとに基づいて、映像データに特定事象が写っているか否かを検知する検知部である。 The specific event estimation unit 74 is a detection unit that uses the specific event model set in the specific event model unit 82 to detect whether or not a specific event is captured in the video data based on the analysis result data and the specific event data.

図１４は、本実施形態における映像監視システムによる特定事象を検知する特定事象検知処理の一例を説明するためのフローチャートである。 Figure 14 is a flowchart illustrating an example of a specific event detection process for detecting a specific event by the video surveillance system of this embodiment.

先ず、映像処理装置２の映像入力部２１が映像データを受け付けると（ステップＳ５０１）、映像解析部２２は、その映像データに対して映像解析処理を行い、映像解析処理の結果である解析結果データと映像データとを特定事象検索装置７に送信する（ステップＳ５０２）。 First, when the video input unit 21 of the video processing device 2 receives video data (step S501), the video analysis unit 22 performs video analysis processing on the video data and transmits the analysis result data, which is the result of the video analysis processing, and the video data to the specific event search device 7 (step S502).

特定事象検索装置７の特定事象推定部７４は、映像処理装置２からの解析結果データを機械学習装置８の特定事象モデル部８２に保持されている特定事象モデルに入力する（ステップＳ５０３）。特定事象推定部７４は、特定事象モデルの出力を確認して、映像データに特定事象が写っているか否かを判断する（ステップＳ５０４）。 The specific event estimation unit 74 of the specific event search device 7 inputs the analysis result data from the video processing device 2 into the specific event model held in the specific event model unit 82 of the machine learning device 8 (step S503). The specific event estimation unit 74 checks the output of the specific event model and determines whether or not a specific event is captured in the video data (step S504).

特定事象が写っていない場合、統合表示部７２は、映像データをユーザ端末４に送信する。ユーザ端末４の表示部４２は、映像データを受け付け、その映像データを表示し（ステップＳ５０５）、処理を終了する。 If the specific event is not captured, the integrated display unit 72 transmits the video data to the user terminal 4. The display unit 42 of the user terminal 4 receives the video data, displays the video data (step S505), and ends the process.

一方、特定事象が写っている場合、統合表示部７２は、アラート情報を生成し、映像データ及びアラート情報をユーザ端末４に送信する。ユーザ端末４の表示部４２は、映像データ及びアラート情報を受け付け、その映像データ及びアラート情報を表示し（ステップＳ５０６）、処理を終了する。 On the other hand, if a specific event is captured, the integrated display unit 72 generates alert information and transmits the video data and the alert information to the user terminal 4. The display unit 42 of the user terminal 4 receives the video data and the alert information, displays the video data and the alert information (step S506), and ends the process.

なお、アラート情報を確認したユーザが映像データに特定事象が写っているか否かを判断し、特定事象が写っていない場合、その旨を入力部４１に入力してもよい。この場合、特定事象推定部７４は、その旨に応じて、特定事象ＤＢ６１に格納されている特定事象データを更新してもよい。この場合、特定事象学習部８１が機械学習に用いる学習データが更新されるため、特定事象モデルを更新することが可能となる。 The user who has checked the alert information may determine whether or not a specific event is captured in the video data, and if the specific event is not captured, may input this to the input unit 41. In this case, the specific event estimation unit 74 may update the specific event data stored in the specific event DB 61 accordingly. In this case, the learning data used by the specific event learning unit 81 for machine learning is updated, making it possible to update the specific event model.

以上説明した各装置１～８は、例えば、プロセッサ（コンピュータ）及びメモリ（共に図示せず）を備えたコンピュータシステムにより構成されてもよい。この場合、各装置１～８の構成要素及び機能は、例えば、プロセッサがコンピュータプログラムを読み取り、その読み取ったコンピュータプログラムを実行することで実現される。コンピュータプログラムは、メモリのようなコンピュータにて読み取り可能な記録媒体に記録可能である。また、各装置１～８は、必要に応じて、情報の入出力を行う入出力装置、及び、情報を格納する補助記憶装置などを有してもよい。 Each of the devices 1 to 8 described above may be configured, for example, by a computer system equipped with a processor (computer) and memory (both not shown). In this case, the components and functions of each of the devices 1 to 8 are realized, for example, by the processor reading a computer program and executing the read computer program. The computer program can be recorded on a computer-readable recording medium such as memory. Furthermore, each of the devices 1 to 8 may have an input/output device for inputting and outputting information, and an auxiliary storage device for storing information, as necessary.

上述した本開示の実施形態は、本開示の説明のための例示であり、本開示の範囲をそれらの実施形態にのみ限定する趣旨ではない。当業者は、本開示の範囲を逸脱することなしに、他の様々な態様で本開示を実施することができる。 The above-described embodiments of the present disclosure are illustrative examples of the present disclosure, and are not intended to limit the scope of the present disclosure to only those embodiments. Those skilled in the art may implement the present disclosure in various other forms without departing from the scope of the present disclosure.

１：映像配信装置、２：映像処理装置、３：ＦＤＢサーバ、４：ユーザ端末、５：映像検索装置、６：特定事象ＤＢサーバ、７：特定事象検索装置、８：機械学習装置、２１：映像入力部、２２：映像解析部、２３：ＦＤＢ登録部、３１：ＦＤＢ、４１：入力部、４２：表示部、５１：映像検索部、５２：特定事象ＤＢ登録部、６１：特定事象ＤＢ、７１：特定事象検索部、７２：統合表示部、７３：コンテンツ生成部、７４：特定事象推定部、８１：特定事象学習部、８２：特定事象モデル部 1: Video distribution device; 2: Video processing device; 3: FDB server; 4: User terminal; 5: Video search device; 6: Specific event DB server; 7: Specific event search device; 8: Machine learning device; 21: Video input unit; 22: Video analysis unit; 23: FDB registration unit; 31: FDB; 41: Input unit; 42: Display unit; 51: Video search unit; 52: Specific event DB registration unit; 61: Specific event DB; 71: Specific event search unit; 72: Integrated display unit; 73: Content generation unit; 74: Specific event estimation unit; 81: Specific event learning unit; 82: Specific event model unit

Claims

a storage unit that stores specific event data extracted from a plurality of feature data indicating features of a subject captured in each of a plurality of first video image data in response to a search query related to the features for searching for a desired one of the first video image data;
an analysis unit that generates analysis result data indicating characteristics of a subject captured in the second video data;
a detection unit that detects whether or not a specific event that is a specific monitoring event is captured in the second video data based on the analysis result data and the specific event data,
the specific event data includes, for each subject having the characteristic matching the search query, a plurality of subject-specific data summarizing characteristics of a subject identical to the subject in the first video data;
The detection unit compares the features indicated in the analysis result data with the features indicated in each of the plurality of subject-specific data, and detects whether or not the specific event is captured based on the comparison result;
A video surveillance system in which the detection unit extracts the subject-specific data that indicates, in a first item of the features, a feature whose similarity to the feature indicated in the analysis result data is equal to or greater than a predetermined similarity, and compares the feature indicated in the analysis result data with the feature indicated in each of the extracted subject-specific data.

The video surveillance system according to claim 1, wherein the first item indicates a characteristic related to the behavior of the subject.

a storage unit that stores specific event data extracted from a plurality of feature data indicating features of a subject captured in each of a plurality of first video image data in response to a search query related to the features for searching for a desired one of the first video image data;
an analysis unit that generates analysis result data indicating characteristics of a subject captured in the second video data;
a detection unit that detects whether or not a specific event that is a specific monitoring event is captured in the second video data based on the analysis result data and the specific event data,
the specific event data includes, for each subject having the characteristic matching the search query, a plurality of subject-specific data aggregating characteristics of a subject identical to the subject in the first video data;
The detection unit compares the features indicated in the analysis result data with the features indicated in each of the plurality of subject-specific data, and detects whether or not the specific event is captured based on the comparison result;
In a video surveillance system, when an evaluation value evaluating the bias of each feature of the plurality of subject-specific data in a second item of the features is equal to or greater than a predetermined value, the detection unit compares the most frequent value among the values of the item with the value of the item of the analysis result data.

The video surveillance system according to claim 3, wherein the detection unit detects that the specific event is captured in the video when values of the respective items match.

The video surveillance system according to claim 3, wherein the second item indicates a feature related to an attribute of the subject.

A learning unit that uses the specific event data as learning data to construct a specific event model that classifies the analysis result data according to the presence or absence of the specific event,
The video surveillance system according to claim 1, wherein the detection unit detects whether or not the specific event is captured by using the specific event model.

A video surveillance system comprising:
a storage unit that stores specific event data extracted from a plurality of feature data indicating features of a subject captured in each of a plurality of first video image data in response to a search query related to the features for searching for a desired one of the first video image data;
an analysis unit that generates analysis result data indicating characteristics of a subject captured in the second video data;
a detection unit that detects whether or not a specific event that is a specific monitoring event is captured in the second video data based on the analysis result data and the specific event data; and
a registration unit that extracts the specific event data from the feature data in response to the search query and registers the specific event data in the storage unit.
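
The registration step can be sketched as follows, assuming that a search query is a set of item/value pairs that a feature record must match exactly. The class names `Storage` and `RegistrationUnit`, the helper `matches_query`, and the example items are hypothetical; the claim does not fix a query format or a storage layout.

```python
def matches_query(feature, query):
    """True when every item in the query equals the feature's value."""
    return all(feature.get(item) == value for item, value in query.items())


class Storage:
    """Holds specific event data keyed by a hypothetical event name."""

    def __init__(self):
        self.specific_event_data = {}


class RegistrationUnit:
    def __init__(self, storage):
        self.storage = storage

    def register(self, event_name, feature_data, query):
        """Extract the records matching the query and store them.

        Returns the number of records registered.
        """
        extracted = [f for f in feature_data if matches_query(f, query)]
        self.storage.specific_event_data.setdefault(event_name, []).extend(extracted)
        return len(extracted)


# Hypothetical feature data derived from first video data.
feature_data = [
    {"subject_id": 1, "action": "crouching", "location": "platform_edge"},
    {"subject_id": 2, "action": "walking", "location": "concourse"},
    {"subject_id": 3, "action": "crouching", "location": "stairs"},
]
storage = Storage()
unit = RegistrationUnit(storage)
count = unit.register("person_in_distress", feature_data, {"action": "crouching"})
```

The same query that an operator uses to search past video thus doubles as the definition of the specific event data.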

The video surveillance system of claim 1, further comprising an integrated display unit that displays alert information indicating the specific event and the second video data when the specific event is captured.

A video monitoring method using a video monitoring system, comprising:
storing specific event data extracted from a plurality of feature data indicating features of a subject appearing in each of a plurality of first video image data in response to a search query relating to the features for searching for a desired one of the first video image data;
generating analysis result data indicating characteristics of a subject captured in the second video data;
detecting whether or not a specific event, which is a specific monitoring event, is captured in the second video data based on the analysis result data and the specific event data;
the specific event data includes, for each subject having the characteristic matching the search query, a plurality of subject-specific data aggregating characteristics of a subject identical to the subject in the first video data;
in the detecting, a feature indicated by the analysis result data is compared with a feature indicated by each of the plurality of subject-specific data, and based on the comparison result, it is detected whether or not the specific event is captured; and
in the detecting, the subject-specific data whose feature in a first item of the features has a similarity to the feature indicated in the analysis result data that is equal to or greater than a predetermined similarity is extracted, and the feature indicated in the analysis result data is compared with the feature indicated in each of the extracted subject-specific data.
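
The two-stage detection recited above (extract by similarity in a first item, then compare features) can be sketched as below. The sketch assumes that the first item is a numeric appearance vector compared by cosine similarity and that the specific event is deemed captured when any extracted record agrees with the analysis result on all compared items. These choices, the function names, the item names, and the threshold 0.9 are assumptions for illustration and are not taken from the claims.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def detect(analysis_result, subject_data, first_item, compare_items,
           min_similarity=0.9):
    """Return True when a sufficiently similar record matches on all items."""
    # Stage 1: keep only records similar enough in the first item.
    candidates = [
        d for d in subject_data
        if cosine_similarity(d[first_item], analysis_result[first_item])
        >= min_similarity
    ]
    # Stage 2: compare the remaining features record by record.
    return any(
        all(d.get(i) == analysis_result.get(i) for i in compare_items)
        for d in candidates
    )


# Hypothetical subject-specific data.
subject_data = [
    {"appearance": [1.0, 0.0], "action": "loitering"},
    {"appearance": [0.0, 1.0], "action": "running"},
]
analysis_result = {"appearance": [0.9, 0.1], "action": "loitering"}
captured = detect(analysis_result, subject_data, "appearance", ["action"])
```

Filtering on the first item before the detailed comparison keeps the number of record-by-record comparisons small when many subject-specific data are registered.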

A video monitoring method using a video monitoring system, comprising:
storing specific event data extracted from a plurality of feature data indicating features of a subject appearing in each of a plurality of first video image data in response to a search query related to the features for searching for a desired one of the first video image data;
generating analysis result data indicating characteristics of a subject captured in the second video data;
detecting whether or not a specific event, which is a specific monitoring event, is captured in the second video data based on the analysis result data and the specific event data;
the specific event data includes, for each subject having the characteristic matching the search query, a plurality of subject-specific data aggregating characteristics of a subject identical to the subject in the first video data;
in the detecting, a feature indicated by the analysis result data is compared with a feature indicated by each of the plurality of subject-specific data, and based on the comparison result, it is detected whether or not the specific event is captured; and
in the detecting, when an evaluation value evaluating the bias, across the plurality of subject-specific data, of the values of a second item of the features is equal to or greater than a predetermined value, the most frequent value among the values of the second item is compared with the value of that item in the analysis result data.

A video monitoring method using a video monitoring system, comprising:
storing specific event data extracted from a plurality of feature data indicating features of a subject appearing in each of a plurality of first video image data in response to a search query relating to the features for searching for a desired one of the first video image data;
generating analysis result data indicating characteristics of a subject captured in the second video data;
detecting whether or not a specific event, which is a specific monitoring event, is captured in the second video data based on the analysis result data and the specific event data; and
extracting the specific event data from the feature data in response to the search query and registering the extracted specific event data.