JP7770844B2 - Image capture device, image capture device control method and program

Info

Publication number: JP7770844B2
Application number: JP2021162763A
Authority: JP
Inventors: 一人寺境
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2021-10-01
Filing date: 2021-10-01
Publication date: 2025-11-17
Anticipated expiration: 2041-10-01
Also published as: JP2023053615A

Description

The present invention relates to an imaging device, a control method for an imaging device, and a program.

Digital imaging devices have evolved significantly in recent years. Further advances are expected to bring higher performance and more functions, for example through integration with other devices. Gaze input is one candidate source of input information for such devices. As a technique that uses gaze information, Patent Document 1 describes outputting a change signal that instructs an image switching control device to switch between an electronic image and an external-world image, on the condition that the state of the convergence angle θ formed by the left and right eyeballs changes and the same state then persists for a long time.

Patent Document 1: Japanese Patent Application Publication No. H08-160345

In a conventional imaging device, a shooting instruction is input by pressing the release button. However, a time lag occurs between the moment the user decides to shoot and the moment the shooting operation actually starts, so the user can miss the intended scene. Even if a shooting instruction were input on the basis of the convergence angle formed by the left and right eyeballs, as in the technique of Patent Document 1, the accuracy of detecting the shooting instruction would be low, and the above problem would remain unsolved.

An object of the present invention is to reduce missed shots of scenes the user wants to capture.

The imaging device of the present invention is characterized by comprising: detection means for detecting the state of the user's eyes; analysis means for analyzing eye movement based on the detection result of the detection means; and estimation means for estimating the timing at which to start a release operation, based on output data that indicates whether the scene is one the user wants to shoot and that is obtained by inputting movement information representing the eye movement analyzed by the analysis means into a learning model.

According to the present invention, missed shots of scenes the user wants to capture can be reduced.

FIG. 1 is a diagram illustrating an example of the overall configuration of an imaging device.
FIG. 2 is a diagram illustrating an example of the hardware configuration of the imaging device.
FIG. 3 is a diagram illustrating an example of the functional configuration in the learning phase.
FIG. 4 is a diagram illustrating an example of information used to generate learning data.
FIG. 5 is a conceptual diagram showing the input/output structure using a learning model.
FIG. 6 is a diagram for explaining a method for analyzing eye movements.
FIG. 7 is a flowchart showing the processing executed in the learning phase.
FIG. 8 is a diagram illustrating an example of the functional configuration in the estimation phase.
FIG. 9 is a flowchart showing the processing executed in the estimation phase.

One embodiment of the present invention will be described below with reference to the accompanying drawings.

<Overall configuration of the imaging device>
FIG. 1 is a diagram showing an example of the overall configuration of an imaging device 100 according to this embodiment.
The imaging device 100 is a digital still camera and includes an imaging lens 101 and an image sensor 102. The imaging lens 101 is a lens group including a zoom lens and a focus lens. The image sensor 102 receives, on its imaging surface, the optical image guided by the imaging lens 101 and converts it into an electrical signal.

The imaging device 100 also includes a CPU 103, a memory 104, a GPU (Graphics Processing Unit) 105, and an FPGA (Field Programmable Gate Array) 106. The CPU 103 controls the entire imaging device 100. The memory 104 is a RAM, ROM, HDD, or the like, and stores programs. The processing of the flowcharts described later is realized by the CPU 103 executing the programs stored in the memory 104. The GPU 105 and the FPGA 106 can compute efficiently by processing more data in parallel. For this reason, when learning is repeated many times with a learning model, as in deep learning, it is effective to perform the processing on the GPU 105 or the FPGA 106. In this embodiment, the GPU 105 and the FPGA 106 work together with the CPU 103 when analysis or learning is performed.

The imaging device 100 also includes a gaze detection sensor 107, a display element 108, a display element drive circuit 109, an eyepiece 110, and a release button 111.
The gaze detection sensor 107 is a sensor for detecting that the user has looked into the viewfinder. The gaze detection sensor 107 outputs its detection result to the CPU 103.
The display element 108 is a liquid crystal screen or the like and is provided inside the viewfinder. The display element drive circuit 109, under the control of the CPU 103, drives the display element 108 so that the image captured by the image sensor 102 is displayed on its screen. The display element 108 is an example of a display unit. The eyepiece 110 is used to magnify the image displayed on the display element 108 for observation. The release button 111 is operated by the user when shooting, and its operation signal is input to a signal input circuit 203 (FIG. 2).

The imaging device 100 also includes illumination light sources 112a and 112b, a light splitter 114, a light-receiving lens 115, and an eyeball image sensor 116, and has a function of detecting the state of the eyes of a user looking into the viewfinder. The eye state here includes the gaze direction, how wide the eyes are opened, how much the eyes are narrowed, whether the eyes are open or closed, and so on.
The illumination light sources 112a and 112b illuminate the user's eyeball 113 so that the user's gaze direction can be detected from the relationship between the pupil and the corneal reflections of the light sources. The illumination light sources 112a and 112b are infrared light-emitting diodes arranged around the eyepiece 110. The image of the illuminated eyeball and the corneal reflections of the illumination light sources 112a and 112b pass through the eyepiece 110, are reflected by the light splitter 114, and are focused by the light-receiving lens 115 onto the eyeball image sensor 116, in which photoelectric elements such as those of a CCD are arranged two-dimensionally. The light-receiving lens 115 places the pupil of the user's eyeball 113 and the eyeball image sensor 116 in a conjugate imaging relationship. From features such as the extent of the white of the eye in the eyeball image formed on the eyeball image sensor 116, it is possible to detect how wide the eyes are opened, how much they are narrowed, whether they are open or closed, and so on. The gaze direction can be detected from the positional relationship between the eyeball image formed on the eyeball image sensor 116 and the corneal reflections of the illumination light sources 112a and 112b.

<Hardware configuration of the imaging device>
FIG. 2 is a diagram showing an example of the hardware configuration of the imaging device 100 of FIG. 1. Components identical to those in FIG. 1 are given the same reference numbers.

As shown in FIG. 2, the image sensor 102, the memory 104, the display element drive circuit 109, a gaze detection circuit 201, a photometry circuit 202, the signal input circuit 203, an illumination light source drive circuit 205, and the GPU 105 are connected to the CPU 103, which controls these devices.
The image sensor 102 outputs the electrical signal to the CPU 103 as a captured image.
The memory 104 records the captured images output from the image sensor 102, the image information output from the eyeball image sensor 116, data required for learning, and the like.
The display element drive circuit 109 executes processing for displaying on the display element 108 under the control of the CPU 103. In this embodiment, live view display is performed by sequentially displaying the captured images output from the image sensor 102 on the display element 108. Hereinafter, an image displayed in live view is referred to as a through image.

The gaze detection circuit 201 A/D-converts the output produced when the eyeball image is formed on the eyeball image sensor 116 and outputs this image information to the CPU 103. The CPU 103 acquires the image information from the gaze detection circuit 201, extracts from it, according to a predetermined algorithm, the feature points of the eyeball image needed to detect the eye state, and detects the position of each extracted feature point. The detection results are output to the GPU 105 or the FPGA 106.
The photometry circuit 202 calculates luminance information of the field based on the electrical signal obtained from the image sensor 102, which also serves as a photometry sensor, and outputs it to the CPU 103.
The signal input circuit 203 is connected to SW1, a switch that turns ON at the first stroke (first operation) of the release button 111 and instructs the start of shooting preparations such as photometry and distance measurement, and it notifies the CPU 103 that SW1 has been turned ON. The signal input circuit 203 is also connected to SW2, a switch that turns ON at the second stroke (second operation) of the release button 111 and instructs the start of the release operation, and it notifies the CPU 103 that SW2 has been turned ON. The release operation here means the operation of storing in the memory 104, as still image data, the captured image output from the image sensor 102 at the timing when SW2 is turned ON.
The illumination light source drive circuit 205, under the control of the CPU 103, drives the illumination light sources 112a and 112b used when detecting the user's gaze direction.

The CPU 103 analyzes eye movement based on the information output from the gaze detection circuit 201. Eye movement here means changes in how wide the eyes are opened, changes in how much the eyes are narrowed, movement of the gaze, the blink cycle, and so on. In the learning phase, the CPU 103 generates learning data based on the eye movement information obtained by the analysis and the output information from the signal input circuit 203, and trains the learning model using the generated learning data. In the estimation phase, the CPU 103 estimates the timing at which to start the release operation by inputting, into the learning model, the eye movement information obtained by analyzing the information output from the gaze detection circuit 201. In this embodiment, the CPU 103 executes this processing in cooperation with the GPU 105 and the FPGA 106.

<Functional configuration in the learning phase>
FIG. 3 is a diagram showing an example of the functional configuration of the imaging device 100 according to this embodiment in the learning phase. By executing the programs stored in the memory 104, the CPU 103 controls each device connected to it and functions as the state detection unit 301, data storage unit 302, analysis unit 303, shooting request detection unit 304, and learning unit 305 shown in FIG. 3. The imaging device 100 can enable or disable the learning-phase functions according to a user setting.

The state detection unit 301 detects the state of the user's eyes based on the information output from the gaze detection circuit 201. The state detection unit 301 (CPU 103) is an example of detection means.
The data storage unit 302 accumulates, in the memory 104, information on the eye state detected by the state detection unit 301.

The analysis unit 303 reads the eye state information for a predetermined time span stored in the memory 104 and analyzes eye movement based on that information. The analysis unit 303 (CPU 103) is an example of analysis means. The eye movement analysis may instead be executed by the GPU 105 and the FPGA 106. In this embodiment, the analysis unit 303 reads the eye state information for the predetermined time span from the memory 104 and determines, for each point in time, items such as whether the eyes are wide open, narrowed, moving their gaze, or closed. Having made these determinations across the whole time span, it analyzes changes in how wide the eyes are opened, changes in how much they are narrowed, movement of the gaze, the blink cycle, and so on. The analysis method is not limited to the above. For example, the eye movement may be converted into patterns or classified into predetermined patterns. The analysis unit 303 provides the analysis result to the learning unit 305 as eye movement information.

When the signal input circuit 203 reports that SW1 or SW2 has been turned ON, the shooting request detection unit 304 detects the timing at which SW1 or SW2 was turned ON and provides that timing to the learning unit 305.
The learning unit 305 generates learning data by associating the eye movement information with the timings at which SW1 and SW2 were turned ON. The learning unit 305 generates multiple pieces of learning data and stores them in the memory 104. The learning unit 305 also trains the learning model using the multiple pieces of learning data stored in the memory 104. The learning unit 305 (CPU 103) is an example of learning means.

<Explanation of the learning data>
FIG. 4 is a diagram showing an example of the information used to generate learning data. The information shown in FIG. 4 consists of accumulated records 400 and is stored in the memory 104. Each record 400 contains a time 401, an elapsed time 402, a coordinate a 403, a coordinate b 404, an output a 405, an output b 406, SW1 ON 407, SW2 ON 408, and an analysis result 409.
The time 401 is the time at which the record 400 was stored. The elapsed time 402 is the time elapsed from the start of eye state detection to the time 401 of the same record 400. In this embodiment, eye state detection starts when the gaze detection sensor 107 detects that the user has looked into the viewfinder.
The coordinate a 403, coordinate b 404, output a 405, and output b 406 hold information on the eye state at the time 401 of the same record 400. The coordinate a 403 and coordinate b 404 represent the position coordinates on the eyeball image sensor 116 (CCD) of the reflected images from the illumination light sources 112a and 112b (reflected image coordinates 631a and 631b in FIG. 6). The output a 405 and output b 406 represent the CCD output intensity at the positions given by the coordinate a 403 and coordinate b 404. Details are described later with reference to FIG. 6. As the coordinate a 403, the pupil edge coordinate 641a or the iris edge coordinate 651a of FIG. 6 may be used together with the reflected image coordinate 631a. Likewise, as the coordinate b 404, the pupil edge coordinate 641b or the iris edge coordinate 651b of FIG. 6 may be used together with the reflected image coordinate 631b.

In the record stored at the time 401 when SW1 was turned ON, SW1 ON 407 holds flag information indicating that SW1 was turned ON. The same applies to SW2 ON 408. In the example of FIG. 4, SW1 was turned ON at time 401 "9:34:00" and SW2 was turned ON at time 401 "9:34:22".
The analysis result 409 holds the result obtained by analyzing the eye state information at the time 401 of the same record 400. The analysis result 409 is written here in descriptive form to make the analysis result easy to follow. The CPU 103 analyzes eye movement from the time-series transition of the eye state information. In the example shown in FIG. 4, the following information is obtained as the result of analyzing the eye movement:
・The gaze moved gradually from the center to the left, after which the release button 111 was operated and SW1 was turned ON.
・After SW1 was turned ON, the user blinked and the iris gradually became smaller, after which the release button 111 was operated and SW2 was turned ON.
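The record layout described for FIG. 4 could be modelled roughly as follows. This is only a sketch: the class name `Record`, the field names, the field types (for example, two-dimensional integer coordinates), and the helper `first_press` are assumptions made for illustration and do not appear in the source.

```python
from dataclasses import dataclass
from datetime import time
from typing import List, Optional, Tuple


@dataclass
class Record:
    """One row of FIG. 4 (record 400). Names and types are illustrative."""
    timestamp: time              # time 401
    elapsed_s: float             # elapsed time 402, seconds since detection began
    coord_a: Tuple[int, int]     # coordinate a 403 (reflected image 631a on the CCD)
    coord_b: Tuple[int, int]     # coordinate b 404 (reflected image 631b on the CCD)
    output_a: float              # output a 405, CCD intensity at coord_a
    output_b: float              # output b 406, CCD intensity at coord_b
    sw1_on: bool = False         # SW1 ON 407 flag
    sw2_on: bool = False         # SW2 ON 408 flag
    analysis: str = ""           # analysis result 409 (descriptive note)


def first_press(records: List[Record], switch: str) -> Optional[time]:
    """Return the timestamp of the first record whose SW1 or SW2 flag is set."""
    attr = {"SW1": "sw1_on", "SW2": "sw2_on"}[switch]
    for rec in records:
        if getattr(rec, attr):
            return rec.timestamp
    return None
```

With records flagged at 9:34:00 (SW1) and 9:34:22 (SW2), as in the FIG. 4 example, `first_press(records, "SW1")` returns the 9:34:00 timestamp and `first_press(records, "SW2")` returns 9:34:22.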

<Explanation of the learning method>
FIG. 5 is a conceptual diagram showing the input/output structure using the learning model of this embodiment.
The eye movement information 501 is the input data fed to the learning model 503. A neural network, for example, is used as the learning model 503.
The eye movement information 501 is movement information representing eye movement, for example information obtained by analyzing changes in how wide the eyes are opened, changes in how much they are narrowed, movement of the gaze, and the blink cycle. In this embodiment, it is obtained by analyzing eye state information for a predetermined time span (for example, the accumulated data of the coordinate a 403, coordinate b 404, output a 405, and output b 406 in FIG. 4). The eye movement information 501 includes time information. The eye state information for the predetermined time span may also be used as the eye movement information 501 without further processing.

The learning unit 305 associates the time information of the eye movement information 501 with the timing 502 at which the release button 111 was operated (the timing when SW1 was turned ON and the timing when SW2 was turned ON). The learning unit 305 then trains the learning model 503 using the eye movement information 501 as input data and, as correct-answer data, the portion of that eye movement information 501 from the start of eye state detection up to the timing 502 at which the release button 111 was operated. The learning model 503 outputs information 504 indicating whether the scene is one the user wants to shoot. The learning unit 305 trains the learning model 503 so that, when the input resembles the eye movement that tends to be detected at or just before the timing at which the release button 111 is operated, the output information 504 indicates that the scene is one the user wants to shoot.
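One way to picture the association between eye movement information and button timings is a sliding-window labelling step. The sketch below is a simplification chosen for illustration, not necessarily the scheme used in the embodiment: the function name `label_windows`, the window length, and the lead time `lead_s` are all assumptions.

```python
from typing import List, Sequence, Tuple

Sample = Tuple[float, Sequence[float]]  # (elapsed seconds, feature vector)


def label_windows(
    samples: List[Sample],
    press_times: List[float],
    window: int = 3,
    lead_s: float = 0.5,
) -> List[Tuple[List[Sequence[float]], int]]:
    """Slice samples into sliding windows and label each one.

    A window is labelled 1 ("scene the user wants to shoot") when its last
    sample falls at, or within lead_s seconds before, a button press, and 0
    otherwise. The values of window and lead_s are illustrative only.
    """
    examples = []
    for end in range(window, len(samples) + 1):
        chunk = samples[end - window:end]
        t_end = chunk[-1][0]
        positive = any(0.0 <= p - t_end <= lead_s for p in press_times)
        examples.append(([features for _, features in chunk], int(positive)))
    return examples
```

Each returned pair holds the feature vectors of one window and its label, which is the shape of data a supervised learner would consume.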

Specific machine learning algorithms include the nearest neighbor method, the naive Bayes method, decision trees, and support vector machines. Another option is deep learning, in which a neural network generates by itself the feature quantities and connection weighting coefficients used for learning. Any of these algorithms that is usable can be applied to this embodiment as appropriate.
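As an illustration of the simplest algorithm named above, the following is a minimal nearest neighbor (1-NN) classifier. It assumes that eye movement information has already been reduced to fixed-length numeric feature vectors with labels 1 (scene to shoot) and 0 (otherwise); that representation and the function name are assumptions, not details from the source.

```python
import math
from typing import List, Sequence, Tuple


def nearest_neighbor(
    train: List[Tuple[Sequence[float], int]],
    query: Sequence[float],
) -> int:
    """Return the label of the training vector closest to query (Euclidean 1-NN)."""
    if not train:
        raise ValueError("training set is empty")
    _, label = min(train, key=lambda item: math.dist(item[0], query))
    return label
```

A query vector is assigned the label of whichever stored example it most resembles, so no training step is needed beyond storing the labelled examples.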

The learning unit 305 may also include an error detection unit and an update unit. The error detection unit obtains the error between the teacher data and the output data that the output layer of the neural network produces in response to the input data fed to the input layer. The error detection unit may calculate this error using a loss function. Based on the error obtained by the error detection unit, the update unit updates the connection weighting coefficients between the nodes of the neural network, and related parameters, so that the error becomes smaller. The update unit performs this update using, for example, the error backpropagation method, a technique that adjusts the connection weighting coefficients between the nodes of the neural network so as to reduce the error.
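The division of labor between an error detection unit and an update unit can be sketched with a deliberately tiny model. The class below uses a single sigmoid unit in place of a full neural network, binary cross-entropy as the loss function, and plain gradient descent; all of these choices, and the names `TinyNet`, `loss`, and `update`, are assumptions for illustration. For a single unit, backpropagation reduces to the one-step gradient shown in `update`.

```python
import math
from typing import Sequence


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


class TinyNet:
    """A single sigmoid unit standing in for the neural network (illustrative)."""

    def __init__(self, n_inputs: int) -> None:
        self.w = [0.0] * n_inputs  # connection weighting coefficients
        self.b = 0.0               # bias

    def forward(self, x: Sequence[float]) -> float:
        return sigmoid(sum(wi * xi for wi, xi in zip(self.w, x)) + self.b)

    def loss(self, x: Sequence[float], target: int) -> float:
        """Error detection: binary cross-entropy between output and teacher data."""
        y = min(max(self.forward(x), 1e-12), 1.0 - 1e-12)
        return -(target * math.log(y) + (1 - target) * math.log(1.0 - y))

    def update(self, x: Sequence[float], target: int, lr: float = 0.5) -> None:
        """Update: one gradient step. With sigmoid + cross-entropy, dL/dz = y - target."""
        grad_z = self.forward(x) - target
        self.w = [wi - lr * grad_z * xi for wi, xi in zip(self.w, x)]
        self.b -= lr * grad_z
```

Calling `update` repeatedly over labelled examples drives the value returned by `loss` down, which is the behavior the update unit is described as achieving.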

<Method for analyzing eye movement>
Next, the method for analyzing eye movement is described in detail with reference to FIG. 6. The upper part of each of FIGS. 6(i) to 6(v) shows a schematic of the eyeball image projected onto the eyeball image sensor 116. In each schematic, regions A, B, and C are defined to correspond to the upper, central, and lower parts of the eyeball image. Graphs (a) to (c) in the lower part of each of FIGS. 6(i) to 6(v) show the CCD output intensity for regions A, B, and C of the eyeball image, respectively. The horizontal axis of each graph is the X axis of the CCD, and the vertical axis is the CCD output intensity at that X coordinate.
The pupil 610 represents the projected image of the pupil of the user's eyeball 113. The iris 620 represents the projected image of the iris of the user's eyeball 113. The reflected images 630a and 630b represent the reflections of the illumination light sources 112a and 112b, which irradiate the pupil of the user's eyeball 113. The upper eyelid 660 represents the position of the upper eyelid in the eyeball image. The lower eyelid 670 represents the position of the lower eyelid in the eyeball image. The white of the eye 680 represents the extent of the white of the eye in the eyeball image.

The reflected image coordinates 631a and 631b represent the CCD coordinate positions corresponding to the reflected images 630a and 630b. The pupil edge coordinates 641a and 641b represent the CCD coordinate positions corresponding to the pupil edges 640a and 640b, which are the left and right ends of the pupil 610. The iris edge coordinates 651a and 651b represent the CCD coordinate positions corresponding to the iris edges 650a and 650b, which are the left and right ends of the iris 620.

In this embodiment, the analysis unit 303 analyzes the user's eye movement based on the coordinate positions of the reflected images from the illumination light sources 112a and 112b, the pupil edges, and the iris edges in the eyeball image projected onto the eyeball image sensor 116, and on the CCD output intensity. As specific examples, the states of FIGS. 6(ii) to 6(v) are described below, taking the state of FIG. 6(i) as the reference state.

(ii) Movement of the gaze
Compared with the reference state shown in FIG. 6(i), in FIG. 6(ii) the pupil edge coordinates 641a and 641b and the reflected image coordinates 631a and 631b in region B have shifted to the left as viewed in the figure. This shows that the user has moved the gaze to the right of center. Conversely, although not illustrated, when the pupil edge coordinates 641a and 641b and the reflected image coordinates 631a and 631b have shifted to the right as viewed in the figure, the user has moved the gaze to the left of center. By using the accumulated position data for the pupil edge coordinates 641a and 641b and the reflected image coordinates 631a and 631b, the analysis unit 303 analyzes how the user's gaze is moving.
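The left/right reasoning in (ii) can be expressed as a comparison against the reference state. The following sketch assumes that a smaller CCD X coordinate corresponds to the left side of the figure and introduces a dead zone `tolerance` in pixels; both, along with the function name, are assumptions rather than details from the source.

```python
from statistics import mean
from typing import Sequence


def gaze_direction(
    reference_x: Sequence[float],
    current_x: Sequence[float],
    tolerance: float = 2.0,
) -> str:
    """Classify horizontal gaze from region-B X coordinates on the CCD.

    Each sequence holds the X coordinates of pupil edges 641a, 641b and
    reflected images 631a, 631b. tolerance is an illustrative dead zone.
    """
    shift = mean(current_x) - mean(reference_x)
    if shift < -tolerance:
        return "right"   # features shifted left in the image: gaze moved right
    if shift > tolerance:
        return "left"    # features shifted right in the image: gaze moved left
    return "center"
```

Applying this function to successive records and collecting the results gives a coarse trace of how the gaze moves over time.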

(iii) Blinking
Compared with the reference state shown in FIG. 6(i), in FIG. 6(iii) the CCD output intensity at the reflected image coordinates 631a and 631b and elsewhere in region B has dropped. Because the user is looking into the viewfinder at this time, it can be concluded that the pupil 610 is shaded by the upper eyelid 660 and that the user's eye is closed. If the CCD output intensity is then detected to return to its original level after the drop, it can be concluded that the user has blinked. By using the accumulated CCD output intensity data at the reflected image coordinates 631a and 631b and elsewhere, the analysis unit 303 analyzes whether the user is blinking, and whether the blink is quick or slow.
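The blink logic in (iii), a drop in output intensity followed by a return to the original level, can be sketched as a scan over a time series. The threshold, the fixed sampling period, and the function name `find_blinks` are assumptions for illustration.

```python
from typing import List, Optional, Tuple


def find_blinks(
    intensity: List[float],
    sample_period_s: float,
    threshold: float,
) -> List[Tuple[int, float]]:
    """Return (start_index, duration_s) for each dip below threshold that recovers.

    A dip that has not recovered by the end of the series is treated as a
    closed eye rather than a completed blink, so it is not reported.
    """
    blinks: List[Tuple[int, float]] = []
    start: Optional[int] = None
    for i, value in enumerate(intensity):
        if value < threshold and start is None:
            start = i                      # intensity dropped: eye closing
        elif value >= threshold and start is not None:
            blinks.append((start, (i - start) * sample_period_s))
            start = None                   # intensity recovered: blink complete
    return blinks
```

The reported durations distinguish quick blinks from slow ones, and the spacing between start indices gives the blink cycle.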

（ｉｖ）目を見開いている状態
図６（ｉ）に示す基準状態に対して、図６（ｉｖ）では領域Ａと領域Ｃの虹彩端座標６５１ａから起点となるＸ座標０に向かう範囲と、虹彩端座標６５１ｂから座標が大きくなる範囲で、同レベルの出力が増加している状態である。この時、領域Ａと領域Ｃにおいての白目６８０の範囲が増加して、ユーザが目を見開いていることがわかる。このような虹彩端座標６５１ａ、６５１ｂの外側におけるＣＣＤの出力強度の蓄積データを用いることで、解析部３０３は、ユーザの目の見開き具合がどのように変化しているかを解析する。
(iv) Eyes Wide Open
Compared to the reference state shown in FIG. 6(i), FIG. 6(iv) shows a state in which, in regions A and C, output at the same level has increased both in the range from the iris edge coordinate 651a toward the origin at X coordinate 0 and in the range of coordinates larger than the iris edge coordinate 651b. This indicates that the extent of the white of the eye 680 in regions A and C has increased and that the user's eyes are wide open. By using accumulated data on the CCD output intensity outside the iris edge coordinates 651a and 651b, the analysis unit 303 analyzes how the degree to which the user's eyes are opened wide is changing.

（ｖ）目を細めている状態
図６（ｉ）に示す基準状態に対して、図６（ｖ）では領域Ａと領域Ｃの虹彩端座標６５１ａから起点となるＸ座標０に向かう範囲と、虹彩端座標６５１ｂから座標が大きくなる範囲で、同レベルの出力が減少している状態である。この時、領域Ａと領域Ｃにおいての白目６８０の範囲が減少し、ユーザが目を細めていることがわかる。このような虹彩端座標６５１ａ、６５１ｂの外側におけるＣＣＤの出力強度の蓄積データを用いることで、解析部３０３は、ユーザの目の細め具合がどのように変化しているかを解析する。
(v) Eyes Narrowed
Compared to the reference state shown in FIG. 6(i), FIG. 6(v) shows a state in which, in regions A and C, output at the same level has decreased both in the range from the iris edge coordinate 651a toward the origin at X coordinate 0 and in the range of coordinates larger than the iris edge coordinate 651b. This indicates that the extent of the white of the eye 680 in regions A and C has decreased and that the user is narrowing their eyes. By using accumulated data on the CCD output intensity outside the iris edge coordinates 651a and 651b, the analysis unit 303 analyzes how the degree to which the user's eyes are narrowed is changing.

以上のように、反射像６３０ａ、６３０ｂ、瞳孔端６４０ａ、６４０ｂ、虹彩端６５０ａ、６５０ｂの出力情報、反射像座標６３１ａ、６３１ｂ、瞳孔端座標６４１ａ、６４１ｂ、虹彩端座標６５１ａ、６５１ｂ等を用いて、ユーザの目の状態が検出される。また、これらの蓄積データを用いることで、ユーザの目の状態変化が解析される。 As described above, the state of the user's eyes is detected using output information for reflected images 630a, 630b, pupil edges 640a, 640b, iris edges 650a, 650b, reflected image coordinates 631a, 631b, pupil edge coordinates 641a, 641b, iris edge coordinates 651a, 651b, etc. Furthermore, changes in the state of the user's eyes are analyzed using this accumulated data.

＜学習フェーズで実行される処理＞
図７は、本実施形態に係る撮像装置１００によって、学習フェーズで実行される処理を示すフローチャート図である。本フローチャートの示す処理は、ＣＰＵ１０３が、メモリ１０４に格納されたプログラムを実行することにより実現される。本フローチャートに示す処理は、撮像装置１００の電源がＯＮされると開始される。
ステップＳ７０１において、ＣＰＵ１０３は、視線検知用センサー１０７がユーザがファインダを覗いたことを検知したか否かを判定する。ＣＰＵ１０３が視線検知用センサー１０７がユーザがファインダを覗いたことを検知したと判定するまで、ステップＳ７０１の処理を繰り返す。ＣＰＵ１０３が視線検知用センサー１０７がユーザがファインダを覗いたことを検知したと判定した場合、処理はステップＳ７０２へ進む。
ステップＳ７０２において、ＣＰＵ１０３は、撮像素子１０２からスルー画像の取得を開始し、取得したスルー画像を表示素子１０８に表示する。これにより、ユーザはファインダ内の表示素子１０８に表示されたスルー画像を見ることで被写体の視認が可能になる。この時ＣＰＵ１０３は、眼球用撮像素子１１６からデータの取得を開始する。これにより目の状態の検出が開始する。
<Processing performed in the learning phase>
FIG. 7 is a flowchart showing the processing executed in the learning phase by the imaging device 100 according to this embodiment. The processing shown in this flowchart is realized by the CPU 103 executing a program stored in the memory 104. The processing shown in this flowchart starts when the imaging device 100 is powered on.
In step S701, the CPU 103 determines whether the line-of-sight detection sensor 107 has detected that the user has looked into the viewfinder. The process of step S701 is repeated until the CPU 103 determines that the line-of-sight detection sensor 107 has detected that the user has looked into the viewfinder. If the CPU 103 determines that the line-of-sight detection sensor 107 has detected that the user has looked into the viewfinder, the process proceeds to step S702.
In step S702, the CPU 103 starts acquiring a through image from the image sensor 102 and displays the acquired through image on the display device 108. This allows the user to visually recognize the subject by looking at the through image displayed on the display device 108 in the viewfinder. At this time, the CPU 103 starts acquiring data from the eye image sensor 116. This starts detection of the state of the eye.

次にステップＳ７０３において、ＣＰＵ１０３は、眼球用撮像素子１１６から取得したデータからユーザの目の状態に関する情報を検出し、検出した情報をメモリ１０４に記憶する。本実施形態では、図４の１行分のレコードが蓄積される。
次にステップＳ７０４において、ＣＰＵ１０３が、ユーザの操作によりレリーズボタン１１１のＳＷ１がＯＮされたか否かを判定する。ＣＰＵ１０３がＳＷ１がＯＮされたと判定した場合、処理はＳ７０５へ進み、ＣＰＵ１０３がＳＷ１がＯＮされていないと判定した場合、処理はＳ７０３へ戻り、目の状態に関する情報の蓄積を継続する。
ステップＳ７０５において、ＣＰＵ１０３が、レリーズボタン１１１のＳＷ１がＯＮされたタイミングをメモリ１０４に記憶する。例えばＣＰＵ１０３は、信号入力回路２０３からＳＷ１がＯＮされた旨の通知が入力されると、現在時刻、ステップＳ７０１にて視線検知用センサー１０７が検知してからの経過時間等の時間情報をメモリ１０４に記憶する。 Next, in step S703, the CPU 103 detects information about the state of the user's eyes from the data acquired from the eye image sensor 116, and stores the detected information in the memory 104. In this embodiment, one row of records in FIG. 4 is stored.
Next, in step S704, the CPU 103 determines whether or not the user has turned on SW1 of the release button 111. If the CPU 103 determines that SW1 has been turned on, the process proceeds to step S705, and if the CPU 103 determines that SW1 has not been turned on, the process returns to step S703, where the accumulation of information regarding the eye condition continues.
In step S705, the CPU 103 stores the timing at which SW1 of the release button 111 was turned ON in the memory 104. For example, when a notification that SW1 has been turned ON is input from the signal input circuit 203, the CPU 103 stores time information such as the current time and the time elapsed since the line of sight detection sensor 107 detected the line of sight in step S701 in the memory 104.

次にステップＳ７０６において、ＣＰＵ１０３は、ステップＳ７０２にて開始した眼球用撮像素子１１６からのデータの取得を停止する。つまりＣＰＵ１０３は、ＳＷ１がＯＮされたことに起因してデータの取得を停止する。例えば、メモリ１０４に最後に蓄積されたレコードに対してフラグ情報等を付与して、ＳＷ１がＯＮされた時のレコードであることがわかるようにしておく。
次にステップＳ７０７において、ＣＰＵ１０３は、ユーザの操作によりレリーズボタン１１１のＳＷ２がＯＮされたか否かを判定する。ＣＰＵ１０３がＳＷ２がＯＮされていないと判定した場合、処理はＳ７０９へ進み、ＣＰＵ１０３がＳＷ２がＯＮされたと判定した場合、処理はステップＳ７０８へ進む。 Next, in step S706, the CPU 103 stops the acquisition of data from the eye image sensor 116 that began in step S702. That is, the CPU 103 stops the acquisition of data because SW1 has been turned ON. For example, flag information or the like is added to the last record stored in the memory 104 so that it can be identified as the record when SW1 was turned ON.
Next, in step S707, the CPU 103 determines whether or not the user has turned on SW2 of the release button 111. If the CPU 103 determines that SW2 is not turned on, the process proceeds to step S709, and if the CPU 103 determines that SW2 is turned on, the process proceeds to step S708.

続いてステップＳ７０９～ステップＳ７１２において一連の流れで処理を説明する。ＳＷ１がＯＮされてからＳＷ２がＯＮされるまでの経過時間が長い場合、被写体が動いた等の大きな変化があったことが想定される。そのためＳＷ１までに蓄積された情報は、実際にＳＷ２がＯＮされた撮影したシーンに連携していない可能性が高い。そこで、ＳＷ１がＯＮされてから所定時間内にＳＷ２がＯＮされない場合には、ステップＳ７０９において、ＣＰＵ１０３は、ステップＳ７０６にて停止したデータの取得を再開する。
次にＳ７１０において、ＣＰＵ１０３は、Ｓ７０３と同様に、眼球用撮像素子１１６から取得したデータからユーザの目の状態に関する情報を検出し、検出した情報をメモリ１０４に記憶する。 Next, the processing flow from step S709 to step S712 will be explained. If a long time has passed since SW1 was turned ON until SW2 was turned ON, it is likely that a major change, such as a movement of the subject, has occurred. Therefore, it is highly likely that the information accumulated up to SW1 is not related to the scene captured when SW2 was actually turned ON. Therefore, if SW2 is not turned ON within a predetermined time after SW1 was turned ON, in step S709, the CPU 103 resumes the acquisition of data stopped in step S706.
Next, in S710, the CPU 103 detects information about the state of the user's eyes from the data acquired from the eye image sensor 116, as in S703, and stores the detected information in the memory 104.

次にステップＳ７１１において、ＣＰＵ１０３が、ユーザの操作によりレリーズボタン１１１のＳＷ２がＯＮされたか否かを判定する。ＣＰＵ１０３がＳＷ２がＯＮされたと判定した場合、処理はＳ７１２へ進み、ＣＰＵ１０３がＳＷ１がＯＮされていないと判定した場合、処理はＳ７１０へ戻り、目の状態に関する情報の蓄積を継続する。
Ｓ７１２において、ＣＰＵ１０３は、ステップＳ７０９にて再開したデータの取得を停止する。
次にステップＳ７０８において、ＣＰＵ１０３が、レリーズボタン１１１のＳＷ２がＯＮされたタイミングをメモリ１０４に記憶する。ステップＳ７０８の処理内容は、ステップＳ７０５と重複するため説明を割愛する。 Next, in step S711, the CPU 103 determines whether or not the user has turned on SW2 of the release button 111. If the CPU 103 determines that SW2 has been turned on, the process proceeds to S712, and if the CPU 103 determines that SW2 has not been turned on, the process returns to S710, where the accumulation of information regarding the eye state continues.
In S712, the CPU 103 stops the data acquisition that was resumed in step S709.
Next, in step S708, the CPU 103 stores the timing at which SW2 of the release button 111 is turned ON in the memory 104. The processing content of step S708 overlaps with step S705, and therefore a description thereof will be omitted.

次にステップＳ７１３において、ＣＰＵ１０３は、ＧＰＵ１０５とＦＰＧＡ１０６を制御して、目の状態に関する情報をメモリ１０４から読み出して目の動きを解析する。
次にステップＳ７１４において、ＣＰＵ１０３は、ＧＰＵ１０５とＦＰＧＡ１０６を制御して、Ｓ７１３での解析により得られた目の動き情報と、ＳＷ１、ＳＷ２が押されたタイミングの関連付けを行う。この時ＣＰＵ１０３は、Ｓ７０５及びＳ７０８にて記憶された現在時刻、視線検知用センサー１０７が検知してからの経過時間等の時間情報を関連付ける。これにより、視線検知用センサー１０７が検知してからＳＷ１がＯＮされるまでの目の動き情報や、ＳＷ２がＯＮされるまでの目の動き情報を抽出することが可能になる。その後一連のフローチャートの処理が終了する。 Next, in step S713, the CPU 103 controls the GPU 105 and FPGA 106 to read information relating to the eye state from the memory 104 and analyze the eye movement.
Next, in step S714, the CPU 103 controls the GPU 105 and FPGA 106 to associate the eye movement information obtained by the analysis in S713 with the timing at which SW1 and SW2 were turned ON. At this time, the CPU 103 associates the time information stored in S705 and S708, such as the current time and the elapsed time since detection by the line-of-sight detection sensor 107. This makes it possible to extract the eye movement information from detection by the line-of-sight detection sensor 107 until SW1 is turned ON, and the eye movement information until SW2 is turned ON. After that, the series of processes in the flowchart ends.
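The accumulation logic of steps S703 to S712 can be sketched offline, over already timestamped samples, as follows. This is a simplified illustration and not the device firmware: the function name `record_learning_session`, the `(elapsed_s, eye_state)` tuple layout, the parameter `wait_s` standing in for the predetermined time, and the returned dictionary keys are all assumptions.

```python
def record_learning_session(samples, sw1_at, sw2_at, wait_s):
    """Select which eye-state samples are accumulated in one session.

    samples: list of (elapsed_s, eye_state), with elapsed_s measured from
             the moment the user is detected looking into the viewfinder.
    sw1_at, sw2_at: elapsed times at which SW1 and SW2 were turned ON.
    wait_s: assumed stand-in for the predetermined time after SW1.
    """
    # S703/S704: accumulate until SW1 is turned ON.
    before_sw1 = [(t, s) for t, s in samples if t <= sw1_at]

    # S707: if SW2 does not follow within wait_s, acquisition resumes
    # (S709) and continues until SW2 is turned ON (S710 to S712).
    resumed = []
    if sw2_at - sw1_at >= wait_s:
        resumed = [(t, s) for t, s in samples
                   if sw1_at + wait_s <= t <= sw2_at]

    return {
        "records": before_sw1 + resumed,
        "sw1_record_index": len(before_sw1) - 1,  # record flagged in S706
        "sw1_at": sw1_at,                          # timing stored in S705
        "sw2_at": sw2_at,                          # timing stored in S708
    }
```

Whether samples between SW1 and the end of the waiting period are acquired is not stated in the text; this sketch assumes they are not, since acquisition is described as stopped in S706 and resumed only in S709.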

以上のようにして得られた目の動き情報が、学習モデル５０３に入力する学習用データとして用いられる。本実施形態において、ＣＰＵ１０３は、ＳＷ１がＯＮされてからＳＷ２がＯＮされるまでの時間が所定時間未満である場合、視線検知用センサー１０７が検知してからＳＷ１がＯＮされるまでの目の動き情報を正解データとして用いる。またＣＰＵ１０３は、ＳＷ１がＯＮされてからＳＷ２がＯＮされるまでの時間が所定時間以上である場合、視線検知用センサー１０７が検知してからＳＷ２がＯＮされるまでの目の動き情報を正解データとして用いる。 The eye movement information obtained in this manner is used as learning data to be input into the learning model 503. In this embodiment, if the time between SW1 being turned ON and SW2 being turned ON is less than a predetermined time, the CPU 103 uses the eye movement information from detection by the line-of-sight detection sensor 107 until SW1 is turned ON as correct data. Furthermore, if the time between SW1 being turned ON and SW2 being turned ON is equal to or greater than a predetermined time, the CPU 103 uses the eye movement information from detection by the line-of-sight detection sensor 107 until SW2 is turned ON as correct data.
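The rule for choosing which span of eye movement information becomes correct data can be written compactly. The sketch below assumes timestamped records and uses hypothetical names (`correct_data_window_end`, `select_correct_data`, `threshold_s`); only the comparison itself, less than the predetermined time versus equal to or greater than it, follows the text.

```python
def correct_data_window_end(sw1_at, sw2_at, threshold_s):
    """Return the elapsed time that closes the correct-data window.

    If SW2 follows SW1 in less than threshold_s, the window ends at SW1;
    otherwise it ends at SW2.
    """
    if sw2_at - sw1_at < threshold_s:
        return sw1_at
    return sw2_at


def select_correct_data(records, sw1_at, sw2_at, threshold_s):
    """Keep the (elapsed_s, movement) records that fall inside the window."""
    end = correct_data_window_end(sw1_at, sw2_at, threshold_s)
    return [record for record in records if record[0] <= end]
```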

なおＣＰＵ１０３は、ズーム率が変化している状態や、撮像装置１００が動いている状態等、撮影の準備段階と想定される所定の状態で検出された目の動き情報を、正解データの対象から除外してもよい。
またＣＰＵ１０３は、ＳＷ２がＯＮされる時間間隔が短い状態で検出された目の動き情報を、連写シーン用の学習用データとして用いてもよい。連写シーン用の学習用データを用いて学習モデルの学習を行う場合には、連写で撮影したいシーンであるかを示す情報が出力されるように学習モデルの学習を行う。 In addition, the CPU 103 may exclude eye movement information detected in a specified state that is assumed to be the preparation stage for shooting, such as a state in which the zoom ratio is changing or a state in which the imaging device 100 is moving, from the target of correct answer data.
Furthermore, the CPU 103 may use eye movement information detected when the time interval at which SW2 is turned on is short as learning data for continuous shooting scenes. When training a learning model using learning data for continuous shooting scenes, the learning model is trained so as to output information indicating whether the scene is one that should be captured by continuous shooting.
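The two optional rules above, excluding preparation-stage data and tagging continuous shooting, could look like the following sketch. The dictionary keys `zoom_changing` and `camera_moving`, the function names, and the `max_gap_s` parameter are hypothetical; the text does not say how these states are represented or how short the SW2 interval must be.

```python
def usable_as_correct_data(sample):
    """Reject samples taken while the zoom ratio changes or the camera moves."""
    return not (sample["zoom_changing"] or sample["camera_moving"])


def is_continuous_shooting(sw2_times, max_gap_s):
    """True when every gap between consecutive SW2 presses is short.

    A single press has no gap to measure, so it is not treated as
    continuous shooting.
    """
    gaps = [later - earlier for earlier, later in zip(sw2_times, sw2_times[1:])]
    return bool(gaps) and max(gaps) <= max_gap_s
```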

図７に示すフローチャートの処理が繰り返し実行されることにより、メモリ１０４には学習用データが複数記憶される。その後ＣＰＵ１０３は、ＧＰＵ１０５とＦＰＧＡ１０６を制御して、複数の学習用データを用いて学習モデルの学習を行う。学習された学習モデルはメモリ１０４に記憶され、次に説明する推定フェーズで用いられる。 By repeatedly executing the processing of the flowchart shown in FIG. 7, multiple pieces of learning data are stored in the memory 104. The CPU 103 then controls the GPU 105 and FPGA 106 to train the learning model using the multiple pieces of learning data. The trained learning model is stored in the memory 104 and is used in the estimation phase described next.

＜推定フェーズにおける機能構成＞
図８は、本実施形態に係る撮像装置１００の推定フェーズにおける機能構成例を示す図である。ＣＰＵ１０３は、メモリ１０４に記憶されるプログラムを実行することにより、ＣＰＵ１０３に接続される各デバイスを制御して、図８に示す、状態検出部３０１、データ記憶部３０２、解析部３０３、推定部８０１、及び撮影制御部８０２として機能する。
なお本実施形態において、撮像装置１００は、ユーザの設定に応じて、推定フェーズにおける機能を起動させるか停止させるかを切り替えることが可能である。推定フェーズにおける機能が起動している状態では、レリーズボタン１１１の操作により、または撮影したシーンであると推定されたタイミングで、レリーズ動作が実行される。推定フェーズにおける機能が停止している状態では、レリーズボタン１１１が操作された場合に限り、レリーズ動作が実行される。
<Functional configuration in the estimation phase>
FIG. 8 is a diagram showing an example of the functional configuration in the estimation phase of the imaging device 100 according to this embodiment. The CPU 103 executes a program stored in the memory 104 to control each device connected to the CPU 103, and functions as the state detection unit 301, data storage unit 302, analysis unit 303, estimation unit 801, and shooting control unit 802 shown in FIG. 8.
In this embodiment, the imaging device 100 can switch between activating and deactivating the estimation-phase function according to a user setting. While the estimation-phase function is active, a release operation is performed either when the release button 111 is operated or at the timing when the scene is estimated to be one the user wants to capture. While the estimation-phase function is inactive, a release operation is performed only when the release button 111 is operated.

状態検出部３０１、データ記憶部３０２、及び解析部３０３の機能は、学習フェーズで説明した機能と同様であるため説明を割愛する。
推定部８０１は、解析部３０３で解析された目の動き情報を推定用データとし、当該推定用データをメモリ１０４に記憶される学習モデルに入力することにより現在撮影されているシーンがユーザの撮影したいシーンであるかを示す出力データを得る。そして推定部８０１は、得られた出力データに基づいて、現在撮影されているシーンがユーザの撮影したいシーンであるかを推定する。出力データが現在撮影されているシーンがユーザの撮影したいシーンであることを示す場合、現在撮影されているシーンがユーザの撮影したいシーンであると推定する。出力データが、現在撮影されているシーンがユーザの撮影したいシーンであることを示さない場合、現在撮影されているシーンがユーザの撮影したいシーンでないと推定する。推定部８０１は、撮影したいシーンであると推定した場合、撮影制御部８０２にレリーズ動作の開始の指示を出力する。つまり推定部８０１は、レリーズ動作を実行するタイミングを推定する。推定部８０１（ＣＰＵ１０３）は、推定手段の一例である。
撮影制御部８０２は、推定部８０１からレリーズ動作の開始の指示が入力された場合に、各デバイスを制御して、レリーズ動作を実行する。撮影制御部８０２（ＣＰＵ１０３）は、制御手段の一例である。 The functions of the state detection unit 301, data storage unit 302, and analysis unit 303 are the same as those described in the learning phase, and therefore a description thereof will be omitted.
The estimation unit 801 uses the eye movement information analyzed by the analysis unit 303 as estimation data and inputs it into the learning model stored in the memory 104, thereby obtaining output data indicating whether the scene currently being captured is one the user wants to capture. Based on the obtained output data, the estimation unit 801 then estimates whether the current scene is one the user wants to capture: if the output data indicates that it is, the estimation unit 801 estimates that the current scene is one the user wants to capture; if the output data does not indicate this, the estimation unit 801 estimates that it is not. When the estimation unit 801 estimates that the scene is one the user wants to capture, it outputs an instruction to start a release operation to the shooting control unit 802. In other words, the estimation unit 801 estimates the timing at which to execute the release operation. The estimation unit 801 (CPU 103) is an example of an estimation means.
The shooting control unit 802 controls each device to execute the release operation when an instruction to start the release operation is input from the estimation unit 801. The shooting control unit 802 (CPU 103) is an example of a control means.

＜推定フェーズで実行される処理＞
図９は、本実施形態に係る撮像装置１００によって、推定フェーズで実行される処理を示すフローチャート図である。本フローチャートの示す処理は、ＣＰＵ１０３が、メモリ１０４に格納されたプログラムを実行することにより実現される。本フローチャートに示す処理は、撮像装置１００の電源がＯＮされると開始される。
<Processing performed in the estimation phase>
FIG. 9 is a flowchart showing the processing executed in the estimation phase by the imaging device 100 according to this embodiment. The processing shown in this flowchart is realized by the CPU 103 executing a program stored in the memory 104. The processing shown in this flowchart starts when the imaging device 100 is powered on.

ステップＳ９０１～ステップＳ９０４の処理は、図７で説明した学習フェーズのステップＳ７０１～ステップＳ７０３、ステップＳ７１３の処理と同様であるため、ここでは説明を割愛する。
ステップＳ９０５において、ＣＰＵ１０３は、ＧＰＵ１０５とＦＰＧＡ１０６を制御して、メモリ１０４に記憶される学習モデルを読み出し、Ｓ９０４での解析により得られた目の動き情報を、当該読み出した学習モデルに入力する。
次にステップＳ９０６において、ＣＰＵ１０３は、ＧＰＵ１０５とＦＰＧＡ１０６を制御して、学習モデルから出力データとして、撮影したいシーンであるかを示す情報を取得する。 The processing in steps S901 to S904 is the same as the processing in steps S701 to S703 and step S713 in the learning phase described with reference to FIG. 7, and therefore a description thereof will be omitted here.
In step S905, the CPU 103 controls the GPU 105 and FPGA 106 to read out the learning model stored in the memory 104, and inputs the eye movement information obtained by the analysis in S904 into the read out learning model.
Next, in step S906, the CPU 103 controls the GPU 105 and FPGA 106 to obtain, as output data from the learning model, information indicating whether the scene is one that the user wants to capture.

次にステップＳ９０７において、ＣＰＵ１０３は、ＧＰＵ１０５とＦＰＧＡ１０６を制御して、Ｓ９０６で取得された撮影したいシーンであるかを示す情報に基づいて、現在撮影されているシーンがユーザの撮影したいシーンであるか否かを推定する。ＣＰＵ１０３が撮影したいシーンであると推定した場合には、処理はＳ９０８へ進む。ＣＰＵ１０３が撮影したいシーンではないと推定した場合には、処理はＳ９０１へ進む。 Next, in step S907, the CPU 103 controls the GPU 105 and FPGA 106 to estimate whether the scene currently being captured is one that the user wants to capture, based on the information indicating whether the scene is one that the user wants to capture obtained in S906. If the CPU 103 estimates that the scene is one that the user wants to capture, processing proceeds to S908. If the CPU 103 estimates that the scene is not one that the user wants to capture, processing proceeds to S901.

ステップＳ９０８において、ＣＰＵ１０３は、各デバイスを制御して、レリーズ動作を実行する。この時ＣＰＵ１０３は、レリーズ動作に合わせて報知音出力部（不図示）からシャッター音を出力する。この場合にレリーズボタン１１１を操作しないのにシャッター音が出力されるため、ユーザに違和感が生じるおそれがある。そこでＣＰＵ１０３は、Ｓ９０７での推定に起因して出力されるシャッター音を、レリーズボタン１１１の操作に起因して出力される通常のシャッター音と異ならせる。例えば、通常のシャッター音よりも高い音を出力する。その後一連のフローチャートの処理が終了する。 In step S908, the CPU 103 controls each device to perform a release operation. At this time, the CPU 103 outputs a shutter sound from a notification sound output unit (not shown) in time with the release operation. Because the shutter sound is then output even though the release button 111 has not been operated, the user may find it disconcerting. The CPU 103 therefore makes the shutter sound output as a result of the estimation in S907 different from the normal shutter sound output as a result of operating the release button 111; for example, it outputs a sound higher in pitch than the normal shutter sound. The series of processes in the flowchart then ends.
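Steps S903 to S908 amount to a loop that feeds analyzed eye movement information to the learning model until the output indicates a scene the user wants to capture. The sketch below is illustrative only: `run_estimation`, the callable `model` returning a score, the callable `release`, and the 0.5 threshold are assumptions, since the text does not specify the form of the model's output data.

```python
def run_estimation(movement_stream, model, release, threshold=0.5):
    """Trigger a release the first time the model indicates a wanted scene.

    movement_stream: iterable of analyzed eye-movement feature vectors
                     (corresponding to S903 and S904).
    model:   callable returning a score for one feature vector
             (corresponding to S905 and S906; a score is an assumption).
    release: callable that performs the release operation (S908).
    Returns True if a release was triggered, otherwise False.
    """
    for features in movement_stream:
        score = model(features)
        if score >= threshold:   # S907: estimated as a wanted scene
            release()
            return True
        # Otherwise continue detecting and analyzing, as in the return
        # to S901 described in the text.
    return False
```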

ＣＰＵ１０３は、Ｓ９０８でのレリーズ動作により得られた静止画データを撮像装置１００の背面モニタ（不図示）に表示してユーザに提示する。なお、推定の精度によっては、ユーザが実際に撮影したいシーンとは異なるタイミングでレリーズ動作が実行されてしまう場合も想定され得る。そこでＣＰＵ１０３は、Ｓ９０８にてレリーズ動作を実行した後で、ステップＳ９０４で解析に用いたデータに連動しているであろうタイミングで、レリーズボタン１１１が操作されたか否かを判定する。例えば、Ｓ９０８にてレリーズ動作の実行後、所定時間内にレリーズボタン１１１が操作されたか否かを判定する。所定時間内にレリーズボタン１１１が操作された場合、ＣＰＵ１０３は、Ｓ９０８でのレリーズ動作により得られた静止画データを撮像装置１００の背面モニタに表示してユーザに提示する。所定時間内にレリーズボタン１１１が操作されなかった場合、ＣＰＵ１０３は、Ｓ９０８でのレリーズ動作により得られた静止画データをメモリ１０４から破棄してユーザに提示しないようにする。 The CPU 103 displays the still image data obtained by the release operation in S908 on the rear monitor (not shown) of the imaging device 100 and presents it to the user. Depending on the accuracy of the estimation, it is possible that the release operation may be performed at a timing different from the scene the user actually wants to capture. Therefore, after performing the release operation in S908, the CPU 103 determines whether the release button 111 was operated at a timing that would be linked to the data used for the analysis in step S904. For example, after performing the release operation in S908, the CPU 103 determines whether the release button 111 was operated within a predetermined time. If the release button 111 was operated within the predetermined time, the CPU 103 displays the still image data obtained by the release operation in S908 on the rear monitor of the imaging device 100 and presents it to the user. If the release button 111 is not operated within a predetermined time, the CPU 103 discards the still image data obtained by the release operation in S908 from the memory 104 and does not present it to the user.
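The keep-or-discard decision for a still image captured by an estimated release can be sketched as a single time-window check. The name `confirm_estimated_capture` and the parameter `window_s` (standing in for the predetermined time) are hypothetical.

```python
def confirm_estimated_capture(release_at, button_times, window_s):
    """Decide whether to present or discard an automatically captured still.

    Returns True (present to the user) if the release button was operated
    within window_s after the estimated release at release_at, and False
    (discard) otherwise.
    """
    return any(release_at <= t <= release_at + window_s for t in button_times)
```

For example, with a release at 10.0 s and a 0.5 s window, a button press at 10.3 s keeps the image, while a press at 11.0 s or no press at all discards it.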

なお、撮像装置１００は、学習モデルを用いて処理を行う構成に代えて、学習フェーズで得られた複数の学習用データから抽出した目の動きのパターンを用いて処理を行う構成でもよい。その場合には、例えば上記の目の動きのパターンをメモリ１０４等に格納しておく。撮像装置１００は、目の動きの解析結果を、格納された目の動きのパターンと比較することで類似度を算出し、算出した類似度に基づいて撮影したシーンであるかを推定する。つまり撮像装置１００は、複数の学習用データから抽出した目の動きのパターンを用いて、前述の推定部８０１と同等の処理を行う。 Instead of performing processing with a learning model, the imaging device 100 may be configured to perform processing using eye movement patterns extracted from the multiple pieces of learning data obtained in the learning phase. In that case, for example, the eye movement patterns are stored in the memory 104 or the like. The imaging device 100 calculates a similarity by comparing the result of the eye movement analysis with the stored eye movement patterns, and estimates from the calculated similarity whether the scene is one the user wants to capture. In other words, the imaging device 100 performs processing equivalent to that of the estimation unit 801 described above, using eye movement patterns extracted from multiple pieces of learning data.
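The text does not say how the similarity is calculated. As one possible choice, the sketch below uses cosine similarity between numeric feature vectors; that metric, the vector representation of a pattern, the function names, and the 0.9 threshold are all assumptions made for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def matches_stored_pattern(features, stored_patterns, min_similarity=0.9):
    """True if the analyzed features resemble any stored eye-movement pattern."""
    return any(
        cosine_similarity(features, pattern) >= min_similarity
        for pattern in stored_patterns
    )
```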

また、撮像装置１００は、ルックアップテーブル（ＬＵＴ）等のルールベースの処理を行う構成でもよい。その場合には、例えば、目の動きのパターンを示すパターンデータと、撮影したいシーンであるかの関係を予めＬＵＴとして作成しておき、作成したＬＵＴをメモリ１０４等に格納しておく。撮像装置１００は、格納されたＬＵＴを参照して、目の動きの解析結果から、撮影したいシーンであるかを推定する。つまり撮像装置１００は、ＬＵＴを用いて、前述の推定部８０１と同等の処理を行う。 The imaging device 100 may also be configured to perform rule-based processing using a lookup table (LUT) or the like. In that case, for example, the relationship between pattern data indicating eye movement patterns and whether the scene is one that the user wishes to capture is created in advance as an LUT, and the created LUT is stored in memory 104 or the like. The imaging device 100 references the stored LUT and estimates whether the scene is one that the user wishes to capture based on the results of the eye movement analysis. In other words, the imaging device 100 uses the LUT to perform processing equivalent to that of the estimation unit 801 described above.
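A rule-based variant with a lookup table might be sketched as a dictionary keyed by discretized eye movement patterns. The table entries, the three-part key (gaze, eye openness, blink), and the default of "not a wanted scene" for unknown patterns are invented for illustration; the text only states that such a relationship is created in advance and stored.

```python
# Hypothetical LUT: (gaze, openness, blink) -> scene the user wants to capture?
EYE_PATTERN_LUT = {
    ("steady", "wide_open", "no_blink"): True,
    ("steady", "narrowed", "no_blink"): True,
    ("moving", "reference", "quick_blink"): False,
}


def lookup_wanted_scene(gaze, openness, blink, table=EYE_PATTERN_LUT):
    """Look up the analyzed pattern; unknown patterns default to False."""
    return table.get((gaze, openness, blink), False)
```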

以上のように本実施形態の撮像装置１００は、レリーズボタン１１１が操作されるタイミング、またはその直前の、ユーザの目の動きの特徴を学習しておく。この学習結果を用いることで、視線の動き、目の開き具合の変化、目の細め具合の変化、瞬きの周期等のユーザの目の動きから、レリーズ動作を実行するタイミングを推定することが可能になる。これによれば、ユーザが撮影したいと推定されるタイミングで直ちにレリーズ動作を開始することが可能になり、撮影したいシーンの撮り逃しを抑制することができる。また、動画撮影中に静止画を撮影する場合に、ユーザがレリーズボタン１１１を操作することで発生する画像ズレを抑制することもできる。 As described above, the imaging device 100 of this embodiment learns in advance the characteristics of the user's eye movement at, or immediately before, the time the release button 111 is operated. Using the result of this learning makes it possible to estimate the timing for executing the release operation from the user's eye movement, such as gaze movement, changes in how wide the eyes are open, changes in how much the eyes are narrowed, and the blinking cycle. This makes it possible to start the release operation immediately at the timing when the user is estimated to want to shoot, which reduces missed shots of scenes the user wants to capture. It also reduces the image shift caused by the user operating the release button 111 when capturing a still image during video recording.

以上、本発明を実施形態と共に説明したが、上記実施形態は本発明を実施するにあたっての具体化の例を示したものに過ぎず、これらによって本発明の技術的範囲が限定的に解釈されてはならないものである。すなわち、本発明はその技術思想、又はその主要な特徴から逸脱することなく、様々な形で実施することができる。 The present invention has been described above in conjunction with embodiments, but these embodiments merely illustrate specific examples of how the present invention may be implemented, and the technical scope of the present invention should not be interpreted as being limited by these embodiments. In other words, the present invention can be implemented in various forms without departing from its technical concept or main features.

本発明は、上述の実施形態の１以上の機能を実現するプログラムを、ネットワークまたは記憶媒体を介してシステムまたは装置に供給し、そのシステムまたは装置のコンピュータがプログラムを読出し実行する処理でも実現可能である。コンピュータは、１または複数のプロセッサーまたは回路を有し、コンピュータ実行可能命令を読み出し実行するために、分離した複数のコンピュータまたは分離した複数のプロセッサーまたは回路のネットワークを含みうる。プロセッサーまたは回路は、中央演算処理装置（ＣＰＵ）、マイクロプロセッシングユニット（ＭＰＵ）、グラフィクスプロセッシングユニット（ＧＰＵ）、特定用途向け集積回路（ＡＳＩＣ）、フィールドプログラマブルゲートウェイ（ＦＰＧＡ）を含みうる。また、プロセッサーまたは回路は、デジタルシグナルプロセッサ（ＤＳＰ）、データフロープロセッサ（ＤＦＰ）、またはニューラルプロセッシングユニット（ＮＰＵ）を含みうる。 The present invention can also be realized by supplying a program that implements one or more of the functions of the above-described embodiments to a system or device via a network or storage medium, and having a computer of the system or device read and execute the program. The computer has one or more processors or circuits, and may include multiple separate computers or a network of multiple separate processors or circuits to read and execute computer-executable instructions. The processor or circuit may include a central processing unit (CPU), a microprocessing unit (MPU), a graphics processing unit (GPU), an application-specific integrated circuit (ASIC), or a field-programmable gate array (FPGA). The processor or circuit may also include a digital signal processor (DSP), a data flow processor (DFP), or a neural processing unit (NPU).

本実施形態の第１の変形例として、ＣＰＵ１０３は、眼球用撮像素子１１６から得られる像情報を解析して、眼球１１３の特徴やまつ毛の特徴等の眼球情報を取得し、取得した眼球情報に基づいて、ユーザを識別してもよい。この場合学習フェーズでは、ＣＰＵ１０３は、識別されたユーザ毎に学習モデルの学習を行う。メモリ１０４にはユーザ毎に学習された学習モデルが記憶される。この場合推定フェーズでは、ＣＰＵ１０３は、ユーザ毎に学習された学習モデルを用いて推定を行う。 As a first variant of this embodiment, the CPU 103 may analyze image information obtained from the eyeball image sensor 116 to acquire eyeball information such as the characteristics of the eyeball 113 and the characteristics of the eyelashes, and identify the user based on the acquired eyeball information. In this case, in the learning phase, the CPU 103 learns a learning model for each identified user. The memory 104 stores the learning model learned for each user. In this case, in the estimation phase, the CPU 103 performs estimation using the learning model learned for each user.

本実施形態の第２の変形例として、ＣＰＵ１０３は、表示素子１０８におけるユーザの視線位置を検出し、表示素子１０８におけるユーザの視線位置と、スルー画像上の被写体位置の関係性を解析してもよい。この場合ＣＰＵ１０３は、被写体の領域にユーザの視線位置があるかの情報や、被写体の動きにユーザの視線の動きが追随しているかの情報を取得する。学習フェーズでは、ＣＰＵ１０３は、学習用データのうち、被写体の領域にユーザの視線位置がない状態や、被写体の動きに視線の動きが追随していない状態で検出された目の動き情報を、正解データの対象から除外してもよい。つまり、ユーザの視線位置とスルー画像上の被写体の位置とが連携していない状態で検出された目の動き情報を、正解データの対象から除外する。 As a second variation of this embodiment, the CPU 103 may detect the user's gaze position on the display element 108 and analyze the relationship between that gaze position and the subject position on the through image. In this case, the CPU 103 acquires information on whether the user's gaze position lies within the subject region and whether the user's gaze movement follows the subject's movement. In the learning phase, the CPU 103 may exclude from the correct data any eye movement information in the learning data that was detected while the user's gaze position was outside the subject region or while the gaze movement was not following the subject's movement. In other words, eye movement information detected while the user's gaze position and the subject position on the through image are not linked is excluded from the correct data.
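The exclusion in this second variation can be illustrated with a simple point-in-rectangle test. This sketch covers only the "gaze position within the subject region" condition, not the motion-following condition, and it assumes the subject region is an axis-aligned box `(left, top, right, bottom)` in display coordinates; the names and the sample dictionary layout are hypothetical.

```python
def gaze_on_subject(gaze_xy, subject_box):
    """True if the gaze position lies inside the subject's bounding box."""
    x, y = gaze_xy
    left, top, right, bottom = subject_box
    return left <= x <= right and top <= y <= bottom


def keep_linked_samples(samples):
    """Drop samples whose gaze position is outside the subject region.

    Each sample is a dict with 'gaze' (x, y), 'subject_box'
    (left, top, right, bottom) and 'movement' (eye-movement information).
    """
    return [s for s in samples if gaze_on_subject(s["gaze"], s["subject_box"])]
```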

１００：撮像装置、１０２：撮像素子、１０３：ＣＰＵ、１０４：メモリ、１０７：視線検知用センサー、１１６：眼球用撮像素子 100: Imaging device, 102: Image sensor, 103: CPU, 104: Memory, 107: Line-of-sight detection sensor, 116: Eye image sensor

Claims

An imaging device comprising:
a detection means for detecting the state of the user's eyes;
an analysis means for analyzing eye movements based on the detection result by the detection means; and
an estimation means for estimating the timing to start a release operation based on output data indicating whether the scene is one the user wants to capture, the output data being obtained by inputting movement information representing the eye movement analyzed by the analysis means into a learning model.

The imaging device of claim 1 further comprises a learning means for learning the learning model using learning data in which the movement information is input data and the movement information when the release button is operated is used as correct answer data.

The imaging device described in claim 2, characterized in that the learning means uses the movement information from when detection by the detection means begins until the release button is operated as correct answer data.

The imaging device described in claim 3, characterized in that, if the time between when preparation for shooting is instructed by a first operation of the release button and when shooting is instructed by a second operation of the release button is less than a predetermined time, the learning means uses the movement information from when detection by the detection means begins until the first operation is performed as correct data.

An imaging device according to claim 3 or 4, characterized in that, if the time between when preparation for shooting is instructed by a first operation of the release button and when shooting is instructed by a second operation of the release button is equal to or longer than a predetermined time, the learning means uses the movement information from when detection by the detection means begins until the second operation is performed as correct answer data.

An imaging device described in any one of claims 2 to 5, characterized in that whether or not learning is performed by the learning means can be switched depending on settings.

The imaging device described in any one of claims 1 to 6, characterized in that the analysis means analyzes at least one of the following items based on a predetermined period of data on the eye condition detected by the detection means: gaze movement, changes in eye opening, changes in eye squinting, and blinking cycle.

The imaging device described in any one of claims 1 to 7, characterized by a control means that controls the release operation to be performed at the estimated timing.

The imaging device of claim 8, wherein the shutter sound emitted when the control means executes a release operation is different from the shutter sound emitted when the release button is operated.

The imaging device according to claim 8, wherein the control means:
presents still image data obtained by the release operation to the user if the release button is operated within a predetermined time after the control means has performed the release operation; and
discards the still image data obtained by the release operation if the release button is not operated within the predetermined time after the control means has performed the release operation.

The imaging device described in any one of claims 1 to 10, characterized in that the learning model is configured as a neural network.

The imaging device according to claim 3, wherein:
the detection means detects a gaze position relative to a display unit on which a captured image is displayed;
the analysis means analyzes a relationship between the gaze position and the position of the subject on the captured image; and
the learning means excludes from the correct data the movement information obtained when the gaze position and the position of the subject on the captured image are not linked.

The imaging device described in claim 3, characterized in that the learning means excludes from the correct answer data the movement information obtained when the imaging device is moving and/or when the zoom ratio is changing.

A method for controlling an imaging device, comprising:
a detection step of detecting the state of the user's eyes;
an analysis step of analyzing eye movements based on the detection result of the detection step; and
an estimation step of estimating the timing to start a release operation based on output data indicating whether the scene is one the user wants to capture, the output data being obtained by inputting movement information representing the eye movement analyzed in the analysis step into a learning model.

A program for causing a computer to function as each of the means of the imaging device according to any one of claims 1 to 13.