JP6797955B2

JP6797955B2 - Image processing device and control method of image processing device, imaging device, program

Info

Publication number: JP6797955B2
Application number: JP2019041361A
Authority: JP
Inventors: 彰太山口
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2016-01-13
Filing date: 2019-03-07
Publication date: 2020-12-09
Anticipated expiration: 2036-11-22
Also published as: JP6494587B2; JP2019135839A; JP2017126979A

Description

The present invention relates to a technique for controlling the recording of images and distance information.

There is a technique in which a user selects a preferred image from continuously acquired moving images and applies various kinds of image processing afterwards to obtain a single still image that suits the user's taste. Patent Document 1 discloses a technique that determines whether a subject is in a predetermined state in continuously captured images and records the images in the predetermined state at a relatively higher image quality. Patent Document 2 discloses a technique that obtains a high-quality image by not performing frame compression when the shooting mode is changed to a super-resolution mode. Patent Document 3 discloses a technique that generates distance distribution information as additional information alongside the captured image so that the blur of the captured image can be adjusted by image processing.

Patent Document 1: Japanese Unexamined Patent Publication No. 2011-166391
Patent Document 2: Japanese Unexamined Patent Publication No. 2007-189665
Patent Document 3: Japanese Unexamined Patent Publication No. 2015-119416

However, some kinds of image processing require highly accurate extraction of the subject's contour. Examples include background blurring, which blurs the background electronically, and area-specific gradation processing, which obtains a high-quality dynamic-range-compressed image by applying gradation correction separately to the main subject and to the other areas. To realize these techniques, it is necessary not only to record the frame image data but also to acquire a distance map showing the distance distribution within the angle of view (at least information from which the relative relationship of distances can be determined). In addition, area-specific gradation processing requires that the RAW image before image processing also be recorded.

On the other hand, recording distance map or RAW image data for every frame of a moving image in order to perform such image processing makes the required recording capacity enormous and increases the burden on the user. It is therefore also very important to keep the amount of frame image data to the necessary minimum when acquiring information for post-processing.
An object of the present invention is to provide an image processing apparatus that can acquire captured images and additional information and that can perform highly convenient image recording while limiting the recording capacity.

An apparatus according to one embodiment of the present invention is an image processing apparatus including recording means that acquires a plurality of image data and records them on a recording medium. The apparatus comprises: acquisition means for acquiring depth distribution information of a subject corresponding to the image data; and control means for performing recording processing of the plurality of image data while switching between a first mode, in which the image data and the corresponding depth distribution information are recorded on the recording medium by the recording means, and a second mode, in which the image data is recorded on the recording medium by the recording means without recording the depth distribution information. The control means switches between the first mode and the second mode at a cycle that depends on the frame rate, set by the user, at which the plurality of image data are captured.
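As an illustration of switching modes at a cycle tied to the frame rate, the following is a minimal sketch. The text does not state how the cycle is derived from the frame rate, so the rule used here (keep roughly a fixed number of depth maps per second) and the names `depth_recording_period`, `depth_maps_per_second`, and `mode_for_frame` are assumptions made only for this example.

```python
def depth_recording_period(frame_rate_fps: float,
                           depth_maps_per_second: float = 5.0) -> int:
    """Return N such that every N-th frame is recorded in the first mode.

    Assumption for this sketch: the device aims to keep about
    `depth_maps_per_second` depth maps per second of video.
    """
    if frame_rate_fps <= 0 or depth_maps_per_second <= 0:
        raise ValueError("rates must be positive")
    return max(1, round(frame_rate_fps / depth_maps_per_second))


def mode_for_frame(frame_index: int, period: int) -> str:
    """'first' = image + depth distribution information, 'second' = image only."""
    return "first" if frame_index % period == 0 else "second"


period_30fps = depth_recording_period(30.0)
modes = [mode_for_frame(i, period_30fps) for i in range(8)]
```

With this rule a higher frame rate gives a longer period in frames, so the amount of recorded depth information per second stays roughly constant.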

According to the present invention, highly convenient image recording can be performed while limiting the recording capacity.

・A block diagram showing the configuration of the imaging apparatus according to the first embodiment of the present invention.
・A block diagram of the recording mode control unit in the first embodiment.
・A flowchart showing the processing of the recording mode control unit in the first embodiment.
・An explanatory diagram showing an example of a main subject area detection result.
・A diagram illustrating distance histograms of the main subject area and the background area.
・A diagram showing an example of an optical model of the object distance and the image distance.
・A graph showing the relationship between the defocus amount and the object-plane distance.
・A block diagram of the image processing unit in the first embodiment.
・A flowchart of the background image blurring processing in the first embodiment.
・A diagram showing an example of a screen notifying the user that background image blurring is possible.
・A diagram explaining the processing of the distance map shaping unit.
・A diagram explaining the processing of the in-focus subject extraction unit.
・A diagram explaining the blurring processing.
・A block diagram of the recording mode control unit in the second embodiment of the present invention.
・A flowchart showing the processing of the recording mode control unit in the second embodiment.
・A diagram explaining the processing of the threshold calculation unit.
・A block diagram of the image processing unit in the second embodiment.
・A flowchart of the area-specific gradation correction processing in the second embodiment.
・A diagram showing an example of a screen notifying the user that area-specific gradation correction is possible.
・A diagram explaining the gradation characteristic calculation method in the second embodiment.
・A diagram explaining the compositing processing.
・A block diagram showing the configuration of the imaging apparatus according to the third embodiment of the present invention.
・A flowchart showing the processing of the recording information adjustment unit in the third embodiment.
・A diagram explaining the main subject score calculation processing.
・A diagram explaining the calculation of the main subject score threshold.
・A block diagram showing the configuration of the imaging apparatus according to the fourth embodiment of the present invention.
・A flowchart showing the processing of the exposure condition control unit in the fourth embodiment.
・A diagram explaining the recording format of an image and a distance map.

Embodiments of the present invention are described below with reference to the drawings. Each embodiment describes an image processing apparatus that later performs, in response to a user's instruction, image processing involving high-performance area extraction.

[First Embodiment]
The first embodiment of the present invention assumes a case in which the image processing apparatus applies background blurring to an arbitrary frame that the user later selects from a recorded sequence of continuous frames. To that end, this embodiment provides an image processing apparatus that can switch, for a plurality of captured images, between a mode that records depth distribution information as additional information and a mode that does not. FIG. 1 is a block diagram illustrating a configuration applicable to an imaging apparatus as an example of the image processing apparatus of this embodiment.
The imaging optical system 101 includes a lens group such as a zoom lens and a focus lens, an aperture adjusting device, and a shutter device. The imaging optical system 101 adjusts the magnification, focus position, and light amount of the subject image reaching the imaging unit 102. The imaging unit 102 includes a CCD (charge-coupled device) image sensor, a CMOS (complementary metal oxide semiconductor) image sensor, or the like, and photoelectrically converts the light flux from the subject that has passed through the imaging optical system 101 into an electric signal. The A (analog) / D (digital) conversion unit 103 converts the input analog electric signal into a digital image signal.

The recording mode control unit 104 has a plurality of recording modes and controls the amount of image information recorded in the recording unit 109. The image processing unit 105 performs various kinds of processing on image signals output from the A/D conversion unit 103 and on image signals read from the recording unit 109. Examples include correction of distortion and noise caused by the imaging optical system, demosaicing, white balance adjustment, color conversion, and gamma correction. In addition to this predetermined image processing, the image processing unit 105 performs the background blurring assumed in this embodiment. In this embodiment, a RAW image is an image to which at least part of the image processing by the image processing unit 105, other than the correction of distortion and noise caused by the imaging optical system, has not been applied. This embodiment describes an example in which the image processing unit 105 performs the reproduction processing, but a reproduction processing unit may be provided separately from the image processing unit 105.

The system control unit 106 is the central control unit that governs the operation of the entire imaging apparatus and includes a CPU (central processing unit), memory, and the like. In accordance with operation instructions given by the user through the operation unit 107, the system control unit 106 controls the driving of the imaging optical system 101 and the imaging unit 102, the predetermined image processing in the image processing unit 105, and so on.

The display unit 108 comprises a liquid crystal display, an organic EL (electroluminescence) display, or the like, and displays images according to image signals generated by the imaging unit 102 or read from the recording unit 109. The recording unit 109 records image signals on a recording medium. Depending on the recording settings for still images and moving images, recording uses an encoding format for still images (for example, JPEG) or an encoding format for moving images (for example, H.264 or H.265). When still images or moving images are set to be recorded as RAW images before image processing, they are recorded with no compression, with lossless compression, or at another low compression ratio that degrades image quality less than when recording processed image data. The recording medium is an information recording medium such as a memory card containing semiconductor memory or a package housing a rotating recording body such as a magneto-optical disk. The recording medium is removable from the imaging apparatus.

The bus 110 is used to exchange signals among the recording mode control unit 104, the image processing unit 105, the system control unit 106, the display unit 108, and the recording unit 109.
The processing flow in this embodiment is described below, focusing on the recording mode control unit 104 and the image processing unit 105. The recording mode control unit 104 is the processing block characteristic of this embodiment. The image processing unit 105 is the processing block that blurs the background image in response to an operation instruction given after shooting.

The operation of the recording mode control unit 104 is described with reference to FIGS. 2 and 3.
FIG. 2 is a block diagram of the recording mode control unit 104. The main subject area detection unit 201 acquires the input image data and detects the main subject area. The main subject area is, for example, the image area of a subject selected from a plurality of subjects. The distance map calculation unit 202 acquires the input image data and calculates the distance map (depth distribution information). The main subject distance calculation unit 203 acquires the main subject area information detected by the main subject area detection unit 201 and the distance map calculated by the distance map calculation unit 202, and calculates the main subject distance. The main subject distance is the distance from the imaging apparatus to the main subject. The background distance calculation unit 204 calculates the background distance from the background area information and the distance map. The background area is the area other than the main subject, and the background distance is the distance from the imaging apparatus to the background. The threshold calculation unit 205 acquires the main subject distance and calculates a threshold for the distance difference between the main subject and the background. The recording mode determination unit 206 acquires the main subject distance, the background distance, and the threshold, and determines the recording mode by comparing the difference between the main subject distance and the background distance with the threshold. The recording modes include at least a first mode that records image data and distance information and a second mode that records image data. This embodiment describes control that switches between the first and second modes when recording each frame of a moving image whose frames are the image data acquired by imaging, but the control is not limited to this. For example, the present invention also covers an embodiment in which the first and second modes are switched for each shot in still image capture. That is, any control that switches between the first mode and the second mode for a plurality of acquired image data suffices. The processing of each unit is described in detail later.

FIG. 3 is a flowchart explaining the processing of the recording mode control unit 104. The flowchart assumes that a moving image is being acquired, and that frame images are input continuously at a predetermined frame rate from the time the user instructs the start of shooting until the time the user instructs the end of shooting.

First, in S301, the main subject area detection unit 201 detects the main subject area. A specific example is described with reference to FIG. 4, which shows an example of a detected main subject area; here the main subject area is detected using a face detection frame. When the main subject is not a person, the subject area is detected by general object detection; the detection method is not limited to any particular one. As shown in FIG. 4, the part of the whole image that is not the main subject area is defined as the background area.
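The split into main subject area and background area in S301 can be sketched as a boolean mask built from a detection rectangle. This is only an illustration: the rectangle layout `(left, top, width, height)` and the function name `main_subject_mask` are assumptions, and the detector that would supply the rectangle is outside the sketch.

```python
from typing import List, Tuple

# Hypothetical rectangle layout for this sketch: (left, top, width, height) in pixels.
Rect = Tuple[int, int, int, int]


def main_subject_mask(width: int, height: int, frame: Rect) -> List[List[bool]]:
    """Return a mask that is True inside the detection frame (main subject area).

    Every pixel where the mask is False belongs to the background area.
    """
    left, top, w, h = frame
    return [
        [left <= x < left + w and top <= y < top + h for x in range(width)]
        for y in range(height)
    ]


mask = main_subject_mask(6, 4, (2, 1, 2, 2))
main_pixels = sum(cell for row in mask for cell in row)
background_pixels = 6 * 4 - main_pixels
```

Because the frame is rectangular, some background pixels near the subject outline end up inside the main subject area, which is why a second histogram peak can appear there.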

Next, in S302, the distance map calculation unit 202 calculates the distance map. One way to calculate the distance map, as described for example in Patent Document 3, is to pupil-divide the light from the subject to generate a plurality of viewpoint images (parallax images), calculate the parallax amount, and thereby obtain the depth distribution information of the subject. The depth distribution information of the subject includes data that expresses the distance from the camera, as the imaging means, to the subject (the subject distance) as an absolute distance value, and data that indicates the relative distance relationship within the image data (the image depth), such as a distribution of parallax amounts or a distribution of defocus amounts. The following description treats the data as distance values; when a distribution of parallax amounts or defocus amounts is used as the depth distribution information, each process is performed with the distance value replaced by the parallax amount or the defocus amount, respectively.
The depth distribution information, as information corresponding to the depth of each subject in the captured image in the depth direction, can take various forms. That is, the information indicated by data corresponding to the depth of a subject may either directly represent the subject distance from the imaging apparatus to the subject in the image, or represent the relative relationship of the distances (subject distances) or depths of the subjects in the image. For example, the imaging unit 102 is controlled so as to change the in-focus position, and a plurality of captured image data are acquired. Depth distribution information can be obtained from the in-focus area of each captured image data and the in-focus position information of that captured image data. Alternatively, when the image sensor of the imaging unit 102 has a pupil-division pixel configuration, depth distribution information for each pixel can be obtained from the phase difference between a pair of image signals. Specifically, the image sensor converts into electric signals the optical images formed by a pair of light fluxes that pass through different partial pupil areas of the imaging optical system, and outputs paired image data from a plurality of photoelectric conversion units. A correlation calculation between the paired image data yields the image shift amount of each area, from which an image shift map representing the distribution of image shift amounts is calculated. The image shift amount may further be converted into a defocus amount to generate a defocus map representing the distribution of defocus amounts (the distribution over the two-dimensional plane of the captured image). Converting the defocus amount into a subject distance based on the conditions of the imaging optical system and the image sensor yields distance map data representing the distribution of subject distances. In this way, image shift map data, defocus map data, or distance map data of subject distances converted from the defocus amount can be obtained.
The subject distance from the imaging apparatus to the subject in the image may also be obtained directly by the TOF (Time of Flight) method, which measures the distance to the subject from the delay between projecting light onto the subject and receiving the reflected light. In the TOF method, light projecting means projects pulsed light onto the subject (object), the imaging unit 102 receives the reflected light, and the flight time (delay time) of the pulsed light is measured to determine the subject distance (the distance to the object) and obtain depth distribution information.
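For the TOF method mentioned above, the distance follows from the round-trip delay of the pulse: the light travels to the subject and back, so the one-way distance is half the delay multiplied by the speed of light. The sketch below applies that relation per pixel; the function names and the nested-list layout of the delay map are assumptions for illustration, not part of the described apparatus.

```python
from typing import List

SPEED_OF_LIGHT_M_PER_S = 299_792_458.0


def tof_distance_m(round_trip_delay_s: float) -> float:
    """Convert a measured round-trip delay into a one-way subject distance."""
    if round_trip_delay_s < 0:
        raise ValueError("delay must be non-negative")
    return SPEED_OF_LIGHT_M_PER_S * round_trip_delay_s / 2.0


def tof_distance_map(delays_s: List[List[float]]) -> List[List[float]]:
    """Apply the conversion to every pixel of a (hypothetical) delay map."""
    return [[tof_distance_m(d) for d in row] for row in delays_s]


# A 20 ns round trip corresponds to a subject roughly 3 m away.
distance_map = tof_distance_map([[20e-9, 40e-9]])
```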

In S303, the main subject distance calculation unit 203 calculates the main subject distance (or the parallax amount or defocus amount corresponding to the main subject) using the main subject area information and the distance map. In S304, the background distance calculation unit 204 calculates the background distance using the background area information and the distance map. A specific example of S303 and S304 is described with reference to FIG. 5, which shows how the main subject distance and the background distance are calculated. The horizontal axis represents the distance along the optical axis of the imaging apparatus, and the vertical axis represents the frequency (count) of the distance histogram.

First, the main subject distance calculation unit 203 obtains the distance histogram within the main subject area shown in FIG. 5(A) and takes the distance at the peak with the highest frequency as the main subject distance. FIG. 5(A) shows an example with two peaks. A peak other than the one at the main subject distance exists because the boundary of the main subject area is defined by a rectangular frame, so part of the background image falls inside the main subject area.

Next, the background distance calculation unit 204 obtains the distance histogram within the background area shown in FIG. 5(B) and takes the distance at the peak with the highest frequency as the background distance. FIG. 5(B) shows an example with three peaks.
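The peak-picking in S303 and S304 can be sketched as follows: bin the distance values of an area into a histogram and return the distance of the most populated bin. The bin width of 100 mm, the function name `peak_distance_mm`, and the sample values are assumptions chosen for illustration.

```python
from collections import Counter
from typing import Iterable


def peak_distance_mm(distances_mm: Iterable[float],
                     bin_width_mm: float = 100.0) -> float:
    """Return the centre of the histogram bin with the highest count."""
    counts = Counter(int(d // bin_width_mm) for d in distances_mm)
    if not counts:
        raise ValueError("no distance samples")
    best_bin, _ = max(counts.items(), key=lambda item: item[1])
    return (best_bin + 0.5) * bin_width_mm


# Main subject area: mostly the subject near 3 m, plus some background near 6 m
# that falls inside the rectangular frame.
main_area = [3010.0, 3040.0, 3080.0, 2990.0, 6020.0, 6050.0]
# Background area: several depth layers, the dominant one near 6 m.
background_area = [6010.0, 6020.0, 6090.0, 6070.0, 9000.0, 9050.0, 1200.0]

dist_obj = peak_distance_mm(main_area)
dist_back = peak_distance_mm(background_area)
```

Taking the highest peak rather than the mean keeps the stray background pixels inside the detection frame from pulling the main subject distance away from the subject.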

In S305, the threshold calculation unit 205 calculates the threshold for the distance difference between the main subject and the background. The threshold is determined from the main subject distance using an optical model, described with reference to FIG. 6. The optical model of FIG. 6 shows the object distance and the image distance when the imaging optical system 101 is approximated as a single lens. For a lens of focal length f, the side on which the object exists (the left side of the lens in FIG. 6) is defined as the object-plane side, and the side on which light from the object is focused through the lens (the right side of the lens in FIG. 6) is defined as the image-plane side. Assuming that the main subject is in focus, the following distances are defined.
・Main subject distance: Dist_Obj
・Background distance: Dist_Back
・Main subject image distance: Img_Obj
The main subject image distance is the distance at which light from the main subject is focused on the image-plane side. The position on the image-plane side at which light from the main subject is focused is called the focus plane, and the displacement from the focus plane to the position at which light from the background is focused is called the defocus amount def. The sign of def is taken as positive in the direction away from the object-plane side with respect to the lens. Therefore, the defocus amount at the position where light from a background located farther away than the main subject is focused is negative. Qualitatively, the larger the absolute value of the defocus amount, the more the background image is spread on the focus plane and the more blurred the background becomes.

In FIG. 6, the following equations (1) and (2) hold from the lens formula.
Eliminating the main subject image distance Img_Obj from equations (1) and (2) and solving for the background distance Dist_Back gives the following equation (3).
From equation (3), once the main subject distance Dist_Obj and the focal length f are fixed, the background distance Dist_Back can be regarded as a function of the defocus amount def. Therefore, although this embodiment mainly shows an example in which the subject distance is calculated as the distance map, the same purpose can also be achieved with the defocus amount or the parallax amount.
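Equations (1) to (3) themselves are not reproduced in this text. Under the single-lens model and the sign convention for def described above, they can plausibly be reconstructed as follows; this is a reconstruction from the surrounding definitions, not the original notation.

```latex
\begin{align}
\frac{1}{\mathrm{Dist\_Obj}} + \frac{1}{\mathrm{Img\_Obj}} &= \frac{1}{f} \tag{1}\\[4pt]
\frac{1}{\mathrm{Dist\_Back}} + \frac{1}{\mathrm{Img\_Obj} + \mathrm{def}} &= \frac{1}{f} \tag{2}\\[4pt]
\mathrm{Dist\_Back} &= \frac{f\,\bigl(f\,\mathrm{Dist\_Obj} + \mathrm{def}\,(\mathrm{Dist\_Obj} - f)\bigr)}
                            {f^{2} + \mathrm{def}\,(\mathrm{Dist\_Obj} - f)} \tag{3}
\end{align}
```

Equation (3) follows by solving (1) for Img_Obj = f Dist_Obj / (Dist_Obj - f), substituting into (2), and solving for Dist_Back. With f = 50 mm, Dist_Obj = 3000 mm, and def = -0.2 mm, this form gives a Dist_Back of about 3911 mm, which agrees with the value quoted for FIG. 7.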

FIG. 7 is a graph, created from equation (3), illustrating the relationship between the defocus amount (horizontal axis) and the object-plane distance (vertical axis). It shows two cases with a focal length of 50 mm: a main subject distance Dist_obj of 3 m (3000 mm) and of 5 m (5000 mm). Because the main subject is assumed to be in focus, the defocus amount is 0 when the object distance equals Dist_obj. If the allowable defocus amount at which the background can be regarded as sufficiently blurred is taken to be -0.2 mm, the corresponding object distance for Dist_obj = 3 m is about 3911 mm. The threshold for the distance difference between the subject and the background is therefore |3911 - 3000| = 911 mm, or roughly 90 cm.
This concludes the description of the processing performed by the threshold value calculation unit 205 in S305.
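The threshold calculation of S305 can be illustrated with a short Python sketch. Equations (1) to (3) are not reproduced in this text, so the sketch assumes the standard thin-lens relation 1/Dist + 1/Img = 1/f and a sign convention in which a negative defocus amount moves the image plane toward the lens (a more distant object). These assumptions were chosen because they reproduce the 3911 mm figure quoted for FIG. 7; the function names are hypothetical.

```python
def background_distance(dist_obj_mm, focal_mm, defocus_mm):
    """Object distance whose image plane is offset by defocus_mm from the in-focus plane."""
    # Assumed thin-lens relation for the in-focus main subject:
    # 1/Dist_obj + 1/Img_obj = 1/f
    img_obj = focal_mm * dist_obj_mm / (dist_obj_mm - focal_mm)
    # Assumed sign convention: negative defocus = image plane closer to the lens.
    img_back = img_obj + defocus_mm
    # Same thin-lens relation solved for the object distance of the shifted image plane.
    return focal_mm * img_back / (img_back - focal_mm)


def distance_threshold(dist_obj_mm, focal_mm, allowed_defocus_mm):
    """Subject-to-background distance difference corresponding to the allowed defocus."""
    dist_back = background_distance(dist_obj_mm, focal_mm, allowed_defocus_mm)
    return abs(dist_back - dist_obj_mm)


dist_back = background_distance(3000.0, 50.0, -0.2)
threshold = distance_threshold(3000.0, 50.0, -0.2)
print(round(dist_back), round(threshold))  # prints: 3911 911
```

With a defocus amount of 0 the function returns the main subject distance itself, which matches the statement that the defocus amount is 0 when the object distance equals Dist_obj.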

Next, in S306, the recording mode determination unit 206 compares the difference between the main subject distance and the background distance with the threshold value acquired from the threshold value calculation unit 205. If the difference is within the threshold value, the recording mode determination unit 206 determines that the background is not sufficiently blurred and that electronic blurring may be needed later, and proceeds to S307 to perform recording in the high quality mode, which is the first mode. In the high quality mode, a distance map is recorded in addition to the normally compressed frame data. The process of blurring the background using the information recorded in the high quality mode is described later. If, on the other hand, the difference is larger than the threshold value, the recording mode determination unit 206 regards the background as sufficiently blurred. That is, it determines that electronic blurring is unnecessary and proceeds to S308 to perform recording in the normal mode, which is the second mode. In the normal mode, only the normally compressed frame data is recorded. The frame compression processing and the predetermined signal processing associated with development are performed in the image processing unit 105. The processing by the recording mode control unit 104 is performed at the time of shooting. The image recorded in the high quality mode is not limited to a normally compressed frame and may be a RAW image before image processing. The RAW image may be compressed with a lossless compression method at the time of recording.
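The branch in S306 to S308 can be summarized in a few lines of Python. This is only a sketch: the enum, the function names, and the tuple of recorded items are hypothetical labels for the two modes described above.

```python
from enum import Enum


class RecordingMode(Enum):
    HIGH_QUALITY = 1  # first mode: compressed frame plus distance map
    NORMAL = 2        # second mode: compressed frame only


def decide_recording_mode(dist_obj_mm, dist_back_mm, threshold_mm):
    """S306: compare the subject/background distance difference with the threshold."""
    diff = abs(dist_back_mm - dist_obj_mm)
    if diff <= threshold_mm:
        # Background is probably not blurred enough; keep data for later blurring (S307).
        return RecordingMode.HIGH_QUALITY
    # Background is already sufficiently blurred (S308).
    return RecordingMode.NORMAL


def items_to_record(mode):
    """Data recorded in each mode of the first embodiment."""
    if mode is RecordingMode.HIGH_QUALITY:
        return ("compressed_frame", "distance_map")
    return ("compressed_frame",)
```

For example, with the 911 mm threshold from FIG. 7, a background 500 mm behind the subject selects the high quality mode, while a background 3 m behind it selects the normal mode.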

Next, the background blurring process performed after recording in response to a user instruction is described. This process is performed mainly by the image processing unit 105, but the operation unit 107, which accepts user instructions, and the system control unit 106 are also involved. Because it is performed after recording, it is referred to below as the "post-blurring process". The post-blurring process is described with reference to FIGS. 8 and 9.

FIG. 8 is a processing block diagram of the image processing unit 105 for the post-blurring process. The distance map shaping unit 801 shapes the distance map; the shaping process is described later. The focus subject extraction unit 802 extracts the image of the in-focus subject, that is, the main subject that is in focus. The blur processing unit 803 blurs the background image and outputs the processed frame image data. Details of the processing of each unit are described later.

FIG. 9 is a flowchart of the post-blurring process.
First, in S901, the imaging device accepts an instruction from the user to blur the background image. Only frames recorded in the high quality mode can be post-blurred, so the user must be told whether a given frame can be processed. FIG. 10 shows a display example in which the display unit 108 notifies the user that the post-blurring process is available. A display and input process asks the user whether to blur the background image. If the user chooses to blur the background image during moving image playback, the system control unit 106 receives the operation instruction from the operation unit 107 and instructs the image processing unit 105 to perform the processing from S902 onward.

In S902, the distance map shaping unit 801 acquires the frame image and the distance map data and shapes the distance map. FIG. 11 illustrates the distance map shaping process. FIG. 11(A) shows an example input frame image, FIG. 11(B) an example input distance map before shaping, and FIG. 11(C) an example output distance map after shaping.

The subject contour in the input distance map of FIG. 11(B) may be less accurate than the subject contour in the input frame image of FIG. 11(A). Reasons include the effect of near-far conflict at subject boundaries during the distance calculation, and the fact that the distance map is calculated at a lower resolution than the frame image. Blurring the background image requires highly accurate contour extraction. The subject contour of the distance map must therefore be matched to the subject contour of the frame image; this is called the shaping process. The shaped distance map is also general-purpose: it can be used in any process that needs a distance map, not only background blurring.

The shaping process is performed by bilateral filtering. With the frame image (see FIG. 11(A)) used as the shaping image, the filter result Jp at the pixel position of interest p is given by equation (4) below.
Jp = (1 / Kp) Σ I1q · f(|p − q|) · g(|I2p − I2q|) ... (4)
The symbols in equation (4) have the following meanings.
q: peripheral pixel position
Ω: accumulation target region centered on the pixel position of interest p
Σ: accumulation over q ∈ Ω
I1q: distance map signal value at the peripheral pixel position q
f(|p − q|): Gaussian function centered on the pixel position of interest p
I2p: pixel value of the shaping image at the pixel position of interest p
I2q: pixel value of the shaping image at the peripheral pixel position q
g(|I2p − I2q|): Gaussian function centered on the pixel value I2p of the shaping image
Kp: normalization coefficient, the accumulated value of the f · g weights
In equation (4), the closer the peripheral pixel position q is to the pixel position of interest p, the larger the value of f. The smaller the difference between I2p and I2q, that is, the closer the pixel values of the pixel of interest and the peripheral pixel are in the shaping image, the larger the g weight (smoothing weight) of that peripheral pixel. The output distance map signal value Jp after shaping is the sum of the input distance map signal values I1q weighted by the f · g weights.

FIG. 11(D) shows the profile 1100pf of the frame image of FIG. 11(A) along position x. The position where profile 1100pf is taken is indicated by line 1100 in FIG. 11(A). Profile 1100pf is a step shape that changes at position xa. FIG. 11(E) shows the profiles of the distance maps of FIGS. 11(B) and 11(C) along position x. The position where profile 1101pf is taken is indicated by line 1101 in FIG. 11(B), and that of profile 1102pf by line 1102 in FIG. 11(C). Lines 1101 and 1102 are at the same position.

In the input distance map of FIG. 11(B), the signal values indicating the subject distance extend beyond the contour of the subject image. The change in profile 1101pf, shown by the broken line in FIG. 11(E), is offset from position xa, where profile 1100pf of FIG. 11(D) changes. Shaping with the bilateral filter yields profile 1102pf, which corresponds to the shaped distance map of FIG. 11(C). Profile 1102pf changes at position xa, matching profile 1100pf of FIG. 11(D), and thus follows the contour of the subject image. In other words, profile 1102pf is a step shape that changes sharply at position xa.

In S903 of FIG. 9, the focus subject extraction unit 802 extracts the in-focus subject. This is described concretely with reference to FIG. 12. FIG. 12(A) shows an example input frame image, FIG. 12(B) the shaped distance map, FIG. 12(C) the extraction characteristic, and FIG. 12(D) the extraction result for the in-focus subject.

The extraction characteristic of FIG. 12(C) is applied to the shaped distance map of FIG. 12(B). In FIG. 12(C), the horizontal axis represents the subject distance and the vertical axis represents the output value of the extraction result. The extraction characteristic outputs the maximum value only for distances within a predetermined focus-plane distance range and outputs zero or a minimum value for all other distances. FIG. 12(D) shows the extraction result obtained by applying the extraction characteristic: only the in-focus main subject is extracted.
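The extraction characteristic amounts to a band-pass on distance. A minimal Python sketch follows; the function name, the range limits, and the 0/255 output values are assumptions for illustration only.

```python
def extract_focus_subject(distance_map, near_mm, far_mm, max_value=255, min_value=0):
    """Output max_value inside the focus-plane distance range, min_value elsewhere."""
    return [
        [max_value if near_mm <= d <= far_mm else min_value for d in row]
        for row in distance_map
    ]


shaped_map = [
    [3000, 2950, 5000],
    [3100, 8000, 3050],
]
mask = extract_focus_subject(shaped_map, near_mm=2900, far_mm=3200)
print(mask)  # prints: [[255, 255, 0], [255, 0, 255]]
```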

In S904 of FIG. 9, the blur processing unit 803 blurs the background image. The blurring process is described with reference to FIG. 13. FIG. 13(A) shows an example kernel shape of the blur filter, FIG. 13(B) shows an example focus-plane extraction image, and FIG. 13(C) shows the blur filter at a position of interest.

To simulate the round bokeh produced by a lens, the blur filter kernel is approximately circular, as shown in FIG. 13(A), with constant filter weights (weighting coefficient values). In this embodiment the filter size is 5 × 5. The blur filter is applied to each pixel of the input frame image. In doing so, the blur processing unit 803 refers to the focus-plane extraction image and controls the filtering so that it is applied only to the background portion. FIG. 13(B) shows a case where the position of interest P is to the right of the subject. In this case, the blur filter performs the filter calculation over the positions marked ○ in FIG. 13(C). The positions marked × belong to the image region of the in-focus subject. Including them in the filter calculation would mix the subject into the background along the contour and could degrade image quality. The filtering is therefore controlled so that only the background portion is processed. This processing yields an image in which only the background region, excluding the in-focus subject, is blurred.
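The masked blur described above can be sketched in Python as follows. The exact kernel shape of FIG. 13(A) is not given in the text, so the sketch assumes a 5 × 5 window with its four corners removed as the "approximately circular" kernel; the function names are hypothetical.

```python
def circular_kernel_offsets(size=5):
    """Offsets of an approximately circular kernel (assumed shape: square minus corners)."""
    r = size // 2
    return [
        (dy, dx)
        for dy in range(-r, r + 1)
        for dx in range(-r, r + 1)
        if dy * dy + dx * dx <= r * r + 1
    ]


def blur_background(image, subject_mask, size=5):
    """Average background pixels with constant weights, skipping in-focus subject pixels."""
    h, w = len(image), len(image[0])
    offsets = circular_kernel_offsets(size)
    out = [list(row) for row in image]
    for y in range(h):
        for x in range(w):
            if subject_mask[y][x]:
                continue  # the in-focus subject itself is left untouched
            total = 0.0
            count = 0
            for dy, dx in offsets:
                qy, qx = y + dy, x + dx
                inside = 0 <= qy < h and 0 <= qx < w
                # Positions on the subject (the x marks in FIG. 13(C)) are excluded.
                if inside and not subject_mask[qy][qx]:
                    total += image[qy][qx]
                    count += 1
            out[y][x] = total / count  # count >= 1: the centre pixel is background
    return out


frame = [[255, 255, 10, 10, 10, 10] for _ in range(3)]
subject = [[1, 1, 0, 0, 0, 0] for _ in range(3)]
blurred = blur_background(frame, subject)
```

In this example the bright subject columns never leak into the neighbouring background pixels, which is the contour-mixing problem the text warns about. Passing a larger `size` increases the number of taps and therefore the blur strength.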

In S905 of FIG. 9, post-adjustment is performed. The user checks whether the blur strength suits their preference, and if it does not, the blur strength is adjusted, so that an image blurred to the degree the user wants is obtained. The blur strength can be adjusted by changing the number of taps of the filter kernel of FIG. 13(A). The post-blurring process executed by the image processing unit 105 is performed after shooting in response to a user instruction. Finally, the image data generated by the image processing unit 105 is recorded on the recording medium by the recording unit 109.

In this embodiment, whether additional information for region extraction is acquired along with the image is switched dynamically according to the distance difference between the main subject and the background. This reduces the burden that an enormous recording volume would place on the user while still acquiring the information for post-blurring in advance. By recording high-precision region extraction information only for frames that are likely to need post-processing, images can be recorded in a way that benefits the user while keeping the recording capacity down. In this embodiment, only the captured image is recorded, in the normal mode, when the distance difference between the main subject and the background is large, but this is not the only way to respond to the distance difference. For example, when the distance difference is equal to or greater than the threshold value, the frame may instead be recorded in the high quality mode, because background blurring may be wanted to further emphasize that difference.

In this embodiment, whether to acquire the additional information is determined from the result of comparing the distance difference between the main subject and the background with a threshold value. The determination is not limited to this; for example, depth information may be obtained from the F-number and used as one factor in the determination. Also, in this embodiment the distance map is calculated from the parallax of pupil-divided images. The method is not limited to this; for example, the distance may be obtained using a contrast AF (autofocus) evaluation value. The same applies to the embodiments described later.

[Second Embodiment]
Next, a second embodiment of the present invention is described. The first embodiment assumed that post-blurring is applied to a recorded image. The second embodiment assumes that region-specific gradation correction is applied afterwards to, for example, a frame image in which the main subject is dark because of backlighting. If a main subject darkened by backlighting and the remaining background region are processed with the same gradation conversion characteristic, the dark parts of the background become extremely bright and the image looks unnatural. The point of region-specific gradation correction is to correct the main subject region and the background region with separate gradation characteristics and thereby suppress this unnaturalness. This process, however, requires sophisticated region extraction.

Compared with the first embodiment, the processing in this embodiment differs in the operation of the recording mode control unit 104 and in the post-processing operation of the image processing unit 105. The description below focuses on the differences from the first embodiment; components that are the same as in the first embodiment are given the reference numerals already used, and their detailed description is omitted. The same manner of omission applies to the embodiments described later.

The operation of the recording mode control unit 104 is described with reference to FIGS. 14 and 15. FIG. 14 is a processing block diagram of the recording mode control unit 104 in the second embodiment. It differs from the configuration shown in FIG. 2 as follows.

The main subject Bv value calculation unit 1402 acquires the main subject region information from the main subject area detection unit 201 and calculates the Bv value of the main subject. The Bv value is an exposure value representing the luminance difference of a region of interest relative to its target luminance value. The background Bv value calculation unit 1403 acquires the background region information and calculates the Bv value of the background. The exposure step calculation unit 1404 acquires the Bv values of the main subject and the background and calculates the exposure step between them. The threshold value calculation unit 1405 calculates the threshold value for the exposure step from the Bv value of the main subject. The recording mode determination unit 1406 acquires the exposure step calculated by the exposure step calculation unit 1404 and the threshold value calculated by the threshold value calculation unit 1405, compares them, and determines the recording mode. Details of the processing of each unit are described later.

FIG. 15 is a flowchart illustrating the processing of the recording mode control unit 104.
First, in S1501, the main subject area detection unit 201 detects the main subject region in the image. Next, in S1502, the main subject Bv value calculation unit 1402 calculates the Bv value of the main subject. Denoting this value Bv_obj, it is calculated by equation (5) below.
Bv_obj = log2( Y_obj / Y_obj_target ) ... (5)
In equation (5), log2 is the base-2 logarithm. Y_obj is the representative luminance value of the main subject, calculated as the average of the luminance values in the main subject region. Y_obj_target is the target luminance value of the main subject region. The target luminance value is the luminance regarded as proper exposure and is determined in advance. It may be changed depending on whether the main subject is a person. From equation (5), the darker the main subject, the smaller the Bv value. For example, if the representative luminance value of the main subject is 1/2 of the target luminance value, the Bv value is -1; if it is twice the target luminance value, the Bv value is +1.

In S1503, the background Bv value calculation unit 1403 calculates the Bv value of the background. Denoting this value Bv_back, it is given by equation (6) below.
Bv_back = log2( Y_back / Y_back_target ) ... (6)
In equation (6), Y_back is the representative luminance value of the background, calculated as, for example, the average of the luminance values in the background region. Y_back_target is the target luminance value of the background region.

Next, in S1504, the exposure step calculation unit 1404 calculates the exposure step between the main subject region and the background region (denoted delta_Bv) by equation (7) below.
delta_Bv = | Bv_obj - Bv_back | ... (7)
As equations (5) to (7) show, the value of the exposure step delta_Bv increases as the background becomes brighter and the main subject darker, or vice versa; that is, as the D range (dynamic range) with respect to the main subject becomes wider. The narrower the D range, the smaller the value of delta_Bv. In this embodiment, the exposure step is used as the evaluation value indicating the D range of the frame.
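Equations (5) to (7) translate directly into a few lines of Python. The function names and the sample luminance values below are illustrative assumptions; only the formulas come from the text.

```python
import math


def bv_value(y_mean, y_target):
    """Equations (5) and (6): log2 of representative luminance over target luminance."""
    return math.log2(y_mean / y_target)


def exposure_step(y_obj, y_obj_target, y_back, y_back_target):
    """Equation (7): absolute difference between subject and background Bv values."""
    bv_obj = bv_value(y_obj, y_obj_target)
    bv_back = bv_value(y_back, y_back_target)
    return abs(bv_obj - bv_back)


print(bv_value(50, 100))                  # prints: -1.0 (half the target luminance)
print(bv_value(200, 100))                 # prints: 1.0 (twice the target luminance)
print(exposure_step(50, 100, 400, 100))   # prints: 3.0 (dark subject, bright background)
```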

Next, in S1505, the threshold value calculation unit 1405 calculates the threshold value for the exposure step from the main subject Bv value. This is described using the threshold calculation example of FIG. 16. In the graph of FIG. 16, the horizontal axis represents the main subject Bv value and the vertical axis represents the exposure step threshold. In the example of FIG. 16, the threshold corresponding to Bv_min is TH_min, and the threshold when the main subject Bv value is zero or greater is TH_max. The example uses first-order (linear) interpolation in the interval from Bv_min to zero, but second-order or higher-order interpolation may be used instead.

As FIG. 16 shows, qualitatively, when the main subject Bv value is negative, that is, when the main subject is darker than proper exposure, gradation correction is more likely to be needed later. The threshold value for the exposure step is therefore set small. When the main subject Bv value is near zero, the main subject is close to proper exposure, so gradation correction is less likely to be needed and the threshold value is set large. When the main subject Bv value is positive, the main subject is too bright. In this case the threshold value is set high (TH_max) so that the frame is excluded from this gradation correction processing.
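The threshold characteristic of FIG. 16 can be sketched as a piecewise-linear function. The numeric defaults for Bv_min, TH_min, and TH_max below are placeholders, not values from the text, and clamping to TH_min below Bv_min is an assumption, since the text only states the threshold at Bv_min itself.

```python
def exposure_step_threshold(bv_obj, bv_min=-3.0, th_min=1.0, th_max=4.0):
    """Threshold for the exposure step as a function of the main subject Bv value."""
    if bv_obj >= 0.0:
        return th_max  # properly exposed or too bright: high threshold
    if bv_obj <= bv_min:
        return th_min  # assumption: hold TH_min for subjects darker than Bv_min
    # First-order interpolation between (Bv_min, TH_min) and (0, TH_max).
    t = (bv_obj - bv_min) / (0.0 - bv_min)
    return th_min + t * (th_max - th_min)


print(exposure_step_threshold(-1.5))  # prints: 2.5 (midway between TH_min and TH_max)
```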

In S1506, the recording mode determination unit 1406 compares the threshold value calculated by the threshold value calculation unit 1405 with the exposure step and determines the recording mode. If the exposure step is equal to or greater than the threshold value, the recording mode determination unit 1406 determines that gradation correction is likely to be needed later and proceeds to S1507 to perform recording in the high quality mode. In the high quality mode, the frame image, the distance map, and the RAW image data are recorded. The RAW image before image processing is recorded so that gradation correction can be performed before nonlinear processing such as gamma conversion. If, on the other hand, the exposure step is less than the threshold value, the recording mode determination unit 1406 determines that gradation correction will not be needed later and proceeds to S1508 to perform recording in the normal mode. The processing of the recording mode control unit 104 is performed at the time of shooting.
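The decision in S1506 to S1508 differs from the first embodiment in the direction of the comparison and in the data recorded. A minimal Python sketch, with a hypothetical function name and item labels, is:

```python
def items_to_record_second_embodiment(delta_bv, threshold):
    """S1506: choose what to record from the exposure step and its threshold."""
    if delta_bv >= threshold:
        # High quality mode (S1507): keep the RAW image so gradation correction
        # can run before nonlinear processing such as gamma conversion.
        return ("frame", "distance_map", "raw")
    # Normal mode (S1508): gradation correction is not expected to be needed.
    return ("frame",)
```

For example, an exposure step of 3.0 against a threshold of 2.5 selects the high quality mode, while an exposure step of 1.0 against the same threshold selects the normal mode.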

Next, the region-specific gradation correction process performed after recording in response to a user instruction is described. This process is performed mainly by the image processing unit 105, but the operation unit 107, which accepts user instructions, and the system control unit 106 are also involved. The region-specific gradation correction process performed after recording is referred to below as the "post-correction process". The post-correction process is described with reference to FIGS. 17 and 18.

FIG. 17 is a processing block diagram of the image processing unit 105 for the post-correction process. It differs from the configuration shown in FIG. 8 as follows.
The first gradation characteristic calculation unit 1703 acquires the frame image data and calculates the first gradation characteristic, which is the gradation characteristic for the main subject region. The second gradation characteristic calculation unit 1704 acquires the frame image data and calculates the second gradation characteristic, which is the gradation characteristic for the background region. The first gradation correction unit 1705 applies gradation correction to the input frame image using the first gradation characteristic. The second gradation correction unit 1706 applies gradation correction to the input frame image using the second gradation characteristic. The compositing unit 1707 acquires the in-focus subject information from the focus subject extraction unit 802, the first image corrected by the first gradation correction unit 1705, and the second image corrected by the second gradation correction unit 1706, and composites them. Details of the processing of each unit are described later.

FIG. 18 is a flowchart illustrating the post-correction process.
First, in S1801, the imaging device accepts an instruction for the post-correction process from the user. FIG. 19 shows a display example: for a frame recorded in the high quality mode, for which post-correction is possible, the display unit 108 indicates to the user that the post-correction process is available. A display and input process also asks the user whether to perform the post-correction process. If the user chooses to perform gradation correction after shooting during moving image playback, the system control unit 106 receives the instruction from the operation unit 107 and instructs the image processing unit 105 to perform the processing from S1802 onward.

Ｓ１８０２において、距離マップ整形部８０１が距離マップの整形処理を行い、Ｓ１８０３において、ピント被写体抽出部８０２がピント被写体を抽出する。Ｓ１８０４にて第１階調特性算出部１７０３は第１階調特性を算出する。Ｓ１８０５にて第２階調特性算出部１７０４は第２階調特性を算出する。図２０を参照して、第１階調特性および第２階調特性の算出処理を説明する。図２０（Ａ）は入力フレーム画像を例示し、画像内の主被写体領域と背景領域を示す。つまり主被写体領域検出部２０１により、入力フレーム画像にて画像内の主被写体領域と背景領域とが分離して検出される。主被写体領域と背景領域に対し、別々の階調補正処理が実施される。まず、主被写体領域に対しては、一律のゲイン処理が行われる。一律のゲイン処理とする理由は、本機能が適用される場合に、逆光や日陰等で主被写体が一様に暗くなっている可能性が高いからである。一方、背景領域に対しては、輝度別のゲイン処理が行われる。その理由は、一般に背景領域には様々な輝度を持つ被写体が存在し、Ｄレンジが広いからである。各領域の階調性を損なわないように階調圧縮が実行される。なお、ゲイン特性に関しては、以上の考え方に限定されず、任意の形状の特性をとりうるものとする。 In S1802, the distance map shaping unit 801 performs the distance map shaping process, and in S1803, the focus subject extraction unit 802 extracts the in-focus subject. In S1804, the first gradation characteristic calculation unit 1703 calculates the first gradation characteristic. In S1805, the second gradation characteristic calculation unit 1704 calculates the second gradation characteristic. The calculation of the first and second gradation characteristics will be described with reference to FIG. 20. FIG. 20(A) illustrates an input frame image and shows the main subject area and the background area in the image. That is, the main subject area detection unit 201 detects the main subject area and the background area separately in the input frame image. Separate gradation correction processing is performed on the main subject area and the background area. First, uniform gain processing is performed on the main subject area. Uniform gain is used because, when this function is applied, the main subject is likely to be uniformly dark due to backlight, shade, or the like. For the background area, on the other hand, gain processing that depends on luminance is performed. This is because the background area generally contains subjects of various brightness levels and therefore has a wide dynamic range (D range). Gradation compression is executed so as not to impair the gradation of each region. 
The gain characteristics are not limited to the above approach and may take any shape.

図２０（Ｂ）は、主被写体領域における、入力輝度（横軸）に対するゲイン（縦軸）の特性を例示する。ゲインは一定値（Gain_objと記す）をとる。主被写体領域の平均輝度値Y_objが適正露出とされる目標輝度値Y_obj_targetとなるように補正するために、ゲインGain_objは、下記（８）式により算出される。
Gain_obj = Y_obj_target / Y_obj ・・・（８） FIG. 20B illustrates the characteristics of the gain (vertical axis) with respect to the input luminance (horizontal axis) in the main subject area. The gain takes a constant value (denoted as Gain_obj). The gain Gain_obj is calculated by the following equation (8) in order to correct the average luminance value Y_obj of the main subject area to be the target luminance value Y_obj_target for proper exposure.
Gain_obj = Y_obj_target / Y_obj ・・・ (8)

図２０（Ｃ）は、図２０（Ｂ）に示すゲイン特性で処理を行った場合の入出力輝度の特性を実線のグラフで示す。つまり、この特性は第１階調特性である。点線は、入力輝度値と出力輝度値との比が１：１の場合を示す。第１階調特性のグラフ線の傾斜は、点線で示すグラフ線の傾斜よりも大きい。 FIG. 20C is a solid line graph showing the characteristics of the input / output luminance when the processing is performed with the gain characteristics shown in FIG. 20B. That is, this characteristic is the first gradation characteristic. The dotted line shows the case where the ratio of the input luminance value and the output luminance value is 1: 1. The slope of the graph line of the first gradation characteristic is larger than the slope of the graph line shown by the dotted line.
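
As an illustration of equation (8) and the resulting first gradation characteristic, the following Python sketch computes the constant gain and applies it to a few input luminance values. The function names, the 8-bit luminance range, and the clipping at the maximum value are assumptions made for this example; the description does not specify them.

```python
def uniform_gain(y_obj_mean, y_obj_target):
    """Eq. (8): constant gain that brings the main subject's mean luminance to the target."""
    if y_obj_mean <= 0:
        raise ValueError("mean luminance must be positive")
    return y_obj_target / y_obj_mean


def first_gradation(y_in, gain_obj, y_max=255.0):
    """Output luminance for the main subject area.

    Clipping at y_max is an assumption of this sketch, not part of Eq. (8).
    """
    return min(y_in * gain_obj, y_max)


# A dark main subject (mean 40) is lifted to a target of 80, so the gain is 2.
gain_obj = uniform_gain(y_obj_mean=40.0, y_obj_target=80.0)
curve = [first_gradation(y, gain_obj) for y in (0, 50, 100, 200)]
```

Because the gain is greater than 1 in this case, the resulting input/output line is steeper than the 1:1 line until it reaches the clipping level.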

図２０（Ｄ）は、背景領域の輝度ヒストグラムにて代表輝度値を例示する。横軸は入力輝度を表し、縦軸は頻度（度数）を表す。図２０（Ｅ）は、背景領域における、入力輝度に対するゲイン特性を例示する。ゲインは入力輝度に応じて変化する。第２階調特性算出部１７０４はゲイン特性の算出前に、入力輝度の暗部側と明部側の代表輝度値を算出する。算出処理では、図２０（Ｄ）に示すように、背景領域の輝度ヒストグラムが取得される。代表輝度値Y_back_lowは、最小輝度から所定割合の画素数をカウントした場合に算出される、暗部の代表輝度値である。代表輝度値Y_back_highは、最大輝度から所定割合の画素数をカウントした場合に算出される、明部の代表輝度値である。背景領域の平均輝度値Y_backが適正露出とされる目標輝度値Y_back_targetとなるように、背景領域に対する最大ゲイン（Gain_backと記す）は、下記（９）式により算出される。
Gain_back = Y_back_target / Y_back ・・・（９） FIG. 20(D) illustrates the representative luminance values on the luminance histogram of the background region. The horizontal axis represents the input luminance, and the vertical axis represents the frequency (pixel count). FIG. 20(E) illustrates the gain characteristic with respect to the input luminance in the background region. The gain changes according to the input luminance. Before calculating the gain characteristic, the second gradation characteristic calculation unit 1704 calculates representative luminance values on the dark side and the bright side of the input luminance. In this calculation, as shown in FIG. 20(D), a luminance histogram of the background region is acquired. The representative luminance value Y_back_low is the dark-side representative luminance value, obtained by counting a predetermined ratio of the pixels upward from the minimum luminance. The representative luminance value Y_back_high is the bright-side representative luminance value, obtained by counting a predetermined ratio of the pixels downward from the maximum luminance. The maximum gain for the background area (denoted as Gain_back) is calculated by the following equation (9) so that the average luminance value Y_back of the background region becomes the target luminance value Y_back_target for proper exposure.
Gain_back = Y_back_target / Y_back ・・・ (9)

図２０（Ｅ）に示すゲイン特性では、入力輝度値がY_back_low以下の区間にてGain_backが一定値であり、入力輝度値がY_back_high以上の区間にて最小ゲイン量が１となる。Y_back_lowとY_back_highとの間の区間では、入力輝度値に応じて単調減少となる特性である。図２０（Ｆ）は、図２０（Ｅ）に示すゲイン特性で処理を行った場合の入出力輝度の特性を実線のグラフで示す。この特性は第２階調特性である。第２階調特性を表す実線のグラフは、Y_back_lowにて上側に突出した折れ線形状であって、Y_back_high以上の区間では点線のグラフ線（入力輝度値と出力輝度値との比が１：１の場合）に一致する形状である。 In the gain characteristic shown in FIG. 20(E), the gain is constant at Gain_back in the section where the input luminance value is Y_back_low or less, and takes its minimum value of 1 in the section where the input luminance value is Y_back_high or more. In the section between Y_back_low and Y_back_high, the gain decreases monotonically as the input luminance value increases. FIG. 20(F) is a solid-line graph showing the input/output luminance characteristic when processing is performed with the gain characteristic shown in FIG. 20(E). This characteristic is the second gradation characteristic. The solid-line graph representing the second gradation characteristic is a polygonal line that bulges upward at Y_back_low and, in the section at or above Y_back_high, coincides with the dotted line (the case where the ratio of the input luminance value to the output luminance value is 1:1).
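
The background gain characteristic can be sketched in Python as follows. This is only an illustration: the helper names are hypothetical, the histogram is a toy 8-level example, and the linear ramp between the two representative luminance values is an assumption (the description only requires a monotonic decrease in that section).

```python
def representative_luminances(hist, ratio):
    """Return (y_low, y_high) for a luminance histogram.

    y_low is the level reached after counting `ratio` of all pixels upward from
    the darkest level; y_high is the level reached counting downward from the
    brightest level.
    """
    need = sum(hist) * ratio
    count, y_low = 0, 0
    for level, n in enumerate(hist):
        count += n
        if count >= need:
            y_low = level
            break
    count, y_high = 0, len(hist) - 1
    for level in range(len(hist) - 1, -1, -1):
        count += hist[level]
        if count >= need:
            y_high = level
            break
    return y_low, y_high


def background_gain(y_in, gain_back, y_low, y_high):
    """Gain for the background area as a function of input luminance."""
    if y_in <= y_low:
        return gain_back          # constant maximum gain on the dark side
    if y_in >= y_high:
        return 1.0                # no gain on the bright side
    t = (y_in - y_low) / (y_high - y_low)
    return gain_back + (1.0 - gain_back) * t   # assumed linear ramp


hist = [10, 0, 30, 20, 20, 10, 0, 10]          # toy histogram, 100 pixels
y_low, y_high = representative_luminances(hist, 0.25)
gains = [background_gain(y, 2.0, y_low, y_high) for y in range(len(hist))]
```

Multiplying each input luminance by its gain gives the second gradation characteristic, which bends at `y_low` and rejoins the 1:1 line from `y_high` upward.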

図１８のＳ１８０６では、第１階調補正部１７０５が第１階調特性を用いて、入力フレーム画像の階調補正処理を行う。Ｓ１８０７では、第２階調補正部１７０６が第２階調特性を用いて、入力フレーム画像の階調補正処理を行う。これらの処理は、入力フレーム画像の輝度値を、図２０（Ｃ）、（Ｆ）に例示した階調変換特性でそれぞれ変換する処理である。次のＳ１８０８で合成部１７０７は、階調補正処理が行われた２画像の合成を行う。図２１を参照して、合成部１７０７の処理を説明する。 In S1806 of FIG. 18, the first gradation correction unit 1705 uses the first gradation characteristic to perform gradation correction processing of the input frame image. In S1807, the second gradation correction unit 1706 uses the second gradation characteristic to perform gradation correction processing of the input frame image. These processes are processes for converting the luminance value of the input frame image with the gradation conversion characteristics illustrated in FIGS. 20 (C) and 20 (F), respectively. In the next S1808, the synthesizing unit 1707 synthesizes the two images that have undergone the gradation correction process. The processing of the synthesis unit 1707 will be described with reference to FIG.

図２１（Ａ）は入力フレーム画像を例示し、図２１（Ｂ）は、ピント被写体抽出部８０２が抽出したピント面画像を例示する。合成処理にて、図２１（Ｂ）に白色領域で示した画像内のピント面領域に対しては、第１階調特性で階調補正を行った画像が出力される。また、図２１（Ｂ）に黒色領域で示した画像内の非ピント面領域（背景領域）に対しては、第２階調特性で階調補正を行った画像が出力される。この処理を行うと、図２１（Ａ）の入力フレーム画像に対し、合成処理後の画像は図２１（Ｃ）に示す画像となる。図２１（Ｃ）は、ピント面である主被写体領域と、非ピント面領域である背景領域に対し、それぞれに異なる階調変換特性で階調補正処理が行われた画像を示す。最後に、領域別階調補正が行われた画像のデータは記録部１０９によって記録媒体に記録される。 FIG. 21 (A) illustrates an input frame image, and FIG. 21 (B) illustrates a focus plane image extracted by the focus subject extraction unit 802. In the compositing process, an image in which gradation correction is performed with the first gradation characteristic is output for the focus plane region in the image shown in the white region in FIG. 21 (B). Further, for the non-focus plane region (background region) in the image shown by the black region in FIG. 21B, an image in which gradation correction is performed by the second gradation characteristic is output. When this processing is performed, the image after the composition processing becomes the image shown in FIG. 21 (C) with respect to the input frame image of FIG. 21 (A). FIG. 21C shows an image in which gradation correction processing is performed on a main subject region, which is a focus plane, and a background region, which is a non-focus plane region, with different gradation conversion characteristics. Finally, the data of the image to which the gradation correction for each region is performed is recorded on the recording medium by the recording unit 109.
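
The compositing step can be sketched as a per-pixel selection driven by the focus plane image. In this Python illustration, images are plain nested lists of luminance values and the mask uses 1 for the focus plane region (white) and 0 for the background (black); these representations and the function name are assumptions made for the example.

```python
def composite(first_img, second_img, focus_mask):
    """Pick the first-characteristic pixel where the mask is set, else the second."""
    return [
        [a if m else b for a, b, m in zip(row_a, row_b, row_m)]
        for row_a, row_b, row_m in zip(first_img, second_img, focus_mask)
    ]


first_img = [[200, 200], [200, 200]]    # corrected with the first gradation characteristic
second_img = [[90, 90], [90, 90]]       # corrected with the second gradation characteristic
focus_mask = [[1, 0], [0, 1]]           # 1 = focus plane (main subject), 0 = background
result = composite(first_img, second_img, focus_mask)
```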

本実施形態では、主被写体領域と背景領域との露出段差（明るさの差）に応じて、領域抽出のための追加情報を同時に取得しておくか否かが動的に切り替えられる。このため、記録量が膨大になることによるユーザへの負担を軽減し、事後補正処理（階調補正）用の情報を取得しておくことができる。本実施形態では、主被写体領域と背景領域との露出段差が閾値より大きいときに高品質モードで画像および深度分布情報を記録した。しかしこれに限らず、例えば主被写体領域と背景領域との露出段差が閾値より小さいときの方が大きいときよりも被写体が適切な明るさでそれぞれ撮れている、として高品質モードで記録してもよい。このとき露出段差が閾値よりも大きいときには通常モードで深度分布情報を記録せずに画像を記録する。 In the present embodiment, whether or not to simultaneously acquire additional information for area extraction is dynamically switched according to the exposure step (difference in brightness) between the main subject area and the background area. This reduces the burden that an enormous recording volume would place on the user, while still acquiring the information needed for post-correction processing (gradation correction). In this embodiment, the image and the depth distribution information are recorded in the high quality mode when the exposure step between the main subject area and the background area is larger than the threshold value. However, the present invention is not limited to this. For example, recording may instead be performed in the high quality mode when the exposure step is smaller than the threshold value, on the grounds that the main subject and the background are each captured with more appropriate brightness than when the step is large. In that case, when the exposure step is larger than the threshold value, the image is recorded in the normal mode without recording the depth distribution information.

［第３実施形態］
次に本発明の第３実施形態を説明する。本実施形態では、第１実施形態で説明した事後ぼかし処理を前提とし、特に情報の記録時において、記録容量をさらに削減することを目的とする。第１実施形態では、主被写体と背景との距離差に基づいて、例えば、背景の画像が十分にぼけていないと判断されたフレームに対して、後処理用の情報が取得される。しかし、撮影シーンによっては、常に主被写体と背景との距離が近い場合があり得る。そのような場合、ほぼ全フレームにわたって後処理用の情報が取得されてしまう。その結果、撮影された画像の記録容量が膨大になる可能性がある。そこで、本実施形態では、記録容量を適正に保つために、高品質モードで記録するフレームをさらに絞り込む処理について説明する。すなわち、本実施形態において後述する記録容量の削減の必要があるかの判定と被写体のスコア判定による記録モードの判定は、その一部あるいは全部を第１および第２の実施形態にそれぞれ組み合わせて実行され得るものである。 [Third Embodiment]
Next, a third embodiment of the present invention will be described. The present embodiment presupposes the post-blurring process described in the first embodiment, and aims to further reduce the recording capacity, particularly at the time of recording information. In the first embodiment, information for post-processing is acquired, based on the distance difference between the main subject and the background, for frames in which, for example, the background image is judged to be insufficiently blurred. However, depending on the shooting scene, the main subject and the background may remain close to each other throughout. In such a case, information for post-processing is acquired for almost all frames, and as a result the recording capacity of the captured images may become enormous. Therefore, the present embodiment describes a process of further narrowing down the frames to be recorded in the high quality mode in order to keep the recording capacity appropriate. Note that the determination of whether the recording capacity needs to be reduced and the determination of the recording mode based on the subject score, both described later in this embodiment, may be combined in part or in whole with the first and second embodiments.

図２２は、本実施形態の撮像装置に適用可能な構成を示すブロック図である。図１に示す構成との相違は、記録情報調整部２２１０が追加されていることである。図２３は、記録情報調整部２２１０の処理を示すフローチャートである。図２３を参照して、記録情報調整部２２１０について説明する。以下の処理は、記録モード制御部１０４により制御される記録モードにおいて、一連の動画データを記録部１０９が記録した後に実行される。 FIG. 22 is a block diagram showing a configuration applicable to the imaging device of the present embodiment. The difference from the configuration shown in FIG. 1 is that the recording information adjusting unit 2210 is added. FIG. 23 is a flowchart showing the processing of the recording information adjusting unit 2210. The recording information adjusting unit 2210 will be described with reference to FIG. 23. The following processing is executed after the recording unit 109 records a series of moving image data in the recording mode controlled by the recording mode control unit 104.

Ｓ２３０１において記録情報調整部２２１０は、画像の記録容量（MEMと記す）を取得し、Ｓ２３０２で最大記録容量（MEM_MAXと記す）を算出する。最大記録容量MEM_MAXは、下記式（１０）により算出される。
MEM_MAX = MEM_FRAME×NUM_FRAME× k1 × k2 ・・・（１０）
（１０）式において、MEM_FRAMEは、１フレームあたりの記録量であり、この場合にはフレーム間圧縮やフレーム内圧縮は行わないものとする。NUM_FRAMEは、処理対象である動画の総フレーム数である。k1は所定の圧縮レートであり、１以下の値をとる。実際の圧縮率は撮影シーンに依存して変わるが、ここでは所定の圧縮率とする。k2は付加情報の記録による容量増加率の許容値であり、１以上の値をとる。 In S2301, the recording information adjusting unit 2210 acquires the recording capacity (denoted as MEM) of the image, and calculates the maximum recording capacity (denoted as MEM_MAX) in S2302. The maximum recording capacity MEM_MAX is calculated by the following formula (10).
MEM_MAX = MEM_FRAME × NUM_FRAME × k1 × k2 ・・・ (10)
In equation (10), MEM_FRAME is the recording amount per frame, and in this case, inter-frame compression and intra-frame compression are not performed. NUM_FRAME is the total number of frames of the moving image to be processed. k1 is a predetermined compression rate and takes a value of 1 or less. The actual compression rate changes depending on the shooting scene, but here it is set to a predetermined compression rate. k2 is an allowable value of the capacity increase rate by recording additional information, and takes a value of 1 or more.

次のＳ２３０３で記録情報調整部２２１０は、記録容量MEMと最大記録容量MEM_MAXを比較する。MEMがMEM_MAX以下である場合、記録情報の調整は行われずに処理を終了する。また、MEMがMEM_MAXよりも大きい場合、記録情報調整部２２１０は記録情報の調整を行う必要があると判断し、Ｓ２３０４に処理を進める。 In the next S2303, the recording information adjusting unit 2210 compares the recording capacity MEM with the maximum recording capacity MEM_MAX. If MEM is less than or equal to MEM_MAX, the processing ends without adjusting the recorded information. If the MEM is larger than MEM_MAX, the recording information adjusting unit 2210 determines that it is necessary to adjust the recorded information, and proceeds to S2304.
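
Steps S2302 and S2303 amount to one multiplication and one comparison. The Python sketch below illustrates equation (10) with made-up numbers; the function names, the byte unit, and the sample values of k1 and k2 are assumptions for the example only.

```python
def max_recording_capacity(mem_frame, num_frame, k1, k2):
    """Eq. (10): MEM_MAX = MEM_FRAME x NUM_FRAME x k1 x k2."""
    if not 0 < k1 <= 1:
        raise ValueError("k1 (compression rate) must be in (0, 1]")
    if k2 < 1:
        raise ValueError("k2 (allowed increase rate) must be 1 or more")
    return mem_frame * num_frame * k1 * k2


def needs_adjustment(mem, mem_max):
    """S2303: adjustment is needed only when MEM exceeds MEM_MAX."""
    return mem > mem_max


# 4 MB per uncompressed frame, 600 frames, compression to 25 %,
# and up to 50 % extra allowed for additional information.
mem_max = max_recording_capacity(4_000_000, 600, k1=0.25, k2=1.5)
```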

本実施形態では、記録容量を削減する必要があると判断された場合、主被写体のサイズ情報および位置情報に基づいて、高品質モードで記録するフレームをさらに絞り込む処理が行われる。その理由としては、ユーザが撮影後に背景をぼかして、静止画としても記録しておきたいと思うフレームは、主被写体が良好な状態で写っているフレームであることによる。絞り込み処理によって取得されるフレームは、具体的には、主被写体の画像領域が撮像された画像中心の近くに存在し、主被写体のサイズが大きく写っているフレームである。 In the present embodiment, when it is determined that it is necessary to reduce the recording capacity, a process of further narrowing down the frames to be recorded in the high quality mode is performed based on the size information and the position information of the main subject. The reason is that the frame in which the user wants to blur the background after shooting and record it as a still image is a frame in which the main subject is captured in good condition. Specifically, the frame acquired by the narrowing down process is a frame in which the image area of the main subject exists near the center of the captured image and the size of the main subject is large.

図２３のＳ２３０４で記録情報調整部２２１０は、主被写体のサイズ情報および位置情報を取得し、次のＳ２３０５にて、主被写体のサイズ情報と位置情報に基づいて主被写体スコアを算出する。なお、Ｓ２３０４以降の処理は、記録した全フレームに対して行われるものとする。図２４を参照して、Ｓ２３０４およびＳ２３０５の処理を説明する。図２４（Ａ）は入力フレーム画像を例示する。主被写体領域の重心位置の座標を(X,Y)と表記し、主被写体領域の高さをHeightと表記し、主被写体領域の幅をWidthと表記する。 In S2304 of FIG. 23, the recording information adjusting unit 2210 acquires the size information and the position information of the main subject, and in the next S2305, calculates the main subject score based on the size information and the position information of the main subject. It is assumed that the processing after S2304 is performed on all the recorded frames. The processing of S2304 and S2305 will be described with reference to FIG. 24. FIG. 24A illustrates an input frame image. The coordinates of the position of the center of gravity of the main subject area are expressed as (X, Y), the height of the main subject area is expressed as Height, and the width of the main subject area is expressed as Width.

Ｓ２３０４では、図２４（Ａ）に示す入力フレーム画像から、主被写体領域を矩形状に抽出し、幅Widthと高さHeightを取得する処理が実行される。さらに、抽出された主被写体領域の重心位置座標(X,Y)が取得される。取得した情報から、面積に相当する正規化サイズ（Sizeと記す）が、下記（１１）式により算出される。
Size = Width × Height / Size_all ・・・（１１） In S2304, the main subject area is extracted as a rectangle from the input frame image shown in FIG. 24(A), and its width Width and height Height are acquired. The center-of-gravity coordinates (X, Y) of the extracted main subject area are also acquired. From the acquired information, the normalized size (denoted as Size), which corresponds to the area, is calculated by the following equation (11).
Size = Width × Height / Size_all ・・・ (11)

（１１）式のSize_allは、画像サイズに依存しないように正規化するための正規化係数である。例えば、Size_allを画像全体の面積とする。この場合、Sizeは主被写体領域の面積が画像全体の面積に占める割合を示す。また、画像中央位置の座標を(Xc,Yc)と表記した場合、(Xc,Yc)から主被写体領域の重心位置座標(X,Y)までの正規化距離（Distと記す）は、下記（１２）式により算出される。
Size_all in Eq. (11) is a normalization coefficient that makes the result independent of the image size. For example, let Size_all be the area of the entire image. In this case, Size indicates the ratio of the area of the main subject area to the area of the entire image. When the coordinates of the center position of the image are expressed as (Xc, Yc), the normalized distance (denoted as Dist) from (Xc, Yc) to the center-of-gravity coordinates (X, Y) of the main subject area is calculated by the following equation (12).
Dist = √((X − Xc)^2 + (Y − Yc)^2) / R ・・・ (12)

（１２）式のＲは、画像中央位置から画像端部までの距離に相当する正規化係数である。つまり、正規化距離Distは画像中央位置から画像端部までの距離に対する、座標(Xc,Yc)と(X,Y)との距離差の割合を示す。 R in Eq. (12) is a normalization coefficient corresponding to the distance from the center position of the image to the edge of the image. That is, the normalized distance Dist indicates the ratio of the distance difference between the coordinates (Xc, Yc) and (X, Y) to the distance from the center position of the image to the edge of the image.
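
The two normalized quantities can be illustrated with a short Python sketch. It assumes that Size_all is the full image area, that equation (12) is the Euclidean distance between (Xc, Yc) and (X, Y) divided by R, and that R is the distance from the image center to a corner; the last choice in particular is an assumption, since the description only says "image edge".

```python
import math


def normalized_size(width, height, img_w, img_h):
    """Eq. (11) with Size_all taken as the area of the whole image."""
    return (width * height) / (img_w * img_h)


def normalized_distance(x, y, img_w, img_h):
    """Assumed form of Eq. (12): distance from the image center, divided by R."""
    xc, yc = img_w / 2.0, img_h / 2.0
    r = math.hypot(xc, yc)        # assumption: R = center-to-corner distance
    return math.hypot(x - xc, y - yc) / r


size = normalized_size(960, 540, 1920, 1080)        # subject covers a quarter of the frame
dist_center = normalized_distance(960, 540, 1920, 1080)
dist_corner = normalized_distance(0, 0, 1920, 1080)
```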

次にＳ２３０５にて、主被写体スコアが算出される。図２４（Ｂ）および（Ｃ）を参照して説明する。図２４（Ｂ）は、距離スコアの算出特性を例示する。横軸は正規化距離Distを表し、縦軸は距離スコア（Score_Distと記す）を表す。図２４（Ｃ）は、サイズスコアの算出特性を例示する。横軸は正規化サイズSizeを表し、縦軸はサイズスコア（Score_Sizeと記す）を表す。 Next, in S2305, the main subject score is calculated. This will be described with reference to FIGS. 24 (B) and 24 (C). FIG. 24B illustrates the calculation characteristics of the distance score. The horizontal axis represents the normalized distance Dist, and the vertical axis represents the distance score (denoted as Score_Dist). FIG. 24C illustrates the calculation characteristics of the size score. The horizontal axis represents the normalized size Size, and the vertical axis represents the size score (denoted as Score_Size).

本実施形態では、まず、算出された正規化距離情報および正規化サイズ情報から、図２４（Ｂ）および（Ｃ）に示す特性により、距離スコアScore_DistおよびサイズスコアScore_Sizeがそれぞれ算出される。図２４（Ｂ）に示す距離スコアScore_Distの特性に関しては、被写体の画像が画像中央部分に近いほどスコアを大きくするために、Distに対する単調減少の特性となる。図２４（Ｂ）は、２点間を一次式で線形補間した特性を例示する。正規化距離Distに対する第１の閾値D1よりDist値が小さい範囲では、距離スコアScore_Distが一定である。また正規化距離Distに対する第２の閾値D2よりDist値が大きい範囲では、距離スコアScore_Distが一定である。Dist値が第１の閾値以上であって、かつ第２の閾値以下である場合には、Dist値の増加につれて距離スコアScore_Distの値が線形的に減少する。 In the present embodiment, first, the distance score Score_Dist and the size score Score_Size are calculated from the calculated normalized distance information and the normalized size information according to the characteristics shown in FIGS. 24 (B) and 24 (C), respectively. Regarding the characteristic of the distance score Score_Dist shown in FIG. 24 (B), the closer the image of the subject is to the central portion of the image, the larger the score, so that the characteristic is monotonically decreasing with respect to Dist. FIG. 24B exemplifies the characteristic of linear interpolation between two points by a linear equation. In the range where the Dist value is smaller than the first threshold value D1 for the normalized distance Dist, the distance score Score_Dist is constant. Further, in the range where the Dist value is larger than the second threshold value D2 for the normalized distance Dist, the distance score Score_Dist is constant. When the Dist value is equal to or greater than the first threshold value and equal to or less than the second threshold value, the value of the distance score Score_Dist decreases linearly as the Dist value increases.

図２４（Ｃ）に示すサイズスコアScore_Sizeの特性に関しては、被写体の画像サイズが大きくなるほどスコアを大きくするために、Sizeに対する単調増加の特性となる。図２４（Ｃ）は、２点間を一次式で線形補間した特性を例示する。正規化サイズSizeに対する第１の閾値S1よりSize値が小さい範囲では、サイズスコアScore_Sizeが一定である。また正規化サイズSizeに対する第２の閾値S2よりSize値が大きい範囲では、サイズスコアScore_Sizeが一定である。Size値が第１の閾値以上であって、かつ第２の閾値以下である場合には、Size値の増加につれてサイズスコアScore_Sizeの値が線形的に増加する。
図２４（Ｂ）および（Ｃ）に示す特性は例示であり、３点以上を設定して補間処理を行ってもよい。 Regarding the characteristic of the size score Score_Size shown in FIG. 24C, the score increases as the image size of the subject increases, so that the characteristic increases monotonically with respect to Size. FIG. 24C exemplifies a characteristic in which two points are linearly interpolated by a linear equation. The size score Score_Size is constant in the range where the Size value is smaller than the first threshold value S1 for the normalized size Size. Further, in the range where the Size value is larger than the second threshold value S2 with respect to the normalized size Size, the size score Score_Size is constant. When the Size value is equal to or greater than the first threshold value and equal to or less than the second threshold value, the value of the size score Score_Size increases linearly as the Size value increases.
The characteristics shown in FIGS. 24 (B) and 24 (C) are examples, and the interpolation process may be performed by setting three or more points.

次に、算出された距離スコアScore_DistとサイズスコアScore_Sizeから、主被写体スコア（Scoreと記す）が、下記（１３）式により算出される。
Score = w_d × Score_Dist + w_s × Score_Size ・・・（１３）
（１３）式において、w_dとw_sはそれぞれ任意の重み付け係数である。 Next, from the calculated distance score Score_Dist and size score Score_Size, the main subject score (denoted as Score) is calculated by the following equation (13).
Score = w_d × Score_Dist + w_s × Score_Size ・・・ (13)
In equation (13), w_d and w_s are arbitrary weighting coefficients, respectively.
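
The score calculation of S2305 can be sketched in Python as two clamped linear ramps followed by the weighted sum of equation (13). The threshold values D1, D2, S1, S2, the score range of 0 to 1, and the equal weights are illustrative assumptions; the description leaves them open.

```python
def ramp(x, x1, x2, y1, y2):
    """Piecewise-linear characteristic: y1 below x1, y2 above x2, linear in between."""
    if x <= x1:
        return y1
    if x >= x2:
        return y2
    return y1 + (y2 - y1) * (x - x1) / (x2 - x1)


def main_subject_score(dist, size, w_d=0.5, w_s=0.5,
                       d1=0.2, d2=0.8, s1=0.05, s2=0.5):
    """Eq. (13): Score = w_d * Score_Dist + w_s * Score_Size."""
    score_dist = ramp(dist, d1, d2, 1.0, 0.0)   # decreases with distance from center
    score_size = ramp(size, s1, s2, 0.0, 1.0)   # increases with subject size
    return w_d * score_dist + w_s * score_size


best = main_subject_score(dist=0.0, size=0.6)    # centered, large subject
worst = main_subject_score(dist=1.0, size=0.0)   # at the edge, tiny subject
middle = main_subject_score(dist=0.5, size=0.6)
```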

図２３のＳ２３０６にて記録情報調整部２２１０は、高品質の記録フレームの絞り込み処理を行う。Ｓ２３０５において全フレームに亘って主被写体スコアScoreが算出されている。記録情報調整部２２１０は、主被写体スコアScoreを所定の閾値（TH_Scoreと記す）を比較する。主被写体スコアScoreの値が閾値TH_Scoreを超えているフレームに対し、高品質モードでの記録処理が実行される。主被写体スコアScoreの値が閾値以下であるフレームについては通常モードとなり、付加情報は削除されるので記録されない。図２５を参照して具体的に説明する。横軸は閾値TH_Scoreを表し、縦軸は記録容量MEMを表す。閾値TH_Scoreが大きくなるほど、高品質のフレーム画像の数（フレーム数）は減っていくため、記録容量MEMが小さくなる。記録容量MEMが、Ｓ２３０２で算出された上限値MEM_MAXを下回る最大の閾値をTH_Score_minとする。閾値TH_Score_minを用いて高品質フレームの絞り込みを行うことによって、動画の記録容量をMEM_MAX以下に抑えつつ、動画を取得できる。 In S2306 of FIG. 23, the recording information adjusting unit 2210 narrows down the frames to be recorded in high quality. By S2305, the main subject score Score has been calculated for all frames. The recording information adjusting unit 2210 compares the main subject score Score with a predetermined threshold value (denoted as TH_Score). Frames whose Score exceeds TH_Score are recorded in the high quality mode. Frames whose Score is at or below the threshold are treated as normal-mode frames; their additional information is deleted and is not recorded. A specific description will be given with reference to FIG. 25, in which the horizontal axis represents the threshold value TH_Score and the vertical axis represents the recording capacity MEM. As TH_Score increases, the number of high-quality frames decreases, so the recording capacity MEM decreases. Let TH_Score_min be the threshold at which MEM takes its largest value below the upper limit MEM_MAX calculated in S2302, that is, the smallest threshold that keeps MEM under the limit. By narrowing down the high-quality frames using TH_Score_min, the moving image can be acquired while keeping its recording capacity at or below MEM_MAX.
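
One way to picture the search for TH_Score_min is the Python sketch below. It reads TH_Score_min as the smallest threshold for which the total size fits within MEM_MAX, and it models the recording capacity very simply as a fixed base size plus a constant extra size per high-quality frame. Both the size model and the function name are assumptions made for illustration.

```python
def select_threshold(scores, base_bytes, extra_bytes, mem_max):
    """Return the smallest candidate threshold whose total size fits in mem_max.

    Frames with score > threshold keep their additional information.
    Returns None if even dropping all additional information is not enough.
    """
    candidates = [float("-inf")] + sorted(set(scores))
    for th in candidates:                       # ascending: MEM shrinks as th grows
        n_high_quality = sum(1 for s in scores if s > th)
        mem = base_bytes + extra_bytes * n_high_quality
        if mem <= mem_max:
            return th
    return None


scores = [0.9, 0.2, 0.6, 0.4]
th_min = select_threshold(scores, base_bytes=1000, extra_bytes=100, mem_max=1250)
kept = [i for i, s in enumerate(scores) if s > th_min]   # frames kept in high quality
```

Scanning the candidates in ascending order stops at the first threshold that satisfies the limit, which keeps as many high-quality frames as the limit allows.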

本実施形態では、記録情報調整部２２１０の処理によって動画の記録容量の増加を抑えることができる。なお、本実施形態では、高品質フレームの絞り込みを行う指標として主被写体領域の位置情報とサイズ情報を利用した。
また、別の実施形態として、記録モードを切り替える別の指標として、シーンチェンジ度合いを利用してもよい。シーンチェンジ度合いとは、異なるフレーム（例えば現フレームと前フレーム）の間で画像が変化した場合のシーンの変化の大きさを表す指標である。シーンチェンジ度合いは、時系列の複数の画像、例えば現フレームと前フレームとを位置合わせし、画像間の差分を計算することで算出される。時間的に連続する２フレームの間で画像の変化がほとんどない場合には、高品質モードの追加情報を間引く処理が実行される。時間的に連続する２フレームの間で画像の大きな変化がある場合には、両方のフレームに係る追加情報を記録する処理が実行される。
上述した被写体スコアによる判定やシーンチェンジ度合いによる判定は、本実施形態では記録容量の削減の必要があると判定された場合に行っていたが、これに限られるものではない。例えば、記録容量の検出や判定を行わずに、高品質フレームの判定方法として被写体スコアやシーンチェンジ度合いを用いて記録モードを切り替えてもよい。 In the present embodiment, the increase in the recording capacity of the moving image can be suppressed by the processing of the recording information adjusting unit 2210. In this embodiment, the position information and the size information of the main subject area are used as indexes for narrowing down the high-quality frames.
Further, as another embodiment, the degree of scene change may be used as another index for switching the recording mode. The degree of scene change is an index showing the magnitude of change in the scene when the image changes between different frames (for example, the current frame and the previous frame). The degree of scene change is calculated by aligning a plurality of time-series images, for example, the current frame and the previous frame, and calculating the difference between the images. When there is almost no change in the image between two frames that are continuous in time, a process of thinning out additional information in the high quality mode is executed. When there is a large change in the image between two frames that are continuous in time, a process of recording additional information related to both frames is executed.
In the present embodiment, the determination based on the subject score and the determination based on the degree of scene change described above are performed when it is determined that the recording capacity needs to be reduced, but the determination is not limited to this. For example, the recording mode may be switched using the subject score or the degree of scene change as a method for determining a high-quality frame without detecting or determining the recording capacity.
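
A minimal Python sketch of the scene change degree follows. It assumes that the two frames have already been aligned, uses the mean absolute luminance difference as the measure, and compares it with a hypothetical threshold; the description does not fix the difference measure or the threshold.

```python
def scene_change_degree(prev_frame, cur_frame):
    """Mean absolute luminance difference between two aligned frames."""
    total, n = 0, 0
    for row_p, row_c in zip(prev_frame, cur_frame):
        for p, c in zip(row_p, row_c):
            total += abs(c - p)
            n += 1
    return total / n if n else 0.0


def keep_additional_info(degree, threshold):
    """Keep the additional information only when the scene changed enough."""
    return degree >= threshold


prev_frame = [[10, 10], [10, 10]]
small_change = scene_change_degree(prev_frame, [[10, 12], [10, 10]])
large_change = scene_change_degree(prev_frame, [[50, 50], [50, 50]])
```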

［第４実施形態］
次に本発明の第４実施形態を説明する。本実施形態では、第１実施形態で説明した事後ぼかし処理を前提とし、各フレームの動きブレを低減させることを目的とする。高品質モードで記録する場合のシャッタ速度およびフレームレートを、主被写体の動きに合わせて変更する制御について説明する。 [Fourth Embodiment]
Next, a fourth embodiment of the present invention will be described. The present embodiment presupposes the post-blurring process described in the first embodiment and aims to reduce the motion blur of each frame. Control that changes the shutter speed and frame rate used for recording in the high quality mode according to the movement of the main subject will be described.

後処理により高品質な静止画を生成する対象となるフレームについては、被写体の動きブレが無く、被写体が止まっていることが好ましい状態である。動画を撮像する場合の露出制御では、フレーム画像を連続的に鑑賞する際に被写体像の動きが不連続な状態に見えないようにシャッタ速度が制御される。つまり、シャッタ速度が速くなりすぎないように制御が行われる。一方、そのようにして撮像された動画フレームの画像の１コマを静止画として鑑賞する場合、被写体の移動速度によっては動きブレが発生している可能性がある。そこで本実施形態は、高品質モードで取得するフレーム画像の露出制御、特にシャッタ速度に関して、被写体の動きブレを抑制することを目的とする。 With respect to the frame for which a high-quality still image is generated by post-processing, it is preferable that the subject does not move and the subject is stationary. In the exposure control when capturing a moving image, the shutter speed is controlled so that the movement of the subject image does not appear to be discontinuous when the frame image is continuously viewed. That is, control is performed so that the shutter speed does not become too high. On the other hand, when one frame of the image of the moving image frame captured in this way is viewed as a still image, there is a possibility that motion blurring may occur depending on the moving speed of the subject. Therefore, it is an object of the present embodiment to suppress the motion blur of the subject with respect to the exposure control of the frame image acquired in the high quality mode, particularly the shutter speed.

図２６は、本実施形態の撮像装置に適用可能な構成を示すブロック図である。図１に示す構成との相違は、露出条件制御部２６０６が設けられていることである。図２７は露出条件制御部２６０６の処理を示すフローチャートである。図２７を参照して、露出条件制御部２６０６について説明する。以下の処理は、記録モード制御部１０４の処理後に行われ、毎フレームまたは所定のフレーム間隔で行われる。 FIG. 26 is a block diagram showing a configuration applicable to the imaging device of the present embodiment. The difference from the configuration shown in FIG. 1 is that the exposure condition control unit 2606 is provided. FIG. 27 is a flowchart showing the processing of the exposure condition control unit 2606. The exposure condition control unit 2606 will be described with reference to FIG. 27. The following processing is performed after the processing of the recording mode control unit 104, and is performed every frame or at a predetermined frame interval.

まず、Ｓ２７０１では、記録モードが高品質モードであるか否かについて判定処理が行われる。記録モードが通常モードである場合、Ｓ２７０５に進み、現フレームよりも１フレーム時間だけ後の次フレームについても通常モードでの動画の露出で撮影動作が行われる。一方、記録モードが高品質モードである場合には、Ｓ２７０２に処理を進め、主被写体の動きに合わせた露出制御が行われる。Ｓ２７０２にて露出条件制御部２６０６は、主被写体の動きベクトルを算出する。動きベクトルの算出方法は、公知のパターンマッチング処理により行われる。 First, in S2701, a determination process is performed as to whether or not the recording mode is a high quality mode. When the recording mode is the normal mode, the process proceeds to S2705, and the shooting operation is performed with the exposure of the moving image in the normal mode also for the next frame one frame time after the current frame. On the other hand, when the recording mode is the high quality mode, the process proceeds to S2702, and the exposure control is performed according to the movement of the main subject. In S2702, the exposure condition control unit 2606 calculates the motion vector of the main subject. The motion vector calculation method is performed by a known pattern matching process.

次のＳ２７０３にて露出条件制御部２６０６は、次フレームのシャッタ速度を算出する。次フレームのシャッタ速度をTvと表記し、Ｓ２７０２で算出された動きベクトルの大きさをv（単位：ピクセル）と表記する。フレーム間隔をT_frame(フレームレートが60fpsの場合、1/60秒)と表記する。シャッタ速度Tvは、vおよびT_frameから、下記（１４）式により算出される。
Tv = T_frame / v ・・・（１４）
（１４）式は、撮影時間内に、主被写体画像の移動量が１ピクセルとなるシャッタ速度としてTvを算出していることを意味している。換言すれば、Tvは動きブレが１ピクセルに収まるシャッタ速度である。例えば、T_frameを1/60秒とし、vを6ピクセルとする。この場合、主被写体画像の１ピクセルの移動に対応するシャッタ速度は、Tv=1/360秒である。従って、本実施形態の目的に沿えば、（１４）式で算出したTvよりも小さい値をシャッタ速度として用いてもよい。 In the next S2703, the exposure condition control unit 2606 calculates the shutter speed of the next frame. The shutter speed of the next frame is expressed as Tv, and the magnitude of the motion vector calculated in S2702 is expressed as v (unit: pixel). The frame interval is expressed as T_frame (1/60 seconds when the frame rate is 60 fps). The shutter speed Tv is calculated from v and T_frame by the following equation (14).
Tv = T_frame / v ・・・ (14)
Equation (14) means that Tv is calculated as the shutter speed at which the main subject image moves by one pixel during the exposure time. In other words, Tv is the shutter speed at which motion blur stays within one pixel. For example, let T_frame be 1/60 second and v be 6 pixels. In this case, the shutter speed corresponding to a one-pixel movement of the main subject image is Tv = 1/360 second. In keeping with the object of this embodiment, a value smaller than the Tv calculated by Eq. (14) may also be used as the shutter speed.
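
The worked example for equation (14) can be reproduced with exact fractions in Python. The function name is hypothetical, and the handling of motion of one pixel or less (falling back to the frame interval) is an assumption added so that the sketch never divides by zero.

```python
from fractions import Fraction


def shutter_time(t_frame, v_pixels):
    """Eq. (14): Tv = T_frame / v, with v the motion vector magnitude in pixels."""
    if v_pixels <= 1:
        return t_frame            # assumption: never longer than the frame interval
    return t_frame / v_pixels


tv = shutter_time(Fraction(1, 60), 6)   # 60 fps, subject moves 6 pixels per frame
```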

Ｓ２７０４にて露出条件制御部２６０６は、その他の露出条件とフレームレートを決定する。その他の露出条件とは、具体的には感度と絞り値である。絞り値を変えると被写界深度が変わり、前後フレームとの連続性が失われてしまう。このため、本実施形態では、シャッタ速度Tvが変化した分については感度を変化させることで露出を一定に保つ制御が行われる。また、シャッタ速度Tvが速くなるにつれて、フレームレートを変更する制御が行われる。例えば、Tvの値が1/120秒以下となった場合、フレームレートを60fpsから120fpsへ変更する処理が実行される。この処理により、動きが速い被写体の決定的瞬間を逃し難くなるという効果が得られる。最後に露出条件制御部２６０６は、Ｓ２７０４またはＳ２７０５で決定された露出条件を撮像光学系１０１および撮像部１０２の制御にフィードバックして反映させた上で、次フレームの撮像処理を行うように制御する。 In S2704, the exposure condition control unit 2606 determines the other exposure conditions and the frame rate. The other exposure conditions are, specifically, the sensitivity and the aperture value. Changing the aperture value changes the depth of field, and continuity with the preceding and following frames is lost. Therefore, in the present embodiment, the exposure is kept constant by changing the sensitivity to compensate for the change in the shutter speed Tv. In addition, the frame rate is changed as the shutter speed Tv becomes faster. For example, when the value of Tv becomes 1/120 second or less, the frame rate is changed from 60 fps to 120 fps. This makes it less likely that the decisive moment of a fast-moving subject will be missed. Finally, the exposure condition control unit 2606 feeds the exposure conditions determined in S2704 or S2705 back to the control of the imaging optical system 101 and the imaging unit 102, and performs control so that the imaging process of the next frame is carried out.
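
The S2704 decisions can be sketched in Python as follows. The sketch assumes that, with the aperture fixed, exposure is proportional to shutter time multiplied by sensitivity, so the sensitivity is scaled by the ratio of the old to the new shutter time. The function names and the ISO-style sensitivity value are assumptions; the 1/120 second switching point and the 60 fps and 120 fps rates come from the example in the text.

```python
from fractions import Fraction


def compensate_sensitivity(sensitivity, tv_old, tv_new):
    """Scale sensitivity so that (shutter time x sensitivity) stays constant."""
    if tv_new <= 0:
        raise ValueError("shutter time must be positive")
    return sensitivity * tv_old / tv_new


def select_frame_rate(tv, base_fps=60, high_fps=120):
    """Switch to the higher frame rate when Tv is 1/120 second or less."""
    return high_fps if tv <= Fraction(1, 120) else base_fps


new_sensitivity = compensate_sensitivity(100, Fraction(1, 60), Fraction(1, 360))
fps_fast = select_frame_rate(Fraction(1, 360))
fps_slow = select_frame_rate(Fraction(1, 60))
```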

In this embodiment, the processing of the exposure condition control unit 2606 enables shooting at an optimum shutter speed matched to the speed of the subject (moving object). When the recording is viewed as a moving image, the high shutter speed Tv of the image frames can make the motion of the moving object appear discontinuous. To avoid this, processing such as electronically adding blur to the subject image according to the amount of subject movement is performed.

In the first, second, third, and fourth embodiments described above, the recording mode is basically determined and switched for each frame, but the present invention is not limited to this. To reduce the amount of recorded data, the recording mode may be switched periodically, for example by recording in the high quality mode once every predetermined number of frames, with the period set either manually or automatically.
When the setting is made manually, for example, if the predetermined number of frames is set to 5 by a user operation via the operation unit 107, one frame in every five is recorded in the high quality mode, that is, with the depth distribution information recorded together with the image. Alternatively, the period of recording in the high quality mode may be determined according to the imaging frame rate set by the user.
When the setting is made automatically, at least one of the determinations used in the above embodiments, such as the distance difference between the main subject and the background, the exposure step, the subject score, or the degree of scene change, is performed periodically. The period of recording in the high quality mode is then determined, and control is performed so that recording in the high quality mode takes place at that period until the next determination.
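The manual case can be sketched as follows. The text specifies only that one frame in every five is recorded in the high quality mode; which frame within each group is chosen (here the first) and the mode labels are assumptions made for illustration.

```python
def recording_mode(frame_index: int, period: int) -> str:
    """Return the recording mode for a frame under periodic switching.

    Assumption: the first frame of every group of `period` frames is the
    one recorded in the high quality mode (image plus depth distribution
    information); the remaining frames are recorded as images only.
    """
    if period < 1:
        raise ValueError("period must be at least 1")
    return "high_quality" if frame_index % period == 0 else "normal"


# Period of 5 frames, as in the manual-setting example in the text.
modes = [recording_mode(i, 5) for i in range(10)]
print(modes.count("high_quality"))  # 2
```

In the automatic case, the same function could be called with a period that is recomputed each time the periodic determination is performed.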

<Recording format patterns in each embodiment>
In each of the above embodiments, any of the following formats may be used to record the plurality of captured images (frames) and the depth distribution information corresponding to some of those frames.
FIG. 28 illustrates each pattern of the format for recording the plurality of captured images (frames) and the depth distribution information corresponding to some of those frames. In the first format, shown in FIG. 28(A), the sequentially captured images 1, 2, and 3 and the distance maps 1 and 3, which correspond to images 1 and 3 respectively, are all recorded as separate files. In this case, information associating the image with its distance map (or information indicating that there is no corresponding distance map) is preferably recorded in the header of each image file, and information on the corresponding image is preferably recorded on the distance map side as well.

Alternatively, as shown in FIG. 28(B), the images may be associated with one another as consecutive images (and may be encoded) to form one moving image file, with a plurality of distance maps each recorded as an individual distance map file in synchronization with this moving image file. The time code of the corresponding moving image frame is recorded in each distance map, so that during moving image editing, including still image extraction, each distance map can be associated with its image and read out and used as needed.

Alternatively, as shown in FIG. 28(C), the images may be associated with one another as consecutive images (and may be encoded) to form one moving image file, with the plurality of distance maps recorded together as one separate file in synchronization with this moving image file. In this case, it suffices that a time code synchronized with the time code of each frame of the moving image is recorded in the corresponding distance map and that the distance maps are recorded as one file. Compared with the form of FIG. 28(B), putting the distance maps in one file makes them easy to handle as a pair with the moving image file, and a reduction in the amount of data can also be expected if the distance maps are encoded across maps with a known coding technique as needed.
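The time code association used in the formats of FIG. 28(B) and FIG. 28(C) can be sketched as a simple lookup. The representation of a time code as an integer frame counter, the class name, and the function names are hypothetical; the text only states that each distance map carries a time code synchronized with the corresponding moving image frame.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional


@dataclass(frozen=True)
class DistanceMapEntry:
    timecode: int   # hypothetical: time code expressed as a frame counter
    data: bytes     # distance map payload (format not specified in the text)


def build_index(entries: Iterable[DistanceMapEntry]) -> Dict[int, DistanceMapEntry]:
    """Index the distance maps of one file by their time code."""
    return {entry.timecode: entry for entry in entries}


def map_for_frame(index: Dict[int, DistanceMapEntry],
                  frame_timecode: int) -> Optional[DistanceMapEntry]:
    """Return the distance map for a frame, or None if none was recorded."""
    return index.get(frame_timecode)


# Frames 0, 1, 2 were captured; distance maps exist only for frames 0 and 2.
index = build_index([DistanceMapEntry(0, b"map1"), DistanceMapEntry(2, b"map3")])
print(map_for_frame(index, 1))  # None
```

When a still image is cut out of the moving image, the frame's time code is enough to tell whether depth distribution information is available for it.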

Alternatively, as shown in FIG. 28(D), each image and its corresponding distance map may be recorded contiguously, with the distance map immediately before or after the image, so that the whole forms one moving image file. In this format, the distance map, which is the depth distribution information corresponding to a recorded image, is recorded as metadata of that image data before or after the image data. Advantages of this format include that everything can be handled as one file and that access is easy because the distance map corresponding to an image is adjacent to it.
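A minimal sketch of the interleaved layout of FIG. 28(D) follows. The chunk tags, the 4-byte big-endian length field, and the choice to place the distance map after its image are all assumptions made for illustration; the text does not define a container format and also allows the distance map to precede the image.

```python
import struct
from typing import Iterable, Iterator, Optional, Tuple

IMAGE_TAG = b"IMG0"  # hypothetical chunk tag for image data
DEPTH_TAG = b"DPT0"  # hypothetical chunk tag for a distance map


def pack_chunk(tag: bytes, payload: bytes) -> bytes:
    """Serialize one chunk as tag + 4-byte big-endian length + payload."""
    return tag + struct.pack(">I", len(payload)) + payload


def write_stream(frames: Iterable[Tuple[bytes, Optional[bytes]]]) -> bytes:
    """Write each image, followed directly by its distance map if it has one."""
    out = bytearray()
    for image, depth in frames:
        out += pack_chunk(IMAGE_TAG, image)
        if depth is not None:
            out += pack_chunk(DEPTH_TAG, depth)
    return bytes(out)


def read_stream(blob: bytes) -> Iterator[Tuple[bytes, bytes]]:
    """Yield (tag, payload) pairs in recorded order."""
    pos = 0
    while pos < len(blob):
        tag = blob[pos:pos + 4]
        (length,) = struct.unpack(">I", blob[pos + 4:pos + 8])
        yield tag, blob[pos + 8:pos + 8 + length]
        pos += 8 + length


# Frames 1 and 3 were recorded with distance maps; frame 2 without.
blob = write_stream([(b"f1", b"d1"), (b"f2", None), (b"f3", b"d3")])
chunks = list(read_stream(blob))
print(len(chunks))  # 5
```

Because a distance map chunk always sits next to its image chunk, a reader that has located a frame finds the associated depth distribution information without a separate index.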

[Other Embodiments]
The present invention can also be realized by a process in which a program that implements one or more functions of the above embodiments is supplied to a system or apparatus via a network or a storage medium, and one or more processors in a computer of the system or apparatus read and execute the program. It can also be realized by a circuit (for example, an ASIC) that implements one or more functions.

As described above, according to the present invention, highly convenient image recording can be performed while keeping the required recording capacity low.

101 ... Imaging optical system
102 ... Imaging unit
104 ... Recording mode control unit
105 ... Image processing unit
106 ... System control unit
109 ... Recording unit
2210 ... Recording information adjustment unit
2606 ... Exposure condition control unit

Claims

An image processing apparatus including recording means for acquiring a plurality of image data and recording them on a recording medium, the apparatus comprising:
acquisition means for acquiring depth distribution information of a subject corresponding to image data; and
control means for performing recording processing of the plurality of image data by switching between a first mode, in which the image data and the depth distribution information corresponding to the image data are recorded on the recording medium by the recording means, and a second mode, in which the image data is recorded on the recording medium by the recording means without recording the depth distribution information,
wherein the control means performs the recording processing of the plurality of image data by switching between the first mode and the second mode at a cycle corresponding to the frame rate of imaging of the plurality of image data set by the user.

The image processing apparatus according to claim 1, wherein the control means performs the recording processing by switching between the first mode and the second mode during recording of the plurality of image data, and the plurality of image data are recorded on the recording medium as a moving image by the recording means.

The image processing apparatus according to claim 2, wherein the recording means records the plurality of image data and the depth distribution information corresponding to the image data recorded in the first mode in one moving image file.

The image processing apparatus according to claim 3, wherein the recording means records the depth distribution information corresponding to the image data recorded in the first mode in the moving image file, before or after the image data, as metadata of the image data.

The image processing apparatus according to claim 1, wherein the control means acquires the depth distribution information including depth information of a main subject among a plurality of subjects and depth information of a background, and performs the recording processing in the first mode when the difference between the depth information of the main subject and the depth information of the background included in the depth distribution information is within a threshold value.

The image processing apparatus according to claim 5, further comprising threshold value calculating means for calculating the threshold value from the focal length of the imaging optical system, the depth information of the main subject included in the depth distribution information, and the allowable defocus amount.

The image processing apparatus according to claim 6, wherein the control means determines whether or not to perform the recording processing in the first mode based on the result of comparing the difference with the threshold value and on the F value of the imaging optical system.

The image processing apparatus according to claim 1, wherein the control means calculates an evaluation value indicating the dynamic range of a frame and compares the evaluation value with a threshold value to determine whether or not to perform the recording processing in the first mode.

The image processing apparatus according to claim 8, wherein the control means calculates, as the evaluation value, an exposure step between a main subject area relating to a main subject among a plurality of subjects and a background area, performs the recording processing in the first mode when the exposure step is equal to or greater than the threshold value, and performs the recording processing in the second mode when the exposure step is smaller than the threshold value.

The image processing apparatus according to claim 1, further comprising: extraction means for extracting information on a main subject area relating to a main subject among a plurality of subjects from the image data and the depth distribution information recorded in the first mode; and
image processing means for acquiring the information on the main subject area extracted by the extraction means and performing image processing of the image data.

The image processing apparatus according to claim 10, wherein the image processing means determines a background area in the image from the information of the main subject area and performs blurring processing of the background image.

The image processing apparatus according to claim 1, further comprising: extraction means for extracting information on a main subject area relating to a main subject among a plurality of subjects from the image data and the depth distribution information recorded in the first mode; and
image processing means for acquiring the information on the main subject area extracted by the extraction means and performing image processing of the image data,
wherein the image processing means determines the main subject area and a background area in the image from the information on the main subject area, and performs different gradation correction processing on the main subject area and the background area.

The image processing apparatus according to claim 1, further comprising adjusting means for adjusting the recording amount by acquiring the recording capacity of the moving image and narrowing down the frames to be recorded in the first mode when the recording capacity is equal to or greater than a threshold value,
wherein the adjusting means performs control to record the image data and the depth distribution information of the narrowed-down frames.

The image processing apparatus according to claim 1, wherein the control means compares a score, calculated from at least one of position information and size information of a main subject among a plurality of subjects, with a threshold value, performs the recording processing in the first mode for a frame in which the score is larger than the threshold value, and performs the recording processing in the second mode for a frame in which the score is equal to or less than the threshold value.

The image processing apparatus according to claim 1, wherein the control means detects a scene change from a difference between images and, when the scene change is detected, performs the recording processing in the first mode.

The image processing apparatus according to claim 1, wherein the control means performs recording processing in the first mode for each predetermined frame in a plurality of time-series images acquired by the acquisition means.

The image processing apparatus according to any one of claims 1 to 16, wherein, when the moving image is reproduced, the control means gives notification that an image recorded in the first mode is an image recorded in the first mode.

The image processing apparatus according to any one of claims 1 to 17, wherein the depth distribution information is any one of: an image shift map based on the amount of parallax between a plurality of viewpoint images; a defocus map based on the defocus amount for each area; a distance map showing the relative distance relationship of each subject in the image data; and distance information, acquired by the TOF method, indicating the distance from the imaging apparatus to each subject.

The image processing apparatus according to any one of claims 1 to 18, wherein the acquisition means calculates and acquires an image shift map, which is the depth distribution information corresponding to the image data, based on the parallax amount of a pair of parallax images corresponding to the acquired image data.

The image processing apparatus according to any one of claims 1 to 18, wherein the acquisition means calculates and acquires a defocus map, which is the depth distribution information corresponding to the image data, based on the defocus amount for each area of the acquired image data.

The image processing apparatus according to any one of claims 1 to 18, wherein the acquisition means calculates and acquires the relative distance relationship of each subject, which is the depth distribution information corresponding to the image data, based on the defocus amount for each region of the acquired image data and on the imaging optical system or the image sensor.

The image processing apparatus according to any one of claims 1 to 18, wherein the acquisition means acquires the subject distance from the imaging apparatus to each subject, which is the depth distribution information corresponding to the image data, by the TOF method, which measures the delay time from the projection of light onto the subject to the reception of the reflected light and thereby measures the distance to the subject.

The image processing apparatus according to any one of claims 1 to 22, wherein the control means controls to record data of a RAW image before image processing in the first mode.

The image processing apparatus according to claim 23, wherein the RAW image is an image that has not been subjected to image processing including demosaicing processing, white balance adjustment, color conversion processing, or gamma correction.

An imaging apparatus comprising: the image processing apparatus according to any one of claims 1 to 24; and
imaging means for imaging a subject.

The imaging apparatus according to claim 25, further comprising exposure condition control means for controlling an exposure condition when the imaging means images a subject and image data is generated,
wherein the exposure condition control means acquires the amount of movement of the subject in the image of a frame in the first mode, and determines, from the amount of movement and the frame interval, one or more of the shutter speed and the frame rate for imaging the frame following that frame.

A control method for an image processing apparatus including recording means for acquiring a plurality of image data and recording them on a recording medium, the method comprising:
an acquisition step of acquiring depth distribution information of a subject corresponding to image data; and
a control step of performing recording processing of the plurality of image data by switching between a first mode, in which the image data and the depth distribution information corresponding to the image data are recorded on the recording medium by the recording means, and a second mode, in which the image data is recorded on the recording medium by the recording means without recording the depth distribution information,
wherein, in the control step, the recording processing of the plurality of image data is performed by switching between the first mode and the second mode at a cycle corresponding to the frame rate of imaging of the plurality of image data set by the user.

A program for causing a computer of an image processing apparatus to execute each step according to claim 27.