JP7818966B2 - Image processing method, image processing device, image processing system, and program

Info

Publication number: JP7818966B2
Application number: JP2022005803A
Authority: JP
Inventors: 正和小林
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2022-01-18
Filing date: 2022-01-18
Publication date: 2026-02-24
Anticipated expiration: 2042-01-18
Also published as: JP2023104667A

Description

本発明は、画像処理方法、画像処理装置、画像処理システム、およびプログラムに関する。 The present invention relates to an image processing method, an image processing device, an image processing system, and a program.

画像に対する認識または回帰のタスクにおいて、機械学習モデルを用いた手法は、仮定や近似を用いた理論ベースの手法に対して、高い精度を実現できる。理論ベースの手法では、仮定や近似によって無視された要素によって精度が低下する。しかし、機械学習モデルを用いた手法では、それらの要素も含む学習データを用いて機械学習モデルを学習することで、仮定や近似のない学習データに即した推定が実現できるため、タスクの精度が向上する。 In image recognition or regression tasks, methods using machine learning models can achieve higher accuracy than theory-based methods that use assumptions and approximations. With theory-based methods, accuracy decreases due to elements ignored by assumptions and approximations. However, with methods using machine learning models, by training the machine learning model using training data that includes these elements, it is possible to achieve estimations that are consistent with the training data without assumptions or approximations, thereby improving the accuracy of the task.

特許文献１には、機械学習モデルの１つである畳み込みニューラルネットワーク（ＣＮＮ：ＣｏｎｖｏｌｕｔｉｏｎａｌＮｅｕｒａｌＮｅｔｗｏｒｋ）を用いて、撮像画像のぼけを先鋭化する方法が開示されている。また特許文献１には、撮像画像と推定画像（ぼけ先鋭化画像）とを輝度飽和領域に基づいて重み付け平均し、先鋭化の強度を調整する方法が開示されている。 Patent Document 1 discloses a method for sharpening blur in a captured image using a convolutional neural network (CNN), a type of machine learning model. Patent Document 1 also discloses a method for adjusting the strength of sharpening by performing a weighted average of the captured image and an estimated image (blur-sharpened image) based on the brightness saturation region.

Patent Document 1: Japanese Patent Application Laid-Open No. 2020-166628

The method disclosed in Patent Document 1 does not address the adverse effects caused by diffraction of light. When a high-brightness subject is captured with the optical system set to a large aperture value (F-number), diffraction of light produces Airy disks and starbursts (light beams). When a machine learning model corrects Airy disks and starbursts, adverse effects such as unnatural emphasis can occur because of insufficient accuracy of the training images or an insufficient number of parameters in the machine learning model. How noticeable these adverse effects are also depends on the brightness around the correction target in the captured image. For example, unnatural emphasis is conspicuous in dark images such as night scenes, but not in bright images captured outdoors in daylight.

Furthermore, because the occurrence of Airy disks and starbursts (light beams) depends on the aperture value of the optical system, the weighted averaging of Patent Document 1 also reduces the correction effect for images captured at aperture values at which no adverse effect occurs. In addition, the averaging follows predetermined weights and does not consider that the conspicuousness of the adverse effects varies with the brightness value of the captured image. As a result, the adverse effects around saturated regions are reduced in dark images, but in bright images the correction effect around saturated regions is reduced more than necessary.

そこで本発明は、光学系の絞り値と撮像画像の輝度値に起因する弊害を抑制しつつ、ぼけの補正効果を保つことが可能な画像処理方法を提供することを目的とする。 The present invention therefore aims to provide an image processing method that can maintain blur correction effects while suppressing the adverse effects caused by the aperture value of the optical system and the brightness value of the captured image.

An image processing method according to one aspect of the present invention includes the steps of: generating, by a computer, information regarding correction of a captured image based on the captured image obtained by imaging using an optical system; generating, by the computer, a weight map based on information regarding the aperture value of the optical system and information regarding the brightness value of the captured image; and generating, by the computer, a first image based on the captured image, the information regarding the correction, and the weight map.

本発明の他の目的及び特徴は、以下の実施例において説明される。 Other objects and features of the present invention are described in the following examples.

本発明によれば、光学系の絞り値と撮像画像の輝度値に起因する弊害を抑制しつつ、ぼけの補正効果を保つことが可能な画像処理方法を提供することができる。 This invention provides an image processing method that can maintain blur correction effects while suppressing adverse effects caused by the aperture value of the optical system and the brightness value of the captured image.

FIG. 1 is an explanatory diagram of a machine learning model in the first embodiment.
FIG. 2 is a block diagram of an image processing system in the first embodiment.
FIG. 3 is an external view of the image processing system in the first embodiment.
FIG. 4 is an explanatory diagram of adverse effects caused by sharpening in the first to third embodiments.
FIG. 5 is an explanatory diagram of an Airy disk in the first to third embodiments.
FIG. 6 is an explanatory diagram of a starburst (light beam) in the first to third embodiments.
FIG. 7 is a flowchart of training of the machine learning model in the first to third embodiments.
FIG. 8 is a flowchart of generation of a model output in the first and second embodiments.
FIG. 9 is a flowchart of sharpening strength adjustment in the first embodiment.
FIG. 10 is an explanatory diagram of a captured image and a saturation influence map in the first embodiment.
FIG. 11 is an explanatory diagram of a captured image and a saturation influence map in the first embodiment.
FIG. 12 is an explanatory diagram of a weight map in the first embodiment.
FIG. 13 is an explanatory diagram of a weight map in the first embodiment.
FIG. 14 is a block diagram of an image processing system in the second embodiment.
FIG. 15 is an external view of the image processing system in the second embodiment.
FIG. 16 is a flowchart of sharpening strength adjustment in the second embodiment.
FIG. 17 is an explanatory diagram of a weight map in the second embodiment.
FIG. 18 is a block diagram of an image processing system in the third embodiment.
FIG. 19 is an external view of the image processing system in the third embodiment.
FIG. 20 is a flowchart of model output generation and sharpening strength adjustment in the third embodiment.

以下、本発明の実施例について、図面を参照しながら詳細に説明する。各図において、同一の部材については同一の参照符号を付し、重複する説明は省略する。 Embodiments of the present invention will now be described in detail with reference to the drawings. In each drawing, identical components will be designated by the same reference numerals, and duplicate descriptions will be omitted.

Before describing the specific embodiments, the gist of the present invention will be explained. The present invention uses a machine learning model to generate, from a captured image obtained with an optical system, an estimated image in which the blur caused by the optical system is sharpened. A weight map is then generated based on information about the aperture value of the optical system used to capture the image and information about the luminance value of the captured image, and the captured image and the estimated image are weighted-averaged. The weight map has continuous signal values and determines the proportion of each image in the weighted average. For example, when the weight map value specifies the proportion of the captured image, a value of 0.5 yields an intensity-adjusted image in which the captured image and the blur-sharpened image (the estimated image) are averaged at 50% each. A value of 1 yields an intensity-adjusted image equal to the captured image.
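The weighted average described above can be sketched as follows. This is only an illustrative sketch: the function name, the nested-list image representation, and the sample values are assumptions, and the weight is taken to be the proportion of the captured image, as in the example in the text.

```python
def blend_with_weight_map(captured, estimated, weight_map):
    """Per-pixel weighted average of two images of the same size.

    weight_map holds, for each pixel, the proportion of the captured image
    (1.0 keeps the captured pixel, 0.0 keeps the estimated pixel).
    """
    return [
        [w * c + (1.0 - w) * e for c, e, w in zip(c_row, e_row, w_row)]
        for c_row, e_row, w_row in zip(captured, estimated, weight_map)
    ]


# Hypothetical 2x2 images with signal values normalized to [0, 1].
captured = [[0.2, 0.4], [0.6, 0.8]]
estimated = [[0.0, 0.8], [1.0, 0.4]]
weight_map = [[0.5, 0.5], [1.0, 0.0]]

adjusted = blend_with_weight_map(captured, estimated, weight_map)
```

With a weight of 0.5 each output pixel is the mean of the two inputs, with 1 it equals the captured pixel, and with 0 it equals the estimated pixel.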

光学系に起因するぼけとは、収差、回折、デフォーカスによるぼけや、光学ローパスフィルタによる作用、撮像素子の画素開口劣化などを含む。機械学習モデルは、例えば、ニューラルネットワーク、遺伝的プログラミング、ベイジアンネットワークなどを含む。ニューラルネットワークは、ＣＮＮ（ＣｏｎｖｏｌｕｔｉｏｎａｌＮｅｕｒａｌＮｅｔｗｏｒｋ）、ＧＡＮ（ＧｅｎｅｒａｔｉｖｅＡｄｖｅｒｓａｒｉａｌＮｅｔｗｏｒｋ）、ＲＮＮ（ＲｅｃｕｒｒｅｎｔＮｅｕｒａｌＮｅｔｗｏｒｋ）などを含む。 Blur caused by the optical system includes blur due to aberration, diffraction, defocus, the effects of optical low-pass filters, and degradation of the pixel aperture of the image sensor. Machine learning models include, for example, neural networks, genetic programming, and Bayesian networks. Neural networks include CNN (Convolutional Neural Network), GAN (Generative Adversarial Network), and RNN (Recurrent Neural Network).

Blur sharpening refers to processing that restores frequency components of the subject that have been attenuated or lost because of blur. A problem arises in blur sharpening when a high-brightness subject is captured with a large aperture value of the optical system: diffraction of light produces Airy disks and starbursts (light beams). When a machine learning model corrects Airy disks and starbursts, adverse effects such as unnatural emphasis occur because of insufficient accuracy of the training images or an insufficient number of parameters in the machine learning model. Airy disks, starbursts, the insufficient accuracy of the training images, and the insufficient number of parameters are described in detail later.

また、この弊害は画像の明るさによって、目立ちやすさが異なる。例えば、夜景のような暗い画像においては不自然な強調が目立つが、日中の屋外で撮影された明るい画像においては目立たない。そこで、本発明は、撮像画像の撮像に用いた光学系の絞り値に関する情報と、撮像画像の輝度値に関する情報とに基づいて重みマップを生成し、撮像画像と推定画像とを加重平均する。これにより、光学系の絞り値と撮像画像の輝度値に起因する弊害を抑制しつつ、ぼけの補正効果を保つことが可能になる。なお以下では、機械学習モデルのウエイトを学習する段階のことを学習フェーズとし、学習済みのウエイトを用いた機械学習モデルでぼけの先鋭化を行う段階のことを推定フェーズとする。 Furthermore, the degree to which this drawback is noticeable varies depending on the brightness of the image. For example, unnatural emphasis is noticeable in dark images such as night scenes, but is not noticeable in bright images taken outdoors during the day. Therefore, the present invention generates a weight map based on information about the aperture value of the optical system used to capture the captured image and information about the luminance value of the captured image, and performs a weighted average of the captured image and the estimated image. This makes it possible to maintain the blur correction effect while suppressing the drawbacks caused by the aperture value of the optical system and the luminance value of the captured image. Note that below, the stage in which the weights of the machine learning model are learned will be referred to as the learning phase, and the stage in which blur sharpening is performed using the machine learning model using the learned weights will be referred to as the estimation phase.

First, an image processing system according to the first embodiment of the present invention will be described. In this embodiment, the task of the machine learning model is to sharpen blur in a captured image that contains luminance saturation (that is, to increase the resolution of the captured image). The blur to be sharpened is the blur caused by aberration and diffraction in the optical system and by the optical low-pass filter. The effects of the invention can be obtained in the same way when sharpening blur caused by the pixel aperture, defocus, or camera shake. This embodiment can also be applied, with the same effects, to tasks other than blur sharpening. Specific examples are upsampling, which increases the number of pixels of the captured image, and conversion of the defocus blur in the captured image (conversion of its shape). Conversion of defocus blur includes, for example, conversion from double-line blur to Gaussian blur or ball blur. Double-line blur has a point spread function (PSF) with separated peaks, so a subject that is actually a single line appears doubled when defocused. Ball blur has a PSF with flat intensity. Gaussian blur has a PSF with a Gaussian distribution. Other defocus blurs that can be converted include, for example, defocus blur partially cut off by vignetting and ring-shaped defocus blur caused by pupil obstruction, as in a catadioptric lens.

図２は、本実施例における画像処理システム１００のブロック図である。図３は、画像処理システム１００の外観図である。画像処理システム１００は、有線または無線のネットワークで接続された学習装置１０１と画像処理装置１０３とを有する。画像処理装置１０３には、有線または無線によって、撮像装置１０２、表示装置１０４、記録媒体１０５、および出力装置１０６が接続される。撮像装置１０２を用いて被写体空間を撮像した撮像画像は、画像処理装置１０３に入力される。撮像画像には、撮像装置１０２内の光学系１０２ａによる収差および回折と、撮像素子１０２ｂの光学ローパスフィルタと、によってぼけが発生しており、被写体の情報が減衰している。 Figure 2 is a block diagram of the image processing system 100 in this embodiment. Figure 3 is an external view of the image processing system 100. The image processing system 100 has a learning device 101 and an image processing device 103, which are connected via a wired or wireless network. The image processing device 103 is connected to an imaging device 102, a display device 104, a recording medium 105, and an output device 106, either wired or wirelessly. A captured image of the subject space captured using the imaging device 102 is input to the image processing device 103. The captured image is blurred due to aberration and diffraction caused by the optical system 102a in the imaging device 102 and the optical low-pass filter of the image sensor 102b, and information about the subject is attenuated.

画像処理装置１０３は、機械学習モデルを用いて、撮像画像に対してぼけ先鋭化を行い、飽和影響マップとぼけ先鋭化画像（モデル出力）を生成する。なお、飽和影響マップの詳細は後述する。機械学習モデルは学習装置１０１で学習されたものであり、画像処理装置１０３は機械学習モデルに関する情報を予め学習装置１０１から取得し、記憶部１０３ａに記憶している。また画像処理装置１０３は、撮像画像とぼけ先鋭化画像の重み付け加算を取ることで、ぼけ先鋭化の強度を調整する機能を有する。機械学習モデルの学習と推定、ぼけ先鋭化の強度調整の詳細に関しては、後述する。ユーザは、表示装置１０４に表示された画像を確認しながら、ぼけ先鋭化の強度調整を行える。強度調整が施されたぼけ先鋭化画像は、記憶部１０３ａまたは記録媒体１０５に保存され、必要に応じてプリンタなどの出力装置１０６に出力される。なお、撮像画像は、グレースケールでも、複数の色成分を有していてもよい。また、未現像のＲＡＷ画像でも、現像後の画像でもよい。 The image processing device 103 uses a machine learning model to perform blur sharpening on the captured image and generate a saturation influence map and a blur-sharpened image (model output). Details of the saturation influence map will be described later. The machine learning model is trained by the learning device 101, and the image processing device 103 acquires information about the machine learning model from the learning device 101 in advance and stores it in the storage unit 103a. The image processing device 103 also has the function of adjusting the intensity of blur sharpening by performing a weighted addition of the captured image and the blur-sharpened image. Details of the learning and estimation of the machine learning model and the adjustment of the intensity of blur sharpening will be described later. The user can adjust the intensity of blur sharpening while checking the image displayed on the display device 104. The intensity-adjusted blur-sharpened image is stored in the storage unit 103a or the recording medium 105 and, as necessary, output to an output device 106 such as a printer. The captured image may be grayscale or have multiple color components. It may also be an undeveloped RAW image or a developed image.

Next, with reference to Figures 4(A) to 4(C) and Figure 5, the adverse effects caused by diffraction of light when blur sharpening is performed with a machine learning model will be described. Figures 4(A) and 4(B) are cross-sectional views of the PSF when the aperture value of the optical system is large (the vertical axis represents signal value and the horizontal axis represents spatial coordinates); the two figures differ in the scale of the vertical axis. Figure 5 is a plan view of the PSF. Diffraction of light produces a diffraction pattern with a bright central region surrounded by dim concentric rings. This is called an Airy disk. Although the signal values of this diffraction pattern are weak, the pattern becomes visible in the captured image when a point light source or other intense light is captured. In Figure 4(C), the dash-dotted line 141 represents the ideal signal values obtained when the PSF is sharpened and the diffraction pattern is reduced, and the dotted line 142 represents the signal values obtained when sharpening emphasizes the diffraction pattern. The dash-dotted line 141 is the desirable result of blur sharpening. However, insufficient accuracy of the training images or an insufficient number of parameters in the machine learning model can cause the diffraction pattern to be emphasized, as shown by the dotted line 142. This happens because the edges of the diffraction pattern are mistakenly recognized as the subject and sharpened. The insufficient accuracy of the training images and the insufficient number of parameters in the machine learning model will be described later.
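For a sense of scale, the size of the diffraction pattern can be estimated with the standard Fraunhofer diffraction result for an ideal, aberration-free circular aperture, in which the first dark ring has a diameter of about 2.44 λN (λ: wavelength, N: F-number). Both this formula and the 0.55 µm wavelength are assumptions introduced here for illustration; they do not appear in the description, and an aperture formed by blades deviates from the ideal circular case.

```python
def airy_disk_diameter_um(wavelength_um: float, f_number: float) -> float:
    """Diameter of the first dark ring for an ideal circular aperture, in micrometers."""
    return 2.44 * wavelength_um * f_number


# Hypothetical example values: green light at three aperture settings.
for f_number in (2.8, 8.0, 16.0):
    diameter = airy_disk_diameter_um(0.55, f_number)
    print(f"F{f_number:g}: {diameter:.1f} um")
```

Under this approximation the diameter grows in proportion to the F-number, which is consistent with the diffraction pattern becoming noticeable at large aperture values.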

Next, starbursts (light beams) caused by diffraction of light will be described with reference to Figure 6. A starburst consists of long, thin streaks of light that appear when a point light source or other intense light is captured. Figure 6 shows an example of a starburst. The number of streaks depends on the number of aperture blades 143 of the optical system 102a. When the number of aperture blades 143 is odd, twice as many streaks as blades appear. When it is even, the number of streaks equals the number of blades. Because Figure 6 shows five aperture blades 143, ten streaks of light 144 appear. The sharpness of the starburst depends on the aperture value of the optical system: the larger the aperture value, the sharper the starburst. Even at the same aperture value, the appearance of the starburst differs depending on the type of the optical system 102a. As with the Airy disk, insufficient accuracy of the training images or an insufficient number of parameters in the machine learning model can cause unnatural emphasis of the starburst.
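The relationship between the number of aperture blades and the number of streaks stated above can be written as a small function. The function name is hypothetical and the sketch covers only the odd/even rule given in the text.

```python
def starburst_streak_count(blade_count: int) -> int:
    """Number of streaks for a given number of aperture blades.

    Odd blade counts give twice as many streaks as blades;
    even blade counts give as many streaks as blades.
    """
    if blade_count < 1:
        raise ValueError("blade_count must be a positive integer")
    return 2 * blade_count if blade_count % 2 == 1 else blade_count


# The five-blade aperture of Figure 6 and a hypothetical eight-blade aperture.
five_blade_streaks = starburst_streak_count(5)
eight_blade_streaks = starburst_streak_count(8)
```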

次に、学習画像の精度不足について詳細に説明する。機械学習モデルを用いてぼけの先鋭化を行う場合、先鋭化の精度は学習画像の精度に依存する。つまり、先鋭化の対象としている光学系１０２ａのぼけを高精度に再現した学習画像が必要となる。しかし、撮像時に光学系１０２ａが取り得るズーム位置、絞り値、被写体距離におけるぼけを網羅的に学習しようとすると、学習画像の枚数が膨大になる。その結果、個々のぼけ先鋭化精度の低下、学習が収束しない等の問題が生じる。そのため、光学系１０２ａが取り得るズーム位置、絞り値、被写体距離におけるぼけを離散的に学習し、中間領域は機械学習モデルに予測させることが必要になる。しかし、その場合、学習画像に含まれていない中間領域のぼけを先鋭化すると、回折パターンのエッジを被写体と誤認識して先鋭化することがある。 Next, we will explain in detail the lack of accuracy of the training images. When sharpening blur using a machine learning model, the accuracy of the sharpening depends on the accuracy of the training images. In other words, training images that accurately reproduce the blur of the optical system 102a that is the target of sharpening are required. However, if we attempt to comprehensively learn the blur at zoom positions, aperture values, and subject distances that the optical system 102a can assume during image capture, the number of training images becomes enormous. As a result, problems arise such as a decrease in the accuracy of individual blur sharpening and failure of learning to converge. For this reason, it is necessary to discretely learn the blur at zoom positions, aperture values, and subject distances that the optical system 102a can assume, and have the machine learning model predict the intermediate regions. However, in this case, sharpening the blur in the intermediate regions that are not included in the training images may result in the edges of the diffraction pattern being mistakenly recognized as the subject and sharpened.

Next, the insufficient number of parameters in the machine learning model will be described in detail. The machine learning model has multiple layers, and each layer takes a linear sum of its input and its weights. When the machine learning model is a CNN, which uses the convolution of the input with a filter as this linear sum (the value of each filter element corresponds to a weight, and a sum with a bias may also be included), the number of filter layers corresponds to the number of parameters. In other words, an insufficient number of parameters means an insufficient number of filter layers. Because the number of filter layers trades off against training time and processing speed, the number of parameters can be insufficient. The adverse effects can be reduced, although not easily eliminated completely, by using a machine learning model trained with two techniques: one that uses the captured image and a corresponding brightness saturation map as input data for the machine learning model, and one that has the model generate a saturation influence map. Each of these two techniques is described in detail below.
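The linear sum taken in one convolution layer can be illustrated in one dimension. This is a minimal sketch with hypothetical names and values; as is common in CNN implementations, the filter is applied without flipping (cross-correlation), and no padding or activation function is included.

```python
def conv1d_layer(signal, filter_weights, bias=0.0):
    """Linear sum of each input window with the filter weights, plus a bias."""
    k = len(filter_weights)
    return [
        sum(w * x for w, x in zip(filter_weights, signal[i:i + k])) + bias
        for i in range(len(signal) - k + 1)
    ]


# Hypothetical input and a three-element filter; each filter element is one weight.
layer_output = conv1d_layer([1.0, 2.0, 3.0, 4.0], [1.0, 0.0, -1.0], bias=0.5)
```

Stacking more such layers adds more filter elements and biases to learn, which is the sense in which the number of layers corresponds to the number of parameters.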

次に、輝度飽和マップについて説明する。輝度飽和マップとは、撮像画像において輝度飽和領域を表すマップである。輝度飽和を起こした領域（輝度飽和領域）では、被写体空間の構造に関する情報が失われ、各領域の境界で偽エッジが出現することもあり、被写体の正しい特徴量を抽出できない。そこで、輝度飽和マップを入力することで、ニューラルネットワークが前述のような問題のある領域を特定できるため、推定精度の低下を抑制することができる。 Next, we will explain the brightness saturation map. A brightness saturation map is a map that represents brightness saturated areas in a captured image. In areas where brightness saturation occurs (brightness saturated areas), information about the structure of the subject space is lost, and false edges may appear at the boundaries between areas, making it impossible to extract correct features of the subject. Therefore, by inputting a brightness saturation map, the neural network can identify problematic areas such as those mentioned above, thereby preventing a decrease in estimation accuracy.
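A brightness saturation map of this kind could be produced as sketched below. The description only states that the map represents the saturated regions, so the binary encoding, the function name, and the "greater than or equal to" comparison are assumptions made for illustration.

```python
def brightness_saturation_map(image, saturation_value):
    """Binary map: 1 where the pixel has reached the saturation value, 0 elsewhere."""
    return [
        [1 if value >= saturation_value else 0 for value in row]
        for row in image
    ]


# Hypothetical 2x3 captured image normalized so that saturation occurs at 1.0.
captured = [[0.2, 1.0, 1.0],
            [0.1, 0.7, 1.0]]
saturation_map = brightness_saturation_map(captured, 1.0)
```

The map would then be supplied to the model together with the captured image, for example as an additional input channel.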

Next, the saturation influence map will be described. Even when a brightness saturation map is used, the machine learning model may make incorrect determinations. For example, when the region of interest is near a brightness-saturated region, the machine learning model can determine that the region of interest is affected by brightness saturation, because a brightness-saturated region lies nearby. However, when the region of interest is far from any brightness-saturated region, it is not easy to determine whether it is affected by brightness saturation, and the ambiguity is high. As a result, the machine learning model may make erroneous determinations at positions far from brightness-saturated regions. When the task is blur sharpening, this causes sharpening suited to a saturated blurred image to be applied to an unsaturated blurred image. Artifacts then appear in the blur-sharpened image, and the accuracy of the task decreases. It is therefore preferable to have the machine learning model generate a saturation influence map from the blurred captured image.

A saturation influence map is a map (a spatially arranged signal sequence) that represents the magnitude and extent of the signal values that have spread, because of blur, from subjects in the brightness-saturated regions of the captured image. Having the machine learning model generate a saturation influence map allows the model to estimate with high accuracy whether brightness saturation affects each part of the captured image and how strongly. With the saturation influence map generated, the model can apply the processing meant for regions affected by brightness saturation and the processing meant for the other regions each to the appropriate regions. Consequently, task accuracy improves compared with the case where no saturation influence map is generated (that is, where only a recognition label or a blur-sharpened image is generated directly from the captured image).

上記２つの手法は有効ではあるが、弊害を完全に消すことは難しい。そこで本実施例は、撮像画像の撮像に用いた光学系の絞り値に関する情報と、撮像画像の輝度値に関する情報とに基づいて重みマップを生成し、撮像画像と推定画像とを加重平均する。これにより、光学系の絞り値と撮像画像の輝度値に起因する弊害を抑制しつつ、ぼけの補正効果を保つことが可能になる。 While the above two methods are effective, it is difficult to completely eliminate the adverse effects. Therefore, in this embodiment, a weight map is generated based on information about the aperture value of the optical system used to capture the captured image and information about the luminance value of the captured image, and a weighted average is calculated between the captured image and the estimated image. This makes it possible to maintain the blur correction effect while suppressing the adverse effects caused by the aperture value of the optical system and the luminance value of the captured image.

次に、図７を参照して、学習装置１０１で実行される機械学習モデルの学習に関して説明する。図７は、機械学習モデルの学習のフローチャートである。学習装置１０１は、記憶部１０１ａ、取得部１０１ｂ、演算部１０１ｃ、および更新部１０１ｄを有し、図７の各ステップは、主に、学習装置１０１の各部により実行される。 Next, with reference to Figure 7, we will explain the learning of the machine learning model executed by the learning device 101. Figure 7 is a flowchart of learning of the machine learning model. The learning device 101 has a memory unit 101a, an acquisition unit 101b, a calculation unit 101c, and an update unit 101d, and each step in Figure 7 is mainly executed by each unit of the learning device 101.

まずステップＳ１０１において、取得部１０１ｂは、記憶部１０１ａから１枚以上の原画像を取得する。原画像は、第２の信号値より高い信号値を有する画像である。第２の信号値は、撮像画像の輝度飽和値に相当する信号値である。ただし、機械学習モデルに入力する際、信号値を規格化してもよいため、必ずしも第２の信号値と撮像画像の輝度飽和値が一致する必要はない。原画像を基にして機械学習モデルの学習を行うため、原画像は様々な周波数成分（異なる向きと強度のエッジ、グラデーション、平坦部など）を有する画像であることが望ましい。原画像は実写画像でもよいし、ＣＧ（ＣｏｍｐｕｔｅｒＧｒａｐｈｉｃｓ）でもよい。 First, in step S101, the acquisition unit 101b acquires one or more original images from the storage unit 101a. The original image is an image having a signal value higher than the second signal value. The second signal value is a signal value corresponding to the luminance saturation value of the captured image. However, since the signal value may be normalized when input into the machine learning model, the second signal value does not necessarily have to match the luminance saturation value of the captured image. Since the machine learning model is trained based on the original image, it is desirable that the original image be an image having various frequency components (edges of different directions and intensities, gradations, flat areas, etc.). The original image may be a real-life image or CG (Computer Graphics).

Next, in step S102, the calculation unit 101c applies blur to the original image to generate a blurred image. The blurred image is the image input to the machine learning model during training and corresponds to the captured image during estimation. The applied blur is the blur to be sharpened. In this embodiment, the blur caused by the aberration and diffraction of the optical system 102a and by the optical low-pass filter of the image sensor 102b is applied. The shape of the blur caused by the aberration and diffraction of the optical system 102a varies with the image plane coordinates (image height and azimuth). It also varies with the zoom, aperture, and focus states of the optical system 102a. To train a single machine learning model that sharpens all of these blurs, multiple blurred images should be generated using multiple blurs that occur in the optical system 102a. In the blurred image, signal values exceeding the second signal value are clipped. This reproduces the brightness saturation that occurs when a captured image is acquired. If necessary, noise generated by the image sensor 102b may be added to the blurred image.
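Step S102 can be sketched in one dimension as a convolution with a PSF followed by clipping. The three-tap PSF, the zero padding at the edges, the normalized second signal value of 1.0, and all names are assumptions for illustration; the actual blur of the optical system 102a is two-dimensional and varies with position and lens state.

```python
SECOND_SIGNAL_VALUE = 1.0  # assumed normalized brightness saturation value


def blur_1d(signal, psf):
    """Convolve a 1-D signal with an odd-length PSF, treating outside samples as zero."""
    half = len(psf) // 2
    blurred = []
    for i in range(len(signal)):
        acc = 0.0
        for j, p in enumerate(psf):
            k = i + half - j  # index of the input sample weighted by psf[j]
            if 0 <= k < len(signal):
                acc += p * signal[k]
        blurred.append(acc)
    return blurred


def clip(signal, max_value):
    """Clip signal values that exceed max_value."""
    return [min(value, max_value) for value in signal]


# Hypothetical original image: one point far brighter than the saturation value.
original = [0.0, 0.0, 8.0, 0.0, 0.0]
psf = [0.25, 0.5, 0.25]
blurred_image = clip(blur_1d(original, psf), SECOND_SIGNAL_VALUE)
```

In this example the single bright sample spreads to its neighbors before clipping, so three samples of the blurred image end up saturated.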

続いてステップＳ１０３において、演算部１０１ｃは、原画像に基づく画像と信号値の閾値とに基づいて、第１の領域を設定する。本実施例では、原画像に基づく画像として、ぼけ画像を用いるが、原画像そのものなどを用いてもよい。ぼけ画像の信号値と、信号値の閾値と、を比較することで、第１の領域を設定する。より具体的には、ぼけ画像の信号値が、信号値の閾値以上となっている領域を第１の領域とする。本実施例において、信号値の閾値は第２の信号値である。故に、第１の領域は、ぼけ画像の輝度飽和した領域（飽和領域）を表す。ただし、信号値の閾値と第２の信号値は、一致しなくてもよい。信号値の閾値を、第２の信号値よりやや小さい値（例えば、０．９倍）に設定してもよい。 Next, in step S103, the calculation unit 101c sets a first region based on an image based on the original image and a signal value threshold. In this embodiment, a blurred image is used as the image based on the original image, but the original image itself may also be used. The first region is set by comparing the signal value of the blurred image with the signal value threshold. More specifically, the first region is defined as the region where the signal value of the blurred image is equal to or greater than the signal value threshold. In this embodiment, the signal value threshold is the second signal value. Therefore, the first region represents a brightness-saturated region of the blurred image (saturated region). However, the signal value threshold and the second signal value do not have to match. The signal value threshold may be set to a value slightly smaller than the second signal value (for example, 0.9 times).

続いてステップＳ１０４において、演算部１０１ｃは、第１の領域に原画像の信号値を有する第１の領域画像を生成する。第１の領域画像は、第１の領域以外の領域において、原画像とは異なる信号値を有する。さらに望ましくは、第１の領域画像は、第１の領域以外の領域において、第１の信号値を有する。本実施例において、第１の信号値は０であるが、発明はこれに限定されない。本実施例では、第１の領域画像は、ぼけ画像が輝度飽和した領域のみに原画像の信号値を有し、それ以外の領域の信号値は０である。 Next, in step S104, the calculation unit 101c generates a first region image having the signal value of the original image in the first region. The first region image has a signal value different from that of the original image in regions other than the first region. More preferably, the first region image has a first signal value in regions other than the first region. In this embodiment, the first signal value is 0, but the invention is not limited to this. In this embodiment, the first region image has the signal value of the original image only in regions where the blurred image is saturated in brightness, and the signal value in other regions is 0.

続いてステップＳ１０５において、演算部１０１ｃは、第１の領域画像にぼけを付与し、飽和影響正解マップを生成する。付与されるぼけは、ぼけ画像に付与したぼけと同じである。これによって、ぼけ画像の輝度飽和した領域にある被写体から、撮像時のぼけ（劣化）によって広がった信号値の大きさと範囲を表すマップ（空間的に配列された信号列）である飽和影響正解マップが生成される。本実施例では、ぼけ画像と同様に、飽和影響正解マップを第２の信号値でクリップするが、必ずしもクリップを行う必要はない。 Next, in step S105, the calculation unit 101c blurs the first region image and generates a saturation influence correct map. The blur applied is the same as the blur applied to the blurred image. As a result, a saturation influence correct map is generated, which is a map (spatially arranged signal sequence) that represents the magnitude and range of signal values that have expanded due to blur (deterioration) during imaging from the subject in the brightness-saturated region of the blurred image. In this embodiment, as with the blurred image, the saturation influence correct map is clipped with the second signal value, but clipping is not necessarily required.
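
Steps S103 to S105 can be sketched as follows. This is a minimal example under stated assumptions: `blur` is a hypothetical callable that applies the same blur as was used for the blurred image, and the function and parameter names are not taken from the embodiment.

```python
import numpy as np


def saturation_influence_ground_truth(original, blurred, blur, threshold,
                                      first_signal_value=0.0,
                                      second_signal_value=1.0, clip=True):
    """Build the first region, the first region image, and the ground-truth map."""
    # S103: the first region is where the blurred image reaches the threshold.
    first_region = blurred >= threshold
    # S104: keep the original signal inside the first region and use the
    # first signal value (0 in the embodiment) everywhere else.
    first_region_image = np.where(first_region, original, first_signal_value)
    # S105: apply the same blur as was applied to the blurred image.
    gt_map = blur(first_region_image)
    if clip:
        # Clipping at the second signal value is optional in the embodiment.
        gt_map = np.minimum(gt_map, second_signal_value)
    return first_region, first_region_image, gt_map
```

Setting `threshold` slightly below the second signal value (for example, 0.9 times) corresponds to the variation mentioned for step S103.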

続いてステップＳ１０６において、取得部１０１ｂは、正解モデル出力を取得する。本実施例のタスクはぼけ先鋭化のため、正解モデル出力はぼけ画像よりぼけの小さい画像である。本実施例では、原画像を第２の信号値でクリップすることで、正解モデル出力を生成する。原画像に高周波成分が不足している場合、原画像を縮小した画像を正解モデル出力としてもよい。この場合、ステップＳ１０２でぼけ画像を生成する際にも同様に縮小を行う。また、ステップＳ１０６は、ステップＳ１０１より後で、ステップＳ１０７より前であれば、いつ実行してもよい。 Next, in step S106, the acquisition unit 101b acquires the correct model output. Since the task in this embodiment is blur sharpening, the correct model output is an image with less blur than the blurred image. In this embodiment, the correct model output is generated by clipping the original image with the second signal value. If the original image lacks high-frequency components, an image obtained by reducing the original image may be used as the correct model output. In this case, reduction is also performed in the same way when generating the blurred image in step S102. Furthermore, step S106 may be executed at any time after step S101 and before step S107.

続いてステップＳ１０７において、演算部１０１ｃは、機械学習モデルを用いて、ぼけ画像に基づき、飽和影響マップとモデル出力を生成する。本実施例では、図１に示される機械学習モデルを使用するが、発明はこれに限定されない。ぼけ画像２０１と輝度飽和マップ２０２が、機械学習モデルに入力される。輝度飽和マップ２０２は、ぼけ画像２０１の輝度飽和した（信号値が第２の信号値以上である）領域を示したマップである。例えば、第２の信号値で、ぼけ画像２０１を二値化することによって生成できる。ただし、輝度飽和マップ２０２は、必ずしも必須ではない。ぼけ画像２０１と輝度飽和マップ２０２は、チャンネル方向に連結されて、機械学習モデルに入力される。ただし、発明はこれに限定されない。例えば、ぼけ画像２０１と輝度飽和マップ２０２をそれぞれ特徴マップに変換し、それらの特徴マップをチャンネル方向に連結してもよい。また、輝度飽和マップ２０２以外の情報を入力に追加してもよい。 Next, in step S107, the calculation unit 101c uses a machine learning model to generate a saturation influence map and a model output based on the blurred image. In this embodiment, the machine learning model shown in FIG. 1 is used, but the invention is not limited to this. A blurred image 201 and a brightness saturation map 202 are input to the machine learning model. The brightness saturation map 202 is a map that indicates the brightness-saturated regions of the blurred image 201 (regions where the signal value is equal to or greater than the second signal value). For example, it can be generated by binarizing the blurred image 201 with the second signal value. However, the brightness saturation map 202 is not essential. The blurred image 201 and the brightness saturation map 202 are concatenated in the channel direction and input to the machine learning model. However, the invention is not limited to this. For example, the blurred image 201 and the brightness saturation map 202 may each be converted into a feature map, and these feature maps may be concatenated in the channel direction. Furthermore, information other than the brightness saturation map 202 may be added to the input.
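
The input construction described for step S107 can be illustrated with a short sketch. The channel-first array layout `(C, H, W)` and the function name are assumptions made for this example.

```python
import numpy as np


def build_model_input(blurred, second_signal_value=1.0):
    """Binarize the blurred image and concatenate the result along the channel axis.

    blurred: array of shape (C, H, W); the channel-first layout is an assumption.
    """
    # Brightness saturation map: 1 where the signal reaches the second signal value.
    saturation_map = (blurred >= second_signal_value).astype(blurred.dtype)
    # Channel-direction concatenation yields an input with 2 * C channels.
    return np.concatenate([blurred, saturation_map], axis=0)
```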

機械学習モデルは複数の層を有し、各層で層の入力とウエイトの線型和が取られる。ウエイトの初期値は、乱数などで決定するとよい。本実施例は、線型和として入力とフィルタの畳み込み（フィルタの各要素の値がウエイトに該当。また、バイアスとの和を含んでいてもよい）を用いるＣＮＮを機械学習モデルとするが、発明はこれに限定されない。また、各層では必要に応じて、ＲｅＬＵ（ＲｅｃｔｉｆｉｅｄＬｉｎｅａｒＵｎｉｔ）やシグモイド関数などの活性化関数による非線型変換が実行される。さらに、機械学習モデルは必要に応じて、残差ブロックやＳｋｉｐＣｏｎｎｅｃｔｉｏｎ（ＳｈｏｒｔｃｕｔＣｏｎｎｅｃｔｉｏｎともいう）を有していてもよい。複数の層（本実施例では畳み込み層１６層）を介した結果、飽和影響マップ２０３が生成される。 The machine learning model has multiple layers, and at each layer, a linear sum of the layer's input and weights is calculated. The initial values of the weights can be determined using random numbers or the like. In this embodiment, the machine learning model is a CNN that uses the convolution of the input and filter as the linear sum (the value of each filter element corresponds to the weight, and may also include a sum with a bias), but the invention is not limited to this. Furthermore, at each layer, nonlinear transformations are performed using activation functions such as ReLU (Rectified Linear Unit) and sigmoid functions as necessary. Furthermore, the machine learning model may have residual blocks or skip connections (also known as shortcut connections) as necessary. A saturation influence map 203 is generated as a result of passing through multiple layers (16 convolutional layers in this embodiment).

本実施例では、層２１１の出力と輝度飽和マップ２０２の要素毎の和を取ることで飽和影響マップ２０３とするが、構成はこれに限定されない。飽和影響マップが直接、層２１１の出力として生成されてもよい。或いは、層２１１の出力に対して任意の処理を施した結果を飽和影響マップ２０３としてもよい。次に、飽和影響マップ２０３とぼけ画像２０１をチャンネル方向に連結して後続の層に入力し、複数の層（本実施例では畳み込み層１６層）を介した結果、モデル出力２０４を生成する。モデル出力２０４も、層２１２の出力とぼけ画像２０１の要素ごとの和を取ることで生成されるが、構成はこれに限定されない。なお本実施例では、各層で３×３のフィルタ６４種類（ただし、層２１１と層２１２は、フィルタ種類の数がぼけ画像２０１のチャンネル数と同数）との畳み込みを実行するが、構成はこれに限定されない。 In this embodiment, the saturation influence map 203 is generated by taking the element-by-element sum of the output of layer 211 and the intensity saturation map 202, but the configuration is not limited to this. The saturation influence map may be generated directly as the output of layer 211. Alternatively, the saturation influence map 203 may be generated by performing any processing on the output of layer 211. Next, the saturation influence map 203 and the blurred image 201 are concatenated in the channel direction and input to a subsequent layer, and model output 204 is generated after passing through multiple layers (16 convolutional layers in this embodiment). The model output 204 is also generated by taking the element-by-element sum of the output of layer 212 and the blurred image 201, but the configuration is not limited to this. Note that in this embodiment, each layer performs convolution with 64 types of 3x3 filters (however, layers 211 and 212 perform convolution with the same number of filter types as the number of channels of the blurred image 201), but the configuration is not limited to this.
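
The data flow of this two-stage configuration can be sketched as below. The callables `first_stage` and `second_stage` are hypothetical stand-ins for the two groups of convolution layers (ending in layer 211 and layer 212, respectively); they are not an implementation of the CNN itself, and the channel-first layout is an assumption.

```python
import numpy as np


def forward(blurred, saturation_map, first_stage, second_stage):
    """Data flow of the two-stage model with element-wise residual sums.

    first_stage and second_stage must each return an array with the same
    shape as the blurred image (hypothetical callables for this sketch).
    """
    x = np.concatenate([blurred, saturation_map], axis=0)
    # Output of layer 211 plus the brightness saturation map, element by element.
    saturation_influence_map = first_stage(x) + saturation_map
    y = np.concatenate([saturation_influence_map, blurred], axis=0)
    # Output of layer 212 plus the blurred image, element by element.
    model_output = second_stage(y) + blurred
    return saturation_influence_map, model_output
```

With this arrangement, a stage that outputs all zeros reproduces its residual input unchanged, so each stage only has to learn the difference from the brightness saturation map or from the blurred image.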

続いてステップＳ１０８において、更新部１０１ｄは、誤差関数に基づいて、機械学習モデルのウエイトを更新する。本実施例において、誤差関数は、飽和影響マップ２０３と飽和影響正解マップの誤差と、モデル出力２０４と正解モデル出力の誤差と、の重み付き和である。誤差の算出には、ＭＳＥ（ＭｅａｎＳｑｕａｒｅｄＥｒｒｏｒ）を使用する。重みは両者１とする。ただし、誤差関数と重みはこれに限定されない。ウエイトの更新には、誤差逆伝搬法（Ｂａｃｋｐｒｏｐａｇａｔｉｏｎ）などを用いるとよい。また、誤差は残差成分に対してとってもよい。残差成分の場合、飽和影響マップ２０３と輝度飽和マップ２０２の差分成分と、飽和影響正解マップと輝度飽和マップ２０２の差分成分と、の誤差を用いる。同様に、モデル出力２０４とぼけ画像２０１の差分成分と、正解モデル出力とぼけ画像２０１の差分成分と、の誤差を用いる。 Next, in step S108, the update unit 101d updates the weights of the machine learning model based on the error function. In this embodiment, the error function is a weighted sum of the error between the saturation influence map 203 and the saturation influence correct map and the error between the model output 204 and the correct model output. MSE (Mean Squared Error) is used to calculate the error. Both weights are set to 1. However, the error function and weights are not limited to this. Backpropagation or the like may be used to update the weights. The error may also be calculated for the residual component. For the residual component, the error between the difference component between the saturation influence map 203 and the brightness saturation map 202 and the difference component between the saturation influence correct map and the brightness saturation map 202 is used. Similarly, the error between the difference component between the model output 204 and the blurred image 201 and the difference component between the correct model output and the blurred image 201 is used.
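
The error function of step S108 can be written compactly as follows. This is a sketch with hypothetical function and argument names; the default weights of 1 follow the embodiment.

```python
import numpy as np


def mse(a, b):
    """Mean squared error between two arrays of the same shape."""
    return float(np.mean((a - b) ** 2))


def training_loss(sat_map, sat_map_gt, output, output_gt,
                  w_map=1.0, w_out=1.0):
    """Weighted sum of the saturation-map error and the model-output error."""
    return w_map * mse(sat_map, sat_map_gt) + w_out * mse(output, output_gt)
```

For plain MSE, taking the error on the residual components gives mathematically the same value, because the term subtracted from both inputs (the brightness saturation map or the blurred image) cancels in the difference.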

続いてステップＳ１０９において、更新部１０１ｄは、機械学習モデルの学習が完了したか否かを判定する。学習の完了は、ウエイトの更新の反復回数が既定の回数に達したかや、更新時のウエイトの変化量が既定値より小さいかなどによって、判定することができる。ステップＳ１０９にて学習が完了していないと判定された場合、ステップＳ１０１へ戻り、取得部１０１ｂは１枚以上の新たな原画像を取得する。一方、学習が完了したと判定された場合、更新部１０１ｄは学習を終了し、機械学習モデルの構成とウエイトの情報を記憶部１０１ａに記憶する。 Next, in step S109, the update unit 101d determines whether learning of the machine learning model is complete. Completion of learning can be determined by, for example, whether the number of iterations of weight update has reached a predetermined number, or whether the amount of change in weight during update is smaller than a predetermined value. If it is determined in step S109 that learning is not complete, the process returns to step S101, and the acquisition unit 101b acquires one or more new original images. On the other hand, if it is determined that learning is complete, the update unit 101d ends learning and stores the configuration of the machine learning model and weight information in the storage unit 101a.
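
The completion check of step S109 could look like the sketch below. How the amount of weight change is measured is not specified in the embodiment, so `weight_change` is treated as an already computed scalar, and all names are hypothetical.

```python
def training_complete(iteration, max_iterations, weight_change, tolerance):
    """Return True when either stopping criterion of step S109 is met."""
    reached_limit = iteration >= max_iterations  # predetermined number of updates
    converged = weight_change < tolerance        # change smaller than a set value
    return reached_limit or converged
```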

以上の学習方法によって、機械学習モデルは、ぼけ画像（推定時には撮像画像）の輝度飽和した領域の被写体がぼけによって広がった信号値の大きさと範囲を表す飽和影響マップを推定することができる。飽和影響マップを明示的に推定することで、機械学習モデルは、飽和ぼけ像と非飽和ぼけ像それぞれに対するぼけの先鋭化を、適切な領域に実行できるようになるため、アーティファクトの発生が抑制される。 With the above learning method, the machine learning model can estimate a saturation influence map, which represents the magnitude and range of the signal values that a subject in a brightness-saturated region of the blurred image (the captured image at estimation time) has spread through blur. Because the saturation influence map is estimated explicitly, the machine learning model can apply the blur sharpening for saturated blurred images and for non-saturated blurred images to the appropriate regions, which suppresses the occurrence of artifacts.

次に、図８を参照して、画像処理装置１０３で実行される、学習済みの機械学習モデルを用いた撮像画像のぼけ先鋭化に関して説明する。図８は、モデル出力の生成のフローチャートである。画像処理装置１０３は、記憶部１０３ａ、取得部１０３ｂ、および先鋭化部１０３ｃを有し、図８の各ステップは、主に、画像処理装置１０３の各部により実行される。 Next, referring to Figure 8, we will explain the blur sharpening of a captured image using a trained machine learning model, which is executed by the image processing device 103. Figure 8 is a flowchart for generating model output. The image processing device 103 has a storage unit 103a, an acquisition unit 103b, and a sharpening unit 103c, and each step in Figure 8 is mainly executed by each unit of the image processing device 103.

まずステップＳ２０１において、取得部１０３ｂは、撮像画像と機械学習モデルを取得する。機械学習モデルの構成とウエイトの情報は、記憶部１０３ａから取得される。 First, in step S201, the acquisition unit 103b acquires a captured image and a machine learning model. Information about the configuration and weights of the machine learning model is acquired from the storage unit 103a.

続いてステップＳ２０２において、先鋭化部（第１の生成手段）１０３ｃは、機械学習モデルを用いて、補正に関する情報を生成する。本実施例において、補正に関する情報は、撮像画像から、撮像画像のぼけが先鋭化されたぼけ先鋭化画像（モデル出力）である。なお、ぼけ先鋭化画像（撮像画像を補正した画像）ではなく、ぼけ先鋭化の補正成分でもよい。機械学習モデルは、学習時と同様に、図１に示される構成を有する。学習時と同様に、撮像画像の輝度飽和した領域を表す輝度飽和マップを生成して入力し、飽和影響マップとモデル出力を生成する。例として、図１０（Ａ）および図１１（Ａ）にぼけ先鋭化画像（撮像画像）、図１０（Ｂ）および図１１（Ｂ）に飽和影響マップを示す。図１０（Ａ）、（Ｂ）は、撮像画像の平均輝度値が明るいシーンである。図１１（Ａ）、（Ｂ）は、撮像画像の平均輝度値が暗いシーンである。なお、平均輝度値の算出方法については後述する。 Next, in step S202, the sharpening unit (first generation means) 103c generates information related to the correction using a machine learning model. In this embodiment, the information related to the correction is a blur-sharpened image (model output) in which the blur of the captured image has been sharpened from the captured image. Note that instead of a blur-sharpened image (an image obtained by correcting the captured image), a blur-sharpening correction component may be used. The machine learning model has the same configuration as during training as shown in FIG. 1. As with training, a brightness saturation map representing brightness-saturated areas of the captured image is generated and input, and a saturation influence map and model output are generated. As examples, FIGS. 10(A) and 11(A) show a blur-sharpened image (captured image), and FIGS. 10(B) and 11(B) show saturation influence maps. FIGS. 10(A) and (B) show a scene in which the average brightness value of the captured image is bright. FIGS. 11(A) and (B) show a scene in which the average brightness value of the captured image is dark. Note that the method of calculating the average brightness value will be described later.

次に、図９を参照して、画像処理装置１０３で実行される、撮像画像とモデル出力との合成に関して説明する。図９は、先鋭化の強度調整のフローチャートである。図９の各ステップは、主に、画像処理装置１０３の各部により実行される。 Next, referring to Figure 9, we will explain the synthesis of a captured image and model output, which is performed by the image processing device 103. Figure 9 is a flowchart for adjusting the sharpening strength. Each step in Figure 9 is mainly performed by each unit of the image processing device 103.

まずステップＳ２１１において、取得部１０３ｂは、撮像画像から撮像状態を取得する。撮像状態とは、例えば、光学系１０２ａの絞り値（Ｆ値）と、撮像素子１０２ｂの画素ピッチである。撮像画像における光芒とエアリーディスクの見え方は、光学系１０２ａの絞り値と、撮像素子１０２ｂの画素ピッチに依存する。具体的には、光学系１０２ａの絞り値が大きくなるほど、そして画素ピッチが小さくなるほど、光芒とエアリーディスクは目立つようになる。そのため、ステップＳ２１１にて取得した絞り値と画素ピッチに応じて、重みマップを生成する。 First, in step S211, the acquisition unit 103b acquires the imaging state from the captured image. The imaging state is, for example, the aperture value (F-number) of the optical system 102a and the pixel pitch of the image sensor 102b. The appearance of rays of light (starburst streaks) and the Airy disk in the captured image depends on the aperture value of the optical system 102a and the pixel pitch of the image sensor 102b. Specifically, the larger the aperture value of the optical system 102a and the smaller the pixel pitch, the more noticeable the rays of light and the Airy disk become. Therefore, a weight map is generated according to the aperture value and pixel pitch acquired in step S211.

続いてステップＳ２１２において、取得部１０３ｂは、撮像画像の輝度値に関する情報を取得する。ここで、撮像画像の輝度値に関する情報とは、撮像画像の輝度値に関する統計量であって、撮像画像の輝度値の平均値、中央値、分散、またはヒストグラムの少なくとも一つである。また、撮像画像の輝度値に関する情報である撮像画像の輝度値に関する統計量は、撮像画像全体に関するものでもよいし、撮像画像を分割した領域毎の統計量を用いてもよい。 Next, in step S212, the acquisition unit 103b acquires information related to the luminance values of the captured image. Here, the information related to the luminance values of the captured image is a statistical quantity related to the luminance values of the captured image, and is at least one of the mean, median, variance, or histogram of the luminance values of the captured image. Furthermore, the statistical quantity related to the luminance values of the captured image, which is information related to the luminance values of the captured image, may be related to the entire captured image, or statistics for each region obtained by dividing the captured image may be used.

本実施例では、撮像画像の輝度値に関する情報として、撮像画像の平均輝度値を取得する。ただし、撮像画像の平均輝度値を取得する場合、撮像画像に飽和領域が多く存在すると、夜景のような暗い画像であっても平均輝度値としては大きい値が取得され、明るい画像であると判定される場合がある。したがって、撮像画像から飽和影響マップを除いた第２の画像の平均輝度値を取得することが好ましい。この場合、撮像画像の輝度値に関する情報は、撮像画像から飽和影響マップを除いた第２の画像の輝度値に関する統計量に基づいて生成される。第２の画像において、撮像画像で飽和していない領域の平均輝度値を取得することで、撮像画像が明るいシーンなのか暗いシーンなのかを適切に判定することができる。なお、飽和影響マップの使用は必須ではなく、飽和領域に関する情報であればよい。例えば、撮像画像の輝度飽和に関する撮像画像から輝度飽和マップを除いて第２の画像を生成してもよい。 In this embodiment, the average luminance value of the captured image is acquired as the information regarding the luminance values of the captured image. However, if the captured image contains many saturated regions, a large average luminance value may be obtained even for a dark image such as a night scene, and the image may be judged to be bright. It is therefore preferable to acquire the average luminance value of a second image obtained by excluding the saturation influence map from the captured image. In this case, the information regarding the luminance values of the captured image is generated based on a statistic of the luminance values of the second image. Acquiring, in the second image, the average luminance value of the regions that are not saturated in the captured image makes it possible to determine appropriately whether the captured image shows a bright scene or a dark scene. Note that use of the saturation influence map is not essential; any information regarding the saturated regions will suffice. For example, the second image may be generated by excluding from the captured image the brightness saturation map, which relates to the brightness saturation of the captured image.
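
One possible reading of this step is sketched below. The embodiment does not specify how the saturation influence map is "excluded" from the captured image; this example assumes that pixels flagged by the saturation information are simply masked out before averaging. The function name, the `eps` threshold, and the fallback to the plain mean are all assumptions.

```python
import numpy as np


def mean_luminance_excluding_saturation(luminance, saturation_info, eps=0.0):
    """Average luminance over pixels not affected by saturation.

    saturation_info: saturation influence map or brightness saturation map,
    with the same shape as luminance; values above eps mark affected pixels.
    """
    keep = saturation_info <= eps
    if not np.any(keep):
        # Assumption: fall back to the plain mean if every pixel is affected.
        return float(np.mean(luminance))
    return float(np.mean(luminance[keep]))
```

For a night scene with a few bright light sources, the masked mean stays low even though the plain mean is pulled up by the saturated pixels.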

続いてステップＳ２１３において、先鋭化部１０３ｃは、光学系１０２ａの絞り値に関する情報と撮像画像の輝度値に関する情報とに基づいて、重みマップを生成する。本実施例において、光学系１０２ａの絞り値がＦ２２（所定の絞り値）以上、かつ撮像素子１０２ｂの画素ピッチが６μｍ（所定の画素ピッチ）未満の場合、先鋭化部１０３ｃは強度調整を実行する。その他の場合、先鋭化部１０３ｃは、撮像画像の重みが全て０になる重みマップを生成する。または、重みマップを生成せずにステップＳ２１２～Ｓ２１５の処理を省略してもよい。なお、強度調整を実行する絞り値（所定の絞り値）と画素ピッチの閾値（所定の画素ピッチ）は、任意の数値に決定が可能である。例えば、光学系１０２ａの絞り値がＦ１６以上、かつ撮像素子１０２ｂの画素ピッチが４μｍ未満の場合に強度調整を実行してもよい。なお本実施例において、重みマップは、画素ピッチを考慮することなく、絞り値に基づいて生成されてもよい。 Next, in step S213, the sharpening unit 103c generates a weight map based on information related to the aperture value of the optical system 102a and information related to the luminance values of the captured image. In this embodiment, if the aperture value of the optical system 102a is F22 (predetermined aperture value) or greater and the pixel pitch of the image sensor 102b is less than 6 μm (predetermined pixel pitch), the sharpening unit 103c performs intensity adjustment. In other cases, the sharpening unit 103c generates a weight map in which all weights of the captured image are 0. Alternatively, the processing of steps S212 to S215 may be omitted without generating a weight map. The aperture value (predetermined aperture value) and pixel pitch threshold (predetermined pixel pitch) used to perform intensity adjustment can be set to any numerical value. For example, intensity adjustment may be performed if the aperture value of the optical system 102a is F16 or greater and the pixel pitch of the image sensor 102b is less than 4 μm. In this embodiment, the weight map may be generated based on the aperture value without considering the pixel pitch.
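
The condition for performing intensity adjustment in this embodiment can be expressed as a small predicate. The function and parameter names are hypothetical; the default thresholds (F22 and 6 μm) are the values given above, and both can be changed.

```python
def should_adjust_intensity(f_number, pixel_pitch_um,
                            min_f_number=22.0, max_pixel_pitch_um=6.0):
    """True when the aperture value is at or above the predetermined aperture
    value and the pixel pitch is below the predetermined pixel pitch."""
    return f_number >= min_f_number and pixel_pitch_um < max_pixel_pitch_um
```

For example, `should_adjust_intensity(16.0, 3.9, min_f_number=16.0, max_pixel_pitch_um=4.0)` corresponds to the alternative thresholds of F16 and 4 μm.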

また本実施例において、強度調整は、所定の絞り値（例えばＦ２２）を基準として実行されるか否かに限定されるものではない。例えば、絞り値に基づいて撮像画像の重みが連続的に（段階的に）変化する重みマップを用いてもよい。すなわち、絞り値が、第１の絞り値、または第１の絞り値よりも大きい第２の絞り値に設定可能である場合、重みマップは、第１の絞り値よりも第２の絞り値のほうが撮像画像の重みが大きいデータである。 Furthermore, in this embodiment, the intensity adjustment is not limited to whether it is performed based on a predetermined aperture value (e.g., F22). For example, a weight map in which the weight of the captured image changes continuously (in stages) based on the aperture value may be used. In other words, if the aperture value can be set to a first aperture value or a second aperture value greater than the first aperture value, the weight map contains data in which the weight of the captured image is greater for the second aperture value than for the first aperture value.

次に、輝度値に基づく重みマップについて説明する。本実施例において、重みマップは、輝度値が小さいほど撮像画像の重みが大きいデータである。例えば、撮像画像の輝度値に関する情報として撮像画像の平均輝度値を用いる場合、平均輝度値が小さいほど、撮像画像の重みが大きくなるようにする。具体的には、平均輝度値に対する重みマップの重みとの関係を一次関数として保持しておき、撮像画像の平均輝度値に応じた重みを取得して、重みマップを生成する。 Next, we will explain the weight map based on brightness values. In this embodiment, the weight map is data in which the smaller the brightness value, the greater the weight of the captured image. For example, if the average brightness value of the captured image is used as information related to the brightness value of the captured image, the smaller the average brightness value, the greater the weight of the captured image. Specifically, the relationship between the weight of the weight map and the average brightness value is stored as a linear function, and a weight corresponding to the average brightness value of the captured image is obtained to generate the weight map.

例えば、図１２に示されるような関係式から取得された、平均輝度値に対応する撮像画像の重みを用いる。図１２は重みマップの説明図であり、横軸は平均輝度値、縦軸は撮像画像の重みをそれぞれ示す。ただし、平均輝度値に対する重みマップの調整値との関係はこれに限定されない。図１２は未現像のＲＡＷ画像における平均輝度値を示すが、現像後の平均輝度値を使用してもよい。ＲＡＷ画像における平均輝度値を算出する際は、撮像素子１０２ｂにおけるオプティカルブラック領域における信号値を減算してから平均輝度値を算出することが好ましい。これにより、ＩＳＯ感度や撮像素子１０２ｂに依存しない平均輝度値の算出が可能となる。また、平均輝度値を、撮像画像を分割した領域毎に取得した場合、各領域の平均輝度値から重みを取得して、重みマップを生成してよい。この詳細は、実施例２において後述する。 For example, the weight of the captured image corresponding to the average luminance value obtained from the relational expression shown in FIG. 12 is used. FIG. 12 is an explanatory diagram of a weight map, with the horizontal axis representing the average luminance value and the vertical axis representing the weight of the captured image. However, the relationship between the average luminance value and the weight map adjustment value is not limited to this. While FIG. 12 shows the average luminance value of an undeveloped RAW image, the average luminance value after development may also be used. When calculating the average luminance value of a RAW image, it is preferable to subtract the signal value in the optical black region of the image sensor 102b before calculating the average luminance value. This makes it possible to calculate an average luminance value that is independent of the ISO sensitivity or the image sensor 102b. Furthermore, if the average luminance value is obtained for each region into which the captured image is divided, a weight may be obtained from the average luminance value of each region to generate a weight map. Details of this will be described later in Example 2.
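
A linear relation of this kind can be sketched as follows. The actual line in FIG. 12 is not reproduced here: the breakpoints `low` and `high`, the weight limits, and the clamping outside the breakpoints are hypothetical parameters chosen for illustration.

```python
def captured_image_weight(mean_raw, optical_black, low, high,
                          w_max=1.0, w_min=0.0):
    """Map an average RAW luminance value to the weight of the captured image.

    The weight falls linearly from w_max at `low` to w_min at `high`
    (hypothetical breakpoints) and is clamped outside that range.
    """
    # Subtracting the constant optical black level from the mean is
    # equivalent to subtracting it from every pixel before averaging.
    mean = mean_raw - optical_black
    if mean <= low:
        return w_max
    if mean >= high:
        return w_min
    t = (mean - low) / (high - low)
    return w_max + t * (w_min - w_max)
```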

なお、撮像画像の信号値に関する統計量として、撮像画像の輝度値の平均値、中央値、分散、またはヒストグラムを取得した場合にも、統計量が小さいほど撮像画像の重みが大きくなるように、重みマップを生成する。例えば、撮像画像の輝度値のヒストグラムを用いる場合、ヒストグラムの重心やピークが小さいほど撮像画像の重みが大きくなるように、重みマップを生成する。 In addition, even when the average, median, variance, or histogram of the brightness values of the captured image is obtained as a statistical quantity related to the signal values of the captured image, a weight map is generated so that the smaller the statistical quantity, the greater the weight of the captured image. For example, when a histogram of the brightness values of the captured image is used, a weight map is generated so that the smaller the center of gravity or peak of the histogram, the greater the weight of the captured image.

続いて、図９のステップＳ２１４において、先鋭化部１０３ｃは、撮像画像の飽和領域に関する情報に基づいて重みマップを調整する。本実施例において、飽和領域に関する情報は飽和影響マップであり、ＲＧＢの全てが飽和していなくてもよい。また、飽和影響マップは、設定した信号値で０から１に規格化し、これを重みマップの調整に使用する。具体的には、ステップＳ２１３にて生成された重みマップに規格化後の重みマップを適用（乗算）する。これにより、飽和領域影響にステップＳ２１３で生成した重みマップの重みが作用することになり、光芒とエアリーディスクが目立ちやすい飽和影響領域のみ強度調整が可能になる。なお、飽和影響マップの適用は必須ではなく、撮像画像全体を強度調整の対象としてもよい。飽和影響マップを使用することで、飽和影響領域まで補正強度の調整が可能となる。なお、飽和影響マップではなく、輝度飽和マップでもいいし、輝度飽和マップを像高ごとにぼかしたものを使用してもよい。 Next, in step S214 of FIG. 9, the sharpening unit 103c adjusts the weight map based on information about the saturated regions of the captured image. In this embodiment, the information about the saturated regions is the saturation influence map, and not all of the RGB channels need to be saturated. The saturation influence map is normalized to the range 0 to 1 using a set signal value and is then used to adjust the weight map. Specifically, the normalized saturation influence map is applied to (multiplied by) the weight map generated in step S213. As a result, the weights of the weight map generated in step S213 act on the saturation-affected regions, so that the intensity can be adjusted only in those regions, where rays of light and the Airy disk are most noticeable. Applying the saturation influence map is not essential; the entire captured image may be subject to intensity adjustment. Using the saturation influence map extends the adjustment of the correction intensity to the saturation-affected regions. Instead of the saturation influence map, a brightness saturation map may be used, or a brightness saturation map blurred according to image height may be used.
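
The adjustment in step S214 can be sketched as an element-wise product. This example assumes that the map multiplied into the weight map is the normalized saturation information, and that normalization divides by the set signal value and clamps the result to the range 0 to 1; the names are hypothetical.

```python
import numpy as np


def adjust_weight_map(weight_map, saturation_influence_map, norm_value):
    """Restrict the weight map to saturation-affected regions (step S214)."""
    # Normalize to [0, 1] with the set signal value (clamping is an assumption).
    normalized = np.clip(saturation_influence_map / norm_value, 0.0, 1.0)
    # Element-wise product: the weight remains only where saturation has influence.
    return weight_map * normalized
```

Passing a brightness saturation map, or a blurred version of it, in place of the saturation influence map gives the alternatives mentioned at the end of the step.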

例として、図１３（Ａ）、（Ｂ）に最終的な重みマップを示す。本実施例において、重みマップは、画素値が大きいほど撮像画像の重みが大きく、画素値が小さいほど撮像画像の重みが小さい。図１３（Ａ）は、平均輝度値の明るいシーンを示し、図１３（Ｂ）は平均輝度値の暗いシーンを示す。弊害の目立ちやすい領域である、図１３（Ｂ）の飽和影響領域において、撮像画像の重みが大きい。 As an example, the final weight maps are shown in Figures 13(A) and (B). In this embodiment, the weight map indicates that the larger the pixel value, the greater the weight of the captured image, and the smaller the pixel value, the smaller the weight of the captured image. Figure 13(A) shows a scene with a bright average luminance value, and Figure 13(B) shows a scene with a dark average luminance value. The weight of the captured image is large in the saturation affected area in Figure 13(B), which is an area where adverse effects are easily noticeable.

重みマップは、さらに撮像画像の撮像に用いた光学系の光学性能に関する情報に基づいて生成されることが好ましい。具体的には、光学性能が低い場合には撮像画像の重みを上げ、光学性能が高い場合には撮像画像の重みを下げる。光学性能に関する情報とは、撮像画像の撮像時のズーム位置、絞り径、または被写体距離の少なくとも一つと、光学系の像高ごとのＰＳＦの信号値の大きさと範囲とに基づいて算出することができる。なお、ＰＳＦを用いることは必須ではなく、光学性能に関する情報であればよい。例えば、光学伝達関数を用いてもよい。 It is preferable that the weight map is further generated based on information regarding the optical performance of the optical system used to capture the captured image. Specifically, if the optical performance is low, the weight of the captured image is increased, and if the optical performance is high, the weight of the captured image is decreased. Information regarding optical performance can be calculated based on at least one of the zoom position, aperture diameter, or subject distance when the captured image was captured, and the magnitude and range of the PSF signal value for each image height of the optical system. Note that it is not necessary to use a PSF; any information regarding optical performance will suffice. For example, an optical transfer function may be used.

続いて、図９のステップＳ２１５において、先鋭化部（第２の生成手段）１０３ｃは、ステップＳ２１４にて生成された重みマップを用いて、撮像画像とぼけ先鋭化画像を加重平均（合成）し、強度調整画像２０５を生成する。 Next, in step S215 of FIG. 9, the sharpening unit (second generation means) 103c uses the weight map generated in step S214 to perform a weighted average (combine) of the captured image and the blur-sharpened image to generate the intensity-adjusted image 205.
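
The weighted average of step S215 can be written as follows, taking the weight map as the weight of the captured image as defined above. The function name is hypothetical.

```python
import numpy as np


def blend_images(captured, sharpened, weight_map):
    """Per-pixel weighted average of the captured and blur-sharpened images.

    weight_map holds the weight of the captured image in [0, 1]:
    1 keeps the captured image, 0 keeps the blur-sharpened image.
    """
    weight_map = np.asarray(weight_map, dtype=float)
    return weight_map * captured + (1.0 - weight_map) * sharpened
```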

以上の構成により、本実施例によれば、光学系の絞り値と撮像画像の輝度値に起因する弊害を抑制しつつ、ぼけの補正効果を保つことが可能な画像処理システムを提供することができる。 With the above configuration, this embodiment can provide an image processing system that can maintain blur correction effects while suppressing adverse effects caused by the aperture value of the optical system and the brightness value of the captured image.

次に、本発明の実施例２における画像処理システムに関して説明する。本実施例では、平均輝度値を、撮像画像を分割した領域毎に取得し、各領域で平均輝度値に応じた重みを取得して、重みマップを生成する。図１４は、本実施例における画像処理システム３００のブロック図である。図１５は、画像処理システム３００の外観図である。画像処理システム３００は、学習装置３０１、撮像装置３０２、および画像処理装置３０３を有する。学習装置３０１と画像処理装置３０３、画像処理装置３０３と撮像装置３０２はそれぞれ、有線または無線のネットワークで接続される。 Next, an image processing system according to a second embodiment of the present invention will be described. In this embodiment, an average brightness value is obtained for each region into which a captured image is divided, and a weight corresponding to the average brightness value is obtained for each region to generate a weight map. Figure 14 is a block diagram of an image processing system 300 according to this embodiment. Figure 15 is an external view of the image processing system 300. The image processing system 300 has a learning device 301, an image capturing device 302, and an image processing device 303. The learning device 301 and the image processing device 303, and the image processing device 303 and the image capturing device 302 are each connected via a wired or wireless network.

撮像装置３０２は、光学系３２１、撮像素子３２２、記憶部３２３、通信部３２４、および表示部３２５を有する。撮像画像は、通信部３２４を介して画像処理装置３０３へ送信される。画像処理装置３０３は、通信部３３２を介して撮像画像を受信し、記憶部３３１に記憶された機械学習モデルの構成とウエイトの情報を用いて、ぼけ先鋭化を行う。機械学習モデルの構成とウエイトの情報は、学習装置３０１によって学習されたものであり、予め学習装置３０１から取得され、記憶部３３１に記憶されている。さらに、画像処理装置３０３は、ぼけ先鋭化の強度を調整する機能を有する。撮像画像のぼけが先鋭化されたぼけ先鋭化画像（モデル出力）および強度が調整された強度調整画像は、撮像装置３０２に送信され、記憶部３２３に記憶、表示部３２５に表示される。 The imaging device 302 has an optical system 321, an image sensor 322, a memory unit 323, a communication unit 324, and a display unit 325. The captured image is transmitted to the image processing device 303 via the communication unit 324. The image processing device 303 receives the captured image via the communication unit 332 and performs blur sharpening using information about the machine learning model configuration and weights stored in the memory unit 331. The information about the machine learning model configuration and weights is learned by the learning device 301, acquired in advance from the learning device 301, and stored in the memory unit 331. The image processing device 303 also has a function for adjusting the intensity of blur sharpening. A blur-sharpened image (model output) in which the blur of the captured image has been sharpened, and an intensity-adjusted image in which the intensity has been adjusted, are transmitted to the imaging device 302, stored in the memory unit 323, and displayed on the display unit 325.

学習装置３０１で行う学習データの生成とウエイトの学習（学習フェーズ）と画像処理装置３０３で実行される、なお、学習済みの機械学習モデルを用いた撮像画像のぼけ先鋭化（推定フェーズ）は実施例１と同様のため、省略する。 The generation of training data and weight training (training phase) performed by the training device 301 and the sharpening of blur in captured images using a trained machine learning model (estimation phase) performed by the image processing device 303 are the same as in Example 1, and therefore will not be repeated.

次に、図１６を参照して、画像処理装置３０３で実行される、撮像画像とモデル出力の合成に関して説明する。図１６は、先鋭化の強度調整のフローチャートである。図１６の各ステップは、主に、画像処理装置３０３の各部により実行される。 Next, referring to Figure 16, we will explain the synthesis of a captured image and model output, which is performed by the image processing device 303. Figure 16 is a flowchart for adjusting the sharpening strength. Each step in Figure 16 is mainly performed by each unit of the image processing device 303.

まずステップＳ３１１において、取得部３３３は、撮像画像から撮像状態を取得する。撮像状態は、撮像装置３０２における光学系３２１の絞り値、および撮像素子３２２の画素ピッチの状態を含むが、これらに限定されるものではない。本実施例において、光学系３２１の絞り値がＦ１６以上、かつ撮像素子３２２の画素ピッチが４μｍ以上の場合に強度調整を実行する。その他の場合、後述する重みマップの生成において、撮像画像の重みが全て０になる重みマップを生成し、強度調整を実行しない。または、重みマップを生成せずにステップＳ３１２～Ｓ３１４の処理を省略してもよい。 First, in step S311, the acquisition unit 333 acquires the imaging state from the captured image. The imaging state includes, but is not limited to, the aperture value of the optical system 321 in the imaging device 302 and the pixel pitch state of the image sensor 322. In this embodiment, intensity adjustment is performed when the aperture value of the optical system 321 is F16 or greater and the pixel pitch of the image sensor 322 is 4 μm or greater. In other cases, when generating a weight map (described below), a weight map is generated in which all weights of the captured image are 0, and intensity adjustment is not performed. Alternatively, the processing of steps S312 to S314 may be omitted without generating a weight map.

続いてステップＳ３１２において、取得部３３３は、撮像画像の輝度値に関する情報を取得する。本実施例では、平均輝度値を、撮像画像を分割した領域毎に取得し、各領域で平均輝度値に応じた重みを取得して、重みマップを生成する。図１７（Ａ）、（Ｂ）は重みマップの説明図であり、図１７（Ａ）は撮像画像、図１７（Ｂ）は領域毎の平均輝度値をそれぞれ示す。図１７（Ｂ）において、２００１は平均輝度値の低い領域、２００２は平均輝度値の高い領域、２００３は平均輝度値の中間領域である。 Next, in step S312, the acquisition unit 333 acquires information related to the brightness values of the captured image. In this embodiment, the average brightness value is acquired for each region into which the captured image is divided, and a weight corresponding to the average brightness value is acquired for each region to generate a weight map. Figures 17(A) and (B) are explanatory diagrams of the weight map, with Figure 17(A) showing the captured image and Figure 17(B) showing the average brightness value for each region. In Figure 17(B), 2001 is a region with a low average brightness value, 2002 is a region with a high average brightness value, and 2003 is a region with an intermediate average brightness value.
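
The per-region average luminance values of step S312 can be computed as in the sketch below. It assumes a single-channel luminance image divided into a regular grid of equally sized rectangular regions whose size divides the image size evenly; the embodiment does not specify how the image is divided, and the function name is hypothetical.

```python
import numpy as np


def region_means(luminance, block_h, block_w):
    """Average luminance of each block_h x block_w region of a 2-D image."""
    h, w = luminance.shape
    if h % block_h or w % block_w:
        raise ValueError("image size must be a multiple of the region size")
    # Reshape so that each region gets its own pair of axes, then average them.
    blocks = luminance.reshape(h // block_h, block_h, w // block_w, block_w)
    return blocks.mean(axis=(1, 3))
```

Each entry of the result can then be converted to a weight for its region in the same way as the whole-image average in Example 1.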

続いて、図１６のステップＳ３１３において、先鋭化部３３４は、光学系３２１の絞り値と撮像画像の輝度値に関する情報とに基づいて、重みマップを生成する。なお、重みマップの生成方法は実施例１と同様のため、その説明を省略する。続いてステップＳ３１４において、先鋭化部３３４は、ステップＳ３１２にて生成された重みマップを用いて、撮像画像とぼけ先鋭化画像（モデル出力）を加重平均（合成）し、強度調整画像２０５を生成する。 Next, in step S313 of FIG. 16, the sharpening unit 334 generates a weight map based on the aperture value of the optical system 321 and information relating to the luminance values of the captured image. The method for generating the weight map is the same as in Example 1, so its description is omitted. Next, in step S314, the sharpening unit 334 uses the weight map generated in step S313 to perform a weighted average (combination) of the captured image and the blur-sharpened image (model output), generating the intensity-adjusted image 205.

次に、本発明の実施例３における画像処理システムに関して説明する。図１８は、本実施例における画像処理システム４００のブロック図である。図１９は、画像処理システム４００の外観図である。画像処理システム４００は、学習装置４０１、レンズ装置４０２、撮像装置４０３、制御装置（第１の装置）４０４、画像推定装置（第２の装置）４０５、およびネットワーク４０６、４０７を有する。 Next, an image processing system according to a third embodiment of the present invention will be described. Figure 18 is a block diagram of an image processing system 400 according to this embodiment. Figure 19 is an external view of the image processing system 400. The image processing system 400 includes a learning device 401, a lens device 402, an imaging device 403, a control device (first device) 404, an image estimation device (second device) 405, and networks 406 and 407.

学習装置４０１および画像推定装置４０５はそれぞれ、例えばサーバである。制御装置４０４は、パーソナルコンピュータやモバイル端末などのユーザが操作する機器である。学習装置４０１は、記憶部４０１ａ、取得部４０１ｂ、演算部４０１ｃ、および更新部４０１ｄを有し、レンズ装置４０２と撮像装置４０３を用いて撮像された撮像画像からぼけの先鋭化をする機械学習モデルのウエイトを学習する。なお、学習方法、すなわち学習装置４０１で行う学習データの生成とウエイトの学習（学習フェーズ）は、実施例１と同様のため省略する。 The learning device 401 and the image estimation device 405 are each, for example, a server. The control device 404 is a device operated by a user, such as a personal computer or a mobile terminal. The learning device 401 has a storage unit 401a, an acquisition unit 401b, a calculation unit 401c, and an update unit 401d, and learns the weights of a machine learning model that sharpens blur in a captured image taken using the lens device 402 and the imaging device 403. Note that the learning method, i.e., the generation of learning data and the learning of the weights (learning phase) performed by the learning device 401, is the same as in Example 1, and its description is therefore omitted.

撮像装置４０３は撮像素子４０３ａを有し、撮像素子４０３ａがレンズ装置４０２の形成した光学像を光電変換して撮像画像を取得する。レンズ装置４０２と撮像装置４０３とは着脱可能であり、互いに複数種類と組み合わることが可能である。制御装置４０４は、通信部４０４ａ、表示部４０４ｂ、記憶部４０４ｃ、および取得部４０４ｄを有し、有線または無線で接続された撮像装置４０３から取得した撮像画像に対して、実行する処理をユーザの操作に従って制御する。或いは、撮像装置４０３で撮像した撮像画像を予め記憶部４０４ｃに記憶しておき、撮像画像を読み出してもよい。 The imaging device 403 has an image sensor 403a, which photoelectrically converts the optical image formed by the lens device 402 to obtain a captured image. The lens device 402 and the imaging device 403 are detachable from each other, and each can be combined with multiple types of the other. The control device 404 has a communication unit 404a, a display unit 404b, a storage unit 404c, and an acquisition unit 404d, and controls, in accordance with user operations, the processing to be performed on a captured image acquired from the imaging device 403 connected by wire or wirelessly. Alternatively, a captured image taken by the imaging device 403 may be stored in advance in the storage unit 404c and read out from there.

画像推定装置４０５は、通信部４０５ａ、取得部４０５ｂ、記憶部４０５ｃ、および先鋭化部４０５ｄを有し、制御装置４０４と通信可能に構成されている。画像推定装置４０５は、ネットワーク４０６を介して接続された制御装置４０４の要求に応じて、撮像画像のぼけの先鋭化処理を実行する。画像推定装置４０５は、ネットワーク４０６を介して接続された学習装置４０１から、学習済みのウエイトの情報をぼけ先鋭化の推定時または予め取得し、撮像画像のぼけ先鋭化の推定に用いる。ぼけ先鋭化の推定後の推定画像は、先鋭化の強度調整が行われた後に再び制御装置４０４へ伝送されて、記憶部４０４ｃに記憶され、表示部４０４ｂに表示される。 The image estimation device 405 has a communication unit 405a, an acquisition unit 405b, a storage unit 405c, and a sharpening unit 405d, and is configured to be able to communicate with the control device 404. The image estimation device 405 performs blur sharpening processing on a captured image in response to a request from the control device 404, which is connected via the network 406. The image estimation device 405 acquires learned weight information from the learning device 401, which is connected via the network 406, when estimating blur sharpening or in advance, and uses this information to estimate blur sharpening on a captured image. The estimated image after blur sharpening estimation is subjected to sharpening strength adjustment, and then transmitted again to the control device 404, where it is stored in the storage unit 404c and displayed on the display unit 404b.
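The division of roles between the control device 404 and the image estimation device 405 can be sketched as below. This is only an illustrative sketch: the network is replaced by a direct method call, the trained model and the strength adjustment are passed in as placeholder callables, and all class, field, and method names (`SharpenRequest`, `ImageEstimationDevice`, `ControlDevice`, `handle`, `process`) are hypothetical and do not appear in the description.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, List

@dataclass
class SharpenRequest:
    """Request sent from the control device (captured image plus strength)."""
    captured_image: Any
    strength: float

@dataclass
class ImageEstimationDevice:
    """Stands in for the image estimation device 405 (second device)."""
    sharpen: Callable[[Any], Any]             # placeholder for the trained model
    adjust: Callable[[Any, Any, float], Any]  # placeholder for strength adjustment

    def handle(self, request: SharpenRequest) -> Any:
        # Estimate the blur-sharpened image, then adjust the sharpening strength.
        model_output = self.sharpen(request.captured_image)
        return self.adjust(request.captured_image, model_output, request.strength)

@dataclass
class ControlDevice:
    """Stands in for the control device 404 (first device)."""
    server: ImageEstimationDevice
    stored: List[Any] = field(default_factory=list)  # plays the role of storage

    def process(self, captured_image: Any, strength: float) -> Any:
        # A direct call replaces the transmission over the network (assumption).
        result = self.server.handle(SharpenRequest(captured_image, strength))
        self.stored.append(result)
        return result
```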

次に、図２０を参照して、制御装置４０４と画像推定装置４０５で実行される撮像画像のぼけ先鋭化に関して説明する。図２０は、モデル出力および先鋭化の強度調整のフローチャートである。図２０の各ステップは、主に、制御装置４０４または画像推定装置４０５の各部により実行される。 Next, referring to Figure 20, the blur sharpening of a captured image performed by the control device 404 and the image estimation device 405 will be described. Figure 20 is a flowchart of model output generation and sharpening strength adjustment. Each step in Figure 20 is performed mainly by a unit of the control device 404 or the image estimation device 405.

まずステップＳ４０１において、制御装置４０４の取得部４０４ｄは、撮像画像とユーザが指定した先鋭化の強度を取得する。続いてステップＳ４０２において、通信部（送信手段）４０４ａは、画像推定装置４０５へ撮像画像とぼけ先鋭化の推定処理の実行に関する要求を送信する。 First, in step S401, the acquisition unit 404d of the control device 404 acquires the captured image and the sharpening strength specified by the user. Next, in step S402, the communication unit (transmission means) 404a transmits the captured image and a request to execute blur sharpening estimation processing to the image estimation device 405.

続いてステップＳ４０３において、画像推定装置４０５の通信部（受信手段）４０５ａは、制御装置４０４から送信された撮像画像と処理の要求を受信し、取得する。続いてステップＳ４０４において、取得部４０５ｂは、撮像画像に対応する学習済みのウエイトの情報を記憶部４０５ｃから取得する。ウエイトの情報は、予め記憶部４０１ａから読み出され、記憶部４０５ｃに記憶されている。続いてステップＳ４０５において、先鋭化部４０５ｄは、機械学習モデルを用いて、撮像画像から、撮像画像のぼけが先鋭化されたぼけ先鋭化画像（モデル出力）を生成する。機械学習モデルは、学習時と同様に、図１に示される構成を有する。学習時と同様に、撮像画像の輝度飽和した領域を表す輝度飽和マップを生成して入力し、飽和影響マップとモデル出力を生成する。 Next, in step S403, the communication unit (receiving means) 405a of the image estimation device 405 receives and acquires the captured image and processing request transmitted from the control device 404. Next, in step S404, the acquisition unit 405b acquires learned weight information corresponding to the captured image from the storage unit 405c. The weight information is read out in advance from the storage unit 401a and stored in the storage unit 405c. Next, in step S405, the sharpening unit 405d uses the machine learning model to generate a blur-sharpened image (model output) from the captured image, in which the blur of the captured image has been sharpened. The machine learning model has the same configuration as during learning, as shown in Figure 1. As with learning, a brightness saturation map representing brightness-saturated regions of the captured image is generated and input, and a saturation influence map and model output are generated.
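The luminance saturation map that is input to the model in step S405 can be sketched as follows. This is a minimal sketch, assuming the captured image is normalized so that the saturation value is 1.0 and that the map is a binary mask; the function name `luminance_saturation_map` and the parameter `saturation_level` are hypothetical. How the map is combined with the captured image at the model input follows the learning phase of Example 1 and is not shown.

```python
import numpy as np

def luminance_saturation_map(image, saturation_level=1.0):
    """Return a binary map: 1 where the signal is saturated, 0 elsewhere.

    saturation_level = 1.0 assumes a normalized image; for raw sensor data
    the actual saturation value of the sensor would be used instead.
    """
    image = np.asarray(image, dtype=np.float64)
    return (image >= saturation_level).astype(np.float32)
```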

続いてステップＳ４０６において、先鋭化部４０５ｄは、重みマップを生成する。重みマップの生成方法は、実施例１と同様である。ユーザが指定した先鋭化の強度に合わせて、デフォルトの重みマップを調整する。なお、事前に調整済みの重みマップを強度調整可能な範囲で保持しておいてもよい。続いてステップＳ４０７において、先鋭化部４０５ｄは、重みマップに基づいて、撮像画像とぼけ先鋭化画像（モデル出力）とを合成する。続いてステップＳ４０８において、通信部４０５ａは、合成画像を制御装置４０４へ送信する。続いてステップＳ４０９において、制御装置４０４の通信部４０４ａは、画像推定装置４０５から送信された推定画像を取得する。 Next, in step S406, the sharpening unit 405d generates a weight map. The method for generating the weight map is the same as in Example 1. The default weight map is adjusted according to the sharpening strength specified by the user. Note that a pre-adjusted weight map may be stored within an adjustable strength range. Next, in step S407, the sharpening unit 405d combines the captured image with the blur-sharpened image (model output) based on the weight map. Next, in step S408, the communication unit 405a transmits the combined image to the control device 404. Next, in step S409, the communication unit 404a of the control device 404 acquires the estimated image transmitted from the image estimation device 405.
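The description does not specify how the default weight map is adjusted to the user-specified strength in step S406. One possible rule is sketched below purely as an assumption: the weight map holds the weight of the captured image in the range 0 to 1, and the user strength scales the share given to the blur-sharpened image. The function name `adjust_weight_map` and the meaning assigned to `strength` are hypothetical.

```python
import numpy as np

def adjust_weight_map(default_weight_map, strength):
    """Scale the sharpening share (1 - w) of a default weight map.

    Assumed convention: strength = 1.0 keeps the default map, strength = 0.0
    yields all ones (captured image only), and larger values sharpen more,
    up to the point where the captured image weight reaches 0.
    """
    w = np.clip(np.asarray(default_weight_map, dtype=np.float64), 0.0, 1.0)
    sharpen_share = np.clip(strength * (1.0 - w), 0.0, 1.0)
    return 1.0 - sharpen_share
```

Under this rule, the pre-adjusted weight maps mentioned in the text would correspond to evaluating the function once for each strength value within the adjustable range and storing the results.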

以上の構成により、本実施例によれば、光学系の絞り値と撮像画像の輝度値に起因する弊害を抑制しつつ、ぼけの補正効果を保つことが可能な画像処理システムを提供することができる。 With the above configuration, this embodiment can provide an image processing system that can maintain the blur correction effect while suppressing adverse effects caused by the aperture value of the optical system and the luminance value of the captured image.
（その他の実施例） (Other Examples)
本発明は、上述の実施例の１以上の機能を実現するプログラムを、ネットワーク又は記憶媒体を介してシステム又は装置に供給し、そのシステム又は装置のコンピュータにおける１つ以上のプロセッサーがプログラムを読出し実行する処理でも実現可能である。また、１以上の機能を実現する回路（例えば、ＡＳＩＣ）によっても実現可能である。そして、画像処理装置は本発明の画像処理機能を有する装置であれば足り、撮像装置やＰＣの形態で実現可能である。 The present invention can also be realized by supplying a program that realizes one or more of the functions of the above-described embodiments to a system or device via a network or a storage medium, and having one or more processors in the computer of the system or device read and execute the program. It can also be realized by a circuit (e.g., an ASIC) that realizes one or more functions. The image processing device can be any device that has the image processing function of the present invention, and can be realized in the form of an imaging device or a PC.

各実施例によれば、光学系の絞り値と撮像画像の輝度値に起因する弊害を抑制しつつ、ぼけの補正効果を保つことが可能な画像処理方法、画像処理装置、画像処理プログラム、および、記憶媒体を提供することができる。 Each embodiment provides an image processing method, image processing device, image processing program, and storage medium that can maintain blur correction effects while suppressing adverse effects caused by the aperture value of the optical system and the brightness value of the captured image.

以上、本発明の好ましい実施例について説明したが、本発明はこれらの実施例に限定されず、その要旨の範囲内で種々の変形及び変更が可能である。 The above describes preferred embodiments of the present invention, but the present invention is not limited to these embodiments and various modifications and variations are possible within the scope of the invention.

１０３画像処理装置 103 Image processing device
１０３ｃ先鋭化部（第１の生成手段、第２の生成手段） 103c Sharpening unit (first generating means, second generating means)

Claims

An image processing method comprising:
a step in which a computer generates information related to correction of a captured image based on the captured image obtained by imaging using an optical system;
a step in which a computer generates a weight map based on information about the aperture value of the optical system and information about the luminance value of the captured image; and
a step in which a computer generates a first image based on the captured image, the information related to the correction, and the weight map.

The image processing method described in claim 1, characterized in that, in the step of generating the weight map, the weight map is further generated based on information regarding saturated regions in the captured image.

The image processing method described in claim 2, characterized in that the information regarding the saturated region is information representing an area in which a subject in the saturated region of the captured image has expanded due to blurring that occurred during the capture.

The image processing method described in claim 3, characterized in that information regarding the luminance values of the captured image is generated based on statistics regarding the luminance values of a second image based on the difference between the luminance values of the captured image and the luminance values in the region.

The image processing method of claim 4, wherein the statistics include at least one of the mean, median, variance, or histogram.

An image processing method described in any one of claims 2 to 5, characterized in that the information regarding the saturated region is generated using a machine learning model.

The image processing method described in any one of claims 1 to 6, characterized in that in the step of generating the weight map, the weight map is further generated based on information regarding the optical performance of the optical system.

The image processing method described in claim 7, characterized in that the information regarding the optical performance is calculated based on at least one of the zoom position, aperture value, and subject distance at the time of capturing the image, and the point spread function for each image height of the optical system.

An image processing method according to any one of claims 1 to 8, characterized in that the information related to the correction is a corrected image obtained by performing the correction on the captured image, or a correction component which is the difference between the corrected image and the captured image.

An image processing method according to any one of claims 1 to 8, characterized in that the correction includes at least one of increasing the resolution of the captured image and converting the shape of defocus blur in the captured image.

An image processing method according to any one of claims 1 to 10, characterized in that the weight map weights the captured image more heavily as the brightness value decreases.

The image processing method according to claim 5, wherein the statistic regarding the luminance values of the second image is an average value of the luminance values in the second image, and the weight map weights the captured image more heavily as the average value decreases.

The image processing method according to claim 1, wherein the aperture value can be set to a first aperture value or a second aperture value greater than the first aperture value, and a weight in the weight map for the second aperture value is greater than a weight in the weight map for the first aperture value.

An image processing device comprising:
a first generating means for generating information related to correction of a captured image based on the captured image obtained by imaging using an optical system;
a second generating means for generating a weight map based on information about the aperture value of the optical system and information about the luminance value of the captured image; and
a third generating means for generating a first image based on the captured image, the information related to the correction, and the weight map.

An image processing system comprising the image processing device according to claim 14 and a control device capable of communicating with the image processing device, wherein
the control device has a transmission means for transmitting a request to cause the image processing device to execute processing on the captured image, and
the image processing device has a receiving means for receiving the request, and executes processing on the captured image in response to the request.

A program causing a computer to execute the image processing method described in any one of claims 1 to 13.