US12536616B2 - Image processing device and image processing method - Google Patents

Image processing device and image processing method

Info

Publication number
US12536616B2
Authority
US
United States
Prior art keywords
image data
multispectral
multispectral image
image processing
neural network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active, expires
Application number
US17/768,853
Other versions
US20240303773A1 (en
Inventor
Piergiorgio Sartor
Alexander Gatto
Takeshi Uemori
Zoltan Facius
Vincent PARRET
Ralf Müller
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sony Group Corp
Original Assignee
Sony Group Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sony Group Corp
Assigned to Sony Group Corporation (assignment of assignors' interest). Assignors: FACIUS, ZOLTAN; MUELLER, RALF; UEMORI, TAKESHI; GATTO, Alexander; PARRET, Vincent; SARTOR, PIERGIORGIO
Publication of US20240303773A1
Application granted
Publication of US12536616B2
Legal status: Active
Adjusted expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/90 Determination of colour characteristics
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00 Image enhancement or restoration
    • G06T5/50 Image enhancement or restoration using two or more images, e.g. averaging or subtraction
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00 Image enhancement or restoration
    • G06T5/60 Image enhancement or restoration using machine learning, e.g. neural networks
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10024 Color image
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10032 Satellite or aerial image; Remote sensing
    • G06T2207/10036 Multispectral image; Hyperspectral image
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20081 Training; Learning
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20084 Artificial neural networks [ANN]

Definitions

  • the present disclosure generally pertains to an image processing device and an image processing method.
  • neural networks such as Deep Neural Network (DNN) and Convolutional Neural Network (CNN) are known, and they are used in a plurality of technical fields, for example in image processing.
  • image processing devices may use DNN and CNN for image reconstruction, multispatial and multispectral image generation, object recognition and the like.
  • DNN and CNN typically have an input layer, an output layer and multiple hidden layers between the input layer and the output layer.
  • a neural network may be trained to output images having high spectral resolution or high spatial resolution, using as an input to the neural network, a color channel image, such as an RGB image (having red, green and blue color channels).
  • the disclosure provides an image processing device comprising circuitry configured to obtain input image data being represented by a number of color channels and to input the input image data into a neural network for generating output multispectral image data, wherein the neural network is configured to generate at least first and second multispectral image data on the basis of the input image data, wherein a number of spectral channels of the second multispectral image data is larger than the number of spectral channels of the first multispectral image data.
  • the disclosure provides an image processing method comprising obtaining input image data being represented by a number of color channels and inputting the input image data into a neural network for generating output multispectral image data, wherein the neural network is configured to generate at least first and second multispectral image data on the basis of the input image data, wherein a number of spectral channels of the second multispectral image data is larger than the number of spectral channels of the first multispectral image data.
  • FIG. 1 illustrates a proposed approach of multispectral image data generation from input image data represented by a number of color channels
  • FIG. 2 illustrates an exemplary optimized relationship between spectral resolution and spatial resolution of multispectral image data
  • FIG. 3 visualizes the application of a Convolutional Neural Network
  • FIG. 4 shows a block diagram of an embodiment of an image processing device
  • FIG. 5 illustrates an embodiment of a processing scheme of a learning method of a Convolutional Neural Network
  • FIG. 6 shows a block diagram of an embodiment of learning system
  • FIG. 7 shows a block diagram of an embodiment of an image processing system
  • FIG. 8 is a flowchart of an embodiment of an image processing method.
  • multispectral imaging systems and common Red Green Blue (RGB) imaging systems are used to capture and analyze images having high spectral resolution and high spatial resolution, respectively.
  • a multispectral imaging device provides higher resolved spectral information than a common RGB imaging system.
  • the analysis of a high resolved spectrum may be used in a variety of applications, such as biometrics, remote sensing, medical and food inspection.
  • a multispectral sensing device is usually more expensive than a RGB imaging device.
  • the spatial resolution of a mosaic-array multispectral sensor typically is lower than the spatial resolution of a common RGB sensor.
  • because the design costs of a common RGB sensor are usually lower than the costs of a multispectral sensor, most imaging systems focus on spatial resolution rather than spectral resolution.
  • multispectral imaging systems perform hyper/multispectral image data reconstruction from a RGB image using deep learning techniques, in order to benefit from both spatial and spectral resolution information.
  • neural networks such as Deep Neural Network (DNN) and Convolutional Neural Network (CNN) are known, and they have reached state-of-the-art level performance in many domains, such as image processing, image reconstruction, multispatial and multispectral image generation, language processing and the like.
  • CNNs are a class of DNNs that are usually applied to analyzing visual imagery.
  • CNN uses image classification algorithms for image transformation, multispatial and multispectral image generation, image classification, medical image analysis, image and video recognition, natural language processing, material classification applications (e.g. remote sensing, medical diagnosis) and the like.
  • a CNN may have an input layer and an output layer, as well as multiple hidden layers.
  • the hidden layers of a CNN typically include a number of convolutional layers, as well as pooling layers, fully connected layers and the like.
  • Each convolutional layer within a neural network usually has attributes, such as an input having shape (number of images) × (image width) × (image height) × (image depth), and a number of convolutional kernels, acting like filters, whose width and height are hyper-parameters and whose depth typically must equal that of the image.
  • the convolutional layers convolve the input and pass their result to the next layer.
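The convolution step described above can be sketched for a single image and a single kernel. This is a minimal illustration, not the patent's implementation: the function name `conv2d_valid`, the nested-list layout and the all-ones test data are assumptions made for the example. As in most CNN frameworks, the kernel is not flipped, and no padding or stride is applied.

```python
def conv2d_valid(image, kernel):
    """Slide one kernel over one image without padding.

    image  : nested list with shape [height][width][depth]
    kernel : nested list with shape [kh][kw][depth]; its depth must equal
             the image depth, as stated in the text above
    returns: nested list with shape [height - kh + 1][width - kw + 1]
    """
    height, width, depth = len(image), len(image[0]), len(image[0][0])
    kh, kw = len(kernel), len(kernel[0])
    if len(kernel[0][0]) != depth:
        raise ValueError("kernel depth must equal image depth")
    out = []
    for y in range(height - kh + 1):
        row = []
        for x in range(width - kw + 1):
            acc = 0
            # Multiply-accumulate over the kernel window and all depth channels.
            for dy in range(kh):
                for dx in range(kw):
                    for d in range(depth):
                        acc += image[y + dy][x + dx][d] * kernel[dy][dx][d]
            row.append(acc)
        out.append(row)
    return out


# Hypothetical data: a 3x3 image with 3 channels and a 2x2x3 kernel, all ones.
image = [[[1, 1, 1] for _ in range(3)] for _ in range(3)]
kernel = [[[1, 1, 1] for _ in range(2)] for _ in range(2)]
feature_map = conv2d_valid(image, kernel)  # 2x2 map, each entry 2*2*3 = 12
```

The width and height of the kernel are the hyper-parameters mentioned above; only the depth is fixed by the input.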
  • a conventional CNN is trained to reconstruct a hyper/multispectral image from an RGB image.
  • the conventional CNN may be trained to output only images with a predefined number of spectral channels, without taking into account the amount of spectral information that may be needed for different applications, target scenes, systems, user desires or the like.
  • such an approach usually requires a high computational effort as well as much memory when calculating a highly resolved multispectral image, which has a large number of spectral channels.
  • the conventional approach typically outputs a hyper/multispectral image with a predefined number of spectral channels and thus possibly with more spectral information than a target needs, or with less.
  • some embodiments pertain to an image processing device including circuitry configured to obtain input image data being represented by a number of color channels, and to input the input image data into a neural network for generating output multispectral image data, wherein the neural network is configured to generate at least first and second multispectral image data on the basis of the input image data, wherein a number of spectral channels of the second multispectral image data is larger than the number of spectral channels of the first multispectral image data.
  • the image processing device may be a digital (video) camera, a surveillance camera, a biometric device, a security camera, a medical/healthcare device, a remote sensing device, a food inspection device, an edge computing enabled image sensor, such as a smart sensor associated with a smart speaker, or the like, a motor vehicle device, a smartphone, a personal computer, a laptop computer, a wearable electronic device, electronic glasses, or the like, a circuitry, a processor, multiple processors, logic circuits or a mixture of those parts.
  • the circuitry may include one or more processors, logical circuits, memory (read only memory, random access memory, etc.), storage memory (e.g. hard disc, compact disc, flash drive, etc.), an interface for communication via a network, such as a wireless network, internet, local area network, or the like, a CMOS (Complementary Metal Oxide Semiconductor) image sensor, a CCD (Charge Coupled Device) image sensor, or the like.
  • the input image data may be generated by the image sensor, as mentioned above.
  • the input image data may be also obtained from a memory included in the device, from an external memory, etc., from an artificial image generator, created via computer generated graphics, or the like.
  • the input image data may be represented by a number of color channels, for example three color channels, such as Red, Green and Blue, or the like.
  • the input image data may also be represented for example by a small number of spectral channels.
  • the color channel of a specific color for example red, green, or blue, may include information of multiple spectral channels that corresponds to the wavelength range of red, green, or blue, respectively. That is, the color channels may be considered as an integration of the corresponding (multiple) spectral channels located in the wavelength range of the associated color channel.
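The idea that a color channel integrates the spectral channels in its wavelength range can be sketched as follows. The nine-band pixel, the grouping of three bands per color and the names `integrate_to_color` and `BAND_RANGES` are assumptions chosen for illustration; a real sensor would weight each band by its spectral sensitivity rather than sum with equal weights.

```python
def integrate_to_color(spectral_pixel, band_ranges):
    """Sum the spectral samples that fall inside each color channel's range."""
    return {
        color: sum(spectral_pixel[start:stop])
        for color, (start, stop) in band_ranges.items()
    }


# Assumed layout: band indices 0-2 lie in the blue range, 3-5 in the green
# range and 6-8 in the red range.
BAND_RANGES = {"blue": (0, 3), "green": (3, 6), "red": (6, 9)}

pixel = [1, 2, 3, 4, 5, 6, 7, 8, 9]  # hypothetical 9-channel spectral pixel
rgb = integrate_to_color(pixel, BAND_RANGES)
```

Going from three color values back to nine spectral values is the inverse of this summation and is underdetermined, which is why the disclosure uses a trained neural network for that direction.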
  • in FIG. 1 , a proposed approach for generating multispectral image data from input image data represented by a number of color channels is illustrated.
  • an image processing device acquires image data, as input image data 1 , representing an image, for example captured by a digital camera.
  • the input image data 1 are represented by a number of color channels.
  • the number of channels of the input image data 1 is three, namely Red, Green and Blue, without limiting the present disclosure to these three color channels (in principle, any number and type of color channels can be chosen).
  • the input image data 1 are input to a neural network, such as for example a CNN, for generating output multispectral image data, such as multispectral image data 2 .
  • the output multispectral image data 2 are generated from the input image data 1 and the number of spectral channels of the output multispectral image data 2 is nine (9), in this embodiment. Therefore, the input image data being represented by a number of color channels have been transformed to output multispectral image data 2 being represented by a number of spectral channels.
  • the neural network generates at least first and second multispectral image data on the basis of the input image data, wherein a number of spectral channels of the second multispectral image data is larger than the number of spectral channels of the first multispectral image data.
  • the neural network may also generate a plurality of multispectral image data on the basis of the input image data. That is, each of the plurality of multispectral image data may be followed by another multispectral image data and each of the plurality of multispectral image data may be generated on the basis of the previous generated multispectral image data.
  • multiple intermediate multispectral image data may be generated by the neural network.
  • the circuitry may be further configured to obtain the first or the second multispectral image data as the output multispectral image data.
  • the neural network may generate at least first and second multispectral image data and thus, the circuitry may obtain, as the output multispectral image data, the first multispectral image data or the second multispectral image data, based, for example, on a setting of a user, or a predetermined set up of the image processing device based on a target application.
  • the input image data may include spectral image data.
  • the spectral image data may be input image data represented by a small number of spectral channels, which may be suitable for example, for object classification or the like, using neural network.
  • the input image data may also include Red Green Blue (RGB) image data represented by a specific number of color channels, in which multiple spectral channels are integrated, as described above.
  • the number of spectral channels of the output multispectral image data is larger than the number of spectral channels of the input image data.
  • the number of spectral channels of the output multispectral image data may be nine (9), or the like. Therefore, the output multispectral image data may have, after processing, higher spectral resolution.
  • a size of the image data remains the same before and after image processing, even in the case of higher spectral resolution of the output image data after image processing.
  • a spatial resolution of the first multispectral image data may be higher than a spatial resolution of the second multispectral image data.
  • a conventional imaging device, such as a mosaic-array multispectral imaging device, using a conventional neural network usually sacrifices spatial resolution for spectral resolution, although both kinds of information benefit computational sensing applications. Therefore, it may be suitable to generate, from an RGB image or from a multispectral image represented by a small number of spectral channels, a multispectral image that has an optimized trade-off between spatial and spectral resolution for the device.
  • the output multispectral image data is generated based on a predetermined relationship between the spatial resolution and the number of spectral channels.
  • the predetermined relationship may be an optimized trade-off relationship between spectral resolution and spatial resolution.
  • the optimal point of an optimized trade-off relationship between spectral resolution and spatial resolution may depend on a system, an application, a target scene, or the like.
  • the predetermined relationship between the spatial resolution and the number of spectral channels may be determined based on a setting of a user, or a predetermined set up of the image processing device according to a target application.
  • an exemplary optimized relationship between spectral resolution and spatial resolution of multispectral image data, such as multispectral skin data, is illustrated in FIG. 2 .
  • the number of spectral channels (e.g. spectral bands) is represented on the x-axis and increases from three (3) spectral bands to three hundred (300) spectral bands.
  • the classification accuracy is represented on the y-axis and, in this embodiment, increases with the number of spectral bands up to sixteen (16) bands.
  • Dashed line 3 represents the spatial resolution of the image.
  • the optimized performance is obtained with 16 spectral channels multispectral data.
  • the best relationship between spectral resolution and spatial resolution depends on the target. For example, for other applications, the best trade-off point may be different.
  • the optimal relationship between spectral resolution and spatial resolution may change depending on the content in a scene, which may make it difficult to design an optimal multispectral sensor with the best performance.
  • the neural network may be a convolutional neural network (CNN), without limiting the present disclosure in that regard.
  • the convolutional neural network may include convolutional layers, or may also include local or global pooling layers, such as max-pooling layers, which reduce the dimensions of the image data, as it is generally known.
  • the pooling layers may be used for pooling, which is a form of non-linear down-sampling, such as spatial pooling, namely max-pooling, average pooling, sum pooling, or the like.
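Max-pooling as a form of non-linear down-sampling can be sketched on a single feature map. The function name `max_pool2d`, the 2x2 window with stride equal to the window size, and the sample values are assumptions for this example.

```python
def max_pool2d(feature_map, size=2):
    """Keep the largest value of each non-overlapping size x size window."""
    height, width = len(feature_map), len(feature_map[0])
    return [
        [
            max(
                feature_map[y + dy][x + dx]
                for dy in range(size)
                for dx in range(size)
            )
            for x in range(0, width - size + 1, size)
        ]
        for y in range(0, height - size + 1, size)
    ]


# Hypothetical 4x4 feature map; pooling halves both spatial dimensions.
fm = [
    [1, 3, 2, 0],
    [4, 2, 1, 5],
    [0, 1, 9, 2],
    [3, 2, 4, 6],
]
pooled = max_pool2d(fm)
```

Replacing `max` with an average or a sum over the same window would give the average pooling and sum pooling variants named above.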
  • the generation of the multispectral image data may be either during a training phase of a neural network, such as a CNN, or may be a generation of the multispectral image data with an already trained neural network, such as a trained CNN, for example, for extracting information from the image data (e.g. object recognition, or recognition of other information in the image data, such as spatial information, spectral information, patterns, colors, etc.).
  • the neural network may be an un-trained neural network.
  • the neural network may be part of the image processing device, e.g. stored in a storage or memory of the image processing device, or the image processing device may have access to a neural network, e.g. based on inter-processor communication, electronic bus, network (including internet), etc.
  • FIG. 3 shows generally in the first line the CNN structure, and in the second line the basic principle of building blocks.
  • the principles of a CNN and its application in imaging are generally known and, thus, they are only briefly discussed in the following with reference to FIG. 3 .
  • the input image includes for example three maps or layers (exemplary red, green and blue (RGB) color information) and N times N blocks.
  • the CNN has a convolutional layer and a subsequent pooling layer, wherein this structure can be repeated as also shown in FIG. 3 .
  • the convolutional layer includes the neurons.
  • a kernel filter
  • the pooling layer, which in the present embodiment is based on Max-Pooling (see second line, "Max-Pooling"), takes the information of the most active neurons of the convolution layer and discards the other information. After several repetitions (three in FIG. 3 ), the process ends with a fully-connected layer, which is also referred to as an affine layer.
  • the last layer includes typically a number of neurons, which corresponds to the number of object classes (output features) which are to be differentiated by the CNN.
  • the output is illustrated in FIG. 3 , first line, as an output distribution, wherein the distribution is shown by a row of columns, wherein each column represents a class and the height of the column represents the weight of the object class.
  • the different classes correspond to the output or image attribute features, which are output by the CNN.
  • the classes are, for example, “people, car, etc.” Typically several hundred or several thousand of classes can be used, e.g. also for object recognition of different objects.
  • the convolutional neural network may be trained to generate the first multispectral image data from the input image data and the second multispectral image data from the first multispectral image data.
  • the convolutional neural network may generate a plurality of multispectral image data, such as a first multispectral image data and a second multispectral image data, which is generated on the basis of first multispectral image data and which follows the first multispectral image data. That is, each of the plurality of multispectral image data may be generated on the basis of the previous generated multispectral image data and each of the plurality of multispectral image data may be followed by another multispectral image data.
  • the convolutional neural network may be trained based on RGB image data and on multispectral image data.
  • the training data of multispatial multispectral images may also be generated from high resolution hyperspectral data (the terms multispectral and hyperspectral data are generally known in the art, and they are typically differentiated by the number of spectral channels, wherein the hyperspectral data has more spectral channels than multispectral data).
  • a CNN used in image processing uses, as a training database, ground-truth image data and desired image data, for example RGB image data and multispectral image data.
  • multispectral image data represented by C channels are generated from hyperspectral image data by using the following equation:
  • $I_c = \int_{380}^{780} R(\lambda)\, L(\lambda)\, S_c(\lambda)\, d\lambda + n$
  • $I_c$ is the intensity of spectral band c (spectral channel) of a multispectral image
  • $\lambda$ is the wavelength over which is integrated
  • $R$ is the spectral reflectance of a target in a scene
  • $L$ is the spectral distribution of the illumination, e.g. white illumination, which has a flat spectral distribution over all wavelengths
  • $S_c$ is the sensor's spectral sensitivity of spectral band c
  • $n$ is the sensor noise.
  • R is obtained from hyperspectral data (HS image) measured by a hyperspectral camera
  • L can be set according to the illumination that will be used in the application, and S_c is given by the sensor specification of a camera.
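Because measured hyperspectral data are sampled at discrete wavelengths, the integral for one band intensity is in practice replaced by a sum. The sketch below uses a left Riemann sum; the function name `band_intensity`, the 10-unit wavelength step, the grey target, the flat illumination and the box-shaped sensitivity are all assumptions for illustration, and the limits 380 and 780 are taken from the equation above (treated here as nanometres).

```python
def band_intensity(reflectance, illumination, sensitivity,
                   noise=0.0, lo=380, hi=780, step=10):
    """Approximate I_c = integral of R * L * S_c over [lo, hi) plus noise n."""
    total = 0.0
    for wavelength in range(lo, hi, step):
        total += (reflectance(wavelength)
                  * illumination(wavelength)
                  * sensitivity(wavelength)
                  * step)
    return total + noise


# Hypothetical inputs, not taken from the source:
def grey_target(wavelength):       # R: reflects 50 % at every wavelength
    return 0.5

def flat_white(wavelength):        # L: flat spectral distribution
    return 1.0

def box_band(wavelength):          # S_c: sensitive only between 500 and 600
    return 1.0 if 500 <= wavelength < 600 else 0.0


intensity = band_intensity(grey_target, flat_white, box_band)
# Ten samples (500, 510, ..., 590) contribute 0.5 * 1.0 * 1.0 * 10 each.
```

Evaluating `band_intensity` once per band sensitivity yields the C channel values of one training pixel.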
  • the circuitry is further configured to perform object recognition.
  • object recognition may be performed in an autonomous vehicle application, in which the size of a pedestrian in an image may depend on the distance from the vehicle. To detect a pedestrian who is far from the vehicle, an image with higher spatial resolution may be suitable for a pedestrian detector.
  • object recognition may be performed, for example, in a hand identification application, in which a hand may make various poses. In such cases, spatial resolution is less useful than spectral resolution. Hence, the relationship between spectral resolution and spatial resolution may include a higher amount of spectral information than spatial information.
  • image processing based on multispectral and hyperspectral imaging is widely used in food industry (e.g. bruise detection of a fruit, freshness detection of a fish), material classification applications (e.g. remote sensing, medical diagnosis) and the like, and, thus, some embodiments pertain to these fields.
  • Some embodiments pertain to an image processing method, which may be performed by the image processing device described herein, or any other electronic device, processor, or other computing means or the like.
  • the method includes obtaining input image data being represented by a number of color channels and inputting the input image into a neural network for generating output multispectral image data, wherein the neural network is configured to generate at least first and second multispectral image data on the basis of the input image data, wherein a number of spectral channels of the second multispectral image data is larger than the number of spectral channels of the first multispectral image data.
  • the image processing method may further include obtaining the first or the second multispectral image data as the output multispectral image data.
  • the input image data may include spectral image data, wherein the number of spectral channels of the output multispectral image data may be larger than the number of spectral channels of the input image data.
  • a spatial resolution of the first multispectral image data may be higher than a spatial resolution of the second multispectral image data.
  • the output multispectral image data may be generated based on a predetermined relationship between the spatial resolution and the number of spectral channels.
  • the neural network may be a convolutional neural network, which may be trained to generate the first multispectral image data from the input image data and the second multispectral image data from the first multispectral image data.
  • the convolutional neural network may also be trained based on RGB image data and on multispectral image data, as discussed herein.
  • the image processing method may further include performing object recognition.
  • in FIG. 4 , a block diagram of an embodiment of an image processing device 11 is illustrated, which inputs image data into a convolutional neural network (CNN) for generating multispectral image data, as mentioned herein.
  • the image processing device 11 includes a circuitry 12 with an interface 13 , a Central Processing Unit (CPU) 14 , including multiple processors including Graphics Processing Units (GPUs), a memory 15 that includes a RAM, a ROM and a storage memory and a trained CNN 16 (which is stored in a memory).
  • the image processing device 11 acquires, through the interface 13 , image data, such as input image data 1 , being represented by a number of color channels, namely Red, Green and Blue in this embodiment.
  • the input image data 1 represent an image of a target scene that has been captured with a digital camera, such as an RGB camera (not shown).
  • the input image data 1 being represented by a number of color channels are transmitted to the CPU 14 , which inputs the input image data 1 into the CNN 16 for generating multispectral image data, being represented by a number of spectral channels.
  • the CNN 16 has been trained in advance to generate (at least) first multispectral image data and second multispectral image data on the basis of the input image data 1 .
  • the image processing device 11 is configured to obtain, as output multispectral image data, such as output multispectral image data 2 , any one of the first or the second multispectral image data generated by the CNN 16 .
  • the image processing device 11 obtains the second multispectral image data as the output multispectral image data 2 .
  • the number of spectral channels of the second multispectral image data is larger than the number of spectral channels of the first multispectral image data; for example, the number of spectral channels of the second multispectral image data is nine (9).
  • the implementation of the above described image processing device 11 may result in a reduction of computational effort and of memory usage.
  • the CNN 16 may be a single CNN, being able to generate multispatial multispectral image data from a RGB image.
  • FIG. 5 illustrates an embodiment of a processing scheme of a learning method of the CNN 16 for generating a plurality of multispectral image data 20 - 1 to 20 -N, on the basis of the input image data 1 , wherein each multispectral image data of the plurality of multispectral image data may be obtained from the image processing device 11 , as the output multispectral image data 2 of FIG. 4 .
  • the image processing device 11 inputs into the CNN 16 the input image data 1 , such as RGB image data, which are represented by a number of color channels, namely Red, Green and Blue.
  • an input image in a CNN has a shape, that is, (number of images) × (image width) × (image height) × (image depth).
  • the input image data 1 represent an input image whose height and width define its spatial resolution.
  • the height of the input image data 1 is Height 0 and the width is Width 0 .
  • the number of spectral channels of the input image data 1 is Ch 0 .
  • the convolutional layers of the CNN 16 convolve the input image data 1 , perform rectification using a Rectified Linear Unit (ReLU) and spatial pooling, which is carried out by max-pooling layers, and then pass their result to the next layer.
  • the result of the next layer is multispectral image data 20 - 1 (e.g. corresponding to first multispectral image data) being represented by six (6) spectral channels, and the multispectral image data 20 - 1 represent a multispectral image, which has a height Height 1 , a width Width 1 and a number of spectral channels Ch 1 , wherein Height 0 > Height 1 , Width 0 > Width 1 and Ch 0 < Ch 1 .
  • the result of the next layer is multispectral image data 20 - 2 (e.g. corresponding to second multispectral image data) being represented by nine (9) spectral channels, and the multispectral image data 20 - 2 represent a multispectral image, which has a height Height 2 , a width Width 2 and a number of spectral channels Ch 2 , wherein Height 0 > Height 1 > Height 2 , Width 0 > Width 1 > Width 2 and Ch 0 < Ch 1 < Ch 2 .
  • the convolution process continues as described above until the size of the multispectral image data becomes the size of the kernel of the CNN 16 .
  • the result of the last layer of the CNN 16 is multispectral image data 20 -N (e.g. corresponding to N-th multispectral data) being represented by twelve (12) spectral channels, and the multispectral image data 20 -N represent a multispectral image, which has a height Height N , a width Width N and a number of spectral channels Ch N , wherein Height 0 > Height 1 > Height 2 > . . . > Height N , Width 0 > Width 1 > Width 2 > . . . > Width N and Ch 0 < Ch 1 < Ch 2 < . . . < Ch N .
  • the CNN 16 is trained so that any one of the multispectral image data 20 - 1 to 20 -N (e.g. first to N-th multispectral data) can be obtained by the image processing device 11 as output multispectral image data 2. That is, the CNN 16 generates multiple intermediate multispectral image data at several points in the neural network. Moreover, the image processing device 11 obtains any one of the multispectral image data 20 - 1 to 20 -N with a predetermined relationship between spatial resolution and spectral resolution, depending on the application or the target scene. The predetermined relationship may be set in advance by a user; the CNN 16 then stops calculating once the predetermined relationship, which is an optimized relationship between spatial resolution and spectral resolution, is achieved.
  • a suitable multispectral image may be determined by analyzing a degree of spatial frequency of an input RGB image.
  • a multispectral image with a small number of spectral channels may be desirable.
  • spectral information may be more important for the performed application and a multispectral image with a large number of spectral channels may be desirable in some embodiments.
  • a target performance may be determined by a result of the application, e.g. reliability of object classification result.
  • the application result may not achieve the target performance when a multispectral image with a small number of spectral channels is input, and thus the CNN may continue to generate a multispectral image with a larger number of spectral channels until the application result meets the set criteria.
  • An embodiment of a learning system 30, shown as a block diagram, is illustrated in FIG. 6; the learning system generates a learning model based on which a neural network, such as the CNN 16, is trained.
  • the learning system 30 includes a memory device, such as the memory 15 of image processing device 11 , described under the reference of FIG. 4 , a RGB image generator 31 , a multispectral image generator 32 , a learning apparatus, such as the CNN 16 and a learning model 33 .
  • N and N is the number of intermediate MS images, e.g. represented by a plurality of multispectral image data, such as multispectral image data 20 - 1 to 20 -N of FIG. 5 .
  • the intermediate MS images should meet the following conditions:
  • the learning apparatus such as the CNN 16 is trained to transform multiple MS images from a RGB image by minimizing the following reconstruction loss:
  • $MS_{REC}^{(i)}$ is the i-th MS image reconstructed by the CNN
  • $MS_{GT}^{(i)}$ is the ground truth of MS image i, which is generated by the multispectral image generator 32
  • MSE is a Mean Squared Error function, without limiting the present disclosure in that regard.
  • the Mean Absolute Error function, or the like, may also be used.
  • the learning system 30 generates a learned model, which is stored into the memory 15 of image processing device 11 .
  • FIG. 7 shows a block diagram of the image processing system 40 .
  • the image processing system 40 includes an image capturing apparatus 41 , such as a camera including a RGB image sensor, the image processing device 11 , a memory 48 , for storing a database, and an information processing apparatus 44 , which includes a target area detection unit 45 , a feature extraction unit 46 and a recognition unit 47 .
  • the image processing system 40 is configured to perform object recognition of the image data provided by the image capturing apparatus 41 and processed by the image processing device 11 .
  • the image capturing apparatus 41 such as an RGB camera, captures an image, such as a RGB image, of a target scene and transmits RGB image data, representing the captured RGB image, to the image processing device 11 .
  • the image processing device 11 outputs multispectral image data 43 being generated by a trained convolutional neural network, such as the CNN 16 , which is trained based on the learned model 33 .
  • the generated multispectral image data 43 are generated also based on input information 42 that is related to a target to be recognized.
  • the generated multispectral image data 43 are transmitted to the information processing apparatus 44 , which is configured to perform object recognition.
  • the multispectral image data 43 are transmitted to the target area detection unit 45 , the feature extraction unit 46 and then to the recognition unit 47 .
  • the recognition unit 47 performs object recognition based on data, included in the database, which is stored in the memory 48 .
  • the output 49 of the image processing system 40 depends on recognition result of the target (e.g. user ID).
  • an image processing method 50 which is performed by the image processing device 11 and/or the image processing system 40 in some embodiments, is discussed under reference of FIG. 8 .
  • input image data, such as input image data 1, are obtained by the image processing device 11 and/or the image processing system 40, as discussed above.
  • the input image data may be obtained from an image sensor or from a memory included in the device, from an external memory, etc., or from an artificial image generator, created via computer generated graphics, or the like.
  • the input image data, at 52 are input into a convolutional neural network, such as CNN 16 , for generating, at 53 , output multispectral (MS) image data, such as the output multispectral image data 2 , as discussed above.
  • the input image data may be represented by a number of color channels, such as Red, Green and Blue, or may be represented by a small number of spectral channels, for example three (3), or the like.
  • the convolutional neural network generates first and second multispectral image data on the basis of the input image data.
  • a number of spectral channels of the second multispectral image data is larger than the number of spectral channels of the first multispectral image data, as discussed above.
  • the first or the second multispectral image data are obtained as output multispectral data.
  • the first multispectral image data are generated on the basis of the input image data and the second multispectral image data are generated on the basis of the first multispectral image data, as discussed herein.
  • the obtained output multispectral image data are output.
  • the method as described herein is also implemented in some embodiments as a computer program causing a computer and/or a processor to perform the method, when being carried out on the computer and/or processor.
  • a non-transitory computer-readable recording medium is provided that stores therein a computer program product, which, when executed by a processor, such as the processor described above, causes the methods described herein to be performed.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)

Abstract

An image processing device has circuitry, which is configured to obtain input image data being represented by a number of color channels and to input the input image data into a neural network for generating output multispectral image data, wherein the neural network is configured to generate at least first and second multispectral image data on the basis of the input image data, wherein a number of spectral channels of the second multispectral image data is larger than the number of spectral channels of the first multispectral image data.

Description

CROSS-REFERENCE TO RELATED APPLICATIONS
The present application is based on PCT filing PCT/EP2020/079090, filed Oct. 15, 2020, which claims priority to EP 19204783.5, filed Oct. 23, 2019, the entire contents of each are incorporated herein by reference.
TECHNICAL FIELD
The present disclosure generally pertains to an image processing device and an image processing method.
TECHNICAL BACKGROUND
Generally, neural networks, such as Deep Neural Network (DNN) and Convolutional Neural Network (CNN) are known, and they are used in a plurality of technical fields, for example in image processing. Known image processing devices may use DNN and CNN for image reconstruction, multispatial and multispectral image generation, object recognition and the like.
Moreover, DNN and CNN typically have an input layer, an output layer and multiple hidden layers between the input layer and the output layer. In image processing, a neural network may be trained to output images having high spectral resolution or high spatial resolution, using as an input to the neural network, a color channel image, such as an RGB image (having red, green and blue color channels).
Although there exist techniques for image processing, it is generally desirable to improve image processing devices and methods.
SUMMARY
According to a first aspect, the disclosure provides an image processing device comprising circuitry configured to obtain input image data being represented by a number of color channels and to input the input image data into a neural network for generating output multispectral image data, wherein the neural network is configured to generate at least first and second multispectral image data on the basis of the input image data, wherein a number of spectral channels of the second multispectral image data is larger than the number of spectral channels of the first multispectral image data.
According to a second aspect, the disclosure provides an image processing method comprising obtaining input image data being represented by a number of color channels and inputting the input image data into a neural network for generating output multispectral image data, wherein the neural network is configured to generate at least first and second multispectral image data on the basis of the input image data, wherein a number of spectral channels of the second multispectral image data is larger than the number of spectral channels of the first multispectral image data.
Further aspects are set forth in the dependent claims, the following description and the drawings.
BRIEF DESCRIPTION OF THE DRAWINGS
Embodiments are explained by way of example with respect to the accompanying drawings, in which:
FIG. 1 illustrates a proposed approach of multispectral image data generation from input image data represented by a number of color channels;
FIG. 2 illustrates an exemplary optimized relationship between spectral resolution and spatial resolution of multispectral image data;
FIG. 3 visualizes the application of a Convolutional Neural Network;
FIG. 4 shows a block diagram of an embodiment of an image processing device;
FIG. 5 illustrates an embodiment of a processing scheme of a learning method of a Convolutional Neural Network;
FIG. 6 shows a block diagram of an embodiment of a learning system;
FIG. 7 shows a block diagram of an embodiment of an image processing system; and
FIG. 8 is a flowchart of an embodiment of an image processing method.
DETAILED DESCRIPTION OF EMBODIMENTS
Before a detailed description of the embodiments under reference of FIG. 4 is given, general explanations are made.
As indicated at the outset, it is generally known that multispectral imaging systems and common Red Green Blue (RGB) imaging systems are used to capture and analyze images having high spectral resolution and high spatial resolution, respectively. Typically, a multispectral imaging device provides more highly resolved spectral information than a common RGB imaging system. The analysis of a highly resolved spectrum may be used in a variety of applications, such as biometrics, remote sensing, medical and food inspection. A multispectral sensing device is usually more expensive than an RGB imaging device.
Moreover, the spatial resolution of a mosaic-array multispectral sensor typically is lower than the spatial resolution of a common RGB sensor. However, since the design costs of a common RGB sensor, usually, are less than the costs of a multispectral sensor, most imaging systems focus on spatial resolution rather than spectral resolution.
It is known that multispectral imaging systems perform hyper/multispectral image data reconstruction from a RGB image using deep learning techniques, in order to benefit from both spatial and spectral resolution information.
As mentioned at the outset, neural networks, such as Deep Neural Networks (DNN) and Convolutional Neural Networks (CNN), are known, and they have reached state-of-the-art performance in many domains, such as image processing, image reconstruction, multispatial and multispectral image generation, language processing and the like. CNNs are a class of DNN usually applied to analyzing visual imagery.
In particular, CNN uses image classification algorithms for image transformation, multispatial and multispectral image generation, image classification, medical image analysis, image and video recognition, natural language processing, material classification applications (e.g. remote sensing, medical diagnosis) and the like.
As it is generally known, a CNN may have an input layer and an output layer, as well as multiple hidden layers. The hidden layers of a CNN typically include a number of convolutional layers, pooling layers, fully connected layers and the like. Each convolutional layer within a neural network usually has attributes, such as an input having shape (number of images)×(image width)×(image height)×(image depth), and a number of convolutional kernels, acting like filters, whose width and height are hyper-parameters and whose depth typically must equal that of the image. The convolutional layers convolve the input and pass their result to the next layer.
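The shape bookkeeping of a convolutional layer can be illustrated with a minimal NumPy sketch. It is not taken from the disclosure: the function name `conv2d_valid`, the "valid" (no padding) border handling and the stride of one are assumptions made for illustration, and, as is common in CNN practice, the operation is implemented as a cross-correlation.

```python
import numpy as np

def conv2d_valid(images, kernels):
    """Apply K kernels to a batch of images without padding (stride 1).

    images:  (number of images, width, height, depth)
    kernels: (K, kernel width, kernel height, depth)
    returns: (number of images, width - kw + 1, height - kh + 1, K)
    """
    n, w, h, d = images.shape
    k, kw, kh, kd = kernels.shape
    if kd != d:
        # The kernel depth has to match the image depth.
        raise ValueError("kernel depth must equal image depth")
    out = np.zeros((n, w - kw + 1, h - kh + 1, k))
    for x in range(w - kw + 1):
        for y in range(h - kh + 1):
            patch = images[:, x:x + kw, y:y + kh, :]  # (n, kw, kh, d)
            # Sum over kernel width, kernel height and depth for every kernel.
            out[:, x, y, :] = np.tensordot(patch, kernels,
                                           axes=([1, 2, 3], [1, 2, 3]))
    return out
```

Each kernel spans the full depth of the input, so the output depth equals the number of kernels rather than the input depth.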
In some cases, a conventional CNN may be trained to reconstruct a hyper/multispectral image from an RGB image. In such cases, the conventional CNN may be trained to output only images with a predefined number of spectral channels, without taking into account the amount of spectral information that may be needed for different applications, target scenes, systems, user desires or the like. Moreover, such an approach usually requires a high computational effort as well as much memory when calculating a highly resolved multispectral image, which has a large number of spectral channels. Furthermore, the conventional approach typically outputs a hyper/multispectral image with a predefined number of spectral channels and, thus, possibly with an unnecessary amount of spectral information for a target, or vice versa.
However, it has been recognized that, for example, for different systems, applications and target scenes, it is desired to have different spatial resolutions and different spectral resolutions in the output image data. Moreover, a different proportion of spatial resolution and spectral resolution in the output image data may be suitable for different systems, applications, or target scenes. In such cases, it has been recognized that a conventional CNN may not be suitable, since by setting in advance a predetermined number of spectral channels, the output image data may include an unnecessary amount of spectral information or spatial information.
Consequently, some embodiments pertain to an image processing device including circuitry configured to obtain input image data being represented by a number of color channels, and to input the input image data into a neural network for generating output multispectral image data, wherein the neural network is configured to generate at least first and second multispectral image data on the basis of the input image data, wherein a number of spectral channels of the second multispectral image data is larger than the number of spectral channels of the first multispectral image data.
The image processing device may be a digital (video) camera, a surveillance camera, a biometric device, a security camera, a medical/healthcare device, a remote sensing device, a food inspection device, an edge computing enabled image sensor, such as a smart sensor associated with a smart speaker, or the like, a motor vehicle device, a smartphone, a personal computer, a laptop computer, a wearable electronic device, electronic glasses, or the like, a circuitry, a processor, multiple processors, logic circuits or a mixture of those parts.
The circuitry may include one or more processors, logical circuits, memory (read only memory, random access memory, etc., storage memory, e.g. hard disc, compact disc, flash drive, etc.), an interface for communication via a network, such as a wireless network, internet, local area network, or the like, a CMOS (Complementary Metal Oxide Semiconductor) image sensor, a CCD (Charge Coupled Device) image sensor, or the like.
The input image data may be generated by the image sensor, as mentioned above. The input image data may be also obtained from a memory included in the device, from an external memory, etc., from an artificial image generator, created via computer generated graphics, or the like.
The input image data may be represented by a number of color channels, for example three color channels, such as Red, Green and Blue, or the like. The input image data may also be represented for example by a small number of spectral channels. The color channel of a specific color, for example red, green, or blue, may include information of multiple spectral channels that corresponds to the wavelength range of red, green, or blue, respectively. That is, the color channels may be considered as an integration of the corresponding (multiple) spectral channels located in the wavelength range of the associated color channel.
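The view of a color channel as an integration of spectral channels can be sketched as follows. This is an illustration only: the function name `integrate_to_color`, the plain unweighted sum and the wavelength ranges in the usage note are assumptions, whereas a real sensor would weight each spectral channel by its spectral sensitivity.

```python
import numpy as np

def integrate_to_color(cube, wavelengths, ranges):
    """Collapse spectral channels into color channels by summation.

    cube:        (height, width, bands) spectral image
    wavelengths: (bands,) center wavelength of each band in nm
    ranges:      dict mapping a color name to (low_nm, high_nm)
    returns:     (height, width, number of colors), in the dict's order
    """
    channels = []
    for low, high in ranges.values():
        # Select the spectral channels that fall into this color's range.
        mask = (wavelengths >= low) & (wavelengths < high)
        channels.append(cube[..., mask].sum(axis=-1))
    return np.stack(channels, axis=-1)
```

For example, with hypothetical ranges `{"red": (600, 700), "green": (500, 600), "blue": (400, 500)}` and bands spaced 10 nm apart, each color channel accumulates ten spectral channels.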
Referring to FIG. 1 , a proposed approach for generating multispectral image data from input image data represented by a number of color channels is illustrated.
As mentioned above, an image processing device acquires input image data, as input image data 1, representing an image, for example captured by a digital camera. The input image data 1 are represented by a number of color channels. In this embodiment, the number of channels of the input image data 1 is three, namely Red, Green and Blue, without limiting the present disclosure to these three color channels (in principle, any number and type of color channels can be chosen). The input image data 1 are input to a neural network, such as for example a CNN, for generating output multispectral image data, such as multispectral image data 2. The output multispectral image data 2 are generated from the input image data 1 and the number of spectral channels of the output multispectral image data 2 is nine (9), in this embodiment. Therefore, the input image data being represented by a number of color channels have been transformed to output multispectral image data 2 being represented by a number of spectral channels.
The neural network generates at least first and second multispectral image data on the basis of the input image data, wherein a number of spectral channels of the second multispectral image data is larger than the number of spectral channels of the first multispectral image data. The neural network may also generate a plurality of multispectral image data on the basis of the input image data. That is, each of the plurality of multispectral image data may be followed by another multispectral image data and each of the plurality of multispectral image data may be generated on the basis of the previous generated multispectral image data. Hence, multiple intermediate multispectral image data may be generated by the neural network.
In some embodiments, the circuitry may be further configured to obtain the first or the second multispectral image data as the output multispectral image data. As mentioned above, the neural network may generate at least first and second multispectral image data and thus, the circuitry may obtain, as the output multispectral image data, the first multispectral image data or the second multispectral image data, based, for example, on a setting of a user, or a predetermined set up of the image processing device based on a target application.
In some embodiments, the input image data may include spectral image data. For example, the spectral image data may be input image data represented by a small number of spectral channels, which may be suitable for example, for object classification or the like, using neural network. The input image data may also include Red Green Blue (RGB) image data represented by a specific number of color channels, in which multiple spectral channels are integrated, as described above.
In some embodiments, the number of spectral channels of the output multispectral image data is larger than the number of spectral channels of the input image data. In a case that multispectral image data, represented by a small number of spectral channels, for example, six (6) spectral channels, are input to the processing device, then the number of spectral channels of the output multispectral image data may be nine (9), or the like. Therefore, the output multispectral image data may have, after processing, higher spectral resolution.
Typically, it is desired that a size of the image data remains the same before and after image processing, even in the case of higher spectral resolution of the output image data after image processing. Hence, in some embodiments, a spatial resolution of the first multispectral image data may be higher than a spatial resolution of the second multispectral image data.
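The trade-off can be made concrete with a small arithmetic sketch. It rests on two assumptions that the text does not state: that "size of the image data" means the total number of values (height × width × channels), and that the aspect ratio is preserved. The function name is hypothetical.

```python
def spatial_size_for_channels(height0, width0, ch0, ch_out):
    """Return (height, width) that keeps height * width * channels
    approximately constant when the channel count changes from ch0 to ch_out."""
    # Both spatial dimensions shrink by the same factor to keep the aspect ratio.
    scale = (ch0 / ch_out) ** 0.5
    return max(1, round(height0 * scale)), max(1, round(width0 * scale))
```

Under these assumptions, going from 3 to 12 channels halves both the height and the width, so a 512×512 RGB image would correspond to a 256×256 image with 12 spectral channels.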
As mentioned above, a conventional imaging device, such as a mosaic-array multispectral imaging device, using a conventional neural network, usually sacrifices its spatial resolution for spectral resolution, although both kinds of information offer benefits for computational sensing applications. Therefore, it may be suitable to generate, from an RGB image or from a multispectral image represented by a small number of spectral channels, a multispectral image which has an optimized trade-off between spatial and spectral resolution for the device.
Thus, in some embodiments, the output multispectral image data is generated based on a predetermined relationship between the spatial resolution and the number of spectral channels. The predetermined relationship may be an optimized trade-off relationship between spectral resolution and spatial resolution. The optimal point of an optimized trade-off relationship between spectral resolution and spatial resolution may depend on a system, an application, a target scene, or the like. The predetermined relationship between the spatial resolution and the number of spectral channels may be determined based on a setting of a user, or a predetermined set up of the image processing device according to a target application.
An exemplary optimized relationship between spectral resolution and spatial resolution of multispectral image data, such as multispectral skin data, is illustrated in FIG. 2. In particular, the x-axis represents the number of spectral channels, e.g. spectral bands, as vertical bars. In this embodiment, the number of spectral bands increases from three (3) spectral bands to three hundred (300) spectral bands. The y-axis represents the classification accuracy, which, in this embodiment, increases with the number of spectral bands up to sixteen (16) bands. Dashed line 3 represents the spatial resolution of the image. In this case, the optimized performance is obtained with multispectral data having 16 spectral channels. As discussed above, the best relationship between spectral resolution and spatial resolution depends on the target. For example, for other applications, the best trade-off point may be different. Moreover, the optimal relationship between spectral resolution and spatial resolution may change depending on the content in a scene, which may make it difficult to design an optimal multispectral sensor with the best performance.
In some embodiments, the neural network may be a convolutional neural network (CNN), without limiting the present disclosure in that regard. For example, in some embodiments, the convolutional neural network may include convolutional layers, or may also include local or global pooling layers, such as max-pooling layers, which reduce the dimensions of the image data, as it is generally known.
The pooling layers may be used for pooling, which is a form of non-linear down-sampling, such as spatial pooling, namely max-pooling, average pooling, sum pooling, or the like.
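The three pooling variants named above can be sketched for a single-channel image as follows. The function name `pool2d` and the choice of non-overlapping square windows are assumptions for illustration; rows and columns that do not fill a complete window are dropped.

```python
import numpy as np

def pool2d(x, size=2, mode="max"):
    """Down-sample a 2-D array with non-overlapping size x size windows."""
    h, w = x.shape
    h2, w2 = h // size, w // size
    # Split the image into (h2, w2) blocks of size x size pixels each.
    blocks = x[:h2 * size, :w2 * size].reshape(h2, size, w2, size)
    if mode == "max":
        return blocks.max(axis=(1, 3))
    if mode == "average":
        return blocks.mean(axis=(1, 3))
    if mode == "sum":
        return blocks.sum(axis=(1, 3))
    raise ValueError(f"unknown pooling mode: {mode}")
```

With a window size of two, every variant halves the height and the width; only the way each window is summarized differs.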
The generation of the multispectral image data may be either during a training phase of a neural network, such as a CNN, or may be a generation of the multispectral image data with an already trained neural network, such as a trained CNN, for example, for extracting information from the image data (e.g. object recognition, or recognition of other information in the image data, such as spatial information, spectral information, patterns, colors, etc.). Hence, the neural network may be an un-trained neural network.
Moreover, the neural network may be part of the image processing device, e.g. stored in a storage or memory of the image processing device, or the image processing device may have access to a neural network, e.g. based on inter-processor communication, electronic bus, network (including internet), etc.
The general principle of the usage of the CNN is illustrated by way of example in FIG. 3, which shows generally in the first line the CNN structure, and in the second line the basic principle of building blocks. The principles of a CNN and its application in imaging are generally known and, thus, are only briefly discussed in the following under reference of FIG. 3.
The input image includes for example three maps or layers (exemplary red, green and blue (RGB) color information) and N times N blocks. The CNN has a convolutional layer and a subsequent pooling layer, wherein this structure can be repeated as also shown in FIG. 3. The convolutional layer includes the neurons. By applying a kernel (filter) (see convolution kernels in the second line) on the input image, a respective feature map can be obtained. The pooling layer, which is based in the present embodiment on Max-Pooling (see second line, "Max-Pooling"), takes the information of the most active neurons of the convolution layer and discards the other information. After several repetitions (three in FIG. 3), the process ends with a fully-connected layer, which is also referred to as affine layer. The last layer typically includes a number of neurons which corresponds to the number of object classes (output features) which are to be differentiated by the CNN. The output is illustrated in FIG. 3, first line, as an output distribution, wherein the distribution is shown by a row of columns, wherein each column represents a class and the height of the column represents the weight of the object class. The different classes correspond to the output or image attribute features, which are output by the CNN. The classes are, for example, "people, car, etc." Typically, several hundred or several thousand classes can be used, e.g. also for object recognition of different objects.
In some embodiments, the convolutional neural network may be trained to generate the first multispectral image data from the input image data and the second multispectral image data from the first multispectral image data. As mentioned above, the convolutional neural network (CNN) may generate a plurality of multispectral image data, such as a first multispectral image data and a second multispectral image data, which is generated on the basis of first multispectral image data and which follows the first multispectral image data. That is, each of the plurality of multispectral image data may be generated on the basis of the previous generated multispectral image data and each of the plurality of multispectral image data may be followed by another multispectral image data.
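The chained generation, in which each multispectral output is computed from the previous one, can be sketched with an untrained toy cascade. Everything below is an assumption made for illustration rather than the disclosed network: each stage is reduced to 2×2 max-pooling followed by a per-pixel channel mixing with rectification, the weights are supplied by the caller, and the stopping rule simply compares the channel count with a requested target.

```python
import numpy as np

def stage(x, weights):
    """One toy stage: halve the spatial size, then change the channel count.

    x:       (height, width, channels_in)
    weights: (channels_in, channels_out)
    """
    h, w, c = x.shape
    # 2x2 max-pooling reduces the spatial resolution.
    pooled = x[:h // 2 * 2, :w // 2 * 2].reshape(h // 2, 2, w // 2, 2, c)
    pooled = pooled.max(axis=(1, 3))
    # Per-pixel linear channel mixing followed by rectification (ReLU).
    return np.maximum(pooled @ weights, 0.0)

def generate(x, stage_weights, target_channels):
    """Run stages in sequence and stop once enough channels are available."""
    outputs = []
    for weights in stage_weights:
        x = stage(x, weights)  # each output is built from the previous one
        outputs.append(x)
        if x.shape[-1] >= target_channels:
            break  # later stages are not computed
    return outputs
```

With stage weights of shapes (3, 6), (6, 9) and (9, 12) and a 32×32 RGB input, requesting nine channels yields two outputs of shapes (16, 16, 6) and (8, 8, 9), and the third stage is never evaluated.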
In some embodiments, the convolutional neural network (CNN) may be trained based on RGB image data and on multispectral image data. The training data of multispatial multispectral images may also be generated from high resolution hyperspectral data (the terms multispectral and hyperspectral data are generally known in the art, and they are typically differentiated by the number of spectral channels, wherein the hyperspectral data has more spectral channels than multispectral data). Typically, a CNN, in image processing, uses as training database, groundtruth image data and desired image data, for example RGB image data and multispectral image data.
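Training against ground-truth multispectral data at several scales can be expressed as a reconstruction loss over all intermediate outputs. The following is a minimal sketch under a loud assumption: the total loss is taken to be the unweighted sum of per-output Mean Squared Error terms, which is one plausible form and is not quoted from the disclosure. The function names are hypothetical.

```python
import numpy as np

def mse(a, b):
    """Mean Squared Error between two arrays of equal shape."""
    return float(np.mean((np.asarray(a) - np.asarray(b)) ** 2))

def reconstruction_loss(reconstructed, ground_truth):
    """Sum of per-output MSE terms (assumed form, unweighted).

    reconstructed, ground_truth: lists of arrays; entry i of each list holds
    the i-th multispectral image, and both entries must share one shape.
    """
    if len(reconstructed) != len(ground_truth):
        raise ValueError("need one ground-truth image per reconstructed image")
    return sum(mse(r, g) for r, g in zip(reconstructed, ground_truth))
```

Replacing `mse` with a mean absolute error would give the alternative error function without changing the structure of the sum.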
In particular, multispectral image data, represented by C channels, are generated from hyperspectral image data by using the following equation:
$$I_c = \int_{380}^{780} R(\lambda)\, L(\lambda)\, S_c(\lambda)\, d\lambda + n$$
where Ic is the intensity of spectral band c (spectral channel) of a multispectral image, λ is the wavelength over which is integrated, R is the spectral reflectance of a target in a scene, L is the spectral distribution of the illumination, e.g. white illumination, which has a flat spectral distribution over all wavelengths, Sc is the sensor's spectral sensitivity of spectral band c and n is the sensor noise.
Here, R is hyperspectral data (an HS image) measured by a hyperspectral camera, L can be set considering the illumination which will be used in the application, and Sc is given by the sensor specification of a camera.
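For sampled data, the integral for the intensity of one spectral band can be approximated numerically. The sketch below is an illustration under stated assumptions: the function name `simulate_band` is hypothetical, the integral is discretized with the trapezoidal rule over the sampled wavelengths, and the sensor noise n is modeled as zero-mean Gaussian noise, which the text does not specify.

```python
import numpy as np

def simulate_band(reflectance, illumination, sensitivity, wavelengths,
                  noise_std=0.0, rng=None):
    """Approximate I_c = integral of R * L * S_c over wavelength, plus noise.

    reflectance:  (height, width, samples) hyperspectral reflectance R
    illumination: (samples,) spectral distribution L
    sensitivity:  (samples,) sensor sensitivity S_c of band c
    wavelengths:  (samples,) increasing wavelengths in nm, e.g. 380 to 780
    """
    integrand = reflectance * illumination * sensitivity
    # Trapezoidal rule along the wavelength axis.
    step = np.diff(wavelengths)
    intensity = ((integrand[..., 1:] + integrand[..., :-1]) * 0.5 * step).sum(axis=-1)
    if noise_std > 0:
        if rng is None:
            rng = np.random.default_rng()
        # Assumed noise model: additive zero-mean Gaussian noise.
        intensity = intensity + rng.normal(0.0, noise_std, intensity.shape)
    return intensity
```

Calling the function once per band with that band's sensitivity curve and stacking the results yields a C-channel multispectral image that can serve as ground truth for training.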
In some embodiments, the circuitry is further configured to perform object recognition. For example, object recognition may be performed in an autonomous vehicle application, in which a size of a pedestrian in an image may depend on a distance from the vehicle. To detect a pedestrian who is far from the vehicle, a higher spatial resolved image may be suitable for a pedestrian detector. In addition, object recognition may be performed, for example, in a hand identification application, in which a hand may make various poses. In such cases, spatial resolution is less useful than spectral resolution. Hence, the relationship between spectral resolution and spatial resolution may include a higher amount of spectral information than spatial information.
Moreover, image processing based on multispectral and hyperspectral imaging is widely used in food industry (e.g. bruise detection of a fruit, freshness detection of a fish), material classification applications (e.g. remote sensing, medical diagnosis) and the like, and, thus, some embodiments pertain to these fields.
Some embodiments pertain to an image processing method, which may be performed by the image processing device described herein, or any other electronic device, processor, or other computing means or the like. The method includes obtaining input image data being represented by a number of color channels and inputting the input image into a neural network for generating output multispectral image data, wherein the neural network is configured to generate at least first and second multispectral image data on the basis of the input image data, wherein a number of spectral channels of the second multispectral image data is larger than the number of spectral channels of the first multispectral image data.
The image processing method may further include obtaining the first or the second multispectral image data as the output multispectral image data. As mentioned, the input image data may include spectral image data, wherein the number of spectral channels of the output multispectral image data may be larger than the number of spectral channels of the input image data. In addition, a spatial resolution of the first multispectral image data may be higher than a spatial resolution of the second multispectral image data. The output multispectral image data may be generated based on a predetermined relationship between the spatial resolution and the number of spectral channels. Moreover, the neural network may be a convolutional neural network, which may be trained to generate the first multispectral image data from the input image data and the second multispectral image data from the first multispectral image data. Furthermore, the convolutional neural network may also be trained based on RGB image data and on multispectral image data, as discussed herein. The image processing method may further include performing object recognition.
Returning to the description of the embodiments under reference of FIGS. 4 to 8 , in the following, an embodiment of an image processing device is discussed under reference of FIG. 4 .
In FIG. 4 , a block diagram of an embodiment of an image processing device 11 is illustrated, which inputs image data into a convolutional neural network (CNN) for generating multispectral image data, as mentioned herein.
In the present embodiment, the image processing device 11 includes a circuitry 12 with an interface 13, a Central Processing Unit (CPU) 14 comprising multiple processors, including Graphics Processing Units (GPUs), a memory 15 that includes a RAM, a ROM and a storage memory, and a trained CNN 16 (which is stored in a memory).
The image processing device 11 acquires, through the interface 13, image data, such as input image data 1, being represented by a number of color channels, namely Red, Green and Blue in this embodiment. The input image data 1 represent an image of a target scene that has been captured with a digital camera, such as an RGB camera (not shown).
The input image data 1 being represented by a number of color channels are transmitted to the CPU 14, which inputs the input image data 1 into the CNN 16 for generating multispectral image data, being represented by a number of spectral channels. The CNN 16 has been trained in advance to generate (at least) first multispectral image data and second multispectral image data on the basis of the input image data 1. As discussed herein, the image processing device 11 is configured to obtain, as output multispectral image data such as output multispectral image data 2, any one of the first or the second multispectral image data generated by the CNN 16.
In the present embodiment, the image processing device 11 obtains the second multispectral image data as the output multispectral image data 2. The number of spectral channels of the second multispectral image data is larger than the number of spectral channels of the first multispectral image data; for example, the number of spectral channels of the second multispectral image data is nine (9).
The implementation of the above described image processing device 11 may result in a reduction of computational effort and of memory usage. Furthermore, the CNN 16 may be a single CNN, being able to generate multispatial multispectral image data from an RGB image.
In the following, the usage of the CNN 16, for generating a plurality of multispectral image data on the basis of the input image data 1 is explained under reference of FIG. 5 .
FIG. 5 illustrates an embodiment of a processing scheme of a learning method of the CNN 16 for generating a plurality of multispectral image data 20-1 to 20-N, on the basis of the input image data 1, wherein each multispectral image data of the plurality of multispectral image data may be obtained from the image processing device 11, as the output multispectral image data 2 of FIG. 4 .
The image processing device 11 inputs into the CNN 16 the input image data 1, such as RGB image data, which are represented by a number of color channels, namely Red, Green and Blue. As mentioned, an input image in a CNN has a shape, that is, (number of images)×(image width)×(image height)×(image depth). In this embodiment, the input image data 1 represent an input image, of which a height and a width define a spatial resolution. The height of the input image data 1 is Height0 and the width is Width0. The number of spectral channels of the input image data 1 is Ch0. The convolutional layers of the CNN 16 convolve the input image data 1, perform rectification using a Rectified Linear Unit (ReLU) and spatial pooling, which is carried out by max-pooling layers, and then pass their result to the next layer. The result of the next layer is multispectral image data 20-1 (e.g. corresponding to first multispectral image data) represented by six (6) spectral channels; the multispectral image data 20-1 represent a multispectral image, which has a height Height1, a width Width1 and a number of spectral channels Ch1, wherein Height0>Height1, Width0>Width1 and Ch0<Ch1. Likewise, the result of the subsequent layer is multispectral image data 20-2 (e.g. corresponding to second multispectral image data) represented by nine (9) spectral channels; the multispectral image data 20-2 represent a multispectral image, which has a height Height2, a width Width2 and a number of spectral channels Ch2, wherein Height0>Height1>Height2, Width0>Width1>Width2 and Ch0<Ch1<Ch2. In this embodiment, the convolution process continues as described above until the size of the multispectral image data becomes the size of the kernel of the CNN 16. The result of the last layer of the CNN 16 is multispectral image data 20-N (e.g. corresponding to N-th multispectral data) represented by twelve (12) spectral channels; the multispectral image data 20-N represent a multispectral image, which has a height HeightN, a width WidthN and a number of spectral channels ChN, wherein Height0>Height1>Height2> . . . >HeightN, Width0>Width1>Width2> . . . >WidthN and Ch0<Ch1<Ch2< . . . <ChN.
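The trade of spatial resolution for spectral channels described above can be made concrete with a toy forward pass. The sketch below is not the trained CNN 16: it replaces each convolutional stage with a random 1×1 channel mixing followed by ReLU and 2×2 max-pooling, purely to show how the shapes evolve. The function names, the random weights, the pooling factor of two and the 64×64 input size are assumptions chosen for this example; the channel counts 6, 9 and 12 follow the embodiment.

```python
import numpy as np


def toy_stage(x, out_channels, rng):
    """One illustrative stage: 1x1 channel mixing, ReLU, 2x2 max-pooling."""
    h, w, c = x.shape
    weights = rng.standard_normal((c, out_channels))  # untrained, random
    mixed = np.maximum(x @ weights, 0.0)              # (h, w, out_channels)
    h2, w2 = h // 2, w // 2
    # Group pixels into non-overlapping 2x2 blocks and keep the maximum.
    blocks = mixed[:h2 * 2, :w2 * 2].reshape(h2, 2, w2, 2, out_channels)
    return blocks.max(axis=(1, 3))


def toy_pyramid(rgb, channel_schedule=(6, 9, 12), seed=0):
    """Return every intermediate output, analogous to data 20-1 to 20-N."""
    rng = np.random.default_rng(seed)
    outputs = []
    x = rgb
    for channels in channel_schedule:
        x = toy_stage(x, channels, rng)
        outputs.append(x)
    return outputs


if __name__ == "__main__":
    stages = toy_pyramid(np.ones((64, 64, 3)))
    for index, stage in enumerate(stages, start=1):
        print(index, stage.shape)
```

Each stage halves the height and width while increasing the channel count, so Height0>Height1>Height2 and Ch0<Ch1<Ch2 hold by construction.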
The CNN 16 is trained so that any one of the multispectral image data 20-1 to 20-N (e.g. first to N-th multispectral data) can be obtained by the image processing device 11 as output multispectral image data 2. That is, the CNN 16 generates multiple intermediate multispectral image data at several points in the neural network. Moreover, the image processing device 11 obtains any one of the multispectral image data 20-1 to 20-N with a predetermined relationship between spatial resolution and spectral resolution, depending on the application or the target scene. The predetermined relationship may be set in advance by a user, and thus the CNN 16 stops computing once the predetermined relationship, which is an optimized relationship between spatial resolution and spectral resolution, is achieved.
The above described embodiment does not limit the present disclosure in that regard. For example, a suitable multispectral image may be determined by analyzing a degree of spatial frequency of an input RGB image. Depending on the performed application, e.g. object classification using CNNs, a multispectral image with a small number of spectral channels may be desirable. On the other hand, spectral information may be more important for the performed application, and a multispectral image with a large number of spectral channels may be desirable in some embodiments. Moreover, a target performance may be determined by a result of the application, e.g. the reliability of an object classification result. For example, the application result may not achieve the target performance when a multispectral image with a small number of spectral channels is input, and thus the CNN may continue to generate a multispectral image with a larger number of spectral channels until the application result meets the set criteria.
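The stopping behaviour described above, in which further multispectral images are generated only while the application result misses its target, can be sketched as a loop over lazily produced stage outputs. This is a minimal sketch under stated assumptions: `first_sufficient`, `score_fn` and `target` are hypothetical names, and the score stands in for whatever application measure (e.g. classification reliability) is used.

```python
from typing import Callable, Iterable, Tuple, TypeVar

T = TypeVar("T")


def first_sufficient(stage_outputs: Iterable[T],
                     score_fn: Callable[[T], float],
                     target: float) -> Tuple[int, T]:
    """Return (stage index, output) of the first stage meeting the target.

    stage_outputs may be a generator, so later stages are only computed
    when earlier ones fall short. If no stage meets the target, the last
    stage that was produced is returned.
    """
    last = None
    for index, output in enumerate(stage_outputs, start=1):
        last = (index, output)
        if score_fn(output) >= target:
            return last
    if last is None:
        raise ValueError("no stage outputs were produced")
    return last
```

Passing a generator rather than a list is the design choice that matters here: it keeps the computation of deeper stages from running once an earlier stage is already sufficient.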
An embodiment of a learning system 30, shown as a block diagram, is illustrated in FIG. 6 that generates a learning model based on which a neural network, such as CNN 16 is trained.
The learning system 30 includes a memory device, such as the memory 15 of image processing device 11, described under the reference of FIG. 4 , a RGB image generator 31, a multispectral image generator 32, a learning apparatus, such as the CNN 16 and a learning model 33.
High-resolution hyperspectral images represented by hyperspectral (HS) image data, having an image resolution of (H)eight*(W)idth*(C)hannels, are stored in the memory 15 of the image processing device 11. Then an RGB image having a resolution h0(≤H)*w0(≤W)*c0(=3) is generated from a hyperspectral (HS) image by the RGB image generator 31, and multiple MS images are generated from an HS image by the multispectral image generator 32. The MS images have a resolution such as hi(<h0)*wi(<w0)*ci(>c0), where i=1, 2, . . . , N and N is the number of intermediate MS images, e.g. represented by a plurality of multispectral image data, such as multispectral image data 20-1 to 20-N of FIG. 5 . In this embodiment, the intermediate MS images should meet the following conditions:
$$h_i \cdot w_i \cdot c_i = h_0 \cdot w_0 \cdot c_0 \qquad \text{(a)}$$
$$h_i > h_{i+1}, \quad w_i > w_{i+1}, \quad c_i < c_{i+1} \qquad \text{(b)}$$
as already described in detail with reference to FIG. 5 . The learning apparatus, such as the CNN 16, is trained to generate multiple MS images from an RGB image by minimizing the following reconstruction loss:
$$\text{loss} = \mathrm{MSE}\left(MS_{REC}^{(1)} - MS_{GT}^{(1)}\right) + \mathrm{MSE}\left(MS_{REC}^{(2)} - MS_{GT}^{(2)}\right) + \dots + \mathrm{MSE}\left(MS_{REC}^{(N)} - MS_{GT}^{(N)}\right)$$
where $MS_{REC}^{(i)}$ is the i-th MS image reconstructed by the CNN, $MS_{GT}^{(i)}$ is the ground truth of the i-th MS image, which is generated by the multispectral image generator 32, and MSE is a Mean Squared Error function, without limiting the present disclosure in that regard. The Mean Absolute Error function, or the like, may also be used. The learning system 30 generates a learned model, which is stored in the memory 15 of the image processing device 11.
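The reconstruction loss above is a plain sum of per-stage mean squared errors, which the following sketch evaluates for lists of reconstructed and ground-truth MS images. It only computes the value of the loss and does not perform any training; the function name and the list-of-arrays interface are assumptions made for this example.

```python
import numpy as np


def reconstruction_loss(reconstructed, ground_truth):
    """Sum of MSE(MS_REC(i) - MS_GT(i)) over the N intermediate MS images.

    reconstructed, ground_truth: sequences of arrays; the i-th pair must
    share a shape (h_i, w_i, c_i), while shapes may differ between stages.
    """
    if len(reconstructed) != len(ground_truth):
        raise ValueError("need one ground-truth image per reconstruction")
    total = 0.0
    for rec, gt in zip(reconstructed, ground_truth):
        rec = np.asarray(rec, dtype=float)
        gt = np.asarray(gt, dtype=float)
        if rec.shape != gt.shape:
            raise ValueError(f"shape mismatch: {rec.shape} vs {gt.shape}")
        total += float(np.mean((rec - gt) ** 2))  # MSE of this stage
    return total
```

Because each term is a mean over its own stage, every stage contributes with equal weight regardless of how many pixels or channels it has; swapping the squared difference for an absolute difference would give the Mean Absolute Error variant mentioned in the text.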
An embodiment of an image processing system 40 is illustrated in FIG. 7 , which shows a block diagram of the image processing system 40.
The image processing system 40 includes an image capturing apparatus 41, such as a camera including a RGB image sensor, the image processing device 11, a memory 48, for storing a database, and an information processing apparatus 44, which includes a target area detection unit 45, a feature extraction unit 46 and a recognition unit 47.
The image processing system 40 is configured to perform object recognition of the image data provided by the image capturing apparatus 41 and processed by the image processing device 11.
In the present embodiment, the image capturing apparatus 41, such as an RGB camera, captures an image, such as a RGB image, of a target scene and transmits RGB image data, representing the captured RGB image, to the image processing device 11. The image processing device 11 outputs multispectral image data 43 being generated by a trained convolutional neural network, such as the CNN 16, which is trained based on the learned model 33. The generated multispectral image data 43 are generated also based on input information 42 that is related to a target to be recognized. The generated multispectral image data 43 are transmitted to the information processing apparatus 44, which is configured to perform object recognition. In the following, regarding the object recognition performed by the information processing apparatus 44, the multispectral image data 43 are transmitted to the target area detection unit 45, the feature extraction unit 46 and then to the recognition unit 47. The recognition unit 47 performs object recognition based on data, included in the database, which is stored in the memory 48. The output 49 of the image processing system 40 depends on recognition result of the target (e.g. user ID).
In the following, an image processing method 50, which is performed by the image processing device 11 and/or the image processing system 40 in some embodiments, is discussed under reference of FIG. 8 .
At 51, input image data, such as input image data 1, are obtained by the image processing device 11 and/or the image processing system 40, as discussed above.
The input image data may be obtained from an image sensor or from a memory included in the device, from an external memory, etc., or from an artificial image generator, created via computer generated graphics, or the like.
The input image data, at 52 are input into a convolutional neural network, such as CNN 16, for generating, at 53, output multispectral (MS) image data, such as the output multispectral image data 2, as discussed above.
The input image data may be represented by a number of color channels, such as Red, Green and Blue, or may be represented by a small number of spectral channels, for example three (3), or the like.
At 54, the convolutional neural network generates first and second multispectral image data on the basis of the input image data.
A number of spectral channels of the second multispectral image data is larger than the number of spectral channels of the first multispectral image data, as discussed above.
At 55, the first or the second multispectral image data are obtained as output multispectral data.
The first multispectral image data are generated on the basis of the input image data and the second multispectral image data are generated on the basis of the first multispectral image data, as discussed herein.
At 56, the obtained output multispectral image data are output.
It should be recognized that the embodiments describe methods with an exemplary ordering of method steps. The specific ordering of method steps is however given for illustrative purposes only and should not be construed as binding.
The method as described herein is also implemented in some embodiments as a computer program causing a computer and/or a processor to perform the method, when being carried out on the computer and/or processor. In some embodiments, also a non-transitory computer-readable recording medium is provided that stores therein a computer program product, which, when executed by a processor, such as the processor described above, causes the methods described herein to be performed.
All units and entities described in this specification and claimed in the appended claims can, if not stated otherwise, be implemented as integrated circuit logic, for example on a chip, and functionality provided by such units and entities can, if not stated otherwise, be implemented by software.
In so far as the embodiments of the disclosure described above are implemented, at least in part, using software-controlled data processing apparatus, it will be appreciated that a computer program providing such software control and a transmission, storage or other medium by which such a computer program is provided are envisaged as aspects of the present disclosure.
Note that the present technology can also be configured as described below.
    • (1) An image processing device comprising circuitry configured to:
      • obtain input image data being represented by a number of color channels;
      • input the input image data into a neural network for generating output multispectral image data, wherein the neural network is configured to generate at least first and second multispectral image data on the basis of the input image data, wherein a number of spectral channels of the second multispectral image data is larger than the number of spectral channels of the first multispectral image data.
    • (2) The image processing device of (1), wherein the circuitry is further configured to obtain the first or the second multispectral image data as the output multispectral image data.
    • (3) The image processing device of (1) or (2), wherein the input image data include spectral image data.
    • (4) The image processing device of (3), wherein the number of spectral channels of the output multispectral image data is larger than the number of spectral channels of the input image data.
    • (5) The image processing device of anyone of (1) to (4), wherein a spatial resolution of the first multispectral image data is higher than a spatial resolution of the second multispectral image data.
    • (6) The image processing device of (5), wherein the output multispectral image data is generated based on a predetermined relationship between the spatial resolution and the number of spectral channels.
    • (7) The image processing device of anyone of (1) to (6), wherein the neural network is a convolutional neural network.
    • (8) The image processing device of (7), wherein the convolutional neural network is trained to generate the first multispectral image data from the input image data and the second multispectral image data from the first multispectral image data.
    • (9) The image processing device of (7), wherein the convolutional neural network is trained based on RGB image data and on multispectral image data.
    • (10) The image processing device of anyone of (1) to (9), wherein the circuitry is further configured to perform object recognition.
    • (11) An image processing method comprising:
      • obtaining input image data being represented by a number of color channels;
      • inputting the input image data into a neural network for generating output multispectral image data, wherein the neural network is configured to generate at least first and second multispectral image data on the basis of the input image data, wherein a number of spectral channels of the second multispectral image data is larger than the number of spectral channels of the first multispectral image data.
    • (12) The image processing method of (11), further comprising obtaining the first or the second multispectral image data as the output multispectral image data.
    • (13) The image processing method of (11) or (12), wherein the input image data include spectral image data.
    • (14) The image processing method of (13), wherein the number of spectral channels of the output multispectral image data is larger than the number of spectral channels of the input image data.
    • (15) The image processing method of anyone of (11) to (14), wherein a spatial resolution of the first multispectral image data is higher than a spatial resolution of the second multispectral image data.
    • (16) The image processing method of (15), wherein the output multispectral image data is generated based on a predetermined relationship between the spatial resolution and the number of spectral channels.
    • (17) The image processing method of anyone of (11) to (16), wherein the neural network is a convolutional neural network.
    • (18) The image processing device of (17), wherein the convolutional neural network is trained to generate the first multispectral image data from the input image data and the second multispectral image data from the first multispectral image data.
    • (19) The image processing method of (17), wherein the convolutional neural network is trained based on RGB image data and on multispectral image data.
    • (20) The image processing method of anyone of (11) to (19), further comprising performing object recognition.
    • (21) A computer program comprising program code causing a computer to perform the method according to anyone of (11) to (20), when being carried out on a computer.
    • (22) A non-transitory computer-readable recording medium that stores therein a computer program product, which, when executed by a processor, causes the method according to anyone of (11) to (20) to be performed.

Claims (20)

The invention claimed is:
1. An image processing device comprising circuitry configured to:
obtain input image data being represented by a number of color channels;
input the input image data into a neural network for generating output multispectral image data, wherein
the neural network is configured to generate first multispectral image data having a first number of spectral channels and second multispectral image data having a second number of spectral channels greater than the first number, both the first and second multispectral image data generated by the neural network from the input image data.
2. The image processing device of claim 1, wherein the circuitry is further configured to obtain the first or the second multispectral image data as the output multispectral image data.
3. The image processing device of claim 1, wherein the input image data include spectral image data.
4. The image processing device of claim 3, wherein the number of spectral channels of the output multispectral image data is larger than the number of spectral channels of the input image data.
5. The image processing device of claim 1, wherein a spatial resolution of the first multispectral image data is higher than a spatial resolution of the second multispectral image data.
6. The image processing device of claim 5, wherein the output multispectral image data is generated based on a predetermined relationship between the spatial resolution and the number of spectral channels.
7. The image processing device of claim 1, wherein the neural network is a convolutional neural network.
8. The image processing device of claim 7, wherein the convolutional neural network is trained to generate the first multispectral image data from the input image data and the second multispectral image data from the first multispectral image data.
9. The image processing device of claim 7, wherein the convolutional neural network is trained based on RGB image data and on multispectral image data.
10. The image processing device of claim 1, wherein the circuitry is further configured to perform object recognition.
11. An image processing method comprising:
obtaining input image data being represented by a number of color channels;
inputting the input image data into a neural network for generating output multispectral image data, wherein
the neural network is configured to generate first multispectral image data having a first number of spectral channels and second multispectral image data having a second number of spectral channels greater than the first number, both the first and second multispectral image data generated by the neural network from the input image data.
12. The image processing method of claim 11, further comprising obtaining the first or the second multispectral image data as the output multispectral image data.
13. The image processing method of claim 11, wherein the input image data include spectral image data.
14. The image processing method of claim 13, wherein the number of spectral channels of the output multispectral image data is larger than the number of spectral channels of the input image data.
15. The image processing method of claim 11, wherein a spatial resolution of the first multispectral image data is higher than a spatial resolution of the second multispectral image data.
16. The image processing method of claim 15, wherein the output multispectral image data is generated based on a predetermined relationship between the spatial resolution and the number of spectral channels.
17. The image processing method of claim 11, wherein the neural network is a convolutional neural network.
18. The image processing device of claim 17, wherein the convolutional neural network is trained to generate the first multispectral image data from the input image data and the second multispectral image data from the first multispectral image data.
19. The image processing method of claim 17, wherein the convolutional neural network is trained based on RGB image data and on multispectral image data.
20. The image processing method of claim 11, further comprising performing object recognition.
US17/768,853 2019-10-23 2020-10-15 Image processing device and image processing method Active 2042-09-01 US12536616B2 (en)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
EP19204783 2019-10-23
EP19204783.5 2019-10-23
PCT/EP2020/079090 WO2021078629A1 (en) 2019-10-23 2020-10-15 Image processing device and image processing method

Publications (2)

Publication Number Publication Date
US20240303773A1 US20240303773A1 (en) 2024-09-12
US12536616B2 true US12536616B2 (en) 2026-01-27

Family

ID=68342612

Family Applications (1)

Application Number Title Priority Date Filing Date
US17/768,853 Active 2042-09-01 US12536616B2 (en) 2019-10-23 2020-10-15 Image processing device and image processing method

Country Status (3)

Country Link
US (1) US12536616B2 (en)
CN (1) CN114556428A (en)
WO (1) WO2021078629A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20240242380A1 (en) * 2023-01-13 2024-07-18 Maya Heat Transfer Technologies Ltd. System for generating an image dataset for training an artificial intelligence model for object recognition, and method of use thereof

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106485688A (en) 2016-09-23 2017-03-08 西安电子科技大学 High spectrum image reconstructing method based on neutral net
CN108830796A (en) 2018-06-20 2018-11-16 重庆大学 Based on the empty high spectrum image super-resolution reconstructing method combined and gradient field is lost of spectrum
US20190096049A1 (en) 2017-09-27 2019-03-28 Korea Advanced Institute Of Science And Technology Method and Apparatus for Reconstructing Hyperspectral Image Using Artificial Intelligence
US20200302249A1 (en) * 2019-03-19 2020-09-24 Mitsubishi Electric Research Laboratories, Inc. Systems and Methods for Multi-Spectral Image Fusion Using Unrolled Projected Gradient Descent and Convolutinoal Neural Network
US20210250526A1 (en) * 2017-09-12 2021-08-12 Carbon Bee Device for capturing a hyperspectral image
US20210350590A1 (en) * 2019-01-29 2021-11-11 Korea Advanced Institute Of Science And Technology Method and device for imaging of lensless hyperspectral image
US11354804B1 (en) * 2019-09-27 2022-06-07 Verily Life Sciences Llc Transforming multispectral images to enhanced resolution images enabled by machine learning

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP3483108B2 (en) * 1997-11-25 2004-01-06 株式会社日立製作所 Multispectral image processing apparatus and recording medium storing program for the same
CN104112263B (en) * 2014-06-28 2018-05-01 南京理工大学 The method of full-colour image and Multispectral Image Fusion based on deep neural network
CN108805874B (en) * 2018-06-11 2022-04-22 中国电子科技集团公司第三研究所 Multispectral image semantic cutting method based on convolutional neural network
CN109003239B (en) * 2018-07-04 2022-03-29 华南理工大学 Multispectral image sharpening method based on transfer learning neural network

Non-Patent Citations (16)

* Cited by examiner, † Cited by third party
Title
Can et al., "An Efficient CNN for Spectral Reconstruction from RGB Images", arxiv.org, Cornell University Library, Apr. 12, 2018, 5 pages.
Galliani et al., "Learned Spectral Super-Resolution", arXiv:1703.09470v1, Mar. 28, 2017, 10 pages.
Han et al: "Reconstruction From Multispectral to Hyperspectral Image Using Spectral Library-Based Dictionary Learning", IEEE Transactions on Geoscience and Remote Sensing, vol. 57, No. 3, Mar. 2019, pp. 1325-1335.
International Search Report and Written Opinion mailed on Dec. 14, 2020, received for PCT Application PCT/EP2020/079090, Filed on Oct. 15, 2020, 10 pages.
Li et al., "Hyperspectral image super-resolution using deep convolutional neural network", Journal of Latex Templates, Neurocomputing, Available Online at: https://www.researchgate.net/publication/317024713_Hyperspectral_image_super-resolution_using_deep_convolutional_neural_network May 3, 2017, pp. 1-35.
Mei et al., "Hyperspectral Image Spatial Super-Resolution via 3D Full Convolutional Neural Network", Remote Sensing, vol. 9, Available Online at: https://www.researchgate.net/publication/320913192_Hyperspectral_Image_Spatial_Super-Resolution_via_3D_Full_Convolutional_Neural_Network Nov. 7, 2017, pp. 1-22.
Miao et al: "lambda-Net: Reconstruct Hyperspectral Images From a Snapshot Measurement", 2019 IEEE/CVF International Conference on Computer Vision (ICCV), IEEE, Oct. 27, 2019, pp. 4058-4068.
Yang et al., "Hyperspectral and Multispectral Image Fusion via Deep Two-Branches Convolutional Neural Network," May 21, 2018 (Year: 2018). *

Also Published As

Publication number Publication date
WO2021078629A1 (en) 2021-04-29
CN114556428A (en) 2022-05-27
US20240303773A1 (en) 2024-09-12

Legal Events

Date Code Title Description
FEPP Fee payment procedure
    Free format text: ENTITY STATUS SET TO UNDISCOUNTED (ORIGINAL EVENT CODE: BIG.); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY
AS Assignment
    Owner name: SONY GROUP CORPORATION, JAPAN
    Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:SARTOR, PIERGIORGIO;GATTO, ALEXANDER;UEMORI, TAKESHI;AND OTHERS;SIGNING DATES FROM 20221025 TO 20230803;REEL/FRAME:064575/0670
STPP Information on status: patent application and granting procedure in general
    Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION
STPP Information on status: patent application and granting procedure in general
    Free format text: RESPONSE TO NON-FINAL OFFICE ACTION ENTERED AND FORWARDED TO EXAMINER
STPP Information on status: patent application and granting procedure in general
    Free format text: ALLOWED -- NOTICE OF ALLOWANCE NOT YET MAILED
    Free format text: NOTICE OF ALLOWANCE MAILED -- APPLICATION RECEIVED IN OFFICE OF PUBLICATIONS
STPP Information on status: patent application and granting procedure in general
    Free format text: PUBLICATIONS -- ISSUE FEE PAYMENT RECEIVED
STPP Information on status: patent application and granting procedure in general
    Free format text: PUBLICATIONS -- ISSUE FEE PAYMENT VERIFIED
STCF Information on status: patent grant
    Free format text: PATENTED CASE