JP7631291B2

JP7631291B2 - Information processing system, information processing method, and program

Info

Publication number: JP7631291B2
Application number: JP2022210579A
Authority: JP
Inventors: ヴィヴェクバルソピア; メナンヅロローハス
Original assignee: Rakuten Group Inc
Current assignee: Rakuten Group Inc
Priority date: 2022-12-27
Filing date: 2022-12-27
Publication date: 2025-02-18
Anticipated expiration: 2042-12-27
Also published as: JP2024093923A

Description

The present disclosure relates to an information processing system, an information processing method, and a program.

A technique is known in which image conversion is applied to an original image showing an object to obtain a different image showing the same object. For example, a large number of varied images can be obtained by applying operations such as rotation, enlargement or reduction, brightness changes, and color tone changes to the original image. The images obtained in this way are used, for example, as training data for a machine learning model for image recognition (see Patent Document 1 below).

Patent Document 1: Domestic re-publication of PCT International Publication No. 2021/038678

In the above conventional technique, the available forms of image conversion are limited. For example, it is difficult with the conventional technique to change the orientation, pose, shape, or arrangement of the object shown in the original image.

The present invention has been made in view of the above problem, and its object is to provide an information processing system, an information processing method, and a program that expand the available forms of image conversion.

An information processing system according to the present disclosure includes: original image acquisition means for acquiring an original image showing an object; virtual image generation means for generating, based on a 3D model representing the object, lighting data, and texture data, a virtual image showing a virtual space in which the 3D model is placed, as seen from a virtual viewpoint; feature extraction means for extracting an original image feature from the original image and a virtual image feature from the virtual image; and parameter adjustment means for adjusting parameters of the lighting data and parameters of the texture data so as to increase the similarity between the original image feature and the virtual image feature.

FIG. 1 is a diagram showing an example of a hardware configuration of the information processing system according to the present embodiment.
FIG. 2 is a diagram showing an overview of processing in the information processing system according to the present embodiment.
FIG. 3 is a functional block diagram showing an example of functions realized in the information processing system according to the present embodiment.
FIG. 4 is a diagram showing how a virtual image is generated.
FIG. 5 is a diagram showing how virtual images viewed from each of a plurality of virtual viewpoints are generated.
FIG. 6 is a flow chart showing an example of processing executed in the information processing system according to the present embodiment.
FIG. 7 is a flow chart showing an example of processing executed in the information processing system according to the present embodiment.

[1. Hardware configuration of the information processing system according to this embodiment]
An example of an embodiment of the information processing system according to the present disclosure is described below. FIG. 1 is a diagram showing an example of a hardware configuration of the information processing system 1 according to this embodiment. As shown in FIG. 1, the information processing system 1 is a computer such as a server computer or a personal computer, and includes a control unit 11, a storage unit 12, a communication unit 13, an operation unit 14, and a display unit 15. Although FIG. 1 shows the case where the information processing system 1 consists of a single computer, the information processing system 1 may consist of multiple computers.

The control unit 11 includes at least one processor. The storage unit 12 includes a volatile memory such as RAM and a non-volatile memory such as flash memory. The communication unit 13 includes at least one of a communication interface for wired communication and a communication interface for wireless communication. The operation unit 14 is an input device such as a keyboard, a mouse, or a touch panel. The display unit 15 is a display such as a liquid crystal display or an organic EL display.

The program stored in the storage unit 12 may be supplied via a network N. Alternatively, a program stored on a computer-readable information storage medium may be supplied via a reading unit that reads the information storage medium (e.g., an optical disc drive or a memory card slot), or via an input/output unit for exchanging data with an external device (e.g., a USB port).

[2. Overview of the information processing system according to this embodiment]
An overview of the information processing system 1 of this embodiment is given below. FIG. 2 is a diagram showing an overview of the processing of the information processing system 1 of this embodiment.

As shown in FIG. 2, the information processing system 1 of this embodiment adjusts the parameters of lighting data L and texture data T so as to increase the similarity between the appearance of an object O shown in an original image OI and the appearance of the object O shown in a virtual image VI generated based on a 3D model M representing the object O, the lighting data L, and the texture data T. Specifically, the similarity between the two appearances is evaluated as the similarity between an original image feature OF extracted from the original image OI and a virtual image feature VF extracted from the virtual image VI. In this specification, the "appearance of the object O" means how the object O looks (e.g., its color and material appearance) apart from its orientation, pose, shape, and arrangement.

According to the information processing system 1 of this embodiment, the parameter adjustment described above yields lighting data L and texture data T that reflect the appearance of the object O shown in the original image OI. By rendering the 3D model M under various conditions using the lighting data L and texture data T obtained in this way, images in which the original image OI is converted in various ways can be obtained. In this specification, the term "conversion" is used in a broad sense. That is, "converting" the original image OI means obtaining, based on the original image OI showing the object O, an image showing the object O that differs from the original image OI (hereinafter referred to as a "converted image"); it is not limited to performing some operation on the original image OI and obtaining the converted image directly from it. The information processing system 1 of this embodiment is described in detail below.

[3. Functions realized in the information processing system according to this embodiment]
The functions realized by the information processing system 1 of this embodiment, shown in FIG. 3, are described below with reference to FIG. 2.

FIG. 3 is a functional block diagram showing an example of functions realized by the information processing system 1 of this embodiment. As shown in FIG. 3, in this embodiment, an original image storage unit 100, a 3D model storage unit 101, a lighting data storage unit 102, a texture data storage unit 103, a feature extractor storage unit 104, an original image acquisition unit 110, a virtual image generation unit 120, a feature extraction unit 130, a parameter adjustment unit 140, and a converted image generation unit 150 are realized by the information processing system 1. The original image storage unit 100, the 3D model storage unit 101, the lighting data storage unit 102, the texture data storage unit 103, and the feature extractor storage unit 104 are realized mainly by the storage unit 12. The original image acquisition unit 110, the virtual image generation unit 120, the feature extraction unit 130, the parameter adjustment unit 140, and the converted image generation unit 150 are realized mainly by the control unit 11.

[Original image storage unit]
The original image storage unit 100 stores the original image OI.

As shown in FIG. 2, the original image OI is an image showing the object O.

The object O is an object having a definite shape and size. In this embodiment, the object O is a dog, as an example. The object O may be a living thing such as a person, an animal, or a plant, or an inanimate thing such as a stone, a mountain, or a building.

The original image OI shows part or all of the object O. In this embodiment the original image OI shows the whole of the object O, but it may show only part of it. Also, in this embodiment the original image OI shows a single object O, but it may show multiple objects O. When the original image OI shows multiple objects O, the objects may be of the same type or of different types. In terms of the example of this embodiment, the original image OI may show birds, trees, mountains, and so on in addition to the dog.

In this embodiment, the original image OI is a photograph of the object O placed in real space. This makes it possible to obtain, from a photograph of the object O in real space, a converted image showing the object O in a form that does not actually exist. As an example, by using the information processing system 1 of this embodiment to convert a photograph of a brand-new screw placed in real space, converted images showing that screw in forms that do not actually exist can be obtained (an image of the screw rusted, an image of the screw bent, an image of the screw with a crushed drive, and so on). Converted images showing screws in such varied forms can be used, for example, as training data for a machine learning model for detecting defective products. The original image OI is not limited to a photograph of the object O placed in real space, and may be, for example, a hand-drawn picture or a rendered image of a 3D model.

[Original image acquisition unit]
The original image acquisition unit 110 acquires the original image OI. In this embodiment, the original image OI is stored in the original image storage unit 100, so the original image acquisition unit 110 acquires the original image OI from the original image storage unit 100. The original image OI may instead be stored in an external information storage medium or an external computer. In that case, the original image acquisition unit 110 may acquire the original image OI from the external information storage medium or computer.

[3D model storage unit]
The 3D model storage unit 101 stores data on the 3D model M.

As shown in FIG. 2, the 3D model M is a model representing the object O. The 3D model M is composed of at least one polygon representing the three-dimensional shape of the object O. In this embodiment, the 3D model M corresponds to mesh data representing the three-dimensional shape of the object O, that is, model data that does not include the texture data T described later. In this embodiment the 3D model M is a solid model, but it may be a wireframe model or a surface model. In this embodiment, the 3D model M is assumed to have been created in advance with any 3DCG software, such as 3D-CAD. The data on the 3D model M indicates the three-dimensional coordinates of each vertex defining the polygons of the 3D model M. In this embodiment the data on the 3D model M is stored in the 3D model storage unit 101, but it may be stored in an external information storage medium or an external computer.
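As a minimal sketch of what such vertex-based mesh data might look like, the following Python snippet stores a mesh as a vertex array plus a face-index array and derives one normal per polygon. The tetrahedron, the names `vertices` and `faces`, and the helper `face_normals` are illustrative assumptions and do not come from the disclosure.

```python
import numpy as np

# Hypothetical mesh: a tetrahedron standing in for the 3D model M.
# Each row of `vertices` is the 3D coordinate (x, y, z) of one vertex.
vertices = np.array([
    [0.0, 0.0, 0.0],
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
])

# Each row of `faces` lists the vertex indices of one triangular polygon.
faces = np.array([
    [0, 2, 1],
    [0, 1, 3],
    [0, 3, 2],
    [1, 2, 3],
])


def face_normals(vertices: np.ndarray, faces: np.ndarray) -> np.ndarray:
    """Return one unit normal per triangular polygon."""
    a, b, c = (vertices[faces[:, i]] for i in range(3))
    n = np.cross(b - a, c - a)
    return n / np.linalg.norm(n, axis=1, keepdims=True)


normals = face_normals(vertices, faces)
```

Note that this structure carries geometry only; no color or material information is attached, which matches the separation between the 3D model M and the texture data T.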

The number of 3D models M equals the number of objects O shown in the original image OI. In this embodiment, the original image OI shows one object O, so there is one 3D model M.

Here, "representing the object O" means representing the three-dimensional shape of an object in the same conceptual category as the object O shown in the original image OI; it does not necessarily mean representing the three-dimensional shape of the very object O shown in the original image OI. For example, if the object O shown in the original image OI is an individual beagle named "John", it is sufficient for the 3D model M to represent the three-dimensional shape of a beagle, which is an object in the same category as the object O; the 3D model M need not represent the three-dimensional shape of the individual beagle named "John" itself.

That said, the 3D model M preferably represents the three-dimensional shape of the very object O shown in the original image OI. This makes it possible to obtain a converted image in which the orientation, pose, or position of the object O shown in the original image OI is changed while its appearance and shape are maintained with high accuracy.

[Lighting data storage unit]
The lighting data storage unit 102 stores the lighting data L.

The lighting data L is data on a light source that illuminates the 3D model M. Specifically, the lighting data L includes as parameters, for example, the intensity and color of the light emitted from the light source, and the position and orientation of the light source in the virtual space VS described later. The parameters of the lighting data L are not limited to these; the lighting data L may include other parameters and need not include those listed. Any type of light source can be used, such as a point light source, a line light source, an area light source, or a volume light source. When image-based lighting is used, an image capturing omnidirectional light information from the real world may be used as the light source.

[Texture data storage unit]
The texture data storage unit 103 stores the texture data T.

The texture data T is data on the texture applied to the surface of the 3D model M. The texture, also referred to as surface quality, covers, for example, the color, glossiness, transparency, metallic look, and surface relief of the object O. Specifically, the texture data T is image data representing the texture applied to the surface of the 3D model M. In this embodiment, the texture data T includes as parameters, for example, the diffuse reflectance (albedo), normal vectors (normals), specular reflectance, metallic value, roughness, gloss, anisotropy, and transparency of the surface of the object O. The parameters of the texture data T are not limited to these; the texture data T may include other parameters and need not include those listed.
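To make the parameter lists for the lighting data L and the texture data T more concrete, the sketch below groups a few of the named parameters into two containers. The class names, field names, default values, and the 4x4 map size are all hypothetical choices for illustration; the disclosure does not prescribe a storage layout.

```python
from dataclasses import dataclass, field

import numpy as np


@dataclass
class LightingData:
    """Hypothetical container for the lighting data L (a single light source)."""
    intensity: float = 1.0
    color: np.ndarray = field(default_factory=lambda: np.ones(3))  # RGB
    position: np.ndarray = field(
        default_factory=lambda: np.array([0.0, 5.0, 5.0]))
    direction: np.ndarray = field(
        default_factory=lambda: np.array([0.0, -1.0, -1.0]))


@dataclass
class TextureData:
    """Hypothetical container for the texture data T (image-like maps)."""
    albedo: np.ndarray     # (H, W, 3) diffuse reflectance
    normal: np.ndarray     # (H, W, 3) normal vectors
    roughness: np.ndarray  # (H, W)
    metallic: np.ndarray   # (H, W)


def make_default_texture(size: int = 4) -> TextureData:
    """Build a flat gray, non-metallic texture as a starting point."""
    normal = np.zeros((size, size, 3))
    normal[..., 2] = 1.0  # every normal points along +z
    return TextureData(
        albedo=np.full((size, size, 3), 0.5),
        normal=normal,
        roughness=np.full((size, size), 0.5),
        metallic=np.zeros((size, size)),
    )


lighting = LightingData()
texture = make_default_texture()
```

Every numeric entry in these containers is a candidate for the parameter adjustment described later.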

[Virtual image generation unit]
The processing of the virtual image generation unit 120 is described below with reference to FIG. 4. FIG. 4 is a diagram showing how a virtual image is generated.

As shown in FIGS. 2 and 4, the virtual image generation unit 120 generates, based on the 3D model M, the lighting data L, and the texture data T, a virtual image VI showing the virtual space VS in which the 3D model M is placed, as seen from a virtual viewpoint VV. Put differently, the virtual image generation unit 120 renders the 3D model M using the lighting data L, the texture data T, and the virtual viewpoint VV as rendering conditions. Put another way, the virtual image generation unit 120 performs physically based rendering based on the 3D model M, the lighting data L, and the texture data T to generate a rendered image. Put yet another way, the virtual image generation unit 120 generates a novel-view image of the object O shown in the original image OI. In addition to the above, the rendering conditions may include, for example, a background image of the virtual space VS and the resolution and aspect ratio of the virtual image VI to be generated. Various known rendering techniques (rendering pipelines), such as forward rendering and deferred rendering, can be used to generate the virtual image VI.

The virtual space VS is a virtual three-dimensional space. Three mutually orthogonal coordinate axes are set in the virtual space VS. These are the axes of the world coordinate system. The origin may be at any position, and positions in the virtual space VS are expressed as three-dimensional coordinates. The virtual image generation unit 120 may set a background image in the virtual space VS. The background image is, for example, an image showing scenery such as trees, mountains, or the sky.

The virtual viewpoint VV, also called a virtual camera, defines the coordinate axes of the view coordinate system. The virtual image generation unit 120 sets the parameters of the virtual viewpoint VV (position, angle of view, magnification, and so on).
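One common way a virtual camera defines a view coordinate system is a "look-at" matrix that maps world coordinates to view coordinates. The sketch below is a generic construction under assumed conventions (right-handed axes, camera looking down its own -z axis, +y as the up hint); the disclosure does not specify these conventions, and the function name `look_at` is hypothetical.

```python
import numpy as np


def look_at(eye, target, up=(0.0, 1.0, 0.0)) -> np.ndarray:
    """Return a 4x4 world-to-view matrix for a camera at `eye` facing `target`.

    Assumed convention: right-handed, the camera looks along its -z axis.
    """
    eye = np.asarray(eye, dtype=float)
    target = np.asarray(target, dtype=float)

    forward = target - eye
    forward = forward / np.linalg.norm(forward)
    right = np.cross(forward, np.asarray(up, dtype=float))
    right = right / np.linalg.norm(right)
    true_up = np.cross(right, forward)

    view = np.eye(4)
    view[0, :3] = right       # view-space x axis
    view[1, :3] = true_up     # view-space y axis
    view[2, :3] = -forward    # view-space z axis (points behind the camera)
    view[:3, 3] = -view[:3, :3] @ eye  # move the camera to the view origin
    return view


# Camera 5 units in front of the world origin, looking at it.
view = look_at(eye=(0.0, 0.0, 5.0), target=(0.0, 0.0, 0.0))
origin_in_view = view @ np.array([0.0, 0.0, 0.0, 1.0])
eye_in_view = view @ np.array([0.0, 0.0, 5.0, 1.0])
```

In view coordinates the camera sits at the origin and the world origin lies 5 units along -z, directly in front of it.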

The processing of the virtual image generation unit 120 is described in more detail below with reference to FIG. 5. FIG. 5 is a diagram showing how virtual images viewed from each of a plurality of virtual viewpoints are generated.

More specifically, as shown in FIG. 5, the virtual image generation unit 120 generates a plurality of virtual images VI1, VI2, and VI3, each corresponding to one of a plurality of mutually different virtual viewpoints VV1, VV2, and VV3 and each showing the virtual space VS as seen from its corresponding viewpoint. Although there are three virtual viewpoints in this embodiment, there may be fewer or more.
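As an illustration of how several mutually different virtual viewpoints might be chosen, the sketch below spaces camera positions evenly on a horizontal circle around the origin, where the 3D model is assumed to sit. The circular layout, the function name `orbit_viewpoints`, and the radius and height values are assumptions; the disclosure only requires that the viewpoints differ from one another.

```python
import numpy as np


def orbit_viewpoints(num_views: int = 3,
                     radius: float = 5.0,
                     height: float = 1.0) -> np.ndarray:
    """Return (num_views, 3) camera positions evenly spaced on a circle.

    The circle lies in the plane y = height and is centered on the y axis.
    """
    angles = np.linspace(0.0, 2.0 * np.pi, num_views, endpoint=False)
    return np.stack(
        [radius * np.sin(angles),
         np.full(num_views, height),
         radius * np.cos(angles)],
        axis=1,
    )


# Three positions standing in for the virtual viewpoints VV1, VV2, and VV3.
viewpoints = orbit_viewpoints()
```

Each position would then be paired with an orientation (for example, facing the origin) and passed to the renderer to produce VI1, VI2, and VI3.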

As described above, the information processing system 1 of this embodiment generates virtual images VI showing the virtual space VS as seen from each of the virtual viewpoints VV1, VV2, and VV3. That is, the information processing system 1 of this embodiment can reflect the appearance of the object O shown in the original image OI in the appearance of the object O as seen from each of the virtual viewpoints VV1, VV2, and VV3. As a result, as described later, the appearance of the object O shown in the original image OI can be suitably preserved in a converted image in which the orientation, pose, shape, or arrangement of the object O is changed.

[Feature extractor storage unit]
The feature extractor storage unit 104 stores the feature extractor used in the processing of the feature extraction unit 130. Specifically, the feature extractor storage unit 104 stores the program and parameters of the feature extractor.

The feature extractor is a trained machine learning model that extracts features from an image. It is used to extract the first intermediate feature and the second intermediate feature described later. In this embodiment, a trained CNN (convolutional neural network) with its fully connected layers removed is used as the feature extractor. Other known trained machine learning models can also be used as the feature extractor.

[Feature extraction unit]
The feature extraction unit 130 extracts the original image feature OF from the original image OI and extracts the virtual image feature VF from the virtual image VI. In this embodiment, the feature extraction unit 130 extracts, from each of the virtual images VI1, VI2, and VI3, the virtual image feature VF corresponding to that virtual image.

Specifically, the feature extraction unit 130 extracts a first intermediate feature from the original image OI and extracts the original image feature OF from the first intermediate feature. Likewise, the feature extraction unit 130 extracts a second intermediate feature from the virtual image VI and extracts the virtual image feature VF from the second intermediate feature. The feature extraction unit 130 extracts both the first and the second intermediate feature using the feature extractor described above.

The first intermediate feature contains information on the shape and arrangement of the object O in the original image OI. The second intermediate feature contains information on the shape and arrangement of the object O in the virtual image VI. Because this embodiment uses a trained CNN with its fully connected layers removed as the feature extractor, the first and second intermediate features are, specifically, feature maps output from a convolutional layer of the CNN. The number of feature maps equals the number of filters (kernels).
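The relationship between filters and feature maps can be seen in a small hand-written convolution. The sketch below applies four random 3x3 filters to a random 8x8 RGB array and obtains four feature maps that still have a spatial layout. The random filters stand in for the weights of a trained CNN, which are not given in the disclosure, and the function name `conv2d` is hypothetical.

```python
import numpy as np


def conv2d(image: np.ndarray, kernels: np.ndarray) -> np.ndarray:
    """Valid cross-correlation of an (H, W, C) image with (K, kh, kw, C) kernels.

    Returns K feature maps with shape (K, H - kh + 1, W - kw + 1), after ReLU.
    """
    h, w, _ = image.shape
    k, kh, kw, _ = kernels.shape
    out = np.zeros((k, h - kh + 1, w - kw + 1))
    for i in range(out.shape[1]):
        for j in range(out.shape[2]):
            patch = image[i:i + kh, j:j + kw, :]
            # Response of every kernel to this patch, computed at once.
            out[:, i, j] = np.tensordot(
                kernels, patch, axes=([1, 2, 3], [0, 1, 2]))
    return np.maximum(out, 0.0)


rng = np.random.default_rng(0)
image = rng.random((8, 8, 3))                # stand-in for OI or VI
kernels = rng.standard_normal((4, 3, 3, 3))  # 4 random filters, not trained
feature_maps = conv2d(image, kernels)        # one map per filter
```

Because each map keeps a height and width, it still encodes where in the image a pattern occurs, which is why the intermediate features carry shape and arrangement information.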

Here, the original image feature OF is a feature indicating the frequency distribution of the colors in the original image OI, and the virtual image feature VF is a feature indicating the frequency distribution of the colors in the virtual image VI. Put differently, the original image feature OF does not contain information on the shape and arrangement of the object O in the original image OI, and the virtual image feature VF does not contain information on the shape and arrangement of the object O in the virtual image VI. In this embodiment, the original image feature OF and the virtual image feature VF may each be a vector representation corresponding to features that a trained feature extractor such as a CNN can output. Specifically, in this embodiment, the original image feature OF and the virtual image feature VF are each a Gram matrix obtained by computing the correlations between the feature maps output from a convolutional layer of the CNN. More precisely, this Gram matrix is obtained by computing the inner products between those feature maps. The image features may additionally indicate the frequency distribution of texture patterns.
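The Gram matrix computation can be sketched in a few lines: flatten each feature map into a vector and take all pairwise inner products. The example also checks a property that follows from this definition, namely that shuffling the spatial positions (identically across all maps) leaves the Gram matrix unchanged, which is one way to see why it carries no arrangement information. The division by the number of positions is an assumed normalization, and the random array stands in for real CNN feature maps.

```python
import numpy as np


def gram_matrix(feature_maps: np.ndarray) -> np.ndarray:
    """Gram matrix of (K, H, W) feature maps.

    Entry (a, b) is the inner product of flattened maps a and b,
    divided by H * W (the normalization is an assumption).
    """
    k = feature_maps.shape[0]
    flat = feature_maps.reshape(k, -1)
    return flat @ flat.T / flat.shape[1]


rng = np.random.default_rng(0)
maps = rng.random((4, 6, 6))  # stand-in for K = 4 feature maps
gram = gram_matrix(maps)

# Apply the same spatial shuffle to every map, then recompute.
perm = rng.permutation(36)
shuffled = maps.reshape(4, -1)[:, perm].reshape(4, 6, 6)
gram_shuffled = gram_matrix(shuffled)
```

The result is a symmetric K x K matrix whose size depends only on the number of filters, not on the image resolution.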

[Parameter adjustment unit]
The parameter adjustment unit 140 adjusts the parameters of the lighting data L and the parameters of the texture data T so as to increase the similarity between the original image feature OF and the virtual image feature VF. That is, the parameter adjustment unit 140 adjusts these parameters so as to increase the similarity between the feature indicating the frequency distribution of the colors in the original image OI and the feature indicating the frequency distribution of the colors in the virtual image VI. It can also be said that the parameter adjustment unit 140 adjusts these parameters so as to increase the similarity between the color scheme of the original image OI and that of the virtual image VI. As a result, even if the arrangement of the object O in the original image OI differs from its arrangement in the virtual image VI, the information processing system 1 of this embodiment can accurately reflect the appearance of the object O shown in the original image OI in the appearance of the object O shown in the converted image.

Specifically, the parameter adjustment unit 140 calculates a loss based on the original image feature OF and the virtual image feature VF, and adjusts the parameters of the lighting data L and the parameters of the texture data T so as to reduce that loss. Known methods such as squared error or cross entropy can be used to calculate the loss. Known parameter adjustment methods such as gradient descent and backpropagation can be used to adjust the parameters of the lighting data L and the texture data T. The parameter adjustment unit 140 does not adjust the parameters of the feature extractor (weights, biases, and so on); they remain fixed.
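The loop of rendering, extracting features, computing a loss, and updating only the lighting and texture parameters can be illustrated with a deliberately tiny stand-in. In the sketch below, the "renderer" multiplies a fixed shading pattern by an RGB albedo and a light intensity, the "feature extractor" is a fixed 3x3 channel Gram matrix with no adjustable parameters, the loss is the squared error, and gradients are estimated by finite differences. All of these are simplifying assumptions for illustration; a practical system would more likely rely on a differentiable renderer with backpropagation, and none of the function names come from the disclosure.

```python
import numpy as np

rng = np.random.default_rng(0)
SHADING = rng.uniform(0.2, 1.0, size=(8, 8))  # fixed per-pixel shading (toy)


def render(params: np.ndarray) -> np.ndarray:
    """Toy renderer. params = [albedo_r, albedo_g, albedo_b, light_intensity]."""
    albedo, intensity = params[:3], params[3]
    return SHADING[..., None] * albedo * intensity  # (8, 8, 3) image


def extract_features(image: np.ndarray) -> np.ndarray:
    """Frozen stand-in for the feature extractor: 3x3 channel Gram matrix."""
    flat = image.reshape(-1, 3)
    return flat.T @ flat / flat.shape[0]


def loss_fn(params: np.ndarray, target: np.ndarray) -> float:
    """Squared error between virtual-image features and target features."""
    diff = extract_features(render(params)) - target
    return float(np.sum(diff ** 2))


def numerical_grad(params, target, eps=1e-5):
    """Central finite-difference estimate of the loss gradient."""
    grad = np.zeros_like(params)
    for i in range(params.size):
        step = np.zeros_like(params)
        step[i] = eps
        grad[i] = (loss_fn(params + step, target)
                   - loss_fn(params - step, target)) / (2 * eps)
    return grad


# Features of the "original image", produced here from hidden parameters.
target = extract_features(render(np.array([0.8, 0.5, 0.2, 1.0])))

params = np.array([0.5, 0.5, 0.5, 1.0])  # initial texture/lighting guess
initial_loss = loss_fn(params, target)
for _ in range(500):
    # Only `params` is updated; the feature extractor stays fixed.
    params -= 0.2 * numerical_grad(params, target)
final_loss = loss_fn(params, target)
```

In this toy, albedo and intensity only ever appear as a product, so different parameter combinations can reach the same low loss; the sketch shows the optimization loop, not a unique recovery of the parameters.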

In this embodiment, the parameter adjustment unit 140 adjusts the parameters of the lighting data L and the parameters of the texture data T so that the similarity between the original image feature OF and the virtual image feature VF corresponding to each of the multiple virtual images VI1, VI2, and VI3 increases. Specifically, the parameter adjustment unit 140 first adjusts the parameters so that the similarity between the original image feature OF and the virtual image feature VF corresponding to the virtual image VI1 increases. Next, it adjusts the parameters so that the similarity between the original image feature OF and the virtual image feature VF corresponding to the virtual image VI2 increases. Finally, it adjusts the parameters so that the similarity between the original image feature OF and the virtual image feature VF corresponding to the virtual image VI3 increases.

Here, it is preferable that the orientation of the object O shown in the original image OI matches the orientation of the object O shown in the virtual image VI1, that is, the virtual image whose virtual image feature VF is processed first by the parameter adjustment unit 140. This reduces the processing load of adjusting the parameters of the lighting data L and the parameters of the texture data T.

When a background image is set in the virtual space VS, the parameter adjustment unit 140 may adjust the parameters of the background image in addition to the parameters of the lighting data L and the parameters of the texture data T.

The feature extraction unit 130 and the parameter adjustment unit 140 described above may be configured, for example, according to the description in the reference (Gatys, L.A., Ecker, A.S., Bethge, M.: Image style transfer using convolutional neural networks. In: CVPR. (2016)).
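In the cited reference, the style of an image is represented by Gram matrices of convolutional feature maps, and the style loss for a layer is the squared difference between two Gram matrices scaled by 1 / (4 N^2 M^2), where N is the number of channels and M the number of spatial positions. The sketch below shows that computation on tiny hand-written feature maps. Whether the embodiment uses exactly this representation is an assumption; the source only says the units may be configured according to the reference.

```python
from typing import List

FeatureMap = List[List[float]]  # indexed as [channel][spatial position]


def gram_matrix(features: FeatureMap) -> List[List[float]]:
    """G[i][j] = sum over positions of F[i][k] * F[j][k]; position order is lost."""
    return [[sum(a * b for a, b in zip(fi, fj)) for fj in features]
            for fi in features]


def style_loss(f_original: FeatureMap, f_virtual: FeatureMap) -> float:
    """Per-layer style loss in the form given by Gatys et al. (2016)."""
    n = len(f_original)       # number of channels
    m = len(f_original[0])    # number of spatial positions
    g_o, g_v = gram_matrix(f_original), gram_matrix(f_virtual)
    total = sum((g_o[i][j] - g_v[i][j]) ** 2
                for i in range(n) for j in range(n))
    return total / (4 * n ** 2 * m ** 2)


# Toy intermediate features: 2 channels, 4 spatial positions.
first = [[1, 2, 3, 4], [0, 1, 0, 2]]
moved = [[4, 3, 2, 1], [2, 0, 1, 0]]   # same responses at reversed positions
other = [[1, 1, 1, 1], [2, 2, 2, 2]]   # different channel statistics

loss_moved = style_loss(first, moved)
loss_other = style_loss(first, other)
```

The intermediate feature map still encodes where each response occurs, but the Gram matrix sums over positions, so `loss_moved` is zero while `loss_other` is positive. This is one way a feature that carries shape and arrangement information can be reduced to one that does not.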

[Converted image generation unit]
The converted image generation unit 150 generates a converted image based on the 3D model M and on the lighting data L and texture data T whose parameters have been adjusted by the parameter adjustment unit 140. The converted image can also be regarded as a variation of the original image OI. The processing executed by the converted image generation unit 150 is the same as that of the virtual image generation unit 120.

As described above, the information processing system 1 of this embodiment generates a converted image using the lighting data L and texture data T whose parameters have been adjusted. When generating the converted image, a variety of converted images can be obtained by changing various conditions (the position, pose, shape, and orientation of the 3D model M; the position, magnification, and angle of view of the virtual viewpoint VV; and so on). An even wider variety of converted images can be obtained by further adjusting the parameters of the lighting data L and the texture data T at generation time, thereby changing the position and brightness of the lighting and the texture set on the surface of the object O.
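One simple way to organize such variation is to enumerate combinations of rendering conditions while holding the adjusted lighting and texture parameters fixed. The sketch below is only illustrative: the `RenderConfig` fields, the chosen value ranges, and the flat tuples standing in for the lighting data L and texture data T are all hypothetical, and the actual rendering step is omitted.

```python
from dataclasses import dataclass
from itertools import product
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class RenderConfig:
    """Hypothetical description of one converted image to render."""
    model_yaw_deg: float            # orientation of the 3D model M
    viewpoint_distance: float       # position of the virtual viewpoint VV
    magnification: float            # magnification of the virtual viewpoint VV
    lighting: Tuple[float, ...]     # adjusted lighting parameters (kept fixed)
    texture: Tuple[float, ...]      # adjusted texture parameters (kept fixed)


def enumerate_configs(lighting: Sequence[float], texture: Sequence[float],
                      yaws: Sequence[float], distances: Sequence[float],
                      magnifications: Sequence[float]) -> List[RenderConfig]:
    """Cartesian product of the varied conditions; one config per converted image."""
    return [RenderConfig(yaw, dist, mag, tuple(lighting), tuple(texture))
            for yaw, dist, mag in product(yaws, distances, magnifications)]


configs = enumerate_configs(
    lighting=(0.9,),
    texture=(0.66, 0.33, 0.11),
    yaws=(0, 90, 180, 270),
    distances=(1.0, 2.0),
    magnifications=(1.0, 1.5, 2.0),
)
```

With 4 orientations, 2 distances, and 3 magnifications this yields 24 distinct configurations, each of which would be passed to the renderer to produce one converted image.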

In addition, the information processing system 1 of this embodiment makes it easy to generate a large number of such varied converted images. The converted images obtained in this way can be used, for example, as training data for a machine learning model for image recognition.

Note that the use of the information processing system 1 of this embodiment is not limited to generating the converted images described above. For example, it can also be used to create game objects that appear in a video game. That is, (1) an original image OI showing a game character as the object O is created; (2) a virtual image VI is generated based on a 3D model M representing the game character, lighting data L, and texture data T; and (3) by performing the parameter adjustment described above, a game object that reflects the appearance of the original image OI is obtained.

[4. Processing Executed in the Information Processing System of This Embodiment]
Finally, the processing executed in the information processing system 1 of this embodiment is described with reference to FIG. 6 and FIG. 7, which are flowcharts showing an example of that processing. The processing shown in FIG. 6 and FIG. 7 is executed by the control unit 11 operating according to a program stored in the storage unit 12. The processing described below is an example of the processing executed by the functional blocks shown in FIG. 3.

As shown in FIG. 6, the control unit 11 first acquires the original image OI from the storage unit 12 (S100). The control unit 11 extracts a first intermediate feature from the acquired original image OI (S101). The control unit 11 then extracts the original image feature OF from the first intermediate feature and stores it in the storage unit 12 (S102).

Moving on to FIG. 7, the control unit 11 sets a virtual viewpoint VV (S103). Specifically, the control unit 11 sets parameters such as the position, angle of view, and magnification of the virtual viewpoint VV. Next, the control unit 11 acquires the 3D model M, the lighting data L, and the texture data T from the storage unit 12 (S104), and generates a virtual image VI based on them (S105).

The control unit 11 extracts a second intermediate feature from the generated virtual image VI (S106). The control unit 11 then extracts the virtual image feature VF from the second intermediate feature and stores it in the storage unit 12 (S107). The control unit 11 adjusts the parameters of the lighting data L and the parameters of the texture data T so that the similarity between the original image feature OF and the virtual image feature VF stored in the storage unit 12 increases (S108).

When the control unit 11 determines that the similarity between the original image feature OF and the virtual image feature VF does not satisfy a predetermined condition and that the adjustment of the parameters of the lighting data L and the texture data T should be repeated (S109; N), it executes the processing from S105 to S108 again. Specifically, the control unit 11 repeats the adjustment of the parameters of the lighting data L and the texture data T until the loss calculated from the original image feature OF and the virtual image feature VF falls below a predetermined threshold.

On the other hand, when the control unit 11 determines that the similarity between the original image feature OF and the virtual image feature VF satisfies the predetermined condition and that the parameter adjustment is to end (S109; Y), it then determines whether to generate a virtual image VI from a different virtual viewpoint VV (S110). When the control unit 11 determines that a virtual image VI is to be generated from a different virtual viewpoint VV (S110; N), it executes the processing from S103 to S109 again. Specifically, the control unit 11 repeats the processing from S103 to S109 for each of a predetermined number of virtual viewpoints VV. When the control unit 11 determines that no further virtual image VI is to be generated from a different virtual viewpoint VV (S110; Y), it ends this processing.
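The control flow of S103 to S110 can be sketched as two nested loops: an outer loop over virtual viewpoints and an inner loop that re-renders and adjusts until the loss drops below the threshold. Everything concrete here is assumed for illustration: the scalar feature, the toy renderer in which each viewpoint sees a different mix of two texture regions, the analytic gradient, the learning rate, and the `max_iters` safety cap, which does not appear in the source.

```python
from typing import Dict, List, Sequence, Tuple


def render_and_extract(params: Dict[str, float], view_weight: float) -> float:
    """Hypothetical S105-S107: render VI for one viewpoint and extract a scalar VF."""
    shade = (view_weight * params["tex_front"]
             + (1.0 - view_weight) * params["tex_side"])
    return params["light"] * shade


def run_adjustment(original_feature: float, viewpoints: Sequence[float],
                   params: Dict[str, float], lr: float = 0.1,
                   threshold: float = 1e-8, max_iters: int = 10000
                   ) -> Tuple[Dict[str, float], List[Tuple[float, int, float]]]:
    params = dict(params)                                  # S104: load L and T
    log: List[Tuple[float, int, float]] = []
    for w in viewpoints:                                   # S103, repeated via S110
        iters, loss = 0, float("inf")
        while iters < max_iters:
            vf = render_and_extract(params, w)             # S105-S107
            residual = vf - original_feature
            loss = residual ** 2                           # squared-error loss
            shade = w * params["tex_front"] + (1.0 - w) * params["tex_side"]
            grads = {                                      # analytic gradient of loss
                "light": 2 * residual * shade,
                "tex_front": 2 * residual * params["light"] * w,
                "tex_side": 2 * residual * params["light"] * (1.0 - w),
            }
            for name, g in grads.items():                  # S108: adjust L and T
                params[name] -= lr * g
            iters += 1
            if loss < threshold:                           # S109: condition met
                break
        log.append((w, iters, loss))
    return params, log


OF = 0.6                                                   # from S100-S102 (assumed value)
start = {"light": 1.0, "tex_front": 0.2, "tex_side": 0.9}
final_params, log = run_adjustment(OF, viewpoints=(1.0, 0.5, 0.0), params=start)
```

Each entry of `log` records the viewpoint, the number of S105 to S108 iterations it needed, and the loss at which the S109 check passed. Because the viewpoints are processed one after another, parameters fitted for an earlier viewpoint can shift while a later one is processed; the sketch makes no attempt to correct for that.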

[5. Supplementary Notes]
For example, the information processing system according to the present disclosure may also be configured as follows.

(1)
An information processing system comprising:
an original image acquisition means for acquiring an original image showing an object;
a virtual image generation means for generating, based on a 3D model representing the object, lighting data, and texture data, a virtual image showing a virtual space in which the 3D model is placed, as viewed from a virtual viewpoint;
a feature extraction means for extracting an original image feature from the original image and a virtual image feature from the virtual image; and
a parameter adjustment means for adjusting parameters of the lighting data and parameters of the texture data so as to increase the similarity between the original image feature and the virtual image feature.

(2)
The information processing system according to (1), wherein the original image is a photographed image of the object placed in real space.

(3)
The information processing system according to (1) or (2), wherein
the original image feature indicates a frequency distribution of each color of the original image,
the virtual image feature indicates a frequency distribution of each color of the virtual image, and
the feature extraction means extracts, from the original image, a first intermediate feature including information on the shape and arrangement of the object in the original image, and extracts the original image feature from the first intermediate feature; and extracts, from the virtual image, a second intermediate feature including information on the shape and arrangement of the object in the virtual image, and extracts the virtual image feature from the second intermediate feature.

(4)
The information processing system according to any one of (1) to (3), wherein
the virtual image generation means generates a plurality of virtual images, each corresponding to one of a plurality of mutually different virtual viewpoints and showing the virtual space as viewed from that virtual viewpoint,
the feature extraction means extracts, from each of the plurality of virtual images, a virtual image feature corresponding to that virtual image, and
the parameter adjustment means adjusts the parameters of the lighting data and the parameters of the texture data so as to increase the similarity between the original image feature and the virtual image feature corresponding to each of the plurality of virtual images.

(5)
The information processing system according to any one of (1) to (4), further comprising a converted image generation means for generating a converted image, which is an image showing the object and is different from the original image, based on the 3D model and on the lighting data and the texture data whose parameters have been adjusted by the parameter adjustment means.

According to the information processing system 1 of this embodiment described above, the range of possible image conversions can be expanded.

1 Information processing system, N Network, 11 Control unit, 12 Storage unit, 13 Communication unit, 14 Operation unit, 15 Display unit, 100 Original image storage unit, 101 3D model storage unit, 102 Lighting data storage unit, 103 Texture data storage unit, 104 Feature extractor storage unit, 110 Original image acquisition unit, 120 Virtual image generation unit, 130 Feature extraction unit, 140 Parameter adjustment unit, 150 Converted image generation unit, OI Original image, O Object, M 3D model, L Lighting data, T Texture data, VI, VI1, VI2, VI3 Virtual image, OF Original image feature, VF Virtual image feature, VV, VV1, VV2, VV3 Virtual viewpoint, VS Virtual space.

Claims

An original image acquisition means for acquiring an original image showing an object;
a virtual image generating means for generating a virtual image showing an appearance of a virtual space in which the 3D model is placed, as viewed from a virtual viewpoint, based on a 3D model representing the object, lighting data, and texture data;
a feature extraction means for extracting an original image feature from the original image and a virtual image feature from the virtual image;
a parameter adjustment means for adjusting a parameter of the illumination data and a parameter of the texture data based on a loss indicating a degree of similarity between the original image feature amount and the virtual image feature amount;
An information processing system having the above configuration.

The original image is a photographed image of the object disposed in real space.
The information processing system according to claim 1.

the original image feature amount indicates a frequency distribution of each color of the original image,
the virtual image feature amount indicates a frequency distribution of each color of the virtual image,
the feature extraction means extracts, from the original image, a first intermediate feature amount including information regarding a shape and an arrangement of the object in the original image, and extracts the original image feature amount from the first intermediate feature amount, and also extracts, from the virtual image, a second intermediate feature amount including information regarding a shape and an arrangement of the object in the virtual image, and extracts the virtual image feature amount from the second intermediate feature amount.
The information processing system according to claim 1 or 2.

the virtual image generating means generates a plurality of virtual images each corresponding to a plurality of different virtual viewpoints and showing a state of the virtual space as seen from the corresponding virtual viewpoint;
the feature extraction means extracts, from each of the plurality of virtual images, a virtual image feature corresponding to the virtual image;
the parameter adjustment means adjusts the parameters of the illumination data and the parameters of the texture data so as to increase a similarity between the original image feature amount and a virtual image feature amount corresponding to each of the plurality of virtual images.
The information processing system according to claim 1 or 2.

a converted image generating means for generating a converted image, which is an image showing the object and is different from the original image, based on the 3D model and the lighting data and the texture data whose parameters have been adjusted by the parameter adjusting means;
The information processing system according to claim 1 or 2.

An original image acquisition step of acquiring an original image showing an object;
a virtual image generating step of generating a virtual image showing an appearance of a virtual space in which the 3D model is placed, as viewed from a virtual viewpoint, based on a 3D model representing the object, lighting data, and texture data;
a feature extraction step of extracting an original image feature from the original image and a virtual image feature from the virtual image;
a parameter adjustment step of adjusting parameters of the illumination data and parameters of the texture data based on a loss indicating a degree of similarity between the original image feature amount and the virtual image feature amount;
An information processing method comprising the steps of:

An original image acquisition means for acquiring an original image showing an object;
a virtual image generating means for generating a virtual image showing an appearance of a virtual space in which the 3D model is placed, as viewed from a virtual viewpoint, based on a 3D model representing the object, lighting data, and texture data;
a feature extraction means for extracting an original image feature from the original image and a virtual image feature from the virtual image;
a parameter adjusting means for adjusting a parameter of the illumination data and a parameter of the texture data based on a loss indicating a degree of similarity between the original image feature amount and the virtual image feature amount;
A program that makes a computer function as a