JP7704288B2 - Image conversion device, image conversion method, and image conversion program

Info

Publication number: JP7704288B2
Application number: JP2024502364A
Authority: JP
Inventors: 雄貴蔵内; 真奈笹川; 直紀萩山; 文香佐野; 隆二山本
Original assignee: Nippon Telegraph and Telephone Corp; NTT Inc USA
Current assignee: NTT Inc; NTT Inc USA
Priority date: 2022-02-25
Filing date: 2022-02-25
Publication date: 2025-07-08
Anticipated expiration: 2042-02-25
Also published as: WO2023162131A1; JPWO2023162131A1

Description

Embodiments of the present invention relate to an image conversion device, an image conversion method, and an image conversion program.

Non-Patent Document 1 discloses the possibility of manipulating emotional experience through real-time facial deformation feedback. In Non-Patent Document 1, the subject's face is tracked in real time and a natural facial deformation is applied to it. The facial expression in the facial image is deformed using the Rigid MLS (Moving Least Squares) method as the image conversion method. The Rigid MLS method takes feature points recognized in an image as control points and distorts the image by moving each control point. Here, a facial image is, for example, a photographed image of the subject's face or an extracted image of the face of a computer-generated avatar.
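The source names the Rigid MLS method without giving its formula. As a rough illustration only, the sketch below evaluates a commonly used closed form of rigid MLS for a single 2D point, representing points as Python complex numbers. The function name `rigid_mls_point`, the `alpha` falloff parameter, and the inverse-distance weights are assumptions of this sketch, not details taken from Non-Patent Document 1.

```python
def rigid_mls_point(v, src, dst, alpha=1.0):
    """Map point v under a rigid MLS deformation that moves src[i] to dst[i].

    Points are complex numbers (x + y*1j); src and dst have equal length.
    """
    # A control point maps exactly onto its target (its weight diverges there).
    for p, q in zip(src, dst):
        if v == p:
            return q
    # Inverse-distance weights: nearby control points dominate.
    weights = [1.0 / abs(p - v) ** (2 * alpha) for p in src]
    total = sum(weights)
    p_star = sum(w * p for w, p in zip(weights, src)) / total
    q_star = sum(w * q for w, q in zip(weights, dst)) / total
    # Weighted correlation of the centred source and target points.
    s = sum(w * (p - p_star).conjugate() * (q - q_star)
            for w, p, q in zip(weights, src, dst))
    # Keep only the rotation part (unit modulus); fall back to no rotation.
    rotation = s / abs(s) if abs(s) > 0 else 1.0
    return rotation * (v - p_star) + q_star
```

To warp a whole image, a function like this would be evaluated for every pixel (or grid vertex) and the image resampled accordingly; that step is omitted here.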

吉田成朗ら，「リアルタイムな表情変形フィードバックによる感情体験の操作」，ヒューマンインタフェース学会論文誌，Vol.17，No.1，2015 (Shigeaki Yoshida et al., "Manipulation of Emotional Experience by Real-time Facial Deformation Feedback," Journal of Human Interface Society, Vol. 17, No. 1, 2015)

Some feature point recognition methods recognize only one side of a facial part. If a facial part recognized by such a method is moved using the Rigid MLS method, only a facial image with an unnatural expression can be obtained.

For example, some feature point recognition methods recognize only the upper side of an eyebrow and not its lower side. If an eyebrow for which only the upper side has been recognized is moved upward by image conversion, the eyebrow becomes thicker in the resulting facial image, and the expression looks unnatural. Likewise, if only one side is recognized for the width of a double eyelid, the shading of the facial contour, and so on, deforming the image produces a similarly unnatural expression.

The present invention aims to provide an image conversion technique that can convert an image into one with a natural facial expression even when only one side of a facial part is recognized.

To solve the above problem, an image conversion device according to one aspect of the present invention includes a control point generation unit and a facial expression conversion unit. Based on first feature points, which are feature points on one side of a facial part recognized from an image of a human face, the control point generation unit adds second feature points, which are feature points on the other, unrecognized side, and sets the first and second feature points as control points. The facial expression conversion unit obtains a converted image, in which the facial expression of the human face has been converted, by deforming the control points by deformation amounts corresponding to the converted facial expression, that is, the expression to convert to.

According to one aspect of the present invention, the image is converted after feature points on the other side of a facial part are added based on the feature points on one side. This makes it possible to provide an image conversion technique that can convert a facial image into one with a natural expression even when only one side of the facial part is recognized.

FIG. 1 is a block diagram showing an example of the configuration of an image conversion device according to an embodiment of the present invention.
FIG. 2 is a diagram showing an example of the hardware configuration of the image conversion device.
FIG. 3 is a diagram showing an example of facial feature points.
FIG. 4 is a diagram showing an example of the storage format of feature points.
FIG. 5 is a diagram showing an example of the storage format of change amounts.
FIG. 6 is a flowchart showing an example of the image conversion processing operation of the image conversion device.
FIG. 7 is a schematic diagram for explaining the relationship between the eyebrow feature points and the feature points above the eyes.
FIG. 8 is a schematic diagram for explaining a method of adding feature points on the lower side of the eyebrows.
FIG. 9 is a schematic diagram for explaining a method of adding eyelid feature points.

[One embodiment]
An embodiment of the present invention will now be described with reference to the drawings.

(Configuration example)
FIG. 1 is a block diagram showing an example of the configuration of an image conversion device 1 according to an embodiment of the present invention. The image conversion device 1 has an image acquisition unit 11, a feature point recognition unit 12, a control point generation unit 13, a converted facial expression input unit 14, a change amount storage unit 15, a facial expression conversion unit 16, and an image output unit 17.

The image acquisition unit 11 acquires a facial image from a web camera, an avatar, or the like. The image acquisition unit 11 outputs the acquired facial image to the feature point recognition unit 12 and the facial expression conversion unit 16.

The feature point recognition unit 12 receives as input the facial image acquired by the image acquisition unit 11 and recognizes feature points in it. The recognition method used by the feature point recognition unit 12 is described later. The feature point recognition unit 12 outputs the recognized feature points to the control point generation unit 13.

The control point generation unit 13 receives as input the first feature points, which are the feature points recognized by the feature point recognition unit 12, and, based on them, adds second feature points, which are feature points that are not recognized. For example, the control point generation unit 13 calculates the distance between the eyebrow feature points and the eye feature points, both of which are first feature points, and adds a second feature point below each eyebrow feature point at half the calculated distance. The method of adding the second feature points is described in detail later. The control point generation unit 13 outputs the first and second feature points to the facial expression conversion unit 16 as control points. Which first feature points the second feature points are derived from, and how many second feature points are added, are determined in advance. The number of control points is therefore also fixed in advance.

The converted facial expression input unit 14 acquires the converted facial expression, that is, the expression to convert to, such as a smile, which the user specifies through a user interface such as a keyboard. The converted facial expression input unit 14 outputs the acquired converted facial expression to the facial expression conversion unit 16.

The change amount storage unit 15 stores in advance, for each facial expression to convert to, the change amount for each control point. The change amount is information indicating how far a control point should be moved. The change amounts can be determined in advance, for example, by a user who applies the facial expression deformation to an expressionless face in a specific facial image and adjusts the amounts until the expression looks natural.

The facial expression conversion unit 16 receives as input the facial image acquired by the image acquisition unit 11, the control points output by the control point generation unit 13, and the converted facial expression acquired by the converted facial expression input unit 14. The facial expression conversion unit 16 also reads from the change amount storage unit 15 the change amounts for the expression indicated by the converted facial expression input from the converted facial expression input unit 14. The facial expression conversion unit 16 then moves each control point in the input facial image by the change amount read for that control point, thereby obtaining a facial image in which the facial expression has been converted. The facial expression conversion unit 16 outputs the converted facial image to the image output unit 17.
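The control point movement described above can be sketched as follows. This is a minimal illustration that only computes the target position of each control point; the image warp that follows is not shown. The function name `move_control_points`, the dictionary layout, and the rule that a control point without a stored change amount stays in place are assumptions of this sketch.

```python
def move_control_points(control_points, change_amounts):
    """Return the target position of every control point.

    control_points: {control_point_id: (x, y)} in pixels.
    change_amounts: {control_point_id: (dx, dy)} in pixels for one expression.
    """
    moved = {}
    for cid, (x, y) in control_points.items():
        # Assumption: a control point with no stored change amount does not move.
        dx, dy = change_amounts.get(cid, (0, 0))
        moved[cid] = (x + dx, y + dy)
    return moved
```

The pairs of original and moved positions are what a deformation method such as Rigid MLS would take as its source and target control points.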

The image output unit 17 receives the converted facial image from the facial expression conversion unit 16 and outputs it. Here, output includes, for example, storing the image in a storage medium, displaying it on a display, and transmitting it to another device via a communication network.

FIG. 2 is a diagram showing an example of the hardware configuration of the image conversion device 1.

The image conversion device 1 is implemented by a computer such as a personal computer, a smartphone, or a server computer. As shown in FIG. 2, the image conversion device 1 has a hardware processor 100 such as a CPU (Central Processing Unit). Using a multi-core, multi-threaded CPU allows multiple information processing tasks to be executed simultaneously. The processor 100 may also include multiple CPUs. In the image conversion device 1, a program memory 200, a data memory 300, a communication interface 400, and an input/output interface (denoted input/output IF in FIG. 2) 500 are connected to the processor 100 via a bus 600.

The communication interface 400 may include, for example, one or more wired or wireless communication modules. The communication interface 400 can communicate with other computers, web cameras, and the like that are connected via a cable or via a network such as a LAN (Local Area Network) or the Internet.

An input unit 700 and a display unit 800 are connected to the input/output interface 500. The input unit 700 includes input devices such as a keyboard and a pointing device such as a mouse, and sensor devices such as a camera. The display unit 800 is a display device such as a liquid crystal display or a CRT (Cathode Ray Tube) display. The input unit 700 and the display unit 800 may also be implemented as a so-called tablet-type input/display device. This type of input/display device is configured by placing an input detection sheet of an electrostatic or pressure type on the display screen of a display device that uses, for example, liquid crystal or organic EL (Electro Luminescence). The input/output interface 500 passes operation information entered at the input unit 700 to the processor 100 and causes the display unit 800 to display the display information generated by the processor 100.

Note that the input unit 700 and the display unit 800 need not be connected to the input/output interface 500. By being provided with a communication unit for connecting to the communication interface 400 directly or via a network, the input unit 700 and the display unit 800 can exchange information with the processor 100.

The input/output interface 500 may also have a read/write function for a recording medium such as a semiconductor memory, for example a flash memory, or a function for connecting to a reader/writer that has such a read/write function. Furthermore, the input/output interface 500 may have a function for connecting to other devices.

The program memory 200 is a non-transitory tangible computer-readable storage medium that combines a non-volatile memory that can be written to and read from at any time with a non-volatile memory that can only be read from. Examples of the former include an HDD (Hard Disk Drive) and an SSD (Solid State Drive). An example of the latter is a ROM. The program memory 200 stores the programs that the processor 100 needs in order to execute the various control processes according to the embodiment, for example an image conversion program. That is, the processing functions of the image acquisition unit 11, feature point recognition unit 12, control point generation unit 13, converted facial expression input unit 14, change amount storage unit 15, facial expression conversion unit 16, and image output unit 17 described above can all be realized by having the processor 100 read and execute the image conversion program stored in the program memory 200. Some or all of these processing functions may instead be realized in various other forms, including integrated circuits such as an application specific integrated circuit (ASIC) or a field-programmable gate array (FPGA).

The data memory 300 is a tangible computer-readable storage medium that combines, for example, the non-volatile memory described above with a volatile memory such as a RAM (Random Access Memory). The data memory 300 is used to store the various data acquired and created while the various processes are performed. That is, areas for storing such data are secured in the data memory 300 as needed during processing. As such areas, the data memory 300 can be provided with, for example, an acquired image storage unit 301, a feature point storage unit 302, a control point storage unit 303, a converted facial expression designation storage unit 304, a change amount storage unit 305, a converted image storage unit 306, and a temporary storage unit 307.

The acquired image storage unit 301 is used to store the facial image acquired when the processor 100 operates as the image acquisition unit 11 described above.

The feature point storage unit 302 is used to store the feature points acquired when the processor 100 operates as the feature point recognition unit 12 described above.

FIG. 3 is a diagram showing an example of facial feature points. The stars in FIG. 3 are the feature points recognized by the processor 100, and the number next to each feature point is a unique feature point ID that identifies it. The number of feature point IDs and the part of the face to which each feature point ID corresponds are determined by the feature point recognition method employed. For example, the feature point with feature point ID "18" is fixed in advance as the left end of the eyebrow on the viewer's left.

FIG. 4 is a diagram showing an example of the storage format of feature points in the feature point storage unit 302. As shown in FIG. 4, the feature point storage unit 302 stores, in table form, the x and y coordinates in the facial image in association with each feature point ID. The coordinate values are in pixels. In the example of FIG. 3, the feature point storage unit 302 therefore stores the x and y coordinates of the feature points with feature point IDs "1" to "68".

The control point storage unit 303 is used to store the control points generated when the processor 100 operates as the control point generation unit 13 described above. The storage format of control points in the control point storage unit 303 is, for example, the same as the storage format of feature points in the feature point storage unit 302 shown in FIG. 4. That is, the control point storage unit 303 can store, in table form, the x and y coordinates in the facial image in association with each control point ID. The feature point IDs "1" to "68" assigned to the feature points shown in FIG. 3 are used unchanged as control point IDs "1" to "68", and the control point storage unit 303 stores the x and y coordinates of feature point IDs "1" to "68" under those IDs. In addition, the processor 100 stores the x and y coordinates of each second feature point, that is, each added feature point, in association with control point IDs from "69" onward.
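The ID scheme described above (recognized feature points keep their IDs, added points continue from the next free ID) can be sketched with a plain dictionary. The function name `build_control_table` and the use of a dictionary in place of the table of FIG. 4 are assumptions made for illustration.

```python
def build_control_table(first_points, second_points):
    """Build {control_point_id: (x, y)}.

    first_points: {feature_point_id: (x, y)} for the recognized points
                  (IDs 1..68 in the example of FIG. 3); IDs are kept unchanged.
    second_points: list of (x, y) for the added points, in the order in
                   which they should receive the following IDs.
    """
    table = dict(first_points)
    # Added points continue right after the highest recognized ID (69 onward).
    next_id = max(table) + 1 if table else 1
    for offset, xy in enumerate(second_points):
        table[next_id + offset] = xy
    return table
```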

The converted facial expression designation storage unit 304 is used to store the converted facial expression specified by the user, which is acquired when the processor 100 operates as the converted facial expression input unit 14 described above.

The change amount storage unit 305 corresponds to the change amount storage unit 15 described above.

FIG. 5 is a diagram showing an example of the storage format of change amounts in the change amount storage unit 305. As shown in FIG. 5, the change amount storage unit 305 can take the form of a table that stores, for each converted facial expression, the change amount of the x coordinate and the change amount of the y coordinate in association with each control point ID. The change amount values are in pixels. A change amount is expressed by the direction and distance in which the control point moves. For example, a change amount of "+1" means a movement of one pixel in the positive direction.
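A table of this kind could be held in memory as a nested dictionary keyed first by expression and then by control point ID. The sketch below is illustrative only: the expression names, the control point IDs chosen, and every numeric value are hypothetical and do not come from FIG. 5.

```python
# Hypothetical change amount table: expression -> control point ID -> (dx, dy) in pixels.
CHANGE_AMOUNTS = {
    "smile": {49: (-2, -1), 55: (2, -1)},
    "surprise": {18: (0, -3), 69: (0, -3)},
}


def lookup_change_amounts(expression, table=CHANGE_AMOUNTS):
    """Return the {control_point_id: (dx, dy)} entry stored for one expression."""
    if expression not in table:
        raise KeyError(f"no change amounts stored for expression {expression!r}")
    return table[expression]
```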

The converted image storage unit 306 is used to store the facial image converted when the processor 100 operates as the facial expression conversion unit 16 described above.

The temporary storage unit 307 is used to store various intermediate data that arise while the processor 100 is operating and that are not stored in the acquired image storage unit 301, the feature point storage unit 302, the control point storage unit 303, the converted facial expression designation storage unit 304, the change amount storage unit 305, or the converted image storage unit 306.

(Operation)
Next, the operation of the image conversion device 1 will be described.

FIG. 6 is a flowchart showing an example of the image conversion processing operation of the image conversion device 1. The processor 100 of the image conversion device 1 starts the operation shown in this flowchart by reading and executing the image conversion program stored in the program memory 200. Execution of the image conversion program by the processor 100 starts when an instruction to perform image conversion is received from the input unit 700 via the input/output interface 500 or via the communication interface 400.

The processor 100 operates as the converted facial expression input unit 14 and waits for the user to specify the converted facial expression, that is, the expression to convert to, such as a smile (step S1). For example, the processor 100 determines whether an input signal received from the input unit 700 via the input/output interface 500 or the communication interface 400 contains a specification of the converted facial expression. If it does, the processor 100 proceeds to step S2.

The processor 100 stores the specified converted facial expression in the converted facial expression designation storage unit 304 of the data memory 300 (step S2).

The processor 100 operates as the image acquisition unit 11 and acquires a facial image (step S3). For example, the processor 100 acquires, via the input/output interface 500, an image of the subject's face captured by the camera of the input unit 700. Alternatively, the processor 100 acquires, via the communication interface 400, a facial image captured by a web camera connected to the network or the face of an avatar generated by another computer. The processor 100 stores the acquired facial image in the acquired image storage unit 301 of the data memory 300.

The processor 100 operates as the feature point recognition unit 12 and recognizes the first feature points in the facial image stored in the acquired image storage unit 301 (step S4). The processor 100 recognizes the feature points in the facial image using, for example, the face_landmark_detection function of dlib (see, for example, http://dlib.net/face_landmark_detection.py.html). Specifically, the processor 100 extracts from the input facial image the distribution of luminance gradient directions known as HOG (Histogram of Oriented Gradients) features. Models trained on data that associates HOG features with the positions of facial feature points are publicly available. The processor 100 therefore inputs the extracted HOG features into such a trained model and obtains the positions of the facial feature points. The processor 100 stores the positions of the acquired first feature points in the feature point storage unit 302 of the data memory 300.
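Step S4 might look roughly like the following when written against dlib's Python bindings, in the style of the example script referenced above. This is a sketch under assumptions: the function names, the choice to use only the first detected face, and the conversion to 1-based IDs (dlib numbers the parts from 0, whereas FIG. 3 uses "1" to "68") are illustrative, and a trained 68-point landmark model file must be supplied by the caller through `model_path`.

```python
def shape_to_feature_table(shape):
    """Convert a dlib landmark result into {feature_point_id: (x, y)}.

    dlib indexes parts from 0; the IDs used in this document start at 1.
    """
    return {i + 1: (shape.part(i).x, shape.part(i).y)
            for i in range(shape.num_parts)}


def detect_first_feature_points(image_path, model_path):
    """Detect one face in the image and return its landmark table."""
    import dlib  # imported lazily so the helper above works without dlib

    detector = dlib.get_frontal_face_detector()
    predictor = dlib.shape_predictor(model_path)
    image = dlib.load_rgb_image(image_path)
    faces = detector(image, 1)  # upsample once to find smaller faces
    if len(faces) == 0:
        return {}
    # Assumption: only the first detected face is processed.
    return shape_to_feature_table(predictor(image, faces[0]))
```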

The processor 100 operates as the control point generation unit 13 and generates control points (step S5). Specifically, the processor 100 stores the recognized first feature points as control points in the control point storage unit 303 of the data memory 300. The processor 100 further adds second feature points, which are the feature points of the other side, to each facial part for which feature points have been recognized on only one side. The processor 100 then stores these added second feature points in the control point storage unit 303 as additional control points.

Facial parts for which feature points are recognized on only one side include, for example, the eyebrows, the eyelids, and the contour. For an eyebrow, only the feature points on the upper side are recognized, so the processor 100 adds feature points on the lower side of the eyebrow. For an eyelid, of the two lines of a double eyelid, only the lower one is recognized, namely as the feature points on the upper side of the eye, so the processor 100 adds feature points for the upper line. For the contour, the unshaded part is recognized as feature points, so the processor 100 adds feature points for the shaded part.

For example, the processor 100 adds eyebrow feature points as follows.
FIG. 7 is a schematic diagram for explaining the relationship between the eyebrow feature points and the feature points above the eyes. The processor 100 drops a perpendicular from each eyebrow feature point (the feature points with feature point IDs "18" to "22") and, among the feature points above the eye (the feature points with feature point IDs "37" to "40"), obtains the one closest to the perpendicular. That is, the processor 100 obtains the feature point with feature point ID "37" for the feature point with feature point ID "18", "37" for "19", "38" for "20", "39" for "21", "40" for "22", ..., and "46" for "27". At this time, d18, d19, d20, d21, d22, ..., d27 are obtained as the distances (differences in the vertical coordinate) between the paired feature points.
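The pairing step can be sketched as follows, treating "closest to the perpendicular" as the smallest horizontal offset from the vertical line through the eyebrow point. The function name `eyebrow_eye_distances` and the dictionary layout are assumptions made for illustration.

```python
def eyebrow_eye_distances(points, brow_ids, eye_top_ids):
    """Pair each eyebrow point with the eye-top point nearest its vertical line.

    points: {feature_point_id: (x, y)} in pixels.
    Returns {brow_id: (eye_id, distance)}, where distance is the difference
    in the vertical coordinate between the two points.
    """
    result = {}
    for b in brow_ids:
        bx, by = points[b]
        # Nearest to the vertical line through (bx, by) = smallest |x - bx|.
        nearest = min(eye_top_ids, key=lambda e: abs(points[e][0] - bx))
        result[b] = (nearest, abs(points[nearest][1] - by))
    return result
```

For the eyebrow on the viewer's left, `brow_ids` would be 18 to 22 and `eye_top_ids` 37 to 40, as in the text.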

FIG. 8 is a schematic diagram for explaining a method of adding the lower-side feature points, which are the second feature points of the eyebrows. The processor 100 calculates the average distance da of the distances d18 to d27 between the feature points described above. The average distance da may be calculated without distinguishing between the right eye and the left eye, or, since the left and right eyes generally differ slightly, it may be calculated separately for each. Here, the average distance is calculated separately for left and right. That is, the processor 100 calculates the average distance da as da = (d18 + d19 + d20 + d21 + d22)/5 for the eyebrow on the viewer's left, and as da = (d23 + d24 + d25 + d26 + d27)/5 for the eyebrow on the viewer's right.
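Written out per side, the two averages are as follows. The superscripts L and R (viewer's left and right) are notation introduced here for clarity, and the numbers in the second line are hypothetical pixel values chosen only to show the arithmetic.

```latex
d_a^{\mathrm{L}} = \frac{1}{5}\sum_{i=18}^{22} d_i ,
\qquad
d_a^{\mathrm{R}} = \frac{1}{5}\sum_{i=23}^{27} d_i

% Hypothetical example: (d_{18}, \dots, d_{22}) = (20, 20, 16, 16, 20)
d_a^{\mathrm{L}} = \frac{20 + 20 + 16 + 16 + 20}{5} = \frac{92}{5} = 18.4
```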

The processor 100 takes half of the average distance da calculated in this way, that is, da/2, as the feature point addition distance d, and adds a second feature point at the distance d below each first feature point of the eyebrows. That is, the processor 100 adds a second feature point with feature point ID "69" at the feature point addition distance d below the first feature point with feature point ID "18". In the same way, the processor 100 adds a second feature point with feature point ID "70" below the first feature point with feature point ID "19", a second feature point with feature point ID "71" below the first feature point with feature point ID "20", ..., and a second feature point with feature point ID "78" below the first feature point with feature point ID "27".
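A minimal sketch of this step for one eyebrow follows. It assumes pixel coordinates in which y increases downward, so "below" means adding d to the y coordinate; the source does not state the axis direction. The function name and arguments are likewise assumptions for illustration.

```python
def add_lower_eyebrow_points(points, brow_ids, distances, first_new_id):
    """Add one point d = da/2 below each eyebrow point of one eyebrow.

    points: {feature_point_id: (x, y)} in pixels, y growing downward (assumed).
    distances: {brow_id: eyebrow-to-eye distance} for the same eyebrow.
    first_new_id: ID given to the first added point (69 for the viewer's
                  left eyebrow, 74 for the right, following the text).
    """
    da = sum(distances[b] for b in brow_ids) / len(brow_ids)  # average distance
    d = da / 2  # feature point addition distance
    added = {}
    for offset, b in enumerate(brow_ids):
        x, y = points[b]
        added[first_new_id + offset] = (x, y + d)
    return added
```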

The processor 100 also adds eyelid feature points, for example, in the following manner.
FIG. 9 is a schematic diagram for explaining a method for adding the second feature points of an eyelid. The recognized feature points on one side of the eyelid are the feature points on the upper side of the eye. Therefore, the average distance da used when adding the second feature points of the eyebrow can also be used when adding the second feature points of the eyelid. The processor 100 sets the feature point addition distance d to 1/4 of this average distance da, which is calculated separately for the left and right sides, that is, da/4. The processor 100 adds a second feature point at the feature point addition distance d above each of the feature points on the upper side of the eye (the first feature points with feature point IDs "37" to "40"). That is, the processor 100 adds a second feature point with feature point ID "79" at the feature point addition distance d above the first feature point with feature point ID "37". Similarly, the processor 100 adds a second feature point with feature point ID "80" above the first feature point with feature point ID "38", a second feature point with feature point ID "81" above the first feature point with feature point ID "39", ..., and a second feature point with feature point ID "86" above the first feature point with feature point ID "46".
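The eyelid step can be sketched in the same spirit. The function name, default arguments, and sample coordinates below are hypothetical; the sketch again assumes that y increases downward, so "above" means a smaller y value, and it takes the average distance da as an input that has already been computed for the same side of the face.

```python
def add_eyelid_points(upper_eye_points, da, first_new_id=79, ratio=0.25):
    """Add one second feature point at d = ratio * da above each upper-eye point.

    upper_eye_points maps feature point ID -> (x, y).
    Assumes y grows downward, so moving up subtracts from y.
    """
    d = ratio * da
    added = {}
    for offset, pid in enumerate(sorted(upper_eye_points)):
        x, y = upper_eye_points[pid]
        added[first_new_id + offset] = (x, y - d)
    return added


# Hypothetical upper-eye points for one eye (IDs 37 to 40).
left_upper_eye = {
    37: (105.0, 100.0),
    38: (112.0, 96.0),
    39: (120.0, 96.0),
    40: (127.0, 100.0),
}

eyelid_left = add_eyelid_points(left_upper_eye, da=22.0)
```

Here d is 22.0 / 4 = 5.5, so the point added for ID 37 receives ID 79 at (105.0, 94.5). Which first feature point IDs belong to the other eye, and hence which new IDs they receive, is left to the caller through `first_new_id`.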

Note that, because the size of the contour shading does not vary much between individuals, the processor 100 adds the second feature points of the contour shading, for example, as follows. For each contour feature point (the first feature points with feature point IDs "1" to "17"), the processor 100 adds a second feature point at a position in a predetermined direction and at a predetermined distance.

The processor 100 operates as the facial expression conversion unit 16 to convert the facial expression of the facial image stored in the acquired image storage unit 301 (step S6). That is, the processor 100 converts the facial image based on the control points stored in the control point storage unit 303 and on the change amounts that are stored in the change amount storage unit 305 and correspond to the conversion facial expression stored in the conversion facial expression designation storage unit 304. For example, the processor 100 uses an implementation of MLS (see, for example, https://github.com/Jarvis73/Moving-Least-Squares) or the like.
Specifically, the processor 100 moves each control point by the change amount corresponding to the conversion facial expression stored in the conversion facial expression designation storage unit 304. For example, when the facial expression is changed to a smile, the xy coordinates of the control point with control point ID "1" before conversion are (23, 45) (see FIG. 4), so the processor 100 adds "+1" to the x coordinate and "+2" to the y coordinate (see FIG. 5), thereby performing a conversion that moves the pixel at that control point to (24, 47).
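The control point movement in this example is simple coordinate addition, which the following minimal sketch reproduces. The function name and the dictionary layout are hypothetical, and treating a control point with no stored change amount as unmoved is an assumption of this sketch rather than something the description states.

```python
def move_control_points(control_points, changes):
    """Return the control points after adding each point's change amount.

    control_points maps control point ID -> (x, y) before conversion.
    changes maps control point ID -> (dx, dy) for one conversion expression.
    Assumption: an ID missing from changes is left where it is.
    """
    moved = {}
    for cid, (x, y) in control_points.items():
        dx, dy = changes.get(cid, (0, 0))
        moved[cid] = (x + dx, y + dy)
    return moved


control_points = {1: (23, 45)}   # coordinates before conversion (FIG. 4)
smile_changes = {1: (1, 2)}      # change amount for a smile (FIG. 5)

moved = move_control_points(control_points, smile_changes)
print(moved[1])  # prints (24, 47)
```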

For points other than the control points, the processor 100 applies the following affine transformation (which includes the Helmert transformation, that is, the similarity transformation, and rigid deformation):
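The equation is not reproduced in this text. A general two-dimensional affine transformation that is consistent with the symbols a, b, c, d, t_x, and t_y used in the surrounding description would take the form below; this is a reconstruction from those symbol names, not the patent's own equation.

```latex
\begin{pmatrix} x' \\ y' \end{pmatrix}
=
\begin{pmatrix} a & b \\ c & d \end{pmatrix}
\begin{pmatrix} x \\ y \end{pmatrix}
+
\begin{pmatrix} t_x \\ t_y \end{pmatrix}
```

In this form, a similarity (Helmert) transformation is the special case a = d and b = -c, and a rigid transformation additionally requires a^2 + b^2 = 1.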

Here, x and y are the coordinates of a nearby control point, x' and y' are the coordinates obtained by adding the change amount to the coordinates of that control point, a, b, c, and d are parameters, and t_x and t_y are translation parameters. The processor 100 computes the mean squared error between the coordinates x', y' (the control point coordinates plus the change amounts) and the result of transforming the control point coordinates x, y, and obtains the parameters a, b, c, d, t_x, and t_y that minimize it by global optimization. Then, taking x and y to be the coordinates of the target point to be transformed, the processor 100 obtains the transformed coordinates using the parameters thus obtained. The processor 100 likewise uses the parameters a, b, c, d, t_x, and t_y obtained in this way to obtain the transformed coordinates from the added control points by the above affine transformation.
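The parameter fit described above can be sketched as an ordinary linear least-squares problem. The sketch below is an assumption-laden illustration, not the method of the cited MLS implementation: it solves the normal equations directly in pure Python instead of using a global optimizer, all function names are hypothetical, and the optional `weights` argument is a hypothetical hook for the per-control-point weighting that moving least squares methods typically apply for each target point.

```python
def solve3(m, r):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    a = [row[:] + [rv] for row, rv in zip(m, r)]
    n = 3
    for col in range(n):
        pivot = max(range(col, n), key=lambda i: abs(a[i][col]))
        if abs(a[pivot][col]) < 1e-12:
            raise ValueError("control points are collinear or too few")
        a[col], a[pivot] = a[pivot], a[col]
        for row in range(col + 1, n):
            f = a[row][col] / a[col][col]
            for k in range(col, n + 1):
                a[row][k] -= f * a[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = a[i][n] - sum(a[i][k] * x[k] for k in range(i + 1, n))
        x[i] = s / a[i][i]
    return x


def fit_affine(src, dst, weights=None):
    """Least-squares fit of x' = a*x + b*y + t_x and y' = c*x + d*y + t_y."""
    if weights is None:
        weights = [1.0] * len(src)
    m = [[0.0] * 3 for _ in range(3)]
    rx = [0.0] * 3
    ry = [0.0] * 3
    for (x, y), (xp, yp), w in zip(src, dst, weights):
        row = (x, y, 1.0)
        for i in range(3):
            for j in range(3):
                m[i][j] += w * row[i] * row[j]
            rx[i] += w * row[i] * xp
            ry[i] += w * row[i] * yp
    a, b, tx = solve3(m, rx)
    c, d, ty = solve3(m, ry)
    return a, b, c, d, tx, ty


def apply_affine(params, point):
    """Transform one target point with the fitted parameters."""
    a, b, c, d, tx, ty = params
    x, y = point
    return (a * x + b * y + tx, c * x + d * y + ty)


# Hypothetical control points, all moved by (+1, +2).
src = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0)]
dst = [(x + 1.0, y + 2.0) for x, y in src]

params = fit_affine(src, dst)
target = apply_affine(params, (5.0, 5.0))
```

Because every control point in this example moves by the same offset, the fit recovers an identity matrix with a translation of (1, 2), and the target point (5, 5) maps to approximately (6, 7). At least three non-collinear control points are needed for the system to be solvable.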

The processor 100 stores the facial image thus converted in the converted image storage unit 306 of the data memory 300 as a converted image.

The processor 100 operates as the image output unit 17 and outputs the converted image stored in the converted image storage unit 306 (step S7). For example, the processor 100 causes the display unit 800 to display the facial image via the input/output interface 500. Alternatively, the processor 100 transmits the facial image over the network via the communication interface 400 and causes it to be displayed on a display device connected to the network or on the display unit of another computer connected to the network.

The processor 100 determines whether or not to end the operation as the image conversion device 1 shown in this flowchart (step S8). For example, the processor 100 checks whether the user has instructed, from the input unit 700 via the input/output interface 500 or via the communication interface 400, that image conversion be ended. If the operation is to end, the processor 100 ends the operation shown in this flowchart.

On the other hand, if the operation is not yet to end, the processor 100 operates as the conversion facial expression input unit 14 and determines whether or not the user has input a designation to change the conversion facial expression (step S9). If there has been no such input, the processor 100 proceeds to the processing of step S3. If there has been such an input, the processor 100 proceeds to the processing of step S2.

The image conversion device 1 according to the embodiment described above includes a control point generation unit 13 and a facial expression conversion unit 16. Based on first feature points, which are feature points on one side of a facial part recognized from an image of a person's face, the control point generation unit 13 adds second feature points, which are feature points on the other, unrecognized side, and sets the first and second feature points as control points. The facial expression conversion unit 16 obtains a converted image in which the facial expression of the person is converted by deforming the control points by a deformation amount corresponding to the conversion facial expression to be converted.
Therefore, because the image conversion device 1 according to the embodiment converts an image after adding feature points on the other side of a facial part based on the feature points on one side, it can provide an image conversion technology that makes it possible to convert a facial image into one with a natural expression even when only one side of a facial part is recognized.

Furthermore, in the image conversion device 1 according to the embodiment, the facial parts include at least the eyebrows or the eyelids.
In this way, even if only the feature points on the upper side of the eyebrows and/or the feature points on the lower side of the eyelids can be recognized, it is possible to add feature points on the lower side of the eyebrows and/or feature points on the upper side of the eyelids and convert the facial image into one with a natural expression.

Here, the control point generation unit 13 calculates a feature point addition distance d based on the distance between the feature point on the upper side of the eyebrow, which is the first feature point, and the feature point on the upper side of the eye, and adds, as the second feature point, a point on the lower side of the eyebrow that is the feature point addition distance d away from the first feature point on the upper side of the eyebrow.
Therefore, the unrecognized second feature point on the lower side of the eyebrow can be easily added.

Alternatively, the control point generation unit 13 calculates the feature point addition distance d based on the distance between the feature point on the upper side of the eyebrow and the feature point on the lower side of the eyelid, which is the first feature point, and adds, as the second feature point, a point on the upper side of the eyelid that is the feature point addition distance d away from the first feature point on the lower side of the eyelid.
Therefore, the unrecognized second feature point on the upper side of the eyelid can be easily added.

Moreover, the image conversion device 1 according to the embodiment includes a change amount storage unit 15 that stores in advance, for each conversion facial expression to be converted, a change amount representing the deformation amount for each control point, and a conversion facial expression input unit 14 that inputs the conversion facial expression to be converted. The facial expression conversion unit 16 reads the change amounts corresponding to the input conversion facial expression from the change amount storage unit 15 and obtains the converted image using the change amounts read.
In this way, by also storing change amounts in advance for the control points corresponding to the added second feature points, those control points can also be used to convert the facial image into one with a natural expression.
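One possible way to organize such pre-stored change amounts is a table keyed first by conversion facial expression and then by control point ID. The sketch below is hypothetical throughout: the table layout, the function name, and every value except the (+1, +2) change for control point ID 1 in the smile example are invented for illustration. It includes a check that the added control points have stored change amounts, since the description requires those to be stored in advance as well.

```python
# Hypothetical change amount table: expression -> control point ID -> (dx, dy).
# Only the entry for ID 1 under "smile" follows the example in the description.
CHANGE_TABLE = {
    "smile": {1: (1, 2), 69: (0, -1), 79: (0, 1)},
}


def changes_for(expression, control_ids, table=CHANGE_TABLE):
    """Read the change amounts for one expression, failing loudly on gaps."""
    per_point = table[expression]
    missing = sorted(set(control_ids) - set(per_point))
    if missing:
        raise KeyError(f"no change amount stored for control points {missing}")
    return {cid: per_point[cid] for cid in control_ids}


selected = changes_for("smile", [1, 69])
```

Raising an error for a missing entry is a design choice of this sketch; it surfaces the case where a second feature point was added as a control point but no change amount was prepared for it.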

[Other embodiments]
Note that the present invention is not limited to the embodiment described above.
For example, the flow of each process described above is not limited to the described procedure; the order of some steps may be changed, and some steps may be performed in parallel.

In addition, the process flow described above converts, in real time, the facial expression in facial images acquired in real time, but it can equally be applied to converting the facial expression in stored facial images without real-time processing.

When the second feature points are added, fixed ratios of 1/2 or 1/4 of the average distance da are used, but the user may instead be allowed to specify an arbitrary value.

The user may also be allowed to select the facial parts for which second feature points, that is, feature points on the other side, are added.

The method described in the embodiment can also be stored, as a program (software means) executable by a computer, on a recording medium such as a magnetic disk (floppy (registered trademark) disk, hard disk, etc.), an optical disk (CD-ROM, DVD, MO, etc.), or a semiconductor memory (ROM, RAM, flash memory, etc.), or can be distributed by transmission over a communication medium. The program stored on the medium also includes a setting program that configures, within the computer, the software means (including not only an execution program but also tables and data structures) to be executed by the computer. The computer that realizes this device reads the program recorded on the recording medium, in some cases constructs the software means using the setting program, and executes the above-described processing with its operation controlled by this software means. Note that the recording medium referred to in this specification is not limited to media for distribution, and includes storage media such as magnetic disks and semiconductor memories provided inside the computer or in devices connected via a network.

In short, this invention is not limited to the above-described embodiments, and can be modified in various ways at the implementation stage without departing from its gist. The embodiments may also be combined as appropriate wherever possible, in which case the combined effects are obtained. Furthermore, the above-described embodiments include inventions at various stages, and various inventions can be extracted by appropriately combining the disclosed constituent elements.

1 ... Image conversion device
11 ... Image acquisition unit
12 ... Feature point recognition unit
13 ... Control point generation unit
14 ... Conversion facial expression input unit
15 ... Change amount storage unit
16 ... Facial expression conversion unit
17 ... Image output unit
100 ... Processor
200 ... Program memory
300 ... Data memory
301 ... Acquired image storage unit
302 ... Feature point storage unit
303 ... Control point storage unit
304 ... Conversion facial expression designation storage unit
305 ... Change amount storage unit
306 ... Converted image storage unit
307 ... Temporary storage unit
400 ... Communication interface
500 ... Input/output interface
600 ... Bus
700 ... Input unit
800 ... Display unit

Claims

1. An image conversion device comprising:
a control point generation unit that, based on a first feature point that is a feature point on one side of a facial part recognized from an image of a person's face, adds a second feature point that is an unrecognized feature point on the other side of the facial part, and sets the first and second feature points as control points; and
a facial expression conversion unit that obtains a converted image in which the facial expression of the person is converted by deforming the control points by a deformation amount corresponding to a conversion facial expression to be converted,
wherein the facial part includes at least an eyebrow or an eyelid, and
the control point generation unit
calculates a feature point addition distance based on a distance between a feature point on the upper side of the eyebrow, which is the first feature point, and a feature point on the upper side of the eye, and
adds, as the second feature point, a point on the lower side of the eyebrow that is the feature point addition distance away from the feature point on the upper side of the eyebrow, which is the first feature point.

2. An image conversion device comprising:
a control point generation unit that, based on a first feature point that is a feature point on one side of a facial part recognized from an image of a person's face, adds a second feature point that is an unrecognized feature point on the other side of the facial part, and sets the first and second feature points as control points; and
a facial expression conversion unit that obtains a converted image in which the facial expression of the person is converted by deforming the control points by a deformation amount corresponding to a conversion facial expression to be converted,
wherein the facial part includes at least an eyebrow or an eyelid, and
the control point generation unit
calculates a feature point addition distance based on a distance between a feature point on the upper side of the eyebrow and a feature point on the lower side of the eyelid, which is the first feature point, and
adds, as the second feature point, a point on the upper side of the eyelid that is the feature point addition distance away from the feature point on the lower side of the eyelid, which is the first feature point.

3. The image conversion device according to claim 1 or 2, further comprising:
a change amount storage unit that stores in advance, for each conversion facial expression to be converted, a change amount representing the deformation amount for each of the control points; and
a conversion facial expression input unit that inputs the conversion facial expression to be converted,
wherein the facial expression conversion unit reads the change amount corresponding to the input conversion facial expression from the change amount storage unit and obtains the converted image using the change amount read.

An image conversion method for an image conversion device that has a processor and converts the facial expression in an image of a person's face, the method comprising:
adding, by the processor, based on a first feature point that is a feature point on one side of a facial part recognized from the image of the person's face, a second feature point that is an unrecognized feature point on the other side of the facial part, and setting the first and second feature points as control points,
wherein the facial part includes at least an eyebrow or an eyelid, and the adding includes
calculating a feature point addition distance based on a distance between a feature point on the upper side of the eyebrow, which is the first feature point, and a feature point on the upper side of the eye, and
adding, as the second feature point, a point on the lower side of the eyebrow that is the feature point addition distance away from the feature point on the upper side of the eyebrow, which is the first feature point; and
obtaining, by the processor, a converted image in which the facial expression of the person is converted by deforming the control points by a deformation amount corresponding to a conversion facial expression to be converted.

An image conversion method for an image conversion device that has a processor and converts the facial expression in an image of a person's face, the method comprising:
adding, by the processor, based on a first feature point that is a feature point on one side of a facial part recognized from the image of the person's face, a second feature point that is an unrecognized feature point on the other side of the facial part, and setting the first and second feature points as control points,
wherein the facial part includes at least an eyebrow or an eyelid, and the adding includes
calculating a feature point addition distance based on a distance between a feature point on the upper side of the eyebrow and a feature point on the lower side of the eyelid, which is the first feature point, and
adding, as the second feature point, a point on the upper side of the eyelid that is the feature point addition distance away from the feature point on the lower side of the eyelid, which is the first feature point; and
obtaining, by the processor, a converted image in which the facial expression of the person is converted by deforming the control points by a deformation amount corresponding to a conversion facial expression to be converted.

4. An image conversion program for causing a processor to function as each of the units of the image conversion device according to claim 1.