JP7485982B2

JP7485982B2 - Processing device, playback system, processing method, and processing program

Info

Publication number: JP7485982B2
Application number: JP2022529167A
Authority: JP
Inventors: 健太今泉; 公孝堤; 隆佐藤
Original assignee: Nippon Telegraph and Telephone Corp; NTT Inc USA
Current assignee: NTT Inc; NTT Inc USA
Priority date: 2020-06-02
Filing date: 2020-06-02
Publication date: 2024-05-17
Anticipated expiration: 2040-06-02
Also published as: JPWO2021245775A1; WO2021245775A1

Description

The present invention relates to a processing device, a playback system, a processing method, and a processing program.

In recent years, area playback technology, which presents sound only in a specific area, has become widespread in public spaces such as train station platforms, art galleries, and museums. Its application to small spaces is also being considered, for example delivering sound only to the people watching television at home, or presenting navigation audio only to the driver of a car while presenting music to the other passengers.

For such area playback, methods that control the sound field with ultrasonic speakers having sharp directivity, or with a speaker array composed of multiple ordinary speakers, are being studied in particular. In general, an ultrasonic speaker demodulates ultrasonic waves into audible sound, and the distortion that occurs during demodulation degrades the sound quality, making high-frequency reproduction especially difficult. Because various kinds of content, such as music, are to be played, area playback capable of high-quality sound is required.

For example, there is an area playback method that uses a speaker array to present sound only in an arbitrary specific area while preventing the sound from leaking into other areas (see Patent Document 1).

Directivity control and acoustic contrast maximization (ACM) are sometimes used to realize area playback with a speaker array. With ACM, area playback has been studied in which the speaker array maximizes the acoustic contrast between an arbitrarily set area where sound is presented (bright zone) and an area where sound is suppressed (dark zone) (see Non-Patent Document 1). In Non-Patent Document 1, the regularization parameter, which is the weight of the penalty term, is set experimentally to the same value for all frequencies, based on the reproduction accuracy of the desired sound field and the magnitude of the filter gain at each frequency.

There is also a method that designs a filter by adding a penalty term to the objective function from which the filter is derived (see Non-Patent Document 2). In Non-Patent Document 2, the sum of squares of the filter coefficients is used as the penalty term in order to suppress the filter gain.

Patent Document 1: JP 2013-110495 A

Non-Patent Document 1: J. W. Choi, Y. H. Kim, “Generation of an acoustically bright zone with an illuminated region using multiple sources,” The Journal of the Acoustical Society of America 111 (4) (2001) 1695
Non-Patent Document 2: M. M. Boone, W. H. Cho, and J. G. Ih, “Design of a highly directional endfire loudspeaker array,” Journal of the Audio Engineering Society 57.5 (2009): 309-325

When a filter is designed by ACM or directivity control for an arbitrary frequency band, the filter can be designed, but its gain in the low-frequency range may become large. If the drive signal obtained with the designed filter gain exceeds the allowable vibration of the speaker, the output sound breaks up.

The present invention has been made in view of the above circumstances, and its object is to provide a technology that achieves a target directional characteristic while suppressing the gain applied to each of multiple speakers.

A processing device according to one aspect of the present invention includes: a reward function acquisition unit that acquires a reward function from a target directional characteristic; a constraint condition acquisition unit that acquires, as a constraint condition, a range of filter coefficients for generating drive signals that the multiple speakers realizing the target directional characteristic can reproduce; a calculation unit that calculates, for each of the multiple speakers, a filter that satisfies the constraint condition and maximizes the reward function; and a convolution calculation unit that convolves an input acoustic signal with the calculated filter to generate a drive signal to be input to each speaker.

A playback system according to one aspect of the present invention includes the above processing device and multiple speakers that play back the drive signals calculated by the convolution calculation unit.

A processing method according to one aspect of the present invention includes: a step in which a computer acquires a reward function from a target directional characteristic; a step in which the computer acquires, as a constraint condition, a range of filter coefficients for generating drive signals that the multiple speakers realizing the target directional characteristic can reproduce; a step in which the computer calculates, for each of the multiple speakers, a filter that satisfies the constraint condition and maximizes the reward function; and a step in which the computer convolves an input acoustic signal with the calculated filter to generate a drive signal to be input to each speaker.

One aspect of the present invention is a processing program that causes a computer to function as the above processing device.

According to the present invention, it is possible to provide a technology that achieves a target directional characteristic while suppressing the gain applied to each of multiple speakers.

FIG. 1 is a diagram illustrating the system configuration of a playback system according to an embodiment of the present invention and the functional blocks of a processing device.
FIG. 2 is a flowchart illustrating the processing of the processing device.
FIG. 3 is a diagram illustrating the observation system used when determining a directivity control filter.
FIG. 4 is a diagram illustrating the observation system used when determining an acoustic contrast maximization filter.
FIG. 5 is a diagram illustrating the hardware configuration of a computer used as the processing device.

Hereinafter, an embodiment of the present invention will be described with reference to the drawings. In the drawings, identical parts are given the same reference numerals, and their description is omitted.

(Playback System)
A playback system 3 that uses a processing device 1 according to an embodiment of the present invention will be described with reference to FIG. 1. The playback system 3 includes the processing device 1 and a speaker array 2.

The processing device 1 is a computer that converts an input acoustic signal into a drive signal for each speaker. The processing device 1 generates a filter for each speaker from the directional characteristic of the desired sound field and the specifications of the speakers that make up the speaker array 2. The processing device 1 then convolves the input acoustic signal with the filter generated for each speaker to obtain the drive signal for that speaker.

The speaker array 2 is formed by multiple speakers. Each speaker is installed in the venue where the playback system 3 reproduces the sound field, and each speaker plays back its own drive signal output from the processing device 1. Playing back the drive signals yields an output acoustic signal that realizes the desired directional characteristic.

(Processing Method)
An overview of the processing performed by the processing device 1 will be described with reference to FIG. 2.

In step S1, the processing device 1 acquires the desired sound field and the speaker specifications. In step S2, the processing device 1 acquires a reward function generated from the desired sound field acquired in step S1. In step S3, the processing device 1 acquires a constraint condition generated from the speaker specifications.

In step S4, the processing device 1 calculates a filter that controls the sound field by reinforcement learning. Here, the processing device 1 sets the filter so that it satisfies the constraint condition acquired in step S3 and maximizes the reward function acquired in step S2.

In step S5, the processing device 1 convolves the input acoustic signal with the filter designed in step S4 to generate the drive signal to be input to each speaker. The desired sound field is realized when the drive signals are input to the speakers.

(Processing Device)
The processing device 1 generates a filter for each speaker by using reinforcement learning to solve an optimization problem: a reward function, which is a function that generates the desired sound field, is maximized under the constraint that the filter gain stays within a certain range that each speaker of the speaker array 2 can reproduce. The processing device 1 includes a reward function acquisition unit 11, a constraint condition acquisition unit 12, a calculation unit 13, and a convolution calculation unit 14.

The reward function acquisition unit 11 acquires a reward function set from the target directional characteristic, that is, a reward function for realizing a sound field with the target directional characteristic. The reward function can be set arbitrarily from functions that realize a sound field. The reward function acquisition unit 11 may acquire a reward function created by a user, or it may generate, by computer processing, a reward function for realizing the target directional characteristic and acquire the generated reward function. The reward function acquisition unit 11 stores the acquired reward function in the processing device 1 so that the calculation unit 13, described later, can refer to it.

One possible reward function is based on the error between the desired sound field, given by the target function in least-squares filter design for directivity control, and the sound field observed at the control points. The error is transformed so that the reward becomes smaller as the error grows and larger as the error shrinks.

Specifically, in the observation system shown in FIG. 3, for example, control points are placed around a speaker array consisting of multiple speakers, and a filter that controls the amplitude and phase of each speaker is designed based on the transfer characteristics from the speakers to the control points. This method can control the directions in which sound propagates strongly from the speakers and the directions in which it does not propagate. In this method, the signal observed at each control point is expressed by equation (1).

The least-squares method for determining the directivity control filter is a minimization problem: find the filter that minimizes the sum of squared errors between the desired directional characteristic and the directional characteristic observed at the control points. The objective function J to be minimized is therefore expressed by equation (3).

In the embodiment of the present invention, the reward function is derived from equation (3). The reward function is obtained by transforming the objective function of equation (3) so that the reward becomes smaller as the error between the desired directional characteristic and the directional characteristic observed at the control points grows, and larger as the error shrinks. For example, the reward function is obtained by multiplying the objective function of equation (3) by -1.
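As a minimal sketch of this transformation, assume the usual frequency-domain model in which the directivity observed at the control points is the product of a transfer matrix and a filter vector. The names `G` (transfer matrix, control points by speakers), `w` (one complex filter coefficient per speaker), `d` (desired directivity), `ls_objective`, and `ls_reward` are illustrative and do not appear in the source, whose equations (1) to (4) are not reproduced in this text.

```python
import numpy as np

def ls_objective(G, w, d):
    """Sum of squared errors between the desired directivity d
    and the directivity G @ w observed at the control points."""
    err = d - G @ w
    return float(np.real(np.vdot(err, err)))

def ls_reward(G, w, d):
    """Reward = objective multiplied by -1:
    a smaller error gives a larger reward."""
    return -ls_objective(G, w, d)

# Toy setup (assumed values): 4 control points, 2 speakers.
G = np.array([[1, 0], [0, 1], [1, 1], [1, -1]], dtype=complex)
d = np.array([1, 0, 1, 1], dtype=complex)
w_good = np.array([1, 0], dtype=complex)  # reproduces d exactly
w_bad = np.array([0, 1], dtype=complex)   # misses d

print(ls_reward(G, w_good, d), ls_reward(G, w_bad, d))
```

With these toy values the filter that reproduces `d` exactly receives the highest possible reward (zero error), and any mismatch lowers the reward.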

In general, the directivity control filter of equation (4) is obtained by calculating the filter that minimizes the objective function of equation (3). In the embodiment of the present invention, however, the filter obtained by the calculation unit 13, described later, is used instead.

As another example of a reward function, the use of the acoustic contrast in ACM will be described. When acoustic contrast is used, as shown in FIG. 4, an area where sound is presented (bright zone) and an area where sound is suppressed (dark zone) are set arbitrarily within the area where the speakers are installed. In ACM, the filter is designed so that the speaker array maximizes the acoustic contrast between the bright zone and the dark zone.

When an arbitrary speaker array is driven through a filter, the sound fields observed in the two areas can be expressed by equations (5) and (6).

The acoustic energy in each area is expressed by equations (7) and (8).

As shown in equation (9), the acoustic contrast is the ratio of the acoustic energy in the bright zone to the acoustic energy in the dark zone.

In the embodiment of the present invention, the acoustic contrast of equation (9) is used as the reward function.
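The acoustic-contrast reward can be sketched as follows. This is an illustration under stated assumptions, not the source's equations (5) to (9), which are not reproduced in this text: `G_b` and `G_d` are hypothetical transfer matrices from the speakers to the bright-zone and dark-zone control points, the energy of each zone is taken as the mean squared pressure over its control points, and `eps` is a small constant added here only to avoid division by zero.

```python
import numpy as np

def acoustic_contrast(G_b, G_d, w, eps=1e-12):
    """Ratio of bright-zone to dark-zone acoustic energy for filter w."""
    p_b = G_b @ w  # sound field at bright-zone control points
    p_d = G_d @ w  # sound field at dark-zone control points
    e_b = np.mean(np.abs(p_b) ** 2)  # bright-zone energy (assumed: spatial mean)
    e_d = np.mean(np.abs(p_d) ** 2)  # dark-zone energy (assumed: spatial mean)
    return float(e_b / (e_d + eps))

# Toy setup (assumed values): one control point per zone, two speakers.
G_b = np.array([[1.0, 1.0]], dtype=complex)
G_d = np.array([[1.0, -1.0]], dtype=complex)
w = np.array([1.0, 0.5], dtype=complex)

c = acoustic_contrast(G_b, G_d, w)
```

A filter that raises the bright-zone energy or lowers the dark-zone energy increases this value, which is why it can serve directly as a reward to be maximized.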

In general, this leads to an optimization problem that maximizes the acoustic contrast of equation (9), as shown in equation (10). In the embodiment of the present invention, however, the filter obtained by the calculation unit 13, described later, is used instead.

Although two examples of reward functions have been shown in the embodiment of the present invention, the reward function is not limited to these. Any reward function that can express a sound field may be used.

An example of the processing by which the reward function acquisition unit 11 generates, by computer processing, a reward function for realizing the target directional characteristic will now be described.

First, a method of generating a reward function from the filter design for directivity control in the observation system shown in FIG. 3 will be described. The user inputs the direction and width of the beam from the speakers as the target directional characteristic. The beam direction and beam width are specified, for example, as angles centered on the speakers. The reward function acquisition unit 11 generates d in equation (3) from the input directional characteristic and can then generate the reward function from the objective function of equation (3).
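One way to turn a beam direction and beam width into a desired directivity vector is sketched below. The source does not specify how d is constructed; this sketch assumes control points placed at known angles around the speakers and a binary target (1 inside the beam, 0 outside). The function name `target_directivity` and its parameters are hypothetical.

```python
import numpy as np

def target_directivity(control_angles_deg, beam_dir_deg, beam_width_deg):
    """Return 1.0 for control points inside the beam and 0.0 outside.

    All angles are in degrees, measured around the speaker array.
    """
    angles = np.asarray(control_angles_deg, dtype=float)
    # Wrapped angular difference in the range [-180, 180).
    diff = (angles - beam_dir_deg + 180.0) % 360.0 - 180.0
    return (np.abs(diff) <= beam_width_deg / 2.0).astype(float)

# Control points every 30 degrees; beam toward 0 degrees, 60 degrees wide.
angles = np.arange(0, 360, 30)
d = target_directivity(angles, beam_dir_deg=0.0, beam_width_deg=60.0)
```

The wrapped difference makes the beam behave correctly across the 0/360 degree boundary, so the control point at 330 degrees counts as inside a 60 degree beam pointed at 0 degrees.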

As another example, a method of generating a reward function from the acoustic contrast in ACM in the observation system shown in FIG. 4 will be described. For the control points in the bright zone and the dark zone, the reward function acquisition unit 11 generates the sound fields of equations (5) and (6), and generates the acoustic contrast given by equation (9) as the reward function.

In this way, the reward function acquisition unit 11 can also generate the reward function by computer processing.

The constraint condition acquisition unit 12 acquires, as a constraint condition, a range of filter coefficients for generating drive signals that the multiple speakers realizing the target directional characteristic can reproduce. The constraint condition acquisition unit 12 may acquire a constraint condition created by a user, or it may generate a constraint condition by computer processing from the specifications of the speakers of the speaker array 2 and acquire the generated constraint condition. The constraint condition acquisition unit 12 stores the acquired constraint condition in the processing device 1 so that the calculation unit 13, described later, can refer to it.

A filter that controls the sound field with a speaker array is calculated so as to include a filter gain, which affects the signal output from the filter. The filter gain for a given speaker at a given frequency is expressed by equation (11).

As equation (11) shows, when the filter gain is large, the signal input to the speaker becomes proportionally large, which places a heavy load on the speaker and may make playback difficult. In particular, when an ACM or directivity control filter is designed for an arbitrary frequency band, the filter gain in the low-frequency range becomes large. In the embodiment of the present invention, the constraint condition is therefore the range of filter coefficients for generating drive signals that the multiple speakers realizing the target directional characteristic can reproduce. The constraint condition restricts the filter coefficients for each speaker to a certain range.

The constraint condition is set so that the speaker gain for each speaker is suppressed and the drive signal can be played back. The constraint condition may be the same for every frequency band.

Because the range of filters that a speaker can reproduce may be limited by the magnitude of the input acoustic signal, the constraint condition acquisition unit 12 may also take the magnitude of the input acoustic signal into account when setting the constraint condition.

If the sum of squares of the filter coefficients is added to the objective function of equation (3) as a penalty term in order to suppress the filter gain, the objective function is expressed by equation (12).

In equation (12), the regularization parameter controls the relative weight of the loss term and the penalty term. In general, the regularization parameter is set experimentally to the same value for every frequency band, based on the reproduction accuracy of the desired sound field and the magnitude of the filter gain at each frequency. When the same regularization parameter is used for every frequency band, the optimal regularization parameter cannot be given to each band. Conversely, when the regularization parameter is determined per frequency band, it is difficult to set a value experimentally for every frequency. In addition, because the reproduction accuracy of the desired sound field and the magnitude of the filter gain are in a trade-off relationship, it is difficult to set the optimal parameter.

In the embodiment of the present invention, therefore, a penalty term is not used. Instead, a constraint condition in the reinforcement learning method is set so that the speaker gain for each speaker and each frequency band is suppressed and the drive signal can be played back. The constraint condition may differ from one frequency band to another. As a result, the filter coefficients are restricted to a certain range for each speaker and each frequency band.
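One possible way to enforce such a range is to clip the magnitude of each complex filter coefficient while keeping its phase. The sketch below is an assumption made for illustration; the source states only that the coefficients are restricted to a range per speaker and per frequency band. The names `clip_filter_gain`, `W` (frequency bands by speakers), and `g_max` are hypothetical.

```python
import numpy as np

def clip_filter_gain(W, g_max):
    """Limit |W[f, l]| to g_max (broadcastable to W.shape), preserving phase."""
    W = np.asarray(W, dtype=complex)
    mag = np.maximum(np.abs(W), 1e-12)  # guard against division by zero
    scale = np.minimum(1.0, np.asarray(g_max, dtype=float) / mag)
    return W * scale

# Two frequency bands, two speakers; a different limit per band (assumed values).
W = np.array([[3 + 4j, 0.5], [0.0, -2j]])
g_max = np.array([[1.0], [1.5]])  # band 0 limited to 1.0, band 1 to 1.5

W_clipped = clip_filter_gain(W, g_max)
```

Because `g_max` is broadcast against `W`, the same function covers a single global limit, a per-band limit as in this example, or a full per-band, per-speaker table.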

In general, the least-squares filter of equation (13) is obtained by solving the minimization problem for the objective function of equation (12) in the same way as for equation (4). In the embodiment of the present invention, however, the filter obtained by the calculation unit 13, described later, is used instead.
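For comparison, the conventional penalty-term design has a well-known closed form. The sketch below assumes the standard regularized least-squares objective, the squared error between `d` and `G @ w` plus `lam` times the sum of squares of the filter coefficients, whose minimizer is w = (G^H G + lam I)^-1 G^H d. The source's equations (12) and (13) are not reproduced in this text, so the match with them is an assumption; `G`, `d`, `lam`, and `regularized_ls_filter` are illustrative names.

```python
import numpy as np

def regularized_ls_filter(G, d, lam):
    """Minimize ||d - G w||^2 + lam * ||w||^2 in closed form."""
    n_speakers = G.shape[1]
    A = G.conj().T @ G + lam * np.eye(n_speakers)
    return np.linalg.solve(A, G.conj().T @ d)

# Toy setup (assumed values).
G = np.eye(2, dtype=complex)
d = np.array([2.0, 4.0], dtype=complex)

w_small_lam = regularized_ls_filter(G, d, lam=1.0)
w_large_lam = regularized_ls_filter(G, d, lam=9.0)
```

A larger `lam` shrinks the filter gain at the cost of reproduction accuracy, which is the trade-off that makes a single regularization parameter hard to tune across frequency bands.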

The calculation unit 13 calculates, for each of the multiple speakers, a filter that satisfies the constraint condition and maximizes the reward function. The constraint condition is determined in the constraint condition acquisition unit 12 from the specifications of the speakers that make up the speaker array 2. The reward function is determined in the reward function acquisition unit 11 from the directional characteristic of the target sound field.

The calculation unit 13 calculates the filter that satisfies the constraint condition and maximizes the reward function by a reinforcement learning method that treats the filter coefficients as continuous-valued actions. Using a reinforcement learning method that handles continuous actions, such as DDPG (Deep Deterministic Policy Gradient) or NAF (Normalized Advantage Function), the calculation unit 13 solves the optimization problem of maximizing the reward function under the constraint condition and thereby calculates the filter that generates the desired sound field.

An example of processing that calculates the filter using NAF, with the acoustic contrast as the reward function, is described below.

The Q function is expressed by equation (14) in terms of the state value function and the advantage function.
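For orientation, the decomposition used in the standard NAF formulation is shown below. Equation (14) is not reproduced in this text, so the assumption here is that it follows this standard form; the symbols \mu(s) (the action proposed by the network), P(s), and L(s) are the usual NAF notation and are not taken from the source.

```latex
Q(s, a) = V(s) + A(s, a), \qquad
A(s, a) = -\tfrac{1}{2}\,\bigl(a - \mu(s)\bigr)^{\top} P(s)\,\bigl(a - \mu(s)\bigr), \qquad
P(s) = L(s)\,L(s)^{\top}
```

Here L(s) is a lower-triangular matrix with a positive diagonal, so P(s) is positive definite and A(s, a) is at most zero, with its maximum at a = \mu(s). The action that maximizes Q is therefore available in closed form, which is what makes the method practical for continuous actions such as filter coefficients.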

In the embodiment of the present invention, the state, action, reward, and clipping range in Table 1 are defined as in equation (16).

The action a' finally obtained by the processing shown in Table 1 is the filter that generates the desired sound field. By carrying out this processing for every frequency to be used, the calculation unit 13 can determine the optimal filter at each frequency.
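The overall loop (propose an action, clip it to the constraint range, evaluate the reward, keep improvements) can be illustrated with a deliberately simplified stand-in. The sketch below uses plain random-search hill climbing at a single frequency, not NAF or DDPG, and it does not reproduce Table 1. All names (`contrast`, `clip`, `search_filter`, `G_b`, `G_d`, `g_max`) and all numeric settings are assumptions made for illustration.

```python
import numpy as np

def contrast(G_b, G_d, w):
    """Reward: bright-zone energy divided by dark-zone energy."""
    e_b = np.mean(np.abs(G_b @ w) ** 2)
    e_d = np.mean(np.abs(G_d @ w) ** 2)
    return e_b / (e_d + 1e-12)

def clip(w, g_max):
    """Constraint: limit each coefficient magnitude to g_max, keeping phase."""
    mag = np.maximum(np.abs(w), 1e-12)
    return w * np.minimum(1.0, g_max / mag)

def search_filter(G_b, G_d, g_max, iters=500, step=0.1, seed=0):
    """Random-search stand-in for the learning loop at one frequency."""
    rng = np.random.default_rng(seed)
    n = G_b.shape[1]
    w = clip(np.ones(n, dtype=complex), g_max)  # initial action
    best = contrast(G_b, G_d, w)
    for _ in range(iters):
        noise = rng.normal(size=n) + 1j * rng.normal(size=n)
        cand = clip(w + step * noise, g_max)    # perturbed action, clipped
        r = contrast(G_b, G_d, cand)
        if r > best:                            # keep only improvements
            w, best = cand, r
    return w, best

# Toy transfer matrices (assumed values): 3 control points per zone, 4 speakers.
rng = np.random.default_rng(1)
G_b = rng.normal(size=(3, 4)) + 1j * rng.normal(size=(3, 4))
G_d = rng.normal(size=(3, 4)) + 1j * rng.normal(size=(3, 4))
w, best = search_filter(G_b, G_d, g_max=0.5)
```

Running the same search once per frequency bin, with the transfer matrices and `g_max` for that bin, mirrors the per-frequency procedure described above. A real implementation would replace the random perturbation with a learned policy.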

The convolution calculation unit 14 convolves the input acoustic signal with the filter calculated by the calculation unit 13 to generate the drive signal to be input to each speaker. The desired sound field is realized when each of the multiple speakers plays back the drive signal calculated by the convolution calculation unit 14.
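The convolution step can be sketched as follows, assuming each speaker's filter is available as a real-valued time-domain FIR impulse response. How the per-frequency coefficients are converted to such a response is not described in the source and is outside this sketch. The names `make_drive_signals`, `x`, and `filters` are hypothetical.

```python
import numpy as np

def make_drive_signals(x, filters):
    """Convolve one input signal with each speaker's FIR filter.

    x:       input acoustic signal, shape (N,)
    filters: FIR taps per speaker, shape (L, K)
    returns: drive signals, shape (L, N + K - 1)
    """
    x = np.asarray(x, dtype=float)
    filters = np.asarray(filters, dtype=float)
    return np.stack([np.convolve(x, h) for h in filters])

# Toy values (assumed): 3-sample input, two speakers with 2-tap filters.
x = [1.0, 2.0, 3.0]
filters = [[1.0, 0.0],   # speaker 0: pass-through
           [0.5, 0.5]]   # speaker 1: two-sample average
drive = make_drive_signals(x, filters)
```

Each row of `drive` is the signal sent to one speaker. For long signals, block-based or FFT-based convolution would usually replace the direct form used here.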

As described above, in the processing device 1 according to the embodiment of the present invention, the reward function acquisition unit 11 sets, as the reward function, a function that generates a sound field realizing the desired area playback, and the constraint condition acquisition unit 12 sets, as the constraint condition, a restriction that keeps the filter gain within a certain range. From the reward function and the constraint condition thus output, the calculation unit 13 derives a sound field control filter that satisfies the constraint condition and maximizes the reward function. The convolution calculation unit 14 convolves the input acoustic signal with the sound field control filter output from the calculation unit 13 for playback. This makes it possible to obtain the drive signal to be input to each speaker in order to realize the desired area playback.

Here, the range of filter coefficients for generating drive signals that the multiple speakers can reproduce is set as the constraint condition. The processing device 1 according to the embodiment of the present invention can therefore achieve the target directional characteristic while suppressing the gain applied to each of the multiple speakers.

In the embodiment of the present invention, an example has been described in which the multiple speakers are arranged in a circle in the observation system used to calculate the reward function, but the arrangement is not limited to this. The speakers may be placed at arbitrary positions, for example in a single row.

The processing device 1 of the present embodiment described above is implemented using, for example, a general-purpose computer system that includes a CPU (Central Processing Unit, processor) 901, a memory 902, a storage 903 (HDD: Hard Disk Drive, SSD: Solid State Drive), a communication device 904, an input device 905, and an output device 906. In this computer system, each function of the processing device 1 is realized when the CPU 901 executes the processing program loaded into the memory 902.

The processing device 1 may be implemented on a single computer or on multiple computers. The processing device 1 may also be a virtual machine implemented on a computer.

The processing program of the processing device 1 can be stored on a computer-readable recording medium such as an HDD, SSD, USB (Universal Serial Bus) memory, CD (Compact Disc), or DVD (Digital Versatile Disc), or it can be distributed via a network.

The present invention is not limited to the above embodiment, and many variations are possible within the scope of its gist.

REFERENCE SIGNS LIST
1 Processing device
2 Speaker array
3 Playback system
11 Reward function acquisition unit
12 Constraint condition acquisition unit
13 Calculation unit
14 Convolution calculation unit
901 CPU
902 Memory
903 Storage
904 Communication device
905 Input device
906 Output device

Claims

A processing device comprising:
a reward function acquisition unit that acquires a reward function set from a target directional characteristic;
a constraint condition acquisition unit that acquires, as a constraint condition, a range of filter coefficients for generating drive signals that can be reproduced by a plurality of speakers that realize the target directional characteristic;
a calculation unit that calculates, for each of the plurality of speakers, a filter that satisfies the constraint condition and maximizes the reward function; and
a convolution calculation unit that convolves an input acoustic signal with the calculated filter to generate a drive signal to be input to each speaker,
wherein the constraint condition is set so that a drive signal can be reproduced by suppressing a speaker gain corresponding to each speaker.

The processing device according to claim 1, wherein the calculation unit calculates a filter that satisfies the constraint condition and maximizes the reward function by a reinforcement learning method that treats a filter coefficient as a continuous value of an action.

The processing device according to claim 1, wherein the constraint condition is set so that a drive signal can be reproduced by suppressing a speaker gain corresponding to each speaker and frequency band.

A playback system comprising:
the processing device according to any one of claims 1 to 3; and
a plurality of speakers that play back the drive signals calculated by the convolution calculation unit.

A processing method comprising:
a step in which a computer acquires a reward function from a target directional characteristic;
a step in which the computer acquires, as a constraint condition, a range of filter coefficients for generating drive signals that can be reproduced by a plurality of speakers that realize the target directional characteristic;
a step in which the computer calculates, for each of the plurality of speakers, a filter that satisfies the constraint condition and maximizes the reward function; and
a step in which the computer convolves an input acoustic signal with the calculated filter to generate a drive signal to be input to each speaker,
wherein the constraint condition is set so that a drive signal can be reproduced by suppressing a speaker gain corresponding to each speaker.

A processing program for causing a computer to function as the processing device according to any one of claims 1 to 3.