JP4530199B2

JP4530199B2 - Model creation device and model creation program for natural instrument musical sound judgment device

Info

Publication number: JP4530199B2
Application number: JP2004047809A
Authority: JP
Inventors: 錬澄田; 直行田中
Original assignee: Kawai Musical Instruments Manufacturing Co Ltd
Current assignee: Kawai Musical Instruments Manufacturing Co Ltd
Priority date: 2004-02-24
Filing date: 2004-02-24
Publication date: 2010-08-25
Anticipated expiration: 2024-02-24
Also published as: JP2005241717A

Description

The present invention relates to a model creation device and a model creation program for a musical sound judgment device for natural musical instruments, and more particularly to a model creation device and a model creation program capable of creating, without error, the models used to judge whether a natural musical instrument has been played as instructed by a performance instruction means.

Electronic musical instruments with key-press instruction means for self-study of keyboard performance are known. For example, Patent Document 1 below describes an electronic musical instrument with key-press instruction means in which performance information stored in advance is read out sequentially to a display circuit, and display lamps provided for the individual keys are lit according to that information to indicate which key to press. A key scan circuit detects the pitch of the key pressed by the performer, and when that pitch matches the pitch indicated by the display lamp, the next piece of performance information is read out to the display circuit.

The key-press instruction of Patent Document 1 is easy to implement in an electronic musical instrument that has a key scan circuit, but difficult to implement in a natural musical instrument such as an acoustic piano, which has none. A key scan circuit could be added to a natural instrument, but retrofitting the required key switches and display lamps key by key is laborious and cannot be done casually.

To enable performance instructions such as key-press instructions on natural instruments as well, the present applicant proposed a musical sound judgment device for natural musical instruments in Japanese Patent Application No. 2003-134372 (the prior application). That device captures a single note of each pitch through a microphone and stores the power spectrum of each note in advance as a model. During a lesson it reads out the model of the next note to be played according to the performance instruction, compares it with the power spectrum of the performance sound captured by the microphone, and advances the performance instruction when the distance between the two falls within a threshold.

Because this device compares models created by actually playing the natural instrument with the performance sound during the lesson, it does not need to detect parameters such as the pitch of the performance sound. The calculation is very simple, and a match between the model and the lesson performance sound can be detected reliably even when the instrument is out of tune. However, the power spectrum of a single note at each pitch must be created without error by actually playing the instrument, and stored in advance as a model.
Patent Document 1: Japanese Examined Patent Publication No. 2-705

In the prior application, it is desirable to detect a match between the model and the power spectrum of the lesson performance sound as early as possible, so at model creation the attack of the performance sound following the key press is detected and the power spectrum at the moment of attack detection is stored as the model. An attack is detected when the volume or amplitude exceeds a threshold set above the expected noise volume or amplitude; any input whose volume or amplitude exceeds the threshold is treated as an attack, and a model is created.

However, the performer may press different keys with different strengths, and it is undesirable to create models from performance sounds produced with a different key-press strength for each key. During a lesson, when the next sound to be played is a chord, the models of the single notes making up the chord are combined into one model. On a piano, though, the volume varies greatly with register even for the same key-press strength, so it is not appropriate to normalize the volume of each single note before combining them. Consequently, at model creation the performer should be asked to play every key with the same strength, and the models should be created from the resulting sounds; for a chord, the single-note models should be combined as they are and the result normalized as a whole.

In addition, because attack detection depends on the volume or amplitude of the input digital signal, the input level of the performance sound must be set appropriately. Furthermore, since the volume produced differs greatly with register (pitch) even for the same key-press strength, simply guiding the performer toward the same maximum volume for every note will not make them play with the same strength. If a key is pressed too hard during model creation, high-order harmonics come out strongly, which makes it difficult to detect a match between the resulting model and the power spectrum of the lesson performance sound. It is therefore desirable to have the keys pressed with normal or slightly soft strength during model creation.

During model creation, the keys in the target range are pressed one after another and a model of each single note is created and stored. If the next key is pressed too soon, the reverberation of the previous sound still remains and adds extraneous frequency components, so the model that truly corresponds to the pressed key is not obtained. A correct key press during the lesson then fails to match the model. This can be avoided by waiting until the reverberation has fully died away before starting on the next key, but spending more time than necessary on each note is wasteful, and creating models over a wide range then takes a great deal of time. It is also unclear to the performer when the next key may be pressed.

When the power spectrum of a single note at each pitch is created as a model, the sound input from the microphone contains not only the piano tone but also the noise of the piano action. A Fourier transform (FFT) is performed to compare the power spectra of the model and the lesson performance sound, and the action noise appears in the low-frequency part of both spectra obtained by the FFT.

For high-pitched keys, the action noise has a stronger power spectrum than the string vibration that constitutes the actual piano tone, so when the model and the lesson performance sound are compared, agreement in the action-noise spectrum is weighted relatively heavily. Because the action noise contained in both the model and the lesson performance sound prevents a correct match between the two, some countermeasure is needed.

To keep the action noise from being evaluated, it could simply be removed. However, the action noise is a noise-like sound emitted by the action mechanism, and its power spectrum differs every time, so it is difficult to select and remove only the action noise.

The power spectra are normalized as preprocessing for the comparison between the model and the lesson performance sound. If a power spectrum is simply normalized by volume, however, noise from the external environment is scaled along with it; normalizing a softly played note in particular boosts the noise component considerably. When such noise is contained in the model or in the lesson performance sound, a correct match between the two again cannot be obtained.

When an FFT is performed to create a power spectrum, a DC (direct current) component appears in the spectrum. If such a DC component is contained in the power spectrum of the model or of the lesson performance sound, a correct match between the two power spectra cannot be obtained.

Furthermore, when volume is measured to normalize the power spectra before comparison, the DC component prevents the volume from being measured correctly. A DC offset riding on the analog signal before A/D conversion is carried over to the digital signal after conversion, so a volume computed from the waveform amplitude is distorted by the offset.
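The effect of a DC offset on a volume measurement can be illustrated with a short sketch. The text does not say how volume is computed, so the RMS measure below is an assumption, and subtracting the block mean is only one simple way to estimate the offset; neither is presented as the method of this patent.

```python
import math

def rms(samples):
    """Root-mean-square level of a block of samples (assumed volume measure)."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def remove_dc(samples):
    """Subtract the block mean, a simple assumed estimate of the DC offset."""
    mean = sum(samples) / len(samples)
    return [s - mean for s in samples]

# A quiet tone (amplitude 0.1, exactly 5 cycles in 100 samples) on a 0.5 DC offset.
n = 100
tone = [0.1 * math.sin(2 * math.pi * 5 * i / n) for i in range(n)]
offset_tone = [s + 0.5 for s in tone]

print(rms(tone))                    # level of the tone alone
print(rms(offset_tone))             # level dominated by the offset
print(rms(remove_dc(offset_tone)))  # level after removing the offset
```

With the offset present, the measured level is dominated by the offset rather than by the tone, which is the measurement error described above.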

Therefore, to compare the power spectra of the model and the lesson performance sound correctly, measures are also needed to keep noise and DC components from affecting the result.

An object of the present invention is to make it possible to create, without error, the models used to judge whether a natural musical instrument has been played as instructed by a performance instruction means.

To solve the above problems, the present invention is characterized in that, at model creation, the performance sound of a single note produced by actually playing the natural instrument is captured with level adjustment; an attack is detected by comparing the captured sound with a threshold set on the basis of the volume of the level-adjusted performance sound and the volume of noise captured under the same level adjustment; and the power spectrum of the performance sound at the moment the attack is detected is calculated and stored as the model of that note.
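A minimal sketch of threshold-based attack detection follows. This passage states only that the threshold is set from the performance-sound volume and the noise volume; the interpolation formula and the `ratio` parameter below are hypothetical choices made for illustration, as are the function names.

```python
def attack_threshold(noise_volume, performance_volume, ratio=0.25):
    """Place the threshold between the noise level and the expected playing level.

    `ratio` is a hypothetical tuning parameter, not a value from the text.
    """
    return noise_volume + ratio * (performance_volume - noise_volume)

def detect_attack(frame_volumes, threshold):
    """Return the index of the first frame whose volume exceeds the threshold, or None."""
    for index, volume in enumerate(frame_volumes):
        if volume > threshold:
            return index
    return None

# Noise measured at 0.02 and a typical note at 0.50 under the same level setting.
threshold = attack_threshold(noise_volume=0.02, performance_volume=0.50)
frames = [0.01, 0.02, 0.03, 0.20, 0.45, 0.40]
print(detect_attack(frames, threshold))
```

Tying the threshold to both measurements keeps it above the noise floor while still allowing it to trigger early in the rise of the note.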

The present invention is also characterized in that, at model creation, the performance sound of a single note produced by actually playing the natural instrument is captured and stored as the model of that note, the volume of the captured sound is compared with a threshold, and when the volume stays below the threshold continuously for a fixed time, a silent state is judged to have been reached and the process moves on to creating the model of the next note.
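The silence condition, volume continuously below a threshold for a fixed time, can be sketched as a run-length check over per-frame volumes. Expressing the fixed time as a number of frames (`hold_frames`) is an assumption, as are the function name and the sample values.

```python
def find_silence(frame_volumes, threshold, hold_frames):
    """Return the index of the frame at which the volume has stayed below
    `threshold` for `hold_frames` consecutive frames, or None if it never does."""
    run = 0
    for index, volume in enumerate(frame_volumes):
        if volume < threshold:
            run += 1
            if run >= hold_frames:
                return index
        else:
            run = 0  # a louder frame restarts the count
    return None

# The dip at index 2 is too short; silence is confirmed only after three quiet frames.
volumes = [0.5, 0.3, 0.04, 0.2, 0.04, 0.03, 0.02, 0.01]
print(find_silence(volumes, threshold=0.05, hold_frames=3))
```

Requiring a sustained run rather than a single quiet frame prevents a momentary dip in the decaying sound from being mistaken for silence.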

The present invention is also characterized in that, at model creation, the performance sound of a single note produced by actually playing the natural instrument is captured, its power spectrum is obtained, and the power spectrum with the frequency components below the fundamental frequency of the note being modeled cut away is stored as the model of that note.
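Cutting the components below the fundamental can be sketched as zeroing the low bins of the power spectrum. The bin-to-frequency mapping (bin k at k × sample_rate / fft_size) is the standard one for a DFT; the function name, sample rate, and spectrum values are illustrative assumptions.

```python
def cut_below_fundamental(power_spectrum, fundamental_hz, sample_rate, fft_size):
    """Zero every bin whose centre frequency lies below the fundamental."""
    bin_width = sample_rate / fft_size  # Hz per bin
    return [
        0.0 if k * bin_width < fundamental_hz else power
        for k, power in enumerate(power_spectrum)
    ]

# 16-point transform at 8000 Hz: bins are 500 Hz apart (0, 500, 1000, ...).
spectrum = [5.0, 4.0, 3.0, 9.0, 1.0, 6.0, 1.0, 0.5, 0.2]
model = cut_below_fundamental(spectrum, fundamental_hz=1500.0,
                              sample_rate=8000, fft_size=16)
print(model)
```

In practice the fundamental rarely falls exactly on a bin centre, so an implementation might leave a margin of a bin or so below it; that refinement is not specified in the text.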

The present invention is also characterized in that, at model creation, the power spectrum of the sound captured while the natural instrument is not being played is obtained, the power of the strongest component in that spectrum is taken as a noise reference value, and the power spectrum of a single-note performance sound with every component whose power is at or below the noise reference value cut away is stored as the model of that note.
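The noise-reference cut described above translates almost directly into code. The sketch below uses hypothetical function names and made-up spectrum values, and assumes both spectra are lists of per-bin power.

```python
def noise_reference(noise_spectrum):
    """Noise reference value: the largest power found with the instrument silent."""
    return max(noise_spectrum)

def cut_noise_floor(power_spectrum, reference):
    """Zero every bin whose power is at or below the noise reference value."""
    return [0.0 if power <= reference else power for power in power_spectrum]

noise = [0.1, 0.3, 0.2, 0.05]   # captured with no key pressed
note = [0.2, 5.0, 0.3, 1.2]     # captured while a single note sounds
reference = noise_reference(noise)
print(cut_noise_floor(note, reference))
```

Because the cut is applied before any normalization, weak bins that are indistinguishable from background noise cannot be amplified later.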

Furthermore, the present invention is characterized in that, at model creation, the performance sound of a single note produced by actually playing the natural instrument is captured, its power spectrum is obtained, and the power spectrum with the components near the lowest frequency cut away is stored as the model of that note.

The techniques for silence detection and for removing action noise, noise, and DC components can be combined as appropriate, and the present invention can be embodied as either a device or a program.

According to the features of the present invention, the volume of the performance sound can be adjusted appropriately at model creation, and the attack of the level-adjusted single-note input can be detected reliably. If the volume of the performance sound is not appropriate, a message to that effect can be issued to prompt the performer to play with roughly constant strength, so error-free models can be created.

Detecting the silent state before moving on to the next model eliminates the influence of reverberation from the previous note, and lets the performer create models one after another with confidence.

Furthermore, removing the part of the power spectrum below the fundamental frequency of the note being modeled, removing spectral components whose power is lower than the maximum noise power, and removing the components near the lowest frequency, including the DC component, from the power spectrum of the performance sound keeps action noise, noise, and DC components from affecting the model.

FIG. 1 is a block diagram of a musical sound judgment device to which the present invention is applied, configured here as a musical sound judgment device for an acoustic piano (hereinafter simply called the "piano" unless it needs to be distinguished from an electronic piano).

The main part of the musical sound judgment device shown in FIG. 1 can be built from a personal computer comprising a CPU 1, a ROM 2, a RAM 3, a sound source device 4, a speaker 5, an A/D converter 6, an input device (keyboard or mouse) 7, and a display device 8. Any well-known display means for showing the personal computer's processing results, such as a liquid crystal display or a cathode ray tube, can be used as the display device 8. An external storage device 9 is connected to the personal computer, and a microphone 10 is connected via the A/D converter 6. The microphone 10 is provided to capture the sound produced by the piano P and is preferably placed inside the piano P.

The CPU 1 provides a volume measurement unit 11, a spectrum creation unit 12, an attack detection unit 13, a spectrum comparison unit 14, and a key-press instruction unit 15 as its main functions. In other words, these units are implemented as software, although they can of course also be implemented in hardware.

The volume measurement unit 11 detects the level (volume) of the sound input from the microphone 10. The spectrum creation unit 12 has an FFT (Fourier transform) function that obtains a power spectrum from the digital musical sound signal input through the microphone 10 and the A/D converter 6. The power spectrum obtained by the FFT is stored in the RAM 3.
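As a rough illustration of what the spectrum creation unit computes, the sketch below derives a power spectrum from a block of samples. To stay self-contained it substitutes a direct O(N²) discrete Fourier transform for the FFT; the two give the same values, and a real implementation would use an FFT routine. The function name and the test signal are assumptions.

```python
import cmath
import math

def power_spectrum(samples):
    """Power |X[k]|^2 for bins 0..N/2, computed with a direct (slow) DFT."""
    n = len(samples)
    spectrum = []
    for k in range(n // 2 + 1):
        x_k = sum(s * cmath.exp(-2j * math.pi * k * i / n)
                  for i, s in enumerate(samples))
        spectrum.append(abs(x_k) ** 2)
    return spectrum

# A pure sine completing exactly 4 cycles in a 32-sample block.
n = 32
samples = [math.sin(2 * math.pi * 4 * i / n) for i in range(n)]
spectrum = power_spectrum(samples)
peak_bin = max(range(len(spectrum)), key=spectrum.__getitem__)
print(peak_bin)
```

Only bins 0 to N/2 are kept because the spectrum of a real-valued signal is symmetric about N/2.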

The attack detection unit 13 detects an attack on the basis of the detected power spectrum. The spectrum comparison unit 14 compares the distance between the power spectra created by the spectrum creation unit 12 in the model creation phase and in the lesson phase, and judges whether the performance followed the key-press instruction. The model creation phase and the lesson phase are described in detail later.
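The comparison performed by the spectrum comparison unit can be sketched as a distance test between two normalized spectra. The text does not specify the distance measure or the normalization, so the Euclidean distance, the unit-sum normalization, and the threshold value below are all assumptions chosen for illustration.

```python
import math

def normalize(spectrum):
    """Scale a spectrum so its bins sum to 1 (assumed normalization)."""
    total = sum(spectrum)
    if total == 0:
        return list(spectrum)
    return [power / total for power in spectrum]

def spectral_distance(a, b):
    """Euclidean distance between two normalized spectra (assumed measure)."""
    na, nb = normalize(a), normalize(b)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(na, nb)))

def matches(model, played, threshold=0.1):
    """True when the played spectrum lies within `threshold` of the model."""
    return spectral_distance(model, played) <= threshold

model = [0.0, 8.0, 0.0, 2.0]
print(matches(model, [0.0, 16.0, 0.0, 4.0]))  # same shape, played louder
print(matches(model, [0.0, 0.0, 9.0, 1.0]))   # energy in a different bin
```

Because both spectra are normalized first, the same note played at a different volume still matches, while a note with its energy in other bins does not.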

The key-press instruction unit 15 creates display data for showing key-press instructions on the display device 8 according to the performance information. Creation of the display data starts in response to a performance start instruction entered through the input device 7, and the data are supplied to the display device 8.

Processing in the musical sound judgment device is divided into a model creation phase, in which a single-note model for each pitch is created from the sounds produced when the performer actually plays the piano P, and a lesson phase, in which the performance sound input during the performer's lesson is compared with the models to judge a match and the lesson advances according to the result. Below, a performance sound produced in the model creation phase is called a "model creation performance sound".

In the model creation phase, the performer is first asked to press the keys of the acoustic piano P one note at a time, and each resulting model creation performance sound is captured through the microphone 10. The analog input signal is converted to a digital signal by the A/D converter 6, and the volume of the digital signal is measured by the volume measurement unit 11. The attack detection unit 13 detects the onset (attack) of the sound from the volume measured by the volume measurement unit 11.

When an attack is detected, the spectrum creation unit 12 applies an FFT to the digital signal to create a power spectrum. That power spectrum is stored in the RAM 3 or the external storage device 9 as the single-note model for the pitch of the pressed key.

In the lesson phase, the performance information of a piece stored in the RAM 3 or the external storage device 9 is read out, and the key-press instruction for the first note of the piece is shown on the display device 8. At the same time, the model for that first note is read from the RAM 3 or the external storage device 9. The sound of the key pressed on the piano P in response to the displayed instruction is captured through the microphone 10. The analog signal input through the microphone 10 is converted to a digital signal by the A/D converter 6, and the volume of the digital signal is measured by the volume measurement unit 11.

If the volume measured by the volume measurement unit 11 is at or above a threshold, the spectrum creation unit 12 applies an FFT to the digital signal to create a power spectrum. The attack detection unit 13 detects the attack at the onset of the sound from that power spectrum. Once an attack has been detected, the spectrum comparison unit 14 compares the power spectrum obtained by the FFT with the model that was read out, and if the two match, the key-press instruction on the display device 8 advances to the next note.

For the comparison, a single-note model can be used as is when a single key is to be pressed; for a chord, the single-note models are combined and the power spectra are then compared. The lesson proceeds by repeating this series of operations.
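Combining single-note models into a chord model might look like the sketch below. Following the earlier remark that the single-note models should be combined as they are and normalized only afterwards, the spectra are combined first and the result is scaled as a whole. Bin-wise addition and unit-sum normalization are assumptions; the text does not define how the models are combined.

```python
def combine_models(single_note_models):
    """Add single-note power spectra bin by bin, then normalize the sum as a whole."""
    combined = [sum(bins) for bins in zip(*single_note_models)]
    total = sum(combined)
    return [power / total for power in combined] if total else combined

# Toy four-bin models; the low note is louder, as on a real piano.
c4 = [0.0, 4.0, 0.0, 0.0]
e4 = [0.0, 0.0, 2.0, 0.0]
g4 = [0.0, 0.0, 0.0, 2.0]
print(combine_models([c4, e4, g4]))
```

Normalizing only after the sum preserves the relative loudness of the notes, which per-note normalization would discard.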

FIG. 2 is a flowchart of the processing performed by the CPU 1 in the model creation phase. When model creation is requested, a screen prompting for a piece or a lesson (skill) level is first shown on the display device 8, and the performer selects either a piece or a lesson level (S1). The range of pitches for which models are created is set according to this selection (S2).

When a piece is selected, its performance information is scanned and, as shown for example in FIG. 3, a range that includes at least the pitches contained in the piece is set automatically as the model creation range. Alternatively, the model creation range may be set automatically according to the level of the selected piece, or the pitches appearing in the selected piece may be shown to the performer, who then specifies a range including at least those pitches.
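Scanning a piece for its range can be sketched as taking the lowest and highest pitch and modeling every key in between. The format of the performance information is not given in the text; representing pitches as MIDI note numbers is an assumption made for this example.

```python
def model_range(note_numbers):
    """Return every pitch from the lowest to the highest note used in the piece."""
    low, high = min(note_numbers), max(note_numbers)
    return list(range(low, high + 1))

# Pitches of a short piece as MIDI note numbers (60 = middle C); an assumed format.
piece = [60, 64, 67, 62, 60]
print(model_range(piece))
```

The result is a contiguous span of keys that contains at least every pitch in the piece, matching the "at least" wording above.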

When a lesson level is selected, the range covered by the pieces at that level is set as the model creation range. For example, as shown in FIG. 4, the range is set progressively wider for the introductory, beginner, intermediate, and advanced levels. Once the model creation range has been set, the first pitch to be modeled, for example the lowest note in the range, is set (S3).

Next, the key corresponding to the pitch to be modeled is shown on a keyboard displayed on the display device 8, instructing the performer to press it (S4). At this point the sound source 4 may also be instructed to sound the indicated pitch. When the performer presses the indicated key, the model creation performance sound emitted by the piano P is captured by the microphone 10. If the pitch indicated in S4 is sounded from the speaker 5, the performer can compare it by ear with the sound from the piano P and notice a wrong key press.

The analog signal input through the microphone 10 is converted to a digital signal by the A/D converter 6, and its volume is measured by the volume measurement unit 11. Volume measurement continues until the key-release instruction. When the volume measured by the volume measurement unit 11 exceeds a threshold, an attack is deemed to have occurred (S5). This attack detection process is described in detail later. The spectrum creation unit 12 applies an FFT to the digital signal to create a power spectrum, and the volume measurement unit 11 calculates its volume (S6). The power spectrum and volume at the moment the attack is detected are saved temporarily in the RAM 3, which serves as a buffer, for later use.

Once the power spectrum has been created and the volume calculated, an instruction to release the key is shown on the display device 8 (S7). The key-release instruction may instead be a tone or a spoken prompt from the speaker 5. Before the next step begins, the piano P must be silent so that the step is not affected by reverberation of the previous note; therefore, after the key-release instruction, the process waits until the volume calculated by the volume measurement unit 11 falls to or below the silence detection threshold (S8). When the volume is at or below this threshold, the piano is judged to be effectively silent. Because a piano reverberates considerably, silence detection is important for creating models without error.

Once silence has been established, it is judged whether the maximum volume or maximum amplitude over the period since the key-press instruction lies within an allowable range (S9). If it lies outside the range and is judged too large, a message such as "Please play more softly." is issued; if it is judged too small, a message such as "Please play more loudly." is issued. The process then returns to S4 and the performer is instructed to press the key again. This is because a power spectrum obtained from an extremely hard or soft key press is unsuitable as a model. The maximum over the period since the key-press instruction is checked, rather than the volume or amplitude at the moment of attack detection, because the value at attack detection is not the peak; the volume and amplitude normally peak after the attack has been detected.
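The range check of step S9 can be sketched as follows. The bounds of the allowable range are not given in the text, so `lower` and `upper` are hypothetical values, and the function name is illustrative.

```python
def check_peak(volumes, lower, upper):
    """Check the peak volume since the key-press instruction against the allowed range.

    Returns a prompt for the performer, or None when the peak is acceptable.
    """
    peak = max(volumes)  # the peak over the whole period, not the value at attack time
    if peak > upper:
        return "Please play more softly."
    if peak < lower:
        return "Please play more loudly."
    return None

# Volumes measured from the key-press instruction until silence (made-up values).
print(check_peak([0.05, 0.30, 0.55, 0.40, 0.10], lower=0.2, upper=0.8))
print(check_peak([0.05, 0.60, 0.95, 0.70, 0.20], lower=0.2, upper=0.8))
```

A returned prompt corresponds to going back to S4 and asking for the key to be pressed again; `None` corresponds to proceeding to store the model.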

If the maximum volume or maximum amplitude over the period since the key-press instruction is judged to lie within the allowable range, the temporarily saved power spectrum and volume are stored in the RAM 3 as the model for the pitch of the pressed key (S10).

Ｓ１〜Ｓ１０により単音のモデルが記憶され、以上の処理を順次、次々の音高の音について繰り返すことにより設定された音域内の単音のモデルを記憶することができるが、本実施形態では、外部環境によるノイズなどが発せられたとしても各音高の単音のモデルを正しく記憶できるように、同じ鍵を再度押鍵するように指示し、これにより得られたモデルが先に記憶したモデルと一致すればそれを正規のモデルとして残すようにしている。 Steps S1 to S10 store the model of a single note, and repeating this processing for each successive pitch would store single-note models for the whole of the set range. In this embodiment, however, so that the single-note model of each pitch is stored correctly even if noise arises from the external environment, the performer is instructed to press the same key again, and the first model is kept as the definitive model only if the model obtained from the second press matches it.

以下に説明するＳ１１〜Ｓ１７がこの処理である。１回目の押鍵によるモデルをＲＡＭ３に記憶した後、演奏者にもう一度同じ鍵を押鍵するように指示する（Ｓ１１）。この押鍵指示は、１回目と同様、表示装置８による表示あるいは音源４による発音、あるいは両者によって行ってもよい。次に、Ｓ１０で記憶したモデルを読み出す（Ｓ１２）。 S11 to S17 described below is this process. After storing the model of the first key depression in the RAM 3, the player is instructed to depress the same key again (S11). This key pressing instruction may be performed by display on the display device 8, sound generation by the sound source 4, or both, as in the first time. Next, the model stored in S10 is read (S12).

演奏者の押鍵によるモデル作成演奏音はマイクロフォン１０によって取り込まれる。マイクロフォン１０を通して入力されたアナログ信号をＡ／Ｄ変換器６でデジタル信号に変換した後、その音量を音量測定部１１で検出する（Ｓ１３）。ある程度以上の音量が検出されたら、スペクトル作成部１２でデジタル信号をＦＦＴしてパワースペクトルを作成し、これを２回目のモデルとする（Ｓ１４）。ある程度以上の音量が検出されて始めてパワースペクトルを作成することにより、無駄な処理をなくし、本装置がパーソナルコンピュータなどで構成された場合のＣＰＵの負荷を減らすことができる。 The model-created performance sound by the player's key depression is captured by the microphone 10. After the analog signal input through the microphone 10 is converted into a digital signal by the A / D converter 6, the volume is detected by the volume measuring unit 11 (S13). When a sound volume of a certain level or more is detected, the spectrum creating unit 12 creates a power spectrum by performing FFT on the digital signal, and this is used as a second model (S14). By creating a power spectrum only after a sound volume of a certain level or more is detected, useless processing can be eliminated, and the load on the CPU when this apparatus is configured by a personal computer or the like can be reduced.

２回目のモデルを作成したら、すでにアタックが検出されているかどうかをチェックする（Ｓ１５）。最初はアタックは検出されていないので次のステップに進み、アタックを検出する（Ｓ１６）。アタック検出は、音の立ち上がりを検出するものであり、例えば音量変化で検出する方法やパワースペクトルの各スペクトルのパワー増加分から検出する方法などで実行できる。 After the second model is created, it is checked whether an attack has already been detected (S15). At first, since no attack is detected, the process proceeds to the next step to detect the attack (S16). Attack detection is to detect the rise of sound, and can be executed by, for example, a method of detecting by volume change or a method of detecting from the power increase of each spectrum of the power spectrum.

Ｓ１６でアタックを検出できなかった場合、Ｓ１３に戻って再度音量の検出からやり直す。また、アタックを検出できた場合には、Ｓ１２で読み出したモデル（１回目のモデル）とＳ１４で作成したモデル（２回目のモデル）との比較処理（Ｓ１７）に移る。アタック検出（Ｓ１６）は、同じ楽音の間は一度検出したら２度目以降は検出する必要はない。通常、アタックが検出されるのは音が立ち上がる瞬間のみであり、必ずしもアタック時の楽音のパワースペクトルと１回目のモデルとが一致するとは限らないからである。比較処理に際しては、１回目と２回目の押鍵による音量の違いをなくすために両者の音量が同じになるように正規化することが好ましい。 If no attack is detected in S16, the process returns to S13 and starts again from the detection of the volume. If an attack can be detected, the process proceeds to a comparison process (S17) between the model read in S12 (first model) and the model created in S14 (second model). In the attack detection (S16), once it is detected during the same musical tone, it is not necessary to detect it after the second time. Usually, the attack is detected only at the moment when the sound rises, and the power spectrum of the musical sound at the time of the attack does not necessarily match the first model. In the comparison process, it is preferable to normalize the volume so that the volume is the same in order to eliminate the difference in volume between the first and second key presses.

比較処理（Ｓ１７）は、例えば１回目のモデルと比べて２回目のモデルに足りない不足分のパワースペクトルの合計を距離として算出し、これを所定の閾値と比較し、距離が閾値を超える場合には不一致と判断し、距離が閾値以下の場合には一致と判断する方法により実行できる。この方法は、ピアノなどのような残響が多く残る楽器には有効な方法である。このとき、距離が閾値以下の場合でも押鍵された鍵の楽音のパワースペクトルに構成音ごとの特徴が見つからない場合には不一致と判断するようにすることもできる。 The comparison process (S17) can be carried out, for example, by summing the portions of the power spectrum by which the second model falls short of the first model, treating that sum as a distance, and comparing it with a predetermined threshold: if the distance exceeds the threshold the models are judged not to match, and if it is at or below the threshold they are judged to match. This method is effective for instruments with long reverberation, such as the piano. Even when the distance is at or below the threshold, a mismatch may additionally be declared if the features of each constituent tone cannot be found in the power spectrum of the pressed key's tone.
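
The shortfall-based distance described for S17 can be sketched as follows. This is only an illustration under stated assumptions: spectra are plain lists of per-bin power, the function names are hypothetical, and normalizing each spectrum to unit total power is one possible way of equalizing the volume of the two key presses, not necessarily the one used by the device.

```python
def normalize(spectrum):
    """Scale a spectrum so its total power is 1 (assumed volume normalization)."""
    total = sum(spectrum)
    if total == 0:
        return list(spectrum)
    return [p / total for p in spectrum]


def spectrum_distance(model, observed):
    """Sum, over all bins, of the power that `observed` lacks relative to `model`.

    Bins where `observed` has more power than `model` contribute nothing,
    so residual reverberation from other notes does not increase the distance.
    """
    if len(model) != len(observed):
        raise ValueError("spectra must have the same number of bins")
    return sum(max(m - o, 0.0) for m, o in zip(model, observed))


def spectra_match(model, observed, threshold):
    """Match when the shortfall distance is at or below the threshold."""
    return spectrum_distance(normalize(model), normalize(observed)) <= threshold
```

For example, `spectra_match([0, 4, 0, 2], [1, 8, 1, 4], 0.2)` is true because the second spectrum contains every component of the first, while `spectra_match([0, 4, 0, 2], [4, 0, 2, 0], 0.2)` is false because both model components are missing.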

Ｓ１７で不一致と判断した場合、Ｓ４に戻って再度１回目のモデル作成からやり直す。この再度の処理で作成されたモデルは、先に記憶されたモデルに上書きされる。 If it is determined that there is a mismatch in S17, the process returns to S4 and starts again from the first model creation. The model created by this re-processing is overwritten on the previously stored model.

Ｓ１７で一致と判断すれば、Ｓ１０でＲＡＭ３に記憶したモデルを正しいモデルとして残したままにしておく。次に、作成されたモデルの音がモデル作成の最終音かどうかを判断する（Ｓ１８）。Ｓ２で設定された音域の最終音（最高音）ではないと判断すれば、例えば半音高い次の音高の音にするなどモデル作成演奏音を１つ進め（ＲＡＭ３にセット）てＳ４に戻り（Ｓ１９）、その音について同様の処理を繰り返す。また、Ｓ１８で最終音と判断すれば、作成された全ての音高の音のモデルをＲＡＭ３または外部記憶装置９に保存し（Ｓ２０）、モデル作成を終了する。 If it is determined that they match in S17, the model stored in RAM 3 in S10 is left as a correct model. Next, it is determined whether the sound of the created model is the final sound of model creation (S18). If it is determined that it is not the final sound (highest sound) in the range set in S2, the model-created performance sound is advanced by one (set in the RAM 3), for example, the sound of the next higher pitch is set, and the process returns to S4 ( S19) The same processing is repeated for the sound. If it is determined in S18 that the sound is the final sound, all the created pitch models are stored in the RAM 3 or the external storage device 9 (S20), and the model creation is terminated.

次に、アタック検出処理（Ｓ５）について説明する。アタック検出は、ノイズや入力信号レベルに応じた閾値に基づいて行う。モデルとレッスン時の演奏音のパワースペクトルはできるだけ早い段階で一致検出することが望ましいので、モデル作成時に押鍵後の演奏音のアタックを検出し、アタック検出時点のパワースペクトルをモデルとして記憶する。 Next, the attack detection process (S5) will be described. Attack detection is performed based on a threshold corresponding to noise or an input signal level. Since it is desirable to detect the power spectrum of the model and the performance sound at the time of lesson as early as possible, the attack of the performance sound after the key depression is detected at the time of model creation, and the power spectrum at the time of the attack detection is stored as a model.

アタック検出に際しては、まず最初に、ピアノＰを弾いていない状態で余分な音を立てないようにして、音量測定部１１で騒音（暗騒音）の音量を測定する。この測定時間は特に問題にならないが、例えば5秒程度でよく、余り長いと騒音が入り込む可能性が大きくなるので1秒程度でもよい。 At the time of attack detection, first, the volume measuring unit 11 measures the volume of noise (background noise) without making an excessive sound while the piano P is not being played. This measurement time is not particularly problematic, but may be, for example, about 5 seconds, and if it is too long, the possibility of noise entering increases, so it may be about 1 second.

音量測定部１１は、Ａ／Ｄ変換器３からの出力サンプルがＮ個たまるごとに下記式で音量を測定する。なお、Ｓ_ｉはｉ番目の出力（振幅）を表す。 The volume measuring unit 11 measures the volume using the following formula each time N output samples from the A/D converter 3 have accumulated, where S_i denotes the i-th output (amplitude).

サンプリング周波数が11025Hzの場合、Ｎの値は128程度が好ましく、この場合には約12msecごとに繰り返し音量が計算されることになる。上記式で求めた音量Ｐ_Ｗの例えば5秒間の最大値をノイズ音量の最大値Ｐ_{ＷＮＯＩＳＥ}として保存する。 When the sampling frequency is 11025 Hz, N is preferably about 128, in which case the volume is recalculated roughly every 12 msec. The maximum of the volume P_W obtained from the above formula over, for example, 5 seconds is stored as the maximum noise volume P_{WNOISE}.
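
The block-wise volume measurement and the noise maximum can be sketched as below. The volume formula itself is not reproduced in this text, so the sketch assumes the mean absolute amplitude of each block of N samples, which is consistent with the later statement that P_W is the average amplitude of the digital waveform; the function names are hypothetical.

```python
SAMPLE_RATE_HZ = 11025
BLOCK_SIZE = 128

# 128 samples at 11025 Hz last about 11.6 ms, i.e. the "about 12 msec" in the text.
FRAME_PERIOD_MS = 1000.0 * BLOCK_SIZE / SAMPLE_RATE_HZ


def frame_volumes(samples, n=BLOCK_SIZE):
    """Return one volume per complete block of n samples.

    Assumption: volume = mean absolute amplitude of the block.
    A trailing incomplete block is ignored.
    """
    volumes = []
    for start in range(0, len(samples) - n + 1, n):
        block = samples[start:start + n]
        volumes.append(sum(abs(s) for s in block) / n)
    return volumes


def noise_volume_max(samples, n=BLOCK_SIZE):
    """Largest block volume in a recording of background noise (P_WNOISE)."""
    volumes = frame_volumes(samples, n)
    return max(volumes) if volumes else 0.0
```

Passing a few seconds of samples recorded while the piano is silent to `noise_volume_max` gives the value that the text stores as P_{WNOISE}.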

ノイズ音量の最大値Ｐ_{ＷＮＯＩＳＥ}が保存されたら、これにアタックとみなす音量Ｐ_{ＷＡＴＴＡＣＫ}を加算した値をアタック検出の音量閾値Ｐ_ＷＴＨとする。つまり、ノイズ音量以上にある程度の音量が入力されたらピアノＰが演奏されたと判断する。図５にＰ_{ＷＮＯＩＳＥ}、Ｐ_{ＷＡＴＴＡＣＫ}およびＰ_ＷＴＨの関係を示す。 Once the maximum noise volume P_{WNOISE} has been stored, the attack-detection volume threshold P_{WTH} is set to this value plus the volume P_{WATTACK} that is regarded as an attack. In other words, the piano P is judged to have been played when the input exceeds the noise volume by a certain amount. FIG. 5 shows the relationship among P_{WNOISE}, P_{WATTACK}, and P_{WTH}.
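
A minimal sketch of the volume-based attack threshold follows. The function names are hypothetical, and treating "exceeds the threshold" as a strict comparison is an assumption; the text only says that a certain volume above the noise level indicates that the piano was played.

```python
def attack_threshold(p_wnoise, p_wattack):
    """P_WTH = P_WNOISE + P_WATTACK."""
    return p_wnoise + p_wattack


def first_attack_index(volumes, threshold):
    """Index of the first block volume above the threshold, or None if there is none."""
    for index, volume in enumerate(volumes):
        if volume > threshold:
            return index
    return None
```

With a noise maximum of 20 and P_{WATTACK} of 100, the threshold is 120, and the block volumes `[5, 18, 90, 300, 250]` give an attack at index 3.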

以上で説明したアタック検出は、入力されるデジタル信号の音量または振幅の影響を受けるため、マイクロフォン１の入力レベルが適正に設定される必要がある。 Since the attack detection described above is affected by the volume or amplitude of the input digital signal, the input level of the microphone 1 needs to be set appropriately.

図６は、マイクロフォン２とＡ／Ｄ変換器３の部分（図１）の詳細構成を示すブロック図である。マイクロフォン２の出力はマイクボリューム２０１に入力され、次にマイクアンプ２０２を経由してＡ／Ｄ変換器３に入力される。マイクボリューム２０１を調節することによってマイクの感度や入力音の音量に応じてＡ／Ｄ変換器３の入力信号レベルを変更することができる。楽音判定装置がパーソナルコンピュータに実装される場合などでは録音デバイスのミキサコントロールがこれに相当する。 FIG. 6 is a block diagram showing the detailed configuration of the microphone 2 and A/D converter 3 portion of FIG. 1. The output of the microphone 2 is input to the microphone volume 201 and then passes through the microphone amplifier 202 to the A/D converter 3. Adjusting the microphone volume 201 changes the input signal level of the A/D converter 3 to suit the sensitivity of the microphone and the loudness of the input sound. When the musical tone determination device is implemented on a personal computer, for example, the mixer control of the recording device serves this purpose.

ピアノＰの演奏音をマイクロフォン２で取り込んでＡ／Ｄ変換器３に入力するとき、Ａ／Ｄ変換後のデジタル信号がノイズに埋もれたり、クリップされないようにするために、モデル作成時に表示装置８にピークメータを表示する。 When the performance sound of the piano P is captured by the microphone 2 and input to the A/D converter 3, a peak meter is displayed on the display device 8 during model creation so that the digital signal after A/D conversion is neither buried in noise nor clipped.

ピークメータは、例えば図７に示すように、点灯個数の振れ具合を変化（色変化）させてＡ／Ｄ変換後のデジタル信号の一定時間ごとの振幅の最大値を表示し、その最大値が０のときには１つも点灯しない表示になる。また、その最大値がデジタル信号の量子化ビット数によって決まる振幅の最大値もしくはその半分程度になったとき全てが点灯する表示になる。 As shown in FIG. 7, for example, the peak meter displays the maximum amplitude of the A/D-converted digital signal in each fixed time interval by varying the number of lit segments (with a color change); when that maximum is 0, no segment is lit. When the maximum reaches the largest amplitude permitted by the number of quantization bits of the digital signal, or about half of it, all segments are lit.

ピアノＰを強めに弾いたときに、上記のように設定されたピークメータの７〜８割程度が点灯するようにマイクボリューム２０１を調節すれば、ノイズに埋もれたり、クリップされたりしない適切な入力信号レベルにすることができる。なお、この調節の際に弾く鍵は高音域より中音域や低音域の方が適している。中音域や低音域の方が適正な音量が得られるからである。また、振幅の最大値を更新する一定時間間隔は、10〜50msec程度が適当である。余り速すぎるとピークメータの動きに目が追いつかないし、余り遅すぎると瞬間的な音量の変化が表示されなくなってしまうからである。 If the microphone volume 201 is adjusted so that about 70 to 80% of the peak meter set up as above lights when the piano P is played fairly strongly, an appropriate input signal level is obtained that is neither buried in noise nor clipped. The key played for this adjustment is better taken from the middle or low register than from the high register, because those registers yield an adequate volume. A suitable interval for updating the maximum amplitude is about 10 to 50 msec: if it is too short the eye cannot follow the meter, and if it is too long momentary changes in volume are no longer displayed.

アタックとみなす音量Ｐ_{ＷＡＴＴＡＣＫ}は、マイク入力レベルを適正に設定するために表示したピークメータの振れ具合に応じて変える。マイクボリューム２０１の調節によってピークメータの７〜８割程度が振れるようにできれば問題はないが、マイクロフォン２の感度が低いため、マイクボリューム２０１を最大限にしてもピークメータが余り振れないような場合には、７〜８割程度が振れるようにした場合と同じ閾値では、思いっきり強く弾いてもアタックが検出されなかったり、かなり強く弾かないとアタックが検出されなかったりする。また、ピアノのような楽器では余り強く弾くと高い倍音が強く出るため、強く弾いてモデルを作成すると、レッスン時にも同じように強く弾かないと一致が得られなくなってしまう。 The volume P_{WATTACK} regarded as an attack is varied according to the deflection of the peak meter displayed for setting the microphone input level. There is no problem if adjusting the microphone volume 201 brings the deflection to about 70 to 80%. However, if the sensitivity of the microphone 2 is so low that the peak meter barely moves even with the microphone volume 201 at maximum, then with the same threshold as in the 70 to 80% case an attack may go undetected even when the key is struck as hard as possible, or may be detected only when the key is struck very hard. Moreover, on an instrument such as the piano, striking too hard brings out strong high harmonics, so a model created from a hard strike cannot be matched during lessons unless the key is struck equally hard.

ピークメータは、音量Ｐ_ＷとしてＡ／Ｄ変換後のデジタル波形の振幅の平均値を表示しているので、ピークメータが半分しか振れない場合にはアタックとみなす音量Ｐ_{ＷＡＴＴＡＣＫ}も半分程度にすればよい。つまり、通常（強めに弾いたときにピークメータが７〜８割程度振れる）時のアタックとみなす音量Ｐ_{ＷＡＴＴＡＣＫ}が100位で丁度よいことが実験的に分かっているとき、ピークメータがその半分位しか振れないときにはアタックとみなす音量Ｐ_{ＷＡＴＴＡＣＫ}も半分の50位にすればよい。 Because the peak meter displays, as the volume P_W, the average amplitude of the A/D-converted digital waveform, when the peak meter deflects only half as far the volume P_{WATTACK} regarded as an attack can likewise be roughly halved. That is, if it is known experimentally that a P_{WATTACK} of about 100 is appropriate under normal conditions (the peak meter deflects about 70 to 80% when the key is struck fairly strongly), then when the peak meter deflects only about half that much, P_{WATTACK} should also be halved to about 50.
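
The proportional adjustment described above can be sketched as a simple scaling. The function name and the choice of 0.75 as the nominal deflection (the middle of the 70 to 80% band) are assumptions made for illustration.

```python
def scaled_level(nominal_level, observed_deflection, nominal_deflection=0.75):
    """Scale a level chosen for the nominal meter deflection to the observed one.

    Deflections are fractions of full scale (0.0 to 1.0). A meter that swings
    only half as far as nominal yields half the level.
    """
    if nominal_deflection <= 0:
        raise ValueError("nominal_deflection must be positive")
    return nominal_level * (observed_deflection / nominal_deflection)
```

For instance, `scaled_level(100, 0.375)` gives 50.0, matching the example in which P_{WATTACK} is halved from about 100 to about 50.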

次に、音量（振幅）チェック処理（Ｓ９）で用いられる音量最大値許容範囲の設定について説明する。図８は、あるピアノを最低音の音高（ノートナンバ）21から最高音のノートナンバ108までの88鍵を、ほぼ同じ強さで弾いたときの音高対音量最大値の特性を示す。なお、音量最大値とは各鍵を弾いたときに上記式により算出される音量Ｐ_Ｗの最大値のことである。同図にみられるように、同じ強さで弾いた場合でも音域（音高）によって音量値に大きな差があることが分かる。特に最高音に近い音域では非常に小さい音量最大値しか得られない。このように音域によって音量最大値に大きな差があるため、単純に音量最大値が同じ位になるようにガイドしたのでは同じ強さで弾いてもらうことはできない。 Next, the setting of the allowable range of the maximum volume used in the volume (amplitude) check (S9) is described. FIG. 8 shows the pitch versus maximum-volume characteristic obtained when the 88 keys of a piano, from the lowest note (note number 21) to the highest note (note number 108), are played with roughly the same strength. Here the maximum volume is the maximum of the volume P_W calculated with the above formula when each key is played. As the figure shows, the volume differs greatly with register (pitch) even at the same playing strength; in particular, only a very small maximum volume is obtained near the highest note. Because the maximum volume differs so much between registers, simply guiding the performer toward the same maximum volume for every note will not make the performer play with the same strength.

図８の音高対音量最大値の特性が予め分かっている場合にはこの特性に従って音量最大値の許容範囲を変化させればよいが、この特性は個々のピアノによって異なるし、マイクロフォンの特性によっても音域ごとの音量が異なってくるので、全ての装置に共通的な音量最大値許容範囲を予め設定しておくことはできない。 If the characteristics of the pitch vs. volume maximum value in FIG. 8 are known in advance, the allowable range of the volume maximum value may be changed according to this characteristic, but this characteristic varies depending on the individual piano and depends on the characteristics of the microphone. However, since the sound volume differs for each sound range, a sound volume maximum value allowable range common to all devices cannot be set in advance.

これは、以下のような方法で個々の装置において音量最大値許容範囲を設定することにより解決できる。ピアノによって個体差があり、マイクロフォンによって周波数特性が異なるが、アコースティックピアノで一般的なマイクロフォンを使う限り最高音あたりの音量が最も小さく、中音域あたりの音量はかなり大きいことに間違いない。そこで、予め最高音と中音域の１音（中央のドなど）をモデル作成にふさわしい普通の強さ、またはやや弱いタッチで弾いてもらい、そのときの音量最大値を保存しておく。 This can be solved by setting the allowable range of the maximum volume on each individual device as follows. Pianos differ from one another and microphones differ in frequency response, but as long as an ordinary microphone is used with an acoustic piano, the volume is certainly lowest around the highest note and considerably higher in the middle register. The performer is therefore asked beforehand to play the highest note and one mid-range note (such as middle C) with a normal touch suitable for model creation, or a slightly soft one, and the maximum volume of each is saved.

この２つの音量最大値にある程度の幅を持たせた範囲を音量最大値許容範囲とする。つまり、最高音の音量最大値をＰ_ＷＨ、中音域の音量最大値をＰ_ＷＭとすると、入力された音の音量最大値がＰ_ＷＨ×ＲからＰ_ＷＭ／Ｒ′（Ｒ、Ｒ′は1.5程度の数値）の範囲に入っているかどうかをチェックし、入っていない場合には「もっと小さく弾いて下さい」や「もっと大きく弾いて下さい。」とメッセージを出すなどして、再度押鍵してもらうようにする。なお、ＲとＲ′は同じ値でも異なる値でもよい。 The allowable range of the maximum volume is a range that gives these two maximum volumes a certain margin. That is, with P_{WH} the maximum volume of the highest note and P_{WM} the maximum volume of the mid-range note, it is checked whether the maximum volume of the input sound lies within the range from P_{WH} × R to P_{WM} / R′ (R and R′ being values of about 1.5). If it does not, a message such as "Please play more softly." or "Please play louder." is issued and the performer is asked to press the key again. R and R′ may be equal or different.
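
The range check of S9 can be sketched as below. The bounds follow the expression exactly as stated in the text (P_WH × R as the lower bound and P_WM / R′ as the upper bound); the function name and the returned message strings are illustrative assumptions.

```python
def check_peak_volume(peak, p_wh, p_wm, r=1.5, r_prime=1.5):
    """Return a prompt for the performer, or None when the peak is acceptable.

    peak    : maximum volume measured since the key-press instruction
    p_wh    : reference maximum volume of the highest note
    p_wm    : reference maximum volume of the mid-range note
    r, r_prime : margins of about 1.5, as stated in the text
    """
    low = p_wh * r
    high = p_wm / r_prime
    if peak > high:
        return "Please play more softly."
    if peak < low:
        return "Please play louder."
    return None
```

With reference values of 100 for the highest note and 3000 for the mid-range note, the accepted range is 150 to 2000, so a peak of 1000 is accepted while 2500 and 100 each produce a prompt and a retry from S4.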

また、モデル作成にふさわしい普通の強さで弾くということが演奏者の主観が入って難しく、また同じ位の強さで弾くということが難しい場合には、最高音と中音域の１音を思いっきり強く弾いてもらって音量最大値を測定し、それを何分の一かにした音量を普通の強さで弾いたときの音量と類推し、この音量を基準値として音量最大値許容範囲を設定するようにしてもよい。ピアノの構造上、強い音といっても限界があるので、この方法により個人差のないデータを入力することができる。 Playing with a normal strength suitable for model creation involves the performer's subjective judgment and is difficult, as is playing repeatedly with about the same strength. In such cases the performer may instead be asked to strike the highest note and one mid-range note as hard as possible; the maximum volume is measured, a fixed fraction of it is taken as an estimate of the volume at normal strength, and the allowable range of the maximum volume is set using this estimate as the reference. Because the construction of the piano limits how loud a note can be, this method yields data free of individual differences.

また、モデル作成する音域の中の数カ所の音を弾いてもらい、それにより得られる音量最大値から音高対音量最大値の特性を作成し、この特性から音量最大値の許容範囲を決めることもできる。この場合も、標準の強さで弾いてもらうことが難しい場合には各鍵を思いっきり強く弾いてもらって音量最大値を測定し、それを何分の一かにした音量を音量最大値許容範囲の基準値とすることができる。 It is also possible to have several notes in the range to be modeled play, create a characteristic of pitch vs. maximum volume from the maximum volume value obtained, and determine the allowable range of maximum volume from this characteristic. it can. In this case as well, if it is difficult to play with standard strength, measure the maximum volume level by playing each key as hard as possible. It can be a reference value.

なお、各鍵の押鍵強さを判断するのに使う数量は、必ずしも音量最大値である必要はなく、Ａ／Ｄ変換後のデジタル波形の振幅の最大値などでもよい。また、最大値ではなく平均値を使うこともできる。 Note that the quantity used to determine the key-pressing strength of each key is not necessarily the maximum volume value, and may be the maximum value of the amplitude of the digital waveform after A / D conversion. It is also possible to use the average value instead of the maximum value.

次に、離鍵指示（Ｓ７）と無音検出処理（Ｓ８）について説明する。図９は、モデル作成時に表示装置８に鍵盤を表示し、押鍵する鍵のみの色を変えて押鍵を指示する押鍵指示の例を示す。押鍵指示と同時に音源４に指示してスピーカ５から発音させるようにしてもよい。 Next, the key release instruction (S7) and the silence detection process (S8) will be described. FIG. 9 shows an example of a key pressing instruction in which a keyboard is displayed on the display device 8 at the time of model creation, and the key pressing is instructed by changing the color of only the key to be pressed. Simultaneously with the key pressing instruction, the sound source 4 may be instructed to generate sound from the speaker 5.

押鍵指示に従って押鍵され、パワースペクトルを作成し、音量を算出したら、離鍵を指示する（Ｓ７）。離鍵指示は、押鍵指示の反対であり、表示装置８に表示した鍵の色を元に戻すか、または押鍵指示のときとは異なる色で表示し、以下に説明するように無音になったと判断したときに鍵の色を元に戻すことにより行う。なお、スピーカ５から発音させている場合にはその停止を音源４に指示する。 When the key is pressed in accordance with the key pressing instruction, a power spectrum is created, and the volume is calculated, the key release is instructed (S7). The key release instruction is the opposite of the key press instruction, and the key color displayed on the display device 8 is restored or displayed in a different color from the key press instruction, and is silent as will be described below. This is done by returning the key color when it is determined that it has become. If sound is produced from the speaker 5, the sound source 4 is instructed to stop the sound.

離鍵後も残響音が残っているので、次の音のモデル作成のための押鍵あるいは２回目の押鍵の前に残響音が消えるのを待つ（Ｓ８）。残響音の音量は、音量Ｐ_Ｗを求める上記式によって求めることができる。 Because reverberation remains after the key is released, the device waits for it to die away before the key press for the next note's model or the second key press (S8). The volume of the reverberation can be obtained with the above formula for the volume P_W.

図１０は、アコースティックピアノにおいて最低音（ノートナンバ21）を押鍵してすぐに離鍵したときに得られた音量変化を示し、実測により得られたものである。同図のように、離鍵後の残響音は音量はアタック直後に著しく低下するが、その後は上下しながら徐々に減衰していくので、音量が単純に閾値Ｐ_ＷＴＨを一度下回ったということで無音と判断すると、その後にまた音量が上がってしまう可能性がある。そこで、閾値Ｐ_ＷＴＨ以下になっても音量を追跡するとともに経過時間をカウントし、再び音量が閾値Ｐ_ＷＴＨを上回ったら経過時間をクリアする。そして、閾値Ｐ_ＷＴＨ以下の経過時間が連続して例えば１秒になって初めて無音と判断する。 FIG. 10 shows the measured change in volume when the lowest note (note number 21) of an acoustic piano is pressed and immediately released. As shown, the volume drops sharply immediately after the attack, but the reverberation after key release then decays gradually while fluctuating, so if silence were declared the first time the volume fell below the threshold P_{WTH}, the volume might rise again afterwards. Therefore the volume continues to be tracked after it falls to P_{WTH} or below while the elapsed time is counted, and the elapsed time is cleared whenever the volume rises above P_{WTH} again. Silence is declared only when the volume has remained at or below P_{WTH} continuously for, say, 1 second.
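
The hold-time logic for silence detection can be sketched as follows, assuming the volume arrives as a sequence of block values at a fixed period; the function and parameter names are hypothetical.

```python
def silence_frame_index(volumes, threshold, frame_period_s, hold_s=1.0):
    """Return the index of the block at which silence is declared, or None.

    The elapsed time accumulates while the volume stays at or below the
    threshold and is cleared whenever the volume rises above it again.
    """
    elapsed = 0.0
    for index, volume in enumerate(volumes):
        if volume <= threshold:
            elapsed += frame_period_s
            if elapsed >= hold_s:
                return index
        else:
            elapsed = 0.0  # reverberation swelled again: restart the count
    return None
```

With a threshold of 50, a period of 0.25 s and the volumes `[500, 40, 30, 120, 40, 30, 20, 10, 5]`, the brief rise to 120 clears the count, and silence is declared at index 7 after four consecutive quiet blocks.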

ここで用いる閾値Ｐ_ＷＴＨは以下のようにして決定できる。モデル作成に先立ち、ノイズの音量を測定する。これは、演奏者に音を立てないように静かにしてもらってマイクロフォン１０から１〜５秒程度の間の音を取り込み、上記式により音量を算出すればよい。これにより測定された音量の最大値をノイズ音量の最大値Ｐ_{ＷＮＯＩＳＥ}とする。閾値Ｐ_ＷＴＨはノイズ音量の最大値Ｐ_{ＷＮＯＩＳＥ}に無音とみなす音量Ｐ_{ＷＳＩＬＥＮＴ}を加えたものとする。 The threshold P_{WTH} used here can be determined as follows. Before model creation, the noise volume is measured: the performer is asked to stay quiet, sound is captured from the microphone 10 for about 1 to 5 seconds, and the volume is calculated with the above formula. The maximum volume measured in this way is taken as the maximum noise volume P_{WNOISE}. The threshold P_{WTH} is P_{WNOISE} plus the volume P_{WSILENT} regarded as silence.

無音とみなす音量Ｐ_{ＷＳＩＬＥＮＴ}は小さいほど無音に近い状態で判断できるが、音の減衰は時間が経過するほど減衰する音量の程度が減っていくので、余り小さくすると音量が閾値Ｐ_ＷＴＨ以下になるまでの時間が著しく長くなってしまうので、パワースペクトルに現れないくらいの音量まで低下したら無音と判断されるように設定するのがよい。 The smaller the volume P_{WSILENT} regarded as silence, the closer to true silence the judgment is made. However, a sound decays more and more slowly as time passes, so if P_{WSILENT} is too small the time until the volume falls to the threshold P_{WTH} or below becomes very long. It is therefore advisable to set it so that silence is declared once the volume has fallen to a level that no longer shows up in the power spectrum.

また、無音とみなす音量Ｐ_{ＷＳＩＬＥＮＴ}をＡ／Ｄ変換後のデジタル信号の振幅に応じて変えるようにしてもよい。例えばＡ／Ｄ変換後のデジタル信号の一定時間ごとの振幅のピークを図７に示すようなピークメータで表示するとき、通常（強めに弾いたときにピークメータが７〜８割振れる）時の無音とみなす音量Ｐ_{ＷＳＩＬＥＮＴ}が50位の値で丁度よいことが実験的に分かっているとき、ピークメータがその半分位しか振れないときには無音とみなす音量Ｐ_{ＷＳＩＬＥＮＴ}もその半分の25位にする。 The volume P_{WSILENT} regarded as silence may also be varied according to the amplitude of the A/D-converted digital signal. For example, when the peak amplitude of the digital signal in each fixed time interval is displayed on a peak meter such as that of FIG. 7, and it is known experimentally that a P_{WSILENT} of about 50 is appropriate under normal conditions (the peak meter deflects 70 to 80% when the key is struck fairly strongly), then when the peak meter deflects only about half that much, P_{WSILENT} is likewise halved to about 25.

図１１（ａ）は、アコースティックピアノにおいてノートナンバ48（中央のドよりも１オクターブ低いドの音）を普通の強さで押鍵したときに得られた音量変化を示し、実測により得られたものである。なお、図１１（ｂ）は、その一部の拡大図である。同図のように、音量はアタック直後に著しく低下するが、その後は上下しながら徐々に減衰し、10秒前後でノイズと区別が付かない程度まで減衰する。低音になるほどこの減衰にかかる時間は長くなるが、次の音のモデル作成のための押鍵あるいは２回目の押鍵に早く入れるように、無音と判断できる程度に音量が減衰するのをできるだけ早く検知してその旨をガイドするのがよい。 FIG. 11(a) shows the measured change in volume when note number 48 (the C one octave below middle C) is pressed with normal strength on an acoustic piano. FIG. 11(b) is an enlarged view of part of it. As shown, the volume drops sharply immediately after the attack, then decays gradually while fluctuating, and after about 10 seconds becomes indistinguishable from noise. The lower the note, the longer this decay takes; nevertheless, so that the performer can proceed promptly to the key press for the next model or to the second key press, it is desirable to detect as early as possible that the volume has decayed enough to be regarded as silence and to inform the performer accordingly.

図１２〜図１５はそれぞれ、実測により得られたアタック直後、アタックから約1.4秒後、アタックから約4.1秒後、アタックから約5.8秒後のパワースペクトルを示す。また、図１６は、次のモデル作成音（ノートナンバ50）をやや弱く押鍵したときに得られたパワースペクトルを示す。 12 to 15 show power spectra obtained immediately after the attack, about 1.4 seconds after the attack, about 4.1 seconds after the attack, and about 5.8 seconds after the attack, respectively. FIG. 16 shows a power spectrum obtained when the next model creation sound (note number 50) is depressed slightly weakly.

図１２〜図１５と図１６を比較すれば、図１１においてどの程度に音量が減衰すれば次のモデル作成が影響されないかが判断できる。この例の音域の場合、弱めに弾いても比較的大きなパワースペクトルが得られているので、アタックから1.4秒後や4.1秒後程度でも次のモデル作成への影響は殆どない。つまり無音とみなす音量Ｐ_{ＷＳＩＬＥＮＴ}は250〜750程度でも構わないことになるが、もっと小さくしてできるだけ無音に近い状態で次のモデル作成に移る方が望ましい。 Comparing FIGS. 12 to 15 with FIG. 16 shows how far the volume in FIG. 11 must decay before it no longer affects creation of the next model. In the register of this example a relatively large power spectrum is obtained even from a soft key press, so the residual sound 1.4 or 4.1 seconds after the attack has almost no effect on the next model. In other words, P_{WSILENT} could be as large as about 250 to 750, but it is preferable to make it smaller and move on to the next model in a state as close to silence as possible.

一方、図１７は、最高音（ノートナンバ108）を普通の強さで押鍵したときに得られた音量変化を示し、最高でも300程度の音量しか出ないのが分かる。この場合の無音とみなす音量Ｐ_{ＷＳＩＬＥＮＴ}を250〜750程度としたのでは大きすぎる。図１０と図１６から、無音とみなす音量Ｐ_{ＷＳＩＬＥＮＴ}を全音域で同じとするならば、10〜75程度がふさわしいことが分かる。それより大きくては、最高音（ノートナンバ108）のように音量が出ない音域で少し弱めに押鍵してしまうと最大音量がＰ_{ＷＮＯＩＳＥ}＋Ｐ_{ＷＳＩＬＥＮＴ}を下回ってしまうことがあり、それより小さくては、少し大きめのノイズが入ると無音と判断できない状況が生じるからである。 FIG. 17, on the other hand, shows the change in volume obtained when the highest note (note number 108) is pressed with normal strength; the volume reaches only about 300 at most. In this case a P_{WSILENT} of about 250 to 750 is far too large. From FIG. 10 and FIG. 16, if P_{WSILENT} is to be the same across all registers, a value of about 10 to 75 is appropriate. With a larger value, a slightly weak key press in a register that produces little volume, such as the highest note (note number 108), could leave the maximum volume below P_{WNOISE} + P_{WSILENT}; with a smaller value, a slightly louder noise would prevent silence from being declared.

なお、図９の押鍵指示において黒丸印で示すように、モデル作成時に表示装置８に表示する鍵盤の中央「ド」などの鍵に目印表示Ｓを常時表示し、実際のピアノの鍵盤の中央「ド」の鍵にも同じ目印表示のシールＳを貼っておくことなどにより両鍵盤の対応を分かりやすくできる。また、通常、実際のピアノの鍵盤上側部にはメーカ名などのロゴ１０２が付されているが、このロゴ１０２の所定位置（図９では「Ｂ」の位置）に対応させた目印を表示装置８に表示させるなどしてもよい。これによりモデル作成時の負担を軽減することができる。 As indicated by black circles in the key pressing instruction in FIG. 9, the mark display S is always displayed on the key such as the center “do” of the keyboard displayed on the display device 8 when the model is created, and the center of the actual piano keyboard is displayed. The correspondence between the two keys can be easily understood by sticking the sticker S with the same mark on the “do” key. Further, a logo 102 such as a manufacturer name is usually attached to the upper part of the keyboard of an actual piano. A mark corresponding to a predetermined position (position “B” in FIG. 9) of the logo 102 is displayed on the display device. 8 may be displayed. Thereby, the burden at the time of model creation can be reduced.

図１８は、レッスンフェーズのＣＰＵ１の処理を示すフローチャートである。レッスン開始が指示されたとき、まず、ＲＡＭ３または外部記憶装置９からモデルをワーク領域に読み込む（Ｓ２１）。読み込まれるモデルは、モデル作成フェーズで設定された音域に含まれる全ての音高の音のモデルである。ここでは、前回作成したモデルまたは前回使用したモデルを自動的に読み込むようにしてもよいし、これまで作成したモデルの中から演奏者に選択させて読み込むようにしてもよい。 FIG. 18 is a flowchart showing the processing of the CPU 1 in the lesson phase. When the lesson start is instructed, first, the model is read from the RAM 3 or the external storage device 9 into the work area (S21). The model to be read is a model of all pitches included in the range set in the model creation phase. Here, the model created last time or the model used last time may be automatically read, or the player may select and read from the models created so far.

次に、演奏者がレッスンする楽曲を選択する（Ｓ２２）。この際、ＲＡＭ３または外部記憶装置９に予め記憶されているレッスン曲データを検索して楽曲の一覧を作成し、この一覧の中から演奏者がレッスンする楽曲を自分で選択するようにしてもよいし、演奏者のレベルにあった楽曲を装置側で自動的に選択して提示するようにしてもよい。レッスンする楽曲が決まったら、この楽曲の演奏情報をＲＡＭ３または外部記憶装置９から読み出す（Ｓ２３）。 Next, the performer selects the piece to practice (S22). Here, the lesson piece data stored in advance in the RAM 3 or the external storage device 9 may be searched to build a list of pieces from which the performer chooses, or the device may automatically select and present a piece suited to the performer's level. Once the piece is decided, its performance information is read from the RAM 3 or the external storage device 9 (S23).

図１９は、レッスンで表示装置８に押鍵指示を表示したり伴奏をしたりするのに使用される演奏情報のフォーマット例を示す図である。演奏情報は、少なくともイベントデータ、およびイベントデータの読み出しタイミングを指示するタイミングデータを含む。イベントデータは、ノートナンバ（音高）を含むノートオンデータおよびノートオフデータからなる。タイミングデータは、例えば１つのイベント終了から次のイベント発生までの時間情報として設定される。イベントデータとタイミングデータは、図示のように、アドレス進行に従って記憶される。この演奏情報は、外部記憶装置９に記憶しておくことができる。 FIG. 19 shows an example of the format of the performance information used in a lesson to display key-press instructions on the display device 8 and to provide accompaniment. The performance information includes at least event data and timing data that specifies when the event data is to be read out. The event data consists of note-on data and note-off data, each including a note number (pitch). The timing data is set, for example, as the time from the end of one event to the occurrence of the next. As illustrated, the event data and timing data are stored in order of increasing address. This performance information can be stored in the external storage device 9.
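
The event and timing layout described for FIG. 19 can be pictured with the following sketch. The class names, field names, and the time unit are hypothetical; the text only states that note-on/note-off events carrying a note number alternate with timing data giving the wait until the next event.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union


@dataclass(frozen=True)
class NoteEvent:
    note_on: bool      # True for note-on, False for note-off
    note_number: int   # pitch, e.g. 60 for middle C


@dataclass(frozen=True)
class Timing:
    wait: int          # time until the next event (unit is an assumption)


Item = Union[NoteEvent, Timing]


def schedule(performance: List[Item]) -> List[Tuple[int, NoteEvent]]:
    """Walk the items in stored order and attach an absolute time to each event."""
    now = 0
    timeline = []
    for item in performance:
        if isinstance(item, Timing):
            now += item.wait
        else:
            timeline.append((now, item))
    return timeline


# A two-note example stored in address order.
SONG = [
    NoteEvent(True, 60), Timing(480), NoteEvent(False, 60), Timing(0),
    NoteEvent(True, 62), Timing(480), NoteEvent(False, 62),
]
```

Calling `schedule(SONG)` yields the four events at times 0, 480, 480 and 960, which is the order in which key-press instructions would be presented.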

次に、選択された楽曲に含まれる全ての音高の単音のモデルがＳ２１で読み込んだモデルに含まれているかどうかをチェックする（Ｓ２４）。ここで、選曲した楽曲の全ての音高の音の単音モデルが含まれていないと判断すれば、Ｓ２２に戻って別の楽曲の選択を指示するか、Ｓ２１に戻って別のモデルの読み込みを指示する。 Next, it is checked whether or not all the single tone models included in the selected music are included in the model read in S21 (S24). Here, if it is determined that the single tone models of all pitches of the selected music are not included, the process returns to S22 to instruct the selection of another music, or the process returns to S21 to read another model. Instruct.
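
The coverage check of S24 amounts to a set comparison between the pitches used in the piece and the pitches for which a model exists. A minimal sketch, with a hypothetical function name and note numbers as plain integers:

```python
def missing_pitches(song_pitches, model_pitches):
    """Return the sorted note numbers used in the song that have no model."""
    return sorted(set(song_pitches) - set(model_pitches))
```

If the models cover note numbers 48 to 72 and the piece uses 60, 64, 67 and 76, `missing_pitches` returns `[76]`, and the performer would be sent back to choose another piece (S22) or load another model set (S21). An empty list means the lesson can proceed to S25.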

Ｓ２４で、選曲した楽曲の全ての音高の音の単音モデルが含まれている判断すれば、この楽曲の最初の楽音の音高をＲＡＭ３上の変数にセットする（Ｓ２５）。同時に、この音高の音の鍵を演奏者に押鍵してもらうための押鍵指示を表示装置８に表示し（Ｓ２６）、さらに、Ｓ２１で読み込んだモデルからこの音高の音に対応する単音モデルを読み出す（Ｓ２７）。 If it is determined in S24 that a single tone model of all the pitches of the selected music is included, the pitch of the first musical tone of this music is set as a variable on the RAM 3 (S25). At the same time, a key pressing instruction for causing the performer to press the key of this pitch is displayed on the display device 8 (S26), and further, this pitch corresponds to the pitch from the model read in S21. A single tone model is read (S27).

The performer presses a key of the piano P while watching the key-press instruction displayed on the display device 8, and the resulting performance sound is picked up by the microphone 10. The analog signal input through the microphone 10 is converted into a digital signal by the A/D converter 6, and its volume is then detected by the volume measuring unit 11 (S28). The processing in S28 to S32 is the same as that in S13 to S17 of FIG. 2, so a detailed description is omitted. If S32 finds a mismatch, the process returns to S28 and starts again from volume detection.

If S32 finds a match, attack detection is first reset (S33), and it is then determined whether the tone is the last tone of the selected piece (S34). If it is not, the pitch of the next tone is set in the RAM 3 (S35) and the process returns to S26. This processing is repeated up to the last tone of the piece. When the end of the piece is reached, the performer is asked whether to repeat the lesson (S36) or to take a lesson on another piece (S37), and the process moves to the processing corresponding to the answer.
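The loop from S25 to S35 can be summarized as "stay on the current tone until the captured sound matches its model, then advance". The following sketch shows only that control flow. The callables `capture` and `matches` are hypothetical stand-ins for S28 to S31 and S32, and the returned attempt counts are added purely for illustration.

```python
from typing import Any, Callable, Dict, List, Sequence


def run_lesson(song: Sequence[int],
               models: Dict[int, Any],
               capture: Callable[[], Any],
               matches: Callable[[Any, Any], bool]) -> List[int]:
    """Step through the piece and return how many captures each tone needed."""
    attempts = []
    for note in song:                  # S25 / S35: set the pitch of the current tone
        model = models[note]           # S27: read the single-tone model
        tries = 0
        while True:
            tries += 1
            sound = capture()          # S28 onward: capture and analyse the sound
            if matches(model, sound):  # S32: compare with the model
                break                  # S33: move on to the next tone
        attempts.append(tries)
    return attempts
```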

FIG. 20 is a flowchart showing an example of the comparison between the first and second models (S17), or between a model and the power spectrum of the performance sound during a lesson (S32).

First, the model to be compared with the current performance sound is created from the model read in S12 or S27 (S41). For a single-tone performance sound the single-tone model is used as is; for a chord, the single-tone models are combined. To eliminate the difference in volume between the model and the current performance sound, the power spectra of both are normalized by volume so that the volume at model creation and the volume of the current performance sound become equal. For a chord, normalization uses the sum of the volumes recorded when the models of the constituent tones were created. Normalization may also be based on the peak value of the waveform instead of the volume.
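As a rough illustration of S41, the sketch below combines single-tone models into a chord model and normalizes a spectrum by volume. Two points are assumptions rather than statements from the text: the models are combined by adding power bin by bin, and normalization is a plain division by the volume value. The names `combine_models` and `normalize` are hypothetical.

```python
from typing import List, Sequence, Tuple

Model = Tuple[Sequence[float], float]  # (power spectrum, volume at model creation)


def combine_models(models: Sequence[Model]) -> Tuple[List[float], float]:
    """Build a chord model: add spectra bin by bin (assumed) and sum the volumes."""
    n = len(models[0][0])
    spectrum = [sum(m[0][k] for m in models) for k in range(n)]
    volume = sum(m[1] for m in models)
    return spectrum, volume


def normalize(spectrum: Sequence[float], volume: float) -> List[float]:
    """Scale a power spectrum by its volume so model and performance are comparable."""
    return [p / volume for p in spectrum]
```

For a single tone, `combine_models` with one entry returns that model unchanged, which matches the statement that the single-tone model is used as is.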

Once the model has been created, the total amount by which the power spectrum of the current performance sound falls short of the model is calculated as the distance (S42). This method is effective for instruments with long reverberation, such as a piano.
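The distance in S42 counts only the bins where the performance sound has less power than the model, so extra power (for example, reverberation from earlier tones) does not increase it. A minimal sketch, assuming both spectra are already normalized and have the same number of bins (the name `shortfall_distance` is hypothetical):

```python
from typing import Sequence


def shortfall_distance(model: Sequence[float], sound: Sequence[float]) -> float:
    """Sum, over all bins, how far the performance sound falls short of the model."""
    return sum(max(m - s, 0.0) for m, s in zip(model, sound))
```

Because surplus power is ignored, a sound that contains the model plus leftover resonance still yields a distance of zero.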

Next, a threshold for judging that the model and the current performance sound match is set (S43), and the distance calculated in S42 is compared with it (S44). If the distance is greater than or equal to the threshold, the result is a mismatch; if it is less than the threshold, the result is a match. Even when the distance is less than the threshold, the result may still be treated as a mismatch if the characteristic features of each constituent tone cannot be found in the power spectrum of the current performance sound (S45). If S44 or S45 finds a mismatch, the process returns to S4 (S28) and starts again from the key-press instruction (volume detection). If S45 finds a match, the process proceeds to S18 (S33).

FIG. 21 shows an example of a piano-roll (scrolling) display of key-press instructions during a lesson. A keyboard graphic K is displayed in the lower part of the display screen 101 of the display device 8, and key-press instruction marks M are displayed above it. The vertical direction of the display screen 101 is the time axis, and each key-press instruction mark M is displayed at the position corresponding to the key to be pressed, with a length corresponding to the duration of the tone. The mark M closest to the keyboard graphic K indicates the tone to be pressed next, and the marks indicate the tones to be pressed in order, starting from the one closest to the keyboard graphic K.

When S32 (FIG. 18) determines that the distance between the model of the tone to be pressed next and the power spectrum calculated from the performance sound of the piano P is less than or equal to the threshold, the piano roll scrolls downward and stops when the key-press instruction mark M of the next tone to be pressed has come down to the position closest to the keyboard graphic K. Because the piano roll stops scrolling until the instructed tone is pressed, the performer can press the next tone at his or her own pace. The performance information read in S23 (FIG. 18) can include not only the tones the performer is prompted to press but also accompaniment data. In that case, the accompaniment up to the timing of the next tone to be pressed is output from the speaker 5 through the sound source 4.

The key that should currently be held down is indicated by a mark m. The length of the mark m shows how much longer the key must be held, so the performer knows in advance when to release it. As indicated by the bar lines BL, this example displays key-press instructions for two bars on one screen at the same time, but the number of bars displayed at once and the range of performance information displayed are arbitrary.

Next, methods for reducing the action sound, noise components, and DC component contained in a model during model creation will be described. The natural instrument is described below as a piano (acoustic piano), but the same approach applies to other natural instruments.

First, a method for reducing the action sound contained in a model will be described. When the piano P is played while sound is captured from the microphone 2, the action sound of the piano P is inevitably captured as well. FIG. 22 shows an example of the spectrum of sound captured from the microphone 2.

The action sound has a different power spectrum every time and behaves like noise. It appears mainly as a power spectrum with strong components in the low range. Except for keys in the treble range, however, the power spectrum of the piano's own sound (the vibration of the strings) is considerably larger than that of the action sound, so the action sound has almost no effect on the comparison between the performance sound of the pressed key and the model.

For keys in the treble range, however, the power spectrum of the string vibration is small, so the action sound has a large influence, and its power spectrum is weighted heavily when the performance sound of the pressed key is compared with the model. As a result, a correct comparison can no longer be obtained. If the power spectrum of the action sound were the same every time, it could be subtracted out, but because it is noise-like and differs every time, it cannot be removed by simple subtraction.

Therefore, when a model is created, the power spectrum at or below the fundamental frequency f₀ of the pitch being modeled is cut. Because the piano P may be out of tune, the actual cutoff frequency is preferably slightly lower than the fundamental frequency f₀, for example one semitone lower. For pitches whose fundamental frequency is lower than the frequency components of the action sound, the power spectrum of the action sound cannot be removed in this way. As noted above, however, outside the treble range the power spectrum of the action sound is relatively small compared with that of the string vibration, so this causes no problem when the performance sound of a key pressed during a lesson is compared with the model.
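The low-frequency cut can be sketched as follows. The sketch assumes equal temperament with note number 69 tuned to 440 Hz (the usual MIDI convention, not stated in the text) and assumes that bin k of the spectrum sits at k times a fixed bin width. The names `cutoff_frequency` and `cut_below` are hypothetical.

```python
from typing import List, Sequence


def cutoff_frequency(note_number: int,
                     a4_hz: float = 440.0,
                     semitones_below: float = 1.0) -> float:
    """Fundamental frequency of the note, lowered by a safety margin in semitones."""
    f0 = a4_hz * 2.0 ** ((note_number - 69) / 12.0)
    return f0 * 2.0 ** (-semitones_below / 12.0)


def cut_below(spectrum: Sequence[float], bin_hz: float, cutoff_hz: float) -> List[float]:
    """Zero every bin whose centre frequency lies below the cutoff."""
    return [0.0 if k * bin_hz < cutoff_hz else p for k, p in enumerate(spectrum)]
```

With the default margin of one semitone, the cutoff for a given note equals the fundamental of the note directly below it, so a slightly flat piano still keeps its fundamental in the model.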

Next, a method for reducing noise components contained in a model will be described. When sound is captured from the microphone 2 during model creation, noise is normally mixed in. This includes ambient sound, noise picked up by the cable of the microphone 2, and noise entering the analog circuitry of the microphone amplifier and the A/D converter 3. There is no problem if the piano P is loud and the noise is small enough to be negligible by comparison. If a performance sound from a weak key press is used as a model as is, however, the volume of the model is low, so when the model and the performance sound of a key pressed during a lesson are normalized by volume for comparison, the noise components in the model are greatly amplified as well.

The performance sound of a key pressed during a lesson also contains noise, but if that sound is loud, the power spectrum of the noise components is relatively low and remains small even after normalization by volume. The model and the performance sound of the key pressed during the lesson therefore fail to match.

To eliminate this problem, sound is captured from the microphone 2 for several seconds while the performer is not playing the piano P. The captured input is subjected to FFT in frames that overlap at fixed time intervals, and the maximum power value in the resulting power spectra is measured and stored. This maximum power value serves as the reference value for judging noise: components presumed to be noise are removed by cutting, from the power spectrum used as the model, every component whose power is lower than this maximum. FIG. 23 shows an example of the power spectrum before noise removal (a) and after noise removal (b). The volume value of the model is left unchanged, and during a lesson the power spectra of the model and the performance sound are normalized on the basis of this volume value when they are compared.
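The noise reference can be sketched in a few lines. To stay self-contained, the example uses a direct DFT from the standard library in place of an FFT, which gives the same spectrum but is only practical for short frames. The frame length, hop size, absence of a window function, and the names `power_spectrum`, `noise_reference`, and `remove_noise` are all assumptions made for illustration.

```python
import cmath
import math
from typing import List, Sequence


def power_spectrum(frame: Sequence[float]) -> List[float]:
    """Power of each bin from 0 to N/2, computed with a direct DFT."""
    n = len(frame)
    spec = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(frame))
        spec.append(abs(acc) ** 2)
    return spec


def noise_reference(samples: Sequence[float], frame_len: int, hop: int) -> float:
    """Largest bin power found in overlapping frames of a recording made without playing."""
    peak = 0.0
    for start in range(0, len(samples) - frame_len + 1, hop):
        peak = max(peak, max(power_spectrum(samples[start:start + frame_len])))
    return peak


def remove_noise(model: Sequence[float], reference: float) -> List[float]:
    """Zero every model bin whose power is lower than the noise reference."""
    return [p if p >= reference else 0.0 for p in model]
```

Choosing `hop` smaller than `frame_len` produces the overlapping frames described above.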

Next, a method for reducing the DC component contained in a model will be described. When the performance sound of a pressed key is converted to a power spectrum by FFT, the DC component appears as the power spectrum at the lowest frequency. In some cases the power spectrum of the DC component is stronger than that of the performance sound of the key actually pressed. The DC component is then weighted heavily when the model is compared with the performance sound of a key pressed during a lesson, which causes problems. Therefore, taking the side lobes of the FFT into account, the power spectrum at the lowest frequency and at the adjacent frequency are forcibly set to zero to remove the portion regarded as the DC component.
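Zeroing the lowest bin and its neighbour is a one-line operation on the spectrum. A minimal sketch, assuming the spectrum is a list ordered from the lowest frequency upward (the name `remove_dc_bins` and the `bins` parameter are hypothetical):

```python
from typing import List, Sequence


def remove_dc_bins(spectrum: Sequence[float], bins: int = 2) -> List[float]:
    """Force the lowest `bins` entries of the power spectrum to zero."""
    return [0.0] * min(bins, len(spectrum)) + list(spectrum[bins:])
```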

Next, a method for removing DC when measuring the volume will be described. When the model is compared with the performance sound of a key pressed during a lesson, the power spectra are normalized by volume. If the analog signal before A/D conversion carries a DC offset, however, the digital signal after conversion carries it too. Because the volume is measured as the average of the absolute value of the waveform amplitude per unit time, the correct volume cannot then be obtained. Therefore, the average amplitude is computed from the waveform over a period preceding the moment of volume measurement, for example the preceding one second. This average represents the DC of the waveform, so subtracting it from the waveform used for volume measurement removes the DC. FIG. 24 shows an example of the waveform before DC removal (a) and after DC removal (b), in which the offset of the center of the waveform amplitude from the zero level disappears.
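The effect of the offset on the volume measurement is easy to see numerically. In the sketch below, `dc_offset` averages a preceding stretch of samples and `volume` computes the mean absolute amplitude after subtracting that offset. Both names and the sample values in the usage note are illustrative assumptions.

```python
from typing import Sequence


def dc_offset(history: Sequence[float]) -> float:
    """Average amplitude of the preceding waveform, taken as the DC level."""
    return sum(history) / len(history)


def volume(window: Sequence[float], offset: float = 0.0) -> float:
    """Mean absolute amplitude of the window after removing the DC level."""
    return sum(abs(x - offset) for x in window) / len(window)
```

For a waveform alternating between 1.5 and 0.5, the uncorrected volume is 1.0, while subtracting the measured offset of 1.0 gives the true value of 0.5.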

The present invention can be implemented in various embodiments. For example, because a created model is used continuously in subsequent lessons, it must be created with particular care and without error, which places a heavy burden on the person creating the model. To reduce this burden, rather than raising the pitch in S19 (FIG. 2) to the next pitch up the scale, it is preferable to raise the pitch so that the white keys and the black keys are handled separately, each group being pressed in succession. If the pitch is raised in scale order, white and black keys must sometimes be played alternately, and the wrong key, one that does not correspond to the key-press instruction, may be played by mistake.
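One way to generate such an order is to split the range by pitch class. The sketch assumes that note numbers follow the MIDI convention, in which a note number modulo 12 of 1, 3, 6, 8, or 10 is a black key. Presenting the white keys before the black keys is also an assumption, and the name `model_creation_order` is hypothetical.

```python
from typing import List

BLACK_PITCH_CLASSES = {1, 3, 6, 8, 10}  # C#, D#, F#, G#, A# under the MIDI convention


def model_creation_order(low: int, high: int) -> List[int]:
    """All white keys in ascending order, followed by all black keys in ascending order."""
    notes = range(low, high + 1)
    white = [n for n in notes if n % 12 not in BLACK_PITCH_CLASSES]
    black = [n for n in notes if n % 12 in BLACK_PITCH_CLASSES]
    return white + black
```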

In addition, if all keys in the set model-creation range are displayed in one color and key-press instructions are given by, for example, changing that color, it is easy to see how many single-tone models remain to be created.

According to the flowchart of FIG. 2, when the first and second models do not match, the process starts again from creation of the first model, so only the first model needs to be stored. Alternatively, the models created in sequence for a single tone of the same pitch (first, second, third, and so on) may all be stored, and creation of the single-tone model for that pitch may be ended when a predetermined number of matches, for example two, has been obtained, with the matching model adopted as the official model.
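The alternative of keeping every take can be sketched as below. The text does not specify how matches are counted or which of the matching models is adopted, so the sketch makes two assumptions: `required` is the number of takes that must agree with one another, and the most recent agreeing take is the one adopted. The names `adopt_model` and `same` are hypothetical.

```python
from typing import Any, Callable, Iterable, List, Optional


def adopt_model(takes: Iterable[Any],
                same: Callable[[Any, Any], bool],
                required: int = 2) -> Optional[Any]:
    """Store takes in order and return the first one that agrees with enough earlier takes."""
    stored: List[Any] = []
    for take in takes:
        agreeing = [m for m in stored if same(m, take)]
        stored.append(take)
        if len(agreeing) + 1 >= required:  # the new take plus the earlier ones it matches
            return take
    return None  # not enough agreement yet; more takes are needed
```

Compared with restarting from the first model, a single bad take does not discard an earlier good one.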

The techniques described above for attack detection, silence detection, and removal of action sound, noise, and DC components can be combined as appropriate.

FIG. 1 is a block diagram of a musical sound determination device to which the present invention is applied.
FIG. 2 is a flowchart showing the processing of the model creation phase.
FIG. 3 shows an example of range setting for model creation.
FIG. 4 shows another example of range setting for model creation.
FIG. 5 is an explanatory diagram of attack detection processing.
FIG. 6 is a block diagram showing the detailed configuration of the microphone and A/D converter section.
FIG. 7 shows an example of a volume display.
FIG. 8 is a characteristic diagram of pitch versus maximum volume.
FIG. 9 shows an example of a key-press instruction during model creation.
FIG. 10 shows the change in volume after release of the key for the lowest note (note number 21).
FIG. 11 shows the change in volume after release of the key for the lowest note (note number 48).
FIG. 12 shows the power spectrum immediately after an attack.
FIG. 13 shows the power spectrum about 1.4 seconds after an attack.
FIG. 14 shows the power spectrum about 4.1 seconds after an attack.
FIG. 15 shows the power spectrum about 5.8 seconds after an attack.
FIG. 16 shows the power spectrum obtained when the key for the model creation tone (note number 50) is pressed.
FIG. 17 shows the change in volume obtained when the key for the highest note (note number 108) is pressed.
FIG. 18 is a flowchart showing the processing of the lesson phase.
FIG. 19 shows an example format of the performance information.
FIG. 20 is a flowchart showing an example of the comparison processing.
FIG. 21 shows an example of a scrolling display of key-press instructions during a lesson.
FIG. 22 shows an example of the spectrum of sound captured from the microphone.
FIG. 23 shows an example of the power spectrum before and after noise removal.
FIG. 24 shows an example of the waveform before and after DC removal.

Explanation of symbols

1 ... CPU, 2 ... ROM, 3 ... RAM, 4 ... sound source, 5 ... speaker, 6 ... A/D converter, 7 ... input device, 8 ... display device, 9 ... external storage device, 10 ... microphone, 11 ... volume measuring unit, 12 ... spectrum creation unit, 13 ... attack detection unit, 14 ... spectrum comparison unit, 15 ... key-press instruction unit, 101 ... display screen, 102 ... logo, 201 ... microphone volume, 202 ... microphone amplifier, K ... keyboard graphic, M ... key-press instruction mark, m ... current key-press mark, P ... acoustic piano, BL ... bar line

Claims

1. A model creation device in a musical sound determination device for a natural musical instrument, the determination device creating and storing in advance, as a model, the power spectrum of a single tone of each pitch, reading out the model of the tone to be played next in accordance with a performance instruction, comparing it with the power spectrum of the performance sound produced by actually playing the natural musical instrument, and advancing the performance instruction when the two come within a threshold distance of each other, the model creation device comprising:
performance sound capturing means for adjusting the level of and capturing a single-tone performance sound produced by actually playing the natural musical instrument during model creation;
attack detection means for detecting an attack by comparing the performance sound captured by the performance sound capturing means with a threshold set on the basis of the volume of the level-adjusted performance sound and the volume of noise captured in the level-adjusted state; and
power spectrum calculation means for calculating the power spectrum of the performance sound when an attack is detected,
wherein the power spectrum calculated by the power spectrum calculation means is stored as the model of the single tone.

2. The model creation device in a musical sound determination device for a natural musical instrument according to claim 1, further comprising display means for displaying the volume of the level-adjusted performance sound during model creation.

3. The model creation device in a musical sound determination device for a natural musical instrument according to claim 1 or 2, further comprising judgment means for judging whether the volume of the performance sound during model creation is within an allowable range, in accordance with the volume of the level-adjusted performance sound and the pitch versus maximum volume characteristic of the musical instrument, wherein a message to that effect is issued when the volume of the performance sound is not within the allowable range.

4. The model creation device in a musical sound determination device for a natural musical instrument according to any one of claims 1 to 3, further comprising comparing means for comparing the volume of the performance sound captured by the performance sound capturing means with a threshold, and silence state determining means for determining a silence state when the volume of the performance sound remains below the threshold continuously for a predetermined time, wherein the power spectrum calculated by the power spectrum calculation means is stored as the model of the single tone, and the process moves on to creation of the model of the next single tone when the silence state determining means determines the silence state.

5. The model creation device in a musical sound determination device for a natural musical instrument according to claim 4, wherein the comparing means comprises threshold setting means for setting the threshold in accordance with the volume of noise captured with its level adjusted by the performance sound capturing means.

6. The model creation device in a musical sound determination device for a natural musical instrument according to claim 1, further comprising low frequency component removing means for cutting, from the power spectrum calculated by the power spectrum calculation means, low frequency components below the fundamental frequency of the pitch of the single tone to be modeled, wherein the power spectrum from which the low frequency components have been cut by the low frequency component removing means is stored as the model of the single tone.

7. The model creation device in a musical sound determination device for a natural musical instrument according to any one of claims 1 to 6, further comprising power spectrum calculation means for obtaining the power spectrum of sound captured without the natural musical instrument being played during model creation, and noise component removing means for taking, as a noise reference value, the power value of the spectral component having the maximum power in the power spectrum of the sound captured without the natural musical instrument being played, and cutting, from the power spectrum of the single-tone performance sound, components whose power value is equal to or less than the noise reference value, wherein the power spectrum cut by the noise component removing means is stored as the model of the single tone.

8. The model creation device in a musical sound determination device for a natural musical instrument according to any one of claims 1 to 7, further comprising DC component removing means for cutting components near the lowest frequency in the power spectrum calculated by the power spectrum calculation means, wherein the power spectrum cut by the DC component removing means is stored as the model of the single tone.

9. The model creation device in a musical sound determination device for a natural musical instrument according to claim 1, further comprising volume calculation means for obtaining the average amplitude value of the performance sound waveform during model creation and calculating the volume of the performance sound on the basis of the performance sound from which that average amplitude value has been subtracted, and normalizing means for normalizing the power spectrum of the model by the calculated volume of the performance sound when the power spectra of the model and the performance sound are compared.

10. A program for creating a model in a musical sound determination device for a natural musical instrument, the determination device creating and storing in advance, as a model, the power spectrum of a single tone of each pitch, reading out the model of the tone to be played next in accordance with a performance instruction, comparing it with the power spectrum of the performance sound produced by actually playing the natural musical instrument, and advancing the performance instruction when the two come within a threshold distance of each other,
the program causing a computer to execute:
a procedure for adjusting the level of and capturing a single-tone performance sound produced by actually playing the natural musical instrument during model creation;
a procedure for detecting an attack by comparing the captured performance sound with a threshold that depends on the volume of the level-adjusted performance sound and the volume of noise captured in the level-adjusted state; and
a procedure for calculating the power spectrum of the performance sound when an attack is detected and storing it as the model of the single tone.

11. The program for creating a model in a musical sound determination device for a natural musical instrument according to claim 10, further including a procedure for displaying the volume of the level-adjusted performance sound during model creation.

12. The program for creating a model in a musical sound determination device for a natural musical instrument according to claim 10 or 11, further including a procedure for judging whether the volume of the performance sound during model creation is within an allowable range, in accordance with the volume of the level-adjusted performance sound and the pitch versus maximum volume characteristic of the musical instrument, and a procedure for issuing a message to that effect when the volume of the performance sound is not within the allowable range.

13. The program for creating a model in a musical sound determination device for a natural musical instrument according to claim 10, further including a procedure for comparing the volume of the captured performance sound with a threshold, and a procedure for determining a silence state when the volume of the performance sound remains below the threshold continuously for a certain time, wherein the calculated power spectrum is stored as the model of the single tone, and the process moves on to creation of the model of the next single tone when the silence state is determined.

14. The program for creating a model in a musical sound determination device for a natural musical instrument according to claim 13, wherein the procedure for comparing the volume of the performance sound with a threshold includes a procedure for setting the threshold in accordance with the volume of noise captured with its level adjusted.

15. The program for creating a model in a musical sound determination device for a natural musical instrument according to any one of claims 10 to 14, including a procedure for cutting the power spectrum of low frequency components below the fundamental frequency of the pitch of the single tone to be modeled, and storing the power spectrum from which the low frequency components have been cut as the model of the single tone.

16. The program for creating a model in a musical sound determination device for a natural musical instrument according to any one of claims 15 to 15, including a procedure for obtaining the power spectrum of sound captured without the natural musical instrument being played during model creation, and a procedure for taking, as a noise reference value, the power value of the spectral component having the maximum power in that power spectrum, cutting as noise the components of the power spectrum of the performance sound whose power value is equal to or less than the noise reference value, and storing the power spectrum from which the noise has been cut as the model of the single tone.

17. A power spectrum obtained by cutting a component near the lowest frequency in the power spectrum of a performance sound, and storing a power spectrum obtained by cutting a component near the lowest frequency as a model of the single sound. The program for model creation in the musical instrument musical sound judgment apparatus in any one of .

18. The program for creating a model in a musical sound determination device for a natural musical instrument according to any one of claims 10 to 17, including a procedure for obtaining the average amplitude value of the performance sound waveform over a certain period during model creation, calculating the volume of the performance sound on the basis of the performance sound from which that average amplitude value has been subtracted, and, when the power spectra of the model and the performance sound are compared, normalizing the power spectrum of the model by the calculated volume of the performance sound.