JP4862772B2

JP4862772B2 - Karaoke device with scoring function

Info

Publication number: JP4862772B2
Application number: JP2007199557A
Authority: JP
Inventors: 哲也水谷
Original assignee: Brother Industries Ltd
Current assignee: Brother Industries Ltd
Priority date: 2007-07-31
Filing date: 2007-07-31
Publication date: 2012-01-25
Anticipated expiration: 2027-07-31
Also published as: JP2009036878A

Description

The present invention relates to a karaoke apparatus having a scoring function capable of highly accurate scoring.

Karaoke apparatuses having a scoring function are widely known. Such an apparatus analyzes the voice of a singer input from a microphone or the like during a karaoke performance, evaluates the feature values obtained by the analysis using a predetermined method, converts the evaluation into a numerical value, and reports that value to the singer as the scoring result. The feature value generally used for scoring is pitch information, that is, the frequency of the pitch extracted from the singing voice. Patent Document 1 below discloses scoring each of the male and female parts of a duet song. Patent Document 2 below discloses determining the gender of an input voice using the difference in frequency characteristics between male and female voices.
Patent Document 1: Japanese Patent Laid-Open No. 11-282478
Patent Document 2: Japanese Patent Laid-Open No. 2001-56699

When the input singing voice is analyzed (scored) during a karaoke performance, acoustic signals other than the singing voice should be removed in order to obtain a highly accurate scoring result. However, the singer's microphone picks up not only the singer's voice but also part of the karaoke music output from the musical sound reproducing device.

For this reason, attempts have conventionally been made to remove the music sound by filtering, for example with a frequency band filter. In that case, care must be taken that the filter removes only the music sound and does not affect the singer's input singing voice.

The frequency band of the singing voice differs between male and female voices (in general, a male voice occupies a lower frequency band than a female voice). Applying the same filter to both is therefore undesirable from the standpoint of obtaining a highly accurate analysis result.

This point is explained with reference to figures showing the frequency distributions of a drum sound, a female voice, and a male voice. In each figure, the horizontal axis is logarithmic frequency and the vertical axis is level in decibels. FIGS. 1 to 3 show examples of the frequency distribution of drum sounds, that is, percussive "thump, thump" sounds. In FIG. 1, the first peak is near 70 Hz. In FIG. 2, the first peak is near 60 Hz. In FIG. 3, there is a peak near 80 Hz. These figures show that a drum sound generally has its first peak in the band at or below 100 Hz.

FIGS. 4 to 6 show examples of the frequency distribution of female voices. In FIG. 4, the first peak is near 300 Hz. In FIG. 5, the first peak is near 400 Hz. In FIG. 6, the first peak is near 500 Hz.

FIGS. 7 and 8 show examples of the frequency distribution of male voices. In FIG. 7, the first peak is near 80 Hz. In FIG. 8, the first peak lies between 100 and 200 Hz.

These figures show that the frequency bands of female and male voices differ greatly, and that the first peak of a low-pitched male voice is close to the first peak of a drum sound.

As these figures show, if a filter (for example, a low-cut filter) is set to suit a male voice and the same filter is applied unchanged to a female voice, the drum sound band cannot be removed, which adversely affects acquisition of the female voice's pitch information. Conversely, if a filter is set to suit a female voice and applied unchanged to a male voice, the male voice itself is filtered, which adversely affects acquisition of the male voice's pitch information.

Switching filters between male and female voices requires determining whether the input voice is male or female. The singer could be asked to select a gender at performance time, but such a selection places a burden on the singer. If the apparatus makes the male/female determination automatically, a highly accurate determination, such as that used in security systems, is required. In general, running highly accurate voice recognition on a karaoke scoring device built around a one-chip CPU or the like makes recognition slow, so it is difficult to adopt in a karaoke apparatus used for entertainment.

The object of the present invention is therefore to solve the above problems and to realize a karaoke scoring device that, without placing any burden on the singer, accurately and easily discriminates between male and female voices during the performance of a piece of music and applies an appropriate filter for each, thereby enabling highly accurate scoring.

To achieve the above object, the invention according to claim 1 is a karaoke singing scoring device comprising singer voice input means, music reproducing means, control means, storage means, and scoring means. The storage means stores association data in which pitch frequencies are associated with frequency band filters, and music data including singer pitch information, instrument sound information, and music performance progress information. During performance of a piece of music based on the music data, the control means determines, on the basis of the instrument sound information and the music performance progress information, whether a set predetermined instrument sound is not being sounded; identifies the singer's pitch frequency from the input voice received by the singer voice input means during a period in which the instrument sound is not being sounded; identifies a frequency band filter on the basis of the identified pitch frequency and the association data; and filters the input voice with the identified frequency band filter. The scoring means scores the filtered input voice using the singer pitch information.

The invention according to claim 2 is the karaoke apparatus having the scoring function of claim 1, further comprising accumulation means, wherein the pitch information of the input voice acquired during the period in which the instrument sound is not being sounded is stored in the accumulation means, and once pitch information for a predetermined time has been accumulated, the singer's pitch frequency is identified by averaging.

The invention according to claim 3 is the karaoke apparatus having the scoring function of claim 1 or 2, wherein information that makes the predetermined instrument sound identifiable is attached to the music data, and the control means sets the predetermined instrument sound on the basis of that information.

The invention according to claim 4 is the karaoke apparatus having the scoring function of claim 1 or 2, wherein the predetermined instrument sound is determined in advance.

The invention according to claim 5 is the karaoke apparatus having the scoring function of any one of claims 1 to 4, wherein the music is played on the basis of music data including MIDI data, and the control means treats a fixed time after output of the command that sounds the predetermined instrument as the period during which the predetermined instrument sound is being sounded.

The invention according to claim 6 is the karaoke apparatus having the scoring function of claim 4, wherein the control means notifies the user of predetermined information when pitch data for a predetermined time has not been accumulated in the temporary memory within a predetermined time after the start of the performance of the music.

According to the invention of claim 1, the singer's pitch frequency is identified from the voice input during a period in which the predetermined instrument sound is not being sounded, a filter is set on the basis of the result, and scoring is performed on the voice filtered with that filter. Highly accurate scoring can therefore be achieved by setting, as the predetermined instrument sound, an instrument that adversely affects extraction of pitch information.

According to the invention of claim 2, the singer's pitch frequency is additionally identified using pitch information for a predetermined time, which improves the accuracy of the identified pitch frequency.

According to the invention of claim 3, information that makes the predetermined instrument sound identifiable is additionally attached to the music data, so the instrument sound that adversely affects acquisition of pitch information can be identified for each piece of music, and an appropriate section for acquiring pitch information can be determined.

According to the invention of claim 4, the predetermined instrument sound is additionally determined in advance, so highly accurate scoring can be performed using existing music data.

According to the invention of claim 5, a fixed time after output of the command that sounds the predetermined instrument is additionally treated as the period during which the predetermined instrument sound is being sounded. The period in which the predetermined instrument sound is not being sounded can therefore be determined by simple processing that uses only the timing at which that command was output.

According to the invention of claim 6, the user is additionally notified when pitch data for a predetermined time could not be accumulated within a predetermined time after the start of the performance, so the singer can be told at an early stage of the performance whether the male/female determination was made correctly.

[First Embodiment]
A karaoke apparatus according to a first embodiment of the present invention is described in detail below with reference to the drawings.

FIG. 9 is a block diagram showing the internal configuration of the control device of the karaoke apparatus and its peripheral devices.

The control device 10 is connected to a host computer (not shown) via a communication line and receives music data over that line through the communication I/F 15. The received karaoke song data is stored in the storage device 12.

In addition to the data for playing back the music, the music data may include title data for the karaoke song, video data corresponding to the karaoke song, and the like.

The controller 11 controls the control device 10 as a whole and executes various programs.
The storage device 12 stores music data and the like. The storage device 12 is configured as a dynamic storage medium (an HDD or the like) and may be configured as a static storage medium where necessary.

The operation panel 13 is used by the operator to input various information such as a song selection number. Such information may also be input via the remote controller 13a.

The RAM 14 is a temporary storage memory that holds information required for various kinds of control. In the multithreaded processing, the RAM 14 functions as a shared memory. How the shared memory is used in the present invention is described later.

The communication I/F 15 communicates with the host computer (not shown) via the communication line. The communication line may be wired or wireless.

The scoring circuit 16 scores the voice input from the microphone 17. In this embodiment, the voice to be scored has undergone the filtering described later. Although FIG. 9 shows two microphones, any number of microphones may be used. Although FIG. 9 shows the controller 11 and the scoring circuit 16 as separate components, the controller 11 may perform both the filtering and the scoring. The filtering may also be performed by a separate circuit.

The sound source 18 is connected to the amplifier 19. The music data is converted into an audio signal by the sound source 18, amplified by the amplifier 19, and output as sound from the speaker 20. In this embodiment, the sound source 18 is a MIDI sound source. The amplifier 19 also amplifies the voice input from the microphone 17.

The video control circuit 21 is connected to the monitor 22. Video data acquired from the storage device 12 or the communication line and lyric information included in the music data are displayed on the monitor 22, via the video control circuit 21, as the background video and lyrics during karaoke playback of the music. If the video data is encoded, the video control circuit 21 may decode it.

The internal configuration described above mainly covers what is needed to explain the present invention; various other circuits and elements are of course also included.

The present invention places no limitation on the external appearance of the karaoke apparatus of this embodiment. Some of the elements shown above as part of the internal configuration of the karaoke apparatus may be provided externally, and the functions of some components may be implemented by a server connected to a network.

The music data used in this embodiment contains instrument sound information, music performance progress information, singer pitch information, and the like. MIDI data is a typical example, but the present invention is not limited to MIDI data, and other data may be used as long as the invention can be carried out. In MIDI data, an instrument sound is designated by a channel number, and sounding of the instrument is switched on and off by, for example, note-on and note-off signals. The note numbers of the vocal track can be used as the singer pitch information.

FIG. 10 shows the internal structure of the shared memory. As the figure shows, the shared memory consists of an area that stores the time at which a note-on signal for the predetermined instrument sound (instrument track) was output and an area that stores the note number of the vocal track. FIG. 10 shows the contents of the shared memory when a drum sound source is designated as the predetermined instrument sound.
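As an illustration only, the two areas of the shared memory could be modeled as follows. The class name `SharedMemory`, its field and method names, and the use of a lock to guard access from two threads are assumptions introduced for this sketch; the description states only which values are stored.

```python
import threading
from dataclasses import dataclass, field
from typing import Optional, Tuple


@dataclass
class SharedMemory:
    """Hypothetical model of the two areas shown in FIG. 10."""

    # Time (seconds) at which the latest note-on of the predetermined
    # instrument (here: drums) was output; None until the first note-on.
    last_drum_note_on: Optional[float] = None
    # Note number currently given by the vocal track; None until written.
    vocal_note_number: Optional[int] = None
    # Assumption: a lock protects the fields when two threads share them.
    lock: threading.Lock = field(default_factory=threading.Lock, repr=False)

    def write_drum_note_on(self, time_s: float) -> None:
        with self.lock:
            self.last_drum_note_on = time_s

    def write_vocal_note(self, note_number: int) -> None:
        with self.lock:
            self.vocal_note_number = note_number

    def snapshot(self) -> Tuple[Optional[float], Optional[int]]:
        """Return both areas as one consistent pair."""
        with self.lock:
            return (self.last_drum_note_on, self.vocal_note_number)
```

In this sketch the performance side would call the two `write_*` methods and the scoring side would call `snapshot()`.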

Next, the processing flow of the karaoke apparatus of this embodiment is described with reference to the drawings. FIG. 11 is a flowchart of the scoring process of the karaoke apparatus, and FIG. 12 is a flowchart of its performance process. The scoring process and the performance process are executed as multiple threads; that is, the two processes run in parallel. Multithreaded processing is well known, so its description is omitted.

The two processes exchange information by accessing the shared memory while they run.

[Scoring process]
First, the scoring process is described with reference to FIG. 11. The scoring process starts when playback of the music starts.
In S1, the acquired pitch holding area is initialized. The acquired pitch holding area is formed in the RAM 14.

In S2, the low-cut filter is turned off. This embodiment describes an example that uses a low-cut filter, but the present invention can of course also be realized with a band-pass filter or the like. The filter may be a digital filter or an analog filter.
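For readers unfamiliar with low-cut filtering, a minimal digital sketch follows. It assumes a simple first-order high-pass design, which the description does not specify; the function name `low_cut`, its parameters, and the convention that `cutoff_hz=None` means "filter off" (as in S2) are all hypothetical.

```python
import math


def low_cut(samples, cutoff_hz, sample_rate_hz):
    """First-order high-pass (low-cut) filter over a list of samples.

    cutoff_hz=None leaves the signal untouched ("filter off").
    """
    if cutoff_hz is None:
        return list(samples)
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)   # time constant for the cutoff
    dt = 1.0 / sample_rate_hz                # sampling interval
    alpha = rc / (rc + dt)
    out = []
    prev_x = 0.0
    prev_y = 0.0
    for x in samples:
        # Standard discrete RC high-pass recurrence.
        y = alpha * (prev_y + x - prev_x)
        out.append(y)
        prev_x, prev_y = x, y
    return out
```

A constant (0 Hz) input decays toward zero, while content well above the cutoff passes almost unchanged, which is the behavior wanted for suppressing a low drum band.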

In S3, it is determined whether the acquired pitch holding area is full of pitch information. The amount of pitch information stored in the acquired pitch holding area can be set as appropriate and should be large enough to calculate the pitch frequency of the input voice accurately. If the area is determined not to be full (S3: NO), the process proceeds to S4.

In S4, it is determined whether the performance of the music has ended. If the performance has ended (S4: YES), the process proceeds to S11. If it has not ended (S4: NO), the process proceeds to S5.

In S5, it is determined whether a predetermined time has elapsed since the start of the performance. If it has (S5: YES), the process proceeds to S11. This predetermined time can be set as appropriate. Determining that it has elapsed means that the required amount of pitch information could not be acquired within the allotted time.

In S11, the performance of the music is ended and a message that scoring could not be performed is displayed on the monitor 22. The processing of S5 and S11 may be omitted where appropriate.

If the predetermined time has not elapsed since the start of the performance in S5 (S5: NO), the process proceeds to S6.
In S6, it is determined whether pitch information has been acquired. If pitch information could not be acquired (S6: NO), the process returns to S3. If it was acquired (S6: YES), the process proceeds to S7. Any of various known techniques can be used to acquire the pitch information.

In S7, it is determined whether the input volume at the time the pitch information of S6 was acquired is at or above a predetermined level. If it is not (S7: NO), it is determined that no voice was being input through the microphone 17 at that time, and the process returns to S3. If it is (S7: YES), the process proceeds to S8.

In S8, the current time is acquired. This time represents the time at which the pitch information of S6 was acquired. The process then proceeds to S9, where the difference between the time acquired in S8 and the time stored in the shared memory is calculated. If the time acquired in S8 is less than a predetermined time after the time stored in the shared memory (S9: YES), the process returns to S3. If the predetermined time or more has elapsed (S9: NO), the process proceeds to S10. The time written in the shared memory is the time, written by the performance process described later, at which a note-on signal for the predetermined instrument sound (a drum sound source in this embodiment) was output.

The processing of S9 is explained in more detail. In S9, the voice input to the microphone 17 is assumed to contain drum sound for a predetermined time after the drum sound source is sounded, and pitch information acquired within that time is not used for calculating the pitch. An ordinary sound source, once sounded by a note-on signal, continues to sound until a note-off signal is received. A drum or percussion sound source, however, sounds as soon as a note-on signal is received and falls silent when it reaches the end of its tone, regardless of whether a note-off signal arrives. Therefore, when a drum sound source is designated as the predetermined instrument sound, the time from output of the note-on signal until the sound dies away is managed as the predetermined time, and the predetermined time following output of the note-on signal can be treated as the section in which the instrument is sounding. There is thus no need to check whether a note-off signal has been output.
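The S9 check reduces to a single time comparison, which might be sketched as follows. The function name `drum_is_sounding`, its parameters, and the 0.3-second default decay time are assumptions for illustration; the description does not give a value for the predetermined time.

```python
def drum_is_sounding(sample_time_s, last_note_on_s, decay_time_s=0.3):
    """Return True while the drum is assumed to be audible (S9: YES).

    last_note_on_s is the note-on time read from the shared memory,
    or None if no drum note-on has been output yet. decay_time_s is a
    hypothetical value for the 'predetermined time'.
    """
    if last_note_on_s is None:
        return False
    elapsed = sample_time_s - last_note_on_s
    # Only the note-on time is needed; note-off is never consulted.
    return 0.0 <= elapsed < decay_time_s
```

A pitch sample would be discarded when this returns True and written to the holding area otherwise.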

In S10, the pitch information acquired in S6 is written to the acquired pitch holding area. The process then returns to S3.
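The loop formed by S3 to S10 can be sketched in offline form as below. This is a simplified illustration, not the flowchart itself: it assumes each input frame already carries the sample time, the detected pitch (or `None`), the input volume, and the latest drum note-on time read from the shared memory. The name `collect_pitches` and all parameter names are hypothetical.

```python
def collect_pitches(frames, capacity, timeout_s, min_volume, drum_decay_s):
    """Gather pitch samples until the holding area is full.

    frames: iterable of (time_s, pitch_hz_or_None, volume, last_drum_on_s).
    Returns the list of held pitches, or None when scoring is not
    possible (song ended or timeout reached before the area was full).
    """
    held = []                                      # S1: initialize holding area
    for time_s, pitch_hz, volume, last_drum_on in frames:
        if len(held) >= capacity:                  # S3: area full
            break
        if time_s >= timeout_s:                    # S5: took too long -> S11
            return None
        if pitch_hz is None:                       # S6: no pitch detected
            continue
        if volume < min_volume:                    # S7: nobody is singing
            continue
        if last_drum_on is not None and time_s - last_drum_on < drum_decay_s:
            continue                               # S8/S9: drum still audible
        held.append(pitch_hz)                      # S10: keep the sample
    # Running out of frames corresponds to the performance ending (S4).
    return held if len(held) >= capacity else None
```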

If the acquired pitch holding area is determined to be full in S3 (S3: YES), the process proceeds to S12.
In S12, the average of all the data in the acquired pitch holding area is calculated, and this average pitch is taken as the pitch frequency of the input voice. The process then proceeds to S13.

In S13, it is determined whether the calculated pitch frequency is at or below a predetermined pitch. If it is (S13: YES), the voice input to the microphone 17 is determined to be a male voice, and the process proceeds to S14.
In S14, the low-cut filter is set for a male voice.

If the pitch frequency is not at or below the predetermined pitch (S13: NO), the voice input to the microphone 17 is determined to be a female voice, and the process proceeds to S15.
In S15, the low-cut filter is set for a female voice.

In S13 to S15, the pitch of the voice input to the microphone 17 is classified into two pitch frequency classes (male and female) and a filter is set for each, but the pitch frequency may be classified into three or more classes with a filter set for each. The relationship between frequency and the filter to be set may be held in advance as a data table or handled in program logic (that is, IF/THEN processing). The apparatus may also be configured to apply no filtering when the voice is determined to be male.
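A data-table version of S12 to S15 might look like the sketch below. The table contents (a 200 Hz class boundary and 70 Hz / 150 Hz cutoffs) are invented placeholder values, since the description gives no thresholds; `FILTER_TABLE` and `select_low_cut` are hypothetical names. Adding rows to the table gives the three-or-more-class variant.

```python
# Each row: (upper bound of average pitch in Hz, low-cut cutoff in Hz).
# Values are illustrative assumptions, not taken from the description.
FILTER_TABLE = [
    (200.0, 70.0),           # "male" class: gentle low cut
    (float("inf"), 150.0),   # "female" class: cut the drum band as well
]


def select_low_cut(held_pitches_hz, table=FILTER_TABLE):
    """Average the held pitches (S12) and look up the filter (S13 to S15)."""
    if not held_pitches_hz:
        raise ValueError("no pitch samples to average")
    average_hz = sum(held_pitches_hz) / len(held_pitches_hz)
    for upper_bound_hz, cutoff_hz in table:
        if average_hz <= upper_bound_hz:     # "at or below" as in S13
            return cutoff_hz
    raise ValueError("table must end with an unbounded row")
```

Replacing the first row's cutoff with `None` would model the variant in which a male voice is left unfiltered.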

In S16, the score is initialized. Because this embodiment uses a point-deduction method, an initial value of, for example, 1000 points can be set. A point-addition method or any other well-known scoring method may of course be used instead, in which case the initial value would differ from 1000 points and the processing of S22 described below would also differ.

In S17, it is determined whether the performance of the music has ended. If it has (S17: YES), the process proceeds to S18 and the scoring result is displayed on the monitor 22. The apparatus may also be configured to display interim scoring results during the performance. If the performance has not ended (S17: NO), the process proceeds to S19.

In S19, it is determined whether pitch information has been acquired. If it could not be acquired (S19: NO), the process returns to S17. If it was acquired (S19: YES), the process proceeds to S20.

In S20, it is determined whether or not the input volume at the time the pitch information of S19 was acquired is equal to or greater than a predetermined level. If it is not (S20: NO), it is determined that no voice was input through the microphone 17 at that time, and the process returns to S17. If it is (S20: YES), the process proceeds to S21.

In S21, the difference between the pitch information acquired in S19 and the note number of the vocal track in the MIDI data stored in the shared memory is calculated. In this embodiment, the note number of the vocal track in the MIDI data serves as the scoring reference. The vocal track in the MIDI data is also used as a guide melody during singing. Writing to the shared memory is described in detail under the performance process. The process then proceeds to S22.

In S22, the absolute value of the difference obtained in S21 is subtracted from the current scoring value. If the pitch information input from the microphone 17 is the same as the pitch indicated by the note number stored in the shared memory, the sung pitch is correct, and nothing is subtracted.
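The loop of S16 to S22 can be sketched as below, under several assumptions that the description does not fix: the sung pitch is given as a frequency and converted to the nearest MIDI note number with the standard relation (A4 = 440 Hz = note 69), the volume threshold value is arbitrary, and a sample is skipped when no reference note is currently stored. The names `freq_to_note`, `score_song`, and `VOLUME_THRESHOLD` are hypothetical.

```python
import math

VOLUME_THRESHOLD = 0.1  # hypothetical "predetermined level" for S20


def freq_to_note(freq_hz):
    """Nearest MIDI note number for a frequency (A4 = 440 Hz = note 69)."""
    return round(69 + 12 * math.log2(freq_hz / 440.0))


def score_song(samples, initial=1000):
    """Deduction scoring over (pitch_hz, volume, reference_note) samples.

    pitch_hz or reference_note may be None when nothing is available.
    """
    score = initial                                  # S16: initialize
    for pitch_hz, volume, ref_note in samples:       # loop until the song ends (S17)
        if pitch_hz is None:                         # S19: no pitch acquired
            continue
        if volume < VOLUME_THRESHOLD:                # S20: treated as no voice input
            continue
        if ref_note is None:                         # assumption: nothing to compare with
            continue
        diff = freq_to_note(pitch_hz) - ref_note     # S21: difference from the reference
        score -= abs(diff)                           # S22: subtract the absolute value
    return score


samples = [
    (440.0, 0.5, 69),    # correct pitch, no deduction
    (466.16, 0.5, 69),   # one semitone sharp
    (None, 0.5, 69),     # no pitch acquired
    (440.0, 0.01, 72),   # below the volume threshold
    (261.63, 0.5, 64),   # four semitones flat
]
print(score_song(samples))
```

Comparing in note numbers rather than in hertz keeps the deduction independent of register; whether the embodiment does this is not stated, so treat it as one possible choice.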

In the scoring method of S21 and S22, the scoring value is also affected when the singing voice is shifted in time relative to the guide melody. Therefore, by storing the note numbers (pitch information) of the guide melody (vocal track) for a predetermined time in the shared memory and scoring the pitch information of the input voice against them using DP matching or the like, the influence of a time shift on the scoring value can be reduced.
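As an illustration of how DP matching absorbs a time shift, the sketch below aligns a reference note sequence with a sung note sequence by dynamic programming (a plain dynamic-time-warping recurrence). The description names DP matching only in general terms, so the cost function, the allowed steps, and the name `dp_match_cost` are assumptions made for this example.

```python
def dp_match_cost(reference, sung):
    """Minimum cumulative |note difference| over all monotonic alignments."""
    inf = float("inf")
    n, m = len(reference), len(sung)
    # cost[i][j]: best cost of aligning the first i reference notes
    # with the first j sung notes.
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(reference[i - 1] - sung[j - 1])
            cost[i][j] = d + min(
                cost[i - 1][j],      # the sung note is held over two reference frames
                cost[i][j - 1],      # the reference note is held over two sung frames
                cost[i - 1][j - 1],  # both sequences advance together
            )
    return cost[n][m]


reference = [60, 60, 62, 64]
sung = [60, 62, 62, 64]  # same melody, one note entered a frame early
frame_by_frame = sum(abs(r - s) for r, s in zip(reference, sung))
print(frame_by_frame, dp_match_cost(reference, sung))
```

The frame-by-frame comparison penalizes the early entry, while the aligned cost does not, which is the effect the paragraph above describes.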

[Performance process]
Next, the performance process will be described with reference to FIG. 12. The performance process starts when playback of the music starts. Of the plurality of tracks in the MIDI data serving as the music data, the performance process handles the drum track and the vocal track. The drum track and the vocal track must therefore be identified among the tracks included in the music data.

In S31, it is determined whether or not the acquired track data belongs to the drum track. In this embodiment, the drum track is treated as the predetermined instrument sound that affects the acquisition of pitch information. If it is not the drum track (S31: NO), the track is not subject to this processing, and the process proceeds to S35 without processing it.

If it is the drum track (S31: YES), the process proceeds to S32.
In S32, it is determined whether or not the data acquired in S31 contains note-on instruction information. If it does not (S32: NO), it is determined that this is not the sounding start timing of the drum sound source, and the process proceeds to S35.
If it does (S32: YES), it is determined that this is the sounding start timing of the drum sound source, and the process proceeds to S33.

In S33, the current time is acquired. This current time represents the sounding start time of the drum sound source. The time elapsed since system startup can be used as the current time, but other time information may be used instead. The process then proceeds to S34.

In S34, the current time acquired in S33 is set in the time-setting shared memory. The contents of the time-setting shared memory are updated each time the current time is acquired in S33.

In S35, it is determined whether or not the acquired track data belongs to the vocal track. If it does not (S35: NO), the track is not subject to this processing, and the process proceeds to S40 without processing it.

If it is the vocal track (S35: YES), it is determined in S36 whether or not the data acquired in S35 contains note-on instruction information. If it does not (S36: NO), it is determined that this is not the vocal start timing, and the process proceeds to S38. If it does (S36: YES), it is determined that this is the vocal start timing, and the process proceeds to S37.

In S37, the note number is set in the note-number shared memory. This content is used as the scoring reference in the scoring process.

In S38, it is determined whether or not the data acquired in S35 contains note-off instruction information. If it does not (S38: NO), the process proceeds to S40. If it does (S38: YES), the process proceeds to S39.
In S39, the note-number shared memory is cleared. As a result, information is held in the note-number shared memory only during the vocal period.

In S40, the MIDI data is sent to the sound source. The performance is thereby controlled based on the musical sound data.

In the above processing, the MIDI data is sent to the sound source after the drum track and vocal track have been processed. Alternatively, the MIDI data may be sent to the sound source first, followed by the processing of the drum track and the vocal track.
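The per-event handling of S31 to S40 can be sketched as follows. The `Event` and `SharedMemory` types, their field names, and the track labels are hypothetical stand-ins for the MIDI data and the two shared memory areas; the caller supplies the current time and a function that forwards data to the sound source.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SharedMemory:
    drum_onset_ms: Optional[int] = None  # time-setting shared memory (S34)
    vocal_note: Optional[int] = None     # note-number shared memory (S37, S39)


@dataclass
class Event:
    track: str  # "drum", "vocal", or any other track label
    kind: str   # "note_on" or "note_off"
    note: int


def handle_event(event, shared, now_ms, send_to_sound_source):
    """Process one MIDI event in the order of the performance process."""
    if event.track == "drum" and event.kind == "note_on":  # S31, S32
        shared.drum_onset_ms = now_ms                      # S33, S34
    if event.track == "vocal":                             # S35
        if event.kind == "note_on":                        # S36
            shared.vocal_note = event.note                 # S37
        if event.kind == "note_off":                       # S38
            shared.vocal_note = None                       # S39
    send_to_sound_source(event)                            # S40


shared = SharedMemory()
sent = []
handle_event(Event("drum", "note_on", 36), shared, 80010, sent.append)
handle_event(Event("vocal", "note_on", 64), shared, 80015, sent.append)
print(shared)
```

Every event is forwarded to the sound source regardless of its track, so the bookkeeping for scoring never changes what is played.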

Note that the flowcharts described above for the scoring process and the performance process are merely examples; the processing may be realized by other flowcharts as long as they yield results equivalent to those of the above processing.

Next, how pitch information is acquired while the scoring process and the performance process described above are running will be explained with reference to the drawings.
FIG. 13 shows the relationship among the note-on and note-off timings of the drum track, the time since system startup, and the acquired pitch information. The horizontal axis is the time axis.

In FIG. 13, a pitch-ignore time of 20 ms from the start of sounding of the drum sound source is set as the fixed time in S5 of FIG. 11, but this interval can be set as appropriate. Because the drum sound source received a note-on at system time 80010 (the unit, ms, is omitted here and below), pitch information acquired from 80010 to 80030 is discarded. Pitch information 1 is therefore discarded without being stored in the acquired-pitch holding area.

Pitch information 2 and 3 are acquired at or after 80030 and before the next sounding start of the drum sound source, so they are written to the acquired-pitch holding area. In the same way, pitch information 4 and 8 are discarded, and pitch information 5 to 7 and 9 to 11 are written to the acquired-pitch holding area. As described above, the note-off instruction information of the drum track is not used in FIG. 13. Once pitch information for a predetermined time has accumulated in the acquired-pitch holding area, the pitch frequency is calculated.
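The discard rule can be sketched as a filter over sample times. Only the 20 ms pitch-ignore time and the drum note-on at 80010 come from the description; the second onset, the sample times, and the function name `kept_sample_times` are hypothetical, and treating a sample at exactly onset + 20 ms as kept is an assumption about the boundary.

```python
PITCH_IGNORE_MS = 20  # pitch-ignore time used in FIG. 13


def kept_sample_times(sample_times_ms, drum_onsets_ms, ignore_ms=PITCH_IGNORE_MS):
    """Return the sample times that fall outside every pitch-ignore window."""
    kept = []
    for t in sample_times_ms:
        earlier = [onset for onset in drum_onsets_ms if onset <= t]
        last_onset = max(earlier) if earlier else None
        # Keep the sample if no drum has sounded yet, or if the ignore
        # time since the most recent drum note-on has already elapsed.
        if last_onset is None or t - last_onset >= ignore_ms:
            kept.append(t)
    return kept


drum_onsets = [80010, 80100]                     # 80100 is a made-up second onset
samples = [80015, 80040, 80060, 80105, 80130]    # made-up acquisition times
print(kept_sample_times(samples, drum_onsets))
```

Only note-on times are consulted, which matches the remark that the drum track's note-off information is not used.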

As described above, in this embodiment, after the performance starts, male or female voice is determined based on pitch information acquired during periods in which the drum sound source is not sounding; a filter corresponding to the determined gender is set; the input voice is filtered with that filter; and the filtered voice is scored. Highly accurate scoring can therefore be performed. In addition, if the predetermined amount of pitch information cannot be acquired within a predetermined time after the performance of the music starts, the user can be notified of this at an early stage of the performance.

In the above processing, the shared memory is used and the vocal track information in it is updated as needed, but the processing can also be done without the shared memory. That is, the scoring process may be configured to directly access the note numbers of the vocal track included in the music data.

[Second Embodiment]
In the first embodiment, the voice input to the microphone 17 before the male/female determination is made and the filter is decided is not subject to scoring. However, the voice signal of this period can be stored in a buffer or the like, filtered after the filter is decided, and reflected in the scoring result.

FIG. 14 is a flowchart of the processing (hereinafter, the "temporary storage process") that stores the input voice of this period in a buffer and uses it after the filter is decided.

The temporary storage process runs in parallel with the scoring process and the performance process. It starts when playback of the music starts.

In S51, it is determined whether or not the performance of the music has ended. If the performance has ended (S51: YES), the temporary storage process ends. If the performance has not ended (S51: NO), the process proceeds to S52.

In S52, it is determined whether or not a filter has been set in the scoring process. If a filter has been set (S52: YES), the process proceeds to S54. If no filter has been set (S52: NO), the process proceeds to S53.

In S53, the voice signal input to the microphone 17 is written to the buffer. By repeating S51 to S53, the input voice up to the point at which the filter is set in the scoring process is written to the buffer.

In S54, the input voice signals stored in the buffer are read out in order. The process then proceeds to S55.
In S55, the read input voice signal is filtered using the filter set in the scoring process. The process then proceeds to S56.

In S56, scoring is performed on the filtered voice signal. This scoring is the same as that performed in the scoring process, so its description is omitted.
The processes of S54 to S56 may be performed in parallel with one another. That is, the next input voice may be read from the buffer while the voice previously read from the buffer is being filtered.

In S57, the scoring value calculated in this process is reflected in the scoring result of the scoring process. The timing at which it is reflected can be set as appropriate.

In this embodiment, the voice input to the microphone 17 before the filter is set can also be reflected in the scoring result. Note that the above flowchart is merely an example; the processing may be realized by another flowchart as long as it yields results equivalent to those of the above processing.
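A minimal sketch of the temporary storage process follows, assuming that audio arrives as discrete frames and that the filter and the per-frame scoring are supplied as functions. The class `TemporaryStore`, its method names, and the toy filter and scorer in the usage lines are all hypothetical.

```python
from collections import deque


class TemporaryStore:
    """Holds input frames until the scoring process has set a filter."""

    def __init__(self):
        self.buffer = deque()

    def on_frame(self, frame, filter_fn):
        """S52, S53: buffer the frame while no filter is set; report whether it was stored."""
        if filter_fn is None:
            self.buffer.append(frame)
            return True
        return False

    def drain(self, filter_fn, deduction_fn):
        """S54 to S56: read, filter, and score the buffered frames in order."""
        total_deduction = 0
        while self.buffer:
            frame = self.buffer.popleft()                    # S54
            total_deduction += deduction_fn(filter_fn(frame))  # S55, S56
        return total_deduction


store = TemporaryStore()
for frame in (3, 0, 2):                  # toy frames arriving before the filter is set
    store.on_frame(frame, None)
deduction = store.drain(lambda f: f, lambda f: f)  # identity filter, toy scorer
score = 1000 - deduction                 # S57: reflect the result in the main score
print(score)
```

A queue preserves arrival order, so the buffered voice is scored in the same sequence in which it was sung.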

[Setting the predetermined instrument sound]
In the first and second embodiments described above, the drum sound source is the predetermined instrument sound. As a method of setting the predetermined instrument sound, information indicating the predetermined instrument sound may be included in the music data, as shown in FIG. 15. With this configuration, the creator of the music data can specify in advance, for each piece of music, the instrument sound that adversely affects the acquisition of pitch information. A karaoke scoring apparatus using such music data can then discriminate male and female voices accurately and easily according to the types of instrument sound and their progression in time during the performance, without placing any burden on the singer, and can apply an appropriate filter for male and female voices separately, so that highly accurate scoring is possible for each piece of music. Note that the data structure of the music data shown in FIG. 15 is merely an example.

For example, by referring to the excluded-track designation information included in the music data of FIG. 15, the track of the instrument sound that adversely affects the acquisition of pitch information (the instrument sound to be set) can be identified.

Here, the excluded track specifies which instrument sound adversely affects the acquisition of pitch information. If the drum track is designated as the excluded track, processing similar to that shown in FIG. 12 (S31, S32) becomes possible. Also, for example, if track No. 10 is set as the excluded track, pitch information acquired within the pitch-ignore time (for example, 20 ms) from a note-on timing of track No. 10 is discarded. In this case, the instrument sound assigned to track No. 10 is identified as the instrument sound that adversely affects the acquisition of pitch information (the instrument sound to be set).
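The use of excluded-track designation information can be sketched as below. The dictionary layout, the key name `excluded_tracks`, and the event tuples are hypothetical; FIG. 15 defines the actual data structure, which is not reproduced here.

```python
def ignore_window_onsets(events, music_data):
    """Collect note-on times of tracks designated as excluded tracks.

    events: iterable of (time_ms, track_no, kind) tuples.
    music_data: dict with an optional "excluded_tracks" list (assumed layout).
    """
    excluded = set(music_data.get("excluded_tracks", ()))
    return [
        time_ms
        for time_ms, track_no, kind in events
        if kind == "note_on" and track_no in excluded
    ]


music_data = {"excluded_tracks": [10]}   # track No. 10 as in the example above
events = [
    (80010, 10, "note_on"),
    (80012, 4, "note_on"),    # another track, not excluded
    (80050, 10, "note_off"),  # note-off is not used
    (80100, 10, "note_on"),
]
print(ignore_window_onsets(events, music_data))
```

The returned times mark the starts of the pitch-ignore windows; pitch information acquired within the pitch-ignore time after each of them would be discarded.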

Further, by using the MIDI program change command, the track of any instrument sound can be designated as the track of the predetermined instrument sound. When the drum sound source is the predetermined instrument sound and a program change command that switches to the drum sound source is inserted in the MIDI data of some instrument track, that track functions as a drum track once the program change command has been executed. The predetermined instrument sound can thus be specified by a MIDI command, without designating it directly through the excluded-track designation information.

The predetermined instrument sound may also be fixed in advance in the karaoke apparatus. In this case, the present invention can be realized without modifying existing music data at all. For some music, pitch information may then not be acquired correctly; however, since the scoring process described above reports that pitch information for the predetermined time could not be acquired within the predetermined time after the performance started, the user can tell during the performance that scoring is not being performed correctly.

The present invention is not limited to the embodiments described above, and various improvements and modifications are of course possible without departing from the gist of the present invention. The present invention can also be realized as a scoring method in a karaoke apparatus that executes the processing described above. Furthermore, the present invention can be realized as a program for causing a computer to execute the scoring method in the karaoke apparatus, and as a recording medium on which that program is recorded.

A diagram showing an example of the frequency distribution of a drum sound source.
A diagram showing an example of the frequency distribution of a drum sound source.
A diagram showing an example of the frequency distribution of a drum sound source.
A diagram showing an example of the frequency distribution of a female voice.
A diagram showing an example of the frequency distribution of a female voice.
A diagram showing an example of the frequency distribution of a female voice.
A diagram showing an example of the frequency distribution of a male voice.
A diagram showing an example of the frequency distribution of a male voice.
A diagram showing the internal configuration of the control device of the karaoke apparatus having a scoring function according to the embodiment of the present invention, together with its peripheral elements.
A diagram showing the internal structure of the shared memory of the karaoke apparatus according to the embodiment of the present invention.
A flowchart of the scoring process according to the embodiment of the present invention.
A flowchart of the performance process according to the embodiment of the present invention.
A timing chart showing the sounding timing of the drum sound source, the time since system startup, and the acquired pitch information in the present invention.
A flowchart of the temporary storage process according to the second embodiment of the present invention.
A diagram showing an example of the structure of the music data in the present invention.

Explanation of symbols

10 Control device
11 Controller
12 Storage device
13 Operation panel
13a Remote control
14 RAM
15 Communication I/F
16 Scoring circuit
17 Microphone
18 Sound source
19 Amplifier
20 Speaker
21 Video control circuit
22 Monitor

Claims

A karaoke apparatus having a scoring function, comprising a singer voice input means, a music playback means, a control means, a storage means, and a scoring means, wherein
the storage means stores
related data in which pitch frequencies and frequency band filters are associated, and music data including singer pitch information, instrument sound information, and music performance progress information,
during the performance of the music based on the music data, the control means
determines, based on the instrument sound information and the music performance progress information, whether or not a set predetermined instrument sound is being sounded,
identifies the pitch frequency of the singer from the input voice input to the singer voice input means during a period in which the instrument sound is not being sounded,
identifies the frequency band filter based on the identified pitch frequency and the related data, and
filters the input voice with the identified frequency band filter, and
the scoring means scores the filtered input voice using the singer pitch information.

The apparatus further has storage means,
the pitch information of the input voice acquired during the period in which the instrument sound is not being sounded is stored in the storage means, and
when pitch information for a predetermined time has accumulated, the pitch frequency of the singer is identified by averaging.
The karaoke apparatus having a scoring function according to claim 1.

The music data is attached with information that makes it possible to specify the predetermined instrument sound,
The control means sets the predetermined instrument sound based on the information;
A karaoke apparatus having a scoring function according to claim 1 or 2.

The predetermined musical instrument sound is predetermined;
A karaoke apparatus having a scoring function according to claim 1 or 2.

The music is played based on music data including MIDI data, and
the control means determines that the predetermined instrument sound is being sounded during a period of a certain time from the output of an instruction to sound the predetermined instrument sound.
The karaoke apparatus having a scoring function according to any one of claims 1 to 4.

The control means notifies the user of predetermined information when a pitch for a predetermined time is not accumulated in the temporary memory within a predetermined time after the performance of the music starts.
A karaoke apparatus having a scoring function according to claim 4.