JP4148261B2

JP4148261B2 - Karaoke equipment

Info

Publication number: JP4148261B2
Application number: JP2005380145A
Authority: JP
Inventors: 政宏深谷; 嘉寛小橋; 琢磨久野
Original assignee: Brother Industries Ltd
Current assignee: Brother Industries Ltd
Priority date: 2005-12-28
Filing date: 2005-12-28
Publication date: 2008-09-10
Anticipated expiration: 2025-12-28
Also published as: JP2007178923A

Description

The present invention relates to karaoke singing on a karaoke apparatus, and more particularly to speech recognition of the lyrics sung in karaoke.

Conventionally, a scoring function is known as an auxiliary function of a karaoke apparatus. It works as follows. The singer's voice signal input from the microphone is sampled to generate singing data that indicates the singing state, such as the pitch, volume, or tempo the singer produced. The singing data is then compared with scoring reference data, such as the main melody part data in the karaoke data, and a predetermined score is assigned based on the comparison result to generate scoring data. When the singing part ends, the scores in the scoring data are totaled to calculate a total score. The total score is either displayed as it is on a scoreboard or display, or reflected in a video output to the display, such as a video containing a predetermined message or predetermined content (see, for example, Patent Document 1).

There are also karaoke apparatuses with a scoring function that scores each person's singing voice when several people sing, as in a duet (see, for example, Patent Document 2), and karaoke apparatuses using a singing scoring scheme that determines which singer is male and which is female from how often voice signals are input from each of two microphone channels and scores them separately (see, for example, Patent Document 3).

Karaoke apparatuses with a scoring function as described above score karaoke singing ability based on singing data that indicates the singing state, such as the pitch, volume, or tempo the singer produced.

However, there has been no karaoke apparatus that performs speech recognition on the lyrics sung in karaoke.
Outside the technical field of karaoke apparatuses, there are speech recognition devices such as a noise removal device that allows adequate speech recognition even in an environment where noise, the output sound of audio equipment, and the like are present in addition to the voice commands used for control (see, for example, Patent Document 4).
Patent documents cited: Japanese Patent No. 3261990; JP 2000-330580 A; JP H11-282478 A; JP H5-73091 A

In recent years, however, games have appeared in which players use a karaoke apparatus to compete at singing a karaoke song without getting the lyrics wrong. In these games, judges or similar persons decided whether the song had been sung without lyric mistakes. That is, the judges listened to the sung lyrics, recognized them by ear, and compared them with the lyrics of the karaoke song being performed to decide whether the singer had made a mistake.

Consequently, a person singing karaoke alone could not enjoy this game. Even when a group sings karaoke, a companion who has to listen to the lyrics one user sings, recognize them, compare them with the lyrics of the karaoke song, and judge whether they were sung without mistakes cannot fully enjoy the karaoke performance.

Games have also appeared in which certain lyrics that must not be sung (hereinafter referred to as NG words) are chosen in advance from the lyrics of the karaoke song, and players compete on whether they sing an NG word during the song. In these games too, judges or similar persons decided whether an NG word had been sung during the karaoke singing.

As with singing scoring, there is a demand for these games to be automated so that they do not rely on human judges.
However, when speech recognition is applied to lyrics sung in karaoke, the karaoke performance sound is emitted from the karaoke apparatus, so the microphone picks up the performance sound together with the voice singing the lyrics. The karaoke performance sound, being something other than the singing voice, therefore acts as noise and lowers the recognition rate of the speech recognition.

As a conventional technique for speech recognition in places with loud ambient noise, there is a technique that extracts and cancels ambient sounds such as car engine noise or indoor duct noise. Such ambient sounds are noise with constant volume and frequency and a short period, so the time needed to extract the ambient sound (hereinafter also referred to as the time lag) is short. Even though there is a time lag, the ambient sound can be extracted once the time lag has elapsed.

In a karaoke apparatus, however, one period of the karaoke performance sound is one part of a song, which is far longer than the period of ambient sounds such as car engine noise or indoor duct noise. In addition, the frequency of the karaoke performance sound and its volume, which follows the dynamics of the music (for example, fortissimo or pianissimo), change irregularly, so extracting the karaoke performance sound as ambient sound takes a long time. It was therefore difficult to cancel the karaoke performance sound as ambient sound. Moreover, unlike everyday conversational speech, a karaoke singing voice is produced so as to match the karaoke performance sound (the ambient sound), which made it difficult to distinguish the voice from the performance sound and recognize the sung lyrics.

The karaoke apparatus (1) of the present invention, made to solve the problems described above, comprises the following means. (In this section, to make the invention easier to understand, the components described in the "Best Mode for Carrying Out the Invention" section are shown in parentheses where needed; this is not meant to limit the scope of the claims.) The means are: voice signal input means (M2) for inputting the voice signal of karaoke singing; music data storage means (M10) for storing music data of karaoke songs; karaoke performance reproduction means (M14) that reproduces the music data stored in the music data storage means as a sound signal and outputs the reproduced sound signal and the karaoke singing voice signal input from the voice signal input means to a speaker (M20); first generation means (M70) that compares a first signal, corresponding to the sound signal output from the karaoke performance reproduction means, with a second signal, corresponding to the karaoke singing voice signal input from the voice signal input means, and generates a speech recognition signal by subtracting the first signal from the second signal; first gain setting means (M72) that, based on the speech recognition signal generated by the first generation means, sets the gain with which the first generation means subtracts the first signal from the second signal; speech recognition means (M76) that recognizes the lyrics of the karaoke singing based on the speech recognition signal generated by the first generation means; second generation means (M30) that compares the first signal with the second signal and generates a scoring signal by subtracting the first signal from the second signal; pitch extraction means (M6) that extracts pitch data from the scoring signal generated by the second generation means; second gain setting means (M32) that, based on the scoring signal generated by the second generation means, sets the gain with which the second generation means subtracts the first signal from the second signal; and singing scoring means (M24) that calculates the pitch difference between the pitch data of the singing melody of the karaoke song stored in the music data storage means and the pitch data of the scoring signal extracted by the pitch extraction means, and scores the karaoke singing for each predetermined section based on the calculated pitch difference.
During a period of the karaoke performance in which nothing is sung, namely a prelude or an interlude, the first gain setting means sets the gain so that the signal level of the speech recognition signal is minimized, and the second gain setting means sets the gain so that the signal level of the scoring signal is minimized.
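The subtraction and gain setting described above can be illustrated with a small Python sketch. It is only a sketch under stated assumptions: the text says the gain is chosen so that the level of the difference signal is minimized during a prelude or interlude, but it does not say how. Here the minimum is found with a closed-form least-squares estimate, and the names `estimate_gain` and `subtract_reference` are hypothetical.

```python
def estimate_gain(mic, ref):
    """Gain g that minimizes sum((mic[i] - g * ref[i]) ** 2).

    Intended for a span where nobody sings (prelude or interlude), so the
    microphone signal contains only the performance sound from the speaker.
    The least-squares formula is an assumption, not taken from the patent.
    """
    energy = sum(r * r for r in ref)
    if energy == 0.0:
        return 0.0  # silent reference: nothing to cancel
    return sum(m * r for m, r in zip(mic, ref)) / energy


def subtract_reference(mic, ref, gain):
    """Second signal minus the gain-scaled first signal, sample by sample."""
    return [m - gain * r for m, r in zip(mic, ref)]
```

The gain found while nobody sings is then reused during singing, so that what remains after subtraction is mostly the singer's voice.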

As described above, even when the karaoke performance sound is emitted from the speaker, the karaoke apparatus of the present invention recognizes the lyrics of the karaoke singing from a speech recognition signal generated by subtracting the sound signal corresponding to that performance sound, so an appropriate speech recognition result can be obtained. Specifically, the first signal, corresponding to the sound signal output from the karaoke performance reproduction means, is the performance signal of the karaoke performance sound, and the performance sound is emitted from the speaker. During speech recognition, therefore, the microphone picks up the performance sound emitted from the speaker in addition to the singer's voice. In other words, the second signal, corresponding to the karaoke singing voice signal input from the voice signal input means, contains the performance sound emitted from the speaker in addition to the singer's voice. However, the first generation means (M70) compares the first signal with the second signal and generates a speech recognition signal by subtracting the first signal from the second signal, so the sound signal corresponding to the performance sound emitted from the speaker is removed. The speech recognition means (M76) then recognizes the lyrics of the karaoke singing based on the speech recognition signal generated by the first generation means, so an appropriate speech recognition result can be obtained.

As set out in claim 2, the lyrics of the karaoke singing may be scored based on the lyric data of the karaoke singing recognized by the speech recognition means of the karaoke apparatus according to claim 1.

That is, in the karaoke apparatus according to claim 1, the music data storage means preferably stores music data including the lyric data of the karaoke song, and the apparatus preferably includes lyric scoring means (12) that compares the lyric data of the karaoke singing recognized by the speech recognition means with the lyric data of the karaoke song stored in the music data storage means, extracts the quantity of lyric data that differs, and scores the lyrics of the karaoke singing based on the extracted quantity and the quantity of lyric data of the karaoke song.

With a karaoke apparatus configured in this way, the lyrics of the karaoke singing can be scored by comparing the lyric data of the karaoke singing recognized by the speech recognition means with the lyric data of the karaoke song stored in the music data storage means. Specifically, the lyric data recognized by the speech recognition means, for example the words that were sung, is compared with the words in the lyric data of the karaoke song stored in the music data storage means. The quantity of lyric data that differs, for example the number of words that differ, is then extracted, and the lyrics of the karaoke singing are scored based on the number of differing words and the number of words in the lyric data of the karaoke song. More specifically, if 5 words differ and the lyric data of the karaoke song contains 100 words, then (100 - 5) / 100 = 0.95, and the lyrics can be scored as 95 points out of 100.
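The worked example above, (100 - 5) / 100 = 0.95, can be written as a short Python sketch. The function name `lyric_score` is hypothetical, and comparing words position by position is an assumption; the text only says that the number of differing words is extracted.

```python
def lyric_score(reference_words, recognized_words):
    """Score lyrics out of 100 from the number of words that differ.

    Words are compared position by position (an assumption); a reference
    word with no recognized counterpart counts as differing.
    """
    total = len(reference_words)
    if total == 0:
        raise ValueError("reference lyrics are empty")
    mismatches = 0
    for i, word in enumerate(reference_words):
        if i >= len(recognized_words) or recognized_words[i] != word:
            mismatches += 1
    return round(100 * (total - mismatches) / total)
```

With 100 reference words of which 5 are recognized differently, the function returns 95, matching the figure in the text.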

Furthermore, to make lyric scoring more game-like, the karaoke performance may be stopped when a predetermined number or more of the lyrics recognized by the speech recognition means of the karaoke apparatus are wrong, as set out in claim 3.

That is, in the karaoke apparatus according to claim 1, the music data storage means stores music data including the lyric data of the karaoke song, and the apparatus includes control means (12) that compares the lyric data of the karaoke singing recognized by the speech recognition means with the lyric data of the karaoke song stored in the music data storage means, extracts the quantity of lyric data that differs, and, when it determines that the extracted quantity is equal to or greater than a predetermined number, controls the karaoke performance reproduction means to stop reproduction of the karaoke performance.

With a karaoke apparatus configured in this way, the karaoke performance is stopped, even in the middle of the song, once the singer has sung the predetermined number of lyrics incorrectly, which makes the game more engaging. The predetermined number is a value set according to the level of each karaoke singer, for example "1" for an expert, "3" for an intermediate singer, and "5" for a beginner.
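A minimal Python sketch of this stop condition follows. The limits 1, 3, and 5 come from the text; the level names used as dictionary keys, the function name `stop_position`, and the word-by-word comparison are illustrative assumptions.

```python
# Mistake limits per singer level, as given in the text (keys are hypothetical).
MISTAKE_LIMITS = {"expert": 1, "intermediate": 3, "beginner": 5}


def stop_position(reference_words, recognized_words, level):
    """Index of the word at which playback would stop, or None.

    Counts differing words in order and stops as soon as the count
    reaches the limit for the given level.
    """
    limit = MISTAKE_LIMITS[level]
    mistakes = 0
    for i, (expected, heard) in enumerate(zip(reference_words, recognized_words)):
        if expected != heard:
            mistakes += 1
            if mistakes >= limit:
                return i
    return None  # the song ran to the end
```

For the same performance, an expert is stopped at the first wrong word, while a beginner may finish the song.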

As set out in claim 4, NG words that must not be sung may also be displayed in advance from among the lyrics of the karaoke song, and the karaoke performance stopped if an NG word is sung during the karaoke singing.

That is, the karaoke apparatus according to claim 1 includes display means (M26) capable of displaying the lyric data of the karaoke song.
The music data storage means stores music data including the lyric data of the karaoke song.

The apparatus further includes control means (12) that extracts specific lyric data from the music data stored in the music data storage means, controls the display means to display the extracted lyric data, compares the lyric data of the karaoke singing recognized by the speech recognition means with the extracted lyric data, and, when it determines that the same lyric data is present, controls the karaoke performance reproduction means to stop reproduction of the karaoke performance.
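The NG word check can be sketched in Python as below. How the specific lyric data is chosen from the song is not described here, so the sketch takes the NG words as given; the function name `first_ng_word` is hypothetical.

```python
def first_ng_word(recognized_words, ng_words):
    """Return (index, word) of the first NG word that was sung, or None.

    A non-None result corresponds to the point at which the control
    means would stop reproduction of the karaoke performance.
    """
    forbidden = set(ng_words)
    for i, word in enumerate(recognized_words):
        if word in forbidden:
            return i, word
    return None
```

A caller would display `ng_words` before the song starts and stop playback as soon as the function returns a match.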

With a karaoke apparatus configured in this way, users can enjoy a game in which they compete on whether an NG word is sung during karaoke singing.
In the karaoke apparatus according to claim 2, lyric scoring is preferably limited, as set out in claim 5, to cases where the singing scoring result is equal to or greater than a predetermined value.

That is, the karaoke apparatus according to claim 2 further includes speech recognition storage means (M10) that stores the lyric data of the karaoke singing recognized by the speech recognition means.

The lyric scoring means preferably determines, for each predetermined section, whether the singing scoring result produced by the singing scoring means is equal to or greater than a predetermined value. When it is, the lyric scoring means reads out the lyric data of the karaoke singing stored in the speech recognition storage means, compares it with the lyric data of the karaoke song stored in the music data storage means, extracts the quantity of lyric data that differs, and scores the lyrics of the karaoke singing based on the extracted quantity and the quantity of lyric data of the karaoke song.

With a karaoke apparatus configured in this way, limiting lyric scoring to cases where the singing scoring result is equal to or greater than a predetermined value, for example 80 points, prevents a drop in the speech recognition rate. The reason is that when the singing score falls below the predetermined value, for example below 80 points, because the timing is far off, the volume level is low, or the pitch level is high, the speech recognition rate is also expected to be low.
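A Python sketch of this gating is shown below. The threshold of 80 points is the example from the text. Treating sections below the threshold as excluded from both the mismatch count and the word count is an assumption, as are the function name `gated_lyric_score` and the tuple layout.

```python
def gated_lyric_score(sections, min_singing_score=80):
    """Lyric score out of 100 over sections whose singing score passes.

    sections: list of (singing_score, reference_words, recognized_words).
    Sections below min_singing_score are skipped entirely (an assumption).
    Returns None when no section qualifies.
    """
    total = 0
    mismatches = 0
    for singing_score, reference, recognized in sections:
        if singing_score < min_singing_score:
            continue  # recognition is assumed unreliable here
        total += len(reference)
        for i, word in enumerate(reference):
            if i >= len(recognized) or recognized[i] != word:
                mismatches += 1
    if total == 0:
        return None
    return round(100 * (total - mismatches) / total)
```

Returning `None` when every section is skipped keeps "no reliable data" distinct from a score of zero.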

In the karaoke apparatus according to claim 3 as well, the game in which the karaoke performance is stopped when the lyrics are wrong is preferably limited, as set out in claim 6, to cases where the singing scoring result is equal to or greater than a predetermined value.

That is, the karaoke apparatus according to claim 3 further includes speech recognition storage means (M10) that stores the lyric data of the karaoke singing recognized by the speech recognition means.

The control means preferably determines, for each predetermined section, whether the singing scoring result produced by the singing scoring means is equal to or greater than a predetermined value. When it is, the control means reads out the lyric data of the karaoke singing stored in the speech recognition storage means, compares it with the lyric data of the karaoke song stored in the music data storage means, and extracts the quantity of lyric data that differs. When it determines that the extracted quantity is equal to or greater than a predetermined number, it controls the karaoke performance reproduction means to stop reproduction of the karaoke performance.

With a karaoke apparatus configured in this way, limiting the game in which the karaoke performance is stopped on wrong lyrics to cases where the singing scoring result is equal to or greater than a predetermined value, for example 80 points, prevents a drop in the speech recognition rate in this more difficult game.

In the karaoke apparatus according to claim 4 as well, the game of competing on whether an NG word is sung during karaoke singing is preferably limited, as set out in claim 7, to cases where the singing scoring result is equal to or greater than a predetermined value.

That is, the karaoke apparatus according to claim 4 further includes speech recognition storage means (M10) that stores the lyric data of the karaoke singing recognized by the speech recognition means.

The control means preferably extracts specific lyric data from the music data stored in the music data storage means and controls the display means to display the extracted lyric data. It also determines, for each predetermined section, whether the singing scoring result produced by the singing scoring means is equal to or greater than a predetermined value. When it is, the control means reads out the lyric data of the karaoke singing stored in the speech recognition storage means and compares it with the extracted lyric data. When it determines that the same lyric data is present, it controls the karaoke performance reproduction means to stop reproduction of the karaoke performance.

With a karaoke apparatus configured in this way, limiting the game of competing on whether an NG word is sung during karaoke singing to cases where the singing scoring result is equal to or greater than a predetermined value, for example 80 points, prevents a drop in the speech recognition rate in this more competitive game.

Embodiments of the present invention are described below with reference to the drawings.
[Schematic functional configuration of the karaoke apparatus]
FIG. 1 is a diagram showing a schematic configuration of the karaoke apparatus, centered on its functions. Of the functions shown in FIG. 1, the data extraction unit M6, the singing comparison unit M8, the sequencer M14, the scoring unit M24, the singing difference extraction unit M34, the switch control unit M36, the singing adjustment unit M37, the speech recognition unit M76, the lyrics difference extraction unit M74, the lyrics adjustment unit M77, and the determination unit M80 are implemented mainly by a CPU and software.

First, the karaoke apparatus has a scoring function: it digitizes and captures the voice signal of the karaoke singing and judges the accuracy of the singing by comparing it with the pitch of the guide melody, which is the singing melody.
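The pitch comparison behind the scoring function can be sketched in Python as follows. The text only says that the sung pitch is compared with the guide melody pitch; measuring the difference in cents, the 50-cent tolerance, the per-frame representation in hertz, and the names `cents_difference` and `section_score` are all assumptions made for illustration.

```python
import math


def cents_difference(sung_hz, guide_hz):
    """Pitch difference in cents (1200 cents per octave)."""
    return 1200.0 * math.log2(sung_hz / guide_hz)


def section_score(sung_hz, guide_hz, tolerance_cents=50.0):
    """Percentage of frames in one section whose pitch is close enough.

    sung_hz and guide_hz hold one frequency per frame; a sung value of
    0.0 marks a frame with no detected pitch and counts as a miss.
    """
    if not guide_hz:
        raise ValueError("section has no guide melody frames")
    hits = 0
    for sung, guide in zip(sung_hz, guide_hz):
        if sung > 0.0 and abs(cents_difference(sung, guide)) <= tolerance_cents:
            hits += 1
    return round(100 * hits / len(guide_hz))
```

Working in cents rather than hertz makes the tolerance independent of register, which is one reason it is a common choice for this kind of comparison.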

The karaoke apparatus also has a speech recognition function: it digitizes and captures the voice signal of the karaoke singing and recognizes the lyrics.
As shown in FIG. 1, the microphone M2 (hereinafter abbreviated as "mic") through which the singer inputs the karaoke singing voice is connected to the amplifier M18 and to the A/D converter M4. Music data, which includes karaoke performance data, guide melody data, and lyric data, is stored in the data storage unit M10. Of this music data, the one song selected by the karaoke singer is read into the execution memory M12 and is read out sequentially by the sequencer M14 during the performance. The karaoke performance data read out by the sequencer M14 is input to the musical tone generator M16. The guide melody data read out sequentially by the sequencer M14 is input to the singing comparison unit M8. The musical tone generator M16 generates the performance signal of the karaoke song based on the input karaoke performance data, and this performance signal is input to the amplifier M18. The amplifier M18 amplifies the karaoke performance signal and the singing voice signal input from the mic M2 and outputs them to the speaker M20.

Because the guide melody data corresponds to the singing melody of the karaoke song, it is also used to provide the so-called guide melody function, in which the melody is output from the speaker M20 together with the accompaniment. The user (the karaoke singer) can switch the guide melody function on and off. A user who wants the function turns it on by operating an operation panel or remote control (not shown); the singing melody is then output from the speaker M20 as a guide melody in addition to the karaoke accompaniment, and the user can sing with reference to it. When the function is off, the guide melody is not output from the speaker M20, and the guide melody data is used only for scoring.

The following description assumes that the guide melody function is on. The singing voice signal input from the mic M2 therefore has the performance signal of the karaoke song and the guide melody signal added to it. In the following description, "singing voice signal" by itself refers to a singing voice signal to which the performance signal of the karaoke song and the sound signal of the guide melody have been added, and "singing voice" by itself refers to a singing voice to which the performance sound of the karaoke song and the guide melody have been added.

The singing voice signal input to the A/D converter (ADC) M4 is converted into a digital signal and then input to the singing signal generation unit M30 and the lyrics signal generation unit M70. The musical tone generator M16 described above generates the performance signal of the karaoke song and the sound signal of the guide melody based on the input karaoke performance data and guide melody data, and these two signals are input to the singing variable gain amplifier M32 and the lyrics variable gain amplifier M72.

The performance signal of the karaoke song and the sound signal of the guide melody input to the singing variable gain amplifier M32 are amplified by it and input to the singing signal generation unit M30. The singing signal generation unit M30 generates a scoring signal from the digitized singing voice signal input from the A/D converter (ADC) M4 and the performance signal and guide melody sound signal input from the singing variable gain amplifier M32, and inputs the scoring signal to the data extraction unit M6 and to the singing difference extraction unit M34. The scoring signal input to the singing difference extraction unit M34 is passed to the singing adjustment unit M37 through the switch M38, whose state, connected (hereinafter simply "on") or disconnected (hereinafter simply "off"), is controlled by the switch control unit M36. The singing adjustment unit M37 then instructs the singing variable gain amplifier M32 to use a gain that minimizes the signal level of the scoring signal input to the singing adjustment unit M37.

また、歌詞用可変利得アンプＭ７２へ入力されるカラオケ曲の演奏信号とガイドメロディの音信号は歌詞用可変利得アンプＭ７２によって増幅されて歌詞用信号生成部Ｍ７０へ入力される。歌詞用信号生成部Ｍ７０はＡ／Ｄコンバータ（ＡＤＣ）Ｍ４から入力されたディジタル信号に変換された歌唱音声信号と歌詞用可変利得アンプＭ７２から入力されたカラオケ曲の演奏信号とガイドメロディの音信号とから音声認識用信号を生成し、音声認識部Ｍ７６へ入力するとともに歌詞用差分抽出部Ｍ７４へ入力する。歌詞用差分抽出部Ｍ７４へ入力された音声認識用信号はスイッチ制御部Ｍ３６によって、接続状態（以下、単にオンともいう）か切断状態（以下、単にオフともいう）かを制御されるスイッチＭ７８を介して歌詞用調整部Ｍ７７へ入力される。そして、歌詞用調整部Ｍ７７は、歌詞用調整部Ｍ７７へ入力された音声認識用信号の信号レベルが最小になるよう歌詞用可変利得アンプＭ７２へ利得を指示する。また、シーケンサＭ１４によって読み出されたカラオケ演奏用データは、スイッチ制御部Ｍ３６によって、接続状態か切断状態かを制御されるスイッチＭ４０を介して楽音発生部Ｍ１６に入力される。 Likewise, the karaoke performance signal and the guide melody sound signal input to the lyrics variable gain amplifier M72 are amplified by that amplifier and input to the lyrics signal generation unit M70. The lyrics signal generation unit M70 generates a voice recognition signal from the digitized singing voice signal input from the A/D converter (ADC) M4 and from the karaoke performance signal and guide melody sound signal input from the lyrics variable gain amplifier M72, and inputs the voice recognition signal to both the voice recognition unit M76 and the lyrics difference extraction unit M74. The voice recognition signal input to the lyrics difference extraction unit M74 is passed to the lyrics adjustment unit M77 via a switch M78, which the switch control unit M36 sets to a connected state (hereinafter also simply “on”) or a disconnected state (hereinafter also simply “off”). The lyrics adjustment unit M77 then instructs the lyrics variable gain amplifier M72 to use the gain that minimizes the signal level of the voice recognition signal input to the lyrics adjustment unit M77. In addition, the karaoke performance data read by the sequencer M14 is input to the musical tone generator M16 via a switch M40, which the switch control unit M36 sets to the connected or disconnected state.

ここで、スイッチ制御部Ｍ３６はシーケンサＭ１４によって読み出された楽曲データから歌唱期間か否かを判断して、スイッチＭ３８、スイッチＭ４０及びスイッチＭ７８を制御する。 Here, the switch control unit M36 determines whether or not it is a singing period from the music data read by the sequencer M14, and controls the switch M38, the switch M40, and the switch M78.

そして、図２は、カラオケ演奏以前、カラオケ演奏期間、カラオケ演奏終了以後の経過時間に対するスイッチＭ４０、スイッチＭ３８及びスイッチＭ７８のオン／オフの変化と、歌唱用可変利得アンプＭ３２及び歌詞用可変利得アンプＭ７２からの出力信号レベルの変化と、を示す説明図である。カラオケ演奏以前には、スイッチＭ４０、スイッチＭ３８及びスイッチＭ７８はオフされているが、シーケンサＭ１４が楽曲データを読み出し、カラオケ演奏期間に入ると、スイッチ制御部Ｍ３６がスイッチＭ４０を制御してオン状態にさせ、カラオケ演奏終了までオンの状態を保たせ、カラオケ演奏終了するとオフ状態にさせる。そして、スイッチ制御部Ｍ３６はシーケンサＭ１４によって読み出された楽曲データから歌唱期間か否かを判断して、歌唱期間ではないと判断した場合、すなわち前奏期間または間奏期間と判断した場合には、スイッチＭ３８及びスイッチＭ７８を制御してオン状態にさせ、歌唱期間であると判断した場合には、スイッチＭ３８及びスイッチＭ７８を制御してオフ状態にさせる。ここで、上述した「スイッチ制御部Ｍ３６はシーケンサＭ１４によって読み出された楽曲データから歌唱期間か否かを判断して」とは、［音声制御部２４のＮＧ単語ゲーム処理の説明］にて後述するように「カラオケ演奏を実行するシーケンスプログラムから受け渡される制御データトラックに記憶されているカラオケ曲の歌唱区間の開始点を示す区間分割データによって前奏期間または間奏期間の歌唱されない期間か否かを判断する」ことを指す。 FIG. 2 is an explanatory diagram showing, against elapsed time before the karaoke performance, during the karaoke performance period, and after the end of the karaoke performance, the on/off changes of the switch M40, the switch M38, and the switch M78, and the changes in the output signal levels of the singing variable gain amplifier M32 and the lyrics variable gain amplifier M72. Before the karaoke performance, the switches M40, M38, and M78 are off. When the sequencer M14 reads the music data and the karaoke performance period begins, the switch control unit M36 turns the switch M40 on, keeps it on until the karaoke performance ends, and turns it off when the performance ends. The switch control unit M36 also determines from the music data read by the sequencer M14 whether the current period is a singing period. When it determines that the period is not a singing period, that is, that it is a prelude or interlude period, it turns the switches M38 and M78 on; when it determines that the period is a singing period, it turns the switches M38 and M78 off. Here, the phrase “the switch control unit M36 determines from the music data read by the sequencer M14 whether the current period is a singing period” means, as described later in [Description of NG word game process of voice control unit 24], “determining whether the current period is a prelude or interlude period in which no singing takes place, based on the section division data that is stored in the control data track delivered from the sequence program executing the karaoke performance and that indicates the start points of the singing sections of the karaoke song”.
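The switch behavior described for FIG. 2 can be summarized as a small lookup. The following Python sketch is illustrative only: the period names and the `SwitchStates` type are hypothetical, and the states of M38 and M78 after the end of the performance are assumed to be off, which the text does not state explicitly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SwitchStates:
    m40: bool  # performance data -> musical tone generator M16
    m38: bool  # scoring signal -> singing adjustment unit M37
    m78: bool  # voice recognition signal -> lyrics adjustment unit M77


def switch_states(period: str) -> SwitchStates:
    """Return the on/off state of each switch for a given period.

    Period names are hypothetical labels for the periods in FIG. 2.
    """
    if period in ("before", "after"):
        # No performance: M40 is off. M38/M78 off after the end is an assumption.
        return SwitchStates(m40=False, m38=False, m78=False)
    if period in ("prelude", "interlude"):
        # No singing: feedback paths are closed so the gains can be adjusted.
        return SwitchStates(m40=True, m38=True, m78=True)
    if period == "singing":
        # Singing: feedback paths are opened, so the gains stay fixed.
        return SwitchStates(m40=True, m38=False, m78=False)
    raise ValueError(f"unknown period: {period}")
```

The gain adjustment paths are therefore active only while the microphone is expected to pick up accompaniment without a singing voice.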

そして、歌唱期間ではないと判断した場合、すなわち前奏期間または間奏期間と判断した場合には、スイッチＭ３８を制御してオン状態にさせ、歌唱用差分抽出部Ｍ３４から歌唱用調整部Ｍ３７へ入力された採点用信号の信号レベルが最小になるように歌唱用調整部Ｍ３７は、歌唱用可変利得アンプＭ３２へ利得を指示する（図２参照）。したがって、楽曲データから歌唱期間ではないと判断した場合、すなわち前奏期間または間奏期間には、採点用信号の信号レベルが最小になるように調整される。 When it is determined that the current period is not a singing period, that is, that it is a prelude or interlude period, the switch M38 is turned on, and the singing adjustment unit M37 instructs the singing variable gain amplifier M32 to use the gain that minimizes the signal level of the scoring signal input from the singing difference extraction unit M34 to the singing adjustment unit M37 (see FIG. 2). Consequently, whenever the music data indicates that the period is not a singing period, that is, during a prelude or interlude period, the gain is adjusted so that the signal level of the scoring signal is at its minimum.
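One way to picture this adjustment is as a least-squares fit. The Python sketch below is not the method of the patent, which does not say how the minimizing gain is found; it assumes that the signal generation unit subtracts the amplified accompaniment from the microphone signal, and that during a prelude or interlude the microphone picks up only accompaniment leaking from the speaker. All function names are hypothetical.

```python
def least_squares_gain(mic, accompaniment):
    """Gain g that minimizes the energy of (mic - g * accompaniment)."""
    numerator = sum(m * a for m, a in zip(mic, accompaniment))
    denominator = sum(a * a for a in accompaniment)
    return numerator / denominator if denominator else 0.0


def residual(mic, accompaniment, gain):
    """Assumed form of the generated signal: mic minus scaled accompaniment."""
    return [m - gain * a for m, a in zip(mic, accompaniment)]


def level(signal):
    """Mean-square level of a block of samples."""
    return sum(x * x for x in signal) / len(signal)


# During a non-singing period the microphone hears only leaked accompaniment.
accompaniment = [1.0, -2.0, 3.0, -1.0]
mic = [0.5 * a for a in accompaniment]  # leakage at half amplitude
gain = least_squares_gain(mic, accompaniment)
print(gain, level(residual(mic, accompaniment, gain)))
```

Once the singing period starts, the same gain would be held fixed, so that what remains after the subtraction is dominated by the singing voice.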

一方、歌唱期間であると判断した場合には、スイッチＭ３８を制御してオフ状態にさせるので、歌唱用可変利得アンプＭ３２は、歌唱用調整部Ｍ３７によって利得を指示されず、歌唱用可変利得アンプＭ３２からの出力信号レベルは固定される（図２参照）。そして、歌唱用信号生成部Ｍ３０によって生成された採点用信号を入力されたデータ抽出部Ｍ６は例えば５０ｍｓ毎にこのディジタル化された採点用信号から音高を割り出し、音高データとして歌唱用比較部Ｍ８に入力する。歌唱用比較部Ｍ８では、データ抽出部Ｍ６から入力される採点用信号の音高データとシーケンサＭ１４から入力されるガイドメロディの音高データ（以下、単にリファレンスともいう）とをリアルタイムに比較する。なお、５０ｍｓは１２０のメトロノームテンポで３２分音符に相当し、歌唱の特徴を抽出するために十分な分解能である。 On the other hand, if it is determined that it is the singing period, the switch M38 is controlled to be turned off, so that the singing variable gain amplifier M32 is not instructed by the singing adjustment unit M37, and the singing variable gain amplifier The output signal level from M32 is fixed (see FIG. 2). Then, the data extraction unit M6 to which the scoring signal generated by the singing signal generation unit M30 is input calculates the pitch from the digitized scoring signal every 50 ms, for example, and the singing comparison unit is used as pitch data. Input to M8. The singing comparison unit M8 compares the pitch data of the scoring signal input from the data extraction unit M6 with the pitch data of the guide melody (hereinafter also simply referred to as reference) input from the sequencer M14 in real time. Note that 50 ms corresponds to a thirty-second note with a metronome tempo of 120, and has a sufficient resolution for extracting the characteristics of singing.
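The relation between the 50 ms interval and note length quoted above can be checked with a short calculation. The Python sketch below assumes the usual convention that the metronome tempo counts quarter notes per minute; the function name is hypothetical. Under that assumption a thirty-second note at tempo 120 lasts 62.5 ms, and 50 ms is exactly a thirty-second note at tempo 150, so the 50 ms interval is at least as fine as a thirty-second note at tempo 120.

```python
def note_duration_ms(tempo_bpm: float, notes_per_whole: int) -> float:
    """Duration of a 1/notes_per_whole note, with tempo in quarter notes per minute."""
    quarter_ms = 60_000.0 / tempo_bpm
    return quarter_ms * 4.0 / notes_per_whole


print(note_duration_ms(120, 32))  # thirty-second note at tempo 120
print(note_duration_ms(150, 32))  # thirty-second note at tempo 150
```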

歌唱用比較部Ｍ８では、採点用信号が入力されるタイミングにあわせて５０ｍｓ毎に採点用信号とリファレンスとの差を算出する。これはリアルタイムの差分データ（音高差分データ）として採点部Ｍ２４へ出力される。採点部Ｍ２４では、この音高差分データをカラオケ曲の区間毎に集計しデータ格納部Ｍ１０へ記憶する。そして採点部Ｍ２４では、各区間の差分データの集計を行って採点結果を求める。 The singing comparison unit M8 calculates the difference between the scoring signal and the reference every 50 ms in accordance with the timing at which the scoring signal is input. This is output to the scoring unit M24 as real-time difference data (pitch difference data). In the scoring unit M24, the pitch difference data is totaled for each section of the karaoke music and stored in the data storage unit M10. And in the scoring part M24, the difference data of each section is totaled and a scoring result is obtained.

また、歌唱期間ではないと判断した場合、すなわち前奏期間または間奏期間と判断した場合には、スイッチＭ７８を制御してオン状態にさせ、歌詞用差分抽出部Ｍ７４から歌詞用調整部Ｍ７７へ入力された音声認識用信号の信号レベルが最小になるように歌詞用調整部Ｍ７７は、歌詞用可変利得アンプＭ７２へ利得を指示する（図２参照）。したがって、楽曲データから歌唱期間ではないと判断した場合、すなわち前奏期間または間奏期間には、音声認識用信号の信号レベルが最小になるように調整される。 Likewise, when it is determined that the current period is not a singing period, that is, that it is a prelude or interlude period, the switch M78 is turned on, and the lyrics adjustment unit M77 instructs the lyrics variable gain amplifier M72 to use the gain that minimizes the signal level of the voice recognition signal input from the lyrics difference extraction unit M74 to the lyrics adjustment unit M77 (see FIG. 2). Consequently, whenever the music data indicates that the period is not a singing period, that is, during a prelude or interlude period, the gain is adjusted so that the signal level of the voice recognition signal is at its minimum.

一方、歌唱期間であると判断した場合には、スイッチＭ７８を制御してオフ状態にさせるので、歌詞用可変利得アンプＭ７２は、歌詞用調整部Ｍ７７によって利得を指示されず、歌詞用可変利得アンプＭ７２からの出力信号レベルは固定される（図２参照）。そして、歌詞用信号生成部Ｍ７０によって生成された音声認識用信号を入力された音声認識部Ｍ７６は歌詞を音声認識し、歌詞データとしてデータ格納部Ｍ１０へ記憶される。判定部Ｍ８０は、データ格納部Ｍ１０へ記憶されている楽曲データから選曲された楽曲の特定歌詞データを抽出して表示部Ｍ２６へ表示し、所定区間ごとにデータ格納部Ｍ１０へ記憶されている採点結果が所定値例えば８０点以上あると判定した場合には、データ格納部Ｍ１０へ記憶されている音声認識された歌詞データを読み出し、読み出された音声認識された歌詞データと前記抽出された歌詞データとを比較して、同じ歌詞データがあると判定した場合には、シーケンサＭ１４を制御してカラオケ演奏を停止させる。 On the other hand, when it is determined that it is the singing period, the switch M78 is controlled to be turned off, so that the lyrics variable gain amplifier M72 is not instructed by the lyrics adjustment unit M77 to have a gain and the lyrics variable gain amplifier. The output signal level from M72 is fixed (see FIG. 2). Then, the voice recognition unit M76, to which the speech recognition signal generated by the lyrics signal generation unit M70 is input, recognizes the lyrics and stores them in the data storage unit M10 as lyrics data. The determination unit M80 extracts specific lyric data of the selected music from the music data stored in the data storage unit M10 and displays it on the display unit M26, and the scoring stored in the data storage unit M10 for each predetermined section. When it is determined that the result is a predetermined value, for example, 80 points or more, the speech-recognized lyrics data stored in the data storage unit M10 is read, and the speech-recognized lyrics data and the extracted lyrics are read When it is determined that there is the same lyric data by comparing with the data, the sequencer M14 is controlled to stop the karaoke performance.

なお、カラオケ装置の機能を中心とした概略構成を示す図１においては、マイクＭ２が「音声信号入力手段」に相当し、データ格納部Ｍ１０が「楽曲データ記憶手段」及び「音声認識記憶手段」に相当する。また、シーケンサＭ１４が「カラオケ演奏再生手段」に相当し、歌唱用信号生成部Ｍ３０が「第２の生成手段」に相当し、データ抽出部Ｍ６が「音高抽出手段」に相当する。そして、スイッチ制御部Ｍ３６と、スイッチＭ３８と、スイッチＭ４０と、歌唱用差分抽出部Ｍ３４と、可変利得アンプＭ３２と、歌唱用調整部Ｍ３７と、が「第２の利得設定手段」に相当する。また、採点部Ｍ２４が「歌唱採点手段」に相当する。また、スピーカＭ２０が「スピーカ」に相当し、表示部Ｍ２６が「表示手段」に相当する。 In FIG. 1, which shows a schematic configuration centered on the functions of the karaoke apparatus, the microphone M2 corresponds to the “audio signal input means”, and the data storage unit M10 corresponds to the “music data storage means” and the “voice recognition storage means”. The sequencer M14 corresponds to the “karaoke performance reproduction means”, the singing signal generation unit M30 to the “second generation means”, and the data extraction unit M6 to the “pitch extraction means”. The switch control unit M36, the switch M38, the switch M40, the singing difference extraction unit M34, the variable gain amplifier M32, and the singing adjustment unit M37 together correspond to the “second gain setting means”. The scoring unit M24 corresponds to the “singing scoring means”. The speaker M20 corresponds to the “speaker”, and the display unit M26 to the “display means”.

また、歌詞用信号生成部Ｍ７０が「第１の生成手段」に相当し、音声認識部Ｍ７６が「音声認識手段」に相当する。そして、スイッチ制御部Ｍ３６と、スイッチＭ７８と、スイッチＭ４０と、歌詞用差分抽出部Ｍ７４と、歌詞用可変利得アンプＭ７２と、歌詞用調整部Ｍ７７と、が「第１の利得設定手段」に相当する。また、判定部Ｍ８０が「制御手段」に相当する。 The lyrics signal generation unit M70 corresponds to the “first generation means”, and the voice recognition unit M76 to the “voice recognition means”. The switch control unit M36, the switch M78, the switch M40, the lyrics difference extraction unit M74, the lyrics variable gain amplifier M72, and the lyrics adjustment unit M77 together correspond to the “first gain setting means”. The determination unit M80 corresponds to the “control means”.

次に、図３を参照して、採点用信号、リファレンスについて説明する。図３に例示する点線はリファレンスであるガイドメロディを音高データ化したものであり、一般的なガイドメロディのデータは機械的に非常に正確なものである。これに対して、図３に例示する実線は採点用信号を音高データ化したものを示している。採点用信号の音高データはガイドメロディの音高データが示す値から上下に変動しており、前奏及び間奏においては採点用信号の音高データはない。 Next, the scoring signal and reference will be described with reference to FIG. The dotted line illustrated in FIG. 3 is obtained by converting the guide melody that is a reference into pitch data, and general guide melody data is mechanically very accurate. On the other hand, the solid line illustrated in FIG. 3 shows the scoring signal converted to pitch data. The pitch data of the scoring signal varies up and down from the value indicated by the pitch data of the guide melody, and there is no pitch data of the scoring signal in the prelude and interlude.

図１ではカラオケ装置の機能を中心とした概略構成を示したが、図４は同カラオケ装置の具体的なハード構成を示すブロック図である。 Whereas FIG. 1 shows a schematic configuration centered on the functions of the karaoke apparatus, FIG. 4 is a block diagram showing a specific hardware configuration of the same karaoke apparatus.
［カラオケ装置１の構成の説明］ [Description of configuration of karaoke apparatus 1]
図４は、カラオケ装置１の構成を示すブロック図である。カラオケ装置１は、図４に示すように、カラオケ装置１全体の動作を制御する制御部１２、カラオケ装置１をネットワーク１００に接続するためのインタフェース部１４、演奏楽曲の伴奏内容および歌詞を示す楽曲データや映像データなどを記憶するハードディスク（ＨＤＤ）１６、複数のキー・スイッチからなる操作部１８、リモコン端末２や携帯電話からの赤外線信号を赤外線通信によって受信するための赤外線通信部２０、操作部１８からの信号を処理する操作処理部２２、ハードディスク１６に記憶された楽曲データから演奏楽曲のオーディオ信号（音響，音声に関する信号）を生成し、生成されたオーディオ信号及びマイク２５から入力されたオーディオ信号を増幅してスピーカ２８へ出力する音声制御部２４、音声認識部１０、ＭＩＤＩ音源３０、映像情報を一時的に記憶するビデオＲＡＭ３２、映像データに基づく映像の再生を制御する映像再生部３４、ビデオＲＡＭ３２に記憶された映像情報および映像再生部３４により再生される映像の表示部３６での表示を制御する映像制御部３８などを備えている。 FIG. 4 is a block diagram showing the configuration of the karaoke apparatus 1. As shown in FIG. 4, the karaoke apparatus 1 includes: a control unit 12 that controls the overall operation of the karaoke apparatus 1; an interface unit 14 for connecting the karaoke apparatus 1 to a network 100; a hard disk (HDD) 16 that stores music data indicating the accompaniment content and lyrics of the songs to be performed, video data, and the like; an operation unit 18 comprising a plurality of keys and switches; an infrared communication unit 20 for receiving infrared signals from a remote control terminal 2 or a mobile phone by infrared communication; an operation processing unit 22 that processes signals from the operation unit 18; a voice control unit 24 that generates an audio signal (a signal relating to sound and voice) of the song to be performed from the music data stored in the hard disk 16, amplifies the generated audio signal and the audio signal input from a microphone 25, and outputs them to a speaker 28; a voice recognition unit 10; a MIDI sound source 30; a video RAM 32 that temporarily stores video information; a video playback unit 34 that controls playback of video based on video data; and a video control unit 38 that controls the display, on a display unit 36, of the video information stored in the video RAM 32 and of the video played back by the video playback unit 34.

そして、制御部１２、インタフェース部１４、ＨＤＤ１６、赤外線通信部２０、操作処理部２２、ビデオＲＡＭ３２、映像再生部３４、映像制御部３８は、各々バス３９によって接続されている。また、制御部１２と音声制御部２４とはＵＳＢ４０によって接続されている。なお、制御部１２及び音声制御部２４は、後述する各種処理を実行する。 The control unit 12, the interface unit 14, the HDD 16, the infrared communication unit 20, the operation processing unit 22, the video RAM 32, the video playback unit 34, and the video control unit 38 are connected to one another by a bus 39. The control unit 12 and the voice control unit 24 are connected by a USB 40. The control unit 12 and the voice control unit 24 execute the various processes described later.

このうち、ＨＤＤ１６には、図５（ａ）に例示するように、楽曲データを記憶する楽曲データメモリ領域５０、楽曲データから抽出した単語データを記憶する単語データメモリ領域５２及び音高差のデータに応じた採点情報を記憶する採点情報メモリ領域５４が設けられている。楽曲データメモリ領域５０に記憶されている楽曲データは、図６（ａ）に例示するようにヘッダ情報、ＭＩＤＩデータ、タイトルデータ及び歌詞で使用されている単語データを有している。そして、ヘッダ情報は、ＭＩＤＩデータサイズ、タイトルデータサイズ及び単語データサイズを有している。また、歌詞で使用されている単語データのデータ構造は、登録単語数、１番目の単語サイズ、１番目の単語、２番目単語サイズ、２番目の単語と続き、ｎ番目の単語サイズ、ｎ番目の単語へと続いているデータ構造となっている。一例を挙げると、図６（ｂ）に例示するように、登録単語数に対応する「３単語」を示すデータ、１番目の単語サイズに対応する「８ｂｙｔｅ」を示すデータ、１番目の単語に対応する「あいどる」を示すデータ、２番目単語サイズに対応する「６ｂｙｔｅ」を示すデータ、２番目の単語に対応する「すてき」を示すデータ、３番目の単語サイズに対応する「１０ｂｙｔｅ」を示すデータ、３番目の単語に対応する「あいしてる」へと続くデータ構造となっている。 Among these, as illustrated in FIG. 5A, the HDD 16 stores a song data memory area 50 for storing song data, a word data memory area 52 for storing word data extracted from the song data, and pitch difference data. There is provided a scoring information memory area 54 for storing scoring information according to. The music data stored in the music data memory area 50 includes word data used in header information, MIDI data, title data, and lyrics, as illustrated in FIG. The header information has a MIDI data size, a title data size, and a word data size. The data structure of the word data used in the lyrics includes the number of registered words, the first word size, the first word, the second word size, the second word, the nth word size, and the nth word. The data structure continues to the word. For example, as shown in FIG. 6B, data indicating “3 words” corresponding to the registered word count, data indicating “8 bytes” corresponding to the first word size, Corresponding data indicating “Idoru”, data indicating “6 bytes” corresponding to the second word size, data indicating “nice” corresponding to the second word, and “10 bytes” corresponding to the third word size The data structure continues to “I love you” corresponding to the third word.
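The word data layout of FIG. 6(b) is a count followed by size-prefixed entries. The Python sketch below packs and unpacks such a record under stated assumptions: the count and size fields are taken to be one byte each, and the text is taken to be Shift_JIS, which is consistent with the example sizes (two bytes per hiragana character) but is not stated in the text. The function names are hypothetical.

```python
def pack_words(words, encoding="shift_jis"):
    """Serialize words as: count, then (size, bytes) for each word."""
    out = bytearray([len(words)])      # registered word count (assumed 1 byte)
    for word in words:
        raw = word.encode(encoding)
        out.append(len(raw))           # word size in bytes (assumed 1 byte)
        out += raw                     # the word itself
    return bytes(out)


def unpack_words(blob, encoding="shift_jis"):
    """Read back the words from a blob produced by pack_words."""
    count = blob[0]
    pos = 1
    words = []
    for _ in range(count):
        size = blob[pos]
        pos += 1
        words.append(blob[pos:pos + size].decode(encoding))
        pos += size
    return words


blob = pack_words(["あいどる", "すてき", "あいしてる"])
print(blob[0], blob[1], unpack_words(blob))
```

With these assumptions the three example words occupy 8, 6, and 10 bytes, matching the sizes given for FIG. 6(b).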

また、楽曲データが有するＭＩＤＩデータは、図５（ｂ）に例示するように、楽曲トラック、ガイドメロディトラック及び制御データトラックを有している。楽曲トラックには、メロディトラック、リズムトラックを初めとして種々のパートのトラックが形成されている。ガイドメロディトラックには、カラオケ曲の旋律すなわち歌唱者が歌うべき旋律のシーケンスデータを記憶している。制御データトラックには、カラオケ曲の歌唱区間の開始点を示す区間分割データを記憶している。 The MIDI data included in the music data has a music track, a guide melody track, and a control data track as illustrated in FIG. 5B. In the music track, a track of various parts including a melody track and a rhythm track is formed. The guide melody track stores the melody sequence data of the karaoke song, that is, the melody that the singer should sing. The control data track stores section division data indicating the starting point of the singing section of the karaoke song.

採点情報メモリ領域５４に記憶されている音高差のデータに応じた採点情報は、図１５に例示するように音高差のデータと採点とを関連付けたデータ構造を有している。一例を挙げると、音高差のデータとしての「０セミトーン」に対して、採点としての「１００点」を関連付けている。 The scoring information stored in the scoring information memory area 54 has a data structure that associates pitch difference data with scores, as illustrated in FIG. 15. For example, a pitch difference of “0 semitones” is associated with a score of “100 points”.

また、音声制御部２４が内蔵するＲＯＭ（図示なし）には、図５（ｃ）に例示するように、カラオケ歌唱の歌詞記録エリア６０、ガイドメロディバッファ６２、リファレンスデータレジスタ６４及び差分データ記憶エリア６６が設けられている。カラオケ歌唱の歌詞記録エリア６０には、音声認識された歌詞を記憶する。ガイドメロディバッファ６２には、読み出されたガイドメロディデータを一時記憶する。リファレンスデータレジスタ６４には、このガイドメロディデータから抽出されたリファレンス（つまり、ガイドメロディの音高データ）を記憶する。差分データ記憶エリア６６には、リファレンスと歌唱音声との差分データを記憶する。なお、リファレンスデータレジスタ６４は音高データレジスタからなっており、差分データ記憶エリア６６は音高差データ記憶エリアからなっている。 Further, in the ROM (not shown) built in the voice control unit 24, as illustrated in FIG. 5C, the lyrics recording area 60 for the karaoke song, the guide melody buffer 62, the reference data register 64, and the difference data storage area. 66 is provided. In the lyrics recording area 60 of the karaoke song, the speech-recognized lyrics are stored. The guide melody buffer 62 temporarily stores the read guide melody data. The reference data register 64 stores a reference extracted from the guide melody data (that is, pitch data of the guide melody). The difference data storage area 66 stores difference data between the reference and the singing voice. The reference data register 64 is a pitch data register, and the difference data storage area 66 is a pitch difference data storage area.

なお、本実施形態においては、マイク２５が「音声信号入力手段」に相当し、ＨＤＤ１６が「楽曲データ記憶手段」に相当し、音声制御部２４が内蔵するＲＯＭが「音声認識記憶手段」に相当する。また、音声制御部２４が「カラオケ演奏再生手段」、「第１の生成手段」、「第２の生成手段」、「音高抽出手段」、「第１の利得設定手段」及び「第２の利得設定手段」に相当し、制御部１２が「制御手段」、「歌唱採点手段」に相当する。また、スピーカ２８が「スピーカ」に相当し、音声認識部１０が「音声認識手段」に相当し、表示部３６が「表示手段」に相当する。 In the present embodiment, the microphone 25 corresponds to the “audio signal input means”, the HDD 16 to the “music data storage means”, and the ROM built into the voice control unit 24 to the “voice recognition storage means”. The voice control unit 24 corresponds to the “karaoke performance reproduction means”, “first generation means”, “second generation means”, “pitch extraction means”, “first gain setting means”, and “second gain setting means”, and the control unit 12 corresponds to the “control means” and the “singing scoring means”. The speaker 28 corresponds to the “speaker”, the voice recognition unit 10 to the “voice recognition means”, and the display unit 36 to the “display means”.

［制御部１２のＮＧ単語ゲーム処理の説明］ [Description of NG word game process of control unit 12]
以下に、カラオケ装置１の制御部１２が実行する「制御部１２のＮＧ単語ゲーム処理」の手順を図７、図８のフローチャートに基づいて説明する。 The procedure of the “NG word game process of the control unit 12” executed by the control unit 12 of the karaoke apparatus 1 is described below with reference to the flowcharts of FIGS. 7 and 8.

なお、以下の説明においては、ユーザ（カラオケ歌唱者）によってカラオケ曲が選曲されている状態とする。具体的には、操作部１８で受け付けたカラオケ曲の選曲番号のデータは操作処理部２２によって制御部１２へ送信されるのであるが、制御部１２は、選曲番号のデータを受信し、その選曲番号のデータを音声制御部２４へ送信している状態とする。 In the following description, it is assumed that a karaoke song is selected by a user (karaoke singer). Specifically, the music selection number data of the karaoke song received by the operation unit 18 is transmitted to the control unit 12 by the operation processing unit 22, but the control unit 12 receives the music selection number data and receives the music selection number. It is assumed that the number data is being transmitted to the voice control unit 24.

操作部１８で受け付けたゲーム開始指示のデータは操作処理部２２によって制御部１２へ送信されるのであるが、制御部１２は、ゲーム開始指示のデータを受信したか否かを判断する（Ｓ１１０）。そして、操作処理部２２から送信されたゲーム開始指示のデータを受信すると（Ｓ１１０：ＹＥＳ）、選曲されているカラオケ曲の選曲番号に対応する楽曲データをＨＤＤ１６に設けられている図５（ａ）に例示する楽曲データメモリ領域５０から読み出す（Ｓ１１４）。そして、読み出された楽曲データから図６（ａ）に例示するタイトルデータと歌詞で使用されている単語データを読み出し、読み出された前記単語データを単語データメモリ領域５２へ記憶する（Ｓ１１４）。Ｓ１１４の処理が終了したらＳ１１６の処理を実行する。 The game start instruction data received by the operation unit 18 is transmitted to the control unit 12 by the operation processing unit 22, and the control unit 12 determines whether it has received the game start instruction data (S110). On receiving the game start instruction data transmitted from the operation processing unit 22 (S110: YES), it reads the music data corresponding to the song selection number of the selected karaoke song from the music data memory area 50, illustrated in FIG. 5(a), that is provided in the HDD 16 (S114). It then reads from that music data the title data and the word data used in the lyrics, illustrated in FIG. 6(a), and stores the word data in the word data memory area 52 (S114). When the process of S114 is completed, the process of S116 is executed.

Ｓ１１６の処理においては、Ｓ１１４の処理において読み出されたタイトルを表示部３６へ表示する。具体的には、図１６（ａ）に例示するように、Ｓ１１４の処理において読み出されたタイトルに対応する「赤な女子」などを表示部３６へ表示するように映像制御部３８を制御する。次に、ＮＧ単語を表示部３６へ表示し、ＮＧ単語データを音声制御部２４へ送信する（Ｓ１１８）。具体的には、Ｓ１１４の処理において単語データメモリ領域５２へ記憶された単語データからＮＧ単語データを選定し、図１６（ｂ）に例示するように、選定されたＮＧ単語に対応する「あいしてる」などを表示部３６へ表示するように映像制御部３８を制御する。そして、ＮＧ単語データを音声制御部２４へ送信する。Ｓ１１８の処理が終了したらＳ１２０の処理を実行する。 In the process of S116, the title read in the process of S114 is displayed on the display unit 36. Specifically, as illustrated in FIG. 16(a), the video control unit 38 is controlled so that the title read in the process of S114, for example “red girls”, is displayed on the display unit 36. Next, the NG word is displayed on the display unit 36 and the NG word data is transmitted to the voice control unit 24 (S118). Specifically, NG word data is selected from the word data stored in the word data memory area 52 in the process of S114, and, as illustrated in FIG. 16(b), the video control unit 38 is controlled so that the selected NG word, for example “I love you”, is displayed on the display unit 36. The NG word data is then transmitted to the voice control unit 24. When the process of S118 is completed, the process of S120 is executed.

Ｓ１２０の処理においては、カラオケ演奏開始信号を音声制御部２４へ送信する。そして、音声認識開始信号を音声制御部２４へ送信する（Ｓ１２２）。さらに、区間歌唱採点開始信号を音声制御部２４へ送信する（Ｓ１２４、図８参照）。Ｓ１２４の処理が終了したらＳ１２６の処理を実行する。 In the process of S120, a karaoke performance start signal is transmitted to the voice control unit 24. Then a voice recognition start signal is transmitted to the voice control unit 24 (S122). Further, a section singing scoring start signal is transmitted to the voice control unit 24 (S124, see FIG. 8). When the process of S124 is completed, the process of S126 is executed.

さて、比較結果のデータは音声制御部２４から制御部１２へ送信される（この送信処理については後述する）のであるが、制御部１２は、比較結果のデータを受信したか否かを判断する（Ｓ１２６）。そして、音声制御部２４から比較結果を受信した場合には（Ｓ１２６：ＹＥＳ）、Ｓ１２８の処理を実行する。一方、音声制御部２４から比較結果を受信しない場合には（Ｓ１２６：ＮＯ）、Ｓ１３８の処理を実行する。 The comparison result data is transmitted from the voice control unit 24 to the control unit 12 (this transmission process will be described later). The control unit 12 determines whether or not the comparison result data has been received. (S126). When the comparison result is received from the voice control unit 24 (S126: YES), the process of S128 is executed. On the other hand, when the comparison result is not received from the voice control unit 24 (S126: NO), the process of S138 is executed.

Ｓ１２８の処理においては、ＮＧ単語を歌唱していたか否かを判断する。この判断基準は、例えばカラオケ歌唱の歌詞データ中のＮＧ単語データの数量が１つ以上あれば、ＮＧ単語を歌唱していたとする。そして、ＮＧ単語を歌唱していたと判断した場合には（Ｓ１２８：ＹＥＳ）、カラオケ演奏停止信号を音声制御部２４へ送信する（Ｓ１３０）。Ｓ１３０の処理が終了したらＳ１３２の処理を実行する。一方、ＮＧ単語を歌唱していないと判断した場合には（Ｓ１２８：ＮＯ）、Ｓ１３８の処理を実行する。 In the process of S128, it is determined whether or not an NG word has been sung. For example, it is assumed that if the number of NG word data in the lyrics data of karaoke singing is one or more, the NG word is sung. If it is determined that the NG word has been sung (S128: YES), a karaoke performance stop signal is transmitted to the voice control unit 24 (S130). When the process of S130 is completed, the process of S132 is executed. On the other hand, when it is determined that the NG word is not sung (S128: NO), the process of S138 is executed.

Ｓ１３２の処理においては、音声認識終了信号を音声制御部２４へ送信する。 In the process of S132, a voice recognition end signal is transmitted to the voice control unit 24.
また、ＮＧ単語歌詞画面のデータは音声制御部２４から制御部１２へ送信される（この送信処理については後述する）のであるが、制御部１２は、ＮＧ単語歌詞のデータを受信したか否かを判断する（Ｓ１３４）。そして、ＮＧ単語歌詞のデータを受信すると（Ｓ１３４：ＹＥＳ）、ＮＧ単語歌詞画面を表示部３６へ表示する（Ｓ１３６）。具体的には、図１６（ｃ）に例示するようにＮＧ単語を歌唱した直前までの歌詞に対応する「わたしのことを見た」などを表示部３６へ表示し、さらに図１６（ｅ）に例示するように「まだまだ！残念」などを表示部３６へ表示するように映像制御部３８を制御する。 The NG word lyrics screen data is transmitted from the voice control unit 24 to the control unit 12 (this transmission process is described later), and the control unit 12 determines whether it has received the NG word lyrics data (S134). On receiving the NG word lyrics data (S134: YES), it displays the NG word lyrics screen on the display unit 36 (S136). Specifically, the video control unit 38 is controlled so that the lyrics sung up to just before the NG word, for example “looked at me”, are displayed on the display unit 36 as illustrated in FIG. 16(c), and so that a message such as “Not yet! Too bad” is then displayed on the display unit 36 as illustrated in FIG. 16(e).

そして、Ｓ１３６の処理が終了したら、本「制御部１２の採点処理」は終了する。 When the process of S136 ends, this “scoring process of the control unit 12” ends.
Ｓ１３８の処理においては、音声制御部２４からカラオケ演奏終了信号を受信したか否かを判断する。そして、音声制御部２４からカラオケ演奏終了信号を受信しない場合には（Ｓ１３８：ＮＯ）、Ｓ１２４の処理へ戻り、上述した処理を実行する。一方、音声制御部２４からカラオケ演奏終了信号を受信した場合には（Ｓ１３８：ＹＥＳ）、音声認識終了信号を音声制御部２４へ送信する（Ｓ１４０）。そして、完唱結果を表示部３６へ表示する（Ｓ１４２）。具体的には、図１６（ｄ）に例示するように「やった！おめでとう」などを表示部３６へ表示するように映像制御部３８を制御する。 In the process of S138, it is determined whether a karaoke performance end signal has been received from the voice control unit 24. If no karaoke performance end signal has been received from the voice control unit 24 (S138: NO), the process returns to S124 and the above processing is repeated. If a karaoke performance end signal has been received from the voice control unit 24 (S138: YES), a voice recognition end signal is transmitted to the voice control unit 24 (S140). The result of singing the song through to the end is then displayed on the display unit 36 (S142). Specifically, as illustrated in FIG. 16(d), the video control unit 38 is controlled so that a message such as “You did it! Congratulations” is displayed on the display unit 36.

そして、Ｓ１４２の処理が終了したら、本「制御部１２の採点処理」は終了する。 When the process of S142 ends, this “scoring process of the control unit 12” ends.
［音声制御部２４のＮＧ単語ゲーム処理の説明］ [Description of NG word game process of voice control unit 24]
次に、カラオケ装置１の音声制御部２４が実行する「音声制御部２４のＮＧ単語ゲーム処理」の手順を図９〜図１４のフローチャートに基づいて説明する。このＮＧ単語ゲーム処理に関する動作プログラムは、カラオケ演奏を実行するシーケンスプログラムと並行して実行され、シーケンスプログラムとのデータの交換も行われる。なお、以下の説明においては、制御部１２からカラオケ曲の選曲番号のデータを受信している状態とする。 Next, the procedure of the “NG word game process of the voice control unit 24” executed by the voice control unit 24 of the karaoke apparatus 1 is described with reference to the flowcharts of FIGS. 9 to 14. The operation program for this NG word game process is executed in parallel with the sequence program that executes the karaoke performance, and data is also exchanged with the sequence program. In the following description, it is assumed that the song selection number data of the karaoke song has already been received from the control unit 12.

まず、音声制御部２４は制御部１２から送信されたＮＧ単語データを受信したか否かを判断する（Ｓ２１０）。そして、ＮＧ単語データを受信すると（Ｓ２１０：ＹＥＳ）、受信したＮＧ単語データを音声制御部２４が有するメモリ（図示せず）へ記憶し、Ｓ２１１の処理を実行する。 First, the voice control unit 24 determines whether or not the NG word data transmitted from the control unit 12 has been received (S210). When NG word data is received (S210: YES), the received NG word data is stored in a memory (not shown) of the voice control unit 24, and the process of S211 is executed.

Ｓ２１１の処理においては、音声制御部２４は制御部１２から送信されたカラオケ演奏開始信号を受信したか否かを判断する。そして、カラオケ演奏開始信号を受信すると（Ｓ２１１：ＹＥＳ）、カラオケ曲の選曲番号に対応する楽曲データを再生し、カラオケ演奏を開始する（Ｓ２１２）。 In the process of S211, the voice control unit 24 determines whether or not the karaoke performance start signal transmitted from the control unit 12 has been received. When the karaoke performance start signal is received (S211: YES), the music data corresponding to the music selection number of the karaoke music is reproduced and the karaoke performance is started (S212).

次に、音声制御部２４は制御部１２から送信された音声認識開始信号を受信したか否かを判断する（Ｓ２１４）。そして、音声認識開始信号を受信すると（Ｓ２１４：ＹＥＳ）、カラオケ歌唱の歌詞の音声認識を開始する（Ｓ２１６）。具体的には、音声制御部２４が音声認識部１０を制御してカラオケ歌唱の歌詞を音声認識させる。そして、音声認識させたカラオケ歌唱の歌詞を音声制御部２４が内蔵するＲＯＭに設けられているカラオケ歌唱の歌詞記録エリア６０へ記憶する。 Next, the voice control unit 24 determines whether or not the voice recognition start signal transmitted from the control unit 12 has been received (S214). When the voice recognition start signal is received (S214: YES), voice recognition of the lyrics of the karaoke song is started (S216). Specifically, the voice control unit 24 controls the voice recognition unit 10 to recognize the karaoke song lyrics. Then, the lyrics of the karaoke song that has been voice-recognized are stored in the karaoke song lyrics recording area 60 provided in the ROM built in the voice control unit 24.

そして、音声制御部２４は制御部１２から送信された区間歌唱採点開始信号を受信したか否かを判断する（Ｓ２１８）。そして、区間歌唱採点開始信号を受信すると（Ｓ２１８：ＹＥＳ）、リファレンスカウンタ（全体）の初期化を行なう（Ｓ２２０）。 The voice control unit 24 then determines whether it has received the section singing scoring start signal transmitted from the control unit 12 (S218). On receiving the section singing scoring start signal (S218: YES), it initializes the reference counter (overall) (S220).

次に、前奏期間または間奏期間の歌唱されない期間か否かを判断する（Ｓ２２２、図１０参照）。このＳ２２２の判断は、カラオケ演奏を実行するシーケンスプログラムから受け渡される制御データトラックに記憶されているカラオケ曲の前奏期間、歌唱期間、及び間奏期間の開始点を示す区間分割データによって前奏期間または間奏期間、すなわち歌唱されない期間か否かを判断する。そして、前奏期間または間奏期間でない場合、すなわち歌唱期間の場合（Ｓ２２２：ＮＯ）には、後述する歌唱採点の比較処理を実行する（Ｓ２２４）。一方、前奏期間または間奏期間の場合（Ｓ２２２：ＹＥＳ）には、後述する歌唱採点の調整処理を実行する（Ｓ２２６）とともに、後述する音声認識の調整処理を実行する（Ｓ２２８）。そして、歌唱採点の比較処理を実行した場合（Ｓ２２４）、歌唱採点の調整処理を実行した場合（Ｓ２２６）、もしくは音声認識の調整処理を実行した場合（Ｓ２２８）には、区間歌唱採点が終了したか否かを判断する（Ｓ２３０）。そして、区間歌唱採点が終了していない場合（Ｓ２３０：ＮＯ）には、Ｓ２２２へ戻り、上述した処理を実行する。 Next, it is determined whether the current period is a prelude or interlude period in which no singing takes place (S222, see FIG. 10). This determination is made from the section division data, stored in the control data track delivered from the sequence program executing the karaoke performance, which indicates the start points of the prelude, singing, and interlude periods of the karaoke song. If the period is not a prelude or interlude period, that is, if it is a singing period (S222: NO), the singing scoring comparison process described later is executed (S224). If it is a prelude or interlude period (S222: YES), the singing scoring adjustment process described later is executed (S226), together with the voice recognition adjustment process described later (S228). After the singing scoring comparison process (S224), the singing scoring adjustment process (S226), or the voice recognition adjustment process (S228) has been executed, it is determined whether the section singing scoring has finished (S230). If it has not finished (S230: NO), the process returns to S222 and the above processing is repeated.

一方、区間歌唱採点が終了した場合（Ｓ２３０：ＹＥＳ）には、後述する歌唱採点の比較処理によって音声制御部２４が内蔵するＲＯＭ（図示なし）の差分データ記憶エリア６６へ記憶された音高差データから全体の音高差分データを取り出し（Ｓ２３２）、全体の音高差分データの合計をリファレンスカウンタ値で割って音高差分データを平均化する（Ｓ２３４）。そして、ＨＤＤ１６の採点情報メモリ領域５４に記憶されている音高差のデータに応じた歌唱採点情報（図１５参照）を用いて音高差のデータに応じて歌唱採点する（Ｓ２３６）。Ｓ２３６の処理が終了したらＳ２３８（図１１参照）の処理を実行する。 On the other hand, when the section singing is finished (S230: YES), the pitch difference stored in the difference data storage area 66 of the ROM (not shown) built in the voice control unit 24 by the singing scoring comparison process described later. The entire pitch difference data is extracted from the data (S232), and the total pitch difference data is divided by the reference counter value to average the pitch difference data (S234). Then, using the singing score information (see FIG. 15) corresponding to the pitch difference data stored in the scoring information memory area 54 of the HDD 16, singing is scored according to the pitch difference data (S236). When the process of S236 is completed, the process of S238 (see FIG. 11) is executed.
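Steps S232 to S236 amount to averaging the stored pitch differences over the reference counter value and looking the average up in a scoring table. The sketch below assumes a table layout, since the contents of FIG. 15 are not reproduced in this section; the thresholds, the use of absolute differences, and the names `SCORING_TABLE` and `section_score` are all hypothetical.

```python
# Hypothetical scoring table: (largest average pitch difference, score).
# The actual singing score information of FIG. 15 is not shown in this section.
SCORING_TABLE = [(10, 100), (25, 90), (50, 80), (100, 60)]


def section_score(pitch_differences, reference_count, table=SCORING_TABLE):
    """Average the pitch differences (S234) and map the average to a score (S236)."""
    if reference_count == 0:
        return 0  # assumption: no reference updates means nothing to score
    # Assumption: magnitudes are averaged so that sharp and flat errors do not cancel.
    average = sum(abs(d) for d in pitch_differences) / reference_count
    for limit, score in table:
        if average <= limit:
            return score
    return 0
```

For example, differences of 20, -30, and 40 over three reference updates average to 30, which this hypothetical table maps to 80 points.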

Ｓ２３８の処理においては、歌唱採点の結果が８０点以上か否かを判断する。この歌唱採点の結果を判断する点数は、この「８０点」には限らない。音声認識の認識率の低下を防止できる点数に設定するとよい。そして、歌唱採点の結果が８０点以上でない場合には（Ｓ２３８：ＮＯ）、Ｓ２５４の処理を実行する。一方、歌唱採点の結果が８０点以上である場合には（Ｓ２３８：ＹＥＳ）、Ｓ２４０の処理を実行する。 In the process of S238, it is determined whether or not the result of the singing score is 80 points or more. The score for judging the result of singing is not limited to “80 points”. It is better to set the score that can prevent the speech recognition rate from being lowered. And when the result of a singing score is not 80 points or more (S238: NO), the process of S254 is performed. On the other hand, when the result of the singing score is 80 points or more (S238: YES), the process of S240 is executed.

Ｓ２４０の処理においては、音声認識した歌詞データとＮＧ単語データとを比較する。具体的には、音声制御部２４が内蔵するＲＯＭに設けられているカラオケ歌唱の歌詞記録エリア６０へ記憶されている音声認識させたカラオケ歌唱の歌詞データを読み出す。また、音声制御部２４が有するメモリ（図示せず）へ記憶されているＮＧ単語データを読み出す。そして、読み出されたカラオケ歌唱の歌詞データとＮＧ単語データとを比較する。そして、その比較結果例えばカラオケ歌唱の歌詞データ中のＮＧ単語データの数量を制御部１２へ送信する（Ｓ２４２）。Ｓ２４２の処理が終了したらＳ２４４の処理を実行する。 In the process of S240, the speech-recognized lyrics data is compared with the NG word data. Specifically, the lyrics data of the karaoke song that has been voice-recognized and stored in the lyrics recording area 60 of the karaoke song provided in the ROM built in the voice control unit 24 is read. Further, NG word data stored in a memory (not shown) included in the voice control unit 24 is read out. Then, the read lyrics data of the karaoke song is compared with the NG word data. Then, the comparison result, for example, the quantity of NG word data in the lyrics data of the karaoke song is transmitted to the control unit 12 (S242). When the process of S242 is completed, the process of S244 is executed.
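The gate at S238 and the comparison at S240 to S242 can be sketched as below. The sketch assumes that the recognized lyrics are already split into words, which the source does not describe; the function name `ng_word_result` and the sample words are hypothetical.

```python
def ng_word_result(section_score, recognized_words, ng_words, threshold=80):
    """Return the number of NG words sung, or None when the score gate fails."""
    if section_score < threshold:
        return None  # S238: NO, so the comparison of S240 is skipped
    ng = set(ng_words)
    # S240 to S242: count recognized words that match an NG word.
    return sum(1 for word in recognized_words if word in ng)


if __name__ == "__main__":
    print(ng_word_result(85, ["love", "dream", "love"], ["love"]))  # 2
    print(ng_word_result(70, ["love"], ["love"]))  # None
```

Skipping the comparison for low singing scores matches the stated aim of only trusting recognition results when the singing itself is accurate enough.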

そして、音声制御部２４は制御部１２から送信されたカラオケ演奏停止信号を受信したか否かを判断する（Ｓ２４４）。そして、カラオケ演奏が終了しない場合には（Ｓ２４４：ＮＯ）、Ｓ２１８（図９参照）へ戻り、上述した処理を実行する。一方、カラオケ演奏停止信号を受信すると（Ｓ２４４：ＹＥＳ）、カラオケ演奏を停止する（Ｓ２４６）。Ｓ２４６の処理が終了したらＳ２４８の処理を実行する。 Then, the voice control unit 24 determines whether or not the karaoke performance stop signal transmitted from the control unit 12 has been received (S244). If the karaoke performance does not end (S244: NO), the process returns to S218 (see FIG. 9) to execute the above-described processing. On the other hand, when the karaoke performance stop signal is received (S244: YES), the karaoke performance is stopped (S246). When the process of S246 is completed, the process of S248 is executed.

次に、音声制御部２４は制御部１２から送信された音声認識終了信号を受信したか否かを判断する（Ｓ２４８）。そして、音声認識終了信号を受信すると（Ｓ２４８：ＹＥＳ）、カラオケ歌唱の歌詞の音声認識を終了する（Ｓ２５０）。そして、Ｓ２５０の処理が終了したらＳ２５２の処理を実行する。 Next, the voice control unit 24 determines whether or not the voice recognition end signal transmitted from the control unit 12 has been received (S248). When the voice recognition end signal is received (S248: YES), the voice recognition of the lyrics of the karaoke song is finished (S250). Then, when the process of S250 is completed, the process of S252 is executed.

Ｓ２５２の処理においては、ＮＧ単語歌唱のデータを制御部１２へ送信する。 In the process of S252, the NG word singing data is transmitted to the control unit 12.
そして、Ｓ２５２の処理が終了したら、本「音声制御部２４のＮＧ単語ゲーム処理」は終了する。 Then, when the process of S252 is completed, this "NG word game process of the voice control unit 24" ends.

Ｓ２５４の処理においては、カラオケ演奏が終了したか否かを判断する。そして、カラオケ演奏が終了しない場合には（Ｓ２５４：ＮＯ）、Ｓ２１８（図９参照）へ戻り、上述した処理を実行する。一方、カラオケ演奏が終了した場合には（Ｓ２５４：ＹＥＳ）、カラオケ演奏終了信号を制御部１２へ送信する（Ｓ２５６）。 In the process of S254, it is determined whether or not the karaoke performance has ended. If the karaoke performance does not end (S254: NO), the process returns to S218 (see FIG. 9), and the above-described processing is executed. On the other hand, when the karaoke performance is completed (S254: YES), a karaoke performance end signal is transmitted to the control unit 12 (S256).

次に、音声制御部２４は制御部１２から送信された音声認識終了信号を受信したか否かを判断する（Ｓ２５８）。そして、音声認識終了信号を受信すると（Ｓ２５８：ＹＥＳ）、カラオケ歌唱の歌詞の音声認識を終了する（Ｓ２６０）。 Next, the voice control unit 24 determines whether or not the voice recognition end signal transmitted from the control unit 12 has been received (S258). When the voice recognition end signal is received (S258: YES), the voice recognition of the lyrics of the karaoke song is finished (S260).

そして、Ｓ２６０の処理が終了したら、本「音声制御部２４のＮＧ単語ゲーム処理」は終了する。 Then, when the process of S260 is completed, this "NG word game process of the voice control unit 24" ends.
以上の採点では、音高データの比較を行って歌唱巧拙を判断している。つまり、音量の大小を得点に反映させていない。これは、音量は発声する語彙、性別、年齢などによってばらつきが大きいため実際の歌唱の巧拙とかけ離れた得点が出る場合があることを考慮したためである。また、音量を採点に使わないことによってアルゴリズムを簡略化でき、短時間で歌唱の正確さを判定することができる。 In the scoring described above, singing skill is judged by comparing pitch data; loudness is not reflected in the score. This is because loudness varies widely with the words being sung, gender, age, and so on, so that reflecting it could produce a score far removed from the actual singing skill. In addition, not using loudness for scoring simplifies the algorithm, so that the accuracy of the singing can be judged in a short time.

（１）歌唱採点の比較処理の説明 (1) Description of the singing scoring comparison process
図１２はデータの取り込み処理を示すフローチャートである。 FIG. 12 is a flowchart showing the data capture process.
まず、図１２（ａ）はマイク２５および音声制御部２４で実行されるデータの取り込み処理の手順を示している。 First, FIG. 12(a) shows the procedure of the data capture process executed by the microphone 25 and the voice control unit 24.

歌唱音声が入力されたマイク２５から出力されたアナログ形式の音声信号をディジタル形式の音声信号に変換し、その音声信号より、カラオケ演奏によって再生された楽曲データの再生信号を差し引いて採点用信号を生成する（Ｓ２２４０）。そして音声制御部２４は、この採点用信号を用いて、５０ｍｓのフレーム単位で周波数のカウント（Ｓ２２４２）を行う。この算出された周波数カウント値は５０ｍｓ毎に読み取られる。なお、この点については、図１３の歌唱採点の比較処理を示すフローチャートを用いて後述する。 The analog audio signal output from the microphone 25, to which the singing voice is input, is converted into a digital audio signal, and the playback signal of the music data reproduced by the karaoke performance is subtracted from that audio signal to generate a scoring signal (S2240). The voice control unit 24 then uses this scoring signal to count the frequency in units of 50 ms frames (S2242). The resulting frequency count value is read every 50 ms. This point is described later with reference to the flowchart of FIG. 13, which shows the singing scoring comparison process.
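Steps S2240 and S2242 can be sketched as follows. The source only says that the playback signal is subtracted and that the frequency is counted per 50 ms frame; counting upward zero crossings is an assumed way of doing that count, and the names `scoring_signal`, `frequency_count`, and `count_to_hz` are hypothetical.

```python
FRAME_MS = 50  # frame length stated in the source


def scoring_signal(mic, playback):
    """S2240: subtract the playback signal from the digitized microphone signal."""
    return [m - p for m, p in zip(mic, playback)]


def frequency_count(frame):
    """S2242 (assumed method): count upward zero crossings within one frame."""
    return sum(1 for a, b in zip(frame, frame[1:]) if a < 0 <= b)


def count_to_hz(count, frame_ms=FRAME_MS):
    """Convert a per-frame count to cycles per second."""
    return count * 1000 / frame_ms
```

With 50 ms frames each counted cycle corresponds to 20 Hz, so the resolution of such a count is coarse for low notes; a real device may well use a different estimator.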

また、図１２（ｂ）は音声制御部２４で実行されるガイドメロディデータの取り込み処理の手順を示すフローチャートである。この処理はカラオケ演奏を実行するシーケンスプログラムからガイドメロディトラックのイベントデータが受け渡されたときに実行される。まず、シーケンスプログラムから渡されたガイドメロディデータを音声制御部２４が内蔵するＲＯＭ（図示なし）のガイドメロディバッファ６２に取り込む（Ｓ２２４４）。そのガイドメロディデータから音高データ（つまり、リファレンス）を抽出する（Ｓ２２４６）。そして、このようにして抽出した音高データで音声制御部２４が内蔵するＲＯＭ（図示なし）のリファレンスデータレジスタ６４を更新する（Ｓ２２４８）。したがって、リファレンスデータレジスタ６４は新たなガイドメロディデータが入力される毎に更新される。 FIG. 12B is a flowchart showing a procedure of guide melody data fetching processing executed by the voice control unit 24. This process is executed when the event data of the guide melody track is delivered from the sequence program that executes the karaoke performance. First, the guide melody data passed from the sequence program is taken into the guide melody buffer 62 of the ROM (not shown) built in the voice control unit 24 (S2244). Pitch data (that is, reference) is extracted from the guide melody data (S2246). Then, the reference data register 64 of the ROM (not shown) built in the voice control unit 24 is updated with the pitch data extracted in this way (S2248). Therefore, the reference data register 64 is updated each time new guide melody data is input.

次に、図１３は音声制御部２４が実行する「音声制御部２４のＮＧ単語ゲーム処理」の歌唱採点の比較処理（図１０のＳ２２４）の詳細を示すフローチャートである。この処理は、採点用信号の周波数カウント値を取り込んで採点用信号の音高データ、周波数データに変換し、図１２（ｂ）のリファレンスデータ入力動作で求められたリファレンスデータの音高データと比較して差分データを求める動作である。なお、本歌唱採点の比較動作は、歌唱音声信号の１フレーム時間である５０ｍｓ毎に実行される。 Next, FIG. 13 is a flowchart showing the details of the singing scoring comparison process (S224 in FIG. 10) of the "NG word game process of the voice control unit 24" executed by the voice control unit 24. This process takes in the frequency count value of the scoring signal, converts it into pitch data and frequency data of the scoring signal, and compares the result with the pitch data of the reference data obtained by the reference data input operation of FIG. 12(b) to obtain difference data. This singing scoring comparison operation is executed every 50 ms, which is one frame time of the singing voice signal.

まず、リファレンスが更新されたかどうかが判断され（Ｓ２２５０）、リファレンスが更新されない場合（Ｓ２２５０：ＮＯ）はリターンされる。リファレンスが更新された場合（Ｓ２２５０：ＹＥＳ）はリファレンスカウンタ（全体）をインクリメントする（Ｓ２２５２）。そして、上述した周波数カウント値を読み取り（Ｓ２２５４）、この周波数カウント値に基づいて音高データを生成する（Ｓ２２５６）。次に、採点用信号およびリファレンスの音高データを比較してその差を算出し（Ｓ２２５８）、この算出した差を、音高差分データとして差分データ記憶エリア６６の現在の区間に対応する記憶エリアに記憶する（Ｓ２２５９）。 First, it is determined whether or not the reference is updated (S2250). If the reference is not updated (S2250: NO), the process returns. When the reference is updated (S2250: YES), the reference counter (entire) is incremented (S2252). Then, the frequency count value described above is read (S2254), and pitch data is generated based on the frequency count value (S2256). Next, the scoring signal and the reference pitch data are compared to calculate the difference (S2258), and the calculated difference is stored as pitch difference data in the storage area corresponding to the current section of the difference data storage area 66. (S2259).
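One pass of the comparison process (S2250 to S2259) can be sketched as below. Representing pitch as a fractional MIDI note number, the conversion from count to frequency, the skipping of silent frames, and the class name `PitchComparator` are all assumptions made for illustration; the source does not specify the pitch unit.

```python
import math


def hz_to_note(frequency_hz):
    """Convert a frequency to a fractional MIDI note number (A4 = 440 Hz = 69)."""
    return 69 + 12 * math.log2(frequency_hz / 440.0)


class PitchComparator:
    def __init__(self, frame_ms=50):
        self.frame_ms = frame_ms
        self.reference_counter = 0  # the overall reference counter
        self.differences = []       # stands in for difference data storage area 66

    def step(self, reference_updated, frequency_count, reference_note):
        """Run one 50 ms pass; return the stored difference or None."""
        if not reference_updated:
            return None  # S2250: NO, return without scoring
        if frequency_count <= 0:
            return None  # assumption: silent frames are skipped (not in the source)
        self.reference_counter += 1                          # S2252
        sung_hz = frequency_count * 1000 / self.frame_ms     # S2254
        sung_note = hz_to_note(sung_hz)                      # S2256
        difference = sung_note - reference_note              # S2258
        self.differences.append(difference)                  # S2259
        return difference
```

A count of 22 in a 50 ms frame corresponds to 440 Hz, so against a reference of note 69 the stored difference is 0; a count of 44 gives a difference of 12 semitones.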

（２）歌唱採点の調整処理の説明 (2) Description of the singing scoring adjustment process
次に、上述した歌唱採点の調整処理（図１０のＳ２２６）の詳細を図１４（ａ）のフローチャートに基づいて説明する。 Next, the details of the above-described singing scoring adjustment process (S226 in FIG. 10) are described based on the flowchart of FIG. 14(a).

まず、歌唱音声が入力されたマイク２５から出力されたアナログ形式の音声信号をディジタル形式の音声信号に変換し、その音声信号より、カラオケ演奏によって再生された楽曲データの再生信号を差し引いて採点用信号を生成する（Ｓ２２６０）。そして音声制御部２４は、この採点用信号の信号レベルを検出する（Ｓ２２６２）。次に、この採点用信号の信号レベルが規定値以上か否かを判断する（Ｓ２２６４）。そして、この採点用信号の信号レベルが規定値以上の場合（Ｓ２２６４：ＹＥＳ）には、カラオケ演奏によって再生された楽曲データの再生信号を増幅する（Ｓ２２６６）。具体的には、図１にて例示したように歌唱用可変利得アンプＭ３２へ利得を指示して、採点用信号の信号レベルが規定値以下になるように調整する。 First, an analog audio signal output from the microphone 25 to which the singing voice is input is converted into a digital audio signal, and the reproduction signal of the music data reproduced by the karaoke performance is subtracted from the audio signal for scoring. A signal is generated (S2260). Then, the voice control unit 24 detects the signal level of the scoring signal (S2262). Next, it is determined whether or not the signal level of the scoring signal is equal to or higher than a specified value (S2264). When the signal level of the scoring signal is equal to or higher than the specified value (S2264: YES), the reproduction signal of the music data reproduced by the karaoke performance is amplified (S2266). Specifically, as illustrated in FIG. 1, the singing variable gain amplifier M32 is instructed to adjust the gain so that the signal level of the scoring signal is below a specified value.

一方、この採点用信号の信号レベルが規定値以上でない場合（Ｓ２２６４：ＮＯ）には、リターンされる。なお、本歌唱採点の調整処理は、歌唱音声信号の１フレーム時間である５０ｍｓ毎に実行される。したがって、５０ｍｓ毎に採点用信号の信号レベルが規定値以下になるように制御される。 On the other hand, if the signal level of the scoring signal is not equal to or higher than the specified value (S2264: NO), the process returns. The singing scoring adjustment process is executed every 50 ms, which is one frame time of the singing voice signal. Therefore, control is performed so that the signal level of the scoring signal becomes equal to or less than the specified value every 50 ms.
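One 50 ms pass of the adjustment process (S2260 to S2266) can be sketched as below. During a prelude or interlude nobody sings, so any remaining level in the scoring signal is leaked accompaniment. Measuring the level as RMS, raising the gain by a fixed step, and the names `rms` and `adjust_gain` are assumptions; the source only states that the playback signal is amplified until the level falls below the specified value.

```python
import math


def rms(samples):
    """Root-mean-square level of a frame (assumed level measure)."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def adjust_gain(mic, playback, gain, limit, step=0.05):
    """Return the gain to use for the next frame."""
    residual = [m - gain * p for m, p in zip(mic, playback)]  # S2260
    if rms(residual) < limit:                                 # S2262 to S2264
        return gain       # S2264: NO, leave the gain unchanged
    return gain + step    # S2266: amplify the playback signal a little more


if __name__ == "__main__":
    mic = [0.5, -0.5, 0.5, -0.5]        # accompaniment picked up by the microphone
    playback = [1.0, -1.0, 1.0, -1.0]   # playback signal fed to the subtractor
    gain = 0.3
    for _ in range(10):                 # ten consecutive 50 ms passes
        gain = adjust_gain(mic, playback, gain, limit=0.12)
    print(round(gain, 2))
```

Because the check is repeated every frame, the gain settles as soon as the residual drops below the limit and then stays put.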

（３）音声認識の調整処理の説明 (3) Description of the voice recognition adjustment process
次に、上述した音声認識の調整処理（図１０のＳ２２８）の詳細を図１４（ｂ）のフローチャートに基づいて説明する。 Next, the details of the above-described voice recognition adjustment process (S228 in FIG. 10) are described based on the flowchart of FIG. 14(b).

まず、歌唱音声が入力されたマイク２５から出力されたアナログ形式の音声信号をディジタル形式の音声信号に変換し、その音声信号より、カラオケ演奏によって再生された楽曲データの再生信号を差し引いて音声認識用信号を生成する（Ｓ２２８０）。そして音声制御部２４は、この音声認識用信号の信号レベルを検出する（Ｓ２２８２）。次に、この音声認識用信号の信号レベルが規定値以上か否かを判断する（Ｓ２２８４）。そして、この音声認識用信号の信号レベルが規定値以上の場合（Ｓ２２８４：ＹＥＳ）には、カラオケ演奏によって再生された楽曲データの再生信号を増幅する（Ｓ２２８６）。具体的には、図１にて例示したように歌詞用可変利得アンプＭ７２へ利得を指示して、音声認識用信号の信号レベルが規定値以下になるように調整する。 First, the analog audio signal output from the microphone 25, to which the singing voice is input, is converted into a digital audio signal, and the playback signal of the music data reproduced by the karaoke performance is subtracted from that audio signal to generate a voice recognition signal (S2280). The voice control unit 24 then detects the signal level of this voice recognition signal (S2282). Next, it is determined whether or not the signal level of the voice recognition signal is equal to or higher than a specified value (S2284). If the signal level of the voice recognition signal is equal to or higher than the specified value (S2284: YES), the playback signal of the music data reproduced by the karaoke performance is amplified (S2286). Specifically, as illustrated in FIG. 1, a gain is specified to the lyrics variable gain amplifier M72 so that the signal level of the voice recognition signal is adjusted to the specified value or lower.

一方、この音声認識用信号の信号レベルが規定値以上でない場合（Ｓ２２８４：ＮＯ）には、リターンされる。なお、本音声認識の調整処理は、歌唱音声信号の１フレーム時間である５０ｍｓ毎に実行される。したがって、５０ｍｓ毎に音声認識用信号の信号レベルが規定値以下になるように制御される。 On the other hand, if the signal level of the voice recognition signal is not equal to or higher than the specified value (S2284: NO), the process returns. The adjustment process of the voice recognition is executed every 50 ms that is one frame time of the singing voice signal. Therefore, control is performed so that the signal level of the speech recognition signal becomes equal to or less than the specified value every 50 ms.

[効果の説明] [Description of effects]
（１）従来のカラオケ装置においてカラオケ歌唱される歌詞を音声認識する場合には、カラオケ演奏音がカラオケ装置から放音されるため、マイクロフォンには歌詞をカラオケ歌唱する音声とともにカラオケ演奏音も入力されるため、カラオケ歌唱する音声以外のカラオケ演奏音自体が騒音となって、音声認識する認識率が低くなるという課題があった。 (1) When lyrics sung in karaoke are recognized by speech recognition in a conventional karaoke apparatus, the karaoke performance sound is emitted from the karaoke apparatus, so the microphone picks up the karaoke performance sound together with the voice singing the lyrics. As a result, the karaoke performance sound itself acts as noise on top of the singing voice, and the recognition rate of the speech recognition is lowered.

それに対して本実施形態のカラオケ装置１によれば、歌詞用信号生成部Ｍ７０は、カラオケ演奏音の音信号と歌唱音声信号とを比較し、歌唱音声信号からカラオケ演奏音の音信号を減じた音声認識用信号を生成するので、スピーカＭ２０から放音されたカラオケ演奏音に対応する音信号が減じられる。そして、音声認識部Ｍ７６は、歌詞用信号生成部Ｍ７０によって生成された音声認識用信号に基づいてカラオケ歌唱の歌詞を認識するので、適切な音声認識結果を得ることができる。 On the other hand, according to the karaoke apparatus 1 of the present embodiment, the lyrics signal generation unit M70 compares the sound signal of the karaoke performance sound with the singing sound signal, and subtracts the sound signal of the karaoke performance sound from the singing sound signal. Since the voice recognition signal is generated, the sound signal corresponding to the karaoke performance sound emitted from the speaker M20 is reduced. Then, since the voice recognition unit M76 recognizes the lyrics of the karaoke song based on the voice recognition signal generated by the lyrics signal generation unit M70, an appropriate voice recognition result can be obtained.

（２）また、本実施形態のカラオケ装置１によれば、ＨＤＤ１６が記憶している楽曲データからＮＧ単語データを選定して、映像制御部３８を制御して、前記選定したＮＧ単語データを表示部３６へ表示させるとともに、音声認識部１０によって認識されたカラオケ歌唱の歌詞データと前記選定したＮＧ単語データとを比較して、同じ歌詞データがあると判定した場合には、音声制御部２４を制御して、カラオケ演奏の再生を停止させる。 (2) Further, according to the karaoke apparatus 1 of the present embodiment, NG word data is selected from the music data stored in the HDD 16, and the video control unit 38 is controlled to display the selected NG word data on the display unit 36. In addition, the lyrics data of the karaoke song recognized by the voice recognition unit 10 is compared with the selected NG word data, and when it is determined that the same lyrics data is present, the voice control unit 24 is controlled to stop playback of the karaoke performance.

したがって、このようなカラオケ装置１によれば、カラオケ歌唱中にＮＧ単語を歌唱したか否かを競うＮＧ単語ゲームを楽しむことができる。 Therefore, with such a karaoke apparatus 1, users can enjoy an NG word game in which they compete over whether or not an NG word is sung during karaoke singing.
（３）また、本実施形態のカラオケ装置１によれば、区間歌唱採点が所定値例えば８０点以上ある場合に、音声認識部１０によって認識されたカラオケ歌唱の歌詞データと前記選定したＮＧ単語データとを比較する。 (3) Further, according to the karaoke apparatus 1 of the present embodiment, the lyrics data of the karaoke song recognized by the voice recognition unit 10 is compared with the selected NG word data when the section singing score is equal to or higher than a predetermined value, for example 80 points.

したがって、このようなカラオケ装置１によれば、カラオケ歌唱中にＮＧ単語を歌唱したか否かを競うＮＧ単語ゲームをするための条件として、歌唱採点結果が所定値以上例えば８０点以上に限ることで、より競技性の高いゲームでの音声認識の認識率の低下を防止できる。 Therefore, according to such a karaoke apparatus 1, as a condition for playing an NG word game in which NG words are sung during karaoke singing, the singing scoring result is limited to a predetermined value or more, for example, 80 points or more. Thus, it is possible to prevent a decrease in the recognition rate of voice recognition in a game with higher competitiveness.

[他の実施形態] [Other embodiments]
以上、本発明の実施形態について説明したが、本発明は上記実施形態に限定されるものではなく、以下のような様々な態様にて実施することが可能である。 Although an embodiment of the present invention has been described above, the present invention is not limited to that embodiment and can be implemented in various modes such as the following.

（１）上記実施形態では、区間歌唱採点ごとに、区間歌唱採点が所定値例えば８０点以上ある場合に、音声認識部１０によって認識されたカラオケ歌唱の歌詞データと前記選定したＮＧ単語データとを比較して、ＮＧ単語を歌唱した場合には、カラオケ演奏を停止するゲームであったが、これには限らない。区間歌唱採点ごとに、区間歌唱採点が所定値例えば８０点以上ある場合に、カラオケ演奏終了まで音声認識部１０によって認識されたカラオケ歌唱の歌詞データとＨＤＤ１６に設けられている単語データメモリ領域５２へ記憶されている単語データとを比較して、相違する歌詞データの数量と前記単語データとに基づいて歌詞採点してもよい。 (1) In the above embodiment, for each section singing score, when the section singing score is a predetermined value, for example, 80 points or more, the lyrics data of the karaoke song recognized by the voice recognition unit 10 and the selected NG word data are In comparison, when the NG word is sung, the game stops the karaoke performance, but the present invention is not limited to this. For each section singing score, when the section singing score is a predetermined value, for example, 80 points or more, the lyrics data of the karaoke song recognized by the voice recognition unit 10 until the end of the karaoke performance and the word data memory area 52 provided in the HDD 16 The stored word data may be compared and the lyrics may be scored based on the quantity of the different lyric data and the word data.

以下に、カラオケ装置１の制御部１２が実行する「制御部１２の歌詞採点処理」及びカラオケ装置１の音声制御部２４が実行する「音声制御部２４の歌詞採点処理」を順に説明する。 Below, “the lyrics scoring process of the control unit 12” executed by the control unit 12 of the karaoke apparatus 1 and “the lyrics scoring process of the voice control unit 24” executed by the voice control unit 24 of the karaoke apparatus 1 will be described in order.

［制御部１２の歌詞採点処理の説明］ [Description of the lyric scoring process of the control unit 12]
以下に、カラオケ装置１の制御部１２が実行する「制御部１２の歌詞採点処理」の手順を図１７、図１８のフローチャートに基づいて説明する。 The procedure of the "lyric scoring process of the control unit 12" executed by the control unit 12 of the karaoke apparatus 1 is described below based on the flowcharts of FIG. 17 and FIG. 18.

操作部１８で受け付けた歌詞採点指示のデータは操作処理部２２によって制御部１２へ送信されるのであるが、制御部１２は、歌詞採点指示のデータを受信したか否かを判断する（Ｓ３１０）。そして、操作処理部２２から送信された歌詞採点指示のデータを受信すると（Ｓ３１０：ＹＥＳ）、選曲されているカラオケ曲の選曲番号に対応する楽曲データをＨＤＤ１６に設けられている図５（ａ）に例示する楽曲データメモリ領域５０から読み出す（Ｓ３１２）。そして、読み出された楽曲データから図６（ａ）に例示する歌詞で使用されている単語データを読み出し、読み出された前記単語データを単語データメモリ領域５２へ記憶する（Ｓ３１４）。Ｓ３１４の処理が終了したらＳ３２０の処理を実行する。 The lyric scoring instruction data accepted by the operation unit 18 is transmitted to the control unit 12 by the operation processing unit 22, and the control unit 12 determines whether or not it has received the lyric scoring instruction data (S310). When the lyric scoring instruction data transmitted from the operation processing unit 22 is received (S310: YES), the music data corresponding to the song selection number of the selected karaoke song is read from the music data memory area 50, illustrated in FIG. 5(a), provided in the HDD 16 (S312). Then, the word data used in the lyrics, illustrated in FIG. 6(a), is read from the read music data, and the read word data is stored in the word data memory area 52 (S314). When the process of S314 is completed, the process of S320 is executed.

Ｓ３２０の処理においては、カラオケ演奏開始信号を音声制御部２４へ送信する。そして、音声認識開始信号を音声制御部２４へ送信する（Ｓ３２２）。さらに、区間歌唱採点開始信号を音声制御部２４へ送信する（Ｓ３２４、図１８参照）。Ｓ３２４の処理が終了したらＳ３２６の処理を実行する。 In the process of S320, a karaoke performance start signal is transmitted to the voice control unit 24. Then, a voice recognition start signal is transmitted to the voice control unit 24 (S322). Furthermore, a section singing start signal is transmitted to the voice control unit 24 (S324, see FIG. 18). When the process of S324 is completed, the process of S326 is executed.

さて、音声認識された歌詞データは音声制御部２４から制御部１２へ送信される（この送信処理については後述する）のであるが、制御部１２は、音声認識された歌詞データを受信したか否かを判断する（Ｓ３２６）。そして、音声制御部２４から音声認識された歌詞データを受信した場合には（Ｓ３２６：ＹＥＳ）、Ｓ３２８の処理を実行する。一方、音声制御部２４から音声認識された歌詞データを受信しない場合には（Ｓ３２６：ＮＯ）、Ｓ３３８の処理を実行する。 The speech-recognized lyrics data is transmitted from the voice control unit 24 to the control unit 12 (this transmission process is described later), and the control unit 12 determines whether or not it has received the speech-recognized lyrics data (S326). If the speech-recognized lyrics data has been received from the voice control unit 24 (S326: YES), the process of S328 is executed. On the other hand, if the speech-recognized lyrics data has not been received from the voice control unit 24 (S326: NO), the process of S338 is executed.

Ｓ３２８の処理においては、音声認識された歌詞データと単語データとを比較する。具体的には、前記単語データメモリ領域５２へ記憶されている単語データを読み出し、読み出された単語データと音声認識された歌詞データとから相違する歌詞データの数量を抽出する。そして、比較結果に基づいて歌詞採点する（Ｓ３３２）。具体的には、Ｓ３２８の処理において抽出された相違する歌詞データの数量と前記単語データの数量とに基づいてカラオケ歌唱の歌詞を採点する。より具体的には、例えば相違した歌詞データの数量が５個あり、単語データの数量が１００個あるとすれば、（１００個−５個）／１００個＝０．９５となり、１００点満点中の９５点と歌詞採点することができる。さらに、総合歌詞採点する（Ｓ３３２）。具体的には、区間歌唱採点ごとの歌詞採点を加重平均して総合歌詞採点する。より具体的には、例えば１回目の区間歌唱採点において相違した歌詞データの数量が５個あり、単語データの数量が１００個あるとし、２回目の区間歌唱採点において相違した歌詞データの数量が３個あり、単語データの数量が１００個あるとすれば、相違した歌詞データの数量の合計は８個であり、単語データの数量の合計は２００個である。したがって、１回目と２回目とを加重平均すると、（２００個−８個）／２００個＝０．９６となり、１００点満点中の９６点と総合歌詞採点することができる。Ｓ３３２の処理が終了したらＳ３３８の処理を実行する。 In the process of S328, the speech-recognized lyrics data is compared with the word data. Specifically, the word data stored in the word data memory area 52 is read out, and the number of lyric data items that differ between the read word data and the speech-recognized lyrics data is extracted. Then, the lyrics are scored based on the comparison result (S332). Specifically, the lyrics of the karaoke song are scored based on the number of differing lyric data items extracted in S328 and the number of word data items. More specifically, if, for example, 5 lyric data items differ and there are 100 word data items, then (100 - 5) / 100 = 0.95, and the lyrics can be scored as 95 out of 100 points. Further, overall lyric scoring is performed (S332). Specifically, the lyric scores of the individual section singing scorings are combined by weighted averaging to give the overall lyric score. More specifically, suppose, for example, that in the first section singing scoring 5 lyric data items differ out of 100 word data items, and that in the second section singing scoring 3 lyric data items differ out of 100 word data items; the differing lyric data items then total 8 and the word data items total 200. Therefore, the weighted average of the first and second scorings is (200 - 8) / 200 = 0.96, and the overall lyrics can be scored as 96 out of 100 points. When the process of S332 ends, the process of S338 is executed.
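The arithmetic of the lyric scoring example can be checked with a short Python sketch. The function names `lyric_score` and `overall_lyric_score`, the rounding to a whole number of points, and the handling of an empty word list are assumptions; the numbers are the ones given in the text.

```python
def lyric_score(mismatched, total):
    """Score one section: share of word data items sung correctly, out of 100."""
    if total == 0:
        return 0  # assumption: nothing to score when there is no word data
    return round(100 * (total - mismatched) / total)


def overall_lyric_score(sections):
    """Weighted average over sections, given as (mismatched, total) pairs.

    Pooling the counts weights each section by its number of word data items.
    """
    mismatched = sum(m for m, _ in sections)
    total = sum(t for _, t in sections)
    return lyric_score(mismatched, total)


if __name__ == "__main__":
    print(lyric_score(5, 100))                         # 95
    print(overall_lyric_score([(5, 100), (3, 100)]))   # 96
```

Pooling the counts before dividing is what makes the average weighted: a section with more word data items contributes proportionally more to the overall score.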

Ｓ３３８の処理においては、カラオケ演奏終了信号は音声制御部２４から制御部１２へ送信される（この送信処理については後述する）のであるが、制御部１２は、カラオケ演奏終了信号を受信したか否かを判断する。そして、カラオケ演奏終了信号を受信しない場合には（Ｓ３３８：ＮＯ）、Ｓ３２４へ戻り、上述した処理を実行する。一方、カラオケ演奏終了信号を受信した場合には（Ｓ３３８：ＹＥＳ）、音声認識終了信号を音声制御部２４へ送信する（Ｓ３４０）。Ｓ３４０の処理が終了したらＳ３４２の処理を実行する。 In the process of S338, the control unit 12 determines whether or not it has received the karaoke performance end signal, which is transmitted from the voice control unit 24 to the control unit 12 (this transmission process is described later). If the karaoke performance end signal has not been received (S338: NO), the process returns to S324 and the above-described processing is executed. On the other hand, if the karaoke performance end signal has been received (S338: YES), a voice recognition end signal is transmitted to the voice control unit 24 (S340). When the process of S340 ends, the process of S342 is executed.

Ｓ３４２の処理においては、総合歌詞採点結果を表示部３６へ表示する。具体的には、例えば「総合歌詞採点：９６点」などを表示部３６へ表示するように映像制御部３８を制御する。 In the process of S342, the comprehensive lyrics scoring result is displayed on the display unit 36. Specifically, the video control unit 38 is controlled so that, for example, “total lyrics scoring: 96 points” is displayed on the display unit 36.

そして、Ｓ３４２の処理が終了したら、本「制御部１２の歌詞採点処理」は終了する。 Then, when the process of S342 is completed, this "lyric scoring process of the control unit 12" ends.
［音声制御部２４の歌詞採点処理の説明］ [Description of the lyric scoring process of the voice control unit 24]
次に、カラオケ装置１の音声制御部２４が実行する「音声制御部２４の歌詞採点処理」の手順を図１９〜図２１のフローチャートに基づいて説明する。この歌詞採点処理に関する動作プログラムは、カラオケ演奏を実行するシーケンスプログラムと並行して実行され、シーケンスプログラムとのデータの交換も行われる。なお、以下の説明においては、制御部１２からカラオケ曲の選曲番号のデータを受信している状態とする。 Next, the procedure of the "lyric scoring process of the voice control unit 24" executed by the voice control unit 24 of the karaoke apparatus 1 is described based on the flowcharts of FIGS. 19 to 21. The operation program for this lyric scoring process is executed in parallel with the sequence program that executes the karaoke performance, and data is also exchanged with the sequence program. In the following description, it is assumed that the song selection number data of the karaoke song has already been received from the control unit 12.

まず、音声制御部２４は制御部１２から送信されたカラオケ演奏開始信号を受信したか否かを判断する（Ｓ４１０）。そして、カラオケ演奏開始信号を受信すると（Ｓ４１０：ＹＥＳ）、カラオケ曲の選曲番号に対応する楽曲データを再生し、カラオケ演奏を開始する（Ｓ４１２）。 First, the voice control unit 24 determines whether or not it has received the karaoke performance start signal transmitted from the control unit 12 (S410). When the karaoke performance start signal is received (S410: YES), the music data corresponding to the song selection number of the karaoke song is played back and the karaoke performance is started (S412).

次に、音声制御部２４は制御部１２から送信された音声認識開始信号を受信したか否かを判断する（Ｓ４１４）。そして、音声認識開始信号を受信すると（Ｓ４１４：ＹＥＳ）、カラオケ歌唱の歌詞の音声認識を開始する（Ｓ４１６）。具体的には、音声制御部２４が音声認識部１０を制御してカラオケ歌唱の歌詞を音声認識させる。そして、音声認識させたカラオケ歌唱の歌詞データを音声制御部２４が内蔵するＲＯＭに設けられているカラオケ歌唱の歌詞記録エリア６０へ記憶する。 Next, the voice control unit 24 determines whether or not the voice recognition start signal transmitted from the control unit 12 has been received (S414). When the voice recognition start signal is received (S414: YES), voice recognition of the lyrics of the karaoke song is started (S416). Specifically, the voice control unit 24 controls the voice recognition unit 10 to recognize the karaoke song lyrics. Then, the lyrics data of the karaoke song that has been voice-recognized is stored in the karaoke song lyrics recording area 60 provided in the ROM built in the voice control unit 24.

そして、音声制御部２４は制御部１２から送信された区間歌唱採点開始信号を受信したか否かを判断する（Ｓ４１８）。そして、区間歌唱採点開始信号を受信すると（Ｓ４１８：ＹＥＳ）、リファレンスカウンタ（全体）の初期化を行なう（Ｓ４２０）。 Then, the voice control unit 24 determines whether or not it has received the section singing scoring start signal transmitted from the control unit 12 (S418). When the section singing scoring start signal is received (S418: YES), the reference counter (overall) is initialized (S420).

次に、前奏期間または間奏期間の歌唱されない期間か否かを判断する（Ｓ４２２、図２０参照）。このＳ４２２の判断は、カラオケ演奏を実行するシーケンスプログラムから受け渡される制御データトラックに記憶されているカラオケ曲の前奏期間、歌唱期間、及び間奏期間の開始点を示す区間分割データによって前奏期間または間奏期間、すなわち歌唱されない期間か否かを判断する。そして、前奏期間または間奏期間でない場合、すなわち歌唱期間の場合（Ｓ４２２：ＮＯ）には、歌唱採点の比較処理を実行する（Ｓ４２４）。具体的には、上記実施形態の「（１）歌唱採点の比較処理の説明」で説明した処理を実行する。一方、前奏期間または間奏期間の場合（Ｓ４２２：ＹＥＳ）には、歌唱採点の調整処理を実行する（Ｓ４２６）とともに、音声認識の調整処理を実行する（Ｓ４２８）。具体的には、上記実施形態の「（２）歌唱採点の調整処理の説明」で説明した処理を実行するとともに、上記実施形態の「（３）音声認識の調整処理の説明」で説明した処理を実行する。そして、歌唱採点の比較処理を実行した場合（Ｓ４２４）、歌唱採点の調整処理を実行した場合（Ｓ４２６）、もしくは音声認識の調整処理を実行した場合（Ｓ４２８）には、区間歌唱採点が終了したか否かを判断する（Ｓ４３０）。そして、区間歌唱採点が終了していない場合（Ｓ４３０：ＮＯ）には、Ｓ４２２へ戻り、上述した処理を実行する。 Next, it is determined whether or not the current period is a prelude period or an interlude period, that is, a period in which no singing takes place (S422, see FIG. 20). The determination in S422 is made using the section division data, which indicates the start points of the prelude period, the singing period, and the interlude period of the karaoke song and is stored in the control data track delivered from the sequence program that executes the karaoke performance. If the current period is neither a prelude period nor an interlude period, that is, if it is a singing period (S422: NO), the singing scoring comparison process is executed (S424). Specifically, the process described in "(1) Description of the singing scoring comparison process" of the above embodiment is executed. On the other hand, if it is a prelude period or an interlude period (S422: YES), the singing scoring adjustment process is executed (S426), and the voice recognition adjustment process is also executed (S428). Specifically, the process described in "(2) Description of the singing scoring adjustment process" of the above embodiment is executed together with the process described in "(3) Description of the voice recognition adjustment process" of the above embodiment. After the singing scoring comparison process (S424), the singing scoring adjustment process (S426), or the voice recognition adjustment process (S428) has been executed, it is determined whether or not the section singing scoring has finished (S430). If the section singing scoring has not finished (S430: NO), the process returns to S422 and the above-described processing is executed.

一方、区間歌唱採点が終了した場合（Ｓ４３０：ＹＥＳ）には、上記実施形態の「（１）歌唱採点の比較処理の説明」で説明した歌唱採点の比較処理によって音声制御部２４が内蔵するＲＯＭ（図示なし）の差分データ記憶エリア６６へ記憶された音高差データから全体の音高差分データを取り出し（Ｓ４３２）、全体の音高差分データの合計をリファレンスカウンタ値で割って音高差分データを平均化する（Ｓ４３４）。そして、ＨＤＤ１６の採点情報メモリ領域５４に記憶されている音高差のデータに応じた歌唱採点情報（図１５参照）を用いて音高差のデータに応じて歌唱採点する（Ｓ４３６）。Ｓ４３６の処理が終了したらＳ４３８（図２１参照）の処理を実行する。 On the other hand, when the section singing is completed (S430: YES), the ROM included in the voice control unit 24 by the singing comparison processing described in “(1) Description of singing comparison processing” in the above embodiment. The entire pitch difference data is extracted from the pitch difference data stored in the difference data storage area 66 (not shown) (S432), and the total pitch difference data is divided by the reference counter value to obtain the pitch difference data. Are averaged (S434). Then, using the singing score information (see FIG. 15) corresponding to the pitch difference data stored in the scoring information memory area 54 of the HDD 16, singing is scored according to the pitch difference data (S436). When the process of S436 is completed, the process of S438 (see FIG. 21) is executed.

Ｓ４３８の処理においては、歌唱採点の結果が８０点以上か否かを判断する。この歌唱採点の結果を判断する点数は、この「８０点」には限らない。音声認識の認識率の低下を防止できる点数に設定するとよい。そして、歌唱採点の結果が８０点以上でない場合には（Ｓ４３８：ＮＯ）、Ｓ４４４の処理を実行する。一方、歌唱採点の結果が８０点以上である場合には（Ｓ４３８：ＹＥＳ）、Ｓ４４２の処理を実行する。 In the process of S438, it is determined whether or not the result of the singing score is 80 points or more. The score for judging the result of singing is not limited to “80 points”. It is better to set the score that can prevent the speech recognition rate from being lowered. And when the result of a singing score is not 80 points or more (S438: NO), the process of S444 is performed. On the other hand, when the result of the singing score is 80 points or more (S438: YES), the process of S442 is executed.

In S442, the speech-recognized lyrics data is transmitted to the control unit 12. Specifically, the karaoke singing lyrics data stored in the karaoke singing lyrics recording area 60, provided in the ROM built into the voice control unit 24, is read out and transmitted to the control unit 12.
When S442 is complete, S444 is executed.

In S444, it is determined whether the karaoke performance has ended. If it has not ended (S444: NO), the process returns to S418 (see FIG. 19) and the processing described above is repeated. If it has ended (S444: YES), a karaoke performance end signal is transmitted to the control unit 12 (S446).

Next, the voice control unit 24 determines whether it has received the voice recognition end signal transmitted from the control unit 12 (S448). On receiving the voice recognition end signal (S448: YES), it ends voice recognition of the karaoke singing lyrics (S450).

When S450 is complete, this "lyric scoring process of the voice control unit 24" ends.
By executing the "lyric scoring process of the control unit 12" performed by the control unit 12 of the karaoke apparatus 1 and the "lyric scoring process of the voice control unit 24" performed by the voice control unit 24 of the karaoke apparatus 1, the lyrics data of the karaoke singing recognized by the voice recognition unit 10 up to the end of the karaoke performance is compared with the word data stored in the word data memory area 52 provided in the HDD 16, and the lyrics of the karaoke singing can be scored based on the quantity of differing lyrics data and the word data.
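The comparison summarized above can be illustrated with a short sketch. The text says only that the score is based on the quantity of differing lyrics data and on the word data; the percentage formula, the multiset comparison, and the function names below are assumptions made for illustration.

```python
from collections import Counter
from typing import Sequence


def count_differing_words(recognized: Sequence[str], reference: Sequence[str]) -> int:
    """Count reference words that have no matching recognized word."""
    # Counter subtraction keeps only positive counts, i.e. the words
    # from the stored word data that the singer did not produce.
    missing = Counter(reference) - Counter(recognized)
    return sum(missing.values())


def lyric_score(recognized: Sequence[str], reference: Sequence[str]) -> int:
    """Hypothetical score: share of reference words sung correctly, out of 100."""
    if not reference:
        return 0  # assumption: no word data means no lyric score
    differing = count_differing_words(recognized, reference)
    return round(100 * (len(reference) - differing) / len(reference))
```

For example, if the stored words are `["ame", "ga", "furu", "yoru"]` and the recognizer returns `["ame", "ga", "yuki", "yoru"]`, one word differs and this assumed formula gives 75.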

Further, according to the karaoke apparatus 1 of this embodiment, the lyrics data of the karaoke singing recognized by the voice recognition unit 10 is compared with the word data only when the section singing score is at or above a predetermined value, for example 80 points.

Therefore, by limiting lyric scoring to cases where the singing scoring result is at or above a predetermined value, for example 80 points, the karaoke apparatus 1 can prevent the speech recognition rate used in lyric scoring from dropping.

(2) The apparatus may also be configured as a game in which, for each section singing score at or above a predetermined value, for example 80 points, the lyrics data of the karaoke singing recognized by the voice recognition unit 10 up to the end of the karaoke performance is compared with the word data stored in the word data memory area 52 provided in the HDD 16, and the karaoke performance is stopped when the quantity of differing lyrics data is determined to be a predetermined number or more.

The "lyric error game process of the control unit 12" executed by the control unit 12 of the karaoke apparatus 1 and the "lyric error game process of the voice control unit 24" executed by the voice control unit 24 of the karaoke apparatus 1 are described in order below.

[Description of the lyric error game process of the control unit 12]
The procedure of the "lyric error game process of the control unit 12" executed by the control unit 12 of the karaoke apparatus 1 is described below with reference to the flowcharts of FIGS. 22 and 23.

The game start instruction data accepted by the operation unit 18 is transmitted to the control unit 12 by the operation processing unit 22, and the control unit 12 determines whether it has received that data (S510). On receiving the game start instruction data transmitted from the operation processing unit 22 (S510: YES), the control unit 12 stores "n = 0" in its lyric error counter (not shown) (S511). When S511 is complete, S512 is executed.

In S512, the music data corresponding to the song selection number of the selected karaoke song is read from the music data memory area 50, illustrated in FIG. 5(a), provided in the HDD 16. The word data used in the lyrics, illustrated in FIG. 6(a), is then read from that music data and stored in the word data memory area 52 (S514). When S514 is complete, S520 is executed.

In S520, a karaoke performance start signal is transmitted to the voice control unit 24. A voice recognition start signal is then transmitted to the voice control unit 24 (S522), followed by a section singing scoring start signal (S524, see FIG. 23). When S524 is complete, S526 is executed.

The speech-recognized lyrics data is transmitted from the voice control unit 24 to the control unit 12 (this transmission is described later), and the control unit 12 determines whether it has received the speech-recognized lyrics data (S526). If the speech-recognized lyrics data has been received from the voice control unit 24 (S526: YES), S528 is executed. If it has not been received (S526: NO), S538 is executed.

In S528, it is determined whether the lyric error counter has exceeded a predetermined number. The predetermined number is a value set according to each karaoke singer's level: for example, "1" for an expert, "3" for an intermediate singer, and "5" for a beginner. If the lyric error counter has not exceeded the predetermined number (S528: NO), S537 is executed. If it has exceeded the predetermined number (S528: YES), a karaoke performance stop signal is transmitted to the voice control unit 24 (S530). When S530 is complete, S532 is executed.

In S532, a voice recognition end signal is transmitted to the voice control unit 24. The lyric error count is then displayed on the display unit 36 (S536). Specifically, the video control unit 38 is controlled so that a message corresponding to the result, such as "Lyric errors: 5", is displayed on the display unit 36, together with a message such as "Not there yet! Too bad", as illustrated in FIG. 16(e).

When S536 is complete, this "scoring process of the control unit 12" ends.
In S537, "1" is added to the value "n = 0" held in the lyric error counter (not shown) of the control unit 12 to give "n = 1", and "n = 1" is stored in a memory (not shown) of the control unit 12 (S537). When S537 is complete, S538 is executed.
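The counter handling in S511, S528, and S537 can be sketched as a small loop. The limits 1, 3, and 5 come from the text; the level names, the function name, and the return values are hypothetical. The step list does not state where the comparison with the word data takes place, so the sketch assumes each received section carries a flag saying whether its lyrics differed, and that the counter is incremented only when that flag is set.

```python
from typing import Iterable, Tuple

# Limits per singer level, as given for S528; the level names are illustrative.
ERROR_LIMITS = {"expert": 1, "intermediate": 3, "beginner": 5}


def run_lyric_error_game(section_has_error: Iterable[bool], level: str) -> Tuple[str, int]:
    """Walk the received sections and stop once the error counter exceeds the limit."""
    limit = ERROR_LIMITS[level]
    n = 0  # S511: initialize the lyric error counter
    for has_error in section_has_error:   # S526: lyrics data received
        if n > limit:                     # S528: counter exceeded the limit?
            return ("stopped", n)         # S530-S536: stop performance, show result
        if has_error:                     # assumed comparison result
            n += 1                        # S537: increment the counter
    return ("completed", n)               # S538-S542: performance ran to the end
```

Because S528 is evaluated before S537, the performance is stopped on the section after the one that pushed the counter past the limit.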

In S538, it is determined whether a karaoke performance end signal has been received from the voice control unit 24. If it has not been received (S538: NO), the process returns to S524 and the processing described above is repeated. If it has been received (S538: YES), a voice recognition end signal is transmitted to the voice control unit 24 (S540). The lyric error count is then displayed on the display unit 36 (S542). Specifically, the video control unit 38 is controlled so that a message corresponding to the result, such as "Lyric errors: 0", is displayed on the display unit 36, together with a message such as "You did it! Congratulations", as illustrated in FIG. 16(d).

When S542 is complete, this "scoring process of the control unit 12" ends.
[Description of the lyric error game process of the voice control unit 24]
Next, the procedure of the "lyric error game process of the voice control unit 24" executed by the voice control unit 24 of the karaoke apparatus 1 is described with reference to the flowcharts of FIGS. 24 to 26. The operation program for this lyric error game process runs in parallel with the sequence program that performs the karaoke performance, and exchanges data with that sequence program. In the following description, it is assumed that the song selection number of the karaoke song has already been received from the control unit 12.

First, the voice control unit 24 determines whether it has received the karaoke performance start signal transmitted from the control unit 12 (S610). On receiving the karaoke performance start signal (S610: YES), it plays back the music data corresponding to the song selection number of the karaoke song and starts the karaoke performance (S612).

Next, the voice control unit 24 determines whether it has received the voice recognition start signal transmitted from the control unit 12 (S614). On receiving the voice recognition start signal (S614: YES), it starts voice recognition of the karaoke singing lyrics (S616). Specifically, the voice control unit 24 controls the voice recognition unit 10 so that it recognizes the lyrics of the karaoke singing, and stores the recognized lyrics in the karaoke singing lyrics recording area 60 provided in the ROM built into the voice control unit 24.

The voice control unit 24 then determines whether it has received the section singing scoring start signal transmitted from the control unit 12 (S618). On receiving the section singing scoring start signal (S618: YES), it initializes the reference counter (overall) (S620).

Next, it is determined whether the current period is a prelude or interlude period, that is, a period in which no singing takes place (S622, see FIG. 25). This determination is made from the section division data, which indicates the start points of the prelude, singing, and interlude periods of the karaoke song and is stored in the control data track handed over by the sequence program that performs the karaoke performance. If the current period is not a prelude or interlude period, that is, if it is a singing period (S622: NO), the singing scoring comparison process is executed (S624). Specifically, the process described in "(1) Description of the singing scoring comparison process" in the above embodiment is executed. If the current period is a prelude or interlude period (S622: YES), the singing scoring adjustment process (S626) and the voice recognition adjustment process (S628) are executed. Specifically, the processes described in "(2) Description of the singing scoring adjustment process" and "(3) Description of the voice recognition adjustment process" in the above embodiment are executed.
After the singing scoring comparison process (S624), the singing scoring adjustment process (S626), or the voice recognition adjustment process (S628) has been executed, it is determined whether section singing scoring has finished (S630). If section singing scoring has not finished (S630: NO), the process returns to S622 and the processing described above is repeated.
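The branch at S622 can be sketched as a lookup in the section division data. The layout of that data is not given in this text, so the `(start time, kind)` tuples, the times, and the function names below are hypothetical; only the rule "compare during singing, adjust during prelude or interlude" is taken from the description.

```python
from bisect import bisect_right
from typing import List, Sequence, Tuple

# Hypothetical section division data: (start time in seconds, kind of period),
# sorted by start time. The real control data track format is not shown here.
SEGMENTS: List[Tuple[float, str]] = [
    (0.0, "prelude"),
    (12.0, "singing"),
    (70.0, "interlude"),
    (85.0, "singing"),
]


def period_at(t: float, segments: Sequence[Tuple[float, str]] = SEGMENTS) -> str:
    """Return the kind of period that contains playback time t."""
    starts = [start for start, _ in segments]
    index = bisect_right(starts, t) - 1  # last segment starting at or before t
    return segments[max(index, 0)][1]


def processes_for(t: float) -> List[str]:
    """S622: pick the comparison or the adjustment processes for time t."""
    if period_at(t) == "singing":
        return ["scoring_comparison"]                              # S624
    return ["scoring_adjustment", "recognition_adjustment"]        # S626, S628
```

With this assumed data, a call at 75 seconds falls in the interlude and selects both adjustment processes, while a call at 12 seconds selects the comparison.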

On the other hand, when section singing scoring has finished (S630: YES), the overall pitch difference data is extracted from the pitch difference data that the singing scoring comparison process, described in "(1) Description of the singing scoring comparison process" in the above embodiment, stored in the difference data storage area 66 of the ROM (not shown) built into the voice control unit 24 (S632). The sum of the overall pitch difference data is then divided by the reference counter value to average the pitch difference data (S634). The singing is then scored according to the pitch difference data, using the singing scoring information (see FIG. 15) stored in the scoring information memory area 54 of the HDD 16 (S636). When S636 is complete, S638 (see FIG. 26) is executed.

In S638, it is determined whether the singing scoring result is 80 points or more. The threshold is not limited to "80 points"; it should be set to a value that prevents the speech recognition rate from dropping. If the singing scoring result is below 80 points (S638: NO), S654 is executed. If the result is 80 points or more (S638: YES), S642 is executed.

In S642, the speech-recognized lyrics data is transmitted to the control unit 12. When S642 is complete, S644 is executed.
The voice control unit 24 then determines whether it has received the karaoke performance stop signal transmitted from the control unit 12 (S644). If no stop signal has been received and the karaoke performance continues (S644: NO), the process returns to S618 (see FIG. 24) and the processing described above is repeated. On receiving the karaoke performance stop signal (S644: YES), the voice control unit 24 stops the karaoke performance (S646). When S646 is complete, S648 is executed.

Next, the voice control unit 24 determines whether it has received the voice recognition end signal transmitted from the control unit 12 (S648). On receiving the voice recognition end signal (S648: YES), it ends voice recognition of the karaoke singing lyrics (S650).

When S650 is complete, this "lyric error game process of the voice control unit 24" ends.
In S654, it is determined whether the karaoke performance has ended. If it has not ended (S654: NO), the process returns to S618 (see FIG. 24) and the processing described above is repeated. If it has ended (S654: YES), a karaoke performance end signal is transmitted to the control unit 12 (S656).

Next, the voice control unit 24 determines whether it has received the voice recognition end signal transmitted from the control unit 12 (S658). On receiving the voice recognition end signal (S658: YES), it ends voice recognition of the karaoke singing lyrics (S660).

When S660 is complete, this "lyric error game process of the voice control unit 24" ends.
By executing the "lyric error game process of the control unit 12" performed by the control unit 12 of the karaoke apparatus 1 and the "lyric error game process of the voice control unit 24" performed by the voice control unit 24 of the karaoke apparatus 1, the karaoke performance is stopped, even in the middle of karaoke singing, once the singer has sung a predetermined number of lyrics incorrectly, which makes the apparatus more enjoyable as a game.

Further, according to the karaoke apparatus 1 of this embodiment, whether the number of lyric errors in the karaoke singing recognized by the voice recognition unit 10 is a predetermined number or more is determined only when the section singing score is at or above a predetermined value, for example 80 points.

Therefore, by limiting the game in which a lyric error stops the karaoke performance to cases where the singing scoring result is at or above a predetermined value, for example 80 points, the karaoke apparatus 1 can prevent the speech recognition rate from dropping in a game of higher difficulty.

FIG. 1 is a diagram showing a schematic configuration centered on the functions of the karaoke apparatus.
FIG. 2 is an explanatory diagram showing the on/off changes of the switches M40, M38, and M78, and the changes in the output signal levels of the singing variable gain amplifier M32 and the lyrics variable gain amplifier M72, against elapsed time before the karaoke performance, during the karaoke performance period, and after the end of the karaoke performance.
FIG. 3 is an explanatory diagram showing the pitch data of the guide melody data in the music database and the pitch data of the scoring signal.
FIG. 4 is a block diagram showing the configuration of the karaoke apparatus 1.
FIG. 5(a) is an explanatory diagram showing the memory areas provided in the HDD 16, FIG. 5(b) is an explanatory diagram showing the structure of MIDI data, and FIG. 5(c) is an explanatory diagram showing the karaoke singing lyrics recording area 60, the guide melody buffer 62, the reference data register 64, and the difference data storage area 66 provided in the ROM (not shown) built into the voice control unit 24.
FIG. 6(a) is an explanatory diagram showing the music data format, and FIG. 6(b) is an explanatory diagram showing an example of the word data used in lyrics.
FIG. 7 is a flowchart showing part of the procedure of the "NG word game process of the control unit 12" executed by the control unit 12.
FIG. 8 is a flowchart showing part of the procedure of the "NG word game process of the control unit 12" executed by the control unit 12.
FIG. 9 is a flowchart showing part of the procedure of the "NG word game process of the voice control unit 24" executed by the voice control unit 24.
FIG. 10 is a flowchart showing part of the procedure of the "NG word game process of the voice control unit 24" executed by the voice control unit 24.
FIG. 11 is a flowchart showing part of the procedure of the "NG word game process of the voice control unit 24" executed by the voice control unit 24.
FIG. 12(a) is a flowchart showing the procedure of the data capture process executed in the [comparison process] of the "singing scoring of the voice control unit 24" executed by the voice control unit 24, and FIG. 12(b) is a flowchart showing the procedure of the guide melody data capture process executed in the same [comparison process].
FIG. 13 is a flowchart showing details of the singing scoring comparison process in the "NG word game process of the voice control unit 24" executed by the voice control unit 24.
FIG. 14(a) is a flowchart showing details of the singing scoring adjustment process in the "NG word game process of the voice control unit 24" executed by the voice control unit 24, and FIG. 14(b) is a flowchart showing details of the singing scoring adjustment process in the "NG word game process of the voice control unit 24" executed by the voice control unit 24.
FIG. 15 is an explanatory diagram showing the scoring information corresponding to the pitch difference data.
FIG. 16(a) is an explanatory diagram showing an example in which a title is displayed on the display unit 36, FIG. 16(b) an example in which an NG word is displayed on the display unit 36, FIG. 16(c) an example in which an NG word lyrics screen is displayed on the display unit 36, FIG. 16(d) an example in which a congratulations screen is displayed on the display unit 36, and FIG. 16(e) an example in which a "too bad" screen is displayed on the display unit 36.
FIG. 17 is a flowchart showing part of the procedure of the "lyric scoring process of the control unit 12" executed by the control unit 12.
FIG. 18 is a flowchart showing part of the procedure of the "lyric scoring process of the control unit 12" executed by the control unit 12.
FIG. 19 is a flowchart showing part of the procedure of the "lyric scoring process of the voice control unit 24" executed by the voice control unit 24.
FIG. 20 is a flowchart showing part of the procedure of the "lyric scoring process of the voice control unit 24" executed by the voice control unit 24.
FIG. 21 is a flowchart showing part of the procedure of the "lyric scoring process of the voice control unit 24" executed by the voice control unit 24.
FIG. 22 is a flowchart showing part of the procedure of the "lyric error game process of the control unit 12" executed by the control unit 12.
FIG. 23 is a flowchart showing part of the procedure of the "lyric error game process of the control unit 12" executed by the control unit 12.
FIG. 24 is a flowchart showing part of the procedure of the "lyric error game process of the voice control unit 24" executed by the voice control unit 24.
FIG. 25 is a flowchart showing part of the procedure of the "lyric error game process of the voice control unit 24" executed by the voice control unit 24.
FIG. 26 is a flowchart showing part of the procedure of the "lyric error game process of the voice control unit 24" executed by the voice control unit 24.

Explanation of symbols

1 ... karaoke apparatus, 2 ... remote control terminal, 10 ... voice recognition unit, 12 ... control unit, 14 ... interface unit, 16 ... hard disk (HDD), 18 ... operation unit, 20 ... infrared communication unit, 22 ... operation processing unit, 24 ... voice control unit, 25 ... microphone, 28 ... speaker, 30 ... MIDI sound source, 32 ... video RAM, 34 ... video playback unit, 36 ... display unit, 38 ... video control unit, 39 ... bus, 40 ... USB, 50 ... music data memory area, 52 ... word data memory area, 54 ... scoring information memory area, 60 ... karaoke singing lyrics storage area, 62 ... guide melody buffer, 64 ... reference data register, 66 ... difference data storage area, 100 ... network.

Claims

A karaoke apparatus comprising:
voice signal input means for inputting a voice signal of karaoke singing;
music data storage means for storing music data of karaoke songs;
karaoke performance reproduction means for reproducing the music data stored in the music data storage means as a sound signal, and outputting the reproduced sound signal and the karaoke singing voice signal input from the voice signal input means to a speaker;
first generation means for comparing a first signal corresponding to the sound signal output from the karaoke performance reproduction means with a second signal corresponding to the karaoke singing voice signal input from the voice signal input means, and generating a speech recognition signal by subtracting the first signal from the second signal;
first gain setting means for setting, based on the speech recognition signal generated by the first generation means, a gain with which the first generation means subtracts the first signal from the second signal;
voice recognition means for recognizing the lyrics of the karaoke singing based on the speech recognition signal generated by the first generation means;
second generation means for comparing the first signal corresponding to the sound signal output from the karaoke performance reproduction means with the second signal corresponding to the karaoke singing voice signal input from the voice signal input means, and generating a scoring signal by subtracting the first signal from the second signal;
pitch extraction means for extracting pitch data from the scoring signal generated by the second generation means;
second gain setting means for setting, based on the scoring signal generated by the second generation means, a gain with which the second generation means subtracts the first signal from the second signal; and
singing scoring means for calculating the pitch difference between the pitch data of the singing melody of the karaoke song stored in the music data storage means and the pitch data of the scoring signal extracted by the pitch extraction means, and scoring the karaoke singing for each predetermined section based on the calculated pitch difference,
wherein, in the case of a karaoke song performance period, the first gain setting means sets the gain so that the signal level of the speech recognition signal is minimized, and the second gain setting means sets the gain so that the signal level of the scoring signal is minimized.

The karaoke apparatus according to claim 1, wherein
the music data storage means stores music data including lyrics data of karaoke songs, and
the karaoke apparatus further comprises lyric scoring means for comparing the lyrics data of the karaoke song recognized by the voice recognition means with the lyrics data of the karaoke song stored in the music data storage means, extracting the quantity of differing lyrics data, and scoring the lyrics of the karaoke song based on the extracted quantity and the quantity of lyrics data of the karaoke song.
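
As an illustration of lyric scoring from the quantity of differing lyrics data and the total quantity of lyrics data, the following sketch may help. The claim does not fix the unit of comparison or the scoring formula; the sketch assumes word-level units, counts stored words that have no match in the recognized sequence, and reports a percentage. The function names are hypothetical.

```python
from difflib import SequenceMatcher


def count_differing(recognized, reference):
    """Number of stored lyric units with no match in the recognized lyrics."""
    matcher = SequenceMatcher(a=reference, b=recognized, autojunk=False)
    matched = sum(block.size for block in matcher.get_matching_blocks())
    return len(reference) - matched


def lyric_score(recognized, reference):
    """Percentage of stored lyric units that were sung correctly (assumed formula)."""
    if not reference:
        return 0.0
    differing = count_differing(recognized, reference)
    return 100.0 * (len(reference) - differing) / len(reference)


stored = ["twinkle", "twinkle", "little", "star"]
sung = ["twinkle", "twinkle", "little", "car"]
score = lyric_score(sung, stored)
```

With the sample data, one of four stored words differs, so the assumed formula yields 75.0.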

The karaoke apparatus according to claim 1, wherein
the music data storage means stores music data including lyrics data of karaoke songs, and
the karaoke apparatus further comprises control means for comparing the lyrics data of the karaoke song recognized by the voice recognition means with the lyrics data of the karaoke song stored in the music data storage means, extracting the quantity of differing lyrics data, and, when the extracted quantity is determined to be equal to or greater than a predetermined number, controlling the karaoke performance reproducing means to stop reproduction of the karaoke performance.
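
The stop control recited above might be sketched as follows. This is a hedged example only: `Player` is a hypothetical stand-in for the karaoke performance reproducing means, `max_errors` stands for the predetermined number, and word-level comparison is an assumption not stated in the claim.

```python
from difflib import SequenceMatcher


class Player:
    """Hypothetical stand-in for the karaoke performance reproducing means."""

    def __init__(self):
        self.playing = True

    def stop(self):
        self.playing = False


def differing_count(recognized, reference):
    """Stored lyric units that have no match in the recognized lyrics."""
    blocks = SequenceMatcher(a=reference, b=recognized,
                             autojunk=False).get_matching_blocks()
    return len(reference) - sum(block.size for block in blocks)


def control_playback(player, recognized, reference, max_errors):
    """Stop playback once the differing quantity reaches max_errors."""
    errors = differing_count(recognized, reference)
    if errors >= max_errors:
        player.stop()
    return errors


stored = ["a", "b", "c", "d"]
player = Player()
errors = control_playback(player, ["a", "x", "y", "d"], stored, max_errors=2)
```

In this run two stored units differ, which meets the assumed threshold of two, so the player is stopped.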

The karaoke apparatus according to claim 1, further comprising
display means capable of displaying lyrics data of karaoke songs, wherein
the music data storage means stores music data including lyrics data of karaoke songs, and
the karaoke apparatus further comprises control means for extracting specific lyrics data from the music data stored in the music data storage means, controlling the display means to display the extracted lyrics data, comparing the lyrics data of the karaoke song recognized by the voice recognition means with the extracted lyrics data, and, when it is determined that identical lyrics data is present, controlling the karaoke performance reproducing means to stop reproduction of the karaoke performance.
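
A minimal sketch of the behavior recited above follows. The claim does not say how the specific lyrics data is chosen, so `pick_target` simply takes a caller-supplied slice of the stored lyrics; that rule, the `Player` class, and word-level matching are all assumptions made for illustration.

```python
class Player:
    """Hypothetical stand-in for the karaoke performance reproducing means."""

    def __init__(self):
        self.playing = True

    def stop(self):
        self.playing = False


def pick_target(reference, start, length):
    """Hypothetical extraction rule: a fixed slice of the stored lyrics."""
    return reference[start:start + length]


def contains_phrase(recognized, phrase):
    """True if phrase occurs as a contiguous run inside recognized."""
    n = len(phrase)
    if n == 0:
        return False
    return any(recognized[i:i + n] == phrase
               for i in range(len(recognized) - n + 1))


def stop_if_sung(player, recognized, target):
    """Stop playback when the displayed target lyrics have been sung."""
    if contains_phrase(recognized, target):
        player.stop()
        return True
    return False


stored = ["row", "row", "row", "your", "boat"]
target = pick_target(stored, 3, 2)   # the lyrics to be displayed
player = Player()
matched = stop_if_sung(player, ["row", "your", "boat"], target)
```

Here the displayed target is the two words starting at index 3, and playback stops because the recognized lyrics contain that phrase.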

The karaoke apparatus according to claim 2, further comprising voice recognition storage means for storing the lyrics data of the karaoke song recognized by the voice recognition means, wherein
the lyric scoring means determines, for each predetermined section, whether the singing scoring result of the karaoke singing by the singing scoring means is equal to or greater than a predetermined value, and, when it so determines, reads out the lyrics data of the karaoke song stored in the voice recognition storage means, compares the read-out lyrics data with the lyrics data of the karaoke song stored in the music data storage means, extracts the quantity of differing lyrics data, and scores the lyrics of the karaoke song based on the extracted quantity and the quantity of lyrics data of the karaoke song.
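
One plausible reading of the per-section gating recited above is sketched below: each section first receives a singing score from the pitch difference, and only sections at or above the predetermined value contribute to the lyric score. The tolerance of one semitone, the threshold of 60, the positional word comparison, and the dictionary layout of a section are all assumptions introduced for the example.

```python
def section_pitch_score(reference_pitches, sung_pitches, tolerance=1.0):
    """Percentage of notes whose pitch difference is within tolerance (semitones)."""
    pairs = list(zip(reference_pitches, sung_pitches))
    if not pairs:
        return 0.0
    hits = sum(1 for ref, sung in pairs if abs(ref - sung) <= tolerance)
    return 100.0 * hits / len(pairs)


def gated_lyric_score(sections, pitch_threshold=60.0):
    """Score lyrics using only sections whose singing score meets the threshold."""
    differing = 0
    total = 0
    for sec in sections:
        score = section_pitch_score(sec["ref_pitch"], sec["sung_pitch"])
        if score < pitch_threshold:
            continue  # section not used for lyric scoring
        ref, rec = sec["ref_lyrics"], sec["rec_lyrics"]
        # Assumed comparison: position by position, missing words count as differing.
        differing += sum(1 for r, s in zip(ref, rec) if r != s)
        differing += max(0, len(ref) - len(rec))
        total += len(ref)
    if total == 0:
        return None  # no section qualified
    return 100.0 * (total - differing) / total


sections = [
    {"ref_pitch": [60, 62, 64], "sung_pitch": [60, 62.5, 64],
     "ref_lyrics": ["la", "la", "la"], "rec_lyrics": ["la", "la", "na"]},
    {"ref_pitch": [65, 67], "sung_pitch": [50, 50],
     "ref_lyrics": ["do", "re"], "rec_lyrics": ["mi", "fa"]},
]
result = gated_lyric_score(sections)
```

In the sample data only the first section qualifies, so the second section's lyric mismatches do not affect the result.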

The karaoke apparatus according to claim 3, further comprising voice recognition storage means for storing the lyrics data of the karaoke song recognized by the voice recognition means, wherein
the control means determines, for each predetermined section, whether the singing scoring result of the karaoke singing by the singing scoring means is equal to or greater than a predetermined value, and, when it so determines, reads out the lyrics data of the karaoke song stored in the voice recognition storage means, compares the read-out lyrics data with the lyrics data of the karaoke song stored in the music data storage means, extracts the quantity of differing lyrics data, and, when the extracted quantity is determined to be equal to or greater than a predetermined number, controls the karaoke performance reproducing means to stop reproduction of the karaoke performance.

The karaoke apparatus according to claim 4, further comprising voice recognition storage means for storing the lyrics data of the karaoke song recognized by the voice recognition means, wherein
the control means extracts specific lyrics data from the music data stored in the music data storage means, controls the display means to display the extracted lyrics data, determines, for each predetermined section, whether the singing scoring result of the karaoke singing by the singing scoring means is equal to or greater than a predetermined value, and, when it so determines, reads out the lyrics data of the karaoke song stored in the voice recognition storage means, compares the read-out lyrics data with the extracted lyrics data, and, when it is determined that identical lyrics data is present, controls the karaoke performance reproducing means to stop reproduction of the karaoke performance.