JP5168239B2

JP5168239B2 - Distribution apparatus and distribution method

Info

Publication number: JP5168239B2
Application number: JP2009155273A
Authority: JP
Inventors: 浩西川
Original assignee: Brother Industries Ltd
Current assignee: Brother Industries Ltd
Priority date: 2009-06-30
Filing date: 2009-06-30
Publication date: 2013-03-21
Anticipated expiration: 2029-06-30
Also published as: JP2011013295A

Description

The present invention relates to the technical field of distribution apparatuses and distribution methods that distribute, to a terminal device, audio information generated so that the utterances of a word pair, consisting of an accented word and a word corresponding to it, are reproduced.

Audio teaching materials for memorizing foreign languages and the like have long been known. Such material can be downloaded from a website on the Internet, for example in the form of audio information in the foreign language to be learned. After downloading the audio information, the learner has it played back by a playback device such as a portable music player. The foreign-language speech is thereby reproduced, and the learner can memorize the foreign language by ear.

Also known is audio information created so that the foreign language to be learned is pronounced in time with the rhythm and melody of a piece of music. This lets the learner study rhythmically and enjoyably.

For example, Patent Document 1 discloses a method of providing language learning material for learning a given language using music whose lyrics are the character strings to be learned. In this method, the music and lyrics are selected so that the accents of the words making up the lyrics coincide with those notes of the piece that fall on the strong beats of the meter.

Patent Document 1: JP 2005-172858 A

When memorizing words of a foreign language with stress accent, English words for example, it is common to memorize each English word paired with its Japanese translation. To have the learner memorize English words and their Japanese translations by ear, audio information is therefore created in which, for example, the English word and the Japanese translation are pronounced alternately.

When the English words and Japanese translations are to be pronounced in time with music, the audio information could be created as follows. The English words and Japanese translations are pronounced to the rhythm of the music, as in "book, 本 (hon), book, 本 (hon), entertainment, 娯楽 (goraku), entertainment, 娯楽 (goraku)". Specifically, the moment at which the accent of each English word is pronounced is aligned with a beat. For example, the pronunciation of "book" written in kana is 「ブック」 (bukku); here the accented 「ブッ」 (bu) is timed to fall on the beat. Likewise, "entertainment" written in kana is 「エンターテインメント」 (entāteinmento); here the accented 「テ」 (te) is timed to fall on the beat. Because the accent timing and the beat timing coincide, the learner can memorize the English words together with their Japanese translations rhythmically.

However, for a word whose accent is not at the beginning, such as "entertainment", the part preceding the accented part (「エンター」 (entā) in the case of "entertainment") is pronounced before the beat. This shortens the interval between the English word and the Japanese translation pronounced immediately before it, that is, the silent time from the end of the Japanese translation to the start of the English word during which neither is uttered. If this interval is too short, the learner has difficulty telling the Japanese translation and the English word apart and may find it hard to memorize them.
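The shrinking interval described above can be illustrated with a small calculation. This is only a sketch in Python: the function names and all timing values (in milliseconds) are hypothetical and are not taken from the specification.

```python
# Illustrative sketch only; all names and numbers are hypothetical.

def word_start_ms(beat_ms: int, pre_accent_ms: int) -> int:
    """Start time of an English word whose accent must land exactly on the beat."""
    return beat_ms - pre_accent_ms


def gap_ms(translation_end_ms: int, beat_ms: int, pre_accent_ms: int) -> int:
    """Silence between the end of the preceding translation and the word start."""
    return word_start_ms(beat_ms, pre_accent_ms) - translation_end_ms


# The preceding translation ends at 1600 ms and the next beat falls at 2000 ms.
gap_book = gap_ms(1600, 2000, pre_accent_ms=0)             # accent at the head of the word
gap_entertainment = gap_ms(1600, 2000, pre_accent_ms=350)  # "enter" precedes the accent
print(gap_book, gap_entertainment)
```

With these assumed values, the word whose accent comes first keeps the full silence before the beat, while the word with a long pre-accent part leaves only a small fraction of it.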

To prevent this problem, one could conceivably delay the pronunciation of the problematic word by one beat to secure the interval. In that case, however, the rhythm becomes unnatural and the words may actually become harder to memorize. Moreover, playback time grows by one extra beat for every such word, so the number of words played back per unit time falls and learning efficiency drops.

The present invention has been made in view of the above, and its object is to provide a distribution apparatus and a distribution method that distribute audio information with which a learner can reliably and efficiently memorize, in time with the rhythm of music or the like, accented words such as English words together with corresponding words such as their Japanese translations.

To solve the above problem, the invention according to claim 1 is a distribution apparatus that distributes memorization audio information, which is audio information generated so that the utterances of word pairs, each consisting of an accented first word and a second word corresponding to the first word, are reproduced, the apparatus comprising: generation means for generating, based on information on a rhythm sound sounded at the beats of a meter and information on the word pairs, memorization audio information configured so that a plurality of identical or mutually different word pairs and the rhythm sound are sounded, and so that the first word and the second word are pronounced alternately with the accent of the first word timed to the beat; storage means for storing the generated memorization audio information; and transmission means for transmitting the stored memorization audio information to a terminal device, wherein the generation means generates the memorization audio information such that, for a first word for which the interval from the end of pronunciation of the second word to the start of pronunciation of the first word pronounced next would be less than a predetermined time, the accent is pronounced off the beat, so that the interval is at least the predetermined time.

The invention according to claim 2 is the distribution apparatus according to claim 1, wherein the generation means comprises determination means for determining whether the condition that the interval is less than the predetermined time is satisfied, and generates the memorization audio information with the accent of any first word determined to satisfy the condition shifted off the beat.

The invention according to claim 3 is the distribution apparatus according to claim 1, wherein the condition is either that the pronunciation time from the start of pronunciation of the first word to the pronunciation of its accent is at least a predetermined pronunciation time, or that the number of characters preceding the accented character in the first word is at least a predetermined number of characters; the generation means comprises determination means for determining whether the first word satisfies the condition, and generates the memorization audio information with the accent of any first word determined to satisfy the condition shifted off the beat.
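A minimal sketch of the two alternative conditions of claim 3, in Python. The function names, the threshold values, and the convention that the accent position is given as the zero-based index of the first letter of the accented part are assumptions made for illustration; the claim fixes none of them.

```python
# Illustrative sketch only; names, thresholds and the index convention are hypothetical.

def exceeds_pre_accent_time(pre_accent_ms: int, min_time_ms: int) -> bool:
    """Condition A: the time from word start to accent is at least the threshold."""
    return pre_accent_ms >= min_time_ms


def exceeds_pre_accent_letters(accent_index: int, min_letters: int) -> bool:
    """Condition B: the number of letters before the accented letter is at least the threshold."""
    return accent_index >= min_letters


# "book": the accent is on the first letter, so nothing precedes it.
book_shift = exceeds_pre_accent_letters(accent_index=0, min_letters=4)
# "entertainment": the five letters "enter" precede the accented part.
entertainment_shift = exceeds_pre_accent_letters(accent_index=5, min_letters=4)
print(book_shift, entertainment_shift)
```

Neither check looks at the preceding second word, which is why this form of the condition can be evaluated from information about the first word alone.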

The invention according to claim 4 is the distribution apparatus according to claim 2 or claim 3, wherein the generation means generates the memorization audio information with the start of pronunciation of any first word determined to satisfy the condition aligned with the beat.
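The determination of claim 2 combined with the start alignment of claim 4 can be sketched as follows. This is a hedged Python illustration, not the implementation described in the embodiments: the function name, the millisecond units, and the sample values are all assumptions.

```python
# Illustrative sketch only; names, units and values are hypothetical.
from typing import Tuple


def schedule_first_word(beat_ms: int, pre_accent_ms: int,
                        prev_end_ms: int, min_gap_ms: int) -> Tuple[int, int]:
    """Return (word start, accent time) in ms for one first word.

    prev_end_ms is when the preceding second word stops sounding.
    """
    start_if_accent_on_beat = beat_ms - pre_accent_ms
    if start_if_accent_on_beat - prev_end_ms < min_gap_ms:
        # Condition satisfied: the gap would be too short, so the word start
        # (rather than the accent) is aligned with the beat.
        return beat_ms, beat_ms + pre_accent_ms
    # Condition not satisfied: keep the accent on the beat.
    return start_if_accent_on_beat, beat_ms


print(schedule_first_word(2000, 0, 1600, 200))    # accent at the head of the word
print(schedule_first_word(2000, 100, 1600, 200))  # short pre-accent part
print(schedule_first_word(2000, 350, 1600, 200))  # long pre-accent part
```

In the third call the accent ends up after the beat, but the word still begins on the beat, so no extra beat of playback time is spent on it.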

The invention according to claim 5 is the distribution apparatus according to claim 4, wherein a plurality of consecutive beats within one measure are allotted to each of the first word and the second word of a word pair, and, when a first word determined to satisfy the condition has a first accent pronounced most strongly and a second accent pronounced next most strongly, the first accent being pronounced after the second accent, the generation means generates the memorization audio information with the start of pronunciation of the first word aligned with the first of the beats allotted to the first word and with the first accent aligned with one of the allotted beats other than the first.

The invention according to claim 6 is the distribution apparatus according to any one of claims 2 to 5, wherein the generation means generates the memorization audio information by timing the pronunciation of the second word to a predetermined timing based on the rhythm sound information and second-word audio information, which is audio information of the utterance of the second word, and, based on the rhythm sound information and first-word audio information, which is audio information of the utterance of the first word, aligning the accent of any first word that does not satisfy the condition with the beat and shifting the accent of any first word that satisfies the condition off the beat.

The invention according to claim 7 is the distribution apparatus according to claim 6, further comprising: first-word information storage means for storing first-word information indicating the first word in association with accent information indicating the position of the accent of that first word; first-word designation information receiving means for receiving, from the terminal device, first-word designation information indicating a first word designated by a user; first-word audio information acquisition means for acquiring, based on the received first-word designation information, the first-word audio information of the designated first word; and second-word audio information acquisition means for acquiring, based on the received first-word designation information, the second-word audio information of the second word corresponding to the designated first word, wherein the determination means determines whether the first word designated by the user satisfies the condition based at least on the accent information corresponding to that first word, and the generation means generates, based on the rhythm sound information, the acquired first-word audio information, and the acquired second-word audio information, memorization audio information in which the utterance of the designated first word and the utterance of the corresponding second word are reproduced.

The invention according to claim 8 is a distribution method for distributing memorization audio information, which is audio information generated so that the utterances of word pairs, each consisting of an accented first word and a second word corresponding to the first word, are reproduced, the method comprising: a generation step of generating, based on information on a rhythm sound sounded at the beats of a meter and information on the word pairs, memorization audio information configured so that a plurality of identical or mutually different word pairs and the rhythm sound are sounded, and so that the first word and the second word are pronounced alternately with the accent of the first word timed to the beat; a storage step of storing the generated memorization audio information; and a transmission step of transmitting the stored memorization audio information to a terminal device, wherein, in the generation step, the memorization audio information is generated such that, for a first word for which the interval from the end of pronunciation of the second word to the start of pronunciation of the first word pronounced next would be less than a predetermined time, the accent is pronounced off the beat, so that the interval is at least the predetermined time.

The invention according to claim 9 is the distribution method according to claim 8, further comprising: a first-word designation information receiving step of receiving, from the terminal device, first-word designation information indicating a first word designated by a user; a first-word audio information acquisition step of acquiring, based on the received first-word designation information, first-word audio information, which is audio information of the utterance of the designated first word; a second-word audio information acquisition step of acquiring, based on the received first-word designation information, second-word audio information, which is audio information of the utterance of the second word corresponding to the designated first word; an accent information acquisition step of acquiring the accent information corresponding to the first word designated by the user from first-word information storage means that stores first-word information indicating the first word in association with accent information indicating the position of the accent of that first word; and a determination step of determining, based at least on the acquired accent information, whether to shift the accent timing of the first word designated by the user, wherein, in the generation step, memorization audio information in which the utterance of the designated first word and the utterance of the corresponding second word are reproduced is generated based on the rhythm sound information, the acquired first-word audio information, and the acquired second-word audio information, with the pronunciation of the second word timed to a predetermined timing, the accent of any first word for which it is determined not to shift the accent timing aligned with the beat, and the accent of any first word for which it is determined to shift the accent timing shifted off the beat.

According to the invention of claim 1 or claim 8, for a first word for which the interval from the end of pronunciation of the second word to the start of pronunciation of the first word pronounced next would be less than the predetermined time, memorization audio information is generated in which the accent is off the beat, so that the interval is at least the predetermined time. When the generated memorization audio information is played back, the first word and the second word are reproduced at an interval that makes them easy to tell apart, so the user can reliably memorize the paired words. In addition, a reduction in the number of words played back per unit time is avoided, so the user can memorize the paired words efficiently.

According to the invention of claim 2, the interval between the second word and the first word is reliably prevented from falling below the predetermined time.

According to the invention of claim 3, the condition involves nothing concerning the second word, so whether to shift the pronunciation timing of the first word can be determined without using information on the second word.

According to the invention of claim 4, playback of the first word starts on the beat, so the utterance of the first word is reproduced rhythmically and the paired words become easier to memorize.

According to the invention of claim 5, for a first word whose most strongly pronounced accent comes after its next most strongly pronounced accent, pronunciation starts on the first of the allotted beats and the most strongly pronounced accent falls on a later beat. The utterance of a word with multiple accents is thus reproduced rhythmically, and the paired words become easier to memorize.

According to the invention of claim 6, the audio information of the first word and that of the second word exist separately, so memorization audio information in which the pronunciation timing of the first word is shifted can be generated easily.

According to the invention of claim 7 or claim 9, the user designates the first word and can thereby memorize the first words that the user wants to memorize.

Block diagram showing an example of the schematic configuration of the communication system S according to the first embodiment.
Block diagram showing an example of the schematic configuration of the music distribution server 1 according to the first embodiment.
Diagram showing an example of the data registered in the parts WAV data database and of the schematic configuration of the software in the music distribution server 1.
Diagram showing an example of the menu structure for selecting a training course.
Diagram showing an example of the specifications of the jogging music data for each step of a training course.
Diagram showing an example of the schematic configuration of the jogging music data in each step: (a) is the jogging music data of step 1, (b) of step 2, (c) of step 3, (d) of step 4, (e) of step 5, and (f) of step 6.
Diagram showing an example of the schematic configuration of the jogging music data for each jogging time: (a) is the case of 15 minutes, (b) 30 minutes, (c) 45 minutes, (d) 60 minutes, (e) 90 minutes, and (f) 120 minutes.
Diagram showing an outline example of the data structure of one file of jogging music data.
Diagram showing an outline example of the data structure of the memorization target word audio data.
Diagram showing an example of creating the memorization target word audio data.
(a) and (b) are diagrams showing an example of adjusting the pronunciation timing of English words in the English word group WAV data 302.
(a) and (b) are diagrams showing an example of adjusting the pronunciation timing of English words in the English word group WAV data 302.
(a) and (b) are diagrams showing an example of adjusting the pronunciation timing of English words in the English word group WAV data 302.
(a) to (d) are diagrams showing an example of adjusting the pronunciation timing of English words in the English word group WAV data 302.
Diagram showing various setting values and the like according to the first embodiment: (a) is a table showing the minimum and maximum number of music pieces that can be included in one file of the jogging main part, (b) is a table showing the number of measures contained in one file of the jogging main part at each running tempo, (c) is a table showing the number of measures of the inter-song connecting data used for the prelude, the inter-song sections, and the postlude, and (d) is a table showing the maximum allowable number of music measures for each tempo when the file time is 15 minutes.
Diagram showing an example of the method of creating the jogging main part of the jogging music data.
Flowchart showing an example of the memorization target word audio data creation processing of the control unit 11 of the music distribution server 1 according to the first embodiment.
Flowchart showing an example of the main processing of the control unit 11 of the music distribution server 1 according to the first embodiment.
Flowchart showing an example of the music selection processing of the control unit 11 of the music distribution server 1 according to the first embodiment.
Flowchart showing an example of the jogging music data creation processing of the control unit 11 of the music distribution server 1 according to the first embodiment.
Diagram showing an example of the contents of the information registered in the dictionary database.
Diagram showing an example of creating the memorization target word audio data.
Flowchart showing an example of the memorization target word audio data creation processing of the control unit 11 of the music distribution server 1 according to Example 1 of the second embodiment.
Flowchart showing an example of the memorization target word audio data creation processing of the control unit 11 of the music distribution server 1 according to Example 2 of the second embodiment.

The best mode for carrying out the present invention is described below with reference to the drawings. The embodiment described below is one in which the present invention is applied to a communication system.

[1. First Embodiment]
[1.1 Configuration of the communication system, etc.]
First, the schematic configuration and the like of the communication system S according to this embodiment are described with reference to FIG. 1.

FIG. 1 is a block diagram showing an example of the schematic configuration of the communication system S according to this embodiment.

As shown in FIG. 1, the communication system S includes a music distribution server 1 as an example of the distribution apparatus, a plurality of user PCs (Personal Computers) 2, and a plurality of portable music players 3 each connectable to a user PC 2.

The music distribution server 1 and the user PCs 2 can exchange data with each other via a network NW using, for example, TCP/IP (Transmission Control Protocol/Internet Protocol) as the communication protocol. The network NW is built from, for example, the Internet, dedicated communication lines (such as CATV (Community Antenna Television) lines), mobile communication networks (including base stations and the like), and gateways.

A user PC 2 and a portable music player 3 can exchange data with each other via, for example, a cable conforming to a bus standard such as USB (Universal Serial Bus) or IEEE (The Institute of Electrical and Electronics Engineers, Inc.) 1394, or by wireless communication such as Bluetooth (IEEE 802.15.1). Data may also be passed between the user PC 2 and the portable music player 3 via a recording medium such as a memory card.

In the communication system S configured in this way, the music distribution server 1, in response to a request or the like from a user PC 2, creates jogging music data, which is data of music that the user listens to while jogging or the like, and transmits this jogging music data to the user PC 2.

ユーザＰＣ２にダウンロードされたジョギング用楽曲データは、ユーザ操作等により、有線、無線又は記録媒体を介して携帯音楽プレーヤ３に転送される。そして、ユーザは、携帯音楽プレーヤ３にそのジョギング用楽曲データを再生させることにより、ジョギング用の楽曲を聴きながらジョギング等を行う。 The jogging music data downloaded to the user PC 2 is transferred to the portable music player 3 via a wired, wireless or recording medium by a user operation or the like. Then, the user performs jogging or the like while listening to the music for jogging by causing the portable music player 3 to reproduce the music data for jogging.

ところで、ジョギング等の運動を継続していると、いわゆるランニングハイ（又はランナーズハイとも称される）と言われる、気分が高揚した状態になることが知られている。このランニングハイは、脳内物質であるエンドルフィンが分泌されることが原因と言われているが、この状態になると、本来苦しいと感じるべき身体状態となっても精神的にはそれほど苦痛を感じなくなる。そして、このランニングハイの状態で物事を記憶すると、その記憶した内容は、運動中でないときに記憶した場合よりも忘却し難い状態を維持し易い場合が多いと言われている。 It is known that continuing exercise such as jogging can bring about a state of elevated mood, the so-called running high (also called runner's high). The running high is said to be caused by the secretion of endorphins, substances in the brain; in this state, a person feels little mental distress even when the body is in a condition that would normally feel painful. It is also said that things memorized in the running-high state often tend to be harder to forget than things memorized while not exercising.

そこで、本実施形態においては、ジョギング用の楽曲に、芸術音楽としての楽曲のほか、記憶対象の英単語（第１の語の一例）及び当該英単語の日本語訳（第２の語の一例）の発声音が所定のリズムに合わせて再生されるものも含ませることができる。ユーザは、ジョギング用の楽曲を聴きながらジョギング等を行うと共に、当該楽曲聴取後にランニングハイとなった状態において記憶対象の英単語及びその日本語訳の音声を聞いてその暗記を行う。 Therefore, in the present embodiment, the music for jogging can include, in addition to music as art music, pieces in which the utterances of an English word to be memorized (an example of a first word) and the Japanese translation of that English word (an example of a second word) are reproduced in time with a predetermined rhythm. The user jogs while listening to the jogging music and, once in the running-high state after listening to that music, listens to the speech of the English words to be memorized and their Japanese translations and memorizes them.

ここで、英単語及びその日本語訳を、「記憶対象語」（ペア語の一例）と称する。また、楽曲の１曲分に相当する記憶対象語の音声を、「記憶対象語音声」と称し、１曲に対応する記憶対象語音声の単位を、「１章」と称する。１章の記憶対象語音声には、所定数の記憶対象語（所定単語数分の英単語及び日本語訳）の音声が含まれている。この１章に相当する所定数の記憶対象語を、「記憶対象語群」と称する。 Here, an English word and its Japanese translation are together referred to as a “storage target word” (an example of a pair word). The speech of storage target words amounting to one piece of music is referred to as “storage target word speech”, and the unit of storage target word speech corresponding to one piece is referred to as “one chapter”. The storage target word speech of one chapter contains the speech of a predetermined number of storage target words (English words and Japanese translations for a predetermined number of words). The predetermined number of storage target words corresponding to one chapter is referred to as a “storage target word group”.
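
The terms defined above can be pictured as a small data model. The following Python sketch is illustrative only: the class names `StorageTargetWord` and `Chapter`, the helper `split_into_chapters`, and the choice to let the last chapter hold fewer words are assumptions made for this example and do not appear in the specification.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class StorageTargetWord:
    """One pair word: an English word and its Japanese translation."""
    english: str   # example of the first word
    japanese: str  # example of the second word


@dataclass(frozen=True)
class Chapter:
    """The storage target word group voiced within one piece of music."""
    number: int
    words: Tuple[StorageTargetWord, ...]


def split_into_chapters(pairs: Sequence[Tuple[str, str]],
                        words_per_chapter: int) -> List[Chapter]:
    """Group (english, japanese) pairs into chapters of a fixed size.

    Assumption: a final chapter with fewer words is allowed.
    """
    if words_per_chapter <= 0:
        raise ValueError("words_per_chapter must be positive")
    words = [StorageTargetWord(en, ja) for en, ja in pairs]
    return [
        Chapter(number=i // words_per_chapter + 1,
                words=tuple(words[i:i + words_per_chapter]))
        for i in range(0, len(words), words_per_chapter)
    ]
```

For instance, five pairs with `words_per_chapter=2` yield three chapters, the last containing a single pair.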

記憶対象語音声には、英単語及び日本語訳の音声とともに、伴奏としての役割を有するドラムベース等によるリズム音も含まれているので、本実施形態においては、この記憶対象語音声も楽曲の一種類として扱う。なお、以降の説明において、芸術音楽と記憶対象語音声とを特に区別する必要がない場合、芸術音楽及び記憶対象語音声を纏めて、楽曲と称する。ただし、音声データとしての芸術音楽及び記憶対象語音声は、後述するように、楽曲本体データ及び記憶対象語音声データというように区別する。 Because the storage target word speech contains, along with the speech of the English words and Japanese translations, rhythm sounds from a drum-and-bass part or the like that serve as accompaniment, the present embodiment also treats the storage target word speech as one kind of music. In the following description, where art music and storage target word speech need not be distinguished, they are collectively referred to as music. As audio data, however, art music and storage target word speech are distinguished as music body data and storage target word audio data, as described later.

詳細は後述するが、このジョギング用楽曲データは、複数の楽曲のデータにより構成されており、ユーザの目標等に合うトレーニングコースに対応した演奏時間及びテンポで各楽曲が再生されるように作成されている。また、本実施形態において、このジョギング用楽曲データについては、芸術音楽としての各楽曲の主要部（前奏、後奏等を除いた部分）が連続するメドレーとして再生されるとともに、記憶対象語音声としての楽曲が再生されるように構成されている。 Although details are described later, this jogging music data consists of the data of a plurality of pieces and is created so that each piece is reproduced with a performance time and tempo corresponding to the training course that suits the user's goals and the like. In the present embodiment, the jogging music data is configured so that the main parts (excluding the prelude, postlude and the like) of the pieces of art music are reproduced as a continuous medley and so that the pieces serving as storage target word speech are also reproduced.

更に、ジョギング用楽曲データがダウンロードされると、そのジョギング用楽曲データの購入代金（ジョギング用楽曲データ等を構成する楽曲の曲数等に応じた著作権料等を含む）がシステム側からユーザに対して請求される。 Furthermore, when the jogging music data is downloaded, the system bills the user for its purchase price (including copyright fees and the like that depend on, for example, the number of pieces making up the jogging music data).

なお、ユーザＰＣ２は、例えば、一般的な構成のパーソナルコンピュータを用いることが可能であり、また、携帯音楽プレーヤ３も、例えば、一般的な構成の携帯用のデジタルオーディオプレーヤを用いることができる。 The user PC 2 can use, for example, a personal computer with a general configuration, and the portable music player 3 can also use, for example, a portable digital audio player with a general configuration.

［１．２楽曲配信サーバの構成及び機能等］ [1.2 Configuration and functions of the music distribution server]
［１．２．１楽曲配信サーバの構成］ [1.2.1 Configuration of the music distribution server]
次に、楽曲配信サーバ１の構成及び機能等について説明するが、始めに、楽曲配信サーバ１の構成について、図２を用いて説明する。 Next, the configuration and functions of the music distribution server 1 will be described. First, the configuration of the music distribution server 1 will be described with reference to FIG. 2.

図２は、本実施形態に係る楽曲配信サーバ１の概要構成の一例を示すブロック図である。 FIG. 2 is a block diagram illustrating an example of a schematic configuration of the music distribution server 1 according to the present embodiment.

図２に示すように、楽曲配信サーバ１は、ＣＰＵ（Central Processing Unit）、ＲＯＭ（Read Only Memory）、ＲＡＭ（Random Access Memory）等を備える制御部１１と、各種データ及びプログラムを記憶する記憶手段の一例としての記憶部１２（例えば、ハードディスクドライブ等）と、ネットワークＮＷに接続して、ユーザＰＣ２等との通信状態を制御する受信手段及び送信手段夫々の一例としての通信部１３と、ＷＡＶフォーマット（RIFF（Resource Interchange File Format） Waveform Audio Format）の楽曲データ及び記憶対象語音声データをＭＰ３（MPEG Audio Layer-3）フォーマットの楽曲データ及び記憶対象語音声データにエンコードするエンコーダ部１４と、を含んで構成されており、制御部１１と各部とはシステムバス１５を介して接続されている。 As shown in FIG. 2, the music distribution server 1 includes a control unit 11 including a CPU (Central Processing Unit), a ROM (Read Only Memory), a RAM (Random Access Memory), and the like, and a storage unit that stores various data and programs. A storage unit 12 (for example, a hard disk drive or the like) as an example, a communication unit 13 as an example of a reception unit and a transmission unit that are connected to the network NW and control the communication state with the user PC 2 or the like, and a WAV format (RIFF (Resource Interchange File Format) Waveform Audio Format) music data and storage target word audio data are encoded into MP3 (MPEG Audio Layer-3) format music data and storage target word audio data. The control unit 11 and each unit are connected via a system bus 15.

制御部１１は、本発明において、生成手段及び送信手段の一例を構成する。そして、制御部１１は、ＣＰＵが、ＲＯＭや記憶部１２に記憶された各種プログラムを読み出して実行することにより楽曲配信サーバ１の各部を統括制御すると共に、後述するパーツＷＡＶデータデータベースプログラム２０１、サーバシステムプログラム２０２、ＷＥＢサイトプログラム２０３、楽曲本体ＷＡＶパーツデータ書き出しプログラム２０４及び記憶対象語ＷＡＶデータ書き出しプログラム２０５等を読み出し実行することにより、上記生成手段及び送信手段として機能する。 In the present invention, the control unit 11 constitutes an example of a generation unit and a transmission unit. The control unit 11 performs overall control of each unit of the music distribution server 1 by the CPU reading and executing various programs stored in the ROM and the storage unit 12, as well as a parts WAV data database program 201 and a server described later. By reading and executing the system program 202, the WEB site program 203, the music body WAV part data writing program 204, the storage target word WAV data writing program 205, etc., it functions as the generating means and transmitting means.

［１．２．２データ及びプログラム等］ [1.2.2 Data and programs]
次に、記憶部１２に記憶されるデータ及びプログラムのソフトウェア構成等について、図３乃至図８を用いて説明する。 Next, the data stored in the storage unit 12, the software configuration of the programs, and the like will be described with reference to FIGS. 3 to 8.

図３は、パーツＷＡＶデータデータベースに登録されるデータ、及び、楽曲配信サーバ１におけるソフトウェアの概要構成の一例を示す図である。また、図４は、トレーニングコースを選択するためのメニュー構成の一例を示す図である。また、図５は、あるトレーニングコースにおける、各ステップのジョギング用楽曲データの仕様の一例を示す図である。また、図６は、各ステップにおけるジョギング用楽曲データの概要構成の一例を示す図である。また、図７は、各ジョギング時間におけるジョギング用楽曲データの概要構成の一例を示す図である。また、図８は、ジョギング用楽曲データの１ファイルのデータ構造の概要例を示す図である。 FIG. 3 is a diagram showing an example of a schematic configuration of data registered in the parts WAV data database and software in the music distribution server 1. FIG. 4 is a diagram illustrating an example of a menu configuration for selecting a training course. FIG. 5 is a diagram showing an example of the specification of jogging music data for each step in a certain training course. FIG. 6 is a diagram showing an example of a schematic configuration of jogging music data in each step. FIG. 7 is a diagram showing an example of a schematic configuration of jogging music data at each jogging time. FIG. 8 is a diagram showing a schematic example of the data structure of one file of jogging music data.

記憶部１２には、ユーザの個人情報（例えば、氏名、年齢、メールアドレス、ユーザＩＤ、パスワード等）、ユーザのトレーニング情報（例えば、選択されたトレーニングコース、当該トレーニングコースのジョギング用楽曲データを最初にダウンロードした日時、現在のステップ、ジョギング用楽曲データを構成する楽曲の内容及び演奏順等を示す演奏リスト）、作成されたジョギング用楽曲データ、ユーザの選曲の履歴を示す履歴情報（例えば、選択された楽曲、アルバム、アーティスト、ジャンル又は記憶対象語群等を時系列で示す情報）等が、ユーザ毎に対応付けて記憶されている。 The storage unit 12 stores, in association with each user, the user's personal information (for example, name, age, e-mail address, user ID, password), the user's training information (for example, the selected training course, the date and time when jogging music data for that training course was first downloaded, the current step, and a performance list indicating the content and playing order of the pieces making up the jogging music data), the created jogging music data, history information indicating the user's music selection history (for example, information showing the selected pieces, albums, artists, genres, storage target word groups and the like in chronological order), and so on.

また、記憶部１２には、ジョギング用楽曲データを構成するパーツとなるパーツデータが登録されるパーツＷＡＶデータデータベースが構築されている。更にまた、記憶部１２には、図３に示すパーツＷＡＶデータデータベースプログラム２０１、サーバシステムプログラム２０２、ＷＥＢサイトプログラム２０３、楽曲本体ＷＡＶパーツデータ書き出しプログラム２０４及び記憶対象語ＷＡＶデータ書き出しプログラム２０５等が記憶されている。 In addition, a parts WAV data database, in which the part data serving as the parts of the jogging music data is registered, is built in the storage unit 12. Furthermore, the storage unit 12 stores the parts WAV data database program 201, the server system program 202, the WEB site program 203, the music body WAV parts data writing program 204, the storage target word WAV data writing program 205, and the like shown in FIG. 3.

上記パーツＷＡＶデータデータベースには、図３に示す楽曲本体ＷＡＶパーツデータ１０１、ジョギングアレンジ曲間つなぎＷＡＶパーツデータ１０２、ジョギングアレンジ音声ガイダンスＷＡＶパーツデータ１０３、ＤＪ音声ＷＡＶパーツデータ１０４、記憶対象語ＷＡＶパーツデータ３０１等が、ＷＡＶフォーマットで登録されている。 The part WAV data database includes the music main body WAV part data 101, jogging arrangement inter-music connection WAV part data 102, jogging arrangement voice guidance WAV part data 103, DJ voice WAV part data 104, and storage target word WAV parts shown in FIG. Data 301 and the like are registered in the WAV format.

楽曲本体ＷＡＶパーツデータ１０１は、ジョギング用の楽曲を構成する主要的な位置を占める芸術音楽としての楽曲本体のＷＡＶデータであり、全てのデータが同一のテンポ（本実施形態においては、１４０ＢＰＭ（Beats Per Minute））で記録されている。そして、楽曲本体ＷＡＶパーツデータ１０１は、図３に示す楽曲本体ＭＩＤＩデータ１０５とジョギングアレンジドラムベースＷＡＶデータ１０６とに基づき、楽曲本体ＷＡＶパーツデータ書き出しプログラム２０４を用いて作成される。 The music main body WAV part data 101 is the WAV data of the music main body as art music occupying the main positions constituting the music for jogging, and all the data have the same tempo (in this embodiment, 140 BPM (Beats Per Minute)). The music body WAV parts data 101 is created using the music body WAV parts data writing program 204 based on the music body MIDI data 105 and the jogging arrangement drum base WAV data 106 shown in FIG.

楽曲本体ＭＩＤＩデータ１０５は、楽曲本体ＷＡＶパーツデータ１０１の原曲が記録されたＭＩＤＩ（Musical Instrument Digital Interface）フォーマットのデータである。また、上記ジョギングアレンジドラムベースＷＡＶデータ１０６は、ドラムやシンバル等によるリズム音等が記録されたＷＡＶデータであり、楽曲本体ＭＩＤＩデータ１０５の原曲をジョギング用にアレンジするために用いられるデータである。 The music body MIDI data 105 is data in the MIDI (Musical Instrument Digital Interface) format in which the original music of the music body WAV parts data 101 is recorded. Further, the jogging arrangement drum base WAV data 106 is WAV data in which rhythm sounds by drums, cymbals and the like are recorded, and is data used for arranging the original music of the music body MIDI data 105 for jogging. .

楽曲本体ＷＡＶパーツデータ書き出しプログラム２０４においては、楽曲本体ＭＩＤＩデータ１０５から、前奏部分、後奏部分、間奏部分等が小節単位で削除され、残った主要部分に対して１４０ＢＰＭでテンポが調整される。このとき、調整前と調整後とでは、音程が変わらないように調整が行われる。そして、当該主要部分の楽曲本体ＭＩＤＩデータ１０５のフォーマットがＷＡＶフォーマットに変換され、ジョギングアレンジドラムベースＷＡＶデータ１０６と合成されて、ジョギング用のアレンジ（例えば、ハウスミュージック調）が施される。こうして作成された楽曲データが楽曲本体ＷＡＶパーツデータ１０１である。 In the music body WAV parts data writing program 204, the prelude, postlude, interlude, and similar portions are deleted from the music body MIDI data 105 in units of measures, and the tempo of the remaining main part is adjusted to 140 BPM. The adjustment is made so that the pitch is the same before and after it. The music body MIDI data 105 of the main part is then converted into the WAV format and combined with the jogging arrangement drum base WAV data 106 to give it a jogging arrangement (for example, in a house music style). The music data created in this way is the music body WAV parts data 101.
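
As a rough illustration of the trimming and tempo adjustment described above, the Python sketch below computes the tempo change factor and the resulting length of the main part at 140 BPM. The function names, the `removed_measures` parameter, and the default of four beats per measure are assumptions introduced for this example; the specification does not give the actual implementation. Because the source material is MIDI, changing the tempo rescales only the timing of note events, which is one way the pitch can remain unchanged.

```python
def tempo_ratio(original_bpm: float, target_bpm: float = 140.0) -> float:
    """Factor by which playback speeds up (>1) or slows down (<1)."""
    if original_bpm <= 0 or target_bpm <= 0:
        raise ValueError("tempos must be positive")
    return target_bpm / original_bpm


def main_part_duration_sec(total_measures: int, removed_measures: int,
                           beats_per_measure: int = 4,
                           bpm: float = 140.0) -> float:
    """Length of the main part after whole measures are removed.

    removed_measures is the combined number of prelude, interlude and
    postlude measures that were deleted (hypothetical parameter).
    """
    kept = total_measures - removed_measures
    if kept <= 0:
        raise ValueError("no measures left after trimming")
    # Each beat lasts 60 / bpm seconds.
    return kept * beats_per_measure * 60.0 / bpm
```

Under these assumptions, a 78-measure piece with 8 measures removed lasts 70 × 4 × 60 / 140 = 120 seconds at 140 BPM.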

なお、楽曲本体ＷＡＶパーツデータ書き出しプログラム２０４は、楽曲配信サーバ１にインストールされて、制御部１１により実行されるようにしても良いし、他の情報処理装置にインストールされて、当該装置上で実行されるようにしても良い。 Note that the music body WAV parts data writing program 204 may be installed in the music distribution server 1 and executed by the control unit 11, or installed in another information processing apparatus and executed on the apparatus. You may be made to do.

一方、記憶対象語ＷＡＶパーツデータ３０１は、上記記憶対象語音声としてのＷＡＶデータであり、全てのデータが同一のテンポ（本実施形態においては、楽曲本体ＷＡＶパーツデータ１０１と同一の１４０ＢＰＭ）で記録されている。そして、記憶対象語ＷＡＶパーツデータ３０１は、図３に示す英単語群ＷＡＶデータ３０２（第１語音声情報の一例）、日本語訳群ＷＡＶデータ３０３（第２語音声情報の一例）及び記憶対象語用ドラムベースＷＡＶデータ３０４（リズム音の情報の一例）に基づき、記憶対象語ＷＡＶパーツデータ書き出しプログラム２０５を用いて作成される。ここで、英単語群ＷＡＶデータ３０２、日本語訳群ＷＡＶデータ３０３は、本発明のペア語の情報の一例である。 On the other hand, the storage target word WAV part data 301 is the WAV data as the storage target word sound, and all the data is recorded at the same tempo (in the present embodiment, the same 140 BPM as the music main body WAV part data 101). Has been. The storage target word WAV parts data 301 includes the English word group WAV data 302 (an example of the first word speech information), the Japanese translation group WAV data 303 (an example of the second word speech information) shown in FIG. Based on the word drum base WAV data 304 (an example of rhythm sound information), it is created using the storage target word WAV part data writing program 205. Here, the English word group WAV data 302 and the Japanese translation group WAV data 303 are examples of pair word information of the present invention.

英単語群ＷＡＶデータ３０２は、１章に相当する記憶対象語群の英単語の原音声が記録されたＷＡＶデータである。また、日本語訳群ＷＡＶデータ３０３は、１章に相当する記憶対象語群の日本語訳の原音声が記録されたＷＡＶデータである。また、上記記憶対象語用ドラムベースＷＡＶデータ３０４は、ドラムやシンバル等によるリズム音等が記録されたＷＡＶデータであり、英単語群ＷＡＶデータ３０２及び日本語訳群ＷＡＶデータ３０３の原音声にリズムを付与し、また、当該原音声をジョギング用にアレンジするために用いられるデータである。 The English word group WAV data 302 is WAV data in which the original speech of the English words of the storage target word group corresponding to one chapter is recorded. The Japanese translation group WAV data 303 is WAV data in which the original speech of the Japanese translations of the storage target word group corresponding to one chapter is recorded. The drum base WAV data 304 for storage target words is WAV data in which rhythm sounds from drums, cymbals and the like are recorded; it is used to give rhythm to the original speech of the English word group WAV data 302 and the Japanese translation group WAV data 303 and to arrange that original speech for jogging.

なお、記憶対象語ＷＡＶパーツデータ３０１、英単語群ＷＡＶデータ３０２、日本語訳群ＷＡＶデータ３０３、及び記憶対象語用ドラムベースＷＡＶデータ３０４の詳細、並びに、記憶対象語ＷＡＶパーツデータ書き出しプログラム２０５を用いた記憶対象語ＷＡＶパーツデータ３０１の作成方法については、１．２．２．２項で説明する。 The storage target word WAV parts data 301, the English word group WAV data 302, the Japanese translation group WAV data 303, the details of the drum base WAV data 304 for the storage target word, and the storage target word WAV part data writing program 205 A method of creating the storage target word WAV parts data 301 used will be described in section 1.2.2.2.

ジョギングアレンジ曲間つなぎＷＡＶパーツデータ１０２は、ジョギング用の楽曲を構成する複数の楽曲本体の曲間、又は記憶対象語音声と楽曲本体との間に演奏される楽曲（以下、「曲間部」と称する）のＷＡＶデータ、最初の楽曲本体の前に演奏される前奏のＷＡＶデータ、及び、最後の楽曲本体の演奏又は記憶対象語音声の発声の後に演奏される後奏のＷＡＶデータの総称である。 The jogging arrangement inter-music connection WAV parts data 102 is a collective term for the WAV data of the music played between the music bodies making up the jogging music, or between storage target word speech and a music body (hereinafter referred to as the “inter-song part”); the WAV data of the prelude played before the first music body; and the WAV data of the postlude played after the last music body has been played or the last storage target word speech has been uttered.

ジョギングアレンジ音声ガイダンスＷＡＶパーツデータ１０３は、専門のアドバイザーによる運動指導やアドバイス等の音声が記録されたＷＡＶデータである。 The jogging arrangement voice guidance WAV parts data 103 is WAV data in which voices such as exercise guidance and advice by a professional advisor are recorded.

ＤＪ音声ＷＡＶパーツデータ１０４は、曲間に流されるＤＪ（Disc Jockey）の音声が記録されたＷＡＶデータである。 The DJ audio WAV parts data 104 is WAV data in which audio of DJ (Disc Jockey) that is played between songs is recorded.

なお、以下の説明においては、楽曲本体ＷＡＶパーツデータ１０１を、単に「楽曲本体データ」と称し、記憶対象語ＷＡＶパーツデータ３０１を、単に「記憶対象語音声データ」（記憶用音声情報の一例）と称する。更に、ジョギングアレンジ曲間つなぎＷＡＶパーツデータ１０２を、単に「曲間つなぎデータ」と称する。また、ジョギングアレンジ音声ガイダンスＷＡＶパーツデータ１０３及びＤＪ音声ＷＡＶパーツデータ１０４を、纏めて、単に「音声データ」と称する。 In the following description, the music body WAV parts data 101 is simply referred to as “music body data”, and the storage target word WAV parts data 301 is simply “storage target word audio data” (an example of storage audio information). Called. Further, the jogging arrangement inter-music connection WAV part data 102 is simply referred to as “inter-music connection data”. The jogging arrangement voice guidance WAV parts data 103 and the DJ voice WAV parts data 104 are collectively referred to simply as “voice data”.

パーツＷＡＶデータデータベースプログラム２０１は、楽曲配信サーバ１の制御部１１がパーツＷＡＶデータデータベースを管理するためのプログラムであり、各パーツデータの登録要求に応じて、パーツデータを当該データベースに登録したり、サーバシステムプログラム２０２から要求されたパーツデータを当該データベースから取得して、サーバシステムプログラム２０２に渡すためのプログラムである。 The parts WAV data database program 201 is a program for the control unit 11 of the music distribution server 1 to manage the parts WAV data database. In response to a request for registration of each part data, the parts data is registered in the database. This is a program for acquiring part data requested from the server system program 202 from the database and passing it to the server system program 202.

サーバシステムプログラム２０２は、楽曲配信サーバ１の制御部１１が、パーツＷＡＶデータデータベースから取得されたパーツデータを用いて、ジョギング用楽曲データを作成するためのプログラムである。 The server system program 202 is a program for the control unit 11 of the music distribution server 1 to create jogging music data using part data acquired from the parts WAV data database.

ＷＥＢサイトプログラム２０３は、楽曲配信サーバ１の制御部１１が、作成されたジョギング用楽曲データを配信するＷＥＢサイトとして、ユーザＰＣ２からの要求に応じて、ＷＥＢページやジョギング用楽曲データを送信するためのプログラムである。 The WEB site program 203 is for the control unit 11 of the music distribution server 1 to transmit a WEB page and jogging music data in response to a request from the user PC 2 as a WEB site that distributes the created jogging music data. It is a program.

なお、パーツＷＡＶデータデータベースプログラム２０１、サーバシステムプログラム２０２、ＷＥＢサイトプログラム２０３、楽曲本体ＷＡＶパーツデータ書き出しプログラム２０４及び記憶対象語ＷＡＶデータ書き出しプログラム２０５等は、例えば、図示せぬネットワークを介して他のサーバ装置等から取得されるようにしても良いし、ＣＤ（Compact Disc）−ＲＯＭ等の記録媒体に記録されてドライブ装置等から読み込まれるようにしても良い。 The parts WAV data database program 201, the server system program 202, the WEB site program 203, the music body WAV parts data writing program 204, the storage target word WAV data writing program 205, and the like are, for example, other networks via a network (not shown). It may be acquired from a server device or the like, or may be recorded on a recording medium such as a CD (Compact Disc) -ROM and read from a drive device or the like.

次に、ジョギング用楽曲データの内容を説明する前に、このジョギング用楽曲データの仕様を決定付けるトレーニングコースを選択するためのメニューの構成について説明する。 Next, before describing the contents of the jogging music data, the configuration of a menu for selecting a training course that determines the specifications of the jogging music data will be described.

このトレーニングコースの選択は、楽曲配信サーバ１からユーザＰＣ２に送信されたコース選択用ＷＥＢページに基づいて、ユーザがユーザＰＣ２を操作することにより行われる。 The selection of the training course is performed by the user operating the user PC 2 based on the course selection WEB page transmitted from the music distribution server 1 to the user PC 2.

また、当該トレーニングコースの選択は、基本的には本実施形態に係る記憶対象語音声の再生内容とは無関係に、ユーザの運動能力やそれまでの経験に基づいて実行される。そして、記憶対象語の暗記（すなわち、記憶対象語音声の再生）は、その選ばれたトレーニングコースに相当するジョギング用楽曲データの再生中において、上記ランニングハイ状態となると予測される予め設定されたタイミング以降の期間において実行される。なお、記憶対象語音声の内容やその再生タイミングを含めたトレーニングコースを予め用意しておき、それをトレーニングコースの選択の一環として選択可能に構成することもできる。 The training course is basically selected on the basis of the user's athletic ability and prior experience, independently of the content of the storage target word speech according to the present embodiment. Memorization of the storage target words (that is, reproduction of the storage target word speech) takes place, during reproduction of the jogging music data corresponding to the selected training course, in the period after a preset timing at which the user is predicted to have reached the running-high state. It is also possible to prepare in advance training courses that include the content of the storage target word speech and its reproduction timing, and to make them selectable as part of training course selection.

図４に示すように、トレーニングコースを選択するためのメニューの構成は、例えば、最上位の第１階層から最下位の第４階層までの階層構造をなしている。そして、各階層においては、ユーザの目標別等に応じたメニュー項目が定義されており、上位の階層では、大まかな目標に応じたメニュー項目が定義され、下位の階層になっていくに従って、具体的な目標に応じたメニュー項目が定義されている。 As shown in FIG. 4, the menu structure for selecting a training course has, for example, a hierarchical structure from the highest first hierarchy to the lowest fourth hierarchy. In each hierarchy, menu items are defined according to the user's goals, etc., and in the upper hierarchy, menu items are defined according to the general objective, and as the lower hierarchy, the specific items are specified. Menu items are defined according to specific goals.

先ず、「コース選択」が選択されると、第１階層のメニュー項目として、例えば、「健康維持」、「ダイエット」、「マラソンレース出場」、「タイムアップ」等のユーザの目標に応じるコースメニューがユーザＰＣ２の画面に表示される。そして、例えば、「マラソンレース出場」が選択されると、第２階層のメニュー項目として、「ホノルルマラソン」、「東京マラソン」、「ハーフマラソン」、「１０Ｋｍマラソン」等のコースメニューがユーザＰＣ２の画面に表示される。 First, when “Course selection” is selected, menu items corresponding to the user's goals such as “Health maintenance”, “Diet”, “Marathon race entry”, “Time up”, etc. are displayed as menu items in the first layer. Is displayed on the screen of the user PC 2. Then, for example, when “Marathon race entry” is selected, menu items such as “Honolulu Marathon”, “Tokyo Marathon”, “Half Marathon”, “10 Km Marathon”, etc. are displayed on the user PC 2 as the second-level menu items. Displayed on the screen.

またここで、例えば、「ホノルルマラソン」が選択されると、第３階層のメニュー項目として、「タイムを狙う」、「完走する」等のコースメニューがユーザＰＣ２の画面に表示される。更に、例えば、「タイムを狙う」が選択されると、第４階層のメニュー項目として、「初心者」、「中級者」、「上級者」等のコースメニューがユーザＰＣ２の画面に表示される。 Here, for example, when “Honolulu Marathon” is selected, course menus such as “Aim for time” and “Complete” are displayed on the screen of the user PC 2 as menu items in the third layer. Further, for example, when “Aim for time” is selected, course menus such as “beginner”, “intermediate”, “advanced”, etc. are displayed on the screen of the user PC 2 as menu items in the fourth layer.

そして、第４階層のメニュー項目の中から一のメニュー項目が選択されると、これに一意に対応したトレーニングコースが、楽曲配信サーバ１の制御部１１により決定される。例えば、「マラソンレース出場」〜「ホノルルマラソン」〜「タイムを狙う」〜「初心者」と選択されると、これに対して、「トレーニングコースＡ」が決定される。また、同様にして「中級者」が選択されると、「トレーニングコースＢ」が決定される。また、同様にして「上級者」が選択されると、「トレーニングコースＣ」が決定される。 When one menu item is selected from the menu items in the fourth layer, the control unit 11 of the music distribution server 1 determines the training course that uniquely corresponds to it. For example, when “Marathon race entry”, “Honolulu Marathon”, “Aim for time”, and “Beginner” are selected in that order, “Training course A” is determined. Likewise, when “Intermediate” is selected, “Training course B” is determined, and when “Advanced” is selected, “Training course C” is determined.
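
The one-to-one correspondence between a fourth-layer menu selection and a training course can be sketched as a lookup keyed by the full menu path. This Python fragment is only an illustration: the dictionary `COURSE_BY_MENU_PATH` and the function `decide_course` are hypothetical names, and only the three paths given as examples in the text are filled in.

```python
from typing import Dict, Sequence, Tuple

MenuPath = Tuple[str, str, str, str]

# Only the example paths named in the description are listed here.
COURSE_BY_MENU_PATH: Dict[MenuPath, str] = {
    ("Marathon race entry", "Honolulu Marathon", "Aim for time", "Beginner"):
        "Training course A",
    ("Marathon race entry", "Honolulu Marathon", "Aim for time", "Intermediate"):
        "Training course B",
    ("Marathon race entry", "Honolulu Marathon", "Aim for time", "Advanced"):
        "Training course C",
}


def decide_course(selected: Sequence[str]) -> str:
    """Return the course for a first-to-fourth-layer selection."""
    if len(selected) != 4:
        raise ValueError("a selection from all four layers is required")
    path = tuple(selected)
    if path not in COURSE_BY_MENU_PATH:
        raise KeyError(f"no training course defined for {path}")
    return COURSE_BY_MENU_PATH[path]
```

Keying on the whole path rather than on the last item alone keeps the mapping unique even if labels such as “Beginner” appear under several third-layer items.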

楽曲配信サーバ１の記憶部１２には、上記メニュー構成を定義するメニューデータが記憶されている。また、記憶部１２には、最終的なトレーニングコース毎に、そのコース情報が記憶されている。このコース情報には、そのトレーニングコースの全ステップが定義されていると共に、ステップ毎にそのトレーニングの仕様が定義されている。具体的には、ステップ毎に、対応するトレーニング期間、運動時間の一例としてのジョギング時間（ユーザが走る時間）、ジョギング距離（ユーザが走る距離）、楽曲本体の曲数、初期テンポ（ジョギング用楽曲データの最初の曲のテンポ）、終了テンポ（ジョギング用楽曲データの最後の曲のテンポ）等が定義されている。ここで、ジョギング時間は、ジョギング用楽曲データの、後述するウォームアップ曲及びクールダウン曲の演奏時間を除いた演奏時間を示す時間となる。また、当該トレーニングの仕様自体は、あくまでトレーニングとしての仕様であり、当該仕様には本実施形態に係る記憶対象語音声の数等は含まれていない。 The storage unit 12 of the music distribution server 1 stores menu data that defines the menu configuration. The storage unit 12 stores course information for each final training course. In this course information, all steps of the training course are defined, and the specifications of the training are defined for each step. Specifically, for each step, the corresponding training period, jogging time (user running time) as an example of exercise time, jogging distance (user running distance), number of songs in the music body, initial tempo (music for jogging) The tempo of the first song in the data), the end tempo (the tempo of the last song in the jogging song data), and the like are defined. Here, the jogging time is a time indicating the performance time of the music data for jogging, excluding the performance time of warm-up music and cool-down music described later. In addition, the training specification itself is only a specification as a training, and the specification does not include the number of words to be stored according to the present embodiment.

［１．２．２．１ジョギング用楽曲データの詳細］ [1.2.2.1 Details of the jogging music data]
次に、図５乃至図８を用いて、本実施形態に係るジョギング用楽曲データについて詳細に説明する。 Next, the jogging music data according to the present embodiment will be described in detail with reference to FIGS. 5 to 8.

図５は、あるトレーニングコースの各ステップにおけるジョギング用楽曲データの仕様の一例を示しており、このトレーニングコースは、ステップ１から開始され、基本的には１ヶ月毎にステップが上昇していく。 FIG. 5 shows an example of the specification of jogging music data in each step of a certain training course. This training course starts from step 1 and basically increases in steps every month.

ステップ１のトレーニング期間は１ヶ月目に設定されている。すなわち、ステップ１は、本トレーニングコースを開始してから１ヶ月経過するまでのジョギングを対象としている。そして、そのジョギング時間は３０分に設定され、ジョギング距離は約５Ｋｍに設定されている。また、初期テンポと最終テンポとは、何れも１６０ＢＰＭに設定されている。 The training period of step 1 is set to the first month. That is, Step 1 is intended for jogging from the start of this training course until one month has passed. The jogging time is set to 30 minutes, and the jogging distance is set to about 5 km. The initial tempo and the final tempo are both set to 160 BPM.

これらの情報は、ステップ１では、ウォームアップ曲及びクールダウン曲を除いたジョギング用楽曲データの演奏時間が３０分であり、そのテンポは、１６０ＢＰＭで一定であることを示している。そして、このテンポに対応したペースでユーザが３０分間走ることにより、約５Ｋｍの距離を走ることが想定されている。 These pieces of information indicate that in step 1, the performance time of the jogging music data excluding the warm-up music and the cool-down music is 30 minutes, and the tempo is constant at 160 BPM. Then, it is assumed that the user runs a distance of about 5 km when the user runs for 30 minutes at a pace corresponding to this tempo.

ここで、演奏される楽曲本体の曲数は８曲に設定されているが、これはあくまでも目安であって、選択された楽曲本体の演奏時間によって曲数は変化するものである。 Here, although the number of music pieces to be played is set to eight songs, this is only a guide, and the number of songs changes depending on the performance time of the selected music piece.

また、図５において、ステップ１のメドレー楽曲内容、すなわち、演奏される楽曲本体の内容として「○○○メドレー」（「○○○」は、例えば、アルバムの名称やアーティストの名称等）が示され、他のステップにおいても示されているが、これは選択されたアルバムやアーティスト、或いはジャンル等を例示的に示したものであり、例えば、ステップ１に対して必ずしも「○○○メドレー」が選択されるわけではない。 In FIG. 5, the contents of the medley music in step 1, that is, the contents of the music body to be played are “XXX medley” (“XXX” is, for example, the name of an album or the name of an artist). Although this is also shown in other steps, this shows an example of the selected album, artist, or genre. For example, “XXX medley” is not necessarily included in step 1. It is not selected.

また、図５の例において、ステップ１のメドレーを構成する８曲の楽曲は、夫々芸術音楽であるが、例えば、８曲の一部を芸術音楽のメドレーとし、残りを記憶対象語音声で構成しても良い。 In the example of FIG. 5, the eight music pieces constituting the medley in step 1 are each art music. For example, a part of the eight music pieces are medley of art music, and the rest are constituted by the speech to be stored. You may do it.

次に、ステップ２のトレーニング期間は２ヶ月目に設定されている。すなわち、ステップ２は、本トレーニングコースを開始して１ヶ月が経過してから更に１ヶ月を経過するまでのジョギングを対象としている。そして、そのジョギング時間は３０分に設定され、ジョギング距離は約５．５Ｋｍに設定され、曲数は８曲に設定されている。また、初期テンポは、１６０ＢＰＭに設定され、最終テンポは、１６５ＢＰＭに設定されている。 Next, the training period of step 2 is set to the second month. That is, step 2 covers jogging from one month after the start of this training course until a further month has passed. The jogging time is set to 30 minutes, the jogging distance to about 5.5 km, and the number of pieces to eight. The initial tempo is set to 160 BPM and the final tempo to 165 BPM.

これらの情報は、ステップ２では、ウォームアップ曲及びクールダウン曲を除いたジョギング用楽曲データの演奏時間が３０分であり、そのテンポは、１６０ＢＰＭから１６５ＢＰＭに徐々に上昇することを示している。そして、このテンポに対応したペースでユーザが３０分間走ることにより、約５．５Ｋｍの距離を走ることが想定されている。 These pieces of information indicate that in step 2, the performance time of the jogging music data excluding the warm-up music and the cool-down music is 30 minutes, and the tempo gradually increases from 160 BPM to 165 BPM. Then, it is assumed that the user runs a distance of about 5.5 km by running for 30 minutes at a pace corresponding to this tempo.
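
The gradual rise from the initial tempo to the final tempo can be sketched as follows. The text says only that the tempo rises gradually; the linear interpolation used here, the function name `tempo_at`, and the clamping outside the jogging time are assumptions made for this illustration.

```python
def tempo_at(elapsed_min: float, total_min: float,
             initial_bpm: float, final_bpm: float) -> float:
    """Tempo after elapsed_min minutes, assuming a linear ramp."""
    if total_min <= 0:
        raise ValueError("total_min must be positive")
    # Clamp so that times outside the jogging main part stay in range.
    fraction = min(max(elapsed_min / total_min, 0.0), 1.0)
    return initial_bpm + (final_bpm - initial_bpm) * fraction
```

With the step 2 figures (30 minutes, 160 BPM to 165 BPM), this sketch gives 162.5 BPM at the 15-minute mark.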

ステップ２は、ジョギング時間についてはステップ１と同様であるが、テンポが上昇していくことによって、ステップ１よりも約０．５Ｋｍ長い距離をユーザが走ることが想定されている。 Step 2 has the same jogging time as step 1, but because the tempo rises, the user is assumed to run about 0.5 km farther than in step 1.

以下同様に、ステップ３のトレーニング期間は３ヶ月目に設定されており、ジョギング時間は４５分に設定され、ジョギング距離は約８Ｋｍに設定され、曲数は１２曲に設定されている。そして、初期テンポと最終テンポとは、何れも１６５ＢＰＭに設定されている。 Similarly, the training period of step 3 is set to the third month, the jogging time is set to 45 minutes, the jogging distance is set to about 8 km, and the number of songs is set to 12. The initial tempo and final tempo are both set to 165 BPM.

また、ステップ４のトレーニング期間は４ヶ月目に設定されており、ジョギング時間は４５分に設定され、ジョギング距離は約８．５Ｋｍに設定され、曲数は１２曲に設定されている。そして、初期テンポは１６５ＢＰＭに設定され、最終テンポは１７０ＢＰＭに設定されている。 The training period in step 4 is set to the fourth month, the jogging time is set to 45 minutes, the jogging distance is set to about 8.5 km, and the number of songs is set to 12. The initial tempo is set to 165 BPM, and the final tempo is set to 170 BPM.

また、ステップ５のトレーニング期間は５ヶ月目に設定されており、ジョギング時間は６０分に設定され、ジョギング距離は約１１Ｋｍに設定され、曲数は１６曲に設定されている。そして、初期テンポと最終テンポとは、何れも１７０ＢＰＭに設定されている。 The training period in step 5 is set to the fifth month, the jogging time is set to 60 minutes, the jogging distance is set to about 11 km, and the number of songs is set to 16 songs. The initial tempo and the final tempo are both set to 170 BPM.

また、ステップ６のトレーニング期間は６ヶ月目に設定されており、ジョギング時間は６０分に設定され、ジョギング距離は約１２Ｋｍに設定され、曲数は１６曲に設定されている。そして、初期テンポは１７０ＢＰＭに設定され、最終テンポは１７５ＢＰＭに設定されている。 The training period of step 6 is set to the sixth month, the jogging time is set to 60 minutes, the jogging distance is set to about 12 km, and the number of songs is set to 16 songs. The initial tempo is set to 170 BPM and the final tempo is set to 175 BPM.

図６に示すように、各ステップにおけるジョギング用の楽曲は、楽曲本体と前奏、後奏、曲間部、ガイド音声、ＤＪ音声によって構成されるジョギング本編と、ジョギング本編の前に再生されるウォームアップ曲と、ジョギング本編の後に再生されるクールダウン曲と、により構成されている。 As shown in FIG. 6, jogging music in each step includes a jogging main part composed of a music main body, a prelude, a postlude, an inter-musical part, a guide voice, and a DJ voice, and a warm played before the jogging main part. It consists of an up song and a cool down song played after the main jogging.
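
The ordering of the parts can be sketched as a simple sequence builder. This Python fragment is a hypothetical illustration: the function `build_sequence` and the string labels are invented for the example, and the guide voice and DJ voice are left out because their exact placement is not stated at this point in the description.

```python
from typing import List


def build_sequence(bodies: List[str]) -> List[str]:
    """Order the parts of one jogging session.

    bodies lists the music bodies (art music or storage target word
    speech) in playing order.
    """
    if not bodies:
        raise ValueError("at least one music body is required")
    main = ["prelude"]
    for index, body in enumerate(bodies):
        if index > 0:
            main.append("inter-song part")  # joins consecutive bodies
        main.append(body)
    main.append("postlude")
    # The jogging main part sits between the warm-up and cool-down pieces.
    return ["warm-up piece"] + main + ["cool-down piece"]
```

For two bodies the result is warm-up piece, prelude, first body, inter-song part, second body, postlude, cool-down piece.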

なお、図６中、（ａ）、（ｂ）、（ｃ）、（ｄ）、（ｅ）、（ｆ）は、図５において例示したステップ１、ステップ２、ステップ３、ステップ４、ステップ５、ステップ６に夫々対応している。 In FIG. 6, (a), (b), (c), (d), (e), and (f) are the same as Step 1, Step 2, Step 3, Step 4, and Step 5 illustrated in FIG. , Corresponding to step 6 respectively.

ウォームアップ曲は、ユーザがジョギングを開始する前に、その準備運動等のウォーミングアップを行う際に再生されることを想定した楽曲であり、例えば、テンポが徐々に上昇して、ユーザの気分を次第に盛り上げるような楽曲が選定される。 A warm-up song is a song that is supposed to be played when warming up such as a preparatory exercise before the user starts jogging. For example, the tempo gradually increases and the user's mood gradually increases. A song that excites is selected.

また、クールダウン曲は、ユーザがジョギングを終えた後に、心身を平静に戻すクーリングダウンを行う際に再生されることを想定した楽曲であり、例えば、テンポが徐々に下降して、次第に落ち着いていくような雰囲気の楽曲が選定される。 The cool-down song is a song intended to be played while the user cools down after jogging, returning mind and body to a calm state. For example, a song whose tempo gradually falls and whose mood progressively settles is selected.

ウォームアップ曲及びクールダウン曲の楽曲データは、ＭＰ３フォーマットで記憶部１２に記憶されている。 The music data of the warm-up music and the cool-down music are stored in the storage unit 12 in the MP3 format.

このウォームアップ曲及びクールダウン曲としては、全ステップを通じて同一の楽曲を用いることが可能であり、例えば、ユーザＰＣ２が楽曲配信サーバ１から最初にジョギング用楽曲データをダウンロードするときには、ジョギング本編とウォームアップ曲及びクールダウン曲の全ての楽曲データを含むジョギング用楽曲データがダウンロードされるが、２度目以降のダウンロードの際には、ジョギング本編の楽曲データのみのジョギング用楽曲データがダウンロードされる。 The same song can be used as the warm-up song and the cool-down song throughout all steps. For example, when the user PC 2 downloads jogging music data from the music distribution server 1 for the first time, jogging music data containing all of the music data, namely the jogging main part, the warm-up song, and the cool-down song, is downloaded. From the second download onward, jogging music data containing only the music data of the jogging main part is downloaded.

なお、２度目以降のダウンロードの際においても、ユーザの選択により、最初にダウンロードされたウォームアップ曲及びクールダウン曲とは異なるウォームアップ曲及びクールダウン曲がダウンロードされようにしても良い。 In the second and subsequent downloads, a warm-up song and a cool-down song different from the first downloaded warm-up song and cool-down song may be downloaded according to the user's selection.

また、図６に示すように、ジョギング時間、すなわち、ジョギング本編の演奏時間は、２ステップ毎に上昇する。その一方で、ジョギング本編のテンポは、奇数ステップにおいては演奏開始から終了まで一定であり、偶数ステップにおいては、徐々に上昇する。また、各ステップにおける最終テンポと、その次のステップにおけるテンポとは一致している。 Further, as shown in FIG. 6, the jogging time, that is, the performance time of the jogging main part increases every two steps. On the other hand, the tempo of the jogging main part is constant from the start to the end of the performance in the odd steps and gradually increases in the even steps. In addition, the final tempo in each step matches the tempo in the next step.
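
The pattern described for FIG. 6 can be checked mechanically. The following is a minimal Python sketch, not the patent's implementation: the `Step` class and `check_progression` function are hypothetical names, and only the values stated in the text for steps 3 to 6 are encoded.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    number: int
    minutes: int        # jogging time
    distance_km: float  # approximate jogging distance
    songs: int
    initial_bpm: int
    final_bpm: int


# Values for steps 3 to 6 as stated in the text.
STEPS = [
    Step(3, 45, 8.0, 12, 165, 165),
    Step(4, 45, 8.5, 12, 165, 170),
    Step(5, 60, 11.0, 16, 170, 170),
    Step(6, 60, 12.0, 16, 170, 175),
]


def check_progression(steps):
    """Return True if the steps follow the tempo pattern described for FIG. 6."""
    for step in steps:
        if step.number % 2 == 1:
            # Odd steps keep a constant tempo from start to end.
            if step.final_bpm != step.initial_bpm:
                return False
        elif step.final_bpm <= step.initial_bpm:
            # Even steps must end faster than they start.
            return False
    # The final tempo of each step equals the starting tempo of the next.
    return all(a.final_bpm == b.initial_bpm for a, b in zip(steps, steps[1:]))
```

Calling `check_progression(STEPS)` confirms that the four listed steps are consistent with the odd/even tempo rule and with the tempo hand-over between consecutive steps.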

ステップが上がっていくに従って（トレーニング期間が時間的に未来にあるほど）、ユーザのジョギング能力は向上していくものと考えられることから、それに合わせて、ジョギング時間及びテンポが上昇していく。また、目安ではあるが、ステップが上がっていくに従って曲数も増えていく。そして、このジョギング時間及びテンポは、トレーニングコース毎に決定され、更には、トレーニングコース内のステップ毎に決定されるようになっており、具体的には、例えば、マラソン等の専門家により策定された運動理論や方針等に基づいて、予めトレーニングコース毎に設定されている。 As the steps advance (that is, the later the training period), the user's jogging ability is expected to improve, so the jogging time and tempo rise accordingly. As a rough guide, the number of songs also increases as the steps advance. The jogging time and tempo are determined for each training course and, further, for each step within the course. Specifically, they are set in advance for each training course based on, for example, exercise theory and policies formulated by experts in marathon running and similar fields.

図７は、各ジョギング時間におけるジョギング用楽曲データの概要構成の一例を示す図である。 FIG. 7 is a diagram showing an example of a schematic configuration of jogging music data at each jogging time.

本実施形態においては、ジョギング時間に応じて、ジョギング用楽曲データのジョギング本編部分が、１又は複数作成される。これは、次に述べる理由による。 In the present embodiment, one or more jogging main parts of the jogging music data are created according to the jogging time. This is for the following reason.

すなわち、ジョギング時間が長いほど一般にジョギング本編部分のデータサイズも大きくなり、また、そのデータのダウンロードに要する時間も長くなる。そして、ダウンロード時間が長いほど、何らかの理由（例えば、コネクションの切断等）でダウンロードが途中で失敗する可能性が高くなる。ダウンロードが失敗してしまったら、また最初からダウンロードを行わなければならず、ダウンロード時間が更に長くなってしまう。そこで、ジョギング時間が長い場合には、ジョギング本編部分のデータを複数に分けて作成することで、ダウンロードも複数回に分けて行われることとなる。よって、ダウンロードが途中で失敗しても、既にダウンロードした部分については再度ダウンロードする必要がなくなる。 That is, the longer the jogging time, the larger the data size of the main part of the jogging, and the longer the time required for downloading the data. The longer the download time, the higher the possibility that the download will fail for some reason (for example, disconnection of connection). If the download fails, you will have to download from the beginning again, and the download time will be longer. Therefore, when the jogging time is long, the data of the main part of the jogging is created in a plurality of parts, so that the downloading is performed in a plurality of times. Therefore, even if the download fails in the middle, there is no need to download the already downloaded portion again.

具体的には、図７に示すように、ジョギング時間が１５分である場合には、１５分のファイル１個でジョギング本編部分が構成される（図７（ａ））。また、ジョギング時間が３０分である場合には、３０分のファイル１個でジョギング本編部分が構成される（図７（ｂ））。また、ジョギング時間が４５分である場合には、３０分のファイル１個と１５分のファイル１個の合計２個のファイルでジョギング本編部分が構成される（図７（ｃ））。 Specifically, as shown in FIG. 7, when the jogging time is 15 minutes, the jogging main part is composed of one 15-minute file (FIG. 7A). When the jogging time is 30 minutes, the main part of the jogging is composed of one 30-minute file (FIG. 7B). When the jogging time is 45 minutes, the main part of the jogging is composed of a total of two files, one 30-minute file and one 15-minute file (FIG. 7C).

また、ジョギング時間が６０分である場合には、３０分のファイル２個でジョギング本編部分が構成される（図７（ｄ））。また、ジョギング時間が９０分である場合には、３０分のファイル３個でジョギング本編部分が構成される（図７（ｅ））。また、ジョギング時間が１２０分である場合には、３０分のファイル４個でジョギング本編部分が構成される（図７（ｆ））。 When the jogging time is 60 minutes, the main part of the jogging is composed of two 30-minute files (FIG. 7 (d)). If the jogging time is 90 minutes, the main part of the jogging is composed of three 30-minute files (FIG. 7 (e)). When the jogging time is 120 minutes, the main part of the jogging is composed of four 30-minute files (FIG. 7 (f)).
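
The six cases of FIG. 7 are consistent with a simple rule: fill the jogging time with 30-minute files and put any remainder in one shorter file. The sketch below assumes that rule; the function name `split_main_part` and the `max_file_minutes` parameter are illustrative and do not appear in the source.

```python
def split_main_part(minutes, max_file_minutes=30):
    """Split a jogging time into file lengths of at most max_file_minutes."""
    if minutes <= 0:
        raise ValueError("jogging time must be positive")
    full, rest = divmod(minutes, max_file_minutes)
    # Full-length files first, then one shorter file for any remainder.
    return [max_file_minutes] * full + ([rest] if rest else [])
```

For the 45-minute case this yields one 30-minute file followed by one 15-minute file, matching FIG. 7(c). Jogging times other than the six listed are not covered by the text, so the result for them is only an extrapolation.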

ここで、ジョギング本編部分が複数のファイルで構成される場合であって、初期テンポと最終テンポとが異なるときには、再生順において最初のファイルの最初の曲が初期テンポで再生され、複数のファイルにまたがって徐々にテンポが上昇し、最後のファイルの最後の曲が最終テンポで再生されるように、各ファイルが構成される。 Here, when the jogging main part consists of multiple files and the initial tempo differs from the final tempo, the files are configured so that the first song of the first file in playback order is played at the initial tempo, the tempo rises gradually across the files, and the last song of the last file is played at the final tempo.
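
One way to picture a tempo ramp that spans several files is to assign each song a tempo by linear interpolation over the whole playback order. This is only a sketch under assumptions: the source does not say the ramp is linear or stepped per song, and the function name `song_tempos` and the per-file song counts in the usage note are hypothetical.

```python
def song_tempos(songs_per_file, initial_bpm, final_bpm):
    """Assign a tempo to every song, ramping linearly across all files."""
    total = sum(songs_per_file)
    denominator = max(total - 1, 1)  # avoid dividing by zero for one song
    tempos = [
        initial_bpm + (final_bpm - initial_bpm) * i / denominator
        for i in range(total)
    ]
    # Regroup the flat tempo list into one list per file.
    files, position = [], 0
    for count in songs_per_file:
        files.append(tempos[position:position + count])
        position += count
    return files
```

For example, `song_tempos([8, 8], 170, 175)` models step 6 (16 songs, 170 to 175 BPM) under the assumption that the songs are divided evenly between the two 30-minute files: the first song of the first file gets 170 BPM and the last song of the second file gets 175 BPM.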

図８は、ジョギング用楽曲データのジョギング本編部分の１ファイルの概要構成を示す図である。 FIG. 8 is a diagram showing a schematic configuration of one file of the jogging main part of the music data for jogging.

図８に示すように、ジョギング用楽曲データのジョギング本編は、最初に前奏が再生された後、１曲目の楽曲本体が再生される。そして、曲間部が再生された後、２曲目の楽曲本体が再生される。以下同様にして、全ての楽曲本体が再生された後、後奏が再生される。つまり、ジョギング本編は、前奏の曲間つなぎデータ、１曲目の楽曲本体データ、曲間部の曲間つなぎデータ、２曲目の楽曲本体データ、曲間部の曲間つなぎデータ…Ｎ曲目の曲本体データ、後奏の曲間つなぎデータの順に再生されるように、楽曲配信サーバ１の制御部１１により構成される。 As shown in FIG. 8, in the jogging main part of the jogging music data, the prelude is played first, followed by the first music body. After an inter-music part is played, the second music body is played. The same pattern continues until all music bodies have been played, after which the postlude is played. In other words, the control unit 11 of the music distribution server 1 configures the jogging main part so that playback proceeds in this order: inter-music connection data for the prelude, music body data of the first song, inter-music connection data for an inter-music part, music body data of the second song, inter-music connection data for an inter-music part, ..., music body data of the Nth song, and inter-music connection data for the postlude.
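
The playback order of FIG. 8 can be written as a short sequence builder. This is an illustrative sketch; the function name `build_main_part` and the string labels are placeholders, not identifiers from the source.

```python
def build_main_part(n_songs):
    """Return the playback order of one jogging main part with n_songs songs."""
    if n_songs < 1:
        raise ValueError("at least one song is required")
    sequence = ["prelude"]
    for i in range(1, n_songs + 1):
        sequence.append(f"song {i}")
        if i < n_songs:
            # An inter-music part separates consecutive music bodies.
            sequence.append("inter-music part")
    sequence.append("postlude")
    return sequence
```

With N songs the sequence has N music bodies, N - 1 inter-music parts, one prelude, and one postlude, so `build_main_part(12)` returns 25 entries.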

このとき、ジョギング本編は、例えば、図５に示すトレーニングコースのステップ１の場合には、１６０ＢＰＭ一定のテンポで各楽曲データが再生されるように構成され、例えば、ステップ２の場合には、１６０ＢＰＭから１６５ＢＰＭまでテンポが徐々に上昇するように楽曲データが再生されるように構成される。 At this time, the jogging main part is configured so that, for example, in step 1 of the training course shown in FIG. 5, each piece of music data is played at a constant tempo of 160 BPM, and in step 2, the music data is played with the tempo rising gradually from 160 BPM to 165 BPM.

そして、ガイダンス音声の音声データが、楽曲数等に応じて予め定められた順番の曲間部の曲間つなぎデータが再生されている間に再生されるようにジョギング本編は構成される。また、ジョギング本編の演奏が開始されてから、規定の時間が経過する都度（例えば、５分、１０分、２０分、…）、その時点での経過時間とこれまでの消費カロリーをユーザに対して告知する音声データが再生されるようにジョギング本編は構成される。 Then, the jogging main part is configured such that the voice data of the guidance voice is reproduced while the inter-music connection data of the inter-music parts in the predetermined order according to the number of music is reproduced. Also, every time a specified time elapses after the performance of the main jogging starts (for example, 5 minutes, 10 minutes, 20 minutes,...), The elapsed time at that time and the calorie consumption so far are given to the user. The jogging main part is configured so that the voice data to be notified is reproduced.

また、トレーニングコースの選択の際に英単語の記憶学習を行うことが選択された場合や、楽曲本体の選択の際に記憶対象語音声が選択された場合等には、ジョギング本編において、記憶対象語音声が再生される。この場合、例えば、ジョギング本編の再生開始から、ランニングハイ状態となると予測される予め設定されたタイミング以降では、楽曲本体データではなく、選択された記憶対象語音声データが再生される。 In addition, when memorization learning of English words is selected during training course selection, or when a storage target word voice is selected during music body selection, the storage target word voice is played in the jogging main part. In this case, for example, from a preset timing after the start of playback of the jogging main part at which the user is predicted to reach a runner's-high state, the selected storage target word voice data is played instead of music body data.

なお、第３階層のメニュー項目が選択された時点で最終的なトレーニングコースを決定し、第４階層のメニュー項目が選択された時点でトレーニングを開始させるステップを決定しても良い。例えば、「初心者」が選択された場合には、トレーニングコースＡのステップ１が開始ステップとして決定され、「中級者」が選択された場合には、トレーニングコースＡのステップ３が開始ステップとして決定され、「上級者」が選択された場合には、トレーニングコースＡのステップ５が開始ステップとして決定されるようにしても良い。 Note that the final training course may be determined when the third layer menu item is selected, and the step of starting the training may be determined when the fourth layer menu item is selected. For example, when “beginner” is selected, step 1 of training course A is determined as the start step, and when “intermediate” is selected, step 3 of training course A is determined as the start step. When “advanced” is selected, step 5 of training course A may be determined as the start step.

［１．２．２．２記憶対象語音声データの詳細及び作成方法］ [1.2.2.2 Details of storage target word voice data and creation method]
次に、本実施形態に係る記憶対象語音声データ（記憶対象語ＷＡＶパーツデータ３０１）とその作成方法について、図９乃至図１４を用いて詳細に説明する。なお、以下の説明においては、英単語及び日本語訳の発音を、カナで表記する。また、英単語のアクセント部分を示す場合、アクセントのある母音を含むカナ文字をアクセント部分として示す。 Next, the storage target word voice data (storage target word WAV part data 301) according to the present embodiment and the method of creating it will be described in detail with reference to FIGS. 9 to 14. In the following description, the pronunciations of English words and their Japanese translations are written in kana. When the accented part of an English word is indicated, the kana character containing the accented vowel is shown as the accented part.

図９は、１章の記憶対象語音声データのデータ構造の概要例を示す図である。つまり、図９は、１曲分の楽曲に相当する記憶対象語音声データのデータ構造の概要例を示す。 FIG. 9 is a diagram showing a schematic example of the data structure of the storage target word voice data for one chapter. That is, FIG. 9 shows a schematic example of the data structure of storage target word voice data corresponding to one song.

図９に示すように、記憶対象語音声データは、リズム音のパートと、英単語の発声音のパートと、日本語訳の発声音のパートとにより構成されている。 As shown in FIG. 9, the storage target word speech data is composed of a rhythm sound part, an English utterance part, and a Japanese translation utterance part.

リズム音のパートでは、１４０ＢＰＭのテンポに合わせてリズム音が発音される。また、このリズム音の拍子は、４／４拍子である。つまり、１小節には４拍が含まれており、１拍に対して４分音符が割り当てられる。この１小節の最初の拍から最後の拍までを、順に「第１拍」、「第２拍」、「第３拍」及び「第４拍」と称する。第１拍は強拍であり、第２拍、第３拍及び第４拍は夫々弱拍である。そして、各拍のタイミングに合わせてリズム音が発音される。なお、第１〜第４の全ての拍のタイミングでリズム音が発音されなくても良い。例えば、第１拍のタイミングでのみリズム音が発音されるようにしても良い。 In the rhythm sound part, the rhythm sound is pronounced at a tempo of 140 BPM. The time signature of this rhythm sound is 4/4. That is, one measure includes four beats, and a quarter note is assigned to one beat. The first beat to the last beat of one measure are referred to as “first beat”, “second beat”, “third beat”, and “fourth beat” in this order. The first beat is a strong beat, and the second, third, and fourth beats are weak beats, respectively. And a rhythm sound is pronounced according to the timing of each beat. Note that the rhythm sound does not have to be generated at the timing of all the first to fourth beats. For example, a rhythm sound may be generated only at the timing of the first beat.

英単語及び日本語訳の発声音のパートでは、２５単語分の英単語とその日本語訳の発声音が発音される。つまり、本実施形態では、２５個の記憶対象語が、１章の記憶対象語群に相当する。また、２５単語における１単語毎に、英単語とその日本語訳の発声音が４回繰り返し発音される。例えば、英単語が「book」であり、その日本語訳が「本」である場合には、「book 本 book 本 book 本 book 本」と発音される。続いて、例えば、「reserve 予約する reserve 予約する reserve 予約する reserve 予約する」と発音される。このように、英単語と日本語訳とが交互に発音される。 In the English word and Japanese translation voice parts, 25 English words and their Japanese translations are uttered. That is, in the present embodiment, 25 storage target words make up the storage target word group of one chapter. For each of the 25 words, the English word and its Japanese translation are uttered four times in succession. For example, when the English word is "book" and its Japanese translation is 「本」 ("hon"), the utterance is "book hon book hon book hon book hon". This is followed by, for example, "reserve yoyakusuru reserve yoyakusuru reserve yoyakusuru reserve yoyakusuru". In this way, the English word and its Japanese translation are uttered alternately.

上記の４回の繰り返しにおける記憶対象語の１回分の発音に対して、１小節が割り当てられる。つまり、１個の記憶対象語に対して４小節が割り当てられる。そして、記憶対象語は２５個あるので、全体として１００小節が割り当てられる。従って、１章の記憶対象語音声を楽曲としてみた場合、記憶対象語音声は、１曲が１００小節の楽曲ということになる。 One measure is assigned to one pronunciation of the storage target word in the above four repetitions. That is, four measures are assigned to one storage target word. Since there are 25 words to be stored, 100 bars are assigned as a whole. Therefore, when the storage target word sound of Chapter 1 is viewed as a song, the storage target word sound is a song of 100 bars.
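
The measure count above, together with the 140 BPM tempo and 4/4 time stated for the rhythm part, also fixes the playing time of one chapter. The following Python sketch works through that arithmetic; the function names are illustrative only.

```python
def chapter_measures(words=25, repetitions=4):
    """One measure per utterance pair, repeated `repetitions` times per word."""
    return words * repetitions


def chapter_seconds(measures, bpm=140, beats_per_measure=4):
    """Playing time in seconds: beats divided by beats per second."""
    return measures * beats_per_measure * 60 / bpm
```

`chapter_measures()` gives 100 measures, and `chapter_seconds(100)` gives 400 beats at 140 BPM, which is roughly 171 seconds, or a little under three minutes per chapter at the base tempo.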

また、英単語の発声音のために１小節中の第１拍目と第２拍目とが割り当てられる。また、日本語訳の発声音のために１小節中の第３拍目と第４拍目とが割り当てられる。実際の割り当ては、英単語や日本語訳の発音時間等によって変化する。英単語の場合、第１拍目は必ず割り当てられるが、第２拍目は、割り当てられる場合と割り当てられない場合とがある。また同様に、日本語訳の場合、第３拍目は必ず割り当てられるが、第４拍目は、割り当てられる場合と割り当てられない場合とがある。 The first and second beats of a measure are allotted to the utterance of the English word, and the third and fourth beats to the utterance of the Japanese translation. The actual allotment varies with, for example, the utterance time of the English word or the Japanese translation. For an English word, the first beat is always allotted, while the second beat may or may not be. Likewise, for a Japanese translation, the third beat is always allotted, while the fourth beat may or may not be.

そして、本実施形態における記憶対象語音声データは、第１拍のタイミングで英単語のアクセントの部分が発音されるように構成される。具体的に、拍のタイミングとは、その拍でリズム音が発音される場合においては、リズム音の発音開始のタイミング、例えば、打楽器等であればその打点のタイミングをいう。別の表現を用いると、１小節を４等分した各期間を拍の期間とすれば、拍のタイミングとは、その拍の期間の開始時である。この場合、拍の期間の開始時にリズム音の発音が開始される。 The storage target word speech data in this embodiment is configured such that the accented part of the English word is pronounced at the timing of the first beat. Specifically, the beat timing refers to the timing of the start of rhythm sound generation when a rhythm sound is generated at that beat, for example, the timing of the hit point for a percussion instrument or the like. In other words, if each period obtained by dividing one measure into four equal parts is a beat period, the beat timing is the start of the beat period. In this case, the pronunciation of the rhythm sound is started at the start of the beat period.

具体例を次に示す。例えば、「book」の発音を「ブック」とすると、アクセントのある「ブ」が第１拍のタイミングで発音される。また、「reserve」の発音を「リザーブ」とすると、アクセントのある「ザ」が第１拍のタイミングで発音される。英語は強勢アクセントのある言語である。従って、このような言語の場合、拍のタイミングで英単語のアクセント部分が発音されるようにすることで、英単語がリズム良く発音される。 A specific example follows. If the pronunciation of "book" is written "bukku" (ブック), the accented "bu" is uttered at the timing of the first beat. If the pronunciation of "reserve" is written "rizābu" (リザーブ), the accented "za" is uttered at the timing of the first beat. English is a stress-accent language. For such a language, having the accented part of the word fall on the beat makes the English word sound rhythmical.

その一方で、本実施形態における記憶対象語音声データは、第３拍のタイミングで日本語訳の語頭の部分が発音されるように構成される。 On the other hand, the storage target word speech data in the present embodiment is configured such that the beginning part of the Japanese translation is pronounced at the timing of the third beat.

例えば、「本」の発音は「ホン」であるので、語頭の「ホ」が第３拍のタイミングで発音される。また、「予約する」の発音は「ヨヤクスル」であるので、語頭の「ヨ」が第３拍のタイミングで発音される。なおこの場合、「ス」が第４拍のタイミングで発音される。日本語は、高低アクセントはあるが、強勢アクセントは一般的には無いとされている。このような言語の場合、拍のタイミングで語頭部分が発音されるようにすることで、リズム良く発音される。なお、日本語訳の発音に対して意図的に強勢アクセントを付けても良く、この場合、拍のタイミングでアクセント部分が発音されるようにしても良い。 For example, 「本」 is pronounced "hon", so the word-initial "ho" is uttered at the timing of the third beat. 「予約する」 is pronounced "yoyakusuru", so the word-initial "yo" is uttered at the timing of the third beat; in this case, "su" is uttered at the timing of the fourth beat. Japanese has a pitch accent but is generally considered to have no stress accent. For such a language, having the beginning of the word fall on the beat makes it sound rhythmical. A stress accent may also be deliberately placed on the pronunciation of the Japanese translation, in which case the accented part may be uttered at the timing of the beat.
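
The beat timings referred to above follow directly from the tempo. The sketch below computes the onset time of any beat, taking a beat's timing to be the start of its period as the text defines it. `beat_onset` and `pair_onsets` are hypothetical helper names, and measures and beats are numbered from 1.

```python
def beat_onset(measure, beat, bpm=140, beats_per_measure=4):
    """Seconds from the start of the data to the start of the given beat."""
    beat_seconds = 60 / bpm
    return ((measure - 1) * beats_per_measure + (beat - 1)) * beat_seconds


def pair_onsets(measure, bpm=140):
    """Onsets for one measure: English accent on beat 1, Japanese start on beat 3."""
    return {
        "english_accent": beat_onset(measure, 1, bpm),
        "japanese_start": beat_onset(measure, 3, bpm),
    }
```

At 140 BPM one beat lasts about 0.43 seconds, so in the first measure the accented "bu" of "bukku" would fall at 0 seconds and the "ho" of "hon" at about 0.86 seconds.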

図１０は、記憶対象語音声データの作成例を示す図である。前述したように、記憶対象語音声データは、英単語群ＷＡＶデータ３０２、日本語訳群ＷＡＶデータ３０３及び記憶対象語用ドラムベースＷＡＶデータ３０４に基づいて作成される。 FIG. 10 is a diagram illustrating an example of creating the storage target word speech data. As described above, the storage target word sound data is created based on the English word group WAV data 302, the Japanese translation group WAV data 303, and the storage target word drum base WAV data 304.

図１０に示すように、記憶対象語用ドラムベースＷＡＶデータ３０４は、記憶対象語音声のリズム音のパートに相当するＷＡＶデータである。ただし、記憶対象語用ドラムベースＷＡＶデータ３０４は、リズム音とともに、ハーモニーやメロディを伴ったＷＡＶデータであっても良い。また、英単語群ＷＡＶデータ３０２は、記憶対象語音声の英単語の発声音のパートに相当するＷＡＶデータである。この英単語群ＷＡＶデータ３０２は、各英単語のアクセントの発音タイミングが各小節の第１拍のタイミングに合うように構成されている。また、日本語訳群ＷＡＶデータ３０３は、記憶対象語音声の日本語訳の発声音のパートに相当するＷＡＶデータである。この日本語訳群ＷＡＶデータ３０３は、各日本語訳の発音開始タイミングが各小節の第３拍のタイミングに合うように構成されている。 As shown in FIG. 10, the drum base WAV data 304 for the storage target word is WAV data corresponding to the rhythm sound part of the storage target word speech. However, the drum base WAV data 304 for the storage target word may be WAV data accompanied by a rhythm sound and a harmony or a melody. The English word group WAV data 302 is WAV data corresponding to the utterance sound part of the English word of the storage target speech. The English word group WAV data 302 is configured so that the accent sounding timing of each English word matches the timing of the first beat of each measure. The Japanese translation group WAV data 303 is WAV data corresponding to the utterance part of the Japanese translation of the speech to be stored. The Japanese translation group WAV data 303 is configured such that the pronunciation start timing of each Japanese translation matches the timing of the third beat of each measure.

生成手段としての制御部１１は、上記３個のＷＡＶデータを混合（ミキシング）して記憶対象語音声データを作成する。このとき、制御部１１は、英単語群ＷＡＶデータ３０２における各英単語のアクセントの発音タイミングが記憶対象語用ドラムベースＷＡＶデータ３０４におけるリズム音の各小節の第１拍に合うように調整する。また、制御部１１は、日本語訳群ＷＡＶデータ３０３における各日本語訳の発音開始タイミングが記憶対象語用ドラムベースＷＡＶデータ３０４におけるリズム音の各小節の第３拍に合うように調整する。英単語群ＷＡＶデータ３０２、日本語訳群ＷＡＶデータ３０３及び記憶対象語用ドラムベースＷＡＶデータ３０４が、１４０ＢＰＭの１００小節で予め作成されていれば、制御部１１の行う調整としては、混合時の各ＷＡＶデータの再生開始位置を互いに一致させれば良い。 The control unit 11, serving as generation means, mixes the three pieces of WAV data to create the storage target word voice data. In doing so, the control unit 11 adjusts the accent utterance timing of each English word in the English word group WAV data 302 so that it matches the first beat of each measure of the rhythm sound in the drum base WAV data 304 for the storage target word. The control unit 11 also adjusts the utterance start timing of each Japanese translation in the Japanese translation group WAV data 303 so that it matches the third beat of each measure of the rhythm sound in the drum base WAV data 304 for the storage target word. If the English word group WAV data 302, the Japanese translation group WAV data 303, and the drum base WAV data 304 for the storage target word have been created in advance as 100 measures at 140 BPM, the only adjustment the control unit 11 needs to make is to align the playback start positions of the WAV data with one another when mixing.
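
When all three tracks already share the same tempo and length, mixing reduces to adding samples from a common start position. The following is a minimal sketch of that idea, assuming 16-bit integer PCM samples at a common sample rate held in plain Python lists. The source does not specify how the control unit 11 performs the mix, and `mix_tracks` is a hypothetical name.

```python
def mix_tracks(*tracks, limit=32767):
    """Sum sample lists aligned at index 0 and clip to the 16-bit range."""
    length = max(len(track) for track in tracks)
    mixed = []
    for i in range(length):
        # Tracks shorter than the longest one contribute silence past their end.
        total = sum(track[i] for track in tracks if i < len(track))
        mixed.append(max(-limit - 1, min(limit, total)))
    return mixed
```

Here the three arguments would stand for the English word, Japanese translation, and drum base tracks. Clipping is the simplest way to stay within range; a real mixer would more likely scale the inputs to avoid distortion.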

英単語において、語頭にアクセントがある場合、発音タイミングが上述したとおりで問題はない。ここで、「語頭にアクセントがある」とは、先頭の母音にアクセントがある場合（例えば、「item」の「i」等）、及び、先頭の子音とアクセントのある母音とが一体的に発音される場合（例えば、「book」の「boo」等）を含む。その一方で、語頭にアクセントがない英単語については、アクセント部分よりも前の発音部分が直前の小節の第４拍の期間に入ることとなる。第４拍には日本語訳が割り当てられるので、日本語訳の発声音と英単語の発声音との間隔が非常に短くなったり、又は、日本語訳の発声音と英単語の発声音とが一部重ならざるを得なくなる場合がある。 If there is an accent at the beginning of an English word, the pronunciation timing is as described above and there is no problem. Here, “the accent is at the beginning of the word” means that the leading vowel has an accent (for example, “item” “i”, etc.), and that the leading consonant and the accented vowel are pronounced together. (For example, “boo” of “book”). On the other hand, for English words with no accent at the beginning, the pronunciation part before the accent part enters the period of the fourth beat of the immediately preceding measure. Since the Japanese translation is assigned to the 4th beat, the interval between the utterance sound of the Japanese translation and the utterance sound of the English word becomes very short, or the utterance sound of the Japanese translation and the utterance sound of the English word May have to be partially overlapped.

そこで、本実施形態においては、日本語訳の発音終了から当該日本語訳の次に発音される英単語の発音開始までの間隔（以下、「日本語訳と英単語との時間間隔」と称する）が所定時間以上となるように、日本語訳と英単語との時間間隔が所定時間未満となる英単語の発音タイミングが第１拍のタイミングからずれて、第１拍のタイミングで英単語の発音が開始されるように、英単語群ＷＡＶデータ３０２が構成される。 Therefore, in the present embodiment, the English word group WAV data 302 is configured so that the interval from the end of a Japanese translation to the start of the English word uttered next (hereinafter, the "time interval between the Japanese translation and the English word") is at least a predetermined time. To this end, for any English word whose time interval would otherwise fall below the predetermined time, the utterance timing is shifted so that the beginning of the word, rather than its accent, falls on the first beat.

所定時間として、日本語訳と英単語との時間間隔の最低限の時間を、最低間隔時間Ａとして予め決定する。この最低間隔時間Ａは、例えば、テンポ、日本語訳の発音時間、日本語訳と英単語との聞き分けやすさ等によって決定される。 As the predetermined time, the minimum time interval between the Japanese translation and the English word is determined in advance as the minimum interval time A. The minimum interval time A is determined by, for example, the tempo, the pronunciation time of the Japanese translation, the ease of distinguishing between the Japanese translation and the English word, and the like.

例えば、日本語訳は、最大で第３拍と第４拍との２拍に割り当てられる。第３拍のタイミングで日本語訳の発音が開始されるので、下記の式１を満たすような範囲で最低間隔時間Ａを決定すると良い。 For example, the Japanese translation is assigned to 2 beats of 3rd beat and 4th beat at the maximum. Since pronunciation of the Japanese translation starts at the timing of the third beat, it is preferable to determine the minimum interval time A within a range that satisfies the following formula 1.

Ａ＋Ｋ≦６０／１４０×２・・・（式１） A + K ≤ 60 / 140 × 2 ... (Formula 1)
式１において、Ｋは、記憶対象語群に含まれる２５個の日本語訳夫々の発音時間のうち最長の発音時間である。また、６０は、１分あたりの秒数であり、１４０はテンポであり、２は拍数である。 In Formula 1, K is the longest of the utterance times of the 25 Japanese translations included in the storage target word group, 60 is the number of seconds per minute, 140 is the tempo, and 2 is the number of beats.

逆に、日本語訳と英単語との聞き分けやすさから最低間隔時間Ａを先に決定したとすると、記憶対象語群に含まれる２５個の日本語訳夫々の発音時間が、式１を満たすようにする。つまり、各日本語訳の発音時間は、最長でも、２拍分の時間から最低間隔時間Ａを差し引いた残りの時間以下とする。 On the other hand, if the minimum interval time A is determined first because it is easy to distinguish between the Japanese translation and the English word, the pronunciation time of each of the 25 Japanese translations included in the storage target word group satisfies Equation 1. Like that. That is, the pronunciation time of each Japanese translation is at most equal to or less than the remaining time obtained by subtracting the minimum interval time A from the time of two beats.
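
Formula 1 can be used in either direction: given the longest translation time K it bounds A, and given A it bounds every translation time. The sketch below shows both uses, with times in seconds. The function names and the numeric values in the usage note are illustrative assumptions, not figures from the source.

```python
def two_beat_seconds(bpm=140):
    """Right-hand side of Formula 1: the duration of two beats in seconds."""
    return 60 / bpm * 2


def satisfies_formula_1(min_interval, longest_translation, bpm=140):
    """Check A + K <= 60 / bpm * 2."""
    return min_interval + longest_translation <= two_beat_seconds(bpm)


def max_min_interval(longest_translation, bpm=140):
    """Largest A allowed once K is known."""
    return two_beat_seconds(bpm) - longest_translation


def max_translation_time(min_interval, bpm=140):
    """Longest translation time allowed once A has been chosen."""
    return two_beat_seconds(bpm) - min_interval
```

At 140 BPM two beats last about 0.857 seconds. With a hypothetical longest translation of 0.6 seconds, A can be at most about 0.257 seconds; conversely, choosing A = 0.2 seconds first limits every translation to about 0.657 seconds.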

上記最低間隔時間Ａは、ジョギング用楽曲データが作成される前の記憶対象語音声データとしてのテンポを基準にした場合の時間である。最終的に、ジョギング用楽曲データが作成される場合、記憶対象語音声を含む各楽曲のテンポは、選択されたコース及びステップに対応したテンポに調整される。その結果、日本語訳と英単語との間隔は最低間隔時間Ａよりも短くなる場合もある。ただし、楽曲を聴くユーザは、そのテンポに自然に慣れていくものと考えられる。従って、テンポが速くなれば、ユーザの聴覚がそのテンポに慣れるので、その分日本語訳と英単語との時間間隔を短くしても差し支えない。また、テンポが遅くなれば、その分日本語訳と英単語との時間間隔を長くしても良い。 The minimum interval time A is a time based on the tempo as the storage target word voice data before the jogging music data is created. Finally, when jogging song data is created, the tempo of each song including the storage target speech is adjusted to the tempo corresponding to the selected course and step. As a result, the interval between the Japanese translation and the English word may be shorter than the minimum interval time A. However, it is considered that the user who listens to the music naturally gets used to the tempo. Therefore, if the tempo is faster, the user's hearing is used to the tempo, so the time interval between the Japanese translation and the English word can be shortened accordingly. If the tempo is slow, the time interval between the Japanese translation and the English word may be increased accordingly.
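
The effect of the tempo adjustment on the interval can be estimated if the adjustment is assumed to stretch or compress all durations uniformly. The source does not state how the tempo is changed, so this is only a sketch, and `scaled_interval` is a hypothetical name.

```python
def scaled_interval(interval_seconds, base_bpm=140, target_bpm=140):
    """Interval after retiming audio from base_bpm to target_bpm.

    Assumes every duration scales by base_bpm / target_bpm.
    """
    return interval_seconds * base_bpm / target_bpm
```

Under that assumption, an interval of 0.2 seconds at the base tempo of 140 BPM shrinks to 0.16 seconds when the data is retimed to 175 BPM, the final tempo of step 6, which illustrates how the played interval can fall below A.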

以下、英単語群ＷＡＶデータ３０２における英単語の発音タイミングの調整の具体例を説明する。図１１乃至１４は、英単語群ＷＡＶデータ３０２における英単語の発音タイミングの調整例を示す図である。以下に説明する例においては、１章の記憶対象語群の１番目の英単語が「book」、２番目の英単語が「reserve」、３番目の英単語が「proposal」、４番目の英単語が「entertainment」であるものとする。 Hereinafter, a specific example of adjustment of pronunciation timing of English words in the English word group WAV data 302 will be described. FIGS. 11 to 14 are diagrams showing examples of adjustment of pronunciation timing of English words in the English word group WAV data 302. In the example described below, the first English word in the memory group of chapter 1 is “book”, the second English word is “reserve”, the third English word is “proposal”, the fourth English word Assume that the word is “entertainment”.

先ず、「book」の発声音である「ブック」のアクセントは、「ブ」にある。そこで、図１１（ａ）に示すように、「ブ」の発音タイミングを第１拍のタイミングに合わせ、日本語訳の発声音である「ホン」における「ホ」の発音タイミングを第３拍のタイミングに合わせた場合を考える。「book」の場合、アクセントは語頭にあるので、例えば式１を満たすように最低時間間隔Ａが適切に決定されていれば、「ホン」の発音終了から直後の「ブック」の発音開始までの時間間隔ｔ１は、最低時間間隔Ａよりも短くなることはない。従って、「ブック」における「ブ」の発音タイミングは、第１拍のタイミングに合わせられる。 First, the accent of "bukku", the utterance of "book", is on "bu". Consider, then, as shown in FIG. 11(a), aligning the utterance timing of "bu" with the first beat and the utterance timing of "ho" in "hon", the utterance of the Japanese translation, with the third beat. Because the accent of "book" is at the beginning of the word, as long as the minimum interval time A has been set appropriately, for example so as to satisfy Formula 1, the time interval t1 from the end of "hon" to the start of the immediately following "bukku" never falls below the minimum interval time A. The utterance timing of "bu" in "bukku" is therefore aligned with the first beat.

次に、「reserve」の発声音である「リザーブ」のアクセントは、「ザ」にある。そこで、図１１（ｂ）に示すように、「ザ」の発音タイミングを第１拍のタイミングに合わせ、日本語訳の発声音である「ヨヤクスル」における「ヨ」の発音タイミングを第３拍のタイミングに合わせ、「ス」の発音タイミングを第４拍のタイミングに合わせた場合を考える。２番目以降の英単語については、その英単語の直前に、当該英単語の日本語訳が発音される場合と、当該英単語の前の順番の英単語の日本語訳が発音される場合とがあるので、夫々について時間間隔を考える必要がある。ここで、「ホン」の発音終了から直後の「リザーブ」の発音開始までの時間間隔ｔ２、及び、「ヨヤクスル」の発音終了から直後の「リザーブ」の発音開始までの時間間隔ｔ３が、何れも最低時間間隔Ａ以上であれば、「リザーブ」の発音タイミングをずらす必要はない。図１１（ｂ）は、この場合の例である。 Next, the accent of "rizābu", the utterance of "reserve", is on "za". Consider, as shown in FIG. 11(b), aligning "za" with the first beat, "yo" in "yoyakusuru", the utterance of the Japanese translation, with the third beat, and "su" with the fourth beat. A second or subsequent English word may be immediately preceded either by its own Japanese translation or by the Japanese translation of the preceding English word, so the time interval must be considered for each case. Here, if the time interval t2 from the end of "hon" to the start of the immediately following "rizābu" and the time interval t3 from the end of "yoyakusuru" to the start of the immediately following "rizābu" are both at least the minimum interval time A, the utterance timing of "rizābu" need not be shifted. FIG. 11(b) shows an example of this case.

次に、「proposal」の発声音である「プロポーザル」のアクセントは、「ポ」にある。そこで、図１２（ａ）に示すように、「ポ」の発音タイミングを第１拍のタイミングに合わせ、日本語訳の発声音である「アン」における「ア」の発音タイミングを第３拍のタイミングに合わせた場合を考える。ここで、「ヨヤクスル」の発音終了から直後の「プロポーザル」の発音開始までの時間間隔ｔ４が、最低時間間隔Ａより短くなっている。従って、「アン」の発音終了から直後の「プロポーザル」の発音開始までの時間間隔ｔ５が最低時間間隔Ａ以上であっても、「プロポーザル」の発音タイミングがずらされることとなる。 Next, the accent of "puropōzaru", the utterance of "proposal", is on "po". Consider, as shown in FIG. 12(a), aligning "po" with the first beat and "a" in "an", the utterance of the Japanese translation, with the third beat. Here, the time interval t4 from the end of "yoyakusuru" to the start of the immediately following "puropōzaru" is shorter than the minimum interval time A. Therefore, even if the time interval t5 from the end of "an" to the start of the immediately following "puropōzaru" is at least the minimum interval time A, the utterance timing of "puropōzaru" is shifted.

具体的に、図１２（ｂ）に示すように、「プロポーザル」の発音開始の「プ」の発音タイミングが第１拍に合わせられる。英単語の発音開始タイミングが第１拍に合わせられれば、直前の日本語訳と英単語との時間間隔が最低時間間隔Ａより短くなることはない。 Specifically, as shown in FIG. 12(b), the utterance timing of "pu", where "puropōzaru" begins, is aligned with the first beat. When the start of an English word is aligned with the first beat, the time interval between the preceding Japanese translation and the English word never falls below the minimum interval time A.

次に、「entertainment」の発声音である「エンタテイメント」のアクセントは、「テ」にある。そこで、図１３（ａ）に示すように、「テ」の発音タイミングを第１拍のタイミングに合わせ、日本語訳の発声音である「ゴラク」における「ゴ」の発音タイミングを第３拍のタイミングに合わせ、「ク」の発音タイミングを第４拍のタイミングに合わせた場合を考える。ここで、「ゴラク」の発音終了から直後の「エンタテイメント」の発音開始までの時間間隔ｔ７が、最低時間間隔Ａより短くなっている。従って、「アン」の発音終了から直後の「エンタテイメント」の発音開始までの時間間隔ｔ６が最低時間間隔Ａ以上であっても、「エンタテイメント」の発音タイミングがずらされることとなる。 Next, the accent of "entateimento", the utterance of "entertainment", is on "te". Consider, as shown in FIG. 13(a), aligning "te" with the first beat, "go" in "goraku", the utterance of the Japanese translation, with the third beat, and "ku" with the fourth beat. Here, the time interval t7 from the end of "goraku" to the start of the immediately following "entateimento" is shorter than the minimum interval time A. Therefore, even if the time interval t6 from the end of "an" to the start of the immediately following "entateimento" is at least the minimum interval time A, the utterance timing of "entateimento" is shifted.

ここで、「entertainment」は、アクセントが複数ある。つまり、「entertainment」は、最も強く発音される第１アクセント（第１強勢）と、第１アクセントの次に強く発音される第２アクセント（第２強勢）とを有する。「エンタテイメント」においては、第１アクセントは「テ」にあり、第２アクセントは「エ」にある。このように、第１アクセントが第２のアクセントよりも後に発音される場合、英単語の発音開始タイミングが第１拍のタイミングに合わせられ、且つ、第１アクセントの発音タイミングが第２拍のタイミングに合わせられる。具体的に、図１３（ｂ）に示すように、「エンタテイメント」の発音開始部分の「エ」の発音タイミングが第１拍に合わせられ、「テ」の発音タイミングが第２拍に合わせられる。またこのとき、「テ」の発音タイミングが第２拍に合わせられるように、例えば、「エンターテイメント」というような発音の仕方によって、発音時間が調整される。これにより、アクセントが複数ある英単語がリズム良く発音される。 Here, "entertainment" has more than one accent: a first accent (primary stress), which is pronounced most strongly, and a second accent (secondary stress), which is pronounced next most strongly. In "entateimento", the first accent is on "te" and the second accent is on "e". When the first accent is uttered after the second accent in this way, the start of the English word is aligned with the first beat and the first accent is aligned with the second beat. Specifically, as shown in FIG. 13(b), "e", where "entateimento" begins, is aligned with the first beat and "te" is aligned with the second beat. The utterance time is adjusted so that "te" falls on the second beat, for example by pronouncing the word as "entāteimento" (エンターテイメント). As a result, an English word with multiple accents is uttered rhythmically.

なお、アクセントが複数あるか否かにかかわらず、発音時間が長い英単語の発音を、第１拍と第２拍との両方に割り当てることは差し支えない。例えば、「proposal」において、「プロポーザル」の「プ」の発音タイミングを第１拍のタイミングに合わせ、「ザ」の発音タイミングを第２拍のタイミングに合わせても良い。 Note that, regardless of whether a word has multiple accents, the pronunciation of an English word with a long pronunciation time may be assigned to both the first and second beats. For example, for "proposal", the sounding timing of "pu" in 「プロポーザル」 ("puropōzaru") may be aligned with the first beat and the sounding timing of "za" with the second beat.

このように、英単語の発音タイミングをずらして、発音開始のタイミングを第１拍のタイミングに合わせた場合に、英単語の発音が第３拍の期間にまで継続しないようにする必要がある。更に、英単語と日本語訳とをユーザが容易に聞き分けることができるように、英単語の発音終了からその次の日本語訳の発音開始まで間隔を置く必要がある。従って、各英単語の発音時間は、最長でも、２拍分の時間から所定の時間間隔を差し引いた残りの時間以下とする。このときの時間間隔は、例えば、最低間隔時間Ａと同一の時間とすると良い。 As described above, when the pronunciation timing of the English word is shifted and the pronunciation start timing is matched with the timing of the first beat, it is necessary to prevent the pronunciation of the English word from continuing until the third beat period. Furthermore, it is necessary to provide an interval from the end of pronunciation of an English word to the start of pronunciation of the next Japanese translation so that the user can easily distinguish between the English word and the Japanese translation. Accordingly, the pronunciation time of each English word is at most equal to or less than the remaining time obtained by subtracting a predetermined time interval from the time of two beats. The time interval at this time may be the same as the minimum interval time A, for example.
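The upper bound on an English word's pronunciation time described above can be sketched as follows. This is a minimal illustration, not the implementation of this embodiment; the function name and the sample interval of 0.1 seconds are assumptions, since the text gives no numeric value for the interval.

```python
def max_word_duration(bpm: float, gap_seconds: float) -> float:
    """Longest allowed pronunciation time (seconds) for one English word.

    The word starts on beat 1 and must end early enough to leave
    gap_seconds before the Japanese translation starts on beat 3,
    i.e. within two beats minus the gap.
    """
    beat_seconds = 60.0 / bpm
    return 2 * beat_seconds - gap_seconds


# Hypothetical gap of 0.1 s at the 140 BPM rhythm mentioned in the text.
limit = max_word_duration(140, 0.1)
```

At 140 BPM two beats last 120/140 seconds, so any assumed interval directly reduces the time available for the word.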

以上のように、英単語のアクセントの発音タイミングを第１拍のタイミングに合わせ、日本語訳の発音開始タイミングを第３拍のタイミングに合わせた場合に、日本語訳と英単語との時間間隔が最低時間間隔Ａ以上の英単語については、そのまま第１拍のタイミングでアクセントが発音されるように、また、日本語訳と英単語との時間間隔が最低時間間隔Ａ未満の英単語については、第１拍のタイミングで発音が開始されるように、予め英単語群ＷＡＶデータ３０２が作成される。例えば、人間が、１４０ＢＰＭのリズムに合わせ且つ発音タイミングを調整しつつ、英単語の音声のパート部分を発声し、この音声を録音して英単語群ＷＡＶデータ３０２を作成しても良い。 As described above, the time interval between the Japanese translation and the English word when the accent pronunciation timing of the English word is set to the timing of the first beat and the pronunciation start timing of the Japanese translation is set to the timing of the third beat. For English words whose minimum time interval is A or more, the accent is pronounced at the timing of the first beat as it is, and for English words whose time interval between the Japanese translation and English words is less than the minimum time interval A The English word group WAV data 302 is created in advance so that pronunciation is started at the timing of the first beat. For example, a human may utter a voice part of an English word while adjusting the pronunciation timing in accordance with the rhythm of 140 BPM, and record the voice to create the English word group WAV data 302.

そして、英単語群ＷＡＶデータ３０２、日本語訳群ＷＡＶデータ３０３及び記憶対象語用ドラムベースＷＡＶデータ３０４とが混合されて記憶対象語音声データが作成され、作成された記憶対象語音声データがパーツＷＡＶデータデータベースに登録される。パーツＷＡＶデータデータベースに登録された記憶対象語音声データにおいては、日本語訳と英単語との時間間隔が最低時間間隔Ａ以上となっている。 Then, the English word group WAV data 302, the Japanese translation group WAV data 303, and the storage target word drum base WAV data 304 are mixed to create the storage target word speech data, which is registered in the parts WAV data database. In the storage target word speech data registered in the parts WAV data database, the time interval between each Japanese translation and the English word is at least the minimum time interval A.

ところで、これまでの説明では、日本語訳と英単語との時間間隔に基づいて、英単語の発音タイミングをずらすか否かの切り分けが行われていたが、これとは別の条件で切りかけを行っても良い。 In the description so far, whether to shift the pronunciation timing of an English word has been decided based on the time interval between the Japanese translation and the English word; however, the decision may instead be made under a different condition.

第１の例は、英単語の発音開始タイミングからアクセントの発音開始タイミングまでに要する発音時間（以下、「発音開始からアクセントまでの発音時間」と称する）に基づいて切り分けを行う。この場合、発音開始からアクセントまでの発音時間が、所定の発音時間未満である場合は発音タイミングをずらさず、所定の発音時間以上である場合は発音タイミングをずらす。 In the first example, separation is performed based on the pronunciation time required from the pronunciation start timing of the English word to the accent pronunciation start timing (hereinafter referred to as “the pronunciation time from the pronunciation start to the accent”). In this case, the sound generation timing is not shifted if the sound generation time from the start of sound generation to the accent is less than the predetermined sound generation time, and the sound generation timing is shifted if it is longer than the predetermined sound generation time.

例えば、「international」の発声音である「インタナショナル」のアクセントは、「ナ」にある。そこで、図１４（ａ）に示すように、「ナ」の発音タイミングを第１拍のタイミングに合わせた場合を考える。ここで、「インタナショナル」の「イ」の発音タイミングから「ナ」の発音タイミングまでの発音時間ｔ８が、所定の発音時間以上である場合、発音タイミングがずらされることとなる。具体的に、図１４（ｂ）に示すように、「インタナショナル」の発音開始の「イ」の発音タイミングが第１拍に合わせられる。 For example, the accent of 「インタナショナル」 ("intanashonaru"), the utterance sound of "international", is on "na". Consider, as shown in FIG. 14(a), the case where the sounding timing of "na" is aligned with the first beat. If the pronunciation time t8 from the sounding timing of "i" to the sounding timing of "na" is equal to or longer than the predetermined pronunciation time, the pronunciation timing is shifted. Specifically, as shown in FIG. 14(b), the sounding timing of the initial "i" of "intanashonaru" is aligned with the first beat.

このときの所定の発音時間を、許容発音時間Ｂとして予め決定しておく。例えば、許容発音時間Ｂは、日本語訳と英単語との間隔が最低間隔時間Ａ以上となるように、下記の式２を満たす範囲で予め決定しておくと良い。 The predetermined sounding time at this time is determined in advance as the allowable sounding time B. For example, the allowable pronunciation time B may be determined in advance within a range that satisfies the following expression 2 so that the interval between the Japanese translation and the English word is equal to or longer than the minimum interval time A.

Ａ＋Ｂ＋Ｋ≦６０／１４０×２・・・（式２） A + B + K ≦ 60/140 × 2 ... (Formula 2)

第２の例は、英単語のスペルにおいて、アクセントのある文字よりも前にある文字の数に基づいて切り分けを行う。この場合、アクセントのある文字よりも前にある文字の数が、所定の文字数未満である場合は発音タイミングをずらさず、所定の文字数以上である場合は発音タイミングをずらす。 In the second example, the decision is made based on the number of letters that precede the accented letter in the spelling of the English word. In this case, the pronunciation timing is not shifted when the number of letters preceding the accented letter is less than a predetermined number, and is shifted when it is equal to or greater than the predetermined number.

例えば、「minority」の発声音である「マイノァリティ」のアクセントは、「ノ」にある。そこで、図１４（ｃ）に示すように、「ノ」の発音タイミングを第１拍のタイミングに合わせた場合を考える。ここで、「minority」のアクセントは「o」にある。この［o］よりも前にある文字は、「min」であり、その文字数は３である。そして、所定の文字数が３文字であると仮定した場合、「min」の文字数は所定の文字数以上となるので、発音タイミングがずらされることとなる。具体的に、図１４（ｄ）に示すように、「マイノァリティ」の発音開始の「マ」の発音タイミングが第１拍に合わせられる。 For example, the accent of 「マイノァリティ」, the utterance sound of "minority", is on "no". Consider, as shown in FIG. 14(c), the case where the sounding timing of "no" is aligned with the first beat. In the spelling "minority", the accent is on the letter "o". The letters preceding this "o" are "min", three letters in all. Assuming the predetermined number of letters is three, the number of letters in "min" is equal to or greater than the predetermined number, so the pronunciation timing is shifted. Specifically, as shown in FIG. 14(d), the sounding timing of the initial "ma" of 「マイノァリティ」 is aligned with the first beat.

このときの所定の文字数を、許容文字数Ｃとして予め決定しておく。例えば、許容文字数Ｃについては、文字数をその発音に要する最長の発音時間に換算し、この発音時間を許容発音時間Ｂに当てはめて、日本語訳と英単語との間隔が最低間隔時間Ａ以上となるように、式２を満たす範囲で予め決定しておくと良い。 The predetermined number of characters at this time is determined in advance as the allowable number of characters C. For example, regarding the allowable number of characters C, the number of characters is converted into the longest pronunciation time required for the pronunciation, and this pronunciation time is applied to the allowable pronunciation time B, and the interval between the Japanese translation and the English word is equal to or greater than the minimum interval time A. As such, it may be determined in advance within a range that satisfies Equation 2.
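Formula 2 can be rearranged into an upper bound on the allowable pronunciation time B. The sketch below assumes that A, B and K are all given in seconds; K is used exactly as it appears in Formula 2 (its definition lies outside this section), and the function names are hypothetical.

```python
def max_allowable_b(a: float, k: float, bpm: float = 140.0) -> float:
    """Upper bound on B from Formula 2: A + B + K <= 60 / bpm * 2."""
    two_beats = 60.0 / bpm * 2
    return two_beats - a - k


def satisfies_formula_2(a: float, b: float, k: float, bpm: float = 140.0) -> bool:
    """True when A + B + K fits within two beats at the given tempo."""
    return a + b + k <= 60.0 / bpm * 2
```

With sample values A = 0.1 and K = 0.4, B may be at most about 0.357 seconds; an allowable number of letters C would then be chosen so that its converted pronunciation time stays under that bound.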

以上の第１及び第２の例では、英単語のアクセントの発音タイミングを第１拍のタイミングに合わせ、且つ、日本語訳の発音開始タイミングを第３拍のタイミングに合わせた場合の日本語訳と英単語との時間間隔を算出する必要がない。つまり、第１及び第２の例では、英単語に関する情報のみで、切り分けを行うことが可能である。これは、英単語の直前に発音される日本語訳の発音時間を考慮する必要がないことを意味し、記憶対象語群における英単語の順番が如何様に変わっても、各英単語の発音タイミングをずらすか否かを変更する必要がない。 In the first and second examples above, there is no need to calculate the time interval between the Japanese translation and the English word that would result if the accent of the English word were aligned with the first beat and the start of the Japanese translation with the third beat. In other words, in the first and second examples the decision can be made from information about the English word alone. This means that the pronunciation time of the Japanese translation pronounced immediately before the English word need not be considered, so however the order of the English words in the storage target word group changes, the decision on whether to shift each English word's pronunciation timing does not need to change.

なお、英単語の発音タイミングをずらすか否かの切り分け方法として、日本語訳と英単語との時間間隔に基づく方法、発音開始からアクセントまでの発音時間に基づく方法、及び、アクセントのある文字よりも前にある文字の数に基づく方法のうち、何れか２つの方法を組み合わせて、又は、全ての方法を組み合わせても良い。例えば、発音開始からアクセントまでの発音時間が許容発音時間Ｂ以上の英単語や、アクセントのある文字よりも前にある文字の数が許容文字数Ｃ以上の英単語については、発音タイミングをずらすものとして予め切り分けておき、これらの条件を満たさない残りの英単語について、日本語訳と英単語との時間間隔に基づいて切り分けを行っても良い。 As a method of determining whether or not to shift the pronunciation timing of English words, a method based on the time interval between the Japanese translation and English words, a method based on the pronunciation time from the start of pronunciation to the accent, and accented characters Of the methods based on the number of preceding characters, any two methods may be combined or all methods may be combined. For example, for English words whose pronunciation time from the start of pronunciation to accent is longer than the allowable pronunciation time B, and for English words whose number of characters preceding the accented character is more than the allowable number of characters C, the pronunciation timing is shifted. The remaining English words that do not satisfy these conditions may be separated in advance based on the time interval between the Japanese translation and the English words.
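The combined decision described in the preceding paragraph can be sketched as a single predicate. The `Word` class and its field names are hypothetical; `a`, `b` and `c` stand for the minimum time interval A, the allowable pronunciation time B and the allowable number of letters C.

```python
from dataclasses import dataclass


@dataclass
class Word:
    time_to_accent: float        # seconds from word start to accent start
    letters_before_accent: int   # letters preceding the accented letter
    gap_before: float            # seconds between the preceding translation
                                 # and this word when its accent is on beat 1


def should_shift(word: Word, a: float, b: float, c: int) -> bool:
    """Return True when the word should start on beat 1 instead of its accent."""
    # Word-only criteria are checked first (first and second examples).
    if word.time_to_accent >= b or word.letters_before_accent >= c:
        return True
    # Remaining words fall back to the gap-based criterion.
    return word.gap_before < a
```

For "minority" with C = 3, the three letters "min" before the accent are enough to trigger the shift regardless of the other two values.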

［１．２．３ジョギング用楽曲データのジョギング本編部分の作成方法］ [1.2.3 Method for creating the jogging main part of the jogging music data]
次に、楽曲配信サーバ１の制御部１１によるジョギング用楽曲データのジョギング本編部分の作成方法について、図１５及び図１６を用いて説明する。 Next, a method by which the control unit 11 of the music distribution server 1 creates the jogging main part of the jogging music data will be described with reference to FIGS. 15 and 16.

図１５は、本実施形態に係る各種設定値等を示す図である。また、図１６は、ジョギング用楽曲データのジョギング本編部分の作成方法の一例を示す図である。 FIG. 15 is a diagram showing various setting values according to the present embodiment. FIG. 16 is a diagram illustrating an example of a method for creating a jogging main part of music data for jogging.

先ず、ジョギング本編部分を作成するための各種設定値等について説明する。 First, various setting values for creating the jogging main part will be described.

図１５（ａ）は、ジョギング本編部分の１ファイル中に含めることができる楽曲本体（芸術音楽及び記憶対象語音声を含む）の曲数の最小値と最大値とを示す表である。図１５（ａ）に示すように、１ファイル中には、３曲以上１０曲以下の楽曲本体が含まれる。この最小値と最大値とは、例えば、ジョギング本編部分の長さの調整のし易さや著作権料等に応じて決定される。 FIG. 15A is a table showing the minimum value and the maximum value of the number of music pieces (including art music and memorized word speech) that can be included in one file of the jogging main part. As shown in FIG. 15 (a), one file includes a music body of 3 to 10 songs. The minimum value and the maximum value are determined according to, for example, the ease of adjusting the length of the jogging main part and the copyright fee.

図１５（ｂ）は、各走行テンポにおけるジョギング本編部分の１ファイルに含まれる小節数を示す表である。 FIG. 15B is a table showing the number of bars included in one file of the main part of jogging at each running tempo.

走行テンポとは、ジョギング本編部分の１ファイルに含まれる小節数を算出するための基準となるテンポである。具体的に、初期テンポと最終テンポとが同値である場合には、
走行テンポ＝初期テンポ＝最終テンポ
であり、初期テンポと最終テンポとが異なる値である場合には、
走行テンポ＝（初期テンポ＋最終テンポ）／２
である。本実施形態においては、走行テンポの最小値を１００ＢＰＭとし、最大値を２２０ＢＰＭとした。

The running tempo is the tempo that serves as the reference for calculating the number of measures included in one file of the jogging main part. Specifically, when the initial tempo and the final tempo are the same value,
Running tempo = initial tempo = final tempo
and when the initial tempo and the final tempo are different values,
Running tempo = (initial tempo + final tempo) / 2
In this embodiment, the minimum running tempo is 100 BPM and the maximum is 220 BPM.

また、図１５（ｂ）中、ファイル時間とは、ジョギング本編部分の１ファイルの再生時間であり、各テンポ共に、１５分及び３０分のファイル時間がある。 In FIG. 15B, the file time is the playback time of one file in the main part of jogging, and each tempo has a file time of 15 minutes and 30 minutes.

また、ファイル総小節数とは、ジョギング本編部分の１ファイルに含まれる小節数である。この小節数は、１小節を４拍として算出されている。従って、
ファイル総小節数＝走行テンポ×ファイル時間／４
である。また、ジョギング本編が複数のファイルで構成される場合には、
各ファイルのファイル総小節数の合計＝ジョギング本編の全小節数
である。

The total number of file measures is the number of measures included in one file of the jogging main part. This number is calculated with one measure being 4 beats. Therefore,
Total number of file measures = running tempo × file time / 4
When the jogging main part is composed of a plurality of files,
Sum of the total number of file measures of each file = total number of measures in the jogging main part
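The running tempo and total file measure formulas above can be checked with a short sketch. The function names are hypothetical; tempos are in BPM and file time in minutes.

```python
def running_tempo(initial_bpm: float, final_bpm: float) -> float:
    """Average of initial and final tempo (equals either one when they match)."""
    return (initial_bpm + final_bpm) / 2


def file_total_measures(tempo_bpm: float, file_minutes: float,
                        beats_per_measure: int = 4) -> float:
    """Measures in one file: beats played in the file time, divided by 4."""
    return tempo_bpm * file_minutes / beats_per_measure


def main_part_total_measures(per_file_measures: list[float]) -> float:
    """Measures in a main part made of several files."""
    return sum(per_file_measures)
```

For instance, a constant 160 BPM over a 30-minute file gives 160 × 30 / 4 = 1200 measures.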

本実施形態においては、ジョギング本編部分の１ファイルに含まれる楽曲本体、前奏、後奏及び曲間部の再生時間の合計をファイル時間に合わせるために、ジョギング本編部分の長さを小節単位で調整する。これは、走行テンポがどのように変化しても、夫々の曲の小節数は変化しないからで、これによって、ジョギング本編の長さをジョギング時間に合わせることが容易となる。 In this embodiment, the length of the main part of the jogging is adjusted in units of bars in order to match the total playback time of the music body, prelude, postlude and inter-music part included in one file of the main part of the jogging to the file time. To do. This is because no matter how the running tempo changes, the number of measures in each song does not change, and this makes it easy to match the length of the main jogging to the jogging time.

図１５（ｃ）は、前奏、曲間部及び後奏に用いられる曲間つなぎデータの小節数を示す表である。 FIG. 15(c) is a table showing the number of measures of the inter-music connection data used for the prelude, the inter-music parts, and the postlude.

本実施形態においては、曲間部の曲間つなぎデータとして、小節数が互いに異なる複数の曲間つなぎデータを用意する。曲間に適切な小節数の曲間部が挿入されることで、ジョギング本編部分の１ファイルの再生時間をファイル時間に合わせ易くする。具体的には、図１５（ｃ）に示すように、最少が１２小節であり、最大が１４４小節である。そして、１２小節から１４４小節まで１２小節間隔で１２種類の長さの異なる曲間つなぎデータを用意する。なお、曲間の１２小節は、走行テンポが最大の２２０ＢＰＭであってもガイダンス音声が時間的に曲間に収まる小節数である。 In the present embodiment, a plurality of inter-music connection data having different measures are prepared as inter-music connection data in the inter-music part. By inserting an inter-music part with an appropriate number of measures between the music, the playback time of one file in the main part of the jogging can be easily adjusted to the file time. Specifically, as shown in FIG. 15C, the minimum is 12 bars and the maximum is 144 bars. Then, twelve types of connecting data between songs having different lengths are prepared at intervals of twelve bars from twelve bars to 144 bars. The 12 bars between songs are the number of bars in which the guidance sound can be temporally accommodated between songs even if the running tempo is 220 BPM.

更に、本実施形態においては、前奏及び後奏の曲間つなぎデータとして、小節数が互いに異なる複数の曲間つなぎデータを用意する。これは、ジョギング本編部分を構成するファイルの個数、及び、ジョギング本編部分を複数のファイルで構成する場合におけるファイルの再生順に応じて、予め定められた長さの前奏及び後奏を夫々のファイルに入れるためである。 Furthermore, in the present embodiment, a plurality of pieces of inter-music connection data with mutually different numbers of measures are prepared as the inter-music connection data for the prelude and the postlude. This is so that a prelude and a postlude of predetermined lengths can be placed in each file according to the number of files constituting the jogging main part and, when the jogging main part is composed of a plurality of files, the playback order of those files.

図１５（ｄ）は、ファイル時間が１５分の場合における各テンポの許容最長楽曲小節数を示す表である。 FIG. 15D is a table showing the maximum allowable number of music bars for each tempo when the file time is 15 minutes.

許容最長楽曲小節数とは、ジョギング本編部分に含めることができる楽曲本体１曲の最長の小節数である。本実施形態においては、１曲の小節数に上限を設けている。このようにすることの一つの理由は、作成されるジョギング本編部分の長さをファイル時間に合わせることを容易にするためである。 The allowable maximum number of music bars is the maximum number of bars of one music body that can be included in the main part of jogging. In this embodiment, an upper limit is set for the number of measures in one song. One reason for doing this is to make it easy to match the length of the created jogging main part to the file time.

具体的に、ファイル時間を最短の１５分として、楽曲本体が長いために１ファイルに３曲の楽曲本体しか入らなかった場合でも、曲間に曲間部を挿入することができるようにしなければならない。従って、
許容最長楽曲小節数≦（ファイル総小節数−前奏と後奏の小節数の合計の最大値−曲間部の小節数の最小値×２）／３＋１
を満たす必要がある。ここで、前奏と後奏の小節数の合計が最大値となる組み合わせは、前奏が８、後奏が２０である。また、曲間部の小節数の最小値は１２である。また、最後の「＋１」は、楽曲本体の最初の１小節と、その楽曲本体の前に再生される曲間部又は前奏の最後の１小節を重ねて再生させるために、その１小節分を許容最長楽曲小節数に加算している。

Specifically, even when the file time is the shortest, 15 minutes, and only three music bodies fit in one file because the music bodies are long, it must still be possible to insert inter-music parts between the songs. Therefore, the following must be satisfied:
Allowable maximum number of music measures ≦ (total number of file measures − maximum of the combined prelude and postlude measures − minimum number of inter-music part measures × 2) / 3 + 1
Here, the combination that maximizes the combined number of prelude and postlude measures is 8 for the prelude and 20 for the postlude. The minimum number of measures of an inter-music part is 12. The final "+ 1" adds one measure to the allowable maximum because the first measure of a music body is played overlapping the last measure of the inter-music part or prelude played before it.

このように、許容最長楽曲小節数を設定することにより、ジョギング本編部分を構成する楽曲の曲数が下限（３曲）を下回らないようにしている。これは、ユーザの立場から曲数は少なくない方が良いからである。 By setting the allowable maximum number of music measures in this way, the number of songs constituting the jogging main part is kept from falling below the lower limit (three songs). This is because, from the user's standpoint, it is better not to have too few songs.

例えば、走行テンポが１００ＢＰＭの場合における許容最長楽曲小節数は１０８である（３．０９分）。同様に、１１０ＢＰＭの場合は１２１小節（３．４６分）、１２０ＢＰＭの場合は１３３小節（３．８０分）、１３０ＢＰＭの場合は１４６小節（４．１７分）、１５５ＢＰＭの場合は１７７小節（５．０６分）、１７０ＢＰＭの場合は１９６小節（５．６０分）、１９０ＢＰＭ及び２２０ＢＰＭの場合は２１０小節である（６．００分）。なお、図１５（ｃ）中の再生時間は１４０ＢＰＭにおける再生時間である。ここで、１４０ＢＰＭにおける１曲の再生時間の許容最大値を６分としたため、許容最長楽曲小節数は、最大でも２１０小節である。 For example, when the running tempo is 100 BPM, the allowable maximum number of music measures is 108 (3.09 minutes). Similarly, it is 121 measures (3.46 minutes) at 110 BPM, 133 measures (3.80 minutes) at 120 BPM, 146 measures (4.17 minutes) at 130 BPM, 177 measures (5.06 minutes) at 155 BPM, 196 measures (5.60 minutes) at 170 BPM, and 210 measures (6.00 minutes) at 190 BPM and 220 BPM. Note that the playback times in FIG. 15(c) are playback times at 140 BPM. Since the allowable maximum playback time of one song at 140 BPM is set to 6 minutes, the allowable maximum number of music measures is at most 210.
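The values quoted above can be reproduced from the inequality with a short sketch. Rounding the bound down to a whole measure is an assumption; it is not stated in the text but matches every quoted value. The function and constant names are hypothetical.

```python
from fractions import Fraction
from math import floor

MAX_PRELUDE_PLUS_POSTLUDE = 8 + 20   # largest prelude + postlude combination
MIN_GAP_MEASURES = 12                # shortest inter-music part
CAP_MEASURES = 140 * 6 // 4          # 6 minutes at 140 BPM = 210 measures


def max_song_measures(tempo_bpm: int, file_minutes: int) -> int:
    """Allowable maximum measures of one music body, capped at 210."""
    total = Fraction(tempo_bpm * file_minutes, 4)   # total file measures
    bound = (total - MAX_PRELUDE_PLUS_POSTLUDE - 2 * MIN_GAP_MEASURES) / 3 + 1
    # Assumption: truncate to whole measures, then apply the 210-measure cap.
    return min(floor(bound), CAP_MEASURES)


def minutes_at_140(measures: int) -> float:
    """Playback time in minutes of that many 4-beat measures at 140 BPM."""
    return measures * 4 / 140
```

For a 15-minute file at 100 BPM this gives (375 − 28 − 24) / 3 + 1 ≈ 108.67, truncated to 108 measures, or about 3.09 minutes at 140 BPM.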

また、ファイル時間が３０分である場合の許容最長楽曲小節数は、どのテンポであっても２１０小節である。 In addition, when the file time is 30 minutes, the maximum allowable number of music bars is 210 bars at any tempo.

以下においては、ジョギング本編部分（本実施形態に係る記憶対象語音声を含む）の具体的な作成方法を、テンポが一定の場合とテンポが上昇する場合とで分節して説明する。なお、本実施形態においては、いずれのテンポ又はトレーニングコースの場合でも、経験的にジョギング開始後約１５分経過後から徐々にランニングハイ状態が始まることを前提としており、ジョギング本編部分の最初からランニングハイ状態が始まる以前は楽曲本体データの再生を行い、ランニングハイ状態が始まった以降はジョギング本編部分の最後まで記憶対象語音声データの再生を行うものとしている。 In the following, a specific method of creating the jogging main part (including the speech to be stored according to the present embodiment) will be described by dividing it into a case where the tempo is constant and a case where the tempo is increased. In this embodiment, in any tempo or training course, it is empirically assumed that the running high state starts gradually after about 15 minutes from the start of jogging. Before the high state begins, the music body data is reproduced, and after the running high state begins, the storage target speech data is reproduced until the end of the jogging main part.

［１．２．３．１テンポが一定の場合］ [1.2.3.1 When the tempo is constant]
図１６に示す例では、テンポ１４０ＢＰＭの楽曲データ及び記憶対象語音声データを全て１６０ＢＰＭの楽曲データに変換して、ファイル時間３０分（ファイル総小節数１２００）のジョギング本編を作成する場合の例であり、楽曲本体データとしては、曲Ａ〜曲Ｅの５個の楽曲本体データが、この選曲順で選択されたものとする。また、記憶対象語音声データとしては、音声Ｆ〜音声Ｊの５個の記憶対象語音声データが、この選曲順で選択されたものとする。更に、曲間つなぎデータとしては、前奏及び後奏の曲間つなぎデータの他に、１２小節の曲間部ａから１４４小節の曲間部ｌまでの１２個の曲間つなぎデータが用意されている。 The example shown in FIG. 16 is one in which the music data and storage target word speech data at a tempo of 140 BPM are all converted to 160 BPM and a jogging main part with a file time of 30 minutes (1200 total file measures) is created. As the music body data, five pieces, songs A to E, are assumed to have been selected in this order. As the storage target word speech data, five pieces, voices F to J, are assumed to have been selected in this order. In addition to the inter-music connection data for the prelude and the postlude, twelve pieces of inter-music connection data are prepared, from the 12-measure inter-music part a to the 144-measure inter-music part l.

制御部１１は、楽曲本体データ、記憶対象語音声データ及び曲間つなぎデータを、夫々テンポ１６０ＢＰＭの楽曲本体データ、記憶対象語音声データ及び曲間つなぎーデータに変換する。ここで、各楽曲の演奏時間及び記憶対象語音声の再生時間は、元の７／８（１４０ＢＰＭ／１６０ＢＰＭ）となる。 The control unit 11 converts the music main body data, the storage target word voice data, and the inter-music connection data into the music main body data, the storage target word voice data, and the inter-music connection data having a tempo of 160 BPM, respectively. Here, the performance time of each musical piece and the reproduction time of the speech to be stored are the original 7/8 (140 BPM / 160 BPM).

そして、制御部１１は、前奏の曲間つなぎデータをジョギング本編の先頭に挿入し、後奏の曲間つなぎデータをジョギング本編の最後に挿入する。また、制御部１１は、曲Ａ〜曲Ｅの楽曲本体データ及び音声Ｆ〜音声Ｊの記憶対象語音声データを、この曲順で、前奏の曲間つなぎデータと後奏の曲間つなぎデータとの間に挿入する。 Then, the control unit 11 inserts the prelude inter-music connection data at the beginning of the jogging main part and the postlude inter-music connection data at its end. The control unit 11 also inserts the music body data of songs A to E and the storage target word speech data of voices F to J, in this order, between the prelude and postlude inter-music connection data.

この時点におけるジョギング本編の小節数、すなわち、前奏と曲Ａ〜曲Ｅと音声Ｆ〜音声Ｊの記憶対象語音声と後奏との小節数の合計は、１６０ＢＰＭにおけるファイル時間３０分に対応する小節数である１２００小節未満となる。そして、制御部１１が、各楽曲本体データ及び各記憶対象語音声データの間に、曲間部ａ〜曲間部ｌのうち適切な長さの曲間つなぎデータを夫々挿入することにより、ジョギング本編の最終的な小節数が、ファイル総小節数である１２００小節となるように調整する。曲間部の曲間つなぎデータは１２小節間隔で用意されているので、ファイル時間に相応する小節数と実際に作成されるジョギング本編部分の小節数の差は最大でも±６小節分で済ませることができる。このように、必ずしも丁度１２００小節になるようにジョギング本編を作成する必要はなく、多少のズレがあっても良い。 At this point, the number of measures in the jogging main part, that is, the sum of the measures of the prelude, songs A to E, the storage target word speech of voices F to J, and the postlude, is less than 1200, the number of measures corresponding to a file time of 30 minutes at 160 BPM. The control unit 11 then inserts inter-music connection data of an appropriate length, chosen from inter-music parts a to l, between each piece of music body data and storage target word speech data, so that the final number of measures in the jogging main part comes to the total number of file measures, 1200. Because the inter-music connection data is prepared in 12-measure steps, the difference between the number of measures corresponding to the file time and the number of measures in the jogging main part actually created can be kept within ±6 measures. Thus the jogging main part need not come to exactly 1200 measures; a small deviation is acceptable.

また、制御部１１は、およそ中心に位置する曲間部の小節数が可能な限り多くなるよう、各曲間部の小節数を調整する。これは、長い曲間を１箇所に集中させることで他の曲間を短くし、曲間が間延びした印象をユーザに与えないためである。また、中心に位置する曲間に長い曲間部を挿入することで、メリハリが生まれる。その他の位置には、極力小節数の少なく且つ長さが均等になるような曲間部が挿入される。図１６に示す例では、全部で５曲の楽曲と５章の記憶対象語音声が選択されているので、５曲目である曲Ｅと最初の記憶対象語音声との曲間に１４４小節の曲間部が挿入され、その他の曲間には１２小節又は２４小節の曲間部が挿入されている。 In addition, the control unit 11 adjusts the number of bars in each inter-music part so that the number of bars in the inter-music part located at the center is as large as possible. This is because by concentrating long music pieces in one place, the other music pieces are shortened, and the impression that the music pieces are extended is not given to the user. Moreover, sharpness is born by inserting a long inter-music part between the music located in the center. In other positions, inter-musical portions are inserted so that the number of measures is as small as possible and the lengths are uniform. In the example shown in FIG. 16, a total of 5 songs and 5 chapters of the speech to be stored are selected, so there are 144 bars of music between the 5th song E and the first speech to be stored. Between the other songs, a 12-bar or 24-bar inter-music portion is inserted.
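One way to realise the allocation described above is sketched below. It is only an illustration under stated assumptions, not the procedure of the control unit 11: the function name, the rounding to the nearest 12-measure step, and the round-robin distribution of the remainder are all choices made for this example.

```python
def allocate_gaps(fill_measures: int, n_gaps: int,
                  step: int = 12, longest: int = 144) -> list[int]:
    """Split fill_measures across n_gaps inter-music parts.

    Every gap is a multiple of `step` between `step` and `longest`.
    The centre gap is made as long as possible; whatever remains is
    spread over the other gaps as evenly as possible.
    """
    if n_gaps < 1:
        raise ValueError("at least one gap is required")
    # Round the amount to fill to the nearest multiple of the step,
    # keeping it within what the gaps can actually hold.
    units = max(n_gaps, round(fill_measures / step))
    units = min(units, n_gaps * (longest // step))
    gaps = [1] * n_gaps                      # every gap gets the minimum
    spare = units - n_gaps
    centre = n_gaps // 2
    extra = min(spare, longest // step - 1)  # grow the centre gap first
    gaps[centre] += extra
    spare -= extra
    others = [i for i in range(n_gaps) if i != centre]
    i = 0
    while spare > 0 and others:
        gaps[others[i % len(others)]] += 1
        spare -= 1
        i += 1
    return [g * step for g in gaps]


# Hypothetical case: 10 items (9 gaps) with 312 measures left to fill.
gaps = allocate_gaps(312, 9)
```

With nine gaps the centre index falls between the fifth and sixth items, which corresponds to the gap between song E and the first storage target word speech in the FIG. 16 example; the remaining gaps come out as 12 or 24 measures.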

なお、小節数が多い曲間部を必ずしも中心に位置する曲間に挿入する必要はなく、また、１箇所のみに挿入する必要もない。例えば、全体の３分の１の楽曲が終わった後に小節数が多い曲間部を挿入し、更に３分の１の楽曲が終わった後に小節数が多い曲間部を挿入しても良い。ただし、間延びした印象を与えないためにも、小節数が多い曲間部を挿入する箇所を、極力少なめ（例えば、多くても全曲間の半分以下）にすることが望ましい。 In addition, it is not always necessary to insert an inter-music portion having a large number of measures between music pieces located at the center, and it is not necessary to insert it only at one place. For example, an inter-music part having a large number of measures may be inserted after the entire one-third music has been completed, and an inter-music part having a large number of measures may be inserted after one-third of the music has been completed. However, in order not to give the impression of being extended, it is desirable that the number of places where the number of measures is inserted is made as small as possible (for example, at most half or less between all songs).

曲間部の曲間つなぎデータは、各楽曲本体の曲間を曲間部でつなぐことにより、音楽的（聴覚的）に自然な形で各楽曲本体を接続して演奏するために用いられる楽曲データである。 Inter-song inter-song data is the music used to connect each music body in a musically (hearingly) natural way by connecting the songs of each music body between the songs. It is data.

ここで、楽曲本体データは、図３において説明したように、楽曲本体ＭＩＤＩデータ１０５から変換された楽曲本体のＷＡＶデータにジョギングアレンジドラムベースＷＡＶデータ１０６が合成されて作成される。このとき、楽曲本体の曲調に合ったリズムパターンのジョギングアレンジドラムベースＷＡＶデータ１０６が選択される。 Here, the music body data is created by synthesizing the jogging arrangement drum base WAV data 106 with the WAV data of the music body converted from the music body MIDI data 105 as described in FIG. At this time, jogging arrangement drum base WAV data 106 having a rhythm pattern matching the tone of the music body is selected.

また、記憶対象語音声データは、図１１において説明したように、英単語群ＷＡＶデータ３０２と日本語訳群ＷＡＶデータ３０３に、記憶対象語用ドラムベースＷＡＶデータ３０４が合成されて作成される。 Further, as described with reference to FIG. 11, the storage target word speech data is created by synthesizing the storage target word drum base WAV data 304 with the English word group WAV data 302 and the Japanese translation group WAV data 303.

ジョギング用楽曲データを構成する複数の楽曲本体（芸術音楽及び記憶対象語音声を含む）夫々のリズムパターンが全て同一であれば、曲間部の曲間つなぎデータとしては、これと同一のリズムパターンのデータのみを用意すれば、各楽曲の接続前後でリズムパターンが変化しないため、自然な形で演奏及び再生を継続することができる。しかしながら、楽曲本体のリズムパターンが曲毎に異なるような場合においては、一つのパターンでは、各楽曲の接続前後でリズムパターンが急激に変化し、音楽的に不自然となる。 If the rhythm patterns of a plurality of song bodies (including art music and memorized word speech) constituting the jogging song data are all the same, the same rhythm pattern is used as the song connecting data between the songs. If only this data is prepared, the rhythm pattern does not change before and after the connection of each music piece, so that the performance and reproduction can be continued in a natural manner. However, in the case where the rhythm pattern of the music body differs from song to song, with one pattern, the rhythm pattern changes abruptly before and after connection of each song, making it unnatural musically.

そこで、本実施形態においては、ジョギング用楽曲データを構成する複数の楽曲本体のリズムパターンが２パターン以上存在する場合には、これに応じて、複数のリズムパターンの曲間部の曲間つなぎデータを用意する。 Therefore, in the present embodiment, when there are two or more rhythm patterns of a plurality of music main bodies constituting the music data for jogging, in accordance with this, the connection data between the songs in the inter-music part of the plurality of rhythm patterns Prepare.

例えば、ある楽曲本体とその次の楽曲本体のリズムパターンとが異なる場合には、前の楽曲本体に接続する部分においては、その楽曲本体のリズムパターンと同じ（又は近い）リズムパターンで演奏が開始され、次の楽曲本体に接続する部分においては、その楽曲本体のリズムパターンと同じ（又は近い）リズムパターンで演奏が終了し、演奏が進行するに従ってリズムパターンが次第に変化していくような曲間部を用意する。 For example, when the rhythm pattern of one music body differs from that of the next, an inter-music part is prepared whose performance begins, where it connects to the preceding music body, with a rhythm pattern identical (or close) to that of the preceding music body, ends, where it connects to the next music body, with a rhythm pattern identical (or close) to that of the next music body, and changes gradually from one to the other as the performance progresses.

そして、例えば、ジョギング用楽曲データを構成する複数の楽曲本体のリズムパターンとして、ＸとＹとの２つのパターンが混在している場合には、リズムパターンがＸ一定の曲間部、Ｙ一定の曲間部、ＸからＹに変化する曲間部、ＹからＸに変化する曲間部の４つを、曲間部ａ〜曲間部ｆの夫々について用意する（例えば、曲間部ａの場合には、リズムパターンがＸ一定の曲間部ａ１、Ｙ一定の曲間部ａ２、ＸからＹに変化する曲間部ａ３、ＹからＸに変化する曲間部ａ４）。 For example, when two rhythm patterns, X and Y, are mixed among the music bodies constituting the jogging music data, four variants are prepared for each of inter-music parts a to f: one whose rhythm pattern is constant at X, one constant at Y, one that changes from X to Y, and one that changes from Y to X (for inter-music part a, for example, these are a1 with constant X, a2 with constant Y, a3 changing from X to Y, and a4 changing from Y to X).

制御部１１は、曲間部を挿入する曲間前後の楽曲本体データの楽曲情報又は記憶対象語音声データの内容情報等を参照して、楽曲本体のリズムパターンを認識し、これに対応したリズムパターンとなる曲間部の曲間つなぎデータを挿入する。 The control unit 11 recognizes the rhythm pattern of the music body by referring to the music information of the music body data before and after the music between which the music part is inserted or the content information of the speech data to be stored, and the corresponding rhythm pattern. Insert inter-song link data of the inter-song part that is the pattern.
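The selection of a rhythm variant from the patterns on either side of a gap can be sketched as a lookup. The function name is hypothetical, and the suffix numbering simply follows the a1 to a4 example given in the text for two patterns X and Y.

```python
def pick_bridge(length_id: str, prev_pattern: str, next_pattern: str) -> str:
    """Choose the rhythm variant of an inter-music part.

    length_id is the part's length label ("a", "b", ...); prev_pattern and
    next_pattern are the rhythm patterns of the bodies before and after it.
    """
    variants = {
        ("X", "X"): "1",   # constant X
        ("Y", "Y"): "2",   # constant Y
        ("X", "Y"): "3",   # changes from X to Y
        ("Y", "X"): "4",   # changes from Y to X
    }
    return length_id + variants[(prev_pattern, next_pattern)]
```

For a 12-measure gap (part a) between a pattern-X body and a pattern-Y body, this returns "a3".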

ここで、制御部１１は、楽曲本体データと曲間部の曲間つなぎデータとを接続する場合に、楽曲本体の最初の１小節と曲間部（楽曲本体がそのファイルの１曲目である場合には前奏）の最後の１小節とを重複させて接続する。これは、シンコペーションで始まる楽曲の本体開始部分と曲間部の終了部分とを自然に繋げるためである。 Here, when the control unit 11 connects the music main body data and the inter-music piece connection data, the first measure of the music main body and the inter-music piece portion (when the music main body is the first song of the file) Is connected with the last bar of the prelude) overlapping. This is to naturally connect the main body start portion of the music beginning with the syncopation and the end portion of the inter-music portion.

つまり、小節を跨いだシンコペーションでは、１小節目の最後で実際の演奏が開始される場合がある。例えば、１小節目の４拍子目で演奏が開始されるとする。そうすると、１小節目の最初の３拍子分は何のメロディーも演奏されないこととなるので、曲間部の最後の部分と楽曲本体との最初の部分との接続部分で、音楽的に不自然な部分ができてしまう。そこで、楽曲本体の最初の１小節と曲間部の最後の１小節とを重複させるのである。なお、シンコペーションで始まらない楽曲本体では、最初の１小節は無音で作成されている。そして、制御部１１は、楽曲本体と曲間部とを一律に１小節分重複させる。 In other words, in syncopation across measures, actual performance may start at the end of the first measure. For example, assume that the performance starts at the fourth beat of the first measure. Then, no melody will be played for the first three beats of the first measure, so it is musically unnatural at the connection between the last part of the inter-music part and the first part of the music body. A part is made. Therefore, the first one bar of the music body and the last one bar between the music pieces are overlapped. In the music body that does not start with syncopation, the first bar is created with no sound. Then, the control unit 11 uniformly overlaps the music main body and the music interval part by one bar.

こうしたことから、小節数を計算する場合においては、各楽曲本体の小節数から夫々１小節減算する必要がある。 For this reason, when calculating the number of measures, it is necessary to subtract one measure from the number of measures of each music body.
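The measure count with the one-measure overlap can be sketched as follows. The function and parameter names are hypothetical, and the sample lengths in the usage note are made up for illustration.

```python
def main_part_measures(prelude: int, bodies: list[int],
                       gaps: list[int], postlude: int) -> int:
    """Measures in the assembled main part.

    Each music body overlaps the preceding prelude or inter-music part
    by one measure, so one measure is subtracted per body.
    """
    if len(gaps) != len(bodies) - 1:
        raise ValueError("need exactly one gap between consecutive bodies")
    return prelude + sum(m - 1 for m in bodies) + sum(gaps) + postlude
```

For example, an 8-measure prelude, bodies of 100, 90 and 110 measures, gaps of 12 and 24 measures and a 20-measure postlude give 8 + 297 + 36 + 20 = 361 measures.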

Note that the inter-song link data for the prelude inserted before the first music body of the jogging main part preferably uses a rhythm pattern that is the same as (or close to) that of the first music body, and the inter-song link data for the postlude inserted after the last music body of the jogging main part preferably uses a rhythm pattern that is the same as (or close to) that of the last music body.

[1.2.3.2 When the tempo increases]
When the tempo increases, that is, when
initial tempo < final tempo,
the procedure is basically the same as when the tempo is constant, apart from the tempo adjustment. Whether or not the tempo changes during playback, the number of measures in each song does not change, so once the total number of measures in the file is fixed, the number of measures in the jogging main part that is actually created can be adjusted easily.

If this adjustment were instead made in terms of performance or playback time rather than the number of measures, it would immediately become complicated. To obtain the performance time of each song, the tempo of each song must first be obtained. However, an appropriate tempo cannot be obtained for a song unless its playback position (the time elapsed since playback of the beginning of the jogging main part started) is known. The performance or playback time and the tempo would therefore have to be set provisionally before the overall performance time could be adjusted, which is clearly more complicated than adjusting by the number of measures. The processing would also have to differ between the constant-tempo case and the rising-tempo case. Furthermore, because the provisionally set tempo may deviate from the actual tempo, the error between the calculated overall performance time and the file time may become large. Adjusting by the number of measures avoids these problems.

Next, the tempo is also adjusted in units of measures. Specifically, the tempo of the first measure of the jogging main part is set to the initial tempo, the tempo of the last measure is set to the final tempo, and the measures in between are adjusted so that the tempo changes gradually from the initial tempo to the final tempo. For example, when the initial tempo is 160 BPM, the final tempo is 170 BPM, and the total number of measures in the file is 1200, the first measure is set to 160 BPM and the tempo of each subsequent measure is raised by 1/120 BPM per measure.
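The per-measure adjustment in this example can be sketched as follows. The function name is hypothetical, and the step is taken as (final tempo - initial tempo) / total measures, which reproduces the 1/120 BPM figure. With that step the last measure falls one step short of the final tempo; how the embodiment treats the last measure exactly is not stated, so this is only one possible reading.

```python
from fractions import Fraction


def measure_tempos(initial_bpm, final_bpm, total_measures):
    """Tempo of each measure for a linear ramp (illustrative sketch only)."""
    # Exact rational arithmetic keeps the 1/120 BPM step free of rounding.
    step = (Fraction(final_bpm) - Fraction(initial_bpm)) / total_measures
    return [Fraction(initial_bpm) + k * step for k in range(total_measures)]


tempos = measure_tempos(160, 170, 1200)
step = tempos[1] - tempos[0]  # Fraction(1, 120)
```

When the initial and final tempos are equal, the step is zero and every measure receives the same tempo, so the constant-tempo case needs no separate handling.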

[1.3 Music distribution server operation]
[1.3.1 Storage target word voice data creation process]
Next, the operation of the music distribution server 1 will be described. First, the storage target word voice data creation process will be described with reference to FIG. 17.

図１７は、本実施形態に係る楽曲配信サーバ１の制御部１１の記憶対象語音声データ作成処理における処理例を示すフローチャートである。 FIG. 17 is a flowchart illustrating a processing example in the storage target word voice data creation processing of the control unit 11 of the music distribution server 1 according to the present embodiment.

図１７の処理は、例えば、オペレータからの指示操作によって、記憶対象語ＷＡＶパーツデータ書き出しプログラム２０５が起動したときに開始される。先ず、制御部１１は、パーツＷＡＶデータデータベースから、記憶対象語用ドラムベースＷＡＶデータ３０４を取得し、取得したＷＡＶデータをＲＡＭに記憶する（ステップＳ４０１）。 The process of FIG. 17 is started, for example, when the storage target word WAV part data writing program 205 is activated by an instruction operation from the operator. First, the control unit 11 acquires the drum base WAV data 304 for the storage target word from the part WAV data database, and stores the acquired WAV data in the RAM (step S401).

Next, the control unit 11 acquires, from the part WAV data database, the English word group WAV data 302 and the Japanese translation group WAV data 303 corresponding to it, and stores the acquired WAV data in the RAM (step S402). The acquired English word group WAV data 302 and Japanese translation group WAV data 303 correspond in the word groups they reproduce: the Japanese translations of the English words reproduced from the acquired English word group WAV data 302 are reproduced from the acquired Japanese translation group WAV data 303. The English word group WAV data 302 and Japanese translation group WAV data 303 to be acquired may be selected, for example, based on an instruction from the operator, or based on a setting file in which the WAV data to be acquired is listed in advance.

Next, the control unit 11 creates the storage target word voice data by mixing the acquired drum base WAV data 304 for the storage target word, English word group WAV data 302, and Japanese translation group WAV data 303 (step S403). At this time, the control unit 11 adjusts the playback positions of the Japanese translation group WAV data 303 and the drum base WAV data 304 for the storage target word so that the start of each Japanese translation coincides with the third beat of the rhythm sound. The control unit 11 also adjusts the playback positions of the English word group WAV data 302 and the drum base WAV data 304 for the storage target word so that, for an English word whose time interval from the Japanese translation is at least the minimum time interval A, the accent coincides with the first beat of the rhythm sound, and, for an English word whose time interval from the Japanese translation would be less than the minimum time interval A, the start of the word coincides with the first beat of the rhythm sound.
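The choice between the two alignments for an English word can be sketched as below. All names and the representation of timings in seconds are assumptions for illustration. The sketch also assumes that the time interval is measured from the end of the preceding Japanese translation to the start of the English word when its accent is placed on the beat.

```python
def english_word_onset(beat_time, accent_offset, translation_end, min_gap):
    """Return the start time (seconds) of an English word.

    beat_time       -- time of the first beat the word is aligned to
    accent_offset   -- time from the start of the word to its accent
    translation_end -- time at which the preceding Japanese translation ends
    min_gap         -- minimum time interval A
    """
    # Start time the word would have if its accent fell on the beat.
    accent_aligned_onset = beat_time - accent_offset
    if accent_aligned_onset - translation_end >= min_gap:
        return accent_aligned_onset  # the accent lands on the first beat
    return beat_time  # too close: the start of the word lands on the beat


# Enough room before the beat: the accent is placed on the beat.
onset_a = english_word_onset(4.0, 0.25, 3.5, 0.125)
# Translation ends too late: the start of the word is placed on the beat.
onset_b = english_word_onset(4.0, 0.25, 3.7, 0.125)
```

In the second call the accent no longer coincides with the beat, which trades rhythmic alignment for a guaranteed pause after the Japanese translation.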

However, if the English word group WAV data 302 and the Japanese translation group WAV data 303 have an appropriate structure as described in section 1.2.2.2 above, adjusting the playback positions only requires aligning the starting playback positions of the drum base WAV data 304 for the storage target word, the English word group WAV data 302, and the Japanese translation group WAV data 303 with one another.

次いで、制御部１１は、作成した記憶対象語音声データをパーツＷＡＶデータデータベースに登録する（ステップＳ４０４）。制御部１１は、この処理を終えると、記憶対象語音声データ作成処理を終了させる。 Next, the control unit 11 registers the created storage target word speech data in the part WAV data database (step S404). When this process is completed, the control unit 11 ends the storage target word speech data creation process.

[1.3.2 Main processing]
Next, the main processing will be described with reference to FIG. 18.

図１８は、本実施形態に係る楽曲配信サーバ１の制御部１１のメイン処理における処理例を示すフローチャートである。 FIG. 18 is a flowchart illustrating a processing example in the main processing of the control unit 11 of the music distribution server 1 according to the present embodiment.

先ず、ユーザ操作により、ユーザＰＣ２が楽曲配信サーバ１にアクセスすると、図１８に示すように、楽曲配信サーバ１の制御部１１は、ログイン処理を実行する（ステップＳ１）。具体的に、制御部１１は、ユーザＰＣ２からユーザＩＤ、パスワード等を受信し、認証処理を行って、ユーザを特定する。 First, when the user PC 2 accesses the music distribution server 1 by a user operation, as shown in FIG. 18, the control unit 11 of the music distribution server 1 executes a login process (step S1). Specifically, the control unit 11 receives a user ID, a password, and the like from the user PC 2 and performs an authentication process to specify a user.

Next, the control unit 11 executes a course selection process (step S2). Specifically, the control unit 11 transmits a predetermined course selection WEB page to the user PC 2, has the user select a training course as described with reference to FIG. 4, and receives information on the selected training course from the user PC 2.

Next, the control unit 11 transmits a predetermined WEB page to the user PC 2, has the user enter his or her weight and sex and the name of the desired storage target word group, and receives this information from the user PC 2 (step S3). The user's weight is used to calculate the user's calorie consumption during jogging, and the sex is used to determine the final training course. Because athletic ability differs by sex, the content of a training course (specifically, the jogging time and tempo) is varied slightly by sex even within the same course. The data indicating the entered storage target word group name is used, as described above, to set the storage target word voice that is played, following the songs, from the time the user has shifted to the running high state.

次いで、制御部１１は、ユーザＰＣ２から受信されたトレーニングコースの情報と性別の情報とに基づいて、最終的なトレーニングコースを決定する（ステップＳ４）。 Next, the control unit 11 determines a final training course based on the training course information and gender information received from the user PC 2 (step S4).

なお、ステップＳ２において、特定されたユーザが前回までに選択したトレーニングコースを変更しない場合には、ステップＳ３の処理を省略し、ユーザのトレーニング情報から、トレーニングコースの情報を取得しても良い。 In step S2, if the identified user does not change the training course selected by the previous time, the processing of step S3 may be omitted, and the training course information may be acquired from the user training information.

次いで、制御部１１は、ステップを決定する（ステップＳ５）。具体的に、制御部１１は、先ず、システム側で推奨するステップを提示するＷＥＢページをユーザＰＣ２に送信する。例えば、今回新たなトレーニングコースが選択された場合には、ステップ１が推奨され、前回までのトレーニングコースが変更されない場合には、現在のステップ、当該トレーニングコースのジョギング用楽曲データを最初にダウンロードした日からの経過日数等に応じて相応しいステップが推奨される。ここで、ユーザＰＣ２を操作することによって、提示されたステップをユーザが承諾した場合には、その旨のリクエストがユーザＰＣ２から楽曲配信サーバ１に送信される一方、ユーザがステップを変更する場合には、その旨のリクエストがユーザＰＣ２から楽曲配信サーバ１に送信される。そして、制御部１１は、このリクエストに基づいて、ステップを決定する。 Next, the control unit 11 determines a step (step S5). Specifically, the control unit 11 first transmits a WEB page presenting recommended steps on the system side to the user PC 2. For example, when a new training course is selected this time, Step 1 is recommended, and when the previous training course is not changed, the current step and music data for jogging of the training course are downloaded first. Appropriate steps are recommended according to the number of days since the date. Here, when the user accepts the presented step by operating the user PC 2, a request to that effect is transmitted from the user PC 2 to the music distribution server 1, while the user changes the step. The request to that effect is transmitted from the user PC 2 to the music distribution server 1. And the control part 11 determines a step based on this request.

次いで、制御部１１は、決定されたトレーニングコース及びステップに対応するジョギング時間（記憶対象語音声データの再生時間を含む）、初期テンポ及び終了テンポを、コース情報から取得し、ＲＡＭ上の所定領域に設定する（ステップＳ６）。 Next, the control unit 11 acquires the jogging time (including the reproduction time of the storage target speech data), the initial tempo, and the end tempo corresponding to the determined training course and step from the course information, and a predetermined area on the RAM. (Step S6).

Next, the control unit 11 sets the file times FT(1) to FT(N) based on the jogging time, as described with reference to FIG. 7. For example, the control unit 11 sets N = 1 and FT(1) = 15 when the jogging time is 15 minutes, and sets N = 2, FT(1) = 30, and FT(2) = 15 when the jogging time is 45 minutes.
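One rule that reproduces both examples is sketched below. The text confirms only the 15-minute and 45-minute cases. The greedy split into 30-minute files with a 15-minute remainder, the restriction to multiples of 15 minutes, and the function name are all assumptions; the actual rule is the one described with reference to FIG. 7.

```python
def file_times(jogging_minutes, long_file=30, short_file=15):
    """Split a jogging time into file times FT(1)..FT(N) (assumed rule)."""
    if jogging_minutes <= 0 or jogging_minutes % short_file != 0:
        raise ValueError("jogging time must be a positive multiple of 15")
    # As many 30-minute files as fit, then one 15-minute file if needed.
    times = [long_file] * (jogging_minutes // long_file)
    if jogging_minutes % long_file:
        times.append(short_file)
    return times


ft_15 = file_times(15)  # N = 1
ft_45 = file_times(45)  # N = 2
```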

Next, the control unit 11 sets the file name of the warm-up song (step S8). Specifically, the control unit 11 sets a file name that includes the course number, today's date, and the serial number "1" (for example, "001-20080701-1.MP3", where "001" is the course number, "20080701" is the date, and "1" is the serial number).
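The naming rule can be sketched as a small formatting function. The function name is hypothetical, and the three-digit zero padding of the course number is inferred from the single example, so it should be read as an assumption.

```python
from datetime import date


def jogging_file_name(course_number, day, serial):
    """Build a '<course>-<yyyymmdd>-<serial>.MP3' file name (sketch)."""
    # The course number is zero-padded to three digits, as in "001".
    return f"{course_number:03d}-{day:%Y%m%d}-{serial}.MP3"


warm_up_name = jogging_file_name(1, date(2008, 7, 1), 1)
```

For course 1 on 1 July 2008 this gives "001-20080701-1.MP3", matching the example in the text.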

As described later, the control unit 11 also sets file names for the jogging main part and the cool-down song under the same naming rule, incrementing the serial number each time. This is so that, when the jogging music data is played on the portable music player 3, the files are played in the prescribed order (warm-up song, first file of the jogging main part, second file of the jogging main part, ..., cool-down song). MP3 format data can carry an ID tag, and this ID tag can include a track number. There is no problem if the portable music player 3 determines the playback order of the files by referring to the ID tag, but some models do not support ID tags. Many such models play files in file name order (for example, in ascending order of ASCII (American Standard Code for Information Interchange) code, which is one example of a character code). The control unit 11 therefore sets the file names as described above.

次いで、制御部１１は、ファイル番号ｉに１を設定する（ステップＳ９）。 Next, the control unit 11 sets 1 to the file number i (step S9).

次いで、制御部１１は、後述する使用楽曲決定処理を実行することにより、ｉ番目のジョギング本編１ファイルに相当するジョギング用楽曲データの作成に用いられる楽曲データ、記憶対象語音声データ、曲間つなぎデータと、その演奏（再生）順序を決定する（ステップＳ１０）。 Next, the control unit 11 executes music piece determination processing, which will be described later, so that music data used for creation of jogging music data corresponding to the i-th jogging main part 1 file, storage target word voice data, and inter-song joining. Data and its performance (reproduction) order are determined (step S10).

次いで、制御部１１は、ジョギング用楽曲データ作成処理を実行することにより、ｉ番目のジョギング本編１ファイルに相当するジョギング用楽曲データを作成する（ステップＳ１１）。 Next, the control unit 11 executes jogging song data creation processing to create jogging song data corresponding to the i-th jogging main part 1 file (step S11).

次いで、制御部１１は、ファイル番号ｉに１を加算して（ステップＳ１２）、ファイル番号ｉがファイル数Ｎより大きいか否かを判定する（ステップＳ１３）。このとき、制御部１１は、ファイル番号ｉがファイル数Ｎよりも大きくない場合には（ステップＳ１３：ＮＯ）、ステップＳ１０に移行する。 Next, the control unit 11 adds 1 to the file number i (step S12), and determines whether the file number i is greater than the number N of files (step S13). At this time, if the file number i is not greater than the number N of files (step S13: NO), the control unit 11 proceeds to step S10.

一方、制御部１１は、ファイル番号ｉがファイル数Ｎより大きい場合には（ステップＳ１３：ＹＥＳ）、クールダウン曲のファイル名を設定する（ステップＳ１４）。具体的に、制御部１１は、コース番号及び今日の日付を含むと共に、通し番号としてＮ＋１を含むファイル名を設定する。 On the other hand, when the file number i is greater than the number N of files (step S13: YES), the control unit 11 sets the file name of the cool down song (step S14). Specifically, the control unit 11 sets a file name including the course number and today's date and including N + 1 as the serial number.

Next, the control unit 11 transmits the created jogging music data to the user PC 2 (step S15). Specifically, the control unit 11 transmits to the user PC 2 a download WEB page containing links to the warm-up song file, the jogging main part files (which include one or more storage target word voices), and the cool-down song file, and has the user select the files to download. The control unit 11 then receives from the user PC 2 the URL (Uniform Resource Locator) set in the link selected by the user, and transmits the file corresponding to that URL to the user PC 2.

When the control unit 11 has transmitted as many files as the user requires, it ends the main processing.

[1.3.3 Used music determination process]
Next, the used music determination process will be described with reference to FIG. 19.

図１９は、本実施形態に係る楽曲配信サーバ１の制御部１１の使用楽曲決定処理における処理例を示すフローチャートである。 FIG. 19 is a flowchart illustrating a processing example in the used music determination process of the control unit 11 of the music distribution server 1 according to the present embodiment.

使用楽曲決定処理が開始されると、図１９に示すように、制御部１１は、演奏リストを初期化する（ステップＳ１０１）。 When the use song determination process is started, as shown in FIG. 19, the control unit 11 initializes the performance list (step S101).

Next, the control unit 11 sets the running tempo TMP (step S102). The running tempo TMP is set based on the initial tempo, the end tempo, the file number i, and the file times FT(1) to FT(N). Specifically, when the initial tempo and the end tempo match, the control unit 11 sets the running tempo TMP to the initial tempo. When the initial tempo and the end tempo do not match, the control unit 11 calculates
(file initial tempo + file end tempo) / 2
and sets the running tempo TMP to the result. Here, the file initial tempo is the tempo of the first song in the i-th file of the jogging main part, and the file end tempo is the tempo of the last song in the i-th file of the jogging main part. The file initial tempo and the file end tempo can be obtained by the following equations.

file initial tempo = (initial tempo - end tempo) × (sum of FT(1) to FT(i-1)) / jogging time
file end tempo = (initial tempo - end tempo) × (sum of FT(1) to FT(i)) / jogging time

Having set the running tempo TMP, the control unit 11 calculates
TMP × FT(i) / 4
and sets the total number of measures in the file FM to the result (step S103).
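Steps S102 and S103 can be sketched as follows, with tempos in BPM, file times in minutes, and four beats per measure. The equations as printed contain no initial-tempo offset and use (initial tempo - end tempo), which would give zero for the first file. The sketch instead assumes a linear interpolation that starts at the initial tempo and reaches the end tempo at the end of the jogging time. That reading, the function name, and taking the jogging time as the sum of the file times are assumptions made here, not statements of the embodiment.

```python
def file_tempo_plan(initial_tempo, end_tempo, file_times, i):
    """Return (file initial tempo, file end tempo, TMP, FM) for file i.

    file_times -- list [FT(1), ..., FT(N)] in minutes
    i          -- 1-based file number
    """
    jogging_time = sum(file_times)  # assumed equal to FT(1) + ... + FT(N)

    def tempo_at(elapsed_minutes):
        # Assumed linear interpolation from the initial to the end tempo.
        change = end_tempo - initial_tempo
        return initial_tempo + change * elapsed_minutes / jogging_time

    file_initial = tempo_at(sum(file_times[:i - 1]))
    file_end = tempo_at(sum(file_times[:i]))
    if initial_tempo == end_tempo:
        tmp = initial_tempo
    else:
        tmp = (file_initial + file_end) / 2
    # BPM x minutes gives beats; four beats make one measure.
    fm = tmp * file_times[i - 1] / 4
    return file_initial, file_end, tmp, fm


constant = file_tempo_plan(160, 160, [30], 1)
rising = file_tempo_plan(160, 175, [30, 15], 2)
```

With a constant 160 BPM and one 30-minute file, FM is 160 × 30 / 4 = 1200 measures. Whether FM is rounded to a whole number of measures is not stated, so the sketch leaves it unrounded.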

Next, the control unit 11 sets the number of measures of the prelude and of the postlude (step S104). Specifically, the control unit 11 sets the prelude and postlude measure counts that correspond to the combination of the file number i and the number of files N. The control unit 11 also places a prelude of the set number of measures at the head of the performance list and a postlude of the set number of measures at its end.

Next, the control unit 11 sets the allowable maximum music measure number MPM (step S105). Specifically, when the file time FT(i) is 30 minutes, the control unit 11 sets MPM to 210. When the file time FT(i) is 15 minutes, the control unit 11 sets MPM to the value corresponding to the running tempo TMP, as shown in FIG. 15(d).

次いで、制御部１１は、ジョギング本編を構成する複数の楽曲を選定する（ステップＳ１０６）。具体的に、制御部１１は、複数の楽曲を選定すべく、楽曲選択処理を複数回実行する。 Next, the control unit 11 selects a plurality of music pieces constituting the jogging main part (step S106). Specifically, the control unit 11 executes the music selection process a plurality of times in order to select a plurality of music.

In the music selection process, the control unit 11 first refers to the history information of the identified user and transmits to the user PC 2 a WEB page presenting the album, artist, or genre that the user selected previously (hereinafter simply "album etc.") together with a list of the songs (including storage target word voices) included in that album etc. If the name of a storage target word voice was entered in step S3, the control unit 11 transmits to the user PC 2 a WEB page presenting a list that includes the storage target word voice corresponding to the entered name.

Here, when the user operates the user PC 2 and selects a song or storage target word group without changing the presented album etc. (including storage target word voices; the same applies hereinafter) to another album etc., a music selection request for that song is transmitted from the user PC 2 to the music distribution server 1.

一方、ユーザが、提示されたアルバム等を別のアルバム等に変更した場合には、選択されたアルバム等のアルバム等選択リクエストがユーザＰＣ２から楽曲配信サーバ１に送信され、制御部１１は、これに応じて、アルバム等のリストやアルバム等の検索メニューを提示するＷＥＢページデータをユーザＰＣ２に送信する。そして、ユーザがアルバム等を選択し、そのアルバム等に含まれる複数の楽曲又は記憶対象語音声の中から所望する楽曲又は記憶対象語音声を選択すると、その楽曲又は記憶対象語音声の楽曲選択リクエストがユーザＰＣ２から楽曲配信サーバ１に送信される。 On the other hand, when the user changes the presented album or the like to another album or the like, an album selection request for the selected album or the like is transmitted from the user PC 2 to the music distribution server 1, and the control unit 11 In response, WEB page data presenting a list of albums and a search menu of albums is transmitted to the user PC 2. Then, when the user selects an album or the like and selects a desired song or storage target word voice from a plurality of songs or storage target word sounds included in the album or the like, a music selection request for the music or storage target word sounds Is transmitted from the user PC 2 to the music distribution server 1.

制御部１１は、この選択リクエストを受信することによって、１曲の楽曲データを特定し選定する。そして、制御部１１は、選定した楽曲を、選定した順番に相当する曲順目の楽曲として演奏リストにセットする。 By receiving this selection request, the control unit 11 specifies and selects one piece of music data. And the control part 11 sets the selected music to a performance list as a music of the music order corresponding to the selected order.

Here, when having the user select a song, the control unit 11 limits the selectable songs to those whose number of measures is no more than the allowable maximum music measure number MPM. For example, when transmitting a WEB page presenting a list of songs to the user PC 2, the control unit 11 excludes from the list any song with more measures than MPM.

By repeating the music selection process, the control unit 11 ensures that at least three songs are selected and that songs are selected up to the number that can be included in one file. Specifically, the control unit 11 repeats the music selection process until the remaining number of measures falls below the allowable maximum music measure number MPM, where the remaining number of measures is the total number of measures in the file FM minus the sum of the measures of the prelude, the postlude, the songs selected so far, and the shortest inter-song part for each of those songs.
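The stopping condition of the repeated music selection can be sketched as below. The function names, the 8-measure prelude and postlude in the example, and the omission of the one-measure overlap between parts are assumptions for illustration. The 12-measure minimum is the shortest inter-song part named in the text.

```python
MIN_LINK_MEASURES = 12  # shortest inter-song part


def selectable(catalog_measures, mpm):
    """Keep only songs whose measure count does not exceed MPM."""
    return [m for m in catalog_measures if m <= mpm]


def remaining_measures(fm, prelude, postlude, selected):
    """Measures still free in the file after the songs selected so far."""
    # One shortest inter-song part is charged for every selected song.
    links = MIN_LINK_MEASURES * len(selected)
    return fm - (prelude + postlude + sum(selected) + links)


def needs_more_songs(fm, prelude, postlude, selected, mpm, min_songs=3):
    """True while another song should be selected."""
    if len(selected) < min_songs:
        return True
    return remaining_measures(fm, prelude, postlude, selected) >= mpm


# Hypothetical file: FM = 1200, 8-measure prelude and postlude, MPM = 210.
keep_going = needs_more_songs(1200, 8, 8, [200, 180, 210], 210)
finished = not needs_more_songs(1200, 8, 8, [200, 180, 210, 200, 190], 210)
```

Stopping once fewer than MPM measures remain means that any song the user was still allowed to pick would always have fitted in the file.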

なお、演奏時間が演奏開始から１５分を経過した以降は、制御部１１は、芸術音楽としての楽曲ではなく記憶対象語音声を選定していくこととなる。 Note that, after the performance time has passed 15 minutes from the start of the performance, the control unit 11 selects the storage target word voice instead of the music as art music.

When the selection of songs is complete, the control unit 11 sets, in the gap located roughly at the center, the inter-song part with the largest possible number of measures from among the 12 kinds of inter-song parts ranging from 12 to 144 measures (step S107). Specifically, based on the allowable maximum music measure number MPM and the numbers of measures of the prelude, the postlude, and the selected songs, the control unit 11 sets the largest inter-song part that fits in the roughly central gap when every other gap is given the minimum inter-song part of 12 measures. Here, the roughly central gap is the gap after song number (number of songs / 2) when the number of selected songs is even, and, for example, the gap after song number ((number of songs + 1) / 2) when the number of selected songs is odd.
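Step S107 can be sketched as follows. Several points are assumptions: that the 12 kinds of inter-song parts are the multiples of 12 from 12 to 144 measures, that the (number of songs + 1) / 2 rule applies to an odd number of songs, that there is one gap between each pair of adjacent songs, and that the measure budget is the total number of measures in the file FM. The function names are hypothetical.

```python
LINK_SIZES = [12 * k for k in range(1, 13)]  # 12, 24, ..., 144 (assumed)


def center_gap_song(num_songs):
    """1-based number of the song after which the central gap lies."""
    if num_songs % 2 == 0:
        return num_songs // 2
    return (num_songs + 1) // 2


def center_link_size(fm, prelude, postlude, songs):
    """Largest inter-song part that fits in the central gap, or None."""
    gaps = len(songs) - 1
    if gaps < 1:
        return None
    # Every gap other than the central one is charged the 12-measure minimum.
    others = LINK_SIZES[0] * (gaps - 1)
    budget = fm - prelude - postlude - sum(songs) - others
    fitting = [size for size in LINK_SIZES if size <= budget]
    return max(fitting) if fitting else None


songs = [200, 180, 210, 200, 190]
center_after = center_gap_song(len(songs))
center_size = center_link_size(1200, 8, 8, songs)
```

For five songs the central gap follows the third song, and with the hypothetical figures above the largest part, 144 measures, fits.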

Next, the control unit 11 also sets inter-song parts in all the remaining gaps other than the roughly central one (step S108). Specifically, the control unit 11 chooses each inter-song part from among the 12 kinds ranging from 12 to 144 measures so that the numbers of measures of the inter-song parts are as even as possible and so that the total number of measures of the prelude, the postlude, the selected songs, and the inter-song parts comes as close as possible to the total number of measures in the file FM. When this is done, the control unit 11 ends the used music determination process.
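One way to spread the remaining measures evenly is sketched below. It assumes that inter-song parts come in multiples of 12 measures up to 144 and that the total should approach the budget from below. The function name and the idea of passing in a precomputed measure budget for the remaining gaps are illustrative assumptions.

```python
def spread_links(budget, gaps, unit=12, max_units=12):
    """Choose one inter-song part per gap, as evenly sized as possible.

    budget -- measures available for these gaps
    gaps   -- number of gaps to fill
    """
    if gaps <= 0:
        return []
    # Work in 12-measure units; each gap gets at least 1 and at most 12.
    units = max(gaps, min(budget // unit, gaps * max_units))
    base, extra = divmod(units, gaps)
    # 'extra' gaps get one unit more, so sizes differ by at most one unit.
    return [unit * (base + 1)] * extra + [unit * base] * (gaps - extra)


parts = spread_links(100, 3)
```

With a budget of 100 measures and three gaps, the sketch yields parts of 36, 36, and 24 measures (96 in total). When the budget is too small, every gap still receives the 12-measure minimum, so the total can exceed the budget in that case.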

[1.3.4 Jogging music data creation process]
Next, the jogging music data creation process will be described with reference to FIG. 20.

図２０は、本実施形態に係る楽曲配信サーバ１の制御部１１のジョギング用楽曲データ作成処理における処理例を示すフローチャートである。 FIG. 20 is a flowchart illustrating a processing example in the jogging music data creation process of the control unit 11 of the music distribution server 1 according to the present embodiment.

When the jogging music data creation process starts, as shown in FIG. 20, the control unit 11 acquires from the part WAV data database the music body data of the songs, or the storage target word voice data of the storage target word groups, set in the performance list, and also acquires from the part WAV data database the inter-song link data of the prelude, the postlude, and the inter-song parts set in the performance list (step S301). Each piece of inter-song link data acquired here is one that matches both the rhythm pattern of the music body data or storage target word voice data connected before it and the rhythm pattern of the music body data connected after it.

次いで、制御部１１は、取得した楽曲本体データ又は記憶対象語音声データと、曲間つなぎデータとを、演奏リストにセットされている順序で接続して、ジョギング本編部分のＷＡＶデータを作成する（ステップＳ３０２）。このとき、制御部１１は、各楽曲又は記憶対象語音声の最初の１小節とその前の曲間部の最後の１小節とを重複させて接続する。 Next, the control unit 11 connects the acquired music main body data or storage target word sound data and the inter-music connection data in the order set in the performance list, and creates the WAV data of the jogging main part ( Step S302). At this time, the control unit 11 connects the first one measure of each music piece or the speech to be stored and the last one measure of the previous inter-music portion in an overlapping manner.

次いで、制御部１１は、ファイルの初期テンポから最終テンポまで滑らかにテンポが変化するように小節単位でテンポの調整を行いながら、ジョギング本編部分のＷＡＶデータを変換する（ステップＳ３０３）。この結果、初期テンポ＝最終テンポである場合には、全小節が同一のテンポとなり、初期テンポ＜最終テンポである場合には、１小節毎にテンポが少しずつ上昇する。 Next, the control unit 11 converts the WAV data of the main part of jogging while adjusting the tempo in units of measures so that the tempo smoothly changes from the initial tempo to the final tempo of the file (step S303). As a result, when initial tempo = final tempo, all measures become the same tempo, and when initial tempo <final tempo, the tempo slightly increases for each measure.

次いで、制御部１１は、ガイダンス音声の音声データを、対応する位置の曲間にセットされる曲間部の曲間つなぎデータに合成する（ステップＳ３０４）。 Next, the control unit 11 synthesizes the voice data of the guidance voice with the inter-music connection data of the inter-music part set between the music at the corresponding positions (step S304).

Next, the control unit 11 creates the file of the i-th jogging music data by having the encoder unit 14 convert the jogging main part to MP3 format, and stores the file in the storage unit 12 (step S305).

次いで、制御部１１は、ｉ番目のジョギング用楽曲データのファイル名を設定する（ステップＳ３０６）。具体的に、制御部１１は、コース番号及び今日の日付を含むと共に、通し番号としてｉ＋１を含むファイル名を設定する。 Next, the control unit 11 sets the file name of the i-th song data for jogging (step S306). Specifically, the control unit 11 sets a file name including the course number and today's date and including i + 1 as the serial number.

このようにして、制御部１１は、ジョギング用楽曲データをユーザＰＣ２によってダウンロード可能なように記憶させる。つまり、制御部１１は、ダウンロード用のＷＥＢページからユーザの選択に基づいてユーザＰＣ２から送信されてきたＵＲＬに対応するジョギング用楽曲データを記憶部１２から取得し送信することができるように、ジョギング用楽曲データを、記憶位置を示すパス及びファイル名を設定して記憶させるのである。 In this way, the control unit 11 stores the jogging music data so that it can be downloaded by the user PC 2. That is, the control unit 11 performs jogging so that the jogging music data corresponding to the URL transmitted from the user PC 2 based on the user's selection from the download WEB page can be acquired from the storage unit 12 and transmitted. The musical composition data is stored by setting the path and file name indicating the storage position.

制御部１１は、この処理を終えると、ジョギング用楽曲データ作成処理を終了させる。 When this process is finished, the controller 11 ends the jogging music data creation process.

以上説明したように、本実施形態によれば、制御部１１が、英単語群ＷＡＶデータ３０２、日本語訳群ＷＡＶデータ３０３及びドラムベースＷＡＶデータ３０４に基づいて、互いに異なる複数の記憶対象語夫々について複数回その発声音が発音されるとともに、リズム音が発音され、且つ、英単語のアクセントの発音タイミングが拍のタイミングに合わせて英単語と日本語訳とが交互に発音されるように構成された記憶対象語音声データを作成する。ここで、制御部１１が、日本語訳の発音終了からその次に発音される英単語の発音開始までの時間間隔が最低間隔時間Ａ以上になるように、英単語のアクセントの発音タイミングを拍のタイミングに合わせたときの前記時間間隔が最低間隔時間Ａ未満となる英単語のアクセントが拍のタイミングからずれて発音される記憶対象語音声データを作成する。そして、制御部１１が、作成された記憶対象語音声データをパーツＷＡＶデータデータベースに登録し、登録された記憶対象語音声データを含むジョギング用楽曲データをユーザＰＣ２に送信する。 As described above, according to the present embodiment, the control unit 11 creates, based on the English word group WAV data 302, the Japanese translation group WAV data 303, and the drum base WAV data 304, storage target word speech data configured so that each of a plurality of mutually different storage target words is uttered a plurality of times, a rhythm sound is sounded, and the English words and their Japanese translations are uttered alternately with the accent of each English word timed to a beat. Here, the control unit 11 creates the storage target word speech data so that the time interval from the end of a Japanese translation to the start of the English word uttered next is at least the minimum interval time A: for an English word for which that interval would be less than the minimum interval time A if its accent were aligned with the beat, the accent is uttered off the beat. The control unit 11 then registers the created storage target word speech data in the parts WAV data database and transmits jogging music data including the registered storage target word speech data to the user PC 2.

従って、原則、英単語のアクセントの発音タイミングは拍のタイミングに合わせられているが、日本語訳の発音終了から当該日本語訳の次に発音される英単語の発音開始までの時間間隔が最低間隔時間Ａ未満となる当該英単語については、アクセントの発音タイミングが拍のタイミングからずれていることによって、日本語訳と英単語との時間間隔が最低間隔時間Ａ以上となる記憶対象語音声データが作成される。よって、作成された記憶対象語音声データが携帯音楽プレーヤ３により再生処理されることによって、英単語と日本語訳とを聞き分けることが容易な間隔で英単語と日本語訳とを再生することが可能となるので、英単語とその日本語訳とをユーザが確実に記憶することができる。 Therefore, in principle, the accent of each English word is timed to a beat. However, for an English word for which the time interval from the end of a Japanese translation to the start of the English word uttered next would be less than the minimum interval time A, the accent is shifted off the beat, so that storage target word speech data is created in which the interval between the Japanese translation and the English word is at least the minimum interval time A. When the created storage target word speech data is reproduced by the portable music player 3, the English words and the Japanese translations are therefore reproduced at intervals that make them easy to tell apart, so the user can reliably memorize the English words and their Japanese translations.

また、英単語のアクセントの発音タイミングは、あくまでも拍からずらされ、例えば、第１拍のタイミングから第２拍のタイミングにずらされる（英単語のアクセントの発音タイミングは結局拍のタイミングに合わされる）わけではないので、単位時間当たりに再生される単語数が減ることを防止することができ、英単語とその日本語訳とをユーザが効率的に記憶することができる。 Moreover, the accent of the English word is merely shifted off the beat; it is not, for example, moved from the timing of the first beat to the timing of the second beat (in which case the accent would end up aligned with a beat after all). This prevents the number of words reproduced per unit time from decreasing, so the user can memorize the English words and their Japanese translations efficiently.

また、日本語訳と英単語との時間間隔が最低間隔時間Ａ以上となる英単語の発音開始タイミングが拍のタイミングに合わせられた記憶対象語音声データが作成される。 In addition, the storage target word voice data is generated in which the pronunciation start timing of the English word whose time interval between the Japanese translation and the English word is the minimum interval time A or more is matched with the beat timing.

従って、英単語の発声音がリズム良く再生され、英単語とその日本語訳とをより記憶させやすくすることができる。 Therefore, the utterance sound of the English word is reproduced with good rhythm, and the English word and its Japanese translation can be more easily memorized.

また、リズム音における１小節中の第１拍と第２拍とが英単語に割り当てられ、且つ、当該１小節中の第３拍と第４拍とが日本語訳とに割り当てられており、日本語訳と英単語との時間間隔が最低間隔時間Ａ以上となる英単語が第１アクセントと第２アクセントとを有し、且つ、第１アクセントが第２アクセントよりも後に発音される単語である場合、当該英単語の発音開始タイミングが第１拍に合わせられ、且つ、第１アクセントの発音タイミングが第２拍に合わせられた記憶対象語音声データが作成される。 In addition, the first and second beats in one measure of the rhythm sound are assigned to English words, and the third and fourth beats in the one measure are assigned to the Japanese translation, An English word whose time interval between the Japanese translation and the English word is the minimum interval time A or more has a first accent and a second accent, and the first accent is a word pronounced after the second accent. In some cases, the storage target word sound data is generated in which the pronunciation start timing of the English word is set to the first beat and the pronunciation timing of the first accent is set to the second beat.

従って、第１アクセントと第２アクセントとを有する英単語の発声音がリズム良く再生され、英単語とその日本語訳とをより記憶させやすくすることができる。 Therefore, the utterance sound of an English word having a first accent and a second accent is reproduced with good rhythm, making the English word and its Japanese translation easier to memorize.

［２．第２実施形態］ [2. Second Embodiment]

上記説明した第１実施形態においては、発音タイミングが予め調整された英単語群ＷＡＶデータ３０２と日本語訳群ＷＡＶデータ３０３とを用いて記憶対象語音声データが作成されるようになっていた。これに対し、以下に説明する第２実施形態においては、ユーザにより指定された英単語の音声データ及び指定された英単語に対応する日本語訳の音声データを用いて、発音タイミングを調整しつつ記憶対象語音声データが作成される。 In the first embodiment described above, the storage target word speech data is created using the English word group WAV data 302 and the Japanese translation group WAV data 303, whose pronunciation timing is adjusted in advance. In contrast, in the second embodiment described below, the storage target word speech data is created, while the pronunciation timing is adjusted, using the speech data of the English words designated by the user and the speech data of the Japanese translations corresponding to the designated English words.

本実施形態において、通信システムＳの構成及び楽曲配信サーバ１のハードウエア構成は、第１実施形態の場合と同様であるので、詳細な説明は省略する。 In the present embodiment, the configuration of the communication system S and the hardware configuration of the music distribution server 1 are the same as in the case of the first embodiment, and thus detailed description thereof is omitted.

なお、本実施形態における制御部１１は、本発明において、生成手段、送信手段、判定手段、第１語指定情報受信手段、第１語音声情報取得手段及び第２語音声情報取得手段の一例を構成する。また、記憶部１２は、記憶手段及び第１語情報記憶手段の一例を構成する。 In the present invention, the control unit 11 of this embodiment constitutes an example of the generation means, transmission means, determination means, first word designation information receiving means, first word speech information acquisition means, and second word speech information acquisition means. The storage unit 12 constitutes an example of the storage means and the first word information storage means.

［２．１辞書データベースの構成］ [2.1 Dictionary database configuration]

次に、本実施形態に係る辞書データベースの構成について、図２１を用いて説明する。 Next, the configuration of the dictionary database according to the present embodiment will be described with reference to FIG. 21.

図２１は、辞書データベースに登録される情報の内容の一例を示す図である。本実施形態において、記憶部１２には、パーツＷＡＶデータデータベースに加えて、辞書データベースが構築されている。 FIG. 21 is a diagram illustrating an example of the content of information registered in the dictionary database. In the present embodiment, a dictionary database is constructed in the storage unit 12 in addition to the parts WAV data database.

図２１に示すように、辞書データベースには、各英単語に関する情報が登録されている。具体的に、辞書データベースには、英単語毎に、英単語のスペル（第１語情報の一例）、英単語ＷＡＶデータ３０５（第１語音声情報の一例）、アクセント位置情報（アクセント情報の一例）、日本語訳、及び日本語訳ＷＡＶデータ３０６（第２語音声情報の一例）、日本語訳発音時間が対応付けて登録されている。ここで、英単語ＷＡＶデータ３０５及び日本語訳ＷＡＶデータ３０６は、本発明のペア語の情報の一例である。 As shown in FIG. 21, information about each English word is registered in the dictionary database. Specifically, for each English word, the dictionary database registers the following in association with one another: the spelling of the English word (an example of first word information), the English word WAV data 305 (an example of first word speech information), the accent position information (an example of accent information), the Japanese translation, the Japanese translation WAV data 306 (an example of second word speech information), and the Japanese translation pronunciation time. Here, the English word WAV data 305 and the Japanese translation WAV data 306 are an example of the pair word information of the present invention.

英単語のスペルは、英語のスペルの文字データであり、例えば、ユーザから指定された英単語を特定するための識別情報としての役割を有している。 The spelling of English words is English spelling character data, and has, for example, a role as identification information for specifying English words designated by the user.

英単語ＷＡＶデータ３０５は、対応する英単語の発声が開始されてから終了するまでの当該発声音が記録されたＷＡＶデータである。この英単語ＷＡＶデータ３０５における英単語の発声のテンポは１４０ＢＰＭに調整されている。また、英単語ＷＡＶデータ３０５としては、発音パターン１と発音パターン２の２種類のＷＡＶデータが、辞書データベースに登録されている。発音パターン１の英単語ＷＡＶデータ３０５は、英単語のアクセントの発音タイミングを第１拍のタイミングに合わせた場合における発声音のＷＡＶデータである。また、発音パターン２の英単語ＷＡＶデータ３０５は、英単語の発音開始タイミングを第１拍のタイミングに合わせた場合における発声音のＷＡＶデータである。発音のタイミングによって、リズム音に合う発音の仕方が変わる場合がある。そのため、２種類の英単語ＷＡＶデータ３０５が用意されている。 The English word WAV data 305 is WAV data in which the utterance sound of the corresponding English word is recorded from the start of the utterance to its end. The tempo of the utterance in the English word WAV data 305 is adjusted to 140 BPM. Two types of English word WAV data 305, pronunciation pattern 1 and pronunciation pattern 2, are registered in the dictionary database. The English word WAV data 305 of pronunciation pattern 1 is the WAV data of the utterance sound when the accent of the English word is aligned with the timing of the first beat. The English word WAV data 305 of pronunciation pattern 2 is the WAV data of the utterance sound when the start of the English word is aligned with the timing of the first beat. The way of uttering a word that fits the rhythm sound may change depending on the pronunciation timing, which is why two types of English word WAV data 305 are prepared.

また、アクセントを複数有し、且つ、第１アクセントが第２アクセントよりも後に発音される英単語については、拍のタイミングで発音が開始され、且つ、当該拍の次の拍のタイミングで第１アクセントが発音されるように、発音パターン２の英単語ＷＡＶデータ３０５が調整されている。従って、発音パターン２の英単語ＷＡＶデータ３０５を用いて英単語の発音開始タイミングを第１拍に合わせることによって、当該英単語の第１アクセントの発音タイミングは、第２拍に自動的に合わせられることとなる。なお、例えば「book」等の、語頭にアクセントがある英単語については、発音タイミングがずらされることがないので、発音パターンとしては発音パターン１のみで良い。従って、このような英単語については、発音パターン１の英単語ＷＡＶデータ３０５を登録しておき、発音パターン２の英単語ＷＡＶデータ３０５は登録しなくても良い。 For English words that have a plurality of accents and the first accent is pronounced after the second accent, the pronunciation is started at the timing of the beat, and the first at the timing of the next beat of the beat. The English word WAV data 305 of the pronunciation pattern 2 is adjusted so that the accent is pronounced. Accordingly, by using the English word WAV data 305 of the pronunciation pattern 2 to match the pronunciation start timing of the English word with the first beat, the pronunciation timing of the first accent of the English word is automatically adjusted to the second beat. It will be. Note that, for an English word with an accent at the beginning, such as “book”, the pronunciation timing is not shifted, so that only the pronunciation pattern 1 is sufficient as the pronunciation pattern. Therefore, for such English words, the English word WAV data 305 of the pronunciation pattern 1 may be registered, and the English word WAV data 305 of the pronunciation pattern 2 may not be registered.

アクセント位置情報は、対応する英単語の発音パターン１における発音開始からアクセントまでの発音時間である。つまり、アクセント位置情報は、対応する発音パターン１の英単語ＷＡＶデータ３０５を最初からアクセントの再生位置まで再生するのに要する時間である。 The accent position information is the pronunciation time from the start of pronunciation to the accent in the pronunciation pattern 1 of the corresponding English word. That is, the accent position information is the time required to reproduce the English word WAV data 305 of the corresponding pronunciation pattern 1 from the beginning to the accent reproduction position.

日本語訳は、対応する英単語の日本語訳の文字データである。 The Japanese translation is character data of the Japanese translation of the corresponding English word.

日本語訳ＷＡＶデータ３０６は、対応する日本語訳の発声が開始されてから終了するまでの当該発声音が記録されたＷＡＶデータである。この日本語訳ＷＡＶデータ３０６における日本語訳の発声のテンポは１４０ＢＰＭに調整されている。 The Japanese translation WAV data 306 is WAV data in which the utterance sound from the start of the corresponding Japanese translation to the end thereof is recorded. The tempo of the Japanese translation in the Japanese translation WAV data 306 is adjusted to 140 BPM.

日本語訳発音時間は、対応する日本語訳の発音に要する時間である。つまり、日本語訳発音時間は、日本語訳ＷＡＶデータ３０６を最初から最後まで再生するのに要する時間である。 The Japanese translation pronunciation time is the time required for pronunciation of the corresponding Japanese translation. That is, the Japanese translation pronunciation time is the time required to reproduce the Japanese translation WAV data 306 from the beginning to the end.
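The dictionary database fields described above can be pictured as one record per English word. The following sketch is only an illustration: the class name `DictionaryEntry`, the field names, the file paths, and the numeric values are all hypothetical, and the database of this embodiment is not necessarily organized this way. The optional pattern-2 field reflects the remark that words accented at the beginning, such as "book", need only pronunciation pattern 1.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DictionaryEntry:
    spelling: str                     # spelling of the English word (lookup key)
    wav_pattern1: str                 # English word WAV data 305, pronunciation pattern 1
    wav_pattern2: Optional[str]       # pattern 2; may be absent for words accented at the start
    accent_position_s: float          # time from start of pattern 1 to the accent, in seconds
    translation: str                  # Japanese translation (text)
    translation_wav: str              # Japanese translation WAV data 306
    translation_duration_s: float     # Japanese translation pronunciation time, in seconds


def build_index(entries: list[DictionaryEntry]) -> dict[str, DictionaryEntry]:
    """Index entries by spelling so a designated word can be looked up directly."""
    return {entry.spelling: entry for entry in entries}


if __name__ == "__main__":
    # Illustrative values only; they are not taken from FIG. 21.
    index = build_index([
        DictionaryEntry("reserve", "reserve_p1.wav", "reserve_p2.wav", 0.20,
                        "予約する", "reserve_ja.wav", 0.60),
        DictionaryEntry("book", "book_p1.wav", None, 0.0, "本", "book_ja.wav", 0.30),
    ])
    print(index["book"].wav_pattern2 is None)
```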

［２．２記憶対象語音声データの作成方法］ [2.2 Method of creating storage target word speech data]

次に、本実施形態における記憶対象語音声データの作成方法について、図２２を用いて説明する。 Next, a method of creating the storage target word speech data in the present embodiment will be described with reference to FIG. 22.

図２２は、記憶対象語音声データの作成例を示す図である。本実施形態において、記憶対象語音声データの作成には、記憶対象語用ドラムベースＷＡＶデータ３０４、英単語ＷＡＶデータ３０５及び日本語訳ＷＡＶデータ３０６が用いられる。従って、本実施形態においては、英単語群ＷＡＶデータ３０２及び日本語訳群ＷＡＶデータ３０３は必要ない。 FIG. 22 is a diagram illustrating an example of creating the storage target word speech data. In this embodiment, the storage target word audio data is created using the storage target word drum base WAV data 304, the English word WAV data 305, and the Japanese translation WAV data 306. Therefore, in this embodiment, the English word group WAV data 302 and the Japanese translation group WAV data 303 are not necessary.

記憶学習したい英単語をユーザが複数（例えば、２５単語等）指定する。指定された英単語を示す情報は、指定単語情報（第１語指定情報の一例）としてユーザＰＣ２から楽曲配信サーバ１に送信される。 The user designates a plurality of English words (for example, 25 words) to be memorized. Information indicating the designated English word is transmitted from the user PC 2 to the music distribution server 1 as designated word information (an example of first word designation information).

図２２に示すように、指定単語情報に基づいて、指定された英単語が辞書データベースから検索される。検索された各英単語に対応して、発音パターン１及び２の英単語ＷＡＶデータ３０５と日本語訳ＷＡＶデータ３０６が辞書データベースに登録されている。 As shown in FIG. 22, the designated English word is searched from the dictionary database based on the designated word information. Corresponding to each searched English word, English word WAV data 305 and Japanese translation WAV data 306 of pronunciation patterns 1 and 2 are registered in the dictionary database.

制御部１１は、検索された英単語の英単語ＷＡＶデータ３０５及び対応する日本語訳ＷＡＶデータ３０６を、記憶対象語用ドラムベースＷＡＶデータ３０４に混合合成して、記憶対象語音声データを作成する。ここで、各英単語及び日本語訳について、英単語が指定された順に４小節ずつ英単語と日本語訳の発声音が発音されるように混合合成される。例えば、１番目に指定された英単語が「entertainment」である場合、「エンターテイメントゴラク」がリズム音の１小節目から４小節目まで繰り返し発音されるように、発音タイミングが調整される。また、２番目に指定された英単語が「reserve」である場合、「リザーブヨヤクスル」がリズム音の５小節目から８小節目まで４回繰り返し発音されるように、発音タイミングが調整される。 The control unit 11 mixes and synthesizes the English word WAV data 305 of the searched English word and the corresponding Japanese translation WAV data 306 with the drum base WAV data 304 for the storage target word to create storage target word voice data. . Here, each English word and the Japanese translation are mixed and synthesized so that the utterances of the English word and the Japanese translation are pronounced in four measures in the order in which the English words are designated. For example, when the first designated English word is “entertainment”, the pronunciation timing is adjusted so that “entertainment goraku” is repeatedly pronounced from the first bar to the fourth bar of the rhythm sound. When the second designated English word is “reserve”, the pronunciation timing is adjusted so that “reserve yoyakusuru” is repeatedly pronounced four times from the fifth bar to the eighth bar of the rhythm sound.

また、各日本語訳については、夫々第３拍のタイミングで発音が開始されるように、発音タイミングが調整される。 For each Japanese translation, the pronunciation timing is adjusted so that the pronunciation starts at the timing of the third beat.
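The layout just described (four measures per word, with each Japanese translation starting on the third beat) can be turned into concrete start times. The sketch below assumes four beats per measure, which follows from beats 1 and 2 being assigned to the English word and beats 3 and 4 to the translation, and a constant tempo; the function name and the zero-based word index are illustrative choices.

```python
BEATS_PER_MEASURE = 4   # beats 1-2: English word, beats 3-4: Japanese translation
MEASURES_PER_WORD = 4   # each word pair is repeated over four measures


def translation_start_times(word_index: int, tempo_bpm: float = 140.0) -> list[float]:
    """Start times (seconds) of the Japanese translation for one designated word.

    word_index is the zero-based position of the word in the designated order.
    Assumes a constant tempo over the whole rhythm track.
    """
    beat = 60.0 / tempo_bpm
    measure = beat * BEATS_PER_MEASURE
    first_measure = word_index * MEASURES_PER_WORD
    # The third beat begins two beats after the start of each measure.
    return [(first_measure + m) * measure + 2 * beat for m in range(MEASURES_PER_WORD)]


if __name__ == "__main__":
    # 120 BPM is used here only because it gives round numbers.
    print(translation_start_times(0, tempo_bpm=120.0))
    print(translation_start_times(1, tempo_bpm=120.0))
```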

一方、各英単語については、夫々第１拍のタイミングでアクセントが発音されるか、又は、第１拍のタイミングで発音が開始されるかが決定される。ここでは、検索された英単語に対応するアクセント位置情報と日本語訳発音時間とに基づいて判定が行われる。日本語訳と英単語との時間間隔は以下の式により求めることができる。 On the other hand, for each English word, it is determined whether the accent is pronounced at the timing of the first beat or the pronunciation is started at the timing of the first beat. Here, the determination is performed based on the accent position information corresponding to the searched English word and the Japanese translation pronunciation time. The time interval between the Japanese translation and the English word can be obtained by the following formula.

時間間隔＝６０／１４０×２−日本語訳発音時間−アクセント位置情報が示す発音時間 Time interval = 60 / 140 × 2 − Japanese translation pronunciation time − pronunciation time indicated by the accent position information

求められた時間間隔が最低間隔時間Ａ以上である英単語については、第１拍のタイミングでアクセントが発音されるように発音タイミングが調整される。またこの場合、発音パターン１の英単語ＷＡＶデータ３０５を用いて記憶対象語音声データが作成される。 For English words whose calculated time interval is equal to or greater than the minimum interval time A, the pronunciation timing is adjusted so that the accent is uttered at the timing of the first beat. In this case, the storage target word speech data is created using the English word WAV data 305 of pronunciation pattern 1.

一方、求められた時間間隔が最低間隔時間Ａ未満である英単語については、第１拍のタイミングで発音が開始されるように発音タイミングが調整される。またこの場合、発音パターン２の英単語ＷＡＶデータ３０５を用いて記憶対象語音声データが作成される。 On the other hand, for English words whose calculated time interval is less than the minimum interval time A, the pronunciation timing is adjusted so that pronunciation is started at the timing of the first beat. In this case, the storage target word voice data is created using the English word WAV data 305 of the pronunciation pattern 2.
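The formula and the two cases above amount to a small decision function. In the formula, 60 / 140 is the length of one beat in seconds at 140 BPM, and the factor 2 covers the two beats (third and fourth) allotted to the Japanese translation. The sketch below applies that formula directly; the function names and the sample durations, including the value used for the minimum interval time A, are hypothetical.

```python
TEMPO_BPM = 140.0


def interval_seconds(translation_duration_s: float, accent_position_s: float,
                     tempo_bpm: float = TEMPO_BPM) -> float:
    """Gap between the end of the translation and the start of the English word
    when the accent of the English word is placed on the first beat."""
    two_beats = 60.0 / tempo_bpm * 2
    return two_beats - translation_duration_s - accent_position_s


def choose_pattern(translation_duration_s: float, accent_position_s: float,
                   min_interval_a_s: float) -> int:
    """Return 1 (accent on the first beat) or 2 (word start on the first beat)."""
    gap = interval_seconds(translation_duration_s, accent_position_s)
    return 1 if gap >= min_interval_a_s else 2


if __name__ == "__main__":
    # Illustrative values: A = 0.1 s is an assumption, not a value from the source.
    print(round(interval_seconds(0.5, 0.1), 3))
    print(choose_pattern(0.5, 0.1, min_interval_a_s=0.1))
    print(choose_pattern(0.5, 0.3, min_interval_a_s=0.1))
```

At 140 BPM two beats last about 0.857 seconds, so a long translation followed by a word with a late accent quickly uses up the available gap and falls into the pattern 2 case.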

以上のような方法により、第１実施形態における記憶対象語音声データと同じ構成の記憶対象語音声データを作成することができる。 By the method as described above, it is possible to create the storage target word speech data having the same configuration as the storage target word speech data in the first embodiment.

そして、作成された記憶対象語音声データは、パーツＷＡＶデータデータベースに登録される。この場合、記憶対象語音声データは、英単語を指定したユーザのユーザＩＤに対応付けて登録される。 Then, the created storage target word speech data is registered in the parts WAV data database. In this case, the storage target word voice data is registered in association with the user ID of the user who specified the English word.

なお、前記１．２．２．２項で説明した場合と同様に、日本語訳と英単語との時間間隔に基づいて英単語の発音タイミングをずらすか否かの判定を行う以外にも、英単語の発音時間や文字数等に基づいて判定を行うことができる。 As in the case described in the above section 1.2.2.2, in addition to determining whether to shift the pronunciation timing of the English word based on the time interval between the Japanese translation and the English word, The determination can be made based on the pronunciation time of English words, the number of characters, and the like.

例えば、英単語のアクセント位置情報が示す発音時間と許容発音時間Ｂとの比較結果に基づいて、発音タイミングをずらすか否かを判定しても良い。この場合、日本語訳発音時間は必要ない。 For example, it may be determined whether or not to shift the pronunciation timing based on the comparison result between the pronunciation time indicated by the accent position information of the English word and the allowable pronunciation time B. In this case, Japanese translation pronunciation time is not required.

また、例えば、アクセントのある文字よりも前にある文字の数と許容文字数Ｃとの比較結果に基づいて、発音タイミングをずらすか否かを判定しても良い。この場合、アクセント位置情報として、アクセントのある文字よりも前にある文字の数を辞書データベースに登録しておく。またこの場合も、日本語訳発音時間は必要ない。 Further, for example, it may be determined whether or not to shift the sound generation timing based on the comparison result between the number of characters preceding the accented character and the allowable number of characters C. In this case, the number of characters preceding the accented character is registered in the dictionary database as accent position information. In this case, Japanese translation pronunciation time is not required.

判定に用いられる最低間隔時間Ａ、許容発音時間Ｂ又は許容文字数Ｃは、予め記憶部１２に記憶される。 The minimum interval time A, the allowable pronunciation time B, or the allowable character number C used for the determination is stored in the storage unit 12 in advance.
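The two alternative criteria can be sketched as simple threshold comparisons. The text only says that the decision is based on a comparison with the allowable pronunciation time B or the allowable number of characters C; the direction of the comparison (shift when the value exceeds the threshold) and the use of a strict inequality are assumptions here, as are the function names.

```python
def shift_by_accent_time(accent_position_s: float, allowed_time_b_s: float) -> bool:
    """Decide whether to shift the pronunciation timing from the time taken to reach the accent.

    Assumption: shift when the time to the accent exceeds the allowable time B.
    """
    return accent_position_s > allowed_time_b_s


def shift_by_char_count(chars_before_accent: int, allowed_chars_c: int) -> bool:
    """Decide whether to shift from the number of characters before the accented character.

    Assumption: shift when the count exceeds the allowable number of characters C.
    """
    return chars_before_accent > allowed_chars_c


if __name__ == "__main__":
    # Threshold values are illustrative only.
    print(shift_by_accent_time(0.30, allowed_time_b_s=0.25))
    print(shift_by_char_count(1, allowed_chars_c=3))
```

Neither function needs the Japanese translation pronunciation time, which matches the remark that this value is unnecessary for these two criteria.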

［２．３楽曲配信サーバの動作］ [2.3 Operation of the music distribution server]

次に、楽曲配信サーバ１の動作について説明する。なお、メイン処理及びジョギング用楽曲データ作成処理の処理内容は、第１実施形態の場合と同様であるので、これらの処理の説明は省略する。 Next, the operation of the music distribution server 1 will be described. Note that the processing contents of the main processing and the jogging music data creation processing are the same as those in the first embodiment, and a description thereof will be omitted.

［２．３．１記憶対象語音声データ作成処理］ [2.3.1 Storage target word speech data creation process]

記憶対象語音声データ作成処理について、図２３及び図２４を用いて、実施例毎に説明する。 The storage target word speech data creation process will be described for each example with reference to FIGS. 23 and 24.

実施例１は、日本語訳と英単語との時間間隔に基づいて、英単語の発音タイミングをずらすか否かを判定する場合の実施例である。 Example 1 is an example in the case where it is determined whether or not to shift the pronunciation timing of an English word based on the time interval between the Japanese translation and the English word.

図２３は、本実施形態の実施例１に係る楽曲配信サーバ１の制御部１１の記憶対象語音声データ作成処理における処理例を示すフローチャートである。 FIG. 23 is a flowchart illustrating a processing example in the storage target word speech data creation processing of the control unit 11 of the music distribution server 1 according to Example 1 of the present embodiment.

先ず、ユーザ操作により、ユーザＰＣ２が楽曲配信サーバ１にアクセスすると、図２３に示すように、楽曲配信サーバ１の制御部１１は、ログイン処理を実行する（ステップＳ４５１）。具体的に、制御部１１は、ユーザＰＣ２からユーザＩＤ、パスワード等を受信し、認証処理を行って、ユーザを特定する。 First, when the user PC 2 accesses the music distribution server 1 by a user operation, as shown in FIG. 23, the control unit 11 of the music distribution server 1 executes a login process (step S451). Specifically, the control unit 11 receives a user ID, a password, and the like from the user PC 2 and performs an authentication process to specify a user.

次いで、第１語指定情報受信手段としての制御部１１は、ユーザＰＣ２から指定単語情報を受信する（ステップＳ４５２）。具体的に、制御部１１は、英単語を選択するためのＷＥＢページをユーザＰＣ２に送信し、ユーザＰＣ２が、受信したＷＥＢページを画面に表示する。ＷＥＢページ上に表示された英単語の一覧から、ユーザが英単語を複数指定すると、ユーザＰＣ２は、指定された英単語の文字データを含む指定単語情報を楽曲配信サーバ１に送信する。そして、制御部１１は、指定単語情報を受信する。 Next, the control unit 11, acting as the first word designation information receiving means, receives the designated word information from the user PC 2 (step S452). Specifically, the control unit 11 transmits a WEB page for selecting English words to the user PC 2, and the user PC 2 displays the received WEB page on its screen. When the user designates a plurality of English words from the list of English words displayed on the WEB page, the user PC 2 transmits designated word information including the character data of the designated English words to the music distribution server 1. The control unit 11 then receives the designated word information.

次いで、第１語音声情報取得手段としての制御部１１は、ユーザから指定された英単語に対応する発音パターン１の英単語ＷＡＶデータ３０５を辞書データベースから取得する（ステップＳ４５３）。具体的に、制御部１１は、受信した指定単語情報に含まれている各英単語を辞書データベースから検索し、検索した各英単語に夫々対応する発音パターン１の英単語ＷＡＶデータ３０５を取得する。そして、制御部１１は、取得した英単語ＷＡＶデータ３０５を、夫々ユーザから指定された英単語に対応付けてＲＡＭに記憶する。 Next, the control unit 11 as the first word speech information acquisition unit acquires the English word WAV data 305 of the pronunciation pattern 1 corresponding to the English word designated by the user from the dictionary database (step S453). Specifically, the control unit 11 searches the dictionary database for each English word included in the received designated word information, and acquires English word WAV data 305 of the pronunciation pattern 1 corresponding to each searched English word. . Then, the control unit 11 stores the acquired English word WAV data 305 in the RAM in association with the English word designated by the user.

次いで、第２語音声情報取得手段としての制御部１１は、ユーザから指定された英単語に対応する日本語訳ＷＡＶデータ３０６を辞書データベースから取得する（ステップＳ４５４）。具体的に、制御部１１は、検索した各英単語に夫々対応する日本語訳ＷＡＶデータ３０６を取得する。そして、制御部１１は、取得した日本語訳ＷＡＶデータ３０６を、夫々ユーザから指定された英単語に対応付けてＲＡＭに記憶する。 Next, the control unit 11 as the second word voice information acquisition unit acquires the Japanese translation WAV data 306 corresponding to the English word designated by the user from the dictionary database (step S454). Specifically, the control unit 11 acquires Japanese translation WAV data 306 corresponding to each searched English word. Then, the control unit 11 stores the acquired Japanese translation WAV data 306 in the RAM in association with the English word designated by the user.

次いで、制御部１１は、ユーザから指定された英単語に対応するアクセント位置情報及び日本語訳発音時間を辞書データベースから取得する（ステップＳ４５５）。具体的に、制御部１１は、検索した各英単語に夫々対応するアクセント位置情報及び日本語訳発音時間を取得する。そして、制御部１１は、取得したアクセント位置情報及び日本語訳発音時間を、夫々ユーザから指定された英単語に対応付けてＲＡＭに記憶する。 Next, the control unit 11 acquires the accent position information and the Japanese translation pronunciation time corresponding to the English word designated by the user from the dictionary database (step S455). Specifically, the control unit 11 acquires accent position information and Japanese translation pronunciation time corresponding to each searched English word. Then, the control unit 11 stores the acquired accent position information and Japanese translation pronunciation time in the RAM in association with the English word designated by the user.

次いで、制御部１１は、パーツＷＡＶデータデータベースから記憶対象語用ドラムベースＷＡＶデータ３０４を取得し、ＲＡＭに記憶する（ステップＳ４５６）。次いで、制御部１１は、指定順ｉに１を設定する（ステップＳ４５７）。 Next, the control unit 11 acquires the drum base WAV data 304 for the storage target word from the part WAV data database and stores it in the RAM (step S456). Next, the control unit 11 sets 1 in the specified order i (step S457).

次いで、制御部１１は、ｉ番目に指定された英単語の発音タイミングを設定する（ステップＳ４５８）。具体的に、制御部１１は、英単語のアクセントの発音タイミングを第１拍のタイミングに設定する。 Next, the control unit 11 sets the pronunciation timing of the i-th designated English word (step S458). Specifically, the control unit 11 sets the pronunciation timing of the accent of the English word to the timing of the first beat.

次いで、制御部１１は、指定順ｉが１であるか否かを判定する（ステップＳ４５９）。このとき、制御部１１は、指定順ｉが１である場合には（ステップＳ４５９：ＹＥＳ）、ステップＳ４６２に移行する。 Next, the control unit 11 determines whether or not the designation order i is 1 (step S459). At this time, when the designation order i is 1 (step S459: YES), the control unit 11 proceeds to step S462.

一方、制御部１１は、指定順ｉが１ではない場合には（ステップＳ４５９：ＮＯ）、ｉ−１番目の日本語訳の発音終了からｉ番目の英単語の発音開始までの時間間隔を算出する（ステップＳ４６０）。この時間間隔は、ｉ−１番目に指定された英単語に対応する日本語訳発音時間と、ｉ番目に指定された英単語に対応するアクセント位置情報に基づいて求めることができる。 On the other hand, when the specified order i is not 1 (step S459: NO), the control unit 11 calculates a time interval from the end of pronunciation of the i-1th Japanese translation to the start of pronunciation of the i-th English word. (Step S460). This time interval can be obtained based on the Japanese translation pronunciation time corresponding to the i-1st designated English word and the accent position information corresponding to the i-th designated English word.

次いで、判定手段としての制御部１１は、算出した時間間隔が最低間隔時間Ａ以上であるか否かを判定する（ステップＳ４６１）。このとき、制御部１１は、算出した時間間隔が最低間隔時間Ａ以上である場合には（ステップＳ４６１：ＹＥＳ）、ステップＳ４６２に移行し、算出した時間間隔が最低間隔時間Ａ未満である場合には（ステップＳ４６１：ＮＯ）、ステップＳ４６４に移行する。 Next, the control unit 11 as a determination unit determines whether or not the calculated time interval is equal to or greater than the minimum interval time A (step S461). At this time, when the calculated time interval is equal to or greater than the minimum interval time A (step S461: YES), the control unit 11 proceeds to step S462, and when the calculated time interval is less than the minimum interval time A. (Step S461: NO), the process proceeds to step S464.

制御部１１は、ステップＳ４５９において指定順ｉが１である場合（ステップＳ４５９：ＹＥＳ）、又は、ステップＳ４６１において時間間隔が最低間隔時間Ａ以上である場合には（ステップＳ４６１：ＹＥＳ）、ｉ番目の日本語訳の発音終了からｉ番目の英単語の発音開始までの時間間隔を算出する（ステップＳ４６２）。この時間間隔は、ｉ番目に指定された英単語に対応する日本語訳発音時間及びアクセント位置情報に基づいて求めることができる。 When the designation order i is 1 in step S459 (step S459: YES), or when the time interval is equal to or greater than the minimum interval time A in step S461 (step S461: YES), the control unit 11 calculates the time interval from the end of pronunciation of the i-th Japanese translation to the start of pronunciation of the i-th English word (step S462). This time interval can be obtained based on the Japanese translation pronunciation time and the accent position information corresponding to the i-th designated English word.

次いで、判定手段としての制御部１１は、算出した時間間隔が最低間隔時間Ａ以上であるか否かを判定する（ステップＳ４６３）。このとき、制御部１１は、算出した時間間隔が最低間隔時間Ａ以上である場合には（ステップＳ４６３：ＹＥＳ）、ステップＳ４６６に移行し、算出した時間間隔が最低間隔時間Ａ未満である場合には（ステップＳ４６３：ＮＯ）、ステップＳ４６４に移行する。 Next, the control unit 11 as a determination unit determines whether or not the calculated time interval is equal to or greater than the minimum interval time A (step S463). At this time, when the calculated time interval is equal to or greater than the minimum interval time A (step S463: YES), the control unit 11 proceeds to step S466, and when the calculated time interval is less than the minimum interval time A. (Step S463: NO), the process proceeds to step S464.

制御部１１は、ステップＳ４６１又はＳ４６３において時間間隔が最低間隔時間Ａ未満である場合には（ステップＳ４６１：ＮＯ、又は、ステップＳ４６３：ＮＯ）、ｉ番目に指定された英単語の発音タイミングの設定変更を行う（ステップＳ４６４）。具体的に、制御部１１は、英単語の発音開始タイミングを第１拍のタイミングに設定する。 When the time interval is less than the minimum interval time A in step S461 or S463 (step S461: NO, or step S463: NO), the control unit 11 changes the pronunciation timing setting of the i-th designated English word (step S464). Specifically, the control unit 11 sets the pronunciation start timing of the English word to the timing of the first beat.

次いで、第１語音声情報取得手段としての制御部１１は、ｉ番目の英単語に対応する発音パターン２の英単語ＷＡＶデータ３０５を辞書データベースから取得する（ステップＳ４６５）。制御部１１は、取得した英単語ＷＡＶデータ３０５を、ｉ番目の英単語に対応付けてＲＡＭに記憶する。このとき、制御部１１は、ステップＳ４５３においてｉ番目の英単語に対応して取得してあった発音パターン１の英単語ＷＡＶデータ３０５をＲＡＭから削除する。 Next, the control unit 11 as the first word speech information acquisition unit acquires the English word WAV data 305 of the pronunciation pattern 2 corresponding to the i-th English word from the dictionary database (step S465). The control unit 11 stores the acquired English word WAV data 305 in the RAM in association with the i-th English word. At this time, the control unit 11 deletes, from the RAM, the English word WAV data 305 of the pronunciation pattern 1 acquired corresponding to the i-th English word in step S453.

制御部１１は、ステップＳ４６３において時間間隔が最低間隔時間Ａ以上である場合（ステップＳ４６３：ＹＥＳ）、又は、ステップＳ４６５の処理を終えた場合には、指定順ｉに１を加算する（ステップＳ４６６）。 When the time interval is not less than the minimum interval time A in step S463 (step S463: YES), or when the process of step S465 is completed, the control unit 11 adds 1 to the specified order i (step S466). ).

次いで、制御部１１は、指定順ｉが、ユーザから指定された単語数よりも大きいか否かを判定する（ステップＳ４６７）。ユーザから指定された単語数は、指定単語情報に含まれている単語の総数に相当する。ここで、制御部１１は、指定順ｉが、ユーザから指定された単語数以下である場合には（ステップＳ４６７：ＮＯ）、ステップＳ４５８に移行する。 Next, the control unit 11 determines whether or not the designation order i is larger than the number of words designated by the user (step S467). The number of words designated by the user corresponds to the total number of words included in the designated word information. Here, when the designation order i is equal to or less than the number of words designated by the user (step S467: NO), the control unit 11 proceeds to step S458.

一方、生成手段としての制御部１１は、指定順ｉが、ユーザから指定された単語数よりも大きい場合には（ステップＳ４６７：ＹＥＳ）、記憶対象語音声データを作成する（ステップＳ４６８）。具体的に、制御部１１は、ＲＡＭに記憶されている記憶対象語用ドラムベースＷＡＶデータ３０４と、ユーザに指定された各英単語に対応してＲＡＭに記憶されている英単語ＷＡＶデータ３０５及び日本語訳ＷＡＶデータ３０６とを混合して、記憶対象語音声データを作成する。このとき、制御部１１は、ユーザに指定された各英単語について、設定されたタイミングで発音されるように、各英単語の発音タイミングを調整する。また、制御部１１は、ユーザに指定された各英単語に対応する日本語訳について、第３拍のタイミングで発音開始されるように、各日本語訳の発音タイミングを調整する。 On the other hand, when the designation order i is larger than the number of words designated by the user (step S467: YES), the control unit 11 as the generation unit creates storage target word speech data (step S468). Specifically, the control unit 11 stores the drum base WAV data 304 for the storage target word stored in the RAM, the English word WAV data 305 stored in the RAM corresponding to each English word specified by the user, and The Japanese translation WAV data 306 is mixed to create storage target speech data. At this time, the control unit 11 adjusts the pronunciation timing of each English word so that each English word designated by the user is pronounced at the set timing. Further, the control unit 11 adjusts the pronunciation timing of each Japanese translation so that the pronunciation of the Japanese translation corresponding to each English word designated by the user is started at the timing of the third beat.

次いで、制御部１１は、作成した記憶対象語音声データを、ログイン処理において受信したユーザＩＤに対応付けてパーツＷＡＶデータデータベースに登録する（ステップＳ４６９）。制御部１１は、この処理を終えると、記憶対象語音声データ作成処理を終了させる。 Next, the control unit 11 registers the created storage target speech data in the parts WAV data database in association with the user ID received in the login process (step S469). When this process is completed, the control unit 11 ends the storage target word speech data creation process.
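
The loop of steps S458 to S467 and the timing applied at step S468 can be sketched roughly as follows. This is only an illustrative sketch under stated assumptions, not the implementation of the control unit 11: the names `Word` and `plan_schedule`, the tempo, and the simplification that each word occupies a single 4/4 measure (English word on beats 1 and 2, Japanese translation from beat 3) are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    accent_offset: float      # seconds from the start of the English word to its accent
    japanese_duration: float  # seconds needed to pronounce the Japanese translation


def plan_schedule(words, bpm, min_interval_a):
    """Return (text, english_start, japanese_start, shifted) for each word.

    Assumption: word i is placed in measure i, so the first word has room
    to start slightly before its first beat.
    """
    beat = 60.0 / bpm
    schedule = []
    prev_japanese_end = None
    for i, word in enumerate(words, start=1):  # i plays the role of designation order i
        first_beat = i * 4 * beat
        # Default setting: the accent lands on the first beat.
        english_start = first_beat - word.accent_offset
        shifted = False
        if prev_japanese_end is not None:
            gap = english_start - prev_japanese_end
            if gap < min_interval_a:
                # Gap too short: start the word itself on the first beat instead.
                english_start = first_beat
                shifted = True
        japanese_start = first_beat + 2 * beat  # third beat of the measure
        prev_japanese_end = japanese_start + word.japanese_duration
        schedule.append((word.text, english_start, japanese_start, shifted))
    return schedule
```

With a tempo of 120 beats per minute and A = 0.2 s, a word whose accent falls 0.3 s after its start and which follows a 0.6 s translation would be shifted in this sketch, because the unshifted gap would be only about 0.1 s.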

実施例２は、英単語の発音開始からアクセントまでの発音時間に基づいて、英単語の発音タイミングをずらすか否かを判定する場合の実施例である。 Example 2 is an example in the case where it is determined whether or not to shift the pronunciation timing of an English word based on the pronunciation time from the start of pronunciation of the English word to the accent.

FIG. 24 is a flowchart showing an example of the storage target word speech data creation process performed by the control unit 11 of the music distribution server 1 according to Example 2 of the present embodiment. In FIG. 24, elements that are the same as in FIG. 23 are given the same reference numerals.

First, as in Example 1, the control unit 11 performs the login process, receives the designated word information, and acquires the English word WAV data 305 of pronunciation pattern 1 and the Japanese translation WAV data 306 (steps S451 to S454).

次いで、制御部１１は、ユーザから指定された英単語に対応するアクセント位置情報を辞書データベースから取得する（ステップＳ４８１）。 Next, the control unit 11 acquires accent position information corresponding to the English word designated by the user from the dictionary database (step S481).

Next, as in Example 1, the control unit 11 acquires the drum base WAV data 304 for storage target words, sets the designation order i, and sets the pronunciation timing of the i-th designated English word (steps S456 to S458).

Next, the control unit 11, acting as the determination means, determines whether or not the pronunciation time from the start of pronunciation of the i-th English word to its accent is equal to or longer than the allowable pronunciation time B (step S482). This pronunciation time is the one indicated by the accent position information corresponding to the i-th English word. If the pronunciation time is equal to or longer than the allowable pronunciation time B (step S482: YES), the control unit 11, as in Example 1, changes the pronunciation timing setting of the i-th English word and acquires the English word WAV data 305 of pronunciation pattern 2 corresponding to the i-th English word (steps S464 and S465).

When the pronunciation time from the start of pronunciation of the i-th English word to its accent is less than the allowable pronunciation time B (step S482: NO), or when the processing of step S465 has been completed, the control unit 11, as in Example 1, increments the designation order i and evaluates it (steps S466 and S467).

Then, when the designation order i is equal to or less than the number of words designated by the user (step S467: NO), the control unit 11 returns to step S458. When the designation order i is larger than the number of words designated by the user (step S467: YES), the control unit 11 creates and registers the storage target word speech data as in Example 1 (steps S468 and S469) and ends the storage target word speech data creation process.
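
The decision made at step S482 for each designated word can be sketched as below. This is a minimal illustration, assuming the accent position information is available as a time in seconds per word; the function names and the dictionary layout are hypothetical and do not come from the specification.

```python
def should_shift(accent_time, allowable_time_b):
    """Step S482: shift when the time to the accent is B or longer."""
    return accent_time >= allowable_time_b


def select_patterns(accent_times, allowable_time_b):
    """Map each designated word to pronunciation pattern 1 or 2.

    accent_times: {word: seconds from start of pronunciation to accent},
    listed in the order the user designated the words.
    """
    patterns = {}
    for word, accent_time in accent_times.items():
        # Pattern 2 is the data used when the pronunciation timing is shifted.
        patterns[word] = 2 if should_shift(accent_time, allowable_time_b) else 1
    return patterns
```

Note that the comparison is inclusive: a word whose accent falls exactly at B is treated as shifted, matching the "equal to or longer than" wording.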

As a modification of Example 2, whether or not to shift the pronunciation timing of an English word may be determined based on the number of characters preceding the accented character of the English word. Specifically, the number of characters preceding the accented character is registered in the dictionary database as the accent position information. In step S482, the control unit 11 determines whether or not the number of characters preceding the accented character of the i-th English word is equal to or greater than the allowable number of characters C. If it is, the process proceeds to step S464; if it is less than C, the process proceeds to step S466.

As a further modification of Example 2, flag information indicating whether or not to shift the pronunciation timing of an English word may be used instead of the accent position information. Specifically, when the pronunciation time from the start of pronunciation of the English word to its accent is less than the allowable pronunciation time B, the flag information is set to, for example, 0, and when it is equal to or longer than B, the flag information is set to, for example, 1. Alternatively, when the number of characters preceding the accented character of the English word is less than the allowable number of characters C, the flag information is set to, for example, 0, and when it is equal to or greater than C, the flag information is set to, for example, 1. Flag information with a value of 0 or 1 is then registered in the dictionary database in association with each English word. In step S482, the control unit 11 checks the value of the flag information of the i-th English word. If the value is 1, the process proceeds to step S464; if the value is 0, the process proceeds to step S466.
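
The character-count and flag variants can be illustrated together. The sketch below assumes a hypothetical in-memory table standing in for the dictionary database; the function names and the character counts used in the usage note are illustrative only.

```python
def flag_from_char_count(chars_before_accent, allowable_chars_c):
    """Return 1 (shift) when C or more characters precede the accent, else 0."""
    return 1 if chars_before_accent >= allowable_chars_c else 0


def build_flag_table(chars_before_accent_by_word, allowable_chars_c):
    """Precompute the 0/1 flag for every word, as if registering it in advance."""
    return {
        word: flag_from_char_count(count, allowable_chars_c)
        for word, count in chars_before_accent_by_word.items()
    }


def next_step(flag):
    """Branch taken at step S482 when only the flag is consulted."""
    return "S464" if flag == 1 else "S466"
```

Because the flag is computed ahead of time, the check at step S482 reduces to a single lookup. For example, with C = 2 and made-up counts of 0, 2, and 5 preceding characters, only the first word would keep its accent on the beat.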

In Example 2, whether or not to shift the pronunciation timing of an English word is known in advance from the position of its accent. Therefore, for English words whose pronunciation timing is not shifted, only the English word WAV data 305 of pronunciation pattern 1 may be registered in the dictionary database, and for English words whose pronunciation timing is shifted, only the English word WAV data 305 of pronunciation pattern 2 may be registered in the dictionary database.

Example 1 and Example 2 may also be combined. That is, the control unit 11 may determine whether or not to shift the pronunciation timing of an English word based on the position of its accent and, for an English word determined not to be shifted, further determine whether or not to shift the pronunciation timing based on the time interval between the Japanese translation and the English word.
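
One possible reading of this combination is a two-stage check, sketched below. The function name and parameters are hypothetical; the sketch assumes that the accent time and the gap after the preceding Japanese translation are both already known in seconds.

```python
def should_shift_combined(accent_time, gap_after_translation,
                          allowable_time_b, min_interval_a):
    """Apply the accent-position test first, then the interval test."""
    # Stage 1 (Example 2): accent too far from the start of the word.
    if accent_time >= allowable_time_b:
        return True
    # Stage 2 (Example 1): only reached for words not shifted in stage 1.
    return gap_after_translation < min_interval_a
```

The second test is evaluated only for words that pass the first, which mirrors the order described in the paragraph above.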

The English word WAV data 305 and the Japanese translation WAV data 306 need not be acquired from the dictionary database. Instead, the English word WAV data 305 and the Japanese translation WAV data 306 corresponding to the English word designated by the user may be generated in the storage target word speech data creation process, and the generated WAV data may then be used. In this case, for example, phonetic symbol information for each English word, information indicating the accent position in the phonetic symbols, and the like are registered in the dictionary database. The control unit 11 then generates the English word WAV data 305 of the designated English word by speech synthesis based on the phonetic symbols, the information indicating the accent position, and the like. Similarly, the control unit 11 generates the Japanese translation WAV data 306 by speech synthesis based on the character data of the Japanese translation.

また、英単語の発音開始からアクセントまでの発音時間としてのアクセント位置情報を用いる場合、これを辞書データベースから取得するのではなく、記憶対象語音声データ作成処理において、ユーザから指定された英単語に対応して生成しても良い。この場合、制御部１１は、例えば、指定された英単語の英単語ＷＡＶデータ３０５に対して音声解析処理を行う。そして制御部１１は、発声音の音圧レベルに基づいてアクセント位置を特定する。 Also, when using the accent position information as the pronunciation time from the start of pronunciation of the English word to the accent, it is not acquired from the dictionary database, but in the storage target word speech data creation process, the English word specified by the user It may be generated correspondingly. In this case, for example, the control unit 11 performs voice analysis processing on the English word WAV data 305 of the designated English word. Then, the control unit 11 specifies the accent position based on the sound pressure level of the uttered sound.
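
As a rough illustration of locating the accent from the sound pressure level, the sketch below picks the loudest short frame of a word's samples. The specification states only that the sound pressure level is used; the frame-wise RMS measure, the frame size, and the function name are assumptions made here, and the samples are assumed to begin at the start of pronunciation with any leading silence already trimmed.

```python
import math


def accent_time_from_samples(samples, sample_rate, frame_size=256):
    """Return the start time (seconds) of the frame with the highest RMS level.

    samples: sequence of amplitude values for one spoken word.
    """
    best_start, best_rms = 0, -1.0
    for start in range(0, len(samples), frame_size):
        frame = samples[start:start + frame_size]
        # Root mean square amplitude as a simple stand-in for sound pressure.
        rms = math.sqrt(sum(s * s for s in frame) / len(frame))
        if rms > best_rms:
            best_start, best_rms = start, rms
    return best_start / sample_rate
```

The returned value could then be compared with the allowable pronunciation time B in the same way as accent position information read from a database.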

The control unit 11 may also first create storage target word speech data in which the accent pronunciation timing of every English word is aligned with the first beat, before determining whether or not to shift the pronunciation timing of any English word, and then, after making the determination, modify the storage target word speech data to shift the pronunciation timing of the English words concerned. In this case, the control unit 11 may obtain the time interval from the end of pronunciation of a Japanese translation to the start of pronunciation of the next English word by performing speech analysis on the initially created storage target word speech data.

[2.3.2 Used music determination process]
Next, the used music determination process will be described. Its processing content is basically the same as in the first embodiment (FIG. 19). Therefore, only the parts that differ from the first embodiment will be described.

In step S106 shown in FIG. 19, the control unit 11 selects a plurality of pieces of music constituting the jogging main part. Here, when the control unit 11 has the user select the storage target word speech that forms part of the jogging music, it has the user select from among the storage target word speech created through that user's designation of English words. Specifically, from among the storage target word speech data registered in the parts WAV data database, the control unit 11 transmits to the user PC 2 a web page presenting a list of the storage target word speech data corresponding to the user ID received in the login process of step S1 shown in FIG. 18. The user then operates the user PC 2 to select the desired storage target word speech data from the presented list.
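
The per-user lookup described here amounts to filtering registered entries by user ID. The following sketch uses a plain list of dictionaries as a stand-in for the parts WAV data database; the field names `user_id` and `title` and the function name are hypothetical.

```python
def list_speech_data_for_user(records, user_id):
    """Return the titles of the speech data registered under the given user ID."""
    return [r["title"] for r in records if r["user_id"] == user_id]
```

For example, given records registered by two different users, only the entries created by the logged-in user would appear in the list presented on the web page.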

その後、ユーザにより選択された記憶対象語音声データを含むジョギング用楽曲データが作成され、作成されたジョギング用楽曲データがユーザＰＣ２に送信される。 Thereafter, jogging music data including the storage target speech data selected by the user is created, and the created jogging music data is transmitted to the user PC 2.

As described above, according to the present embodiment, the control unit 11 creates, based on the English word WAV data 305, the Japanese translation WAV data 306, and the drum base WAV data 304, storage target word speech data configured so that each of a plurality of mutually different storage target words is pronounced a plurality of times together with a rhythm sound, and so that the English words and their Japanese translations are pronounced alternately, with the accent of each English word aligned with a beat timing and the start of each Japanese translation aligned with a beat timing. Here, the control unit 11 determines whether or not the time interval from the end of pronunciation of a Japanese translation to the start of pronunciation of the next English word is less than the minimum interval time A, and creates the storage target word speech data with the accent pronunciation timing of any English word for which that interval is less than the minimum interval time A shifted away from the beat timing. The control unit 11 then registers the created storage target word speech data in the parts WAV data database and transmits jogging music data including the registered storage target word speech data to the user PC 2.

また、日本語訳と英単語との時間間隔が最低間隔時間Ａ未満となってしまうことを確実に防止することができる。 Further, it is possible to reliably prevent the time interval between the Japanese translation and the English word from being less than the minimum interval time A.

また、英単語ＷＡＶデータ３０５と日本語訳ＷＡＶデータ３０６とが別個に存在するので、英単語の発音タイミングをずらした記憶対象語音声データの作成を容易に行うことができる。 Further, since the English word WAV data 305 and the Japanese translation WAV data 306 exist separately, it is possible to easily create the storage target word voice data in which the pronunciation timing of English words is shifted.

The control unit 11 may also determine whether or not the pronunciation time from the pronunciation start timing of an English word to its accent pronunciation timing is equal to or longer than the allowable pronunciation time B, and create the storage target word speech data with the accent pronunciation timing of any English word whose pronunciation time is equal to or longer than the allowable pronunciation time B shifted away from the beat timing.

The control unit 11 may also determine whether or not the number of characters preceding the accented character in an English word is equal to or greater than the allowable number of characters C, and create the storage target word speech data with the accent pronunciation timing of any English word for which that number is equal to or greater than the allowable number of characters C shifted away from the beat timing.

これらの場合、判定の条件が日本語訳に関する事項を含まないので、日本語訳に関する情報を用いなくても、英単語の発音タイミングをずらすか否かを判定することができる。 In these cases, since the determination condition does not include matters relating to the Japanese translation, it is possible to determine whether or not to shift the pronunciation timing of the English words without using information relating to the Japanese translation.

また、制御部１１が、日本語訳と英単語との時間間隔が最低間隔時間Ａ以上となる英単語の発音開始タイミングを拍のタイミングに合わせて記憶対象語音声データを作成する。 In addition, the control unit 11 creates the storage target word voice data by matching the pronunciation start timing of the English word whose time interval between the Japanese translation and the English word is equal to or longer than the minimum interval time A to the timing of the beat.

Further, suppose that the first and second beats of one measure of the rhythm sound are assigned to the English word and the third and fourth beats of that measure are assigned to the Japanese translation, and that an English word for which the time interval between the Japanese translation and the English word is equal to or longer than the minimum interval time A has a first accent and a second accent, with the first accent pronounced after the second accent. In that case, the control unit 11 uses the English word WAV data 305 of pronunciation pattern 2 to align the pronunciation start timing of the English word with the first beat, thereby aligning the pronunciation timing of the first accent with the second beat, and creates the storage target word speech data.

従って、第１アクセントと第２アクセントとを有する英単語の発声音がリズム良く再生され、英単語とその日本語訳とをより記憶させやすくすることができる。 Therefore, the utterance sound of the English word having the first accent and the second accent is reproduced with a good rhythm, and the English word and its Japanese translation can be more easily stored.

Further, the control unit 11 receives designated word information from the user PC 2, acquires from the parts WAV data database, based on the designated word information, the English word WAV data 305 and the Japanese translation WAV data 306 corresponding to the English words designated by the user, determines whether or not to shift the pronunciation timing of each English word based at least on the accent position information corresponding to that word, and creates storage target word speech data in which the utterance sounds of the designated English words and their Japanese translations are reproduced.

従って、英単語をユーザが指定することによって、ユーザが記憶したい英単語を記憶学習することができる。 Therefore, the user can memorize and learn the English word that the user wants to memorize by designating the English word.

なお、上記各実施形態においては、英単語のアクセントの発音タイミングが拍からずらされる場合、英単語の発音開始タイミングが拍のタイミングに合うようになっていたが、発音タイミングをずらす位置はこれだけに限られるものではない。例えば、英単語の発音終了から第３拍のタイミングまでの時間が、英単語と日本語訳とを聞き分けることができる時間以上となる位置であれば、別の位置にずらされても良い。 In each of the above embodiments, when the pronunciation timing of the accent of the English word is shifted from the beat, the pronunciation start timing of the English word is matched with the timing of the beat, but this is the only position where the pronunciation timing is shifted. It is not limited. For example, as long as the time from the end of pronunciation of the English word to the timing of the third beat is longer than the time for distinguishing between the English word and the Japanese translation, the position may be shifted to another position.
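
The condition on alternative shift positions can be expressed as a simple check. In this sketch all times are in seconds from the start of the audio, and the function name and the parameter `min_discernible_gap` (the time needed to tell the English word and the Japanese translation apart) are hypothetical names introduced for illustration.

```python
def is_valid_shifted_start(english_start, english_duration,
                           third_beat_time, min_discernible_gap):
    """True when the word ends early enough before the third beat."""
    english_end = english_start + english_duration
    return third_beat_time - english_end >= min_discernible_gap
```

Any candidate start time that passes this check would leave enough room before the Japanese translation begins on the third beat.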

また、上記各実施形態においては、記憶対象語音声の拍子を４／４拍子としていたが、拍子はこれに限られるものではない。 In each of the above embodiments, the time signature of the speech to be stored is 4/4, but the time signature is not limited to this.

また、上記各実施形態においては、英単語及び日本語訳が夫々１小節中の２拍に割り当てられていたが、１拍のみに割り当てられても良いし、３拍以上割り当てられても良い。また、英単語及び日本語訳に対する拍の割り当て数は同数でなくても良い。 Moreover, in each said embodiment, although the English word and the Japanese translation were each allocated to 2 beats in 1 bar, it may be allocated only to 1 beat and may be allocated 3 beats or more. Also, the number of beats assigned to English words and Japanese translations need not be the same.

In each of the above embodiments, the Japanese translation is pronounced in all four measures in which one storage target word is pronounced, but the storage target word speech data may be created so that the Japanese translation is pronounced in only some of those measures. For example, by having the Japanese translation pronounced only in the last of the four measures, the user can recite the translation and check for himself or herself whether the English word and its Japanese translation have been reliably memorized.

また、上記各実施形態においては、英単語、当該英単語の日本語訳、の順番で発音されるようにしていたが、日本語訳、当該日本語訳に対応する英単語、の順番で発音されるようにしても良い。 In each of the above embodiments, the pronunciation is made in the order of the English word and the Japanese translation of the English word. However, the pronunciation is made in the order of the Japanese translation and the English word corresponding to the Japanese translation. You may be made to do.

また、上記各実施形態においては、第１の語として英単語を適用していたが、英語以外の強勢アクセントを有する言語の単語を適用しても良い。また、第１の語として、単語のみに限らず、熟語、句、節又は文等を適用しても良い。 Moreover, in each said embodiment, although the English word was applied as a 1st word, you may apply the word of the language which has a stress accent other than English. Further, the first word is not limited to a word but may be an idiom, phrase, clause or sentence.

In each of the above embodiments, Japanese is used for the second word, but a language other than Japanese may be used. The second word may also be a word, an idiom, a phrase, a clause, a sentence, or the like.

また、上記各実施形態においては、ペア語として、或る語と当該語の訳のペアを適用していたが、ペアとして記憶する語であれば、これに限られるものではない。 In each of the above embodiments, a pair of a word and a translation of the word is applied as a pair word. However, the present invention is not limited to this as long as the word is stored as a pair.

In each of the above embodiments, both prelude inter-song connection data and postlude inter-song connection data are inserted into the jogging music data, but only one of them may be inserted, or neither may be inserted.

また、上記各実施形態における楽曲配信サーバ１は、ジョギングを行っている際に聴取されるジョギング用楽曲データを作成するようにしていたが、例えば、マラソンやウォーキング等を行っている際に聴取される運動用の楽曲データを作成するように構成しても良いし、エアロビクスやヨーガ、ピラティス・メソッド、マーシャルアーツ、ブートキャンプ、ダンス等を行っている際に聴取される運動用の楽曲データを作成するように構成しても良い。 In addition, the music distribution server 1 in each of the embodiments described above creates jogging music data to be listened to while jogging. For example, the music distribution server 1 is listened to during marathon, walking, etc. It may be configured to create music data for exercise, or create music data for exercise to be heard during aerobics, yoga, Pilates method, martial arts, boot camp, dance, etc. You may comprise so that it may do.

また、楽曲配信サーバ１は、記憶対象語音声データを、運動用の楽曲データに含まれる態様でユーザＰＣ２に送信するのではなく、記憶対象語音声データのみをユーザＰＣ２に送信するようにしても良い。つまり、記憶対象語音声データは、運動中に聴取するために再生される場合に限られるのではなく、例えば、純粋に記憶学習のために再生されるものとして送信されるようにしても良い。 In addition, the music distribution server 1 does not transmit the storage target word sound data to the user PC 2 in a form included in the exercise music data, but transmits only the storage target word sound data to the user PC 2. good. That is, the storage target word sound data is not limited to being reproduced for listening during exercise, but may be transmitted, for example, as being reproduced purely for memory learning.

DESCRIPTION OF SYMBOLS
1 Music distribution server
2 User PC
3 Portable music player
11 Control unit
12 Storage unit
13 Communication unit
14 Encoder unit
15 System bus
101 Music main body WAV part data
102 Jogging arrangement inter-song connection WAV part data
103 Jogging arrangement voice guidance WAV part data
104 DJ voice WAV part data
105 Music main body MIDI data
106 Jogging arrangement drum base WAV data
201 Parts WAV data database program
202 Server system program
203 WEB site program
204 Music main body WAV part data writing program
205 Storage target word WAV part data writing program
301 Storage target word WAV part data
302 English word group WAV data
303 Japanese translation group WAV data
304 Drum base WAV data for storage target words
305 English word WAV data
306 Japanese translation WAV data
NW Network
S Communication system

Claims

Distributing audio information for storage, which is audio information generated so that the utterance of a pair word composed of a first word having an accent and a second word corresponding to the first word is reproduced. A distribution device,
Based on the information on the rhythm sound that is generated at the timing of the time signature and the information on the pair word, a plurality of the pair words and the rhythm sound that are the same or different from each other are pronounced, and the first Generating means for generating the storage voice information configured so that the first word and the second word are alternately pronounced in accordance with the timing of the beat of the accent of the word;
Storage means for storing the generated storage audio information;
Transmitting means for transmitting the stored voice information for storage to a terminal device;
With
The distribution apparatus, wherein the generating means generates the storage voice information in which the accent of the first word for which the interval is less than a predetermined time is pronounced shifted from the timing of the beat, so that the interval from the end of pronunciation of the second word to the start of pronunciation of the first word pronounced next after the second word is equal to or longer than the predetermined time.

The distribution apparatus according to claim 1,
The generating unit includes a determining unit that determines whether or not a condition that the interval is less than the predetermined time is satisfied,
The distribution apparatus, wherein the storage voice information is generated by shifting an accent pronunciation timing of the first word determined to satisfy the condition from a beat timing.

The distribution apparatus according to claim 1,
The condition is that the pronunciation time from the pronunciation start timing of the first word to the accent pronunciation timing is equal to or longer than a predetermined pronunciation time, or that the number of characters preceding the accented character in the first word is equal to or greater than a predetermined number of characters,
The generating means includes determining means for determining whether or not the first word satisfies the condition.
The distribution apparatus, wherein the storage voice information is generated by shifting an accent pronunciation timing of the first word determined to satisfy the condition from a beat timing.

In the distribution apparatus according to claim 2 or 3,
The distribution device is characterized in that the generation means generates the storage voice information by matching the pronunciation start timing of the first word determined to satisfy the condition with the beat timing.

The distribution device according to claim 4,
A plurality of consecutive beats for each of the first word and the second word constituting the pair word are assigned in one measure;
The generating means has a first accent in which the first word determined to satisfy the condition is pronounced most strongly and a second accent that is pronounced next to the first accent. And when the first accent is a word pronounced after the second accent,
The pronunciation start timing of the first word is matched with the timing of the first beat among a plurality of beats assigned to the first word, and the pronunciation timing of the first accent is set to the first word The distribution device generates the audio information for storage in accordance with timings of beats other than the head among a plurality of beats assigned to.

The distribution apparatus according to any one of claims 2 to 5,
The distribution apparatus, wherein the generating means adjusts the pronunciation timing of the second word to a predetermined timing based on the information of the rhythm sound and second word voice information which is voice information of the utterance sound of the second word, and, based on the information of the rhythm sound and first word voice information which is voice information of the utterance sound of the first word, generates the storage voice information by aligning the accent pronunciation timing of the first word that does not satisfy the condition with the beat timing and shifting the accent pronunciation timing of the first word that satisfies the condition from the beat timing.

The distribution device according to claim 6,
First word information storage means for storing first word information indicating the first word and accent information indicating the position of the accent of the first word in association with each other;
First word designation information receiving means for receiving, from the terminal device, first word designation information indicating the first word designated by a user;
First word voice information acquisition means for acquiring the first word voice information of the designated first word based on the received first word designation information;
Second word voice information acquisition means for acquiring the second word voice information of the second word corresponding to the designated first word based on the received first word designation information;
Further comprising
The determination means determines whether or not the first word satisfies the condition based on at least the accent information corresponding to the first word designated by the user;
The generating means is based on the rhythm sound information, the acquired first word sound information, and the acquired second word sound information, and the utterance sound of the designated first word, The distribution apparatus for generating the storage voice information for reproducing the utterance sound of the second word corresponding to the first word.

A distribution method for distributing storage voice information, which is voice information generated so that the utterance sounds of pair words, each composed of a first word having an accent and a second word corresponding to the first word, are reproduced, the method comprising:
a generating step of generating, based on information on a rhythm sound that is sounded at the beat timing and information on the pair words, the storage voice information configured such that a plurality of the pair words, which are the same as or different from each other, and the rhythm sound are sounded, and such that the first word and the second word are sounded alternately with the accent of the first word aligned with the beat timing;
a storage step of storing the generated storage voice information; and
a transmitting step of transmitting the stored storage voice information to a terminal device,
wherein, in the generating step, the storage voice information is generated such that the accent of any first word for which the interval from the end of pronunciation of the second word to the start of pronunciation of the first word pronounced next after that second word would otherwise be less than a predetermined time is pronounced with a shift from the beat timing, so that the interval is equal to or longer than the predetermined time.
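The interval rule recited in this method can be illustrated with a minimal Python sketch. It is not taken from the specification: the names `Clip`, `schedule_first_word`, `accent_offset`, and `min_gap`, and the choice to delay the first word by exactly the missing amount, are assumptions made only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Clip:
    """Hypothetical description of one first-word recording."""
    duration: float        # length of the utterance in seconds
    accent_offset: float   # time from the clip start to the accent, in seconds


def schedule_first_word(beat_time, first, prev_second_end, min_gap):
    """Return (start_time, shifted) for a first word.

    The word is placed so that its accent falls on the beat. If that would
    leave less than min_gap seconds after the previous second word ends,
    the start is delayed (an assumed strategy), so the accent is shifted
    from the beat and the interval becomes exactly min_gap.
    """
    start = beat_time - first.accent_offset  # accent lands on the beat
    if start - prev_second_end >= min_gap:
        return start, False
    return prev_second_end + min_gap, True


# Accent early in the word: enough room before the beat, no shift.
early = schedule_first_word(4.0, Clip(1.0, 0.25), prev_second_end=3.0, min_gap=0.5)
# Accent late in the word: the word would start too soon, so it is delayed.
late = schedule_first_word(4.0, Clip(1.5, 0.75), prev_second_end=3.0, min_gap=0.5)
```

In the second call the accent is sounded at 3.5 + 0.75 = 4.25 seconds, that is, 0.25 seconds after the beat at 4.0 seconds, while the interval after the preceding second word is kept at the assumed minimum of 0.5 seconds.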

The distribution method according to claim 8, further comprising:
a first word designation information receiving step of receiving, from the terminal device, first word designation information indicating the first word designated by a user;
a first word voice information acquisition step of acquiring, based on the received first word designation information, first word voice information, which is voice information of the utterance sound of the designated first word;
a second word voice information acquisition step of acquiring, based on the received first word designation information, second word voice information, which is voice information of the utterance sound of the second word corresponding to the designated first word;
an accent information acquisition step of acquiring the accent information corresponding to the first word designated by the user from first word information storage means that stores first word information indicating the first word and accent information indicating the position of the accent of the first word in association with each other; and
a determination step of determining, based on at least the acquired accent information, whether or not to shift the pronunciation timing of the accent of the first word designated by the user,
wherein, in the generating step, the storage voice information for reproducing the utterance sound of the designated first word and the utterance sound of the second word corresponding to that first word is generated based on the rhythm sound information, the acquired first word voice information, and the acquired second word voice information, and
the storage voice information is generated by aligning the pronunciation timing of the second word with a predetermined timing, aligning with the beat timing the accent pronunciation timing of a first word for which it is determined that the accent pronunciation timing is not to be shifted, and shifting from the beat timing the accent pronunciation timing of a first word for which it is determined that the accent pronunciation timing is to be shifted.
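The determination step and the generating step recited above can be sketched as follows. This is only an illustration under stated assumptions: the claims do not fix how the accent position is encoded or what the determination criterion is, so the syllable-index encoding, the fixed per-syllable duration, the `max_lead_in_sec` threshold, and all function and parameter names below are hypothetical.

```python
from typing import NamedTuple


class Entry(NamedTuple):
    """Hypothetical record held by the first word information storage means."""
    word: str
    accent_index: int  # 0-based index of the accented syllable (assumed encoding)


def should_shift(entry, syllable_sec, max_lead_in_sec):
    """Determination step (assumed criterion): shift when the part of the
    word spoken before the accent is longer than the allowed lead-in."""
    return entry.accent_index * syllable_sec > max_lead_in_sec


def build_timeline(entries, bar_sec, second_offset_sec, syllable_sec, max_lead_in_sec):
    """Generating step: return (word, first_start, accent_time, second_start)
    tuples, with one pair word per bar and the beat at the start of each bar."""
    timeline = []
    for i, entry in enumerate(entries):
        beat = i * bar_sec
        lead_in = entry.accent_index * syllable_sec
        if should_shift(entry, syllable_sec, max_lead_in_sec):
            # Start as early as allowed; the accent then falls after the beat.
            first_start = beat - max_lead_in_sec
        else:
            # Start early enough that the accent falls exactly on the beat.
            first_start = beat - lead_in
        # The second word is always placed at a predetermined offset from the beat.
        timeline.append((entry.word, first_start, first_start + lead_in,
                         beat + second_offset_sec))
    return timeline


words = [Entry("apple", 0), Entry("banana", 1), Entry("understand", 2)]
timeline = build_timeline(words, bar_sec=4.0, second_offset_sec=2.0,
                          syllable_sec=0.25, max_lead_in_sec=0.25)
```

With these illustrative values, the accents of "apple" and "banana" fall on their beats at 0.0 and 4.0 seconds, whereas the accent of "understand" is sounded at 8.25 seconds, shifted from the beat at 8.0 seconds. The second words start at 2.0, 6.0, and 10.0 seconds regardless of the determination result.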