JP4192994B2

JP4192994B2 - Data input program for singing synthesis

Info

Publication number: JP4192994B2
Application number: JP2007128293A
Authority: JP
Inventors: 裕司久湊; オルトラジャウメ
Original assignee: Yamaha Corp
Current assignee: Yamaha Corp
Priority date: 2007-05-14
Filing date: 2007-05-14
Publication date: 2008-12-10
Anticipated expiration: 2023-02-27
Also published as: JP2007272242A

Description

The present invention relates to a technique for facilitating the input of lyrics in a singing synthesis apparatus.

Conventionally, some speech synthesizers and speech synthesis programs, when given a text, generate phonetic symbols corresponding to the text and synthesize speech according to those symbols. Some devices and programs of this type have a function for editing the generated phonetic symbol string. For this purpose they typically include a text editor, and the user edits the phonetic symbol string directly in that editor.
However, the generated phonetic symbol string is a sequence of symbols representing readings and symbols representing the positions of accents, pauses, and the like. Unless the user knows what these symbols mean, editing the phonetic symbol string is very difficult.

As a technique for overcoming this drawback, Patent Document 1, for example, introduces symbols called utterance segments, which are easy to understand for general users who do not know phonetic symbol strings. The utterance segments indicate the boundary positions of accent phrases, phonation phrases, and breath groups, and anyone who can speak the language can edit them easily without knowing the definition of the phonetic symbol string. Because editing the utterance segments indirectly edits the phonetic symbols, Patent Document 1 makes it easy to edit a phonetic symbol string.

Today, music is also composed using computers. Music editing on a computer is also called DTM (desktop music). In DTM, users play their favorite songs themselves using a MIDI (Musical Instrument Digital Interface) sound source, or act as producers themselves, creating and performing their own pieces while working out the timbres and melodies.
DTM also uses speech synthesis technology to make a computer sing along with a piece of music.
Patent Document 1: JP-A-5-11797

The technique of Patent Document 1 described above proposes a phonetic symbol string generation device capable of easily generating an appropriate phonetic symbol string that yields highly natural prosody, and a text-to-speech synthesis system using it. However, the technique of Patent Document 1 cannot be applied to a device or program for music editing in which lyrics are assigned to the notes of an already completed score.

Moreover, when lyrics and pronunciations are attached to notes in music editing, assigning one vowel to each note is difficult for a user who does not know phonetic symbols and syllable breaks. For example, to assign English lyrics to notes, the user must know the syllables of each word. A general user has to consult a dictionary to find the syllables of a word, and doing so while entering lyrics is tedious.

The present invention has been made in view of the above circumstances, and its object is to provide a mechanism that makes it easy to assign lyrics to notes.

To achieve the above object, a singing synthesis data input program according to the present invention causes a computer to execute: a storage process of storing, in storage means, singing score data consisting of a plurality of note data items that correspond to a plurality of notes constituting a piece of music and that include information designating a sounding volume or a sounding period; a word acquisition process of acquiring, via an input device, input data that is entered in association with a plurality of consecutive note data items stored in the storage means and that includes a plurality of notation characters constituting one word and a notation character connection symbol connecting two consecutive notation characters, and acquiring a word from the input data; a word information acquisition process of acquiring, for the word acquired in the word acquisition process, phonetic symbols indicating how the word is pronounced and division information indicating the number of syllables constituting the word; and a singing score data update process of comparing the number indicated by the division information acquired in the word information acquisition process with the number of note data items associated with the word acquired in the word acquisition process and, if the two match, dividing the phonetic symbols obtained for the one word into syllables each containing one vowel, assigning each syllable to the plurality of consecutive note data items in the singing score data that are associated with the input data from which the word was acquired, and further, when a syllable assigned to a note data item is a predetermined syllable and the sounding volume or sounding period designated by that note data item is equal to or longer than a predetermined time, changing the phonetic symbol corresponding to that syllable to another phonetic symbol, thereby updating the singing score data stored in the storage means.

Further, a singing synthesis data input program according to the present invention causes a computer to execute: a storage process of storing, in storage means, singing score data consisting of a plurality of note data items corresponding to a plurality of notes constituting a piece of music; a word acquisition process of acquiring, via an input device, input data that is entered in association with a plurality of consecutive note data items stored in the storage means and that includes a plurality of notation characters constituting one word and a notation character connection symbol connecting two consecutive notation characters, and acquiring a word from the input data; a word information acquisition process of acquiring, for the word acquired in the word acquisition process, phonetic symbols indicating how the word is pronounced and division information indicating the number of syllables constituting the word; and a singing score data update process of comparing the number indicated by the division information acquired in the word information acquisition process with the number of note data items associated with the word acquired in the word acquisition process and, if the two match, dividing the phonetic symbols obtained for the one word into syllables each containing one vowel, assigning each syllable to the plurality of consecutive note data items in the singing score data that are associated with the input data from which the word was acquired, and further, when a syllable assigned to a note data item is a predetermined syllable, correcting the phonetic symbol corresponding to that syllable based on the syllables assigned to the note data items before and after that note data item, thereby updating the singing score data stored in the storage means.

According to the present invention, even if the user lacks knowledge of correct syllable breaks and enters the lyrics with incorrect breaks, the correct phonetic symbols are set as long as the number of vowels is entered correctly.

A preferred embodiment of the present invention will be described with reference to the drawings.
FIG. 1 is a block diagram showing the configuration of a computer 1 that functions as a singing synthesis data input device according to an embodiment of the present invention. In the computer 1 shown in FIG. 1, a CPU (Central Processing Unit) 11, a ROM (Read Only Memory) 12, a RAM (Random Access Memory) 13, an operation unit 14, an HDD (hard disk drive) 15, a display 16, and a data input/output unit 17 are connected via a bus BUS and can exchange data with one another. A sound source unit 18 and a speaker 19 are connected to the computer 1 as external devices, but may instead be configured as internal devices of the computer 1.

The CPU 11 is a microprocessor that performs general-purpose data processing. It controls the other components of the computer 1 in accordance with a control program such as the BIOS (Basic Input/Output System) stored in the ROM 12 and the OS (operating system) stored in the HDD 15.

The ROM 12 is a non-volatile memory that stores control programs such as the BIOS. The RAM 13 is a volatile memory for temporarily storing data used by the CPU 11 and other components. When the computer 1 is powered on, the BIOS in the ROM 12 is read by the CPU 11 and written to the RAM 13. The CPU 11 sets up the hardware environment according to the BIOS in the RAM 13. The operation unit 14 includes a keypad, a mouse, and the like, and sends data reflecting the user's operations to the CPU 11. The HDD 15 is a non-volatile memory with a large storage area, and the data stored in it is rewritable. The HDD 15 stores the OS, various applications, and the data used by those applications. After the BIOS has set up the hardware environment, the CPU 11 reads the OS from the HDD 15, writes it to the RAM 13, and, in accordance with the OS, performs processing such as setting up a GUI (Graphical User Interface) environment and an application execution environment. The principal application stored in the HDD 15 is the singing synthesis data input application.

When the CPU 11 receives an instruction from the user, through operation of the mouse or the like, to execute the singing synthesis data input application, it reads the application from the HDD 15, writes it to the RAM 13, and sets up an environment for performing various processes according to the application. In this way, the computer 1 functions as the singing synthesis data input device according to the present embodiment.

The display 16 includes a liquid crystal display and a drive circuit that drives the liquid crystal display under the control of the CPU 11, and displays information such as characters and graphics.

The data input/output unit 17 is an interface capable of inputting and outputting various data, such as a USB (Universal Serial Bus) interface. It receives data from external devices and transfers the received data to the CPU 11, and sends data generated by the CPU 11 to external devices.
The sound source unit 18 generates a musical sound signal based on the input data and outputs it as musical sound from the speaker 19.

FIG. 2 is a block diagram showing the functional configuration of the singing synthesis data input device provided when the CPU 11 of the computer 1 executes the singing synthesis data input application. As shown in the figure, the singing synthesis data input device has operation means 21, external data input means 22, data editing means 23, display means 24, and storage means 25. Of these, the operation means 21 is the operation unit 14 of the computer 1, and the storage means 25 is the RAM 13 and the HDD 15 of the computer 1. The external data input means 22 is the data input/output unit 17 of the computer 1, and the display means 24 is the display 16. The data editing means 23 is a software module that constitutes the singing synthesis data input application.

The external data input means 22 acquires singing score data 251 from an external device or the like and sends it to the data editing means 23. The data editing means 23 stores the received singing score data 251 in the storage means 25 and edits it in response to operation of the operation means 21. The display means 24 displays the singing score data 251 stored in the storage means 25.

FIG. 3 illustrates the structure of the singing score data 251. The singing score data 251 is time-series data consisting of note data items, one for each of the series of notes constituting the piece of music. In FIG. 3, each horizontal row of data represents one note data item. One note data item consists of a number, a pitch, a phonetic symbol, a sounding period, input characters, and display characters. The number in each note data item indicates the position, counted from the beginning of the song, of the note that the item represents. The pitch and sounding period in each note data item designate the pitch and sounding period of the corresponding note. The phonetic symbol, input characters, and display characters are information that the data editing means 23 assigns to each note data item in response to the user's operation of the operation means 21. The characteristic feature of the present embodiment lies in this process, performed by the data editing means 23, of assigning phonetic symbols and the like to each note data item. The storage means 25 stores dictionary data 253 and editing rule data 252, which are referred to in this assignment process.
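The note data layout described above can be pictured as a simple record type. The following is a minimal Python sketch under stated assumptions: the class name `NoteData`, the field names, the pitch notation, and the use of seconds for the sounding period are all hypothetical and are not taken from the patent.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class NoteData:
    """One row of the singing score data (field names are assumptions)."""
    number: int              # position of the note from the start of the song
    pitch: str               # e.g. "C4"; the actual pitch encoding is not specified
    duration_sec: float      # sounding period, assumed here to be in seconds
    phonetic: str = ""       # assigned later by the data editing step
    input_text: str = ""     # what the user typed for this note, e.g. "ma-"
    display_text: str = ""   # what is shown on the note bar


# The singing score data is a time-ordered list of note data items.
score: List[NoteData] = [
    NoteData(1, "C4", 0.4),
    NoteData(2, "D4", 0.4),
    NoteData(3, "E4", 0.8),
]
```

The three text fields start out empty because, per the description, they are filled in only when the user enters lyrics.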

FIG. 4 shows the structure of the dictionary data. The dictionary data is a collection of entries, one for each of a large number of words such as English words. The entry for one word consists of the spelling of the word, a phonetic symbol string representing how the word is pronounced, and information indicating the division count, that is, the number of syllables into which the word's pronunciation is divided.
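As an illustration, the dictionary data could be modeled as a mapping from spelling to a phonetic string and a division count. This is only a sketch: `DictEntry`, `DICTIONARY`, and `lookup` are hypothetical names, and the single entry is the "amazing" example given later in the description. Lowercasing the key is an added assumption.

```python
from typing import Dict, NamedTuple, Optional


class DictEntry(NamedTuple):
    phonetic: str        # phonetic symbol string for the whole word
    syllable_count: int  # division count (number of syllables)


# Only the entry quoted in the description is included here.
DICTIONARY: Dict[str, DictEntry] = {
    "amazing": DictEntry("@meIZIN", 3),
}


def lookup(word: str) -> Optional[DictEntry]:
    """Return the entry for a word, or None if the word is unknown."""
    return DICTIONARY.get(word.lower())
```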

In the present embodiment, the user can enter the lyrics to be assigned to a series of note data items by operating the operation means 21. When entering lyrics, the user sometimes needs to assign a word consisting of several syllables to several consecutive notes.
The present embodiment is designed for the user's convenience in such cases. To assign a word of n syllables to n consecutive notes, the user only has to observe the following rules.
a. Enter the first notation character of the word in association with the first note, and enter the last notation character of the word in association with the last note.
b. Data entered in association with any note other than the last note must always end with the notation character connection symbol (for example, a hyphen).
c. Notation characters other than the first and last notation characters of the word are entered in the same order in which they appear in the word.
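Rule b lends itself to a mechanical check. The sketch below assumes the per-note inputs arrive as a list of strings and that the connection symbol is an ASCII hyphen; the function name `fragments_follow_rule` is hypothetical. It does not verify rules a and c, which concern the order of characters within the word.

```python
from typing import List

CONNECTOR = "-"  # notation character connection symbol (a hyphen in the example)


def fragments_follow_rule(fragments: List[str]) -> bool:
    """Check rule b: every fragment except the last ends with the connector,
    and the last fragment is non-empty and does not end with it."""
    if not fragments:
        return False
    *head, last = fragments
    return (
        all(f.endswith(CONNECTOR) for f in head)
        and bool(last)
        and not last.endswith(CONNECTOR)
    )
```

For example, `["a-", "ma-", "zing"]` passes, while a list whose last fragment still ends in a hyphen is treated as an unfinished word.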

When the data editing means 23 receives data entered according to these rules from the operation means 21, it interprets the notation character strings joined by the notation character connection symbol as one word. The data editing means 23 then refers to the dictionary data 253 stored in the storage means 25 to obtain the phonetic symbol string for this word, divides it into n syllables each containing one vowel, and assigns them to the n note data items.
The editing rule data 252 represents special rules that are referred to in the assignment process described above. For example, English has a reduced vowel (the schwa). The schwa is used when a sound is pronounced weakly and briefly; when the corresponding note is loud or has a long sounding period, it is converted to an ordinary vowel. As a more specific example, if a schwa is assigned in the assignment process to a note sustained for 0.5 seconds or longer, the data editing means 23 converts it to the phonetic symbol "Q".
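The schwa rule can be sketched as a small function. The 0.5-second threshold and the replacement symbol "Q" come from the description; representing the schwa as "@" follows the "amazing" example given later. The function name, the duration-in-seconds parameter, and the decision to ignore the volume condition are assumptions of this sketch. It also treats every "@" in a syllable as a schwa, which would be wrong for a diphthong symbol that contains "@".

```python
SCHWA = "@"            # schwa symbol, as in the phonetic string "@meIZIN"
REPLACEMENT = "Q"      # symbol the description says the schwa is converted to
LONG_NOTE_SEC = 0.5    # threshold quoted in the description


def adjust_schwa(syllable: str, duration_sec: float) -> str:
    """Replace a schwa with an ordinary vowel symbol on long notes.

    Simplification: only the sounding period is checked, not the volume.
    """
    if SCHWA in syllable and duration_sec >= LONG_NOTE_SEC:
        return syllable.replace(SCHWA, REPLACEMENT)
    return syllable
```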

Another rule changes the pronunciation of "h". Under this rule, "h" is made unvoiced or voiced depending on the neighboring sounds. Specifically, when the data editing means 23 assigns the phonetic symbol "h" to a note data item in the assignment process, it makes the "h" voiced if a vowel is assigned to the preceding note data item and there is no rest between the two note data items. For example, in the lyrics "She hits", the "h" of "hits" is unvoiced if there is a rest between "She" and "hits", and voiced if there is no rest.
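The "h" rule depends only on the preceding note and on whether a rest intervenes, so it can be expressed as a predicate. In this sketch the function name, the parameters, and the set of vowel characters are assumptions; the vowel set is a partial, SAMPA-style inventory chosen for illustration, and "Si:" is used as an assumed transcription of "She".

```python
from typing import Optional

# Assumed, partial set of characters that occur in SAMPA-style vowel symbols.
VOWEL_CHARS = set("@IieE{VQUuOAa")


def h_is_voiced(prev_syllable: Optional[str], rest_before: bool) -> bool:
    """Decide whether an "h" at the start of a note should be voiced.

    prev_syllable: phonetic syllable on the preceding note, or None if there is none.
    rest_before:   True if a rest separates the preceding note from this one.
    """
    if prev_syllable is None or rest_before:
        return False
    return any(ch in VOWEL_CHARS for ch in prev_syllable)
```

With these assumptions, "She hits" sung without a rest gives a voiced "h", and inserting a rest makes it unvoiced, matching the example in the text.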

Next, the operation of the singing synthesis data input device will be described. When the CPU 11 executes the singing synthesis data input application, a piano roll such as the one shown in FIG. 5 is displayed on the display means 24. Then, when an instruction to read singing score data is given through the operation means 21, the singing score data is read from outside using the external data input means 22.

The singing score data that has been read is stored in the storage means 25 via the data editing means 23. The singing score data stored in the storage means 25 is displayed on the display means 24 as a piano roll screen, which is shown in FIG. 5. On the piano roll screen, the vertical direction corresponds to the direction in which piano keys are arranged, with low-pitched keys at the bottom and high-pitched keys at the top. That is, the vertical axis of the piano roll screen is the pitch axis, and the horizontal axis is the time axis. In the figure, each rectangle outlined with a thick line represents one note data item in the singing score data and is called a note bar. The position of a note bar along the pitch axis indicates the pitch designated by the corresponding note data item. The position of a note bar along the time axis indicates the sounding timing of the note designated by the note data item, and the length of the note bar indicates the sounding period designated by the note data item.

In the present embodiment, when a word is divided and assigned to several note data items, the user enters the word through the operation means 21 using hyphens to indicate that the word continues, as shown in FIG. 6. In FIG. 6, the word "amazing" is entered as "a-", "ma-", and "zing". The "-" attached to "a" and "ma" indicates that entry of the word is still in progress and that notation characters remain to be assigned to the following note bars. In the example shown in the figure, "a-" connects to "ma-", and the result further connects to "zing". Because "zing" has no hyphen, it is the final notation character string of the word.

The following describes how "amazing", entered in this way, is processed in the present embodiment.

First, the user uses the operation means 21 to put the first note bar NB1, to which "amazing" is to be assigned, into a lyric input waiting state in which lyrics can be entered on the note bar, for example by double-clicking it. The user then enters "a" and "-" with the operation means 21 and presses the Enter key.

As a result, the data editing means 23 acquires the input data "a-" in association with the note bar NB1. It displays the input data "a-" in association with the note bar NB1 and stores it in a buffer area (not shown) reserved in the storage means 25. Next, by the same operation, the user selects the note bar NB2 and enters "ma-". The data editing means 23 then appends the newly acquired input data "ma-" to the existing input data "a-" in the buffer area of the storage means 25, so that the input data in the buffer area becomes "a-ma-". Next, by the same operation, the user selects the note bar NB3 and enters "zing". The data editing means 23 appends the newly acquired input data "zing" to the existing input data "a-ma-" in the buffer area, so that the input data in the buffer area becomes "a-ma-zing". Because the entered "zing" does not end with "-", the data editing means 23 detects that all notation characters constituting the word have been entered.

Next, the data editing means 23 removes the hyphens from the input data "a-ma-zing" in the buffer area of the storage means 25 and obtains the word "amazing".
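The buffering behavior described above (append each fragment, detect completion when a fragment does not end with a hyphen, then strip the hyphens) can be sketched as follows. The class name `WordBuffer` and its `add` method are hypothetical, and returning the note count together with the word is a convenience added for this sketch.

```python
from typing import List, Optional, Tuple


class WordBuffer:
    """Accumulates per-note input until a fragment without a trailing hyphen arrives."""

    def __init__(self) -> None:
        self.fragments: List[str] = []

    def add(self, fragment: str) -> Optional[Tuple[str, int]]:
        """Append one note's input.

        Returns None while the word is still in progress. Once the word is
        complete, returns (word, number_of_notes) and clears the buffer.
        """
        self.fragments.append(fragment)
        if fragment.endswith("-"):
            return None  # more fragments are expected
        word = "".join(self.fragments).replace("-", "")
        note_count = len(self.fragments)
        self.fragments = []
        return word, note_count
```

Feeding "a-", "ma-", and "zing" in turn yields nothing for the first two calls and then the word "amazing" together with a note count of 3.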

Next, the data editing means 23 searches the dictionary data 253 for "amazing" to obtain its division count and phonetic symbols. The dictionary data 253 stores "3" and "@meIZIN" as the division count and phonetic symbols of "amazing", respectively, so the data editing means 23 obtains these.
The data editing means 23 then compares this division count with the number of note data items (note bars) to which the user assigned the word. If the two numbers do not match, it displays an error on the display means 24 and aborts the process.
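The comparison step amounts to a guard before any assignment happens. A minimal sketch follows, in which the exception type and the function name are hypothetical; the description only says that an error is displayed and processing stops.

```python
class SyllableCountError(ValueError):
    """Raised when a word spans a different number of notes than it has syllables."""


def check_note_count(word: str, syllable_count: int, note_count: int) -> None:
    """Abort (by raising) if the dictionary's division count and the note count differ."""
    if syllable_count != note_count:
        raise SyllableCountError(
            f"'{word}' has {syllable_count} syllables "
            f"but was assigned to {note_count} notes"
        )
```

A caller would catch `SyllableCountError` and show its message to the user instead of updating the score.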

Next, the data editing means 23 divides the obtained phonetic symbol string so that it can be assigned to the notes. It does so using a rule that takes the positions of the vowels as the dividing points. This is because, in an ordinary score, one syllable corresponds to one note, and one syllable contains one vowel. Because the phonetic symbols of "amazing" are "@meIZIN", the string is divided into the three syllables "@", "meI", and "ZIN". The data editing means 23 then stores the divided syllables in the phonetic symbol fields of the note data items in the singing score data that correspond to the note bars NB1, NB2, and NB3.
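One way to realize a vowel-delimited split is to close a syllable immediately after each vowel and attach any trailing consonants to the last syllable. The sketch below is an assumption about how the rule might be implemented, not the patent's stated algorithm; the vowel inventory is a partial SAMPA-style list chosen for illustration, and the function names are hypothetical. It reproduces the split given in the text for "@meIZIN".

```python
from typing import List, Tuple

# Assumed, partial vowel inventory; multi-character symbols come first so that
# "eI" is matched as one vowel rather than "e" followed by "I".
VOWELS = ["eI", "aI", "OI", "@U", "aU", "i:", "u:",
          "@", "I", "e", "{", "V", "Q", "U", "O", "A"]


def tokenize(phonetic: str) -> List[Tuple[str, bool]]:
    """Split a phonetic string into (symbol, is_vowel) pairs."""
    tokens = []
    i = 0
    while i < len(phonetic):
        for v in VOWELS:
            if phonetic.startswith(v, i):
                tokens.append((v, True))
                i += len(v)
                break
        else:
            # No vowel symbol starts here, so treat one character as a consonant.
            tokens.append((phonetic[i], False))
            i += 1
    return tokens


def split_syllables(phonetic: str) -> List[str]:
    """Close a syllable after each vowel; trailing consonants join the last syllable."""
    syllables: List[str] = []
    current = ""
    for symbol, is_vowel in tokenize(phonetic):
        current += symbol
        if is_vowel:
            syllables.append(current)
            current = ""
    if current:
        if syllables:
            syllables[-1] += current
        else:
            syllables.append(current)
    return syllables
```

Under these assumptions, `split_syllables("@meIZIN")` returns `["@", "meI", "ZIN"]`, so the number of syllables equals the number of vowels, which is the property the assignment step relies on.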

Next, the data editing means 23 corrects the phonetic symbols in the singing score data in the storage means 25 according to the rules stored in the editing rule data 252. FIG. 7 shows the final singing score data after this processing. The phonetic symbols shown in FIG. 7 are a phonetic alphabet based on SAMPA, which defines phonetic symbols so that they are easy to handle on a computer.
In the case of the data shown in FIG. 6, none of the rules in the editing rule data 252 applies, so nothing is changed.
In this way, the final singing score data corresponding to "amazing" is obtained in the storage means 25.

Next, with reference to FIG. 8(a), the operation when the user enters the word "amazing" by assigning it to four notes will be described. In this example, the user has entered notation characters on the note bars NB1 to NB4 as shown in the figure. Because the word "amazing" has three syllables, this is an example of the user entering the word under a misunderstanding.
In this case, as in the processing described above, the data editing means 23 stores the user's input "a-ma-zi-ng" in the buffer area.
Next, the data editing means 23 removes the hyphens from "a-ma-zi-ng" stored in the buffer area and obtains the word "amazing".

Next, the data editing means 23 searches the dictionary data 253 for "amazing" to obtain its division count and phonetic symbols. The dictionary data 253 stores "3" and "@meIZIN" as the division count and phonetic symbols of "amazing", respectively, so the data editing means 23 obtains these. The data editing means 23 then compares this division count with the number of note bars across which the word was entered. In this case the user entered "amazing" across four note bars, so the number of note bars differs from the division count. The data editing means 23 therefore displays an error on the display means 24 and aborts the process.

Next, the operation is described for the case where the user correctly assigns the word “amazing” to three notes but, as shown in FIG. 8B, splits its syllables incorrectly. In this case, the user's input “ama-zi-ng” is stored in the buffer area, but because the hyphens are then removed, the content of the buffer area becomes the word “amazing”, as before. Accordingly, as before, the three syllables “@”, “meI”, and “ZIN” that make up the phonetic symbols of this word are assigned to the three note data.
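A minimal sketch of why the position of the hyphens does not matter: the per-note phonetic symbols come from the dictionary, not from the user's split. The `SYLLABLE_SYMBOLS` table and the function name `assign_syllables` are hypothetical names introduced for illustration only.

```python
# Hypothetical dictionary entry: word -> phonetic symbols, one per syllable
SYLLABLE_SYMBOLS = {"amazing": ["@", "meI", "ZIN"]}


def assign_syllables(note_bar_texts):
    """Return one phonetic symbol per note bar, in order."""
    # Hyphens are discarded, so only the joined word is looked up.
    word = "".join(note_bar_texts).replace("-", "")
    symbols = SYLLABLE_SYMBOLS[word]
    if len(symbols) != len(note_bar_texts):
        raise ValueError("note bar count differs from syllable count")
    return list(symbols)
```

Under these assumptions, `["ama-", "zi-", "ng"]` and `["a-", "ma-", "zing"]` yield the same assignment.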

Next, as shown in FIG. 8C, the operation when the user assigns the word “amazing” to four notes is described. This is not a case in which the user mistakes the syllable count of “amazing” for four; rather, the user knows the correct syllable count and deliberately assigns the word to four notes.
Specifically, assume that the user enters text in note bars NB1 to NB4 as shown in FIG. 8C, intending to prolong the sound of “ma-”.
In this case, the user enters only “-” in the note bar that follows the note bar containing “ma-”. A “-” entered in this way is replaced with another special symbol, for example “*”, and stored in the buffer area.
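The substitution of a lone hyphen can be sketched as follows. The constant `EXTENSION_MARK`, the function name `build_buffer`, and the use of ASCII characters are assumptions made for illustration; the source only says that the lone “-” is replaced with a special symbol such as “*”.

```python
EXTENSION_MARK = "*"  # hypothetical name for the special symbol


def build_buffer(note_bar_texts):
    """Join note bar texts, replacing a bar holding only "-" with "*"."""
    parts = []
    for text in note_bar_texts:
        # A note bar containing nothing but a hyphen extends the previous sound.
        parts.append(EXTENSION_MARK if text == "-" else text)
    return "".join(parts)
```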

Therefore, when the user has finished entering text in the four note bars, “a-ma-*zing” remains in the buffer area. The data editing means 23 deletes the hyphens from “a-ma-*zing” stored in the buffer area to obtain the character string “ama*zing”. Next, on detecting that “ama*zing” in the buffer area contains “*”, the data editing means 23 copies “ama*zing” to a second buffer area and then removes the “*” from “ama*zing” in the buffer area to obtain the word “amazing”.
Next, the data editing means 23 searches the dictionary data 253 for “amazing” to obtain its number of divisions and its phonetic symbols. The dictionary data 253 stores “3” as the number of divisions and “@meIZIN” as the phonetic symbols of “amazing”, so the data editing means 23 obtains these values.

Next, the data editing means 23 compares this number of divisions with the number of note bars across which the word was entered. Here, the number of divisions of “amazing” is “3”, whereas the user entered “amazing” across “4” note bars, so the number of note bars exceeds the number of divisions by “1”.
However, the second buffer area holds one “*”, which means that one of the four note bars carries over the sound of the preceding note bar. In this case, the data editing means 23 subtracts the number of “*” symbols from the number of note bars and compares the result with the number of divisions. Here the two are equal, so the data editing means 23 continues the processing.
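The adjusted comparison amounts to simple arithmetic, sketched below. The function name `counts_match` and its parameters are hypothetical; the rule itself (note bars minus “*” symbols must equal the number of divisions) is the one stated above.

```python
def counts_match(buffer, note_bar_count, divisions):
    """Compare divisions with note bars, discounting extension marks."""
    marked = buffer.replace("-", "")      # e.g. "ama*zing"
    extensions = marked.count("*")        # note bars that only extend a sound
    return note_bar_count - extensions == divisions
```

For FIG. 8C the check is 4 - 1 = 3, which matches; for FIG. 8A it is 4 - 0 = 4, which does not.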

The data editing means 23 then performs the following processing. First, the data editing means 23 takes syllables one at a time from the head of “ama*zing” in the buffer area. “a” is taken first, so the phonetic symbol “@” corresponding to this syllable is assigned to the first of the note data corresponding to the four note bars in which “amazing” was entered.
Next, “ma” is taken, so the phonetic symbol “meI” corresponding to this syllable is assigned to the second note data.
Next, “*” is taken, so the phonetic symbol “I”, which prolongs the preceding phonetic symbol “meI”, is assigned to the third note data.
Finally, “zing” is taken, so the phonetic symbol “ZIN” corresponding to this syllable is assigned to the fourth note data.
Through the above processing, singing score data as shown in FIG. 9 is obtained.
Thus, according to the present embodiment, when the user enters a word across a plurality of note bars, the syllables constituting the word are automatically assigned to the note data corresponding to the respective note bars, and singing score data is generated. This makes it easy for the user to edit the singing score data.
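The walk over “ama*zing” can be sketched as follows. Two points are assumptions rather than statements from the source: the hypothetical `DICTIONARY` pairs each spelled syllable with its phonetic symbol, and the prolonging symbol is taken as the last character of the preceding phonetic symbol, which reproduces the “meI” to “I” example but is not given as a general rule.

```python
# Hypothetical entry: (spelled syllable, phonetic symbol) pairs per word
DICTIONARY = {"amazing": [("a", "@"), ("ma", "meI"), ("zing", "ZIN")]}


def assign_symbols(buffer):
    """Return one phonetic symbol per note bar for a buffer like "a-ma-*zing"."""
    marked = buffer.replace("-", "")        # "ama*zing"
    word = marked.replace("*", "")          # "amazing"
    syllables = iter(DICTIONARY[word])
    result = []
    pos = 0
    while pos < len(marked):
        if marked[pos] == "*":
            if not result:
                raise ValueError("extension mark has no preceding syllable")
            # Assumption: prolong with the last character of the previous symbol.
            result.append(result[-1][-1])
            pos += 1
        else:
            spelling, symbol = next(syllables)
            if not marked.startswith(spelling, pos):
                raise ValueError("extension mark falls inside a syllable")
            result.append(symbol)
            pos += len(spelling)
    return result
```

Each element of the returned list would be written to the note data of the corresponding note bar, in order.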

In this embodiment, the English alphabet is used for the notation characters. However, the notation characters may be Chinese characters or symbol strings, as long as they are converted into phonetic symbols by the dictionary data.

FIG. 1 is a diagram showing the configuration of a computer that implements the data input interface for singing synthesis according to the present invention.
FIG. 2 is a functional block diagram of the data input interface for singing synthesis according to the present invention.
FIG. 3 is a diagram showing the structure of singing score data.
FIG. 4 is a diagram showing the structure of dictionary data.
FIG. 5 is a diagram showing a display example of a piano roll.
FIG. 6 is a diagram showing a specific example of entering lyrics in the piano roll.
FIG. 7 is a diagram showing the structure of singing score data.
FIG. 8 is a diagram showing specific examples of entering lyrics in the piano roll.
FIG. 9 is a diagram showing the structure of singing score data.

Explanation of symbols

1 ... computer system, 11 ... CPU, 12 ... ROM, 13 ... RAM, 14 ... operation unit, 15 ... HDD, 16 ... display, 17 ... data input/output unit, 18 ... sound source unit, 19 ... speaker, 21 ... operating means, 22 ... external data input means, 23 ... data editing means, 24 ... display means, 25 ... storage means.

Claims

A data input program for singing synthesis, characterized by causing a computer to execute:
a storage process of storing, in storage means, singing score data composed of a plurality of note data that correspond to a plurality of notes constituting a piece of music and that include information specifying the sound production volume or the sound generation period;
a word acquisition process of acquiring, via an input device, input data that is entered in association with a plurality of consecutive note data stored in the storage means and that includes a plurality of notation characters constituting one word and a notation character connection symbol connecting two consecutive notation characters, and acquiring a word from the input data;
a word information acquisition process of acquiring, for the word acquired in the word acquisition process, phonetic symbols indicating how the word is pronounced and division information indicating the number of syllables constituting the word; and
a singing score data update process of comparing the number indicated by the division information acquired in the word information acquisition process with the number of note data associated with the word acquired in the word acquisition process and, if the two match, dividing the phonetic symbols acquired for the word into syllables each including one vowel, assigning the syllables to the plurality of consecutive note data, among the note data in the singing score data, that are associated with the input data from which the word was acquired, and, if a syllable assigned to an item of note data is a predetermined syllable and the sound production volume or sound generation period specified by that note data is equal to or longer than a predetermined time, changing the phonetic symbol corresponding to that syllable to another phonetic symbol, thereby updating the singing score data stored in the storage means.

A data input program for singing synthesis, characterized by causing a computer to execute:
a storage process of storing, in storage means, singing score data composed of a plurality of note data corresponding to a plurality of notes constituting a piece of music;
a word acquisition process of acquiring, via an input device, input data that is entered in association with a plurality of consecutive note data stored in the storage means and that includes a plurality of notation characters constituting one word and a notation character connection symbol connecting two consecutive notation characters, and acquiring a word from the input data;
a word information acquisition process of acquiring, for the word acquired in the word acquisition process, phonetic symbols indicating how the word is pronounced and division information indicating the number of syllables constituting the word; and
a singing score data update process of comparing the number indicated by the division information acquired in the word information acquisition process with the number of note data associated with the word acquired in the word acquisition process and, if the two match, dividing the phonetic symbols acquired for the word into syllables each including one vowel, assigning the syllables to the plurality of consecutive note data, among the note data in the singing score data, that are associated with the input data from which the word was acquired, and, if a syllable assigned to an item of note data is a predetermined syllable, changing the phonetic symbol corresponding to that syllable to another phonetic symbol determined on the basis of the syllables assigned to the note data before and after that note data, thereby updating the singing score data stored in the storage means.