When it comes to creating quality vocal tracks, Yamaha's VOCALOID singing synthesis technology literally changes everything. The future will now be very different. The last bastion of human musical expression - the singing voice - has been realistically harnessed in synthesis. Now it's time for YOU to express your music in YOUR own words. You can sing your heart out - through LEON, LOLA, MIRIAM, and other VOCALOID virtual vocalists from Zero-G, thanks to the incredible new technology developed by Yamaha.
LEON is a virtual male soul vocalist modelled on a real professional singer, and when he is installed into your PC he will literally allow you to create singing of superb quality and realism. LEON will sing ANY words you ask him to in English - literally anything - be they beautiful lyrics or comical trivialities, Monteverdi madrigals or manic chants. You can create vocal tracks of soulful singing in any lyrics you want. You just type in lyrics, and synthesize. Then add expression to taste. LEON is under your total control, and the really mind-blowing thing is - he can truly sound like a professional singing voice. With very little practice the results you get from LEON will completely fool your friends - they will not guess that it is not a real singer. The question you will hear will always be "WHO is that?", and not "What is that?".
How Does It Work? Why does LEON sound so real?
Few events have dramatically diverted the course of musical history. Synthesis, MIDI, Sampling - inventions of this magnitude are very rare events indeed. Singing Synthesis is such an event. Most acoustic instruments can be simulated well using various synthesis techniques, but (until Yamaha's VOCALOID technology came along) the singing voice resisted any serious simulation attempts. Hardly surprising, since singing has an extremely wide range of articulations, timbres, and transitions between sounds. And singing can communicate words as well as melody, which means you have a double layer of meaning, unlike other instruments. The human ear is so used to hearing the voice that even the finest tonal shifts or anomalies are immediately noticed. However, Yamaha's VOCALOID is in fact a totally new vocal-synthesis technology, achieving a much higher level of sophistication in this exciting area! The team at the Yamaha Advanced System Development Center in Japan has created software that emulates the singing voice with incredible accuracy.
Zero-G's development team have been working closely with Yamaha to create many 'vocal fonts' (virtual vocalists) for the VOCALOID system. The team starts each project by recording a professional vocalist singing literally all possible phonemes and transitions between syllables. Each transition is slightly different depending on the particular combination of phonemes, and these differences play a big part in how we understand words and whether a vocal track sounds natural or artificial. For example, the phoneme "p" sounds slightly different at the beginning of a word than it does at the end, and it affects the vowels next to it differently than, say, the phoneme "t".
The recordings of the professional singers are converted to the frequency domain using Fast Fourier Transform and divided into thousands of separate phonetic transitions which are processed in a unique way and then stored for use with the VOCALOID synthesis engine. Expressive tools such as vibrato, pitch bend, and attack are also derived from the real singing and stored separately.
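As a rough illustration of what "converted to the frequency domain" means, the short Python sketch below analyses one frame of a test tone. It is only a sketch under stated assumptions: it uses a plain (slow) discrete Fourier transform, which gives the same result a Fast Fourier Transform computes more quickly, and the frame size, the test tone, and the function name are made up for the example. It is not Yamaha's or Zero-G's actual processing.

```python
import cmath
import math


def dft_magnitudes(frame):
    """Return the magnitude of each frequency bin (0 .. N/2) of one frame.

    Plain O(N^2) discrete Fourier transform; an FFT yields the same
    values faster. Illustrative only, not the VOCALOID analysis code.
    """
    n = len(frame)
    magnitudes = []
    for k in range(n // 2 + 1):
        total = sum(
            frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)
        )
        magnitudes.append(abs(total))
    return magnitudes


# Hypothetical test frame: 64 samples holding exactly 5 cycles of a sine wave.
FRAME_SIZE = 64
CYCLES = 5
frame = [math.sin(2 * math.pi * CYCLES * t / FRAME_SIZE) for t in range(FRAME_SIZE)]

mags = dft_magnitudes(frame)
# The energy of the tone shows up in the bin matching its cycle count.
peak_bin = max(range(len(mags)), key=mags.__getitem__)
print("strongest bin:", peak_bin)
```

In the frequency domain the tone appears as a single strong bin, which is what makes it practical to separate and store the parts of a sung sound.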
To create a vocal track, you enter music and lyrics into the VOCALOID Editor (see screenshot). The melody can be entered by hand in the piano-roll style Editor or imported from a Standard MIDI File; the words are entered manually (either as words or as phonemes). Expressive elements can be imported from a MIDI File previously saved by the VOCALOID Editor, or entered via a graphic palette using drag-and-drop. Further detailed programming of expression parameters can be done graphically, for finely detailed results.
The data that you have entered in the Editor is sent to the synthesis engine, which fetches what it needs from the phonetic and expression databases to synthesize the track. To sing the word "part", for example, the software combines four elements from the phonetic database: "p" (as it sounds at the beginning of a word), "p-ar" (the transition from "p" to "ar"), "ar-t" (the transition from "ar" to "t"), and "t" (as it sounds at the end of a word). The two "ar" elements are blended together, and the resulting vowel "a" is lengthened as necessary to accommodate your chosen melody and rhythm.
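The "part" example can be sketched in a few lines of Python. The function name and the unit labels below are hypothetical, chosen only to mirror the description above; the real VOCALOID database format is not described here, and the handling of one-phoneme words is an assumption.

```python
def units_for_word(phonemes):
    """List the database elements needed to sing one word.

    Follows the scheme described in the text: the first phoneme as it
    sounds at the start of a word, one transition per adjacent pair,
    and the last phoneme as it sounds at the end of a word.
    Labels are illustrative, not real VOCALOID identifiers.
    """
    if not phonemes:
        return []
    units = [f"{phonemes[0]} (word-initial)"]
    for left, right in zip(phonemes, phonemes[1:]):
        units.append(f"{left}-{right}")
    units.append(f"{phonemes[-1]} (word-final)")
    return units


part_units = units_for_word(["p", "ar", "t"])
print(part_units)
# ['p (word-initial)', 'p-ar', 'ar-t', 't (word-final)']
```

The two middle elements both contain "ar"; those are the pieces the engine blends and stretches to fit the note length.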
Different pitches are derived by shifting the fundamental and overtones while leaving the vowel formants relatively untouched. The database elements were derived from phrases originally sung at different pitches, limiting the amount of shifting the engine needs to do (and therefore improving ultimate realism). A 2 GHz Pentium 4 computer renders the track in the frequency domain and converts it into the time domain in less than one-third of real time. For example, a 1-minute track can be rendered in less than 20 seconds.
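To picture what "shifting the fundamental and overtones while leaving the formants untouched" means, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the formant frequencies, the bandwidth, the bell-shaped envelope, and the function names are invented, and this is not the VOCALOID engine's actual method.

```python
import math

# Hypothetical formant centres (Hz) and bandwidth for an "ah"-like vowel.
FORMANTS_HZ = (700.0, 1200.0, 2600.0)
BANDWIDTH_HZ = 150.0


def envelope(freq_hz):
    """Fixed vowel envelope: loud near each formant, quiet elsewhere."""
    return sum(
        math.exp(-((freq_hz - f) / BANDWIDTH_HZ) ** 2) for f in FORMANTS_HZ
    )


def harmonics(f0_hz, max_hz=4000.0):
    """Return (frequency, level) for each overtone of a note at f0_hz.

    The overtone frequencies move with the pitch, but each level is
    read from the same fixed envelope, so the vowel colour stays put.
    """
    count = int(max_hz // f0_hz)
    return [(k * f0_hz, envelope(k * f0_hz)) for k in range(1, count + 1)]


def strongest_harmonic(f0_hz):
    """Frequency of the loudest overtone for a note at f0_hz."""
    return max(harmonics(f0_hz), key=lambda h: h[1])[0]


low_note = strongest_harmonic(110.0)   # A2
high_note = strongest_harmonic(220.0)  # A3, one octave up
print(low_note, high_note)
```

Although the overtones of the higher note sit twice as far apart, the loudest one still lands close to one of the fixed formants, which is why the vowel remains recognisable at both pitches.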
You have control over all of the following parameters, which you can bring to bear
on any part of your creation: