User:Arknarok/Romanization Guidelines

The following guidelines are my way of romanizing Touhou songs. I do not ask anyone to follow them, though it would be nice if romanization were more standardized.

  1. While the base of my romanization is Hepburn-style romaji, I use a special variant of Hepburn to write lyrics (let's call it Lyrics-style Hepburn). It is based on Revised Hepburn romaji (see Wikipedia on Hepburn versions).
    1. All consonants are borrowed unchanged from Revised Hepburn. This means づ is "zu" (not "dzu" or "du") and ぢ is "ji". Additionally, should I ever encounter obsolete kana (like ゐ), they should be romanized with "w-" (like "wi"), even though the "w-" won't really be sung.
    2. Even though consonants follow Revised Hepburn, there is one exception: っち (and similar sounds) is to be romanized as "cchi" (or "ccha", "cchu" etc.) in Wapuro romaji style, not "tchi". While I agree that "tchi" makes more sense, it isn't really obvious that the purpose of the letter "t" is to lengthen the sound. The main point of lyric romanization is that it should be as intuitive as possible for someone who doesn't know Japanese; more advanced Japanese users would definitely prefer better romanization methods.
    3. Unlike Revised Hepburn, I do not use macrons to indicate longer sounds at all. I use "oo", "ou", "ei", "ee" etc. instead of "ē", "ū" and so on. Again, if you are trying to teach people how to pronounce stuff correctly, strict Hepburn is somewhat better, but it still isn't exactly perfect (and people are going to mimic pronunciation of the singer anyway).
    4. For cases when ん is immediately followed by a vowel (or "y"), "n'" is used to indicate its sound. This conforms with Revised Hepburn. So 陰陽 would be romanized as "in'you".
    5. Direct romanization of katakana is something you should put some of your own judgment into (it can sometimes be translated instead). When romanization is needed, I use any theoretically possible Japanese sounds ("fi", "vo" and the like are okay). However, the long vowel mark (ー) is to be turned into the preceding vowel (ルール would be "ruuru"). Do not use "l" when romanizing katakana (or any other Japanese writing system). Either stick with Japanese sounds or translate the word (it doesn't do much harm).
    6. Following Revised Hepburn, the ん sound is always romanized as "n" (see 1.4 for an additional note). Traditional Hepburn sometimes converted the sound to "-m" (for example, ぐんま would be "gumma"). This is no longer needed; just use "n" in all cases (ぐんま would be "gunma" now).
  2. As a particle, は is romanized as "wa". Similarly, へ is romanized as "e" when used as a particle. However, を is always "wo". This is to eliminate inconsistency and, besides, sometimes the "w" in "wo" is actually sung. On the other hand, there isn't a universal truth on what を should be. If you like it as "o", go ahead, but try to maintain consistency with your previous work.
  3. Half-width katakana doesn't need to follow any strict romanization rules, because it is usually used to express sounds and not words. Be careful when romanizing it.
    1. Optional small hiragana (that is, not the type that actually affects pronunciation) doesn't need to follow strict romanization rules either. You could romanize なぁに as "nani" or "naani". The best thing to do is to listen to the song and see which one suits best.
  4. Punctuation should follow the Japanese text. Make sure to convert Japanese-style punctuation into Western-style (【 】-> ( ) ; 、-> , ; 。-> . ; 『 』-> " " ; 「 」-> ' ' (though " " is allowed when it makes sense) ; -> —, -, preceding vowel or whatever makes most sense ; ・-> whatever makes sense ; don't forget to convert all spaces!)
  5. All lines in translation are to begin with a capital letter. With transcription you should try and keep all text lowercase, unless an explicit full stop is encountered.
  6. When a clear loanword is encountered, it is possible to translate it right within the romaji text. However, you should use some indication to make sure the reader understands that it is no longer romaji. I use italics for that. With that, "スーパーてんこ" would be "super tenko" (with "super" in italics). Make sure not to be overly zealous about it; some loanwords are so highly integrated into Japanese that it doesn't make much sense to translate them.
  7. There are certain cases where it isn't clear whether rendaku should be used, and the singer's voice isn't exactly helping the matter. Don't lose sleep over it; most likely it doesn't matter whether you use rendaku or not. "H" versus "b" cases are important, but "k"/"g" and "t"/"d" aren't (though if you can confirm the pronunciation, it would be nice).
  8. All particles in text should be separated by spaces. It is, however, debatable whether certain words are 1 or 2 particles. I consider のは to be 2 particles ("no wa"), but the では case is debatable (I do tend to consider it 2 particles and romanize it as "de wa", but you can write "dewa" if you want). へと is 2 particles ("e to"). All clause-connecting particles are 1 particle (のに is "noni", ので is "node").
  9. Keep in mind that can and will be sometimes shortened to . If it is shortened, same rules apply.
  10. Word separation can be a very tricky topic sometimes. There are a few general guidelines, but you'll have to think for yourself.
    1. The main idea is that spaces are a powerful tool to control rhythm and speed. While shorter words will be read faster, longer words are more "meaningful". Sometimes it is worth taking note of the way the singer phrases the line.
    2. Various verb forms based on te-form are usually written as 1 word (this includes -teru, -teiru, -teiku, -teyuku, -teoku, -tearu, -teshimau ...). Note that it is common for printed lyrics to use kanji in the te-form suffix. Don't be confused: it's still there only to modify the verb.
    3. On a related note, -temo and -demo are always hard cases. It is absolutely necessary to determine the basic meaning in order to have optimal romanization. -temo and -demo should be attached to a verb when they are used to mean "even though ...". When "mo" is related to the following words, it is a standalone particle and should be separated. (I originally gave "-te mo ii", "it's fine if ...", as an example of this. That was a horrible mistake on my part: there "mo" belongs to the verb, not the following word, so it should be "-temo ii". Sorry for the confusion.) Finally, "demo" meaning "but" should be a separate word altogether.
    4. Verb chaining through te-form usually results in multiple words. However, note that it is very common to attach -ageru/-agaru or -kureru using te-form. These words became verb forms and you usually want to keep them as one word (みせてあげる is highly likely to be "miseteageru"). Again, use your own judgment here. Other auxiliary verbs that follow this pattern would be -yaru, -morau, -yagaru.
    5. Verb chaining/forming through stem-form is a very tricky topic. The best thing to do here is to know Japanese. Most common verbs formed through stem-chaining should remain as 1 word (like 辿り着く, "tadoritsuku"; usually, if you attach a verb with the reading -tsuku, -au or -dasu, the result will most likely be romanized as 1 word). However, it is very common among lyricists to create new words with this method. In such cases, it would most likely be better to split the word into 2. Also make sure you ARE working with stem-form chaining (meaning that the stem-form is actually attached to a verb and not a particle or something).
    6. Verb forming through adverb-form chaining isn't very common and almost always results in 2 words.
    7. Generally, verb forming through attaching "-suru"/"-naru" to a noun produces 2 words in romaji. The expression "ni suru"/"ni naru" is not an exception; "ni" should be separated like a usual particle.
    8. While な can sometimes be seen within subclauses as a standalone "particle" (and it would need to be separated, see そうなのか, "sou na no ka"), remember that it can also appear at the end of na-adjectives. In this case, "na" should be attached to the corresponding adjective. While it seems obvious, these cases are a bit harder to spot in actual song lyrics. (It's also a very common mistake people make when transcribing songs.)
    9. ように/ような should be romanized as "you ni" and "you na". This comes from the fact that "you" is a separate kanji. I originally wrote that "ni"/"na" are both particles; that was a mistake on my part: "ni" is certainly a particle, while "na" is the ending of the resulting na-adjective. This is a very common expression, so it is sometimes easier to assume it's just a single word. As I have just mentioned, "youna" is sort of one word, and you can write it as a single word if you like. I still prefer to keep "na" separate, though.
    10. Most set expressions should be written uniformly. This usually means they are to be written as 1 word. For instance, I prefer to write 誰も and 誰でも as one word ("daremo" and "daredemo").
    11. ない attached to something via adverb-form conjugation or another "direct" attachment method remains in the word after romanization (広くない would become "hirokunai"). This, however, is a debatable case, because sometimes there's a lot of meaning put into such a "-nai", so it should be kept as a separate word. As for じゃない, I always prefer to keep "ja" separate from the preceding word (but since "-nai" is attached to "ja", I don't separate them). Therefore, そうじゃない would become "sou janai".
  11. Line count should follow the kanji text. Sometimes a line is separated into blocks with spaces between them. It is most common to leave these cases as 1 line and make sure that a space in the kanji text has a corresponding space in romaji. However, sometimes it is obvious that having multiple blocks per line was a result of a lyricist not having enough vertical space for his lyrics (for example, you can see the same lines earlier without being cramped). In such cases, you can manually split a multiple-block line into several smaller ones (but make sure to do it both in kanji and romaji).
  12. Alternative readings (cases when kanji aren't sung the same way as they are usually pronounced) aren't indicated in romaji, but should be indicated in kanji lyrics. Use the alt-ja template to specify furigana for the word with the alternative reading and to specify the word that was actually sung. (As a sidenote, 身体 sung as "karada" is NOT an alternative reading case.) Refer to my article on alternative readings for more information.
    1. As a special case, you may sometimes find ~ていく sung as ~てゆく. I am convinced that this is because "yuku" sounds more awesome in singing, and lyricists often write "iku" automatically, even if they know it's going to be sung as "yuku" anyway.
  13. You can also use furigana to indicate obscure, unusual (but not thought-up) readings of some kanji.
  14. Various lyricist's notes within the kanji text (some of them specify the alternative readings, for instance) should make it into the kanji lyrics (in case of alternative readings, you should include them as furigana, obviously), but should not make it into romaji. Romaji is for song text only (Innocent Key used to put smileys into "Touhou School", which were omitted from romaji, but included in kanji lyrics).
  15. When there's a dialogue taking place in a song, it is highly recommended to specify names of characters (use bold to avoid confusion with song lyrics).
  16. If you find a typo in printed lyrics, leave it in kanji text, but fix it in romaji. Make a translation note if you want. The main point is - the kanji lyrics should be exactly like the printed lyrics and include all errors uncorrected, if they are present.
  17. And that's about it. Remember that exceptions are possible for every guideline here and that it is you who is romanizing the song. Make sure to develop a feel for the song and don't forget to listen to it to check if your romaji text works fine.
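
To make the mechanical parts of guidelines 1.1 to 1.6 and 4 more concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions, not a complete romanizer. The function names (`romanize`, `convert_punctuation`) and the small kana table are hypothetical. It handles plain kana only (no kanji, no extended katakana combinations such as フィ), it cannot tell the particles は/へ from ordinary は/へ, and it makes no attempt at word separation. The 、 and 。 entries in the punctuation table and the handling of the full-width space are my reading of guideline 4.

```python
# Minimal sketch of "Lyrics-style Hepburn" for plain kana (hypothetical helper names).
VOWELS = "aiueo"
ROWS = {
    "": "あいうえお", "k": "かきくけこ", "g": "がぎぐげご",
    "s": "さしすせそ", "z": "ざじずぜぞ", "t": "たちつてと",
    "d": "だぢづでど", "n": "なにぬねの", "h": "はひふへほ",
    "b": "ばびぶべぼ", "p": "ぱぴぷぺぽ", "m": "まみむめも",
    "r": "らりるれろ",
}
BASE = {}
for consonant, kana_row in ROWS.items():
    for kana, vowel in zip(kana_row, VOWELS):
        BASE[kana] = consonant + vowel
# Revised Hepburn irregulars (1.1), plus obsolete kana with "w-" and を as "wo" (2).
BASE.update({
    "し": "shi", "ち": "chi", "つ": "tsu", "ふ": "fu",
    "じ": "ji", "ぢ": "ji", "づ": "zu",
    "や": "ya", "ゆ": "yu", "よ": "yo",
    "わ": "wa", "ゐ": "wi", "ゑ": "we", "を": "wo",
})
YOON = {"ゃ": "a", "ゅ": "u", "ょ": "o"}  # small kana that do affect pronunciation


def to_hiragana(text):
    """Shift katakana onto hiragana; the long vowel mark ー is left alone."""
    return "".join(chr(ord(c) - 0x60) if "ァ" <= c <= "ヶ" else c for c in text)


def romanize(kana):
    text = to_hiragana(kana)
    result = ""
    double_next = False  # set by っ, consumed by the next syllable
    i = 0
    while i < len(text):
        ch = text[i]
        nxt = text[i + 1] if i + 1 < len(text) else ""
        if ch == "っ":
            double_next = True
        elif ch == "ー":
            # 1.5: the long vowel mark repeats the preceding vowel, no macrons (1.3).
            if result and result[-1] in VOWELS:
                result += result[-1]
        elif ch == "ん":
            # 1.4 and 1.6: always "n", with an apostrophe before a vowel or "y".
            following = BASE.get(nxt, "")
            result += "n'" if following[:1] in set("aiueoy") else "n"
        elif ch in BASE:
            syllable = BASE[ch]
            if nxt in YOON and len(syllable) > 1 and syllable.endswith("i"):
                stem = syllable[:-1]
                glide = "" if stem in ("sh", "ch", "j") else "y"
                syllable = stem + glide + YOON[nxt]
                i += 1  # the small kana has been consumed
            if double_next and syllable[0] not in VOWELS:
                # 1.2: doubling the first letter turns "chi" into "cchi", not "tchi".
                syllable = syllable[0] + syllable
            double_next = False
            result += syllable
        else:
            result += ch  # anything unknown is passed through unchanged
        i += 1
    return result


# Guideline 4: Japanese punctuation to Western punctuation (assumed subset).
PUNCTUATION = str.maketrans({
    "【": "(", "】": ")", "『": '"', "』": '"', "「": "'", "」": "'",
    "、": ",", "。": ".", "\u3000": " ",  # \u3000 is the full-width space
})


def convert_punctuation(text):
    return text.translate(PUNCTUATION)
```

With this sketch, `romanize("まっちゃ")` gives "maccha", `romanize("ルール")` gives "ruuru", `romanize("いんよう")` gives "in'you" and `romanize("ぐんま")` gives "gunma". Everything that needs judgment (particles, word separation, loanwords, optional small kana) is deliberately left to the person doing the romanization.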