Phoneme Tool/data format

From Valve Developer Community
Revision as of 09:57, 14 September 2011 by Infectiousfight (talk | contribs) (I added the CLOSECAPTION data based on my own analysis of SDK example .wav files.)

The phoneme editor embeds the following ASCII text block at the end of a .wav file:

PLAINTEXT
{
example sentence
}
WORDS
{
WORD example <start time> <end time>
{
<phoneme id> <phoneme name> <start time> <end time> 1
}
WORD sentence <as above>
{
<as above>
}
}
EMPHASIS
{
<time> <normalised value>
}
CLOSECAPTION
{
english
{
PHRASE unicode <size of text in bytes; the text is not NUL-terminated> <text, apparently encoded as either UCS-2 or UTF-16> <start time> <end time>
}
}
OPTIONS
{
voice_duck <1/0>
}
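
As a rough illustration of the layout above, the following Python sketch pulls the text block out of a .wav file's bytes and reads the WORDS section. The names extract_block, parse_words, Word and Phoneme are made up for this example. Locating the block by searching for the last "PLAINTEXT" marker, and reading the times as floating-point numbers, are assumptions rather than documented behaviour.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Phoneme:
    code: int      # phoneme id
    name: str      # phoneme name, e.g. "hh"
    start: float
    end: float


@dataclass
class Word:
    text: str
    start: float
    end: float
    phonemes: List[Phoneme] = field(default_factory=list)


def extract_block(wav_bytes: bytes) -> str:
    """Return everything from the last PLAINTEXT marker to the end of the data."""
    pos = wav_bytes.rfind(b"PLAINTEXT")  # assumption: the marker identifies the block
    if pos < 0:
        raise ValueError("no PLAINTEXT section found")
    # latin-1 maps every byte to one character, so the caption bytes
    # in CLOSECAPTION cannot cause a decoding error here.
    return wav_bytes[pos:].decode("latin-1")


def parse_words(block: str) -> List[Word]:
    """Parse only the WORDS section; the other sections are ignored."""
    lines = [ln.strip() for ln in block.split("\n") if ln.strip()]
    i = lines.index("WORDS") + 2          # skip "WORDS" and its opening "{"
    words = []
    while lines[i] != "}":                # "}" here closes the WORDS section
        _, text, start, end = lines[i].split()
        word = Word(text, float(start), float(end))
        i += 2                            # skip the WORD line and its "{"
        while lines[i] != "}":
            # The fifth field (the trailing "1") is read but not interpreted.
            code, name, p_start, p_end = lines[i].split()[:4]
            word.phonemes.append(
                Phoneme(int(code), name, float(p_start), float(p_end)))
            i += 1
        i += 1                            # skip the "}" that closes this word
        words.append(word)
    return words
```

Calling parse_words(extract_block(data)) on the contents of a file returns one Word per WORD entry. The sketch expects each brace on its own line, as in the listing, and does not attempt to decode the CLOSECAPTION text.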

All sections are required, even if they are empty (as EMPHASIS often is).
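
The requirement that every section be present can be sketched as a small writer that always emits all five headers. This is only an illustration: build_block is a hypothetical helper, the three-decimal time format is a guess, and writing an empty section as a bare pair of braces is an assumption about what "empty" looks like.

```python
def build_block(sentence, words, voice_duck=False):
    """Build the text block as a string.

    words is a list of (text, start, end, phonemes) tuples, where phonemes
    is a list of (phoneme_id, phoneme_name, start, end) tuples.
    """
    out = ["PLAINTEXT", "{", sentence, "}"]

    out += ["WORDS", "{"]
    for text, start, end, phonemes in words:
        out += [f"WORD {text} {start:.3f} {end:.3f}", "{"]
        for pid, pname, p_start, p_end in phonemes:
            # The trailing "1" is copied from the format listing;
            # its meaning is not known.
            out.append(f"{pid} {pname} {p_start:.3f} {p_end:.3f} 1")
        out.append("}")
    out.append("}")

    # EMPHASIS and CLOSECAPTION are left empty but are still written,
    # because every section is required.
    out += ["EMPHASIS", "{", "}"]
    out += ["CLOSECAPTION", "{", "}"]

    out += ["OPTIONS", "{", f"voice_duck {int(voice_duck)}", "}"]
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    print(build_block("hi", [("hi", 0.0, 0.5, [(104, "hh", 0.0, 0.2)])]))
```

The sketch stops at producing the text; how the block is attached to the .wav file is not covered here.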

Todo: Determine the purpose of the final "1" value on each phoneme line.
Todo: For the closed-captioning section: what other language identifiers are valid? Are encodings other than "unicode" valid? Can a language block contain multiple "phrases"? Is this method of closed captioning deprecated, does it supersede the method described here, or do the two exist alongside each other?

Phoneme IDs

  • 95 <sil>
  • 97 aa2
  • 98 b
  • 100 d
  • 101 ey
  • 102 f
  • 103 g
  • 104 hh
  • 105 iy
  • 106 y
  • 107 c
  • 108 l
  • 109 m
  • 110 n
  • 111 ow
  • 112 p
  • 114 r2
  • 115 s
  • 116 t
  • 117 uw
  • 118 v
  • 119 w
  • 122 z
  • 230 ae
  • 240 dh
  • 331 nx
  • 593 aa
  • 596 ao
  • 601 ax
  • 602 er
  • 603 eh
  • 604 ax2
  • 605 er2
  • 609 g2
  • 614 hh2
  • 616 ih2
  • 618 ih
  • 619 l2
  • 633 r
  • 635 r3
  • 638 d2
  • 643 sh
  • 650 uh
  • 652 ah
  • 658 zh
  • 676 jh
  • 679 ch
  • 952 th
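
For scripts that read or write phoneme lines, the list above can be kept as a lookup table. The sketch below copies the IDs and names exactly as listed; the dictionary names and the is_known_phoneme helper are invented for this example.

```python
# Phoneme id -> phoneme name, copied from the list above.
PHONEME_NAMES = {
    95: "<sil>", 97: "aa2", 98: "b", 100: "d", 101: "ey", 102: "f",
    103: "g", 104: "hh", 105: "iy", 106: "y", 107: "c", 108: "l",
    109: "m", 110: "n", 111: "ow", 112: "p", 114: "r2", 115: "s",
    116: "t", 117: "uw", 118: "v", 119: "w", 122: "z", 230: "ae",
    240: "dh", 331: "nx", 593: "aa", 596: "ao", 601: "ax", 602: "er",
    603: "eh", 604: "ax2", 605: "er2", 609: "g2", 614: "hh2", 616: "ih2",
    618: "ih", 619: "l2", 633: "r", 635: "r3", 638: "d2", 643: "sh",
    650: "uh", 652: "ah", 658: "zh", 676: "jh", 679: "ch", 952: "th",
}

# Reverse mapping: phoneme name -> phoneme id.
PHONEME_IDS = {name: code for code, name in PHONEME_NAMES.items()}


def is_known_phoneme(code: int, name: str) -> bool:
    """True if the (id, name) pair matches an entry in the table."""
    return PHONEME_NAMES.get(code) == name
```

A parser could use is_known_phoneme to flag lines whose ID and name disagree, and a writer could use PHONEME_IDS to fill in the ID from the name.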