Phoneme Tool/data format

From Valve Developer Community

The phoneme editor embeds the following ASCII text block at the end of a .wav file:

PLAINTEXT
{
example sentence
}
WORDS
{
WORD example <start time> <end time>
{
<phoneme id> <phoneme name> <start time> <end time> 1
}
WORD sentence <as above>
{
<as above>
}
}
EMPHASIS
{
<time> <normalised value>
}
CLOSECAPTION
{
english
{
PHRASE unicode <size of text in *bytes*. Text has no nul-termination.> <Text, formatted in what seems to be either UCS-2 or UTF-16> <start time> <end time>
}
}
OPTIONS
{
voice_duck <1/0>
}
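The block above is a series of named sections, each wrapped in braces, with WORD entries nesting one level deeper. A minimal sketch of reading it in Python follows. The function names `parse_sections` and `parse_words` are hypothetical, the sample timings are invented for illustration, and the parser assumes words contain no whitespace. Because it works line by line on decoded text, it is not suitable for a CLOSECAPTION section that carries raw UTF-16 bytes.

```python
def parse_sections(text):
    """Split the text block into {section name: lines inside its outer braces}."""
    sections = {}
    pending = None   # last non-brace line seen at depth 0 (the section name)
    current = None   # list collecting the lines of the open section
    depth = 0
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if depth == 0:
            if line == "{":
                if pending is None:
                    raise ValueError("opening brace without a section name")
                current = sections.setdefault(pending, [])
                depth = 1
            else:
                pending = line
            continue
        if line == "{":
            depth += 1
        elif line == "}":
            depth -= 1
            if depth == 0:          # outer brace closed: section finished
                pending = None
                current = None
                continue
        current.append(line)        # nested braces are kept as ordinary lines
    if depth != 0:
        raise ValueError("unbalanced braces")
    return sections


def parse_words(lines):
    """Turn the lines of the WORDS section into a list of word records."""
    words = []
    for line in lines:
        fields = line.split()
        if fields[0] == "WORD":
            words.append({"word": fields[1], "start": float(fields[2]),
                          "end": float(fields[3]), "phonemes": []})
        elif line not in ("{", "}"):
            # <phoneme id> <phoneme name> <start time> <end time> <final value>
            words[-1]["phonemes"].append({
                "id": int(fields[0]), "name": fields[1],
                "start": float(fields[2]), "end": float(fields[3]),
                "flag": fields[4],  # purpose undocumented, kept as raw text
            })
    return words


# Hypothetical sample; the timings are made up for this example.
SAMPLE = """PLAINTEXT
{
he
}
WORDS
{
WORD he 0.100 0.400
{
104 hh 0.100 0.200 1
105 iy 0.200 0.400 1
}
}
EMPHASIS
{
}
CLOSECAPTION
{
}
OPTIONS
{
voice_duck 1
}
"""

sections = parse_sections(SAMPLE)
words = parse_words(sections["WORDS"])
print(words[0]["word"], [p["name"] for p in words[0]["phonemes"]])
```

Keeping nested braces as plain lines lets the same section splitter serve every section, with section-specific helpers such as `parse_words` interpreting the contents afterwards.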

All sections are required, even if they are empty (as EMPHASIS often is).

To do: Document the purpose of the final "1" value on each phoneme line.

To do: For the CLOSECAPTION section: what other language identifiers are valid? Are encodings other than "unicode" valid? Can a language block contain multiple PHRASE entries? Is this method of closed-captioning deprecated, does it supersede the method described here, or do the two exist alongside each other?
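Because the PHRASE text is stored as raw bytes with an explicit byte count and no nul terminator, a reader has to slice by that count instead of splitting on whitespace: a UTF-16 space contains the same 0x20 byte as an ASCII space. The sketch below illustrates this under several assumptions not confirmed by this page: fields are separated by single ASCII spaces, the text starts immediately after the space following the size, and the encoding is little-endian UTF-16. The name `parse_phrase` is hypothetical.

```python
def parse_phrase(line: bytes, encoding: str = "utf-16-le") -> dict:
    """Decode one PHRASE line given as raw bytes (layout assumed, see above)."""
    # Split off only the first three fields; the text itself may contain 0x20.
    keyword, enc_name, size_field, rest = line.split(b" ", 3)
    if keyword != b"PHRASE":
        raise ValueError("not a PHRASE line")
    size = int(size_field.decode("ascii"))   # size of the text in bytes
    text = rest[:size].decode(encoding)      # no nul terminator to strip
    start, end = rest[size:].decode("ascii").split()
    return {
        "encoding": enc_name.decode("ascii"),
        "text": text,
        "start": float(start),
        "end": float(end),
    }


# Hypothetical example line built in memory; the timings are made up.
caption = "Hi there".encode("utf-16-le")
line = b"PHRASE unicode %d " % len(caption) + caption + b" 0.000 1.250"
print(parse_phrase(line))
```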

Phoneme IDs

  • 95 <sil>
  • 97 aa2
  • 98 b
  • 100 d
  • 101 ey
  • 102 f
  • 103 g
  • 104 hh
  • 105 iy
  • 106 y
  • 107 c
  • 108 l
  • 109 m
  • 110 n
  • 111 ow
  • 112 p
  • 114 r2
  • 115 s
  • 116 t
  • 117 uw
  • 118 v
  • 119 w
  • 122 z
  • 230 ae
  • 240 dh
  • 331 nx
  • 593 aa
  • 596 ao
  • 601 ax
  • 602 er
  • 603 eh
  • 604 ax2
  • 605 er2
  • 609 g2
  • 614 hh2
  • 616 ih2
  • 618 ih
  • 619 l2
  • 633 r
  • 635 r3
  • 638 d2
  • 643 sh
  • 650 uh
  • 652 ah
  • 658 zh
  • 676 jh
  • 679 ch
  • 952 th
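
The list above can be turned into a lookup table for checking that the ID and name on a phoneme line agree. This is a minimal sketch; `PHONEME_NAMES`, `PHONEME_IDS` and `check_phoneme` are hypothetical names, and the table contains only the pairs listed on this page.

```python
# ID/name pairs copied from the list above, as alternating tokens.
_PHONEME_TABLE = """
95 <sil>  97 aa2   98 b    100 d    101 ey   102 f    103 g    104 hh
105 iy    106 y    107 c   108 l    109 m    110 n    111 ow   112 p
114 r2    115 s    116 t   117 uw   118 v    119 w    122 z    230 ae
240 dh    331 nx   593 aa  596 ao   601 ax   602 er   603 eh   604 ax2
605 er2   609 g2   614 hh2 616 ih2  618 ih   619 l2   633 r    635 r3
638 d2    643 sh   650 uh  652 ah   658 zh   676 jh   679 ch   952 th
"""

_tokens = _PHONEME_TABLE.split()
PHONEME_NAMES = {int(i): n for i, n in zip(_tokens[0::2], _tokens[1::2])}
PHONEME_IDS = {name: pid for pid, name in PHONEME_NAMES.items()}


def check_phoneme(pid: int, name: str) -> bool:
    """Return True if the ID/name pair matches the table above."""
    return PHONEME_NAMES.get(pid) == name


print(check_phoneme(104, "hh"))   # pair taken from the list
print(check_phoneme(104, "iy"))   # mismatched pair
print(PHONEME_IDS["th"])
```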