The data are in the form of a CSV file, elp-pronounced-sampa.txt.

The file is in standard ASCII. Each record (tuple) is delimited by
carriage-return line-feed (Microsoft style). Fields are separated by
commas; a comma is never used within field content. Some fields contain "
as an ordinary (non-quoting) character. Any field may consist of the
single character ? meaning "unknown". There are 79,673 lines, which is
more than older versions of Excel can load (they are limited to 65,536
rows).

The first line gives the field names. The fields are:

  Spell   - Spelling, mixed case (key)
  Ref     - Tells which authority was used for the word
              Uni = Unisyn
              CMU = Carnegie-Mellon
              Moby
              *   Derived by computer and not checked
              ?   No authority
  POS     - Part of Speech
  Infl    - Inflectional Category
  Pron    - Pronunciation
  NSyll   - Number of Syllables
  NPhon   - Number of Phonemes
  NMorph  - Number of Morphemes
  MorphSp - Morpheme Parse (letters)
  MorphPr - Morpheme Parse (phonemes)

Suggested names and help files to use in db interface:

+ Phonological Characteristics

  * Pronunciation
    HELP: A single representative pronunciation of the word. Dots mark
    possible syllable boundaries. Primary stress is marked by " before
    the stressed vowel, secondary stress by %.
    E.g.: d%i.k"Or.4@.k%e4.@d for "decorticated".
    Pronunciation is based on General American standard, and uses the
    following codes based on SAMPA
    (http://www.phon.ucl.ac.uk/home/sampa):
      a At; A cAr, spA; aI Ice; aU OUt; @ Above; @` buttER; b Boy;
      d Dog; dZ baDGe; D THis; e Ape; E Ebb; f Fig; g Go; h Hip;
      i EAt; I If; j Yes; k Kite; l Lip; l= bottLe; m= deisM;
      n= buttoN; N siNG; o OAt; O AUto; OI OYster; p Pig; r Read;
      s Sew; S SHow; t Toy; tS caTCH; T THin; u rUde; U pUt; v Van;
      V Up; w Wind; z Zoo; Z viSion; 3` bIRd; 4 beTTer

  * Number of Phonemes
    HELP: Number of phonemes in the main pronunciation. The diphthongs
    /aI/, /aU/, /OI/, and the affricates /tS/ and /dZ/, each count as
    single phonemes.

  * Number of Syllables
    HELP: Number of syllables in the main pronunciation.
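The counting rules above (dots separate syllables, stress marks are not phonemes, two-character codes such as aI or tS count once) can be illustrated with a minimal Python sketch. The function names tokenize and counts are hypothetical and not part of the data set. Two assumptions are made: two-character codes are matched greedily within a syllable, and plain m and n are accepted as phonemes even though the code table lists only the syllabic forms m= and n=.

```python
# Two-character codes from the table; they must be tried before single characters.
TWO_CHAR = {"aI", "aU", "OI", "dZ", "tS", "@`", "3`", "l=", "m=", "n="}
# Single-character codes from the table, plus plain m and n (an assumption:
# the table lists only the syllabic forms m= and n=).
ONE_CHAR = set("aA@bdDeEfghiIjklNoOprsStTuUvVwzZ4") | {"m", "n"}
# Primary (") and secondary (%) stress marks are not phonemes.
STRESS = {'"', "%"}


def tokenize(pron):
    """Split a Pron string into syllables, each a list of phoneme codes."""
    syllables = []
    for chunk in pron.split("."):  # dots mark syllable boundaries
        phones = []
        i = 0
        while i < len(chunk):
            if chunk[i] in STRESS:
                i += 1  # skip the stress mark
            elif chunk[i:i + 2] in TWO_CHAR:
                phones.append(chunk[i:i + 2])  # diphthong, affricate, etc.
                i += 2
            elif chunk[i] in ONE_CHAR:
                phones.append(chunk[i])
                i += 1
            else:
                raise ValueError(f"unknown code {chunk[i]!r} in {pron!r}")
        syllables.append(phones)
    return syllables


def counts(pron):
    """Return (syllables, phonemes), or None for the unknown marker '?'."""
    if pron == "?":
        return None
    syllables = tokenize(pron)
    return len(syllables), sum(len(s) for s in syllables)


print(counts('d%i.k"Or.4@.k%e4.@d'))  # (5, 12) for "decorticated"
```

Splitting on dots before matching keeps a sequence such as d.Z across a syllable boundary from being read as the affricate dZ. Whether the NSyll and NPhon fields were computed exactly this way is not stated here, so the sketch is best used as a consistency check rather than as a definition.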

+ Morphological Characteristics

  * Morpheme Parse (letters)
    HELP: The word respelled with morpheme markers indicating the
    composition of the word. Markers include:
      {free bases}
      >suffixes>
      --  separates other bound morphemes
    Free bases are spelled as they are in the independent word,
    e.g.: {abhor}>ent> for "abhorrent".

  * Morpheme Parse (phonemes)
    HELP: A single representative pronunciation of the word. Morpheme
    markers enclose:
      {free bases}
      {free bases with different pronunciation $}
      >suffixes>
      --  separates other bound morphemes
    Dots mark possible syllable boundaries. Primary stress is marked by "
    before the stressed vowel, secondary stress by %.
    E.g.: @d> for "decorticated".
    Pronunciation is based on General American standard, and uses the
    following codes based on SAMPA
    (http://www.phon.ucl.ac.uk/home/sampa):
      a At; A cAr, spA; aI Ice; aU OUt; @ Above; @` buttER; b Boy;
      d Dog; dZ baDGe; D THis; e Ape; E Ebb; f Fig; g Go; h Hip;
      i EAt; I If; j Yes; k Kite; l Lip; l= bottLe; m= deisM;
      n= buttoN; N siNG; o OAt; O AUto; OI OYster; p Pig; r Read;
      s Sew; S SHow; t Toy; tS caTCH; T THin; u rUde; U pUt; v Van;
      V Up; w Wind; z Zoo; Z viSion; 3` bIRd; 4 beTTer

  * Number of morphemes
    SEARCHABLE: numeric, upper and lower bounds

+ Part of speech
  HELP: The part of speech of the word.
    JJ     adjective ("beautiful")
    NN     noun ("beauty")
    RB     adverb ("beautifully")
    VB     verb ("beautify")
    encl   enclitic group ("beauty's")
    minor  all other ("the", "in", "what", "uh")
    ?      unknown
    |      separates alternatives: "can" VB|NN

+ Inflectional category
  HELP: Inflected form of the entry.
    plur   plural of noun ("beauties")
    comp   comparative ("better")
    super  superlative ("best")
    3sg    third person singular present ("beautifies")
    past   preterite or past participle ("beautified")
    ger    gerund ("beautifying")
    -      base form, or word does not inflect ("beauty", "the")
    ?      unknown
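
The file layout described at the top of this document (comma-separated fields, CR-LF records, " as an ordinary character, ? for unknown) and the | separator for part-of-speech alternatives can be handled along the following lines. This is a minimal Python sketch, not a reference loader. The function names read_records and pos_alternatives are hypothetical, and the sketch assumes that the header line contains exactly the field names listed above.

```python
import csv


def read_records(path):
    """Yield one dict per data row, keyed by the header names; '?' becomes None."""
    # newline="" leaves the CR-LF record delimiter for the csv module to handle.
    with open(path, encoding="ascii", newline="") as handle:
        # QUOTE_NONE: the double quote is ordinary data (it marks primary
        # stress in Pron), so it must not be treated as a quoting character.
        reader = csv.DictReader(handle, quoting=csv.QUOTE_NONE)
        for row in reader:
            yield {name: (None if value == "?" else value)
                   for name, value in row.items()}


def pos_alternatives(pos):
    """Split a POS value such as 'VB|NN' into its alternatives."""
    return [] if pos is None else pos.split("|")
```

A typical use would be to iterate over read_records("elp-pronounced-sampa.txt") and call pos_alternatives(row["POS"]) on each row. Numeric fields such as NSyll are left as strings here, because any of them may be unknown.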