Monday 25 October 2010

Skrifenn Ugens - Kernewek kewsys gans jynn-amontya! - Cornish spoken by computer

I recently managed to make my computer speak Cornish.

How did I manage this?

It turns out that espeak has a Welsh voice.

First it is necessary to process the Cornish text to replace "dh" with "dd", "f" with "ff" etc. so that it conforms more closely to the Welsh orthography.

Then simply feed the processed text to espeak with the Welsh voice.

Here's the Python script to convert to Welsh orthography:

kernewek_to_welshorthography.py

import sys

# Takes the first argument as the input text file, the second as the output file.
inputfile = sys.argv[1]
outputfile = sys.argv[2]

# Read the Cornish text as a list of lines.
with open(inputfile) as f:
    inputtext = f.readlines()

def towelsh(inputtext):
    # Respell each line so that it conforms more closely to Welsh orthography.
    outputtext = ""
    for w in inputtext:
        w = w.lower()
        w = w.replace("dh", "dd")
        w = w.replace("f", "ff")
        w = w.replace("y", "i")
        w = w.replace("ll", "l")
        # "ch" must be replaced before "gh" is turned into "ch".
        w = w.replace("ch", "tj")
        w = w.replace("gh", "ch")
        outputtext += w + "\n"
    return outputtext

outputtext = towelsh(inputtext)
# Close up stray spaces around punctuation.
outputtext = outputtext.replace(" .", ".")
outputtext = outputtext.replace(" - ", "-")
outputtext = outputtext.replace(" ' ", "'")

with open(outputfile, "w") as out:
    out.write(outputtext)
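
To see what the substitutions do to individual words, here is a minimal sketch that applies the same replacements, in the same order, to a few sample words. The names RULES and to_welsh_spelling are mine, chosen for illustration, and the sample words are just examples I picked. It also shows why the order matters: "ch" has to become "tj" before "gh" becomes "ch", otherwise the freshly made "ch" would be respelled a second time.

```python
# The script's substitutions, in the script's order (illustrative names).
RULES = [
    ("dh", "dd"),
    ("f", "ff"),
    ("y", "i"),
    ("ll", "l"),
    ("ch", "tj"),
    ("gh", "ch"),
]


def to_welsh_spelling(text, rules=RULES):
    """Lower-case the text, then apply each (old, new) replacement in turn."""
    text = text.lower()
    for old, new in rules:
        text = text.replace(old, new)
    return text


if __name__ == "__main__":
    for phrase in ["Dydh da", "flogh", "chi"]:
        print(phrase, "->", to_welsh_spelling(phrase))

    # Same rules with the last two swapped: "gh" -> "ch" now runs first,
    # so the new "ch" is then turned into "tj" as well.
    swapped = RULES[:4] + [RULES[5], RULES[4]]
    print("flogh (swapped order) ->", to_welsh_spelling("flogh", swapped))
```

With the script's order, "Dydh da" comes out as "didd da", "flogh" as "ffloch" and "chi" as "tji"; with the last two rules swapped, "flogh" becomes "fflotj" instead.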

And here is a shell script to launch espeak on Linux (espeak is also available for Windows).
The input text file (original Cornish) and the output sound file are passed to the script as command-line arguments:


#!/bin/bash
# $1 = Cornish text file, $2 = output sound file
python kernewek_to_welshorthography.py "$1" kows_workingfile.txt
espeak -vcy -w "$2" -f kows_workingfile.txt
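
On a system without bash, the espeak step of the shell script above could be run from Python instead. This is only a rough sketch, not something I have tested on Windows: it assumes espeak is installed and on the PATH, and the function names build_espeak_command and speak_to_wav are made up for the example. The espeak options are the same ones the shell script uses.

```python
import subprocess


def build_espeak_command(text_file, wav_file, voice="cy"):
    """Return the espeak command line as a list of arguments."""
    # -v selects the voice, -w names the output sound file,
    # -f names the text file to read (same options as the shell script).
    return ["espeak", "-v" + voice, "-w", wav_file, "-f", text_file]


def speak_to_wav(text_file, wav_file):
    """Run espeak on an already-converted text file.

    Assumes espeak is on the PATH; raises CalledProcessError if it fails.
    """
    subprocess.check_call(build_espeak_command(text_file, wav_file))
```

Calling speak_to_wav("kows_workingfile.txt", "output.wav") would then do the same job as the last line of the shell script, where output.wav is a placeholder file name.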

I can't say the output is perfect, but it is generally passable.

Bro Goth Agan Tasow (mp3)
