Tuesday, 30 August 2016

Adding a copy to clipboard button to TaklowKernewek GUI tools

Click on the button labelled "Kopi dhe'n Klyppbordh" to copy the contents of the output box to the clipboard.
This uses Tkinter's clipboard_append() method. The button is available in all of the programs launched by TaklowKernewekLonchyer.pyw (except the text-to-speech program) and in the Welsh language mutation tool treigloGUI.py.
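As a rough illustration of the idea (not the actual TaklowKernewek code), a copy button can be wired up as below. The function names and the window layout are hypothetical; only clipboard_clear() and clipboard_append() are the Tkinter calls involved. Clearing first is my assumption, since clipboard_append() adds to whatever is already on the clipboard.

```python
def copy_output_to_clipboard(root, text):
    """Replace the clipboard contents with `text` using a Tk widget."""
    root.clipboard_clear()        # drop whatever was on the clipboard before
    root.clipboard_append(text)   # clipboard_append() adds to the current contents
    return text


def build_demo_window():
    """Build a small window with an output box and a copy button (hypothetical layout)."""
    import tkinter as tk  # the module is called Tkinter on Python 2

    root = tk.Tk()
    output = tk.Text(root, height=5, width=40)
    output.insert("1.0", "Dydh da")
    output.pack()
    button = tk.Button(
        root,
        text="Kopi dhe'n Klyppbordh",
        # "end-1c" leaves out the newline Tk always keeps at the end of a Text widget
        command=lambda: copy_output_to_clipboard(root, output.get("1.0", "end-1c")),
    )
    button.pack()
    return root
```

To try it, call build_demo_window().mainloop() on a machine with a display.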

sylrannaGUI, treuslytherennaGUI and kovtreylyansGUI also now have a "netbook" mode designed for smaller screens. Launch it by passing "--netbook" as a command line option, or by using the TaklowKernewekLonchyer_netbook.pyw launcher. This makes the output window smaller, so that it fits on the screen of my EeePC.
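A command line switch like this can be read with argparse. The sketch below is only an assumption about how such an option might be handled; the function names and the box sizes are made up for illustration and are not taken from the TaklowKernewek source.

```python
import argparse


def parse_options(argv=None):
    """Parse the command line; --netbook is a simple on/off flag."""
    parser = argparse.ArgumentParser(description="Hypothetical GUI launcher options")
    parser.add_argument("--netbook", action="store_true",
                        help="use a smaller output window for small screens")
    return parser.parse_args(argv)


def output_box_size(netbook):
    """Return (width, height) in characters; the numbers are illustrative only."""
    return (60, 8) if netbook else (80, 20)


options = parse_options(["--netbook"])
print(output_box_size(options.netbook))
```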

The corpus statistics module also quits differently now: when matplotlib windows were open, the "Kwitya" button did not always exit the program.
The button now calls sys.exit() instead of relying on Tkinter itself.

There is also a button labelled "Klerhe Tresennow" which closes any open matplotlib plots.
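The two button handlers could look roughly like this. The function names are hypothetical; plt.close("all") is the standard matplotlib call for closing every open figure, and sys.exit() raises SystemExit so the interpreter stops even if plot windows are still open.

```python
import sys


def clear_plots():
    """Handler sketch for a "Klerhe Tresennow" style button: close all figures."""
    import matplotlib.pyplot as plt  # imported lazily, only needed when plots exist
    plt.close("all")


def quit_program():
    """Handler sketch for a "Kwitya" style button: leave via sys.exit()."""
    sys.exit()
```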


Monday, 29 August 2016

Cornish Corpus analysis GUI

I have now written a GUI module to provide some basic analysis functionality on the corpus of Cornish.

I have included the traditional texts, using the versions from www.howlsedhes.co.uk plus the Solempnyta short story by Ben Bruch and part of the translation of Lord of the Rings by Jerry Jeffries.


Choose the text on the menu at the left, and select which function to use in the menu in the next panel. In this case "Rol Menowghder Ger" has been selected, which provides a list of the most frequent words. You can choose how many words to list, and set a minimum number of letters for the words listed.
The next option "Hirder Geryow" provides a cumulative frequency diagram for the lengths of words. This can be produced for all texts, or for one text at a time.
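Both of these functions boil down to counting. The sketch below shows one way to compute a frequency list with a minimum word length, and the cumulative distribution of word lengths that such a diagram would plot. The tokenizer and function names are my own assumptions, not the module's actual code.

```python
from collections import Counter
import re


def tokenize(text):
    """Lower-case the text and keep runs of letters, allowing inner ' and -."""
    return re.findall(r"[^\W\d_]+(?:['-][^\W\d_]+)*", text.lower())


def most_frequent_words(text, how_many=10, min_letters=1):
    """Return the `how_many` most common words with at least `min_letters` letters."""
    words = [w for w in tokenize(text) if len(w) >= min_letters]
    return Counter(words).most_common(how_many)


def cumulative_length_distribution(text):
    """Return (length, fraction of words with that length or shorter) pairs."""
    lengths = Counter(len(w) for w in tokenize(text))
    total = sum(lengths.values())
    running = 0
    result = []
    for length in sorted(lengths):
        running += lengths[length]
        result.append((length, running / total))
    return result
```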

The option "Menowghder Ger (tresenn barr)" lets you add words to a list using the lower entry bar in the middle panel. Pressing the "Dalleth" button then draws a grouped bar chart comparing the frequency of those words across the various texts, or within a single text.
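The data behind a grouped bar chart of this kind is just a table of counts, one row per text and one column per word. A minimal sketch, assuming the texts are held in a dictionary of plain strings (the text names and sample strings below are invented, not taken from the corpus):

```python
from collections import Counter
import re


def count_words_per_text(texts, words):
    """Map each text name to the counts of `words`, in the same order as `words`."""
    table = {}
    for name, text in texts.items():
        counts = Counter(re.findall(r"[^\W\d_]+", text.lower()))
        table[name] = [counts[word.lower()] for word in words]
    return table


# Invented sample data, only to show the shape of the result.
texts = {"text A": "onan ha dew ha dew", "text B": "dew ha tri"}
print(count_words_per_text(texts, ["onan", "dew", "tri"]))
```

Each row of the table would then become one series of bars, drawn side by side for every word.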

Output showing the frequency of the numerals from 1 to 8 in the texts. Occurrences of "eth" (8) may not all be the numeral, since "eth" is also a form of the verb "to go", which is likely to be the more common use. Similarly "dew" could be a mutated form of the word "tew" (fat).


Translation memory in Cornish - effect of using all bigrams and trigrams

Here is the effect of the two options in kovtreylyansGUI.py: matching all bigrams and trigrams between the input text and the bilingual corpus sentences, or only those containing at least one word that is not a stopword. The stopword list is the Python NLTK stopwords corpus (item 67 in NLTK data), a list of common words that are unlikely to carry much semantic content.
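The distinction between the two options can be sketched in a few lines. This is an illustration rather than the code in kovtreylyansGUI.py: the n-grams are built with plain zip(), and STOPWORDS is a tiny hand-picked stand-in for the NLTK stopword list.

```python
# Tiny stand-in for the NLTK English stopword list (an assumption for this sketch).
STOPWORDS = {"there", "is", "a", "on", "the", "in", "of"}


def ngrams(tokens, n):
    """Return all runs of n consecutive tokens as tuples."""
    return list(zip(*(tokens[i:] for i in range(n))))


def split_by_stopwords(grams, stopwords=STOPWORDS):
    """Separate n-grams with at least one non-stopword from stopword-only ones."""
    with_content, only_stopwords = [], []
    for gram in grams:
        if all(token in stopwords for token in gram):
            only_stopwords.append(gram)
        else:
            with_content.append(gram)
    return with_content, only_stopwords


tokens = ["there", "is", "a", "red", "sun", "on", "proxima", "b", "."]
content_bigrams, stopword_bigrams = split_by_stopwords(ngrams(tokens, 2))
print(stopword_bigrams)  # the bigrams made up only of stopwords
```

With the "all" option both lists are matched against the corpus; with the non-stopwords option only the first list is used.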

Input sentence is "There is a red sun on Proxima b"

Full output - the non-stopwords option


trigrams for input sentence are:
[('there', 'is', 'a'), ('is', 'a', 'red'), ('a', 'red', 'sun'), ('red', 'sun', 'on'), ('sun', 'on', 'proxima'), ('on', 'proxima', 'b'), ('proxima', 'b', '.')]

bigrams for input sentence are:
[('there', 'is'), ('is', 'a'), ('a', 'red'), ('red', 'sun'), ('sun', 'on'), ('on', 'proxima'), ('proxima', 'b'), ('b', '.')]

Listing N-grams with a minimum of 1 non-stopword each:
Common trigrams:

Common bigrams:
Eus pluvenn rudh genes, mar pleg?     --  Have you a red pen, please?        

(a red)

Ass yns teg, pennow an menydhyow y'n  --  How beautiful the tops of the      
howlsedhes, an howl rudh a-ughta.     --  mountains are in the sunset, the red
                                      --  sun above them.                    

(red sun)

Additional output using all option


Other N grams containing only stopwords:
Common trigrams:
Mes yma karr a-rag an chi.            --  But there is a car in front of the
                                      --  house.                            

(there is a)

Sur, yma lyver war an voes.           --  Certainly there is a book on the  
                                      --  table.                            

(there is a)

Yma avon vras ha hir yn Almayn.       --  There is a large, long river in   
                                      --  Germany.                          

(there is a)

Yma toll down yn kres an fordh.       --  There is a deep hole in the centre
                                      --  of the road.                      

(there is a)

Yma kador gesys y'n hel.              --  There is a chair left in the hall.

(there is a)

Yma fordh pur gul ryb an pras.        --  There is a very narrow road beside
                                      --  the field.                        

(there is a)

Usi! Hag yma lost hir ryb an hel      --  Yes! And there is a long queue    
ynwedh.                               --  beside the hall too.              

(there is a)

Yma unn chi a-dryv an sinema; chi Mr  --  There is a certain house behind the
Pollglas yw ev.                       --  cinema; it's Mr Pollglas's house. 

(there is a)

Kemmer an kowl ma, yma bollas ragos.  --  Take this soup, there is a bowlful
                                      --  for you.                          

(there is a)

Yma bownder verr ha kul ynter an      --  There is a short, narrow lane     
dhew bras vras.                       --  between the two big fields.       

(there is a)

Yma krys ow kregi war benn an gweli.  --  There is a shirt hanging on the end
                                      --  of the bed.                       

(there is a)

Yma ki owth hartha a-ves.             --  There is a dog barking outside.   

(there is a)

Yma nown dhe'n vebyon. An vamm a      --  The boys are hungry. Mother will  
vynn ri nebes boes dhedha. Mes eus    --  give them some food. But is there 
boes lowr y'n yeynell? Eus. Yma meur  --  enough food in the refrigerator?  
a vara hag amanenn gesys hwath        --  Yes! There is a lot of bread and  
ynwedh.                               --  butter still left as well.        

(there is a)

Yma tren skav dhe dhiw eur marnas     --  There is a fast train at          
teyr mynysenn warn ugens.             --  twenty-three minutes to two.      

(there is a)


Common bigrams:
Yma mebyl gesys ena y'n chi:          --  There is furniture left in the    
kador-vregh, kador, gweder ha         --  house: an arm-chair, chair, mirror
lestrier mes nyns eus moes ena.       --  and dresser but there isn't a table
                                      --  there.                            

(there is), (there is)

Hemm yw pluvenn.                  --  This is a pen.                      (is a)
Henn yw chi.                      --  That is a house.                    (is a)
An drehevyans na yw eglos.        --  That building is a church.          (is a)
Henn yw kenter.                   --  That is a nail.                     (is a)
An drehevyans a-rag an chi yw         --  The building in front of the house
karrji.                               --  is a garage.                      

(is a)

An dra na yw pluvenn.             --  That object is a pen.               (is a)
Honn yw Kernewes.                 --  That is a Cornish woman.            (is a)
Hemm yw aval hweg.                --  This is a sweet apple.              (is a)
An avon Tamer yw avon vras, down.     --  The River Tamar is a large, deep  
                                      --  river                             

(is a)

Hemm yw koes bras.                --  This is a big wood.                 (is a)
Hel an Dre yw drehevyans teg.         --  The town hall is a fine building. 

(is a)

An gour yw den da, dell dybav.        --  The husband is a good person, I   
                                      --  think.                            

(is a)

Gwydhelek yw yeth keltek          --  Irish is a Celtic language.         (is a)
Lyver berr yw lyver da, dell dybav.   --  A short book is a good book, I    
                                      --  think.                            

(is a)

An chi a-ji dhe'n koes yw chi         --  The house in the wood is a new    
nowydh.                               --  house.                            

(is a)

Gwreg an gour ma yw Kernewes dha.     --  This man's wife is a good         
                                      --  Cornishwoman.                     

(is a)

Kres an koes yw le kosel.             --  The middle of the wood is a quiet 
                                      --  place.                            

(is a)

Noy Mr Turner yw maw bras.            --  Mr Turner's nephew is a big boy.  

(is a)

Pow Frynk yw pow pur vras ha pur      --  France is a very big and beautiful
deg.                                  --  country.                          

(is a)

Onan yw brithel.                  --  One is a mackerel.                  (is a)
Hemm yw kerdh hir mes brav yw         --  This is a long walk but it's grand.

(is a)

Gour kloppek yw ev.               --  He is a lame man.                   (is a)
An Gresenn Gernewek yw le da ha dhe   --  The Cornish Centre is a good place
les yw hi rag tus Kernow.             --  and it is useful for the people of
                                      --  Cornwall.                         

(is a)

Yma dew dhen ha dew ugens y'n         --  There are forty-two people in the 
kuntelles. Hemm yw niver da rag       --  meeting. This is a good number for a
kuntelles a'n par ma.                 --  meeting of this kind.             

(is a)

Broder Androw yw pronter yn unn       --  Andrew's brother is a vicar in a  
eglos.                                --  certain church.                   

(is a)

Agan kesva yw onan dha.           --  Our association is a good one.      (is a)
Dyskador yw, dell glewav.         --  He is a teacher, I hear.            (is a)
Ammeth yw tra vras yn Kernow.         --  Agriculture is a big affair in    
                                      --  Cornwall.                         

(is a)

Pow pell yw Ejyp ha tir bras yw       --  Egypt is a distant country and it is
ynwedh.                               --  large also.                       

(is a)

Nag eus! Nyns eus karrji ena.         --  No! There isn't a garage there.   

(there is)

Eus, sur!                       --  There is, certainly!              (there is)
Nag eus!                        --  There isn't!                      (there is)
Yma nebonan a-ji dhe'n eglos na.      --  There is someone inside that church.

(there is)

Yma neppyth a-ragh an chi ma.         --  There is something in front of this
                                      --  house.                            

(there is)

Nyns eus kenter omma, dell hevel.     --  There is no nail here, it seems.  

(there is)

Eus! Yma an eglos ryb hel an dre      --  There is! The church is beside the
                                      --  town hall.                        

(there is)

Eus, dell hevel.                --  There is, it seems.               (there is)
Nag eus! Nyns eus tra omma.           --  There isn't. There's nothing here.

(there is)

Eus gwin gesys? Eus! Yma gwin y'n     --  Is there (any) wine left? There is!
gegin.                                --  There's wine in the kitchen.      

(there is)

Yma nebonan y'n gegin lemmyn.         --  There is someone in the kitchen now.

(there is)

Eus. Hi a wra glaw lemmyn.      --  There is. It's raining now.       (there is)
Piw yw Mr Lock ytho? Ottena! An gour  --  Who is Mr Lock then? Look! that man
ena yw Mr Lock, an gour hir na.       --  there is Mr. Lock, that tall man. 

(there is)

Nyns eus le gesys yn kres an dre      --  There isn't a place left in the town
lemmyn.                               --  centre now.                       

(there is)

Nyns eus nebonan gesys yn hel an      --  There is no-one left in the town  
dre.                                  --  hall.                             

(there is)

Nyns eus arghans lowr rag boes.       --  There is not enough money for food.

(there is)

Yma unn karr a-rag an chi.            --  There is one car in front of the  
                                      --  house.                            

(there is)

Yma unn garrek pur vras ena.          --  There is one very large rock there.

(there is)

Eus! Yma boes war blat y'n gegin      --  Yes! There is food on a plate in the
                                      --  kitchen.                          

(there is)

Nyns eus kyttrin dhe Druru kyns       --  There is no bus to Truro before four
peder eur.                            --  o'clock.                          

(there is)

Yma po korev po gwin gans an goen.    --  There is beer or wine with the    
                                      --  dinner.                           

(there is)

Yma spas lowr y'n le na lemmyn.       --  There is enough room in that place
                                      --  now.                              

(there is)

Y'n hel (yma onan).             --  In the hall (there is one).       (there is)
Eus! yma, dell hevel.           --  Yes! There is, it seems.          (there is)
An vamm re worras an kinyow war an    --  Mother has put the dinner on the  
voes lemmyn. Kynsa yma kowl onyon.    --  table now. First there is onion   
                                      --  soup.                             

(there is)

Eus nebes koffi gesys ragov? Nag      --  Is there a little coffee left for 
eus. Yma te hepken.                   --  me? No. There is tea only.        

(there is)

Re dhiwedhes os, ow howeth, ha nyns   --  You are too late, my friend, and  
eus hanafas a goffi ragos.            --  there is no cup of coffee for you.

(there is)

Nyns eus karr ow tos.           --  There is no car coming.           (there is)
Nyns eus nebonan ow kelwel.     --  There is no one calling.          (there is)
Eus!                            --  There is!                         (there is)
Nag eus!                        --  There is not!                     (there is)
Joy re skrifas dhodho mes nyns eus    --  Joy has written to him but there is
gorthyp hwath.                        --  no reply yet.                     

(there is)

Ottena an kok mes nyns eus den ynno.  --  There is the fishing boat but     
                                      --  there's no one in it.             

(there is)

Nyns eus karr vyth y'n fordh.         --  There is no car at all on the road.

(there is)

Nyns eus gesys kestenenn y'n koes,    --  There isn't a chestnut tree left in
dell dybav.                           --  the wood, I think.                

(there is)

Prag y tybydh yndella? Drefenn nag    --  Why do you think so? Because there
eus ken fordh dhe dybi.               --  is no other way to think.         

(there is)


It is clear there are a large number of matches to "there is a" and its substrings "there is" and "is a". For a long input text, the number of matches of this kind can become very large.

Notice that for the sentence
Yma mebyl gesys ena y'n chi:          --  There is furniture left in the
kador-vregh, kador, gweder ha         --  house: an arm-chair, chair, mirror
lestrier mes nyns eus moes ena.       --  and dresser but there isn't a table
                                      --  there.

(there is), (there is)

the bigram "there is" matches twice, because "there isn't" in the English sentence is tokenized as "there is n't": NLTK treats "isn't" as two words, grammatically speaking.
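The double match can be reproduced without NLTK. The regular expressions below are only a rough imitation of how negative contractions get split, not NLTK's tokenizer itself, and the function names are hypothetical.

```python
import re


def split_negative_contractions(text):
    """Rough imitation of Treebank-style splitting: "isn't" -> "is", "n't"."""
    text = re.sub(r"(\w+)n't\b", r"\1 n't", text.lower())
    # Tokens are "n't", runs of word characters, or single punctuation marks.
    return re.findall(r"n't|\w+|[^\w\s]", text)


def count_bigram(tokens, bigram):
    """Count how often the two-word sequence `bigram` occurs in `tokens`."""
    return sum(1 for pair in zip(tokens, tokens[1:]) if pair == tuple(bigram))


sentence = ("There is furniture left in the house: an arm-chair, chair, mirror "
            "and dresser but there isn't a table there.")
tokens = split_negative_contractions(sentence)
print(count_bigram(tokens, ("there", "is")))
```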

Sunday, 21 August 2016

What is the highest stadium in Cornwall?

Suppose we want to translate the sentence "What is the highest hill in Cornwall?" into Cornish, but the corpus may use a synonym for one of the words.
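One simple way to generate candidate sentences is to substitute each related word for the original one, turning WordNet's underscores into spaces. This is a sketch of the idea with a hypothetical function name, not the code of wordnettest.py.

```python
import re


def sentence_variants(sentence, word, candidates):
    """Return copies of `sentence` with `word` replaced by each candidate."""
    pattern = re.compile(r"\b" + re.escape(word) + r"\b")
    variants = []
    for candidate in candidates:
        replacement = candidate.replace("_", " ")  # e.g. sports_stadium -> sports stadium
        variant = pattern.sub(lambda match, r=replacement: r, sentence)
        if variant not in variants:  # skip repeats of the same candidate
            variants.append(variant)
    return variants


for line in sentence_variants("What is the highest hill in Cornwall?",
                              "hill", ["hill", "mountain", "sports_stadium"]):
    print(line)
```

Each variant can then be passed to the translation memory in the same way as the original sentence.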


python wordnettest.py
Enter some text please.
What is the highest hill in Cornwall?
word: What

Hypernyms
[]

Hyponyms of all hypernyms
[]






word: highest
(Synset('high.a.01'), u'greater than normal in degree or intensity or amount')
(Synset('high.a.02'), u"(literal meaning) being at or having a relatively great or specific elevation or upward extension (sometimes used in combinations like `knee-high')")
(Synset('eminent.s.01'), u'standing above others in quality or position')
(Synset('high.a.04'), u'used of sounds and voices; high in pitch or frequency')
(Synset('high.s.05'), u'happy and excited and energetic')
(Synset('gamey.s.02'), u'(used of the smell of meat) smelling spoiled or tainted')
(Synset('high.s.07'), u'slightly and pleasantly intoxicated from alcohol or a drug (especially marijuana)')

Hypernyms
[]

Hyponyms of all hypernyms
[]


word: hill
(Synset('hill.n.01'), u'a local and well-defined elevation of the land')
(Synset('mound.n.04'), u'structure consisting of an artificial heap or bank usually of earth or stones')
(Synset('hill.n.03'), u'United States railroad tycoon (1838-1916)')
(Synset('hill.n.04'), u'risque English comedian (1925-1992)')
(Synset('mound.n.01'), u'(baseball) the slight elevation on which the pitcher stands')
(Synset('hill.v.01'), u'form into a hill')

Hypernyms
[Synset('natural_elevation.n.01'), Synset('structure.n.01'), Synset('baseball_equipment.n.01'), Synset('shape.v.02')]
(Synset('natural_elevation.n.01'), u'a raised or elevated geological formation')
(Synset('structure.n.01'), u'a thing constructed; a complex entity constructed of many parts')
(Synset('baseball_equipment.n.01'), u'equipment used in playing baseball')
(Synset('shape.v.02'), u'make something, usually for a specific function')

Hyponyms of all hypernyms
[u'highland', u'upland', u'hill', u'mountain', u'mount', u'promontory', u'headland', u'head', u'foreland', u'ridge', u'swell', u'airdock', u'hangar', u'repair_shed', u'altar', u'arcade', u'colonnade', u'arch', u'area', u'balance', u'equilibrium', u'equipoise', u'counterbalance', u'balcony', u'balcony', u'bascule', u'boarding', u'body', u'bridge', u'span', u'building', u'edifice', u'building_complex', u'complex', u'catchment', u'coil', u'spiral', u'volute', u'whorl', u'helix', u'colonnade', u'column', u'pillar', u'corner', u'quoin', u'cross', u'deathtrap', u'defensive_structure', u'defense', u'defence', u'door', u'entablature', u'erection', u'establishment', u'false_bottom', u'floor', u'level', u'storey', u'story', u'fountain', u'guide', u'honeycomb', u'house_of_cards', u'cardhouse', u'card-house', u'cardcastle', u'housing', u'lodging', u'living_accommodations', u'hull', u'jungle_gym', u'lamination', u'landing', u'landing_place', u'lookout', u'observation_tower', u'lookout_station', u'observatory', u'masonry', u'memorial', u'monument', u'mound', u'hill', u'obstruction', u'obstructor', u'obstructer', u'impediment', u'impedimenta', u'partition', u'divider', u'platform', u'weapons_platform', u'porch', u'post_and_lintel', u'prefab', u'projection', u'public_works', u'sail', u'set-back', u'setoff', u'offset', u'shelter', u'shoebox', u'signboard', u'sign', u'stadium', u'bowl', u'arena', u'sports_stadium', u'superstructure', u'supporting_structure', u'tower', u'transept', u'trestlework', u'vaulting', u'ways', u'shipway', u'slipway', u'wellhead', u'wind_tunnel', u'base', u'bag', u'baseball', u'baseball_bat', u'lumber', u'baseball_glove', u'glove', u'baseball_mitt', u'mitt', u'batting_cage', u'cage', u'batting_glove', u'batting_helmet', u"catcher's_mask", u'mound', u'hill', u"pitcher's_mound", u'pine-tar_rag', u'rosin_bag', u'beat', u'carve', u'cast', u'mold', u'mould', u'chip', u'cut_out', u'grind', u'handbuild', u'hand-build', u'coil', u'hill', u'layer', u'machine', 
u'model', u'mold', u'mould', u'mound', u'preform', u'preform', u'puddle', u'reshape', u'remold', u'roughcast', u'sculpt', u'sculpture', u'sinter', u'stamp', u'swage', u'upset', u'throw']




word: Cornwall
(Synset('cornwall.n.01'), u'a hilly county in southwestern England')

Hypernyms
[]

Hyponyms of all hypernyms
[]


word: ?

Hypernyms
[]

Hyponyms of all hypernyms
[]


What is the highest lookout in Cornwall?
What is the highest hill in Cornwall?
What is the highest story in Cornwall?
What is the highest hangar in Cornwall?
What is the highest shipway in Cornwall?
What is the highest landing in Cornwall?
What is the highest sculpt in Cornwall?
What is the highest tower in Cornwall?
What is the highest observation tower in Cornwall?
What is the highest wellhead in Cornwall?
What is the highest pine-tar rag in Cornwall?
What is the highest mould in Cornwall?
What is the highest pillar in Cornwall?
What is the highest deathtrap in Cornwall?
What is the highest offset in Cornwall?
What is the highest stamp in Cornwall?
What is the highest cardcastle in Cornwall?
What is the highest body in Cornwall?
What is the highest honeycomb in Cornwall?
What is the highest bag in Cornwall?
What is the highest weapons platform in Cornwall?
What is the highest post and lintel in Cornwall?
What is the highest prefab in Cornwall?
What is the highest erection in Cornwall?
What is the highest swage in Cornwall?
What is the highest obstructor in Cornwall?
What is the highest handbuild in Cornwall?
What is the highest baseball mitt in Cornwall?
What is the highest sports stadium in Cornwall?
What is the highest ridge in Cornwall?
What is the highest sail in Cornwall?
What is the highest divider in Cornwall?
What is the highest beat in Cornwall?
What is the highest cut out in Cornwall?
What is the highest carve in Cornwall?
What is the highest monument in Cornwall?
What is the highest corner in Cornwall?
What is the highest batting helmet in Cornwall?
What is the highest defence in Cornwall?
What is the highest airdock in Cornwall?
What is the highest sign in Cornwall?
What is the highest puddle in Cornwall?
What is the highest preform in Cornwall?
What is the highest mount in Cornwall?
What is the highest remold in Cornwall?
What is the highest area in Cornwall?
What is the highest supporting structure in Cornwall?
What is the highest level in Cornwall?
What is the highest building in Cornwall?
What is the highest head in Cornwall?
What is the highest cardhouse in Cornwall?
What is the highest promontory in Cornwall?
What is the highest porch in Cornwall?
What is the highest machine in Cornwall?
What is the highest whorl in Cornwall?
What is the highest foreland in Cornwall?
What is the highest upset in Cornwall?
What is the highest lumber in Cornwall?
What is the highest set-back in Cornwall?
What is the highest arch in Cornwall?
What is the highest balance in Cornwall?
What is the highest rosin bag in Cornwall?
What is the highest storey in Cornwall?
What is the highest wind tunnel in Cornwall?
What is the highest memorial in Cornwall?
What is the highest housing in Cornwall?
What is the highest spiral in Cornwall?
What is the highest transept in Cornwall?
What is the highest mold in Cornwall?
What is the highest complex in Cornwall?
What is the highest arena in Cornwall?
What is the highest upland in Cornwall?
What is the highest lodging in Cornwall?
What is the highest partition in Cornwall?
What is the highest altar in Cornwall?
What is the highest equilibrium in Cornwall?
What is the highest slipway in Cornwall?
What is the highest mountain in Cornwall?
What is the highest setoff in Cornwall?
What is the highest masonry in Cornwall?
What is the highest lookout station in Cornwall?
What is the highest edifice in Cornwall?
What is the highest defense in Cornwall?
What is the highest house of cards in Cornwall?
What is the highest obstructer in Cornwall?
What is the highest hull in Cornwall?
What is the highest building complex in Cornwall?
What is the highest batting cage in Cornwall?
What is the highest lamination in Cornwall?
What is the highest baseball bat in Cornwall?
What is the highest headland in Cornwall?
What is the highest shelter in Cornwall?
What is the highest highland in Cornwall?
What is the highest repair shed in Cornwall?
What is the highest guide in Cornwall?
What is the highest counterbalance in Cornwall?
What is the highest chip in Cornwall?
What is the highest model in Cornwall?
What is the highest column in Cornwall?
What is the highest impedimenta in Cornwall?
What is the highest catcher's mask in Cornwall?
What is the highest public works in Cornwall?
What is the highest pitcher's mound in Cornwall?
What is the highest boarding in Cornwall?
What is the highest reshape in Cornwall?
What is the highest baseball glove in Cornwall?
What is the highest sculpture in Cornwall?
What is the highest shoebox in Cornwall?
What is the highest catchment in Cornwall?
What is the highest establishment in Cornwall?
What is the highest bridge in Cornwall?
What is the highest colonnade in Cornwall?
What is the highest layer in Cornwall?
What is the highest cross in Cornwall?
What is the highest roughcast in Cornwall?
What is the highest glove in Cornwall?
What is the highest observatory in Cornwall?
What is the highest arcade in Cornwall?
What is the highest defensive structure in Cornwall?
What is the highest platform in Cornwall?
What is the highest bowl in Cornwall?
What is the highest card-house in Cornwall?
What is the highest cage in Cornwall?
What is the highest throw in Cornwall?
What is the highest mitt in Cornwall?
What is the highest batting glove in Cornwall?
What is the highest jungle gym in Cornwall?
What is the highest equipoise in Cornwall?
What is the highest swell in Cornwall?
What is the highest stadium in Cornwall?
What is the highest helix in Cornwall?
What is the highest door in Cornwall?
What is the highest projection in Cornwall?
What is the highest sinter in Cornwall?
What is the highest living accommodations in Cornwall?
What is the highest impediment in Cornwall?
What is the highest base in Cornwall?
What is the highest signboard in Cornwall?
What is the highest false bottom in Cornwall?
What is the highest bascule in Cornwall?
What is the highest entablature in Cornwall?
What is the highest floor in Cornwall?
What is the highest volute in Cornwall?
What is the highest hand-build in Cornwall?
What is the highest span in Cornwall?
What is the highest superstructure in Cornwall?
What is the highest baseball in Cornwall?
What is the highest coil in Cornwall?
What is the highest cast in Cornwall?
What is the highest fountain in Cornwall?
What is the highest ways in Cornwall?
What is the highest mound in Cornwall?
What is the highest trestlework in Cornwall?
What is the highest landing place in Cornwall?
What is the highest balcony in Cornwall?
What is the highest obstruction in Cornwall?
What is the highest vaulting in Cornwall?
What is the highest quoin in Cornwall?
What is the highest grind in Cornwall?
[(1403, [('in', 'cornwall'), ('highest', 'mountain'), ('the', 'highest'), ('mountain', 'in')]), (132, [('building', 'in')]), (1319, [('in', 'cornwall')]), (1510, [('the', 'highest')]), (1448, [('the', 'highest')]), (1189, [('tower', 'in')])]
Bronn Wennili yw an ughella menydh yn -- Brown Willy is the highest mountain
Kernow. -- in Cornwall.

(highest mountain in), (mountain in cornwall), (is the highest), (the highest mountain)


Bronn Wennili yw an ughella menydh yn -- Brown Willy is the highest mountain
Kernow. -- in Cornwall.

(in cornwall), (highest mountain), (the highest), (mountain in)


An drehevyans a-rag an chi yw karrji. -- The building in front of the house is
-- a garage.

(building in)


Ammeth yw tra vras yn Kernow. -- Agriculture is a big affair in
-- Cornwall.

(in cornwall)


An gwithyas-kres a yskynnas an skeul -- The policeman went up the ladder as
bys dhe fenester an ughella chambour. -- far as the window of the highest
-- bedroom.

(the highest)


Ny yll ev esedha war an ughella huni. -- He cannot sit on the highest one.

(the highest)


Ny welyn an baner kernewek a-ugh tour -- We do not see the Cornish flag above
an eglos y'n dre ma. -- the church tower in this town.

(tower in)

Saturday, 20 August 2016

Nanotechnology is sunny in Cornwall today

Using Python NLTK, it is possible to find synonyms of input words using WordNet, and then feed these into the translation memory software.

This can produce odd results, since some words have senses other than the one intended in the input sentence.

For the sentence "It is sunny in Cornwall today.", the only corpus sentence returning the bigram ("today", ".") is
"Morwenna yw lowen. Hy fenn-bloedh yw hedhyw. -- Morwenna is happy. Her birthday is today. : [('today', '.')]". However, when WordNet is used to find similar words (words that are hyponyms of the hypernyms of the word), the bigrams ("now", ".") and ("yesterday", ".") are also found in the sentence corpus.

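The "hyponyms of the hypernyms" lookup can be sketched as follows. It uses the NLTK WordNet calls synsets(), hypernyms(), hyponyms() and lemma_names(); the function names are my own, and unlike the output shown below this sketch drops repeated words. Running wordnet_synsets() needs NLTK with its WordNet data installed.

```python
def wordnet_synsets(word):
    """Look up the WordNet senses of a word (needs NLTK and its WordNet data)."""
    from nltk.corpus import wordnet  # imported lazily so the sketch loads without NLTK
    return wordnet.synsets(word)


def sibling_words(word, synsets_for=wordnet_synsets):
    """Collect lemma names of every hyponym of every hypernym of each sense of `word`."""
    words = []
    for synset in synsets_for(word):
        for hypernym in synset.hypernyms():
            for hyponym in hypernym.hyponyms():
                for name in hyponym.lemma_names():
                    if name not in words:  # keep first occurrence only
                        words.append(name)
    return words
```

Because every sense of the word is followed, unintended senses come along too, which is how "It" ends up being treated as information technology.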

python wordnettest.py
Enter some text please.
It is sunny in Cornwall today.
word: It
(Synset('information_technology.n.01'), u'the branch of engineering that deals with the use of computers and telecommunications to retrieve and store and transmit information')

Hypernyms
[Synset('engineering.n.02')]
(Synset('engineering.n.02'), u'the discipline dealing with the art or science of applying scientific knowledge to practical problems')

Hyponyms of all hypernyms
[u'aeronautical_engineering', u'architectural_engineering', u'bionics', u'biotechnology', u'bioengineering', u'ergonomics', u'chemical_engineering', u'civil_engineering', u'computer_science', u'computing', u'electrical_engineering', u'EE', u'industrial_engineering', u'industrial_management', u'information_technology', u'IT', u'mechanical_engineering', u'nanotechnology', u'naval_engineering', u'nuclear_engineering', u'rocketry']




word: sunny
(Synset('cheery.s.01'), u'bright and pleasant; promoting a feeling of cheer')

Hypernyms
[]

Hyponyms of all hypernyms
[]




word: Cornwall
(Synset('cornwall.n.01'), u'a hilly county in southwestern England')

Hypernyms
[]

Hyponyms of all hypernyms
[]


word: today
(Synset('today.n.01'), u'the present time or age')
(Synset('today.n.02'), u'the day that includes the present moment (as opposed to yesterday or tomorrow)')
(Synset('nowadays.r.01'), u'in these times')
(Synset('today.r.02'), u'on this day as distinct from yesterday or tomorrow')

Hypernyms
[Synset('present.n.01'), Synset('day.n.01')]
(Synset('present.n.01'), u'the period of time that is happening now; any continuous stretch of time including the moment of speech')
(Synset('day.n.01'), u'time for Earth to make a complete rotation on its axis')

Hyponyms of all hypernyms
[u'date', u'here_and_now', u'present_moment', u'moment', u'now', u'time_being', u'nonce', u'today', u'tonight', u'date', u'day_of_the_month', u'date', u'eve', u'morrow', u'today', u'tomorrow', u'yesterday']


word: .

Hypernyms
[]

Hyponyms of all hypernyms
[]


mechanical engineering is sunny in Cornwall today.
electrical engineering is sunny in Cornwall today.
industrial management is sunny in Cornwall today.
aeronautical engineering is sunny in Cornwall today.
nuclear engineering is sunny in Cornwall today.
nanotechnology is sunny in Cornwall today.
ergonomics is sunny in Cornwall today.
chemical engineering is sunny in Cornwall today.
It is sunny in Cornwall tonight.
It is sunny in Cornwall now.
It is sunny in Cornwall eve.
It is sunny in Cornwall present moment.
It is sunny in Cornwall morrow.
EE is sunny in Cornwall today.
It is sunny in Cornwall moment.
It is sunny in Cornwall today.
rocketry is sunny in Cornwall today.
biotechnology is sunny in Cornwall today.
information technology is sunny in Cornwall today.
It is sunny in Cornwall yesterday.
architectural engineering is sunny in Cornwall today.
civil engineering is sunny in Cornwall today.
naval engineering is sunny in Cornwall today.
computing is sunny in Cornwall today.
industrial engineering is sunny in Cornwall today.
It is sunny in Cornwall day of the month.
IT is sunny in Cornwall today.
bionics is sunny in Cornwall today.
It is sunny in Cornwall here and now.
It is sunny in Cornwall tomorrow.
bioengineering is sunny in Cornwall today.
It is sunny in Cornwall nonce.
It is sunny in Cornwall date.
computer science is sunny in Cornwall today.
It is sunny in Cornwall time being.
Nyns yw leuryow an chi salow lemmyn. -- The floors of the house are not safe now. : [('now', '.')]
Yma nebonan y'n gegin lemmyn. -- There is someone in the kitchen now. : [('now', '.')]
Nyns usi an flogh ow koska lemmyn. -- The child is not sleeping now. : [('now', '.')]
A-dro dhe gans lyver yw gwerthys lemmyn. -- About a hundred books are sold now. : [('now', '.')]
Nyns yw hemma dhe les lemmyn. -- This is no use now. : [('now', '.')]
Aga chi yw gwerthys y'n eur ma. -- Their house is sold now. : [('now', '.')]
Pandra! An gewer yw braffa lemmyn. -- What! The weather is finer now. : [('now', '.')]
Ergh a wra hi yn Alban lemmyn. -- It's snowing in Scotland now. : [('now', '.')]
An arghans yw tanow lemmyn. -- Money is scarce now. : [('now', '.')]
Ammeth yw tra vras yn Kernow. -- Agriculture is a big affair in Cornwall. : [('in', 'cornwall')]
Eus. Hi a wra glaw lemmyn. -- There is. It's raining now. : [('now', '.')]
An notenn yw parys lemmyn -- The note is ready now. : [('now', '.')]
Nyns eus le gesys yn kres an dre lemmyn. -- There isn't a place left in the town centre now. : [('now', '.')]
Ugens lyver yw gwerthys lemmyn. -- Twenty books are sold now. : [('now', '.')]
Bys dhe'n eur ma nyns yw an gewer mar lyb dell o hi de. -- Until now the weather is not as wet as it was yesterday. : [('yesterday', '.')]
Nyns yw hi pur lowen lemmyn. -- She is not very happy now. : [('now', '.')]
An glaw yw tynna es dell o de. Nyns o an dhargan gwir, my a dyb. -- The rain is more intense than it was yesterday. The forecast was not true, I think. : [('yesterday', '.')]
Oll an gerens yw marow lemmyn. -- All the near relations are dead now. : [('now', '.')]
Morwenna yw lowen. Hy fenn-bloedh yw hedhyw. -- Morwenna is happy. Her birthday is today. : [('today', '.')]
An byskadoryon yw parys lemmyn. -- The fishermen are ready now. : [('now', '.')]
Yma an dioges yn le'ti lemmyn. -- The farmwife is in the dairy now. : [('now', '.')]
Ni yw lowen lemmyn. -- We are happy now. : [('now', '.')]
Yma neppyth nowydh y'n dre lemmyn. Henn yw bryntin! -- There's something new in the town now. That's splendid! : [('now', '.')]
Yma Jerri ow koska lemmyn. -- Jerry is sleeping now. : [('now', '.')]
Nyns eus anwoes war Pam namoy. -- Pam hasn't got a cold now. : [('now', '.')]
Dha gyttrin yw gyllys lemmyn. -- Your bus is gone now. : [('now', '.')]
Ottena! Yma y wreg ganso lemmyn. -- Look! There's his wife with him now. : [('now', '.')]
Ni yw warbarth lemmyn. -- We are together now. : [('now', '.')]
Yma spas lowr y'n le na lemmyn. -- There is enough room in that place now. : [('now', '.')]
Bronn Wennili yw an ughella menydh yn Kernow. -- Brown Willy is the highest mountain in Cornwall. : [('in', 'cornwall')]
Kewer deg yw brav mes nyns yw an gewer teg lemmyn. -- Fine weather is great but the weather isn't fine now. : [('now', '.')]
An gewer hedhyw yw dihaval diworth an gewer de. -- The weather today is different from the weather yesterday. : [('yesterday', '.')]
Nyns eus denvydh omma kynth yw hi seyth eur lemmyn. -- There's no one here although it's seven o'clock now. : [('now', '.')]
An vamm re worras an kinyow war an voes lemmyn. Kynsa yma kowl onyon. -- Mother has put the dinner on the table now. First there is onion soup. : [('now', '.')]
An chi yw gwerthys lemmyn. -- The house is sold now. : [('now', '.')]
Nyns eus meur a gommolennow lemmyn. -- There are not many clouds now. : [('now', '.')]
Dydh da, Maureen. Brav yw lemmyn. -- Hello, Maureen. It's grand now. : [('now', '.')]

Friday, 19 August 2016

Translation memory for Cornish now with a GUI

I have developed the translation memory software a little further as part of my TaklowKernewek tools.

It now has a GUI:

Using only bigrams and trigrams from the corpus that contain at least one non-stopword (based on the NLTK stopwords corpus).

Showing all bigrams and trigrams outputs a long list of sentences containing ('is', 'the').
Sentences in the corpus that contain multiple trigrams in common with the input are ranked highest, and similarly with bigrams.
After improvement to the text wrapping of the output sentences to split longer lines:

Thursday, 18 August 2016

Translation memory software for Cornish

One of the discussions I had recently by email with Mark Trevethan was about the translation service of the Cornish Language Office and the idea of 'translation memory': storing examples of previous translation work so that they can be consulted when new text is to be translated. This has two main advantages: it saves labour, and it improves consistency.

I had an idea to make a rudimentary version of this myself, using the Python Natural Language Toolkit. To make this work, I needed a bilingual corpus, which had the same sentences in both Cornish and English.

The electronic version of the Cornish language textbook Skeul an Yeth 1 by Wella Brown has been made freely available online by Kesva an Taves Kernewek (The Cornish Language Board).

This contains a list of example sentences at the end of every chapter, which provides the bilingual corpus for this work.

The program asks for an input sentence in English (currently only via the command line), finds the 'bigrams' and 'trigrams' in it (runs of two or three adjacent tokens), and does the same for the English sentences from Skeul an Yeth 1.

The program uses the NLTK 'stopwords' corpus, a list of common words that carry little lexical content, to filter the bigrams and trigrams. Sentences sharing trigrams with at least 1 non-stopword are listed first, followed by those sharing bigrams with at least 1 non-stopword, followed by trigrams and bigrams that consist solely of stopwords.

For a larger corpus the numbers of sentences found for common bigrams such as ('in', 'the') could become very large.
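The matching and ranking described above can be sketched roughly as follows. This is a minimal illustration, not the code in kovtreylyans.py: the function names, the tiny STOPWORDS set (a stand-in for the NLTK stopword list) and the three-sentence corpus are all assumptions made for the example, and stopword-only n-grams are simply dropped rather than listed last.

```python
import re

# Tiny stand-in for the NLTK English stopword list (assumption for this sketch).
STOPWORDS = {"the", "is", "on", "to", "in", "a", "at", "and"}


def tokenize(sentence):
    """Lowercase and split into words and punctuation marks."""
    return re.findall(r"\w+|[^\w\s]", sentence.lower())


def ngrams(tokens, n):
    """Return all runs of n adjacent tokens as tuples."""
    return list(zip(*(tokens[i:] for i in range(n))))


def has_content_word(gram):
    """True if at least one token in the n-gram is not a stopword."""
    return any(tok not in STOPWORDS for tok in gram)


def shared(query, candidate, n):
    """Content-bearing n-grams of the query that also occur in the candidate."""
    found = set(ngrams(tokenize(candidate), n))
    return [g for g in ngrams(tokenize(query), n)
            if g in found and has_content_word(g)]


def rank(query, corpus):
    """Rank (cornish, english) pairs by shared trigrams, then shared bigrams."""
    hits = []
    for cornish, english in corpus:
        tri = shared(query, english, 3)
        bi = shared(query, english, 2)
        if tri or bi:
            hits.append((len(tri), len(bi), cornish, english))
    # Most shared trigrams first, then most shared bigrams.
    return sorted(hits, key=lambda h: (h[0], h[1]), reverse=True)


corpus = [
    ("Yma Jerri ow koska lemmyn.", "Jerry is sleeping now."),
    ("Usi! Hag yma an gath ena ynwedh.", "Yes! And the cat is there also."),
    ("Eus jynn-skrifa war an desk?", "Is there a typewriter on the desk?"),
]
query = "The cat is sleeping on the floor next to the fire."

for tri, bi, cornish, english in rank(query, corpus):
    print(cornish, "--", english, "(trigrams: %d, bigrams: %d)" % (tri, bi))
```

The typewriter sentence shares only ('on', 'the') with the input, which consists solely of stopwords, so it does not appear in this reduced ranking.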


python kovtreylyans.py
Enter an English sentence
The cat is sleeping on the floor next to the fire.

trigrams for input sentence are:
[('the', 'cat', 'is'), ('cat', 'is', 'sleeping'), ('is', 'sleeping', 'on'), ('sleeping', 'on', 'the'), ('on', 'the', 'floor'), ('the', 'floor', 'next'), ('floor', 'next', 'to'), ('next', 'to', 'the'), ('to', 'the', 'fire'), ('the', 'fire', '.')]

bigrams for input sentence are:
[('the', 'cat'), ('cat', 'is'), ('is', 'sleeping'), ('sleeping', 'on'), ('on', 'the'), ('the', 'floor'), ('floor', 'next'), ('next', 'to'), ('to', 'the'), ('the', 'fire'), ('fire', '.')]

Listing N grams with a minimum of 1 non-stopword each:
Common trigrams:

Yma an gath a'y growedh war an leur yn-dann an gador y'n esedhva. -- The cat is lying on the floor under the chair in the sitting room. (the cat is), (on the floor)
Ottena! An maw moen na ryb an daras. -- There look! That thin boy next to the door. (next to the)
War an leur yn-dann dha weli yn dha jambour, dell vydh usys! -- On the floor under your bed in your bedroom, as usual! (on the floor)
Usi! Hag yma an gath ena ynwedh. -- Yes! And the cat is there also. (the cat is)
Nag esons! Yma an ki war an leur mes yma an gath y'n wydhenn. -- No! The dog is on the ground but the cat is in the tree. (the cat is)
Gorr glow war an tan. Oer yw hi. -- Put coal on the fire. It's cold. (the fire.)
Esedh orth an tan! Ty a vydh toemma ena. -- Sit at the fire. You will be warmer there. (the fire.)
Dewgh orth an tan! Oer yw hi! -- Come to the fire it's very cold! (to the fire)

Common bigrams:

An gath a gosk war an gweliow. -- The cat sleeps on the beds. (the cat), (on the)
Yma Jerri ow koska lemmyn. -- Jerry is sleeping now. (is sleeping)
Ple'ma an gath? -- Where is the cat? (the cat)
Usi an gath y'n lowarth? -- Is the cat in the garden? (the cat)
orth an tan -- at the fire (the fire)
A esedhons i orth an tan pub gorthugher? -- Do they sit at the fire every evening? (the fire)

Other N grams containing only stopwords:
Common trigrams:


Common bigrams:

Ni a dhybris li. Ena ni a gerdhas. Kerdh hir o dhe'n kerrek war an hal -- We ate lunch. Then we walked. It was a long walk to the rocks on the moor. (to the), (on the)
Eus jynn-skrifa war an desk? -- Is there a typewriter on the desk? (on the)
Ottena - yma an genter war an eurlenn. -- Look there - there's the nail on the carpet. (on the)
Yma pras war an woen hag yma chi ryb an pras na. -- There's a field on the down and there's a house by that field. (on the)
Sur, yma lyver war an voes. -- Certainly there is a book on the table. (on the)
Eus traow gesys war an lestrier? -- Are there things left on the dresser? (on the)
Yma bleujyow byw gesys war an fordh omma. -- There are live flowers left on the road here. (on the)
Yma padell blos war voes an gegin. -- There's a dirty pan on the kitchen table. (on the)
Ottena teyr delenn rudh war an leur. -- Look there are three red leaves on the ground. (on the)
Eus hwetek plat byghan war an lestrier? -- Are there sixteen small plates on the dresser? (on the)
Yw. Yma hi war an voes y'n gegin. -- Yes. It's on the kitchen table. (on the)
Eus amanenn war an bara? Eus! -- Is there butter on the bread? Yes! (on the)
Deves yw tanow war an voen. -- Sheep are scarce on the down. (on the)
war an amari -- on the cupboard (on the)
A nyns usi an boes war an voes hwath? -- Isn't the food on the table yet? (on the)
Esons i war an voes? -- Are they on the table? (on the)
War an voes (yma) martesen. -- On the table (it is) perhaps. (on the)
Nebes fordhow y'n ynys yw ledan lowr mes meur a fordhow ena yw re gul. -- Few roads on the island are wide enough but many roads there are too narrow. (on the)
War an voes ymons. -- They are on the table. (on the)
Skrifewgh hanow an lyver war gynsa linen an folenn! -- Write the name of the book on the first line of the page! (on the)
Esesta war an treth? -- Were you on the beach? (on the)
Esewgh hwi war an treth? -- Were you on the beach? (on the)
Y'n koes yth esa del gell war an leur. -- In the wood there were brown leaves on the ground. (on the)
An vamm re worras an kinyow war an voes lemmyn. Kynsa yma kowl onyon. -- Mother has put the dinner on the table now. First there is onion soup. (on the)
War an voes y hworrons i an boes. -- On the table they put the food. (on the)
Ena y hworrav ow hota war an gador. -- Then I put my coat on the chair. (on the)
Yma krys ow kregi war benn an gweli. -- There is a shirt hanging on the end of the bed. (on the)
Gorr an kellylli war an voes! -- Put the knives on the table! (on the)
Y'n seythves dydh an dra o dien. -- On the seventh day the matter was complete. (on the)
Ny yllydh jy esedha war an glesin. Re lyb yw ev. -- You can't sit on the lawn. It's too wet. (on the)
Ev a redyas y hanow y'n peswara koloven war an pympes folenn a'n paper-nowodhow. -- He read his name in the fourth column on the fifth page of the newspaper. (on the)
Pan splann an loergann war an arvor a-dreus an mor kosel, assyw hi teg! -- When the full moon shines on the shore across the calm sea, how beautiful it is! (on the)
Tasik! Tasik! Ottena! Ergh war an glesin! -- Daddy! Daddy! Look! Snow on the lawn! (on the)
War drysa estyllenn an argh-lyvrow y'n esedhva yma, dell dybav. -- On the third shelf of the bookcase in the lounge it is, I think. (on the)
Nyns eus karr vyth y'n fordh. -- There is no car at all on the road. (on the)
An gewer yw hager war an heyl. Ny yll den gweles a-dreus dhodho. -- The weather is ugly on the estuary. A person cannot see across it. (on the)
An rewler a worras an lytherow war an desk rybdho. -- The manager put the letters on the desk beside him. (on the)
Ottena! A-dro dhe hanterkans hos war an lynn yn kres an hal. -- Look there! About fifty ducks on the lake in the middle of the moor. (on the)
An peswara drehevyans diworto yw ev a'n keth tu. -- It's the fourth building from it on the same side. (on the)
Goel Sen Pyran a vydh pub blydhen dhe'n pympes a vis Meurth. -- St Piran's Day is on the fifth of March each year. (on the)
Henri a vynn esedha war an isella kador. -- Henry will sit on the lowest chair. (on the)
Ny yll ev esedha war an ughella huni. -- He cannot sit on the highest one. (on the)
Prag y tregh ev an skorrennow na? Drefenn ev dh'aga leski war an tansys. -- Why does he cut those branches? Because he burns them on the bonfire. (on the)
'Yma diwros war an fordh ena ha gour shyndys a'y wrowedh war an leur', an gwithyas kres a leveris. 'Res yw dhis gortos deg mynysenn, mar pleg. Ni a vynn y worra dhe'n klavji a-dhistowgh.' -- 'There's a bicycle on the road there and a man lying injured.' replied the policeman. 'You must wait ten minutes, please. We will take him to hospital immediately.' (on the)
Kerdh hir yw dhe'n eglos. -- It's a long walk to the church. (to the)
Py chambour yw an nessa dhe'n lowarth a-rag? -- Which bedroom is nearest to the front garden? (to the)
Ke dhe'n fenester, mar pleg! -- Go to the window, please! (to the)
Nyns yw an traow ma pur haval orth an re erell, yns i? -- These are not very similar to the others, are they? (to the)
Martyn eth dhe'n treth mes nyns eth dhe neuvya. -- Martin went to the beach but he didn't go to swim. (to the)
Y'n eur na yth eth ev dhe skol an eglos. -- He then went to the church school. (to the)
My a wra lenna hwedhel dhe'n fleghes pub gorthugher. -- I read to the children every evening. (to the)
An dowr a yn nans dhe'n mor. -- The water goes down to the sea. (to the)
Ni oll warbarth eth yn-nans dhe'n treth rag neuvya. -- We all went down to the beach together in order to swim. (to the)
An keur a ganas dhe'n fleghes. -- The choir sang to the children. (to the)
A vynnowgh hwi mones genen dhe'n dons? -- Will you go with us to the dance? (to the)
Dowr an fenten a dhe'n gover. -- The spring water goes to the brook. (to the)
Karol a lanhas an lestri kyns aga daskorr dhe'n lestrier. -- Carol cleaned the dishes before returning them to the dresser. (to the)
An tiek a dhros y vughes dhe'n skiber. -- The farmer brought his cows to the barn. (to the)
An brassa stevell yw an nessa stevell dhe'n wolghva. -- The biggest room is the nearest room to the bathroom. (to the)
An skoloryon, mebyon ha mowesi, a dhe'n keth skol y'n dre. -- The schoolchildren, boys and girls, go to the same school in town. (to the)
An awel o krev. Ny allas an gorholyon dos ogas dhe'n porth. -- The wind was strong. The ships couldn't come near to the harbour. (to the)

Monday, 15 August 2016

Text to speech in Cornish

The program espeak offers text to speech in a variety of languages, not yet Cornish, but I have made a bit of a hack that allows Cornish text to be spoken by it.

There is a Welsh language voice for it, and I have created a script that processes Cornish text, applying a series of replacements to make it conform to Welsh spelling rules.

It would be possible to get espeak to speak Cornish directly by creating a Cornish voice for it, and I did start doing this a long time ago, but unfortunately lost this work along with my previous laptop.

The GUI launcher currently only works on Linux-compatible systems, because it launches espeak from the command line using the Python os library. However, espeak itself is also available for Windows and I will adapt the script to work on Windows dreckly.
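The general approach can be sketched like this. The three respelling rules are illustrative assumptions based on Welsh orthography (Welsh dd, f and ff correspond to the sounds written dh, v and f in Cornish) and are not the actual list used in the script; the function names are hypothetical, and subprocess is swapped in here for the os library. The espeak options used are -v (voice) and -w (write a WAV file).

```python
import shutil
import subprocess

# Illustrative respelling rules only (assumption, not the author's list).
# Order matters: "f" must be doubled before "v" is rewritten as "f".
RESPELLINGS = [("f", "ff"), ("v", "f"), ("dh", "dd")]


def respell_for_welsh_voice(text):
    """Apply each replacement in turn so the Welsh voice reads the text."""
    for cornish, welsh in RESPELLINGS:
        text = text.replace(cornish, welsh)
    return text


def build_command(text, wav_path=None):
    """Build the espeak argument list, optionally writing to a WAV file."""
    cmd = ["espeak", "-v", "cy"]
    if wav_path is not None:
        cmd += ["-w", wav_path]
    cmd.append(respell_for_welsh_voice(text))
    return cmd


def speak(text):
    """Speak the text if espeak is installed; return whether it was run."""
    if shutil.which("espeak") is None:
        return False
    subprocess.run(build_command(text), check=False)
    return True


print(build_command("dydh da"))
```

Passing the arguments as a list avoids shell quoting problems, and may make it easier to run the same code on Windows wherever espeak is on the PATH.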

The first quote as an mp3 file. The second is generated by pressing the "Gorhemmyn" button, and an appropriate greeting is chosen according to the system clock.



Sunday, 14 August 2016

Transliteration from Kernewek Kemmyn to Standard Written Form

The script treuslytherenna.py and its GUI frontend treuslytherennaGUI.py convert text from Kernewek Kemmyn to Standard Written Form (Main Form).

See also the brief writeup on my website, and earlier on this blog.

A couple of example sentences I use to illustrate some of its features are:

  • Yth esa gwydhenn y'n goeswik
  • Yth esa gwydhennow y'n goeswik

'There was a tree in the forest' is the translation of the first sentence, and gwydhennow is the plural of the singulative gwydhenn, which derives from the collective noun gwydh (trees). Gwydh would be used for a general mass of trees, gwydhenn for a single tree, and gwydhennow for a countable collection of individual trees.



In the left-hand panel, gwydhenn becomes gwedhen, showing two changes. Firstly, the doubled consonant -nn becomes single -n. The program will make this change for unstressed syllables, excluding those that are prefixes with secondary stress, like penn- in pennseythun and some others.
The other change is the y becoming an e as part of vocalic alternation. This occurs for y vowels that are 'half-long' in Kernewek Kemmyn, which is detected via the syllable segmentation program.
The function converty(inputsyl) in treuslytherenna.py applies this change as long as the word isn't in a list of exceptions given in datageryow.py and the syllable ends in a consonant. If the syllable ends in a vowel (e.g. ay, ey, oy diphthongs, and -ya endings where the y (which is really a semi-vowel y) has been erroneously assigned to the previous syllable) the change is not made.

If backwards segmentation is chosen, this change won't happen since gwydhenn will be segmented into ['gwy', 'dhenn'] and the y will not be changed since it is now in a syllable ending in a vowel.
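The rule can be illustrated with a short sketch. This is not the converty(inputsyl) function from treuslytherenna.py: the name convert_y_sketch and its parameters are hypothetical, the half-long status is passed in as a flag rather than worked out from the segmentation, and the exception list is left empty.

```python
VOWEL_LETTERS = set("aeiouy")


def convert_y_sketch(syllable, word, half_long, exceptions=frozenset()):
    """Change y to e in a half-long syllable that ends in a consonant.

    syllable   -- one syllable of the word, as plain text
    word       -- the whole word, checked against the exceptions
    half_long  -- True if the syllable's vowel is half-long (assumed known)
    exceptions -- words that must never be changed (hypothetical stand-in
                  for the list in datageryow.py)
    """
    if word in exceptions or not half_long:
        return syllable
    if not syllable or syllable[-1] in VOWEL_LETTERS:
        # Syllable ends in a vowel (e.g. 'gwy', or an ay/ey/oy diphthong).
        return syllable
    return syllable.replace("y", "e", 1)


# Forwards segmentation gives 'gwydh' + 'enn': the y is changed.
print(convert_y_sketch("gwydh", "gwydhenn", half_long=True))
# Backwards segmentation gives 'gwy' + 'dhenn': the y is left alone.
print(convert_y_sketch("gwy", "gwydhenn", half_long=True))
```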

The word goeswik (mutation of koeswik) becomes goswik, as the Kernewek Kemmyn oe becomes o where it is a short or half-long vowel, and oo in a syllable with a Kernewek Kemmyn long vowel.

In the right-hand panel, the word gwydhennow is unchanged, because the y vowel in the first syllable is now short rather than half-long, and the -nn is in a stressed syllable, so it is retained as a double consonant.


Syllable segmentation in Cornish - forward vs. backward segmentation

I have commented on the syllable segmentation module of TaklowKernewek earlier in this blog, and on my website.

However there is much more to discuss, and one aspect of this is that the program offers a choice between forwards and backwards segmentation.

This means either starting from the beginning of the word, and working forwards assigning the letters to particular syllables, or starting from the end and working backwards.

I present some of the code from the program below. It is admittedly difficult to read, so skip down to the examples at the bottom if you prefer. It may also be easier to read at my Bitbucket site.

The core of this program is a set of regular expressions, as follows:

# syllabelRegExp should match syllable anywhere in a word
# a syllable could have structure CV, CVC, VC, V
# will now match traditional graphs c-, qw- in syllable initial position
syllabelRegExp = r'''(?x)
((bl|br|Bl|Br|kl|Kl|kr|Kr|kn|Kn|kwr?|Kwr?|qwr?|Qwr?|ch|Ch|Dhr?\'?|dhr?\'?|dl|dr|Dr|fl|Fl|fr|Fr|vl|Vl|vr|Vr|vv|ll|gwr?|gwl?|gl|gr|gg?h|gn|Gwr?|Gwl?|Gl|Gr|Gn|hwr?|Hwr?|ph|Ph|pr|pl|Pr|Pl|shr?|Shr?|str?|Str?|skr?|Skr?|skw?|Skw?|sbr|Sbr|spr|Spr|sp?l?|Sp?l?|sm|Sm|tth|Tth|thr?|Thr?|tr|Tr|tl|Tl|wr|Wr|wl|Wl|[bckdfjvlghmnprstwyzBCKDFJVLGHMNPRSTVWZY]) # consonant
\'?(ay|a\'?w|eu|ey|ew|iw|oe|oy|ow|ou|uw|yw|[aeoiuy])\'? #vowel
(lgh|ls|lt|bl|br|bb|kl|kr|kn|kwr?|kk|n?ch|dhr?|dl|n?dr|dd|fl|fr|ff|vl|vv|gg?ht?|gw|gl|gn|ld|lf|lk|ll|mm|mp|nk|nd|nj|ns|nth?|nn|ph|pr|pl|pp|rgh?|rdh?|rth?|rk|rl|rv|rm|rn|rr|rj|rf|rs|sh|st|sk|ss|sp?l?|tt?h|tt|[bdfgljmnpkrstvw])? # optional const.
)| # or
(\'?(ay|a\'?w|eu|ew|ey|iw|oe|oy|ow|ou|uw|yw|Ay|Aw|Ey|Eu|Ew|Iw|Oe|Oy|Ow|Ou|Uw|Yw|[aeoiuyAEIOUY])\'? # vowel
(lgh|ls|lt|bl|bb|kl|kr|kn|kwr?|kk|cch|n?ch|dhr?|dl|n?dr|dd|fl|fr|ff|vl|vv|gg?ht?|gw|gl|gn|ld|lf|lk|ll|mm|mp|nk|nd|nj|ns|nth?|nn|ph|pr|pl|pp|rgh?|rdh?|rth?|rk|rl|rv|rm|rn|rr|rj|rf|rs|sh|st|sk|ss|sp?l?|tt?h|tt|[bdfgljmnpkrstvw]\'?)?) # consonant (optional)
'''
# diwetRegExp matches a syllable at the end of the word
diwetRegExp = r'''(?x)
((bl|br|Bl|Br|kl|Kl|kr|Kr|kn|Kn|kwr?|Kwr?|qwr?|Qwr?|ch|Ch|Dhr?\'?|dhr?\'?|dl|dr|Dl|Dr|fl|Fl|fr|Fr|vl|Vl|vr|Vr|vv|ll|gwr?|gwl?|gl|gr|gg?h|gn|Gwr?|Gwl?|Gl|Gr|Gn|hwr?|Hwr?|ph|Ph|pr|pl|Pr|Pl|shr?|Shr?|str?|Str?|skr?|Skr?|skw?|Skw?|sbr|Sbr|spr|Spr|sp?l?|Sp?l?|sm|Sm|tth|Tth|thr?|Thr?|tr|Tr|tl|Tl|wr|Wr|wl|Wl|[bckdfjlghpmnrstvwyzBCKDFJLGHPMNRSTVWYZ]\'?)? #consonant or c. cluster
\'?(ay|a\'?w|eu|ew|ey|iw|oe|oy|ow|ou|uw|yw|Ay|Aw|Ey|Eu|Ew|Iw|Oe|Oy|Ow|Ou|Uw|Yw|\'?[aeoiuyAEIOUY]\'?) # vowel
(lgh|ls|lt|bl|br|bb|kl|kr|kn|kwr?|kk|cch|n?ch|dhr?|dl|n?dr|dd|fl|fr|ff|vl|vv|gg?ht?|gw|gl|gn|ld|lf|lk|ll|mm|mp|nk|nd|nj|ns|nth?|nn|ph|pr|pl|pp|rgh?|rdh?|rth?|rk|rl|rv|rm|rn|rr|rj|rf|rs|sh|st|sk|ss|sp?l?|tt?h|tt|[bdfgjklmnprstvw]\'?)? # optionally a second consonant or cluster ie CVC?
(\-|\.|\,|;|:|!|\?|\(|\))*
)$
'''
# kynsaRegExp matches syllable at beginning of a word
# 1st syllable could be CV, CVC, VC, V
kynsaRegExp = r'''(?x)
^((\'?(bl|br|Bl|Br|kl|Kl|kr|Kr|kn|Kn|kwr?|Kwr?|qwr?|Qwr?|ch|Ch|Dhr?|dhr?|dl|dr|Dr|fl|Fl|fr|Fr|vl|Vl|vr|Vr|gwr?|gwl?|gl|gr|gn|Gwr?|Gwl?|Gl|Gr|Gn|hwr?|Hwr?|ph|Ph|pr|pl|Pr|Pl|shr?|Shr?|str?|Str?|skr?|Skr?|skw?|Skw?|sbr|Sbr|spr|Spr|sp?l?|Sp?l?|sm|Sm|tth|Tth|thr?|Thr?|tr|Tr|tl|Tl|wr|Wr|wl|Wl|[bckdfghjlmnprtvwyzBCKDFGHJLMNPRTVWYZ])\'?)? # optional C.
\'?(ay|a\'?w|eu|ew|ey|iw|oe|oy|ow|ou|uw|yw|Ay|Aw|Ey|Eu|Ew|Iw|Oe|Oy|Ow|Ou|Uw|Yw|[aeoiuyAEIOUY])\'? # Vowel
(lgh|ls|lk|ld|lf|lt|bb?|kk?|cch|n?ch|n?dr|dh|dd?|ff?|vv?|ght|gg?h?|ll?|mp|mm?|nk|nd|nj|ns|nth?|nn?|pp?|rgh?|rdh?|rth?|rk|rl|rv|rm|rn|rj|rf|rs|rr?|sh|st|sk|sp|ss?|tt?h|tt?|[jw]\'?)? # optional C.
(\-|\.|\,|;|:|!|\?|\(|\))*
)'''


In the actual segmentation of the word itself, the expressions kynsaRegExp and diwetRegExp are used, depending on whether we are going forwards starting from the beginning or backwards from the end:


if fwds:
    # go forwards
    sls = rannans.ranna_syl(self.graph,regexps.kynsaRegExp,fwd=True,bwd=False)
else:
    # go backwards from end
    sls = rannans.ranna_syl(self.graph,regexps.diwetRegExp,fwd=False,bwd=True)


where ranna_syl() is the actual function that returns a list of syllables from the word ger:


def ranna_syl(self,ger,regexp,fwd=True,bwd=False):
    """ divide a word into a list of its syllables
    and return this as a list of plain text strings
    """
    syl_list = []
    if fwd:
        # go forwards through the word
        while ger:
            # print(ger)
            k = self.match_syl(ger,regexp)
            # print("kynsa syl:{k}".format(k=k))
            # add the syllable to the list
            if k != '':
                syl_list.append(k)
            if k != '' and len(ger.split(k,1))>1:
                # if there is more of the word after the
                # 1st syllable
                # remove the 1st syllable
                ger = ger.split(k,1)[1]
            else:
                ger = ''
    if bwd:
        # go backwards from the end through the word
        while ger:
            # print(ger)
            d = self.match_syl(ger,regexp)
            # print(d)
            # add the syllable to the list
            if d != '':
                syl_list.insert(0,d)
            if d != '' and len(ger.rsplit(d,1))>1:
                # if there is more of the word before the
                # last syllable
                # remove the last syllable
                ger = ger.rsplit(d,1)[0]
            else:
                ger = ''
    # this is returning
    # a list of plain text
    # not Syllabenn objects
    return syl_list


The syllabelRegExp regular expression is used in the Syllabenn class itself, as part of the code that initiates a Syllabenn object and works out the syllable parts, i.e. consonant clusters and vowels, and the overall length.

Example sentences

The effect of going forwards or backwards can be illustrated in the processing of an example sentence:

Going backwards from the end tends to maximise consonants at the beginning of syllables. For example, the word 'gewer' is processed into ['ge', 'wer'], i.e. the w is assigned to the second syllable, whereas in this word the 'ew' is actually pronounced as a diphthong. The geminated consonant 'mm' in lemmyn is split between two different syllables.
Working forwards, the word 'gewer' now splits into ['gew', 'er'], which accords with the status of 'ew' as a diphthong. 'Lemmyn' now splits into ['lemm', 'yn'], assigning the whole of the geminated consonant to the first syllable. The word 'Fatell' now has the 't' assigned to the first syllable.

A similar effect can be seen in another sentence:
Special cases such as the unstressed monosyllables 'ha', and 'dell' are detailed in the file datageryow.py.

With forwards segmentation, the processing of 'kommolek', and 'hevel' assigns consonants to the coda of syllables rather than maximising the onset.
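The forward and backward behaviour described above can be reproduced with a much-reduced pair of patterns. This is a sketch only: the ONSET, VOWEL and CODA lists are heavily cut-down assumptions covering lowercase letters, not the full expressions from the program, and the segment() function slices the word by match position rather than using split() as ranna_syl() does.

```python
import re

# Heavily reduced pattern pieces (assumptions for this sketch).
ONSET = r"(?:gw|dh|th|ch|sh|hw|kw|[bcdfghjklmnprstvwyz])"
VOWEL = r"(?:ay|aw|eu|ew|ey|iw|oe|oy|ow|ou|uw|yw|[aeiouy])"
CODA = r"(?:mm|nn|ll|dh|th|gh|[bdfgjklmnprstvw])"

# Counterpart of kynsaRegExp: a syllable anchored at the start of the word.
FIRST = re.compile("^" + ONSET + "?" + VOWEL + CODA + "?")
# Counterpart of diwetRegExp: a syllable anchored at the end of the word.
LAST = re.compile(ONSET + "?" + VOWEL + CODA + "?$")


def segment(word, forwards=True):
    """Split a lowercase word into syllables, forwards or backwards."""
    syllables = []
    while word:
        if forwards:
            m = FIRST.match(word)
            if m is None:
                break
            syllables.append(m.group(0))
            word = word[m.end():]      # drop the first syllable
        else:
            m = LAST.search(word)
            if m is None:
                break
            syllables.insert(0, m.group(0))
            word = word[:m.start()]    # drop the last syllable
    return syllables


for ger in ["gewer", "lemmyn", "fatell", "gwydhenn"]:
    print(ger, segment(ger, forwards=True), segment(ger, forwards=False))
```

Going forwards, the greedy match fills the coda of each syllable before moving on; going backwards, the search finds the leftmost point from which a syllable can run to the end of the word, so each syllable's onset is filled first.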