Showing posts with label welsh language. Show all posts

Tuesday, 20 June 2017

Syllable segmentation: showing an error message if input not understood in full

In my previous post about filling in the gap in Bewnans Ke at the Cornish language weekend, I noticed that the Cornish word 'vyajya' (to travel) is not fully understood by the syllable segmentation module of taklow-kernewek when the reverse segmentation mode is used, which starts from the end of the word and works backwards. It assumes the penultimate syllable is 'yaj', starting with a semivocalic y rather than a vowel y. This leaves 'v' on its own, which the regular expression does not match as a syllable.

This can be compounded if accented or non-alphabetic characters are included. A warning can now be given when not all of the input word is matched, using the command-line syllabenn_ranna_kw.py with the --warn option, or via checkboxes in sylrannakwGUI.py and sylrannacyGUI.py.
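A minimal sketch of how such a check could work, assuming a heavily simplified syllable pattern. The pattern and the function names `unmatched_parts` and `segment_with_warning` are made up for this illustration and are not the ones in taklow-kernewek. The idea is simply to see what is left over once every matched syllable has been removed from the word.

```python
import re

# Toy pattern (an assumption for this sketch, NOT the real taklow-kernewek regexp):
# optional consonants, one or more vowels, optional consonants.
CONS = "bcdfghjklmnpqrstvwxz"
VOWELS = "aeiouy"
SYLLABLE = re.compile(f"[{CONS}]*[{VOWELS}]+[{CONS}]*", re.IGNORECASE)


def unmatched_parts(word):
    """Return the pieces of the word that no syllable match covered."""
    # re.split() returns the text between matches; drop the empty strings.
    return [piece for piece in SYLLABLE.split(word) if piece]


def segment_with_warning(word):
    """Return (syllables, warning), where warning is None if all input matched."""
    syllables = SYLLABLE.findall(word)
    leftover = unmatched_parts(word)
    warning = None
    if leftover:
        warning = f"Warning: {leftover} in {word!r} not matched as syllables"
    return syllables, warning


if __name__ == "__main__":
    for w in ["kernewek", "kana!", "môr"]:
        print(w, segment_with_warning(w))
```

With this toy pattern an accented vowel such as the ô in "môr" is not in either character class, so the whole word is reported as unmatched, which is the kind of case the warning is meant to catch.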

The window on my netbook. If the box is ticked, a warning is given. One of my next things to do is to make the output box a little cleverer, to avoid splitting lines in the middle of words.
The interface language is a bit confused: the explanatory text here remains in Cornish, although the warning message is in Welsh. This was not done in a very rational way. The language is hard-coded at the moment to appear in Cornish only in short or line mode, and bilingually in Cornish and English in long mode, except that if the CYmode flag is set this is overridden and the Welsh version is displayed. It probably needs a bit of an overhaul so that the language can be changed in a similar way to the corpus statistics module.

Monday, 20 March 2017

Llan50goch

Here is the result of using the regular expressions used in my Python module to segment the longest place name in the UK.

Both forward and backward segmentation are used. Background picture from Astronomy Picture of the Day.

Sunday, 19 March 2017

Syllable segmentation in Welsh

Building on previous work on syllable segmentation, I have now made an initial version of syllable segmentation for Welsh. It is currently implemented within the syllabenn_ranna_kw module, and there is also a test script, regexp_test_cy.py, at bitbucket.org/davidtreth/taklow-kernewek.


The output of the test script looks like this:
Each syllable is matched by a regular expression using re.findall(). The output of this is a list of tuples, one per syllable. The first four elements of the tuple are used if the syllable begins with a consonant, the next three if it begins with a vowel, and the last three if it is a syllable (either CV or V) using a vowel with a diaeresis (which looks like a German umlaut). In Welsh the diaeresis forces the vowel to be a separate syllable rather than part of a diphthong with a following vowel.
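To illustrate how re.findall() behaves when the pattern has alternatives with capture groups, here is a small sketch. The pattern `TOY` is a made-up two-branch pattern with five groups rather than the ten-group Welsh pattern described above, and `classify` and `syllables` are hypothetical helper names. Only the groups belonging to the branch that matched are non-empty, so the tuple shows which kind of syllable was found.

```python
import re

CONS = "bcdfghjklmnprstvw"
VOWELS = "aeiouy"

# Toy pattern, an assumption for this sketch only:
#   groups 1-3: consonant-initial syllable (onset, vowel, coda)
#   groups 4-5: vowel-initial syllable (vowel, coda)
TOY = re.compile(
    f"([{CONS}]+)([{VOWELS}]+)([{CONS}]*)"
    f"|([{VOWELS}]+)([{CONS}]*)"
)


def classify(match_tuple):
    """Work out which branch matched from which groups are non-empty."""
    onset, nucleus, coda, v_nucleus, v_coda = match_tuple
    if onset:
        return ("C-initial", onset + nucleus + coda)
    return ("V-initial", v_nucleus + v_coda)


def syllables(word):
    """Return one (kind, text) pair per tuple produced by re.findall()."""
    return [classify(t) for t in TOY.findall(word.lower())]


if __name__ == "__main__":
    print(TOY.findall("afon"))   # [('', '', '', 'a', 'f'), ('', '', '', 'o', 'n')]
    print(syllables("afon"))     # [('V-initial', 'af'), ('V-initial', 'on')]
    print(syllables("hen"))      # [('C-initial', 'hen')]
```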

This is a method, matching backwards from the end of the word.
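A rough sketch of what matching backwards can look like, again with a toy pattern that is only an assumption for this example and not the module's own regexp. The pattern is anchored at the end of the string, the final syllable is taken off, and the process repeats on what remains. Anything that cannot be matched is returned as a leftover.

```python
import re

CONS = "bcdfghjklmnprstvw"
VOWELS = "aeiouy"

# Toy pattern anchored at the end of the string (hypothetical, for illustration).
LAST_SYLLABLE = re.compile(f"[{CONS}]*[{VOWELS}]+[{CONS}]*\\Z")


def segment_backwards(word):
    """Peel syllables off the end of the word; return (syllables, leftover)."""
    found = []
    rest = word
    while rest:
        m = LAST_SYLLABLE.search(rest)
        if m is None:
            break  # nothing more can be matched, e.g. a stray consonant
        found.insert(0, m.group(0))
        rest = rest[:m.start()]
    return found, rest


if __name__ == "__main__":
    print(segment_backwards("kernewek"))  # (['ke', 'rne', 'wek'], '')
    print(segment_backwards("str"))       # ([], 'str')
```

With this toy pattern, working backwards gives each syllable as many leading consonants as possible, so the result can differ from a forward pass over the same word. A non-empty leftover is the situation in which a warning would be useful.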



The main program for splitting text into syllables in text mode, syllabenn_ranna_kw.py, is now able to process Welsh text (possibly erroneously). I will create a Tkinter GUI app for it once I have debugged it a bit more.

Here is Mae Hen Wlad Fy Nhadau treated by the program:

Output in line mode

python3 syllabenn_ranna_kw.py --fwd --line --cyregexp maehenwlad.txt

Linenn 1
Mae:1  hen:1  wlad:1  fy:1  nhadau:2  yn:1  annwyl:2  i:1  mi:1  ,:0 
Niver a sylabennow y'n linenn = 11

Linenn 2
Gwlad:1  beirdd:1  a:1  chantorion:3  ,:0  enwogion:3  o:1  fri:1  ;:0 
Niver a sylabennow y'n linenn = 11

Linenn 3
Ei:1  gwrol:1  ryfelwyr:3  ,:0  gwladgarwyr:3  tra:1  mad:1  ,:0 
Niver a sylabennow y'n linenn = 10

Linenn 4
Dros:1  ryddid:2  collasant:3  eu:1  gwaed:1  .:0 
Niver a sylabennow y'n linenn = 8

Linenn 5

Niver a sylabennow y'n linenn = 0

Linenn 6
Gwlad:1  ,:0  gwlad:1  ,:0  pleidiol:2  wyf:1  i'm:1  gwlad:1  .:0 
Niver a sylabennow y'n linenn = 7

Linenn 7
Tra:1  môr:1  yn:1  fur:1  i'r:1  bur:1  hoff:1  bau:1  ,:0 
Niver a sylabennow y'n linenn = 8

Linenn 8
O:1  bydded:2  i'r:1  hen:1  iaith:1  barhau:2  .:0 
Niver a sylabennow y'n linenn = 8

Linenn 9

Niver a sylabennow y'n linenn = 0

Linenn 10
Hen:1  Gymru:2  fynyddig:3  ,:0  paradwys:3  y:1  bardd:1  ,:0 
Niver a sylabennow y'n linenn = 11

Linenn 11
Pob:1  dyffryn:2  ,:0  pob:1  clogwyn:2  ,:0  i'm:1  golwg:2  sydd:1  hardd:1  ;:0 
Niver a sylabennow y'n linenn = 11

Linenn 12
Trwy:1  deimlad:2  gwladgarol:3  ,:0  mor:1  swynol:2  yw:1  si:1 
Niver a sylabennow y'n linenn = 11

Linenn 13
Ei:1  nentydd:2  ,:0  afonydd:3  ,:0  i:1  mi:1  .:0 
Niver a sylabennow y'n linenn = 8

Linenn 14

Niver a sylabennow y'n linenn = 0

Linenn 15
Os:1  treisiodd:2  y:1  gelyn:2  fy:1  ngwlad:1  tan:1  ei:1  droed:1  ,:0 
Niver a sylabennow y'n linenn = 11

Linenn 16
Mae:1  hen:1  iaith:1  y:1  Cymry:2  mor:1  fyw:1  ag:1  erioed:2  ,:0 
Niver a sylabennow y'n linenn = 11

Linenn 17
Ni:1  luddiwyd:2  yr:1  awen:2  gan:1  erchyll:2  law:1  brad:1  ,:0 
Niver a sylabennow y'n linenn = 11

Linenn 18
Na:1  thelyn:2  berseiniol:3  fy:1  ngwlad:1  .:0 
Niver a sylabennow y'n linenn = 8


Selected words in long-form

python3 syllabenn_ranna_kw.py --fwd --cyregexp maehenwlad.txt | more
...
An ger yw: gwrol
Niver a syllabennow yw: 1
Hag yns i: ['gwrol']
S1: GWROL, CVC, hirder = [1, 2, 1], hirder kowal = 4
Hirder ger kowal = 4
...
An ger yw: pleidiol
Niver a syllabennow yw: 2
Hag yns i: ['pleid', 'iol']
S1: PLEID, CVC, hirder = [1, 2, 1], hirder kowal = 4
S2: iol, CVC, hirder = [1, 1, 1], hirder kowal = 3
Hirder ger kowal = 7
...

The word "gwrol" is assumed to have gwr as a consonant cluster, and is thus analysed as 1 syllable rather than the 2 it has if w is pronounced as a vowel, which I believe is the correct Welsh pronunciation of this word, according to recordings of the Welsh national anthem.
Also, "pleidiol" is analysed as two syllables, interpreting the second i as a semi-vowel. It should really be 3 syllables, interpreting it as a vowel, i.e. ["pleid", "i", "ol"].

The Welsh syllable segmentation has a set of regular expressions defined for Welsh, and a subclass of the Syllabenn object. However, the Ger object only has the special cases (of abnormal stress) defined for Cornish in the file datageryow.py, so most abnormally stressed Welsh words will not be picked up.
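As a sketch of how a table of special cases can sit alongside a default rule, the example below assumes penultimate stress by default and consults a hypothetical exception table first. The names `STRESS_EXCEPTIONS`, `stressed_index` and `mark_stress` are invented here and this is not the code in datageryow.py. The single entry "barhau" (taken here to be stressed on its final syllable) is included purely as an illustration. The stressed syllable is shown in capitals, in the style of the long-form output.

```python
# Hypothetical exception table: word -> index of the stressed syllable.
# The entry is illustrative only and is not taken from datageryow.py.
STRESS_EXCEPTIONS = {"barhau": 1}


def stressed_index(word, syllables):
    """Use the exception table if the word is listed, else the penultimate syllable."""
    key = word.lower()
    if key in STRESS_EXCEPTIONS:
        return STRESS_EXCEPTIONS[key]
    # Monosyllables have no penultimate syllable, so fall back to index 0.
    return max(len(syllables) - 2, 0)


def mark_stress(word, syllables):
    """Return the syllables with the stressed one in capitals."""
    i = stressed_index(word, syllables)
    return [s.upper() if n == i else s for n, s in enumerate(syllables)]


if __name__ == "__main__":
    print(mark_stress("chantorion", ["chant", "or", "ion"]))  # ['chant', 'OR', 'ion']
    print(mark_stress("barhau", ["bar", "hau"]))              # ['bar', 'HAU']
```

This simplified rule capitalises every monosyllable, which is not quite what the output here does for short unstressed words such as "a" and "y".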

Appendix, full output of Mae Hen Wlad Fy Nhadau:

python3 syllabenn_ranna_kw.py --fwd --cyregexp maehenwlad.txt
An ger yw: Mae
Niver a syllabennow yw: 1
Hag yns i: ['Mae']
S1: MAE, CV, hirder = [1, 2], hirder kowal = 3
Hirder ger kowal = 3


An ger yw: hen
Niver a syllabennow yw: 1
Hag yns i: ['hen']
S1: HEN, CVC, hirder = [1, 2, 1], hirder kowal = 4
Hirder ger kowal = 4


An ger yw: wlad
Niver a syllabennow yw: 1
Hag yns i: ['wlad']
S1: WLAD, CVC, hirder = [1, 2, 1], hirder kowal = 4
Hirder ger kowal = 4


An ger yw: fy
Niver a syllabennow yw: 1
Hag yns i: ['fy']
S1: FY, CV, hirder = [1, 2], hirder kowal = 3
Hirder ger kowal = 3


An ger yw: nhadau
Niver a syllabennow yw: 2
Hag yns i: ['nhad', 'au']
S1: NHAD, CVC, hirder = [1, 2, 1], hirder kowal = 4
S2: au, V, hirder = [1], hirder kowal = 1
Hirder ger kowal = 5


An ger yw: yn
Niver a syllabennow yw: 1
Hag yns i: ['yn']
S1: YN, VC, hirder = [2, 1], hirder kowal = 3
Hirder ger kowal = 3


An ger yw: annwyl
Niver a syllabennow yw: 2
Hag yns i: ['ann', 'wyl']
S1: ANN, VC, hirder = [1, 1], hirder kowal = 2
S2: wyl, CVC, hirder = [1, 1, 1], hirder kowal = 3
Hirder ger kowal = 5


An ger yw: i
Niver a syllabennow yw: 1
Hag yns i: ['i']
S1: I, V, hirder = [2], hirder kowal = 2
Hirder ger kowal = 2


An ger yw: mi
Niver a syllabennow yw: 1
Hag yns i: ['mi']
S1: MI, CV, hirder = [1, 2], hirder kowal = 3
Hirder ger kowal = 3


An ger yw: Gwlad
Niver a syllabennow yw: 1
Hag yns i: ['Gwlad']
S1: GWLAD, CVC, hirder = [1, 2, 1], hirder kowal = 4
Hirder ger kowal = 4


An ger yw: beirdd
Niver a syllabennow yw: 1
Hag yns i: ['beirdd']
S1: BEIRDD, CVC, hirder = [1, 1, 1], hirder kowal = 3
Hirder ger kowal = 3


An ger yw: a
Niver a syllabennow yw: 1
Hag yns i: ['a']
S1: a, V, hirder = [2], hirder kowal = 2
Hirder ger kowal = 2


An ger yw: chantorion
Niver a syllabennow yw: 3
Hag yns i: ['chant', 'or', 'ion']
S1: chant, CVC, hirder = [1, 1, 1], hirder kowal = 3
S2: OR, VC, hirder = [2, 1], hirder kowal = 3
S3: ion, CVC, hirder = [1, 1, 1], hirder kowal = 3
Hirder ger kowal = 9


An ger yw: enwogion
Niver a syllabennow yw: 3
Hag yns i: ['en', 'wog', 'ion']
S1: en, VC, hirder = [1, 1], hirder kowal = 2
S2: WOG, CVC, hirder = [1, 2, 1], hirder kowal = 4
S3: ion, CVC, hirder = [1, 1, 1], hirder kowal = 3
Hirder ger kowal = 9


An ger yw: o
Niver a syllabennow yw: 1
Hag yns i: ['o']
S1: O, V, hirder = [2], hirder kowal = 2
Hirder ger kowal = 2


An ger yw: fri
Niver a syllabennow yw: 1
Hag yns i: ['fri']
S1: FRI, CV, hirder = [1, 2], hirder kowal = 3
Hirder ger kowal = 3


An ger yw: Ei
Niver a syllabennow yw: 1
Hag yns i: ['Ei']
S1: EI, V, hirder = [2], hirder kowal = 2
Hirder ger kowal = 2


An ger yw: gwrol
Niver a syllabennow yw: 1
Hag yns i: ['gwrol']
S1: GWROL, CVC, hirder = [1, 2, 1], hirder kowal = 4
Hirder ger kowal = 4


An ger yw: ryfelwyr
Niver a syllabennow yw: 3
Hag yns i: ['ryf', 'el', 'wyr']
S1: ryf, CVC, hirder = [1, 1, 1], hirder kowal = 3
S2: EL, VC, hirder = [2, 1], hirder kowal = 3
S3: wyr, CVC, hirder = [1, 1, 1], hirder kowal = 3
Hirder ger kowal = 9


An ger yw: gwladgarwyr
Niver a syllabennow yw: 3
Hag yns i: ['gwlad', 'gar', 'wyr']
S1: gwlad, CVC, hirder = [1, 1, 1], hirder kowal = 3
S2: GAR, CVC, hirder = [1, 2, 1], hirder kowal = 4
S3: wyr, CVC, hirder = [1, 1, 1], hirder kowal = 3
Hirder ger kowal = 10


An ger yw: tra
Niver a syllabennow yw: 1
Hag yns i: ['tra']
S1: TRA, CV, hirder = [1, 2], hirder kowal = 3
Hirder ger kowal = 3


An ger yw: mad
Niver a syllabennow yw: 1
Hag yns i: ['mad']
S1: MAD, CVC, hirder = [1, 2, 1], hirder kowal = 4
Hirder ger kowal = 4


An ger yw: Dros
Niver a syllabennow yw: 1
Hag yns i: ['Dros']
S1: DROS, CVC, hirder = [1, 2, 1], hirder kowal = 4
Hirder ger kowal = 4


An ger yw: ryddid
Niver a syllabennow yw: 2
Hag yns i: ['rydd', 'id']
S1: RYDD, CVC, hirder = [1, 2, 1], hirder kowal = 4
S2: id, VC, hirder = [1, 1], hirder kowal = 2
Hirder ger kowal = 6


An ger yw: collasant
Niver a syllabennow yw: 3
Hag yns i: ['coll', 'as', 'ant']
S1: coll, CVC, hirder = [1, 1, 1], hirder kowal = 3
S2: AS, VC, hirder = [2, 1], hirder kowal = 3
S3: ant, VC, hirder = [1, 1], hirder kowal = 2
Hirder ger kowal = 8


An ger yw: eu
Niver a syllabennow yw: 1
Hag yns i: ['eu']
S1: EU, V, hirder = [2], hirder kowal = 2
Hirder ger kowal = 2


An ger yw: gwaed
Niver a syllabennow yw: 1
Hag yns i: ['gwaed']
S1: GWAED, CVC, hirder = [1, 2, 1], hirder kowal = 4
Hirder ger kowal = 4


An ger yw: Gwlad
Niver a syllabennow yw: 1
Hag yns i: ['Gwlad']
S1: GWLAD, CVC, hirder = [1, 2, 1], hirder kowal = 4
Hirder ger kowal = 4


An ger yw: gwlad
Niver a syllabennow yw: 1
Hag yns i: ['gwlad']
S1: GWLAD, CVC, hirder = [1, 2, 1], hirder kowal = 4
Hirder ger kowal = 4


An ger yw: pleidiol
Niver a syllabennow yw: 2
Hag yns i: ['pleid', 'iol']
S1: PLEID, CVC, hirder = [1, 2, 1], hirder kowal = 4
S2: iol, CVC, hirder = [1, 1, 1], hirder kowal = 3
Hirder ger kowal = 7


An ger yw: wyf
Niver a syllabennow yw: 1
Hag yns i: ['wyf']
S1: WYF, CVC, hirder = [1, 2, 1], hirder kowal = 4
Hirder ger kowal = 4


An ger yw: i'm
Niver a syllabennow yw: 1
Hag yns i: ["i'm"]
S1: I'M, VC, hirder = [2, 1], hirder kowal = 3
Hirder ger kowal = 3


An ger yw: gwlad
Niver a syllabennow yw: 1
Hag yns i: ['gwlad']
S1: GWLAD, CVC, hirder = [1, 2, 1], hirder kowal = 4
Hirder ger kowal = 4


An ger yw: Tra
Niver a syllabennow yw: 1
Hag yns i: ['Tra']
S1: TRA, CV, hirder = [1, 2], hirder kowal = 3
Hirder ger kowal = 3


An ger yw: môr
Niver a syllabennow yw: 1
Hag yns i: ['môr']
S1: MÔR, CVC, hirder = [1, 2, 1], hirder kowal = 4
Hirder ger kowal = 4


An ger yw: yn
Niver a syllabennow yw: 1
Hag yns i: ['yn']
S1: YN, VC, hirder = [2, 1], hirder kowal = 3
Hirder ger kowal = 3


An ger yw: fur
Niver a syllabennow yw: 1
Hag yns i: ['fur']
S1: FUR, CVC, hirder = [1, 2, 1], hirder kowal = 4
Hirder ger kowal = 4


An ger yw: i'r
Niver a syllabennow yw: 1
Hag yns i: ["i'r"]
S1: I'R, VC, hirder = [2, 1], hirder kowal = 3
Hirder ger kowal = 3


An ger yw: bur
Niver a syllabennow yw: 1
Hag yns i: ['bur']
S1: BUR, CVC, hirder = [1, 2, 1], hirder kowal = 4
Hirder ger kowal = 4


An ger yw: hoff
Niver a syllabennow yw: 1
Hag yns i: ['hoff']
S1: HOFF, CVC, hirder = [1, 2, 1], hirder kowal = 4
Hirder ger kowal = 4


An ger yw: bau
Niver a syllabennow yw: 1
Hag yns i: ['bau']
S1: BAU, CV, hirder = [1, 2], hirder kowal = 3
Hirder ger kowal = 3


An ger yw: O
Niver a syllabennow yw: 1
Hag yns i: ['O']
S1: O, V, hirder = [2], hirder kowal = 2
Hirder ger kowal = 2


An ger yw: bydded
Niver a syllabennow yw: 2
Hag yns i: ['bydd', 'ed']
S1: BYDD, CVC, hirder = [1, 2, 1], hirder kowal = 4
S2: ed, VC, hirder = [1, 1], hirder kowal = 2
Hirder ger kowal = 6


An ger yw: i'r
Niver a syllabennow yw: 1
Hag yns i: ["i'r"]
S1: I'R, VC, hirder = [2, 1], hirder kowal = 3
Hirder ger kowal = 3


An ger yw: hen
Niver a syllabennow yw: 1
Hag yns i: ['hen']
S1: HEN, CVC, hirder = [1, 2, 1], hirder kowal = 4
Hirder ger kowal = 4


An ger yw: iaith
Niver a syllabennow yw: 1
Hag yns i: ['iaith']
S1: IAITH, CVC, hirder = [1, 2, 1], hirder kowal = 4
Hirder ger kowal = 4


An ger yw: barhau
Niver a syllabennow yw: 2
Hag yns i: ['bar', 'hau']
S1: BAR, CVC, hirder = [1, 2, 1], hirder kowal = 4
S2: hau, CV, hirder = [1, 1], hirder kowal = 2
Hirder ger kowal = 6


An ger yw: Hen
Niver a syllabennow yw: 1
Hag yns i: ['Hen']
S1: HEN, CVC, hirder = [1, 2, 1], hirder kowal = 4
Hirder ger kowal = 4


An ger yw: Gymru
Niver a syllabennow yw: 2
Hag yns i: ['Gym', 'ru']
S1: GYM, CVC, hirder = [1, 2, 1], hirder kowal = 4
S2: ru, CV, hirder = [1, 1], hirder kowal = 2
Hirder ger kowal = 6


An ger yw: fynyddig
Niver a syllabennow yw: 3
Hag yns i: ['fyn', 'ydd', 'ig']
S1: fyn, CVC, hirder = [1, 1, 1], hirder kowal = 3
S2: YDD, VC, hirder = [2, 1], hirder kowal = 3
S3: ig, VC, hirder = [1, 1], hirder kowal = 2
Hirder ger kowal = 8


An ger yw: paradwys
Niver a syllabennow yw: 3
Hag yns i: ['par', 'ad', 'wys']
S1: par, CVC, hirder = [1, 1, 1], hirder kowal = 3
S2: AD, VC, hirder = [2, 1], hirder kowal = 3
S3: wys, CVC, hirder = [1, 1, 1], hirder kowal = 3
Hirder ger kowal = 9


An ger yw: y
Niver a syllabennow yw: 1
Hag yns i: ['y']
S1: y, V, hirder = [2], hirder kowal = 2
Hirder ger kowal = 2


An ger yw: bardd
Niver a syllabennow yw: 1
Hag yns i: ['bardd']
S1: BARDD, CVC, hirder = [1, 1, 1], hirder kowal = 3
Hirder ger kowal = 3


An ger yw: Pob
Niver a syllabennow yw: 1
Hag yns i: ['Pob']
S1: POB, CVC, hirder = [1, 2, 1], hirder kowal = 4
Hirder ger kowal = 4


An ger yw: dyffryn
Niver a syllabennow yw: 2
Hag yns i: ['dyff', 'ryn']
S1: DYFF, CVC, hirder = [1, 2, 1], hirder kowal = 4
S2: ryn, CVC, hirder = [1, 1, 1], hirder kowal = 3
Hirder ger kowal = 7


An ger yw: pob
Niver a syllabennow yw: 1
Hag yns i: ['pob']
S1: POB, CVC, hirder = [1, 2, 1], hirder kowal = 4
Hirder ger kowal = 4


An ger yw: clogwyn
Niver a syllabennow yw: 2
Hag yns i: ['clog', 'wyn']
S1: CLOG, CVC, hirder = [1, 2, 1], hirder kowal = 4
S2: wyn, CVC, hirder = [1, 1, 1], hirder kowal = 3
Hirder ger kowal = 7


An ger yw: i'm
Niver a syllabennow yw: 1
Hag yns i: ["i'm"]
S1: I'M, VC, hirder = [2, 1], hirder kowal = 3
Hirder ger kowal = 3


An ger yw: golwg
Niver a syllabennow yw: 2
Hag yns i: ['gol', 'wg']
S1: GOL, CVC, hirder = [1, 2, 1], hirder kowal = 4
S2: wg, VC, hirder = [1, 1], hirder kowal = 2
Hirder ger kowal = 6


An ger yw: sydd
Niver a syllabennow yw: 1
Hag yns i: ['sydd']
S1: SYDD, CVC, hirder = [1, 2, 1], hirder kowal = 4
Hirder ger kowal = 4


An ger yw: hardd
Niver a syllabennow yw: 1
Hag yns i: ['hardd']
S1: HARDD, CVC, hirder = [1, 1, 1], hirder kowal = 3
Hirder ger kowal = 3


An ger yw: Trwy
Niver a syllabennow yw: 1
Hag yns i: ['Trwy']
S1: TRWY, CV, hirder = [1, 2], hirder kowal = 3
Hirder ger kowal = 3


An ger yw: deimlad
Niver a syllabennow yw: 2
Hag yns i: ['deim', 'lad']
S1: DEIM, CVC, hirder = [1, 2, 1], hirder kowal = 4
S2: lad, CVC, hirder = [1, 1, 1], hirder kowal = 3
Hirder ger kowal = 7


An ger yw: gwladgarol
Niver a syllabennow yw: 3
Hag yns i: ['gwlad', 'gar', 'ol']
S1: gwlad, CVC, hirder = [1, 1, 1], hirder kowal = 3
S2: GAR, CVC, hirder = [1, 2, 1], hirder kowal = 4
S3: ol, VC, hirder = [1, 1], hirder kowal = 2
Hirder ger kowal = 9


An ger yw: mor
Niver a syllabennow yw: 1
Hag yns i: ['mor']
S1: MOR, CVC, hirder = [1, 2, 1], hirder kowal = 4
Hirder ger kowal = 4


An ger yw: swynol
Niver a syllabennow yw: 2
Hag yns i: ['swyn', 'ol']
S1: SWYN, CVC, hirder = [1, 2, 1], hirder kowal = 4
S2: ol, VC, hirder = [1, 1], hirder kowal = 2
Hirder ger kowal = 6


An ger yw: yw
Niver a syllabennow yw: 1
Hag yns i: ['yw']
S1: YW, V, hirder = [2], hirder kowal = 2
Hirder ger kowal = 2


An ger yw: si
Niver a syllabennow yw: 1
Hag yns i: ['si']
S1: SI, CV, hirder = [1, 2], hirder kowal = 3
Hirder ger kowal = 3


An ger yw: Ei
Niver a syllabennow yw: 1
Hag yns i: ['Ei']
S1: EI, V, hirder = [2], hirder kowal = 2
Hirder ger kowal = 2


An ger yw: nentydd
Niver a syllabennow yw: 2
Hag yns i: ['nent', 'ydd']
S1: NENT, CVC, hirder = [1, 1, 1], hirder kowal = 3
S2: ydd, VC, hirder = [1, 1], hirder kowal = 2
Hirder ger kowal = 5


An ger yw: afonydd
Niver a syllabennow yw: 3
Hag yns i: ['af', 'on', 'ydd']
S1: af, VC, hirder = [1, 1], hirder kowal = 2
S2: ON, VC, hirder = [2, 1], hirder kowal = 3
S3: ydd, VC, hirder = [1, 1], hirder kowal = 2
Hirder ger kowal = 7


An ger yw: i
Niver a syllabennow yw: 1
Hag yns i: ['i']
S1: I, V, hirder = [2], hirder kowal = 2
Hirder ger kowal = 2


An ger yw: mi
Niver a syllabennow yw: 1
Hag yns i: ['mi']
S1: MI, CV, hirder = [1, 2], hirder kowal = 3
Hirder ger kowal = 3


An ger yw: Os
Niver a syllabennow yw: 1
Hag yns i: ['Os']
S1: OS, VC, hirder = [2, 1], hirder kowal = 3
Hirder ger kowal = 3


An ger yw: treisiodd
Niver a syllabennow yw: 2
Hag yns i: ['treis', 'iodd']
S1: TREIS, CVC, hirder = [1, 2, 1], hirder kowal = 4
S2: iodd, CVC, hirder = [1, 1, 1], hirder kowal = 3
Hirder ger kowal = 7


An ger yw: y
Niver a syllabennow yw: 1
Hag yns i: ['y']
S1: y, V, hirder = [2], hirder kowal = 2
Hirder ger kowal = 2


An ger yw: gelyn
Niver a syllabennow yw: 2
Hag yns i: ['gel', 'yn']
S1: GEL, CVC, hirder = [1, 2, 1], hirder kowal = 4
S2: yn, VC, hirder = [1, 1], hirder kowal = 2
Hirder ger kowal = 6


An ger yw: fy
Niver a syllabennow yw: 1
Hag yns i: ['fy']
S1: FY, CV, hirder = [1, 2], hirder kowal = 3
Hirder ger kowal = 3


An ger yw: ngwlad
Niver a syllabennow yw: 1
Hag yns i: ['ngwlad']
S1: NGWLAD, CVC, hirder = [1, 2, 1], hirder kowal = 4
Hirder ger kowal = 4


An ger yw: tan
Niver a syllabennow yw: 1
Hag yns i: ['tan']
S1: TAN, CVC, hirder = [1, 2, 1], hirder kowal = 4
Hirder ger kowal = 4


An ger yw: ei
Niver a syllabennow yw: 1
Hag yns i: ['ei']
S1: EI, V, hirder = [2], hirder kowal = 2
Hirder ger kowal = 2


An ger yw: droed
Niver a syllabennow yw: 1
Hag yns i: ['droed']
S1: DROED, CVC, hirder = [1, 2, 1], hirder kowal = 4
Hirder ger kowal = 4


An ger yw: Mae
Niver a syllabennow yw: 1
Hag yns i: ['Mae']
S1: MAE, CV, hirder = [1, 2], hirder kowal = 3
Hirder ger kowal = 3


An ger yw: hen
Niver a syllabennow yw: 1
Hag yns i: ['hen']
S1: HEN, CVC, hirder = [1, 2, 1], hirder kowal = 4
Hirder ger kowal = 4


An ger yw: iaith
Niver a syllabennow yw: 1
Hag yns i: ['iaith']
S1: IAITH, CVC, hirder = [1, 2, 1], hirder kowal = 4
Hirder ger kowal = 4


An ger yw: y
Niver a syllabennow yw: 1
Hag yns i: ['y']
S1: y, V, hirder = [2], hirder kowal = 2
Hirder ger kowal = 2


An ger yw: Cymry
Niver a syllabennow yw: 2
Hag yns i: ['Cym', 'ry']
S1: CYM, CVC, hirder = [1, 2, 1], hirder kowal = 4
S2: ry, CV, hirder = [1, 1], hirder kowal = 2
Hirder ger kowal = 6


An ger yw: mor
Niver a syllabennow yw: 1
Hag yns i: ['mor']
S1: MOR, CVC, hirder = [1, 2, 1], hirder kowal = 4
Hirder ger kowal = 4


An ger yw: fyw
Niver a syllabennow yw: 1
Hag yns i: ['fyw']
S1: FYW, CV, hirder = [1, 2], hirder kowal = 3
Hirder ger kowal = 3


An ger yw: ag
Niver a syllabennow yw: 1
Hag yns i: ['ag']
S1: AG, VC, hirder = [2, 1], hirder kowal = 3
Hirder ger kowal = 3


An ger yw: erioed
Niver a syllabennow yw: 2
Hag yns i: ['er', 'ioed']
S1: ER, VC, hirder = [2, 1], hirder kowal = 3
S2: ioed, CVC, hirder = [1, 1, 1], hirder kowal = 3
Hirder ger kowal = 6


An ger yw: Ni
Niver a syllabennow yw: 1
Hag yns i: ['Ni']
S1: NI, CV, hirder = [1, 2], hirder kowal = 3
Hirder ger kowal = 3


An ger yw: luddiwyd
Niver a syllabennow yw: 2
Hag yns i: ['ludd', 'iwyd']
S1: LUDD, CVC, hirder = [1, 2, 1], hirder kowal = 4
S2: iwyd, CVC, hirder = [1, 1, 1], hirder kowal = 3
Hirder ger kowal = 7


An ger yw: yr
Niver a syllabennow yw: 1
Hag yns i: ['yr']
S1: YR, VC, hirder = [2, 1], hirder kowal = 3
Hirder ger kowal = 3


An ger yw: awen
Niver a syllabennow yw: 2
Hag yns i: ['aw', 'en']
S1: AW, V, hirder = [2], hirder kowal = 2
S2: en, VC, hirder = [1, 1], hirder kowal = 2
Hirder ger kowal = 4


An ger yw: gan
Niver a syllabennow yw: 1
Hag yns i: ['gan']
S1: GAN, CVC, hirder = [1, 2, 1], hirder kowal = 4
Hirder ger kowal = 4


An ger yw: erchyll
Niver a syllabennow yw: 2
Hag yns i: ['erch', 'yll']
S1: ERCH, VC, hirder = [1, 1], hirder kowal = 2
S2: yll, VC, hirder = [1, 1], hirder kowal = 2
Hirder ger kowal = 4


An ger yw: law
Niver a syllabennow yw: 1
Hag yns i: ['law']
S1: LAW, CV, hirder = [1, 2], hirder kowal = 3
Hirder ger kowal = 3


An ger yw: brad
Niver a syllabennow yw: 1
Hag yns i: ['brad']
S1: BRAD, CVC, hirder = [1, 2, 1], hirder kowal = 4
Hirder ger kowal = 4


An ger yw: Na
Niver a syllabennow yw: 1
Hag yns i: ['Na']
S1: Na, CV, hirder = [1, 2], hirder kowal = 3
Hirder ger kowal = 3


An ger yw: thelyn
Niver a syllabennow yw: 2
Hag yns i: ['thel', 'yn']
S1: THEL, CVC, hirder = [1, 2, 1], hirder kowal = 4
S2: yn, VC, hirder = [1, 1], hirder kowal = 2
Hirder ger kowal = 6


An ger yw: berseiniol
Niver a syllabennow yw: 3
Hag yns i: ['bers', 'ein', 'iol']
S1: bers, CVC, hirder = [1, 1, 1], hirder kowal = 3
S2: EIN, VC, hirder = [2, 1], hirder kowal = 3
S3: iol, CVC, hirder = [1, 1, 1], hirder kowal = 3
Hirder ger kowal = 9


An ger yw: fy
Niver a syllabennow yw: 1
Hag yns i: ['fy']
S1: FY, CV, hirder = [1, 2], hirder kowal = 3
Hirder ger kowal = 3


An ger yw: ngwlad
Niver a syllabennow yw: 1
Hag yns i: ['ngwlad']
S1: NGWLAD, CVC, hirder = [1, 2, 1], hirder kowal = 4
Hirder ger kowal = 4

Friday, 17 March 2017

Syllable segmentation now available for Standard Written Form Cornish

I have done some more work on the syllable segmentation module of my taklow-kernewek Python software.

I have added regular expressions to process text in the Standard Written Form of Cornish. These are still under testing and may not correctly process all SWF words.

It looks to me like the SWF specification is not currently available at the Cornish Language Office website. It was accessible via the MAGA website, which now redirects to the Cornish Language Office, which doesn't appear to link to the SWF specification PDF.

However it is online here: http://kernowek.net/Specification_Final_Version.pdf and the 2014 revisions are here: https://www.cornwall.gov.uk/media/21486879/swf-review-board-final-report.pdf. The SWF dictionary can be accessed as a PDF (pre-revision spelling) or as a website www.cornishdictionary.org.uk, which will soon be revised by Akademi Kernewek.

Peter Jenkin has recently produced some translations of some Welsh songs into Cornish, in FSS Traditional, which I have used to test my new regular expressions.

As an example, this is the translation of Calon Lân (Colon Lan):

Counting the number of syllables per word and in each line. "Tecca" should, I believe, be spelt "Tecka", which will be recognised as 2 syllables. "Precyous" is 2 syllables in the output reproduced at the end of this post. This screenshot was taken before 'cy' was added as a consonant (used in FSS in place of KK 'sh') in the FSS regular expressions.
Full syllable details. The syllable parts have lengths of either 1 (short) or 2 (long) in FSS, rather than 1 (short), 2 (half-long vowel / geminated consonant) or 3 (long vowel) as in Kemmyn.

The original text in Welsh is:


Calon lân yn llawn daioni,
Tecach yw na'r lili dlos:
Dim ond calon lân all ganu
Canu'r dydd a chanu'r nos.
Nid wy'n gofyn bywyd moethus,
Aur y byd na'i berlau mân:
Gofyn wyf am galon hapus,
Calon onest, calon lân.

Pe dymunwn olud bydol,
Hedyn buan ganddo sydd;
Golud calon lân, rinweddol,
Yn dwyn bythol elw fydd.

Hwyr a bore fy nymuniad 
Gwyd i'r nef ar adain cân
Ar i Dduw, er mwyn fy Ngheidwad,
Roddi i mi galon lân.
with Peter Jenkin's Cornish translation:

Colon lan yw leun a dhadder,
Tecca es lili precyous:
Ny yll saw colon lan cana -
Cana'n jydh ha cana'n nos.

1. Ny wovynnav bewnans pur es,
Owr an bys na'y berlys mann:
Govyn a wrav colon attes,
Colon onest, colon lan.
   
2. Pythow an bys ma, mar mynnen,
Dhe hasen uskis galsa va;
Lanow colon lan yw prest len
A brenvyth prow bynytha.
   
3. Gorthugher, myttin ow mynnas
A neyj war eskelly can -
Troha Duw, a-barth ow Gwithyas,
A ro dhymmo colon lan.


The output showing the number of syllables in each line is:

Colon:2  lan:1  yw:1  leun:1  a:1  dhadder:2  ,:0 
Niver a sylabennow y'n linenn = 8

Tecca:1  es:1  lili:2  precyous:2  ::0 
Niver a sylabennow y'n linenn = 6

Ny:1  yll:1  saw:1  colon:2  lan:1  cana:2  -:0 
Niver a sylabennow y'n linenn = 8

Cana'n:2  jydh:1  ha:1  cana'n:2  nos:1  .:0 
Niver a sylabennow y'n linenn = 7


Niver a sylabennow y'n linenn = 0

1:0  .:0  Ny:1  wovynnav:3  bewnans:2  pur:1  es:1  ,:0 
Niver a sylabennow y'n linenn = 8

Owr:1  an:1  bys:1  na'y:1  berlys:2  mann:1  ::0 
Niver a sylabennow y'n linenn = 7

Govyn:2  a:1  wrav:1  colon:2  attes:2  ,:0 
Niver a sylabennow y'n linenn = 8

Colon:2  onest:2  ,:0  colon:2  lan:1  .:0 
Niver a sylabennow y'n linenn = 7


Niver a sylabennow y'n linenn = 0

2:0  .:0  Pythow:2  an:1  bys:1  ma:1  ,:0  mar:1  mynnen:2  ,:0 
Niver a sylabennow y'n linenn = 8

Dhe:1  hasen:2  uskis:2  galsa:2  va:1  ;:0 
Niver a sylabennow y'n linenn = 8

Lanow:2  colon:2  lan:1  yw:1  prest:1  len:1 
Niver a sylabennow y'n linenn = 8

A:1  brenvyth:2  prow:1  bynytha:3  .:0 
Niver a sylabennow y'n linenn = 7


Niver a sylabennow y'n linenn = 0

3:0  .:0  Gorthugher:3  ,:0  myttin:2  ow:1  mynnas:2 
Niver a sylabennow y'n linenn = 8

A:1  neyj:1  war:1  eskelly:3  can:1  -:0 
Niver a sylabennow y'n linenn = 7

Troha:2  Duw:1  ,:0  a-barth:2  ow:1  Gwithyas:2  ,:0 
Niver a sylabennow y'n linenn = 8

A:1  ro:1  dhymmo:2  colon:2  lan:1  .:0 
Niver a sylabennow y'n linenn = 7

The word "Tecca" should probably be written "Tecka" in SWF/T, where it will have 2 syllables.

The long-form output showing syllable details is shown below. The lengths of vowels are either long (written as 2 units here) or short (1) in the Standard Written Form, rather than long being 3, half-long 2 and short 1 as in Kemmyn.

An ger yw: Colon
Niver a syllabennow yw: 2
Hag yns i: ['Col', 'on']
S1: COL, CVC, hirder = [1, 2, 1], hirder kowal = 4
S2: on, VC, hirder = [1, 1], hirder kowal = 2
Hirder ger kowal = 6

An ger yw: lan
Niver a syllabennow yw: 1
Hag yns i: ['lan']
S1: LAN, CVC, hirder = [1, 2, 1], hirder kowal = 4
Hirder ger kowal = 4

An ger yw: yw
Niver a syllabennow yw: 1
Hag yns i: ['yw']
S1: YW, V, hirder = [2], hirder kowal = 2
Hirder ger kowal = 2

An ger yw: leun
Niver a syllabennow yw: 1
Hag yns i: ['leun']
S1: LEUN, CVC, hirder = [1, 2, 1], hirder kowal = 4
Hirder ger kowal = 4

An ger yw: a
Niver a syllabennow yw: 1
Hag yns i: ['a']
S1: a, V, hirder = [2], hirder kowal = 2
Hirder ger kowal = 2

An ger yw: dhadder
Niver a syllabennow yw: 2
Hag yns i: ['dhadd', 'er']
S1: DHADD, CVC, hirder = [1, 1, 1], hirder kowal = 3
S2: er, VC, hirder = [1, 1], hirder kowal = 2
Hirder ger kowal = 5

An ger yw: Tecca
Niver a syllabennow yw: 1
Hag yns i: ['Te']
S1: TE, CV, hirder = [1, 2], hirder kowal = 3
Hirder ger kowal = 3

An ger yw: es
Niver a syllabennow yw: 1
Hag yns i: ['es']
S1: ES, VC, hirder = [2, 1], hirder kowal = 3
Hirder ger kowal = 3

An ger yw: lili
Niver a syllabennow yw: 2
Hag yns i: ['lil', 'i']
S1: LIL, CVC, hirder = [1, 2, 1], hirder kowal = 4
S2: i, V, hirder = [1], hirder kowal = 1
Hirder ger kowal = 5

An ger yw: precyous
Niver a syllabennow yw: 2
Hag yns i: ['precy', 'ous']
S1: PRECY, CVC, hirder = [1, 2, 1], hirder kowal = 4
S2: ous, VC, hirder = [1, 1], hirder kowal = 2
Hirder ger kowal = 6

An ger yw: Ny
Niver a syllabennow yw: 1
Hag yns i: ['Ny']
S1: Ny, CV, hirder = [1, 2], hirder kowal = 3
Hirder ger kowal = 3

An ger yw: yll
Niver a syllabennow yw: 1
Hag yns i: ['yll']
S1: YLL, VC, hirder = [1, 1], hirder kowal = 2
Hirder ger kowal = 2

An ger yw: saw
Niver a syllabennow yw: 1
Hag yns i: ['saw']
S1: SAW, CV, hirder = [1, 2], hirder kowal = 3
Hirder ger kowal = 3

An ger yw: colon
Niver a syllabennow yw: 2
Hag yns i: ['col', 'on']
S1: COL, CVC, hirder = [1, 2, 1], hirder kowal = 4
S2: on, VC, hirder = [1, 1], hirder kowal = 2
Hirder ger kowal = 6

An ger yw: lan
Niver a syllabennow yw: 1
Hag yns i: ['lan']
S1: LAN, CVC, hirder = [1, 2, 1], hirder kowal = 4
Hirder ger kowal = 4

An ger yw: cana
Niver a syllabennow yw: 2
Hag yns i: ['can', 'a']
S1: CAN, CVC, hirder = [1, 2, 1], hirder kowal = 4
S2: a, V, hirder = [1], hirder kowal = 1
Hirder ger kowal = 5

An ger yw: Cana'n
Niver a syllabennow yw: 2
Hag yns i: ['Can', "a'n"]
S1: CAN, CVC, hirder = [1, 2, 1], hirder kowal = 4
S2: a'n, VC, hirder = [1, 1], hirder kowal = 2
Hirder ger kowal = 6

An ger yw: jydh
Niver a syllabennow yw: 1
Hag yns i: ['jydh']
S1: JYDH, CVC, hirder = [1, 2, 1], hirder kowal = 4
Hirder ger kowal = 4

An ger yw: ha
Niver a syllabennow yw: 1
Hag yns i: ['ha']
S1: ha, CV, hirder = [1, 2], hirder kowal = 3
Hirder ger kowal = 3

An ger yw: cana'n
Niver a syllabennow yw: 2
Hag yns i: ['can', "a'n"]
S1: CAN, CVC, hirder = [1, 2, 1], hirder kowal = 4
S2: a'n, VC, hirder = [1, 1], hirder kowal = 2
Hirder ger kowal = 6

An ger yw: nos
Niver a syllabennow yw: 1
Hag yns i: ['nos']
S1: NOS, CVC, hirder = [1, 2, 1], hirder kowal = 4
Hirder ger kowal = 4

An ger yw: 1
Niver a syllabennow yw: 0
Hag yns i: []
Hirder ger kowal = 0

An ger yw: Ny
Niver a syllabennow yw: 1
Hag yns i: ['Ny']
S1: Ny, CV, hirder = [1, 2], hirder kowal = 3
Hirder ger kowal = 3

An ger yw: wovynnav
Niver a syllabennow yw: 3
Hag yns i: ['wov', 'ynn', 'av']
S1: wov, CVC, hirder = [1, 1, 1], hirder kowal = 3
S2: YNN, VC, hirder = [1, 1], hirder kowal = 2
S3: av, VC, hirder = [1, 1], hirder kowal = 2
Hirder ger kowal = 7

An ger yw: bewnans
Niver a syllabennow yw: 2
Hag yns i: ['bewn', 'ans']
S1: BEWN, CVC, hirder = [1, 2, 1], hirder kowal = 4
S2: ans, VC, hirder = [1, 1], hirder kowal = 2
Hirder ger kowal = 6

An ger yw: pur
Niver a syllabennow yw: 1
Hag yns i: ['pur']
S1: PUR, CVC, hirder = [1, 2, 1], hirder kowal = 4
Hirder ger kowal = 4

An ger yw: es
Niver a syllabennow yw: 1
Hag yns i: ['es']
S1: ES, VC, hirder = [2, 1], hirder kowal = 3
Hirder ger kowal = 3

An ger yw: Owr
Niver a syllabennow yw: 1
Hag yns i: ['Owr']
S1: OWR, VC, hirder = [2, 1], hirder kowal = 3
Hirder ger kowal = 3

An ger yw: an
Niver a syllabennow yw: 1
Hag yns i: ['an']
S1: an, VC, hirder = [2, 1], hirder kowal = 3
Hirder ger kowal = 3

An ger yw: bys
Niver a syllabennow yw: 1
Hag yns i: ['bys']
S1: BYS, CVC, hirder = [1, 2, 1], hirder kowal = 4
Hirder ger kowal = 4

An ger yw: na'y
Niver a syllabennow yw: 1
Hag yns i: ["na'y"]
S1: NA'Y, CV, hirder = [1, 2], hirder kowal = 3
Hirder ger kowal = 3

An ger yw: berlys
Niver a syllabennow yw: 2
Hag yns i: ['berl', 'ys']
S1: BERL, CVC, hirder = [1, 1, 1], hirder kowal = 3
S2: ys, VC, hirder = [1, 1], hirder kowal = 2
Hirder ger kowal = 5

An ger yw: mann
Niver a syllabennow yw: 1
Hag yns i: ['mann']
S1: MANN, CVC, hirder = [1, 1, 1], hirder kowal = 3
Hirder ger kowal = 3

An ger yw: Govyn
Niver a syllabennow yw: 2
Hag yns i: ['Gov', 'yn']
S1: GOV, CVC, hirder = [1, 2, 1], hirder kowal = 4
S2: yn, VC, hirder = [1, 1], hirder kowal = 2
Hirder ger kowal = 6

An ger yw: a
Niver a syllabennow yw: 1
Hag yns i: ['a']
S1: a, V, hirder = [2], hirder kowal = 2
Hirder ger kowal = 2

An ger yw: wrav
Niver a syllabennow yw: 1
Hag yns i: ['wrav']
S1: WRAV, CVC, hirder = [1, 2, 1], hirder kowal = 4
Hirder ger kowal = 4

An ger yw: colon
Niver a syllabennow yw: 2
Hag yns i: ['col', 'on']
S1: COL, CVC, hirder = [1, 2, 1], hirder kowal = 4
S2: on, VC, hirder = [1, 1], hirder kowal = 2
Hirder ger kowal = 6

An ger yw: attes
Niver a syllabennow yw: 2
Hag yns i: ['att', 'es']
S1: ATT, VC, hirder = [1, 1], hirder kowal = 2
S2: es, VC, hirder = [1, 1], hirder kowal = 2
Hirder ger kowal = 4

An ger yw: Colon
Niver a syllabennow yw: 2
Hag yns i: ['Col', 'on']
S1: COL, CVC, hirder = [1, 2, 1], hirder kowal = 4
S2: on, VC, hirder = [1, 1], hirder kowal = 2
Hirder ger kowal = 6

An ger yw: onest
Niver a syllabennow yw: 2
Hag yns i: ['on', 'est']
S1: ON, VC, hirder = [2, 1], hirder kowal = 3
S2: est, VC, hirder = [1, 1], hirder kowal = 2
Hirder ger kowal = 5

An ger yw: colon
Niver a syllabennow yw: 2
Hag yns i: ['col', 'on']
S1: COL, CVC, hirder = [1, 2, 1], hirder kowal = 4
S2: on, VC, hirder = [1, 1], hirder kowal = 2
Hirder ger kowal = 6

An ger yw: lan
Niver a syllabennow yw: 1
Hag yns i: ['lan']
S1: LAN, CVC, hirder = [1, 2, 1], hirder kowal = 4
Hirder ger kowal = 4

An ger yw: 2
Niver a syllabennow yw: 0
Hag yns i: []
Hirder ger kowal = 0

An ger yw: Pythow
Niver a syllabennow yw: 2
Hag yns i: ['Pyth', 'ow']
S1: PYTH, CVC, hirder = [1, 2, 1], hirder kowal = 4
S2: ow, V, hirder = [1], hirder kowal = 1
Hirder ger kowal = 5

An ger yw: an
Niver a syllabennow yw: 1
Hag yns i: ['an']
S1: an, VC, hirder = [2, 1], hirder kowal = 3
Hirder ger kowal = 3

An ger yw: bys
Niver a syllabennow yw: 1
Hag yns i: ['bys']
S1: BYS, CVC, hirder = [1, 2, 1], hirder kowal = 4
Hirder ger kowal = 4

An ger yw: ma
Niver a syllabennow yw: 1
Hag yns i: ['ma']
S1: ma, CV, hirder = [1, 2], hirder kowal = 3
Hirder ger kowal = 3

An ger yw: mar
Niver a syllabennow yw: 1
Hag yns i: ['mar']
S1: mar, CVC, hirder = [1, 2, 1], hirder kowal = 4
Hirder ger kowal = 4

An ger yw: mynnen
Niver a syllabennow yw: 2
Hag yns i: ['mynn', 'en']
S1: MYNN, CVC, hirder = [1, 1, 1], hirder kowal = 3
S2: en, VC, hirder = [1, 1], hirder kowal = 2
Hirder ger kowal = 5

An ger yw: Dhe
Niver a syllabennow yw: 1
Hag yns i: ['Dhe']
S1: Dhe, CV, hirder = [1, 2], hirder kowal = 3
Hirder ger kowal = 3

An ger yw: hasen
Niver a syllabennow yw: 2
Hag yns i: ['has', 'en']
S1: HAS, CVC, hirder = [1, 2, 1], hirder kowal = 4
S2: en, VC, hirder = [1, 1], hirder kowal = 2
Hirder ger kowal = 6

An ger yw: uskis
Niver a syllabennow yw: 2
Hag yns i: ['usk', 'is']
S1: USK, VC, hirder = [2, 1], hirder kowal = 3
S2: is, VC, hirder = [1, 1], hirder kowal = 2
Hirder ger kowal = 5

An ger yw: galsa
Niver a syllabennow yw: 2
Hag yns i: ['gals', 'a']
S1: GALS, CVC, hirder = [1, 1, 1], hirder kowal = 3
S2: a, V, hirder = [1], hirder kowal = 1
Hirder ger kowal = 4

An ger yw: va
Niver a syllabennow yw: 1
Hag yns i: ['va']
S1: VA, CV, hirder = [1, 2], hirder kowal = 3
Hirder ger kowal = 3

An ger yw: Lanow
Niver a syllabennow yw: 2
Hag yns i: ['Lan', 'ow']
S1: LAN, CVC, hirder = [1, 2, 1], hirder kowal = 4
S2: ow, V, hirder = [1], hirder kowal = 1
Hirder ger kowal = 5

An ger yw: colon
Niver a syllabennow yw: 2
Hag yns i: ['col', 'on']
S1: COL, CVC, hirder = [1, 2, 1], hirder kowal = 4
S2: on, VC, hirder = [1, 1], hirder kowal = 2
Hirder ger kowal = 6

An ger yw: lan
Niver a syllabennow yw: 1
Hag yns i: ['lan']
S1: LAN, CVC, hirder = [1, 2, 1], hirder kowal = 4
Hirder ger kowal = 4

An ger yw: yw
Niver a syllabennow yw: 1
Hag yns i: ['yw']
S1: YW, V, hirder = [2], hirder kowal = 2
Hirder ger kowal = 2

An ger yw: prest
Niver a syllabennow yw: 1
Hag yns i: ['prest']
S1: PREST, CVC, hirder = [1, 2, 1], hirder kowal = 4
Hirder ger kowal = 4

An ger yw: len
Niver a syllabennow yw: 1
Hag yns i: ['len']
S1: LEN, CVC, hirder = [1, 2, 1], hirder kowal = 4
Hirder ger kowal = 4

An ger yw: A
Niver a syllabennow yw: 1
Hag yns i: ['A']
S1: A, V, hirder = [2], hirder kowal = 2
Hirder ger kowal = 2

An ger yw: brenvyth
Niver a syllabennow yw: 2
Hag yns i: ['bren', 'vyth']
S1: BREN, CVC, hirder = [1, 2, 1], hirder kowal = 4
S2: vyth, CVC, hirder = [1, 1, 1], hirder kowal = 3
Hirder ger kowal = 7

An ger yw: prow
Niver a syllabennow yw: 1
Hag yns i: ['prow']
S1: PROW, CV, hirder = [1, 2], hirder kowal = 3
Hirder ger kowal = 3

An ger yw: bynytha
Niver a syllabennow yw: 3
Hag yns i: ['byn', 'yth', 'a']
S1: byn, CVC, hirder = [1, 1, 1], hirder kowal = 3
S2: YTH, VC, hirder = [2, 1], hirder kowal = 3
S3: a, V, hirder = [1], hirder kowal = 1
Hirder ger kowal = 7

An ger yw: 3
Niver a syllabennow yw: 0
Hag yns i: []
Hirder ger kowal = 0

An ger yw: Gorthugher
Niver a syllabennow yw: 3
Hag yns i: ['Gorth', 'ugh', 'er']
S1: Gorth, CVC, hirder = [1, 1, 1], hirder kowal = 3
S2: UGH, VC, hirder = [2, 1], hirder kowal = 3
S3: er, VC, hirder = [1, 1], hirder kowal = 2
Hirder ger kowal = 8

An ger yw: myttin
Niver a syllabennow yw: 2
Hag yns i: ['mytt', 'in']
S1: MYTT, CVC, hirder = [1, 1, 1], hirder kowal = 3
S2: in, VC, hirder = [1, 1], hirder kowal = 2
Hirder ger kowal = 5

An ger yw: ow
Niver a syllabennow yw: 1
Hag yns i: ['ow']
S1: ow, V, hirder = [2], hirder kowal = 2
Hirder ger kowal = 2

An ger yw: mynnas
Niver a syllabennow yw: 2
Hag yns i: ['mynn', 'as']
S1: MYNN, CVC, hirder = [1, 1, 1], hirder kowal = 3
S2: as, VC, hirder = [1, 1], hirder kowal = 2
Hirder ger kowal = 5

An ger yw: A
Niver a syllabennow yw: 1
Hag yns i: ['A']
S1: A, V, hirder = [2], hirder kowal = 2
Hirder ger kowal = 2

An ger yw: neyj
Niver a syllabennow yw: 1
Hag yns i: ['neyj']
S1: NEYJ, CV, hirder = [1, 2], hirder kowal = 3
Hirder ger kowal = 3

An ger yw: war
Niver a syllabennow yw: 1
Hag yns i: ['war']
S1: WAR, CVC, hirder = [1, 2, 1], hirder kowal = 4
Hirder ger kowal = 4

An ger yw: eskelly
Niver a syllabennow yw: 3
Hag yns i: ['esk', 'ell', 'y']
S1: esk, VC, hirder = [1, 1], hirder kowal = 2
S2: ELL, VC, hirder = [1, 1], hirder kowal = 2
S3: y, V, hirder = [1], hirder kowal = 1
Hirder ger kowal = 5

An ger yw: can
Niver a syllabennow yw: 1
Hag yns i: ['can']
S1: CAN, CVC, hirder = [1, 2, 1], hirder kowal = 4
Hirder ger kowal = 4

An ger yw: Troha
Niver a syllabennow yw: 2
Hag yns i: ['Tro', 'ha']
S1: TRO, CV, hirder = [1, 2], hirder kowal = 3
S2: ha, CV, hirder = [1, 1], hirder kowal = 2
Hirder ger kowal = 5

An ger yw: Duw
Niver a syllabennow yw: 1
Hag yns i: ['Duw']
S1: DUW, CV, hirder = [1, 2], hirder kowal = 3
Hirder ger kowal = 3

An ger yw: a-barth
Niver a syllabennow yw: 2
Hag yns i: ['a-', 'barth']
S1: A-, V, hirder = [2], hirder kowal = 2
S2: barth, CVC, hirder = [1, 1, 1], hirder kowal = 3
Hirder ger kowal = 5

An ger yw: ow
Niver a syllabennow yw: 1
Hag yns i: ['ow']
S1: ow, V, hirder = [2], hirder kowal = 2
Hirder ger kowal = 2

An ger yw: Gwithyas
Niver a syllabennow yw: 2
Hag yns i: ['Gwith', 'yas']
S1: GWITH, CVC, hirder = [1, 2, 1], hirder kowal = 4
S2: yas, CVC, hirder = [1, 1, 1], hirder kowal = 3
Hirder ger kowal = 7

An ger yw: A
Niver a syllabennow yw: 1
Hag yns i: ['A']
S1: A, V, hirder = [2], hirder kowal = 2
Hirder ger kowal = 2

An ger yw: ro
Niver a syllabennow yw: 1
Hag yns i: ['ro']
S1: RO, CV, hirder = [1, 2], hirder kowal = 3
Hirder ger kowal = 3

An ger yw: dhymmo
Niver a syllabennow yw: 2
Hag yns i: ['dhymm', 'o']
S1: DHYMM, CVC, hirder = [1, 1, 1], hirder kowal = 3
S2: o, V, hirder = [1], hirder kowal = 1
Hirder ger kowal = 4

An ger yw: colon
Niver a syllabennow yw: 2
Hag yns i: ['col', 'on']
S1: COL, CVC, hirder = [1, 2, 1], hirder kowal = 4
S2: on, VC, hirder = [1, 1], hirder kowal = 2
Hirder ger kowal = 6

An ger yw: lan
Niver a syllabennow yw: 1
Hag yns i: ['lan']
S1: LAN, CVC, hirder = [1, 2, 1], hirder kowal = 4
Hirder ger kowal = 4





Thursday, 19 January 2017

Can Y Cardi re-released for President Dukat inauguration

President Dukat is inaugurated today. The Dukat-Brunt team encourage all citizens of the galaxy to celebrate with this new version of Can Y Cardi (The Cardi Song).









Earth's political crisis continues after the planet voted to leave the United Federation of Planets in a recent referendum.

The Prime Minister of Earth, Tresera June, still maintains that the planet will not only press ahead with plans to leave the Federation, but also leave the UFP single market. The Romulan government expects that this may give an opportunity to bring an end to the trade sanctions imposed by the Federation after the Romulan annexation of two star systems in the Neutral Zone. This could even lead to the legalisation of Romulan Ale on Earth. However, this would mean a hard border with Mars and its dependent asteroid colonies, which will be remaining in the Federation, and the possibility of another independence referendum on Luna.

Sunday, 16 October 2016

An atlas of Wales in Welsh - data on OpenStreetMap

I recently saw a tweet by @mikeparkerwales about a new atlas of Wales in Welsh being prepared by @dafyddelfryn.

A Gaelic map of Scotland has also recently been produced by Paul Kavanagh.
As well as my own work, there is also Justin Cozart's map of Cornwall in Cornish.

I have produced a set of cycling maps of Wales where I used the placenames from OpenStreetMap. My procedure was to prefer the Welsh name, i.e. use the name:cy tag where it existed, otherwise the name tag.
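The name preference described above can be sketched in a few lines of Python. This is only an illustration: the function name `preferred_name` and the dictionary-of-tags input are assumptions for the example, not the code actually used for the maps, and the sample tag values are illustrative rather than copied from an OpenStreetMap extract.

```python
def preferred_name(tags):
    """Return the Welsh name if one is tagged, otherwise the default name.

    `tags` is assumed to be a dict of OSM key -> value for one place.
    Returns None when neither tag carries a usable value.
    """
    welsh = tags.get("name:cy", "").strip()
    if welsh:
        return welsh
    return tags.get("name", "").strip() or None


# Illustrative tag sets (hypothetical values, not taken from real data).
examples = [
    {"name": "Cardiff", "name:cy": "Caerdydd"},
    {"name": "Machynlleth"},
]
labels = [preferred_name(tags) for tags in examples]
```

Treating an empty name:cy value as missing is a design choice here, so that a blank tag does not hide the ordinary name.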

I couldn't actually remember how I had got the Welsh names, since they are not in the standard shapefile downloads from geofabrik.de that I have usually used.

However, the full set of tags is available in the .pbf file at download.geofabrik.de/europe/great-britain.html, which needs a little more work: first converting it to full XML with osmconvert, and then turning that into SpatialiteDB format. The tutorial here explains a little of how this is done.
This process can be horribly inefficient, since a large .pbf file is downloaded and then processed into an even larger XML file. I think there are other ways to do it that avoid this, which I may look into in future.

For the Welsh names, it is necessary to look for the name and name:cy tags, and maybe also name:en tags, which exist for some places, though this is usually just a duplication of name. A very few places have an alt_name:cy tag, where there is more than one Welsh name current for a particular place.
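As a rough sketch of pulling just these tags out of the converted XML, the following streams through the file with Python's standard library instead of loading it all at once. It is not the Spatialite route described above, only a simpler stand-in; the function name `iter_named_nodes` is hypothetical, only nodes are examined (not ways or relations), and the coordinates in the sample are made up.

```python
import io
import xml.etree.ElementTree as ET

# The tags of interest mentioned in the post.
NAME_KEYS = ("name", "name:cy", "name:en", "alt_name:cy")


def iter_named_nodes(source):
    """Yield (node_id, {key: value}) for nodes carrying any of NAME_KEYS."""
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == "node":
            names = {
                tag.get("k"): tag.get("v")
                for tag in elem.findall("tag")
                if tag.get("k") in NAME_KEYS
            }
            if names:
                yield elem.get("id"), names
        if elem.tag in ("node", "way", "relation"):
            elem.clear()  # drop child elements to limit memory use


# Tiny illustrative document in OSM XML layout (coordinates are invented).
SAMPLE = b"""<osm version="0.6">
  <node id="1" lat="52.72" lon="-4.05">
    <tag k="name" v="Barmouth"/>
    <tag k="alt_name:cy" v="Y Bermo"/>
    <tag k="place" v="town"/>
  </node>
  <node id="2" lat="52.0" lon="-4.0"/>
</osm>"""

rows = list(iter_named_nodes(io.BytesIO(SAMPLE)))
```

Passing a filename instead of the `io.BytesIO` object should work the same way, since `iterparse` accepts either.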





See also WikiProject Wales on the OpenStreetMap wiki. Unfortunately, neither the Welsh language OpenStreetMap rendering at http://brasskipper.org.uk/cyosm nor the multilingual test page by Jochen Topf currently seems to be working.

Here are some basic renderings of placenames in QGIS, using the places.shp from the geofabrik.de shapefile of Wales, and 'joining' this to the name:cy, name:en and alt_name:cy tags from the Spatialite version.

Mid-Wales. Most names only have the name tag, and where a separate name:cy tag exists, the name tag is generally the English name, although in the case of Ffwrnais, it is a bilingual form. Where name:en exists, it is usually a duplicate of name, but in one case (Pontfaen / Forge), there are name:cy and name:en tags but no name tag. Barmouth has an alt_name:cy tag of "Y Bermo" defined.

Holyhead. Similarly, where name:cy exists, name is sometimes the English version and sometimes a bilingual form.
Cardiff. Many bilingual names are in evidence; the general practice here seems to be to use name:cy for the Welsh name and name for the English name, and where name:en exists, it is usually simply a duplicate of name.

Bridgend. The remaining two places in the populated places shapefile that have an alt_name:cy defined are here.




Thursday, 11 August 2016

Welsh language internet memes

Here are a few captioned pictures in Welsh that I previously put out on clecs.cymru (like Twitter, but in Welsh), and a couple of Cornish ones:

Based on the song Can y Cardi



Mae Powys yn wlad fawr iawn ("Powys is a very big country"). Scattered towns separated by vast tracts of conifers...
Mae e yn Forg! ("He is a Borg!")
A typical summer Saturday on the main train line through Cornwall. In Cornish it is not recommended to use "war an tren"; use "y'n tren" instead if you are indeed travelling inside the carriage.
This may look similar to a screenshot from Poldark but it is in fact from the Cornish version of Lord of the Rings, in a scene showing the hobbits in the Old Forest (An Hen Goeswik or An Goeswik Goth).

Monday, 9 November 2015

Cornish Language Shop 'Kowsva' appears on Newyddion on S4C

Recently the Cornish language shop Kowsva run by Kowethas an Yeth Kernewek featured on the national TV news in Wales, appearing on Newyddion on 3rd November on S4C.

Here is the report, to which I added subtitles in English and Cornish. The Cornish subtitles are labelled as Corsican due to auto-complete.

Please click on the 'cog' icon at the lower-right of the video to choose the subtitle language.

Tuesday, 18 November 2014

An unusual application of a MaxEnt habitat suitability model

The MaxEnt software is often used by ecologists and others for species habitat modelling based on environmental layers.

The data I used from the 2011 UK census (England, Wales and Cornwall) were: 1. those with a skill in the Welsh language (the full question was only asked of census respondents living in Wales), and 2. those self-describing as Cornish for national identity.

The data is converted from census output polygons to dots randomly placed within the part of each output polygon below 300m altitude.
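The dot placement was done with a QGIS plugin, but the idea can be sketched in plain Python as rejection sampling: draw random points in the polygon's bounding box and keep those that fall inside the polygon. Everything named here is an assumption for illustration: `point_in_polygon` and `random_points` are hypothetical helpers, polygons are simple lists of (x, y) vertices without holes, and the `allowed` callback merely stands in for the 300m altitude mask.

```python
import random


def point_in_polygon(x, y, polygon):
    """Even-odd ray casting test; polygon is a list of (x, y) vertices."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through y?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def random_points(polygon, count, allowed=lambda x, y: True,
                  rng=None, max_tries=100000):
    """Return up to `count` random points inside `polygon`.

    `allowed(x, y)` is a stand-in for an extra mask such as "below 300m";
    a real version would look the point up in an elevation raster.
    """
    rng = rng or random.Random()
    xs = [p[0] for p in polygon]
    ys = [p[1] for p in polygon]
    points = []
    tries = 0
    while len(points) < count and tries < max_tries:
        tries += 1
        x = rng.uniform(min(xs), max(xs))
        y = rng.uniform(min(ys), max(ys))
        if point_in_polygon(x, y, polygon) and allowed(x, y):
            points.append((x, y))
    return points


# An L-shaped test polygon, one dot per (imaginary) person.
l_shape = [(0, 0), (2, 0), (2, 1), (1, 1), (1, 2), (0, 2)]
dots = random_points(l_shape, 50, rng=random.Random(1))
```

The `max_tries` cap means a polygon that is entirely masked out simply yields no dots, which matches the case of an output area lying wholly above the altitude limit.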

Although there is quite a lot of land above 300m in Wales, there are actually only one or two census output area polygons that entirely disappear when terrain above 300m is cut out. So if you're in Blaenavon, apologies for deleting you.

Using the environmental layers of elevation, slope (from Shuttle Radar Topography Mission) and distance from the coast, this is the output:

Notice that the habitat suitability for Welsh speakers is actually higher in areas such as the North York Moors and North Devon than on Ynys Môn.


Habitat suitability drops off further than 60km from the coast

Altitudes of 200m-300m appear to be most suitable for Welsh speakers according to the observations of the census data.

The Welsh speakers are not suited to living on flat terrain.

A range of coastal areas are suitable for resettlement of the Cornish in the event of, for example, an unexpected reactivation of the igneous activity of the Cornubian batholith.







Tuesday, 8 July 2014

Speakers of the Welsh language according to 2011 census.

Much was written about a relatively small drop in the percentage of Welsh speakers in Wales as recorded by the 2011 census. I'm sure an astronomer wouldn't believe it was anything other than statistical noise if her data showed a 1% change from one survey to another....

Nevertheless, it is possible to visualise the data in a different way to the standard colorised maps you often see about these things.

One way is to restrict the census output polygons to where buildings exist, as the Datashine project did. However, their website does not display statistics for Welsh language skills, since the detailed question was not asked of census respondents living outside Wales.

How about we use a QGIS plugin to give each Welsh speaker in Wales (or actually here, anyone claiming any skill in Welsh) a circular piece of land 50 metres wide, randomly located somewhere below 300 metres above sea level in their census output area polygon:


So here we have the opposite problem to that of typical colourised choropleth maps, where large but sparsely populated areas dominate visually: namely, that denser areas are oversaturated at this scale.

It is also possible to take this random dot distribution and make it into a heatmap (click on the image for a larger version):
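For a feel of what turning dots into a heatmap involves, here is a deliberately simplified sketch. A heatmap tool would typically smooth each dot with a kernel; this example swaps in plain grid counting, which gives a blockier result. The function name `grid_counts` and the 100-unit cell size are assumptions for illustration only.

```python
from collections import Counter


def grid_counts(points, cell_size):
    """Count points per square cell; keys are (column, row) indices."""
    counts = Counter()
    for x, y in points:
        counts[(int(x // cell_size), int(y // cell_size))] += 1
    return counts


# Four illustrative dots in projected coordinates (made-up values).
sample_dots = [(10, 10), (20, 40), (120, 10), (-5, 5)]
density = grid_counts(sample_dots, 100)
```

Each cell's count can then be mapped to a colour, with a larger cell size giving a smoother but coarser picture.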

I also downloaded the OS OpenData buildings layer for the relevant grid squares covering Wales, and produced another dot distribution (this took QGIS some time).

This produces the following dot maps, giving each Welsh speaker 50 metre and 20 metre diameter circles of land respectively:

Heatmaps: