Saturday, 6 May 2017

Bilingual interface to Cornish Corpus Statistics Python GUI application and switching between Kemmyn and manuscript spelling

As part of my taklow-kernewek tools, I created an application that can do some corpus statistics on Cornish texts, and is configurable at run-time to a certain extent.

I have made a few improvements to the files cornish_corpus.py and corpus_wordfreqGUI.py, including a bilingual interface and the ability to switch between Kernewek Kemmyn and manuscript spelling (or at least a reading of such).

To create a switchable bilingual interface, I overhauled the GUI code to make it more object orientated, and created a set of dictionaries where the keys each refer to another dictionary with 2 elements {'en': 'English text', 'kw': 'Cornish text'}.

A button in the GUI then runs a function that changes the interface language and updates the text of all the relevant widgets to the new language.
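
As a rough illustration of the idea (not the actual code from corpus_wordfreqGUI.py), the label table and the switching function could look something like this. The names LABELS, apply_language and other_language are made up for this sketch, and the strings are placeholders rather than the real interface text.

```python
# Hypothetical label table: each key refers to a two-element dictionary,
# one entry per interface language. The strings are placeholders.
LABELS = {
    "heading": {"en": "English heading", "kw": "Cornish heading"},
    "go_button": {"en": "English button text", "kw": "Cornish button text"},
}


def apply_language(widgets, lang):
    """Set the text of every widget to its label in `lang` ('en' or 'kw').

    `widgets` maps a key of LABELS to any object with a Tkinter-style
    config(text=...) method, e.g. a tkinter.Label or tkinter.Button.
    """
    for key, widget in widgets.items():
        widget.config(text=LABELS[key][lang])


def other_language(lang):
    """Return the language the switch button should change to."""
    return "kw" if lang == "en" else "en"
```

The language button's command would then call something like apply_language(widgets, other_language(current_lang)) and remember the new language.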

The interface language can also be set at the command line when running corpus_wordfreqGUI.py: the -e switch launches it with the English interface.
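
For anyone wanting to do something similar, here is a small sketch of handling such switches with argparse. It assumes argparse is used and that Cornish is the default interface when -e is absent; the -m switch is the manuscript spelling one mentioned later in this post. The dest names and the initial_language helper are made up for the example.

```python
import argparse


def parse_args(argv=None):
    """Parse the two command line switches described in this post."""
    parser = argparse.ArgumentParser(
        description="Cornish corpus word frequency GUI (sketch)")
    parser.add_argument("-e", dest="english", action="store_true",
                        help="launch with the English interface")
    parser.add_argument("-m", dest="manuscript", action="store_true",
                        help="use manuscript spelling texts rather than Kemmyn")
    return parser.parse_args(argv)


def initial_language(args):
    """Assumed default: Cornish ('kw') unless -e was given."""
    return "en" if args.english else "kw"
```

Calling parse_args() with no argument reads the real command line; passing a list such as ["-e"] is handy for trying it out.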

English interface. The button at the lower left allows switching between the two.

Cornish interface

I have also fixed a bug that occurred when the list of word frequencies was generated and there were no words longer than the specified number of letters. Internally, cornish_corpus.py finds the length of the longest word so that the output text can be spaced appropriately. It now checks whether the list of (word, frequency) tuples is empty, to avoid an indexing error.
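
A minimal sketch of that kind of guard is below. The function name format_word_freqs is made up, and using max() to find the longest word is an assumption for the example; the real code in cornish_corpus.py may find it differently.

```python
def format_word_freqs(word_freqs):
    """Return aligned 'word  frequency' lines for a list of (word, frequency) tuples."""
    if not word_freqs:
        # Nothing passed the length filter: return early rather than
        # looking for the longest word in an empty list.
        return ""
    width = max(len(word) for word, _ in word_freqs)
    return "\n".join(
        "{}  {}".format(word.ljust(width), freq) for word, freq in word_freqs
    )
```

With an empty list this simply returns an empty string, so the GUI has nothing to display but does not crash.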

The other thing I have done is fix a bug that occurred when the manuscript spelling was selected (previously only via the command line -m switch, but now also in the GUI, as below). The list of texts available in manuscript spelling differs from the one in Kemmyn, which had previously caused the program to raise an index error in some cases.
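
One way to guard against this sort of problem, shown purely as an illustration and not necessarily how the fix was made, is to check the selected index against the list for the current mode. The TEXTS table, its placeholder entries and the select_text function are all made up for this sketch.

```python
# Placeholder text lists; the real lists are defined in the program itself.
TEXTS = {
    "kemmyn": ["Text A", "Text B", "Text C"],
    "manuscript": ["Text A", "Text B"],
}


def select_text(mode, index):
    """Return (index, name) for the chosen text in the given mode.

    Falls back to the first text if `index` is out of range for that
    mode's list, e.g. after switching to a mode with fewer texts.
    """
    names = TEXTS[mode]
    if not 0 <= index < len(names):
        index = 0
    return index, names[index]
```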

Unfortunately I still have an issue with Tkinter: switching between Kemmyn <--> manuscript generates an extra empty space, which needs a bit of adjustment to how the widgets are packed. I got a bit confused when trying to fix it, so I am leaving it in for now.

Kemmyn (top left) and manuscript (bottom right). These windows have been launched direct from the command line, with the manuscript one launched by "corpus_wordfreqGUI.py -m" to choose the manuscript spelling texts rather than Kemmyn.

Annoying issue with space at the left appearing after switching within the GUI to manuscript spelling.

Update 07/05/17 - Tkinter bug fixed

After spending a while looking at my copy of Programming Python I found the pack_forget() method, which I used to remove the buttons at the lower left (language and manuscript mode switch) while the text choice list is repopulated with new radio buttons, and then the buttons are repacked afterwards.
The heading above the list of texts now also shows which mode the program is in.
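
The pattern is roughly as follows. This is a sketch rather than the code in corpus_wordfreqGUI.py: rebuild_between and demo are made-up names, the text names are placeholders, and the demo needs a display so it only runs if you call demo() yourself.

```python
def rebuild_between(fixed_widgets, rebuild, **pack_options):
    """Temporarily unpack `fixed_widgets`, run `rebuild()`, then pack them again."""
    for widget in fixed_widgets:
        widget.pack_forget()  # removes from the layout without destroying
    rebuild()
    for widget in fixed_widgets:
        widget.pack(**pack_options)


def demo():
    """Small Tkinter window showing the pattern (not run automatically)."""
    import tkinter as tk

    root = tk.Tk()
    modes = {"Kemmyn": ["Text A", "Text B", "Text C"],
             "manuscript": ["Text A", "Text B"]}
    state = {"mode": "Kemmyn"}
    choice = tk.IntVar(value=0)
    heading = tk.Label(root, text="Texts (Kemmyn)")
    heading.pack(anchor=tk.W)
    radios = []

    def fill():
        # Replace the old radio buttons with ones for the current mode.
        for radio in radios:
            radio.destroy()
        radios.clear()
        choice.set(0)
        heading.config(text="Texts ({})".format(state["mode"]))
        for i, name in enumerate(modes[state["mode"]]):
            radio = tk.Radiobutton(root, text=name, variable=choice, value=i)
            radio.pack(anchor=tk.W)
            radios.append(radio)

    def switch():
        state["mode"] = "manuscript" if state["mode"] == "Kemmyn" else "Kemmyn"
        rebuild_between([button], fill, anchor=tk.W)

    button = tk.Button(root, text="Switch spelling", command=switch)
    fill()
    button.pack(anchor=tk.W)
    root.mainloop()
```

Without the pack_forget() step, the new radio buttons would be packed after the button instead of above it.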

In Kemmyn mode, showing the most frequent words of at least 5 letters in Passyon agan Arloedh

In manuscript spelling
