This thread has 25 replies
and has been viewed 1,711 times
 Scottish Gaelic - Gàidhlig
Bussinchen
Posts: 90
23.08.2011 02:10
scottishgaelic.dic

In this thread we can discuss questions concerning our future Scottish dictionary for Scrabble3D: gaelic.dic!


I OpenSource!
• Scrabble3D Download: Sourceforge.net | • Scrabble3D Help: Wiki | • Scrabble3D News: Twitter | • Scrabble3D Fanship: Facebook
• Scrabble3D in Italia: Sezione Scrabble3D sul Forum della Federazione Italiana Gioco Scrabble

akerbeltzalba
Posts: 142
23.08.2011 13:31
#2 RE: scottishgaelic.dic

Quote
The word count is limited by your RAM (or maybe 2GB). The German dic has about 600k entries, several with paraphrases. Its size is about 10MB, and it loads into the program in 2 or 3 seconds.



OK, that seems roughly comparable to what our spellchecking dictionaries have (bearing in mind some of the morphology rules the .aff file implements). Now that doing this is a serious possibility, I'll put my head together with my co-conspirator and thrash out some rules on how to deal with some of the dictionary issues. Having read some of the German and Latin threads, there are a few things we need to figure out too.

Getting technical here so she can chip in. There's a rule in the other dic files that non-independent forms are not allowed. That makes sense, but it means we need to remove the following from the .dic:
- any hyphenated word (as we can't do hyphens anyway)
- any word with an apostrophe: m', d'
- some prefixes (we need a full list): co-, comh-, dì-, seann-, droch-, deagh-
- prefixes that are also independent words are ok: ro- (ro)
- no single letter words: a, m', d'
- no emphatic affixes (with or without hyphen): -sa, -san, -se, e

If we work off the root forms prior to generating them. On the bright side, I think the only rule we need to implement is the one for lenition. The rules for adding h-, t-, n- we don't want anyway.

Just thinking out loud for now, we need a full list. Any categories I missed?
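
The clear-cut exclusion rules in the list above (hyphens, apostrophes, single letters) could be sketched in Python roughly as follows. This is only an illustration, not part of Scrabble3D or the spellchecker tooling: the names `is_playable` and `filter_words` are made up, and the emphatic affixes are left out because they cannot be detected by spelling alone.

```python
def is_playable(word: str) -> bool:
    """Return True if a raw word-list entry survives the exclusion rules."""
    word = word.strip()
    if len(word) <= 1:                      # single-letter words such as "a"
        return False
    if "-" in word:                         # hyphenated words and bare prefixes like "co-"
        return False
    if "'" in word or "\u2019" in word:     # apostrophe forms such as m', d'
        return False
    return True


def filter_words(words):
    """Drop excluded entries and duplicates, returning a sorted list."""
    return sorted({w.strip() for w in words if is_playable(w)})
```

Note that "ro" passes while "ro-" is dropped by the hyphen check, so independent words that double as prefixes survive.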

GunChleoc
Posts: 5
23.08.2011 14:43
#3 RE: scottishgaelic.dic

Co-conspirator signing in :D

> - any hyphenated word (as we can't do hyphens anyway)
> - some prefixes (we need a full list): co-, comh-, dì-, seann-, droch-, deagh-

Kicking these out could be handled with regular expressions.

> - any word with an apostrophe: m', d'
> - no single letter words: a, m', d'

If this is a short list, a simple search and replace will do.


> - prefixes that are also independent words are ok: ro- (ro),

Let's keep the ro and kick out the ro-

> - no emphatic affixes (with or without hyphen): -sa, -san, -se, e
Why not allow the ones without the hyphen? They would be affixes just like, e.g., a plural affix.

> If we work off the root forms prior to generating them. On the bright side, I think the only rule we need to implement is the one for lenition. The rules for adding h-, t-, n- we don't want anyway.

Exactly
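
For the lenition rule mentioned in the quote, a rough sketch of the spelling side might look like this. It assumes the usual orthographic convention that a lenitable initial consonant (b, c, d, f, g, m, p, s, t) takes an h after it and that initial sg, sm, sp and st stay unchanged. The function name `lenite` is made up, and real data would need checking against the .aff rules.

```python
LENITABLE = set("bcdfgmpst")
# Initial s followed by one of these consonants is not lenited in writing.
S_BLOCKERS = set("gmpt")


def lenite(word: str) -> str:
    """Return the lenited spelling of a lower-case word, or the word unchanged."""
    if len(word) < 2:
        return word
    first, second = word[0], word[1]
    if first not in LENITABLE:
        return word                 # vowels, l, n, r, h: no written change
    if second == "h":
        return word                 # already lenited
    if first == "s" and second in S_BLOCKERS:
        return word                 # sg-, sm-, sp-, st-
    return first + "h" + word[1:]
```

Generating lenited forms this way from root forms would only add an entry when the result differs from the input.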

Oileanach chànan chuthachail

Bussinchen
Posts: 90
23.08.2011 18:00
#4 RE: scottishgaelic.dic

I will answer later, when I am back home.

I must say that I don't know anything about the Gaelic language, not even whether it has many inflected forms, as German or Italian do, or few, as in English (i.e. whether Gaelic is a more synthetic or a more analytic language). I think I need to google a little to understand better how Celtic languages are structured. I must learn what lenition means exactly for Gaelic word stems. Interesting and exciting!

CU, cheers!
/Bussinchen



linhart
Posts: 2,493
23.08.2011 20:12
#5 RE: scottishgaelic.dic - No abbreviations

Let me make just one small remark:

Spellchecker lists usually also contain abbreviations. These have to be eliminated, since they are not allowed in Scrabble.
This is no problem if they have a full stop at the end. But in German, for example, there are also many abbreviations that are not marked in this way (chemical elements, physical units, etc.).

Bussinchen
Posts: 90
25.08.2011 00:11
#6 RE: scottishgaelic.dic - No proper names, no trade marks

And generally proper names and trade marks are not allowed.



Bussinchen
Posts: 90
25.08.2011 01:56
#7 RE: scottishgaelic.dic - A reference Dictionary

First of all:

In other languages, such as German, Swedish, Italian and Spanish, a well-known dictionary serves as the reference for the validity of the words placed.
In German, the Duden (Rechtschreibduden) is used officially, and all the lemmata in that dictionary are valid Scrabble words.
In Swedish, it is the SAOL (Svenska Akademiens ordlista - the word list of the Royal Swedish Academy). In Spanish, it is the list of the Royal Spanish Academy.
In Italian, it is the Zingarelli.

Which dictionary could it be for Gaelic Scrabble?

As Gaelic Scrabble does not exist yet, you have to decide on your own rules. It would be desirable to choose one Gaelic dictionary as the reference for the validity of Gaelic words.

Unfortunately I cannot really help you, because I cannot read a word in Gaelic. But I have found some links that could be useful:

http://www2.smo.uhi.ac.uk/toisich/
http://www2.smo.uhi.ac.uk/gaidhlig/faclair/

But I think you know that already, because there I have found a link to akerbeltzalba's website, too.

If you would like to create a really excellent Gaelic Scrabble word list, you should compare the lemmata in the reference dictionary with your own word list, which could be any spellchecker list or something else. You should eliminate all the words that are not found in the official reference dictionary. But this is a huge job that may take several years.
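
The comparison described here amounts to keeping only those candidate words that also appear in the reference list. A minimal Python sketch, assuming both lists are already available as plain collections of strings (the function name `keep_referenced` and the case-insensitive matching are assumptions):

```python
def keep_referenced(candidates, reference):
    """Split candidates into words found in the reference list and the rest."""
    ref = {w.strip().casefold() for w in reference}
    kept, rejected = [], []
    for word in candidates:
        w = word.strip()
        (kept if w.casefold() in ref else rejected).append(w)
    return kept, rejected
```

Returning the rejected words as well makes it possible to review them by hand instead of discarding them silently.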

But I think you can start with your own list and, as a first step, eliminate all proper nouns, all trade marks, all abbreviations, all words with hyphens and apostrophes, and so on.



akerbeltzalba
Posts: 142
27.08.2011 01:18
#8 RE: scottishgaelic.dic - A reference Dictionary

We'll use Am Faclair Beag (http://www.faclair.com) as the standard reference. It contains the closest thing Gaelic has to a Duden, with a modern dictionary alongside. As this is also the source for the OO dictionary files, it's a close match :)

Fortunately, abbreviations will be easy to catch by stripping everything in all caps.

Bussinchen
Posts: 90
27.08.2011 01:44
#9 RE: scottishgaelic.dic - A reference Dictionary

OK

Gaelic Scrabble does not exist yet, and there are no official rules. Therefore feel free to do what you think will be the best solution for Gaelic Scrabble!



akerbeltzalba
Posts: 142
28.08.2011 12:30
#10 RE: scottishgaelic.dic - A reference Dictionary

A question: you mentioned this before, but I misplaced the bit of paper I wrote it on. We had a Facebook debate about the word sets and whether to allow words from Dwelly's classic dictionary. With opinions evenly split, I'm leaning towards offering a choice of "modern only" and "with Dwelly".

That requires two .dic files, yes? One where all are marked [=AFB] and one with [=Dwelly], I think that's what you said, wasn't it?

And a bit of exciting news, Kevin, our black magic man, would also like to do Irish. So perhaps we should change this Forum thread into Scottish Gaelic & Irish :) (I don't think a totally separate subforum is needed)

Bussinchen
Posts: 90
28.08.2011 15:10
#11 How to create categories in Scrabble3D dic files

No problem: I'll explain it again.

In Scrabble3D dictionaries it is possible to create different dic-categories within the same file (= one file only!).

Let's take Gero's deutsch.dic as an example, because it does have different categories. You cannot see the entries in the downloaded deutsch.dic, because that file is encrypted, but Gero always sends me an unencrypted version of his German SuperDic as a backup of his work. In German, Rechtschreibduden (RD) is the standard category. It is always active and cannot be unchecked, which is why it is greyed out in the settings. Universalduden (UD) and Freestyle are supplementary categories that do not conform to the official rules, but the player can check (or uncheck) them as desired. For more information about Gero's dic, please read here (in German): Geros SuperDic - Tipps und Tricks zum Umgang mit dem deutschen Wörterbuch



[Header]
Version=Superdic Stand 06.08.11
StandardCategory=Rechtschreibduden
[Categories]
1=Universalduden
2=Freestyle und Duden-Oldies
[Words]
AA=KINDGERECHTE UMSCHREIBUNG FÜR MENSCHLICHEN KOT
AACHENER
AACHENERIN
AACHENERINNEN
AACHENERN
AACHENERS
AAK=AAK - FLACHES RHEINFRACHTSCHIFF - QUELLE FREMDWÖRTERDUDEN;2
AAKE=AAK - FLACHES RHEINFRACHTSCHIFF - QUELLE FREMDWÖRTERDUDEN;2
AAKEN=AAK - FLACHES RHEINFRACHTSCHIFF - QUELLE FREMDWÖRTERDUDEN;2
AAKES=AAK - FLACHES RHEINFRACHTSCHIFF - QUELLE FWD;2
AAKS=AAK - FLACHES RHEINFRACHTSCHIFF - QUELLE FWD;2
AAL
AALE
AALEN
AALEND
AALENDE
AALENDEM
AALENDEN
AALENDER
AALENDES
AALENS
AALES
AALEST
AALET
AALFANG=VON AALFANG;1
AALFANGE=VON AALFANG;1
AALFANGES=VON AALFANG;1
AALFANGS=VON AALFANG;1
AALGLATT
AALGLATTE
AALGLATTEM
AALGLATTEN
AALGLATTER
AALGLATTES
AALKASTEN=EINE FANGVORRICHTUNG UD 6. AUFL.;2
AALKASTENS=EINE FANGVORRICHTUNG UD 6. AUFL.;2
AALKORB=VON AALKORB;1
AALKORBE=VON AALKORB;1
AALKORBES=VON AALKORB;1
AALKORBS=VON AALKORB;1
AALKÄSTEN=EINE FANGVORRICHTUNG - PL. UD 6. AUFL.;2
AALKÖRBE=VON AALKORB;1
AALKÖRBEN=VON AALKORB;1
AALLEITER=FISCHPASS FÜR AALE;1
AALLEITERN=FISCHPASS FÜR AALE - PL.;1
AALMUTTER=IN KALTEN MEEREN, TEILWEISE IN GROSSEN TIEFEN LEBENDER FISCH, DER LEBENDE JUNGE ZUR WELT BRINGT;1
AALMUTTERN=IN KALTEN MEEREN, TEILWEISE IN GROSSEN TIEFEN LEBENDER FISCH, DER LEBENDE JUNGE ZUR WELT BRINGT - KORREKTER PL.!;1
AALQUAPPE=;1
AALQUAPPEN=;1
AALRAUPE=IM SÜSSWASSER LEBENDER, GROSSER RAUBFISCH;1
AALRAUPEN=IM SÜSSWASSER LEBENDER, GROSSER RAUBFISCH;1
AALREUSE=;1
AALREUSEN=;1
AALS
AALSPEER=;1
AALSPEERE=;1
AALSPEEREN=;1
AALSPEERES=;1
AALSPEERS=;1
AALST
AALSTECHEN=;1
AALSTECHENS=;1
AALSTRICH=AALSTRICH IST DER LÄNGS ÜBER DIE RÜCKENMITTE VERLAUFENDE DUNKLE STREIFEN IM FELL VON DIVERSEN SÄUGETIEREN;1
AALSTRICHE=AALSTRICH IST DER LÄNGS ÜBER DIE RÜCKENMITTE VERLAUFENDE DUNKLE STREIFEN IM FELL VON DIVERSEN SÄUGETIEREN;1
AALSTRICHEN=AALSTRICH IST DER LÄNGS ÜBER DIE RÜCKENMITTE VERLAUFENDE DUNKLE STREIFEN IM FELL VON DIVERSEN SÄUGETIEREN;1
AALSTRICHES=AALSTRICH IST DER LÄNGS ÜBER DIE RÜCKENMITTE VERLAUFENDE DUNKLE STREIFEN IM FELL VON DIVERSEN SÄUGETIEREN;1
AALSTRICHS=AALSTRICH IST DER LÄNGS ÜBER DIE RÜCKENMITTE VERLAUFENDE DUNKLE STREIFEN IM FELL VON DIVERSEN SÄUGETIEREN;1
AALSUPPE=;1
AALSUPPEN=;1
AALT
...
...


If you look at the header of that file, you find this:

[Categories]
1=Universalduden
2=Freestyle und Duden-Oldies


So for Gaelic you can specify e.g. the following optional categories:

[Categories]
1=Dwelly
2=XYZ Gaelic word list
3=...


Your header should then look like this:

[Header]
Version=100001
Author=akerbeltzalba
StandardCategory=Am Faclair Beag
Licence=CC-N3, any commercial use is prohibited.
Comment=Gaelic Dic 28.08.11, encrypted
Key=?????????
[Replace]
[Categories]
1=Dwelly
2=...
3=...
[Words]
...
...


Don't worry about the encryption key. I believe it is generated automatically by the program when the file is encrypted.
Leave the encryption key empty: Key=


In the word list, however, entries must be written like this:

AALFANG=VON AALFANG;1
AALKASTEN=EINE FANGVORRICHTUNG UD 6. AUFL.;2


i.e. Gaelic word, equals sign (=), definition/explanation/comment, semicolon, category number


If you don't have any definitions/explanations/comments yet, write the entry like this:

AALSTECHEN=;1

So remember that it is important not to use a semicolon within the definitions, because nothing you write after the semicolon will be shown in the tooltip or the word search hits. The semicolon is a category marker only.


If you want words to belong to the standard category, just write them like this (with or without a definition, but with no semicolon and no number):

AA=KINDGERECHTE UMSCHREIBUNG FÜR MENSCHLICHEN KOT
AACHENER
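
To make the entry format concrete, here is a small Python sketch that splits one [Words] line into word, definition and category number. It follows only the rules described in this post. The name `parse_entry` is made up and is not a Scrabble3D function, and returning 0 for the standard category is an assumption chosen for the example.

```python
def parse_entry(line: str):
    """Parse 'WORD', 'WORD=DEFINITION' or 'WORD=DEFINITION;N' into a tuple."""
    line = line.strip()
    word, sep, rest = line.partition("=")
    if not sep:
        return word, "", 0          # bare word: standard category, no definition
    definition, sep, category = rest.rpartition(";")
    if not sep:
        return word, rest, 0        # definition but no category marker
    # Assumes definitions contain no semicolon, as the post requires.
    return word, definition, int(category)
```

For instance, `parse_entry("AALFANG=VON AALFANG;1")` gives the word AALFANG, the definition VON AALFANG and category 1.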




Feel free to contact me or Gero, whenever you have more questions!

Best regards,
Bussinchen




Bussinchen
Posts: 90
28.08.2011 15:26
#12 RE: scottishgaelic.dic - A reference Dictionary

Quote from akerbeltzalba
And a bit of exciting news, Kevin, our black magic man, would also like to do Irish. So perhaps we should change this Forum thread into Scottish Gaelic & Irish :) (I don't think a totally separate subforum is needed)

Wow!!!! Amazing!!! Fantastic!!! Awesome!!!



akerbeltzalba
Posts: 142
28.08.2011 16:05
#13 RE: scottishgaelic.dic - A reference Dictionary

What's the more sensible option? For example, let's say we have a 5 word dic file:
glas
taigh
fuar
muirsgian
deoch-bhiugh

Let's say the last two are Dwelly words but the others occur both in Dwelly and in AFB. If we define Dwelly=2 and AFB=1, does the list look like this:
glas;1;2
taigh;1;2
fuar;1;2
muirsgian;2
deoch-bhiugh;2

Or is it better to define AFB+Dwelly=1 and AFBonly=2, resulting in:
glas;2
taigh;2
fuar;2
muirsgian;1
deoch-bhiugh;1

Scotty
Administrator
Posts: 3,777
28.08.2011 16:15
#14 RE: scottishgaelic.dic - A reference Dictionary

The German dic isn't the best example for your task. English has two lists, SOWPODS and TWL, where TWL is a subset of SOWPODS [1]. So the dic has TWL as its standard category, and players can add the SOWPODS words. These words are marked with ;1.
A word cannot be in two or more categories, and duplicate entries are not allowed.
Putting it all together, the list looks like this:

[General]
StandardCategory=Dwelly
[Categories]
1=ABC
[Words]
glas
taigh
fuar
muirsgian=;1
deoch-bhiugh=;1

Dwelly is always active, and ABC can be enabled optionally.
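
As a sketch of how such a file could be produced from two word sets, the Python below writes every word exactly once: words in the standard set get a bare line, and words found only in the optional set get `=;1`. The function name `build_dic` is hypothetical, the section names simply copy the example above, and sorting the entries alphabetically is an assumption.

```python
def build_dic(standard, optional, standard_name, optional_name):
    """Return .dic text with one optional category; no word appears twice."""
    standard = set(standard)
    extra = set(optional) - standard      # words in both sets stay standard
    lines = [
        "[General]",
        f"StandardCategory={standard_name}",
        "[Categories]",
        f"1={optional_name}",
        "[Words]",
    ]
    # Pair each word with its output line so entries sort by the word itself.
    entries = [(w, w) for w in standard] + [(w, f"{w}=;1") for w in extra]
    lines += [text for _, text in sorted(entries)]
    return "\n".join(lines) + "\n"
```

Because the optional set is reduced to the words missing from the standard set, overlapping input lists are fine and the no-duplicates rule holds by construction.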

[1] english.dic: Discussion about the TWL and SOWPODS categories


Download: Sourceforge.net | Help:Scrabble3D Wiki | Discussion: Forum | News: Twitter | Fanship: Facebook

akerbeltzalba
Posts: 142
28.08.2011 16:41
#15 RE: scottishgaelic.dic - A reference Dictionary

Ah OK, I'm with you now. Except I probably didn't explain it very well, so I think it's the other way round. The idea is to have a smaller, more modern wordset (let's call this AFB=1) and a bigger wordset for advanced players that contains the modern wordset PLUS older words (let's call this combined set AFB+Dwelly=2 - I'll think of better names eventually).

So we get
[General]
StandardCategory=AFB
[Categories]
1=Dwelly
[Words]
glas
taigh
fuar
muirsgian=;1
deoch-bhiugh=;1

This means that in a standard game, players only get the StandardCategory words, so the last two are not allowed. But if they select the bigger wordset, they get everything, including the words marked ;1.

That's right, isn't it?
