Let's discuss the Finnish dictionary here in this thread.
We know that the suomi.dic we have right now is not very good. It contains only 93695 words. Compare that to our deutsch.dic, which contains nearly 681000 words, or to our français.dic, which contains nearly 379000 words. It would be great if we could get a better Finnish word list or improve the one we already have.
I think J24 means that the Finnish dictionary he uses for playing Scrabble is Nykysuomen sanakirja or maybe Kielitoimiston sanakirja.
We read in Wikipedia:
Standard Finnish is prescribed by the Language Office of the Research Institute for the Languages of Finland and is the language used in official communication. The Dictionary of Contemporary Finnish (Nykysuomen sanakirja 1951–61), with 201,000 entries, was a prescriptive dictionary that defined official language. An additional volume for words of foreign origin (Nykysuomen sivistyssanakirja, 30,000 entries) was published in 1991. An updated dictionary, The New Dictionary of Modern Finnish (Kielitoimiston sanakirja) was published in an electronic form in 2004 and in print in 2006.
Quote from J24: And the dictionary portal that we use is not a free one.
Please give me the link anyway!
Quote from J24: Players can decide which dictionary they use (from Finnish Scrabble rules).
OK, this is the same in the booklet with the rules for Hungarian Scrabble. And in Hungary there is no National Scrabble Federation either.
But we decided to take Magyar Értelmező Kéziszótár for creating our future magyar.dic, because it seems to be the most reliable and the most official one in Hungary. So I think that Kielitoimiston sanakirja would be the best one for Finnish Scrabble, because it has been published by the Language Office of the Kotimaisten kielten tutkimuskeskus (Research Institute for the Languages of Finland, http://www.kotus.fi/?l=en&s=1).
Like in Swedish Scrabble. That's boring if you often play in German, French, Italian, English, and so on, where inflected forms permit you to extend already placed words so that you get more points... But OK, if the rule is like this in the Finnish rule booklet, we have to follow this rule.
Maybe our suomi.dic is not that bad, if we consider that no inflected forms are valid. Now we have to compare the 93695 words of our suomi.dic with Kielitoimiston sanakirja, which consists of almost 100 000 entries.
J24, what do you personally think about the quality of suomi.dic of Scrabble3D, when you are playing in poll mode?
I have checked the old discussion (in German) about our suomi.dic that we had with a Finnish guy, xyz, in 2008. At that time, today's Scrabble3D with its advanced functions and its server didn't exist yet. Back then we played with an older, much simpler (but nevertheless very nice) version, Heiko Tietze Scrabble 3.0.xx. I was a member of this forum at that time, but I was not as active as I am today, and I didn't care so much where our dic files came from. Now it's different. So now, three years later, I have checked once more the links xyz mentioned in his postings...
The kotus list which I edited for suomi.dic contained 94110 words, and the version I sent (years ago) seems to have the right number, also (see Finnisch Archiv 2008).
But suomi.dic contains 93695 words, so 415 words are missing.
The original version I sent was in lowercase. Suomi.dic is uppercase and with a slightly different alphabetical order, so maybe the words have been lost in the same process?
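One way to test this guess would be to uppercase the lowercase list and see which entries collapse into the same word. Here is a minimal Python sketch, assuming the list is just one word per entry. The function name and the sample words are made up for illustration and are not taken from the Kotus list.

```python
from collections import defaultdict

def find_case_collisions(words):
    """Group entries that become identical once they are uppercased."""
    groups = defaultdict(list)
    for word in words:
        groups[word.strip().upper()].append(word)
    # Keep only the uppercase forms that more than one entry maps to.
    return {key: vals for key, vals in groups.items() if len(vals) > 1}

# Hypothetical sample, not real Kotus data.
sample = ["kuusi", "kuusi", "Åland", "åland", "talo"]
collisions = find_case_collisions(sample)

# Number of entries that disappear if only unique uppercase words are kept.
lost = sum(len(vals) - 1 for vals in collisions.values())
```

If the number of lost entries for the real list came out as 415, that would support the idea that nothing was dropped except duplicates.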
I attach here the original list with 94110 words, encoding UTF-8, uppercase. I hope it is the right format. All the special characters should be right now. Can you put in the version number etc., I don't know what it should be.
I found one mistake I made when editing the original list in lowercase, the first å in "ångstöm" should be written with lowercase.
The original kotus list also contains inflection information; maybe it could be used to divide the list into word classes (verbs, etc.). But if nobody has even noticed that the dictionary is missing 400 or so words, maybe it is not such a useful idea.
About the quality of the kotus list, I think it is not so bad after all. It's the basis for the Language Office dictionary, which has some thousands of extra new words. If there is no Scrabble association in Finland (I haven't found one), in my opinion there is not much point in having a non-free dictionary as the "official" list for Finnish.
The header of your old suomi.dic is like this:
[Header]
Version=100001
Author=xyz
StandardCategory=nicht zugeordnet
[Categories]
[Words]
The header of the German dictionary deutsch.dic is like this:
[Header]
Version=200019
Author=Gero Illing <email@example.com>
StandardCategory=Rechtschreibduden
Licence=Copyright Gero Illing, Nutzung und freie Verbreitung ausschließlich zusammen mit dem Programm Scrabble3D gestattet
Comment=Superdic Stand 15.05.12, encrypted
Key=S3MMjf46
[Replace]
[Categories]
1=Universalduden
2=Freestyle und Duden-Oldies
[Words]
So you see that you should add a definition of the StandardCategory. Next to Author you should write your real name instead of xyz and add your email address. Licence: write which one you want, GNU GPL or CC (Creative Commons) non-commercial... You can also leave a Comment, and you have to decide whether you want Scotty to encrypt your word list or not (tell him by email). The Key is nothing you have to write, and Replace is irrelevant for Finnish (it is relevant only for languages with digraphs).
Since Kotus already is a free word list, I suggest some header like this for your updated suomi.dic:
[Header]
Version=100002
Author=xyz (to be replaced by your name) <firstname.lastname@example.org>
StandardCategory=Kotus
Licence=GNU General Public License (or CC or what you prefer)
Comment=This word list is based on Kotus word list. (or something like that; you can explain a little bit what Kotus is, in English or Finnish)
[Replace]
[Categories]
[Words]
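If you prefer to generate the header with a script instead of typing it, something like the following Python sketch might do. It assumes the file has one key=value pair per line, as in an INI file, which the quoted headers suggest but which I have not checked against the program itself. The function name build_header and the sample values are only placeholders.

```python
def build_header(version, author, category, licence, comment):
    """Return the header block of a .dic file as text (assumed layout)."""
    lines = [
        "[Header]",
        f"Version={version}",
        f"Author={author}",
        f"StandardCategory={category}",
        f"Licence={licence}",
        f"Comment={comment}",
        "[Replace]",      # left empty: only needed for languages with digraphs
        "[Categories]",
        "[Words]",        # the word list itself would follow this line
    ]
    return "\n".join(lines) + "\n"

# Placeholder values; replace them with the real author and licence.
header = build_header(
    100002,
    "xyz <firstname.lastname@example.org>",
    "Kotus",
    "GNU General Public License",
    "This word list is based on Kotus word list.",
)
```

The Key entry is left out on purpose, since it is not something the author writes.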
Your dictionary contains 93695 unique words (the remaining entries of the 94110 are duplicates) with several strange letters: ´," ",-,',¢,4,A,Á,À,Å,Ä,B,C,D,E,É,È,Ê,F,G,H,I,Î,J,K,L,M,N,O,Ö,P,Q,R,S,Š,T,U,Û,V,W,X,Y,Z,Ž,
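A check like this can be reproduced outside the program. The Python sketch below counts unique uppercase words and lists every character that is not in an allowed set. The allowed set used here (A to Z plus Å, Ä, Ö) is only my assumption, since the Finnish letter set has not been settled yet, and the sample entries are invented.

```python
from collections import Counter

def char_inventory(words):
    """Return the set of unique uppercase words and a count per character."""
    unique = {w.strip().upper() for w in words if w.strip()}
    counts = Counter(ch for word in unique for ch in word)
    return unique, counts

def unexpected_chars(counts, allowed):
    """List the characters that occur in the words but are not allowed."""
    return sorted(ch for ch in counts if ch not in allowed)

# Assumed letter set, for illustration only.
ALLOWED = set("ABCDEFGHIJKLMNOPQRSTUVWXYZÅÄÖ")

# Invented sample entries, not real list data.
sample = ["vaa'an", "talo", "TALO", "crème", "4H-kerho"]
unique, counts = char_inventory(sample)
odd = unexpected_chars(counts, ALLOWED)
```

Entries containing any of the characters in odd are the ones that would need to be fixed or removed before the list is loaded.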
I don't remember exactly how we managed to create the current list but I guess it would be the result after removing all these problems.
Letters are allowed only in upper case (lower-case letters are converted). The sort order is applied automatically, based on the ordinal value of each character.
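To show what sorting by ordinal value means for Finnish letters, here is a small Python sketch. It is not the program's own code, just an imitation of the behaviour described, and the function name normalise is made up. Python compares strings by code point, so a plain sorted() already gives an ordinal order.

```python
def normalise(words):
    """Uppercase, drop duplicates and sort by code point (ordinal value)."""
    unique = {w.strip().upper() for w in words if w.strip()}
    # sorted() on str compares code points, i.e. the ordinal values.
    return sorted(unique)

# Single letters are enough to show the effect.
ordered = normalise(["ö", "å", "ä", "z", "a"])
# Ä is U+00C4 and Å is U+00C5, so Ä sorts before Å here,
# while the Finnish alphabet ends with Å, Ä, Ö.
```

This would explain why the alphabetical order in suomi.dic differs slightly from the order in a list sorted the Finnish way.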
BTW: I'd like to add your real name with email address to the list header. And, according to Bussinchen's posting, you should think about the licence (GPL v3) and plain text, i.e. not scrambled.
Unfortunately I have not yet been able to discuss these things with you, but I will do it as soon as possible in the thread on the Finnish letter set. That discussion will have consequences for our Finnish word list suomi.dic.