Yes, I agree. But how do we get the main forms? As Bussinchen explained, there are NO inflected forms in your list! One theoretical possibility would be to compile all wordforms from all ancient Greek texts. Presumably, the authors of the Perseus site have done something like that, but until now we have not found a corresponding list.
Quote from linhart: Yes, I agree. But how do we get the main forms? As Bussinchen explained, there are NO inflected forms in your list! One theoretical possibility would be to compile all word forms from all Ancient Greek texts. Presumably, the authors of the Perseus site have done something like that, but until now we have not found a corresponding list.
IMHO one solution could be to compile all word forms from all texts in the Perseus database. At least that would be better than having no inflected forms at all!
Quote from linhart in a private email to Apollonius and Bussinchen: But the main question is: do you (or we) want to have all inflected forms in the Scrabble word list? If yes, this would be really VERY much work. There is at least one language where Scrabble is played without inflected forms, namely Swedish. So this might be a realistic option.
Personally I don't like that stupid Swedish rule at all. And if our goal is to serve a pedagogical purpose with our Ancient Greek Scrabble3D, too - a goal which I really support! - we absolutely need inflected forms, too!!!
So I want to have inflected forms in our dic. Period!
Yes, Bussinchen, it would of course be better to have all word forms in the list. But for the moment I do not see any realistic way to reach this goal. There are so many Ancient Greek texts, and Perseus presents them only in small sections, so we would have to perform many thousands of copy-and-paste operations.
Assume about 11 million words in total. If one word has on average 6 letters and one Greek letter needs 2 bytes for storage, that makes about 132 MB.
I tried the first text of Lucian (Abdicatus). It is divided into 32 sections, and each section has about 100 words on average. If I assume that the other texts are stored in a similar manner, this means that we would have to copy and paste about 110,000 text sections. If we need 10 seconds for one copy and paste, this would take 1,100,000 seconds, or about 305 hours of work, just to get a huge text file of about 132 MB which contains the whole of Ancient Greek literature.
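To make the estimate easy to redo with other assumptions, here is a small Python sketch of the same arithmetic. All the constants are the rough guesses from this post (11 million words, 6 letters per word, 2 bytes per letter, 100 words per section, 10 seconds per copy and paste); the function name is just for illustration.

```python
# Rough guesses taken from the post above; change them to test other scenarios.
TOTAL_WORDS = 11_000_000      # approximate number of running words
AVG_LETTERS_PER_WORD = 6
BYTES_PER_LETTER = 2
WORDS_PER_SECTION = 100
SECONDS_PER_COPY = 10


def estimate_effort():
    """Return the estimated file size and manual copy-and-paste effort."""
    size_mb = TOTAL_WORDS * AVG_LETTERS_PER_WORD * BYTES_PER_LETTER / 1_000_000
    sections = TOTAL_WORDS // WORDS_PER_SECTION
    seconds = sections * SECONDS_PER_COPY
    return {
        "size_mb": size_mb,
        "sections": sections,
        "seconds": seconds,
        "hours": seconds / 3600,
    }


if __name__ == "__main__":
    for key, value in estimate_effort().items():
        print(f"{key}: {value:,.1f}")
```

With these inputs the numbers match the ones above: 132 MB, 110,000 sections and a bit more than 305 hours.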
So we need an idea how this process could be improved.
So did I! Apollonius told me that all three parts have to be downloaded first, otherwise it is not possible to unzip them. But once you have downloaded parts 1, 2 and 3, unzipping works if you start by uncompressing the file with the ending 001, and you get one huge PDF file: 1806 pages! This PDF file is a facsimile, a scanned copy of the printed edition of the LSJ.
But we ran a test to check whether copy & paste works, and it does indeed, even though we only have a scanned facsimile. However, if I copy for instance the whole of page 4, sometimes not everything gets selected and part of the page remains unmarked. I don't understand why.
Then I tried to paste copied text from the LSJ-pdf into MS Notepad, MS WordPad and Notepad++. That works - but the Greek letters are not shown correctly at all in my txt-files.
The result is like this:
Crya9o-Ppucria, 17, good produce, C. I. 9262. uYa0o5ai|jioviSaifiwv (cf. sq.) : hence, guests who drink but little, Arist. Eth. E. 3. 6, 3 : dyafloBai-iioviao-Tai, name of a sort of club, Ross Inscrr. ined ayaOofipva-ia dya'XXw. 282. v, ovos, 6, the good Genius, to whom a cup of pure wine was drunk at the end of dinner, the toast being given in the words dya- &ov Saipovos : and in good Greek it was always written divisim. II an Egyptian serpent, Wessel. Diod. 3. 50. d-yaOoSo<7ia, 77, (8oo"ts) the giving of good, Schol. Arist. dY<).8o-86T-r|S, ov, u, the Giver of good, Diotog. ap. Stob. 332. 19: fern 8oris, ioos, 17, Dionys. Ar. 440. 34. d^dOo-fiB-ris, is, like good, seeming good, opp. to d-yafliis, Plat. Rep 509 A, Iambi., etc. Adv. -Sius. aya.9oepyiu, to do good or well, I Ep. Tim. 6. 18: contr. -oupycu, Act. Ap. 14. 17 (vulg. ayaOoiTOi&v). dyafloepYta, Ion. -li), contr. -ovpyia, f/, a good deed, service rendered, Lat. benefimim, Hdt. 3. 154, 160. II. well-doing, Eccl.
Now I don't know if it would be possible to transform the wrongly shown "Greek" letters into real, correct Greek letters.
I don't know whether this copy/paste error occurs because they do not use Unicode. I really can't say if the two things are related, but I found something on this site:
To find individual lemmata, one must use the search function. It is not possible yet for a user of Perseus to send Greek Unicode directly to the search engine. Therefore, if a user wants to send a fully inflected Greek form with diacritical marks, Beta Code must be used. Help is provided to write the Beta Code using Greek letters and accents (See under "Help" "Lookup Tool Help"), but searching goes so much more quickly if one can transliterate to Beta Code directly.
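Since Beta Code is a plain-ASCII transliteration, turning it back into Greek letters is mostly a table lookup. Here is a minimal Python sketch of that idea. The function name `beta_to_unicode` and the two tables are only an illustration, not anything provided by Perseus. It handles letters, capitals marked with `*`, and the common diacritics, and it simply treats every word-final sigma as ς. Anything else is passed through unchanged.

```python
import re
import unicodedata

# Beta Code letter -> Greek letter
LETTERS = {
    "a": "α", "b": "β", "g": "γ", "d": "δ", "e": "ε", "z": "ζ",
    "h": "η", "q": "θ", "i": "ι", "k": "κ", "l": "λ", "m": "μ",
    "n": "ν", "c": "ξ", "o": "ο", "p": "π", "r": "ρ", "s": "σ",
    "t": "τ", "u": "υ", "f": "φ", "x": "χ", "y": "ψ", "w": "ω",
}

# Beta Code diacritic -> Unicode combining character
MARKS = {
    ")": "\u0313",   # smooth breathing
    "(": "\u0314",   # rough breathing
    "/": "\u0301",   # acute
    "\\": "\u0300",  # grave
    "=": "\u0342",   # circumflex
    "+": "\u0308",   # diaeresis
    "|": "\u0345",   # iota subscript
}


def beta_to_unicode(beta):
    """Convert a Beta Code string to precomposed Unicode Greek (sketch)."""
    out = []
    pending = []     # diacritics written between '*' and the capital letter
    capital = False
    for ch in beta.lower():
        if ch == "*":
            capital = True
        elif ch in MARKS:
            (pending if capital else out).append(MARKS[ch])
        elif ch in LETTERS:
            letter = LETTERS[ch].upper() if capital else LETTERS[ch]
            out.append(letter)
            out.extend(pending)
            pending, capital = [], False
        else:
            out.append(ch)
    # NFC merges letter + combining marks into single precomposed characters.
    text = unicodedata.normalize("NFC", "".join(out))
    # A sigma at the end of a word becomes final sigma.
    return re.sub(r"σ(?=\W|$)", "ς", text)


if __name__ == "__main__":
    print(beta_to_unicode("mh=nin a)/eide qea/"))
```

Whether the garbled PDF text can be repaired this way is a different question: that text looks like OCR output rather than Beta Code, so a lookup table like this would probably not help there.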
Anyway: You see that there is a CD, too. Rather expensive (135 $) - so I would have liked to see at least some screenshots. But in one comment I can read:
In addition the programme has several useful facilities, such as bookmarks and the possibility of making notes. The latter option, however, is not really essential since communication with a word processor like MS Word runs smoothly by means of the copy/paste function.
If we only knew whether it is possible to create whole lists that can be copied, or whether you can copy only the article of a single lemma at a time...
Quote from Apollonius: And what a quandary! We either have no inflected forms, or the dictionary will be in the millions of words? I don't even mind that so much, but WE will have to add all of those forms?
I think that if we continue using the Perseus database to create lists, we could use both the frequency number and the length of every word as decisive criteria to reduce not only the number of words in our future ancientgreek.dic, but above all the size of the file. But I repeat: I think that we should absolutely have inflected forms, too! Because that is what students practise most when they prepare for exams (for example the German Graecum, as I did when I was a young student in Erlangen).
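The frequency-and-length criterion could look roughly like this in Python. This is only a sketch: the function name, the input format (a mapping from word form to frequency) and the default thresholds are assumptions for illustration. In particular, `max_len=15` assumes a classic 15x15 board, which may not hold for every Scrabble3D setup.

```python
def filter_words(frequencies, min_count=2, max_len=15):
    """Keep word forms that are frequent enough and short enough.

    frequencies: dict mapping a word form to how often it occurs.
    min_count and max_len are hypothetical thresholds to be tuned.
    """
    return sorted(
        word
        for word, count in frequencies.items()
        if count >= min_count and len(word) <= max_len
    )


if __name__ == "__main__":
    sample = {"και": 5000, "λογος": 40, "απαξ": 1}
    print(filter_words(sample))
```

Note that `len(word)` only counts letters correctly if accented letters are stored as single precomposed characters.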
If it is too much work for now, we can start with an ancientgreek.dic without inflected forms. And then we can provide a dic update once we have added inflected forms.
As far as I understand it, the number 11 million (exactly 11,385,778) counts every occurrence, so words which occur several times are also counted several times. Thus the number of DIFFERENT word forms must be much smaller, presumably less than one million. So the main problem is how to copy all texts into one large file.
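Once the texts are in one file, the difference between running words and different word forms is easy to measure. A minimal Python sketch, assuming the text is already Unicode Greek; the function name and the simple "any run of letters is a word" rule are my own simplifications:

```python
import re
import unicodedata
from collections import Counter

# A word is any run of letters (no digits, no underscore, no punctuation).
WORD_RE = re.compile(r"[^\W\d_]+")


def count_forms(text):
    """Return a Counter mapping each distinct word form to its frequency."""
    # NFC makes sure the same accented letter is always stored the same way.
    normalized = unicodedata.normalize("NFC", text).lower()
    return Counter(WORD_RE.findall(normalized))


if __name__ == "__main__":
    forms = count_forms("και λογος και εργον")
    print("running words:", sum(forms.values()))
    print("different forms:", len(forms))
```

The sum of all counts corresponds to the 11 million figure, and the number of keys is the number of different word forms.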
There is one possibility which might speed up the copying process. If you look at a page with a portion of a classical text (e.g. http://www.perseus.tufts.edu/hopper/text...1%3Asection%3D1), you see at the bottom, in a red field, the letters XML and below that the link "XML version". If you click on it, you can download an XML version of the WHOLE text, not only the current section.
Now the problem is how to convert the XML format to ordinary Greek text. I think I could write a conversion program, but it would of course be better, if we could find one somewhere.
Even so, each text would still have to be downloaded separately. I have not counted them, but I guess that there are about 1000 texts. That could be manageable.
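As a starting point for such a conversion program, here is a minimal Python sketch that strips the markup and keeps only the running text. It is built on assumptions: that the text sits inside a `body` element and that editorial material is in `note` and `head` elements. The real Perseus files may be structured differently, and they may contain entity references or other markup that this simple approach does not handle.

```python
import xml.etree.ElementTree as ET


def xml_to_text(xml_string, skip_tags=("note", "head")):
    """Extract plain text from an XML document, skipping some elements.

    The tag names are assumptions about the file layout, not a known spec.
    """
    root = ET.fromstring(xml_string)
    body = root.find(".//body")
    if body is None:          # fall back to the whole document
        body = root
    parts = []

    def walk(element):
        if element.tag in skip_tags:
            return            # ignore editorial notes, headings, etc.
        if element.text:
            parts.append(element.text)
        for child in element:
            walk(child)
            if child.tail:    # text following the child belongs to the parent
                parts.append(child.tail)

    walk(body)
    # Collapse line breaks and repeated spaces into single spaces.
    return " ".join("".join(parts).split())


if __name__ == "__main__":
    sample = (
        "<TEI.2><teiHeader><title>Sample</title></teiHeader>"
        "<text><body><p>mh=nin <note>editor's remark</note>a)/eide</p>"
        "</body></text></TEI.2>"
    )
    print(xml_to_text(sample))
```

If the files store the Greek in a transliteration rather than in Greek letters, the extracted text would still need a second conversion step.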
But will we still need that in light of finding the entire dictionary?
Also, I never could see the Greek text in those Excel sheets. When I opened them, they gave me a warning that I had over 500 fonts, and therefore the documents would not be displayed properly. And then they weren't displayed properly! So I'm not sure if I'm the best person to help with matters of properly displaying Greek text. But when I try to copy from the PDF file to Notepad, I get the same results as Bussinchen, for whatever that's worth.
Thank you for this link. Unfortunately I was not able to install this Page Converter properly. If I want to convert an XML file, I get a message which says that the process cannot access the XML file since it is being used by another process. Restarting the computer did not help.
To your question "But will we still need that in light of finding the entire dictionary?":
Yes, I think we need something like that, since the dictionary does not contain inflected forms.
Oh... Well things are moving along quickly then! I thought we were just going to try and get a basic dictionary up and running first, but if we're going for broke, then I'm all in favor of that. I think that ALL of the forms should be included, as much as possible. Even though I don't understand ancient Greek yet, I have to keep in mind how it would be if I DID. I would want to be able to play the word meaning "they dance" if I knew it, saw a place for it, and had the proper letters. Just having the basic form would not do at all! It would be tantamount to removing the past tenses of verbs in English, or plural forms of nouns!
Bussinchen, I will have to leave the technical aspects to you and Linhart unless you want to give me a crash course in Excel! However I can help with any sort of boring work like editing the word list, or things of that nature...just tell me how, and I'm at your disposal. I know it won't be possible to run a filter to automatically generate word forms (for one thing your filter would have to differentiate between an adjective and a verb in Ancient Greek), but when we find a verb, it would be nice if there was a way to automatically generate all of the forms for it. Would that have to be done within the Scrabble3D program's dictionary editor?
I'm sorry about the program! I tried it, but I had different problems. I couldn't export an XML file which contained much text at all. It was mostly lines of code. I think I am getting the XML from the wrong place, or perhaps at the wrong time?
Also, @Bussinchen, your Excel sheets look fantastic, as far as a word list with definitions and other information goes! That puts my TXT file to shame!
But I already have an idea!!! First, however, I must get in touch again with Prof. Dr. Wilfried Stroh and Prof. Dr. Holzberg, both from the University of Munich, following up on my first email to them. I will do it today.
My idea is to try to involve high-minded students of Ancient Greek language and literature. It would be fantastic if we could get them to help us by "donating" inflected forms in a real open-source, Wikipedia-like spirit... In that way we could all together create an amazing Creative Commons Ancient Greek word list including inflected forms, available for everybody, not only for Scrabble3D. And we could use this forum for that purpose!
ad 3: I'm really not an expert, but thank you anyway for your compliment! Fortunately the Greek fonts appeared well in MS Excel after using my workaround via Microsoft Windows Wordpad...
Quote from Bussinchen: It is really irritating that the Perseus website produces error messages so often:
Quote from Perseus on http://www.perseus.tufts.edu/hopper/vocablist: An Error Occurred. Sorry, we were unable to load the page you were looking for! You've encountered an error that probably occurred due to a problem on our end. The error has been logged, and we'll be looking into fixing it. If you'd like to help us out, you can fill out the following form
I am really angry, because this error occurred again today, and I don't know what to do. I cannot create even one single list! I opened three different browsers: Internet Explorer, Mozilla Firefox, Google Chrome - but it doesn't matter which one I use, in each browser my very first attempt failed: An Error Occurred
Open-Source Services The Perseus Hopper is an open-source project providing a suite of services for interacting with textual collections. While as a whole it provides an integrated reading environment, its individual services are designed to be modular and can be grouped into three different classes.
The size of the file hopper-texts-GreekRoman.tar.gz is 119 MB. The size of the unzipped tar file is 412.2 MB.
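With an archive like this, the XML files could be read one after another by a program instead of being opened by hand. A minimal Python sketch using the standard `tarfile` module; the function name is made up for illustration, and it only assumes that the texts inside the archive are files ending in `.xml`:

```python
import tarfile


def iter_xml_members(archive_path):
    """Yield (name, raw bytes) for every .xml file in a .tar.gz archive."""
    with tarfile.open(archive_path, "r:gz") as archive:
        for member in archive:
            if member.isfile() and member.name.lower().endswith(".xml"):
                yield member.name, archive.extractfile(member).read()


if __name__ == "__main__":
    import sys

    # Usage: python list_xml.py hopper-texts-GreekRoman.tar.gz
    for name, data in iter_xml_members(sys.argv[1]):
        print(f"{len(data):>10} bytes  {name}")
```

This reads the files straight from the compressed archive, so the 412.2 MB do not have to be unpacked first.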
Unfortunately Greek letters are not shown as Greek letters in my editor, Notepad++. Example:
That's great, Bussinchen! I wonder why I had not found this link. The files are all in XML format, so it is not the fault of Notepad++ if you cannot read them. But I have already made some progress in writing a conversion program. I could already convert all texts of Aeschines, but there are still some open problems.
Though this is kind of hard work, I also had to laugh in between. In Aeschines 3, 112 we can read:
It is really strange that there is neither a spiritus nor an accent on the word υνκνοων. I did not find that word in the old Langenscheidts Taschenwörterbuch Griechisch-Deutsch I used when I studied Ancient Greek with Mr. Holzberg in Erlangen. I can't find any word beginning with ypsilon + ny there...
I don't know what to say...
By the way: Today I have written one more email to Prof. Holzberg, asking him among other things if he - or somebody else at the Department for Classical Languages at the University of Munich - knows if there is a copy/paste function in the LSJ-CD-ROM like the one we have in the CD-ROM of the German Rechtschreibduden...