Shavian:Forum
This is the forum. Many things here may be out of date. Click here to begin a new section, in either alphabet. You will need to create an account first.
Homonyms
I have a plan to deal with this, but it'll have to wait a couple of days. -- TT
- Done-- see Shavian:Dab. Marnanel 02:16, 3 April 2009 (UTC)
Androcles and canonicity
If Androcles did something one way, unless it was clearly a mistake, should we always follow? There seem to be a few words which are decidedly different in various dialects (as in Talk:That), and I wondered whether it might be a useful rule of thumb. Marnanel 03:30, 3 April 2009 (UTC)
- That was my understanding, yes, that Androcles was a model for "standard" Shavian.
- Everyone is, of course, free to write as they wish when they want to represent their own dialect. But if you want to write "standard written Shavian", I've always taken Androcles as the model. (Obvious mistakes excepted, I suppose.) -- Pne 20:30, 4 April 2009 (UTC)
Wordlists
If you want to transliterate some words, another option besides taking a random document and clicking on words is to take a wordlist such as this one or this one.
Then create a new article in the Document: namespace (e.g. by editing Document:Sandbox), paste the article in, preview, and fix some unknown words from there. (But don't actually save the article unless you have permission to use it and to release it as cc-by-sa.) -- Pne 20:53, 4 April 2009 (UTC)
- That's a really useful idea. I asked on Wikipedia in a couple of places [1] [2] whether they had copyright clearance to use the lists, and both of them said they considered it was permissible (though the people on en: gave a better answer as to why).
Also, someone else has since replied saying that the Cambridge Encyclopedia of Language 2e has the list with no copyright notice. If Wiktionary, two Wikipedias, and the CEOL are okay with using it, perhaps we can. Marnanel 16:18, 6 April 2009 (UTC)
Prefill
I have added a little extra code: if you create a new page in the lexicon, and removing -s, -es, -ed, -ing, or -ly from its name gives you an existing page, the new page is prefilled with the content of that page.
What would be more useful is adding the appropriate suffix automatically, but obviously that's more complicated (checking for sibilants, etc) and will have to wait for later. Marnanel 21:24, 4 April 2009 (UTC)
- I did this a couple of days ago but forgot to update this page. See Shavian:Prefill magic for the current ruleset. Marnanel 16:19, 6 April 2009 (UTC)
- Further update to the prefill code: if nothing in Shavian:Prefill magic fits, and a word can be made up of two parts both of which we know and both of which are at least two letters, it will prefill. For example, waistcoat = waist + coat. This can result in some false positives, so be sure to check carefully. But it can also save time and ensure consistency. Marnanel 01:03, 13 April 2009 (UTC)
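A minimal sketch of the two prefill steps described in this thread, offered as an illustration rather than the code actually running here. It assumes the lexicon is available as a plain set of Latin-alphabet words, and the names `strip_suffix`, `split_compound` and `prefill` are hypothetical. The real ruleset lives in Shavian:Prefill magic and may differ.

```python
# Hypothetical sketch; the real rules are in Shavian:Prefill magic.
# Suffix list taken from the first post in this thread.
SUFFIXES = ["ing", "ly", "es", "ed", "s"]


def strip_suffix(word, lexicon):
    """Return (base, suffix) if removing a listed suffix gives a known word."""
    for suffix in SUFFIXES:
        if word.endswith(suffix):
            base = word[: -len(suffix)]
            if base in lexicon:
                return base, suffix
    return None


def split_compound(word, lexicon, min_len=2):
    """Return (left, right) if both parts are known and at least min_len letters."""
    for i in range(min_len, len(word) - min_len + 1):
        left, right = word[:i], word[i:]
        if left in lexicon and right in lexicon:
            return left, right
    return None


def prefill(word, lexicon):
    """Try the suffix rules first, then fall back to the two-part split."""
    return strip_suffix(word, lexicon) or split_compound(word, lexicon)


lexicon = {"waist", "coat", "walk"}
print(prefill("walking", lexicon))
print(prefill("waistcoat", lexicon))
```

The split step accepts the first pair of known parts it finds, which is one way the false positives mentioned above can arise, so the result is only ever a suggestion to check.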
New feature
Clicking on a word in a document with a green clover after it will now give you a list of possible disambiguations. Choosing one and pressing the "Disambiguate" button will then update the document automatically. This is best used in dual mode. Marnanel 20:51, 5 April 2009 (UTC)
Error
I've been disambiguating some words, but I got an "alignment error" when I tried to disambiguate the word "does" in Act II of Androcles. I'm not quite sure what this means - could the comment at the end of does be interfering with this process? There was also a problem with "tearing" in the prologue, except in that case neither option appeared, so neither could be selected. -WurdBendur 09:29, 6 April 2009 (UTC)
- Thanks, I'll look into them. The system which lets you click on the disambiguation links has only the rendered HTML as input, but must modify the wiki text, so it can't work with anything really precise like character offsets. Instead, it must use word numbers. The trouble is that it doesn't (at present) know which wiki instructions produce text of their own; there's a simple algorithm to guess but it's too simple. Alignment errors happen when the system is told that word number 123 is "dog" but it turns out to be "cat", or something similar to that, and they happen for that reason. I will look into improving the algorithm. Thanks for reporting the errors. Marnanel 15:52, 6 April 2009 (UTC)
- Update: I think the solution would be not to use absolute word numbers from the start of the document, but merely the count of times that word had appeared-- say "the second instance of 'does' in the document" or something like that. It's rather unlikely that wiki instructions would introduce ambiguous words, since we use them only for well-defined heading words like "Chapter" and so on. I'll do this tonight. Marnanel 16:23, 6 April 2009 (UTC)
- The error in tearing was that neither word had a description after it and the program got confused. I've updated the page accordingly. Marnanel 16:30, 6 April 2009 (UTC)
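The "second instance of 'does'" idea can be sketched roughly as follows. This is an assumption-laden illustration, not the extension's code: the word pattern, the function names and the replacement marker are all hypothetical, and the real system would write whatever disambiguation syntax the wiki uses.

```python
import re

# Hypothetical definition of a "word": a run of letters or apostrophes.
WORD = re.compile(r"[A-Za-z']+")


def find_occurrence(wikitext, word, n):
    """Return the (start, end) span of the n-th occurrence (1-based), or None."""
    count = 0
    for match in WORD.finditer(wikitext):
        if match.group().lower() == word.lower():
            count += 1
            if count == n:
                return match.span()
    return None


def disambiguate(wikitext, word, n, replacement):
    """Replace the n-th occurrence of word; fail loudly if it is not there."""
    span = find_occurrence(wikitext, word, n)
    if span is None:
        raise ValueError(f"alignment error: no occurrence {n} of {word!r}")
    start, end = span
    return wikitext[:start] + replacement + wikitext[end:]


text = "== Chapter ==\nHe does what she does."
# "DOES" stands in for the real disambiguated form, which is not shown here.
print(disambiguate(text, "does", 2, "DOES"))
```

Because only occurrences of the target word are counted, extra words produced by wiki instructions (such as the heading above) do not shift the position, unless they happen to be the same ambiguous word.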
LiveJournal
Pne has helpfully syndicated the changes to this page onto LiveJournal here. Marnanel 16:24, 6 April 2009 (UTC)
Very, very basic "suggest" function added
You can turn it on from the preferences menu. All it currently does is add a set of suffixes and see which ones exist. Later it will read Shavian:Prefill magic for the suffix list, and will be smarter and know about final e and things. This version is already being useful for me, though. Marnanel 01:05, 9 April 2009 (UTC)
- It is now cleverer and knows about doubling Latin letters for short vowels and so on. The rules it uses are Shavian:Suggest magic. Marnanel 23:11, 11 April 2009 (UTC)
New feature: bulk update
When a document contains many unknown words, it can be a nuisance to edit the pages of each one separately. So now the first dozen unknown words in a document are listed at the end of the page, with text boxes; if any of these are filled in and the button pressed, new pages will be created with the given contents each time. Prefill magic works on these boxes too, but please be extra-vigilant that it's giving you the correct answers. I hope this is as useful to you as it's being to me. Marnanel 22:16, 10 April 2009 (UTC)
Policies
I would like some policies on spelling to exist so that we can be more uniform. I invite your comments on the talk page of Shavian:Policy; everything's up for discussion at present. Marnanel 02:08, 11 April 2009 (UTC)
Words which can be proper nouns or common words but are usually one or the other
What shall we do with words which can be either proper nouns (names) or common words (verbs, nouns, adjectives, ...) but which are nearly always one or the other?
For example, for something like "Mark", having a Dab page makes sense, since either the noun/verb "mark" or the name "Mark" could reasonably be intended. But in cases such as "jenny/Jenny" or "shift/Shift", JENNY will nearly always be titlecase (the name, not a female donkey) and SHIFT will nearly always be lowercase (the verb or noun, not the name of the key).
Should we make Dab pages for such pairs, too, so that such "ambiguous" (in theory) words can be spotted more easily in texts? Or should the onus be on the people preparing or proofreading documents about donkeys or computing to spot the use of a word in the much-less-common sense?
I'd be inclined to say Dabs have a limit; after all, nearly everything can be a name (consider a fairy tale about two animals whose names are Bear and Lion... I don't want to have to make "bear" and "lion" into Dab pages just to accommodate that possibility). -- Pne 11:42, 13 April 2009 (UTC)
- Good point. I'd even wonder about whether Job should be a dab page... Marnanel 20:12, 18 April 2009 (UTC)
Substring lists
There are now some lists of all words which contain a given substring, created by the bot. Anyone can ask for a new one by editing Shavian:Substring magic. Does anyone think it would be useful to be able to restrict the substrings to the end of the word? Marnanel 20:13, 18 April 2009 (UTC)
Another thought: it might be useful to be able to list all words which ended with a given string in CMUDict and did not end with another string in our lexicon. Marnanel 21:09, 18 April 2009 (UTC)
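A rough sketch of the three queries discussed here, under stated assumptions: the lexicon is modelled as a dict from Latin spelling to Shavian spelling, CMUDict as a dict from word to a list of phoneme symbols, and all function names are hypothetical. The sample entries are illustrative only.

```python
def words_containing(lexicon, substring):
    """All words whose Latin spelling contains the substring anywhere."""
    return sorted(w for w in lexicon if substring in w)


def words_ending(lexicon, ending):
    """The same list, restricted to the end of the word."""
    return sorted(w for w in lexicon if w.endswith(ending))


def mismatched_endings(cmu, lexicon, cmu_ending, shavian_ending):
    """Words ending with cmu_ending in CMUDict whose lexicon entry
    does not end with shavian_ending."""
    n = len(cmu_ending)
    return sorted(
        w for w, phones in cmu.items()
        if w in lexicon
        and list(phones[-n:]) == list(cmu_ending)
        and not lexicon[w].endswith(shavian_ending)
    )


IF = "\U00010466"   # SHAVIAN LETTER IF
EAT = "\U00010470"  # SHAVIAN LETTER EAT

# Illustrative entries only, not copied from the wiki's lexicon.
lexicon = {
    "happy": "\U00010463\U00010468\U00010450" + IF,
    "city": "\U00010455\U00010466\U00010451" + EAT,
}
cmu = {
    "happy": ["HH", "AE1", "P", "IY0"],
    "city": ["S", "IH1", "T", "IY0"],
}
print(mismatched_endings(cmu, lexicon, ["IY0"], IF))
```

A query like the last one could serve as a consistency check, flagging entries whose final letter differs from the one chosen for the rest of that sound class.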
Other spelling reform systems
It occurs to me that we could create tables for other spelling reform systems, e.g. Deseret
- ๐=๐
- ๐=๐
- ...
and for respelling systems using the Latin alphabet
- 𐑚=b
- 𐑒=k
- ...
and then add an optional further step to transliteration so people could compare spelling reform systems (and note the obvious superiority of Shavian, of course :) ). Marnanel 22:02, 18 April 2009 (UTC)
- Interesting proposal, the mapping to other spelling reform systems. Which ones should be included, besides Deseret and Unifon? Pitman ITA, too? Others? -- Pne 13:09, 19 April 2009 (UTC)
- I'd like to include Pitman ITA, but I'm not sure it has code points even in ConScript. I'd also like to include some Latin-alphabet respelling schemes, but I don't know the details of any of them. Not sure what else-- Ewellic, maybe? Anyway, I've made a start at some mapping tables, which I'll complete later; I'm also playing with Runic and Tengwar just because it makes it possible, and it may bring people here... Marnanel 17:49, 20 April 2009 (UTC)
- Omniglot has a list of some "alternative spellings systems". I also went ahead and fleshed out the mappings for Deseret and Unifon according to my understanding. -- Pne 19:34, 20 April 2009 (UTC)
- Thank you. I've added the code, written it up a bit and made it live. There are still a couple of characters missing from Deseret and Unifon, though-- I remember 𐑼 isn't in there. Marnanel 22:35, 22 April 2009 (UTC)
What do you think about including IPA, "dictionary key spelling" ("ä" = 𐑭 as in "palm"), and the like? -- Pne 19:50, 20 April 2009 (UTC)
- I think that would be really rather useful. Is there a standard guide to it somewhere? Marnanel 22:35, 22 April 2009 (UTC)
- There's no single standard dictionary key, if that's what you mean, but each dictionary tends to have its own. The AHD key has been popular, and can be found here. As for IPA, Omniglot's article on Shavian gives one possible mapping, and the IPA article itself might be handy. WurdBendur 04:28, 23 April 2009 (UTC)
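The "optional further step" proposed at the top of this section amounts to a per-character table lookup applied to already-transliterated Shavian text. A minimal sketch, assuming a small and deliberately incomplete Latin respelling table (the wiki's actual mapping tables are not reproduced here, and `remap` is a hypothetical name):

```python
# Illustrative and incomplete; only a handful of letters are mapped.
SHAVIAN_TO_LATIN = {
    "\U00010450": "p",  # SHAVIAN LETTER PEEP
    "\U0001045A": "b",  # SHAVIAN LETTER BIB
    "\U00010451": "t",  # SHAVIAN LETTER TOT
    "\U00010452": "k",  # SHAVIAN LETTER KICK
    "\U00010468": "a",  # SHAVIAN LETTER ASH
}


def remap(shavian_text, table):
    """Map each Shavian letter through the table; pass anything else through."""
    return "".join(table.get(ch, ch) for ch in shavian_text)


# BIB + ASH + TOT
print(remap("\U0001045A\U00010468\U00010451", SHAVIAN_TO_LATIN))
```

A table for Deseret, Unifon, IPA or a dictionary key would have the same shape. Passing unmapped characters through unchanged makes gaps in a table easy to spot in the output.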
PSA: Automated updates
Automated updates, e.g. Shavian:Hundred most wanted and the substring lists, will not run until they've been switched to use the new server. Marnanel 16:32, 20 April 2009 (UTC)
Releasing the code
I am considering releasing:
- the script that uploads documents
- the hundred-words script (when it's rewritten)
- the transliterator script
- the substring-finding script
- "George", the MediaWiki extension that contains almost all the things that make this not just a vanilla MediaWiki installation. (This may need even more careful extra checking than usual for holes first.)
Does anyone want copies? Marnanel 17:51, 20 April 2009 (UTC)
Metadata on article pages
I have moved all the metadata that was previously on talk pages to article pages. The actual Shavian text should now be wrapped thus:
- {{Shaw|𐑣𐑧𐑤𐑴}}
For now it will still work if you don't, but I may turn that off eventually. Marnanel 15:49, 13 May 2009 (UTC)
New features today:
- The 12Dict wordlists from http://wordlist.sourceforge.net/ were added. This will help us get a lot of words via prefill magic, I think.
- Prefill magic now tells you how it arrived at its guess. Marnanel 23:02, 3 June 2009 (UTC)