I've made a simple patch for LogJam with "typographics" support, including "Cyrillic mode" quotes.
But this patch is incomplete and not usable in a "production" environment. You can see an example of the output at http://www.livejournal.com/users/bacek/2003/09/19/
If you are interested in this feature, I can give you this patch "AS-IS" :)
I have an AS-IS one of my own. But it would be interesting to take a look.
2004-06-24 02:04 pm (UTC)
user-defined text replacement list is more elegant method
Better then to make an autoreplace list
and put entries like these in it:
(c) -> ©
(r) -> ®
=) -> <img src="blablabla/smile.png">
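A minimal sketch of such a list, in Python purely for illustration (LogJam does not work this way as far as this thread shows; the names `REPLACEMENTS` and `apply_replacements` are made up here). It treats each entry as a plain substring replacement, applied in order:

```python
# Hypothetical autoreplace table; the entries mirror the examples above.
REPLACEMENTS = [
    ("(c)", "\u00a9"),                          # copyright sign
    ("(r)", "\u00ae"),                          # registered sign
    ("=)", '<img src="blablabla/smile.png">'),  # placeholder path from the post
]


def apply_replacements(text, table=REPLACEMENTS):
    """Apply each (pattern, replacement) pair in order as a literal substring swap."""
    for pattern, replacement in table:
        text = text.replace(pattern, replacement)
    return text


if __name__ == "__main__":
    print(apply_replacements("LogJam (c) 2004 =)"))
```

Because the entries are applied in order, a replacement that itself contains a later pattern would be rewritten again, so a real list would need some rule about ordering or single-pass matching.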
2004-06-24 02:20 pm (UTC)
Re: user-defined text replacement list is more elegant method
The third item will look pretty bad in an email. Better to replace it with a Unicode smiley.
2004-06-24 02:53 pm (UTC)
The user may be of a different opinion: they may badly want more than two kinds of smileys ;)
Or he or she may want to insert (such people do exist) little roses (@->-) and will ask, say, an older brother to make them come out as pictures or something like that.
Configurable autoreplace is sometimes terribly convenient! =)
2004-06-25 10:48 am (UTC)
A substitution file? Good idea. I'll have to think about it.
2004-06-24 03:26 pm (UTC)
Re: user-defined text replacement list is more elegant method
Some ICQ and Jabber clients handle this as follows: while you are writing, you see everything as you typed it, and when you send, the autoreplace runs and a picture is sent instead of :-). This isn't always good, but in this case the idea may work.
And overall it's a very much needed thing. I suffer terribly typing all these wonders by hand.
2004-06-24 03:41 pm (UTC)
My Typographical Wishlist
I don’t have Logjam installed on this machine (just my desktop at home), so I can’t check if any of this behavior is already implemented, so some of it may be redundant.
- Quotation Marks: Replace " with “ or ” and ' with ‘ or ’ as appropriate. Generally speaking, we want the left (opening) version when the character follows whitespace, and the right (closing) version when it precedes whitespace. When it has whitespace on both sides, it should probably not be replaced, as user intent is unclear. Exception: when ' or " immediately follows a numeral (0123456789) or a fraction (see below), it should be replaced by ′ or ″, respectively.
- Hyphens/Dashes: A single hyphen with whitespace on both sides should be replaced with –. Two consecutive hyphens should be replaced with —.
- Ellipses: Replace ... with ….
- Fractions: When they have whitespace on both sides, replace 1/2, 1/4, and 3/4 with ½, ¼, and ¾, respectively. It would be nice to have less-common fractions created automatically as well, using ¹, ¹, ¹, and <sup> as appropriate in denominators, then ⁄, then <small> for the numerator, but that seems like more of a pain in the neck to implement well, especially as it probably wouldn’t see much use and most people probably don’t care.
- Small Caps: It would be helpful for me if Logjam made it easy to use small caps without having to type out <span style="font-variant: small-caps">, but maybe I’m the only one who finds them useful.
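For what it's worth, the first four items can be sketched with a handful of regular expressions. This is only an illustration in Python, not anything Logjam does; the function name `typographize` and the rule order are my own assumptions, and the rules are taken literally from the list above (so an apostrophe inside a word, or a closing quote followed by a comma, is left untouched):

```python
import re

# The three common fractions named in the wishlist.
FRACTIONS = {"1/2": "\u00bd", "1/4": "\u00bc", "3/4": "\u00be"}


def typographize(text):
    """Apply the wishlist rules literally; an illustrative sketch, not Logjam code."""
    # Fractions: only when whitespace (or the text boundary) is on both sides.
    text = re.sub(r"(?<!\S)(1/2|1/4|3/4)(?!\S)",
                  lambda m: FRACTIONS[m.group(1)], text)
    # Primes: ' or " immediately after a digit or a fraction character.
    text = re.sub(r"(?<=[0-9\u00bd\u00bc\u00be])'", "\u2032", text)
    text = re.sub(r'(?<=[0-9\u00bd\u00bc\u00be])"', "\u2033", text)
    # Opening quotes: preceded by whitespace, followed by non-whitespace.
    text = re.sub(r'(?<!\S)"(?=\S)', "\u201c", text)
    text = re.sub(r"(?<!\S)'(?=\S)", "\u2018", text)
    # Closing quotes: preceded by non-whitespace, followed by whitespace.
    text = re.sub(r'(?<=\S)"(?!\S)', "\u201d", text)
    text = re.sub(r"(?<=\S)'(?!\S)", "\u2019", text)
    # A lone hyphen between whitespace becomes an en dash.
    text = re.sub(r"(?<=\s)-(?=\s)", "\u2013", text)
    # Two consecutive hyphens become an em dash.
    text = text.replace("--", "\u2014")
    # Three dots become a single ellipsis character.
    text = text.replace("...", "\u2026")
    return text
```

Quotes with whitespace on both sides match neither the opening nor the closing pattern, so they stay as typed, which is the "intent is unclear" case.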
2004-06-24 03:45 pm (UTC)
Re: My Typographical Wishlist
Oops, make that “¹, ², ³.”
2004-06-25 10:55 am (UTC)
Re: My Typographical Wishlist
Thanks for such a detailed reply. From all of the above I conclude that a user-defined substitution file will do the job just right.
2004-06-24 04:58 pm (UTC)
element versus entity
I thought the Q element was supposed to take care of matching quotes in HTML.
2004-06-24 09:05 pm (UTC)
Re: element versus entity
I hate IE too.
What does "фича" mean? I pray it's not what I think it is.
"Feature". What did you think it was?
No... that's exactly what I feared it was.
It is something the Russian people have been doing for centuries, though it was never as rampant as it is these days. While all cultures borrow words from other languages and perhaps modify them slightly to fit into the flow and style of their own language, Russians have a tendency not to bother finding out what the real word is and simply take what they hear. In this case, somebody probably spoke to a low class Brit or a Yankee.
This behavior irritated me since I was about 4, when my dad told me about a Russian sailor term (sorry.. I don't have a Russian keyboard) "Rindu bey" coming from English "Ring the bell".
Btw, such literal borrowing by Russians also means that (on average) they are liked by other nations more than other people are. There's no objective need to invent your own word if there is a foreign one with a unique meaning, although some people may want to invent one (mostly for subjective reasons, to show that they are unique).
Hi there, my dear karate lover. Why haven't your friends invented a brave new English word for kumite, makiwara, kata, dan, etc.? Why do the Americans drink tequila instead of Mexican liquor made from fermented juices of an Agave plant? Why do you use the word samurai instead of knight
instead of animation
I will tell you why.
Because you want to be understood, and understood right. If you know what the word katana means and you know your entire audience does too, why bother with Japanese longsword and run the risk of being misunderstood? (Somebody might think of a bokken.)
And you are not alone. I do not taunt your language or defend mine. I say that almost every person behaves this way. When a human encounters a new concept, he needs a word to describe it. If there is no such word, he invents one. If there is one, that one is used. You see a native American lodge frequently having an oval shape and covered with bark or hides and you ask "what is it?" They say "wigwam". And your language adopts a new word.
You see a large rigid dirigible and ask "what the..?" It is a zeppelin. Even when you learn that it was actually a name, you do not want to change it. Why would you? Everyone understands.
The only reason to throw away such words is that you want your kids to learn the right words. Or those you think are right. Many governments try to cleanse the language from time to time. Sometimes they even do not fail completely.
Well, do you really think that Italian programmers use an Italian word for byte? Or Chinese ones? I do not think so.
P.S. Feel free to correct my English. I will be grateful for the chance to improve it.
Actually... as I stated, I have absolutely no problems with languages borrowing words from other languages... and if you read my post carefully, even changing them around to sound more like the language they are being incorporated into. My problem is Russians tend to go with what they hear. Kind of like taking the word "katana" from your example and adopting it as "kurana" - which is at least what I hear when Japanese speakers actually pronounce it in their speech.
To put it plainly, I would have no problem if Russian adopted the word "ficher" and even change it around to be "fiCHER" with the accent on the last syllable - which I think would make it fit Russian flow a little better than "FIcher".
And finally, I believe "feature" translates perfectly into actual Russian as "harakteristika" - sorry... transliteration isn't my strongest and it's been a long time since I actually used Russian. But I'm not going to argue that point as the words may be different in some subtle ways which I'm not aware of.
Well, we adapt all the foreign words. We simply can't use English words without some adaptation because we have many grammatical cases and we do not use sounds like th, w, ng. Actually, the word "ficher" mutated into "fichar".
And then there are suffixes. :) I once even heard "фичастость". It means feature richness.
Well.. that last part makes sense. Once the word makes its way into the language it's only logical that it gets conjugated by the language's own rules.
Well, there is an entire community about the adaptation of English words by Russian immigrants in America: runglish. Very funny. "Вам послайсить или одним писом?"
(Should I slice it, or will you take it in one piece?)
Is it possible to get a secure login (one that would do fine over wireless in a coffeehouse) with logjam? The website has an https://www.livejournal.com login; just wondering if logjam supports something similar.
Am I really the only person who feels we should (in locales where this is reasonable, of course,) limit ourselves to straight ASCII, with a little bit of markup thrown in for structure? I'm really not interested in typesetting my journal, thanks. I don't even have preview compiled into logjam.
*sigh* Okay, I'm done complaining about it. As you were.
2004-06-27 06:40 am (UTC)
(responding late, but i'll respond here because i sort of agree with you)
I think there are three valid text markup behaviors:
- plain text
- TeX-style markup, where - is a hyphen, -- an en-dash, and --- an em-dash. This is necessary if we want to support the whole range of possible characters.
- a structured-text-style markup, where (for example) "--" is an em-dash, because that is what we type in ASCII when we want an em-dash. Similarly, we match the '"' character into real quotes, etc.
We already have the first behavior. I believe the second behavior can be done by properly-configured keyboards-- see my page on X input. The third behavior is partially implemented, in the file "structuredtext.[ch]" available in the arch repository. I haven't finished it yet, though.
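To make the difference between the second and third behaviors concrete, here is a rough sketch of just the dash handling. It is written in Python for brevity and is not taken from structuredtext.[ch]; the function names `tex_dashes` and `structured_dashes` are invented for this example:

```python
EN_DASH = "\u2013"
EM_DASH = "\u2014"


def tex_dashes(text):
    """TeX-style: '---' is an em dash, '--' an en dash, '-' stays a hyphen."""
    # Replace the longest sequence first so '---' is not consumed as '--' plus '-'.
    return text.replace("---", EM_DASH).replace("--", EN_DASH)


def structured_dashes(text):
    """Structured-text style: '--' is what people type in ASCII for an em dash."""
    return text.replace("--", EM_DASH)
```

The TeX-style version can express every dash but asks the writer to know the convention; the structured-text version guesses at intent and has no spelling for an en dash at all.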
There are ASCII purists, and then there are typography purists. Some of us grew up behind terminals, others behind early editions of Crime and Punishment. My experience has been an odd mix of the two.
On one hand, I like the simplicity of ASCII. I used to dabble in BASIC programming in grade school, and at one point, I had ASCII sequences memorized (so I could poll for particular keystrokes and the like). For this reason, it will always hold a special place in my heart. On the other hand, I’ve also had extensive typography experience, and I tend to notice little details like proper quotes, primes, and even hairline spaces set on either side of an em dash. Sure it seems like flourish, and it certainly was in the days of dumb terminals. But we need not limit ourselves; display quality is getting to the point where reading long passages of text on screen is actually remotely bearable. Proper typography goes a long way towards achieving that ideal.
All that said, I’m all for sticking to straight ASCII in email. Marking it up in HTML is just silly.