structured text [Dec. 13th, 2002|01:10 am]
LogJam
logjam
[evan]

One of the features I’ve wanted for LogJam 4.1 is what is known as “structured text” in the Python world, though I’m sure it has other names elsewhere. I’ve implemented a really basic version in CVS for now, and I plan to flesh it out once I get some feedback.

The idea is to be able to type in an expressive manner without worrying about the format. HTML is easy to learn but cumbersome to type when you’re trying to just write. I was also inspired by eru to include Unicode quotes and dashes in the way TeX does it.
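
As a rough illustration of the TeX-style punctuation idea (not LogJam's actual code, which is C), the substitution could be sketched in Python as below. The function name `tex_typography` and the replacement table are assumptions made up for this sketch; the only thing taken from TeX is the convention itself (`` and '' for double quotes, -- and --- for en and em dashes).

```python
# Hypothetical sketch: TeX-style ASCII sequences -> Unicode punctuation.
# Longer sequences come first so that "---" is not consumed as "--" + "-".
TEX_REPLACEMENTS = [
    ("---", "\u2014"),  # em dash
    ("--", "\u2013"),   # en dash
    ("``", "\u201c"),   # left double quote
    ("''", "\u201d"),   # right double quote
    ("`", "\u2018"),    # left single quote
    ("'", "\u2019"),    # right single quote / apostrophe
]


def tex_typography(text: str) -> str:
    """Apply each replacement in order; a lone "-" is left untouched."""
    for ascii_seq, unicode_char in TEX_REPLACEMENTS:
        text = text.replace(ascii_seq, unicode_char)
    return text


if __name__ == "__main__":
    print(tex_typography("``Wait---pages 3--5,'' she said."))
```

A single hyphen-minus passes through unchanged, which is exactly the open question about the minus sign raised below.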

You can read README.structuredtext for more details. But for now, I’m curious if anyone will find this useful and if anyone wants any additional functionality in it.

I also have a minor question: since we don’t have a TeX-like math mode, there is currently no way to type a minus sign. I’ve left the ambiguous hyphen-minus character alone, but I’d like to use the proper hyphen and minus characters at some point. How should I do it?


Comments:
From: crackedview
2002-12-13 04:51 am (UTC)
I can't help with the minus sign problem, but I love the structured text stuff and I'm glad to see it making its way into LogJam.
From: evan
2002-12-13 11:47 am (UTC)
Do you use it elsewhere? Whose software?
From: jace
2002-12-13 12:54 pm (UTC)
I use structured text all the time with ZWiki. Structured text rocks, but it belongs in LiveJournal's renderer, not in LogJam. I want the ability to edit old posts and still see them in structured text.
From: evan
2002-12-13 01:18 pm (UTC)
We discussed adding it to LJ a long time ago but decided against it (though I can't remember why).

With LogJam, I allow it to convert both ways. So you can load up an old post, flip it into structured text mode, and get your structured text back. :)
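
To make the "both ways" idea concrete, here is a small Python sketch of a round trip between markers and HTML tags. It is only an illustration under assumptions: the marker table (`*`, `/`, `_`) is borrowed from suggestions elsewhere in this thread, and the names `to_html` and `from_html` are hypothetical, not LogJam functions.

```python
import re

# Hypothetical marker <-> tag table; LogJam's real markers are not given here.
MARKERS = {"*": "b", "/": "i", "_": "u"}
TAGS = {tag: marker for marker, tag in MARKERS.items()}

# A marker must start a word and its partner must end one, so that
# paths such as /usr/bin/log are left alone.
MARKED = re.compile(r"(?<!\S)([*/_])([^*/_\n]+)\1(?![^\s.,;:!?])")
TAGGED = re.compile(r"<(b|i|u)>(.*?)</\1>")


def to_html(text: str) -> str:
    """Structured text -> HTML."""
    def repl(m):
        tag = MARKERS[m.group(1)]
        return "<%s>%s</%s>" % (tag, m.group(2), tag)
    return MARKED.sub(repl, text)


def from_html(html: str) -> str:
    """HTML -> structured text (the inverse, for loading an old post)."""
    def repl(m):
        marker = TAGS[m.group(1)]
        return marker + m.group(2) + marker
    return TAGGED.sub(repl, html)


if __name__ == "__main__":
    post = "this is *very important* and /subtle/"
    print(to_html(post))
    print(from_html(to_html(post)) == post)
```

The round trip only holds for HTML that the forward conversion could have produced; arbitrary hand-written HTML in an old post would need more cases than this sketch covers.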
From: hajime
2002-12-13 05:20 am (UTC)

Hooray for something more expressive than "preformatted".

Structured text, yes please. Re: minus sign, it's possibly not the greatest way to do it, but TeX-style -- for endash and --- for emdash, - for minus sign? That's how I write these days anyway, it seems.
From: ntang
2002-12-13 05:32 am (UTC)
What about // for italics (or ) and ** for bold (or )? I prefer that, it's more obvious/instinctive to me.

I think it's potentially great, since I tend to type using _things_ *like* /this/ for emphasis normally as I'm used to typing in an all-text environment, and it'd let me continue doing that but get actual html formatting out of it. Hell, I wish it was an option for text fields all over the place (and not just in LJ!). :)
From: compwiz
2002-12-13 09:49 am (UTC)
Yeah, I'd agree with that one. Also _text_ for underlining, assuming there's an option to turn these off.
From: ntang
2002-12-13 11:07 am (UTC)

Oh hell, I forgot to escape my gts and lts. D'oh.
From: evan
2002-12-13 11:39 am (UTC)
I'm torn between using those (which also make more sense to me) and compatibility with the original structured text. But I guess I've already broken compatibility by adding all these other fun options, so I can go that way.
From: ntang
2002-12-13 12:24 pm (UTC)

That's the spirit. :) They were stupid and broke compatibility with years of established practice, so that's their fault.
From: xaosenkosmos
2002-12-13 11:25 pm (UTC)
wrt /italics/, uh, how do we type directory structures that way? It was my first thought ("why no italics? oh, duh."). i think part of the point of this is that we just naturally type with this, and therefore don't have to think to put in html tags. but having to do \/home\/jbm\/foo is just not natural (unless it's preceded by sed -e 's/...)
From: evan
2002-12-14 12:08 pm (UTC)
That's what stopped me, too, but it's common to have a "verbatim" environment (what if I wanted to use an asterisk?). Whether verbatim always means in a monospaced font... then I'd write like this:
Now run a command like ``#ls /usr/bin/log*#'' to see if it installed.
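
One way to read the example above is that `#...#` marks a verbatim span that the other rules must not touch. The Python sketch below shows that idea only; the `<tt>` output, the `render` and `emphasize` names, and the bold-only placeholder formatter are all assumptions, not anything LogJam is known to do.

```python
import html
import re

# Assumption: "#...#" delimits a verbatim span, as in the example above.
VERBATIM = re.compile(r"#([^#\n]+)#")


def emphasize(text: str) -> str:
    """Placeholder formatter: handles only *bold*, for illustration."""
    return re.sub(r"\*([^*\n]+)\*", r"<b>\1</b>", text)


def render(text: str) -> str:
    # re.split with one capture group alternates: plain, verbatim, plain, ...
    parts = VERBATIM.split(text)
    out = []
    for i, part in enumerate(parts):
        if i % 2:
            # Verbatim contents: escape for HTML, skip all markup rules.
            out.append("<tt>" + html.escape(part) + "</tt>")
        else:
            out.append(emphasize(part))
    return "".join(out)


if __name__ == "__main__":
    print(render("Run #ls /usr/bin/log*# for *all* files"))
```

Pulling the verbatim spans out first means the asterisk and the slashes inside the command never reach the emphasis rules at all.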
From: xaosenkosmos
2002-12-14 03:28 pm (UTC)
This is just an ugly problem space. You could just make it "[space-ish char][emphchar][not-space]+[emphchar-from-before][space-ish char]", but then what if you wanted to do "this is *very important*"?

How do we disambiguate what the user meant to do... (See Also: "DWIMNWIS: Do What I Mean, Not What I Say; Bringing 24th century computing to today!", by Ida Listic)

So, my thoughts are to just go with <space-ish><delim><non-delim+><delim><space-ish>, where <space-ish> ⊂ {' ', '\t', '\n', '?', '.', ...}.

Anyway, here's some slightly hokey perl code to do the above method. (if something insists on hosing the code despite my pre's, please head to http://www.joshisanerd.com/~jbm/lj/emphtext.pl and look at it there.)

#!/usr/bin/perl
#
# jbm's bad attempt at structured text
# public domain (as if it's worth it) josh@joshisanerd.com
use strict;
use warnings;

# Customizable bits
#
# NOTE: the tag strings were eaten when this page was rendered as HTML.
# The b/i/u tags below are reconstructed from the thread (bold, italics,
# underline). The '^' entry is left empty because nothing here says what
# it mapped to.
my %emph = (
        '*' => ["<b>", "</b>"],
        '/' => ["<i>", "</i>"],
        '^' => ["", ""],
        '_' => ["<u>", "</u>"]
        );

my @delim_chars = qw/. ? ! " '/;

# Turn those into perl REs that make sense.
# (\Q...\E is not processed inside a single-quoted string, so the
# characters are escaped with quotemeta instead.)
my $emphchars = quotemeta(join("", keys(%emph)));

my $delim_chars = quotemeta(join("", @delim_chars)) . '\s';

# grab input.
undef($/);
$_ = " " . <>;

#the magic.
while(m!([$delim_chars])([$emphchars])([^$emphchars]+)\2([$delim_chars])!s) {
    my $od = $1;  # opening delim
    my $ec = $2;  # emphasis char
    my $text = $3;
    my $cd = $4; #close delim
    my $e_open = $emph{$ec}[0];
    my $e_close = $emph{$ec}[1];

    #print "Got: $text inside \"$ec\" ($od, $cd): $e_open, $e_close\n";

    # isn't this wonderfully legible?
    s/\Q$od$ec$text$ec$cd\E/$od$e_open$text$e_close$cd/g;
}

# Add <p>s for good measure
s/\n{2,3}/\n<p>\n/sg;

# Bam!
print;

From: evan
2002-12-15 12:53 am (UTC)

I'm writing C code, but thanks.
From: frew
2002-12-13 08:07 am (UTC)

That would be awesome.
From: colin
2002-12-13 02:02 pm (UTC)
In my opinion, if you're using space–hyphen–space, you're doing something wrong, so " - " could represent space–minus–space. (In Word, typing space–hyphen–space gets you space–en dash–space, and the minus glyph is almost always similar in length to the en dash glyph, so the results would be pretty similar, even though the meaning is lost.)

I mean, you could have a simple math mode, or even just some trigger that says, "the next character is a math character". There are a bunch of other math constructs that could be converted, such as:

- → −
+/- → ±
x → ×
<= → ≤
>= → ≥
!= → ≠

Although basically all these conversions (except x → ×) could be made in the text body as well.

Adding features without overcomplicating things is tough.
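
The conversion table above, combined with a simple math mode, could be sketched in Python roughly as follows. The `$...$` delimiter is an assumption borrowed from TeX, the names `convert_math` and `render_math` are hypothetical, and the rule that `x` only becomes a times sign between digits or spaces is an extra guess added so that a variable named x at the start of an expression survives.

```python
import re

# Table from the comment above; longer sequences first so that "+/-"
# is handled before the bare "-".
MATH_REPLACEMENTS = [
    ("+/-", "\u00b1"),  # plus-minus
    ("<=", "\u2264"),   # less than or equal
    (">=", "\u2265"),   # greater than or equal
    ("!=", "\u2260"),   # not equal
    ("-", "\u2212"),    # true minus sign
]

# Assumption: "x" means multiplication only between digits or spaces.
TIMES = re.compile(r"(?<=[\d\s])x(?=[\d\s])")

# Assumption: math mode is delimited TeX-style by $...$ on one line.
MATH_SPAN = re.compile(r"\$([^$\n]+)\$")


def convert_math(expr: str) -> str:
    for ascii_seq, symbol in MATH_REPLACEMENTS:
        expr = expr.replace(ascii_seq, symbol)
    return TIMES.sub("\u00d7", expr)


def render_math(text: str) -> str:
    """Convert only inside $...$; hyphens in ordinary prose are untouched."""
    return MATH_SPAN.sub(lambda m: convert_math(m.group(1)), text)


if __name__ == "__main__":
    print(render_math("a well-known fact: $3 x 4 - 2 >= 10$"))
```

Keeping the conversions inside an explicit span sidesteps the ambiguity of the hyphen in running text, at the cost of one more piece of syntax to remember.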
From: mindlace
2002-12-25 02:30 pm (UTC)

reStructuredText

It would be very cool if you would consider using reStructuredText as it is the most intuitive flavour of structured text I've come across. It has the *most powerful* merit of not requiring you to mess up your paragraphs with URLs.

(this really does need to be in LJ itself, but I'd love to see it in logjam.)