basic/xml

From twext

xml searchable?

duke asks:

  • will our twext xml outputs, if online, be searchable/findable/filterable via search (ie google)?

waqas responds:

That kind of depends on the output. Keyword searching would definitely be possible. But the problem is searching for "a quoted piece of text spanning across chunks", because chunks of one language have chunks of the other language in between them.
i guess i'm asking about twext titles.. ie find me a song in portuguese w/ english twext.. maybe if we include an unchunked version of TEXT hidden in xml source the search could find chunk_spanning_TEXT?

waqas asks:

Would both texts appear (in the output) to people using limited devices? (like cell phones? or text-to-speech software, etc?). Because search engines essentially get the same view of the page as these people.
maybe include unchunked copy in xmlsource?
re: cell phones.. maybe the most important twext output.. future spec.. worth anticipating but let's start simple
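The "unchunked copy in xmlsource" idea could look roughly like the sketch below. It assumes the `<chunk>`/`<text lang='...'>` layout proposed further down this page; the `<twext>` root, the `<unchunked>` element, and both function names are made up for illustration and are not part of any agreed twext format.

```python
import xml.etree.ElementTree as ET

# Hypothetical twext source: two chunks, two languages each.
SAMPLE = """<twext>
  <chunk>
    <text lang='english'>imagine all</text>
    <text lang='french'>imagine tous</text>
  </chunk>
  <chunk>
    <text lang='english'>the people</text>
    <text lang='french'>les gens</text>
  </chunk>
</twext>"""

def unchunked(root, lang):
    """Join every chunk of one language, in document order, into one string."""
    parts = [t.text.strip() for t in root.iter("text")
             if t.get("lang") == lang and t.text]
    return " ".join(parts)

def add_unchunked_copies(root):
    """Append one <unchunked lang='...'> element (assumed name) per language."""
    langs = []
    for t in root.iter("text"):
        lang = t.get("lang")
        if lang and lang not in langs:
            langs.append(lang)
    for lang in langs:
        copy = ET.SubElement(root, "unchunked", {"lang": lang})
        copy.text = unchunked(root, lang)
    return root

root = add_unchunked_copies(ET.fromstring(SAMPLE))
```

With a copy like this in the source, a quoted phrase that crosses a chunk boundary ("all the people") exists somewhere as contiguous text. Whether a search engine actually indexes a hidden copy is a separate question this sketch does not answer.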

naming names

  • will our twext xml outputs have a unique name for such searching? xxml?
let's stick with html-like syntax. and name names later :)
ok =)

direct edit?

waqas asks: everyone would be editing twext from the interface zeen builds. but would anyone be editing the actual xml directly?

later yes or maybe.. at first, simple steps over all, principle is make system open as possible with fewest interface traps.. ie it would be nice to be able to add tags via direct text editing (pico) and a web interface.. this should be balanced somehow with a single point of truth principle (sorry that's awfully vague).. but if you ask me, it's best to start super simple, then adapt to users..
About the twext internal format. I think the focus of the format should be easy processing, and not human editability. Editing should be limited to the twexter user interface. (This is the same way wikis do it. You always edit through the wiki interface, allowing the wiki software to do special stuff like indexing, etc, in the background.) Part of this is because having text flowing mixed with multiple languages may not be the best thing to edit by hand in a plain text editor, and would likely lead to errors. What are your thoughts?
yes.. keep it simple.. bad is good

hiya waqas.. glad to see you speak martian 8P.. you asked:

As for the output and internal format, I have a few questions and ideas I'd like to discuss. Internal format: this should be XML.

8)

multilingual in single file?

Suppose we have a page of twext. Now would it contain two languages? Or more?

baby step = two languages.. ultimately, as many as individual user can handle..

For two languages it could be as simple as

<chunk title='language 2 text'>language 1 text</chunk>
 ? re: "chunk".. is this a call for an individual chunk?

But for multiple languages it would be something like:

 <chunk>
   <text lang='english'>English text</text>
   <text lang='french'>French text</text>
   <text lang='martian'>Martian text</text>
 </chunk>

Multiple languages are a bit more complicated to generate output for, since the languages need to be specified.

1. this could be job for multilingual/spec.. next phase.. should we prep, but simplify for now?
2. OmegaWiki is one database for all languages.. uses WikiData.. provides dictionary content for OLPC.. expresses interest in wixi as front end :).. OmegaWiki identifies language objects first by meaning, then classifies them by language.. this is very cool.. kids on OLPC could be free to recombine any meaningful words they like, before being trapped in categories..
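To make "the languages need to be specified" concrete, here is a rough sketch that takes a multi-language chunk and a chosen language pair, and produces the simple two-language form shown earlier. The function name `to_pair` and its parameter names are assumptions for illustration only.

```python
import xml.etree.ElementTree as ET

def to_pair(chunk, text_lang, twext_lang):
    """Reduce a multi-language <chunk> to <chunk title='twext'>text</chunk>.

    Raises KeyError if either requested language is missing from the chunk.
    """
    by_lang = {t.get("lang"): (t.text or "") for t in chunk.findall("text")}
    simple = ET.Element("chunk", {"title": by_lang[twext_lang]})
    simple.text = by_lang[text_lang]
    return simple

# The multi-language example from this page.
multi = ET.fromstring("""<chunk>
   <text lang='english'>English text</text>
   <text lang='french'>French text</text>
   <text lang='martian'>Martian text</text>
 </chunk>""")

pair = to_pair(multi, "english", "martian")
pair_xml = ET.tostring(pair, encoding="unicode")
```

The two extra arguments are the extra work: with only two languages in the file the pair is implied, but with three or more the output step has to be told which one is the text and which one is the twext.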

So the question is, would one twext file contain all languages?

individual twext file might identify the languages contained in that file, not all languages, dialects, slangs in system (there will be thousands)
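One possible way for a file to identify its own languages is sketched below: scan the chunks and record what was found on the root element. The `<twext>` root and the `langs` attribute are assumed names, not an agreed part of the format.

```python
import xml.etree.ElementTree as ET

def declare_languages(root):
    """Collect the lang values used in this file and store them on the root."""
    langs = sorted({t.get("lang") for t in root.iter("text") if t.get("lang")})
    root.set("langs", " ".join(langs))  # "langs" is a hypothetical attribute
    return langs

doc = ET.fromstring("""<twext>
  <chunk>
    <text lang='english'>English text</text>
    <text lang='french'>French text</text>
  </chunk>
  <chunk>
    <text lang='english'>More English text</text>
    <text lang='martian'>Martian text</text>
  </chunk>
</twext>""")

found = declare_languages(doc)
```

The list is derived from the chunks themselves, so it only ever names languages that actually occur in the file.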

Or would each language pair be in a separate file?

 ? can you explain ? for now, maybe for a root text like the song "imagine" there would be unique branches for each language pair, then possibly individual file "leaves" or "twigs" to enable variable chunking, alt translation.. if that's too complex for first step, then maybe simple answer is "yes" :) please don't get trapped by my ignorance of XML

xslt

External format: this can be HTML, generated through XSLT, or it can be anything else. It's pretty easy to change the output format and formatting; it's just a matter of editing a few lines of code.
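As a rough picture of the XML-to-HTML step, the sketch below does in plain Python what an XSLT stylesheet would do declaratively. It is a stand-in for XSLT, not XSLT itself, and it assumes the simple two-language chunk form. The `<twext>` root, the `span` markup, and the class names are invented for illustration.

```python
import xml.etree.ElementTree as ET
from html import escape

def chunk_to_html(chunk):
    """Render one <chunk title='twext'>text</chunk> as nested HTML spans."""
    text = escape(chunk.text or "")
    twext = escape(chunk.get("title", ""))
    return ('<span class="chunk">'
            f'<span class="text">{text}</span>'
            f'<span class="twext">{twext}</span>'
            '</span>')

def page_to_html(xml_source):
    """Render every chunk in a twext document, one chunk per output line."""
    root = ET.fromstring(xml_source)
    return "\n".join(chunk_to_html(c) for c in root.iter("chunk"))

html_out = page_to_html("""<twext>
  <chunk title='imagine tous'>imagine all</chunk>
  <chunk title='les gens'>the people</chunk>
</twext>""")
```

Changing the output format here means editing the template string in `chunk_to_html`, which is the "few lines of code" point made above.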

very nice working with you, mr. zeen :)

What do you think?

see comment above

wiki

(and I'm new to the wiki format :)

how do you like wiki?
Wikis are cool ;)


 

 
Retrieved from "http://twext.com/basic/xml"