data
data is information that is organized.. twext data compares different strings of text which represent similar chunks of meaning.. a twext method and xcroll interface help humans relate different text strings.. twexter software finds these related text strings, then formats twext translation.. the more we format twext, the more data we collect which relates strings of text and intended meanings.. how do we organize this data?
gimme
what does twext data want? wtf knows? me? i'm lazy, i want machines to automagically get my text twext.. i want:
easy to share
make it mine
bla bla bla.. first baby step is a lyric chunkster between english/español, with json data for now and maybe dodo data later.. if we want better automagic chunking and machine translation (between any of many languages), then maybe we want to be sharing data.. we want as much corpus as possible available for statistical analysis/improvement of automagic chunking/translation
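a minimal sketch of what json twext data for the lyric chunkster could look like.. this is a guess, not a twext spec: the field names (text_lang, twext_lang, chunks, text, twext), the helper make_twext_record and the sample lyric chunks are all made up for illustration.. the idea is just that each chunk of text gets paired with the twext chunk that carries the same meaning:

```python
import json


def make_twext_record(text_lang, twext_lang, chunk_pairs):
    """Pair each chunk of text with the twext chunk of similar meaning.

    The record layout is an assumption for illustration, not a twext format.
    """
    return {
        "text_lang": text_lang,
        "twext_lang": twext_lang,
        "chunks": [{"text": t, "twext": w} for t, w in chunk_pairs],
    }


# Hypothetical english/español lyric chunks.
record = make_twext_record(
    "en",
    "es",
    [
        ("twinkle twinkle", "brilla brilla"),
        ("little star", "estrellita"),
        ("how i wonder", "cómo me pregunto"),
    ],
)

# ensure_ascii=False keeps accented characters readable in the json text.
encoded = json.dumps(record, ensure_ascii=False, indent=2)

# Round trip: anyone who receives the json string gets the same pairs back.
decoded = json.loads(encoded)
```

a flat list of pairs like this is easy to share and easy to merge into a bigger corpus later.. if dodo or twexml turns out better, the same pairs could be written out in that format instead.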
Martin Langhoff keynoted at consol, stressing data design as key to scalable software.. he said twexml'd be way slow, and poo-poo'd json (why? :)) but did show interest in an embeddable twexter for moodle and/or the xo.. i asked how we can share twext data between separate embeds but he said let google figure it out.. sounds good to me, but yeah right, as if google cares.. anyway, for now we don't need to scale, we do need to test simple to see if/how twexter works, and we do wanna explore data in the wild, ie twexml, dodo or whatever.. Gabriel recommends json, says it's easy, so let's give it a try and see what's up

