data

From twext

data is information that is organized.. twext data compares different strings of text which represent similar chunks of meaning.. the twext method and xcroll interface help humans relate different text strings.. twexter software finds these related text strings, then formats twext translation.. the more we format twext, the more data we collect relating strings of text to intended meanings.. how do we organize this data?

gimme

what does twext data want? wtf knows? me? i'm lazy, i want machines to automagically twext my text.. i want:

  1. chunkster to chunk text
  2. machine translation to translate chunks
  3. fix interface, so i can fix automagic chunk/translation errors and teach machines to twext me better
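
the three wants above can be sketched as a toy pipeline.. this is only a guess at the shape, not how twexter actually works: `chunk`, `translate`, `fix` and the `lookup` dict are hypothetical names, the "machine translation" is just a dictionary lookup, and the spanish strings are made-up sample data..

```python
# hypothetical sketch: none of these names come from twexter itself

def chunk(line, size=2):
    """Naive chunkster: split a line into chunks of `size` words."""
    words = line.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

def translate(chunks, lookup):
    """Stand-in for machine translation: look each chunk up in a dict.
    Unknown chunks get "?" so a human can see what needs fixing."""
    return [(c, lookup.get(c, "?")) for c in chunks]

def fix(pairs, index, twext, lookup):
    """Human correction: replace one twext and remember it for next time."""
    text, _ = pairs[index]
    lookup[text] = twext  # this is the "teach the machine" part
    fixed = list(pairs)
    fixed[index] = (text, twext)
    return fixed

# made-up sample data
lookup = {"twinkle twinkle": "brilla brilla"}
pairs = translate(chunk("twinkle twinkle little star"), lookup)
pairs = fix(pairs, 1, "estrellita", lookup)
```

the point of `fix` writing back into `lookup` is that a correction made once gets reused the next time the same chunk shows up.. a real chunkster would split on meaning, not on a fixed word count..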

easy to share

  • easy to suck into and suck out of whateva database
  • especially easy for idiot humans to understand
  • easy to find and fix anywhere online
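
here's a minimal sketch of what "easy in, easy out, easy to fix" might look like with json.. the record layout and field names (`text_lang`, `twext_lang`, `chunks`, `text`, `twext`) are assumptions for illustration, not an agreed twext format, and the sample strings are made up..

```python
import json

# hypothetical record layout: field names are made up for illustration
record = {
    "text_lang": "en",
    "twext_lang": "es",
    "chunks": [
        {"text": "twinkle twinkle", "twext": "brilla brilla"},
        {"text": "little star", "twext": "estrellita"},
    ],
}

# suck out: indent + ensure_ascii=False keeps it readable for humans
dumped = json.dumps(record, indent=2, ensure_ascii=False)

# suck in: any json parser gets the same structure back
loaded = json.loads(dumped)

# fix anywhere: correct one chunk and dump again
loaded["chunks"][1]["twext"] = "pequeña estrella"
fixed = json.dumps(loaded, indent=2, ensure_ascii=False)
```

keeping each chunk as its own small object means a fix touches one text/twext pair and nothing else..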

make it mine

  • learn every word i know
  • and how well i know it
  • gimme twext where i need it

bla bla bla.. first baby step is lyric chunkster between english/español, and json data for now and maybe dodo data later..

if we want better automagic chunking and machine translation (between any of many languages), then maybe we want to be sharing data.. we want as much corpus as possible available for statistical analysis/improvement of automagic chunking/translation..


  • adding more, independent data usually beats out designing ever-better algorithms to analyze an existing data set...

Martin Langhoff keynoted at consol, stressing data design as key to scalable software.. he said twexml'd be way slow, and poo-poo'd json (why? (: ) but did show interest in an embeddable twexter for moodle and/or the xo.. i asked how we can share twext data between separate embeds but he said let google figure it out.. sounds good to me but yeah right, as if google cares..

anyway, for now we don't need to scale.. we do need to test something simple to see if/how twexter works, and we do wanna explore data in the wild, i.e. twexml, dodo or whatever.. Gabriel recommends json, says it's easy, so let's give it a try and see what's up

Retrieved from "http://twext.com/data"