[TriLUG] OT: Standardized Data Crunching

Tom Roche Tom_Roche at pobox.com
Mon Feb 11 15:28:28 EST 2008


Tom Eisenmenger Mon, 11 Feb 2008 7:35:28 -0800
 >> it could just be that I'm naive but having a variety of computer
 >> models, but it seems to me that providing public data in a
 >> standardized format (XML would probably be fine) and, more
 >> importantly, providing computer modeling horsepower (of a wide
 >> range of sophistication) through a standardized API, all freely
 >> available on the net, would be a no-brainer.

Tom Roche Mon Feb 11 12:51:18 EST 2008
 > OK, I'll say it: you're naive :-) Not that this isn't a good
 > idea--it's a great idea, it's just not a no-brainer. There's a
 > tremendous "legacy" out there, of both tools and data. Just
 > providing secure open access to what we have now would be a helluva
 > lotta work

And, just to clarify, I'm not saying this would be brain surgery, just
a helluva lotta work (essentially linear in the number of {tools, data
sources, computing sources}). Hence I suggest that you'd really need
talent in "development" in the organizational sense (i.e. recruiting
partners and funds), not the software sense. At least at the
handwaving level, the solution seems to me to be façade, façade, façade:

* generally: the less work you create for your partners, the more
   compliant they will be. So hide their {schema, API} and map new ones
   to them.

* data: create a virtual XML repository, map XQuery calls on that to
   "real" calls on the "real" data. The fun work will probably be
   creating the ontology (for all scientific data? hmm ... maybe it's
   been done already). The rest will be sheer brute force (maintenance
   will be a chore, too), but naturally partitionable.

* tools: map calls on your service to calls on the "real" tools/API.
   Lotta ways to do this, too, but ISTM it screams "web service."

* computing: you'd want a real grid specialist to start this up, given
   the workloads you'd need to handle, but you could at least get
   started with something like World Community Grid. That being said,
   there's now a lot more competition in that space than when SETI@home
   started, and those cycles ain't really free.
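
To make the façade idea a bit more concrete, here is a minimal Python
sketch of the data case. Everything in it is made up for illustration:
the two "legacy" sources, their record shapes, the adapter functions and
the DataFacade class are all hypothetical, and a real version would sit
in front of XQuery or a web service rather than plain function calls.
The point is only that partners keep their own schema and the mapping
lives on our side.

```python
from typing import Callable, Dict, List

# Hypothetical legacy sources, each with its own schema and call style.
def legacy_station_csv(site: str) -> List[str]:
    # Pretend partner A hands back CSV rows of "site,temp_f".
    return [f"{site},68.0", f"{site},71.6"]

def legacy_sensor_records(location_id: str) -> List[dict]:
    # Pretend partner B hands back dicts with Celsius readings.
    return [{"loc": location_id, "celsius": 20.0}]

# Adapters: map each legacy shape onto one common record {site, temp_c}.
def adapt_csv(site: str) -> List[dict]:
    out = []
    for row in legacy_station_csv(site):
        name, temp_f = row.split(",")
        temp_c = round((float(temp_f) - 32) * 5 / 9, 1)
        out.append({"site": name, "temp_c": temp_c})
    return out

def adapt_records(site: str) -> List[dict]:
    return [{"site": r["loc"], "temp_c": r["celsius"]}
            for r in legacy_sensor_records(site)]

class DataFacade:
    """One query interface in front of many differently shaped sources."""

    def __init__(self) -> None:
        self._adapters: Dict[str, Callable[[str], List[dict]]] = {}

    def register(self, source: str, adapter: Callable[[str], List[dict]]) -> None:
        # Adding a partner costs one adapter; the partner changes nothing.
        self._adapters[source] = adapter

    def query(self, site: str) -> List[dict]:
        # Fan the call out to every source and tag each record with its origin.
        results = []
        for source in sorted(self._adapters):
            for rec in self._adapters[source](site):
                results.append({"source": source, **rec})
        return results

facade = DataFacade()
facade.register("partner_a", adapt_csv)
facade.register("partner_b", adapt_records)
for rec in facade.query("RDU"):
    print(rec)
```

Each new source costs one adapter plus one register() call, which is
where the "linear in the number of sources" work comes from, and the
adapters can be written and maintained independently of each other.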

Doable, and you could start small and grow it--hell, you'd *hafta*
start small and grow it.

FWIW, Tom Roche <Tom_Roche at pobox.com>



