25 September 2010

Burrowing into the bran tub

Intelligence organs of a fairly small but technologically proficient country managed to clone the email and document bases of a fairly large but unsuspecting company with an active and diverse research operation. Not the research results bases themselves; just the email and document archives. The immediate benefits of the theft were obvious, mundane, and irrelevant here. More interesting from a scientific computing viewpoint (if no more morally or legally defensible) were the spinoffs from applying statistical information mining techniques to the combined database contents.

As a direct result of the operation, other companies, seen as friendly to the small country's interests, received repackaged and anonymised material suggesting productive new lines of enquiry. Patents resulting from these have already been filed; others are in the pipeline. Significantly, the large company that was the target of the operation remains unaware of the opportunities lying within components of its own activities that have never been brought together.

The story illustrates a truth: that much knowledge is locked away in information stores assembled for one set of reasons and never reexamined in other ways.

Less melodramatically, and less dubiously, information openly published on the internet forms a huge field within which to prospect potential information seams - "The low user entry barrier of the Web has resulted in massive amounts of unstructured and weakly structured data referring to objects, concepts, user interests and communities", to quote the Digital Enterprise Research Institute at Galway. As SAS's David Smith points out, tweets and blog entries can contain pointers to early identification of potentially vital phenomena. This is an aspect of what is known in the industry as pharmacovigilance, which "can be defined as a set of practices aiming at the detection, understanding and assessment of risks related to the use of drugs in a population, and the prevention of consequential adverse effects [or] in a narrower sense ... postmarket surveillance"[1].

Dr. C said...

Whew! Trying to understand this is like those guys in "His Master's Voice" trying to understand the "other." Speaking of which, what have been the results of applying these techniques to SETI?