Tuesday, November 16, 2010

Connecting the Dots

Over the weekend I read Kevin Rivette's book “Rembrandts in the Attic,” which outlines the lost value buried in distributed documents and what this underutilized intellectual property costs companies. It is a subject near and dear to my heart. But it wasn't until Sunday, when my sixth-grade son asked me to review his paper on Francis Drake's journey in the South Seas, that I connected the dots. I read how Drake and his crew explored and discovered new islands and peoples, and how they charted, documented, and mapped not only everything they found but everywhere they went. All those charts and maps got me thinking.

Why don't we do this for our information? We document the output of a hypothesis or experiment, capture the data, and if the project is abandoned or fails, we file it away, where it is often forgotten. Explorers make maps to capture what they learn so that the next visitor can find where they have been and go a little farther, learn a little more, and avoid the same mistakes. Why can't we see information the same way? Apply a few tags to an information asset to provide the "lay of the land," as it were. Capture its context, meaning, and value. It is a simple step, yet one that can make all the difference in discovering and leveraging our forgotten assets.

We are drowning in data. Berkeley researchers tell us that we generate 30% more information every year. The sequencing of the human genome over the past decade has led research centers in both the private and public sectors to place huge orders for thousands of servers and storage systems capable of handling the terabytes of new genomic, proteomic, drug, and health care data generated hourly.

Privately, we all struggle with this issue each day: finding the information we're looking for. Few industries suffer more from this data deluge than pharmaceuticals. Many gifted and well-paid scientists and engineers spend 15% of their time trawling through federated storage or file servers for the data or documents they need. Sometimes they never find them, triggering rework, redundant tests, and the loss of untold millions of dollars each year. Despite significant investments in information technology, knowledge-based pharma remains “knowledge poor” in its day-to-day operations, at every step of the value chain, from discovery through distribution.

Big data has led to flexible storage solutions that scale massively, easily, and relatively cheaply, if you call pay-as-you-go cheap. However, while the storage industry has met the challenge, pharmaceutical companies are realizing they are not making as much progress as they expected from investing in genomics, proteomics, and informatics research. They're not getting the returns on investment. It is the tumultuous world of bioinformatics that has not fully met the challenge of the genomics revolution-in-waiting.

But why are we still struggling to connect the dots?

The real challenge is that the research process itself remains personally competitive, often isolated, and widely distributed. Information exists, but it is unconnected. At a surprising number of firms, R&D teams are re-inventing the wheel, duplicating research that the company has already done and whose lessons are buried in some obscure and forgotten file. Knowledge is generated and then abandoned when research leads in a different direction.

These assets, both the data and the knowledge, remain just as isolated, distributed, and unconnected, dumped into bench-side databases or file servers. Even if they are effectively consolidated in a warehouse or content system, they remain unconnected and without context. And the sheer, growing volume of data, papers, and images makes it increasingly difficult to find a specific resource when you need it most. How we manage this information must change, and it must change before it becomes impossible or prohibitively costly to retroactively fix the error of our ways.

The knowledge about all of this information exists. These small “Rembrandts” are everywhere. On the day a piece of information is stored in a database or filed away in a digital landfill, the person who created it, the project team that worked on it, and the admins who manage it have that knowledge. They know what it is, why it was created, how it was created, and what was learned from it. Yet that knowledge quickly evaporates. People move on to other projects, get excited about something else, or leave the company. What we know about the information's context and value begins to fade, like all memory. These small “Rembrandts,” which the organization paid dearly for, are being lost every day because no one can find them. Even if someone is lucky enough to stumble upon the data or the file in a year or two, they often cannot interpret it correctly or put it in the context necessary to maximize its value.

Think of how easy it would be to apply just a few tags to that data table to make it more findable: a small description, a little provenance information, a link to a few seemingly unrelated papers to provide the missing “context”. Informational threads, human insight, and experience provided by another scientist can make all the difference in the world. But this demands that we change the way we think about information. We must view it not as the output of a project or hypothesis that was abandoned, but as what it really is: a learning process. Explorers make maps. Why don't researchers?
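
To make the idea concrete, here is a minimal sketch in Python of what "a few tags" on a data table might look like. Everything in it is an assumption made up for illustration: the `AssetRecord` structure, its field names, the `find_by_tag` helper, and the sample file paths and tags do not come from any real system. It only shows how little it takes to record a description, some provenance, and a couple of links, and then find the asset again later.

```python
from dataclasses import dataclass, field


@dataclass
class AssetRecord:
    """Hypothetical 'map entry' for one stored data asset."""
    path: str                    # where the file or table lives
    description: str             # what it is, in a sentence
    provenance: str              # who made it, and why
    tags: set[str] = field(default_factory=set)
    related: list[str] = field(default_factory=list)  # links to other work


def find_by_tag(catalog: list[AssetRecord], tag: str) -> list[AssetRecord]:
    """Return every asset carrying the given tag, ignoring case."""
    wanted = tag.lower()
    return [a for a in catalog if wanted in {t.lower() for t in a.tags}]


# Illustrative entries only; the paths, tags, and links are invented.
catalog = [
    AssetRecord(
        path="assays/run_042.csv",
        description="Binding assay results from a discontinued lead series",
        provenance="Created by the original project team before the project closed",
        tags={"binding-assay", "abandoned-project"},
        related=["internal-report-17"],
    ),
    AssetRecord(
        path="imaging/plate_07.tif",
        description="Microscopy image from a toxicity screen",
        provenance="Captured during a follow-up experiment",
        tags={"toxicity", "imaging"},
    ),
]

hits = find_by_tag(catalog, "Binding-Assay")
print([a.path for a in hits])
```

The point is not the code but the habit: a record like this takes a minute to fill in on the day the work is done, when the knowledge is still fresh.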

The visible world may be known, but the unseen world is just beginning to be explored. Why don't we see information for what it really is? An output of the exploration. Applying just a few tags to capture the context, meaning, and value of your work will make all the difference. And while that benefit may at first appear to be for someone else, it may, like karma, one day benefit you.
