grey data rising

18 Feb 2005

Academic papers are published in journals; if you want to read them, you mostly still have to buy expensive access to the journals or have someone buy it for you. Music is distributed nicely organized on CDs, or with helpful commentary on the radio, or value-added-free by filesharing networks. General scientific knowledge comes from classroom learning or asking your colleagues.

That’s still mostly the case. But it isn’t always the case. Search capacity and a tendency for people to put data in public or semi-public internet space are loosening the rules. If you run a journal search and find a paper that you want to read, but you don’t have access to that journal, try some google-fu to track down the researcher’s home institution, then use its staff directory to find the researcher’s personal staff page (or maybe even google it directly, though this is still tricky), then see if they have a “grey” copy of the paper sitting there. If they don’t, try their co-author. The fact that Google now has an academic-specific search just makes finding out about the article in the first place that much easier. Is it “legal” for the researcher to be giving away access to their article after they’ve entered a contract with the publisher to distribute it in return for peer review and conventional publishing? I don’t know, maybe. Authors have always been able to hand out reprints, but those are physical copies. Legal or not, it’s possible now, which may be more important.

What if you always wanted to be a DJ? You could get a show at a campus station. Or you could start an internet station, but that’s tricky: internet broadcasting doesn’t fly under the radar any more, there are licensing fees to think about, and if it’s at all popular you need to have a server. What about podcasting? Yes, that’s a possibility: your blog becomes something like a low-volume radio station, and your webhost does the serving. But you have to set up and maintain a blog. Some of the file sharing programs let people download by the folder rather than just the track, so arrange the mix into a folder on your computer and make it searchable. You can’t commentate, and unless you make an effort no one will really know who’s doing their mixing for them, but people will find it while searching for individual tracks and all of a sudden, by putting some tracks in a folder… you’re a DJ with listeners, sort of.

When I’m working and I get stuck on a question that needs some general science knowledge or some specific technical approach, I google it before I get out of my chair and bother people. Some of the software I use has a very well developed knowledge base on its home website, but even for those specific techniques a web-wide search will often yield results as good or better. Often I find myself flipping through PowerPoint presentations, which Google helpfully converts into HTML for faster perusing, or reading through the course notes for GEO202 at the University of Somewhere in .doc form. I find clusty.com, which works as a search result clumping layer on top of Google, good for this sort of thing.

These uses weren’t planned for, and are probably not appreciated by some. They’re certainly becoming more common, and the repurposing of technology seems to be accelerating. As more and more data makes it online, general purpose storage and general purpose search synthesise into these grey areas of data and knowledge sharing. As usual, it’s nothing new, and not exactly unanticipated, although the details are sometimes surprising. It’s the vague early-90s promises of the techno-utopian geeks coming into bloom. The degree and specific forms are new, and are interesting. Apparently, if the ecology of our technologies is allowed to be general and flexible enough, it will shape itself with the same kind of bottom-up response to many distributed forces that, on a greater scale, evolved organisms and ecosystems.
