Friday, November 02, 2007

A Blog Citation Index?

One of the primary quality indicators for scholarly works is citation analysis. The assumption is that the value of a work is based in part on the number of times it has been cited in other works. In my blog postings, I often include links or references to peer-reviewed papers that I have read or that support my arguments. Many of the library blogs I read also include such references.

If there is a great deal of discussion about an article, how does one easily gain access to all the blog postings that reference a specific (non-blog) scholarly work?

Bloggers for Peer-Reviewed Research Reporting (BPR3) is an effort that "strives to identify serious academic blog posts about peer-reviewed research by developing an icon and an aggregation site where others can look to find the best academic blogging on the Net." The concept grew from a Dave Munger Cognitive Daily post.

BPR3's primary goal has been to create a recognizable icon for use on any blog when discussing peer-reviewed research. [Icon: Blogging on Peer-Reviewed Research] The icon would point to the original primary source material. Posts discussing peer-reviewed research articles could be distinguished from those containing news and other miscellaneous content. Guidelines for usage are available. (In fact, I am in violation of the guidelines by including it in this post. Sorry, guys. It is free promotion.) Their long-term vision:
  • BPR3 as it was originally conceived was simply a way for bloggers to denote posts on peer-reviewed research. It will still be that, but it will be much more.

  • There will be a central web site where snippets from these posts will be displayed, along with links back to the original posts.

  • Readers will be able to choose the topics they're interested in and view only those posts.

  • Bloggers will use plugins for WordPress and Movable Type allowing them to enter a DOI or other identifier and automatically generate code to post the icon, link to the post to our site and its aggregation tools, and generate a properly formatted research citation which links to the original article.

  • Bloggers will be able to instantly find people blogging on the research they're blogging on. Researchers will find blog posts about their research, too.

  • Readers, bloggers, and researchers can use topic-specific RSS feeds.

  • Forums and other tools will allow researchers to collaborate in real-time.

  • Readers can share questions about research, discuss how to use our site, and discuss topics they can't find blog posts on.

    Judging from their blog alone, their early focus and energy have gone into the icon. In my opinion, the much more powerful and useful part of their concept is the aggregated index site. When a blogger creates a post with the icon, a link is automatically generated back at the index site. As the number of trackbacks grows, the index becomes a central depository for blog posts about peer-reviewed materials. A researcher could then follow the icon on any blog back to the aggregated index containing all the other blog commentaries on that specific work.

    This concept could become the basis of a very powerful research tool. Think Science Citation Index for blogs, or what I call a Blog Citation Index (BCI).

    It would seem that having a simple icon and trackback somewhere on the blog post is not good enough to generate a useful index. Each trackback would have to be associated with specific bibliographic information about the primary material. Otherwise the index could become dirty real fast. Their site is silent on their plans. DOI is optional, and there has been no discussion about the use of OpenURL. I also wonder if there are plans for a code snippet generator for those who do not use Movable Type or WordPress, in an effort to simplify the process of creating the trackbacks and to standardize the information required for indexing.
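To make the snippet-generator idea concrete, here is a minimal sketch in Python of what such a tool might emit. It is only an illustration under stated assumptions: the icon and index URLs are placeholders, and the function name `make_snippet` and the `?doi=` query format are hypothetical, since the BPR3 site has not published any of these details.

```python
from html import escape
from urllib.parse import quote

# Hypothetical placeholders: the post gives no real icon or index URLs.
ICON_URL = "https://example.org/bpr3-icon.png"
INDEX_URL = "https://example.org/index"


def make_snippet(doi: str, citation: str) -> str:
    """Build an HTML fragment: the icon links to the index entry for the
    article, and the citation text links to the article via its DOI."""
    doi = doi.strip()
    # The DOI resolver accepts the DOI as a URL path.
    article_url = "https://doi.org/" + quote(doi, safe="/")
    # Assumed query format for looking up an article in the index.
    index_url = INDEX_URL + "?doi=" + quote(doi, safe="")
    return (
        f'<a href="{escape(index_url)}">'
        f'<img src="{escape(ICON_URL)}" alt="Blogging on Peer-Reviewed Research"></a> '
        f'<a href="{escape(article_url)}">{escape(citation)}</a>'
    )


if __name__ == "__main__":
    print(make_snippet(
        "10.1000/example.doi",
        "Doe J. (2007). An example article. Journal of Examples, 1(2), 3-4.",
    ))
```

Because the DOI is required as input, every generated snippet carries the same identifier that the index would key on, which is the standardization the paragraph above asks for.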

    I am anxiously awaiting their prototype, which has been promised to be available in a month.

    If the BCI were implemented at the publisher level (charging publishers to link to the index could be part of the business model), visitors to a journal's online table of contents could quickly identify which articles have been blogged. One could also see what topics are hot by quickly identifying the most cited materials over a period of time. Metatags could be used to create tag clouds to track keywords and memes.
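As a rough sketch of the "most cited over a period of time" idea, the Python below counts blog posts per article within a date window. The records are made-up sample data, and the function name `most_blogged` is hypothetical; a real index would read from its own store of trackbacks.

```python
from collections import Counter
from datetime import date

# Hypothetical sample records: (date of blog post, DOI of the article discussed).
posts = [
    (date(2007, 10, 1), "10.1000/a"),
    (date(2007, 10, 15), "10.1000/b"),
    (date(2007, 10, 20), "10.1000/a"),
    (date(2007, 11, 2), "10.1000/a"),
    (date(2007, 9, 5), "10.1000/b"),
]


def most_blogged(records, start, end, top=10):
    """Return (doi, count) pairs for posts dated within [start, end],
    ordered from most to least blogged."""
    counts = Counter(doi for day, doi in records if start <= day <= end)
    return counts.most_common(top)


if __name__ == "__main__":
    # Articles blogged during October 2007.
    print(most_blogged(posts, date(2007, 10, 1), date(2007, 10, 31)))
```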

    It is a very interesting concept that could use a librarian's help. It is a concept that our friends across town at OCLC would be (should be?) interested in.

    2 comments:

    Mr. Gunn said...

    Several issues need to be solved before a really useful blog citation index comes about.

    Blog citations from within journal articles, as well as interblog links, need to be counted. There needs to be some way to handle link rot, probably by publisher archiving of cited content as supplementary material. And we need a usable, standard way to indicate that a post is about a scientific paper, including metadata about the papers discussed. Alf Eaton has some good thoughts on this. There's been some discussion about using a third-party archiving service for this, which could make it easier for authors as they do their research, but the central point of failure is a serious issue, I believe.

    Postgenomic is an attempt at a technorati-style aggregation of science blogs and includes a section for blog posts about scientific papers.

    The way Postgenomic handles citation data is to scrape it from the target of the link, so you add rev="review" to your link and the aggregator handles the citation data scraping on its end.
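For illustration only, here is how an aggregator might pick out links marked this way, using Python's standard `html.parser`. This is not Postgenomic's actual code; only the `rev="review"` convention comes from the description above, and the class and function names are hypothetical.

```python
from html.parser import HTMLParser


class ReviewLinkParser(HTMLParser):
    """Collect the href of every <a> tag whose rev attribute includes 'review'."""

    def __init__(self):
        super().__init__()
        self.review_links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr = dict(attrs)
        # rev may hold several space-separated values.
        rev_values = (attr.get("rev") or "").lower().split()
        if "review" in rev_values and attr.get("href"):
            self.review_links.append(attr["href"])


def find_review_links(html_text):
    parser = ReviewLinkParser()
    parser.feed(html_text)
    parser.close()
    return parser.review_links


if __name__ == "__main__":
    post = (
        '<p>I read <a rev="review" href="https://doi.org/10.1000/example.doi">this paper</a> '
        'and also <a href="https://example.org/news">some news</a>.</p>'
    )
    print(find_review_links(post))
```

The aggregator would then fetch each collected URL and scrape the citation data from that page, so the blogger only has to add one attribute to the link.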

    I think it's a good idea to keep some metadata in the post itself (so there's no central point of failure or service lock-in), and COinS seems like a good way to do this.

    There's a generator (which you probably know about) which will populate the tag if you give it a DOI or PMID, so you don't have to type out the journal/page/etc.
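To show roughly what such a tag looks like, the sketch below builds a COinS span by hand in Python. It is not the generator mentioned above; the function name `coins_span` and the sample metadata are hypothetical, and only a few of the journal-format keys are filled in.

```python
from html import escape
from urllib.parse import quote, urlencode


def coins_span(doi, article_title, journal_title, year):
    """Return an empty <span class="Z3988"> whose title attribute holds an
    OpenURL ContextObject (key/encoded-value format) describing a journal article."""
    fields = [
        ("ctx_ver", "Z39.88-2004"),
        ("rft_val_fmt", "info:ofi/fmt:kev:mtx:journal"),
        ("rft_id", "info:doi/" + doi),
        ("rft.genre", "article"),
        ("rft.atitle", article_title),
        ("rft.jtitle", journal_title),
        ("rft.date", str(year)),
    ]
    # Percent-encode each value, then escape "&" for use inside an HTML attribute.
    title = escape(urlencode(fields, quote_via=quote), quote=True)
    return f'<span class="Z3988" title="{title}"></span>'


if __name__ == "__main__":
    print(coins_span("10.1000/example.doi", "An example article", "Journal of Examples", 2007))
```

The span is invisible to readers, but tools that understand COinS can read the citation straight from the post, which is what keeps the metadata independent of any central service.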

    I'm glad you're working with the bpr3 people, and I hope something really useful comes of it, but as I have expressed to Dave many times, I think the final service would end up being better for end users if Postgenomic and bpr3 were to work together. I know they were talking, but I haven't heard if anything has come of it.

    Gunther Eysenbach said...

    WebCite (http://www.webcitation.org) has implemented an internal ranking of the most cited web material, called the WebCite Index.
    Citations are harvested both from published research papers and from citing authors taking snapshots of ("WebCiting") material.

    WebCite archives stable snapshots of any URL at the time when it is cited (if the citing author initiated it) or at least when the paper is published. The snapshot will be permanently preserved in various Internet archives and libraries (WebCite is a member of the International Internet Preservation Consortium). The citation format a lot of journals are now using / recommending is something like the following:


    Adam. How to Better Cite Blogs. Emergent Chaos - The Emergent Chaos Jazz Combo of the Blogosphere (Blog). URL:http://www.emergentchaos.com/archives/2007/10/how_to_better_cite_blogs.html. Accessed: 2008-03-14. (Archived by WebCite® at http://www.webcitation.org/5WJRbnuHA )

    This ensures that the reader sees exactly the same content as the citing author (even if the content changes).

    Some bloggers - who feel it is important to be cited properly - also put a "WebCite this!" link on their blog (a link to www.webcitation.org/archive that prepopulates the form with stable metadata such as the author name and blog title) so that the blog is accurately cited and automatically archived if somebody cites it. This also goes a long way if the blogger has to prove - for some reason - the sequence/priority of ideas, data, stories etc. - esp. important in academia. See http://gunther-eysenbach.blogspot.com/ for an example.