Saturday, October 29, 2011

Harvesting

It was a strange day in computerland. Although there is no logic to support it, I often find on rainy days that the Internet does not behave like one thinks it should. And south Florida has been in the midst of a very rainy weekend.

According to a long summary on the Open Archives Service Providers site, the stated purpose of Scientific Commons is to create relationships between authors and publications, and it has already done so for 23 million. The link quickly took me to a webpage with a banner saying “Scientific Commons,” a search box, the option to view the page in either German or English and a list of “Neue Publikationen.” I tried to switch to the English version, but the page did not change. I put the search term “climate change” into the search box anyway and waited and waited and waited. Nothing happened. I tried to open some of the new publications listed on the homepage, including several with English abstracts. Nothing happened.

The site is deceptively friendly looking for English speakers. I’m not one who believes the entire global community needs to translate everything into English to suit me because English is the language I speak. I know that English is the common language of the scientific community, and that many times only an abstract will be written in English. But I was surprised that nothing opened even when I clicked on the links to German articles. It is very difficult to review a site without being able to see any results, but the bottom line is that the site does not function well, at least on this side of the Atlantic.

The link from Open Archives Service Providers for Callima.org went to a website about addiction, and I didn’t notice anything that looked remotely like a collection of metadata. DigitAlexandria looked promising and impressive on its homepage. About 1,000,000 documents were in its system, with some of the best scientific research centers listed as harvested. The only trouble is that putting a term in the query box and searching opened a page with a logo for the Wayback Machine (Beta) and an error message which read: Page cannot be crawled or displayed due to robots.txt. Bizarre.

I finally decided to check Scirus, which I have used before. Mercifully it opened to its familiar search page. It bills itself as a specialty search engine which concentrates on scientific websites. Its “Preferred” websites include Digital Archives, NDLTD, RePEc, DiVa and multiple patent offices. It also harvests several prominent publishers like Elsevier, Royal Society Publishing, Wiley-Blackwell and Sage. These records should not be mistaken for full text, however; only the metadata records have been harvested, each with a note linking to the publisher’s website. Overall I have found Scirus very useful when searching the web for scientific information. Because it searches scientists’ websites, which frequently link to PowerPoints of their presentations, I have often found graphic-rich material to show students when they are looking for illustrations for their own projects. And today it worked as expected.
