Monday, November 21, 2011

IRLS 675 Final Blog

This course provided opportunities to build repositories and practice working within those environments. The four open source content management systems we worked with were dissimilar enough to practice different skills, yet parallel enough to have a continuity of learning.


Drupal was a robust platform, and offered much more than just the ability to develop a repository. At this particular time in my learning curve most of its features, like running intranets, portals and blogs, and managing websites, were lost on me. I felt like I only scratched the surface during the practice time using it. My end impression was that it was flexible enough to use as a repository, but not the best choice for one. I did not like the way items and metadata were presented in a blog-like display, but this might have been overcome with a more in-depth exploration.

This was my second opportunity to use DSpace; the first was in IRLS 671--Introduction to Digital Collections. The workflow was easier the second time, but I must admit my dislike of the interface that organized everything into communities. One repository I evaluated had close to 100 communities listed on its welcome page, which was not welcoming at all to an end-user unfamiliar with the university’s administrative structure.

EPrints’ strength lay in its ability to archive and distribute scholarly publications. It was designed perfectly for that purpose, and was flexible enough to accommodate my collection of images. However, its academic environment was a poor match for the touristy feel of my collection of photographs.

I especially liked Omeka, and feel it has a lot to offer. As opposed to the other three, I think its ability to host a collection in the cloud (for a fee, of course) is a positive feature that I might be able to use later.

I think one of the most important outcomes of this class for me was a better understanding of metadata: working with application profiles, ontologies and taxonomies. Overall I learned a lot about working with repository platforms and am looking forward to applying what I have learned.

Sunday, November 13, 2011

Unit 12

As I mentioned in an earlier blog, I think we are only scratching the surface when it comes to understanding the workings of a repository. Realistically that’s about the best that a semester or two can do; the actual experience of working day-to-day with digital collections (hopefully) will provide the depth of understanding needed for competency and productivity. So when asked which should be considered for future classes, a continuation of a semester of quick forays into the installation and configuration of four different virtual machines or a pre-configured VM, I would have to say I’m split evenly between the two choices.


On one hand, things have started to come together this semester during the CLI parts of the class. In IRLS 672 I blindly copied code like a transcriptionist reproducing a manuscript in a foreign language. Now as I type, I’m beginning to get a sense of what I’m actually trying to do. Unfortunately if something goes wrong all I can do is compare and look for typos—it will take many more courses before I can suggest to myself an alternative way of coding. But that’s the progression in learning.

On the other hand, since I have many obligations in my life besides this class (family and work are two), something has had to suffer. I have spent hours and hours staring at code that wasn’t working the way it should have. That time could have been put to better use working with a collection in a pre-configured VM, and I would probably know more about managing digital collections.

I don’t know if it’s logistically possible, but perhaps a compromise would work. Rather than developing four virtual machines from the ground up (Drupal, DSpace, ePrints and Omeka), why not select two and use the remaining time working with a collection in depth in a pre-configured virtual machine? From my perspective Drupal and Omeka were the least like each other, and also offered unique learning experiences that could apply to other platforms used for repositories today. And learning to apply ontologies, metadata, crosswalks, several plugins, etc., in one environment, but more in-depth and consistently, might serve us learners better.

Sunday, November 6, 2011

Unit 11 Comparison of Home Sites

I’m not quite sure if this week’s blog assignment is to discuss the home sites of the repositories we created using Drupal, DSpace, ePrints and Omeka, or the respective organizations’ home sites.


Of the organizations’ home sites, I must admit I found Drupal’s marketing line the most compelling: Come for the software, stay for the community. It emphasizes that a novice repository developer would not be alone, but would have a great deal of global support. Let’s make that an exceptional amount of company and support: “701,691 people in 228 countries* speaking 181 languages power Drupal.”

DSpace and ePrints, on the other hand, seem to stress the fact that they are open-access, while Omeka labels itself a platform for “serious” web publishing. By prominently displaying terms such as “Dublin Core,” “Linux,” “Apache,” “MySQL,” and “PHP” on its homepage, Omeka is skipping the amateur website developer and marketing to users with a substantial IT or digital content background.

All sites appeared as the top result when Googled, and all were “busy” and required a lot of scrutiny to find the sections of interest. I would not make a final selection of repository software based on the home page of the platform. I am much more interested in how I can use it than in how the organization has chosen to use it; our needs and user base are probably much different.



Saturday, October 29, 2011

Harvesting

It was a strange day in computerland. Although there is no logic to support it, I often find on rainy days that the Internet does not behave like one thinks it should. And south Florida has been in the midst of a very rainy weekend.

According to a long summary on the Open Archives Service Providers site, the stated purpose of Scientific Commons is to create relationships between authors and publications, and it has already done so for 23 million. I was quickly taken to a webpage with a banner saying “Scientific Commons,” a search box, the option to view the page in either German or English and a list of “Neue Publikationen.” I tried to switch to the English version (which did not happen). I put the search term “climate change” into the search box anyway and waited and waited and waited. Nothing happened. I tried to open some of the “Neue Publikationen” listed on the homepage, including several with English abstracts. Nothing happened.

The site is deceptively friendly looking for English speakers. I’m not one who believes the entire global community needs to translate everything into English to suit me because English is the language I speak. I know that English is the common language of the scientific community, and that many times only an abstract will be written in English. But I was surprised that nothing opened even when clicking on the links to German articles. It is very difficult to review the site without being able to see any results, but the bottom line is that it does not function well, at least on this side of the Atlantic.

The link from Open Archives Service Providers for Callima.org went to a website for addiction, and I didn’t notice anything that looked remotely like a collection of metadata. DigitAlexandria looked promising and impressive on its homepage. About 1,000,000 documents were in its system, with some of the best scientific research centers listed as harvested. The only trouble is that putting a term in the query box and searching resulted in a page opening which had a logo for the Wayback Machine (Beta) and an error message which read: Page cannot be crawled or displayed due to robots.txt. Bizarre.

I finally decided to check Scirus, which I have used before. Mercifully it opened to its familiar search page. It bills itself as a specialty search engine which concentrates on scientific websites. Its “Preferred” websites include Digital Archives, NDLTD, RePEc, DiVa and multiple Patent Offices. It also harvests several prominent publishers like Elsevier, Royal Society Publishing, Wiley-Blackwell and Sage. Users should not assume that these records will lead to the full text; only the metadata records have been harvested, with a note linking to the publisher’s website. Overall I have found Scirus very useful when searching the web for scientific information. Because it searches scientists’ websites and they frequently have links to PowerPoints of their presentations, I have often found graphic-rich material to show students when they are looking for illustrations for their own projects. And today it worked as expected.

Monday, October 24, 2011

Unit 9

There is such a learning curve creating metadata for my collection. I do not have any background in cataloging other than a mandatory course taken early in grad school. As I work more with the process I am discovering inconsistencies in the items that I create metadata for. I like to think that I am learning as I go, but I realize that I have such a long way to go. I have read over the elements list in Dublin Core and IEEE Learning Object Metadata. What seems simple at first glance becomes difficult when I try to apply the rules and my inexperience gets in the way.


I recently took a workshop on Metadata for ContentDM; I was the only non-technical services person who attended. During discussions and breakout sessions I could sense the importance of precision to the catalogers, almost to the point of obsession. I am not gifted in that way.

Working as a reference librarian I can appreciate the work that catalogers do. I often feel frustrated by the lack of precision I experience when searching proprietary databases. It seems that in a move to become more Google-like many database providers are attempting to liberally interpret subject terms; so many times lately I will look at a list of results and wonder how a certain item could possibly be relevant.

Sunday, October 16, 2011

Unit 8 ePrints

I found the degree of difficulty of installing ePrints, Drupal and DSpace about the same. However, I do think the configuration and customization of ePrints were more complex and needlessly difficult.


I added to the Welcome message (which wasn’t difficult), and added a logo (which took three unsuccessful tries with method 1, and two tries, ultimately successful, with method 2). I tried to change the theme, although I could not see any differences between glass and green. It also took several hours to get the changes made to the subject taxonomy, but that is probably due to my inexperience in CLI.

It is rather disappointing that there are so few options for customization within ePrints. I credit Drupal with a much greater choice of themes and modules. Overall ePrints seemed to have a lack of community support when compared to Drupal and DSpace, although there is plenty of documentation available on its site.

According to Wikipedia, this platform was originally created for institutional repositories and for scientific journals. Its history is tied to the development of the OAI-PMH protocol, and it was one of the first open source, free software packages. My ultimate conclusion is that it is very suitable when used for the purpose it was created for: an IR or journal publishing. It would, however, be my second or third choice as a repository for other item types.

Sunday, October 9, 2011

Unit 7 Sticker Shock


I work for a community college, not a major research university that demands high tuition from its students, but an institution that is trying to provide an “affordable” education to students who otherwise wouldn’t be able to attend college at all. The faculty’s emphasis is on teaching rather than research, so the digital collection that makes the most sense for us is a learning object repository. I have begun to investigate what is available in software platforms for an LOR, and in the past two weeks I have spoken to two commercial providers.

The first company I contacted was Equella. It runs the state’s learning object repository, Orange Grove, and appeared very suitable for what my campus would like to do. At present, we have no learning objects, just the dream of creating them. The campus has approximately 10,000 students and, if things take off, the repository would hopefully expand to include the three other campuses in our college system. As I talked to the representative, it became clear that Equella does not scale down, but is a complete content management system, able to integrate library management systems and student registration, provide a repository, and do multiple other things. After a one-time installation ($125,000) and consultation fees ($25,000), the license would be approximately $80,000 a year. I was suspicious because Equella is part of the Pearson publishing company, and the price seemed exorbitant to me.

So I read some other reviews and came across Telescope from North Plains. It advertises itself as scalable and affordable, with an emphasis on digital asset management (DAM), specifically of video and audio files. I spoke with a very nice representative who admitted that funding is often a problem for educational institutions; he was able to lower the license fee to $100,000 a year. (Gasp) Or we could purchase the software ($150,000), store all files on our own server, and pay a maintenance fee of $30,000 a year. (What a bargain).

I guess I’m naïve; I don’t have much experience with budgets. I really don’t have anything to compare these prices to but our annual book budget of $50,000. But suddenly the open source products (DSpace, Drupal and others we will be looking at in this course) are looking very attractive from a financial perspective.