Hola Mundo

18 Dec 2011

Hello world. In May 2011 I received my MA in Library and Information Science from the University of Missouri, and in June I began working as the Systems and Emerging Technologies Librarian at Gettysburg College. I have been at Gettysburg for six months now, and I enjoy the variety of my work and the flexibility I'm given in pursuing my interests. Those interests will also be the focus of this blog. They center on digital scholarship: the scholarly communication system (its problems and, more importantly, workable solutions), technology, the role of libraries, publishing, new media, and so forth.

This site is powered by Jekyll - thanks to Brian Grinstead for the suggestion.

Disclaimer: opinions (and grammatical errors) are solely mine and not those of my employer, yada yada.

As a way of introducing myself, I'd like to discuss some of the projects I've been involved with. In this post, I'll focus on the Pictures of the Year International Archive. Subsequent posts will discuss work I'm still actively involved with: KOPN's Reel-To-Reel Project, research on retractions in biomedical literature, and my current position at Gettysburg.

POYi | History

Pictures of the Year International (POYi) is an annual photojournalism contest sponsored by the University of Missouri Reynolds Journalism Institute. Since its beginning as a local contest in 1944, POYi has continued to grow in both participation and scope and recently wrapped up an exhibition at the Newseum.

As far back as the early 1980s, the collection was recognized as having historical value, and one graduate student (Randy Olsen) undertook the task of scanning the entire POYi archive into digital format. Digital image scanning in the 1980s was, of course, far less advanced than it is today. Subsequent workers adopted Olsen's scanning methods, and as a result the images on the site are of relatively low quality. There are long-term plans to rescan the archive at high resolution, but the time and cost involved are considerable. (Not all images would need to be rescanned, since participants in recent years have submitted photos digitally.)

Around 2008, POYi partnered with Mizzou's School of Information Science and Learning Technology (SISLT) to put the archive online. Under the faculty direction of Dr. Tom Kochtanek, SISLT grad students Sean Burns and A.J. Million conceptualized and implemented the online archive. Burns and Million researched content management systems and chose Omeka - in part for its use of Dublin Core (a metadata standard specifically for online content) and its openness (e.g. Sean designed the site's Omeka theme and wrote scripts to import images and map metadata before the CSV Import plugin was available).

POYi | My Part

When I began working on the project in Fall 2010, my first task was to fix the seemingly random PHP errors being thrown on the pages. The solution (as we suspected) was to update Omeka. The upgrade was fairly painless, although some PHP functions had changed names between versions 0.9 and 1.3.1, so it took some rooting around to make everything work properly and look pretty. I was also responsible for uploading three years' worth of submissions: approximately 5,000 photos with metadata. I used the CSV Import plugin, which performed two essential functions:

Metadata: it allowed me to map each column in the Excel spreadsheet to its appropriate Dublin Core element, and
Connect images with metadata: by adding a column in the spreadsheet with the file path of each image's location, I was able to connect each image with its metadata

Simple in principle but messy in practice. I can't say enough about the CSV Import plugin - on the whole, it saved me a lot of time and seamlessly performed two functions critical for displaying content. However, I did run into a few hangups, so below are some tips for using Omeka's CSV Import plugin.

Of course, begin by reading the Documentation and make use of the forums - one strength of Omeka is its user base and dedicated outreach team.
Learn Excel (or your spreadsheet program of choice; I used Excel, so that's what I'll discuss). Much of the work is cleaning metadata, so learn the functions and shortcuts.
The CSV Import Plugin doesn't like diacritics (accents, etc.) and many symbols. My solution was find and replace - e.g. find "é" and replace with "e", or find "&" and replace with "and". Not perfect but it worked.
For mapping files (audio, video, etc):
1. Add a column where the value in every row is the directory the files are located in. For example, I uploaded files to /archive/files, so the filepath for POYi Archive was http://archive.poyi.org/archive/files. Every row in this column had a value of http://archive.poyi.org/archive/files.
2. Add a second column with the filename of each file in your /archive/files directory (or wherever the files are located). If you're lucky enough to have this data in a way you can easily paste it into Excel, congrats. For the less fortunate, you'll want to create a new spreadsheet with the filenames that you can copy-n-paste into the main spreadsheet. To do so, open a terminal and navigate to your files directory.
  For Mac/Linux:
  $ ls > filenames.csv
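  For Windows (the /b switch lists bare filenames only, without sizes and dates):
  $ dir /b > filenames.csv
  Open your new spreadsheet and copy-n-paste that into your main spreadsheet (and hope everything is in proper order!). Note that filenames.csv will list itself, since it sits in the same directory, so delete that row.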
3. Next, combine these two columns in a third column. In Excel, use the concatenate function. Assuming column H holds the directory and column I the filename, it should look something like =CONCATENATE(H2,"/",I2). The "/" is needed because the directory value above has no trailing slash. If there are spaces in the filenames, replace each space with "%20". For example, "Space Between Words.txt" will be "Space%20Between%20Words.txt".
4. Test it! Make sure the new links actually work before importing.
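For anyone who would rather script steps 1-3 than do them in Excel, here is a minimal Python sketch. It is an illustration only, not the method used for the POYi import: the function name build_file_urls and the sample filenames are made up for the example, and it assumes the files are served from the directory URL given in step 1.

```python
from urllib.parse import quote

# Directory URL from step 1 (no trailing slash).
BASE_URL = "http://archive.poyi.org/archive/files"


def build_file_urls(filenames, base_url=BASE_URL):
    """Join the directory URL and each filename into one link.

    quote() percent-encodes spaces as %20 (and escapes other
    characters that are unsafe in a URL path).
    """
    base = base_url.rstrip("/")
    return [f"{base}/{quote(name)}" for name in filenames]


# Hypothetical filenames; in practice these could come from
# sorted(os.listdir(...)) on the files directory.
for url in build_file_urls(["Space Between Words.txt", "photo_001.jpg"]):
    print(url)
```

The resulting links can be pasted into the third column of the spreadsheet, and step 4 still applies: open a few of them in a browser before importing.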
I found it best to upload in small batches of 10-50 items at a time. Also, when I hit an error, I jumped to the Items tab and deleted the most recent item (the item responsible for the error) rather than undoing the entire import. That might sound like it only shaves off a few seconds, but given the number of errors I ran into due to stray accent marks and the like, it saved me quite a bit of time.
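Two of the tips above, the find-and-replace cleanup and the small batches, can also be scripted. The following is a rough Python sketch under stated assumptions rather than what I actually did (I used Excel): clean_text and batches are hypothetical helper names, and the default batch size of 25 is just a value inside the 10-50 range. Stripping accents this way only handles characters that decompose into a base letter plus an accent mark, so letters such as "ø" pass through unchanged.

```python
import unicodedata


def clean_text(value):
    """Approximate the manual cleanup: '&' becomes 'and', accents are dropped."""
    value = value.replace("&", "and")
    # NFKD splits an accented letter into base letter + combining mark,
    # so the combining marks can then be filtered out.
    decomposed = unicodedata.normalize("NFKD", value)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def batches(rows, size=25):
    """Yield consecutive slices of at most `size` rows for separate imports."""
    for start in range(0, len(rows), size):
        yield rows[start:start + size]


print(clean_text("Café & Bar"))
print([len(batch) for batch in batches(list(range(60)))])
```

Each batch could then be written out as its own CSV file and imported separately, so that an error only affects a handful of items.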

I extend my gratitude to the folks on the Omeka team at the Center for History and New Media at George Mason University. Omeka has served the needs of POYi Archive well so far, and working with the program has been particularly educational for me. I have developed web design and programming skills, gained practical experience with a large digital library, and been inspired to participate in larger discussions on digital scholarship and the nature of scholarly work in the 21st century.