The User is Not Broken

Tuesday, June 13, 2006

This posting: The User is not Broken: A Meme Masquerading as a Manifesto seems to be causing a bit of discussion and debate at the moment. It was quoted in at least one talk at SSP last week too. It falls under the general heading of "Library 2.0".

"Beyond Borders and Bindings" notes from the SSP Annual Meeting 2006

As part of my "Grand Tour" of the technical, library and publishing conference circuit, last week I attended the SSP annual meeting. This year the theme was Beyond Borders and Bindings.

The conference programme was very varied and of a high quality. The talks ranged across topics such as explorations of new revenue models for publishers, open access publishing (and those two topics aren't antagonistic, as I've learnt), preservation and archiving, new publishing technologies, online advertising and marketing, and a host of others.

The opening keynote was given by Marshall Keys (bio). Entitled "Chaotic Transitions: How Today's Trends Will Affect Tomorrow's Information Environment" the talk looked at the issues currently facing libraries (funding; disintermediation), the rapidly changing ways in which students and researchers are consuming information including ubiquitous, multiple format access to content, as well as similar issues for publishers, e.g. new ways that users are finding (and sharing) information, and how to support them.

The talk was a lively start to the conference, and raised a lot of interesting issues. Keys exhorted publishers to consider how to design "Scholarly products for a digital way of life". He listed the following items as topics for consideration:
In typically geeky fashion I was most intrigued by Marshall's reference to this paper: "Crouching Students, Hidden Resources: Designing and Implementing a Virtual Library" (PDF) which discusses the exploration of a 3D learning environment for students. Time for librarians to get into Second Life?!

The other stand-out session for me was entitled "Hanging Together in a Multi-Format Landscape", featuring talks from Geoff Bilder, Roy Tennant and Cindy Hill.

This session continued many of the themes of the keynote, covering topics such as new technologies for enabling easier resource discovery, as well as ways to find some common ground between publishers and librarians who are both (independently, unfortunately) wrestling with similar issues: how to get content to the people that need it. Roy Tennant observed that the communities have common goals and different strengths and that by working together everyone benefits.

Cindy Hill explained her role as a corporate librarian at Sun, and how supporting their knowledge workers increasingly means being able to easily tailor licensed content for different uses. Hill noted that content needs to be "Mashable, Chunkable, Collaborative, and Embeddable".

In his talk, "Libraries & Publishers in a Googlezon World", Roy Tennant observed that we are now in the "Age of Ubiquitous Discovery", and that finding content is easy, and there's no longer a need for licensed databases of content; Google is everywhere. Tennant explained to publishers that, based on the increasing level of referrals from Google searches, "Most users of your content will never see your home page...they don't want to learn how to use your site, nor should they have to...your brand identity has to be at article level". (We have statistics to support this at Ingenta which I'll try and get shared soon).

Tennant noted that linking needs to be transparent plumbing, and provided pointers to publishers on how they enable discovery, e.g. by making their content crawlable, by supporting deep linking, by offering APIs (e.g. SRU, OpenSearch, MXG) onto their content, and supporting standards such as OpenURL and COinS.
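To make the OpenURL and COinS suggestion a little more concrete, here is a minimal sketch of how a page could embed article metadata as a COinS span and build a matching OpenURL link. It was not shown in the talk. The function names, the sample article details and the resolver address (`resolver.example.org`) are all assumptions made up for illustration.

```python
from html import escape
from urllib.parse import urlencode


def context_object(atitle, jtitle, issn, volume, spage, date):
    """Build an OpenURL 1.0 (Z39.88-2004) key/encoded-value string for a journal article."""
    fields = [
        ("ctx_ver", "Z39.88-2004"),
        ("rft_val_fmt", "info:ofi/fmt:kev:mtx:journal"),
        ("rft.genre", "article"),
        ("rft.atitle", atitle),
        ("rft.jtitle", jtitle),
        ("rft.issn", issn),
        ("rft.volume", volume),
        ("rft.spage", spage),
        ("rft.date", date),
    ]
    return urlencode(fields)


def coins_span(kev):
    """Wrap the context object in an empty span with class Z3988, as COinS expects."""
    return '<span class="Z3988" title="{}"></span>'.format(escape(kev, quote=True))


def openurl_link(resolver_base, kev):
    """Append the same context object to a link resolver's base URL."""
    return "{}?{}".format(resolver_base, kev)


# Hypothetical article details, purely for illustration.
kev = context_object("An Example Article", "Journal of Examples",
                     "1234-5679", "12", "34", "2006")
span = coins_span(kev)
link = openurl_link("https://resolver.example.org/openurl", kev)
print(span)
print(link)
```

Browser tools and library link resolvers can pick up the span without the reader ever seeing it, which is very much in the spirit of linking as plumbing.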

Tennant also commented that publishers need to rethink the cost model for academic articles, suggesting that they "think iTunes". Tennant also suggested that publishers should "lift the curtain" on the peer review process, and make it explicit how (and whether) content has been reviewed, e.g. by linking to a description of the review process.

Last up in this session was Geoff Bilder who dropped what he described as his normal "techno-messianic" viewpoint to deliver a presentation entitled "The Emperor's New Web". Bilder's central theme was the problems encountered by researchers attempting to find and access content on the web; in particular, the tangled mess of search engines, publisher websites, and link resolvers which makes online research a frustrating experience. He showed some diagrams which plotted out the different paths that a user can take to get an article, with endless click throughs from search results, link resolvers, etc.

Bilder explained how researchers are developing techniques to avoid this mess; essentially as a pain-limitation exercise, routing around areas of confusion or difficulty. He suggested that researchers are learning to avoid certain systems (both publisher and library) and/or URLs as they quickly learn which paths take them to content the quickest. And in many cases that may not be the "appropriate copy". Bilder observed that addressing this tangled web, and simply enabling users to go directly to the content they need with minimum fuss is crucial (echoing Tennant's "linking should be plumbing" idea).

Bilder noted that to help enrich the online experience (e.g. embracing Web 2.0, etc) the best thing that publishers and librarians can do is to open up data and work together on standards, rather than creating further silos. Services such as social bookmarking, etc. work best when aggregated across sites rather than being confined to a particular publisher or library website. Bilder explained that aggregators like Ingenta have a role to play here too, in particular supporting publishers as a technology partner. Quoting Uri Rubinsky, Bilder remarked that publishers and librarians should be working to build the "tunnels under Disneyland", not the "Disneyland" experience itself.

Bilder ended with a radical view of a publisher's website as nothing more than a content repository: no bells, whistles or clever functionality, just easy access to the content and metadata; a hard drive connected to the web.

I think this session and the keynote really echoed many of the central themes of the entire SSP conference, which was about how publishers can adapt during the "chaotic transition" that is currently occurring in academic research.

Overall SSP was a very enjoyable conference, and like most conferences much of the interesting discussion happened between sessions (over coffee and beers!). Next year the event is moving to San Francisco, see you there!

The Journal of Web Librarianship

Via Science Library Pad I see that there is a new journal relevant to shifted librarians. The Journal of Web Librarianship will publish "...material related to all aspects of librarianship as practiced on the World Wide Web, including both existing and emerging roles and activities of information professionals in the Web environment". Topics will include issues such as:

"web page design, usability testing of library or library-related sites, cataloging or classification of Web information, international issues in web librarianship, scholars' use of the web, information architecture, library departmental web pages, RSS feeds, podcasting, library services via the web, search engines, history of libraries and the web, and future aspects of web librarianship".

Jody Condit Fagan, the editor of the journal, has started an editorial blog which will explore "academic authorship and peer-reviewed journals from a different perspective". The first few posts review some books covering topics like peer reviewing and writing for academic journals. Interesting stuff.

Fagan joins the ranks of other editors who are exploring the use of blogs as a useful supplement to their journals. E.g. the bioethics blog (who claim to be the first), Action Potential and Free Association.

I was also pleased to note that as the journal is published by Haworth Press, you'll soon be able to access it from IngentaConnect as well as the publisher's own site.

Yesterday an announcement went out to the Semantic Web interest group mailing list introducing openacademia 1.0, "an RDF-based publication repository for research groups and scientific communities".

As the documentation explains, openacademia "is an open source publication metadata repository for scientific communities. Our goal is to allow scientists to collect, organize and disseminate publications more efficiently using a combination of novel semantic technologies". The software, which can be run locally, e.g. within a group of institutions, allows users to deposit metadata about their publications which can then be shared with others as RSS feeds, etc.

The software supports BibTeX, which is used by many scholars to maintain and format bibliographies, making it easy to add data to the system. APIs are available to extract data from the repository (which is based on Sesame). There are some nice tools for visualising topics between papers, graphs of author relationships, etc. The RSS feeds contain detailed metadata about each paper and author.

I'm going to have a more detailed look at the system over the coming weeks, but it intrigues me as their approach echoes our own adoption of RDF and Semantic Web technologies for aggregating and storing RDF metadata. Indeed many of the vocabularies are also shared, for example they're using FOAF to describe authors. The more publishing data that becomes available as RDF, the better the network effects become.
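To give a flavour of what shared vocabularies mean in practice, here is a small sketch that writes out a paper and its authors as N-Triples using FOAF and Dublin Core terms. It is only an illustration and not openacademia's actual schema or code. The `example.org` URIs, the sample paper and the helper names are assumptions.

```python
FOAF = "http://xmlns.com/foaf/0.1/"
DC = "http://purl.org/dc/elements/1.1/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def uri(value):
    """Format a URI as an N-Triples term."""
    return "<{}>".format(value)


def literal(text):
    """Format a plain string literal, escaping backslashes and double quotes."""
    return '"{}"'.format(text.replace("\\", "\\\\").replace('"', '\\"'))


def paper_triples(paper_uri, title, authors):
    """Describe a paper and its authors; authors is a list of (author_uri, name) pairs."""
    lines = [
        "{} {} {} .".format(uri(paper_uri), uri(RDF_TYPE), uri(FOAF + "Document")),
        "{} {} {} .".format(uri(paper_uri), uri(DC + "title"), literal(title)),
    ]
    for author_uri, name in authors:
        lines.append("{} {} {} .".format(uri(paper_uri), uri(FOAF + "maker"), uri(author_uri)))
        lines.append("{} {} {} .".format(uri(author_uri), uri(RDF_TYPE), uri(FOAF + "Person")))
        lines.append("{} {} {} .".format(uri(author_uri), uri(FOAF + "name"), literal(name)))
    return lines


# Hypothetical paper and author, purely for illustration.
triples = paper_triples(
    "http://example.org/paper/1",
    "An Example Paper",
    [("http://example.org/person/a", "A. Author")],
)
print("\n".join(triples))
```

Because the author is described with the same FOAF terms that other repositories use, data from two sources that agree on the author's URI can be merged without any mapping step.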

We're currently considering ways to open up our own repository for people to explore. Part of the technology partner role we play for publishers is ensuring that their content reaches the widest audience and is as accessible as possible. With a growing number of applications exposing Semantic Web data, access to RDF and SPARQL interfaces will be just as important as "traditional" and community-specific interfaces such as OpenURL and ZING. Historically there have been some disconnects between the library and publisher communities on how they store, manage and share metadata. As we're firmly in between these communities I'm hoping we can start to help bridge these gaps, and I'm convinced that Semantic Web technologies can help.

The Ingenta US Publisher Forum

Last Tuesday, 6th June, Ingenta held its annual US Publisher Forum, at Dumbarton House in Washington, DC. Like the November meeting, which was for our UK clients, the event consisted of a mixture of presentations from Ingenta staff as well as invited speakers from the industry.

I was lucky enough to be asked to present alongside my colleagues Louise Tutton (Head of Client Management) and Lucy Power (Senior Information Architect). Louise gave an update on the new and upcoming features available on IngentaConnect, and how we're adapting the service to meet the needs of publishers in the rapidly changing environment. Lucy provided an excellent introduction to the benefits of Information Architecture and provided some insights into what techniques her team uses when working with clients. I gave a talk called "The Web of Data" which developed some themes I've explored in recent talks. More on that in another posting.

I wanted to share my notes on the talks given by our external speakers: Toni Tracy (Portico), Helen Henderson (Information Power) and Mary Page (Rutgers University Libraries).

I believe the presentations should all be online shortly, but I've put notes from the talks into the following postings:

I personally found the forum to be extremely useful. It was nice to meet some of our clients and discuss issues that they face, as well as getting updates from our industry speakers. I attended SSP in the days following the forum and many of the issues explored by our speakers (archiving, institutional identifiers, etc) were echoed in other presentations and discussions. I think we managed to create an event that was firmly on the pulse of current industry issues. Hopefully these notes will be useful to others who were unable to attend.

I believe our next event will be later this year where we will be having a similar gathering of UK publishers before the Online Information conference in London.

The Ingenta US Publisher Forum: What Do Librarians Want?

Notes from a presentation given by Mary Page, Rutgers University Libraries at the Ingenta US Publisher Forum in 2006.

Issues to Discuss:

Online publications cost more, in every way. Require higher level of staff, not just clerks any more. Many more staff involved.

Subs normally set up by subject specialists, who work closely with faculty. They make the initial selection, then the e-resource librarian, cataloguer, and collection librarian get involved.

Budget crisis for many libraries. Flat funding means budget decrease. Adding titles means cancelling titles.

Currently hoarding supplies of paper clips and paper!

Increasingly difficult to identify titles to cancel because of packaging. Please don't bundle any more! Example of having to purchase a bundle of 20+ titles when they only really want one of them -- no other way to get that title.

Likely to cancel some packages, don't have a choice. Also want to "make a statement" about packages. Wary about "Big Deal".

Pricing information is essential: get it to libraries as early as possible. Essential for planning.


Serial agents made life easier for librarians in the print environment. Order print subscription, then forget it. Publisher changes didn't really affect print access.

Ejournals: every change disrupts access.

Serial agents were slow to adapt to online publishing. Climate has stabilized, less haggling over contracts.

It's time to make e-journal management routine.

Libraries use agents to outsource much of the work.

Libraries are slow to pay and slow to act: agents will pay/act on their behalf quicker.

Agents provide one invoice, one renewal process, one point of contact.

Financial incentives allow libraries to buy more titles.

Barriers to Access:

Titles must be OpenURL compliant. Easy access increases use. Format makes a difference: even though they had print titles, their usage at Rutgers via JSTOR is much higher.

Moving platforms and changing publishers impedes access. Requires a lot of updating of URLs in online catalogues, etc.

Communicate! (Less print, more listservs) Print is no longer an information tool for them. Turn to website for information.

Pricing Models:

Does anyone understand them?!

FTE pricing can be exorbitant for large institutions. Will pay for quality titles, especially if the pricing is fair. We need important scholarly titles, we want to support scholarly publishing, but meet us in the middle somewhere.

We're ready to go online only. Urged publishers to move away from print, as it's expensive to manage: shelving, binding, moving around. Don't make us take print if you can avoid it, and don't penalize us for that.

Use Your Editors:

Educate them about pricing models; explain them. They will give you good feedback.

If your pricing is fair, demonstrate that to them. Show them your costs.

Provide them with information to share with their colleagues. Professors listen to each other.

Ask your editors to vet new titles.


Founding mission is to create a level playing field for librarians, publishers, and vendors. Modelled on UKSG.

Emerging issues, new technology. Librarians want to learn from you!


The Ingenta US Publisher Forum: Using Institutional Identifiers to Identify Markets and Streamline the Supply Chain

Notes from a presentation given by Helen Henderson (Information Power) at the Ingenta US Publisher Forum in 2006.

EDIT: slides now available -- download .pps

What are Institutional Identifiers? They are not new, there are many different types:

Location definitions

Financial/business information

No true unique identifiers for institutions.

What are they not?

What metadata is needed?

Unique id, location, category, tier, size, URL, credentials. Data requirements vary through the supply chain, i.e. different facets for different uses.
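One way to picture "different facets for different uses" is a single institution record from which each part of the supply chain takes only the fields it needs. The sketch below is purely illustrative and was not part of the presentation. The `name` field, the facet groupings, the treatment of credentials as IP ranges and the sample values are all assumptions.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class InstitutionRecord:
    identifier: str      # the unique id
    name: str            # assumed field, not in the list above
    location: str
    category: str
    tier: str
    size: int
    url: str
    credentials: tuple   # assumed here to be IP ranges


# Hypothetical groupings: which fields each use would need.
FACETS = {
    "marketing": ("identifier", "name", "location", "category", "tier", "size"),
    "authentication": ("identifier", "name", "url", "credentials"),
}


def facet(record, use):
    """Return only the fields of the record that the given use requires."""
    data = asdict(record)
    return {field: data[field] for field in FACETS[use]}


# Made-up institution, purely for illustration.
record = InstitutionRecord(
    identifier="INST-0001",
    name="Example University",
    location="GB",
    category="academic",
    tier="A",
    size=12000,
    url="http://www.example.ac.uk/",
    credentials=("192.0.2.0/24",),
)
print(facet(record, "marketing"))
print(facet(record, "authentication"))
```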

Why do we need them?

Licensing, marketing, customer analysis, authorization, authentication, optimising support of the journal supply chain.



Sources of information

Issues with accuracy of information through the supply chain. Intermediaries have different levels of detail, so pass on different bits of information. Often difficult to identify the actual subscribing institution. It's also useful to get information about non-subscribers, i.e. to whom should I market?

Customer analysis: group, e.g. by country, type, region, etc.

Market analysis: which of the top 50 subscribe, and what is our total revenue? Degree of market overlap with potential business partners?

Why Now?:

Conducting industry wide pilot. Mapped out a number of different "transactions" through the supply chain:

Participants in current pilot: BL, HighWire, Ringgold, Swets, UK Libraries. Project website is at www.journalsupplychain.org. Progress reports on each of the 9 work packages have recently been added. Intending to produce an evaluation report at the end of June.

Biggest issue is the metadata: what to attach for each transaction?

Benefits for users include improved activation. Helen noted that Ingenta has been quite aggressive at allowing agents to activate on behalf of customers (as far as she's aware we're the only company doing this).

Easy access to archives

Could also offer registry facilities (central IP registration). Although unlikely that libraries will want it, if there was a central trusted registry, then that would be of benefit to them.

Helen recently wrote a report on institutional identifiers for the Charleston Advisor.

Q: Others are doing this also?

A: Atypon are largely creating a database of IP addresses, there's no metadata. Ringgold are creating a database of institutions. We may fold in their work. Intention is that database will be in public domain. Would need sustainable business model, and are considering what that might be.

Q: How will you get buy-in from libraries?

A: To a certain extent, no need for buy-in, they just need to use it. They should be able to see the direct benefits to them. Key benefit would be to see and immediately activate their whole institutional holdings in one go, not title by title. In terms of supplying their id with their order: that would be nice. Best thing to do to get buy-in is to put up something wrong and get people to correct it!

The Ingenta US Publisher Forum: Portico - A New Electronic Journal Archiving Service

Notes from a presentation given by Toni Tracy (Director, Publisher Relations, Portico) at the Ingenta US Publisher Forum in 2006.

EDIT: slides now available -- download .pps

Portico's Mission: To preserve scholarly literature published in electronic form and to ensure that these materials remain available to future generations of scholars, researchers, and students

Portico's History:
The project has its roots in a JSTOR project: Electronic-Archiving Initiative. This was aimed at facilitating transition to electronic journals by developing a technical infrastructure and sustainable archive to preserve e-journals.

The project began with a 2-year pilot phase (2003-2005), working with 10 publishers initially.

The decision was made to pull out Portico as a separate independent project from JSTOR. Portico was launched in Spring 2005 by JSTOR and Ithaka, with support from the Mellon Foundation. Their operations are now live and journals are actively being ingested and archived.

Publisher participants started after Frankfurt last year. Library partners are due to become involved after ALA Mid Winter.

Portico is:

Their advisory committee is made up of both librarians and publishers

The approach to archiving is to preserve the intellectual content, not an individual publisher's look-and-feel. They are not a web archive. The archive contains text, images, and limited functionality such as internal linking within documents.

Publishers deliver source files (SGML, XML, PDF, etc) to Portico. These are then converted or normalized from that proprietary format to an archival format for deposit in the Portico repository. The archival format is based on the NLM DTD.

The normalization process "proceeds carefully and deliberately". The emphasis is on long-term preservation requirements rather than immediate access. This is a "migration" approach to archiving, as opposed to "emulation" (a la LOCKSS). The migration approach brings content into a normalized format for preservation. Portico actively engages with the individual publisher to discuss why they made certain choices, e.g. for images, tagging practices, etc. The results are an accurate, although plain rendition of the original content.

Portico's access model has undergone extensive discussion, taking 18 months to reach a consensus.

Currently it only offers access to archived content to those libraries supporting the archive financially.

Access is offered only when specific trigger event conditions prevail AND when titles are no longer available from the publisher or other sources.

Trigger events include:

A lot of back issue digitization efforts. Foreseeing a day when back issues have had all commercial advantage reaped from them, and may get removed. Portico can step in and provide continued access.

For supporting libraries, trigger events initiate campus-wide access regardless of subscription.

Until a trigger occurs, select librarians from partnering institutions are granted password access for archive audit and verification. This access is explicitly not for document delivery or inter-library loan purposes. It's just for verification.

Libraries can use Portico archive for post-cancellation or "perpetual" access IF a publisher chooses to name Portico to meet that obligation. Of the 13 participating publishers, 10 have allowed this so far.

Who pays? Publishers and libraries, but also charitable foundations and government agencies that also offer support. Mellon and JSTOR included.

13 publishers: 3,400 journals. 60 libraries (35 contracted, rest in process). Include Sage, UKSG, Elsevier, OUP, Wiley, BioOne...

Supporting publishers are asked to sign a license agreement and deposit articles in a timely manner. Financial contributions consist of an annual fee to fund initial conversion tools, development, and to defray costs of adding new content. Contributions tiered and vary according to a journal's revenue.

Supporting libraries incur a similar annual archive support payment which covers ongoing operations, maintenance and enhancements. Contributions are again tiered.

Benefits of Archiving:

Role for Publishers: What is your archival strategy? Develop and articulate your strategy to libraries. Signs of growing awareness of importance of archiving amongst publishers. Encouraged to participate in at least one archival arrangement. Publishers should monitor digital preservation developments and efforts, as well as related legal developments, e.g. on legal deposit (big issue for UK publishers), e-content copyright registration (Library of Congress pilot beginning) and "Section 108 Working Group".

Q: If a publisher uses Portico for post-cancellation access, what will the user see?

A: The vanilla version. Portico is not a primary access point. Don't want to compete with investments made by publisher.

Q: What are the hardware decisions? Does it matter?

A: Robust replication strategy. So online/offline replication, so stable even though centralised. Princeton, will add west coast, mid-west, UK, and Asia (eventually).

Q: What if a participating publisher went out of business? Are there plans to make it easier to use? (from Mary Page)

A: It's not static. Rudimentary search and browse. Will need to listen to library community. Keep that conversation going.

Q: We would want something more user friendly, if you're the only host

A: This will be on agenda for strategy meeting. Will be gathering metrics on level of post-cancellation usage, etc.

Interested All My Eye readers may wish to read "Preserving Electronic Scholarly Journals: Portico" from the April issue of Ariadne, for more information. Another useful executive overview of the archiving landscape was recently published in the Charleston Advisor.

Top tips for conference first-timers

Wednesday, June 07, 2006

Stephen Abram has blogged a great set of suggestions for first-time library conference delegates, from the personally practical to the professionally prudent. Some of our teams are on the road at the moment (publisher services are at SSP in Crystal City; library services are about to head over to Baltimore for SLA), so it seemed pertinent. Some of my favourites from Stephen's tips (to give you a selection which might encourage you to read them all):
"Bring at least two pairs of shoes" (I learned the value of this during my nth footsore journey from hall 4.2 to hall 8.0 at my first Frankfurt some years ago -- and this was in flat shoes that I'd borrowed from my mother and was sure would be a safe bet)

"Make your schedule in advance (at least at the start of the day, but earlier if possible). Include all of the options you might like so that if one desired session is cancelled or doesn't meet your expectations or needs then you can hop over to another. Make sure you note the room locations so you can evaluate how much time you have to get there between sessions."

"If you're late, have a little courage and take a seat. Don't hover and shuffle at the back of the room or in the door. Librarians tend to sit in the end seat of every row and you'll have to shuffle theatre style to get a good seat in the middle of a row." I don't think for one minute that it's just librarians who do that but my, it's annoying (and perhaps a little insulting to the presenter, since it implies that you expect to want to make a sharp exit?)

"Remember your business cards." Ah, so simple. And so simply forgettable.

"[Exhibitors] invite you to workshops, demonstrations, announcements, breakfasts and parties, etc. Don't accept the invitation and then blow them off. It's rude." I've been at workshops where half the listed delegates have just not shown, and I squirmed with pity (probably misplaced and patronising) for the poor undervalued presenter. So it's rude to other delegates as well as the organisers!

"DON'T be embarrassing! Hoovering through the exhibit hall looking for free pens and avoiding eye contact with anything resembling booth staff is not the image librarians want to project." This familiar image just made me chuckle, though of course the majority of librarians aren't just in it for a free <insert cheesy freebie of choice here>.

"DON'T assume that your old familiar vendors haven't changed and that you know everything about them. This is your opportunity to learn what's new and different."
'Nuff said!

