Implementing Fedora Four
at the University of Maryland Libraries
Peter Eichman
Bria Parker
Ben Wallberg
Joshua Westgard
March 31, 2015
Overview
Fedora's Past at UMD: Bria Parker
Fedora's Present at UMD: Peter Eichman
Fedora's Future at UMD: Josh Westgard
From local to standard
- UMD has been using locally created schemas for years
  - Descriptive -- UMDM
  - Administrative -- UMAM
- Fedora 4 gives us the opportunity (and motivation) to move from local schemas to standards
- MODS was a clear frontrunner
  - most closely matches the structure of our existing metadata
  - widely used
- UMDM --> MODS --> MODS RDF --> Fedora 4
Moving from XML to RDF
- MODS RDF in its current state is problematic for our use
  - we need to model the data, not remodel the XML (which is what the current iteration of MODS RDF does)
  - the blank nodes created by carrying the XML hierarchies over into RDF became an issue
- We're transitioning from a hierarchical metadata structure to a flat one
General Implementation Setup
- Currently building from 4.1.1
- Create an overlay WAR of our custom configuration
- Include role-based access control (fcrepo-module-rbacl)
- Separately running services:
  - Standalone Infinispan instance
  - JMS messaging consumer (same Tomcat as Fedora)
  - Fuseki (standalone)
  - Solr (remote server)
Fedora as a First Class Service
- Expose the REST API directly
- Don't treat it as a "black box" dumb storage backend
- Allow for multiple front-end interfaces
- Use standards to support a variety of clients and tools
Challenges to Fedora as a First Class Service
- Authorization must be implemented in the repository
- URIs we mint must be stable and persistent
URI Stability
- Challenge: URIs we mint must be stable and persistent
- Challenge++: We will be federating in binaries from the file system, and these binaries may move around
- Solution:
  - One persistent tree of URIs (PCDM objects and collections)
  - These persistent URIs can then point to the more mutable URIs in a separate tree (e.g., storage) through the message/external-body content type
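A minimal sketch of how such a pointer might be created over the REST API. It assumes Fedora 4's handling of the message/external-body content type, in which a binary created with that type refers clients to the URL named in the header. The base URL, the environment variables, both function names, and the paths in the usage comment are hypothetical. Whether a PUT re-points an existing resource this way in 4.1.1 should be verified against a test instance.

```shell
# Base URL of the repository REST API (assumption: a local test instance).
FEDORA_BASE="${FEDORA_BASE:-http://localhost:8080/fcrepo/rest}"

# Print the Content-Type value that marks a binary as a pointer to an
# external URL instead of locally stored bytes.
external_body_type() {
  printf 'message/external-body; access-type=URL; URL="%s"' "$1"
}

# Create the stable resource at path $1 so that it refers to the current
# storage location $2. When a file moves, only this pointer needs to
# change; the stable URI stays the same.
point_to_storage() {
  curl -X PUT "$FEDORA_BASE/$1" \
    -H "Content-Type: $(external_body_type "$2")" \
    -u "${FEDORA_USER:-admin}:${FEDORA_PASS:?set FEDORA_PASS}"
}

# Hypothetical usage:
#   point_to_storage objects/obj1/file1 "$FEDORA_BASE/storage/vol2/obj1.tif"
```

Keeping the header construction in its own function makes it easy to check the exact value before any request is sent.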
Authentication and Authorization
- Challenge: Authorization must be implemented in the repository
- Challenge++: We want a mix of public access and authenticated access
- Solution: Authenticate with Tomcat, mix local users with LDAP, and set ACLs
Admin User
- Define an admin user with the fedoraAdmin role in tomcat-users.xml
    <tomcat-users>
      <role rolename="fedoraAdmin"/>
      <role rolename="fedoraUser"/>
      <user username="admin" password="SECRET" roles="fedoraAdmin"/>
    </tomcat-users>
LDAP Authentication with JNDI
- Configure a Tomcat JNDI realm so that regular users (fedoraUser) authenticate against LDAP
- In server.xml:
    <Engine name="Catalina" defaultHost="localhost">
      <!-- ... -->
      <Realm className="org.apache.catalina.realm.JNDIRealm"
             connectionURL="ldaps://directory.umd.edu"
             commonRole="fedoraUser"
             userPattern="uid={0},ou=people,dc=umd,dc=edu"
      />
      <!-- ... -->
    </Engine>
Root ACL
- Set an ACL on the root node that grants the "reader" permission to the special principal EVERYONE
- Note the double slash in the URL path segment
    $ curl https://fedora4dev.lib.umd.edu/fcrepo/rest//fcr:accessroles \
        -X POST \
        -H 'Content-Type: application/json' \
        -d '{"EVERYONE":["reader"]}' \
        -u admin:SECRET
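A small sketch for building the same request pieces for other paths and reading an ACL back. It assumes the access-roles endpoint also answers GET with the roles assigned on a path, which should be confirmed against the installed module version. The function names and the FEDORA_PASS variable are hypothetical. Treating the root node as an empty path is just one way to reproduce the double slash shown in the command above.

```shell
# Base URL of the REST API (the development host from the command above).
FEDORA_BASE="${FEDORA_BASE:-https://fedora4dev.lib.umd.edu/fcrepo/rest}"

# Build the JSON body that maps one principal to one role.
acl_json() {
  printf '{"%s":["%s"]}' "$1" "$2"
}

# Print the fcr:accessroles URL for a repository path.
# An empty path stands for the root node and yields the double slash.
accessroles_url() {
  printf '%s/%s/fcr:accessroles' "$FEDORA_BASE" "$1"
}

# Read back the roles assigned directly on a path (assumed GET support).
get_acl() {
  curl -s "$(accessroles_url "$1")" \
    -u "admin:${FEDORA_PASS:?set FEDORA_PASS}"
}

# Hypothetical usage:
#   acl_json EVERYONE reader     # body for the POST shown above
#   get_acl ""                   # roles on the root node
```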
Authentication and Authorization: Outstanding Challenge
- Challenge++: We want a mix of public access and authenticated access
- Workaround:
  - Define a "public" or "anonymous" user with the fedoraUser role in tomcat-users.xml
  - Distribute the password for that user
  - Not ideal; other solutions?
Content Types
- Image (visual content)
- Book (both manuscript and print)
- Audio/Video (streaming services via external application)
- Our current implementation of Fedora has at times hindered adding new content types (chain of dependencies)
Portland Common Data Model
- Should work well for existing content
- Adoption is in keeping with our goals of:
  - leveraging standards
  - community-based development
- Allows us to begin the modeling process while deferring commitment on the Hydra/Islandora question
Research Data and Institutional Repository
- Currently piloting "DRUM for data" (DRUM = DSpace)
- Is Fedora better suited to be our research data repository?
- Ultimately, can we replace DSpace with Fedora as our IR?
Born-Digital Archives
- Currently in the early stages of an active born-digital archiving program
- Access is, for us, a great unanswered question with respect to born-digital materials
- Through the iSchool, UMD is a partner in the Brown Dog project (http://browndog.ncsa.illinois.edu)
- The file system federation features of F4 seem to have potential here
A More Complete Digital Asset Management System
- Curator-mediated batch loading with validation tools
- Flexible, automated fixity checking and auditing
- Management and tracking of multiple instances of single assets
- A suite of backend storage options with derivative generation, asset replication, and shipment to storage facilitated by F4
Thank you!
Peter Eichman (peichman@umd.edu)
Bria Parker (blparker@umd.edu)
Ben Wallberg (wallberg@umd.edu)
Joshua Westgard (westgard@umd.edu)