Over the past few months, I have shared an idea about how to make citing online sources easier. You can find out more about this on the page, A Better Way to Cite Online Sources. One question that came out of the survey and the posts Details Part 1 and Details Part 2 (GEDCOM) was: why not use an existing standard?
One of the suggestions was to use the Metadata Object Description Schema (MODS). MODS is a “schema for a bibliographic element set that may be used for a variety of purposes, and particularly for library applications,” and it is maintained by the Library of Congress.1
I suspect I will talk more about MODS in a future post, but I bring it up now because, as soon as I started researching MODS, I came across another acronym: MARC. MARC stands for MAchine-Readable Cataloging, and the MARC formats are “standards for the representation and communication of bibliographic and related information in machine-readable form.”2 Most of the discussion I came across dealt with MARC 21, which (according to Understanding MARC Bibliographic: Machine-Readable Cataloging) is “the standard used by most library computer programs.”
Now let’s return to the specific case identified in the video, “A Better Way to Cite Online Sources.” We have a website that identifies a book source. One of the three representations of a citation found in Evidence Explained is a Source List Entry, in other words a bibliographic entry.
So the book, A History of Emery County, would look like this:
Geary, Edward A. A History of Emery County. Salt Lake City: Utah State Historical Society, 1996.
The main parts (or fields) of the entry are:
- Author
- Title (main & sub)
- Publication (place, publisher, year)
It should not be too surprising that the Library of Congress has a listing for this book and one of the ways that you can view it is in MARC format. The description earlier indicates that MARC is a machine-readable format. To make it easier for us to read, I have reformatted it:
000 01322cam a2200361 a 450
008 960403s1996 utuab b l001 0 eng d
035 ## $9 (DLC) 96060167
906 ## $a 7 $b cbc $c copycat $d 2 $e opcn $f 19 $g y-gencatlg
955 ## $a pb06 to hlcd 11-04-96; lk50 11-07-96; lk03 to sl 11-15-96; lj04 11-15-96
955 ## $a pn05 04-03-96; OCLC import pb06 11-04-96
010 ## $a 96060167
020 ## $a 0913738050
035 ## $a (OCoLC)35206145
040 ## $a USl $c USl $d DLC
042 ## $a lccopycat
043 ## $a n-us-ut
050 00 $a F832.E5 $b G43 1996
082 00 $a 979.2/57 $2 21
100 1# $a Geary, Edward A., $d 1937-
245 12 $a A history of Emery County / $c Edward A. Geary.
260 ## $a Salt Lake City : $b Utah State Historical Society ; $a [Castle Dale] ; $b Emery County Commission, $c 1996.
300 ## $a x, 448 p. : $b ill., map ; $c 24 cm.
440 #0 $a [Utah centennial county history series]
500 ## $a Series statement from jacket.
504 ## $a Includes bibliographical references (p. 423-426) and index.
651 #0 $a Emery County (Utah) $x History.
710 2# $a Utah State Historical Society.
710 1# $a Emery County (Utah). $b County Commission.
920 ## $a **LC HAS REQ’D # OF SHELF COPIES**
922 ## $a ad
991 ## $b c-GenColl $h F832.E5 $i G43 1996 $t Copy 1 $w BOOKS
Now let’s trim that down to just what we need to represent author, title, and publication information:
100 1# $a Geary, Edward A.,
245 12 $a A history of Emery County.
260 ## $a Salt Lake City :
       $b Utah State Historical Society ;
       $c 1996.
Without going into a lot of detail, I will explain the parts of the MARC format as it pertains to the source. To get a better understanding of MARC, consult Understanding MARC Bibliographic: Machine-Readable Cataloging.
The 100 tag indicates a main entry for a personal name (in other words, the primary author). The 1# represents two indicators: the 1 describes the personal name as being a surname type, whereas the # is a placeholder showing that the second indicator is undefined. Any $ followed by a letter represents a subfield, and $a specifically identifies a personal name, here with the value “Geary, Edward A.,”
The 245 tag is for a title statement. The first indicator, with a value of 1, means that this is a title added entry because there is an author. The value of 2 for the second indicator means to skip the first 2 characters when sorting or filing this entry, so the “A ” is skipped and filing starts at “history”. The $a subfield identifies this as a proper title with a value of “A history of Emery County.”
The 260 tag indicates publication information. The ## indicators mean that there are not multiple publication statements, so no sequence number is needed (the first indicator), and that the second indicator is undefined. The $a subfield identifies the place of publication, $b is the publisher’s name, and $c is the date of publication.
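To make the tag, indicator, and subfield pieces a little more concrete, here is a minimal Python sketch that splits one line of the reformatted record. It assumes the human-readable layout used in this post (a tag, two indicator characters, then `$`-delimited subfields), not the binary MARC 21 exchange format, and the function names `parse_display_line` and `filing_title` are made up for this illustration.

```python
import re

def parse_display_line(line):
    """Split one formatted display line into tag, indicators, and subfields.

    Assumes the layout shown in this post: 'TAG II $a value $b value ...'.
    """
    tag, indicators, rest = line.split(" ", 2)
    # Each subfield is '$' plus a one-character code, followed by its value.
    subfields = [(code, value.strip())
                 for code, value in re.findall(r"\$(\w)\s+([^$]*)", rest)]
    return {"tag": tag, "ind1": indicators[0], "ind2": indicators[1],
            "subfields": subfields}

def filing_title(field):
    """Drop the nonfiling characters counted by the second indicator of 245."""
    title = dict(field["subfields"])["a"]
    skip = int(field["ind2"]) if field["ind2"].isdigit() else 0
    return title[skip:]

field = parse_display_line(
    "245 12 $a A history of Emery County / $c Edward A. Geary.")
print(field["tag"], field["ind1"], field["ind2"])  # 245 1 2
print(field["subfields"])
print(filing_title(field))  # history of Emery County /
```

Subfields are kept as a list of pairs rather than a dictionary because a code can repeat within one field, as $a and $b do in the 260 field of the full record.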
As can be seen, the information needed to represent a book following the Source List Entry from Evidence Explained can be represented using MARC.
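As a rough illustration of that claim, the sketch below assembles a Source List Entry from the three fields. It is only a sketch under assumptions: the subfield values are hard-coded from the LOC record shown earlier, the helper names `strip_isbd` and `source_list_entry` are invented for this post, and the punctuation handling is a simple guess that works for this one record rather than a general MARC-to-citation converter.

```python
# Subfield values copied by hand from the LOC record shown earlier.
record = {
    "100": {"a": "Geary, Edward A.,"},
    "245": {"a": "A history of Emery County /"},
    "260": {"a": "Salt Lake City :",
            "b": "Utah State Historical Society ;",
            "c": "1996."},
}

def strip_isbd(value):
    """Remove trailing separator punctuation (space , : ; /) from a value."""
    return value.rstrip(" ,:;/")

def source_list_entry(rec):
    """Format author, title, and publication data as a Source List Entry."""
    author = strip_isbd(rec["100"]["a"])
    if not author.endswith("."):
        author += "."  # keep the period of a final initial, add one otherwise
    title = strip_isbd(rec["245"]["a"]).rstrip(".")
    place = strip_isbd(rec["260"]["a"])
    publisher = strip_isbd(rec["260"]["b"])
    year = strip_isbd(rec["260"]["c"]).rstrip(".")
    return f"{author} {title}. {place}: {publisher}, {year}."

print(source_list_entry(record))
# Geary, Edward A. A history of Emery County. Salt Lake City: Utah State Historical Society, 1996.
```

Note that the output keeps the capitalization stored in the record ("A history of Emery County"), while the Evidence Explained entry capitalizes the title as "A History of Emery County". The sketch does not attempt that conversion.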
There is still much I don’t know about MARC after the few hours that I have spent exploring it, but I have a few observations and questions.
- MARC was originally created in the 1960s by the Library of Congress (LOC), so it has been in use for many decades.
- It has acceptance by at least the LOC & Library and Archives Canada.
- MARC has some ability to be extended by local entities: libraries, vendors, systems.
- It can be used for the following types of materials:
  - language material (including books)
  - printed music
  - manuscript music
  - cartographic material
  - manuscript cartographic material
  - projected medium (including video recordings)
  - nonmusical sound recording (audio recordings)
  - musical sound recording
  - 2-dimensional nonprojectable graphic
  - computer file (electronic resources)
  - mixed materials
  - 3-dimensional artifact or naturally occurring object
  - manuscript language material
- It has the ability to identify resources in print, microfilm, microfiche, and electronic form.
- The MARC format is compact, using numbers and letter codes to encode information.
- The file contents are difficult for a person to read without first being formatted by a computer program.
- Not the easiest file format to code against. (Not complaining, just an observation)
- Trailing punctuation (commas, periods, colons, semicolons) is stored at the end of field values. It really belongs to the formatting of the bibliographic entry, so it would have to be removed to get to the real field values.
- It would be useful to be able to break up the author’s name into parts: first name, surname, prefix (Dr), and suffix (Jr, III). This doesn’t appear possible with MARC.
- The publication, Understanding MARC Bibliographic, has been published in several languages, which could indicate its use internationally.
- How much is MARC used nationally and internationally?
- Is it able to handle the sources specific to genealogy research?
- EE indicates that for a book, Full and Short Reference Notes include a page number. Is there an existing MARC tag for that? My hunch says no as it was created for bibliographic entries and online card catalogs.