
GenPerfect–My Ideal Genealogy Software

Thursday, 17 Mar 2011 | by Mark Tucker

I grew up in Utah, have a brother-in-law that worked for WordPerfect, and used WordPerfect in high school and college before Microsoft Word became the dominant word processing software. So when I tried to put a name to all the ideas about what the ideal genealogy software would look like to me, GenPerfect was the perfect name.

I am sad that I missed RootsTech 2011, but I am excited to see that it has stirred up ideas and that a spirit of innovation seems to be sweeping through the genealogy/technology community. Some are discussing a new data format to bring GEDCOM into the 21st century and make sure it plays well in the online world of collaboration and social networking. One place you can find this is the BetterGEDCOM Wiki; another is the e-mail list for the FamilySearch Developer Network (FSDN).

Much of the recent discussion on FSDN has been around the main sticking points of the data model and whether the structure should be people-based or record-based. As a developer, I often want to jump down into the details of the problem and gnaw on it until I figure it out. But lately I am changing. I prefer to look at it from a user’s perspective. Call it product management or User Experience (UX), but if in the end the data model doesn’t allow the software to do what I think it can and should do, then I think a great opportunity would have been missed.

So back to GenPerfect. What do I think it should look like? What implications does that have on a data model? As a user, what is my vision of the perfect genealogy software?

(more…)

RootsMagic 4 Citation Quality Gotcha #2

Wednesday, 8 Jul 2009 | by Mark Tucker

In gotcha #1 we looked at the issue of having the Source quality associated with the Source Details instead of the Master Source.  In gotcha #2 we look at issues dealing with evidence.

Source, Information, & Evidence

According to Evidence Explained by Elizabeth Shown Mills, “sources are artifacts, books, digital files, documents, film, people, photographs, recordings, websites, etc.” (see page 24). Information is the content of the source. Evidence “represents our interpretation of information we consider relevant to the research question or problem.” (see page 25) So in order to classify evidence we need both information and a research objective. Even though the Genealogical Proof Standard (GPS) does not include a step to define research goals, I’ve included it as part of the Genealogy Research Process Map because it is implied. Step one of the GPS states:

“We conduct a reasonably exhaustive search in reliable sources for all information that is or may be pertinent to the identity, relationship, event, or situation in question.”
The BCG Genealogical Standards Manual, page 1.

How do we know which sources to search if we don’t have a research objective?  The definitions of direct and indirect evidence also point to the need for a defined research objective:

Direct evidence – relevant information that seems to answer the research question or solve the problem all by itself.
Indirect evidence – relevant information that cannot, alone, answer the question.
Negative evidence – an inference we can draw from the absence of information that should exist under particular circumstances.
Evidence Explained, page 25

Even the definition for negative evidence hints at a research objective.

So how can we set the citation quality value for evidence in RootsMagic or any other genealogy software unless we have a research objective?

(more…)

RootsMagic 4 Citation Quality Gotcha #1

Tuesday, 7 Jul 2009 | by Mark Tucker

I applaud the work the RootsMagic team has done to bring professional-quality research practices to the most recent version of RootsMagic. The work that they (and others) are doing is truly innovative. Just the other day, I awarded RootsMagic 4 an Innovator award for the implementation of research analysis around their citation quality feature.

I strongly encourage users of RootsMagic to use this feature, but in its current implementation there are a few gotchas to watch for and workarounds to follow.

The Genealogical Proof Standard & Evidence Explained define research analysis classifications for a source, information, and evidence. A source is an object (or person) that contains (or has) information. A source can be classified as original or derivative. An original source is in its first oral or recorded form; everything that comes from an original (or another derivative) is a derivative. For example, a book is an original. Let’s say it is a census enumerator’s book that he carried from house to house to take the census. Now let’s say that book is microfilmed and stored at an archive. The microfilm copy is a derivative, and the digitization of the microfilm is a second-generation derivative of the original. Without getting into the special cases of image copies, duplicate originals, and record copies, it is relatively easy to start uncovering the provenance or ancestry of the source you are using for your research back to the original source. Classifying a source as original or derivative helps answer the question “Is there a better source?” and aids your analysis, as original sources usually carry more weight than derivatives.
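The provenance chain just described (original → derivative → second-generation derivative) can be sketched as a simple data structure. This is a hypothetical model of my own for illustration, not part of RootsMagic or any genealogy standard:

```python
# Hypothetical sketch: model a source's provenance chain, where each
# derivative points back to the source it was made from.
from dataclasses import dataclass
from typing import Optional

@dataclass
class Source:
    description: str
    derived_from: Optional["Source"] = None  # None means this is the original

    @property
    def classification(self) -> str:
        return "original" if self.derived_from is None else "derivative"

    @property
    def generation(self) -> int:
        """0 = original, 1 = first derivative, 2 = second generation, ..."""
        return 0 if self.derived_from is None else self.derived_from.generation + 1

# The census example from the post:
book = Source("Census enumerator's book")
film = Source("Microfilm copy stored at an archive", derived_from=book)
scan = Source("Digital image made from the microfilm", derived_from=film)

print(scan.classification, scan.generation)  # derivative 2
```

Following the `derived_from` links back to a source with no parent answers the “Is there a better source?” question mechanically: the end of the chain is the original.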

(more…)

Introduction to METS

Wednesday, 24 Jun 2009 | by Mark Tucker

The Challenges

One of the challenges that need to be solved for online source citation is the ability to give structure to digital assets. Think of an online book that consists of a hundred images, each representing a page. There are other images for the cover, title page, etc. There might even be text documents, audio files, or video associated with it. How do we keep track of all those individual files and relate them as a single digital entity? That is part of the problem METS is trying to solve. In online citations, we also have the issue of source provenance. Where did the digital image file for the census come from? It came from a microfilm copy of the original census. Is it possible that METS can help keep track of this provenance? What about complex sources that are part of a collection, in a series, within a record group at an archive? Can METS be used to keep track of this hierarchical information?

Let’s explore the basics of METS to see if we can find some answers.

Metadata Encoding & Transmission Standard – METS

A METS document consists of seven major sections:

1. METS Header
2. Descriptive Metadata
3. Administrative Metadata
4. File Section
5. Structural Map
6. Structural Links
7. Behavior

METS is usually used to manage digital assets that include at least one digital file, but it doesn’t have to. The sections most interesting for our discussion are Descriptive Metadata, Administrative Metadata, and Structural Map.
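The seven sections above can be sketched as a bare XML skeleton using Python’s standard library. The element names (metsHdr, dmdSec, amdSec, fileSec, structMap, structLink, behaviorSec) come from the METS schema, but this skeleton is illustrative only and is not a complete, valid METS document:

```python
# A minimal, illustrative METS skeleton showing the seven top-level sections.
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
ET.register_namespace("mets", METS_NS)

def q(tag: str) -> str:
    """Qualify a tag name with the METS namespace."""
    return f"{{{METS_NS}}}{tag}"

mets = ET.Element(q("mets"))
for section in ("metsHdr",       # 1. METS Header
                "dmdSec",        # 2. Descriptive Metadata
                "amdSec",        # 3. Administrative Metadata
                "fileSec",       # 4. File Section
                "structMap",     # 5. Structural Map
                "structLink",    # 6. Structural Links
                "behaviorSec"):  # 7. Behavior
    ET.SubElement(mets, q(section))

print(ET.tostring(mets, encoding="unicode"))
```

In a real METS document the Structural Map is the one required section; the others appear as the asset demands.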

(more…)

Better Online Citations – Details Part 5 (MODS)

Monday, 22 Jun 2009 | by Mark Tucker

MODS

In this post, we continue our exploration through existing bibliographic standards to see how they might work as a format for online sites to easily share citation information.  To see the journey we have made so far, visit the page, A Better Way to Cite Online Sources.

From the Library of Congress standards page for MODS, we see the following description:

Metadata Object Description Schema (MODS) is a schema for a bibliographic element set that may be used for a variety of purposes, and particularly for library applications.

On the MODS overview page, we get more details:

As an XML schema it is intended to be able to carry selected data from existing MARC 21 records as well as to enable the creation of original resource description records. It includes a subset of MARC fields and uses language-based tags rather than numeric ones, in some cases regrouping elements from the MARC 21 bibliographic format. This schema is currently in draft status…
…the schema does not target round-tripability with MARC 21. In other words, an original MARC 21 record converted to MODS may not convert back to MARC 21 in its entirety without some loss of specificity in tagging or loss of data. In some cases if reconverted into MARC 21, the data may not be placed in exactly the same field that it started in because a MARC field may have been mapped to a more general one in MODS.

Compared to MARC, MODS is simpler and uses word tags (like name, titleInfo, and originInfo) instead of numeric tags (100, 245, 260). There is not a one-to-one mapping between MARC and MODS, so conversion between the two might introduce some challenges.

Let’s look at the book example used in the analysis of the other standards:

Geary, Edward A. A History of Emery County. Salt Lake City: Utah State Historical Society, 1996.

The Library of Congress represents this book in MODS here.
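As a rough, hand-written sketch (not the actual Library of Congress record, which will differ in detail), the book’s author, title, and publication could be expressed with MODS word tags like this:

```python
# Illustrative MODS fragment for the example book, built with the
# standard library. Element names follow the MODS v3 schema.
import xml.etree.ElementTree as ET

MODS_NS = "http://www.loc.gov/mods/v3"
ET.register_namespace("mods", MODS_NS)
q = lambda tag: f"{{{MODS_NS}}}{tag}"

mods = ET.Element(q("mods"))

title_info = ET.SubElement(mods, q("titleInfo"))
ET.SubElement(title_info, q("title")).text = "A History of Emery County"

name = ET.SubElement(mods, q("name"), type="personal")
ET.SubElement(name, q("namePart")).text = "Geary, Edward A."

origin = ET.SubElement(mods, q("originInfo"))
place = ET.SubElement(origin, q("place"))
ET.SubElement(place, q("placeTerm")).text = "Salt Lake City"
ET.SubElement(origin, q("publisher")).text = "Utah State Historical Society"
ET.SubElement(origin, q("dateIssued")).text = "1996"

print(ET.tostring(mods, encoding="unicode"))
```

Notice how titleInfo, name, and originInfo read far more naturally than MARC’s 245, 100, and 260.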

The three key pieces of information (author, title, and publication) are represented in MODS as follows:

(more…)

Better Online Citations – Details Part 4 (MARC XML)

Saturday, 20 Jun 2009 | by Mark Tucker

MARC XML

Previous posts have explored a better way to cite online sources (Part 1), how citation information can be stored as a file using GEDCOM format (Part 2) and MARC format (Part 3). This post takes the next logical step and discusses MARC XML.

MARC was created as a machine-readable format many decades ago. In the last decade, eXtensible Markup Language (XML) has been developed as a standard format to allow validation, processing, and transformation of data. MARC XML takes the MARC format and represents it as XML. This is done in a lossless way so that conversions between MARC and MARC XML will not lose any data.

A book represented as a Source List Entry in Evidence Explained looks like this:

Geary, Edward A. A History of Emery County. Salt Lake City: Utah State Historical Society, 1996.

That same book listed with the Library of Congress is shown here as MARC XML.

Let’s quickly compare the MARC entries for author, title, and publication with the corresponding representation in MARC XML.
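As a hand-written sketch (the indicators and subfield codes follow common MARC practice but this is not the actual Library of Congress record), those three MARC fields might look like this in MARC XML:

```python
# Illustrative MARC XML fragment for the example book: fields 100 (author),
# 245 (title), and 260 (publication), using the MARC21 slim schema.
import xml.etree.ElementTree as ET

SLIM_NS = "http://www.loc.gov/MARC21/slim"
ET.register_namespace("marc", SLIM_NS)
q = lambda tag: f"{{{SLIM_NS}}}{tag}"

record = ET.Element(q("record"))

def datafield(tag, ind1, ind2, *subfields):
    """Append a datafield with the given indicators and (code, text) subfields."""
    df = ET.SubElement(record, q("datafield"), tag=tag, ind1=ind1, ind2=ind2)
    for code, text in subfields:
        ET.SubElement(df, q("subfield"), code=code).text = text

datafield("100", "1", " ", ("a", "Geary, Edward A."))
datafield("245", "1", "2", ("a", "A history of Emery County /"),
                           ("c", "Edward A. Geary."))
datafield("260", " ", " ", ("a", "Salt Lake City :"),
                           ("b", "Utah State Historical Society,"),
                           ("c", "1996."))

print(ET.tostring(record, encoding="unicode"))
```

The numeric tags and subfield codes survive intact as attributes, which is what makes the MARC-to-MARC-XML round trip lossless.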

(more…)



Copyright 2010 Mark Tucker. All rights reserved.