There are cases where town clerks clearly made mistakes in recording births, such as attributing a child to a first wife who was already dead. Many such records may have been based on hearsay to begin with. Then, too, there are terms whose usage has shifted slightly over time, such as “cousin” and “nephew,” that make such documents ambiguous, and of course the ever-popular phonetic spelling that was so common in colonial America. On top of this, handwriting styles have changed, and documents get stained or torn. All of this means that even original documents must be interpreted with circumspection, and that evidence must be evaluated case by case, not by some formulaic ranking of sources.
I am sure that paid genealogists get reimbursed for their expenses, and their deliverable gains authority by citing sources higher up the provenance scale. But for the vast majority of people using genealogy software (non-professionals), is it worthwhile to get a copy of an original document if somebody gives you a transcript over the Internet? If it is no trouble, why not? If you doubt its veracity because of other evidence, sure! But each generation at least doubles the number of people you are investigating in your family tree, and if neither of those cases applies, the effort is probably better spent on somebody else.
Personally, in prioritizing my research, I would rather find additional, independent evidence confirming my existing evidence than move my existing evidence up the scale of provenance a step or two. Hence my use of the phrase “preponderance of evidence.” Perhaps I am out of step with others in this, but it seems that the most common errors involve applying evidence to the wrong person rather than getting the data wrong, and it is just as easy to apply an original document to the wrong person as it is a copy.
All the standard guides do a fine job of citing published sources–that being the principal type of material used by those college students who, as you note, are taught to use MLA or CMOS (the latter being preferred over MLA in many academic fields such as my own, history). MLA and CMOS also provide an example or two for citing original documents of the type most academics use–those in university archives–but those models do not fit most resources used by genealogists or academics who mine local records.
Commendably, a significant number of academic historians, historical demographers, and practitioners of related fields *are* now using the grassroots-level original documents that have long been considered the “domain” of genealogists and amateur historians. These academic researchers, too, are discovering a need for guidance in the use and citation of those records. That is why, at the Amazon.com website for _Evidence Explained_, one sees endorsements of EE volunteered by two major historians. That is why academic reviewers for _Choice_, _Library Journal_, and _Booklist_ recommend EE for all academic libraries and upper-level/grad-level students. And, that is why Library Journal just awarded EE its “Best Reference Work 2007” designation.
I do disagree with you as to the need and value of consulting original records, even when no conflict is known to exist. After all, if everyone working on a problem keeps using the same wrong abstract or database, everyone will “agree” but they’ll all be wrong.
(I won’t catalog, here, all the other reasons why it is important to consult those originals. I’ve done that elsewhere and all over.)
I am, however, puzzled as to how “source provenance is often overridden by a preponderance of evidence.” Provenance, meaning “origins,” speaks to the authenticity of a *single record.* “Preponderance of the evidence,” which is no longer used in genealogy because it ill-fits our field, is (like GPS) a conclusion based upon a *whole body of evidence.* Would you help us see your reasoning for this statement?
Elizabeth Shown Mills, CG, CGL, FASG
About learning, I can only speak from personal experience, and I didn’t learn from my software. I learned from encountering problems. For example, learning about new-style/old-style dates the first time I found an infant recorded as dying before it was born. For example, learning about keeping sources when I can’t remember where I got that date which now seems so obviously wrong. Now I understand the payoff for the work involved, and I do it gladly. The software could do all those things the day I took it out of the box, but I didn’t even know enough to look for those features. So I think your fundamental thesis is a little flawed.
I haven’t read the Elizabeth Shown Mills book, so I obviously missed its election as a Bible, but I do know that schools teach MLA, not Elizabeth Shown Mills. A little explanation of why it is necessary to invent a new standard would be useful. I would volunteer that MLA doesn’t strike me as a very machine-parseable format, but I don’t recall that being mentioned as a criterion in your article.
To castigate software packages for their handling of sources is not fair. First, I cannot tell what you consider a good citation, beyond that it adheres to Elizabeth Shown Mills, so some requirements would be nice. My feeling is that, bottom line, a citation should ensure that another person, or even myself, can find the source for a fact at a later date and verify it. Most software I have seen does collect source information adequately for that purpose, and yet sources are still not documented even to this extent. Or sources are input religiously for every data item and all of them merely point to ancestry.com. (Which one of the thousands of contributed family trees do you think that person stumbled across first?) While a software company is certainly going to provide source management tools in order to remain competitive, it probably doesn’t feel like enforcing the proper use of those tools if that means risking the loss of some percentage of potential customers who don’t want to be bothered.
Rather than quibbling about citation formats, it would be far more productive to ask that we get more sources online. Much of the nation now lives far removed from the location of the original documents, since families have migrated across the seas or across the country. For example, wouldn’t it be nice if local governments published on the Internet the vital records and probate records they hold that are over 100 years old? If this were standardized enough, one could imagine software automating the searching of these repositories and the ranking of the resulting matches, which would be great. Then, yes, suck in the data automatically along with a computer-readable citation, presumably in XML as you suggest, but that is such a small part of this particular challenge and very far down the path.
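A computer-readable citation really doesn’t need much structure for software to emit it and parse it back. Here is a minimal sketch in Python using the standard library’s XML module; the element names (`citation`, `recordType`, `jurisdiction`, and so on) are invented for illustration and are not from any published schema:

```python
import xml.etree.ElementTree as ET

def build_citation(record_type, jurisdiction, volume, page, url=None):
    """Build a minimal machine-readable citation element.

    All element names here are hypothetical, chosen only to show
    how little structure a parseable citation would require.
    """
    cite = ET.Element("citation")
    ET.SubElement(cite, "recordType").text = record_type
    ET.SubElement(cite, "jurisdiction").text = jurisdiction
    ET.SubElement(cite, "volume").text = volume
    ET.SubElement(cite, "page").text = page
    if url:
        ET.SubElement(cite, "url").text = url
    return cite

# A made-up birth record held by a county repository.
citation = build_citation("vital/birth", "Barnstable County, MA", "3", "142")
xml_text = ET.tostring(citation, encoding="unicode")
```

The point of the sketch is that once every field is a named element rather than a comma in an MLA-style string, software on the other end can extract the repository and page without guessing.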
Your layering idea is a nice way of keeping history. My personal preference would be for the software never to delete anything: just overlay old facts with a new version of a fact and keep the old one, with its documentation, as history, so that you fully document the thought process that got you to the current state of your data, since the addition of more evidence may change your “conclusion”. Unlike what is suggested by your GPS (which seems more like a process than a standard of proof), in real life there is no final conclusion to the search. “I have never seen / A finished genealogy”.
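That never-delete overlay model is simple to sketch. In this illustrative Python (the class and method names are invented, not any shipping product’s API), asserting a new value for a fact pushes the old version, together with its source, onto a history list instead of discarding it:

```python
from dataclasses import dataclass, field

@dataclass
class FactVersion:
    value: str
    source: str  # the citation that supported this version

@dataclass
class Fact:
    """A fact that overlays new versions and keeps every old one."""
    current: FactVersion
    history: list = field(default_factory=list)

    def assert_new(self, value, source):
        # Overlay: the superseded version moves into history;
        # nothing is ever deleted.
        self.history.append(self.current)
        self.current = FactVersion(value, source)

# Hypothetical example: a vague date later refined by a town record.
birth = Fact(FactVersion("abt 1702", "family Bible, Smith copy"))
birth.assert_new("14 Mar 1701/2", "Plymouth town records, vol. 1, p. 88")
```

With this shape, the trail of superseded values and their sources is itself the documentation of how the current “conclusion” was reached.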
Regarding some of your comments about merging and layering GEDCOMs, you might find some of the discussions on werelate.org about merging and uploading GEDCOMs useful. They are much closer to an actual requirements analysis, and I think they take a broader view in that people will not want to suck everything into their local system so much as use remote sources as a virtual part of their local database. This does mesh nicely with your layering idea, but there are difficult issues in matching two or more arbitrary family trees when one or both may have errors, different spellings, missing facts, etc., and, once you do figure out a match, in saving the reference information from the external database so you can automatically bump the two databases against each other again in the future to quickly spot changes.
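The different-spellings part of that matching problem can at least be scored cheaply. A crude sketch using Python’s standard-library `difflib.SequenceMatcher` (a real tree-matching system would also weigh dates, places, and relationships, which this deliberately ignores):

```python
from difflib import SequenceMatcher

def name_similarity(a, b):
    """Crude similarity score (0.0 to 1.0) between two recorded names,
    tolerant of the phonetic spellings common in old records."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

# Invented example of two phonetic spellings of the same person.
score = name_similarity("Mehitable Allyn", "Mehetabel Allen")
```

A score threshold alone would produce plenty of false matches; the point is only that fuzzy name comparison is the easy first pass before the harder work of reconciling conflicting facts.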
Speaking of werelate.org, the biggest impetus towards better genealogy will be the need to collaborate. The payoff will be higher quality data for you, the cost will be the need to conform to a certain standard. But the number of websites that truly provide for collaboration is very small. I think werelate.org could get there. However, most websites just blindly accept submitted trees and keep them all in sterile isolation, so the website doesn’t annoy users by enforcing standards or suggesting somebody’s data is wrong. Once such a truly collaborative website achieves some general acceptance, software packages will then modify their workings accordingly.
Source Provenance is sort of a snooty issue and I am not sure it is even all that important. I can’t imagine the computer ever doing a good job of supplanting the user as the final arbiter. I am not George E. Bowman jealously guarding the designation of Mayflower Descendant, and even then source provenance is often overridden by a preponderance of evidence. If somebody provides a good-faith transcription or abstract, and tells me where it comes from so I can verify if I find contradictory evidence, I will have nearly as much confidence as if I had a copy of the original. In a collaborative environment, this would be even more true, as there is a very good chance somebody will have the opportunity, and take the time, to confirm the transcription/abstract.
How can I get hold of one of THOSE?! Niche product or not, I want it!
If someone does see possibilities in utilizing even some of the ideas you posit, the product will have to come from a different source than we currently see on the market. Even then, I’m afraid it will wind up being a niche product marketed to professional genealogists with a steep price tag.