I grew up in Utah, have a brother-in-law who worked for WordPerfect, and used WordPerfect in high school and college before Microsoft Word became the dominant word processing software. So when I tried to put a name to all my ideas about what the ideal genealogy software would look like, GenPerfect was the perfect name.
I am sad that I missed RootsTech 2011, but am excited to see that it has stirred up ideas and there is a spirit of innovation that seems to be sweeping through the genealogy/technology community. Some are having discussions about a new data format to bring GEDCOM into the 21st century and make sure it plays well in the online world of collaboration and social networking. One place you can find this is the BetterGEDCOM Wiki and another is the e-mail list for the FamilySearch Developer Network (FSDN).
Much of the recent discussion on FSDN has been around the main sticking points of the data model and whether the structure should be people-based or record-based. As a developer, I often want to jump down into the details of the problem and gnaw on it until I figure it out. But lately I am changing. I prefer to look at it from a user’s perspective. Call it product management or User Experience (UX), but if in the end the data model doesn’t allow the software to do what I think it can and should do, then I think a great opportunity would have been missed.
So back to GenPerfect. What do I think it should look like? What implications does that have on a data model? As a user, what is my vision of the perfect genealogy software?
I have an empty database, so what do I do first? How about quickly entering my name and those of my family members, along with their birth information? Nothing new there. Or maybe I choose to select my family members from my list of Friends on Facebook or some other social site.
For each person, I would like a list of questions generated including:
- What is Worth Tucker’s birth date and place?
- Did Worth Tucker marry?
- When and where did Worth Tucker get married and to whom?
- Is Worth Tucker still living?
- When and where did Worth Tucker die?
- Where is Worth Tucker buried?
There could be more questions based on age and locations where he lived:
- Did Worth Tucker serve in the military during WWI?
- Was Worth Tucker affected by the 1918 Influenza epidemic?
But then for each person, I could add my own questions:
- What did Worth Tucker look like (height, weight, eye color)?
- What were Worth Tucker’s occupations?
- In what locations did Worth Tucker live?
I like the idea of being able to add a question about a relative/ancestor at any time and to have one place to keep that list.
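A minimal sketch of how such a question list might be generated. The names `Person`, `STANDARD_QUESTIONS`, `CONTEXT_RULES`, and `questions_for` are hypothetical, and the age window used for the WWI rule is an arbitrary illustrative choice, not a researched cutoff.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple


@dataclass
class Person:
    name: str
    birth_year: Optional[int] = None
    # Questions the researcher adds by hand at any time.
    custom_questions: List[str] = field(default_factory=list)


# Questions asked about everyone.
STANDARD_QUESTIONS = [
    "What is {name}'s birth date and place?",
    "Did {name} marry?",
    "When and where did {name} get married and to whom?",
    "Is {name} still living?",
    "When and where did {name} die?",
    "Where is {name} buried?",
]

# Context rules: (question template, test on birth year).
# Assumption: ages 18 to 50 in 1918 is only an illustrative window, and a
# real rule would also check the death date and places of residence.
CONTEXT_RULES: List[Tuple[str, Callable[[int], bool]]] = [
    ("Did {name} serve in the military during WWI?",
     lambda born: 18 <= 1918 - born <= 50),
    ("Was {name} affected by the 1918 Influenza epidemic?",
     lambda born: born <= 1918),
]


def questions_for(person: Person) -> List[str]:
    """Return standard, context-based, and custom questions, in that order."""
    questions = [q.format(name=person.name) for q in STANDARD_QUESTIONS]
    if person.birth_year is not None:
        for template, applies in CONTEXT_RULES:
            if applies(person.birth_year):
                questions.append(template.format(name=person.name))
    return questions + person.custom_questions


worth = Person("Worth Tucker", birth_year=1870)
worth.custom_questions.append("What were Worth Tucker's occupations?")
all_questions = questions_for(worth)
```

Keeping the custom questions on the person gives the "one place to keep that list" while the generated questions can be recomputed whenever more is learned about dates and places.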
The next thing that I would like to do is add source documents. These could be scanned images of birth and marriage certificates, links to images that live online, a typed family history in PDF format, photographs, or many other forms. If the document is typed, then the software would use OCR to create a transcript. In terms of the Genealogical Proof Standard (GPS) and Evidence Explained (EE), this transcript is a derivative source as opposed to an original source. Now I have two sources that make up part of a source provenance. When a new source is added to the system, answering a few questions lets me indicate whether that source is original and, if not, how it is or may be related to the original. Depending on the type of source entered, EE templates tell me the important information that I need to record about that source. This becomes my citation.
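One way to picture source provenance is as a chain in which each derivative source points back at the source it was made from. This is only a sketch: the `Source` class, its fields, and the `provenance` helper are hypothetical names, and marking the typed history as "original" stands in for whatever the user answers when adding it.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Source:
    title: str
    form: str                                 # "original" or "derivative"
    derived_from: Optional["Source"] = None   # the source this one was made from
    how_derived: str = ""                     # e.g. "OCR", "photocopy", "abstract"


def provenance(source: Source) -> List[Source]:
    """Walk back from a source to the earliest source recorded for it."""
    chain = []
    current: Optional[Source] = source
    while current is not None:
        chain.append(current)
        current = current.derived_from
    return chain


# Assumption for illustration: the user marked the typed history as original.
history = Source("Typed family history (PDF)", form="original")
transcript = Source("OCR transcript of family history", form="derivative",
                    derived_from=history, how_derived="OCR")

chain_titles = [s.title for s in provenance(transcript)]
```

Here `chain_titles` lists the transcript first and the typed history last, which is the two-source provenance described above.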
On websites such as FamilySearch, Ancestry, or even family history blogs, clicking a single link would let me download the source image, the citation, and even the data into my database. For example, for a census result on Ancestry, I could choose to download information for a single household (names, relationship to head of household, ages, calculated birth year, gender, can read, can write, birth place, etc.) or for all households on this census page and the previous and next pages. They would be imported as single household clusters as well as being related to specific source pages.
For any source, I can choose to create a transcript or an abstract of the document. On either the original or a derivative, I can highlight or annotate names, dates, places, or events, and they would become part of the searchable, accessible data in my database. For each source, I can indicate informants (either specifically, like “Moses Tucker”, or generally, like “probably a doctor”). Knowing the informant(s) for a document, we can start to understand who might have provided which information (e.g., a death certificate can have both a doctor and a family member as informants). Information is provided either by an eyewitness or participant, which makes it primary information, or by someone who received the knowledge from someone else, which makes it secondary.
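The informant idea could be modeled roughly as follows. The classes `Informant` and `Information` and the `firsthand` flag are hypothetical names for this sketch, and the two statements are illustrative rather than taken from a real certificate.

```python
from dataclasses import dataclass


@dataclass
class Informant:
    description: str      # a specific name or a general guess
    identified: bool      # False when the informant is only presumed


@dataclass
class Information:
    statement: str
    informant: Informant
    firsthand: bool       # True if the informant witnessed or took part

    @property
    def quality(self) -> str:
        """Primary if firsthand knowledge, otherwise secondary."""
        return "primary" if self.firsthand else "secondary"


doctor = Informant("probably a doctor", identified=False)
relative = Informant("Moses Tucker", identified=True)

# One death certificate, two informants, two kinds of information.
death_details = Information("date and cause of death", doctor, firsthand=True)
birth_details = Information("date and place of birth", relative, firsthand=False)
```

Storing the informant per piece of information, rather than per source, is what lets a single document carry both primary and secondary information.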
Once I have the basics entered into the system for the first 1-3 generations, I can move on to researching specific ancestors. I can take one of the questions associated with an individual and click to create a research project with that as the objective of the research. I could also use a statement or hypothesis as the objective of the research project. The project allows me to look at a subset of sources as they pertain to a specific goal. As I enter information or add sources, these will be recorded in a research log associated with the project. I can add additional questions that I want answered as part of the project. EE defines evidence as being direct, indirect, or negative, and that relates to how well a piece of information in a source answers the research objective. In these terms, you cannot have evidence unless you also have a way to associate information with an objective.
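The point that evidence only exists relative to an objective can be sketched in code: below, a piece of information becomes evidence only when it is attached to a project. All names (`ResearchProject`, `Evidence`, `EvidenceType`, `add_evidence`) are hypothetical, and the source title and log date are made up for the example.

```python
from dataclasses import dataclass, field
from datetime import date
from enum import Enum
from typing import List, Optional, Tuple


class EvidenceType(Enum):
    DIRECT = "direct"
    INDIRECT = "indirect"
    NEGATIVE = "negative"


@dataclass
class Evidence:
    information: str
    source_title: str
    kind: EvidenceType    # judged against the project's objective


@dataclass
class ResearchProject:
    objective: str        # a question, statement, or hypothesis
    questions: List[str] = field(default_factory=list)
    evidence: List[Evidence] = field(default_factory=list)
    log: List[Tuple[date, str]] = field(default_factory=list)

    def add_evidence(self, information: str, source_title: str,
                     kind: EvidenceType, on: Optional[date] = None) -> Evidence:
        """Attach information to the objective and record it in the research log."""
        item = Evidence(information, source_title, kind)
        self.evidence.append(item)
        self.log.append((on or date.today(),
                         f"Added {kind.value} evidence from {source_title}"))
        return item


project = ResearchProject("When and where did Worth Tucker die?")
# Hypothetical source and date, for illustration only.
project.add_evidence("death date stated on certificate", "Death certificate",
                     EvidenceType.DIRECT, on=date(2011, 3, 1))
```

The same information could be direct evidence in this project and indirect evidence in another, which is why the classification lives on the link to the objective and not on the source.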
Let’s say that I am collaborating on this project with others. I want a data format that allows me to share not only people, places, dates, events, and relationships but also objectives, research logs, sources, information, evidence, and questions. What if there are specific tasks that can be identified and then split among many participants?
What I mean by this is that just because I found a Worth Tucker in a document doesn’t mean that he is my Worth Tucker. There could also be different spellings of the same name in different documents. I would like to be able to link these individuals together and enter a reason why I think they are the same. Throughout the system, they would appear as one individual. For example, one research project could be about his birth and another about his death. But when I look at him in the system, I would see one individual (with possible name variations) and two events: birth and death. But I need to easily be able to get back to the list of “persons” that make up this person in case I discover that I am on the wrong track with one of the sources. I can then easily unlink them.
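Linking and unlinking might look something like this sketch, where each name found in a document is kept as its own record and a person is just a set of links with reasons. `Persona`, `Person`, `link`, and `unlink` are hypothetical names, and the spelling "W. Tucker" and both source titles are invented for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Persona:
    persona_id: int
    name: str             # the name exactly as it appears in the source
    source_title: str


@dataclass
class Person:
    # persona_id -> (persona, reason the researcher believes it is this person)
    links: Dict[int, Tuple[Persona, str]] = field(default_factory=dict)

    def link(self, persona: Persona, reason: str) -> None:
        self.links[persona.persona_id] = (persona, reason)

    def unlink(self, persona_id: int) -> Persona:
        """Detach a persona; the persona record itself is left untouched."""
        persona, _reason = self.links.pop(persona_id)
        return persona

    def name_variations(self) -> List[str]:
        return sorted({persona.name for persona, _reason in self.links.values()})


birth_record = Persona(1, "Worth Tucker", "Birth record")
death_record = Persona(2, "W. Tucker", "Death record")

worth = Person()
worth.link(birth_record, "Name, date, and parents match the family Bible")
worth.link(death_record, "Same township and matching age at death")
variations_before = worth.name_variations()

removed = worth.unlink(2)          # I was on the wrong track with this source
variations_after = worth.name_variations()
```

Because unlinking only removes the association, nothing recorded from the death record is lost if it turns out to belong to someone else.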
My database now contains multiple names for a single ancestor, as well as names for witnesses, neighbors, or other people appearing in the same sources as my ancestors. With a single click, I can see a list of names of just ancestors, or a list of names of people who haven’t been associated with a tree, meaning they haven’t been proved as ancestors or they are (or possibly are) associates of my ancestors. If I could list them in order of the number of times they appear in the system, I might be able to learn which ones my ancestors associated with most and which ones I might want to consider as research leads. Clicking on one of these names will allow me to create a research project for this individual. Sometimes the best way to overcome brick walls in the research of our ancestors is to research someone they knew.
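Ranking unattached names by how often they appear is a small counting problem. A sketch using Python's `collections.Counter`, with a hypothetical `rank_associates` function and placeholder names ("John Doe", "Jane Roe") standing in for people found in the sources:

```python
from collections import Counter
from typing import Iterable, List, Set, Tuple


def rank_associates(mentions: Iterable[str],
                    ancestors: Set[str]) -> List[Tuple[str, int]]:
    """Count how often each non-ancestor name appears, most frequent first."""
    counts = Counter(name for name in mentions if name not in ancestors)
    return counts.most_common()


# One entry per appearance of a name in any source (placeholder data).
mentions = [
    "Worth Tucker", "John Doe", "Jane Roe", "John Doe",
    "Worth Tucker", "John Doe", "Jane Roe",
]
leads = rank_associates(mentions, ancestors={"Worth Tucker"})
# leads == [("John Doe", 3), ("Jane Roe", 2)]
```

In a real system the counting would run over linked persons rather than raw strings, so that spelling variations of one associate are counted together.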
I need an easy way to list all conflicts for a specific research objective. By having all the conflicts in one place, I can make notes by each one as I try to resolve it, until I come up with the most plausible conclusion. At that point, based on the research that I did, I can summarize my research as a conclusion. This will affect which information shows as the default on screen (and in reports) for an ancestor. Let’s say a research project to determine Worth Tucker’s birth date and location leads to many possible answers, but as I go through each conflict, I feel best about 30 Nov 1870 and Laurel Township, Ashe County, North Carolina. So I indicate that this is my conclusion. When I look at Worth in the system, I see that as his birth date, but I can also choose to see the other possible dates and places. The objective of the research project has been reached and I have a conclusion. The status of the project indicates that I reached a conclusion on a specific date.
A year later, I get additional information about this research objective so I re-open the research project and add the information. This might lead to a different conclusion which now shows as the default. But in the research project, I can look at the research logs, questions, information, sources, conflicts, and conclusions separately.
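The conflict, conclusion, and re-open cycle could be sketched as below. `Objective`, `Candidate`, and `Conclusion` are hypothetical names; the competing value "Nov 1871", the source titles, and the decision date are invented to have something to conflict with.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class Candidate:
    value: str
    source_title: str
    note: str = ""        # my working notes while resolving the conflict


@dataclass
class Conclusion:
    value: str
    decided_on: date


@dataclass
class Objective:
    question: str
    candidates: List[Candidate] = field(default_factory=list)
    conclusions: List[Conclusion] = field(default_factory=list)  # newest last

    def conflicts(self) -> List[Candidate]:
        """All candidates, but only when the sources disagree."""
        distinct = {c.value for c in self.candidates}
        return list(self.candidates) if len(distinct) > 1 else []

    def conclude(self, value: str, on: date) -> None:
        self.conclusions.append(Conclusion(value, on))

    @property
    def default_value(self) -> Optional[str]:
        """What shows on screen and in reports: the latest conclusion."""
        return self.conclusions[-1].value if self.conclusions else None

    @property
    def status(self) -> str:
        if not self.conclusions:
            return "open"
        return f"concluded on {self.conclusions[-1].decided_on.isoformat()}"


birth = Objective("What is Worth Tucker's birth date?")
birth.candidates.append(Candidate("30 Nov 1870", "Family Bible"))
birth.candidates.append(Candidate("Nov 1871", "Census", note="age may be rounded"))
open_conflicts = birth.conflicts()
birth.conclude("30 Nov 1870", on=date(2011, 3, 1))
```

Re-opening the project a year later is then just adding candidates and calling `conclude` again: the newest conclusion becomes the default, while earlier ones remain in `conclusions` as history.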
Online Backup & Sync
Even though I may be doing the work on my desktop software, I want my work constantly backed up to a secure location online so that I don’t lose any of my data or artifacts. In addition, I may choose to sync the data to an online database or peer-to-peer databases to protect my information and allow for collaboration.
If we take the above as use cases or user stories of what GenPerfect should do, then we can understand that the data model needs to support:
- People and/or Names
- Sources (GPS)
- Source Provenance
- Citations (EE Templates, GPS)
- Association of People, Dates, Places, etc. to a Source
- Questions (also Statements and Hypotheses)
- Research Logs
- Analysis: sources, information, evidence (GPS)
What would be in your GenPerfect? What else should be in the data model?