All GEDCOM records are given unique identifiers known as xrefs
(cross-references) to allow other records to link to them. These are
alphanumeric strings surrounded by ‘@’ symbols. The tidyged
package creates these xrefs automatically:
library(tidyged)
simpsons <- gedcom(subm("Me")) |>
add_indi(sex = "M") |>
add_indi_names(name_pieces(given = "Homer", surname = "Simpson")) |>
add_indi(sex = "F") |>
add_indi_names(name_pieces(given = "Marge", surname = "Simpson")) |>
add_indi(sex = "F") |>
add_indi_names(name_pieces(given = "Lisa", surname = "Simpson")) |>
add_indi(sex = "M") |>
add_indi_names(name_pieces(given = "Bart", surname = "Simpson")) |>
add_note("This is a note")
#> Added Male Individual: @I1@
#> Added Female Individual: @I2@
#> Added Female Individual: @I3@
#> Added Male Individual: @I4@
#> Added Note: @N1@
dplyr::filter(simpsons, tag %in% c("INDI", "NOTE")) |>
knitr::kable()
level | record | tag | value |
---|---|---|---|
0 | @I1@ | INDI | |
0 | @I2@ | INDI | |
0 | @I3@ | INDI | |
0 | @I4@ | INDI | |
0 | @N1@ | NOTE | This is a note |
Note the unique xrefs in the record column.
In the above example a series of records are created (which will be explained in more detail in the proceeding articles). After each record is created, the name(s) of the individual are defined without actually explicitly referencing the Individual record. This is because they are acting on the active record. A record becomes active when it is created or when it is explicitly activated.
We can query the active record using the active_record()
function:
Since the last record to be created was the Note record, it is the active record. The active record is stored as an attribute of the tibble.
We can use activation to add to existing records. If we want to
activate another record, we can activate it using the
activate_*()
family of functions together with its
xref:
There are many other functions in the gedcompendium
that
take record xrefs as input parameters and it can be tedious to have to
manually look these up. The tidyged
package offers a number
of helper functions to locate specific xrefs using pattern matching:
find_indi_name(simpsons, "Bart")
#> [1] "@I4@"
find_indi_name_all(simpsons, "Simpson")
#> [1] "@I1@" "@I2@" "@I3@" "@I4@"
These helper functions begin with find_*
and act as
wrappers to the more general function find_xref()
. It’s
straightforward to write your own wrapper if you’re familiar with the
tags used in the GEDCOM specification.
In the activation example, we would activate Marge’s record with:
Note that the full name does not need to be given, since the term is partially matched. As long as it is detected in the name of the individual it will be found.
In this use case, if no match or more than one match is found, it will result in an error:
When removing entire records, you don’t have to necessarily rely on activating them first. The same referencing techniques above can be used to remove records immediately:
xref | name | sex | date_of_birth | place_of_birth | date_of_death | place_of_death | mother | father | num_siblings | num_children | last_modified |
---|---|---|---|---|---|---|---|---|---|---|---|
@I2@ | Marge Simpson | F | 0 | 22 NOV 2024 | |||||||
@I3@ | Lisa Simpson | F | 0 | 22 NOV 2024 | |||||||
@I4@ | Bart Simpson | M | 0 | 22 NOV 2024 |
In all the examples you’ve seen so far the approach has been to build up the tree one record at a time. There are a number of helper functions that allow you to shortcut this laborious exercise. These functions can create multiple records at once, including Family Group records, where you can go back and add more detail. The functions are:
add_parents()
add_siblings()
add_spouse()
add_children()
They all require the xref of an Individual record (or one to be
activated), except for add_children()
, which requires the
xref of a Family Group record. These functions do not change the active
record.
Because of this, you cannot use add_children()
in a
single pipeline with the other functions.
The feedback from these functions gives you the necessary xrefs to then add more detail.
To illustrate, we can build up two families starting with a spouse:
from_spou <- gedcom(subm("Me")) |>
add_indi(sex = "M") |>
add_parents() |>
add_siblings(sexes = "MMFF") |>
add_spouse(sex = "F")
#> Added Male Individual: @I1@
#> Added Family Group: @F1@
#> Added Male Individual: @I2@
#> Added Female Individual: @I3@
#> Added Male Individual: @I4@
#> Added Male Individual: @I5@
#> Added Female Individual: @I6@
#> Added Female Individual: @I7@
#> Added Family Group: @F2@
#> Added Female Individual: @I8@
The initial individual (@I1@) gets added as a child to a family (@F1@) with two parents (@I2@ and @I3@) and 4 siblings (@I4@ to @I7@). Finally, he is given a spouse (@I8@) in his own family (@F2@).
Now we have the xref of his family, we can add his two daughters:
with_chil <- from_spou |>
add_children(xref = "@F2@", sexes = "FF")
#> Added Female Individual: @I9@
#> Added Female Individual: @I10@
Now we have the records, we can use all of these xrefs to add details like names and facts.
The tidyged.utils
package contains the function
add_ancestors()
to create Individual and Family Group
records for entire generations of ancestors.
Record identifiers have been a topic of much discussion in the GEDCOM
user community. Even though xref identifiers will be imported unchanged
in the tidyged
package, some systems do create their own
xref identifiers on import. So you cannot assume they will survive
between systems. However, they should always be internally
consistent.
A couple of other mechanisms exist for providing unique identifiers to records:
tidyged
package does not use this, nor expose it to a user;find_*_refn()
).For these reasons, neither of these mechanisms are considered to be a better alternative way of selecting records.