All top-level records in a GEDCOM file can record the date and time
they were last modified. The tidyged package (the main package for
creating and summarising GEDCOM files) includes a change date
(today’s date) by default every time a record is created or modified.
Since the time of day is very unlikely to be useful in such a context,
the package omits it by default. We illustrate by loading the tidyged
and tidyged.utils packages and creating an example object.
level | record | tag | value |
---|---|---|---|
0 | HD | HEAD | |
1 | HD | GEDC | |
2 | HD | VERS | 5.5.5 |
2 | HD | FORM | LINEAGE-LINKED |
3 | HD | VERS | 5.5.5 |
1 | HD | CHAR | UTF-8 |
1 | HD | DEST | gedcompendium |
1 | HD | SOUR | gedcompendium |
2 | HD | NAME | The ‘gedcompendium’ ecosystem of packages for the R language |
2 | HD | CORP | Jamie Lendrum |
3 | HD | ADDR | |
3 | HD | EMAIL | [email protected] |
3 | HD | WWW | https://jl5000.github.io/tidyged/ |
1 | HD | DATE | 22 NOV 2024 |
1 | HD | LANG | English |
1 | HD | SUBM | @U1@ |
0 | @U1@ | SUBM | |
1 | @U1@ | NAME | Me |
1 | @U1@ | CHAN | |
2 | @U1@ | DATE | 22 NOV 2024 |
0 | TR | TRLR | |
Rows 19 and 20 show the change date for the submitter record.
For GEDCOM files with thousands of records, change dates can add
considerable bloat. For this reason, all change date structures can be
removed with the remove_change_dates() function:
level | record | tag | value |
---|---|---|---|
0 | HD | HEAD | |
1 | HD | GEDC | |
2 | HD | VERS | 5.5.5 |
2 | HD | FORM | LINEAGE-LINKED |
3 | HD | VERS | 5.5.5 |
1 | HD | CHAR | UTF-8 |
1 | HD | DEST | gedcompendium |
1 | HD | SOUR | gedcompendium |
2 | HD | NAME | The ‘gedcompendium’ ecosystem of packages for the R language |
2 | HD | CORP | Jamie Lendrum |
3 | HD | ADDR | |
3 | HD | EMAIL | [email protected] |
3 | HD | WWW | https://jl5000.github.io/tidyged/ |
1 | HD | DATE | 22 NOV 2024 |
1 | HD | LANG | English |
1 | HD | SUBM | @U1@ |
0 | @U1@ | SUBM | |
1 | @U1@ | NAME | Me |
0 | TR | TRLR | |
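To make the effect on the table concrete, here is a minimal base R sketch of what removing a change date structure amounts to. It works on a toy data frame with the same four columns as the tables above. The helper `drop_change_dates()` is hypothetical and is not how `remove_change_dates()` is implemented; it only assumes that a change date structure is a level 1 `CHAN` row followed by its deeper rows.

```r
# Toy GEDCOM-like table with the same columns as the tables above
ged <- data.frame(
  level  = c(0, 1, 1, 2, 0),
  record = c("@U1@", "@U1@", "@U1@", "@U1@", "TR"),
  tag    = c("SUBM", "NAME", "CHAN", "DATE", "TRLR"),
  value  = c("", "Me", "", "22 NOV 2024", ""),
  stringsAsFactors = FALSE
)

# Hypothetical helper (an assumption for illustration, not package code):
# drop each level 1 CHAN row together with the deeper rows beneath it
drop_change_dates <- function(ged) {
  keep <- rep(TRUE, nrow(ged))
  in_chan <- FALSE
  for (i in seq_len(nrow(ged))) {
    if (ged$level[i] == 1 && ged$tag[i] == "CHAN") {
      in_chan <- TRUE    # start of a change date structure
    } else if (ged$level[i] <= 1) {
      in_chan <- FALSE   # any other row at level 0 or 1 ends the structure
    }
    if (in_chan) keep[i] <- FALSE
  }
  ged[keep, , drop = FALSE]
}

slim <- drop_change_dates(ged)
print(slim$tag)
```

In this toy table, two of the four submitter rows belong to the change date structure, which gives a sense of how quickly these rows accumulate across thousands of records.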
Records that are not referenced anywhere else can be found with the
identify_unused_records() function.
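As a rough illustration of what "not referenced anywhere else" can mean, the base R sketch below compares the xrefs that define records with the xrefs that appear as pointer values. The helper `find_unreferenced()` and the toy data are assumptions made for illustration; the package's own logic may differ.

```r
# Toy GEDCOM-like table: one individual, two families, one repository
ged <- data.frame(
  level  = c(0, 1, 0, 1, 0, 0, 1),
  record = c("@I1@", "@I1@", "@F1@", "@F1@", "@F2@", "@R1@", "@R1@"),
  tag    = c("INDI", "FAMS", "FAM", "HUSB", "FAM", "REPO", "NAME"),
  value  = c("", "@F1@", "", "@I1@", "", "", "Test repo"),
  stringsAsFactors = FALSE
)

# Hypothetical helper (an assumption for illustration, not package code):
# a record counts as unreferenced if its xref never appears as a value
find_unreferenced <- function(ged) {
  xref_pattern <- "^@[A-Za-z0-9]+@$"
  # xrefs that define a record (level 0 rows)
  defined <- unique(ged$record[ged$level == 0 & grepl(xref_pattern, ged$record)])
  # xrefs that some row points to
  pointed_to <- unique(ged$value[grepl(xref_pattern, ged$value)])
  setdiff(defined, pointed_to)
}

unused <- find_unreferenced(ged)
print(unused)
```

Here the empty family and the repository are flagged, while the individual and the family that point at each other are not.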
In the example below we create six family group records, half with
members and half without, together with an unreferenced Repository record:
some_unref <- gedcom(subm("Me")) |>
  add_indi(qn = "Tom Smith") |>
  add_indi(qn = "Tammy Smith") |>
  add_indi(qn = "Alice White") |>
  add_indi(qn = "Phil Brown")
#> Added Unknown Individual: @I1@
#> Added Unknown Individual: @I2@
#> Added Unknown Individual: @I3@
#> Added Unknown Individual: @I4@
tom_xref <- find_indi_name(some_unref, "Tom")
tammy_xref <- find_indi_name(some_unref, "Tammy")
phil_xref <- find_indi_name(some_unref, "Phil")
alice_xref <- find_indi_name(some_unref, "Alice")
some_unref <- some_unref |>
  add_famg(husband = tom_xref, wife = tammy_xref) |>
  add_famg() |>
  add_famg(husband = phil_xref) |>
  add_famg() |>
  add_famg(children = alice_xref) |>
  add_famg() |>
  add_repo("Test repo")
#> Added Family Group: @F1@
#> Added Family Group: @F2@
#> Added Family Group: @F3@
#> Added Family Group: @F4@
#> Added Family Group: @F5@
#> Added Family Group: @F6@
#> Added Repository: @R1@
identify_unused_records(some_unref)
#> [1] "@F2@" "@F4@" "@F6@" "@R1@"
We can find out more about these xrefs by using the describe_records()
function from the tidyged package:
identify_unused_records(some_unref) |>
  describe_records(gedcom = some_unref)
#> [1] "Family @F2@, headed by no individuals, and no children"
#> [2] "Family @F4@, headed by no individuals, and no children"
#> [3] "Family @F6@, headed by no individuals, and no children"
#> [4] "Repository @R1@, Test repo"