Geni consistency & plausibility checker

Started by Private User on Saturday, April 29, 2017

Showing 151-180 of 503 posts

Dan, I'm pushing out v4.1.5, which will fix that FindAGrave issue. However, keep an eye on FindAGrave. My code for parsing the death date was a little different from the code for the birth date, and I can't remember whether that was intentional. I duplicated the birth code for the death date and it now parses correctly, but I'm not certain I didn't break something else for death dates.

v4.1.5 date comparisons

While comparing MM/DD/YYYY to MM/YYYY seems to work (not warning when MM/YYYY are the same), I did run into this:
https://www.geni.com/family-tree/index/6000000057329390031
where the 'flagged' pair have:
8/1854 vs. 1855
as a warning.

Probably should not be 'flagged' ... although ...

... should two siblings with same birth year (year only for both) be flagged with warning? or as twins? Maybe warn only if the 'months between' is set > 12; otherwise consider them as twins.

Dan, just a thought and I am not sure if this is relevant to your post above with Jeff.

Two siblings, not necessarily twins, can also be born in the same year say 7 - 9 months apart.

A good example of what Pam Karp just mentioned is two brothers of my 3rd great grandmother

Isac Mendel born in January of 1827
Salomon Mendel born in November of 1827

See also the term "Irish twin", now considered offensive: https://en.wiktionary.org/wiki/Irish_twin

A long shot perhaps...

Does it check any non-Roman characters in the name fields? Geni has implemented a warning, but only when you go to edit, and you have to manually correct it.

If you could detect which language it is, would it be possible to have a one-click transfer to the right language (better ask user to specify which language)? Especially for names that have the opposite order from English (Hungarian, Chinese, Japanese, Korean, Vietnamese). That has been my top wish since we had multi-lingual names.

re: non-twins born in the same year.

Certainly can & does happen. Which is why I suggest that siblings with the same birth year where one has a "year-only" date value NOT be flagged as a warning. Warnings would be issued only if BOTH siblings have at least MM/YYYY *and* not same value of month (or month+day, if both profiles have full date details).

On the other hand ... maybe that could be a separate config option ("Inform if multiple siblings in same year without identical MM/DD/YYYY") -- default OFF -- so that one could check for possible mis-merged children and/or dubious data. (7 children in one year? Possibly mis-merges and/or arbitrary guesses as to birth dates. I've seen both instances.)
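
For illustration only, the sibling rule proposed in this post could be sketched in Python roughly as follows. This is not the extension's actual code: `PartialDate`, `parse_partial`, `should_warn` and the 9-month default gap are made-up names and values, and the sketch assumes dates arrive as YYYY, MM/YYYY or MM/DD/YYYY strings.

```python
from typing import NamedTuple, Optional


class PartialDate(NamedTuple):
    """A birth date that may be missing the day, or the month and day."""
    year: int
    month: Optional[int] = None
    day: Optional[int] = None


def parse_partial(text: str) -> PartialDate:
    """Parse 'YYYY', 'MM/YYYY' or 'MM/DD/YYYY' (assumed input formats)."""
    parts = [int(p) for p in text.strip().split("/")]
    if len(parts) == 1:
        return PartialDate(parts[0])
    if len(parts) == 2:
        return PartialDate(parts[1], parts[0])
    if len(parts) == 3:
        return PartialDate(parts[2], parts[0], parts[1])
    raise ValueError(f"unrecognised date: {text!r}")


def should_warn(a: PartialDate, b: PartialDate, min_months: int = 9) -> bool:
    """Return True if two siblings' birth dates look implausibly close."""
    # Year-only on either side: too imprecise to judge, so stay quiet.
    if a.month is None or b.month is None:
        return False
    # Same month and year: treat as twins unless both days are known and differ.
    if (a.year, a.month) == (b.year, b.month):
        if a.day is not None and b.day is not None:
            return a.day != b.day
        return False
    # Otherwise compare the gap in whole months against the threshold.
    gap = abs((a.year - b.year) * 12 + (a.month - b.month))
    return gap < min_months


print(should_warn(parse_partial("8/1854"), parse_partial("1855")))      # year-only sibling
print(should_warn(parse_partial("1/1827"), parse_partial("11/1827")))   # 10 months apart
print(should_warn(parse_partial("11/1922"), parse_partial("5/10/1923")))  # 6 months apart
```

With these assumptions, the 8/1854 vs. 1855 pair and the January/November 1827 brothers are left alone, while a 6-month gap is flagged. The "default OFF" option mentioned above would be a separate check layered on top of this.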

At the moment (v4.1.6), same MM & YYYY on two siblings are NOT considered twins (and thus warning is showing). They should be considered twins.

Example: https://www.geni.com/family-tree/index/6000000041215082295 or
https://www.geni.com/family-tree/index/6000000002869148594 (simpler view of same family).

How about a "fix all" button, rather than clicking them all one by one?

re: fix all ... I'd only support that for each 'class' of fix. That is, "fix all upper/lower case issues" could be one such example. (I suspect we'll have other kinds of 'one-click-fixes'.)

Dan Cornett, I don't see the problem with the profiles you linked to yesterday. Violet is listed as being born Nov 1922 and Marion 10 May 1923. That's a 6-7 month difference.

Oh wait.. you changed it from c1923. Let me test that.

I agree on "fix all" button for case problems. Sometimes I am overwhelmed by the number of changes required.

Hopefully over time the case problems will start to diminish.
I have to say I am "having fun" with them and have started to chase them up and down trees. They have always bugged me but now it is so easy to fix them, even if some pages have lots of them.

Great tool the fix case. Thanks

Fix all might be a bit tricky. Short term, might it help visibility if when you clicked the fix, it also put a strikethrough (line through the middle of) the text on the associated issue?

I like just the 'check-mark'; I've run into a number of cases in the past 24 hours where a 'fix-all' would cause more trouble than help.

Jeff: The hover text of the "proposed fixed string" doesn't seem to be working for me at the moment.

v 4.1.6 - clicking the 'recycle' icon does not re-read the config data. The config data *are* re-read when the window or Tree refreshes.

v 4.1.7 - hover text is back working! (Thanks!)

bug? First message on this Tree View about John Richmond's burial date seems to be wrong. Parsing problem? Logic? ...?

https://www.geni.com/family-tree/index/6000000038150295279#60000000...

I have another check that might be interesting. As we should not have profiles in our tree older than 125, they should also not be marrying or having children beyond that age. Is that also implemented?
(It will already raise an error because the person is over that age, but it could also raise errors about the children, who might not belong to that father. So it is not a duplicate warning, and there is value in knowing that the children might be wrong.)
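
As a rough sketch of what such a check might look like (in Python, purely illustrative and not taken from the tool), one could compare each child's birth year against the parent's. The function name `children_born_too_late` and the year-only inputs are assumptions; the 125-year limit is the figure from the post.

```python
from typing import List

MAX_AGE = 125  # limit mentioned in the post; assumed here to be configurable


def children_born_too_late(parent_birth_year: int,
                           child_birth_years: List[int],
                           max_age: int = MAX_AGE) -> List[int]:
    """Return the child birth years that fall more than max_age years
    after the parent's birth year (hypothetical helper)."""
    return [year for year in child_birth_years
            if year - parent_birth_year > max_age]


# A parent born in 1600 with children recorded in 1630, 1726 and 1800.
print(children_born_too_late(1600, [1630, 1726, 1800]))
```

Reporting the affected children separately, rather than only the parent's age, is what gives the extra value described above: it points at the links that may be wrong.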

Dan, oddly 1740 is in the "day" field for John's profile.

re: 1740 year in day field. Huh. Easily fixed by re-entering.

Maybe another check? (Day field > 31) if no year field, offer to move from day to year?

(I should have left it for you to experiment with ...)
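
The day-field idea above could be sketched like this in Python. It is only an illustration under assumed field names (`day`, `year`) and an assumed helper name `suggest_day_to_year`; nothing here reflects how Geni or the checker actually stores dates.

```python
from typing import Dict, Optional


def suggest_day_to_year(day: Optional[int],
                        year: Optional[int],
                        max_day: int = 31) -> Optional[Dict[str, Optional[int]]]:
    """If the day is impossible (> max_day) and no year is set, propose
    moving the value into the year field. Returns None when there is
    nothing to suggest (hypothetical helper)."""
    if day is not None and year is None and day > max_day:
        return {"day": None, "year": day}
    return None


print(suggest_day_to_year(1740, None))  # proposes moving 1740 to the year field
print(suggest_day_to_year(15, None))    # plausible day, so no suggestion
```

The sketch only makes a suggestion when the year is empty, matching the "if no year field" condition above, so an existing year is never overwritten.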

I'm thinking of making the sibling check not flag a pair if one profile has just a year, at least until I write something that looks at whether the month falls in the middle of the year or something like that.

Private User, the incorrect case checker isn't catching all of the incorrect cases.

Examples
* Sarah McGowan (huntington) - (huntington) in last name in brackets
* Clara Elizabeth Ann Towers - all capitals in first name and birth name

Another one
* Louisa Cawdell Weston - all lowercase in first name

Jeff, does the checker look for full stops in names?

re: finding 'all one case'

The Consistency Checker is using a simple check: it looks at the entire field and says it's "OK" if there is a mix of upper & lowercase.

In Sarah's case, "McGowan (huntington)" is all in one field (Last-name), so it looks like valid mixed-case text. If "huntington" were properly in the Birth-surname, it would have been identified (and correctable).

I don't know why Louisa Cawdell Weston isn't being caught. (Maybe spaces within fields is an issue?)
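
A whole-field check of the kind described in this post might look like the Python sketch below. It is a guess at the general idea, not the checker's real code, and the per-word variant is a hypothetical alternative rather than what any version of the tool does. The names `field_looks_ok` and `single_case_words` are invented for the example.

```python
from typing import List


def field_looks_ok(field: str) -> bool:
    """Whole-field check: OK if the field mixes upper and lower case.
    Fields without any letters are treated as OK."""
    letters = [c for c in field if c.isalpha()]
    if not letters:
        return True
    has_upper = any(c.isupper() for c in letters)
    has_lower = any(c.islower() for c in letters)
    return has_upper and has_lower


def single_case_words(field: str) -> List[str]:
    """Per-word check: return words (longer than one letter) that are
    entirely lower case or entirely upper case."""
    suspects = []
    for word in field.split():
        core = word.strip("()[]\"'.,")
        if len(core) > 1 and (core.islower() or core.isupper()):
            suspects.append(core)
    return suspects


print(field_looks_ok("McGowan (huntington)"))     # mixed case overall, so it passes
print(single_case_words("McGowan (huntington)"))  # the per-word check singles out one word
```

A per-word check would also flag legitimate lowercase particles such as "van" or "de", so it would probably need an exception list before being useful.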

Hmm, yeah - not quite sure why this was working incorrectly, but I rewrote the case checker for version 4.1.9 to use a different method, which should hopefully support various character sets better. The rewrite also removed the problems with those profiles.

This change affects both the consistency checker and the case detection / correction in the copy tool. It should be published within the hour.
