Suggestions for the developers

Started by Private User on Thursday, January 6, 2011

Showing all 21 posts

There are a few things I notice over and over again in merges that are simple errors or oversights, but that can easily lead to problems once the profiles are public and start being compared to others. For several of them it seems like a few programming fixes could make a big difference so far as clean data is concerned.

Date estimates:
I have (as I assume many do) multiple people with the same name, and in fact, the same sibling names, in my family tree. When there's no date info given with them, the merge issues page tries to encourage me to merge them, even though they're sometimes several hundred years apart in age. Would it be possible to generate approximate dates for such entries so as to avoid such issues? The system does know what generation these people are, so it should be able to extrapolate that my 12th great-grandmother and 6th great-grandaunt did not live at the same time. It doesn't have to be anything *at all* accurate, and could be as simple as 'pre-[date of first child's birth]' or 'post-[parent's birthdate]', but it would just be nice to have approximate dates show up so you can tell the difference between generations at a glance.
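To make the 'pre-'/'post-' idea concrete, here is a rough Python sketch. It is only an illustration, not how Geni works: the function name `estimate_birth_label` and its inputs (lists of known birth years for parents and children, with `None` for unknown) are my own assumptions.

```python
from typing import Optional, Sequence


def estimate_birth_label(
    parent_birth_years: Sequence[Optional[int]],
    child_birth_years: Sequence[Optional[int]],
) -> Optional[str]:
    """Return a rough birth label for a person with no birthdate of their own.

    Hypothetical helper: a person must be born after both parents were born
    and before their own first child was born.
    """
    parents = [y for y in parent_birth_years if y is not None]
    children = [y for y in child_birth_years if y is not None]

    lower = max(parents) if parents else None   # born after the younger parent
    upper = min(children) if children else None  # born before the first child

    if lower is not None and upper is not None:
        return f"between {lower} and {upper}"
    if upper is not None:
        return f"pre-{upper}"
    if lower is not None:
        return f"post-{lower}"
    return None  # nothing nearby to extrapolate from


print(estimate_birth_label([], [1522, 1520]))
print(estimate_birth_label([1480, None], []))
```

Even bounds this loose would be enough to keep a 15th-century Mary from being offered as a match for a 19th-century one.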

Assumption of Death:
I'd like the system to assume that anyone entered with a birthdate before, say, 1870 (and definitely anyone with a death date given) is deceased, and to automatically set the living check box to 'false'. I notice people miss it sometimes when they're in a hurry, and then the merge suggestions balk because those fields don't match. This is a fairly simple check, and would keep the data cleaner than it is currently.
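The check could be as small as the following Python sketch. The function name `infer_living` and the `cutoff_year` parameter are made up for illustration, and 1870 is just the example year from above, not a value Geni is known to use.

```python
from typing import Optional


def infer_living(
    birth_year: Optional[int],
    death_year: Optional[int],
    living_flag: bool,
    cutoff_year: int = 1870,  # assumption: the example cutoff suggested above
) -> bool:
    """Return the value the 'living' check box should have."""
    if death_year is not None:
        return False  # a death date always means not living
    if birth_year is not None and birth_year < cutoff_year:
        return False  # born too long ago to still be alive
    return living_flag  # otherwise trust what the user entered


print(infer_living(1650, None, True))
print(infer_living(1985, None, True))
```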

Died in Infancy Indicator:
Additionally, it would be nice if there was some sort of indicator for 'died as a child'--a lot of times it's hard to tell if a family really did have three boys named William or if it's one child with multiple birthdates given in multiple trees. I think being able to mark that yes, this family had two boys named William who died as infants before one survived to adulthood would help keep data clean across the board.

Good suggestions... what do you think Mike Stangel

I like it :)

Oh, and case insensitivity with the names would also help--Perkins, perkins, and PERKINS are all obviously (to humans, not machines) the same name, and simply making the name fields case insensitive would clear them up. (I know there are many more complicated problems with names than just this, but this seems like a partial fix that should be easy enough to implement, and that would make merges cleaner and more comprehensive.)
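A case-insensitive comparison is a short piece of code. This Python sketch uses the standard `str.casefold()` method; the helper names `normalize_name` and `same_name` are hypothetical, and trimming stray whitespace is an extra assumption on my part.

```python
def normalize_name(name: str) -> str:
    """Collapse runs of whitespace and fold case so names compare equal."""
    return " ".join(name.split()).casefold()


def same_name(a: str, b: str) -> bool:
    """Compare two name fields while ignoring case and stray spaces."""
    return normalize_name(a) == normalize_name(b)


print(same_name("Perkins", "PERKINS"))
print(same_name("Perkins", "Parkins"))
```

The stored name can stay exactly as the user typed it; only the comparison uses the folded form.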

How about something to signify derivatives of names, e.g. Dunley, Dunlay, Dunlea?
A few surnames in my trees changed spelling over the generations, including my own:
Boudrot to Boudreau.
As opposed to having every derivative listed in one's profile name, it could make things less confusing?

Just got done sorting thru 320 tree matches.....

Believe that Geni matches by first name of anyone in the profile to ANY name in a potential match...last name does not appear to be used as a qualifier and there is NO consideration for date....more than 3/4 of the tree matches were more than 2 centuries off the mark....and last names were also way off....

I can only come to the conclusion that Mary brings up every Mary in ANY profile....perhaps that is too simplistic but when 1370 is (by Geni) matched to 2006 and last name Garneys is matched to Zwiebecki and the ONLY thing in common is 1 or more FIRST names...then that is the only conclusion I can come up with...

So every Mary who had anyone in their tree named Robert brings up EVERY Mary/Robert combination...no matter the relationship or date.

IF last names were a criterion (Renee), then you would NOT see other surnames appear in tree matches...I don't believe it matters much HOW you spell the last name since it doesn't appear to be taken into account..

The MOST LIKELY match is the first few listings...after that it's anything remotely alike.

Mike Stangel

Can you clarify the matching criteria and advise if some of the suggestions here are possible to add as enhancements?

I would have thought "disregard case" is already in place, for instance.

I think something like that is a good idea, Renee, but at the same time the trick is figuring out how to implement it easily. You're asking that the computer know not just what you said, but what you meant to say. The only way it could know Dunlea, Dunley, and Dunlay are all supposed to be the same name would be if it was programmed to accept such things, and given the insane numbers of alternate last names out there, it wouldn't be a feasible update.

That said, there are alternatives which could make a big difference with a more reasonable amount of work. One could be extra, non-display name fields--say, a nickname field for first names, and an 'alternate spellings' field for last names. You could list other names (Joanna, Hannah, Jo Anna; Fitz Randolph, Fitzrandolf, Randolph, etc.) and thus even if your ancestor is listed as Joanne Randolph, the system would be able to suggest that Joanna FitzRandolph in another tree could be the same person. The programming needed to add the fields and include them in searches and merge comparisons, without displaying them in summaries, would probably be outweighed by the value added to the profiles.

Fay--That characteristic of the tree matches is what makes me suggest additional checks against dates. Many of my tree matches show up because there's no birthdate for someone, which 'matches' another person without a birthdate. Estimating birthdates would cut down on that, and if it could limit searches for Mary to within 100 years of the date (to allow for mistakes in data entry) and, oh, say NOT match every single Mary without a last name to each other, those two things would make a big difference in the data we get given.
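Those two checks could look something like this in Python. It is a sketch under my own assumptions (a hypothetical `Person` record and `plausible_match` function), and it only shows the two filters described above, not a full matcher; it does not compare surnames with each other.

```python
from typing import NamedTuple, Optional


class Person(NamedTuple):
    first: str
    last: Optional[str]        # None when no surname was entered
    birth_year: Optional[int]  # known or estimated; None when unknown


def plausible_match(a: Person, b: Person, max_gap_years: int = 100) -> bool:
    """Reject the obviously wrong suggestions before showing them."""
    if a.first.casefold() != b.first.casefold():
        return False
    # Do not match every surname-less Mary to every other surname-less Mary.
    if not a.last and not b.last:
        return False
    # When both sides have a date, they must be within the allowed window.
    if a.birth_year is not None and b.birth_year is not None:
        if abs(a.birth_year - b.birth_year) > max_gap_years:
            return False
    return True


print(plausible_match(Person("Mary", "Garneys", 1370),
                      Person("Mary", "Zwiebecki", 2006)))
```

Combined with estimated birthdates, far fewer pairs would reach the date check with both sides unknown.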

A field for "alternate spellings" is a good idea.

Has anyone tested whether the new "surname" search associates alternates? I think it might but not sure if that's based on text input by us.

I agree that putting in approx dates (checking the c. box) would be enormously helpful in matching and something WE can do.

What about date range within the family bloc....as a criterion..

I honestly believe that alternate spellings would be prohibitive....then if they were attached to a family, even if there were NO dates, they would be included within that bloc....

1st parent birth to last child born...as a range

Don't forget that there IS a circa key...it allows for 5 years either way...

Not perfect but would eliminate 1990's showing in 1400's...

Cecelia...what about first 3 letters of last name...(that still wouldn't fix problems with von, van, del, Mc, Mac....etc)....that's why I suggest date range....

There is a circa button, but it only works if the people entering data remember to check it. And well, if everyone entered data the same way, none of these suggestions would be necessary. :) The date estimate suggestion is not so much for the use of the person entering the info as it is for others who want to use their data. I'm trying to think of ways that the system can adjust for data that's entered incorrectly or incompletely.

There are enough times that unknown relatives are just listed with no date, no last name, nothing. Basically, all that's known is that they're probably named Mary and they're married to an ancestor. In a perfect world, yes, the person who enters them would guess at the age based off of the children or husband's age, would list them with the husband's surname (better than nothing), etc. But if those things aren't given, the system has been given enough details around this person to be able to make such assumptions and display them in the absence of other information.

Had someone ask about the progress on adoption in Geni....

Wasn't Jesus adopted by Joseph???
So maybe the question NEEDS to be more seriously addressed.

Please don't anyone take my last comment wrongly...Just trying to point out that adoption NEEDS to be addressed a bit more actively....was more prevalent than is given significance....quite often, if a family was wiped out by disease etc a relative would take the survivor as their own.....in cases where a son died without children the inheritance would pass laterally to a brother etc....they are ALL related..

Others are UNRELATED adoptions....

But the issue was very common over the ages....and, I believe, is of the 2 differing reasons...one breaks bloodline, one does not...

At one time, the US Census had a "soundalike" algorithm. It's not perfect, but it could help.

Bryan John Maloney

You're talking about a Soundex or Metaphone searching system. Geni already has this feature for "Pro" users only. :(
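For anyone curious, here is a small Python sketch of American Soundex as it is commonly described (keep the first letter, code the remaining consonants as digits, pad to four characters). I have no idea how Geni's "Pro" search is actually implemented, so treat this purely as an illustration of the technique.

```python
# Soundex digit for each coded consonant; vowels, H, W and Y have no code.
CODES = {}
for letters, digit in (
    ("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
    ("L", "4"), ("MN", "5"), ("R", "6"),
):
    for ch in letters:
        CODES[ch] = digit


def soundex(name: str) -> str:
    """Return the 4-character Soundex code (letters A-Z only are used)."""
    letters = [c for c in name.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""
    result = letters[0]
    prev = CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "HW":
            continue  # H and W do not separate two equal codes
        code = CODES.get(ch, "")
        if code and code != prev:
            result += code
        prev = code  # a vowel resets prev, so a repeated code counts again
    return (result + "000")[:4]


for surname in ("Dunley", "Dunlay", "Dunlea", "Boudrot", "Boudreau"):
    print(surname, soundex(surname))
```

Dunley, Dunlay and Dunlea all come out as D540, so they would be grouped together. Boudrot and Boudreau come out as B363 and B360, which shows the "not perfect" part: a silent final letter is enough to split them.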

There are several next generation name matching products out there now that cover many languages and whose basic algorithms can handle the other languages. I bet that one of them would be interested in the cachet of being used in a social media genealogy site.

Suggestions for Projects.... you are able to tell us when a project we are following has a "new" item (profile, etc.) since the last time we as a user looked at it...is it possible to flag those new items with a star (*), the word "new", the item in red or in italics...something to quickly and easily identify what has been added?

Here is a perfect example of Faye's "why in the world did these get matched up" comments from January: http://www.geni.com/merge/compare/6000000011809477319?to=437789

Anita..I will assume you are talking about me...

The URL you posted looks at Helen Williams....that is the match that Geni found....it is as simple as that...

If accessible, you can search each profile individually for dates to see if a match (tho the case cited is so obviously wrong it should be rejected immediately)....I meant in cases where there were no dates in at least 1 profile family...and you still have to be cautious....

To be somewhat fair...Geni DOES say on that URL that it does NOT think it is a match...but there are way too many where Geni is silent and, as has been stated, generations may be involved....

Perhaps the enhancement on dates is supposed to aid in this area...tho you can't tell by me since I am one of the many who can NOT enter dates on ANY profiles....widespread problem.

PERHAPS..since 99% of profiles contain more than 1 name, Geni could sort potential matches utilizing the RANGE within the profile. (If parents were born in 1480's, children in 1520's, and grandchildren in 1550's.....then a profile, even with same names that spanned 1880-1940 would NOT be a match...)
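A rough Python sketch of that range idea follows. The names `family_span` and `spans_overlap` are made up for illustration, and the default `slack_years=5` simply borrows the 5 years either way of the circa key; none of this is Geni's actual logic.

```python
from typing import Iterable, Optional, Tuple

Span = Tuple[int, int]


def family_span(years: Iterable[Optional[int]]) -> Optional[Span]:
    """Earliest and latest known year among the people in a family bloc."""
    known = [y for y in years if y is not None]
    return (min(known), max(known)) if known else None


def spans_overlap(a: Optional[Span], b: Optional[Span],
                  slack_years: int = 5) -> bool:
    """False only when both spans are known and clearly do not overlap."""
    if a is None or b is None:
        return True  # no dates at all: cannot rule the match out
    return a[0] <= b[1] + slack_years and b[0] <= a[1] + slack_years


old_family = family_span([1480, 1485, None, 1522, 1555])
new_family = family_span([1880, None, 1940])
print(old_family, new_family, spans_overlap(old_family, new_family))
```

A profile with no date of its own would still inherit the span of its family, so same-name matches centuries apart could be dropped.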

Fay (no 'e')

My apologies Fay on both the mis-spelling of your name and for not making my comment clear enough. My comment was not directed AT you, but rather to enhance your comments on how Geni needs to update their matching criteria. The merge will be denied once I am aware someone from Geni has looked at it. My apologies again.

Anita M. Boynton that is seen as a potential match because the profile name matches (Janice = Janice and Williams = Williams) and because one parent name matches (Helen = Helen). There are no dates on your Janice's profile or immediate family to use for comparison.
