Forum OpenACS Development: Re: Why unique constraints should be treated with care

Posted by Don Baccus on
"This is why most sites today ask for a username and not an e-mail because other systems already detected the fact that e-mails are not unique."

This is an unfounded statement from authority, Malte. Hand-waving just does little to convince me.

"Therefore I'd encourage developers to be *really* careful with unique constraints and make a check if this unique constraint is reflected in the real world. Usually it isn't"

This is just as bad. Nothing but hand-waving.

Now ... you still haven't answered one of my original questions. If two people have the same e-mail and the system's set up to allow login by e-mail, which person do you log in if the e-mail's not unique?

Flip a coin?

The spam issue mentioned by Dave is one I thought of while driving to the Bay Area today.

If we remove the constraint and things break, who will fix it? You? History doesn't back up this presumption.

Sorry if cleaning up the Oracle messes on the floor that delayed 5.2 so long has left a sour taste in my mouth, but why shouldn't it?

Disclaimer: This discussion has been going on way too long, and it has already turned nasty to the point where I gave up discussing it any further. But as Don has commented dismissively on my beliefs and work in public, I feel personally offended and want to rectify this. I just hope that this does not result in a flame war, as past experience has shown me that this kind of thing is bad for the community, especially when it happens between the founder and one of the most active members.

Now let's get started 😊

Let's put it this way. I have three clients, each with user data for around 20,000 customers and 40,000 contact people. Among these I find around 1,000 contacts who share an e-mail address with another contact. These are distinct people (read: parties).

The unique constraint on e-mails is on *parties*, a table which, if I'm not mistaken, has *nothing* to do with users, because groups and organizations can be parties as well. So, to answer your question: if two *PARTIES* have the same e-mail, I use *username* in my system to log them in. Which is my whole point: do not use email for the login of *users*, use username. Username is part of the users table, email is part of the parties table. There are more parties than users in the system, so in my opinion it does not make sense to restrict all parties just because you need uniqueness for login, which you only give to real persons (aka users).
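A minimal sketch of this distinction, using Python's sqlite3 module and a deliberately simplified, hypothetical schema (the real OpenACS parties and users tables have more columns; the function names and sample data are made up for illustration). It only shows that a lookup by a shared e-mail is ambiguous, while a lookup by unique username is not:

```python
import sqlite3

# Hypothetical, simplified schema. This is NOT the real OpenACS data model.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE parties (
        party_id INTEGER PRIMARY KEY,
        email    TEXT                     -- deliberately not unique
    );
    CREATE TABLE users (
        user_id  INTEGER PRIMARY KEY REFERENCES parties(party_id),
        username TEXT NOT NULL UNIQUE     -- the login key
    );
""")

# Three parties share one address; only two of them are users.
conn.executemany(
    "INSERT INTO parties (party_id, email) VALUES (?, ?)",
    [(1, "family@example.com"), (2, "family@example.com"), (3, "family@example.com")],
)
conn.executemany(
    "INSERT INTO users (user_id, username) VALUES (?, ?)",
    [(1, "anna"), (2, "ben")],
)
conn.commit()


def parties_by_email(email):
    """Return every party_id registered with this e-mail (may be several)."""
    rows = conn.execute(
        "SELECT party_id FROM parties WHERE email = ? ORDER BY party_id", (email,)
    )
    return [row[0] for row in rows]


def user_by_username(username):
    """Return the single user_id for this username, or None if unknown."""
    row = conn.execute(
        "SELECT user_id FROM users WHERE username = ?", (username,)
    ).fetchone()
    return row[0] if row else None


def add_user(user_id, username):
    """Try to register a party as a user; False if the username is taken."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO users (user_id, username) VALUES (?, ?)",
                (user_id, username),
            )
        return True
    except sqlite3.IntegrityError:
        return False
```

With this layout the "flip a coin" case only arises when login goes through parties_by_email; user_by_username can never return more than one match.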

As for the Oracle delay, would you mind stating publicly which of my CVS commits you are referring to with regard to fixing Oracle? You keep saying that I'm responsible for delaying the 5.2 release, when I thought we had established that the code change, though made after the code freeze and therefore a mistake on my part for which I have already apologized, did not cause any delays through broken code.

Allowing that my memory may be faulty on the previous point, I just wanted to point out that the reason for this exchange is a three-line code change, which is said to be capable of the dire consequence of breaking code that I won't be able to fix.

http://xarg.net/tools/cvs/change-set-details?key=22705

So, maybe I am exaggerating a little bit (which is different from hand-waving, unless my command of the English language is not good enough). But I stand by my point that we have more unique constraints than we need.

E-mail is not unique. There are families who share an e-mail address, there are employees who share one, there are whole villages that share one. Therefore having a unique constraint on e-mails, while the real world proves that several parties (organizations and persons) can use the same e-mail, is a bad idea. If you force users (note the difference between person and user) to use their e-mail address for login, obviously the unique constraint on e-mail makes sense. I'd even go so far as to agree that the email of a user (a person who has a password to log in to the community) should be unique in the system, to avoid the spam problem, though I still wouldn't enforce it (why else do we have the username?).

Oh, and while Pandora's box is already open: who says that users in the community system have to have an e-mail? If I want to participate in the forums and make assessments and have my dotlrn portal, I do not need an e-mail address. I need a login, and the email makes it easier to communicate with me, but why should I be forced to provide the email or, for that matter, why should any user of OpenACS force his users to provide e-mails? (This refers to "not null" constraints.)
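The "not null" question can be looked at separately from uniqueness. A small sketch, assuming a hypothetical members table (not an OpenACS table) in SQLite: a column declared UNIQUE but nullable accepts any number of rows without a value, so making the e-mail optional does not by itself conflict with keeping it unique where one is given. Other databases may treat NULLs in unique constraints differently, so this would need checking on each supported backend.

```python
import sqlite3

# Hypothetical table for illustration only; not part of OpenACS.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE members (
        member_id INTEGER PRIMARY KEY,
        username  TEXT NOT NULL UNIQUE,   -- login is mandatory and unique
        email     TEXT UNIQUE             -- optional: no NOT NULL here
    )
""")


def try_insert(username, email):
    """Insert a member; return False if a constraint rejects the row."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO members (username, email) VALUES (?, ?)",
                (username, email),
            )
        return True
    except sqlite3.IntegrityError:
        return False
```

Here try_insert("anna", None) and try_insert("ben", None) are both accepted, whereas a second member with an already used address is rejected.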

Another unique constraint which we have been banging our heads against quite often in recent months is the one on cr_item name with parent_id, though I still agree it is a good idea.

Here is the scenario, though:

- Original file is uploaded
- A user uploads a first translation with the same filename (e.g. French)
- A second user uploads a second translation with the same filename (e.g. Spanish)

As we do not have a unique constraint on cr_item name, locale and parent_id, we are forced to either

- Use separate folders for the files (one per language)
- Rename the file
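The upload scenario can be reproduced with a small sketch. It assumes a hypothetical, stripped-down items table in SQLite standing in for cr_items (the column set, the sample filename and the upload_all helper are illustrative, not the real content repository API) and compares the two candidate unique constraints:

```python
import sqlite3


def upload_all(unique_columns):
    """Create a toy items table with the given UNIQUE columns, attempt the
    three uploads from the scenario, and return the locales that were accepted.

    unique_columns is interpolated into the DDL, so only pass fixed strings.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(f"""
        CREATE TABLE items (
            item_id   INTEGER PRIMARY KEY,
            parent_id INTEGER NOT NULL,
            name      TEXT NOT NULL,
            locale    TEXT NOT NULL,
            UNIQUE ({unique_columns})
        )
    """)
    # Original file plus two translations, same folder and same filename.
    uploads = [
        (10, "manual.pdf", "en"),
        (10, "manual.pdf", "fr"),
        (10, "manual.pdf", "es"),
    ]
    accepted = []
    for parent_id, name, locale in uploads:
        try:
            with conn:  # commits on success, rolls back on error
                conn.execute(
                    "INSERT INTO items (parent_id, name, locale) VALUES (?, ?, ?)",
                    (parent_id, name, locale),
                )
            accepted.append(locale)
        except sqlite3.IntegrityError:
            pass  # rejected by the unique constraint
    conn.close()
    return accepted
```

Calling upload_all("parent_id, name") accepts only the original, which is what forces the separate folders or renamed files; upload_all("parent_id, name, locale") accepts all three.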

There have been some other issues (for example multiple revisions of a file, where you need to know which revision followed which earlier revision, resulting in us not using revisions in the first place but different folders).

Note: I'm not accusing anyone of making a wrong decision (even if I say that something has been a bad idea). I just note that the realities of day-to-day development driven by client needs have shown that some assumptions would better not have been made.

So my warning / plea is simply this: if you add a unique or not null constraint to the data model, make sure that your assumption is correct and holds in the real world. Changing the data model and the application once you have made such an assumption is painful, results in discussions like this one, and most likely won't happen in the first place or won't be committed back, resulting in forks all over the place.