Database Design

We have the following items we want to look at:

As we were a geeky confernece, we also had items such as:

Some of these items are data we refer to to ask questions, and some are data we are saving for each person.

From a security point of view, our permissions were quiet strict; since we are adding data, we dont want there to be any risk of corrupting or misusing data. Our queries our registration system use are only INSERT and SELECT; we do not UPDATE or DELETE anything.

Data we refer to

Lets look at an example: registration type. Things we want to know: the "name" of the registration type", the cost of the registration type (in dollars or cents? Pounds or pence? Integer or float?), the row ID for this registration type, the sort order for this type, and if this type is still available.

In SQL code (for MySQL):

create table Registration_Type(ID smallint primary key auto_increment, Name varchar(100), Cost float(10,2), Sort_Order smallint, Available boolean)

We code around this in Perl to make an object (OO-Perl).

package Registration_Type;
# This is in a file called Registration_Type.pm

sub new {
# Standard Perl OO creator, returning...
	return $self;
}

sub dbh {
# A method where we will store a reference to our database handler (DBI).
# We put it here so the rest of the object should only refer to 
#  $self-%gt;dbh->prepare, etc.
	return $self->{'dbh'};
}

sub name {
# This will return the name of the registration type. 
	return $self->{'name'};
}

#
# Etc... for cost, sort_order, available...
#

sub load {
# Given a ROW ID, load this into the variables...
	my $self = shift;
	my $id = shift;
	die "Not a ROW ID" unless $id =~ /^\d+$/;
	die "No DBH" unless defined $self->dbh;
	my $sql = "SELECT * FROM Registration_Type WHERE ID = ?";
	my $sth = $self->dbh->prepare($sql);
	my $result = $sth->execute($id);
	die "Couldn't find row ID $id" if ($result != 1);
	my $ref = $sth->fetchrow_hashref;
# Now set each variable
	$self->name($ref->{'Name'};
	# etc...
	return 1;
}

sub all {
# Given a database handle, return an array with all items
	my $self = shift;
	my %args = @_;
	my $dbh = $self->dbh || $args{'-dbh'};
	die unless defined $dbh;
	my $sql = "SELECT ID FROM Registration_Type";
	my $sth = $dbh->prepare($sql);
	my $res = $sth->execute;
	die "None found" unless $res > 0;

# Now make an array and fill it with each line
	my @array;
	while (my $ref = $sth->fetchrow_hashref) {
		push @array, Registration_Type->new(-dbh => $args{'-dbh'}, -id => $ref->{'ID'});
	}
	return @array;
}

# Must return 1/true at the end of the package.
1;

We can now use this object in the rest of our code...

use DBI;
my $dbi = DBI::connect();
use Registration_Type;
my @types = Registration_Type->all(-dbh => $dbh);
foreach (sort {$a->sort_order <> $b->sort_order} @types) {
	# Skip this one if it is not available at the moment.
	next unless $_->available;
	printf "Type: %s, Cost: %s\n", $_->name, $_->cost;
}

We note that we should be able to drop out any registration type at any time (eg, we filled up our professional registrations but could still accept hobbiests and concessions), but that we still want the object to exist in the database since we still need to be able to charge for it.

Data we save from the user

Saving data is done carefully; we ask the user twice so they can confirm they have entered everything as they wanted. The difference between seeing the data in an input field and seeing it as plain text (uneditable) can make people think twice, so they need to then edit.

The Person Table

Our data for each person is in one table. It will require some explanation:

CREATE TABLE Person (ID integer primary key auto_increment, Name varchar(200) not null, Class varchar(50), Email_Address varchar(150) not null, KEY index_Email_Address (Email_Address(10)), Company_Name varchar(255), Telephone varchar(100), Vegitarian tinyint, Shirt_Size varchar(4), Registration_Type_ID tinyint not null, Formal_Dinner tinyint, Preferred_Shell_ID tinyint, Preferred_Distribution_ID tinyint, Preferred_Editor_ID tinyint, Date_Entered datetime not null, Address varchar(200), PostCode varchar(20), State varchar(50), Country varchar(50), Experience tinyint, Host varchar(100) not null, Special_Requirements varchar(255), Accomodation_Start date, Accomodation_End date, lca2002 tinyint, lca2001 tinyint, lca1999 tinyint, CustNum varchar(255) not null, KEY index_CustNum (CustNum(10)));

Now to analyse each section.

CREATE TABLE Person

We tell MySQL that this new table is to be called 'Person'. Case sensative.

(ID integer primary key auto_increment,

Starting with the opening bracket for the creation definition, we define the row ID for each record (each row). This is the primary key, is an integer that automatically increases with each row. We chose an integer because we feel that this will cover the range of people who may use it; we can use a bigger value if necessary. but its not really worth using anything smaller (the saving wont be that great if there is only a small number of people).

Name varchar(200) not null,

We specify the name to be a variable number of alphanumeric characters; we could revise this to be of type 'text'. Every person entry must always have a value; although our (Perl/PHP/CGI) code will do validation, we also put this constraint here in the underlying SQL.

Class varchar(50),

Just for fun, we gave each person a character class, a small string that indicated their task/state, eg "You are a happy perl-monk".

Email_Address varchar(150) not null,

Again, each person (row) must have an email address.

KEY index_Email_Address (Email_Address(10)),

We define a KEY based on the Email address field, mainly because we want to be able to find rows based upon the email address. We define the key to be only the first 10 characters of the email address; this is probably still too many characters; amongst all the email addresses of the world, you could probably index just the first 4 characters and then linearly scan the resulting set.

For example, say we have a normal distrubtion (by characters of the latin alphabet) of 100,000 email addresses. If we were to index on one character, then 1/26 of them would be left when we did a linear scan. By indexing 3 characters, thats 26^-3 of the 100,000 addresses, or... 5.6 items left to linearly scan!

Company_Name varchar(255),

For our professional registratnts, we record their company.

Telephone varchar(100),

Phone numbers aren't always given as just a number, but sometimes have spaces, brackets, dashes. Not everyone will give a phone number, so we accept NULL.

Vegitarian tinyint

Could just be a boolean...

Shirt_Size varchar(4),

We had this as a short string, but could easly be split off to another table (Shirt_Sizes) and replace this column with a reference to a row ID in the other table (eg, call this Short_Size_ID, and make it a tinyint).

Registration_Type_ID tinyint not null,

A reference to the Registration_Types table.

Formal_Dinner tinyint,

We had this as a boolean since people could bring partners, and thus each person may 'order' additional meals at our conference dinner. Some people didn't come to it either... thus a number 0..X.

Preferred_Shell_ID tinyint,

This is a reference to the small lookup table of unix(tm) shells.

Preferred_Distribution_ID tinyint,

... and a lookup to the small distribution list....

Preferred_Editor_ID tinyint,

... and the small editor list lookup table.

Date_Entered datetime not null,

We want to be able to audit everything, so this has a datetime field. It could also be a timestamp...

Address varchar(200),

I used a varchar, but text to handle unicode may be a better choice.

PostCode varchar(20),

Remember, postcodes (zip codes) arent just numeric. They have spaces, sometimes letters, or sometimes arent used!

State varchar(50),

This is sometimes also a county, and can be NULL (eg, London is not in any country, but just London).

Country varchar(50),

We could make this a lookup list, but it would be aroun 200+ entries long, so I made this freeform.

Experience tinyint,

We let people select how advanced they thought they were, the idea being that we would let speakers inticate the technical level of their presentation, and we could schedule the right sized venue for each talk.

Host varchar(100) not null,

We're dealing with the net here, so we want to record the IP address this was submitted from.

Special_Requirements varchar(255),

Vegitarians, vegans, diabled access, foot massages; whatever the person wants to put in! We also used this freeform field for our speakers, sponsors, and special guests to register their access codes to get discounts on their attendance.

Accomodation_Start date, Accomodation_End date,

We tied in very loosely with the on-campus accomodation; we wanted to be able to issue the accomodation place with the names and addresses of those who wanted to stay there; we used this to get the dates and we (as a script on the web server) shot off a form email every time someone registered with the required dates. They weren't set in stone, but a starting indication.

lca2002 tinyint, lca2001 tinyint, lca1999 tinyint,

A quick survey if people had been before. Coul;d have been booleans (yes, no), but more importantly we wanted tri state: yes/no/didn't answer the question!

CustNum varchar(255) not null, KEY index_CustNum (CustNum(10)));

This is a primary part of the person record, and could have been the primary key.

We wanted people to have an identifier that was not uniform or guessable that they could use toi refer to their entry. It had to be independant of their data at any time (eg, we need to be able to update all other fields, so this needs to be a separate record). We settled on an md5sum of the time they registered and their email address they initally registered with.

We use this so people can return to the web site to pay (yes, you can register details and not pay here... but then you get the begging messages). When you return to the site to pay, you need your customer number. So, by having a big, seemingly random string as your customer number, it is very difficult for someone else to be able to pay for your acount unless you give them this number. If it was just a digit (1..100), then someone could hit the site and iterate through each of these.

Person Summary

This central table is the key to everything, and all other tables can be connected to this one by the Customer Number (except for some of the CommSecure ones; see later).

Registration Code Index


James Bromberger (james_AT_rcpt.to)
hosted by Pelican Manufacturing.