New Locomotive Database
Posted: Fri Nov 23, 2007 6:27 pm
This project is making quiet progress and I have some initial specs written out. I may be able to start development in December.
The plan is to have an online database of all LNER locomotives. The more I thought about this, I decided it should be very flexible. We need to store things like numbers, shed allocations, classes, names, rebuilds, etc. However there are few consistencies between classes as to what we may want to store. Therefore I have chosen a database of "tags". A tag will have a date and a small amount of information - eg. a number change, a livery change, an allocation. There will also be a locomotive database which has some very basic information about each engine. As numbers are not consistent and do change, internal identifiers will mark which locomotive a tag refers to. These identifiers would be invisible to a user, and only visible to editors for debugging purposes.
Yes I have chosen a relational database model. I considered an RDF type scheme - this is surprisingly close to my tag concept, but I decided RDF could not hold the data efficiently. PHP will be used for the web interface.
There is a huge amount of data to enter, and I know there have been some volunteers already. More will be welcome You need good reliable sources of data. Data entry reliability and spelling are important. Editors will be able to add and edit entries in the tags database. I'll also add a closed board on these forums for the editors.
My database design has been chosen to be very flexible. In reality only a small number of queries will be coded as web pages. Eg. "query all locomotives that fit these criteria", "all information about this locomotive", "all locomotives at this shed on this date". On a regular basis (probably weekly), I will create an automatic process that dumps selections of the database data into static pages that will be linked from the locomotive pages. These will replace the current lists of named locomotives.
Current Database Design
I currently have four tables planned. Two are 'auxiliary' tables. More auxiliary tables can be added as needed.
Class
This lists all legitimate class names for the database. It will also list the URL to the relevant page for this class. This will allow all class names in the web front end to be consistent. Eg. avoid typos and formatting differences.
Shed
This would list each shed code, shed, and start/end dates. I'm not too well versed with shed codes - is there anything else I need?
Locomotives
This will list the Build Date, Works No., Manufacturer (Works or Company), and internal identifier. Note we do not need things like name, number, withdrawal date, etc. because these will all be in the tags database.
Tags
Okay I probably need a new name for this database
This is where the real information is stored.
Each tag will store: Internal Locomotive Identifier, Date, Tag Type, Event/Observation, Information, and Comments.
All tags are dated, but the Event/Observation field will determine if the tag describes an observation or an event. Eg. a shed allocation could be an observation (X was at Y in 1948), or an event (X was allocated to Z in 1949).
The Information field would depend on the tag type - eg. it could hold a shed code, a new class, a name, etc.
As a future extension, it could include the local filename of a photograph or drawing. Ie. the database could be easily expanded to include photos.
The Tag Type determines what this tag refers to. This will be enumerated but the enumerations will be hidden from the user (basically there will be a list of possible presets). My current list is as follows:
Class
Livery
Name
Number
Owner
Shed Allocation
Tender No.
Boiler No.
Tender Type
Boiler Type
Repair
Rebuild
Withdrawn
Scrapped
More tag types can be added as needed.
Thoughts, comments? I might have been a bit technical but I have high hopes. I have chosen what I believe to be a powerful underlying data structure.
Richard
The plan is to have an online database of all LNER locomotives. The more I thought about this, I decided it should be very flexible. We need to store things like numbers, shed allocations, classes, names, rebuilds, etc. However there are few consistencies between classes as to what we may want to store. Therefore I have chosen a database of "tags". A tag will have a date and a small amount of information - eg. a number change, a livery change, an allocation. There will also be a locomotive database which has some very basic information about each engine. As numbers are not consistent and do change, internal identifiers will mark which locomotive a tag refers to. These identifiers would be invisible to a user, and only visible to editors for debugging purposes.
Yes I have chosen a relational database model. I considered an RDF type scheme - this is surprisingly close to my tag concept, but I decided RDF could not hold the data efficiently. PHP will be used for the web interface.
There is a huge amount of data to enter, and I know there have been some volunteers already. More will be welcome You need good reliable sources of data. Data entry reliability and spelling are important. Editors will be able to add and edit entries in the tags database. I'll also add a closed board on these forums for the editors.
My database design has been chosen to be very flexible. In reality only a small number of queries will be coded as web pages. Eg. "query all locomotives that fit these criteria", "all information about this locomotive", "all locomotives at this shed on this date". On a regular basis (probably weekly), I will create an automatic process that dumps selections of the database data into static pages that will be linked from the locomotive pages. These will replace the current lists of named locomotives.
Current Database Design
I currently have four tables planned. Two are 'auxiliary' tables. More auxiliary tables can be added as needed.
Class
This lists all legitimate class names for the database. It will also list the URL to the relevant page for this class. This will allow all class names in the web front end to be consistent. Eg. avoid typos and formatting differences.
Shed
This would list each shed code, shed, and start/end dates. I'm not too well versed with shed codes - is there anything else I need?
Locomotives
This will list the Build Date, Works No., Manufacturer (Works or Company), and internal identifier. Note we do not need things like name, number, withdrawal date, etc. because these will all be in the tags database.
Tags
Okay I probably need a new name for this database
This is where the real information is stored.
Each tag will store: Internal Locomotive Identifier, Date, Tag Type, Event/Observation, Information, and Comments.
All tags are dated, but the Event/Observation field will determine if the tag describes an observation or an event. Eg. a shed allocation could be an observation (X was at Y in 1948), or an event (X was allocated to Z in 1949).
The Information field would depend on the tag type - eg. it could hold a shed code, a new class, a name, etc.
As a future extension, it could include the local filename of a photograph or drawing. Ie. the database could be easily expanded to include photos.
The Tag Type determines what this tag refers to. This will be enumerated but the enumerations will be hidden from the user (basically there will be a list of possible presets). My current list is as follows:
Class
Livery
Name
Number
Owner
Shed Allocation
Tender No.
Boiler No.
Tender Type
Boiler Type
Repair
Rebuild
Withdrawn
Scrapped
More tag types can be added as needed.
Thoughts, comments? I might have been a bit technical but I have high hopes. I have chosen what I believe to be a powerful underlying data structure.
Richard