Take the 2-minute tour ×
Code Review Stack Exchange is a question and answer site for peer programmer code reviews. It's 100% free, no registration required.

I've been thinking a bit more on my database design sense this post. The general idea is I'de like to create a generic database that can store information on any number of subject, similar to a wiki, but have more structured data available to help. The idea is that the linking between entries will be more defined and allow more more types of searches.

Library                             -- Primary library entry data
    iD              VARBINARY(16)   -- UUID & PK
    ownerID         VARBINARY(16)   -- Users UUID & FK
    name            VARCHAR         -- Name for the entry
    dataType        VARCHAR         -- Mime type of data
    data            LONGBLOB        -- Data a for the entry
    dirty           BIT             -- Was this entry locally modified
    timestamp       DATETIME        -- Last time the record was updated

Sections                            -- Additional sections for a library entry
    iD              VARBINARY(16)   -- UUID & PK
    libraryID       VARBINARY(16)   -- Library UUID & FK
    name            VARCHAR         -- Name for the section
    order           int             -- The ordering of this section
    dataType        VARCHAR         -- Mime type of data
    data            LONGBLOB        -- Data for the section
    dirty           BIT             -- Was this section locally modified
    timestamp       DATETIME        -- Last time the record was updated

Attributes                          -- Attributes for a library entry
    iD              VARBINARY(16)   -- UUID & PK  (Potentially could be removed)
    libraryID       VARBINARY(16)   -- Library UUID & FK
    name            VARCHAR         -- Name of attribute
    dataType        INT             -- The type of data the attribute holds
    reference       VARBINARY(16)   -- Value of attribute that is Library UUID & FK
    data            VARCHAR         -- Value of attribute that is any other datatype
    dirty           BIT             -- Was this attribute locally modified
    timestamp       DATETIME        -- Last time the record was updated

Users                               -- User accounts
    iD              VARBINARY(16)   -- UUID & PK
    libraryID       VARBINARY(16)   -- Library UUID & FK (The library entry for this person)
    email           VARCHAR         -- Email address
    nickname        VARCHAR         -- Nickname used for display
    firstname       VARCHAR         -- Real first name
    lastname        VARCHAR         -- Real last name
    joinDate        DATETIME        -- Date the account was created

OpenID                              -- Mapping for OpenID authentication
    openID_URI      VARCHAR         -- PK
    userID          VARBINARY(16)   -- Users UUID & FK

I've added a Sections table to hold individual sections of an entry. Sections should be used to break down an entry into logical parts when it makes sense to do so. This would allow for automatic generation of table of contents, possibility to show/hide individual sections, the ability to update a section without having to update the whole entry.

I've also added a Uses table and an ownerID to each entry. The thought is that the own would have the ability to approve edits made to the entry. The owner would/should be an expert in the subject area of the entry. I'm contemplating a social aspect with reputation similar to stackexchange to limit/allow the user to edit entries that aren't their own. I'm still thinking this through so any thought on the matter would be nice.

Standard attributes that apply to all Library entries are below. Besides these two each entry could have any number to attributes that only apply to it. I'm not sure how to store the information yet but I would like to be able to define standard attributes for given types of entries. For example all entries on people should have a Birthdate attribute.

Attribute Name | Type | Description
---------------+------+---------------
TypeOf         | ref  | The Library entry UUID that this attributes library entry is a type of.
               |      | For example if this attribute's library entry is for the element
               |      | carbon then the TypeOf attribute would point to the library entry for
               |      | elements. There can only be one TypeOf attribute.
---------------+------+---------------
Tag            | ref  | The Library entry UUID that this attributes library entry relates to in
               |      | some way. These are meant to work in the same way tags in a tag cloud work.
               |      | There can be multiple Tag attributes.
---------------+------+---------------
  1. General design thoughts? I'm trying to design a database that can store information on anything and allow for meaningful and useful searches. The idea is this could old an encyclopedia or two worth of information allowing for the data to be structured in such a way as to allow easy navigation from like/similar/related item to item.

  2. How should revisions be handled?

    • Drop all history, only modify the current entry
    • Each revision is in effect a new entry (duplicating all other sections/attributes needed for the entry)
    • Create a History table to keep previous version of the data without the need to create a new library entry for each revision
    • Something else
  3. Given the above design what issues may arise during syncing 2 separate databases?

    • Entry Exists -> Entry Doesn't Exist
    • Updated Entry Exists -> Entry Exists
    • Updated Entry Exists <-> Updated Entry Exists
    • Are this security issues with syncing the Users and or OpenID table
  4. In the above design how would you handle merging 2 entries while keeping both existing entries intact? What issues may arise with merging entries?

    • Create a pointer from one to the other
    • Create a 3rd new merged entry
    • Something else
share|improve this question
 
This question appears to be off-topic because there is no actual code to review. This may be best for Database Administrators SE. –  Jamal Dec 16 '13 at 19:33
add comment

closed as off-topic by Jamal Dec 16 '13 at 19:33

  • This question does not appear to be a code review request within the scope defined in the help center.
If this question can be reworded to fit the rules in the help center, please edit the question.

1 Answer

up vote 1 down vote accepted

1) Design issues, as I see them (beyond the fact that I dislike multi-domain tables).

  • Naming the top-level table Library makes it sound like these are disassociated sets of entries, but they're not. Rename it as Entry.
  • The datatype columns should not be VARCHAR - make them an INT FK reference to a table which has supported mime-types in it (VARCHAR there is fine). Actually, you can make them a reference to other Entrys, which is probably what you should do.
  • name should maybe be a referenced attribute of Entry (see below). I'm debatin' that one.
  • What is the point of the dirty reference? How are you able to detect it was 'modified locally'?
  • Don't name columns after the data type. timestamp ("It's a timestamp, so what?") should be renamed to last_update or inserted_at or similar. data should probably be content.
  • If this is supposed to be similar to wikipedia, where anybody can edit it, what's the point of the ownerId column? Change this to last_updated_by. Add a new table, Edit_Approvers or similar, with references to Entry_Id and User_Id. Or, add it as a possible attribute (see below).
  • Sections as a table is problematic. Especially as I can easily imagine times when I will want sub-sections. Remove this table, and add parentEntryId and order to the Entry table. You don't list the target RDBMS, but most modern ones will support recursive queries that will allow you to assemble this properly.
  • Having libraryId in Attributes is problematic, remove it. In fact, other than the reference column (which I'm not too clear on - are you re-referencing libraryId?), you basically have the Entry table all over again. If possible, remove the Attributes table, add the rows to the Entry table, and create the cross-reference table Entry_Attributes; this will allow for better normalization, among other things.
  • Users is basically a specific application of the Entry table, with sub-sections of typed entries for Email, Name, and so on. This also applies to OpenId (and why is this a seperate table - were you planning on allowing users to register multiple OpenIds?).

2) How should revisions be handled? I don't know, how do you want to handle it? That's going to determine what further (if any) design changes are needed.

3) Syncing 2 (or more) seperate databases; you haven't specified what you want to happen. Usually, though, it's 'take the most recent entry/revision' (which you should be able to use last_update to determine). And what, specifically, are you concerned about for security? A number of issues will be related to application-level security, anyways.

4) Merging entries will be related to the answers to 2. What do you want to happen?

share|improve this answer
add comment

Not the answer you're looking for? Browse other questions tagged or ask your own question.