si-blog requirements

Posted Jun 08, 2004 in si-blog.


At the heart of the new si-blog will be a greatly enhanced database. An early glimpse of the tables was met with murmurs of approval, and there hasn't been much change since then. Fields have been added to the cats and entries tables for nice URLs. The banned table has had more fields added. Other than that, the tables are the same.

Catching referrers

I like to maintain a list of the most recent referrers (if available), so I have a little script that updates a table if referral data is available. The script currently filters out internal links, empty fields, and a growing list of banned keywords (mostly porn, but also some search engines and mail-related stuff). At the moment, the referral data is filtered by looping through an array of banned keywords, but in the new system I am hoping to automate this by placing the banned data in its own table.
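The filtering described above can be sketched roughly as follows. This is only an illustration in Python, not the actual script: `SITE_HOST`, `BANNED_KEYWORDS`, and `should_record` are made-up names, and the keyword list here is hard-coded where the new system would read it from a banned table.

```python
from urllib.parse import urlparse

# Hypothetical values for illustration only; the real keyword list
# would be loaded from the banned table.
SITE_HOST = "example.com"
BANNED_KEYWORDS = ["casino", "webmail", "search?"]


def should_record(referrer, site_host=SITE_HOST, banned=BANNED_KEYWORDS):
    """Return True if the referrer is worth storing in the referrers table."""
    if not referrer or not referrer.strip():
        return False  # empty field: no referral data was sent
    host = urlparse(referrer).hostname or ""
    if host == site_host or host.endswith("." + site_host):
        return False  # internal link
    lowered = referrer.lower()
    # Loop through the banned keywords and reject on the first match.
    return not any(word in lowered for word in banned)
```

Only referrers that pass all three checks would go on to update the table.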

Extending this concept, I will also keep a growing list of banned IPs, URLs, user agents, and email addresses. Some of these will be used for helping to administer comments.


By using a join table, entries will now be able to exist in multiple categories. The categories table will have three jobs, as it will also be used to categorize images (which also have their own join table) and links. I am not planning to sub-divide categories, however.
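As a rough sketch of the join-table idea, the following uses Python's built-in `sqlite3` module with an in-memory database. The `entries` and `cats` table names come from the description above; the `entries_cats` join table, its columns, and the sample rows are assumptions made for the example, and the real schema may differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE entries (id INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE cats (id INTEGER PRIMARY KEY, name TEXT);
-- Hypothetical join table: one row per (entry, category) pairing.
CREATE TABLE entries_cats (
    entry_id INTEGER REFERENCES entries(id),
    cat_id INTEGER REFERENCES cats(id),
    PRIMARY KEY (entry_id, cat_id)
);
INSERT INTO entries VALUES (1, 'Nice URLs');
INSERT INTO cats VALUES (1, 'PHP'), (2, 'Design');
INSERT INTO entries_cats VALUES (1, 1), (1, 2);
""")


def categories_for(entry_id):
    """Return the names of every category an entry belongs to."""
    rows = conn.execute(
        "SELECT c.name FROM cats c "
        "JOIN entries_cats ec ON ec.cat_id = c.id "
        "WHERE ec.entry_id = ? ORDER BY c.name",
        (entry_id,),
    )
    return [name for (name,) in rows]
```

Because the pairing lives in its own table, putting an entry in another category is just one more row in `entries_cats`. Images could get a similar join table of their own against the same `cats` table.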


I will be adopting the archive/yyyy/mm/dd/nice-title/ URL scheme for entries. Categories will also get a nice URL. Naturally, entering archive/yyyy/mm/dd/ will show the day's posts, archive/yyyy/mm/ will show the month's, and so on. I will probably provide several ways of viewing posts and comments, adding more as time goes by.
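To make the scheme concrete, here is a small Python sketch that builds an entry URL and works out which view a given path is asking for. The function names and the rule for turning a title into a "nice title" are assumptions for illustration, not a settled design.

```python
import re
from datetime import date


def slugify(title):
    """Lowercase the title and join its alphanumeric runs with hyphens."""
    return "-".join(re.findall(r"[a-z0-9]+", title.lower()))


def entry_url(posted, title):
    """Build archive/yyyy/mm/dd/nice-title/ from a date and a title."""
    return f"archive/{posted:%Y/%m/%d}/{slugify(title)}/"


def parse_archive_path(path):
    """Return the year, month, day, and slug named by a path, or None.

    Parts missing from the path come back as None, so the caller can
    decide between a year, month, day, or single-entry view.
    """
    m = re.fullmatch(
        r"archive/(?:(\d{4})/(?:(\d{2})/(?:(\d{2})/(?:([a-z0-9-]+)/)?)?)?)?",
        path,
    )
    if m is None:
        return None
    year, month, day, slug = m.groups()

    def to_int(s):
        return int(s) if s is not None else None

    return {"year": to_int(year), "month": to_int(month),
            "day": to_int(day), "slug": slug}
```

For example, `parse_archive_path("archive/2004/06/")` reports a year and a month but no day or slug, which would correspond to the month view.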

I am going to start maintaining a database of categorized links, most of which will have XFN relationship data included. I'll draw my out there list from the same table, and simply distinguish them from the others by giving them a special category.


This is a toughy. I have agonized over this for days, and I still haven't made a decision. There are two groups of si-blog user:

  1. Geeky types, perfectly capable of writing well-formed XML.
  2. Non-geeky types, who are incapable of even providing proper punctuation, let alone properly nested tags.

I have therefore considered providing an option to write well-formed XML, as described in this previous entry. I expect the agonizing to continue for a while.

There will be some form of administration going on with comments. IPs that are recognized as being banned will not be presented with the option to comment, although nobody will be denied the option to read comments.

File structure

Another toughy. The current file structure is a joke, but I recognize that the current permalinks must remain. I am not quite sure how to deal with that yet. I may keep the old si-blog permalinks active, or resort to some serious use of mod_rewrite. I'd like to incorporate the old data into the new system, so the latter may be a better solution.
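One way to keep the old permalinks alive is to answer each with a permanent redirect to its new address. The Python sketch below shows the idea as a lookup in application code, which is an alternative to writing mod_rewrite rules, not an example of them. The old path format, the `OLD_TO_NEW` mapping, and the leading /archive/ prefix are all invented for the example; in practice the mapping would come from the database.

```python
# Hypothetical mapping from an old permalink to the date and nice title
# of the same entry in the new system.
OLD_TO_NEW = {
    "/old/entry-42": ("2004", "06", "08", "si-blog-requirements"),
}


def redirect_for(old_path):
    """Return (status, location) for an old permalink, or (404, None)."""
    target = OLD_TO_NEW.get(old_path)
    if target is None:
        return 404, None
    year, month, day, slug = target
    # 301 tells browsers and search engines that the move is permanent.
    return 301, f"/archive/{year}/{month}/{day}/{slug}/"
```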

Other areas of the site will eventually be amalgamated with the si-blog, and I'll introduce some sort of image gallery. The new structure will be able to easily handle these. Many static pages will be moved.

Search options

To cope with the upheaval resulting from moving pages, dying URLs, and new information, I am planning to create a sophisticated search tool. This will have two interfaces (one will offer advanced options, the other will not) that will allow users to search within most of the text fields. I am hoping to provide grouped results, but that functionality may follow later on.
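A minimal sketch of the two-interface idea, using Python's built-in `sqlite3` module: the simple interface searches every searchable field, while the advanced one passes in a narrower choice. The `entries` columns, the `SEARCHABLE` tuple, and the sample rows are assumptions for the example, and plain substring matching stands in for whatever search method is eventually chosen.

```python
import sqlite3

SEARCHABLE = ("title", "body")  # hypothetical text fields
LIKE_CLAUSE = "{} LIKE ? ESCAPE '\\'"

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entries (id INTEGER PRIMARY KEY, title TEXT, body TEXT)")
conn.executemany("INSERT INTO entries VALUES (?, ?, ?)", [
    (1, "Nice URLs", "Rewriting with PHP"),
    (2, "Print styles", "A CSS style sheet for printing"),
])


def search(conn, term, fields=SEARCHABLE):
    """Substring search over the chosen fields (ASCII case-insensitive)."""
    chosen = [f for f in fields if f in SEARCHABLE]  # whitelist column names
    if not chosen or not term:
        return []
    # Escape LIKE wildcards so the user's text is matched literally.
    escaped = term.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")
    pattern = "%" + escaped + "%"
    where = " OR ".join(LIKE_CLAUSE.format(f) for f in chosen)
    sql = f"SELECT id, title FROM entries WHERE {where} ORDER BY id"
    return conn.execute(sql, [pattern] * len(chosen)).fetchall()
```

Calling `search(conn, "php")` covers both fields, as the simple interface would; `search(conn, "css", fields=("title",))` restricts the search to titles, as an advanced option might.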


I will continue to provide RSS and Atom feeds as summaries. I am not planning to produce feeds of categories, links, or comments, but these may follow later on down the line.

Visual design

I am quite attached to my existing design, except for the navigation. I will probably make some changes, but expect them to be evolutionary, not revolutionary. I'll be adding a style sheet for printing, and possibly some seasonal alternatives. No plans for a style-switcher, though - get Firefox for a mechanism that takes care of that.


  1.

    For keeping the file system alive: you could send all requests for /blog to a PHP script that checks for the ID of the post and/or other interesting parts of the URI. The script then checks the database to find the 'slug' of the post and related details like the post date, and redirects the person to /archives/{year}/{month}/{day}/{slug}.

    For search you could use the FULLTEXT functionality of MySQL which works absolutely great.

    Oh, and please provide full text in your feeds.

    Posted by Anne on Jun 08, 2004.

  2.

    The filing system idea seems like a good one. I shall have to think about that.

    I have considered FULLTEXT; however, it will not be useful on searches of 3 characters or less. I'd like for people to be able to search for PHP, CSS, XML, etc.

    I am disinclined to provide full text feeds. I'd like my entries and tutorials to be read in the context of my website. This will be especially important in the future, when I hope to use the si-blog as a tool to help attract business.

    Posted by Simon Jessey on Jun 09, 2004.

  3.

    You can still use FULLTEXT search though. Just add some things like '%' when the entered search text is less than four characters.

    If you need a write-up for more information, I believe SimonW has written something in the past about that.

    Posted by Anne on Jun 09, 2004.

  4.

    Simon, I came across this post on WebmasterWorld and thought you may find it useful for the new blog engine. I've had a few 'people' come in, this week even, and begin requesting pages at phenomenal rates. This throttle may do the trick...

    Posted by Mike P. on Jun 18, 2004.