A List Apart

Issue № 145

Simple Content Management

Published in The Server Side

Don’t be fooled by the title: this article covers the implementation of a complete, expandable, client-side content management system using REBOL.  This system makes it easy for any website operator, regardless of experience, to update site content while keeping markup valid and consistent and ensuring that links stay pertinent.

Why another CMS? I’m not a fan of the client-side content management provided by FrontPage or Dreamweaver, and server-based systems rely on server OS and software and are vulnerable to the restrictions of hosting packages.  This CMS will work on any desktop system.

What you’ll need

Your content editors will need to be able to modify a text file, understand a few markup rules, and have a little double-clicking experience.

You, the web designer and provider of code, should have a little scripting experience and a copy of the scripting language REBOL.  The core language runs on over 40 platforms and is available from the downloads section of the REBOL website. To run the code in this article, you just need to follow the installation instructions for your platform, which usually involves nothing more arduous than unpacking a Zip file.

REBOL may be new to many of you, so I’d suggest a quick look at some of the excellent language primers on their site, though you may find it just as easy to learn by example.

Your content file

All content for your site is stored within a single text file, which I’ll name content.txt.  This file uses simplified markup rules I have defined myself, and the contents of the file will look something like what’s below. (Note: Some lines have been wrapped to fit this page. View the linked document for easier study and greater accuracy.)

Site Title: Business, Inc.
New Page: Products (products.html)
===Business, Inc. Products
Business, Inc. has an **excellent** reputation in the world of office supplies. Our specialties include:
...Paper Clips
...Staples
...Pen Holders
Buy online: ##http://oursite/store/##Business, Inc. Store##.

This simplified markup (which you define) is all the content editors need to use to keep their site up-to-date.  Note: every declaration, heading and paragraph is on a single line, so a word-wrapping text editor is a must.

At the top of the file is the title of the site, which will appear between the <title> tags.  A New Page: line starts each page, giving its heading followed by its file name in parentheses.  Headers are defined by === (three equals signs) at the beginning of the line, list items by ... (three periods), and every other line is a regular paragraph.  Inline formatting includes wrapping some text in pairs of asterisks (**like this**) to emphasize it, and hyperlinks take the form ##url##text##.  I also have special markup for images, though it is beyond the scope of this article to explain their implementation.  Suffice it to say, it probes the image for its size attributes.

One script fits all

To begin the REBOL script, first we need to create a text file called content-management.r  (or anything ending in .r).  At the top of the file goes:

REBOL [
 author: "Christopher Ross-Gill"
 rights: "Chris and ALA readers"
]

The REBOL header provides much scope for describing your scripts.

Now we can load in the content.

content-block: read/lines %content.txt

I’ve used the /lines refinement of read to split the file by line endings (file names in REBOL have a ‘%’ prefix and use ‘/’ for directories – no matter what the platform).  We’ll end up with a block (like an array) of strings.  We just need to zap all empty strings from content-block. {Line wraps are marked thusly: ». – Ed.}

while [empty-string: find content-block ""]»
 [remove empty-string]

Parsing the content

Extracting structure from our document is made easy with the parse command.  This next bit of code defines structure as an empty block, then loops through content-block exposing each string to the rigors of the parse rule, filling structure with the processed, organized content.

structure: copy []
foreach paragraph content-block [
 parsed?: parse paragraph [
  "Site Title:" copy para to end
  (repend structure ['site-title trim para])
  |
  "New Page:" copy para to "("
  skip copy id to ")" skip
  (repend structure ['h1 trim para id])
  |
  "===" copy para to end
  (repend structure ['h2 para])
  |
  "..." copy para to end
  (repend structure ['li para])
 ]
 if not parsed? [repend structure ['p paragraph]]
]

Notes on the rules: In REBOL, a parse rule not only describes the structure of a string, it also instructs the program what to do with the extracted values – these instructions are stored in the parentheses.

Taking the first rule as an example, it looks at the beginning of the string for the text "Site Title:" then sets the word (like a variable) para to the rest of the string.  It then appends the word site-title and the trimmed string para to structure.  Alternate rules are separated by the | character.  If none of the rules match, we append the word p followed by the whole string.  The structure block should look something like this (for more complex strings, REBOL uses curly brackets {}):

[
 site-title "Business, Inc."
 h1 "Products" "products.html"
 h2 "Business, Inc. Products"
 p {Business, Inc. has an **excellent** reputation in the world of office supplies. Our specialties include:}
 li "Paper Clips"
 li "Staples"
 li "Pen Holders"
 p {Buy online: ##http://oursite/store/##Business, Inc. Store##.}
]
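To see a single rule at work in isolation, here is a minimal sketch that runs one heading rule against two sample lines. The helper name classify is hypothetical and not part of the article's script; it only mirrors the "match a prefix, copy the rest, otherwise fall back to a paragraph" pattern described above.

```rebol
REBOL [title: "Parse rule sketch"]

; classify is a hypothetical helper used for illustration only.
; It tries the heading rule; if parse fails, the line is a paragraph.
classify: func [line [string!] /local para][
    either parse line ["===" copy para to end][
        reduce ['h2 para]
    ][
        reduce ['p line]
    ]
]

probe classify "===Business, Inc. Products"
probe classify "Just a paragraph."
```

The first call should yield a block starting with the word h2, the second a block starting with p, which is the same shape as the entries in structure.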

Inline formatting

Before we go punching our content into an XHTML template, we’ll set up a paragraph formatter.  This function deals with punctuation, special characters, links and emphasis.

format-text: func [text [string!]][
 ; work on a copy so the same string can safely be formatted more than once
 text: copy text
 replace/all text {&} {&amp;}
 replace/all text { "} { &#8220;}
 if (first text) = #"^"" [replace text {"} {&#8220;}]
 replace/all text {"} {&#8221;}
 replace/all text { '} { &#8216;}
 if (first text) = #"'" [replace text {'} {&#8216;}]
 replace/all text {'} {&#8217;}
 replace/all text {--} {&#8211;}
 replace/all text { &#8211; } {&#8211;}
 replace/all text {.  } {.&nbsp; }

The collection of replaces above gives us curly quotes, dashes, et al.

 parse/all text [
  any [
   thru {##}
   copy link to {##}
   2 skip
   copy hlink to {##}
   (
    href: {<a href="}
    nlink: link
    replace text rejoin [
     {##} link {##} hlink {##}
    ] rejoin [
     href lowercase copy nlink {">} hlink {</a>}
    ]
   )
  ]
  to end
 ]

This parse rule can be made more sophisticated; for example, it could detect email addresses and external links.  The any at the beginning of the rule indicates that there may be zero or more links within a paragraph (as opposed to some, which would require at least one link).
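The difference between any and some is easiest to see on a paragraph that contains no links at all. The following is a small sketch under stated assumptions: count-links is a hypothetical helper (not part of the article's script) that walks the same ##url##text## pattern but only counts matches instead of rewriting the string.

```rebol
REBOL [title: "any versus some (sketch)"]

; count-links is a hypothetical helper used for illustration only.
count-links: func [text [string!] /local n][
    n: 0
    parse/all text [
        any [
            ; opening ##, url, middle ##, link text, closing ##
            thru "##" to "##" 2 skip to "##" 2 skip
            (n: n + 1)
        ]
        to end
    ]
    n
]

print count-links "No links here."
print count-links "Buy online: ##http://oursite/store/##Business, Inc. Store##."

; the same link-free paragraph under any and under some
print parse/all "No links here." [any [thru "##"] to end]
print parse/all "No links here." [some [thru "##"] to end]
```

The two counts should come out as 0 and 1. The last two lines show why the formatter uses any: that rule accepts the link-free paragraph, while the some rule rejects it.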

 parse/all text [
  any [
   thru {**} copy emphasized to {**}
   (
    replace text rejoin [
     {**} emphasized {**}
    ] rejoin [
     {<em>} emphasized {</em>}
    ]
   )
  ]
  to end
 ]

A simpler version of the hyperlink rule.

 replace/all text {@} {&#64;}

As an anti-spam measure, this replaces the ‘@’ symbol with its associated entity reference number.

 return text
]

Thus concludes the format function.

Your XHTML template

The template can be taken care of easily.  Prepare your XHTML document and save it as template.html, putting placeholders for the page title and the body. A minimal template is below. The linked document is invisible in your browser until you view source, which should look like this:

<html><head><title><% page-title %></title></head>
<body><% page-body %></body></html>

Placeholders need only be unique in construction; they don’t necessarily need to be an XHTML tag.

punch-template: func [
 site-title [string!]
 page-title [string!]
 page-menu [string!]
 page-content [string!]
][
 template: read %template.html
 replace template <% page-title %> rejoin [site-title ": " page-title]
 replace template <% page-body %> rejoin [newline page-menu newline page-content newline]
 return template
]

That is our function to replace the placeholders.  Yep, it’s that simple.
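To try the placeholder idea without touching any files, here is a minimal sketch that punches an in-memory template. It makes two assumptions that differ from the function above: the template is a string literal rather than the contents of template.html, and the placeholders are searched for as plain strings rather than as tag values. The sample title and body are made up for illustration.

```rebol
REBOL [title: "Template punching sketch"]

; An in-memory stand-in for template.html (assumption: same placeholders).
template: copy {<html><head><title><% page-title %></title></head><body><% page-body %></body></html>}

; Each replace swaps the first occurrence of a placeholder for real content.
replace template "<% page-title %>" "Business, Inc.: Products"
replace template "<% page-body %>" "<h1>Products</h1>"

print template
```

Because replace changes only the first match, each placeholder should appear exactly once in the template.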

Putting it all together

So let’s review what we have up to now: content, stored as structure; formatting guidelines, stored in the function format-text; and a template, stored in the function punch-template.  To put everything together, we need to add two more stores to our script: one for individual page contents with associated file names, and a string to store the menu code.

pages: copy []
menu: copy ""

Now we can wrap up the content from within a single parse rule – remember, code can be run from parentheses within a parse rule.  Since we’re parsing a block as opposed to a string this time, the rule looks slightly different:

parse structure [
 (append menu <div id="menu">)
 'site-title set title string!

Remember the structure block?  Our parse rule is stating that it must begin with the word site-title followed by a string.

 some [
  'h1 set header string! set id string!
  (
   repend menu [
    newline <p> build-tag compose [
      a href (id)
    ]
    format-text header </a> </p>
   ]
   content: copy ""
   repend content [
    newline <div id="content">
    newline <h1> format-text header </h1>
   ]
   list?: false
  )

When we come across an h1, we are really defining a new page.  So we add a reference to the menu.  Then we make a new string for this page and insert a <div id="content">.  Then we paste in the heading.  The final task is to turn the list? flag off.  We'll come to this soon.

  some [
   'h2 set para string!
   (
    if list? [append content </ul> list?: false]
    repend content [newline <h2> format-text para </h2>]
   )
   |
   'p set para string!
   (
    if list? [append content </ul> list?: false]
    repend content [newline <p> format-text para </p>]
   )
   |
   'li set para string!
   (
    if not list? [repend content [newline <ul>] list?: true]
    repend content [newline <li> format-text para </li>]
   )

This covers the rest of the formatting.  list? allows us to check when we have to add the <ul> tag.  When we come across an li and list? is not true, we set list? to true and add a <ul>.  Likewise, if list? is true when we come across a header or paragraph, we insert the </ul> tag and set list? to false.
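The flag logic can be tried on its own, outside the parse rule. The sketch below is an illustration under stated assumptions: it loops over a small made-up block with foreach instead of parsing structure, skips format-text, and uses the hypothetical names items, out and in-list? so that nothing in the main script is disturbed.

```rebol
REBOL [title: "List flag sketch"]

; Made-up sample data in the same word/string shape as structure.
items: [p "Intro" li "One" li "Two" p "Outro"]
out: copy ""
in-list?: false

foreach [kind para] items [
    either kind = 'li [
        ; first item of a run opens the list
        if not in-list? [append out "<ul>" in-list?: true]
        repend out ["<li>" para "</li>"]
    ][
        ; anything else closes a list that is still open
        if in-list? [append out "</ul>" in-list?: false]
        repend out ["<p>" para "</p>"]
    ]
]
; close a list left open at the end of the page
if in-list? [append out "</ul>"]

print out
```

The two list items should end up inside a single <ul> element, with the paragraphs before and after left outside it.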

  ]
  (
   if list? [append content </ul>]
   append content </div>
   repend pages [to-file id header content]
  )
 ]
 (append menu </div>)
]

That should fill our menu string and our pages block nicely.  Each page is stored as a file name, its heading and its content, so that every page keeps its own title.

Publishing

To finish up, we just loop through the pages block, and everything we need to put the site together is there.  REBOL treats files and ftp urls in the same way, so you could quite easily replace %pages/ with ftp://user:pass@somesite.org/pages/ to write the pages directly to your web server, which completes the content management system.

if not exists? %pages/ [make-dir %pages/]
foreach [file header content] pages [
 write join %pages/ file punch-template title header menu content
]

Save the script.  Place it in the same folder as content.txt and template.html and execute it.  A new folder called pages will appear, containing, if you followed my example, products.html.  Try adding new pages to content.txt or try adding your own formatting rules.

This script can form the base of a much more powerful and complex client-side content management system.  More complex formatters exist in REBOL, such as Make-Doc-Pro or the HTML Dialect.  An article on how REBOL blocks work is available at REBOL Forces.
