Content Modelling: A Master Skill
Issue № 349

In “Tinker, Tailor, Content Strategist,” which runs concurrently in this issue, I asked you about content strategy master skills, which hardly seems fair if I don’t share one of my own favorites. More and more I find that the content model is one of the most important content strategy tools at my disposal. It allows me to represent content in a way that translates the intention, stakeholder needs, and functional requirements from the user experience design into something that can be built by developers implementing a CMS. The content model helps me make sure that the content vision becomes a reality.

What is a content model?

A content model documents all the different types of content you will have for a given project. It contains detailed definitions of each content type’s elements and their relationships to each other. You can capture a high-level version in an org chart-style diagram, or use a spreadsheet to capture a more detailed version. The level of detail in the model is determined by the purposes you need it to serve.

Figure: A simple, high-level content model

The high-level model shown here depicts some common content types of a music website and how they’d relate to each other. Items in the sales chart would link to pages about the relevant songs, artists, and albums listed. The album, artist, and song pages would also be linked to each other. This model can be used to validate the concept with stakeholders, and helps IAs and designers start thinking about the implications for the flow of the site.

Most of the time, however, you’ll need a more detailed content model that breaks down each content type into its components, and provides information such as the format in which you expect each attribute. In the example above, a more detailed model—showing the breakdown of information captured about an album, artist or song—would be developed to inform the design of the pages themselves and help configure the CMS. It will also eventually be useful in training the content creators.
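To make the idea of a detailed model concrete, here is a minimal sketch of the music example in Python. The attribute names, formats, and sample values are assumptions chosen for illustration; the article does not list them, and a real model would come out of the project's own requirements.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List


@dataclass
class Artist:
    name: str          # plain text, required
    bio: str = ""      # longer text, optional


@dataclass
class Song:
    title: str                  # plain text, required
    artist: Artist              # relationship to an Artist item
    duration_seconds: int = 0   # stored as a number so it can be sorted


@dataclass
class Album:
    title: str                  # plain text, required
    artist: Artist              # relationship to an Artist item
    release_date: date          # a real date value, not free text
    songs: List[Song] = field(default_factory=list)  # ordered track list


# Hypothetical sample data.
artist = Artist(name="Example Artist")
album = Album(title="Example Album", artist=artist, release_date=date(2012, 4, 24))
album.songs.append(Song(title="Opening Track", artist=artist, duration_seconds=215))
```

Each comment plays the role of a row in the spreadsheet version of the model: the element, its expected format, and whether it is required.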

Why is a content model important?

The content model both influences and is influenced by the work of several other disciplines. A content model helps clarify requirements and encourages collaboration between the designers, the developers creating the CMS, and the content creators.

For information architects and designers: The content model helps information architects and designers make sure that the page designs accommodate all the content types for the site and provides guidance on the bits of text and media that will be available for the page. At the same time, the content model needs to support the content, layout, and functionality portrayed in the designs. If captions are included in the layout, that must be captured in the model for an Image. If events must be sorted by date on a calendar, then Date has to be captured in a separate field and it has to be sortable data, not just text. Depending on the complexity of the site, a high-to-medium level of detail is usually sufficient for designers.

For developers: The content model helps developers understand content needs and requirements as they configure the CMS. Given the many types of CMS, there are many ways to accomplish the same effect. If the content model indicates something that isn’t easily accomplished by a given CMS, it helps the developers adjust their approach to create the desired result in a way that’s compatible with the way the CMS works. Developers will need a greater level of detail in the content model. If the content strategist doesn’t provide that detail, the developers will interpret content needs and fill in the details themselves, and they may not be in a position to keep in mind all the design requirements or the needs of the content producers who will use the system.

For content authors and producers: The content model gives content authors and producers guidelines on what content to write or create and how to enter it into the CMS. Although they aren’t generally part of the content modeling process, it’s important to keep this audience in mind because they’ll be the ones working with the CMS each day. For their sake, try to keep the model intuitive, be consistent where there are similarities, and keep redundant activities to a minimum. Depending on how you set up the content model, you can make it easy for them to get their work done, or you can make them dread having to use the system.

How do you create a content model?

There are three main things to consider:

  1. The assembly model: The way content creators will put individual content items together to make webpages, campaigns, documents, or other content products.
  2. The content types: The various configurations of content that are distinct enough to be unique types in the system.
  3. The content attributes: The content and metadata elements that make up each type, including how they relate to each other.

The assembly model

It’s important to understand that most CMSs have a bias. They’re often designed around a certain “unit” of content and that’s what they’re optimized to create. For blog applications, the unit is a post. For SharePoint, a unit is a document. For most web content management tools, a unit is a webpage, even though we’re more likely to be making dynamic sites that use (and reuse) content in a variety of configurations.

During one conversation at the start of a project using FatWire, I started sketching how the elements of a page could be assembled. One of the developers immediately observed, “That’s a very Interwoven way of doing things. We can’t do that in FatWire.” Interwoven is a line of CMS products that allow you to identify a repeatable group of attributes within a content item. For example, an individual FAQ item could be composed of a question, an answer, an optional image, and an optional link. When creating the page, the content creator can “replicate” that entire group of attributes to create as many FAQ items as needed to complete the page. FatWire doesn’t allow that kind of open-ended replication of groups of attributes. You must either specify the maximum number of fields you need for all of the question-answer sets (and the user can leave some of them blank), or you must make the question-answer set its own separate content type and then assemble the sets into a list on the FAQ page.
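The two workarounds described for the FAQ page can be sketched as plain data structures. This is not FatWire or Interwoven code; the class names and the slot limit are hypothetical and only illustrate the difference in shape between the two approaches.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

# Approach 1: a fixed maximum number of slots on the page type.
# Authors leave unused slots blank.
MAX_FAQ_SLOTS = 3  # hypothetical limit chosen for this sketch


@dataclass
class FixedSlotFaqPage:
    title: str
    questions: List[Optional[str]] = field(
        default_factory=lambda: [None] * MAX_FAQ_SLOTS)
    answers: List[Optional[str]] = field(
        default_factory=lambda: [None] * MAX_FAQ_SLOTS)

    def filled_items(self) -> List[Tuple[str, str]]:
        """Return only the question-answer pairs that were filled in."""
        return [(q, a) for q, a in zip(self.questions, self.answers) if q and a]


# Approach 2: each question-answer set is its own content type,
# and the page assembles a list of them.
@dataclass
class FaqItem:
    question: str
    answer: str
    image_url: Optional[str] = None  # optional image
    link_url: Optional[str] = None   # optional link


@dataclass
class AssembledFaqPage:
    title: str
    items: List[FaqItem] = field(default_factory=list)


fixed = FixedSlotFaqPage(title="Shipping FAQ")
fixed.questions[0], fixed.answers[0] = "Do you ship abroad?", "Yes."

assembled = AssembledFaqPage(title="Shipping FAQ")
assembled.items.append(FaqItem("Do you ship abroad?", "Yes."))
```

The first shape caps the page at a fixed length and carries empty slots. The second has no cap, but every question-answer set becomes an item that authors create and manage separately.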

To make these decisions, you need to consider how modular your content needs to be. Some types of content, such as a press release, may be fairly self-contained. Each instance of Press Release will create one page. In other cases, you’ll have pages made up of many different reusable content modules collected to create the whole. In the FAQ example, if you want to be able to use certain questions on multiple FAQ pages, then it makes sense to have each question-answer set be its own content item. Then they could be manually gathered into lists, or they could be tagged with terms to be dynamically assembled into FAQs by topic.
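As a rough sketch of the tagging option, assume each question-answer set is its own item carrying a set of topic terms. The function and field names here are hypothetical; a real CMS would do this through its own query or tagging features.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class FaqItem:
    question: str
    answer: str
    topics: Set[str] = field(default_factory=set)  # tag terms


def assemble_by_topic(items: List[FaqItem]) -> Dict[str, List[FaqItem]]:
    """Group reusable FAQ items into one list per topic tag."""
    pages: Dict[str, List[FaqItem]] = defaultdict(list)
    for item in items:
        for topic in sorted(item.topics):
            pages[topic].append(item)
    return dict(pages)


items = [
    FaqItem("How do I reset my password?", "Use the reset link.", {"account"}),
    FaqItem("How do I update my card?", "Open billing settings.",
            {"account", "billing"}),
]
pages = assemble_by_topic(items)
```

The second item appears on both the account and billing FAQs without being duplicated, which is the payoff of making each set a separate content item.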

Some other things to consider when thinking about the assembly model:

  • How structured does the content need to be? Do you need to capture specific, uniquely identifiable data that can be used to sort or filter the content (for example, date, price, rating, author, or location)?
  • How flexible does the content need to be? Can you predict what the elements of a page will be, how many and in what order? Or do you need to support an open structure with the ability to add varying numbers of various content elements to any part of the page?
  • How reusable does the content need to be? Can you make the reusable parts separate so that they can be shared across different pages of the site?
  • How tolerant are your content creators of laborious processes? If you ask them to take a lot of unstructured content and break it into dozens of distinct data fields, are they going to have the time to do it? If you’re lucky, maybe they’re already used to doing that kind of work, but try to avoid asking them to break up the content if it serves no functional purpose, because this can be very time-consuming.

Don’t worry too much about getting precise answers to these questions at this initial stage. This information will help guide decisions in the next two steps. So even if you can’t get consensus on the answers, just having the questions will help shape your thinking and discussions for the rest of the task.

The content types

Questions about how structured the content needs to be will help you determine what constitutes a distinct content type. If the content doesn’t need to be structured, you can have one basic content type and put whatever you need to in it. This is the premise behind a blog post: very flexible, but very unstructured.

More likely, you will want to create at least some structured content types. Then you need to abstract the kinds of content you’re creating and look for patterns. Consider the elements that make up a piece of content and see how many attributes they have in common. For example, a Recipe is clearly quite different from a Slideshow. But is a Band Profile close enough to an Actor Profile that they can be two flavors of the same underlying Profile type?
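One rough way to look for these patterns is to list each candidate type's attributes and measure how much they overlap. The sketch below uses a simple set-overlap ratio (shared attributes divided by all attributes). That measure and the attribute lists are assumptions for illustration, not a method prescribed by the article.

```python
from typing import Set


def shared_attribute_ratio(a: Set[str], b: Set[str]) -> float:
    """Fraction of all attributes that two candidate types have in common."""
    return len(a & b) / len(a | b)


# Hypothetical attribute lists.
band_profile = {"name", "bio", "photo", "genre", "members"}
actor_profile = {"name", "bio", "photo", "filmography"}
recipe = {"name", "ingredients", "steps", "cook_time"}

band_vs_actor = shared_attribute_ratio(band_profile, actor_profile)
band_vs_recipe = shared_attribute_ratio(band_profile, recipe)
```

A high overlap suggests two flavors of one Profile type with a few optional fields, and a low overlap suggests separate types. The number only informs the discussion; where to draw the line remains a judgment call.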

There are other reasons to make something a separate type of content:

  1. Distinct, reusable elements. You might decide to create an Author content type that contains the name, bio and photo of each author. These can then be associated with any piece of content that person writes.
  2. Functional requirements. A Video might be a different type of content because the presentation layer needs to be prepared to invoke the video player.
  3. Organizational requirements. A Press Release may be very similar to a general Content Page, but only the Press Release is going to appear in an automatically aggregated Newsroom. It’s easier for these to be filtered out if they’re a unique type of content.
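The Newsroom case can be sketched as a filter on content type. The field names and type labels below are hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import List


@dataclass
class ContentItem:
    content_type: str  # for example "press_release" or "content_page"
    title: str
    published: date


def newsroom(items: List[ContentItem]) -> List[ContentItem]:
    """Select press releases only, newest first."""
    releases = [i for i in items if i.content_type == "press_release"]
    return sorted(releases, key=lambda i: i.published, reverse=True)


site = [
    ContentItem("content_page", "About Us", date(2012, 1, 5)),
    ContentItem("press_release", "Spring Launch", date(2012, 3, 1)),
    ContentItem("press_release", "New Office", date(2012, 4, 10)),
]
latest = newsroom(site)
```

If press releases were ordinary content pages, the same selection would depend on authors remembering to tag or file each one correctly.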

The lines can get fuzzy, especially because some elements within the type will always be optional. Sometimes it comes down to a question of “How many differences does it take before something is a completely different thing?” When you have a situation that’s too close to call, it’s probably a good idea to collaborate with your tech team and business analyst or functional analyst to come up with the best approach.

The content attributes

In the last step, you’ll identify each different element of each content type. This includes both the content that you can see on the page, and the metadata, which you don’t see. It also includes relationships to other content types. For example, if you create a separate Author content type, you will need to indicate that a Review can have an Author associated with it.
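A minimal sketch of the Review and Author relationship, with visible content and metadata kept in separate fields. All names and sample values are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Author:
    name: str
    bio: str
    photo_url: str


@dataclass
class Review:
    title: str      # visible on the page
    body: str       # visible on the page
    author: Author  # relationship to a separate Author item
    keywords: List[str] = field(default_factory=list)  # metadata, not displayed


writer = Author("Sam Example", "Original bio.", "https://example.com/sam.jpg")
review_a = Review("First Review", "Body text.", writer, ["album"])
review_b = Review("Second Review", "Body text.", writer, ["concert"])

# Because both reviews point at the same Author item, one edit updates both.
writer.bio = "Updated bio."
```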

Some elements will be obvious—your press release has a title, maybe a subtitle, an optional image, and body text. But determining which pieces of information need to be captured in separate fields can be a challenge in some cases. Consider the following:

  • Layout: Do some things need to be displayed in a completely different style, or on varying places on the page? You should avoid having markup and styling stored with the content, so ideally each element that needs to be displayed differently should be in its own field. For example, you could just make the subtitle part of the body text and ask your editors to bold it, but how well is that going to work with the RSS feed or on a mobile device?
  • Reuse: Again, a separate field of data can be pulled out and used independently of the rest of the page, but not if it’s all part of one big body text.
  • Sorting and filtering: If you want to be able to sort content by date, or filter content that pertains to a particular city, then these pieces of information have to be in a field by themselves so that they can be used to sort and filter.
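The layout and reuse points can be illustrated with a press release whose subtitle lives in its own field. The two render functions below are hypothetical stand-ins for a web template and a feed template.

```python
from dataclasses import dataclass
from html import escape


@dataclass
class PressRelease:
    title: str
    subtitle: str  # kept separate from the body, with no markup stored in it
    body: str


def render_html(item: PressRelease) -> str:
    """Web page: the template decides how the subtitle is styled."""
    return (f"<h1>{escape(item.title)}</h1>\n"
            f"<h2>{escape(item.subtitle)}</h2>\n"
            f"<p>{escape(item.body)}</p>")


def render_feed_summary(item: PressRelease) -> str:
    """Feed or small screen: reuse the subtitle alone as a plain-text summary."""
    return f"{item.title}: {item.subtitle}"


release = PressRelease("Spring Launch", "New line ships in March", "Full text here.")
page = render_html(release)
summary = render_feed_summary(release)
```

If the subtitle were bolded text inside the body, the feed summary could not be produced without parsing the body.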

Once you’ve determined what the different elements are, you’ll also need to capture information such as what format each element should be in and whether or not it’s required. For example, if you want to sort a certain type of content by date, then every instance needs to have a date, and you can’t have some dates entered as mm/dd/yyyy and some entered as dd/mm/yyyy. Again, coordinate with your tech team to make sure you’re providing them the information that they need to configure the system. And coordinate with your business/functional analyst to make sure that your recommendations are aligned.
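The date example can be sketched as a required field with a single accepted format. The choice of mm/dd/yyyy and the function name are assumptions for illustration.

```python
from datetime import date, datetime

DATE_FORMAT = "%m/%d/%Y"  # assumed house format: mm/dd/yyyy


def parse_event_date(raw: str) -> date:
    """Parse a required date field; raise ValueError if missing or malformed."""
    if not raw or not raw.strip():
        raise ValueError("date is required")
    return datetime.strptime(raw.strip(), DATE_FORMAT).date()


raw_dates = ["03/04/2012", "12/25/2011", "01/15/2012"]
sorted_dates = sorted(parse_event_date(d) for d in raw_dates)
```

Validation like this catches an entry such as 25/12/2011, but it cannot tell whether 03/04/2012 was meant as March 4 or April 3. That is why the format has to be agreed on and communicated to authors rather than left to the parser.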

Content model documentation

Since the content model serves different audiences, at several different stages of the project, treat it as a living document. It’s never really complete—you just stop updating it when the project is over. As such, it’s better as a working document than a finished deliverable. Over the lifecycle of a content model many people will have input, and it’s even possible that different people will own it at different stages. In most of my projects I’ve handed off the content model to either a functional analyst or developer at some point.

Mainly, the project team will use the content model internally. I’ve rarely had stakeholders who wanted to review an in-progress content model. When they have, they’ve had little to say about it. What they really want is a CMS that meets their design, tech, and business needs, and also serves the people who will have to use it. The content model helps the team to make that happen.

On occasion, you may need to ask stakeholders for input or decisions. It may be more useful to pull out the salient details and present them separately. This way they can focus on the question at hand rather than trying to understand the whole of the content model.

If you must also create CMS training or content production guidelines, the content model will serve as a useful starting point. You can easily excerpt information as needed and reformat it into a more user-friendly document. Though we might strive for a well-configured CMS that’s intuitive and easy to use, the reality of the implementation is almost always more complicated than that. Providing proper guidance to the CMS users could make the difference between the project being perceived as a success or as a serious mess.


A content model is a powerful tool for fostering communication and aligning efforts between UX design, editorial, and technical resources on a project. By clearly defining the assembly model, the content types, and the content attributes, we can help make sure that the envisioned content strategy becomes a reality for the content creators. In my recent projects, I find that content modeling is more and more in demand. It’s a valuable skill for any content strategist, especially those who strive for mastery.

About the Author

Rachel Lovinger

Rachel Lovinger is a Content Strategy Director at Razorfish NYC. She’s dedicated to exploring a future in which information is more effectively structured and useful connections are more easily revealed. Find her on Twitter, on Scatter/Gather, and at conferences all over the world.

15 Reader Comments

  1. I have to admit that it bugs me a little that reading this article makes it sound like “content modeling” is some new shiny tool to go in the content strategist’s toolbelt. The reality is that this is a fairly old technique in the web world, and even moreso in the data systems world – except the rest of us call it data modeling. I’m not sure why we need to give it a new name and pretend like there’s new strategy around how to use it when the reality is, this is all stuff a lot of folks have done for ages, just under a different moniker and using a good project manager to coordinate the construction and sharing (and as a side note, why are we trying to turn content strategists into project managers anyway?).

    We can debate whose responsibility it is to actually create the data model; that’s pretty objective in a lot of cases. But it’s definitely not new, and I think it harms the industry to not at least give a hat tip to all the work that’s been done in this area prior to today. This seems to be a common and growing problem amongst content strategists lately: in trying to carve out their niche, they’re cannibalizing parts of other fields but acting like it’s new ground.

    Sorry, rant off.

  2. Having been in this field for well over a decade now I have seen hundreds of projects come and go and content is so often viewed as less important than design, less important than functionality, less important. I disagree with fienen above because his comments imply that content hasn’t been an issue in our world. It has been. It still is. The more articles like this that are published the more people will start thinking along the lines of content first, which, in my experience, increases the chances of a successful project immensely.

  3. I should remark following Derek that I absolutely understand the importance of content in our field. I’m a huge proponent of content and content strategy, as a matter of fact (I’ve debated it, written about it, and spoken about it on more occasions than I can count). What draws my ire is definitely not the strategic principle of “content first.” Nor is it the practice of ensuring content is considered throughout whatever project methodology you use. Content is enormously important. Understanding its roles and usable applications is vital for determining its place in overriding marketing strategy. And the practice of “content modeling,” as presented in the article, definitely is a crucial part of that.

    My main issue in this case is simply in the way this particular concept was approached in this article: that an entity-relationship model development process was picked up, repackaged, and made to look like something new and shiny for content strategists. I wouldn’t even argue that a content strategist shouldn’t be the one responsible for this necessarily. But I do think it’s a bit disingenuous to present it without acknowledging what it really is, and I see it as a symptom of a bigger problem within the field of content strategy in how we approach concepts of web design.

  4. Rachel’s observations on the assembly model are well made. Certainly every project should involve a detailed and concrete briefing on the intrinsic data model of the target CMS, early in the process.

    That said, I fully agree with _fienen_, that large chunks of this appear to be data modeling in a shiny new hat.

    By not relating it to the 30+ year history of data modeling and diagramming, the reader is denied access to a rich history of related, highly relevant techniques. While the simple “content model diagrams” shown in the article have value, they would be so much richer using Crow’s Foot notation (or UML, et al) with little increase in complexity.

    A good entity relationship diagram can simultaneously appear simple to a lay person, while encoding valuable information for a developer. That’s a great artifact. And one that better serves the stated goal of being an effective advocate for content into the technical realm. A master technique indeed.

  5. Hello folks, this is an interesting debate. I certainly didn’t intend to give the impression that I had invented a new thing called content modeling. My goal was to share information that would be valuable to people who are working with content but might not have done this kind of activity before. It may not be new, but for people who haven’t done it, I’m hoping this article provides a good introduction to the concepts and practices, and a way to get started thinking about it.

    This kind of activity has been part of my work for 12 years, though I never had formal training in data modeling. What I’ve shared here are the ideas and activities that I’ve used in my own work, and that I think will be most useful for people who are doing content strategy. My editor suggested a quote that I think captures this well:

    “There are no new ideas. There are only new ways of making them felt.” – Audre Lorde

    fienen and chetamahori (and any others reading this article), if you have additional resources that you feel will enrich people’s understanding of the history or nuance of this practice, please feel free to share them here. This should be the beginning of a conversation, not the last word.

  6. While I take fienen’s point that the UX/CS/IA community is sometimes guilty of land-grabbing existing practices, I think it’s valid to call this approach ‘content modelling’ (I say ‘domain modelling’ but potato/patato…) since we’re dealing strictly within the domain of content objects and their thematic, semantic connections, whereas general data modelling might include actors, system states etc.

    It’s great to see this practice getting traction on ALA where hopefully it will be tried by a wider audience to build great, graph-like products. I did a talk on this subject at the IA Summit in 2011 and was greatly encouraged by the positive response from IAs and Content Strategists alike, especially given that domain-driven design is a practice that smells a bit technical at times 🙂

  7. Good points, Mike. I looked through the slides for the presentation and I love the way you frame data modelling within IA. (The slides are online, if people want to take a look.) These are definitely interlocking practices, each with a slightly different focus.

    Here’s how I see all of these models fitting together. There are basically three types of metadata:
    * *structural metadata* – which describes the types of content, their attributes, and how they relate to each other
    * *administrative metadata* – which describes various administrative states of a piece of content, such as who authored it, or when it was created
    * *descriptive metadata* – which describes what the content is about and gives it more context and meaning

    The Content Model is primarily concerned with structural metadata, while the Domain Model is largely concerned with descriptive metadata, though there’s some overlap (especially when you start talking about how content objects will be tagged, and how they’re related to other content objects).

    Of course, all of these models have a common purpose, which is to help translate and communicate concepts and intentions into something that can be validated and built. Which makes them ideal tools for creating a bridge between design and technology.

  8. Thanks Rachel! Agreed that we need a separation of concerns when it comes to managing and designing for metadata.

    I agree that the modelling I looked at in the presentation was about the context and meaning of content: modelling conceptual entities and the real-world connections between them. ‘Things’, if you will, rather than documents about those things. RDF is good for expressing such relationships as subject-predicate-object, e.g. Giant Panda – lives in – Broadleaf forest.

    I think this approach is nice because it frees us from the confines of thinking about ‘pages’, or indeed any specific platform-implementation at all, which is of course the new hotness in this context-centric, platform-agnostic world 🙂

  9. I concur with chetamahori, UML based diagrams would have added value to the article both for lay persons and specialists. Else, it’s a well put overview.

  10. I totally agree that content modeling is important, and really it is “data modeling from the other end.” Honestly, the issues and challenges you describe here could easily be addressed and modeled with ExpressionEngine. I don’t know if the other CMSs are as capable, but EE really shines in this area.

  11. Intriguing, useful article – thanks much, Rachel… but I agree with others that it would have been richer, more nutritious with a broader perspective, key sources cited, credit given (dare I say more “context” provided?) – e.g. mention of and differentiation from data modeling, maybe via an editor’s intro (dare I say “content curation”?). Here are Rosenfeld and Morville a decade ago in Polar Bear (2nd edition), p. 293:

    “If you’re already familiar with data modeling, then content modeling should seem similar. However, keep in mind that the unstructured text that makes up so much of our web content presents many challenges that don’t come up in data modeling. In effect, content modeling is an effort to apply structure where there is little or none, with the goal of supporting improved searching, browsing, and managing of content. In a sense, content models are perhaps the truest form of bottom-up information architecture: by determining what types of chunks are important and how to link them, we make the answers embedded in our content ‘rise to the surface’.”

  12. chetamahori and fienen have very valid points, and I’d be really interested to read a deeper piece about the history of data modelling and how this practice has evolved. I’d love more context, but as a relative newbie with a background in journalism, framing it as content modelling makes the most sense for me – and means I can pass this along to other journos to show how and why this field is important, and what sort of thought goes into building a CMS, why semantic markup is important to understand, etc. Super basic for this audience, maybe, but it’s still something I have to argue about with people who don’t understand why submitting Word docs with a list of links is not ‘online publishing’.

    Plus, it’s generated some great resources to follow up on – thanks John and Mike for those. The comments gave some pretty great context.

  13. I have similar thoughts as Leilani Graham-Laidl also being a relative newbie to this concept and having come from journalism/PR background. Framing it this way was helpful to me because it was simple to understand and gave me a place to start. But, I’m also glad I read the comments to understand that there is a longer history to this practice and a variety of resources.

  14. Very informative article. Thanks so much. Friends, is there a good book or resources available on Content Modelling?

    My experience is that unless the business understands and comes up with a definite and intuitive content model (one that will be easy to use and will group similar content together to increase its findability), there’s a chance that vendors implementing solutions like WCM will be happy just to refer to the existing site layout and design to come up with some kind of “content model”. This gives you a better underlying technology, but without a solid taxonomy-driven and intuitive content model, investments in technology don’t really reap any benefits for the business.

  15. Data model, content model, domain model…super model 🙂

    It’s all subjective, but each bring their own unique abstractions to the table, IMO.

    Data modelling may have been around since the dawn of relational databases (likely before). But concepts such as relationships, constraints, normalization, etc. make understanding this subject difficult (if not impossible) for a business person or project manager with no formal experience in RDBMS.

    Likewise, a domain model (especially in the context of DDD via Eric Evans) is even more abstract still, with a greater focus on business rules/logic and providing a tool to programmers to tease these business requirements from stakeholders and business experts.

    After reading this, I see the content model as being something of an in-between. A modelling framework, specific to those working at the level of CMS (i.e. Drupal), not the RDBMS or OO Domain.

    Basically, what I am trying to say, is no harm, no foul in calling it “something else” — it is something else. It would be far worse if you called it data modelling and missed key concepts and made it your own.

    Great article thanks for sharing 🙂

