Is it a Tree or a Graph? Explaining the Two Most Common Content Modeling Patterns

This blog was an assignment for the Intro to Content Management course, part of the MA in Content Strategy degree.

Table Of Contents

Tree and graph content modeling
Choosing the right content model
Hierarchical content modeling pattern – Use cases
Networked content modeling pattern – Use cases
Conclusion
Further considerations
Bibliography

From a content strategy standpoint, the decisions made in the early stages of a website development project (Powell, 2003) are the defining points in the life of a website. Before anyone is allowed to utter the word “mockup”, the back-end content strategist (Rockley, 2016) will consider information ecology, which examines the complex dependencies that exist between context, content, and users (Morville & Rosenfeld, 2007).

Within this consideration sits the topic of content modeling. In content strategy speak, content modeling draws upon data modeling for databases to create a structure capable of elevating content, which is more ambiguous and conceptual than any other sort of data (Hill, 2022). “It’s the process of first understanding the concept of a specific type of content—how your users perceive and understand the various pieces and parts it comprises—and then creating a structure that supports those components and relationships,” writes Sarah Wachter-Boettcher in her book “Content Everywhere” (2012, p. 30). Finally, content modeling is about building a future-proof framework that holds up when populated with content. In Deane Barker’s words, “it’s like city planning – you design a structure for maximum use and future value, then watch as your city grows into it” (2019, p. 16).

Here, I want to add that content modeling of a website and how to do it well is really part of a bigger discourse on web and mobile usability. I will quote Steve Krug’s (2013, p. 27) definition of usability because it really resonated with me:

“A person of average (or even below average) ability and experience can figure out how to use the thing to accomplish something without it being more trouble than it’s worth.”

Back to the main topic.

One of the fundamental decisions in content modeling revolves around the structure of content objects in relation to one another. It matters because choosing a content structure is a once-in-a-website-project kind of decision, and discovering after launch that the implemented structure does not let the content do what it is supposed to do carries significant costs.

First of all, website projects, in general, are expensive, lengthy, and resource-intensive. Finances aside, content objects’ relationships dictate how information is organized and managed, which in turn impacts internal and external audiences alike. In the backend, an admin will be frustrated when they realise they cannot create a list of the most popular articles of all time because this functionality had not been envisioned in the requirements analysis stage of web development (Powell, 2003). This ability to assemble content based on a set of criteria is called content aggregation, which Deane Barker defines as “the grouping of content to provide additional value, to increase the information embodied within it” (Barker, 2024).
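To make the “most popular articles” example concrete, here is a minimal Python sketch of content aggregation. The `Article` class, its `views` field, and the sample data are my own illustrative assumptions and are not taken from any particular CMS.

```python
from dataclasses import dataclass


@dataclass
class Article:
    title: str
    topic: str
    views: int  # hypothetical popularity counter; must exist in the model


def most_popular(articles, limit=3):
    """Aggregate content by a criterion: the most-viewed articles first."""
    return sorted(articles, key=lambda a: a.views, reverse=True)[:limit]


articles = [
    Article("Trees explained", "modeling", 120),
    Article("Graphs explained", "modeling", 340),
    Article("Tag governance", "governance", 75),
    Article("Crawl budgets", "seo", 210),
]

top_two = most_popular(articles, limit=2)
print([a.title for a in top_two])  # ['Graphs explained', 'Crawl budgets']
```

The aggregation itself is a one-liner, but it only works because the model records `views` in the first place. If that attribute was never part of the content model, no amount of editorial effort will produce the list after launch.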

On the user-facing front, visitors will grow exasperated when resurfacing information is time-consuming because topically related content is not easily accessible or clearly interconnected.

One bad decision on how content objects exist in relation to one another within a website has financial, human labour, user satisfaction, and operations-related implications. The stakes are high.

Creating a usable website, one where relationship-building and content aggregation capabilities exist and further increase the value of content, starts with selecting the appropriate content model structure.

Tree and graph content modeling

There are two common types of content model structures that define relationships between content objects: hierarchical and networked.

The first type, a hierarchical structure, is often referred to as a “tree structure” as data arranged according to this model has one unique root that serves as the starting point (Open4Tech, 2019). From there, entities called nodes (in the context of this paper, these entities are content objects) will be added in a hierarchical manner. This structure is actually a top-down one, meaning that the tree is upside down (or reverse), so content objects will be cascading from the root.

The hierarchical content model organises all content objects except the content root in a parent/child structure, which means every object will be related to a single parent (Barker, 2019). Some objects will have children, while some will be childless (these will be referred to as “leaves”). Groups of children will be siblings. An object, together with its descendants, is often called a branch of the tree (Barker, 2019). The similarities with a typical tree in nature continue: individual branches or leaves do not connect to one another; ergo, there are no loops (Norouzi, 2023). The connections between nodes are called edges, and these tend to be unidirectional.

The tree structure is logical but also rigid. It has clear rules for dependencies and indicates priority based on where in the tree a content object is located. Objects closer to the root tend to be more important, such as top-level pages on a website, which carry more weight in a user journey than a privacy policy page within the same website.
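The vocabulary above (root, parent, children, leaves, branch, distance from the root) can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the `ContentObject` class and the page names are hypothetical and do not mirror how any specific CMS stores its tree.

```python
class ContentObject:
    """A node in a content tree: one parent at most, any number of children."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent  # None only for the root
        self.children = []
        if parent is not None:
            parent.children.append(self)

    def is_leaf(self):
        """A leaf is a childless object."""
        return not self.children

    def depth(self):
        """Number of edges between this object and the root."""
        depth, node = 0, self
        while node.parent is not None:
            depth, node = depth + 1, node.parent
        return depth

    def branch(self):
        """Names of this object and all of its descendants, top-down."""
        names = [self.name]
        for child in self.children:
            names.extend(child.branch())
        return names


home = ContentObject("Home")  # the single root
blog = ContentObject("Blog", parent=home)
about = ContentObject("About", parent=home)
post_a = ContentObject("Post A", parent=blog)
post_b = ContentObject("Post B", parent=blog)

print(home.branch())     # ['Home', 'Blog', 'Post A', 'Post B', 'About']
print(post_a.depth())    # 2
print(about.is_leaf())   # True
```

Because each object stores exactly one parent, a loop cannot be expressed in this structure, which is where both the clarity and the rigidity of the tree come from.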

In comparison, a networked content structure, commonly known as a graph, has no hierarchy and can have any arrangement of nodes (or vertices) and edges (Open4Tech, 2019). All content in a graphed structure is equal, flat, and unordered in relation to one another (Barker, 2016). It can also loop and interconnect, because no rules dictate the connections among the nodes.
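For contrast, a graph of content objects can be sketched as a simple adjacency map. Again, this is a hedged illustration: the `ContentGraph` class, its method names, and the sample topics are assumptions of mine, and the links are treated as two-way purely for simplicity.

```python
from collections import defaultdict


class ContentGraph:
    """Content objects as equal nodes; any node may link to any other."""

    def __init__(self):
        self.links = defaultdict(set)

    def connect(self, a, b):
        """Add a two-way edge between two content objects."""
        self.links[a].add(b)
        self.links[b].add(a)

    def related(self, name):
        """Directly connected content objects, sorted for stable output."""
        return sorted(self.links[name])

    def reachable(self, start):
        """Everything that can be reached from `start`, loops included."""
        seen, stack = {start}, [start]
        while stack:
            node = stack.pop()
            for neighbour in self.links[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        return seen


graph = ContentGraph()
graph.connect("Trees", "Graphs")
graph.connect("Graphs", "Wikis")
graph.connect("Wikis", "Trees")       # closes a loop, impossible in a tree
graph.connect("Wikis", "Social media")

print(graph.related("Trees"))         # ['Graphs', 'Wikis']
print(sorted(graph.reachable("Social media")))
# ['Graphs', 'Social media', 'Trees', 'Wikis']
```

There is no root and no depth here, so nothing in the structure says which object matters more. The `seen` set in `reachable` is needed precisely because loops are allowed.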

Choosing the right content model

Now that we understand the behaviour of the two common content model patterns for websites, how do we decide which one better fits our use case?

This goes back to the job of a content strategist and their focus on purpose.

Considering the arguments from the first part of this paper, the critical decision as to which structure is a more suitable choice is heavily dependent on a thorough understanding of the reasons behind building (or redesigning) a website.

What is the company aiming to achieve with it? What is the “Why?” that made the CEO kick off the conversation? Bailie and Urbina (2013, p. 23) wrote that “the primary success factor is the content; the technology only supports it.” There are many reasons a company launches an expensive website project, from operational ones to personal or political ones. They are more or less valid, but they also tend to be tactical and short-sighted. A strategic reason is needed to be able to create a structure that supports the delivery of business-critical content that satisfies user needs and wins their patronage.

Ah, the users. Content strategy means “getting the right content, to the right people, in the right place, at the right time” (Halvorson, 2018). In line with this definition, the chosen data structure has to enable the internal and external audiences in their tasks. Website operators have to be able to present and aggregate content efficiently, appropriately, and in a timely manner. Website visitors have to be able to navigate, retrieve, and consume content.

Assuming that the content strategy legwork has been done, the next step is to consider the content and its form.

Hierarchical content modeling pattern – Use cases

As previously discussed, the tree structure and the graph structure have certain allowances and limitations that make them suitable for specific content types.

The tree content modeling pattern is recommended for websites and web applications that call for a hierarchical structure with clearly defined spatial relationships between content (Barker, 2016, p. 178). The tree content object structure is very intuitive because, in general, the concept of hierarchical organization is pervasive. This provides a sense of familiarity to users, who can easily and quickly understand the website and “develop a mental model of the site’s structure and their location within that structure” (Morville & Rosenfeld, 2007).

For example, a classic Microsoft SharePoint architecture is typically built using a hierarchical system of site collections and subsites, with inherited navigation, permissions, and site designs (Hendrickson, 2023). Microsoft SharePoint, primarily sold as a document management and storage system, can be used for enterprise content and document management or to create intranet sites with built-out pages, document libraries, and organisational charts.

The content tree (technically, “object tree”, as Deane Barker writes on page 104 of “Real World Content Modeling”) is a popular choice for your standard run-of-the-mill websites: one homepage (root) and a couple of top-level pages (pillar pages), one of them being a blog page that has multiple blog posts (children). Identity sites (most general corporate sites) and organisational chart sites (Lynch & Horton, 2016) rest on a hierarchical content modeling structure.

The SEO community will often reference the tree content modeling pattern as the recommended site structure model (Rakt referred to it as a “pyramid” on the Yoast blog, 2023), particularly for smaller websites that do not require intense crawl budget management (Google Search Central, 2024).

In the field of content management systems, Sitecore CMS organises its content in a hierarchical content tree (Sitecore, n.d.). I have to admit that I struggled to find more examples of CMS providers that natively offered the content tree in their shipped ready-to-download versions. Does it mean that the content tree is dead (de Metter, 2016)? Far from it.

I would rather say that its popularity has simply decreased as the needs of publishers evolved. The requirements for dynamic content modeling gave rise to content management systems that are highly customizable, more flexible, and extendable. They are also more complex and can handle more contextualised content aggregations. The tree content modeling pattern is, in its plain form, very limited. The familiar and intuitive structure is perfect until it isn’t. There is room for horizontal and vertical expansion within reason. However, if the volume of content objects gets bigger and the relationships between objects grow more complex, the content becomes trapped in a content management environment that cannot bring out its value.

The hierarchical organization of content can still be implemented in some parts of a CMS through additional tools, such as categories and tags, menus, or collections (Barker, 2024). However, the downside to this method is that these tools tend not to have built-in governance (Bailie & Urbina, 2013) that would standardise how website editors use them. If documentation or guidelines are not provided for the website admins, the probability of you having a tag cloud nightmare (Weller, 2008) on your hands very soon is dangerously high.
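One lightweight form of governance for tags is a controlled vocabulary that editor input is checked against. The Python sketch below is an assumption-laden illustration, not a feature of any named CMS: the `ALLOWED_TAGS` set, the normalisation rule, and the function names are all hypothetical.

```python
# Hypothetical controlled vocabulary agreed on by the content team.
ALLOWED_TAGS = {"content-modeling", "seo", "governance"}


def normalise_tag(raw):
    """Lower-case the tag, trim it, and join its words with hyphens."""
    return "-".join(raw.strip().lower().split())


def validate_tags(raw_tags):
    """Split editor input into accepted tags and rejected originals."""
    accepted, rejected = [], []
    for raw in raw_tags:
        tag = normalise_tag(raw)
        if tag in ALLOWED_TAGS:
            if tag not in accepted:  # drop duplicates such as "SEO" and "seo"
                accepted.append(tag)
        else:
            rejected.append(raw)
    return accepted, rejected


accepted, rejected = validate_tags(
    ["SEO", " Content  Modeling ", "seo", "Misc stuff"]
)
print(accepted)  # ['seo', 'content-modeling']
print(rejected)  # ['Misc stuff']
```

Rejected tags are returned rather than silently discarded, so an editor or admin can decide whether the vocabulary should grow or the tag should be corrected.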

In conclusion, the tried and tested content tree is the right choice for a domain of content whose objects are not expected to exist in complex relationships that would call for more flexible content aggregation options. Considering its usability-centric approach and how intuitive it is, every website project should first go through the process of ruling out the content tree structure before exploring more nuanced alternatives. Why change a winning team?

Networked content modeling pattern – Use cases

What about the networked content modeling pattern?

Where the content tree clearly defines how content objects relate to one another, the graphed structure has no such constraints.

As stated earlier, the entities existing within the graph can be connected in any possible way, which makes it possible to express diverse and dynamic relationships. Ordering by type, or graph content modeling, is especially relevant for systems that prioritize contextual relevance, as the content objects on their own are ultimately equal.

A graphed shape of content is capable of handling large amounts of data of one predominant type with its intricacies. Social media networks are built on this data structure. Six Degrees, the first online social media network (BBC World Service, 2019), which allowed users to connect with their real-world contacts by creating a profile within a database, used an image of a graph data structure on their website to visualise the context of their data before the idea of online social networking as we know it existed. Since 2000, the algorithms powering social media have become incredibly sophisticated and cloaked in mystery, but they still leverage the idea that forging new connections is valuable.

Wikis are another use case for graphed content models. A wiki is “a form of online hypertext publication that is collaboratively edited and managed by its own audience directly through a web browser” (Wikipedia, 2024b). This type of website would have no or very limited structure, rather allowing structure to emerge in the process of co-creating it. From the content strategy point of view, the purpose of a wiki is to manage and transfer knowledge in a specific domain that’s been comprehensively documented and structured to capture the breadth and depth of the topic. The networked content modeling pattern is highly scalable because its structure is not constrained by the number of relationships each content object could have.

More flexibility in expressing relationships between content objects is both an advantage and a disadvantage. This freedom can lead to complexity and ambiguity, especially for websites where clarity and instructions for how to engage with the content should be expressed in the structure. Additionally, graphed content modeling might become challenging if maintaining consistency has not been factored into the design of the website. Similarly, while wikis thrive on the collaborative nature of graphed content models, the lack of predefined structure can result in inconsistency and fragmentation of information, decreasing the value of the aggregated content.

With more freedom in content modeling comes more responsibility for clarifying the rules of engagement.

Conclusion

In this paper, I looked at the two common content modeling patterns – hierarchical and networked. Hierarchical content modeling, akin to a structured tree, offers clarity and intuitiveness, making it suitable for websites requiring well-defined relationships among content objects.

On the other hand, networked content modeling, characterized by a graph structure, provides unparalleled flexibility in representing relationships among content objects. This approach is well-suited for platforms emphasizing contextual relevance and diverse connections, as seen in social media networks and wikis. However, the freedom inherent in networked content modeling could backfire without proper governance.

Ultimately, the decision between hierarchical and networked content modeling hinges on a thorough understanding of the website’s purpose, audience needs, and content goals. Content strategists play a vital role in guiding this decision-making process.

As website development continues to evolve, content modeling remains a critical component in creating usable digital experiences. By selecting the appropriate content model structure and adhering to best practices in content strategy, organizations can effectively navigate the complexities of content management and provide content that adds value.

Further considerations

Back in the day, the limits of a CMS were clearly defined by the available technology. Nowadays, the content modeling choice is rarely either/or. Many resource-rich companies will expand their technology stack and own multiple platforms that use a mix of content models, both to respect platform-native properties and to prepare content for multimodal delivery. Further considerations about the use cases of the hierarchical and networked content modeling patterns should focus on how the two are mixed to support modern content delivery practices in the context of the content strategy chosen by the publisher.

Bibliography

AWS. (n.d.). Graph vs Relational Databases—Difference Between Databases—AWS. Amazon Web Services, Inc. Retrieved 13 February 2024, from https://aws.amazon.com/compare/the-difference-between-graph-and-relational-database/

Bailie, R. A., & Urbina, N. (2013). Content Strategy: Connecting the Dots Between Business, Brand, and Benefits. XML Press.

Barker, D. (2005). The Content Tree. https://deanebarker.net/tech/blog/content-tree/

Barker, D. (2016). Web Content Management: Systems, Features, and Best Practices (1st edition). O’Reilly Media.

Barker, D. (2019). Real World Content Modeling (1st edition). Self-published. https://deanebarker.net/books/real-world-content-modeling/

Barker, D. (2024). Intro to CM lecture five: Content Aggregation.

BBC World Service. (2019). BBC World Service—Witness History, Six Degrees—The first online social network. https://www.bbc.co.uk/programmes/w3csywv4

de Metter, P. (2016). The Content Tree is Dead. https://archive.24days.in/umbraco-cms/2016/the-content-tree-is-dead/

Google Search Central. (2024). Crawl Budget Management For Large Sites | Google Search Central | Documentation. Google for Developers. https://developers.google.com/search/docs/crawling-indexing/large-site-managing-crawl-budget

Halvorson, K. (2018). What Is Content Strategy? Connecting the Dots Between Disciplines – Brain Traffic blog. https://www.braintraffic.com/blog/what-is-content-strategy

Hendrickson, J. (2023, April 3). Introduction to SharePoint information architecture—SharePoint in Microsoft 365. https://learn.microsoft.com/en-us/sharepoint/information-architecture-modern-experience

Hill, J. (2022, July 1). Overcoming Content Ambiguity and Disorganized Data. Hill Web Creations: Digital Marketing, SEM, SEO. https://www.hillwebcreations.com/overcoming-content-ambiguity/

Juvonen, V. (2022, December 19). Information architecture guidance for SharePoint Online portals. https://learn.microsoft.com/en-us/sharepoint/dev/solution-guidance/portal-information-architecture

Krug, S. (2013). Don’t Make Me Think, Revisited: A Common Sense Approach to Web Usability (3rd edition). New Riders.

Lynch, P. J., & Horton, S. (2016). Web Style Guide, 4th Edition: Foundations of User Experience Design (Fourth edition). Yale University Press.

Microsoft Learn. (2016, October 20). Server and Site Architecture: Object Model Overview. https://learn.microsoft.com/en-us/previous-versions/office/developer/sharepoint-2010/ms473633(v=office.14)

Morville, P., & Rosenfeld, L. (2007). Information Architecture for the World Wide Web: Designing Large-Scale Web Sites (3rd edition). O’Reilly Media. https://www.amazon.com/Information-Architecture-World-Wide-Web/dp/0596527349/ref=sr_1_1

Norouzi, A. (2023, June 7). Unraveling the Secrets of Trees and Graphs: 3 Essential Algorithms Revealed. Armin Norouzi. https://arminnorouzi.github.io/posts/2023/06/blog-post-2/

Open4Tech. (2019, January 1). Trees vs. Graphs. Open4Tech. https://open4tech.com/trees-vs-graphs/

Powell, T. (2003). HTML & XHTML: The Complete Reference (4th edition). McGraw-Hill Osborne Media.

Rakt, M. van de. (2023, May 3). Site structure: The ultimate guide. Yoast. https://yoast.com/site-structure-the-ultimate-guide/

Rockley, A. (2016). Why You Need Two Types of Content Strategist. Content Marketing Institute. https://contentmarketinginstitute.com/articles/types-content-strategist/

Sitecore. (n.d.). Content tree architecture | Sitecore Documentation. Retrieved 14 February 2024, from https://doc.sitecore.com/xp/en/developers/91/sitecore-experience-commerce/content-tree-architecture.html

Staff, H. C. (2023, April 26). 6 Different Types of Database Management Systems. History-Computer. https://history-computer.com/different-types-of-database-management-systems/

Wachter-Boettcher, S. (2012). Content Everywhere: Strategy and Structure For Future-Ready Content (1st edition). Rosenfeld Media.

Weller, M. (2008). You are your (tag) cloud – The Ed Techie. The Ed Techie. https://blog.edtechie.net/weblogs/you-are-your-tag-cloud/

Wikipedia. (2023). Graph (abstract data type). In Wikipedia. https://en.wikipedia.org/w/index.php?title=Graph_(abstract_data_type)&oldid=1184379892

Wikipedia. (2024a). Tree (data structure). In Wikipedia. https://en.wikipedia.org/w/index.php?title=Tree_(data_structure)&oldid=1201036097

Wikipedia. (2024b). Wiki. In Wikipedia. https://en.wikipedia.org/w/index.php?title=Wiki&oldid=1203096248

The banner image was generated in Microsoft Copilot with the following prompt: “create an image of a tree in space with the sky being colourful and forming a graph-like constellation of stars”.