A taxonomy system is an information architecture tool for organizing any given set of content. The most famous taxonomy is the Linnaean taxonomy for classifying organisms. On websites, it usually takes the form of categories and tags that group pages according to topics, entities or other concepts. This system for organizing information helps users navigate and locate content. Categories are often general and hierarchical. Tags, on the other hand, are specific and have no hierarchical relationship to one another. A piece of content is typically assigned to one category but can have several tags.
The best taxonomies describe the content or products they organize accurately, are extensible and can accommodate new topics or products, and match the user’s mental model. When a website grows, it may put a strain on the taxonomy system in place. There may be too many categories or tags to be useful to readers, and there may be overlap and competition between taxonomies.
A well-managed taxonomy system can help your users navigate your website and also support your SEO strategy.
Why a good taxonomy is strategic for SEO
How taxonomy impacts SEO
A good taxonomy system helps Google understand the structure of your site and which topics you’re covering with your content. It can also help with content discovery and indexing.
But if they are not well managed, taxonomies can generate duplicate content, and archive pages can compete in the rankings for the same keywords as individual pages. With a good internal linking and taxonomy structure, you can tell Google which pages are most important, and which content search engines should index and which they shouldn’t.
On a more technical note, a poorly designed taxonomy can also deplete your crawl budget and impact the flow of PageRank through your internal linking.
A taxonomy system can also help you manage changes in your site. Your products can run out of stock, or they are replaced with new ones each season. New content is constantly replacing news and analysis that are now outdated. But your taxonomy, the structure that acts like a scaffolding supporting your site, remains relevant. Shoes are still shoes each season, and there is a new NBA playoff each year. The individual content pages may not be relevant anymore in search, but the structure supporting them is.
Creating an effective taxonomy for SEO
Depending on the purpose of a website, a taxonomy can be created from two different points of view:
- Based on the content topic. You can define the main topic the website will be covering, and the topics and subtopics that you will need to have in your main structure. For example, for a health information website, you could organize your content based on the different parts of the human body. So you’ll end up with a category for kidney diseases and another one for heart conditions.
- Based on the user. In this case, the content will match the user’s journey of the different user personas that visit your site. Using the same example, an alternative organization system for a health information site could be by symptoms. Or by age, for example clustering together health information for kids, or for seniors.
In each case, you’d need to identify first the purpose of the taxonomy (for example: “help users find a great restaurant close to their location”).
Types of taxonomy
Most websites use the double taxonomy system of categories and tags, which is a mix of hierarchical and flat taxonomies. But those are not the only types available.
- Hierarchical taxonomy: The concept described by the taxonomy is narrowed down as you go deeper into the hierarchy. For example: Vertebrates > Mammals. Most hierarchical taxonomies are simple, with a parent category and one or more child categories. In this case, the categories are mutually exclusive. A polyhierarchy occurs when a child category has more than one parent category. Polyhierarchies are common in e-commerce sites. For example, a video game console may be placed under video games or under electronics.
- Flat taxonomy: There are only top-level categories and all have the same weight. A tagging system falls into this type of taxonomy.
- Network taxonomy: Every category can be linked to any other category and the relationships between them can be hierarchical or semantic. It can be used to build contextual navigation, like the most viewed articles, recommended reading or a list of upsell products in e-commerce websites.
- Facet taxonomy: Every item has a set of associated categories that work as a set of attributes. For example, in a restaurant review site, you’d see attributes like price range, type of food, user rating, etc. E-commerce sites work the same way. A t-shirt will have attributes like color, sizes available, or fit.
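To make the facet idea more concrete, here is a minimal sketch of how items with attribute sets could be filtered. The `Item` class, the `filter_by_facets` helper and the sample catalog are all hypothetical and only illustrate the concept; a real store would usually delegate this to its database or search engine.

```python
from dataclasses import dataclass, field


@dataclass
class Item:
    """A catalog item described by facets (attribute name -> allowed values)."""
    name: str
    facets: dict = field(default_factory=dict)


def filter_by_facets(items, **wanted):
    """Return the items that carry every requested facet value."""
    return [
        item
        for item in items
        if all(value in item.facets.get(facet, set())
               for facet, value in wanted.items())
    ]


# Hypothetical sample data, echoing the t-shirt example above.
catalog = [
    Item("Plain tee", {"color": {"white", "black"}, "size": {"S", "M", "L"}, "fit": {"regular"}}),
    Item("Sport tee", {"color": {"black"}, "size": {"M", "L"}, "fit": {"slim"}}),
]

black_slim = filter_by_facets(catalog, color="black", fit="slim")
print([item.name for item in black_slim])
```

Unlike a hierarchy, no facet is "above" another here: the user can combine color, size and fit in any order.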
Information architecture for SEO
Site and click depth
Page authority decreases with each click away from the homepage. A page that can be reached from the homepage in 2-3 clicks will be judged by search engines as more important than a page that can be reached only when clicking through 4-6 pages.
When planning your taxonomy, it is important to take this into account and not go overboard with child categories or other elements that will put your content further away from your homepage.
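Click depth can be measured with a breadth-first search over your internal links. The sketch below assumes you already have a link map (for example, exported from a crawler); the `site` dictionary, its URLs and the depth limit of 2 are made-up values for illustration.

```python
from collections import deque


def click_depths(links, home):
    """Breadth-first search: minimum number of clicks from `home` to each page."""
    depths = {home: 0}
    queue = deque([home])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depths:  # first visit is the shortest path
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths


# Hypothetical internal link map: page -> pages it links to.
site = {
    "/": ["/sports/", "/health/"],
    "/sports/": ["/sports/nba/"],
    "/sports/nba/": ["/sports/nba/playoffs-recap"],
    "/health/": [],
}

depths = click_depths(site, "/")
# Pages deeper than an (assumed) limit of 2 clicks are candidates for review.
deep_pages = [url for url, depth in depths.items() if depth > 2]
print(deep_pages)
```

Pages that never appear in the result are not reachable from the homepage at all, which is usually worth checking as well.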
Topical authority
A well-designed taxonomy system helps users and robots navigate and understand your website structure and content. Search engine crawlers employ semantic analysis to understand concepts and map them together. Connected content under the same topic creates semantic density. A high concentration of related content inside a domain is a signal to search engines about the relevance of the site for that topic.
If a taxonomy system is not implemented correctly, competing and mixed categories will distribute and dissolve any topical authority you may have from the clustering of connected content. It’s important that your categories and tags are well defined and distinct from each other.
URL structure and breadcrumbs
There are two tools you can use to reinforce the structure of your site to Google and to signal the authority of the category page.
First is to reflect the site structure in the URL structure. It will make it clear to user and robot what the page is about. URL length is not a factor in search optimisation. As long as the category reflects the topic and is relevant, the category name should be included in the URL.
Breadcrumbs have two useful applications. First, they will link back from each individual piece of content to the category or sub-category where the content is located. This reinforces the authority of the category archive page. But they also can show up instead of the URLs in the SERP. This provides a visual clue to the user about the topic that content is about.
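One common way to describe a breadcrumb trail to search engines is schema.org `BreadcrumbList` structured data, embedded in the page as JSON-LD. This is an addition to the text above, not something the article prescribes. The sketch below builds that markup from a category path; the `breadcrumb_jsonld` helper and the example.com URLs are hypothetical.

```python
import json


def breadcrumb_jsonld(trail):
    """Build schema.org BreadcrumbList markup.

    `trail` is a list of (name, url) pairs, ordered from the homepage
    down to the current category or page.
    """
    return {
        "@context": "https://schema.org",
        "@type": "BreadcrumbList",
        "itemListElement": [
            {"@type": "ListItem", "position": position, "name": name, "item": url}
            for position, (name, url) in enumerate(trail, start=1)
        ],
    }


# Hypothetical trail whose URLs mirror the category hierarchy.
trail = [
    ("Home", "https://example.com/"),
    ("Europe", "https://example.com/europe/"),
    ("Spain", "https://example.com/europe/spain/"),
]

markup = breadcrumb_jsonld(trail)
# The JSON would go inside a <script type="application/ld+json"> element.
print(json.dumps(markup, indent=2))
```

Note how the URL of each level repeats the parent's path, so the URL structure and the breadcrumb trail tell the same story.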
Content optimization of category pages
Category archive pages as search landing pages
Category pages function as an archive for all the content that sits in that category. Depending on the type of website and the topics you cover, this archive can be more important for your SEO than individual content pages.
As a rule, the faster the content expires, the more important the category archives are for SEO. For example, the category page for a sports team may be more relevant to optimise for SEO than pages for each match of the season. In contrast, a health information website should focus on optimising the pages for specific disorders, like Crohn’s disease, instead of the category for gastrointestinal health.
In the first example, the sports team category page should function as a landing page: a point of entry for users that then directs them to the content they are looking for. The underlying strategy is that the category page will be optimised and will rank for the more general terms, like a sports team name, and the individual pages will rank for more specific terms.
Custom content
Each category page should have introductory content describing the topic of the category. It should be between 200 and 400 words long and include a few links to the best articles or products inside the category. When writing this text think about the context your users would need and what kind of search intent would bring them to that page. It’s recommended to add an image as well that may illustrate the topic.
Below this, you can show the list of articles inside that category, with an image, headline and a short summary of the content.
Title and meta description tags
Unique, well-crafted title and meta description tags help generate clicks from the search engine results page. Instead of writing for Googlebot, address the needs of the potential user.
Keyword strategy for category pages
Keyword research can be a useful tool to build your taxonomy system and organize your content, but it needs to be validated with research with actual users, to make sure the taxonomy fits their mental model.
Category pages should target broad, general keywords so that they don’t compete with individual pages for rankings. Individual pages should focus their optimisation on more specific terms inside that topic domain.
Common taxonomy SEO mistakes
Too many categories or tags
When a taxonomy system is unmanaged and authors can create new categories or tags, there’s going to be an explosion of taxonomy terms. Authors will use different variations of a term as a tag, and there will be no consistency in the categorization approach. Instagram is an example of a site where users have control over what tags to use and can create a seemingly infinite amount of them.
This makes it very difficult for users to reach the content they want and creates problems of content duplication and archives with very little content.
There are two possible solutions to this:
- Implement a top-down taxonomy system, where authors or users have to choose from a pre-defined list of categories and tags. At the category level, this is the approach news publishers have used since newspapers were born. It’s also the approach used by libraries. A piece of content belongs to either Politics or Economy. A book can be in crime but not in romance.
- The other option is to conduct a topic clustering exercise, where authors or users are given the freedom to tag as they wish, but these tags don’t create individual archive pages and are instead clustered into topics with other similar tags. Fan fiction site Archive of Our Own has a dedicated team of volunteers that cluster tags together. The Huffington Post, on the other hand, used semantic technology to clean up their tagging system. Their results: “More useful, authoritative tag/topic pages; Improved page crawling and rankings; Better content analysis and content retrieval.”
When authors choose tags, a pre-defined list is more difficult to enforce, as the number of possible tags within a topic domain is exceedingly large. The list can also become outdated really quickly.
Tagging guidelines or an autocomplete field with tagging suggestions is an alternate, softer implementation of this solution. For example, it could suggest the tag “Los Angeles Lakers”, but it can’t prevent the author from using “LA Lakers” or “Lakers” instead.
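A soft suggestion system like this can be sketched with a list of approved tags plus a table of known variants. Everything below (the `ALIASES` table, `canonical_tag`, `suggest` and the sample tags) is hypothetical and only illustrates the idea; it is not how any particular CMS implements autocomplete.

```python
# Hypothetical alias table: lowercase variant -> approved tag.
ALIASES = {
    "la lakers": "Los Angeles Lakers",
    "lakers": "Los Angeles Lakers",
    "los angeles lakers": "Los Angeles Lakers",
}

APPROVED_TAGS = {"Los Angeles Lakers", "Golden State Warriors"}


def canonical_tag(raw):
    """Map a typed tag to its approved form; unknown tags pass through unchanged."""
    key = " ".join(raw.lower().split())  # normalise case and whitespace
    return ALIASES.get(key, raw.strip())


def suggest(typed, approved=APPROVED_TAGS):
    """Return approved tags containing the typed text, for an autocomplete field."""
    needle = typed.lower().strip()
    if not needle:
        return []
    return sorted(tag for tag in approved if needle in tag.lower())


print(suggest("lak"))
print(canonical_tag("  LA   Lakers "))
```

Because unknown tags pass through unchanged, this stays a suggestion rather than a restriction, which matches the softer approach described above.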
AI-based tagging is becoming more accessible for a wider range of publishers. For example, ClassifAI is a WordPress plugin that automatically adds tags to content through cloud-based AI services like IBM Watson and Microsoft Azure.
Content duplication
Having too many categories or tags has a negative impact on user experience, as it makes it more difficult for users to find the content they are looking for. But it also has an SEO impact. Having similar content on the same topics in several archive pages prevents you from ranking a strong archive page on that topic. It also dilutes your topical authority.
You could also face a content duplication problem if search engines are indexing all of your tag archives. In this case, the same article could be part of the archive of several tags. It’s recommended to apply a noindex meta tag to instruct search engine bots to crawl, but not index, some of these archive pages.
For example, a sports site may decide to allow search engines to index their tag archives for Stephen Curry (an NBA superstar from the Golden State Warriors) while excluding from the index the archive page for Alfonzo McKinnie (a slightly less stellar player on the same team).
Thin content on archives
A wild proliferation of taxonomy terms will also inevitably lead to archive pages that have too little content to justify their existence. An archive page with very few articles will not be useful to the user and won’t send a strong topical authority signal to search engine bots.
The recommendation, in this case, is to see if the tag can be merged with another (“LA Lakers” and “Lakers” for example). If the tag is unique but still has very little content, then the recommendation is to apply a noindex meta tag to the archive page, to avoid competing with the individual articles.
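The merge-then-noindex decision can be sketched as a small audit script. The merge table, the threshold of five articles and the tag counts are all assumptions for illustration; pick a threshold that makes sense for your own site. Summing counts is also a simplification, since an article carrying both variants would be counted twice.

```python
from collections import Counter

# Hypothetical merge table: duplicate tag -> tag it should be merged into.
MERGE_INTO = {"LA Lakers": "Los Angeles Lakers", "Lakers": "Los Angeles Lakers"}
MIN_ARTICLES = 5  # assumed threshold below which an archive counts as thin


def plan_archives(tag_counts):
    """Merge duplicate tags, then mark each remaining archive index or noindex."""
    merged = Counter()
    for tag, count in tag_counts.items():
        merged[MERGE_INTO.get(tag, tag)] += count
    return {
        tag: "index" if count >= MIN_ARTICLES else "noindex"
        for tag, count in merged.items()
    }


def robots_meta(directive):
    """Render the robots meta tag; 'follow' keeps the links on the page crawlable."""
    return f'<meta name="robots" content="{directive}, follow">'


# Made-up article counts per tag.
plan = plan_archives({"LA Lakers": 2, "Lakers": 4, "Preseason friendlies": 1})
for tag, directive in plan.items():
    print(tag, "->", robots_meta(directive))
```

In this made-up data, neither Lakers variant would clear the threshold on its own, but the merged archive does.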
Other taxonomy mistakes
There are other common mistakes people can make when applying a taxonomy system to organize content which will have a negative SEO impact:
- Duplicated categories and tags: If a news site has a category for “Spain”, there’s no need for a “Spain” tag as well. This will create a poor user experience and both taxonomies will compete in search results against each other.
- Categories that should have a hierarchical relationship are flat instead. In this example, the “Spain” category should be a child node to the parent category “Europe”.
- Categories that should be tags and vice versa. As a rule, categories and subcategories classify content in general terms (“Politics”, “Sports”), while tags are the specific topics covered in the content element (“Democratic Primary”, “NBA 2019 Playoffs”).
- Ignoring the user. Use personas and conduct research with users to validate your taxonomy decisions. Aligning your taxonomy with the user’s mental model will create a better user experience and your site will be better positioned to answer their search intent.
- Each piece of content has its own place. Content can be accessed from multiple paths. That’s a best practice, as it allows you to align your navigation to different user personas. But each individual content element must have a single, unique URL. An e-commerce site, for example, could have the same shorts under Hiking, Camping, or Fitness. But the URL for the shorts should always be the same.
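The single-URL rule in the last point can be sketched as follows. The idea is to derive the product URL from the product itself rather than from the category the visitor browsed through. The `/products/` URL scheme and the helper names are hypothetical, and the `rel="canonical"` link element shown is one common way to declare the preferred URL, not something the article itself prescribes.

```python
def canonical_url(base, product_slug):
    """One URL per product, independent of the category path used to reach it."""
    return f"{base.rstrip('/')}/products/{product_slug}"


def canonical_link_tag(url):
    """Render the link element that declares a page's preferred URL."""
    return f'<link rel="canonical" href="{url}">'


# The same shorts listed under three hypothetical navigation paths.
browse_paths = ["/hiking/trail-shorts", "/camping/trail-shorts", "/fitness/trail-shorts"]

canonicals = {
    canonical_url("https://example.com/", path.rsplit("/", 1)[-1])
    for path in browse_paths
}
print(canonicals)
print(canonical_link_tag(next(iter(canonicals))))
```

All three navigation paths resolve to one URL, so links and ranking signals accumulate on a single page.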
Conclusion
Taxonomies are systems that allow us to organize and make sense out of a given set of content. This organization serves two main purposes:
- Make it easier for users to discover and access your content, and to use and navigate your site. A good taxonomy supports users in the completion of their goals.
- Make it easier for robots to discover and index your content. A good taxonomy will help search engines understand your site so that they can surface your content for relevant search queries.
A taxonomy system can support your SEO strategy by organizing content more efficiently, avoiding content duplication, and presenting a structure that can show topical authority for your content domain.
The benefit of good information architecture is often overlooked and represents a tremendous opportunity for publishers to better position themselves in search engines.