Slug fields: An alternative approach

Chris Lamb, April 20th 2014

Following up on my previous post on slug field generation, Reddit user yen223 pointed out another approach to slug fields.

This is where the slug appears in the URL in addition to the primary key — the slug is purely decoration and the actual database lookup is still performed using the instance's ID. As luck would have it, a post on exactly this alternative approach was already in the post queue!

To elaborate more on what yen223 was referring to, on Amazon all of the following links point to the same product:

This approach is taken by StackOverflow with their /questions/123456/anything style of URLs.

Django implementation

An implementation for Django is really quite simple. As before, let's imagine we are building a simple blog application. First, we define a URL:

from django.conf.urls import patterns, url

urlpatterns = patterns('mysite.blog.views',
    url(r'^posts/(?P<post_id>\d+)/(?P<slug>[\w-]+)$', 'view',
        name='post'),
)

This would give us URLs in the form of /posts/1234/slug-goes-here. Depending on personal taste, you may wish to experiment with URL schemes such as /posts/1234-slug-goes-here or even /posts/slug-goes-here-1234.
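To compare the schemes mentioned above, here is a rough sketch using only Python's re module rather than a Django URLconf. The SCHEMES names and the parse helper are made up for illustration; only the first pattern comes from the urls.py above.

```python
import re

# Three hypothetical URL schemes; each captures the numeric ID and the slug.
SCHEMES = {
    "id/slug": re.compile(r"^posts/(?P<post_id>\d+)/(?P<slug>[\w-]+)$"),
    "id-slug": re.compile(r"^posts/(?P<post_id>\d+)-(?P<slug>[\w-]+)$"),
    "slug-id": re.compile(r"^posts/(?P<slug>[\w-]+)-(?P<post_id>\d+)$"),
}


def parse(scheme, path):
    """Return the captured groups for a path, or None if it does not match."""
    match = SCHEMES[scheme].match(path)
    return match.groupdict() if match else None
```

In the slug-first scheme the ID is whatever run of digits follows the final hyphen, so a slug that itself ends in a number (say top-10) still parses unambiguously.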

Next, we add a method to dynamically generate the slug for each Post:

from django.db import models
from django.utils.text import slugify

class Post(models.Model):
    title = models.CharField(max_length=100)
    content = models.TextField(blank=True)

    @models.permalink
    def get_absolute_url(self):
        return 'blog:post', (self.pk, self.slug())

    def slug(self):
        return slugify(self.title)

Finally, we just have to define our view:

from django.shortcuts import redirect, render, get_object_or_404

from .models import Post

def view(request, post_id, slug):
    post = get_object_or_404(Post, pk=post_id)

    # Redirect if the slug does not match
    if slug != post.slug():
        return redirect(post, permanent=True)

    return render(request, 'blog/post.html', {
        'post': post,
    })

Note that we don't use the slug in the database lookup — we are simply using the post_id.

Also note that unlike Amazon we redirect to the correct slug if it does not match. This results in cleaner, more canonical URLs. For the same reason, this redirect is a permanent HTTP 301 redirect rather than a temporary one.
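The decision the view makes can be sketched without Django at all. Everything below is a stand-in for illustration: POSTS is a hypothetical in-memory table, simple_slugify only approximates django.utils.text.slugify, and resolve returns a status code rather than a real HTTP response.

```python
import re
import unicodedata


def simple_slugify(value):
    """Rough, ASCII-only approximation of a slugify function."""
    value = unicodedata.normalize("NFKD", value)
    value = value.encode("ascii", "ignore").decode("ascii")
    value = re.sub(r"[^\w\s-]", "", value).strip().lower()
    return re.sub(r"[-\s]+", "-", value)


# Hypothetical in-memory "table" of post titles, keyed by primary key.
POSTS = {1234: "Slug fields: An alternative approach"}


def resolve(post_id, slug):
    """Return (status, payload) for a request to /posts/<post_id>/<slug>."""
    title = POSTS.get(int(post_id))
    if title is None:
        return 404, None  # the lookup uses the ID only
    expected = simple_slugify(title)
    if slug != expected:
        # Wrong or outdated slug: permanent redirect to the canonical URL
        return 301, "/posts/%d/%s" % (int(post_id), expected)
    return 200, title
```

Note that the slug never takes part in the lookup itself; it is only compared afterwards to decide between serving the page and redirecting.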

Trade-offs

Let's compare this method with the classic slug = SlugField(unique=True) approach:

Advantages

  • If the slug changes, the old URLs will automatically redirect to the latest one. Whilst truly cool URIs never change, not providing a redirect when, for example, correcting a typo in a slug is definitely uncool. An equivalent feature for concrete slugs might require storing all "previous" slugs against their associated Post and additionally looking up against this table if the initial lookup failed... hardly very tidy.
  • No need to store anything at all in your database. Whilst the storage requirements of a slug field itself might be relatively minor, the implementation of the unique constraint within your database can also use significant space, as well as processing time when importing data. The lack of a field not only means zero storage costs, but it could even be argued that a slug is a presentation concern, so making it part of a Post's definition is an abstraction-level violation.
  • The lack of a "slug" field additionally means no migrations are required, useful where minimising downtime is a priority.
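
For comparison, the "previous slugs" bookkeeping that the first advantage avoids might look roughly like the sketch below. The three dictionaries are hypothetical stand-ins for database tables, not part of the implementation in this post.

```python
# Hypothetical in-memory stand-ins for database tables.
current_slugs = {"hello-world": 1}   # unique slug -> post ID
previous_slugs = {"helo-world": 1}   # retired slug -> post ID
canonical = {1: "hello-world"}       # post ID -> current slug


def lookup(slug):
    """Return (status, slug to serve or redirect to)."""
    if slug in current_slugs:
        return 200, slug
    # Second lookup, needed only because slugs are stored concretely
    post_id = previous_slugs.get(slug)
    if post_id is not None:
        return 301, canonical[post_id]
    return 404, None
```

Every slug edit has to add a row to the second table, which is the untidiness the dynamic approach sidesteps.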

Disadvantages

  • You have less control over how people can link to your site: people can construct "clever" URLs (e.g. /posts/1234/tediously-overwritten-blog-post) which can invite bad publicity. As mentioned earlier, unlike Amazon's implementation we redirect to the canonical version, but any arbitrary incoming link would still correctly resolve before redirecting.

  • Does not mask your primary key. One advantage of a concrete slug field is that it hides your primary key from your visitors. This is important if you wish to keep secret how many instances actually exist. For example, a savvy journalist could periodically register with your site and note their user ID: the difference between these values would be an approximation of your growth rate, something that you may wish to keep private.

    One way around this would be to introduce a concrete "lookup-only slug" that replaces the primary key part of the lookup, whilst keeping our dynamic slug based on the title. For example:

from django.db import models
from django.utils.crypto import get_random_string
from django.utils.text import slugify

class Post(models.Model):
    title = models.CharField(max_length=100)
    content = models.TextField(blank=True)

    lookup = models.SlugField(
        unique=True,
        default=get_random_string,
        max_length=13,
    )

    @models.permalink
    def get_absolute_url(self):
        return 'blog:post', (self.lookup, self.slug())

    def slug(self):
        return slugify(self.title)

The Post.lookup field replaces Post.id as our user-visible value to look up a Post by, avoiding exposing our "secret" primary keys whilst keeping the flexibility of our dynamically generated slugs. This might be an effective compromise as these randomly-generated lookup slugs are unlikely to ever change.
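
One consequence is that the URL pattern can no longer insist on digits for the first component. As a minimal sketch, assuming the lookup values contain only letters and digits: random_lookup below is a standard-library stand-in for Django's get_random_string, and LOOKUP_URL is a hypothetical replacement pattern, not one taken from this post.

```python
import re
import secrets
import string

ALPHABET = string.ascii_letters + string.digits


def random_lookup(length=12):
    """Stand-in for get_random_string: random letters and digits."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


# Hypothetical pattern: the numeric post_id group becomes an alphanumeric one.
LOOKUP_URL = re.compile(r"^posts/(?P<lookup>[A-Za-z0-9]+)/(?P<slug>[\w-]+)$")

token = random_lookup()
match = LOOKUP_URL.match("posts/%s/slug-goes-here" % token)
```

The view would then fetch the Post by its lookup field instead of by pk; the slug comparison and redirect stay the same.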

Migrating from primary-key based lookups

If you are currently using plain URLs in the form of /posts/123, you can easily migrate to this alternative slug method. First, modify your urls.py to accept an optional slug parameter. This will ensure your old URLs will continue to resolve:

urlpatterns = patterns('mysite.blog.views',
    url(r'^posts/(?P<post_id>\d+)(?:/(?P<slug>[\w-]+))?$', 'view',
        name='post'),
)

Finally, adjust your views.py to reflect that the slug parameter can be missing and to perform the redirect:

def view(request, post_id, slug=None):
    post = get_object_or_404(Post, pk=post_id)

    # Redirect if the slug does not match
    if slug != post.slug():
        return redirect(post, permanent=True)

    return render(request, 'blog/post.html', {
        'post': post,
    })

As the value of slug will be None when the user visits a plain /posts/123 URL, it can never equal the generated slug, so they will be automatically redirected to the post's new URL. Easy.
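
To see what the optional group captures, the pattern can be exercised with Python's re module directly. This is only a sketch of the regular expression's behaviour; the captured helper is made up for illustration and says nothing about how Django passes the values on to the view.

```python
import re

# The migration pattern from urls.py, with the slug part made optional.
PATTERN = re.compile(r"^posts/(?P<post_id>\d+)(?:/(?P<slug>[\w-]+))?$")


def captured(path):
    """Return the named groups for a path, or None if it does not match."""
    match = PATTERN.match(path)
    return match.groupdict() if match else None
```

An old-style path such as posts/123 matches with the slug group unset, whilst posts/123/my-title captures both parts.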

Summary

I hope that's given you another take on slug fields and an alternative to consider next time you need one. Please let us know if you have any questions or if you have suggestions for future posts.

