Following up on my previous post on slug field generation, Reddit user yen223 pointed out another approach to slug fields.
Here the slug appears in the URL in addition to the primary key: the slug is purely decorative, and the actual database lookup is still performed using the instance's ID. As luck would have it, a post on exactly this alternative approach was already in the post queue!
To elaborate on what yen223 was referring to, on Amazon all of the following links point to the same product:
- http://www.amazon.co.uk/dp/B004DER7HI/a-book
- http://www.amazon.co.uk/dp/B004DER7HI/some-text
- http://www.amazon.co.uk/dp/B004DER7HI/almost-any-text-in-fact
Stack Overflow takes the same approach with its /questions/123456/anything style of URLs.
Django implementation
An implementation for Django is really quite simple. As before, let's imagine we are building a simple blog application. First, we define a URL:
    from django.conf.urls import patterns, url

    urlpatterns = patterns('mysite.blog.views',
        # post_id drives the lookup; the slug is purely decorative
        url(r'^posts/(?P<post_id>\d+)/(?P<slug>[\w-]+)$', 'view', name='post'),
    )
This would give us URLs in the form of /posts/1234/slug-goes-here. Depending on personal taste, you may wish to experiment with URL schemes such as /posts/1234-slug-goes-here or even /posts/slug-goes-here-1234.
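As a rough, framework-free illustration of those alternatives, the sketch below uses Python's re module to show how each scheme could be parsed. Only the first pattern comes from the URLconf above; the other two patterns and the names SCHEMES and parse are assumptions made up for this example.

```python
import re

# Hypothetical patterns: only 'id/slug' mirrors the URLconf in this post.
# As in Django URLconfs, paths are matched without a leading slash.
SCHEMES = {
    'id/slug': r'^posts/(?P<post_id>\d+)/(?P<slug>[\w-]+)$',
    'id-slug': r'^posts/(?P<post_id>\d+)-(?P<slug>[\w-]+)$',
    'slug-id': r'^posts/(?P<slug>[\w-]+)-(?P<post_id>\d+)$',
}

def parse(scheme, path):
    """Return (post_id, slug) for a matching path, else None."""
    match = re.match(SCHEMES[scheme], path)
    if match is None:
        return None
    return int(match.group('post_id')), match.group('slug')
```

In the slug-first scheme the ID is whatever follows the final hyphen, so a slug that itself ends in digits still parses, although the result is arguably harder to read.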
Next, we add a method to dynamically generate the slug for each Post:
    from django.db import models
    from django.utils.text import slugify

    class Post(models.Model):
        title = models.CharField(max_length=100)
        content = models.TextField(blank=True)

        @models.permalink
        def get_absolute_url(self):
            # Reverse the URL from the primary key and the current slug
            return 'blog:post', (self.pk, self.slug())

        def slug(self):
            # Derived from the title on every call; never stored
            return slugify(self.title)
Finally, we just have to define our view:
    from django.shortcuts import redirect, render, get_object_or_404

    from .models import Post

    def view(request, post_id, slug):
        post = get_object_or_404(Post, pk=post_id)

        # Redirect if the slug does not match
        if slug != post.slug():
            return redirect(post, permanent=True)

        return render(request, 'blog/post.html', {
            'post': post,
        })
Note that we don't use the slug in the database lookup — we are simply using the post_id.
Also note that unlike Amazon we redirect to the correct slug if it does not match. This results in cleaner, more canonical URLs. For the same reason, this redirect is a permanent HTTP 301 redirect rather than a temporary one.
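The decision the view makes can be sketched without Django at all. In the following minimal example, POSTS, simple_slugify and resolve are hypothetical stand-ins for the database table, Django's slugify and the view; simple_slugify only approximates slugify for plain ASCII titles.

```python
import re

# Hypothetical in-memory stand-in for the Post table: id -> title.
POSTS = {1234: 'Slug goes here'}

def simple_slugify(title):
    """Rough approximation of slugify for ASCII titles (an assumption)."""
    cleaned = re.sub(r'[^\w\s-]', '', title).strip().lower()
    return re.sub(r'[-\s]+', '-', cleaned)

def resolve(post_id, slug):
    """Return (status, location); location is None unless redirecting."""
    title = POSTS.get(post_id)
    if title is None:
        return 404, None
    canonical = simple_slugify(title)
    if slug != canonical:
        # Any other slug gets a permanent redirect to the canonical URL
        return 301, '/posts/%d/%s' % (post_id, canonical)
    return 200, None
```

Only the ID decides whether the post exists; the slug merely decides whether the response is the page itself or a 301 to its canonical address.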
Trade-offs
Let's compare this method with the classic slug = SlugField(unique=True) approach:
Advantages
- If the slug changes, the old URLs will automatically redirect to the latest one. Whilst truly cool URIs never change, not providing a redirect when, for example, correcting a typo in a slug is definitely uncool. An equivalent feature for concrete slugs might require storing all "previous" slugs against their associated Post and additionally querying this table if the initial lookup failed... hardly very tidy.
- No need to store anything at all in your database. Whilst the storage requirements of a slug field itself might be relatively minor, the implementation of the unique constraint within your database can also use significant space and processing time when importing. The lack of a field not only means zero storage costs; it could even be argued that a slug is a presentation concern, so making it part of a Post's definition is an abstraction-level violation.
- The lack of a "slug" field additionally means no migrations are required, useful where minimising downtime is a priority.
Disadvantages
- You have less control over how people can link to your site: people can construct "clever" URLs (e.g. /posts/1234/tediously-overwritten-blog-post), which can invite bad publicity. As mentioned earlier, unlike Amazon's implementation we redirect to the canonical version, but any arbitrary incoming link would still resolve correctly before redirecting.
- It does not mask your primary key. One advantage of a concrete slug field is that it hides your primary key from your visitors. This is important if you wish to keep secret how many instances actually exist. For example, a savvy journalist could periodically register with your site and note their user ID; the difference between these values over time would approximate your growth rate, something that you may wish to keep private.
One way around this would be to introduce a concrete "lookup-only slug" that replaces the primary key part of the lookup, while keeping the dynamic slug based on the title. For example:
    from django.db import models
    from django.utils.crypto import get_random_string
    from django.utils.text import slugify

    class Post(models.Model):
        title = models.CharField(max_length=100)
        content = models.TextField(blank=True)

        # Random, user-visible identifier used instead of the primary key
        lookup = models.SlugField(
            unique=True,
            default=get_random_string,
            max_length=13,
        )

        @models.permalink
        def get_absolute_url(self):
            return 'blog:post', (self.lookup, self.slug())

        def slug(self):
            return slugify(self.title)
The Post.lookup field replaces Post.id as our user-visible value to look up a Post by, avoiding exposing our "secret" primary keys whilst keeping the flexibility of our dynamically generated slugs. The URL pattern and view would need to capture and query this value instead of the numeric ID, for example with get_object_or_404(Post, lookup=lookup). This might be an effective compromise as these randomly-generated lookup slugs are unlikely to ever change.
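For a feel of what such a lookup value involves, here is a minimal standard-library sketch. It is not Django's implementation; make_lookup, make_unique_lookup and ALPHABET are hypothetical names, and the set of existing tokens stands in for the database's unique constraint.

```python
import secrets
import string

# Assumed alphabet: letters and digits, all of which are valid in a slug.
ALPHABET = string.ascii_letters + string.digits

def make_lookup(length=12):
    """Return a random alphanumeric token (hypothetical helper)."""
    return ''.join(secrets.choice(ALPHABET) for _ in range(length))

def make_unique_lookup(existing, length=12):
    """Retry until the token is not already in `existing`."""
    while True:
        token = make_lookup(length)
        if token not in existing:
            return token
```

With 62 possible characters in each of 12 positions there are roughly 3 × 10^21 possible values, so the retry loop should almost never run twice; a field default on its own would not retry, and a collision would instead surface as a unique-constraint error on save.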
Migrating from primary-key based lookups
If you are currently using plain URLs in the form of /posts/123, you can easily migrate to this alternative slug method. First, modify your urls.py to accept an optional slug parameter. This will ensure your old URLs continue to resolve:
    urlpatterns = patterns('mysite.blog.views',
        # The non-capturing group makes the trailing /slug part optional
        url(r'^posts/(?P<post_id>\d+)(?:/(?P<slug>[\w-]+))?$', 'view', name='post'),
    )
Finally, adjust your views.py to reflect that the slug parameter can be missing and to perform the redirect:
    def view(request, post_id, slug=None):
        post = get_object_or_404(Post, pk=post_id)

        # Redirect if the slug does not match
        if slug != post.slug():
            return redirect(post, permanent=True)

        return render(request, 'blog/post.html', {
            'post': post,
        })
As the value of slug will be None when the user visits a plain /posts/123 URL, they will be automatically redirected to the post's new URL. Easy.
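To see why the old URLs keep resolving, the optional group can be tried directly with Python's re module. This is only a sketch of the regular expression's behaviour, not of Django's resolver; PATTERN and captured are names made up for the example.

```python
import re

# Same expression as the migrated URL pattern; the slug group is optional.
PATTERN = re.compile(r'^posts/(?P<post_id>\d+)(?:/(?P<slug>[\w-]+))?$')

def captured(path):
    """Return the named groups captured from the path, or None if no match."""
    match = PATTERN.match(path)
    return match.groupdict() if match else None
```

A plain posts/123 path matches with no slug captured, which is the case the slug=None default in the view covers, whereas posts/123/ with a trailing slash but no slug does not match at all.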
Summary
I hope that's given you another take on slug fields and has given you an alternative to consider next time you need one. Please let us know if you have any questions or if you have suggestions for future posts.