Scheduling Posts With Pelican

I use a static site generator, Pelican, to manage Webinista.com. Static site generators such as Pelican and Jekyll use text files as a data store instead of a database engine. Most accept Markdown or reStructuredText files as input and convert those files to HTML pages. You write your post, then run a command to generate your site (e.g. pelican -s productionconfig.py).

Static site generators free you from having to manage a database, so you can host your site pretty much anywhere. But one feature you lose, compared to a dynamic publishing tool such as WordPress, is the ability to schedule a post with your software. I can't tell Pelican "Please publish this post on July 22, 2018." Instead, I need to remember to run the pelican command on July 22.

Or, I can ask cron to do it for me. Here's my workaround.

Pelican requires input files to have some structured metadata at the beginning of the file. It's not quite the same as Jekyll's YAML front matter (although there's a plugin for that), but it does require each post to have title and date properties.

Title: A sample blog post
Date: 2019-01-01 00:00:00
Author: Tiffany
Status: published

Pelican also supports an optional Status property, which can be published or draft. What it does not support, however, is post-dating entries. If the status is published, Pelican will publish it, regardless of the date set. However, if the status is something other than published or draft, it won't generate the post at all. And that's what I use to my advantage.

To schedule a post, I set its status to scheduled. Then I run two scripts every day using cron. The first script loops through all of the Markdown files in my posts directory and updates the status from scheduled to published if the post's date matches today's date. This script runs every morning at 7am. Why the third status? Because it's unambiguous: publish this post on this date. If I used draft instead, I'd run the risk of publishing a future-dated but unfinished post.
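A status-flipping script along those lines might look like the following sketch. The POSTS_DIR path and the publish_if_due function name are assumptions made for illustration, and the code assumes the Date field starts with a YYYY-MM-DD date, as in the sample metadata above.

```python
import re
from datetime import date, datetime
from pathlib import Path

# Hypothetical location of the Markdown sources; adjust to your own layout.
POSTS_DIR = Path.home() / "blog" / "content"

DATE_RE = re.compile(r"^Date:\s*(\d{4}-\d{2}-\d{2})", re.MULTILINE | re.IGNORECASE)
STATUS_RE = re.compile(r"^Status:\s*scheduled[ \t]*$", re.MULTILINE | re.IGNORECASE)


def publish_if_due(text, today):
    """Return text with 'Status: scheduled' changed to 'published' if Date is today."""
    # Only inspect the metadata header: everything before the first blank line.
    header, sep, body = text.partition("\n\n")
    match = DATE_RE.search(header)
    if match is None or not STATUS_RE.search(header):
        return text
    post_date = datetime.strptime(match.group(1), "%Y-%m-%d").date()
    if post_date != today:
        return text
    header = STATUS_RE.sub("Status: published", header, count=1)
    return header + sep + body


def main():
    if not POSTS_DIR.is_dir():
        return
    today = date.today()
    for path in sorted(POSTS_DIR.glob("**/*.md")):
        original = path.read_text(encoding="utf-8")
        updated = publish_if_due(original, today)
        if updated != original:
            path.write_text(updated, encoding="utf-8")
            print(f"Published {path.name}")


if __name__ == "__main__":
    main()
```

Because the comparison is an exact match on today's date, a post whose date passes while the machine is off stays scheduled. Changing the check to `post_date > today` would publish any overdue post on the next run instead.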

I've also scheduled a second script to run at 9am. That script activates the virtual environment, runs the pelican command, and then deactivates the virtual environment. This all happens on my laptop, a MacBook Pro with the last great Apple keyboard. You do need to ensure that your laptop is on and awake at those times so that the cron jobs execute.
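Here is one way the build step could be sketched in Python. It swaps in a different technique from the activate/deactivate approach described above: it calls the pelican executable inside the virtual environment's bin directory directly, which uses that environment's packages without an activation step. The paths, script names, and crontab lines in the comments are assumptions, not my actual setup.

```python
import subprocess
from pathlib import Path

# Example crontab entries (hypothetical script names and interpreter path),
# matching the 7am and 9am runs:
#   0 7 * * * /usr/bin/python3 "$HOME/blog/publish_scheduled.py"
#   0 9 * * * /usr/bin/python3 "$HOME/blog/build_site.py"

# Hypothetical paths; point these at your own checkout and virtual environment.
SITE_DIR = Path.home() / "blog"
VENV_DIR = SITE_DIR / "venv"
SETTINGS = "productionconfig.py"


def pelican_command(venv_dir, settings):
    """Build the argument list for running Pelican from a virtual environment."""
    # The executable in the venv's bin directory runs with that environment's
    # interpreter and packages, so nothing needs to be activated first.
    return [str(Path(venv_dir) / "bin" / "pelican"), "-s", settings]


def build_site():
    # check=True raises CalledProcessError if Pelican exits with an error.
    subprocess.run(pelican_command(VENV_DIR, SETTINGS), cwd=SITE_DIR, check=True)


if __name__ == "__main__":
    if (VENV_DIR / "bin" / "pelican").exists():
        build_site()
    else:
        print("pelican executable not found; check VENV_DIR")
```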

The last piece of the puzzle is a Pelican plugin. It's just a few lines long.

```python
# -*- coding: utf-8 -*-
"""
Webinista_Publish Plugin for Pelican
====================================

Push to S3 when complete.
"""
from subprocess import call
from pelican import signals

def webinista_publish(pelican_object):
    # The finalized signal passes the Pelican instance; it is unused here.
    # Sync with my S3 bucket, and delete files from S3 that are gone from the source.
    # --exclude patterns are evaluated relative to the source directory.
    cmd = 'aws s3 sync ~/blog/output/ s3://webinista.com --cache-control max-age=604800 --exclude "updates/drafts/*" --delete --profile=webinista'
    call(cmd, shell=True)

def register():
    # Run the sync after Pelican has finished writing the site.
    signals.finalized.connect(webinista_publish)
```

Pelican runs this plugin right after it finishes generating HTML pages. The plugin executes an aws-cli command that syncs the generated files from my laptop to the S3 bucket that hosts the site. You could just as easily change it to publish to an Azure or Netlify destination.

I imagine that you could do something similar with other static site generators. The key here is to use an external script to manage publication dates, and combine it with cron to publish and deploy the site.
