Scheduling Posts With Pelican
As of Mojave, macOS no longer allows new cron jobs. The workaround is to give `cron` full disk access, but that is also a massive security risk. I haven't fully examined alternatives yet, but I think Automator and some AppleScript could probably do the job here.
I use a static site generator, Pelican, to manage Webinista.com. Static site generators such as Pelican and Jekyll use text files as a data store instead of a database engine. Most accept Markdown or reStructuredText files as input and convert those files to HTML pages. You write your post, then run a command to generate your site (e.g. `pelican -s productionconfig.py`).
Static site generators free you from having to manage a database, so you can host your site pretty much anywhere. But one of the features you lose, compared to a dynamic publishing tool such as WordPress, is the ability to schedule a post with your software. I can't tell Pelican "Please post this post on July 22, 2018." Instead, I need to remember to run the `pelican` command on July 22.
Or, I can ask `cron` to do it for me. Here's my workaround.
Pelican requires input files to have some structured metadata at the beginning of the file. It's not quite the same as Jekyll's YAML front matter (although there's a plugin for that), but it does require each post to have metadata fields such as a title and a date.
Title: A sample blog post
Date: 2019-01-01 00:00:00
Author: Tiffany
Status: published
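As a rough illustration of that metadata format, here is a minimal Python sketch that reads the leading `Key: value` lines of a post into a dictionary. The `parse_metadata` helper is hypothetical and is not how Pelican itself reads files; it assumes the metadata block ends at the first blank line.

```python
from datetime import datetime


def parse_metadata(text):
    """Return the leading 'Key: value' lines of a post as a dict.

    Keys are lowercased. Parsing stops at the first line that is blank
    or does not look like 'Key: value'.
    """
    metadata = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep or not key.strip():
            break
        metadata[key.strip().lower()] = value.strip()
    return metadata


sample = """Title: A sample blog post
Date: 2019-01-01 00:00:00
Author: Tiffany
Status: published

The body of the post starts here.
"""

meta = parse_metadata(sample)
# Assumption: dates use the "YYYY-MM-DD HH:MM:SS" form shown in the sample.
post_date = datetime.strptime(meta["date"], "%Y-%m-%d %H:%M:%S")
print(meta["title"], post_date.date(), meta["status"])
```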
Pelican also supports an optional `Status` property, which can be `published` or `draft`. What it does not support, however, is post-dating entries. If the status is `published`, Pelican will publish the post regardless of the date set. However, if the status is something other than `published` or `draft`, Pelican won't generate the post at all. And that's what I use to my advantage.
To schedule a post, I set its status to `scheduled`. Then I run two scripts every day using `cron`. The first script loops through all of the Markdown files in my `posts` directory and updates the status from `scheduled` to `published` if the post's date matches today's date. This script runs every morning at 7am. Why the third status? Because it's unambiguous: publish this post on this date. By using `draft`, I'd run the risk of publishing a future-dated but unfinished post.
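The status-flipping script isn't shown here, so what follows is only a minimal sketch of how such a script could work, not the actual one. The function names, the `*.md` glob, and the date format are assumptions; it also assumes the metadata block ends at the first blank line.

```python
import re
from datetime import date, datetime
from pathlib import Path

# Assumption: dates look like "Date: 2019-01-01 00:00:00" or "Date: 2019-01-01".
DATE_RE = re.compile(r"^Date:[ \t]*(\d{4}-\d{2}-\d{2})", re.MULTILINE | re.IGNORECASE)
STATUS_RE = re.compile(r"^Status:[ \t]*scheduled[ \t]*$", re.MULTILINE | re.IGNORECASE)


def publish_if_due(text, today):
    """Return text with 'Status: scheduled' changed to 'Status: published'
    when the post's Date matches today; otherwise return text unchanged."""
    header = text.split("\n\n", 1)[0]  # metadata ends at the first blank line
    date_match = DATE_RE.search(header)
    if not date_match or not STATUS_RE.search(header):
        return text
    post_day = datetime.strptime(date_match.group(1), "%Y-%m-%d").date()
    if post_day != today:
        return text
    new_header = STATUS_RE.sub("Status: published", header, count=1)
    return new_header + text[len(header):]


def publish_due_posts(posts_dir, today=None):
    """Update every due *.md file under posts_dir; return the changed paths."""
    today = today or date.today()
    changed = []
    for path in sorted(Path(posts_dir).expanduser().rglob("*.md")):
        original = path.read_text(encoding="utf-8")
        updated = publish_if_due(original, today)
        if updated != original:
            path.write_text(updated, encoding="utf-8")
            changed.append(path)
    return changed

# Hypothetical usage: publish_due_posts("~/blog/content/posts")
```

Because the comparison is an exact match on today's date, a post whose day is missed (for example, because the laptop was off) stays `scheduled`. Comparing with `<=` instead would catch up on missed posts, at the cost of departing from the exact-date rule described above.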
I've also scheduled a second script to run at 9am. That script activates the virtual environment, runs the `pelican` command, and then deactivates the virtual environment. This all happens on my laptop, a MacBook Pro with the last great Apple keyboard. You do need to ensure that your laptop is on and running so that the `cron` job executes.
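The build script isn't shown either. As a sketch of one way to do the same job in Python, the example below swaps the activate/run/deactivate sequence for a different technique: it calls the `pelican` executable inside the virtual environment directly, which uses that environment's packages without an activation step. The paths, function names, and crontab entries are all hypothetical.

```python
import subprocess
from pathlib import Path

# Hypothetical crontab entries matching the 7am and 9am schedule:
#   0 7 * * * /usr/bin/python3 $HOME/blog/publish_scheduled.py
#   0 9 * * * /usr/bin/python3 $HOME/blog/build_site.py


def pelican_command(venv_dir, settings_file):
    """Build the argument list for the virtual environment's own pelican.

    Calling <venv>/bin/pelican directly uses that environment's packages,
    so there is no separate activate/deactivate step.
    """
    pelican_bin = Path(venv_dir).expanduser() / "bin" / "pelican"
    return [str(pelican_bin), "-s", str(settings_file)]


def build_site(site_dir, venv_dir, settings_file="productionconfig.py"):
    """Run pelican inside site_dir; raise CalledProcessError if it fails."""
    cmd = pelican_command(venv_dir, settings_file)
    subprocess.run(cmd, cwd=Path(site_dir).expanduser(), check=True)

# Hypothetical usage: build_site("~/blog", "~/blog/venv")
```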
The last piece of the puzzle is a Pelican plugin. It's just a few lines long.
```
# -*- coding: utf-8 -*-
"""
Webinista_Publish Plugin for Pelican
====================================
Push to S3 when complete.
"""
from subprocess import call

from pelican import signals


def webinista_publish(generator):
    # Sync with my S3 bucket; --delete removes files from S3 that are gone from the source.
    # --exclude patterns are evaluated relative to the source directory,
    # so the drafts folder is given as a relative pattern.
    cmd = (
        'aws s3 sync ~/blog/output/ s3://webinista.com '
        '--cache-control max-age=604800 '
        '--exclude "updates/drafts/*" --delete --profile=webinista'
    )
    call(cmd, shell=True)


def register():
    # Run the sync once Pelican has finished generating the site.
    signals.finalized.connect(webinista_publish)
```
Pelican runs this plugin right after it finishes generating HTML pages. The plugin executes an `aws-cli` command that syncs the generated files from my laptop to the S3 bucket that hosts the site. You could just as easily change it to publish to an Azure or Netlify destination.
I imagine that you could do something similar with other static site generators. The key here is to use an external script to manage publication dates, and combine it with `cron` to publish and deploy the site.