Building the site (part 2)

November 14, 2022 11:29

Part 2: Textpattern and Setting up the Site

Well, I did say there’d be a next time… and here it is just a decade later. In the intervening time a fair amount of what I went over in Part 1 has been rendered obsolete. On the other hand, what I had so long ago intended for this post to be about is mostly the same (since it’s partly about what I did). Indeed the process I use has hardly changed either. I now likely have a topic for another ‘site-building’ post, detailing some of the recent changes to modernize the layout a tiny bit, and to change the photo gallery so I can put up some more photo sets.

Briefly, I’ll go over what is no longer relevant from the first part. As already mentioned there, I switched from using categories as post labels to tags, since they seem more flexible. That’s still how I mark the posts. For the topic index page, I was able to ditch my custom page_lookup Liquid tag by switching to Jekyll’s Collections. A Collection in Jekyll is basically a set of related content that has documents stored in one directory. Previously I was storing all of my ‘topic’ descriptions in one directory as well, so it was hardly any trouble to just move them to the _collections directory, and then adjust Jekyll’s config file to produce actual pages for the collection. Jekyll then automatically makes the YAML front matter from each collection file available when iterating through the collection’s items.

Now, for what this post was ostensibly about, the migration from Textpattern. Jekyll has migrators available for a fair number of blog engines, although I’m not sure how many of those engines are still in use. (Textpattern CMS is still going strong, apparently.) It would seem that simply running the Textpattern importer was all that was needed. The importer indeed worked fine, but it only pulled out the content of the posts themselves. I also had several photos that were associated with the posts, and the image tags wouldn’t get processed. The slightly tricky part, which made it more than a matter of just replacing the tags, was that Textpattern had multiple types of tags. I had regular image tags, ‘article’ image tags, and also ‘thumbnail’ variations; additionally, images could be indicated by either name or database id. At the time, I modified the Textpattern migrator to also process the image tags and simply turn them into HTML image tags in the imported articles. I also copied only the image files that were needed to a new directory. It may be that the importer now handles images without any issues; I haven’t really checked up on it.
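To give a rough idea of the kind of tag rewriting involved, here is a minimal sketch in shell. It is not the change I made to the migrator (that lived inside the importer itself); it only handles the name-based forms of the image and thumbnail tags, and leaves id-based tags untouched since those need a database lookup. The function name and the IMG_PREFIX and THUMB_PREFIX locations are assumptions made up for the illustration.

```shell
#!/bin/sh
# Hypothetical helper: rewrite name-based Textpattern image tags read from
# stdin into plain HTML <img> tags on stdout.
# IMG_PREFIX and THUMB_PREFIX are assumed locations for the copied files.
IMG_PREFIX="${IMG_PREFIX:-/images}"
THUMB_PREFIX="${THUMB_PREFIX:-/images/thumbs}"

convert_txp_images() {
    # Each expression captures the value of name="..." and reuses it as
    # the file name in the generated src attribute.
    sed -E \
        -e 's|<txp:image name="([^"]+)"[^>]*/>|<img src="'"$IMG_PREFIX"'/\1" alt="" />|g' \
        -e 's|<txp:thumbnail name="([^"]+)"[^>]*/>|<img src="'"$THUMB_PREFIX"'/\1" alt="" />|g'
}
```

It would be used as a filter over an imported article, for example convert_txp_images < old-post.textile > new-post.textile, where both file names are placeholders.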

One legacy of using Textpattern is that all my posts were formatted using Textile. In the course of time, Markdown has seemingly won out almost everywhere, and Jekyll itself expects Markdown to be used (although the Textile Converter plug-in is available). I still use Textile, using that converter. While I do prefer Markdown’s link syntax and a handful of features, the one thing that it’s missing that Textile supports is easy use of italic tags alongside em. Markdown avoided ‘confusion’ over the use of * and _ by making them identical, with doubling being the distinguishing feature, to handle em and strong HTML tags respectively. In Textile these symbols are sensibly distinct, and the doubled version is used to indicate whether italic should be used instead of em, making it easier to have both in your text. If you are wondering why I want that, it’s because I’m often using algebraic variables or terms that should be using italic. In truth, I rarely use em, but it is good to have an easy way to use both.

Something that I did give up when switching from Textpattern was the ability to create and edit posts directly via the site itself. When I was traveling and had very limited internet access, that was invaluable. Jekyll doesn’t do that on its own, since it does not ‘run’ while the site is active, it just generates the pages. That’s more of an advantage for me now, since I use Nearly Free Speech to host the site, and it’s very inexpensive if the site is static with no databases. Now technically, since Jekyll could process the pages and make a new site quickly, I could set things up to just use a text editor, put the file on the site, and regenerate it on-the-fly. In practice, since I use my laptop to write the posts now anyway, I don’t really have a need for that. In fact, I don’t even run Jekyll on the site’s server.

Initially I did edit files on my side, uploaded them, and then ran Jekyll to build it on the site. However, I wanted something that didn’t require as much interaction, since at the time I would have to log in and build it manually. I could have set up a script that would trigger when the files were uploaded and then build the site. The problem was, what happened when there was an error and the site was not rebuilt? I needed to not just be sure that the file was uploaded, but also get feedback on the Jekyll error, and then run through the process again until it worked correctly.

It was tedious to edit, upload, and then have to check to see if there were any errors reported from the generation process. And as often as not I also wanted to test the site locally before making some change, to see how it looked before making it live. While a more complicated script (or some other system on top of Jekyll) might handle that, I wanted it to remain simple. In the end, it was easier to generate the entire site locally, and only upload the final product. I use git to keep them in sync, and to ensure a stable version if I need to revert some changes (just in case, though I’ve never had to actually use this capability). It lets me do all the editing and testing on my own machine at my leisure, and also limit server usage to only keeping the files of the actual site itself. In theory, it would also work just as well if I stopped using Jekyll, as long as whatever I used created a complete site.
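The local routine described above could be sketched roughly as follows. This is an illustration rather than my actual script: the run and publish helpers, the DRY_RUN switch, and the REMOTE and BRANCH defaults are all names assumed for the example, and depending on the setup the build command might need to be bundle exec jekyll build instead.

```shell
#!/bin/sh
# Sketch of a local build-and-publish routine (all names are assumptions).
REMOTE="${REMOTE:-origin}"   # hypothetical name of the remote on the host
BRANCH="${BRANCH:-master}"   # branch that gets pushed

# run: print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# publish: build locally, commit everything (including the generated
# _site directory), and push. Each step runs only if the previous one
# succeeded, so a failed build is never committed or pushed.
publish() {
    run jekyll build &&
    run git add -A &&
    run git commit -m "${1:-Update site}" &&
    run git push "$REMOTE" "$BRANCH"
}
```

Setting DRY_RUN=1 before calling publish "New post" only prints the four commands, which is a cheap way to check the sequence before pushing anything.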

While it might also reduce the files stored on the server, in practice I do something that effectively keeps a second copy of the whole site. The git directory I upload to is not the one the server uses as the site (in NFSN, the ‘public’ directory). Instead, I have a post-commit hook script that copies all the files out of the repository. Here’s the heart of it:

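# Export the current tree (no history) into a temporary directory
mkdir -p "$TMP_GIT_CLONE"
git archive --format=tar master | tar -x -f - -C "$TMP_GIT_CLONE"
# Mirror the generated site into the public directory, then clean up
rsync -rlt -vi --delete "$TMP_GIT_CLONE/_site/" "$PUBLIC_WWW"
rm -rf "$TMP_GIT_CLONE"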


It first uses git to create an archive (a copy that doesn’t include the repository history) and unpacks that into a temporary directory. Then rsync is used to copy the files to the public site (i.e. directory). The first set of options to rsync there is -rlt, to copy recursively, copy symbolic links as links, and preserve file modification times. The -vi options output an itemized list of changes for monitoring the process during the upload, and the --delete option removes any files in the public directory that are not in the current repository. Afterward, the temporary copy of the git repository is removed. Using this method, the public site never has any files but the latest version from the repository, and none of the git history. The rsync copy step might not be strictly necessary, though it gives me a little bit of peace of mind about some file copying errors. There’s maybe also a reduction of the brief time window during which files on the server might be changing, but given the static nature of the site, and relatively fast internet connections, that doesn’t amount to much of an issue these days.

As mentioned, there probably is enough for me to do another post on the recent updates, most particularly the changes to the photo gallery. I make no promises about when I’ll post it, but I expect it to appear in less than ten years.