Making a blog website
A very hearty welcome to my first blog post ever! I created this blog for two main reasons: 1. To force myself to think deeply about programming decisions (after all, they must somehow become a coherent narrative). And 2. as a way of putting myself out there as a software and computer science enthusiast, something that will hopefully lead to meeting new people. So now, what better way to start off a programming blog than by writing a piece on how I created this website itself? Do note that I do not plan to keep editing and updating this post going forward, so if you are from the future the website may well be a little different than described here. Yet I think it can serve as an interesting write-up of the experience of such an undertaking.
What do we want
Let's kick it off with defining our requirements:
- Learn something
- Little maintenance
- Easy to add and update posts
- Ability to show code snippets
The old way is the gold way: static html
Initially I started building a setup in Node.js with Nunjucks (a Jinja-inspired templating engine for JavaScript). Whilst I rarely use JavaScript, I've always been attracted to the functional syntax and asynchronous nature of Node.js. However, in a discussion with a former fellow student who has some years of serious DevOps experience, I learned something new: since the data rarely changes and does not depend on the request, a static site would be more than adequate. Of course we can still use templates, but the difference is that we 'build' (state -> html) only when we change or add something. This idea also ties in nicely with our low-maintenance requirement, since we only need to keep a simple nginx server alive, which I know to be quite trivial.
Incidentally, I also played a small role in making a static photo gallery for my study association, which was to be placed behind a login to abide by GDPR guidelines. That project contained a few templates along with a Python application to build and manage the website and its state. I decided to borrow heavily from this setup. That project is also open source, by the way (GitHub).
Our plan
- Define the state
- Create Jinja2 templates to render the state
- Create a Python-based CLI to manage the state and build the site
- Create an nginx configuration to serve files (without extensions in the URI)
1. Defining the state
Being a very simple blog, it seems adequate that our state is merely a list of posts. Let's define our types using the marvelous dataclasses introduced in Python 3.7:
from __future__ import annotations
from dataclasses import dataclass
from datetime import datetime
from typing import List

@dataclass
class Post:
    number: int
    title: str
    postedAt: datetime

@dataclass
class State:
    posts: List[Post]
Also, I left out the code for json parsing here since it is quite straightforward. The state file might look something like this:
{
    "posts":
    [
        {
            "number": 0,
            "title": "test_post",
            "postedAt": "2019-11-01T16:54:42"
        },
        {
            "number": 1,
            "title": "Making a blog website",
            "postedAt": "2019-11-01T17:22:20"
        }
    ]
}
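The parsing code is not shown in this post. For the curious, a minimal sketch of how it might look is given below. The name parse_state is hypothetical and not taken from the actual source, and the sketch assumes the timestamps are always ISO 8601 strings like the ones in the example file.

```python
from __future__ import annotations

import json
from dataclasses import dataclass
from datetime import datetime
from typing import List


@dataclass
class Post:
    number: int
    title: str
    postedAt: datetime


@dataclass
class State:
    posts: List[Post]


def parse_state(text: str) -> State:
    """Turn the JSON text of a state file into typed dataclasses.

    Hypothetical helper: assumes every post has the three keys
    shown in the example state file.
    """
    raw = json.loads(text)
    posts = [
        Post(
            number=int(p["number"]),
            title=str(p["title"]),
            # "2019-11-01T17:22:20" is plain ISO 8601, which
            # datetime.fromisoformat understands (Python 3.7+).
            postedAt=datetime.fromisoformat(p["postedAt"]),
        )
        for p in raw["posts"]
    ]
    return State(posts)


example = """
{"posts": [
    {"number": 0, "title": "test_post", "postedAt": "2019-11-01T16:54:42"},
    {"number": 1, "title": "Making a blog website", "postedAt": "2019-11-01T17:22:20"}
]}
"""
state = parse_state(example)
```

Going the other way (State to JSON) would mostly be a matter of calling isoformat() on the datetime fields before dumping.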
2. Creating templates
The blog posts I love are those that contain clear, copyable code snippets. It goes without saying, then, that I should adhere to the same standard. My aforementioned friend, who advised me to build a static site, also recommended pygments to convert plain-text code into syntax-highlighted html. The great thing about Jinja is the flexible way in which you can provide your templates with custom functions. This means that we can write our highlight function the way we normally would in Python and then simply inject it into our Jinja context.
import jinja2
import pygments
import pygments.lexers
from pygments.formatters import HtmlFormatter

def highlight(lang: str, code: str) -> str:
    formatter = HtmlFormatter()
    lex = pygments.lexers.get_lexer_by_name(lang)
    return str(pygments.highlight(code, lex, formatter))

def build() -> None:
    # Create a Jinja context with /design/pages/ as root
    env = jinja2.Environment(loader=jinja2.FileSystemLoader("design/pages/"))
    env.globals["highlight"] = highlight
    # Then simply render all templates to eponymous files
    # in the build directory
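The rendering step that the last comment in build() describes is left out above. A minimal sketch of what it could look like follows. The function name render_all and its parameters are hypothetical, and the real build also copies static files, which this sketch ignores.

```python
from pathlib import Path

import jinja2


def render_all(source: Path, target: Path, **context: object) -> None:
    """Render every template under `source` to the same relative
    path under `target` (hypothetical helper, not the actual code)."""
    env = jinja2.Environment(loader=jinja2.FileSystemLoader(str(source)))
    # list_templates() yields names relative to the loader root,
    # e.g. "posts/1/index.html".
    for name in env.list_templates():
        out_path = target / name
        out_path.parent.mkdir(parents=True, exist_ok=True)
        html = env.get_template(name).render(**context)
        out_path.write_text(html, encoding="utf-8")
```

Because the output path mirrors the template path, a file like posts/1/index.html in the source ends up at posts/1/index.html in the build directory.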
Now in our template we can easily call that function. However, because our code is generally a multi-line string containing all kinds of different strings itself, it would be a strenuous task to somehow pipe all that text in as an argument without messing up indentation, special characters, etc. Not to mention the appalling syntax we would have to swallow while writing blog posts! Instead I found this beautiful way to call a macro in Jinja:
<!-- imported somewhere in the base template -->
{% macro code(language) -%}
    {{ highlight(language, caller()) }}
{%- endmacro %}
<!-- while writing a blog post -->
{% call macros.code("python") %}
    def main():
        print("my source code is not a nice string")
        return 0
{% endcall %}
Effectively, you can use the caller() method to load the literal string that was typed inside the call block. This circumvents the need to awkwardly pass code containing all kinds of weird characters as an argument to the macro directly.
At this point I ran into another problem, however: code segments are placed between 'pre' (preformatted) tags. That means that the indent I had so nicely added to my templates to keep things readable was now also showing up in the final result. Evidently I could not simply remove all whitespace from the start of every line, because I want to be able to present indented code snippets in a frivolous attempt to give the impression that I write tidy code. Fixing this problem requires us to consider all the lines of code in the snippet together. What we want is to remove the leading indent of the snippet, or in other words, remove all indent that is shared between all the lines. Let's fix our highlight function so as to satisfy this new constraint:
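To see the call/caller() mechanism in isolation, here is a small self-contained sketch. The shout function and the loud macro are made-up stand-ins for highlight and code, chosen so the example does not depend on pygments.

```python
import jinja2


def shout(text: str) -> str:
    # Stand-in for highlight(): upper-case whatever the call block held.
    return text.strip().upper()


TEMPLATE = """\
{% macro loud() -%}
{{ shout(caller()) }}
{%- endmacro %}
{% call loud() %}
  hello from the call block
{% endcall %}"""

env = jinja2.Environment()
# Inject the Python function so templates (and macros) can call it.
env.globals["shout"] = shout
rendered = env.from_string(TEMPLATE).render()
```

The macro never receives the text as an argument. It gets it by calling caller(), which renders the body of the call block and returns it as a string.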
from typing import Any, Callable, List

def highlight(lang: str, code: str) -> str:
    formatter = HtmlFormatter()
    # Special all() function that returns False for empty lists;
    # otherwise we would never terminate when, for example, the
    # code block contains only whitespace.
    alln: Callable[[List[Any]], bool] = lambda l: all(l) and len(l) > 0
    # Remove all shared leading whitespace, one character per pass
    lines = code.split("\n")
    while alln([x[0].isspace() for x in lines if len(x) > 0]):
        for i in range(len(lines)):
            if len(lines[i]) > 0:
                lines[i] = lines[i][1:]
    code = "\n".join(lines)
    lex = pygments.lexers.get_lexer_by_name(lang)
    return str(pygments.highlight(code, lex, formatter))
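As an aside, the standard library has a function for roughly this job: textwrap.dedent. This is not what the site uses, just an alternative sketch. The wrapper name strip_shared_indent is hypothetical. One difference to be aware of: dedent only removes a prefix that is literally identical on every line, so a snippet mixing tabs and spaces may come out differently than with the loop above.

```python
import textwrap


def strip_shared_indent(code: str) -> str:
    """Remove the leading whitespace common to all non-blank lines."""
    # dedent ignores whitespace-only lines when computing the margin.
    return textwrap.dedent(code)


snippet = "\n    def main():\n        return 0\n"
dedented = strip_shared_indent(snippet)
```

The relative indent of return 0 survives, which is exactly the property we need for code between pre tags.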
3. Management CLI
This will be the gatekeeper to our state. It is this CLI that we will use for building the site as well as for creating new posts.
Over the years I've created loads of small CLI programs. But until recently they always sucked. I've had dotnet projects with 30 different attributes, all doing weird type reflection logic to slowly consume a command and eventually select a method to execute.
But all my troubles seem to have been solved by the wonderful click package. It allowed me to effortlessly create commands and subgroups. Furthermore, an argument can be declared to be a path, allowing your favorite terminal to autocomplete your commands. Eventually I came to the following set of commands:
blog init # create an empty state.json file
blog build # copy and render source files to build location
blog preview # serve build directory in browser
blog post create # copy post template to new directory in the source and append to the state
blog post edit # open a post's directory in vim
blog post delete # remove a post from the state
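To give an idea of how such a command tree is wired up in click, here is a trimmed-down sketch with only two of the commands. The bodies just echo a message, and the --out option with its default is an assumption made for illustration, not the real interface.

```python
import click


@click.group()
def blog() -> None:
    """Manage the blog state and build the site."""


@blog.command()
@click.option(
    "--out",
    type=click.Path(file_okay=False),  # must be a directory, not a file
    default="ignore/build",
    help="Build location (hypothetical option).",
)
def build(out: str) -> None:
    """Copy and render source files to the build location."""
    click.echo(f"building site into {out}")


@blog.group()
def post() -> None:
    """Commands that operate on a single post."""


@post.command()
@click.argument("title")
def create(title: str) -> None:
    """Create a new post and append it to the state."""
    click.echo(f"created post: {title}")


if __name__ == "__main__":
    blog()
```

Nesting a group inside a group is what gives the two-word form, as in blog post create "My title".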
4. Nginx configuration
Now that we have everything set up, we just need a way to serve the files. For this we only need a short nginx configuration that looks for the file specified in the URI. To avoid ugly-looking URIs that end in .html, we leverage the fact that nginx looks for index.html if the URI points to a directory rather than a file. So you, my dear reader, only see /posts/1 for example, which in actuality is interpreted as /posts/1/index.html. Note that in reality, of course, we use a slightly longer configuration to enable SSL and redirect http to https, but that is outside the scope of this post.
http {
    include /etc/nginx/conf/mime.types;
    default_type application/octet-stream;
    server {
        listen 80;
        root /var/www/blog;
        index index.html index.htm index.php;
        server_name blog;
        location / {
            try_files $uri $uri/ =404;
        }
    }
}
events { }
We want more, but mmm
During any coding project I always find myself arriving at 'that' moment: you want to add a feature, but it just doesn't play nice with the existing codebase, causing you to edit way more source files than intended. In this case it was the feature of showing the last modified time at the bottom of a post's page. Getting the information is quite straightforward using pathlib, but there is no slot for that property in our state type. If you are familiar with Python you might think 'why not set the property at runtime? Isn't that what makes Python so nice?'. You would be quite right, this is an acceptable solution in the Python world, but it does mean that you break your predefined type, and we committed to having a sound type system. Therefore we should not mutate our existing state, but rather create a new, extended type. Previously we could think of our pipeline as follows:
build :: "state.json" -> State -> html
With the extra preparation step it becomes:
build :: "state.json" -> State -> PreparedState -> html
from pathlib import Path

# When inheriting from dataclasses, the constructor
# parameters are a union of the parent's and the child's.
@dataclass
class PreparedPost(Post):
    modifiedAt: datetime
    abstract: str

@dataclass
class PreparedState:
    posts: List[PreparedPost]

def prepareState(state: State, root: Path) -> PreparedState:
    posts_dir = root / "posts"
    prepared_posts = []
    for post in state.posts:
        post_path = posts_dir / str(post.number)
        # Recursively finds the files in that path and
        # returns the most recent modification date
        # (lastModified is a helper that is not shown here).
        modifiedAt = lastModified(post_path)
        # Load the abstract, falling back to a placeholder
        abstract = "abstract not available"
        abstract_path = post_path / "abstract.html"
        if abstract_path.exists():
            with open(abstract_path, "r") as f:
                abstract = f.read()
        prepared_posts.append(
            PreparedPost(
                post.number,
                post.title,
                post.postedAt,
                modifiedAt,
                abstract)
        )
    return PreparedState(prepared_posts)
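The lastModified helper is not shown in this post. A minimal sketch of what it could look like, going only by the comment in prepareState, is given below. The fallback to the directory's own timestamp when it contains no files is an assumption.

```python
from datetime import datetime
from pathlib import Path


def lastModified(path: Path) -> datetime:
    """Most recent modification time of any file below `path`.

    Hypothetical implementation: the real helper may differ.
    """
    # rglob("*") walks the directory tree recursively.
    mtimes = [p.stat().st_mtime for p in path.rglob("*") if p.is_file()]
    if not mtimes:
        # Assumption: an empty directory reports its own mtime.
        mtimes = [path.stat().st_mtime]
    return datetime.fromtimestamp(max(mtimes))
```

Since it looks at every file under the post's directory, touching an image or the abstract also counts as modifying the post.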
Looking back
Let's take a look at what we have:
.
├── design
│   ├── index.html
│   ├── posts
│   │   ├── 0
│   │   │   ├── index.html
│   │   │   └── abstract.html
│   │   └── 1
│   │       ├── index.html
│   │       └── abstract.html
│   ├── templates
│   │   ├── base.html
│   │   ├── footer.html
│   │   ├── macros.html
│   │   ├── nav.html
│   │   └── post.html
│   └── public
│       ├── pygments.css
│       └── style.css
├── ignore
│   └── build
│       ├── index.html
│       ├── posts
│       │   ├── 0
│       │   │   └── index.html
│       │   └── 1
│       │       └── index.html
│       └── public
│           ├── pygments.css
│           └── style.css
├── main.py
├── Pipfile
├── Pipfile.lock
├── README.md
├── src
│   ├── cli.py
│   ├── generate.py
│   ├── state.py
│   └── __init__.py
└── state.json
We have our design directory, which contains all of our templates and any other files that I might need in future posts, conveniently boxed in their own subdirectories. In our build folder we now have the exact same tree structure, except that the templates are now rendered with the data provided by the state.json file (I also deleted the templates directory there). I am particularly pleased by the simplicity of it all. Now we have all the time in the world to fiddle with the layout and conjure up a custom 404 page with images of cute animals.
And this concludes the project. If you want to take a look at the source, you can find it over here on GitHub. If you enjoyed it, hated it, have any suggestions, or just want to say hi, feel free to email me!