Creating static websites and PDF files with Jinja and YAML

Why (long)?

It was time again to update my CV. So far in my working life, I have created my CVs using Word (ugh), LaTeX (indeed) and recently Django! Django? Yes: creating a web page and then having it rendered to PDF with the Pisa library. This grew out of an idea to create my homepage and my CV PDF from the same base.

That was working quite well. Using Django’s ORM and fixture mechanism, I was able to maintain my web content and my CV from the same base of YAML files. However, my website and my CV are both rather static in nature, and using Django felt like overkill. Besides, I wanted to move my site to S3. In my constant quest to simplify things, I decided to throw out Django and just use plain Jinja templates, with YAML files as my content.

Why (short)?

To create static files, I prefer to keep my content in flat text files in a convenient, human-readable form. Hence my choice of YAML. To bring the content into a nice-looking layout, I wanted to reuse what I already had, so I chose Jinja (which is, in fact, my favorite template engine).

1. Content = YAML

YAML documents are instantly understandable and can map most common data structures (see the Wikipedia article for an introduction).

YAML files can be processed in Python using PyYAML. The usage is straightforward:

from yaml import load, Loader

# load the entries YAML file (the with block closes the file afterwards)
with open("data/entries.yaml") as f:
    data = load(f, Loader=Loader)

Depending on the YAML definition, the data object will be a map or list with the content.
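To make the map-versus-list distinction concrete, here is a minimal sketch that parses two small inline documents. It uses yaml.safe_load rather than load with an explicit Loader, purely as a convenience for the example, and the inline YAML strings are made up for illustration.

```python
import yaml

# A document whose top level is a sequence loads as a Python list ...
as_list = yaml.safe_load("""
- details:
    name: Bernhard Wenzel
- projects:
    title: Development of sports site
""")

# ... while a document whose top level is a mapping loads as a dict.
as_dict = yaml.safe_load("""
details:
  name: Bernhard Wenzel
projects:
  title: Development of sports site
""")

print(type(as_list).__name__, len(as_list))     # list 2
print(type(as_dict).__name__, sorted(as_dict))  # dict ['details', 'projects']
```

The list form is the one used in the rest of this post, because it lets the template loop over the entries.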

2. Template = Jinja

With the data object at hand, we can render our templates into an HTML string cv.

from jinja2 import Environment, FileSystemLoader
env = Environment(loader=FileSystemLoader("templates"))

# create and render template
template = env.get_template("cv.html")
cv = template.render(entries=data, templates_folder="templates")

Now comes the interesting part: how do we select the data in our templates? We can make things very easy if we structure the data in the YAML files in a convenient way.

All of this can be done with Jinja’s loop filtering: {% for element in data if condition %}

Let’s say our YAML definition looks like this:

- details:
    name: Bernhard Wenzel
    address: Street, City, Country
- projects:
    title: Development of sports site
    client: BigCorp

And we load the whole data content into entries. To select only the address field, we can use the following loop filter:

{% for e in entries if e.details %}
        {{ e.details.address }}
{% endfor %} 

The condition if e.details selects the correct list element (in YAML, items that start with a hyphen are list elements). Voilà. The condition can be any expression, of course. For example, to disable certain entries, we could add a field enabled: false to the YAML file and filter on it using if ... and e.enabled. Note that with this exact condition an entry without an enabled field is skipped as well, because a missing field evaluates as false.
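The filtering can be tried without any template files. The following is a minimal sketch, not the code from my project: the third entry and its enabled flag are made up for illustration, and the default(true) filter is one possible way to treat entries without the flag as enabled.

```python
import yaml
from jinja2 import Template

# Hypothetical data: the example entries plus one disabled project.
entries = yaml.safe_load("""
- details:
    name: Bernhard Wenzel
    address: Street, City, Country
- projects:
    title: Development of sports site
    client: BigCorp
- projects:
    title: Old project
    client: SmallCorp
  enabled: false
""")

# Select the list element that has a "details" key.
address = Template(
    "{% for e in entries if e.details %}{{ e.details.address }}{% endfor %}"
).render(entries=entries)

# Select projects, skipping entries that are explicitly disabled.
# default(true) treats a missing "enabled" field as enabled.
titles = Template(
    "{% for e in entries if e.projects and e.enabled | default(true) %}"
    "{{ e.projects.title }};"
    "{% endfor %}"
).render(entries=entries)

print(address)  # Street, City, Country
print(titles)   # Development of sports site;
```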

Sorting is also possible. In my case, it is sufficient that the order of the elements in the YAML file is preserved in Python, so keeping things in the correct order in the file is all I need.
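Where the file order is not enough, Jinja’s built-in sort filter is one option. This is a small sketch under assumptions: the project list and its year field are hypothetical and not part of my data.

```python
from jinja2 import Template

# Hypothetical project list, deliberately out of order.
projects = [
    {"title": "Sports site", "year": 2012},
    {"title": "Booking engine", "year": 2014},
    {"title": "Intranet", "year": 2010},
]

# Sort by the "year" key, newest first, directly in the loop.
template = Template(
    "{% for p in projects | sort(attribute='year', reverse=true) %}"
    "{{ p.year }} {{ p.title }}\n"
    "{% endfor %}"
)
output = template.render(projects=projects)
print(output)
```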

We can get quite far this way without any further querying.

3. Write PDF

Rendering to PDF is now just a matter of using the pisa library. I like the filename to contain a date:

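# Note: this is Python 2 code (it uses the StringIO module)
import StringIO
from datetime import date

import ho.pisa as pisa

# cv = template.render ... (the HTML string from step 2)

# create pdf; the filename contains the current year and month
date_format = "bernhardwenzel-cv-%Y.%m"
filename = "out/" + date.today().strftime(date_format) + ".pdf"
with open(filename, "wb") as pdf_file:
    pisa.pisaDocument(StringIO.StringIO(cv.encode("UTF-8")), pdf_file)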
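To see what the date format produces, the filename logic can be isolated and run on its own. The helper pdf_filename and its out_dir parameter are hypothetical names introduced only for this sketch.

```python
from datetime import date

def pdf_filename(day, out_dir="out"):
    """Build the dated output path for the given day."""
    # strftime fills in %Y (four-digit year) and %m (zero-padded month)
    return out_dir + "/" + day.strftime("bernhardwenzel-cv-%Y.%m") + ".pdf"

print(pdf_filename(date(2014, 3, 5)))  # out/bernhardwenzel-cv-2014.03.pdf
```

Because only year and month are included, rendering twice in the same month overwrites the earlier file.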

I have set up a small GitHub project, including an example template, that I use to render my CV. It is available at: https://github.com/BernhardWenzel/pycvmaker

Creating static websites

As the PDF is created from the rendered HTML, it is of course possible to create a static website the same way. There are plenty of static website generators; among those, I use Pelican and Octopress. But to keep things even simpler, and as my website is not a blog and is rarely updated, I prefer to create static content straight out of Jinja.

I use staticjinja for this, which combines template rendering and writing the static output in one step.

It is possible to call staticjinja directly; however, a small script is necessary to pass in data. For my website at wenzel-consulting.net, I use this little snippet:

import shutil

from staticjinja import Renderer
from yaml import load, Loader

def get_data():
    return {
        "entries": load(open("data/entries.yaml"), Loader=Loader),
        "projects": load(open("data/projects.yaml"), Loader=Loader)
    }

if __name__ == "__main__":
    output_folder = "out"

    # remove out
    shutil.rmtree(output_folder, ignore_errors=True)

    renderer = Renderer(outpath=output_folder, contexts=[("index.html", get_data)])
    renderer.run(debug=True, use_reloader=False)

    # copy static folder (css and images)
    shutil.copytree("static", output_folder + "/static")

That’s all. I filter my data in the templates the same way as described in step 2, and it does all I need.
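For readers who want to see the idea without installing staticjinja, here is a rough sketch of the same render-and-write step using plain Jinja. It is not staticjinja’s implementation: the build function, the in-memory template and the temporary output folder are assumptions made for this example.

```python
import os
import tempfile

from jinja2 import Environment, DictLoader

# Hypothetical in-memory template; a real site would use
# FileSystemLoader("templates") instead.
env = Environment(loader=DictLoader({
    "index.html": "<h1>{{ title }}</h1>",
}))

def build(env, contexts, outpath):
    """Render each (template name, context function) pair into outpath."""
    os.makedirs(outpath, exist_ok=True)
    written = []
    for name, get_context in contexts:
        html = env.get_template(name).render(**get_context())
        target = os.path.join(outpath, name)
        with open(target, "w", encoding="utf-8") as fh:
            fh.write(html)
        written.append(target)
    return written

out_dir = os.path.join(tempfile.mkdtemp(), "out")
paths = build(env, [("index.html", lambda: {"title": "Home"})], out_dir)

with open(paths[0], encoding="utf-8") as fh:
    print(fh.read())  # <h1>Home</h1>
```

The contexts argument mirrors the shape used in the snippet above: a list of pairs, each naming a template and a function that returns its data.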

Optional: deployment to Amazon S3

While I’m at it, here’s how I deploy the static website to S3. Setting up an Amazon S3 bucket to host a static website takes only a few steps (and it’s reasonably priced). Instructions can be found on Amazon (basically: create a bucket, enable static website hosting, make the bucket public, and set up routing if you want to use your own domain).

As a deployment tool I use s3cmd (install it with sudo apt-get install s3cmd and configure it with s3cmd --configure).

To update my content, I use the sync command. Let’s say the static files are under <project>/out and I have an S3 bucket named <BUCKET-NAME>. To deploy, I do the following:

cd <project>
s3cmd sync -r out/ s3://<BUCKET-NAME>

Note: the trailing slash of out/ is crucial. Without it, the command copies the out directory itself into the bucket. With the slash, only the contents of out are copied into the root folder of my S3 bucket.

That’s it. Now I can structure my content in any way I wish using YAML and render it in a flexible way with Jinja templates.
