How I accidentally reinvented Static Site Generators


The old yoniles.com is no more and in its place is the site you see before you. What happened to the old posts and the site’s direction is not important, but I hope future posts will fill in more details. I’d rather talk about the technology that has always underpinned this blog: static site generators.

How we got here

The year is 2018. I have a better hairline, Sicko Mode or Mo Bamba is still a relevant question, and I am about a year into my developer journey. With the help of my friend, whom I will refer to as gophyrx, I registered and set up an Apache server to host a simple website. Its main purpose was to (1) hold a copy of my resume and (2) host an application that I wrote for my freshman roommate to describe the pain of his undergrad experience.

However, with winter break coming, I knew I had to put more into the site if I wanted something that I could be proud of. With a lot of feedback (read: roasting) from gophyrx and my roommate, I created a simple pure HTML and CSS website that I could call my own and that looked pretty decent. I even made a few blog posts! As time went on, however, I didn’t like writing pure HTML or ssh-ing or scp-ing directly into the instance running the site. Even my barely developed developer brain could see that there was a better way to do this, and so I started with what I did best at the time: ignoring the easy way to do something and instead making a bespoke nonsense solution.

The reason for this choice, the habit itself, and even the choice to write the original site in pure HTML and CSS was really simple: I didn’t think that learning a framework was worth it. My mindset at the time was that to understand something, I had to be able to write it myself. This lesson came from when I (despite not knowing how to code) tried to write a neural network. Working with Keras left a gross taste in my mouth, as it was so simple that I felt it was a waste of my time. Using a library with a simple API like Keras, I thought, robbed me of my understanding as well as any challenge and should be avoided.

All I wanted to do was insert my blog post into a blank page with my navbar up top and a list of posts on the side. “This is a simple task,” I thought. “I could knock this out in a day.” And I was right. Looking back, the code isn’t that bad. I even remember trying to use git as a way to manage state. It felt hacky at the time, but honestly it’s not a bad solution. If you are interested in how it worked, you can read the “How it worked” section; otherwise, skip ahead to “Takeaways” after finishing this section.

Later, when I told gophyrx about what I had done, he recommended looking into static site generators. Although it sounded exactly like what I had built, I refused to migrate my site. This is where the other side of the “Do everything yourself” coin comes into play. I was too lazy to learn a new framework. I barely understood C and Python at this point in my career. Was I really going to invest the time to learn a broken language like JavaScript or Ruby, just to have to migrate and possibly rewrite my website? Heck no! My pile of Python code worked and made sense to me!

Now, years later, I have less time to work on my custom script, write posts, or manually update the site. The little script I wrote is neither as full-featured nor as robust as I would like, and deployment is still manual. It was time for a change.

How it worked

Like I said above, the code isn’t that bad. I think this is due to the fact that (1) it is written in Python, (2) it only does two things, and (3) it was written before my first taste of overcomplicating things with whiteboard masturbation.

Let’s step through the main function:

def main():
  args = getArgs()
  title = args.title
  if title == None:
    title=args.postFile.split(".")[0]

First, I read in the arguments, which a comment in the code even calls self-explanatory. There are only three args: post <required>, title, and template (though I only made and used one template). If no title is given, the post’s file name without its extension is used instead.
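For readers who want to picture the argument handling, here is a minimal sketch using argparse. The getArgs() body is not shown in this post, so the flag spellings, help text, and default template path below are assumptions; only the attribute names postFile and title come from the snippet above.

```python
import argparse


def get_args(argv=None):
    """Parse the three options described above (hypothetical reconstruction)."""
    parser = argparse.ArgumentParser(description="Publish a blog post")
    # The post file is the only required argument.
    parser.add_argument("postFile", help="source file for the post")
    # Optional title; the caller falls back to the file name when it is None.
    parser.add_argument("--title", default=None, help="title of the post")
    # Optional template; the default path here is an assumption.
    parser.add_argument("--template", default="templates/post_template.html",
                        help="HTML template to fill in")
    return parser.parse_args(argv)


# Passing an explicit list keeps the example independent of sys.argv.
args = get_args(["my_post.md", "--title", "Hello World"])
title = args.title if args.title is not None else args.postFile.split(".")[0]
```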

Then I generated the filename for the post:

  cmdLineTitle = title.replace(" ", "_") + ".html"

One of my few critiques of my code is this upcoming section. I walk through the template line by line and update the text, rather than using some smarter way to find the placeholders:

  with open('posts/'+ cmdLineTitle, "w+") as newFile:
    with open('templates/post_template.html', 'r') as t:
      for line in t:
        line = line.replace('<!-- Post Name -->', title)
        line = line.replace('<!-- Last Post Link -->', "<!-- Last Post Link -->\n\t<a href =\""+cmdLineTitle+ "\">"+ title+ "</a>")
        line = line.replace('<!-- Content -->', getContent( args.postFile ) )
        line = line.replace('<!-- Date -->', str(datetime.datetime.today())[:10] )
        newFile.write(line)
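One possible "smarter way" is to treat the whole template as a single string and swap every placeholder in one pass. This is only a sketch, not what the original script did: render_template and its parameters are hypothetical names, and it assumes the template fits comfortably in memory.

```python
import datetime
import re


def render_template(template_text, title, file_name, content, today=None):
    """Fill every placeholder comment in a single pass over the template."""
    today = today or datetime.date.today()
    replacements = {
        "<!-- Post Name -->": title,
        # Keep the marker in place so the next post can be linked the same way.
        "<!-- Last Post Link -->":
            f'<!-- Last Post Link -->\n\t<a href="{file_name}">{title}</a>',
        "<!-- Content -->": content,
        "<!-- Date -->": today.isoformat(),
    }
    # One regex alternation means inserted text is never re-scanned, so a
    # marker that happens to appear inside the post content is left alone.
    pattern = re.compile("|".join(re.escape(marker) for marker in replacements))
    return pattern.sub(lambda match: replacements[match.group(0)], template_text)
```

A call would look like render_template(open("templates/post_template.html").read(), title, cmdLineTitle, getContent(args.postFile)), with the result written to the new post file.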

However, that does gloss over some detail and my other main critique of this code. That getContent() function calls an application called pandoc and returns its output. Some people had college flings, others found their love at university; I had pandoc. I used it for everything, and maybe one day I will share the cooler things I wrote in LaTeX using that tool. It was obviously going to be the tool that I used to turn Markdown into HTML.
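The body of getContent() is not shown here, so the following is a guess at its shape rather than the original: a minimal sketch that shells out to pandoc with subprocess. The helper names are hypothetical, and it assumes pandoc is installed and on the PATH.

```python
import subprocess


def pandoc_command(post_file):
    """Build the pandoc invocation that converts a Markdown file to HTML."""
    return ["pandoc", "--from", "markdown", "--to", "html", post_file]


def get_content(post_file):
    """Run pandoc and return the HTML fragment it prints to stdout."""
    result = subprocess.run(
        pandoc_command(post_file),
        capture_output=True,  # collect stdout instead of printing it
        text=True,            # decode the output to str
        check=True,           # raise CalledProcessError if pandoc fails
    )
    return result.stdout
```

Without the standalone flag, pandoc emits only an HTML fragment, which is what a template with a content placeholder needs.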

Then all that was left was to update the links in the navbar and blog homepage. The post_template.html file itself held links to all of the previous posts, so when a new post was made, it would inherit them. You may also notice that the reference to it is hardcoded, completely invalidating the template argument. All of the already published posts would be walked through and updated. For some reason I did not take the approach of regenerating the whole website. I instead recreated what I probably did by hand before this tool.

  # add to posts block on blog page
  update_file("blog.html", "posts/"+cmdLineTitle, title)
  # update the template
  update_file("templates/post_template.html", cmdLineTitle, title)
  # update all posts
  for post in os.listdir("posts/"):
    if post != cmdLineTitle:
      update_file("posts/"+post, cmdLineTitle, title)

That actually caused a little bit of a headache as I have another helper script to delete posts that I do not discuss here.
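For contrast, the regenerating approach could rebuild the link list from whatever files currently exist, so a deleted post simply disappears on the next run. This is a hypothetical sketch, not code from the original tool: build_post_links is an invented name, the alphabetical ordering is arbitrary, and recovering the title from the file name assumes titles never contained underscores.

```python
def build_post_links(file_names):
    """Return one anchor tag per HTML post, derived only from file names."""
    links = []
    for name in sorted(file_names):
        if not name.endswith(".html"):
            continue  # skip anything in the directory that is not a post
        # Reverse the "spaces to underscores" step used when naming the file.
        title = name[:-len(".html")].replace("_", " ")
        links.append(f'<a href="{name}">{title}</a>')
    return "\n".join(links)


# In the real layout this list would come from os.listdir("posts/").
example = build_post_links(["Second_Post.html", "First_Post.html", "notes.txt"])
```

The resulting string could then be dropped into a single placeholder on each page instead of patching every published post in place.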

Overall, I would say this code is ok. The tool doesn’t do anything amazing, but I think it was pretty impressive for where I was at the time. It’s funny seeing myself write comments like:

# does a thing
do_the_thing()

given my current opinion on comments, but that is a discussion for another time.

Takeaways

Top-tier yapper Slavoj Zizek has an aside about how wisdom is the most disgusting thing. In the aside, he gives the example of how easy it is to have multiple interpretations of the dichotomy of eternity and the present, and how a wise man can sell any such permutation. So don’t take my wishy-washy, definitely-not-bad answer as an indication that I haven’t made up my mind, but rather as the peak of wisdom.

The cliche answer to “Why did you build something yourself?” is “Because I wanted to learn and make something better”. Although I didn’t do the latter, the former sentiment rings true. If I had to make a wild guess as to how the first site generators looked, I think they would look nearly identical to how I wrote my code. I have dedicated post and template directories, and comments are used as a pseudo-templating structure, a pattern which is still used in modern static site generators. Even in my mistakes, such as creating a subprocess to call pandoc instead of using the markdown library, I am glad that I learned what I did in making those mistakes.

This is my first permutation of wisdom: the journey is more important than the destination. Sure, the code I wrote isn’t the best, but it isn’t bad, and I learned a lot at the time from having done it. There is something to be said for how high schoolers can understand calculus but it took a genius to invent it. Although I am not calling myself a genius by any means, I gained more value by independently creating patterns like project structure or templates than I would have if I had simply learned these patterns from a tool.

My second permutation of wisdom is that there may be more than one way to skin a cat, but at the end of the day, all that matters is that it’s skinned. This is what I would call a ‘capital R Realist’ take. Will your HOA be impressed if you mow your lawn with a scythe rather than a riding lawnmower? Obviously not. So why would anyone care if you built a low-feature static site generator instead of using an off-the-shelf one?

The synthesis of those two points is my final permutation of wisdom: doing things by yourself is only valuable for yourself. Was building a static site generator by myself the right choice for the website? No. Was building a static site generator by myself the right choice to talk to recruiters or share with the wider world? No. Was it the right choice to enrich myself? Yes, but I should have checked how other frameworks handled the same problems I had, for feedback. In that sense, I am happy that I made this in 2018, but I am also happy that I moved this site to Hugo.