Simpler Website Tech: Leaving the PHP Island

Image: Images from Wikimedia Commons, combining Nihoa, Northwestern Hawaiian Islands and Lempisaari, Naantali, Finland. CC A and CC BY.

Cite as:

DOI

10.25815/xxk5-h953

Citation format: The Chicago Manual of Style, 17th Edition

Jackson, Daniel. ‘Simpler Website Tech: Leaving the PHP Island’, 2020. https://doi.org/10.25815/xxk5-h953.

Part of a new GenR series for 2020 ‘Open Science Pro Tip’ where Open Science researchers share their digital know-how.

Researcher and developer Daniel Jackson shares his experiences of using flat file web technologies that can take the headaches out of running a research website by reducing maintenance tasks, lowering costs, avoiding security problems, and helping with archiving and keeping a site online long-term. The article covers a number of research site examples, from running a personal site or a research project site to archiving a site at the end of a project. Flat file approaches came about because of the long-standing security vulnerability of websites built on PHP/MySQL, which continually run the risk of opening up a whole web server to being hijacked. The solution to this ‘vulnerability’ problem is quite simple: remove the machine from the equation and just serve HTML/CSS and any other assets needed, hence the name ‘flat file’ sites.


The PHP CMSs that we use today were invented in the 1990s, with the major open source projects kicking off in the early 2000s: Drupal in 2001 and WordPress in 2003. These technologies underlie many millions of websites, with WordPress reportedly accounting for 35% of the web (Netcraft 2019).

So what’s happening now? Javascript has become the most popular programming language for the web, hosted services provide API access to data and of course developers use Git for version control. With a shifting technology stack, more functionality moving to the cloud, good operational reasons for separating front-end and back-end functionality, what are the options on offer to the website owner or developer? If we find we have grown weary of our PHP island where do we go now?

The challenges of the LAMP Stack

Problem of scope

Drupal and WordPress are examples of monolithic structures based on the LAMP stack (Linux, Apache, MySQL, PHP). They are monolithic because they control both the data and the presentation layers and, with an extensive library of plugins, can do pretty much anything. The problem with this huge scope is that some things are not done so well. For example, Drupal has very good multilingual support built into core but WordPress does not. WordPress has a good shopping solution in the free WooCommerce plugin; however, if you want subscriptions and memberships you need to pay for plugins. Drupal’s Commerce modules are complicated to set up and maintain, so developers often opt for a third-party hosted solution such as Shopify. Some WordPress developers will also choose Shopify over WooCommerce. If you would be best served by a third-party hosted service for your e-commerce, do you really need a PHP CMS to run your site, controlling all aspects of the front and back ends? Importantly, most websites do not need to do everything.

Security and maintenance

As a website admin, the security alert emails can be a niggling stress. Miss one by a week (on that long awaited break inland to the hills), a few days or even hours and your website could be compromised. Recovering a site, or at worst rebuilding it, is a cost: your time and someone’s money. You should really have a protected dev version or local copy, but it can be hard keeping these all up to date and in sync. Drupal has no automatic updating of core, so you do need to be on the ball when those critical security alerts hit your inbox.

Performance and server management

Website performance, accessibility and the other metrics audited by Google Lighthouse will have an impact on your site’s visibility and bounce rate. Search algorithms take these into account when determining page ranking, and if your site is slow visitors will leave. Mobile traffic often represents the majority of users, so performance on slower networks is critical. It is possible to address performance with various types of caching, but this can be tricky to set up correctly and will not match the speed of static HTML with image optimisation.

Enter the Jamstack, flat files and static site generators

Javascript, APIs and Markup

The Jamstack is any combination of Javascript, APIs and Markup, where the markup is typically pre-built HTML generated from Markdown or other plain text sources. Jamstack is pitched as the evolution of modern web development.

Why the Jamstack?

The Jamstack liberates the back-end, with much of the complexity happening at build-time, when Javascript generates static HTML that can be deployed anywhere. It offers advantages in cost, security, separation of display and content, reduced vendor lock-in, faster development time and better scalability.

Below are some of the key components of a Jamstack website which combined deliver the listed advantages:

  • Javascript is the most used programming language for client-side web development and increasingly used on the server side. It is an essential tool for the modern coder and has evolved into a mature and versatile language. 
  • APIs (application programming interfaces) are sets of rules, protocols and functions used to retrieve data from cloud-based CMSs, other data services or hosted functionality such as an e-commerce solution. These APIs are provided by the data services or cloud-based CMSs. An increasingly popular option is GraphQL, which is a query language for APIs.
  • ‘Flat files’ is how the data of your website is stored — in plain simple text files — as opposed to storing your data in a MySQL database. Flat files are single files and should not be confused with a flat-file database, which stores all the data in a single file with no relationships. Using flat files is a bit of a time-warp back to the beginning of the world wide web in the early 1990s, but with the advantage that frameworks will automatically generate HTML pages and optimise your content — more on this later. The flat files comprising the content of your site are usually stored as Markdown.
  • Git is the most widely used version control system and is designed for collaborative working. GitHub and GitLab are Git hosting services. Typically your Jamstack website will be a Git repository on your local machine, which you push to and pull from a remote repository on GitHub.
  • Markdown is a markup language with a simple formatting syntax. It can be entered as plain text or using a WYSIWYG interface. Markdown is used with static site generators but can also be used with single-page applications or more advanced frameworks. Markdown may often be substituted with plain HTML or other plain text formats.

    Markdown files can simply be stored in your website project folders on your development machine, or they may be managed through a Git-based CMS such as Netlify CMS. There are a myriad of Git-based CMSs to choose from: https://headlesscms.org/. They work by storing the data on GitHub or GitLab, with a client-side WYSIWYG web interface for editing Markdown. Netlify CMS has a config file so that you can create your own custom content types.
  • Cloud CMSs are closed source and make data accessible with APIs. Contentful is one of the most popular, Sanity is another interesting contender.
  • Static Site Generators (SSGs) do the building and outputting of your static website, taking your Markdown, your templates and layout code to generate plain HTML, JS and CSS. SSGs are written in various programming languages and you will install them on your development machine and deployment service. The number of SSGs to choose from is bewildering, but we will focus on three popular contenders: Hugo, Jekyll and Gatsby. Check out the alternative SSGs here: https://www.staticgen.com/. Some have quite specific remits, so could be a good fit if you have a specific requirement.
    • Hugo and Jekyll are relatively lightweight SSGs, with simple templating and enhanced functionality. They both have a good ecosystem of themes. Hugo is extremely fast at building the static pages and is versatile, with an enthusiastic community. Jekyll is widely used and integrated with GitHub Pages; it is simple and described as ‘blog-aware’. Both are developing rapidly, so evaluate your requirements carefully and see which fits best. GitHub Pages is a free hosting service which can be used for both Jekyll and Hugo sites.
    • Gatsby is a bit different. Gatsby is a React-based framework which leverages the power of React components. Gatsby also uses GraphQL so it can pull data from many different sources, including Git-based CMSs, cloud CMSs, WordPress and Drupal. It is also a static site generator. There are many Gatsby plugins and a strong developer community. It has excellent image processing with progressive and responsive image loading. Sites can also be available offline, with prefetching of pages and caching, which makes Gatsby Progressive Web App (PWA) ready out of the box. There is a steeper learning curve if you don’t know React or GraphQL, and good Javascript skills are a must if you want to get the most out of it. The rewards are worth the tutorial subscription and there are plenty of excellent learning resources to get started.
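
To make the build step less abstract, here is a minimal sketch of what a static site generator does, written in Python with only the standard library. It is not how Hugo, Jekyll or Gatsby are implemented: the page template, the function names and the tiny Markdown subset (headings and paragraphs only) are all assumptions made for illustration.

```python
import html
from pathlib import Path
from string import Template

# Hypothetical page template; real SSGs use much richer templating languages.
PAGE_TEMPLATE = Template(
    "<!DOCTYPE html>\n"
    '<html><head><meta charset="utf-8"><title>$title</title></head>\n'
    "<body>\n$body\n</body></html>\n"
)


def split_front_matter(text):
    """Split simple 'key: value' front matter (between --- lines) from the body."""
    lines = text.splitlines()
    if lines and lines[0].strip() == "---":
        meta = {}
        for index, line in enumerate(lines[1:], start=1):
            if line.strip() == "---":
                return meta, "\n".join(lines[index + 1:])
            key, _, value = line.partition(":")
            meta[key.strip()] = value.strip()
    # No front matter (or no closing ---): treat everything as body text.
    return {}, text


def render_markdown(body):
    """Convert a tiny Markdown subset: '#' headings and plain paragraphs."""
    blocks = []
    for block in body.split("\n\n"):
        block = block.strip()
        if not block:
            continue
        if block.startswith("#"):
            level = min(len(block) - len(block.lstrip("#")), 6)
            text = " ".join(block.lstrip("#").split())
            blocks.append(f"<h{level}>{html.escape(text)}</h{level}>")
        else:
            text = " ".join(block.split())
            blocks.append(f"<p>{html.escape(text)}</p>")
    return "\n".join(blocks)


def build_site(content_dir, output_dir):
    """Render every .md file in content_dir to an .html file in output_dir."""
    content_dir, output_dir = Path(content_dir), Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for source in sorted(content_dir.glob("*.md")):
        meta, body = split_front_matter(source.read_text(encoding="utf-8"))
        page = PAGE_TEMPLATE.substitute(
            title=html.escape(meta.get("title", source.stem)),
            body=render_markdown(body),
        )
        target = output_dir / (source.stem + ".html")
        target.write_text(page, encoding="utf-8")
        written.append(target)
    return written
```

Calling `build_site("content", "public")` would turn each Markdown file in a `content` folder into an HTML page in `public`, which can then be uploaded to any web host. Real generators add themes, image optimisation, taxonomies and much more on top of this basic loop.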

Depending on your needs you may want to have continuous deployment of your site for content updates and other changes to your Git repository. Netlify is a hosting and build service for automatic deployment to their CDN, but with frequent updates your build time could push you into their paid-for plans. Any errors in your code will cause builds to fail, and build times can be quite long if there are a lot of images to process. As a point of interest, Netlify’s CEO Matt Biilmann coined the nested acronym ‘Jamstack’ – ‘a’ is the nested bit for API.

What you do not get with SSGs is functionality like user authorisation, comments and memberships, and this is where the idea of the ‘Content Mesh’ comes in: the functionality is assembled from specialised hosted services. Examples are search with Algolia, image processing with Cloudinary, comments with Disqus, authorisation with Auth0, SnipCart for shopping, Netlify Forms for forms, and similar services for anything else you may need.

What to do if you are still committed to your LAMP Stack CMS?

Stay on the PHP island but think a bit smarter about hosted services — SaaS (Software as a Service)

There is a cloud solution for pretty much any service you may need. If you have a complex requirement for your site, evaluation of competing solutions could save substantial development time and deliver an improved user experience. The business model for Drupal modules is almost non-existent and the one for WordPress is challenged. This is important because, beyond the core open source project, specific complex functionality can require paid-for support and continuous development to deliver the expected user experience. E-commerce is a good example, with reliable SaaS solutions ranging from a full enterprise shop to a digital downloads store or a donate button. It is worth spending the time evaluating different options and testing the free trials.

Half on half off the island — the headless approach

The headless CMS approach separates the PHP CMS as a back-end from the front-end HTML and Javascript. The headless CMS is a data source accessible via APIs. This means that the display layer (the templating, HTML and CSS) does not limit the type of application, and that the same data can be used in multiple ways. Another variation is the decoupled approach, where the same data can serve web pages through the CMS templating system but is also available to other applications through APIs. Typically with these approaches the decoupled or headless front-end would be developed with a single-page application framework such as React or Vue.

Both WordPress and Drupal can be used as a headless or decoupled CMS. JSON:API is part of Drupal core, and WordPress has had a REST API built into core since version 4.7.
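
As a rough illustration of the headless idea, the Python sketch below reads posts from the WordPress REST API and turns them into standalone HTML pages. The `/wp-json/wp/v2/posts` route and the `per_page` and `_fields` parameters belong to the WordPress REST API; the function names, the page layout and the example site address are assumptions for illustration only.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def posts_endpoint(site_url, per_page=5):
    """Build the URL of the WordPress REST API posts collection."""
    query = urlencode({"per_page": per_page, "_fields": "slug,title,content"})
    return f"{site_url.rstrip('/')}/wp-json/wp/v2/posts?{query}"


def fetch_posts(site_url, per_page=5):
    """Fetch the latest posts as a list of dictionaries (needs network access)."""
    with urlopen(posts_endpoint(site_url, per_page), timeout=10) as response:
        return json.load(response)


def render_post(post):
    """Turn one post object into a standalone HTML page.

    title.rendered and content.rendered hold HTML that WordPress has
    already rendered, so they are inserted without further escaping.
    """
    title = post["title"]["rendered"]
    body = post["content"]["rendered"]
    return (
        "<!DOCTYPE html>\n"
        f'<html><head><meta charset="utf-8"><title>{title}</title></head>\n'
        f"<body>\n<h1>{title}</h1>\n{body}\n</body></html>\n"
    )


def render_posts(posts):
    """Map an output file name to page HTML for every post."""
    return {post["slug"] + ".html": render_post(post) for post in posts}
```

A front-end build could call `render_posts(fetch_posts("https://example.org"))` and write each page to disk, while editors carry on using the familiar WordPress admin. In practice this job is usually done by a framework such as Gatsby, React or Vue rather than by hand-written code.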

In conclusion

If you opt for a Jamstack site with static page generation your site will be faster, easier to maintain and more fun to build with more control of page layout and design. Your data will also be clearly separated from your presentation layer so you could easily upgrade from a Jekyll to a Gatsby site in the future.

But… if you want a blog with commenting but don’t want to pay for Disqus or Auth0 then you may decide to stick with WordPress or Drupal. It may be quicker and cheaper to build a site with Drupal or WordPress if you need functionality which comes as standard with these CMSs. Another consideration is that site moderators do like the WordPress editing interface and may feel limited by some of the Jamstack equivalents. So no clear answers…

Maybe it is just going to be a lot more fun with Jamstack and static sites, both for the end user and the developer. We won’t have to wait for 30 seconds or longer every time we save a post or see a CSS change, and yes, we won’t need to worry about spam and security: it will be someone else’s problem.

Addendum — Get out of PHP jail free card:

If you missed a security update whilst on that two-week break, a spambot is busy exploiting PHPMailer and chewing up your bandwidth, nothing you try fixes it and your hosting company is about to close your account, then the utility Wget could save you. Wget is a cross-platform command-line tool that retrieves web pages recursively and can generate a flattened HTML version of your CMS-based website. Follow the instructions here:

Stanford, Web Services Blog: https://swsblog.stanford.edu/blog/creating-static-copy-website
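
For orientation, the Python sketch below assembles a typical Wget mirroring command. The options used are standard GNU Wget flags, but this particular combination, the function names and the output folder are assumptions and not necessarily what the Stanford guide recommends. By default it only prints the command so that you can review it first.

```python
import shlex
import subprocess


def build_wget_command(site_url, output_dir="static-copy"):
    """Return the argument list for mirroring a site with GNU Wget."""
    return [
        "wget",
        "--mirror",            # recursive download with timestamping
        "--convert-links",     # rewrite links so they work in the local copy
        "--adjust-extension",  # save pages with an .html extension
        "--page-requisites",   # also fetch the CSS, JS and images pages need
        "--no-parent",         # never climb above the starting URL
        f"--directory-prefix={output_dir}",
        site_url,
    ]


def mirror_site(site_url, output_dir="static-copy", dry_run=True):
    """Print the command (dry run) or run it; returns the argument list."""
    command = build_wget_command(site_url, output_dir)
    if dry_run:
        print(shlex.join(command))
    else:
        # Wget exits with a non-zero status if any single request failed,
        # so a non-zero exit is not treated as fatal here.
        subprocess.run(command, check=False)
    return command
```

Running `mirror_site("https://example.org/", dry_run=False)` requires Wget to be installed and will crawl the whole site, so it is worth testing on a small section first.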


Example sites

Personal site for free

GitHub Pages

GitHub Pages is a simple and free way to set up a website almost immediately, including a free subdomain. GitHub Pages is a good way to run a blog, personal profile or project site, and can be extended in functionality and design. There is a setup wizard on the GitHub Pages site below where you can select from existing templates.

GitHub Pages: https://pages.github.com/

Example: Invest in Open Infrastructure

The campaign and network for open academic infrastructures ‘Invest in Open Infrastructure’ uses the GitHub Pages theme called minimal, with blogging and a custom domain.

Site: https://investinopen.org/ and GitHub repo https://github.com/investinopen/InvestInOpen

Research project sites

Jekyll

Jekyll is a flat file site generating framework created by GitHub co-founder Tom Preston-Werner and is the same technology used by GitHub Pages. You can run Jekyll locally for site building, on GitHub Pages, or on your own web server.

Jekyll: https://jekyllrb.com/

Example: AcademicPages theme

Stuart Geiger, a researcher from the Berkeley Institute for Data Science, has made a well-structured academic profile site using Jekyll and very kindly released the theme, called ‘Academic Pages’, along with a testing environment for easy reuse.

AcademicPages theme: https://academicpages.github.io/

Hugo

Hugo is a leading flat file framework written in Go, the programming language developed at Google, and compares very closely to Jekyll; its main selling point is fast build times. A key difference for researchers is support for AsciiDoc and reStructuredText as markup languages. If you are looking to choose between the two, this 2019 comparison on Slant is helpful.

Hugo: https://gohugo.io/

Example: Modern Publishing

Modern Publishing is a research project developing collaborative writing workflows for Open Access publishing. Hugo is used to build their project site and host documentation for connecting tools like GitLab, Open Journal Systems (OJS), Hypothesis, and pandoc-scholar. The project is from TU Hamburg (TUHH) and the Hamburg State and University Library (SUB).

Modern Publishing: https://oa-pub.hos.tuhh.de/en/

WordPress

Flat file site from WordPress

With WordPress it is possible to have the best of both worlds, a WordPress site that users are familiar with and then a flat file output of the WordPress site that is put online for public serving. There are many ways to achieve this mix and below is one example.

Example: Shifter

Shifter is a hosted, managed service for generating a flat file site from WordPress. It is mentioned here as an example because it takes all the headaches out of setting up such a system, but also allows you to migrate away and combine it with other frameworks. One disadvantage is that many features that require dynamic updates or user interaction don’t work with Shifter.

Shifter: https://www.getshifter.io/

Archive a WordPress site

Another use of flat file systems can be for when you want to close a research project and not have to update your site anymore. Again there are many options and this guide takes you through the options.

Example: Simply Static plugin

The Simply Static plugin will generate a flat file copy of your site, which you can use to replace your WordPress site online and/or to place an archived backup copy on GitHub, Archive.org or another repository.

Simply Static: https://WordPress.org/plugins/simply-static/
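
Before uploading or archiving a flat file copy, it can help to preview it locally. The Python sketch below serves a folder of static files with the standard library’s built-in web server. The function name and the default port are assumptions for illustration, and this server is meant for local checking only, not for public hosting.

```python
import functools
import threading
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer


def start_preview_server(directory, port=8000):
    """Serve `directory` over HTTP on localhost in a background thread."""
    handler = functools.partial(SimpleHTTPRequestHandler, directory=str(directory))
    server = ThreadingHTTPServer(("127.0.0.1", port), handler)
    # A daemon thread lets the program exit even if the server is still running.
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    return server
```

After `server = start_preview_server("static-copy")` the exported site can be browsed at http://127.0.0.1:8000/ while the program is running, and `server.shutdown()` stops it again. Nothing more than file serving is involved, which is the whole point of the flat file approach.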

References

Netcraft. ‘2019 Web Server Survey’, 2019. https://news.netcraft.com/archives/category/web-server-survey/.

Posted by Daniel Jackson

Artist and technologist Daniel Jackson is director of Avco Productions Ltd, London and founder of artist project space 'No Show Space'. He has developed numerous websites with a range of technologies since the 1990s; worked in a Virtual Reality studio; produced 3D Computer games; developed online music games and sequencing tools for the BBC; written research papers on Sonification and Haptic Technology for BBC Innovations and created experimental online interfaces such as the John Cage Musicircus for The Space. He has pioneered forms of digital art display software and hardware and consulted for the Tate on their digital conservation strategies.

2 Replies to “Simpler Website Tech: Leaving the PHP Island”

  1. For all R users around, you can use the power of R Markdown and bookdown in Hugo websites via the blogdown package. If you are using Netlify, you need to build the website (the R Markdown part) offline, but you can find other tools that do the R build, too.
    See http://www.noamross.net for a powerful example!

    There is also an academic theme, but that one is very complex. You may want to start with a simpler theme to understand how Hugo works. See metadata2020.org or rdmpromotion.rbind.io for a use of the universal theme.

    Finally, the use of iframe can be a nice and easy solution to embed other webpages (and functionalities).

    1. Gen R

      Thanks for pointers.

      Here’s the Noamross site setup how-to https://www.noamross.net/2019/08/09/a-new-website/ love it that there’s a Tufte CSS framework
      https://github.com/edwardtufte/tufte-css 🙂 . Here is the Noamross repo https://github.com/noamross/noamross.net/

      The metadata2020.org link isn’t working? Is it maybe at another address?