MetaTag2 hosting on a publisher webserver

This document describes how to host MetaTag2 on a publisher domain and provides alternative examples for several widely used server types: nginx, Apache, and AWS CloudFront, plus a bash script with a cron job that works with any other web server.

Example1: nginx

In the nginx setup, we use a custom path to mount and serve the MetaTag2 files via a server-side proxy. Caching rules are set such that the MetaTag2 files can be cached on the publisher server and updated based on the caching headers on the origin files. The proxy_cache_path directive sets the path and parameters for the cache, and the proxy_cache_key directive sets the key for the cache.

In the server block, the location directive defines how requests for a specific route (/any-path-you-want/) are processed. It includes the proxy_pass directive, which forwards requests to the origin server. The proxy_cache directive enables caching, and the proxy_cache_valid directive sets how long responses with the listed status codes stay in the cache.

Headers from the response that may reveal information about the origin server or other sensitive details are removed using proxy_hide_header. The add_header directive is used to add an X-Cache header to the response, which will show the status of the cache.

    # Configure caching
    proxy_cache_path /var/cache/nginx levels=1:2 keys_zone=my_cache:10m max_size=10g inactive=60m;
    proxy_cache_key "$scheme$request_method$host$request_uri";

    server {
        listen 80;
        server_name localhost;

        location /any-path-you-want/ {
            # Proxy requests to the origin server
            proxy_pass "http://cdn.stroeerdigitalgroup.de/metatag/live/beispielseite/";

            # Enable caching
            proxy_cache my_cache;
            proxy_cache_valid 200 301 302 304 30m;
            proxy_cache_use_stale error timeout updating http_500 http_502 http_503 http_504;
            proxy_cache_background_update on;
            proxy_cache_lock on;
            proxy_cache_lock_timeout 5s;

            # Modify response headers
            proxy_hide_header X-Amz-Cf-Id;
            proxy_hide_header X-Amz-Cf-Pop;
            proxy_hide_header X-Amz-Server-Side-Encryption;
            proxy_hide_header Via;
            proxy_hide_header Server;
            proxy_hide_header X-Cache;
            proxy_hide_header Alt-Svc;
            add_header X-Cache $upstream_cache_status;
        }

        location / {
            root /usr/share/nginx/html;
            index index.html index.htm;
        }
    }
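To check that the proxy cache behaves as described, the X-Cache header added by the configuration can be inspected from the command line. The following is a minimal sketch, not part of the original setup: the function names extract_cache_status and cache_status are hypothetical helpers, and it assumes curl and awk are available.

```shell
# Print the X-Cache value from raw HTTP response headers read on stdin.
extract_cache_status() {
  tr -d '\r' | awk -F': ' 'tolower($1) == "x-cache" { print $2 }'
}

# Request a proxied file with GET, discard the body, and report its cache status.
cache_status() {
  curl -s -o /dev/null -D - "$1" | extract_cache_status
}
```

Calling cache_status http://localhost/any-path-you-want/metaTag.min.js twice in a row would typically report MISS for the first request and HIT for the second, provided the origin response is cacheable. GET is used rather than HEAD because the cache key in the example includes the request method.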

Example2: Apache httpd server

For the Apache server setup, the Apache mod_proxy is used to mount a custom path and serve the MetaTag2 files via a server-side proxy. With the help of mod_cache, the files can be cached on the publisher server and updated based on the caching headers.

Modules like mod_proxy, mod_proxy_http, mod_cache, mod_cache_disk, and mod_headers are loaded using the LoadModule directive. Inside the <IfModule mod_cache_disk.c> block, the caching parameters are defined. The <VirtualHost> block defines a virtual host for the server and inside it, the <Location> block is used to set up a reverse proxy and caching rules for a specific path. Unnecessary response headers are removed using the Header unset directive.

    # Load the proxy modules if not already done in httpd.conf
    LoadModule proxy_module modules/mod_proxy.so
    LoadModule proxy_http_module modules/mod_proxy_http.so
    LoadModule cache_module modules/mod_cache.so
    LoadModule cache_disk_module modules/mod_cache_disk.so
    LoadModule headers_module modules/mod_headers.so

    <IfModule mod_cache_disk.c>
        CacheQuickHandler off
        CacheRoot "/var/cache/mod_proxy"
        CacheDirLevels 3
        CacheDirLength 5
    </IfModule>

    <VirtualHost *:80>
        ServerName localhost
        DocumentRoot /usr/local/apache2/htdocs
        LogLevel debug

        <Location "/any-path-you-want">
            # The ProxyPass directive specifies the mapping of incoming requests to an origin server
            ProxyPass http://cdn.stroeerdigitalgroup.de/metatag/live/beispielseite
            # To ensure that Location: headers generated by the origin are rewritten to point
            # to the reverse proxy instead of back to the origin, the ProxyPassReverse
            # directive is most often required:
            ProxyPassReverse http://cdn.stroeerdigitalgroup.de/metatag/live/beispielseite

            <IfModule mod_cache_disk.c>
                CacheEnable disk
                CacheHeader on
                CacheDetailHeader on
                CacheDefaultExpire 1800
                CacheMaxExpire 10800
                LogLevel cache:trace5
            </IfModule>

            <IfModule mod_headers.c>
                Header unset X-Amz-Cf-Id
                Header unset X-Amz-Cf-Pop
                Header unset X-Amz-Server-Side-Encryption
                Header unset Via
                Header unset Server
                Header unset X-Cache
                Header unset Alt-Svc
                LogLevel headers:trace5
            </IfModule>
        </Location>

        <Directory "/usr/local/apache2/htdocs">
            AllowOverride All
            # Apache 2.4 access control; the 2.2-style Order/Allow directives are
            # dropped because they need mod_access_compat, which is not loaded above
            Require all granted
        </Directory>
    </VirtualHost>

Example3: AWS CloudFront

For AWS CloudFront, an example is provided using the AWS Cloud Development Kit (AWS CDK), which is a framework for defining cloud infrastructure in code. Here, a CloudFront Distribution is set up with an HttpOrigin pointing to the t-online.de domain as an example for an existing publisher setup.

A cache behavior is added for the MetaTag2 files at the path /metatag/live/beispielseite/*, with an HttpOrigin pointing to cdn.stroeerdigitalgroup.de. The AllowedMethods and CachedMethods properties define what kind of HTTP methods are allowed and cached, respectively. This script, when run, will generate a CloudFormation template, which can be used to set up the AWS infrastructure.

    import { Stack, StackProps } from 'aws-cdk-lib';
    import { AllowedMethods, CachedMethods, Distribution } from 'aws-cdk-lib/aws-cloudfront';
    import { HttpOrigin } from 'aws-cdk-lib/aws-cloudfront-origins';
    import { Construct } from 'constructs';

    // example: https://cdn.example.com/metatag/live/beispielseite/metaTag.min.js
    export class MetatagPublisherMirrorExample1Stack extends Stack {
      constructor(scope: Construct, id: string, props?: StackProps) {
        super(scope, id, props);

        new Distribution(this, 'Cdn', {
          defaultBehavior: {
            origin: new HttpOrigin('t-online.de'),
          },
          additionalBehaviors: {
            '/metatag/live/beispielseite/*': {
              origin: new HttpOrigin('cdn.stroeerdigitalgroup.de'),
              allowedMethods: AllowedMethods.ALLOW_GET_HEAD_OPTIONS,
              cachedMethods: CachedMethods.CACHE_GET_HEAD_OPTIONS,
            },
          },
        });
      }
    }

Synthesizing this stack, for example with cdk synth, generates a CloudFormation template.

Example4: bash script with cron job

In the last example, a bash script is used to download all required MetaTag2 files to a local folder. This script, when run, fetches an index.json file containing a list of the required files. For each file in the list, the script checks if a newer version exists on the server and downloads it if necessary. This bash script can be set up as a cron job to run periodically. The files can be served with any webserver afterward.
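The steps described above could be sketched as follows. This is an illustration under stated assumptions, not the original script: it assumes that index.json sits directly under the base URL used in the other examples, that it is a flat JSON array of relative file paths, that the origin sends Last-Modified headers, and that curl and jq are installed. The function names are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch of a MetaTag2 mirror script. ASSUMPTIONS (not from the original text):
# index.json lives at $BASE_URL/index.json and is a flat JSON array of relative
# file paths; the origin sends Last-Modified; curl and jq are installed.
set -euo pipefail

BASE_URL="${BASE_URL:-http://cdn.stroeerdigitalgroup.de/metatag/live/beispielseite}"

# Accept only non-empty relative paths without "..", so that nothing is
# written outside the target directory.
is_safe_name() {
  local name="$1"
  [[ -n "$name" && "$name" != /* && "$name" != *..* ]]
}

# Download one file, but only if the server copy is newer than the local one.
sync_file() {
  local name="$1" target_dir="$2"
  local url="$BASE_URL/$name" dest="$target_dir/$name" tmp
  mkdir -p "$(dirname "$dest")"
  tmp="$(mktemp "$dest.XXXXXX")"
  if [[ -f "$dest" ]]; then
    # -z FILE sends If-Modified-Since with the local file's modification time.
    curl -fsS -R -z "$dest" -o "$tmp" "$url" || { rm -f "$tmp"; return 1; }
  else
    curl -fsS -R -o "$tmp" "$url" || { rm -f "$tmp"; return 1; }
  fi
  # An empty temp file means nothing was downloaded (304 Not Modified).
  if [[ -s "$tmp" ]]; then
    chmod 644 "$tmp"   # mktemp creates the file readable by its owner only
    mv "$tmp" "$dest"
  else
    rm -f "$tmp"
  fi
}

# Fetch the file list and sync every entry into the target directory.
sync_all() {
  local target_dir="$1" name
  mkdir -p "$target_dir"
  curl -fsS "$BASE_URL/index.json" | jq -r '.[]' | while IFS= read -r name; do
    if is_safe_name "$name"; then
      sync_file "$name" "$target_dir"
    else
      echo "skipping suspicious entry: $name" >&2
    fi
  done
}

# Run only when a target directory is passed, so the file can also be sourced.
if [[ $# -ge 1 ]]; then
  sync_all "$1"
fi
```

Each file is downloaded to a temporary file first and moved into place only when new content arrived, so the web server never serves a half-written file. If the real index.json has a different structure, only the jq filter in sync_all needs to change.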

An example cron job:
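A crontab entry along these lines would run the download every 30 minutes, matching the 30-minute cache lifetime used in the nginx and Apache examples. The script path, target directory, and log file are hypothetical placeholders, and the entry assumes the script takes the target directory as its only argument.

```crontab
# minute hour day-of-month month day-of-week command
*/30 * * * * /usr/local/bin/metatag-sync.sh /var/www/html/metatag >> /var/log/metatag-sync.log 2>&1
```

The entry can be installed for the user that owns the target directory with crontab -e.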