First, we need to set up a Kubernetes cluster (see the official AWS documentation for more info). Follow the "Managed nodes - Linux" guide.
Cluster requirements
Kubernetes version >= 1.23 and <= 1.25
Kubernetes nodes:
- ensure you have enough resources available (we suggest a total minimum of 4 vCPUs and 8 GB of memory)
- ensure you can run x86-64/amd64 workloads. The arm64 architecture is currently not supported
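The version requirement above can be checked with a small shell helper. This is a minimal sketch: the function name `k8s_version_supported` is hypothetical, and the bounds are simply the 1.23 to 1.25 range stated above.

```shell
# Return 0 when a Kubernetes version string falls in the supported 1.23..1.25 range.
# Accepts forms such as "1.24", "v1.23.9" or "1.25+" (some providers append "+").
k8s_version_supported() {
  local version="${1#v}"            # strip an optional leading "v"
  local major="${version%%.*}"      # text before the first dot
  local rest="${version#*.}"        # text after the first dot
  local minor="${rest%%[!0-9]*}"    # leading digits of the minor version
  [ -n "$minor" ] || return 1       # reject strings without a numeric minor
  [ "$major" = "1" ] && [ "$minor" -ge 23 ] && [ "$minor" -le 25 ]
}

# Example against a live cluster (assumes kubectl access and jq are available):
#   k8s_version_supported "$(kubectl version -o json | jq -r '.serverVersion | "\(.major).\(.minor)"')"
```

The function only compares the major and minor numbers, so patch releases within the range are all accepted.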
Suggestion: ensure allowVolumeExpansion is set to True in the storage class definition (this setting enables PVC resize)
PersistentVolumes can be configured to be expandable. This feature, when set to true, allows users to resize the volume by editing the corresponding PersistentVolumeClaims object.
This can become useful in case your storage usage grows and you want to resize the disk on-the-fly without having to resync data across PVCs.
To verify if your storage class allows volume expansion, you can run:
    kubectl get storageclass -o json | jq '.items[].allowVolumeExpansion'
    true
In case it returns false, you can enable volume expansion capabilities for your storage class by running:
    DEFAULT_STORAGE_CLASS=$(kubectl get storageclass -o=jsonpath='{.items[?(@.metadata.annotations.storageclass\.kubernetes\.io/is-default-class=="true")].metadata.name}')
    kubectl patch storageclass "$DEFAULT_STORAGE_CLASS" -p '{"allowVolumeExpansion": true}'
    storageclass.storage.k8s.io/gp2 patched
N.B.:
- expanding a persistent volume is a time-consuming operation
- some platforms have a per-volume quota of one modification every 6 hours
- not all the volume types support this feature. Please take a look at the official docs for more info
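Once volume expansion is enabled, a resize comes down to raising the storage request on the PersistentVolumeClaim. The sketch below wraps that in a helper. The function name `resize_pvc` and the example claim name are hypothetical, and the real claim names depend on your installation (list them with `kubectl get pvc --namespace posthog`).

```shell
# Request a larger size for an existing PersistentVolumeClaim.
# Usage: resize_pvc <namespace> <pvc-name> <new-size>
resize_pvc() {
  if [ "$#" -ne 3 ]; then
    echo "usage: resize_pvc <namespace> <pvc-name> <new-size>" >&2
    return 2
  fi
  local namespace="$1" pvc="$2" size="$3"
  # Only spec.resources.requests.storage is changed; the storage driver
  # performs the actual expansion asynchronously.
  kubectl patch pvc "$pvc" --namespace "$namespace" \
    -p "{\"spec\":{\"resources\":{\"requests\":{\"storage\":\"${size}\"}}}}"
}

# Example (hypothetical claim name):
#   resize_pvc posthog my-data-claim 20Gi
```

Kubernetes only supports growing a claim, so the new size should be larger than the current one.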
Suggestion: ensure reclaimPolicy is set to Retain in the storage class definition (this setting allows for manual reclamation of the resource)
The Retain reclaim policy allows for manual reclamation of the resource. When the PersistentVolumeClaim is deleted, the PersistentVolume still exists and the volume is considered "released". But it is not yet available for another claim because the previous claimant's data remains on the volume (see the official documentation).
This can become useful in case you need to reprovision a pod/statefulset but you don't want to lose the underlying data.
To verify which reclaimPolicy your default storage class is using, you can run:
    kubectl get storageclass -o json | jq '.items[].reclaimPolicy'
    "Retain"
If your storage class allows it, you can modify the reclaimPolicy by running:
    DEFAULT_STORAGE_CLASS=$(kubectl get storageclass -o=jsonpath='{.items[?(@.metadata.annotations.storageclass\.kubernetes\.io/is-default-class=="true")].metadata.name}')
    kubectl patch storageclass "$DEFAULT_STORAGE_CLASS" -p '{"reclaimPolicy": "Retain"}'
    storageclass.storage.k8s.io/gp2 patched
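A storage class's reclaimPolicy is applied to volumes when they are provisioned, so PersistentVolumes that already exist keep whatever policy they were created with. The following sketch, which is not part of the chart, switches those existing volumes to Retain one by one. The function names are hypothetical, and it touches every PersistentVolume in the cluster, so narrow the selection if that is too broad for your setup.

```shell
# Print the names of PersistentVolumes whose reclaim policy is not "Retain".
# Expects "NAME POLICY" pairs on stdin, one per line.
pvs_not_retained() {
  awk '$2 != "Retain" { print $1 }'
}

# Patch every PersistentVolume that is not yet using the Retain policy.
retain_existing_pvs() {
  kubectl get pv --no-headers \
    -o custom-columns=NAME:.metadata.name,POLICY:.spec.persistentVolumeReclaimPolicy |
    pvs_not_retained |
    while read -r pv; do
      kubectl patch pv "$pv" -p '{"spec":{"persistentVolumeReclaimPolicy":"Retain"}}'
    done
}
```

Running `retain_existing_pvs` a second time is harmless, because volumes already set to Retain are filtered out before patching.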
Note: to reduce the overhead of managing stateful services like PostgreSQL, Kafka, Redis and ClickHouse yourself, we suggest running them outside Kubernetes and offloading their provisioning, building and maintenance operations:
- for PostgreSQL, take a look at AWS Aurora
- for Apache Kafka, take a look at AWS MSK and Confluent Cloud
- for Redis, take a look at AWS ElastiCache for Redis and Redis Enterprise Cloud
- for ClickHouse, take a look at Altinity Cloud
Chart configuration
Here's the minimal required values.yaml that we'll be using later. You can find an overview of the parameters that can be configured during installation under chart configuration.
    cloud: 'aws'
    ingress:
      hostname: <your-hostname>
      nginx:
        enabled: true
    cert-manager:
      enabled: true
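If you prefer to generate the file rather than edit it by hand, a heredoc is enough. This is a minimal sketch: the function name `write_values` and the hostname in the example are placeholders, and the YAML mirrors the minimal values shown above.

```shell
# Write a minimal values.yaml for the given hostname.
# Usage: write_values <hostname> [output-file]
write_values() {
  local hostname="$1" out="${2:-values.yaml}"
  if [ -z "$hostname" ]; then
    echo "usage: write_values <hostname> [output-file]" >&2
    return 2
  fi
  # The heredoc body is written verbatim, so its indentation is the YAML indentation.
  cat > "$out" <<EOF
cloud: 'aws'
ingress:
  hostname: ${hostname}
  nginx:
    enabled: true
cert-manager:
  enabled: true
EOF
}

# Example (placeholder hostname):
#   write_values app.example.com
```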
Note: if you are planning to use our GeoIP integration, please also add the snippet below to enable proxy protocol support in the load balancer and in the nginx ingress controller:
    #
    # For AWS ELB in L4 (TCP) mode, we need to enable some additional config
    # in the ingress controller in order to get the proper IP address forwarded
    # to our app. Otherwise we'll get the load balancer nodes addresses instead.
    #
    # ref:
    # - https://kubernetes.github.io/ingress-nginx/user-guide/miscellaneous/#source-ip-address
    # - https://docs.aws.amazon.com/elasticloadbalancing/latest/classic/enable-proxy-protocol.html
    #
    ingress-nginx:
      controller:
        config:
          use-proxy-protocol: true
        service:
          annotations:
            service.beta.kubernetes.io/aws-load-balancer-proxy-protocol: "*"
Installing the chart
To install the chart using Helm with the release name posthog in the posthog namespace, run the following:
    helm repo add posthog https://posthog.github.io/charts-clickhouse/
    helm repo update
    helm upgrade --install -f values.yaml --timeout 30m --create-namespace --namespace posthog posthog posthog/posthog --wait --wait-for-jobs --debug
Note: if you decide to use a different Helm release name or namespace, please keep in mind you might have to change several values in your values.yaml in order to make the installation successful. This is because we build several Kubernetes resources (like service names) using those.
Look up the address of the installation
    POSTHOG_IP=$(kubectl get --namespace posthog ingress posthog -o jsonpath="{.status.loadBalancer.ingress[0].ip}" 2> /dev/null)
    POSTHOG_HOSTNAME=$(kubectl get --namespace posthog ingress posthog -o jsonpath="{.status.loadBalancer.ingress[0].hostname}" 2> /dev/null)
    if [ -n "$POSTHOG_IP" ]; then
      POSTHOG_INSTALLATION=$POSTHOG_IP
    fi
    if [ -n "$POSTHOG_HOSTNAME" ]; then
      POSTHOG_INSTALLATION=$POSTHOG_HOSTNAME
    fi
    if [ ! -z "$POSTHOG_INSTALLATION" ]; then
      echo -e "\n----\nYour PostHog installation is available at: http://${POSTHOG_INSTALLATION}\n----\n"
    else
      echo -e "\n----\nUnable to find the address of your PostHog installation\n----\n"
    fi
Setting up DNS
Do not use posthog or tracking-related words as your sub-domain record: as we grow, PostHog-owned domains might be added to tracker blockers. To reduce the risk of tracker blockers interfering with events sent to your self-hosted instance, we suggest avoiding any combination of potentially triggering words in your sub-domain. Examples of words to avoid are: posthog, tracking, tracker, analytics, metrics.
Create the record of your desired hostname pointing to the address found above. After around 30 minutes (required to request, receive and deploy the TLS certificate) you should have a fully working and secure PostHog instance available at the domain record you've chosen!
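Before creating the record, the list of words to avoid can be turned into a quick check. This is a minimal sketch with a hypothetical function name, `subdomain_is_safe`. It assumes that only the first label of the hostname (the sub-domain) matters, and it covers only the example words listed above, which are not an exhaustive blocklist.

```shell
# Return 0 when the sub-domain (first label) of a hostname avoids the example
# words that tracker blockers may match on; return 1 otherwise.
subdomain_is_safe() {
  local label
  # Take the text before the first dot and compare case-insensitively.
  label=$(printf '%s' "${1%%.*}" | tr '[:upper:]' '[:lower:]')
  case "$label" in
    *posthog*|*tracking*|*tracker*|*analytics*|*metrics*) return 1 ;;
    *) return 0 ;;
  esac
}

# Example (placeholder hostnames):
#   subdomain_is_safe app.example.com       # succeeds
#   subdomain_is_safe analytics.example.com # fails
```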
Upgrading the chart
To upgrade the Helm release posthog in the posthog namespace:
Get and update the Helm repo:
    helm repo add posthog https://posthog.github.io/charts-clickhouse/
    helm repo update
Verify if the operation is going to be a major version upgrade:
    helm list -n posthog
    helm search repo posthog
Compare the numbers of the Helm chart version (in the format posthog-{major}.{minor}.{patch}, for example posthog-19.15.1) when running the commands above. If the upgrade is for a major version, check the upgrade notes before moving forward.
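The comparison can also be scripted. The sketch below uses two hypothetical helpers, `chart_major` and `is_major_upgrade`, which extract the major number from a chart string and compare the installed value with the available one. It assumes plain {major}.{minor}.{patch} versions with no pre-release suffix, and the version numbers in the example are made up.

```shell
# Print the major version of a chart string such as "posthog-19.15.1" or "19.15.1".
chart_major() {
  local version="${1##*-}"        # drop an optional "posthog-" style prefix
  printf '%s\n' "${version%%.*}"  # keep the text before the first dot
}

# Return 0 when the available chart has a higher major version than the installed one.
# Usage: is_major_upgrade <installed> <available>
is_major_upgrade() {
  [ "$(chart_major "$1")" -lt "$(chart_major "$2")" ]
}

# Example (made-up versions):
#   if is_major_upgrade posthog-19.15.1 30.2.0; then
#     echo "Major version upgrade: read the upgrade notes first"
#   fi
```

The installed value corresponds to the chart reported by `helm list -n posthog`, and the available value to the chart version reported by `helm search repo posthog`.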
Run the upgrade:
    helm upgrade -f values.yaml --timeout 30m --namespace posthog posthog posthog/posthog --atomic --wait --wait-for-jobs --debug
Check the Helm documentation for more info about the helm upgrade command.
Uninstalling the chart
To uninstall the chart with the release name posthog in the posthog namespace, you can run: helm uninstall posthog --namespace posthog (take a look at the Helm docs for more info about the command).
The command above removes all the Kubernetes components associated with the chart and deletes the release.
Sometimes not everything gets properly removed. If that happens, try deleting the namespace: kubectl delete namespace posthog
Next steps
Now that your deployment is up and running, here are a couple of guides we'd recommend you check out to fully configure your instance.