First, we need to set up a Kubernetes cluster (see the official GCP documentation for more info).
Cluster requirements
Kubernetes version >=1.23 <= 1.25
Kubernetes nodes:
- ensure you have enough resources available (we suggest a total minimum of 4 vcpu & 8GB of memory)
- ensure you can run x86-64/amd64 workloads. The arm64 architecture is currently not supported
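The version requirement above can be checked with a small helper. This is only a sketch: the function name `k8s_version_supported` is hypothetical, and it assumes the version is passed as `major.minor` without a leading `v` (a trailing `+` or patch number is tolerated). The commented commands show one possible way to read the version and the node architectures from a live cluster.

```shell
#!/usr/bin/env bash
# Return 0 when a Kubernetes "major.minor" version falls inside the
# supported range stated above (>=1.23 and <=1.25), non-zero otherwise.
# Hypothetical helper, not part of the chart.
k8s_version_supported() {
  local major minor
  major="${1%%.*}"           # text before the first dot
  minor="${1#*.}"            # text after the first dot
  minor="${minor%%[!0-9]*}"  # drop suffixes such as "+" or ".3"
  [ "$major" = "1" ] && [ -n "$minor" ] && [ "$minor" -ge 23 ] && [ "$minor" -le 25 ]
}

# Usage against a live cluster (requires kubectl and jq):
#   v="$(kubectl version -o json | jq -r '.serverVersion | "\(.major).\(.minor)"')"
#   k8s_version_supported "$v" && echo "supported" || echo "unsupported"
#
# List the CPU architecture of every node (amd64 is expected):
#   kubectl get nodes -o jsonpath='{.items[*].status.nodeInfo.architecture}'
```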
Suggestion: ensure allowVolumeExpansion is set to true in the storage class definition (this setting enables PVC resize)
PersistentVolumes can be configured to be expandable. When allowVolumeExpansion is set to true, users can resize the volume by editing the corresponding PersistentVolumeClaim object. This is useful when your storage usage grows and you want to resize the disk on the fly without having to resync data across PVCs.
To verify if your storage class allows volume expansion you can run:
kubectl get storageclass -o json | jq '.items[].allowVolumeExpansion'
true
In case it returns false, you can enable volume expansion capabilities for your storage class by running:
DEFAULT_STORAGE_CLASS=$(kubectl get storageclass -o=jsonpath='{.items[?(@.metadata.annotations.storageclass\.kubernetes\.io/is-default-class=="true")].metadata.name}')
kubectl patch storageclass "$DEFAULT_STORAGE_CLASS" -p '{"allowVolumeExpansion": true}'
storageclass.storage.k8s.io/gp2 patched
N.B:
- expanding a persistent volume is a time-consuming operation
- some platforms have a per-volume quota of one modification every 6 hours
- not all the volume types support this feature. Please take a look at the official docs for more info
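Once expansion is enabled, a resize is requested by raising the storage request on the claim. The sketch below builds the patch body for that; the helper name `pvc_resize_patch`, the claim name `my-data-claim` and the size `200Gi` are placeholders for illustration, not names taken from the chart.

```shell
#!/usr/bin/env bash
# Print the JSON patch body that requests a new size for a
# PersistentVolumeClaim. Hypothetical helper for illustration.
pvc_resize_patch() {
  printf '{"spec":{"resources":{"requests":{"storage":"%s"}}}}' "$1"
}

# Usage (placeholder claim name and size, adjust to your own claim):
#   kubectl patch pvc my-data-claim --namespace posthog -p "$(pvc_resize_patch 200Gi)"
#
# Follow the progress of the resize afterwards:
#   kubectl describe pvc my-data-claim --namespace posthog
```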
Suggestion: ensure reclaimPolicy is set to Retain in the storage class definition (this setting allows for manual reclamation of the resource)
The Retain reclaim policy allows for manual reclamation of the resource. When the PersistentVolumeClaim is deleted, the PersistentVolume still exists and the volume is considered "released". But it is not yet available for another claim because the previous claimant's data remains on the volume (see the official documentation). This can become useful in case you need to reprovision a pod/statefulset but you don't want to lose the underlying data.
To verify which reclaimPolicy your default storage class is using you can run:
kubectl get storageclass -o json | jq '.items[].reclaimPolicy'
"Retain"
If your storage class allows it, you can modify the reclaimPolicy by running:
DEFAULT_STORAGE_CLASS=$(kubectl get storageclass -o=jsonpath='{.items[?(@.metadata.annotations.storageclass\.kubernetes\.io/is-default-class=="true")].metadata.name}')
kubectl patch storageclass "$DEFAULT_STORAGE_CLASS" -p '{"reclaimPolicy": "Retain"}'
storageclass.storage.k8s.io/gp2 patched
Note: to reduce the overhead of managing stateful services like PostgreSQL, Kafka, Redis and ClickHouse yourself, we suggest running them outside Kubernetes and offloading their provisioning, building and maintenance operations:
- for PostgreSQL, take a look at Google Cloud SQL for PostgreSQL
- for Apache Kafka, take a look at Confluent Cloud
- for Redis, take a look at Google Cloud Memorystore and Redis Enterprise Cloud
- for ClickHouse, take a look at Altinity Cloud
Chart configuration
Here's the minimal required values.yaml that we'll be using later. You can find an overview of the parameters that can be configured during installation under configuration.
cloud: 'gcp'
ingress:
  hostname: <your-hostname>
Installing the chart
To install the chart using Helm with the release name posthog in the posthog namespace, run the following:
helm repo add posthog https://posthog.github.io/charts-clickhouse/
helm repo update
helm upgrade --install -f values.yaml --timeout 30m --create-namespace --namespace posthog posthog posthog/posthog --wait --wait-for-jobs --debug
Note: if you decide to use a different Helm release name or namespace, keep in mind that you might have to change several values in your values.yaml to make the installation successful. This is because we build several Kubernetes resources (like service names) using those.
Set up a static IP
Important - This must be a Global IP address. GKE will not be able to find a Regional IP address and assign it to the ingress controller's load balancer.
In GCP web console
- Open the Google Cloud Console
- Go to VPC Networks > External IP addresses
- Add a new global static IP with the name posthog
From gcloud CLI
gcloud compute addresses create posthog --global
Setting up DNS
Do not use posthog or tracking related words as your sub-domain record: As we grow, PostHog owned domains might be added to tracker blockers. To reduce the risk of tracker blockers interfering with events sent to your self-hosted instance, we suggest avoiding any combination of potentially triggering words as your sub-domain. Examples of words to avoid are: posthog, tracking, tracker, analytics, metrics.
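A candidate hostname can be screened against the example words above before the DNS record is created. This is a minimal sketch: the function name `hostname_avoids_tracker_words` is hypothetical, the word list only contains the examples given in this section, and the check is deliberately stricter than the advice because it looks at the whole hostname rather than only the sub-domain.

```shell
#!/usr/bin/env bash
# Return 0 when the hostname contains none of the example words to avoid,
# 1 otherwise. The comparison is case-insensitive.
hostname_avoids_tracker_words() {
  local host word
  host="$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')"
  for word in posthog tracking tracker analytics metrics; do
    case "$host" in
      *"$word"*) return 1 ;;
    esac
  done
  return 0
}

# Usage (example.com is a placeholder domain):
#   hostname_avoids_tracker_words "app.example.com" && echo "looks fine"
```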
Create the record of your desired hostname pointing to the address found above. After around 30 minutes (required to request, receive and deploy the TLS certificate) you should have a fully working and secure PostHog instance available at the domain record you've chosen!
Next steps
Now that your deployment is up and running, here are a couple of guides we'd recommend you check out to fully configure your instance.
- Setting up monitoring with Grafana
- Environment variables
- Securing PostHog
- Running behind proxy
- Email configuration
Troubleshooting
I cannot connect to my PostHog instance after creation
If DNS has been updated properly, check whether the SSL certificate was created successfully.
This can be done via the following command:
gcloud beta --project yourproject compute ssl-certificates list
If running the command shows the SSL cert as PROVISIONING, that means that the certificate is still being created. Read more on how to troubleshoot Google SSL certificates here.
As a troubleshooting tool, you can allow HTTP access by setting ingress.gcp.forceHttps and web.secureCookies both to false, but we recommend always accessing PostHog via HTTPS.
Upgrading the chart
To upgrade the Helm release posthog in the posthog namespace:
Get and update the Helm repo:
helm repo add posthog https://posthog.github.io/charts-clickhouse/
helm repo update
Verify if the operation is going to be a major version upgrade:
helm list -n posthog
helm search repo posthog
Compare the numbers of the Helm chart version (in the format posthog-{major}.{minor}.{patch} - for example, posthog-19.15.1) when running the commands above. If the upgrade is for a major version, check the upgrade notes before moving forward.
Run the upgrade
helm upgrade -f values.yaml --timeout 30m --namespace posthog posthog posthog/posthog --atomic --wait --wait-for-jobs --debug
Check the Helm documentation for more info about the helm upgrade command.
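The major-version comparison from the verification step can also be scripted. The sketch below assumes you copy the installed and available versions by hand from the output of helm list and helm search repo; the function names `chart_major` and `is_major_upgrade` are hypothetical. It accepts both the posthog-{major}.{minor}.{patch} form and a bare {major}.{minor}.{patch} string.

```shell
#!/usr/bin/env bash
# Print the major number of a chart version such as "posthog-19.15.1"
# or "19.15.1". Hypothetical helper for illustration.
chart_major() {
  local version
  version="${1##*-}"            # drop everything up to the last dash
  printf '%s' "${version%%.*}"  # keep the text before the first dot
}

# Return 0 when the second version has a higher major number than the first.
is_major_upgrade() {
  [ "$(chart_major "$2")" -gt "$(chart_major "$1")" ]
}

# Usage (values copied by hand from the helm output):
#   if is_major_upgrade "posthog-19.15.1" "20.0.0"; then
#     echo "major upgrade: read the upgrade notes first"
#   fi
```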
Uninstalling the chart
To uninstall the chart with the release name posthog in the posthog namespace, you can run: helm uninstall posthog --namespace posthog (take a look at the Helm docs for more info about the command).
The command above removes all the Kubernetes components associated with the chart and deletes the release.
Sometimes not everything gets properly removed. If that happens, try deleting the namespace: kubectl delete namespace posthog.
ClickHouse Configuration
By default, ClickHouse is provisioned with a standard GCP persistent disk. If you want to specify your own persistent volume claim or switch to a different type of disk, you can specify the volume claim within values.yaml.
To manually provision a disk
Use the gcloud CLI to provision a disk:
gcloud compute disks create pvc-clickhouse --type=pd-ssd --size=2048GB --zone=us-central1-c
Create the claim
In order to provide the disk to the clickhouse deployment you must first create a persistent volume and claim within the posthog namespace.
# This creates a volume claim using the same name specified within the clickhouse values file
apiVersion: v1
kind: PersistentVolume
metadata:
  name: clickhouse-volume
spec:
  persistentVolumeReclaimPolicy: Retain
  storageClassName: ""
  capacity:
    storage: 2048Gi
  accessModes:
    - ReadWriteOnce
  gcePersistentDisk:
    pdName: pvc-clickhouse
    fsType: ext4
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: clickhouse-pvc
spec:
  # It's necessary to specify "" as the storageClassName
  # so that the default storage class won't be used, see
  # https://kubernetes.io/docs/concepts/storage/persistent-volumes/#class-1
  storageClassName: ""
  volumeName: clickhouse-volume
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 2048Gi
Provide the claim to the helm chart
Add the following to your values.yaml and run helm install or upgrade:
clickhouse:
  # -- Optional: Used to manually specify a persistent volume claim. When specified the cloud specific storage class will not be provisioned
  persistentVolumeClaim: "clickhouse-pvc"