Building a Secure Home Lab: Infrastructure as Code, Zero Trust Secrets, and PKI from Scratch


Why I Built This

My day job rarely gave me the chance to design systems end-to-end, to own the full stack from hypervisor to certificate authority. I wanted to level up, scratch that itch, and understand system design and engineering at a deeper level through hands-on work. So I built a home lab: not a toy, but a production-style environment where every decision (secrets, TLS, networking, automation) had to be justified and implemented properly. This is what came out of it.


Architecture Overview

The lab runs on Proxmox VE with a mix of VMs and a Raspberry Pi. The core pieces:

  • Base module - Ubuntu 24.04 cloud-init template

  • K3S VM - Kubernetes server (amd64), joined by a Pi worker (arm64)

  • Wazuh VM - SIEM stack (manager, indexer, dashboard)

  • UniFi - Networks, VLANs, and WLANs managed via Terraform (home-lab-network)

On top of K3S:

  • HashiCorp Vault - Secrets engine and PKI

  • cert-manager - Certificate lifecycle, backed by Vault PKI

  • FleetDM - Device management with in-cluster MySQL and Redis

Everything is driven from a single host machine via Terraform. No secrets in .tfvars, no .pem files on disk, no credentials in the repo. The project code (Proxmox, K3S, Vault, cert-manager, Fleet) is available at homelab-iac.


Technical Implementation

1. Terraform and State

Each logical unit is a separate Terraform module with its own state file. A wrapper script (terraform.sh) in homelab-iac handles:

  • Module selection (base, k3s, wazuh, vault, cert-manager, fleet) plus a separate UniFi project for network automation

  • State path resolution for terraform-backend-git

  • Cross-module dependencies (e.g. template_vm_id from base state)

State is stored in a private GitHub repo. The backend uses terraform-backend-git to push state as commits, with locking to avoid concurrent applies. This gives version history, auditability, and remote backup without a separate state backend service.
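For readers who have not used terraform-backend-git: it exposes Terraform's standard http backend on localhost and turns state reads, writes, and locks into Git operations. A minimal sketch of what one module's backend block might look like follows; the repository URL, branch, and state path are placeholders, not the values used in this lab.

```hcl
# Sketch only: repository, ref, and state path are hypothetical placeholders.
terraform {
  backend "http" {
    # terraform-backend-git listens on localhost:6061 by default. The query
    # string selects the Git repo, branch, and file that hold this module's state.
    address        = "http://localhost:6061/?type=git&repository=git@github.com:example/tf-state&ref=main&state=k3s/terraform.tfstate"
    lock_address   = "http://localhost:6061/?type=git&repository=git@github.com:example/tf-state&ref=main&state=k3s/terraform.tfstate"
    unlock_address = "http://localhost:6061/?type=git&repository=git@github.com:example/tf-state&ref=main&state=k3s/terraform.tfstate"
  }
}
```

Because only the state path differs between modules, a wrapper script is a natural place to resolve it per module.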

2. Secrets: 1Password as the Single Source of Truth

All secrets live in 1Password. Terraform reads them via the onepassword provider and data "onepassword_item" blocks. At apply time:

  • Proxmox API credentials - Used by the Proxmox provider

  • Kubeconfig - Inlined into Helm and Kubernetes provider config so Terraform can manage K3S resources without a local kubeconfig file
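As a rough illustration of the read path, the sketch below pulls a kubeconfig from 1Password and feeds its fields straight into the Kubernetes provider. The item title, the choice of a Secure Note, and the provider authenticating through the OP_SERVICE_ACCOUNT_TOKEN environment variable are assumptions; only op_vault_id is named in this post.

```hcl
# Sketch only: item title and storage format are assumptions.
terraform {
  required_providers {
    onepassword = { source = "1Password/onepassword" }
    kubernetes  = { source = "hashicorp/kubernetes" }
  }
}

variable "op_vault_id" {
  type        = string
  description = "ID of the 1Password vault that holds lab secrets (not itself a secret)"
}

# Assumes the kubeconfig was saved as a Secure Note titled "k3s-kubeconfig".
data "onepassword_item" "kubeconfig" {
  vault = var.op_vault_id
  title = "k3s-kubeconfig"
}

locals {
  kubeconfig = yamldecode(data.onepassword_item.kubeconfig.note_value)
  cluster    = local.kubeconfig.clusters[0].cluster
  user       = local.kubeconfig.users[0].user
}

# The provider takes the decoded fields directly, so no kubeconfig file
# has to exist on disk.
provider "kubernetes" {
  host                   = local.cluster.server
  cluster_ca_certificate = base64decode(local.cluster["certificate-authority-data"])
  client_certificate     = base64decode(local.user["client-certificate-data"])
  client_key             = base64decode(local.user["client-key-data"])
}
```

The same locals can be passed to the Helm provider's kubernetes settings.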

Terraform writes secrets back to 1Password via op item create / op item edit in local-exec provisioners:

  • VM SSH keys and passwords

  • Kubeconfig (after K3S provisioning)

  • Vault unseal keys and root token

  • Vault PKI root CA certificate

  • Fleet MySQL and Redis passwords

The operator never touches raw secrets. Access is through op read or op item get --reveal when needed. This keeps the "blast radius" of secret exposure minimal and centralizes audit in 1Password.
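The write-back pattern can be sketched roughly as below. The resource names and item title are hypothetical, and the real modules may well structure this differently.

```hcl
# Sketch only: resource names and the item title are hypothetical.
variable "op_vault_id" {
  type = string
}

resource "random_password" "vm" {
  length  = 32
  special = false
}

resource "null_resource" "store_vm_password" {
  # Re-run the provisioner only when the generated password changes.
  triggers = {
    password_sha = sha256(random_password.vm.result)
  }

  provisioner "local-exec" {
    # Passing values through the environment keeps them out of the
    # command string that Terraform renders.
    environment = {
      OP_VAULT    = var.op_vault_id
      VM_PASSWORD = random_password.vm.result
    }
    command = "op item create --category=password --title='k3s-vm' --vault=\"$OP_VAULT\" \"password=$VM_PASSWORD\""
  }
}
```

Two caveats with this sketch: op item create is not idempotent, so a re-run needs op item edit instead, and the value still appears in the op process arguments. The op CLI also accepts a JSON item template on stdin, which avoids the second problem.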

3. Vault: PKI and Kubernetes Auth

Vault is deployed via Helm. A null_resource handles initialization:

  1. Wait for the Vault pod to be running

  2. Run vault operator init (idempotent: skips if already initialized)

  3. Store unseal keys and root token in 1Password

  4. Unseal immediately by applying the threshold number of keys (scripted at bootstrap; this is not Vault's KMS-backed auto-unseal)
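The four steps above might look something like the following. This is a sketch under assumptions: the vault namespace, the vault-0 pod name, the vault-init item title, the 5/3 key split, and the availability of kubectl, jq, and op on the host are all illustrative choices rather than details taken from the repo.

```hcl
# Sketch only: namespace, pod name, item title, and the 5/3 split are assumptions.
variable "op_vault_id" {
  type = string
}

resource "null_resource" "vault_init" {
  provisioner "local-exec" {
    interpreter = ["/bin/bash", "-c"]
    environment = { OP_VAULT = var.op_vault_id }
    command     = <<-EOT
      set -euo pipefail

      # 1. Wait for the Vault pod to be running.
      kubectl -n vault wait --for=jsonpath='{.status.phase}'=Running pod/vault-0 --timeout=300s

      # 2. Skip init if it was already done. `vault status` exits non-zero while
      #    sealed, so the exit code is ignored and only the JSON is inspected.
      status=$(kubectl -n vault exec vault-0 -- vault status -format=json || true)
      if [ "$(echo "$status" | jq -r '.initialized')" = "true" ]; then
        echo "Vault already initialized; skipping"
        exit 0
      fi
      init=$(kubectl -n vault exec vault-0 -- vault operator init -key-shares=5 -key-threshold=3 -format=json)

      # 3. Store the unseal keys and root token in 1Password.
      op item create --category=password --title=vault-init --vault="$OP_VAULT" \
        "root_token[password]=$(echo "$init" | jq -r '.root_token')" \
        "unseal_keys[password]=$(echo "$init" | jq -r '.unseal_keys_b64 | join(",")')" >/dev/null

      # 4. Unseal with the first three (threshold) keys.
      for key in $(echo "$init" | jq -r '.unseal_keys_b64[:3][]'); do
        kubectl -n vault exec vault-0 -- vault operator unseal "$key" >/dev/null
      done
    EOT
  }
}
```

Note that this sketch exits early when Vault is already initialized, so it does not re-unseal after a pod restart; that case needs the keys from 1Password.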

A second null_resource configures the PKI engine:

  • Enable pki secrets engine

  • Generate an internal root CA (CN=HomeLab Root CA, 10-year TTL)

  • Create role homelab for *.10.0.0.2.nip.io with server_flag=true, require_cn=false (for cert-manager compatibility)

  • Enable Kubernetes auth

  • Create policy pki-sign and K8s auth role cert-manager bound to cert-manager's ServiceAccount

  • Export the root CA and store it in 1Password
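A rough sketch of that configuration step is shown below. It assumes the vault namespace and vault-0 pod, a 1Password item named vault-init with a root_token field, a cert-manager ServiceAccount in the cert-manager namespace, and illustrative TTLs (only the 10-year root is from the text). The flags themselves are standard vault CLI.

```hcl
# Sketch only: namespace, pod name, 1Password item path, and the role/token
# TTLs are assumptions.
variable "op_vault_id" {
  type = string
}

resource "null_resource" "vault_pki" {
  provisioner "local-exec" {
    interpreter = ["/bin/bash", "-c"]
    environment = { OP_VAULT = var.op_vault_id }
    command     = <<-EOT
      set -euo pipefail
      VAULT_TOKEN=$(op read "op://$OP_VAULT/vault-init/root_token")
      # Helper: run the vault CLI inside the pod with the root token.
      v() { kubectl -n vault exec -i vault-0 -- env VAULT_TOKEN="$VAULT_TOKEN" vault "$@"; }

      # PKI engine, with the mount's max TTL raised so the 10-year root is not capped.
      v secrets enable pki || true
      v secrets tune -max-lease-ttl=87600h pki
      v write pki/root/generate/internal common_name="HomeLab Root CA" ttl=87600h >/dev/null

      # Role for *.10.0.0.2.nip.io server certificates.
      v write pki/roles/homelab allowed_domains=10.0.0.2.nip.io allow_subdomains=true \
        server_flag=true require_cn=false max_ttl=2160h

      # Kubernetes auth, a sign-only policy, and a role bound to cert-manager's ServiceAccount.
      v auth enable kubernetes || true
      v write auth/kubernetes/config kubernetes_host=https://kubernetes.default.svc
      printf 'path "pki/sign/homelab" { capabilities = ["create", "update"] }\n' | v policy write pki-sign -
      v write auth/kubernetes/role/cert-manager bound_service_account_names=cert-manager \
        bound_service_account_namespaces=cert-manager policies=pki-sign ttl=10m

      # Export the root CA certificate and store it in 1Password.
      ca=$(v read -field=certificate pki/cert/ca)
      op item create --category="Secure Note" --title=vault-root-ca --vault="$OP_VAULT" \
        "certificate[text]=$ca" >/dev/null
    EOT
  }
}
```

As written, the root generation step is not idempotent: running it twice asks Vault for a second root. A real provisioner would guard it, for example by checking whether pki/cert/ca already returns a certificate.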

Vault never sees a static token for cert-manager. cert-manager authenticates using its own ServiceAccount via Kubernetes auth, gets a short-lived token, and uses it to request certificate signing. This follows the principle of least privilege: cert-manager can only sign certificates, nothing else.

4. cert-manager and Vault PKI

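cert-manager is deployed via Helm with CRDs enabled. A null_resource then creates a ClusterIssuer named vault-pki that points at Vault's pki/sign/homelab endpoint and authenticates through Kubernetes auth.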
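A minimal sketch of such a ClusterIssuer, applied from a null_resource, might look like this. The Vault service URL, the auth mount path, and the ServiceAccount name are assumptions, and the sketch presumes kubectl on the host can reach the cluster.

```hcl
# Sketch only: Vault service URL, mount path, and ServiceAccount name are assumptions.
resource "null_resource" "cluster_issuer" {
  provisioner "local-exec" {
    interpreter = ["/bin/bash", "-c"]
    command     = <<-EOT
      kubectl apply -f - <<'YAML'
      apiVersion: cert-manager.io/v1
      kind: ClusterIssuer
      metadata:
        name: vault-pki
      spec:
        vault:
          server: http://vault.vault.svc.cluster.local:8200
          path: pki/sign/homelab
          auth:
            kubernetes:
              mountPath: /v1/auth/kubernetes
              role: cert-manager
              serviceAccountRef:
                name: cert-manager
      YAML
    EOT
  }
}
```

With serviceAccountRef, cert-manager requests a short-lived token for the named ServiceAccount, which requires RBAC permitting it to create tokens for that account.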

Any Ingress annotated with cert-manager.io/cluster-issuer: vault-pki triggers the ingress-shim to create a Certificate resource. cert-manager:

  1. Generates a private key and CSR

  2. Authenticates to Vault via Kubernetes auth

  3. POSTs the CSR to pki/sign/homelab

  4. Receives the signed certificate and full chain

  5. Stores the result in a Kubernetes TLS Secret

Traefik reads that secret and terminates TLS. Certificates auto-renew at ~2/3 of their lifetime. Vault must be unsealed for issuance and renewal; if it's sealed, existing certs keep working until expiry.
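To make the trigger concrete, here is a sketch of an Ingress that would kick off this flow. The application name, namespace, backing service, and port are hypothetical; the annotation and issuer name are the ones described above.

```hcl
# Sketch only: the app name, namespace, service, and port are hypothetical.
resource "kubernetes_ingress_v1" "example" {
  metadata {
    name      = "example"
    namespace = "default"
    annotations = {
      # ingress-shim watches for this annotation and creates the Certificate.
      "cert-manager.io/cluster-issuer" = "vault-pki"
    }
  }

  spec {
    ingress_class_name = "traefik"

    tls {
      hosts       = ["example.10.0.0.2.nip.io"]
      secret_name = "example-tls" # cert-manager writes the signed cert and key here
    }

    rule {
      host = "example.10.0.0.2.nip.io"
      http {
        path {
          path      = "/"
          path_type = "Prefix"
          backend {
            service {
              name = "example"
              port {
                number = 80
              }
            }
          }
        }
      }
    }
  }
}
```

The host must fall under the domain allowed by the homelab role, otherwise Vault rejects the signing request.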

5. FleetDM

FleetDM runs on K3S with MySQL and Redis deployed via separate Helm charts. Credentials are generated with random_password and stored in Kubernetes Secrets and 1Password. FLEET_SERVER_URL is set to the external HTTPS URL so Fleet knows how to reach itself behind the ingress.
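The credential generation could be sketched as follows. The secret name, namespace, and key are placeholders, since the actual values depend on which MySQL chart is used.

```hcl
# Sketch only: secret name, namespace, and key are placeholders.
resource "random_password" "fleet_mysql" {
  length  = 32
  special = false # avoids quoting problems in connection strings
}

resource "kubernetes_secret" "fleet_mysql" {
  metadata {
    name      = "fleet-mysql"
    namespace = "fleet"
  }

  # The provider base64-encodes values; the MySQL chart and Fleet both
  # reference this Secret instead of receiving the password inline.
  data = {
    "mysql-password" = random_password.fleet_mysql.result
  }
}
```

The Redis password follows the same shape with its own random_password resource.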

6. UniFi Network Automation

Network segmentation and wireless configuration are fully automated via a separate Terraform project: home-lab-network. It uses the paultyng/unifi provider to manage the UniFi Controller API.

What it provisions:

  • 4 networks - Home, Servers, IoT, Guest, each with its own VLAN, subnet, and DHCP range

  • 4 WLANs - SSIDs bound to each network, with WPA2-PSK and band selection (IoT is 2.4GHz-only)

  • Site configuration - UniFi site setup and DNS assignment
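One network and its WLAN might be declared roughly like this. The subnet, DHCP range, SSID, and variable name are invented for illustration; VLAN 4 and the 2.4GHz-only band for IoT come from this post.

```hcl
# Sketch only: subnet, DHCP range, SSID, and variable name are hypothetical.
terraform {
  required_providers {
    unifi = { source = "paultyng/unifi" }
  }
}

variable "iot_wlan_passphrase" {
  type      = string
  sensitive = true
}

# Default AP and user groups, which the WLAN resource requires.
data "unifi_ap_group" "default" {}
data "unifi_user_group" "default" {}

resource "unifi_network" "iot" {
  name    = "IoT"
  purpose = "corporate"

  vlan_id      = 4
  subnet       = "10.0.4.0/24"
  dhcp_enabled = true
  dhcp_start   = "10.0.4.100"
  dhcp_stop    = "10.0.4.200"
}

resource "unifi_wlan" "iot" {
  name       = "homelab-iot"
  security   = "wpapsk"
  passphrase = var.iot_wlan_passphrase
  network_id = unifi_network.iot.id
  wlan_band  = "2g" # IoT devices are kept on 2.4GHz only

  ap_group_ids  = [data.unifi_ap_group.default.id]
  user_group_id = data.unifi_user_group.default.id
}
```

The other three networks repeat this pair with different VLAN IDs and subnets, which makes them a reasonable candidate for for_each over a map.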

The project uses terraform-backend-git for state (same pattern as the Proxmox modules), with wrapper scripts (tf-init.sh, tf-plan.sh, tf-apply.sh, tf-destroy.sh) that pass -var-file terraform.tfvars. Credentials and WLAN passwords stay in terraform.tfvars (or a secrets vault); the repo recommends never committing that file.

Network layout:

VLAN   Purpose   Notes
0      Home      Personal devices
2      Servers   Proxmox, K3S, Wazuh, DNS
3      Guest     Isolated
4      IoT       2.4GHz, isolated

K3S and Wazuh live on the Servers VLAN. Guest and IoT are firewalled from server resources. Firewall rules are managed directly in the UniFi Console for now; the Terraform config handles networks and WLANs. This keeps network topology version-controlled and reproducible while limiting lateral movement if a device is compromised.

7. nip.io and TLS

Internal services use *.10.0.0.2.nip.io for DNS. nip.io resolves any *.10.0.0.2.nip.io name to 10.0.0.2, so no local DNS server is needed. TLS is mandatory: all ingresses use Vault-signed certificates. Browsers and agents (e.g. Fleet orbit) must trust the Vault PKI root CA, either by adding it to the system trust store or by passing it via --fleet-certificate when building agent packages.


Security Posture

No Secrets in Repo or State

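  • Terraform configuration in Git holds no hard-coded secrets; the onepassword provider fetches them at plan/apply time. Terraform can still record values read by data sources or generated by random_password in state, which is one reason the state repo is kept private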

  • .tfvars holds only non-sensitive config (e.g. op_vault_id)

  • SSH keys, passwords, tokens, and kubeconfig live in 1Password

  • random_password outputs are marked sensitive and not printed in logs

Defense in Depth

  • VLANs - Network segmentation reduces blast radius

  • TLS everywhere - All services behind Traefik use HTTPS with proper cert chains

  • Vault unseal - Root key material is split (e.g. 5 shares, 3 threshold); no single point of compromise

  • Kubernetes auth - cert-manager uses short-lived tokens, not static credentials

Least Privilege

  • Vault policy pki-sign allows only pki/sign/homelab; cert-manager cannot read other secrets or modify Vault config

  • K8s auth role cert-manager is bound to a specific ServiceAccount and namespace

  • Each Terraform module has a narrow scope; cross-module coupling is explicit (e.g. base → k3s via template_vm_id)
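For reference, a policy with the scope described in the first bullet can be as small as the sketch below. The exact policy text in the repo may differ; this is simply the minimal form that permits signing against one role.

```hcl
# Sketch of the pki-sign policy: cert-manager may submit CSRs to one role
# and nothing else.
path "pki/sign/homelab" {
  capabilities = ["create", "update"]
}
```

Vault policies are deny-by-default, so nothing else needs to be listed: reading other secrets engines, issuing from other roles, and changing Vault configuration are all refused for tokens that carry only this policy.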

Auditability

  • Terraform state in Git provides a history of infrastructure changes

  • 1Password logs access to items

  • Wazuh provides SIEM coverage for the Wazuh VM and any forwarded logs

Operational Security

  • Vault seals on pod restart; unsealing requires manual application of threshold keys (or future auto-unseal via Transit/KMS)

  • Certificate renewal depends on Vault being unsealed; operators are prompted to unseal before renewal windows

  • Fleet agent packages are built with the CA cert baked in, so endpoints can validate the Fleet server without trusting the CA system-wide (optional)


What I Learned

Building this forced me to think through:

  • Secret zero - Where does the first secret come from? 1Password + service account tokens.

  • Bootstrap ordering - cert-manager needs Vault; Vault needs to be unsealed; PKI setup needs the root token. The dependency graph is explicit in Terraform.

  • Certificate chains - SANs vs CN, EKU, chain completeness. Real-world TLS is fussy; Vault PKI with require_cn=false and proper roles made it work.

  • Network-as-code - Treating VLANs and WLANs as Terraform resources means changes are auditable and rollbacks are straightforward; the UniFi Controller becomes another managed endpoint.

The result is a lab that behaves like a small production environment: reproducible, auditable, and built with security as a first-class concern. It scratches the itch, and then some. This remains an evolving project as I venture deeper into security and keep updating my skills.

Code: homelab-iac (Proxmox, K3S, Vault, cert-manager, Fleet) | home-lab-network (UniFi)
