I spent two years being the guy who provisions databases. Every Monday morning, same Slack message: “Hey, can I get a Postgres instance for the new service?” I’d open Terraform, copy a module block, change three variables, run the plan, wait for approval, apply. Twenty minutes of my life, gone. Multiply that by four teams and it adds up fast.
Then I set up Crossplane with Compositions, and now developers do it themselves with a single YAML file. Here’s how I got there and what broke along the way.
Why Not Just Terraform?
Terraform works. I’m not here to trash it. But for self-service, it has a fundamental problem: developers need access to the state, the provider credentials, and the CI pipeline that runs terraform apply. That’s a lot of trust surface for someone who just wants a database.
Crossplane flips this. It runs inside your Kubernetes cluster as a set of controllers. Developers create a custom resource, Crossplane reconciles it into real cloud resources. Same GitOps workflow they already use for their apps.
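To make the smaller trust surface concrete, here is a minimal sketch of the only permission a team (or the GitOps service account acting for it) would need. It assumes the claim API group and namespace used later in this post; the Role name and the group name team-payments-devs are hypothetical.

```yaml
# Sketch only: names marked "hypothetical" are not from the original setup.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: postgres-claim-editor        # hypothetical
  namespace: team-payments
rules:
  # Access to the claim kind only: no cloud credentials, no state files.
  - apiGroups: ["database.dedico.hu"]
    resources: ["postgresinstances"]
    verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: postgres-claim-editor        # hypothetical
  namespace: team-payments
subjects:
  - kind: Group
    name: team-payments-devs         # hypothetical
    apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: postgres-claim-editor
  apiGroup: rbac.authorization.k8s.io
```

The AWS credentials stay with the provider pods in crossplane-system, so nothing in the team namespace can reach them.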
Installing Crossplane
I run it via Helm because the marketplace operator had issues with our OPA policies:
helm repo add crossplane-stable https://charts.crossplane.io/stable
helm repo update
helm install crossplane crossplane-stable/crossplane \
  --namespace crossplane-system \
  --create-namespace \
  --set args='{"--enable-usages"}' \
  --version 1.19.0
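The --enable-usages flag is important. It turns on the Usage API, which lets Crossplane block or order deletions when one resource still depends on another. Without that protection, a careless delete can orphan cloud resources. Learned that one the expensive way when an intern deleted a CompositeResourceDefinition and we had 14 untracked RDS instances running for a week.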
Setting Up the AWS Provider
apiVersion: pkg.crossplane.io/v1
kind: Provider
metadata:
  name: provider-aws-rds
spec:
  package: xpkg.upbound.io/upbound/provider-aws-rds:v1.18.0
  runtimeConfigRef:
    name: irsa-config
I use IRSA (IAM Roles for Service Accounts) instead of static credentials. The runtime config looks like this:
apiVersion: pkg.crossplane.io/v1beta1
kind: DeploymentRuntimeConfig
metadata:
  name: irsa-config
spec:
  deploymentTemplate:
    spec:
      selector: {}
      template:
        spec:
          serviceAccountName: crossplane-provider-aws
          containers:
            - name: package-runtime
              args:
                - --poll=1m
One gotcha: the provider pods need the eks.amazonaws.com/role-arn annotation on their ServiceAccount, not on the Crossplane system SA. I spent an afternoon debugging “AccessDenied” errors because of this.
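For reference, a minimal sketch of what that annotated ServiceAccount could look like. The name matches the serviceAccountName in the runtime config above; the account ID and role name in the ARN are placeholders, not values from this setup.

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  # Must match serviceAccountName in the DeploymentRuntimeConfig.
  name: crossplane-provider-aws
  namespace: crossplane-system
  annotations:
    # Placeholder ARN: substitute your own account ID and role name.
    eks.amazonaws.com/role-arn: arn:aws:iam::111122223333:role/crossplane-provider-aws
```

The IAM role's trust policy also has to allow this exact namespace and ServiceAccount name through the cluster's OIDC provider, otherwise the same “AccessDenied” errors come back.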
The Composition: Wrapping RDS
This is where it gets interesting. A Composition is basically a template that maps a simple developer-facing API to the complex cloud resource underneath.
First, define what developers see (the XRD):
apiVersion: apiextensions.crossplane.io/v1
kind: CompositeResourceDefinition
metadata:
  name: xpostgresinstances.database.dedico.hu
spec:
  group: database.dedico.hu
  names:
    kind: XPostgresInstance
    plural: xpostgresinstances
  claimNames:
    kind: PostgresInstance
    plural: postgresinstances
  versions:
    - name: v1alpha1
      served: true
      referenceable: true
      schema:
        openAPIV3Schema:
          type: object
          properties:
            spec:
              type: object
              properties:
                size:
                  type: string
                  enum: ["small", "medium", "large"]
                  description: "small=db.t3.micro, medium=db.t3.small, large=db.t3.medium"
                teamName:
                  type: string
              required:
                - size
                - teamName
Developers pick a t-shirt size and provide their team name. That’s it. No instance class memorization, no subnet group configs, no parameter groups.
Then the Composition maps those simple inputs to real AWS resources:
apiVersion: apiextensions.crossplane.io/v1
kind: Composition
metadata:
  name: postgres-on-aws
  labels:
    provider: aws
spec:
  compositeTypeRef:
    apiVersion: database.dedico.hu/v1alpha1
    kind: XPostgresInstance
  resources:
    - name: rds-instance
      base:
        apiVersion: rds.aws.upbound.io/v1beta2
        kind: Instance
        spec:
          forProvider:
            engine: postgres
            engineVersion: "16.4"
            dbSubnetGroupName: shared-private
            vpcSecurityGroupIds:
              - sg-0abc123def456
            publiclyAccessible: false
            storageEncrypted: true
            autoMinorVersionUpgrade: true
            backupRetentionPeriod: 7
            deletionProtection: true
      patches:
        # T-shirt size -> instance class
        - type: FromCompositeFieldPath
          fromFieldPath: spec.size
          toFieldPath: spec.forProvider.instanceClass
          transforms:
            - type: map
              map:
                small: db.t3.micro
                medium: db.t3.small
                large: db.t3.medium
        # T-shirt size -> storage in GiB
        - type: FromCompositeFieldPath
          fromFieldPath: spec.size
          toFieldPath: spec.forProvider.allocatedStorage
          transforms:
            - type: map
              map:
                small: "20"
                medium: "50"
                large: "100"
            # map transforms return strings; allocatedStorage expects a number
            - type: convert
              convert:
                toType: int64
        - type: FromCompositeFieldPath
          fromFieldPath: spec.teamName
          toFieldPath: spec.forProvider.tags.Team
    - name: rds-password
      base:
        apiVersion: secretstores.crossplane.io/v1alpha1
        kind: VaultSecret
        spec:
          forProvider:
            path: database/creds
What Developers Actually Do
A developer wanting a database creates this in their app’s GitOps repo:
apiVersion: database.dedico.hu/v1alpha1
kind: PostgresInstance
metadata:
  name: user-service-db
  namespace: team-payments
spec:
  size: small
  teamName: payments
They push it, ArgoCD syncs it, Crossplane picks it up, and 5 minutes later there’s a running RDS instance with the connection string written to a Kubernetes Secret in their namespace.
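The claim above does not say where that Secret ends up. A minimal sketch of one way to name it, assuming the standard claim field writeConnectionSecretToRef is used and that the Composition exposes connection details for the RDS instance (not shown in the Composition above). The Secret name is hypothetical.

```yaml
apiVersion: database.dedico.hu/v1alpha1
kind: PostgresInstance
metadata:
  name: user-service-db
  namespace: team-payments
spec:
  size: small
  teamName: payments
  # Assumption: the Secret is created in the claim's own namespace.
  writeConnectionSecretToRef:
    name: user-service-db-conn   # hypothetical Secret name
```

The application can then mount or reference user-service-db-conn like any other Secret in team-payments.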
No Slack message. No ticket. No waiting for me.
The Things That Went Wrong
Problem 1: Drift detection is slow. The provider polls AWS on an interval (1 minute in our setup, via the --poll=1m argument in the runtime config above). If someone modifies an RDS instance through the AWS console, it can take up to a minute to catch and revert. For us that was fine, but if you need tighter drift detection, lower the --poll interval. Just watch the API rate limits.
Problem 2: Deletion ordering matters. We had a Composition that created both an RDS instance and a security group. When a developer deleted their claim, Crossplane tried to delete the security group before the RDS instance was fully gone. The security group deletion failed because it was still in use, and the whole thing got stuck in a delete loop. Fix: use the Usage feature to declare the dependency, so the security group is not deleted until the RDS instance is gone.
apiVersion: apiextensions.crossplane.io/v1alpha1
kind: Usage
metadata:
  name: rds-uses-sg
spec:
  of:
    apiVersion: ec2.aws.upbound.io/v1beta1
    kind: SecurityGroup
    resourceRef:
      name: my-sg
  by:
    apiVersion: rds.aws.upbound.io/v1beta2
    kind: Instance
    resourceRef:
      name: my-rds
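The Usage above points at fixed resource names. Inside a Composition the names are generated, so a selector is the more practical form. A minimal sketch, assuming the Usage is added as one more entry under the Composition's resources list and that the composite contains exactly one SecurityGroup and one Instance:

```yaml
# Fragment: one extra entry under spec.resources of the Composition.
- name: rds-uses-sg
  base:
    apiVersion: apiextensions.crossplane.io/v1alpha1
    kind: Usage
    spec:
      of:
        apiVersion: ec2.aws.upbound.io/v1beta1
        kind: SecurityGroup
        resourceSelector:
          # Select the SecurityGroup owned by the same composite resource.
          matchControllerRef: true
      by:
        apiVersion: rds.aws.upbound.io/v1beta2
        kind: Instance
        resourceSelector:
          # Select the Instance owned by the same composite resource.
          matchControllerRef: true
```

If a composite holds more than one resource of the same kind, matchControllerRef alone is ambiguous, and a matchLabels selector would be needed as well.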
Problem 3: Provider version upgrades can break CRDs. When I upgraded provider-aws-rds from v1.14 to v1.16, two fields changed names. All existing managed resources started showing “field not found” errors. Now I always test provider upgrades in a staging cluster first, and I pin exact versions in production.
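One way to make pinning stricter is to keep new provider revisions inactive until someone promotes them. A minimal sketch using the Provider package fields for this; whether it fits your upgrade process is an assumption, and the history limit is an arbitrary example value.

```yaml
apiVersion: pkg.crossplane.io/v1
kind: Provider
metadata:
  name: provider-aws-rds
spec:
  # Exact version tag, never a floating one.
  package: xpkg.upbound.io/upbound/provider-aws-rds:v1.18.0
  # Do not re-pull if the package is already cached.
  packagePullPolicy: IfNotPresent
  # New revisions start Inactive and must be activated by hand.
  revisionActivationPolicy: Manual
  # Keep a few old revisions around for rollback (example value).
  revisionHistoryLimit: 3
  runtimeConfigRef:
    name: irsa-config
```

With Manual activation, changing the package tag creates a new ProviderRevision that does nothing until its desired state is set to Active, which leaves room to check the staging results first.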
Cost Controls
The t-shirt size model is great for guardrails. Nobody can accidentally spin up a db.r6g.4xlarge because it’s not in the enum. But we also added a Kyverno policy as a second layer:
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: limit-postgres-size
spec:
  # Enforce blocks the request; the default (Audit) would only report it
  validationFailureAction: Enforce
  rules:
    - name: max-size-per-namespace
      match:
        any:
          - resources:
              kinds:
                - PostgresInstance
      validate:
        message: "Non-production namespaces can only use 'small' or 'medium' sizes"
        deny:
          conditions:
            all:
              - key: "{{request.object.spec.size}}"
                operator: Equals
                value: "large"
              - key: "{{request.namespace}}"
                operator: AnyNotIn
                value:
                  - prod-*
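As an illustration of what the policy is meant to catch, here is a claim that should be rejected, assuming the policy runs in enforce mode. The namespace does not match prod-* and the size is large, so both deny conditions hold. The claim name is hypothetical.

```yaml
apiVersion: database.dedico.hu/v1alpha1
kind: PostgresInstance
metadata:
  name: analytics-db          # hypothetical
  namespace: team-payments    # does not match prod-*
spec:
  size: large                 # only allowed in prod-* namespaces
  teamName: payments
```

The same manifest in a namespace such as prod-payments would pass, because the namespace condition no longer matches.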
Was It Worth It?
After three months: I went from handling 15+ infra requests per week to maybe 2 (edge cases where someone needs something outside the standard sizes). Developers are happier because they don’t wait. I’m happier because I can focus on the platform instead of being a human Terraform runner.
The setup took about two weeks of real work. Most of that was getting IRSA right and testing the Compositions against different failure scenarios. If you already run Kubernetes and use GitOps, adding Crossplane is a natural next step.
One piece of advice: start with one resource type. Get the Composition right, document it, let teams use it for a month. Then expand. Trying to build a full self-service catalog on day one is a recipe for half-finished abstractions that nobody trusts.