Overview
An internal automation platform for a major insurance client's architecture team. It replaces the manual process of creating Jenkins pipelines — which required an architect to log in, fill out the "New Item" wizard, configure Git repo / branch / credentials / Jenkinsfile path by hand, and remember the team's unwritten conventions — with a self-service Ansible Automation Platform (Tower) workflow.
Target Users: Application developers who need a new CI/CD pipeline, and the architecture team that previously owned that onboarding work.
The Workflow: A developer opens Tower, picks the "Create Jenkins Pipeline" template, fills in a survey (project name, Git repo, branch, Jenkinsfile path, pipeline type, environment, credentials), and clicks execute. Tower runs an Ansible playbook that talks to the Jenkins REST API, renders a standardized config.xml from a Jinja2 template, and either creates or updates the pipeline — then optionally triggers the first build. The whole thing takes under a minute and is fully audited in Tower's job history.
Why This Matters for an Enterprise: In a regulated insurance environment, "an architect built it by hand" is not a good answer when auditors ask how a production pipeline was configured. A parameterized, repeatable provisioning workflow that logs every execution with the requesting user, inputs, and output is. It also eliminates drift — every pipeline starts from the same template.
My Role
Sole designer and implementer of the provisioning role, Tower survey, and rollout to the client's architecture team:
- Designed the Ansible role structure (validate → crumb → check → render → create/update → optional build)
- Implemented the idempotent create-vs-update flow using Jenkins' `/api/json` probe + 200/404 branching
- Authored the Jinja2 `config.xml` template that renders one canonical Jenkins pipeline shape
- Set up Tower Job Templates with a survey form for the application developers' self-service use
- Integrated Tower RBAC with the client's LDAP groups so production-environment pipelines could only be provisioned by authorized roles
- Ran training sessions and wrote the runbook for the client's architecture team to own it after handover
Tech Stack
Automation: Ansible, Ansible Automation Platform (Tower)
Integration: Jenkins REST API, Jinja2 templating
Security: Ansible Vault, `no_log`, LDAP-backed Tower RBAC
Delivery: Tower Job Templates with self-service surveys
Architecture
The system is three logical layers, each with a distinct responsibility:
Presentation Layer — AAP Tower Survey: A Job Template with a Survey form. Developers fill in survey fields (job name, repo URL, branch, Jenkinsfile path, pipeline type, environment, credentials ID, trigger-first-build flag). Tower enforces field validation, records the requesting user, and feeds the answers as Ansible variables into the playbook. No one ever touches YAML.
Orchestration Layer — Ansible Role: A single role (jenkins_pipeline) runs six ordered tasks: validate → crumb → check_job → render_config → create_job | update_job → optional trigger_build. Each task is small and independently reviewable. Variables flow through role defaults → group vars → survey overrides.
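As a sketch of that precedence chain, the role's lowest-precedence values would live in a defaults file that group vars and survey answers override at runtime. The file below is illustrative — variable names are taken from the survey fields described in this document, but the exact defaults are assumptions:

```yaml
# defaults/main.yml — illustrative lowest-precedence values;
# group vars and Tower survey answers override these at runtime
branch: main
jenkinsfile_path: Jenkinsfile
trigger_initial_build: false
folder_name: ""
```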
Integration Layer — Jenkins REST API: The playbook calls Jenkins via its REST endpoints — /crumbIssuer/api/json for CSRF token, /job/{name}/api/json for existence check, /createItem or /job/{name}/config.xml for create/update, /job/{name}/build for optional trigger. Authentication is HTTP Basic with an API token stored in Ansible Vault.
The pipeline config.xml is rendered from a Jinja2 template that bakes in conventions the architecture team had been inconsistently applying by hand — lightweight checkout, standard branch spec, keepDependencies=false, descriptor plugin references, and a provisioning comment tagging the Ansible run ID.
Key Challenges
1. Jenkins CSRF Crumb
Jenkins enforces CSRF protection by default. Every state-changing REST call needs a per-session crumb header, which must first be fetched from the crumb issuer endpoint. Without this, createItem returns 403 with no meaningful error body, which is painful to debug.
2. Idempotent Provisioning
The same survey execution might be re-run — by accident (double-click), by design (re-apply after template update), or because the team wants to move a pipeline between folders. The provisioning flow must never create duplicate jobs and must never silently wipe existing history. A create-or-update semantic is required, not just create.
3. Inconsistent Conventions Across Projects
Before automation, every architect had slightly different ideas about folder layout, Git URL conventions, lightweight checkout settings, Jenkinsfile path locations, and which credentials to reuse. Pipelines created six months apart looked nothing alike — which made incident response and migration extremely painful.
4. Production Pipeline Access Control
The client's security policy required that pipelines targeting the production environment could only be provisioned by specific roles. The survey flow needed to honor this without letting developers simply edit the YAML to bypass it.
5. Credential Leakage Risk
Ansible playbooks that hit HTTP APIs can very easily print secrets in debug logs. In an enterprise audit context, a leaked API token or git credential in a Tower job log is a real incident.
Solutions & Design Decisions
Crumb Fetch as a Discrete Task
A dedicated crumb.yml task hits /crumbIssuer/api/json, parses the response, and sets jenkins_crumb_field / jenkins_crumb_value as facts. Every subsequent mutating call includes the crumb header. This keeps the handshake in one place — when Jenkins upgraded its CSRF behavior, only one file had to change.
Existence Probe + 200/404 Branching
Before any write, check_job.yml calls GET /job/{name}/api/json with status_code: [200, 404] (so 404 does not fail the playbook). The subsequent tasks use when: job_check.status == 404 (create) and when: job_check.status == 200 (update). This gives idempotent upsert semantics without needing external state — the source of truth is Jenkins itself.
Single Canonical config.xml Template
templates/config.xml.j2 encodes every convention the architecture team had agreed on but never written down — standard Git SCM block, */{{ branch }} branch spec, lightweight=true, descriptor plugin references, provisioned-by-Ansible marker comment. Every pipeline created through this flow is byte-identical modulo the survey inputs.
RBAC Split by Job Template
Instead of one catch-all template, Tower has separate Job Templates for dev/sit/uat and for prod. Each template is wired to different Tower teams backed by LDAP groups. Developers can self-serve the lower environments; only the architecture team can run the prod template. The playbook itself is the same — the enforcement is at the Tower permission layer, which auditors can inspect without reading Ansible.
Vault for Tokens, no_log for Tasks
Jenkins API tokens and Git credentials live in Ansible Vault variables. The HTTP URI tasks that pass them carry no_log: true so Tower's job log only records "task started / task ok", never the secret itself. The Vault password is provided to Tower via its credential store, not checked in.
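The optional build trigger follows the same auth/crumb/`no_log` pattern as the other mutating calls. A hedged sketch of what `trigger_build.yml` might look like (the file is referenced in the role but not reproduced elsewhere; Jenkins returns 201 Created when a build is queued):

```yaml
# trigger_build.yml — illustrative; mirrors the crumb + Vault + no_log pattern
- name: Trigger first build
  uri:
    url: "{{ jenkins_url }}/job/{{ job_name }}/build"
    method: POST
    user: "{{ jenkins_user }}"
    password: "{{ jenkins_token }}"   # from Ansible Vault
    force_basic_auth: true
    headers:
      "{{ jenkins_crumb_field }}": "{{ jenkins_crumb_value }}"
    status_code: 201
  when: trigger_initial_build | bool
  no_log: true
```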
Results & Impact
Setup Speed
- Pipeline provisioning time: ~15 minutes of manual clicking → under 1 minute of survey input
- New-project onboarding no longer blocked on architect availability
Consistency
- All pipelines provisioned through this flow share the same `config.xml` shape by design
- Zero configuration drift across environments (dev / sit / uat / prod all use the same template)
Auditability
- Every provisioning event recorded in Tower job history: who, when, what inputs, what output
- Vault-backed credentials never appear in logs (no_log on all secret-carrying tasks)
- RBAC split by environment enforces production-provisioning policy at the platform level
Team Impact
- Architecture team reclaimed the hours previously spent on repetitive pipeline setup
- Developers unblocked — they no longer wait on a ticket to get a pipeline
- Onboarded the architecture team to own the playbook after handover
Learnings
Upsert Beats Create-or-Fail
The first draft of this flow failed loudly on "job already exists". It was correct but useless — most real requests are re-runs after a template tweak, not first-time creations. Switching to a check → create | update pattern made the automation feel like configuration, not ceremony.
Let the Platform Enforce the Policy
I initially considered encoding the "only architects can provision prod" rule in the playbook itself (if/when checks on input role). Splitting it into separate Tower Job Templates with different RBAC was cleaner — the Ansible code stays neutral and auditors can verify policy by looking at Tower ACLs instead of reading YAML.
no_log: true is Non-Optional for Enterprise
Leaving debug output on "just for development" is how tokens end up in log aggregators. Every HTTP URI task that carries a secret has no_log: true from day one. The cost is slightly harder debugging; the cost of not doing it is an audit finding.
A Jinja Template is a Contract
Once a single config.xml.j2 is canonical, it becomes the document of record for "what does one of our pipelines look like?" Changing the template changes every newly provisioned pipeline — which is powerful, so the template lives in Git with reviews, not in an architect's home directory.
Deep Dive: The Provisioning Flow
Why Ansible + Tower Instead of a Bespoke Tool?
Writing a small web form that hits the Jenkins API would have been faster for a one-off. But the client already ran Tower for other automation workflows, had LDAP-integrated RBAC, and had invested in Tower job-log auditing. Reusing that platform meant zero new infrastructure, zero new auth story, and an audit trail the compliance team already trusted.
Role Structure
The Ansible role runs six ordered steps across seven small task files (create and update are mutually exclusive branches of one step), each intentionally small so code reviews and failures are easy to localize.
```yaml
---
# tasks/main.yml
- import_tasks: validate.yml        # required input check
- import_tasks: crumb.yml           # fetch CSRF crumb
- import_tasks: check_job.yml       # probe existence
- import_tasks: render_config.yml   # render Jinja2 template
- import_tasks: create_job.yml      # POST /createItem when absent
- import_tasks: update_job.yml      # POST /job/{name}/config.xml when present
- import_tasks: trigger_build.yml   # optional first-build trigger
```
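The `validate.yml` step itself isn't reproduced in this document; a minimal sketch of what such a fail-fast check might look like, using `ansible.builtin.assert` — variable names come from the survey table below, but the exact rules shown are illustrative:

```yaml
# validate.yml — illustrative input checks; the real rules may differ
- name: Fail fast on missing or malformed survey inputs
  ansible.builtin.assert:
    that:
      - job_name is defined and job_name | length > 0
      - repo_url is defined and repo_url | length > 0
      - pipeline_type in ['java', 'springboot', 'node', 'docker']
    fail_msg: "Survey input validation failed — refusing to touch Jenkins"
```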
Tower Survey Inputs
These are the fields developers fill in. Required fields are enforced at the Tower survey layer so no invalid invocation ever reaches the playbook.
| Field | Purpose |
|---|---|
| `job_name` | Jenkins job name |
| `repo_url` | Git repository URL |
| `branch` | Git branch (default: main) |
| `jenkinsfile_path` | Path to Jenkinsfile in repo |
| `credentials_id` | Jenkins-side Git credentials |
| `pipeline_type` | java / springboot / node / docker |
| `environment` | dev / sit / uat / prod |
| `folder_name` | Optional Jenkins folder |
| `trigger_initial_build` | Whether to kick off the first build |
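In AWX/Tower terms, these fields are stored as a survey spec attached to the Job Template. An abridged, illustrative fragment of such a spec (question wording and field subset are assumptions; the spec format may vary slightly by AWX/Tower version):

```yaml
# Abridged survey spec — illustrative, not the client's exact spec
name: Create Jenkins Pipeline
description: Self-service pipeline provisioning
spec:
  - question_name: Jenkins job name
    variable: job_name
    type: text
    required: true
  - question_name: Git branch
    variable: branch
    type: text
    default: main
    required: true
  - question_name: Pipeline type
    variable: pipeline_type
    type: multiplechoice
    choices: ["java", "springboot", "node", "docker"]
    required: true
```

Because `required` and `type` are enforced by Tower before the job launches, the playbook's own validation step only ever sees structurally sane input.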
CSRF Crumb Handshake
```yaml
---
- name: Get Jenkins CSRF crumb
  uri:
    url: "{{ jenkins_url }}/crumbIssuer/api/json"
    method: GET
    user: "{{ jenkins_user }}"
    password: "{{ jenkins_token }}"
    force_basic_auth: true
    return_content: true
    status_code: 200
  register: crumb_response
  no_log: true

- name: Expose crumb as facts
  set_fact:
    jenkins_crumb_field: "{{ crumb_response.json.crumbRequestField }}"
    jenkins_crumb_value: "{{ crumb_response.json.crumb }}"
```
Why A Discrete Crumb Task?
Inlining the crumb fetch into every mutating call would have worked, but when Jenkins 2.176+ changed how crumbs behave across sessions, only one file needed to be audited. Small, focused tasks pay for themselves on every platform upgrade.
Idempotent Upsert
The existence probe returns either 200 (exists) or 404 (absent). Both are treated as success; the downstream task picks the right branch.
```yaml
---
- name: Check whether the pipeline already exists
  uri:
    url: "{{ jenkins_url }}/job/{{ job_name }}/api/json"
    method: GET
    user: "{{ jenkins_user }}"
    password: "{{ jenkins_token }}"
    force_basic_auth: true
    status_code: [200, 404]
  register: job_check
  no_log: true
```
```yaml
# create_job.yml
- name: Create new Jenkins pipeline
  uri:
    url: "{{ jenkins_url }}/createItem?name={{ job_name | urlencode }}"
    method: POST
    user: "{{ jenkins_user }}"
    password: "{{ jenkins_token }}"
    force_basic_auth: true
    headers:
      Content-Type: "application/xml"
      "{{ jenkins_crumb_field }}": "{{ jenkins_crumb_value }}"
    src: "/tmp/{{ job_name }}_config.xml"
    status_code: [200, 201]
  when: job_check.status == 404
  no_log: true

# update_job.yml
- name: Update existing Jenkins pipeline
  uri:
    url: "{{ jenkins_url }}/job/{{ job_name }}/config.xml"
    method: POST
    user: "{{ jenkins_user }}"
    password: "{{ jenkins_token }}"
    force_basic_auth: true
    headers:
      Content-Type: "application/xml"
      "{{ jenkins_crumb_field }}": "{{ jenkins_crumb_value }}"
    src: "/tmp/{{ job_name }}_config.xml"
    status_code: 200
  when: job_check.status == 200
  no_log: true
```
Canonical Pipeline Template
One Jinja2 template encodes every agreed-upon convention. Every pipeline the flow creates is byte-identical modulo inputs.
```xml
<?xml version='1.1' encoding='UTF-8'?>
<flow-definition plugin="workflow-job">
  <description>Provisioned by Ansible Automation Platform</description>
  <keepDependencies>false</keepDependencies>
  <definition class="org.jenkinsci.plugins.workflow.cps.CpsScmFlowDefinition" plugin="workflow-cps">
    <scm class="hudson.plugins.git.GitSCM" plugin="git">
      <userRemoteConfigs>
        <hudson.plugins.git.UserRemoteConfig>
          <url>{{ repo_url }}</url>
          <credentialsId>{{ credentials_id }}</credentialsId>
        </hudson.plugins.git.UserRemoteConfig>
      </userRemoteConfigs>
      <branches>
        <hudson.plugins.git.BranchSpec>
          <name>*/{{ branch }}</name>
        </hudson.plugins.git.BranchSpec>
      </branches>
    </scm>
    <scriptPath>{{ jenkinsfile_path }}</scriptPath>
    <lightweight>true</lightweight>
  </definition>
  <triggers/>
  <disabled>false</disabled>
</flow-definition>
```
RBAC Strategy
Policy in the Platform, Not the Playbook
Separate Tower Job Templates for dev/sit/uat and prod. Each template is wired to different Tower teams backed by LDAP groups. Developers self-serve lower environments; only the architecture team can run the prod template. The playbook itself is identical — the access control lives where auditors expect to find it.
| Environment | Tower Job Template | Allowed LDAP Group |
|---|---|---|
| dev / sit / uat | provision-jenkins-pipeline | app-developers |
| prod | provision-jenkins-pipeline-prod | architecture-team |
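The grants themselves can be codified rather than clicked. An illustrative sketch using the `awx.awx` collection's `role` module — the template and team names come from the table above, but treat the exact module parameters as assumptions that depend on the collection version in use:

```yaml
# Illustrative RBAC-as-code — parameter shapes vary by awx.awx collection version
- name: Let developers execute the lower-environment template
  awx.awx.role:
    team: app-developers
    role: execute
    job_templates:
      - provision-jenkins-pipeline
    state: present

- name: Restrict the prod template to the architecture team
  awx.awx.role:
    team: architecture-team
    role: execute
    job_templates:
      - provision-jenkins-pipeline-prod
    state: present
```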
Secrets Handling
- Jenkins API token and Git credentials live in Ansible Vault variables
- Every HTTP URI task that carries a secret sets `no_log: true`
- Vault password provided to Tower via its credential store (never checked in)
- Tower job log records the `task started / task ok` line but never the Authorization header
If You Skip no_log, Secrets Leak
Ansible's default debug behavior prints the full request on task failure. In an enterprise context with centralized log aggregation, a failed uri task without no_log: true ships your API token to whatever ingests Tower logs. Treat no_log: true as mandatory on any task that touches a credential.