Job Description
Software Engineering Professional
Req ID:  58238
Posting Start Date:  06/05/2026
Job Function:  Software Engineering
Division:  Digital
Job Location:  IND-Bengaluru-RMZ Ecoworld
Advertised Salary:  Competitive

Posting Date: 6-May-2026

About the role

You will design, build, and own the core business-logic services of the Cognium platform: the systems that define what an agent is, who can use it, what it costs, and whether it is compliant. These services are not on the millisecond hot path, but they are the source of truth for the entire platform, so a bug in the Agent Registry or Policy Manager cascades into every invocation. You will be writing code that runs inside an enterprise's most sensitive data environment.

What you’ll be doing

Design and Development
•    Design and implement RESTful and gRPC service APIs for Agent Registry, Workspace Manager, Policy Manager, and Cost Manager following domain-driven design principles
•    Build Cedar policy integration layer — translate business policy rules into Cedar ABAC/RBAC expressions, implement dry-run simulator endpoint, manage policy versioning and rollback
•    Implement SCIM 2.0 protocol endpoints for Azure AD user and group provisioning with idempotent upsert semantics and reconciliation jobs
•    Develop event-sourced audit log producers — every state change emits a Kafka event with SHA-256 hash chain continuation for tamper-evident logging
•    Build agent lifecycle FSM enforcement in Agent Registry — Cedar-guarded state transitions (Draft → Validated → Staged → Active → Deprecated → Archived) with full transition history
•    Implement cost attribution pipeline consumers — read Kafka cost events, compute micropence-precision attribution across the org/workspace/team/agent/invocation hierarchy, persist to ClickHouse
•    Design and implement NATS JetStream consumers for real-time policy invalidation — Cedar cache flush must propagate platform-wide within 5 seconds of a policy change
•    Write unit tests (≥80% coverage), integration tests with Testcontainers, and contract tests for all API surfaces
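
To give candidates a feel for the tamper-evident audit logging mentioned above, here is a minimal sketch of a SHA-256 hash chain in Java 17. It is illustrative only and is not the platform's implementation: the class name AuditHashChain, the Entry record, the all-zero genesis value, and the "previous hash + newline + payload" input format are all assumptions, and the Kafka producer is left out entirely.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.HexFormat;
import java.util.List;

/** Hypothetical sketch of a SHA-256 hash-chained audit log (all names are illustrative). */
final class AuditHashChain {

    /** One audit entry: the event payload plus the hashes that link it into the chain. */
    record Entry(String payload, String prevHash, String hash) {}

    /** Assumed genesis value used as prevHash of the first entry: 64 zero characters. */
    static final String GENESIS = "0".repeat(64);

    private final List<Entry> entries = new ArrayList<>();

    /** Appends an event whose hash covers the previous entry's hash and the new payload. */
    Entry append(String payload) {
        String prev = entries.isEmpty() ? GENESIS : entries.get(entries.size() - 1).hash();
        Entry entry = new Entry(payload, prev, sha256Hex(prev + "\n" + payload));
        entries.add(entry);
        return entry;
    }

    /** Returns an immutable snapshot of the chain. */
    List<Entry> entries() {
        return List.copyOf(entries);
    }

    /** Recomputes every link; returns false as soon as a payload or link does not match. */
    static boolean verify(List<Entry> chain) {
        String expectedPrev = GENESIS;
        for (Entry e : chain) {
            if (!e.prevHash().equals(expectedPrev)) {
                return false;
            }
            if (!e.hash().equals(sha256Hex(e.prevHash() + "\n" + e.payload()))) {
                return false;
            }
            expectedPrev = e.hash();
        }
        return true;
    }

    /** Lower-case hex SHA-256 of the UTF-8 bytes of the input. */
    static String sha256Hex(String input) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            return HexFormat.of().formatHex(md.digest(input.getBytes(StandardCharsets.UTF_8)));
        } catch (NoSuchAlgorithmException ex) {
            // SHA-256 is mandatory on every Java platform, so this should not happen.
            throw new IllegalStateException(ex);
        }
    }
}
```

Because each hash depends on the one before it, editing any stored payload makes verify return false from that entry onward. In a real service the chain head would presumably be persisted and carried in each emitted event; that part is not shown here.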

Data and Persistence
•    Design PostgreSQL schemas with row-level security for multi-tenant isolation — all entities scoped by org_id and workspace_id
•    Write CockroachDB-compatible SQL for strongly consistent global metadata — agent manifests, Cedar policies, IAM records
•    Implement Redis-backed distributed locking and caching patterns for budget enforcement counters (atomic INCR operations) and prompt cache management
•    Write ClickHouse analytical queries for cost attribution rollup, RAGAS evaluation trending, and audit log search
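
As an illustration of the budget enforcement counters mentioned above, the sketch below shows the "increment first, then check and roll back" pattern that an atomic increment makes possible. It is a simplified assumption, not the platform's design: a java.util.concurrent AtomicLong stands in for a Redis key updated with INCRBY, and the BudgetCounter class, its method names, and the use of micropence as the unit are hypothetical.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical budget counter; an AtomicLong stands in for a Redis key updated with INCRBY. */
final class BudgetCounter {

    private final long limitMicropence;
    private final AtomicLong spentMicropence = new AtomicLong();

    BudgetCounter(long limitMicropence) {
        this.limitMicropence = limitMicropence;
    }

    /**
     * Atomically adds the cost, then undoes the addition if the limit was exceeded.
     * Returns true when the charge was accepted.
     */
    boolean tryCharge(long costMicropence) {
        if (costMicropence < 0) {
            throw new IllegalArgumentException("cost must not be negative");
        }
        long after = spentMicropence.addAndGet(costMicropence);   // analogous to INCRBY
        if (after > limitMicropence) {
            spentMicropence.addAndGet(-costMicropence);           // analogous to DECRBY
            return false;
        }
        return true;
    }

    /** Current accepted spend. */
    long spent() {
        return spentMicropence.get();
    }
}
```

The pattern never accepts a charge that takes the counter past the limit. Under contention it can briefly overshoot and reject a charge that would have fitted a moment later, which is the conservative side to err on. A Redis Lua script, as listed under the messaging skills, would be one way to make the check and the increment a single step.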

Integration and Security
•    Integrate Spring Security with JWT validation (Keycloak-issued tokens) and Cedar policy evaluation on every protected endpoint
•    Implement Azure AD SCIM 2.0 webhook receiver with signature validation, idempotency, and retry handling
•    Build Vault dynamic secret client — request tool credentials at runtime, handle lease renewal and rotation without pod restart
•    Implement data residency enforcement — workspace region tag propagates to all downstream LLM routing and storage decisions via Cedar conditions
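
For the webhook signature validation mentioned above, the posting does not say which signature scheme the receiver uses. The sketch below assumes an HMAC-SHA256 signature over the raw request body with a shared secret, sent as a hex string; the WebhookSignature class and its method names are hypothetical, and idempotency and retry handling are not shown.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.util.HexFormat;
import java.util.Locale;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

/** Hypothetical HMAC-SHA256 webhook signature check (the scheme itself is an assumption). */
final class WebhookSignature {

    private WebhookSignature() {}

    /** Computes the lower-case hex HMAC-SHA256 of the raw request body. */
    static String sign(byte[] secret, byte[] body) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(secret, "HmacSHA256"));
            return HexFormat.of().formatHex(mac.doFinal(body));
        } catch (GeneralSecurityException ex) {
            throw new IllegalStateException("HmacSHA256 unavailable or key invalid", ex);
        }
    }

    /** Compares the presented signature with the expected one in constant time. */
    static boolean isValid(byte[] secret, byte[] body, String presentedHex) {
        if (presentedHex == null) {
            return false;
        }
        byte[] expected = sign(secret, body).getBytes(StandardCharsets.US_ASCII);
        byte[] presented = presentedHex.toLowerCase(Locale.ROOT).getBytes(StandardCharsets.US_ASCII);
        // MessageDigest.isEqual avoids leaking the mismatch position through timing.
        return MessageDigest.isEqual(expected, presented);
    }
}
```

The check must run against the exact bytes received, before any JSON parsing or re-serialisation, since even a whitespace change alters the signature.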

Essential Skills / Experience

Core (Java / Spring Boot)
•    Java 17+ - records, sealed classes, virtual threads (Project Loom), structured concurrency
•    Spring Boot 3.x - Spring Data JPA, Spring Security, Spring AMQP, Spring Batch for data pipeline jobs
•    Spring Security - JWT token validation, method-level security, custom filter chains for Cedar integration
•    JPA / Hibernate - multi-tenancy patterns, discriminator columns, schema-per-tenant, entity graphs, query optimisation
•    Maven / Gradle - multi-module project structure, dependency management, reproducible builds
•    Testcontainers - integration testing with real PostgreSQL, CockroachDB, Redis, Kafka instances
•    Micrometer - custom metrics, histogram percentiles, Dynatrace OTLP export

Domain and Architecture
•    Domain-Driven Design - bounded contexts, aggregates, repositories, domain events, anti-corruption layers
•    Event sourcing - event store design, event replay, snapshot strategy, eventual consistency handling
•    CQRS - command and query responsibility segregation, read model projections from Kafka event streams
•    API design - RESTful resource modelling, OpenAPI 3.x spec-first development, backward compatibility, versioning
•    gRPC - Protobuf schema design, server and client streaming, interceptors, error propagation
•    Distributed transactions - Saga pattern with compensating actions, outbox pattern for reliable event publishing

Messaging and Streaming
•    Apache Kafka - producer/consumer patterns, exactly-once semantics, topic partitioning strategy, consumer group management, Kafka Streams for real-time aggregations, MirrorMaker2 for replication
•    NATS JetStream - subject-based messaging, push/pull consumers, durable subscriptions, message acknowledgement, key-value store for Cedar cache state
•    Redis - Cluster mode, Lua scripts for atomic budget counters, pub/sub, TTL-based cache management, distributed locking with Redlock

Databases
•    PostgreSQL 16 - row-level security, JSONB, pgvector extension, Citus sharding on tenant_id, WAL-based replication, EXPLAIN ANALYZE query tuning
•    CockroachDB - serializable isolation, geo-partitioning, CDC (Change Data Capture), distributed transactions, schema migrations with Atlas
•    Neo4j - graph data modelling, Cypher queries, agent dependency graphs, knowledge graph entity-relationship traversal
•    ClickHouse - columnar schema design for analytics, materialized views, MergeTree engine, window functions for time-series cost attribution

APIs and Protocols
•    REST API design - HATEOAS, pagination patterns, idempotency keys, error response standards, rate limiting integration
•    gRPC - bidirectional streaming, health checking, reflection, load balancing, deadline propagation
•    WebSocket / SSE - streaming agent responses to consumers in real time

Desirable Skills / Experience

Identity and Security
•    Azure AD integration - SCIM 2.0 provisioning, OIDC/SAML token validation, conditional access claims inspection, group-to-role mapping
•    HashiCorp Vault - dynamic secrets, PKI certificate issuance, AppRole auth, lease renewal, transit encryption
•    OAuth 2.0 / OIDC - token lifecycle, refresh token rotation, JWT claims validation, scope enforcement

Observability (Basics)
•    Dynatrace - OneAgent instrumentation, custom metrics via OTLP, distributed trace context propagation, DQL queries for SLO measurement, Davis AI alert configuration
•    OpenTelemetry - SDK instrumentation, span creation and attribute enrichment, OTLP export configuration, trace context propagation across service boundaries
•    Structured logging - JSON log format, trace_id and span_id correlation, log level management, sensitive data scrubbing before logging

Infrastructure and Tooling (Basics)
•    Git - trunk-based development, feature flags, conventional commits, semantic versioning, protected branch policies
•    GitLab CI/CD - pipeline YAML authoring, stage gates, environment promotion rules, GitLab Container Registry, merge request pipelines
•    Kubernetes - pod anti-affinity, resource requests and limits, liveness/readiness probes, HPA configuration, ConfigMaps and Secrets management
•    Docker - multi-stage builds, image layer optimisation, Cosign image signing, distroless base images
 

Our Package

BT Group is the UK’s leading communications group and the holding company behind some of the country’s most recognised brands – including BT, EE, Openreach and Plusnet. Our purpose is as simple as it is ambitious: we connect for good. Our customers include consumers, small, medium and large businesses, public sector organisations and other communications providers.

BT Group’s role is about setting direction, unlocking value and creating the conditions for our brands and businesses to thrive.

Having come through the most capital-intensive phase of our fibre investment, our focus now is on what comes next – simplifying how we operate, using technology and AI to work smarter, and organising ourselves to serve customers better and grow sustainably. Group teams shape strategy, policy, brand, capital allocation and transformation, helping the whole organisation perform at its best.

We have a singular culture that unites all our people: we are customer-first challengers, who are committed, clear and connected. These behaviours unite us as one team to deliver for our colleagues, our customers, our stakeholders and the country. Joining BT Group means working at the heart of a business that matters to the UK, with the opportunity to shape decisions, influence outcomes and help set the future course of one of the country’s most important companies.