AI Infrastructure and Applications Manager
1 Braham Street, London, United Kingdom
Recruiter: Daniel McCarthy
Career Grade: D
Internal Closing Date: 8/2/26
Why this job matters
We’re looking for an AI Infrastructure and Applications Manager to lead a team of engineers responsible for running a suite of AI/ML applications from test through to production, covering CI/CD, deployment, monitoring, version control, optimisation and drift detection using an enterprise MLOps framework and AWS native services. You’ll also own the observability design and implementation for the serverless infrastructure behind these applications, ensuring it is fit for purpose for production operations, incident response, auditability, cost transparency and service reliability.
This is a hands-on leadership role: you’ll set technical direction, define operational standards, and coach engineers while collaborating closely with data science, product, security and platform teams. You’ll shape how AI systems are run in production: building the standards, tooling and culture that make AI/ML and agentic applications reliable, observable, secure and cost-effective at enterprise scale.
What you’ll be doing
- Lead a team of technical engineers to manage the full AI/ML application lifecycle across test/preprod/prod environments, ensuring repeatable, reliable releases.
- Implement and mature an MLOps framework covering code/data/model versioning, automated testing, release governance, rollback strategies and environment promotion controls.
- Own production readiness for AI/ML workloads: SLOs, runbooks, operational dashboards, support processes, incident response and post‑incident RCA improvements.
- Design and operate CI/CD for ML solutions using patterns such as SageMaker Model Registry, controlled approvals and secure promotion of model artefacts through environments.
- Develop a deep understanding of the underlying use cases and the data used to develop and train the models.
- Implement model monitoring (e.g. data quality, model quality, bias drift, feature attribution drift) and alerting that drives automated responses such as retraining triggers and controlled redeployments.
- Put in place drift detection, evaluation routines, and model performance reporting; partner with data science to define thresholds, baselines and acceptance criteria.
- Establish operational controls for agentic systems, such as policy boundaries, auditing of tool usage, quality evaluation and performance monitoring, aligned to enterprise requirements.
- Support production operations of generative AI applications using Amazon Bedrock and Amazon Bedrock AgentCore capabilities to deploy and operate agents securely at scale, with strong governance.
- Design and implement end‑to‑end observability for serverless services (e.g., Lambda, Step Functions, EventBridge, APIs), including structured logs, metrics, distributed traces, dashboards, alerting and correlation across workflows.
- Monitor agent behaviour, token usage/cost trends, latency, workflow health and security access patterns; drive continuous improvement and cost optimisation with FinOps-aligned reporting.
- Define standards for documentation, change management and quality gates that reduce MTTR and improve platform reliability.
The skills you’ll need
Our leadership standards
Looking in:
Leading inclusively and safely
I inspire and build trust through self-awareness, honesty and integrity.
Owning outcomes
I take the right decisions that benefit the broader organisation.
Looking out:
Delivering for the customer
I execute brilliantly on clear priorities that add value to our customers and the wider business.
Commercially savvy
I demonstrate strong commercial focus, bringing an external perspective to decision-making.
Looking to the future:
Growth mindset
I experiment and identify opportunities for growth for both myself and the organisation.
Building for the future
I build diverse future-ready teams where all individuals can be at their best.
About us
BT Group was the world’s first telco and our heritage in the sector is unrivalled. As home to several of the UK’s most recognised and cherished brands – BT, EE, Openreach and Plusnet – we have always played a critical role in creating the future, and we have reached an inflection point in the transformation of our business.
Over the next two years, we will complete the UK’s largest and most successful digital infrastructure project – connecting more than 25 million premises to full fibre broadband. Together with our heavy investment in 5G, we play a central role in revolutionising how people connect with each other.
While we are through the most capital-intensive phase of our fibre investment, meaning we can reward our shareholders for their commitment and patience, we are absolutely focused on how we organise ourselves in the best way to serve our customers in the years to come. This includes radical simplification of systems, structures, and processes on a huge scale. Together with our application of AI and technology, we are on a path to creating the UK’s best telco, reimagining the customer experience and relationship with one of this country’s biggest infrastructure companies.
Change on the scale we will all experience in the coming years is unprecedented. BT Group is committed to being the driving force behind improving connectivity for millions and there has never been a more exciting time to join a company and leadership team with the skills, experience, creativity, and passion to take this company into a new era.
A FEW POINTS TO NOTE:
Although these roles are listed as full-time, if you’re part of a job share partnership, work reduced hours, or work flexibly in any other way, please still get in touch.
We will also offer reasonable adjustments for the selection process if required, so please do not hesitate to inform us.
DON'T MEET EVERY SINGLE REQUIREMENT?
Studies have shown that women and people who are disabled, LGBTQ+, neurodiverse or from ethnic minority backgrounds are less likely to apply for jobs unless they meet every single qualification and criterion. We're committed to building a diverse, inclusive, and authentic workplace where everyone can be their best, so if you're excited about this role but your past experience doesn't align perfectly with every requirement in the Job Description, please apply anyway - you may just be the right candidate for this or other roles in our wider team.