Software Engineering Professional (SRE)
Date: Nov 13, 2024
Location: RMZ Ecoworld, Devarabeesanahal, Bengaluru, India
Company: BT Group
Why BT?
We’ve always been an organisation with purpose: to use the power of communications to make a better world. You can trace this back to our beginning as pioneers of the world’s first telecommunications company. At our heart we’re a technology company with research and innovation in our bones and a desire to be personal, simple, and brilliant for our customers. Those are the values we live by, whilst also creating an inclusive working environment where people from all backgrounds can succeed.
Our pursuit of progress over the past 180 years has established BT as a strong, successful brand with huge scale, capable of achieving great things: from supporting emergency services, hospitals and banks and keeping economies around the world online, safe and secure, to delivering large-scale technology infrastructure like the creation of BT Sport.
Today, in this fast-changing, always-on digital world, our purpose remains true. Yet the market conditions, regulation and competition we face are tougher than ever before. So if you have the drive, optimism and resilience to help propel us forward, we’ll offer unrivalled personal development and a wealth of opportunities to learn, experience new things and pursue new careers. If that’s you and what you’re looking for, we’d love you to be part of our future.
Why this job matters
We are seeking a Site Reliability Engineer who is passionate about maintaining and enhancing the reliability, scalability, and efficiency of our systems. The ideal candidate will have a strong background in system monitoring, cross-functional collaboration, and operational process improvement. The SRE should be able to guide the team in adopting best practices and innovative solutions, ensuring top-notch service reliability and customer satisfaction.
What you’ll be doing
• Lead the enhancement of our alerting, monitoring, and reporting systems to proactively address and mitigate potential site reliability issues.
• Design and develop advanced monitoring dashboards, utilizing technologies such as Splunk, Kibana, and Dynatrace, to provide comprehensive insights into system performance and operational health.
• Collaborate with development teams to build scalable and reliable software solutions, aiming to minimize the need for refactoring and modifications.
• Standardize the performance and planning environment to ensure our systems can scale effectively, accommodating new features and user growth.
• Automate repetitive tasks and promote a culture of continuous improvement and an iterative mindset to enhance operational efficiency and team productivity.
• Analyse application patterns and analytics to inform and improve service level objectives, aligning with strategic business goals.
• Evolve AIOps and NoOps capabilities, integrating self-healing and autonomic features to address complex operational challenges with precision.
• Thrive in a diverse, team-focused environment, working alongside SREs, Engineers, and Product Managers to achieve shared goals and foster a culture of inclusivity and mutual respect.
Required Skills and Qualifications:
• Bachelor’s degree in Computer Science or a related field.
• Proficiency in SQL, shell scripting, and PL/SQL for database management and automation tasks.
• Strong programming skills in languages such as Python, Java, or Ruby.
• Familiarity with distributed storage technologies and resource management frameworks.
• Demonstrated leadership in an SRE role, with a focus on agile practices and team management.
• Excellent communication and collaboration skills.
Bonus Points for (but not essential):
• Holding one or more certifications from Google or other cloud platforms.
• Certification in the ITIL framework.
• Experience in the Telecom industry.
• Experience leading a software engineering team.
• Exposure to the Oracle Siebel product.