
Site Reliability Engineer
Industry: TV - Film
Job Posted: 3/1/2019
Location: Northeast Region
Job Status: Full Time
Experience Level: TBD

Job Description:

Requisition ID 22214

Position Summary

Our Team

As Discovery Inc’s portfolio continues to grow – around the world and across platforms – the Global Technology & Operations team is building media technology and IT systems that meet the world-class standard for which Discovery is known. GT&O builds, implements and maintains the business systems and technology that are critical for delivering Discovery’s products, while articulating the long-term technology strategy that will enable Discovery’s growing pay-TV, digital terrestrial, free-to-air and online services to reach more audiences on more platforms.

From Amsterdam to Singapore and from satellite and broadcast operations to SAP, we are driving Discovery forward on the leading edge of technology.

The Role

Reporting to the Director, Site Reliability Engineering, this position is critical to the mission of the Service & Engineering Improvement team as part of the Technology Operations Group. The core purpose of the role is to ensure that our complex technologies and systems are healthy, monitored, automated, and designed to scale. The ideal candidate will have a background as an operations generalist and will work closely with our Engineering and Product Development teams from the early stages of design all the way through identifying and resolving production issues. The ideal candidate will be passionate about an operations role that involves work across multiple IT and broadcast technologies, and will also believe that automation is a key component of operating large-scale systems.

Highlights of the role

Based in either Sterling, VA or Knoxville, TN, this role is part of a team supporting our two global Technology Operations Centers based in Sterling, VA and London, UK. Together, these centers function as the 24/7 Command & Control Hub for all our Distribution and IT support services. The position is key to driving organizational improvements, consistently improving and maintaining our availability and uptime, and establishing effective automation and monitoring to identify successes and areas of opportunity.

The postholder will partner with engineering and workforce technology teams to advocate sensible, scalable systems design and to build the best tools to diagnose, resolve and prevent issues. The postholder is an ambassador for Service Reliability Engineering and good design within GT&O and so should be a great communicator and enthusiastic champion of Technology Operations.

This position is a member of the leadership team for Technology Operations and will guide the development of the team and communicate the direction of the organization. The postholder is expected to work regular office hours but, during large events, should expect to occasionally work outside these hours, including nights and weekends.

Responsibilities


1. Serve as a primary point responsible for the overall health, performance, and capacity of one or more of our Internet-facing services
2. Gain deep knowledge of our complex applications
3. Assist in the roll-out and deployment of new product features and installations to facilitate our rapid iteration and constant growth.
4. Drive efficiencies in systems and processes: capacity planning, configuration management, performance tuning, monitoring and root cause analysis
5. Lead an objective, no-blame post-incident analysis and review process
6. Act as the operations lead for capacity planning, helping the team anticipate and prepare for growth
7. Support the end-to-end availability and performance of mission-critical services. Build automation to prevent problem recurrence. Partner with specialists to build automated responses for non-exceptional service conditions.
8. Develop reliability tools and frameworks for use by all operations teams
9. Ensure all key services are measured, monitored and raising alerts when needed
10. Partner with specialists on automating the deployment and configuration processes
11. Develop tools to improve our ability to rapidly deploy and effectively monitor custom applications in a large-scale UNIX and Windows environment.
12. Function well in a fast-paced, rapidly-changing environment.
13. Be on call when required to support our operations centers

Requirements


• Bachelor’s degree in Computer Science, Information Technology, Mathematics, Software or Broadcast Engineering, or other technical discipline, or related practical experience.
• 3+ years' experience troubleshooting in Unix/Linux
• Good programming skills in one or more of C/C++, Java, JavaScript, Python, Perl, and an ability to pick up new languages.
• Experience in the Linux environment and a good understanding of its fundamentals and internals: filesystems and modern memory management, threads and processes, the user/kernel-space divide, etc.
• Background in configuration and management of large-scale platforms (virtualization, cloud, Unix, Linux, Java, SQL, Oracle)
• A good understanding of large-scale distributed systems in practice, including multi-tier architectures, application security, monitoring and storage systems.
• Working knowledge of the TCP/IP stack, internet routing and load balancing
• Working exposure to linear and digital broadcasting and platforms preferred
• Knowledge of most of these: data structures, relational and non-relational databases, networking, Linux internals, filesystems, web architecture, and related topics
• Previous experience working with geographically-distributed coworkers.
• Strong verbal, written, interpersonal communication and customer service skills and ability to work well in a global, diverse, team-focused environment
• Good organizational and conceptual skills combined with proven critical thinking, analytic, problem solving, and decision-making abilities
• Ability to multitask within related functions
• Positive attitude and can-do mentality
• Experience working for a media company or broadcaster is desirable but not essential
• Must have the legal right to work in the US


