MTS1 DevOps/Site Reliability Engineer in Scottsdale at PayPal

Date Posted: 11/6/2018

Job Description

Fueled by a fundamental belief that having access to financial services creates opportunity, PayPal (NASDAQ: PYPL) is committed to democratizing financial services and empowering people and businesses to join and thrive in the global economy. Our open digital payments platform gives PayPal’s 254 million active account holders the confidence to connect and transact in new and powerful ways, whether they are online, on a mobile device, in an app, or in person. Through a combination of technological innovation and strategic partnerships, PayPal creates better ways to manage and move money, and offers choice and flexibility when sending payments, paying or getting paid. Available in more than 200 markets around the world, the PayPal platform, including Braintree, Venmo and Xoom, enables consumers and merchants to receive money in more than 100 currencies, withdraw funds in 56 currencies and hold balances in their PayPal accounts in 25 currencies.

The Monitoring Team at PayPal seeks a talented SRE to help us improve the reliability and resiliency of site monitoring. This position will be located in our Scottsdale, AZ office. Our team builds the world-class products that are used to collect, ingest, store, alert, and report on critical health metrics produced by the large-scale distributed system that runs PayPal. Our monitoring platform is built using n-tiered, scalable technologies such as Flink, Kafka, Java, GoLang, HBase, OpenTSDB, Elasticsearch and Druid.

As a site reliability engineer on our team, you are required to ensure availability of the distributed monitoring platform by managing configurations, monitoring system health, and tuning parameters to achieve optimal application performance, stability and reliability. A successful candidate will need strong programming skills, sound working knowledge of dev-sec-ops (guidelines + tools + process), an understanding of cloud technologies, automation systems, data centers and load balancing, as well as excellent communication and planning skills. In addition to reliability work, you will be asked to contribute to our software stack to better understand the gaps in implementation that can cause performance issues in production. If you are passionate about systems design, scaling beyond 99.9% reliability and working in a highly dynamic environment with a team of smart and talented engineers, then this is the job for you.

Preferred Qualifications

  • Working experience in supporting massively scalable, high-performance systems.

  • Excellent problem-solving skills.

  • Ability to identify performance bottlenecks and mitigate system failures.

  • Contribution to an open source project in operations, automation, or monitoring.

We're a purpose-driven company whose beliefs are the foundation for how we conduct business every day. We hold ourselves to our One Team Behaviors, which demand that we hold the highest ethical standards, empower an open and diverse workplace, and strive to treat everyone who is touched by our business with dignity and respect. Our employees challenge the status quo, ask questions, and find solutions. We want to break down barriers to financial empowerment. Join us as we change the way the world defines financial freedom.

  • Ensure reliability of stream-based applications that process up to 10 million data points per second, with low operational overhead and minimal data loss.

  • Passionate about mentoring team members and bringing in new technologies when necessary.

  • Prior experience in monitoring large-scale distributed systems. Demonstrated knowledge of automating most of the manual tasks around the SDLC with techniques such as packaging with Docker, provisioning with Ansible, ensuring a reliable CI/CD pipeline to build and deploy code, and automated system restarts and alerting for all critical modules.

  • Should be able to isolate errors by troubleshooting the application stack from application to framework to underlying infrastructure dependencies and network.

  • Working knowledge of DevOps container/orchestration tools (Kubernetes, Docker, Puppet, etc.).

  • Experience with cloud-native software development and other monitoring tools such as Nagios or Splunk is a big plus.

  • Automate the maintenance of systems after they go live by measuring and monitoring availability, latency and overall system health.

  • Hands-on experience in Java, Python or GoLang, NoSQL data stores like HBase and Couchbase, and working knowledge of messaging platforms like Kafka.

  • Collaborate with other engineers on code reviews, internal infrastructure improvements and process enhancements.

  • Ensure minimal operational overhead by automating maintenance tasks with easily manageable configurations, solving scalability bottlenecks to improve performance, and maximizing system availability by meeting functional and performance SLAs.

  • Hands-on knowledge of OOP/OOD/functional languages, along with a strong understanding of concurrency, parallelism, networking, data structures and algorithms.

  • Should be able to join the on-call rotation to address time-sensitive production issues and customer support.

  • Experience developing solutions for service monitoring, automated remediation, and measuring availability, reliability and performance, as well as analytics and networking.

  • Knowledge of building non-lossy data pipelines using at least one streaming technology like Flink, Samza or Spark.

  • Experience with REST APIs, Git, Docker, Jenkins, RxJava and Spring Boot.

  • Strong verbal and written communication skills.

  • The following experience is a plus:

    • Experience with any of the following monitoring tools: Grafana, TSDB or Druid, or with these types of monitoring tools: alerting, logging, tracing and time-series metrics.

  • Bachelor’s or Master’s degree or equivalent in computer science or a related field, with a minimum of 5 years of directly related work experience.


PayPal provides equal employment opportunity (EEO) to all persons regardless of age, color, national origin, citizenship status, physical or mental disability, race, religion, creed, gender, sex, pregnancy, sexual orientation, gender identity and/or expression, genetic information, marital status, status with regard to public assistance, veteran status, or any other characteristic protected by federal, state or local law. In addition, PayPal will provide reasonable accommodations for qualified individuals with disabilities.

R0039440