Principal Data Engineer PE2 at Spectrum
Type: Full Time
Location: Englewood, Colorado
Charter Communications is America's fastest-growing TV, internet and voice company. We're committed to integrating the highest quality service with superior entertainment and communications products. Charter is at the intersection of technology and entertainment, facilitating essential communications that connect 24 million residential and business customers in 41 states. Our commitment to serving customers and exceeding their expectations is the bedrock of Charter's business strategy and it's the philosophy that guides our 90,000 employees.
The Advanced Engineering department resides in the Charter Technology Engineering Center (CTEC) facilities in Englewood, CO and oversees the design and architecture of Charter's multi-billion-dollar network infrastructure. We investigate, select, develop, and integrate technologies and solutions that meet the needs of the company for short-, medium- and long-term initiatives. This includes the delivery of the technology plan and future architecture for Voice, Video, Data, Optical, Commercial, Cloud, CPE, Network and Access.
Advanced Analytics has implemented and is operating a new advanced Big Data analytics platform that has enabled business-impacting self-service analytics, decision engineering support, machine learning, modeling, forecasting, and optimization. It is anticipated that by the end of 2018, there will be 2.5+ petabytes (PB) of complex analytics data sets supporting Charter's Advanced Engineering organization. This position is responsible for creating and maintaining scalable, reliable, consistent and repeatable systems that support data operations and data engineering for Advanced Analytics by receiving, processing, and monitoring raw data at scale through scripts, coding, web scraping, APIs, SQL queries, etc. Deliverables include profiles of data that measure quality, integrity, accuracy, and completeness of workflows. Success in the role requires managing the data lifecycle of multiple data sources at scale and increasing the speed to delivery by implementing automated workload and data workflow solutions.
MAJOR DUTIES AND RESPONSIBILITIES:
- Manage and operate key data systems to ensure that all data feeds are processed in a timely manner and result in high quality data.
- Identify and resolve issues within the data operations feeds and processing.
- Profile data to measure quality, integrity, accuracy, and completeness of the workflows.
- Develop and implement tools, scripts, queries, and applications for ETL/ELT and data operations.
- Use a wide variety of open source technologies, platforms, tools, applications, and cloud services.
- Produce reports, notifications, and trend analyses to uphold data delivery schedules.
- Deliver solutions by developing, testing, and implementing code and scripts via (but not limited to) Python, Perl, Shell scripts, PowerShell, etc.
- Configure, administer, and troubleshoot IP, DNS, DHCP, networking, security, and operating systems on Linux/Unix/CentOS, Windows, and macOS.
- Manage the lifecycle of multiple data sources.
- Work closely with data demand stakeholders, such as analysts and data scientists.
- Work closely with data supply domain experts and the sources systems and platforms providing data.
- Build self-monitoring, robust, scalable interfaces, and data pipelines for 24/7 operations.
- Create highly reusable code modules and packages that can be leveraged across the data pipeline.
- Increase speed to delivery by implementing workload/workflow automation solutions.
- Demonstrate an ongoing focus on enabling key business results by balancing optimal technology solutions with the needs of stakeholders.
- Deliver results through caring about customers, using metrics-driven analysis, and communicating the costs and tradeoffs of ideas to stakeholders and top management.
REQUIRED QUALIFICATIONS:
- Experience in maintaining and managing production-level data systems.
- Strong experience with SQL in on-premises and cloud environments using MySQL, PostgreSQL, IBM DB2, Oracle, SQL Server, or Teradata.
- Experience with other database and data store technologies, such as NoSQL, key-value, columnar, graph, and document; Hadoop experience a plus.
- Strong experience importing, exporting, translating, cleaning, and managing a wide range of file types, such as CSV, TSV, TXT, XLS, Spatial, JSON, XML, HTML, KML, and ZIP.
- Strong experience creating and lifecycle-managing production-level Python scripts.
- Strong background in Linux/Unix/CentOS, Windows Server, and Windows desktop support, installation, administration, and optimization; macOS experience a plus.
- Expertise in data storage and/or data movement that demonstrates knowledge of when to use files, relational databases, streaming, or a NoSQL variant.
- Ability to identify and resolve end-to-end performance, network, server, client, platform, and operating system issues.
- Well organized with a keen attention to detail and the ability to effectively prioritize and execute multiple tasks.
- Productive in both virtual and on-premises environments.
- Expert with Microsoft Office applications (Word, Outlook, PowerPoint, and Excel) or their Linux/macOS equivalents, and a self-sufficient user of Windows, Linux, and macOS desktops.
- Ability to read, write, speak and understand English.
- Familiarity with data workflow/data prep platforms, such as Alteryx, Pentaho, or KNIME.
- Familiarity with automation/configuration management using either Puppet, Chef, or equivalent.
- Experience in and with IT or technical operations at scale in a production environment.
- Knowledge of best practices and IT operations in an always-up, always-available service.
- Experience receiving, converting, and cleansing big data.
- Experience with visualization or BI tools, such as Tableau, Zoomdata, MicroStrategy, or Microsoft Power BI.
- Demonstrated success creating proof-of-concept experiments, or applying design of experiments for analytics, machine learning, or visualization tools, including hypotheses, test plans, and outcome analysis.
- Development, lifecycle management, or operations experience in a DevOps environment a plus.
- Experience in an Agile environment a plus.
EDUCATION:
- Bachelor's degree in computer science, engineering, analytics, or a data science discipline.
- Master's degree in computer science, engineering, analytics, or a data science discipline.
- Ongoing learning demonstrated through certificates from professional studies, MOOCs, seminars, online courses, or other programs completed after earning the bachelor's and master's degrees.
RELATED WORK EXPERIENCE:
- 7-10+ years of Linux/Unix/CentOS and Windows system administration; macOS experience a plus.
- 7-10+ years of hands-on working experience with RDBMS, SQL, scripting, and coding.
- Experience delivering at least one major system for which the candidate was responsible for architecture design, implementation, operation, and support.
The Spectrum brand is powered and innovated by Charter Communications. Charter Communications reaffirms its commitment to providing equal opportunities for employment and advancement to qualified employees and applicants. Individuals will be considered for positions for which they meet the minimum qualifications and are able to perform without regard to race, color, gender, age, religion, disability, national origin, veteran status, sexual orientation, gender identity, current unemployment status, or any other basis protected by federal, state or local laws.
Charter Communications is an Equal Opportunity Employer - Minority/Female/Veteran/Disability
Charter Communications will consider for employment qualified applicants with criminal histories in a manner consistent with applicable laws, including local ordinances.