
Job Information
Cummins Inc. Data Engineer in Pune, India
DESCRIPTION
The Data Engineer supports, develops, and maintains a data and analytics platform that efficiently processes, stores, and makes data available to analysts and other consumers. The role collaborates with business and IT teams to understand requirements and to apply the technologies best suited to agile data delivery at scale.
Note: Although the role is categorized as Remote, it follows a hybrid work model.
Key Responsibilities:
Implement and automate deployment of distributed systems for ingesting and transforming data from various sources (relational, event-based, unstructured).
Develop and operate large-scale data storage and processing solutions using cloud-based platforms (e.g., Data Lakes, Hadoop, HBase, Cassandra, MongoDB, DynamoDB).
Ensure data quality and integrity through continuous monitoring and troubleshooting.
Implement data governance processes, managing metadata, access, and data retention.
Develop scalable, efficient, high-quality data pipelines with monitoring and alert mechanisms (a minimal pipeline sketch follows this list).
Design and implement physical data models and storage architectures based on best practices.
Analyze complex data elements and systems, data flow, dependencies, and relationships to contribute to conceptual, physical, and logical data models.
Participate in testing and troubleshooting of data pipelines.
Apply agile development practices such as DevOps, Scrum, and Kanban for continuous improvement of data-driven applications.
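As a rough illustration of the pipeline work described above, the following sketch reads raw JSON events, applies a simple cleansing transformation, and publishes the result to a Delta table, with a basic row-count check standing in for monitoring. The paths, columns, and table name are hypothetical placeholders, and the sketch assumes a Databricks-style environment where Delta Lake is available.

```python
# Minimal sketch, assuming a Databricks-style environment with Delta Lake available.
# Paths, schema, and table names are illustrative placeholders, not Cummins specifics.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("orders_ingest").getOrCreate()

# Ingest raw, semi-structured events (hypothetical landing path).
raw = spark.read.json("/mnt/landing/orders/")

# Transform: normalize types, drop malformed rows, add a load timestamp.
clean = (
    raw.withColumn("order_ts", F.to_timestamp("order_ts"))
       .withColumn("amount", F.col("amount").cast("double"))
       .dropna(subset=["order_id", "order_ts"])
       .withColumn("_ingested_at", F.current_timestamp())
)

# Basic quality gate before publishing: fail fast if nothing survived cleansing.
row_count = clean.count()
if row_count == 0:
    raise ValueError("No valid rows after cleansing; check upstream feed.")

# Publish to a Delta table for downstream analysts (hypothetical table name).
clean.write.format("delta").mode("append").saveAsTable("analytics.orders_clean")

print(f"Loaded {row_count} rows into analytics.orders_clean")
```

In practice the quality gate would feed a monitoring and alerting mechanism rather than simply raising an error.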
RESPONSIBILITIES
Qualifications, Skills, and Experience:
Must-Have:
2-3 years of experience in data engineering with expertise in Azure Databricks and Scala/Python.
Hands-on experience with Spark (Scala/PySpark) and SQL.
Strong understanding of Spark Streaming, Spark internals, and query optimization (see the streaming sketch after this list).
Proficiency in Azure Cloud Services.
Agile Development experience.
Experience in unit testing of ETL pipelines (see the test sketch after this list).
Expertise in creating ETL pipelines integrating ML models.
Knowledge of Big Data storage strategies (optimization and performance).
Strong problem-solving skills.
Basic understanding of data models (SQL/NoSQL), including Delta Lake or Lakehouse architectures.
Exposure to Agile software development methodologies.
Quick learner with adaptability to new technologies.
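For the Spark Streaming and Delta Lake items above, here is a minimal Structured Streaming sketch that parses events from a Kafka topic and appends them to a Delta table. The broker, topic, schema, and table name are hypothetical, and the sketch assumes the Kafka connector and Delta Lake are available on the cluster.

```python
# Minimal sketch: Structured Streaming from Kafka into a Delta table.
# Assumes a cluster with the Kafka connector and Delta Lake available;
# broker, topic, schema, and table names are hypothetical.
from pyspark.sql import SparkSession, functions as F
from pyspark.sql.types import StructType, StructField, StringType, DoubleType, TimestampType

spark = SparkSession.builder.appName("sensor_stream").getOrCreate()

event_schema = StructType([
    StructField("device_id", StringType()),
    StructField("reading", DoubleType()),
    StructField("event_ts", TimestampType()),
])

# Read raw events from a Kafka topic and parse the JSON payload.
events = (
    spark.readStream.format("kafka")
         .option("kafka.bootstrap.servers", "broker:9092")
         .option("subscribe", "sensor-readings")
         .load()
         .select(F.from_json(F.col("value").cast("string"), event_schema).alias("e"))
         .select("e.*")
)

# Continuously append parsed events to a Delta table, with checkpointing for recovery.
query = (
    events.writeStream.format("delta")
          .option("checkpointLocation", "/mnt/checkpoints/sensor-readings")
          .outputMode("append")
          .toTable("analytics.sensor_readings")
)

query.awaitTermination()
```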
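For the unit-testing item above, a minimal sketch of a pytest-based test for a small PySpark transform is shown below; the transform and its column names are illustrative, not part of the posting.

```python
# Minimal sketch of unit-testing an ETL transform with pytest and a local SparkSession.
# The transform under test and its column names are hypothetical examples.
import pytest
from pyspark.sql import SparkSession, functions as F


def dedupe_latest(df):
    """Keep the most recent record per order_id (example transform under test)."""
    latest = df.groupBy("order_id").agg(F.max("order_ts").alias("order_ts"))
    return df.join(latest, ["order_id", "order_ts"], "inner")


@pytest.fixture(scope="session")
def spark():
    return SparkSession.builder.master("local[1]").appName("etl-tests").getOrCreate()


def test_dedupe_latest_keeps_newest_row(spark):
    df = spark.createDataFrame(
        [("A", "2024-01-01"), ("A", "2024-02-01"), ("B", "2024-01-15")],
        ["order_id", "order_ts"],
    )
    result = dedupe_latest(df).collect()
    assert len(result) == 2
    assert {(r.order_id, r.order_ts) for r in result} == {("A", "2024-02-01"), ("B", "2024-01-15")}
```

Running the test against a local[1] SparkSession keeps it fast enough to run in a CI pipeline.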
Nice-to-Have:
Understanding of the ML lifecycle.
Exposure to Big Data open-source technologies.
Experience with cloud-based clustered compute implementations.
Familiarity with developing applications requiring large file movement in cloud environments.
Experience in building analytical solutions.
Exposure to IoT technology.
Competencies:
System Requirements Engineering: Translates stakeholder needs into verifiable requirements.
Collaborates: Builds partnerships and works collaboratively with others.
Communicates Effectively: Develops and delivers clear communications for various audiences.
Customer Focus: Builds strong customer relationships and delivers customer-centric solutions.
Decision Quality: Makes timely and informed decisions to drive progress.
Data Extraction: Performs ETL activities from various sources using appropriate tools and technologies.
Programming: Writes and tests computer code using industry standards, tools, and automation.
Quality Assurance Metrics: Applies measurement science to assess solution effectiveness.
Solution Documentation: Documents and communicates solutions to enable knowledge transfer.
Solution Validation Testing: Ensures configuration changes meet design and customer requirements.
Data Quality: Identifies and corrects data flaws to support governance and decision-making.
Problem Solving: Uses systematic analysis to identify and resolve issues effectively.
Values Differences: Recognizes and values diverse perspectives and cultures.
QUALIFICATIONS
Education, Licenses, and Certifications:
College, university, or equivalent degree in a relevant technical discipline, or equivalent experience, is required.
This position may require licensing for compliance with export controls or sanctions regulations.
Work Schedule:
Work primarily with stakeholders in the US, requiring a 2-3 hour overlap with EST hours as needed.
Job: Systems/Information Technology
Organization: Cummins Inc.
Role Category: Remote
Job Type: Exempt - Experienced
ReqID: 2411643
Relocation Package: No