Data Engineer
Department: Data Analytics
Location: Mumbai, Gurgaon, hybrid
Positions: 4
What this job is about
The Data Engineer will own setting up data pipelines in the AWS environment and will enable Data & Decision Science for Product Owners, Marketing and Senior Leadership.
Responsibilities
- Assemble large, complex data sets from distinct sources into a Data Lake that meets functional and non-functional business requirements.
- Identify, design, and implement data and ETL pipelines for both stream and batch processing, including optimizing data delivery and re-designing infrastructure for greater scalability.
- Build the infrastructure required for optimal extraction, transformation, and loading of data from a wide variety of data sources using SQL, Spark, Python, Airflow and AWS ‘big data’ technologies.
- Keep our data separated and secure by adhering to infosec and regulatory guidelines on data centers, governance, usage monitoring and selection of AWS regions.
- Create data & analytics tools and platforms that assist Data & Analytics team members in building and optimizing our product into an innovative industry leader.
- Expected outcome: Deliver high-quality, scalable and streamlined data engineering pipelines and solution architecture based on business requirements
- Expected outcome: Own data quality, correctness, validation, consistency and reliability across the pipeline
- Expected outcome: Optimize infrastructure based on usage and requirements
- Expected outcome: Document and maintain production code to enable efficient root cause analysis of failures and issues
Key Skills
- Education: B.Tech., Master's or MBA degree in Business Analytics, Computer Engineering or a related field
- Experience: minimum 2 years, up to 12 years. Preference given to candidates who have worked in startups and/or the financial services domain.
- Advanced working knowledge of Python, Spark and SQL, and experience working with relational and non-relational databases
- Experience building and optimizing AWS/Cloud ‘big data’ data pipelines, architectures, solutions and data sets.
- Experience building processes that support data transformation, data structures, metadata, dependency and workload management.
- Working knowledge of message queuing, stream processing, and highly scalable ‘big data’ data stores
- Experience supporting and working with cross-functional teams in a dynamic environment
- Experience with big data tools: Hadoop, Spark, Kafka, etc.
- Experience with relational SQL and NoSQL databases, including RDS, Postgres and Cassandra
- Experience with data pipeline and workflow management tools: Azkaban, Luigi, Airflow, etc.
- Experience with AWS cloud services: EC2, EMR, Batch, Lambda, RDS, Redshift, SageMaker, AWS Glue, Athena, QuickSight, Kibana, Elasticsearch
- Experience with stream-processing systems: Storm, AWS Kinesis, Spark Streaming, etc.
Application Process
Kindly mail your resume to [email protected]
Please mention the following details in the email:
- Name
- Position applying for
- Years of Experience
- Current Location
- Notice Period