Data Engineer

Department: Data Analytics

Location: Mumbai, Gurgaon, hybrid

Positions: 4

What is this job about

The Data Engineer will own the setup of data pipelines in the AWS environment and will enable Data & Decision Science for Product Owners, Marketing and Senior Leadership.

Responsibilities

  • Assemble large, complex data sets that meet functional and non-functional business requirements from distinct sources into a Data Lake.
  • Identify, design, and implement data and ETL pipelines for both stream and batch data processing: optimizing data delivery, re-designing infrastructure for greater scalability, etc.
  • Build the infrastructure required for optimal extraction, transformation, and loading of data from a wide variety of data sources using SQL, Spark, Python, Airflow and AWS ‘big data’ technologies.
  • Keep our data separated and secure by adhering to infosec and regulatory guidelines on data centers, governance, usage monitoring and selection of AWS regions.
  • Create data and analytics tools and platforms that assist Data & Analytics team members in building and optimizing our product into an innovative industry leader.
  • Expected outcome: Deliver high-quality, scalable and streamlined data engineering pipelines and solution architecture based on business requirements.
  • Expected outcome: Own data quality, correctness, validation, consistency and reliability across the pipeline.
  • Expected outcome: Optimize infrastructure based on usage and requirements.
  • Expected outcome: Document and maintain production code for efficient root cause analysis in case of failures or issues.

Key Skills

  • Education: B.Tech., Master's or MBA degree in Business Analytics, Computer Engineering or a related field.
  • Experience: minimum 2 years, up to 12 years. Preference given to candidates who have worked in startups and/or the Financial Services domain.
  • Advanced working knowledge of Python, Spark and SQL, and experience working with relational and non-relational databases
  • Experience building and optimizing AWS/Cloud ‘big data’ data pipelines, architectures, solutions and data sets.
  • Experience building processes that support data transformation, data structures, metadata, dependency and workload management.
  • Working knowledge of message queuing, stream processing, and highly scalable ‘big data’ data stores
  • Experience supporting and working with cross-functional teams in a dynamic environment
  • Experience with big data tools: Hadoop, Spark, Kafka, etc.
  • Experience with relational SQL and NoSQL databases, including RDS, Postgres and Cassandra
  • Experience with data pipeline and workflow management tools: Azkaban, Luigi, Airflow, etc.
  • Experience with AWS cloud services: EC2, EMR, Batch, Lambda, RDS, Redshift, SageMaker, AWS Glue, Athena, QuickSight, Kibana, Elasticsearch
  • Experience with stream-processing systems: Storm, AWS Kinesis, Spark Streaming, etc.

Application Process

Kindly mail your resume to [email protected]
Please mention the following details in the mail:

  • Name
  • Position applying for
  • Years of Experience
  • Current Location
  • Notice Period