compoundexpress

What is this job about

The Data Engineer will own setting up data pipelines in the AWS environment and will enable Data & Decision Science for Product Owners, Marketing and Senior Leadership.

Responsibilities

Assemble large, complex data sets from distinct sources into a Data Lake that meets functional and non-functional business requirements.
Identify, design, and implement data and ETL pipelines (optimizing data delivery, re-designing infrastructure for greater scalability, etc.) for both stream and batch data processing.
Build the infrastructure required for optimal extraction, transformation, and loading of data from a wide variety of data sources using SQL, Spark, Python, Airflow and AWS ‘big data’ technologies.
Keep our data separated and secure by adhering to infosec and regulatory guidelines related to data centers, governance, usage monitoring and selection of AWS regions.
Create data & analytics tools and platforms that assist Data & Analytics team members in building and optimizing our product into an innovative industry leader.
Expected Outcome: Deliver high-quality, scalable and streamlined data engineering pipelines and solution architecture based on business requirements
Expected Outcome: Own data quality, correctness, validation, consistency and reliability across the pipeline
Expected Outcome: Optimize infrastructure based on usage and requirements
Expected Outcome: Document and maintain production code for efficient root cause analysis in case of failures or issues

Key Skills

Education Qualifications: B.Tech./Master's/MBA degree in Business Analytics, Computer Engineering or a related field
Experience: minimum 2 years, up to 12 years. Preference given to candidates who have worked in startups and/or the Financial Services domain.
Advanced working knowledge of Python, Spark & SQL, and experience working with relational/non-relational databases
Experience building and optimizing AWS/Cloud ‘big data’ data pipelines, architectures, solutions and data sets.
Experience building processes that support data transformation, data structures, metadata, dependency and workload management.
Working knowledge of message queuing, stream processing, and highly scalable ‘big data’ data stores
Experience supporting and working with cross-functional teams in a dynamic environment
Experience with big data tools: Hadoop, Spark, Kafka, etc.
Experience with relational SQL and NoSQL databases, including RDS, Postgres and Cassandra
Experience with data pipeline and workflow management tools: Azkaban, Luigi, Airflow, etc.
Experience with AWS cloud services: EC2, EMR, Batch, Lambda, RDS, Redshift, SageMaker, AWS Glue, Athena, QuickSight, Kibana, Elasticsearch
Experience with stream-processing systems: Storm, AWS Kinesis, Spark Streaming, etc.

Application Process

Kindly mail your resume to [email protected]
Please mention the details below in the mail:

Name
Position applying for
Years of Experience
Current Location
Notice Period