Location: Hybrid – Dallas, TX
Duration: 12+ months
Data Engineer Responsibilities:
Technology stack: ETL, advanced SQL, UNIX/Linux, Python, and cloud data platforms
• Work with business stakeholders, Business Systems Analysts and Developers to ensure quality delivery of software.
• Interact with key business functions to confirm data quality policies and governed attributes.
• Follow quality management best practices and processes to bring consistency and completeness to integration service testing
• Design and manage AWS test environments for data workflows during development and deployment of data products.
• Provide assistance to the team in Test Estimation & Test Planning
• Design and develop reports and dashboards.
• Analyze and evaluate data sources, data volumes, and business rules.
• Proficiency with SQL, familiarity with Python, Scala, Athena, EMR, Redshift and AWS.
• NoSQL and unstructured data experience.
• Extensive experience with programming frameworks such as MapReduce and HiveQL.
• Experience with data science platforms such as SageMaker, Machine Learning Studio, or H2O.
• Well versed in data flow and test strategy for cloud/on-prem ETL testing.
• Interpret and analyze data from various source systems to support data integration and reporting needs.
• Experience testing database applications to validate source-to-destination data movement and transformation.
• Work with team leads to prioritize business and information needs.
• Develop complex SQL scripts (primarily advanced SQL) for cloud and on-prem ETL.
• Develop and summarize Data Quality analysis and dashboards.
• Knowledge of data modeling and data warehousing concepts, with an emphasis on cloud/on-prem ETL.
• Execute testing of data analytics and data integration on time and within budget.
• Troubleshoot and determine the best resolution for data issues and anomalies.
• Experience in Functional Testing, Regression Testing, System Testing, Integration Testing, and End-to-End Testing.
• Deep understanding of data architecture and data modeling best practices and guidelines for different data and analytics platforms.
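As a hedged illustration of the source-to-destination validation work described above, the following is a minimal Python sketch that compares row counts and an order-independent table checksum between a source and a target database. The in-memory SQLite databases, `orders` table, and column names are hypothetical stand-ins for a real migration pair (e.g., Teradata to Redshift):

```python
import hashlib
import sqlite3

# Hypothetical in-memory databases standing in for source and target systems.
src = sqlite3.connect(":memory:")
tgt = sqlite3.connect(":memory:")
for db in (src, tgt):
    db.execute("CREATE TABLE orders (id INTEGER, amount REAL)")
src.executemany("INSERT INTO orders VALUES (?, ?)", [(1, 9.5), (2, 20.0)])
tgt.executemany("INSERT INTO orders VALUES (?, ?)", [(1, 9.5), (2, 20.0)])

def row_count(db):
    """Total rows in the table under test."""
    return db.execute("SELECT COUNT(*) FROM orders").fetchone()[0]

def checksum(db):
    """Digest of all rows in a deterministic order, so two identical
    tables produce the same hash regardless of physical row order."""
    rows = db.execute("SELECT id, amount FROM orders ORDER BY id").fetchall()
    return hashlib.sha256(repr(rows).encode()).hexdigest()

assert row_count(src) == row_count(tgt)
assert checksum(src) == checksum(tgt)
print("source-to-target validation passed")
```

In practice the same two checks (count reconciliation, then a content digest or column-level aggregate comparison) are run per table and per partition, with mismatches drilled down to individual keys.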
Requirements:
• Extensive experience in data migration is a must (Teradata to Redshift preferred).
• Extensive testing experience with SQL/Unix/Linux scripting is a must.
• Extensive experience testing cloud/on-prem ETL tools (e.g., Ab Initio, Informatica, SSIS, DataStage, Alteryx, Glue).
• Extensive experience with DBMSs such as Oracle, Teradata, SQL Server, DB2, Redshift, Postgres, and Sybase.
• Extensive experience with Python scripting and AWS cloud technologies, including Athena, EMR, and Redshift.
• Experienced in large-scale application development testing – cloud/on-prem data warehouse, data lake, and data science platforms.
• Experience with multi-year, large-scale projects
• Expert technical skills with hands-on testing experience using SQL queries.
• Extensive experience with both data migration and data transformation testing
• API/RestAssured automation, building reusable frameworks, and strong technical expertise.
• Java/JavaScript – core Java, integration, and API implementation.
• Functional/UI/Selenium – BDD (Cucumber, SpecFlow), data validation, Kafka, big data; automation experience using Cypress.
• AWS/Cloud – Jenkins, GitLab, EC2, S3; building Jenkins CI/CD pipelines; Sauce Labs.
• API/REST – REST APIs and microservices using JSON and SoapUI.
• Extensive experience in the DevOps/DataOps space, including working with build pipelines.
• Strong experience with AWS data services, including Redshift, Glue, Kinesis, Kafka (MSK), EMR/Spark, SageMaker, etc.
• Experience with technologies like Kubeflow, EKS, Docker
• Extensive experience with NoSQL and unstructured data stores such as MongoDB, Cassandra, Redis, and ZooKeeper.
• Extensive experience in MapReduce using tools like Hadoop, Hive, Pig, Kafka, S4, and MapR.
• Experience using Jenkins and Gitlab
• Experience using both Waterfall and Agile methodologies.
• Experience in testing storage systems like S3 and HDFS.
• Experience with one or more industry-standard defect or Test Case management Tools
• Great communication skills (regularly interacts with cross-functional team members).
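To give a concrete flavor of the data-quality work the requirements above imply, here is a minimal, hypothetical Python sketch of a reusable profiling check of the kind that might feed a data-quality dashboard. The `profile` function, the `orders`-style records, and the column names are illustrative, not taken from any specific framework:

```python
from collections import Counter

def profile(rows, columns):
    """Return per-column null counts and the number of duplicate rows.
    `rows` is a list of dicts; `columns` lists the fields to check."""
    nulls = {c: sum(1 for r in rows if r[c] is None) for c in columns}
    key_counts = Counter(tuple(r[c] for c in columns) for r in rows)
    duplicates = sum(n - 1 for n in key_counts.values() if n > 1)
    return {"nulls": nulls, "duplicates": duplicates}

# Hypothetical sample records with one null field and one duplicate row.
records = [
    {"id": 1, "email": "a@example.com"},
    {"id": 2, "email": None},
    {"id": 2, "email": None},  # exact duplicate of the row above
]
report = profile(records, ["id", "email"])
print(report)  # {'nulls': {'id': 0, 'email': 2}, 'duplicates': 1}
```

Checks like this are typically parameterized per table and threshold (e.g., fail the pipeline if null rates or duplicate counts exceed agreed limits) and summarized across runs on a dashboard.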