Data Engineer

State Farm · Insurance · Bloomington, IL +3 · Technology and UX

State Farm is seeking an experienced Data Engineer for its AIMD team, which focuses on Advanced AI, Modeling and Data. The role involves developing and maintaining scalable data solutions, acquiring and cleansing data, and supporting Databricks integration within the enterprise. The position requires proficiency in programming languages such as Python and Spark SQL, as well as experience with AWS services and distributed data processing frameworks.

What you'd actually do

  1. Utilizes industry-adopted languages and frameworks in coding, testing, security, DevOps, DataOps and data engineering practices
  2. Develops and maintains reusable, scalable, and compliant data solutions across multiple platforms and compute environments
  3. Identifies, acquires, cleanses, and profiles data, and performs ETL (extraction, transformation, and loading) on it, for use in analytic discovery and production solution deployment across multiple platforms
  4. Establishes business domain knowledge for existing State Farm data sources and investigates, recommends, and initiates acquisition of data resources, both internal and external
  5. Identifies and consults on emerging technologies and critical core systems, including techniques, tools, data sources, and platforms in the data engineering field

Skills

Required

  • 2-4 years of professional experience as a Data Engineer
  • Proficiency in programming languages such as Python, Spark SQL (or PySpark), R, Java, Bash
  • Hands-on experience with AWS services including ETL tools (Glue, EMR Serverless), Lambda, Step Functions, EventBridge, S3, DynamoDB, Kinesis Firehose, Redshift, Iceberg, and SageMaker
  • Experience with distributed data processing frameworks such as Apache Spark, Databricks
  • Experience with infrastructure as code tools such as OpenTofu (an open-source fork of Terraform)
  • Familiarity with CI/CD pipelines, including automated testing and security scans, and with tools like Airflow
  • Data access skills using SQL and Athena
  • Experience in designing, building, and maintaining data pipelines for automated data processing
  • Knowledge of data modeling techniques such as star schema and snowflake schema, with an understanding of data architecture

Nice to have

  • An 'Innovation Mindset' and the ability to quickly learn new data technologies as needs change
  • Experience with relational databases such as DB2, Postgres, Redshift, etc.
  • Experience with version control systems such as GitHub or GitLab