Lead Data/AI Engineering

AT&T · Telecom · Plano, TX

The Lead Data/AI Engineer designs, develops, and optimizes data pipelines and operationalizes AI/ML models across multiple cloud platforms. The role also involves collaborating with stakeholders, ensuring data quality and security, and researching new AI/ML algorithms.

What you'd actually do

  1. Design, develop, and optimize advanced data pipelines using technologies such as Apache Spark, Kafka, Airflow, and cloud-native tools like AWS Glue, Azure Data Factory, or Google Dataflow to drive business insights and enable automation.
  2. Collaborate with data scientists, architects, and business stakeholders to transform raw data into actionable intelligence, while architecting and maintaining robust data solutions across data lakes, warehouses, and marts using platforms like Snowflake, Redshift, or BigQuery.
  3. Develop, deploy, and monitor AI/ML models with frameworks such as TensorFlow, PyTorch, and Scikit-Learn, and operationalize these models via APIs and batch or streaming services.
  4. Ensure data quality, security, and compliance by applying best practices in data governance, encryption, and access control, leveraging tools such as Apache Atlas, Collibra, or cloud security features.
  5. Research and prototype cutting-edge AI/ML algorithms, evaluate emerging technologies, and communicate findings through technical documentation, dashboards (e.g., Power BI, Tableau), and presentations.

Skills

Required

  • Apache Spark
  • Kafka
  • Airflow
  • AWS Glue
  • Azure Data Factory
  • Google Dataflow
  • Snowflake
  • Redshift
  • BigQuery
  • TensorFlow
  • PyTorch
  • Scikit-Learn
  • Python
  • SQL
  • AWS
  • Azure
  • Databricks
  • Git
  • CI/CD

Nice to have

  • Apache Atlas
  • Collibra
  • Power BI
  • Tableau

What the JD emphasized

  • progressive post-baccalaureate experience
  • Python and SQL for data engineering and AI/ML applications
  • cloud platforms (e.g., AWS, Azure) and data pipeline tools (e.g., Apache Spark, Airflow, Databricks)
  • machine learning frameworks (e.g., TensorFlow, PyTorch, Scikit-learn)
  • CI/CD and version control systems (e.g., Git)

Other signals

  • operationalize AI/ML models
  • deploy and monitor AI/ML models
  • develop advanced data pipelines