Senior Staff Data Architect

GE Healthcare · Healthcare · Bengaluru, Karnātaka, India · Digital Technology / IT

Seeking an experienced Data Architect to design and develop a scalable, secure, and efficient data platform for GE Healthcare. Responsibilities include designing data systems, developing data models, implementing data pipelines, conducting code reviews, overseeing self-serve data platforms, driving data governance, optimizing performance, evaluating new technologies, providing technical leadership, and communicating strategies. Requires a strong background in cloud computing, software architecture, and implementation, with hands-on coding experience in Python and SQL.

What you'd actually do

  1. Design and Architect Data Systems: Design and structure all data systems, including databases, warehouses, and lakes, focusing on security, scalability, and long-term strategy.
  2. Develop Data Models: Create and maintain logical and physical data models that translate complex business requirements into technical data specifications.
  3. Implement and Optimize Data Pipelines: Design, develop, and implement ETL/ELT processes, ensuring efficient and scalable data flow from various sources.
  4. Conduct Hands-on Code Reviews: Perform detailed code reviews for data-related projects, ensuring code quality, architectural integrity, and adherence to best practices.
  5. Oversee Self-Serve Data Platforms: Design, build, and oversee self-serve data platforms, empowering domain teams to manage their data products autonomously.

Skills

Required

  • Python
  • SQL
  • AWS services (MSK, EMR, Glue, Lambda, Redshift)
  • Kafka
  • Spark
  • data modeling
  • data lineage
  • data cataloguing
  • metadata management
  • cloud computing
  • software architecture
  • ETL/ELT processes

Nice to have

  • multi-modal LLMs
  • information retrieval
  • content management systems
  • multi-modal AI architectures
  • AWS Solution Architect certification
  • Agile development methodology

What the JD emphasized

  • modernizing legacy data architectures for cloud and AI workloads
  • hands-on coding (Python, SQL, PySpark)
  • data modeling
  • data lineage
  • data cataloguing
  • metadata management
  • HIPAA