Staff Data Engineer/Surgical Robotics - Ottava

Johnson & Johnson · Pharma · Santa Clara, CA +1

Staff Data Engineer role focused on building data foundations and pipelines for manufacturing and supply chain operations, connecting shop-floor systems to cloud solutions. Responsibilities include data architecture, ETL/ELT pipeline development using Databricks and AWS, data quality, observability, and ensuring compliance with validation standards.

What you'd actually do

  1. Own data architecture and integration strategy for asset and configuration data across discovery systems, provisioning tools, cloud platforms, monitoring systems, and internal services.
  2. Design, implement, and operate ETL/ELT pipelines using Databricks (Spark, Delta Lake, Workflows), Python/PySpark, and AWS S3 to ingest, normalize, validate, reconcile, and serve manufacturing data (robot logs, telemetry) into analytics-ready datasets with defined SLAs for freshness, throughput, and availability.
  3. Build scalable, reliable batch and streaming data pipelines from OT sources (e.g., OPC UA, MQTT, MES) into cloud platforms with Python, SQL, Spark/Databricks, and AWS S3. Include schema/contract management, partitioning/retention, and delivery‑semantics design to meet throughput and latency targets.
  4. Create highly reliable APIs and data services for engineers, IT, and business systems to query and manage data at scale.
  5. Establish data quality, test automation, observability, and alerting (data checks, pipeline metrics, dashboards, lineage) to ensure production reliability and audit readiness.

Skills

Required

  • Databricks (Spark, Delta Lake, Databricks Workflows)
  • Optimizing PySpark jobs
  • AWS S3
  • Python
  • PySpark
  • SQL
  • Designing performant ETL/ELT
  • Data governance
  • Data quality frameworks
  • Data lineage
  • Validation/compliance (e.g., GxP, CSV)
  • Agile methodologies
  • Integrating OT data sources (OPC UA, MQTT, MES)

Nice to have

  • AWS core data services
  • Kinesis/MSK
  • Real-time streaming
  • ML integration for manufacturing

What the JD emphasized

  • data governance
  • data quality frameworks
  • data lineage
  • validation/compliance (e.g., GxP, CSV)