Computer Scientist - II (Data Engineering)

Adobe · Enterprise · Noida, India

This role focuses on data engineering: designing and building scalable distributed data processing systems and ETL/ELT pipelines. It also involves implementing and fine-tuning generative AI capabilities, exposing data products through backend services, and optimizing system performance. Experience with GenAI/LLMs, LangChain, LlamaIndex, and vector databases is required.

What you'd actually do

  1. Design scalable distributed data processing systems.
  2. Build ETL/ELT pipelines for complex datasets.
  3. Implement and fine-tune generative AI capabilities.
  4. Develop backend services to expose data products.
  5. Improve code performance, data quality, and storage efficiency.

Skills

Required

  • Apache Spark
  • Hadoop
  • Kafka
  • Databricks/Delta Lake
  • PySpark
  • Pandas
  • NumPy
  • AWS or Azure
  • Data Modeling
  • LangChain
  • LlamaIndex
  • Vector databases
  • FastAPI, Flask, or Node.js
  • Docker
  • Kubernetes
  • CI/CD

Nice to have

  • Mentorship
  • Technical best practices

What the JD emphasized

  • Proven track record of shipping production-grade data pipelines

Other signals

  • Implement and fine-tune generative AI capabilities
  • Experience with LangChain, LlamaIndex, and vector databases