Senior Software Engineer, Data Ingestion Platform

Block · Fintech · CA · Remote · 10409 Engineering - AIDA

Senior Software Engineer role focused on building and operating data ingestion platforms for Block's AI, Data & Analytics (AIDA) organization. The role involves designing and developing next-generation data ingestion infrastructure, including Kafka Iceberg connectors and database replication pipelines, to ensure reliable data availability for analytics, machine learning, and AI initiatives. Responsibilities include modernizing the CDC platform, consolidating ingestion paths, and implementing self-service tooling and observability features. Experience with streaming data systems, change data capture (CDC), data lakehouse architectures, and modern table formats is required.

What you'd actually do

  1. Design, build, and operate scalable data replication and ingestion pipelines that move data from production databases, event streams, and third-party sources into Block's Lakehouse.
  2. Develop and enhance Kafka Iceberg connectors and data loading frameworks, enabling reliable, low-latency data delivery to Snowflake and Databricks.
  3. Drive the modernization of Block's CDC platform by evaluating and implementing next-generation approaches for database replication, including cloud-native alternatives and Iceberg-based ingestion patterns.
  4. Build self-service tooling and observability features that empower internal teams to onboard, monitor, and troubleshoot their own data pipelines with minimal support.
  5. Collaborate with data engineering, platform infrastructure, and product teams to define data contracts, improve service encapsulation, and reduce tight coupling between operational databases and analytics consumers.
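To make the CDC replication work in items 1–3 concrete, here is a minimal, illustrative sketch of applying Debezium-style change events to a target table. The event envelope fields (`op`, `before`, `after`) follow Debezium's conventions; the in-memory table and the `apply_change` helper are simplifications for illustration, not Block's implementation.

```python
# Sketch: apply Debezium-style CDC events to an in-memory table keyed by "id".
# Envelope fields ("op", "before", "after") follow Debezium conventions;
# the dict-backed table is a stand-in for a real sink such as Iceberg.

def apply_change(table: dict, event: dict, key: str = "id") -> None:
    """Apply a single change event to a table keyed by `key`."""
    op = event["op"]
    if op in ("c", "r", "u"):       # create, snapshot read, update
        row = event["after"]
        table[row[key]] = row
    elif op == "d":                 # delete
        row = event["before"]
        table.pop(row[key], None)

events = [
    {"op": "c", "before": None, "after": {"id": 1, "email": "a@example.com"}},
    {"op": "u", "before": {"id": 1, "email": "a@example.com"},
                "after": {"id": 1, "email": "b@example.com"}},
    {"op": "d", "before": {"id": 1, "email": "b@example.com"}, "after": None},
]

table: dict = {}
for ev in events:
    apply_change(table, ev)
```

A production pipeline adds the hard parts this sketch omits: ordering guarantees, schema evolution, and exactly-once delivery into the lakehouse tables.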

Skills

Required

  • Java
  • Python
  • Scala
  • Go
  • Apache Kafka
  • Kafka Connect
  • Debezium
  • Change Data Capture (CDC)
  • Database replication
  • Data lakehouse architectures
  • Apache Iceberg
  • Delta Lake
  • AWS
  • GCP
  • Azure
  • Terraform

Nice to have

  • Snowflake
  • Databricks
  • Apache Spark
  • Apache Airflow

What the JD emphasized

  • 8+ years of experience in software engineering or data platform development, with a focus on building scalable data systems or distributed infrastructure.
  • Strong programming proficiency in languages such as Java, Python, Scala, or Go, with experience developing data frameworks, libraries, or services.
  • Hands-on experience with streaming data systems and technologies such as Apache Kafka, Kafka Connect, or similar distributed messaging platforms.
  • Solid understanding of Change Data Capture (CDC), database replication patterns, and data lake or Lakehouse architectures.
  • Experience with modern data storage formats and table formats such as Apache Iceberg or Delta Lake.
  • Experience with cloud-based data ecosystems (AWS, GCP, or Azure) and infrastructure-as-code tools.
  • Design and implement solutions for PII detection, masking, and privacy-compliant data handling within ingestion pipelines, ensuring sensitive data is properly classified, protected, and governed in accordance with Block's privacy policies and regulatory requirements (e.g., GDPR, CCPA).
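The last bullet's PII requirement can be sketched as field-level masking applied inside an ingestion step. The classification map and the choice of hashing versus redaction below are illustrative assumptions, not Block's actual policy; a real pipeline would source classifications from a governance catalog.

```python
import hashlib

# Illustrative field classification (assumed, not Block's policy):
# "hash" pseudonymizes with a stable digest so joins still work;
# "redact" drops the field entirely.
PII_FIELDS = {
    "email": "hash",
    "ssn": "redact",
}

def mask_record(record: dict) -> dict:
    """Return a copy of `record` with PII fields masked per PII_FIELDS."""
    out = {}
    for field, value in record.items():
        action = PII_FIELDS.get(field)
        if action == "redact":
            continue
        if action == "hash" and value is not None:
            out[field] = hashlib.sha256(str(value).encode()).hexdigest()
        else:
            out[field] = value
    return out

masked = mask_record({"id": 7, "email": "a@example.com", "ssn": "123-45-6789"})
```

For GDPR/CCPA-style compliance, an unsalted hash as shown is usually insufficient on its own (it is reversible by dictionary attack); keyed hashing or tokenization via a vault service is the more typical production choice.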