Description & Requirements
The DataHub Engineering team is building a distributed platform to host, catalog, discover, and deliver financial datasets across Bloomberg. This platform powers batch analytics, real-time stream processing, and low-latency, high-availability data distribution, ensuring that high-quality data, the lifeblood of financial markets, is always accessible.
You will join the team that introduced the abstraction of a “dataset” and invented a schema language to formally define all data at Bloomberg, complete with schema evolution, versioning, and true point-in-time semantics. We were the first to introduce Kafka, Avro, a company-wide Dataset Schema Registry, Mesos, Clustered MySQL, Vitess, and Spark for ETL at Bloomberg. We are designing a new data-intensive platform that serves as the hub for financial datasets.
You’ll get to:
Write software for Kafka-based Data Pipes for the company-wide Data Mesh
Debug and diagnose intricate issues, including functional and performance regressions, in Apache Kafka, Apache Spark, data codecs, low-latency services, and streaming systems
Collaborate and share extensively with fellow engineers
Contribute to open-source technologies such as Spark and Iceberg
Demonstrate expertise in building lakehouse architectures for large-scale data platforms
Our tech stack:
Languages: Java, Python, Scala
Frameworks/Tools: Spark, Kafka, Kubernetes
Cloud-Native Stack: Container orchestration, service mesh, distributed tracing
You’ll need to have:
4+ years of professional experience programming in Java, Scala, or Python
Expertise in Apache Kafka, Spark, Redis, and distributed systems
Experience building and testing scalable and reliable data infrastructure
A degree in Computer Science, Engineering, Mathematics, or a similar field of study, or equivalent work experience
We’d love to see:
Open-source contributions to Kafka, Spark, streaming technologies, etc.
Experience with performance optimization techniques in Iceberg, and with using Redis to cache expensive query results to improve application performance
Experience with DuckDB for analytics on smaller datasets running on Kubernetes
Production experience with Kubernetes (Helm, Operators, CRDs)
Familiarity with Kafka, Spark, or lakehouse architectures
A passion for reliability, scale, and mentoring others
We offer one of the most comprehensive and generous benefits plans available, with a range of total rewards that may include merit increases, incentive compensation (exempt roles only), paid holidays, paid time off, medical, dental, and vision coverage, short- and long-term disability benefits, a 401(k) + match, life insurance, and various wellness programs, among others. The Company does not provide benefits directly to contingent workers/contractors and interns.