Senior Data Engineer - Data (AVD Team)
Sonatype
Who We Are
- At Sonatype, we help organizations build better, more secure software by enabling them to understand and control their software supply chains. Our products are trusted by thousands of engineering teams globally, providing critical insights into dependency health, license risk, and software security. We’re passionate about empowering developers—and we back it with data.
The Opportunity
- We’re looking for a Senior Data Engineer to join our growing Data Platform team. You’ll play a key role in designing and scaling the infrastructure and pipelines that power analytics, machine learning, and business intelligence across Sonatype.
- You’ll work closely with stakeholders across product, engineering, and business teams to ensure data is reliable, accessible, and actionable. This role is ideal for someone who thrives on solving complex data challenges at scale and enjoys building high-quality, maintainable systems.
What You’ll Do
- Design, build, and maintain scalable data pipelines and ETL/ELT processes
- Architect and optimize data models and storage solutions for analytics and operational use
- Collaborate with data scientists, analysts, and engineers to deliver trusted, high-quality datasets
- Own and evolve parts of our data platform (e.g., Airflow, dbt, Spark, Redshift, or Snowflake)
- Implement observability, alerting, and data quality monitoring for critical pipelines
- Drive best practices in data engineering, including documentation, testing, and CI/CD
- Contribute to the design and evolution of our next-generation data lakehouse architecture
What We’re Looking For
Minimum Qualifications
- 5+ years of experience as a Data Engineer or in a similar backend engineering role
- Strong programming skills in Python, Scala, or Java
- Hands-on experience with HBase or similar wide-column NoSQL stores
- Hands-on experience with distributed data systems like Spark, Kafka, or Flink
- Proficient in writing complex SQL and optimizing queries for performance
- Experience building and maintaining robust ETL/ELT pipelines in production
- Familiarity with workflow orchestration tools (Airflow, Dagster, or similar)
- Understanding of data modeling techniques (star schema, dimensional modeling, etc.)
Bonus Points
- Experience working with Databricks, dbt, Terraform, or Kubernetes
- Familiarity with streaming data pipelines or real-time processing
- Exposure to data governance frameworks and tools
- Experience supporting data products or ML pipelines in production
- Strong understanding of data privacy, security, and compliance best practices
Why You’ll Love Working Here
- Data with purpose: Work on problems that directly impact how the world builds secure software
- Modern tooling: Leverage the best of open-source and cloud-native technologies
- Collaborative culture: Join a passionate team that values learning, autonomy, and impact