Data Engineer
Sonatype
What you'll do:
- Design, build, and maintain scalable data pipelines and ETL processes
- Architect and optimize data models and storage solutions for analytics and operational use
- Collaborate with other data engineers to deliver trusted, high-quality datasets
- Own and evolve parts of our data platform, specifically the streaming pipeline and data lake
- Implement observability, alerting, and data quality monitoring for critical pipelines
- Drive best practices in data engineering, including documentation, testing, and CI/CD
- Contribute to the design and evolution of our next-generation data lakehouse architecture
What you bring:
- 4+ years of experience as a Data Engineer or in a backend engineering role
- Strong programming skills in Java and Python
- Proficient in writing complex SQL and optimizing queries for performance
- Proficiency in English and strong communication skills, including the ability to talk with other engineers and analysts and to demo or explain new features to non-engineers
- Some experience using AWS cloud-native tools, like S3, SNS, SQS, EC2, or EMR
It's great if you also bring:
- Familiarity with streaming data pipelines or real-time processing
- Hands-on experience with distributed data tools like Hadoop, HDFS, and Spark
- Comfort with Docker containers and the Linux command line
- Exposure to DynamoDB or similar NoSQL data stores
- Experience using Databricks to write queries and notebooks
- Experience supporting data products in production
- An understanding of data privacy, security, and compliance best practices
Why you'll love working here:
- Data with purpose: Work on problems that directly impact how the world builds secure software
- Modern tooling: Leverage the best of open-source and cloud-native technologies, including recent versions of Java
- Collaborative culture: Join a passionate team that values learning, autonomy, and impact