Senior Software Engineer - Data Infrastructure
Software Engineering, Other Engineering
Redwood City, CA, USA
Posted on Friday, July 30, 2021
TruEra provides a full lifecycle observability platform to help enterprises analyze machine learning, improve model quality, track performance and build trust. Powered by enterprise-class Artificial Intelligence (AI) Explainability technology based on six years of research at Carnegie Mellon University, TruEra’s platform helps eliminate the black box surrounding widely used AI and ML technologies. This visibility leads to higher quality, explainable models that achieve measurable business results, address unfair bias, and ensure governance and compliance.
We are excited about the amazing team we’re building at TruEra. One of the core cultural principles at TruEra is: “Create what’s not there.” We’re building a team of creator-builders who are excited about our mission and keen to build large-scale systems and drive cutting-edge research in support of it.
We are a rapidly growing Series B company funded by Greylock, Wing, and Menlo Ventures, and we work with both Fortune 100 customers and startups throughout the world!
About the job
As a Senior Software Engineer on the TruEra Data team, you will architect, build and manage real-time and batch data pipelines and data aggregation systems to empower self-service reporting on our big data platform and for AI/ML data infrastructure ecosystems. We're developing the platform for both public and private cloud environments, with containers as first-class citizens. Infrastructure is at the core of our platform, and we're constantly innovating to make our systems more performant, timely, cost-effective, and capable while maintaining high reliability. You'll be architecting our core data and ML infrastructure and pipelines.
What You Will Be Doing:
- Lead the design and implementation of complex distributed systems, whether that means a new service to power new functionality, data pipelines to ingest large volumes of data, or implementations of state-of-the-art algorithms.
- Build APIs on top of complex backend data systems, across a range of technologies, to support new and improved product functionality.
- Partner with data scientists, infrastructure engineers, and product managers to design, build and deliver big data projects and new data platform capabilities.
- Debug hard problems - that’s a given! When things break -- and they will -- you will find yourself debugging those challenging bugs and will be eager to fix things.
- Continuously learn something new, whether it’s a new technology or a quirk of a language you didn’t know about. On occasion, you may find yourself picking up a new language or working with an unfamiliar platform.
- Help define and build the TruEra Data ecosystem as use cases grow
- Build scalable data pipelines to move data from different storage systems to the TruEra platform
- Participate in early customer engagements and PoCs, and use that context to drive new product features
- Review design and code, and make sure what we ship is awesome
- “Create what’s not there”
Who You Are:
- Someone who enjoys having significant ownership of features and systems and takes a pragmatic, results-driven approach to development.
- Someone who is set on building systems that balance scalability, availability, and latency.
- An advocate for improving engineering efficiency, continuous deployment and automation tooling, monitoring solutions, and self-healing systems that enhance the developer experience.
- Good communication and mentoring skills, and a track record as a force multiplier.
- Experience in ground-up system building
- You have led and mentored others and care about the development of your teammates.
- You want to learn and grow, push yourself and your team, share lessons with others, give constructive and continuous feedback, and are receptive to feedback from others.
- BS in Computer Science or equivalent
- Strong product mindset and a proven track record of 4+ years building and maintaining big data platforms for streaming and batch data processing.
- 3+ years of experience in data engineering, building backend systems, and APIs.
- Expertise in building data pipelines using open-source frameworks (Hadoop, Spark, Kafka, Airflow, etc.)
- Strong data infrastructure experience, on-premises or in the cloud
- Solid background in the fundamentals of computer science and distributed systems
- Experience in containerized deployment or Kubernetes
- Ability to build systems that balance scalability, availability, and latency
- Advocate for the continuous deployment and automation of tools, monitoring, and self-healing systems
- Strong hands-on coding experience in Java, Python, and SQL, and comfort diving into any new language or technology.
- Experience with some or all of the following, or with similar tools: Spark, Flink, Airflow, Hive, Druid, Presto, PostgreSQL, DBT, and ETL, plus familiarity with key/value databases, Kafka, and Kubernetes.
- Experience working with modern cloud-based microservice architectures.
- Good understanding of and experience with modern ETL (incremental and one-time loads), including DAG design patterns and data quality checks.
- Experience building machine learning models or ecosystems
- Experience with Linux and containers using Docker and Kubernetes is a big plus.
- Experience as part of an engineering team at an early-stage startup
Any unsolicited resumes/candidate profiles submitted through our website or to personal email accounts of employees of TruEra are considered property of TruEra and are not subject to payment of agency fees.