JetBrains · Amsterdam, Armenia, Germany, DE · 16 days ago
Kineto is a next-generation platform that enables creators, educators, and small businesses to generate, deploy, and operate fully functional AI-powered web applications – instantly and at scale. It combines LLM-driven code generation, multi-tenant Postgres (Neon), dynamic hosting (GKE and Knative), automated deployments (Flux), analytics, billing, and a seamless chat-based UX to make software creation accessible to everyone.
Our team is growing rapidly, and we’re now seeking an experienced Infrastructure Engineer who can design, build, and maintain our cloud-native platform, with a focus on scalability, reliability, and automated operations.
What you’ll do:
Cloud and platform engineering (DevOps):
Design, implement, and manage the core infrastructure powering Kineto's platform on Google Cloud Platform (GCP), including networking, security, and identity management.
Build and operate resilient, highly available distributed systems using Kubernetes (GKE), Knative, Istio, and related cloud-native technologies.
Automate the entire infrastructure life cycle (IaC) using Terraform and Terragrunt, ensuring secure, reproducible, and auditable environments.
Implement and maintain CI/CD pipelines (e.g. GitHub Actions and TeamCity) and deployment tools like Flux and Helm for GitOps-driven application delivery.
Optimize and manage the multi-tenant data layer on Postgres and Neon, focusing on robust tenant isolation, performance, backups, and safe schema management.
Operational excellence and reliability:
Drive site reliability engineering (SRE) practices, including monitoring, alerting (Prometheus, Grafana), logging (Loki), and incident response.
Solve complex operational challenges, such as optimizing scale-to-zero for cost efficiency, minimizing cold starts, improving autoscaling behavior, and managing queue backpressure.
Implement platform-wide performance tuning (e.g. container resource limits, distributed locks, caching strategies, and GC configurations).
Ensure platform security and compliance by implementing best practices for secrets management, network segmentation, and vulnerability scanning.
Technical leadership:
Own major infrastructure roadmap items, including multi-region deployments, disaster recovery planning, advanced tenancy separation, and ephemeral preview environments.
Champion DevOps and SRE principles across the engineering team, mentoring engineers on cloud-native best practices, operational readiness, and debugging complex distributed systems.
Collaborate with product and engineering teams to define the long-term vision for the platform's architecture and operational model.
We’d be glad to have you on our team if you:
Have five or more years of experience building and operating large-scale, commercial cloud-native infrastructure, with a strong focus on DevOps/SRE practices.
Possess deep, hands-on expertise with GCP (or AWS/Azure) and Kubernetes administration and operations (GKE experience is a strong plus).
Are proficient with infrastructure-as-code (IaC) tools, particularly Terraform, for managing complex environments.
Have a solid understanding of Linux internals, networking (CNI and service mesh), security, and distributed system design.
Are familiar with CI/CD tools, GitOps (e.g. Flux), monitoring stacks (Prometheus/Grafana), and logging systems.
Thrive in cross-functional teams and excel at communicating complex infrastructure ideas clearly.
We process the data provided in your job application in accordance with the Recruitment Privacy Policy.
Headquarters: Amsterdam, Armenia, Germany
Work Location: Remote
Job Category: Cybersecurity
Application Deadline: Not specified
Job Type: Full-time
Experience Level: Not specified
Application Method: Apply via JobSpring
Salary: Not specified