Job Description
This role is responsible for deploying, maintaining, and monitoring our parachain infrastructure, which currently runs on Google Cloud Platform (GCP) and is designed to evolve into a multicloud architecture. The candidate will lead efforts to optimize the infrastructure for cost efficiency, reliability, deployment speed, and security, and will collaborate with cross-functional teams to align on service-level objectives and technical requirements. The position also involves building scalable tools and platforms that support rapid engineering iteration, and establishing a self-service model that gives development teams autonomy and resources. In addition, the individual will maintain production availability through proactive troubleshooting, document infrastructure operations comprehensively, and refine alerting systems to improve operational visibility and response.
Key Responsibilities
- Deploy, maintain, and monitor parachain infrastructure (currently GCP, with potential multicloud expansion) to ensure seamless operation and scalability.
- Conduct regular infrastructure audits, performance testing, and monitoring to identify opportunities for optimization in cost, reliability, and security.
- Collaborate with business and technology teams to align on service reliability goals, performance benchmarks, and technical requirements.
- Design and implement tools, platforms, and infrastructure solutions that enable engineering teams to iterate quickly and efficiently.
- Establish a self-service model to empower engineering teams with autonomy, resources, and streamlined access to infrastructure management capabilities.
- Proactively troubleshoot and resolve infrastructure and service issues to maintain production availability and system resilience.
- Create and maintain detailed documentation, standard operating procedures, and governance frameworks for all deployed infrastructure components.
- Participate in the continuous improvement of alerting systems and incident response processes to enhance operational efficiency and reduce downtime.
- Stay updated on emerging technologies and industry best practices to drive innovation in infrastructure management and automation.
- Coordinate with DevOps, security, and compliance teams to ensure infrastructure adheres to organizational standards and regulatory requirements.
Job Requirements
- Proven experience in deploying and managing blockchain-based infrastructure (parachain systems) in cloud environments (GCP, AWS, Azure, or others).
- Strong understanding of infrastructure optimization principles, including cost management, performance tuning, and security hardening.
- Excellent communication skills to collaborate with business stakeholders and technical teams on infrastructure-related challenges and solutions.
- Ability to design and implement scalable tools and platforms that support rapid development cycles and engineering productivity.
- Experience in building self-service models through automation, documentation, and user-friendly infrastructure management interfaces.
- Technical proficiency in troubleshooting complex systems, diagnosing root causes, and implementing fixes to ensure high availability and resilience.
- Knowledge of cloud-native technologies, containerization (Docker/Kubernetes), and infrastructure-as-code (Terraform/Ansible) practices.
- Strong documentation skills to create clear, concise, and actionable infrastructure guidelines and operational procedures.
- Experience with monitoring tools (Prometheus, Grafana, Datadog) and alerting systems to track infrastructure health and performance metrics.
- Ability to work in a fast-paced, dynamic environment with a focus on continuous improvement and innovation in infrastructure operations.
- Excellent problem-solving abilities and an analytical mindset to address infrastructure challenges and drive system reliability.
- Preferred: Familiarity with blockchain protocols, parachain architecture, and decentralized application (dApp) ecosystems.
- Preferred: Experience with multicloud environments and cross-cloud orchestration strategies.
- Preferred: Strong background in DevOps practices and CI/CD pipelines for infrastructure automation.
Additional Qualifications
- Proficiency in scripting languages (Python, Bash, PowerShell) for automation and infrastructure management tasks.
- Experience with cloud cost optimization frameworks and budgeting tools to maximize resource efficiency.
- Knowledge of network security protocols and controls (TLS/SSL, VPCs, firewalls) to ensure secure infrastructure deployment.
- Ability to lead infrastructure projects from concept to execution, including stakeholder management and resource allocation.
- Strong familiarity with container orchestration platforms (Kubernetes) and microservices architecture for scalable deployments.
- Experience with infrastructure-as-code (IaC) tools to automate provisioning and configuration management across environments.
- Preferred: Cloud or DevOps certification on AWS, Azure, or GCP (e.g., AWS Certified Solutions Architect, Certified DevOps Engineer).
- Preferred: Experience with blockchain development tools from the Polkadot ecosystem (Substrate, Parity tooling) and smart contract deployment frameworks.
- Preferred: Familiarity with CI/CD pipelines (Jenkins, GitLab CI, GitHub Actions) for infrastructure automation and testing.