Job Description
This position is responsible for ensuring the stability, performance, and security of software systems within the team. The role involves managing and maintaining critical components such as blockchain nodes, web3 applications, and official websites. Key tasks include overseeing the operational status of these systems, implementing monitoring solutions, and resolving technical issues promptly. The candidate will also design and deploy automation tools to streamline operations and improve system reliability. This role requires close collaboration with cross-functional teams to achieve seamless integration and continuous improvement of software infrastructure.
Key Responsibilities
- Monitor and maintain the stability of blockchain nodes, web3 applications, and team websites, ensuring 24/7 availability and optimal performance.
- Implement comprehensive monitoring systems for software products and their operating environments, including real-time logging and alert mechanisms.
- Diagnose and resolve complex technical issues across distributed systems, with a focus on root-cause analysis and system recovery.
- Develop automation scripts and tools for routine maintenance tasks, reducing manual intervention and improving operational efficiency.
- Collaborate with developers and security teams to enhance system resilience, optimize resource allocation, and ensure compliance with industry standards.
- Document system configurations, incident reports, and operational procedures to support knowledge sharing and audit requirements.
- Stay updated on emerging technologies and industry trends to propose innovative solutions for system optimization and scalability.
Job Requirements
- Proven experience in system administration and DevOps practices, with a focus on blockchain infrastructure and web3 technologies.
- Strong proficiency in Linux/Unix operating systems, shell scripting, and configuration management or infrastructure-as-code tools such as Ansible or Terraform.
- Deep understanding of monitoring tools (e.g., Prometheus, Grafana, the ELK stack) and log management systems for real-time insights.
- Experience with blockchain protocols, smart contract interactions, and decentralized application (dApp) deployment processes.
- Excellent problem-solving skills and the ability to troubleshoot complex issues in distributed systems while keeping downtime to a minimum.
- Knowledge of cloud platforms (AWS, Azure, GCP) and containerization technologies (Docker, Kubernetes) for scalable infrastructure management.
- Ability to design and implement CI/CD pipelines that integrate with version control to automate the testing and deployment of software products.
- Strong communication skills to collaborate with developers, stakeholders, and team members on technical decisions and system improvements.
- Proficiency in programming languages such as Python, Go, or JavaScript for custom tool development and system integration.
- Experience with security best practices, including vulnerability management, access control, and data encryption protocols.