Real-time IT infrastructure audit

Is your infrastructure performant enough to meet your business challenges?

Our questionnaire will help you find out. Based upon DevOps best practices and metrics, this infrastructure audit checklist identifies all important aspects of a secure and resilient system and helps to discover bottlenecks. Use it to diagnose your infrastructure efficiency!

The infrastructure audit questionnaire consists of 12 questions. The first four represent DORA metrics – key parameters to measure software development and delivery performance as defined by Google’s DevOps Research and Assessment (DORA) team. The DORA metrics questions are based on Google’s Accelerate State of DevOps Report 2021.

The remaining questions relate to DevOps best practices and the level of their implementation in a company. 

Upon completion, you will receive a detailed conclusion and recommendations from our DevOps experts on improving your infrastructure.

1. Deployment Frequency

This metric measures how often a company deploys code to production for a particular application, for example, once per week or once per month. The higher this measure is, the better your delivery performance.
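As a rough illustration, deployment frequency can be estimated from a list of deployment dates. The sketch below is only one possible way to do it: the function name `deployments_per_week` and the sample dates are hypothetical, and it assumes the observed window runs from the first to the last recorded deployment.

```python
from datetime import date


def deployments_per_week(deploy_dates: list[date]) -> float:
    """Average number of deployments per week over the observed window."""
    if not deploy_dates:
        return 0.0
    # Window length in days, counting both the first and the last day.
    span_days = (max(deploy_dates) - min(deploy_dates)).days + 1
    return len(deploy_dates) / (span_days / 7)


# Hypothetical deployment log: four deployments within a 14-day window.
sample = [date(2024, 1, 1), date(2024, 1, 4), date(2024, 1, 9), date(2024, 1, 14)]
print(deployments_per_week(sample))  # 2.0
```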

 

How often does your company release new changes? Please choose one of the answers below.

Choose one answer

2. Lead Time for Changes

This metric measures the time it takes for committed code to reach production. The metric indicates the velocity of delivery: the lower its value, the faster changes reach your users.
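A minimal sketch of how lead time could be derived from commit and deployment timestamps. Using the median rather than the mean is an assumption made here so that a single slow change does not dominate the result; the function name and the sample data are hypothetical.

```python
from datetime import datetime
from statistics import median


def median_lead_time_hours(changes: list[tuple[datetime, datetime]]) -> float:
    """Median number of hours between a commit and its production deployment."""
    durations = [
        (deployed - committed).total_seconds() / 3600
        for committed, deployed in changes
    ]
    return median(durations)  # raises StatisticsError if no changes are given


# Hypothetical (committed, deployed) pairs: 6 h, 24 h, and 2 h lead times.
changes = [
    (datetime(2024, 1, 1, 9, 0), datetime(2024, 1, 1, 15, 0)),
    (datetime(2024, 1, 2, 10, 0), datetime(2024, 1, 3, 10, 0)),
    (datetime(2024, 1, 3, 8, 0), datetime(2024, 1, 3, 10, 0)),
]
print(median_lead_time_hours(changes))  # 6.0
```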

 

How long does it take in your company for code changes to reach production? Please choose one of the answers below.

Choose one answer

3. Change Failure Rate

This metric captures the percentage of code changes that resulted in incidents, rollbacks, or any type of production failure. The Change Failure Rate indicates the quality of deployed software: the lower the rate, the fewer defects reach production.
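The percentage described above is a simple ratio of failed changes to all changes. The sketch below shows the arithmetic; the function name and the sample numbers are illustrative assumptions, not figures from the report.

```python
def change_failure_rate(total_deployments: int, failed_deployments: int) -> float:
    """Percentage of deployments that caused an incident, rollback, or failure."""
    if total_deployments <= 0:
        raise ValueError("total_deployments must be positive")
    if not 0 <= failed_deployments <= total_deployments:
        raise ValueError("failed_deployments must be between 0 and total_deployments")
    return 100.0 * failed_deployments / total_deployments


# Hypothetical month: 40 deployments, 6 of which led to a production failure.
print(change_failure_rate(40, 6))  # 15.0
```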

 

How often do changes in your code lead to critical production issues? Please choose one of the answers below.

Choose one answer

4. Mean Time to Recover

The Mean Time to Recover metric measures the average time required to troubleshoot a component or recover a system after failure. Effective DevOps reduces this metric.
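As a sketch, the average recovery time can be computed from incident start and restoration timestamps. The function name `mean_time_to_recover` and the sample incidents are hypothetical.

```python
from datetime import datetime, timedelta


def mean_time_to_recover(incidents: list[tuple[datetime, datetime]]) -> timedelta:
    """Average duration between the start of a failure and service restoration."""
    if not incidents:
        raise ValueError("at least one incident is required")
    total = sum((restored - started for started, restored in incidents), timedelta())
    return total / len(incidents)


# Hypothetical (started, restored) pairs: a 30-minute and a 90-minute outage.
incidents = [
    (datetime(2024, 1, 5, 10, 0), datetime(2024, 1, 5, 10, 30)),
    (datetime(2024, 1, 20, 14, 0), datetime(2024, 1, 20, 15, 30)),
]
print(mean_time_to_recover(incidents))  # 1:00:00
```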

 

How would you assess the Mean Time to Recover in your company? Please choose one of the answers below.

Choose one answer

5. Infrastructure as Code

Infrastructure as Code (IaC) is an approach to setting up, provisioning, and deploying IT infrastructure by describing its resources in code. Implementing the IaC practice allows you to automate deployments, trace and validate infrastructure changes, and deploy environment configurations to create identical environments as often as needed.

Technologies: Terraform, Pulumi, AWS CloudFormation.

 

Do you follow the Infrastructure as Code approach? Please choose your answer. You can supplement your answer with an additional option (+) if it is relevant to your organization.

Choose one answer
    All infrastructure resources are defined in the code.
    Git is a single source of truth for all infrastructure operations. The declared and actual infrastructure states are in full correspondence, and any divergences are reconciled automatically.
    Automation of infrastructure deployments is triggered by changes in the Git repository, but reconciliation is controlled by the CI/CD system.
    Some infrastructure resources are defined in the code, but many tasks are still performed manually.
    All infrastructure resources are configured manually.
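To illustrate what comparing the declared and actual infrastructure states can mean in practice, here is a minimal drift-detection sketch. It is not how Terraform, Pulumi, or any GitOps tool works internally; the resource names, the dictionary layout, and the function `find_drift` are assumptions made purely for illustration.

```python
def find_drift(declared: dict[str, dict], actual: dict[str, dict]) -> dict[str, str]:
    """Compare declared resources with the actual ones and label each divergence."""
    drift = {}
    for name in declared.keys() | actual.keys():
        if name not in actual:
            drift[name] = "missing"      # declared in code but not deployed
        elif name not in declared:
            drift[name] = "unmanaged"    # deployed but absent from the code
        elif declared[name] != actual[name]:
            drift[name] = "changed"      # deployed with a different configuration
    return drift


# Hypothetical states: one resize done by hand and one manually created bucket.
declared = {"web-server": {"size": "small"}, "database": {"size": "large"}}
actual = {"web-server": {"size": "medium"}, "database": {"size": "large"}, "tmp-bucket": {}}
print(find_drift(declared, actual))
```

An automated reconciliation loop would act on each reported divergence, whereas a partially manual setup would leave it to an engineer to review.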

6. Containerization and Orchestration

Containerization is the practice of packaging application code with all its related files, libraries, and dependencies within a standardized unit, or ‘container.’ Once workloads are containerized, they can run on any platform, be independent of one another in terms of languages or frameworks, and be managed collectively with container orchestration tools.

Technologies: Docker, Kubernetes, Rancher, Docker Swarm, OpenShift, EKS, and AWS Fargate.

 

Do you leverage the advantages of containerization technology? Please choose one of the answers below.

Choose one answer
    Applications are natively designed to utilize cloud services and maximize their cloud-native potential, including the use of microservices for application architecture, portability of containerized apps, and facilitated CI/CD efforts.
    Container orchestration is implemented, but the services were not initially designed to be containerized.
    Applications are packaged within container units and isolated from one another in terms of operation, configuration, and debugging.
    An application is composed all in one piece. The program’s components are tightly coupled and must all be present for the software to run.

7. Infrastructure Stack Modernity

This measure indicates how closely your infrastructure follows the latest technology trends and whether it is ready for future challenges. Regular technology updates allow companies to remain technologically advanced and ahead of the competition.

Technologies: vary by maturity level.

 

Is your infrastructure stack modern enough? Please choose one of the answers below.

Choose one answer
    A high level of stability that implies using the latest, most cost-effective practices and tools, e.g., declarative approach, containerization, Terraform, and Kubernetes. Technology updates are regular and follow official releases, i.e., it takes no longer than six months after the official release for the newest version to run in production.
    A high level of stability that implies using the most trendy and cost-effective practices and tools, e.g., declarative approach, containerization, Terraform, and Kubernetes. Technology updates are not regular, meaning the gap between the official release and the newest version running in production is over six months.
    This level implies using tried-and-tested technologies that are somewhat outdated in light of today’s technological advancements. Examples: imperative approach infrastructure tools like Ansible and Salt.
    The technologies in use are largely outdated, and many are no longer maintained. Examples: virtual machine technology, Chef, Puppet, and infrastructure running on Bash scripts.

8. CI/CD

Continuous Integration (CI) and Continuous Deployment (CD) are the DevOps practices of automated building, testing, and deployment of code to target environments. Implementation of CI/CD enables automation of repetitive tasks, provides for a faster deployment pace, shorter release cycles, early detection of erroneous code and quick fixes, and improves overall code quality.

Technologies: GitLab, GitHub, Argo CD, Bitbucket, and Jenkins.

 

Which of the CI/CD processes are established in your company? Please choose one or more answers.

Multiple choice
    The next logical step of continuous delivery. The approach is defined by feature flagging during deployments, gradual rollouts, canary launches, blue-green deployments, A/B testing, and so on.
    Artifacts are deployed to one environment at a time and immutably promoted to the next after testing.
    All proposed code changes are reviewed before application. Pull requests are an easy way to do this.
    A trend in modern DevOps that lies in decoupling CI and CD workflows by using separate tools for their implementation. For example, CI is enabled using native Git provider tools (GitLab Pipelines/GitHub Actions), and CD by a polling model from the cluster itself via a GitOps tool (Argo CD, Flux).
    All infrastructure is defined in code, and automated tests are applied to verify changes after every single commit.

9. Security

Security is a set of specific guidelines and best practices to protect information, systems, and assets against potential attacks. Effective security strategy mitigates the risks of your data assets being compromised, prevents security breaches and data leakage, and enhances the overall reliability and availability of services.

Techniques and technologies: Threat modeling, risk assessment, Defense in Depth (DiD) approach, security by design principles, and Application Security (AppSec) tools.

 

Which of the following security practices are implemented in your company? Please choose one or more answers.

Multiple choice
    Any open-source or third-party components are tested for potential security issues before use.
    An application security practice that involves introducing security in the early stages of the software development lifecycle rather than at the end when identified vulnerabilities are more costly to fix.
    Implies a dedicated position within a company in control of information security, cybersecurity, and IT risk management programs.
    Since secrets contain private and sensitive information, they should never be stored in plaintext. Using secure secret managers like 1Password or LastPass and secret stores such as AWS Secrets Manager, SSM Parameter Store, or HashiCorp Vault helps to protect your sensitive data against cyber thieves.
    A practice of identifying potential cyber threats that could disrupt business, analyzing their consequences, and designing countermeasures.
    Going through security audits and having a third-party company conduct penetration testing on your services is an effective way to identify and fix potential issues.
    A set of measures for protecting websites, services, and networks against possible external attacks; includes the setup of web firewalls, DDoS protection, and an intrusion detection system (IDS).
    Static Application Security Testing (SAST) and Dynamic Application Security Testing (DAST) are included as part of the CI/CD pipeline.

10. Backups and Disaster Recovery

Backup and Disaster Recovery (DR) are DevOps strategies for restoring infrastructure or system components after a failure with minimum downtime and data loss. Effective backup and recovery strategies imply redundancy of information and data assets, so you will always have a copy of your data available elsewhere when a disaster strikes.

Technologies: dedicated backup software (Veeam Backup & Replication, Velero, Rsnapshot, FSBackup), snapshots, and BaaS for cloud services.

 

How well is your business protected against force majeure? Please choose one or more answers.

Multiple choice
    A DR plan has been carefully designed and proved to work by running disaster scenarios and checking the restoration procedures in practice.
    Recovery scenarios are automatically tested by periodically restoring data from created backups to ensure they work.
    Though not a backup strategy per se, IaC could be considered a kind of infrastructure backup as it allows for the quick restoration of infrastructure from available code or the configuration of identical environments using the same code.
    Both server backup and snapshot options are effectively implemented to secure your datasets.
    A DR strategy with a core system functionality configured and running in the cloud or a separate cloud account. Then, when recovery time comes, you can rapidly provision a full-scale production environment around the critical core.
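Automated restore testing, as mentioned in the options above, can be sketched in a few lines: record a checksum when the backup is taken, restore the backup into a scratch location, and compare. This sketch uses plain file copies from the Python standard library as a stand-in for real backup software; the helper names and the file `orders.db` are hypothetical.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def back_up(source: Path, backup_dir: Path) -> tuple[Path, str]:
    """Copy source into backup_dir and record its checksum at backup time."""
    backup = backup_dir / (source.name + ".bak")
    shutil.copy2(source, backup)
    return backup, sha256_of(source)


def restore_matches(backup: Path, expected_checksum: str, restore_dir: Path) -> bool:
    """Restore the backup into restore_dir and compare it with the recorded checksum."""
    restored = restore_dir / backup.name
    shutil.copy2(backup, restored)
    return sha256_of(restored) == expected_checksum


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "backups").mkdir()
    (root / "restore").mkdir()
    data = root / "orders.db"  # hypothetical data file
    data.write_bytes(b"example payload")
    backup, checksum = back_up(data, root / "backups")
    print(restore_matches(backup, checksum, root / "restore"))  # True for an intact backup
```

Running such a check on a schedule is one way to learn that a backup is unusable before a real disaster does.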

11. Observability

Observability is a DevOps practice of measuring a system’s current state based on the data it generates, including logs, metrics, and traces. Observability makes infrastructure processes visible, allows for data visualization and analysis, and enables effective code debugging and timely troubleshooting of issues.

Technologies: Grafana, Prometheus, Alertmanager, ELK, AWS CloudWatch, Jaeger, and Datadog.

 

How effective is observability in your company? Please choose one or more answers.

Multiple choice
    All critical parameters – including availability metrics, business metrics, application metrics, and server metrics – are being tracked, covered with monitoring, and recorded in logs.
    A system that aggregates all the data produced by all the IT systems in one place and allows it to be managed and processed through a single pane of glass.
    Alerts aim to notify on-call engineers when critical metrics cross pre-defined thresholds. Most metrics and log tools support alerting and can be integrated with notification tools.
    Postmortem analysis includes the detailed recording of an incident with its further investigation, identifying a root cause and preventive measures to exclude the possibility of it reoccurring in the future.
    An SLA (service level agreement) is an agreement between a provider and client about measurable metrics like uptime, responsiveness, and responsibilities. An SLO (service level objective) is an agreement within an SLA about a specific metric like uptime or response time.
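As a small illustration of checking a service level objective, the sketch below computes availability from request counts and compares it with a target. The function names, the request counts, and the 99.9% target are assumptions chosen for the example.

```python
def availability(total_requests: int, failed_requests: int) -> float:
    """Fraction of requests that were served successfully."""
    if total_requests <= 0:
        raise ValueError("total_requests must be positive")
    return 1 - failed_requests / total_requests


def meets_slo(total_requests: int, failed_requests: int, slo_target: float) -> bool:
    """True if the measured availability is at or above the SLO target."""
    return availability(total_requests, failed_requests) >= slo_target


# Hypothetical month with a 99.9% availability objective.
print(meets_slo(1_000_000, 250, 0.999))    # True: about 99.975% of requests succeeded
print(meets_slo(1_000_000, 2_000, 0.999))  # False: about 99.8% of requests succeeded
```

The same comparison could back an alert rule that notifies on-call engineers when the measured value drops below the target.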

12. Documentation

Documentation is an effective way to keep internal processes and procedures systematized and available for future reference. Detailed and accurate documentation is a central source of all your must-know information and a guide for new employees.

Technologies: documentation management systems (Confluence, Nuclino, Read the Docs), GitHub Pages, and Continuous Documentation tools.

 

How good is your documentation? Please choose one or more answers.

Multiple choice
    The top level: you first document the desired system and then follow the documentation when designing it.
    Your project documentation includes detailed infrastructure descriptions updated on demand, including IP addresses, physical locations, dependencies, and passwords. Moreover, elements of system design and all connections between them are visualized on architecture diagrams.
    Internal documentation is centralized and regularly updated; onboarding documentation exists to help newcomers make a smooth start.


