Hana Khoury

Senior Platform Engineer & SRE

About Me

Continuously expanding ‘The Platform’ while fostering a sustainable DevOps ecosystem.

Senior SRE with a strong capacity for self-learning: highly adaptable to fast-paced environments, a team player, and detail-oriented.

Experience

Cisco

Site Reliability Engineer

January 2023 - Present

https://www.cisco.com

Developing a CNAPP & GenAI incubation platform.

Our engineering platform team is dispatched to multiple venture teams at different stages of the development lifecycle and offers a range of services:

  • Comprehensive CI/CD & GitOps development
  • Cross-functional observability design
  • Project consultation and stack architecture
  • Microservice integration and security
  • Cloud services and Infra migrations
  • IDP self-service workflows
  • Production on-call and incident mitigation

Leveraging the latest tech stack, all cloud-based, while adhering to Cisco’s highest security standards.

Micro Focus (Formerly HPE)

DevOps Engineer - SaaS R&D

July 2022 - January 2023

Part of Control Tower R&D group:

  • Developing GitOps & CI/CD pipelines for multiple R&D groups
  • IaC deployment on AWS CDK (Python/TypeScript)
  • Microservice deployment on AWS ECS
  • Automating infra migrations to VMC & AWS

Micro Focus (Formerly HPE)

Infrastructure Engineer - Cloud R&D

December 2019 - July 2022

  • ScrumMaster: facilitating daily scrum, sprint planning, and retrospectives.
  • Automating cloud migration efforts.
  • Designing observability solutions for public cloud and middleware services.
  • Developing an internal cloud management platform (IaC/Node.js):
    • Provisioning over 5,000 virtual servers for R&D and Production
    • Handling different requirements and architectures from ~1,200 users

Micro Focus (Formerly HPE)

SRE - SaaS Production

December 2018 - December 2019

  • Developing self-healing cron jobs for our on-prem infra (C#/Bash).
  • Debugging telemetry and mitigating real-time production outages.
  • Coordinating app upgrade processes with SLA compliance for all stakeholders.
  • Managing RCA; implementing preventive & automated detection measures.
  • Creating & refining production on-call run books.
  • Independently developed a full-stack application for SaaS incident management.

Hewlett Packard Enterprise

SRE - SaaS Production

December 2016 - December 2018

https://www.hpe.com

  • Deploying telemetry scraping workflows and alert configurations (metrics, logs, traces).
  • Infrastructure health maintenance for VMware & cloud environments.
  • Debugging outages; developing and automating self-healing workflows.
  • Production on-call; incident management & coordination with R&D and stakeholders.
  • Independently developed a full-stack app for a centralized observability console.

Education

Technion - Israel Institute of Technology

B.Sc. in Computer Science & Information Systems

2012 - 2016

Tel Aviv University

Master of Business Administration (MBA), Data Science

2024 - 2026

Certifications

AWS Certified Solutions Architect – Associate

Amazon Web Services (AWS)

Skills & Technologies

Cloud & Orchestration

AWS, GCP, Kubernetes, OpenShift, Docker, Helm, Kustomize

CI/CD & GitOps

GitHub Actions, Jenkins, ArgoCD, Argo Workflows, Temporal, HashiCorp Vault

IaC & Automation

Terraform, Pulumi, AWS CDK, Ansible, Python, Bash

Observability & Monitoring

Prometheus, Grafana, Elastic, Splunk, OpenTelemetry

Backend & API

Java, Python, Go, C#

Databases

Postgres, MySQL, MongoDB, Redis