
Summary

Platform engineer with 12+ years of cloud experience and 14+ years in tech, specializing in AWS infrastructure, cloud cost optimization, and AI-powered operational tooling.

Track record of delivering measurable cost savings (>$100K/yr), architecting multi-agent AI systems used by 100+ engineers weekly, and building cross-company observability infrastructure. AWS/CKA-certified with expertise in Kubernetes, Python, Terraform, and data-driven infrastructure management.

Experience

Sr. Platform Engineer

2025 - Present
PayPay Corporation
Tokyo, Japan
  • Built and maintain CloudOps Agent, an AI-powered operations platform on EKS using the Strands Agents SDK with 30+ subagents connected via the A2A protocol. Used by 100+ engineers weekly, handling 1,300+ queries per week with full observability and traceability via OpenTelemetry. Adopted as the standard agent pattern across multiple teams and group companies
  • Architected the AWS infrastructure for a multi-tenant observability platform serving several business units. Built on VictoriaMetrics, OpenTelemetry, ClickHouse, and Grafana running on dedicated multi-region EKS clusters
  • Reduced cloud spend by over $100K/yr within the first year across 100+ AWS accounts spanning multiple organizations. Built Savings Plans and Reserved Instance dashboards with renewal alerts, CUDOS reporting, and cost anomaly detection integrated directly into the CloudOps Agent

Sr. Site Reliability Engineer

2022 - 2025
Autify, Inc.
Tokyo, Japan
  • Orchestrated mission-critical workload migration from ECS Fargate to EKS with zero downtime, increasing system reliability while preserving all customer functionality
  • Reduced AWS infrastructure costs by over 50% within the first year through strategic resource optimization, Savings Plans purchasing, and architecture refinements
  • Strengthened application reliability by implementing Helm-based deployments with robust rollback capabilities, streamlining CI/CD pipelines, and enhancing observability through Prometheus, Grafana and distributed tracing with Datadog
  • Implemented comprehensive cost observability including Amazon CUDOS and custom pipelines into Holistics, providing leadership with cost visibility dashboards, anomaly detection, and proactive budget alerts

Sr. Technical Account Manager

2019 - 2022
Amazon Web Services (AWS)
San Diego, CA
  • Provided architectural and practical guidance to cloud engineering and software development teams to improve resiliency, efficiency, performance, and costs
  • Assisted enterprise customers in the formulation and improvement of overall workload observability posture and reporting on service level objectives
  • Assisted customers in capacity planning and management for AWS hosting based on E2E user flow profiles
  • Boosted enterprise cloud operations team velocity by escalating support cases while simultaneously helping troubleshoot and optimize services in AWS

Full Stack Developer

2016 - 2019
SAIC
San Diego, CA
  • Engineered an internal automation platform using PHP7 (Laravel), JavaScript, and MySQL that reduced manual processes across various departments
  • Designed and implemented CI/CD pipelines with Atlassian Bamboo, increasing deployment frequency while reducing errors
  • Administered self-hosted Atlassian ecosystem (Jira, Confluence, Bamboo, Bitbucket, HipChat) on VMware infrastructure, including server maintenance, application upgrades, and MySQL database optimization

Software Engineer

2014 - 2016
THALES RAYTHEON Systems
Fullerton, CA
  • Performed international system integration for C4ISR environments in Saudi Arabia, completing deployments 15% ahead of schedule by helping develop automation scripts that reduced system provisioning time from days to hours
  • Earned achievement award for completing System Level Use Case 200+ hours under budget while exceeding all customer requirements
  • Implemented and managed modern CI/CD pipelines with Jenkins and Git, replacing legacy ClearCase systems and reducing build times and deployment complexity

Sr. Service Center Analyst / Field Engineer

2012 - 2013
Foster Farms
Livingston, CA / Farmerville, LA
  • Provided comprehensive IT support including desktop troubleshooting, SOP creation, SharePoint optimization, OS deployments, and hardware management
  • Managed end-to-end support for diverse technology stack including desktops, printers, telecom equipment, servers, networks, surveillance systems, and mobile devices across multiple facilities
  • Enhanced infrastructure security and performance through server health monitoring, trend analysis, and strategic system upgrades while collaborating with the infrastructure team

Help Desk Analyst

2008 - 2009
Ridgecrest Regional Hospital
Ridgecrest, CA
  • Assisted in the daily operations of the Information Systems department, working both independently and in cohesive teams
  • Took on several large projects including the installation of roughly 30 wireless access points, hardwiring a newly constructed building with Category 6 cable, and the installation of server hardware and software
  • Provided an additional communication channel between the IS manager and other department staff, working closely with the CIO

Skills & Proficiencies

Proficient

Amazon Web Services (AWS), Google Cloud Platform (GCP), Cost Engineering, Kubernetes, Python, Infrastructure as Code (Terraform, CDK, CloudFormation), AI/ML Agent Systems (Strands SDK, A2A Protocol, Bedrock), Observability (VictoriaMetrics, Grafana, ClickHouse, OpenTelemetry, Prometheus), DevOps, Linux, SQL (Postgres, MySQL, Athena), CI/CD (GitHub Actions, Jenkins, ArgoCD)

Projects

personalized-aws-features

This project attempts to cut through AWS announcement noise by analyzing actual AWS usage through Cost Explorer data, fetching recent AWS announcements, and using Amazon Bedrock (default: Amazon Nova Lite) to determine which announcements are relevant to your services. Notifications can be viewed in the CLI or sent directly to a Slack channel.

k8s-autoscaler-benchmarker

The k8s-autoscaler-benchmarker helps administrators and developers optimize the scaling capabilities of their EKS clusters. The tool offers a streamlined process for benchmarking the performance of Karpenter and Cluster Autoscaler for EKS workloads.

Matt Hopkins Resume (AWS TypeScript CDK)

This resume site is deployed on CloudFront using the AWS CDK. It is entirely serverless, leveraging S3, CloudFront with Origin Access Control, ACM, and Route53, and is replicated worldwide via CloudFront's globally distributed edge network.

Publications

Speed Meets Security: How Bottlerocket Optimizes EKS Workloads

Comparison of Bottlerocket, Amazon Linux 2 (AL2), and Amazon Linux 2023 (AL2023) as EKS worker node operating systems, highlighting Bottlerocket's superior security, startup efficiency and native container image caching for Kubernetes workloads.

[Archive Link]

Leveraging Amazon S3 with Athena for Cost Effective Log Management

Technical post diving into our experience at Autify where we transitioned from a default Amazon CloudWatch Logs deployment to Amazon S3 with Athena for managing application logs in order to save thousands of dollars a month on our AWS bill.

[Archive Link]

Solutions for Cost-Effective EKS Control Plane Logging

Technical deep dive into cost optimization strategies for Amazon EKS Control Plane logging.

[Archive Link]

Education

Bachelor of Science - Management Information Systems

2009 - 2012
California State University, Chico