Piotr Szymański

I am a seasoned engineer with well over 10 years of professional experience.

From small companies to big-league enterprises, I have designed, built, shipped, maintained, and operated software projects.

I can rock the Developer, DevOps, SRE, or SysOps hats – sometimes all at once.

With eyes on the prize, I want to deliver the simplest possible solutions that will survive the ever-increasing entropy of the tech universe.

I know when to shoot straight and when to apply Socratic ignorance: whatever it takes to pave a clear path to mutual understanding.

Skills

Development

Python
Golang
TypeScript
PostgreSQL
Redis
RabbitMQ
Kafka
uWSGI
Celery
Memcached
Sentry

DevOps

AWS
Terraform
CDK
Docker
Ansible
Envoy
Squid
CircleCI
Jenkins
Datadog
Vector
Grafana
Logentries
Splunk
OpenMetrics
HAProxy

SRE/SysOps

Linux
Systemd
Networking
SSL
PKI
DNS
Distributed Systems
HTTP Proxy

Work Experience (7)

Oct 2024 - Current

Senior Site Reliability Engineer

Clari (Remote)

https://clari.com

Jan 2020 - Mar 2024

Lead Engineer

Klarna Bank AB (Remote)

https://klarna.com

I joined Klarna to build products at scale and found exactly what I was looking for. My team provided the platform for running containerized workloads for the entire company, serving nearly 2000 engineers, running 1000 microservices, and maintaining 30000 containers in production. Every engineer interacted with their deployments via our API, leaving the rest to us.

The API, Control Plane, and supporting services were built in TypeScript and ran on AWS Lambda and Step Functions. The underlying infrastructure was managed with CloudFormation, mostly through CDK. To ensure a frictionless experience for engineers, we used a plethora of AWS services such as ECS, ECR, CodeDeploy, ASG, EC2, ALB, Route 53, CloudWatch, ACM, SSM, IAM, and many more.

Given the platform's handling of billions of dollars, the SLA bar was set reasonably high. Everything was meticulously monitored, audited, and tightly secured. Designed to be self-healing, the platform resolved issues before engineers even noticed them.

By default, the platform provided scalability, observability, security, logging, and compliance worthy of the biggest fintech in Europe. Datadog alerts, dashboards, and metrics were automatically provided for each deployment. We invested considerable effort into training and onboarding engineers to ensure metrics, dashboards, and alerts were both meaningful and actionable. I even gave a talk on our platform at Klarna's conference, which you can watch here.

We managed EC2 instances with ASGs and rotated the entire production fleet of 1000 EC2 instances each week to apply the latest patches and security updates. Changing Linux distributions without downtime was a fun challenge. With thousands of containers, upgrading the kernel by a major version required significant SRE magic. The scale of our underlying infrastructure occasionally exposed cracks in AWS itself, necessitating close collaboration with the AWS ECS team.

As a financial institution, everything had to be end-to-end encrypted and secured. We managed the PKI infrastructure, public certificates, and provided engineers with tools to manage their secrets.

Of course, not everything ran smoothly all the time. Positioned at the crossroads of the entire company, our on-call rotations were intense. We served as the go-to SRE team when incidents were challenging to mitigate.

Mar 2015 - Dec 2019

DevOps Team Lead

FieldAware (Remote)

https://fieldaware.com

I started as a Senior Software Engineer at a small venture founded by two CS PhDs and heavily focused on engineering. My inaugural project was to build an authentication service integrated with an SSO provider via SAML (Python, Flask, uWSGI, SQLAlchemy). While development progressed smoothly, releasing posed significant challenges. The neglected Fabric scripts proved unreliable and somewhat hazardous.

We opted to rebuild deployments using Ansible, slashing deployment times from 2 hours to just 15 minutes. This transition proved highly successful, although the application lacked isolation from the host OS.

This issue led us to Docker. We unified the runtime environment for local development, testing, and production (AWS ECS). Migrating services one by one to Docker was harder than it sounds.

As the service count grew, we needed a sane way to synchronize data between them. We implemented a Change Data Capture model with AWS Aurora (MySQL) and Kafka. With Kafka Connect, we streamed the database's replication log into other datastores like PostgreSQL and ElastiCache. Having a single source of truth and a replayable event history was a big win.

As the infrastructure grew in complexity, Puppet and Foreman proved insufficient. We needed a solution for the entire AWS infrastructure and settled on Terraform and Terragrunt. We gradually migrated every AWS resource into Terraform and managed it as code. Since Spacelift did not exist yet, I built a simple CI/CD pipeline. It posted Terraform plans on pull requests and the results once the apply had completed. I gave a talk about our Terraform experience, available for viewing here (Polish).

Oct 2013 - Nov 2014

Senior Software Engineer

F-Secure (Remote)

https://f-secure.com

This company was my first experience with a multinational enterprise. The project was essentially an S3 for big telecoms. Despite my familiarity with the stack in use (Python, PostgreSQL, PL/pgSQL, PL/Proxy, Memcached, RabbitMQ), I had to learn a lot. The delivered features had to be thoroughly tested and perform reliably at scale. That is why I focused on rewriting the automated high-availability tests (using py.test). Running them involved provisioning the entire stack on a staging environment and inducing failures in selected components. The end result was faster, more reliable builds and the elimination of random failures. This led to a significant acceleration of the release cycle, proving that DevOps is not merely a buzzword.

Jul 2012 - Oct 2013

Software Engineer

ClearCode

https://clearcode.cc/portfolio/kanary/

Given my experience in the ad networking business, I was presented with an exciting project to lead — a DSP (Demand-Side Platform) in the RTB ecosystem.

Regardless of the traffic volume, each request had to be handled within 100ms. I approached the role with passion, acknowledging and learning from my mistakes.

Despite initial over-engineering with Python/Twisted and Redis, I eventually delivered a functional POC capable of handling over 1500 requests per second on a single t2.medium instance.

It was the first time I took the TDD approach using py.test for integration and unit testing.

For reporting and BI, I utilized RabbitMQ for async processing and MongoDB for storage. This database taught me the valuable lesson that not all mainstream software products are worth the hype.

Oct 2011 - Sep 2012

Python Developer

BusinessClick / Money.pl

https://businessclick.com

I was hired internally, from within the same capital group. It was the biggest ad network in Poland at that time.

Mostly responsible for the frontend (Python, HTML, JS, CSS), I quickly picked up the design of the entire system, including the backend (Python, Twisted).

The architecture was centered around PostgreSQL, PgBouncer, PL/Proxy, and PL/pgSQL. It was my chance to understand the internals of relational databases. We also successfully replaced Memcached with Redis, which was cutting-edge at the time. Adoption of StatsD, collectd, and Graphite helped us understand the runtime in depth.

Cloud was not yet mainstream, so we handled the physical infrastructure with a team of sysadmins. To update the hardware, we had to go to the datacenter and swap blade servers ourselves.

Jul 2011 - Sep 2012

Internship

MyStock.pl / Money.pl

https://mystock.pl

My professional journey in software began with MyStock, a social networking platform delivering real-time stock quotes directly from the Polish Stock Exchange to thousands of users.

I learned Python the hard way, using the Twisted framework, and managed PostgreSQL with psycopg. With proven Linux experience, I quickly earned trust in maintaining the existing production infrastructure and codebase.

Education (1)

2007 - 2011

Engineer

Computer Science, Electronics

Wrocław University of Technology (Politechnika Wrocławska)

Grade: 4.6

Certificates

2022-03-26

AWS Certified Solutions Architect - Associate

Amazon Web Services Training and Certification

https://www.credly.com/badges/12ef8245-8a50-40f2-9f25-f55c232a674d

Languages

English

Fluent

Polish

Native speaker