In this issue of Coffee with Calyptus, we sit down with Varshith Anilkumar, Founding Engineering Lead at Every Health and a technologist who has built and scaled systems across e-commerce, fintech, climatetech, and healthtech. From wildfire detection networks running on microcontrollers to Snowflake migrations in tightly regulated environments, Varshith has navigated some of the most demanding engineering challenges out there. His journey reveals a blueprint for building resilient AI systems even when resources are scarce.

You went from building data pipelines at Wayfair to detecting forest fires with AI at Dryad Networks, and now you're freelancing across IoT and fintech. What's harder, migrating a Postgres warehouse to Snowflake at SumUp, or architecting a fault-tolerant sensor network that needs to catch a fire before it spreads? Please walk us through your journey.
Each experience has its own interesting learnings and challenges. At Wayfair, the drive at the time was to standardise and scale ML engineering systems that could support many different ML use cases across the company. At SumUp, the challenge was to give every team the autonomy to treat data like a product, while ensuring the essential regulatory standards of the fintech space were met. At Dryad, it was rearchitecting the entire software architecture to scale the solution and help the company transition from a startup to a scaleup, while maintaining the same standards and SLAs that had proven the solution and built credibility in the climate tech space for an ultra-early wildfire detection and prevention system.
Each of these experiences provided key insights that transferred directly to the others. My journey in tech has been quite diverse, and I am very grateful and lucky to have worked across numerous domains, from e-commerce to fintech to healthtech to climatetech, each with its own interesting challenges. Scalability has been one key theme throughout, be it scaling engineering systems or scaling engineering teams. This diverse set of experiences has equipped and positioned me well to help similar companies scale their solutions and technical teams, which I continue to do today.
BlockMMP, your NIH-funded blockchain project tracking opioid prescriptions, has been running since 2019 while you've held full-time roles at SumUp and Dryad. How do you balance building a regulated healthcare product on the side with leading entire engineering divisions at day jobs?
BlockMMP was born out of an enquiry into the opioid crisis, which was and still is a huge problem in North America. I got involved with the community working on problems in this space, initially through a hackathon, and we identified a pressing problem around prescription tracking and the discrimination faced by people with opioid use disorders. With the support of that community, we were able to devise a solution built on a permissioned blockchain framework and eventually secure an NIH grant. Having a co-founder based in the US helped significantly, and I was able to transition into a more passive advisory role, which helped me balance my roles here in Germany. In a space as niche as the opioid crisis, I also focused on building the right partnerships and counsel; COBRE (Centers of Biomedical Research Excellence) at Brown University is one of them. These partnerships and alliances provided the guidance and support needed to build and grow BlockMMP.
What are some architectural mistakes you see companies make when they try to deploy AI at the edge versus keeping it in the cloud?
This is a very nuanced question, as edge-based AI solutions vary vastly depending on the use case and on the resource constraints that come with it. In Dryad's case, for example, compute at the edge is extremely restricted: the solution is deployed within forest sites, so you need to be very strategic and mindful, and compute is limited to a microcontroller architecture on a PCB. In other use cases, a company might be able to afford Raspberry Pis running at the edge, and the compute constraints there are very different.
Some of the more generalised architectural mistakes that companies need to be mindful of are:
- Lack of a simulation setup that lets you benchmark and set thresholds for the compute metrics your edge-based AI solution needs. A small test setup is crucial for properly calibrating and benchmarking your solution.
- Lack of proper versioning for firmware and software updates at the edge. This is highly crucial: you need an easy way to roll back and a robust way to track changes between software versions, as you might have solutions deployed across different regions globally.
- Another pitfall is designing systems that assume constant connectivity to the cloud. When the connection drops, the system fails instead of degrading gracefully; there is often no local fallback or caching strategy.
- Inadequate monitoring and debugging. It is much harder to debug edge devices than cloud services, and companies often don't build sufficient telemetry, logging, or remote diagnostics into their edge systems, making it nearly impossible to understand failures in production. You also need to constantly upgrade and maintain your monitoring and logging infrastructure. For early-stage startups or smaller companies, it is almost always better to go with a cloud-based solution via AWS, for instance, as this significantly lowers the operational overhead of monitoring and logging.
- Misalignment with latency constraints, which are crucial for real-time applications like robotics, AR/VR, or autonomous vehicles. Again, a test environment is very useful here, and if you can afford it, a pilot test site helps ensure you can meet your SLA requirements.
- No clear decision framework. Without explicit criteria for which workloads run where, teams make ad-hoc decisions, leading to inconsistent architectures.
- Lack of fallback strategies. Not designing for graceful degradation: when the edge fails, there is no cloud fallback; when the cloud is unreachable, the edge can't operate independently. A minimal sketch of the offline-first pattern is shown after this list.
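To make the graceful-degradation point concrete, here is a minimal sketch of an offline-first edge loop in Python: readings are buffered locally while the cloud is unreachable and flushed once connectivity returns. This is an illustration under assumptions, not Dryad's actual implementation; the `read_sensor` and `upload` functions are hypothetical placeholders, and a real microcontroller deployment would use C or similar rather than Python.

```python
import time
from collections import deque

# Hypothetical placeholders for the device- and backend-specific parts.
def read_sensor() -> dict:
    """Return one sensor reading; in practice this talks to real hardware."""
    return {"ts": time.time(), "gas_ppm": 0.0}

def upload(batch: list[dict]) -> bool:
    """Try to push a batch of readings to the cloud; return True on success."""
    raise NotImplementedError  # replace with your MQTT/HTTP client call

MAX_BUFFER = 10_000               # cap memory use on a constrained device
buffer: deque = deque(maxlen=MAX_BUFFER)

def run_loop(poll_interval_s: float = 10.0) -> None:
    while True:
        buffer.append(read_sensor())
        try:
            if upload(list(buffer)):   # flush everything buffered so far
                buffer.clear()
        except Exception:
            # Connectivity is down: keep buffering and degrade gracefully
            # instead of crashing; the oldest readings are dropped first
            # if the bounded buffer fills up.
            pass
        time.sleep(poll_interval_s)
```

The key design choice is that the upload path is allowed to fail without taking the sensing loop down with it, and the local buffer is bounded so the device never exhausts its memory during a long outage.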
As a freelance engineering leader now, how are you using AI agents or automation tools to scale your own consulting work?
This is another nuanced question; how I use AI agents and agentic frameworks depends on the use case and the type of company.
A few use cases:
- Following a spec-driven development framework such as AWS Kiro, with coding agents helping to maintain context and boundaries when automating agentic AI workflows.
- I have also created project-specific frameworks that automate agent creation in a way that complies with the regulatory requirements of that project; a rough sketch of this kind of boundary check follows this list.
- MCP integration has been very useful as well, especially for testing and iteration. This is where many agentic AI workflows truly shine during initial testing and prototyping.
- I also rely on a number of AI productivity tools that save me a lot of time: Notion with its AI features, Gemini for transcription and summaries, and NotebookLM, which is an amazing tool for brainstorming and organising content alongside Notion AI and Miro.
- These are just a few examples; there are many more customised workflows I set up to stay productive, but those vary from project to project, of course.
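As a rough illustration of the "context and boundaries" idea above, here is a minimal Python sketch of a guardrail that checks an agent-proposed action against a project spec before it is executed. The spec format, the `Action` schema, and the `is_allowed` logic are assumptions for illustration only and are not tied to AWS Kiro, MCP, or any specific framework.

```python
import json
from dataclasses import dataclass
from pathlib import Path

@dataclass
class Action:
    """A single step proposed by a coding agent (hypothetical schema)."""
    kind: str      # e.g. "edit_file" or "run_command"
    target: str    # file path or command line

def load_spec(path: str) -> dict:
    """Load the project spec that defines the agent's boundaries."""
    return json.loads(Path(path).read_text())

def is_allowed(action: Action, spec: dict) -> bool:
    """Allow an action only if it stays inside the spec's boundaries."""
    if action.kind == "edit_file":
        allowed_dirs = spec.get("editable_dirs", [])
        return any(action.target.startswith(d) for d in allowed_dirs)
    if action.kind == "run_command":
        allowed_cmds = spec.get("allowed_commands", [])
        return action.target.split()[0] in allowed_cmds
    return False  # anything the spec does not cover is rejected

# Example usage with an assumed spec file:
# spec = load_spec("project_spec.json")
# if is_allowed(Action("edit_file", "src/app.py"), spec):
#     ...hand the action back to the agent runtime for execution...
```

Rejecting anything the spec does not explicitly allow keeps the automation auditable, which is what makes this kind of workflow workable in regulated projects.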
From your NEMIC Med Tech Leadership Program to now consulting on highly regulated products, you've seen compliance from every angle. How do you approach the challenges of innovation within a regulated environment?
On the contrary, in many cases a lot of innovation arises because of compliance. These constraints and boundaries are often essential. For example, at SumUp, we were able to create a flexible data mesh architecture that sped up innovation and gave each team autonomy, while ensuring the regulatory compliance we needed for data and the SLAs we had to maintain.
- When you look at edge-based solutions as well, many of the enforced standards are what enable you to scale your architectures seamlessly. A key theme I have seen hold true is that standardisation enables scalable innovation.
- Another key insight from working on projects in healthtech, fintech and other regulated sectors is to start with compliance in mind from day one. Tech debt and operational debt grow exponentially over time, especially when you are an early-stage startup working on an MVP. With the tools and frameworks we have at our disposal today, we can definitely start with compliance in mind from day one, and this is something more and more companies need to realise.
- A final key insight for navigating highly regulated sectors is to always maintain reproducibility and transparency, especially in data workflows. That is what enables you to become compliant, and it is what regulatory bodies ultimately value and look at; a small sketch of what this can look like in a data pipeline follows below.
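To illustrate the reproducibility point, here is a small, generic Python sketch (not tied to any specific project or regulator) of a pipeline step that records a hashed manifest of its inputs, outputs, parameters, and code version, so that every run can be reproduced and audited later. The file layout and field names are assumptions for illustration.

```python
import hashlib
import json
import time
from pathlib import Path

def file_sha256(path: str) -> str:
    """Content hash of a file, so auditors can verify the exact data used."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()

def write_run_manifest(inputs: list[str], outputs: list[str],
                       params: dict, code_version: str,
                       manifest_dir: str = "manifests") -> str:
    """Persist an audit record for one pipeline run and return its path."""
    manifest = {
        "run_at": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime()),
        "code_version": code_version,          # e.g. a git commit SHA
        "params": params,                      # every knob that affects output
        "inputs": {p: file_sha256(p) for p in inputs},
        "outputs": {p: file_sha256(p) for p in outputs},
    }
    Path(manifest_dir).mkdir(exist_ok=True)
    out_path = Path(manifest_dir) / f"run_{int(time.time())}.json"
    out_path.write_text(json.dumps(manifest, indent=2))
    return str(out_path)
```

The point is not the particular format but that every run leaves behind enough information to be re-executed and inspected, which is exactly what regulators ask for.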
We hope you enjoyed this edition of Coffee with Calyptus. Stay curious, stay inspired, and keep building what matters. Explore more editions and insightful articles at



