Splitting engineering teams into defense and offense


October 14, 2024

Author: Daksh Gupta

I’m Daksh, one of the co-founders of Greptile. We build AI that understands large codebases, which you can query via an API. Large software teams use it for things like AI-powered code reviews and diagnosing root causes for outages.

We are a team of 4 engineers. Our customers are often surprised to learn this, since they assume we must be much larger given the breadth of our product.

While this is flattering, the truth is that our product is covered in warts, and our “lean” team is more a product of our inability to identify and hire great engineers than of an insistence on superhuman efficiency.

The result is that our product breaks more often than we’d like. The core functionality may remain largely intact but the periphery is often buggy, something we expect will improve only as our engineering headcount catches up to our product scope.

Nevertheless, the reason we get anything done at all in these circumstances has to do with a specific way in which we structure our engineering team.

Event-driven vs. long-running processes

15 years ago, Paul Graham wrote about the “maker vs. manager schedule”, the idea that makers, such as software developers, differ from managers in that they need long, uninterrupted hours to build great things.

This essay resonated with engineers around the world who had been trying to squeeze their work in between endless mandatory meetings, and probably led to some sweeping changes at least at software-driven companies in favor of creating a “maker-schedule” for engineers.

Small startups don’t suffer from an excess of meetings and instead have a whole different problem. Customers!

Without dedicated technical support teams and usually with immature products, engineers take on a lot of support work - from hotfixes to building small features for large customers, to just helping customers navigate their products.

With enough customers, there is very little time to build new features and make ambitious, complex changes to the codebase.

The engineering work that comes from customers, whether it is general support, bug fixes, or small modifications, can be considered “event-driven” engineering.

The engineering work that goes into longer-term (more than a week), ambitious projects can be considered “long-running” engineering.

These two are at odds.

The fortress

Our solution to this problem has been simple but, so far, effective. This is not meant to be prescriptive. Every engineering team is different.

We instruct half the team (2 engineers) at a given point to work on long-running tasks in 2-4 week blocks. This could be refactors, big features, etc. During this time, they don’t have to deal with any support tickets or bugs.

Their only job is to focus on getting their big PR out.

The other half of engineers must simply protect the first two from any support work, bugs, etc. Their job is to maintain a fortress around the long-running processes, by catching all the event-driven engineering work. At the end of the cycle, we swap.
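
The swap described above can be sketched as a tiny schedule generator. This is only an illustration, not how we actually track rotations: the function name `rotation`, the `Cycle` record, the placeholder engineer names, and the fixed cycle length are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class Cycle:
    start: date
    end: date                  # inclusive last day of the cycle
    offense: tuple[str, ...]   # long-running work, shielded from support
    defense: tuple[str, ...]   # catches all event-driven work


def rotation(engineers, first_day, cycle_weeks, cycles):
    """Build a schedule in which the two halves swap roles every cycle.

    Hypothetical helper: splits the team down the middle and alternates
    which half plays offense.
    """
    if len(engineers) % 2 != 0:
        raise ValueError("this sketch assumes an even-sized team")
    half = len(engineers) // 2
    group_a, group_b = tuple(engineers[:half]), tuple(engineers[half:])

    schedule = []
    start = first_day
    for i in range(cycles):
        end = start + timedelta(weeks=cycle_weeks) - timedelta(days=1)
        # Even cycles: group A on offense. Odd cycles: the halves swap.
        offense, defense = (group_a, group_b) if i % 2 == 0 else (group_b, group_a)
        schedule.append(Cycle(start, end, offense, defense))
        start = end + timedelta(days=1)
    return schedule


# Placeholder names; cycle_weeks can be anything in the 2-4 week range.
schedule = rotation(["eng1", "eng2", "eng3", "eng4"], date(2024, 10, 14),
                    cycle_weeks=2, cycles=4)
for c in schedule:
    print(c.start, c.end, "offense:", c.offense, "defense:", c.defense)
```

The only rule the sketch encodes is the one that matters: at any point exactly half the team is on offense, and nobody stays on defense two cycles in a row.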

Why this works

Remarkable things happen when you take distractions away from a craftsperson. They can spend more time in flow and keep a large amount of context on the “client-side” of their brains.

Critically, it takes only 1-2 short interruptions to dramatically reduce the amount of work an engineer can do in a day. This chart sums it up well.

Figure: Impact of interruptions on developer productivity

It follows then that it’s far more useful to isolate interruptions to a few people rather than disperse them to “keep everyone productive”. If you’re spending some amount of time on support, incrementally more time spent on support will not affect your productivity much.
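
To make the arithmetic concrete, here is a toy model. The curve is an assumption invented for illustration, not data from the chart: it supposes each interruption leaves an engineer with half of whatever deep work they had left that day. The names `deep_work_fraction` and `team_output` and the figure of 8 daily interruptions are likewise hypothetical.

```python
def deep_work_fraction(interruptions, retained=0.5):
    """Fraction of a day's deep work that survives the given interruptions.

    ASSUMPTION: each interruption keeps only `retained` of what was left,
    so the first interruptions hurt the most and later ones barely matter.
    """
    return retained ** interruptions


def team_output(interruptions_per_engineer):
    """Total deep work across the team, in engineer-days."""
    return sum(deep_work_fraction(k) for k in interruptions_per_engineer)


# Same 8 interruptions in a day, distributed two different ways.
dispersed = team_output([2, 2, 2, 2])      # everyone shares the load
concentrated = team_output([0, 0, 4, 4])   # two engineers absorb all of it

print("dispersed:", dispersed)
print("concentrated:", concentrated)
```

Under this made-up curve, dispersing the interruptions leaves 1.0 engineer-days of deep work, while concentrating them leaves 2.125. Any curve that drops steeply at first and then flattens gives the same qualitative result.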

Figure: The impact of interruptions on productivity

Defense/Offense engineering

A mental model that I have found useful is to view event-driven engineering as “defensive” and long-running processes as “offensive”. This maps nicely onto the effect that each has.

Defensive engineering exists to maintain your product, whereas offensive engineering exists to expand it.

Defensive engineering more strongly correlates with your retention and customer satisfaction, whereas offensive engineering arguably correlates a little more strongly with your ability to acquire new customers.

Disclaimer

Not only am I not a professional engineering manager, this is also a very specific and usually ephemeral situation - a small team running a disproportionately fast-growing product in a hyper-competitive and fast-evolving space.

This is not advice, but rather an observation about how we run our engineering team.