July 30, 2019

How I became a machine learning practitioner

For the first three years of OpenAI, I dreamed of becoming a machine learning expert but made little progress towards that goal. Over the past nine months, I’ve finally made the transition to being a machine learning practitioner. It was hard but not impossible, and I think most people who are good programmers and know (or are willing to learn) the math can do it too. There are many online courses to self-study the technical side, and what turned out to be my biggest blocker was a mental barrier — getting ok with being a beginner again.

Studying machine learning during the 2018 holiday season.

Early days

A founding principle of OpenAI is that we value research and engineering equally: our goal is to build working systems that solve previously impossible tasks, so we need both. (In fact, our team is composed of 25% people primarily using software skills, 25% primarily using machine learning skills, and 50% doing a hybrid of the two.) So from day one of OpenAI, my software skills were always in demand, and I kept procrastinating on picking up the machine learning skills I wanted.

After helping build OpenAI Gym, I was called to work on Universe. And as Universe was winding down, we decided to start working on Dota — and we needed someone to turn the game into a reinforcement learning environment before any machine learning could begin.

Dota

Turning such a complex game into a research environment without source code access was awesome work, and the team’s excitement every time I overcame a new obstacle was deeply validating. I figured out how to break out of the game’s Lua sandbox, LD_PRELOAD in a Go gRPC server to programmatically control the game, incrementally dump the whole game state into a Protobuf, and build a Python library and abstractions with future compatibility for the many different multiagent configurations we might want to use.

But I felt half blind. At Stripe, though I gravitated towards infrastructure solutions, I could make changes anywhere in the stack since I knew the product code intimately. In Dota, I was constrained to looking at all problems through a software lens, which sometimes meant I tried to solve hard problems that could be avoided by just doing the machine learning slightly differently.

I wanted to be like my teammates Jakub Pachocki and Szymon Sidor, who had made the core breakthrough that powered our Dota bot. They had questioned the common wisdom within OpenAI that reinforcement learning algorithms didn’t scale. They wrote a distributed reinforcement learning framework called Rapid and scaled it exponentially every two weeks or so, and we never hit a wall with it. I wanted to be able to make critical contributions like that, ones that combined software and machine learning skills.

Szymon on the left; Jakub on the right.

In July 2017, it looked like I might have my chance. The software infrastructure was stable, and I began work on a machine learning project. My goal was to use behavioral cloning to teach a neural network from human training data. But I wasn’t quite prepared for just how much I would feel like a beginner.
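
For readers new to the term, behavioral cloning treats imitation as ordinary supervised learning: the model is fit to recorded (state, action) pairs from demonstrations. The sketch below is only an illustration of that idea, not the Dota code. The one-dimensional "move toward the target" task, the synthetic demonstrations, and every name in it are assumptions made up for this example, and a logistic-regression policy stands in for a neural network.

```python
import math
import random


def score(weights, bias, state):
    """Linear score of a state; positive means the policy prefers action 1."""
    return bias + sum(w * x for w, x in zip(weights, state))


def train_behavioral_cloning(demos, n_features, epochs=200, lr=0.5):
    """Fit a logistic-regression policy to (state, action) pairs, action in {0, 1}."""
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        for state, action in demos:
            # Clamp the logit so math.exp cannot overflow.
            logit = max(-30.0, min(30.0, score(weights, bias, state)))
            prob = 1.0 / (1.0 + math.exp(-logit))
            error = prob - action  # gradient of cross-entropy w.r.t. the logit
            weights = [w - lr * error * x for w, x in zip(weights, state)]
            bias -= lr * error
    return weights, bias


def policy(weights, bias, state):
    """Pick the action the cloned policy considers more likely."""
    return 1 if score(weights, bias, state) > 0 else 0


# Hypothetical demonstrations: the "expert" moves right (1) whenever the
# target is to the right of the agent, and left (0) otherwise.
rng = random.Random(0)
demos = []
for _ in range(200):
    agent_x, target_x = rng.uniform(-1, 1), rng.uniform(-1, 1)
    demos.append(((agent_x, target_x), 1 if target_x > agent_x else 0))

weights, bias = train_behavioral_cloning(demos, n_features=2)
accuracy = sum(policy(weights, bias, s) == a for s, a in demos) / len(demos)
print(f"agreement with demonstrations: {accuracy:.2f}")
```

The policy never sees a reward. It only learns to agree with the demonstrator, which is why the quality of the recorded data matters so much.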

I kept being frustrated by small workflow details which made me uncertain if I was making progress, such as not being certain which code a given experiment had used or realizing I needed to compare against a result from last week that I hadn’t properly archived. To make things worse, I kept discovering small bugs that had been corrupting my results the whole time.

I didn’t feel confident in my work, but to make it worse, other people did. People would mention how hard behavioral cloning from human data is. I always made sure to correct them by pointing out that I was a newbie, and this probably said more about my abilities than the problem.

It all briefly felt worth it when my code made it into the bot, as Jie Tang used it as the starting point for creep blocking which he then fine-tuned with reinforcement learning. But soon Jie figured out how to get better results without using my code, and I had nothing to show for my efforts.

I never tried machine learning on the Dota project again.

Time out

After we lost two games in The International in 2018, most observers thought we’d topped out what our approach could do. But we knew from our metrics that we were right on the edge of success and mostly needed more training. This meant the demands on my time had relented, and in November 2018, I felt I had an opening to take a gamble with three months of my time.

Team members in high spirits after losing our first game at The International.

I learn best when I have something specific in mind to build. I decided to try building a chatbot. I started self-studying the curriculum we developed for our Fellows program, selecting only the NLP-relevant modules. For example, I wrote and trained an LSTM language model and then a Transformer-based one. I also read up on topics like information theory and read many papers, poring over each line until I fully absorbed it.

It was slow going, but this time I expected it. I didn’t experience flow state. I was reminded of how I’d felt when I just started programming, and I kept thinking of how many years it had taken to achieve a feeling of mastery. I honestly wasn’t confident that I would ever become good at machine learning. But I kept pushing because… well, honestly because I didn’t want to be constrained to only understanding one part of my projects. I wanted to see the whole picture clearly.

My personal life was also an important factor in keeping me going. I’d begun a relationship with someone who made me feel it was ok if I failed. I spent our first holiday season together beating my head against the machine learning wall, but she was there with me no matter how many planned activities it meant skipping.

One important conceptual step was overcoming a barrier I’d been too timid to tackle with Dota: making substantive changes to someone else’s machine learning code. I fine-tuned GPT-1 on chat datasets I’d found, and made a small change to add my own naive sampling code. But it became so painfully slow as I tried to generate longer messages that my frustration overwhelmed my fear, and I implemented GPU caching, a change which touched the entire model.

I had to try a few times, throwing out my changes as they exceeded the complexity I could hold in my head. By the time I got it working a few days later, I realized I’d learned something that I would have previously thought impossible: I now understood how the whole model was put together, down to small stylistic details like how the codebase elegantly handles TensorFlow variable scopes.
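
The post does not say what the caching stored. One common reason naive sampling slows down as messages grow is that every step recomputes the attention keys and values for all earlier tokens, and the sketch below assumes that is the kind of work being cached. It is a toy single attention layer in pure Python with made-up weights, not the GPT-1 code, and it does not touch a GPU. All names in it are hypothetical.

```python
import math

projection_calls = 0  # counts matrix projections, the work that caching saves


def project(vec, matrix):
    """Multiply a row vector by a matrix stored as a list of rows."""
    global projection_calls
    projection_calls += 1
    return [sum(v * m for v, m in zip(vec, column)) for column in zip(*matrix)]


def attend(query, keys, values):
    """Softmax-weighted average of values, scored by scaled dot products."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    peak = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - peak) for s in scores]
    total = sum(weights)
    return [
        sum(w * value[i] for w, value in zip(weights, values)) / total
        for i in range(len(values[0]))
    ]


def generate_naive(first, w_q, w_k, w_v, steps):
    """Recompute keys and values for the whole history at every step."""
    inputs = [first]
    for _ in range(steps):
        keys = [project(x, w_k) for x in inputs]
        values = [project(x, w_v) for x in inputs]
        query = project(inputs[-1], w_q)
        inputs.append(attend(query, keys, values))
    return inputs


def generate_cached(first, w_q, w_k, w_v, steps):
    """Keep earlier keys and values, projecting only the newest input."""
    inputs, keys, values = [first], [], []
    for _ in range(steps):
        keys.append(project(inputs[-1], w_k))
        values.append(project(inputs[-1], w_v))
        query = project(inputs[-1], w_q)
        inputs.append(attend(query, keys, values))
    return inputs


# Hypothetical 2x2 weights and starting input, chosen only for illustration.
W_Q = [[0.5, -0.2], [0.1, 0.8]]
W_K = [[0.3, 0.7], [-0.6, 0.2]]
W_V = [[1.0, 0.4], [-0.3, 0.9]]
FIRST_INPUT = [1.0, 0.5]


def run_and_count(generate, steps):
    """Run a generator and report how many projections it performed."""
    global projection_calls
    projection_calls = 0
    outputs = generate(FIRST_INPUT, W_Q, W_K, W_V, steps)
    return outputs, projection_calls


naive_out, naive_calls = run_and_count(generate_naive, steps=8)
cached_out, cached_calls = run_and_count(generate_cached, steps=8)
print(naive_calls, cached_calls)
```

Both functions produce the same sequence, but the naive version's projection count grows quadratically with the number of generated steps while the cached version's grows linearly. In a real model the cache has to be threaded through every layer, which is consistent with a change that touches the entire model.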

Retooled

After three months of self-study, I felt ready to work on an actual project. This was also the first point where I felt I could benefit from the many experts we have at OpenAI, and I was delighted when Jakub and my co-founder Ilya Sutskever agreed to advise me.

Ilya singing karaoke at our company offsite.

We started to get very exciting results, and Jakub and Szymon joined the project full-time. I feel proud every time I see a commit from them in the machine learning codebase I’d started.

I’m starting to feel competent, though I haven’t yet achieved mastery. I’m seeing this reflected in the number of hours I can motivate myself to spend focused on machine learning work: I’m now at around 75% of the hours I have historically spent coding.

But for the first time, I feel that I’m on the right trajectory. At first, I was overwhelmed by the seemingly endless stream of new machine learning concepts. Within the first six months, I realized that I could make progress without constantly learning entirely new primitives. I still need to get more experience with many skills, such as initializing a network or setting a learning rate schedule, but now the work feels incremental rather than potentially impossible.

From our Fellows and Scholars programs, I’d known that software engineers with solid fundamentals in linear algebra and probability can become machine learning engineers with just a few months of self-study. Yet somehow I’d convinced myself that I was the exception and couldn’t learn. I was wrong: even embedded in the middle of OpenAI, I couldn’t make the transition because I was unwilling to become a beginner again.

You’re probably not an exception either. If you’d like to become a deep learning practitioner, you can. You need to give yourself the space and time to fail. If you learn from enough failures, you’ll succeed — and it’ll probably take much less time than you expect.

At some point, it does become important to surround yourself with existing experts. And that is one place where I’m incredibly lucky. If you’re a great software engineer who reaches that point, keep in mind there’s a way you can be surrounded by the same people as I am: apply to OpenAI!
