Software development 450 words per minute

“Something’s a little bit off here.” That’s what I predict your first thought to be upon seeing my cubicle for the first time. There’s no screen or mouse in sight. Instead there’s a guy hammering away on a keyboard, staring at seemingly nothing.

It’s only me, and my colleagues can assure you that I’m mostly harmless. I’m a software developer working at Vincit offices in Tampere. I’m also blind. In this blog post I’m going to shed some light on the way I work.

Are you blind as in actually blind?


Correct. I can perceive sunlight and some other really bright lights but that’s about it. In essence, nothing that would be useful for me at work.


What are you doing there, then?


The same as almost everyone else, that is: making software and bantering with my colleagues whenever the time permits. I have worked in full stack web projects with a focus on the backend. I have also taken up the role of a general accessibility consultant – or police; depends on how you look at it.

How do you use the computer?


The computer I use is a perfectly normal laptop running Windows 10. It’s in the software where the “magic happens”. I use a program called a screen reader to access the computer. A screen reader intercepts what’s happening on the screen and presents that information via braille (through a separate braille display) or synthetic speech. And it’s not the kind of synthetic speech you hear in today’s smart assistants. I use a robotic-sounding voice which speaks at around 450 words per minute. For comparison, English is commonly spoken at around 120-150 words per minute. There’s one additional quirk in my setup: Since I need to read both Finnish and English regularly I’m reading English with a Finnish speech synthesizer. Back in the old days screen readers weren’t smart enough to switch between languages automatically, so this was what I got used to. Here’s a sample of this paragraph being read as I would read it:


And here’s the same text spoken by an English speech synthesizer:



A mouse is naturally not very useful to me so I work exclusively at the keyboard. The commands I use should be familiar to anyone reading this post: Arrow keys and the tab key move you around inside a window, alt+tab changes between windows etc. Screen readers also have a whole lot of shortcuts of their own, such as reading various parts of the active window or turning some of their features on or off.

It’s when reading web pages and other formatted documents that things get a little interesting. You see, a screen reader presents its information in chunks. That chunk is most often a line but it may also be a word, a character or any other arbitrary piece of text. For example, if I press the down arrow key on a web page I hear the next line of the page. This type of reading means that I can’t just scan the contents of my screen the same way a sighted person would do with their eyes. Instead, I have to read through everything chunk by chunk, or skip over those chunks I don’t care about.


Speech or braille alone can’t paint an accurate representation of how a window is laid out visually. All the information is presented to me in a linear fashion. If you copy a web page and paste it into notepad you get a rough idea of how web pages look to me. It’s just a bunch of lines stacked on top of another with most of the formatting stripped out. However, a screen reader can pick up on the semantics used in the HTML of the web page, so that links, headings, form fields etc. are announced to me correctly. That’s right: I don’t know that a check box is a check box if it’s only styled to look like one. However, more on that later; I’ll be devoting an entire post to this subject. Just remember that the example I just gave is a crime against humanity.

(Translator’s note: this made me feel guilty and ashamed, and it drove home a lesson: don’t misuse semantic tags to hack layout and styling, and don’t let the power of CSS make you too lazy to use the proper semantic tags. A reminder for us all.)

I spend a good deal of my time working at the command line. In fact I rarely use any other graphical applications than a web browser and an editor. I’ve found that it’s often much quicker to do the task at hand on the command line than to use an interface which was primarily designed with mouse users in mind.


So, given my love of the command line, why am I sticking with Windows, the operating system not known for its elegant command line tools? The answer is simple: Windows is the most accessible operating system there is. NVDA, my screen reader of choice, is open source and maintained more actively than any other screen reader out there. If I had the choice I would use Mac OS since in my opinion it strikes a neat balance between usability and functionality. Unfortunately VoiceOver, the screen reader built into Mac OS, suffers from long release cycles and general neglect, and its navigation models aren’t really compatible with my particular way of working. There’s also a screen reader for the Gnome desktop and, while excellently maintained for such a minor user base, there are still rough edges that make it unsuitable for my daily use. So, Windows it is. I’ve been compensating for Windows’ inherent deficiencies by living inside Git Bash, which comes with an excellent set of GNU and other command line utilities out of the box.

How can you code?


It took me quite a long time to figure out why this question was such a big deal for so many people. Remember what I said earlier about reading text line by line? That’s how I read code. I do skip over the lines that aren’t useful to me, or maybe listen only halfway through them just for context, but whenever I actually need to know what’s going on I have to read everything as if I were reading a novel. Naturally I can’t just read through a huge codebase like that. In those cases I have to abstract some parts of the code in my mind: this component takes x as its input and returns y, never mind what it actually does.

This type of reading makes me do some coding tasks a little bit differently than my sighted colleagues. For example, when doing a code review I prefer to look at the raw diff output whenever I can. Side-by-side diffs are not useful to me, in fact they are a distraction if anything. The + and – signs are also a far better indicator of modified lines than background colours, not because I couldn’t get the names of those colours, but because “plus” takes far less time to say than some convoluted shade of red that is used for highlighting an added line. (I am looking at you, Gerrit.)

You might think that indentation and other code formatting would be totally irrelevant to me since those are primarily visual concerns. This is not true: proper indentation helps me just as much as it does a sighted programmer. Whenever I’m reading code in braille (which, by the way, is a lot more efficient than with speech) it gives me a good visual clue of where I am, just like it does for a sighted programmer. I also get verbal announcements whenever I enter an indented or unindented block of text. This information helps me to paint a map of the code in my head. In fact Python was the first real programming language I picked up (PHP doesn’t count) and its forced indentation never posed a problem for me. I’m a strong advocate of a clean and consistent coding style for a number of reasons, but mostly because not having one makes my life much more difficult.

Which editor do you prefer?


Spoiler alert: The answer to this question doesn’t start with either V or E. (Granted, I do use Vim for crafting git commit messages and other quick notes on the command line. I consider myself neutral on this particular minefield.) A year ago my answer would have been, of all things, Notepad++. It’s a lightweight, well-made text editor that gets the job done. However, a year ago I hadn’t worked in a large-scale Java project. When that eventually happened it was time to pick between Notepad++ and my sanity. I ended up clinging to the latter (as long as I can, anyway) and ditching Notepad++ in favour of IntelliJ IDEA. It has been my editor of choice ever since. I have a deeply-rooted aversion towards IDEs since most of them are either inaccessible or inefficient to work with solely on the keyboard. Chances are that I would have switched to using an IDE a lot sooner if I was sighted.

But why Notepad++, you might ask. There are more advanced lightweight editors out there like Sublime Text or Atom. The answer is simple: neither of them is accessible to screen readers. Text-mode editors like Vim aren’t an option either, since the screen reader I use has some problems in its support of console applications that prevent those editors from being used for anything longer than a commit message. Sadly, accessibility is the one thing that has the last word on the tools I use. If a tool isn’t workable enough for me to use it efficiently, it’s out of the question.

Do you ever work with frontend code?


You would think that frontend development was so inherently visual that it would be no place for a blind developer, and for the most part that is true. You won’t find me doing a basic Proof-of-Concept on my own, since those projects tend to be mostly about getting the looks right and adding the real functionality later.


However, I’ve had my fair share of Angular and React work too. How’s that? Many web apps of today have a lot going on under the hood in the browser. For example, I once worked a couple of weeks adding internationalization support to a somewhat complex Angular app. I didn’t need to do any visual changes at all.

I’ve found that libraries like Bootstrap are a godsend for people like me. Because of the grid system I can lay out a rough version of the user interface on my own. Despite this all the interface-related changes I’m doing are going through a pair of eyes before shipping to the customer. So, to sum up: I can do frontend development up to a point, at least while not touching the presentation layer too much.

How about all the things you didn’t mention?


There are certainly a lot of things I had to leave out of this blog post. As promised I’ll be devoting a post to the art of making web pages more accessible, since the lack of proper semantics is one of my pet peeves. However, there’s a good chance I won’t leave it at that. Stay tuned!




Answer by Nipun Ramakrishnan:

Sleep sort


It is another joke algorithm that became popular on the 4chan board /prog/. Created by some anonymous programmer, it basically works like this:

procedure printNumber(n)
    sleep n seconds
    print n

for arg in args
    run printNumber(arg) in background
wait for all processes to finish

In other words, for every element x in an array, start a new program that:

  • Sleeps for x seconds
  • Prints out x

The clock starts on all the elements at the same time.
It works for any array that has non-negative numbers.
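
The pseudocode above can be sketched in JavaScript with timers standing in for background processes. This is only an illustration under assumptions: the function name sleepSort and the unitMs parameter (milliseconds of sleep per unit of value, so the demo does not take whole seconds) are not part of the original answer.

```javascript
// Sleep sort sketch: every value schedules its own timer, and the values are
// collected in the order in which the timers fire.
// `unitMs` is a hypothetical scaling parameter, not part of the original idea.
function sleepSort(values, unitMs = 10) {
  return new Promise((resolve) => {
    const sorted = [];
    if (values.length === 0) {
      resolve(sorted);
      return;
    }
    for (const value of values) {
      // "Sleep" for a time proportional to the value, then emit it.
      setTimeout(() => {
        sorted.push(value);
        if (sorted.length === values.length) resolve(sorted);
      }, value * unitMs);
    }
  });
}

sleepSort([3, 1, 4, 1, 5]).then((result) => console.log(result));
```

As with the original, the running time grows with the largest value rather than with the number of elements, and negative numbers cannot be handled because a timer cannot fire in the past.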

Answer by Ryan Turner:


Bogosort (Monkey Sort)

Bogosort. It even has a strange name! It’s literally classified as a Stupid sort.
Basically, you randomly put each element in a random place.
If it isn’t sorted, you randomly put each element in a random place.
If it isn’t sorted after that… you get the picture. Here’s an example:
4, 7, 9, 6, 5, 5, 2, 1 (unsorted)
2, 5, 4, 7, 5, 9, 6, 1 (randomly placed)
1, 4, 5, 6, 9, 7, 5, 2 (randomly placed again)
1, 2, 4, 5, 5, 6, 7, 9 (wow that was lucky)
Basically you keep randomly shuffling until you get a sorted array.
It’s easily one of the least efficient sorting algorithms, unless you are really, really lucky: it needs about n! shuffles on average, for a horrific expected run time on the order of O(n · n!), and its worst case is unbounded, so it can effectively take forever as you add more elements.
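
A minimal JavaScript sketch of the shuffle-until-sorted loop described above. The helper names (isSorted, shuffle, bogosort) and the shuffle counter are illustrative choices, and the shuffle is a standard Fisher-Yates shuffle rather than anything specified in the answer.

```javascript
// Returns true when the list is in non-decreasing order.
function isSorted(list) {
  for (let i = 1; i < list.length; i++) {
    if (list[i - 1] > list[i]) return false;
  }
  return true;
}

// Fisher-Yates shuffle: rearranges the list in place, uniformly at random.
function shuffle(list) {
  for (let i = list.length - 1; i > 0; i--) {
    const j = Math.floor(Math.random() * (i + 1));
    [list[i], list[j]] = [list[j], list[i]];
  }
}

// Keeps shuffling a copy of the input until it happens to be sorted.
function bogosort(input) {
  const list = [...input];
  let shuffles = 0;
  while (!isSorted(list)) {
    shuffle(list);
    shuffles++;
  }
  return { list, shuffles };
}

const result = bogosort([4, 7, 9, 6, 5]);
console.log(result.list, 'sorted after ' + result.shuffles + ' shuffles');
```

Keep the input short when trying this out: the number of shuffles varies from run to run and grows factorially with the length of the list.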

Answer by Tyler Schroeder:

Quantum Bogosort

I’m quite a fan of quantum bogosort, myself:

  • Randomly shuffle the elements of the list.
  • If the list is not sorted, destroy the universe (this step is left as an activity for the reader).
  • Any surviving universes will then have the sorted version of the list.

Works in O(N) time!*

*Note: this figure relies on the accuracy of the many-worlds theory of quantum mechanics. If the many-worlds theory of quantum mechanics is not accurate, your algorithm is unlikely to work in O(N) time.

Answer by Yi Wang:


I didn’t invent this. I saw it somewhere.
A student went to a copy shop to print some materials. He wanted to print 2 copies. Somehow, instead of printing out 2 copies, he printed every page twice. Let me illustrate.
Pages he wanted: 1 2 3 4 … N; 1 2 3 4 … N
Pages he got: 1 1 2 2 3 3 4 4 … N N
So when he was trying to sort the pages by picking up one page and putting it on the left pile, then picking up another and putting it on the right pile, the owner of the shop took over.
He first put the first page on the left, then picked up 2 pages and put them on the right, then 2 on the left, 2 on the right……
Sorting speed doubled……




Slow Sort


It is a sorting algorithm that is of humorous nature and not useful. It’s based on the principle of multiply and surrender, a tongue-in-cheek play on divide and conquer. It was published in 1986 by Andrei Broder and Jorge Stolfi in their paper Pessimal Algorithms and Simplexity Analysis. Here is the pseudocode:

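function slowSort(array,start,end){
    if( start >= end ) return; // it cannot get any slower than this
    middle = floor( (start+end)/2 );

    slowSort(array,start,middle);   // sort the first half recursively
    slowSort(array,middle+1,end);   // sort the second half recursively

    if( array[end] < array[middle] )
        swap array[end] and array[middle]

    slowSort(array,start,end-1);    // sort everything except the maximum
}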

  • Sort the first half recursively
  • Sort the second half recursively
  • Find the maximum of the whole array by comparing the middle with the end, placing maximum at the end of the list.
  • Recursively sort the entire list without the maximum.

Stack Sort

It searches for all the posts on StackOverflow with ‘sort a list’ in their title, extracts code snippets from them and then runs these code snippets until the list is sorted, which I think it verifies by actually traversing the list. It was posted on xkcd, and an implementation can be found at stacksort.

Random Sort


It is as follows:
Create and compile a random program.
Run the random program on the input array.
If the program produces sorted output, you are done.
Else repeat from the beginning.


Solar Bitflip Sort


The alpha particles from the sun flip a few bits in the memory once in a while, so this algorithm basically hopes that this flipping can cause the elements to be sorted. The way it works is:


1. Check if the array is sorted.
2. If the array is sorted, return the array.
3. If the array is not sorted, wait for 10 seconds and pray that solar radiation flips bits in just the right way, then repeat from step 1.

Spaghetti Sort


It is a linear time algorithm which sorts a sequence of items requiring O(n) stack space in a stable manner. It requires a parallel processor. For simplicity, assume we are sorting a list of natural numbers. The sorting method is illustrated using uncooked rods of spaghetti.

Convert the data to positive real values scaled to the length of a piece of spaghetti.
Write each value onto a piece of spaghetti (a spaghetto) and break the spaghetti to the length equal to the scaled values.
Hold all resulting spaghetti in a bundle and slam the end on a flat surface.
Take the spaghetto which sticks out i.e. the longest one, read the value, convert back to the original value and write this down.
Repeat until all spaghetti are processed.


Stalin Sort


This works in O(n) time complexity.
1. Gather up the comrades and show them the list.
2. Ask the comrades to raise their hands if they believe the list is sorted.
3. Select any comrade who did not raise his hand, and execute him as a traitor.
4. Repeat steps 2 and 3 until everyone agrees that the list is sorted.

Intelligent Design Sort


Whatever state your list is in, it’s already sorted. The reasoning: the probability of the original input list being in the exact order it’s in is 1/(n!). There is such a small likelihood of this that it’s clearly absurd to say that this happened by chance, so it must have been consciously put in that order by an intelligent Sorter. Therefore it’s safe to assume that it’s already optimally sorted in some way that transcends our naive mortal understanding of ‘ascending order’. Any attempt to change that order to conform to our own preconceptions would actually make it less sorted.

The Internet Sort


It’s just a bubble sort, but you perform every comparison by searching the internet. For example, “Which is greater – 0.211 or 0.75?”.

Committee Sort


To sort a list comprised of N natural numbers, start by printing out N copies of the whole list on paper.
Now form N committees from any poor souls who happen to be milling around the office. The number of people on each committee should correspond to their particular value in the list.
Each committee is given their sheet of paper and told to, through a meeting or some other means, decide where in the sorted list their number should fall.
The list will be naturally sorted by the order that the committees return with their results.

Understanding Scope in JavaScript



JavaScript has a feature called Scope. Though the concept of scope is not that easy to understand for many new developers, I will try my best to explain it to you in the simplest terms. Understanding scope will make your code stand out, reduce errors and help you make powerful design patterns with it.

What is Scope?


Scope is the accessibility of variables, functions, and objects in some particular part of your code during runtime. In other words, scope determines the visibility of variables and other resources in areas of your code.


Why Scope? The Principle of Least Access


So, what’s the point in limiting the visibility of variables and not having everything available everywhere in your code? One advantage is that scope provides some level of security to your code. One common principle of computer security is that users should only have access to the stuff they need at a time.


Think of computer administrators. As they have a lot of control over the company’s systems, it might seem okay to grant them user accounts with full access. Suppose you have a company with three administrators, all of them having full access to the systems, and everything is working smoothly. But suddenly something bad happens and one of your systems gets infected with a malicious virus. Now you don’t know whose mistake that was. You realize that you should have given them basic user accounts and only granted full access privileges when needed. This will help you track changes and keep an account of who did what. This is called The Principle of Least Access. Seems intuitive? This principle is also applied to programming language designs, where it is called scope in most programming languages including JavaScript, which we are going to study next.

As you continue on in your programming journey, you will realize that scoping parts of your code helps improve efficiency, track bugs and reduce them. Scope also solves the naming problem when you have variables with the same name but in different scopes. Remember not to confuse scope with context. They are different features.

Scope in JavaScript


In the JavaScript language there are two types of scopes:

  • Global Scope
  • Local Scope

Variables defined inside a function are in local scope while variables defined outside of a function are in the global scope. Each function when invoked creates a new scope.


Global Scope


When you start writing JavaScript in a document, you are already in the Global scope. There is only one Global scope throughout a JavaScript document. A variable is in the Global scope if it’s defined outside of a function.

// the scope is by default global
var name = 'Hammad';

Variables inside the Global scope can be accessed and altered in any other scope.


var name = 'Hammad';

console.log(name); // logs 'Hammad'

function logName() {
    console.log(name); // 'name' is accessible here and everywhere else
}

logName(); // logs 'Hammad'

Local Scope


Variables defined inside a function are in the local scope. And they have a different scope for every call of that function. This means that variables having the same name can be used in different functions. This is because those variables are bound to their respective functions, each having different scopes, and are not accessible in other functions.


// Global Scope
function someFunction() {
    // Local Scope #1
    function someOtherFunction() {
        // Local Scope #2
    }
}

// Global Scope
function anotherFunction() {
    // Local Scope #3
}
// Global Scope
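
To make this concrete, here is a small runnable sketch. The function and variable names (makeGreeting, makeFarewell, countCalls, message, calls) are made up for illustration. It shows two functions using the same variable name without clashing, and a local variable being created afresh on every call.

```javascript
function makeGreeting() {
  var message = 'Hello';   // local to makeGreeting
  return message;
}

function makeFarewell() {
  var message = 'Goodbye'; // a different variable that happens to share the name
  return message;
}

function countCalls() {
  var calls = 0; // created afresh on every invocation
  calls += 1;
  return calls;
}

console.log(makeGreeting()); // logs 'Hello'
console.log(makeFarewell()); // logs 'Goodbye'
console.log(countCalls());   // logs 1
console.log(countCalls());   // logs 1 again: the previous call's scope is gone
console.log(typeof message); // logs 'undefined': message is not visible out here
```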

Block Statements


Block statements like if and switch conditions or for and while loops, unlike functions, don’t create a new scope for variables declared with var. Such variables defined inside of a block statement will remain in the scope they were already in.


if (true) {
    // this 'if' conditional block doesn't create a new scope
    var name = 'Hammad'; // name is still in the global scope
}

console.log(name); // logs 'Hammad'

ECMAScript 6 introduced the let and const keywords. These keywords can be used in place of the var keyword.

var name = 'Hammad';

let likes = 'Coding';
const skills = 'Javascript and PHP';

Contrary to the var keyword, the let and const keywords support the declaration of local scope inside block statements.


if (true) {
    // for 'var', this 'if' conditional block doesn't create a scope

    // name is in the global scope because of the 'var' keyword
    var name = 'Hammad';
    // likes is in the local scope because of the 'let' keyword
    let likes = 'Coding';
    // skills is in the local scope because of the 'const' keyword
    const skills = 'JavaScript and PHP';
}

console.log(name); // logs 'Hammad'
console.log(likes); // Uncaught ReferenceError: likes is not defined
console.log(skills); // Uncaught ReferenceError: skills is not defined
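
The difference matters most in loops. The following sketch (with made-up names varCallbacks and letCallbacks) collects callbacks inside two for loops: with var all callbacks share one function-level variable, while let gives every iteration its own block-scoped binding.

```javascript
var varCallbacks = [];
for (var i = 0; i < 3; i++) {
  // every callback sees the same i, which is 3 once the loop has finished
  varCallbacks.push(function () { return i; });
}

var letCallbacks = [];
for (let j = 0; j < 3; j++) {
  // each iteration has its own j
  letCallbacks.push(function () { return j; });
}

console.log(varCallbacks.map(function (fn) { return fn(); })); // logs [ 3, 3, 3 ]
console.log(letCallbacks.map(function (fn) { return fn(); })); // logs [ 0, 1, 2 ]
console.log(typeof i); // logs 'number': i leaked out of the loop
console.log(typeof j); // logs 'undefined': j only existed inside the loop
```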

Global scope lives as long as your application lives. Local Scope lives as long as your functions are called and executed.




Context


Many developers often confuse scope and context as if they refer to the same concept. But this is not the case. Scope is what we discussed above, and context is used to refer to the value of this in some particular part of your code. Scope refers to the visibility of variables and context refers to the value of this in the same scope. We can also change the context using function methods, which we will discuss later. In the global scope of a browser, context is always the Window object.

// logs: Window {speechSynthesis: SpeechSynthesis, caches: CacheStorage, localStorage: Storage…}
console.log(this);

function logFunction() {
    console.log(this);
}
// logs: Window {speechSynthesis: SpeechSynthesis, caches: CacheStorage, localStorage: Storage…}
// because logFunction() is not a property of an object
logFunction();
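
As for changing the context with function methods, the standard ones are call, apply and bind. Here is a short sketch; the greet function and the two objects are invented for illustration.

```javascript
function greet(greeting) {
  return greeting + ', ' + this.name;
}

var hammad = { name: 'Hammad' };
var other = { name: 'Someone else' };

// call: sets the context and passes the arguments one by one
console.log(greet.call(hammad, 'Hello'));  // logs 'Hello, Hammad'
// apply: sets the context and passes the arguments as an array
console.log(greet.apply(other, ['Hi']));   // logs 'Hi, Someone else'
// bind: returns a new function whose context is fixed
var boundGreet = greet.bind(hammad);
console.log(boundGreet('Welcome'));        // logs 'Welcome, Hammad'
console.log(boundGreet.call(other, 'Hey')); // logs 'Hey, Hammad': a bound context cannot be overridden
```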

If scope is in the method of an object, context will be the object the method is part of.


class User {
    logName() {
        console.log(this);
    }
}

(new User).logName(); // logs User {}

(new User).logName() is a short way of storing your object in a variable and then calling the logName function on it. Here, you don’t need to create a new variable.

One thing you’ll notice is that the value of context behaves differently if you call your functions using the new keyword. The context will then be set to the instance of the called function. Consider one of the examples above with the function called with the new keyword.


function logFunction() {
    console.log(this);
}

new logFunction(); // logs logFunction {}

When a function is called in Strict Mode, the context will default to undefined.
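
A quick way to see this for yourself is sketched below. The function names are made up, and strict mode is switched on for one function only via the 'use strict' directive.

```javascript
function sloppyContext() {
  return this;
}

function strictContext() {
  'use strict';
  return this;
}

console.log(strictContext() === undefined);  // logs true
console.log(sloppyContext() === undefined);  // logs false in a classic (non-module) script
console.log(strictContext.call('explicit')); // logs 'explicit': an explicit context still works
```

Note that ES modules and class bodies are always in strict mode, so there a plain function call gets undefined as its context even without the directive.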

Execution Context


To remove any confusion arising from what we studied above: the word context in Execution Context refers to scope and not context. This is a weird naming convention, but because of the JavaScript specification, we are tied to it.

JavaScript is a single-threaded language so it can only execute a single task at a time. The rest of the tasks are queued in the Execution Context. As I told you earlier, when the JavaScript interpreter starts to execute your code, the context (scope) is by default set to be global. This global context is appended to your execution context and is actually the first context in the execution context.

After that, each function call (invocation) would append its context to the execution context. The same thing happens when another function is called inside that function or somewhere else.


Each function creates its own execution context.


Once the browser is done with the code in that context, that context will then be popped off from the execution context and the state of the current context in the execution context will be transferred to the parent context. The browser always executes the execution context that is at the top of the execution stack (which is actually the innermost level of scope in your code).


There can only be one global context but any number of function contexts.
The execution context has two phases: creation and code execution.
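
The push and pop behaviour of the execution stack can be observed with a few nested calls. This is just an illustrative sketch: the functions first, second and third and the trace array are invented names, and the trace only records when each call starts and finishes.

```javascript
var trace = [];

function first() {
  trace.push('enter first');  // first's context is now on top of the stack
  second();
  trace.push('leave first');  // back on top after second's context was popped
}

function second() {
  trace.push('enter second');
  third();
  trace.push('leave second');
}

function third() {
  trace.push('enter third');  // innermost call: top of the stack
  trace.push('leave third');
}

first();
console.log(trace.join('\n'));
```

The contexts are left in the reverse of the order in which they were entered, which is exactly what you would expect from a stack.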




Creation Phase


The first phase, the creation phase, occurs when a function is called but its code is not yet executed. Three main things happen in the creation phase:

  • Creation of the Variable (Activation) Object,
  • Creation of the Scope Chain, and
  • Setting of the value of context (this)

Variable Object


The Variable Object, also known as the activation object, contains all of the variables, functions and other declarations that are defined in a particular branch of the execution context. When a function is called, the interpreter scans it for all resources including function arguments, variables, and other declarations. Everything, when packed into a single object, becomes the Variable Object.

'variableObject': {
    // contains function arguments, inner variable and function declarations
}
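
The variable object itself cannot be inspected from code, but the effect of this scanning step (commonly called hoisting) is easy to observe. In the sketch below, which uses the made-up names creationPhaseDemo, helper and later, the declarations are already known before the line that contains them is executed.

```javascript
function creationPhaseDemo(argument) {
  // Nothing below this point has run yet, but the declarations already exist.
  var seen = {
    argumentValue: argument,      // arguments are bound when the function is called
    helperType: typeof helper,    // function declarations are available in full
    laterBeforeAssignment: later  // var declarations exist but still hold undefined
  };

  var later = 'assigned during execution';
  function helper() {}

  seen.laterAfterAssignment = later;
  return seen;
}

console.log(creationPhaseDemo(42));
```

The returned object shows helperType as 'function' and laterBeforeAssignment as undefined, while laterAfterAssignment holds the string that was assigned in the execution phase.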

Scope Chain


In the creation phase of the execution context, the scope chain is created after the variable object. The scope chain itself contains the variable object. The Scope Chain is used to resolve variables. When asked to resolve a variable, JavaScript always starts at the innermost level of the code nest and keeps jumping back to the parent scope until it finds the variable or any other resource it is looking for. The scope chain can simply be defined as an object containing the variable object of its own execution context and those of all the execution contexts of its parents: an object holding a bunch of other objects.

'scopeChain': {
    // contains its own variable object and other variable objects of the parent execution contexts
}

The Execution Context Object


The execution context can be represented as an abstract object like this:


executionContextObject = {
    'scopeChain': {}, // contains its own variableObject and other variableObject of the parent execution contexts
    'variableObject': {}, // contains function arguments, inner variable and function declarations
    'this': valueOfThis
}



In the second phase of the execution context, that is the code execution phase, other values are assigned and the code is finally executed.


Lexical Scope


Lexical Scope means that in a nested group of functions, the inner functions have access to the variables and other resources of their parent scope. This means that the child functions are lexically bound to the execution context of their parents. Lexical scope is sometimes also referred to as Static Scope.


function grandfather() {
    var name = 'Hammad';
    // likes is not accessible here
    function parent() {
        // name is accessible here
        // likes is not accessible here
        function child() {
            // Innermost level of the scope chain
            // name is also accessible here
            var likes = 'Coding';
        }
    }
}

The thing you will notice about lexical scope is that it works forward, meaning name can be accessed by its children’s execution contexts. But it doesn’t work backward to its parents, meaning that the variable likes cannot be accessed by its parents. This also tells us that variables having the same name in different execution contexts gain precedence from top to bottom of the execution stack. A variable in the innermost function (the topmost context of the execution stack) that has the same name as a variable in an outer function will have higher precedence.
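A minimal sketch of that precedence rule. The function names outer and inner are hypothetical and not part of the original example; the point is only that the innermost declaration of name wins inside inner(), while the outer variable is left untouched.

```javascript
// Hypothetical names, for illustration only.
function outer() {
    var name = 'Hammad';

    function inner() {
        // This declaration shadows the outer `name`, but only inside inner().
        var name = 'Batman';
        return name;
    }

    // Return both values so the precedence is visible from outside.
    return { innerName: inner(), outerName: name };
}

var result = outer();
console.log(result.innerName); // logs 'Batman'
console.log(result.outerName); // logs 'Hammad'
```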




Closures


The concept of closures is closely related to Lexical Scope, which we studied above. A Closure is created when an inner function tries to access the scope chain of its outer function, meaning the variables outside of its immediate lexical scope. Closures contain their own scope chain, the scope chain of their parents and the global scope.


A closure can not only access the variables defined in its outer function but also the arguments of the outer function.


A closure can also access the variables of its outer function even after the function has returned. This allows the returned function to maintain access to all the resources of the outer function.
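The two points above can be sketched together. The names makeGreeter, greeting, and count are hypothetical: the returned function keeps using both an argument and a local variable of its outer function after that outer function has returned.

```javascript
// makeGreeter and its parameter are hypothetical names for illustration.
function makeGreeter(greeting) {
    var count = 0; // local variable of the outer function

    return function (name) {
        count += 1; // still reachable after makeGreeter has returned
        return greeting + ' ' + name + ' (call ' + count + ')';
    };
}

var sayHi = makeGreeter('Hi');
var first = sayHi('Hammad');
var second = sayHi('Hammad');
console.log(first);  // logs 'Hi Hammad (call 1)'
console.log(second); // logs 'Hi Hammad (call 2)'
```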


When you return an inner function from a function, that returned function will not be called when you call the outer function. You must first save the result of invoking the outer function in a separate variable and then call that variable as a function. Consider this example:


function greet() {
    var name = 'Hammad'; // var keeps name local to greet instead of leaking a global
    return function () {
        console.log('Hi ' + name);
    };
}

greet(); // nothing happens, no errors

// the returned function from greet() gets saved in greetLetter
var greetLetter = greet();

// calling greetLetter calls the returned function from the greet() function
greetLetter(); // logs 'Hi Hammad'

The key thing to note here is that the greetLetter function can access the name variable of the greet function even after greet has returned. One way to call the returned function from the greet function without variable assignment is to use two pairs of parentheses, ()(), like this:


function greet() {
    var name = 'Hammad';
    return function () {
        console.log('Hi ' + name);
    };
}

greet()(); // logs 'Hi Hammad'

Public and Private Scope


In many other programming languages, you can set the visibility of properties and methods of classes using public, private and protected scopes. Consider this example using the PHP language:

在许多其他编程语言中,你可以通过 public、private 和 protected 作用域来设置类中变量和方法的可见性。看下面这个 PHP 的例子

// Public Scope
public $property;
public function method() {
  // ...
}

// Private Scope
private $property;
private function method() {
  // ...
}

// Protected Scope
protected $property;
protected function method() {
  // ...
}

Encapsulating functions away from the public (global) scope protects them from attacks. But in JavaScript, there is no such thing as public or private scope. However, we can emulate this feature using closures. To keep everything separate from the global scope, we must first encapsulate our functions within a function like this:

将函数从公有(全局)作用域中封装,使它们免受攻击。但在 JavaScript 中,没有 共有作用域和私有作用域。然而我们可以用闭包实现这一特性。为了使每个函数从全局中分离出去,我们要将它们封装进如下所示的函数中:

(function () {
  // private scope
})();

The parentheses at the end of the function tell the interpreter to execute it as soon as it is read, without a separate invocation. We can add functions and variables in it and they will not be accessible outside. But what if we want to access them outside, meaning we want some of them to be public and some of them to be private? One type of closure we can use is called the Module Pattern, which allows us to scope our functions using both public and private scopes in an object.

函数结尾的括号告诉解析器立即执行此函数。我们可以在其中加入变量和函数,外部无法访问。但如果我们想在外部访问它们,也就是说我们希望它们一部分是公开的,一部分是私有的。我们可以使用闭包的一种形式,称为模块模式(Module Pattern),它允许我们用一个对象中的公有作用域和私有作用域来划分函数。



The Module Pattern looks like this:


var Module = (function() {
    function privateMethod() {
        // do something
    }

    return {
        publicMethod: function() {
            // can call privateMethod();
        }
    };
})();

The return statement of the Module contains our public functions. The private functions are just those that are not returned. Not returning functions makes them inaccessible outside of the Module namespace. But our public functions can access our private functions, which makes them handy for helper functions, AJAX calls, and other things.

Module 的return语句包含了我们的公共函数。私有函数并没有被return。函数没有被return确保了它们在 Module 命名空间无法访问。但我们的共有函数可以访问我们的私有函数,方便它们使用有用的函数、AJAX 调用或其他东西。

Module.publicMethod(); // works
Module.privateMethod(); // Uncaught TypeError: Module.privateMethod is not a function
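As a more concrete sketch of the pattern, here is a small counter module. The names Counter, count, and clamp are hypothetical and chosen only for illustration: count and clamp stay private because they are never returned, while the two returned functions can still reach them through the closure.

```javascript
// Counter, count, and clamp are hypothetical names for illustration.
var Counter = (function () {
    var count = 0; // private state, only reachable through the closure

    // Private helper: never returned, so invisible outside the module.
    function clamp(value) {
        return value < 0 ? 0 : value;
    }

    return {
        increment: function () {
            count = clamp(count + 1);
            return count;
        },
        decrement: function () {
            count = clamp(count - 1);
            return count;
        }
    };
})();

console.log(Counter.increment());  // logs 1
console.log(Counter.decrement());  // logs 0
console.log(Counter.decrement());  // logs 0 (clamped by the private helper)
console.log(typeof Counter.clamp); // logs 'undefined'
console.log(typeof Counter.count); // logs 'undefined'
```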

One convention is to begin private function names with an underscore and to return an anonymous object containing our public functions. This makes them easy to manage in a long object. This is how it looks:


var Module = (function () {
    function _privateMethod() {
        // do something
    }
    function publicMethod() {
        // do something
    }
    return {
        publicMethod: publicMethod,
    };
})();



Another type of closure is the Immediately-Invoked Function Expression (IIFE). This is a self-invoked anonymous function called in the context of window, meaning that the value of this is set to window. This exposes a single global interface to interact with. This is how it looks:

另一种形式的闭包是立即执行函数表达式(Immediately-Invoked Function Expression,IIFE)。这是一种在 window 上下文中自调用的匿名函数,也就是说this的值是window。它暴露了一个单一全局接口用来交互。如下所示:

(function(window) {
    // do anything
})(this);

Changing Context with .call(), .apply() and .bind()

使用 .call(), .apply() 和 .bind() 改变上下文

Call and Apply functions are used to change the context while calling a function. This gives you incredible programming capabilities (and some ultimate powers to Rule The World). To use the call or apply function, you call it on the function, instead of invoking the function with a pair of parentheses, and pass the context as the first argument. The function’s own arguments can be passed after the context.

Call 和 Apply 函数来改变函数调用时的上下文。这带给你神奇的编程能力(和终极统治世界的能力)。你只需要使用 call 和 apply 函数并把上下文当做第一个参数传入,而不是使用括号来调用函数。函数自己的参数可以在上下文后面传入。

function hello() {
    // do something...
}

hello(); // the way you usually call it
hello.call(context); // here you can pass the context(value of this) as the first argument
hello.apply(context); // here you can pass the context(value of this) as the first argument

The difference between .call() and .apply() is that in Call, you pass the rest of the arguments as a list separated by a comma while apply allows you to pass the arguments in an array.

.call().apply()的区别是 Call 中其他参数用逗号分隔传入,而 Apply 允许你传入一个参数数组。

function introduce(name, interest) {
    console.log('Hi! I\'m '+ name +' and I like '+ interest +'.');
    console.log('The value of this is '+ this +'.')
}

introduce('Hammad', 'Coding'); // the way you usually call it
introduce.call(window, 'Batman', 'to save Gotham'); // pass the arguments one by one after the context
introduce.apply('Hi', ['Bruce Wayne', 'businesses']); // pass the arguments in an array after the context

// Output:
// Hi! I'm Hammad and I like Coding.
// The value of this is [object Window].
// Hi! I'm Batman and I like to save Gotham.
// The value of this is [object Window].
// Hi! I'm Bruce Wayne and I like businesses.
// The value of this is Hi.

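Call has often been measured as slightly faster than Apply, although the difference depends on the JavaScript engine and is rarely significant in practice.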

Call 比 Apply 的效率高一点。

The following example takes a list of items in the document and logs them to the console one by one.


<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <title>Things to learn</title>
</head>
<body>
    <h1>Things to Learn to Rule the World</h1>
    <ul>
        <li>Learn PHP</li>
        <li>Learn Laravel</li>
        <li>Learn JavaScript</li>
        <li>Learn VueJS</li>
        <li>Learn CLI</li>
        <li>Learn Git</li>
        <li>Learn Astral Projection</li>
    </ul>
    <script>
        // Saves a NodeList of all list items on the page in listItems
        var listItems = document.querySelectorAll('ul li');
        // Loops through each of the Node in the listItems NodeList and logs its content
        for (var i = 0; i < listItems.length; i++) {
          (function () {
            console.log(this.innerHTML);
          }).call(listItems[i]);
        }

        // Output logs:
        // Learn PHP
        // Learn Laravel
        // Learn JavaScript
        // Learn VueJS
        // Learn CLI
        // Learn Git
        // Learn Astral Projection
    </script>
</body>
</html>

The HTML only contains an unordered list of items. The JavaScript then selects all of them from the DOM. The list is looped over till the end of the items in the list. Inside the loop, we log the content of the list item to the console.

HTML文档中仅包含一个无序列表。JavaScript 从 DOM 中选取它们。列表项会被从头到尾循环一遍。在循环时,我们把列表项的内容输出到控制台。

This log statement is wrapped in a function, itself wrapped in parentheses, on which the call function is called. The corresponding list item is passed to the call function so that the this keyword in the console statement logs the innerHTML of the correct object.

输出语句包含在由括号包裹的函数中,然后调用call函数。相应的列表项传入 call 函数,确保控制台输出正确对象的 innerHTML。

Objects can have methods, and functions, being objects, can also have methods. In fact, a JavaScript function comes with four built-in methods, which are:

对象可以有方法,同样函数对象也可以有方法。事实上,JavaScript 函数有 4 个内置方法:

  • Function.prototype.apply()
  • Function.prototype.bind() (Introduced in ECMAScript 5 (ES5))
  • Function.prototype.call()
  • Function.prototype.toString()

Function.prototype.toString() returns a string representation of the source code of the function.


Till now, we have discussed .call(), .apply(), and toString(). Unlike Call and Apply, Bind doesn’t call the function itself; it can only be used to bind the value of context and other arguments before calling the function. Using Bind in one of the examples from above:

到现在为止,我们讨论了.call().apply()toString()。与 Call 和 Apply 不同,Bind 并不是自己调用函数,它只是在函数调用之前绑定上下文和其他参数。在上面提到的例子中使用 Bind:

(function introduce(name, interest) {
    console.log('Hi! I\'m '+ name +' and I like '+ interest +'.');
    console.log('The value of this is '+ this +'.')
}).bind(window, 'Hammad', 'Cosmology')();

// logs:
// Hi! I'm Hammad and I like Cosmology.
// The value of this is [object Window].

Bind is like the call function: it allows you to pass the rest of the arguments one by one, separated by commas, unlike apply, in which you pass the arguments in an array.

Bind 像call函数一样用逗号分隔其他传入参数,不像apply那样用数组传入参数。
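A small sketch contrasting the three methods side by side. The names speaker and describe are hypothetical, introduced only for this illustration. It shows that bind() returns a new function with the context and leading arguments fixed, and that the remaining arguments can be supplied later, whereas call() and apply() invoke the function immediately.

```javascript
// speaker and describe are hypothetical names for illustration.
var speaker = { title: 'Dr.' };

function describe(name, interest) {
    return this.title + ' ' + name + ' likes ' + interest + '.';
}

// bind() returns a new function; nothing is called yet.
var describeHammad = describe.bind(speaker, 'Hammad');

// The remaining argument is supplied when the bound function is called.
var bound = describeHammad('Cosmology');
console.log(bound); // logs 'Dr. Hammad likes Cosmology.'

// call() and apply() invoke immediately with the same context.
var called = describe.call(speaker, 'Bruce', 'Gotham');
var applied = describe.apply(speaker, ['Bruce', 'Gotham']);
console.log(called);  // logs 'Dr. Bruce likes Gotham.'
console.log(applied); // logs 'Dr. Bruce likes Gotham.'
```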



These concepts are fundamental to JavaScript and important to understand if you want to approach more advanced topics. I hope you got a better understanding of JavaScript Scope and things around it. If something just didn’t click, feel free to ask in the comments below.

这些概念是 JavaScript 的基础,如果你想钻研更深的话,理解这些很重要。我希望你对 JavaScript 作用域及相关概念有了更好地理解。如果有东西不清楚,可以在评论区提问。

Scope up your code and till then, Happy Coding!


Reddit 的愚人节项目 r/Place 是怎么做出来的

How We Built r/Place

Each year for April Fools’, rather than a prank, we like to create a project that explores the way that humans interact at large scales. This year we came up with Place, a collaborative canvas on which a single user could only place a single tile every five minutes. This limitation de-emphasized the importance of the individual and necessitated the collaboration of many users in order to achieve complex creations. Each tile placed was relayed to observers in real-time.

每年的愚人节,我们喜欢创建项目来探索人类大规模的交流互动,而不是做一些恶作剧。今年我们提出了 r/Place,这是一个协作的画板,每个用户每 5 分钟只能修改一个小块。这一限制弱化了个体的重要性,强化了大量用户协作完成复杂创作的必要性。每个小块的变化实时传递给观察者。

Multiple engineering teams (frontend, backend, mobile) worked on the project and most of it was built using existing technology at Reddit. This post details how we approached building Place from a technical perspective.

许多开发团队(前端、后端、移动端)协作开发这个项目,项目大部分基于 Reddit 已有的技术。这篇文章从技术角度详细描述我们如何完成 r/Place。

But first, if you want to check out the code for yourself, you can find it here. And if you’re interested in working on projects like Place in the future, we’re hiring!

且慢。如果你想查看我们的代码,在这里。如果你对构建 r/Place 这一类项目感兴趣,我们欢迎你



Defining requirements for an April Fools’ project is extremely important because it will launch with zero ramp-up and be available immediately to all of Reddit’s users. If it doesn’t work perfectly out of the gate, it’s unlikely to attract enough users to make for an interesting experience.

定义愚人节项目的需求十分重要,因为它一旦发布即面向所有 Reddit 用户,没有增长过程。如果它一开始并不能完美运作,似乎就不能吸引足够的用户来创作并获得有趣的体验。

  • The board must be 1000 tiles by 1000 tiles so it feels very large.
  • All clients must be kept in sync with the same view of the current board state, otherwise users with different versions of the board will have difficulty collaborating.
  • We should support at least 100,000 simultaneous users.
  • Users can place one tile every 5 minutes, so we must support an average update rate of 100,000 tiles per 5 minutes (333 updates/s).
  • The project must be designed in such a way that it’s unlikely to affect the rest of the site’s normal function even with very high traffic to r/Place.
  • The configuration must be flexible in case there are unexpected bottlenecks or failures. This means that board size and tile cooldown should be adjustable on the fly in case data sizes are too large or update rates are too high.
  • The API should be generally open and transparent so the reddit community can build on it (bots, extensions, data collection, external visualizations, etc) if they choose to do so.
  • 画板必须有 1000×1000 个小块,所以它会非常大。
  • 所有客户端必须和当前画板状态同步,并显示一致,否则用户基于不同版本的画板难以协作。
  • 我们必须支持至少 100000 的并发同步用户。
  • 用户每 5 分钟可以修改一个小块,所以我们必须支持平均每 5 分钟 100000 个小块的更新(每秒 333 个更新)。
  • 项目的设计必须遵循这一点,即使 r/Place 流量巨大,也不能影响站点其他功能。
  • 配置必须有足够弹性,应对意外的瓶颈或故障。这意味着画板的大小和小块的使用间隔可以在运行时调节,以防数据量过大或更新过于频繁。
  • API 必须开放和透明,reddit 社区如果对此有兴趣,可以在此之上构建项目(机器人、扩展、数据收集、外部可视化等等)。



Implementation decisions


The main challenge for the backend was keeping all the clients in sync with the state of the board. Our solution was to initialize the client state by having it listen for real-time tile placements immediately and then make a request for the full board. The full board in the response could be a few seconds stale as long as we also had real-time placements starting from before it was generated. When the client received the full board it replayed all the real-time placements it received while waiting. All subsequent tile placements could be drawn to the board immediately as they were received.
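The initialization scheme described above can be sketched as follows. This is an illustration only, not Reddit's actual client code: createBoardClient, onPlacement, and onFullBoard are hypothetical names, and the board is modeled as a plain nested array rather than a canvas.

```javascript
// Hypothetical client-side state; none of these names come from Reddit's code.
function createBoardClient() {
    var board = null; // full board, unknown until the snapshot arrives
    var pending = []; // placements received while waiting for the snapshot

    return {
        // Called for every real-time placement message.
        onPlacement: function (p) {
            if (board === null) {
                pending.push(p);           // buffer until the snapshot arrives
            } else {
                board[p.y][p.x] = p.color; // draw immediately
            }
        },
        // Called once with the (possibly stale) full board.
        onFullBoard: function (snapshot) {
            board = snapshot;
            // Replay buffered placements in arrival order on top of the snapshot.
            pending.forEach(function (p) { board[p.y][p.x] = p.color; });
            pending = [];
        },
        getBoard: function () { return board; }
    };
}

var client = createBoardClient();
client.onPlacement({ x: 1, y: 0, color: 5 });   // arrives before the snapshot
client.onFullBoard([[0, 0], [0, 0]]);           // stale snapshot without that tile
client.onPlacement({ x: 0, y: 1, color: 9 });   // arrives after the snapshot
console.log(JSON.stringify(client.getBoard())); // logs '[[0,5],[9,0]]'
```

Because the buffered placements are replayed in the order they arrived, a tile that was changed several times while the snapshot was in flight still ends up with its latest color.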


For this scheme to work we needed the request for the full state of the board to be as fast as possible. Our initial approach was to store the full board in a single row in Cassandra and each request for the full board would read that entire row. The format for each column in the row was:


(x, y): {'timestamp': epochms, 'author': user_name, 'color': color}

Because the board contained 1 million tiles this meant that we had to read a row with 1 million columns. On our production cluster this read took up to 30 seconds, which was unacceptably slow and could have put excessive strain on Cassandra.

因为画板包含一百万个小块,这意味着我们不得不读取有一百万列的行。在我们的生产集群上这种读取花费 30 秒,慢到无法接受,所以我们不能过度依赖 Cassandra。

Our next approach was to store the full board in redis. We used a bitfield of 1 million 4 bit integers. Each 4 bit integer was able to encode a 4 bit color, and the x,y coordinates were determined by the offset (offset = x + 1000y) within the bitfield. We could read the entire board state by reading the entire bitfield. We were able to update individual tiles by updating the value of the bitfield at a specific offset (no need for locking or read/modify/write). We still needed to store the full details in Cassandra so that users could inspect individual tiles to see who placed them and when. We also planned on using Cassandra to restore the board in case of a redis failure. Reading the entire board from redis took less than 100ms, which was fast enough.

我们下一个方案使用 redis 储存整个画板。我们使用 bitfield 处理一百万个 4 位的整型。每个 4 位的整型可以编码 4 位的颜色,横纵(x,y)坐标可以在 bitfield 里用偏移量表示(offset = x + 1000y)。我们可以通过读取整个 bitfield 来获取整个画板的状态。我们可以通过在 bitfield 中更新指定偏移量上的值,来更新单独的小块(不再需要加锁或读/改/写)。我们仍然需要在 Cassandra 中储存所有的细节,让用户可以检查单独的小块,看一看何时何人更改了它。我们也计划用 Cassandra 备份整个画板,以防 redis 失效。从 redis 中读取整个画板不超过 100ms,这已经足够快了。
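To make the layout concrete, here is a sketch that mimics the packed board with an in-memory Uint8Array instead of redis. The helper names (tileOffset, createBoard, setTile, getTile) are hypothetical, and placing even offsets in the high nibble of each byte is an assumption, chosen to match the high/low split described for the client later on.

```javascript
var BOARD_WIDTH = 1000; // board width from the article

// offset = x + 1000y, as described above
function tileOffset(x, y) {
    return x + BOARD_WIDTH * y;
}

// In-memory stand-in for the bitfield: two 4-bit colors per byte.
function createBoard(width, height) {
    return new Uint8Array(Math.ceil(width * height / 2));
}

function setTile(board, offset, color) {
    var index = offset >> 1; // which byte holds this tile
    if (offset % 2 === 0) {
        // even offsets live in the high nibble (assumed ordering)
        board[index] = (board[index] & 0x0F) | ((color & 0x0F) << 4);
    } else {
        // odd offsets live in the low nibble
        board[index] = (board[index] & 0xF0) | (color & 0x0F);
    }
}

function getTile(board, offset) {
    var byte = board[offset >> 1];
    return offset % 2 === 0 ? byte >> 4 : byte & 0x0F;
}

var board = createBoard(1000, 1000);
console.log(board.length); // logs 500000

setTile(board, tileOffset(2, 1), 14);
setTile(board, tileOffset(3, 1), 9);
console.log(getTile(board, tileOffset(2, 1)));          // logs 14
console.log(getTile(board, tileOffset(3, 1)));          // logs 9
console.log(board[tileOffset(2, 1) >> 1].toString(16)); // logs 'e9'
```

Note that this stand-in has to read, modify, and write a whole byte, which is exactly the step the article says the redis bitfield made unnecessary for the application.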

Illustration showing how colors were stored in redis, using a 2×2 board:

插图展示了我们如何用 redis 储存 2×2 画板的颜色:

We were concerned about exceeding maximum read bandwidth on redis. If many clients connected or refreshed at once they would simultaneously request the full state of the board, all triggering reads from redis. Because the board was a shared global state the obvious solution was to use caching. We decided to cache at the CDN (Fastly) layer because it was simple to implement and it meant the cache was as close to clients as possible which would help response speed. Requests for the full state of the board were cached by Fastly with an expiration of 1 second. We also added the stale-while-revalidate cache control header option to prevent more requests from falling through than we wanted when the cached board expired. Fastly maintains around 33 POPs which do independent caching, so we expected to get at most 33 requests per second for the full board.

我们非常关心 redis 读取最大带宽。如果很多客户端同时链接或刷新,它们会同时请求整个画板的状态,全部都触发 redis 的读取操作。因为画板是全局共享状态,显而易见的解决方案是使用缓存。我们决定在 CDN 层(Fastly)使用缓存,因为实现简单,并且缓存离客户端更近可以提高响应速度。对整个画板的请求被 Fastly 缓存下来并设置 1 秒的超时时间。我们也添加了 stale-while-revalidate 这个控制缓存的头信息,来应对画板缓存过期导致超过预期的大量请求。Fastly 维护着大约 33 处独立缓存 POPs(接入点),所以我们预期每秒最多有 33 个针对整个画板的请求。

We used our websocket service to publish updates to all the clients. We’ve had success using it in production for reddit live threads with over 100,000 simultaneous viewers, live PM notifications, and other features. The websocket service has also been a cornerstone of our past April Fools projects such as The Button and Robin. For r/Place, clients maintained a websocket connection to receive real-time tile placement updates.

我们使用我们的 websocket 服务 向所有客户端推送更新。我们已经成功地在 reddit live 生产环境中应用过它,来处理超过 100000 的并发用户,比如 live PM notifications 功能或其他特性。wesocket 服务也曾是我们过去愚人节项目的基础,比如 The ButtonRobin 两个项目。对于 r/Place 项目,客户端维护一个 websocket 链接来接收实时的小块变化更新。



Retrieve the full board

Requests first went to Fastly. If there was an unexpired copy of the board it would be returned immediately without hitting the reddit application servers. Otherwise, if there was a cache miss or the copy was too old, the reddit application would read the full board from redis and return that to Fastly to be cached and returned to the client.

请求首先到达 Fastly。如果那里有一份未过期的画板副本,它会立刻返回从而不需要访问 reddit 应用服务器。否则如果缓存未命中或副本过时,reddit 应用会从 redis 中读取整个画板然后返回到 Fastly 中并缓存,并返回给客户端。

Request rate and response time as measured by the reddit application:

reddit 应用测量的请求速率和响应时间:

Notice that the request rate never exceeds 33/s, meaning that the caching by Fastly was very effective at preventing most requests from hitting the reddit application.

注意,请求速率从没超过 33 个/秒,说明 Fastly 缓存非常给力,阻止了大量直接访问 reddit 应用的请求。

When a request did hit the reddit application the read from redis was very fast.

当请求访问 reddit 应用时,redis 的读取操作非常迅速。

Draw a tile

The steps for drawing a tile were:


  1. Read the timestamp of the user’s last tile placement from Cassandra. If it was more recent than the cooldown period (5 minutes) reject the draw attempt and return an error to the user.
  2. Write the tile details to redis and Cassandra.
  3. Write the current timestamp as the user’s last tile placement in Cassandra.
  4. Tell the websocket service to send a message to all connected clients with the new tile.

  1. 从 Cassandra 读取用户上一次更改小块的时间戳。如果和当前时间间隔比冷却时间(5 分钟)短,拒绝绘制请求,返回给用户一个错误。
  2. 向 redis 和 Cassandra 写入小块详情。
  3. 向 Cassandra 写入用户上一次修改小块的时间戳。
  4. 让 websocket 服务向所有链接的客户端发送新的小块。

All reads and writes to Cassandra were done with consistency level QUORUM to ensure strong consistency.

Cassandra 的所有读写操作的一致性设置为 QUORUM 级别,来确保强一致性。

We actually had a race condition here that allowed users to place multiple tiles at once. There was no locking around steps 1-3, so simultaneous tile draw attempts could all pass the check at step 1 and then draw multiple tiles at step 2. It seems that some users discovered this error or had bots that didn’t gracefully follow the rate limits, so about 15,000 tiles were drawn by abusing this error (~0.09% of all tiles placed).

我们当然也有竞态条件允许用户一次更改多个小块。在步骤 1-3 中并没有锁,因此批量小块修改的操作通过步骤 1 的检查之后将在步骤 2 中进行修改。看起来一些用户发现了这个漏洞或一些机器脚本不遵守速率限制,所以大概有 15000 个小块被利用这个漏洞进行更改(占全部更改小块的 0.09%)
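The race can be reproduced in a few lines. This sketch uses plain in-memory objects as stand-ins for Cassandra and redis, and the names (canPlace, writeTile, the user 'alice') are hypothetical. The two requests are interleaved by hand so that both run the step 1 check before either reaches steps 2 and 3.

```javascript
var COOLDOWN_MS = 5 * 60 * 1000; // 5 minute cooldown from the article
var lastPlacement = {};          // stand-in for the per-user timestamp store
var tiles = [];                  // stand-in for the tile store

// Step 1: cooldown check.
function canPlace(user, now) {
    var last = lastPlacement[user];
    return last === undefined || now - last >= COOLDOWN_MS;
}

// Steps 2 and 3: write the tile, then record the timestamp.
function writeTile(user, tile, now) {
    tiles.push(tile);
    lastPlacement[user] = now;
}

var now = 1000000;

// Two requests from the same user interleave: both run step 1 first...
var firstAllowed = canPlace('alice', now);
var secondAllowed = canPlace('alice', now);

// ...and only then run steps 2-3, so both writes go through.
if (firstAllowed) writeTile('alice', { x: 1, y: 1, color: 3 }, now);
if (secondAllowed) writeTile('alice', { x: 2, y: 1, color: 3 }, now);

console.log(tiles.length);                  // logs 2
console.log(canPlace('alice', now + 1000)); // logs false
```

Once the timestamp is written the cooldown works as intended; the gap exists only between the check and the write, which is why a lock or an atomic check-and-set around steps 1-3 would close it.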

Request rate and response time as measured by the reddit application:

reddit 应用测量的请求速率和响应时间:

We experienced a maximum tile placement rate of almost 200/s. This was below our calculated maximum rate of 333/s (average of 100,000 users placing a tile every 5 minutes).

我们经历了更改小块最大速率大概 200/s。这比我们估算的最大速率 333/s 要低(平均每 5 分钟 100000 个用户更改小块)。

Get details of a single tile

Requests for individual tiles resulted in a read straight from Cassandra.

直接从 Cassandra 请求单个小块。

Request rate and response time as measured by the reddit application:

reddit 应用测量的请求速率和响应时间:

This endpoint was very popular. In addition to regular client requests, people wrote scrapers to retrieve the entire board one tile at a time. Since this endpoint wasn’t cached by the CDN, all requests ended up being served by the reddit application.

这个服务端点用的很多。除了客户端频繁的请求之外,有人编写抓取工具每次检索整个画板的一个小块。由于这个服务端点没有在 CDN 缓存,所有请求被 reddit 应用程序处理。

Response times for these requests were pretty fast and stable throughout the project.




We don’t have isolated metrics for r/Place’s effect on the websocket service, but we can estimate and subtract the baseline use from the values before the project started and after it ended.

我们并没有在 websocket 服务中为 r/Place 做单独指标,但是我们可以估计并减去项目开始前后的基本使用量。

Total connections to the websocket service:

websocket 服务总连接数:

The baseline before r/Place began was around 20,000 connections and it peaked at 100,000 connections, so we probably had around 80,000 users connected to r/Place at its peak.

r/Place 开始前的基本使用量大概有 20000 个连接,而峰值 100000 个链接,所以高峰期我们大概有 80000 个用户连接到 r/Place。

Websocket service bandwidth:

Websocket 服务带宽:

At the peak of r/Place the websocket service was transmitting over 4 Gbps (150 Mbps per instance and 24 instances).

高峰期 r/Place 的 websocket 服务吞吐量超过 4gbps(24个实例,每个 150 Mbps)

Frontend: Web and Mobile Clients


Building the frontend for Place involved many of the challenges for cross-platform app development. We wanted Place to be a seamless experience on all of our major platforms including desktop web, mobile web, iOS and Android.

构建 r/Place 的前端工程涉及到了跨平台开发的众多挑战。我们期望 r/Place 在我们所有主流平台上拥有无缝体验,包括桌面web、移动web、iOS 和 Android。

The UI for Place needed to do three important things:

r/Place 的 UI 需要做三件很重要的事:

  1. Display the state of the board in real time
  2. Facilitate user interaction with the board
  3. Work on all of our platforms, including our mobile apps

  1. 实时展示画板状态。
  2. 让用户和画板交互方便容易
  3. 在我们所有平台上正常运行,包括移动端 app。

The main focus of the UI was the canvas, and the Canvas API was a perfect fit for it. We used a single 1000 x 1000 <canvas> element, drawing each tile as a single pixel.

UI 的主要焦点集中在了 canvas,并且 Canvas API 完全能胜任要求。我们使用一个 1000 x 1000 的 <canvas> 元素,把每个小块当做一个像素进行绘制。

Drawing the canvas

绘制 canvas

The canvas needed to represent the state of the board in real time. We needed to draw the state of the entire board when the page loaded, and draw updates to the board state that came in over websockets. There are generally three ways to go about updating a canvas element using the CanvasRenderingContext2D interface:

canvas 需要实时展示整个画板的状态。我们需要在页面载入的时候绘制整个画板的状态,然后更新通过 websocket 传输过来的画板状态。通过 CanvasRenderingContext2D 接口,有三种方式更新 canvas 元素。

  1. Draw an existing image onto the canvas using drawImage()
  2. Draw shapes with the various shape drawing methods, e.g. using fillRect() to fill a rectangle with a color
  3. Construct an ImageData object and paint it into the canvas using putImageData()

  1. drawImage() 将一个存在的图像绘制进 canvas。
  2. 通过众多图形绘制的方法来绘制各种形状,比如用 fillRect() 绘制一个有颜色的矩形。
  3. 构造一个 ImageData 对象,然后用 putImageData() 方法将它绘制进 canvas。

The first option wouldn’t work for us since we didn’t already have the board in image form, leaving options 2 and 3. Updating individual tiles using fillRect() was very straightforward: when a websocket update comes in, just draw a 1 x 1 rectangle at the (x, y) position. This worked OK in general, but wasn’t great for drawing the initial state of the board. The putImageData() method was a much better fit for this, since we were able to define the color of each pixel in a single ImageData object and draw the whole canvas at once.

第一种选项并不适合我们,因为我们并没有画板的图像格式,还剩下 2、3 选项。用fillRect()方法更新单独的小块非常简洁:当 websocket 通知更新时,只需要在(x,y)位置处绘制一个 1 x 1 的矩形。一般来说这很棒,但并不适合绘制画板的初始状态。putImageData()方法显然更合适,我们可以在 ImageData 对象中定义每个像素的颜色,然后一次性绘制整个 canvas。

Drawing the initial state of the board

Using putImageData() requires defining the board state as a Uint8ClampedArray, where each value is an 8-bit unsigned integer clamped to 0-255. Each value represents a single color channel (red, green, blue, and alpha), and each pixel requires 4 items in the array. A 2 x 2 canvas would require a 16-byte array, with the first 4 bytes representing the top left pixel on the canvas, and the last 4 bytes representing the bottom right pixel.

我们使用putImageData()方法,前提需要将画板状态定义成 Uint8ClampedArray 形式,每个值用 8 位无符号整型表示 0-255 之间的数字。每一个值表示单个颜色通道(红、绿、蓝、alpha),每个像素需要 4 个值组成的数组。一个 2 x 2 的 canvas 需要一个 16 字节的数组,前 4 字节表示 canvas 左上角的像素,最后 4 字节表示右下角像素。
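A short sketch of that layout for the 2 x 2 case. The setPixel helper is hypothetical; the putImageData() call itself is left as a comment so the snippet also runs outside a browser.

```javascript
var width = 2;
var height = 2;

// 4 bytes (R, G, B, A) per pixel, so a 2 x 2 canvas needs 16 bytes.
var pixels = new Uint8ClampedArray(width * height * 4);
console.log(pixels.length); // logs 16

// Hypothetical helper: writes one pixel's four channel bytes.
function setPixel(data, w, x, y, r, g, b, a) {
    var i = (y * w + x) * 4; // index of the pixel's first byte
    data[i] = r;
    data[i + 1] = g;
    data[i + 2] = b;
    data[i + 3] = a;
}

setPixel(pixels, width, 0, 0, 255, 0, 0, 255); // top left: opaque red
setPixel(pixels, width, 1, 1, 0, 0, 300, 255); // bottom right: 300 is clamped to 255

console.log(Array.from(pixels.subarray(0, 4)).join(','));   // logs '255,0,0,255'
console.log(Array.from(pixels.subarray(12, 16)).join(',')); // logs '0,0,255,255'

// In a browser, with ctx being the canvas 2D context:
// ctx.putImageData(new ImageData(pixels, width, height), 0, 0);
```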

Illustration showing how canvas pixels relate to their Uint8ClampedArray representation:

插图展示了 canvas 像素和 Uint8ClampedArray 映射关系:

For Place’s canvas, the array is 4 million bytes long, or 4 MB.

对于 r/Place 的 canvas,数组大小是四百万字节,也就是 4MB。

On the backend, the board state is stored as a 4-bit bitfield. Each color is represented by a number between 0 and 15, allowing us to pack 2 pixels of color information into each byte. In order to use this on the client, we needed to do 3 things:

在后端,画板状态储存格式是 4 位的 bitfield。每个颜色用 0 到 15 之间的数字表示,这允许我们将 2 像素的颜色信息打包进 1 个字节(1字节=8位)。为了在客户端配合使用,我们需要做 3 件事:

  1. Pull the binary data down to the client from our API
  2. “Unpack” the data
  3. Map the 4-bit colors to useable 32-bit colors

  1. 将二进制数据从我们的 API 拉取到客户端。
  2. “解压”数据
  3. 将 4 位颜色映射成可用的 32 位颜色。

To pull down the binary data, we used the Fetch API in browsers that support it. For those that don’t, we fell back to a normal XMLHttpRequest with responseType set to “arraybuffer”.

为了拉取二进制数据,我们在支持 Fetch API 的浏览器中使用此 API。在不支持的浏览器中,我们使用 XMLHttpRequest,并把 responseType 设置为 “arraybuffer”

The binary data we receive from the API contains 2 pixels of color data in each byte. The smallest TypedArray constructors we have allow us to work with binary data in 1-byte units. This is inconvenient for use on the client, so the first thing we do is to “unpack” that data so it’s easier to work with. This process is straightforward: we just iterate over the packed data and split out the high- and low-order bits, copying them into separate bytes of another array. Finally, the 4-bit color values needed to be mapped to useable 32-bit colors.

我们从 API 接收到的二进制数据中,每个字节有 2 像素的颜色数据。TypedArray 的构造函数允许操作的最小单位是 1 字节。这在客户端上并不方便使用,所以我们做的第一件事就是“解压”,让数据更容易处理。方式很简洁,我们遍历打包的数据并按照高位低位分割比特位,将它们复制到另一个数组的不同字节中。最后,4 位的颜色值映射成可用的 32 位颜色。

API Response: 0x47 0xE9
Unpacked: 0x04 0x07 0x0E 0x09
Mapped to 32-bit colors: 0xFFA7D1FF 0xA06A42FF 0xCF6EE4FF 0x94E044FF
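The unpack-and-map steps for the two sample bytes above can be sketched like this. The unpack function name is hypothetical, and the palette is deliberately partial: it holds only the four colors from the table, with the index-to-color pairing inferred from the order in which they appear there.

```javascript
// Partial palette: only the four indices that appear in the table above.
// The index-to-color pairing is inferred from that table, not confirmed.
var palette = {
    0x4: 0xFFA7D1FF,
    0x7: 0xA06A42FF,
    0xE: 0xCF6EE4FF,
    0x9: 0x94E044FF
};

// Split each packed byte into two 4-bit color indices.
function unpack(packed) {
    var unpacked = new Uint8Array(packed.length * 2);
    for (var i = 0; i < packed.length; i++) {
        unpacked[2 * i] = packed[i] >> 4;       // high-order 4 bits
        unpacked[2 * i + 1] = packed[i] & 0x0F; // low-order 4 bits
    }
    return unpacked;
}

var unpacked = unpack(new Uint8Array([0x47, 0xE9]));
console.log(Array.from(unpacked).join(',')); // logs '4,7,14,9'

// Map each 4-bit index to its 32-bit RGBA color, shown as hex.
var colors = Array.from(unpacked, function (c) {
    return palette[c].toString(16).toUpperCase();
});
console.log(colors.join(' ')); // logs 'FFA7D1FF A06A42FF CF6EE4FF 94E044FF'
```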

The ImageData structure needed to use the putImageData() method requires the end result to be readable as a Uint8ClampedArray with the color channel bytes in RGBA order. This meant we needed to do another round of “unpacking”, splitting each color into its component channel bytes and putting them into the correct index. Needing to do 4 writes per pixel was also inconvenient, but luckily there was another option.

ImageData这种数据结构需要使用putImageData方法,最终结果要求是可读的Uint8ClampedArray格式并且颜色通道字节要按照 RGBA 这种顺序。这意味着我们要做另一遍“解压”,将每个颜色拆分成颜色通道字节并按顺序排列。每个像素要做 4 次操作,这不是很方便,但幸运的是有其他方式。

TypedArray objects are essentially array views into ArrayBuffer instances, which actually represent the binary data. One neat thing about them is that multiple TypedArray instances can read and write to the same underlying ArrayBuffer instance. Instead of writing 4 values into an 8-bit array, we could write a single value into a 32-bit array! Using a Uint32Array to write, we were able to easily update a tile’s color by updating a single array index. The only change required was that we had to store our color palette in reverse-byte order (ABGR) so that the bytes automatically fell in the correct position when read using the Uint8ClampedArray.

TypeArray对象们本质上是ArrayBuffer的数组视图,实际上表示二进制数据。它们共同的一点就是多个TypeArray实例可以基于一个ArrayBuffer实例进行读写。我们不必将 4 个值写入 8 位的数组,我们可以直接把单个值写入一个 32 位的数组。使用Uint32Array写入值,我们可以通过更新数组单个索引来轻松更新单个小块颜色。我们唯一需要做的就是把我们的颜色字节逆序储存(ABGR),这样一来使用Uint8ClampedArray读取数据时可以自动把字节填入正确位置。
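A sketch of the shared-buffer trick. The variable names and the toABGR helper are hypothetical. The reversed byte order only lines up on little-endian hardware, which is an assumption about the platform, so the snippet detects the byte order at runtime instead of taking it for granted.

```javascript
var buffer = new ArrayBuffer(2 * 2 * 4);   // 2 x 2 pixels, 4 bytes each
var bytes = new Uint8ClampedArray(buffer); // view that ImageData would read
var words = new Uint32Array(buffer);       // view used for writing

// Detect byte order; the reversed (ABGR) palette assumes little-endian.
var littleEndian = new Uint8Array(new Uint32Array([1]).buffer)[0] === 1;

// Convert 0xRRGGBBAA to 0xAABBGGRR.
function toABGR(rgba) {
    return (((rgba & 0xFF) << 24) |
            (((rgba >>> 8) & 0xFF) << 16) |
            (((rgba >>> 16) & 0xFF) << 8) |
            (rgba >>> 24)) >>> 0;
}

var pink = 0xFFA7D1FF; // RGBA value taken from the table earlier

// One 32-bit write sets all four channel bytes of the bottom-right pixel.
words[3] = littleEndian ? toABGR(pink) : pink;

console.log(toABGR(pink).toString(16));                    // logs 'ffd1a7ff'
console.log(Array.from(bytes.subarray(12, 16)).join(',')); // logs '255,167,209,255'
```

Storing the palette already reversed, as the article describes, avoids calling a conversion like toABGR on every update.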


Handling websocket updates
处理 websocket 更新

Using the fillRect() method was working OK for drawing individual pixel updates as they came in, but it had one major drawback: large bursts of updates coming in at the same time could cripple browser performance. We knew that updates to the board state would be very frequent, so we needed to address this issue.


Instead of redrawing the canvas immediately each time a websocket update came in, we wanted to be able to batch multiple websocket updates that come in around the same time and draw them all at once. We made two changes to do this:

我们希望在一个时间点前后的 websocket 更新能够批量绘制一次,而不是每次 websocket 更新来到就立刻重新绘制 canvas。我们做了以下两点改变:

  1. We stopped using fillRect() altogether, since we’d already figured out a nice convenient way of updating many pixels at once with putImageData()
  2. We moved the actual canvas drawing into a requestAnimationFrame loop

  1. 因为我们发现了使用putImageData()一次更新多个像素这条明路,所以我们不再使用fillRect()
  2. 我们把绘制 canvas 操作放到requestAnimationFrame循环中。

By moving the drawing into an animation loop, we were able to write websocket updates to the ArrayBuffer immediately and defer the actual drawing. All websocket updates in between frames (about 16ms) were batched into a single draw. Because we used requestAnimationFrame, this also meant that if draws took too long (longer than 16ms), only the refresh rate of the canvas would be affected (rather than crippling the entire browser).

把绘制移到动作循环中,我们可以及时将 websocket 更新写入ArrayBuffer,然后延迟绘制。每一帧(大概 16ms)间的 websocket 更新会再一次绘制中批量执行。因为我们使用requestAnimationFrame,这意味着每次绘制时间不能太长(不超过 16ms),只有 canvas 的刷新速率受影响(而不是拖慢整个浏览器)。
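The batching idea can be sketched as below. This is an illustration under stated assumptions, not the actual Place client: createBatchedRenderer and the dirty flag are hypothetical, and the scheduler is passed in as a parameter so the snippet can run outside a browser. In a browser, schedule would be window.requestAnimationFrame and draw would call putImageData().

```javascript
// Hypothetical helper; `schedule` would be window.requestAnimationFrame in a browser.
function createBatchedRenderer(pixels, width, draw, schedule) {
    var dirty = false; // set when the buffer changed since the last draw

    function frame() {
        if (dirty) {
            draw(pixels); // e.g. ctx.putImageData(...) in a browser
            dirty = false;
        }
        schedule(frame);  // keep the loop running
    }
    schedule(frame);

    return {
        // Websocket handler: write to the buffer now, draw later.
        onUpdate: function (x, y, color) {
            pixels[y * width + x] = color;
            dirty = true;
        }
    };
}

// Manual scheduler so the sketch runs without a browser.
var queued = null;
function fakeSchedule(cb) { queued = cb; }
function runFrame() { var cb = queued; queued = null; cb(); }

var drawCount = 0;
var renderer = createBatchedRenderer(
    new Uint32Array(4), 2,
    function () { drawCount++; },
    fakeSchedule
);

renderer.onUpdate(0, 0, 1);
renderer.onUpdate(1, 0, 2);
renderer.onUpdate(1, 1, 3);
runFrame(); // three updates, one draw
runFrame(); // nothing changed, no draw
console.log(drawCount); // logs 1
```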

Interacting with the Canvas

Canvas 的交互

Equally importantly, the canvas needed to facilitate user interaction. The core way that users can interact with the canvas is to place tiles on it. Precisely drawing individual pixels at 100% scale would be extremely painful and error prone, so we also needed to be able to zoom in (a lot!). We also needed to be able to pan around the canvas easily, since it was too large to fit on most screens (especially when zoomed in).

还有非常重要的一点,canvas 需要方便用户的交互。用户与 canvas 核心交互方式是更改上面的小块。在 100% 缩放下,精确地选择绘制单个像素很不方便,而且容易出错。所以我们需要放大显示(放大很多)。我们也需要方便的平移 canvas,因为在多数浏览器上它太大了(尤其是放大后)。

Camera zoom

Users were only allowed to draw tiles once every 5 minutes, so misplaced tiles would be especially painful. We had to zoom in on the canvas enough that each tile would be a fairly large target for drawing. This was especially important for touch devices. We used a 40x scale for this, giving each tile a 40 x 40 target area. To apply the zoom, we wrapped the <canvas> element in a <div> that we applied a CSS transform: scale(40, 40) to. This worked great for placing tiles, but wasn’t ideal for viewing the board (especially on small screens), so we made this toggleable between two zoom levels: 40x for drawing, 4x for viewing.


Using CSS to scale up the canvas made it easy to keep the code that handled drawing the board separate from the code that handled scaling, but unfortunately this approach had some issues. When scaling up an image (or canvas), browsers default to algorithms that apply “smoothing” to the image. This works OK in some cases, but it completely ruins pixel art by turning it into a blurry mess. The good news is that there’s another CSS property, image-rendering, which allows us to ask browsers not to do that. The bad news is that not all browsers fully support that property.


Bad news blurs:

We needed another way to scale up the canvas for these browsers. I mentioned earlier on that there are generally three ways to go about drawing to a canvas. The first method, drawImage(), supports drawing an existing image or another canvas into a canvas. It also supports scaling that image up or down when drawing it, and though upscaling has the same blurring issue by default that upscaling in CSS has, this can be disabled in a more cross-browser compatible way by turning off the CanvasRenderingContext2D.imageSmoothingEnabled flag.


So the fix for our blurry canvas problem was to add another step to the rendering process. We introduced another <canvas> element, this one sized and positioned to fit across the container element (i.e. the viewable area of the board). After redrawing the canvas, we use drawImage() to draw the visible portion of it into this new display canvas at the proper scale. Since this extra step adds a little overhead to the rendering process, we only did this for browsers that don’t support the CSS image-rendering property.
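The extra rendering step could look roughly like the sketch below. The function name `drawVisibleRegion` and the `camera` convention (the board-space coordinate of the top-left visible tile) are assumptions made for illustration. Only `drawImage()` and `imageSmoothingEnabled` come from the description above.

```javascript
// Sketch of the fallback path: copy the visible part of the board canvas into
// a display canvas at the current zoom, with image smoothing turned off.
// `camera` is assumed to be the board coordinate of the top-left visible tile.
function drawVisibleRegion(displayCtx, boardCanvas, camera, zoom) {
  const { width, height } = displayCtx.canvas;

  // How many board pixels fit into the display canvas at this zoom level.
  const srcW = width / zoom;
  const srcH = height / zoom;

  displayCtx.imageSmoothingEnabled = false; // keep hard pixel edges when upscaling
  displayCtx.clearRect(0, 0, width, height);
  displayCtx.drawImage(
    boardCanvas,
    camera.x, camera.y, srcW, srcH, // source rectangle on the board
    0, 0, width, height             // destination: the whole display canvas
  );
}
```

Because the function only needs a 2D context and a source canvas, it can be called from the same animation loop right after the board itself is redrawn.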


Camera pan

The canvas is a fairly big image, especially when zoomed in, so we needed to provide ways of navigating it. To adjust the position of the canvas on the screen, we took a similar approach to what we did with scaling: we wrapped the <canvas> element in another <div> that we applied CSS transform: translate(x, y) to. Using a separate div made it easy to control the order that these transforms were applied to the canvas, which was important for preventing the camera from moving when toggling the zoom level.


We ended up supporting a variety of ways to adjust the camera position, including:


  • Click and drag
  • Click to move
  • Keyboard navigation

Each of these methods required a slightly different approach.



The primary way of navigating was click-and-drag (or touch-and-drag). We stored the x, y position of the mousedown event. On each mousemove event, we found the offset of the mouse position relative to that start position, then added that offset to the existing saved canvas offset. The camera position was updated immediately so that this form of navigation felt really responsive.
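The drag bookkeeping described here can be sketched as a small tracker object. `createDragTracker` and its method names are hypothetical. The sketch assumes the camera offset is a plain `{x, y}` pair in screen pixels.

```javascript
// Sketch of click-and-drag panning. Names are illustrative assumptions.
function createDragTracker(initialOffset) {
  let offset = { ...initialOffset }; // saved canvas offset
  let start = null;                  // mouse position at mousedown
  let base = null;                   // canvas offset at mousedown

  return {
    // mousedown / touchstart: remember where the drag began.
    down(x, y) {
      start = { x, y };
      base = { ...offset };
    },
    // mousemove / touchmove: offset = saved offset + distance from the start point.
    move(x, y) {
      if (start) {
        offset = { x: base.x + (x - start.x), y: base.y + (y - start.y) };
      }
      return offset; // apply immediately, e.g. via transform: translate(x, y)
    },
    // mouseup / touchend: the current offset becomes the new saved offset.
    up() {
      start = null;
      return offset;
    },
  };
}
```

Each `move()` result is applied straight away with no easing, which is what keeps this form of navigation feeling responsive.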



We also allowed clicking on a tile to center that tile on the screen. To accomplish this, we had to keep track of the distance moved between the mousedown and mouseup events, in order to distinguish “clicks” from “drags”. If the mouse did not move enough to be considered a “drag”, we adjusted the camera position by the difference between the mouse position and the point at the center of the screen. Unlike click-and-drag movement, the camera position was updated with an easing function applied. Instead of setting the new position immediately, we saved it as a “target” position. Inside the animation loop (the same one used to redraw the canvas), we moved the current camera position closer to the target using an easing function. This prevented the camera move from feeling too jarring.
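A rough sketch of the click-to-center logic follows. The 5-pixel drag threshold, the easing factor of 0.2 and every function name are assumptions, since the post does not give the actual values. The camera is again treated as a screen-space translate offset.

```javascript
// Sketch of click-to-center with easing. Threshold and factor are assumed values.
const DRAG_THRESHOLD = 5; // pixels of movement before a gesture counts as a drag

// A gesture is a "click" if the mouse barely moved between mousedown and mouseup.
function isClick(down, up, threshold = DRAG_THRESHOLD) {
  return Math.hypot(up.x - down.x, up.y - down.y) < threshold;
}

// Shift the camera by the difference between the screen center and the click,
// so that the clicked point ends up in the middle of the screen.
function targetForClick(camera, click, screen) {
  return {
    x: camera.x + (screen.width / 2 - click.x),
    y: camera.y + (screen.height / 2 - click.y),
  };
}

// Called once per animation frame: move a fixed fraction of the remaining
// distance, which starts fast and slows down near the target (ease-out).
function easeToward(current, target, factor = 0.2) {
  return {
    x: current.x + (target.x - current.x) * factor,
    y: current.y + (target.y - current.y) * factor,
  };
}
```

On `mouseup`, a click stores `targetForClick(...)` as the target position, and the animation loop then calls `easeToward(camera, target)` every frame until the two coincide.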


Keyboard navigation

We also supported navigating with the keyboard, using either the WASD keys or the arrow keys. The four direction keys controlled an internal movement vector. This vector defaulted to (0, 0) when no movement keys were down, and each of the direction keys added or subtracted 1 from either the x or y component of the vector when pressed. For example, pressing the “right” and “up” keys would set the movement vector to (1, -1). This movement vector was then used inside the animation loop to move the camera.


During the animation loop, a movement speed was calculated based on the current zoom level using the formula:


movementSpeed = maxZoom / currentZoom * speedMultiplier

This made keyboard navigation faster when zoomed out, which felt a lot more natural.


The movement vector was then normalized and multiplied by the movement speed, then applied to the current camera position. We normalized the vector to make sure diagonal movement was the same speed as orthogonal movement, which also helped it feel more natural. Finally, we applied the same kind of easing function to changes to the movement vector itself. This smoothed out changes in movement direction and speed, making the camera feel much more fluid and juicy.
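Putting the keyboard steps together, a per-frame update might look like the sketch below. The key names follow the standard `KeyboardEvent.key` values. The function names, the sign convention for "up", and the shape of the `zoom` object are assumptions, and the final easing of the vector is left out for brevity.

```javascript
// Sketch of keyboard panning. Names and conventions are illustrative assumptions.
const KEY_DIRECTIONS = {
  ArrowUp: [0, -1], w: [0, -1],
  ArrowDown: [0, 1], s: [0, 1],
  ArrowLeft: [-1, 0], a: [-1, 0],
  ArrowRight: [1, 0], d: [1, 0],
};

// Sum the contribution of every direction key that is currently held down.
function movementVector(keysDown) {
  let x = 0;
  let y = 0;
  for (const key of keysDown) {
    const dir = KEY_DIRECTIONS[key];
    if (dir) {
      x += dir[0];
      y += dir[1];
    }
  }
  return { x, y };
}

// Scale to unit length so diagonal movement is no faster than orthogonal movement.
function normalize(v) {
  const length = Math.hypot(v.x, v.y);
  return length === 0 ? { x: 0, y: 0 } : { x: v.x / length, y: v.y / length };
}

// The formula from the text: faster when zoomed out, slower when zoomed in.
function movementSpeed(maxZoom, currentZoom, speedMultiplier) {
  return (maxZoom / currentZoom) * speedMultiplier;
}

// One animation-frame step: direction times speed, added to the camera position.
function stepCamera(camera, keysDown, zoom) {
  const dir = normalize(movementVector(keysDown));
  const speed = movementSpeed(zoom.maxZoom, zoom.currentZoom, zoom.speedMultiplier);
  return { x: camera.x + dir.x * speed, y: camera.y + dir.y * speed };
}
```

With the two zoom levels mentioned earlier (40x and 4x), the formula makes the camera move ten times faster in the zoomed-out view than in the drawing view.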


Mobile app support


There were a couple of additional challenges to embedding the canvas in the mobile apps for iOS and Android. First, we needed to authenticate the user so they could place tiles. Unlike on the web, where authentication is session based, with the mobile apps we use OAuth. This means that the app needs to provide the webview with an access token for the currently logged-in user. The safest way to do this was to inject the OAuth authorization headers by making a JavaScript call from the app to the webview (this would’ve also allowed us to set other headers if needed). It was then a matter of passing the authorization headers along with each API call.


r.place.injectHeaders({'Authorization': 'Bearer <access token>'});

On the iOS side we additionally implemented notification support for when your next tile was ready to be placed on the canvas. Since tile placement occurred completely in the webview, we needed to implement a callback to the native app. Fortunately, with iOS 8 and higher this is possible with a simple JavaScript call:


webkit.messageHandlers.tilePlacedHandler.postMessage(this.cooldown / 1000);

The delegate method in the app then schedules a notification based on the cooldown timer that was passed in.


What We Learned


You’ll always miss something


Since we had planned everything out perfectly, we knew when we launched, nothing could possibly go wrong. We had load tested the frontend, load tested the backend, there was simply no way we humans could have made any other mistakes.




The launch went smoothly. Over the course of the morning, as the popularity of r/Place went up, so did the number of connections and traffic to our websockets instances:


No big deal, and exactly what we expected. Strangely enough, we thought we were network-bound on those instances and figured we had a lot more headroom. Looking at the CPU usage of the instances, however, painted a different picture:


Those are 8-core instances, so it was clear they were reaching their limits. Why were these boxes suddenly behaving so differently? We chalked it up to r/Place being a much different workload type than they’d seen before. After all, these were lots of very tiny messages; we typically send out larger messages like live thread updates and notifications. We also usually don’t have that many people all receiving the same message, so a lot of things were different.


Still, no big deal, we figured we’d just scale it and call it a day. The on-call person doubled the number of instances and went to a doctor’s appointment, not a care in the world.


Then, this happened:


That graph might seem unassuming if it weren’t for the fact that it was for our production RabbitMQ instance, which handles not only our websockets messages but basically everything that reddit.com relies on. And it wasn’t happy; it wasn’t happy at all.


After a lot of investigating, hand-wringing, and instance upgrading, we narrowed down the problem to the management interface. It had always seemed kind of slow, and we realized that the rabbit diamond collector we use for getting our stats was querying it regularly. We believe that the additional exchanges created when launching new websockets instances, combined with the throughput of messages we were receiving on those exchanges, caused rabbit to buckle under the bookkeeping needed to answer queries for the admin interface. So we turned it off, and things got better.


We don’t like being in the dark, so we whipped up an artisanal, hand-crafted monitoring script to get us through the project:


$ cat s****y_diamond.sh


/usr/sbin/rabbitmqctl list_queues | /usr/bin/awk '$2~/[0-9]/{print "servers.foo.bar.rabbit.rabbitmq.queues." $1 ".messages " $2 " " systime()}' | /bin/grep -v 'amq.gen' | /bin/nc 2013

If you’re wondering why we kept adjusting the timeouts on placing pixels, there you have it. We were trying to relieve pressure to keep the whole project running. This is also the reason why, during one period, some pixels were taking a long time to show up.


So unfortunately, despite what messages like this would have you believe:


10K upvotes to reduce the cooldown even further! ADMIN APPROVED

The reasons for the adjustments were entirely technical. Although it was cool to watch r/Place/new after making the change:


So maybe that was part of the motivation.


Bots Will Be Bots


We ran into one more slight hiccup at the end of the project. In general, one of our recurring problems is clients with bad retry behavior. A lot of clients, when faced with an error, will simply retry. And retry. And retry. This means that whenever there is a hiccup on the site, it can turn into a retry storm from clients that have not been programmed to back off in case of trouble.


When we turned off place, the endpoints that a lot of bots were hitting started returning non-200s. Code like this wasn’t very nice. Thankfully, this was easy to block at the Fastly layer.
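For contrast, a better-behaved client waits longer after each failure. The sketch below uses exponential backoff with random jitter, a common general technique and not something taken from Reddit's code or API guidelines. The function names and the default delays are assumptions.

```javascript
// Sketch of a polite retry policy: exponential backoff with full jitter.
// Defaults (1s base, 60s cap, 5 attempts) are assumed values for illustration.
function backoffDelay(attempt, baseMs = 1000, maxMs = 60000, random = Math.random) {
  // The ceiling doubles with every failed attempt, up to maxMs.
  const ceiling = Math.min(maxMs, baseMs * 2 ** attempt);
  // Pick a random delay below the ceiling so clients don't retry in lockstep.
  return random() * ceiling;
}

// `doRequest` is any function returning a promise of a fetch-like response.
async function fetchWithBackoff(
  doRequest,
  maxAttempts = 5,
  sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms))
) {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const response = await doRequest();
    if (response.ok) return response;
    // Wait before the next try, but not after the final failure.
    if (attempt < maxAttempts - 1) await sleep(backoffDelay(attempt));
  }
  throw new Error("giving up after " + maxAttempts + " attempts");
}
```

A client written this way sends a handful of requests to a dead endpoint and then stops, instead of hammering it indefinitely.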


Creating Something More


This project could not have come together so successfully without a tremendous amount of teamwork. We’d like to thank u/gooeyblob, u/egonkasper, u/eggplanticarus, u/spladug, u/thephilthe, u/d3fect and everyone else who contributed to the r/Place team, for making this April Fools’ experiment possible.


And as we mentioned before, if you’re interested in creating unique experiences for millions of users, check out our Careers page.