0. Notes

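  • To a large extent, this is the typical way of thinking of a “scientific” skeptic. Many years ago, while discussing a question with a friend, I made the point that “besides searching for evidence that supports your view, you should also search for the evidence against it”, to get the full picture.
  • Reasonable doubt is rationalism, excessive doubt is skepticism, and no doubt at all is uncritical acceptance. There is no objective boundary between them, so this topic lends itself readily to discussion and to “aha moments”: it fits the general framework of argument in speculative thought, namely dialectical thinking.
  • It can be the inner drive that broadens a person's horizons, and it can also be the start of a slide into nihilism.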

1. [AI] rating summary

🤔 Analysis Trace

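🏷️ [Critical-Thinking] | Information density: 9.5 | 🆕 Novelty: 8.7
Judgment: The essay uses the “single-study trap” as its entry point and works through two cases, medicine and politics (the minimum wage), to dissect systematic distortion in the evidence ecosystem. It does not accuse individual studies of fraud; it lays out the statistical and communication mechanisms by which a body of rigorous studies still produces a misleading consensus. The density is very high without being technically obscure. Novelty is slightly lower than that of comparable thought experiments (such as the precursors to The Black Swan), but the essay stands out for its concrete worked reasoning and its ironic tension.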

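🎯 The Signal

  • In one sentence: a so-called “scientific consensus” is often the result of selectively aggregated noise; be wary of any argument that claims to represent the whole body of evidence on the strength of a subset of studies, however many. Real evidence literacy means seeing the full distribution of studies (bell-shaped or needle-shaped) and its structural skew.
  • Key points and logic:
    • Study results tend to form a bell curve around the true effect (e.g. a weakly effective drug); heterogeneity across studies (population / endpoint / design), random noise, and publication bias spread the results all the way from “very good” to “very bad”;
    • Interested parties (drug companies / political camps) build a “pseudo-consensus” by cherry-picking studies from the tails of the distribution (or by mixing in studies of different sub-questions), and the public is persuaded because it cannot see the full distribution;
    • Meta-analyses, expert surveys, and media summaries are contaminated by the same selective presentation: different meta-analyses can reach opposite conclusions, and a difference of 100 signatures between two economists' letters (500 vs. 600) is enough to flip the “mainstream opinion” narrative;
    • The one reliable clue the essay offers is a visualization of the evidence distribution (such as a funnel plot), but that tool can itself be manipulated, so the last line of defense is to actively search for equally strong evidence on the opposing side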

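⚖️ Stance & Bias

  • Author's intent: in-depth exploration (a thinking exercise aimed at cognitive self-defense, not emotional venting or picking a side)
  • Potential bias: an implicit pessimistic assumption that the scientific community's capacity for self-correction is limited; reduces doctors who attend pharma-sponsored luncheons to passive dupes and underestimates the real constraining force of clinical guidelines and systematic reviews (a rhetorical exaggeration, not a flaw in the core argument).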

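🌲 Entities & Roles

  • Depakote : the medical anchor case running through the essay, used to make concrete the core point that the same drug's efficacy splits across subtypes and phases of illness → [neutral]
  • Card & Krueger (1992) : the “ground zero” study of the minimum-wage dispute, enlisted by both sides as symbolic evidence, showing how a single study gets captured by narrative → [neutral]
  • University of Chicago Booth School (2013 survey) : a typical case of “expert consensus” that can be cited selectively; its nearly 4:1 result stands in direct opposition to the figure that 73% of AEA labor economists expect a significant increase to cause employment losses → [neutral]
  • Employment Policies Institute : a typical interested party presenting itself as “neutral”; the figures it cites serve an anti-minimum-wage narrative → [negative]
  • raisetheminimumwage.com : an advocacy site that likewise passes for “neutral” while holding a clear position; its selective summaries of the evidence are a source of information pollution → [negative]
  • Funnel plot : the only empirical check the essay proposes, given methodological weight as a way to approximate the true signal distribution, though the author admits it could be hacked → [positive]
  • Scott Alexander (author) : the builder of the essay's thinking framework, who practices “awareness of the evidence distribution” from a physician's (psychiatrist's) vantage point; his blog is itself a counterexample to knowledge filtering → [neutral]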

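💡 Heuristic Questions

  1. If all credible studies point to “X works in scenario A and fails in scenario B”, but policy must choose one of two options (roll out everywhere / ban everywhere), then what exactly is a so-called “evidence-based decision” based on?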

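Background info.

  1. Funnel plot: a graphical tool used in meta-analysis to detect publication bias. The horizontal axis is effect size and the vertical axis is study precision (e.g. sample size or the inverse of the standard error). Without bias, the points should form a symmetric inverted funnel; if small studies cluster on the side with significant effects, that suggests null results went unpublished. The essay elevates it into a metaphor for a panoramic view of the evidence distribution.
  2. Depakote (divalproex sodium, a valproate): a broad-spectrum anticonvulsant with an established antimanic effect (FDA-approved for manic episodes), but with weak efficacy in bipolar depression and no approval for maintenance treatment; the clinical distinctions the author draws are broadly consistent with the evidence.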

2. original content

    • Beware The Man Of One Study

Posted on December 12, 2014 by Scott Alexander

Aquinas famously said: beware the man of one book. I would add: beware the man of one study.

For example, take medical research. Suppose a certain drug is weakly effective against a certain disease. After a few years, a bunch of different research groups have gotten their hands on it and done all sorts of different studies. In the best case scenario the average study will find the true result – that it’s weakly effective.

But there will also be random noise caused by inevitable variation and by some of the experiments being better quality than others. In the end, we might expect something looking kind of like a bell curve. The peak will be at “weakly effective”, but there will be a few studies to either side. Something like this:

We see that the peak of the curve is somewhere to the right of neutral – ie weakly effective – and that there are about 15 studies that find this correct result.

But there are also about 5 studies that find that the drug is very good, and 5 studies missing the sign entirely and finding that the drug is actively bad. There’s even 1 study finding that the drug is very bad, maybe seriously dangerous.

This is before we get into fraud or statistical malpractice. I’m saying this is what’s going to happen just by normal variation in experimental design. As we increase experimental rigor, the bell curve might get squashed horizontally, but there will still be a bell curve.
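A minimal sketch of that claim, under assumptions not in the original: every study is modeled as the true effect plus Gaussian noise, and "rigor" is modeled as nothing more than a smaller noise standard deviation. The effect size of 0.2 and both noise levels are hypothetical numbers chosen for illustration.

```python
import random
import statistics

def simulate_studies(true_effect, noise_sd, n_studies, seed=0):
    """One estimate per study: the true effect plus Gaussian noise."""
    rng = random.Random(seed)
    return [rng.gauss(true_effect, noise_sd) for _ in range(n_studies)]

def summarize(estimates):
    """Centre, spread, and number of studies that get the sign wrong."""
    return {
        "mean": statistics.mean(estimates),
        "sd": statistics.stdev(estimates),
        "wrong_sign": sum(1 for e in estimates if e < 0),
    }

# Hypothetical numbers: a weakly positive true effect of 0.2.
sloppy = summarize(simulate_studies(0.2, noise_sd=0.3, n_studies=1000))
rigorous = summarize(simulate_studies(0.2, noise_sd=0.1, n_studies=1000))
print("sloppy:  ", sloppy)
print("rigorous:", rigorous)
```

Both runs centre near the true effect. The more rigorous run has a narrower spread and fewer wrong-sign studies, but it is still a distribution, not a single point.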

In practice it’s worse than this, because this is assuming everyone is investigating exactly the same question.

Suppose that the graph is titled “Effectiveness Of This Drug In Treating Bipolar Disorder”.

But maybe the drug is more effective in bipolar i than in bipolar ii (Depakote, for example).

Or maybe the drug is very effective against bipolar mania, but much less effective against bipolar depression (Depakote again).

Or maybe the drug is a good acute antimanic agent, but very poor at maintenance treatment (let’s stick with Depakote).

If you have a graph titled “Effectiveness Of Depakote In Treating Bipolar Disorder” plotting studies from “Very Bad” to “Very Good” – and you stick all the studies – maintenance, manic, depressive, bipolar i, bipolar ii – on the graph, then you’re going to end up running the gamut from “very bad” to “very good” even before you factor in noise and even before you factor in bias and poor experimental design.
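The pooling problem can be sketched with a handful of made-up numbers. Everything in the block below is an assumption for illustration: the scores sit on an invented scale from -2 ("very bad") to +2 ("very good") and are not clinical estimates for Depakote or any other drug.

```python
from collections import defaultdict

# Made-up effect scores on a -2 (very bad) .. +2 (very good) scale.
# Illustrative only: these are NOT clinical estimates for any real drug.
studies = [
    ("acute mania", 1.6),
    ("acute mania", 1.4),
    ("bipolar i", 1.0),
    ("bipolar ii", 0.1),
    ("bipolar depression", -0.4),
    ("maintenance", -1.5),
]

def effect_range(rows):
    """Lowest and highest score among the given (question, score) rows."""
    scores = [score for _, score in rows]
    return min(scores), max(scores)

def group_by_question(rows):
    """Split the rows by the sub-question each study actually asked."""
    groups = defaultdict(list)
    for question, score in rows:
        groups[question].append((question, score))
    return dict(groups)

pooled_low, pooled_high = effect_range(studies)
print(f"pooled under one title: {pooled_low:+.1f} to {pooled_high:+.1f}")
for question, rows in group_by_question(studies).items():
    low, high = effect_range(rows)
    print(f"{question:>20}: {low:+.1f} to {high:+.1f}")
```

With no noise term at all, the pooled range already spans most of the scale, while each sub-question taken alone is narrow.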

So here’s why you should beware the man of one study.

If you go to your better class of alternative medicine websites, they don’t tell you “Studies are a logocentric phallocentric tool of Western medicine and the Big Pharma conspiracy.”

They tell you “medical science has proved that this drug is terrible, but ignorant doctors are pushing it on you anyway. Look, here’s a study by a reputable institution proving that the drug is not only ineffective, but harmful.”

And the study will exist, and the authors will be prestigious scientists, and it will probably be about as rigorous and well-done as any other study.

And then a lot of people raised on the idea that some things have Evidence and other things have No Evidence think holy s**t, they’re right!

On the other hand, your doctor isn’t going to a sketchy alternative medicine website. She’s examining the entire literature and extracting careful and well-informed conclusions from…

Haha, just kidding. She’s going to a luncheon at a really nice restaurant sponsored by a pharmaceutical company, which assures her that they would never take advantage of such an opportunity to shill their drug, they just want to raise awareness of the latest study. And the latest study shows that their drug is great! Super great! And your doctor nods along, because the authors of the study are prestigious scientists, and it’s about as rigorous and well-done as any other study.

But obviously the pharmaceutical company has selected one of the studies from the “very good” end of the bell curve.

And I called this “Beware The Man of One Study”, but it’s easy to see that in the little diagram there are like three or four studies showing that the drug is “very good”, so if your doctor is a little skeptical, the pharmaceutical company can say “You are right to be skeptical, one study doesn’t prove anything, but look – here’s another group that finds the same thing, here’s yet another group that finds the same thing, and here’s a replication that confirms both of them.”

And even though it looks like in our example the sketchy alternative medicine website only has one “very bad” study to go off of, they could easily supplement it with a bunch of merely “bad” studies. Or they could add all of those studies about slightly different things. Depakote is ineffective at treating bipolar depression. Depakote is ineffective at maintenance bipolar therapy. Depakote is ineffective at bipolar ii.

So just sum it up as “Smith et al 1987 found the drug ineffective, yet doctors continue to prescribe it anyway”. Even if you hunt down the original study (which no one does), Smith et al won’t say specifically “Do remember that this study is only looking at bipolar maintenance, which is a different topic from bipolar acute antimanic treatment, and we’re not saying anything about that.” It will just be titled something like “Depakote fails to separate from placebo in six month trial of 91 patients” and trust that the responsible professionals reading it are well aware of the difference between acute and maintenance treatments (hahahahaha).

So it’s not so much “beware the man of one study” as “beware the man of any number of studies less than a relatively complete and not-cherry-picked survey of the research”.
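A small sketch of why several studies are no safer than one when they are hand-picked. It assumes the same toy model as the bell curve described earlier (true effect plus Gaussian noise); the effect size, noise level, and study count are hypothetical.

```python
import random
import statistics

def simulate_studies(true_effect=0.2, noise_sd=0.3, n_studies=30, seed=1):
    """Toy literature: each study reports the true effect plus noise."""
    rng = random.Random(seed)
    return [rng.gauss(true_effect, noise_sd) for _ in range(n_studies)]

def cherry_pick(estimates, k, favourable):
    """Return the k most favourable (or least favourable) estimates."""
    return sorted(estimates, reverse=favourable)[:k]

estimates = simulate_studies()
advocate = cherry_pick(estimates, k=4, favourable=True)   # the luncheon slides
critic = cherry_pick(estimates, k=4, favourable=False)    # the sketchy website

print(f"all 30 studies, mean effect: {statistics.mean(estimates):+.2f}")
print(f"advocate's 4 studies:        {statistics.mean(advocate):+.2f}")
print(f"critic's 4 studies:          {statistics.mean(critic):+.2f}")
```

Each side can truthfully say that four independent groups found the same thing, and neither set of four looks anything like the mean of the full literature.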

II.

I think medical science is still pretty healthy, and that the consensus of doctors and researchers is more-or-less right on most controversial medical issues.

(it’s the uncontroversial ones you have to worry about)

Politics doesn’t have this protection.

Like, take the minimum wage question (please). We all know about the Krueger and Card study in New Jersey that found no evidence that high minimum wages hurt the economy. We probably also know the counterclaims that it was completely debunked as despicable dishonest statistical malpractice. Maybe some of us know Card and Krueger wrote a pretty convincing rebuttal of those claims. Or that a bunch of large and methodologically advanced studies have come out since then, some finding no effect like Dube, others finding strong effects like Rubinstein and Wither. These are just examples; there are at least dozens and probably hundreds of studies on both sides.

But we can solve this with meta-analyses and systematic reviews, right?

Depends which one you want. Do you go with this meta-analysis of fourteen studies that shows that any presumed negative effect of high minimum wages is likely publication bias? With this meta-analysis of sixty-four studies that finds the same thing and discovers no effect of minimum wage after correcting for the problem? Or how about this meta-analysis of fifty-five countries that does find effects in most of them? Maybe you prefer this systematic review of a hundred or so studies that finds strong and consistent effects?

Can we trust news sources, think tanks, econblogs, and other institutions to sum up the state of the evidence?

CNN claims that 85% of credible studies have shown the minimum wage causes job loss. But raisetheminimumwage.com declares that “two decades of rigorous economic research have found that raising the minimum wage does not result in job loss…researchers and businesses alike agree today that the weight of the evidence shows no reduction in employment resulting from minimum wage increases.” Modeled Behavior says “the majority of the new minimum wage research supports the hypothesis that the minimum wage increases unemployment.” The Center on Budget and Policy Priorities says “The common claim that raising the minimum wage reduces employment for low-wage workers is one of the most extensively studied issues in empirical economics. The weight of the evidence is that such impacts are small to none.”

Okay, fine. What about economists? They seem like experts. What do they think?

Well, five hundred economists signed a letter to policy makers saying that the science of economics shows increasing the minimum wage would be a bad idea. That sounds like a promising consensus…

...except that six hundred economists signed a letter to policy makers saying that the science of economics shows increasing the minimum wage would be a good idea. (h/t Greg Mankiw)

Fine then. Let’s do a formal survey of economists. Now what?

raisetheminimumwage.com, an unbiased source if ever there was one, confidently tells us that “indicative is a 2013 survey by the University of Chicago’s Booth School of Business in which leading economists agreed by a nearly 4 to 1 margin that the benefits of raising and indexing the minimum wage outweigh the costs.”

But the Employment Policies Institute, which sounds like it’s trying way too hard to sound like an unbiased source, tells us that “Over 73 percent of AEA labor economists believe that a significant increase will lead to employment losses and 68 percent think these employment losses fall disproportionately on the least skilled. Only 6 percent feel that minimum wage hikes are an efficient way to alleviate poverty.”

So the whole thing is fiendishly complicated. But unless you look very very hard, you will never know that.

If you are a conservative, what you will find on the sites you trust will be something like this:

Economic theory has always shown that minimum wage increases decrease employment, but the Left has never been willing to accept this basic fact. In 1992, they trumpeted a single study by Card and Krueger that purported to show no negative effects from a minimum wage increase. This study was immediately debunked and found to be based on statistical malpractice and “massaging the numbers”. Since then, dozens of studies have come out confirming what we knew all along – that a high minimum wage is economic suicide. Systematic reviews and meta-analyses (Neumark 2006, Boockman 2010) consistently show that an overwhelming majority of the research agrees on this fact – as do 73% of economists. That’s why five hundred top economists recently signed a letter urging policy makers not to buy into discredited liberal minimum wage theories. Instead of listening to starry-eyed liberal woo, listen to the empirical evidence and an overwhelming majority of economists and oppose a raise in the minimum wage.

And if you are a leftist, what you will find on the sites you trust will be something like this:

People used to believe that the minimum wage decreased employment. But Card and Krueger’s famous 1992 study exploded that conventional wisdom. Since then, the results have been replicated over fifty times, and further meta-analyses (Card and Krueger 1995, Dube 2010) have found no evidence of any effect. Leading economists agree by a 4 to 1 margin that the benefits of raising the minimum wage outweigh the costs, and that’s why more than 600 of them have signed a petition telling the government to do exactly that. Instead of listening to conservative scare tactics based on long-debunked theories, listen to the empirical evidence and the overwhelming majority of economists and support a raise in the minimum wage.

Go ahead. Google the issue and see what stuff comes up. If it doesn’t quite match what I said above, it’s usually because they can’t even muster that level of scholarship. Half the sites just cite Card and Krueger and call it a day!

These sites with their long lists of studies and experts are super convincing. And half of them are wrong.

At some point in their education, most smart people usually learn not to credit arguments from authority. If someone says “Believe me about the minimum wage because I seem like a trustworthy guy,” most of them will have at least one neuron in their head that says “I should ask for some evidence”. If they’re really smart, they’ll use the magic words “peer-reviewed experimental studies.”

But I worry that most smart people have not learned that a list of dozens of studies, several meta-analyses, hundreds of experts, and expert surveys showing almost all academics support your thesis – can still be bullshit.

Which is too bad, because that’s exactly what people who want to bamboozle an educated audience are going to use.

III.

I do not want to preach radical skepticism.

For example, on the minimum wage issue, I notice only one side has presented a funnel plot. A funnel plot is usually used to investigate publication bias, but it has another use as well – it’s pretty much an exact presentation of the “bell curve” we talked about above.

This is more of a needle curve than a bell curve, but the point still stands. We see it’s centered around 0, which means there’s some evidence that’s the real signal among all this noise. The bell skews more to the left than to the right, which means more studies have found negative effects of the minimum wage than positive effects of the minimum wage. But since the bell curve is asymmetrical, we interpret that as probably publication bias. So all in all, I think there’s at least some evidence that the liberals are right on this one.
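The reading of asymmetry as publication bias can be sketched with a simulation. This is a toy model under loud assumptions, not the analysis behind the plot discussed here: the true effect is set to exactly 0, standard errors are drawn uniformly, and the publication filter (significant negative results always published, everything else with probability 0.1) is invented for illustration. No plotting library is used; the funnel's asymmetry is summarized numerically instead.

```python
import random
import statistics

def simulate_literature(n_studies=20000, true_effect=0.0, seed=2):
    """Each study is (estimate, standard_error); small studies have large SEs."""
    rng = random.Random(seed)
    studies = []
    for _ in range(n_studies):
        se = rng.uniform(0.05, 1.0)
        studies.append((rng.gauss(true_effect, se), se))
    return studies

def publish(studies, keep_prob=0.1, seed=3):
    """Assumed filter: significant negative results are always published,
    everything else only with probability keep_prob."""
    rng = random.Random(seed)
    published = []
    for estimate, se in studies:
        significant_negative = estimate < -1.96 * se
        if significant_negative or rng.random() < keep_prob:
            published.append((estimate, se))
    return published

def mean_estimate(studies, min_se=0.0, max_se=float("inf")):
    """Mean estimate among studies whose standard error is in [min_se, max_se)."""
    return statistics.mean(est for est, se in studies if min_se <= se < max_se)

literature = simulate_literature()
published = publish(literature)
print(f"all studies:            {mean_estimate(literature):+.3f}")
print(f"published, precise:     {mean_estimate(published, max_se=0.2):+.3f}")
print(f"published, imprecise:   {mean_estimate(published, min_se=0.6):+.3f}")
```

In this toy model the full literature averages close to zero, the precise published studies stay near zero, and the imprecise published studies lean clearly negative. That lopsided bottom of the funnel is what the asymmetry argument relies on.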

Unless, of course, someone has realized that I’ve wised up to the studies and meta-analyses and expert surveys, and figured out a way to hack funnel plots, which I am totally not ruling out.

(okay, I kind of want to preach radical skepticism)

Also, I should probably mention that it’s much more complicated than one side being right, and that the minimum wage probably works differently depending on what industry you’re talking about, whether it’s state wage or federal wage, whether it’s a recession or a boom, whether we’re talking about increasing from 6 or from 30, etc, etc, etc. There are eleven studies on that plot showing an effect even worse than -5, and very possibly they are all accurate for whatever subproblem they have chosen to study – much like the example with Depakote where it might be an effective antimanic but a terrible antidepressant.

(radical skepticism actually sounds a lot better than figuring this all out).

IV.

But the question remains: what happens when (like in most cases) you don’t have a funnel plot?

I don’t have a good positive answer. I do have several good negative answers.

Decrease your confidence about most things if you’re not sure that you’ve investigated every piece of evidence.

Do not trust websites which are obviously biased (eg Free Republic, Daily Kos, Dr. Oz) when they tell you they’re going to give you “the state of the evidence” on a certain issue, even if the evidence seems very stately indeed. This goes double for any site that contains a list of “myths and facts about X”, quadruple for any site that uses phrases like “ingroup member uses actual FACTS to DEMOLISH the outgroup’s lies about Y”, and octuple for RationalWiki.

Most important, even if someone gives you what seems like overwhelming evidence in favor of a certain point of view, don’t trust it until you’ve done a simple Google search to see if the opposite side has equally overwhelming evidence.