我对“Designing Data-Intensive Applications - Detecting Concurrent Writes”的理解对吗？

V2EX = way to explore

V2EX 是一个关于分享和探索的地方

已注册用户请登录

这是一个创建于 1350 天前的主题，其中的信息可能已经有所发展或是发生改变。

（腾讯云最新优惠活动来了：云产品限时1折，云服务器低至88元/年，点击这里立即抢购：9i0i.cn/qcloud，更有2860元代金券免费领取，付款直接抵现金用，点击这里立即领取：9i0i.cn/qcloudquan）

（福利推荐：你还在原价购买阿里云服务器？现在阿里云0.8折限时抢购活动来啦！4核8G企业云服务器仅2998元/3年，立即抢购>>>：9i0i.cn/aliyun）

在“Designing Data-Intensive Applications - CHAPTER 5 Replication - Leaderless Replication - Detecting Concurrent Writes - Capturing the happens-before relationship”中，有这么一张图：

之后它说：

Note that the server can determine whether two operations are concurrent by looking at the version numbers—it does not need to interpret the value itself (so the value could be any data structure). The algorithm works as follows:

The server maintains a version number for every key, increments the version number every time that key is written, and stores the new version number along with the value written.

When a client reads a key, the server returns all values that have not been overwritten, as well as the latest version number. A client must read a key before writing.

When a client writes a key, it must include the version number from the prior read, and it must merge together all values that it received in the prior read. (The response from a write request can be like a read, returning all current values, which allows us to chain several writes like in the shopping cart example.)

When the server receives a write with a particular version number, it can overwrite all values with that version number or below (since it knows that they have been merged into the new value), but it must keep all values with a higher version number (because those values are concurrent with the incoming write).

我不太确定“我对覆盖和保留的理解”是不是正确，我的理解如下所示：

13 条回复 ? 2020-08-08 14:22:12 +08:00

ekoeko

2020-08-06 21:27:25 +08:00

我也在看 DDIA，我觉得你理解的是对的。client1 第一次写入 milk 返回 v1，同时 client2 写入 eggs 返回了 v2 （此时 client2 知道了 client1 和他同时写入了，因为版本号多了），所以 client1 在 v1 的基础上又继续写入返回 v3 （覆盖 v1 ），而 client2 在则合并了 v1 和 v2 的结果再加上 ham 写入返回了 v4 （覆盖 v2 ）...以此类推，让客户端自己去解决冲突，服务端每次把所有版本所有结果返回给你。

JasonLaw

2020-08-07 07:28:56 +08:00 via iPhone

@ekoeko 其实更严格来说，client 2 是不知道 client 1 和它并发的，它只是知道返回了多个值，出现了并发，但是它不知道具体是和哪个 client 。或许更加合理的方式是说数据库知道哪个版本跟哪个版本并发了。你觉得呢？

guyskk0x0

2020-08-07 19:38:07 +08:00 via Android

这个也叫做乐观锁，可以这么理解：
假设数据库里有一行数据 key= k, value=100, version=1 。

写入操作：update table set value=101, version=2 where key= k and version=1，数据库会返回更新的行数。

如果版本号和 client 读到的一致，写入会成功并返回 1，不一致就不会写入并返回 0，client 根据返回值判断是否有冲突。版本号加一可以在 client 计算也可以在 server 计算。

ekoeko

2020-08-07 21:47:46 +08:00

@JasonLaw 是的！?

JasonLaw

2020-08-07 22:42:07 +08:00

@guyskk0x0 #3 其实你说的跟我问的不太一样，你所说的“通过版本号实现的乐观并发控制”是不支持并发写的，即不支持“client 1 update table set value=101, version=2 where key= k and version=1”和“client 2 update table set value=102, version=2 where key= k and version=1”并发，只有一个先执行的 client 才会成功，后面的一个会被忽略。

而这种并发写是 Multi-Leader Replication 和 Leaderless Replication 的根本，我的问题也是关于 Leaderless Replication 如何处理并发写的。

guyskk0x0

2020-08-07 23:45:27 +08:00

@JasonLaw #5 我仔细看了一下，感觉这个图要结合前文来看。图上有几处并发更新，最终 value 有 2 个值不需要合并吗？如果有合并那就是 CRDT，没有合并那就应当有一个 client 失败。

JasonLaw

2020-08-08 12:12:18 +08:00

@guyskk0x0 #6

在“Designing Data-Intensive Applications - CHAPTER 5 Replication - Leaderless Replication - Detecting Concurrent Writes - Merging concurrently written values”中讲了怎么合并并发写的值，具体内容你可以看一下，这里就不贴出来了。

我还有几个疑问：
1. “如果有合并那就是 CRDT”是什么意思？
2. “没有合并那就应当有一个 client 失败”是指“不支持并发写”吗？这不就违背了 Multi-Leader Replication 和 Leaderless Replication 的初衷吗？

guyskk0x0

2020-08-08 12:22:16 +08:00

@JasonLaw #7 CRDT 是一种可以自动解决冲突的数据结构，在 "Merging concurrently written values" 里讲的应该就是它，可能叫法不一样。（我一年前看的中文版，具体内容记不太清了）

“没有合并那就应当有一个 client 失败”，如果不合并数据就不一致了，所以必须有一个失败。

JasonLaw

2020-08-08 12:48:42 +08:00

@guyskk0x0 #8 没错，是 CRDT，conflict-free replicated data type，“Merging concurrently written values”中讲了很多种方式，CRDT 是其中一种。但是这里怎么使用 CRDT 呢？

“没有合并那就应当有一个 client 失败”，如果不合并数据就不一致了，所以必须有一个失败。

我不同意你所说的“没有合并那就应当有一个 client 失败”。本主题的合并操作是让 client 去解决，比如说第 4 步时，client 2 知道 cart 有两个值，分别为[milk]和[eggs]，它想继续加 ham，于是它合并了两个并发写的值[milk]和[eggs]，然后加上 ham，最后将[eggs, milk, ham]发给数据库。根本不需要实现“Last write wins (discarding concurrent writes)”，“Designing Data-Intensive Applications - CHAPTER 5 Replication - Leaderless Replication - Detecting Concurrent Writes - Last write wins (discarding concurrent writes)”也讲了它所存在的问题。

我不同意你所说的“如果不合并数据就不一致了”，不一致是指多个 replicas 的数据不一致，合并并发写的值跟不一致没有太大关系。如果有两个 replicas，分别为 r1 和 r2 （ Multi-Leader Replication ）。client 1 向 r1 执行 set k 1，client 2 并发向 r1 执行 set k 2，最后 r1 保留了 k 的两个值，就算 r1 不执行合并操作，那么只要 r2 也达到了“k 有两个值，分别为 1 和 2”这种状态，那就是一致的。

期待你的回复。

guyskk0x0

2020-08-08 13:05:42 +08:00

“于是它合并了两个并发写的值[milk]和[eggs]，然后加上 ham，最后将[eggs, milk, ham]发给数据库” 这个就是合并，在 client 或 server 都可以做。
不同的场景有不同的合并策略，在购物车场景这里可能是一致的，但如果修改的数据是用户年龄，就不对了。

JasonLaw

2020-08-08 13:14:52 +08:00 via iPhone

@guyskk0x0 但是人更明白应该怎么处理冲突，而不是数据库，你觉得呢？我在 Android 和 iOS 并发修改了我的年龄，数据库怎么选择一个最终值呢？

guyskk0x0

2020-08-08 13:43:07 +08:00

@JasonLaw #11 所以要告诉数据库冲突解决的策略，或者在 client 解决冲突。

JasonLaw

2020-08-08 14:22:12 +08:00

@guyskk0x0 #12 如果是数据库解决冲突，那当然可以，但是就像“Designing Data-Intensive Applications - CHAPTER 5 Replication - Leaderless Replication - Detecting Concurrent Writes - Last write wins (discarding concurrent writes)”中所说的“LWW achieves the goal of eventual convergence, but at the cost of durability: if there are several concurrent writes to the same key, even if they were all reported as successful to the client (because they were written to w replicas), only one of the writes will survive and the others will be silently discarded.”。我一直都是在强调数据库解决冲突的弊端。

@guyskk0x0 #6 我不觉得你前面所说的跟后面所说的是统一的，你前面说“没有合并那就应当有一个 client 失败”、“如果不合并数据就不一致了，所以必须有一个失败”，后面你却说“在 client 解决冲突”。