Go: Understanding Race Conditions and Atomic Synchronization

Preventing data races with mutexes may sound easy, but dealing with race conditions is a whole other matter. Let's learn how to handle these beasts!
Race Condition
Let's say we're keeping track of the money in users' accounts:
// Accounts - money in users' accounts.
type Accounts struct {
    bal map[string]int
    mu  sync.Mutex
}
// NewAccounts creates a new set of accounts.
func NewAccounts(bal map[string]int) *Accounts {
    return &Accounts{bal: maps.Clone(bal)}
}
We can check the balance by username or change the balance:
// Get returns the user's balance.
func (a *Accounts) Get(name string) int {
    a.mu.Lock()
    defer a.mu.Unlock()
    return a.bal[name]
}
// Set changes the user's balance.
func (a *Accounts) Set(name string, amount int) {
    a.mu.Lock()
    defer a.mu.Unlock()
    a.bal[name] = amount
}
Account operations — Get and Set — are concurrent-safe, thanks to the mutex.
There's also a store that sells Lego sets:
// A Lego set.
type LegoSet struct {
    name  string
    price int
}
Alice has 50 coins in her account. She wants to buy two sets: "Castle" for 40 coins and "Plants" for 20 coins:
func main() {
    acc := NewAccounts(map[string]int{
        "alice": 50,
    })
    castle := LegoSet{name: "Castle", price: 40}
    plants := LegoSet{name: "Plants", price: 20}
    var wg sync.WaitGroup
    // Alice buys a castle.
    wg.Go(func() {
        balance := acc.Get("alice")
        if balance < castle.price {
            return
        }
        time.Sleep(5 * time.Millisecond)
        acc.Set("alice", balance-castle.price)
        fmt.Println("Alice bought the castle")
    })
    // Alice buys plants.
    wg.Go(func() {
        balance := acc.Get("alice")
        if balance < plants.price {
            return
        }
        time.Sleep(10 * time.Millisecond)
        acc.Set("alice", balance-plants.price)
        fmt.Println("Alice bought the plants")
    })
    wg.Wait()
    balance := acc.Get("alice")
    fmt.Println("Alice's balance:", balance)
}
Alice bought the castle
Alice bought the plants
Alice's balance: 30
What a twist! Not only did Alice buy both sets for a total of 60 coins (even though she only had 50 coins), but she also ended up with 30 coins left! Great deal for Alice, not so great for us.
The problem is that checking and updating the balance is not an atomic operation:
// body of the second goroutine
balance := acc.Get("alice") // ➊
if balance < plants.price { // ➋
    return
}
time.Sleep(10 * time.Millisecond)
acc.Set("alice", balance-plants.price) // ➌
At point ➊, we see a balance of 50 coins (the first goroutine hasn't done anything yet), so the check at ➋ passes. By point ➌, Alice has already bought the castle (the first goroutine has finished), so her actual balance is 10 coins. But we don't know this and still think her balance is 50 coins. So at point ➌, Alice buys the plants for 20 coins, and the balance becomes 30 coins (the "assumed" balance of 50 coins minus the 20 coins for the plants = 30 coins).
Individual actions on the balance are safe (there's no data race). However, balance reads/writes from different goroutines can get "mixed up", leading to an incorrect final balance. This situation is called a race condition.
You can't fully eliminate uncertainty in a concurrent environment. Events will happen in an unpredictable order — that's just how concurrency works. However, you can protect the system's state — in our case, the purchased sets and balance — so it stays correct no matter what order things happen in.
Let's check and update the balance in one atomic operation, protecting the entire purchase with a mutex. This way, purchases are processed strictly sequentially:
// Shared mutex.
var mu sync.Mutex
// Alice buys a castle.
wg.Go(func() {
    // Protect the entire purchase with a mutex.
    mu.Lock()
    defer mu.Unlock()
    balance := acc.Get("alice")
    if balance < castle.price {
        return
    }
    time.Sleep(5 * time.Millisecond)
    acc.Set("alice", balance-castle.price)
    fmt.Println("Alice bought the castle")
})
// Alice buys plants.
wg.Go(func() {
    // Protect the entire purchase with a mutex.
    mu.Lock()
    defer mu.Unlock()
    balance := acc.Get("alice")
    if balance < plants.price {
        return
    }
    time.Sleep(10 * time.Millisecond)
    acc.Set("alice", balance-plants.price)
    fmt.Println("Alice bought the plants")
})
Alice bought the plants
Alice's balance: 30
One of the goroutines will run first, lock the mutex, check and update the balance, then unlock the mutex. Only after that will the second goroutine be able to lock the mutex and make its purchase.
We still can't be sure which purchase will happen — it depends on the order the goroutines run. But now we are certain that Alice won't buy more than she's supposed to, and the final balance will be correct:
Alice bought the castle
Alice's balance: 10
Or:
Alice bought the plants
Alice's balance: 30
To reiterate:
- A data race happens when multiple goroutines access the same shared data concurrently without synchronization, and at least one of them modifies it. We need to protect the data from this kind of concurrent access.
- A race condition happens when an unpredictable order of operations leads to an incorrect system state. In a concurrent environment, we can't control the exact order things happen. Still, we need to make sure that no matter the order, the system always ends up in the correct state.
Go's race detector can find data races, but it doesn't catch race conditions. It's always up to the programmer to prevent race conditions.
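Since the race detector can't help here, one option is to check the state invariants directly. The following is a minimal sketch, not a definitive test: it replays the mutex-protected purchase many times and counts runs where the money doesn't add up. The names runOnce and checkInvariants are made up for this illustration, and the code uses wg.Add/wg.Done so it also builds on toolchains that predate wg.Go.

```go
package main

import (
	"fmt"
	"sync"
)

// Accounts mirrors the type from this chapter: balances guarded by a mutex.
type Accounts struct {
	mu  sync.Mutex
	bal map[string]int
}

// Get returns the user's balance.
func (a *Accounts) Get(name string) int {
	a.mu.Lock()
	defer a.mu.Unlock()
	return a.bal[name]
}

// Set changes the user's balance.
func (a *Accounts) Set(name string, amount int) {
	a.mu.Lock()
	defer a.mu.Unlock()
	a.bal[name] = amount
}

// runOnce (hypothetical helper) runs both purchases concurrently,
// each one fully protected by a shared mutex, and reports
// how much was spent and what balance is left.
func runOnce() (spent, balance int) {
	acc := &Accounts{bal: map[string]int{"alice": 50}}
	prices := []int{40, 20} // castle, plants
	var mu sync.Mutex       // protects the whole purchase and spent
	var wg sync.WaitGroup
	for _, price := range prices {
		wg.Add(1)
		go func(price int) {
			defer wg.Done()
			mu.Lock()
			defer mu.Unlock()
			bal := acc.Get("alice")
			if bal < price {
				return
			}
			acc.Set("alice", bal-price)
			spent += price
		}(price)
	}
	wg.Wait()
	return spent, acc.Get("alice")
}

// checkInvariants (hypothetical helper) repeats the scenario n times
// and counts runs where the balance went negative or money
// appeared out of nowhere.
func checkInvariants(n int) int {
	violations := 0
	for i := 0; i < n; i++ {
		spent, balance := runOnce()
		if balance < 0 || spent+balance != 50 {
			violations++
		}
	}
	return violations
}

func main() {
	fmt.Println("violations:", checkInvariants(1000)) // violations: 0
}
```

A check like this only raises confidence. Removing the shared mutex may or may not produce violations on a given run, depending on how the goroutines are scheduled.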
Compare-and-Set
Let's go back to the situation with the race condition before we added the mutex:
// Alice's balance = 50 coins.
// Castle price = 40 coins.
// Plants price = 20 coins.
// Alice buys a castle.
wg.Go(func() {
    balance := acc.Get("alice")
    if balance < castle.price {
        return
    }
    time.Sleep(5 * time.Millisecond)
    acc.Set("alice", balance-castle.price)
    fmt.Println("Alice bought the castle")
})
// Alice buys plants.
wg.Go(func() {
    balance := acc.Get("alice")
    if balance < plants.price {
        return
    }
    time.Sleep(10 * time.Millisecond)
    acc.Set("alice", balance-plants.price)
    fmt.Println("Alice bought the plants")
})
Alice bought the castle
Alice bought the plants
Alice's balance: 30
As we discussed, the reason for the incorrect final state is that buying a set (checking and updating the balance) is not an atomic operation:
// body of the second goroutine
balance := acc.Get("alice") // ➊
if balance < plants.price { // ➋
    return
}
time.Sleep(10 * time.Millisecond)
acc.Set("alice", balance-plants.price) // ➌
At point ➊, we see a balance of 50 coins, so the check at ➋ passes. By point ➌, Alice has already bought the castle, so her actual balance is 10 coins. But we don't know this and still think her balance is 50 coins. So at point ➌, Alice buys the plants for 20 coins, and the balance becomes 30 coins (the "assumed" balance of 50 coins minus the 20 coins for the plants = 30 coins).
To solve the problem, we can protect the entire purchase with a mutex, just like we did before. But there's another way to handle it.
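Instead of wrapping the calling code in a separate mutex, we can move both steps (checking and updating the balance) into a single Accounts method that runs under the accounts' own mutex. This is a simple form of the compare-and-set pattern: the new balance is written only if the check made under the same lock passes: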
// Buy attempts to purchase a set for the given buyer.
func (a *Accounts) Buy(buyer string, price int) bool {
    a.mu.Lock()
    defer a.mu.Unlock()
    balance := a.bal[buyer]
    if balance < price {
        return false
    }
    a.bal[buyer] = balance - price
    return true
}
Now the purchase logic is atomic — checking and updating happen together, protected by the mutex. The client code becomes simpler:
// Alice buys a castle.
wg.Go(func() {
    if acc.Buy("alice", castle.price) {
        fmt.Println("Alice bought the castle")
    }
})
// Alice buys plants.
wg.Go(func() {
    if acc.Buy("alice", plants.price) {
        fmt.Println("Alice bought the plants")
    }
})
Alice bought the castle
Alice's balance: 10
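Whichever goroutine runs first makes its purchase and the other one is rejected, so only one purchase can succeed and the final balance is always correct (10 coins here, or 30 coins if the plants are bought first).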
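For comparison, a more literal compare-and-set keeps reading the balance as a separate step and makes only the conditional write atomic: "set the balance to the new value, but only if it still equals the value I read". The sketch below assumes a hypothetical CompareAndSet method and a buy helper that retries when another goroutine got there first. Neither is part of the Accounts type shown above.

```go
package main

import (
	"fmt"
	"sync"
)

// Accounts mirrors the type from this chapter.
type Accounts struct {
	mu  sync.Mutex
	bal map[string]int
}

// Get returns the user's balance.
func (a *Accounts) Get(name string) int {
	a.mu.Lock()
	defer a.mu.Unlock()
	return a.bal[name]
}

// CompareAndSet (hypothetical) writes newVal only if the balance
// still equals oldVal, and reports whether the write happened.
func (a *Accounts) CompareAndSet(name string, oldVal, newVal int) bool {
	a.mu.Lock()
	defer a.mu.Unlock()
	if a.bal[name] != oldVal {
		return false
	}
	a.bal[name] = newVal
	return true
}

// buy (hypothetical) reads the balance, checks it, and tries to
// apply the update. If the balance changed in between,
// it re-reads the balance and tries again.
func buy(acc *Accounts, name string, price int) bool {
	for {
		balance := acc.Get(name)
		if balance < price {
			return false
		}
		if acc.CompareAndSet(name, balance, balance-price) {
			return true
		}
	}
}

// simulate runs both purchases concurrently and returns their outcomes.
func simulate() (castle, plants bool, balance int) {
	acc := &Accounts{bal: map[string]int{"alice": 50}}
	prices := []int{40, 20} // castle, plants
	results := make([]bool, len(prices))
	var wg sync.WaitGroup
	for i, price := range prices {
		wg.Add(1)
		go func(i, price int) {
			defer wg.Done()
			results[i] = buy(acc, "alice", price) // each goroutine writes its own element
		}(i, price)
	}
	wg.Wait()
	return results[0], results[1], acc.Get("alice")
}

func main() {
	castle, plants, balance := simulate()
	fmt.Println("castle:", castle, "plants:", plants, "balance:", balance)
}
```

Which purchase wins still depends on scheduling, but the stale-balance write from the original example can no longer go through: the second goroutine's CompareAndSet fails, it re-reads the balance, and the check rejects the purchase.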
Idempotence and Atomicity
An operation is idempotent if performing it multiple times has the same effect as performing it once. For example, setting a value is idempotent:
acc.Set("alice", 30)
acc.Set("alice", 30) // Same effect as the first call
But incrementing a value is not idempotent:
acc.Set("alice", acc.Get("alice") + 10)
acc.Set("alice", acc.Get("alice") + 10) // Different effect!
An operation is atomic if it appears to happen all at once from the perspective of other goroutines. Even if the operation involves multiple steps internally, other goroutines can't see intermediate states.
In our purchase example, the Buy method is atomic because the entire check-and-update operation happens while holding the mutex. Other goroutines can't see the balance between the check and the update.
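The two properties are independent. As a rough illustration, the increment above can be made atomic by moving the read and the write into one critical section, yet it stays non-idempotent, because every call still changes the balance. The Add method and the deposit helper here are assumptions made for this sketch, not part of the original Accounts type.

```go
package main

import (
	"fmt"
	"sync"
)

// Accounts mirrors the type from this chapter.
type Accounts struct {
	mu  sync.Mutex
	bal map[string]int
}

// Get returns the user's balance.
func (a *Accounts) Get(name string) int {
	a.mu.Lock()
	defer a.mu.Unlock()
	return a.bal[name]
}

// Add (hypothetical) changes the balance by delta within a single
// critical section, so no other goroutine can slip in between
// the read and the write. It returns the new balance.
func (a *Accounts) Add(name string, delta int) int {
	a.mu.Lock()
	defer a.mu.Unlock()
	a.bal[name] += delta
	return a.bal[name]
}

// deposit runs n concurrent deposits of 10 coins each
// and returns the final balance.
func deposit(n int) int {
	acc := &Accounts{bal: map[string]int{"alice": 0}}
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			acc.Add("alice", 10)
		}()
	}
	wg.Wait()
	return acc.Get("alice")
}

func main() {
	fmt.Println(deposit(100)) // 1000
}
```

With acc.Set("alice", acc.Get("alice")+10) in place of Add, some of the concurrent deposits could be lost, because another goroutine may update the balance between the Get and the Set.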
Locker
The sync.Locker interface provides a standard way to work with locks:
type Locker interface {
    Lock()
    Unlock()
}
Both *sync.Mutex and *sync.RWMutex implement this interface (the methods have pointer receivers, so pass a pointer to the lock). This allows you to write code that works with any type of lock:
func doWork(lock sync.Locker) {
    lock.Lock()
    defer lock.Unlock()
    // Do work...
}
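Here is a small usage sketch for a function that takes a sync.Locker. It passes in a mutex, a read-write mutex (which takes the write lock), and the value returned by RWMutex.RLocker, a standard library method that exposes the read lock as a Locker. The withLock and run helpers are hypothetical names used only for this example.

```go
package main

import (
	"fmt"
	"sync"
)

// withLock (hypothetical helper) runs fn while holding the given lock.
func withLock(lock sync.Locker, fn func()) {
	lock.Lock()
	defer lock.Unlock()
	fn()
}

// run exercises withLock with three different Locker values.
func run() (counter, seen int) {
	var mu sync.Mutex
	var rw sync.RWMutex

	withLock(&mu, func() { counter++ }) // *sync.Mutex
	withLock(&rw, func() { counter++ }) // *sync.RWMutex, write lock

	// RLocker returns a Locker whose Lock and Unlock
	// call rw.RLock and rw.RUnlock.
	withLock(rw.RLocker(), func() { seen = counter })
	return counter, seen
}

func main() {
	counter, seen := run()
	fmt.Println(counter, seen) // 2 2
}
```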
TryLock
Sometimes you want to acquire a lock only if it's free right now, and return an error immediately instead of waiting when it isn't.
We can use the TryLock method of a mutex to implement this logic:
// External is a client for an external system.
type External struct {
    lock sync.Mutex
}
// Call calls the external system.
func (e *External) Call() error {
    if !e.lock.TryLock() {
        return errors.New("busy") // ➊
    }
    defer e.lock.Unlock()
    // Simulate a remote call.
    time.Sleep(100 * time.Millisecond)
    return nil
}
TryLock tries to lock the mutex, just like a regular Lock. But if it can't, it returns false right away instead of blocking the goroutine. This way, we can immediately return an error at ➊ instead of waiting for the system to become available.
Now, out of four simultaneous calls, only one will go through. The others will get a "busy" error:
func main() {
    const nCalls = 4
    ex := new(External)
    start := time.Now()
    var wg sync.WaitGroup
    for range nCalls {
        wg.Go(func() {
            err := ex.Call()
            if err != nil {
                fmt.Println(err)
            } else {
                fmt.Println("success")
            }
        })
    }
    wg.Wait()
    fmt.Printf(
        "%d calls took %d ms\n",
        nCalls, time.Since(start).Milliseconds(),
    )
}
busy
busy
busy
success
4 calls took 100 ms
According to the standard library docs, TryLock is rarely needed. In fact, using it might mean there's a problem with your program's design. For example, if you're calling TryLock in a busy-wait loop ("keep trying until the resource is free") — that's usually a bad sign:
for {
    if mutex.TryLock() {
        // Use the shared resource.
        mutex.Unlock()
        break
    }
}
This code will keep one CPU core at 100% usage until the mutex is unlocked. It's much better to use a regular Lock so the scheduler can take the blocked goroutine off the CPU.
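If the real goal is "wait for the lock, but not forever", a regular Lock is not enough, and a TryLock loop is still the wrong tool. One possible alternative, sketched here as an assumption rather than a recommendation from the standard library, is a lock built on a buffered channel of capacity one. Acquiring it is a channel send, so it can be combined with a timer in a select while the goroutine stays parked instead of spinning. ChanLock, NewChanLock, LockTimeout and demo are hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// ChanLock (hypothetical) is a lock backed by a channel of capacity 1:
// a value in the channel means the lock is held.
type ChanLock chan struct{}

// NewChanLock creates an unlocked ChanLock.
func NewChanLock() ChanLock {
	return make(ChanLock, 1)
}

// LockTimeout waits up to d for the lock. While waiting, the goroutine
// is blocked in select and does not consume CPU.
func (l ChanLock) LockTimeout(d time.Duration) error {
	timer := time.NewTimer(d)
	defer timer.Stop()
	select {
	case l <- struct{}{}:
		return nil
	case <-timer.C:
		return errors.New("timeout")
	}
}

// Unlock releases the lock.
func (l ChanLock) Unlock() {
	<-l
}

// demo acquires the lock, fails to acquire it a second time,
// then succeeds again after unlocking.
func demo() (first, second, third error) {
	lock := NewChanLock()
	first = lock.LockTimeout(10 * time.Millisecond)  // lock is free
	second = lock.LockTimeout(10 * time.Millisecond) // lock is held
	lock.Unlock()
	third = lock.LockTimeout(10 * time.Millisecond) // free again
	return first, second, third
}

func main() {
	first, second, third := demo()
	fmt.Println(first, second, third) // <nil> timeout <nil>
}
```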
Shared Nothing
Let's go back one last time to Alice and the Lego sets we started the chapter with.
We manage user accounts:
// Accounts - money in users' accounts.
type Accounts struct {
    bal map[string]int
    mu  sync.Mutex
}
// NewAccounts creates a new set of accounts.
func NewAccounts(bal map[string]int) *Accounts {
    return &Accounts{bal: maps.Clone(bal)}
}
// Get returns the user's balance.
func (a *Accounts) Get(name string) int {
    a.mu.Lock() // ➊
    defer a.mu.Unlock()
    return a.bal[name]
}
// Set changes the user's balance.
func (a *Accounts) Set(name string, amount int) {
    a.mu.Lock() // ➋
    defer a.mu.Unlock()
    a.bal[name] = amount
}
And handle purchases:
acc := NewAccounts(map[string]int{
    "alice": 50,
})
castle := LegoSet{name: "Castle", price: 40}
plants := LegoSet{name: "Plants", price: 20}
// Shared mutex.
var mu sync.Mutex
// Alice buys a castle.
wg.Go(func() {
    // Protect the entire purchase with a mutex.
    mu.Lock() // ➌
    defer mu.Unlock()
    // Check and update the balance.
})
// Alice buys plants.
wg.Go(func() {
    // Protect the entire purchase with a mutex.
    mu.Lock() // ➍
    defer mu.Unlock()
    // Check and update the balance.
})
This isn't a very complex use case — I'm sure you've seen worse. Still, we had to put in some effort:
- Protect the balance with a mutex to prevent a data race ➊ ➋.
- Protect the entire purchase operation with a mutex (or use compare-and-set) to make sure the final state is correct ➌ ➍.
We were lucky to notice and prevent the race condition during a purchase. What if we had missed it?
There's another approach to achieving safe concurrency: instead of protecting shared state when working with multiple goroutines, we can avoid shared state altogether. Channels can help us do this.
Here's the idea: we'll create a Processor function that accepts purchase requests through an input channel, processes them, and sends the results back through an output channel:
// A purchase request.
type Request struct {
    buyer string
    set   LegoSet
}
// A purchase result.
type Purchase struct {
    buyer   string
    set     LegoSet
    succeed bool
    balance int // balance after purchase
}
// Processor handles purchases.
func Processor(acc map[string]int) (chan<- Request, <-chan Purchase) {
    // ...
}
Buyer goroutines will send requests to the processor's input channel and receive results (successful or failed purchases) from the output channel:
func main() {
    const buyer = "Alice"
    acc := map[string]int{buyer: 50}
    wishlist := []LegoSet{
        {name: "Castle", price: 40},
        {name: "Plants", price: 20},
    }
    reqs, purs := Processor(acc)
    // Alice buys stuff.
    var wg sync.WaitGroup
    for _, set := range wishlist {
        wg.Go(func() {
            reqs <- Request{buyer: buyer, set: set}
            pur := <-purs
            if pur.succeed {
                fmt.Printf("%s bought the %s\n", pur.buyer, pur.set.name)
                fmt.Printf("%s's balance: %d\n", buyer, pur.balance)
            }
        })
    }
    wg.Wait()
}
Alice bought the Plants
Alice's balance: 30
This approach offers several benefits:
- Buyer goroutines send their requests and get results without worrying about how the purchase is done.
- All the buying logic is handled inside the processor goroutine.
- No need for mutexes.
All that's left is to implement the processor. How about this:
// Processor handles purchases.
func Processor(acc map[string]int) (chan<- Request, <-chan Purchase) {
    in := make(chan Request)
    out := make(chan Purchase)
    acc = maps.Clone(acc)
    go func() {
        for {
            // Receive the purchase request.
            req := <-in
            // Handle the purchase.
            balance := acc[req.buyer]
            pur := Purchase{buyer: req.buyer, set: req.set, balance: balance}
            if balance >= req.set.price {
                pur.balance -= req.set.price
                pur.succeed = true
                acc[req.buyer] = pur.balance
            } else {
                pur.succeed = false
            }
            // Send the result.
            out <- pur
        }
    }()
    return in, out
}
It would have been a good idea to add a way to stop the processor using context, but I decided not to do it to keep the code simple.
The processor clones the original account states and works with its own copy. This approach makes sure there is no concurrent access to the accounts, so there are no races. Of course, we should avoid running two processors at the same time, or we could end up with two different versions of the truth.
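For reference, here is one way the context-based shutdown mentioned above might look. This is only a sketch under a few assumptions: ProcessorCtx and run are hypothetical names, the processor closes the output channel on exit so that receivers can notice the shutdown, and the types are repeated to keep the example self-contained.

```go
package main

import (
	"context"
	"fmt"
	"maps"
)

// A Lego set.
type LegoSet struct {
	name  string
	price int
}

// A purchase request.
type Request struct {
	buyer string
	set   LegoSet
}

// A purchase result.
type Purchase struct {
	buyer   string
	set     LegoSet
	succeed bool
	balance int // balance after purchase
}

// ProcessorCtx (hypothetical) works like Processor,
// but stops when ctx is canceled.
func ProcessorCtx(ctx context.Context, acc map[string]int) (chan<- Request, <-chan Purchase) {
	in := make(chan Request)
	out := make(chan Purchase)
	acc = maps.Clone(acc)
	go func() {
		defer close(out) // signals receivers that the processor has exited
		for {
			// Receive the purchase request or stop.
			var req Request
			select {
			case <-ctx.Done():
				return
			case req = <-in:
			}
			// Handle the purchase.
			balance := acc[req.buyer]
			pur := Purchase{buyer: req.buyer, set: req.set, balance: balance}
			if balance >= req.set.price {
				pur.balance -= req.set.price
				pur.succeed = true
				acc[req.buyer] = pur.balance
			}
			// Send the result or stop.
			select {
			case <-ctx.Done():
				return
			case out <- pur:
			}
		}
	}()
	return in, out
}

// run sends two requests one after another, then stops the processor.
func run() []Purchase {
	ctx, cancel := context.WithCancel(context.Background())
	reqs, purs := ProcessorCtx(ctx, map[string]int{"Alice": 50})
	wishlist := []LegoSet{
		{name: "Castle", price: 40},
		{name: "Plants", price: 20},
	}
	var results []Purchase
	for _, set := range wishlist {
		reqs <- Request{buyer: "Alice", set: set}
		results = append(results, <-purs)
	}
	cancel()
	<-purs // returns once the processor has closed the output channel
	return results
}

func main() {
	for _, pur := range run() {
		fmt.Println(pur.set.name, pur.succeed, pur.balance)
	}
	// Castle true 10
	// Plants false 10
}
```

Two caveats apply to this sketch. A buyer that sends to the input channel after cancellation would block forever, so real senders should also select on ctx.Done(). And results are matched to requests here only because run sends one request at a time; with several concurrent buyers sharing one output channel, a buyer could receive the result of someone else's request.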
It's not always easy to structure a program in a way that avoids shared state. But if you can, it's a good option.
Summary
Now you know how to protect shared data (from data races) and sequences of operations (from race conditions) in a concurrent environment using mutexes. Be careful with them and always test your code thoroughly with the race detector enabled.
Use code reviews, because the race detector doesn't catch every data race and can't detect race conditions at all. Having someone else look over your code can be really helpful.
Key points to remember:
- Data races occur when multiple goroutines access shared data concurrently, with at least one modifying it.
- Race conditions occur when an unpredictable order of operations leads to incorrect system state.
- Atomic operations ensure that check-and-update sequences happen as a single unit.
- Compare-and-set patterns help make operations atomic.
- A shared-nothing architecture avoids concurrency issues by eliminating shared state.
- TryLock can be useful but is rarely needed and may indicate design problems.
Safe concurrent programming requires careful design and thorough testing.






