Skip to content

Possible concurrency bug in DatePatternConverter.formatWithoutThreadLocals(Instant,StringBuilder) #1490

@smarkwal

Description

@smarkwal

Description

If two threads are calling the method DatePatternConverter.formatWithoutThreadLocals(Instant,StringBuilder) at the same time, one of the threads may incorrectly use the formatted date of the other thread in the output. In other words, the event of one thread could be written to a log file with the date of the other thread's event.

Sequence of actions

The following sequence shows how this could happen. All line numbers are based on this source code:

private void formatWithoutThreadLocals(final Instant instant, final StringBuilder output) {
CachedTime cached = cachedTime.get();
if (instant.getEpochSecond() != cached.epochSecond || instant.getNanoOfSecond() != cached.nanoOfSecond) {
final CachedTime newTime = new CachedTime(instant);
if (cachedTime.compareAndSet(cached, newTime)) {
cached = newTime;
} else {
cached = cachedTime.get();
}
}
output.append(cached.formatted);
}

Initial situation:

  1. cachedTime contains time t0.

Thread 1:

  1. Line 327: Calls the method formatWithoutThreadLocals(instant,output) with instant = t1, where t1 > t0.
  2. Line 328: Reads t0 from cachedTime into local variable cached.
  3. Line 329: if condition evaluates to true (t1 != t0).
  4. Line 330: Creates a new CachedTime(t1) and stores it in local variable newTime.

Thread 2:

  1. Line 327: Calls the method formatWithoutThreadLocals(instant,output) with instant = t2, where t2 > t1.
  2. Line 328: Reads t0 from cachedTime into local variable cached.
  3. Line 329: if condition evaluates to true (t2 != t0).
  4. Line 330: Creates a new CachedTime(t2) and stores it in local variable newTime.

Thread 1:

  1. Line 331: Calls cachedTime.compareAndSet(cached, newTime): This will set the shared cachedTime to thread 1's newTime t1 and return true (t0.compareAndSet(t0,t1) == true).
  2. Line 332: Sets local variable cached = newTime (t1).
  3. Line 337: Appends cached.formatted (t1) to output.

Thread 2:

  1. Line 331: Calls cachedTime.compareAndSet(cached, newTime): This will NOT change the shared cachedTime and return false (t1.compareAndSet(t0,t2) == false).
  2. Line 334: Sets local variable cached = cachedTime.get(). <-- This will read t1 into local variable cached!
  3. Line 337: Appends cached.formatted (t1!) to output.

Result

The event recorded by thread 2 at time t2 will be logged with a formatted date based on t1, the time when thread 1 has recorded its own event. In other words, both events would be written to the log file with the same date and time, and the time of event 2 is completely lost. Also, in the presence of multiple appenders, the same event can be written to different log files with a different date and time.

Bug

I think the problem is found at line 334 where thread 2 is reading the time that was previously written to cachedTime by thread 1. In order to use the correct time, thread 2 should instead set cached = newTime. This means that the if block (line 330) and else block (line 332) would be identical. So the return value of cachedTime.compareAndSet(cached, newTime) would not be needed, and the code could be simplified a lot.

I'm not familiar with the code base of Log4j2 and therefore do not know whether the same pattern has been used in other classes.

Background information

I have found this issue when using Log4j 2 in a web application. This means that the use of thread-locals for "gc-free logging" is disabled by default. The web application currently uses only synchronous loggers and appenders.

As part of a migration project from Log4j 1 to Log4j 2, I have written an integration test which compares the log messages written to two different log files (two appenders attached to the same logger). I noticed that the same event may be logged with different timestamps:

image

Since the threads need to call the method in parallel, the difference in time between the two recorded events is usually very small. In many cases it may not even be noticed, especially if the printed timestamp only has millisecond-accuracy.

Configuration

Version: Tested with 2.20.0, but the code has not been changed since 5+ years.

Operating system: (independent)

JDK: (independent)

Reproduction

I'm currently working on an example project to reproduce the issue. I will link it here in the next 2 - 3 days.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions