Log4j2 在启用 JDBC+Async Appender 情况下死锁
issue url
issue: druid issue #5414
现象
发生系统无法响应情况
分析
观察 javacore
及 log4j2.xml
配置,发现存在通过日志直接插数据库配置。
死锁
Log4j2-TF-3-AsyncLoggerConfig-2
线程(下称线程A
)要消费queue
写数据库,需要数据库连接。数据库连接都被用光或过期了,所以需要调用Druid-ConnectionPool-Create--1146771917
线程(下称线程B
)创建连接。这里存在线程A
等待线程B
。3XMTHREADINFO "Log4j2-TF-3-AsyncLoggerConfig-2" J9VMThread:0x0000000003EB5800, omrthread_t:0x00007F9E58A39D30, java/lang/Thread:0x000000068845D290, state:P, prio=5
3XMJAVALTHREAD (java/lang/Thread getId:0x8C, isDaemon:true)
3XMTHREADINFO1 (native thread ID:0x1B344, native priority:0x5, native policy:UNKNOWN, vmstate:P, vm thread flags:0x00020081)
3XMTHREADINFO2 (native stack address range from:0x00007F9EB8E35000, to:0x00007F9EB8E76000, size:0x41000)
3XMCPUTIME CPU usage total: 4147.834012773 secs, current category="Application"
3XMTHREADBLOCK Parked on: java/util/concurrent/locks/AbstractQueuedSynchronizer$ConditionObject@0x000000069AF44CF0 Owned by: <unknown>
3XMHEAPALLOC Heap bytes allocated since last GC cycle=0 (0x0)
3XMTHREADINFO3 Java callstack:
4XESTACKTRACE at sun/misc/Unsafe.park(Native Method)
4XESTACKTRACE at java/util/concurrent/locks/LockSupport.park(LockSupport.java:186(Compiled Code))
4XESTACKTRACE at java/util/concurrent/locks/AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2050(Compiled Code))
4XESTACKTRACE at com/alibaba/druid/pool/DruidDataSource.takeLast(DruidDataSource.java:2218)
4XESTACKTRACE at com/alibaba/druid/pool/DruidDataSource.getConnectionInternal(DruidDataSource.java:1690)
4XESTACKTRACE at com/alibaba/druid/pool/DruidDataSource.getConnectionDirect(DruidDataSource.java:1427)
4XESTACKTRACE at com/alibaba/druid/pool/DruidDataSource.getConnection(DruidDataSource.java:1407)
4XESTACKTRACE at com/alibaba/druid/pool/DruidDataSource.getConnection(DruidDataSource.java:1397)
4XESTACKTRACE at com/alibaba/druid/pool/DruidDataSource.getConnection(DruidDataSource.java:100)
4XESTACKTRACE at xxx/xxx/xxx/ConnectionFactoryConfig.getDatabaseConnection(ConnectionFactoryConfig.java:52)
4XESTACKTRACE at sun/reflect/NativeMethodAccessorImpl.invoke0(Native Method)线程B
在运行过程中需要输出日志,被NioProcessor-1
(下称线程C
)阻塞3XMTHREADINFO "Druid-ConnectionPool-Create--1146771917" J9VMThread:0x0000000007CBFC00, omrthread_t:0x00007F9E58866C28, java/lang/Thread:0x000000069AF57EB0, state:B, prio=5
3XMJAVALTHREAD (java/lang/Thread getId:0xA94E, isDaemon:true)
3XMTHREADINFO1 (native thread ID:0x11981, native priority:0x5, native policy:UNKNOWN, vmstate:B, vm thread flags:0x00000281)
3XMTHREADINFO2 (native stack address range from:0x00007F9DA8CB4000, to:0x00007F9DA8CF5000, size:0x41000)
3XMCPUTIME CPU usage total: 13.888869677 secs, current category="Application"
3XMTHREADBLOCK Blocked on: java/lang/Object@0x00000006884A0560 Owned by: "NioProcessor-1" (J9VMThread:0x0000000007C39700, java/lang/Thread:0x00000006B5BBAE18)
3XMHEAPALLOC Heap bytes allocated since last GC cycle=0 (0x0)
3XMTHREADINFO3 Java callstack:
4XESTACKTRACE at org/apache/logging/log4j/core/async/AsyncLoggerConfigDisruptor.enqueue(AsyncLoggerConfigDisruptor.java:356)
4XESTACKTRACE at org/apache/logging/log4j/core/async/AsyncLoggerConfigDisruptor.enqueueEvent(AsyncLoggerConfigDisruptor.java:311)
4XESTACKTRACE at org/apache/logging/log4j/core/async/AsyncLoggerConfig.logInBackgroundThread(AsyncLoggerConfig.java:158)
4XESTACKTRACE at org/apache/logging/log4j/core/async/EventRoute$1.logMessage(EventRoute.java:46)
4XESTACKTRACE at org/apache/logging/log4j/core/async/AsyncLoggerConfig.handleQueueFull(AsyncLoggerConfig.java:148)
4XESTACKTRACE at org/apache/logging/log4j/core/async/AsyncLoggerConfig.logToAsyncDelegate(AsyncLoggerConfig.java:135(Compiled Code))
4XESTACKTRACE at org/apache/logging/log4j/core/async/AsyncLoggerConfig.log(AsyncLoggerConfig.java:115(Compiled Code))
4XESTACKTRACE at org/apache/logging/log4j/core/config/LoggerConfig.log(LoggerConfig.java:388(Compiled Code))
4XESTACKTRACE at org/apache/logging/log4j/core/config/AwaitCompletionReliabilityStrategy.log(AwaitCompletionReliabilityStrategy.java:63(Compiled Code))
4XESTACKTRACE at org/apache/logging/log4j/core/Logger.logMessage(Logger.java:153(Compiled Code))
4XESTACKTRACE at org/apache/logging/slf4j/Log4jLogger.log(Log4jLogger.java:376(Compiled Code))
4XESTACKTRACE at com/alibaba/druid/support/logging/SLF4JImpl.error(SLF4JImpl.java:54(Compiled Code))
4XESTACKTRACE at com/alibaba/druid/pool/DruidDataSource$CreateConnectionThread.run(DruidDataSource.java:2840)线程C
持有锁,也要写日志,但queue
满了,必须等待3XMTHREADINFO "NioProcessor-1" J9VMThread:0x0000000007C39700, omrthread_t:0x00007F9E940D1308, java/lang/Thread:0x00000006B5BBAE18, state:R, prio=5
3XMJAVALTHREAD (java/lang/Thread getId:0xAE4A, isDaemon:false)
3XMTHREADINFO1 (native thread ID:0xE08F, native priority:0x5, native policy:UNKNOWN, vmstate:CW, vm thread flags:0x00000081)
3XMTHREADINFO2 (native stack address range from:0x00007F9DA920E000, to:0x00007F9DA924F000, size:0x41000)
3XMCPUTIME CPU usage total: 83.480989527 secs, current category="Application"
3XMHEAPALLOC Heap bytes allocated since last GC cycle=0 (0x0)
3XMTHREADINFO3 Java callstack:
4XESTACKTRACE at sun/misc/Unsafe.park(Native Method)
4XESTACKTRACE at java/util/concurrent/locks/LockSupport.parkNanos(LockSupport.java:349(Compiled Code))
4XESTACKTRACE at com/lmax/disruptor/MultiProducerSequencer.next(MultiProducerSequencer.java:137(Compiled Code))
4XESTACKTRACE at com/lmax/disruptor/MultiProducerSequencer.next(MultiProducerSequencer.java:105)
4XESTACKTRACE at com/lmax/disruptor/RingBuffer.publishEvent(RingBuffer.java:524)
4XESTACKTRACE at org/apache/logging/log4j/core/async/AsyncLoggerConfigDisruptor.enqueue(AsyncLoggerConfigDisruptor.java:357)
5XESTACKTRACE (entered lock: java/lang/Object@0x00000006884A0560, entry count: 1)
4XESTACKTRACE at org/apache/logging/log4j/core/async/AsyncLoggerConfigDisruptor.enqueueEvent(AsyncLoggerConfigDisruptor.java:311)
4XESTACKTRACE at org/apache/logging/log4j/core/async/AsyncLoggerConfig.logInBackgroundThread(AsyncLoggerConfig.java:158)
4XESTACKTRACE at org/apache/logging/log4j/core/async/EventRoute$1.logMessage(EventRoute.java:46)
4XESTACKTRACE at org/apache/logging/log4j/core/async/AsyncLoggerConfig.handleQueueFull(AsyncLoggerConfig.java:148)
4XESTACKTRACE at org/apache/logging/log4j/core/async/AsyncLoggerConfig.logToAsyncDelegate(AsyncLoggerConfig.java:135(Compiled Code))
4XESTACKTRACE at org/apache/logging/log4j/core/async/AsyncLoggerConfig.log(AsyncLoggerConfig.java:115(Compiled Code))
4XESTACKTRACE at org/apache/logging/log4j/core/config/LoggerConfig.log(LoggerConfig.java:388(Compiled Code))
4XESTACKTRACE at org/apache/logging/log4j/core/config/AwaitCompletionReliabilityStrategy.log(AwaitCompletionReliabilityStrategy.java:63(Compiled Code))
4XESTACKTRACE at org/apache/logging/log4j/core/Logger.logMessage(Logger.java:153(Compiled Code))
4XESTACKTRACE at org/apache/log4j/Category.maybeLog(Category.java:533(Compiled Code))
4XESTACKTRACE at org/apache/log4j/Category.debug(Category.java:251(Compiled Code))
由此构成由三个线程死锁
解决
对数据库日志由异步改为同步
备注
需要非常巧合的场合才能出现这种死锁。我尝试构建最小错误集,没有构建出来。
发布在这里,对大家做个提醒。
出现这个死锁的原因是 Druid
创建连接是通过额外线程创建的,而额外线程也需要写日志,这就构成了死锁的潜在可能。最终触发。