- Notifications
You must be signed in to change notification settings - Fork 25.6k
Closed
Labels
:Distributed Coordination/NetworkHttp and internode communication implementationsHttp and internode communication implementationstest"" muted="" aria-describedby="MDU6TGFiZWw2MDQ0NTIyOA==-tooltip :R5b96b:">>testIssues or PRs that are addressing/adding testsIssues or PRs that are addressing/adding testsv6.3.0v6.4.0
Description
A failure in the BWC tests of 6.3 vs 5.0.2:
https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+6.3+bwc-tests/19/console
FAILURE 253s | FullClusterRestartIT.testSecurityNativeRealm <<< FAILURES! > Throwable #1: java.lang.AssertionError: there are still running tasks: > {time_in_queue=28ms, time_in_queue_millis=28, source=_add_listener_, executing=true, priority=HIGH, insert_order=765} > {time_in_queue=18ms, time_in_queue_millis=18, source=cluster_reroute(async_shard_fetch), executing=false, priority=HIGH, insert_order=767} > {time_in_queue=28ms, time_in_queue_millis=28, source=delayed_allocation_reroute, executing=false, priority=NORMAL, insert_order=766} > at __randomizedtesting.SeedInfo.seed([3CA270480BA6CAFB:B75A8C1B31106AE3]:0) > at org.elasticsearch.test.rest.ESRestTestCase.lambda$waitForClusterStateUpdatesToFinish$0(ESRestTestCase.java:347) > at org.elasticsearch.test.ESTestCase.assertBusy(ESTestCase.java:754) > at org.elasticsearch.test.rest.ESRestTestCase.waitForClusterStateUpdatesToFinish(ESRestTestCase.java:338) > at org.elasticsearch.test.rest.ESRestTestCase.cleanUpCluster(ESRestTestCase.java:151) > at java.lang.Thread.run(Thread.java:748) > Suppressed: java.lang.AssertionError: there are still running tasks: > {time_in_queue=40ms, time_in_queue_millis=40, source=cluster_reroute(async_shard_fetch), executing=true, priority=HIGH, insert_order=637} ... Tests with failures: - org.elasticsearch.xpack.restart.FullClusterRestartIT.testSecurityNativeRealm - org.elasticsearch.xpack.restart.FullClusterRestartIT.testWatcher However, other tests in this fixture also seemed very slow:
HEARTBEAT J0 PID(7616@worker-099a6b92441027ab0.build.us-west-2c.elasticnet.co): 2018-05-01T05:33:54, stalled for 252s at: FullClusterRestartIT.testSecurityNativeRealm HEARTBEAT J0 PID(7616@worker-099a6b92441027ab0.build.us-west-2c.elasticnet.co): 2018-05-01T05:37:46, stalled for 231s at: FullClusterRestartIT.testWatcher HEARTBEAT J0 PID(7616@worker-099a6b92441027ab0.build.us-west-2c.elasticnet.co): 2018-05-01T05:41:02, stalled for 190s at: FullClusterRestartIT.testSingleDoc HEARTBEAT J0 PID(7616@worker-099a6b92441027ab0.build.us-west-2c.elasticnet.co): 2018-05-01T05:44:40, stalled for 211s at: FullClusterRestartIT.testSqlFailsOnIndexWithTwoTypes Digging into the logs a bit, I find this which doesn't look right:
[2018-05-01T05:29:46,598][DEBUG][o.e.c.s.ClusterService ] [node-0] processing [zen-disco-receive(from master [master {node-1}{-Wjuwb5sSUiVnV5ldBENqA}{qs8hEUgARRul6e9hCQ6Vog}{127.0.0.1}{127.0.0.1:53625}{testattr=test} committed version [119]])]: [2018-05-01T05:29:46,627][WARN ][o.e.x.s.t.n.SecurityNetty4Transport] [node-0] exception caught on transport layer [[id: 0xd86dc7ea, L:/127.0.0.1:39843 - R:/127.0.0.1:60615]], closing connection io.netty.handler.codec.DecoderException: io.netty.util.IllegalReferenceCountException: refCnt: 0, decrement: 1 at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:442) ~[netty-codec-4.1.5.Final.jar:4.1.5.Final] at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:248) ~[netty-codec-4.1.5.Final.jar:4.1.5.Final] at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:372) [netty-transport-4.1.5.Final.jar:4.1.5.Final] at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:358) [netty-transport-4.1.5.Final.jar:4.1.5.Final] at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:350) [netty-transport-4.1.5.Final.jar:4.1.5.Final] at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1334) [netty-transport-4.1.5.Final.jar:4.1.5.Final] at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:372) [netty-transport-4.1.5.Final.jar:4.1.5.Final] at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:358) [netty-transport-4.1.5.Final.jar:4.1.5.Final] at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:926) [netty-transport-4.1.5.Final.jar:4.1.5.Final] at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:129) [netty-transport-4.1.5.Final.jar:4.1.5.Final] at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:610) [netty-transport-4.1.5.Final.jar:4.1.5.Final] at io.netty.channel.nio.NioEventLoop.processSelectedKeysPlain(NioEventLoop.java:513) [netty-transport-4.1.5.Final.jar:4.1.5.Final] at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:467) [netty-transport-4.1.5.Final.jar:4.1.5.Final] at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:437) [netty-transport-4.1.5.Final.jar:4.1.5.Final] at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:873) [netty-common-4.1.5.Final.jar:4.1.5.Final] at java.base/java.lang.Thread.run(Thread.java:844) [?:?] Caused by: io.netty.util.IllegalReferenceCountException: refCnt: 0, decrement: 1 at io.netty.buffer.AbstractReferenceCountedByteBuf.release(AbstractReferenceCountedByteBuf.java:111) ~[?:?] at io.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1068) ~[?:?] at io.netty.handler.ssl.SslHandler.decode(SslHandler.java:900) ~[?:?] at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:411) ~[?:?] ... 15 more [2018-05-01T05:29:46,638][DEBUG][i.n.h.s.SslHandler ] [id: 0x2280b8e2, L:/127.0.0.1:39843 - R:/127.0.0.1:60618] Swallowing a harmless 'connection reset by peer / broken pipe' error that occurred while writing close_notify in response to the java.io.IOException: Connection reset by peer at java.base/sun.nio.ch.FileDispatcherImpl.read0(Native Method) ~[?:?] at java.base/sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39) ~[?:?] at java.base/sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:283) ~[?:?] at java.base/sun.nio.ch.IOUtil.read(IOUtil.java:250) ~[?:?] at java.base/sun.nio.ch.IOUtil.read(IOUtil.java:226) ~[?:?] at java.base/sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:382) ~[?:?] at io.netty.buffer.PooledHeapByteBuf.setBytes(PooledHeapByteBuf.java:261) ~[netty-buffer-4.1.5.Final.jar:4.1.5.Final] at io.netty.buffer.AbstractByteBuf.writeBytes(AbstractByteBuf.java:1100) ~[netty-buffer-4.1.5.Final.jar:4.1.5.Final] at io.netty.channel.socket.nio.NioSocketChannel.doReadBytes(NioSocketChannel.java:367) ~[netty-transport-4.1.5.Final.jar:4.1.5.Final] at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:118) [netty-transport-4.1.5.Final.jar:4.1.5.Final] at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:610) [netty-transport-4.1.5.Final.jar:4.1.5.Final] at io.netty.channel.nio.NioEventLoop.processSelectedKeysPlain(NioEventLoop.java:513) [netty-transport-4.1.5.Final.jar:4.1.5.Final] at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:467) [netty-transport-4.1.5.Final.jar:4.1.5.Final] at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:437) [netty-transport-4.1.5.Final.jar:4.1.5.Final] at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:873) [netty-common-4.1.5.Final.jar:4.1.5.Final] at java.base/java.lang.Thread.run(Thread.java:844) [?:?] This is on the 5.0.2 node (running on JVM 10):
[2018-05-01T05:28:21,514][INFO ][o.e.n.Node ] [node-0] version[5.0.2], pid[7422], build[f6b4951/2016-11-24T10:07:18.101Z], OS[Linux/3.13.0-137-generic/amd64], JVM["Oracle Corporation"/Java HotSpot(TM) 64-Bit Server VM/10/10+46] Metadata
Metadata
Assignees
Labels
:Distributed Coordination/NetworkHttp and internode communication implementationsHttp and internode communication implementationstest"" muted="" aria-describedby="MDU6TGFiZWw2MDQ0NTIyOA==-tooltip :Ra5pmb:">>testIssues or PRs that are addressing/adding testsIssues or PRs that are addressing/adding testsv6.3.0v6.4.0