SM session ends up in a limbo state after network reconnect
Description
Environment
is related to
Activity
Guus der Kinderen May 18, 2020 at 12:37 PM
I think this might be resolved by OF-2013, https://igniterealtime.atlassian.net/browse/OF-2014#icft=OF-2014 and OF-2015, yes. I've not verified this though.

Daryl Herzmann May 5, 2020 at 1:27 PM
Guus, you think your SM work this week will help / fix this issue too?
Guus der Kinderen October 1, 2019 at 3:19 PM
I'm not saying that there's no issue in Openfire, but I wonder if the issue is in the client (that fails to properly resume a stream, and starts another stream instead).
With fixes for https://igniterealtime.atlassian.net/browse/OF-1883#icft=OF-1883 and https://igniterealtime.atlassian.net/browse/OF-1884#icft=OF-1884 I have successfully resumed a stream with Conversations, which runs on a mobile phone on my LAN. To trigger the session detachment, I disabled all data on my phone.
I'll leave the logs here, so that we can use them as a reference when testing with other clients.
To clean up the log files, I logged everything but MINA to 'trace':
<!-- OF-506: Jetty INFO messages are generally not useful. Ignore them by default. -->
<Logger name="org.eclipse.jetty" level="warn"/>
<Logger name="org.apache.mina" level="info"/>
<Root level="trace">
<AppenderRef ref="all-out"/>
<AppenderRef ref="debug-out"/>
<AppenderRef ref="info-out"/>
<AppenderRef ref="warn-out"/>
<AppenderRef ref="error-out"/>
</Root>
To speed up the tests, I've set Server > Server Settings > Client Connections > Idle Connections Policy to 10 seconds.
This is being logged by Openfire, when I disable the network on my phone that has Conversations running. As you can see, the stream is getting detached:
2019.10.01 17:02:43 TRACE [TaskEngine-pool-7]: class org.jivesoftware.openfire.streammanagement.StreamManager[192.168.178.20] - Added stanza of type 'iq' to collection of unacknowledged stanzas (x=17). Collection size is now 2.
2019.10.01 17:02:48 DEBUG [socket_c2s-thread-3]: org.jivesoftware.openfire.nio.ClientConnectionHandler - ConnectionHandler: Pinging connection that has been idle: org.jivesoftware.openfire.nio.NIOConnection@a993ae MINA Session: (0x00000006: nio socket, server, /192.168.178.20:43086 => /192.168.178.21:5222)
2019.10.01 17:02:48 TRACE [socket_c2s-thread-3]: class org.jivesoftware.openfire.streammanagement.StreamManager[192.168.178.20] - Added stanza of type 'iq' to collection of unacknowledged stanzas (x=18). Collection size is now 3.
2019.10.01 17:02:53 DEBUG [socket_c2s-thread-3]: org.jivesoftware.openfire.nio.ConnectionHandler - ConnectionHandler: Closing connection that has been idle: org.jivesoftware.openfire.nio.NIOConnection@a993ae MINA Session: (0x00000006: nio socket, server, /192.168.178.20:43086 => /192.168.178.21:5222)
2019.10.01 17:02:53 DEBUG [socket_c2s-thread-3]: org.jivesoftware.openfire.session.LocalSession - Suppressing close for session with address guus@laptop-guus/Conversations.qpvf and streamID akgzdp7d2b.
2019.10.01 17:02:53 DEBUG [socket_c2s-thread-3]: org.jivesoftware.openfire.SessionManager - Closing session has SM enabled; detaching.
2019.10.01 17:02:53 DEBUG [socket_c2s-thread-3]: org.jivesoftware.openfire.session.LocalSession - Setting session with address guus@laptop-guus/Conversations.qpvf and streamID akgzdp7d2b in detached mode.
After approximately half a minute, I re-enabled the network on the mobile phone. The following was logged, which to me looks like a successful stream resumption to me.
2019.10.01 17:03:24 TRACE [socket_c2s-thread-3]: org.jivesoftware.openfire.net.SASLAuthentication - Cannot support 'ANONYMOUS' as it has been disabled by configuration.
2019.10.01 17:03:24 TRACE [socket_c2s-thread-3]: org.jivesoftware.openfire.net.SASLAuthentication - Cannot support 'GSSAPI' as the 'sasl.gssapi.config' property has not been defined.
2019.10.01 17:03:24 TRACE [socket_c2s-thread-3]: org.jivesoftware.openfire.net.SASLAuthentication - Cannot support 'JIVE-SHAREDSECRET' as it has been disabled by configuration.
2019.10.01 17:03:24 DEBUG [socket_c2s-thread-2]: org.jivesoftware.openfire.spi.EncryptionArtifactFactory - Configured TrustManager class: org.jivesoftware.openfire.keystore.OpenfireX509TrustManager
2019.10.01 17:03:24 DEBUG [socket_c2s-thread-2]: org.jivesoftware.openfire.spi.EncryptionArtifactFactory - Attempting to instantiate 'class org.jivesoftware.openfire.keystore.OpenfireX509TrustManager' using the three-argument constructor that is properietary to Openfire.
2019.10.01 17:03:24 DEBUG [socket_c2s-thread-2]: org.jivesoftware.openfire.keystore.OpenfireX509TrustManager - Constructed trust manager. Number of trusted issuers: 0, accepts self-signed: false, checks validity: true
2019.10.01 17:03:24 DEBUG [socket_c2s-thread-2]: org.jivesoftware.openfire.spi.EncryptionArtifactFactory - Successfully instantiated 'class org.jivesoftware.openfire.keystore.OpenfireX509TrustManager'.
2019.10.01 17:03:24 DEBUG [socket_c2s-thread-2]: org.jivesoftware.openfire.spi.EncryptionArtifactFactory - Configured TrustManager class: org.jivesoftware.openfire.keystore.OpenfireX509TrustManager
2019.10.01 17:03:24 DEBUG [socket_c2s-thread-2]: org.jivesoftware.openfire.spi.EncryptionArtifactFactory - Attempting to instantiate 'class org.jivesoftware.openfire.keystore.OpenfireX509TrustManager' using the three-argument constructor that is properietary to Openfire.
2019.10.01 17:03:24 DEBUG [socket_c2s-thread-2]: org.jivesoftware.openfire.keystore.OpenfireX509TrustManager - Constructed trust manager. Number of trusted issuers: 0, accepts self-signed: false, checks validity: true
2019.10.01 17:03:24 DEBUG [socket_c2s-thread-2]: org.jivesoftware.openfire.spi.EncryptionArtifactFactory - Successfully instantiated 'class org.jivesoftware.openfire.keystore.OpenfireX509TrustManager'.
2019.10.01 17:03:24 TRACE [socket_c2s-thread-3]: org.jivesoftware.openfire.net.SASLAuthentication - Cannot support 'ANONYMOUS' as it has been disabled by configuration.
2019.10.01 17:03:24 TRACE [socket_c2s-thread-3]: org.jivesoftware.openfire.net.SASLAuthentication - Cannot support 'GSSAPI' as the 'sasl.gssapi.config' property has not been defined.
2019.10.01 17:03:24 TRACE [socket_c2s-thread-3]: org.jivesoftware.openfire.net.SASLAuthentication - Cannot support 'JIVE-SHAREDSECRET' as it has been disabled by configuration.
2019.10.01 17:03:24 TRACE [socket_c2s-thread-3]: org.jivesoftware.openfire.nio.NIOConnection - Peer does not offer certificates in session: LocalClientSession{address=laptop-guus/5bobqusy4r, streamID=5bobqusy4r, status=1 (connected), isSecure=true, isDetached=false, serverName='laptop-guus', isInitialized=false, hasAuthToken=false, peer address='192.168.178.20', presence
<presence type="unavailable"/>'}
javax.net.ssl.SSLPeerUnverifiedException: peer not authenticated
at sun.security.ssl.SSLSessionImpl.getPeerCertificates(SSLSessionImpl.java:450) ~[?:1.8.0_201]
at org.jivesoftware.openfire.nio.NIOConnection.getPeerCertificates(NIOConnection.java:191) [xmppserver-4.5.0-SNAPSHOT.jar:4.5.0-SNAPSHOT]
at org.jivesoftware.openfire.net.SASLAuthentication.getSASLMechanismsElement(SASLAuthentication.java:226) [xmppserver-4.5.0-SNAPSHOT.jar:4.5.0-SNAPSHOT]
at org.jivesoftware.openfire.net.SASLAuthentication.getSASLMechanisms(SASLAuthentication.java:199) [xmppserver-4.5.0-SNAPSHOT.jar:4.5.0-SNAPSHOT]
at org.jivesoftware.openfire.net.StanzaHandler.tlsNegotiated(StanzaHandler.java:458) [xmppserver-4.5.0-SNAPSHOT.jar:4.5.0-SNAPSHOT]
at org.jivesoftware.openfire.net.StanzaHandler.process(StanzaHandler.java:138) [xmppserver-4.5.0-SNAPSHOT.jar:4.5.0-SNAPSHOT]
at org.jivesoftware.openfire.nio.ConnectionHandler.messageReceived(ConnectionHandler.java:183) [xmppserver-4.5.0-SNAPSHOT.jar:4.5.0-SNAPSHOT]
at org.apache.mina.core.filterchain.DefaultIoFilterChain$TailFilter.messageReceived(DefaultIoFilterChain.java:1015) [mina-core-2.1.3.jar:?]
at org.apache.mina.core.filterchain.DefaultIoFilterChain.callNextMessageReceived(DefaultIoFilterChain.java:650) [mina-core-2.1.3.jar:?]
at org.apache.mina.core.filterchain.DefaultIoFilterChain.access$1300(DefaultIoFilterChain.java:49) [mina-core-2.1.3.jar:?]
at org.apache.mina.core.filterchain.DefaultIoFilterChain$EntryImpl$1.messageReceived(DefaultIoFilterChain.java:1128) [mina-core-2.1.3.jar:?]
at org.apache.mina.core.filterchain.IoFilterAdapter.messageReceived(IoFilterAdapter.java:122) [mina-core-2.1.3.jar:?]
at org.apache.mina.core.filterchain.DefaultIoFilterChain.callNextMessageReceived(DefaultIoFilterChain.java:650) [mina-core-2.1.3.jar:?]
at org.apache.mina.core.filterchain.DefaultIoFilterChain.access$1300(DefaultIoFilterChain.java:49) [mina-core-2.1.3.jar:?]
at org.apache.mina.core.filterchain.DefaultIoFilterChain$EntryImpl$1.messageReceived(DefaultIoFilterChain.java:1128) [mina-core-2.1.3.jar:?]
at org.apache.mina.filter.codec.ProtocolCodecFilter$ProtocolDecoderOutputImpl.flush(ProtocolCodecFilter.java:413) [mina-core-2.1.3.jar:?]
at org.apache.mina.filter.codec.ProtocolCodecFilter.messageReceived(ProtocolCodecFilter.java:257) [mina-core-2.1.3.jar:?]
at org.apache.mina.core.filterchain.DefaultIoFilterChain.callNextMessageReceived(DefaultIoFilterChain.java:650) [mina-core-2.1.3.jar:?]
at org.apache.mina.core.filterchain.DefaultIoFilterChain.access$1300(DefaultIoFilterChain.java:49) [mina-core-2.1.3.jar:?]
at org.apache.mina.core.filterchain.DefaultIoFilterChain$EntryImpl$1.messageReceived(DefaultIoFilterChain.java:1128) [mina-core-2.1.3.jar:?]
at org.apache.mina.core.session.IoEvent.run(IoEvent.java:89) [mina-core-2.1.3.jar:?]
at org.apache.mina.filter.executor.OrderedThreadPoolExecutor$Worker.runTask(OrderedThreadPoolExecutor.java:766) [mina-core-2.1.3.jar:?]
at org.apache.mina.filter.executor.OrderedThreadPoolExecutor$Worker.runTasks(OrderedThreadPoolExecutor.java:758) [mina-core-2.1.3.jar:?]
at org.apache.mina.filter.executor.OrderedThreadPoolExecutor$Worker.run(OrderedThreadPoolExecutor.java:697) [mina-core-2.1.3.jar:?]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_201]
2019.10.01 17:03:24 DEBUG [socket_c2s-thread-3]: class org.jivesoftware.openfire.streammanagement.StreamManager[192.168.178.20] - Attempting resumption for Q29udmVyc2F0aW9ucy5xcHZmAGFrZ3pkcDdkMmI=, h=17
2019.10.01 17:03:24 DEBUG [socket_c2s-thread-3]: class org.jivesoftware.openfire.streammanagement.StreamManager[192.168.178.20] - Resuming session for 'guus@laptop-guus/Conversations.qpvf'. Current session: 5bobqusy4r
2019.10.01 17:03:24 DEBUG [socket_c2s-thread-3]: class org.jivesoftware.openfire.streammanagement.StreamManager[192.168.178.20] - Found existing session for 'guus@laptop-guus/Conversations.qpvf', checking status
2019.10.01 17:03:24 DEBUG [socket_c2s-thread-3]: class org.jivesoftware.openfire.streammanagement.StreamManager[192.168.178.20] - Attaching to other session 'akgzdp7d2b' of 'guus@laptop-guus/Conversations.qpvf'.
2019.10.01 17:03:24 DEBUG [socket_c2s-thread-3]: org.jivesoftware.openfire.session.LocalSession - Setting session with address laptop-guus/5bobqusy4r and streamID 5bobqusy4r in detached mode.
2019.10.01 17:03:24 DEBUG [socket_c2s-thread-3]: org.jivesoftware.openfire.session.LocalSession - Reattaching session with address guus@laptop-guus/Conversations.qpvf and streamID akgzdp7d2b.
2019.10.01 17:03:24 DEBUG [socket_c2s-thread-3]: class org.jivesoftware.openfire.streammanagement.StreamManager[192.168.178.20] - Agreeing to resume
2019.10.01 17:03:24 DEBUG [socket_c2s-thread-3]: class org.jivesoftware.openfire.streammanagement.StreamManager[192.168.178.20] - Resuming session: Ack for 17
2019.10.01 17:03:24 TRACE [socket_c2s-thread-3]: class org.jivesoftware.openfire.streammanagement.StreamManager[192.168.178.20] - Before processing client Ack (h=17): 3 unacknowledged stanzas.
2019.10.01 17:03:24 TRACE [socket_c2s-thread-3]: class org.jivesoftware.openfire.streammanagement.StreamManager[192.168.178.20] - After processing client Ack (h=17): 1 unacknowledged stanzas.
2019.10.01 17:03:24 DEBUG [socket_c2s-thread-3]: class org.jivesoftware.openfire.streammanagement.StreamManager[192.168.178.20] - Processing remaining unacked stanzas
2019.10.01 17:03:24 DEBUG [socket_c2s-thread-3]: class org.jivesoftware.openfire.streammanagement.StreamManager[192.168.178.20] - Perform resumption on session akgzdp7d2b for 'guus@laptop-guus/Conversations.qpvf'. Closing session 5bobqusy4r

wroot September 26, 2019 at 7:54 PM
This might even be kind of an intended behavior. You still are online and receive messages anyway. But it seems it should be handled more gracefully and without a reconnect.
Details
Assignee
Guus der KinderenGuus der KinderenReporter
wrootwrootLabels
Affects versions
Priority
Major
Details
Details
Assignee
Reporter

Session Management (SM) is enabled on the server. Using SM supporting client (Spark 2.9.0, Gajim). ClientA is connected and is shown on the Sessions page, on the Users page and in a roster of ClientB. Then network is disconnected on ClientA machine. Sessions page still shows him online, same for the Users page and roster of ClientB. Then network connection is restored. ClientA vanishes from Sessions page, but is still shown online on Users page and in ClientB. If ClientB tries to send messages to ClientA, they don't go through (they are stored as offline messages). ClientA can't send messages either. After a minute ClientA errors out and starts reconnecting. After that it is again shown on the Sessions page and receives all offline messages that were sent during his limbo time.