Menu

#146 Lost Connection Gives No Error Message

closed-fixed
nobody
None
5
2014-04-30
2011-01-18
Rob Winch
No

Environment

* Pidgin 2.7.9 with Sipe from master today
* Ubuntu 10.10 32bit
* Friend is using Communicator 2007 client

Steps

1) I start pidgin and login to SIPE
2) Dustin Singleton starts communicator and logs in
3) Dustin IMs me - this succeeds
4) I message Dustin - this succeeds
5) We both leave the conversation open
6) Dustin unplugs his network cable (he has wireless disabled) and then plugs it back in.
7) I IM Dustin \"this message should not reach him but i should get an error\" while he is offline (no error message)
8) He logs back in and IMs me
9) I respond

After 5 minutes there is still no message indicating the message did not get sent. With communicator it takes about a minute, but an error is displayed. I have attached logs from a communicator to communicator conversation with this working and logs with sipe to communicator with it not.

PS Sorry for logging these weird lost connection issues, but I see it occur fairly often when people travel to meetings.

Discussion

1 2 > >> (Page 1 of 2)
  • Rob Winch

    Rob Winch - 2011-01-18

    Logs of sipe to communicator illustrating the bug

     
  • Rob Winch

    Rob Winch - 2011-01-18

    and communicator to communicator demonstrating it working

     
  • Stefan Becker

    Stefan Becker - 2011-01-19

    The situations shown in the logs are different:

    - SIPE only gets an indication that something is wrong after we receive the new INVITE after reconnect
    - In the communicator-communicator log Micah doesn't receive a new INVITE, but at some point an error message when he is typing

    SIPE -> communicator:
    ------------- NEXT MESSAGE: outgoing SIP at 2011-01-18T18:35:22.169813Z
    MESSAGE sip:Dustin.Singleton@abcdef.com;opaque=user:epid:W20KpJcExlyuFPb9FjNwvgAA;gruu SIP/2.0
    ok go ahead and disconnect for a bit
    ------------- NEXT MESSAGE: incoming SIP at 2011-01-18T18:35:22.301517Z
    SIP/2.0 200 OK
    ------------- NEXT MESSAGE: outgoing SIP at 2011-01-18T18:35:37.609789Z
    MESSAGE sip:Dustin.Singleton@abcdef.com;opaque=user:epid:W20KpJcExlyuFPb9FjNwvgAA;gruu SIP/2.0
    this message should not reach him but i should get an error
    <<< Dustin reconnects and IMs us. We receive new INVITE and send BYE to old dialog >>>
    ------------- NEXT MESSAGE: outgoing SIP at 2011-01-18T18:36:02.678114Z
    BYE sip:Dustin.Singleton@abcdef.com;opaque=user:epid:W20KpJcExlyuFPb9FjNwvgAA;gruu SIP/2.0
    ------------- NEXT MESSAGE: incoming SIP at 2011-01-18T18:36:02.785567Z
    SIP/2.0 481 Call Leg/Transaction Does Not Exist

    Communicator -> Communicator: (Dustin is unplugging, Micah is trying to send)

    01/18/2011|11:01:43.937 D64:156C INFO :: Sending Packet - 10.160.5.37:5061 (From Local Address: 10.184.224.93:49342) 1283 bytes:
    01/18/2011|11:01:43.937 D64:156C INFO :: MESSAGE sip:Dustin.Singleton@abcdef.com;opaque=user:epid:W20KpJcExlyuFPb9FjNwvgAA;gruu SIP/2.0

    01/18/2011|11:01:43.953 D64:156C INFO :: Data Received - 10.160.5.37:5061 (To Local Address: 10.184.224.93:49342) 803 bytes:
    01/18/2011|11:01:43.953 D64:156C INFO :: SIP/2.0 200 OK

    <<< Dustin disconnects >>>

    <<< Micah types on keyboard -> XML KeyboardActivity >>>
    01/18/2011|11:02:27.493 D64:156C INFO :: Sending Packet - 10.160.5.37:5061 (From Local Address: 10.184.224.93:49342) 1273 bytes:
    01/18/2011|11:02:27.493 D64:156C INFO :: INFO sip:Dustin.Singleton@abcdef.com;opaque=user:epid:W20KpJcExlyuFPb9FjNwvgAA;gruu SIP/2.0

    01/18/2011|11:02:59.504 D64:156C INFO :: Sending Packet - 10.160.5.37:5061 (From Local Address: 10.184.224.93:49342) 1273 bytes:
    01/18/2011|11:02:59.504 D64:156C INFO :: INFO sip:Dustin.Singleton@abcdef.com;opaque=user:epid:W20KpJcExlyuFPb9FjNwvgAA;gruu SIP/2.0

    01/18/2011|11:02:59.504 D64:156C INFO :: Data Received - 10.160.5.37:5061 (To Local Address: 10.184.224.93:49342) 889 bytes:
    01/18/2011|11:02:59.504 D64:156C INFO :: SIP/2.0 481 Call Leg/Transaction Does Not Exist

    Looking at Dustin*.log shows that after disconnect, his communicator tries to send BYE twice, which of course never reach OCS. After reconnect he gets the second INFO from Micah and his communicator reacts with the 481 error. The log doesn't show that Dustin IM'd Micah after reconnect -> test execution is a little different.

    I see two possibilities:
    - add timeout to MESSAGE requests. I have no idea what/if the RFCs say something about this. Should it expire after 10, 30, 60, ... seconds?
    - create error messages for the outstanding messages on the dialog when we receive a new INVITE on an existing dialog. Not sure how to this yet.

     
  • Stefan Becker

    Stefan Becker - 2011-01-19

    Just to be clear on the differences in the Communicator-Communicator test:

    - Micah was still typing his message while Dustin reconnected, i.e. no MESSAGE from Micah that would have a delivery failure.
    - Dustin didn't IM Micah after reconnecting, i.e. no new INVITE from Dustin for an existing IM.

    So the logs aren't really comparable.

     
  • Rob Winch

    Rob Winch - 2011-01-19

    Sorry...I will get some logs that ensure that we are doing as close to the same thing as possible. Just as an FYI...Micah did IM dustin while Dustin was offline. The result of seeing the message delivery failure just did not appear until shortly after Dustin logged back on.

     
  • Stefan Becker

    Stefan Becker - 2011-01-19

    Did Micah press return *before* Dustin came back online? If yes then that would mean that M$ Communicator is queuing messages, i.e. there can only be one SIP message in-flight for a session. It would therefore not send the MESSAGE until it got a 200 response to the previous message, which was the INFO/KeyboardActivity. And the 481 response cancelled all activities in the queue, i.e. the entered text was send back to the UI with an error message.

    Implementing something like that would require some major rewrite for SIPE. But I don't think such a functionality is necessary if we not only send the BYE but cancel all unconfirmed messages when we receive a INVITE for an existing IM.

     
  • Rob Winch

    Rob Winch - 2011-01-19

    Yes. Micah hit return before Dustin got back online. Thank you for explaining a bit of the internals so I can understand more clearly.

    What I intend to try is a simpler and more controlled use case as described below. I suspect that Micah will still get an error message indicating the message could not be sent. Please let me know if there is the possibility that logs from this scenario might provide information that would make the SIPE plugin better.

    Micah and Dustin start as being logged out and are both using Communicator.

    1) Dustin and Micah login
    2) Dustin IMs Micah "Hi"
    3) Micah responds "Hi"
    4) Both Windows are left open
    5) Dustin uplugs and waits ~10seconds
    6) Micah IMs Dustin "This should fail"
    7) Wait till an error message is received or 5 minutes

    The scenario will be repeated with one communicator and one pidgin (replacing Micah). I suspect that in this case no error message will be delivered. This was at least true of the similar (but less controlled) scenario I have already posted.

     
  • Stefan Becker

    Stefan Becker - 2011-01-19

    Please try commit 370146b.

    IM MESSAGE SIP messages do now have a 60 second timeout. So latest 60 seconds after pressing Return you should get an error feedback from SIPE.

     
  • Rob Winch

    Rob Winch - 2011-01-19

    Thanks for the quick turnaround. This worked, but there was still some unexpected behaviour. Below is a description of what happened. I have also attached the logs.

    1) I IMd Dustin
    2) Dustin IMd me
    3) Dustin unplugged
    4) I IMd Dustin "this should not work"
    5) The message successfully timed out and I got an error
    6) I IMd Dustin again "testing2"
    7) Dustin plugged in
    8) As expected the message timed out and I got an error indicating the message could not be sent
    9) I sent him another message "testing again"
    10) I immediately got an error stating that he was offline (this is not expected as he was online and had been for about a minute)
    11) Immediately after getting the error in #10 I IMd him again "testing" and this message was sent successfully

     
  • Rob Winch

    Rob Winch - 2011-01-19

    Logs of it failing after a timeout

     
  • Stefan Becker

    Stefan Becker - 2011-01-20

    10) is unfortunately as expected, as we ignore responses to INFO/KeyboardActivity and therefore only process the error response to the MESSAGE:

    1) new dialog with Call-ID: 9826gF660aE774iB5F4m4DD3tC057bC674x0E3Ex
    ...
    4)
    ------------- NEXT MESSAGE: outgoing SIP at 2011-01-19T21:57:44.070235Z
    INFO sip:Dustin.Singleton@abcdef.com;opaque=user:epid:W20KpJcExlyuFPb9FjNwvgAA;gruu SIP/2.0
    ------------- NEXT MESSAGE: outgoing SIP at 2011-01-19T21:57:44.070235Z
    INFO sip:Dustin.Singleton@abcdef.com;opaque=user:epid:W20KpJcExlyuFPb9FjNwvgAA;gruu SIP/2.0
    ------------- NEXT MESSAGE: outgoing SIP at 2011-01-19T21:57:48.726766Z
    MESSAGE sip:Dustin.Singleton@abcdef.com;opaque=user:epid:W20KpJcExlyuFPb9FjNwvgAA;gruu SIP/2.0
    this should not work

    5)
    (15:58:48) sipe: process_message_timeout: removed message <9826gF660aE774iB5F4m4DD3tC057bC674x0E3Ex><5><MESSAGE><sip:dustin.singleton@abcdef.com> from unconfirmed_messages(count=0)
    ---> error message

    6)
    ------------- NEXT MESSAGE: outgoing SIP at 2011-01-19T21:58:55.462009Z
    INFO sip:Dustin.Singleton@abcdef.com;opaque=user:epid:W20KpJcExlyuFPb9FjNwvgAA;gruu SIP/2.0
    ------------- NEXT MESSAGE: outgoing SIP at 2011-01-19T21:58:57.045924Z
    INFO sip:Dustin.Singleton@abcdef.com;opaque=user:epid:W20KpJcExlyuFPb9FjNwvgAA;gruu SIP/2.0
    ------------- NEXT MESSAGE: outgoing SIP at 2011-01-19T21:58:59.493083Z
    MESSAGE sip:Dustin.Singleton@abcdef.com;opaque=user:epid:W20KpJcExlyuFPb9FjNwvgAA;gruu SIP/2.0
    testing2

    7) we don't get any indication from OCS, as Dustin doesn't IM us.

    8)
    (15:59:59) sipe: process_message_timeout: removed message <9826gF660aE774iB5F4m4DD3tC057bC674x0E3Ex><8><MESSAGE><sip:dustin.singleton@abcdef.com> from unconfirmed_messages(count=0)
    ---> error message

    9)
    ------------- NEXT MESSAGE: outgoing SIP at 2011-01-19T22:00:03.845982Z
    INFO sip:Dustin.Singleton@abcdef.com;opaque=user:epid:W20KpJcExlyuFPb9FjNwvgAA;gruu SIP/2.0
    ------------- NEXT MESSAGE: incoming SIP at 2011-01-19T22:00:03.855064Z
    SIP/2.0 481 Call Leg/Transaction Does Not Exist
    ---> this is the first error indication we get from OCS. But we ignore those.

    10)
    ------------- NEXT MESSAGE: outgoing SIP at 2011-01-19T22:00:05.208266Z
    MESSAGE sip:Dustin.Singleton@abcdef.com;opaque=user:epid:W20KpJcExlyuFPb9FjNwvgAA;gruu SIP/2.0
    testing again
    ------------- NEXT MESSAGE: incoming SIP at 2011-01-19T22:00:05.236510Z
    SIP/2.0 481 Call Leg/Transaction Does Not Exist

    (16:00:05) sipe: process_message_response: MESSAGE response >= 400
    (16:00:05) sipe: process_message_response: assuming dangling IM session, dropping it.

    ------------- NEXT MESSAGE: outgoing SIP at 2011-01-19T22:00:05.250486Z
    BYE sip:Dustin.Singleton@abcdef.com;opaque=user:epid:W20KpJcExlyuFPb9FjNwvgAA;gruu SIP/2.0
    ------------- NEXT MESSAGE: incoming SIP at 2011-01-19T22:00:05.258425Z
    SIP/2.0 481 Call Leg/Transaction Does Not Exist
    ---> dialog is cleaned up and user got error message

    11) new dialog with Call-ID: 08F1g4FD2a94E4iC416m9819t15FEb2D76x44DBx
    ------------- NEXT MESSAGE: outgoing SIP at 2011-01-19T22:00:11.286628Z
    INVITE sip:dustin.singleton@abcdef.com SIP/2.0
    ...
    ------------- NEXT MESSAGE: outgoing SIP at 2011-01-19T22:00:28.675042Z
    ACK sip:Dustin.Singleton@abcdef.com;opaque=user:epid:W20KpJcExlyuFPb9FjNwvgAA;gruu SIP/2.0
    ------------- NEXT MESSAGE: outgoing SIP at 2011-01-19T22:00:28.676069Z
    MESSAGE sip:Dustin.Singleton@abcdef.com;opaque=user:epid:W20KpJcExlyuFPb9FjNwvgAA;gruu SIP/2.0
    testing
    ------------- NEXT MESSAGE: incoming SIP at 2011-01-19T22:00:28.785297Z
    SIP/2.0 200 OK

    It might be possible to add response handling to sipe_core_user_feedback_typing() to improve the behaviour.

     
  • Stefan Becker

    Stefan Becker - 2011-01-20

    Please try commit fdae6bb.

    Added response handling for INFO/KeyboardActivity messages. In step 9) we should now detect the error and therefore step 10) should initiate a new IM session i.e. you should no longer get an error.

     
  • Stefan Becker

    Stefan Becker - 2011-01-20

    Please also make sure to test this scenario:

    ...
    4) I IMd Dustin "this should not work"
    5) Dustin plugged in (in less than 60 seconds from 4)!)
    6) Dustin IMd "back" to you

    You should probably see in your IM window:

    you: "this should not work"
    Dustin: "back"
    message "this should not work" could not be delivered...

    i.e. we don't yet cancel outstanding messages when we receive the new INVITE from Dustin.

     
  • Rob Winch

    Rob Winch - 2011-01-20

    Progress has been made in that the last scenario I mentioned (where I IM him and it fails when it shouldn't) is fixed. However, the scenario you requested we test still fails. Here is the script we used. I have attached the logs as pidgin-2011-01-19-224859.zip

    (09:27:28 AM) me, myself, & I: go ahead and IM me back and then unplug be ready to plug back in right after i send you the message (I will give you a thumbs up)
    (09:27:43 AM) Singleton,Dustin: ok, unplugging now
    (09:27:49 AM) me, myself, & I: testing
    (09:28:08 AM) Singleton,Dustin: back
    (09:28:12 AM) me, myself, & I: nice
    (09:28:17 AM) Singleton,Dustin: got it
    (09:28:24 AM) me, myself, & I: setting if testing eventually fails still
    (09:28:31 AM) Singleton,Dustin: ok
    (09:28:34 AM) me, myself, & I: (that was the one i sent in step 4)

     
  • Rob Winch

    Rob Winch - 2011-01-20

    Logs for the scenario proposed in comment 2011-01-19 22:48:59 CST

     
  • Stefan Becker

    Stefan Becker - 2011-01-20

    Not exactly the same situation I described, but another failure case. I guess you didn't get a "message not delivered" error for you "testing" in step 4)?

    Your log shows that Dustin sends us a BYE while our MESSAGE (from step 4) is still in-flight. I guess Dustin wasn't disconnected long enough, so on reconnect we got the BYE communicator sent after disconnecting. The BYE closes the dialog on our side, so when we finally get the 481 response to the MESSAGE we can't handle it anymore.

    So we also need to cancel all unconfirmed messages when we receive a BYE for a dialog.

     
  • Rob Winch

    Rob Winch - 2011-01-21

    Sorry about not getting your exact flow. It sometimes difficult to coordinate without IMing each other from our physical locations. I will try again with the exact flow tomorrow.

     
  • Rob Winch

    Rob Winch - 2011-01-21

    PS You were correct in your assumption that I didn't get a message not delivered for step 4

     
  • Rob Winch

    Rob Winch - 2011-01-21

    I have attached a new log, pidgin-2001-01-19-224859-2.zip, that better represents the request in 2011-01-19 22:48:59 CST. The script we followed was...

    1) I IMd Dustin
    2) Dustin IMd me
    3) Dustin unplugged
    4) I IMd Dustin "this should not work"
    5) Dustin plugged in (in less than 60 seconds from 4)!)
    6) Dustin IMd "back" to me

    I got "back", but did not see an error for "this should not work" after waiting more than a minute from sending it (I believe I waited for 2 minutes).

     
  • Rob Winch

    Rob Winch - 2011-01-21

    Better logs for the scenario proposed in comment 2011-01-19 22:48:59 CST

     
  • Stefan Becker

    Stefan Becker - 2011-01-21

    Please try commit ecf4e1d.

    This cancels in-flight messages when we receive a BYE.

     
  • Rob Winch

    Rob Winch - 2011-01-22

    Fixed. Thanks again for your patience and excellent support!

     
  • Rob Winch

    Rob Winch - 2011-01-22
    • status: open --> closed-fixed
     
  • Stefan Becker

    Stefan Becker - 2011-01-22

    Hold your horses, we're not done yet :-) Please try commit a4bff9d.

    This re-enqueues in-flight messages when we receive an INVITE to an already existing dialog.

    Please try this test scenario:
    1) set up IM with Dustin as before
    2) Dustin disconnects
    3) start sending IMs to Dustin roughly every 10 seconds: test1, test2, test3, ... (a sequence counting up, you get the picture...)
    4) Dustin reconnects after X minutes (while your continuing IM'ing him)
    5) Dustin IMs "back" to you

    Repeat this test with X=1,2,3,4,5 and create a fresh pidgin log for each attempt.

    In each attempt make sure that:
    a) each of your "testX" IMs is either reported as "undelivered" or arrives in Dustin window
    b) that every "testX" (undelivered or delivered) shows up in the correct order, i.e. "test1", "test2", ... (you get the picture)

    I know it sounds like a lot of work. But I want to make sure that we encounter both the BYE and the INVITE situation.

     
  • Rob Winch

    Rob Winch - 2011-01-22
    • status: closed-fixed --> open-fixed
     
1 2 > >> (Page 1 of 2)

Log in to post a comment.