  1. Beaker
  2. BKR-1605

SSH connections fail randomly during tests

    Details

    • Type: Bug
    • Status: Accepted
    • Priority: Normal
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: ssh
    • Labels:
      None
    • Method Found:
      Needs Assessment
    • QA Risk Assessment:
      Needs Assessment

      Description

      Recently, the SIMP project has noticed SSH connections failing at random during long-running tests.

      Some digging indicates that the failure occurs in the retry logic in lib/beaker/ssh_connection.rb.

      We believe that commit 43e519dc2839601a1de08eb273e9c19cff49237c is the cause but, unfortunately, the failure is intermittent and difficult to pinpoint.
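
      For discussion, here is a minimal sketch of the behaviour we would expect from the retry path: when a command hits an IOError on a closed stream, the connection is closed, re-opened, and the command is tried again. This is not Beaker's actual code. FlakyConnection and execute_with_reconnect are hypothetical names invented for this example, and the fake connection only simulates a dropped channel.

```ruby
# Hypothetical stand-in for an SSH session; NOT Beaker's SshConnection API.
class FlakyConnection
  attr_reader :connects

  def initialize(failures)
    @failures = failures # number of execute calls that fail before one succeeds
    @connects = 0
    @open = false
  end

  def connect
    @connects += 1
    @open = true
  end

  def close
    @open = false
  end

  def execute(command)
    connect unless @open
    if @failures > 0
      @failures -= 1
      @open = false
      raise IOError, 'closed stream' # simulate the channel dropping mid-command
    end
    "ran: #{command}"
  end
end

# Run a command, re-opening the connection after each IOError.
# Allows up to max_retries retries after the first attempt, then re-raises.
def execute_with_reconnect(conn, command, max_retries: 3, retry_interval: 0)
  attempts = 0
  begin
    attempts += 1
    conn.execute(command)
  rescue IOError
    raise if attempts > max_retries
    conn.close
    sleep retry_interval
    conn.connect
    retry
  end
end
```

      With a connection that drops twice, execute_with_reconnect(FlakyConnection.new(2), 'puppet agent -t') succeeds on the third attempt. With more drops than max_retries allows, the IOError propagates to the caller.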

      An active failure in GitLab can be found at https://gitlab.com/simp/simp-core/-/jobs/252971777 with the relevant output pasted below:

       

      ipa 20:49:58$ puppet agent -t
        Info: Using configured environment 'production'
        Info: Retrieving pluginfacts
        Info: Retrieving plugin
        Info: Retrieving locales
        Info: Loading facts
        Info: Caching catalog for ipa.int.onyxpoint.com
        Info: Applying configuration version '1563324605'
        Warning: svckill: Would have killed:
          svckill: chronyd.service
          svckill: gssproxy.service
          svckill: qemu-guest-agent.service
          svckill: rhel-domainname.service
          svckill: rpcbind.service
          svckill: vgauthd.service
          svckill: vmtoolsd.service
        Notice: Applied catalog in 8.02 seconds
       
      ipa executed in 36.25 seconds
      .
      ipa 20:50:34$ puppet agent -t ostensibly successful.
       
      agent-el7 20:50:34$ puppet agent -t
        Trying command 3 times.
      .
      agent-el7 20:50:34$ puppet agent -t
        Warning: ssh channel on agent-el7 received exception post command execution IOError - closed stream
        Warning: ssh.close: connection is already closed, no action needed
        ssh connection to agent-el7 has been terminated
       
      agent-el7 executed in 0.13 seconds
          should apply the configuration (FAILED - 1)
      ssh connection to puppet has been terminated
      ssh connection to ipa has been terminated
      Warning: ssh.close: connection is already closed, no action needed
      ssh connection to agent-el7 has been terminated
      ssh connection to agent-el6 has been terminated
      removing temporary ssh-config files per-vagrant box
      Destroying vagrant boxes
      ==> agent-el6: Forcing shutdown of VM...
      ==> agent-el6: Destroying VM and associated drives...
      ==> agent-el7: Forcing shutdown of VM...
      ==> agent-el7: Destroying VM and associated drives...
      ==> ipa: Forcing shutdown of VM...
      ==> ipa: Destroying VM and associated drives...
      ==> puppet: Forcing shutdown of VM...
      ==> puppet: Destroying VM and associated drives...
       
      Failures:
       
        1) set up an IPA server configure nodes for the IPA services should apply the configuration
           Failure/Error:
             retry_on(agent, 'puppet agent -t',
               :desired_exit_codes => [0],
               :retry_interval     => 15,
               :max_retries        => 3,
               :verbose            => true.to_s # work around beaker bug
             )
           Beaker::Host::CommandFailure:
             Host 'agent-el7' connection failure running:
              puppet agent -t
             Last 10 lines of output were:
             
           # ./.vendor/ruby/2.4.0/gems/beaker-4.10.0/lib/beaker/host.rb:359:in `exec'
           # ./.vendor/ruby/2.4.0/gems/beaker-4.10.0/lib/beaker/dsl/helpers/host_helpers.rb:83:in `block in on'
           # ./.vendor/ruby/2.4.0/gems/beaker-4.10.0/lib/beaker/shared/host_manager.rb:130:in `run_block_on'
           # ./.vendor/ruby/2.4.0/gems/beaker-4.10.0/lib/beaker/dsl/patterns.rb:37:in `block_on'
           # ./.vendor/ruby/2.4.0/gems/beaker-4.10.0/lib/beaker/dsl/helpers/host_helpers.rb:63:in `on'
           # ./.vendor/ruby/2.4.0/gems/beaker-4.10.0/lib/beaker/dsl/helpers/host_helpers.rb:568:in `retry_on'
           # ./spec/acceptance/suites/ipa/10_ipa_server_spec.rb:102:in `block (4 levels) in <top (required)>'
           # ./.vendor/ruby/2.4.0/gems/beaker-4.10.0/lib/beaker/shared/host_manager.rb:130:in `run_block_on'
           # ./.vendor/ruby/2.4.0/gems/beaker-4.10.0/lib/beaker/shared/host_manager.rb:118:in `block in run_block_on'
           # ./.vendor/ruby/2.4.0/gems/beaker-4.10.0/lib/beaker/shared/host_manager.rb:117:in `map'
           # ./.vendor/ruby/2.4.0/gems/beaker-4.10.0/lib/beaker/shared/host_manager.rb:117:in `run_block_on'
           # ./.vendor/ruby/2.4.0/gems/beaker-4.10.0/lib/beaker/dsl/patterns.rb:37:in `block_on'
           # ./spec/acceptance/suites/ipa/10_ipa_server_spec.rb:101:in `block (3 levels) in <top (required)>'
       
      Finished in 89 minutes 32 seconds (files took 6 minutes 28 seconds to load)
      58 examples, 1 failure
       
      Failed examples:
       
      rspec ./spec/acceptance/suites/ipa/10_ipa_server_spec.rb:100 # set up an IPA server configure nodes for the IPA services should apply the configuration
       
      /opt/puppetlabs/puppet/bin/ruby -I/var/lib/gitlab-runner/builds/7212ed3c/0/simp/simp-core/.vendor/ruby/2.4.0/gems/rspec-core-3.8.2/lib:/var/lib/gitlab-runner/builds/7212ed3c/0/simp/simp-core/.vendor/ruby/2.4.0/gems/rspec-support-3.8.2/lib /var/lib/gitlab-runner/builds/7212ed3c/0/simp/simp-core/.vendor/ruby/2.4.0/gems/rspec-core-3.8.2/exe/rspec /var/lib/gitlab-runner/builds/7212ed3c/0/simp/simp-core/spec/acceptance/suites/ipa --color failed
      ERROR: Job failed: exit status 1
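
      Because the failure is intermittent, it may help to scan saved job logs for the warning shown above. The following is a small sketch, assuming the log has been saved as plain text. The regular expression is copied from the warning line in this output, and closed_stream_hosts is a hypothetical helper name, not part of Beaker.

```ruby
# Failure signature copied from the warning line in the job output.
CLOSED_STREAM = /ssh channel on (\S+) received exception post command execution IOError - closed stream/

# Return the unique host names whose SSH channel reported a closed stream.
def closed_stream_hosts(log_text)
  log_text.scan(CLOSED_STREAM).flatten.uniq
end

sample = <<~LOG
  agent-el7 20:50:34$ puppet agent -t
    Warning: ssh channel on agent-el7 received exception post command execution IOError - closed stream
    Warning: ssh.close: connection is already closed, no action needed
LOG

puts closed_stream_hosts(sample).inspect # prints ["agent-el7"]
```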
      

      You can repeat the test by doing the following:

            People

            Assignee:
            Unassigned
            Reporter:
            peiriannydd Trevor Vaughan
            People Involved:
            Kevin Imber
