PuppetDB / PDB-4579

Enable tcpKeepAlive on the postgres driver

Details

    • PuppetDB
    • Bug Fix
    • Updated the PostgreSQL driver to a version that can properly detect dead connections before they are used. This resolves an issue where an unreachable PostgreSQL server could cause PuppetDB to exhaust its connection pool, requiring a restart.
    • Needs Assessment

    Description

      Bringing down the network link on the PostgreSQL server causes the connection pool to hold on to connections that were already closed on the database server, seemingly indefinitely.

      The network stack on the client is never notified that these connections have been closed by the peer, so PuppetDB's connection pool still believes they are active.

      This caused us to run out of available connections in the connection pool until we restarted PuppetDB. The PDBReadPool_pool_ActiveConnections metric also reported a value of 25 (maximum-pool-size).

       

      Can the tcpKeepAlive option of the PostgreSQL JDBC driver be enabled to prevent this class of issue from happening?
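
For illustration, here is a minimal sketch of how the tcpKeepAlive connection parameter of the PostgreSQL JDBC driver can be passed, either in the JDBC URL or as a connection property. The class name, the database name "puppetdb", and the credentials are hypothetical; this is not how PuppetDB itself wires up its pool. The sketch only builds the URL and properties, so it runs without the driver on the classpath.

```java
import java.util.Properties;

public class KeepAliveProps {
    // Builds a PostgreSQL JDBC URL that asks the driver to enable TCP keep-alive
    // (SO_KEEPALIVE) on the underlying socket.
    static String url(String host, int port, String database) {
        return "jdbc:postgresql://" + host + ":" + port + "/" + database
                + "?tcpKeepAlive=true";
    }

    // Alternative: pass the same option as a connection property.
    // User and password are placeholders for illustration only.
    static Properties props(String user, String password) {
        Properties p = new Properties();
        p.setProperty("user", user);
        p.setProperty("password", password);
        p.setProperty("tcpKeepAlive", "true");
        return p;
    }

    public static void main(String[] args) {
        // Host and port are taken from the netstat output in this ticket;
        // the database name is an assumption.
        System.out.println(url("10.197.29.74", 5432, "puppetdb"));
        System.out.println(props("puppetdb", "secret").getProperty("tcpKeepAlive"));
        // With the driver available, these would be handed to
        // java.sql.DriverManager.getConnection(url, props).
    }
}
```

Note that the option only switches keep-alive probes on; how quickly a dead peer is detected depends on the operating system's keep-alive timers. On Linux, for example, net.ipv4.tcp_keepalive_time defaults to 7200 seconds, so those may need tuning as well.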

       

      Network link going down on the PostgreSQL server 

      [di nov  5 12:49:51 2019] bnx2x 0000:37:00.0 eno1: NIC Link is Down
      [di nov  5 12:55:03 2019] bnx2x 0000:37:00.0 eno1: NIC Link is Up, 10000 Mbps full duplex, Flow control: none
      [di nov  5 12:55:09 2019] bnx2x 0000:37:00.0 eno1: NIC Link is Down
      [di nov  5 12:55:10 2019] bnx2x 0000:37:00.0 eno1: NIC Link is Up, 10000 Mbps full duplex, Flow control: none
      [di nov  5 12:55:11 2019] bnx2x 0000:37:00.0 eno1: NIC Link is Down
      [di nov  5 12:55:13 2019] bnx2x 0000:37:00.0 eno1: NIC Link is Up, 10000 Mbps full duplex, Flow control: none
      
      

       

      Connections still in ESTABLISHED state on the client side 

      [root@puppetdb ~]# netstat -ntp|grep 10.197.29.74:5432
      tcp6       0      0 10.198.174.11:39186     10.197.29.74:5432       ESTABLISHED 47079/java          
      tcp6       0      0 10.198.174.11:59996     10.197.29.74:5432       ESTABLISHED 47079/java          
      tcp6       0      0 10.198.174.11:50380     10.197.29.74:5432       ESTABLISHED 47079/java          
      tcp6       0      0 10.198.174.11:60952     10.197.29.74:5432       ESTABLISHED 47079/java          
      tcp6       0      0 10.198.174.11:33536     10.197.29.74:5432       ESTABLISHED 47079/java          
      tcp6       0      0 10.198.174.11:60902     10.197.29.74:5432       ESTABLISHED 47079/java          
      tcp6       0      0 10.198.174.11:35564     10.197.29.74:5432       ESTABLISHED 47079/java          
      tcp6       0      0 10.198.174.11:57950     10.197.29.74:5432       ESTABLISHED 47079/java          
      tcp6       0      0 10.198.174.11:45416     10.197.29.74:5432       ESTABLISHED 47079/java          
      tcp6       0      0 10.198.174.11:33644     10.197.29.74:5432       ESTABLISHED 47079/java          
      tcp6       0      0 10.198.174.11:39678     10.197.29.74:5432       ESTABLISHED 47079/java          
      tcp6       0      0 10.198.174.11:43846     10.197.29.74:5432       ESTABLISHED 47079/java          
      tcp6       0      0 10.198.174.11:55738     10.197.29.74:5432       ESTABLISHED 47079/java          
      tcp6       0      0 10.198.174.11:58098     10.197.29.74:5432       ESTABLISHED 47079/java          
      tcp6       0      0 10.198.174.11:34214     10.197.29.74:5432       ESTABLISHED 47079/java          
      tcp6       0      0 10.198.174.11:40098     10.197.29.74:5432       ESTABLISHED 47079/java          
      tcp6       0      0 10.198.174.11:41694     10.197.29.74:5432       ESTABLISHED 47079/java          
      tcp6       0      0 10.198.174.11:53760     10.197.29.74:5432       ESTABLISHED 47079/java          
      tcp6       0      0 10.198.174.11:33806     10.197.29.74:5432       ESTABLISHED 47079/java          
      tcp6       0      0 10.198.174.11:50358     10.197.29.74:5432       ESTABLISHED 47079/java          
      tcp6       0      0 10.198.174.11:60068     10.197.29.74:5432       ESTABLISHED 47079/java          
      tcp6       0      0 10.198.174.11:33530     10.197.29.74:5432       ESTABLISHED 47079/java          
      tcp6       0      0 10.198.174.11:38840     10.197.29.74:5432       ESTABLISHED 47079/java          
      tcp6       0      0 10.198.174.11:54616     10.197.29.74:5432       ESTABLISHED 47079/java          
      tcp6       0      0 10.198.174.11:36002     10.197.29.74:5432       ESTABLISHED 47079/java     
      

       

      Connections actually established, as seen on the PostgreSQL server side

      [root@pgsqldb-puppetdb ~]# netstat -ntp|grep 10.198.174.11
      tcp        0      0 10.197.29.74:5432       10.198.174.11:39186     ESTABLISHED 9292/postgres: pupp 
      tcp        0      0 10.197.29.74:5432       10.198.174.11:40098     ESTABLISHED 9369/postgres: pupp 
      tcp        0      0 10.197.29.74:5432       10.198.174.11:39678     ESTABLISHED 9338/postgres: pupp 
      tcp        0      0 10.197.29.74:5432       10.198.174.11:60902     ESTABLISHED 7652/postgres: pupp 
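
The comparison between the two netstat outputs can be sketched as a set difference: any client-side local port with no matching peer port on the server belongs to a stale connection. The class and method names are hypothetical, and the port lists in main are only a subset of the values shown above.

```java
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

public class StaleConnections {
    // Returns the client-side local ports that the server no longer
    // lists as peers, i.e. connections the client still holds open
    // although the server has already dropped them.
    static Set<Integer> stale(List<Integer> clientPorts, List<Integer> serverPeerPorts) {
        Set<Integer> result = new TreeSet<>(clientPorts);
        result.removeAll(serverPeerPorts);
        return result;
    }

    public static void main(String[] args) {
        // A few local ports from the client-side netstat output
        List<Integer> client = List.of(39186, 59996, 50380, 40098, 39678, 60902);
        // The four peer ports from the server-side netstat output
        List<Integer> server = List.of(39186, 40098, 39678, 60902);
        System.out.println("Stale client ports: " + stale(client, server));
    }
}
```

Applied to the full listings, the client shows 25 connections in ESTABLISHED state while the server knows about only 4, which leaves 21 pool connections that can never complete a query.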
      
      

       

      PuppetDB connection pool running out of available connections. 

      2019-11-06T12:43:50.504+01:00 WARN  [p.p.jdbc] Caught exception. Last attempt, throwing exception.
      2019-11-06T12:43:50.506+01:00 WARN  [o.e.j.s.HttpChannel] /pdb/query/v4
      javax.servlet.ServletException: java.sql.SQLTransientConnectionException: PDBReadPool - Connection is not available, request timed out after 3000ms.
      
      

            People

              robert.roland Robert Roland
              tdevelioglu Taylan Develioglu