debcrafters-packages team mailing list archive

[Bug 2121607] Re: Nova-api showing latency after upgrading to Caracal

 

** Description changed:

- After upgrading to Caracal, we noticed that the duration of GET calls to
- nova-api increases over time, as does the memory usage of nova-api.
- We first noticed this in telegraf metrics. To validate it, I created a
- brand new cluster of VMs without telegraf, with only one headnode
- running nova-api, and had multiple nodes send GET requests to it while
- monitoring the duration.
+ [ Impact ]
  
- Script to send requests:
- # --- Get a fresh token (requires openrc sourced first) ---
- get_token() {
-   openstack token issue -f value -c id
- }
- OS_TOKEN=$(get_token)
- echo "Using token: $OS_TOKEN"
+ * A resource leak bug in python-attrs 21.2.0 (<https://github.com/python-attrs/attrs/issues/826>)
+  caused performance issues when an application created many classes with the same name.
  
- # --- Send requests at 10 per second ---
- COUNT=0
- while true; do
-   COUNT=$((COUNT+1))
-   STATUS=$(curl -s -o /dev/null -w "%{http_code}" \
-     -H "X-Auth-Token: $OS_TOKEN" \
-     -H "Accept: application/json" \
-     "$NOVA_URL/servers/detail")
+ * This caused a performance regression in nova through its dependency on python-jsonschema
+   * The latency of API calls to nova increases over time
  
-   echo "$(date +'%F %T') [$COUNT] HTTP $STATUS"
+ [ Test Plan ]
  
-   if [ "$STATUS" = "401" ]; then
-     echo "[$(date)] Got 401 → refreshing token..."
-     OS_TOKEN=$(get_token)
-     continue   # retry next loop with fresh token
-   fi
+  To reproduce the issue and confirm the fix:
  
-   sleep 0.1   # 0.1 sec → 10 per second
- done
+ * Create a minimal OpenStack deployment, using e.g. devstack
+ (<https://docs.openstack.org/devstack/latest/>).
  
- script to monitor the duration (avg per 5 minutes)
- grep 'servers/detail' /var/log/nova/nova-api.log | awk '
-     # Example line:
-   # 2025-08-21 17:27:08.859 ... "GET /v2.1/os-quota-sets/..." ... time: 0.6598654
-   match($0, /^([0-9-]+) ([0-9]{2}):([0-9]{2}):([0-9]{2})(\.[0-9]+)?.* time: ([0-9.]+)/, m) {
-       ymd = m[1]; hh = m[2]; mm = m[3]; dur = m[6]
-       bmin = int(mm/5)*5                           # floor minute to 5-min bucket
-       key = sprintf("%s %s:%02d", ymd, hh, bmin)   # e.g., 2025-08-21 17:25
-       sum[key] += dur; cnt[key]++
-   }
-   END {
-       for (k in sum) printf "%s,%.3f\n", k, sum[k]/cnt[k] | "sort"
-   }'
+ * Run a script which makes many calls to the nova API
  
- I use systemctl status to track the memory usage; it increased by about
- 500MB over a weekend (I'm testing on a small cluster). The duration of
- the GET requests also showed an obvious increase, with no apparent
- upper limit.
+  ---Script to call API---
+  #!/bin/bash
+  # --- Get a fresh token (requires openrc sourced first) ---
+  get_token() {
+      openstack token issue -f value -c id
+  }
  
- Wondering if it is a memory leak thing, but want to get confirmation
- from team. Thanks.
+  OS_TOKEN=$(get_token)
+  echo "Using token: $OS_TOKEN"
+ 
+  # --- Send requests continuously (uncomment the sleep below for about 10 per second) ---
+  COUNT=0
+  while true; do
+      COUNT=$((COUNT + 1))
+      STATUS=$(curl -s -o /dev/null -w "%{http_code}" \
+          -H "X-Auth-Token: $OS_TOKEN" \
+          -H "Accept: application/json" \
+          "$NOVA_URL/servers/detail")
+ 
+      echo "$(date +'%F %T') [$COUNT] HTTP $STATUS"
+ 
+      if [ "$STATUS" = "401" ]; then
+          echo "[$(date)] Got 401 → refreshing token..."
+          OS_TOKEN=$(get_token)
+          continue # retry next loop with fresh token
+      fi
+      # sleep 0.1 # 0.1 sec → 10 per second
+  done
+  ---/Script to call API---
+ 
+ * Simultaneously monitor the response time of API calls using the
+ openstack CLI
+ 
+  ---Script to monitor response time---
+  #!/bin/bash
+ 
+  # --- Again openrc must be sourced first ---
+  os_quota() {
+      openstack quota list --compute --timing --format=json
+  }
+ 
+  headers() {
+      os_quota | grep -o -P '/(\w|\d){32}(/d)?' | tr '\n' ','
+  }
+ 
+  time_api() {
+      os_quota |
+          awk 'BEGIN { FS=","; ORS=","} /quota/ {print $2} END { printf "\n" }'
+  }
+ 
+  OUTPUT="api_calls_$(date +%Y-%m-%dT%H_%M_%S)"
+ 
+  echo -n "# time," >>"$OUTPUT"
+  headers >>"$OUTPUT"
+  echo >>"$OUTPUT"
+ 
+  i=0
+  while true; do
+      echo -n "$(date +'%F %T')," >>"$OUTPUT"
+      time_api >>"$OUTPUT"
+      echo -n "."
+      i=$((i + 1)) # count dots so the progress line wraps
+      if ((i > 80)); then
+          i=0
+          echo
+      fi
+      sleep 5
+  done
+ 
+  ---/Script to monitor response time---
+ 
+ * Observe that the response time increases over time with python-attrs 21.2.0
+   * I did this over a period of about 20 hrs.
+ 
+ * Log in to the server hosting the nova API and upgrade python-attrs to
+ 21.4.0
+ 
+ * Repeat the above steps and observe that the response time remains
+ stable over time.
+ 
+ [ Where problems could occur ]
+ 
+ * Other packages depend on python-attrs as a library and so could be
+ affected.
+ 
+ * From 21.2.0 to 21.4.0 there was one backward-incompatible change which could potentially cause issues:
+   * <https://github.com/python-attrs/attrs/blob/21.4.0/CHANGELOG.rst>
+     """
+     When using @define, converters are now run by default when setting an attribute on an instance -- additionally to validators. I.e. the new default is on_setattr=[attrs.setters.convert, attrs.setters.validate].
+ 
+     This is unfortunately a breaking change, but it was an oversight, impossible to raise a DeprecationWarning about, and it's better to fix it now while the APIs are very fresh with few users.
+     """
+   * Since no package pins python-attrs to exclude this version, it is unlikely to break other packages.
+   * It is possible that some users use the python3-attr package in their own code, but it is much more likely that,
+     for user code, attr would be installed via pip, either directly or within a virtualenv.
+ 
+ * The version in noble is 23.2.0, which already includes all these
+ changes, and there don't seem to be any associated issues.
+ 
+ [ Other Info ]
+ 
+ * This SRU proposal bumps to 21.4.0 instead of 21.3.0 because 21.4.0 is a bug fix
+   release for a regression in 21.3.0.
+ 
+ * This is the first Ubuntu-specific modification to this package, but it
+ is a fairly minor one and doesn't need to be carried forward to any
+ other series.
+ 
+ * A possible lighter-weight alternative is to simply cherry-pick the
+ bugfix commit
+ <https://github.com/python-attrs/attrs/commit/38580632ceac1cd6e477db71e1d190a4130beed4>

-- 
You received this bug notification because you are a member of
Debcrafters packages, which is subscribed to python-attrs in Ubuntu.
https://bugs.launchpad.net/bugs/2121607

Title:
  Nova-api showing latency after upgrading to Caracal

Status in OpenStack Compute (nova):
  Confirmed
Status in python-attrs package in Ubuntu:
  New
Status in python-attrs source package in Jammy:
  New

Bug description:
  [ Impact ]

  * A resource leak bug in python-attrs 21.2.0 (<https://github.com/python-attrs/attrs/issues/826>)
   caused performance issues when an application created many classes with the same name.

  * This caused a performance regression in nova through its dependency on python-jsonschema
    * The latency of API calls to nova increases over time

  [ Test Plan ]

   To reproduce the issue and confirm the fix:

  * Create a minimal OpenStack deployment, using e.g. devstack
  (<https://docs.openstack.org/devstack/latest/>).

  * Run a script which makes many calls to the nova API

   ---Script to call API---
   #!/bin/bash
   # --- Get a fresh token (requires openrc sourced first) ---
   get_token() {
       openstack token issue -f value -c id
   }

   OS_TOKEN=$(get_token)
   echo "Using token: $OS_TOKEN"

   # --- Send requests continuously (uncomment the sleep below for about 10 per second) ---
   COUNT=0
   while true; do
       COUNT=$((COUNT + 1))
       STATUS=$(curl -s -o /dev/null -w "%{http_code}" \
           -H "X-Auth-Token: $OS_TOKEN" \
           -H "Accept: application/json" \
           "$NOVA_URL/servers/detail")

       echo "$(date +'%F %T') [$COUNT] HTTP $STATUS"

       if [ "$STATUS" = "401" ]; then
           echo "[$(date)] Got 401 → refreshing token..."
           OS_TOKEN=$(get_token)
           continue # retry next loop with fresh token
       fi
       # sleep 0.1 # 0.1 sec → 10 per second
   done
   ---/Script to call API---

  * Simultaneously monitor the response time of API calls using the
  openstack CLI

   ---Script to monitor response time---
   #!/bin/bash

   # --- Again openrc must be sourced first ---
   os_quota() {
       openstack quota list --compute --timing --format=json
   }

   headers() {
       os_quota | grep -o -P '/(\w|\d){32}(/d)?' | tr '\n' ','
   }

   time_api() {
       os_quota |
           awk 'BEGIN { FS=","; ORS=","} /quota/ {print $2} END { printf "\n" }'
   }

   OUTPUT="api_calls_$(date +%Y-%m-%dT%H_%M_%S)"

   echo -n "# time," >>"$OUTPUT"
   headers >>"$OUTPUT"
   echo >>"$OUTPUT"

   i=0
   while true; do
       echo -n "$(date +'%F %T')," >>"$OUTPUT"
       time_api >>"$OUTPUT"
       echo -n "."
       i=$((i + 1)) # count dots so the progress line wraps
       if ((i > 80)); then
           i=0
           echo
       fi
       sleep 5
   done

   ---/Script to monitor response time---

  * Observe that the response time increases over time with python-attrs 21.2.0
    * I did this over a period of about 20 hrs.

  * Log in to the server hosting the nova API and upgrade python-attrs
  to 21.4.0

  * Repeat the above steps and observe that the response time remains
  stable over time.
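
  To make the trend easier to see, the log written by the monitoring
  script can be averaged per hour. The following is only a sketch: the
  function name hourly_avg is made up for illustration, and it assumes
  that each data line starts with a "YYYY-MM-DD HH:MM:SS" timestamp
  followed by a comma and a duration in seconds, and that header lines
  start with "#". The column that actually holds the duration may differ
  depending on the openstack CLI output, so adjust the field index as
  needed.

```shell
#!/bin/bash
# Sketch: print "YYYY-MM-DD HH,<average>" for each hour in a timing log.
# Assumption: field 1 is the timestamp and field 2 is a duration in seconds.
hourly_avg() {
    awk -F',' '
        /^#/ || NF < 2 { next }          # skip header lines and lines with no data
        {
            v = $2
            gsub(/[^0-9.]/, "", v)       # strip quotes or spaces around the number
            if (v == "") next
            hour = substr($1, 1, 13)     # keep "YYYY-MM-DD HH"
            sum[hour] += v
            cnt[hour]++
        }
        END {
            for (h in sum) printf "%s,%.3f\n", h, sum[h] / cnt[h]
        }' "$1" | sort
}
```

  Calling hourly_avg with the name of the api_calls_* file produced by
  the monitoring script prints one line per hour. Averages that rise
  steadily over the run would be consistent with the regression, while
  flat averages would be consistent with the fix.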

  [ Where problems could occur ]

  * Other packages depend on python-attrs as a library and so could be
  affected.

  * From 21.2.0 to 21.4.0 there was one backward-incompatible change which could potentially cause issues:
    * <https://github.com/python-attrs/attrs/blob/21.4.0/CHANGELOG.rst>
      """
      When using @define, converters are now run by default when setting an attribute on an instance -- additionally to validators. I.e. the new default is on_setattr=[attrs.setters.convert, attrs.setters.validate].

      This is unfortunately a breaking change, but it was an oversight, impossible to raise a DeprecationWarning about, and it's better to fix it now while the APIs are very fresh with few users.
      """
    * Since no package pins python-attrs to exclude this version, it is unlikely to break other packages.
    * It is possible that some users use the python3-attr package in their own code, but it is much more likely that,
      for user code, attr would be installed via pip, either directly or within a virtualenv.

  * The version in noble is 23.2.0, which already includes all these
  changes, and there don't seem to be any associated issues.
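
  When checking several hosts or series, it can help to compare the
  installed attrs version against the 21.4.0 target of this SRU. The
  following is a sketch only: the function names version_ge and
  has_target_version are made up for illustration, and the comparison
  relies on the -V (version sort) option of GNU sort.

```shell
#!/bin/bash
# Sketch: succeed if version $1 is greater than or equal to version $2.
# Relies on GNU sort -V, which orders version strings numerically per component.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Hypothetical helper: print "ok" if the given attrs version is at least
# 21.4.0 (the version proposed in this SRU), otherwise print "older".
has_target_version() {
    if version_ge "$1" "21.4.0"; then
        echo "ok"
    else
        echo "older"
    fi
}
```

  The version string could come from, for example,
  dpkg-query -W -f='${Version}' python3-attr or from
  python3 -c 'from importlib.metadata import version; print(version("attrs"))'.
  Note that "older" only means the version is below 21.4.0; it does not
  by itself show that a given version contains the leak.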

  [ Other Info ]

  * This SRU proposal bumps to 21.4.0 instead of 21.3.0 because 21.4.0 is a bug fix
    release for a regression in 21.3.0.

  * This is the first Ubuntu-specific modification to this package, but
  it is a fairly minor one and doesn't need to be carried forward to any
  other series.

  * A possible lighter-weight alternative is to simply cherry-pick the
  bugfix commit
  <https://github.com/python-attrs/attrs/commit/38580632ceac1cd6e477db71e1d190a4130beed4>
