First commit
README first commit
fixed typo
removed px-security comment
ReadMe LGTM
adityadani left a comment
When would one run this script? Is it at install time, or at upgrade time?
We don't need to do it now, but separating out the checks based on what activity someone wants to do would be better. For example, run the NBDD checks only during a general health check, or run the PDB check only before upgrades.
Secondly, I would highly recommend using YAML and JSON for parsing CLI or CR outputs. The CLI outputs can change and we don't want to keep changing this script. JSON outputs will never break compatibility.
    print_warning "Found duplicate IP addresses (possible ghost entries):"
    echo "========================================="
    while IFS= read -r ip; do
Duplicate IP entries can happen momentarily in cloud drive environments as disks/DriveSets move between nodes. Instead of "ghost entries", the message could say "make sure all storage nodes are up and running".
The script is intended primarily to be run before a PXE upgrade and, if required, to do a quick health check.
We discussed this internally, and CX TSE/DSE preferred to keep the script as is rather than separate it out at this time. If this comes up again in the future, I will try to do so.
I did change the commands to use JSON wherever possible.
We have seen in some cases that a cluster has nodes with the same IP but different UUIDs. The check is intended for that.
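The "same IP, different UUID" condition described above can be sketched roughly as follows. The input format (one `IP UUID` pair per line) and the function name `find_shared_ips` are assumptions for illustration only; the actual script would derive the pairs from the JSON node listing.

```shell
#!/usr/bin/env bash
# Sketch: report IPs that are claimed by more than one distinct node UUID.
# Assumed input (hypothetical format): one "IP UUID" pair per line on stdin.
find_shared_ips() {
    awk '
        NF >= 2 && !seen[$1 SUBSEP $2]++ {   # count each (IP, UUID) pair once
            count[$1]++
            uuids[$1] = uuids[$1] " " $2
        }
        END {
            for (ip in count)
                if (count[ip] > 1)
                    print ip ":" uuids[ip]
        }
    ' | sort
}

# Example with made-up data: 10.0.0.5 appears under two different UUIDs.
printf '%s\n' \
    "10.0.0.5 uuid-a" \
    "10.0.0.6 uuid-b" \
    "10.0.0.5 uuid-c" | find_shared_ips
```

Counting distinct (IP, UUID) pairs rather than raw IP occurrences means a node that is simply listed twice with the same UUID is not reported.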
    for vol_id in $(/opt/pwx/bin/pxctl volume list 2>/dev/null | awk "NR>1 && NF>0 {print \$1}"); do
        inspect=$(/opt/pwx/bin/pxctl volume inspect "$vol_id" 2>/dev/null)
        if echo "$inspect" | grep -q "Replication Status.*:.*Resync"; then
            # Extract name and status - format is " Name : volume-name"
            name=$(echo "$inspect" | awk -F: "/^[[:space:]]*Name[[:space:]]*:/{gsub(/^[[:space:]]+/,\"\",\$2);print \$2;exit}")
            status=$(echo "$inspect" | awk -F: "/^[[:space:]]*Status[[:space:]]*:/{gsub(/^[[:space:]]+/,\"\",\$2);print \$2;exit}")
            echo "RESYNC|${vol_id}|${name}|${status}"
        fi
nit: You could do a pxctl v i <vol> -j | grep resync
In general, I would suggest using -j or the JSON outputs to parse fields instead of parsing pretty-printed CLI outputs. Once you have JSON, you can use jq as well to parse sub-fields.
Made the changes to the script to do this.
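A minimal sketch of the `-j | grep resync` approach suggested above, not the code that was actually merged. The function names and the overridable `PXCTL` variable are hypothetical, and the match is deliberately case-insensitive because the exact JSON field that carries the resync state is not assumed here.

```shell
#!/usr/bin/env bash
# Sketch: flag volumes whose JSON inspect output mentions "resync".
# PXCTL can be overridden; the default is the path used in the script.
PXCTL=${PXCTL:-/opt/pwx/bin/pxctl}

# Succeeds (exit 0) if the JSON inspect output for the volume contains
# "resync" in any letter case.
volume_in_resync() {
    local vol_id="$1"
    "$PXCTL" volume inspect "$vol_id" -j 2>/dev/null </dev/null \
        | grep -qi 'resync'
}

# Reads volume IDs from stdin, one per line, and prints one line for
# each volume that appears to be resyncing.
report_resync_volumes() {
    local vol_id
    while IFS= read -r vol_id; do
        [ -n "$vol_id" ] || continue
        if volume_in_resync "$vol_id"; then
            echo "RESYNC|${vol_id}"
        fi
    done
}
```

Redirecting stdin from `/dev/null` for the inner command keeps it from consuming the list of volume IDs that the loop is reading.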
    local pure_json
    pure_json=$(echo "$pure_json_b64" | base64 -d 2>/dev/null)
    if [[ -z "$pure_json" ]]; then
        print_error "Failed to decode pure.json from px-pure-secret."
        return
    fi

    # Step 4: Parse FlashArrays from the JSON

    local fa_count=0
    local fa_endpoints=()
    local fa_tokens=()

    local fa_section
    fa_section=$(echo "$pure_json" | tr -d '\n' | sed -n 's/.*"FlashArrays"[[:space:]]*:[[:space:]]*\(\[[^]]*\]\).*/\1/p')

    if [[ -z "$fa_section" ]]; then
        print_info "No FlashArrays section found in px-pure-secret. Skipping FlashArray check."
        return
    fi

    # Parse the FlashArrays section - extract all MgmtEndPoint and APIToken values
    while IFS= read -r line; do
        [[ -n "$line" ]] && fa_endpoints+=("$line")
    done < <(echo "$fa_section" | grep -o '"MgmtEndPoint"[[:space:]]*:[[:space:]]*"[^"]*"' | sed 's/.*:.*"\([^"]*\)"/\1/')

    while IFS= read -r line; do
        [[ -n "$line" ]] && fa_tokens+=("$line")
    done < <(echo "$fa_section" | grep -o '"APIToken"[[:space:]]*:[[:space:]]*"[^"]*"' | sed 's/.*:.*"\([^"]*\)"/\1/')

    fa_count=${#fa_endpoints[@]}

    if [[ $fa_count -eq 0 ]]; then
        print_info "No FlashArrays found in px-pure-secret."
        return
    fi
Since this is JSON, the jq command will make your life much easier than parsing line by line.
As jq may or may not be installed on a user's system, I did not do this. Skipping.
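One possible middle ground, sketched here as an illustration rather than something the PR adopted, is to use jq when it is present and fall back to the existing grep/sed parsing otherwise. The function names are hypothetical; the `FlashArrays` and `MgmtEndPoint` keys are the ones the script already parses.

```shell
#!/usr/bin/env bash
# Sketch: list FlashArray management endpoints from pure.json on stdin,
# using jq when it is installed and a grep/sed fallback otherwise.

endpoints_with_jq() {
    # "[]?" tolerates a missing FlashArrays key; "// empty" drops nulls.
    jq -r '.FlashArrays[]?.MgmtEndPoint // empty'
}

endpoints_without_jq() {
    # Same idea as the script: isolate the FlashArrays array first, then
    # pull out each MgmtEndPoint value. Assumes no nested arrays inside.
    tr -d '\n' \
        | sed -n 's/.*"FlashArrays"[[:space:]]*:[[:space:]]*\(\[[^]]*\]\).*/\1/p' \
        | grep -o '"MgmtEndPoint"[[:space:]]*:[[:space:]]*"[^"]*"' \
        | sed 's/^"MgmtEndPoint"[[:space:]]*:[[:space:]]*"\(.*\)"$/\1/'
}

list_fa_endpoints() {
    if command -v jq >/dev/null 2>&1; then
        endpoints_with_jq
    else
        endpoints_without_jq
    fi
}
```

With this shape the script keeps working on machines without jq, while machines that have it get the more robust parser.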
    fi
    echo ""

    # Step 5: Test connectivity to each FlashArray
The connectivity should be checked from the worker nodes, not from the Mac/Windows machine where this script is running.
Good point. I have made this change. The FA perf testing is now done from the Portworx pod that is selected at the start of the script.
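A rough sketch of running a reachability test from inside a pod instead of from the workstation. This is not the code in the PR: the function name, the overridable `KUBECTL` variable, the default port 443, and the presence of `bash` and `timeout` in the pod image are all assumptions.

```shell
#!/usr/bin/env bash
# Sketch: test TCP reachability of a FlashArray endpoint from inside a pod.
# Assumptions (not from the PR): the pod image provides bash and timeout,
# and the management endpoint listens on port 443.
KUBECTL=${KUBECTL:-kubectl}

check_fa_from_pod() {
    local namespace="$1" pod="$2" endpoint="$3" port="${4:-443}"
    # bash's /dev/tcp pseudo-device opens a TCP connection without curl
    # or nc. Host and port are passed as positional arguments so they are
    # never interpolated into the command string.
    "$KUBECTL" exec -n "$namespace" "$pod" -- \
        timeout 5 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$endpoint" "$port" \
        >/dev/null 2>&1
}

# Usage (namespace and pod name are placeholders):
# if check_fa_from_pod portworx px-cluster-abcde 10.0.0.1; then
#     echo "reachable"
# else
#     echo "unreachable"
# fi
```

The function only reports whether a TCP connection could be opened within five seconds; it does not validate the API token or measure performance.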
px_healthcheck/px_healthcheck.sh
Outdated
    manual_image_check
    check_flasharray
    px_alerts_show
    check_nbdd
You need a PX version check to ensure this is reported only when PX is at a version that supports NBDD.
Made changes to the script to only run check_nbdd if the PXE version is 3.5.0 or above.
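A version gate of this kind could look roughly like the sketch below. The helper name `version_ge` and the placeholder version string are hypothetical; only the 3.5.0 threshold and the `check_nbdd` name come from the discussion above.

```shell
#!/usr/bin/env bash
# Sketch: succeed if version $1 is greater than or equal to version $2.
# Components are compared numerically from left to right; anything after
# the first "-" (for example a build suffix) is ignored.
version_ge() {
    awk -v a="${1%%-*}" -v b="${2%%-*}" 'BEGIN {
        na = split(a, x, ".")
        nb = split(b, y, ".")
        n = (na > nb) ? na : nb
        for (i = 1; i <= n; i++) {
            xi = x[i] + 0          # missing components count as 0
            yi = y[i] + 0
            if (xi > yi) exit 0
            if (xi < yi) exit 1
        }
        exit 0                     # all components equal
    }'
}

px_version="3.5.1"   # placeholder; the real script reads this from the cluster
if version_ge "$px_version" "3.5.0"; then
    echo "run check_nbdd"
else
    echo "skip check_nbdd"
fi
```

Comparing each component as a number avoids the string-comparison trap where "3.10.0" would sort before "3.5.0".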
Added changes flagged by Aditya - Mar 3, 2026
lgtm
First commit
What this PR does / why we need it:
Which issue(s) this PR fixes (optional)
Closes #
Special notes for your reviewer: