1
0
Fork 0
mirror of https://github.com/ansible-collections/community.general.git synced 2026-03-22 05:09:12 +00:00
Commit graph

11 commits

Author SHA1 Message Date
Felix Fontein
476f2bf641
Integration tests: replace ansible_xxx with ansible_facts.xxx (#11479)
Replace ansible_xxx with ansible_facts.xxx.
2026-02-07 18:18:48 +01:00
Alexei Znamensky
ac37544c53
monit: investigating tests again - using copilot on this one (#11255)
* add monit version to successful exit

* install the standard monit - if 5.34, then bail out

* add 3sec wait after service restart

- that restart happens exactly before the task receiving the SIGTERM, so maybe, just maybe, it just needs time to get ready for the party

* wait for monit initialisation after restart

* monit tests: check service-specific status in readiness wait

The wait task was checking 'monit status' (general), but the actual
failing command is 'monit status -B httpd_echo' (service-specific).
This causes a race where general status succeeds but service queries
fail. Update to check the exact command format that will be used.

* monit tests: remove 5.34.x version restriction

The version restriction was based on incorrect diagnosis. The actual
issue was the readiness check validating general status instead of
service-specific queries. Now that we check the correct command
format, the tests should work across all monit versions.

* monit tests: add stabilization delay after readiness check

After the readiness check succeeds, add a 1-second pause before
running actual tests. Monit 5.34.x and 5.35 appear to have a
concurrency issue where rapid successive 'monit status -B' calls
can cause hangs even though the first call succeeds.

* monit tests: add retry logic for state changes to handle monit daemon hangs

Monit daemon has an intermittent concurrency bug across versions 5.27-5.35
where 'monit status -B' commands can hang (receiving SIGTERM) even after
the daemon has successfully responded to previous queries. This appears
to be a monit daemon issue, not a timing problem.

Add retry logic with 2-second delays to the state change task to work
around these intermittent hangs. Skip retries if the failure is not
SIGTERM (rc=-15) to avoid masking real errors.

* monit tests: capture and display monit.log for debugging

Add tasks in the always block to capture and display the monit log file.
This will help diagnose the intermittent hanging issues by showing what
monit daemon was doing when 'monit status -B' commands hang.

* monit tests: enable verbose logging (-v flag)

Modify the monit systemd service to start with -v flag for verbose
logging. This should provide more detailed information in the monit
log about what's happening when status commands hang.

* monit: add 0.5s delay after state change command

After extensive testing and analysis with verbose logging enabled, identified
that monit's HTTP interface can become temporarily unresponsive immediately
after processing state change commands (stop, start, restart, etc.).

This manifests as intermittent SIGTERM (rc=-15) failures when the module
calls 'monit status -B <service>' to verify the state change. The issue
affects all monit versions tested (5.27-5.35) and is intermittent, suggesting
a race condition or brief lock in monit's HTTP request handling.

Verbose logging confirmed:
- State change commands complete successfully
- HTTP server reports as 'started'
- But subsequent status checks can hang without any log entry

Adding a 0.5 second sleep after sending state change commands gives the
monit daemon time to fully process the command and become responsive again
before the first status verification check.

This complements the existing readiness check after daemon restart and
the retry logic for SIGTERM failures in the tests.

* tests(monit): remove workarounds after module race condition fix

After 10+ successful CI runs with no SIGTERM failures, removing test-level
workarounds that are now redundant due to the 0.5s delay fix in the module:

- Remove 1-second stabilization pause after daemon restart
  The module's built-in 0.5s delay after state changes makes this unnecessary

- Remove retry logic for SIGTERM failures in state change tests
  The race condition is now prevented at the module level

- Remove verbose logging setup and log capture
  Verbose mode didn't log HTTP requests, so it didn't help diagnose the issue
  and adds unnecessary overhead

Kept the readiness check with retries after daemon restart - still needed
to validate daemon is responsive after service restart (different scenario
than the state change race condition).

* restore tasks/main.yml

* monit tests: reduce readiness check retries from 60 to 10

After successful CI runs, observed that monit daemon becomes responsive
within 1-2 seconds after restart. The readiness check typically passes
on the first attempt.

Reducing from 60 retries (30s timeout) to 10 retries (5s timeout) is
more appropriate and allows tests to fail faster if something is
genuinely broken.

* add changelog frag

* Update changelogs/fragments/11255-monit-integrationtests.yml

Co-authored-by: Felix Fontein <felix@fontein.de>

---------

Co-authored-by: Felix Fontein <felix@fontein.de>
2025-12-10 13:29:32 +01:00
Felix Fontein
eaa5e07b28
Adjust YAML files (#10233)
Adjust YAML files.
2025-06-15 09:13:16 +02:00
Felix Fontein
24efe9ee9a
Normalize bools in tests (#5996)
* Normalize bools in tests.

* Fix typo.
2023-02-15 22:55:23 +01:00
Felix Fontein
3dcff121c4
Try to install virtualenv via pip on Arch. (#5116)
ci_complete
2022-08-13 12:08:07 +02:00
Felix Fontein
68e7e52557
Add simple license headers, not completely at top. (#5080) 2022-08-05 21:31:34 +02:00
Felix Fontein
1ab2a5f1bc
Add default license header to files which have no copyright or license header yet (#5074)
* Add default license header to files which have no copyright or license header yet.

* yml extension should have been xml...
2022-08-05 14:03:38 +02:00
Felix Fontein
e1cfa13a1b
python-daemon 2.3.1 requires Python 3+. (#4977) 2022-07-23 12:02:18 +02:00
Felix Fontein
319c29c2a2
Add RHEL 9.0, FreeBSD 13.1, Ubuntu 22.04 and Fedora 36 to CI, fix bug in filesystem module (#4700)
* Add RHEL 9.0 and FreeBSD 13.1 to CI.

* RHEL 9 has no pyOpenSSL apparently.

* Adjust URL for EPEL.

* Fix cargo install on FreeBSD 13.1.

* Add Ubuntu 22.04 and Fedora 36 to CI.

* Fix logic.

* filesystem: do not die output line does not contain ':'

* Skip django_manage tests on RHEL 9 as well.

* homectl tests don't work with RHEL 9.0.

* Improve error handling, improve fatresize output handling.

* Skip Fedora 36.

* Skip filesystem vfat tests on Ubuntu 22.04.

There, resizing fails with a bug:
Bug: Assertion (disk != NULL) at ../../libparted/disk.c:1620 in function ped_disk_get_partition_by_sector() failed.

* 'trusty' is 14.04. Adding 22.04 to skip list.

* Skip jail tests for FreeBSD 13.1.

* Add config for postgres on Ubuntu 22.04.

* Make CentOS 6 happy.

* Adjust postgres version.

* Try installing EPEL a bit differently.

* Skip ufw and iso_extract tests on RHEL 9.

* Skip odbc tests on RHEL 9.

* Skip RHEL 9.0 for snap tests.

* Add changelog fragment for filesystem code changes.
2022-05-22 17:20:30 +02:00
Felix Fontein
24ca69aa05
CI: Remove 'warn:' that's removed in ansible-core 2.14 (#4434)
* Remove 'warn:' that's removed in ansible-core 2.14.

* Install virtualenv when needed.
2022-04-01 22:53:05 +02:00
Simon Kelly
8de1c0c205
monit: fix module detection of monitored process state (#1107)
* refactor and test

* require version >= 5.21.0

Prior to this version the status output was different

* python version compatability

* use exception classes from utils

* modify monit to use 'status' output instead of 'summary' output

The summary output is a fixed width table which truncates the
contents and prevents us from parsing the actual status of the
program.

* add integration tests + fixes

* remove unused handlers in monit integration test

* fix lint

* add '__metaclass__ = type' to integration python files

* raise AttributeError

* simplify status

* lint: add type to parameter docs

* remove lint ignore

* move monit process config into main file

* specify path to monit PID file

* set config location based on os_family

* create required directories

* update aliases to set group and skips

* add changelog

* add author

* add types to docs

* add EPEL repo

* custom vars for centos-6

* uninstall EPEL

* support older versions

* wait for status to change before exiting

* use 'validate' to force status updates

* handle 'execution failed'

* better status output for errors

* add more context to failure + standardize

* don't check rc for validate

* legacy string format support

* add integration test for 'reloaded' and 'present'

* don't wait after reload

* lint

* Revert "uninstall EPEL"

This reverts commit 4d548718d0.

* make 'present' more robust

* Apply suggestions from code review

Co-authored-by: Andrew Klychkov <aaklychkov@mail.ru>

* add license header

* drop daemon.py and use python-daemon instead

* skip python2.6 which is not supported by python-daemon

* refactor test tasks for reuse

* cleanup files after test

* lint

* start process before enabling monit

This shouldn't be necessary but I'm adding it in the hopes
it will make tests more robust.

* retry task

* attempt to rescue the task on failure

* fix indentation

* ignore check if rescue ran

* restart monit instead of reload

Co-authored-by: Andrew Klychkov <aaklychkov@mail.ru>
2020-10-23 12:26:23 +02:00