- Run the workflow ... on the LDAP machine to create the reverse hosted zone:
ipa dnszone-add --name-from-ip=10.0.0.63, or use an IP range (see
https://docs.redhat.com/en/documentation/red_hat_enterprise_linux/7/html/linux_domain_identity_authentication_and_policy_guide/managing-reverse-dns-zones)
- create LDAP replica instance type
- run the workflow to switch the IP interfaces between the LDAP machines
- Check everything is fine (deploy a new machine, login to other machines using DNS)
- Remove old LDAP after some time?
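The reverse zone name that --name-from-ip derives can be computed up front, so the workflow can check whether the zone already exists before adding it. A minimal sketch, assuming the tenancy networks are /24 (the example IP 10.0.0.63 is the one from these notes):

```shell
#!/bin/sh
# Compute the in-addr.arpa zone name that `ipa dnszone-add --name-from-ip`
# would create for an address on a /24 network (assumption: /24 tenancies).
reverse_zone() {
  ip=$1
  o1=${ip%%.*}; rest=${ip#*.}
  o2=${rest%%.*}; rest=${rest#*.}
  o3=${rest%%.*}
  echo "$o3.$o2.$o1.in-addr.arpa."
}

# Against a real IPA server (needs a Kerberos ticket):
# ipa dnszone-find "$(reverse_zone 10.0.0.63)" >/dev/null 2>&1 \
#   || ipa dnszone-add --name-from-ip=10.0.0.63
```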
On CentOS 7 with version 4.6.8 we tried updating the libraries, but DNS broke:
[root@ldap murdaca]# systemctl status -l named-pkcs11.service
● named-pkcs11.service - Berkeley Internet Name Domain (DNS) with native PKCS#11
   Loaded: loaded (/usr/lib/systemd/system/named-pkcs11.service; disabled; vendor preset: disabled)
   Active: failed (Result: signal) since Thu 2024-10-17 15:24:59 UTC; 13h ago
  Process: 11930 ExecStart=/usr/sbin/named-pkcs11 -u named -c ${NAMEDCONF} $OPTIONS (code=exited, status=0/SUCCESS)
  Process: 11927 ExecStartPre=/bin/bash -c if [ ! "$DISABLE_ZONE_CHECKING" == "yes" ]; then /usr/sbin/named-checkconf -z "$NAMEDCONF"; else echo "Checking of zone files is disabled"; fi (code=exited, status=0/SUCCESS)
 Main PID: 11932 (code=killed, signal=ABRT)

Oct 17 15:24:59 ldap.batchpro.ewc named-pkcs11[11932]: #6 0x7f26fc529b89 in ??
Oct 17 15:24:59 ldap.batchpro.ewc named-pkcs11[11932]: #7 0x7f26fc533358 in ??
Oct 17 15:24:59 ldap.batchpro.ewc named-pkcs11[11932]: #8 0x7f270b9aa713 in ??
Oct 17 15:24:59 ldap.batchpro.ewc named-pkcs11[11932]: #9 0x7f270b9ab28b in ??
Oct 17 15:24:59 ldap.batchpro.ewc named-pkcs11[11932]: #10 0x7f2709a80ea5 in ??
Oct 17 15:24:59 ldap.batchpro.ewc named-pkcs11[11932]: #11 0x7f2708af3b0d in ??
Oct 17 15:24:59 ldap.batchpro.ewc named-pkcs11[11932]: exiting (due to assertion failure)
Oct 17 15:24:59 ldap.batchpro.ewc systemd[1]: named-pkcs11.service: main process exited, code=killed, status=6/ABRT
Oct 17 15:24:59 ldap.batchpro.ewc systemd[1]: Unit named-pkcs11.service entered failed state.
Oct 17 15:24:59 ldap.batchpro.ewc systemd[1]: named-pkcs11.service failed.
Check all the logs at /var/named/data/named.run:
18-Oct-2024 06:17:54.898 managed-keys-zone: loaded serial 734
18-Oct-2024 06:17:54.898 zone 0.in-addr.arpa/IN: loaded serial 0
18-Oct-2024 06:17:54.899 zone 1.0.0.127.in-addr.arpa/IN: loaded serial 0
18-Oct-2024 06:17:54.900 zone 1.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.ip6.arpa/IN: loaded serial 0
18-Oct-2024 06:17:54.901 zone localhost/IN: loaded serial 0
18-Oct-2024 06:17:54.902 zone localhost.localdomain/IN: loaded serial 0
18-Oct-2024 06:17:54.902 all zones loaded
18-Oct-2024 06:17:54.902 running
18-Oct-2024 06:17:54.903 ../../../lib/dns-pkcs11/name.c:1114: REQUIRE((target != ((void *)0) && (__builtin_expect(!!((target) != ((void *)0)), 1) && __builtin_expect(!!(((const isc__magic_t *)(target))->magic == (0x42756621U)), 1))) || (target == ((void *)0) && (__builtin_expect(!!((name->buffer) != ((void *)0)), 1) && __builtin_expect(!!(((const isc__magic_t *)(name->buffer))->magic == (0x42756621U)), 1)))) failed, back trace
18-Oct-2024 06:17:54.903 #0 0x564390897130 in ??
18-Oct-2024 06:17:54.903 #1 0x7fcdba9d048a in ??
18-Oct-2024 06:17:54.903 #2 0x7fcdbacd5b7b in ??
18-Oct-2024 06:17:54.903 #3 0x7fcda7540b78 in ??
18-Oct-2024 06:17:54.903 #4 0x7fcda7540fee in ??
18-Oct-2024 06:17:54.903 #5 0x7fcda754298b in ??
18-Oct-2024 06:17:54.903 #6 0x7fcda7542b89 in ??
18-Oct-2024 06:17:54.903 #7 0x7fcda754c358 in ??
18-Oct-2024 06:17:54.903 #8 0x7fcdba9f5713 in ??
18-Oct-2024 06:17:54.903 #9 0x7fcdba9f628b in ??
18-Oct-2024 06:17:54.903 #10 0x7fcdb8acbea5 in ??
18-Oct-2024 06:17:54.903 #11 0x7fcdb7b3eb0d in ??
18-Oct-2024 06:17:54.903 exiting (due to assertion failure)
Admins part
Procedure for existing LDAPs
Prerequisites
ansible script:
- iterate on every openstack from vault
- create app credentials
- create vault entry under ewc-tenant-credentials
- create app role for each tenancy entry in ewc-tenant-credentials and add it to the corresponding Cypher in each tenancy
- every Cypher of every tenancy has the following info to access Vault:
- secret/vault_approleid
- secret/vault_secretid
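The two secrets above let each tenancy's Cypher log in to Vault via the standard AppRole endpoint. A sketch of that exchange, where VAULT_ADDR and how the IDs are read out of secret/vault_approleid and secret/vault_secretid are assumptions; DRY_RUN=1 prints the request instead of sending it:

```shell
#!/bin/sh
# Trade an AppRole role_id/secret_id pair for a Vault token.
# Endpoint is Vault's standard /v1/auth/approle/login; VAULT_ADDR is assumed.
approle_login() {
  role_id=$1 secret_id=$2
  payload="{\"role_id\":\"$role_id\",\"secret_id\":\"$secret_id\"}"
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "POST ${VAULT_ADDR:-https://vault.example:8200}/v1/auth/approle/login $payload"
  else
    curl -s --request POST --data "$payload" "$VAULT_ADDR/v1/auth/approle/login"
  fi
}
```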
ansible script:
- iterate on every openstack from vault
- add port 636 to ldap security group
changes:
- every new tenant gets port 636 added for ldap security group
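The rule that the ansible change adds can also be expressed as a plain openstack CLI call; the security group name "ldap" is an assumption, and DRY_RUN=1 prints the command instead of executing it:

```shell
#!/bin/sh
# Add an LDAPS (TCP 636) ingress rule to a security group.
# Security group name is a placeholder for the tenancy's ldap group.
add_ldaps_rule() {
  sg=$1
  cmd="openstack security group rule create --protocol tcp --dst-port 636 $sg"
  if [ "${DRY_RUN:-0}" = "1" ]; then echo "$cmd"; else $cmd; fi
}
# add_ldaps_rule ldap
```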
Back up the existing LDAP server before starting.
run workflow to create dns reverse zone into LDAP
New instance type replica
Creating a new LDAP to migrate to
- deploy a new machine with
- rocky 9
- new ldap security group
- plan eo1.medium
- create a local user with sudo rights
- become sudo
- stop and disable firewalld on the machine: systemctl stop firewalld && systemctl disable firewalld
- add the old LDAP IP and the domain of the machine acting as LDAP, on one line, to /etc/hosts on this machine
[test234@test-podman2 ~]$ cat /etc/hosts
<!-- BEGIN ANSIBLE MANAGED BLOCK -->
10.0.0.133 test-podman2.eumetsat.sandbox.ewc
10.0.0.63 <OLD LDAP complete domain (e.g. ldap.eumetsat.sandbox.ewc)>
<!-- END ANSIBLE MANAGED BLOCK -->
- Install the ipa-replica-install command (dnf install ipa-server ipa-server-dns -y; the second package is required for the DNS options of ipa-replica-install)
- Run the following command to install the LDAP replica
ipa-replica-install --domain "eumetsat.sandbox.ewc" --principal sandbox-ldap-admin --admin-password hunter2 --server <OLD LDAP complete domain (e.g. ldap.eumetsat.sandbox.ewc)> --setup-ca --setup-dns --force-join --forwarder=8.8.8.8 --forwarder=1.1.1.1 --no-host-dns
add --verbose for more logs
on centos7 version 4.6.8
go ahead with the default NetBIOS name, and answer yes when asked about installing the ipa-sidgen task for users
first error:
[4/30]: creating installation admin user
Unable to log in as uid=admin-new-ldap.datapro.ewc,ou=people,o=ipaca on ldap://ldap.datapro.ewc:389
[hint] tune with replication_wait_timeout
[error] NotFound: uid=admin-new-ldap.datapro.ewc,ou=people,o=ipaca did not replicate to ldap://ldap.datapro.ewc:389
Your system may be partly configured.
Run /usr/sbin/ipa-server-install --uninstall to clean up.
uid=admin-new-ldap.datapro.ewc,ou=people,o=ipaca did not replicate to ldap://ldap.datapro.ewc:389
The ipa-replica-install command failed. See /var/log/ipareplica-install.log for more information

MORE LOGS:
2024-10-18T08:59:21Z DEBUG Waiting 300 seconds for uid=admin-new-ldap.datapro.ewc,ou=people,o=ipaca to appear on ldap://ldap.datapro.ewc:389
2024-10-18T09:04:22Z ERROR Unable to log in as uid=admin-new-ldap.datapro.ewc,ou=people,o=ipaca on ldap://ldap.datapro.ewc:389
2024-10-18T09:04:22Z INFO [hint] tune with replication_wait_timeout
2024-10-18T09:04:22Z DEBUG Traceback (most recent call last):
  File "/usr/lib/python3.9/site-packages/ipaserver/install/service.py", line 686, in start_creation
    run_step(full_msg, method)
  File "/usr/lib/python3.9/site-packages/ipaserver/install/service.py", line 672, in run_step
    method()
  File "/usr/lib/python3.9/site-packages/ipaserver/install/dogtaginstance.py", line 789, in setup_admin
    raise errors.NotFound(
ipalib.errors.NotFound: uid=admin-new-ldap.datapro.ewc,ou=people,o=ipaca did not replicate to ldap://ldap.datapro.ewc:389
2024-10-18T09:04:22Z DEBUG [error] NotFound: uid=admin-new-ldap.datapro.ewc,ou=people,o=ipaca did not replicate to ldap://ldap.datapro.ewc:389
2024-10-18T09:04:22Z DEBUG Removing /root/.dogtag/pki-tomcat/ca
2024-10-18T09:04:22Z DEBUG
  File "/usr/lib/python3.9/site-packages/ipapython/admintool.py", line 180, in execute
    return_value = self.run()
  File "/usr/lib/python3.9/site-packages/ipapython/install/cli.py", line 344, in run
    return cfgr.run()
  File "/usr/lib/python3.9/site-packages/ipapython/install/core.py", line 360, in run
    return self.execute()
  File "/usr/lib/python3.9/site-packages/ipapython/install/core.py", line 386, in execute
    for rval in self._executor():
  File "/usr/lib/python3.9/site-packages/ipapython/install/core.py", line 435, in __runner
    exc_handler(exc_info)
  File "/usr/lib/python3.9/site-packages/ipapython/install/core.py", line 468, in _handle_execute_exception
    self._handle_exception(exc_info)
  File "/usr/lib/python3.9/site-packages/ipapython/install/core.py", line 458, in _handle_exception
    six.reraise(*exc_info)
  File "/usr/lib/python3.9/site-packages/six.py", line 709, in reraise
    raise value
  File "/usr/lib/python3.9/site-packages/ipapython/install/core.py", line 425, in __runner
    step()
  File "/usr/lib/python3.9/site-packages/ipapython/install/core.py", line 419, in step_next
    return next(self.__gen)
  File "/usr/lib/python3.9/site-packages/ipapython/install/util.py", line 81, in run_generator_with_yield_from
    six.reraise(*exc_info)
  File "/usr/lib/python3.9/site-packages/six.py", line 709, in reraise
    raise value
  File "/usr/lib/python3.9/site-packages/ipapython/install/util.py", line 59, in run_generator_with_yield_from
    value = gen.send(prev_value)
  File "/usr/lib/python3.9/site-packages/ipapython/install/core.py", line 663, in _configure
    next(executor)
  File "/usr/lib/python3.9/site-packages/ipapython/install/core.py", line 435, in __runner
    exc_handler(exc_info)
  File "/usr/lib/python3.9/site-packages/ipapython/install/core.py", line 468, in _handle_execute_exception
    self._handle_exception(exc_info)
  File "/usr/lib/python3.9/site-packages/ipapython/install/core.py", line 526, in _handle_exception
    self.__parent._handle_exception(exc_info)
  File "/usr/lib/python3.9/site-packages/ipapython/install/core.py", line 458, in _handle_exception
    six.reraise(*exc_info)
  File "/usr/lib/python3.9/site-packages/six.py", line 709, in reraise
    raise value
  File "/usr/lib/python3.9/site-packages/ipapython/install/core.py", line 523, in _handle_exception
    super(ComponentBase, self)._handle_exception(exc_info)
  File "/usr/lib/python3.9/site-packages/ipapython/install/core.py", line 458, in _handle_exception
    six.reraise(*exc_info)
  File "/usr/lib/python3.9/site-packages/six.py", line 709, in reraise
    raise value
  File "/usr/lib/python3.9/site-packages/ipapython/install/core.py", line 425, in __runner
    step()
  File "/usr/lib/python3.9/site-packages/ipapython/install/core.py", line 419, in step_next
    return next(self.__gen)
  File "/usr/lib/python3.9/site-packages/ipapython/install/util.py", line 81, in run_generator_with_yield_from
    six.reraise(*exc_info)
  File "/usr/lib/python3.9/site-packages/six.py", line 709, in reraise
    raise value
  File "/usr/lib/python3.9/site-packages/ipapython/install/util.py", line 59, in run_generator_with_yield_from
    value = gen.send(prev_value)
  File "/usr/lib/python3.9/site-packages/ipapython/install/common.py", line 65, in _install
    for unused in self._installer(self.parent):
  File "/usr/lib/python3.9/site-packages/ipaserver/install/server/__init__.py", line 599, in main
    replica_install(self)
  File "/usr/lib/python3.9/site-packages/ipaserver/install/server/replicainstall.py", line 401, in decorated
    func(installer)
  File "/usr/lib/python3.9/site-packages/ipaserver/install/server/replicainstall.py", line 1392, in install
    ca.install(False, config, options, custodia=custodia)
  File "/usr/lib/python3.9/site-packages/ipaserver/install/ca.py", line 354, in install
    install_step_0(standalone, replica_config, options, custodia=custodia)
  File "/usr/lib/python3.9/site-packages/ipaserver/install/ca.py", line 423, in install_step_0
    ca.configure_instance(
  File "/usr/lib/python3.9/site-packages/ipaserver/install/cainstance.py", line 505, in configure_instance
    self.start_creation(runtime=runtime)
  File "/usr/lib/python3.9/site-packages/ipaserver/install/service.py", line 686, in start_creation
    run_step(full_msg, method)
  File "/usr/lib/python3.9/site-packages/ipaserver/install/service.py", line 672, in run_step
    method()
  File "/usr/lib/python3.9/site-packages/ipaserver/install/dogtaginstance.py", line 789, in setup_admin
    raise errors.NotFound(
2024-10-18T09:04:22Z DEBUG The ipa-replica-install command failed, exception: NotFound: uid=admin-new-ldap.datapro.ewc,ou=people,o=ipaca did not replicate to ldap://ldap.datapro.ewc:389
2024-10-18T09:04:22Z ERROR uid=admin-new-ldap.datapro.ewc,ou=people,o=ipaca did not replicate to ldap://ldap.datapro.ewc:389
2024-10-18T09:04:22Z ERROR The ipa-replica-install command failed. See /var/log/ipareplica-install.log for more information
solution:
- bug https://lists.fedorahosted.org/archives/list/freeipa-users@lists.fedorahosted.org/thread/TK2HCUU2RXRTJVCM3KPVOIGR5J23M4FO/?sort=date → https://lists.fedoraproject.org/archives/list/freeipa-users@lists.fedorahosted.org/message/IHIPPVMMIWV2TL7BNLW55XII3OIQ62HK/
- bug to be considered https://lists.fedoraproject.org/archives/list/freeipa-users@lists.fedorahosted.org/thread/5VGR7DFU4XO63X6KB4ETKSGLKP4A2LWP/
The current LDAPs run version 4.6.8 with Python 2.7, so the fix from the bug above has to be applied in /usr/lib/python2.7/site-packages/ipaserver/secrets/store.py, in the PEMFileHandler class's export_key method (line 211), adding:
'-keypbe', 'AES-256-CBC', '-certpbe', 'AES-256-CBC', '-macalg', 'sha384',
11.1 ipa dnszone-mod <DNS ZONE you find with ipa dnszone-find> --name-server=<NEWLDAP complete domain (e.g. ldap-test-rocky.eumetsat.sandbox.ewc)>
- on the new ldap machine, specify which ldap is to be the renewal master (using command: ipa-csreplica-manage set-renewal-master)
DON'T DO THIS (see the comment in Next steps on why) → create an A record for the existing ldap domain to point to the new LDAP: ipa dnsrecord-add eumetsat.sandbox.ewc. ldap --a-rec <NEW_LDAP_IP>
- Modify the DNS primary in the Network (cannot be done by a user unless we grant permissions to modify networks) and wait some time (1 week?), or perform a manual action on each machine to apply the change (lease time is ~5.2 hours):
- for Ubuntu: restart the machine or the network (systemctl restart networking), or the change may be picked up after a while (TBT)
- for Rocky 9: NetworkManager takes priority over /etc/resolv.conf
- for Rocky 8: happening in sandbox but not in mcat; one option is to remove /etc/resolv.conf and run dhclient, or ??
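The per-distro manual actions above can be sketched as a helper that prints the suggested command for each OS. The Ubuntu and Rocky 8 commands come from these notes and are still to be tested (TBT); the Rocky 9 one is an assumption, since NetworkManager owns /etc/resolv.conf there:

```shell
#!/bin/sh
# Print the suggested DNS-refresh action for a given OS.
# rocky9's command is an assumption (ask NetworkManager to rewrite resolv.conf).
dns_refresh_cmd() {
  case $1 in
    ubuntu) echo "systemctl restart networking" ;;
    rocky9) echo "nmcli general reload dns-rc" ;;
    rocky8) echo "rm -f /etc/resolv.conf && dhclient" ;;
    *) echo "unknown OS: $1" >&2; return 1 ;;
  esac
}
```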
- Change the secret/ldap_hostname to point to the new ipa host
Run sudo ipactl stop on the old LDAP machine (check whether this step can be removed on another test run)
- ipa-replica-manage del ldap.eumetsat.sandbox.ewc --force to delete the replica ldap
Missing part
/etc/resolv.conf needs to be manually updated??
Next steps:
- reboot more VMs to see if they recover → no change
- understand why resolv.conf didn't update → for Rocky it only changes on newly deployed machines via cloud-init, not on rebooted machines; Ubuntu has an internal DNS server
- for Ubuntu: restart the machine or the network (systemctl restart networking), or the change may be picked up after a while (TBT)
- for Rocky: remove /etc/resolv.conf and run dhclient
- create a fresh VM to see if the ldap workflow works → tested; we had to remove the ldap A record and therefore change the secret/ldap_hostname for it to work
- destroy the new ldap and recover the old one, then try to repeat the migration without stopping IPA on the old LDAP when deleting it → we have to specify which ldap is to be the renewal master (using command: ipa-csreplica-manage set-renewal-master)
Automation to be run from any machine beside LDAP
alternative path from 11.1
12. detach the interface from the old LDAP
13. update the IP in /etc/hosts and remove the old LDAP record
14. ipa-replica-manage del ldap.eumetsat.sandbox.ewc --force to delete the replica ldap
15. replace the new ldap IP with the IP of the interface of the old LDAP →
ipa dnsrecord-mod eumetsat.sandbox.ewc. ipa-ca --a-rec OLD_LDAP_IP
ipa dnsrecord-mod eumetsat.sandbox.ewc. NEW_ldap_hostname --a-rec OLD_LDAP_IP
16. switch off the new LDAP
17. remove the interface
18. add an interface with the IP of the old LDAP
19. restart the new LDAP machine
20. add security groups to the new LDAP
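The interface swap (steps 12 and 16-18) can be sketched as openstack CLI calls. Server and network names are placeholders, the exact interface handling is an assumption to verify against the tenancy setup, and DRY_RUN=1 prints the commands for review instead of executing them:

```shell
#!/bin/sh
# Echo commands in dry-run mode, execute them otherwise.
run() { if [ "${DRY_RUN:-0}" = "1" ]; then echo "$*"; else "$@"; fi; }

# Detach the old LDAP's interface and reattach its IP to the new LDAP.
swap_ldap_ip() {
  old_server=$1 new_server=$2 network=$3 old_ip=$4
  run openstack server remove network "$old_server" "$network"   # 12. detach from old LDAP
  run openstack server remove network "$new_server" "$network"   # 17. remove new LDAP interface
  run openstack server add fixed ip --fixed-ip-address "$old_ip" "$new_server" "$network"  # 18. reuse old IP
}
```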
Test
21. deploy a new machine
22. login with DNS
v 4.6.8 default
Procedure for new LDAPs
- Modify the ldap base image to be rocky 9
LDAP machine
- check if the reverse zone exists and add the DNS reverse zone to the deployment
- populate it for all existing machines in the tenancy (needs a script dumping the forward zone)
- include reverse zone flag (nope — we already have this and it seems not to work for private IP ranges, so we have to do it manually) → create the reverse zone manually in the workflow, probably by just adding
ipa dnszone-add --name-from-ip=192.0.2.0/24
to the ansible
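A possible shape for the "script dumping the forward zone" mentioned above, which turns each A record into the matching PTR record. The awk parsing of ipa dnsrecord-find output and the /24 reverse zone are assumptions to verify against real output; DRY_RUN=1 prints the commands instead of running them:

```shell
#!/bin/sh
# Read "name ip" pairs on stdin and emit/run an `ipa dnsrecord-add` for the
# PTR record of each one (assumes a /24 reverse zone, so the record name is
# the last octet).
populate_reverse() {
  zone=$1 rzone=$2
  while read -r name ip; do
    [ -n "$ip" ] || continue
    last=${ip##*.}   # host part of the /24 address
    cmd="ipa dnsrecord-add $rzone $last --ptr-rec=$name.$zone."
    if [ "${DRY_RUN:-0}" = "1" ]; then echo "$cmd"; else $cmd; fi
  done
}

# Real invocation sketch (needs a Kerberos ticket):
# ipa dnsrecord-find eumetsat.sandbox.ewc --raw \
#   | awk '/idnsname:/ {n=$2} /arecord:/ {print n, $2}' \
#   | populate_reverse eumetsat.sandbox.ewc 0.0.10.in-addr.arpa
```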
enrolling workflow
- we modify the enroll script to include --subids and we add an extra line at the end of the default workflow for machines to run
sudo rpm --restore shadow-utils
- Remove /etc/NetworkManager/conf.d/99-cloud-init.conf
Write a script that verifies whether a machine is gone (how? no ping? for a long time?) and destroys ldap entries relating to it. Put this in the crontab on all ldaps.
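A sketch for that cleanup cron job. "Gone" is approximated here as a failed ping and the deletion uses ipa host-del --updatedns; both are assumptions to validate before putting this in crontab. The check command is injectable (CHECK_CMD) so the logic can be exercised without network access, and DRY_RUN=1 prints deletions instead of performing them:

```shell
#!/bin/sh
# Read host names on stdin; delete the IPA entry (and its DNS records) for
# any host the check command cannot reach. Assumption: one failed ping run
# means "gone" — tighten this before real use.
prune_stale_hosts() {
  while read -r host; do
    [ -n "$host" ] || continue
    if ! ${CHECK_CMD:-ping -c 3 -W 2} "$host" >/dev/null 2>&1; then
      if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "ipa host-del $host --updatedns"
      else
        ipa host-del "$host" --updatedns
      fi
    fi
  done
}

# Cron usage sketch (list enrolled hosts, prune unreachable ones):
# ipa host-find --raw | awk '/fqdn:/ {print $2}' | prune_stale_hosts
```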