This chapter describes a range of common problems, chosen to cover as many types of potential problems as possible. That way, even if your precise situation is not listed here, a similar one might offer hints toward a solution.
Linux logs things in a fair amount of detail. There are several places to look when you have problems with your system, most of which are standard to Linux systems in general and some of which are peculiar to SUSE Linux Enterprise systems. Most log files can also be viewed with YaST (+).
YaST offers the possibility to collect all system information needed by the support team. Use +. Select the problem category. When all information is gathered, attach it to your support request.
The following is a list of the most commonly checked log files and what they typically contain.
Table 51.1. Log Files
| Log File | Description |
|---|---|
| | Messages from the kernel during the boot process. |
| | Messages from the mail system. |
| | Ongoing messages from the kernel and system log daemon when running. |
| | Log file from NetworkManager to collect problems with network connectivity. |
| | Hardware messages from the SaX display and KVM system. |
| | Messages from the desktop applications currently running. Replace |
| | All messages from the kernel and system log daemon assigned WARNING level or higher. |
| | Binary file containing user login records for the current machine session. View it with last. |
| | Various start-up and runtime logs from the X Window system. It is useful for debugging failed X start-ups. |
| | Directory containing YaST's actions and their results. |
| | Directory containing Samba server and client log messages. |
Apart from log files, your machine also supplies you with information about the running system. See Table 51.2: System Information.
Table 51.2. System Information
| File | Description |
|---|---|
| | This displays processor information, including its type, make, model, and performance. |
| | This shows which DMA channels are currently being used. |
| | This shows which interrupts are in use and how many of each have been in use. |
| | This displays the status of I/O (input/output) memory. |
| | This shows which I/O ports are in use at the moment. |
| | This displays memory status. |
| | This displays the individual modules. |
| | This displays devices currently mounted. |
| | This shows the partitioning of all hard disks. |
| | This displays the current version of Linux. |
Linux comes with a number of tools for system analysis and monitoring. See Chapter 17, System Monitoring Utilities for a selection of the most important ones used in system diagnostics.
Each scenario included in the following begins with a header describing the problem followed by a paragraph or two offering suggested solutions, available references for more detailed solutions, and cross-references to other scenarios that might be related.
Installation problems are situations in which a machine fails to install. The installation may fail entirely, or the graphical installer may not start. This section highlights some typical problems you might run into and offers possible solutions or workarounds for these kinds of situations.
If you encounter any problems using the SUSE Linux Enterprise installation media, you can check the integrity of your installation media with +. Media problems are more likely to occur with media you burn yourself. To check a SUSE Linux Enterprise CD or DVD, insert the medium into the drive and click for YaST to check the MD5 checksum of the medium. This may take several minutes. If errors are detected, do not use this medium for installation.
Display detected hardware and technical data using +. Click any node of the tree for more information about a device. This module is especially useful, for example, when submitting a support request for which you need information about your hardware.
Save the hardware information displayed to a file by clicking . Select the desired directory and filename then click to create the file.
If your computer does not contain a bootable CD or DVD-ROM drive or if the one you have is not supported by Linux, there are several options for installing your machine without a need for a built-in CD or DVD drive:
Create a boot floppy and boot from floppy disk instead of CD or DVD.
If it is supported by the machine's BIOS and the installation kernel, boot for installation from external CD or DVD drives.
If a machine lacks a CD or DVD drive, but provides a working ethernet connection, perform a completely network-based installation. See Section 4.1.3, “Remote Installation via VNC—PXE Boot and Wake on LAN” and Section 4.1.6, “Remote Installation via SSH—PXE Boot and Wake on LAN” for details.
On some older computers, there is no bootable CD-ROM drive available, but a floppy disk drive. To install on such a system, create boot disks and boot your system with them.
The boot disks include the loader SYSLINUX and the program linuxrc. SYSLINUX enables the selection of a kernel during the boot procedure and the specification of any parameters needed for the hardware used. The program linuxrc supports the loading of kernel modules for your hardware and subsequently starts the installation.
When booting from a boot disk, the boot procedure is initiated by the boot loader
SYSLINUX (package syslinux). When the system is booted, SYSLINUX runs
a minimum hardware detection that mainly consists of the following steps:
The program checks if the BIOS provides VESA 2.0–compliant framebuffer support and boots the kernel accordingly.
The monitor data (DDC info) is read.
The first block of the first hard disk (MBR) is read to map BIOS IDs to Linux device names during the boot loader configuration. The program attempts to read the block by means of the lba32 functions of the BIOS to determine if the BIOS supports these functions.
If you keep Shift pressed when SYSLINUX starts, all these steps are skipped. For troubleshooting purposes, insert the line
verbose 1
in syslinux.cfg for the boot loader to display which action is
currently being performed.
If the machine does not boot from the floppy disk, you may need to change the boot
sequence in the BIOS to A,C,CDROM.
Most CD-ROM drives are supported. If problems arise when booting from the CD-ROM drive, try booting CD 2 of the CD set.
If the system does not have a CD-ROM or floppy disk, it is still possible that an external CD-ROM, connected with USB, FireWire, or SCSI, can be used to boot the system. This depends largely on the interaction of the BIOS and the hardware used. Sometimes a BIOS update may help if you encounter problems.
There are two possible reasons for a machine not to boot for installation:
Your CD-ROM drive might not be able to read the boot image on CD 1. In this case, use CD 2 to boot the system. CD 2 contains a conventional 2.88 MB boot image that can be read even by unsupported drives and allows you to perform the installation over the network as described in Chapter 4, Remote Installation.
The BIOS boot sequence must have CD-ROM set as the first entry for booting. Otherwise the machine would try to boot from another medium, typically the hard disk. Guidance for changing the BIOS boot sequence can be found in the documentation provided with your motherboard or in the following paragraphs.
The BIOS is the software that enables the very basic functions of a computer. Motherboard vendors provide a BIOS specifically made for their hardware. Normally, the BIOS setup can only be accessed at a specific time—when the machine is booting. During this initialization phase, the machine performs a number of diagnostic hardware tests. One of them is a memory check, indicated by a memory counter. When the counter appears, look for a line, usually below the counter or somewhere at the bottom, mentioning the key to press to access the BIOS setup. Usually the key to press is Del, F1, or Esc. Press this key until the BIOS setup screen appears.
Procedure 51.1. Changing the BIOS Boot Sequence
Enter the BIOS using the proper key as announced by the boot routines and wait for the BIOS screen to appear.
To change the boot sequence in an AWARD BIOS, look for the entry. Other manufacturers may have a different name for this, such as . When you have found the entry, select it and confirm with Enter.
In the screen that opens, look for a subentry called .
The boot sequence is often set to something like C,A or
A,C. In the former case, the machine first searches the hard disk (C) then
the floppy drive (A) to find a bootable medium. Change the settings by pressing
PgUp or PgDown until the sequence is
A,CDROM,C.
Leave the BIOS setup screen by pressing Esc. To save the changes, select or press F10. To confirm that your settings should be saved, press Y.
Procedure 51.2. Changing the Boot Sequence in a SCSI BIOS (Adaptec Host Adapter)
Open the setup by pressing Ctrl+A.
Select , which displays the connected hardware components.
Make note of the SCSI ID of your CD-ROM drive.
Exit the menu with Esc.
Open . Under , select and press Enter.
Enter the ID of the CD-ROM drive and press Enter again.
Press Esc twice to return to the start screen of the SCSI BIOS.
Exit this screen and confirm with to boot the computer.
Regardless of what language and keyboard layout your final installation will be using, most BIOS configurations use the US keyboard layout as depicted in the following figure:
Some hardware types, mainly fairly old or very recent ones, fail to install. In many cases, this happens because support for this type of hardware is missing from the installation kernel or because certain functionality included in this kernel, such as ACPI, still causes problems on some hardware.
If your system fails to install using the standard mode from the first installation boot screen, try the following:
With the first CD or DVD still in the CD-ROM drive, reboot the machine with Ctrl+Alt+Del or using the hardware reset button.
When the boot screen appears, use the arrow keys of your keyboard to navigate to and press Enter to launch the boot and installation process. This option disables the support for ACPI power management techniques.
Proceed with the installation as described in Chapter 3, Installation with YaST.
If this fails, proceed as above, but choose instead. This option disables ACPI and DMA support. Most hardware should boot with this option.
If both of these options fail, use the boot options prompt to pass any additional
parameters needed to support this type of hardware to the installation kernel. For more
information about the parameters available as boot options, refer to the kernel documentation
located in /usr/src/linux/Documentation/kernel-parameters.txt.
Obtaining Kernel Documentation
Install the
There are various other ACPI-related kernel parameters that can be entered at the boot prompt prior to booting for installation:
acpi=off
This parameter disables the complete ACPI subsystem on your computer. This may be useful if your computer cannot handle ACPI at all or if you think ACPI in your computer causes trouble.
acpi=force
Always enable ACPI even if your computer has an old BIOS dated before the year 2000.
This parameter also enables ACPI if it is set in addition to
acpi=off.
acpi=noirq
Do not use ACPI for IRQ routing.
acpi=ht
Run only enough ACPI to enable hyper-threading.
acpi=strict
Be less tolerant of platforms that are not strictly ACPI specification compliant.
pci=noacpi
Disable PCI IRQ routing of the new ACPI system.
pnpacpi=off
This option is for serial or parallel port problems when your BIOS setup contains wrong interrupts or ports.
notsc
Disable the time stamp counter. This option can be used to work around timing problems on your system. It is a new feature; if you see regressions on your machine, especially time-related ones or even total hangs, this option is worth a try.
nohz=off
Disable the nohz feature. If your machine hangs, this option might help. Generally, you do not need it.
Once you have determined the right parameter combination, YaST automatically writes it to the boot loader configuration to make sure that the system boots properly next time.
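To confirm which of these parameters a running kernel was actually booted with, you can inspect the kernel command line. The following is a minimal sketch, assuming the standard Linux file /proc/cmdline is available. The helper name has_boot_param is hypothetical and not part of any SUSE tool.

```shell
# has_boot_param PARAM [CMDLINE_FILE]
# Succeeds if PARAM appears as a complete parameter on the kernel command line.
# CMDLINE_FILE defaults to /proc/cmdline (assumption: standard Linux procfs).
has_boot_param() {
    param=$1
    file=${2:-/proc/cmdline}
    # Put each parameter on its own line, then look for an exact match.
    tr -s ' \t' '\n\n' < "$file" | grep -qxF -- "$param"
}

# Example (hypothetical usage):
#   has_boot_param acpi=off && echo "ACPI is disabled for this boot"
```

Matching whole parameters rather than substrings keeps a query for acpi=off from also matching pnpacpi=off.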
If unexplainable errors occur when the kernel is loaded or during the installation, select in the boot menu to check the memory. If returns an error, it is usually a hardware error.
After you insert the first CD or DVD into your drive and reboot your machine, the installation screen comes up, but after you select , the graphical installer does not start.
There are several ways to deal with this situation:
Try to select another screen resolution for the installation dialogs.
Select for installation.
Do a remote installation via VNC using the graphical installer.
To change to another screen resolution for installation, proceed as follows:
Boot for installation.
Press F3 to open a menu from which to select a lower resolution for installation purposes.
Select and proceed with the installation as described in Chapter 3, Installation with YaST.
To perform an installation in text mode, proceed as follows:
Boot for installation.
Press F3 and select .
Select and proceed with the installation as described in Chapter 3, Installation with YaST.
To perform a VNC installation, proceed as follows:
Boot for installation.
Enter the following text at the boot options prompt:
vnc=1 vncpassword=some_password
Replace some_password with the password to use for installation.
Select then press Enter to start the installation.
Instead of starting right into the graphical installation routine, the system continues to run in text mode then halts, displaying a message containing the IP address and port number at which the installer can be reached via a browser interface or a VNC viewer application.
If using a browser to access the installer, launch the browser and enter the address information provided by the installation routines on the future SUSE Linux Enterprise machine and hit Enter:
http://ip_address_of_machine:5801
A dialog opens in the browser window prompting you for the VNC password. Enter it and proceed with the installation as described in Chapter 3, Installation with YaST.
Installation via VNC works with any browser under any operating system, provided Java support is enabled.
If you use any kind of VNC viewer on your preferred operating system, enter the IP address and password when prompted to do so. A window opens, displaying the installation dialogs. Proceed with the installation as usual.
You inserted the first CD or DVD into the drive, the BIOS routines are finished, but the system does not start with the graphical boot screen. Instead it launches a very minimalistic text-based interface. This might happen on any machine not providing sufficient graphics memory for rendering a graphical boot screen.
Although the text boot screen looks minimalistic, it provides nearly the same functionality as the graphical one:
Unlike the graphical interface, the different boot options cannot be selected using the cursor keys of your keyboard. The boot menu of the text mode boot screen offers some keywords to enter at the boot prompt. These keywords map to the options offered in the graphical version. Enter your choice and hit Enter to launch the boot process.
After selecting a boot option, enter the appropriate keyword at the boot prompt or enter some custom boot options as described in Section 51.2.5, “Fails to Boot”. To launch the installation process, press Enter.
Use the F keys to determine the screen resolution for installation. If you need to boot in text mode, choose F3.
Boot problems are situations when your system does not boot properly (does not boot to the expected runlevel and login screen).
If the hardware is functioning properly, it is possible that the boot loader has become corrupted and Linux cannot start on the machine. In this case, it is necessary to reinstall the boot loader. To reinstall the boot loader, proceed as follows:
Insert the installation media into the drive.
Reboot the machine.
Select from the boot menu.
Select a language.
Accept the license agreement.
In the screen, select and set the installation mode to .
Once in the YaST System Repair module, select then select .
Restore the original settings and reinstall the boot loader.
Leave YaST System Repair and reboot the system.
If, for some reason, the graphical interface does not come up or you prefer to repair the system manually, refer to Section 51.6.3.2, “Using the Rescue System” for instructions.
Other reasons for the machine not booting may be BIOS-related:
Check your BIOS for references to your hard drive. GRUB might simply not be started if the hard drive itself cannot be found with the current BIOS settings.
Check whether your system's boot order includes the hard disk. If the hard disk option was not enabled, your system might install properly, but fail to boot when access to the hard disk is required.
If the machine comes up, but does not boot into the graphical login manager, anticipate
problems either with the choice of the default runlevel or the configuration of the X Window
System. To check the runlevel configuration, log in as the root user and check whether the machine is configured to boot into runlevel 5
(graphical desktop). A quick way to check this is to examine the contents of
/etc/inittab, as follows:
nld-machine:~ # grep "id:" /etc/inittab
id:5:initdefault:
nld-machine:~ #
The returned line indicates that the machine's default runlevel
(initdefault) is set to 5 and that it should boot to the
graphical desktop. If the runlevel is set to any other number, use the YaST Runlevel
Editor module to set it to 5.
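The same check can be wrapped in a small read-only helper. This is a sketch under the assumption of a SysV-style /etc/inittab as shown above. The function name default_runlevel is hypothetical, and the helper only reads the file and never modifies it.

```shell
# default_runlevel [INITTAB_FILE]
# Prints the default runlevel found in a SysV-style inittab file.
# INITTAB_FILE defaults to /etc/inittab.
default_runlevel() {
    file=${1:-/etc/inittab}
    # The entry has the form id:RUNLEVEL:initdefault:
    # Skip comment lines and print the second colon-separated field.
    awk -F: '!/^[[:space:]]*#/ && $3 == "initdefault" { print $2; exit }' "$file"
}

# Example (hypothetical usage):
#   [ "$(default_runlevel)" = "5" ] || echo "default runlevel is not 5"
```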
Do not edit the runlevel configuration manually. Otherwise SuSEconfig (run by YaST) will overwrite these changes on its next run. If you need to make manual changes here, disable future SuSEconfig changes by setting
If the runlevel is set to 5, you might have corruption problems with your desktop or X Window System software. Examine the log files at /var/log/Xorg.*.log for detailed messages from the X server as it attempted to start. If the desktop fails during start, it might log error messages to /var/log/messages. If these error messages hint at a configuration problem in the X server, try to fix these issues. If the graphical system still does not come up, consider reinstalling the graphical desktop.
One quick test: the startx command should force the X Window System to start with the configured defaults if the user is currently logged in on the console. If that does not work, it should log errors to the console. For more information about the X Window system configuration, refer to Chapter 26, The X Window System.
Login problems are those where your machine does, in fact, boot to the expected welcome screen or login prompt, but refuses to accept the username and password or accepts them but then does not behave properly (fails to start the graphic desktop, produces errors, drops to a command line, etc.).
This usually occurs when the system is configured to use network authentication or
directory services and, for some reason, is unable to retrieve results from its configured
servers. The root user, as the only local user, is the
only user that can still log in to these machines. The following are some common reasons why a
machine might appear functional but be unable to process logins correctly:
The network is not working. For further directions on this, turn to Section 51.5, “Network Problems”.
DNS is not working at the moment (which prevents GNOME or KDE from working and the system from making validated requests to secure servers). One indication that this is the case is that the machine takes an extremely long time to respond to any action. Find more information about this topic in Section 51.5, “Network Problems”.
If the system is configured to use Kerberos, the system's local time might have drifted past the accepted variance with the Kerberos server time (this is typically 300 seconds). If NTP (network time protocol) is not working properly or local NTP servers are not working, Kerberos authentication ceases to function because it depends on common clock synchronization across the network.
The system's authentication configuration is misconfigured. Check the PAM configuration files involved for any typographical errors or misordering of directives. For additional background information about PAM and the syntax of the configuration files involved, refer to Chapter 27, Authentication with PAM.
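The Kerberos clock variance mentioned in the list above comes down to a simple comparison of two timestamps. The sketch below assumes both times are given as seconds since the epoch (the local one could come from date +%s). How the server time is obtained is left open, and the function name clock_skew_ok is hypothetical.

```shell
# clock_skew_ok LOCAL_EPOCH SERVER_EPOCH [MAX_SKEW]
# Succeeds if the two timestamps (seconds since the epoch) differ by at most
# MAX_SKEW seconds. MAX_SKEW defaults to 300, the typical value named above.
clock_skew_ok() {
    local_time=$1
    server_time=$2
    max_skew=${3:-300}
    diff=$((local_time - server_time))
    # Take the absolute value of the difference.
    if [ "$diff" -lt 0 ]; then
        diff=$((0 - diff))
    fi
    [ "$diff" -le "$max_skew" ]
}

# Example (hypothetical usage, server_epoch obtained by other means):
#   clock_skew_ok "$(date +%s)" "$server_epoch" || echo "clock skew too large"
```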
In all cases that do not involve external network problems, the solution is to reboot the system into single-user mode and repair the configuration before booting again into operating mode and attempting to log in again. To boot into single-user mode:
This is by far the most common problem users encounter, because there are many reasons this can occur. Depending on whether you use local user management and authentication or network authentication, login failures occur for different reasons.
Local user management can fail for the following reasons:
The user might have entered the wrong password.
The user's home directory containing the desktop configuration files is corrupted or write protected.
There might be problems with the X Window System authenticating this particular user, especially if the user's home directory has been used with another Linux distribution prior to installing the current one.
To locate the reason for a local login failure, proceed as follows:
Check whether the user remembered his password correctly before you start debugging the whole authentication mechanism. If the user might not remember his password correctly, use the YaST User Management module to change the user's password.
Log in as root and check
/var/log/messages for error messages of the login process and of PAM.
Try to log in from a console (using Ctrl+Alt+F1). If this is successful, the blame cannot be put on PAM, because it is possible to authenticate this user on this machine. Try to locate any problems with the X Window System or the desktop (GNOME or KDE). For more information, refer to Section 51.4.3, “Login Successful but GNOME Desktop Fails ” and Section 51.4.4, “Login Successful but KDE Desktop Fails”.
If the user's home directory has been used with another Linux distribution, remove the .Xauthority file in the user's home directory. Use a console login via Ctrl+Alt+F1 and run rm .Xauthority as this user. This should eliminate X authentication problems for this user. Try a graphical login again.
If graphical login still fails, do a console login with Ctrl+Alt+F1. Try to start an X session on another display—the first one
(:0) is already in use:
startx -- :1
This should bring up a graphical screen and your desktop. If it does not, check the log files of the X Window System (/var/log/Xorg.displaynumber.log) or the log file for your desktop applications (.xsession-errors in the user's home directory) for any irregularities.
If the desktop could not start because of corrupt configuration files, proceed with Section 51.4.3, “Login Successful but GNOME Desktop Fails ” or Section 51.4.4, “Login Successful but KDE Desktop Fails”.
The following are some common reasons why network authentication for a particular user might fail on a specific machine:
The user might have entered the wrong password.
The username exists in the machine's local authentication files and is also provided by a network authentication system, causing conflicts.
The home directory exists but is corrupt or unavailable. Perhaps it is write protected or is on a server that is inaccessible at the moment.
The user does not have permission to log in to that particular host in the authentication system.
The machine has changed hostnames, for whatever reason, and the user does not have permission to log in to that host.
The machine cannot reach the authentication server or directory server that contains that user's information.
There might be problems with the X Window System authenticating this particular user, especially if the user's home has been used with another Linux distribution prior to installing the current one.
To locate the cause of the login failures with network authentication, proceed as follows:
Check whether the user remembered his password correctly before you start debugging the whole authentication mechanism.
Determine the directory server the machine relies on for authentication and make sure that it is up and running and properly communicating with the other machines.
Determine that the user's username and password work on other machines to make sure that his authentication data exists and is properly distributed.
See if another user can log in to the misbehaving machine. If another user can log in
without difficulty or if root can log in, log in and
examine the /var/log/messages file. Locate the time stamps that
correspond to the login attempts and determine if PAM has produced any error messages.
Try to log in from a console (using Ctrl+Alt+F1). If this is successful, the blame cannot be put on PAM or the directory server on which the user's home is hosted, because it is possible to authenticate this user on this machine. Try to locate any problems with the X Window System or the desktop (GNOME or KDE). For more information, refer to Section 51.4.3, “Login Successful but GNOME Desktop Fails ” and Section 51.4.4, “Login Successful but KDE Desktop Fails”.
If the user's home directory has been used with another Linux distribution, remove the .Xauthority file in the user's home directory. Use a console login via Ctrl+Alt+F1 and run rm .Xauthority as this user. This should eliminate X authentication problems for this user. Try a graphical login again.
If graphical login still fails, do a console login with Ctrl+Alt+F1. Try to start an X session on another display—the first one
(:0) is already in use:
startx -- :1
This should bring up a graphical screen and your desktop. If it does not, check the log files of the X Window System (/var/log/Xorg.displaynumber.log) or the log file for your desktop applications (.xsession-errors in the user's home directory) for any irregularities.
If the desktop could not start because of corrupt configuration files, proceed with Section 51.4.3, “Login Successful but GNOME Desktop Fails ” or Section 51.4.4, “Login Successful but KDE Desktop Fails”.
If this is true for a particular user, it is likely that the user's GNOME configuration files have become corrupted. Some symptoms might include the keyboard failing to work, the screen geometry becoming distorted, or even the screen coming up as a bare gray field. The important distinction is that if another user logs in, the machine works normally. If this is the case, it is likely that the problem can be fixed relatively quickly by simply moving the user's GNOME configuration directory to a new location, which causes GNOME to initialize a new one. Although the user is forced to reconfigure GNOME, no data is lost.
Switch to a text console by pressing Ctrl+Alt+F1.
Log in with your user name.
Move the user's GNOME configuration directories to a temporary location:
mv .gconf .gconf-ORIG-RECOVER
mv .gnome2 .gnome2-ORIG-RECOVER
Log out.
Log in again, but do not run any applications.
Recover your individual application configuration data (including the Evolution e-mail
client data) by copying the ~/.gconf-ORIG-RECOVER/apps/ directory back
into the new ~/.gconf directory as follows:
cp -a .gconf-ORIG-RECOVER/apps .gconf/
If this causes the login problems, attempt to recover only the critical application data and reconfigure the remainder of the applications.
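The move-aside step of this procedure can be sketched as a small helper that refuses to overwrite an earlier backup. This is an illustration only. The function name move_aside is hypothetical, and the -ORIG-RECOVER suffix simply follows the naming used in the commands above.

```shell
# move_aside HOME_DIR NAME...
# Renames each existing HOME_DIR/NAME to HOME_DIR/NAME-ORIG-RECOVER.
# Names that do not exist are skipped; an existing backup is never overwritten.
move_aside() {
    home=$1
    shift
    for name in "$@"; do
        src=$home/$name
        dst=$src-ORIG-RECOVER
        if [ ! -e "$src" ]; then
            continue
        fi
        if [ -e "$dst" ]; then
            echo "skipping $name: $dst already exists" >&2
            continue
        fi
        mv -- "$src" "$dst" && echo "moved $name"
    done
}

# Example (hypothetical usage, run as the affected user on a text console):
#   move_aside "$HOME" .gconf .gnome2
```

Keeping an earlier backup untouched means a second run of the procedure cannot destroy the only copy of the original configuration.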
There are several reasons why a KDE desktop would not allow users to log in. Corrupted cache data can cause login problems, as can corrupted KDE desktop configuration files.
Cache data is used at desktop start-up to increase performance. If this data is corrupted, start-up is slowed down or fails entirely. Removing it forces the desktop start-up routines to start from scratch. This takes more time than a normal start-up, but the data is intact afterwards and the user can log in.
To remove the cache files of the KDE desktop, issue the following command as root:
rm -rf /tmp/kde-user /tmp/socket-user
Replace user with the actual username. Removing these two
directories just removes the corrupted cache files. No real data is harmed using this procedure.
Corrupted desktop configuration files can always be replaced with the initial configuration files. If you want to recover the user's adjustments, carefully copy them back from their temporary location after the configuration has been restored using the default configuration values.
To replace a corrupted desktop configuration with the initial configuration values, proceed as follows:
Switch to a text console by pressing Ctrl+Alt+F1.
Log in with your user name.
Move the KDE configuration directory and the .skel files to a temporary location:
mv .kde .kde-ORIG-RECOVER
mv .skel .skel-ORIG-RECOVER
Log out.
Log in again.
After the desktop has started successfully, copy the user's own configurations back into place:
cp -a .kde-ORIG-RECOVER/share .kde/share
If the user's own adjustments caused the login to fail and continue to do so, repeat the procedure as described above, but do not copy the
Many problems of your system may be network-related, even though they do not seem to be at first. For example, the reason for a system not allowing users to log in might be a network problem of some kind. This section introduces a simple check list you can apply to identify the cause of any network problem encountered.
When checking the network connection of your machine, proceed as follows:
If using an ethernet connection, check the hardware first. Make sure that your network cable is properly plugged into your computer. The control lights next to your ethernet connector, if available, should both be active.
If the connection fails, check whether your network cable works with another machine. If it does, your network card causes the failure. If hubs or switches are included in your network setup, suspect them to be the culprits as well.
If using a wireless connection, check whether the wireless link can be established by other machines. If this is not the case, contact the wireless network's administrator.
Once you have checked your basic network connectivity, try to find out which service is not responding. Gather the address information of all network servers needed in your setup. Either look them up in the appropriate YaST module or ask your system administrator. The following list gives some of the typical network servers involved in a setup together with the symptoms of an outage.
A broken or malfunctioning name service affects the network's functioning in many ways. If the local machine relies on any network servers for authentication and these servers cannot be found due to name resolution issues, users would not even be able to log in. Machines in the network managed by a broken name server would not be able to “see” each other and communicate.
A malfunctioning or completely broken NTP service could affect Kerberos authentication and X server functionality.
If any application needed data stored in an NFS mounted directory, it would not be
able to start or function properly if this service was down or misconfigured. In a worst
case scenario, a user's personal desktop configuration would not come up if his home
directory containing the .gconf or .kde
subdirectories could not be found due to an outage of the NFS server.
If any application needed data stored in a directory on a Samba server, it would not be able to start or function properly if this service was down.
If your SUSE Linux Enterprise system relied on a NIS server to provide the user data, users would not be able to log in to this machine if the NIS service was down.
If your SUSE Linux Enterprise system relied on an LDAP server to provide the user data, users would not be able to log in to this machine if the LDAP service was down.
Authentication would not work and login to any machine would fail.
Users would not be able to print.
Check whether the network servers are running and whether your network setup allows you to establish a connection:
The debugging procedure described below only applies to a simple network server/client setup that does not involve any internal routing. It assumes both server and client are members of the same subnet without the need for additional routing.
Use ping hostname (replace
hostname with the hostname of the server) to check whether each
one of them is up and responding to the network. If this command is successful, it tells you
that the host you were looking for is up and running and that the name service for your
network is configured correctly.
If ping fails with destination host
unreachable, either your system or the desired server is not properly configured or
down. Check whether your system is reachable by running ping
your_hostname
from another machine. If you can reach your machine from another machine, it is
the server that is not running at all or not configured correctly.
If ping fails with unknown host, the name service is not configured
correctly or the hostname used was incorrect. Use ping -n
ipaddress to try to connect to this host without name service. If
this is successful, check the spelling of the hostname and for a misconfigured name service
in your network. For further checks on this matter, refer to
Step 4.b. If ping still fails, either your network card is not configured
correctly or your network hardware is faulty. Refer to Step 4.c for information about this.
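The two failure cases above can be told apart mechanically. The following is a minimal sketch rather than part of the procedure itself: the function name classify_ping_failure and the labels it prints are made up for illustration, and the matched strings are only the two messages quoted above, which other ping versions may word differently.

```shell
# Map a ping error message to the next check suggested in the text.
# Assumption: the message contains one of the two phrases quoted in this
# section; anything else is reported as "unclassified".
classify_ping_failure() {
    case "$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')" in
        *"destination host unreachable"*) echo "host-or-server-down" ;;
        *"unknown host"*)                 echo "name-service" ;;
        *)                                echo "unclassified" ;;
    esac
}
```

It could be fed with the output of a single probe, for example classify_ping_failure "$(ping -c 1 server1 2>&1)", where server1 is a placeholder hostname.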
Use host hostname to check whether the
hostname of the server you are trying to connect to is properly translated into an IP address
and vice versa. If this command returns the IP address of this host, the name service is up
and running. If the host command fails, check all network configuration
files relating to name and address resolution on your host:
/etc/resolv.conf
This file is used to keep track of the name server and domain you are currently using. It can be modified manually or automatically adjusted by YaST or DHCP. Automatic adjustment is preferable. However, make sure that this file has the following structure and all network addresses and domain names are correct:
search fully_qualified_domain_name
nameserver ipaddress_of_nameserver
This file can contain more than one name server address, but at least one of them must be correct to provide name resolution to your host. If needed, adjust this file using the YaST DNS and Hostname module.
If your network connection is handled via DHCP, enable DHCP to change the hostname and name service information by selecting the corresponding options in the YaST DNS and Hostname module.
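To see at a glance which name servers the file currently lists, a small helper can extract them. This is a sketch only; the function name list_nameservers is an assumption made for this example, and it relies on nothing more than the nameserver keyword described above.

```shell
# Print every name server address listed in a resolv.conf-style file.
# $1: file to read (defaults to /etc/resolv.conf when omitted)
list_nameservers() {
    awk '$1 == "nameserver" { print $2 }' "${1:-/etc/resolv.conf}"
}
```

Each printed address can then be tested with ping -n to find out whether at least one of the listed name servers is reachable.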
/etc/nsswitch.conf
This file tells Linux where to look for name service information. It should look like this:
...
hosts: files dns
networks: files dns
...
The dns entry is vital. It tells Linux to use an external name
server. Normally, these entries are automatically made by YaST, but it never hurts
to check.
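The check for the dns entry can also be scripted. The following sketch assumes the file layout shown above; the function name hosts_uses_dns is hypothetical.

```shell
# Succeed if the "hosts:" line of an nsswitch.conf-style file lists "dns".
# $1: file to read (defaults to /etc/nsswitch.conf when omitted)
hosts_uses_dns() {
    awk '$1 == "hosts:" { for (i = 2; i <= NF; i++) if ($i == "dns") found = 1 }
         END { exit(found ? 0 : 1) }' "${1:-/etc/nsswitch.conf}"
}
```

A call such as hosts_uses_dns || echo "dns entry missing" reports the problem without opening the file in an editor.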
If all the relevant entries on the host are correct, let your system administrator check the DNS server configuration for the correct zone information. For detailed information about DNS, refer to Chapter 33, The Domain Name System. If you have made sure that the DNS configuration of your host and the DNS server are correct, proceed with checking the configuration of your network and network device.
If your system cannot establish a connection to a network server and you have excluded name service problems from the list of possible culprits, check the configuration of your network card.
Use the command ifconfig network_device
(executed as root) to check whether this device was
properly configured. Make sure that both inet address and
Mask are configured correctly. An error in the IP address or a missing bit
in your network mask would render your network configuration unusable. If necessary, perform
this check on the server as well.
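A missing bit in the network mask is easy to overlook when reading the ifconfig output. As an illustration, the sketch below tests whether a dotted-decimal mask consists of a run of set bits followed only by cleared bits. The function name is_valid_netmask is an assumption, and the check says nothing about whether the mask is the right one for your network.

```shell
# Succeed if $1 is a dotted-decimal netmask whose set bits are contiguous.
is_valid_netmask() (
    set -f                      # no globbing while splitting the argument
    IFS=.
    set -- $1                   # split into octets
    [ $# -eq 4 ] || exit 1
    mask=0
    for octet in "$@"; do
        case "$octet" in
            ''|*[!0-9]*|????*|0?*) exit 1 ;;   # empty, non-numeric, too long, leading zero
        esac
        [ "$octet" -le 255 ] || exit 1
        mask=$(( mask * 256 + octet ))
    done
    # The host part of a contiguous mask has the form 2^n - 1,
    # so it shares no bits with its successor.
    host=$(( 4294967295 - mask ))
    [ $(( host & (host + 1) )) -eq 0 ]
)
```

For example, 255.255.255.0 passes, whereas 255.255.253.0 (one bit missing in the third octet) is rejected.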
If the name service and network hardware are properly configured and running, but some
external network connections still get long time-outs or fail entirely, use
traceroute fully_qualified_domain_name
(executed as root) to track the network route these
requests are taking. This command lists any gateway (hop) a request from your machine passes
on its way to its destination. It lists the response time of each hop and whether this hop is
reachable at all. Use a combination of traceroute and ping to track down the culprit and let
the administrators know.
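When a trace is long, the first hop that does not answer can be picked out automatically. The sketch below assumes the usual traceroute output with three probes per hop, where an unanswered hop is printed as three asterisks; the function name first_silent_hop is made up for this example.

```shell
# Read traceroute output on standard input and print the number of the
# first hop for which all three probes went unanswered ("* * *").
first_silent_hop() {
    awk '$2 == "*" && $3 == "*" && $4 == "*" { print $1; exit }'
}
```

Use it as traceroute fully_qualified_domain_name | first_silent_hop. Keep in mind that some gateways ignore traceroute probes on purpose, so a silent hop is a hint to follow up with ping, not proof of a fault.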
Once you have identified the cause of your network trouble, you can resolve it yourself (if the problem is located on your machine) or let the system administrators of your network know about your findings so they can reconfigure the services or repair the necessary systems.
If you have a problem with network connectivity, narrow it down as described in Procedure 51.2. If NetworkManager seems to be the culprit, proceed as follows to get logs providing hints on why NetworkManager fails:
Open a shell and log in as root.
Restart the NetworkManager:
rcnetwork restart -o nm
Open a Web page, for example http://www.opensuse.org, as a normal user to see whether you can connect.
Collect any information about the state of NetworkManager in
/var/log/NetworkManager.
For more information about NetworkManager, refer to Section 30.6, “Managing Network Connections with NetworkManager”.
Data problems occur when the machine might or might not boot properly but, in either case, it is clear that data on the system is corrupted and that the system needs to be recovered. These situations call for a backup of your critical data, enabling you to recover a system state from before your system failed. SUSE Linux Enterprise offers dedicated YaST modules for system backup and restoration as well as a rescue system that can be used to recover a corrupted system from the outside.
System backups can be easily managed using the YaST System Backup module:
As root, start YaST and select +.
Create a backup profile holding all details needed for the backup, filename of the archive file, scope, and type of the backup:
Select +.
Enter a name for the archive.
Enter the path to the location of the backup if you want to keep a local backup. For your backup to be archived on a network server (via NFS), enter the IP address or name of the server and the directory that should hold your archive.
Determine the archive type and click .
Determine the backup options to use, such as whether files not belonging to any package should be backed up and whether a list of files should be displayed prior to creating the archive. Also determine whether changed files should be identified using the time-consuming MD5 mechanism.
Use to enter a dialog for the backup of entire hard disk areas. Currently, this option only applies to the Ext2 file system.
Finally, set the search constraints to exclude certain system areas from the backup area that do not need to be backed up, such as lock files or cache files. Add, edit, or delete items until your needs are met and leave with .
Once you have finished the profile settings, you can start the backup right away with or configure automatic backup. It is also possible to create other profiles tailored for various other purposes.
To configure automatic backup for a given profile, proceed as follows:
Select from the menu.
Select .
Determine the backup frequency. Choose , , or .
Determine the backup start time. These settings depend on the backup frequency selected.
Decide whether to keep old backups and how many should be kept. To receive an automatically generated status message of the backup process, check .
Click to apply your settings and have the first backup start at the time specified.
Use the YaST System Restoration module to restore the system configuration from a backup. Restore the entire backup or select specific components that were corrupted and need to be reset to their old state.
Start ++.
Enter the location of the backup file. This could be a local file, a network mounted file, or a file on a removable device, such as a floppy or a CD. Then click .
The following dialog displays a summary of the archive properties, such as the filename, date of creation, type of backup, and optional comments.
Review the archived content by clicking . Clicking returns you to the dialog.
opens a dialog in which to fine-tune the restore process. Return to the dialog by clicking .
Click to open the view of packages to restore. Press to restore all files in the archive or use the various , , and buttons to fine-tune your selection. Only use the option if the RPM database is corrupted or deleted and this file is included in the backup.
After you click , the backup is restored. Click to leave the module after the restore process is completed.
There are several reasons why a system could fail to come up and run properly. A corrupted file system after a system crash, corrupted configuration files, or a corrupted boot loader configuration are the most common ones.
SUSE Linux Enterprise offers two different methods to cope with this kind of situation. You can either use the YaST System Repair functionality or boot the rescue system. The following sections cover both flavors of system repair.
Before launching the YaST System Repair module, determine in which mode to run it to best fit your needs. Depending on the severity and cause of your system failure and your expertise, there are three different modes to choose from:
If your system failed due to an unknown cause and you basically do not know which part of the system is to blame for the failure, use . An extensive automated check will be performed on all components of your installed system. For a detailed description of this procedure, refer to Section 51.6.3.1.1, “Automatic Repair”.
If your system failed and you already know which component is to blame, you can cut the lengthy system check with short by limiting the scope of the system analysis to those components. For example, if the system messages prior to the failure seem to indicate an error with the package database, you can limit the analysis and repair procedure to checking and restoring this aspect of your system. For a detailed description of this procedure, refer to Section 51.6.3.1.2, “Customized Repair”.
If you already have a clear idea of what component failed and how this should be fixed, you can skip the analysis runs and directly apply the tools necessary for the repair of the respective component. For details, refer to Section 51.6.3.1.3, “Expert Tools”.
Choose one of the repair modes as described above and proceed with the system repair as outlined in the following sections.
To start the automatic repair mode of YaST System Repair, proceed as follows:
Insert the first installation medium of SUSE Linux Enterprise into your CD or DVD drive.
Reboot the system.
At the boot screen, select .
Select the language and click .
Confirm the license agreement and click .
In , select +.
Select .
YaST now launches an extensive analysis of the installed system. The progress of the procedure is displayed at the bottom of the screen with two progress bars. The upper bar shows the progress of the currently running test. The lower bar shows the overall progress of the analysis. The log window in the top section tracks the currently running test and its result. See Figure 51.2, “Automatic Repair Mode”. The following main test runs are performed with every run. They contain, in turn, a number of individual subtests.
Checks the validity and coherence of the partition tables of all detected hard disks.
The swap partitions of the installed system are detected, tested, and offered for activation where applicable. The offer should be accepted for the sake of a higher system repair speed.
All detected file systems are subjected to a file system–specific check.
The entries in the file /etc/fstab are checked for completeness and consistency. All valid partitions are mounted.
The boot loader configuration of the installed system (GRUB or LILO) is checked for completeness and coherence. Boot and root devices are examined and the availability of the initrd modules is checked.
This checks whether all packages necessary for the operation of a minimal installation are present. While it is optionally possible also to analyze the base packages, this takes a long time because of their vast number.
Whenever an error is encountered, the procedure stops and a dialog opens outlining the details and possible solutions.
Read the screen messages carefully before accepting the proposed fix. If you decide to decline a proposed solution, your system remains unchanged.
After the repair process has been terminated successfully, click and and remove the installation media. The system automatically reboots.
To launch the mode and selectively check certain components of your installed system, proceed as follows:
Insert the first installation medium of SUSE Linux Enterprise into your CD or DVD drive.
Reboot the system.
At the boot screen, select .
Select the language and click .
Confirm the license agreement and click .
In , select +.
Select .
Choosing shows a list of test runs that are all marked for execution at first. The total range of tests matches that of automatic repair. If you already know where no damage is present, unmark the corresponding tests. Clicking starts a narrower test procedure that probably has a significantly shorter running time.
Not all test groups can be applied individually. The analysis of the fstab entries is always bound to an examination of the file systems, including existing swap partitions. YaST automatically resolves such dependencies by selecting the smallest number of necessary test runs.
Whenever an error is encountered, the procedure stops and a dialog opens outlining the details and possible solutions.
Read the screen messages carefully before accepting the proposed fix. If you decide to decline a proposed solution, your system remains unchanged.
After the repair process has been terminated successfully, click and and remove the installation media. The system automatically reboots.
If you are familiar with SUSE Linux Enterprise and already have a very clear idea of what needs to be repaired in your system, apply the tools directly, skipping the system analysis.
To make use of the feature of the YaST System Repair module, proceed as follows:
Boot the system with the original installation medium used for the initial installation (as outlined in Chapter 3, Installation with YaST).
In , select +.
Select and choose one or more repair options.
After the repair process has been terminated successfully, click and and remove the installation media. The system automatically reboots.
Expert tools provides the following options to repair your faulty system:
This starts the YaST boot loader configuration module. Find details in Section 21.3, “Configuring the Boot Loader with YaST”.
This starts the expert partitioning tool in YaST.
This checks the file systems of your installed system. You are first offered a selection of all detected partitions and can then choose the ones to check.
It is possible to attempt to reconstruct damaged partition tables. A list of detected hard disks is presented first for selection. Clicking starts the examination. This can take a while depending on the processing power and size of the hard disk.
Reconstructing a Partition Table
The reconstruction of a partition table is tricky. YaST attempts to recognize lost partitions by analyzing the data sectors of the hard disk. The lost partitions are added to the rebuilt partition table when recognized. This is, however, not successful in all imaginable cases.
This option saves important system files to a floppy disk. If one of these files becomes damaged, it can be restored from disk.
This checks the consistency of the package database and the availability of the most important packages. Any damaged installed packages can be reinstalled with this tool.
SUSE Linux Enterprise contains a rescue system. The rescue system is a small Linux system that can be loaded into a RAM disk and mounted as root file system, allowing you to access your Linux partitions from the outside. Using the rescue system, you can recover or modify any important aspect of your system:
Manipulate any type of configuration file.
Check the file system for defects and start automatic repair processes.
Access the installed system in a “change root” environment.
Check, modify, and reinstall the boot loader configuration.
Resize partitions using the parted command. Find more information about this tool at the Web site of GNU Parted (http://www.gnu.org/software/parted/parted.html).
The rescue system can be loaded from various sources and locations. The simplest option is to boot the rescue system from the original installation CD or DVD:
Insert the installation medium into your CD or DVD drive.
Reboot the system.
At the boot screen, choose the option.
Enter root at the Rescue: prompt. A
password is not required.
If your hardware setup does not include a CD or DVD drive, you can boot
the rescue system from a network source. The following example applies to a
remote boot scenario—if using another boot medium, such as a floppy
disk, modify the info file accordingly and boot as
you would for a normal installation.
Enter the configuration of your PXE boot setup and replace install=protocol://instsource with rescue=protocol://instsource. As with a normal installation, protocol stands for any of the supported network protocols (NFS, HTTP, FTP, etc.) and instsource for the path to your network installation source.
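The replacement of the install= parameter can be done with a one-line filter. This is a sketch under assumptions: the function name to_rescue_param is hypothetical, and the name and location of your PXE configuration file depend on your setup, so the filter simply reads standard input and writes the result to standard output.

```shell
# Rewrite the first "install=" boot parameter on each input line to "rescue=".
# Only a parameter at the start of the line or after white space is changed.
to_rescue_param() {
    sed -E 's/(^|[[:space:]])install=/\1rescue=/'
}
```

Review the output before writing it back to the configuration file, for example with to_rescue_param < your_pxe_config_file, where your_pxe_config_file is a placeholder.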
Boot the system using “Wake on LAN”, as described in Section 4.3.7, “Wake on LAN”.
Enter root at the Rescue:
prompt. A password is not required.
Once you have entered the rescue system, you can make use of the virtual consoles that can be reached with Alt+F1 to Alt+F6.
A shell and many other useful utilities, such as the mount program, are
available in the /bin directory. The
sbin directory contains important file and network
utilities for reviewing and repairing the file system. This directory also
contains the most important binaries for system maintenance, such as
fdisk, mkfs, mkswap, mount, init, and shutdown, as well as ifconfig, ip,
route, and netstat for maintaining the network. The directory
/usr/bin contains the vi editor, find, less, and ssh.
To see the system messages, either use the command dmesg
or view the file /var/log/messages.
As an example for a configuration that might be fixed using the rescue system, imagine you have a broken configuration file that prevents the system from booting properly. You can fix this using the rescue system.
To manipulate a configuration file, proceed as follows:
Start the rescue system using one of the methods described above.
To mount a root file system located under
/dev/sda6 to the rescue system, use the following
command:
mount /dev/sda6 /mnt
All directories of the system are now located under
/mnt.
Change the directory to the mounted root file system:
cd /mnt
Open the problematic configuration file in the vi editor. Adjust and save the configuration.
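Change back to the root directory of the rescue system, then unmount the root file system (umount reports the device as busy while the working directory of the shell is still inside /mnt):
cd /
umount /mnt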
Reboot the machine.
Generally, file systems cannot be repaired on a running system. If you
encounter serious problems, you may not even be able to mount your root file
system and the system boot may end with a kernel panic.
In this case, the only way is to repair the system from the outside. It is
strongly recommended to use the YaST System Repair for this task (see Section 51.6.3.1, “Using YaST System Repair” for details). However, if you need to do a
manual file system check or repair, boot the rescue system. It contains the
utilities to check and repair the
ext2, ext3,
reiserfs, xfs,
dosfs, and vfat file systems.
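Which utility to call depends on the file system type. The mapping below is a sketch: the function name fsck_tool_for is made up, and the utility names are the commonly used ones for these file systems. Whether your rescue system ships exactly these binaries is an assumption to verify before relying on it.

```shell
# Print the name of a commonly used check or repair utility for a
# file system type. Fails for types not mentioned in this section.
fsck_tool_for() {
    case "$1" in
        ext2|ext3)   echo e2fsck ;;
        reiserfs)    echo reiserfsck ;;
        xfs)         echo xfs_repair ;;
        dosfs|vfat)  echo dosfsck ;;
        *)           echo "no known check utility for '$1'" >&2; return 1 ;;
    esac
}
```

Run the utility only on file systems that are not mounted.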
If you need to access the installed system from the rescue system to, for example, modify the boot loader configuration, or to execute a hardware configuration utility, you need to do this in a “change root” environment.
To set up a “change root” environment based on the installed system, proceed as follows:
First mount the root partition from the installed system and the device file system:
mount /dev/sda6 /mnt
mount --bind /dev /mnt/dev
Now you can “change root” into the new environment:
chroot /mnt
Then mount /proc and /sys:
mount /proc
mount /sys
Finally, mount the remaining partitions from the installed system:
mount -a
Now you have access to the installed system. Before rebooting the
system, unmount the partitions with umount
-a and leave the “change root”
environment with exit.
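If the root partition of your installed system is not /dev/sda6, the commands above need to be adapted. The following sketch only prints the command sequence for a given device so that you can review it before typing it; it does not run anything. The function name print_chroot_steps is an assumption made for this example.

```shell
# Print (do not run) the commands for entering the installed system.
# $1: root partition of the installed system
# $2: mount point in the rescue system (defaults to /mnt)
print_chroot_steps() (
    root_dev=$1
    target=${2:-/mnt}
    printf 'mount %s %s\n' "$root_dev" "$target"
    printf 'mount --bind /dev %s/dev\n' "$target"
    printf 'chroot %s\n' "$target"
    # The remaining commands are typed inside the "change root" shell.
    printf 'mount /proc\nmount /sys\nmount -a\n'
)
```

For example, print_chroot_steps /dev/sda6 lists the six commands used in the procedure above.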
Limitations
Although you have full access to the files and applications of the installed system, there are some limitations. The kernel that is running is the one that was booted with the rescue system. It only supports essential hardware and it is not possible to add kernel modules from the installed system unless the kernel versions are exactly the same (which is unlikely). So you cannot access a sound card, for example. It is also not possible to start a graphical user interface. Also note that you leave the “change root” environment when you switch the console with Alt+F1 to Alt+F6.
Sometimes a system cannot boot because the boot loader configuration is corrupted. The start-up routines cannot, for example, translate physical drives to the actual locations in the Linux file system without a working boot loader.
To check the boot loader configuration and reinstall the boot loader, proceed as follows:
Perform the necessary steps to access the installed system as described in Section 51.6.3.2.3, “Accessing the Installed System”.
Check whether the following files are correctly configured according to the GRUB configuration principles outlined in Chapter 21, The Boot Loader.
/etc/grub.conf
/boot/grub/device.map
/boot/grub/menu.lst
Apply fixes to the device mapping (device.map)
or the location of the root partition and configuration files, if
necessary.
Reinstall the boot loader using the following command sequence:
grub --batch < /etc/grub.conf
Unmount the partitions, log out from the “change root” environment, and reboot the system:
umount -a
exit
reboot
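Before running the reinstall command in the procedure above, it can help to confirm that the three GRUB files are present at all. The sketch below lists the ones that are missing; the function name missing_grub_files is hypothetical.

```shell
# List the GRUB files named in this procedure that are missing.
# $1: directory where the installed system is mounted
#     (leave empty when already inside the "change root" environment)
missing_grub_files() {
    for f in /etc/grub.conf /boot/grub/device.map /boot/grub/menu.lst; do
        [ -f "$1$f" ] || echo "$f"
    done
}
```

An empty result means that all three files exist. It does not say whether their contents are correct.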
If the kernel of the SUSE® Linux Enterprise Server for IBM System z is upgraded or modified, it is possible to reboot the system accidentally in an inconsistent state, so standard procedures of IPLing the installed system fail. This most commonly occurs if a new or updated SUSE Linux Enterprise Server kernel has been installed and the zipl program has not been run to update the IPL record. In this case, use the standard installation package as a rescue system from which the zipl program can be executed to update the IPL record.
Making the Installation Data Available
For this method to work, the SUSE Linux Enterprise Server for IBM System z installation data must be available. For details, refer to Section “Making the Installation Data Available” (Chapter 2, Preparing for Installation, ↑Architecture-Specific Information). Additionally, you need the channel number of the device and the partition number within the device that contains the root file system of the SUSE Linux Enterprise Server installation.
First, IPL the SUSE Linux Enterprise Server for IBM System z installation system as described in the Architecture-Specific Information manual. A list of choices for the network adapter to use is then presented.
Select then to start the rescue system. Depending on the installation environment, you now must specify the parameters for the network adapter and the installation source. The rescue system is loaded and the following login prompt is shown at the end:
Skipped services in runlevel 3: nfs nfsboot
Rescue login:
You can now log in as root without a password.
In this state, no disks are configured. You need to configure them before you can proceed.
Procedure 51.3. Configuring DASDs
Configure DASDs with the following command:
dasd_configure 0.0.0150 1 0
0.0.0150 is the channel to which the DASD is connected. The
1 means
activate the disk (a 0 at this place would deactivate
the disk). The 0
stands for “no DIAG mode” for the disk (a 1
here would enable DIAG access to the disk).
Now the DASD is online (check with cat /proc/partitions) and can be used for subsequent commands.
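The meaning of the three dasd_configure arguments is easy to mix up. The sketch below builds the command line from named options and prints it for review. The function name dasd_configure_cmd and the keywords on, off, and diag are made up for this example, and the channel format check is derived only from the example value 0.0.0150.

```shell
# Print the dasd_configure call for a DASD (nothing is executed).
# $1: channel such as 0.0.0150
# $2: "on" to activate the disk, "off" to deactivate it
# $3: "diag" to enable DIAG access; omit it for no DIAG mode
dasd_configure_cmd() (
    case "$1" in
        [0-9a-fA-F].[0-9a-fA-F].[0-9a-fA-F][0-9a-fA-F][0-9a-fA-F][0-9a-fA-F]) ;;
        *) echo "unexpected channel format: $1" >&2; exit 1 ;;
    esac
    case "$2" in
        on)  online=1 ;;
        off) online=0 ;;
        *)   echo "second argument must be on or off" >&2; exit 1 ;;
    esac
    if [ "${3:-}" = diag ]; then diag=1; else diag=0; fi
    echo "dasd_configure $1 $online $diag"
)
```

For example, dasd_configure_cmd 0.0.0150 on prints the command used in the procedure above.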
Procedure 51.4. Configuring a zFCP Disk
To configure a zFCP disk, it is necessary to first configure the zFCP adapter. Do this with the following command:
zfcp_host_configure 0.0.4000 1
0.0.4000 is the channel to which the adapter is
attached and 1 stands
for activate (a 0 here would deactivate the adapter).
After the adapter is activated, a disk can be configured. Do this with the following command:
zfcp_disk_configure 0.0.4000 1234567887654321 8765432100000000 1
0.0.4000 is the previously-used channel ID,
1234567887654321 is the
WWPN (World Wide Port Name), and 8765432100000000 is
the LUN (logical
unit number). The 1 stands for activating the disk (a
0 here would
deactivate the disk).
Now the zFCP disk is online (check with cat /proc/partitions) and can be used for subsequent commands.
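A mistyped WWPN or LUN is a likely reason for a zFCP disk not coming online. In the example above, both values consist of exactly 16 hexadecimal digits. The sketch below checks an argument against that shape; the function name is_hex16 is hypothetical, and the rule is inferred from the example values only.

```shell
# Succeed if $1 consists of exactly 16 hexadecimal digits.
is_hex16() {
    case "$1" in
        *[!0-9a-fA-F]*) return 1 ;;
    esac
    [ "${#1}" -eq 16 ]
}
```

Checking both values before calling zfcp_disk_configure catches truncated or mistyped numbers early.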
If all needed disks are online, you should now be able to mount the root
device. Assuming that the root device is on the second partition of the DASD
device (/dev/dasda2), the corresponding command is
mount /dev/dasda2 /mnt.
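Instead of reading the output of cat /proc/partitions by eye, you can test for a specific partition name. This is a sketch; the function name partition_listed is an assumption, and it relies on the device name being the fourth column of /proc/partitions.

```shell
# Succeed if a partition name (without /dev/) appears in /proc/partitions.
# $1: name to look for, for example dasda2
# $2: file to read (defaults to /proc/partitions when omitted)
partition_listed() {
    awk -v name="$1" '$4 == name { found = 1 } END { exit(found ? 0 : 1) }' \
        "${2:-/proc/partitions}"
}
```

For the example above, partition_listed dasda2 && mount /dev/dasda2 /mnt mounts the root device only if the kernel knows about the partition.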
File System Consistency
If the installed system has not been shut down properly, it may be
advisable to check the file system consistency prior to mounting. This
prevents any accidental loss of data. Using this example, issue the command
fsck.
By just issuing the command mount, it is possible to check whether the file system could be mounted correctly.
Example 51.1. Output of the Mount Command
SuSE Instsys suse:/ # mount
shmfs on /newroot type shm (rw,nr_inodes=10240)
devpts on /dev/pts type devpts (rw)
virtual-proc-filesystem on /proc type proc (rw)
/dev/dasda2 on /mnt type reiserfs (rw)
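The same check can be automated by filtering the output of mount. The sketch below assumes the output format shown in Example 51.1 (device, the word on, then the mount point); the function name is_mounted_at is made up for this example.

```shell
# Read the output of mount on standard input and succeed if device $1
# is mounted on directory $2.
is_mounted_at() {
    awk -v dev="$1" -v dir="$2" \
        '$1 == dev && $2 == "on" && $3 == dir { found = 1 } END { exit(found ? 0 : 1) }'
}
```

Use it as mount | is_mounted_at /dev/dasda2 /mnt before continuing with the chroot command.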
For the zipl command to read the configuration file from
the root device of the installed system and not from the rescue system,
change the root device to the installed system with the
chroot command:
Example 51.2. chroot to the Mounted File System
SuSE Instsys suse:/ # cd /mnt
SuSE Instsys suse:/mnt # chroot /mnt
Now execute zipl to rewrite the IPL record with the correct values:
Example 51.3. Installing the IPL Record with zipl
sh-2.05b# zipl
building bootmap : /boot/zipl/bootmap
adding Kernel Image : /boot/kernel/image located at 0x00010000
adding Ramdisk : /boot/initrd located at 0x00800000
adding Parmline : /boot/zipl/parmfile located at 0x00001000
Bootloader for ECKD type devices with z/OS compatible layout installed.
Syncing disks....
...done
To exit the rescue system, first leave the shell opened by the chroot command with exit. To prevent any loss of data, flush all unwritten buffers to disk with the sync command. Now change to the root directory of the rescue system and unmount the root device of the SUSE Linux Enterprise Server for IBM System z installation.
Example 51.4. Unmounting the File System
SuSE Instsys suse:/mnt # cd /
SuSE Instsys suse:/ # umount /mnt
Finally, halt the rescue system with the halt command. The SUSE Linux Enterprise Server system can now be IPLed as described in Section 3.13.1, “IBM System z: IPLing the Installed System”.