Monday, August 10, 2009

[Linux] Version Magic

(Reposted from my mscorlib account)

First off, some background information. I am an Oracle DBA, not a Red Hat admin. I only play one here at work because we don't have a real one. I'm running Red Hat ES v4 for an 11i install of Oracle. The server itself is an HP Blade server. The server is currently being built up to replace our existing version 10 Oracle Apps install, so it is not production. However, it may as well be since we have an offshore team that is working on it. When it's down, they twiddle their thumbs and I look stupid.

Yesterday afternoon we installed the support pack for HP's Blade server. We did this because we were trying to set up disk mirroring, something that had been left off when the server was set up originally. The support pack had about 20 items to install, and about 4 of them failed. There really wasn't much of an error given, and since the failures weren't related to what we needed, we ignored them and rebooted. That's when the fun began.

When the machine started back up, I knew something was wrong when it failed to detect the ethernet ports (eth0-eth3). After logging back in to the machine, the network would not activate. I had various error messages as I worked on the problem, from:

tg3 device eth1 does not seem to be present, delaying initialization

to an error message about the MAC address. The network card itself is a Broadcom Corporation NetXtreme BCM 5703. tg3 is the driver they have you use. eth1 is the port I was plugged in to.

After some poking around, I figured that the driver was somehow corrupt or invalid. I thought it might be out of date, so I set out to find a new driver. I downloaded both a Broadcom driver and a tg3 driver. Both compiled and looked good, but when I tried to install them, I got this version error in /var/log/messages:

tg3: version magic '2.6.9-22.ELsmp SMP 686 REGPARM 4KSTACKS gcc-3.2' should be '2.6.9-22.ELsmp SMP 686 REGPARM 4KSTACKS gcc-3.4'

Now, a seasoned Linux Guru would probably look at that and notice immediately what is wrong: gcc is buggered. I didn't notice that, so I continued what I was doing. Let me just add that the HP guys aren't too good at Linux. This morning we found someone who may have been able to help, but we were able to fix it right as we got a hold of her.
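For anyone staring at a similar line, here is a rough sketch of how to pull the compiler field out of each version magic string and compare them. The function names are made up for this example, and the two strings are the ones from the log message above.

```shell
#!/bin/bash
# Sketch only: compare the compiler field of two "version magic" strings.
# vermagic_gcc and check_vermagic are hypothetical helper names.

# Print the gcc-X.Y token from a version magic string, e.g. "gcc-3.2".
vermagic_gcc() {
    local token
    for token in $1; do          # intentionally unquoted: split on spaces
        case "$token" in
            gcc-*) echo "$token"; return 0 ;;
        esac
    done
    return 1
}

# Report whether the module and the kernel were built with the same gcc.
check_vermagic() {
    local module_gcc kernel_gcc
    module_gcc=$(vermagic_gcc "$1")
    kernel_gcc=$(vermagic_gcc "$2")
    if [ "$module_gcc" = "$kernel_gcc" ]; then
        echo "compiler matches: $kernel_gcc"
        return 0
    fi
    echo "compiler mismatch: module built with $module_gcc, kernel expects $kernel_gcc"
    return 1
}

# The two strings from the error message in /var/log/messages.
check_vermagic '2.6.9-22.ELsmp SMP 686 REGPARM 4KSTACKS gcc-3.2' \
               '2.6.9-22.ELsmp SMP 686 REGPARM 4KSTACKS gcc-3.4' || true
```

On a real machine the first string could probably come from something like modinfo -F vermagic on the freshly built module, but I haven't checked that on every release, so treat it as an assumption.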

Anyway, here's the point of this whole blog. When you install Oracle 11i, there's a step in the installation procedures that has you rename gcc and use an older version. Ah, now perhaps you see the problem. Because we're only using the server for Oracle, we never needed gcc and were still using the older version. That is why the drivers were built with gcc 3.2 while the kernel expected gcc 3.4. After swapping versions to the newer one, we ran the HP support installer again. And what do you know, everything installed. All 20+ packages had no problems at all.

I rebooted the server and the network cards were found once again. So there you have it. I'm hoping that other admins might find this on Google if they are struggling to solve the problem like I was. If you are running Oracle 11i, don't forget that you probably renamed the gcc file when you installed.

Comments from mscorlib

Hi there, we are having the exact same problem. What is the best way to rename gcc and use the older version?

In our case, we still had the old files, just renamed to something like gcc_bak. I renamed the current version to something else, and renamed our backup to gcc. That fixed the problem. If you don't have a backup of gcc, you might be able to grab it off a different server if it runs the same OS version (possibly the same hardware as well). Just make sure to back up your current version in case that's not the problem.
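Roughly, the swap looks like the sketch below. The function name and the .replaced suffix are made up for illustration, and your backup may be called something other than gcc_bak, so check the paths before running anything like this as root.

```shell
#!/bin/bash
# Sketch only: move a saved compiler binary back into place and keep the
# one it replaces. swap_in_backup and ".replaced" are hypothetical names.

# usage: swap_in_backup /usr/bin/gcc /usr/bin/gcc_bak
swap_in_backup() {
    local current backup
    current=$1
    backup=$2
    if [ ! -f "$backup" ]; then
        echo "no backup at $backup, nothing changed" >&2
        return 1
    fi
    # Keep whatever is in place now so the swap can be undone later.
    if [ -e "$current" ]; then
        mv "$current" "$current.replaced" || return 1
    fi
    mv "$backup" "$current"
}
```

The same call would be repeated for g++ if that was renamed as well.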


Glad I found this page, been hunting around for a solution for a few days.


I don't know who replaced gcc with the "old" version for oracle, but it would have been nice to know.

What confused me was that querying the installed version of the gcc packages reported all ok at version 3.4

If I'd thought about it a bit more I would have run the following command to check for file changes:

for i in $(rpm -qa | grep gcc); do rpm -V $i; done;
S.5....T /usr/bin/g++
S.5....T /usr/bin/gcc

For the non-admin types:
rpm -qa | grep gcc lists all installed packages with gcc in the name
rpm -V $i verifies the files in each package found and lists the files that have changed.
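The letters at the start of each rpm -V output line say which attribute of the file no longer matches the package database. A small sketch that spells them out (the function name is made up for this example):

```shell
#!/bin/bash
# Sketch only: explain the attribute flags at the start of an rpm -V line.
# explain_verify_flags is a hypothetical helper name.

explain_verify_flags() {
    local rest char
    rest=$1
    while [ -n "$rest" ]; do
        char=${rest%"${rest#?}"}   # first character of what is left
        rest=${rest#?}             # drop that character
        case "$char" in
            S) echo "S: file size differs" ;;
            M) echo "M: mode (permissions or file type) differs" ;;
            5) echo "5: checksum differs" ;;
            D) echo "D: device major/minor number differs" ;;
            L) echo "L: symlink target differs" ;;
            U) echo "U: owner differs" ;;
            G) echo "G: group differs" ;;
            T) echo "T: modification time differs" ;;
        esac
    done
}

# The flags reported for /usr/bin/gcc and /usr/bin/g++ above.
explain_verify_flags 'S.5....T'
```

For the two files above that means the size, checksum and modification time have all changed, which is what you would expect if the binary was swapped for a different gcc.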

Hope this helps someone else!




Same problem on a Red Hat Enterprise Linux AS4 U4 with Oracle Application Server 10g (Update 9.0.4.2.0).
Solved installing gcc-3.4.6-9 and requirements (cpp-3.4.6-9, libgcc-3.4.6-9).
Thanks!



...and same for a RHEL3 U2 with Oracle Application Server 10g (Update 9.0.4.1.0). It required re-installation of gcc-3.2.3-39 and gcc-c++-3.2.3-39 and re-installation of the ProLiant Support Pack. Afterwards I found that both /usr/bin/gcc.backup and /usr/bin/g++.backup existed, probably renamed during the OAS installation, so it would only have been necessary to replace the current gcc and g++ with the .backup files before re-installing the HP PSP.



One thing to keep in mind is that when you go to patch the server, these files may be overwritten with new versions. When you go to install a patch (typically via adpatch, although I'm sure opatch could fail also), you will get another error about gcc. I've modified the patch notes that we use to include a step to check the file sizes of gcc. I have a backup copy now, and if the files are the wrong size, I just copy them back in before I start patching, whether they are needed or not. The last thing you want to see is a patch fail to install because of something so simple. Glad it is helping others though.
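The check in my patch notes amounts to something like the sketch below. The function names and the example paths are made up here, and it compares file sizes only, as described above.

```shell
#!/bin/bash
# Sketch only: before patching, compare a binary's size with a saved copy
# and copy the saved one back if they differ.
# file_size and restore_if_changed are hypothetical helper names.

# Print the size of a file in bytes.
file_size() {
    wc -c < "$1" | tr -d ' '
}

# usage: restore_if_changed /usr/bin/gcc /root/saved/gcc
restore_if_changed() {
    local target saved
    target=$1
    saved=$2
    if [ ! -f "$saved" ]; then
        echo "no saved copy at $saved" >&2
        return 2
    fi
    if [ -f "$target" ] && [ "$(file_size "$target")" = "$(file_size "$saved")" ]; then
        echo "$target: size matches the saved copy"
        return 0
    fi
    echo "$target: size differs, copying the saved copy back"
    cp -p "$saved" "$target"
}
```

Two different builds could in theory have the same size, so comparing checksums would be a stricter test, but the size check has been enough to catch the overwrite so far.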
