[Lustre-discuss] 1.6.6 to 1.8.3 upgrade, OSS with wrong "Target" value

Wed Jul 14 15:57:47 PDT 2010

Hi Roger,

Sorry for the delay. From the ldiskfs messages it seems to me that you are
using the ext4-based ldiskfs
(Jun 26 17:54:30 puppy7 kernel: ldiskfs created from ext4-2.6-rhel5).
If you are upgrading from 1.6.6, your ldiskfs is ext3-based, so I think that
with lustre-1.8.3 you should use the ext3-based ldiskfs rpm.
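
A quick way to check this on each server is to pull the "ldiskfs created from" line out of the kernel log. The function below is only a sketch: the name ldiskfs_flavour is made up for illustration, and it assumes the log line has the same shape as the puppy7 message quoted above.

```shell
# Hypothetical helper (not part of Lustre): report which ldiskfs build
# the kernel log mentions. Reads log text on stdin and prints whatever
# follows "ldiskfs created from", e.g. ext4-2.6-rhel5.
ldiskfs_flavour() {
    sed -n 's/.*ldiskfs created from \([^ ]*\).*/\1/p' | sort -u
}
```

For example, 'ldiskfs_flavour < /var/log/messages' (the log path is an assumption and may differ on your systems). A name starting with ext4 means the ext4-based ldiskfs module is loaded.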

Can you also tell us a bit more about your setup? From what you wrote so
far I understand you have 2 OSS servers and each server has one OST device.
In addition to that you have a third server which acts as the MGS/MDS, is
that right?

The logs you provided seem to be from only one server, puppy7, so they do
not give the whole picture of the situation. The timeout messages may
indicate a problem with communication between the servers, but it is really
difficult to say without seeing the whole picture or at least more elements
of it.

To check that you have the correct rpms installed, can you please run
'rpm -qa | grep lustre' on both OSS servers and the MDS?

Also, please provide the output of the command 'lctl list_nids' run on both
OSS servers, the MDS and a client.

In addition to the above, please run the following command on all Lustre
targets (OSTs and MDT) to display your current Lustre configuration:

 tunefs.lustre --dryrun --print /dev/<ost_device>

If possible, please attach the syslog from each machine covering the time
you mounted the Lustre targets (OSTs and MDT).
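
To save some typing, the checks above could be wrapped in one small shell function and run on each machine. This is only a sketch: the function name collect_lustre_info is made up, the device paths you pass in are whatever your OST or MDT devices really are, and a command that is not installed on a given machine is simply reported as unavailable.

```shell
# Hypothetical helper (not part of Lustre): print the information
# requested above. Arguments are the Lustre target devices on this
# machine, if any (none on a client).
collect_lustre_info() {
    echo "== host: $(uname -n) =="

    echo "== installed lustre rpms =="
    if command -v rpm >/dev/null 2>&1; then
        rpm -qa | grep lustre || echo "(no lustre rpms found)"
    else
        echo "(rpm not available)"
    fi

    echo "== NIDs =="
    if command -v lctl >/dev/null 2>&1; then
        lctl list_nids || echo "(lctl list_nids failed)"
    else
        echo "(lctl not available)"
    fi

    # --dryrun makes tunefs.lustre print the configuration without
    # writing anything to the device.
    for dev in "$@"; do
        echo "== target config: $dev =="
        if command -v tunefs.lustre >/dev/null 2>&1; then
            tunefs.lustre --dryrun --print "$dev" || echo "(tunefs.lustre failed)"
        else
            echo "(tunefs.lustre not available)"
        fi
    done
}
```

On an OSS you would call it as 'collect_lustre_info /dev/<ost_device> > lustre-info.txt' with your real device in place of the placeholder, and on a client with no arguments at all.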

Best regards,

Wojciech

On 14 July 2010 20:46, Roger Sersted <rs1 at aps.anl.gov> wrote:

>
> Any additional info?
>
> Thanks,
>
> Roger S.
>

-- 
Wojciech Turek