
[Bug 334994] Re: Degraded RAID boot fails: kobject_add_internal failed for dev-sda1 with -EEXIST, don't try to register things with the same name in the same directory


Added some debugging to the teardown code and managed to reproduce this.
What we find is that we unbind the device, then attempt and fail a bind
on the array, and only afterwards do we see the deletes for the unbind
complete.  It is this ordering which leads to the bind failure:

    [    3.476504] md: bind<sda1>
    [...]
    [   35.097882] md: md0 stopped.
    [   35.097897] md: unbind<sda1>
    [   35.097907] APW: sysfs_remove_link ret<0>
    [   35.110198] md: export_rdev(sda1)
    [   35.113254] md: bind<sda1>
    [   35.113297] ------------[ cut here ]------------
    [   35.113300] WARNING: at /home/apw/build/jaunty/ubuntu-jaunty/fs/sysfs/dir.c:462 sysfs_add_one+0x4c/0x50()
    [...]
    [   35.115126] APW: deleted something
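
The -EEXIST warning above comes from the sysfs registration in
bind_rdev_to_array(): the rdev kobject is added under the array as
"dev-<name>", and that name is still occupied because the delayed
delete from the previous unbind has not yet run.  Roughly (quoted
loosely from memory, so treat the exact code as approximate):

    static int bind_rdev_to_array(mdk_rdev_t * rdev, mddev_t * mddev)
    {
    [...]
	    /* Registers "dev-sda1" under the md0 kobject; this is what
	     * fails with -EEXIST while the old entry is still pending
	     * deletion. */
	    if ((err = kobject_add(&rdev->kobj, &mddev->kobj,
				   "dev-%s", b)))
		    goto fail;
    [...]
    }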

Here, in an instance where we happened to mount successfully, note that
the delete falls in the expected place, before the subsequent binds:

    [    3.479917] md: bind<sda5>
    [...]
    [   35.118235] md: md1 stopped.
    [   35.118240] md: unbind<sda5>
    [   35.118244] APW: sysfs_remove_link ret<0>
    [   35.140164] md: export_rdev(sda5)
    [   35.142276] APW: deleted something
    [   35.143848] md: bind<sda1>
    [   35.152288] md: bind<sda5>
    [   35.158571] raid1: raid set md1 active with 1 out of 2 mirrors

If we look at the code for stopping the array we see the following:

    static int do_md_stop(mddev_t * mddev, int mode, int is_open)
    {
    [...]
		    rdev_for_each(rdev, tmp, mddev)
			    if (rdev->raid_disk >= 0) {
				    char nm[20];
				    sprintf(nm, "rd%d", rdev->raid_disk);
				    sysfs_remove_link(&mddev->kobj, nm);
			    }

		    /* make sure all md_delayed_delete calls have finished */
		    flush_scheduled_work();

		    export_array(mddev);
    [...]

Note that we call flush_scheduled_work() to wait for the
md_delayed_delete calls to complete and only then export the array.
However, it is export_array() itself which triggers these delayed
deletes:

    static void export_array(mddev_t *mddev)
    {
    [...]
	    rdev_for_each(rdev, tmp, mddev) {
		    if (!rdev->mddev) {
			    MD_BUG();
			    continue;
		    }
		    kick_rdev_from_array(rdev);
	    }
    [...]
    }

It does this for each device via kick_rdev_from_array():

    static void kick_rdev_from_array(mdk_rdev_t * rdev)
    {
	    unbind_rdev_from_array(rdev);
	    export_rdev(rdev);
    }

And it is unbind_rdev_from_array() which schedules the delayed delete:

    static void unbind_rdev_from_array(mdk_rdev_t * rdev)
    {
    [...]
	    rdev->sysfs_state = NULL;
	    /* We need to delay this, otherwise we can deadlock when
	     * writing to 'remove' to "dev/state".  We also need
	     * to delay it due to rcu usage.
	     */
	    synchronize_rcu();
	    INIT_WORK(&rdev->del_work, md_delayed_delete);
	    kobject_get(&rdev->kobj);
	    schedule_work(&rdev->del_work);
    }
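
For completeness, the delayed work itself is what finally removes the
rdev kobject from sysfs, and thus what finally frees up the "dev-<name>"
entry; approximately (again quoted loosely, so treat as a sketch):

    static void md_delayed_delete(struct work_struct *ws)
    {
	    mdk_rdev_t *rdev = container_of(ws, mdk_rdev_t, del_work);

	    /* Only at this point does "dev-sda1" actually disappear
	     * from sysfs, allowing a later bind to reuse the name. */
	    kobject_del(&rdev->kobj);
	    kobject_put(&rdev->kobj);
    }

Until this work item has run, any attempt to re-bind the same device to
an array will collide with the stale sysfs entry.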

So in reality we do not want to wait for these delayed deletes before
export_array() but after it.  Testing with a patch which moves the
flush to after export_array() seems to resolve the issue.
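
That is, something along these lines in do_md_stop() (just a sketch of
the reordering being tested, not the final patch):

    [...]
		    rdev_for_each(rdev, tmp, mddev)
			    if (rdev->raid_disk >= 0) {
				    char nm[20];
				    sprintf(nm, "rd%d", rdev->raid_disk);
				    sysfs_remove_link(&mddev->kobj, nm);
			    }

		    export_array(mddev);

		    /* make sure all md_delayed_delete calls scheduled
		     * by export_array() have finished before the
		     * devices can be reused */
		    flush_scheduled_work();
    [...]

With the flush after export_array() the stale sysfs entries are gone by
the time the array can be assembled again, and the bind no longer hits
-EEXIST.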
