
[Bug 334994] Re: Degraded RAID boot fails: kobject_add_internal failed for dev-sda1 with -EEXIST, don't try to register things with the same name in the same directory


Added some debugging to the teardown code and managed to reproduce this.
What we find is that we unbind the device, then attempt and fail a bind
on the array, and only afterwards do we see the deletes for the unbind
complete.  It is this ordering which leads to the bind failure:

    [    3.476504] md: bind<sda1>
    [...]
    [   35.097882] md: md0 stopped.
    [   35.097897] md: unbind<sda1>
    [   35.097907] APW: sysfs_remove_link ret<0>
    [   35.110198] md: export_rdev(sda1)
    [   35.113254] md: bind<sda1>
    [   35.113297] ------------[ cut here ]------------
    [   35.113300] WARNING: at /home/apw/build/jaunty/ubuntu-jaunty/fs/sysfs/dir.c:462 sysfs_add_one+0x4c/0x50()
    [...]
    [   35.115126] APW: deleted something
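
The -EEXIST warning above comes from the sysfs registration in
bind_rdev_to_array(): the rdev kobject is added under the array as
"dev-<name>", and that name is still occupied because the delayed
delete from the previous unbind has not yet run.  Roughly (quoted
loosely from memory, so treat the exact code as approximate):

    static int bind_rdev_to_array(mdk_rdev_t * rdev, mddev_t * mddev)
    {
    [...]
	    /* Registers "dev-sda1" under the md0 kobject; this is what
	     * fails with -EEXIST while the old entry is still pending
	     * deletion. */
	    if ((err = kobject_add(&rdev->kobj, &mddev->kobj,
				   "dev-%s", b)))
		    goto fail;
    [...]
    }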

Here, in an instance where we happened to mount successfully, note that
the delete falls in the expected place, before the subsequent binds:

    [    3.479917] md: bind<sda5>
    [...]
    [   35.118235] md: md1 stopped.
    [   35.118240] md: unbind<sda5>
    [   35.118244] APW: sysfs_remove_link ret<0>
    [   35.140164] md: export_rdev(sda5)
    [   35.142276] APW: deleted something
    [   35.143848] md: bind<sda1>
    [   35.152288] md: bind<sda5>
    [   35.158571] raid1: raid set md1 active with 1 out of 2 mirrors

If we look at the code for stopping the array we see the following:

    static int do_md_stop(mddev_t * mddev, int mode, int is_open)
    {
    [...]
		    rdev_for_each(rdev, tmp, mddev)
			    if (rdev->raid_disk >= 0) {
				    char nm[20];
				    sprintf(nm, "rd%d", rdev->raid_disk);
				    sysfs_remove_link(&mddev->kobj, nm);
			    }

		    /* make sure all md_delayed_delete calls have finished */
		    flush_scheduled_work();

		    export_array(mddev);
    [...]

Note that we call flush_scheduled_work() to wait for the
md_delayed_delete calls to complete and only then export the array.
However, it is export_array() itself which triggers these delayed
deletes:

    static void export_array(mddev_t *mddev)
    {
    [...]
	    rdev_for_each(rdev, tmp, mddev) {
		    if (!rdev->mddev) {
			    MD_BUG();
			    continue;
		    }
		    kick_rdev_from_array(rdev);
	    }
    [...]
    }

It does this for each device via kick_rdev_from_array():

    static void kick_rdev_from_array(mdk_rdev_t * rdev)
    {
	    unbind_rdev_from_array(rdev);
	    export_rdev(rdev);
    }

And it is unbind_rdev_from_array() which schedules the delayed delete:

    static void unbind_rdev_from_array(mdk_rdev_t * rdev)
    {
    [...]
	    rdev->sysfs_state = NULL;
	    /* We need to delay this, otherwise we can deadlock when
	     * writing to 'remove' to "dev/state".  We also need
	     * to delay it due to rcu usage.
	     */
	    synchronize_rcu();
	    INIT_WORK(&rdev->del_work, md_delayed_delete);
	    kobject_get(&rdev->kobj);
	    schedule_work(&rdev->del_work);
    }
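
For completeness, the delayed work itself is what finally removes the
rdev kobject from sysfs, and thus what finally frees up the "dev-<name>"
entry; approximately (again quoted loosely, so treat as a sketch):

    static void md_delayed_delete(struct work_struct *ws)
    {
	    mdk_rdev_t *rdev = container_of(ws, mdk_rdev_t, del_work);

	    /* Only at this point does "dev-sda1" actually disappear
	     * from sysfs, allowing a later bind to reuse the name. */
	    kobject_del(&rdev->kobj);
	    kobject_put(&rdev->kobj);
    }

Until this work item has run, any attempt to re-bind the same device to
an array will collide with the stale sysfs entry.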

So in reality we do not want to wait for these delayed deletes before
export_array() but after it.  Testing with a patch which moves the
flush to after export_array() seems to resolve the issue.
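
That is, something along these lines in do_md_stop() (just a sketch of
the reordering being tested, not the final patch):

    [...]
		    rdev_for_each(rdev, tmp, mddev)
			    if (rdev->raid_disk >= 0) {
				    char nm[20];
				    sprintf(nm, "rd%d", rdev->raid_disk);
				    sysfs_remove_link(&mddev->kobj, nm);
			    }

		    export_array(mddev);

		    /* make sure all md_delayed_delete calls scheduled
		     * by export_array() have finished before the
		     * devices can be reused */
		    flush_scheduled_work();
    [...]

With the flush after export_array() the stale sysfs entries are gone by
the time the array can be assembled again, and the bind no longer hits
-EEXIST.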
