Processing XHTML files with the XDocument class adds unwanted elements

I am working on a project which requires me to process XHTML files to fix the content of certain tags. The fixing itself is not a problem; however, I run into trouble when saving the files.

The code I am using is:

    var spanNodesList = p.GetSpanNodesList(xDoc);

    foreach (XElement span in spanNodesList)
    {
        if (string.IsNullOrEmpty(span.Value))
        {
            span.Remove();
        }
        else
        {
            string[] words = p.SplitNodeText(span.Value);
            XElement parent = span.Parent;
            span.Remove();

            foreach (string word in words)
            {
                parent.Add(new XElement("span", word,
                    new XAttribute("id", "w" + p.currentNodeID.ToString())));
                p.currentNodeID++;
            }
        }
    }

    List<XElement> GetSpanNodesList(XDocument file)
    {
        // Get only 'word' nodes
        var spanNodes = file.Descendants("{http://www.w3.org/1999/xhtml}span");
        if (spanNodes != null)
        {
            var spanNodesList = spanNodes.ToList();
            spanNodesList.RemoveAll(x => ((x.Attribute("id") == null) || !x.Attribute("id").Value.Contains("w")));
            return spanNodesList;
        }
        else return null;
    }

At first I couldn't get any elements, because the query yielded no results. I then found somewhere on SO that I might need to add a namespace reference, as in file.Descendants("{http://www.w3.org/1999/xhtml}span"). This has indeed helped and I now get the nodes I want. However, the resulting output has two problems.

        <span id="w1" xmlns="">Word one</span>
        <span id="w2" xmlns="">Word two</span>
        <span id="w3" xmlns="">Word three</span>

It adds an xmlns attribute, which I don't need (and which was not in the original file), and it adds an <?xml version="1.0" encoding="utf-8"?> header. I assume this is expected behaviour given what I coded, so my question is: what can I do to get rid of these problems? Or is there perhaps a better way of dealing with XHTML files? Also, I don't know if this is relevant, but the source files reference a number of different namespaces.

Cheers, Bartosz

Jon Skeet

When you add the span element, you're creating it without a namespace, whereas some ancestor element has set the default namespace. To keep the new element in no namespace, LINQ to XML has to emit xmlns="" on it. All you need to do is use the right namespace for your new elements:

XNamespace ns = "http://www.w3.org/1999/xhtml";
...
parent.Add(new XElement(ns + "span", ...));

Likewise you can use:

var spanNodes = file.Descendants(ns + "span");

which is rather more readable, IMO. You almost certainly don't need to worry about the XML declaration.
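
To put the pieces together, here is a minimal sketch rather than a drop-in replacement for the code in the question. The names XhtmlSpanDemo and SplitSpans are hypothetical, plain whitespace splitting stands in for SplitNodeText, and it uses ReplaceWith so the new spans stay where the old one was (the question's code appends them to the end of the parent instead). If you really do want to drop the XML declaration, one option is to save through an XmlWriter with OmitXmlDeclaration set:

```csharp
using System;
using System.Linq;
using System.Text;
using System.Xml;
using System.Xml.Linq;

public static class XhtmlSpanDemo
{
    // Same namespace URI as in the question's Descendants() call.
    private static readonly XNamespace Xhtml = "http://www.w3.org/1999/xhtml";

    // Hypothetical helper: splits every span whose id contains "w" into one span per word.
    public static string SplitSpans(string xhtml)
    {
        XDocument doc = XDocument.Parse(xhtml);
        int nextId = 1;

        // Materialise the query first so the tree can be modified while looping.
        var spans = doc.Descendants(Xhtml + "span")
            .Where(s => ((string)s.Attribute("id") ?? "").Contains("w"))
            .ToList();

        foreach (XElement span in spans)
        {
            // Skip spans already detached because an enclosing span was replaced.
            if (span.Parent == null) continue;

            // Assumption: plain whitespace splitting stands in for SplitNodeText.
            string[] words = span.Value.Split(
                new[] { ' ', '\t', '\r', '\n' }, StringSplitOptions.RemoveEmptyEntries);

            // New elements live in the XHTML namespace, so no xmlns="" is emitted.
            var replacements = words
                .Select(w => new XElement(Xhtml + "span", new XAttribute("id", "w" + nextId++), w))
                .ToList();

            if (replacements.Count == 0)
                span.Remove();
            else
                span.ReplaceWith(replacements);
        }

        // Optional: write the document without the <?xml ...?> declaration.
        var settings = new XmlWriterSettings { OmitXmlDeclaration = true };
        var output = new StringBuilder();
        using (XmlWriter writer = XmlWriter.Create(output, settings))
        {
            doc.Save(writer);
        }
        return output.ToString();
    }

    public static void Main()
    {
        string input =
            "<html xmlns=\"http://www.w3.org/1999/xhtml\"><body><p>" +
            "<span id=\"w1\">Word one</span></p></body></html>";
        Console.WriteLine(SplitSpans(input));
    }
}
```

The printed markup should contain no xmlns="" on the new spans, because they share the namespace already declared on the html element. If you only need a string rather than a file, doc.ToString() also leaves the declaration out.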
