Frequently Asked Questions

What is an element?
An element corresponds to a <tag> in an HTML or XML document. It includes not only the name of the tag itself but also its attributes, its contents, etc. It is a tag come to life.

What is a CSS Selector?
CSS has a powerful way of referring to elements within a document. It is like a mini language which allows you to select elements based on their tag name, id, class attribute, etc. Some examples:

div selects all the div elements in your document.
div.info selects all the div elements that have a class of 'info'.
div#profile selects the one div element (if present) that has the id 'profile'.
img[width='200'] selects all the img elements that have a width attribute with a value of '200'.

You can also chain more than one of these selectors together to select elements within elements. Some examples:

div a: the space between div and a implies that a must be a descendant of div, i.e. it is ‘inside’ the div. Matches <div><a> and <div><p><a> but not <div></div><a>.
div > a: the greater-than sign between div and a implies that a must be an immediate child of div. Matches <div><a> and <div><p></p><a> but not <div><p><a>.
div + a: the plus sign implies that the end of div must be immediately followed by a. Matches <div></div><a> but not <div><a> or <div></div><p></p><a>.
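
The three combinators above can be sketched in a few lines of code. This is a minimal illustration of the matching rules only, written in Python for brevity. It is not ElementParser's implementation, and Node, walk, previous_sibling and select are hypothetical names made up for this example. It models elements only (no text, attributes or classes) and handles a single pair of tag names at a time.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional


@dataclass(eq=False)  # eq=False: nodes compare by identity, not by content
class Node:
    """Hypothetical stand-in for an element: a tag name plus child elements."""
    tag: str
    children: List["Node"] = field(default_factory=list)
    parent: Optional["Node"] = field(default=None, repr=False)

    def add(self, child: "Node") -> "Node":
        """Append child to this node and return the child."""
        child.parent = self
        self.children.append(child)
        return child


def walk(node: Node) -> Iterator[Node]:
    """Yield every node below `node` in document order."""
    for child in node.children:
        yield child
        yield from walk(child)


def previous_sibling(node: Node) -> Optional[Node]:
    """Return the element immediately before `node`, or None if there is none."""
    if node.parent is None:
        return None
    siblings = node.parent.children
    index = siblings.index(node)
    return siblings[index - 1] if index > 0 else None


def select(root: Node, left: str, combinator: str, right: str) -> List[Node]:
    """Return the `right` elements related to a `left` element by `combinator`.

    combinator is " " (descendant), ">" (immediate child) or "+" (adjacent sibling).
    """
    matches = []
    for node in walk(root):
        if node.tag != right:
            continue
        if combinator == " ":
            # Climb the ancestor chain until a `left` element is found.
            ancestor = node.parent
            while ancestor is not None and ancestor.tag != left:
                ancestor = ancestor.parent
            matched = ancestor is not None
        elif combinator == ">":
            matched = node.parent is not None and node.parent.tag == left
        elif combinator == "+":
            previous = previous_sibling(node)
            matched = previous is not None and previous.tag == left
        else:
            raise ValueError("unknown combinator: %r" % combinator)
        if matched:
            matches.append(node)
    return matches


# <div><p><a>: a is a descendant of div but not an immediate child.
root = Node("#root")
div = root.add(Node("div"))
para = div.add(Node("p"))
para.add(Node("a"))
print(len(select(root, "div", " ", "a")))  # prints 1
print(len(select(root, "div", ">", "a")))  # prints 0
print(len(select(root, "div", "+", "a")))  # prints 0
```

The descendant check is the only one that has to look further than one step away, which is why 'div a' matches more documents than 'div > a'.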

Does ElementParser support Namespaces / XML Schema / Internal Subsets?
No. These were some of the simplifying design decisions.

Does ElementParser perform as quickly as NSXMLParser / libxml2?
No. That said, in most cases it performs quickly enough… even on the iPhone.

25 Responses to “Frequently Asked Questions”

  1. Peter Says:

    Hey, I’m currently trying out your lib.

    Is there a GitHub project or anything? Since it’s GPL’ed…

    Because it doesn’t compile for iPhone OS 3 with gcc 4.2; there’s an error with attributes.

    I fixed it; you have to use the _exact_ var:
    @property (nonatomic, retain) NSMutableDictionary* attributes;
    (not NSDictionary)

    and also -(NSMutableDictionary*)attributes;

    best,
    Peter

    • touchtank Says:

      Perfect timing! It is now on GitHub. Thanks for the fix… the latest code resolves the issue.

      • Mike Says:

        I don’t think the NSMutableDictionary issue was ever fixed on GitHub. I also had to fix it manually after downloading the latest release.

  2. Bryan Smoltz Says:

    Hello,
    Thanks so much for sharing this. I am using this in my project and it’s very useful. I was wondering how I might be able to select the href attribute from a link. Right now, I get the element itself but not the link.

    Thank you.

    • touchtank Says:

      Bryan,

      You’ll want to do something like this:

      DocumentRoot* document = [Element parseHTML: source];
      Element* aTag = [document selectElement: @"a"]; // use more complex css selector to get to the 'right' element
      NSString* href = [aTag attribute: @"href"];

  3. itgiawa Says:

    How can I get the whole element including start and end tags?
    Right now I can only get the start tag with [element description] and the stuff in between the tags with [element contentsSource]. But I need the end tag as well.

    How do I get that?

    Thanks!

    • touchtank Says:

      One of the simplifications of ElementParser is that it doesn’t ‘store’ end tags. It is not designed for round-tripping. Can you help me understand why you need it?

  4. itgiawa Says:

    Sometimes HTML has text that’s not inside tags, like this:

    text I want

    I’m trying to get that “text I want”
    The only way I can think to do that is to remove the tags from the parent element. It would work except that I can’t make the assumption that all tags have end tags.

    If you have another way to get that raw text please let me know.

    • itgiawa Says:

      For some reason my HTML didn’t show. I’ll try again:

      <div>
         text I want
         <randomTag></randomTag>
         <tagWithNoEndTag>
      </div>
      
      • touchtank Says:

        Hmmm… You have hit one of the areas where the simplification is unhelpful. You could get the contentsText or source of the enclosing element but you can’t get just the text node within the element. It would be a fairly straightforward method to add on Element though… The joys of open source 🙂

  5. itgiawa Says:

    All right, if you don’t have a solution… then I’ll just scan ahead and look for an end tag myself. A little extra work and kind of ugly, but no biggie.

  6. itgiawa Says:

    It looks like

     [element childElements] 

    omits some elements when there are tags without corresponding end tags.

    Instead, this code does seem to work…

    DocumentRoot* childDocument = [Element parseHTML: [element contentsSource]];
    NSArray* children = [childDocument selectElements: pattern];

    Just something to be aware of…

    • itgiawa Says:

      Actually, the above code does not work…
      Can you think of an easy way to make childElements or syblingElements work even when there are tags that are not closed?

      If you can’t think of something I’m going to have to write my own parser…

      For example, if you get the children of the html element:

      text

      text

      it will only return div2 as a child.

  7. Gigi Says:

    Hi,
    I’m new to Objective-C but I’m using this library to run some tests on HTML parsing.
    I declared a string in this way:
    string1 = @"the text";
    then:
    DocumentRoot* document = [Element parseHTML: string1];
    Element* theElement = [document selectElement: @"p"];

    The problem is that the contentsText of “theElement” is nil, while the contentsLength is 8.
    Any ideas?

    Thank you

  8. Gigi Says:

    Oh sorry,
    the string is declared in this way:
    string1 = @"the text";

  9. Lele Says:

    Gigi, you must use a string with HTML in it, for example @"ciao a tutti buon natale"; in that case, if you parse the HTML with the pattern @"p", you get a result. 🙂

  10. Lele Says:

    Sorry, for example @"ciao a tutti buon natale"

  11. Sam Says:

    Is there a way to tell when it’s done parsing so I could call another method, like -(void) didFinishParsing or something? Thanks.

  12. Saleh Hosseinkhani Says:

    May I ask what the difference is between this parser and libxml2, or an HTML parser such as Hpple?

