Hello Nokogiri
Posted: March 2nd, 2012 | Author: xanda | Filed under: IT Related | Tags: crawler, google, nokogiri, ruby, scrubyt, web
I’ve written about scRUBYt! once before, and I’ve been using it for years as my primary ‘Google crawler’, aka Google web scraper. So it should be no surprise that it was part of the MyLipas Defacement Crawler as well 😉
If you are also using scRUBYt! as your Google web scraper, I suggest you take a look at your script, since it might be broken by now. It’s not only the gem itself; even the domain of their website, scrubyt.org, has now expired (but yes, the project is still on GitHub). I noticed that my crawler was reporting zero URLs (scraped from Google) every day, which made me think of two possibilities: the search strings returned zero matches, OR the scraper was broken. And guess what, my second thought was right.
Yes.. it’s another day in the lab looking back at the crawler/scraper code. I no longer depend on scRUBYt! due to its lack of support/maintenance and broken gem dependencies. So here comes Nokogiri. With its XPath support I managed to get a working crawler as a replacement.. in just a few minutes. Of course the code will be a bit longer, but NVM.. It works like a charm! 😀