jeff's blog
New improved Google Images results
I’m really liking how Google improved the Google Images search results. When you click on an image, it is shown as an overlay on top of the host’s page. I believe this change has increased daily pageviews and visits to one of my domains by about 90%. That’s good for ads!
Google Analytics - Record Outbound Links
If you use Google Analytics, this little snippet is helpful for recording outbound links as events.
http://www.google.com/support/analytics/bin/answer.py?hl=en&answer=55527
BeautifulSoup - convert/decode HTML entity codes into regular python string
Dear BeautifulSoup users,
Use convertEntities=BeautifulSoup.HTML_ENTITIES to decode or convert HTML entity codes into regular Python strings.
Example:
‘&gt;’ converts to ‘>’
‘&amp;’ decodes to ‘&’.
Background
I am working with an XML feed that has HTML embedded in it. By default, BeautifulSoup encodes characters into SGML (or XML or HTML) entities.
This is the XML message I receive.
<message>&lt;strong&gt;Hello&lt;/strong&gt; world</message>
Because BeautifulSoup automatically encodes the contents of the message to be safe for XML, the string I got back was different from the one I expected.
What I wanted
<strong>Hello</strong> world
Instead I got
&lt;strong&gt;Hello&lt;/strong&gt; world
Using convertEntities resolves the problem.
from BeautifulSoup import BeautifulSoup  # BeautifulSoup 3 import path
soup = BeautifulSoup(content, convertEntities=BeautifulSoup.HTML_ENTITIES)
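If you just want to see the decoding step on its own, here is a minimal sketch that uses Python 3's standard library (html.unescape) instead of BeautifulSoup. This is a swapped-in technique, not BeautifulSoup's API. The variable names are made up for illustration, and the sample string is the message from this post.

```python
import html

# The entity-encoded text from the example message in this post.
encoded = "&lt;strong&gt;Hello&lt;/strong&gt; world"

# html.unescape replaces named and numeric character references
# (such as &lt;, &gt; and &amp;) with the characters they stand for.
decoded = html.unescape(encoded)

print(decoded)
```

This only decodes entities in a plain string. It does not parse the surrounding XML, so you would still pull the text out of the message element first.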
I found the answer on StackOverflow.
It is also covered in the BeautifulSoup documentation, but I just didn’t understand their examples.
