The dev blog for Justin Duewel-Zahniser's Chapbook poetry sharing web app project. I'm going to post random stuff here that's too big for the dev list and hopefully get conversations going with testers/users.

Search Form and Some UI

Chapbook now has a proper search form which you can drop down from the top menu. It also introduces a *gasp* new color. I also spruced up the Favorites page just a tiny bit, as it was very, very rudimentary. The next target is chapbook organization, then the river, and then I will try to get some beta people to opine on the functionality and usefulness.

I must also not get too distracted by other ideas.


Rails Issue w/ Highlight and Excerpt TextHelper Methods

Here’s a weekend Rails discovery that I meant to post about yesterday. I was working on the search support in Chapbook using Ferret and was referencing the Acts_as_Ferret Tutorial from Rails Envy. I jumped down to the section on Highlighting because I wanted to add highlighting to the result set for ease of use. Here’s what I read:

The requirement to do this, however, is that you must have your search fields stored as I showed above.

So, what was shown above?

If you take a look inside one of your search indexes right now, believe it or not, you would not see your data. By default acts_as_ferret does not store your data in a recoverable form, it just indexes it.

“What if my data is small and I want to store it in the index?” I hear you ask.

Good question grasshopper. If your data is small, or you only really care about one field of information, you can get a speed bonus by storing the data in the index itself.

Note the bit about your data being small. I’m really uncomfortable changing how my data is stored, just to get search highlighting, to a format that seems to imply trouble down the road as the data grows. The tutorial gave no information about the scale limits or long-term effects, so I was nervous about doing this just for highlighting.

Rails has some TextHelper methods called “highlight” and “excerpt” which provide pretty much what you might guess by the name. So I decided to try these out as an alternative which would not involve indexing or data growth dangers.

I ran into a problem. If I searched for “dog” and looked at the result set, I would see “dogged” in the highlighting. Not good. So I took a look at the source in the Rails API:

    def highlight(text, phrases, highlighter = '<strong class="highlight">\1</strong>')
      if text.blank? || phrases.blank?
        text
      else
        match = Array(phrases).map { |p| Regexp.escape(p) }.join('|')
        text.gsub(/(#{match})/i, highlighter)
      end
    end

    def excerpt(text, phrase, radius = 100, excerpt_string = "...")
      if text.nil? || phrase.nil? then return end
      phrase = Regexp.escape(phrase)

      if found_pos = text.chars =~ /(#{phrase})/i
        start_pos = [ found_pos - radius, 0 ].max
        end_pos = [ found_pos + phrase.chars.length + radius, text.chars.length ].min

        prefix = start_pos > 0 ? excerpt_string : ""
        postfix = end_pos < text.chars.length ? excerpt_string : ""

        prefix + text.chars[start_pos..end_pos].strip + postfix
      else
        nil
      end
    end

Look at those regular expressions. Look again. Hm. They’re too simple. In fact, they won’t highlight correctly on full words and they won’t highlight or excerpt correctly in the case of punctuation following a word (e.g. “… the dog.”). Ew.
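
The partial-word problem is easy to reproduce outside Rails. Here is a minimal sketch in plain Ruby, assuming a hypothetical `naive_highlight` that builds its regexp the same way as the helper above, with square brackets standing in for the HTML highlighter so the result is easy to read:

```ruby
# Hypothetical stand-in for the Rails helper (not the Rails method itself):
# same regexp construction, but [brackets] instead of HTML markup.
def naive_highlight(text, phrases, highlighter = '[\1]')
  # Plain-Ruby replacement for Rails' blank? checks.
  return text if text.nil? || text.empty? || phrases.nil? || phrases.empty?

  match = Array(phrases).map { |p| Regexp.escape(p) }.join('|')
  # No word boundaries: "dog" matches anywhere, including inside "dogged".
  text.gsub(/(#{match})/i, highlighter)
end

puts naive_highlight("The dogged dog.", "dog")
# prints: The [dog]ged [dog].
```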

But that’s a solvable problem. I’m not sure of the most Railsy way to solve the problem, but I created alternate helpers in my poem helper which upgrade the regexps to be a bit smarter. I wonder if I should submit these, or go ahead and override the methods in my app. Not sure what else I might break or whether someone would reject my changes because the lack of rigor around the matching is desirable for some use. Anyway, here’s the code.

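    module PoemsHelper
      def search_excerpt(text, phrase, radius = 100, excerpt_string = "...")
        if text.nil? || phrase.nil? then return end
        phrase = Regexp.escape(phrase)

        if found_pos = text.chars =~ /(\W+#{phrase}\W+)/i
          start_pos = [ found_pos - radius, 0 ].max
          end_pos = [ found_pos + phrase.chars.length + radius, text.chars.length ].min

          prefix = start_pos > 0 ? excerpt_string : ""
          postfix = end_pos < text.chars.length ? excerpt_string : ""

          prefix + text.chars[start_pos..end_pos].strip + postfix
        else
          nil
        end
      end

      # The method name is assumed from the search_* naming; this signature
      # was garbled in the post.
      def search_highlight(text, phrases, highlighter = '<strong class="highlight">\1</strong>')
        if text.blank? || phrases.blank?
          text
        else
          match = Array(phrases).map { |p| Regexp.escape(p) }.join('|')
          text.gsub(/(\W+#{match}\W+)/i, highlighter)
        end
      end
    end
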
Pretty much just some \W+s thrown in there and it works great. And I didn’t have to change my indexes. Using “search_*” feels dirty, though.
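
One trade-off with `\W+` is that it needs a non-word character on both sides of the term. It skips a match at the very start or end of the text, and it pulls the neighbouring spaces and punctuation into the highlighted span. A possible alternative is the zero-width `\b` word-boundary anchor. This is a sketch, not what Chapbook uses, and both method names are hypothetical. Brackets again stand in for the HTML highlighter.

```ruby
# Hypothetical helper mirroring the \W+ approach, for comparison only.
def nonword_highlight(text, phrases, highlighter = '[\1]')
  match = Array(phrases).map { |p| Regexp.escape(p) }.join('|')
  text.gsub(/(\W+#{match}\W+)/i, highlighter)
end

# Hypothetical alternative: \b matches the position between a word character
# and a non-word character (or the string edge) without consuming anything.
def boundary_highlight(text, phrases, highlighter = '[\1]')
  return text if text.nil? || text.empty? || phrases.nil? || phrases.empty?

  match = Array(phrases).map { |p| Regexp.escape(p) }.join('|')
  # The capture group also scopes the alternation, so both \b anchors
  # apply to every phrase in the list.
  text.gsub(/\b(#{match})\b/i, highlighter)
end

sample = "Dog days. The dog, dogged."
puts nonword_highlight(sample, "dog")
# prints: Dog days. The[ dog, ]dogged.
puts boundary_highlight(sample, "dog")
# prints: [Dog] days. The [dog], dogged.
```

A caveat: `\b` only makes sense for terms that begin and end with word characters, so a search term like “c++” would need different handling.
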

Comments Feed

Phew, long time.  Busy at work.

The poem pages now parse out the Disqus comments feed and set it up for auto-discovery in the browser.  So, you now have easier access to subscribe to comments on any poem.
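
For anyone unfamiliar with auto-discovery: the browser looks for a `<link rel="alternate">` element in the page head and offers the feed from there. Below is a minimal sketch in plain Ruby of generating such a tag, assuming a hypothetical helper name `rss_discovery_tag` and a made-up feed URL (the real Disqus feed URLs are not shown here). Rails ships its own `auto_discovery_link_tag` helper for the same job. This is only an illustration, not necessarily how Chapbook does it.

```ruby
require 'cgi'

# Builds an RSS auto-discovery <link> tag for the page <head>.
# feed_url and title are hypothetical inputs, not Chapbook's real values.
def rss_discovery_tag(feed_url, title = "Comments")
  format('<link rel="alternate" type="application/rss+xml" title="%s" href="%s" />',
         CGI.escapeHTML(title), CGI.escapeHTML(feed_url))
end

puts rss_discovery_tag("http://example.com/poems/1/comments.rss", "Comments on this poem")
```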


OpenID Improvements

Okay, one quick refactoring later.  The site is now based on OpenID rather than locked-in accounts.  So, you use an OpenID, you get logged in, a blank account is created for you and you must add a name before proceeding.  That’s pretty much the deal.  The login page now has some guidance on OpenID for anyone not familiar.

You can also add a backup OpenID (or 20). Eventually I will have those authenticate when you add them, but for now they are simply attached to the account and aren’t authenticated until you try to use them to log in.

Also, congrats to the Heroku team for getting a big round of funding.  I can’t wait to see what they do with the money.  I’m sure it will be awesome.


OpenID Support

I’ve added in OpenID support.  You can now create an account/login using OpenID and/or Clickpass.

There are, however, some oddities. Clickpass doesn’t appear to provide a nickname or email when it authenticates a user, so if you log in by OpenID you can’t use an existing account. Long term, this won’t be a problem because my plan is to move to purely OpenID-based authentication, use what data the provider will give and ask you to fill in the rest. That way, I won’t be generating a bunch of proprietary, chapbook-only user accounts.

I’ll probably also stop accepting file upload avatars and go web-based instead, encouraging the use of Gravatars. 


RSS for Search

You can now perform a keyword search using the URL format http://poetry.heroku.com/search/waterfall and subscribe to the feed from the result set (or be optimistic and add .rss after the terms to get the feed straight away).
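
As a rough sketch of that URL format, here is a small plain-Ruby helper. The name `search_url` is hypothetical, and percent-encoding the terms is an assumption, since only a single-word example is given above.

```ruby
require 'erb'

SEARCH_BASE = "http://poetry.heroku.com/search"

# Builds a search URL in the format described above.
# Percent-encoding the terms is an assumption, not confirmed app behavior.
def search_url(terms, rss: false)
  url = "#{SEARCH_BASE}/#{ERB::Util.url_encode(terms)}"
  rss ? "#{url}.rss" : url
end

puts search_url("waterfall")
# prints: http://poetry.heroku.com/search/waterfall
puts search_url("waterfall", rss: true)
# prints: http://poetry.heroku.com/search/waterfall.rss
```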

Hoping to get some time this evening to build a search form. 


Revisions RSS and Search

RSS feeds for revisions and search are now working (miracle occurred?). I need to build a search form, but for now the syntax is http://poetry.heroku.com/search/[terms], with the usual Ferret OR and other modifiers being applicable.


FOAF Added (why?)

Thanks to CrowdVine Blog » Blog Archive » Implementing FOAF in Rails, I was able to implement Friend of a Friend (FOAF, an RDF spec for semantic web stuff) quite easily on Chapbook. So, now your profile is more semantical.


Most RSS Now Working

With the exception of the user’s activity log (aka “the river”), revisions to poems and comments on poems, all RSS is now operational. I’ll try to polish the rest off this weekend. Disqus has an RSS link for comments on poems, but I’m in the process of figuring out how to make it auto-discoverable. The rest just requires me to write more code.

The river may have to wait until beta, since it will be easier to test, design and validate if there’s a lot of activity (or, more than my own).


A Poem About The Internet

kfan:

Do not stop to think or edit
You must be the first who said it.

Nice.
