Friday, March 4, 2011

'text compare' churn engine

How does churnalism.com work?


The site compresses every article published on national newspaper websites, on BBC News, and on Sky News online into a series of numbers, each computed by a hash function from a 15-character string, and stores them in a fast-access database. When someone pastes in some text and clicks 'compare', the churn engine compresses the pasted text in the same way and then searches the database for articles that share the same numbers (or 'common hashes'). If the engine finds any article where the similarity is greater than 20%, it suggests the article may be churn. Churnalism.com is built on the journalisted.com database of more than three million compressed articles.
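To make the idea concrete, here is a minimal sketch in Python of how such a comparison could work. It is not churnalism.com's actual code: the overlapping windows, the whitespace and case normalisation, the use of MD5, the function names, and the way similarity is measured (the share of the pasted text's hashes that also appear in the article) are all assumptions made for illustration. Only the 15-character strings and the 20% threshold come from the description above.

```python
import hashlib

WINDOW = 15       # characters per string, as described above
THRESHOLD = 0.20  # similarity above which an article is flagged


def normalise(text):
    """Lower-case the text and collapse whitespace (an assumed step)."""
    return " ".join(text.lower().split())


def hashes(text, window=WINDOW):
    """Return the set of numbers for every overlapping 15-character string."""
    text = normalise(text)
    if len(text) < window:
        return set()
    result = set()
    for i in range(len(text) - window + 1):
        chunk = text[i:i + window].encode("utf-8")
        # MD5 is an arbitrary choice here; keep 8 bytes as a compact number.
        result.add(int.from_bytes(hashlib.md5(chunk).digest()[:8], "big"))
    return result


def similarity(pasted, article):
    """Fraction of the pasted text's hashes also found in the article."""
    pasted_hashes = hashes(pasted)
    if not pasted_hashes:
        return 0.0
    return len(pasted_hashes & hashes(article)) / len(pasted_hashes)


def may_be_churn(pasted, article, threshold=THRESHOLD):
    """Apply the 20% rule from the description."""
    return similarity(pasted, article) > threshold
```

For example, calling `may_be_churn(press_release, article)` returns `True` when more than a fifth of the press release's hashes also turn up in the article. A real engine would store each article's hashes in an indexed database and look them up there, rather than re-hashing every article for each comparison.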


