Installing mosh on Dreamhost
Posted by Kelvin on 26 Mar 2013 | Tagged as: programming
Here's a gist which helps you install mosh on Dreamhost: https://gist.github.com/andrewgiessel/4486779
Posted by Kelvin on 26 Nov 2012 | Tagged as: programming
There are a number of examples online which show how to generate HMAC MD5 digests in Java. Unfortunately, most of them generate digests which don't match the examples given on the HMAC Wikipedia page. HMAC_MD5("key", "The quick brown fox jumps over the lazy dog") = 0x80070713463e7749b90c2dc24911e275 HMAC_SHA1("key", "The quick brown fox jumps over the […]
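For reference, here's a minimal sketch of one way to compute the digest with the JDK's own javax.crypto.Mac. It is not necessarily the code from the full post: the class name HmacExample and the helper hmacHex are made up for illustration, and the choice of UTF-8 for both key and message is an assumption. Each byte is formatted as two hex digits so that leading zeros are kept.

```java
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;

// Hypothetical helper class, for illustration only.
final class HmacExample {

    // Computes an HMAC with the given JCA algorithm name (e.g. "HmacMD5")
    // and returns it as lowercase hex, two digits per byte.
    // ASSUMPTION: key and message are encoded as UTF-8.
    static String hmacHex(String algorithm, String key, String message) {
        try {
            Mac mac = Mac.getInstance(algorithm);
            mac.init(new SecretKeySpec(key.getBytes(StandardCharsets.UTF_8), algorithm));
            byte[] digest = mac.doFinal(message.getBytes(StandardCharsets.UTF_8));

            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                // %02x keeps the leading zero of bytes below 0x10
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("HMAC algorithm unavailable: " + algorithm, e);
        }
    }

    public static void main(String[] args) {
        // prints 80070713463e7749b90c2dc24911e275
        System.out.println(hmacHex("HmacMD5", "key",
                "The quick brown fox jumps over the lazy dog"));
    }
}
```

Passing "HmacSHA1" as the algorithm name gives the SHA-1 variant with the same helper.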
Posted by Kelvin on 25 Nov 2012 | Tagged as: programming, PHP
http://code.google.com/p/rolling-curl/ A more efficient implementation of curl_multi()
https://github.com/krakjoe/pthreads http://docs.php.net/manual/en/book.pthreads.php Posix threads in PHP. Whoa!
http://www.underhanded.org/blog/2010/05/05 Installing Apache Worker over prefork.
http://www.wikivs.com/wiki/Apache_vs_nginx I stumbled on this page when researching the pros/cons of Apache + mod_php vs nginx + php5-fpm
http://barry.wordpress.com/2008/04/28/load-balancer-update/ Nice posting about wordpress.com's use of nginx for load balancing.
Posted by Kelvin on 19 Nov 2012 | Tagged as: programming, Lucene / Solr / Elasticsearch / Nutch
Here's a straight Java port of the quicksilver algo, found here: http://orderedlist.com/blog/articles/live-search-with-quicksilver-style-for-jquery/
quicksilver.js contains the actual algorithm in javascript. It uses the same input strings as the demo page at http://static.railstips.org/orderedlist/demos/quicksilverjs/jquery.html
    import java.io.IOException;
    import java.util.TreeSet;
    public class Quicksilver {
      public static void main(String[] args) throws IOException {
        for (ScoreDoc doc : getScores("DGHTD"))
          System.out.println(doc);
        System.out.println("============================================");
        […]
Posted by Kelvin on 16 Nov 2012 | Tagged as: programming
There are *numerous* pages online describing how to fix those awful junk characters in a latin1 column caused by unicode characters. After spending over 2 hours trying out different methods, I found one that's dead simple and actually works:
Export:
    mysqldump -u $user -p --opt --quote-names --skip-set-charset \
        --default-character-set=latin1 $dbname > dump.sql
Import:
    mysql -u […]
Posted by Kelvin on 14 Nov 2012 | Tagged as: programming
I checked out a whole bunch of jQuery tooltip plugins for a new website I just created, and just wanted to say that the best, IMHO, was Tipsy. qTip and qTip2 are obviously very full-featured and beautiful, but overkill for my needs – the qTip 1.0.0-rc3 download weighed in at 38KB minified, and 83KB uncompressed. […]
Posted by Kelvin on 14 Nov 2012 | Tagged as: Lucene / Solr / Elasticsearch / Nutch
Just spent the day hacking together a website that does a blow-by-blow comparison of Solr vs ElasticSearch. Hopefully it'll address any questions people might have about whether to use Solr or ES. Let me know what you think!
Posted by Kelvin on 12 Nov 2012 | Tagged as: Lucene / Solr / Elasticsearch / Nutch
A term is the unit of search in Lucene. A Lucene document comprises a set of terms. Tokenization means splitting up a string into tokens, or terms. A Lucene Tokenizer is what Lucene (and correspondingly, Solr) uses to tokenize text. To implement a custom Tokenizer, you extend org.apache.lucene.analysis.Tokenizer. The only method you need […]
Posted by Kelvin on 12 Nov 2012 | Tagged as: Lucene / Solr / Elasticsearch / Nutch
In my previous post, I described how to extract second- and top-level domains from a URL in Java. Now, I'll build a Lucene Tokenizer out of it, and a Solr TokenizerFactory class. DomainTokenizer doesn't do anything really fancy. It first returns the hostname as the first token, then the 2nd-level domain as the second token, […]
Posted by Kelvin on 12 Nov 2012 | Tagged as: Lucene / Solr / Elasticsearch / Nutch
It turns out that extracting second- and top-level domains is not a simple task, the primary difficulty being that in addition to the usual suspects (.com .org .net etc), there are the country suffixes (.uk .it .de etc) which need to be accounted for. Regex alone has no way of handling this. http://publicsuffix.org/list/ contains a […]
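To make the suffix problem concrete, here's a minimal sketch of a longest-suffix lookup. It is not the code from the full post. The class DomainSketch, its method names, and the handful of hard-coded suffixes are all assumptions for illustration; a real implementation would load the full list from publicsuffix.org, and this sketch ignores the wildcard and exception rules that list contains.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical sketch, for illustration only.
final class DomainSketch {

    // ASSUMPTION: a tiny hand-picked sample, standing in for the full
    // public suffix list (no wildcard or exception rules are handled).
    static final Set<String> SUFFIXES = new HashSet<>(
            Arrays.asList("com", "org", "net", "uk", "co.uk", "it", "de"));

    // Returns the longest known suffix of the hostname, or null if none matches.
    static String publicSuffix(String host) {
        String[] labels = host.toLowerCase(Locale.ROOT).split("\\.");
        // Try the longest candidate first, dropping one leading label each time.
        for (int i = 0; i < labels.length; i++) {
            String candidate = String.join(".",
                    Arrays.copyOfRange(labels, i, labels.length));
            if (SUFFIXES.contains(candidate)) {
                return candidate;
            }
        }
        return null;
    }

    // Returns the suffix plus the single label in front of it,
    // or null if the host has no known suffix or is itself a suffix.
    static String registeredDomain(String host) {
        String h = host.toLowerCase(Locale.ROOT);
        String suffix = publicSuffix(h);
        if (suffix == null || h.equals(suffix)) {
            return null;
        }
        String prefix = h.substring(0, h.length() - suffix.length() - 1);
        String label = prefix.substring(prefix.lastIndexOf('.') + 1);
        return label + "." + suffix;
    }

    public static void main(String[] args) {
        System.out.println(registeredDomain("www.example.co.uk")); // example.co.uk
        System.out.println(registeredDomain("www.example.com"));   // example.com
    }
}
```

A pattern that simply takes the last two labels would return co.uk for the first host, which is why some kind of suffix table is needed.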