Supermind Search Consulting Blog 
Solr - Elasticsearch - Big Data

Make Eclipse more like IntelliJ IDEA

Posted by Kelvin on 12 Oct 2010 | Tagged as: programming

http://byteco.de/2010/08/03/making-eclipse-like-idea/ has 2 excellent tips for making Eclipse more like IntelliJ. The most important is the IntelliJ keyboard mapping for Eclipse: http://code.google.com/p/ideakeyscheme/updates/list Copy the download to your Eclipse/plugins folder and restart Eclipse. Then change the keyboard scheme to the IntelliJ scheme.

GimpShop – a saner interface for Gimp on Ubuntu

Posted by Kelvin on 10 Oct 2010 | Tagged as: Ubuntu

Gimp's interface sucks. GimpShop offers an interface which should be familiar to Photoshop users, but still uses Gimp in the background. Installation instructions here

Run php from html files on Dreamhost

Posted by Kelvin on 10 Oct 2010 | Tagged as: programming, PHP

Modify .htaccess to include this:

Correct:
AddType php-cgi .html .htm

WRONG:
AddType application/x-httpd-php .php .htm .html
or
AddHandler application/x-httpd-php .html

Upgrade your HTC Droid Eris to Android 2.2

Posted by Kelvin on 10 Oct 2010 | Tagged as: programming, android

Why bother upgrading? 2 simple reasons: USB and wifi tethering. Instructions courtesy of my friend Jack:

Step 1) Do a complete backup of your SD card data (just in case)
Step 2) Root your phone – Go to http://forum.xda-developers.com/showthread.php?t=742228 and follow the instructions
Step 3) Do a Nand backup – Make sure you have >=500Mb […]

[SOLVED] Howto build the PHP rrdtool extension

Posted by Kelvin on 09 Oct 2010 | Tagged as: programming, Ubuntu, PHP

The definitive answer is here: http://www.samtseng.liho.tw/~samtz/blog/2009/03/11/howto-build-the-php-rrdtool-extension/

If you're on Ubuntu, do this first:

sudo apt-get install rrdtool librrd-dev php5-dev

Then follow the steps in the linked post.

[SOLVED] curl: (56) Received problem 2 in the chunky parser

Posted by Kelvin on 09 Oct 2010 | Tagged as: PHP, programming, crawling

The problem is described here: http://curl.haxx.se/mail/lib-2006-04/0046.html I successfully tracked the problem to the "Connection:" header. It seems that if the "Connection: keep-alive" request header is not sent, the server will respond with data which is not chunked. It will still reply with a "Transfer-Encoding: chunked" response header though. I don't think this behavior is […]

How to write a custom Solr FunctionQuery

Posted by Kelvin on 03 Sep 2010 | Tagged as: programming, Lucene / Solr / Elasticsearch / Nutch

Solr FunctionQueries allow you to modify the ranking of a search query in Solr by applying functions to the results. There is a list of out-of-the-box FunctionQueries available here: http://wiki.apache.org/solr/FunctionQuery In order to write a custom Solr FunctionQuery, you'll need to do 2 things: 1. Subclass org.apache.solr.search.ValueSourceParser. Here's a stub ValueSourceParser. public class MyValueSourceParser extends […]

RT: Larry is furious about this Mark Hurd thing

Posted by Kelvin on 27 Aug 2010 | Tagged as: life

This is absolutely priceless: Honestly, he won’t let it go. He’s calling me over and over saying, Wait until you hear the latest, you won’t believe what they’re saying now! As if I care. Jesus. I put my iPhone 4 down on the desk and let him rant for a few minutes while I do […]

Arithmetic mean vs Geometric mean

Posted by Kelvin on 24 Aug 2010 | Tagged as: programming

I've been brushing up on some basic statistics, and ran into this interesting bit of information. We're all familiar with the average of a set of values, also known as the mean.

Arithmetic Mean

Turns out that there's more than one way to calculate the mean of a distribution. The method we probably associate with […]
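For readers who want to see the two means side by side, here is a minimal Python sketch. It is not taken from the full post: the function names and the sample values are made up for illustration, and the geometric mean is assumed to be wanted only for strictly positive inputs.

```python
import math


def arithmetic_mean(values):
    """Sum of the values divided by how many there are."""
    return sum(values) / len(values)


def geometric_mean(values):
    """n-th root of the product of n values.

    Computed via logarithms so a long list does not overflow.
    Assumption: every value is strictly positive.
    """
    if any(v <= 0 for v in values):
        raise ValueError("geometric mean needs strictly positive values")
    return math.exp(sum(math.log(v) for v in values) / len(values))


if __name__ == "__main__":
    sample = [2, 8]  # made-up values, purely illustrative
    print("arithmetic:", arithmetic_mean(sample))
    print("geometric: ", geometric_mean(sample))
```

For the sample above the arithmetic mean is 5 while the geometric mean is roughly 4 (the square root of 2 × 8), which shows how the geometric mean is pulled less by the larger value.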

Average length of a URL (Part 2)

Posted by Kelvin on 16 Aug 2010 | Tagged as: programming

Here's a follow-up on my previous attempt at calculating the average length of a URL, which was naive and totally primitive. In my previous attempt, I used the DMOZ urls and arrived at 4074300 unique URLs averaging 34 characters each. The DMOZ dataset is inadequate for a number of reasons, most of all because DMOZ's […]
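The basic calculation described here (deduplicate the URLs, then average their length in characters) can be sketched in a few lines of Python. This is only an illustration, not the code used for the post: the function name and the sample URLs are hypothetical, and length is assumed to mean the character count of the raw URL string.

```python
def average_url_length(urls):
    """Return (number of unique URLs, average length in characters).

    Duplicates and blank lines are dropped before averaging.
    """
    unique = {u.strip() for u in urls if u.strip()}
    if not unique:
        return 0, 0.0
    total_chars = sum(len(u) for u in unique)
    return len(unique), total_chars / len(unique)


if __name__ == "__main__":
    # Made-up sample; a real run would read one URL per line from a dump file.
    sample = [
        "http://a.com",
        "http://a.com",   # duplicate, counted once
        "http://bb.com",
    ]
    count, avg = average_url_length(sample)
    print(count, "unique URLs, average length", avg)
```

Using a set keeps the deduplication simple, at the cost of holding every unique URL in memory at once.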
