Supermind Search Consulting Blog 
Solr - Elasticsearch - Big Data

Recursively find the n latest modified files in a directory

Posted by Kelvin on 18 May 2011 | Tagged as: programming, Ubuntu

Here's how to find the most recently modified files in a directory. Particularly useful when you've made some changes and can't remember what!

find . -type f -printf '%T@ %p\n' | sort -n | tail -1 | cut -f2- -d" "

Replace tail -1 with tail -20 to list the 20 most recent files, for example. […]
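For anyone who prefers a script, here is a rough Python sketch of the same idea: walk the tree, collect modification times of regular files, sort, and keep the last n. The function name latest_modified and its parameters are made up for illustration; it is not part of the original one-liner.

```python
import os
import stat


def latest_modified(root, n=1):
    """Return the n most recently modified regular files under root, newest last."""
    if n <= 0:
        return []
    entries = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            info = os.lstat(path)
            # Like `find -type f`: skip symlinks, sockets, and other non-regular files.
            if stat.S_ISREG(info.st_mode):
                entries.append((info.st_mtime, path))
    entries.sort()  # oldest first, like `sort -n` on the timestamp
    return [path for _mtime, path in entries[-n:]]


if __name__ == "__main__":
    for path in latest_modified(".", 20):
        print(path)
```

As with tail -20, the newest file is printed last.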

Convert fixed-width file to CSV

Posted by Kelvin on 12 May 2011 | Tagged as: programming, Ubuntu

After trying various sed/awk recipes to convert from fixed-width to CSV, I found a Python script that works well. Here it is, from http://code.activestate.com/recipes/452503-convert-db-fixed-width-output-to-csv-format/

## {{{ http://code.activestate.com/recipes/452503/ (r1)
# Ian Maurer
# http://itmaurer.com/
# Convert a Fixed Width file to a CSV with Headers
#
# Requires following format:
#
# header1 header2 header3 […]
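To show the general technique (this is not the ActiveState recipe, which is cut off above), here is a minimal sketch that assumes you already know each column's width and pass it in explicitly. The function name fixed_width_to_csv and the widths parameter are hypothetical.

```python
import csv
import io


def fixed_width_to_csv(lines, widths):
    """Convert fixed-width text lines to CSV text, given each column's width."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for line in lines:
        line = line.rstrip("\n")
        fields = []
        start = 0
        for width in widths:
            # Slice out the column and drop the padding around the value.
            fields.append(line[start:start + width].strip())
            start += width
        writer.writerow(fields)
    return out.getvalue()


if __name__ == "__main__":
    sample = ["id   name      ", "1    Ann, B.   "]
    print(fixed_width_to_csv(sample, [5, 10]), end="")
```

Using the csv module rather than joining on commas means values that themselves contain commas or quotes get quoted properly.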

MD5 a directory recursively

Posted by Kelvin on 05 May 2011 | Tagged as: Ubuntu

Ever need to check if a directory is exactly the same as another (including file contents)?

find . -type f -exec md5sum {} + | awk '{print $1}' | sort | md5sum

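This runs md5sum over the sorted list of per-file md5sum hashes. Because awk keeps only the hash column, file names and paths are not part of the comparison: two directories match whenever they hold the same set of file contents.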
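The same hash-of-hashes idea can be sketched in Python, which may help on systems without GNU find or md5sum. This is an illustration under assumptions: the function name md5_of_directory is made up, and it is meant to mirror the pipeline (one hash per line, sorted, then hashed again) rather than being guaranteed to match it byte for byte in every locale.

```python
import hashlib
import os
import stat


def md5_of_directory(root):
    """Return one MD5 hex digest summarising the contents of all regular files under root."""
    file_hashes = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Like `find -type f`: only regular files, no symlinks.
            if not stat.S_ISREG(os.lstat(path).st_mode):
                continue
            digest = hashlib.md5()
            with open(path, "rb") as fh:
                # Read in 1 MiB chunks so large files do not need to fit in memory.
                for chunk in iter(lambda: fh.read(1 << 20), b""):
                    digest.update(chunk)
            file_hashes.append(digest.hexdigest())
    # Sorting makes the result independent of directory traversal order.
    file_hashes.sort()
    combined = "".join(h + "\n" for h in file_hashes)
    return hashlib.md5(combined.encode("ascii")).hexdigest()


if __name__ == "__main__":
    print(md5_of_directory("."))
```

Run it on both directories and compare the two digests.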

Application-wide keyboard shortcuts in Swing

Posted by Kelvin on 21 Apr 2011 | Tagged as: programming

Swing's keyboard events are fired only at the component that currently has focus. One way of implementing application-wide keyboard shortcuts is to add them to _every_ component that is created (yes, it's as ridonkulous as it sounds). Here's another way, using KeyboardFocusManager:

// Add Ctrl-W listener to quit application
KeyboardFocusManager.getCurrentKeyboardFocusManager().addKeyEventDispatcher(new KeyEventDispatcher(){
  public boolean […]

Working MySQL 5.1+ Levenshtein Stored Procedure

Posted by Kelvin on 13 Apr 2011 | Tagged as: programming

Update: Changed 0x00 to '\0' as per Jan-Hendrik's comment below. There are a number of MySQL functions for calculating Levenshtein distance floating around StackOverflow and other forums. They all seem to be based off http://codejanitor.com/wp/2007/02/10/levenshtein-distance-as-a-mysql-stored-function/ (broken link). Anyway, I couldn't get them to work for me. MySQL complained: ERROR 1064 (42000): You have an error […]
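For reference, here is a small Python sketch of the Levenshtein distance itself, handy for sanity-checking whatever a stored function returns. It is the textbook dynamic-programming version, not a translation of the MySQL code from the post, and the function name levenshtein is just for illustration.

```python
def levenshtein(a, b):
    """Return the minimum number of single-character edits turning a into b."""
    # Keep the shorter string in the inner loop so the row stays small.
    if len(a) < len(b):
        a, b = b, a
    # previous[j] is the distance between the first i-1 chars of a and the first j chars of b.
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution (or match)
            ))
        previous = current
    return previous[-1]


if __name__ == "__main__":
    print(levenshtein("kitten", "sitting"))
```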

Name parser links

Posted by Kelvin on 13 Apr 2011 | Tagged as: programming

I'm about to write some code to normalize names, e.g. split out firstName, middleName, lastName etc. Here are some links on the topic:

http://search.cpan.org/dist/Lingua-EN-NameParse/lib/Lingua/EN/NameParse.pm
http://alphahelical.com/code/misc/nameparse/nameparse.php.txt
http://jasonpriem.com/human-name-parse/
http://code.google.com/p/php-name-parser/
http://www.onlineaspect.com/2009/08/17/splitting-names/

Preventing Java XML Parsers from resolving external DTDs

Posted by Kelvin on 07 Apr 2011 | Tagged as: programming

With some SAX parsers you can disable loading of external DTDs with this:

xmlReader.setFeature("http://apache.org/xml/features/nonvalidating/load-external-dtd", false);

Not all parsers support this feature, however; Piccolo, for one, does not. You can accomplish the same thing with this:

SAXReader reader = new SAXReader();
reader.setEntityResolver(new EntityResolver(){
  public InputSource resolveEntity(String publicId, String systemId) throws SAXException, IOException {
    return new InputSource(new StringReader("")); […]

[solved] checking for shout-config… no while compiling ices

Posted by Kelvin on 05 Apr 2011 | Tagged as: Ubuntu

If you're trying to compile ices and get this error:

checking for pkg-config… /usr/bin/pkg-config
configure: /usr/bin/pkg-config couldn't find libshout. Try adjusting PKG_CONFIG_PATH.
checking for shout-config… no
configure: error: Could not find a usable libshout

And you swear you've already installed libshout and libshout-devel, then you need to install libtheora and libtheora-devel. Yes, the error message […]

Improving EasyHotSpot usability

Posted by Kelvin on 03 Apr 2011 | Tagged as: Ubuntu

Here are some changes I made to make EasyHotSpot more usable. If you are interested in any of these changes, just drop me a mail and I'll email them to you.

1. Allow voucher generation to accept usernames instead of just numberofvouchers
2. Integration of SquidGuard for blocking ads, trackers, etc.

In the future, I […]

Modifying EasyHotSpot 0.2 for per-user daily bandwidth quotas

Posted by Kelvin on 03 Apr 2011 | Tagged as: Ubuntu

First of all, I'll say this – if it's at all possible to install the Ubuntu distro of EasyHotSpot (available from the EasyHotSpot download page), do so! I couldn't because I couldn't get Ubuntu 10.04 installed on my antique laptop which I was going to use as the internet gateway. Only Ubuntu 10.10 worked. I […]
