Supermind Search Consulting Blog 
Solr - Elasticsearch - Big Data

Posts about programming

[SOLVED] Connection Reset problems with Proxoid

Posted by Kelvin on 02 May 2010 | Tagged as: android, programming

Update

My patched version of Proxoid is available here: http://www.supermind.org/friends/SuperProxoid-debug.apk
Source code is available here: http://www.supermind.org/friends/superproxoid-snapshot-20101009.tar.gz

Note: I'm no longer using Proxoid, SuperProxoid, or even AziLink etc. I'm now using a rooted Android 2.2 which supports USB and wifi tethering! Way cool. Instructions here: http://www.supermind.org/blog/774/upgrade-your-htc-droid-eris-to-android-2-2


I've been using the excellent tethering app Proxoid to hook up my PC to my Android-based Droid Eris phone.

However, after a period of somewhat heavy usage/browsing, it inevitably barfs and refuses to take any more connections.

In Firefox, this appears as "Connection reset". In Chrome, pages simply refuse to load. The workaround I used for a while was simply to stop the proxoid service in the Proxoid app, then start it again. This was fine (if somewhat annoying for surfing), but is a non-starter if you happen to be downloading anything at that point.

I decided to do a bit of sleuthing, and installed the Log Collector app so I could see what was going on when Proxoid refused any more connections. Here's a snapshot of what the log file showed:

05-02 09:56:47.466 E/OSNetworkSystem( 6923): unclassified errno 24 (Too many open files)
05-02 09:56:47.466 E/OSNetworkSystem( 6923): unclassified errno 24 (Too many open files)
05-02 09:56:47.477 E/ProxoidService( 6923): null
05-02 09:56:47.477 E/ProxoidService( 6923): java.lang.NullPointerException
05-02 09:56:47.477 E/ProxoidService( 6923): at org.apache.harmony.nio.internal.SelectorImpl.prepareChannels(SelectorImpl.java:223)
05-02 09:56:47.477 E/ProxoidService( 6923): at org.apache.harmony.nio.internal.SelectorImpl.selectInternal(SelectorImpl.java:191)
05-02 09:56:47.477 E/ProxoidService( 6923): at org.apache.harmony.nio.internal.SelectorImpl.select(SelectorImpl.java:167)
05-02 09:56:47.477 E/ProxoidService( 6923): at com.mba.proxylight.RequestProcessor$1.run(RequestProcessor.java:72)
05-02 09:56:47.477 E/ProxoidService( 6923): at java.lang.Thread.run(Thread.java:1058)
05-02 09:56:47.486 E/ProxoidService( 6923): null
05-02 09:56:47.486 E/ProxoidService( 6923): java.net.SocketException: Operation failed
05-02 09:56:47.486 E/ProxoidService( 6923): at org.apache.harmony.luni.platform.OSNetworkSystem.createSocketImpl(Native Method)
05-02 09:56:47.486 E/ProxoidService( 6923): at org.apache.harmony.luni.platform.OSNetworkSystem.createSocket(OSNetworkSystem.java:85)
05-02 09:56:47.486 E/ProxoidService( 6923): at org.apache.harmony.nio.internal.SocketChannelImpl.<init>(SocketChannelImpl.java:156)
05-02 09:56:47.486 E/ProxoidService( 6923): at org.apache.harmony.nio.internal.SelectorProviderImpl.openSocketChannel(SelectorProviderImpl.java:79)
05-02 09:56:47.486 E/ProxoidService( 6923): at java.nio.channels.SocketChannel.open(SocketChannel.java:95)
05-02 09:56:47.486 E/ProxoidService( 6923): at org.apache.harmony.nio.internal.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:131)
05-02 09:56:47.486 E/ProxoidService( 6923): at com.mba.proxylight.RequestProcessor.process(RequestProcessor.java:351)
05-02 09:56:47.486 E/ProxoidService( 6923): at com.mba.proxylight.ProxyLight$1.run(ProxyLight.java:127)
05-02 09:56:47.486 E/ProxoidService( 6923): at java.lang.Thread.run(Thread.java:1058)
05-02 09:56:47.496 E/ProxoidService( 6923): null
05-02 09:56:47.496 E/ProxoidService( 6923): java.lang.NullPointerException
05-02 09:56:47.496 E/ProxoidService( 6923): at org.apache.harmony.nio.internal.SelectorImpl.wakeup(SelectorImpl.java:332)
05-02 09:56:47.496 E/ProxoidService( 6923): at com.mba.proxylight.RequestProcessor.closeAll(RequestProcessor.java:323)
05-02 09:56:47.496 E/ProxoidService( 6923): at com.mba.proxylight.RequestProcessor.access$2(RequestProcessor.java:302)
05-02 09:56:47.496 E/ProxoidService( 6923): at com.mba.proxylight.RequestProcessor$1.run(RequestProcessor.java:246)
05-02 09:56:47.496 E/ProxoidService( 6923): at java.lang.Thread.run(Thread.java:1058)
05-02 09:56:47.496 E/AndroidRuntime( 6923): Crash logging skipped, already logging another crash
05-02 09:56:47.496 E/OSNetworkSystem( 6923): unclassified errno 24 (Too many open files)

Huh. Too many open files. That would explain the non-response of Proxoid.

Googling for the offending line, org.apache.harmony.nio.internal.SelectorImpl.prepareChannels(SelectorImpl.java:223), yielded this: Selector leaks file descriptors in Apache Harmony. Yikes! Bad news.

The reporter of the bug in Android's issue tracker said:

Thanks for the quick answer. The workaround I used is quite simple: create as few
selectors as possible and recycle them endlessly.

I took a look at the Proxoid source and realized the leak was fatal there: a new selector is opened for each new request, so the leaked file descriptors pile up until the process runs out.

It's technically not a bug in Proxoid, but it's awful to have to keep restarting Proxoid.

So I decided to hack together a replacement proxy server implementation for Proxoid which uses threads instead of non-blocking IO. It took me 2 days to work everything out, and I ended up using some code from Muffin Proxy Server, but it looks like things work fine. I tested my app out on 10000 HTTP requests (Proxoid barfed after about 1000) and no errors thus far. 🙂
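
For the curious, the general shape of the thread-based approach looks roughly like the sketch below. This is not the SuperProxoid code: it is a bare-bones echo server, and the class name BlockingServerSketch, the pool size and the roundTrip helper are all made up for illustration. The point is only that it never opens a Selector; each accepted connection is handed to a worker thread that does plain blocking reads and writes.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.RejectedExecutionException;

// Hypothetical sketch, not SuperProxoid's code: blocking I/O plus a fixed
// pool of worker threads, so no Selector is ever opened.
final class BlockingServerSketch implements AutoCloseable {
    private final ServerSocket server;
    private final ExecutorService workers;

    BlockingServerSketch(int port, int poolSize) {
        try {
            server = new ServerSocket(port); // port 0 = any free port
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // Daemon workers so a stuck connection cannot keep the JVM alive.
        workers = Executors.newFixedThreadPool(poolSize, r -> {
            Thread t = new Thread(r, "worker");
            t.setDaemon(true);
            return t;
        });
        Thread acceptor = new Thread(this::acceptLoop, "acceptor");
        acceptor.setDaemon(true);
        acceptor.start();
    }

    int port() {
        return server.getLocalPort();
    }

    private void acceptLoop() {
        while (!server.isClosed()) {
            try {
                Socket client = server.accept(); // blocks until a client connects
                try {
                    workers.execute(() -> handle(client));
                } catch (RejectedExecutionException e) {
                    client.close(); // pool was already shut down
                }
            } catch (IOException e) {
                // accept() fails once the server socket is closed; the loop then exits.
            }
        }
    }

    // A real proxy would parse the request and relay it upstream;
    // this stand-in just echoes every line back to the client.
    private static void handle(Socket client) {
        try (Socket s = client;
             BufferedReader in = new BufferedReader(
                 new InputStreamReader(s.getInputStream(), StandardCharsets.UTF_8));
             PrintWriter out = new PrintWriter(
                 new OutputStreamWriter(s.getOutputStream(), StandardCharsets.UTF_8), true)) {
            String line;
            while ((line = in.readLine()) != null) {
                out.println(line);
            }
        } catch (IOException e) {
            // Client went away; try-with-resources still closes the socket.
        }
    }

    // Test helper: open a connection, send one line, return the reply.
    static String roundTrip(int port, String message) {
        try (Socket s = new Socket(InetAddress.getLoopbackAddress(), port);
             PrintWriter out = new PrintWriter(
                 new OutputStreamWriter(s.getOutputStream(), StandardCharsets.UTF_8), true);
             BufferedReader in = new BufferedReader(
                 new InputStreamReader(s.getInputStream(), StandardCharsets.UTF_8))) {
            out.println(message);
            return in.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    @Override
    public void close() {
        try {
            server.close();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        workers.shutdownNow();
    }

    public static void main(String[] args) {
        try (BlockingServerSketch srv = new BlockingServerSketch(0, 8)) {
            for (int i = 0; i < 1000; i++) {
                roundTrip(srv.port(), "request " + i);
            }
            System.out.println("1000 round trips completed");
        }
    }
}
```

The trade-off is one thread per in-flight connection instead of one selector loop, which is perfectly fine for a phone proxying a single PC.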

Upgrading to Lucene 3.0

Posted by Kelvin on 28 Apr 2010 | Tagged as: programming, Lucene / Solr / Elasticsearch / Nutch

Recently upgraded a 3-year-old app from Lucene 2.1-dev to 3.0.1.

Some random thoughts on the evolution of the Lucene API over the past 3 years:

I miss Hits

Sigh. Hits has been deprecated for a while now, but with 3.0 it's gone. And I have to say it's a pain that it is.

Where I used to pass the Hits object around, now I need to pass TopDocs AND Searcher in order to get to documents.

Instead of

Document doc = hits.doc(i);

it's now

Document doc = searcher.doc(topdocs.scoreDocs[i].doc);

Much more verbose with zero benefit to me as a programmer.

Nice number indexing via NumericField

Where I previously had to pad numbers for lexicographic searching, there's now a proper NumericField and NumericRangeFilter.
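
If you never had to do the padding trick, here's a small sketch of why it was needed. It uses only the JDK, not Lucene; the pad helper and the width of 10 digits are illustrative choices of mine, and it only handles non-negative numbers.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

// Illustration only (plain JDK, no Lucene): terms compared as strings
// sort lexicographically, so unpadded numbers come out in the wrong order.
final class PaddedNumbers {
    // Left-pad a non-negative int with zeros to a fixed width of 10 digits.
    static String pad(int n) {
        if (n < 0) {
            throw new IllegalArgumentException("negative numbers need a different encoding");
        }
        return String.format(Locale.ROOT, "%010d", n);
    }

    // Sort a copy of the terms the way a string comparison would.
    static List<String> sortedAsStrings(List<String> terms) {
        List<String> copy = new ArrayList<>(terms);
        Collections.sort(copy);
        return copy;
    }

    public static void main(String[] args) {
        // Unpadded: "10" and "200" sort before "9".
        System.out.println(sortedAsStrings(List.of("9", "10", "200")));
        // Padded: string order now matches numeric order.
        System.out.println(sortedAsStrings(List.of(pad(9), pad(10), pad(200))));
    }
}
```

With padding, a range query over strings behaves like a numeric range, at the cost of remembering to pad at both index and query time. NumericField makes that bookkeeping unnecessary.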

Lockless commits

What more can I say? Yay!!

What has not changed…

Perhaps more important than what has changed is what has remained the same: 95% of the API, and the query language.

3 years is a mighty long time and Lucene has experienced explosive growth during this period. The overall sanity of change is a clear sign of Lucene's committers' dedication to simplicity and a hat-tip to Doug's original architecture and vision.

HOWTO: Persistent DNS Caching on Ubuntu with pdnsd

Posted by Kelvin on 27 Apr 2010 | Tagged as: programming, Ubuntu

sudo apt-get install pdnsd

If prompted, choose "Manual".

sudo gedit /etc/pdnsd.conf

Copy and paste this into the editor.

// Read the pdnsd.conf(5) manpage for an explanation of the options.

/* Note: this file is overridden by automatic config files when
   /etc/default/pdnsd AUTO_MODE is set and that
   /usr/share/pdnsd/pdnsd-$AUTO_MODE.conf exists
 */

global {
	perm_cache=8192;
	cache_dir="/var/cache/pdnsd";
	run_as="pdnsd";
	server_ip = 127.0.0.1;  // Use eth0 here if you want to allow other
				// machines on your network to query pdnsd.
	status_ctl = on;
  	paranoid=on;
//	query_method=tcp_udp;	// pdnsd must be compiled with tcp
				// query support for this to work.
	min_ttl=96h;       // Retain cached entries at least 96 hours (4 days).
	max_ttl=2w;	   // Two weeks.
	timeout=10;        // Global timeout option (10 seconds).
	proc_limit=60;
	procq_limit=60;
	par_queries=4;
        // Don't enable if you don't recurse yourself, can lead to problems
        // delegation_only="com","net";
}


server {
	label="OpenDNS Plus";
	ip=	208.67.222.222
	,	208.67.220.220
	,	12.213.224.61
	,	192.228.79.201
	,	192.33.4.12
	,	128.8.10.90
	,	192.203.230.10
	,	192.5.5.241
	,	192.112.36.4
	,	128.63.2.53;
	timeout = 5;
	uptest = query;
	interval = 30m;      // Test every half hour.
	ping_timeout = 300;  // 30 seconds.
	purge_cache = off;
	exclude = .localdomain;
	policy = included;
	preset = off;
}

source {
	owner=localhost;
//	serve_aliases=on;
	file="/etc/hosts";
}

rr {
	name=localhost;
	reverse=on;
	a=127.0.0.1;
	owner=localhost;
	soa=localhost,root.localhost,42,86400,900,86400,86400;
}

Now edit /etc/default/pdnsd

sudo gedit /etc/default/pdnsd

Replace

AUTO_MODE=recurse

with

#AUTO_MODE=recurse

This disables AUTO_MODE and gets pdnsd to use our /etc/pdnsd.conf file.

Now edit the dhclient.conf file.

sudo gedit /etc/dhcp3/dhclient.conf

Replace

#prepend domain-name-servers 127.0.0.1;

With

prepend domain-name-servers 127.0.0.1;

(delete the # from the start of the line). Save and exit.

sudo /etc/init.d/pdnsd restart

Test out the DNS cache like so

dig google.com

Check that the SERVER line shows 127.0.0.1#53(127.0.0.1). This means you’re pointed at your local cache.

Now, if you run that command again:

dig google.com

You should see something like Query time: 0 msec.

Mapping neighborhoods to street addresses via geocoding

Posted by Kelvin on 19 Apr 2010 | Tagged as: programming, Lucene / Solr / Elasticsearch / Nutch, Ubuntu

As far as I know, none of the geocoders consistently provide neighborhood data given a street address. Consulting the gods at Google doesn't turn up much useful information either.

Here's a step-by-step guide to obtaining neighborhood names for your street addresses (on Ubuntu).

0. Geocode your addresses if necessary using the Yahoo, MapQuest or Google geocoders (that is, convert the addresses into latitude and longitude).

1. Install PostGIS.

sudo apt-get install postgresql-8.3-postgis

2. Complete the postgis install

sudo -u postgres createdb mydb
sudo -u postgres createlang plpgsql mydb
cd /usr/share/postgresql-8.3-postgis/
sudo -u postgres psql -d mydb -f lwpostgis.sql
sudo -u postgres psql -d mydb -f spatial_ref_sys.sql

3. Download and import Zillow neighborhood data. For this example, we'll be using California data.

cd /tmp
wget http://www.zillow.com/static/shp/ZillowNeighborhoods-CA.zip
unzip ZillowNeighborhoods-CA.zip
shp2pgsql ZillowNeighborhoods-CA public.neighborhood > ca.sql
sudo -u postgres psql -d mydb -f ca.sql

4. Connect to psql and run a query.

sudo -u postgres psql -d mydb
select name,city from public.neighborhood where ST_Within(makepoint(-122.4773980,37.7871760), the_geom)=true ;

If you've done everything right, this should be returned from the SQL:

      name      |     city
----------------+---------------
 Inner Richmond | San Francisco
(1 row)

Voila!
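
For intuition, the ST_Within call in that query boils down to a point-in-polygon test. Here's a rough sketch of the idea in plain Java using the classic ray-casting method. The rectangle is a made-up stand-in, NOT the real Inner Richmond boundary from the Zillow shapefile, and PostGIS's actual implementation is certainly more involved than this.

```java
// Hypothetical sketch of a point-in-polygon test (ray casting).
// Coordinates are (longitude, latitude) pairs, matching makepoint(x, y).
final class PointInPolygonSketch {
    // Returns true if (x, y) lies inside the polygon given by its vertices.
    // Counts how many polygon edges a ray going in the +x direction crosses;
    // an odd count means the point is inside.
    static boolean contains(double[][] poly, double x, double y) {
        boolean inside = false;
        for (int i = 0, j = poly.length - 1; i < poly.length; j = i++) {
            double xi = poly[i][0], yi = poly[i][1];
            double xj = poly[j][0], yj = poly[j][1];
            boolean crosses = (yi > y) != (yj > y)
                    && x < (xj - xi) * (y - yi) / (yj - yi) + xi;
            if (crosses) {
                inside = !inside;
            }
        }
        return inside;
    }

    public static void main(String[] args) {
        // Made-up bounding box, not a real neighborhood outline.
        double[][] box = {
            {-122.48, 37.77}, {-122.45, 37.77}, {-122.45, 37.79}, {-122.48, 37.79}
        };
        System.out.println(contains(box, -122.4773980, 37.7871760)); // point from the query above
        System.out.println(contains(box, -122.40, 37.78));           // a point well to the east
    }
}
```

The database version wins because it runs the same kind of test against every neighborhood polygon for you, which is what the WHERE clause is doing.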

Android's clever workaround of Sun's licensing

Posted by Kelvin on 08 Apr 2010 | Tagged as: android, programming

Just discovered this gem here: http://www.betaversion.org/~stefano/linotype/news/110/

  1. Android apps are developed in Java
  2. Android itself is licensed under the Apache License (APL), but Sun's source is licensed under the GPL.
  3. Furthermore, Java is not open-sourced in mobile environments.
  4. How did Google do it?

Turns out that Android

  1. Uses Java as a development language but does not use Java bytecode or Java virtual machine for deployment.
  2. Has a virtual machine called Dalvik which does not claim to be a JVM
  3. First uses a regular java compiler to generate regular java bytecode (say, javac or the built-in Eclipse compiler) and then converts that bytecode into Dalvik’s bytecode (the “dx” tool does this: converts .class/.jar into .dex files)
  4. Supports only a subset of the Java SE class library (AWT/Swing and Java ME classes are omitted). Instead of using Sun's implementation, it uses Apache Harmony's implementation.

That's quite a tour de force.

Overview of free turn-based strategy and war games

Posted by Kelvin on 01 Apr 2010 | Tagged as: programming

http://www.freewaregenius.com/2008/05/15/an-overview-of-free-turn-based-strategy-and-war-games/

MySQL Large ResultSet java.lang.OutOfMemoryError Workaround

Posted by Kelvin on 04 Mar 2010 | Tagged as: programming

Ever tried to fetch all rows from a large MySQL table?

You're bound to hit up against a java.lang.OutOfMemoryError.

So you try searching for how to set the fetchSize, and of course, it doesn't quite work.

The MySQL JDBC Driver implementation notes state:

By default, ResultSets are completely retrieved and stored in memory. In most cases this is the most efficient way to operate, and due to the design of the MySQL network protocol is easier to implement. If you are working with ResultSets that have a large number of rows or large values, and can not allocate heap space in your JVM for the memory required, you can tell the driver to stream the results back one row at a time.

To enable this functionality, you need to create a Statement instance in the following manner:

stmt = conn.createStatement(java.sql.ResultSet.TYPE_FORWARD_ONLY,
java.sql.ResultSet.CONCUR_READ_ONLY);
stmt.setFetchSize(Integer.MIN_VALUE);

The combination of a forward-only, read-only result set, with a fetch size of Integer.MIN_VALUE serves as a signal to the driver to stream result sets row-by-row. After this any result sets created with the statement will be retrieved row-by-row.

WHAT?! That's pretty silly. Well turns out there are 2 alternatives to the ludicrous way of stepping through row-by-row.

1. From 5.0.2 onwards, MySQL supports server-side cursors, enabled by setting the connection properties useCursorFetch=true and defaultFetchSize.
This means that the rows will not all be pushed to the client at once. Only defaultFetchSize rows will be sent to the client at a time.

2. If you're selecting the rows to update them, AND the criteria you're selecting on IS what you're updating, then there's an uber-simple workaround:

a. Add a limit to the SQL (e.g. limit 1000)
b. Instantiate an updateable statement
c. Execute the SQL, updating the rows as you iterate
d. Repeat until there are no more results

QED
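
Here's roughly what the two alternatives look like in JDBC. Treat it as a sketch under assumptions rather than tested code: the items table, its id and processed columns, and the helper names are all hypothetical, and I haven't run the update loop against a live server. The only driver-specific parts are the useCursorFetch and defaultFetchSize connection properties from alternative 1.

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

// Hypothetical sketch of both workarounds; table and column names are made up.
final class LargeResultSetSketch {
    // Alternative 1: ask the driver for a server-side cursor by adding
    // useCursorFetch and defaultFetchSize to the connection URL.
    static String cursorFetchUrl(String hostAndPort, String database, int fetchSize) {
        return "jdbc:mysql://" + hostAndPort + "/" + database
                + "?useCursorFetch=true&defaultFetchSize=" + fetchSize;
    }

    // Alternative 2: select a limited batch, update each row as we iterate,
    // and repeat until a batch comes back empty. This only terminates because
    // the update changes the very column the WHERE clause filters on.
    static int markAllProcessed(Connection conn, int batchSize) throws SQLException {
        String sql = "SELECT id, processed FROM items WHERE processed = 0 LIMIT " + batchSize;
        int total = 0;
        // An updatable statement, so rows can be modified through the ResultSet.
        try (Statement stmt = conn.createStatement(
                ResultSet.TYPE_FORWARD_ONLY, ResultSet.CONCUR_UPDATABLE)) {
            int updated;
            do {
                updated = 0;
                try (ResultSet rs = stmt.executeQuery(sql)) {
                    while (rs.next()) {
                        rs.updateInt("processed", 1);
                        rs.updateRow();
                        updated++;
                    }
                }
                total += updated;
            } while (updated > 0);
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(cursorFetchUrl("localhost:3306", "mydb", 1000));
    }
}
```

Either way, at most one batch of rows sits in the JVM at a time, so the heap no longer has to hold the whole table.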

Writing user documentation in style with Sphinx

Posted by Kelvin on 03 Mar 2010 | Tagged as: programming

Recently stumbled on Sphinx for generating user docs.

The closest alternative is DocBook, so it was pretty much a no-brainer to try out Sphinx considering it lets you write documentation in the uber-friendly wiki-like reStructuredText format.

It lets you export to HTML as well as HTMLHelp and LaTeX/PDF. There's also a direct rst2pdf converter (although the formatting on that one is not as good as LaTeX's).

You should definitely try it out next time you need to write user docs. Two thumbs way up!

Adventures in Java Doclets

Posted by Kelvin on 03 Mar 2010 | Tagged as: programming

I dug into the innards of javadocs and the Standard HTML Doclet recently to do 2 things:

1. decorate the standard doclet output with my custom annotations

2. generate a reStructuredText version of the custom annotations for a user manual

It's pretty cool that javadoc lets you customize output and all, but I've arrived at a few conclusions:

  1. the javadoc code was written by a bunch of idiots. It is a steaming pile of turd.
  2. it is difficult to extend, poorly designed, and the classnames make no bloody sense
  3. sounds crazy, but you need to COPY the ENTIRE standard doclet codebase in order to change the way it parses and displays files
  4. And yet... it's really not that difficult to write a doclet from scratch

Ubuntu 9.04 + Engenius EUB-362 EXT [SOLVED]

Posted by Kelvin on 10 Feb 2010 | Tagged as: programming, Ubuntu

Recently bought an Engenius Wireless LAN USB adapter EUB-362 EXT.

There are a number of postings on Ubuntu Forums about trying to get this working on Ubuntu, to no avail. Keenan Systems, from whom I bought the adapter, has an outdated howto for getting it working on Ubuntu.

Here are the definitive steps to getting it working on Ubuntu 9.04. May work with other Ubuntu versions too.

sudo apt-get install ndiswrapper-common ndiswrapper-utils-1.9
cd /tmp
wget http://www.engeniustech.com/resources/EUB862_362_XPV2.1.zip
unzip EUB862_362_XPV2.1.zip
wine EUB862_362_XPV2.1.exe

That last command runs the Windows driver installer under Wine. Once it finishes:

cd ~/.wine/drive_c/windows/system32
sudo ndiswrapper -i net5523.inf

Connect EUB-362 to your computer.

ndiswrapper -l

should produce

net5523 : driver installed
device (0CF3:0002) present

That's it. Enjoy your new wlan adapter!
