Wednesday, December 24, 2008

fox2html: foxmarks.json to HTML converter

This script converts "foxmarks.json", saved by the Firefox plugin Foxmarks on a self-hosted server, to a styled HTML file with interactive folders. An example output is here.

Thursday, September 25, 2008

Duplication elimination

In engineering a computation project, one wants a single point of information entry. It's elegant to come up with coding structures that meet this requirement, that is, a single place in the code for everything pertinent to the problems to be computed. However, while a programming language is an interface between human and machine, it is generally not friendly to users. The ultimate choice is to keep human readable/editable files and write programs that generate the corresponding code when needed.
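
As a minimal sketch of the idea, assume the human-editable file holds one "name value" pair per line, with "#" starting a comment, and that the generated code is a list of C constants. The function name gen_constants, the file format, and the choice of C as the output are all assumptions made for illustration only.

```shell
# Turn "name value" lines into C constant definitions.
# Input format and output language are illustrative assumptions.
gen_constants() {
    awk '
        /^[ \t]*(#|$)/ { next }   # skip comment lines and blank lines
        { printf "static const double %s = %s;\n", $1, $2 }
    '
}
```

With a hypothetical parameter file "params.txt", running `gen_constants < params.txt > params.h` would regenerate the header whenever the parameters change, so the values are typed in only once.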

Thursday, August 14, 2008

Pay attention

If you don't see people, you don't see intention.

Sunday, June 1, 2008

adding OCR to djvu file

For each page in the file "cake.djvu", we can use "tesseract" to process the page image:
djvused -e "select ${page};save-page-with \"cake_page.djvu\"" cake.djvu
convert cake_page.djvu cake.tif
tesseract cake.tif cake_box batch.nochop makebox
tesseract cake.tif cake_txt batch.nochop
This produces the text structure (lines and words) and the positioning (coordinates of each character). To convert this information to the hidden-text format for use with djvused, use
perl<<'EOL'>cake_text.txt
open TXT, "<:utf8", "cake_txt.txt";
open BOX, "<:utf8", "cake_box.txt";
$pxn = 1000000;
$pxx = 0;
$pyn = 1000000;
$pyx = 0;
$pagebuf = "";
while ($line = <TXT>) {
chomp $line;
@words = split /\s+/, $line;
next if $#words < 0;
$lxn = 1000000;
$lxx = 0;
$lyn = 1000000;
$lyx = 0;
$linebuf = "";
foreach $word (@words) {
$xmin = 1000000;
$xmax = 0;
$ymin = 1000000;
$ymax = 0;
$w = "";
for ($i = 0; $i < length($word); $i ++) {
$c = substr($word, $i, 1);
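do {
$cline = <BOX>;
# stop instead of looping forever if the box file runs out
die "cake_box.txt ended before a box for '$c' was found\n" unless defined $cline;
} while (substr($cline, 0, 1) ne $c);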
($xn, $yn, $xx, $yx) = substr($cline, 2) =~ /\S+/g;
$w = $w . '\\' if $c eq '"';
$w = $w . '\\' if $c eq '\\';
$w = $w . substr($cline, 0, 1);
$xmin = $xn if ($xmin > $xn);
$xmax = $xx if ($xmax < $xx);
$ymin = $yn if ($ymin > $yn);
$ymax = $yx if ($ymax < $yx);
}
$wline = '(word ' . $xmin . ' ' . $ymin . ' ' . $xmax . ' ' . $ymax . ' "' . $w . '")';
$linebuf = $linebuf . "\n  " . $wline;
$lxn = $xmin if ($lxn > $xmin);
$lxx = $xmax if ($lxx < $xmax);
$lyn = $ymin if ($lyn > $ymin);
$lyx = $ymax if ($lyx < $ymax);
}
$pagebuf = $pagebuf . "\n (line $lxn $lyn $lxx $lyx" . $linebuf . ')';
$pxn = $lxn if ($pxn > $lxn);
$pxx = $lxx if ($pxx < $lxx);
$pyn = $lyn if ($pyn > $lyn);
$pyx = $lyx if ($pyx < $lyx);
}
close BOX;
close TXT;
binmode(STDOUT, ":utf8");
print "(page $pxn $pyn $pxx $pyx", $pagebuf, ')', "\n";
EOL
which generates "cake_text.txt" in the required format. The hidden text can be saved back to the djvu file with
djvused -e "select ${page};set-txt \"cake_text.txt\";save" cake.djvu
We just need to repeat this for all the desired pages.
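
The repetition can be sketched as a small shell loop. The function names ocr_page and ocr_pages are made up for illustration, and "make_hidden_text.pl" is a hypothetical file assumed to hold the Perl script above, saved to disk instead of being fed through the here-document.

```shell
# Run the per-page steps from this post for one page of a DjVu file.
# make_hidden_text.pl is an assumed file containing the Perl script above.
ocr_page() {
    doc=$1
    page=$2
    djvused -e "select ${page};save-page-with \"cake_page.djvu\"" "$doc" &&
    convert cake_page.djvu cake.tif &&
    tesseract cake.tif cake_box batch.nochop makebox &&
    tesseract cake.tif cake_txt batch.nochop &&
    perl make_hidden_text.pl > cake_text.txt &&
    djvused -e "select ${page};set-txt \"cake_text.txt\";save" "$doc"
}

# Apply ocr_page to every page from first to last (inclusive),
# stopping at the first page that fails.
ocr_pages() {
    target=$1
    current=$2
    final=$3
    while [ "$current" -le "$final" ]; do
        ocr_page "$target" "$current" || return 1
        current=$((current + 1))
    done
}
```

For example, `ocr_pages cake.djvu 1 "$(djvused -e n cake.djvu)"` would cover the whole file, since the "n" command of djvused prints the number of pages.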

Sunday, May 4, 2008

Information overflow

The growing connected world keeps increasing what we have available to take in. It's bound to exceed one's capacity without some form of aid. But, in the raw, where is the limit? And how are we going to cook it? In person, where is the limit? And how are we going to team up? Would building tools help?

Thursday, April 24, 2008

Competition

Life is like a game. Those who enjoy the most win.

Wednesday, April 23, 2008

Live like what and when?

People have said things like: "Live like you'll die tomorrow." Maybe this is just so that we don't put off things that are really important. Well, whenever and whatever we are doing, we had better pause when we can and think: Is there a better thing to do than this?

Thursday, April 3, 2008

LaTeXMathML

This is another way to have math in web pages. It's not as portable as jsMath since MathML is not widely supported yet. However, it allows executing the script from an external server and thus can be used on Blogger.

Sunday, March 30, 2008

jsMath

This is a very neat way to add math to a web page. It's as easy to use as typing in TeX and much more portable than the right way of doing such things.

Wednesday, March 5, 2008

Sand pit on a beach

As an analogy to consciousness in a physical system, water shows when the pit is dug deep enough. Similarly, consciousness shows when the system meets certain criteria. Just as the water comes from the same body for all the pits, consciousness is the same phenomenon for all individual systems. Since we can make a channel between two pits or dig a big pit to encompass them, can we make links to connect conscious systems or merge them to form a bigger one?

Friday, February 29, 2008

Fishery

The net has been cast for quite a while. With no satisfactory gain in sight, it's time to reel it in... Until next time!

Tuesday, February 26, 2008

The onset

Does it start with locally strongly connected clusters that merge into a system-wide network? Or does it start with a weak global connection that strengthens into maturity?

Wednesday, January 9, 2008

Conciseness

In programming, beautiful structures often emerge after you try trimming away duplication or redundancy.