Friday, July 18, 2008

An algorithm to render PDF on small devices

I am interested in ebook reader for quite a while. But after trying with a 6-inch e-ink reader (Hanlin V3), I found it is almost useless to read normal PDF files (A4 size) on these devices. The font size is too small, while the page size is too wide. So, a method to render PDF for these small devices is thought about and prototyped. The details are as follow:
  1. Convert pdf to image. I use pdftoppm of xpdf. Such as: pdftoppm -r 180 -f 245 -l 245 -gray -aa yes a.pdf a
  2. Analyze the generated images. Break page into lines.
  3. Divide each line long enough to two segments.
  4. Rearrange the segments into a new page, with half of the width.
A full functional PDF re-flow algorithm is very difficult to implemented. But a more simplified version as above is much easier.

No state updating: functional language

The programming model of command language is state machine. Such as in C, every variable is a state, one of the most common statements are assigning new value to a variable, that is, state updating. But in functional programming language, there is no state updating, how to program without state updating? Let's begin with a simple example of bank account management system.


struct account
{
char *name;
int remain;
struct account *next;
};

#define OP_NOP 0
#define OP_WITHDRAW 1
#define OP_DEPOSIT 2
#define OP_PRINT 3
#define OP_EXIT 4

void deposit(struct account *account, int val)
{
account->remain += val;
}

void withdraw(struct account *account, int val)
{
account->remain -= val;
}

int main(int argc, char *argv[])
{
struct account *accounts = NULL;
char *name;
int op = OP_NOP, val;

while (op != OP_EXIT) {
op = read_operation();
case OP_WITHDRAW:
case OP_DEPOSIT:
name = read_name();
account = list_find(accounts, name);
val = read_val();
if (op == OP_WITHDRAW)
withdraw(account, val);
else
deposit(account, val);
break;
case OP_NEW:
name = read_name();
account = malloc(sizeof(struct account));
account->name = name;
account->remain = 0;
list_add(&accounts, account);
break;
case OP_PRINT:
/* a macro to iterate every element of list */
list_for_each(accounts, account) {
printf("Name: %s\tRemain: %d\n", account->name, account->remain);
}
break;
case OP_EXIT:
break;
}
}
return 0;
}


In erlang, a functional programming language:


-module(bank).
-export([service/1]).

deposit(Account, Val) ->
{Name, Remain} = Account,
{Name, Remain + Val}.

withdraw(Account, Val) ->
{Name, Remain} = Account,
{Name, Remain - Val}.

print_account({Name, Remain}) ->
io:format("Name: ~s Remain: ~B~n", [Name, Remain]).

service(Accounts) ->
{ok, [Operation|_]} = io:fread("Operation: ", "~a"),
if
Operation == deposit; Operation == withdraw ->
{ok, [Name|_]} = io:fread("Name: ", "~s"),
{ok, [Val|_]} = io:fread("Val: ", "~d"),
case lists:keysearch(Name, 1, Accounts) of
false ->
io:format("Error: no such account.~n"),
NewAccounts = Accounts;
{value, Account} ->
case Operation of
deposit ->
NewAccount = deposit(Account, Val);
withdraw ->
NewAccount = withdraw(Account, Val)
end,
NewAccounts = lists:keyreplace(Name, 1, Accounts,
NewAccount)
end,
service(NewAccounts);
Operation == new ->
{ok, [Name|_]} = io:fread("Name: ", "~s"),
case lists:keymember(Name, 1, Accounts) of
true ->
NewAccounts = Accounts,
io:format("Error: account already exist.~n");
false ->
NewAccounts = [{Name, 0} | Accounts]
end,
service(NewAccounts);
Operation == print ->
lists:foreach(fun print_account/1, Accounts),
service(Accounts);
Operation == exit ->
ok;
true ->
io:format("Error: unknown operation~n"),
service(Accounts)
end.


Notes:
  • In functional programming language, instead of updating object (value) in-place, every time a new object (value) is constructed. In example above, in C, deposit() change account->remain directly, while in erlang, a new account is constructed and returned.
  • If data of program is organized in a hierarchical way, to change the data in leaf, in command language, only the leaf is changed in place; in functional language, all data structure from that leaf to root will be reconstructed. In example above, in C, during deposit operation, only the corresponding account object is changed in-place, while in erlang, both corresponding account object and the whole account list will be re-constructed. With the help of carefully designed implementation, althought the performance of that in functional language is wrose, it is not so bad as it appears.
  • In functional programming language, instead of updating the binding of variable name to new value directly, a new scope is constructed and in the new scope, same variable name is binded to a new value. In example above, during "new" operation, accounts may be bind to a new value, in erlang, a new scope is constructed by "recursing", that is, the new constructed account list is bind to Accounts in a new scope. Thanks to the "tail recursing" optimization, the performance is not so bad.
Questions:
  • How to shared objects in functional programming language? Such as in a database system, the cached database table should be shared between service threads.