Skip to content


Computing {tweak, bug, annoyance} of the day...

Tweak of the day:
Problem: Cygwin doesn't support UTF-8, which leads to issues with unicode filenames (e.g. when running backups).

Solution: for Cygwin, download the patch from http://www.okisoft.co.jp/esc/utf8-cygwin/
in .bashrc:
export LC_CTYPE=en_US.UTF-8
export LANG=en_US.UTF-8
alias ls='ls --show-control-chars'
Terminal: mrxvt doesn't support unicode, but urxvt does.  If urxvt is not using unicode fonts by default (or the font is ugly), figure out which fonts are unicode by doing xlsfonts | grep iso10646 and put one of those fonts into .Xdefaults in the form:
URxvt*font: -misc-dejavu sans mono-medium-r-normal--*-120-*-*-m-0-iso10646-1
Don't forget to re-read .Xdefaults using xrdb < ~/.Xdefaults.

Bug of the day: sometimes my Emacs responds to C-k by performing winner-undo.  Version 22.3 did that sometimes, 23 does that more often.  Bizarre.

Annoyance Tweak #2 of the day: Emacs page down and then page up does not return the cursor to the same location.  Annoying if page-downs are accidental.  (Turns out that there's a simple fix: (setq scroll-preserve-screen-position t).  Incidentally, in the Usenet thread discussing this, another useful trick came up: C-u C-SPC pops the mark.  So, for a quick-and-dirty bookmark, one could use C-SPC or C-SPC-g to set and C-u-SPC to jump back!)

"Well, this wasted my time" of the day:  I aliased grep to produce a more colorful and informative output.  Some time later my ssh agent broke -- it kept asking for my password.  Took me some time to connect the two, but what happened was: the ssh agent script is invoked via source, meaning aliased commands are used -- meaning it was getting unexpected output from my grep.  Clever fix: replace grep with `which grep` -- this effectively bypasses bash aliasing.   Cleverer fix: don't alias things that are important and are likely to be found in scripts.

"Makes my life easier" of the day:  with Firefox 3.5 you can set gmail to handle mailto: links.  Very handy -- no more stupid Outlook popups.  Details at http://googlesystem.blogspot.com/2009/06/set-gmail-as-default-email-client-in.html

Tagged with .

So what's really wrong with Mathematica?..

Clipboard01I think the main problem with Mathematica is very simple: the Notebook interface tries to be user-friendly, and tries to enable literate programming. It makes doing straightforward calculations very easy. However, this belies the fact that Mathematica is an extremely complex system, and it requires a great amount of mathematical sophistication to understand how it works. Anybody that's not a mathematician or a computer scientist by trade (or by calling) runs head first into the exponential learning curve (Fig .1). I don't know anybody in my scientific community (Applied Physics) that is proficient with Mathematica beyond the basics they need for quick plots and integrals.

This disconnect between the friendly interface and easy basics and the nighmarish syntax of rules, patterns, holds, folds, maps, delayed vs non-delayed, etc once you get into things that are just a little more complex leads to a tremendous amount of frustration. Furthermore, it is often hard to figure out which constructs are imporant and efficient for what I want to do, and which are syntactic sugar that's only useful for cellular automata. What's worse is that while every individual feature is reasonably well documented, there are no tutorials that I have found that tell you how to use them together to write efficient and maintainable code.

As a result, those of us that have smacked their heads against these problems long enough just give up and settle for Matlab -- surely much more specialized, but also much more intuitive, and for many people also much faster in terms of computing time. And most engineering students don't even bother with Mathematica -- Matlab is the de facto standard, and that's what everyone learns.

So my message to those ardent supporters of Mathematica is this: think of the vast population residing in the middle ground between wanting to do easy plots and integrals, and between wanting to conquer the world with cellular automata or power through sophisticated arbitrary-precision and symbolic calculations. We want to like Mathematica, but find it hard -- so please give us some guidance before we all defect to other software.

Tagged with .

Mathematica?.. Don't hold back, tell us how you really feel

For years I've been meaning to post on a public forum my ruminations on the subject of how after about 5 years of day-to-day use,  it completely escapes me how to write good Mathematica code.  Granted, the discussion of "good" can be rather lengthy  in and of itself, but the main problem is that the code is inefficient and unmaintainable.  I consider myself to be a reasonably intelligent person -- Ph.D. student, physics, engineering, all that jazz; I don't think I'm terrible at programming, and I've hung out with theoretical CS types enough to at least believe that rules, patterns, maps, folds  are useful and maybe even elegant -- so the fact that I've had trouble writing Mathematica code that doesn't disgust me is quite vexing.  Furthermore, I know people that are way smarter than I that have had similar problems.  In fact, I thought that this is an important issue -- to the point that I didn't want to treat it lightly, and so have been putting off posting on comp.soft-sys.math.mathematica for years.

Then, just a few hours ago, I saw a reddit link to a page comparing some scientific computing code implementation in Mathematica, Matlab, and Python.  Needless to say, Mathematica code makes one cry.  Well, it made me cry because it reminds me so much of my own code.  For all I know, others LOL their butts off looking at it.  In the commends, there is a counterexample -- a piece of code where Mathematica code is terse and simple compared to Matlab.  (Simple?  Mathematica?..  In any case, the code is really terse.)  The person posting is theodoregray -- which means that he's either a co-founder of Mathematica, or just some guy named Theodore Gray, or someone else altogether -- but I hope he's the Theodore Gray.  Wouldn't it be neat to have someone like him explain Mathematica to me?..  My mental process went something like this: "wow, am I really responding to a co-founder of Mathematica?"  And the next thought: "Theodore Gray, you ruined my life with your computer algebra system, you bastard!"  (Stephen Wolfram isn't blameless here, but I'll forgive him when he solves all mysteries of the universe with cellular automata, which should happen real soon now.)  In any case, if there ever was an opportunity to rant about Mathematica, that was it, and here's what I wrote:

Now, about theodoregray's comment [link to the terse Mathematica clock applet]. His example proves nothing, except for possibly the fact that Mathematica is really good at tasks that about 99% of users will not care about and/or understand. Why is Mathematica code so brief here?.. Two reasons: First, Mathematica handles graphics primitives rather well. This has a lot to do with the basic language constructs of Mathematica, and the focus on list processing and symbolic computations instead of the raw procedural number crunching of Matlab. Second, the use of Dynamic[] construct, which has no direct equivalent in Matlab. So, congratulations: if I ever need a toy clock applet, I'll be using Mathematica. In fact, if I need interactive applets to illustrate the behavior of certain equations as parameters change, I'll use Mathematica with Manipulate[] and/or Dynamic[]. If I want to double-check my algebra, I'll use Mathematica. However, for anything that requires computation, or is simply code-intensive, I'll use Matlab.

If you think I have an axe to grind viz a viz Mathematica, you are completely right. It has wasted many months of my life in grad school, simply because its absolute ineptitude at numerical calculations made for very inefficient workflow, and the arcane syntax and reliance on notebook interface made it all but impossible to write maintainable code and create portable functions.

The notebook interface deserve special mention. It it rather handy for quick-and-dirty, on-the-fly computations, but it is orthogonal to writing robust code in a programming sense. All variables defined in a notebook are by default global across all notebooks. Making local variables using the menagerie of scoping constructs (Module[], Block[], With[]) is error-prone due to the fact that you have to supply a list of locals (in Matlab, you just start using them!) and due to subtle differences between the scoping constructs. Furthermore, there is no equivalent of Matlab structures and cell arrays, which make it easy to pass arguments between functions and keep things modular. What about debugging?.. The debugging facilities are a joke compared to Matlab.

All right, so one doesn't have to use notebook interface. Let's try to use Mathematica in the same way that we would use Matlab: write a script in plain ASCII and feed it to the kernel. Then we come back to the point of inefficiency. After quickly developing (in Mathematica) a function that took some Fourier transforms and performed integration (I still rely on Mathematica to help me with symbolics), I put it in a script and ran it. Took 2.5 hours. Not impressed, I reimplemented the same exact functions in Matlab. They now ran in 47 seconds(!)

I don't exclude the possibility that there are ways to make the code run faster, and I'd love to know where the bottleneck is -- in fact, here is the code: http://www.dnquark.com/blog/?p=29 -- go ahead, try to figure it out if you want -- but why should I spend days poring over Mathematica docs when Matlab just works (TM)?..

The moral of the story is this: if you are thinking about using Mathematica for things that involve numerical calculations, and you don't want the code to be a one-time throw-away affair, think again.

Tagged with .

Mathematica, 2.5 hours. Matlab, 47 seconds. Not impressed, WRI, not impressed.

The code is more or less identical (Mathematica version is necessarily more verbose, because, well, the syntax is more verbose. Unless you are drawing a clock applet.)

By the way, since people or than me might be looking at this code, here's a brief synopsis of what it does: we are trying to optimize the transmission function of a certain planar structure over two parameters.  The cost function we chose to use in this optimization is \int | F^{-1}\{T(k)\} | /\mathrm{max}[|F^{-1}\{T(k)\}|]dx, where T(k) depends on the optimization parameters.  So the code simply iterates through a 2D matrix of parameters computing the cost function integral and recording the result.

Mathematica: Matlab:
(* Fourier transform function definitions: *)

FftShift1D[x_] :=
 Module[{n = Ceiling[Length[x]/2]},
  Join[Take[x, n - Length[x]],
   Take[x, n]]]; (* operates just like matlab fftshift function *)
(* wrapper over the fft function.  Return values are in a replacement \
list, so usage is as follows:
{a,b,c,d}={transform,kpoints,klim,smpRateK} \
/.PerformDft1[foo,xlim,samprate]
*)
PerformDft1[func_, xLim_, smpRateX_] :=
 Module[{kLim, step, kstep, Nw, data, dataTransform, xvec, kvec},
  step = 1/smpRateX;
  Nw = 2*xLim/step; kLim = smpRateX; kstep = 1/(2*xLim);
  xvec = Table[x, {x, -xLim, xLim - step, step}];
  kvec = Table[k, {k, -kLim/2, kLim/2 - kstep, kstep}];
  data = FftShift1D[func /@ xvec];
  dataTransform =
   smpRateX^-1*Fourier[data, FourierParameters -> {1, -1}];
  {transform -> dataTransform, kpoints -> kvec, klim -> kLim,
   smpRateK -> 1/kstep}
  ]

TESlabTransmissionNice[kx_, mu11_, d_] :=
 Module[{m0 = 1, mu1, kz0, kz1, epar = 1, mperp = 1, eps = 10^-20},
  m0 = m0 + I*eps; mu1 = mu11 + I*eps; epar = epar + I*eps;
  kz0 = Sqrt[m0 - (kx^2) ]; kz1 = Sqrt[mu1 (epar - kx^2/mperp)];
  1 / (Cos[d kz1] -
     I/2 ((mu1 kz0)/(m0 kz1) + (m0 kz1)/(mu1 kz0)) Sin[d kz1])
  ]

MagneticSlabTransmission[kx_, mupp_, d_] :=
  Module[{mu1 = -1 + I*mupp, h = 0.5},
   Exp[-2 Pi 2 h Sqrt[kx^2 - 1]] TESlabTransmissionNice[kx, mu1,
     2 Pi d]];

CostFunctionIntegrand[fn_, print_: False] :=
 Module[{xLim = 100, xSmpRate = 100, xform, kvec, srk},
  {xform, kvec, srk} = {transform, kpoints, smpRateK} /.
    PerformDft1[fn, xLim, xSmpRate];
  xform = FftShift1D@xform;
  If[print,
   Print[ListPlot[Re[Transpose[{kvec, xform}]],
     PlotRange -> {{-1, 1}, All}]];
   Print["k sampling rate: ", srk];
   ];
  Transpose[{kvec, Abs[xform]/Max[Abs[xform]]}]
  ]

ComputeCostFunction[fn_] := Module[{integrand, min, max},
  integrand = CostFunctionIntegrand[fn];
  min = Min[integrand[[All, 1]]];
  max = Max[integrand[[All, 1]]];
  Integrate[
   Interpolation[integrand, InterpolationOrder -> 2][x], {x, min, max}]
  ]

ComputeCostAndOutput[d_, mpp_,
  fname_] := PutAppend[{d, mpp,
    ComputeCostFunction[MagneticSlabTransmission[#, mpp, d] &]},fname];
(* can be read in using perl -pe 's/{(.+)}$/$1/' to strip the braces,
(nb: here + should really be *, just can't easily put it within mma
comment); then: Import["param_optim_numerics\\par_opt1.dat", "CSV"] *)

Map[((ComputeCostAndOutput[#1, #2, "costfn1.dat"] &) @@ #) &,
 Table[{d, mpp}, {d, 0.05, 1, 0.05}, {mpp, 0.05, 1, 0.05}], {2}]
%% ComputeCostAndOutput.m
%
outfile = fopen('par_opt1_matlab.dat','w');
for d=0.05:0.05:1
    for mpp=0.05:0.05:1
        integrand = CostFunctionIntegrand(...
            @(x)(MagneticSlabTransmission(x,mpp,d)));
        costfn=trapz(integrand.kvec,integrand.int);
        fprintf(outfile,'%f,%f,%f\n',d,mpp,costfn);
    end
end
fclose(outfile);

% CostFunctionIntegrand.m
function out=CostFunctionIntegrand(fh)
    xLim=100; xSmpRate=100;
    dftStruct = PerformDft1(fh,xLim,xSmpRate);
    xform = fftshift(dftStruct.transform);
    out.kvec = dftStruct.kpoints;
    out.int = abs(xform)./max(abs(xform));
end

% MagneticSlabTransmission.m
function out = MagneticSlabTransmission(kx,mu1,d,h)
    out=exp(-2*pi*2*h*sqrt(kx.^2-1)).*...
           TESlabTransmissionNice(kx,mu1,2*pi*d);
end

% TESlabTransmissionNice.m
function out = TESlabTransmissionNice(kx,mu11,d)
    m0=1; epar=1; mperp=1;
    m0 = m0 + i*eps; mu1 = mu11 + i*eps;
    epar = epar + i*eps; %#ok
    kz0 = sqrt(m0 - (kx.^2) );
    kz1 = sqrt(mu1.*(epar - kx.^2/mperp));
    out=(cos(d.*kz1)+(sqrt(-1)*(-1/2)).*(kz0.^(-1).*...
    kz1.*m0.*mu1.^(-1)+ ...
   kz0.*kz1.^(-1).*m0.^(-1).*mu1).*sin(d.*kz1)).^(-1);
end

% PerformDft1.m
function out = PerformDft1(fh,xLim,smpRateX)
    step = 1/smpRateX;
    Nw = 2*xLim/step; kLim = smpRateX;
    kstep = 1/(2*xLim);
    xvec = linspace(-xLim,xLim-step,Nw);
    kvec = linspace(-kLim/2,kLim/2-kstep,Nw);
    data = fftshift(fh(xvec));
    dataTransform = smpRateX^-1*fft(data);
    out.transform = dataTransform;
    out.kpoints = kvec;
    out.klim = kLim;
    out.smpRateK = 1/kstep;
end

Tagged with .

Grandma rules, part N

She called me to say that she was watching Russian TV, and they were saying that on the internet they have all sorts of harmful sex programs that get people addicted ("I don't know what they are, but they are bad for you. Afterwards they don't know how to deal with real women. Don't watch this, those things are for weak people. I understand, it's hard without a partner around, but don't do it. You just have to finish up grad school and set up your personal life.")
-- "It's pornography, grandma. People used to watch in in magazines and TV; now it's on the internet."
-- "Pornography?.. Oh yes, that's what it's called. No, I think it's different. It used to be pictures and whatnot..."

I never thought that grad school would ruin my life through pornography. It's an interesting angle though.

Tagged with .

Emacs + Launchy + notepad2 + AutoHotkey make Windows tolerable

I have finally kicked Windows notepad completely to the curb, having replaced it with notepad2 as per instructions here.  But I didn't stop there.  I realized that by putting essential light-weight customizations into a separate file (init.q.el, q for quick) and replacing the shortcut to my Emacs with
C:\Emacs\emacs-23.0.95\bin\runemacs.exe -q -l "%HOME%\.emacs.d\init.q.el",
I can make it launch almost as fast as notepad.  So, I did my usual trick of appending the names of commonly used files where I keep my notes to the above command line, putting the shortcut somewhere in the start menu where Launchy can find it, and having instant access to the information I need -- in Emacs!

But I didn't stop there.  I snagged the recently released binaries for Emacs 23 and installed nice fonts -- I'm a fan of DejaVu Sans 10 and Consolas 11, and now my text editing is beautiful.  Anyhow, I guess I'm getting pretty indoctrinated by the Church of Emacs.  I don't particularly like the idea of RMS as my spiritual leader, but I'm fine with worshiping Yidong.  I just wish he'd forced me to see the light back in undergrad :P

So how does AutoHotkey fit into the picture?  Well, it makes it trivial to send any file from Windows Explorer to Emacs with a keyboard shortcut.  See this stackoverflow thread for details!  The thread also describes how you can also add send-to-Emacs to a context menu.  Sending things to Emacs can work in several ways, but I like the idea of starting Emacs server (M-x server-start) in one instance of Emacs, and then all these send-to-Emacs actions will direct the files to that particular instance.

Tagged with .

Let's back that sh*t up

Now that I find myself with gigs of online storage space to be used for backups, it gives me a perfect excuse to organize my files in a sensible manner and schedule them to be backed up regularly.  Now, "sensible manner" in my case ended up involving many directories and a host of symlinks (which appear as shortcuts under Windows...  nice touch by Cygwin), and so the backup script had to apply some filtering rules.  It turned out to be rather more involved than one would think is necessary.  rsync include / exclude mechanism is rather tricky, and many people recommend using find for filtering and piping the results to rsync.   It is dubious whether that is a more straightforward solution: for instance, in order to select a set of files with extensions .pl, .h, and .cpp, I ended up with the following gem:

srcdir="projects/fin/code/"
srcpath=~/$srcdir
exts="pl,h,cpp"
eval ls -A1 $srcpath*.{$exts} 2>/dev/null | \
xargs -I{} echo {} | eval sed 's/^.\\{${#srcpath}\\}//g' | \
rsync -vzaEHxuh --progress --stats --append --delete \
--files-from=- $srcpath $hostname:~/$srcdir

Whoa.  What's going on here?..  Many subtle things.  Line 4 creates a list of files with the given extensions.  eval is necessary to combine brace expansion with variable expansion, while the output redirection kills stderr messages, e.g. in case there are no files with .h extension.  However, rync has problems unless the files are given relative to the source path.  Line 5, then, strips out the first ${#srcpath} characters (this is the bash syntax for length of string in $srcpath).  If we wanted to remove 5 chars from the start of $foo, we would do
echo $foo | sed 's/^.\{5\}//g'
-- pretty standard usage of sed.  Here, we need once again to use eval in order to get to the ${#srcpath} variable, and since we are using eval the braces in sed have to be escaped twice.
The xargs construct simply takes the list of files and feeds it to the sed construct line by line (-I{} instructs xargs to place the line where {} are in the following code).

Now, of course after feeling great about myself for being at ease with all aspects of this eval sed construction, I realized (and by that I mean, Andrei realized) that  echo $foo | eval sed 's/^.\\{${#srcpath}\\}//g' is completely equivalent to basename $foo. I like this sort of brevity, although it's a little disappointing that as of late my attempts to showcase knowledge of sed and awk have been torpredoed by these sorts of built-in bash commands.

Here's another one: I want to back up all the hidden (dot-) files and directories, as well as their contents.

srcdir=~
ls $srcdir -A1 | grep -e "^\." 2>/dev/null | \
rsync -vzraEHxuh --progress --stats --append --delete \
--files-from=- $srcdir $hostname:~/
# note that it is necessary to explicitly use the r switch!  The file
# list includes the directories, but not the contents of their directories.
# Even though -a switch in rsync implies -r, it won't go into the directories
# if -r isn't there explicitly.

Luckily, -r switch on rsync forced it to look into the directories given in the list and process them recursively.  If that didn't work, I would have had to use
find ~ -path ~/.\*
-- (notice the escaped *), and I'd have to strip the ~ from the start of full path strings using the complicated construct above.

The other good side effect of this project (in addition to having my data safely backed up) is that I understand the find command much better now.  In hinsight, it is quite simple, actually.  find starts in a given directory and recursively constructs a file tree.  It then goes through this file tree, applying a set of expressions to each element.  If the expressions evaluate to true, it performs certain actions (default is to print the pathname).  Expressions can involve options, tests (-name, -path, -regex, -type, -user, etc.), and actions (-delete, -exec, -prune, ...).    So what does a statement like
find $mydir  \! -name $mydirname -prune  -name ".*"
actually mean?

Well, the parenthesized expression means: if the name of the file doesn't contain $mydirname, (\! -name $mydirname) then prune it (don't include it).  (Notice that negation has high precedence -- higher than the implicit AND between -name and -prune.)   For instance,  find ~  \! -name Leo -prune  will see /home/Leo, see that the (base)name
is Leo, and will let it pass, but everything else will die -- since /home/Leo will be the first "file" listed, it won't go into any of the subdirectories.  Using this trick, we can preclude find from recursing into subdirectories.

Another example (taken from the manual):
find . -name .snapshot -prune -o  \! -name *~ -print0
This is a common construct used to ignore certain files and directories while performing some other action.  OR is needed because for files that don't match our pruning criterion, the -name .snapshot -prune will be false.  If we had an implicit AND, the expressions that follow would also be short-circuited to false.

Another very common usage (one of my earliest memories of find, actually):
find . -name "*.nb" -exec grep -H PATTERN {} +
This simply runs a command (grep in this case) putting the matching file(s) in {}.  (It used to be that instead of + the expression was terminated by an escaped semicolon \; -- but + is faster).   Apparently, this runs quite fast and in efficiency rivals piping the result of find to xargs:
find . -name "*.nb" -print0 | xargs -0 grep PATTERN
(note the -print0 / xargs -0 pair -- this guards against non-standard filenames).

Other things that I learned this fine holiday weekend include being wary of wc character count (it counts newlines as characters!  (cf.   echo "123" | wc -m vs printf "123" | wc -m or echo -n "123" | wc -m), got reminded of the basename command, refreshed basic awk syntax...  Not bad, not bad at all.

Tagged with .

Beam propagation in 15 lines of Matlab

Clipboard00

I recently returned to examining transmission / reflection properties of anisotropic planar structures, and I figured that this time around I should do it the Right Way (TM), which boils down to:

1. Take legible notes and
2. Stay the hell away from Mathematica for numerics.

With both of these, I succeeded beautifully.  Once again, I have some Matlab code which takes seconds to produce the same results my old Mathematica notebooks would take minutes or hours to compute -- and it's clean, legible, and ran on the first try!

function out = BeamProp1(x,z,kx,kxmask)
    kz0 = sqrt(1 - kx.^2);
    [zz,kk0] = meshgrid(z,kz0);
    fwdPropFieldK = diag(kxmask)*exp(1i.*kk0.*zz);
    clear('zz','kk0'); %free memory
    out = complex(zeros(length(x),length(z)));
    col = 1;
    [xx,kkx] = meshgrid(x,kx);
    for eKZslice=fwdPropFieldK
        [xx,fld] = meshgrid(x,eKZslice); % diff x's are in columns
        eKX = sum(fld.*exp(1i.*xx.*kkx),1).';
        out(:,col) = eKX;
        col=col+1;
    end
end

Couple of notes on the code: first, it does not handle evanescent fields gracefully: if you put evanescent fields (kx > 1) at the origin, they will exponentially grow for z < 0 and will clobber all other features of the plot.  Second, the code is fully vectorized -- meaning that it sacrifices memory for speed.  The good news is that even though there are 3 vectorized dimensions (x,z,kx), only two of them are put into meshgrid at the same time -- so no 3D matrices to store.  Still, expect to have at least a few tens of megs available when dealing with large fields of view.

Tagged with , .

Matlab vs Mathematica, part ....

Simple code to compute a Fourier transform and numerically integrate the output.  Takes 2.5 hours to run in Mathematica, 47 seconds in Matlab.  Moral of the story?..  If it's easy to translate Mathematica code to Matlab -- don't bother with numerics in Mathematica...

Tagged with .