Monday, May 30, 2016

[Linux] server

1. mount the server
become root first
$ su
$ mount /home/{hostname}
(giving only the mount point works when /etc/fstab has an entry for it)
2. unmount the server
$ umount /home/{hostname}

[python] install python and R using miniconda

Two years ago, in the spring semester of my first year, I was transitioning from MATLAB to Python. I am not a programming person, and even learning MATLAB as an undergrad was a pain. According to a TED talk speaker, people who use MATLAB are conservative and accept the defaults as they are, something like being stuck in the mud. It is true: in MATLAB you can always click a button if you forget the command, whereas typing in a terminal (Linux/bash) is quite different from Windows and something I still can't really get used to, even today. So it took me a long time to go from being comfortable with MATLAB to falling in love with Python.

dependency:
libcom_err.so <- kerberos (conda install pykerberos)
scipy.misc.imread <- pil (conda install pil)

Thursday, May 12, 2016

Papers: usage notes (continuously updated)

Papers is astonishingly comprehensive and smooth; anyone who uses it knows. That alone beats every other reference manager hands down.

Update 2016/05/12

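An update on my experience with Papers (having used it this long, I suppose I have earned some say)
1) For the two years up to about half a year ago, automatic paper download worked very well; the earlier problems with papers failing to import or download had improved greatly.
2) Over the past half year, however, quite a few publishers (Wiley, Springer, and so on, even the AMS journals) have rolled out an interface called epdf. It looks fairly slick, but it directly broke automatic download... Every link now opens a prompt asking whether you want the standard pdf or the enhanced pdf. Since I already have my own local library, I see no need to set up yet another one inside each publisher's world; it wastes time and energy, and what is it good for?
Today, in a fit of anger, I took a close look at epdf and found that it has improved a lot over this half year. Annotations and notes still cannot be saved, but features such as jumping straight to figures and references are more useful than a plain pdf.
To summarize: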

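  • epdf is a format that is more efficient to read than pdf
  • epdf only works through a web page
  • different publishers have different epdf interfaces, and their libraries are presumably not shared either
I have not thought of any particularly good approach so far; every choice here is a forced one...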

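  • Search for a paper in Papers, open the epdf page, and skim the figures and results first.
    • For ordinary papers, do not download the spdf (standard pdf); just save the epdf page.
    • For important papers, save the epdf page and also download the pdf, then start annotating it.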

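Maybe this is an opportunity: to let go of my fixation on the spdf and stop feeling that a paper is only truly in my collection once I have the black-and-white pdf.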

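3) Removing duplicate papers and duplicate authors both have to be done by hand. For duplicate papers: show duplicate papers and then merge. For duplicate authors: go to the authors tab and drag one duplicate onto the other with the mouse to merge them. Still not very convenient, but tolerable.
4) I have not retested the Dropbox sync issue.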


Wednesday, May 11, 2016

Coding lessons (to be continued)

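Murphy's law.
If the code goes wrong, the bug is almost always in the part I was least sure about and wrote most carelessly and vaguely. Even if it does not fail this time, it will sooner or later!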

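My boss pinpointed the bug in a couple of quick moves and fixed it.
Beyond being stunned and in awe, I mostly wondered how I should improve.
When the data look wrong, my boss retraces the chain of reasoning: from the previous step to this one, is anything wrong? No jumping around with wild guesses.
Debugging follows the same logic: run the program up to the step just before the error and see what is causing it.
Code should be easy to read and easy to change. The key parts, such as the core computation, the time step, and I/O, should stand out, and it is best to list the main steps in the header description so that later changes are easier.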
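
The step-by-step retracing described above can be sketched in Python. This is only an illustration, not my boss's actual workflow: the helper `first_bad_step`, the three-step pipeline, and the `no_nan` check are all hypothetical names made up for the example. The idea is to run the steps in order, check the output after each one, and return the input that went into the first failing step so that it can be inspected.

```python
import math


def first_bad_step(steps, data, is_valid):
    """Run steps in order; stop at the first one whose output fails is_valid.

    Returns (step_name, input_to_that_step, bad_output), or None if every
    step produces valid output.
    """
    for name, func in steps:
        result = func(data)
        if not is_valid(result):
            # Hand back the input too: the cause is often in the step before.
            return name, data, result
        data = result
    return None


def no_nan(values):
    """Validity check for this toy example: no NaN anywhere."""
    return all(not math.isnan(v) for v in values)


# Hypothetical pipeline: each entry is (name, function on a list of floats).
steps = [
    ("scale", lambda xs: [x * 0.5 for x in xs]),
    ("shift", lambda xs: [x - 3.0 for x in xs]),
    ("sqrt", lambda xs: [math.sqrt(x) if x >= 0 else float("nan") for x in xs]),
]

report = first_bad_step(steps, [4.0, 8.0, 16.0], no_nan)
print(report)
```

Here the NaN first shows up in the "sqrt" step, but the returned input shows a negative number coming out of "shift", which is where the real problem sits.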

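Since taking APC 524, I have been modularizing my code. Anything that performs the same task goes into a function; anything that reads the same batch of data goes into one class. This not only makes debugging faster but also makes it easy to copy mature pieces of code elsewhere.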
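
A minimal sketch of that layout, assuming a toy dataset given as whitespace-separated text. The class `GridDataset` and the functions `row_means` and `anomalies` are hypothetical names for illustration, not code from the course or from my own projects.

```python
class GridDataset:
    """Everything that reads one batch of data lives in this class."""

    def __init__(self, text):
        self.values = self._parse(text)

    @staticmethod
    def _parse(text):
        # One row per line, whitespace-separated numbers.
        return [[float(v) for v in line.split()]
                for line in text.strip().splitlines()]


def row_means(rows):
    """One task, one function: the mean of each row."""
    return [sum(row) / len(row) for row in rows]


def anomalies(values):
    """Another task: the deviation of each value from the overall mean."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


dataset = GridDataset("1 2 3\n4 5 6")
means = row_means(dataset.values)
print(means, anomalies(means))
```

Because the reading code and each computation are separate, any one of them can be tested alone or copied into another script.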

Finally, practice makes perfect: write a lot, revise a lot.

[Python] Regrid/Remap using python/cdo/gdal

Spatial resampling, which we sometimes call regridding or remapping (or even interpolation, depending on whether we upscale or downscale the grid cells), is something we do quite often when dealing with large-scale datasets.

If we want to use scipy, here are the functions I found relevant, though they are not good enough because they are disconnected from the geographic coordinates:
scipy.ndimage.map_coordinates
scipy.interpolate.RectBivariateSpline
I used the second one and found several issues. First, the latitude and longitude seem to come out wrong. The second problem is more serious: the boundary is wrong.
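
For reference, here is a hedged sketch of how RectBivariateSpline can be tied to latitude/longitude arrays. I have not confirmed that it cures the exact problems I ran into, and the wrapper `regrid_bilinear` is a hypothetical helper. It makes three assumptions: the source latitudes may be stored north to south, so it flips them because the spline needs strictly ascending coordinates; the destination coordinates are given in ascending order; and longitude wrap-around (0..360 versus -180..180) is not handled. Points outside the source domain are set to NaN instead of being filled with boundary values.

```python
import numpy as np
from scipy.interpolate import RectBivariateSpline


def regrid_bilinear(lat_src, lon_src, field, lat_dst, lon_dst):
    """Bilinear regridding of field[lat, lon] onto (lat_dst, lon_dst)."""
    lat_src = np.asarray(lat_src, dtype=float)
    lon_src = np.asarray(lon_src, dtype=float)
    field = np.asarray(field, dtype=float)
    lat_dst = np.asarray(lat_dst, dtype=float)
    lon_dst = np.asarray(lon_dst, dtype=float)

    # The spline requires strictly ascending coordinates; flip if the
    # latitudes run north to south.
    if lat_src[0] > lat_src[-1]:
        lat_src = lat_src[::-1]
        field = field[::-1, :]

    # kx = ky = 1 gives a piecewise-linear (bilinear) interpolant.
    spline = RectBivariateSpline(lat_src, lon_src, field, kx=1, ky=1)

    # Evaluate only inside the source domain, then mask what was outside.
    lat_eval = np.clip(lat_dst, lat_src[0], lat_src[-1])
    lon_eval = np.clip(lon_dst, lon_src[0], lon_src[-1])
    out = spline(lat_eval, lon_eval)

    outside_lat = (lat_dst < lat_src[0]) | (lat_dst > lat_src[-1])
    outside_lon = (lon_dst < lon_src[0]) | (lon_dst > lon_src[-1])
    out[outside_lat, :] = np.nan
    out[:, outside_lon] = np.nan
    return out


# Toy field that is linear in lat and lon, stored north to south.
lat = np.array([30.0, 20.0, 10.0, 0.0])
lon = np.array([0.0, 10.0, 20.0, 30.0])
field = lat[:, None] + 2.0 * lon[None, :]

new_lat = np.array([5.0, 15.0, 25.0])
new_lon = np.array([5.0, 15.0, 25.0, 35.0])  # 35 lies outside the source grid
regridded = regrid_bilinear(lat, lon, field, new_lat, new_lon)
print(regridded)
```

The toy field is linear, so bilinear interpolation should reproduce it inside the domain, while the lon = 35 column is masked.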

Another workaround is CDO's remap operators. CDO wraps SCRIP (Spherical Coordinate Remapping and Interpolation Package), which can be found online (Los Alamos National Laboratory). I strongly recommend this functionality of CDO, a powerful and fast tool; SCRIP itself is written in Fortran. It offers bilinear, bicubic, distance-weighted average, nearest-neighbor, conservative (box-average), and largest-area-fraction interpolation.
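
To keep everything in python, the CDO call can be assembled as a command list and handed to subprocess. This is only a sketch: the operator names are the ones I know from the CDO documentation (remapbil, remapbic, remapdis, remapnn, remapcon, remaplaf), while the helper `build_cdo_remap_command`, the file names, and the target grid `r360x180` (a global regular 1-degree grid in CDO's naming) are placeholders. The example only builds and prints the command; actually running it requires cdo to be installed.

```python
# Map the interpolation methods listed above to CDO operator names.
CDO_REMAP_OPERATORS = {
    "bilinear": "remapbil",
    "bicubic": "remapbic",
    "distance-weighted": "remapdis",
    "nearest-neighbor": "remapnn",
    "conservative": "remapcon",
    "largest-area-fraction": "remaplaf",
}


def build_cdo_remap_command(method, target_grid, infile, outfile):
    """Return the argument list for: cdo <operator>,<target_grid> infile outfile."""
    operator = CDO_REMAP_OPERATORS[method]
    return ["cdo", f"{operator},{target_grid}", infile, outfile]


# Placeholder file names; replace with real NetCDF paths.
cmd = build_cdo_remap_command("bilinear", "r360x180", "input.nc", "output.nc")
print(" ".join(cmd))

# To actually run it (needs cdo on the PATH):
# import subprocess
# subprocess.run(cmd, check=True)
```

The target grid can also be a grid description file or another data file whose grid should be matched.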

I haven't really looked at gdal, but I guess it takes more time to figure out the commands from the unfriendly gdal manual...

Barometric Law