02
Mar

Update on life

It’s been over a couple of months since I last wrote, so I thought I’d update people on what has been going on in my life. I hope to resume posting regularly starting this week, so check back soon.

As some of you may know, I graduated last December from the University of Maryland, Baltimore County with a B.S. in Bioinformatics and Computational Biology and a B.S. in Psychology (yes, yes, laugh all you want… it was fun!). I spent the next month or so job hunting and interviewing at various places. I had originally expected that I would end up working somewhere like JCVI; however, before I was ever able to get in touch with people there, I was offered a really cool job working at the NASA Goddard Space Flight Center in Greenbelt, Maryland.

I’m now working with solar physicists there as a web developer. Although in the days to come I will be working on several different projects, including a Virtual Solar Observatory, the project I’m currently focused on is called “HelioViewer.” The goal of HelioViewer is to produce something similar to Google Maps, but using solar data. The project is still in its infancy and has a long way to go, but there is already a working prototype that includes some basic functionality like loading images, layering, and zooming.

This is what it looks like:

HelioViewer Screenshot

The application is written primarily in Javascript and PHP, and uses the Prototype and Script.aculo.us Javascript Frameworks.

I’ve been developing web applications for a pretty long time now, but up until recently I stuck mostly to server-side languages and have avoided Javascript like the plague. The whole idea of client-side scripting seemed like a dead-end: browser support was variable and the only really interesting applications of Javascript to come out at the time (aside from, of course, blinking text) were browser exploits. Recently, however, with the availability of the XMLHttpRequest object, which is the life-force behind the now-ubiquitous Ajax applications we have come to know and love (think gmail), things are changing for Javascript.

Javascript 2.0, based off ECMAScript version 4, will boast a number of improvements including better OOP support and is (I believe) slated for release at the end of 2008. That’s grand and all, but the web-development community has already gotten together and made some extraordinary progress on the Javascript front in the form of Javascript “frameworks.” These frameworks, which include Prototype, jQuery, YUI and Ext JS, make writing full-scale Javascript applications a highly respectable task. I will talk more about these later, but for now, let’s just say I’m a Javascript convert.

Let me step aside for one moment and point out that, while Ajax has become hugely popular for web application/RIA development, it is not the only contender in the arena. Another technology I wrote about a while back, Flex, is also a very able contestant in my opinion. Just this past week it reached a new milestone with the release of Flex 3. Flex and Ajax applications have their own advantages and weaknesses, and both are worth considering. I also plan to write more about this in the future, but if you are interested in learning more about Flex in the meantime, there are a number of excellent blogs worth checking out, including Ted On Flex, Flex Examples, EverythingFlex, and InsideRIA.

Finally, as I mentioned above, I graduated with a degree in Bioinformatics. I still love bioinformatics, and am doing my best to keep up with current research in the field. From time to time I will try to post interesting advances in the field, and maybe even write some posts which combine topics in bioinformatics with some of the web development technologies I’m working with at the moment. It should be a lot of fun :) If you would like to see what I’m reading in the meantime, feel free to subscribe to my Google Reader Shared items feed. I should warn you though, I am presently subscribed to over 300 feeds on topics ranging from bioinformatics to cute annotated pictures of cats, so be prepared for an interesting mix of items.

11
Dec

Have a merry x-mas… compiz style

One of the cooler, lesser-known plugins for Compiz, xglsnow, was sadly left in the dust with the inclusion of Compiz Fusion into Ubuntu 7.10. That doesn’t mean, however, you can’t still get it working in time for the holiday season! Check out the video below to see the plugin in action.

Here is a screenshot of what it looks like on my machine:

screenshot of xglsnow running on my desktop

Note: This tutorial assumes that you already have Compiz or Compiz Fusion set up. If you don’t, try searching the forums: there is a huge number of guides floating around on getting Compiz running with different graphics cards.

Ready? Here goes…

I. Installing xglsnow

First, you need to install the packages necessary to build the plugin. Open up a console (alt+F2 -> “gnome-terminal”) and type:

sudo apt-get install compiz-bcop compiz-dev build-essential libxcomposite-dev libpng12-dev libsm-dev libxrandr-dev libxdamage-dev libxinerama-dev libstartup-notification0-dev libgconf2-dev librsvg2-dev libdbus-1-dev libdbus-glib-1-dev libgnome-desktop-dev x11proto-scrnsaver-dev libxss-dev libxslt1-dev libtool

Create a directory in your home folder to install the plugin to:

mkdir -p ~/compiz/

Download xglsnow and extract it to the directory you just created:

wget -O /tmp/snow.tar.gz "http://gitweb.opencompositing.org/?p=fusion/plugins/snow;a=snapshot;h=01d0ff6ec71dae4699bc990e0114569c8ad4e083"

tar -xf "/tmp/snow.tar.gz" -C ~/compiz/

Finally, navigate to the directory, compile and install:

cd ~/compiz/snow

make
make install

Now you just need to install some textures, configure xgl, and you’re done! :)

*The above steps are based off a tutorial by Scott at the Compiz Fusion forums. Thanks!

II. Adding textures

The above tarball doesn’t include any snow textures, so by default all you would see are some floating white blocks… not very pretty… The package from the xglsnow homepage, however, includes a texture which looks pretty nice. To set it up, go to the xglsnow project homepage and download xglsnow-0.2.0.tar.gz. Extract the files, and copy the file “snowflake2.png” to any location you would like, e.g. ~/.compiz/images or /usr/share/images:

tar -xf xglsnow-0.2.0.tar.gz

cd xglsnow-0.2.0/

mkdir -p ~/.compiz/images

mv snowflake2.png ~/.compiz/images

If you haven’t already, restart Compiz to load the new plugin (alt+F2 -> “compiz --replace”) and run the Compiz settings manager: alt+F2 -> “ccsm”. Find the “Snow” plugin and check the box to the left of it to enable it.

Compiz settings manager

Now click on the plugin’s name to modify its settings. Next go to “Textures” -> “add” -> “browse” (click the folder icon). Navigate to the location where you saved the texture from above and hit “okay.”

Compiz settings manager (snow configuration)

All done!

Press “Super + F3” to start xglsnow. If you don’t see anything, check to make sure that the PNG plugin for Compiz is enabled, and that the hotkey for xglsnow is in fact “Super + F3.”

If you want to install some different snow textures, try the Snowflakes pack on Gnome-look.

III. Wallpapers

Finally, if you want to find some wintry wallpapers to go along with your new snow-covered desktop, take a look at Blue Christmas from Digital Blasphemy (that is the one in the screenshot above). Gnome-art has a nice picture of a winter landscape in Alsace, France. You can also find some winter wallpapers at Gnome-look and Kde-look. Try searching for “winter” or “snow.”

That’s all!

Feel free to write any suggestions, or a link to a screenshot of your own holiday desktop :)

04
Dec

Intro Data Mining Webinar (December 13, 2007)

Salford Systems is hosting a free Introductory Data Mining Webinar on December 13, from 10-11am EST.

From the description of the seminar:

This one-hour webinar is a perfect place to start if you are new to data mining and have little-to-no background in statistics or machine learning.

In one hour, we will discuss:

**Data basics: what kind of data is required for data mining and predictive analytics; in what format must the data be; what steps are necessary to prepare data appropriately.

**What kinds of questions can we answer with data mining?

**How data mining models work: the inputs, the outputs, and the nature of the predictive mechanism.

**Evaluation criteria: how predictive models can be assessed and their value measured.

**Specific background knowledge to prepare you to begin a data mining project.

Data mining and the related field of machine learning deal with finding patterns in large sets of data. This is very useful for trying to understand and model complex natural phenomena, and bioinformaticians have not been shy to take advantage of these methods. Just look at any recent issue of BMC Bioinformatics or PLoS Computational Biology and you will see a number of articles involving SVMs, Neural Networks, and Bayesian networks.

This webinar is geared towards people with little or no understanding of data mining, so it should be a good introduction if you haven’t learned about machine learning or data mining before. If you are interested in learning more, there are some good tutorials online here and here. Videolectures.net and Peteris’s blog include a number of video lectures on machine learning.

To sign up, go to the event description and click “enroll.”

02
Dec

Bio::Blogs #17 (Courtesy of Mr. Claus) is now available

The seventeenth edition of the premier bioinformatics blog carnival, Bio::Blogs, is now available over at Paulo’s blog. Give it a read, why don’t you!

Zoidberg

(Oh, and the new Futurama movie, Bender’s Big Score is out now too! Oh happy day.)

27
Oct

Compiz Fusion with ATI Radeon X800 GTO on Ubuntu Gutsy

After much weeping and gnashing of teeth, I have finally gotten Compiz Fusion to run after upgrading to Ubuntu 7.10, and it looks very sharp.

gusty_compiz_xgl

Compiz Fusion running on Ubuntu 7.10 with XGL.

As anyone else who owns an ATI card can attest, getting OpenGL working in Ubuntu is often no small task. After the initial upgrade to 7.10, I was not surprised to find that Ubuntu was using the VESA (2D) graphics drivers. My first thought was to try using the proprietary drivers manager version of the ATI drivers, xorg-driver-fglrx (ATI’s non-open-source 3D drivers for Linux). This is the version of the driver that is installed if you click “enable” in Ubuntu’s proprietary drivers manager. After enabling the drivers and playing around with the xorg.conf settings, I was still having no luck getting OpenGL working and was getting the standard error messages:

ubuntu-desktop:~$ fglrxinfo
display: :0.0 screen: 0
OpenGL vendor string: Mesa project: www.mesa3d.org
OpenGL renderer string: Mesa GLX Indirect
OpenGL version string: 1.2 (1.5 Mesa 6.4.1)

At this point I decided to try the new 8.42.3 ATI drivers, which are purported to support AIGLX and thus should be able to work without XGL. Following a guide I found via the forums, I was able to install the 8.42.3 drivers without too much trouble. Unfortunately, I still had no luck getting OpenGL to work. I tried several combinations of xorg.conf settings, switching Composite and AIGLX on and off, but to no avail.

So I decided to uninstall the new driver, and wait and pray that when the newer drivers were uploaded to the main repos, it would work for me. By a stroke of luck however, I noticed that after removing the new driver, and reloading the default proprietary-driver-manager version, OpenGL was now working! I reinstalled xserver, rebooted, and Voila!– Working Compiz Fusion!

ubuntu-desktop:~$ fglrxinfo
display: :0.0 screen: 0
OpenGL vendor string: ATI Technologies Inc.
OpenGL renderer string: RADEON X800 GTO
OpenGL version string: 2.0.6473 (8.37.6)
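The two fglrxinfo outputs above differ mainly in their vendor and renderer strings. As a rough sketch (not part of the original troubleshooting), the check can be automated by parsing those lines. The function names are hypothetical, and treating any mention of "Mesa" as a sign of fallback rendering is an assumption based only on the two outputs shown in this post:

```python
def parse_fglrxinfo(output: str) -> dict:
    """Collect the 'OpenGL ... string: value' lines of fglrxinfo output into a dict."""
    info = {}
    for line in output.splitlines():
        line = line.strip()
        if line.startswith("OpenGL") and ":" in line:
            # Split on the first colon only; the value may contain colons itself.
            key, _, value = line.partition(":")
            info[key.strip()] = value.strip()
    return info


def looks_like_mesa_fallback(info: dict) -> bool:
    """Heuristic (an assumption): Mesa in the vendor or renderer string means
    the ATI driver is not the one doing the rendering."""
    vendor = info.get("OpenGL vendor string", "")
    renderer = info.get("OpenGL renderer string", "")
    return "Mesa" in vendor or "Mesa" in renderer


# The two outputs quoted in this post.
BROKEN = """display: :0.0 screen: 0
OpenGL vendor string: Mesa project: www.mesa3d.org
OpenGL renderer string: Mesa GLX Indirect
OpenGL version string: 1.2 (1.5 Mesa 6.4.1)"""

WORKING = """display: :0.0 screen: 0
OpenGL vendor string: ATI Technologies Inc.
OpenGL renderer string: RADEON X800 GTO
OpenGL version string: 2.0.6473 (8.37.6)"""
```

In practice the text would come from running fglrxinfo (for example through subprocess) rather than from hard-coded strings.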

So what is the difference between now and after the initial upgrade? The Screen & Graphics manager is set to use the “ati” drivers instead of “fglrx”! Also, I enabled the “Composite” extension in xorg.conf (see below). Everything else is the same:

Gutsy restricted drivers manager

Using the default restricted drivers manager ATI drivers.

Gutsy screen and graphics preferences

Screen and graphics preferences

Even though Gutsy is set to use the “ati” driver version, xorg.conf is still set to use fglrx, and running compiz in the terminal confirms that the fglrx drivers are being used.

ubuntu-desktop:~$ compiz
compiz compiz.real
ubuntu-desktop:~$ more compiz
compiz: No such file or directory
ubuntu-desktop:~$ compiz --version
Checking for Xgl: present.
Checking for nVidia: not present.
Checking for Xgl: present.
Enabling Xgl with fglrx ATi drivers…
Starting emerald
compiz 0.6.1

Why setting the screen and graphics preferences driver to “fglrx” breaks fglrx is beyond me, but in any case, at least it is working now. In case anyone else would like to see, the contents of my xorg.conf file are as below:


Section "ServerLayout"

# Uncomment if you have a wacom tablet
# InputDevice "stylus" "SendCoreEvents"
# InputDevice "cursor" "SendCoreEvents"
# InputDevice "eraser" "SendCoreEvents"
Identifier "Default Layout"
Screen 0 "aticonfig-Screen[0]" 0 0
InputDevice "Generic Keyboard"
InputDevice "Configured Mouse"
EndSection

Section "Files"
EndSection

Section "Module"
Load "bitmap"
Load "extmod"
Load "freetype"
Load "int10"
Load "vbe"
Load "glx"
Load "dbe"
Load "dri"
Load "v4l"
EndSection

Section "InputDevice"
Identifier "Generic Keyboard"
Driver "kbd"
Option "CoreKeyboard"
Option "XkbRules" "xorg"
Option "XkbModel" "pc104"
Option "XkbLayout" "us"
Option "XkbOptions" "altwin:meta_win"
EndSection

Section "InputDevice"
Identifier "Configured Mouse"
Driver "mouse"
Option "CorePointer"
Option "Device" "/dev/input/mice"
Option "Protocol" "ImPS/2"
Option "ZAxisMapping" "4 5"
Option "Emulate3Buttons" "true"
EndSection

Section "InputDevice"
Identifier "stylus"
Driver "wacom"
Option "Device" "/dev/input/wacom"
Option "Type" "stylus"
Option "ForceDevice" "ISDV4" # Tablet PC ONLY
EndSection

Section "InputDevice"
Identifier "eraser"
Driver "wacom"
Option "Device" "/dev/input/wacom"
Option "Type" "eraser"
Option "ForceDevice" "ISDV4" # Tablet PC ONLY
EndSection

Section "InputDevice"
Identifier "cursor"
Driver "wacom"
Option "Device" "/dev/input/wacom"
Option "Type" "cursor"
Option "ForceDevice" "ISDV4" # Tablet PC ONLY
EndSection

Section "Monitor"
Identifier "aticonfig-Monitor[0]"
Option "VendorName" "ATI Proprietary Driver"
Option "ModelName" "Generic Autodetecting Monitor"
Option "DPMS" "true"
EndSection

Section "Device"
Identifier "aticonfig-Device[0]"
Driver "fglrx"
EndSection

Section "Screen"
Identifier "aticonfig-Screen[0]"
Device "aticonfig-Device[0]"
Monitor "aticonfig-Monitor[0]"
DefaultDepth 24
SubSection "Display"
Viewport 0 0
Depth 24
EndSubSection
EndSection

Section "Extensions"
Option "Composite" "enable"
EndSection

Hopefully Compiz will not break again with the next update of xorg-driver-fglrx. If it does, you may see another “Compiz Fusion on Gutsy” post in the weeks to come.
16
Oct

Science Viral Metagenomics Webinar Oct 24, 2007

In case anyone is interested, Science will be hosting an online discussion on the metagenomics of Honey Bee colony collapse disorder next week (Oct 24 2007, 12 - 5pm EST). Speakers include Dr. W Ian Lipkin (Columbia University) and Dr. Michael Egholm (454 Life Sciences).

From the description on the seminar homepage:

Colony collapse disorder (CCD) among honey bee populations in the United States has resulted in the loss of between 50% and 90% of hive colonies. Previous studies have pointed to the possibility that an infectious agent could be involved. A recent study published in Science magazine used unbiased metagenomic analysis to survey microflora present in normal and CCD-affected hives to determine whether a pathological agent could be linked to CCD. The authors found that the presence of one virus, Israeli acute paralysis virus of bees (IAPV), showed a strong correlation with colony collapse disorder. In addition to the important economical implications, this work also represents a novel use for massively parallel next generation sequencing technology which has enabled this type of high level metagenomic study.

You will hear our panel, which includes two of the study’s authors, discussing:

  • How metagenomics can be applied in the discovery of unknown pathogens.
  • The importance of study design and data analysis in metagenomics research.
  • How recent technological advances have made this type of study possible.

Registration is required, but is open to anyone interested.

14
Oct

No Mathematica for Ubuntu

I decided to download and install Mathematica 6.0 for Linux; our school has a license agreement with Wolfram, so we can use Mathematica for free. Ubuntu 7.04, however, doesn’t seem to want to play nicely with Mathematica. Even after using a couple of tricks to get it to install and to display output properly, Mathematica still seg-faults any time I try to evaluate anything more complicated than simple arithmetic.

Mathematica on Ubuntu Seg Fault Example

It works fine on Fedora 8, however.

Screenshot of Mathematica 6 Running on Fedora 8

I haven’t spent very much time troubleshooting the problem yet, and I would imagine the problem could be fixed without too much work, but right now I just don’t have the time. I haven’t had the need to use Mathematica before, but so far I’m pretty impressed. The Mathematica Demonstrations Project has a bunch of cool visualization demos, and you can even attend free online seminars to learn how to use some of the features of Mathematica.

There are some open-source alternatives to Mathematica (e.g. GNU Octave and R) that I would like to learn eventually, but for now I think I’ll stick to Mathematica since there is already a wealth of documentation for it online.

07
Oct

Pipe Dreams

Although most of you are probably already familiar with Yahoo Pipes by now, and several bioinformatics-related pipes have already been created, I thought I’d add one more to the array. I made this pipe back in August to search through all of the bioinformatics journals I could think of, as well as a couple of other mainstream science journals and preprint servers.

Daily Cup O' Joe

The pipe works by grabbing feeds from each of the journals and then filtering based on the title and/or description. The pipe includes feeds from Oxford Bioinformatics, Evolutionary Bioinformatics, Journal of Computational Biology, PLOS Computational Biology, BMC Bioinformatics, Nature, and Science (along with preprint feeds from several of the journals). To use the pipe, you simply specify the search term(s) you want to filter by. Running the pipe will show you what the current results for the specified search terms are, as well as give you the option to subscribe to those search terms. This way, if you want to be sent any new articles relating to “prediction,” for example, you can add the search to your news reader.
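The filtering step described above can be sketched in a few lines of Python. This is only an illustration of the idea, not how Yahoo Pipes works internally: the Item class, the function names, and the sample entries are all made up, and the feeds are assumed to have been fetched and parsed already.

```python
from dataclasses import dataclass


@dataclass
class Item:
    """One feed entry (hypothetical structure for this sketch)."""
    title: str
    description: str
    source: str


def matches(item, terms, search_title=True, search_description=True):
    """True if any search term appears in any of the selected fields (an OR search)."""
    fields = []
    if search_title:
        fields.append(item.title.lower())
    if search_description:
        fields.append(item.description.lower())
    return any(term.lower() in field for term in terms for field in fields)


def filter_items(items, terms, **options):
    """Keep only the items that match at least one term."""
    return [item for item in items if matches(item, terms, **options)]


# Made-up entries standing in for the merged journal feeds.
ITEMS = [
    Item("Protein structure prediction with SVMs", "A new kernel method.", "BMC Bioinformatics"),
    Item("Gene expression atlas", "Includes prediction of regulatory motifs.", "Nature"),
    Item("Honey bee metagenomics", "Survey of hive microflora.", "Science"),
]
```

With an OR search like this, filter_items(ITEMS, ["prediction"]) returns the first two entries, while restricting the search with search_title=False returns only the second. Searching more fields can only add results, never remove them.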


Yahoo pipes example search results

Although the pipe works pretty well for the most part, I’ve noticed that there are still a couple kinks. For one thing, the pipe has trouble fetching articles from PLOS. When trying to grab the feed, it outputs “error fetching pbiol.plosjournals.org/perlserv/?request=get-rss&issn=1553-7358&type=new-articles (302 Moved Temporarily)“. I’ve checked the feed, however, and it seems to be working fine. Another anomaly that I noticed is that sometimes the filters have trouble working together. The pipe is supposed to perform an “OR” search based on input to the title/description fields. In one instance I was getting more results when searching only the description field than when searching both the title and description field for the same search term. That was a couple of days ago, and I have not been able to reproduce the error since then.

All in all, the pipe is pretty nifty. You all are welcome to modify and use it as you please. Let me know if you have any suggestions :)

24
Sep

More on data munging…

Nsaunders pointed out last week the sad trend of bioinformaticians spending more and more time parsing data into something useful and less time actually using the data.

Meta servers in particular pose both a problem and a potential solution to some of this. Since it is often helpful to query more than one server, meta servers, by relaying your query to a number of selected servers, provide a simple solution to avoid having to copy and paste sequences to each and every server you want to query. @Tome for example can take a single sequence and run structure or fold-prediction queries to six different servers simultaneously. MetaPP (although recently beginning to charge for most types of usage) can query an even larger number of servers with just a few clicks.

This is where the usefulness of meta servers ends, however. Although they may collect your data and put it all in a single location for you, the data is still just as jumbled and heterogeneous as if you had gone to each server and run the queries yourself. Although you may have saved yourself five minutes of copying and pasting, you still have many hours of hackery ahead of you before you will be able to do anything useful with the results.

One obvious solution would be for servers to coordinate and come up with some standard format for how data that is similar in nature should be output. This isn’t going to happen anytime soon. A more realistic solution might be to handle the formatting at the meta-server level. Rather than creating web applications that simply relay queries to multiple servers and return the results as a single HTML file, why not first parse those results and turn them into something more readable?
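To make the idea concrete, here is a minimal sketch of what formatting at the meta-server level could look like: one small parser per server, all producing the same record type. Everything in it is hypothetical, including the server names, both raw output formats, and the Hit fields; real servers return far messier output (usually HTML).

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class Hit:
    """Common record that every parser produces (hypothetical fields)."""
    server: str
    template: str
    score: float


def parse_server_a(text):
    """Hypothetical format: one 'template<TAB>score' pair per line."""
    hits = []
    for line in text.strip().splitlines():
        template, score = line.split("\t")
        hits.append(Hit("server_a", template, float(score)))
    return hits


def parse_server_b(text):
    """Hypothetical format: 'id=...; score=...' on each line."""
    hits = []
    for line in text.strip().splitlines():
        fields = dict(part.strip().split("=", 1) for part in line.split(";"))
        hits.append(Hit("server_b", fields["id"], float(fields["score"])))
    return hits


# One parser per backend; supporting a new server means adding one entry here.
PARSERS = {"server_a": parse_server_a, "server_b": parse_server_b}


def normalize(raw_results):
    """Turn {server name: raw text} into a single flat list of Hit records."""
    hits = []
    for server, text in raw_results.items():
        hits.extend(PARSERS[server](text))
    return hits


def to_json(hits):
    """Serialize the unified records so users never see the raw formats."""
    return json.dumps([asdict(hit) for hit in hits], indent=2)


# Made-up raw responses from the two hypothetical servers.
RAW_RESULTS = {
    "server_a": "1abc\t42.5\n2xyz\t17.0\n",
    "server_b": "id=1abc; score=0.91\n",
}
```

Note that the sketch deliberately does not sort or merge by score: scores from different servers are generally on different scales, so a shared format does not by itself make them comparable.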

This wouldn’t be an easy task of course, and since, as nsaunders mentioned, most servers don’t provide any API, it would mean more hacking around with cumbersome HTML parsing classes. It would at least save a lot of hours for all the people who run those queries. Rather than spending days writing scripts to parse the data, bioinformaticians could instead spend days maintaining the improved meta servers or, better yet, move on to more important tasks like interpreting results.

08
Sep

Adventures With Linux RAID Part 2

Background

After several days of playing around with FakeRAID on Ubuntu, I decided to scrap the install and start over using software RAID instead. Originally I had gone with FakeRAID because I wanted to share the RAID partition between Windows and Linux. Software RAID partitions, however, are only available to Linux, so that would have excluded Windows. By the third day or so of attempting to set up a dual-boot on FakeRAID, however, I decided the need to have Windows running on RAID was not that great. Thus, it was time to try software RAID.

Initial Setup

Since my data was already backed up from before, I did not need to worry about salvaging anything from my drives and simply wiped them clean. So here is what I have:

  • Western Digital 320GB SATA 7200rpm
  • Maxtor 300GB SATA 7200rpm
  • Maxtor 160GB SATA 7200rpm

The goal was to set up a software (mdadm) RAID-0 partition spread across the first two ~300GB drives. Even though they are not exactly the same size, by creating partitions of the same size, hard-drive size should not be an issue. Windows would have to go on the third (160GB) drive since it cannot see software RAID partitions. In order to avoid having to re-install GRUB, I installed Windows XP on the 160GB drive beforehand.

First Attempt

Using the Ubuntu 7.04 Alternate Install CD, I followed the steps from a useful tutorial I found on setting up software RAID. At the partition set-up step, I created two 280GB partitions of type “physical volume for RAID”, one on each of the ~300GB hard-drives. I then created a 2GB swap space following the RAID partition on the first drive, and left the rest of the free space on each drive empty (I figured later I could use it for another OS, or for a shared NTFS partition). The 160GB drive with Windows was left alone.
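For a sense of what the two equal-sized partitions buy you, here is a small sketch of RAID-0 striping with equal members. It is an illustration of the general technique, not of mdadm's actual code: the function names are made up, sizes are treated as plain GB as shown in the installer, and the chunk layout is the simple round-robin scheme that applies when all members are the same size.

```python
def raid0_capacity_gb(member_sizes_gb):
    """Usable size of a RAID-0 array whose members are all the same size.

    RAID-0 has no redundancy, so the capacity is simply the sum of the members.
    """
    if len(set(member_sizes_gb)) != 1:
        raise ValueError("this sketch only covers equal-sized members")
    return sum(member_sizes_gb)


def locate_chunk(chunk_index, n_members):
    """Map a logical chunk number to (member index, chunk offset on that member).

    Chunks are dealt out round-robin, which is what lets sequential reads
    and writes hit all of the drives in parallel.
    """
    return chunk_index % n_members, chunk_index // n_members
```

With the layout described above, raid0_capacity_gb([280, 280]) gives 560, and logical chunks 0, 1, 2, 3 land on members 0, 1, 0, 1. The flip side of the missing redundancy is that losing either drive loses the whole array.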

Everything during the partition step worked fine. I used the “Configure software RAID” command to set up a RAID array using the two partitions, and set that to be my “/” partition. Problems arise only after all of the system files have been copied to the hard-drive and the installer is attempting to install GRUB. The screen flashes red and displays some error message along the lines of “Failed to install grub on MBR, would you like to choose another partition?” No matter what partition I attempt to install to, however, GRUB will not install.

Eventually I give up and head back to the forums / Google to see if I can find a solution. A hint came in the form of a guide on Ubuntu Forums. The author suggests that for a RAID-0 setup, GRUB must be installed on a separate boot partition. So I wipe the two drives once more and try again.

Second Attempt

This time I do things slightly differently. First, I create two 512MB partitions at the beginning of each of the ~300GB hard-drives to use for a /boot partition (actually, I only plan to use one, but decide to create similar partitions on both drives so that the RAID partitions will be at the same location). Furthermore, I decide to only use 250GB of each drive for the RAID partition. I’ve decided to install two more operating systems, Fedora 8 and Debian Lenny, which will occupy the remaining space on each drive.

This time the install runs successfully and GRUB is installed on the /boot partition on the first hard-drive. Ubuntu boots fine, as do Fedora, Debian and Windows. My only remaining problem at this point is getting Fedora and Debian to recognize the md0 (software RAID) partition. After some messing around, I’m able to get Fedora to mount the md0 partition; however, I still have no luck with Debian. I decide that the performance gains (which I haven’t really noticed) are not worth all of the trouble I’ve gone through, and would surely go through in the future as I modify the system and try out other distros.

Conclusions

So in the end I decided to ditch the software RAID and install Ubuntu on a normal ext3 partition. I’m sure it would have been possible to get everything running well, and to have Debian and Fedora working with the software RAID partition, but you have to understand that by this point I had already spent the greater part of the past three or four days installing and troubleshooting RAID partitions. There was also the fact that Windows would never be able to see anything on the RAID partition. Perhaps if the performance gains had been more phenomenal I would have been tempted to stick with mdadm and work out the kinks. They weren’t, though. If you are just going to be running Linux alone, or perhaps with one other system, then Linux software RAID might be worth the effort to install. For anything more elaborate, however, I would just stick with traditional partitioning.